AMATH 483 / 583 HW3 solved

1 Problems

1. Matrix Multiplication Loop Permutations.

Implement a templated gemm for each {i, j, k} loop permutation using the specifications below. Each computes $C \leftarrow \alpha AB + \beta C$, where $A \in \mathbb{R}^{m \times p}$, $B \in \mathbb{R}^{p \times n}$, $C \in \mathbb{R}^{m \times n}$, and $\alpha, \beta \in \mathbb{R}$, but each exhibits a distinct memory access pattern. Check that every variant produces the correct result.

Turn in the .cpp and .hpp files for each. Include the header files in another header file, hw3_p1_header.hpp, and submit this as well. Note that in this exercise each matrix is stored in a single vector. Use column-major ordering, as discussed in lecture, when assigning and accessing matrix elements in this format.

• template <typename T> void mm_ijk(T a, const std::vector<T>& A, const std::vector<T>& B, T b, std::vector<T>& C, int m, int p, int n);
• template <typename T> void mm_jki(T a, const std::vector<T>& A, const std::vector<T>& B, T b, std::vector<T>& C, int m, int p, int n);
• template <typename T> void mm_kij(T a, const std::vector<T>& A, const std::vector<T>& B, T b, std::vector<T>& C, int m, int p, int n);
• template <typename T> void mm_jik(T a, const std::vector<T>& A, const std::vector<T>& B, T b, std::vector<T>& C, int m, int p, int n);
• template <typename T> void mm_ikj(T a, const std::vector<T>& A, const std::vector<T>& B, T b, std::vector<T>& C, int m, int p, int n);
• template <typename T> void mm_kji(T a, const std::vector<T>& A, const std::vector<T>& B, T b, std::vector<T>& C, int m, int p, int n);
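To make the column-major convention concrete, here is a minimal sketch of one of the six variants, the {jki} ordering. The helper `idx` is a hypothetical name that is not part of the assignment, and scaling each column of C by b before accumulating is only one possible way to apply the βC term.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: column-major offset of element (row, col)
// in a matrix that has `rows` rows.
inline std::size_t idx(int row, int col, int rows) {
    return static_cast<std::size_t>(col) * rows + row;
}

// C <- a*A*B + b*C with loops ordered j, k, i.
// A is m x p, B is p x n, C is m x n, all stored column-major.
template <typename T>
void mm_jki(T a, const std::vector<T>& A, const std::vector<T>& B, T b,
            std::vector<T>& C, int m, int p, int n) {
    assert(A.size() == static_cast<std::size_t>(m) * p);
    assert(B.size() == static_cast<std::size_t>(p) * n);
    assert(C.size() == static_cast<std::size_t>(m) * n);
    for (int j = 0; j < n; ++j) {
        // Scale column j of C by b once, before accumulating into it.
        for (int i = 0; i < m; ++i) C[idx(i, j, m)] *= b;
        for (int k = 0; k < p; ++k) {
            const T s = a * B[idx(k, j, p)];
            // Innermost loop walks down a column of A and of C,
            // so both are accessed with unit stride.
            for (int i = 0; i < m; ++i)
                C[idx(i, j, m)] += s * A[idx(i, k, m)];
        }
    }
}
```

The other five variants reorder the same three loops. For orderings where k is not inside j, the scaling of C by b has to happen in a separate pass before the accumulation starts.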

2. Compiler Optimization.

Use the {kij} and {jki} loop permutation codes from problem 1 to explore the performance of your implementations at three compiler settings: the default (no optimization flags), -O3, and -ffast-math (or the equivalent for your compiler). Test square matrices of dimension n = 2 to n = 512 in steps of one.

Measure each n ntrial times, with ntrial ≥ 3, and plot the average performance for each case versus n. Submit your .cpp test code and two plots, one for each loop variant, using your choice of data type (float or double). Extra credit: submit plots for both data types.
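A timing harness for this kind of measurement might look like the sketch below. The names `average_seconds` and `gflops` are hypothetical, and reporting performance as a FLOP rate based on roughly 2n^3 operations per multiply is an assumption, since the assignment does not define the performance metric.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: run `work` ntrial times and return the mean
// wall-clock time per run, in seconds.
template <typename F>
double average_seconds(F&& work, int ntrial) {
    assert(ntrial >= 1);
    using clock = std::chrono::steady_clock;
    double total = 0.0;
    for (int t = 0; t < ntrial; ++t) {
        const auto start = clock::now();
        work();
        const auto stop = clock::now();
        total += std::chrono::duration<double>(stop - start).count();
    }
    return total / ntrial;
}

// Assumed metric: approximate GFLOP/s for an n x n multiply,
// counting 2*n^3 floating-point operations and ignoring lower-order terms.
inline double gflops(int n, double seconds) {
    const double flops = 2.0 * n * n * n;
    return flops / seconds / 1e9;
}
```

A call such as `average_seconds([&] { mm_kij(a, A, B, b, C, n, n, n); }, ntrial)` would then give one data point per n. At higher optimization levels it is worth reading at least one element of C after timing, so the compiler cannot discard the multiplication as unused work.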

3. (AM583 only, +5 for AM483) Strassen.

Use notes from the class lecture to implement a C++ template for the (recursive) Strassen matrix multiplication algorithm. Plot the double precision performance for square matrices of even dimension from n = 2 to n = 512.

Measure each n ntrial times, with ntrial ≥ 3, and plot the average performance versus n. Turn in the .cpp and .hpp files for the Strassen code, your .cpp test code, and the performance plot.

template <typename T> vector<vector<T>> strassen_mm(const vector<vector<T>> &A, const vector<vector<T>> &B);
// vector<vector<T>> C = strassen_mm(A, B);
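For orientation, the recursion can be sketched as follows, using the standard seven-product formulation of Strassen's algorithm rather than the lecture notes, which are not reproduced here. The `Matrix` alias and the helpers `classical_mm`, `mat_add`, and `sub_block` are hypothetical names. Falling back to the classical product when n is odd or n ≤ 2 is an assumed design choice, and padding to an even size is a common alternative.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

template <typename T>
using Matrix = std::vector<std::vector<T>>;

// Classical triple loop, used as the base case of the recursion.
template <typename T>
Matrix<T> classical_mm(const Matrix<T>& A, const Matrix<T>& B) {
    const std::size_t n = A.size();
    Matrix<T> C(n, std::vector<T>(n, T(0)));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t k = 0; k < n; ++k)
            for (std::size_t j = 0; j < n; ++j)
                C[i][j] += A[i][k] * B[k][j];
    return C;
}

// Elementwise A + sign * B (sign = -1 gives a difference).
template <typename T>
Matrix<T> mat_add(const Matrix<T>& A, const Matrix<T>& B, T sign = T(1)) {
    const std::size_t n = A.size();
    Matrix<T> R(n, std::vector<T>(n));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            R[i][j] = A[i][j] + sign * B[i][j];
    return R;
}

// Copy the h x h block of M whose top-left corner is (r0, c0).
template <typename T>
Matrix<T> sub_block(const Matrix<T>& M, std::size_t r0, std::size_t c0,
                    std::size_t h) {
    Matrix<T> R(h, std::vector<T>(h));
    for (std::size_t i = 0; i < h; ++i)
        for (std::size_t j = 0; j < h; ++j)
            R[i][j] = M[r0 + i][c0 + j];
    return R;
}

template <typename T>
Matrix<T> strassen_mm(const Matrix<T>& A, const Matrix<T>& B) {
    const std::size_t n = A.size();
    assert(B.size() == n);
    // Assumed cutoff: small or odd sizes use the classical product.
    if (n <= 2 || n % 2 != 0) return classical_mm(A, B);
    const std::size_t h = n / 2;
    const auto A11 = sub_block(A, 0, 0, h), A12 = sub_block(A, 0, h, h),
               A21 = sub_block(A, h, 0, h), A22 = sub_block(A, h, h, h);
    const auto B11 = sub_block(B, 0, 0, h), B12 = sub_block(B, 0, h, h),
               B21 = sub_block(B, h, 0, h), B22 = sub_block(B, h, h, h);
    const T neg = T(-1);
    // The seven recursive products.
    const auto M1 = strassen_mm(mat_add(A11, A22), mat_add(B11, B22));
    const auto M2 = strassen_mm(mat_add(A21, A22), B11);
    const auto M3 = strassen_mm(A11, mat_add(B12, B22, neg));
    const auto M4 = strassen_mm(A22, mat_add(B21, B11, neg));
    const auto M5 = strassen_mm(mat_add(A11, A12), B22);
    const auto M6 = strassen_mm(mat_add(A21, A11, neg), mat_add(B11, B12));
    const auto M7 = strassen_mm(mat_add(A12, A22, neg), mat_add(B21, B22));
    // Assemble the four quadrants of C.
    Matrix<T> C(n, std::vector<T>(n));
    for (std::size_t i = 0; i < h; ++i) {
        for (std::size_t j = 0; j < h; ++j) {
            C[i][j] = M1[i][j] + M4[i][j] - M5[i][j] + M7[i][j];
            C[i][j + h] = M3[i][j] + M5[i][j];
            C[i + h][j] = M2[i][j] + M4[i][j];
            C[i + h][j + h] = M1[i][j] - M2[i][j] + M3[i][j] + M6[i][j];
        }
    }
    return C;
}
```

This version copies every sub-block, which keeps the code short but adds allocation overhead. A larger cutoff for the classical base case usually reduces that cost.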