WIP: Sparse kronecker v1
Rewrite in Fortran, using the same algorithm. Unfortunately, this is slower than the C++ version, apparently because of a compiler issue. Copying A and D into allocatable arrays improves the situation: judging from the generated assembly code, gfortran 10 does not seem to exploit the “contiguous” attribute of pointers, and generates more complex code that does index computations with an arbitrary stride; using an allocatable array removes this problem. However, this is still not enough to be on par with the C++ code.
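The stride issue described above can be illustrated with a rough C++ analogue. This is only a sketch and is not code from this commit: the function names (sum_strided, sum_contiguous, copy_to_contiguous) are hypothetical, and the comparison with gfortran's handling of pointer descriptors is an assumption made for illustration. The first function mimics an access whose stride is only known at run time, the second mimics an array known to be unit-stride, and the third mimics the workaround of copying the data once into owned contiguous storage.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical analogue of a pointer array whose stride is only known at
// run time: each access multiplies the index by the stride.
double sum_strided(const double* base, std::size_t n, std::ptrdiff_t stride) {
    double s = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        s += base[static_cast<std::ptrdiff_t>(i) * stride];
    }
    return s;
}

// Hypothetical analogue of an allocatable array: the storage is known to be
// unit-stride, so the loop is a plain linear walk over memory.
double sum_contiguous(const std::vector<double>& a) {
    double s = 0.0;
    for (double x : a) {
        s += x;
    }
    return s;
}

// Hypothetical analogue of the workaround: gather the strided data once into
// owned contiguous storage, then run the hot loops on the copy.
std::vector<double> copy_to_contiguous(const double* base, std::size_t n,
                                       std::ptrdiff_t stride) {
    std::vector<double> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = base[static_cast<std::ptrdiff_t>(i) * stride];
    }
    return out;
}
```

The copy costs one extra pass over the data, so it only pays off when the array is read many times afterwards, as is typically the case inside nested loops.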
Showing 4 changed files:
- mex/build/kronecker.am (8 additions, 6 deletions)
- mex/sources/kronecker/sparse_hessian_times_B_kronecker_C.cc (0 additions, 191 deletions)
- mex/sources/kronecker/sparse_hessian_times_B_kronecker_C.f08 (164 additions, 0 deletions)
- mex/sources/matlab_mex.F08 (27 additions, 4 deletions)