The neuralnet_predict() function takes one sample (and one sample only).
If I instead compute the inputs for several samples and store them in a matrix, the forward pass through the neural network becomes a matrix-matrix multiplication at each layer. A BLAS implementation can do a matrix-matrix multiplication much faster than the corresponding N vector-matrix multiplications. I think that is a huge potential speedup.
I wonder how to add this feature. Should it be inside neuralnet.c, guarded by #ifdef USE_CBLAS?
Or maybe I should add this as an "extension" in some way. I will probably have to pass in a pointer to some work memory, so that I can allocate the work memory once. Maybe I can just leave the code in the example folder?
What about:
Then this can be really optimized by BLAS. I would also need a bigger work memory buffer, sized for the whole batch. This could be cool.
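To make the idea concrete, here is a minimal sketch of a batched forward pass. It is not the existing neuralnet.c API: `dense_layer`, `gemm_rowmajor`, and `predict_batch` are hypothetical names, and the row-major weight layout and the ReLU activation on hidden layers are assumptions made only for illustration. With `USE_CBLAS` defined it calls `cblas_sgemm`; otherwise it falls back to a plain triple loop.

```c
#include <assert.h>
#ifdef USE_CBLAS
#include <cblas.h>
#endif

/* Hypothetical layer description (not the struct used in neuralnet.c).
 * weights is row-major [n_in x n_out], bias has n_out entries. */
typedef struct {
    int n_in;
    int n_out;
    const float *weights;
    const float *bias;
} dense_layer;

/* c[m x n] = a[m x k] * b[k x n], all matrices row-major. */
static void gemm_rowmajor(int m, int n, int k,
                          const float *a, const float *b, float *c)
{
#ifdef USE_CBLAS
    cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                m, n, k, 1.0f, a, k, b, n, 0.0f, c, n);
#else
    for (int i = 0; i < m; i++) {
        for (int j = 0; j < n; j++) {
            float sum = 0.0f;
            for (int p = 0; p < k; p++)
                sum += a[i * k + p] * b[p * n + j];
            c[i * n + j] = sum;
        }
    }
#endif
}

/* Forward pass for `batch` samples at once.
 * inputs is row-major [batch x layers[0].n_in].
 * work_a and work_b are caller-allocated buffers that ping-pong between
 * layers. Returns a pointer (into work_a or work_b) to the row-major
 * [batch x n_out] output of the last layer. */
const float *predict_batch(const dense_layer *layers, int n_layers,
                           const float *inputs, int batch,
                           float *work_a, float *work_b)
{
    assert(n_layers > 0 && batch > 0);

    const float *in = inputs;
    float *out = work_a;

    for (int l = 0; l < n_layers; l++) {
        const dense_layer *layer = &layers[l];

        /* One matrix-matrix product per layer instead of `batch`
         * separate vector-matrix products. */
        gemm_rowmajor(batch, layer->n_out, layer->n_in,
                      in, layer->weights, out);

        /* Add the bias; apply ReLU on hidden layers only (assumption). */
        for (int i = 0; i < batch; i++) {
            for (int j = 0; j < layer->n_out; j++) {
                float v = out[i * layer->n_out + j] + layer->bias[j];
                if (l < n_layers - 1 && v < 0.0f)
                    v = 0.0f;
                out[i * layer->n_out + j] = v;
            }
        }

        /* Swap buffers: this layer's output is the next layer's input. */
        in = out;
        out = (out == work_a) ? work_b : work_a;
    }
    return in;
}
```

Each work buffer has to hold `batch` times the widest layer output in floats, so the caller can allocate both once and reuse them for every call. That is the "bigger work memory" part.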