Dear Roozbeh,
I tried to run cv_spatial(x), where nrow(x) == 37000000, and it did not finish within a day, even with iteration == 1 or selection == "systematic". While reading the source code to find the bottleneck and modify it to run in parallel on multiple cores, I found this issue: cv_spatial() calls sf::st_intersects() twice, even though st_intersects() is a very time-consuming function (and can easily be parallelized).
L254: sub_blocks <- blocks[x, ]
Here, you run "[.sf", which calls st_intersects() and creates a logical mask from the results (please refer to L325 of sf.R in the "sf" package), but drops the results of the predicate function.
You then call the time-consuming st_intersects() again, with almost the same inputs.
I recommend calling st_intersects() once, then calculating the logical mask, and finally subsetting the blocks. E.g. something like this:
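A minimal sketch of that single-call approach, not the package's actual code: the toy grid and points are made-up illustration data, and the names hits and mask are hypothetical. It assumes blocks and x are sf objects in the same coordinate reference system.

```r
library(sf)

# Toy stand-ins for the objects in cv_spatial(): a 2 x 2 grid of blocks
# and three sample points (made-up illustration data).
bb <- st_as_sfc(st_bbox(c(xmin = 0, ymin = 0, xmax = 2, ymax = 2)))
blocks <- st_sf(id = 1:4, geometry = st_make_grid(bb, n = c(2, 2)))
x <- st_sf(geometry = st_sfc(
  st_point(c(0.5, 0.5)), st_point(c(1.5, 1.5)), st_point(c(0.25, 0.75))
))

# One call: for every block, the indices of the points it intersects.
hits <- st_intersects(blocks, x)

# Logical mask of blocks that contain at least one point.
mask <- lengths(hits) > 0

# Subset with the mask, so no second geometric predicate is evaluated.
sub_blocks <- blocks[mask, ]
```

Keeping hits around means the intersection result is still available after the subsetting step, instead of being discarded inside "[.sf".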
I also suggest parallelizing st_intersects(), which could make cv_spatial() much faster. Please let me know if I can collaborate on this development.
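One possible way to parallelize, sketched under assumptions: the helper st_intersects_chunked() and its chunk_size and cores arguments are hypothetical and not part of sf or blockCV. It splits the rows of x into chunks, runs st_intersects() on each chunk with parallel::mclapply(), and concatenates the per-row results. Forking with cores > 1 is not available on Windows, so the default stays at 1.

```r
library(sf)
library(parallel)

# Hypothetical helper: run st_intersects(x, y) chunk by chunk over rows of x.
st_intersects_chunked <- function(x, y, chunk_size = 1e6, cores = 1L) {
  n <- nrow(x)
  # Consecutive row indices grouped into chunks of at most chunk_size rows.
  chunks <- split(seq_len(n), ceiling(seq_len(n) / chunk_size))
  parts <- mclapply(
    chunks,
    # unclass() turns each sgbp result into a plain list of index vectors.
    function(i) unclass(st_intersects(x[i, ], y)),
    mc.cores = cores
  )
  # Flatten one level: one integer vector of block indices per row of x.
  unlist(parts, recursive = FALSE, use.names = FALSE)
}

# Made-up illustration data: a 2 x 2 grid and five points, one outside the grid.
bb <- st_as_sfc(st_bbox(c(xmin = 0, ymin = 0, xmax = 2, ymax = 2)))
blocks <- st_sf(id = 1:4, geometry = st_make_grid(bb, n = c(2, 2)))
pts <- st_sf(geometry = st_sfc(
  st_point(c(0.5, 0.5)), st_point(c(1.5, 0.5)), st_point(c(0.5, 1.5)),
  st_point(c(1.5, 1.5)), st_point(c(5, 5))
))

chunked <- st_intersects_chunked(pts, blocks, chunk_size = 2)
direct <- st_intersects(pts, blocks)
```

Because each row of x is tested independently against y, the chunked result should match a single direct call. Whether it is actually faster depends on the data size and on the cost of forking and copying the geometries.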
Have a nice week,
Ákos
In the meantime, I realized that the two st_intersects() calls run in opposite directions (i.e., x and y are swapped), but sgbp objects can be transposed with t(), which is really fast (implemented in C).
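A small sketch of that transposition on made-up data (the object names are hypothetical): the result of st_intersects(blocks, pts) is transposed with t() and compared against a fresh st_intersects(pts, blocks) call.

```r
library(sf)

# Made-up illustration data: a 2 x 2 grid of blocks and two points.
bb <- st_as_sfc(st_bbox(c(xmin = 0, ymin = 0, xmax = 2, ymax = 2)))
blocks <- st_make_grid(bb, n = c(2, 2))
pts <- st_sfc(st_point(c(0.5, 0.5)), st_point(c(1.5, 1.5)))

# One entry per block: indices of the points it intersects.
blocks_to_pts <- st_intersects(blocks, pts)

# Transpose: one entry per point, without a second geometric test.
pts_to_blocks <- t(blocks_to_pts)

# Reference result computed from scratch in the swapped direction.
direct <- st_intersects(pts, blocks)
same <- identical(lapply(pts_to_blocks, sort), lapply(direct, sort))
```

If same is TRUE, the transposed object can stand in for the second st_intersects() call.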