ActiveRecord's distinct and sum interact in a counter-intuitive way, which makes the view_sum cache incorrect. See issue #33082 in the rails repo for more details on how this works.
The main method for CourseCacheManager class is update_cache, which updates the different caches.
The view_sum cache is updated through the following private method:
def update_view_sum
  @course.view_sum = @course.articles_courses.tracked.live.sum(:view_count)
end
This method generates the following SQL query:
SELECT SUM(DISTINCT `articles_courses`.`view_count`) FROM `articles_courses` INNER JOIN `articles` ON `articles`.`id` = `articles_courses`.`article_id` WHERE `articles_courses`.`course_id` = 16 AND `articles_courses`.`tracked` = TRUE AND `articles`.`deleted` = FALSE
This means that if several articles courses have the same view_count, the repeated values are counted only once, so the total sum of views is too low.
This is caused by the way ActiveRecord combines distinct and sum: the distinct is applied to the summed column rather than to the rows (note that the tracked and live articles courses scopes use distinct).
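To make the effect concrete, here is a minimal plain-Ruby sketch of what SUM(DISTINCT ...) does compared with a regular SUM. It does not touch ActiveRecord or the database; the array of view counts is hypothetical, with the value 3829 made up so that the totals match the numbers reported in the reproduction steps.

```ruby
# Hypothetical view counts for three tracked, live articles_courses rows.
# Two rows share the value 220; 3829 is invented so the totals match the report.
view_counts = [220, 220, 3829]

# What SUM(DISTINCT view_count) computes: duplicates collapse before summing.
distinct_sum = view_counts.uniq.sum

# What the cache should hold: every row contributes its own view_count.
full_sum = view_counts.sum

puts "SUM(DISTINCT view_count): #{distinct_sum}"
puts "SUM(view_count):          #{full_sum}"
```

The difference between the two results is exactly the repeated 220 that the distinct sum drops.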
To Reproduce
Steps to reproduce the behavior:
Have a course downloaded locally with several articles courses, and make the view_count for those articles courses not unique. Example:
Move the update_view_sum private method out of the private methods section so that you can call it directly.
See that course.view_sum is 4049 (it doesn't take the repeated 220 into account), while the expected view_sum is 4269 (4049 + 220).
Expected behavior
The view_sum field for Course should hold the sum of view_count for all tracked live articles courses, even when view_count values are not unique.
The SQL query should be: SELECT DISTINCT `articles_courses`.* FROM `articles_courses` INNER JOIN `articles` ON `articles`.`id` = `articles_courses`.`article_id` WHERE `articles_courses`.`course_id` = 16 AND `articles_courses`.`tracked` = TRUE AND `articles`.`deleted` = FALSE
One option is to use the following definition, but I'm not sure whether it would be less performant.
def update_view_sum
  @course.view_sum = @course.articles_courses.tracked.live.sum(&:view_count)
end
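As far as I understand, the block form of sum loads the distinct records and adds them up in Ruby instead of asking the database for SUM, which is where the performance question comes from. The sketch below imitates that in plain Ruby: the ArticlesCourse struct and the sample rows are hypothetical stand-ins, not the real model, and the duplicated row only illustrates what SELECT DISTINCT on whole rows would remove.

```ruby
# Hypothetical stand-in for an articles_courses row (not the real model).
ArticlesCourse = Struct.new(:id, :view_count)

# Hypothetical joined result: the row with id 2 appears twice.
joined_rows = [
  ArticlesCourse.new(1, 220),
  ArticlesCourse.new(2, 220),
  ArticlesCourse.new(2, 220),
  ArticlesCourse.new(3, 3829)
]

# Like SELECT DISTINCT articles_courses.*: drop rows that are identical
# in every column, but keep different rows that share a view_count.
distinct_rows = joined_rows.uniq

# Like sum(&:view_count): the addition happens in Ruby, one record at a time.
view_sum = distinct_rows.sum(&:view_count)

puts "rows after DISTINCT: #{distinct_rows.size}"
puts "view_sum: #{view_sum}"
```

Rows 1 and 2 both keep their 220 because they differ by id, so the total includes the repeated value. The trade-off is that every matching record has to be loaded into memory before summing.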
Additional context
It is possible that this same behavior is causing problems in other parts of the code. We should review all the code when we fix this.