Simplify isNodeAvailable function and add comments #994

Open
wants to merge 2 commits into
base: unstable

enjoy-binbin
Member

Use getNodeReplicationOffset to replace the original logic
and add the corresponding annotations.

Signed-off-by: Binbin <binloveplay1314@qq.com>
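For readers without the full diff, here is a rough sketch of the shape this simplification might take. Everything in it is a stand-in: `mockNode`, the `MOCK_NODE_*` flags, `mock_local_repl_offset`, and both `mock*` functions are hypothetical and are not the real definitions in `src/cluster.c` or `src/cluster_legacy.c`. It assumes the helper returns the local server's own offset for myself and the stored offset for any other node, and that an offset of zero means the node has not finished its initial sync.

```c
/* Hypothetical stand-ins for illustration only, not the real cluster types. */
#define MOCK_NODE_MYSELF (1 << 0) /* the node describing itself */
#define MOCK_NODE_FAIL   (1 << 1) /* failure confirmed by the cluster */

typedef struct {
    int flags;
    long long repl_offset; /* offset last learned about this node */
} mockNode;

/* Stand-in for the local server's own replication offset. */
static long long mock_local_repl_offset = 0;

/* Assumed behaviour of the helper: pick the right offset source in one
 * place, so callers do not need a special case for myself. */
static long long mockGetNodeReplicationOffset(const mockNode *node) {
    if (node->flags & MOCK_NODE_MYSELF) return mock_local_repl_offset;
    return node->repl_offset;
}

/* Simplified availability check: skip failed nodes, then treat a zero
 * offset as "initial sync not finished". */
static int mockIsNodeAvailable(const mockNode *node) {
    if (node->flags & MOCK_NODE_FAIL) return 0;
    return mockGetNodeReplicationOffset(node) != 0;
}
```

The point of the sketch is only that the myself special case moves into the helper, which leaves the availability check with two conditions.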
@@ -1332,16 +1332,18 @@ void addNodeToNodeReply(client *c, clusterNode *node) {
* not finished their initial sync, in failed state, or are
* otherwise considered not available to serve read commands. */
int isNodeAvailable(clusterNode *node) {
/* We don't consider PFAIL here because it's not a reliable indicator
* for node available and we don't want clients to use it. */
Member Author

I know we rejected PFAIL in the past, but I'd like to add a point of view. From myself's own perspective, if a node is in PFAIL, it is unavailable in myself's view, so we could mark it as unavailable in myself's response.

Member

I don't quite follow your comment. There isn't much harm in removing a temporarily unavailable node, but showing a dead node to clients is bad since they will time out connecting to it.

Contributor

I think what @enjoy-binbin meant here is that the PFAIL state is not taken into consideration because it is only this node's own view and has not yet been confirmed by the quorum.

Member Author

Yes, what hpatro says is right. Sorry for the confusion, my English is not great.

There are two comments here. The one in the code is something I took from an old PR, maybe the CLUSTER SHARDS one. Back then we considered a PFAIL node to be available, so here we only skip FAIL nodes (the clusterNodeIsFailing check).

The other one, which I added as a GitHub review comment, says that a PFAIL node may be unavailable in myself's view. The decision we made in the past for CLUSTER SHARDS was that a PFAIL node is available, so we don't skip it. But if we look at it from myself's own perspective, when a node is in PFAIL, can we assume that myself thinks it is unavailable?
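To make the two positions in this thread concrete, here is a minimal sketch that compares them side by side. The `sketchNode` type, the `SKETCH_NODE_*` flags, and both function names are hypothetical and exist only for this illustration. The sketch assumes PFAIL means "suspected by the local node only" and FAIL means "confirmed by the quorum", as described in the comments above.

```c
#include <stdbool.h>

/* Hypothetical flags for illustration, not the real cluster flag values. */
#define SKETCH_NODE_PFAIL (1 << 0) /* suspected by the local node only */
#define SKETCH_NODE_FAIL  (1 << 1) /* failure confirmed by the quorum */

typedef struct {
    int flags;
    long long repl_offset; /* zero is assumed to mean "not yet synced" */
} sketchNode;

/* Policy described by the code comment in the diff: only a confirmed
 * FAIL hides the node, so a PFAIL node is still reported. */
static bool availableIgnoringPfail(const sketchNode *node) {
    if (node->flags & SKETCH_NODE_FAIL) return false;
    return node->repl_offset != 0;
}

/* Alternative raised in the review comment: the local node's own
 * suspicion (PFAIL) is already enough to hide the node. */
static bool availableTreatingPfailAsDown(const sketchNode *node) {
    if (node->flags & (SKETCH_NODE_FAIL | SKETCH_NODE_PFAIL)) return false;
    return node->repl_offset != 0;
}
```

The two functions differ only for a node that is in PFAIL but not yet FAIL: the first still reports it, the second hides it. That single case is what the question above is about.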

codecov bot commented Sep 5, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 70.56%. Comparing base (9033734) to head (49355bb).

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable     #994      +/-   ##
============================================
- Coverage     70.57%   70.56%   -0.01%     
============================================
  Files           114      114              
  Lines         61634    61630       -4     
============================================
- Hits          43498    43492       -6     
- Misses        18136    18138       +2     
Files with missing lines Coverage Δ
src/cluster.c 88.35% <100.00%> (-0.04%) ⬇️
src/cluster_legacy.c 85.88% <ø> (-0.08%) ⬇️

... and 12 files with indirect coverage changes

Co-authored-by: Harkrishn Patro <bunty.hari@gmail.com>
Signed-off-by: Binbin <binloveplay1314@qq.com>