Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't call MPI_Comm_split if NCCL_TESTS_SPLIT_MASK is not set #151

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

tstruk
Copy link

@tstruk tstruk commented Jun 30, 2023

No need to call MPI_Comm_split if the NCCL_TESTS_SPLIT_MASK variable is not set.
It only causes unnecessary traffic with no real effect, and in some cases might cause the ranks be invalid.

No need to call MPI_Comm_split if the NCCL_TESTS_SPLIT_MASK variable
is not set. It only causes unnecessary traffic with no real effect,
and in some cases might cause the ranks be invalid.

Signed-off-by: Tadeusz Struk <tstruk@gigaio.com>
@sjeaugey
Copy link
Member

sjeaugey commented Jul 3, 2023

in some cases might cause the ranks be invalid

Can you say more about this?

@tstruk
Copy link
Author

tstruk commented Jul 3, 2023

Can you say more about this?

Hi,
Yes, that is running on MPICH 4.1.2, and libfabric 1.18, with a custom provider, NCCL version 2.18.3, across 4 nodes.
Sometimes what I see is that after calling MPI_Comm_split() with the parameter color equal to 0, the rank in one of the nodes is not correct.
The color is 0 because I don't set the NCCL_TESTS_SPLIT_MASK environment variable.
Here is the debug code I added:

diff --git a/src/common.h b/src/common.h
index 20fa461..d2359ea 100644
--- a/src/common.h
+++ b/src/common.h
@@ -282,5 +282,11 @@ static int ncclstringtoop (char *str) {
 extern int is_main_proc;
 extern thread_local int is_main_thread;
 #define PRINT if (is_main_thread) printf
+#define PRINT_ALL_NODES(STR, ...)      \
+({                                     \
+  char hn[256];                        \
+  gethostname(hn, 256);                \
+  printf("%s: " STR "\n", hn, ## __VA_ARGS__); \
+})

and

diff --git a/src/common.cu b/src/common.cu
index 48a629c..a7531f6 100644
--- a/src/common.cu
+++ b/src/common.cu
@@ -852,6 +852,8 @@ testResult_t run() {
 #ifdef MPI_SUPPORT
   MPI_Comm_size(MPI_COMM_WORLD, &totalProcs);
   MPI_Comm_rank(MPI_COMM_WORLD, &proc);
+PRINT_ALL_NODES("proc %d, totalProcs %d\n", proc, totalProcs);
+
   uint64_t hostHashs[totalProcs];
   hostHashs[proc] = getHostHash(hostname);
   MPI_Allgather(MPI_IN_PLACE, 0, MPI_DATATYPE_NULL, hostHashs, sizeof(uint64_t), MPI_BYTE, MPI_COMM_WORLD);
@@ -867,6 +869,8 @@ testResult_t run() {
   MPI_Comm_split(MPI_COMM_WORLD, color, proc, &mpi_comm);
   MPI_Comm_size(mpi_comm, &ncclProcs);
   MPI_Comm_rank(mpi_comm, &ncclProc);
+PRINT_ALL_NODES("ncclProc %d, ncclProcs %d\n", ncclProc, ncclProcs);
+
 #endif
   is_main_thread = is_main_proc = (proc == 0) ? 1 : 0;

and here is what I see:

Node1: proc 1, totalProcs 4 
Node3: proc 3, totalProcs 4 
Node2: proc 2, totalProcs 4 
Node0: proc 0, totalProcs 4 
Node1: ncclProc 1, ncclProcs 4 
Node3: ncclProc 3, ncclProcs 4 
Node2: ncclProc 0, ncclProcs 2 
Node0: ncclProc 0, ncclProcs 4 

The MPICH documentation [1] says that the color should be "non-negative integer" so 0 should be ok, but it's not.
When this happens the test hangs after the NCCL init phase.
After the fix is applied everything works as expected.

[1] - https://www.mpich.org/static/docs/v4.1.2/www3/MPI_Comm_split.html

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants