Replies: 2 comments
-
Hey Dan, I'm away from my workspace for another week or two, so my support here is limited. I've been thinking about your request, and I believe the best way to do this would be with different file names.

For particles, you could make files with names like:

particles_step_124_rank_12_species_1.out

and use each line to output the position, momentum and weight of one macro-particle. This is done by creating a particle pointer and stepping through the linked list (see an example in bremsstrahlung.F90, where the optical depth is updated). It then becomes a post-processing problem: combining the particle info from all the rank files into a single collection. This doesn't have to be done inside EPOCH with MPI barriers and gathers; if each rank writes to a separate file, there are no data conflicts.

Similarly, you could have:

Ex_step_124_rank_12.out

where you output the Ex grid that is local to the rank. If you output from 1 to nx (or ny or nz), you get the data without ghost cells.

Perhaps it's not the most elegant way, but it's definitely the easiest.

Cheers,
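To make the traversal concrete, here is a minimal sketch of such a per-rank dump. The type and field names (particle, part_pos, part_p, weight, next, species_list(ispecies)%attached_list%head, rank) are my recollection of EPOCH's shared_data module and may differ between versions; check bremsstrahlung.F90 for the canonical traversal:

SUBROUTINE dump_species_to_file(ispecies, step)
  ! Hypothetical helper: one file per (step, rank, species), one
  ! macro-particle per line. List/field names are assumptions -- see above.
  USE shared_data   ! assumed to provide rank, species_list, TYPE(particle)
  IMPLICIT NONE
  INTEGER, INTENT(IN) :: ispecies, step
  TYPE(particle), POINTER :: current
  CHARACTER(LEN=64) :: filename
  INTEGER, PARAMETER :: lun = 120   ! any free unit number

  WRITE(filename, '(A,I0,A,I0,A,I0,A)') 'particles_step_', step, &
      '_rank_', rank, '_species_', ispecies, '.out'
  OPEN(UNIT=lun, FILE=TRIM(filename), STATUS='REPLACE')

  ! Walk the linked list: position, momentum, weight on each line
  current => species_list(ispecies)%attached_list%head
  DO WHILE (ASSOCIATED(current))
    WRITE(lun, *) current%part_pos, current%part_p, current%weight
    current => current%next
  END DO

  CLOSE(lun)
END SUBROUTINE dump_species_to_file

The field dump is analogous: open an Ex_step_<step>_rank_<rank>.out file on each rank and write ex(1:nx, 1:ny, 1:nz), which skips the ghost cells.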
-
Thanks, Stuart, for the suggestion. This was great insight!
-
I'd like to save some output to the hard drive from the subroutine diagnostics.f90/output_routines, and I'm wondering what the best method to do that would be.
It would most probably require calling MPI_BARRIER at some point, to stop all the processes from writing to the same file simultaneously and to let them write one after another, in sequence.
Here is a demo of how this could be done, using the classic MPI barrier example. The question is where best to put the barrier, and how the code resumes and continues after it. In short, substituting PRINT* with OPEN/WRITE/CLOSE is essentially what is intended here (a sketch of that substitution follows the demo).
! Compilation:
!   1) gfortran: mpif90 hello_world_mpi.f90 -o hello_world_mpi.exe
!   2) Intel:    mpiifort hello_world_mpi.f90 -o hello_world_mpi.exe
! Then run it on 4 cores: mpirun -np 4 ./hello_world_mpi.exe
PROGRAM hello_world_mpi
  IMPLICIT NONE
  INCLUDE 'mpif.h'
  INTEGER :: i, process_Rank, size_Of_Cluster, ierror

  CALL MPI_INIT(ierror)
  CALL MPI_COMM_SIZE(MPI_COMM_WORLD, size_Of_Cluster, ierror)
  CALL MPI_COMM_RANK(MPI_COMM_WORLD, process_Rank, ierror)

  ! Ranks take turns: on each pass exactly one rank prints, then everyone
  ! synchronises at the barrier before the next rank goes.
  DO i = 0, size_Of_Cluster - 1
    IF (i == process_Rank) THEN
      PRINT *, 'Hello World from process: ', process_Rank, ' of ', size_Of_Cluster
    END IF
    CALL MPI_BARRIER(MPI_COMM_WORLD, ierror)
  END DO

  CALL MPI_FINALIZE(ierror)
END PROGRAM hello_world_mpi
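For reference, here is a minimal sketch of that PRINT-to-file substitution: the ranks take turns appending to one shared file, serialised by the barrier. The file name diag_output.out is just a placeholder; the variables are those of the demo above.

DO i = 0, size_Of_Cluster - 1
  IF (i == process_Rank) THEN
    OPEN(UNIT=99, FILE='diag_output.out', STATUS='UNKNOWN', POSITION='APPEND')
    WRITE(99, *) 'data from rank ', process_Rank
    CLOSE(99)   ! close before the barrier so the next rank sees a flushed file
  END IF
  CALL MPI_BARRIER(MPI_COMM_WORLD, ierror)
END DO

Note that this fully serialises the ranks, so it is simple but slow at scale, which is one reason the per-rank-file scheme above avoids barriers entirely.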
And another question: should I put the barrier outside output_routines (in the code that calls it, epoch3d), or somewhere inside output_routines?
Another method is not to mess with MPI barriers at all, but to collect the data from each core, for each species, into allocated arrays, and then at some point write everything from the arrays to disk in one go, without any APPEND.
The small but still annoying problem is that each core does not broadcast the number of particles it currently holds before entering output_routines, where you have to perform "trepanation of their brains" by walking the linked lists on each core. Or is this count perhaps already available without counting the particles on each core separately? Specifically, can one call from any rank return the number of particles held on all the other cores? Yes, we could allocate more memory in the arrays than the potential number of particles, but that is not the way intuition tells you to proceed... (see the MPI_ALLGATHER sketch below).
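One candidate for this is MPI_ALLGATHER, which distributes every rank's local count to every rank in a single collective call. A minimal sketch follows; the EPOCH list/field names (species_list, attached_list, head, next) are my assumption, and some EPOCH versions may already store the count in attached_list%count, making the traversal unnecessary:

! Sketch: every rank learns every rank's particle count in one collective.
INTEGER :: local_count, ierror
INTEGER, ALLOCATABLE :: counts(:)          ! one entry per rank
TYPE(particle), POINTER :: current

ALLOCATE(counts(size_Of_Cluster))
local_count = 0
current => species_list(ispecies)%attached_list%head
DO WHILE (ASSOCIATED(current))
  local_count = local_count + 1
  current => current%next
END DO

! After this call, counts(r+1) holds rank r's local_count on EVERY rank
CALL MPI_ALLGATHER(local_count, 1, MPI_INTEGER, &
                   counts, 1, MPI_INTEGER, MPI_COMM_WORLD, ierror)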
A third way is to find the number of particles on each core by manually counting the linked lists, create a derived-type (TYPE) multidimensional array, and allocate its sub-arrays according to the particle count found for each core and each species. It is not clear whether MPI or Fortran will tolerate that kind of "harassment" of the branches of one global array, with many different processes arbitrarily allocating/deallocating them, and not just once but in parallel (see the MPI_GATHERV sketch below for a more conventional pattern).
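For this ragged, per-rank-sized collection, the standard MPI pattern is MPI_GATHERV: gather the counts first, compute displacements on the root, then pull the data into one flat buffer on rank 0. A minimal sketch, packing only the particle weights for brevity and reusing counts(:) and local_count from the previous snippet (in real EPOCH code the real kind would be the code's own num kind):

! Sketch: gather each rank's weights into one flat array on rank 0.
REAL(8), ALLOCATABLE :: local_w(:), all_w(:)
INTEGER, ALLOCATABLE :: displs(:)
INTEGER :: i, total, ierror

ALLOCATE(local_w(local_count))
! ... fill local_w by walking the linked list, as before ...

IF (process_Rank == 0) THEN
  ALLOCATE(displs(size_Of_Cluster))
  displs(1) = 0
  DO i = 2, size_Of_Cluster
    displs(i) = displs(i-1) + counts(i-1)
  END DO
  total = SUM(counts)
  ALLOCATE(all_w(total))
ELSE
  ALLOCATE(displs(1), all_w(1))   ! dummies; only significant on the root
END IF

CALL MPI_GATHERV(local_w, local_count, MPI_DOUBLE_PRECISION, &
                 all_w, counts, displs, MPI_DOUBLE_PRECISION, &
                 0, MPI_COMM_WORLD, ierror)
! Rank 0 can now write all_w to disk in one go, without APPEND.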
In all these cases, and many others, though, pausing all the running processes at a specific place with some kind of barrier, doing your work on the code's data, and then continuing looks very appealing.
Any considerations are welcome.