[RegCNET] Error occurs when running with MPI

chen chenjh1213 at 163.com
Wed Dec 28 09:21:55 CET 2011


Dear Dr. Bi and all,
I added more variables to the CHE output files. The ./configure and make all steps completed without problems, but when I start running the MPI version of the model, it fails with the following output.
 
……
Opening new output file output //Test_ATM.1997110100.nc
Opening new output file output //Test_SRF.1997110100.nc
Opening new output file output //Test_RAD.1997110100.nc
Opening new output file output //Test_CHE.1997110100.nc
ATM variables written at  1997110100 0.00000000000
SRF variables written at  1997110100 2.00000000000
RAD variables written at  1997110100 2.00000000000
rank 0 in job 11 (sometimes 3 or 5) node12_35514  caused collective abort of all ranks
 exit status of rank 0: killed by signal 9
&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&
 
I googled the error. It is said to mean that memory is limited or that the MPI version is wrong, but the unmodified model runs well, so I think something must be wrong with the MPI calls (maybe mpi_gather or another one). The likely cause is that the data transfer is wrong, but I am not skilled in parallel programming, so I hope to get your help and suggestions!
 
Thank you in advance!
 
PS:
I added two more 2D variables to the CHE output file.
I modified the mpi_gather call as below; chem0 and chem_0 have been redefined as well:
call mpi_gather(chem0,iy*((ntr+3)*kz+ntr*7+7)*jxp,            &
                        & mpi_real8,chem_0,iy*((ntr+3)*kz+ntr*7+7)*jxp, &
                        & mpi_real8,0,mpi_comm_world,ierr)
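
For reference, the sizes implied by this call can be checked with simple arithmetic. In MPI_Gather the receive count is the number of elements received from each rank, not the total, so the receive buffer on rank 0 must hold the per-rank count times the number of ranks. The sketch below is in Python only because it is quick to run. It assumes that jxp is the per-rank slab width, that chem0 is the send buffer and chem_0 the receive buffer, and that values are 8-byte reals (mpi_real8). The function name and the dimensions in the example call are hypothetical and are not taken from any actual RegCM configuration.

```python
def gather_sizes(iy, kz, ntr, jxp, nproc, bytes_per_value=8):
    """Return (per-rank count, root buffer count, root buffer bytes).

    The slab formula is copied from the mpi_gather call above;
    the interpretation of each dimension is an assumption.
    """
    # Number of (iy x jxp) slabs packed per rank, as in the call above.
    slabs = (ntr + 3) * kz + ntr * 7 + 7
    # Elements each rank sends: chem0 must hold at least this many.
    send_count = iy * slabs * jxp
    # Elements rank 0 receives in total: chem_0 must hold at least this many.
    root_count = send_count * nproc
    return send_count, root_count, root_count * bytes_per_value


# Hypothetical dimensions, for illustration only.
send_count, root_count, root_bytes = gather_sizes(
    iy=100, kz=18, ntr=10, jxp=8, nproc=16
)
print("per-rank elements:", send_count)
print("root elements:    ", root_count)
print("root buffer (MiB):", root_bytes / 2**20)
```

If the declared size of chem0 or chem_0 is smaller than these counts, the gather reads or writes past the end of the array. That would be one possible explanation for the abort, though not the only one.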
