[RegCNET] Error when running with MPI
chen
chenjh1213 at 163.com
Wed Dec 28 09:21:55 CET 2011
Dear Dr. Bi and all;
I added more variables to the CHE output files. The ./configure and make all steps completed without problems, but when I started running the MPI version of the model, an error occurred with the following output:
……
Opening new output file output //Test_ATM.1997110100.nc
Opening new output file output //Test_SRF.1997110100.nc
Opening new output file output //Test_RAD.1997110100.nc
Opening new output file output //Test_CHE.1997110100.nc
ATM variables written at 1997110100 0.00000000000
SRF variables written at 1997110100 2.00000000000
RAD variables written at 1997110100 2.00000000000
rank 0 in job 11 (sometimes 3 or 5) node12_35514 caused collective abort of all ranks
exit status of rank 0: killed by signal 9
&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&
I googled the error. It is said to be caused by limited memory or a wrong MPI version, but the unmodified model runs well, so I think something must be wrong with the MPI part (maybe the mpi_gather or another call). A possible reason is that the data transfer is wrong, but I am not skilled in parallel programming, so I hope to get your help and suggestions.
Thank you in advance!
PS:
I added two more 2D variables to the CHE output file.
I modified the mpi_gather call as shown below, and chem0 and chem_0 have been redefined as well:
call mpi_gather(chem0,iy*((ntr+3)*kz+ntr*7+7)*jxp, &
& mpi_real8,chem_0,iy*((ntr+3)*kz+ntr*7+7)*jxp, &
& mpi_real8,0,mpi_comm_world,ierr)
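
For reference, here is a minimal standalone sketch of the size check this gather call depends on: mpi_gather sends the given count from every rank, and the receive buffer on rank 0 must hold that count times the number of ranks. The array shapes, the program name, and all numeric sizes below are assumptions for illustration only, not the actual RegCM declarations or configuration.

```fortran
! Standalone check, no MPI needed. All sizes are hypothetical.
program check_gather_count
  implicit none
  ! Assumed example values; the real ones come from the model setup
  integer, parameter :: iy = 34, kz = 18, jxp = 4, ntr = 10, nproc = 8
  integer :: nfld, sendcount
  real(kind=8), allocatable :: chem0(:,:,:), chem_0(:,:,:)

  ! Size of the packed second dimension, as in the mpi_gather call above
  nfld = (ntr+3)*kz + ntr*7 + 7
  sendcount = iy*nfld*jxp

  ! Assumed layout: chem0(iy,nfld,jxp) on each rank,
  ! chem_0(iy,nfld,jxp*nproc) on rank 0
  allocate(chem0(iy,nfld,jxp))
  allocate(chem_0(iy,nfld,jxp*nproc))

  ! The send buffer must contain exactly the count passed to mpi_gather
  if (size(chem0) /= sendcount) then
    print *, 'send buffer mismatch: ', size(chem0), sendcount
    stop 1
  end if

  ! The receive buffer must hold one block of sendcount values per rank
  if (size(chem_0) < sendcount*nproc) then
    print *, 'receive buffer too small: ', size(chem_0), sendcount*nproc
    stop 1
  end if

  print *, 'buffer sizes are consistent with the gather count'
end program check_gather_count
```

If the declared second dimension of chem0 or chem_0 is smaller than the value used in the count, the gather reads or writes past the end of the array, which could be one cause of a crash like the one above.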