[RegCNET] mpirun error with pgi 7.1 and AMD Opteron x86-64
Paulo Ricardo TEIXEIRA-SILVA
paulo.ricardo.gnu at gmail.com
Wed Sep 10 21:43:24 CEST 2008
Hello,
>
> While using NPROC=16 on an AMD Opteron dual-core x86-64 system with the PGI
> compiler 7.1, I am having problems running regcm.
> My grid has iy=232 and ix=160, with ds=40.0.
>
> *When I type:*
> nohup mpirun -np 16 -machinefile machine.cluster -nolocal regcm >
> log.out.txt &
>
> *the run does not complete.
> regcm runs the first 56 days, then exits with this error message:*
> ...
> BATS variables written at 2005022509 180.0000000000000
> at day = 55.4156, ktau = 53200 : 1st, 2nd time deriv of ps =
> 0.13552E-04 0.93734E-07, no. of points w/convection = process 3
> of 16
> p3_26190: p4_error: interrupt SIGFPE: 8
> process 8 of 16
> process 4 of 16
> p4_23892: p4_error: interrupt SIGx: 13
> process 12 of 16
> process 2 of 16
> p2_25778: p4_error: net_recv read: probable EOF on socket: 1
> process 10 of 16
> process 14 of 16
> rm_l_2_25789: (36773.972656) net_send: could not write to fd=5, errno = 32
> process 6 of 16
> process 9 of 16
> p9_30291: p4_error: net_recv read: probable EOF on socket: 1
> process 13 of 16
> rm_l_9_30302: (36773.339844) net_send: could not write to fd=5, errno = 32
> process 5 of 16
> process 1 of 16
>
> ...
> Writing rad fields at ktau = 53760 2005022600
> SAVTMP RESTART WRITTEN: idatex= 2005022600 ktau= 53760
> /bin/rm -f SAVTMP.2005022400
> BCs are ready from 2005022600 to 2005022606
> rm_l_3_26205: (36773.968750) net_send: could not write to fd=5, errno = 32
> process 11 of 16
> process 7 of 16
> process 15 of 16
> p4_23892: (36789.746094) net_send: could not write to fd=5, errno = 32
> p2_25778: (36797.988281) net_send: could not write to fd=5, errno = 32
> p9_30291: (36799.359375) net_send: could not write to fd=5, errno = 32
>
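To see which process failed first in a log like the one above, the `p4_error` lines can be pulled out in order. The following is only a minimal sketch: the excerpt is copied from the log above, the helper name `p4_errors` is hypothetical, and the order of lines in interleaved MPI output is not guaranteed to match the order of events. On Linux, signal 8 is SIGFPE (a floating-point exception) and signal 13 is SIGPIPE, while errno 32 is EPIPE, so the socket errors on the other ranks are likely a consequence of process 3 dying rather than separate problems.

```python
import re

# Excerpt copied from the log above; a real run would read log.out.txt instead.
LOG_EXCERPT = """\
p3_26190: p4_error: interrupt SIGFPE: 8
p4_23892: p4_error: interrupt SIGx: 13
p2_25778: p4_error: net_recv read: probable EOF on socket: 1
p9_30291: p4_error: net_recv read: probable EOF on socket: 1
"""

# Matches "p<rank>_<pid>: p4_error: <reason>: <code>", with an optional "> " quote prefix.
P4_ERROR = re.compile(r"^(?:>\s*)?p(\d+)_(\d+): p4_error: (.+?): (\d+)\s*$")


def p4_errors(text):
    """Return (rank, pid, reason, code) tuples in the order they appear."""
    found = []
    for line in text.splitlines():
        m = P4_ERROR.match(line.strip())
        if m:
            found.append((int(m.group(1)), int(m.group(2)), m.group(3), int(m.group(4))))
    return found


errors = p4_errors(LOG_EXCERPT)
first = errors[0]
print(f"first p4_error: rank {first[0]}, reason '{first[2]}', code {first[3]}")
```

If the first reported error is a SIGFPE, the place to look is the numerical state of the model on that rank, not the network.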
> *My regcm.param2 is:*
> INTEGER IX
> INTEGER NPROC
> INTEGER MJX
> INTEGER KX
> INTEGER NSG
> INTEGER NNSG
> INTEGER IBYTE
> INTEGER JXP
> CHARACTER*5 DATTYP
> CHARACTER*4 LSMTYP
> CHARACTER*7 AERTYP
> integer jxbb
> parameter(IX = 232)
> parameter(NPROC = 16)
> parameter(MJX = 160)
> parameter(JXP = MJX/NPROC)
> parameter(KX = 18)
> parameter(NSG = 1)
> parameter(NNSG = 1)
> parameter(IBYTE = 4)
> parameter(DATTYP='NNRP1')
> parameter(LSMTYP='BATS')
> parameter(AERTYP='AER00D0')
> parameter(jxbb=mjx-1)
>
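As a quick sanity check on the listing above, the line `parameter(JXP = MJX/NPROC)` uses Fortran integer division, which truncates. The sketch below mirrors that arithmetic in Python with the values from the listing. The function name `decomposition` is hypothetical, and whether the model strictly requires MJX to be an exact multiple of NPROC is an assumption here, not something stated in this message.

```python
# Values copied from the regcm.param2 listing above.
MJX = 160
NPROC = 16


def decomposition(mjx, nproc):
    """Return (jxp, remainder) for a domain of mjx points split over nproc processes.

    jxp mirrors Fortran's truncating integer division in JXP = MJX/NPROC;
    remainder is the number of points that the truncation would drop.
    """
    return mjx // nproc, mjx % nproc


jxp, remainder = decomposition(MJX, NPROC)
print(f"JXP = {jxp}, remainder = {remainder}")
```

With these values the split is exact (10 points per process, nothing left over), so the decomposition itself does not look like the cause of the crash.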
>
>
> Can somebody suggest how to overcome this, please?
>
>
> P.S.: The problem may be related to the "no. of points w/convection" output.
> Regards,
>
> Paulo Ricardo Teixeira
>
> #########################################################################
>
> CV (Lattes curriculum):
> http://buscatextual.cnpq.br/buscatextual/visualizacv.jsp?id=K4705902T0
> or at this link: http://lattes.cnpq.br/8914320939610393
> Paulo Ricardo Teixeira da Silva
> Deputy Director of Academic and Scientific Affairs, UNEMET
> MSc in Meteorology - Solar Radiation / Solar Radiation Modeling
> (Land Surface Processes)
>
> Fellow/Researcher at NMA/LBA/INPA
> Instituto Nacional de Pesquisas da Amazônia - INPA
> Phone: +55 92 3643-3623
> Fax: +55 92 3643 3625
> Av. André Araújo, 2936 - Campus II
> District: Aleixo - P.O. Box 478 / Postal code 69060-001
> Manaus/Amazonas
>
>
> Linux Counter since 2001-11-22
> N_LinuxCounter : #246599
>
> #########################################################################
>