Dear Ufuk,
Here are the steps you should follow.
I am assuming that you are using the PGI compiler and MPICH2 under Linux.
First, create a file called "hostlist". In that file, list the host name once for each CPU, one name per line. For example, if you have 24 CPUs and the name of the host is "uybhm":
>> vi hostlist
uybhm
uybhm
uybhm
.....
.....
.....
uybhm
That makes 24 lines in total, since you have 24 CPUs. Then save the file.
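If typing the same name 24 times is tedious, the file can also be generated with a short shell loop. This is only a sketch: the host name "uybhm" and the count of 24 come from the example above, and the variable names HOST and NCPU are made up for illustration.

```shell
# Hypothetical helper values; adjust them to your own machine.
HOST=uybhm   # host name from the example above
NCPU=24      # one line per CPU

# Start with an empty file, then append the host name NCPU times.
: > hostlist
i=0
while [ "$i" -lt "$NCPU" ]; do
    echo "$HOST" >> hostlist
    i=$((i + 1))
done

# Print the number of lines so it can be compared with NCPU.
wc -l < hostlist
```

If the printed number does not match the value you pass to "mpirun -np", fix the file before starting the run.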
Second:
Issue the following command at the prompt in the relevant directory:
>> mpirun -np 24 -machinefile hostlist regcm <regcm.in
or, to time the run and send the output to a log file in the background:
>> time mpirun -np 24 -machinefile hostlist regcm < regcm.in > log.txt &
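Once the job is running, one way to check that the expected number of processes really started on a host is to count them by name. The following is a minimal sketch, not part of RegCM: the function name count_procs and the variable names are made up for illustration, and it assumes the executable is called "regcm" as in the commands above.

```shell
# Count processes on this host whose command name is exactly $1.
count_procs() {
    # "ps -e -o comm=" prints one command name per line, without a header.
    # grep -c exits non-zero when the count is 0, hence the "|| true".
    ps -e -o comm= | grep -c -x "$1" || true
}

EXPECTED=24                    # the value passed to "mpirun -np"
RUNNING=$(count_procs regcm)   # "regcm" is the executable name used above

if [ "$RUNNING" -eq "$EXPECTED" ]; then
    echo "OK: $RUNNING regcm processes running"
else
    echo "Mismatch: expected $EXPECTED, found $RUNNING"
fi
```

This only counts processes on the host where it is run. If the job is spread over several nodes, you would run it on each node and add up the counts.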
This assumes that at the MAIN step (Step 4) you issued the following commands successfully:
>> cd RegCM/Main
>> ln -sf 0options/0_NODIAG_PARALLEL_CODE MAKECODE
>> make clean
>> ./MAKECODE
>> make
If you need to see how the Makefile is supposed to look at the MAIN step, please let me know.
Good luck,
Mustafa COŞKUN
________________________________
Dr. Mustafa COSKUN
Turkish State Meteorological Service (TSMS)
Department of Research and Data Processing
Research Division
Climate Change & Variability Group
06120 Kalaba/ANKARA/TURKIYE
Phone : +90 312 302 26 81
Fax : +90 312 361 20 40
e-mail: mcoskun@meteoroloji.gov.tr
-----Original Message-----
From: regcnet-bounces@lists.ictp.it on behalf of [BE] Ufuk Utku Turuncoglu
Sent: Thu 8/9/2007 18:14
To: XUNQIANG BI
Cc: regcnet@lists.ictp.it
Subject: Re: [RegCNET] strange behaviour in RegCM paralell configuration
I have already followed these steps and the problem still exists. Thanks for
your help.
ufuk
XUNQIANG BI wrote:
>
> Hi, Ufuk:
>
> You are using the Beowulf cluster, right?
>
> Are you sure that what you did is about the same as
>
> http://www.ictp.trieste.it/~pubregcm/RegCM3/faq/parallel_1.txt
>
>
> On Thu, 9 Aug 2007, [BE] Ufuk Utku Turuncoglu wrote:
>
>> Hi,
>>
>> I am trying to run RegCM in parallel mode, but when I submit the job to the
>> cluster, the wrong number of processes is spawned on each node.
>>
>> For example, I define the total number of CPUs as 24 and I am using 8-CPU
>> nodes, so I am using 3 nodes to create 24 processes. When I check the
>> number of processes on each node and count them, it is not exactly 24.
>> There are fewer processes than I defined in domain.param. I have checked
>> the whole configuration again and again and could not find any problem.
>>
>> I have already run the model successfully on 24 CPUs using different input
>> data (NCEP). But in this case (using ECHAM data) it is not running.
>> I once faced the same problem with the NCEP case, but after reinstalling
>> the model code it was solved, and I could not find the bug. Is it
>> possible that the input data could generate the error?
>
> For ECHAM data preprocessing, I guess you added some lines to ICBC.f
> (you are not using RegCM3_for_EH5OM_3.tar.gz, right?).
> Does your ICBC file work with the serial code?
>>
>> Also, in the buggy case, when I submit the job, each of the processes runs
>> like a single, independent job and writes its information to regcm.out
>> separately. It means the single-processor version of RegCM runs on the 24 CPUs.
>
> I guess that you are not submitting the parallel job in the right way.
>
> The command I used is:
>
> mpirun -np 24 ./regcm
>
> I suggest that after you have correctly linked fort.10, fort.101, ....
> to the ICBC files, you use the above line (instead of
> regcm.x) to submit the parallel job.
>
> Regards,
> Xunqiang Bi
_______________________________________________
RegCNET mailing list
RegCNET@lists.ictp.it
https://lists.ictp.it/mailman/listinfo/regcnet