Condor currently supports MPICH versions 1.2.2, 1.2.3, and 1.2.4 using the ch_p4 device; MPICH version 1.2.5 is not supported. These supported implementations are offered by Argonne National Laboratory as a free download. See the web page at http://www-unix.mcs.anl.gov/mpi/mpich/ for details and availability. Programs to be submitted for execution under Condor must have been compiled using mpicc. No further compilation or linking is necessary to run jobs under Condor.
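For example, a single-source program might be compiled with a command such as the following. The path to mpicc and the file names are illustrative; they depend on where MPICH is installed at your site:

% /usr/local/bin/mpicc -o mpi_program mpi_program.c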
Administratively, Condor must be configured such that resources (machines) running MPI jobs are dedicated. Dedicated machines are ones that, once they begin execution of a program, will continue executing the program until the program ends; the program will not be preempted (to run another program) or suspended. Since Condor is not ordinarily used in this manner (Condor ordinarily uses opportunistic scheduling), machines that are to be used as dedicated resources must be configured as such. Section 3.10.10 of the Administrator's Manual describes the necessary configuration and provides detailed examples.
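As a minimal sketch of the kind of configuration Section 3.10.10 describes (the full host name below is a placeholder; consult that section for the authoritative settings), the local configuration of each dedicated machine names the machine running the dedicated scheduler and disables suspension and preemption:

# name the single dedicated scheduler for this pool
DedicatedScheduler = "DedicatedScheduler@full.host.name"
STARTD_EXPRS = $(STARTD_EXPRS), DedicatedScheduler

# never suspend or preempt jobs from the dedicated scheduler
START   = True
SUSPEND = False
PREEMPT = False
KILL    = False
RANK    = Scheduler =?= $(DedicatedScheduler)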
To simplify the scheduling of dedicated resources, a single machine serves as the scheduler of dedicated resources. This leads to a further restriction: jobs submitted to execute under the MPI universe (with dedicated machines) must be submitted from the machine running as the dedicated scheduler.
Once the programs are written and compiled, and Condor resources are correctly configured, jobs may be submitted. Each Condor job requires a submit description file. The simplest submit description file for an MPI job:
#############################################
## submit description file for mpi_program
#############################################
universe = MPI
executable = mpi_program
machine_count = 4
queue
This job specifies the universe as mpi, letting Condor know that dedicated resources are required. The machine_count command identifies the number of machines required by the job. Since no platform is specified as a requirement, the four machines that run the program default to the same architecture and operating system as the machine from which the job is submitted.
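To run on a different platform instead, add a requirements expression naming the desired architecture and operating system. A sketch, assuming the target machines are Intel machines running Linux:

universe = MPI
executable = mpi_program
requirements = (Arch == "INTEL") && (OpSys == "LINUX")
machine_count = 4
queue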
The first example also specifies neither input nor output, meaning that the computation completed is useless: input comes from and output goes to /dev/null. A more complex example of a submit description file uses other features.
######################################
## MPI example submit description file
######################################
universe = MPI
executable = simplempi
log = logfile
input = infile.$(NODE)
output = outfile.$(NODE)
error = errfile.$(NODE)
machine_count = 4
queue
The specification of the input, output, and error files uses a predefined macro that is relevant only to mpi universe jobs. See the condor_submit manual page for further description of predefined macros. The $(NODE) macro is given a unique value as programs are assigned to machines. This value is what the MPICH ch_p4 implementation terms the rank of a program. Note that this term is unrelated to, and independent of, the Condor term rank. The $(NODE) value is fixed for the entire length of the job, and it can therefore be used to identify individual aspects of the computation. In this example, it is used to give unique names to the input and output files.
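With machine_count = 4, the macro therefore expands so that each of the four programs reads and writes its own set of files:

node 0:  input = infile.0   output = outfile.0   error = errfile.0
node 1:  input = infile.1   output = outfile.1   error = errfile.1
node 2:  input = infile.2   output = outfile.2   error = errfile.2
node 3:  input = infile.3   output = outfile.3   error = errfile.3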
If your site does NOT have a shared file system across all the nodes where your MPI computation will execute, you can use Condor's file transfer mechanism. More details about these settings appear in the condor_submit manual page and in section 2.5.4. Assuming your job reads input only from stdin, here is an example submit description file for a site without a shared file system:
######################################
## MPI example submit description file
## without using a shared filesystem
######################################
universe = MPI
executable = simplempi
log = logfile
input = infile.$(NODE)
output = outfile.$(NODE)
error = errfile.$(NODE)
machine_count = 4
should_transfer_files = yes
when_to_transfer_output = on_exit
queue
Consider the following C program that uses this example submit description file.
/**************
 * simplempi.c
 **************/
#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int  myid;
    char line[128];

    MPI_Init(&argc, &argv);
    /* myid receives this program's rank, the same value
       Condor assigns to the $(NODE) macro */
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);

    fprintf(stdout, "Printing to stdout...%d\n", myid);
    fprintf(stderr, "Printing to stderr...%d\n", myid);

    /* read one line from the input file named in the
       submit description file */
    fgets(line, 128, stdin);
    fprintf(stdout, "From stdin: %s", line);

    MPI_Finalize();
    return 0;
}
Here is a Makefile that works with the example. It builds the MPI executable using the MPICH ch_p4 implementation.
###################################################################
## This is a very basic Makefile                                 ##
###################################################################

# the location of the MPICH compiler
CC      = /usr/local/bin/mpicc
CLINKER = $(CC)
CFLAGS  = -g

EXECS   = simplempi

all: $(EXECS)

simplempi: simplempi.o
	$(CLINKER) -o simplempi simplempi.o -lm

.c.o:
	$(CC) $(CFLAGS) -c $*.c
The submission to Condor requires exactly four machines and queues four programs. Each of these programs requires a correctly named input file and produces an output file.
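Putting the pieces together: assuming the submit description file above is saved as simplempi.submit (the file name is illustrative), the job is built and then submitted from the machine acting as the dedicated scheduler:

% make
% condor_submit simplempi.submit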
If the input file for $(NODE) = 0 (called infile.0) contains

Hello number zero.

and the input file for $(NODE) = 1 (called infile.1) contains

Hello number one.

then after the job has been submitted to Condor and has run, eight files will have been created: errfile.[0-3] and outfile.[0-3]. outfile.0 will contain

Printing to stdout...0
From stdin: Hello number zero.

and errfile.0 will contain

Printing to stderr...0