The ROCK Linux project was discontinued in 2010. Here is the old data for the historical record!


Building on a Beowulf Cluster

Building ROCK 1.7 on a Beowulf Cluster

1. Basics

I'm assuming you have read the BUILD file and know how to make a 'normal' build of ROCK Linux. I'm also assuming that you know how to use a Linux cluster (since you are reading this, you might have one). I'm now going to explain how to build ROCK Linux on a cluster. The techniques described here can also be used to build ROCK Linux on an SMP machine to get the best performance out of all CPUs.

ROCK Linux can be built on a simple cluster of workstations connected by a normal LAN (Ethernet, etc.). No low-latency or high-bandwidth network is needed to build ROCK Linux on a cluster with good performance.

ROCK Linux has its own job scheduler to distribute jobs over the cluster nodes, but you can also use any job scheduler already installed on your cluster.

2. Amdahl's law

In a famous paper, Amdahl observed that one must consider an entire application when considering the level of available parallelism. If only one percent of a problem fails to parallelize, then no matter how much parallelism is available for the rest, the problem can never be solved more than one hundred times faster than in the sequential case.

Almost every package in ROCK Linux depends on a few very basic packages like the C library, the C compiler and the shell. So it's not possible to make use of the power of your cluster in the early phase of the build, where these essential packages are built. Later in the build there are almost always several packages which can be built in parallel (more than 100 packages is very common after the base packages have been built).

  ----+----------------------------------------------------------------------+
  181 |                                     ::::.                            |
      |                                   .:::::::.                          |
    P |                              .::::::::::::::                         |
    a |                             .::::::::::::::::.                       |
    r |                           :::::::::::::::::::::.                     |
    a |                        ..::::::::::::::::::::::::.                   |
    l |              .  ..  ...::::::::::::::::::::::::::::                  |
    l |             ::::::::::::::::::::::::::::::::::::::::.                |
    e |             ::::::::::::::::::::::::::::::::::::::::::.              |
    l |             ::::::::::::::::::::::::::::::::::::::::::::.            |
      |            .::::::::::::::::::::::::::::::::::::::::::::::           |
    J |            ::::::::::::::::::::::::::::::::::::::::::::::::.         |
    o |            ::::::::::::::::::::::::::::::::::::::::::::::::::.       |
    b |            ::::::::::::::::::::::::::::::::::::::::::::::::::::.     |
    s |          ::::::::::::::::::::::::::::::::::::::::::::::::::::::::.   |
      |       :.::::::::::::::::::::::::::::::::::::::::::::::::::::::::::.  |
    1 |...::..::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::.|
  ----+----------------------------------------------------------------------+
      | 1                  Number of Jobs build so far                   424 |

You can see that the build doesn't parallelize very well in the early phase but soon reaches a state where over 100 jobs can be built at the same time. It is normal that the number of available jobs goes down on the right side of the graph: when, e.g., 400 of 424 jobs are already built, only 24 jobs are left, so it's no longer possible to have 100 parallel jobs.
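To make Amdahl's law from this section more concrete, here is a minimal sketch that computes the upper bound 1 / (s + (1 - s) / n) for a serial fraction s and n workers. The function name 'amdahl_speedup' is my own, and the 1% serial fraction is just the figure from the example above, not a measurement of a real ROCK Linux build.

```shell
#!/bin/sh
# amdahl_speedup SERIAL_FRACTION WORKERS
# Prints the upper bound on the speedup, 1 / (s + (1 - s) / n),
# rounded to two decimals. (Hypothetical helper, not part of ROCK Linux.)
amdahl_speedup() {
    awk -v s="$1" -v n="$2" 'BEGIN { printf "%.2f\n", 1 / (s + (1 - s) / n) }'
}

# Illustrative only: a 1% serial part, as in the example in the text.
for n in 1 2 8 100 1000; do
    echo "$n workers: at most $(amdahl_speedup 0.01 "$n")x faster"
done
```

However many workers you add, the bound never reaches 100 as long as 1% of the work stays sequential. In a ROCK Linux build the sequential share is the early phase with the base packages, so a realistic bound depends on how long that phase takes on your hardware.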

  -----+--------------------------------------------------------------------+
     8 |     :    :::                                                       |
       |     :.  ::::.                                                      |
       |   ..::  :::::                                                      |
       |   ::::..:::::.                                                     |
     1 |::::::::::::::::::                                                  |
  -----+--------------------------------------------------------------------+
     2 |    ::::::::::::::::::::::::::::::::                                |
       |  ::::::::::::::::::::::::::::::::::                                |
       |.:::::::::::::::::::::::::::::::::::                                |
       |::::::::::::::::::::::::::::::::::::                                |
     1 |::::::::::::::::::::::::::::::::::::                                |
  -----+--------------------------------------------------------------------+
     1 |::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::|
       |::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::|
       |::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::|
       |::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::|
     1 |::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::|
  -----+--------------------------------------------------------------------+
  Jobs | 00:00                       Time                             14:41 |

If you have 'gnuplot' installed and $DISPLAY set, you can also pass the option '-x11' to ./scripts/Create-ParaSim so that it uses 'gnuplot' to graph the results. A screenshot of the '-x11' mode of ./scripts/Create-ParaSim can be found here.

3. Setting up the master

Extract the ROCK Linux source somewhere and export this directory read-write to all nodes using NFS. In many cases there will already be a directory on your cluster which is shared between all nodes (e.g. /home). I will assume the directory name /home/rock-master in this document.

Configure your build as usual and enable the config option 'Make a parallel (cluster) build'. The config option 'Maximum size of job queue' should have a value higher than the maximum number of jobs which will be built on your cluster. Set this config option to '0' (unlimited) when building on a big cluster. The option 'Command for adding jobs' is explained in section 6 (Building with an external job scheduler) and can be left blank if you are using the built-in job scheduler.

4. Setting up the nodes

The following has to be done on every node. If you have many nodes in your cluster, you might want to use 'prsh' from https://www.cacr.caltech.edu/beowulf/ to perform these steps on all nodes.

You need to create a local build directory on every cluster node (building the packages on the NFS share would cost too much performance). In many cases there will already be a directory on the cluster for this (e.g. /scratch). I will assume the directory name /scratch/rock-node in this document.

    # mkdir -p /scratch/rock-node
    # cd /home/rock-master
    # ./scripts/Create-Links -config -build /scratch/rock-node
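If you don't have 'prsh' at hand, a plain ssh loop can do the same thing. The following is only a sketch: the node names are made up, it assumes you can log in to every node as root with ssh, and the function names are my own. By default it only prints what it would run; set DRY_RUN=0 to really execute the commands.

```shell
#!/bin/sh
# Hypothetical node names; replace them with the hosts of your own cluster.
NODES="${NODES:-node01 node02 node03}"
MASTER_DIR=/home/rock-master   # NFS-shared source tree (name used in this document)
NODE_DIR=/scratch/rock-node    # local build directory (name used in this document)

# Print the command line one node has to run (the three steps shown above).
node_setup_commands() {
    printf 'mkdir -p %s && cd %s && ./scripts/Create-Links -config -build %s\n' \
        "$NODE_DIR" "$MASTER_DIR" "$NODE_DIR"
}

# Run the setup on all nodes; with DRY_RUN=1 (the default) only show it.
setup_all_nodes() {
    for node in $NODES; do
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "ssh $node '$(node_setup_commands)'"
        else
            ssh "$node" "$(node_setup_commands)" || echo "setup failed on $node" >&2
        fi
    done
}

setup_all_nodes
```

Usage, assuming you saved it under the (arbitrary) name setup-nodes.sh: run 'sh setup-nodes.sh' first to check the printed commands, then 'DRY_RUN=0 sh setup-nodes.sh' to apply them.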

5. Building with the built-in job scheduler

Run './scripts/Build-Target' in /home/rock-master on the master. Instead of building the packages, the master will create a job queue and add to it those packages which can be built next.

Run './scripts/Build-Job -daemon' in /scratch/rock-node on the nodes. Again, you might want to use 'prsh' to do this on all nodes. If you want to build multiple packages in parallel on one cluster node (e.g. because it has two CPUs), you need to run './scripts/Build-Job -daemon' once for every job you want to run on that node at the same time.

   18:41 2002-05-08:   --- current status ---
   Build-Job (daemon mode)       running on node01 with PID 18452
   Build-Job (daemon mode)       running on node02 with PID 18665
   Build-Job (daemon mode)       running on node03 with PID 19618
   Job 3-kdenetwork              node02 (18665) since 18:32 2002-5-08
   Job 3-kdeutils                node03 (19618) since 18:41 2002-5-08
   Job 3-kdevelop                node01 (18452) since 18:30 2002-5-08
   Job 3-kdebindings             waiting in the job queue (priority 2)
   Job 3-kdeadmin                waiting in the job queue (priority 1)
   Job 3-kde-i18n-fr             waiting in the job queue (priority 1)
   Job 3-kde-i18n-es             waiting in the job queue (priority 1)
   Job 3-kde-i18n-de             waiting in the job queue (priority 1)
   Job 3-kdeartwork              waiting in the job queue (priority 0)
   Job 3-kdeaddons               waiting in the job queue (priority 0)
   18:41 2002-05-08:   ----------------------

6. Building with an external job scheduler

Let's say the command for adding jobs in your job scheduler is 'addjob' and has only one parameter, which is the command to execute. You would set the config option 'Command for adding jobs' to the value

    addjob 'cd /scratch/rock-node ; {}'

The text {} will automatically be replaced with the Build-Job invocation for building the package, which always has the form:

    ./scripts/Build-Job -cfg <config-name> <stagelevel>-<package-name>

So if you want to do some intelligent job scheduling (e.g. build large packages on a faster node), you can also pass {} to another script, which then has the whole command in $*, the config name in $3 and the stagelevel and package name in $4.

If not all jobs can be executed, the job scheduler should prefer those jobs which have been submitted first. This is important to make sure that it is always possible to build multiple packages in parallel.

Note that './scripts/Build-Job -daemon' does not work if the 'Command for adding jobs' config option is set. The './scripts/Create-ParaStatus' script works as usual.
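Such a wrapper script could look like the sketch below. Everything specific in it is an assumption for illustration: the script name 'dispatch-job', the queue names 'fast' and 'default', the '-q' option of the placeholder scheduler 'addjob', and the list of packages considered large. You would set 'Command for adding jobs' to something like '/home/rock-master/dispatch-job {}'. As written it only prints the addjob command; set ADDJOB to your real scheduler command to submit jobs.

```shell
#!/bin/sh
# Hypothetical wrapper around the placeholder scheduler command 'addjob'.
# It is called as:  dispatch-job ./scripts/Build-Job -cfg <config> <stage>-<package>
# so $1 is ./scripts/Build-Job, $2 is -cfg, $3 the config name, $4 the job.

# By default only print the command instead of submitting it.
ADDJOB="${ADDJOB:-echo addjob}"

# Pick a queue from "<stagelevel>-<package-name>".
# The package patterns and queue names are examples only.
choose_queue() {
    package=${1#*-}             # strip the "<stagelevel>-" prefix
    case "$package" in
        kde*|gcc*|glibc*) echo fast ;;
        *)                echo default ;;
    esac
}

dispatch() {
    config=$3                   # available if you want per-config decisions
    job=$4
    queue=$(choose_queue "$job")
    # '-q' is an assumed option for selecting a queue in 'addjob'.
    $ADDJOB -q "$queue" "cd /scratch/rock-node ; $*"
}

if [ $# -ge 4 ]; then
    dispatch "$@"
fi
```

Because the stagelevel is separated from the package name by the first '-', package names which contain a '-' themselves (like kde-i18n-fr) are still handled correctly. The wrapper submits jobs in the order in which it is called, so it is still up to the scheduler to prefer the jobs submitted first.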

(by Clifford Wolf)
