PPC G4 Macintosh Cluster Howto

  1. Obtain G4 machines
  2. Obtain Black Lab Linux, Yellow Dog Linux, or another favorite PPC Linux distribution.
  3. Install Linux on the 'server' machine.
  4. Create the computation node master image:

  5. Once the server is running, and before doing any server-specific configuration, create the 'node image' subdirectory tree, which will be used as the 'master image' for all of the computation nodes.
    1. This can be accomplished with something like:
    2. mkdir /home/admin/node_image
      tar -cpf - --exclude="/proc" --exclude="/home" / | (cd /home/admin/node_image && tar -xpvf - )
      This creates a directory to hold the node image and copies the server's files into it, piping them through tar to preserve file attributes.
    3. NOTE: the '--exclude' options are rather important. (It is assumed /home will be NFS mounted from the server onto the nodes.) Additional '--exclude' options can also be added.
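As a sketch of the copy step above, the tar pipe can be wrapped in a small shell function. All paths and the exact exclude list here are examples, and GNU tar is assumed:

```shell
# copy_node_image SRC DEST -- copy a running system tree into a master image.
# Sketch only: the exclude patterns and paths are examples; GNU tar is assumed.
copy_node_image() {
    src=$1; dest=$2
    mkdir -p "$dest"
    # -p preserves permissions; -C enters SRC so member names stay relative,
    # and the excludes keep /proc and the NFS-mounted /home out of the image.
    tar -cpf - --exclude="./proc" --exclude="./home" -C "$src" . \
        | (cd "$dest" && tar -xpf -)
}
```

Running `copy_node_image / /home/admin/node_image` as root reproduces the command shown above.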
  6. Configure the server
    1. install any packages to be installed only on the server
    2. Configure a second network interface on the server. (This is not absolutely necessary, but it makes management and security much easier.)

    3. It is easiest to configure the second interface as the 'real-world' interface, and the built-in Ethernet as the interface to the cluster. (This way all the cluster machines use the same type of network card for internal communication.)
      Configuration files for Ethernet cards are found in:
    4. Configure DHCP server for netbooting (here is an example /etc/dhcpd.conf file)
      Make sure to enter the MAC addresses for your machines.
      NOTE: Newer revisions of Open Firmware require a patched version of DHCPd. Some help can be found here.
    5. Enable tftp in /etc/inetd.conf:
    6. # TFTP server
      tftp    dgram   udp     wait    root    /usr/sbin/tcpd  in.tftpd
    7. Hostnames should be kept in /etc/hosts. Make sure this file is current on every node after changing it.
    8. Create an auto-installation ramdisk ( make-ramdisk.sh, install-ramdisk.tar.gz ). To get some help look here.
    9. Set up /tftpboot on the server with the install ramdisk (romfs.gz, vmlinux, yaboot, and yaboot.conf)
    10. Configure NFS (example /etc/exports )
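The example dhcpd.conf and /etc/exports files referenced above were separate documents; a minimal sketch of both, with all addresses, MAC values, and paths as placeholders, might look like:

```
## /etc/dhcpd.conf sketch (addresses and the MAC are placeholders)
subnet 192.168.1.0 netmask 255.255.255.0 {
        filename "yaboot";
}
host node1 {
        hardware ethernet 00:05:02:aa:bb:cc;
        fixed-address 192.168.1.101;
}

## /etc/exports sketch: node image read-only, /home read-write
/home/admin/node_image  192.168.1.0/255.255.255.0(ro,no_root_squash)
/home                   192.168.1.0/255.255.255.0(rw)
```

The filename directive hands netbooting machines the yaboot binary from /tftpboot, and one host block per node pins each MAC address to a fixed address.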
  7. Configure the 'node image'
    1. To be in the environment the nodes will run in, run the following command, which starts a new shell:
    2. chroot /home/admin/node_image
    3. remove any unneeded packages, such as X Window System programs
    4. configure fstab to nfs mount /home and network to use dhcpd (example /etc/sysconfig/network-scripts/ifcfg-eth0 , /etc/fstab )
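The two node-image files mentioned above might contain something like the following sketch (the server hostname is a placeholder):

```
## /etc/sysconfig/network-scripts/ifcfg-eth0 sketch: get the address via DHCP
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=dhcp

## /etc/fstab entry to NFS mount /home from the server ('server' is a placeholder)
server:/home    /home    nfs    defaults    0 0
```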
  8. Test out network booting
    1. Double-check the dhcpd.conf file for MAC addresses... if you network boot a Mac, it will *wipe out* whatever is currently on the machine. (I made the mistake of accidentally booting my iBook with the auto-install script, wiping out my install ;) )
    2. Booting the machines. There are two options:
      1. Power on a G4 with a USB keyboard connected and hold down the 'N' key.
        1. NOTE: if you are booting a machine without a monitor, the 'N' key is detected when the keyboard 'caps lock' light flashes, which occurs about 15 seconds after a reset or power-on. A few seconds after the light flashes you can stop holding down the key and disconnect the keyboard.
      2. For the first time, it may be easier to boot into Open Firmware to see extra debugging information, by holding down 'command-option-o-f'
        Then enter 'boot enet:0'
    3. If all goes well, the following will happen:
      1. Yaboot will load, and load the kernel and ramdisk.
      2. The kernel will start the auto-install script
      3. The drive will be formatted with pdisk
      4. The node image will be mounted via NFS and copied from the server to the machine.
      5. An HFS partition will be formatted and set up to boot yaboot
        1. The default boot path on Apple machines is 'hd:\\:tbxi', which means the first 'blessed' folder on an hfs partition, with a file of the HFS type 'tbxi'
        2. The auto-install script will format the HFS partition, create a 'linux' directory and 'bless' it, and then copy in yaboot, yaboot.conf, and a backup vmlinux kernel. It will then set the type of yaboot to 'tbxi'. (See setup.py in /bin of the install-ramdisk for more details.)
      6. The machine will reboot
      7. The machine will boot from the hard drive, load linux, and be ready to use.
    4. Supposing the first machine actually worked, now comes the fun part: booting the remaining machines while holding down the 'N' key
    5. Notes of interest:
      1. The patched version of DHCPd (see the note under the DHCP server configuration above) solves a problem where the computer being netbooted required additional parameters and was not receiving the initial boot file. If you think you may be having this problem, try booting into OF by holding down 'command-option-o-f'.
        At the prompt, type 'boot enet:0,yaboot' to specifically request yaboot.
      2. If a G4 is booted without a monitor attached, the video card will not be initialized. This power-saving feature can be annoying if you wish to attach a screen later.
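The HFS 'blessing' performed by the auto-install script (item 8.3.5 above) can also be reproduced by hand with the hfsutils tools. This is an illustrative transcript only, and the device name is a placeholder:

```
hmount /dev/hda2                # mount the HFS bootstrap partition (placeholder device)
hmkdir linux                    # create the boot folder
hattrib -b linux                # 'bless' the folder so Open Firmware looks inside it
hcopy yaboot :linux:            # copy yaboot into the blessed folder
hattrib -t tbxi :linux:yaboot   # set the HFS file type to 'tbxi'
humount
```

With the folder blessed and yaboot typed 'tbxi', the default 'hd:\\:tbxi' boot path finds and runs yaboot.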
  9. Information on building kernels for your cluster can be found in this small howto.
  10. SSH configuration
    1. In order to log in to the various nodes without password authentication, each user should run 'ssh-keygen' and use an empty passphrase. Copy the contents of identity.pub to authorized_keys; both files are found in the .ssh directory in the user's home directory. This will be necessary for many parallel applications to function correctly.
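The per-user key setup above can be sketched as a small shell function. The key file names follow the document's SSH1-era layout; a modern OpenSSH would use id_rsa/id_rsa.pub instead:

```shell
# setup_cluster_keys DIR -- create a passphrase-less keypair in DIR and
# authorize it, as each user would do for ~/.ssh.  Sketch only: key names
# follow the SSH1-era layout; adjust for your SSH version.
setup_cluster_keys() {
    dir=$1
    mkdir -p "$dir" && chmod 700 "$dir"
    # -N "" gives an empty passphrase, so logins need no interactive prompt
    ssh-keygen -q -t rsa -N "" -f "$dir/identity"
    cat "$dir/identity.pub" >> "$dir/authorized_keys"
    chmod 600 "$dir/authorized_keys"
}
```

Each user runs `setup_cluster_keys "$HOME/.ssh"` once; since /home is NFS mounted on every node, all nodes then see the same key pair.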
  11. Myrinet configuration

  12. This is only needed if you plan on having a Myrinet network.
    1. General Procedure
      1. Get Myrinet cards.
      2. Get GM version 1.4
      3. Look at the Read Me files. Follow instructions.
    2. What We did.
      1. Ran configure like this "./configure --with-linux-smp --enable-new-features --disable-directcopy "
      2. Copied the gm-1.4/bin to /home/admin/node_image/usr/local/gm
      3. chroot /home/admin/node_image
      4. ./usr/local/gm/GM_INSTALL
      5. Then exit out of the chroot
      6. Created the start-up script gm and placed it in /etc/rc.d/init.d and in /home/admin/node_image/etc/rc.d/init.d
      7. Created the links to this file in the proper run levels. ( chkconfig --add gm )
  13. MPI configuration
    1. General Procedure
      1. Get MPICH version 1.2.5.
      2. Look at the Read Me files. Follow instructions.
    2. What We did.
      1. Changed the mpich.make file.
        1. set the GM home to our build directory of GM. "/home/admin/gm-1.4"
        2. changed -rsh=rsh to -rsh=ssh in the configure section.
      2. Ran the mpich.make
      3. To install, ran "./bin/mpiinstall -prefix=/home/admin/node_image/usr/local/mpich"
  14. PBS configuration
    1. General Procedure
      1. Get PBS version 2.2.
      2. Look at the Read Me files. Follow instructions.
    2. What We did.
      1. Untarred PBS at "/home/admin/node_image/usr/local/pbs"
      2. Ran configure like this "./configure --with-scp --set-default-server=server.fast"
      3. Ran "make"
      4. Then ran "make install"
      5. Created the start-up script PBS and placed it in /etc/rc.d/init.d and in /home/admin/node_image/etc/rc.d/init.d
      6. Created the links to this file in the proper run levels. ( chkconfig --add PBS )
      7. Added these files to mom_priv: config, epilogue, prologue
      8. Added this file to server_priv: nodes
      9. We configured our server's qmgr like this: PBSconfig

      10. How Maui and PBS work with each other: Maui and PBS Description
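The qmgr configuration, nodes file, and mom config referenced above were separate files; a minimal sketch, with queue, server, and node names as placeholders, might be:

```
## qmgr input sketch (queue and host names are placeholders)
create queue default
set queue default queue_type = Execution
set queue default enabled = True
set queue default started = True
set server default_queue = default
set server scheduling = True

## server_priv/nodes: one compute node per line
node1.fast
node2.fast

## mom_priv/config: tell each pbs_mom which host is the server
$clienthost server.fast
```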
  15. Maui configuration
    1. General Procedure
      1. Get Maui version 3.0.2.
      2. Look at the Read Me files. Follow instructions.
    2. What We did.
      1. Ran configure like this "./configure"
      2. Used the defaults for the options
      3. Ran "make"
      4. Copied over the files in maui/bin to /usr/local/bin
      5. Created the start-up script Maui and placed it in /etc/rc.d/init.d

      6. NOTE: Maui is only started on the server, which is why it is only placed in the server's start-up scripts.
      7. Created the links to this file in the proper run levels. ( chkconfig --add Maui )
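The start-up scripts created here and for gm and PBS all follow the same Red Hat SysV pattern; a minimal sketch for Maui (the daemon path is an example, and the chkconfig header line supplies the run levels and priorities that 'chkconfig --add' reads):

```
#!/bin/sh
# chkconfig: 345 95 05
# description: Maui cluster scheduler
case "$1" in
    start) /usr/local/bin/maui & ;;
    stop)  killall maui ;;
    *)     echo "Usage: $0 {start|stop}" ;;
esac
```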
  16. Have some MFLOPS