2.4. Installing and Configuring Red Hat Enterprise Linux

After setting up the basic cluster hardware, install Red Hat Enterprise Linux on each node and ensure that all systems recognize the connected devices. Follow these steps:

  1. Install Red Hat Enterprise Linux on all cluster nodes. Refer to the Red Hat Enterprise Linux Installation Guide for instructions.

    In addition, when installing Red Hat Enterprise Linux, it is strongly recommended to do the following:

    • Gather the IP addresses for the nodes and for the bonded Ethernet ports before installing Red Hat Enterprise Linux. Note that the IP addresses for the bonded Ethernet ports can be private IP addresses (for example, 10.x.x.x).

    • Do not place local file systems (such as /, /etc, /tmp, and /var) on shared disks or on the same SCSI bus as shared disks. This helps prevent the other cluster nodes from accidentally mounting these file systems, and also reserves the limited number of SCSI identification numbers on a bus for cluster disks.

    • Place /tmp and /var on different file systems. This may improve node performance.

    • When a node boots, be sure that the node detects the disk devices in the same order in which they were detected during the Red Hat Enterprise Linux installation. If the devices are not detected in the same order, the node may not boot.

    • When using certain RAID storage configured with Logical Unit Numbers (LUNs) greater than zero, it may be necessary to enable LUN support by adding the following to /etc/modprobe.conf:

      options scsi_mod max_scsi_luns=255

  2. Reboot the nodes.

  3. When using a terminal server, configure Red Hat Enterprise Linux to send console messages to the console port.

  4. Edit the /etc/hosts file on each cluster node and include the IP addresses used in the cluster or ensure that the addresses are in DNS. Refer to Section 2.4.1 Editing the /etc/hosts File for more information about performing this task.

  5. Decrease the alternate kernel boot timeout limit to reduce boot time for nodes. Refer to Section 2.4.2 Decreasing the Kernel Boot Timeout Limit for more information about performing this task.

  6. Ensure that no login (or getty) programs are associated with the serial ports that are being used for the remote power switch connection (if applicable). To perform this task, edit the /etc/inittab file and use a hash symbol (#) to comment out the entries that correspond to the serial ports used for the remote power switch. Then, invoke the init q command.

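  7. Verify that all systems detect all the installed hardware:

    • Use the dmesg command to display the console startup messages. Refer to Section 2.4.3 Displaying Console Startup Messages for more information about performing this task.

    • Use the cat /proc/devices command to display the devices configured in the kernel. Refer to Section 2.4.4 Displaying Devices Configured in the Kernel for more information about performing this task.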

  8. Verify that the nodes can communicate over all the network interfaces by using the ping command to send test packets from one node to another.

  9. If intending to configure Samba services, verify that the required RPM packages for Samba services are installed.
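
Steps 8 and 9 lend themselves to a small script. The following is a minimal sketch, not part of the product: the function names are hypothetical, and the host and package names passed to them are whatever applies to the cluster at hand. It relies only on ping -c 1 and rpm -q.

```shell
# Report which of the given hosts do not answer a single ping.
# Prints one unreachable host per line; prints nothing if all respond.
unreachable_nodes() {
    for host in "$@"; do
        if ! ping -c 1 "$host" >/dev/null 2>&1; then
            echo "$host"
        fi
    done
}

# Report which of the given RPM packages are not installed.
# Prints one missing package per line; prints nothing if all are present.
missing_packages() {
    for pkg in "$@"; do
        if ! rpm -q "$pkg" >/dev/null 2>&1; then
            echo "$pkg"
        fi
    done
}
```

For example, unreachable_nodes node1 node2 node3 could be run on each node, once with the addresses of each network interface, and missing_packages samba would check for a package named samba (the exact package list for Samba services is an assumption and should be taken from the product documentation).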

2.4.1. Editing the /etc/hosts File

The /etc/hosts file contains the IP address-to-hostname translation table. The /etc/hosts file on each node must contain entries for IP addresses and associated hostnames for all cluster nodes.

As an alternative to the /etc/hosts file, name services such as DNS or NIS can be used to define the host names used by a cluster. However, to limit the number of dependencies and optimize availability, it is strongly recommended to use the /etc/hosts file to define IP addresses for cluster network interfaces.

The following is an example of an /etc/hosts file on a node of a cluster that does not use DNS-assigned hostnames:

127.0.0.1         localhost.localdomain     localhost
192.168.1.81      node1.example.com         node1
192.168.1.82      node2.example.com         node2
192.168.1.83      node3.example.com         node3

The previous example shows the IP addresses and hostnames for three nodes (node1, node2, and node3).
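
Because every node must carry entries for all cluster nodes, it can help to check a hosts file for missing names. The sketch below is one possible approach, assuming a hypothetical helper named missing_host_entries; it takes the file path as its first argument so that it can be tried on a copy before being pointed at /etc/hosts.

```shell
# Print each name that has no entry in the given hosts file.
# $1: path to a hosts-format file; remaining arguments: names to look for.
missing_host_entries() {
    hosts_file=$1
    shift
    for name in "$@"; do
        # Skip comment lines, then compare the name against every
        # hostname or alias field (fields 2 and up) of each entry.
        if ! awk -v n="$name" '
            /^[[:space:]]*#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }
        ' "$hosts_file"; then
            echo "$name"
        fi
    done
}
```

Running missing_host_entries /etc/hosts node1 node2 node3 on each node prints nothing when all three names are present.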

Important

Do not assign the node hostname to the localhost (127.0.0.1) address, as this causes issues with the CMAN cluster management system.

Verify correct formatting of the local host entry in the /etc/hosts file to ensure that it does not include non-local systems in the entry for the local host. An example of an incorrect local host entry that includes a non-local system (server1) is shown next:

127.0.0.1     localhost.localdomain     localhost server1

An Ethernet connection may not operate properly if the format of the /etc/hosts file is not correct. Check the /etc/hosts file and correct the file format by removing non-local systems from the local host entry, if necessary.
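
The local host entry can be checked mechanically. The following sketch assumes, as in the examples above, that localhost and localhost.localdomain are the only names that belong on the 127.0.0.1 line; the function name extra_loopback_names is hypothetical.

```shell
# Print any name on the 127.0.0.1 line that is not a loopback name.
# $1: path to a hosts-format file. Prints nothing when the entry is clean.
extra_loopback_names() {
    awk '
        $1 == "127.0.0.1" {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break          # rest of the line is a comment
                if ($i != "localhost" && $i != "localhost.localdomain")
                    print $i
            }
        }
    ' "$1"
}
```

Applied to the incorrect entry shown above, the function prints server1; any output indicates a name that should be moved to its own entry.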

Note that each network adapter must be configured with the appropriate IP address and netmask.

The following example shows a portion of the output from the /sbin/ip addr list command on a cluster node:

2: eth0: <BROADCAST,MULTICAST,UP> mtu 1356 qdisc pfifo_fast qlen 1000
    link/ether 00:05:5d:9a:d8:91 brd ff:ff:ff:ff:ff:ff
    inet 10.11.4.31/22 brd 10.11.7.255 scope global eth0
    inet6 fe80::205:5dff:fe9a:d891/64 scope link
       valid_lft forever preferred_lft forever
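
To review the address and netmask of each adapter, the output can be reduced to one line per IPv4 address. The sketch below uses two hypothetical helpers: ipv4_addresses filters ip addr list output read from standard input, and prefix_to_netmask converts a prefix length such as the /22 in the example into dotted-decimal form.

```shell
# Read "ip addr list" output on standard input and print
# "<interface> <address>/<prefix>" for every IPv4 address.
ipv4_addresses() {
    awk '$1 == "inet" { print $NF, $2 }'
}

# Convert a prefix length (0 to 32) to a dotted-decimal netmask.
prefix_to_netmask() {
    prefix=$1
    mask=""
    for octet in 1 2 3 4; do
        # Each octet takes up to 8 of the remaining prefix bits.
        if [ "$prefix" -ge 8 ]; then
            bits=8
        else
            bits=$prefix
        fi
        prefix=$((prefix - bits))
        mask="${mask}${mask:+.}$((256 - (1 << (8 - bits))))"
    done
    echo "$mask"
}
```

For the example output, /sbin/ip addr list | ipv4_addresses would print eth0 10.11.4.31/22, and prefix_to_netmask 22 yields 255.255.252.0, which is consistent with the broadcast address 10.11.7.255 shown there.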

You may also add the IP addresses for the cluster nodes to your DNS server. Refer to the Red Hat Enterprise Linux System Administration Guide for information on configuring DNS, or consult your network administrator.

2.4.2. Decreasing the Kernel Boot Timeout Limit

It is possible to reduce the boot time for a node by decreasing the kernel boot timeout limit. During the Red Hat Enterprise Linux boot sequence, the boot loader allows for specifying an alternate kernel to boot. The default timeout limit for specifying a kernel is ten seconds.

To modify the kernel boot timeout limit for a node, edit the appropriate files as follows:

When using the GRUB boot loader, modify the timeout parameter in /boot/grub/grub.conf to specify the timeout in seconds. To set this interval to 3 seconds, edit the parameter as follows:

timeout = 3

When using the LILO or ELILO boot loaders, edit the /etc/lilo.conf file (on x86 systems) or the elilo.conf file (on Itanium systems) and specify the desired value (in tenths of a second) for the timeout parameter. The following example sets the timeout limit to three seconds:

timeout = 30

To apply any changes made to the /etc/lilo.conf file, invoke the /sbin/lilo command.

On an Itanium system, to apply any changes made to the /boot/efi/efi/redhat/elilo.conf file, invoke the /sbin/elilo command.
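
Because GRUB counts the timeout in seconds and LILO and ELILO count it in tenths of a second, it is easy to misread a configured value. The following sketch, built around a hypothetical function named boot_timeout_seconds, only reads a configuration file and reports the timeout in seconds; it does not modify anything.

```shell
# Print the timeout configured in a boot loader file, converted to seconds.
# $1: path to the configuration file
# $2: "grub" (value is in seconds) or "lilo" (value is in tenths of a second)
boot_timeout_seconds() {
    # Take the number from the first "timeout" line, with or without "=".
    value=$(sed -n 's/^[[:space:]]*timeout[[:space:]]*=\{0,1\}[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1)
    [ -n "$value" ] || return 1
    case "$2" in
        grub) echo "$value" ;;
        lilo) echo $((value / 10)) ;;   # integer division drops any remainder
        *) return 2 ;;
    esac
}
```

For example, boot_timeout_seconds /boot/grub/grub.conf grub and boot_timeout_seconds /etc/lilo.conf lilo both report 3 for the settings shown above.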

2.4.3. Displaying Console Startup Messages

Use the dmesg command to display the console startup messages. Refer to the dmesg(8) man page for more information.

The following example of output from the dmesg command shows that two external SCSI buses and nine disks were detected on the node (lines with backslashes display as one line on most screens):

May 22 14:02:10 storage3 kernel: scsi0 : Adaptec AHA274x/284x/294x \
	      (EISA/VLB/PCI-Fast SCSI) 5.1.28/3.2.4 
May 22 14:02:10 storage3 kernel:         
May 22 14:02:10 storage3 kernel: scsi1 : Adaptec AHA274x/284x/294x \
              (EISA/VLB/PCI-Fast SCSI) 5.1.28/3.2.4 
May 22 14:02:10 storage3 kernel:         
May 22 14:02:10 storage3 kernel: scsi : 2 hosts. 
May 22 14:02:11 storage3 kernel:   Vendor: SEAGATE   Model: ST39236LW         Rev: 0004 
May 22 14:02:11 storage3 kernel: Detected scsi disk sda at scsi0, channel 0, id 0, lun 0 
May 22 14:02:11 storage3 kernel:   Vendor: SEAGATE   Model: ST318203LC        Rev: 0001 
May 22 14:02:11 storage3 kernel: Detected scsi disk sdb at scsi1, channel 0, id 0, lun 0 
May 22 14:02:11 storage3 kernel:   Vendor: SEAGATE   Model: ST318203LC        Rev: 0001 
May 22 14:02:11 storage3 kernel: Detected scsi disk sdc at scsi1, channel 0, id 1, lun 0 
May 22 14:02:11 storage3 kernel:   Vendor: SEAGATE   Model: ST318203LC        Rev: 0001 
May 22 14:02:11 storage3 kernel: Detected scsi disk sdd at scsi1, channel 0, id 2, lun 0 
May 22 14:02:11 storage3 kernel:   Vendor: SEAGATE   Model: ST318203LC        Rev: 0001 
May 22 14:02:11 storage3 kernel: Detected scsi disk sde at scsi1, channel 0, id 3, lun 0 
May 22 14:02:11 storage3 kernel:   Vendor: SEAGATE   Model: ST318203LC        Rev: 0001 
May 22 14:02:11 storage3 kernel: Detected scsi disk sdf at scsi1, channel 0, id 8, lun 0 
May 22 14:02:11 storage3 kernel:   Vendor: SEAGATE   Model: ST318203LC        Rev: 0001 
May 22 14:02:11 storage3 kernel: Detected scsi disk sdg at scsi1, channel 0, id 9, lun 0 
May 22 14:02:11 storage3 kernel:   Vendor: SEAGATE   Model: ST318203LC        Rev: 0001 
May 22 14:02:11 storage3 kernel: Detected scsi disk sdh at scsi1, channel 0, id 10, lun 0 
May 22 14:02:11 storage3 kernel:   Vendor: SEAGATE   Model: ST318203LC        Rev: 0001 
May 22 14:02:11 storage3 kernel: Detected scsi disk sdi at scsi1, channel 0, id 11, lun 0 
May 22 14:02:11 storage3 kernel:   Vendor: Dell      Model: 8 BAY U2W CU      Rev: 0205 
May 22 14:02:11 storage3 kernel:   Type:   Processor \
                          ANSI SCSI revision: 03 
May 22 14:02:11 storage3 kernel: scsi1 : channel 0 target 15 lun 1 request sense \
	      failed, performing reset. 
May 22 14:02:11 storage3 kernel: SCSI bus is being reset for host 1 channel 0. 
May 22 14:02:11 storage3 kernel: scsi : detected 9 SCSI disks total.
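
On a node with many disks, counting the detection lines by eye is error-prone. The sketch below assumes the message wording of the sample output above, which differs between kernel versions; the function names are hypothetical. It counts the per-disk lines and extracts the total that the kernel itself reports, so the two can be compared.

```shell
# Read kernel messages on standard input and print the number of lines
# that report a detected SCSI disk (wording taken from the sample above).
count_scsi_disks() {
    grep -c 'Detected scsi disk'
}

# Read kernel messages on standard input and print the disk total
# that the kernel reports in its summary line, if one is present.
reported_scsi_total() {
    sed -n 's/.*detected \([0-9][0-9]*\) SCSI disks total.*/\1/p'
}
```

For example, dmesg | count_scsi_disks and dmesg | reported_scsi_total should print the same number on a node whose kernel uses these messages.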

The following example of the dmesg command output shows that a quad Ethernet card was detected on the node:

May 22 14:02:11 storage3 kernel: 3c59x.c:v0.99H 11/17/98 Donald Becker
May 22 14:02:11 storage3 kernel: tulip.c:v0.91g-ppc 7/16/99 
May 22 14:02:11 storage3 kernel: eth0: Digital DS21140 Tulip rev 34 at 0x9800, \
	      00:00:BC:11:76:93, IRQ 5. 
May 22 14:02:12 storage3 kernel: eth1: Digital DS21140 Tulip rev 34 at 0x9400, \
	      00:00:BC:11:76:92, IRQ 9. 
May 22 14:02:12 storage3 kernel: eth2: Digital DS21140 Tulip rev 34 at 0x9000, \
	      00:00:BC:11:76:91, IRQ 11. 
May 22 14:02:12 storage3 kernel: eth3: Digital DS21140 Tulip rev 34 at 0x8800, \
	      00:00:BC:11:76:90, IRQ 10.
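
The Ethernet ports named in such output can be listed in the same way. This sketch assumes the "kernel: ethN:" prefix used in the sample above; the function name ethernet_ports is hypothetical.

```shell
# Read kernel messages on standard input and print the name of each
# Ethernet interface announced in a "kernel: ethN: ..." line.
ethernet_ports() {
    sed -n 's/.*kernel: \(eth[0-9][0-9]*\): .*/\1/p'
}
```

Applied to the sample output, it prints eth0 through eth3, one per line, which matches the four ports of the quad Ethernet card.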

2.4.4. Displaying Devices Configured in the Kernel

To be sure that the installed devices (such as network interfaces) are configured in the kernel, use the cat /proc/devices command on each node. For example:

Character devices:
  1 mem
  4 /dev/vc/0
  4 tty
  4 ttyS
  5 /dev/tty
  5 /dev/console
  5 /dev/ptmx
  6 lp
  7 vcs
 10 misc
 13 input
 14 sound
 29 fb
 89 i2c
116 alsa
128 ptm
136 pts
171 ieee1394
180 usb
216 rfcomm
226 drm
254 pcmcia

Block devices:
  1 ramdisk
  2 fd
  3 ide0
  8 sd
  9 md
 65 sd
 66 sd
 67 sd
 68 sd
 69 sd
 70 sd
 71 sd
128 sd
129 sd
130 sd
131 sd
132 sd
133 sd
134 sd
135 sd
253 device-mapper

The previous example shows: