Suresh Kumar Pakalapati's Linux Administration: March 2012

Wednesday, March 28, 2012

Data Redundancy by DRBD

Has your database (or mail or file) server crashed? Is your entire department waiting for you to restore service? Are your most recent backups a month old? Are those backups off-site? Is this a frighteningly real scenario? Oh, yeah. Can it be avoided? Oh, yeah. The Distributed Replicated Block Device (DRBD) system can save the day, your data, and your job. DRBD provides data redundancy at a fraction of the cost of other solutions.

Almost every service depends on data. So, to offer a service, its data must be available. And if you want to make that service highly-available, you must first make the data it depends on highly-available.

The most natural way to do this (and hopefully, it’s something you already do on a regular basis) is to back up your data. In case you lose your active data, you just restore from the most recent backup, and the data is available again. Or, if the host your service runs on is (temporarily) unusable, you can replace it with another host configured to provide the identical service, and restore the data there.

To reduce possible downtime, you can have a second machine ready to take over.

Whenever you change the data on one machine, you back it up on the other. You can have the secondary machine switched off, and just turn it on if the primary host goes down. This is typically referred to as cold standby. Or you can have the backup machine up and running, a configuration known as a hot standby.

However, whether your standby is hot or cold, one problem remains: if the active node fails, you lose changes to the data made after the most recent backup. But even that can be addressed… if you have the bucks.

One solution is to use some kind of shared storage device. With media shared between machines, both nodes have access to the most recent data when they need it. Storage can be simple SCSI sharing, dual controller RAID arrangements like IBM’s ServeRAID, shared fiber-channel disks, or high-end storage like IBM Shark or the various EMC solutions.

While effective, these systems are relatively costly, ranging from five thousand to millions of dollars. And unless you purchase the most expensive of these systems, shared storage typically has single points of failure (SPOFs) associated with it, whether they’re obvious or not. For example, some provide separate paths to a single shared bus, but have a single, internal electrical path to access the bus.

Another solution — and one that’s as good as the most expensive hardware — is live replication.

Real Time Backup with Replication

DRBD provides live replication of data. DRBD provides a mass storage device, such as a block device, and distributes the device over two machines. Whenever one node writes to the distributed device, the changes are replicated to the other in real time.

DRBD layers transparently over any standard block device (the “lower level device”), and uses TCP/IP over standard network interfaces for data replication. Though you can use raw devices for special purposes, the typical direct client to a block device is a filesystem, and it’s recommended that you use one of the journaling filesystems, such as Ext3 or Reiserfs. (XFS is not yet usable with DRBD.) You can think of DRBD as RAID1 over the network.

No special hardware is required, though it’s best to have a dedicated (crossover) network link for the data replication. And if you need high write throughput, you should eliminate the bottleneck of 10/100 megabit Ethernet and use Gigabit Ethernet instead. (To tune it further, you can increase the MTU to something greater than the typical file system block size, say, 5000 bytes.) Thus, for the cost of a single, proprietary shared storage solution, you can set up several DRBD clusters.

Installing from Binary Packages

When there are (official or unofficial) packages available for your favorite distribution, just install from those, and you’re done. For example, SuSE officially includes DRBD and Heartbeat in its standard distribution, as well as in SuSE Linux Enterprise Server 8 (SLES8).

The most recent “unofficial” SuSE packages can be found in Lars Marowsky-Bree’s subtree at ftp.suse.com/pub/people/lmb/drbd and its mirrors. For Debian users, David Krovich provides prebuilt packages (via the apt updater) at deb http://fsrc.csee.wvu.edu/debian/apt-repository/binary/deb-src, and source packages at http://fsrc.csee.wvu.edu/debian/apt-repository/source/.

If you need to compile DRBD from source, get a DRBD source package or source tarball from the download section of http://www.drbd.org, or check it out from CVS. Be sure to have the kernel sources for your running kernel, and make sure that the kernel source tree configuration matches the configuration of the running kernel. For reference, these are the steps for SuSE:

# cd /usr/src/linux
# make cloneconfig; make dep
# cd /wherever/drbd
# make; make install

If you got the source tarball, you should back up the drbd/documentation/ subdirectory first. Since the sgml/docbook/ stuff is difficult to get right, the tarball contains “precompiled” man pages and documentation, which might be corrupted by an almost, but not quite, matching SGML environment.

DRBD Configuration

Once installed, you need to tell DRBD about its environment. You should be able to find a sample configuration file in /etc/drbd.conf; if not, there is a well commented one in the drbd/scripts/ subdirectory.

drbd.conf is divided into at most one global {} section, and an “arbitrary” number of resource resource id {} sections, where resource id is typically something like drbd2.

In the global section, you can use minor-count to specify how many drbds (here, in lower case, drbd refers to the block devices) you want to be able to configure, in case you want to define more resources later without reloading the module (which would interrupt services).

Each resource {} section further splits into resource settings, grouped as disk {}, net {}, and node-specific settings, where the latter settings are grouped in on host hostname {} subsections. Parameters you need to change are hostname, device, physical disk and virtual disk-size, and Internet address and port number.

Testing Your System

Once you’ve configured drbd.conf, start DRBD. Assuming that the names of the nodes are paul and silas, choose one node to start with, say, paul. Run the command:

paul# /etc/init.d/drbd start

When prompted, make paul primary, then create a file system on the drbd with the command:

paul# mke2fs -j /dev/nb0

Make an entry into /etc/fstab (on both nodes!), like this:

/dev/nb0   /www    auto   defaults,noauto   0 0
/dev/nb1   /mail   auto   defaults,noauto   0 0

On the other node, silas, run:

silas# /etc/init.d/drbd start

When DRBD starts on the second node, it connects with the first node and starts to synchronize. Synchronization typically takes quite a while, especially if you use 100 megabit Ethernet and large devices.
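To get a feel for how long "quite a while" is, a rough estimate can help. The sketch below simply divides the device size by a sustained throughput figure. The function name is made up for illustration, and the throughput values in the usage note are ballpark assumptions rather than measurements; slow disks or a configured sync rate limit can lower them further.

```shell
# Rough estimate of how long a full synchronization takes.
# usage: full_sync_seconds DEVICE_SIZE_MB THROUGHPUT_MB_PER_S
# Both arguments are whole numbers; the result is rounded up to a full second.
full_sync_seconds() (
    size_mb=$1
    rate_mb_s=$2
    echo $(( (size_mb + rate_mb_s - 1) / rate_mb_s ))
)
```

For example, assuming roughly 10 MB/s of usable throughput on 100 megabit Ethernet, full_sync_seconds 40960 10 gives the estimate for a 40GB device in seconds (a little over an hour); assuming roughly 100 MB/s on Gigabit Ethernet, the same device would need about a tenth of that.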

The device that’s the synch target (here, the device on silas) typically blocks in the script until the synchronization is finished. However, the synch source (the primary or paul) is fully operational during a synch. So back on the first node, let the script mount the device:

paul# /etc/init.d/datadisk start

Now start working with this file system: put some large files there, copy your CVS repository, or something similar.

When synch is finished, try a manual failover. Unmount the drbd devices on paul, and mount them on silas:

paul# /etc/init.d/datadisk stop
silas# /etc/init.d/datadisk start

You should now find the devices mounted on silas, and all of the files and changes you made should be there, too. In fact, the first disk-size blocks of the underlying physical devices should be bit-for-bit identical. If you want, you can verify this by running md5sum over the complete device on both nodes and comparing the results.
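One way to run that checksum comparison is sketched here. It reads only the first part of a device, so that any space beyond the configured disk-size does not disturb the result. The function name is hypothetical, and the sketch assumes that dd and md5sum are installed and that nothing is writing to the device while it runs (for instance, with DRBD stopped on both nodes).

```shell
# Print the MD5 checksum of the first SIZE_KB kilobytes of a device or file.
# usage: device_md5 DEVICE SIZE_KB
device_md5() (
    device=$1
    size_kb=$2
    # Read at most SIZE_KB blocks of 1KB, then keep only the hash column.
    dd if="$device" bs=1024 count="$size_kb" 2>/dev/null | md5sum | cut -d' ' -f1
)
```

Run it on each node against that node's lower-level device, using the disk-size from drbd.conf expressed in kilobytes, and compare the two hashes by eye. The access is read-only, but it is still a direct access to the underlying device, so reserve it for testing.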

Next, start DRBD again on both nodes. This time there should be no synch. This is the normal situation after an intentional reboot: if both nodes are in a “secondary” state before the cluster loses its connection, there is no need for a synch. (See the sidebar “How DRBD Works” for more information about when DRBD syncs, and why.)

Finally, you can automate the assignment of the primary and secondary roles to implement failover.

Some Do’s and Don’ts

Here are some things you should do and some things you should avoid when running DRBD.

* Never mount a drbd in secondary state. Though it’s possible to mount a secondary device as read-only, changes made to the primary are mirrored to it underneath the filesystem and buffercache of the secondary, so you won’t see changes on the secondary. And changing metadata underneath a filesystem is a risky habit, since it may confuse your kernel to death.

* Once you set up DRBD, never (as in never!!) bypass it or access the underlying device directly, unless it’s the last chance to recover data after a catastrophic failure.

* If your primary node fails and you rebuild it, make sure that the first synch is in the direction you want. Specifically, make sure that the synch does not overwrite the good data on the then-current primary (the node that didn’t fail). To ensure this happens correctly, remove all of the metadata found in /var/lib/drbd/drbd/ from the freshly-rebuilt node.

* Running DRBD on top of a loopback device, or vice versa, is expected to deadlock, so don’t do that.

* You can run DRBD on top of the Logical Volume Manager (LVM), but you have to be very careful. Otherwise, snapshots (for example) won’t know how to notify the filesystem (possibly on the remote node) to flush its journal to disk to make the snapshot consistent. However, DRBD and LVM might be convenient for test setups, since you can easily create or destroy new drbds.

Tele-DRBD

The typical use of DRBD and HA clustering is probably two machines connected with a LAN and one or more crossover cables, just a couple of meters apart, probably within one server room, or at least within the same building.

But you can use DRBD over long distance links, too. When you have the replica several hundred kilometers away in some other data center (a good plan for disaster recovery), your data will survive even a major earthquake at your primary location.

When you run DRBD, the complete disk content goes over the wire. Consider the privacy issues: if the machines are interconnected through the (supposedly) hostile Internet, you should route DRBD traffic through some virtual private network, or even a full-blown IPSec solution. For a more lightweight solution for this specific task, have a look at the CIPE project.

Finally, make sure no other node can access the DRBD ports, or someone might provoke a connection loss and then race for the first reconnect, to get a full sync of your disk’s content.
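On a Linux host with iptables, restricting the DRBD port to the partner node might look like the following sketch. The function only prints the rules so that they can be reviewed before anyone applies them. The function name, the partner address, and the port number in the usage note are assumptions for illustration; use the address and port from your own drbd.conf, and repeat the rules for each resource's port.

```shell
# Print iptables rules that admit DRBD traffic only from the partner node.
# usage: drbd_firewall_rules PEER_ADDRESS PORT
# The rules are printed, not applied, so they can be reviewed first.
drbd_firewall_rules() (
    peer=$1
    port=$2
    # Accept the partner first; everything else aimed at the port is dropped.
    echo "iptables -A INPUT -p tcp -s $peer --dport $port -j ACCEPT"
    echo "iptables -A INPUT -p tcp --dport $port -j DROP"
)
```

For example, drbd_firewall_rules 10.0.0.2 7788 prints the two rules for a hypothetical partner at 10.0.0.2 and a resource on port 7788. The order matters, since the ACCEPT rule has to be appended before the DROP rule.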

More to Come…

If you have any troubles setting up DRBD, check the FAQ at http://faq.drbd.org. If that doesn’t help, feel free to subscribe and ask questions on drbd-devel@lists.sourceforge.net (there’s no drbd-users alias yet).

Development of DRBD continues. Work is already underway to eliminate the most displeasing limitations of drbd-0.6.x. drbd-0.7.x will be made more robust against block size changes to support XFS, and will avoid certain nasty side effects. Future versions will permit the primary node to be a target of an ongoing synchronization, which makes graceful failover/failback possible, and increases interoperability with Heartbeat. Combined with OpenGFS, future versions of DRBD will likely be able to support true active/active configurations.

Unfortunately, these improvements are still in early alpha. But with your ongoing support, the pace of development should increase.

How DRBD Works

Whenever a higher-level application, typically a journaled file system, issues an I/O request, the kernel dispatches the request based on the target device’s major and minor numbers.
If the request is a “read,” and DRBD is registered as the major number, the kernel passes the request down the stack to the lower-level device locally. However, “write” requests are passed down the stack and sent over to the partner node.
Every time something changes on the local disk, the same changes are made at the same offset on the partner node’s device. If a “write” request finishes locally, a “write barrier” is sent to the partner to make sure that it is finished before another request comes in. Since later write requests might depend on successfully finished previous ones, this is needed to ensure strict write ordering on both nodes.
The two most important decisions for DRBD to make are when to synchronize and what to synchronize: is a full synchronization required, or just an incremental one? To make these decisions, DRBD keeps several event and generation counters in metadata files located in /var/lib/drbd/drbd#/.
Let’s look at the failure cases. Say paul is our primary server, and silas is standby. In the normal state, paul and silas are up and running. If one of them is down, the cluster is degraded. Typical state changes are degraded to normal and normal to degraded.
* Case One: The secondary fails. If silas was standby and leaves the cluster (for whatever reason: network, power, hardware failure), this isn’t a real problem, as long as paul keeps on running. In degraded mode, paul simply flags all of the blocks that incur write operations as dirty. Then, after silas is repaired and joins the cluster again, paul can do an incremental synchronization (/proc/drbd says SyncQuick). If paul fails while alone, the dirty flags are lost, since they are held in RAM only. So unfortunately, the next time both nodes see each other, they perform a full synch (“SyncAll”) from paul to silas.
* Case Two: The primary fails. When paul is the active primary and fails, the situation is a bit different. If silas remains standby (which is unlikely), and paul returns, paul becomes primary again. At that moment, it’s unknown which blocks were modified on paul that hadn’t reached silas. Therefore, a full synch from paul to silas is needed just to make sure that everything is identical again. In the more likely case that silas assumed the role of primary, paul becomes standby and synch target when it returns, receiving a full synch from silas. Why? It’s not known which blocks were modified on paul immediately before the crash.
* Case Three: Both the primary and secondary fail. If both nodes go down (due to a main power failure or something catastrophic), when the cluster reboots, paul provides a full synch to silas.
While it seems like a full synch is needed whenever paul becomes unavailable, that’s not exactly accurate. You can stop the services on paul, unmount the drbd, and make paul secondary. In this case, both nodes are on standby, and you can shut off both nodes cleanly. When both nodes reboot (from previously being on standby), no synch is required.
Or you can make silas primary, mount drbd there, and start the services. This configuration allows you to bring paul down for maintenance. When paul reboots, silas can provide an incremental synch to paul.
* Case Four: Double failure.
If one of the nodes (or the network) fails during a synchronization, this is a double failure, since the first failure caused the synch to happen. Assuming that paul was primary, paul has the good data; silas was receiving the synch. If silas became unavailable during the synch, it has inconsistent, only partially up-to-date data. So, when silas returns, the synch has to be restarted.
If the synch was incremental, it can be restarted at the place it was interrupted. If the synch was supposed to be complete, it must be restarted from the very beginning. (This is a scenario that needs to be improved upon.)
If paul (the synch source) fails during the process, the cluster is rendered non-operational. silas cannot assume the role of the primary because it has inconsistent data.
However, if you really need availability, and don’t care about possibly inconsistent, out-of-date data, you can force silas to become primary. Use the explicit operator override…

silas# drbdsetup /dev/nb0 primary --do-what-I-say

But remember: if you use brute force, you take the blame.
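The failure cases in this sidebar can be condensed into a small lookup table. The sketch below is only a memory aid that restates the outcomes described above; it does not query DRBD in any way, and the scenario labels are invented for this illustration.

```shell
# Summarize the sidebar's failure cases as a lookup table.
# usage: expected_sync SCENARIO
# Prints "<kind> <source>-><target>", with paul as the usual primary.
expected_sync() {
    case "$1" in
        secondary-down)              echo "incremental paul->silas" ;;  # Case One
        secondary-down-then-primary) echo "full paul->silas" ;;         # dirty flags lost
        primary-down-no-takeover)    echo "full paul->silas" ;;         # Case Two
        primary-down-takeover)       echo "full silas->paul" ;;         # Case Two
        both-down)                   echo "full paul->silas" ;;         # Case Three
        clean-shutdown)              echo "none" ;;                     # both secondary
        planned-switchover)          echo "incremental silas->paul" ;;  # maintenance
        interrupted-incremental)     echo "resume where interrupted" ;; # Case Four
        interrupted-full)            echo "restart from the beginning" ;;
        *)                           echo "unknown scenario: $1" >&2; return 1 ;;
    esac
}
```

Reading the table top to bottom shows the pattern: an incremental synch is possible only when the node that kept running was able to track its dirty blocks the whole time.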

Tuesday, March 27, 2012

Thin Client Computing

Thin client computing has been a buzzphrase for the past few years. In a thin client network, users sit at low-powered machines and run programs on more powerful central computers.

As described shortly, this approach to computing offers certain advantages over the more common desktop computer approach. This month’s column looks at the basic principles of thin client computing and presents basic information on the server side of the thin client equation. Next time I’ll describe the client side.

Why Use Thin Client Computing?

The idea behind thin client computing is one of centralization: A single computer, or perhaps a small “farm” of computers, holds all user accounts and most of the programs that users run. Users access these systems using less powerful systems, the thin clients. This approach provides several advantages over the more conventional method of providing each user with a fully-equipped desktop computer:

* The thin clients are inexpensive, minimizing costs and perhaps enabling continued use of old computers as thin clients.
* The thin clients hold simple software packages, reducing administrative requirements to maintain them.
* The thin clients can be diskless and can use less power-hungry CPUs, minimizing power consumption, noise, and cooling costs.
* The thin clients don’t need direct Internet access, reducing security risks and enabling them to run on private IP addresses, thus reducing the need for public IP addresses.
* Users can use any thin client to access their own accounts and files. This is handy in public computing centers or when users move from one work space to another.
* Upgrades to most software packages are restricted to the server systems, reducing administrative effort.
* Backups are simplified; only the servers need to be backed up.

Of course, thin client computing isn’t without its drawbacks. The servers that power thin clients must be more powerful than the average desktop system, since each one will host several users. As a general rule of thumb, count on about 75-100MHz of CPU power and 50MB of RAM per client, plus a baseline of 512MB of RAM to start. Your exact needs will vary depending on your uses, though.

Computing reliability is tied to the server(s). When a user’s fat client fails, that user is disabled, but the rest of the office can keep working. If an office using thin clients experiences a server failure, all users will be affected. Picking reliable hardware and having backup hardware ready can mitigate problems caused by hardware failures. Keeping server software backups is important, as well.

The network load is likely to be higher in a thin client configuration than in a typical desktop-using office.

Since thin client computing requires servers to accept remote logins, thin client computing introduces a potential security hole.

Access to local hardware (CD-ROM drives, printers, USB devices, etc.) can be complicated. Some thin client packages are preconfigured to handle some common local hardware, but this isn’t always the case. Finally, CPU-intensive and display-intensive programs are poorly matched to thin client configurations.

As a general rule of thumb, thin client computing works best in situations where users run typical office programs, such as e-mail, Web browsing, word processing, and light spreadsheet use. Tools that are very CPU-intensive or that make heavy use of the display (streaming videos, games, and scientific simulations, for instance) don’t work well because they demand too much of the server or of the network.
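The server sizing rule of thumb given earlier (75-100MHz of CPU power and 50MB of RAM per client, plus a 512MB baseline) is easy to turn into a quick calculation. The function name below is made up for illustration, and the result is only as good as the rule of thumb itself.

```shell
# Apply the sizing rule of thumb to a given number of thin clients.
# usage: thin_server_sizing CLIENTS
# Prints: <minimum CPU MHz> <maximum CPU MHz> <RAM MB>
thin_server_sizing() (
    clients=$1
    cpu_low=$(( clients * 75 ))
    cpu_high=$(( clients * 100 ))
    ram_mb=$(( 512 + clients * 50 ))   # baseline plus per-client share
    echo "$cpu_low $cpu_high $ram_mb"
)
```

For 20 clients, thin_server_sizing 20 prints 1500 2000 1512, that is, roughly 1.5-2GHz of CPU power and about 1.5GB of RAM.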

Servers Required to Support Thin Clients

The ideal setup for a network that uses thin clients involves configuring the thin clients to boot from the network. This enables thin clients to run without any local storage. Not all computers support network boots, though, so you may need floppy disks or even CD-ROMs on some or all of your thin clients to get the process started.

In order to boot thin clients from the network, you’ll need to run two server protocols: The Dynamic Host Configuration Protocol (DHCP), which assigns IP addresses to clients, and the Trivial File Transfer Protocol (TFTP), which delivers boot files to thin clients. Neither DHCP nor TFTP is strictly required in a thin client environment, but they are needed for network booting. On all but very large networks, only one system needs to be configured as a server for each of these protocols.

Of course, you’ll also need to configure X or Virtual Network Computing (VNC) to handle remote logins. You might have just one or many computers running remote login protocols. DHCP, TFTP, and remote login protocols need not all run on a single computer, but they can do so if it’s convenient.

Configuring DHCP

DHCP configuration is complex enough that I can’t describe the whole process here. If you’re not already using a Linux DHCP server, you’ll have to research basic Linux DHCP configuration before you proceed, learn to configure your non-Linux DHCP server to do the tasks I describe, or forgo DHCP configuration for thin clients. You can still use thin clients without the help of DHCP, but you may need to tell clients where to find your TFTP server or even outfit them with complete software packages locally (on CD-Rs or hard disks).

The main point of DHCP configuration for thin clients is to enable the DHCP features that point clients to a TFTP server. If you’re using the popular Internet Software Consortium (ISC) DHCP server, you’ll probably find its main configuration file in /etc/dhcpd.conf. This file includes a series of global options followed by one or more subnet declarations, which define options for specific subnets. To support thin clients, you’ll want to ensure that two options are set, either globally or as part of specific subnet declarations:

option tftp-server-name "192.168.1.1";
filename "thinstation.nbi";

The tftp-server-name option points clients to a TFTP server. This line is extremely important if you intend to boot thin clients from the network. Note that the name must be enclosed in quotation marks, even if you specify it as an IP address.

The filename option provides the name of an OS image file in the TFTP server’s directory (described shortly). This option should be set to a name that’s appropriate for whatever thin client software you intend to run. The preceding example uses "thinstation.nbi", which is suitable for the ThinStation client described next month.
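Since a network boot fails without these two settings, a quick check of the configuration file can save some head scratching. The sketch below only looks for the two statements at the start of a line, quoted and terminated by a semicolon, as in the example above; it does not understand the rest of the dhcpd.conf syntax, and the function name is hypothetical.

```shell
# Check that a dhcpd.conf-style file sets both options thin clients need.
# usage: check_netboot_options CONFIG_FILE
# Prints what is missing and returns non-zero if anything is.
check_netboot_options() {
    missing=0
    if ! grep -Eq '^[[:space:]]*option[[:space:]]+tftp-server-name[[:space:]]+"[^"]+"[[:space:]]*;' "$1"; then
        echo "missing: option tftp-server-name"
        missing=1
    fi
    if ! grep -Eq '^[[:space:]]*filename[[:space:]]+"[^"]+"[[:space:]]*;' "$1"; then
        echo "missing: filename"
        missing=1
    fi
    return "$missing"
}
```

Calling check_netboot_options /etc/dhcpd.conf prints nothing when both statements are present. For a real syntax check, the ISC server can test a configuration file itself when started as dhcpd -t -cf /etc/dhcpd.conf.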

Some thin clients require one or both of two additional options:

allow bootp;
option x-display-manager 192.168.1.1;

These options enable the older BootP protocol, which is similar to DHCP, and point the client to an X display manager (the XDMCP server that presents the login screen). Neither of these options is required for the ThinStation client software, but you may need these options if you use other thin clients.

Of course, you should adjust the values of all of these options for your own network. You can provide all of these options as globals, or in subnet, host, or other more specific declaration blocks. You can use this fact to provide a default configuration or tweak the configuration for individual clients or groups of clients. For instance, you can mix and match dedicated thin clients, which require their own software, with ThinStation clients.

Remember to restart your DHCP server once you’ve changed its configuration. Typically, you can do this by passing the restart option to a SysV startup script, like /etc/init.d/dhcpd restart.

Configuring TFTP

TFTP is delivered with most Linux distributions, or you can download a copy from http://www.kernel.org/pub/software/network/tftp/. Linux TFTP packages typically ship with a file called /etc/xinetd.d/tftp, which launches the TFTP daemon via the xinetd server. Chances are you’ll have to edit the /etc/xinetd.d/tftp file and change the disable line to read disable = no. You may also need to set the server_args line; it should use the -s option and point to a directory in which boot files are to be stored:

server_args = -s /tftpboot

Once you’ve edited /etc/xinetd.d/tftp, you should create the directory you’ve specified, if it doesn’t already exist. You should then restart the xinetd daemon. You can usually do this by typing /etc/init.d/xinetd restart, although the exact path to the startup script may vary depending on your distribution.

With the TFTP server running, you must still populate its file directory with the thin client software. Although the files reside on the TFTP server computer, this is really a client issue, so I describe it in more detail next month.
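Once the directory is populated, it is worth confirming that the file named in the DHCP configuration is really there, since a mismatch between the two is an easy mistake to make. The following sketch assumes the filename statement is written as in the DHCP example above; the function name is hypothetical.

```shell
# Check that the boot file named in dhcpd.conf exists in the TFTP directory.
# usage: boot_image_present DHCPD_CONF TFTP_DIR
# Prints the full path on success; prints a reason and fails otherwise.
boot_image_present() (
    conf=$1
    tftp_dir=$2
    # Take the first  filename "...";  statement found in the configuration.
    image=$(sed -n 's/^[[:space:]]*filename[[:space:]]*"\([^"]*\)"[[:space:]]*;.*$/\1/p' "$conf" | head -n 1)
    if [ -z "$image" ]; then
        echo "no filename statement in $conf"
        exit 1
    fi
    if [ ! -f "$tftp_dir/$image" ]; then
        echo "$tftp_dir/$image not found"
        exit 1
    fi
    echo "$tftp_dir/$image"
)
```

With the settings used in this column, you would call it as boot_image_present /etc/dhcpd.conf /tftpboot.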

Configuring X and XDMCP

X is peculiar because it uses client and server roles that seem backwards to most people; you sit at the server computer and use X clients that are remote. Thus, in a thin client configuration, you don’t need to be concerned with X server configuration on the powerful host computer. Instead, you’ll run the X server on the thin client computer. In most cases, X configuration on the thin client is fairly straightforward, and I describe it next month.

You do, however, need to configure a login protocol server for X. This protocol is XDMCP. If your environment hosts multiple systems that a user might access, you’ll need to configure XDMCP on each of them.

Most Linux distributions ship with an XDMCP server. The most common are the X Display Manager (XDM), the KDE Display Manager (KDM), and the GNOME Display Manager (GDM). On a desktop computer, one of these programs, or something similar, manages the GUI login screen. Most Linux distributions, however, configure XDMCP to block outside access attempts. This is a useful security feature for a typical desktop computer, but it just won’t do if thin clients are to access the computer using X. Thus, you’ll have to loosen the security restrictions on your system.

The first trick is in finding which XDMCP server you’re running on the computer you intend to have accept thin client logins. Try typing ps ax|grep dm. This command shows you all processes with the string dm in their names. You’ll probably notice xdm, kdm, or gdm in the output. If you don’t, then the system isn’t configured to start up in GUI mode or you’re using some other XDMCP server.
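Because grep dm also matches unrelated processes whose names happen to contain those two letters, a slightly stricter filter can be handy. The sketch below reads process names, one per line, and reports the first one that is exactly xdm, kdm, or gdm. Accepting a -binary suffix is an assumption about how some display manager versions name their process, and the function name is hypothetical.

```shell
# Pick the display manager out of a list of process names (one per line on stdin).
# usage: ps -e -o comm= | which_display_manager
# Prints xdm, kdm, or gdm for the first match, or nothing if none is running.
which_display_manager() {
    awk '/^(xdm|kdm|gdm)(-binary)?$/ && !found { print substr($0, 1, 3); found = 1 }'
}
```

An empty result means the same thing as before: the system isn’t starting in GUI mode, or it uses some other XDMCP server.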

If you’re running XDM, you can configure it to accept remote logins by editing two or three files. The first of these is /etc/X11/xdm/xdm-config. Locate the following line:

DisplayManager.requestPort: 0

Change the 0 in this line to 177 to have XDM listen for remote access attempts. The second file you must edit is /etc/X11/xdm/Xaccess. This file will probably include the following two lines:

# *
# * CHOOSER BROADCAST

Remove the hash marks (#) at the start of these lines to tell the computer to accept remote accesses and to generate a list of available remote hosts, respectively. You can substitute a wildcard matching your network, such as *.example.com, for the simple asterisk (*) in these lines, to limit access to your network. Finally, the third XDM file you may need to edit is /etc/X11/xdm/Xservers. Details vary greatly from one distribution to another, but you’ll probably find a line that resembles the following:

:0 local /usr/X11R6/bin/X -nolisten tcp -br vt7

The -nolisten tcp option prevents the local X server from accepting TCP connections from remote X clients. This won’t affect a thin client’s ability to connect to the computer, but it will prevent programs running on other systems from displaying on this computer’s own X server. Thus, you may want to edit it out. If you don’t want X to run locally at all, you should comment out this line entirely. This will cause XDM to accept remote connections but not to launch X locally.
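The first two XDM edits (the request port and the two Xaccess lines) can also be scripted. The sketch below assumes the lines look exactly as shown above, apart from whitespace; a distribution whose files carry trailing comments on those lines would need adjusted patterns. It keeps a copy of each original file with an .orig suffix, leaves Xservers alone, and uses a made-up function name.

```shell
# Apply the xdm-config and Xaccess edits described above to the files in DIR.
# usage: enable_xdm_remote DIR      (DIR is typically /etc/X11/xdm)
# The originals are kept as xdm-config.orig and Xaccess.orig.
enable_xdm_remote() (
    dir=$1
    for file in xdm-config Xaccess; do
        cp "$dir/$file" "$dir/$file.orig" || exit 1
    done
    # requestPort 0 -> 177 ("\1" is the captured prefix, "177" the new port).
    sed 's/^\(DisplayManager\.requestPort:[[:space:]]*\)0[[:space:]]*$/\1177/' \
        "$dir/xdm-config.orig" > "$dir/xdm-config" || exit 1
    # Uncomment exactly the "# *" and "# * CHOOSER BROADCAST" lines.
    sed -e 's/^#[[:space:]]*\(\*\)[[:space:]]*$/\1/' \
        -e 's/^#[[:space:]]*\(\*[[:space:]]*CHOOSER[[:space:]]*BROADCAST\)[[:space:]]*$/\1/' \
        "$dir/Xaccess.orig" > "$dir/Xaccess" || exit 1
)
```

Comparing each file with its .orig copy (for example with diff) before restarting XDM shows exactly what changed.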

In theory, KDM configuration is similar to that of XDM; however, you’ll need to track down the equivalent configuration files, which vary greatly in location from one distribution to another. In practice, KDM is also much more finicky than XDM, so configuring KDM can be frustrating. Be aware that you’ll probably need to edit a file called kde-config or kdm-config rather than xdm-config. An additional configuration file, kdmrc, may also require editing. Locate the [Xdmcp] section of this file and edit it so that it includes these lines:

Enable=true
Port=177

GDM uses the gdm.conf configuration file, which usually resides in /etc/X11/gdm. This file is similar to the kdmrc file, and you must make the changes just described to the [Xdmcp] section of this file. Alternatively, you can use the gdmsetup utility to make the changes using a GUI tool.

Whatever XDMCP server you use, you must restart it before the changes you’ve made take effect. This will shut down your current X session, so you should save your work, close your open programs, and log out. On many Linux systems, changing to runlevel 3 and then back to runlevel 5 will restart the XDMCP server. Type telinit 3 followed by telinit 5 to do this. On others, you can restart the XDMCP server by typing /etc/init.d/xdm restart (you may need to substitute kdm or gdm for xdm on some distributions).

Configuring VNC

If you want to use VNC rather than X, you can do so, but configuration can be tricky; X is generally superior if your thin clients support it. Two popular Linux VNC servers are RealVNC (http://www.realvnc.com) and TightVNC (http://www.tightvnc.com); you’ll run one of these on your login host. Under Linux, VNC is designed to be run by an individual user, and the login connection then accesses that user’s account directly. This approach may be adequate for some small-scale operations, but for greatest flexibility, it’s best to link the VNC server to an XDMCP server. This way, remote users will see a Linux login screen when they connect, so any user with an account may enter a username and password to access the computer.

This configuration requires you to make the changes to the XDMCP setup just described. You can then create an xinetd configuration file, say /etc/xinetd.d/vnc, to launch the VNC server:

service vnc-800x600
{
 disable = no
 socket_type = stream
 protocol = tcp
 wait  = no
 user  = nobody
 server  = /usr/bin/Xvnc
 server_args = -inetd -query localhost \
    -once -geometry 800x600 -depth 24 \
    -fp /usr/share/fonts/local/,/usr/share/fonts/misc/
}

This configuration works for a TightVNC server on a Gentoo Linux system. Unfortunately, the details of VNC configuration vary from one VNC server to another. Thus, you may need to read the man pages and experiment to find a configuration that works. This particular configuration relies on an entry in /etc/services:

vnc-800x600 5901/tcp

This entry assigns TCP port 5901 to the service name vnc-800x600. On the client side, you’ll need to tell the thin client to connect to this port.
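Because the xinetd service name and the /etc/services entry must match exactly, a small lookup helper can confirm that the name resolves to the port you expect. This is a sketch with a hypothetical function name; it reads the services file directly rather than going through the system resolver.

```shell
# Look up the port number assigned to a service name.
# usage: service_port NAME [SERVICES_FILE]
# Prints the port of the first matching entry, or nothing if there is none.
service_port() (
    name=$1
    file=${2:-/etc/services}
    awk -v name="$name" \
        '$1 == name && !found { split($2, field, "/"); print field[1]; found = 1 }' \
        "$file"
)
```

With the entry above in place, service_port vnc-800x600 should report 5901. VNC display numbers map to TCP ports by adding 5900, so clients that ask for a display number rather than a port would use display 1.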

You must decide what software you want to use on the clients, prepare a client configuration, and distribute the software to the clients. The software distribution task can be handled by a TFTP server, as described last month; or you can put the software on a CD-R or other local storage medium. Once you’ve done all of this, you can boot your thin client and, if all goes well, use it.

Thin Client Options

Broadly speaking, thin clients come in two forms: dedicated thin client hardware and general-purpose PCs that run thin client software. Dedicated thin client hardware can be handy if you’re building a network from scratch, but using PCs as thin clients has the advantage of enabling you to re-use outdated but still functional computers.
If you’re using a dedicated thin client, you should consult its documentation to learn how to configure and use it. You should be able to drop some files, provided by the hardware’s manufacturer, into the TFTP server’s files area and configure your DHCP server to point specific machines to particular files. You should then be able to boot your thin clients and get on with using them.
To use an ordinary PC as a thin client, you must have software for it. Linux can serve this role quite well, but the question is, which Linux? I recommend you turn to a dedicated thin client distribution, such as ThinStation or 2x PXES. These distributions include the Linux kernel, X, additional remote-access protocols, and a set of additional tools required to run a system as a thin client. Note that many Linux thin client distributions can access Windows servers as well as Linux (or other Unix) servers.
Requirements for a computer to operate as a thin client are minimal. A Pentium-class CPU will do the job well, but a 486 or even a 386 can serve in a pinch. The ThinStation developers recommend a minimum of 64MB of RAM. You do not need a hard disk.
You do need a keyboard, mouse, and monitor, of course. To function well with modern Linux software, the video card and monitor should be capable of 1024×768 resolution or better, although lower resolutions will work. A network card is an absolute necessity. If the network card supports network booting via the Preboot Execution Environment (PXE) protocol, the thin client need not have a floppy disk or CD-ROM drive, but one or both of these will be required if the network card doesn’t support PXE.

Configuring ThinStation

I’ll describe ThinStation in more detail, since it’s a ready-made package that’s relatively easy to configure. Several options for obtaining and configuring ThinStation exist in its Web site’s downloads section, including a couple of source packages, a pair of prebuilt images (for CD-R and network boots), and TS-O-Matic (for highly customized configurations).
Because they’re relatively easy to set up, I describe the two prebuilt images in more detail. You should download whichever one seems more suitable for your network. The NetBoot version requires a network card that’s supported by the network booting tools, as well as working DHCP and TFTP servers on your network. The NetBoot system might or might not require a working floppy or CD-ROM drive, depending on the client’s motherboard and network card BIOS support. I describe version 2.2 of ThinStation.
You should begin by downloading the relevant files (Thinstation-2.2-prebuilt-NetBoot.zip and Thinstation-2.2-prebuilt-LiveCD.zip), which you can uncompress with the Linux unzip utility. The result will be one directory tree for each zip file. How you proceed from here depends on which version you’re using.

Distributing ThinStation via CD-R

Once you’ve uncompressed the LiveCD image of ThinStation, you’ll need to prepare two disks: a CD-R and a floppy. The CD-R image is stored in the Cd subdirectory as thinstation.iso. You can use cdrecord under Linux to burn this file to a blank CD-R:

cdrecord dev=/dev/cdrom thinstation.iso

You may need to modify this command to suit your hardware; consult the cdrecord man pages for details.
The CD-R contains the ThinStation OS, including a Linux kernel, X server, and so on. This software does a remarkably good job of detecting and adapting to a wide variety of hardware; however, some configuration settings need to be set manually. For this task, you must prepare a floppy disk. Start with a FAT-formatted floppy (prepared on a Windows box or by using mkdosfs in Linux).
Mount it in Linux, copy the thinstation.conf.user file from the Floppy/thinstation.profile subdirectory of the LiveCD directory tree, and begin modifying the on-floppy version of this file. The file format is fairly obvious and the sample file does a good job of explaining the various options.
Use the NET_USE_DHCP option to enable or disable the use of DHCP. If you don’t use DHCP, you can set the IP address, and so on. Using DHCP is almost certainly the best approach unless your network lacks a DHCP server.
Much of the remaining configuration involves setting up various session types. ThinStation supports Windows Terminal Server, Citrix Server, X, VNC, Telnet, SSH, and AS400. Each has a sample configuration section with options that begin SESSION_#, where # is a session number.
You must number your sessions starting with 0. The original file supports just one session, a Windows Terminal Server session. To use a thin client with a Linux host, you’d comment out this configuration block, uncomment the X and/or VNC blocks, and make appropriate changes. For instance, to support one X and one VNC session, you might make changes so that your file includes the following lines:

SESSION_0_TITLE="Linux X sessions"
SESSION_0_TYPE=x
SESSION_0_X_SERVER=192.168.1.2
# Default is '-query'
SESSION_0_X_OPTIONS="-indirect"
# You should set also "SCREEN_X_FONT_SERVER", below

SESSION_1_TITLE="Linux via VNC"
SESSION_1_TYPE=vncviewer
SESSION_1_VNCVIEWER_SERVER=192.168.1.2:5901

This configuration tells ThinStation to connect to the server at 192.168.1.2 for both X and VNC. The use of the SESSION_0_X_OPTIONS="-indirect" line tells ThinStation to use an indirect XDMCP request. This results in a list of available XDMCP servers when you select this session type, so you can use any remote system that accepts XDMCP logins.
The default option of "-query" results in a direct login to just one X server, which may be preferable if you’ve only got one or two systems. The VNC IP address appends a port number (5901) to the end. This is handy if you define multiple VNC configurations on the server to support different screen resolutions or other features. Note that you can include more than one session of a single type. For instance, you might have three X sessions and two VNC sessions defined.
Further down in the file you’ll find some miscellaneous additional options. I recommend you pay particular attention to the following:

SCREEN_RESOLUTION Set the screen resolution in the obvious way using this option.
SCREEN_COLOR_DEPTH Set the screen’s color depth with this option.
SCREEN_HORIZSYNC and SCREEN_VERTREFRESH Set the monitor’s horizontal and vertical sync ranges with these options. You may need to consult the monitor’s manual to learn what appropriate values are. Note that very old monitors can be damaged if you set these values inappropriately.
MOUSE_ options Several options beginning with MOUSE_ default to values that are suitable for PS/2 mice. If the thin client has an RS-232 serial, USB, or other mouse, you may need to adjust these items.

Once you’ve made your changes, unmount the floppy disk. You can then walk the floppy and CD-R over to the computer you intend to use as a thin client and test them, as described shortly, in “Using ThinStation.”

Distributing ThinStation via TFTP

If you want to boot and distribute ThinStation via your network, you can do so with the help of the NetBoot image. Just as with the LiveCD version, you must attend to two key details: the ThinStation OS itself and a configuration file. Within the ThinStation-NetBoot directory tree, the TFtpdRoot subdirectory holds a file called thinstation.nbi (autoextract).exe. This is a Windows self-extracting archive that holds the ThinStation OS. You can extract this file on a Windows machine or use WINE under Linux to extract it.
The extracted file is called thinstation.nbi, and you should place it in your TFTP shared directory and ensure that your DHCP server is configured to point to this file, as described last month. If any of your computers have network cards that support direct network boots via PXE, you should also copy the thinstation.nbi.zpxe file to the same location, and point to this file using the filename line in dhcpd.conf.
If your network includes both PXE-enabled computers and those that don’t support PXE, configure DHCP to point to the thinstation.nbi.zpxe file; the disk-based boot tools described shortly are smart enough to drop the .zpxe filename extension.
In addition to the thinstation.nbi file, you’ll need to edit a ThinStation configuration file and place it in the TFTP files directory. The sample file is located in the TFtpdRoot subdirectory as thinstation.conf.network. Copy this file, without renaming it, to your TFTP server’s files directory. You can then edit the file to make the changes described for the thinstation.conf.user file for the LiveCD version of ThinStation; however, you don’t need to copy the file to a floppy disk, since clients will retrieve the file via TFTP.
If your thin clients have differing hardware capabilities, you can create custom configuration files for each client. The simplest way to do this for a small number of clients is to create a new configuration file for each computer. This file contains only the unique items for this computer — say, SCREEN_RESOLUTION if the computer has a higher- or lower-resolution screen than others.
Name the new file thinstation.conf-ID, where ID is the computer’s IP address or MAC address. For instance, thinstation.conf-192.168.1.27 holds the customizations for the thin client with an IP address of 192.168.1.27, or thinstation.conf-000C7696A373 holds customizations for the thin client with a network adapter MAC address of 00:0C:76:96:A3:73. A more sophisticated method of creating customizations, which may be preferable for networks with lots of systems, is detailed in the _HowTo-NetBoot.txt file that comes with the NetBoot distribution.
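
The naming rule can be sketched as a small shell function. This is an illustration, not part of ThinStation: the name ts_conf_name is made up, and the upper-case MAC form simply follows the example file name given above.

```shell
#!/bin/sh
# ts_conf_name ID: print the per-client ThinStation file name for ID.
# A MAC address has its colons removed and its hex letters upper-cased
# (an assumption based on the example name in the text); an IP address
# is used unchanged. The function name is illustrative only.
ts_conf_name() {
    case $1 in
        *:*) id=$(printf '%s' "$1" | tr -d ':' | tr 'abcdef' 'ABCDEF') ;;
        *)   id=$1 ;;
    esac
    printf 'thinstation.conf-%s\n' "$id"
}

ts_conf_name 00:0c:76:96:a3:73
ts_conf_name 192.168.1.27
```

The two calls print the file names used in the examples above, thinstation.conf-000C7696A373 and thinstation.conf-192.168.1.27.
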
Once these files are all copied to your TFTP files directory, you may need to prepare a boot floppy or CD-R. These images are stored in the BootDisk subdirectory. You can prepare a boot floppy by placing a formatted floppy disk in the drive, entering the BootDisk subdirectory, and typing:

dd if=eb-net.dsk of=/dev/fd0

If your target thin client lacks a floppy disk but has a CD-ROM drive, you can use cdrecord to burn the eb-net.iso file to CD-R:

cdrecord dev=/dev/cdrom eb-net.iso

You may need to tweak one or both of these commands for your system.
If your thin client supports PXE network booting, you shouldn’t need either the boot floppy or boot CD-ROM, but you may need to enter your BIOS setup utility to configure the boot devices. Be sure that booting from the network is an option, and that it’s higher in priority than other valid boot devices (such as a hard disk, if that disk is bootable).
Some plug-in network cards have boot ROMs that support PXE booting. When using such cards, you may need to disable other valid boot devices in your main BIOS. Consult the network card’s documentation for details.

Using ThinStation

To begin using ThinStation, you should first ensure that your servers are all configured and running, as described last month. Boot your thin client using whatever boot media you’ve prepared. If all goes well, you’ll see a series of boot messages appear on the screen, followed by a selection screen that lists your session options. Select one to test it.
Check to be sure you can log in fully and run programs. When you log out, you may be returned to the session selection screen or to an XDMCP login screen. If you see the XDMCP login screen but want to try another session, press Ctrl+Alt+Backspace to kill X; this will return you to the session selection screen.
Unfortunately, things don’t always go according to plan, and thin client troubles can be tricky to debug. Some common symptoms and likely solutions include:

A failure to fully boot, with a repeated listing of network modules — ThinStation locks itself into an endless loop testing its network drivers when it can’t detect a network card. If this happens, you may need to swap in a new network card or use the TS-O-Matic or Main Distribution option to create a custom ThinStation distribution that includes support for your card. A similar symptom can occur if your boot file (as specified by the filename option in your DHCP configuration) doesn’t exist. Be sure the filename matches.
A blank screen when you select an X session — ThinStation sometimes fails if you select an X session too quickly after the session list appears. Reboot and pause for a few seconds before selecting the X session.
A corrupt or blank display when you select a session — This can happen when you’ve specified incorrect video timing (horizontal and vertical sync values) for your monitor. Review your documentation and try again.
An “X”-shaped cursor on a gray background after you select an X session — This problem can occur when your XDMCP server isn’t properly configured. Review that part of the configuration. You might find it necessary to change XDMCP servers; I find that GDM and XDM are easier to configure than KDM.

In addition to these issues, firewall configuration can cause problems. Your host computer must not block incoming access to any of the vital ports. Such blocks can manifest themselves as problems at various points along the way.
With any luck, you won’t run into serious problems. Once it’s running, a thin client can be used much like a conventional desktop system. The fact that it’s running remotely will cause a speed hit, particularly on display-intensive applications. For typical office productivity tools, though, thin clients can be very usable tools.

Complete Kickstart: How to Save Time Installing Linux

The Requirements

Our requirements were that kickstart, once launched and after making a menu selection to choose a particular kickstart configuration, needed to be completely unattended. We also needed to install some local tools and make configuration changes to the installed boxes before they would be ready for use. The Anaconda installer menu must provide us with options to install multiple versions of this kickstart or to boot from hard drive. If no menu selection is made after a short timeout, the Anaconda installer is configured to boot from the hard drive.

Here’s what you need in order to perform a kickstart:

1. A Web server and/or FTP server for delivery of the RPMs that are to be installed.
2. A DHCP server for IP address assignments and to launch PXE Boot.
3. A TFTP server for download of PXE Boot components to the machines being kickstarted.
4. A PXE Boot capable network card.
5. A BIOS configured to allow a network boot on each computer to be kickstarted.

Each of the required servers can be located on a different system, or they can be combined onto a single computer.

In addition, Cisco Core routers require special configuration to transport UDP PXE Boot packets across subnet boundaries. Our environment requires the use of a serial console during Kickstart for menu selection. This gives us the ability to select from two or more different kickstart installations.

Initial Decisions and Sizing

We chose to use HTTP for file delivery, but due to the possibility that some need might arise in the future for an FTP kickstart, we decided to configure our kickstart server directory structure so that both FTP and HTTP can be used. We also chose to house the HTTP, TFTP, and DHCP servers on a single computer.

For our environment, we had no reason not to have all of the servers on one box, and the number of simultaneous kickstarts we expect to experience is well within the capability of the hardware and network infrastructure we have available. When sizing a prospective kickstart server, the limiting factors are most likely to be the hard drive data transfer rates and the network. Experience has shown that up to 20 or so systems can be kickstarted simultaneously in about an hour with a very modest Pentium 4, a single IDE hard drive, and a 100Mb connection.

Using a 3.0GHz Intel Core-Duo with 4GB of RAM and dual 120GB hard drives in RAID 1 configuration on a Gigabit Ethernet connection should allow us to support multiple simultaneous kickstarts in numbers far larger than we currently expect. The only reason we used this particular hardware is that it is what we had available.

Kickstart Sequence of Events

A network-based kickstart can be initiated by a PXE Boot capable network card. The PXE Boot first requests an IP address from a DHCP server. It also obtains the location of a PXE Boot file from the DHCP server. PXELINUX is a bootloader for Linux using the PXE network booting protocol. The PXE Boot file is loaded from the TFTP server along with the contents of a file which defines the location and name of the installation kernel and initrd.img file as well as some parameters for the boot kernel and a menu for the Anaconda installer. This configuration file for Anaconda also contains the location of the kickstart configuration file to be used during the installation.

The PXE Boot file then loads the boot kernel and initrd image, still using TFTP. After booting, Anaconda starts, loads the menu, and displays a window with a timer and several menu options. The menu and timeout can be skipped if you do not need to make any choices here.

After choosing the desired kickstart installation, Anaconda locates the kickstart configuration file from the HTTP server and reads it. The kickstart configuration file has a default name of ks.cfg, but can be named anything. We use several for our different configurations, so we provide unique names for each. If all of the data required to perform a complete installation is included in the kickstart configuration file, the installation completes without further intervention from the administrator. The RPM files used during the installation are downloaded from the HTTP server as they are needed.

The kickstart configuration file can also contain bash script commands that can be run both before and after the rest of the installation. We make extensive use of the post-installation bash scripts to perform installations of locally required RPM packages and tarballs as well as to make configuration changes before the first reboot.

Hardware configuration

In order to boot from the network it is necessary not only to have a network card that is capable of a network boot, but also to configure the BIOS to boot appropriately. You have a couple of options. The first is to always attempt to boot from the network as the first choice, then CD/DVD, and then from the hard drive. The second is to boot from the CD/DVD first, then the hard drive, and finally from the network. Choose the option that best fits your needs.

If the hard drive comes before the network in the boot order, forcing a network boot takes an extra manual step: you must overwrite the boot record so that the machine can no longer boot from the hard drive. This can be done with a small script or from the command line using the dd command, but it is another point of intervention.
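
As a rough sketch of the dd approach, the function below zeroes the first 512-byte sector of whatever target it is given. The function name wipe_boot_record and the device name /dev/sda in the comment are assumptions for illustration. On a real disk this also destroys the partition table, so it is only appropriate for a machine that is about to be reinstalled.

```shell
#!/bin/sh
# wipe_boot_record TARGET: overwrite the first 512-byte sector of TARGET
# with zeros. DESTRUCTIVE on a real disk: the boot code and the partition
# table are both lost. The function name is illustrative only.
wipe_boot_record() {
    # conv=notrunc leaves the rest of TARGET untouched when it is a regular
    # file (useful for trying this on a disk image first).
    dd if=/dev/zero of="$1" bs=512 count=1 conv=notrunc 2>/dev/null
}

# Example (as root, on a machine you intend to kickstart):
#   wipe_boot_record /dev/sda
```

Trying the function on a scratch image file first is a cheap way to check that it touches only the first sector.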

We chose to configure BIOS to boot first from the network. We then set a short timeout for Anaconda so that the default is to boot to the hard drive if no other action is taken.

DHCP Configuration

The /etc/dhcpd.conf file must be configured correctly to provide an IP address for each client host as well as information necessary to initiate a PXE Boot sequence for each kickstart client host. DHCP determines the host name using the MAC address of the NIC making the request. Although the IP address can be specified in the dhcpd.conf file, we use DNS to maintain the addresses and DHCP does the lookup and then passes the address to the host.

DHCP can also serve a range of addresses rather than a specific address for each host, but that is outside the scope of this article.

#######################################################################
allow booting;
allow bootp;
dns-update-style ad-hoc;
option domain-name "cisco.com";
option domain-name-servers 109.99.6.247;
max-lease-time 604800;
default-lease-time 604800;
deny unknown-clients;

# The next-server line is required even though we point to ourselves.
# Resolves some issues relating to pxeboot across subnets.
next-server 109.99.101.74;

# 109.99.222 Subnets:
subnet 109.99.222.0   netmask 255.255.255.0
  { authoritative ; option routers 109.99.222.1 ; }

# Red Hat Enterprise Linux 5.1 Kickstart boxes
group {
filename "RHEL/pxelinux.0";
host ems-lnc100.cisco.com
  { hardware ethernet 00:15:17:1D:42:88 ; fixed-address ems-lnc100.cisco.com ;}
host ems-lnx118.cisco.com
  { hardware ethernet 00:04:23:B7:9A:15 ; fixed-address ems-lnx118.cisco.com ;}
host ems-lnx145.cisco.com
  { hardware ethernet 00:04:23:B5:6B:A9 ; fixed-address ems-lnx145.cisco.com ;}
}
#######################################################################

Listing 1: The very basic dhcpd.conf required to support kickstarts.

The filename "RHEL/pxelinux.0"; statement in the group stanza directs the PXE Boot to load the pxelinux.0 boot file from the specified directory, RHEL. The full path for this directory in our setup is /opt/tftpboot/RHEL where /opt/tftpboot is a symbolic link to /tftpboot. The TFTP root, /tftpboot, is defined in /etc/xinetd.d/tftp.

In each host stanza we specify the MAC address of the NIC in the respective hosts and the hostname. DHCP queries DNS for the IP address and passes it back to the host along with the router and DNS server information.

We discovered during configuration of our server for the kickstart role that the next-server line is required in dhcpd.conf to resolve some PXE Boot issues even though the next-server is really the same server in our case. You should use this statement no matter which box hosts the PXE Boot server, even if it is the same as the DHCP server. It took us a couple days to figure this out and it is one of the things we could not find documented anywhere.

The allow booting and allow bootp statements are both required for kickstarts to function.

All of the options pertaining to PXE Boot can be placed in the group or individual host stanzas as well as in the global section of the DHCP configuration. This allows you as much granularity as you need to have multiple servers and kickstart configurations as well as to ensure that only specific hosts or groups of hosts can be kickstarted.

The PXE Boot files

Three PXE Boot files are required to perform a network boot. The first is pxelinux.0, the network boot loader. The second is the network boot kernel, vmlinuz, and the third is the initial RAM disk image, initrd.img.

We placed pxelinux.0 in /opt/tftpboot/RHEL/ as this is the location we specified in dhcpd.conf. We also have discovered that this is the only place from which it works.

The kernel and RAM disk image files are placed in a distribution or release unique location such as /opt/tftpboot/RHEL/RHEL-server. We also have an RHEL workstation based release we use and place its files in /opt/tftpboot/RHEL/RHEL-workstation. This allows us to keep them separate and helps us to know which is which. We have seen configurations in which files for different distributions and releases are all located in a single directory and named differently. Our method works better for us because we like the additional organization it imposes.

For the most part one set of PXE Boot files is pretty much like another. Most Red Hat Enterprise based distributions currently provide a set of these files. Most of these files should work with most distributions. However we did find that the Red Hat Enterprise Linux 5.1 files are specific to that distribution and that PXE Boot files from other distributions such as CentOS do not work with RHEL 5.1.

TFTP Configuration

The TFTP configuration file, /etc/xinetd.d/tftp, should look like the sample configuration below. We changed disable = yes to disable = no and server_args = -s -c -v -v -v /tftpboot to server_args = -s -c -v -v -v /opt/tftpboot.

#######################################################################
# default: off
service tftp
{
 disable = no
 socket_type  = dgram
 protocol  = udp
 wait   = yes
 user   = root
 server   = /usr/sbin/in.tftpd
 server_args  = -s -c -v -v -v /opt/tftpboot
 per_source  = 11
 cps   = 100 2
 flags   = IPv4
}
#######################################################################

Listing 2. The TFTP configuration file required only minor changes.

Creating the PXE Boot configuration file

Each host that is to be kickstarted requires a unique configuration file, which is located in the /opt/tftpboot/RHEL/pxelinux.cfg directory. This file specifies the locations of files such as the kernel and the initrd image. Each configuration file is named with the hexadecimal representation of the IP address of the computer to be kickstarted.

You’ll find an online IP to Hex converter at http://tinyurl.com/4lzthf and another tool, written in Perl, is available at http://tinyurl.com/4pz6g3. Usage is very straightforward for each of these tools.

For example, the IP address 192.168.0.55 converts to C0A80037 in hex, so in this case the name of the configuration file for the host with IP address 192.168.0.55 is C0A80037.
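
If you would rather not depend on an online converter, the conversion is easy to do in the shell. The function below is a minimal sketch; the name ip_to_hex is made up for this example, and it assumes a plain dotted-quad IPv4 address with no leading zeros in any octet.

```shell
#!/bin/sh
# ip_to_hex ADDR: print a dotted-quad IPv4 address as eight upper-case
# hexadecimal digits, the form used for the configuration file names.
# The function name is illustrative only.
ip_to_hex() {
    # Split the address on dots into four positional parameters.
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    # Each octet becomes two hex digits.
    printf '%02X%02X%02X%02X\n' "$1" "$2" "$3" "$4"
}

ip_to_hex 192.168.0.55    # prints C0A80037
```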

Loading the PXE Boot configuration file

The PXE Boot configuration files contain information that allows PXEBoot to locate the kernel and initrd image files for the kickstart process. They also specify the serial console parameters and provide a menu for selection of the desired kickstart. The kernel and initrd images are not the files that will be installed on the kickstarted machine, but are used only as the running operating system during the kickstart itself.

The PXE Boot process tries to load the correct file for the computer by using an interesting algorithm. First it tries to load a file with a name based on the MAC address of the system, then with names based on the hexadecimal IP address, removing one hex digit for each failure. The sequence would look like this:

/opt/tftpboot/RHEL/pxelinux.cfg/01-22-33-44-aa-cc-ee
/opt/tftpboot/RHEL/pxelinux.cfg/C0A80037
/opt/tftpboot/RHEL/pxelinux.cfg/C0A8003
/opt/tftpboot/RHEL/pxelinux.cfg/C0A800
/opt/tftpboot/RHEL/pxelinux.cfg/C0A80
/opt/tftpboot/RHEL/pxelinux.cfg/C0A8
/opt/tftpboot/RHEL/pxelinux.cfg/C0A
/opt/tftpboot/RHEL/pxelinux.cfg/C0
/opt/tftpboot/RHEL/pxelinux.cfg/C
/opt/tftpboot/RHEL/pxelinux.cfg/default
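
To see which names a given client will ask for, the search order can be reproduced with a short function. This is a sketch of the sequence listed above, not code taken from PXELINUX; the name pxe_candidates is made up. The MAC-based name is the lower-case MAC address with dashes, prefixed by 01, the hardware type code for Ethernet.

```shell
#!/bin/sh
# pxe_candidates MAC ADDR: list, in order, the configuration file names
# tried for a client with the given MAC and IPv4 address.
# The function name is illustrative only.
pxe_candidates() {
    # MAC-based name: "01-" (Ethernet) plus the lower-case MAC with dashes.
    printf '01-%s\n' "$(printf '%s' "$1" | tr ':' '-' | tr 'ABCDEF' 'abcdef')"
    # Hex form of the IP address.
    old_ifs=$IFS; IFS=.; set -- $2; IFS=$old_ifs
    hex=$(printf '%02X%02X%02X%02X' "$1" "$2" "$3" "$4")
    # Try the full hex name, then drop one trailing digit per attempt.
    while [ -n "$hex" ]; do
        printf '%s\n' "$hex"
        hex=${hex%?}
    done
    # Last resort.
    echo default
}

pxe_candidates 22:33:44:AA:CC:EE 192.168.0.55
```

Prefixing each printed name with /opt/tftpboot/RHEL/pxelinux.cfg/ gives the paths in the list above.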

The contents of our files are identical for each of the installations this process is designed for, so only a single master file is located at /opt/tftpboot/RHEL/pxelinux.cfg. We then use a soft link with the hexadecimal IP address as its name to point to the master file. We can do this because all of our Intel boxes have the same kickstart choices available. You could also use individual files if that suits your needs better.
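
Creating the links by hand gets tedious with more than a few hosts, so a helper along these lines may be useful. It is a sketch under stated assumptions: the function name link_pxe_config and the master file name in the example are made up, since the actual name of our master file is not given here.

```shell
#!/bin/sh
# link_pxe_config DIR MASTER ADDR: inside DIR, create a symbolic link named
# after the hex form of ADDR that points at the file MASTER (also in DIR).
# The function name is illustrative only.
link_pxe_config() {
    # Keep DIR and MASTER intact; split only ADDR on dots.
    old_ifs=$IFS; IFS=.; set -- "$1" "$2" $3; IFS=$old_ifs
    hex=$(printf '%02X%02X%02X%02X' "$3" "$4" "$5" "$6")
    # -f replaces an existing link. The target is relative, so the link
    # stays valid if the TFTP tree is moved.
    ln -sf "$2" "$1/$hex"
}

# Example (the master file name is hypothetical):
#   link_pxe_config /opt/tftpboot/RHEL/pxelinux.cfg master.cfg 192.168.0.55
```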

The contents of our master file are shown below:

#######################################################################

# RHEL5 Kickstart configuration file.
#
# NOTE: The workstation and server versions of the RHEL 5.1 images require
# different initrd.img files.

default 1
prompt 1
timeout 200
display msgs/Main.msg
F1 msgs/Main.msg
F2 msgs/general.msg
F3 msgs/expert.msg
F4 msgs/param.msg
F5 msgs/rescue.msg

#F1 Main.msg

# Hard drive
label 1
    localboot 1

# RHEL5.1-MAX
label 2
    kernel RHEL-workstation/vmlinuz
    append ksdevice=eth0 initrd=RHEL-workstation/initrd.img \
    console=ttyS0,9600n8 ramdisk_size=6804 \
    ks=http://emstools2b.cisco.com/pub/kickstart/rhel-AP-Max-ks.cfg

# RHEL5.1-MIN
label 3
    kernel RHEL-server/vmlinuz
    append ksdevice=eth0 \
    initrd=RHEL-server/initrd.img console=ttyS0,9600n8 \
    ramdisk_size=6804 \
    ks=http://emstools2b.cisco.com/pub/kickstart/rhel-AP-Min-ks.cfg
#######################################################################

Listing 3. PXE Boot configuration files

In Listing 3 the PXE Boot configuration files contain data to create a menu for Anaconda and information that allows the PXE Boot process to locate the files needed to boot. The append lines have been split for formatting purposes, but should be on a single line when used.
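
If you copy a listing that has been split this way, a sed filter can put the append lines back together. This is a sketch: the function name join_continuations is made up, and it assumes every continued line ends with a backslash.

```shell
#!/bin/sh
# join_continuations: read text on stdin and merge every line that ends in a
# backslash with the line that follows it, collapsing the break (and the
# whitespace around it) to a single space. Other lines pass through as-is.
# The function name is illustrative only.
join_continuations() {
    sed -e ':a' \
        -e '/\\$/N' \
        -e 's/[[:space:]]*\\\n[[:space:]]*/ /' \
        -e 'ta'
}

# Example: join_continuations < listing3.txt > C0A80037
```

The file names in the example comment are placeholders. Check the result before using it, because a missing backslash in the source leaves that line unjoined.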

Note that there are multiple stanzas in the file, one for each kickstart installation that is defined. Each stanza specifies a different vmlinuz and initrd.img as well as the location and name of the kickstart file to be used. Console parameters are also specified in the PXE Boot configuration file because we use the console to make the menu choice for the desired kickstart and to monitor the installation.

We also added the statement ksdevice=eth0 to the append line. This prevents manual intervention to choose the install NIC when more than one NIC is present. This information was also very hard to find.

This file also contains the definitions of the various menu options we want the Anaconda installer to provide, as well as the Function Key definitions for various help options. The menu options are created by Anaconda using the labels in each stanza. So the menu choices we have are 1, 2 and 3. Note that option 1 is local boot from the hard drive and that the “default 1” line specifies that the system will boot to the hard drive after the timeout. The “timeout 200” line specifies the length of the timeout in tenths of a second. This is a strange unit, but the value of 200 results in a timeout of twenty seconds.

The data to generate and display the menu itself is located in the file /tftpboot/RHEL/msgs/Main.msg. Separating the files that specify the options from the file that displays the available options allows us to define hidden options should we need to do that.

#######################################################################

       09Welcome to 0cThe Cisco Linux09 Installer!07
0a
                            |       |
                        . | | | . | | | .
                            '       '
                            C I S C O
07

Enter number of the Linux distribution you wish to install:

1. Cisco CEL 4
2. Red Hat Enterprise Linux MIN (Test)
3. Red Hat Enterprise Linux MAX (DEV test)

05[F1-Main] [F2-General] [F3-Expert] [F4-Kernel] [F5-Rescue]07

#######################################################################

Listing 4: The file /tftpboot/RHEL/msgs/Main.msg

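You’ll see that Listing 4 contains the menu for the Anaconda installer. We have added our own options to the menu. The two-digit codes such as 09, 0c, and 07 are color attributes; in the actual file each one is preceded by a Ctrl-O control character, which is not visible in the listing.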

Cisco Core router configuration

The DHCP and TFTP protocols both use UDP rather than TCP packets. Most UDP packets are not forwarded across subnet boundaries and we have many different subnets in our network. Many Cisco routers with current versions of IOS have the ability to configure helper addresses for UDP packets. This enables the router to forward UDP packets to the DHCP and TFTP servers or to specific subnet(s).

Based on our experience, you should only configure this on the core router closest to your server.

#######################################################################
ip forward-protocol udp
!
interface ethernet 1
 ip helper-address 10.44.23.7
interface ethernet 2
 ip helper-address 192.168.1.19
#######################################################################

Listing 5. Sample Configuration

Listing 5, a sample configuration from the Cisco IOS IP Configuration Guide, Release 12.2, shows the commands required to set up a helper address.

If no protocols are specified on your router, the helper address forwards broadcasts for every UDP service in the router’s default set, which covers more than DHCP and TFTP, to your kickstart server. If this is not what you want, be sure to specify only those protocols that need to be forwarded. This is another piece of information that was very hard to locate. Refer to the Cisco IOS IP Configuration Guide, Release 12.2, for details of this and related commands.

This will not be an issue if your DHCP and TFTP servers are located in the same subnet as all of the hosts you wish to kickstart.

Web Server (Apache) configuration

We chose Apache for our web server because it is supplied by all Red Hat distributions and because we use it on other internal servers so are familiar with its operation. Once you have Apache installed and running, nothing else needs to be done to the configuration to make it work for kickstarts. All you have to do is place the files in a location that is served by Apache.

Because we wanted our server to be as flexible as possible, we planned for the possibility of eventually supporting both FTP and HTTP kickstarts, even though we use only HTTP at this time. We therefore chose a directory structure starting at /var/ftp/pub and created a symbolic link to this location from /var/www/html.

ln -s /var/ftp/pub /var/www/html/pub

Making the RPMs Available

While we wanted to make the ISO images of RHEL 5.1 available for download so that users can burn their own installation DVDs, it is also necessary to make the RPMs located in the ISO images available for the kickstarts. In order to accomplish this without having to store the files on the hard drive twice, we chose to keep only the ISO images on the hard drive and mount them using the loopback device to make the individual files in the ISO available to the kickstart.

To accomplish this, the following directories were created.

/var/ftp/pub/rhel/5.1/isos/i386/
/var/ftp/pub/rhel/5.1/os

The ISO images for RHEL 5.1 client and server were copied to the /var/ftp/pub/rhel/5.1/isos/i386 directory. The following entries were added to /etc/fstab to mount the ISOs automatically at boot time.

#######################################################################
# Each fstab entry must occupy a single line; backslash continuations
# are not supported.
/var/ftp/pub/rhel/5.1/isos/i386/rhel-5.1-client-i386-dvd.iso /var/ftp/pub/rhel/5.1/os/i386/workstation iso9660 loop=/dev/loop1,ro 0 0
/var/ftp/pub/rhel/5.1/isos/i386/rhel-5.1-server-i386-dvd.iso /var/ftp/pub/rhel/5.1/os/i386/server iso9660 loop=/dev/loop2,ro 0 0
#######################################################################

Listing 6: /etc/fstab

The entries in Listing 6 mount the ISO images so that the files they contain are available for the kickstarts.

Note that we chose to use the loop1 and loop2 devices instead of the loop0 device so that the loop0 device would be available to anyone wanting to use a loopback.
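Entries like these are easy to mistype when several ISO images are involved, so it can help to generate them. The following is a minimal sketch, not part of our original setup; the function name iso_fstab_entry is made up for illustration, and the paths and loop device are simply those from Listing 6.

```shell
# Print one /etc/fstab line that loop-mounts an ISO image read-only.
# Arguments: ISO path, mount point, loop device.
# (iso_fstab_entry is a hypothetical helper name, not a standard tool.)
iso_fstab_entry() {
    printf '%s %s iso9660 loop=%s,ro 0 0\n' "$1" "$2" "$3"
}

# Example using the server ISO from Listing 6.
base=/var/ftp/pub/rhel/5.1
iso_fstab_entry "$base/isos/i386/rhel-5.1-server-i386-dvd.iso" \
    "$base/os/i386/server" /dev/loop2
```

The mount point named in each entry must already exist (mkdir -p creates it along with any missing parents) before the entry can be mounted; running mount -a as root then mounts every entry in /etc/fstab without a reboot.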

The Kickstart Configuration File

The kickstart configuration file, by default named ks.cfg, is used by Anaconda to define the parameters of the installation. This file provides the answers to all of the questions and entries to all of the fields required by the installation process. Only by having answers to each and every question can the kickstart be fully automated. If any of the required fields does not have an entry, the installation halts and waits for input.
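Because a single missing answer stalls an otherwise unattended install, a quick check of the file before use can save a trip to the console. The sketch below is only an illustration: the helper name missing_ks_commands is hypothetical, and the list of commands passed to it is our assumption about what a given installation needs, not Red Hat's definitive list of required commands.

```shell
# Report kickstart commands that are absent from a kickstart file.
# Usage: missing_ks_commands FILE COMMAND...
# Prints each missing command on its own line; prints nothing if all exist.
# (Hypothetical helper; the command list is supplied by the caller.)
missing_ks_commands() {
    ks_file=$1
    shift
    for cmd in "$@"; do
        # A command counts as present if some line starts with it,
        # optionally indented, followed by whitespace or end of line.
        grep -Eq "^[[:space:]]*${cmd}([[:space:]]|\$)" "$ks_file" || echo "$cmd"
    done
}

# Demonstration with a deliberately incomplete file.
demo=$(mktemp)
printf 'install\nlang en_US\nkeyboard us\n' > "$demo"
missing_ks_commands "$demo" install lang keyboard rootpw timezone
rm -f "$demo"
```

This only checks that a command appears at all; it does not validate the arguments that follow it.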

Creating the Starting ks.cfg File

We initially created the kickstart file using the kickstart GUI configurator. Using this configurator allowed selection of the major software groups to be installed. There are other ways to obtain a kickstart configuration file to use as a starting point. Each time Red Hat Linux is installed, a kickstart configuration is stored at /root/anaconda-ks.cfg. This file can be used to exactly recreate the installation as it was performed. You could generate a kickstart file by performing a manual installation with the exact configuration you want and then use the anaconda-ks.cfg file generated as the starting point.

We renamed our kickstart files from anaconda-ks.cfg to something more meaningful: rhel-AP-Max-ks.cfg and rhel-AP-Min-ks.cfg. This lets us tell from the name which type of installation each file is for, and also lets us keep multiple files in the same directory.

The kickstart configuration files have several sections. Each section has statements pertaining to a specific portion of the installation. Some sections are optional.

We did not use the %pre section, which allows running scripts before the installation begins, so we will start with the command section. Most of this should be relatively self-explanatory, but if you need more information on any portion, the Red Hat Enterprise Linux Installation Guide (see Resources) contains an excellent description of each section of a kickstart file and describes each of the possible statements and commands that can be used. For the sake of brevity we will only discuss certain key portions of our kickstart file.

#######################################################################
# This is an installation not an upgrade
install
# The location of the RPM files
url --url http://emstools2b.cisco.com/pub/rhel/server
key 9a09007d99b6cd00
lang en_US
# Use text mode install
text
keyboard us
xconfig --defaultdesktop kde --resolution 640x480 --depth 8
network --device eth0 --bootproto dhcp --onboot=on
rootpw --iscrypted $1$tihTg7ne$hohhkj87hGGddg9B4WkXV1
authconfig --useshadow --enablemd5
selinux --disabled
timezone America/New_York
firewall --disabled
firstboot --disable
# Reboot after installation
reboot
bootloader --location=mbr --append="console=ttyS0,9600n8"
clearpart --all --initlabel

# define partitions
part /boot --fstype ext3 --size=512
part /opt --fstype ext3 --size=10000 --grow
part /usr --fstype ext3 --size=10000
part /tmp --fstype ext3 --size=7500
part /var --fstype ext3 --size=7500
part /home --fstype ext3 --size=2500
part swap --size=2048
part / --fstype ext3 --size=2048
part /usr/local --fstype ext3 --size=2000

#######################################################################

Listing 7: The command section of our kickstart file.

We added a key line to this section. This is what Red Hat calls the Installation Number, and it is required to enable all Linux functionality. Just what would not be enabled without it is not specified. If the key line is not included in the kickstart file, the installation stops and waits for input, which is not what we want.

We specified a text mode install because, as mentioned before, we need to use the console for installation and a graphical installation would not work for us.

We specified our console parameters in the “bootloader” line to ensure that they matched those of our console servers.

Due to issues we had with creating LVMs using kickstart, we only created EXT3 partitions, not Logical Volumes. We intend to revisit this and determine whether Logical Volumes can be used. It may be that, because our procedures are to simply re-kickstart systems that have any significant issues, such an effort would be more trouble than it is worth.
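With fixed-size partitions like those in Listing 7, the installation needs a disk at least as large as the sum of the --size values. The following sketch adds them up; the function name ks_min_disk_mb is hypothetical, and the partition lines in the demonstration are copied from Listing 7.

```shell
# Sum the --size values (in megabytes) of every "part" or "partition"
# line read from standard input. Accepts "--size=512" and "--size 512".
# (ks_min_disk_mb is a hypothetical helper name.)
ks_min_disk_mb() {
    awk '
        $1 == "part" || $1 == "partition" {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^--size=/) { sub(/^--size=/, "", $i); total += $i }
                else if ($i == "--size" && i < NF) { total += $(i + 1) }
            }
        }
        END { print total + 0 }
    '
}

# Demonstration with the partition lines from Listing 7.
ks_min_disk_mb << 'EOF'
part /boot --fstype ext3 --size=512
part /opt --fstype ext3 --size=10000 --grow
part /usr --fstype ext3 --size=10000
part /tmp --fstype ext3 --size=7500
part /var --fstype ext3 --size=7500
part /home --fstype ext3 --size=2500
part swap --size=2048
part / --fstype ext3 --size=2048
part /usr/local --fstype ext3 --size=2000
EOF
```

For Listing 7 this prints 44108, so a disk needs roughly 44 GB before the --grow option on /opt has anything left to claim.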

The %packages section of our kickstart file defines which groups are installed. These are the names preceded by an “@” sign. Individual RPM packages can also be specified just by placing the appropriate package name on a line by itself in this section.

You can specify RPM packages that are not to be installed even if they are part of a group that you otherwise need to install. These RPMs are specified on a line by themselves but are preceded by a “-” sign.

#######################################################################
%packages
@engineering-and-scientific
@mysql
@development-libs
@editors
@system-tools
@gnome-software-development
@text-internet
@gnome-desktop
@core
@base
@ftp-server
@network-server
@legacy-software-development
@java-development
@printing
@kde-desktop
@mail-server
@server-cfg
@sql-server
@admin-tools
@development-tools
@graphical-internet
festival
audit
kexec-tools
bridge-utils
device-mapper-multipath
dnsmasq
imake
-sysreport
mc
libgnome-java
libgtk-java
libgconf-java
xorg-x11-server-Xnest
xorg-x11-server-Xvfb
-compiz-kde
-knetworkmanager
-amarok
#######################################################################

Listing 8: The %packages section of the kickstart file defines the groups and packages to install.
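Long %packages lists tend to pick up repeated entries as they are edited over time. Repeats are harmless to the installer but make the list harder to maintain. The sketch below, with the hypothetical name dup_packages, prints any group or package named more than once in the %packages section of a kickstart file read from standard input.

```shell
# Print every entry that appears more than once in the %packages section
# read from standard input. Blank lines and comment lines are ignored.
# (dup_packages is a hypothetical helper name.)
dup_packages() {
    awk '
        /^%packages/ { in_pkgs = 1; next }   # section starts
        /^%/         { in_pkgs = 0 }         # any other section ends it
        in_pkgs && NF && $1 !~ /^#/ { print $1 }
    ' | sort | uniq -d
}

# Demonstration with a short sample section.
dup_packages << 'EOF'
%packages
@core
mc
ksh
mc
%post
echo mc
EOF
```

The demonstration prints mc once, because that entry occurs twice inside %packages; the line in %post is not counted.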

Using Post-installation scripts

We invested a great deal of effort developing the post-install scripts defined in the %post section of the kickstart configuration file. These scripts allow us to perform installation and configuration of RPMs and tarballs that are not part of the Red Hat installation.

The important thing to remember about the post-installation scripts is that they are executed using the bash command interpreter in a chroot’ed environment that behaves as it will when rebooted after the installation. This allows virtually any action that you could possibly work into a script to be performed during the final stages of installation.

#######################################################################
%post
# Install the yum repository configuration files
cd /tmp
wget http://emstools2b.cisco.com/pub/local/lab-repos.tar
cd /
tar -xvf /tmp/lab-repos.tar

# Set an ID to be used for other scripts
touch /LINUX_RHEL_MINIMAL_INSTALL

# Install Kshell as a preference of some developers.
yum -y install ksh

# Configure some local NFS mount points
service portmap start
mount  emsnfs:/export/linux/post   /mnt
cat /mnt/auto_localnfs >> /etc/auto.misc
cat /mnt/auto_misc >> /etc/auto.misc

# Get the command to create the motd and create it for the first time.
cp /mnt/createMOTDLinux /etc/init.d/create_motd
mv /etc/motd /etc/motd.orig
/etc/init.d/create_motd > /etc/motd

umount /mnt

# Create symlinks for mount points
# the links to /localnfs are to work around the issue with Linux
# mount points not being browsable as they are in Unix
mkdir /localnfs
ln -s /misc/apps       /localnfs/apps
ln -s /misc/rtp-chaos  /localnfs/rtp-chaos
ln -s /misc/black-hole /localnfs/black-hole
ln -s /misc/tools      /localnfs/tools
ln -s /misc/tftpboot   /localnfs/tftpboot
mkdir /opt/scratch
ln -s /opt/scratch /scratch

# Create ssh authorized keys
# Make the directory
mkdir /root/.ssh

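# Create the keys file. Each key must occupy a single line in the real
# file; the keys are wrapped here only to fit the page.
cat  << xxEOFxx >> /root/.ssh/authorized_keys
ssh-dss AAAAB3NzaC1kc3MAAACBAKyW6vv6uHKGKL54765VBHKJHhbfvfhJ/rkspGK2pmAM7awj7EwB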
/wUBZUucmQSYnyaOlbvS6NkdE+sUC/asU/mEZjzoQgP+kdahxfJvWATaJweVFjRdHrIZxPB4nlO+MEBb
cPmUP7cLLQ1KGbfUakr35qzb9RjpBPDcBSDW2GZRAAAAFQCD/qw8FCSfEyWAmtkXDioJBWUCOwAAAIAm
4czfxx+Srm7FxGDTsiL52ojKzZCzddTi6YclknBXYpa3jhjhDfgkbGfHc746cVXm3hJ9ZgA3RQpMypKn
WS6EHimjkjEeqfw/viqPR1NCvj1xVs9XDjRtCelwsxUNj31Y2RHCsusa6DDwG765bnlk/BO4lUGRQpNy
QAAjKyDhPwAAAIAPJQcSf0tc4OrqNxy/gjkhkhgghfTRerthkljhGuyKarrmWan9ZkkFJQYnp09GNasZ
zI7Zwau3oqfutPTWJFehBskFKvRpSjYd59vKjWpDyCE5xHYxZfDORTj4pzjRSyiXDP/viA5DBCUWieM4
zGWa1RKVdskjPFS56y5GAkEwcA== root@emsjumps
ssh-dss AAAAB3NzaC1kc3MAAACBAOqZBr62GU
La+NwGUatvO7OVXqGDn4qXvR2GAUputz9uyYmcWTvoHG0D3eAQ2flqhpyhJQo63GyntUtmGkXIHFuM3z
4qDt19qcpFRj10ZzRbZGhf+QbJwkxA9fpOy/BmoYykW5l36Db/Dvlzk4zNgJAmGXb2rNv8RSqYC6kCZf
aNAAAAFQDTb8EsksyknY++4zXC3TPNrH/+MwAAAIEAz3OCUfZXo+e/lJ/KSFj1un378KGGo9qfGSpVMV
Tva/z58KAZ174tJpgnfA8+fQOwq/ip8s9UyHA2qR+BICjjZo1KatevFN7l4rpNSqdLivEasrGBu6fRTP
/kQ6vt+OLIAQyr8t9RqpZKUVdd9odFA9NLiuOhG//eh2cDSXmjFnkAAACAbgzdEMcCMeMT/XPJrkZ/md
TX/EJ6VNQEuTP3fhrjKKjccYobXPOQhvliIhPGFbtrRZlYRPPFAkAAse0qRPOsy8XHKD18WnQr5JNJx+
C5PYMkb8APY55Ydwwrt4EFeqnFpF3RXFhPY1eiZNAI33GopEGVTiLTO4ZW9mYC8EI7e28= root@emstools
xxEOFxx

# Copy the logbanner and change sshd_config
cat  << xxEOFxx >> /etc/LogBanner
                               WARNING!!!
                   READ THIS BEFORE ATTEMPTING TO LOGON                      

     This System is for the use of authorized users only.  Individuals
     using this computer without authority, or in excess of their
     authority, are subject to having all of their activities on this
     system monitored and recorded by system personnel.  In the course
     of monitoring individuals improperly using this system, or in the
     course of system maintenance, the activities of authorized users
     may also be monitored.  Anyone using this system expressly
     consents to such monitoring and is advised that if such
     monitoring reveals possible criminal activity, system personnel
     may provide the evidence of such monitoring to law enforcement
     officials.                                                              

     Cisco Acceptable Use Policy:                                            

http://wwwin.cisco.com/infosec/policies/acceptable_use.shtml

xxEOFxx

echo "Banner /etc/LogBanner" >> /etc/ssh/sshd_config

#######################################################################

Listing 9: Post-installation Script Example

Listing 9 contains part of our post-installation script, which installs RPMs and tarballs designed for our unique lab environment and performs other necessary tasks.

As you can see, our post-installation is quite extensive. In addition to installing several software packages we require, it sets up a login banner, creates the /root/.ssh directory and copies some public keys there. We have shown only some of these keys to save space.

Notice that we can also start services as in the line service portmap start and access files on NFS mounts during this last portion of the kickstart. Post-installation provides a very flexible environment for performing a great many automated tasks.
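Several lines in the post-installation script append to existing files with >>. That is fine during a fresh kickstart, but if a fragment of the script is ever rerun by hand on an installed system, the same line is added again. One way to guard against that is sketched below; append_once is a hypothetical helper name and is not part of our script.

```shell
# Append a line to a file only if that exact line is not already present.
# Arguments: the line to add, the file to add it to.
# (append_once is a hypothetical helper name.)
append_once() {
    line=$1
    file=$2
    # -x matches whole lines only, -F treats the line as a fixed string.
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demonstration on a scratch file standing in for sshd_config.
scratch=$(mktemp)
append_once "Banner /etc/LogBanner" "$scratch"
append_once "Banner /etc/LogBanner" "$scratch"
cat "$scratch"
rm -f "$scratch"
```

The demonstration calls the function twice, yet the scratch file ends up with a single Banner line.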

Perform the kickstart

Performing the kickstart is very easy because we have done all of the hard work in setting up the network kickstart. We have four basic steps to perform.

Add the computer to DNS.
Add the appropriate information to the dhcpd.conf file.
Boot the computer.
Select the desired kickstart from the menu.

The automated kickstart does the rest. The first two steps only need to be performed the first time a computer is kickstarted; after that the DNS and DHCP information will already be there.
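The DHCP step usually amounts to adding a host declaration for the new machine. As a rough sketch, the function below prints one in ISC dhcpd syntax; the function name dhcp_host_entry and the host name, MAC address and IP address in the example are placeholders, and any site-specific options your dhcpd.conf needs are not shown.

```shell
# Print an ISC dhcpd "host" declaration for one machine.
# Arguments: host name, MAC address, fixed IP address.
# (dhcp_host_entry is a hypothetical helper name.)
dhcp_host_entry() {
    printf 'host %s {\n' "$1"
    printf '    hardware ethernet %s;\n' "$2"
    printf '    fixed-address %s;\n' "$3"
    printf '}\n'
}

# Example with placeholder values.
dhcp_host_entry labhost1 00:11:22:33:44:55 10.44.23.50
```

The output can be pasted into dhcpd.conf, after which the DHCP service must be restarted to pick up the change.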

Troubleshooting a failing Kickstart

The most common problems with network kickstarts set up the way we have described are network failures, MAC addresses entered incorrectly in the dhcpd.conf file, and using the MAC address of the wrong NIC. These problems show up on the console as messages from PXE Boot reporting that the NIC is unable to obtain an IP address.
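Mistyped MAC addresses can often be caught before anyone boots a machine. The sketch below flags "hardware ethernet" values that are not six colon-separated pairs of hexadecimal digits. Both function names are hypothetical, and the check cannot tell whether a well-formed address belongs to the wrong NIC.

```shell
# Succeed if the argument looks like a MAC address: six pairs of
# hexadecimal digits separated by colons.
# (is_valid_mac and bad_macs are hypothetical helper names.)
is_valid_mac() {
    printf '%s\n' "$1" | grep -Eq '^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$'
}

# Print each "hardware ethernet" value in the given file that fails
# the format check above.
bad_macs() {
    sed -n 's/^[[:space:]]*hardware ethernet[[:space:]]*\([^;]*\);.*/\1/p' "$1" |
    while read -r mac; do
        is_valid_mac "$mac" || echo "$mac"
    done
}

# Demonstration on a scratch file with one good and one bad entry.
scratch=$(mktemp)
printf '    hardware ethernet 00:11:22:33:44:55;\n' >  "$scratch"
printf '    hardware ethernet 00:11:22:33:44:5G;\n' >> "$scratch"
bad_macs "$scratch"
rm -f "$scratch"
```

The demonstration prints only the second address, whose last pair contains a character that is not a hexadecimal digit.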

An incorrectly named hexadecimal IP address file for a system, or a problem with the TFTP server, allows the NIC to obtain its network data but then fails to load the PXE Boot configuration file for the system. Be sure your TFTP server is configured correctly using the tftp file in the /etc/xinetd.d directory.
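PXELINUX looks for a configuration file named after the client's IP address written as eight uppercase hexadecimal digits, so the name is easy to get wrong by hand. A small sketch for computing it follows; the function name ip_to_hex is hypothetical, and the address in the example is the one used in Listing 5.

```shell
# Convert a dotted-quad IPv4 address to the eight-digit uppercase
# hexadecimal form used for per-host PXELINUX configuration file names.
# The function body runs in a subshell so the IFS change stays local.
# (ip_to_hex is a hypothetical helper name.)
ip_to_hex() (
    IFS=.
    set -- $1          # split the address into its four octets
    printf '%02X%02X%02X%02X\n' "$1" "$2" "$3" "$4"
)

# Example: prints 0A2C1707
ip_to_hex 10.44.23.7
```

Comparing this output with the file names on the TFTP server is a quick way to confirm or rule out a naming mistake.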