Protos Reference Information
Administrative Concepts

This page covers some common administrative concepts to be familiar with, such as service abstraction and high availability, and how they are usually implemented.

Service Abstraction

Service abstraction is the concept of not tying a service directly to any specific hardware. This is most commonly done with network names. If a web server is reached through a generic name such as 'www' or 'web' rather than a name unique to one machine, the service can easily be moved from server to server without causing broken links. The concept is extended further by not making that generic name the server's primary name, but providing it as a name alias instead.
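The alias idea above can be sketched with a small in-memory lookup table. This is only an illustration and not a real name server: the host names, the addresses, and the functions resolve and move_service are all hypothetical.

```python
# Hypothetical records: every name and address here is made up for illustration.
address_records = {
    "alpha.example.com": "192.0.2.10",   # primary name of one server
    "beta.example.com": "192.0.2.20",    # primary name of another server
}
alias_records = {
    "www.example.com": "alpha.example.com",  # the service name is only an alias
}


def resolve(name, max_hops=8):
    """Follow aliases until a primary name with an address is reached."""
    for _ in range(max_hops):
        if name in address_records:
            return address_records[name]
        if name not in alias_records:
            raise KeyError(f"no record for {name}")
        name = alias_records[name]
    raise RuntimeError("alias chain too long")


def move_service(alias, new_primary):
    """Repoint the service alias; the servers' primary names are untouched."""
    alias_records[alias] = new_primary


before = resolve("www.example.com")
move_service("www.example.com", "beta.example.com")
after = resolve("www.example.com")
```

Links that use the alias keep working after the move, because only the alias record changed.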

This concept can also be extended to other aspects of system administration. I usually mount hard drives on cryptic mount points off the root partition, usually named after the device. I then create softlinks to the services within them, which makes it easier to move services around. For instance, on a server that provides home directories, ftp and http services, and where the logical devices 'vol1' and 'vol2' are available, I would create directories named home, http and ftp within those volumes, and then create the softlinks /home, /http and /ftp pointing to them. If I later change the volume configuration, only the softlinks have to be altered (rather than many configuration files and hardcoded paths).
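A minimal sketch of the softlink approach, run in a scratch directory that stands in for the root partition. The helper point_service_at is a hypothetical name, and the vol1/vol2 layout simply mirrors the example above. It assumes a unix-like system where softlinks can be created without special privileges.

```python
import os
import tempfile


def point_service_at(link_path, target_dir):
    """Create the softlink, or replace an existing one, so it leads to target_dir."""
    tmp_link = link_path + ".new"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(target_dir, tmp_link)
    # Renaming over the old link swaps it in a single step.
    os.replace(tmp_link, link_path)


# Scratch directory standing in for "/" so nothing on the real system is touched.
root = tempfile.mkdtemp()
for volume in ("vol1", "vol2"):
    os.makedirs(os.path.join(root, volume, "home"))

home_link = os.path.join(root, "home")
point_service_at(home_link, os.path.join(root, "vol1", "home"))

# Later, when home directories move to vol2, only the link is updated.
point_service_at(home_link, os.path.join(root, "vol2", "home"))
```

Anything configured to use the link path keeps working, since the path itself never changes.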

High Availability

While a properly configured unix system is capable of staying up indefinitely, there are usually other issues which will come into play, causing it to become unavailable. These are almost exclusively centered around hardware issues. These issues can be split into three areas:

Power (UPS)

Problems with power are the primary reason behind unavailable servers. However, this is also the simplest problem to solve. The most common solution is an Uninterruptible Power Supply (UPS). These usually include a battery, and convert the power between AC and DC in order to provide a uniform source. UPS systems are usually limited in how long they can run on battery, gauged in minutes to hours. There are also variants on the UPS concept that provide long-term solutions for commercial power outages. These usually include a mechanism to maintain power for a limited duration (such as batteries or a flywheel) while a generator starts and spins up to speed.

Storage (RAID)

Storage is another common problem, specifically hard drive failures. This can be solved with RAID, where you group a set of drives together and write information across them redundantly, in such a manner that if one drive fails, the others can usually compensate for it and keep the server running. There are many different ways of doing this, with solutions in both hardware and software. Hardware RAID solutions are generally more efficient, but also more costly. Software RAID can be implemented on almost any modern unix system, including Open Source systems such as FreeBSD and Linux. Some RAID configurations have the added advantage of actually increasing file system performance, even when implemented in software.
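To show how the remaining drives can compensate for a failed one, here is a minimal sketch of XOR parity, the idea behind parity-based RAID levels such as RAID 5. It is one common form of redundancy and not the only one (mirroring is another). The function names and the sample blocks are hypothetical, and a real implementation works on disk stripes rather than short byte strings.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recover the one lost block from the survivors plus the parity block."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Three "drives" each hold one data block; a fourth holds the parity.
data = [b"home", b"http", b"ftp!"]
parity = xor_blocks(data)

# Pretend the second drive failed and rebuild its block from the others.
recovered = rebuild_missing([data[0], data[2]], parity)
```

This works because XOR-ing a value in twice cancels it out, so combining the survivors with the parity leaves only the missing block.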

General Failures (Clusters/Groups)

This is a category for any other failure, be it CPU, main board, RAM or anything else. There is no standard solution for managing this, other than providing redundant servers. There are two general ways of approaching it: hot and cold. Hot redundancy is implemented through a clustering system, where if one server fails, the services continue uninterrupted. Cold redundancy is where alternate servers can be switched into activity if the primary server fails. Whether cold redundancy is acceptable depends on how much downtime can be tolerated, because the services are interrupted while they are switched to the alternate server.
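The cold approach can be sketched as a simple monitoring pass that switches to a standby when the active server stops responding. This is an illustration under stated assumptions and not a clustering product: the class ColdFailover, the server names, and the is_healthy callback are hypothetical, and a real health check would probe the service over the network.

```python
class ColdFailover:
    """Track which server is active and switch to a standby on failure."""

    def __init__(self, servers, is_healthy):
        self.servers = list(servers)      # priority order, primary first
        self.is_healthy = is_healthy      # callback: server name -> bool
        self.active = self.servers[0]
        self.switchovers = 0              # each one is a service interruption

    def check(self):
        """Run one monitoring pass and return the server that should be active."""
        if self.is_healthy(self.active):
            return self.active
        for candidate in self.servers:
            if candidate != self.active and self.is_healthy(candidate):
                self.active = candidate
                self.switchovers += 1
                return self.active
        raise RuntimeError("no healthy server available")


# Simulated health state: servers listed in `down` fail the check.
down = set()
failover = ColdFailover(["primary", "standby"], lambda name: name not in down)

first = failover.check()
down.add("primary")           # the primary suffers a hardware failure
second = failover.check()
```

The switchovers counter reflects the trade-off described above: every switch is a period during which the service was unavailable, which a hot cluster is meant to avoid.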

Copyright © 2004, Protos LLC