Configuring Lots of OTP Nodes Using Open Source Tools
September, 2004
Hal Snyder
Contents
- Goals
- CVS
- Autoconf
- Pkgsrc
- Cfengine
- Benefits
- Drawbacks/Challenges
- To Do
- Conclusion
Goals
- Divide servers into "classes" based on what they do. Keep all
software on servers in the same class at identical revision state.
a. at time of installation (jumpstart, ghost, g4u, etc.)
b. as updates occur
- Rebuild a server in less than an hour in case of hardware failure.
- Have a complete and unambiguous record of the configuration state
of every server now and in the past, with accountability for each
configuration change.
- Keep programmers off the production servers.
CVS
- Gives you version control, but that is only part of configuration
management.
- CVS is the foundation of the CM system. Keep software under
development in it, as well as autoconf macros, packaging files, and
cfengine policy files.
- We chose CVS because, although there are newer version control
systems available, CVS is still used for the overwhelming majority
of open source software. Interfaces and limitations well understood.
Autoconf
- Our main use so far is for configuring builds, not portability.
- Based on decades of grueling CM automation work.
- Well known.
- Write a few OTP-specific m4 macros for finding OTP libs and
headers.
- Allow engineers to specify during build whether they want release
or their own test versions of each component in the build.
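The per-component choice between release and test versions might be modeled as below. This is a minimal Python sketch of the selection logic only, not the autoconf macros themselves; the component names, paths, and the resolve_components helper are all hypothetical.

```python
# Hypothetical sketch: choose, per component, between the release build
# and an engineer's own test build. Names and paths are illustrative only.

def resolve_components(release, overrides):
    """Return {component: path}, preferring an override when one is given.

    release   -- {component: path to the released version}
    overrides -- {component: path to an engineer's test version}
    """
    unknown = set(overrides) - set(release)
    if unknown:
        # Refuse overrides for components that are not part of the build.
        raise ValueError("unknown components: " + ", ".join(sorted(unknown)))
    resolved = dict(release)
    resolved.update(overrides)
    return resolved


release = {
    "sip_proxy": "/usr/pkg/lib/sip_proxy-1.4",
    "billing": "/usr/pkg/lib/billing-2.0",
}
# One engineer builds against a private copy of "billing" only.
chosen = resolve_components(release, {"billing": "/home/dev/billing"})
```

Components without an override fall back to the release version, so a build with no overrides is the release build.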
Pkgsrc
- Highly portable packaging system based on very mature systems
used with FreeBSD, NetBSD, and OpenBSD.
- Maintains database of installed packages, lots of support tools.
- Works very well with autoconf'd software.
- Use with (almost) all installed software after initial OS load.
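The installed-package database makes it possible to check that servers in one class really are at the same revision state. A minimal Python sketch follows, assuming each server can produce a listing with one "name-version comment" line per package; the host names, packages, versions, and both helper functions are hypothetical.

```python
# Hypothetical sketch: detect revision drift between servers in one class.
# Input lines are assumed to look like "name-version comment"; hosts,
# packages, and versions below are made up.

def parse_listing(text):
    """Map package name -> version from 'name-version comment' lines."""
    packages = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The version is whatever follows the last hyphen of the first field.
        name, _, version = fields[0].rpartition("-")
        if not name:
            continue  # no hyphen, so no recognizable version
        packages[name] = version
    return packages


def find_drift(listings):
    """Return {package: {host: version-or-None}} for packages that differ."""
    parsed = {host: parse_listing(text) for host, text in listings.items()}
    all_names = set()
    for pkgs in parsed.values():
        all_names.update(pkgs)
    drift = {}
    for name in sorted(all_names):
        versions = {host: pkgs.get(name) for host, pkgs in parsed.items()}
        if len(set(versions.values())) > 1:
            drift[name] = versions
    return drift


listings = {
    "web1": "erlang-9.2 Concurrent language\nrsync-2.6.2 File sync tool\n",
    "web2": "erlang-9.2 Concurrent language\nrsync-2.6.0 File sync tool\n",
}
drift = find_drift(listings)
```

An empty result means every server in the class reports the same version of every package.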
Cfengine
- Put servers into various classes. A server pulls all updates for
its class once per hour.
- If package X is not installed, install it.
- Replicate config files, prompts (often these are not in a package
release cycle).
- Edit shared config on the servers: /etc/inittab, crontabs, passwd.
- Certain "dangerous" options require manual intervention on target
server. Examples: change telco routing tables, start SIP proxy.
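The hourly pull can be thought of as comparing class policy with the server's actual state and acting only on the difference. The Python sketch below illustrates that idea; it is not cfengine syntax, and the class names, package names, action names, and the plan_actions helper are all hypothetical.

```python
# Hypothetical sketch of the convergence idea, not cfengine syntax:
# compare the policy for a server's class with what is installed and
# list the actions needed. Classes, packages, and actions are made up.

POLICY = {
    "sip": {"packages": ["erlang", "sip_proxy"], "dangerous": ["start_sip_proxy"]},
    "db": {"packages": ["erlang", "mnesia_tools"], "dangerous": []},
}


def plan_actions(server_class, installed, operator_approved=False):
    """Return (automatic, held) action lists for one hourly run."""
    policy = POLICY[server_class]
    # "If package X is not installed, install it."
    automatic = ["install " + p for p in policy["packages"] if p not in installed]
    held = []
    for action in policy["dangerous"]:
        # Dangerous actions run only after manual intervention on the target.
        (automatic if operator_approved else held).append(action)
    return automatic, held


# A "sip" server that so far has only erlang installed.
auto, held = plan_actions("sip", installed={"erlang"})
```

Once a server matches its policy, the same call returns no install actions, so repeating the run every hour is harmless.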
Benefits
- Convergence: all servers in the same class are kept at identical
revision state.
- Replace any server in less than an hour in case of hardware
failure.
- Complete documentation of configuration of the platform.
- Gives you a workable set of tools and processes for keeping
programmers off the production servers.
- Full source for CM system. No licensing headaches.
- About as light-weight as you can get for what it does.
Drawbacks/Challenges
- Learning curve for CM staff.
- Culture shock for engineers and bosses.
- Process for package updates needs to be simplified.
- Pkgsrc supports only one installed version of a package at a time.
This is not a show-stopper for OTP.
- Cfengine file replication is slow. Consider rsync.
- Cfengine has bugs and limited syntax.
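If rsync were used for bulk file replication, a wrapper might build the command as below. This Python sketch only constructs the argument list and does not run it; the host name, paths, and the rsync_command helper are hypothetical. The -a and --delete flags are standard rsync options.

```python
# Hypothetical sketch: build an rsync command to replicate a directory tree
# to one server. Host and paths are made up; the command is only printed
# here, not executed.

def rsync_command(source_dir, host, dest_dir, delete=False):
    """Return an argv list that mirrors source_dir onto host:dest_dir."""
    argv = ["rsync", "-a"]  # -a: archive mode (recursive, keeps perms and times)
    if delete:
        # Remove files on the target that no longer exist at the source.
        argv.append("--delete")
    # A trailing slash on the source copies its contents, not the directory.
    argv.append(source_dir.rstrip("/") + "/")
    argv.append(host + ":" + dest_dir)
    return argv


cmd = rsync_command("/var/cm/prompts", "sip1.example.com",
                    "/usr/local/prompts", delete=True)
print(" ".join(cmd))
```

Passing the list to subprocess.run would execute it; --delete is left off by default because it removes files on the target.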
To Do
- Integrate with OTP release handler.
- Use autoconf more for portability.
- Adapt legacy code to autoconf build system.
- At present, the CM system is used almost exclusively for static
configuration. Expand its use to allow runtime configuration of
software we control.
- Support MS operating systems. (Interix?)
Conclusion
- We have come a long way toward keeping our sanity while maintaining
hundreds of Unix-family servers. The CM system is already working
well, but is still a work in progress.