Self-healing networks01 Feb 2007
Last year I wrote an article on building a self-healing network with off the shelf software components. If you are responsible for managing a large UNIX/Linux network it’s a must-read…
An excerpt from the article:
_Computer immunology is a hot topic in system administration. Wouldn’t it be great to have our servers solve their own problems? System administrators would be free to work proactively, rather than reactively, to improve the quality of the network.
This is a noble goal, but few solutions have made it out of the lab and into the real world. Most real-world environments automate service monitoring, then notify a human to repair any detected fault. Other sites invest a large amount of time creating and maintaining a custom patchwork of scripts for detecting and repairing frequently recurring faults. This article demonstrates how to build a self-healing network infrastructure using mature open source software components that are widely used by system administrators. These components are NAGIOS and Cfengine._