Complexity and Resilience
November 8th, 2006 by Lou
I’m convinced that exposed complexity contributes to more outages than any other technical factor. The operational or human sister to this is inadequate configuration management. Unless we ruthlessly squash non-value-added diversity and complexity, our systems become unmanageable and are crushed by their own mass.
The advance of technologies like Ruby over J2EE and MySQL over Oracle has a lot more to do with the ability to get things done quickly, consistently and cleanly than with the relative technical merits of the implementations. It’s not worth the CPU cycles to screw around with massively complex tools when you can throw more machines at most problems.