Author_Institution :
Down to the Metal, Dennis, MA, USA
Abstract :
“Houston, Tranquility Base here. The Eagle has landed.” Two obscure errors almost prevented these words from being spoken. The errors were not made by the crew of Apollo 11 or by the controllers in Houston, nor were they made during the mission. Rather, they were made by engineers and managers years before the flight. How they happened, and how they went substantially undetected and effectively ignored, is a pair of lessons in system integration that avionics engineers must never forget. The Apollo Program is justly famed as a giant leap for the techniques of management of complex system design and implementation. Nonetheless, these tools were used by human beings and so, necessarily, imperfectly. One of the most challenging tasks in any complex system is controlling and testing the interfaces between major components that are developed by different organizations. Among the management tools deployed by NASA were Interface Control Documents (ICDs). This author has not been able to determine whether this phrase was first coined for the Apollo program or the Mercury and Gemini programs that preceded it, but it was certainly a major tool in Apollo. One of the errors under discussion herein was caused by a blatant failure to update an ICD in response to an engineering change, which can be classed as a management error of omission. The other is much subtler, involving a question of how previously unsuspected vulnerabilities (to crew procedures, in this case) should be communicated when they fall outside the scope of an ICD, yet turn out to have relevance to the way the interface is used. This becomes a problem because an ICD is a top-level document limited to specifying the design parameters of one sub-system insofar as they are of concern to one other sub-system. 
It's not surprising that the symptoms caused by the latter problem have been totally misunderstood by almost everyone from President Nixon on down, and only partially understood even by Buzz Aldrin, who, along with Neil Armstrong, had to deal with them at the time. This misunderstanding is so widespread that almost everyone with any acquaintance with the Program Alarms during the Apollo 11 landing believes that the LM's Primary Guidance Navigation System (PGNS) "failed" in some way and had to be rescued by human intervention. That is the exact opposite of the truth, which is that performance margins built into this very robust system quarantined the effects of the errors so that the landing could proceed with the designed level of human involvement, specifically dodging the "field of boulders" that the PGNS could know nothing about. This is largely a retelling of the higher-level parts of a paper, Tales from the Lunar Module Guidance Computer, by this author's colleague Don Eyles [1], but with the orientation changed from a historical narrative to a cautionary tale with recommendations for modern avionics development management. Results of more recent research by this author and two colleagues are also incorporated.
Keywords :
aircraft landing guidance; avionics; Apollo 11 program; ICD; LM primary guidance navigation system; NASA; PGNS; avionics development management; avionics engineers; complex system design; Gemini programs; human intervention; interface control documents; lunar module guidance computer; Mercury programs; subsystem insofar; system integration issues; top-level document; Error analysis; Human factors; Navigation; Phase measurement; Software measurement; Space missions; Space vehicles