Title :
An Autonomic Reliability Improvement System for Cyber-Physical Systems
Author :
Wu, Leon ; Kaiser, Gail
Author_Institution :
Dept. of Comput. Sci., Columbia Univ., New York, NY, USA
Abstract :
System reliability is a fundamental requirement of cyber-physical systems. Unreliable systems can lead to disruption of service, financial cost and even loss of human life. Typical cyber-physical systems are designed to process large amounts of data, employ software as a system component, run online continuously and retain an operator-in-the-loop because of human judgment and accountability requirements for safety-critical systems. This paper describes a data-centric runtime monitoring system named ARIS (Autonomic Reliability Improvement System) for improving the reliability of these types of cyber-physical systems. ARIS employs automated online evaluation, working in parallel with the cyber-physical system to continuously conduct automated evaluation at multiple stages in the system workflow and provide real-time feedback for reliability improvement. This approach enables effective evaluation of data from cyber-physical systems. For example, abnormal input and output data can be detected and flagged through data quality analysis. As a result, alerts can be sent to the operator-in-the-loop, who can then take actions and make changes to the system based on these alerts in order to achieve minimal system downtime and higher system reliability. We have implemented ARIS in a large commercial building cyber-physical system in New York City, and our experiment has shown that it is effective and efficient in improving building system reliability.
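The data quality step mentioned in the abstract (detecting abnormal input or output data and alerting the operator) can be illustrated with a minimal sketch. This is not the ARIS implementation, which the record does not describe. The names `Alert` and `check_data_quality`, the range limits, and the median-absolute-deviation rule are all assumptions chosen for illustration. The paper's keywords suggest the authors used other techniques, such as support vector machines and time series analysis.

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Optional, Sequence


@dataclass
class Alert:
    """Hypothetical alert record passed to the operator-in-the-loop."""
    index: int
    value: Optional[float]
    reason: str


def check_data_quality(
    readings: Sequence[Optional[float]],
    lower: float,
    upper: float,
    mad_threshold: float = 3.5,
) -> List[Alert]:
    """Flag missing, out-of-range, and statistically unusual readings.

    Assumption: a modified z-score based on the median absolute
    deviation (MAD) stands in for whatever analysis ARIS performs.
    """
    valid = [r for r in readings if r is not None]
    if valid:
        med = median(valid)
        mad = median(abs(r - med) for r in valid)
    else:
        med = mad = 0.0

    alerts: List[Alert] = []
    for i, r in enumerate(readings):
        if r is None:
            alerts.append(Alert(i, None, "missing"))
        elif not (lower <= r <= upper):
            alerts.append(Alert(i, r, "out_of_range"))
        elif mad > 0 and 0.6745 * abs(r - med) / mad > mad_threshold:
            alerts.append(Alert(i, r, "outlier"))
    return alerts


if __name__ == "__main__":
    # Made-up sensor values, for illustration only.
    sample = [70.0, 71.0, None, 69.5, 70.5, 250.0, 95.0, 70.2]
    for alert in check_data_quality(sample, lower=0.0, upper=150.0):
        print(alert)
```

In a setup like the one the abstract describes, a check of this kind would run alongside the monitored system at each workflow stage, and the alerts would go to the human operator rather than trigger automatic changes.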
Keywords :
building management systems; computerised monitoring; data analysis; fault tolerant computing; ARIS; New York City; accountability requirements; automated online evaluation; autonomic reliability improvement system; building system reliability; cyber-physical systems; data-centric runtime monitoring system; human judgment; large commercial building cyber-physical system; operator-in-the-loop; safety-critical systems; Buildings; Reliability engineering; Software; Software reliability; Support vector machines; Time series analysis; cyber-physical system; data analysis; data mining; machine learning; reliability engineering; runtime environment; smart buildings; system reliability
Conference_Title :
High-Assurance Systems Engineering (HASE), 2012 IEEE 14th International Symposium on
Conference_Location :
Omaha, NE
Print_ISBN :
978-1-4673-4742-6
DOI :
10.1109/HASE.2012.33