DocumentCode :
1866588
Title :
Analyzing causes of failures in the Global Research Network using active measurements
Author :
Myakotnykh, Eugene S. ; Helvik, Bjarne E. ; Hellan, Jon Kåre ; Kvittem, Olav ; Skjesol, Trond ; Wittner, Otto J. ; Øslebø, Arne
Author_Institution :
Centre for Quantifiable Quality of Service in Commun. Syst. (Q2S), Norwegian Univ. of Sci. & Technol., Trondheim, Norway
fYear :
2010
fDate :
18-20 Oct. 2010
Firstpage :
565
Lastpage :
570
Abstract :
With the objective of better understanding how the global Internet can achieve availability on the order of five nines, i.e. be available 99.999% of the time, active measurements were performed between Norway and China through the Global Research Network. End-to-end downtime statistics were collected continuously during a 3-month period ending in mid-February 2010. In addition to periodically sending probe packets between the two measurement systems, traceroute was run every two minutes to identify the exact IP-level path between the end-points. The TTL (time-to-live) counter in the IP header, which is decremented by one at every hop, was also analyzed for each packet. Causes of the observed network failures were identified from the collected data, and insight was gained into the processes preceding and following communication downtimes. We distinguish inter- and intradomain failures and, when possible, identify the exact link or Autonomous System where a given event occurred. The study shows that end-to-end path availability is mainly affected by interdomain failures and long BGP convergence times, as well as by series of events not straightforwardly explained by the anticipated (re)routing behavior.
Keywords :
IP networks; Internet; computer network reliability; failure analysis; quality of service; telecommunication network routing; BGP convergence time; IP-header; IP-level path; active measurements; autonomous system; end-to-end downtime statistics; end-to-end path availability; global Internet; global research network; interdomain failures; measurement systems; network failures; probe packets; quality of services; time-to-live counter; Availability; Convergence; Delay; Loss measurement; Probes; Routing; dependability; failure detection; network measurements
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Ultra Modern Telecommunications and Control Systems and Workshops (ICUMT), 2010 International Congress on
Conference_Location :
Moscow
ISSN :
2157-0221
Print_ISBN :
978-1-4244-7285-7
Type :
conf
DOI :
10.1109/ICUMT.2010.5676581
Filename :
5676581