Week 2 | Review of Understanding BGP Misconfiguration by Ratul Mahajan, David Wetherall, and Tom Anderson
Any system intended for good use can cause disruption when it is misconfigured or used carelessly, and the Border Gateway Protocol (BGP) is no exception. The paper Understanding BGP Misconfiguration examines BGP misconfiguration errors and aims to determine how frequently they occur, what their usual causes are, how large their impact is, and how they could be reduced.
The paper’s quantitative nature (it claims to be the first quantitative study of BGP misconfigurations) is evident throughout, with tables and graphs that make the numbers easier to analyze and compare. Misconfigurations often go unreported, and even undetected since they don’t always result in connectivity issues, except for the few that made the news because of the widespread outages or network disruption they caused. Given this challenge, the paper thoroughly documents its methodology and data. Historical data covering a 3-week period from 23 different vantage points was used to observe traffic and identify abnormalities. The authors then contacted the ISPs involved to verify whether each abnormality was in fact a misconfiguration. Because the contact data was stale, 30% of the emails bounced or reached the wrong person (someone no longer connected to the company, or never connected to it at all). Nevertheless, the paper reports that even with this limited data, the responses that did arrive converged on the same answers.
The paper classifies BGP misconfiguration problems into two main types (import and export) and further subdivides their causes into two categories: slips and mistakes. A slip is an error in the execution of a correct plan. A mistake, on the other hand, is an error that originates in the planning itself, so the error would still exist even if the plan were executed correctly.
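To keep the slip/mistake distinction straight, here is a minimal sketch of my own (it is not code from the paper). The names `Cause` and `classify_cause` and the two boolean inputs are hypothetical, chosen only to restate the definitions above.

```python
from enum import Enum
from typing import Optional


class Cause(Enum):
    """Hypothetical labels for the two cause categories described above."""
    SLIP = "slip"        # the plan was correct, but its execution was not
    MISTAKE = "mistake"  # the plan itself was wrong


def classify_cause(plan_was_correct: bool, executed_as_planned: bool) -> Optional[Cause]:
    """Return the cause category, or None when nothing went wrong.

    Assumption: an operator action can be summarized by two yes/no
    questions. Real incidents are rarely this clean.
    """
    if not plan_was_correct:
        # A flawed plan is a mistake even when it is executed perfectly.
        return Cause.MISTAKE
    if not executed_as_planned:
        # A correct plan carried out incorrectly is a slip.
        return Cause.SLIP
    return None


if __name__ == "__main__":
    print(classify_cause(plan_was_correct=True, executed_as_planned=False))
    print(classify_cause(plan_was_correct=False, executed_as_planned=True))
```

The planning check comes first because, by the definition above, a mistake persists no matter how well the plan is executed.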
In most instances, human error was the cause of the misconfiguration. To address this, several solutions were proposed to help human operators make fewer errors. These included a friendlier interface, the ability to set up configurations automatically, an easier way to detect configuration errors, etc.
More aggressive error detection and reporting was another proposed solution. As the paper reiterates, most errors go undetected (they fail silently, or the erroneous packets are simply dropped) unless a connectivity problem arises or network administrators turn off filters in special scenarios.
Overall, the paper provides a comprehensive view of BGP misconfigurations: their types and classifications, usual causes, and possible preventive solutions. There was one part of the paper, though, where I felt lost trying to differentiate export and import misconfigurations (maybe that was just me). A more thorough introduction to imports and exports, with a concrete example of each type of misconfiguration before diving into the actual numbers and comparisons, would have been helpful. The examples, tables, graphs, and measurements that followed were very helpful in analyzing how the different errors compare and how one is more frequent or more disruptive than another. The paper also describes very well the methodology used and how the results were obtained, as well as its scope and limitations.