Regional Resiliency Assessment Program
The Regional Resiliency Assessment Program (RRAP) is a cooperative, non-regulatory assessment program that examines the resilience of critical infrastructure and systems through regional analysis. The program, led by the Department of Homeland Security (DHS) Office of Infrastructure Protection (IP), addresses a range of hazards that could have significant consequences, both regionally and nationally. The COAR team lent its cyber infrastructure expertise to DHS to support RRAP cyber-related resilience analyses.
Each RRAP typically involves a data-gathering and analytical effort, followed by continued technical assistance to support stakeholders in building infrastructure resilience. RRAPs can incorporate various components, including voluntary facility resilience assessments, targeted studies and modeling, first responder capability evaluations, subject matter expert workshops, and other valuable information-exchange forums. The Ashburn, Virginia RRAP involved the following steps.
- STEP 1 Assess critical infrastructure on a regional level, focusing on physical vulnerabilities that may lead to cascading cyber consequences, from an all-hazards perspective
- STEP 2 Identify dependencies, interdependencies, and cascading effects by developing awareness of how this system can be disrupted, through facility site assessments, one-on-one interviews, and facilitated meetings with a wide range of stakeholders
- STEP 3 Identify gaps in resilience through additional research and analysis aimed at improvements in stakeholder planning and preparedness
- STEP 4 Assess capabilities to protect critical infrastructure and to improve communication between private sector stakeholders, resilience planners, and emergency responders
- STEP 5 Coordinate efforts to enhance resilience by developing Resilience Enhancement Options to be implemented by State and local stakeholders with the support of DHS IP
The cluster of data centers in the greater Ashburn area in Loudoun County serves as the primary global Internet traffic hub on the East Coast owing to the presence of major Internet exchange points (IXPs). Because of the area's unique concentration of both fiber and power, 50 to 70 percent of all Internet traffic, on average, flows through the greater Ashburn-area data centers. These facilities contain information technology (IT) infrastructure supporting governmental agencies and private companies that, in turn, supply day-to-day services to critical utilities and the public.
Throughout the Ashburn, VA RRAP, the term “data center” was used generically to refer to a wide variety of facilities that have subtle but important differences. In actuality, a single data center can rarely be categorized as only one of the following logical types; however, for the purposes of illustration and explanation, the following distinctions remain important. The figure below depicts how these facilities are typically connected to one another.
Tier 1 Network Provider Data Centers
Tier 1 network providers are those that have access to every other network on the Internet without paying for transit. Tier 1 network providers have their own data centers where content providers can rent space to deliver increased performance to a variety of customers (known colloquially as “hosting content closer to the eyeballs”). Because tier 1 providers typically offer other services (e.g., cell phone service, home Internet service), they may house other services in their data centers in addition to peering with content providers and other networks.
Content Provider Data Centers
Large content providers and technology companies (e.g., Google, Microsoft, and Facebook) often build their own data centers. While these data centers frequently bring in transit from several major network providers, they usually house only one company’s data and infrastructure. Although the data centers may be built by a real estate company, they are typically operated by the content provider itself.
Internet Exchange Points (IXPs, also known as Network Access Points [NAPs])
Internet exchange points offer peering and transit interconnections. In a peering interconnection agreement (also called “Bill and Keep” or “Sender Keeps All”), two networks provide access only to each other’s customers without making any financial settlements. In contrast, in a transit interconnection, one network provides reachability to the entire Internet in return for a monetary settlement. IXP data centers facilitate these connections, usually by selling space and power for networking equipment and by charging “cross connect” fees for running cable between customers’ equipment. Unlike tier 1 and content provider data centers, IXP data centers typically do not monitor the networks that connect through their facilities.
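The difference between the two agreement types can be written down as a small model. The following is a minimal sketch, not something taken from the RRAP: the `Interconnection` class, the `monthly_fee` field, and the network names are hypothetical and exist only to illustrate the definitions above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interconnection:
    """Hypothetical record of an agreement between two networks."""
    kind: str                 # "peering" or "transit"
    monthly_fee: float = 0.0  # assumed settlement paid to the other network


def reachable_networks(link, partner_customers, whole_internet):
    """Return the networks reachable through the other side of the link."""
    if link.kind == "peering":
        # Peering exposes only the partner's own customers.
        return frozenset(partner_customers)
    if link.kind == "transit":
        # Transit buys reachability to the entire Internet.
        return frozenset(whole_internet)
    raise ValueError(f"unknown interconnection kind: {link.kind!r}")


def settlement(link):
    """Peering is settlement-free; transit carries a monetary settlement."""
    return 0.0 if link.kind == "peering" else link.monthly_fee


# Illustrative data only.
INTERNET = {"net_a", "net_b", "net_c", "net_d"}
peer = Interconnection("peering")
transit = Interconnection("transit", monthly_fee=1000.0)

print(sorted(reachable_networks(peer, {"net_b"}, INTERNET)))
print(sorted(reachable_networks(transit, {"net_b"}, INTERNET)))
```

Under these assumptions the peering link reaches only the partner's customers at no cost, while the transit link reaches every network in the illustrative set in exchange for the fee.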
Colocation/Real Estate Data Centers
Colocation data centers are built by companies that typically consider themselves to be in the real estate market. Although these companies may sell power, cooling, and network in addition to physical space, they typically offer no network services (e.g., security, monitoring). They generally view their relationship with the customer as a traditional landlord/tenant relationship. Customers typically install and maintain their own equipment and contract their own network services. Most tier 1 network data centers and IXPs also offer colocation services.
Internet Service Provider Data Centers
Although many customers connect to the Internet directly through tier 1 providers, tier 2 and tier 3 ISPs often have data centers of their own as well. A tier 2 network is an ISP that peers with other networks but also purchases Internet protocol transit to reach some portion of the Internet. Tier 2 providers are the most common ISPs, because it is much easier to purchase transit from a tier 1 network than to peer with one and attempt to become a tier 1 carrier. Tier 3 networks reach the Internet solely by purchasing Internet protocol transit from other networks. Tier 2 and tier 3 ISP data centers are smaller and often house services only for their direct customers.
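The tier definitions used in this section reduce to two questions: does the network peer, and does it buy transit? The sketch below encodes that reading. The function name `classify_tier` is hypothetical, and it assumes that a network that peers without buying any transit can reach every other network, which is what the tier 1 definition requires.

```python
def classify_tier(peers: bool, buys_transit: bool) -> int:
    """Classify a network as tier 1, 2, or 3 from its interconnection habits.

    Assumption: a network that peers and buys no transit reaches the
    whole Internet through its peers (the tier 1 case in the text).
    """
    if peers and not buys_transit:
        return 1  # reaches every network without paying for transit
    if peers and buys_transit:
        return 2  # peers, but pays for transit to reach the rest
    if buys_transit:
        return 3  # relies solely on purchased transit
    raise ValueError("a network with neither peering nor transit is not connected")


print(classify_tier(peers=True, buys_transit=False))
print(classify_tier(peers=True, buys_transit=True))
print(classify_tier(peers=False, buys_transit=True))
```

Real networks do not always fit one category cleanly, so the result is best read as a rough label rather than a formal classification.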
Meet Me Rooms
Most data centers that offer colocation services also have a “meet me room,” which is not strictly a type of data center but is akin to an IXP within a data center. It is usually a separate room or data center “cage” where customers can house equipment that is used exclusively for connecting to other customers also located in that facility. Peering and transit interconnections can also be established through meet me rooms.
Internet infrastructure is highly distributed among different private and public sector entities. Many companies own different parts of the infrastructure, and Internet traffic can travel on many paths. The Transmission Control Protocol (TCP), one of the primary communications protocols used on the Internet, is designed as a stateful end-to-end protocol, meaning that it attempts to guarantee message delivery from sender to receiver, retransmitting lost data as needed, despite changes or outages in the intervening network nodes. Together with routing that can steer traffic around failed nodes, this design makes networks using TCP highly resistant to failures when multiple paths from sender to receiver exist. That resilience can fail, however, if a concentration of high-capacity routes between a particular sender and receiver becomes unavailable. The geographical concentration of so many data centers and network routes in the Ashburn area makes this a concern. The New York Times recently published an article on the vulnerabilities and importance of the physical infrastructure that makes up the Internet. As Internet access increasingly becomes a critical dependency, the public is becoming more aware of its physical vulnerabilities. This recognition has made it imperative to study these vulnerabilities and develop strategies to improve the infrastructure's resilience.
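The multiple-path argument in this section can be illustrated with a small reachability check: a sender and receiver stay connected as long as at least one path between them survives. The sketch below is only an illustration. The topology, the node names, and the `is_reachable` helper are hypothetical and do not describe the actual Ashburn network.

```python
from collections import deque


def is_reachable(links, src, dst, failed=frozenset()):
    """Breadth-first search over an undirected graph, skipping failed nodes."""
    if src in failed or dst in failed:
        return False
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for neighbor in adjacency.get(node, ()):
            if neighbor not in failed and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


# Hypothetical topology: two hubs sit between a sender and a receiver.
LINKS = [
    ("sender", "hub_a"), ("sender", "hub_b"),
    ("hub_a", "receiver"), ("hub_b", "receiver"),
]

print(is_reachable(LINKS, "sender", "receiver"))
print(is_reachable(LINKS, "sender", "receiver", failed={"hub_a"}))
print(is_reachable(LINKS, "sender", "receiver", failed={"hub_a", "hub_b"}))
```

Losing one hub leaves a surviving path, but losing both hubs at once disconnects the pair. That is the concern when many high-capacity routes share one geographic area.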
The properties that make this infrastructure resilient also make it difficult to gather data and conduct data-driven analysis of potential failures. Although several Internet companies have stated that they have conducted failure analyses showing that an outage of the Ashburn region would have no national or global consequences for Internet routing, any such study would be limited to an individual company’s own data and hence incomplete.
Partly because of this difficulty in data collection, the Ashburn, VA RRAP’s Key Findings focus on improving communication and information sharing. One thing is certain: as government and other critical infrastructure sectors begin to rely more heavily on data and network infrastructure for daily operations, studies of this type will only become more important. The Ashburn, VA RRAP presents a perfect opportunity to begin this data collection effort through follow-on activities designed to build consensus and demonstrate to stakeholders the need for and benefit of compiling such information.
The Ashburn, VA RRAP identified vulnerabilities that may affect the Internet community’s ability to prepare for and recover from the impacts of a variety of natural and manmade threats to its infrastructure assets. The associated Resilience Enhancement Options may be considered for implementation by Ashburn, VA RRAP stakeholders and partners to address the resilience gaps described in the Key Findings.
Internet resilience is contingent on a limited number of centralized Internet exchange points (IXPs).
A study should be conducted that simulates the outage of an IXP. Such a simulation would include modeling the traffic and Transmission Control Protocol congestion during an outage of IXP facilities in the greater Ashburn area.
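As a rough indication of what the simplest version of such a simulation might compute, the sketch below moves the traffic of a failed site onto the surviving sites in proportion to their spare capacity and reports the resulting utilization. It is a first-order capacity check only and does not model packet-level TCP congestion behavior. The site names, the capacity and load figures, and the `utilization_after_outage` function are all hypothetical.

```python
def utilization_after_outage(sites, failed):
    """Estimate per-site utilization after the given sites fail.

    sites:  {name: (capacity, load)} in any consistent unit, e.g. Gbps.
    failed: set of site names taken out of service.
    Displaced load is shared among survivors in proportion to spare capacity.
    A result above 1.0 means demand exceeds that site's capacity.
    """
    displaced = sum(load for name, (_, load) in sites.items() if name in failed)
    survivors = {n: v for n, v in sites.items() if n not in failed}
    total_spare = sum(cap - load for cap, load in survivors.values())
    if total_spare <= 0:
        raise ValueError("surviving sites have no spare capacity")
    result = {}
    for name, (cap, load) in survivors.items():
        share = displaced * (cap - load) / total_spare
        result[name] = (load + share) / cap
    return result


# Hypothetical figures for illustration only: (capacity, current load).
SITES = {
    "site_a": (100.0, 90.0),
    "site_b": (80.0, 40.0),
    "site_c": (40.0, 20.0),
}

for name, util in utilization_after_outage(SITES, failed={"site_a"}).items():
    print(f"{name}: {util:.2f}")
```

With these made-up figures, both surviving sites end up above full utilization. A fuller study would replace the proportional split with measured routing behavior and a real congestion model.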
Transparency in both network and data center infrastructure would enhance resilience planning.
The State, in concert with private sector stakeholders, should coordinate a workshop on the development of cloud and data center service taxonomies and assessments. Such taxonomies and/or assessments should allow for equal comparison of resilience features across providers and empower customers by fostering open, honest competition.
Local law enforcement personnel would benefit from training and the exchange of information concerning how to recognize suspicious activity around IT infrastructure assets.
Law enforcement personnel should engage industry stakeholders to facilitate training and education on fiber routes and suspicious activity. This training should also address how and when to approach maintenance personnel and how to confirm that they are authorized to work in a given area.
Data center and content providers may not have a pathway to contribute to resilience efforts and/or communicate criticality during an emergency.
A workshop should be conducted for the data center community so that all parties can communicate their needs for points of contact, access to emergency operations center (EOC) resources, and communication pathways during an emergency.
Transportation infrastructure and private trucking companies are necessary to supply diesel fuel to data center facilities in the event of an extended electric interruption.
An in-depth study of the region’s fuel supply chain should be conducted specifically to address concerns relating to fuel supplies and the ability to deliver fuel during an extended power outage to data center facilities. Such a study should include best- and worst-case scenarios for road conditions after a disaster. Data center and network providers should be present at the State’s annual exercise within the region. Their participation would improve decision makers’ understanding concerning the critical roles of data centers in response efforts.
Data centers and network providers should consider electromagnetic pulse (EMP) and radio frequency (RF) generator effects in developing resilience and protective measures plans.
Additional workshops should be conducted to support Ashburn-area data centers in their efforts to improve resilience to EMP/RF effects. A process should be established to keep data centers and local law enforcement up to date on new EMP/RF threats. Whether, and to what extent, to deploy EMP/RF protection should be a standard risk management topic for data centers.
Communication and education efforts between data center providers and fire department personnel are necessary to support resilience planning.
Data centers and fire department personnel should work together to arrange training and education sessions to help ensure that fire department personnel are aware of distinctive data center needs and environments. Data centers should consider installing radio equipment that operates on frequencies used by fire and police department radios to assist with operations during emergency situations.