ABSTRACT
In an oil spill response environment, urgency looms, and virtually every action is geared toward immediate needs. Cleanup, safety, and listed species protection are at the forefront of the collaborative efforts carried out by an incident management team. However, these needs do not complete the obligations of the Federal Action Agency responsible for the event. This agency must also complete an Endangered Species Act (ESA) Biological Assessment (BA). To do that for a spill of national significance, it is paramount that response personnel track certain details about their daily operations. Unfortunately, in the BA for the Deepwater Horizon incident response, the action record had to be reconstructed forensically. Although operational permits to work, otherwise known as Shoreline Treatment Recommendations, used standard geographic references and response action terms, they are merely prescriptions for activity and provide only maximum default assumptions. To gain vital insight into more specific temporal elements such as frequency, intensity, and duration, daily response reports were required. These reports were not gathered into a central geodatabase along the way. They were printed to paper, boxed, shipped to a documentation unit, and scanned into image files. These files were organized into approximately 30,000 document sets of up to 4,000 pages each. Qualitative document content analysis was used to distill the needed details from these image sets into a database. This technique for generating the data needed for an effects analysis is arduous. However, the process of its development has produced valuable lessons learned. Here we present the schema design and architecture needed to promote a seamless transition on future responses from the urgency of immediate needs to the inevitability of post-spill ESA obligations.
INTRODUCTION:
What is history? Long definitions and even dissertations have been professed in efforts to define it. But when it is really boiled down, history is simply a collection of stories by people about people. Of course there are events, such as natural disasters, that mark history with their own tale. Yet even these events are woven into the human story because of the people affected by them. Moreover, they are a part of the human story because of the people who overcome them. The Deepwater Horizon Oil Spill was certainly an historic event. The images of its immediate aftermath are sealed in the American consciousness. Fire on water, desperate oiled wildlife, and blackened white sand beaches will not be forgotten. However, once there was no more fire on the water, and the direct effects of crude oil on wildlife were less common, the story that followed received much less attention. As with recovery from most disasters, the Deepwater Horizon response has been a long march, requiring not only strategic planning but relentless perseverance. Although it was instigated by a tragic event, the vast response is an epic tale of diversity, collaboration, and human achievement. As the response winds to a close, the Endangered Species Act Biological Assessment (BA) team has been tasked with telling this vast story geospatially, from the perspective of the listed species potentially affected by response actions. The BA team must analyze the effects of the actions taken in response to the spill, not the direct effects of the spilled oil. This conversion of daily events to summations of effects is not possible without a digital version of the story. Make no mistake, there were many digital, tabular, and even geospatial chapters written by responders along the way. However, the BA team had to institute novel means to make them all sing from the same song sheet.
Many aspects of the response were standardized and preplanned under the Incident Command System (ICS). The Federal Emergency Management Agency (FEMA 2013) defines this as a system that:
Allows for the integration of facilities, equipment, personnel, procedures and communications operating within a common organizational structure.
Enables a coordinated response among various jurisdictions and functional agencies, both public and private.
Establishes common processes for planning and managing resources.
It is true that this system lays out important procedures including interagency integration, clear structure, standard protocols, and even daily report forms. But what about the story? In these stated goals of ICS, a strategy for documenting events in a singular database is not even listed as a priority.
PROCESS:
Developing the Deepwater Horizon Action Record:
With a spill of national significance, it is easy for responders to recall anecdotes of what occurred on their watch. For many, they are simply evoking memories of moments which may have changed the entire course of their lives. But memories of the story do not translate into a geospatial action record. This is not to insinuate that there were no digital records at all. Many ambitious, tech-savvy responders tracked events pertinent to their post with GIS. Individually, the datasets they created were novel and effective in meeting the needs of their assigned mission. However, it is false to assume that all of their distinct wells of information can be fused together with a few deft strokes of automation. GIS was instrumental in the allocation of resources, the tracking of oiling, and the management of resource risks on the response. For instance, the entire shoreline is divided into Shoreline Cleanup Assessment Technique (SCAT) segments. SCAT teams delineate these parcels of affected shoreline as they complete their initial oiling surveys. The segments are labeled with a naming convention and serve as standard geographic references for the duration of the response. Additionally, GIS developers on the response were eventually able to leverage professional-grade mobile GPS devices to generate response data from the field. As of the third year of the response, operations personnel are entering standardized data in real time on these devices as they reach the end of each SCAT Segment. The main purpose of the data they are collecting is to help determine the overall benefit of the operation. These data can be used to target segments for completion and removal from emergency response status. They are not, however, designed to facilitate a species effects analysis.
In addition to data, response documents are an important source of information regarding cleanup activities. Responders complete a whole suite of ICS forms each day. The table below presents a sample of these forms, as recommended by the National Incident Management System (NIMS) ICS Forms Booklet, FEMA 502-2 (National Incident Management System 2010).
These forms are either filled out by hand on paper or submitted in basic digital formats such as spreadsheets and word processing documents. The documentation unit is responsible for collecting and storing these forms. Its procedures are important to grasp in order to understand some of the challenges the BA team faced when preparing a digital version of the story. Forms were either collected in boxes in their original handwritten format or were printed from their basic digital form. These stacks of documents were then couriered to the documentation unit headquarters for processing. There, they were all fated to end up in the same format: each page was filed into a set of like documents and then scanned into an individual image file.
The disparity in formatting and purpose across the wide range of data and document choices from the Deepwater Horizon response presented a unique and diabolical challenge for the ESA BA team. When they began, they operated under a primary false assumption: that there existed a cohesive, geographically aware action record with details such as equipment types used and scale of operations. As they embarked into the discovery phase of the assessment process to identify the best available data, it quickly became apparent that this golden record was little more than fantasy. All the information they needed was available – just not in a ready-to-use form. The scanned images of ICS forms were not text searchable. The GIS reporting tools had been designed to meet immediate response needs, such as allocation of resources and oiling reports. Even the conservation measure checklists used to minimize adverse effects to listed species were mired in amorphous comments and dissimilar use of form entry options.
Typically, a BA is pre-emptive: a traditional BA completes an effects analysis on proposed projects, so the actual data analysis required is usually minimal. However, the Deepwater Horizon BA was far from typical. Addressing effects to listed species from response actions that transpired over the course of at least three avian wintering migrations, three sea turtle nesting seasons, and a multitude of habitat types could not be accomplished with a cursory look at the overall situation. The BA team had to find a way to leverage what was available to reach their goal. This meant finding common ground among the wide-ranging datasets. Although the majority of the data and documents available resembled little information islands, some data building blocks did remain constant throughout the response. These building blocks were fashioned as a function of the SCAT Process: Shoreline Treatment Recommendations (STRs) and SCAT Segments. The BA team used these as the necessary link to bridge the gap between the various sources of response information. SCAT Segments were the standard in geographic referencing during the response. In addition, STRs (essentially operational permits to work) were written for specific ranges of SCAT Segments. STRs use standard language when prescribing cleanup actions. This language helped the BA team to deconstruct components of the response and distill a list of basic actions. Those actions could then be assessed holistically by species experts to determine what consequences they were likely to cause.
During the response, as the cleanup prescribed in each STR was completed, SCAT followed up with additional surveys. If the follow-up survey showed that SCAT Segments had not reached cleanup endpoints, a new STR was issued to continue the cleanup as needed. This process was consistent across the area of response (AoR). If each sequential STR had covered the same range of SCAT Segments for the duration of the response, filing them into an action record would have been more straightforward. Instead, the range of SCAT Segments covered by each STR was much more dynamic. GIS technicians did manage datasets which contained information about past, present, and future STRs for each segment. Unfortunately, they did so using single cells of information. Storing data in arrays inside single cells is poor practice, because information held within single cells cannot be queried effectively using database management techniques. To make matters worse, this array also contained attributes of the STRs documenting their status (e.g., completed or generated) at the time the record was made. Attributes within attributes also make for messy data. The BA team used what they could from this dataset, but every good data manager knows that dealing with data fraught with variable delimiters is messy and time-consuming. The BA team had to form an organized solution to set a strong foundation for their analysis. Relationships between SCAT Segments and STRs were paired in a streamlined events index to meet this need. In addition, the BA Analysis Coordinator built a full dataset to capture actions and dates associated with STRs, called the Prescribed Action feature class (PAFC). By folding this information into a cohesive dataset, the BA team successfully created a temporal and spatial foundation for the continued construction of the forensic action record. The figure below illustrates the concept behind the PAFC. Just as each STR covers a range of SCAT Segments spatially, each SCAT Segment temporally undergoes a series of STRs. The events index captures these unique relationships individually. For instance, in this example, [Segment 6] [STR 2] and [Segment 6] [STR 3] would both constitute unique events.
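To make the untangling concrete, the sketch below parses hypothetical single-cell STR histories into the kind of normalized events index described above. It is a minimal illustration only; the column names, delimiters, and status codes are our own assumptions, and the actual response tables varied in all three.

```python
import re

# Hypothetical rows mimicking the response GIS tables: each segment's full STR
# history is packed into one cell, with statuses embedded as a second level of
# attributes (attributes within attributes), separated by variable delimiters.
raw_rows = [
    {"segment": "Segment 6",
     "str_history": "STR 1 (completed); STR 2 (completed), STR 3 (generated)"},
    {"segment": "Segment 7", "str_history": "STR 2 (completed)"},
]

def parse_history(cell):
    """Split a packed STR-history cell on its variable delimiters (';' or ',')
    and pull out each STR identifier along with its embedded status."""
    events = []
    for token in re.split(r"[;,]", cell):
        match = re.match(r"\s*(STR \d+)\s*\(([^)]+)\)", token)
        if match:
            events.append({"str_id": match.group(1), "status": match.group(2)})
    return events

# Unpack the arrays into a normalized events index: one record per unique
# [Segment][STR] pair, which can now be queried with ordinary database joins.
events_index = [
    {"segment": row["segment"], **event}
    for row in raw_rows
    for event in parse_history(row["str_history"])
]

for record in events_index:
    print(record)
# [Segment 6][STR 2] and [Segment 6][STR 3] now constitute distinct events.
```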
The BA team used available PDFs of each STR to qualitatively extract information pertinent to listed species. This information was funneled into ready-to-use data associated with each SCAT Segment/STR event. Although STRs are a good resource for general action descriptions per segment, they lack temporal details such as frequency and active date ranges. The Federal On-Scene Coordinator (FOSC) signature date was used as the primary indicator of an STR going active. Secondary and tertiary indicators for activation of an STR were the date prepared and the SCAT survey date, respectively. These surrogate activation dates were used to generate estimates of the start and end dates of STRs. In some cases, this method does not provide the desired accuracy regarding STR effective dates. The start and end dates of STRs were important to the BA effects analysis, and a best estimate was not always sufficient. The timing of response activities can be just as essential as the location, as some species only use habitat resources seasonally. For example, sea turtles lay their eggs beneath the sand, where they are vulnerable to disturbance and damage from subsurface cleaning activities. However, this is only a concern during sea turtle nesting season. With construction of the PAFC underway, the BA team began to look to other sources of information to add clarity and accuracy to the temporal components of the action record. Frequency, intensity, and seasonality could not be glossed over.
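The surrogate-date logic amounts to taking the first available of the three candidate dates in priority order. A minimal sketch follows, assuming each STR record carries the three fields; the field names are hypothetical.

```python
from datetime import date

def surrogate_activation_date(str_record):
    """Return the best available activation date for an STR, in priority
    order: FOSC signature date, then date prepared, then SCAT survey date."""
    for field in ("fosc_signature_date", "date_prepared", "scat_survey_date"):
        if str_record.get(field) is not None:
            return str_record[field]
    return None  # no usable surrogate; flag the record for manual review

# Example: an unsigned STR falls back to its preparation date.
record = {"fosc_signature_date": None,
          "date_prepared": date(2010, 8, 14),
          "scat_survey_date": date(2010, 8, 2)}
print(surrogate_activation_date(record))  # 2010-08-14
```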
While STRs were beneficial for their standard language and consistent use throughout the response, the qualitative document content analysis could not end there. Because of their original purpose, STRs could not be used as a stand-alone dataset for the effects analysis. STRs prescribe cleanup activities, sometimes with a range of options. Prescriptions do not equate to a clear record of what occurred when; they can only provide maximum default assumptions. Despite the drawbacks of the STRs, the BA team was aware of the challenge they would face in finding actual action record components for the whole response. The BA team set the PAFC as a priority dataset to construct, and concurrently pursued a viable alternative for generating accurate temporal action data.
ICS forms emerged as the BA team's second target. They could be used to validate effective date ranges of STRs, identify which options from the STR prescriptions were used, and add data about the intensity of operations. Intensity is very difficult to predict using only the STR. By finding more details about the types of equipment used, the entourage of vehicles that accompanied heavy machinery, and the actual frequency of operations, intensity becomes less of a guess and more of a math problem. That said, the math could not start until the useful content of the forms was unlocked. The BA team had to find a way to sift through the towers of documents and the terabytes of image files to unlock the information held in the numerous ICS forms. Their strategy was to use stratified sampling, with an emphasis on returning results that were both statistically confident and legally defensible.
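To illustrate why intensity becomes "more of a math problem" once those equipment details are known, consider one plausible formulation. The variables and weighting below are our own illustrative assumptions, not the BA team's actual formula, which the record does not specify.

```python
def intensity_index(event):
    """Illustrative intensity index for one SCAT Segment/STR event: scale the
    working footprint (machines plus support vehicles) by how often and how
    long the operation ran. Hypothetical, not the BA team's formula."""
    footprint = event["heavy_machines"] + event["support_vehicles"]
    return footprint * event["passes_per_week"] * event["weeks_active"]

# Example: one excavator with a three-vehicle entourage, two passes per week
# for six weeks, versus a weekly hand crew with a single support vehicle.
mechanical = {"heavy_machines": 1, "support_vehicles": 3,
              "passes_per_week": 2, "weeks_active": 6}
manual = {"heavy_machines": 0, "support_vehicles": 1,
          "passes_per_week": 1, "weeks_active": 6}
print(intensity_index(mechanical), intensity_index(manual))  # 48 6
```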
The most comprehensive source of ICS forms was the database of scanned images made by the USCG Documentation Unit. As previously discussed, these images were scanned copies of ICS forms generated by responders across the AoR. The documentation unit used a categorization system to prepare original documents for scanning. The attributes used to separate groups of documents were as follows: Categories, Subcategories, General Descriptions, Number of Pages, and Date Range. Each page of each document set was also assigned a serial number associated with the image file name. By June of 2012, the documentation unit had created 100 spreadsheets, with varying schemas, containing the document set categories and serial number ranges. Using database management techniques, the BA Team Analysis Coordinator organized these tables into a singular database with a cohesive schema. This compiled database contained nearly 30,000 document sets. Each document set contained a number of individual pages, some numbering upwards of 4,000 pages. By filing the variant tables of document information into one cohesive database, the BA team was able to take a systematic approach to bulk searching the massive dataset. Areas within occupied or federally designated critical habitat for the focus species of the Deepwater Horizon BA were prioritized using this method. Additionally, date ranges when species were utilizing that habitat were targeted in the preliminary sorting.
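A minimal sketch of a stratified draw over such a compiled database follows. The record fields, stratum keys, and sampling fraction are illustrative assumptions; the real strata were built from the documentation unit's categories, date ranges, and habitat priorities.

```python
import random
from collections import defaultdict

random.seed(42)  # a fixed seed keeps the draw reproducible and defensible

# Hypothetical stand-ins for compiled document-set records (the real database
# held nearly 30,000 sets, each with a category, description, and date range).
document_sets = [
    {"serial_range": (i * 100, i * 100 + 99),
     "category": random.choice(["Operations", "Planning", "Wildlife"]),
     "in_priority_habitat": random.random() < 0.3}
    for i in range(3000)
]

def stratified_sample(records, key, fraction):
    """Sample the same fraction from every stratum, so small but important
    categories are represented rather than drowned out by a simple random draw."""
    strata = defaultdict(list)
    for rec in records:
        strata[key(rec)].append(rec)
    picked = []
    for members in strata.values():
        picked.extend(random.sample(members, max(1, round(len(members) * fraction))))
    return picked

# Stratify jointly on category and the habitat flag; priority-habitat strata
# could be drawn at a higher fraction to mirror the BA team's prioritization.
sample = stratified_sample(document_sets,
                           lambda r: (r["category"], r["in_priority_habitat"]),
                           fraction=0.05)
print(f"{len(sample)} of {len(document_sets)} document sets selected for review")
```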
The BA team worked remotely from home offices all over the country. Therefore, a cloud-based solution for data entry into the action record database was ideal. The BA Team GIS Administrator developed a geographically aware web form to fulfill this need. The fields on the form were tailored to encourage users to move through documents as swiftly as possible, gleaning only information which would be valuable to the final analysis. Design of form entry is tricky: it is important to balance the flexibility needed to capture user insights with the rigidity needed to maintain consistency in the information. Data validation with drop-down menus and data type restrictions is an excellent tool for making control features feel less restrictive. In addition to careful design of the user interface on the front end, it is critical to design a streamlined data consumption structure on the back end.
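The sketch below illustrates that balance in miniature: controlled vocabularies and data type restrictions for most fields, with free text reserved for genuine user insight. The field names and allowed action values are hypothetical stand-ins for the actual form design.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum

class ActionType(Enum):
    """Drop-down menu options: users pick from standard response action terms."""
    MANUAL_REMOVAL = "manual removal"
    MECHANICAL_REMOVAL = "mechanical removal"
    FLUSHING = "flushing"

@dataclass
class ActionRecordEntry:
    segment_id: str      # standard SCAT Segment reference
    str_id: str          # operational permit (STR) reference
    action: ActionType   # restricted to the controlled vocabulary
    observed_on: date    # data type restriction: must be a real date
    notes: str = ""      # the one free-text field for user insight

    def __post_init__(self):
        # Reject values that slipped past the front-end controls.
        if not isinstance(self.action, ActionType):
            raise ValueError(f"unknown action: {self.action!r}")

entry = ActionRecordEntry("Segment 6", "STR 2",
                          ActionType.MECHANICAL_REMOVAL, date(2010, 9, 1))
print(entry)
```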
Many of the response database tables contain pivoted data, which stores too much information per record. One such instance is the Best Management Practice (BMP) Checklist dataset. In the BMP Checklist data collected on the response, compliance for a whole list of BMPs is contained within a single record. This single record was collected for a wide range of locations, yet was tied to a single coordinate (latitude and longitude). A pivoted dataset like this one cannot be queried for specific results; it requires significant untangling before it is viable for analysis. Without even addressing the location disparity, these datasets require a complex query, subquery, and aggregate function with case analysis to return any useful data. This was prevented in the action record database by filtering the data as it was submitted, before posting. Records were structured so that actions could be spread across many individual locations but never packed into a single record: each combination of location and action was given its own record of occurrence. For example, if the user selected three segments and three actions, nine records would be created, effectively unpivoting on two parameters. Information in data fields was paired in a way that generated unique records for every combination of activities, dates, and locations identified in the ICS forms. These data, combined with the PAFC, have facilitated detailed effects analyses for the listed species in the Deepwater Horizon AoR. The action types and data architecture used to build the forensic Deepwater Horizon action record are flexible and could inform planning for future spills.
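The three-segments-by-three-actions example can be expressed directly as a cartesian product over the submitted selections. The segment and action names below are illustrative.

```python
from itertools import product

# A single form submission: the user ticked three segments and three actions.
selected_segments = ["Segment 4", "Segment 5", "Segment 6"]
selected_actions = ["manual removal", "mechanical removal", "flushing"]
report_date = "2010-09-01"

# Pre-posting filter: expand the submission into one record per combination
# (the cartesian product) instead of packing lists into single cells.
records = [
    {"segment": seg, "action": act, "date": report_date}
    for seg, act in product(selected_segments, selected_actions)
]

print(len(records))  # 9 records from one submission
```

Because every record holds exactly one location and one action, downstream queries need no subqueries or case analysis; the untangling is done once, at the moment of entry.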
CONCLUSION AND LESSONS LEARNED:
The response community is regularly inundated with plans. Safety plans, environmental plans, cleanup plans, and spill prevention plans are all critical to protecting the lives of people in the industry and the resources in the surrounding environment. Asking for data strategy to be brought to the forefront may seem like a misalignment of priorities, but information is at the core of each of these other subjects. Strategic data management can make each of the other plans more powerful, while also addressing regulatory compliance. Furthermore, by using the lessons learned from the ESA BA of the Deepwater Horizon incident response, we can make a lasting contribution to responses yet to come.
The following principles should be used when generating a data plan:
Do not underestimate the urgency of a response environment. Individual motivations and responsibilities are variables which cannot be overlooked when considering a data plan. User interfaces must be efficient and scalable, and must not interrupt the priorities of the response at large.
Identify variables which are pertinent to regional species and other resources in advance. Design data fields to capture information that contributes to these variables, while still meeting the immediate needs of the response.
Do not create pivoted data. Take the time to incorporate ISO standard database management principles as the data is consumed.
Leverage the best available technology. Cloud-based solutions and professional-grade mobile GPS devices can be used in concert to lend both accuracy and precision to response data collection and distribution. This does not mean device fleets need to be purchased in advance; the device itself is not as important as advance preparation of a plan of fundamentals for its use. A mobile program development matrix is sufficient for evaluating market offerings.
If the country faces another spill of national significance, these principles could be used to make sure that as the response marches on, so does its story. There will always be the need for analyses of effects of response actions when the spill interfaces with listed species or their habitat. An appropriate data plan would alleviate the current disconnect between events and documentation, leave room for surprises, and meet scalability requirements. The sheer size of the Deepwater Horizon Incident was not the only reason it will be remembered in history. The Deepwater Horizon Response is a story of human effort, a collaboration of passionate individuals to meet a crisis with muscle, strategy, and resolve. That power, the power of a team on a mission, can be channeled through a schema design to leave its footprints behind in clear detail. That is the gift of data management.