Archivists have spent the past several decades seeking solutions for managing born-digital collection materials. While progress has undoubtedly been made in the areas of acquisition and digital preservation, a recognizable gap exists in the area of processing. Defining what, exactly, born-digital processing is and what it entails is a conundrum. Following the 2016 Born Digital Archiving & eXchange (BDAX) unconference at Stanford University, a group of ten archivists produced the Digital Processing Framework to articulate what archivists do when processing born-digital archival collections. This article examines the current professional digital processing landscape and reflects on the framework group's lofty endeavor. It frames four issues that make born-digital processing enigmatic and challenging: defining the scope of digital processing; the ongoing tensions between minimal processing and digital preservation; confusion in terminology about the functions in digital processing; and the convergence of two fields of inquiry, traditional archival processing and digital preservation, whose borrowed and shared language and practice have become digital processing. It concludes by recommending further actions and explorations for defining and guiding born-digital processing.

Archivists have spent the past several decades seeking solutions for managing born-digital collection materials. Archivists began sharing theory and practice about how to archive digital records once records became electronic. Conferences, publications, and formal and informal conversations continue to disseminate experiences and ideas about standards and best practices for managing digital materials. The profession has made significant progress in clearly identifying the work necessary in areas of acquisition and digital preservation, but a recognizable gap exists in the area of processing. Establishing what, exactly, born-digital processing is and what it entails is a conundrum; the term evokes different processes and steps to different archivists based on education, experience, and institutional situation. This article explores the challenges to defining born-digital processing and suggests that many of these challenges arise because digital processing exists at the intersection of two fields: analog archival processing and digital preservation.

Following the July 2016 Born Digital Archiving & eXchange (BDAX) unconference at Stanford University, a group of ten archivists (hereafter referred to as the framework group) produced the Digital Processing Framework to better define born-digital processing.1 From August 2016 to July 2018, the members created matrices of activities and tasks designed to articulate exactly what archivists do when processing born-digital archival collections (see Figure 1). In developing the Digital Processing Framework (referred to as the Framework throughout this article), group members encountered numerous challenges devising something useful, flexible, and extensible that could ultimately assist a profession grappling with processing an influx of born-digital materials. The final Framework consists of information about its purpose, audience, scope, limitations, and twenty-three processing activities with associated possible tasks displayed as matrices according to the FrankenModel structure the group designed. In reflecting on the framework group's lofty endeavor, four issues that make born-digital processing enigmatic and challenging surfaced: defining the scope of digital processing; the ongoing tensions between minimal processing and digital preservation; confusion in terminology about the functions in digital processing; and the convergence of two fields of inquiry, traditional archival processing and digital preservation, whose borrowed and shared language and practice have become digital processing.

FIGURE 1. Example of one activity, or matrix, within the FrankenModel

After a review of recent born-digital processing literature and professional efforts, this article describes the creation of the Framework (which both authors helped to develop) to articulate the challenges of processing digital materials in the twenty-first century. We find that, largely because of the influx of born-digital collections, digital processing now includes a complete collections management life cycle, more than only “arrangement, description, and housing.”2 In considering this broader concept of processing, the article concludes by recommending further actions and research areas to better define and guide born-digital processing.

Before commencing with the body of the article, we present several notes on terminology. Throughout this article, we use the term “born-digital” to include all digital material. Sometimes, digitized materials arrive without their analog originals, in effect leaving archivists to treat the materials as born-digital. Additionally, “framework group” refers to the group of archivists who developed the tool; “Framework” refers to the final document in its entirety; “FrankenModel” references the structured approach of relating activities to tasks and their level of processing; and “matrix” refers to a single activity/task example (such as Figure 1).

Archivists have been conducting and publishing valuable research on born-digital collections since the 1970s.3 Reviewing this literature for insight into digital archival processing, a vast range of topics emerges, including accessioning, digital preservation, digital curation, digital forensics, creation of finding aids, provenance and original order, access methods, and more. A consideration of digital processing merits an evaluation of the current landscape of published articles, monographs, and reports that use the term “digital processing” or articulate the authors' concept of archival processing: activities, workflows, tools, and best practices for making archival materials accessible to users.

Archivists began describing acquisition, arrangement and description, and digital preservation using tools and workflows with the shorthand of “archival processing” in the early 2000s. Three of the earliest articles include Lucie Paquet's summary of archival processing for electronic manuscripts at the National Archives of Canada and two case studies detailing the experiences of students processing digital collections at the Harry Ransom Center for courses at the University of Texas at Austin.4 These articles begin to describe some digital processing activities, such as converting file formats, incorporating description and access with physical counterparts of the same collection, generating or manually creating item-level metadata, and making appraisal decisions. They also began to illuminate issues that the framework group grappled with in 2016: undertaking digital preservation, description, and delivery in one systematic process, and the challenges of using a More Product, Less Process (MPLP) approach in processing digital materials.5

As archivists continued to consider how to process digital collections, institutions with enough resources to experiment, such as Emory and Yale, published case studies providing examples, workflows, and tools used.6 These case studies depict exploratory projects into authenticity, description, and emulation, but ultimately were not scalable for most institutions (including, detailed as follows, their own). According to Cyndi Shein, “These groundbreaking projects provide an invaluable foundation for the development of born-digital stewardship, but they come from a very similar and limited perspective—that of large institutions with solid funding and expert technical support, working on high-profile humanities collections.”7 Shein's 2014 case study was one of the first published articles to present an economical workflow for making a wide range of born-digital formats accessible while navigating the intricacies of restrictions, efficient processing, and accessioning as processing. Her case study began to fill a gap in the literature and addressed one of the purposes of the framework group: to delineate and share practical activities for getting digital materials to researchers.8 Other notable sources reflecting this need include Ben Goldman's article describing practical steps to make digital records accessible, OCLC's Demystifying Born-digital series of reports to assist practitioners, and numerous entries on the SAA Electronic Records Section's BloggERS.9

One case study that heavily influenced the Framework was a 2016 publication coauthored by Dorothy Waugh, a member of the framework group. Waugh and coauthors at Emory University describe a tiered, flexible approach to undertaking arrangement and description of born-digital materials by using a rubric to assess three areas: content, processing complexity, and access method.10 The rubric guides archivists to adjust the processing workflow depending on the collection's scope, file renderability, authenticity, restrictions, and expected use, and assign a processing tier (low, medium, or high) for each digital collection. In doing so, they spend an appropriate amount of effort processing varied collections, spending more time and resources on collections for which users would most benefit from the additional attention. This case study breaks new ground by demonstrating a scaled-up, nonlinear, and evolving workflow to accomplish access and preservation, which are the major deliverables in archival processing.

More recently, four published monographs elucidate both the activities involved in and the current professional ethos of born-digital processing. These books began to address the looming challenge recognized by the framework group—what does processing mean in a digital world and what should archivists do, on a practical level, to process electronic materials? SAA's Archival Arrangement and Description, Module 2: Processing Digital Records and Manuscripts was one of the first books to directly address these questions. Author Gordon Daines articulates the differences and similarities between analog and born-digital processing, writing, “Digital records and manuscripts force an important adjustment to this traditional definition of archival processing.”11 Daines presents seven processing tasks traditionally associated with analog records and suggests a revised workflow and adjusted set of subtasks for digital materials. While this particular workflow did not directly inform the Framework, it includes many of the same tasks, such as gathering contextual information and transferring content from external media.

Lori Hamill's book, Archival Arrangement and Description: Analog to Digital, heavily references Daines's workflow while attempting to define, describe, and bridge born-digital processing with analog processing. Building on Daines's idea that traditional processing requires reframing, Hamill remarks, "Core archival principles are still valid for digital records, but they require new steps to be added to old processes, reworking workflows, and new specialized knowledge and tools to adjust to this new format. The need for new techniques to accomplish archival functions, including arrangement and description, does not mean that the principles developed for analog records no longer apply."12 Using Daines's digital processing activities and tasks, Hamill created a visual representation of the actions typically taken in analog processing compared to those of born-digital processing. Her research resulted in numerous tables outlining tasks involved in accessioning, arrangement, and description. Although the group was unaware of Hamill's research, the tasks and activities in her tables are similar to the Framework's matrices, with Hamill's activities corresponding to a workflow. Comparing and contrasting the framework group's work with Hamill's illustrates some of the inconsistencies of blurred digital processing stages. For example, both resources include "Run Virus Scan," but the Framework delineates this as an activity with three associated tasks, whereas Hamill includes it as one task during accessioning.13 She also includes several tasks that do not appear in the Framework (e.g., "monitor date last modified on files"), largely because her process is based on her institutional workflow.

Heather Ryan and Walker Sampson addressed practicalities of born-digital archival processing in their 2018 No-Nonsense Guide to Born-Digital Content. As the title suggests, it aims to offer clear-cut solutions for practitioners beginning to manage born-digital materials.14 While archival processing is alluded to in various chapters on accession and ingest, description, preservation, and access, it is explicitly discussed in a chapter on workflows. The authors state, “. . . a great deal of the work of born-digital processing and general management can be organized as a series of inputs and outputs, with each output serving as the input for the next step in the chain.”15 Like Daines and Hamill, the authors present processing as a workflow that includes multiple activities both unique to electronic records and familiar to traditional analog processing workflows.

Trevor Owens directly tackles born-digital processing in two chapters in his 2018 monograph, The Theory and Craft of Digital Preservation. He makes clear that “much of the work of arrangement and description should fold into existing practices” and recognizes “a need to pivot from some of the traditional notions of cataloging functions to a world in which librarians, archivists, and curators function as wranglers of interconnected data.”16 Owens examines the “boundaries” of a digital object, demonstrating with practical examples why digital objects challenge archival notions of original order.17

Owens, and others, recognize sweeping changes to the definition of and activities involved in processing digital archival materials, yet write about digital archival processing as an extension of traditional analog processing. They specifically address ideas of arrangement, description, and preservation in the context of original order, provenance, and, although a relatively recent development, minimal processing. Other authors skip the practicalities entirely, questioning fundamental archival principles such as original order and trending toward a complete revision of traditional processing activities and workflows by letting the records describe themselves. Such an approach relies on leveraging information extracted from the records (e.g., file names, folder titles, embedded metadata, and image recognition). By asking, "How will traditional principles of archival arrangement and description be challenged or modified for born-digital materials?," Jefferson Bailey examines concepts of respect des fonds and original order in regard to processing electronic content.18 He finds that these concepts were constructed for analog materials and are less important to the management and access of digital materials. He notes that "the component bits of a digital object are non-sequential in their material physical arrangement. . . . Here, even at the bit level of a single item, there is no original order." He also states that "Arrangement, as we think of it, is no longer a process of imposing intellectualized hierarchies or physical relocation; instead, it becomes largely automated, algorithmic, and batch processed."19 Geoffrey Yeo and Jane Zhang also examine the concept of original order as it relates to processing and user interaction with archival materials.20 By proposing alternate user-driven discovery mechanisms, they began to introduce innovative approaches for processing in the digital realm, such as text mining and dynamic or on-demand arrangement. Both Yeo and Zhang acknowledge that these proposed alternatives to analog archival hierarchical arrangement and descriptive practices require item-level control over digital content.

The framework group was not the first to explore digital archival processing; several groups, initiatives, and projects have provided archivists with insight, advice, and guidelines for providing access to born-digital content. PARADIGM was one of these pioneering projects; it attempted to cover the complete digital curation life cycle from acquisition to preservation. This UK-based collaboration had multiple objectives, including “provide the participating institutions with hands on experience of curating personal digital archives” and “to pilot a system for . . . cataloging and providing long-term access to digital personal material.”21 PARADIGM participants generated a workbook and final report including fifty recommendations for research libraries. While the report includes information about processing, the topic is largely addressed in the accessioning stage as “initial processing.”22 This project also investigated the use of Encoded Archival Description (EAD) for descriptive cataloging of born-digital materials.23

Building from the results of PARADIGM, in 2012, archivists at four institutions in the United States and England published AIMS Born-Digital Collections: An Inter-Institutional Model for Stewardship. As with the framework group, authors of the AIMS document acknowledged “a common need among project partners to identify a methodology or continuous framework for stewarding born-digital archival materials.”24 The AIMS framework divides stewardship into four different functions (collection development, accessioning, arrangement and description, and discovery and access), and breaks out each function into objectives, outcomes, decision points, and tasks. It is tool-agnostic but refers to tools throughout. Objectives and tasks under “Arrangement and Description” are very much in line with traditional processing and include planning, arrangement, and the creation of metadata.25

Most recently, published after the framework group finished meeting, OSSArcFlow, a collaboration between UNC Chapel Hill's School of Information and Library Science, Lyrasis, and Artefactual, collected workflows from twelve US institutions outlining processes for what they term "digital curation." Many of the workflows include tasks traditionally associated with archival processing merged with terminology and models derived from digital preservation.26 In addition, the Digital Library Federation's "Levels of Born-Digital Access" strives "to provide a set of format-agnostic baseline practices for born-digital access, laying out concrete and actionable recommendations that individual institutions can consider implementing according to their needs, resources, and abilities."27 This access-first decision-making model offers a different approach to articulating what digital processing looks like.

The most common piece of advice throughout these case studies, articles, books, reports, and projects is for archivists to be “flexible” in their approach to processing digital archives.28 Digital processing is complex and nuanced—no universal streamlined process exists for arranging, describing, and preserving digital collection materials. A wide range of publications and reports provides useful examples and guidance that is valuable to discussions about processing born-digital materials, but lacks best practice and standardization for archival processing of digital records. While individual case studies provide detailed examples, the workflows and tools do not necessarily translate across local practices and resources. Theoretical articles questioning fundamental processing concepts offer exciting ideas for the future, but contain little how-to guidance for practicing digital archivists. The standout reports of past projects had multiple objectives, limiting exploration of widely applicable and specific processing guidance. The literature on born-digital processing leaves archivists to distill an enormous amount of information before determining the most useful, practical path for local collections and institutions. Clear guidelines telling archivists exactly what to consider and what to do are lacking from the variety of projects and publications. At Stanford University's second annual Born Digital Archiving & eXchange (BDAX) unconference, framework group members saw a need to coalesce and distill existing information and experience into best practice. Somewhat naively, the group members ultimately set out to accomplish this.

Conversations among experts in digital archives at BDAX uncovered a gap in best practice literature, general guidelines, and standardization around the umbrella term of “processing digital materials.” Following the 2016 unconference, ten participants (all practicing digital archivists, including the authors of this article) volunteered to attempt to untangle the thorny landscape of digital processing and to define best practice in the hopes of filling the identified gap.29 Lacking an institutional mandate or funding support, and initially thinking this would be an eight-month project, the group conducted its work through conference calls and Google documents for the following two years.

Soon after convening, the group created a project charter to help focus work from the infinitely possible to a more manageable output. Given the common limitations of institutional resources, the group saw a need to adapt concepts of minimal processing levels to digital processing. The charter's goal in creating a processing policy framework was “to articulate the actions (human and technical) necessary to arrange and describe digital materials at increasing levels of granularity.”30 The project's anticipated deliverables were

  • A tagged Zotero bibliography on processing digital collections;

  • A selected annotated bibliography of significant resources on processing digital collections;

  • A digital processing policy framework that proposes

    • high-level best practices for processing digital collections at increasing levels of granularity (collection→item),

    • factors and criteria to consider in establishing a processing level,

    • minimal descriptive requirements for providing access to digital collections in alignment with established models (DACS) at each level,

    • workflow and documentation (including tools and technology in use) for modeling an example digital processing program, and

    • metrics for measuring and reporting progress on digital processing projects.

As the group began exploring the effort needed for the deliverables, it quickly became clear that the original eight-month timeline was impossibly ambitious given the scope, topical complexity, and volunteer nature of the members' commitment. Ultimately, within two years, the group produced a final Framework that achieved many of these goals, but it did not develop workflows or metrics to support a digital processing program. Nor did the FrankenModel include tools or technologies, given the prevalence of such lists elsewhere in the profession.31

Following the charter creation, group members shared collective knowledge about existing literature that likely addressed digital processing. The group set up a publicly accessible Zotero library for everyone to add articles.32 Each group member read and annotated eight to ten resources, highlighting whether each resource was useful for the Framework and how the authors described digital processing work. The goal was to identify articles that best describe digital processing practices, developing a knowledge base complementary to the group's professional expertise of digital archival processing work. Group discussions about the literature demonstrated that although it contains a wealth of information about digital processing, the concept is broad and defined differently with varied intensity of activities among institutions.

Defining Digital Processing within the Group

While discussing the literature review to identify common best practice, group members struggled to find shared language describing and identifying the work of digital processing. Members discussed digital processing using language derived from workflow assumptions and tools based on local practice or experience. This led to miscommunications and a range of assumptions that, with further interrogation, demonstrated the need for a common set of terminologies for digital processing. Shared terminology was essential to make a framework that could be useful to professionals regardless of institutional context. Instead of distilling digital processing theory found in the literature into a practical framework, the group had to abstract practical experience informed by the theories into shared best practice.

The group addressed the terminology and local practice barrier by brainstorming a list of tasks that members perform as part of local digital processing work. The brainstorming tried to identify every possible task regardless of local or tool-based terminology, granularity of effort, potential overlap, or duplicative functional work. In doing so, three trends emerged. First, a high level of agreement existed around terminology for early capture and accessioning procedures, suggesting that these practices are well understood within the profession and terminology is stable enough for productive cross-institutional conversations about how the work happens in local contexts. Second, digital processing is a continuum of work that is institutionally specific due to policy and infrastructure capacity. As a term, "digital processing" is too broad to use in literature and in useful conversations without additional contextual specificity. Third, the identified tasks are predominantly for simple digital objects, rather than complex digital objects or environments, such as email, web collections, databases, or software. Practicing archivists have few standard workflows or regular experience processing complex digital objects; such efforts are largely theoretical or unique projects rather than common practice.

These unexpected and revealing discoveries, combined with the urgency to produce deliverables within the proposed time, resulted in the group splitting into three teams for more efficient work: Team Lit Review, Team Survey, and Team Framework. The framework group continued hosting meetings for all members to ensure the teams shared their work with one another, and the collective efforts would build toward the identified deliverables.

Team Lit Review

Given the realization that the group did not share common sets of terminology or workflow assumptions, the members of Team Lit Review analyzed the collected literature for examples of the following:

  • Using inherent qualities of digital materials to assist in aggregation or arrangement decisions;

  • Using automation in identification and extraction of informational content in digital materials;

  • Applying extensible processing strategies to born-digital materials to help guide efforts to reveal minimum-level processing activities;

  • Approaches to access and identifying what users want;

  • Integrating the processing of born-digital and paper in hybrid collections;

  • Questioning the usefulness of traditional archival practices when applying them to born-digital materials.

In looking for examples, the team created annotations for the most relevant articles. In general, the articles did not discuss the intersection of traditional archival arrangement and description and digital preservation work. There were a handful of exceptions, most of which envision a version of “let the objects describe themselves” and some of which are referenced in our literature review.33 Ultimately, the team saw the final bibliography as a resource for practitioners.

Team Survey

Team Survey refined the brainstormed list of tasks into forty generalized activities that the team could analyze. Many of the activities referenced work done as part of another activity, indicating the need for further assessment and refinement. To accomplish this, the team created surveys for the group members as a mechanism to understand practice around these activities, divorced from the local terminology that was impeding communication. The goal was to reduce the number of activities to a more manageable list, define them for a broader audience, understand where in workflows an archivist might perform an activity, and, ultimately, determine whether each activity was part of the framework group's collective understanding of "minimum" processing.

For each of the forty activities, each framework group member answered eight questions based on local practice or professional expertise if local practice did not currently include the work (see Appendix A). Team Survey synthesized the responses and created a generalized definition and scope for each activity without using terminology that implies a specific institutional context, process, or workflow. Analyzing answers based on the functional purpose of an activity helped diminish institutional jargon that impeded shared understanding. Team Survey used the assessment to narrow the list to twenty-three different macro-activities (see Appendix B), composed of multiple microtasks. In developing the activity/task hierarchical relationship, the team often discussed how and why an institution would perform a task. This process, and the conversations about whether to include or exclude a task, further highlighted the wide variance in practice and the need for an extensible and flexible framework that institutions could adapt locally.

Team Framework

Given that the results of the framework group were to be disseminated in a way that would be useful to other practitioners, Team Framework created a structure for the finalized set of activities and tasks. The team developed five criteria the structure should meet:

  1. Extensible: easy to incorporate tools, methods, or approaches into the Framework as they are developed and/or the landscape of digital processing changes;

  2. Flexible: usable by anyone at any organization, regardless of their level of funding, expertise, staffing, or technical capacity;

  3. Useful: helpful in guiding decision-making about processing, rather than simply presenting the full range of possible options for a given task;

  4. Descriptive rather than prescriptive: offering a processor guidance about the implications of processing choices and recommendations about how to move forward, rather than making decisions on the processor's behalf;

  5. Simple: encouraging adoption and broad use; not intimidating, overwhelming, or complex.

To fulfill these requirements, the team looked at several existing models and policy frameworks to find synergy or structures they could reuse. The team examined a decision point model, which outlines processes and decisions in a document similar to a flowchart. A tiers model, similar to NDSA Levels of Digital Preservation, could serve as a benchmark of processing activities for organizations to evaluate against their own efforts. A model like Rightsstatements.org could present the processor with a range of categories to choose from. Team Framework evaluated each policy structure by creating a list of pros and cons according to the five structural criteria (extensible, flexible, etc.). Ultimately, the team incorporated some of these models' attributes, such as tiered categories and a grid layout. After a few drafts and redesigns, the team finalized what was deemed the FrankenModel (a whimsical term indicating the adoption of attributes from other models, which the group never replaced with a more professional term).

Team Framework also developed a three-tier definition of baseline, moderate, and intensive processing to support archivists' decision-making (see Table 1).

Table 1. Processing Tiers

For each activity defined by Team Survey, a matrix utilizing the FrankenModel structure would list all of the tasks that happen during the activity and indicate whether a task is necessary for one or more of these processing tiers (see Figure 1). In doing so, the Framework would support archivists trying to determine how much digital processing is sufficient for their local institutional and collection needs.

SAA Panel and Member Feedback

The framework group sought constructive feedback from a broader audience about the draft Framework to determine if it was meeting the established deliverables. In the session "What We Talk About When We Talk about Digital Processing: Building a Framework for Shared Practice" at the 2017 SAA Annual Meeting, five group members presented a draft version of the Framework and FrankenModel to an audience of approximately 200 people. Following a brief overview of the Framework, audience members broke out into fifteen groups. Each group discussed the proposed FrankenModel and gave feedback on the level of detail describing the activities and tasks; whether they would use the Framework at their institution, and why or why not; and any additional actionable suggestions or comments. The audience breakout groups agreed that a more explicit audience needed to be defined to contextualize the scope and purpose of the Framework. Additionally, the FrankenModel's layout impeded easy understanding of the work needed for each task at each level of processing. Finally, archivists at the session clearly wanted a method for evaluating whether their processing practices are responsible and adequate, and if not, what they should be doing instead. The final Framework provides an oblique way to perform an assessment, utilizing the levels of processing, but it is not a checklist to reassure archivists, donors, and users of good processing stewardship.

Following the SAA Annual Meeting, the group finalized the Framework incorporating feedback from the session. Team Survey continued to simplify and deduplicate the list of activities and tasks to be added to the FrankenModel. Team Framework produced FrankenModel 2.0 that addressed the layout concerns raised by SAA audience members. Team Lit Review wrote an introduction on how to use the model and defined the Framework's audience. The Framework's purpose is to suggest minimum processing guidelines while establishing common terminologies for shared practice.

Once the Framework and list of activities were finalized, each group member entered seven or eight activities with their companion tasks into FrankenModel 2.0, one activity per matrix. Using professional expertise and the synthesized terminology and functional definition of activities produced by Team Survey, each member determined which tasks should be performed at baseline, moderate, and intensive levels, noting variances in the tasks at each level of processing effort (see Figure 1).

The effort underscored the enormous amount of redundancy at the task level, further emphasizing that digital processing does not have a linear workflow. Each framework group member also reviewed and left comments on two other people's grids to give a broader interpretation of what the proposed tiered processing model could look like. Finally, in July 2018, the Framework was published in Cornell's institutional repository under a CC-BY-NC license.34 The self-organized nature of the framework group means that no standing body exists to receive feedback or revise the Framework. Several individuals have reached out to the authors since the Framework went live to discuss its contents and ask questions, but the lack of professional sponsorship impedes further refinement. At this time, the Framework is orphaned. To the authors' knowledge, there is no discussion in SAA sections to transform it into a persistent, actively maintained tool.

The group intended that archivists in all types of repositories could use the Framework. While the Framework should be a useful starting point for developing institutional processing practices, the group expected users to have some familiarity with digital preservation standards and terms. The Framework's audience includes archivists who process born-digital collections, who have an understanding of OAIS and digital preservation, and who are familiar with archival professional standards. In other words, the Framework is not meant for beginners. The final Framework contains a matrix utilizing FrankenModel 2.0 for each of the twenty-three high-level activities. Each activity lists the identified discrete tasks contributing to its completion and a suggested level of processing for each task (baseline, moderate, intensive), with contextual information describing variances in the processing work at each level. Suggested levels were the group's attempt to distill the literature and expertise into best practice for other archivists. In outlining the work for each of the twenty-three activities, the Framework identifies baseline processing for born-digital content. The Framework's CC-BY-NC license, combined with the modular nature of the FrankenModel 2.0, allows institutions to adapt it to local practices and resources.

There are limitations to the Framework. The layout of tasks in an activity and their recommended level of digital processing may imply that a higher level of processing is better. Additionally, the modularity of activities, divorced from a workflow, combined with the repetition of tasks across activities, prohibits easy attempts to string activities together as a workflow. Because academic archivists made up the majority of the committee's membership, the Framework might unintentionally be more applicable to archivists at research institutions, although archivists from the Gates Archive were instrumental in designing the Framework. The review process at SAA was intended to provide archivists from nonacademic institutions an opportunity to evaluate whether the tool would be useful in their context. Although feedback from the SAA audience indicated a wide interest in such a tool, no follow-up assessments were undertaken to explore its use in academic or nonacademic repositories. Other weaknesses in the Framework align with current professional challenges in defining digital processing, discussed in the following section.

The large attendance at the 2017 SAA session and over 6,300 views of the Framework in the institutional repository indicate that the archival profession seeks straightforward guidance on digital archival processing. However, as seen in the literature and the framework group's experience, creating and providing such guidance is difficult even when some consensus exists about the tasks and activities that ought to occur during processing. Developing the Framework made several trends clear: defining the scope of digital processing is critical for clear communication across the profession; tensions are ongoing between minimal processing and digital preservation; the seismic shift from analog to digital archival processing has created confusion in terminology that leans on technological implementations; and the professional intersection between digital preservation and archival activities requires careful negotiation.

What Is Digital Processing?

Determining “even what processing entails in the digital realm” was a primary outcome of the Framework and this article.35 As seen from the literature review and discussions within the framework group, “processing” in the digital realm is an imprecise term. The processing stage refers to steps archivists perform after accessioning to further arrange, describe, and prepare materials for access. The SAA Glossary's definition of processing is “The arrangement, description, and housing of archival materials for storage and use by patrons,” followed by the note, “Some archives include accessioning as part of processing.”36 Accessioning as processing is an accepted practice, made common largely because of the desire to provide access more efficiently. With the advent of minimal processing, more analog “processing” tasks (mostly aggregate description and simple rehousing) are pushed to some degree of completeness in the accessioning stage. Ultimately, institutions must determine what steps happen as part of their accessioning stage versus their processing stage; although generally, after accessioning, archivists arrange, physically process, describe the material, and ultimately produce a finding aid or other access tool.

The accession to access stages are theoretically useful, but are practically arbitrary distinctions when working with digital content because of the repetition of technical tasks that are necessary to make a collection accessible to users. Gaining more control over digital content in earlier workflow stages is critical because of the fragility of digital objects and storage devices. Furthermore, robust and often time-consuming digital preservation activities (e.g., format migration, managing sensitive and restricted files, checksum verifications) may occur repeatedly over the life of the materials while in the repository because digital preservation is, in part, a set of ongoing tasks that require active management of the materials. Digital preservation tasks that are performed during accessioning and processing can be repeated throughout the life cycle of the materials, challenging the theoretically time-linear framework of processing and entangling boundaries of distinct archival workflow stages.

Group members discussed the blurred lines between accessioning and processing practices and whether or not to include actions typically performed during digital accessioning in the Framework. Ultimately, the group decided that, to encompass all the tasks identified as essential to a collection being considered “processed,” the Framework had to include tasks that are typically carried out during accessioning (e.g., “capturing digital content off physical media” and “creating checksums”) but are also essential to, and may be repeated when, processing digital materials. In other words, “digital processing” is the authors' shorthand for all the work to make a collection accessible, regardless of the institutionally defined stage at which the work is performed.
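To make this repetition concrete, the following is a minimal, hypothetical sketch (in Python; it is not code from the Framework or any particular institution) of one such task: creating a checksum manifest at accession and re-verifying it later during processing. The accession directory path and function names are illustrative assumptions only.

    import hashlib
    from pathlib import Path


    def sha256(path: Path, chunk_size: int = 1024 * 1024) -> str:
        """Return the SHA-256 checksum of a file, read in chunks to handle large files."""
        digest = hashlib.sha256()
        with path.open("rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                digest.update(chunk)
        return digest.hexdigest()


    def record_checksums(accession_dir: Path) -> dict[str, str]:
        """Create a manifest of relative path -> checksum at accession time."""
        return {
            str(p.relative_to(accession_dir)): sha256(p)
            for p in sorted(accession_dir.rglob("*"))
            if p.is_file()
        }


    def verify_checksums(accession_dir: Path, manifest: dict[str, str]) -> list[str]:
        """Re-run the checksums later (e.g., during processing) and report any mismatches."""
        return [
            rel for rel, expected in manifest.items()
            if sha256(accession_dir / rel) != expected
        ]


    if __name__ == "__main__":
        transfer = Path("accessions/2018-042")  # hypothetical accession directory
        manifest = record_checksums(transfer)
        changed = verify_checksums(transfer, manifest)
        print(f"{len(manifest)} files recorded; {len(changed)} failed verification")

Because the same verification routine may be rerun at accessioning, during processing, and periodically thereafter, the sketch illustrates why a task such as "create checksums" resists assignment to a single workflow stage.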

Minimal Processing and Privileging Preservation

In the words of Trevor Owens, ". . . People have to look for solutions to preservation management problems first, all other problems that come after are, therefore, secondary."37 Compared to analog materials, the significantly shorter time period of digital stability necessitates performing preservation actions on all digital materials sooner rather than later. The digital content cannot simply wait for decades without a meaningful risk of unknown and potentially unmeasurable loss. For this reason, the theories and practices of digital preservation are fundamental to decisions made while processing all digital materials. Contrast this against archivists who traditionally perform preservation actions on a subset of a physical collection, but often not intensively on the entire collection, and the widespread practice of prioritizing access instead of preservation. In the seminal MPLP article, Greene and Meissner reasoned that item-level preservation actions on all paper-based materials, such as the pre-emptive removal of all staples/paperclips regardless of their condition, are often unjustified when the materials are housed in appropriate climate-controlled spaces, which will reasonably keep them in good condition for decades. Furthermore, labor-intensive preservation actions hinder the more important objective of "converting our massive backlogs into usable resources for our patrons."38

Owens remarked about MPLP, "my sense is that almost no one is doing this for digital materials."39 The Framework reveals that this is more than just one person's feeling. Out of 151 total tasks across the twenty-three activities, group members marked ninety-one tasks as necessary for baseline processing, the lowest tier. This indicates that sixty percent of the identified tasks are required for a set of materials to be considered processed. The framework group did not intentionally decide that most of the tasks should be required; this was simply the result of members' evaluation in determining which processing tasks are baseline, moderate, or intensive. This could indicate that digital processing is workload intensive, perhaps more so than analog processing. Providing access to digital materials means doing most of the work of traditional processing (e.g., arrangement and description), plus a range of technical tasks for digital preservation. Institutions that lack high technical expertise and support feel the burden of this work more acutely than those with more robust preservation software solutions and resources. On closer examination, the list of tasks is particularly granular, and, if their descriptions were simplified, baseline processing would appear less labor intensive. But doing so would hide the peculiar tension between archival processing (as developed in an analog world) and the needs of digital preservation that drive digital processing, discussed shortly.

The intensity of recommended baseline processing may also indicate that archivists are fixated on and privileging tasks that are done for the sake of preservation even though, "However sophisticated some collection of digital material is, at the most basic level, it's possible to create a record for it and make that record available."40 As with analog materials, archivists must ensure that completing a myriad of digital preservation tasks (e.g., creating checksums for all copies, running file format identification and verification) does not delay access to unprocessed materials and create the potential for a growing backlog. Kim, Dong, and Durden noted in 2006 that "item-level processing is an especially important preservation activity for ensuring long-term accessibility of holdings" and that the use of automated processes "will help archivists achieve the goal of 'more product, less [human-involved] process.'"41 At a time when professional discourse around processing focuses largely on efficiency and aggregate-level descriptive work for analog materials, when faced with digital materials, archivists grapple with balancing the requirements of item-level preservation while still arranging and describing at an aggregate level. The profession has struggled to develop technical tool and workflow solutions for automating item-level preservation efforts while simultaneously facilitating aggregate description and arrangement of those items. The technical solutions that try to balance these needs are particularly out of reach for institutions that cannot invest in comprehensive software solutions.
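As one illustration of the kind of automation that can keep item-level tasks from delaying access, the hedged Python sketch below inventories an accession and guesses file formats using only the standard library. It is a deliberate simplification: production workflows typically rely on dedicated identification tools such as DROID or Siegfried, and the paths, filenames, and function names here are hypothetical.

    import csv
    import mimetypes
    from collections import Counter
    from pathlib import Path


    def format_inventory(accession_dir: Path, report_path: Path) -> Counter:
        """Write a per-file format report and return aggregate counts by guessed type."""
        counts: Counter = Counter()
        with report_path.open("w", newline="") as out:
            writer = csv.writer(out)
            writer.writerow(["relative_path", "size_bytes", "guessed_mime_type"])
            for p in sorted(accession_dir.rglob("*")):
                if not p.is_file():
                    continue
                mime, _ = mimetypes.guess_type(p.name)  # extension-based guess only
                mime = mime or "unknown"
                counts[mime] += 1
                writer.writerow([p.relative_to(accession_dir), p.stat().st_size, mime])
        return counts


    if __name__ == "__main__":
        summary = format_inventory(Path("accessions/2018-042"), Path("format_report.csv"))
        for mime, n in summary.most_common():
            print(f"{mime}: {n} files")

The item-level report supports preservation planning, while the aggregate counts it returns can feed directly into collection- or series-level description without additional manual review.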

Confusion in Terminology

The AIMS Group identified differences in terminology as one of its three formidable challenges to developing the AIMS Framework.42 The group produced a glossary to combat this challenge. Nearly ten years later, the framework group encountered the same challenge (and, unlike the AIMS group, members were all from US institutions). As group members shared local practice, language describing the work existed on two levels: standards and implementation. Conversation typically started in the language of conceptual models and standards (e.g., Submission Information Package or SIP, Archival Information Package or AIP, EAD) with a successful shared understanding of the concepts. This worked well on a high level, but standards are not implementation.

When the group needed to describe implementation and workflow strategies to write best practice guidance, the language of the institution, heavily informed by local tools and systems, became a barrier to shared understanding of what the work practically looked like. For example, discussing how to share information about the existence of digital materials in a collection required members to translate local and consortia discovery platform terms, as well as policies governing work to publish catalog records and EAD finding aids, because processing work was intimately tied to these infrastructures. The lack of shared language to discuss workflow and implementation decisions meant that one member's shorthand terminology for a process often had a functional equivalent in one or more other members' processes, but the equivalence was difficult to identify because of workflow differences between institutions and the absence of common functional or implementation language. In this way, the survey work of the framework group's membership was vital to developing a shared vocabulary to discuss differences and commonalities in digital processing at a level below digital preservation theory and archival processing stages and standards, yet above that of the tools implemented in specific practice. The Framework could facilitate professional conversations about digital processing using an implementation language abstracted from workflow or tools.

Convergence of Two Practices

The research for the Framework suggests two sets of theories, practices, and tools intersecting to become digital processing: traditional archival processing and digital preservation. Traditional archival processing was developed in an analog world. The archival profession has codified stages of activity (which may happen across decades in a repository), shared workflows and decision-making tools, and periodically questioned current practice in light of a changing world. Archival processing is a hierarchical approach to arranging, describing, preserving, and providing access to collections in aggregates of increasing granularity. While archivists also may undertake folder- and item-level inspection and assessment, the work involved in processing focuses on preparing the materials for access in aggregates. Item-level processing, especially item-level arrangement and description, is uncommon in the modern era because of the enormous amount of time and work involved. Furthermore, archivists judiciously perform physical preservation actions based on material condition and resources rather than on all items in all collections.

Digital preservation, on the other hand, has a younger practical history, a dominant OAIS framework that takes cues from archival practice but does not directly map to archival stages of activity, and a developing practice based on a life-cycle model that necessitates regular, active management. Digital processing tasks, as evidenced by the matrices, prioritize working with items because of digital preservation demands, characterizing them in detail before aggregating that information for archival arrangement and description. Archivists assess and perform actions to understand digital materials by using tools that operate against storage devices, interpreting bitstreams that comprise files. They use software, often first developed outside of an archival context, to accomplish many item-level digital preservation activities efficiently and successfully. Tools can also assist with other processing tasks, typically appraisal and description, by reviewing every single file and extracting item-level information that can be repurposed for archival arrangement and description. In this way, item-level preservation work can produce item-level descriptive control for nearly all digital collections in a tantalizingly automatable way that is inconceivable for analog materials.
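As a small illustration of how item-level characterization can be rolled up into aggregate description, the sketch below (a hypothetical Python example, not code from the Framework or any named tool) summarizes each top-level directory of a transfer by file count, total size, date range, and file extensions, the sort of data an archivist might repurpose for a series- or folder-level scope note. All paths and names are assumptions for illustration.

    from datetime import datetime, timezone
    from pathlib import Path


    def summarize_directory(directory: Path) -> dict:
        """Aggregate item-level file statistics into a folder-level summary."""
        files = [p for p in directory.rglob("*") if p.is_file()]
        if not files:
            return {"directory": directory.name, "file_count": 0}
        mtimes = [p.stat().st_mtime for p in files]
        extensions = sorted({p.suffix.lower() or "(none)" for p in files})
        return {
            "directory": directory.name,
            "file_count": len(files),
            "total_bytes": sum(p.stat().st_size for p in files),
            "earliest_modified": datetime.fromtimestamp(min(mtimes), tz=timezone.utc).date().isoformat(),
            "latest_modified": datetime.fromtimestamp(max(mtimes), tz=timezone.utc).date().isoformat(),
            "extensions": extensions,
        }


    if __name__ == "__main__":
        collection = Path("accessions/2018-042")  # hypothetical transferred directory
        for child in sorted(p for p in collection.iterdir() if p.is_dir()):
            print(summarize_directory(child))

The point of the sketch is the direction of the work: every file is touched individually, as digital preservation demands, and only afterward is that item-level information condensed into the aggregates that archival description expects.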

The two practical approaches use similar language and can be tentatively mapped across one another. For example, archival accessioning and OAIS ingest function appear conceptually similar, the latter even borrowing process from the former (e.g., gathering information from donors prior to acquisition). But, when developing workflows, the needs of digital preservation are at odds with archival arrangement and description traditions. In traditional processing, archivists tend to work with items and folders in the service of creating aggregate arrangement and description; in digital processing, archivists have to work with individual files and technological layers using tools developed to support digital preservation. The OSSArcFlow project, which worked to overcome this challenge, identified that “a common shared pain point was that the description and arrangement practices for analog collections do not map well onto born-digital materials.”43 Archival arrangement and description standards are in tension with digital preservation. The languages of the two traditions borrow from one another, providing the illusion of synergetic practice, even as they are in service of fundamentally different levels of management: aggregate versus item. The tensions, elisions, and adoptions of practice and approaches to working within these related, but different, practices are among the reasons that defining digital processing is difficult and sharing local practice requires a better functional language.

Additionally, the possibilities of nonhierarchical or multihierarchical arrangement and description in a digital realm (e.g., full-text indexing, filtering, sorting content by date or filename, etc.) provide alternative possibilities for access to digital collections. Current archival arrangement and description standards (e.g., EAD) do not support these possibilities, in part because they are not possible or practical for analog materials. The item-level approach to digital preservation provides possibilities for item-level access through novel arrangement and description efforts. The possibility of item-level control exists in tension with a professional archival push for an aggregate level of control, leaving archivists processing digital materials to navigate overwhelming possibilities when it comes to arrangement, description, and management. In this way, digital processing presents new possibilities for archival arrangement and description that should expand the profession in new directions even as core ideas of archival provenance and context remain.
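To suggest what an on-demand, nonhierarchical "arrangement" might look like in practice, the following hedged Python sketch (hypothetical, and not a description of any existing access system) builds a flat item inventory and then presents it re-sorted by modification date or filtered by filename, without imposing a fixed hierarchy. The directory path and search term are illustrative assumptions.

    from pathlib import Path


    def build_inventory(root: Path) -> list[dict]:
        """Create a flat, item-level inventory of every file in a collection."""
        return [
            {
                "path": str(p.relative_to(root)),
                "name": p.name,
                "modified": p.stat().st_mtime,
                "size": p.stat().st_size,
            }
            for p in root.rglob("*") if p.is_file()
        ]


    def by_date(inventory: list[dict]) -> list[dict]:
        """One possible 'arrangement': all items ordered chronologically."""
        return sorted(inventory, key=lambda item: item["modified"])


    def filter_by_name(inventory: list[dict], term: str) -> list[dict]:
        """Another view: items whose filename contains a search term."""
        return [item for item in inventory if term.lower() in item["name"].lower()]


    if __name__ == "__main__":
        inventory = build_inventory(Path("accessions/2018-042"))  # hypothetical path
        print("Oldest items:", [i["path"] for i in by_date(inventory)[:5]])
        print("Budget files:", [i["path"] for i in filter_by_name(inventory, "budget")[:5]])

Each function produces a different ordering of the same items on demand, which is precisely the kind of user-driven, multihierarchical view that current descriptive standards such as EAD were not designed to express.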

Examining the intersection of two professional practices raises provocative questions: Should archivists think of digital preservation as a subset of archival processing work? Or is archival processing a subset of digital preservation work? Is processing still analogous to arrangement and description with a dash of preservation? Or is processing an increasingly meaningless term, encompassing everything that happens from accession to access, that should be retired in favor of language that describes archival work in functional areas such as appraisal, accession, ingest, arrangement, description, and preservation? As archivists increasingly reappraise, rearrange, and redescribe analog collections, traditional accession-to-access stages cease to be as important as a collection life cycle management approach. This change aligns with the active management that digital objects require. These questions matter, because how the profession frames these separate, but related, sets of practices will shape how archivists communicate about what it means to “process” digital collection materials.

While the Framework outlines a range of tasks that archivists can undertake to provide access and preserve digital materials, the profession should continue to refine functional assessment and shared implementation terminology. Archivists can use the Framework's activities as functional areas and terminology basics to build sharable documentation that meaningfully describes what digital processing work looks like across repositories.

In addition to defining functions and terms, the profession will also need to consider the boundaries of digital archival processing. The literature and work of the framework group reveal that processing is not what it used to be. Processing is much more than “arrangement, description, and housing of materials”; it now involves an abundance of preservation tasks and necessitates creating or rethinking existing management, access, and preservation systems or models.44 Rather than a step within a workflow, the shorthand “processing born-digital materials” actually describes the complete collections management life cycle of archival materials—appraisal, accessioning, arrangement and description, access delivery, and preservation.

If processing now includes nearly everything an archivist does to make born-digital materials accessible, how does the MPLP-minded archivist reconcile minimal processing with doing everything? Can archivists escape the false binary between digital preservation and a processed, moderately accessible collection? Providing access to digital archival materials efficiently should be the objective of every repository with growing collections. The alternative is larger backlogs, which many archivists have either already dealt with or are still battling with analog collection materials. Tools can help automate processes and achieve this goal but usually have to be strung together, and many require an advanced level of technical expertise beyond that of the basic computer user. Most institutions that are still developing digital processing workflows do not have resources to support tools that are widely used in larger, better-funded repositories (and often recommended in the literature). Open source operating systems, the ability to create scripts, and knowledge of multiple programming languages are incredibly useful for efficiently processing born-digital materials, but not every institution allows open source tools. Furthermore, for some institutions, digital preservation support exists in units separate from archives, while other institutions combine the two services. Successful communication across divided responsibility within institutions, or even the different ways of sharing responsibility among institutions, requires carefully thinking about the intersection of these two traditions.

More specifically, proven guidelines are both needed and desired for less prestigious and/or underfunded archival institutions. While the archival profession has produced considerable information on digital archival processing—including case studies, guidelines, and reports—it is only now beginning to offer scalable workflows adaptable to different types and sizes of institutions and collections, such as DLF's Levels of Born-Digital Access.45 In this new processing world, perhaps more grant funding for less prestigious institutions or more research supported by professional organizations like SAA to develop efficient workflows using basic technology would go a long way in reconciling the tensions between old and new processing strategies.

Born-digital processing is difficult to grapple with at times. In some ways, it is an extension of analog processing in that the end goals are the same: arrange and describe the materials so that researchers can use them, and preserve the materials so that they last as long as possible. This may be the easiest way to meld theoretical understanding with hands-on practice. But perhaps these traditional terminologies and stages are an impediment to talking about the work needed to prepare digital archives for users. The details of that work, driven by digital preservation's item-centric approach, are often where workflows and the expertise needed to design those workflows become overwhelming and confusing. By exploring the larger issues that our work on the Digital Processing Framework illuminated, we hope to have moved the conversation about archival processing, and minimal processing of digital materials, forward.

Appendix A: Survey Questions for Framework Group Members

The questions group members answered for each of the forty brainstormed activities were:

  1. What tasks are included in this activity? (Free text response for respondent to describe how they do this work.)

  2. What is the functional purpose of this activity? (Free text response articulating expected work outcomes.)

  3. What is the activity's scope in a workflow? (Multiple selection options for: Accessioning, Appraisal, Arrangement, Description, Planning, Preservation, Wrap-up, Other.)

  4. Does the source of the material (email, website, legacy media, etc.) affect the activity? (Y/N/Maybe response.)

  5. Does format of the content (image, video, text, etc.) affect the activity? (Y/N/Maybe response.)

  6. Should the task be part of minimal processing? (Y/N/It depends; respondents who answered “It depends” were given a free text response to explain why.)

  7. How important is the activity to your current workflow? (1–5 scaled response.)

  8. Other thoughts? (Free text.)

Appendix B: Twenty-three Final Activities

  1. Survey the collection.

  2. Create processing plan.

  3. Establish physical control over removable media.

  4. Capture digital content off physical media.

  5. Create checksums for transfer, preservation, and access copies.

  6. Determine level of description.

  7. Identify restricted material based on copyright/donor agreement.

  8. Gather metadata for description.

  9. Add description about electronic material to finding aid.

  10. Record technical metadata.

  11. Create SIP.

  12. Run virus scan.

  13. Organize electronic files according to intellectual arrangement.

  14. Address presence of duplicate content.

  15. Perform file format analysis.

  16. Identify deleted/temporary/system files.

  17. Manage personally identifiable information (PII) risk.

  18. Normalize files.

  19. Create AIP.

  20. Create DIP for access.

  21. Publish finding aid.

  22. Publish catalog record.

  23. Delete work copies of files.

Notes

1. eCommons Open Scholarship at Cornell, “Digital Processing Framework,” https://ecommons.cornell.edu/handle/1813/57659.

2. Richard Pearce-Moses, s.v. “processing,” Glossary of Archival and Records Terminology (Chicago: Society of American Archivists, 2005), https://www2.archivists.org/glossary/terms/p/processing.

3. Two of the earliest publications include Charles M. Dollar, “Documentation of Machine-Readable Records and Research: A Historian's View,” Prologue 3, no. 1 (1971): 27–31; Harold Naugler, The Archival Appraisal of Machine-Readable Records: A RAMP Study with Guidelines (Paris: General Information Programme and UNISIST, United Nations Educational, Scientific and Cultural Organization, 1984).

4. Lucie Paquet, “Appraisal, Acquisition and Control of Personal Electronic Records: From Myth to Reality,” Archives and Manuscripts 28, no. 2 (2000): 71–91, https://publications.archivists.org.au/index.php/asa/article/view/8857; Catherine Stollar Peters, “When Not All Papers Are Paper: A Case Study in Digital Archivy,” Provenance 24, no. 1 (2006): 22–34, https://digitalcommons.kennesaw.edu/cgi/viewcontent.cgi?article=1068&context=provenance; Sarah Kim, Lorraine A. Dong, and Megan Durden, “Automated Batch Archival Processing: Preserving Arnold Wesker's Digital Manuscripts,” Archival Issues 30, no. 2 (2006): 91–106, http://digital.library.wisc.edu/1793/45800.

5. Mark A. Greene and Dennis Meissner, “More Product, Less Process: Revamping Traditional Archival Processing,” American Archivist 68, no. 2 (2005): 208–63, https://doi.org/10.17723/aarc.68.2.c741823776k65863.

6. Laura Carroll et al., “A Comprehensive Approach to Born-Digital Archives,” Archivaria 72 (Fall 2011): 61–92, https://archivaria.ca/index.php/archivaria/article/view/13360; Michael Forstrom, “Managing Electronic Records in Manuscript Collections: A Case Study from the Beinecke Rare Book and Manuscript Library,” American Archivist 72, no. 2 (2009): 460–77, https://doi.org/10.17723/aarc.72.2.b82533tvr7713471.

7. Cyndi Shein, “From Accession to Access: A Born-Digital Materials Case Study,” Journal of Western Archives 5, no. 1 (2014): 3, https://doi.org/10.26077/b3e2-d205.

8. Shein, “From Accession to Access,” 1–42.

9. Ben Goldman, “Bridging the Gap: Taking Practical Steps Toward Managing Born-Digital Collections in Manuscript Repositories,” RBM: A Journal of Rare Books, Manuscripts, and Cultural Heritage 12, no. 1 (2011): 11–24, https://doi.org/10.5860/rbm.12.1.343; OCLC Research, “Demystifying Born Digital,” https://www.oclc.org/research/themes/research-collections/borndigital.html; Society of American Archivists, Electronic Records Section, BLOGGERS! (blog), https://saaers.wordpress.com.

10. Dorothy Waugh, Elizabeth Russey Roke, and Erika Farr, “Flexible Processing and Diverse Collections: A Tiered Approach to Delivering Born Digital Archives,” Archives and Records 37, no. 1 (2016): 4, http://dx.doi.org/10.1080/23257962.2016.1139493.

11. J. Gordon Daines III, Processing Digital Records and Manuscripts (Chicago: Society of American Archivists, 2013), 2.

12. Lois Hamill, Archival Arrangement and Description: Analog to Digital (Lanham, MD: Rowman & Littlefield, 2017), 52.

13. Hamill, Archival Arrangement and Description, 78.

14. Heather Ryan and Walker Sampson, The No-Nonsense Guide to Born-Digital Content (London: Facet Publishing, 2018).

15. Ryan and Sampson, The No-Nonsense Guide, 156.

16. Trevor Owens, The Theory and Craft of Digital Preservation (Baltimore: Johns Hopkins University Press, 2018), 130–31.

17. Owens, The Theory and Craft, 128–86.

18. Jefferson Bailey, “Disrespect des Fonds: Rethinking Arrangement and Description in Born-Digital Archives,” Archive Journal (June 2013), https://www.archivejournal.net/essays/disrespect-des-fonds-rethinking-arrangement-and-description-in-born-digital-archives.

19. Bailey, “Disrespect des Fonds.”

20. Geoffrey Yeo, “Bringing Things Together: Aggregate Records in a Digital Age,” Archivaria 74 (Fall 2012): 43–91, https://archivaria.ca/index.php/archivaria/article/view/13407; Jane Zhang, “Original Order in Digital Archives,” Archivaria 74 (Fall 2012): 167–93, https://archivaria.ca/index.php/archivaria/article/view/13410.

21. Susan Thomas, “PARADIGM: A Practical Approach to the Preservation of Personal Digital Archives,” PARADIGM Final Report (Oxford, UK: 2007), 11, http://paradigm.sers.ox.ac.uk/projectdocs/jiscreports/ParadigmFinalReportv1.pdf, captured at https://perma.cc/4EHD-5V94.

22. Thomas, “PARADIGM,” 25.

23. Paradigm Project, Workbook on Digital Private Papers (Oxford, UK: 2005–2007), https://ora.ox.ac.uk/objects/uuid:116a4658-deff-4b06-81c5-c9c2071bc6d0.

24. AIMS Work Group, “AIMS Born-Digital Collections: An Inter-Institutional Model for Stewardship” (2012), i, https://dcs.library.virginia.edu/aims/white-paper.

25. AIMS Work Group, “AIMS Born-Digital Collections,” 31–43.

26. Educopia Institute, “OSSArcFlow,” 2017–2020, https://educopia.org/ossarcflow.

27. Elvia Arroyo-Ramirez et al., “Levels of Born-Digital Access,” Digital Library Federation, February 2020, https://osf.io/hqmy4.

28. There are likely many sources that state this, but here are a few specifically: Stollar Peters, “When Not All Papers Are Paper,” 34; AIMS Work Group, “AIMS Born-Digital Collections,” 38; and Philip C. Bantin, “Strategies for Managing Electronic Records: A New Archival Paradigm? An Affirmation of Our Archival Traditions?” Archival Issues 23, no. 1 (1998): 30, http://digital.library.wisc.edu/1793/45860.

29. Although membership fluctuated during the two-year project, the ten people central to the creation of the Framework were Susanne Annand, Sally DeBauche, Erin Faulder, Martin Gengenbach, Karla Irwin, Julie Musson, Shira Peltzman, Kate Tasker, Laura Uglean Jackson, and Dorothy Waugh.

30. Digital Processing Framework Group, “Digital Processing Policy Framework Charter” (Google Doc, 2016).

31. For lists and descriptions of various tools, see the following: AIMS Work Group, “AIMS Born-Digital Collections,” Appendix G, 125–35; Matthew Kirschenbaum, Richard Ovenden, and Gabriela Redwine, “Digital Forensics and Born-Digital Content in Cultural Heritage Collections” (Washington, DC: Council on Library and Information Resources, 2010), 70–84; Shein, “From Accession to Access,” 41–42; Kari Smith, “Tools for Understanding Digital Files,” MIT Libraries LibGuide, 2012, https://libguides.mit.edu/digitalarchivestools.

32. Digital Processing Framework Group, “Born Digital Processing,” Zotero, 2016, https://www.zotero.org/groups/632302/born_digital_processing.

33. Jarrett M. Drake, “RadTech Meets RadArch: Towards A New Principle for Archives and Archival Description” (panel talk, 2016 Radcliffe Workshop on Technology & Archival Processing, 2016), Medium, On Archivy (blog), https://medium.com/on-archivy/radtech-meets-radarch-towards-a-new-principle-for-archives-and-archival-description-568f133e4325, captured at https://perma.cc/XZ7P-86GA; Jason Evans Groth, “Let the Bits Describe Themselves: Arrangement and Description of Born Digital Objects,” North Carolina State University Libraries, October 6, 2014, https://www.lib.ncsu.edu/news/special-collections/let-the-bits-describe-themselves%3A-arrangement-and-description-of-born-digital-objects, captured at https://perma.cc/4EK8-GMLV; John Langdon, “Describing the Digital: The Archival Cataloguing of Born-Digital Personal Papers,” Archives and Records 37, no. 1 (2016): 37–52, http://dx.doi.org/10.1080/23257962.2016.1139494; Christopher A. Lee, “A Framework for Contextual Information in Digital Collections,” Journal of Documentation 67, no. 1 (2011): 95–143, http://dx.doi.org/10.1108/00220411111105470.

34. eCommons Open Scholarship at Cornell, “Digital Processing Framework.”

35. Kirschenbaum, Ovenden, and Redwine, “Digital Forensics and Born-Digital Content in Cultural Heritage Collections,” 26.

36. Pearce-Moses, s.v. “processing,” Glossary.

37. Zhang, “Original Order in Digital Archives,” 189.

38. Greene and Meissner, “More Product, Less Process,” 222.

39. Owens, The Theory and Craft, 134.

40. Owens, The Theory and Craft, 134.

41. Kim, Dong, and Durden, “Automated Batch Archival Processing,” 92.

42. AIMS Work Group, “AIMS Born-Digital Collections,” vii.

43. Alexandra Chassanoff and Colin Post, OSSArcFlow: Guide to Documenting Born-Digital Archival Workflows (Atlanta: Educopia Institute, 2020), 51, https://educopia.org/wp-content/uploads/2020/06/OSSArcFlow_Guide_FINAL.pdf, captured at https://perma.cc/NJ46-HREE; see also the website for this project at https://educopia.org/ossarcflow.

44. Pearce-Moses, s.v. “processing,” Glossary, https://www2.archivists.org/glossary/terms/p/processing.

45. Arroyo-Ramirez et al., “Levels of Born-Digital Access.”

Author notes

Erin Faulder is the assistant director for Digital Strategies at Cornell University Library's Division of Rare and Manuscript Collections (RMC). She leads RMC's digital program by coordinating efforts with other RMC units to develop policies and processes for accessioning, arranging and describing, and providing access to born-digital and digitized collections. In collaboration with library colleagues, Faulder advances the library's digital preservation and access ecosystem to serve users. Previously, Faulder was the digital archivist at Tufts University's Digital Collections and Archives where she oversaw the collections management software and the Tufts Digital Repository.

Laura Uglean Jackson is archivist for University Collections at the University of Northern Colorado, where she plays a central role in collections management, accessioning, and processing of materials in all formats. Previously, she worked at the University of Wyoming American Heritage Center and the University of California, Irvine. She has a BA in art history from Colorado State University and an MS in library science from Simmons College.