This article surveys and analyzes archival literature and legal resources (primarily United States–focused) related to copyright considerations that archivists and other content managers must be aware of to effectively and legally maintain a collection of born-digital materials. These considerations include the centrality of copying to preservation actions, shifting definitions of ownership, unclear distinctions between published and unpublished content, digital rights management laws and technologies, and the layered copyrights that can exist in complex digital objects and their dependencies. Strategies for dealing with these challenges include securing rights ahead of time, adopting legal rationales related to orphan works and fair use, adapting practices from specialized digital preservation subfields, ensuring routine procedures adequately address copyright-related recordkeeping and risk management, and advocating for preservation-enabling copyright reforms. An examination of these issues and strategies in the context of current thinking about copyright suggests that while certain legal exceptions and existing rights frameworks can help to facilitate digital preservation activities, copyright will continue to be a barrier until significant reforms are enacted.
In 2015, in an American Libraries article on digital preservation,1 former American Library Association president James G. Neal raised but did not attempt to answer a number of legal questions related to digital preservation:
It will be challenging to create a robust and successful born-digital content preservation capacity without new thinking about copyright. Libraries are capturing and preserving digital materials as fair use. Efforts to produce new exceptions or limitations in Section 108 of the Copyright Act for purposes of digital preservation have not been successful. Our law is out of sync with technology and user needs. Where does the preservation of born-digital content intersect with orphan works, with transformative use, with the public interest? What should be the relationship between licensing and copyright limitations? What about the issue of open content and proprietary rights? How do we manage national copyright provisions in a global networked context?2
This straightforward summary captures some of the key questions that affect all institutions engaged in digital preservation work, from newly emerging, bare-bones operations to well-funded, long-established programs. Neal's questions are, by and large, the same ones digital preservationists have been asking for the last twenty years. The venue of Neal's article is a hopeful sign that such questions have moved from the periphery to the center of the information world's radar, but the current state of the field suggests that satisfactory answers are still far off.
In this article, I will survey major copyright issues affecting the preservation of digital content, consider the implications of select cases shaping the legal understanding of electronic materials, and outline possible strategies and future steps for dealing with the largely untested intersection of copyright and digital preservation. I am not a lawyer, and this article does not constitute legal advice; rather, I assemble and analyze resources for informational purposes and to reflect on ways that the current United States copyright regime3 might affect the work of digital preservation. As an archivist, I work primarily with personal papers and institutional records and am writing with these types of collections in mind; however, the issues I raise are also relevant to librarians, curators, and others who steward digital content, including born-digital publications, research data, digital scholarship projects, born-digital art and music, and web archives.
Substantial attention has been paid to the copyright challenges inherent in digital archives work,4 and published works focused on digital archives or digital preservation typically address copyright to some extent,5 but these texts tend either to provide only a cursory overview or to focus their copyright discussions on the implications of digitizing and expanding current access to analog materials. The literature specifically exploring intellectual property issues related to born-digital content and the preservation of digital assets—that is, challenges affecting archivists' ability to simply collect and maintain digital content for long-term access and use, not necessarily reproduce and distribute it at present—is comparatively sparse. The few available treatments, while providing an essential foundation, do not reflect the latest legal developments, as most were written ten or more years ago6 and often focus on copyright regimes outside the United States.7 Bringing up-to-date analysis of legal options and best practices for copyrighted digital materials into the literature, as well as devoting more attention to the unique intellectual property questions raised by born-digital files, will benefit archivists seeking to understand and make informed decisions about their digital collections.
Although there has been general agreement among cultural heritage practitioners since at least the 2003 adoption of the United Nations Educational, Scientific and Cultural Organization (UNESCO) Charter on the Preservation of the Digital Heritage8 that copyright law must permit digital preservation processes, the information and cultural heritage professions have not yet, on the whole, established a shared understanding or framework of practices related to born-digital intellectual property. This leaves the archival profession ill equipped to preserve certain valuable records that, unlike many of the older materials archivists care for, are generally not freed from copyright constraints by public domain status, lack of registration or notice, or nonrenewal. Preservation-related copying in archives carries a relatively low risk of prosecution and liability, particularly when materials are not made publicly accessible, and thus archivists may reasonably decide to proceed with preservation actions despite inadequate legal provisions; however, archivists, parent institutions, legal counsel, and archival service providers would benefit from clearer precedents and guidelines to aid in accurately interpreting the copyright status of collection materials and confidently gauging the risks of specific preservation actions.
As one widely used foundational text on digital curation and preservation puts it in what might be read as dry understatement, “The legal constraints on our ability to curate, share, and reuse data are complex.”9 US copyright law is notoriously byzantine, and issues of intellectual property and ownership are particularly challenging for digital materials, not least because the technological changes that have occurred since the current legal framework was enacted in the 1970s have left the law out of step with systems of production, publication, and distribution and require archivists to “try to understand the law within a context that was never anticipated.”10 When attempting to curate and preserve digital content, archivists deal with added complications and unanswered copyright questions stemming from the technologies used to create digital content, changing norms around distribution, and restrictive licensing practices. In short, “copyright's ability to trammel digital curation remains substantial.”11 I have found, while leading the development of preservation policies and workflows and teaching other archivists about born-digital archives, that the actual or perceived constraints copyright imposes on preservation-related actions can be overwhelming to the point of inaction for archivists stewarding born-digital collections.
A detailed overview of digital archives' copyright problems appears in a 2003 Council on Library and Information Resources (CLIR) report by lawyer and copyright expert June Besek. In this report, one of the earliest detailed analyses of copyright issues related to digital records and archives, Besek enumerated the legal barriers to responsibly acquiring and reliably ensuring the long-term accessibility of digital content:
the increasing prevalence of content licenses that permit access without ownership and disallow saving or distributing copies on physical media
the lack of legal deposit requirements in place for online publications
the unclear publication status of much digital information
the existence of publishing contracts that predate the development of digital formats and dissemination platforms
limits on ownership and control of collective works
the difficulty of tracking ownership of works created in the period since mandatory registration and renewal ended
orphan works (works that are protected by copyright but whose owners cannot be contacted or identified)
the Digital Millennium Copyright Act (DMCA),12 1998 legislation that incorporates two previous World Intellectual Property Organization (WIPO) treaties into US law, makes circumventing copy-protection measures illegal, and limits the liability of internet service providers for copyright infringement
international differences in copyright law
the need to accommodate conditions rights holders place on ingest and reuse of their works (e.g., user agreements, download restrictions, etc.)13
A 2017 discussion document from the US Copyright Office acknowledges that “the very nature of embodying works in digital formats . . . implicates copyright law in fundamentally new ways.”14 The nature of digital content thus forces a reassessment of archives' understanding and use of copyright law. Intellectual property laws have implications for many stages of the Digital Curation Centre's Curation Lifecycle Model:15 they should be understood and documented during conceptualization and creation, they are a factor in appraisal and selection, they should be thoroughly recorded in object descriptions, they limit the options for storing or migrating objects, they restrict access, they dictate what constitutes permissible reuse and transformation, and they have the potential to complicate and undermine preservation. To make informed decisions about stewarding digital material throughout its life cycle, archivists should be familiar with the central conflicts and unanswered questions connected to digital preservation and copyright.
Copyright Challenges that Hinder Preservation
The primary impediments copyright poses to the work of digital preservation are both conceptual and practical. The challenges I believe are most connected to the nature of born-digital materials and most likely to affect the work of preservationists include the centrality of copying to preservation actions, shifting definitions of ownership, unclear distinctions between published and unpublished content, digital rights management (DRM) laws and technologies, and the layered copyrights that can exist in complex digital objects that depend on external content, software, or specifications. All of these issues are exacerbated by broader challenges facing many archivists, such as declining funding, inadequate staffing, and increasing expectations for the immediate availability of content. I will not dwell on these problems, as they are not specific to digital preservation and are covered elsewhere in the literature, but it is important to note that they may hamper archivists' ability to gain essential expertise in intellectual property and copyright and secure the time and resources needed to manage and track copyright in born-digital collections.
The Necessity of Copying
The most fundamental tension between copyright law and digital preservation centers, unsurprisingly, on copying. While copyright laws serve to place limitations on the copying of intellectual property, digital preservation cannot occur without extensive copying. Not only is copying fundamental to the computer systems and technologies that allow digital materials to be created, transferred, and used—even before an archivist begins any deliberate copying of a digital object, copies may be created within a device's memory or a browser's cache simply through the action of viewing or opening a file16—but copying is crucial for the long-term survival and integrity of those materials.
Unfortunately, current law and legal precedent do not adequately address this reality. Preservation of analog materials in their original format generally does not require any activities restricted by intellectual property law, and preservation of analog materials via reformatting can often be accomplished under existing legal carve-outs. These include the US Copyright Act's Section 107, which outlines a rationale for fair use—that is, limited uses of copyrighted materials that do not require permission from copyright holders—and identifies the four factors (purpose of use, nature of work, amount used, effect on the market) used to assess fair use defenses against claims of infringement, and Section 108, which grants certain archives and libraries limited preservation-related exceptions to reproduction restrictions.17 But many of the most basic tasks involved in digital preservation—“making multiple copies of a work, distributing copies among multiple institutions, and migrating works to new technological formats and media”—“involve the exercise of exclusive rights including but not limited to the reproduction right.”18
As archivists ingest, accession, process, arrange, redact, normalize, and back up digital records, their computers inevitably create copies. As Aprille McKay has pointed out, “there is no way to process born-digital materials for the archives without technically breaching the rights owner's exclusive rights under copyright,”19 leaving these preservation activities in legal gray areas.20 Beyond archivists' concerns about whether their unavoidable copying qualifies as fair use are questions about the very nature of the copies they create. When the copies are not exact duplicates, it may be difficult to determine whether the transformed versions are derivatives, new editions, or something else entirely. These may seem like trivial distinctions with no practical bearing on archivists' decision-making, but I believe questions about the nature of copies are worth asking because the answers might affect their copyright status or inform an archivist's fair use defense.
Redundancy—or creating and storing multiple copies of digital materials—is an integral practice under existing standards and frameworks for digital preservation.21 Redundant copying also underlies the idea of LOCKSS (Lots Of Copies Keep Stuff Safe) networks.22 An inverse relationship exists between the safety of digital materials and copyright-related risk mitigation. The more copies a collecting institution creates and disseminates, the more likely its collections are to survive—but also, the farther it moves from the established preservation limits codified in Section 108, and thus, hypothetically, the greater the chances of a copyright complaint. In their current form, Section 108's provisions, which cap the creation of copies at three, do not explicitly permit the type and degree of reproduction required for digital preservation. Nor do they adequately accommodate the common practice of transmitting and storing multiple copies off-site in the care of a vendor or partner organization to increase redundancy and protect against threats that might affect a particular geographical location. The shift toward outsourced preservation and asset management may prompt questions about the liability of service providers and preservation networks under the DMCA, their responsibility for infringing content hosted on their servers, and whether they are covered under statutory exemptions or fair use arguments as extensions of the repository on whose behalf they act.23
Further issues around copying emerge from the ongoing shift in common perceptions and expectations in a networked world where digital objects are easily and frequently replicated.24 While copying for preservation is done under controlled conditions, unlike the casual replication that characterizes much online sharing, both approaches are part of a sea change in how users of digital materials perceive the significance of any given act of copying. The ease of copying and sharing in a digital environment has influenced awareness and interpretation of intellectual property laws among the general public and even legal professionals, leading in some cases to the sense that because copying is simple and common, it must or should be acceptable regardless of legality.25 These shifting norms and behaviors could make it challenging for archivists to anticipate how creators, copyright owners, or users will understand the intellectual property residing in digital items and what they will expect in terms of making and managing copies. Recognizing that users may be accustomed to freely downloading, copying, and sharing digital content but may not be familiar with or attuned to the copyright implications of those actions, archivists have grown concerned about legal liability stemming from researchers' unsanctioned use of digital materials made available online.26 Because “the very nature of digital material makes copyright easier to infringe and such infringements . . . potentially much more visible,”27 these concerns might influence archivists' thinking about present and future risks associated with acquiring and storing digital content.
Shifting Definitions of Ownership
A 2002 Research Libraries Group (RLG)–Online Computer Library Center (OCLC) report on Trusted Digital Repositories (TDR) points out that whereas “responsibility for preservation has traditionally been considered alongside ownership of the materials,”28 ownership of digital materials is less straightforward and, consequently, often divorced from preservation incentives and responsibilities. Conversely, whereas traditionally there has been a clear distinction between ownership of an object and ownership of that object's intellectual content, this line is blurred in a digital context. Legal thinking about what it means to own a copy of a digital object, particularly when not stored on a physical medium in the owner's possession,29 is still evolving,30 as are models for understanding the relationship between ownership of copies, on the one hand, and ownership of intellectual property, on the other, for electronic data.
Case law has not yet provided straightforward guidance on ownership-related issues such as how the first sale or exhaustion doctrine, a statutory limitation on the copyright owner's exclusive distribution rights that allows the purchaser of a particular copy to resell that copy without authorization, functions in relation to digital objects.31 Nor is there a definitive answer in case law to the question of whether digital data is property (whether tangible or intangible) that can convey to a new owner.32, 33 To bring these questions into an archival context: Can a donor legally transfer ownership of their ebook or mp3 collection? Is a donor authorized to donate electronic records originally created by someone else? Does physical possession of a disk entitle a repository to create derivatives for preservation? As noted in a 2016 white paper from the US Department of Commerce's Internet Policy Task Force, the doctrine of first sale is fundamental to the preservation of cultural materials; if digital content is subject to more restrictions on ownership and transferability than analog materials are, it is less likely to be preserved and made available for the public benefit long term.34 What might otherwise seem like an arcane legal matter thus has real implications for the future collecting and preservation abilities of archives and other cultural heritage institutions.
Beyond the definitional problems of ownership in digital content is the more familiar challenge of identifying copyright owners. Archivists are accustomed to unattributed and orphan works among analog collections, but the problem is magnified for born-digital content by its volume. The sheer amount of digital materials a donor might have acquired, along with the now-common intermingling of personal and professional assets and the ease with which files can be shared, copied, and downloaded, means that their collection likely represents the intellectual property of many people. Determining the intellectual ownership and copyright status of each item is likely impossible, and, unlike for analog materials of mixed ownership, archivists do not yet have widely used appraisal workflows and rights review strategies to assist with managing this challenge at scale.
Locating owners is particularly problematic in the area of web archiving. Online works are often “orphaned” because many are created informally and collectively, making creators difficult to contact for permission to archive.35 As a result, curators and collecting institutions commonly store and reproduce archived sites without asking permission.36 Even if they never face legal challenges over their web archives, they are in effect pushing downstream the uncertainty about whether and how the content can be reused in the future.37 However, the 2012 Association of Research Libraries (ARL) Code of Best Practices in Fair Use for Academic and Research Libraries asserts that, with some limitations, “it is fair use to create topically based collections of websites and other material from the Internet and to make them available for scholarly use.”38 While no statutory guidance or case law can yet provide definitive parameters for these activities, the Code of Best Practices provides some reassurance and a legal rationale for institutions actively archiving web pages created outside their own organizations.
A final category of preservation concerns related to ownership involves content that institutions do not own but simply license. As collecting practices and use patterns evolve, this model is becoming a common way to provide access to media and scholarship. To some scholars and practitioners, this shift to a collecting environment dominated by contracts for access rather than one in which institutions own copies of materials poses the greatest threat to their ability to build and manage collections for the future.39 Without a legal mandate to collect40 or a legal exception to restrictive contracts and licenses allowing for the creation and modification of copies in the service of preservation, libraries and archives likely face the assumption of greater risk—either in a legal sense or in terms of their collections' long-term survival, depending on their course of action.
Because “most licensing agreements are still perilously vague about . . . how long-term access will be ensured”41 and because layers of ownership in content presumably deeded to the repository might still be unclear, curating digital collections involves additional legal considerations, metadata, and workarounds to ensure preservation and future access. But “a piecemeal approach to issues like ownership of digital cultural resources undermines the ability to find legal solutions across cultural institutions” to the legal risks that accompany preservation and access.42 The RLG-OCLC TDR report acknowledges the need to “work as closely as possible with content creators”—whether authors, researchers, publishers, or software developers—to keep track of rights information and accommodate preservation functionality;43 almost twenty years after the report was written, still no widely adopted framework or scalable model enabling content creators and curators to partner on preservation efforts exists.
Unclear Publication Statuses
US copyright law is not clear about basic questions such as “whether content on the Web or in databases should be treated as published or unpublished”44 and what types of online dissemination constitute publication,45 distinctions that have significant implications for the length of copyright terms, opportunities for reuse, and rights-related tracking of digital works. Section 101 of Title 17, the portion of the United States Code pertaining to copyright and based largely on revisions enacted by the Copyright Act of 1976,46 defines publication as “the distribution of copies or phonorecords of a work to the public by sale or other transfer of ownership, or by rental, lease, or lending” and notes that while distribution for the purposes of display or performance is also considered publication, “a public performance or display of a work does not of itself constitute publication.”47 According to the Compendium of US Copyright Office Practices, Section 1902, distribution only constitutes publication when authorized by the copyright owner.48 While some digital content—for example, email clearly intended as private correspondence and not for public distribution—can be categorized as unpublished in the same way its analog precursor would be, in other cases the ease with which digital files are reproduced and disseminated, along with the highly visible, connected nature of the Web, makes publication status less clear. Is posting a photo on social media an act of distribution or display? How does the copyright owner's intent or the specific tool or platform factor in? If someone other than the copyright owner distributes the photo beyond its initial posting, can the owner be assumed to have authorized its distribution by virtue of placing the image in a location that encourages resharing? 
The ease with which digital files are shared and hosted around the world may also make it difficult to determine the nation of first publication and therefore which copyright regimes' protections and formalities apply.
Determining the publication status and date for analog archival materials is already difficult; for born-digital files, it can be even more challenging. Although digital objects often come with embedded metadata and MAC (modified, accessed, created) dates, these can be misleading, are easily altered, and might indicate the creator or creation date of a copy or derivative rather than of the original item. But even if one can determine the date when the content was fixed in its current form, classifying that content as published or unpublished remains a challenging exercise because of the fuzziness of those categories in a digital, networked environment. One reason this distinction matters, apart from the fact that the length of copyright protection differs for published versus unpublished works, is that Section 108 of the Copyright Act allows library-created preservation copies of unpublished works to be shared with other institutions, whereas copies of published works may be accessed only on the premises of the institution that holds the original.49
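The caution above about MAC dates can be illustrated with a minimal Python sketch. It assumes a hypothetical donor file (`letter.txt`) and an arbitrary backdated timestamp chosen only for illustration, and it shows two things: a file's modified date can be rewritten at will, and a routine copy may carry the date of the copying rather than the date of the original. It is a sketch of the general behavior, not a description of any particular repository's workflow.

```python
import os
import shutil
import tempfile


def mtime(path):
    """Return the file's last-modified time as whole seconds since the epoch."""
    return int(os.stat(path).st_mtime)


with tempfile.TemporaryDirectory() as workdir:
    # Hypothetical donor file; the name and contents are illustrative only.
    original = os.path.join(workdir, "letter.txt")
    with open(original, "w", encoding="utf-8") as f:
        f.write("Draft written long before accession.\n")

    # Backdate the file to 1 January 2001 (UTC): timestamps are easily altered.
    backdated = 978307200
    os.utime(original, (backdated, backdated))

    # A plain copy gets a fresh modified time, i.e. the date of the copy.
    plain_copy = os.path.join(workdir, "plain_copy.txt")
    shutil.copy(original, plain_copy)

    # copy2 additionally attempts to carry the source's timestamps to the copy.
    faithful_copy = os.path.join(workdir, "faithful_copy.txt")
    shutil.copy2(original, faithful_copy)

    original_mtime = mtime(original)
    plain_mtime = mtime(plain_copy)
    faithful_mtime = mtime(faithful_copy)

print("original shows backdated time:", original_mtime == backdated)
print("plain copy looks newer than original:", plain_mtime > original_mtime)
print("copy2 kept the original's time:", faithful_mtime == original_mtime)
```

Neither timestamp says anything reliable about when the content was first fixed or distributed, which is why file system dates are at best one piece of evidence in a rights assessment.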
Digital Rights Management Technology
One of the most concerning barriers to digital preservation of published materials has emerged in the wake of the DMCA, which initiated the widespread adoption of technological protection measures (TPMs) for digital rights management (DRM) by publishers and other rights holders. These practices are a reaction to the ease of copying and sharing in a digital environment and aim to make copyright infringement more challenging. They also, unfortunately, pose a number of challenges for educational and heritage organizations,50 often preventing libraries and archives from creating copies even for noncommercial, mission-driven purposes like education and preservation. DRM practices can impose technical and legal barriers to actions like decryption, duplication, annotation, and transformation, all of which are key aspects of digital preservation and curation.
The DMCA deals only with digital materials and so will become more relevant to archivists as born-digital holdings increase. Heather Briston elaborates several problems the DMCA could cause for archivists:
“If a donation includes an encrypted hard drive and the password is not given to the repository, it is impossible to legally access and preserve the material that the archives program legally owns.”
“Many software companies are defaulting to copy protection/encryption of documents as a matter of course in creating and saving materials based on the demands of creators for security and privacy of information,” meaning that some significant records and correspondence are likely to be inaccessible to archivists in the future without advance planning and intervention.
The DMCA “makes circumventing copy protections . . . illegal,” preventing archivists from capturing preservation copies of DRM-protected commercial content even when otherwise defensible under Section 108 or fair use.51
The reasons for TPMs in archival holdings will vary—software encryption protecting a donor's personal files is based on a different rationale than anti-piracy measures imposed by publishers, for example—so the possible risks and consequences for circumvention will vary as well. But, as Briston argues, these portions of the DMCA “are particularly antithetical to the work of archivists as currently defined and applied”52 because, regardless of intent, they interfere with archivists' ability to preserve content.
Any cultural heritage institution that collects published sound recordings or e-books is likely to encounter technological restrictions on the ability to copy and preserve that content.53 DRM might take the form of encryption, persistent authentication, limiting the ability to record or copy, or restricting the content to specific devices or software platforms. As a result, fair use is obstructed and copying that might otherwise be legally defensible becomes illegal under the DMCA or the contracts that govern use of the content. Even if an archives has permission to copy DRM-protected content, it risks running afoul of DMCA provisions forbidding the use of tools to break encryption or bypass other protections. Ideally, archives would “work with right[s] holders to develop workable approaches to the digital preservation of copyright materials protected by technological measures such as encryption or copy protection.”54 But this approach is difficult to scale,55 and, furthermore, given high-profile infringement prosecutions of individual consumers and protracted public battles over control of intellectual assets, “record companies and other rights holders [are likely to be] wary of cooperative preservation projects in which files might be shared between archives.”56 Without changes to laws “to allow digital preservation to be undertaken as necessary,”57 clearer preservation exemptions, and test cases weighing Sections 107 and 108 against the digital-only restrictions of the DMCA, DRM will continue to “[harm] long-term prospects for preservation of digital information by making content difficult, impossible or illegal to copy or convert.”58
Multilayered Rights in Digital Objects
Because of the technologies employed to create much digital content, additional layers of copyright might inhere in digital objects. Beyond the ostensible creator's intellectual contribution, digital files created in proprietary environments can contain or reflect the intellectual property tied to their file formats, the software that generated them, or the types of encoding used. As the RLG-OCLC TDR report points out, “the content creator does not usually own the rights to the software and systems used to create the digital file,” hence the need to use open software and formats when possible to avoid “legal issues when access or changes to those systems are necessary.”59
Even if an archives owns a copy of a digital object, it may not always have the right to use any required software to render it for certain purposes. If proprietary software is required to view or use the object, the archives may need to preserve that software, which introduces an additional set of risks and complications related to copyright restrictions and license agreements. The rights-related risks associated with preserving and rendering certain digital objects are one reason that “openness” or “independence” (not requiring a specific hardware or software platform to function) is often a key consideration when evaluating the longevity of a particular file or format.60
Attempting to maintain digital content or tools built using existing, copyright-protected materials has legal drawbacks, as evident in recent cases dealing with reuse of software code by entities other than the original creators. Google v. Oracle America,61 the culmination of a multiyear battle that went before the Supreme Court in late 2020,62 hinges on whether application programming interfaces (APIs) warrant copyright protection and whether using or including them as dependencies in other programs or devices constitutes infringement. While Google's commercial use of Oracle's APIs is not analogous to the uses archives would likely make of similar code, the case raises questions about what constitutes fair use in complex software environments. As archives increasingly aim to preserve and provide access to collections and objects that rely on layered software programs, format specifications, programming languages, and other elements, each with its own intellectual property rights, understanding the status of the different components and the implications of their interactions will become more important.
Intellectual property rights in formats and software are not the only rights held by parties other than primary creators that can complicate preservation. In settings focused on the preservation of research materials or institutional records, curators might have to consider not only the rights of the researchers who create the content but also any rights affixed to the databases in which the content is deposited.63 Records slated for curation may also be derived from sources that are copyright protected and may therefore involve permissions requirements or use restrictions.64
Understanding the layers of rights in complex digital objects and collections can be particularly challenging because it is not always clear what type of intellectual property a digital item may be and what laws and requirements apply. Computer programs, for example, “are eligible for copyright, but there is considerable confusion about what such a copyright protects or should protect”:65 the source code of the software, the object code, or the look and feel of the user interface.66 Without a solid understanding of what exactly is protected, it is difficult for archivists to accurately assess the risk of preserving or reusing software components. Research data is another category of content with unclear protections. Raw data are considered to be factual information under US law and so are typically not protected by copyright, but their status as intellectual property generally is unclear,67 and, regardless of that status, they may be protected from copying, transformation, and reuse by distribution licenses or restrictions imposed by government or other funding bodies.
Strategies to Aid Preservation of Copyrighted Material
Many of the problems librarians, archivists, and other digital curators face in dealing with copyright-protected materials stem from the fact that “copyright law is created by representatives from various industries debating specific points that they feel are crucial to their respective business models” and that result in “very specific legislation, often with unintended consequences for those with no lawyer at the bargaining table.”68 To deal successfully with these unintended consequences, archivists need to build strong relationships with their organizations' legal counsel, collaborate with scholarly communications and copyright librarians when available, and advocate collectively for new tools and legal reforms.
We are beginning to see more court cases and developing precedents related to copyright in the digital realm (e.g., Authors Guild v. Google,69 Authors Guild v. HathiTrust,70 and Cambridge University Press v. Patton71). While these cases give some idea of how library-based or preservation-focused digital initiatives might fare in copyright litigation, the cases primarily focus on sharing and reuse, largely of digitized content, leaving digital preservation functions still to be addressed. It is difficult to extrapolate from these outcomes and speculate about whether certain new technologies and practices will stand up to legal scrutiny. But there are actions archivists can take to improve archives' ability to preserve born-digital content in the face of uncertainty, as well as subjects for further study and advocacy. Areas to focus on include securing rights ahead of time, adopting legal rationales related to orphan works and fair use, adapting practices from specialized digital preservation subfields, ensuring that standard procedures adequately address copyright-related recordkeeping and risk management, and advocating for preservation-enabling copyright reforms. These strategies will not solve all the challenges previously discussed, but they offer archivists some useful tools and models in the absence of a legal framework that is designed for the digital world and values cultural heritage preservation as much as commercial interests.
Secure Rights in Advance
The ideal approach whenever possible is to obtain any required rights and permissions up front. Thorough release forms, deeds of gift, or donor agreements will transfer or license the rights required for digital preservation. These rights might include the ability to transfer the content from its original media, create derivatives through normalization, make copies, store copies in distributed locations, crack passwords, recover deleted files or file fragments, crawl web pages or social media profiles, sublicense essential rights to service providers, and carry out any other dissemination and reproduction needed to support the institution's core functions. If the agreement governing the born-digital materials does not grant sufficient freedom to maintain the content according to best practices for digital preservation, it may be worth renegotiating the terms of the gift or preparing an addendum.
Conversations with donors about their deeds of gift are an ideal time to obtain information about any content with third-party rights; this might include published digital materials the donor has purchased or licensed, files downloaded from the Internet, software programs, or documents created by another person on a shared device. Even if the donor created all of the content, it is helpful to ascertain whether there might be other copyright claims based on assignment of rights to publishers or work made for hire. It is also important to identify and track any licenses the donor has granted in the materials, whether by selling exclusive licenses for commercial purposes or sharing items online under a Creative Commons license, as these prior arrangements may limit how much the archives can or wishes to invest in preservation. Asking the donor or creator to address ownership and rights issues at this stage allows an archives to reduce the risk it takes on when preserving digital content and to rely less heavily on fair use.
For any content that archives acquire directly from publishers or other vendors, securing rights in advance might mean negotiating whenever possible for the ability to preserve the licensed content. Vendor contracts that provide copies of perpetual access files or grant future access to titles preserved by Portico or a similar service give a higher level of assurance of long-term accessibility.
Adopt Best Practices for Orphan Works and Fair Use
Edward M. Corrado and Heather Moulaison Sandy argue that, ideally, “regardless of the source of content being digitally preserved, copyright issues should be investigated to make sure the proper intellectual property rights have been granted that are legally required to perform the actions necessary for long-term preservation”;72 however, this is not always possible. Until archivists have more effective tools and sustainable workflows for identifying owners and tracking digital object rights at scale, they must rely largely on existing legal frameworks and calculated risks.
Orphan works are a worldwide problem, but particularly so in countries like the United States that have very long terms of copyright. While legal researcher David Hansen's 2016 report on digitizing orphan works73 focuses on providing online access to reformatted content rather than on digital preservation, it provides a useful framework for dealing with the large amount of born-digital content, both published and unpublished, that is orphaned because the copyright owners are unknown or unfindable. The “legal limbo” in which orphan works exist can force archivists to “forgo socially beneficial uses of the work,”74 keeping institutions from actively curating and preserving the materials.75 Short of amending copyright law to address orphan works directly, deciding that “some important uses of orphan works count as fair use,” or lobbying for an additional “judge-made exception” in the vein of fair use but specifically focusing on orphan works,76 Hansen constructed a framework for managing risks that archivists can use to encourage preservation and use. He drew on current common practices (acknowledging use of orphan works, applying fair use arguments, focusing access efforts on unpublished/unregistered works, conducting limited searches, and preparing takedown procedures) and added to them more rigorous use of Section 108 to justify orphan-works projects,77 creative legal strategies to limit potential damages in a legal challenge, and strategic permission-seeking through quitclaim deeds (in which potential but unverified rights holders give permission based on any rights they might own without warranting that they are in fact the owners) and large-scale agreements. 
Archivists might look next to legal scholars and copyright experts within the profession to extend Hansen's arguments, as well as the recommendations outlined in the Society of American Archivists' statement about best practices for orphan works,78 and consider whether the same rationale used for digitizing and sharing online can serve to justify redundant copying, normalization, and other digital preservation actions.
Fair use, when an institution determines that it supports a particular preservation function, may be a more appropriate legal rationale than an orphan-work analysis for proceeding with digital preservation.79 Archivists might also find a fair use analysis easier to conduct at the collection level or for a group of items with similar preservation needs compared to an orphan-work analysis, which requires making a reasonable effort to locate individual copyright owners and document the search. Whether and how collecting institutions provide public access to the resulting files could alter the fair use assessment, so archivists working to preserve digital content in legal gray areas may wish to take a long-range perspective and focus on future rather than immediate access. Fortunately, archivists—especially those in nonprofit, educational settings—are often in a strong position to make a fair use argument when any copying or migration is done primarily for preservation purposes. Archivists should also note, however, that the strength of a fair use analysis must not trump other ethical considerations in their decision-making about collecting and preserving digital content. Even when a sound legal defense based on fair use exists, archivists might choose in some cases not to take advantage of it out of respect for the creator's intentions or circumstances, particularly if the material is culturally sensitive or represents a vulnerable community. While copyright restrictions often limit the work of archives in troubling ways, copyright can, in some situations, empower marginalized communities to set the terms for preservation and use of their own cultural output.
When archivists do wish to invoke fair use, they should be aware of its limits and recognize that the strength of fair use arguments for preservation activities will vary depending on the nature of the materials. While fair use is often on archivists' side because archival collections contain so much unique material that is not being commercially exploited, fair use claims related to unpublished materials are often subject to greater scrutiny.80 Fair use may also be less helpful when building special collections of published digital content or preserving scholarly output in university archives–managed institutional repositories. Furthermore, because the boundaries of fair use are not precisely defined by the framework outlined in Section 107 and are instead based on an evolving set of judicial precedents and best practices, fair use is vulnerable to copyright expansions and shifts in interpretation that limit its application.
ARL's Code of Best Practices in Fair Use for Academic and Research Libraries is often quoted as stating that a “four-factor analysis . . . supports digital preservation”;81 however, read in context, this assertion clearly refers to the digitization of analog materials for preservation purposes and not to the preservation of born-digital content with its many added copyright complications. It is still far from clear how the four factors of fair use function in a born-digital context; some archivists may perceive this lack of precedent as increasing risk, although it may also present opportunities for creative, new fair use defenses. Whether a fair use argument can successfully justify digital preservation activities will ultimately need to be tested through a court case if archivists are to develop definitive guidelines. But, as most institutions do not have the will or the resources to deliberately take on such a case, archivists can reasonably proceed in the meantime by extrapolating from fair use defenses and best practices in place for other materials in their care.
Borrow from Specialized Preservation Fields
The video game and software preservation communities are proving to be models and resources for archivists working to preserve all types of digital content. Game and software preservationists have long been aware of the risk that, if copyright laws remain “the way they are, games that are on the brink of erasure could be lost forever.”82 In response to this risk, they have written detailed analyses of the copyright challenges involved in preserving these specific types of content83 and are working to develop legal rationales for everything from emulation to dealing with orphan games.84 US archives generally avoid directly challenging copyright laws that affect the ability to preserve video games;85 preservationists instead proceed by developing legal rationales based on existing precedent. The guidelines presented in the Software Preservation Network (SPN) and ARL's Code of Best Practices in Fair Use for Software Preservation,86 like those found in similar statements of best practices, “offer a bridge between community values and the seemingly abstract world of fair use, helping communities overcome fear”87 and model reasoning archivists can use in fair use analyses supporting the preservation of other types of digital content. Legal strategies advancing software preservation also support the preservation of other digital content more directly by enabling the use of software on which that content might depend.
Relying in part on fair use, software preservationists have also invested heavily in building tools and infrastructure to support emulation as a preservation strategy. This approach may avoid some copyright pitfalls connected to the preservation of complex digital objects like computer programs, databases, and games by lessening the need for routine migration, alteration, and reproduction. But emulation still implicates issues of ownership and licensing, as it can require modifications to original software, media transfer from original storage devices to managed systems, and potential circumvention of the technological protection measures (TPM) that became common after the DMCA went into effect to help copyright holders enforce their exclusive rights.88
Fortunately, the SPN and all archivists working with born-digital content achieved a notable victory when, in 2018, the Library of Congress granted a three-year exemption89 allowing archives and libraries to circumvent TPM on computer programs, including operating systems, video games, and other types of software.90 The exemption applies only when the program is not readily available commercially, the “sole purpose of the circumvention activity [is] for lawful preservation of the computer program or digital materials dependent on a computer program,”91 and the archives' use is noninfringing (that is, justifiable under Section 107 or 108 or conducted with the permission of the copyright owner). Even with limitations, this rule significantly reduces the risk archives assume when preserving digital content and the computer programs on which it depends. The policy also signals the possibility of further compromises between copyright owners and archives in the future.
Address Copyright in Routine Procedures
While much of the literature recommends changes to copyright law, such reforms can face substantial opposition and are often slow to occur. Fortunately, occasional exceptions occur within the general trend of ever-longer copyright terms. A notable victory in support of preservation is a provision in the 2018 Music Modernization Act (MMA) allowing libraries to treat certain works as if they are out of copyright when they are near the end of their terms of protection and not being commercially exploited. This provision may eventually support preservation efforts for digital sound recordings, assuming the files survive long enough to qualify for the exception. But, in the meantime, especially when they are stewarding works not addressed by legislative changes like the MMA, archivists and curators need concrete best practices and tools with which to navigate the current US copyright landscape.92 Several recommendations can be synthesized from the literature and from the experience of working archivists:
Make rights analysis and risk assessment routine parts of ingesting digital content. Unlike analog materials, which can generally be acquired and used in limited ways without infringing copyright regardless of their status, potentially infringing actions on born-digital materials begin at the point of accessioning.
Work to promote among colleagues, donors, researchers, and administrators a thorough understanding of the rules that apply to works created in different years, formats, and contexts.
Prepare and maintain written documentation of any rationale for proceeding with copying and other potentially infringing preservation actions under a legal exception. This writing can assist the institution in the event of an infringement claim, demonstrating good faith and possibly limiting damages.93
Document all known rights information and rights-related decisions in metadata, accession records, or catalog records. Store rights information alongside digital objects or link it to them using consistent unique identifiers. Continue to improve and expand the rights-related metadata elements devised by the Preservation Metadata: Implementation Strategies (PREMIS) working group.
Develop takedown procedures enabling the removal or restriction of content, particularly for items that are made openly available online. Ensure that takedown requests are easy to submit and addressed promptly.
Consider under what circumstances the repository would be willing to return or deaccession digital materials for copyright reasons and ensure deaccessioning procedures address removing born-digital materials from storage systems. Ensuring complete removal of a digital file from a hard disk or other device without wiping the media entirely is difficult, and what this means for archives trying to deaccession problematic materials is unclear; to my knowledge, the hypothetical ability to recover an infringing item using forensic methods after an order to destroy unauthorized copies has not been addressed in any legal case.
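As an illustration of the recordkeeping recommendations above, the following is a minimal Python sketch of how a repository might keep rights information and written rationales linked to a digital object by a unique identifier. It is only one possible shape for such a record. The class names, field names, and the identifier "example-0001" are hypothetical, chosen for illustration; they do not correspond to PREMIS semantic units or to any particular repository system.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class RightsDecision:
    """One documented decision about a potentially infringing action."""
    decided_on: str   # ISO 8601 date on which the decision was made
    action: str       # e.g., "create redundant preservation copies"
    basis: str        # e.g., "fair use", "Section 108", "donor agreement"
    rationale: str    # written reasoning kept in case of a later claim


@dataclass
class RightsRecord:
    """Rights information linked to one digital object by its identifier."""
    object_id: str                        # hypothetical unique identifier
    copyright_status: str                 # e.g., "in copyright", "unknown"
    rights_holder: Optional[str] = None   # None when the owner is unknown
    access_restricted: bool = False
    decisions: List[RightsDecision] = field(default_factory=list)

    def record_decision(self, action, basis, rationale, decided_on=None):
        """Append a dated rationale; defaults to today's date."""
        decision = RightsDecision(
            decided_on=decided_on or date.today().isoformat(),
            action=action,
            basis=basis,
            rationale=rationale,
        )
        self.decisions.append(decision)
        return decision

    def take_down(self, reason, decided_on=None):
        """Restrict access and log the reason for the restriction."""
        self.access_restricted = True
        return self.record_decision(
            "restrict access", "takedown request", reason, decided_on
        )

    def to_json(self):
        """Serialize the record so it can be stored beside the object."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Example: document a preservation decision made at accessioning.
record = RightsRecord(object_id="example-0001", copyright_status="unknown")
record.record_decision(
    action="create redundant preservation copies",
    basis="fair use",
    rationale="Unpublished material; copies made solely for preservation.",
    decided_on="2020-01-15",
)
print(record.to_json())
```

Serializing the record to JSON is one way to store it alongside the object itself; the same information could equally live in an accession or catalog record, provided the identifier is used consistently.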
Advocate Copyright Reform
The gaps in scholarship and best practices related to copyright and digital preservation reflect gaps in the law. As a professional community, archivists should be aware of this fact and advocate for reforms, such as those introduced by the MMA, that support the work of collecting and preserving contemporary records. Intellectual property rights expert Andrew Charlesworth argued in 2008 that much of the difficulty in adapting copyright law, as well as related educational exemptions and fair use assessments, to the needs of digital curation and preservation is that “a defining characteristic of contemporary copyright law is the willingness of governments to accept the argument that the impact of digital technologies requires copyright owners to be given ever greater control over the use of their works, regardless of the detriment to the copyright regime's ‘public interest' elements.”94 This one-sided focus on the risks of digital technologies for creators causes problems for both rights holders and collecting institutions. Effecting a shift back toward limiting or allowing exceptions to owners' exclusive rights for the public good will require sustained attention and advocacy.
One area of possible reform is Section 108's provisions for preservation-focused copying and reformatting. A 2008 report from a Section 108 study group convened by the US Copyright Office notes that “certain preservation activities fall within the scope of fair use, regardless of whether they would be permitted by section 108”;95 however, during a routine copyright review hearing before Congress in 2014, Judiciary Committee chairman Bob Goodlatte pointed out that “while it is probably true that there are clear-cut cases in which fair use would apply to preservation activities, fair use is not always easy to determine, even to those with large legal budgets. Those with smaller legal budgets or a simple desire to focus their limited resources on preservation may prefer to have better statutory guidance than exists today.”96 Additional statutory guidance would free archivists to make proactive decisions in the interest of digital preservation rather than hesitating over concerns about legal risks.
A Society of American Archivists (SAA) issue brief in 2014 argued for reforms beyond those proposed by the 2008 study group, suggesting that a revised Section 108 could become a model for World Intellectual Property Organization (WIPO) efforts to establish minimum international standards for library and archives copyright exceptions. The brief includes two requests specifically supporting digital preservation: 1) “removal of the ‘3 copy' limitation on digital preservation copies” and 2) “expanded preservation of digital resources, including collection and preservation of publicly accessible networked publications (i.e., websites).”97 Surprisingly, SAA, perhaps concerned that opening the door to revisions might lead to detrimental changes or a net loss of preservation-focused exceptions, reversed course in 2016 and issued a statement opposing alterations to Section 108 and stating that “SAA does not consider Section 108 to be obsolete or in need of serious reform.”98 The statement begins by noting that “any and all reforms to Section 108 must be made with an eye toward either expanding on existing permitted uses by archives (and libraries) or adding new permitted activities”99 and persuasively enumerates a number of ways in which the current Section 108 is inadequate and suggests adjustments that would facilitate preservation, such as removing the strict limit on the number of preservation copies that can be created, explicitly allowing archives to outsource preservation activities, supporting the capture of web content, and making provisions for TPM circumvention. Strangely, the statement also definitively asserts that “now is not an appropriate time to rewrite or amend Section 108.”100
The Copyright Office subsequently released its discussion document recommending a Section 108 overhaul and including model language that would address many of archivists' concerns. The document proposes a number of encouraging changes:
. . . allowing multiple preservation copies, allowing preservation copies of published works, expanding access to digital preservation copies, amending the subsection 108(i) exclusions for copies made at the request of users, allowing more flexibility in making preservation copies of works covered by licensing or purchase agreements, and allowing the use of third-party vendors in some situations.101
Another reform archivists may wish to advocate for is changing mandatory deposit requirements so that the Library of Congress or other select institutions routinely acquire and preserve published electronic materials. While some countries do include digital works in their legal deposit schemes,102 the United States currently exempts most electronic works that are available only online (that is, not distributed on physical media).103 Implementing legal deposit for electronic materials is complex,104 and the United States' exemption likely exists in part to manage the volume of material the Library of Congress must commit to preserving. But as more content is distributed online only, the risk of losing long-term access to valuable cultural and scientific information increases and the need for a coordinated preservation initiative focused on that material grows.105
Despite archivists' progress in developing best practices and pursuing preservation exemptions, copyright law still presents significant barriers to the preservation of born-digital archival materials. While copyright reform, along with related legal clarifications or new preservation exceptions related to digital copying, TPM circumvention, and orphan works, could theoretically provide the most comprehensive solution for legal barriers to preservation, any future changes to the law are unlikely to satisfy both archivists and rights holders. Given this, archivists would be well served to pair advocacy with concrete risk-management actions that are more fully within their control: obtaining explicit permission when possible, embracing existing best practices documents (and contributing to the creation of additional codes of best practices that can shape community norms and defenses against infringement claims106), and documenting their rationale for preservation and access decisions. Because the risk of an infringement lawsuit is very low when the materials in question are not available to researchers, archivists may choose to take preservation actions that fall into legal gray areas and mitigate risk by restricting the preserved materials long term. When the possibility of losing certain digital records is unacceptable but the copyright status of the items remains a barrier, preservation without access may be the safest path forward—although not all archives will have the resources or ability to preserve materials indefinitely without a clear path to access, and archivists will likely still need to resolve copyright questions down the road.
New questions will continue to emerge as archives increasingly deal with multi-user, dynamic media such as collaboratively annotated documents (complicating questions about ownership) and new methods of using content such as text mining (prompting questions about what constitutes transformative use and how copyright restrictions might extend to the products of such research). In this landscape, all archivists working with born-digital materials should be aware of the copyright challenges that can slow or thwart digital preservation efforts, from the inadequacy of Section 108 to cover digital preservation–related copying and the prevalence of restrictive licenses and DRM, to questions about the nature of ownership and publication status in a digital environment. Familiarity with the issues will aid risk assessment, donor negotiations, and access decisions and will enable archivists to contribute to future research on key issues such as software preservation, large-scale digital appraisal and rights review, and born-digital access. Awareness of ongoing negotiations with the Copyright Office and copyright lobbying by major rights holders, as well as legal cases that are gradually shaping our understanding of digital property and rights, will position archivists to act as responsible stewards of unique digital collections and advocate for reforms that will support cultural heritage preservation.
As important as copyright reform is for scholarly, scientific, and cultural materials in all formats, gaining clarity on digital copyright and expanding preservation and access options for born-digital content are even more urgent. If archivists wait until copyright terms have ended or media and formats have become obsolete as required for many applications of Section 108—or wait for any other future change that might alter the status of the content to allow copying—they will have missed the window of opportunity to preserve that content. Some copyright owners might be willing to allow DRM circumvention for long-term preservation or use in the distant future after copyright expires. But technological obsolescence and media instability mean the content might not survive or be renderable long enough to reach that point if archivists cannot intervene to create preservation copies now.
Archivists' and librarians' professional ethics call on them to act “within prescribed law”107 when making decisions about preservation and copying, to “observe legal requirements and obligations determined by rights associated with digital objects,”108 and to “respect intellectual property rights and advocate balance between the interests of information users and rights holders.”109 But until the profession develops adequate tools and advocates for a forward-looking legal framework to allow for digital preservation actions without violating the letter of the law, such a balance will be virtually impossible to achieve.
Thank you to the anonymous peer reviewers as well as American Archivist editor, Cal Lee, for critically reading this manuscript and providing helpful comments that improved the final version. Thanks also to Laura Burtle, MSLS, JD, for additional feedback and advice.
For the purposes of this article, digital preservation is broadly defined as the combination of “policies, strategies and actions that ensure access to digital content over time.” See “Definitions of Digital Preservation,” Association for Library Collections and Technical Services, American Library Association, February 21, 2008, http://www.ala.org/alcts/resources/preserv/defdigpres0408, captured at https://perma.cc/9WWA-AEYQ.
Neal, “Preserving the Born-Digital Record,” American Libraries 46 (June 2, 2015): 33.
Most cases and concepts I reference will be based on US law because that is the setting in which I work and with which I am most familiar, but the underlying questions raised about the intersections of intellectual property law and digital preservation are largely transferrable to other legal contexts, and I have included references to international resources when possible. For an overview of the major provisions of US copyright law, see “Copyright Basics” (Circular 1), United States Copyright Office, 2012, https://www.copyright.gov/circs/circ01.pdf, captured at https://perma.cc/D3PT-VTVE; for more information on provisions affecting libraries and similar institutions, see “Copyright for Libraries,” American Library Association, https://libguides.ala.org/copyright. For a comparative discussion of intellectual property laws and digital preservation in Australia, the Netherlands, the United Kingdom, and the United States, see June M. Besek et al., “Digital Preservation and Copyright: An International Study,” International Journal of Digital Curation 3, no. 2 (2008): 103–11, https://doi.org/10.2218/ijdc.v3i2.61.
June M. Besek, Copyright Issues Relevant to the Creation of a Digital Archive: A Preliminary Assessment: Strategies and Tools for the Digital Library (Washington, DC: Council on Library and Information Resources, 2003), http://www.clir.org/pubs/reports/pub112/contents.html; Menzi L. Behrnd-Klodt and Christopher J. Prom, eds., Rights in the Digital Era (Chicago: Society of American Archivists, 2015); Heather Briston, “Contracts, Intellectual Property, and Privacy,” in The Digital Archives Handbook: A Guide to Creation, Management, and Preservation, ed. Aaron D. Purcell (Lanham, MD: Rowman & Littlefield, 2019), 95–120; Estelle Derclaye, Copyright and Cultural Heritage: Preservation and Access to Works in a Digital World (Cheltenham, UK: Edward Elgar Publishing, 2010); Laura N. Gasaway, “America's Cultural Record: A Thing of the Past?,” 2003, https://web.archive.org/web/20180125174932/http://www.unc.edu/~unclng/America's%20cultural%20record.htm; Peter B. Hirtle, Emily Hudson, and Andrew T. Kenyon, Copyright and Cultural Institutions: Guidelines for Digitization for U.S. Libraries, Archives, and Museums (Ithaca, NY: Cornell University Library, 2009), https://ecommons.cornell.edu/handle/1813/14142; Mags McGinley, “The Legal Environment of Digital Curation—A Question of Balance for the Digital Librarian,” in Research and Advanced Technology for Digital Libraries, ed. László Kovács, Norbert Fuhr, and Carlo Meghini, 534–38 (Berlin, Heidelberg: Springer, 2007), https://doi.org/10.1007/978-3-540-74851-9_62; Maureen Whalen, “Permissions Limbo: Intellectual Property and Licensing Issues,” RBM: A Journal of Rare Books, Manuscripts and Cultural Heritage 10, no. 1 (2009): 25–29, https://doi.org/10.5860/rbm.10.1.314.
Edward M. Corrado and Heather Moulaison Sandy, Digital Preservation for Libraries, Archives, and Museums (Lanham, MD: Rowman & Littlefield, 2017); Tom Evens and Laurence Hauttekeete, “Challenges of Digital Preservation for Cultural Heritage Institutions,” Journal of Librarianship and Information Science 43, no. 3 (2011): 157–165, https://doi.org/10.1177/0961000611410585; Christopher A. Lee, I, Digital: Personal Collections in the Digital Era (Chicago: Society of American Archivists, 2011); Jeremy Myntti and Jessalyn Zoom, Digital Preservation in Libraries: Preparing for a Sustainable Future (Chicago: American Library Association, 2019); Gillian Oliver and Ross Harvey, Digital Curation, 2nd ed. (Chicago: Neal-Schuman Publishers, 2016); Trevor Owens, The Theory and Craft of Digital Preservation (Baltimore: Johns Hopkins University Press, 2018); Aaron D. Purcell, ed., The Digital Archives Handbook: A Guide to Creation, Management, and Preservation (Lanham, MD: Rowman & Littlefield, 2019).
Peter Hirtle, “Digital Preservation and Copyright,” Stanford Copyright and Fair Use Center, November 2003, https://fairuse.stanford.edu/2003/11/10/digital_preservation_and_copyr, captured at https://perma.cc/5245-69SZ; Alicia Ryan, “Contract, Copyright, and the Future of Digital Preservation,” Boston University Journal of Science & Technology Law 10 (2004): 152–76; “The Section 108 Study Group Report: An Independent Report Sponsored by the United States Copyright Office and the National Digital Information Infrastructure and Preservation Program of the Library of Congress,” March 2008, http://www.section108.gov/docs/Sec108StudyGroupReport.pdf, captured at https://perma.cc/U565-24QZ.
David Anderson, “Preserving Europe's Digital Cultural Heritage: A Legal Perspective,” New Review of Information Networking 18, no. 1 (2013): 16–39, https://doi.org/10.1080/13614576.2013.775836; Catherine Ayre and Adrienne Muir, “The Right to Preserve: The Rights Issues of Digital Preservation,” D-Lib Magazine 10, no. 3 (2004), https://doi.org/10.1045/march2004-ayre; Andrew Charlesworth, “Intellectual Property Rights for Digital Preservation,” Digital Preservation Coalition, 2012, https://doi.org/10.7207/twr12-02; Tim Padfield, “Copyright in the Electronic Environment,” in Copyright for Archivists and Records Managers, 5th ed. (London: Facet Publishing, 2015), 183–200.
“UNESCO Charter on the Preservation of the Digital Heritage,” UNESCO Digital Library, March 2003, https://unesdoc.unesco.org/ark:/48223/pf0000229034.
Oliver and Harvey, Digital Curation, 204.
Briston, “Contracts, Intellectual Property, and Privacy,” 103.
Alex H. Poole, “How Has Your Science Data Grown? Digital Curation and the Human Factor: A Critical Literature Review,” Archival Science 15, no. 2 (2015): 126, https://doi.org/10.1007/s10502-014-9236-y.
Digital Millennium Copyright Act of 1998, 17 U.S.C. § 1201 (2000).
Besek, Copyright Issues Relevant to the Creation of a Digital Archive. This report, prepared for CLIR in an effort to document anticipated challenges and potential approaches, has limited applicability to most institutions collecting and curating data, as it assumes the Library of Congress, with its unique resources and powers related to copyright, deposit, and transmission, as the collecting institution. Nevertheless, the report provides a clear and succinct introduction to intellectual property rights (IPR) issues for the management and preservation of digital archives. For a similar overview from the perspective of an academic institution, see Gasaway, “America's Cultural Record: A Thing of the Past?” Gasaway covers the concept and IPR-related benefits of institutional repositories and spends some time on Section 108(h) of the Copyright Act of 1976, which Besek mentions only briefly.
“Section 108 of Title 17: A Discussion Document of the Register of Copyrights,” United States Copyright Office, September 2017, 35, https://www.copyright.gov/policy/section108/discussion-document.pdf, captured at https://perma.cc/D8AY-UQKS.
“DCC Curation Lifecycle Model,” Digital Curation Centre, https://www.dcc.ac.uk/guidance/curation-lifecycle-model, captured at https://perma.cc/XX5B-ZTQK.
Corrado and Moulaison Sandy, Digital Preservation for Libraries, Archives, and Museums, 36.
17 U.S.C. § 107 (2006); 17 U.S.C. § 108 (2006).
Besek et al., “Digital Preservation and Copyright: An International Study,” 105. The Berne Convention, the foundation of international copyright law, grants exclusive reproduction rights to creators but leaves it up to participating countries to identify appropriate exceptions (pp. 105–6). There are variations in how different copyright regimes handle preservation-related exceptions to the creator's exclusive reproduction rights: the numbers of copies permitted, the inclusion of audiovisual formats, and the types of institutions eligible for exceptions are not consistent. Besek et al. call for more consistency in these preservation exceptions, both within and between countries.
Aprille C. McKay, “Managing Rights and Permissions,” in Rights in the Digital Era, ed. Menzi L. Behrnd-Klodt and Christopher J. Prom (Chicago: Society of American Archivists, 2015), 188.
McKay points out that archivists are not alone in arguably violating the letter of copyright law in their handling of born-digital content. Copying is required not only for archival accessioning, processing, and preservation but also for processes used in legal and law enforcement fields, such as e-discovery and forensic analysis.
Select examples include “Digital Preservation Handbook,” 2nd ed., Digital Preservation Coalition, 2015, https://www.dpconline.org/handbook; “Levels of Digital Preservation,” 2nd ed., National Digital Stewardship Alliance, 2019, https://doi.org/10.17605/OSF.IO/QGZ98; Space Data and Information Transfer Systems: Audit and Certification of Trustworthy Digital Repositories, ISO 16363:2012 (Geneva, Switzerland: International Organization for Standardization, February 2012; reviewed 2017); and “Strategy for Preserving Digital Archival Materials,” National Archives and Records Administration, 2017, https://www.archives.gov/files/preservation/electronic-records/digital-pres-strategy-2017.pdf, captured at https://perma.cc/8TFV-8A8H.
Richard Pearce-Moses, s.v. “Lots of Copies Keep Stuff Safe,” A Glossary of Archival and Records Terminology (Chicago: Society of American Archivists, 2005), http://files.archivists.org/pubs/free/SAA-Glossary-2005.pdf, captured at https://perma.cc/P9Y5-L6BU.
17 U.S.C. § 512 (1998) outlines a general framework for determining service providers' liability for hosting infringing content. Authors Guild, Inc. v. HathiTrust, 755 F.3d 87 (2d Cir. 2014) addresses security expectations for on-site and remote server storage of copyright-protected content held by libraries.
In a widely cited 1994 essay, Electronic Frontier Foundation cofounder John Perry Barlow raised concerns about how digital distribution might change perceptions of copyright: “If our property can be infinitely reproduced and instantaneously distributed all over the planet without cost, without our knowledge, without its even leaving our possession, how can we protect it?” See Barlow, “The Economy of Ideas: A Framework for Patents and Copyrights in the Digital Age,” Wired, March 1, 1994, https://www.wired.com/1994/03/economy-ideas, captured at https://perma.cc/923N-5GMU.
Yuval Feldman and Janice Nadler, “The Laws and Norms of File Sharing,” San Diego Law Review 43, no. 3 (2006): 577–618; Jérôme Hergueux and Dariusz Jemielniak, “Should Digital Files Be Considered a Commons? Copyright Infringement in the Eyes of Lawyers,” The Information Society 35, no. 4 (2019): 198–215, https://doi.org/10.1080/01972243.2019.1616019; Gregory N. Mandel, “The Public Perception of Intellectual Property,” Florida Law Review 66, no. 1 (2014): 261–312.
Jean Dryden, “Copyfraud or Legitimate Concerns? Controlling Further Uses of Online Archival Holdings,” American Archivist 74, no. 2 (2011): 528, https://doi.org/10.17723/aarc.74.2.d5g2700q5612l4w7.
“09 Legal Issues,” in PARADIGM Workbook on Digital Private Papers, Paradigm Project, 2007, 257, https://ora.ox.ac.uk/objects/uuid:116a4658-deff-4b06-81c5-c9c2071bc6d0.
Trusted Digital Repositories: Attributes and Responsibilities, An RLG-OCLC Report (Mountain View, CA: Research Libraries Group, 2002): 18, https://www.oclc.org/content/dam/research/activities/trustedrep/repositories.pdf, captured at https://perma.cc/5QF4-BQUL.
While it is common to think of digital files transmitted over the Web or saved in the cloud as intangible, all digital information is ultimately material and cannot exist without physical hardware and infrastructure. See Jean-François Blanchette, “A Material History of Bits,” Journal of the American Society for Information Science and Technology 62, no. 6 (2011): 1042–57.
Aaron Perzanowski and Jason Schultz's study The End of Ownership: Personal Property in the Digital Age (Cambridge, MA: MIT Press, 2016) focuses on individual ownership and consumer rights but helpfully illuminates broader legal trends around digital property.
Eurie Hayes Smith IV, “Digital First Sale: Friend or Foe,” Cardozo Arts & Entertainment Law Journal 22, no. 3 (2005): 853–62, https://heinonline.org/HOL/P?h=hein.journals/caelj22&i=861, authentication required.
In popular media, these questions often arise in the context of “digital inheritance” or “digital estate planning”; emerging practices and legal precedents in that area might prove helpful to archivists.
A few cases touch on these questions. See, for example, Capitol Records, LLC v. ReDigi Inc., No. 16-2321 (2d Cir. 2018). Here the court found that resale of digital music files constitutes copyright infringement. Vernor v. Autodesk, Inc., 621 F.3d 1102 (9th Cir. 2010) and MDY Industries, LLC v. Blizzard Entertainment, Inc., 629 F.3d 928 (9th Cir. 2010) both examined whether used software with licensing restrictions could legally be resold under the doctrine of first sale. Thyroff v. Nationwide Mutual Insurance Co., 8 N.Y.3d 283, 864 N.E.2d 1272 (2007) indirectly considered whether ownership of electronic data is analogous to other goods or property. On the international stage, some cases have debated whether digital data is merely information and thus not property [Jonathan Dixon v. R, 2014 N.Z.C.A. 329 (2014); Your Response Ltd v. Datateam Business Media Ltd, 2014 E.W.C.A. Civ 281 (2014)], while another has affirmed that a perpetual license is equivalent to a sale and thus resale of licensed software is permissible under EU copyright law [UsedSoft GmbH v. Oracle International Corp, 2012 C.M.L.R.3 44, 2012 E.C.D.R. 19 (2012)]. For a discussion of digital exhaustion and global ecommerce, see Sean P. Morris, “Beyond Trade: Global Digital Exhaustion in International Economic Regulation,” Campbell Law Review 36, no. 1 (2013): 107–45. For recent explorations of digital files as property under English and European law, see Johan David Michels and Christopher Millard, “Mind the Gap: The Status of Digital Files Under Property Law,” Queen Mary School of Law Legal Studies Research Paper 317/2019, https://ssrn.com/abstract=3387400 and Sjef van Erp, “Ownership of Digital Assets?,” European Property Law Journal 5, no. 2 (2016): 73–76, https://doi.org/10.1515/eplj-2016-0009.
“White Paper on Remixes, First Sale, and Statutory Damages: Copyright Policy, Creativity, and Innovation in the Digital Economy,” The Department of Commerce Internet Policy Task Force, January 2016, 47–48, https://www.uspto.gov/sites/default/files/documents/copyrightwhitepaper.pdf, captured at https://perma.cc/A5CV-FNEG.
Besek et al., “Digital Preservation and Copyright,” 103–11.
Seventy percent of respondents to a 2017 NDSA survey indicated that their institutions do not seek permission prior to archiving sites. See Matthew Farrell et al., “Web Archiving in the United States: A 2017 Survey,” National Digital Stewardship Alliance, October 2018, 24, https://osf.io/ht6ay.
For more on the specific legal issues around web archiving, albeit in an Australian context, see Laura Simes and Bob Pymm, “Legal Issues Related to Whole-of-Domain Web Harvesting in Australia,” Journal of Web Librarianship 3, no. 2 (2009): 129–42, https://doi.org/10.1080/19322900902787227. Copyright is also addressed briefly in Maureen Pennock, Web-Archiving: DPC Technology Watch Report, Digital Preservation Coalition, 2013, https://www.dpconline.org/docs/technology-watch-reports/865-dpctw13-01-pdf/file, captured at https://perma.cc/6TX7-88S3, and Gail Truman, “Web Archiving Environmental Scan,” Harvard Library Report, http://nrs.harvard.edu/urn-3:HUL.InstRepos:25658314.
Code of Best Practices in Fair Use for Academic and Research Libraries, Association of Research Libraries, 2012, 27, https://www.arl.org/wp-content/uploads/2014/01/code-of-best-practices-fair-use.pdf, captured at https://perma.cc/N5Q2-RQQR.
Ryan, “Contract, Copyright, and the Future of Digital Preservation,” 159. A related challenge, with perhaps even more urgency, is archives' and libraries' inability to acquire certain materials at all, such as digital music and video content designed for individual consumers to stream or download and not available on physical media or via institutional license.
The Bibliothèque nationale de France (BnF) has shared its approach to preserving born-digital electronic books in Sophie Derrot, Jean-Philippe Moreux, Clément Oury, and Stéphane Reecht, “Preservation of Ebooks: From Digitized to Born-Digital,” in 11th International Conference on Digital Preservation (iPRES), October 2014, Melbourne, Australia, https://hal-bnf.archives-ouvertes.fr/hal-01088755. Because of the BnF's legal deposit mandate, copyright and DRM are less likely to hamper its preservation of e-books.
Trusted Digital Repositories, 18.
Dick Kawooya and Tucky Taylor, “Cultural Heritage Informatics and Intellectual Property Rights,” in Annual Review of Cultural Heritage Informatics: 2014, ed. Samantha K. Hastings (Lanham, MD: Rowman & Littlefield, 2015), 58. While Kawooya and Taylor's comparative analysis focuses primarily on mass digitization and access initiatives, it sheds light on a number of realities that also apply to born-digital content: insufficient understanding of the legal barriers to digital access, widespread misconceptions about fair use, questions about the nature of ownership in a digital environment, the difficulty of determining authorship of certain types of content (e.g., photos, letters received, organization records), and the implications of displaying materials online. Because “copyright law is out of step with current digitization trends and practices” and “copyright exceptions for . . . cultural institutions may not provide legal cover for . . . mass digitization and global access,” the cultural heritage community needs to unite and advocate for change. “Cultural Heritage Informatics and Intellectual Property Rights,” 59.
Trusted Digital Repositories, 18–19.
Mike Kastellec, “Practical Limits to the Scope of Digital Preservation,” Information Technology & Libraries 31, no. 2 (2012): 66, https://doi.org/10.6017/ital.v31i2.2167.
Deborah R. Gerhardt, “Copyright Publication on the Internet” (paper presented at Intellectual Property Scholars Conference, Chicago, August 8–9, 2019), https://law.depaul.edu/about/centers-and-institutes/center-for-intellectual-property-law-and-information-technology/programs/ip-scholars-conference/Documents/ipsc_2019/Gerhardt%20-%20Paper.pdf, captured at https://perma.cc/VW9J-4DVQ.
17 U.S.C. § 101 (2006).
Compendium of US Copyright Office Practices, 3rd ed. (Washington, DC: United States Copyright Office, September 2017), 2, https://www.copyright.gov/comp3.
The notion of originality is itself suspect when referring to born-digital objects. See Doug Reside, “File Not Found: Rarity in an Age of Digital Plenty,” RBM: A Journal of Rare Books, Manuscripts, and Cultural Heritage 15, no. 1 (2014): 68–74, https://doi.org/10.5860/rbm.15.1.416.
Denise M. Davis and Tim Lafferty, “Digital Rights Management: Implications for Libraries,” The Bottom Line 15, no. 1 (2002): 18–23, https://doi.org/10.1108/08880450210415725.
Thanks to advocacy by librarians, educators, security researchers, and others, the Copyright Office has over the years granted some exceptions to the prohibition on TPM circumvention for legitimate educational or research uses.
Briston, “Contracts, Intellectual Property, and Privacy,” 106.
Samuel Brylawski, “Preservation of Digitally Recorded Sound,” in Building a National Strategy for Digital Preservation: Issues in Digital Media Archiving (Washington, DC: Council on Library and Information Resources and Library of Congress, 2002); Amy Kirchhoff, “EBooks: The Preservation Challenge,” Against the Grain 23, no. 4 (2014): 34, https://doi.org/10.7771/2380-176X.5935.
Besek et al., “Digital Preservation and Copyright,” 111.
Besek et al. found that permission requests were generally handled on an ad-hoc, case-by-case basis. Kawooya and Taylor noted this was still the case several years later.
Samuel Brylawski, “Preservation of Digitally Recorded Sound,” in Building a National Strategy for Digital Preservation: Issues in Digital Media Archiving, 60.
Besek et al., “Digital Preservation and Copyright,” 104.
Jason Puckett, “Digital Rights Management as Information Access Barrier,” Progressive Librarian 34/35 (2010): 20, https://scholarworks.gsu.edu/cgi/viewcontent.cgi?article=1049&context=univ_lib_facpub, captured at https://perma.cc/HG6S-Y9V2.
Trusted Digital Repositories, 19.
Eun G. Park and Sam Oh, “Examining Attributes of Open Standard File Formats for Long-Term Preservation and Open Access,” Information Technology and Libraries 31, no. 4 (2012): 46–67, https://doi.org/10.6017/ital.v31i4.1946.
Google LLC v. Oracle America, Inc., No. 18-956 (Jan. 24, 2019), https://www.supremecourt.gov/search.aspx?FileName=/docket/docketfiles/html/public/18-956.html.
Oral arguments took place in October 2020, just before this article went to press. A decision may thus have been issued by the time of publication.
Oliver and Harvey, Digital Curation, 204.
Oliver and Harvey, Digital Curation, 205.
William S. Strong, The Copyright Book: A Practical Guide, 6th ed. (Cambridge, MA: The MIT Press, 2014), 28.
Jack Russo and Jamie Nafziger, “Look and Feel in Computer Software,” ComputerLaw Group, LLP, 1993, https://www.computerlaw.com/Articles/Look-and-Feel-in-Computer-Software.shtml.
Poole, “How Has Your Science Data Grown?,” 102.
Ryan, “Contract, Copyright, and the Future of Digital Preservation,” 174–75. While Ryan focuses primarily on traditional library collections, the same concerns apply to any collection of data that is produced by entities outside the collecting organization and governed by contractual agreements that limit its use in some way. Ryan's proposed solution to the problem of preserving collections built through restrictive licenses and web archiving is “rescue power” legislation that permits “libraries and archives to abridge copyright, contract and technological restrictions on digital works in order to ensure their preservation in the face of owner neglect, the inability to find an owner, or an owner's active intention to destroy a valuable cultural artifact” (160). While not a solution to every intellectual property rights issue affecting digital curation, this proposal would go a long way toward solving many of them. Clearly, though, such legal reform has not come to pass, and it is unlikely to do so in the foreseeable future.
Authors Guild v. Google, Inc., 804 F.3d 202 (2d Cir. 2015).
Authors Guild, Inc. v. HathiTrust.
Cambridge University Press et al. v. Patton et al., 769 F.3d 1232 (11th Cir. 2014).
Corrado and Moulaison Sandy, Digital Preservation for Libraries, Archives, and Museums, 36.
David Hansen, “Digitizing Orphan Works: Legal Strategies to Reduce Risks for Open Access to Copyrighted Orphan Works” (Cambridge, MA: Harvard Library, 2016), https://dash.harvard.edu/handle/1/27840430.
Hansen, “Digitizing Orphan Works,” ii.
Hansen characterizes uncertainty about ownership and risk of legal action, should a rights holder come forward, as two of the biggest hindrances to libraries' and archives' making out-of-print or unpublished orphan works freely available online. See Hansen, “Digitizing Orphan Works,” 2.
Hansen, “Digitizing Orphan Works,” iii.
The report acknowledges that there are drawbacks and limitations to this strategy since Section 108 generally allows limited reproduction, not the multiple copies involved in online dissemination and long-term digital curation. See Hansen, “Digitizing Orphan Works,” 5.
Society of American Archivists Intellectual Property Working Group, “Orphan Works: Statement of Best Practices,” June 2009, http://www2.archivists.org/sites/all/files/OrphanWorks-June2009.pdf, captured at https://perma.cc/CYB7-8C4X.
For an interesting discussion of Australian practices related to digital preservation of Indigenous intellectual property, see Timothy Robert Hart, Denise de Vries, and Carl Mooney, “Australian Law Implications on Digital Preservation,” in iPRES 2019: 16th International Conference on Digital Preservation, Proceedings, ed. Marcel Ras, Barbara Sierman, and Angela Puggioni, Amsterdam, 2019, 37–45, https://osf.io/4xyan.
Code of Best Practices in Fair Use for Academic and Research Libraries, 17.
Samantha Brown, Samantha Lowrance, and Catherine Whited, “Preservation Practices of Videogames in Archives,” Social Science Research Network, May 5, 2018, 4, http://dx.doi.org/10.2139/ssrn.3174157.
Henrike Maier, “Games as Cultural Heritage: Copyright Challenges for Preserving (Orphan) Video Games in the EU,” Journal of Intellectual Property, Information Technology and Electronic Commerce Law 6, no. 2 (2015): 120–31, https://heinonline.org/HOL/P?h=hein.journals/jipitec6&i=126, authentication required.
17 U.S.C. § 117, which allows owners of copies of programs to make archival copies, could potentially be used to support some archives-based copying, but this hinges on the interpretation of “owner”—generally understood to be an individual consumer—and also requires ongoing rightful possession of the work. Regardless of this section's applicability to software preservation efforts, it unfortunately does not apply to other categories of digital works.
Brown, Lowrance, and Whited, “Preservation Practices of Videogames in Archives.”
Code of Best Practices in Fair Use for Software Preservation, revised ed., Association of Research Libraries, 2019, https://www.arl.org/wp-content/uploads/2018/09/2019.2.28-software-preservation-code-revised.pdf, captured at https://perma.cc/56CB-XWAB. See also Brandon Butler et al., “Cracking the Copyright Dilemma in Software Preservation: Protecting Digital Culture through Fair Use Consensus,” Journal of Copyright in Education & Librarianship 3, no. 3 (2019): 1–23, https://doi.org/10.17161/jcel.v3i3.10267.
Butler et al., “Cracking the Copyright Dilemma in Software Preservation,” 5.
David Anderson, “Preserving Europe's Digital Cultural Heritage: A Legal Perspective,” New Review of Information Networking 18, no. 1 (2013): 16–39, https://doi.org/10.1080/13614576.2013.775836. Anderson concludes that preservation practices involving complex digital objects and emulation are likely in conflict with European copyright laws and that existing legal frameworks are inappropriate for the current data landscape and preservation needs.
It is not clear whether the exemption will be renewed or extended at the end of the three-year period.
Kendra Albert, “A Victory for Software Preservation: DMCA Exemption Granted for SPN,” Cyberlaw Clinic, October 26, 2018, https://clinic.cyber.harvard.edu/2018/10/26/a-victory-for-software-preservation-dmca-exemption-granted-for-spn, captured at https://perma.cc/J9QB-QZAH.
Kee Young Lee and Kendra Albert, “A Preservationist's Guide to the DMCA Exemption for Software Preservation,” Software Preservation Network and Cyberlaw Clinic at the Berkman Klein Center, December 2018, 4, http://softwarepn.webmasters21.com/1201-exemption-guide-for-software-preservationists.
Even if copyright reform is achieved, archivists may still need these tools to manage data already created or acquired that might not be affected by changes to the law or new licensing models.
Maureen Whalen, “Permissions Limbo: Intellectual Property and Licensing Issues,” RBM: A Journal of Rare Books, Manuscripts and Cultural Heritage 10, no. 1 (2009): 27, https://doi.org/10.5860/rbm.10.1.314.
Andrew Charlesworth, “Digital Curation, Copyright, and Academic Research,” International Journal of Digital Curation 1, no. 1 (2008): 17, https://doi.org/10.2218/ijdc.v1i1.3.
“The Section 108 Study Group Report,” 22. For an overview of the study group process and explanation of the issues the group aimed to address, see Laura N. Gasaway, “Amending the Copyright Act for Libraries and Society: The Section 108 Study Group,” Albany Law Review 70, no. 4 (2007): 1331–56, https://perma.cc/QPE4-KPSC.
“Preservation and Reuse of Copyrighted Works: Hearing before the Subcommittee on Courts, Intellectual Property, and the Internet of the House Committee on the Judiciary,” 113th Congress (2014), 6.
“Issue Brief: Archivists and Section 108 of the Copyright Act,” Society of American Archivists, May 2014, https://www2.archivists.org/statements/issue-brief-archivists-and-section-108-of-the-copyright-act, captured at https://perma.cc/M94K-3JH2.
“Statement on US Copyright Office Draft Revision of Section 108: Library and Archives Exceptions in US Copyright Law [Docket No. 2016-4],” Society of American Archivists, 1, https://www2.archivists.org/sites/all/files/SAA%20Comments%20on%20Section%20108_July-2016.pdf, captured at https://perma.cc/J5CN-AG5X.
“Statement on US Copyright Office Draft Revision of Section 108,” 2.
“Statement on US Copyright Office Draft Revision of Section 108,” 2.
“Section 108 of Title 17: A Discussion Document of the Register of Copyrights,” 2.
“Digital Legal Deposit in Selected Jurisdictions,” Law Library of Congress, Global Legal Research Center, July 2018, https://www.loc.gov/law/help/digital-legal-deposit/digital-legal-deposit.pdf. The United States is not alone in lacking a national system for collecting digital materials in a way analogous to copyright deposit requirements for print materials.
Marietjie De Beer et al., “Legal Deposit of Electronic Books: A Review of Challenges Faced by National Libraries,” Library Hi Tech 34, no. 1 (2016): 87–103, https://doi.org/10.1108/LHT-06-2015-0060.
A 2020 study presents an international perspective on legal deposit for digital materials: Paul Gooding and Melissa Terras, eds., Electronic Legal Deposit: Shaping the Library Collections of the Future (London: Facet Publishing, 2020).
While not a preservation-focused code, Association of Independent Video and Filmmakers et al., Documentary Filmmakers' Statement of Best Practices in Fair Use, American University School of Communication Center for Media and Social Impact, 2005, https://cmsimpact.org/code/documentary-filmmakers-statement-of-best-practices-in-fair-use, captured at https://perma.cc/XF5S-GCRU, offers an example of how a shared understanding of best practices can alter the landscape of community behavior and legal risk. See Pat Aufderheide and Aram Sinnreich, “Documentarians, Fair Use, and Best Practices,” American University School of Communication Center for Media and Social Impact, 2014, https://cmsimpact.org/wp-content/uploads/2016/01/ida_fairuse_handout.pdf, captured at https://perma.cc/MA9S-RJE3. Thank you to Laura Burtle for bringing this statement and its impact to my attention.
“SAA Core Values Statement and Code of Ethics,” Society of American Archivists, May 2011, https://www2.archivists.org/statements/saa-core-values-statement-and-code-of-ethics, captured at https://perma.cc/7MNQ-AZ2M.
Peter McKinney, “A Draft Code of Ethics for Digital Preservation,” National Library of New Zealand, 2018, 7, https://natlib.govt.nz/files/digital-preservation/Draft-Code-of-Ethics-for-Digital-Preservation.pdf.
ABOUT THE AUTHOR
Katherine Fisher is the head of digital archives at Emory University's Stuart A. Rose Manuscript, Archives, and Rare Book Library, where she provides leadership and expertise in the acquisition, preservation, and delivery of born-digital and digitized collections. Previously, she was the digital preservation archivist at Georgia State University Library, and, before becoming an archivist, she worked in scholarly publishing. She holds an MLIS from the University of Hawai‘i at Mānoa and a PhD in English literature from the University of Michigan.