Creating Intellectual Boundaries in Complex Computing Environments: A Small Way in Which Copyright May Actually Help Software Preservation

Early on in my software preservation project, I wrote a blog post about the difficulties of researching software history. One of the issues I list is establishing the boundaries for a particular piece of software. As I have inventoried the software developed at the National Library of Medicine (NLM), I have been forced to establish what constitutes a single piece of software each time I create a new record. Software goes through many versions, so naturally the preservationist needs to create temporal boundaries. In other words, they need to decide which version or versions of the software need to be preserved. However, it is far more difficult, and just as necessary, to decide what constitutes one piece of software within a complex system that relies on many moving parts in order to function properly over time.

At NLM, for example, developers experimented with creating a “coach” for Grateful Med. Grateful Med was a user-friendly front-end system that searched NLM’s networked databases. Named Coach Metathesaurus Browser, this software was designed to hook into Grateful Med and provide assistance to the user when the user’s search queries returned inadequate responses. This piece of software was fully developed and tested in NLM’s Reading Room version of Grateful Med, but, for a variety of reasons, it was not adopted widely. While conducting my inventory, I decided that Coach Metathesaurus Browser constituted a separate piece of software and therefore justified its own record. My reasoning was that it had a separate institutional history from Grateful Med and that this history needed to be acknowledged. But if Coach Metathesaurus Browser had been implemented, users might have interpreted it as a feature of Grateful Med and not as its own entity. In that case, I would have been required to prioritize either the experience of the developer or the experience of the user in my inventory. Considering the goals of my current project, I would have made the same decision and inventoried Coach Metathesaurus Browser separately, but it is important to note that this is a decision with intellectual ramifications.

The problem of boundaries offers an unexpected view of the role of copyright in software preservation. Generally, copyright is viewed only as an obstacle for an institution or individual that wishes to preserve software. It limits what can be done to a piece of software without the consent of the copyright holder and presents a serious issue for long-term access to the cultural heritage inherent in software. Yet copyright may benefit software preservation projects in one way. Whereas I am working with materials that are not under copyright, a preservationist working with copyrighted materials already has boundaries imposed on a piece of software. The legal structure of copyright requires a clear definition of what is in an individual piece of software and therefore protected by the law. What this means is that boundaries around a piece of software are created at the time that the software is developed and by someone affiliated with the software development project.

Because NLM is a government entity, its software is not under copyright. As I create records in my inventory, I establish what constitutes an individual piece of software, and I draw the intellectual boundaries around that software. While the history of NLM and the history of software development inform these decisions, there is not always a clear right answer. If NLM’s in-house developed software were copyrighted, I could rely on the logic of copyright and the individuals who held that copyright in order to draw these boundaries. In other words, researching the copyright of a particular piece of software may remove the need to make a decision that could cause inaccuracies or anachronisms in a record.

When compared to the obstacles that copyright creates for most software preservation, this one possible benefit is almost negligible. Yet highlighting this unexpected aspect of the relationship between copyright and software preservation demonstrates the ways in which an archivist is reliant on context in order to decide what belongs in a collection and how it ought to be described. Where contextual information is scarce, as is frequently the case for software development and use, copyright can provide necessary information. Although this observation is not pertinent to my current project, I will continue to make decisions about what constitutes an individual software project very carefully because I understand how these decisions may affect future perceptions of historic software and computing behavior.