My work at the National Library of Medicine centers on researching the history of software development at NLM and designing a strategy to preserve that history and those digital objects. A key aspect of my project is inventorying all of the software developed at NLM. But researching the history of software development is not easy, even when you are provided with incredible institutional support. Here are some of my current challenges:
1) The actual process of creating and distributing software in an institution: Much of the software I have begun to research has been created, adapted, and fixed over a long period of time, often without any documentation. That makes sense – software is a use-based object, generally created to serve a larger purpose. As a tool, it is not an end in itself, and there is no logical reason to document the quick fixes that become necessary when software is integral to an institution or business. If software is malfunctioning, who has time to document why? It is better to fix it and keep business moving forward!
This practice may be entirely sensible at the time of development and implementation, but it leaves me in a difficult position. How do I deal with a lack of documentation when I’m trying to understand the historical, institutional, and cultural significance of a piece of software?
2) Drawing boundaries around a software project: Unlike a book or a movie, software does not necessarily have a final form. As stated above, a single piece of software goes through many iterations throughout its life. Furthermore, bits and pieces of executable files may be traded throughout an institution for different projects if that executable is deemed helpful. For example, if one software engineer creates a piece of code that aggregates records quickly, other engineers may implement that code in different projects. What is the best way to inventory NLM-developed software when what constitutes an individual piece of software is unclear, even to the developers? Additionally, entire software projects are sometimes absorbed into other projects and divisions, and that process is not always thoroughly documented. Tracing a particular project can become quite difficult.
Part of a solution to this is to conceive of a ‘software project’ as a software ecosystem instead of as a single entity. In this way, when inventorying software, I do not need to look for a defined boundary for each project. Instead, I can look for the bits and pieces that rely on each other in order to serve an institutional purpose. Inventorying software, then, becomes a part of understanding institutional goals and the variety of tools that the institution employed to reach those goals.
A software engineer at NLM recently suggested this approach to me, and I deeply appreciate her input. This strategy will allow me to create an inventory that is more easily understood by both external and internal users. Concentrating on ‘software ecosystems’ allows one to be granular enough to document technical details while also tying software efforts to institutional goals and needs.
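To make the ecosystem framing concrete, here is a minimal sketch in Python of what an ecosystem-oriented inventory record might look like. Everything here is hypothetical – the class names, fields, and example components are illustrative only, not part of any actual NLM inventory – but it shows how modeling software as interdependent components tied to an institutional goal lets reused code (like the record-aggregation example above) surface in more than one project.

```python
from dataclasses import dataclass, field

@dataclass
class Component:
    """A single piece of software: a script, executable, or shared module."""
    name: str
    description: str = ""
    depends_on: list = field(default_factory=list)  # names of other components

@dataclass
class Ecosystem:
    """A 'software ecosystem': interdependent components serving one institutional goal."""
    institutional_goal: str
    components: dict = field(default_factory=dict)  # name -> Component

    def add(self, component):
        self.components[component.name] = component

    def shared_with(self, other):
        """Components appearing in both ecosystems, i.e. code reused across projects."""
        return sorted(set(self.components) & set(other.components))

# Hypothetical example: one aggregation routine reused by two different projects.
aggregator = Component("record-aggregator", "aggregates records quickly")

indexing = Ecosystem("index the collection")
indexing.add(aggregator)
indexing.add(Component("index-builder", depends_on=["record-aggregator"]))

reporting = Ecosystem("produce usage reports")
reporting.add(aggregator)
reporting.add(Component("report-writer", depends_on=["record-aggregator"]))

print(indexing.shared_with(reporting))  # the reused component appears in both
```

The design choice this illustrates: rather than forcing each piece of code into exactly one "project," the inventory records ecosystems keyed to institutional goals, and overlap between ecosystems is expected and queryable rather than a cataloging problem.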
3) Finding failed projects: No one likes to fail, but failures are an important part of our history. As I try to piece together the history of software development at NLM, I find that some projects simply disappear from the records. There is no explanation as to why the project was discontinued or what went wrong. It simply disappears. I understand why – who wants to talk about their failures? Especially when their boss may read about it! But for historians and archivists, it is important to be able to find the failed projects as they may provide important insight into the technical details of that software as well as the historical, cultural, and institutional factors that may have influenced its failure. Tracking the failed projects will definitely prove to be an interesting challenge as I continue my research.