Some Thoughts on Researching Software History

My work at the National Library of Medicine centers on researching the history of software development at NLM and designing a strategy to preserve that history and those digital objects. I’m currently trying to inventory all of the software developed at NLM as a key aspect of my project. But researching the history of software development is not easy, even when you are provided with incredible institutional support. Here are some of my current challenges:

1) The actual process of creating and distributing software in an institution: Much of the software I have begun to research was created, adapted, and fixed over a long period of time and without any documentation. It makes sense – software is a use-based object and is generally created to serve a larger purpose. As a tool, it is not an end in itself and there is no logical reason to document the quick fixes that become necessary when software is integral to an institution or business. If software is malfunctioning, who has time to document why? It is better to fix it and keep business moving forward!

This practice may be entirely sensible at the time of development and implementation, but it leaves me in a difficult position. How do I deal with a lack of documentation when I’m trying to understand the historical, institutional, and cultural significance of a piece of software?

2) Drawing boundaries around a software project: Unlike a book or a movie, software does not necessarily have a final form. As stated above, a single piece of software goes through many iterations throughout its life. Furthermore, bits and pieces of executable files may be traded throughout an institution for different projects if that executable is deemed helpful. For example, if one software engineer creates a piece of code that aggregates records quickly, other engineers may implement that code in different projects. What is the best way to inventory NLM-developed software when what constitutes an individual piece of software is unclear, even to the developers? Additionally, entire software projects are sometimes absorbed into other projects and divisions, and that process is not always thoroughly documented. Tracing a particular project can become quite difficult.

Part of a solution to this is to conceive of a ‘software project’ as a software ecosystem instead of as a single entity. In this way, when inventorying software, I do not need to look for a defined boundary for a project. Instead, I can look for the bits and pieces that rely on each other in order to serve an institutional purpose. Inventorying software, then, becomes a part of understanding institutional goals and the variety of tools that the institution employed to reach those goals.

A software engineer at NLM recently suggested this approach to me, and I deeply appreciate her input. This strategy will allow me to create an inventory that is more easily understood by both external and internal users. Concentrating on ‘software ecosystems’ allows one to be granular enough to document technical details while also tying software efforts to institutional goals and needs.

3) Finding failed projects: No one likes to fail, but failures are an important part of our history. As I try to piece together the history of software development at NLM, I find that some projects simply disappear from the records. There is no explanation as to why the project was discontinued or what went wrong. It simply disappears. I understand why – who wants to talk about their failures? Especially when their boss may read about it! But for historians and archivists, it is important to be able to find the failed projects as they may provide important insight into the technical details of that software as well as the historical, cultural, and institutional factors that may have influenced its failure. Tracking the failed projects will definitely prove to be an interesting challenge as I continue my research.

Vocabulary Forensics, Digital Media, and Technological Change

I’ve created a new game, and while I’m sure only a few people will find it fun, I think the game illustrates a wider issue in preserving digital media and technological tools. The game, most simply put, is to guess the shelf-life of a word.

Slang comes in and out of style pretty frequently. People aren’t saying, “That’s haaawt,” the same way that they were in 2002 under Paris Hilton’s dubious influence. But technical language also falls in and out of use, based largely on the objects and infrastructures to which the vocabulary is tied.

Let me illustrate with a quick example: “smartphone.” While the lexical construct of “smart – object” has retained a fair bit of influence with things like “smart-fridges,” the word “smartphone” may not be long for this world. When was the last time a Verizon commercial touted the number of smartphones they offered? At this point, at least within a certain demographic, we just call them phones. The consumer-level technology has advanced in such a way that using the term “smartphone” is redundant. The internet-connected phone is no longer noteworthy; the flip-phone is.

In this way, the use of the word “smartphone” decreased as the market’s ability to provide that object increased. At least in the United States, smartphones are so ubiquitous that we have largely dropped the “smart” and are back to simply “phone.”

This observation is relatively simplistic. Of course language changes to reflect the lived, human environment, and technology is a key aspect of that environment. I am personally a little astounded by how quickly the word came into and fell out of use (a little over 10 years by my count), but this isn’t terribly interesting on its own. It does, however, illustrate an issue when undertaking a digital preservation project, particularly one that focuses on preserving software, like my current work at the National Library of Medicine. As I familiarize myself with obsolete technology, I also need to familiarize myself with obsolete language. Let’s just say that the word for “back-end” does not seem to be as stable as one may have assumed.


Hello! My name is Nicole Contaxis, and I’m currently in Washington DC as part of the National Digital Stewardship Residency. More on that later.

In general, I’m interested in digital preservation, the semiotics of computation and digital media, and the history and rhetoric of technology. Or, more clearly said, I spend most of my time thinking about how we communicate and the ramifications of both the content and nature of that communication. Some of this blog will cover these types of issues and will deal with technology and history more widely.

However, many blog posts will deal with my current project as a part of the National Digital Stewardship Residency. I am working at the National Library of Medicine on a project titled “NLM-Developed Software as Cultural Heritage.” What that means is that I’m trying to track down all of the software developed at NLM and design a preservation strategy for it. Considering that NLM has a 40-year history of developing software for internal needs and for its users, I’ve got my work cut out for me. Nevertheless, I’m pumped. Software is a huge part of the lived experience, and I’m excited to play my part in ensuring long-term access to executable files.