On Creating a Disciplined and Ethical Practice of Software Archeology
By Norman L. Kerth
My dictionary defines the word archeology as:
The systematic recovery and study of material evidence, such as graves, buildings, tools, and pottery, remaining from past human life and culture.
This definition suggests that there is more to the concept of software archeology than just how quickly a developer might understand a piece of a large system that they have to repair. I agree that the rapid understanding of a large software system has immediate commercial application. With that said, I stress that rapid understanding for the purpose of modification is only a small part of the total riches that can come from the careful study of an ancient monument of software antiquity.
For example, in Using Patterns to Improve Our Architectural Vision, the authors suggest that software archeology is the foundation for a new approach to software architecture, built upon the study of the masterpieces of our field's great software architects. They urge that the inventions and discoveries made by the early pioneers in our field be captured in pattern language form, so that modern-day practitioners can find and appreciate them.
But software archeology is not concerned just with long-lived architecture. My dictionary's definition of archeology also directs us to the tools, languages and libraries used; bug reports; configuration management activities; schedule and budget histories; and so forth. The goal is not just to understand the artifact but, through the artifact, to come to understand human life and culture. In such a study we might discover the value of development practices, procedures, methodologies, and the like. We might see the rise and fall of certain disciplines. Such studies can lead to a more mature and reasoned modern-day practice of our profession. We might also learn to shy away from practices that show a history of long-term trouble.
So I ask the following questions:
How do we determine which systems are worthy of study? Is every system that needs enhancement worthy of an archeological study? Just as most of the buildings in the Yucatan Peninsula have little research value, many software systems have little to teach us. But what about the works of our great masters? Suppose we could find the code for Dijkstra's THE Operating System; would it be worthy of study? I'd think yes. How about the operating system and compiler from Wirth's Lilith Machine? I personally know there is great value in spending time studying that code. Sadly, at the time, I didn't know how to document what I learned, so while it was a great learning experience for me, I have lost the opportunity to share what I discovered.
When I read National Geographic, or watch the television show Nova, I see field archeologists at work. They cordon off a site, develop a grid, and prepare to meticulously record their research activities as well as their findings. They proceed at what I'd call a terribly slow pace, because they want to make sure they don't destroy evidence that might interest future researchers. This discipline has been developed over more than 100 years. Early archeologists made many mistakes and are remembered in history not only for their discoveries but also for their mistakes. How should we study a great work of software antiquity?
If we are studying a software system, funded by the goal of improvement, can we also be looking to discover and preserve the riches that are contained within, including the understanding of how the system came to be? If so, then what impact on this treasure would refactoring have? Are the cultural lessons to be learned likely to be lost as a programmer with an object-oriented perspective works on code crafted by masters of Lambda Calculus? Would we commit the mistake made by archeologists who tried to shore up an Anasazi ruin with concrete, and thereby lost whatever discoveries might have been possible, even if the ruins had fallen to the ground?
I'm also concerned that the authors of the original software are respected for what they knew at the time, given the tools they had available at the time, and so forth. We wrong our forefathers by judging them through modern day knowledge, practices and beliefs. It's easy to laugh at a FORTRAN programmer using GLOBAL COMMON extensively, until you understand the alternatives she had available during that stage in the development of our field.
Is a pattern language really the best approach? I'm sure there are other forms with advantages as well. Does UML have any utility, given that it was designed to express design ideas in a form that parallels modern-day thought?
This brings me to my last question:
Should this have been two separate workshops?
Using Patterns to Improve Our Architectural Vision, Norman L. Kerth and Ward Cunningham, IEEE Software, Vol. 14, No. 1, January/February 1997, pp. 53-59.