
OOPSLA 2001 Workshop
Software Archeology: Understanding Large Systems

Ward Cunningham, Andrew Hunt, Brian Marick, and Dave Thomas
(Address correspondence to marick@visibleworkings.com)

  Wiki Forum

Position Papers

Recent Changes  

How do you come to grips with 1,000,000 lines of code right away?

Programmers are often given a large system they’ve not seen before, built by people they don’t know, touched by many people since, documented sketchily if at all. They’re told to improve it. Their task might be to fix a bug, add a feature, or complete a refactoring. They are under time pressure, so they need to minimize the total time spent learning and the time spent improving.

In this workshop, we will share techniques and approaches for understanding enough about a lot of code in not much time. We are concerned not just with speed, but also with confidence: how can you know you’ve made an improvement, not made the system worse?

Our assumptions

  1. Archeology is a useful metaphor: programmers try to understand what was in the minds of other developers using only the artifacts left behind. They’re hampered because the artifacts were not created to communicate to the future, because only part of what was originally created has been preserved, and because relics from different eras are intermingled.
  2. Large systems require different techniques than small ones.
  3. The approaches used under time pressure differ from those used when a program is understood at leisure.
  4. Time pressure and large systems combine to make many changes entropy-increasing. How can a change be made expeditiously and with confidence that it’s correct and with some hope that the system will not now be even harder to understand?

Our goals for the workshop

Participants will share concrete techniques and approaches that they can take back and use. Each person will be better at software archeology.

The workshop will be summarized in a first version of a “software archeologist’s handbook”, written results useful to those not at the workshop. It will be published on the web, and it will be associated with a Wiki forum so that discussion can continue.

Getting into the workshop

There is a limit of 15 participants.

Participants will be accepted based on a position paper, which should be sent to marick@visibleworkings.com by August 17, 2001. The position paper will have two parts:

Experience report. This will take one of two forms. In the first, you will go out, find a large object-oriented system you have no experience with, do something to it (such as fix a bug or add a feature), and write up how you came to understand enough to accomplish the task. We have suggestions for systems to look at in the Wiki Forum, but you are free to choose your own.

The second form is a “war story” about a recent experience of having been thrown onto a legacy project with a difficult deadline for some specific task. The approach taken should, again, be described in detail.

Position statement. You will state something the world needs to know: a theory of how people understand systems, a tool you wish you had, a method to follow.

In addition, we expect potential participants to make suggestions for workshop activities: “Let me talk about X”, “let’s do Y in pairs”, or “let me demo this tool I found”. This will be done in another part of the Wiki forum. The committee will take these suggestions into account when selecting participants.

Position papers will be posted on this web site after participants are selected.

Position papers should be no more than three pages long. Shorter is fine. We prefer HTML. PDF or some format that can easily be converted into PDF is fine, too.

What will happen at the workshop

As described above, activities will in part depend on participant suggestions. We expect the basic structure to fall into three parts, the first before lunch, the remaining two after.

Talk. Selected people will make presentations, perhaps on a favorite technique. We’ll use the format of the Los Altos Workshop on Software Testing <http://www.kaner.com/lawst.htm>. It’s a moderated format, with a note-taker, that emphasizes questioning of the speaker. Questions come in two phases:

  1. Clarifying questions to help the speaker explain herself.
  2. Questions exploring the boundaries of the technique: Upon what assumptions does it depend? In what situations does it not work?

So that probing can be thorough, presentations are not time-boxed. At the end, a list of useful techniques and generalizations will have emerged.

Action. Pairs or small groups will fix bugs in large systems. There are a variety of approaches we could use:

  1. Break into pairs and work on different bugs, using only the information from a bug report.
  2. Have someone who knows a system challenge the group with a bug report. The group would collectively brainstorm their way to an understanding of the system and strategize on a fix. Smaller groups would implement the fix, then gather to compare work.

And so forth.

Talk. The full group will gather again to discuss what was learned through action.

Organizers

Ward Cunningham is a founder of Cunningham & Cunningham, Inc. He has also served as Director of R&D at Wyatt Software and as Principal Engineer in the Tektronix Computer Research Laboratory before that. Ward is well known for his contributions to the developing practice of object-oriented programming, the variation called Extreme Programming, and the communities hosted by his WikiWikiWeb. He is active with the Hillside Group and has served as program chair of the Pattern Languages of Programs conference which it sponsors. Ward created the CRC design method which helps teams find leveraged objects for their programs. Ward has written for PLoP, JOOP and OOPSLA on Patterns, Objects, CRC and related topics.

Andy Hunt is co-author of the best-selling book The Pragmatic Programmer, the new Programming Ruby, and various articles. Between writing, traveling, woodworking and playing the piano, Andy finds time for his consulting business specializing in agile software development. Andy has been writing software professionally since the early 80's across diverse industries such as telecommunications, banking, financial services, utilities, medical imaging, graphic arts, and Internet services. Andy has slogged through a lot of impenetrable legacy code in the process, and even created his own tortured, byzantine code for the enjoyment of generations to come.

Brian Marick specializes in software testing, especially code-based software testing. This sometimes involves being dropped into a project near the end, having to quickly understand enough of the state of the product and the structure of the code to focus the testing effort, and then finding bugs fast. He’s observed that some people are strikingly better at that than others. He’d like to know why, in detail. To that end, he’s made “adequate understanding of large systems” the topic of a mid-life PhD under Ralph Johnson.

Dave Thomas has been writing software for money for 25 years, and reading code for pleasure for almost as long. As an independent consultant, and a user of open source tools, he often finds himself having to understand and make fixes to large pieces of software under ridiculous time pressures. Sometimes this is a team process, facilitating client staff as we scramble towards enlightenment, other times it is a solo journey. Always, it seems like 3 parts science, 5 parts art, and 2 parts luck. He suspects that there's gold to be mined from this mix, and is looking forward to using the workshop as an opportunity to do some digging. Dave is co-author of The Pragmatic Programmer and Programming Ruby, and is a signatory of the Agile Manifesto.
