An Examination of Software Engineering Work Practices

An Examination of Software Engineering Work Practices Singer, Janice; Lethbridge, T.; Vinson, Norman; Anquetil, N. March 2002

NRC Publications Archive
Archives des publications du CNRC

An Examination of Software Engineering Work Practices


Singer, Janice; Lethbridge, T.; Vinson, Norman; Anquetil, N.

NRC Publications Record / Notice d'Archives des publications de CNRC:


http://nparc.cisti-icist.nrc-cnrc.gc.ca/npsi/ctrl?action=rtdoc&an=5209032&lang=en
http://nparc.cisti-icist.nrc-cnrc.gc.ca/npsi/ctrl?action=rtdoc&an=5209032&lang=fr


An Examination of Software Engineering Work Practices¹
Janice Singerα, Timothy Lethbridgeβ, Norman Vinsonα, Nicolas Anquetilβ
α Institute for Information Technology, National Research Council, Ottawa, ON, K1A 0R6
β School of Information Technology and Engineering, University of Ottawa, Ottawa, ON, K1N 6N5

¹ This work is supported by NSERC and the company serving as the site of the study. This work was sponsored by the Consortium for Software Engineering Research (CSER). The IBM contact for CSER is Patrick Finnigan.

Abstract

This paper presents work practice data of the daily activities of software engineers. Four separate studies are presented: one looking longitudinally at an individual SE; two looking at a software engineering group; and one looking at company-wide tool usage statistics. We also discuss the advantages of considering work practices in designing tools for software engineers, and include some requirements for a tool we have developed as a result of our studies.

1. Introduction

The Knowledge Based Reverse Engineering Project's goal is to provide software engineers (SEs) in an industrial telecommunications group with a toolset to help them maintain their system more effectively. To achieve this goal, we have adopted a user-centered design approach to tool development [6, 7, 8]. However, unlike traditional user-centered approaches, we have focused on the SEs' work practices. This represents a new approach [15] to tool design.

This approach borrows from several different fields in an effort to more accurately assess users' behavior and then provide them with tools that enhance, rather than displace or replace, these work practices. The rationale is that the tools that are built will actually be used because they have been created to mesh with existing behavior. This paper will describe our experiences with this approach and what we have learned about the work practices of one group of SEs at a large telecommunications company.

The rest of this introduction will first critically examine the more traditional uses of psychology in the program comprehension literature, and second describe the study of work practices. We will then outline some results of a study we conducted at a large telecommunications company. Finally, we will discuss the implications of these results for tool design.

1.1 Empirical Studies of Programmers (ESP)

One human-computer interaction approach to the design of tools has been to study the cognitive processes of programmers as they attempt to understand programs [19, 20, 21]. The results of such studies are supposed to provide the basis for designing better tools. In other words, understanding the mental processes involved in programming will permit the design of tools that mesh with the programming process.

In this vein, ESP research has identified a number of programmers' approaches to the comprehension 'problem', including the top-down [12], bottom-up [4], and as-needed strategies [9], and the integrated meta-model [21].

There are three problems with this research, though, as it pertains to tool design. First, the vast majority of the research has been conducted with graduate and advanced undergraduate students serving as expert programmers (but cf. [21]). It is not at all clear that these subjects accurately represent the population of industrial programmers. Consequently,
the results of studies involving students cannot be generalized to programmers in industry.

Second, to control extraneous variables, researchers have used programs that are very small (both in terms of lines of code and logic) relative to industrial software. This poses a generalization problem as well: it is not clear that approaches to comprehending small programs scale up to the comprehension of very large programs.

Third, there is an assumption that understanding the programmer's mental model is an efficient route to designing effective tools. However, it is not at all obvious how to design a tool given a specification of the programmer's mental model. For instance, how does knowing that programmers will sometimes use a top-down strategy to understand code [12] inform tool design? It doesn't tell us what kind of tool to build, or how to integrate that tool into the workplace or the programmer's work. Furthermore, given this knowledge, it is not clear how to help the programmer build that mental model; how to help her apply it; or how to help her use it effectively in software engineering activities.

These three problems with the ESP approach suggest that an alternative approach to tool design may be more effective.

1.2 Human Computer Interaction

Currently, there is a strong focus on usability in the field of human-computer interaction [10]. That is, designers attempt to ensure that prospective users can use the software without encountering interface difficulties. For instance, it should be clear to users what action they should take at each step, preferably without referring to documentation. Another aspect of usability is the minimization of the number of steps and the amount of time needed to accomplish a task. To determine whether software is sufficiently usable, prospective users are observed using the software for a few, or several, minutes. Reaction times, errors, backtracking to previous states, and failures to accomplish the task are recorded along with the conditions under which they occurred. These data are then used to fix the interface and, ideally, more of these test-redesign iterations take place until the software is sufficiently usable.

However, we see problems with this approach. While it may increase the usability of systems, it does not guarantee that the systems that are built will be genuinely useful [2, 18]. The usability approach cannot speak to the issue of whether a user will adopt and use a new tool in the workplace because that is not the point, or the focus, of usability. Moreover, several features of the usability approach prevent it from informing the designers about the acceptance of the tool in the workplace. Usability testing usually takes place outside the normal work setting, sometimes in a room especially designed for that purpose. This method of testing prevents the user from behaving in a normal manner because it isolates him from resources that are not part of the software (such as colleagues, documentation, notes). In other words, it prevents the user from engaging in his day-to-day work practices. In addition, during usability testing, the user is essentially forced to use the software. In consequence, it is impossible to collect data on whether the user would use the software if he were given a choice between his existing work practices and the new software.

The lack of tool adoption and use is a major problem in the area of tool design for software engineering. However, because of its features and techniques, usability cannot inform designers on this issue. We believe that to build tools that are actually used, designers must first understand what it is that SEs do when they work. This is the reason for our focus on work practices in designing software engineering tools.

1.3 Work Practices

The study of work practices is a relatively new field [2, 3, 18] which seeks to understand how work occurs and, from this understanding, suggest appropriate technologies for the workplace. Work practices have been studied in such diverse fields as law, navigation, document use, etc.

In studies of work practices, data are generally collected by following and recording the work that people do. Researchers often rely on ethnographic methodologies producing diverse sets of data. The challenge, then, is to take work practice data sets and put them into a form that is useful to designers.

Our approach to this problem has been to implement many different data collection techniques (these methods are detailed more precisely in [7, 8, 13]) and see if the evidence from each converges. Then we will use these data to decide what types of tools
would best solve the problems that SEs face in their daily activities.

The first thing that struck us when we entered the workplace was that we did not know exactly what it was that the SEs did on a day-to-day basis. That is, we knew neither the kinds of activities they performed, nor the frequency with which these various activities took place. As far as we could tell, there were many hypotheses about the kinds of things SEs do, but no clear 'cataloging' as such of exactly how SEs go about solving problems. Consequently, we decided to begin our study of work practices by finding out what it is that SEs do when they do their work. First, we will briefly describe the characteristics of the workplace. Then, the rest of this paper will present the findings from several studies we conducted to answer this first question.

2. Workplace Characteristics

The group we are studying maintains a large telecommunications system that is one of the key products of the company. The management of the group is fairly informal, with group members able to select the problems on which they work.

Group members work in close proximity and often walk over to each other's desks with questions. The group also makes use of a laboratory in which the target hardware is installed.

2.1 The System

The system includes a real-time operating system and interacts with a large number of different hardware devices. The system contains several million lines of code with over 16,000 routines in over 8,000 files. It is also divided into numerous layers and subsystems written in a proprietary high-level language.

The system was first fielded in the early 1980s and has since been continually updated. Its importance to the company and its evolution are expected to continue for many years to come.

Approximately 13 people actively work on various aspects of the system at the current time. Over 100 people have made changes to the source code during the life of the system.

2.2 Software Engineering Process And Tools In The Group

The group follows a well-defined process for creating new system features. They also keep detailed records of problem reports and the consequent changes to the system. Other important documents include the 'practices' that are followed by those who install and run the system in the field.

Careful attention is paid to quality control in the form of design reviews, informal code inspections, and an independent test team.

Development work is done on the Sun platform, although the SEs must also spend considerable time installing and running the software on various configurations of the target hardware.

3. SE Activities

We collected five basic types of SE work practice data. First, using a web questionnaire, we simply asked the SEs what they do. Second, we followed an individual SE for 14 weeks as he went about his work. Third, we individually shadowed 9 different SEs for one hour as they worked. Fourth, we performed a series of interviews with software engineers. Finally, we obtained company-wide tool usage statistics. The next several sections will outline more precisely our methodologies and results from these various studies.

3.1 Questionnaire Study

We began this research by administering a web-based questionnaire. The questionnaire covered many different aspects of the SEs' work. Here we report their answers to a question on what they spend their time doing. Six SEs in the group of 13 responded. The question was open-ended, i.e., the SEs had to decide how to describe their work, rather than choosing certain activities from a list.
On average, SEs said that they spend 57% of their time fixing bugs, and 35% of their time making enhancements to the system. Table 1 shows more specifically the activities they reported engaging in, and the percentage of people reporting each activity.

Activity                          % of people
Read documentation                66%
Look at source                    50%
Write documentation               50%
Write code                        50%
Attend meetings                   50%
Research/identify alternatives    33%
Ask others questions              33%
Configure hardware                33%
Answer questions                  33%
Fix bug                           33%
Design                            17%
Testing                           17%
Review others' work               17%
Learn                             17%
Replicate problem                 17%
Library maintenance               17%

Table 1: Questionnaire results of work practices (6 responses).

The most reported activity was reading documentation. SEs also reported that they spend time looking at source, writing documentation, attending meetings, and writing code. Other activities include consulting (both answering and asking questions), working with the hardware, testing, designing, and fixing bugs.

Because of the questionable validity of self-reports, we felt it was extremely important not just to rely on what SEs said they did, but to actually observe them as they worked. Hence the next sections of the paper describe two studies that we undertook towards this goal.

3.2 Individual study

We have been following one SE, whom we call B, longitudinally from the time he joined the company (November, 1996). For the first six months, we spent about 1-1/2 hours per week with B. However, as B has become more expert, we have found that it makes more sense to meet once every 3 weeks. This is both because new things happen less frequently (e.g., experience with a new tool) and because B is busier with 'real' tasks. B is an experienced SE (he was previously a team leader); thus, while he is new to the company, he is certainly new to neither maintenance nor telecommunications software.

Our sessions with B consist of 3 distinct components. First, we talk about what has transpired since the last time we met. This could be anything from code review to learning about a new tool to reading documentation, etc. Second, we ask B to look at a diagram of the system that he previously constructed and ask him to modify it if it does not reflect his current understanding of the system. Finally, we 'shadow' B as he works for 1/2 hour. In this paper, we report the data from the shadowing.

3.2.1 Method

3.2.1.1 Subject

B has worked in the software industry for many years. Prior to joining the telecommunications company, he worked as a team leader for a nearby competitor. There, B maintained a product in the same category as the current product, but developed on a much smaller scale.

B has experience in several languages, but prior to joining the company, considered himself to be an expert only in an in-house proprietary language. Likewise, while he has experience in several platforms, prior to joining the company, B considered himself to be an expert only in an in-house proprietary 68K development platform. B has worked on 5 different systems, 3 of which involved development and 2 of which involved maintenance.

B joined the company in November, 1996. Before then B had no experience in the company's in-house Pascal-based proprietary language. Nor did B have any experience in Pascal, although he had programmed in other structured languages. B had used VI before coming to the company, but planned on switching to the Emacs editor at the company. Similarly, he had used Grep previously, but was switching to use of Egrep and Fgrep at the company. B did not have previous experience with the other tools available at the company.

3.2.1.2 Procedure and Data

The shadowing data result from 14 half-hour sessions ranging from October 17, 1996 to February 27, 1997. Some days are missing because of vacation or
schedule conflicts. For the most part, however, these dates reflect weekly meetings with B.

For half an hour, we would sit behind B and write down the things he did. For instance, if he used Grep, that would be recorded (using pencil and paper). If he read documentation, or wrote notes to himself, that was written down.

We recorded B's activities in detail, but not to the point of exactly what he typed or said. For example, we would record that B edited a file or interacted with the hardware, while not detailing the exact nature of his involvement with these activities.

A new activity was recorded each time a switch in activity occurred. So, for instance, if B did 6 Greps in a row, that was recorded as a single instance of the event Grep. Then if he did 4 Diffs, a single Diff event would be recorded. Taking that to its extreme, if all B did was Grep for 1/2 hour, that is the single activity that would have been recorded for that 1/2 hour. No time measures were taken. Thus, we do not know the duration of B's involvement in each distinct event.

We followed the shadowing procedure regardless of the nature of B's work. Sometimes that meant that we observed B reading documentation only. Other times B was engaged in a wide variety of tasks.

As a general note, there is probably some self-selection of activities involved in B's choices of things to do. For instance, it is highly unlikely that B would have chosen to respond to personal email when he was being shadowed by us. As a rule, he was always directly involved in work activities. We do not consider this to be too much of a problem, however, because our goal is, after all, to build tools that help SEs work.

3.2.2 Results

The shadowing events were categorized into 14 distinct categories, which are described in Table 2. Each of B's events was then classified as belonging to one of these event categories.

Activity            Description
Call trace          Looking at an execution trace of the program
Consult             Either being consulted or consulting someone else
Compile             Linking or compiling a program
Configuration Mgt   Entering and using the in-house configuration management system (sometimes for updating, and sometimes to search for past updates)
Debug               Using either the high-level or low-level debugger
Documentation       Looking at documentation
Edit                Actually making a change to source code
Management          General software activities, such as meetings, code reviews, etc.
In-house tools      Using one of the in-house tools, primarily static software analysis tools
Notes               Taking notes, or reading past notes
Search              Using Grep, in-house search tools, or searching in an editor
Source              Looking at source code using editors or code viewers
Hardware            Interacting with the hardware, e.g., loading software, running software, configuring the hardware, etc.
UNIX                Issuing a general UNIX command such as LS, CD, etc.

Table 2: Categories of activities observed when shadowing software engineers

The data were then examined in two distinct ways. First, Figure 1 shows the percentage of days (for a 14 day span) on which an event occurred at least once. For example, if B searched for information one day, the search count would be incremented by 1, regardless of whether B searched 1 time, 4 times, or 24 times on that particular day.

Searching and interacting with the hardware were the most likely events to occur on a daily basis, each occurring on 8 of the 14 days. B looked at the source code on 6 of the 14 days. The reason that B searched on more days than he looked at the source code is that searching was an activity that also occurred when interacting with the hardware and debugging. B only looked at documentation on 2 of the 14 days. This is surprising because, at the time, B was still a relative novice to the software system and it is commonly assumed that novices will spend much of their time reading the documentation to get a handle on what they are doing. The data show that this was not the strategy B pursued. However, because B was a novice, it was not surprising to find that editing code, compiling, and management were each done on only 1 of the 14 days.

Figure 2 shows the proportion of each event type out of the total of 156 distinct events. Unlike Figure 1, Figure 2 shows the total count, so that if B searched 8 times on one day, that is counted as 8 instances of search.
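The recording rule and the two tallies described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the sample logs are hypothetical, not the study's data or tooling, and the sketch collapses repeats at the level of the activity category rather than the individual command.

```python
from collections import Counter
from itertools import groupby


def collapse_runs(actions):
    """Apply the recording rule: consecutive repeats of one activity form a single event."""
    return [activity for activity, _run in groupby(actions)]


def days_with_event(sessions):
    """Figure 1 style tally: number of sessions in which an activity occurred at least once."""
    tally = Counter()
    for actions in sessions:
        tally.update(set(collapse_runs(actions)))
    return tally


def total_events(sessions):
    """Figure 2 style tally: total number of recorded events per activity."""
    tally = Counter()
    for actions in sessions:
        tally.update(collapse_runs(actions))
    return tally


# Hypothetical observation logs for two half-hour sessions (invented for illustration).
sessions = [
    ["Search", "Search", "Search", "Source", "Search", "Hardware"],
    ["Documentation", "Documentation"],
]

print(days_with_event(sessions))
print(total_events(sessions))
```

In the first hypothetical session, the three consecutive searches collapse into one event, so Search counts once in the per-day tally but twice in the total tally, because a second search follows the Source event.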
Again, we see that overall B searched more often than he did anything else (37 times). He also frequently looked at the source code (33 times). While B was likely on any particular day to work with the hardware (see Figure 1), he did so on only 22 distinct occasions.

Remember, however, that these data do not include time measurements, but simply activity switches. So, for instance, while B did management activities on only 1 day, the code review that was undertaken took the entire 1/2 hour.

Thus overall, in terms of both daily activities and frequency of different activities, search for information about the system, whether through Grep, in-house search tools, or within a particular editor or debugger, figures most prominently. A significant amount of effort was also expended interacting with the hardware and looking at the source code.

Figure 1. Percentage of days on which B engaged at least once in a particular activity. [Bar chart over the 14 activity categories of Table 2; vertical axis from 0% to 60%.]

Figure 2. Percentage of times each type of event occurred out of a total of 156 distinct events. [Bar chart over the 14 activity categories of Table 2; vertical axis from 0% to 60%.]

3.3 Group study

To generalize our findings, we have conducted several studies that focus on different aspects of the work of an entire group of SEs.

We have collected four types of data from the group. First, we asked the SEs to draw a diagram or picture of their current understanding of the system, a conceptual map, if you will. Second, we conducted intensive interviews with the SEs as they solved a real
problem with the software. This generally involved 1 hour interviews over the course of several days. Third, we asked the SEs to recount how they solved a recently encountered problem. Finally, we spent one hour shadowing each SE as they went about their work. This report focuses on this fourth type of data: the shadowing data.

3.3.1 Method

3.3.1.1 Subjects

Eight group members participated in the shadowing study. Their experience ranged from the most expert member of the group (8 years) to the least experienced (6 months; a recent college graduate). All but one of the shadowed subjects worked on the main controller of the hardware. One of the subjects worked primarily on the database component.

The subjects were expert in a wide variety of platforms and languages, and had experience in both development and maintenance environments.

3.3.1.2 Procedure and Data

The shadowing occurred in the same manner as for B: we sat behind the SEs and recorded the activities they engaged in. Again, a new activity was recorded when there was a switch in activity, so 9 Greps in a row counted as one instance of the activity search. Durations of activities were not recorded.

We recorded activities in gross, not fine, detail; e.g., we did not record the arguments to particular commands.

Shadowing schedules were not chosen to reflect any particular activity, but rather were scheduled at times convenient for the SEs. Shadowed times were relatively free from stress, i.e., SEs were not shadowed as deadlines approached.

Again, there is probably some self-selection involved in the activities that the SEs pursued. However, it was very clear that they were all working on 'real' problems as evidenced by their concern with the problem report's contents.

3.3.2 Results

Like B's data, the shadowed events were categorized into 14 distinct categories, which are described in Table 2. Each of the events was then classified as belonging to one of these event categories. 357 distinct events were recorded.

Figure 3. Percentage of users who engaged in a particular type of activity. [Bar chart over the 14 activity categories of Table 2; vertical axis from 0% to 100%.]
Figure 3 shows the proportion of users who engaged in a particular type of activity at least once during the shadowed hour. All 8 SEs looked at the source, conducted a search, and changed the source code at least once during the hour. Most of the SEs also engaged at least once in several other activities, with 5 of the 8 SEs interacting with the hardware, debugger, or the in-house tools. On the other hand, only 3 SEs looked at a call trace, while only one SE performed a management activity.

Figure 4 shows the percentage of times a particular type of event occurred out of the total of 357 events (totaled over the 8 SEs). Issuing a UNIX command was the most frequent activity, occurring 54 times. A close second was looking at the source, which was done 52 times. Interacting with the hardware or the debugger, searching, or changing the source code was done on 36, 32, 31, and 30 occasions respectively. Configuration management, consulting, compiling, and looking at in-house tools were each done about 20 times.

Surprisingly enough, reading the documentation, although done by 6 of the 8 SEs, accounted for only 12 separate events. Clearly, the act of looking at the documentation is more salient in the SEs' minds (as evidenced by the questionnaire data) than its actual occurrence would warrant.

SEs only occasionally wrote notes, looked at the call trace, or did management activities. This is not to say that these events are not important, but merely that they did not occur as frequently as other events.

As B did, the group frequently examined the source code. Every SE in the group made at least one search during their shadowing session, but search was less prominent than in B's activities. Search ranked as the most frequent event type for B, while it was the 4th most frequent for the group.

Code editing and compiling were more prominent activities in the group data. This is probably because B was still learning the system at the time we shadowed him, so he was not yet in a position to make many changes. This may also explain the higher prominence of call trace in his data: call trace may be effective in gaining an initial understanding of a system.

Interestingly, in-house tools and documentation were both relatively infrequent activities for both the group and B.

The group data converge with B's data to suggest that looking and searching through the source code are prominent activities for SEs. Editing and compiling also seem important. This concurs with what we would expect in that the code is the focus of their work.

Figure 4. Proportion of times a particular type of event occurred out of the total of 357 events. [Bar chart over the 14 activity categories of Table 2; vertical axis from 0% to 100%.]

3.4 Company Study

The final study we report concerns company-wide tool usage statistics. These data were obtained from the company's tool group. This group is responsible for acquiring, updating, and maintaining the company's tools. Collecting usage statistics is part of their mission.

3.4.1 Results

The data presented here represent one week of Sun tool usage by 367 users in late May. Note that this week occurred before 'vacation season,' so it is fairly representative of peak tool usage. There were 79,295 separate tool calls logged from the Sun operating system. Each call counts as one usage event. These tool calls were classified according to the scheme presented in Table 3.
Tool                  Description
Compilers             Compilers, assemblers, linkers
Compression           Compression tools such as zip and unzip
Configuration Mgt     Make and an in-house configuration management tool
Debuggers             General and in-house debuggers
Editors               Emacs, VI, and various others
Formatters            Tools such as latex and groff
Graphics Tools        Tools to create and display graphics
Hardware Connectors   In-house tools to connect to hardware
Internet Tools        Web browsers, news readers, and email programs
In-house Tools        Primarily software static analysis tools
Operating System      Windowing, terminal, and various other OS tools
Search                Primarily variations of Grep, but some in-house tools
Viewers               Document viewers such as More and Less
Other                 A collection of various other tools

Table 3. Classification of the types of Sun tools.
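A classification of this kind could be applied to a log of tool calls roughly as follows. This is a minimal sketch, not the tool group's actual procedure: the mapping covers only tools named in this paper, the lower-case command names are an assumption, and the log entries (including the unknown command `xyzzy`) are invented.

```python
from collections import Counter

# Partial, illustrative mapping from command name to a Table 3 tool type.
# Only tools that the paper names explicitly are listed here.
CATEGORY_BY_TOOL = {
    "grep": "Search",
    "egrep": "Search",
    "fgrep": "Search",
    "emacs": "Editors",
    "vi": "Editors",
    "more": "Viewers",
    "less": "Viewers",
    "zip": "Compression",
    "unzip": "Compression",
    "latex": "Formatters",
    "groff": "Formatters",
    "make": "Configuration Mgt",
}


def classify(tool_name):
    """Return the tool type for a command, falling back to 'Other' when unknown."""
    return CATEGORY_BY_TOOL.get(tool_name.lower(), "Other")


def tally_calls(tool_calls):
    """Count one usage event per logged call, grouped by tool type."""
    return Counter(classify(name) for name in tool_calls)


# Hypothetical log of tool calls (invented for illustration).
log = ["grep", "egrep", "emacs", "make", "unzip", "grep", "xyzzy"]
print(tally_calls(log))
```

Because each call counts as one usage event, a tool that stays open for a long time (such as an editor) contributes a single event, while a tool invoked repeatedly (such as Grep) contributes one event per invocation.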


Figure 5 shows the proportion of times that each type of tool was used.
Compilers, which accounted for 32,422 calls, or 41% of all calls, are
not included in this graph. This is because the compiler data include
all the automatic software builds done nightly and by the various
testing and verification groups. These data are therefore not
representative of the SEs’ real work practices.

The overwhelming finding from the company data is that search is done
far more often than any other activity. In fact, search accounts for
21,146 events over the course of the week, or an average of about 58
searches per individual user. Compression and uncompression tools are
also used often. We never actually observed anyone using these tools.
Perhaps they are used by the verification groups.

The configuration management system was activated 2819 times,
accounting for approximately 4% of all events. At this company, the
configuration management system is central to the work process: it is
used for retrieving files, filing changes, and searching through past
changes (along with associated documentation).

Editors and viewers account for approximately 3190 events, or 4% of the
total number of events. This low frequency could be due to counting
particularities that apply only to editors. In the company tool data,
an editor command is counted only when the editor is opened. Once an
editor is open, it generally stays open, regardless of how many changes
are made, or how many files are viewed. In contrast, in the shadowing
data, an edit was recorded each individual time the source was changed,
and a source event was counted each time the source was examined,
whether the editor was already open or not. Consequently, it comes as
no surprise that the shadowing data edit and source frequency is higher
than that of the company data.

[Figure 5: bar chart; vertical axis from 0% to 50%; one bar for each
tool type of Table 3 other than Compilers.]

Figure 5. Proportion of all tool calls accounted for by each tool type.
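
The counts quoted in this section can be cross-checked with a little arithmetic. The sketch below is only a back-of-envelope estimate: it assumes that the rounded 41% compiler share is close to exact, so the derived total number of calls and the derived number of users (neither of which is reported in the text) inherit that rounding error.

```python
# Figures reported in the text for one week of company tool-usage data.
compiler_calls = 32_422        # stated to be 41% of all calls
compiler_share = 0.41          # rounded in the text, so results are approximate
search_events = 21_146
config_mgt_events = 2_819
editor_viewer_events = 3_190
searches_per_user = 58         # "about 58" in the text

# Derived estimates (assumptions of this sketch, not reported values).
total_calls = compiler_calls / compiler_share
estimated_users = search_events / searches_per_user


def share(events: int) -> float:
    """Fraction of all tool calls accounted for by `events`."""
    return events / total_calls


if __name__ == "__main__":
    print(f"estimated total calls:  {total_calls:,.0f}")
    print(f"estimated users:        {estimated_users:,.0f}")
    print(f"search share:           {share(search_events):.1%}")
    print(f"configuration mgt:      {share(config_mgt_events):.1%}")
    print(f"editors and viewers:    {share(editor_viewer_events):.1%}")
```

Under these assumptions search comes out at a little over a quarter of all calls, and configuration management and editors/viewers each come out close to 4%, which agrees with the rounded percentages given in the text.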
Again, the in-house tools are not used very frequently, but that belies
their importance. These tools are important because they perform
necessary functions that cannot be performed by other tools.

Search is the most frequently used tool at the company-wide level. Grep
and its variants are the most frequently used search tools, accounting
for 21,117 separate invocations. Clearly, search is an important aspect
of SEs’ work practices.

3.5 Discussion

This examination of work practices suggests that search is an important
component of real, day-to-day, software engineering. It is therefore
quite reasonable to think that an improvement in search tools would
help SEs to do their job better. In fact, the KBRE group has decided to
focus its efforts in this direction. Currently, we are implementing a
source code exploration tool [6] and investigating ways to introduce it
into the workplace.

In order to improve these new tools, we are continuing our study of SE
work practices in several ways. First, we are examining the source code
search activity: identifying the kinds of things SEs search for, how
many searches they need to find a particular piece of information, etc.
Second, we are continuing our longitudinal study of B’s work and our
involvement with the group. Finally, we are talking to SEs at other
companies to determine whether our findings generalize to their work
practices.

Our shadowing studies indicate that SEs also expend a significant
amount of effort in just looking at the source. This suggests that
intelligent viewers might prove valuable. Indeed, the process of
reading and navigating huge pieces of source code can be considered to
be a type of navigation and information retrieval problem [16]. In the
future, we plan on exploiting this perspective on code viewing,
especially in terms of the relationship between viewers and search.

4. Application of Work-Practices Studies to the Development of
Tool Requirements

We have used the data gathered during the work-practices studies
described in section 3, in order to develop requirements for software
engineering tools. This section describes those requirements.

4.1 The Software Engineering Task that We Address: Just in Time
Comprehension of Source Code

Almost all the SEs we have studied spend a considerable proportion of
their total working time in the task of trying to understand source
code prior to making changes. We call the approach they use Just in
Time Comprehension (JITC) [14]; the reason for this label will be
explained below. We choose to focus our research on this task since it
seems to be particularly important, yet lacking in sufficient tool
support.

The ‘changes’ mentioned in the last paragraph may be either fixes to
defects or the addition of features: The type of change appears to be
of little importance from the perspective of the approach the SEs use.
In either case the SE has to explore the system with the goal of
determining where modifications are to be made.

A second factor that seems to make relatively little difference to the
way the task is performed is class of user. Two major classes of users
perform this task: novices and experts. Novices are not familiar with
the system and must learn it at both the conceptual and detailed level;
experts know the system well, and may have even written it, but are
still not able to maintain a complete-enough mental model of the
details. The main difference between novice and expert SEs is that
novices are less focused: They will not have a clear idea about which
items in the source code to start searching, and will spend more time
studying things that are, in fact, not relevant to the problem. It
appears that novices are less focused merely because they do not have
enough knowledge about what to look at; they rarely set out to
deliberately learn about aspects of the system that do not bear on the
current problem. The vision of a novice trying to ‘learn all about the
system’ therefore seems to be a mirage.

As described in section 3, we observe that SEs repeatedly search for
items of interest in the source code, and navigate the relationships
among items they have found. SEs rarely seek to understand any part of
the system in its entirety; they are content to understand just enough
to make the change required, and to confirm to themselves that their
proposed change is correct (impact analysis). After working on a
particular area of the system, they will rapidly forget details when
they move to some other part of the system; they will thus re-explore
each part of the system when they next encounter it. This is why we
call the general approach just-in-time comprehension (JITC). Almost all
the SEs we have studied confirm that JITC accurately describes their
work paradigm – the only exceptions were those who did not, in fact,
work with source code (e.g. requirements analysts).

4.2 List of Key Requirements for a Software Exploration Tool

As a result of our work-practices studies (section 3), we have
developed a set of requirements for a tool that will support the
just-in-time comprehension approach presented in the last section.
Requirements of relevance to this paper are listed and explained in the
paragraphs below. Actual requirements are in italics; explanations
follow in plain text.

The reader should note that there are many other requirements for the
system whose discussion is beyond the scope of this paper. The
following are examples:
• Requirements regarding interaction with configuration management
  environments and other external systems.
• Requirements regarding links to sources of information other than
  source code, such as documentation.
• Detailed requirements about usability.

Functional requirements. The system shall:

F1 Provide search capabilities such that the user can search for, by
   exact name or by way of regular expression pattern-matching, any
   named item or group of named items that are semantically
   significant³ in the source code.

    The SEs we have studied do this with high frequency. In the case of
    a file whose name they know, they can of course use the operating
    system to retrieve it. However, for definitions (of routines,
    variables etc.) embedded in files, they use some form of search
    tool (see section 4.3).

    ³ We use the term semantically significant so as to exclude the
    necessity for the tool to be required to retrieve ‘hits’ on
    arbitrary sequences of characters in the source code text. For
    example, the character sequence ‘e u’ occurs near the beginning of
    this footnote, but we wouldn’t expect an information retrieval
    system to index such sequences; it would only have to retrieve hits
    on words. In software the semantically significant names are
    filenames, routine names, variable names etc. Semantically
    significant associations include such things as routine calls and
    file inclusion.

F2 Provide capabilities to display all relevant attributes of the items
   retrieved in requirement F1, and all relationships among the items.

    We have observed SEs spending considerable time looking for
    information about such things as the routine call hierarchy, file
    inclusion hierarchy, and use and definitions of variables etc.
    Sometimes they do this by visually scanning source code, other
    times they use tools discussed in section 4.3. Often they are not
    able to do it at all, are not willing to invest the time to do it,
    or obtain only partially accurate results.

F3 Provide capabilities to keep track of separate searches and
   problem-solving sessions, and allow the navigation of a persistent
   history.

    This requirement has come about because we observe users working on
    multiple problems and subproblems over a span of many days. We also
    observe them losing information they had previously found and
    redoing searches.

Non-functional requirements. The system will:

NF1 Be able to automatically process a body of source code of very
    large size, i.e. consisting of at least several million lines of
    code.

    As we are concerned with systems that are to be used by real
    industrial SEs, an engineer should be able to pick any software
    system and use the tool to explore it.

NF2 Respond to most queries without perceptible delay.

    This is one of the hardest requirements to fulfill, but also one of
    the most important. In our
    observations, SEs waste substantial time waiting for tools to
    retrieve the results of source code queries. Such delays also
    interrupt their thought patterns.

NF3 Process source code in a variety of programming languages.

    The SEs that we have studied use at least two languages – a tool is
    of much less use if it can only work with a single language. We
    also want to validate our tools in a wide variety of software
    engineering environments, and hence must be prepared for whatever
    languages are being used.

NF4 Wherever possible, be able to interoperate with other software
    engineering tools.

    We want to be able to connect our tools to those of other
    researchers, and to other tools that SEs are already using.

NF5 Permit the independent development of user interfaces (clients).

    We want to perform separate and independent research into user
    interfaces for such tools. This paper addresses only the overall
    architecture and server aspects, not the user interfaces.

NF6 Be well integrated and incorporate all frequently-used facilities
    and advantages of tools that SEs already commonly use.

    It is important for acceptance of a tool that it neither represent
    a step backwards, nor require work-arounds such as switching to
    alternative tools for frequent tasks. In a survey of 26 SEs [7],
    the most frequent complaint about tools (23%) was that they are not
    integrated and/or are incompatible with each other; the second most
    common complaint was missing features (15%). In section 4.3 we
    discuss some tools the SEs already use for the program
    comprehension task.

NF7 Present the user with complete information, in a manner that
    facilitates the JITC task.

    Some information in software might be described as ‘latent’. In
    other words, the software engineer might not see it unless it is
    pointed out. Examples of such information are the effects of
    conditional compilation and macros.

Acceptable limitations:

L1 The server component of the tool may be limited to run on only one
   particular platform.

    This simplifies implementation decisions without unduly restricting
    SEs.

L2 The system is not required, at the present time, to handle object
   oriented source code.

    We are restricting our focus to SEs working on large bodies of
    legacy code that happens to be written in non-object-oriented
    languages. Clearly, this restriction must subsequently be lifted
    for the tool to become universally useful.

L3 The system is not required, at present, to deal with dynamic
   information, i.e. information about what occurs at run time.

    This is the purview of debuggers and dynamic analysis tools.
    Although it would be useful to integrate these, it is not currently
    a requirement. We have observed software engineers spending
    considerable time on dynamic analysis (tracing, stepping etc.), but
    they consume more time performing static code exploration.

4.3 Why Other Tools are Not Able to Meet these Requirements

There are several types of tools used by SEs to perform the code
exploration task described in section 4.1. This section explains why,
in general, they do not fulfill our requirements:

Grep: Our studies described in section 3.4 indicated that over 25% of
all command executions were of one of the members of the Grep family
(Grep, Egrep, Fgrep, Agrep and Zgrep). Interviews show that it is the
most widely used software engineering tool. Our observations as well as
interviews show that Grep is used for just-in-time comprehension. If
SEs have no other tools, it is the key enabler of JITC; in other
situations it provides a fall-back position when other tools are
missing functionality.
However, Grep has several weaknesses with regard to the requirements we
identified in the last section:

• It works with arbitrary strings in text, not semantic items
  (requirement F1) such as routines, variables etc.
• SEs must spend considerable time performing repeated Greps to trace
  relationships (requirement F2); and Grep does not help them organize
  the presentation of these relationships.
• Over a large body of source code Grep can take a large amount of time
  (requirements NF1 and NF2).

Search and browsing facilities within editors: All editors have some
capability to search within a file. However, as with Grep they rarely
work with semantic information. Advanced editors such as Emacs (used by
68% of a total of 127 users of text-editing tools in our study) have
some basic abilities to search for semantic items such as the starts of
procedures, but these facilities are by no means complete.

Browsing facilities in integrated development environments: Many
compilers now come with limited tools for browsing, but as with editors
these do not normally allow browsing of the full spectrum of semantic
items. Smalltalk browsers have for years been an exception to this;
however, such browsers typically do not meet requirements such as speed
(NF2), interoperability (NF4), and multiple languages (NF3). IBM’s
VisualAge tools are to some extent dealing with the latter problem.

Special-purpose static analysis tools: We observed SEs using a variety
of tools that allow them to extract such information as definitions of
variables and the routine call hierarchy. The biggest problems with
these tools were that they were not integrated (requirement NF6) and
were slow (NF2).

Commercial browsing tools: There are several commercial tools whose
specific purpose is to meet requirements similar to ours. A
particularly good example is Sniff+ from Take5 Corporation [17]. Sniff+
fulfills the functional requirements, and key non-functional
requirements such as size [NF1], speed [NF2], and multiple languages
[NF3]; however, its commercial nature means that it is hard to extend
and integrate with other tools.

Program understanding tools: University researchers have produced
several tools specially designed for program understanding. Examples
are Rigi [11] and the Software Bookshelf [5]. Rigi meets many of the
requirements, but is not as fast [NF2] nor as easy to integrate with
other tools [NF6] as we would like. As we will see later it differs
from what we would like in some of the details of items and
relationships. The Software Bookshelf differs from our requirements in
a key way: Before somebody can use a ‘bookshelf’ that describes a body
of code, some SE must organize it in advance. It thus does not conform
fully with the ‘automatically’ aspect of requirement NF1.

4.4 The Tools We are Developing

As a consequence of our work practices studies, and thus the
requirements described in the last section, we have developed an
improved software exploration tool which we call tksee. A view of this
tool is shown in figure 6.

The main features that fulfill F1 and F2 (search capabilities) are in
the bottom two panes. The bottom left pane shows a hierarchy that the
user incrementally expands by asking to show attributes of items, or to
search for information (relations or Grep results) about a given item.
The currently selected item is shown in the bottom-right pane, from
which the user can hyper-jump by selecting any item of text.

The main feature that fulfills F3 is the top pane. Each element in this
pane is a complete state of the bottom two panes. A hierarchy of these
states is saved persistently, so each time the user starts the tool,
his or her work is in the same state as at the end of the previous
session.

The non-functional requirements are met by the tksee architecture shown
in figure 7. This architecture includes a very fast database, an
interchange language for language-independent information about
software, and a client-server mechanism that allows incorporation of
existing tools (e.g. Grep) so that software engineers can continue to
use tools they already find useful.

Further details about this tool are in [6, 15]. We are continuing our
involvement with users: we are studying how their work practices evolve
when they choose to adopt this tool.
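
To make requirements F1 to F3 more concrete, the following is a minimal sketch in Python. It is not the tksee implementation (for that, see [6, 15]); every class, function and item name in it is hypothetical and chosen only for illustration. It shows a lookup over named items by exact name or regular expression (F1), a query over relationships among items (F2), and a hierarchy of exploration states that can be saved and restored (F3).

```python
import json
import re
from dataclasses import dataclass, field, asdict


@dataclass(frozen=True)
class Item:
    """A semantically significant named item (hypothetical schema)."""
    name: str
    kind: str   # e.g. "file", "routine", "variable"
    file: str   # file in which the item is defined


class SymbolIndex:
    """Toy in-memory index of named items and their relationships."""

    def __init__(self):
        self.items = []
        self.relations = []  # (source name, relation, target name)

    def add(self, item):
        self.items.append(item)

    def relate(self, source, relation, target):
        self.relations.append((source, relation, target))

    def search(self, pattern, regex=False):
        """F1: find items by exact name, or by a regex matching the whole name."""
        if regex:
            compiled = re.compile(pattern)
            return [i for i in self.items if compiled.fullmatch(i.name)]
        return [i for i in self.items if i.name == pattern]

    def related(self, name, relation):
        """F2: names reached from `name` through one kind of relation."""
        return [t for s, r, t in self.relations if s == name and r == relation]


@dataclass
class HistoryNode:
    """F3: one saved exploration state; children are sub-explorations."""
    query: str
    hits: list
    children: list = field(default_factory=list)


def dump_history(node):
    """Serialize a history tree to JSON text so it can be stored on disk."""
    return json.dumps(asdict(node))


def load_history(text):
    """Rebuild a history tree from the JSON text written by dump_history."""
    def build(d):
        return HistoryNode(d["query"], d["hits"], [build(c) for c in d["children"]])
    return build(json.loads(text))


# Usage with made-up items.
index = SymbolIndex()
index.add(Item("parse_header", "routine", "parser.c"))
index.add(Item("parse_body", "routine", "parser.c"))
index.add(Item("buffer_len", "variable", "buffer.h"))
index.relate("parse_body", "calls", "parse_header")

root = HistoryNode("parse_.*", [i.name for i in index.search(r"parse_.*", regex=True)])
root.children.append(
    HistoryNode("calls from parse_body", index.related("parse_body", "calls"))
)
restored = load_history(dump_history(root))
```

Matching against whole item names, rather than against arbitrary substrings of the source text, is the design choice that separates this kind of lookup from a Grep-style search; a real tool would of course need a parser to populate the index and a database to meet NF1 and NF2.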
Figure 6: The main window of the tksee tool.

[Figure 7: data flow diagram. Components shown: source code files,
parsers, auxiliary analysis tools, interchange format (TA++), TA++
parser, DBMS (database) with write-API and read-API, query engine
(query and query response), TA++ generator, clients (user interfaces
and other analysis tools), 3rd party tools that produce TA++, and 3rd
party tools that read TA++.]

Figure 7: Data flow diagram showing architecture of the tksee software

5. Conclusions

In conclusion, the study of work practices provides a path to tool
design that is an alternative to the traditional paths taken in
human-computer interaction, namely those issuing from the study of the
users’ cognitive processes and mental models, and the emphasis on
usability. The problem of disuse has plagued software tools designed
with these traditional human-computer interaction approaches. By
focusing on workplace activities, the study of work practices increases
the likelihood that tools can be smoothly integrated into the users’
daily activities. This, in turn, should increase the acceptance and use
of software tools designed on the basis of work practices. Whether one
wishes to examine user cognitions or not, it is necessary that tools be
consistent with work practices for them to be used. Once this
consistency is established, the usability approach can be taken to
ensure that the SEs can effectively use these tools to accomplish their
work.

It is possible that the study of work practices can reduce, or perhaps
even eliminate, the need to study cognitive processes and mental
models. This will depend on the accuracy and detail with which work
practices can be described. If they can be described in detail, in
terms of every system state explicitly and intentionally accessed by
the user, it may not be necessary at all to fathom the users’
cognitions. We may only need to abide by general principles of
usability and usability testing in addition to the work practice
specifications in order to design useful, and used, tools. Moreover, it
may be more efficient, in terms of time, to take the work practice
approach to tool design than the cognitive approach. However, further
empirical work is required in order to strengthen our confidence in
these statements. Further details about our research can be found in
[15].

Acknowledgments

We would like to thank the software engineers who have participated in
our studies, in particular those with whom we have worked for many
months. We would also like to thank the tools group at the company for
providing us with the tool usage statistics. Finally, we would like to
thank the KBRE group for many helpful suggestions in conducting this
research.

About the Authors

Janice Singer is a cognitive psychologist who is now researching
software engineering work practices with the Software Engineering
Laboratory at the National Research Council of Canada. Prior to her
Ph.D. studies, she conducted research in human-computer interaction and
worked as a software engineer.

Timothy C. Lethbridge is an Assistant Professor in the newly-formed
School of Information Technology and Engineering (SITE) at the
University of Ottawa. He teaches software engineering, object oriented
analysis and design, and human-computer
interaction. He heads the Knowledge-Based Reverse Engineering group,
which is one of the projects sponsored by the Consortium for Software
Engineering Research. Prior to becoming a university researcher, Dr.
Lethbridge worked as an industrial software developer in both the
public and private sectors.

Norman Vinson is a cognitive psychologist working in the Interactions
with Modeled Environments group, at the Institute for Information
Technology, National Research Council of Canada. Prior to joining the
NRC, Dr. Vinson was a user-interface designer at Northern Telecom.

Nicolas Anquetil recently completed his Ph.D. at the Université de
Montréal and is now working as a research associate and part time
professor in SITE at the University of Ottawa.

The URL for the KBRE project is http://www.csi.uottawa.ca/~tcl/kbre.
The URL for the Institute for Information Technology is
http://www.iit.nrc.ca. The authors can be reached at
{singer, vinson}@iit.nrc.ca and {tcl, anquetil}@site.uottawa.ca

References

[1] Anderson, J., Cognitive Psychology and Its Implications, WH
    Freeman, 1995.

[2] Blomberg, J., Suchman, L., & Trigg, R., Reflections on a
    Work-oriented Design Project. Human Computer Interaction (11),
    pp. 237-265, 1996.

[3] Beyer, H., & Holtzblatt, K., Apprenticing with the customer.
    Communications of the ACM (38), pp. 45-52, 1995.

[4] Brooks, R., Towards a Theory of the Comprehension of Computer
    Programs, International Journal of Man-Machine Studies (18),
    pp. 543-554, 1983.

[5] Holt, R., Software Bookshelf: Overview And Construction,
    www.turing.toronto.edu/~holt/papers/bsbuild.html

[6] Lethbridge, T., & Anquetil, N., Architecture of a source code
    exploration tool: A software engineering case study. School of
    Information Technology and Engineering, Technical Report.

[7] Lethbridge, T. and Singer, J., Understanding Software Maintenance
    Tools: Some Empirical Research, Workshop on Empirical Studies of
    Software Maintenance (WESS 97), Bari, Italy, October 1997.

[8] Lethbridge, T. and Singer, J., Strategies for Studying
    Maintenance, Workshop on Empirical Studies of Software
    Maintenance, Monterey, November 1996.

[9] Littman, D., Pinto, J., Letovsky, S., & Soloway, E., Mental Models
    and Software Maintenance, Empirical Studies of Programmers,
    pp. 80-98, 1986.

[10] Mayhew, D., Principles and Guidelines in Software User Interface
     Design, Prentice Hall, 1991.

[11] Müller, H., Mehmet, O., Tilley, S., and Uhl, J., A Reverse
     Engineering Approach to Subsystem Identification, Software
     Maintenance and Practice, Vol 5, 181-204, 1993.

[12] Pennington, N., Stimulus Structures and Mental Representations in
     expert comprehension of computer programs. Cognitive Psychology
     (19), pp. 295-341, 1987.

[13] Singer, J. and Lethbridge, T., Methods for Studying Maintenance
     Activities, Workshop on Empirical Studies of Software
     Maintenance, Monterey, November 1996.

[14] Singer, J., and Lethbridge, T. (in preparation). Just-in-Time
     Comprehension: A New Model of Program Understanding.

[15] Singer, J., Lethbridge, T., and Vinson, N., Work Practices as an
     Alternative Method for Tool Design in Software Engineering,
     submitted to CHI ‘98.

[16] Storey, M., Fracchia, F., & Müller, H., Cognitive Elements to
     support the construction of a mental model during software
     visualization. In Proceedings of the 5th Workshop on Program
     Comprehension, Dearborn, MI, pp. 17-28, May 1997.

[17] Take5 Corporation home page, http://www.takefive.com/index.htm

[18] Vicente, K. and Pejtersen, A., Cognitive Work Analysis, in press.

[19] von Mayrhauser, A., & Vans, A., From Program Comprehension to
     Tool Requirements for an Industrial Environment, In: Proceedings
     of the 2nd Workshop on Program Comprehension, Capri, Italy,
     pp. 78-86, July 1993.

[20] von Mayrhauser, A., & Vans, A., From Code Understanding Needs to
     Reverse Engineering Tool Capabilities, In: Proceedings of the 6th
     International Workshop on Computer-Aided Software Engineering
     (CASE93), Singapore, pp. 230-239, July 1993.

[21] von Mayrhauser, A., & Vans, A., Program Comprehension During
     Software Maintenance and Evolution, Computer, pp. 44-55,
     Aug. 1995.
