Software Design
UNIT I
DESIGN QUALITIES
A high standard should be maintained in the design, and quality concepts must be related to
verification and validation.
1. Quality concepts are the abstract ideas that we have about what constitutes good and bad
properties of a system, and which will need to be assessed by the designer when making
decisions about design choice.
2. Design attributes provide a set of measurable characteristics.
3. Counts are concerned with realizing the design attributes.
The "ilities":
These form a group of quality factors that need to be considered when making any attempt to
assess design quality.
1. Reliability:
Behavioral issues
Complete
Consistent
Robust.
2. Efficiency:
Use of resources such as processor time, memory, network access, system facilities and
disk space.
3. Maintainability:
Implementation factors.
4. Usability:
User interface.
Others
Testability
Portability
Reusability.
Design attributes:
1. Simplicity:
The design should be simple and provide all the required information without unnecessary complexity.
2. Modularity:
a. Coupling:
162
Velammal Institute of Technology DEPARTMENT OF INFORMATION TECHNOLOGY
162
IT2053 SOFTWARE DESIGN LECTURE NOTES
CLASSIFICATION:
DISCOVERY - We recognize the key abstractions and mechanisms that form the vocabulary of
our problem domain.
INVENTION - We devise generalized abstractions and mechanisms that specify how objects
collaborate.
Here we group things that have a common structure or exhibit common behavior.
Classification helps us to
DIFFICULTY OF CLASSIFICATION:
Intelligent classification is intellectually hard work and is best achieved through an incremental
and iterative process.
This is evident in the development of software technologies such as GUIs, database standards and
fourth-generation languages.
The development of individual abstractions follows a common pattern. First, problems are solved
ad hoc. As experience accumulates, some solutions turn out to work better than others, and a sort
of folklore is passed informally from one person to another. Eventually the useful solutions are
understood systematically, and they are codified and analyzed. This enables the development of
models.
The incremental and iterative nature of classification directly impacts the construction of class
and object hierarchies in the design of complex software systems.
It is common to assert a certain class structure in the beginning and revise it over time.
CLASSICAL CATEGORIZATION: All the entities that share a given property or set of
properties form a category.
PROTOTYPE THEORY: There are some abstractions that have neither clearly defined
boundaries nor concepts.
E.g., the category "game" does not fit the classical mold, since there are no common properties
shared by all games.
Classical approach
Behavior analysis
Domain analysis
Use case analysis
CRC cards
Informal English description
Structured analysis
COMPLEXITY
1) ROLE OF DECOMPOSITION:
Follows the divide-and-conquer rule.
The system is decomposed into smaller parts.
Each part can be refined independently.
2) ALGORITHMIC DECOMPOSITION:
Top-down structured design.
3) OBJECT-ORIENTED DECOMPOSITION:
Identify the objects.
Objects collaborate to produce higher-level behaviour.
4) ALGORITHMIC VS OBJECT-ORIENTED DECOMPOSITION:
Algorithmic: highlights the ordering of events.
Object-oriented: emphasizes the agents that cause action or the subjects upon which operations act.
Object-oriented decomposition is generally better: it yields smaller systems through reuse, and
those systems are more resilient to change and able to evolve over time.
5) ROLE OF ABSTRACTION:
Ignore inessential details and deal with a generalised model.
6) ROLE OF HIERARCHY:
Different objects collaborate with each other through patterns of interaction.
Design qualities: internal qualities include low coupling, high cohesion, information hiding and efficiency.
Design concepts
The design concepts provide the software designer with a foundation from which more
sophisticated methods can be applied. A set of fundamental design concepts has evolved. They
are:
3. Software Architecture - It refers to the overall structure of the software and the ways in
which that structure provides conceptual integrity for a system. A good software
architecture will yield a good return on investment with respect to the desired outcome of
the project, e.g. in terms of performance, quality, schedule and cost.
4. Control Hierarchy - A program structure that represents the organization of a program
component and implies a hierarchy of control.
5. Structural Partitioning - The program structure can be divided both horizontally and
vertically. Horizontal partitions define separate branches of modular hierarchy for each
major program function. Vertical partitioning suggests that control and work should be
distributed top down in the program structure.
6. Data Structure - It is a representation of the logical relationship among individual
elements of data.
7. Software Procedure - It focuses on the processing of each module individually.
8. Information Hiding - Modules should be specified and designed so that information
contained within a module is inaccessible to other modules that have no need for such
information.
Design considerations
There are many aspects to consider in the design of a piece of software. The importance of each
should reflect the goals the software is trying to achieve. Some of these aspects are:
Compatibility - The software is able to operate with other products that are designed for
interoperability with another product. For example, a piece of software may be backward-
compatible with an older version of itself.
Extensibility - New capabilities can be added to the software without major changes to
the underlying architecture.
Fault-tolerance - The software is resistant to and able to recover from component
failure.
Maintainability - The software can be restored to a specified condition within a specified
period of time. For example, antivirus software may include the ability to periodically
receive virus definition updates in order to maintain the software's effectiveness.
Modularity - The resulting software comprises well-defined, independent components,
which leads to better maintainability. The components can then be implemented and tested
in isolation before being integrated to form the desired software system. This allows division
of work in a software development project.
Packaging - Printed material such as the box and manuals should match the style
designated for the target market and should enhance usability. All compatibility information
should be visible on the outside of the package. All components required for use should be
included in the package or specified as a requirement on the outside of the package.
Reliability - The software is able to perform a required function under stated conditions
for a specified period of time.
Reusability - Further features and changes can be added to the software with slight or
no modification of what already exists.
Robustness - The software is able to operate under stress or tolerate unpredictable or
invalid input. For example, it can be designed with a resilience to low memory conditions.
Security - The software is able to withstand hostile acts and influences.
Usability - The software user interface must be usable for its target user/audience.
Default values for the parameters must be chosen so that they are a good choice for the
majority of the users.
Modeling language
A modeling language is any artificial language that can be used to express information or
knowledge or systems in a structure that is defined by a consistent set of rules. The rules are used
for interpretation of the meaning of components in the structure. A modeling language can be
graphical or textual. Examples of graphical modeling languages for software design are:
Extended Enterprise Modeling Language (EEML) is commonly used for business process
modeling across a number of layers.
Flowchart is a schematic representation of an algorithm or a stepwise process.
Fundamental Modeling Concepts (FMC) is a modeling language for software-intensive
systems.
IDEF is a family of modeling languages, the most notable of which include IDEF0 for
functional modeling, IDEF1X for information modeling, and IDEF5 for modeling
ontologies.
Jackson Structured Programming (JSP) is a method for structured programming based on
correspondences between data stream structure and program structure.
LePUS3 is an object-oriented visual Design Description Language and a formal
specification language that is suitable primarily for modelling large object-oriented (Java,
C++, C#) programs and design patterns.
Unified Modeling Language (UML) is a general modeling language to describe software
both structurally and behaviorally. It has a graphical notation and allows for extension with a
Profile (UML).
Alloy (specification language) is a general purpose specification language for expressing
complex structural constraints and behavior in a software system. It provides a concise
language based on first-order relational logic.
Systems Modeling Language (SysML) is a new general-purpose modeling language for
systems engineering.
Design patterns
A software designer or architect may identify a design problem which has been solved by others
before. A template or pattern describing a solution to a common problem is known as a design
pattern. Reusing such patterns can speed up the software development process, because they have
been tested and proven in the past.
Usage
1. ER Model.
2. UML Model
3. DFD
ER Model
1. Requirement analysis
2. Conceptual Model
3. Logical Model
4. Schema Refinement
5. Physical data model
6. Security
Within this design viewpoint, we concentrate mainly on the Conceptual Model, the Logical Model
and the Physical Model.
UML Model
Activity diagram
Component diagram.
Class diagram
Object diagram
In my model of the design process, which I teach my students at NID in the Design
Concepts and Concerns course, I offer a four-stage model. In the first two stages, User Research
leads to Scenario Visualization, which brings to the surface many ideas and concepts that can be
shared with users and others as the work progresses. These concepts and models can be subjected
to debate and discussion, as well as detailed modeling and testing, until you are ready to invest
time and effort in them. They also provide the basis for estimating the cost of detailing out one,
two or more of these scenarios and subjecting them to further testing, all usually done in rapid
succession. In this way, designing is action-oriented work in which research is invariably
interspersed with modeling and discourse, along with a good measure of discussion and debate,
through which your insights and convictions develop more fully. You can then make decisions
about the directions to take and the goals that need to be reached. The goals, the possible
solutions and the means of achieving the goals are co-developed, or co-evolved, as the work
progresses.
The third stage is Concept Development, which takes a substantial amount of time and money in
a business situation. Here the detailing of some promising concepts is taken up in a systematic
manner. The fourth stage is to develop Business Models that can help realize the concept in the
real world.
DOs are used because applicable systems standards are not in existence.
This is intended to be a short reference of basic software design concepts. The objectives are to:
Notation
1. UML Model
Activity diagram
Component diagram.
Class diagram
Object diagram
2. E-R Model
3. DFD
Object Model
1. OOA
2. OOD
3. OOP
An object is a tangible entity that exhibits some well-defined behavior. Objects have a certain
integrity. An object can only change state, behave, be manipulated or stand in relation to other
objects in ways appropriate to that object.
OOA:
OOA is a method of analysis that examines requirements from the perspective of the
classes & objects found in the vocabulary of the problem domain.
OOD:
OOP:
i. It supports objects that are data abstractions with an interface of named operations and
a hidden local state.
Abstraction
Encapsulation
Modularity
Hierarchy
Minor Elements
Typing
Concurrency
Persistence
Abstraction:
A simplified description, or specification, of a system that emphasizes some of the system's
details or properties while suppressing others. An abstraction denotes the essential characteristics
of an object that distinguish it from all other kinds of objects.
1. Entity abstraction:
An object that represents a useful model of a problem-domain or solution-domain entity.
2. Action abstraction:
4. Coincidental abstraction:
Packages.
A protocol denotes the ways in which an object may act and react, and thus constitutes
the entire static & dynamic outside view of the abstraction.
Central to the idea of an abstraction is the concept of invariance. An invariant is some Boolean
condition whose truth must be preserved. For each operation associated with an object, we may
define preconditions as well as postconditions.
A protocol specifies the operations (also called methods or member functions) of an object. All
abstractions have both static and dynamic properties.
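The ideas of invariant, precondition and postcondition can be sketched in Python as follows. The BoundedCounter class and its limit are hypothetical, chosen only for illustration; this is one possible way to check the conditions at run time, not a prescribed technique.

```python
class BoundedCounter:
    """Hypothetical counter whose value must always stay within 0..limit."""

    def __init__(self, limit: int) -> None:
        if limit < 0:
            raise ValueError("limit must be non-negative")
        self._limit = limit
        self._value = 0
        self._check_invariant()

    def _check_invariant(self) -> None:
        # Invariant: a Boolean condition that every operation must preserve.
        assert 0 <= self._value <= self._limit, "invariant violated"

    @property
    def value(self) -> int:
        return self._value

    def increment(self) -> None:
        # Precondition: the counter is not already at its limit.
        if self._value >= self._limit:
            raise ValueError("precondition failed: counter is full")
        old = self._value
        self._value += 1
        # Postcondition: the value grew by exactly one.
        assert self._value == old + 1
        self._check_invariant()
```

Here the client is responsible for the precondition, and the object is responsible for the postcondition and the invariant.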
Encapsulation:
The process of hiding all the secrets of an object that do not contribute to its essential characteristics.
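A minimal sketch of encapsulation in Python, using a hypothetical Thermostat class. Note that Python hides state only by convention (the leading underscore); languages such as C++ or Java enforce it with access modifiers.

```python
class Thermostat:
    """Hypothetical example: clients see a Celsius target and nothing else."""

    def __init__(self, target_celsius: float) -> None:
        # Secret of the object: the value is stored internally in kelvin.
        self._target_kelvin = target_celsius + 273.15

    @property
    def target_celsius(self) -> float:
        # Part of the outside view: clients read the target in Celsius.
        return self._target_kelvin - 273.15

    def set_target(self, celsius: float) -> None:
        # The internal representation could change without affecting clients.
        self._target_kelvin = celsius + 273.15
```

Because clients depend only on target_celsius and set_target, the stored unit is an implementation secret that can be changed freely.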
Modularity:
The act of partitioning a program into individual components can reduce its complexity to
some degree. Modularization consists of dividing a program into modules which can be compiled
separately, but which have connections with other modules.
Hierarchy:
1. Single
2. Multiple.
3. Multilevel
4. Hierarchical
5. Hybrid
Typing (class):
Concurrency:
Persistence:
OBJECTIVES :
1. Define and give examples of class, object, attribute, operation, relationship, message,
persistence and state.
2. Describe and use several strategies used to find classes and objects.
3. Describe a technique for keeping track of future enhancements for the information system.
4. Identify several questions to ask to help discover attributes.
5. Define three types of attributes.
6. Describe one attribute data dictionary technique.
7. Be able to create a UML Class Diagram for a problem domain.
The UML's class notation does not differentiate between an abstract class and a [concrete]
class, with the exception that one could (but is not required to) label an abstract class as
<<abstract>>. As a general rule, when we discuss the notion of a class we normally are referring
to one that could or does have objects instantiated from it. In keeping with the UML reference
materials, the first letter of any identified class is always capitalized. For example, a class of
bicycles would be called Bicycle.
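The difference between an abstract class and a concrete class can be sketched in Python with the standard abc module. The Vehicle class and the wheel_count operation are assumptions made for this illustration; only the Bicycle name comes from the text.

```python
from abc import ABC, abstractmethod


class Vehicle(ABC):
    """Abstract class (hypothetical): no objects can be instantiated from it."""

    @abstractmethod
    def wheel_count(self) -> int:
        """Each concrete subclass must supply this operation."""


class Bicycle(Vehicle):
    """Concrete class: objects can be instantiated from it."""

    def wheel_count(self) -> int:
        return 2


my_bicycle = Bicycle()    # an object, i.e. an instance of the class Bicycle
```

Trying to create a Vehicle directly raises a TypeError, which is how Python enforces the abstract label.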
Examples of objects, or what is often referred to as instances, could be your bicycle, your
backpack, your automobile, your telephone, each shoe you own, each key on your key chain,
each slice of bread in a loaf of bread, the paycheck you received last pay day, and so on. These
physical objects and others in the real world are the easiest type of objects to identify. Later in
the chapter we will discuss the more difficult types of objects to identify, the ones that are not
so obvious.
Examples of classes could be Bicycle, Backpack, Automobile, Telephone, Shoe, Key, Bread,
and Paycheck. Classes are always referred to in the singular even though they typically represent
a collection of one or more objects with similar characteristics. So, the Bicycle class could
contain instances of your bicycle, my bicycle, your roommate's bicycle, and so on.
In the UML a Class Diagram is created that will contain all of the classes that are important
for the architecture of the information system. Each individual class on a class diagram is
partitioned into three sections from top to bottom (the class name, the class attributes, and the
class operations), as shown in Figure 5.1. Attributes and operations are discussed in the next
section of OT-101.
Be sure to use terminology and names the user of the information system will be comfortable
with, otherwise user acceptance of the information system may be difficult to obtain. For
example, in a recent requirements-gathering session with a client, several synonyms were
discovered to mean basically the same thing. Of the seven people in the meeting, five users and
two systems analysts, four different terms were favored (origin code, priority code, source code,
and key code) by different individuals, thus introducing a potential communication gap between
the people. Further discussion revealed that all four were in fact synonyms. The group finally
reached a consensus to use one of the four terms, origin code, universally from that point on in
order to eliminate confusion and misunderstanding. Sometimes use of the synonyms is
acceptable and agreed to by the users, but, over time, organizational dynamics can negatively
affect the use of synonyms as in the preceding example.
Sometimes it is necessary to use an adjective along with a noun for the class name such as
Rental Agreement instead of Agreement in order to further clarify the class and to make it more
meaningful to the user. In the UML's notation, class names with more than one word are
compressed together with no spaces and the first letter of each word is capitalized. For example,
Rental Agreement would become RentalAgreement. There are a few simple rules and guidelines
for creating classes that are summarized in Figure 5.2.
ATTRIBUTES
Referring back to Figure 5.1 you see several classes. The center section of each class symbol is
reserved for the associated characteristics belonging to the class, which are collectively referred
to as attributes. An attribute is data that further describes an object instance. Examples of attributes
that are all related to a student could be a student's last name, first name, address, city, state, and
grade point average. Attributes related to a course at a university could be course number, course
name, course units, and course prerequisites.
to tell your bank account number from all the other thousands it keeps track of, or the fact that
your fingerprints are like no other on the face of this earth (so we are told).
The way business-oriented information systems identify one object as being unique from all
other objects in the class is to identify some attribute or combination of attributes that would
distinguish it from all other objects. For example, how is one automobile object instance
identified uniquely from all other automobiles in the Automobile class? Well, in the real world,
each automobile has a unique vehicle identification number (VIN) associated with it and
assigned by the vehicle's manufacturer. There should not be two VINs alike on the face
of the earth (so we are told). Therefore, this attribute could be used to uniquely identify one
automobile from all the others. Attributes are more fully discussed in a later section of this
chapter.
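A minimal Python sketch of attributes and of identifying objects by a unique attribute. The Automobile attributes other than the VIN, and the AutomobileRegistry class, are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Automobile:
    vin: str      # identifying attribute: unique for each automobile
    make: str     # descriptive attributes may repeat across objects
    color: str


class AutomobileRegistry:
    """Hypothetical collection that tells automobiles apart by their VIN."""

    def __init__(self) -> None:
        self._by_vin: Dict[str, Automobile] = {}

    def add(self, car: Automobile) -> None:
        # Two objects with the same VIN would no longer be distinguishable.
        if car.vin in self._by_vin:
            raise ValueError(f"duplicate VIN: {car.vin}")
        self._by_vin[car.vin] = car

    def find(self, vin: str) -> Optional[Automobile]:
        return self._by_vin.get(vin)
```

Two automobiles with the same make and color are still distinct objects, because the registry looks them up by VIN alone.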
OPERATIONS
Referring back to Figure 5.1 once again, you see several classes. The bottom section of the class
symbol is reserved for operations. Operations are often referred to as methods in the object-
oriented programming world. Operations are actions that each object in the class is responsible
for exhibiting or performing. In keeping with this definition of an operation, you as an object
could be expected to walk, talk, listen, eat, sing, smile, etc. As with any example or metaphor,
this example breaks down, because you and I know that not all people objects can perform
every one of these listed operations, but at least you should get the idea.
In object-oriented information systems operations are usually very specific and focused. For
example, the walking operation from above is really composed of many smaller actions that are
put together to create the walking operation. The more atomic we make our operations the more
they have the potential to be reused over and over again within and across several information
systems. For example, an operation that simply retrieves a person's first name from a database of
names and addresses is more reusable than an operation that retrieves a person's first name and
does ten other things in addition. Some examples of operations could be: getFirstName(),
setFirstName(), getLastName(), setLastName(), computeSalesTax() and so forth. Operations are
more fully explored in Chapter 7.
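The operations named above can be sketched in Python as small, focused methods. The names follow Python's snake_case convention rather than the getFirstName() style of the text, and the tax rate parameter is an assumption made for this illustration.

```python
class Person:
    """Each operation does one thing only, which makes it easy to reuse."""

    def __init__(self, first_name: str, last_name: str) -> None:
        self._first_name = first_name
        self._last_name = last_name

    def get_first_name(self) -> str:
        return self._first_name

    def set_first_name(self, first_name: str) -> None:
        self._first_name = first_name

    def get_last_name(self) -> str:
        return self._last_name

    def set_last_name(self, last_name: str) -> None:
        self._last_name = last_name


def compute_sales_tax(amount: float, rate: float) -> float:
    """Return only the tax on an amount; the rate is supplied by the caller."""
    return amount * rate
```

A larger operation, such as printing a full name, can be composed from these atomic ones instead of duplicating their logic.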
An example of a class with attributes and operations for a university course registration
system is illustrated in Figure 5.3. Not all possible attributes and operations are listed due to
space considerations, but hopefully enough are shown to give you the idea of these concepts.
Keep in mind that the class notation is flexible enough to fit the needs of the problem domain.
RELATIONSHIPS
The third and final responsibility of classes and objects is relationships. Much of Chapter 6 is
devoted to this concept since it is so important. In the UML Class Diagram, a relationship
established between two classes is referred to as a generalization relationship. A relationship
established between objects, whether they belong to different classes or to the same class, is
referred to as an object association relationship. In the UML Class Diagram there are three
forms of object association: association, aggregation, and composition. Each of these will be
explored in much more detail in the next chapter. Relationships allow a class to be related to one
or more other classes, or an object to be related to one or more objects belonging to the same
class or to different classes. Figure 5.1 illustrates the generalization relationship and Figure 5.4
illustrates the composition association relationship.
Relationships contribute to the UML Class Diagram in two ways: 1) relationships help to
simplify a Class Diagram, and 2) relationships help to communicate the architectural
relationships that must exist within the information system in order for it to accomplish its
intended purpose.
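The four kinds of relationship can be sketched in Python roughly as follows. All class names here (Person, Student, Course, Section, Department) are hypothetical, and the mapping from UML to code is one common convention rather than the only one.

```python
class Person:
    def __init__(self, name: str) -> None:
        self.name = name


class Student(Person):
    """Generalization: a Student is a kind of Person (class to class)."""

    def __init__(self, name: str) -> None:
        super().__init__(name)
        self.courses = []             # association: links to Course objects

    def enroll(self, course: "Course") -> None:
        # Association: two independent objects simply know about each other.
        self.courses.append(course)
        course.students.append(self)


class Section:
    def __init__(self, number: int) -> None:
        self.number = number


class Course:
    def __init__(self, title: str, section_count: int) -> None:
        self.title = title
        self.students = []
        # Composition: the sections are created by, and live and die with,
        # the course that owns them.
        self.sections = [Section(n) for n in range(1, section_count + 1)]


class Department:
    def __init__(self, name: str, courses) -> None:
        self.name = name
        # Aggregation: the department groups courses that exist on their own.
        self.courses = list(courses)
```

The practical difference between aggregation and composition is ownership: deleting a Department leaves its courses intact, while deleting a Course discards its sections.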
MESSAGE
A message is a "signal" from one object to another object requesting the receiving object to carry
out one of its operations. Messages are how objects communicate with one another in order to
accomplish the information system's purpose. Keep in mind that "the operation is the message",
which means that one object specifically identifies another object's operation by name when it
needs assistance to accomplish some task. For example, in the real world you might ask me what
grade you have earned so far in this course. That is the message: you asking me what grade you
have earned so far. I, as an object, would look in my grade book and tell you your grade. So, my
action of looking in my grade book for your grade and telling you your grade is the essence of
the operation that I perform when you send me the "what is my current grade" message. In object
technology a message identifies an object and the specific operation to perform.
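The grade-book example can be sketched in Python, where sending a message corresponds to calling a method on the receiving object. The class and method names are hypothetical.

```python
class Instructor:
    def __init__(self, grade_book: dict) -> None:
        self._grade_book = dict(grade_book)

    def current_grade(self, student_name: str) -> str:
        # The operation: look in the grade book and report the grade.
        return self._grade_book[student_name]


class Student:
    def __init__(self, name: str) -> None:
        self.name = name

    def ask_grade(self, instructor: Instructor) -> str:
        # The message: it names the receiving object (instructor) and the
        # specific operation to perform (current_grade).
        return instructor.current_grade(self.name)
```

The student never touches the grade book directly; it only asks the instructor object to carry out one of its own operations.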
We conclude our OT-101 "course" with two final object technology concepts: persistence and
state. These two concepts are easy to explain and are mainly identified here because they are
vocabulary words often associated with computer science or software engineering but rarely used
in the information technology community. Persistence simply refers to the permanent, long-term
storage of data. It is analogous to the storing of data in files on a hard disk, diskette, or CD.
Persistent data can withstand the turning on and off of electrical power, whereas
memory-resident data are volatile and are erased when power is turned off. The information
technology community often refers to persistence as a database.
State refers to the condition of an object at a specific moment in time. For example, my
personal weight on a Friday afternoon might be 180 pounds. The following Monday morning my
weight might be 183 pounds or 178 pounds depending on my food consumption and my activity
or lack thereof over the week-end.
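State and persistence can be sketched together in Python. The WeightRecord class and the JSON file format are assumptions for this example; a real information system would more likely use a database.

```python
import json
import os
import tempfile


class WeightRecord:
    def __init__(self, pounds: int) -> None:
        self.pounds = pounds          # state: the value at this moment

    def save(self, path: str) -> None:
        # Persistence: write the state to long-term storage.
        with open(path, "w", encoding="utf-8") as f:
            json.dump({"pounds": self.pounds}, f)

    @classmethod
    def load(cls, path: str) -> "WeightRecord":
        with open(path, encoding="utf-8") as f:
            return cls(json.load(f)["pounds"])


def demo() -> int:
    """Change an object's state, save it, and read it back from disk."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "weight.json")
        record = WeightRecord(180)    # Friday afternoon
        record.pounds = 183           # Monday morning: the state has changed
        record.save(path)
        del record                    # the in-memory object is gone
        return WeightRecord.load(path).pounds
```

The in-memory object is volatile, but the saved file survives it, so the last recorded state can be restored later.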
Getting started! Once again we are faced with creating a different UML diagram: the Class
Diagram. As we move on to a discussion of finding objects and their classes for the Class
Diagram an important characteristic of the object-oriented problem-solving strategy comes to
mind. I bring this up here because finding classes and objects is an object-oriented problem-
solving strategy for defining and documenting an information system. Classes and their
associated attributes and operations are relatively stable over time. Even though some of the
attribute values of a class may change (e.g., you move so your address and telephone number
must be changed in your object), or the way the information system performs one of the
operations that belongs to a class may change down the road (e.g., an ATM machine's display
screen changes from monochrome and text to color and graphics), the class will remain a
constant and an integral part of the overall problem domain. This is good news! Stability over
time is very important when it comes to the amount of time, resources, and cost to maintain the
information system throughout its life. Like a building whose foundation is instrumental to the
long-term usability of the building, an information system's foundation is also critically
important. In general there is a correlation between the stability of the information system's
foundation and the amount of time, resources, and cost to maintain it over its lifetime.
Now that you have an understanding of objects, classes and the other object technology
concepts, we will discuss three strategies for helping you discover classes and objects in the
problem domain of interest to you and your user. This section is going to present strategies for
finding objects and their classes. As you know, no object may stand alone; therefore, when you
find an object, you automatically find the class that will group all objects of that type. For
example, in a university course registration system, let's say that a student object is deemed a
candidate object. Since there are lots of student objects (one for each student registering for
courses), you automatically find a class called Student or StudentInformation, which will be the
grouping of all student objects.
The strategy or strategies you select for finding objects and their classes are often dependent
on several factors. First, the type of requirements documentation you have to work with in the
beginning will play a significant role in determining this. If you have something like Kozar's
requirements model as the requirements document for the problem domain, as discussed in an
earlier chapter, or some other well-defined document, you may choose freely from the strategies
discussed later. If you are getting together with your user in a brainstorming or JAD session in
lieu of having a requirements document, you may need to use the third strategy discussed later.
Second, your user may have a preferred way of initially looking at the problem domain.
Discussed in an earlier chapter were the data, functional, and behavioral (user interface)
characteristics of an information system. Your user may have a preference for one or more of
these as the basis for discussing a model of the problem domain. For example, a recent client of
mine had prepared a simplified user requirements document prior to meeting with me. His
document focused primarily on the data aspects of an information system. He had drawn samples
of half a dozen or more reports that he wanted from the information system. An object-oriented
model could still be constructed starting from a data perspective such as this.
Third, you may have a bias toward one or more of these three aspects of information
systems and feel more comfortable using a certain one. With over 20 years of systems analysis
and design consulting experience, I have a predisposition to use the functional aspect but have
tempered this to use both the data and behavioral when necessary. Lastly, the information system
itself may cry out for a specific aspect emphasis: data, functional, or behavioral (user interface).
The first strategy, discussed in more detail in the end-of-chapter reference book, approaches the
problem of finding objects and their classes by focusing on the nouns and noun phrases that exist
in the requirements specification document. Rebecca Wirfs-Brock and her colleagues, Brian
Wilkerson and Lauren Wiener, suggest that the systems analyst follow the steps below, which are
quite often carried out with discussion and assistance from the user:
1. Read and understand the requirements document because the goal of the "finding objects
and their classes" activity is to create a model that very closely parallels the real-world
problem domain in question.
2. Reread the document looking specifically for nouns and noun phrases. Make a
preliminary list of these nouns and noun phrases and change all plurals in the list to
singular, such as bicycles to bicycle.
3. Divide the noun and noun phrase list into three categories: obvious objects (classes),
obvious nonsense objects (classes), and "not sure of" objects (classes).
4. Discard the nonsense noun and noun phrase list.
5. Discuss the "not sure" noun and noun phrase list in more detail until each one can be
moved to either the obvious or nonsense list.
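Steps 2 through 4 above can be sketched in Python. This is only an illustration: the nouns are assumed to have been picked out of the requirements document by hand, the singular rule is deliberately naive, and the "obvious" and "nonsense" sets stand in for the judgment of the analyst and the user.

```python
def to_singular(noun: str) -> str:
    # Naive rule assumed for illustration: drop one trailing "s".
    if noun.endswith("s") and not noun.endswith("ss"):
        return noun[:-1]
    return noun


def triage(nouns, obvious, nonsense):
    """Sort candidate nouns into the three categories of step 3."""
    buckets = {"obvious": [], "nonsense": [], "not sure": []}
    seen = set()
    for noun in nouns:
        name = to_singular(noun.strip().lower())   # step 2: plural to singular
        if name in seen:
            continue                               # skip repeated nouns
        seen.add(name)
        if name in obvious:
            key = "obvious"
        elif name in nonsense:
            key = "nonsense"
        else:
            key = "not sure"
        buckets[key].append(name.capitalize())
    return buckets


def candidates(buckets):
    # Step 4: discard the nonsense list; "not sure" items await step 5.
    return buckets["obvious"] + buckets["not sure"]
```

Step 5 remains a discussion between the analyst and the user, so it is not automated here.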
Figure 5.5 is a copy of part of the Video Store's requirements from an earlier chapter. It is
presented here for illustration of step 2 in the Wirfs-Brock approach for finding objects and their
classes. The pertinent nouns and noun phrases in the list have been underlined as object (class)
candidates. Figure 5.6, which represents the results of step 2, lists the noun and noun phrases
from the information system objectives in Figure 5.5. These are the candidate objects (classes).
Note that plurals have been converted to singular at this point, and that all words have been
capitalized. The systems analyst working with the user would perform steps 3 through 5. Finally,
the systems analyst would end up with a refined list of candidate objects (classes) that may
undergo additional changes as the analysis and design proceed.
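The five Wirfs-Brock steps can be sketched in code. The Python fragment below is only an illustration under stated assumptions: the function names normalize and triage, the category labels, and every sample noun except "bicycles" are hypothetical and are not taken from Figure 5.5 or Figure 5.6. The plural rule is deliberately naive; in practice the analyst and user review the list by hand.

```python
# Hypothetical category labels for step 3.
OBVIOUS, NONSENSE, NOT_SURE = "obvious", "nonsense", "not sure"


def normalize(noun_phrase):
    """Step 2: make the last word singular (naively) and capitalize every word."""
    words = noun_phrase.strip().split()
    if not words:
        return ""
    last = words[-1].lower()
    # Naive rule: drop one trailing "s" unless the word ends in "ss".
    if last.endswith("s") and not last.endswith("ss"):
        words[-1] = last[:-1]
    return " ".join(word.capitalize() for word in words)


def triage(categorized, resolutions):
    """Steps 3-5: keep obvious candidates, discard nonsense, settle the rest.

    `categorized` maps each candidate to its step 3 category; `resolutions`
    records where each "not sure" candidate ended up after discussion.
    """
    kept = []
    for name, category in categorized.items():
        if category == NOT_SURE:
            category = resolutions[name]
        if category == OBVIOUS:
            kept.append(name)
    return kept


# Invented sample nouns; only "bicycles" appears in the text above.
nouns = ["bicycles", "rental videos", "good weather", "late fees"]
candidates = [normalize(n) for n in nouns]
categorized = {
    "Bicycle": OBVIOUS,
    "Rental Video": OBVIOUS,
    "Good Weather": NONSENSE,
    "Late Fee": NOT_SURE,
}
refined = triage(categorized, {"Late Fee": OBVIOUS})
print(candidates)
print(refined)
```

The refined list is still only a list of candidates; as the text notes, it may change again as analysis and design proceed.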
The other strategies for finding objects (classes) have not been articulated as succinctly as the
Wirfs-Brock strategy. Therefore, they are combined for presentation purposes in this section, and
each is documented in one or more of the end of chapter references. These strategies are similar
and compatible with each other and are often used together to find objects. Later in the chapter
these suggestions will be applied to finding objects in the video store information system
example. As you begin to find objects for the problem domains that you are working with, you
should look for the following:
As you create the list of candidate objects (classes) or after you have a list, each object on the
list should be challenged based on at least the following generic ideas:
1. Needed remembrance. Does the information system need to keep track of this object
over time? Why?
2. Any attributes? Does the object have at least one identifiable attribute? If none or only
one can be identified, then seriously consider the need to keep this as a separate object. If
it is still justifiable, then by all means, keep it.
3. Any operations? Does the information system need to have certain operations performed
by this object? Why? An object that does nothing (e.g., no operations) is not useful or
necessary to the information system. However, sometimes operations are simply not
discovered early in the process of finding objects. It is often later on that operations are
discovered and assigned to objects. Therefore, don't just throw an object away because it
has no operations early in the development life cycle but rather wait until later in the
conceptual design to do so.
4. Only one object of this type? Will there be more than one of this type of object that
could be grouped into a class? If none or only one object can be identified for the class,
then seriously consider the need to keep this as a separate object. If it is still justifiable,
then keep it. This challenge can often be similar to the foregoing needed remembrance
challenge. An example of this challenge could be a class called University, which
contains instance objects such as Harvard, Yale, Stanford, UCLA, Michigan, Notre
Dame, Arizona, Minnesota, and so on. This candidate class would pass this challenge. In
another information system situation having University as a class, you may find that the
information system need only have one instance object, such as the object for your
specific university. This should be challenged. During the challenge, your user tells you
that the information system needs to keep track of specific information about your
university because it is needed to allow the information system to perform its duties.
Based on the user's explanation, the need for the University class would override this
challenge and be kept as a candidate. If the user could not justify the need for keeping the
University class, then it probably should be dropped from the candidate list of objects
(classes).
5. Avoid derived results as objects. Often derived results appear on reports or displays.
Derived results are often the result of a calculation. For example, a Sales Summary
Report in a Video Store could look like the following simplified chart. Multiplying the
Unit Price by the Quantity Sold derives the dollars in the Total Price column. Adding up
the individual Total Price dollars derives the Grand Total dollars. Those five (5) values
are derived results and normally would not be stored as part of an object since they can
be calculated whenever they are needed. The Quantity Sold column values are also
probably derived but the derivation is not as easily seen as is the Total Price and Grand
Total values. Since this report represents groups of products there are probably individual
sales of products that are summarized to produce the one-line summary line for each
product group. Each of those individual sales of products has a quantity sold that must be
summarized by the information system before printing the product group's Quantity Sold;
hence it too is derived.
The systems analyst must retain and therefore document the fact that these derived values are
needed in order to keep from losing some important requirements information. Often derived
results are identified as belonging to a particular object but with the additional derived word
identifier (e.g., Total Price (derived), and Grand Total (derived)).
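The idea of documenting derived results without storing them can be sketched in code. In the Python fragment below, only the individual sales are stored; Quantity Sold, Total Price, and Grand Total are calculated whenever they are needed. The class name Sale, the function names, and the sample figures are assumptions made for illustration and do not reproduce the Sales Summary Report discussed above.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Sale:
    """One individual sale; these are the stored (non-derived) values."""
    product_group: str
    unit_price: Decimal
    quantity: int


def quantity_sold(sales, product_group):
    """Quantity Sold (derived): summarized from the individual sales."""
    return sum(s.quantity for s in sales if s.product_group == product_group)


def total_price(sales, product_group):
    """Total Price (derived): unit price times quantity, per product group."""
    return sum(
        (s.unit_price * s.quantity for s in sales if s.product_group == product_group),
        Decimal("0"),
    )


def grand_total(sales):
    """Grand Total (derived): the sum of every product group's Total Price."""
    groups = {s.product_group for s in sales}
    return sum((total_price(sales, g) for g in groups), Decimal("0"))


# Invented sample data for illustration only.
sales = [
    Sale("New Release", Decimal("3.00"), 2),
    Sale("New Release", Decimal("3.00"), 1),
    Sale("Classic", Decimal("1.50"), 4),
]
print(quantity_sold(sales, "New Release"))
print(total_price(sales, "New Release"))
print(grand_total(sales))
```

Because nothing derived is stored, the report can never disagree with the underlying sales, at the cost of recalculating each time.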
No single strategy has emerged as the definitive way to find objects; therefore, it is common
for different project teams to create somewhat different candidate objects for the same problem
domain. This situation continues to make the modeling activity more of an art than a
science. As object-oriented technology matures, object patterns will become more available to
use as a starting point for finding objects. The systems analyst will have a group of candidate
objects to begin with and can then customize them as the actual requirements of the specific
problem domain are discussed further.
In a 1989 research conference paper, Kent Beck and Ward Cunningham proposed their CRC
strategy for finding and documenting objects. This strategy takes a holistic view of objects and
combines the discovery of classes (objects) along with their associated responsibilities and
collaborations (CRC). Beck and Cunningham classified attributes and operations ("what an
object knows about itself" and "what an object does") as responsibilities, and what we are calling
a third responsibility ("what other object(s) does an object know about") as a collaboration. As
one begins to discover objects, their attributes, operations and collaborations, it becomes
necessary to document these discoveries. Beck and Cunningham proposed a low-technology or
low-fidelity way of documenting them and their responsibilities and collaborations. They
suggested the use of 4x6 cards, one per class, each divided into three sections: class name,
responsibilities (attributes and operations), and collaborations (other object interactions). Refer to
Figure 5.7 for a CRC card sample. I have used this documenting tool very successfully on a
number of occasions both in the classroom and in "the real world." Peter Coad may have
introduced the use of the sticky "post-it-note" pieces of paper as a surrogate for 4x6 cards. He
also suggested the use of different colors of post-it-note papers to differentiate various types of
classes. These ideas are simple, but sometimes it's the simple ideas that can be really useful. In
this case it's a simple way to visualize a Class Diagram model of the information system.
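A CRC card also translates naturally into a small data structure. The Python sketch below models the three sections of a card; the class name CRCCard, its render method, and the sample Member card are assumptions made for illustration and are not a copy of Figure 5.7.

```python
from dataclasses import dataclass, field


@dataclass
class CRCCard:
    """One card per class, with the three sections described above."""
    class_name: str
    responsibilities: list = field(default_factory=list)  # attributes and operations
    collaborations: list = field(default_factory=list)    # other classes it works with

    def render(self):
        """Return a plain-text version of the card."""
        lines = [f"Class: {self.class_name}", "Responsibilities:"]
        lines += [f"  - {r}" for r in self.responsibilities]
        lines.append("Collaborations:")
        lines += [f"  - {c}" for c in self.collaborations]
        return "\n".join(lines)


# Invented card content for a video store class.
member_card = CRCCard(
    class_name="Member",
    responsibilities=["knows name", "knows phone number", "rents a video"],
    collaborations=["Rental"],
)
print(member_card.render())
```

Paper cards remain the quicker tool in a brainstorming session; a structure like this is mainly useful for recording the cards afterward.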
The video store's problem domain presented in Chapter 3 will continue to be used to develop an
object-oriented information system example from the ground up. Here the emphasis is on finding
objects. The video store information system model, as mentioned earlier, was developed in the
classroom working with about three-dozen undergraduate students over several weeks of a
semester. As you would expect, these students were all subject matter experts with respect to at
least visiting a video store. The intent of the exercise was not to build the ultimate video store
information system but to practice using an object-oriented methodology and notation to create
an object model of a simple business information system.
Using Kozar's Requirements Model the students helped develop the video store mission
statement, goals, business objectives, business tactics, information system objectives, and
information system tactics as shown in Chapter 2 and partially shown in this chapter as Figure
5.5. Several brainstorming sessions were held to create the object-oriented video store Class
Diagram model, shown in Chapter 3, working from these lists intermixed with classroom
discussion. Chapter 4 addressed features, actors and use cases. For our purposes features are
synonymous with Kozar's business objectives.
While interacting with the students for this exercise I chose to use the conglomeration of
strategies for finding objects as discussed earlier in this chapter as the strategy for finding
objects. The primary reason for doing so was that almost all of the students had some personal
interaction with video store movie rentals and, therefore, could be considered knowledgeable in
the problem domain. I could have used either the Wirfs-Brock or CRC Strategy as discussed
earlier but I wanted to have a significant amount of free-flowing discussion of the problem
domain with the students that I did not want to inhibit for pedagogical and learning reasons.
What the students discovered through the brainstorming sessions was that there were many
different opinions of what happens in a video store. Many of the differences were subtle but
some were significant. The significant ones were mostly due to the variety of types of video
stores that the students had visited to check out movie videos.
During an initial class period of brainstorming in order to find candidate objects, Figure 5.8
was developed. This list was referred to as the Video Store Information System Candidate List of
Objects, Pass One. No attempt was made during the session to discuss details of these candidate
objects. In fact, in the spirit of brainstorming, I, as the facilitator, expressly discouraged any
comments about the positive or negative merits of each. No attempt was made to identify
whether a candidate item was an object or not. In fact, we referred to all of the items discussed as
just objects. The discussion was lively with many of the students offering their object
suggestions.
After a few brainstorming and discussion sessions, the student/instructor team refined the
first-pass list of candidate objects to come up with the second-pass list of candidate objects
grouped into classes as shown in Figure 5.9. Note that the class names are the same as the
original object names. This is not required but often ends up being the case. Some of the
candidate objects were removed from the list because they did not pass the "challenge based on"
ideas from an earlier section of this chapter. The students indicated that they felt pretty good
about their first few attempts at finding objects during the brainstorming and discussion sessions.
They did admit, however, that having experience with video stores allowed them to find more of
the important objects early in the process.
Having completed Pass 2 in the "finding objects" activity for the video store information
system, the students believed that the core objects for the Class Diagram model had been
identified. More objects may be identified as the methodology's activities progress. Figure 5.10
shows the grouping of the identified objects into classes, each having the same name as the
object it is a collection of. This is a common phenomenon in object modeling (class names
being the same as the object names), since a class is a collection of objects.
later time or a later version of the proposed information system. Comments like, "It would be
nice if the system could do . . ." tend to result in ideas or features that can be put on a future
enhancements list to be dealt with after the information system has been implemented. Also,
development time and cost factors may shift current features into a future enhancements list in order to
reduce the overall development time and/or cost of implementing the first iteration of the
information system. A future enhancements list was created and expanded throughout the
brainstorming and discussion sessions surrounding the video store example. The complete future
enhancements list is presented here, as Figure 5.11, even though the actual future enhancements
list at this point in the project would have included fewer items.
ATTRIBUTES
What's your name? What size shoe do you wear? What is your address? What color are your
eyes? How much do you weigh? What is your date of birth? These questions are somewhat
intrusive, aren't they? Characteristics such as name, shoe size, address, eye color, weight, and
date of birth are called attributes in the object-oriented methodology. The responses you give
regarding the preceding questions represent the data values of the attributes for you. Your best
friend's answers to these questions represent his or her data values for these attributes, and your
instructor's answers represent his or her data values for these attributes.
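As you have already read, objects (classes) have responsibilities that they are expected to
exhibit in an object-oriented information system. There are three basic types of object
responsibilities:
1. What an object knows about itself (its attributes).
2. What an object does (its operations).
3. What other object(s) an object knows about (its collaborations).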
Attribute names (e.g., name, shoe size, eye color, and so on) become the template or pattern
that can be applied to all object instances within a class that has attributes associated with it. For
example, refer back to Figure 5.3, the Student class shows the list of attributes that the
information system is responsible for. Therefore, each student object instance (e.g., you and each
of your classmates) has its own personal data values for each of the attributes that make up this
attribute template. One final thought for you to be aware of is that it is possible, although highly
unlikely, that a class in a business information system may not have any attributes.
Sometimes the attribute values in one object instance may be the same as the attribute values
in another object instance, but that happens most often by chance. For example, the eye color
attribute probably has only a few possible values such as blue, brown, green, and so on, whereas
a date of birth attribute could have tens of thousands of possible values.
Another name for an attribute value is an attribute state, and we will consider these two
terms as interchangeable. State was discussed early in this chapter and represents the condition
of the attribute at any moment in time for a specific object instance. Some attributes' state rarely
or never changes once established. For example, your value for the Date of Birth attribute should
never change; your value for the Name attribute might change a few times in your lifetime. For
example, Hillary Rodham's name changed to Hillary Rodham Clinton. Some attributes' state
changes frequently. For example, your value for the Shoe Size attribute changed several times as
you were growing up; your value for the Weight attribute could change every day or more
frequently; your value for your Body Temperature could also change frequently. Some
information systems do not need to be as precise as knowing that your weight changes daily. For
example, weight values on driver's licenses are only changed, if at all, at renewal time, which
may be every four years.
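The distinction between an attribute template and attribute state can be sketched in code. In the Python fragment below, the class defines the template, each object instance holds its own state, and one attribute is made read-only because its state should never change. The class name Person, the attribute names, and the two sample people are assumptions made for illustration.

```python
from datetime import date


class Person:
    def __init__(self, name, date_of_birth, weight_kg):
        self.name = name                      # state may change a few times
        self._date_of_birth = date_of_birth   # state should never change
        self.weight_kg = weight_kg            # state may change frequently

    @property
    def date_of_birth(self):
        """Read-only: no setter is defined, so assignment raises AttributeError."""
        return self._date_of_birth


# Two object instances share the attribute template but hold their own state.
pat = Person("Pat Lee", date(1990, 5, 17), 70.0)
sam = Person("Sam Ortiz", date(1988, 11, 2), 82.5)

pat.name = "Pat Lee-Ortiz"   # a rare change of state
pat.weight_kg = 69.5         # a frequent change of state
print(pat.name, pat.weight_kg, pat.date_of_birth)
print(sam.name, sam.weight_kg, sam.date_of_birth)
```

How often the system actually updates a value such as weight is a business policy decision, as the driver's license example shows.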
Candidate classes usually have one or more attributes, and many classes have dozens, even
hundreds, of attributes. Attributes have already been mentioned
several times in this book because they are an integral and necessary component of object-
oriented models and often contribute to the identification and grouping of classes. Attributes add
more detail to classes, which in turn reveals more of how the model should be constructed.
Over the years professional men and women and students have commented that when they
were first learning about objects, attributes, and states, they found it helpful to make a mental
picture of some examples. Examples may help you also. Figure 5.12 represents a mental picture
example that has been helpful to others.
Identifying and defining attributes is an ongoing and iterative activity that involves the
systems analyst interacting with the user. A specific problem domain may potentially have
thousands of attributes, but chances are that only a portion of them is a necessary requirement of
the system. In an earlier chapter, the necessary subset was referred to as the information system's
responsibility. Once again, the analysis activity is performed to prune the list of potential
attributes down to the list of those that are necessary for the information system to fulfill its
responsibilities, as in Figure 5.13.
A change made to an attribute state can be anything that is allowed for the specific attribute.
For example, a new object instance may not have a specific value for any of its attributes when
created; then one or more operations within the class are called upon to populate the attributes
with data values for each attribute. The operation that changes the eyeColor attribute value to
"blue" had better know the policy or rule pertaining to allowable data values for this attribute. If
the request was made for this operation to change the eyeColor attribute data value to "purple," it
should know whether or not this is a valid eye color data value. There are other ways of handling
attribute value validations beyond the one just presented here, and several are discussed in a later
chapter dealing with input and output design.
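As a minimal sketch of an operation that knows the policy for its attribute, the Python fragment below rejects an eyeColor value that is not on an allowed list. The set of allowed colors, the class name Student, and the operation name set_eye_color are assumptions made for illustration; the actual policy would come from the user.

```python
# Assumed policy: the allowable data values for the eyeColor attribute.
ALLOWED_EYE_COLORS = {"blue", "brown", "green"}


class Student:
    def __init__(self, name):
        self.name = name
        self.eye_color = None  # a new object instance may start with no value

    def set_eye_color(self, value):
        """Change the eyeColor state only if the policy allows the new value."""
        if value not in ALLOWED_EYE_COLORS:
            raise ValueError(f"{value!r} is not an allowed eye color")
        self.eye_color = value


student = Student("Pat Lee")
student.set_eye_color("blue")      # accepted
try:
    student.set_eye_color("purple")
except ValueError as error:
    print(error)                   # rejected; the state stays "blue"
```

Keeping the rule inside the operation means every request to change the attribute passes through the same check.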
Determining Attributes
Determining attributes, like finding objects, is still a highly cognitive activity involving the
analyst and the user. Many traditional information systems have predetermined attribute
templates from which to begin, but even these must be investigated as to their significance within
a user's specific problem domain. Just because every new car sold in the United States has a
cigarette lighter in it doesn't mean that every buyer will use it. The same is true for using
attribute templates for common information systems. However, unlike the car's unused cigarette
lighter, which is there and never needs any attention beyond an occasional dusting, an attribute
that is not needed but still part of the information system can add confusion and overhead in the
form of increased processing and storage requirements.
Omitting an attribute can also be problematic if it is determined later that a specific output
from the information system is not possible without that attribute. Retrofitting it is costly, just as
retrofitting anything in a construction project would be. As the systems analyst and the user
discuss the inclusion and exclusion of attributes, the systems analyst should ask the user to think
of the future need for an attribute that is now not necessary and is tentatively going to be
excluded. If the user says there is a good chance that the attribute might be needed within a year
or two, the systems analyst might be wise to include it in the information system's requirements
now in order to avoid the retrofitting later on.
There are endless questions that could be asked of users to help determine the required
attributes for the many classes in the information system. In the following list are a few of the
more generic-type questions that can be used in almost any situation and often can start the
investigation moving with the users. Note that these questions are in the first-person mode
because we have found that this technique often helps you to "put yourself in the class's shoes for a
few minutes" to better relate to it.
1. How am "I" described in general? For example, if you were a personal computer (just
for a few minutes), how would you describe yourself generally? You would probably
mention something about the brand, type, and speed of microprocessor you have, type of
floppy disk drive you have, size of hard disk, amount of RAM, number of parallel, serial
and USB ports, type of monitor, and so on.
2. How am "I" described in this specific problem domain? For example, how would you
describe yourself if you were a specific brand of personal computer, say a Compaq model
XYZ? You would tell us the specific components that make up the Compaq model XYZ.
Some of your responses would be identical to your responses to question 1 above, while
others would be slightly different because we are focusing on a specific situation. In
addition, some of your responses might be contrary to the "general" responses you gave
in question 1, again because the Compaq model XYZ may have something different than
the generic personal computer.
3. What do "I" (as this class or object) need to know? In other words, what data are
important to the success of this information system? Sometimes looking at the desired
outputs will help determine the necessary inputs. The necessary inputs often tend to
become the attributes.
4. What state information do "I" need to remember over time? This question can help
identify additional attributes that are neither obvious nor identified by the prior questions.
Often this question is probing to identify attributes that may be needed in order to provide
historical values over time for one or more attributes. If, for example, it is important to
retain all temperature readings from a freezer's thermometer, and the readings are done
every 15 minutes to prevent food thawing caused by an air-conditioner failure within the
freezer, then one or more attributes and possibly a new class need to be created.
5. What states can "I" be in? Once attributes are identified, you can then ask the user this
question for each attribute to determine the policy for each attribute and its accepted
values.
Finally, in addition to these questions, you will no doubt ask questions that start with the
words "what," "why," "when," "who," and "how."
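Question 4 in the list above suggests that retaining historical values may call for a new class. The Python sketch below shows one way this could look for the freezer example: each reading becomes its own object, and the freezer keeps all of them over time. The class names Freezer and TemperatureReading, the attribute names, and the sample readings are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass(frozen=True)
class TemperatureReading:
    """New class: one historical value together with the time it was taken."""
    taken_at: datetime
    degrees_celsius: float


@dataclass
class Freezer:
    label: str
    readings: list = field(default_factory=list)  # every reading is retained

    def record(self, taken_at, degrees_celsius):
        self.readings.append(TemperatureReading(taken_at, degrees_celsius))

    @property
    def current_temperature(self):
        """The most recent reading, or None if nothing has been recorded."""
        return self.readings[-1].degrees_celsius if self.readings else None


# Invented readings taken every 15 minutes.
freezer = Freezer("Walk-in freezer")
start = datetime(2024, 1, 1, 8, 0)
for i, temperature in enumerate([-18.0, -17.5, -12.0]):
    freezer.record(start + timedelta(minutes=15 * i), temperature)

print(freezer.current_temperature)
print(len(freezer.readings))
```

Had the freezer stored only a single temperature attribute, each new reading would have overwritten the history the user asked to keep.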
Attribute Types
As you discuss attributes and their states, at least three types of attributes will be discovered in
many information systems. These types are:
1. Single-value attributes
2. Mutually exclusive-value attributes
3. Multi-value attributes
Each of these attribute types is discussed next. Following the discussion of these attribute types,
there will be a discussion of the strategy used in object-oriented models to accommodate these
attributes.
The single-value attribute is perhaps the most frequently encountered attribute type. A
single-value attribute is one that has only one value or state for itself at any moment in time.
Referring to Figure 5.14, examples of this type of attribute are name,
studentIdentificationNumber, eyeColor, height, weight, and dateOfBirth. Even though a student's
height and weight will change over time, at any single moment in time, there is only one height
and only one weight for a specific student. Many single-valued attributes have a descriptive
nature to them, as do these examples.
The mutually exclusive-value attribute is most often problem domain dependent. What this
means is that the identification of this type of attribute can only be determined by discussing it in
its role as part of the problem domain and more specifically how it relates to the other attributes
within a specific class in the problem domain. An attribute is said to be a mutually exclusive-
value attribute if the presence or absence of its value is dependent upon the presence or absence
of one or more other attribute values. The business policy decisions that this information system
is representing play a vital role in determining whether an attribute is of the mutually exclusive
value type. Figure 5.15 illustrates mutually exclusive value attributes. The business policy for
this information system states that an employee may be either hourly or salaried, but not both.
With this in mind then, either the hourlyRate attribute or the weeklySalary attribute would have a
value for each Employee object instance.
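The "either hourly or salaried, but not both" policy can be sketched as a check inside a single class. The Python fragment below is one possible illustration, not the design shown in Figure 5.15; the snake_case attribute names, the is_hourly property, and the sample employees are assumptions.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass
class Employee:
    name: str
    hourly_rate: Optional[Decimal] = None    # mutually exclusive with weekly_salary
    weekly_salary: Optional[Decimal] = None  # mutually exclusive with hourly_rate

    def __post_init__(self):
        has_hourly = self.hourly_rate is not None
        has_salary = self.weekly_salary is not None
        # Business policy: exactly one of the two attributes has a value.
        if has_hourly == has_salary:
            raise ValueError("exactly one of hourly_rate or weekly_salary must be set")

    @property
    def is_hourly(self):
        return self.hourly_rate is not None


hourly = Employee("Pat Lee", hourly_rate=Decimal("18.50"))
salaried = Employee("Sam Ortiz", weekly_salary=Decimal("900.00"))
print(hourly.is_hourly, salaried.is_hourly)
```

Note that any code using this class still has to ask which kind of employee it is dealing with before acting on it.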
The third attribute type is the multi-value attribute. This attribute is the opposite of the
single-value attribute because it can have, as its name implies, multiple values at any moment in
time. Figure 5.16 illustrates this, again using the Student class. For this illustration the Student
class has name, collegeAttended, and collegeGradePointAverage attributes. A specific student,
you for example, may be attending or have already attended one or more other colleges,
community colleges, or vocational training schools. The business policy decision that is being
enforced or supported by these attributes says that a student is to list all current and prior
colleges, community colleges, and vocational training schools that he or she attended.
One of the goals of developing information systems using an object-oriented methodology and
the UML is to simplify the decision logic that is embedded within the programming code for the
information system. For example, using Figure 5.15, the programming code embedded in one of
the class's operations would have to be able to distinguish an hourly employee from a salaried
employee, since both types of employee objects are associated with this class. The reason the
programming code would need to know about this distinction between hourly and salaried
employees is that different processing actions would be performed on each employee type. A
Class Diagram conceptual design strategy for handling mutually exclusive attributes would be to
create new classes, one for each employee type (hourly and salaried), as shown in the lower
portion of Figure 5.15. With this structure all Employee objects instantiated from
HourlyEmployee are hourly employees and all Employee objects instantiated from
SalariedEmployee are salaried employees. There is still only one class--
HourlySalariedEmployee--containing employee objects, but hourly employees are associated
with the HourlyEmployee class and salaried employees are associated with the
SalariedEmployee class.
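The strategy of one class per employee type can be sketched in code. In the Python fragment below, the base class is named Employee for simplicity, and the weekly_pay operation, the constructor parameters, and the sample values are assumptions made for illustration; only the names HourlyEmployee and SalariedEmployee come from the discussion above.

```python
from abc import ABC, abstractmethod
from decimal import Decimal


class Employee(ABC):
    """Common parent; it cannot be instantiated on its own."""

    def __init__(self, name):
        self.name = name

    @abstractmethod
    def weekly_pay(self, hours_worked):
        """Each employee type supplies its own processing."""


class HourlyEmployee(Employee):
    def __init__(self, name, hourly_rate):
        super().__init__(name)
        self.hourly_rate = hourly_rate

    def weekly_pay(self, hours_worked):
        return self.hourly_rate * hours_worked


class SalariedEmployee(Employee):
    def __init__(self, name, weekly_salary):
        super().__init__(name)
        self.weekly_salary = weekly_salary

    def weekly_pay(self, hours_worked):
        return self.weekly_salary  # hours do not affect a salaried employee's pay


staff = [
    HourlyEmployee("Pat Lee", Decimal("20.00")),
    SalariedEmployee("Sam Ortiz", Decimal("900.00")),
]
# No "is this employee hourly or salaried?" test is needed here.
for employee in staff:
    print(employee.name, employee.weekly_pay(40))
```

The decision logic has moved out of the operation and into the class structure, which is the simplification the text describes.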
In object-oriented methodology theory this works well. But just how practical is it in every
situation? Well, it often depends on the number of mutually exclusive-value attributes belonging
to a class. The theory works well for a class having two mutually exclusive-value attributes, as in
Figure 5.15. A class having two sets of two attributes that are mutually exclusive can probably
also be structured similarly to Figure 5.15. But what about three, four, five, six, or more sets of
pairs of mutually exclusive-value attributes in one class? Structures to support these would get
quite cumbersome. Also, the example so far has considered only pairs of mutually exclusive-
value attributes. What about the situation where there are three attributes (or four, five, six, and
so on) that are mutually exclusive with each other? And what about sets of these groups of
attributes? Again, elaborate object-oriented structures to accommodate these situations may be
more problematic than they are worth, even though in theory mutually exclusive-value attributes
should be split apart and represented by a generalization class relationship pattern, which is
discussed in the next chapter.
During conceptual design multi-value attributes also suggest the splitting up of one class into two
or more classes in an object association relationship to more effectively represent them. Looking
at the top portion of Figure 5.16, there are three student names. Each would be an object.
However, notice that because of the multi-value attributes, the first name has two colleges
attended and the last name has three colleges attended. Because of these multiple values, there
would actually be six Student objects--two for the first name, one for the second name, and three
for the last name. Each Student object would have data values for each attribute, causing
redundancy as shown in the bottom portion of Figure 5.16.
As illustrated in Figure 5.17, multi-value attributes can be moved out of their original class to
create a new class that becomes part of an object association relationship pattern, which will be
discussed in the next chapter. By doing this, redundancy is avoided. The Class Diagram would now
contain two classes: Student and CollegeAttended. Note that the name and studentIdNumber attribute
data values are no longer duplicated, as was the case in the lower portion of Figure 5.16. Each
Student object in Figure 5.17 is associated with the appropriate number of CollegeAttended
objects, as illustrated to the right side of the object association relationship pattern in the figure.
Object association constraints are also noted on either end of the line with the diamond on the
left end, which allows the software to administer the number of CollegeAttended objects
associated with each Student object and vice versa. In the figure, a Student object may have zero
or more Colleges attended, and a CollegeAttended object must be associated with one and only
one Student.
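The Student and CollegeAttended association can be sketched in code. The Python fragment below is an illustration under stated assumptions: the add_college operation, the snake_case attribute names, and the sample student and colleges are hypothetical and are not taken from Figure 5.17. It enforces the two constraints described above: a Student may have zero or more CollegeAttended objects, and each CollegeAttended belongs to exactly one Student.

```python
class Student:
    def __init__(self, name, student_id_number):
        self.name = name                            # stored once, not duplicated
        self.student_id_number = student_id_number  # stored once, not duplicated
        self.colleges_attended = []                 # zero or more CollegeAttended

    def add_college(self, college_name, grade_point_average):
        record = CollegeAttended(self, college_name, grade_point_average)
        self.colleges_attended.append(record)
        return record


class CollegeAttended:
    def __init__(self, student, college_name, grade_point_average):
        # Constraint: must be associated with one and only one Student.
        if not isinstance(student, Student):
            raise TypeError("a CollegeAttended must belong to exactly one Student")
        self.student = student
        self.college_name = college_name
        self.grade_point_average = grade_point_average


# Invented sample data.
student = Student("Pat Lee", "S-0001")
student.add_college("First Community College", 3.4)
student.add_college("Second State College", 3.1)

for record in student.colleges_attended:
    print(record.student.name, record.college_name, record.grade_point_average)
```

The student's name and identification number appear once, however many colleges are listed.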
Finally, the preceding discussion is more focused on "how" rather than "what," making it
more appropriate for consideration during either the conceptual or physical design phase of
information systems development. What this means is that during the user requirements
determination and analysis phase, the object-oriented model may contain mutually exclusive-
value and multi-value attributes in the same class. However, during the design, mutually
exclusive-value and multi-value attributes will be looked at from a "how to design and
implement" perspective and adjusted via the class generalization and object association
relationship patterns as discussed earlier. As always, if during user requirements determination
and analysis it is beneficial for the user's understanding of the problem domain's Class Diagram,
then creating new class patterns to accommodate both of these attribute types is acceptable. As
mentioned earlier, both the class generalization and the object association relationship patterns
are discussed in more detail in the next chapter.
The video store information system's user requirements model now includes classes. A few more
brainstorming sessions yielded the attributes shown in Figure 5.18a,b,c for the 11 classes
identified earlier. Trying to show a Class Diagram with all of the attributes assigned to the
various classes becomes cumbersome on paper since the model will overlap several pages. So the
two-column spreadsheet approach taken in Figure 5.18 is to just list the classes in the left
column and their associated attributes in the right column. The students suggested almost all of
the attributes with little prodding from me, which indicated that they had some familiarity with
video stores.
After some additional massaging of the Video Store Class Diagram's 11 classes we expanded
the diagram to 17 classes. Four of the new classes are abstract: Inventory, SaleItem, RentalItem,
and Transaction. Two additional classes were important to accommodate multi-valued attribute
SUMMARY
This chapter defined several object technology concepts (object, class, attributes, operations,
relationships, messages, persistence and state) and gave examples of each. Several strategies
were presented as ways to help the systems analyst find objects and classes in a problem domain.
These strategies and others are helpful given the facts that the problem domains are quite diverse
and the object-oriented methodology is still emerging and needs to mature. The Wirfs-Brock
strategy was discussed in a step-by-step fashion in order to illustrate its approach. The
combination of several other "finding objects" strategies was used to begin the development of
the video store information system object-oriented UML Class Diagram. As requirements are
being gathered for an information system, it is helpful to keep a list of proposed future
enhancements for possible addition to the information system at a later time. Attributes were
defined, described, and illustrated with examples. Several questions were presented that can help
a systems analyst discover attributes. Three types of attributes were discussed and illustrated:
single-value, mutually exclusive-value, and multi-value attributes. The Class Diagram strategy to
handle mutually exclusive-value and multi-value attributes was discussed. Finally, the video
store information system example was further expanded to include attributes.
Pragmatics
Pragmatics here means project management.
Project management covers the following areas:
1. Management & Planning
Risk Management
Task Planning
Walkthrough
2. Staffing
Resource Allocation
Human Resource
Software Resource
Hardware Resource
Development Team Roles
Project Architect
Subsystem lead
Application Engineer
Project Manager
Analyst
Reuse Engineer
Quality Analyst
Integration Manager
3. Release Management
Integrator
Control Management or Change Management
Version Control
Testing
Unit Testing
Subsystem testing
System Testing
4. Reuse
Elements of reuse
Institutionalizing reuse
5. Quality Assurance & Metrics
Software Quality
Object Oriented Metrics
6. Documentation
Documentation Contents
7. Tools
Kinds of Tools
Organizational implications
View
A view of a system is a representation of the system from the perspective of a viewpoint. This
viewpoint on a system involves a perspective focusing on specific concerns regarding the
system, which suppresses details to provide a simplified model having only those elements
related to the concerns of the viewpoint. For example, a security viewpoint focuses on security
concerns and a security viewpoint model contains those elements that are related to security from
a more general model of a system.[7]
A view allows a user to examine a portion of a particular interest area. For example, an
Information View may present all functions, organizations, technology, etc. that use a particular
piece of information, while the Organizational View may present all functions, technology, and
information of concern to a particular organization. In the Zachman Framework views comprise
a group of work products whose development requires a particular analytical and technical
expertise because they focus on either the what, how, who, where, when, or why of
the enterprise. For example, Functional View work products answer the question "how is the
mission carried out?" They are most easily developed by experts in functional decomposition
using process and activity modeling. They show the enterprise from the point of view of
functions. They also may show organizational and information components, but only as they
relate to functions.[8]
Viewpoints
Viewpoints provide the conventions, rules, and languages for constructing, presenting and
analysing views. In ISO/IEC 42010:2007 (IEEE-Std-1471-2000) a viewpoint is a specification
for an individual view. A view is a representation of a whole system from the perspective of a
viewpoint. A view may consist of one or more architectural models.[9] Each such architectural model
is developed using the methods established by its associated architectural system, as well as for
the system as a whole.[6]
Modeling perspectives
In information systems, the traditional way to divide modeling perspectives is to distinguish the
structural, functional and behavioral/processual perspectives. This together with rule, object,
communication and actor and role perspectives is one way of classifying modeling approaches.
Viewpoint model
In any given viewpoint, it is possible to make a model of the system that contains only the
objects that are visible from that viewpoint, but also captures all of the objects, relationships and
constraints that are present in the system and relevant to that viewpoint. Such a model is said to
be a viewpoint model, or a view of the system from that viewpoint.[3]
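A viewpoint model can be thought of as a filter over a more general model. The following is a minimal sketch under that assumption; the Element type, the concern tags, and the element names are hypothetical and serve only to illustrate the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    name: str
    concerns: frozenset  # the concerns this element is relevant to


def view_of(model, concern):
    """Return only the elements of the model that matter to one viewpoint."""
    return [e for e in model if concern in e.concerns]


model = [
    Element("LoginService", frozenset({"security", "functional"})),
    Element("AuditLog", frozenset({"security"})),
    Element("ReportGenerator", frozenset({"functional"})),
]

security_view = view_of(model, "security")
print([e.name for e in security_view])  # ['LoginService', 'AuditLog']
```

The general model is left unchanged; each view suppresses the elements that are not related to its concern.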
A given view is a specification for the system at a particular level of abstraction from a given
viewpoint. Different levels of abstraction contain different levels of detail. Higher-level views
allow the engineer to fashion and comprehend the whole design and identify and resolve
problems in the large. Lower-level views allow the engineer to concentrate on a part of the
design and develop the detailed specifications.[3]
In the system itself, however, all of the specifications appearing in the various viewpoint models
must be addressed in the realized components of the system. And the specifications for any given
component may be drawn from many different viewpoints. On the other hand, the specifications
induced by the distribution of functions over specific components and component interactions
will typically reflect a different partitioning of concerns than that reflected in the original
viewpoints. Thus additional viewpoints, addressing the concerns of the individual components
and the bottom-up synthesis of the system, may also be useful.[3]
Architecture description
At the data layer are the architecture data elements and their defining attributes and relationships.
At the presentation layer are the products and views that support a visual means to communicate
and understand the purpose of the architecture, what it describes, and the various architectural
analyses performed. Products provide a way for visualizing architecture data as graphical,
tabular, or textual representations. Views provide the ability to visualize architecture data that
stem across products, logically organizing the data for a specific or holistic perspective of the
architecture.
The notion of a three-schema model was first introduced in 1977 by the ANSI/X3/SPARC three
level architecture, which determined three levels to model data.[11]
The Three schema approach for data modeling, introduced in 1977, can be considered one of the
first view models. It is an approach to building information systems and systems information
management, that promotes the conceptual model as the key to achieving data integration.[12] The
Three schema approach defines three schemas and views:
At the center, the conceptual schema defines the ontology of the concepts as the users think of
them and talk about them. The physical schema describes the internal formats of the data stored
in the database, and the external schema defines the view of the data presented to the application
programs.[13] The framework attempted to permit multiple data models to be used for external
schemata.[14]
Over the years, the skill and interest in building information systems has grown tremendously.
However, for the most part, the traditional approach to building systems has only focused on
defining data from two distinct views, the "user view" and the "computer view". From the user
view, which will be referred to as the external schema, the definition of data is in the context of
reports and screens designed to aid individuals in doing their specific jobs. The required structure
of data from a usage view changes with the business environment and the individual preferences
of the user. From the computer view, which will be referred to as the internal schema, data is
defined in terms of file structures for storage and retrieval. The required structure of data for
computer storage depends upon the specific computer technology employed and the need for
efficient processing of data.[15]
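The three schemas can be illustrated with a small sketch. Everything named here (the Customer record, the delimited storage format, the two view functions) is an assumption chosen for the example, not part of the three schema approach itself.

```python
from dataclasses import dataclass


# Conceptual schema: the concept as users think and talk about it.
@dataclass
class Customer:
    customer_id: int
    name: str
    city: str


# Internal schema: how a record is laid out for storage (here, a delimited line).
def to_storage(customer):
    return f"{customer.customer_id}|{customer.name}|{customer.city}"


def from_storage(line):
    customer_id, name, city = line.split("|")
    return Customer(int(customer_id), name, city)


# External schemas: views shaped for particular users or programs.
def mailing_label_view(customer):
    return f"{customer.name}, {customer.city}"


def id_list_view(customers):
    return [c.customer_id for c in customers]


stored = to_storage(Customer(7, "R. Kumar", "Chennai"))
customer = from_storage(stored)
print(mailing_label_view(customer))  # R. Kumar, Chennai
```

If the storage format changes, only to_storage and from_storage need to change; the external views depend on the conceptual Customer alone.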
4+1 is a view model designed by Philippe Kruchten in 1995 for describing the architecture of
software-intensive systems, based on the use of multiple, concurrent views.[16] The views are
used to describe the system in the viewpoint of different stakeholders, such as end-users,
developers and project managers. The four views of the model are logical, development, process
and physical view:
Logical view: concerned with the functionality that the system provides to end-users.
Development view: illustrates a system from a programmer's perspective and is concerned
with software management.
Process view: deals with the dynamic aspect of the system, explains the system processes
and how they communicate, and focuses on the runtime behavior of the system.
Physical view: depicts the system from a system engineer's point of view. It is concerned
with the topology of software components on the physical layer, as well as communication
between these components.
In addition, selected use cases or scenarios are utilized to illustrate the architecture. Hence the
model contains 4+1 views.[16]
Enterprise Architecture framework defines how to organize the structure and views associated
with an Enterprise Architecture. Because the discipline of Enterprise Architecture and
Engineering is so broad, and because enterprises can be large and complex, the models
associated with the discipline also tend to be large and complex. To manage this scale and
complexity, an Architecture Framework provides tools and methods that can bring the task into
focus and allow valuable artifacts to be produced when they are most needed.
Architecture Frameworks are commonly used in Information technology and Information system
governance. An organization may wish to mandate that certain models be produced before a
system design can be approved. Similarly, they may wish to specify certain views be used in the
documentation of procured systems - the U.S. Department of Defense stipulates that specific
DoDAF views be provided by equipment suppliers for capital projects above a certain value.
Zachman Framework
Figure: Simplified illustration of the Zachman Framework with an explanation of the rows.[17] The
original framework is more advanced.
The Zachman Framework, originally conceived by John Zachman at IBM in 1987, is a
framework for enterprise architecture, which provides a formal and highly structured way of
viewing and defining an enterprise.
The Framework is used for organizing architectural "artifacts" in a way that takes into account
both who the artifact targets (for example, business owner and builder) and what particular issue
(for example, data and functionality) is being addressed. These artifacts may include design
documents, specifications, and models.[18]
The Zachman Framework is often referenced as a standard approach for expressing the basic
elements of enterprise architecture. The Zachman Framework has been recognized by the U.S.
Federal Government as having "... received worldwide acceptance as an integrated framework
for managing change in enterprises and the systems that support them."[19]
RM-ODP views
The RM-ODP view model provides five generic and complementary viewpoints on the
system and its environment.
The International Organization for Standardization (ISO) Reference Model for Open Distributed
Processing (RM-ODP) [20] specifies a set of viewpoints for partitioning the design of a distributed
software/hardware system. Since most integration problems arise in the design of such systems
or in very analogous situations, these viewpoints may prove useful in separating integration
concerns. The RM-ODP viewpoints are:[3]
the enterprise viewpoint, which is concerned with the purpose and behaviors of the
system as it relates to the business objective and the business processes of the organization
the information viewpoint, which is concerned with the nature of the information handled
by the system and constraints on the use and interpretation of that information
the computational viewpoint, which is concerned with the functional decomposition of
the system into a set of components that exhibit specific behaviors and interact at interfaces
the engineering viewpoint, which is concerned with the mechanisms and functions
required to support the interactions of the computational components
the technology viewpoint, which is concerned with the explicit choice of technologies for
the implementation of the system, and particularly for the communications among the
components
DoDAF views
The DoDAF defines a set of products that act as mechanisms for visualizing, understanding, and
assimilating the broad scope and complexities of an architecture description through graphic,
tabular, or textual means. These products are organized under four views:
Each view depicts certain perspectives of an architecture as described below. Only a subset of the
full DoDAF viewset is usually created for each system development. The figure represents the
information that links the operational view, systems and services view, and technical standards
view. The three views and their interrelationships driven by common architecture data elements
provide the basis for deriving measures such as interoperability or performance, and for
measuring the impact of the values of these metrics on operational mission and task
effectiveness.[21]
In the US Federal Enterprise Architecture, enterprise, segment, and solution architecture provide
different business perspectives by varying the level of detail and addressing related but distinct
concerns. Just as enterprises are themselves hierarchically organized, so are the different views
provided by each type of architecture. The Federal Enterprise Architecture Practice Guidance
(2006) has defined three types of architecture:[22]
By contrast, segment architecture defines a simple roadmap for a core mission area, business
service, or enterprise service. Segment architecture is driven by business management and
delivers products that improve the delivery of services to citizens and agency staff. From an
investment perspective, segment architecture drives decisions for a business case or group of
business cases supporting a core mission area or common or shared service. The primary
stakeholders for segment architecture are business owners and managers. Segment architecture is
related to EA through three principles: structure, reuse, and alignment. First, segment
architecture inherits the framework used by the EA, although it may be extended and specialized
to meet the specific needs of a core mission area or common or shared service. Second, segment
architecture reuses important assets defined at the enterprise level including: data; common
business processes and investments; and applications and technologies. Third, segment
architecture aligns with elements defined at the enterprise level, such as business strategies,
mandates, standards, and performance measures.[22]
In search of a "Framework for Modeling Space Systems Architectures", Peter Shames and Joseph
Skipper (2006) defined a "nominal set of views", derived from CCSDS RASDS, RM-ODP, ISO
10746 and compliant with IEEE 1471.
This "set of views", as described below, is a listing of possible modeling viewpoints. Not all of
these views may be used for any one project and other views may be defined as necessary. Note
that for some analyses elements from multiple viewpoints may be combined into a new view,
possibly using a layered representation.
In a later presentation this nominal set of views was presented as an Extended RASDS Semantic
Information Model Derivation. Here RASDS stands for Reference Architecture for Space Data
Systems.
Information viewpoint
Metamodel view - An abstract view that defines information model elements and their
structures and relationships. Defines the classes of data that are created and managed by the
system and the data architecture.
Information view - Describes the actual data and information as it is realized and
manipulated within the system. Data elements are defined by the metamodel view and they
are referred to by functional objects in other views.
Physical viewpoint
Data System view - Describes instruments, computers, and data storage components,
their data system attributes and the communications connectors (busses, networks, point to
point links) that are used in the system.
Telecomm view - Describes the telecomm components (antenna, transceiver), their
attributes and their connectors (RF or optical links).
Navigation view - Describes the motion of the major elements within the system
(trajectory, path, orbit), including their interaction with external elements and forces that are
outside of the control of the system, but that must be modeled with it to understand system
behavior (planets, asteroids, solar pressure, gravity).
Structural view - Describes the structural components in the system (s/c bus, struts,
panels, articulation), their physical attributes and connectors, along with the relevant
structural aspects of other components (mass, stiffness, attachment).
Thermal view - Describes the active and passive thermal components in the system
(radiators, coolers, vents) and their connectors (physical and free space radiation) and
attributes, along with the thermal properties of other components (e.g. antenna as sun shade).
Power view - Describes the active and passive power components in the system (solar
panels, batteries, RTGs) and their connectors, along with the power
properties of other components (data system and propulsion elements as power sinks and
structural panels as grounding plane).
Propulsion view - Describes the active and passive propulsion components in the system
(thrusters, gyros, motors, wheels) and their connectors, along with the
propulsive properties of other components.
Communications Protocol view - Describes the end to end design of the communications
protocols and related data transport and data management services, and shows the protocol stacks
as they are implemented on each of the physical components of the system.
Risk view - Describes the risks associated with the system design, processes, and
technologies, and assigns additional risk assessment attributes to other elements described in the
architecture.
Control Engineering view - Analyzes the system from the perspective of its controllability and the
allocation of elements into the system under control and the control system.
Integration and Test view - Looks at the system from the perspective of what must be
done to assemble, integrate and test the system, sub-systems, and assemblies. Includes
verification of proper functionality, driven by scenarios, in satisfaction of requirements.
IV&V view - Independent validation and verification of functionality and proper
operation of the system in satisfaction of requirements: does the system as designed and
developed meet its goals and objectives?
Technology viewpoint
Standards view - Defines the standards to be adopted during design of the system (e.g.
communication protocols, radiation tolerance, soldering). These are essentially constraints on
the design and implementation processes.
Infrastructure view - Defines the infrastructure elements that are to support the
engineering, design, and fabrication process. May include data system elements (design
repositories, frameworks, tools, networks) and hardware elements (chip fabrication, thermal
vacuum facility, machine shop, RF testing lab).
Technology Development & Assessment view - Includes
UNIT II
ABSTRACTION:
Abstraction is a process by which higher concepts are derived from the usage and
classification of literal ("real" or "concrete") concepts, first principles or other methods. "An
abstraction" is the product of this process: a concept that acts as a super-categorical noun for all
subordinate concepts and connects any related concepts as a group, field or category.
In computer science, abstraction is the process by which data and programs are defined
with a representation similar in form to its meaning (semantics), while hiding away the
implementation details. Abstraction tries to reduce and factor out details so that the programmer
can focus on a few concepts at a time. A system can have several abstraction layers whereby
different meanings and amounts of detail are exposed to the programmer. For example, low-level
abstraction layers expose details of the computer hardware where the program is run, while high-
level layers deal with the business logic of the program.
Abstraction captures only those details about an object that are relevant to the current
perspective. The concept originated by analogy with abstraction in mathematics. The
mathematical technique of abstraction begins with mathematical definitions making it a more
technical approach than the general concept of abstraction in philosophy. For example, in both
computing and in mathematics, numbers are concepts in the programming languages, as founded
in mathematics. Implementation details depend on the hardware and software but this is not a
restriction because the computing concept of number is still based on the mathematical concept.
Control abstraction involves the use of subprograms and related control-flow concepts.
Data abstraction allows handling data bits in meaningful ways. For example, it is the
basic motivation behind data types.
The recommendation that programmers use abstractions whenever suitable in order to avoid
duplication (usually of code) is known as the abstraction principle. The requirement that a
programming language provide suitable abstractions is also called the abstraction principle.
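As a small illustration of the abstraction principle in the first sense, the sketch below replaces two copies of the same averaging logic with one function. The function name mean and the sample data are assumptions made for the example.

```python
def mean(values):
    """Single shared abstraction for computing an average."""
    return sum(values) / len(values)


marks = [70, 80, 90]
heights_cm = [150.0, 160.0]

# Both computations reuse the same abstraction instead of repeating the loop.
print(mean(marks))       # 80.0
print(mean(heights_cm))  # 155.0
```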
LEVELS OF ABSTRACTION:
Computer science commonly presents levels (or, less commonly, layers) of abstraction,
wherein each level represents a different model of the same information and processes, but uses a
system of expression involving a unique set of objects and compositions that apply only to a
particular domain. Each relatively abstract, "higher" level builds on a relatively concrete, "lower"
level, which tends to provide an increasingly "granular" representation. For example, gates build
on electronic circuits, binary on gates, machine language on binary, programming language on
machine language, applications and operating systems on programming languages. Each level is
embodied, but not determined, by the level beneath it, making it a language of description that is
somewhat self-contained.
DATABASE SYSTEMS:
Since many users of database systems lack in-depth familiarity with computer data-
structures, database developers often hide complexity through the following levels:
Physical level: The lowest level of abstraction describes how a system actually stores data. The
physical level describes complex low-level data structures in detail.
Logical level: The next higher level of abstraction describes what data the database stores, and
what relationships exist among those data. The logical level thus describes an entire database in
terms of a small number of relatively simple structures. Although implementation of the simple
structures at the logical level may involve complex physical level structures, the user of the
logical level does not need to be aware of this complexity. This is referred to as Physical Data
Independence. Database administrators, who must decide what information to keep in a database,
use the logical level of abstraction.
View level: The highest level of abstraction describes only part of the entire database. Even
though the logical level uses simpler structures, complexity remains because of the variety of
information stored in a large database. Many users of a database system do not need all this
information; instead, they need to access only a part of the database. The view level of
abstraction exists to simplify their interaction with the system. The system may provide many
views for the same database.
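The three levels can be seen in a small SQLite session using Python's standard sqlite3 module. This is a sketch only: the employee table, its columns, and the staff_directory view are hypothetical names chosen for the example.

```python
import sqlite3

# Physical level: handled by SQLite itself (how rows are laid out in storage).
conn = sqlite3.connect(":memory:")

# Logical level: what data is stored and how it is structured.
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee (name, dept, salary) VALUES (?, ?, ?)",
    [("Anu", "IT", 52000.0), ("Bala", "HR", 48000.0)],
)

# View level: a directory user sees names and departments, but not salaries.
conn.execute("CREATE VIEW staff_directory AS SELECT name, dept FROM employee")

rows = conn.execute("SELECT name, dept FROM staff_directory ORDER BY name").fetchall()
print(rows)  # [('Anu', 'IT'), ('Bala', 'HR')]
```

Several such views could be defined over the same table, each exposing only the part of the database a given group of users needs.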
LAYERED ARCHITECTURE:
Systems design and business process design can both use this. Some design processes
specifically generate designs that contain various levels of abstraction.
Layered architecture partitions the concerns of the application into stacked groups (layers). It
is a technique used in designing computer software, hardware, and communications in which
system or network components are isolated in layers so that changes can be made in one layer
without affecting the others.
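A minimal sketch of a layered design follows, assuming three layers (data, business, presentation). The class and method names are hypothetical; the point is only that each layer talks to the layer directly beneath it.

```python
class DataLayer:
    """Lowest layer: stores and retrieves raw records."""

    def __init__(self):
        self._rows = {}

    def save(self, key, value):
        self._rows[key] = value

    def load(self, key):
        return self._rows.get(key)


class BusinessLayer:
    """Middle layer: applies rules; talks only to the data layer."""

    def __init__(self, data):
        self._data = data

    def register(self, username):
        if not username.strip():
            raise ValueError("username must not be blank")
        self._data.save(username, {"active": True})

    def is_active(self, username):
        record = self._data.load(username)
        return bool(record and record["active"])


class PresentationLayer:
    """Top layer: formats results; talks only to the business layer."""

    def __init__(self, business):
        self._business = business

    def status_line(self, username):
        state = "active" if self._business.is_active(username) else "unknown"
        return f"{username}: {state}"


business = BusinessLayer(DataLayer())
ui = PresentationLayer(business)
business.register("meena")
print(ui.status_line("meena"))  # meena: active
```

Replacing the dictionary in DataLayer with a file or a database would not require any change to the two layers above it, provided save and load keep the same behaviour.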
LANGUAGE FEATURES:
SPECIFICATION METHODS:
Analysts have developed various methods to formally specify software systems. Some
known methods include:
DATA ABSTRACTION:
Data abstraction enforces a clear separation between the abstract properties of a data type
and the concrete details of its implementation. The abstract properties are those that are visible to
client code that makes use of the data type (the interface to the data type), while the concrete
implementation is kept entirely private, and indeed can change, for example to incorporate
efficiency improvements over time. The idea is that such changes are not supposed to have any
impact on client code, since they involve no difference in the abstract behaviour.
For example, one could define an abstract data type called lookup table which uniquely
associates keys with values, and in which values may be retrieved by specifying their
corresponding keys. Such a lookup table may be implemented in various ways: as a hash table, a
binary search tree, or even a simple linear list of (key:value) pairs. As far as client code is
concerned, the abstract properties of the type are the same in each case.
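The lookup table example can be sketched as two implementations behind the same interface. The method names put and get are assumptions made for this illustration; the second class relies on Python's built-in dict, which is a hash table.

```python
class ListLookupTable:
    """Lookup table kept as a simple linear list of (key, value) pairs."""

    def __init__(self):
        self._pairs = []

    def put(self, key, value):
        for i, (k, _) in enumerate(self._pairs):
            if k == key:
                self._pairs[i] = (key, value)  # keep keys unique
                return
        self._pairs.append((key, value))

    def get(self, key):
        for k, v in self._pairs:
            if k == key:
                return v
        raise KeyError(key)


class HashLookupTable:
    """Same interface, backed by a hash table (Python's dict)."""

    def __init__(self):
        self._items = {}

    def put(self, key, value):
        self._items[key] = value

    def get(self, key):
        return self._items[key]


def client(table):
    """Client code that relies only on the abstract put/get interface."""
    table.put("apple", 1)
    table.put("apple", 2)
    table.put("pear", 5)
    return table.get("apple"), table.get("pear")


print(client(ListLookupTable()))  # (2, 5)
print(client(HashLookupTable()))  # (2, 5)
```

The client function cannot tell which implementation it was given; only the running time differs.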
Of course, this all relies on getting the details of the interface right in the first place, since
any changes there can have major impacts on client code. As one way to look at this: the
interface forms a contract on agreed behaviour between the data type and client code; anything
not spelled out in the contract is subject to change without notice.
Languages that implement data abstraction include Ada and Modula-2. Object-oriented
languages are commonly claimed to offer data abstraction; however, their inheritance concept
tends to put information in the interface that more properly belongs in the implementation; thus,
changes to such information end up impacting client code, leading directly to the fragile binary
interface problem.
CONTROL ABSTRACTION:
Programming languages offer control abstraction as one of the main purposes of their
use. Computer machines understand operations at the very low level such as moving some bits
from one location of the memory to another location and producing the sum of two sequences of
bits. Programming languages allow this to be done in the higher level. For example, consider this
statement written in a Pascal-like fashion:
a := (1 + 2) * 5
To a human, this seems a fairly simple and obvious calculation ("one plus two is three,
times five is fifteen"). However, the low-level steps necessary to carry out this evaluation, and
return the value "15", and then assign that value to the variable "a", are actually quite subtle and
complex. The values need to be converted to binary representation (often a much more
complicated task than one would think) and the calculations decomposed (by the compiler or
interpreter) into assembly instructions (again, which are much less intuitive to the programmer:
operations such as shifting a binary register left, or adding the binary complement of the contents
of one register to another, are simply not how humans think about the abstract arithmetical
operations of addition or multiplication). Finally, assigning the resulting value of "15" to the
variable labeled "a", so that "a" can be used later, involves additional 'behind-the-scenes' steps of
looking up a variable's label and the resultant location in physical or virtual memory, storing the
binary representation of "15" to that memory location, etc.
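To make the hidden steps more concrete, the statement can be written out as instructions for a tiny stack machine. This is only an illustrative sketch: the instruction names (PUSH, ADD, MUL, STORE) are invented for the example and do not correspond to any real instruction set or compiler output.

```python
def run(program):
    """Execute a list of low-level instructions on a tiny stack machine."""
    stack, memory = [], {}
    for op, arg in program:
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "MUL":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        elif op == "STORE":
            memory[arg] = stack.pop()  # assign the result to a named location
        else:
            raise ValueError(f"unknown instruction {op}")
    return memory


# Hand-written equivalent of: a := (1 + 2) * 5
program = [
    ("PUSH", 1), ("PUSH", 2), ("ADD", None),
    ("PUSH", 5), ("MUL", None),
    ("STORE", "a"),
]
print(run(program))  # {'a': 15}
```

Six instructions are needed for one line of source text, and a real machine would break each of them down further.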
Without control abstraction, a programmer would need to specify all the register/binary-level
steps each time he simply wanted to add or multiply a couple of numbers and assign the result to
a variable. Such duplication of effort has two serious negative consequences:
1. it forces the programmer to constantly repeat fairly common tasks every time a similar
operation is needed
2. it forces the programmer to program for the particular hardware and instruction set.
HIERARCHY:
Dimension: another word for "system" from on-line analytical processing (e.g. cubes)
Rank: the relative value, worth, complexity, power, importance, authority, level etc. of
an object
Hierarch: the top level of the hierarchy, usually consisting of one object or member of a
dimension
Peer: an object with the same rank (and therefore at the same level)
Neighbour: the adjacent level/ranking (the immediate superior and immediate inferior)
Interaction: the relationship between an object and its direct superior or subordinate (i.e.
a superior/inferior pair)
A direct interaction occurs when one object is on a level exactly one higher or one lower
than the other (i.e., on a tree, the two objects have a line between them)
Distance: the minimum number of connections between two objects, i.e., one less than
the number of objects that need to be "crossed" to trace a path from one object to another
Span: a qualitative description of the width of a level when diagrammed, i.e., the number
of subordinates an object has.
DEGREE OF BRANCHING:
Degree of branching refers to the number of direct subordinates or children an object has
(in graph terms, the number of child vertices a node is directly connected to). Hierarchies can be
categorized based on the "maximum degree", the highest degree present in the system as a whole.
Categorization in this way yields two broad classes: linear and branching.
In a linear hierarchy, the maximum degree is 1.[1] In other words, all of the objects can
be visualized in a lineup, and each object (excluding the top and bottom ones) has exactly one
direct subordinate and one direct superior. Note that this is referring to the objects and not
the levels; every hierarchy has this property with respect to levels, but normally each level can
have an infinite number of objects. An example of a linear hierarchy is the hierarchy of life.
In a branching hierarchy, one or more objects has a degree of 2 or more (and therefore
the maximum degree is 2 or higher).[1] For many people, the word "hierarchy" automatically
evokes an image of a branching hierarchy. [1] Branching hierarchies are present within numerous
systems, including organizations and classification schemes. The broad category of branching
hierarchies can be further subdivided based on the degree.
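The classification by maximum degree can be sketched in C++ as follows. This is only an illustration: the Node type and the function names max_degree and is_linear are assumptions, not part of the notes, and the sketch relies on std::vector accepting an incomplete element type (guaranteed from C++17).

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical node type: each object lists its direct subordinates.
struct Node {
    std::vector<Node> children;
};

// Largest number of direct subordinates found anywhere in the hierarchy.
std::size_t max_degree(const Node& node) {
    std::size_t best = node.children.size();
    for (const Node& child : node.children) {
        best = std::max(best, max_degree(child));
    }
    return best;
}

// Linear hierarchy: no object has more than one direct subordinate.
// Anything else is a branching hierarchy.
bool is_linear(const Node& root) {
    return max_degree(root) <= 1;
}
```

A chain of three objects gives a maximum degree of 1 (linear); an object with two children gives 2 (branching).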
A square can always also be referred to as a quadrilateral, polygon or shape. In this way,
it is a hierarchy. However, consider the set of polygons using this classification. A square
can only be a quadrilateral; it can never be a triangle, hexagon, etc.
Nested hierarchies are the organizational schemes behind taxonomies and systematic
classifications. For example, using the original Linnaean taxonomy a human can be
formulated as:
Taxonomies may change frequently (as seen in biological taxonomy), but the underlying
concept of nested hierarchies is always the same.
Containment hierarchy
A containment hierarchy is a direct extrapolation of the nested hierarchy concept. All of
the ordered sets are still nested, but every set must be "strict": no two sets can be identical. The
shapes example above can be modified to demonstrate this:
PARTITIONING:
The partitioning can be done by either building separate smaller databases (each with its
own tables, indices, and transaction logs), or by splitting selected elements, for example just one
table.
Horizontal partitioning involves putting different rows into different tables. Perhaps
customers with ZIP codes less than 50000 are stored in CustomersEast, while customers with
ZIP codes greater than or equal to 50000 are stored in CustomersWest. The two partition tables
are then CustomersEast and CustomersWest, while a view with a union might be created over
both of them to provide a complete view of all customers.
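The CustomersEast/CustomersWest split can be sketched in C++ with in-memory vectors standing in for tables. The Customer fields and the function names are assumptions made for illustration; a real DBMS would do this with table definitions and a view.

```cpp
#include <string>
#include <vector>

// Hypothetical row type; the field names are assumptions for illustration.
struct Customer {
    std::string name;
    int zip;
};

struct PartitionedCustomers {
    std::vector<Customer> east;  // ZIP codes below 50000
    std::vector<Customer> west;  // ZIP codes 50000 and above
};

// Route each row to exactly one partition table based on its ZIP code.
PartitionedCustomers partition_by_zip(const std::vector<Customer>& rows) {
    PartitionedCustomers parts;
    for (const Customer& row : rows) {
        (row.zip < 50000 ? parts.east : parts.west).push_back(row);
    }
    return parts;
}

// Plays the role of the union view: every customer from both partitions.
std::vector<Customer> all_customers(const PartitionedCustomers& parts) {
    std::vector<Customer> all = parts.east;
    all.insert(all.end(), parts.west.begin(), parts.west.end());
    return all;
}
```

Each row lands in exactly one partition, so the union over both partitions returns every customer once.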
Vertical partitioning involves creating tables with fewer columns and using additional
tables to store the remaining columns. Normalization also involves this splitting of columns
across tables, but vertical partitioning goes beyond that and partitions columns even when
already normalized. Different physical storage might be used to realize vertical partitioning as
well; storing infrequently used or very wide columns on a different device, for example, is a
method of vertical partitioning. Done explicitly or implicitly, this type of partitioning is called
"row splitting" (the row is split by its columns). A common form of vertical partitioning is to
split dynamic data (slow to find) from static data (fast to find) in a table where the dynamic data
is not used as often as the static. Creating a view across the two newly created tables restores the
original table with a performance penalty; however, performance will increase when accessing the
static data, e.g. for statistical analysis.
PARTITIONING CRITERIA:
Current high-end relational database management systems provide for different criteria to
split the database. They take a partitioning key and assign a partition based on certain criteria.
Common criteria are:
Range partitioning
Selects a partition by determining if the partitioning key is inside a certain range.
An example could be a partition for all rows where the column zipcode has a value
between 70000 and 79999.
List partitioning
A partition is assigned a list of values. If the partitioning key has one of these
values, the partition is chosen. For example, all rows where the column Country is
either Iceland, Norway, Sweden, Finland or Denmark could build a partition for
the Nordic countries.
Hash partitioning
The value of a hash function determines membership in a partition. Assuming there
are four partitions, the hash function could return a value from 0 to 3.
Composite partitioning
It allows for certain combinations of the above partitioning schemes, for example by first
applying a range partitioning and then a hash partitioning. Consistent hashing could be
considered a composite of hash and list partitioning where the hash reduces the key space to a
size that can be listed.
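The three basic criteria can be sketched in C++ as small functions that decide partition membership from a partitioning key. The function names are hypothetical, the range is assumed to be inclusive at both ends, and std::hash is used only as a stand-in for whatever hash function a real DBMS applies (its values differ between library implementations).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <set>
#include <string>

// Range partitioning: the key belongs to the partition if it lies in [low, high].
bool in_range_partition(int zipcode, int low, int high) {
    return zipcode >= low && zipcode <= high;
}

// List partitioning: the key belongs to the partition if it is in the value list.
bool in_list_partition(const std::string& country,
                       const std::set<std::string>& values) {
    return values.count(country) > 0;
}

// Hash partitioning: the hash of the key picks one of partition_count partitions.
std::size_t hash_partition(const std::string& key, std::size_t partition_count) {
    assert(partition_count > 0);  // at least one partition must exist
    return std::hash<std::string>{}(key) % partition_count;
}
```

With four partitions, hash_partition always returns a value from 0 to 3, and the same key always maps to the same partition within one program run.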
Cohesion
Cohesion (interdependency within a module) strength levels, from worst to best (high
cohesion is good):
Sequential Cohesion: operations on the same data in a significant order; the output from one
function is the input to the next (pipeline)
Informational Cohesion: a module performs a number of actions, each with its own entry
point, with independent code for each action, all performed on the same data structure.
Essentially an implementation of an abstract data type.
o e.g. define the structure of sales_region_table and its operators: init_table(),
update_table(), print_table()
Functional Cohesion: all elements contribute to a single, well-defined task, i.e. a function
that performs exactly one operation
o e.g. get_engine_temperature(), add_sales_tax()
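The last two levels can be sketched in C++. The operation names come from the examples above; the class layout, the map-based storage and the tax-rate parameter are assumptions made for illustration.

```cpp
#include <map>
#include <sstream>
#include <string>

// Informational cohesion: one data structure, several independent operations,
// each with its own entry point (in effect an abstract data type).
class SalesRegionTable {
public:
    void init_table() { sales_.clear(); }

    void update_table(const std::string& region, double amount) {
        sales_[region] += amount;
    }

    // Returns the rendered table so the caller decides where to print it.
    std::string print_table() const {
        std::ostringstream out;
        for (const auto& entry : sales_) {
            out << entry.first << ": " << entry.second << "\n";
        }
        return out.str();
    }

private:
    std::map<std::string, double> sales_;
};

// Functional cohesion: the function performs exactly one well-defined task.
double add_sales_tax(double price, double tax_rate) {
    return price * (1.0 + tax_rate);
}
```

All three table operations touch the same data and nothing else, which is what keeps the module cohesive.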
Coupling
Coupling (interdependence between modules) levels, from worst to best (high coupling
is bad):
Data Design
Designing data is about discovering and completely defining your application's data
characteristics and processes. Data design is a process of gradual refinement, from the coarse
"What data does your application require?" to the precise data structures and processes that
provide it. With a good data design, your application's data access is fast, easily maintained, and
can gracefully accept future data enhancements.
The process of data design includes identifying the data, defining specific data types and storage
mechanisms, and ensuring data integrity by using business rules and other run-time enforcement
mechanisms.
This section is not a formal methodology for data modeling, although it does use some relational
terminology. Rather, it presents some concepts and processes that are typically encountered as
you design your application's data.
This topic makes no assumptions about the eventual data storage technology used to store and
retrieve your application's data. After all, it is not always obvious at the beginning of an
application design just exactly how or where the data will be stored. While most formal data
modeling methodologies anticipate using a relational database engine, an enterprise application
has many data storage options, including relational, mainframe hierarchical and VSAM files,
AS/400 files, and various other distributed file data structures.
The following sections will acquaint you with some general concepts useful for designing
enterprise data.
Data Identification
Describes the process of discovering how the organization and your application will use
the data.
Data Definition
Explains the general process of defining tables, rows, columns, data types, keys, and
relationships.
Data Integrity
Discusses some important ways to provide data integrity, including normalization,
business rules, referential integrity, and data validation.
Some Data Design Cautions
Presents some real-world conflicts that influence data design decisions.
Database design
Database design is the process of producing a detailed data model of a database. This logical data model
contains all the needed logical and physical design choices and physical storage parameters
needed to generate a design in a Data Definition Language, which can then be used to create a
database. A fully attributed data model contains detailed attributes for each entity.
The term database design can be used to describe many different parts of the design of an overall
database system. Principally, and most correctly, it can be thought of as the logical design of the
base data structures used to store the data. In the relational model these are the tables and views.
In an object database the entities and relationships map directly to object classes and named
relationships. However, the term database design could also be used to apply to the overall
process of designing, not just the base data structures, but also the forms and queries used as part
of the overall database application within the database management system (DBMS).[1]
The process of doing database design generally consists of a number of steps which will be
carried out by the database designer. Usually, the designer must:
Attributes in ER diagrams are usually modeled as an oval with the name of the attribute, linked
to the entity or relationship that contains the attribute.
Within the relational model the final step can generally be broken down into two further steps,
that of determining the grouping of information within the system, generally determining what
are the basic objects about which information is being stored, and then determining the
relationships between these groups of information, or objects. This step is not necessary with an
Object database.
1. Determine the purpose of your database - This helps prepare you for the remaining
steps.
2. Find and organize the information required - Gather all of the types of information
you might want to record in the database, such as product name and order number.
3. Divide the information into tables - Divide your information items into major entities
or subjects, such as Products or Orders. Each subject then becomes a table.
4. Turn information items into columns - Decide what information you want to store in
each table. Each item becomes a field, and is displayed as a column in the table. For
example, an Employees table might include fields such as Last Name and Hire Date.
5. Specify primary keys - Choose each table's primary key. The primary key is a column
that is used to uniquely identify each row. An example might be Product ID or Order ID.
6. Set up the table relationships - Look at each table and decide how the data in one table
is related to the data in other tables. Add fields to tables or create new tables to clarify the
relationships, as necessary.
7. Refine your design - Analyze your design for errors. Create the tables and add a few
records of sample data. See if you can get the results you want from your tables. Make
adjustments to the design, as needed.
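Steps 3 to 7 can be mirrored in a small C++ sketch, with structs standing in for tables and std::map keyed by the primary key. The Products and Orders subjects come from the steps above; the concrete fields, the quantity column and the check function are assumptions made for illustration.

```cpp
#include <map>
#include <string>

// Steps 3 and 4: each subject becomes a table, each information item a column.
struct Product {
    int product_id;      // Step 5: primary key
    std::string name;
};

struct Order {
    int order_id;        // Step 5: primary key
    int product_id;      // Step 6: relationship to the Products table
    int quantity;
};

// A table keyed by its primary key, so each key identifies at most one row.
using ProductTable = std::map<int, Product>;
using OrderTable = std::map<int, Order>;

// Step 7: a check on sample data. Every order must reference an existing product.
bool orders_reference_existing_products(const OrderTable& orders,
                                        const ProductTable& products) {
    for (const auto& entry : orders) {
        if (products.count(entry.second.product_id) == 0) {
            return false;
        }
    }
    return true;
}
```

Loading a few sample rows and running the check is one way to find design errors before the real database is created.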
Conceptual schema
Once a database designer is aware of the data which is to be stored within the database, they
must then determine where dependency is within the data. Sometimes when data is changed you
can be changing other data that is not visible. For example, in a list of names and addresses,
assuming a situation where multiple people can have the same address, but one person cannot
have more than one address, the address is dependent upon the name: given a name, the address
can be uniquely determined. The other way around is different, because a given address may be
associated with several names. One attribute can change and not another.
(NOTE: A common misconception is that the relational model is so called because of the stating
of relationships between data elements therein. This is not true. The relational model is so named
because it is based upon the mathematical structures known as relations.)
Once the relationships and dependencies amongst the various pieces of information have been
determined, it is possible to arrange the data into a logical structure which can then be mapped
into the storage objects supported by the database management system. In the case of relational
databases the storage objects are tables which store data in rows and columns.
Each table may represent an implementation of either a logical object or a relationship joining
one or more instances of one or more logical objects. Relationships between tables may then be
stored as links connecting child tables with parents. Since complex logical relationships are
themselves tables they will probably have links to more than one parent.
In an Object database the storage objects correspond directly to the objects used by the Object-
oriented programming language used to write the applications that will manage and access the
data. The relationships may be defined as attributes of the object classes involved or as methods
that operate on the object classes.
The physical design of the database specifies the physical configuration of the database on the
storage media. This includes detailed specification of data elements, data types, indexing options
and other parameters residing in the DBMS data dictionary. It is the detailed design of a system
that includes modules & the database's hardware & software specifications of the system.
Modularity
Modular system
Collection of abstractions
1. Functional abstraction
2. Data Abstraction
3. Control Abstraction
A modular system consists of well-defined, manageable units with well-defined interfaces among
the units.
Properties
1. Each processing abstraction is a well-defined subsystem that is potentially useful in other
applications.
Characteristics
Modularization can be used to isolate the machine dependencies and to improve the performance of a
software product.
1. Coupling
2. Cohesion
Cohesion
Cohesion (interdependency within a module) strength levels, from worst to best (high
cohesion is good):
Coincidental Cohesion: (worst) module elements are unrelated
Logical Cohesion: elements perform similar activities as selected from outside the module,
i.e. by a flag that selects the operation to perform (see also CommandObject).
o e.g. the body of the function is one huge if-else/switch on an operation flag
Coupling
Coupling (interdependence between modules) levels, from worst to best (high coupling
is bad):
Content/Pathological Coupling: (worst) one module uses or alters data inside another module
Control Coupling: two modules communicate with a control flag (the first tells the second what
to do via the flag)
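Control coupling can be sketched in C++ as follows. The report functions and the ReportMode flag are hypothetical names chosen for illustration. The first function is also an instance of logical cohesion, because its body branches on an operation flag supplied from outside.

```cpp
#include <string>

// Control coupling: the caller passes a flag that tells the callee what to do.
enum class ReportMode { Summary, Detail };

std::string make_report(int order_count, ReportMode mode) {
    if (mode == ReportMode::Summary) {
        return "orders: " + std::to_string(order_count);
    }
    return "orders: " + std::to_string(order_count) + " (detailed listing)";
}

// Looser alternative: one function per task, so no flag crosses the interface.
std::string make_summary_report(int order_count) {
    return "orders: " + std::to_string(order_count);
}

std::string make_detail_report(int order_count) {
    return make_summary_report(order_count) + " (detailed listing)";
}
```

Both versions produce the same text. The difference is that the second interface no longer requires the caller to know how the callee branches internally.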
Normalization
Database normalization is the process of organizing the fields and tables of a relational
database to minimize redundancy and dependency. Normalization usually involves dividing large
tables into smaller (and less redundant) tables and defining relationships between them. The
objective is to isolate data so that additions, deletions, and modifications of a field can be made
in just one table and then propagated through the rest of the database via the defined
relationships.
Functional dependency
In a given table, an attribute Y is said to have a functional dependency on a set of
attributes X (written X → Y) if and only if each X value is associated with precisely one Y
value. For example, in an "Employee" table that includes the attributes "Employee ID"
and "Employee Date of Birth", the functional dependency {Employee ID} → {Employee
Date of Birth} would hold. It follows from the previous two sentences that each
{Employee ID} is associated with precisely one {Employee Date of Birth}.
Trivial functional dependency
A trivial functional dependency is a functional dependency of an attribute on a superset of
itself. {Employee ID, Employee Address} → {Employee Address} is trivial, as is
{Employee Address} → {Employee Address}.
Full functional dependency
An attribute is fully functionally dependent on a set of attributes X if it is:
functionally dependent on X, and
not functionally dependent on any proper subset of X.
{Employee Address} has a functional dependency on {Employee ID, Skill}, but not a full
functional dependency, because it is also dependent on {Employee ID}.
Transitive dependency
A transitive dependency is an indirect functional dependency, one in which X → Z only by
virtue of X → Y and Y → Z.
Multivalued dependency
A multivalued dependency is a constraint according to which the presence of certain rows
in a table implies the presence of certain other rows.
Join dependency
A table T is subject to a join dependency if T can always be recreated by joining multiple
tables each having a subset of the attributes of T.
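The definition of functional dependency given above can be checked against sample data with a short C++ sketch. The Row representation and the names project and holds_fd are assumptions made for illustration. Note that the check only shows whether X → Y holds in the rows supplied; it cannot prove that the dependency is a rule of the schema.

```cpp
#include <map>
#include <string>
#include <vector>

// A row maps attribute names to values; this representation is an assumption.
using Row = std::map<std::string, std::string>;

// Project a row onto the given attributes, in order.
std::vector<std::string> project(const Row& row,
                                 const std::vector<std::string>& attrs) {
    std::vector<std::string> values;
    for (const std::string& attr : attrs) {
        values.push_back(row.at(attr));
    }
    return values;
}

// X -> Y holds in the table if equal X values always come with equal Y values.
bool holds_fd(const std::vector<Row>& table,
              const std::vector<std::string>& x,
              const std::vector<std::string>& y) {
    std::map<std::vector<std::string>, std::vector<std::string>> seen;
    for (const Row& row : table) {
        auto result = seen.emplace(project(row, x), project(row, y));
        if (!result.second && result.first->second != project(row, y)) {
            return false;  // same X value, different Y value
        }
    }
    return true;
}
```

For an employee table in which one employee has several skills, {id} → {dob} holds while {id} → {skill} does not.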
Superkey
A superkey is a combination of attributes that can be used to uniquely identify a database
record. A table might have many superkeys.
Candidate key
A candidate key is a superkey that contains no extraneous attributes: it is a minimal
superkey.
Examples: Imagine a table with the fields <Name>, <Age>, <SSN> and <Phone Extension>.
This table has many possible superkeys. Three of these are <SSN>, <Phone Extension, Name>
and <SSN, Name>. Of those listed, only <SSN> is a candidate key, as the others contain
information not necessary to uniquely identify records ('SSN' here refers to Social Security
Number, which is unique to each person).
Non-prime attribute
A non-prime attribute is an attribute that does not occur in any candidate key. Employee
Address would be a non-prime attribute in the "Employees' Skills" table.
Prime attribute
A prime attribute, conversely, is an attribute that does occur in some candidate key.
Primary key
Most DBMSs require a table to be defined as having a single unique key, rather than a
number of possible unique keys. A primary key is a key which the database designer has
designated for this purpose.
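The superkey and candidate key definitions can be tried out on sample rows with the following C++ sketch. The Row representation and the function names are assumptions made for illustration, and the result only reflects the rows supplied, not every row the table might ever hold.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

using Row = std::map<std::string, std::string>;  // attribute name -> value (assumed)

// A set of attributes is a superkey of this data if no two rows agree on all of them.
bool is_superkey(const std::vector<Row>& table,
                 const std::vector<std::string>& attrs) {
    std::set<std::vector<std::string>> seen;
    for (const Row& row : table) {
        std::vector<std::string> key;
        for (const std::string& attr : attrs) {
            key.push_back(row.at(attr));
        }
        if (!seen.insert(key).second) {
            return false;  // duplicate key value: rows are not uniquely identified
        }
    }
    return true;
}

// A candidate key is a minimal superkey: dropping any one attribute breaks uniqueness.
bool is_candidate_key(const std::vector<Row>& table,
                      const std::vector<std::string>& attrs) {
    if (!is_superkey(table, attrs)) {
        return false;
    }
    for (std::size_t skip = 0; skip < attrs.size(); ++skip) {
        std::vector<std::string> smaller;
        for (std::size_t i = 0; i < attrs.size(); ++i) {
            if (i != skip) {
                smaller.push_back(attrs[i]);
            }
        }
        if (is_superkey(table, smaller)) {
            return false;  // a proper subset already identifies every row
        }
    }
    return true;
}
```

Removing one attribute at a time is enough for the minimality test, because any superset of a superkey is itself a superkey.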
Normal forms
The normal forms (abbrev. NF) of relational database theory provide criteria for determining a
table's degree of vulnerability to logical inconsistencies and anomalies. The higher the normal
form applicable to a table, the less vulnerable it is to inconsistencies and anomalies. Each table
has a "highest normal form" (HNF): by definition, a table always meets the requirements of its
HNF and of all normal forms lower than its HNF; also by definition, a table fails to meet the
requirements of any normal form higher than its HNF.
The normal forms are applicable to individual tables; to say that an entire database is in normal
form n is to say that all of its tables are in normal form n.
Denormalization
Databases intended for online transaction processing (OLTP) are typically more normalized than
databases intended for online analytical processing (OLAP). OLTP applications are characterized
by a high volume of small transactions such as updating a sales record at a supermarket checkout
counter. The expectation is that each transaction will leave the database in a consistent state. By
contrast, databases intended for OLAP operations are primarily "read mostly" databases. OLAP
applications tend to extract historical data that has accumulated over a long period of time. For
such databases, redundant or "denormalized" data may facilitate business intelligence
applications. Specifically, dimensional tables in a star schema often contain denormalized data.
The denormalized or redundant data must be carefully controlled during extract, transform, load
(ETL) processing, and users should not be permitted to see the data until it is in a consistent
state. The normalized alternative to the star schema is the snowflake schema. In many cases, the
need for denormalization has waned as computers and RDBMS software have become more
powerful, but since data volumes have generally increased along with hardware and software
performance, OLAP databases often still use denormalized schemas.
Procedural Design
1. Structured Programming
2. Graphical Design Tools
3. Program Design
4. Structured English
5. Pseudocode
Structured Design
The basic approach in structured design is the systematic conversion of data flow diagrams (DFDs)
into structure charts; design heuristics such as coupling & cohesion are used to guide the design process.
1. Review & refinement of the DFD developed during requirements definition & external
design.
2. Determination of whether the system is transform-centered or transaction-centered,
and derivation of a high-level structure chart based on that determination.
3. Decomposition of each subsystem using guidelines such as coupling, cohesion,
information hiding, levels of abstraction, data abstraction.
Structure Chart
1. It has no decision boxes.
2. The sequential ordering of tasks inherent in a flowchart can be suppressed in a structure
chart.
TRANSFORM ANALYSIS
===> complete SC
The DFD should exist as a result of systems requirements engineering and systems
analysis.
UNIT - III
Aggregation
Object composition (not to be confused with function composition) is a way to combine simple
objects or data types into more complex ones. Compositions are a critical building block of many
basic data structures, including the tagged union, the linked list, and the binary tree, as well as
the object used in object-oriented programming.
A real-world example of composition may be seen in the relation of an automobile to its parts,
specifically: the automobile 'has or is composed from' objects including steering wheel, seat,
gearbox and engine.
When, in a language, objects are typed, types can often be divided into composite and
noncomposite types, and composition can be regarded as a relationship between types: an object
of a composite type (e.g. car) "has an" object of a simpler type (e.g. wheel).
Composition must be distinguished from subtyping, which is the process of adding detail to a
general data type to create a more specific data type. For instance, cars may be a specific type of
vehicle: car is a vehicle. Subtyping doesn't describe a relationship between different objects, but
instead, says that objects of a type are simultaneously objects of another type.
In programming languages, composite objects are usually expressed by means of references from
one object to another; depending on the language, such references may be known as fields,
members, properties or attributes, and the resulting composition as a structure, storage
record, tuple, user-defined type (UDT), or composite type. Fields are given a unique name so
that each one can be distinguished from the others. However, having such references doesn't
necessarily mean that an object is a composite. It is only called composite if the objects it refers
to are really its parts, i.e. have no independent existence. For details, see the aggregation section
below.
UML notation
In UML, composition is depicted as a filled diamond and a solid line. It always implies a
multiplicity of 1 or 0..1, as no more than one object at a time can have lifetime responsibility for
another object.
The more general form, aggregation, is depicted as an unfilled diamond and a solid line. The
image below shows both composition and aggregation. The C++ code below shows what the
source code is likely to look like.
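#include <vector>

class Carburetor { };  // a part owned by exactly one Car
class Duck { };        // exists independently of any Pond

// Composition: the Car creates its Carburetor and destroys it with itself.
class Car
{
private:
    Carburetor* carb;
public:
    Car() : carb(new Carburetor()) { }
    Car(const Car&) = delete;             // copying would delete carb twice
    Car& operator=(const Car&) = delete;
    virtual ~Car() { delete carb; }
};

// Aggregation: the Pond only refers to Ducks; it does not own or delete them.
class Pond
{
private:
    std::vector<Duck*> ducks;
};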
Composite types in C
typedef struct
{
    int age;
    char *name;
    enum { male, female } sex;
} Person;
Class Pattern
In computer programming, the adapter pattern (often referred to as the wrapper pattern or
simply a wrapper) is a design pattern that translates one interface for a class into a compatible
interface. An adapter allows classes to work together that normally could not because of
incompatible interfaces, by providing its interface to clients while using the original interface.
The adapter translates calls to its interface into calls to the original interface, and the amount of
code necessary to do this is typically small. The adapter is also responsible for transforming data
into appropriate forms. For instance, if multiple Boolean values are stored as a single integer (i.e.
flags) but your client requires a 'true'/'false', the adapter would be responsible for extracting the
appropriate values from the integer value. Another example is transforming the format of dates
(e.g. YYYYMMDD to MM/DD/YYYY or DD/MM/YYYY).
Structure
In the object adapter pattern, the adapter contains an instance of the class it wraps. In this
situation, the adapter makes calls to the instance of the wrapped object.
The object adapter pattern expressed in UML. The adapter hides the adaptee's interface from the
client.
The class adapter pattern uses multiple polymorphic interfaces to achieve its goal. The adapter is
created by implementing or inheriting both the interface that is expected and the interface that is
pre-existing. It is typical for the expected interface to be created as a pure interface class,
especially in languages such as Java that do not support multiple inheritance of classes.
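An object adapter for the date-format example mentioned earlier can be sketched in C++ as follows. All class and method names here are hypothetical, and the fixed date returned by the adaptee is only a placeholder so that the sketch is self-contained.

```cpp
#include <cassert>
#include <string>

// Pre-existing class (the adaptee); its name and output format are assumptions.
class LegacyCalendar {
public:
    std::string today_compact() const { return "20240131"; }  // YYYYMMDD
};

// Interface the client expects.
class DateSource {
public:
    virtual ~DateSource() = default;
    virtual std::string today() const = 0;  // MM/DD/YYYY
};

// Object adapter: holds an instance of the adaptee and translates calls to it.
class CalendarAdapter : public DateSource {
public:
    explicit CalendarAdapter(const LegacyCalendar& legacy) : legacy_(legacy) {}

    std::string today() const override {
        const std::string raw = legacy_.today_compact();
        assert(raw.size() == 8);  // expects exactly YYYYMMDD
        return raw.substr(4, 2) + "/" + raw.substr(6, 2) + "/" + raw.substr(0, 4);
    }

private:
    LegacyCalendar legacy_;
};
```

The client works only with DateSource and never sees today_compact(), which is the sense in which the adapter hides the adaptee's interface.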
It consists of:
1. OOA
2. OOD
3. OOP
Analysis
1. Subject layer:
Showing the attributes of the classes and the association relationships between
classes.
5. Service layer:
Which shows the operations of the classes and the potential message passing
between the objects.
Design
Four components
Design Pattern
A pattern is instructive information that captures the essential structure and insight of a
successful family of proven solutions to a recurring problem that arises within a certain context
and system of forces.
A pattern involves a general description of a solution to a recurring problem bundled with various
goals and constraints.
Good pattern
1. It solves a problem
It captures a solution.
2. It is a proven concept
Generative patterns not only describe a recurring problem; they can also tell us how to
generate something, and they can be observed in the resulting system architectures.
Nongenerative patterns are static & passive; they describe recurring phenomena without
necessarily saying how to produce them.
Pattern Templates
Every pattern must be expressed in the form of a rule which establishes a relationship
between a context, a system of forces which arises in that context & a configuration.
Components
1. Name
2. Problem
3. Context
4. Forces
5. Solution
6. Example
7. Resulting Context
8. Rationale
9. Related Patterns
Pattern Thumbnail
Antipatterns
Worst Practice
Bad solution
1. Focus on practicability
3. Careful editing
4. Writers' workshop
Frameworks
A framework is a way of presenting a generic solution to a problem that can be applied to all
levels in a development.
OO Framework
It is a set of cooperating classes that make up a reusable design for a specific class of
software.
Evaluation testing
Performance is an assessment of how well a task is executed and the success of a training
program is largely dependent upon satisfying the performance aims associated with it.
Testing and measurement are the means of collecting information upon which subsequent
performance evaluations and decisions are made.
All of the above stages should be completed with the athlete, especially the analysis of the
collected data and the decision on an appropriate way forward.
In constructing tests it is important to make sure that they really measure the factors required to
be tested, and are thus objective rather than subjective. In doing so all tests should therefore be
specific (designed to assess an athlete's fitness for the activity in question), valid (the degree to
which the test actually measures what it claims to measure), reliable (capable of consistent
repetition) and objective (produce a consistent result irrespective of the tester).
Tests additionally break up and add variety to the training program. They can be used to satisfy
the athlete's competitive urge out of season. Maximal tests demand maximum effort of the athlete
so they are useful at times as a training unit in their own right.
The following factors may have an impact on the results of a test (test reliability):
For the coach and athlete it is important to monitor the program of work, to maintain progression
in terms of the volume of work and its intensity. Both coach and athlete must keep their own
training records. A training diary can give an enormous amount of information about what has
happened in the past and how training has gone in the past. When planning future training cycles,
information of this kind is invaluable.
o The training load (the number of miles, the number of sets and repetitions, the
number of attempts)
o The training intensity (kilograms, percentage of maximum, percentage of VO2)
o The response to training (the assignments completed, the resultant heart rate
recovery, felt tired, etc.)
Information that measures status. This can take the form of a test. If the test is repeated
throughout the program, it can then be used as a measure of progress within the training
discipline. Examples of such tests are:
o Time trials - speed, speed endurance, endurance
o Event specific
Competition evaluation
Following competition, it is important that the coach and athlete get together as soon as possible
in order to evaluate the athlete's performance. Elements to be considered are pre race
preparations, focus and performance plans and achievement of these plans. An evaluation form is
useful to help the athlete and coach conduct this review.
Maximal Tests
Maximal means the athlete works at maximum effort or is tested to exhaustion. Examples of
maximal anaerobic tests are the 30 metre acceleration test and the Wingate ANaerobic 30 cycle
test. Examples of maximal aerobic tests are the Multistage Fitness Test (Bleep test) and the
Cooper VO2max test.
Submaximal Tests
Submaximal means the athlete works below maximum effort. In submaximal tests, extrapolation
is used to estimate maximum capacity. Examples of submaximal aerobic tests are the PWC-170
test and the Queens College Step Test.
Normative data
Where normative data (average test results) is available, it is included on the appropriate
evaluation test pages which are identified below.
The Sport Specific Performance Tests page provides guidance on possible tests to evaluate the
athlete's fitness components for a variety of sports.
OOAD
A software system is a set of mechanisms for performing certain actions on certain data.
OOSD is a way to develop software by building self-contained modules or objects that can be
easily replaced, modified and reused.
Software is a collection of discrete objects that encapsulate their data as well as the functionality
to model real world objects.
The Rumbaugh method is well suited for describing the object model or the static structure of the
system.
The Jacobson method is good for producing user-driven analysis models.
It describes a method for the analysis, design and implementation of a system using an
object-oriented technique.
Object Modeling Technique (OMT) is a fast, intuitive approach for identifying and modeling all the
objects making up a system, with details such as class attributes, methods, inheritance and association.
The dynamic behavior of objects within a system can be described using the OMT dynamic model.
1. Analysis:
2. System design:
Basic architecture of the system along with high level strategy decision.
3. Object Design:
4. Implementation:
Shlaer-Mellor method
It contains recursive design and a model-driven architecture of the kind normally associated with the
Unified Modeling Language (UML).
Domain
Domain Types
1. Application Domain
2. Service Domain
3. Architecture Domain
4. Implementation Domain
Analysis of domains begins by identifying the objects that make up a domain's subject
matter.
Information Model
State Model
Event Model
It consists of a graphical state transition diagram describing the states, the transitions, the events
that cause transitions, and the actions that occur on entering a state.
Events provide a communication mechanism between objects and between objects and
the outside world.
Actions can do calculations, create and delete instances, read and write an event.
The method includes a complete set of time rules and event handling rules.
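To make the state model concrete, the following Python sketch shows one possible way to encode states, events, transitions and entry actions as a table. The `StateMachine` class and the lamp example (its states, events and actions) are hypothetical names chosen for illustration; they are not part of the Shlaer-Mellor method itself.

```python
class StateMachine:
    """Table-driven state model: (state, event) -> next state, plus entry actions."""

    def __init__(self, initial, transitions, entry_actions):
        self.state = initial
        self.transitions = transitions      # {(state, event): next_state}
        self.entry_actions = entry_actions  # {state: callable run on entering it}
        self.log = []                       # records what the actions did

    def dispatch(self, event):
        """Deliver an event; follow the transition and run the entry action."""
        key = (self.state, event)
        if key not in self.transitions:
            raise ValueError(f"event {event!r} not allowed in state {self.state!r}")
        self.state = self.transitions[key]
        action = self.entry_actions.get(self.state)
        if action is not None:
            action(self)
        return self.state


# Hypothetical example object: a lamp with two states and two events.
lamp = StateMachine(
    initial="off",
    transitions={("off", "switch_on"): "on", ("on", "switch_off"): "off"},
    entry_actions={
        "on": lambda m: m.log.append("light up"),
        "off": lambda m: m.log.append("go dark"),
    },
)
lamp.dispatch("switch_on")
lamp.dispatch("switch_off")
print(lamp.state, lamp.log)
```

Keeping the transitions in a table mirrors the state transition diagram directly: each arrow in the diagram becomes one dictionary entry.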
Process Models
It provides
1. Accessors
2. Event generators
3. Transformations
4. Tests
Derived Model
It provides subsystem-level models:
1. Communication model
2. Relationship model
3. Access model
Recursive Design
It refers to repeatedly applying the OOA process to the application, service and
architectural domains.
1. Determine architecture and specify the rules for translation of the OOA models into
implementation.
2. Build the translation Components
3. Translate the OOA models into implementation.
UML Diagrams (Class, Collaboration & Sequence)
What is UML?
UML stands for Unified Modeling Language. This object-oriented system of notation has
evolved from the work of Grady Booch, James Rumbaugh, Ivar Jacobson, and the
Rational Software Corporation. These renowned computer scientists fused their
respective technologies into a single, standardized model. Today, UML is accepted by the
Object Management Group (OMG) as the standard for modeling object oriented
programs.
Types of UML Diagrams
UML 1.x defines nine types of diagrams: class (package), object, use case, sequence,
collaboration, statechart, activity, component, and deployment.
Class Diagrams
Class diagrams are the backbone of almost every object oriented method, including
UML. They describe the static structure of a system.
Package Diagrams
Package diagrams are a subset of class diagrams, but developers sometimes treat them as
a separate technique. Package diagrams organize elements of a system into related groups
to minimize dependencies between packages.
Object Diagrams
Object diagrams describe the static structure of a system at a particular time. They can be
used to test class diagrams for accuracy.
Use Case Diagrams
Use case diagrams model the functionality of a system using actors and use cases.
Sequence Diagrams
Sequence diagrams describe interactions among classes in terms of an exchange of
messages over time.
Collaboration Diagrams
Collaboration diagrams represent interactions between objects as a series of sequenced
messages. Collaboration diagrams describe both the static structure and the dynamic
behavior of a system.
Statechart Diagrams
Statechart diagrams describe the dynamic behavior of a system in response to external
stimuli. Statechart diagrams are especially useful in modeling reactive objects whose
states are triggered by specific events.
Activity Diagrams
Activity diagrams illustrate the dynamic nature of a system by modeling the flow of
control from activity to activity. An activity represents an operation on some class in the
system that results in a change in the state of the system. Typically, activity diagrams are
used to model workflow or business processes and internal operation.
Deployment Diagrams
Deployment diagrams depict the physical resources in a system, including nodes,
components, and connections.
Use-case diagram
A use case illustrates a unit of functionality provided by the system. The main purpose of
the use-case diagram is to help development teams visualize the functional requirements
of a system, including the relationship of "actors" (human beings who will interact with
the system) to essential processes, as well as the relationships among different use cases.
Use-case diagrams generally show groups of use cases: either all use cases for the
complete system, or a breakout of a particular group of use cases with related
functionality (e.g., all security administration-related use cases). To show a use case on a
use-case diagram, you draw an oval in the middle of the diagram and put the name of the
use case in the center of, or below, the oval. To draw an actor (indicating a system user)
on a use-case diagram, you draw a stick person to the left or right of your diagram (and
just in case you're wondering, some people draw prettier stick people than others). Use
simple lines to depict relationships between actors and use cases, as shown in Figure 1.
Figure 1: Sample use-case diagram
A use-case diagram is typically used to communicate the high-level functions of the
system and the system's scope. By looking at our use-case diagram in Figure 1, you can
easily tell the functions that our example system provides. This system lets the band
manager view a sales statistics report and the Billboard 200 report for the band's CDs. It
also lets the record manager view a sales statistics report and the Billboard 200 report for
a particular CD. The diagram also tells us that our system delivers Billboard reports from
an external system called Billboard Reporting Service.
In addition, the absence of use cases in this diagram shows what the system doesn't do.
For example, it does not provide a way for a band manager to listen to songs from the
different albums on the Billboard 200; i.e., we see no reference to a use case called
Listen to Songs from Billboard 200. This absence is not a trivial matter. With clear and
simple use-case descriptions provided on such a diagram, a project sponsor can easily see
if needed functionality is present or not present in the system.
attributes, however, because it will most likely have references to things like Vectors and
HashMaps.
A class is depicted on the class diagram as a rectangle with three horizontal sections, as
shown in Figure 2. The upper section shows the class's name; the middle section contains
the class's attributes; and the lower section contains the class's operations (or "methods").
Figure 2: Sample class object in a class diagram
In my experience, almost every developer knows what this diagram is, yet I find that
most programmers draw the relationship lines incorrectly. For a class diagram like the
one in Figure 3, you should draw the inheritance relationship using a line with an
arrowhead at the top pointing to the super class, and the arrowhead should be a
completed triangle. [Note: For more information on inheritance and other object-oriented
principles, see the Java tutorial What Is Inheritance?] An association relationship should
be a solid line if both classes are aware of each other and a line with an open arrowhead
if the association is known by only one of the classes.
Figure 3: A complete class diagram, including the class object shown in Figure 2
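The three relationship lines described above correspond to recognisable code structures. The following Python sketch illustrates one plausible mapping; all class names (`Report`, `SalesReport`, `Band`, `Manager`, `ReportViewer`) are hypothetical and are not taken from Figure 3.

```python
class Report:
    """Superclass: the triangle arrowhead on the diagram points at this class."""

    def __init__(self, title):
        self.title = title

    def render(self):
        return f"Report: {self.title}"


class SalesReport(Report):
    """Subclass: inherits the attributes and operations of Report."""

    def __init__(self, title, units_sold):
        super().__init__(title)
        self.units_sold = units_sold

    def render(self):
        return f"{super().render()} ({self.units_sold} units)"


class Band:
    """One end of a bidirectional association (plain solid line)."""

    def __init__(self, name):
        self.name = name
        self.manager = None  # reference back to the Manager


class Manager:
    """Other end of the bidirectional association: both classes know each other."""

    def __init__(self, name):
        self.name = name
        self.bands = []

    def manage(self, band):
        self.bands.append(band)
        band.manager = self


class ReportViewer:
    """Unidirectional association (open arrowhead): the viewer knows its
    report, but Report holds no reference to any viewer."""

    def __init__(self, report):
        self.report = report

    def show(self):
        return self.report.render()


manager = Manager("Ann")
band = Band("The Examples")
manager.manage(band)
viewer = ReportViewer(SalesReport("Q1 sales", 10))
print(viewer.show())
```

The attribute and operation sections of the class rectangle map onto the instance attributes set in `__init__` and the methods of each class.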
box, put the class instance name and class name separated by a space/colon/space " : "
(e.g., myReportGenerator : ReportGenerator). If a class instance sends a message to
another class instance, draw a line with an open arrowhead pointing to the receiving class
instance; place the name of the message/method above the line. Optionally, for important
messages, you can draw a dotted line with an arrowhead pointing back to the originating
class instance; label the return value above the dotted line. Personally, I always like to
include the return value lines because I find the extra details make it easier to read.
Reading a sequence diagram is very simple. Start at the top left corner with the "driver"
class instance that starts the sequence. Then follow each message down the diagram.
Remember: Even though the example sequence diagram in Figure 4 shows a return
message for each sent message, this is optional.
Figure 4: A sample sequence diagram
that is, classes with three or more potential states during system activity should be
modeled.
As shown in Figure 5, the notation set of the statechart diagram has five basic elements:
the initial starting point, which is drawn using a solid circle; a transition between states,
which is drawn using a line with an open arrowhead; a state, which is drawn using a
rectangle with rounded corners; a decision point, which is drawn as an open circle; and
one or more termination points, which are drawn using a circle with a solid circle inside
it. To draw a statechart diagram, begin with a starting point and a transition line pointing
to the initial state of the class. Draw the states themselves anywhere on the diagram, and
then simply connect them using the state transition lines.
Figure 5: Statechart diagram showing the various states that classes pass through in
a functioning system
grouped into swimlanes, which are used to indicate the object that actually performs the
activity, as shown in Figure 6.
Figure 6: Activity diagram, with two swimlanes to indicate control of activity by two
objects: the band manager, and the reporting tool
In our example activity diagram, we have two swimlanes because we have two objects
that control separate activities: a band manager and a reporting tool. The process starts
with the band manager electing to view the sales report for one of his bands. The
reporting tool then retrieves and displays all the bands that person manages and asks him
to choose one. After the band manager selects a band, the reporting tool retrieves the
sales information and displays the sales report. The activity diagram shows that
displaying the report is the last step in the process.
Component diagram
A component diagram provides a physical view of the system. Its purpose is to show the
dependencies that the software has on the other software components (e.g., software
libraries) in the system. The diagram can be shown at a very high level, with just the
large-grain components, or it can be shown at the component package level. [Note: The
Server named w3reporting.myco.com. The diagram shows the Reporting Tool component
drawn inside of IBM WebSphere, which in turn is drawn inside of the node
w3.reporting.myco.com. The Reporting Tool connects to its reporting database using the
Java language to IBM DB2's JDBC interface, which then communicates to the actual
DB2 database running on the server named db1.myco.com using native DB2
communication. In addition to talking to the reporting database, the Report Tool
component communicates via SOAP over HTTPS to the Billboard Service.
UML includes a set of graphic notation techniques to create visual models of object-oriented
software-intensive systems.
Diagrams overview
UML diagrams
Class diagram
Component diagram
Deployment diagram
Object diagram
Package diagram
Profile diagram
Activity diagram
Communication diagram
Sequence diagram
State diagram
Timing diagram
UML 2.2 has 14 types of diagrams divided into two categories. [13] Seven diagram types represent
structural information, and the other seven represent general types of behavior, including four
that represent different aspects of interactions. These diagrams can be categorized hierarchically
as shown in the following class diagram:
UML does not restrict UML element types to a certain diagram type. In general, every UML
element may appear on almost all types of diagrams; this flexibility has been partially restricted
in UML 2.0. UML profiles may define additional diagram types or extend existing diagrams with
additional notations.
In keeping with the tradition of engineering drawings, a comment or note explaining
usage, constraint, or intent is allowed in a UML diagram.
Structure diagrams
Structure diagrams emphasize the things that must be present in the system being modeled. Since
structure diagrams represent the structure, they are used extensively in documenting the software
architecture of software systems.
Class diagram: describes the structure of a system by showing the system's classes, their
attributes, and the relationships among the classes.
Component diagram: describes how a software system is split up into components and
shows the dependencies among these components.
Composite structure diagram: describes the internal structure of a class and the
collaborations that this structure makes possible.
Deployment diagram: describes the hardware used in system implementations and the
execution environments and artifacts deployed on the hardware.
Object diagram: shows a complete or partial view of the structure of an example modeled
system at a specific time.
Package diagram: describes how a system is split up into logical groupings by showing
the dependencies among these groupings.
Profile diagram: operates at the metamodel level to show stereotypes as classes with the
<<stereotype>> stereotype, and profiles as packages with the <<profile>> stereotype.
The extension relation (solid line with closed, filled arrowhead) indicates what
metamodel element a given stereotype is extending.
Behavior diagrams
Behavior diagrams emphasize what must happen in the system being modeled. Since behavior
diagrams illustrate the behavior of a system, they are used extensively to describe the
functionality of software systems.
Interaction diagrams
Interaction diagrams, a subset of behavior diagrams, emphasize the flow of control and data
among the things in the system being modeled:
In software and systems engineering, a use case (pronounced /jus/, a case in the use of a system)
is a list of steps, typically defining interactions between a role (known in UML as an "actor") and
a system, to achieve a goal. The actor can be a human or an external system.
Use case diagrams are behavior diagrams used to describe a set of actions (use cases) that some
system or systems (subject) should or can perform in collaboration with one or more external
users of the system (actors). Each use case should provide some observable and valuable result
to the actors or other stakeholders of the system.
Note that the UML 2.4 specification also describes use case diagrams as a specialization of class
diagrams, and class diagrams are structure diagrams.
Use case diagrams are in fact twofold - they are both behavior diagrams (because they describe
behavior of the system), and they are also structure diagrams - as a special case of class diagrams
where classifiers are restricted to be either actors or use cases related with association.
Major elements of the use case diagram are shown on the picture below.
UNIT IV
Architecture Concept
Architecture describes the building blocks of the system.
Architecture style:
The way we express
It provides an abstract description of the system.
Ex: two-tier client-server architecture
Architecture can have impact on 5 aspects
1. Understanding
2. Reuse
3. evolution
4. Analysis
5. Management
Classification of architecture style
A style is specified by its characteristics and describes a family of instances.
S/W architecture = {elements, form, rationale}
Elements can be classified into 3 types
1. Processing elements
2. Data elements
3. Connecting elements
Form is described as a set of weighted properties and relationships.
Rationale captures the motivation for particular choices.
Architecture styles are classified based on the following features
The kinds of components and connectors that are used in the style.
The ways in which control is shared, allocated and transferred among the components.
How data is communicated through the system.
How data and control interact.
The type of design.
Major categories of software architecture style
1. Data flow
2. Call and return
3. Interacting processes
4. Data-centered repository
5. Data sharing
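As a rough illustration of how the first two categories differ in the way control and data move, the Python sketch below solves the same small task twice. The task (keeping the lower-cased words of at least three letters) and all function names are hypothetical, chosen only to contrast the styles.

```python
# Data-flow style: independent filters connected by "pipes" (generators here).
# Each filter only knows about the stream it consumes and the stream it produces.
def read_words(text):
    for word in text.split():
        yield word


def normalise(words):
    for word in words:
        yield word.lower()


def drop_short(words, min_len=3):
    for word in words:
        if len(word) >= min_len:
            yield word


def pipeline(text):
    """Connect the filters; data flows through them one item at a time."""
    return list(drop_short(normalise(read_words(text))))


# Call-and-return style: a main routine holds control, calls each subordinate
# step in turn, and receives the complete result back before continuing.
def main_program(text):
    words = text.split()
    lowered = [word.lower() for word in words]
    return [word for word in lowered if len(word) >= 3]


sample = "The cat is ON the mat"
print(pipeline(sample))
print(main_program(sample))
```

Both versions produce the same result; the difference lies in the components and connectors used, which is exactly what the classification features above are meant to capture.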
Design Method
It provides information on how to generate a design solution.
It is a process of transferring knowledge.
A design method provides a procedural description of how to set about the task of
producing a design solution for a given problem, i.e. a method describes the tasks that the
designer is to perform and the order in which they should be performed.
Design is a creative process; a method cannot provide actual guidance about exactly how
each task should be performed for any specific problem.
Design methods provide design knowledge for rapid development; through notation and
procedures, a design method implicitly produces a design solution.
Design patterns
Design patterns are not procedural; they share some of the role of methods in that they
are concerned with the processes of producing a design solution, but also retain a strong
element of product modeling.
A pattern is a generic solution to a problem: it describes the problem and provides the
solution. Patterns are particularly useful in OOAD.
Methods concentrate on teaching about solutions, while patterns educate about problems; a good
pattern provides a good solution.
Some design representations
1. Black box notations (What)
DFD
ER Diagram
State transition diagram
State chart
Use case diagram
Class diagram
Activity diagram
2. White box Notations(How)
State chart
Class diagram
Object diagram
Sequence diagram
Pseudocode
2. Problem
3. Context
4. Force
5. Solution
Static & Dynamic Behaviour
6. Example
7. Resulting Context
8. Rationale
9. Related Patterns
10. Known uses
Pattern Thumbnail
Overview of a structure of the pattern
Antipatterns
Worst Practice
Bad solution
Guidelines for capturing patterns
1. Focus on practicability
2. Aggressive disregard of originality
3. Careful editing
4. Writers' workshop
Frameworks
Frameworks are a way of delivering application development patterns to support best
practice sharing during application development.
A framework is a way of presenting a generic solution to a problem that can be applied to all
levels in a development.
OO Framework
It is a set of cooperating classes that make up a reusable design for a specific class of
software.
It captures design decisions.
A single framework encompasses several design patterns.
A framework is executable software, whereas a design pattern represents knowledge and
experience about software.
Difference between Patterns & Frameworks
1. Design patterns are more abstract than frameworks.
2. Design patterns are smaller architectural elements than frameworks.
3. Design patterns are less specialized than frameworks.
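The distinction can be illustrated with a very small sketch. Below, `ReportFramework` is a hypothetical miniature framework: it is executable code that fixes the overall control flow and calls back into application classes. The design idea it embodies (a fixed skeleton with overridable hook methods, usually called the Template Method pattern) is the pattern; it exists as knowledge and could be implemented in many other ways. The class names and record layout are assumptions made for this example.

```python
from abc import ABC, abstractmethod


class ReportFramework(ABC):
    """Hypothetical mini framework: owns the control flow, exposes hooks."""

    def run(self, records):
        """Fixed skeleton; application code never overrides this."""
        rows = [self.format_row(r) for r in records if self.accept(r)]
        return "\n".join([self.header()] + rows)

    def header(self):
        """Hook with a default that applications may override."""
        return "REPORT"

    def accept(self, record):
        """Hook with a default: keep every record."""
        return True

    @abstractmethod
    def format_row(self, record):
        """Hook that every application must supply."""


class CdSalesReport(ReportFramework):
    """Application class: fills in the hooks, is called by the framework."""

    def accept(self, record):
        return record["units"] > 0

    def format_row(self, record):
        return f"{record['title']}: {record['units']}"


records = [{"title": "A", "units": 3}, {"title": "B", "units": 0}]
print(CdSalesReport().run(records))
```

Note the direction of the calls: the framework's `run` method calls the application's hooks, not the other way round, which is typical of frameworks.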
2. SSD (System Specification Diagram)
It is a network diagram that identifies the interactions between the entities that make up
the model of the system.
Two mechanisms
1. A data flow stream
2. State vector
JSP (Jackson Structured Programming)
1. Declarative knowledge:
It describes what tasks need to be performed at each step in the design
process.
2. Procedural knowledge:
It consists of knowledge about how to employ a given method in a particular
solution.
Three main components of software design methods
1. Representation part:
It consists of notations.
2. Process part:
Procedures to follow.
3. Heuristics or clichés:
They provide guidelines on the way to design.
Methods
1. Formal methods
2. Systematic methods
Stepwise Refinement
Subprograms are formed.
A process of gradual decomposition of the problem into smaller problems (termed stepwise
refinement).
The decomposition of tasks into subtasks and of data into data structures.
Ex. Eight queens problem
1. Program construction consists of a sequence of refinement steps, in each of which a task is
divided into a number of subtasks, accompanied where appropriate by a further
refinement.
2. The degree of modularity resulting from this process will determine the ease with which a
program can be adapted to meet changes in the requirements.
3. Notation
4. Set of design decisions based on specifications.
5. Long store purpose and philosophy of the problem.
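The eight queens example can be sketched in Python so that each function corresponds to one refinement step: the top-level task ("find all placements") is refined into "extend a partial placement by one row", which is in turn refined into "test whether a square is safe". This is a minimal backtracking sketch, not a reproduction of any particular published solution; the function names and the list-of-columns data representation are choices made here for illustration.

```python
def is_safe(placed, col):
    """Refinement 3: can a queen go in column `col` of the next row?

    `placed[r]` holds the column of the queen already placed in row r.
    """
    row = len(placed)
    for r, c in enumerate(placed):
        same_column = c == col
        same_diagonal = abs(c - col) == row - r
        if same_column or same_diagonal:
            return False
    return True


def extend(placed, n, solutions):
    """Refinement 2: extend a partial placement row by row, backtracking."""
    if len(placed) == n:
        solutions.append(tuple(placed))
        return
    for col in range(n):
        if is_safe(placed, col):
            placed.append(col)
            extend(placed, n, solutions)
            placed.pop()  # undo the choice and try the next column


def solve_queens(n=8):
    """Refinement 1 (top level): collect every complete placement."""
    solutions = []
    extend([], n, solutions)
    return solutions


print(len(solve_queens(8)))  # prints 92
```

The data refinement runs in parallel with the task refinement: the board is never stored as a grid, only as the list of chosen columns, which is a design decision taken at the second step.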
Incremental design
Black box to white box stages
Incremental development is often termed Rapid Application Development; the approach is
also associated with agile methods such as Extreme Programming.
It values
1. Individuals and interactions over processes and tools.
2. Working software over comprehensive documentation.
3. Customer collaboration over contract negotiation.
4. Responding to change (for example, at market level) over following a plan.
Technical reasons for incremental design
1. Black box model for the system(prototyping)
2. White box
Steps
1. Agree basic requirements
2. Develop architecture design
3. Undertake partial detailed design
4. Develop prototype, test and evaluate
5. Implementation.
Structured System Analysis & Structured design (SSA/SD)
SSA is a problem-related analysis with a set of descriptive forms that can also be used for architectural design.
SD is a solution-related approach.
Structured System Analysis techniques are centered on the use of the Data Flow Diagram (DFD).
Representation for Structured System Analysis
DFDs provide a problem oriented and functional viewpoint that does not involve making
any assumptions about hierarchy.
SSA guides the designer in building a model of the problem by using DFDs.
A data dictionary can also be used to record the information content of data flows. This
section provides P-Specs(Process Specification).
Representation used for structured Design
A structure chart is a very program-oriented form of description in the call-and-return
style.
SSA/SD process
The design begins by constructing a model of the top-level problem in terms of the
operations that are performed by the system. This description of the problem is then
transformed into a plan for a program, and this plan is in turn described in terms of the set of
subprograms that are used to perform the relevant operations.
Five steps
1. Construct an initial DFD to provide a top-level description of the problem.
2. Elaborate this into a layered hierarchy of DFDs supported by a data dictionary.
3. Use Transaction Analysis to divide the DFD into traceable units.
4. Perform a Transform Analysis on the DFD created for each transaction.
5. Merge the resulting structure charts to create the basic implementation plan, and refine
it to include any necessary error handling and other exceptions.
Steps 1 and 2 make up SSA; steps 3 to 5 make up SD.
Step1 & 2 : Structured System Analysis
Problem driven in nature
To produce a functional specification that describes what the system is to do.
Context diagram or Bubble diagrams are used.
This diagram encapsulates a whole system.
DFD schema provides
1. Top-down functional decomposition.
2. Event partitioning.
Draw a physical DFD to describe the initial system model in terms of relatively concrete
items.
Construct a logical DFD, which uses more abstract terms to describe the operations and data
flows.
DFDs provide
1. Frequency of information flow
2. Volume of data flow
3. Size of messages
4. Lifetime of an item of information
5. Don't try to handle exceptions and error conditions at this stage.
6. Don't flowchart.
Step 3 Transaction Analysis
To separate the components of a large design into a network of cooperating subsystems.
5 basic components
1. The event in the system's environment that causes the transaction to occur.
2. The stimulus that is applied to the system to inform it about the event.
3. The activity that is performed by the system as a result of the stimulus.
4. The response that this generates in terms of output from the system.
5. The effect that this has upon the environment.
Step 4: Transform Analysis
The DFD created to describe a given transaction can be divided, and a structure
chart prepared for each part separately.
Step 5: Merging and completing the design process.
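As an illustration of where transform analysis leads, the Python sketch below shows a call-and-return program of the kind a structure chart describes. It assumes a hypothetical DFD with three bubbles (validate marks, compute average, format report); the split into an input (afferent) branch, a central transform and an output (efferent) branch uses the standard structured design terms, but the task and every function name are invented for this example.

```python
def get_valid_marks(raw_lines):
    """Afferent (input) branch: bring data into the system and validate it."""
    marks = []
    for line in raw_lines:
        line = line.strip()
        if line.isdigit():  # silently skip anything that is not a whole number
            marks.append(int(line))
    return marks


def compute_average(marks):
    """Central transform: the essential processing of the transaction."""
    return sum(marks) / len(marks) if marks else 0.0


def format_report(average):
    """Efferent (output) branch: turn the result into output form."""
    return f"Average mark: {average:.1f}"


def main(raw_lines):
    """Top box of the structure chart: coordinates its three subordinates."""
    marks = get_valid_marks(raw_lines)
    average = compute_average(marks)
    return format_report(average)


print(main(["70", "x", " 80 ", "90"]))  # prints "Average mark: 80.0"
```

In step 5, one such hierarchy per transaction would be merged under a common top-level module, and error handling (for example, reporting the rejected line "x") would be added at that point rather than earlier.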
UNIT-V
The Domain Name System (DNS) is a hierarchical distributed naming system for computers,
services, or any resource connected to the Internet or a private network. It associates various
information with domain names assigned to each of the participating entities.
A Domain Name Service translates queries for domain names (which are meaningful to
humans) into IP addresses for the purpose of locating computer services and devices worldwide.
An often-used analogy to explain the Domain Name System is that it serves as the phone book
for the Internet by translating human-friendly computer hostnames into IP addresses. For
example, the domain name www.example.com translates to the addresses 192.0.43.10 (IPv4) and
2620:0:2d0:200::10 (IPv6).
The Domain Name System makes it possible to assign domain names to groups of Internet
resources and users in a meaningful way, independent of each entity's physical location. Because
of this, World Wide Web (WWW) hyperlinks and Internet contact information can remain
consistent and constant even if the current Internet routing arrangements change or the
participant uses a mobile device. Internet domain names are easier to remember than IP
addresses such as 208.77.188.166 (IPv4) or 2001:db8:1f70::999:de8:7648:6e8 (IPv6). Users take
advantage of this when they recite meaningful Uniform Resource Locators (URLs) and e-mail
addresses without having to know how the computer actually locates them.
The Domain Name System distributes the responsibility of assigning domain names and
mapping those names to IP addresses by designating authoritative name servers for each domain.
Authoritative name servers are assigned to be responsible for their particular domains, and in
turn can assign other authoritative name servers for their sub-domains. This mechanism has
made the DNS distributed and fault tolerant and has helped avoid the need for a single central
register to be continually consulted and updated.
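The delegation idea can be sketched with a toy in-memory model. This is not the DNS protocol and it performs no network access: `NameServer` is a hypothetical class, the zone contents are invented, and the address 192.0.2.10 comes from a range reserved for documentation. A real resolver also follows referrals itself instead of having servers forward queries, so the sketch simplifies that step.

```python
class NameServer:
    """Toy authoritative server for one zone (illustrative only)."""

    def __init__(self, zone):
        self.zone = zone
        self.addresses = {}    # full name -> address held directly by this zone
        self.delegations = {}  # sub-zone name -> NameServer responsible for it

    def delegate(self, subzone, server):
        """Hand responsibility for a sub-domain to another server."""
        self.delegations[subzone] = server

    def resolve(self, name):
        """Answer from local data, or pass the query down the delegation chain."""
        if name in self.addresses:
            return self.addresses[name]
        # Prefer the most specific (longest) delegated sub-zone that matches.
        for subzone in sorted(self.delegations, key=len, reverse=True):
            if name == subzone or name.endswith("." + subzone):
                return self.delegations[subzone].resolve(name)
        return None  # this hierarchy has no record for the name


# Build a three-level hierarchy: root -> com -> example.com
root = NameServer(".")
com = NameServer("com")
example = NameServer("example.com")
example.addresses["www.example.com"] = "192.0.2.10"
com.delegate("example.com", example)
root.delegate("com", com)

print(root.resolve("www.example.com"))
```

No single server holds the full table: the root only knows who is responsible for `com`, and `com` only knows who is responsible for `example.com`, which is what removes the need for one central register.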
In general, the Domain Name System also stores other types of information, such as the list of
mail servers that accept email for a given Internet domain. By providing a worldwide, distributed
keyword-based redirection service, the Domain Name System is an essential component of the
functionality of the Internet.
Other identifiers such as RFID tags, UPCs, international characters in email addresses and host
names, and a variety of other identifiers could all potentially use DNS.[1][2]
The Domain Name System also specifies the technical functionality of this database service. It
defines the DNS protocol, a detailed specification of the data structures and communication
exchanges used in DNS, as part of the Internet Protocol Suite.
Overview
The Internet maintains two principal namespaces, the domain name hierarchy[3] and the Internet
Protocol (IP) address spaces.[4] The Domain Name System maintains the domain name hierarchy
and provides translation services between it and the address spaces. Internet name servers and a
communication protocol implement the Domain Name System.[5] A DNS name server is a server
that stores the DNS records for a domain name, such as address (A) records, name server (NS)
records, and mail exchanger (MX) records (see also list of DNS record types); a DNS name
server responds with answers to queries against its database.
History
The practice of using a name as a simpler, more memorable abstraction of a host's numerical
address on a network dates back to the ARPANET era. Before the DNS was invented in 1983,
each computer on the network retrieved a file called HOSTS.TXT from a computer at SRI (now
SRI International).[6][7] The HOSTS.TXT file mapped names to numerical addresses. A hosts file
still exists on most modern operating systems by default and generally contains a mapping of
"localhost" to the IP address 127.0.0.1. Many operating systems use name resolution logic that
allows the administrator to configure selection priorities for available name resolution methods.
The rapid growth of the network made a centrally maintained, hand-crafted HOSTS.TXT file
unsustainable; it became necessary to implement a more scalable system capable of
automatically disseminating the requisite information.
At the request of Jon Postel, Paul Mockapetris invented the Domain Name System in 1983 and
wrote the first implementation. The original specifications were published by the Internet
Engineering Task Force in RFC 882 and RFC 883, which were superseded in November 1987 by
RFC 1034[3] and RFC 1035.[5] Several additional Requests for Comments have proposed various
extensions to the core DNS protocols.
In 1984, four Berkeley students (Douglas Terry, Mark Painter, David Riggle, and Songnian
Zhou) wrote the first Unix implementation, called The Berkeley Internet Name Domain (BIND)
Server.[8] In 1985, Kevin Dunlap of DEC significantly re-wrote the DNS implementation. Mike
Karels, Phil Almquist, and Paul Vixie have maintained BIND since then. BIND was ported to the
Windows NT platform in the early 1990s.
BIND was widely distributed, especially on Unix systems, and is the dominant DNS software in
use on the Internet.[9] With the heavy use and resulting scrutiny of its open-source code, as well
as increasingly sophisticated attack methods, many security flaws were discovered in
BIND. This contributed to the development of a number of alternative name server and
resolver programs. BIND version 9 was written from scratch and now has a security record
comparable to other modern DNS software.
Structure
The domain name space consists of a tree of domain names. Each node or leaf in the tree has
zero or more resource records, which hold information associated with the domain name. The
tree sub-divides into zones beginning at the root zone. A DNS zone may consist of only one
domain, or may consist of many domains and sub-domains, depending on the administrative
authority delegated to the manager.
[Figure: The hierarchical Domain Name System, organized into zones, each served by a name server.]
Administrative responsibility over any zone may be divided by creating additional zones.
Authority is said to be delegated for a portion of the old space, usually in the form of sub-
domains, to another nameserver and administrative entity. The old zone ceases to be authoritative
for the new zone.
The definitive descriptions of the rules for forming domain names appear in RFC 1035, RFC
1123, and RFC 2181. A domain name consists of one or more parts, technically called labels,
that are conventionally concatenated, and delimited by dots, such as example.com.
The right-most label conveys the top-level domain; for example, the domain name
www.example.com belongs to the top-level domain com.
The hierarchy of domains descends from right to left; each label to the left specifies a
subdivision, or subdomain of the domain to the right. For example: the label example
specifies a subdomain of the com domain, and www is a subdomain of example.com.
This tree of subdivisions may have up to 127 levels.
Each label may contain up to 63 characters. The full domain name may not exceed a total
length of 253 characters in its external dotted-label specification.[10] In the internal binary
representation of the DNS the maximum length requires 255 octets of storage.[3] In
practice, some domain registries may have shorter limits.
DNS names may technically consist of any character representable in an octet. However,
the allowed formulation of domain names in the DNS root zone, and most other sub-domains,
uses a preferred format and character set. The characters allowed in a label are a
subset of the ASCII character set, and include the characters a through z, A through Z,
digits 0 through 9, and the hyphen. This rule is known as the LDH rule (letters, digits,
hyphen). Domain names are interpreted in a case-independent manner.[11] Labels may not
start or end with a hyphen.[12]
A hostname is a domain name that has at least one IP address associated. For example,
the domain names www.example.com and example.com are also hostnames, whereas the
com domain is not.
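The naming rules listed above can be checked mechanically. The following Python sketch is illustrative only: the function names are hypothetical, it applies the LDH rule and the length limits exactly as stated in these notes, and it does not cover internationalized names or registry-specific restrictions.

```python
import re

# LDH rule: letters, digits and hyphen only, 1 to 63 characters,
# and no hyphen at the start or end of a label.
_LABEL_RE = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def split_labels(name):
    """Split a dotted name into labels, ignoring one trailing root dot."""
    if name.endswith("."):
        name = name[:-1]
    return name.split(".")


def is_valid_hostname(name):
    """Return True if the name satisfies the limits described in the text."""
    labels = split_labels(name)
    dotted = ".".join(labels)
    if not dotted or len(dotted) > 253:   # external dotted form: at most 253 characters
        return False
    if len(labels) > 127:                 # the tree may have up to 127 levels
        return False
    return all(_LABEL_RE.fullmatch(label) for label in labels)


def parent_domains(name):
    """List the name and each enclosing domain, most specific first."""
    labels = split_labels(name)
    return [".".join(labels[i:]) for i in range(len(labels))]
```

For example, parent_domains("www.example.com") walks the hierarchy from right to left in reverse: the full name, then example.com, then the top-level domain com.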
The permitted character set of the DNS prevented the representation of names and words of
many languages in their native alphabets or scripts. ICANN has approved the Internationalizing
Domain Names in Applications (IDNA) system, which maps Unicode strings into the valid DNS
character set using Punycode. In 2009 ICANN approved the installation of IDN country code
top-level domains. In addition, many registries of the existing top-level domains (TLDs)
have adopted IDNA.
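As a small illustration of the IDNA mapping, Python's standard library ships an "idna" codec that converts between Unicode labels and their Punycode-based ASCII form. This is a sketch under one stated assumption: the built-in codec follows the older IDNA 2003 rules, so results for some edge cases may differ from what current registries accept. The helper names are hypothetical.

```python
def to_ascii(name):
    """Convert a Unicode domain name to its ASCII-compatible (xn--) form."""
    return name.encode("idna").decode("ascii")


def to_unicode(name):
    """Convert an ASCII-compatible domain name back to Unicode."""
    return name.encode("ascii").decode("idna")


if __name__ == "__main__":
    ascii_form = to_ascii("bücher.example")
    print(ascii_form)              # the label is rewritten with an xn-- prefix
    print(to_unicode(ascii_form))  # round-trips to the original spelling
```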
Name servers
The Domain Name System is maintained by a distributed database system, which uses the client-
server model. The nodes of this database are the name servers. Each domain has at least one
authoritative DNS server that publishes information about that domain and the name servers of
any domains subordinate to it. The top of the hierarchy is served by the root nameservers, the
servers to query when looking up (resolving) a TLD.
An authoritative name server is a name server that gives answers that have been configured by
an original source, for example, the domain administrator or by dynamic DNS methods, in
contrast to answers that were obtained via a regular DNS query to another name server. An
authoritative-only name server only returns answers to queries about domain names that have
been specifically configured by the administrator.
An authoritative name server can either be a master server or a slave server. A master server is a
server that stores the original (master) copies of all zone records. A slave server uses an
automatic updating mechanism of the DNS protocol in communication with its master to
maintain an identical copy of the master records.
Every DNS zone must be assigned a set of authoritative name servers that are installed in NS
records in the parent zone.
When domain names are registered with a domain name registrar, their installation at the domain
registry of a top level domain requires the assignment of a primary name server and at least one
secondary name server. The requirement of multiple name servers aims to make the domain still
functional even if one name server becomes inaccessible or inoperable.[13] The designation of a
primary name server is solely determined by the priority given to the domain name registrar. For
this purpose, generally only the fully qualified domain name of the name server is required,
unless the servers are contained in the registered domain, in which case the corresponding IP
address is needed as well.
Primary name servers are often master name servers, while secondary name servers may be
implemented as slave servers.
An authoritative server indicates its status of supplying definitive answers, deemed authoritative,
by setting a software flag (a protocol structure bit), called the Authoritative Answer (AA) bit in its
responses.[5] This flag is usually reproduced prominently in the output of DNS administration
query tools (such as dig) to indicate that the responding name server is an authority for the
domain name in question.[5]
In principle, authoritative name servers are sufficient for the operation of the Internet. However,
with only authoritative name servers operating, every DNS query must start with recursive
queries at the root zone of the Domain Name System and each user system must implement
resolver software capable of recursive operation.
To improve efficiency, reduce DNS traffic across the Internet, and increase performance in end-
user applications, the Domain Name System supports DNS cache servers which store DNS query
results for a period of time determined in the configuration (time-to-live) of the domain name
record in question. Typically, such caching DNS servers, also called DNS caches, also implement
the recursive algorithm necessary to resolve a given name starting with the DNS root through to
the authoritative name servers of the queried domain. With this function implemented in the
name server, user applications gain efficiency in design and operation.
The combination of DNS caching and recursive functions in a name server is not mandatory; the
functions can be implemented independently in servers for special purposes.
Internet service providers typically provide recursive and caching name servers for their
customers. In addition, many home networking routers implement DNS caches and recursors to
improve efficiency in the local network.
DNS resolvers
The client-side of the DNS is called a DNS resolver. It is responsible for initiating and
sequencing the queries that ultimately lead to a full resolution (translation) of the resource
sought, e.g., translation of a domain name into an IP address.
A non-recursive query is one in which the DNS server provides a record for a domain for
which it is authoritative itself, or it provides a partial result without querying other
servers.
A recursive query is one for which the DNS server will fully answer the query (or give an
error) by querying other name servers as needed. DNS servers are not required to support
recursive queries.
The resolver, or another DNS server acting recursively on behalf of the resolver, negotiates use
of recursive service using bits in the query headers.
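The header bits mentioned in this section (the Authoritative Answer flag and the recursion flags) live in the second 16-bit word of the DNS message header. The sketch below decodes them; the bit positions follow the header layout in RFC 1035, while the function name and the returned dictionary keys are hypothetical choices made for illustration.

```python
import struct

# Bit masks within the 16-bit flags word of the DNS header (RFC 1035).
QR = 0x8000  # message is a response
AA = 0x0400  # Authoritative Answer
TC = 0x0200  # message was truncated
RD = 0x0100  # Recursion Desired (set by the client)
RA = 0x0080  # Recursion Available (set by the server)


def decode_flags(message):
    """Decode the ID and selected flag bits from the start of a DNS message."""
    ident, flags = struct.unpack("!HH", message[:4])
    return {
        "id": ident,
        "is_response": bool(flags & QR),
        "authoritative": bool(flags & AA),
        "truncated": bool(flags & TC),
        "recursion_desired": bool(flags & RD),
        "recursion_available": bool(flags & RA),
    }
```

A stub resolver would typically set RD in its query and then check RA in the reply to learn whether the server was willing to recurse.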
Resolving usually entails iterating through several name servers to find the needed information.
However, some resolvers function more simply by communicating only with a single name
server. These simple resolvers (called "stub resolvers") rely on a recursive name server to
perform the work of finding information for them.
Operation
Domain name resolvers determine the appropriate domain name servers responsible for the
domain name in question by a sequence of queries starting with the right-most (top-level)
domain label.
1. A network host is configured with an initial cache (so-called hints) of the known
addresses of the root nameservers. Such a hint file is updated periodically by an
administrator from a reliable source.
2. A query to one of the root servers to find the server authoritative for the top-level domain.
3. A query to the obtained TLD server for the address of a DNS server authoritative for the
second-level domain.
4. Repetition of the previous step to process each domain name label in sequence, until the
final step which returns the IP address of the host sought.
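The steps above can be sketched as a toy simulation. Everything in this example is an assumption made for illustration: the server names, the in-memory SERVERS table, and the address 192.0.2.10 are hypothetical, and no network traffic is involved. It only shows the control flow of following referrals from a root hint down to an authoritative answer.

```python
# Hypothetical root hint (step 1).
ROOT_HINTS = ["a.root-servers.example"]

# Hypothetical data held by each simulated name server.
SERVERS = {
    "a.root-servers.example": {
        "referrals": {"org": "tld.org-servers.example"},
        "addresses": {},
    },
    "tld.org-servers.example": {
        "referrals": {"example.org": "ns1.example.org"},
        "addresses": {},
    },
    "ns1.example.org": {
        "referrals": {},
        "addresses": {"www.example.org": "192.0.2.10"},
    },
}


def query(server, name):
    """Simulate one non-recursive query against a single server."""
    data = SERVERS[server]
    if name in data["addresses"]:
        return "answer", data["addresses"][name]
    labels = name.split(".")
    # Try the longest matching suffix first, then shorter ones.
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in data["referrals"]:
            return "referral", data["referrals"][suffix]
    return "nxdomain", None


def resolve(name, max_steps=10):
    """Follow referrals from the root until an answer or a dead end."""
    server = ROOT_HINTS[0]
    path = [server]
    for _ in range(max_steps):
        kind, value = query(server, name)
        if kind == "answer":
            return value, path
        if kind == "nxdomain":
            return None, path
        server = value            # steps 2 to 4: move one level down the tree
        path.append(server)
    raise RuntimeError("too many referrals")
```

Calling resolve("www.example.org") visits the root, the org server, and finally ns1.example.org, which mirrors the sequence of queries described in the list.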
The mechanism in this simple form would place a large operating burden on the root servers,
with every search for an address starting by querying one of them. Being as critical as they are to
the overall function of the system, such heavy use would create an insurmountable bottleneck for
trillions of queries placed every day. In practice caching is used in DNS servers to overcome this
problem, and as a result, root nameservers actually are involved with very little of the total
traffic.
Name servers in delegations are identified by name, rather than by IP address. This means that a
resolving name server must issue another DNS request to find out the IP address of the server to
which it has been referred. If the name given in the delegation is a subdomain of the domain for
which the delegation is being provided, there is a circular dependency. In this case the
nameserver providing the delegation must also provide one or more IP addresses for the
authoritative nameserver mentioned in the delegation. This information is called glue. The
delegating name server provides this glue in the form of records in the additional section of the
DNS response, and provides the delegation in the authority section of the response.
For example, if the authoritative name server for example.org is ns1.example.org, a computer
trying to resolve www.example.org first resolves ns1.example.org. Since ns1 is contained in
example.org, this requires resolving example.org first, which presents a circular dependency. To
break the dependency, the nameserver for the org top level domain includes glue along with the
delegation for example.org. The glue records are address records that provide IP addresses for
ns1.example.org. The resolver uses one or more of these IP addresses to query one of the domain's
authoritative servers, which allows it to complete the DNS query.
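The condition that makes glue necessary can be expressed as a simple name comparison. The following is a minimal sketch with hypothetical function names; it only tests whether a name server's name lies inside the zone being delegated, which is the circular case described above.

```python
def is_subdomain(name, zone):
    """Return True if name equals zone or lies underneath it."""
    name = name.lower().rstrip(".")
    zone = zone.lower().rstrip(".")
    return name == zone or name.endswith("." + zone)


def needs_glue(delegated_zone, ns_name):
    """A delegation needs glue when its name server sits inside the delegated zone."""
    return is_subdomain(ns_name, delegated_zone)
```

With the example from the text, needs_glue("example.org", "ns1.example.org") is true, whereas a name server located in an unrelated domain can be resolved independently and needs no glue.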
Record caching
Because of the large volume of DNS requests generated for the public Internet, the designers
wished to provide a mechanism to reduce the load on individual DNS servers. To this end, the
DNS resolution process allows for caching of records for a period of time after an answer. This
entails the local recording and subsequent consultation of the copy instead of initiating a new
request upstream. The time for which a resolver caches a DNS response is determined by a value
called the time to live (TTL) associated with every record. The TTL is set by the administrator of
the DNS server handing out the authoritative response. The period of validity may vary from just
seconds to days or even weeks.
Some resolvers may override TTL values, as the protocol supports caching for up to 68 years or
no caching at all. Negative caching, i.e. the caching of the fact of non-existence of a record, is
determined by name servers authoritative for a zone which must include the Start of Authority
(SOA) record when reporting no data of the requested type exists. The smaller of the MINIMUM
field of the SOA record and the TTL of the SOA itself is used to establish the TTL for the
negative answer.
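A resolver cache driven by TTL values might be sketched as follows. This is an illustration, not how any particular resolver is implemented: the class and method names are hypothetical, the clock is injectable so that expiry can be demonstrated without waiting, and the negative-caching TTL is assumed to be the smaller of the SOA MINIMUM field and the SOA record's own TTL.

```python
import time


class DnsCache:
    """Toy TTL-based cache keyed by (name, record type)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # (name, rtype) -> (expires_at, value)

    def put(self, name, rtype, value, ttl):
        """Store a positive answer for ttl seconds."""
        self._entries[(name.lower(), rtype)] = (self._clock() + ttl, value)

    def put_negative(self, name, rtype, soa_ttl, soa_minimum):
        """Remember that no such record exists (value None)."""
        self.put(name, rtype, None, min(soa_ttl, soa_minimum))

    def get(self, name, rtype):
        """Return (hit, value); expired entries count as misses."""
        key = (name.lower(), rtype)
        entry = self._entries.get(key)
        if entry is None:
            return False, None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]
            return False, None
        return True, value
```

A hit with value None represents a cached negative answer, so the caller can tell "known not to exist" apart from "not in the cache".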
Reverse lookup
A reverse lookup is a query of the DNS for domain names when the IP address is known.
Multiple domain names may be associated with an IP address. The DNS stores IP addresses in
the form of domain names as specially formatted names in pointer (PTR) records within the
infrastructure top-level domain arpa. For IPv4, the domain is in-addr.arpa. For IPv6, the reverse
lookup domain is ip6.arpa. The IP address is represented as a name in reverse-ordered octet
representation for IPv4, and reverse-ordered nibble representation for IPv6.
When performing a reverse lookup, the DNS client converts the address into these formats, and
then queries the name for a PTR record following the delegation chain as for any DNS query. For
example, assume the IPv4 address 208.80.152.2 is assigned to Wikimedia. It is represented as a
DNS name in reverse order like this: 2.152.80.208.in-addr.arpa. When the DNS resolver gets a
PTR (reverse-lookup) request, it begins by querying the root servers (which point to ARIN's
servers for the 208.in-addr.arpa zone). On ARIN's servers, 152.80.208.in-addr.arpa is assigned to
Wikimedia, so the resolver sends another query to the Wikimedia nameserver for
2.152.80.208.in-addr.arpa, which results in an authoritative response.
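The conversion from an address to its reverse-lookup name can be tried with Python's standard ipaddress module, whose reverse_pointer attribute produces the reverse-ordered octet (IPv4) or nibble (IPv6) form. The wrapper name below is hypothetical, and the IPv6 address is a documentation-range example, not one taken from these notes.

```python
import ipaddress


def reverse_name(address):
    """Return the PTR query name for an IPv4 or IPv6 address string."""
    return ipaddress.ip_address(address).reverse_pointer


if __name__ == "__main__":
    print(reverse_name("208.80.152.2"))  # name under in-addr.arpa
    print(reverse_name("2001:db8::1"))   # 32 nibble labels under ip6.arpa
```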
Client lookup
Users generally do not communicate directly with a DNS resolver. Instead DNS resolution takes
place transparently in applications such as web browsers, e-mail clients, and other Internet
applications. When an application makes a request that requires a domain name lookup, such
programs send a resolution request to the DNS resolver in the local operating system, which in
turn handles the communications required.
The DNS resolver will almost invariably have a cache (see above) containing recent lookups. If
the cache can provide the answer to the request, the resolver will return the value in the cache to
the program that made the request. If the cache does not contain the answer, the resolver will
send the request to one or more designated DNS servers. In the case of most home users, the
Internet service provider to which the machine connects will usually supply this DNS server:
such a user will either have configured that server's address manually or allowed DHCP to set it;
however, where systems administrators have configured systems to use their own DNS servers,
their DNS resolvers point to separately maintained nameservers of the organization. In any event,
the name server thus queried will follow the process outlined above, until it either successfully
finds a result or does not. It then returns its results to the DNS resolver; assuming it has found a
result, the resolver duly caches that result for future use, and hands the result back to the
software which initiated the request.
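From an application's point of view, the whole process above is usually hidden behind a single library call. The sketch below uses Python's socket.getaddrinfo, which hands the request to the operating system's resolver; the wrapper name is hypothetical, and which servers or caches are consulted depends entirely on the local system configuration.

```python
import socket


def lookup_addresses(host):
    """Ask the operating system's resolver for the addresses of a host."""
    infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    addresses = []
    for _family, _socktype, _proto, _canonname, sockaddr in infos:
        if sockaddr[0] not in addresses:   # drop duplicates, keep order
            addresses.append(sockaddr[0])
    return addresses
```

An address literal such as "127.0.0.1" is returned unchanged without any DNS traffic, while a host name triggers the lookup path described in this section.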
Broken resolvers
An additional level of complexity emerges when resolvers violate the rules of the DNS protocol.
A number of large ISPs have configured their DNS servers to violate rules (presumably to allow
them to run on less-expensive hardware than a fully compliant resolver), such as by disobeying
TTLs, or by indicating that a domain name does not exist just because one of its name servers
does not respond.[14]
As a final level of complexity, some applications (such as web-browsers) also have their own
DNS cache, in order to reduce the use of the DNS resolver library itself. This practice can add
extra difficulty when debugging DNS issues, as it obscures the freshness of data, and/or what
data comes from which cache. These caches typically use very short caching times, on the order
of one minute.
Internet Explorer represents a notable exception: versions up to IE 3.x cache DNS records for 24
hours by default. Internet Explorer 4.x and later versions (up to IE 8) decrease the default time
out value to half an hour, which may be changed in corresponding registry keys.[15]
Other applications
The system outlined above provides a somewhat simplified scenario. The Domain Name System
includes several other functions:
E-mail blacklists: The DNS is used for efficient storage and distribution of the IP
addresses of blacklisted e-mail hosts. The usual method is to put the IP address of the
subject host into a sub-domain of a higher-level domain name, and to resolve that name to
different records to indicate a positive or a negative result. Here is a hypothetical example
blacklist:
o 102.3.4.5 is blacklisted => 5.4.3.102.blacklist.example is created and resolves to
127.0.0.1
o 102.3.4.6 is not => 6.4.3.102.blacklist.example is not found, or defaults to
127.0.0.2
o E-mail servers can then query blacklist.example through the DNS mechanism to
find out if a specific host connecting to them is in the blacklist. Today many of
such blacklists, either free or subscription-based, are available mainly for use by
email administrators and anti-spam software.
Software updates: many anti-virus and commercial software products now use the DNS to
store version numbers of the latest software updates so client computers do not need to
connect to the update servers every time. For these types of applications, the cache time
of the DNS records is usually shorter.
Sender Policy Framework and DomainKeys, instead of creating their own record types,
were designed to take advantage of another DNS record type, the TXT record.
To provide resilience in the event of computer failure, multiple DNS servers are usually
provided for coverage of each domain, and at the top level, thirteen very powerful root
servers exist, with additional "copies" of several of them distributed worldwide via
Anycast.
Dynamic DNS (sometimes called DDNS) allows clients to update their DNS entry as
their IP address changes, as it does, for example, when moving between ISPs or mobile
hot spots.
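For the e-mail blacklist item in the list above, the query name can be built by reversing the octets of the address and appending the blacklist zone. The sketch below follows the hypothetical blacklist.example convention used in that item (127.0.0.1 means listed); real blacklists define their own return codes, and the function names are assumptions made for illustration.

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Build the name to look up for an IPv4 address in a blacklist zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(answer):
    """Interpret a lookup result using the hypothetical convention in the text.

    answer is the returned address string, or None if the name was not found.
    """
    return answer == "127.0.0.1"
```

A mail server would resolve dnsbl_query_name(client_ip, "blacklist.example") through its normal resolver and pass the result to is_listed.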
Protocol details
DNS primarily uses User Datagram Protocol (UDP) on port number 53 to serve requests.[5] DNS
queries consist of a single UDP request from the client followed by a single UDP reply from the
server. The Transmission Control Protocol (TCP) is used when the response data size exceeds
512 bytes, or for tasks such as zone transfers. Some resolver implementations use TCP for all
queries.
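To make the single-datagram exchange concrete, the following sketch assembles the bytes of a minimal DNS query as laid out in RFC 1035: a 12-octet header followed by one question. The function names are hypothetical, the default type and class codes (1 for A, 1 for IN) are the standard values, and the code only builds the packet; actually sending it to port 53 over UDP is left out so that the example runs without network access.

```python
import struct


def encode_name(name):
    """Encode a dotted name as length-prefixed labels ending in a zero octet."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 1 <= len(raw) <= 63:
            raise ValueError("label must be 1 to 63 octets long")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name, qtype=1, qclass=1, query_id=0, recursion_desired=True):
    """Build a DNS query message containing a single question."""
    flags = 0x0100 if recursion_desired else 0x0000   # RD bit
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", qtype, qclass)
    return header + question
```

A query for www.example.com built this way is only a few dozen octets long, far below the 512-byte limit mentioned above; it is the responses that may grow large enough to require TCP.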
A Resource Record (RR) is the basic data element in the domain name system. Each record has a
type (A, MX, etc.), an expiration time limit, a class, and some type-specific data. Resource
records of the same type define a resource record set (RRset). The order of resource records in a
set, returned by a resolver to an application, is undefined, but often servers implement round-
robin ordering to achieve load balancing. DNSSEC, however, works on complete resource record
sets in a canonical order.
When sent over an IP network, all records use the common format specified in RFC 1035:[16]
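RR (Resource record) fields
Field      Description                                         Length (octets)
NAME       Name of the node to which this record pertains      variable
TYPE       Type of RR in numeric form (e.g. 15 for MX RRs)     2
CLASS      Class code                                          2
TTL        Count of seconds that the RR stays valid            4
RDLENGTH   Length of the RDATA field                           2
RDATA      Additional RR-specific data                         variable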
NAME is the fully qualified domain name of the node in the tree. On the wire, the name may be
shortened using label compression where ends of domain names mentioned earlier in the packet
can be substituted for the end of the current domain name.
TYPE is the record type. It indicates the format of the data and it gives a hint of its intended use.
For example, the A record is used to translate from a domain name to an IPv4 address, the NS
record lists which name servers can answer lookups on a DNS zone, and the MX record specifies
the mail server used to handle mail for a domain specified in an e-mail address (see also List of
DNS record types).
RDATA is data of type-specific relevance, such as the IP address for address records, or the
priority and hostname for MX records. Well known record types may use label compression in
the RDATA field, but "unknown" record types must not (RFC 3597).
The CLASS of a record is set to IN (for Internet) for common DNS records involving Internet
hostnames, servers, or IP addresses. In addition, the classes Chaos (CH) and Hesiod (HS) exist.[17]
Each class is an independent name space with potentially different delegations of DNS zones.
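The record layout described above (NAME, then TYPE, CLASS, TTL and a length-prefixed RDATA) can be decoded with a few lines of Python. This is a deliberately limited sketch: the function names are hypothetical, and names that use label compression are rejected rather than followed, so it only handles records whose NAME is written out in full.

```python
import struct


def read_name(buf, offset):
    """Read an uncompressed name; return (dotted_name, next_offset)."""
    labels = []
    while True:
        length = buf[offset]
        if length & 0xC0:
            raise ValueError("compressed names are not handled in this sketch")
        offset += 1
        if length == 0:          # zero-length label marks the root
            break
        labels.append(buf[offset:offset + length].decode("ascii"))
        offset += length
    return ".".join(labels), offset


def parse_rr(buf, offset=0):
    """Parse one resource record; return (record_dict, next_offset)."""
    name, offset = read_name(buf, offset)
    # TYPE (2 octets), CLASS (2), TTL (4), RDLENGTH (2), network byte order.
    rtype, rclass, ttl, rdlength = struct.unpack_from("!HHIH", buf, offset)
    offset += 10
    rdata = buf[offset:offset + rdlength]
    record = {"name": name, "type": rtype, "class": rclass, "ttl": ttl, "rdata": rdata}
    return record, offset + rdlength
```

For an A record (type 1, class IN) the four RDATA octets are the IPv4 address itself; other types need their own type-specific decoding of RDATA.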
In addition to resource records defined in a zone file, the domain name system also defines
several request types that are used only in communication with other DNS nodes (on the wire),
such as when performing zone transfers (AXFR/IXFR) or for EDNS (OPT).
The domain name system supports wildcard domain names which are names that start with the
asterisk label, '*', e.g., *.example.[3][18] DNS records belonging to wildcard domain names specify
rules for generating resource records within a single DNS zone by substituting whole labels with
matching components of the query name, including any specified descendants. For example, in
the DNS zone x.example, a wildcard MX record at *.x.example specifies that all subdomains (including
subdomains of subdomains) of x.example use the mail exchanger a.x.example. The records for
a.x.example are needed to specify the mail exchanger. As this has the result of excluding this
domain name and its subdomains from the wildcard matches, all subdomains of a.x.example
must be defined in a separate wildcard statement.
The role of wildcard records was refined in RFC 4592, because the original definition in RFC
1034 was incomplete and resulted in misinterpretations by implementers.[18]
Protocol extensions
The original DNS protocol had limited provisions for extension with new features. In 1999, Paul
Vixie published in RFC 2671 an extension mechanism, called Extension Mechanisms for DNS
(EDNS), that introduced optional protocol elements without increasing overhead when not in use.
This was accomplished through the OPT pseudo-resource record that only exists in wire
transmissions of the protocol, but not in any zone files. Initial extensions were also suggested
(EDNS0), such as increasing the DNS message size in UDP datagrams.
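As a rough illustration of how the OPT pseudo-record carries the larger message size, the sketch below builds one in wire format. The field reuse shown here (root name, type code 41, CLASS holding the UDP payload size, TTL holding the extended code, version and flags) follows the EDNS0 layout as specified in RFC 6891, the successor of RFC 2671; the function name and the default size are assumptions chosen for the example.

```python
import struct

OPT_TYPE = 41  # type code assigned to the OPT pseudo-record


def build_opt_record(udp_payload_size=4096):
    """Build an OPT pseudo-record advertising the sender's UDP payload size."""
    if not 512 <= udp_payload_size <= 0xFFFF:
        raise ValueError("payload size must fit in 16 bits and be at least 512")
    name = b"\x00"   # the root domain
    ttl_field = 0    # extended RCODE 0, version 0, no flags set
    rdlength = 0     # no options carried in RDATA
    return name + struct.pack("!HHIH", OPT_TYPE, udp_payload_size, ttl_field, rdlength)
```

A client would append this record to the additional section of its query and raise the ARCOUNT header field accordingly; the record never appears in a zone file.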
Dynamic DNS updates use the UPDATE DNS opcode to add or remove resource records
dynamically from a zone database maintained on an authoritative DNS server. The feature is
described in RFC 2136. This facility is useful to register network clients into the DNS when they
boot or become otherwise available on the network. Since a booting client may be assigned a
different IP address each time from a DHCP server, it is not possible to provide static DNS
assignments for such clients.
Security issues
Originally, security concerns were not major design considerations for DNS software or any
software for deployment on the early Internet, as the network was not open for participation by
the general public. However, the expansion of the Internet into the commercial sector in the
1990s changed the requirements for security measures to protect data integrity and user
authentication.
Several vulnerability issues were discovered and exploited by malicious users. One such issue is
DNS cache poisoning, in which data is distributed to caching resolvers under the pretense of
being an authoritative origin server, thereby polluting the data store with potentially false
information and long expiration times (time-to-live). Subsequently, legitimate application
requests may be redirected to network hosts operated with malicious intent.
DNS responses are traditionally not cryptographically signed, leading to many attack
possibilities; the Domain Name System Security Extensions (DNSSEC) modify DNS to add
support for cryptographically signed responses. Several extensions have been devised to secure
zone transfers as well.
Some domain names may be used to achieve spoofing effects. For example, paypal.com and
paypa1.com are different names, yet users may be unable to distinguish them in a graphical user
interface depending on the user's chosen typeface. In many fonts the letter l and the numeral 1
look very similar or even identical. This problem is acute in systems that support
internationalized domain names, since many character codes in ISO 10646 may appear identical
on typical computer screens. This vulnerability is occasionally exploited in phishing.[19]
Techniques such as forward-confirmed reverse DNS can also be used to help validate DNS
results.
The right to use a domain name is delegated by domain name registrars, which are accredited by
the Internet Corporation for Assigned Names and Numbers (ICANN), the organization charged
with overseeing the name and number systems of the Internet. In addition to ICANN, each top-
level domain (TLD) is maintained and serviced technically by an administrative organization,
operating a registry. A registry is responsible for maintaining the database of names registered
within the TLD it administers. The registry receives registration information from each domain
name registrar authorized to assign names in the corresponding TLD and publishes the
information using a special service, the whois protocol.
ICANN publishes the complete list of TLD registries and domain name registrars. Registrant
information associated with domain names is maintained in an online database accessible with
the WHOIS service. For most of the more than 240 country code top-level domains (ccTLDs),
the domain registries maintain the WHOIS (Registrant, name servers, expiration dates, etc.)
information. For instance, DENIC, Germany NIC, holds the DE domain data. Since about 2001,
most gTLD registries have adopted this so-called thick registry approach, i.e. keeping the
WHOIS data in central registries instead of registrar databases.
For COM and NET domain names, a thin registry model is used: the domain registry (e.g.
VeriSign) holds basic WHOIS (registrar and name servers, etc.) data. One can find the detailed
WHOIS (registrant, name servers, expiry dates, etc.) at the registrars.
Some domain name registries, often called network information centers (NIC), also function as
registrars to end-users. The major generic top-level domain registries, such as for the COM,
NET, ORG, INFO domains, use a registry-registrar model consisting of many domain name
registrars.[20][21] In this method of management, the registry only manages the domain name
database and the relationship with the registrars. The registrants (users of a domain name) are
customers of the registrar, in some cases through additional layers of resellers.
Internet standards
The Domain Name System is defined by Request for Comments (RFC) documents published by
the Internet Engineering Task Force (Internet standards). The following is a list of RFCs that
define the DNS protocol.
Security
Electronic mail
An email message consists of three components, the message envelope, the message header, and
the message body. The message header contains control information, including, minimally, an
originator's email address and one or more recipient addresses. Usually descriptive information is
also added, such as a subject header field and a message submission date/time stamp.
Originally a text-only (7-bit ASCII and others) communications medium, email was extended to
carry multi-media content attachments, a process standardized in RFC 2045 through 2049.
Collectively, these RFCs have come to be called Multipurpose Internet Mail Extensions (MIME).
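The header and body structure described above can be illustrated with Python's standard email package. This is a sketch under stated assumptions: the function name, the attachment file name, and any addresses passed in are hypothetical. Note that the message envelope is not part of this object; envelope addresses are supplied separately when the message is handed to SMTP.

```python
from email.message import EmailMessage
from email.utils import formatdate


def build_message(sender, recipient, subject, body, attachment=None):
    """Build a message with the minimal header fields and a text body."""
    msg = EmailMessage()
    msg["From"] = sender                       # originator's address
    msg["To"] = recipient                      # recipient address
    msg["Subject"] = subject                   # descriptive header field
    msg["Date"] = formatdate(localtime=True)   # submission date/time stamp
    msg.set_content(body)                      # plain-text body
    if attachment is not None:
        # Adding binary content turns the message into a MIME multipart message.
        msg.add_attachment(attachment, maintype="application",
                           subtype="octet-stream", filename="data.bin")
    return msg
```

Printing the result with str(msg) shows the header fields, a blank line, and then the body, which is the on-the-wire form that MIME extends for non-text content.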
Operation overview
The steps below show a typical sequence of events [43] that takes place when Alice
composes a message using her mail user agent (MUA). She enters the email address of her
correspondent, and hits the "send" button.
1. Her MUA formats the message in email format and uses the Submission Protocol (a
profile of the Simple Mail Transfer Protocol (SMTP), see RFC 6409) to send the message
to the local mail submission agent (MSA), in this case smtp.a.org, run by Alice's internet
service provider (ISP).
2. The MSA looks at the destination address provided in the SMTP protocol (not from the
message header), in this case Bob's address at b.org. An Internet email address is a string of the form
localpart@exampledomain. The part before the @ sign is the local part of the address,
often the username of the recipient, and the part after the @ sign is a domain name or a
fully qualified domain name. The MSA resolves a domain name to determine the fully
qualified domain name of the mail exchange server in the Domain Name System (DNS).
3. The DNS server for the b.org domain, ns.b.org, responds with any MX records listing the
mail exchange servers for that domain, in this case mx.b.org, a message transfer agent
(MTA) server run by Bob's ISP.
4. smtp.a.org sends the message to mx.b.org using SMTP.
This server may need to forward the message to other MTAs before the message reaches the final
message delivery agent (MDA).
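Step 2 above splits the recipient address at the @ sign before the DNS lookup. A minimal sketch of that split follows, assuming a plain address with no quoted local part. The function name split_address is illustrative and not part of any mail library.

```python
def split_address(address: str) -> tuple[str, str]:
    """Split an email address into (local part, domain).

    Assumption: the address is a plain "localpart@domain" string; quoted
    local parts and comments permitted by RFC 5322 are not handled.
    """
    # The domain is everything after the last "@" sign.
    local_part, sep, domain = address.rpartition("@")
    if not sep or not local_part or not domain:
        raise ValueError(f"not a valid address: {address!r}")
    return local_part, domain


local_part, domain = split_address("localpart@exampledomain")
print(local_part)  # localpart
print(domain)      # exampledomain
```

The domain returned here is the name the MSA would look up in the DNS to find the MX records.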
That sequence of events applies to the majority of email users. However, there are many
alternative possibilities and complications to the email system:
Alice or Bob may use a client connected to a corporate email system, such as IBM Lotus
Notes or Microsoft Exchange. These systems often have their own internal email format
and their clients typically communicate with the email server using a vendor-specific,
proprietary protocol. The server sends or receives email via the Internet through the
product's Internet mail gateway which also does any necessary reformatting. If Alice and
Bob work for the same company, the entire transaction may happen completely within a
single corporate email system.
Alice may not have a MUA on her computer but instead may connect to a webmail
service.
Alice's computer may run its own MTA, so avoiding the transfer at step 1.
Bob may pick up his email in many ways, for example logging into mx.b.org and reading
it directly, or by using a webmail service.
Domains usually have several mail exchange servers so that they can continue to accept
mail when the main mail exchange server is not available.
Email messages are not secure if email encryption is not used correctly.
Many MTAs used to accept messages for any recipient on the Internet and do their best to deliver
them. Such MTAs are called open mail relays. This was very important in the early days of the
Internet when network connections were unreliable. If an MTA couldn't reach the destination, it
could at least deliver it to a relay closer to the destination. The relay stood a better chance of
delivering the message at a later time. However, this mechanism proved to be exploitable by
people sending unsolicited bulk email and as a consequence very few modern MTAs are open
mail relays, and many MTAs don't accept messages from open mail relays because such
messages are very likely to be spam.
Message format
The Internet email message format is now defined by RFC 5322, with multi-media content
attachments being defined in RFC 2045 through RFC 2049, collectively called Multipurpose
Internet Mail Extensions or MIME. RFC 5322 replaced the earlier RFC 2822 in 2008, and in turn
RFC 2822 in 2001 replaced RFC 822 - which had been the standard for Internet email for nearly
20 years. Published in 1982, RFC 822 was based on the earlier RFC 733 for the ARPANET.[44]
Header: Structured into fields such as From, To, CC, Subject, Date, and other
information about the email.
Body: The basic content, as unstructured text; sometimes containing a signature block
at the end. This is exactly the same as the body of a regular letter.
Message header
Each message has exactly one header, which is structured into fields. Each field has a name and
a value. RFC 5322 specifies the precise syntax.
Informally, each line of text in the header that begins with a printable character begins a separate
field. The field name starts in the first character of the line and ends before the separator
character ":". The separator is then followed by the field value (the "body" of the field). The
value is continued onto subsequent lines if those lines have a space or tab as their first character.
Field names and values are restricted to 7-bit ASCII characters. Non-ASCII values may be
represented using MIME encoded words.
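The informal rules above (a name before the ":" separator, continuation lines that start with a space or tab) can be sketched as a small parser. This is a simplified illustration, not the full RFC 5322 grammar: it joins continuation lines with a single space, and the function name parse_header_block and the sample address are assumptions made for the example.

```python
def parse_header_block(text: str) -> list[tuple[str, str]]:
    """Parse header lines into (name, value) pairs, joining folded lines."""
    fields: list[tuple[str, str]] = []
    for line in text.splitlines():
        if not line:
            break  # a blank line ends the header
        if line[0] in " \t":
            # Continuation line: it extends the value of the previous field.
            if not fields:
                raise ValueError("continuation line before any field")
            name, value = fields[-1]
            fields[-1] = (name, value + " " + line.strip())
        else:
            # The field name runs up to the first ":" separator.
            name, sep, value = line.partition(":")
            if not sep:
                raise ValueError(f"missing ':' in header line: {line!r}")
            fields.append((name, value.strip()))
    return fields


sample = (
    "From: alice@a.org\n"  # hypothetical address
    "Subject: A long subject line that is\n"
    "\tcontinued on the next line\n"
    "\n"
    "body text\n"
)
for name, value in parse_header_block(sample):
    print(f"{name} -> {value}")
```

For real messages, Python's standard email package implements the complete syntax and is the safer choice.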
Header fields
Email header fields can be multi-line. Under RFC 5322, each line should be at most 78 characters
long (excluding the line ending) and must not exceed 998 characters. Header fields can only
contain US-ASCII characters; for encoding characters in other sets, a syntax specified in
RFC 2047 can be used.[45] The IETF EAI working group has defined some extensions to allow
Unicode characters to be used within the header. In particular, this allows email addresses to
use non-ASCII characters. Such characters must only be used by servers that support these
extensions.
From: The email address, and optionally the name of the author(s). In many email clients
not changeable except through changing account settings.
Date: The local time and date when the message was written. Like the From: field, many
email clients fill this in automatically when sending. The recipient's client may then
display the time in the format and time zone local to him/her.
Message-ID: Also an automatically generated field; used to prevent multiple delivery and
for reference in In-Reply-To: (see below).
In-Reply-To: Message-ID of the message that this is a reply to. Used to link related
messages together. This field only applies for reply messages.
RFC 3864 describes registration procedures for message header fields at the IANA; it provides
for permanent and provisional message header field names, including also fields defined for
MIME, netnews, and http, and referencing relevant RFCs. Common header fields for email
include:
To: The email address(es), and optionally name(s) of the message's recipient(s). Indicates
primary recipients (multiple allowed), for secondary recipients see Cc: and Bcc: below.
Subject: A brief summary of the topic of the message. Certain abbreviations are
commonly used in the subject, including "RE:" and "FW:".
Bcc: Blind Carbon Copy; addresses added to the SMTP delivery list but not (usually)
listed in the message data, remaining invisible to other recipients.
Cc: Carbon copy; Many email clients will mark email in your inbox differently
depending on whether you are in the To: or Cc: list.
Content-Type: Information about how the message is to be displayed, usually a MIME
type.
Precedence: commonly with values "bulk", "junk", or "list"; used to indicate that
automated "vacation" or "out of office" responses should not be returned for this mail,
e.g. to prevent vacation notices from being sent to all other subscribers of a mailing list.
Sendmail uses this header to affect prioritization of queued email, with "Precedence:
special-delivery" messages delivered sooner. With modern high-bandwidth networks
delivery priority is less of an issue than it once was. Microsoft Exchange respects a fine-
grained automatic response suppression mechanism, the X-Auto-Response-Suppress
header.[48]
References: Message-ID of the message that this is a reply to, and the message-id of the
message the previous reply was a reply to, etc.
Reply-To: Address that should be used to reply to the message.
Sender: Address of the actual sender acting on behalf of the author listed in the From:
field (secretary, list manager, etc.).
Archived-At: A direct link to the archived form of an individual email message.[49]
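As an illustration of how these header fields fit together, the sketch below builds a reply message with Python's standard email package. All addresses, the subject, and the Message-ID values are made up for the example.

```python
from email.message import EmailMessage

# Build a reply; every address and identifier here is hypothetical.
msg = EmailMessage()
msg["From"] = "Alice <alice@a.org>"
msg["To"] = "Bob <bob@b.org>"
msg["Subject"] = "RE: meeting notes"
msg["Message-ID"] = "<reply-1@a.org>"
msg["In-Reply-To"] = "<original-1@b.org>"  # links the reply to the original
msg.set_content("Thanks, see you on Monday.\n")  # also sets Content-Type

print(msg["Subject"])          # RE: meeting notes
print(msg.get_content_type())  # text/plain
```

Printing str(msg) shows the header fields followed by a blank line and the body, which is the layout described under Message format.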
Filename extension
Upon reception of email messages, email client applications save messages in operating
system files in the file system. Some clients save individual messages as separate files,
while others use various database formats, often proprietary, for collective storage. A
historical standard of storage is the mbox format. The specific format used is often
indicated by special filename extensions:
eml
Used by many email clients including Microsoft Outlook Express, Windows Mail and
Mozilla Thunderbird.[62] The files are plain text in MIME format, containing the email
header as well as the message contents and attachments in one or more of several
formats.
emlx
Used by Apple Mail.
msg
Used by Microsoft Office Outlook and OfficeLogic Groupware.
mbx
Used by Opera Mail, KMail, and Apple Mail based on the mbox format.
Some applications (like Apple Mail) leave attachments encoded in messages for
searching while also saving separate copies of the attachments. Others separate
attachments from messages and save them in a specific directory.
File Transfer Protocol (FTP) is a standard network protocol used to transfer files from one host
to another host over a TCP-based network, such as the Internet. It is often used to upload web
pages and other documents from a private development machine to a public web-hosting server.
FTP is built on a client-server architecture and uses separate control and data connections
between the client and the server. FTP users may authenticate themselves using a clear-text
sign-in protocol, normally in the form of a username and password, but can connect anonymously if
the server is configured to allow it. For secure transmission that encrypts the username and
password as well as the content, a client that uses the SSH File Transfer Protocol can be used
instead.
The first FTP client applications were interactive command-line tools, implementing standard
commands and syntax. Graphical user interface clients have since been developed for many of
the popular desktop operating systems in use today including general web design programs like
Microsoft Expression Web, and specialist FTP clients such as CuteFTP.
FTP operates on the application layer of the OSI model, and is used to transfer files using
TCP/IP. To do so, an FTP server has to be running and waiting for incoming requests. The client
computer is then able to communicate with the server on port 21. This connection, called the
control connection, remains open for the duration of the session. A second connection, called the
data connection, can either be opened by the server from its port 20 to a negotiated client port
(active mode), or by the client from an arbitrary port to a negotiated server port (passive mode)
as required to transfer file data. The control connection is used for session administration, for
example commands, identification and passwords exchanged between the client and the server
using a telnet-like protocol. For example "RETR filename" would transfer the specified file from
the server to the client. Due to this two-port structure, FTP is considered an out-of-band protocol,
as opposed to an in-band protocol such as HTTP.
The server responds over the control connection with three-digit status codes in ASCII with an
optional text message. For example "200" (or "200 OK") means that the last command was
successful. The numbers represent the code for the response and the optional text represents a
human-readable explanation or request (e.g. <Need account for storing file>). An ongoing
transfer of file data over the data connection can be aborted using an interrupt message sent over
the control connection.
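A short sketch of how a client might read one of these replies is given below. The reply classes keyed on the first digit come from RFC 959 rather than from these notes, the function name parse_reply is illustrative, and multi-line replies are not handled.

```python
# First digit of an FTP reply code and its meaning (per RFC 959).
REPLY_CLASSES = {
    "1": "positive preliminary",
    "2": "positive completion",
    "3": "positive intermediate",
    "4": "transient negative completion",
    "5": "permanent negative completion",
}


def parse_reply(line: str) -> tuple[int, str, str]:
    """Return (code, reply class, text) for a single-line FTP reply."""
    code, _, text = line.strip().partition(" ")
    valid = len(code) == 3 and code.isascii() and code.isdigit()
    if not valid or code[0] not in REPLY_CLASSES:
        raise ValueError(f"not an FTP reply: {line!r}")
    return int(code), REPLY_CLASSES[code[0]], text


print(parse_reply("200 OK"))  # (200, 'positive completion', 'OK')
```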
FTP may run in active or passive mode, which determines how the data connection is established.
In active mode, the client creates a TCP control connection to the server and sends the server the
client's IP address and an arbitrary client port number, and then waits until the server initiates the
data connection over TCP to that client IP address and client port number. In situations where the
client is behind a firewall and unable to accept incoming TCP connections, passive mode may be
used. In this mode, the client uses the control connection to send a PASV command to the server
and then receives a server IP address and server port number from the server, which the client
then uses to open a data connection from an arbitrary client port to the server IP address and
server port number received. Both modes were updated in September 1998 to support IPv6.
Further changes were introduced to the passive mode at that time, updating it to extended passive
mode.
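In passive mode the server's reply to PASV carries the address and port as six decimal numbers, conventionally in the form "227 Entering Passive Mode (h1,h2,h3,h4,p1,p2)" as described in RFC 959, where the port is p1 * 256 + p2. A minimal sketch of decoding such a reply follows; the function name and the sample reply are assumptions for illustration.

```python
import re


def parse_pasv_reply(reply: str) -> tuple[str, int]:
    """Extract (host, port) from a 227 reply to the PASV command."""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if match is None:
        raise ValueError(f"no address found in reply: {reply!r}")
    numbers = [int(group) for group in match.groups()]
    if any(n > 255 for n in numbers):
        raise ValueError(f"number out of range in reply: {reply!r}")
    host = ".".join(str(n) for n in numbers[:4])
    port = numbers[4] * 256 + numbers[5]  # high byte first, then low byte
    return host, port


print(parse_pasv_reply("227 Entering Passive Mode (192,168,1,2,19,137)"))
# ('192.168.1.2', 5001)
```

The client would then open the data connection to the returned host and port.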
While transferring data over the network, four data representations can be used:
ASCII mode: used for text. Data is converted, if needed, from the sending host's character
representation to "8-bit ASCII" before transmission, and (again, if necessary) to the
receiving host's character representation.
For text files, different format control and record structure options are provided. These features
were designed to facilitate files containing Telnet or ASA formatting.
Data transfer can be done in any of three modes:
Stream mode: Data is sent as a continuous stream, relieving FTP from doing any
processing. Rather, all processing is left up to TCP. No End-of-file indicator is needed,
unless the data is divided into records.
Block mode: FTP breaks the data into several blocks (block header, byte count, and data
field) and then passes it on to TCP.
Compressed mode: Data is compressed using a single algorithm (usually run-length
encoding).
Login
FTP login uses a normal username and password scheme for granting access. The username
is sent to the server using the USER command, and the password is sent using the PASS
command. If the information provided by the client is accepted by the server, the server will send
a greeting to the client and the session will commence. If the server supports it, users may log in
without providing login credentials, but the server may authorize only limited access for such
sessions.
Anonymous FTP
A host that provides an FTP service may provide anonymous FTP access. Users typically log into
the service with an 'anonymous' (lower-case and case-sensitive in some FTP servers) account
when prompted for user name. Although users are commonly asked to send their email address in
lieu of a password, no verification is actually performed on the supplied data. Many FTP hosts
whose purpose is to provide software updates will provide anonymous logins.
FTP normally transfers data by having the server connect back to the client, after the PORT
command is sent by the client. This is problematic for both NATs and firewalls, which do not
allow connections from the Internet towards internal hosts. For NATs, an additional complication
is that the representation of the IP address and port number in the PORT command refers to the
internal host's IP address and port, rather than the public IP address and port of the NAT.
There are two approaches to this problem. One is that the FTP client and FTP server use the
PASV command, which causes the data connection to be established from the FTP client to the
server. This is widely used by modern FTP clients. Another approach is for the NAT to alter the
values of the PORT command, using an application-level gateway for this purpose.
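To see why the PORT command is a problem behind a NAT, it helps to look at what the client actually sends. The sketch below formats a PORT command from an address and port using the RFC 959 encoding (four address bytes, then the port as a high byte and a low byte). The function name and the private address are illustrative assumptions.

```python
def format_port_command(ip: str, port: int) -> str:
    """Build the PORT command a client sends in active mode."""
    octets = ip.split(".")
    if len(octets) != 4 or not all(o.isdigit() and int(o) <= 255 for o in octets):
        raise ValueError(f"not an IPv4 address: {ip!r}")
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    high, low = divmod(port, 256)  # the port is sent as two bytes
    return "PORT " + ",".join(octets + [str(high), str(low)])


# A client on a private network announces its internal address.
print(format_port_command("192.168.1.2", 5001))  # PORT 192,168,1,2,19,137
```

The command embeds the client's internal address in the payload itself, which is what an application-level gateway on the NAT has to rewrite.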
FTPmail
Where FTP access is restricted, an FTPmail service can be used to circumvent the problem. An
e-mail containing the FTP commands to be performed is sent to an FTPmail server, which parses
the incoming e-mail, executes the requested FTP commands and sends back an e-mail with any
downloaded files as attachments. This service is less flexible than an FTP client, as it is not
possible to view directories interactively or to issue any modify commands. There can also be
problems with large file attachments not getting through mail servers. The service was used in
the days when some users' only internet access was via e-mail through gateways such as a BBS
or online service. As most internet users these days have ready access to FTP, this procedure is
no longer in widespread use.
Most common web browsers can retrieve files hosted on FTP servers, although they may not
support protocol extensions such as FTPS. When an FTP, rather than an HTTP, URL is
supplied, the accessible contents on the remote server are presented in a manner that is similar to
that used for other Web content. A full-featured FTP client can be run within Firefox in the form
of an extension called FireFTP.
Syntax
ftp://public.ftp-servers.example.com/mydirectory/myfile.txt
or:
ftp://user001:<password>@<host>/mydirectory/myfile.txt
More details on specifying a username and password may be found in the browsers'
documentation, such as, for example, Firefox and Internet Explorer. By default, most web
browsers use passive (PASV) mode, which more easily traverses end-user firewalls.
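The parts of an FTP URL can be pulled apart with Python's standard urllib.parse module, as in the sketch below. The second URL, including its user name, password, and host, is invented for the example, and the helper name describe_ftp_url is likewise an assumption.

```python
from urllib.parse import urlsplit


def describe_ftp_url(url: str) -> dict:
    """Return the components of an ftp:// URL as a dictionary."""
    parts = urlsplit(url)
    if parts.scheme != "ftp":
        raise ValueError(f"not an FTP URL: {url!r}")
    return {
        "user": parts.username,      # None when no login is given
        "password": parts.password,  # None when no password is given
        "host": parts.hostname,
        "port": parts.port or 21,    # 21 is the default control port
        "path": parts.path,
    }


print(describe_ftp_url("ftp://public.ftp-servers.example.com/mydirectory/myfile.txt"))
# Hypothetical credentials and host, for illustration only:
print(describe_ftp_url("ftp://user001:secret@ftp.example.com/mydirectory/myfile.txt"))
```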
Security
FTP was not designed to be a secure protocol (especially by today's standards) and has many
security weaknesses. In May 1999, the authors of RFC 2577 listed a vulnerability to the
following problems:
Bounce attacks
Spoof attacks
Brute force attacks
Packet capture (sniffing)
Username protection
Port stealing
FTP is not able to encrypt its traffic; all transmissions are in clear text, and usernames,
passwords, commands and data can be easily read by anyone able to perform packet capture
(sniffing) on the network. This problem is common to many of the Internet Protocol
specifications (such as SMTP, Telnet, POP and IMAP) that were designed prior to the creation of
encryption mechanisms such as TLS or SSL. A common solution to this problem is to use the
"secure", TLS-protected versions of the insecure protocols (e.g. FTPS for FTP, TelnetS for
Telnet, etc.) or a different, more secure protocol that can handle the job, such as the SFTP/SCP
tools included with most implementations of the Secure Shell protocol.
Secure FTP
There are several methods of securely transferring files that have been called "Secure FTP" at
one point or another.
FTPS
Explicit FTPS is an extension to the FTP standard that allows clients to request that the FTP
session be encrypted. This is done by sending the "AUTH TLS" command. The server has the
option of allowing or denying connections that do not request TLS. This protocol extension is
defined in the proposed standard: RFC 4217. Implicit FTPS is a deprecated standard for FTP that
required the use of an SSL or TLS connection. It was specified to use different ports than plain
FTP.
SFTP
SFTP, the "SSH File Transfer Protocol," is not related to FTP except that it also transfers files
and has a similar command set for users. SFTP, or secure FTP, is a program that uses Secure
Shell (SSH) to transfer files. Unlike standard FTP, it encrypts both commands and data,
preventing passwords and sensitive information from being transmitted openly over the network.
It is functionally similar to FTP, but because it uses a different protocol, you can't use a standard
FTP client to talk to an SFTP server, nor can you connect to an FTP server with a client that
supports only SFTP.
FTP over SSH (not SFTP) refers to the practice of tunneling a normal FTP session over an SSH
connection. Because FTP uses multiple TCP connections (unusual for a TCP/IP protocol that is
still in use), it is particularly difficult to tunnel over SSH. With many SSH clients, attempting to
set up a tunnel for the control channel (the initial client-to-server connection on port 21) will
protect only that channel; when data is transferred, the FTP software at either end will set up new
TCP connections (data channels), which bypass the SSH connection and thus have no
confidentiality or integrity protection, etc.
Otherwise, it is necessary for the SSH client software to have specific knowledge of the FTP
protocol, to monitor and rewrite FTP control channel messages and autonomously open new
packet forwardings for FTP data channels. Software packages that support this mode are:
FTP over SSH is sometimes referred to as secure FTP; this should not be confused with other
methods of securing FTP, such as SSL/TLS (FTPS). Other methods of transferring files using
SSH that are not related to FTP include SFTP and SCP; in each of these, the entire conversation
(credentials and data) is always protected by the SSH protocol.
Below is a list of FTP commands that may be sent to an FTP server, including all commands
that are standardized in RFC 959 by the IETF. All commands below are RFC 959-based unless
stated otherwise. Note that most command-line FTP clients present their own set of commands to users.
For example, GET is the common user command to download a file instead of the raw command RETR.
9P
Apple Filing Protocol (AFP)
BitTorrent
FTAM
FTP
o FTP over SSL (FTPS)
HFTP
HULFT
HTTP
o HTTPS
o WebDAV
rcp
rsync
Simple Asynchronous File Transfer (SAFT), bound to TCP port 487
Secure copy (SCP)
SSH file transfer protocol (SFTP)
Simple File Transfer Protocol
HTTP
HTTP stands for Hypertext Transfer Protocol. It is a TCP/IP-based communication protocol
which is used to deliver virtually all files and other data, collectively called resources, on the
World Wide Web. These resources could be HTML files, image files, query results, or anything
else.
A browser works as an HTTP client because it sends requests to an HTTP server, which is
called a Web server. The Web server then sends responses back to the client. The standard and
default port for HTTP servers to listen on is 80, but it can be changed to any other port, such
as 8080.
There are three important things about HTTP of which you should be aware:
HTTP is connectionless: The client opens a connection, sends a request, and waits for the
response. After delivering the response, the server closes the connection, so each new
request needs a new connection.
HTTP is media independent: Any type of data can be sent by HTTP as long as both the
client and server know how to handle the data content. How content is handled is
determined by the MIME specification.
HTTP is stateless: This is a direct result of HTTP's being connectionless. The server and
client are aware of each other only during a request. Afterwards, each forgets the other.
For this reason neither the client nor the server can retain information between different
requests across web pages.
Like most network protocols, HTTP uses the client-server model: An HTTP client opens a
connection and sends a request message to an HTTP server; the server then returns a response
message, usually containing the resource that was requested. After delivering the response, the
server closes the connection.
The formats of the request and response messages are similar. Each has the following structure:
an initial line, zero or more header lines, a blank line (a CRLF by itself), and an optional
message body.
Initial lines and headers should end in CRLF, though you should gracefully handle lines ending
in just LF. More exactly, CR and LF here mean ASCII values 13 and 10.
The initial line is different for the request than for the response. A request line has three parts,
separated by spaces: a method name, the Request-URI of the resource, and the version of HTTP
being used.
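The initial response line, called the status line, also has three parts separated by spaces: the
HTTP version, a response status code, and a reason phrase describing the status code. For example:
HTTP/1.0 200 OK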
Header Lines
Header lines provide information about the request or response, or about the object sent in the
message body.
The header lines are in the usual text header format, which is: one line per header, of the form
"Header-Name: value", ending with CRLF. It's the same format used for email and news
postings, defined in RFC 822.
A header line should end in CRLF, but you should handle LF correctly.
The header name is not case-sensitive.
Any number of spaces or tabs may be between the ":" and the value.
Header lines beginning with space or tab are actually part of the previous header line,
folded into multiple lines for easy reading.
For example:
User-agent: Mozilla/3.0Gold
An HTTP message may have a body of data sent after the header lines. In a response, this is
where the requested resource is returned to the client (the most common use of the message
body), or perhaps explanatory text if there's an error. In a request, this is where user-entered data
or uploaded files are sent to the server.
If an HTTP message includes a body, there are usually header lines in the message that describe
the body. In particular:
The Content-Type: header gives the MIME-type of the data in the body, such as
text/html or image/gif.
The Content-Length: header gives the number of bytes in the body.
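The message layout described so far (status line, header lines, blank line, body) can be sketched as a small parser for a response. This is a simplified illustration that assumes CRLF line endings and no folded header lines; the function name parse_response and the sample response are invented for the example.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP response into (version, status, reason, headers, body).

    Assumptions: lines end in CRLF and no header is folded across lines.
    """
    head, _, body = raw.partition(b"\r\n\r\n")  # the blank line ends the headers
    lines = head.decode("iso-8859-1").split("\r\n")
    parts = lines[0].split(" ", 2)  # status line: version, code, reason phrase
    if len(parts) < 2:
        raise ValueError(f"bad status line: {lines[0]!r}")
    version, status = parts[0], int(parts[1])
    reason = parts[2] if len(parts) == 3 else ""
    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"bad header line: {line!r}")
        headers[name.strip().lower()] = value.strip()  # names are case-insensitive
    if "content-length" in headers:
        body = body[: int(headers["content-length"])]
    return version, status, reason, headers, body


raw = (
    b"HTTP/1.0 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 13\r\n"
    b"\r\n"
    b"<p>Hello</p>\n"
)
version, status, reason, headers, body = parse_response(raw)
print(status, reason)           # 200 OK
print(headers["content-type"])  # text/html
print(body)                     # b'<p>Hello</p>\n'
```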
The set of common methods for HTTP/1.0 is defined below, although this set can be expanded.
The GET method means retrieve whatever information (in the form of an entity) is identified by
the Request-URI. If the Request-URI refers to a data-producing process, it is the produced data
which shall be returned as the entity in the response and not the source text of the process, unless
that text happens to be the output of the process.
A conditional GET method requests that the identified resource be transferred only if it has been
modified since the date given by the If-Modified-Since header. The conditional GET method is
intended to reduce network usage by allowing cached entities to be refreshed without requiring
multiple requests or transferring unnecessary data.
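The server-side decision behind a conditional GET can be sketched as follows: compare the resource's last modification time with the date in the If-Modified-Since header and answer 304 (not modified) or 200. The function name conditional_get_status is an assumption for this example, and treating an unparseable date as "send the resource" is one reasonable policy rather than a rule taken from these notes.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional


def conditional_get_status(last_modified: datetime,
                           if_modified_since: Optional[str]) -> int:
    """Return 304 if the resource is unchanged since the given date, else 200."""
    if if_modified_since is None:
        return 200  # ordinary GET: always send the resource
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable date: ignore the header
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    return 304 if last_modified <= since else 200


modified = datetime(1994, 10, 29, 19, 43, 31, tzinfo=timezone.utc)
header_value = format_datetime(modified, usegmt=True)
print(header_value)                                    # Sat, 29 Oct 1994 19:43:31 GMT
print(conditional_get_status(modified, header_value))  # 304
```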
The GET method can also be used to submit forms. The form data is URL-encoded and
appended to the request URI.
A HEAD request is just like a GET request, except it asks the server to return the response
headers only, and not the actual resource (i.e. no message body). This is useful to check
characteristics of a resource without actually downloading it, thus saving bandwidth. Use HEAD
when you don't actually need a file's contents.
The response to a HEAD request must never contain a message body, just the status line and
headers.
A POST request is used to send data to the server to be processed in some way, like by a CGI
script. A POST request is different from a GET request in the following ways:
There's a block of data sent with the request, in the message body. There are usually extra
headers to describe this message body, like Content-Type: and Content-Length:
The request URI is not a resource to retrieve; it's usually a program to handle the data
you're sending.
The most common use of POST, by far, is to submit HTML form data to CGI scripts. In this
case, the Content-Type: header is usually application/x-www-form-urlencoded, and the
Content-Length: header gives the length of the URL-encoded form data. The CGI script
receives the message body through STDIN, and decodes it. Here's a typical form submission,
using POST:
home=Mosby&favorite+flavor=flies
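The URL-encoded body above can be reproduced with Python's standard urllib.parse.urlencode, which also makes it easy to compute the matching Content-Length. The sketch below assembles a complete POST request; the script path /path/script.cgi and the function name build_form_post are assumptions made for the example.

```python
from urllib.parse import urlencode


def build_form_post(path: str, fields: dict[str, str]) -> bytes:
    """Build an HTTP/1.0 POST request carrying URL-encoded form data."""
    body = urlencode(fields).encode("ascii")  # spaces become "+"
    head = (
        f"POST {path} HTTP/1.0\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"  # blank line separates the headers from the body
    ).encode("ascii")
    return head + body


request = build_form_post(
    "/path/script.cgi",  # hypothetical CGI script location
    {"home": "Mosby", "favorite flavor": "flies"},
)
print(request.decode("ascii"))
```

The body comes out as home=Mosby&favorite+flavor=flies, which is 32 bytes long, so the request carries Content-Length: 32.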
If you were writing a CGI script directly (i.e. not using PHP, but Perl, Shell, C, or another
language), you would have to pay attention to where you get the user's variable/value
combinations. In the case of GET you would use the QUERY_STRING environment variable,
and in the case of POST you would use the CONTENT_LENGTH environment variable, which
gives the number of bytes to read from STDIN, to control your iteration as you parsed for
special characters to extract a variable and its value.
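A minimal sketch of that logic in Python is shown below. It takes the environment and the input stream as parameters so it can be tried without a web server; in a real CGI script they would be os.environ and sys.stdin.buffer. The function name read_form is an assumption, and REQUEST_METHOD is the standard CGI variable naming the method.

```python
import io
from urllib.parse import parse_qs


def read_form(environ, stdin) -> dict[str, list[str]]:
    """Return the submitted form fields for a GET or POST request."""
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # CONTENT_LENGTH says how many bytes of form data follow on STDIN.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        data = stdin.read(length).decode("ascii")
    else:
        # For GET, the form data arrives in the URL's query string.
        data = environ.get("QUERY_STRING", "")
    return parse_qs(data)  # decodes "+" and %XX escapes


post_env = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "32"}
post_body = io.BytesIO(b"home=Mosby&favorite+flavor=flies")
print(read_form(post_env, post_body))
# {'home': ['Mosby'], 'favorite flavor': ['flies']}
```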
POST Method:
Your form data is attached to the end of the POST request (as opposed to the URL).
Not as quick and easy as using GET, but more versatile (provided that you are writing the
CGI directly).
GET Method :
Your entire form submission can be encapsulated in one URL, like a hyperlink, so a query can
be stored as just a URL.
You can access the CGI program with a query without using a form.
Header lines provide information about the request or response, or about the object sent in the
message body. This section lists the header fields available in HTTP Version 1.0.
Allow
The Allow entity-header field lists the set of methods supported by the resource identified by the
Request-URI. The purpose of this field is strictly to inform the recipient of valid methods
associated with the resource.
Example
Authorization
The Authorization field value consists of credentials containing the authentication information of
the user agent for the realm of the resource being requested.
Example
Authorization: credentials
Content-Encoding
The Content-Encoding entity-header field is used as a modifier to the media-type. When present,
its value indicates what additional content coding has been applied to the resource, and thus what
decoding mechanism must be applied in order to obtain the media-type referenced by the
Content-Type header field. The Content-Encoding is primarily used to allow a document to be
compressed without losing the identity of its underlying media type.
Example
Content-Encoding: x-gzip
Content-Length
The Content-Length entity-header field indicates the size of the Entity-Body, in decimal number
of octets, sent to the recipient or, in the case of the HEAD method, the size of the Entity-Body
that would have been sent had the request been a GET.
Example
Content-Length: 3495
Content-Type
The Content-Type entity-header field indicates the media type of the Entity-Body sent to the
recipient or, in the case of the HEAD method, the media type that would have been sent had the
request been a GET.
Example
Content-Type: text/html
Date
The Date general-header field represents the date and time at which the message was originated,
having the same semantics as orig-date in RFC 822.
Example
Expires
The Expires entity-header field gives the date/time after which the entity should be considered
stale. This allows information providers to suggest the volatility of the resource, or a date after
which the information may no longer be valid.
Example
From
The From request-header field, if given, should contain an Internet e-mail address for the human
user who controls the requesting user agent. The address should be machine-usable, as defined
by mailbox in RFC 822.
Example
From: [email protected]
If-Modified-Since
The If-Modified-Since request-header field is used with the GET method to make it conditional:
if the requested resource has not been modified since the time specified in this field, a copy of
the resource will not be returned from the server; instead, a 304 (not modified) response will be
returned without any Entity-Body.
Example
Last-Modified
The Last-Modified entity-header field indicates the date and time at which the sender believes
the resource was last modified.
Example
Location
The Location response-header field defines the exact location of the resource that was identified
by the Request-URI. For 3xx responses, the location must indicate the server's preferred URL for
automatic redirection to the resource. Only one absolute URL is allowed.
Example
Location: http://www.w3.org/hypertext/WWW/NewLocation.html
Pragma
The Pragma general-header field is used to include implementation-specific directives that may
apply to any recipient along the request/response chain. All pragma directives specify optional
behavior from the viewpoint of the protocol; however, some systems may require that behavior
be consistent with the directives.
Example
Referer
The Referer request-header field allows the client to specify, for the server's benefit, the address
(URI) of the resource from which the Request-URI was obtained.
Example
Referer: http://www.w3.org/hypertext/DataSources/Overview.html
Server
The Server response-header field contains information about the software used by the origin
server to handle the request. The field can contain multiple product tokens and comments
identifying the server and any significant subproducts.
Example
User-Agent
The User-Agent request-header field contains information about the user agent originating the
request. This is for statistical purposes, the tracing of protocol violations, and automated
recognition of user agents for the sake of tailoring responses to avoid particular user agent
limitations.
Example
WWW-Authenticate
Example
To retrieve the file at the URL
http://www.somehost.com/path/file.html
first open a socket to the host www.somehost.com, port 80 (the default port of 80 is used because
none is specified in the URL). Then send something like the following through the socket:
The server should respond with something like the following, sent back through the same socket:
HTTP/1.0 200 OK
Date: Fri, 31 Dec 1999 23:59:59 GMT
Content-Type: text/html
Content-Length: 1354
<html>
<body>
<h1>Happy New Millennium!</h1>
To familiarize yourself with requests and responses, experiment with HTTP manually using
telnet.
Using telnet, you can open an interactive socket to an HTTP server. This lets you manually enter
a request, and see the response written to your screen. It's a great help when learning HTTP, to
see exactly how a server responds to a particular request. It also helps when troubleshooting.
From a Unix prompt, open a connection to an HTTP server with something like
telnet www.somehost.com 80
After you finish your request with the blank line, you'll see the raw response from the server,
including the status line, headers, and message body.
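The same exchange can be scripted instead of typed into telnet. The sketch below is a minimal illustration, not a full HTTP client: the function names build_request, parse_response and fetch are hypothetical, the request carries no optional headers, and the sample bytes are modelled on the response shown earlier.

```python
import socket
from urllib.parse import urlsplit

def build_request(url):
    """Split a URL into (host, port, request bytes) for a bare HTTP/1.0 GET."""
    parts = urlsplit(url)
    port = parts.port or 80                 # default port when the URL names none
    path = parts.path or "/"
    request = f"GET {path} HTTP/1.0\r\n\r\n"  # the blank line ends the request
    return parts.hostname, port, request.encode("ascii")

def parse_response(raw):
    """Split a raw response into (status code, reason, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    _version, status, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(status), reason, headers, body

def fetch(url):
    """Open a socket, send the request, and read until the server closes."""
    host, port, request = build_request(url)
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return parse_response(b"".join(chunks))

sample = (b"HTTP/1.0 200 OK\r\n"
          b"Date: Fri, 31 Dec 1999 23:59:59 GMT\r\n"
          b"Content-Type: text/html\r\n"
          b"Content-Length: 1354\r\n"
          b"\r\n"
          b"<html>")
status, reason, headers, body = parse_response(sample)
print(status, reason, headers["content-type"])
```

Calling fetch needs a reachable server, so the sketch only runs the parser against the sample bytes.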
WWW
Introduction
When you first open the web browser it will automatically load a "homepage" - usually that of
the browser's manufacturer. E.g. Internet Explorer will load the MSN homepage. Most people
find this irritating and change the default homepage setting to something they are more interested
in like weather reports, stock exchange info or their favourite search engine. To change your
homepage to your preferred page, browse to the page you want as your homepage. Then go
"View", "Internet Options", select the "Use Current" button.
As a page "loads" into your browser you will see the text come in, the pictures arrive - all the
basic elements of a web page. You will notice that some text is underlined and in different
colours - this is hypertext, if you click on it, you will jump to another web page. When you move
your mouse arrow over a link the mouse's pointer will change to a hand. This indicates there is a
hypertext link associated with that text. The same applies to pictures, that is, if you move your
mouse over a picture and your mouse arrow turns from an arrow to a hand, you know there is a
link there. The process of clicking hypertext links, loading one page after another, is called
"browsing" or "surfing the web".
The Internet Explorer tool bar looks like the one below. The tool bar is essential for navigation
and frequently performed functions.
IE Toolbar
The address or URL of the current page you are on appears in the "Address" Bar below the tool
bar. You can click in here at any time and overwrite this address with another if you know the
address off the top of your head. Leave the "http://" in place: it is required at the start of the URL and
indicates that the browser is to use the hypertext transfer protocol to request the web page. URL
stands for "uniform resource locator". In non-technical terms you can think of it as simply the
address of a web page.
Address panel
When a page is loading you may see some unusual things - the wavy window (top right) waves
and a green status bar (bottom right) moves from left to right. The wavy window is simply
saying that your browser is looking for the page you requested, while the status bar shows you
how much of the web page has loaded.
Bottom left, in the gray frame you will see words flashing backwards and forwards. These are
the names of all the files that will make up the web page you have requested.
Caching
As you browse the web, your machine saves html files and images that you request. This process
is called "caching". Internet Explorer calls cached files "Temporary Internet Files". Caching is
designed to speed up your experience on the web. Caching occurs on your machine and
sometimes on the servers of your Internet Service Provider. If you own a web site or make
changes to a web site, you may have to refresh your browser to see your changes. The quick way
to do this is to click the refresh button on your browser. This sometimes does not work if other
servers above you have cached your page. To get past this, and back to the original server, press
Ctrl + F5 (or hold Shift while pressing the Refresh button). You can control your caching options
by going to Tools > Internet Options. On the General tab, you will see an area called Temporary
Internet Files.
SSL
Also, sometimes you will see a little padlock appear (bottom right on the status bar). This
means that the page you have requested is on a "secure server" - a server that allows sensitive
information to be transferred between your computer and the server you have contacted -
information like your credit card details, or details of an account login.
More and more secure servers are appearing on-line as banks, shops and others move more of
their businesses to the web. Security on the Internet was originally handled by a protocol called SSL, or
Secure Sockets Layer, which has since been superseded by Transport Layer Security (TLS). The old SSL
versions have known weaknesses and are deprecated, so current TLS versions should be used.
However, where human intervention or handling of credit card numbers is possible, security
breaches may occur.
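In current software the secure connection behind the padlock is negotiated with TLS, the successor to SSL. As a hedged sketch, Python's standard ssl module can build a client configuration with certificate checking enabled. Requiring TLS 1.2 as the minimum is an assumption made for this example.

```python
import ssl

# Default client context: verifies the server certificate and its host name.
context = ssl.create_default_context()

# Assumed policy for this sketch: refuse anything older than TLS 1.2.
context.minimum_version = ssl.TLSVersion.TLSv1_2

print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```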
Multimedia is media and content that uses a combination of different content forms. The term
can be used as a noun (a medium with multiple content forms) or as an adjective describing a
medium as having multiple content forms. The term is used in contrast to media which use only
rudimentary computer display such as text-only, or traditional forms of printed or hand-produced
material. Multimedia includes a combination of text, audio, still images, animation, video, or
interactivity content forms.
Multimedia finds its application in various areas including, but not limited to, advertisements,
art, education, entertainment, engineering, medicine, mathematics, business, scientific research
and spatial temporal applications. Several examples are as follows:
Creative industries
Creative industries use multimedia for a variety of purposes ranging from fine arts, to
entertainment, to commercial art, to journalism, to media and software services provided for any
of the industries listed below. An individual multimedia designer may cover the spectrum
throughout their career. Requests for their skills range from technical, to analytical, to creative.
Commercial uses
Much of the electronic old and new media used by commercial artists is multimedia. Exciting
presentations are used to grab and keep attention in advertising. Business to business, and
interoffice communications are often developed by creative services firms for advanced
multimedia presentations beyond simple slide shows to sell ideas or liven-up training.
Commercial multimedia developers may be hired to design for governmental services and
nonprofit services applications as well.
entails the creation of multimedia that can be displayed in a traditional fine arts arena, such as an
art gallery. Although multimedia display material may be volatile, the survivability of the content
is as strong as any traditional media. Digital recording material may be just as durable and
infinitely reproducible with perfect copies every time.
Education
Learning theory in the past decade has expanded dramatically because of the introduction of
multimedia. Several lines of research have evolved (e.g. Cognitive load, Multimedia learning,
and the list goes on). The possibilities for learning and instruction are nearly endless.
The idea of media convergence is also becoming a major factor in education, particularly higher
education. Defined as separate technologies such as voice (and telephony features), data (and
productivity applications) and video that now share resources and interact with each other,
synergistically creating new efficiencies, media convergence is rapidly changing the curriculum
in universities all over the world. Likewise, it is changing the availability, or lack thereof, of jobs
requiring this savvy technological skill.
English education in middle schools in China is well funded and supported with a variety of
equipment. Even so, the original objective has not been achieved to the desired effect. The
government, schools, families, and students spend a lot of time working on improving scores, but
students hardly gain practical skills. English education today has fallen into a vicious circle. Educators
need to consider how to improve the education system so that students gain practical ability in
English. An efficient way is therefore needed to make the class vivid. Multimedia teaching
brings students into a class where they can interact with the teacher and the subject.
Multimedia teaching is more intuitive than older methods; teachers can simulate situations from real life.
In many circumstances teachers don't have to be present, and students can learn by themselves in the
class. More importantly, teachers have more ways to stimulate students' passion for
learning.
Journalism
Newspaper companies all over are also trying to embrace the new phenomenon by implementing
its practices in their work. While some have been slow to come around, other major newspapers
like The New York Times, USA Today and The Washington Post are setting the precedent for the
positioning of the newspaper industry in a globalized world.
News reporting is not limited to traditional media outlets. Freelance journalists can make use of
different new media to produce multimedia pieces for their news stories. It engages global
audiences and tells stories with technology, which develops new communication techniques for
both media producers and consumers. The Common Language Project is an example of this type of
multimedia journalism production.
Engineering
Software engineers may use multimedia in Computer Simulations for anything from
entertainment to training such as military or industrial training. Multimedia for software
interfaces are often done as a collaboration between creative professionals and software
engineers.
Industry
In the Industrial sector, multimedia is used as a way to help present information to shareholders,
superiors and coworkers. Multimedia is also helpful for providing employee training, advertising
and selling products all over the world via virtually unlimited web-based technology.
In mathematical and scientific research, multimedia is mainly used for modeling and simulation.
For example, a scientist can look at a molecular model of a particular substance and manipulate
it to arrive at a new substance. Representative research can be found in journals such as the
Journal of Multimedia.
Medicine
In Medicine, doctors can get trained by looking at a virtual surgery or they can simulate how the
human body is affected by diseases spread by viruses and bacteria and then develop techniques
to prevent them.
Document imaging
Document imaging is a technique that takes hard copy of an image/document and converts it into
a digital format (for example, scanners).
Disabilities
Ability Media allows those with disabilities to gain qualifications in the multimedia field so they
can pursue careers that give them access to a wide array of powerful communication forms.
Security
Security is the degree of protection against danger, damage, loss, and crime. Security as a form
of protection consists of structures and processes that provide or improve security as a condition.
computer system security means the collective processes and mechanisms by which sensitive and
valuable information and services are protected from publication, tampering or collapse by
unauthorized activities or untrustworthy individuals and unplanned events respectively. The
strategies and methodologies of computer security often differ from most other computer
technologies because of its somewhat elusive objective of preventing unwanted computer
behavior instead of enabling wanted computer behavior.
The following terms used in engineering secure systems are explained below.
Authentication techniques can be used to ensure that communication end-points are who
they say they are.
Automated theorem proving and other verification tools can enable critical algorithms
and code used in secure systems to be mathematically proven to meet their specifications.
Capability and access control list techniques can be used to ensure privilege separation
and mandatory access control.
Chain of trust techniques can be used to attempt to ensure that all software loaded has
been certified as authentic by the system's designers.
Cryptographic techniques can be used to defend data in transit between systems, reducing
the probability that data exchanged between systems can be intercepted or modified.
Firewalls can provide some protection from online intrusion.
Applications only control the use of resources granted to them, and not which resources
are granted to them. They, in turn, determine the use of these resources by users of the
application through application security.
According to the patterns & practices Improving Web Application Security book, the following
terms are relevant to application security:[1]
Asset. A resource of value such as the data in a database or on the file system, or a system
resource.
Threat. A negative effect.
Vulnerability. A weakness that makes a threat possible.
Attack (or exploit). An action taken to harm an asset.
Countermeasure. A safeguard that addresses a threat and mitigates risk.
2. Data security means protecting a database from destructive forces and the unwanted actions
of unauthorised users.
Disk Encryption
Disk encryption refers to encryption technology that encrypts data on a hard disk drive. Disk
encryption typically takes the form of either software or hardware. Disk encryption is often referred
to as on-the-fly encryption or transparent encryption.
Software-based security solutions encrypt the data to prevent it from being stolen. However, a
malicious program or a hacker may corrupt the data in order to make it unrecoverable or
unusable. Similarly, encrypted operating systems can be corrupted by a malicious program or a
hacker, making the system unusable. Hardware-based security solutions can prevent read and
write access to data and hence offer very strong protection against tampering and unauthorized
access.
Backups
Data Masking
Data masking of structured data is the process of obscuring (masking) specific data within a
database table or cell to ensure that data security is maintained and sensitive information is not
exposed to unauthorized personnel. This may include masking the data from users (for example
so banking customer representatives can only see the last 4 digits of a customer's national identity
number), developers (who need real production data to test new software releases but should not
be able to see sensitive financial data), outsourcing vendors, etc.
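The "last 4 digits" case can be sketched in a few lines of Python. The function name mask_identifier, the mask character and the sample number are all hypothetical; a production system would normally apply masking inside the database or a dedicated tool rather than in application code like this.

```python
def mask_identifier(value: str, visible: int = 4, mask_char: str = "*") -> str:
    """Hide every letter or digit except the last `visible` ones; keep separators."""
    seen = 0
    out = []
    for ch in reversed(value):          # walk from the right-hand end
        if ch.isalnum():
            seen += 1
            out.append(ch if seen <= visible else mask_char)
        else:
            out.append(ch)              # punctuation such as '-' stays readable
    return "".join(reversed(out))

print(mask_identifier("123-45-6789"))
```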
Data Erasure
Data erasure is a method of software-based overwriting that completely destroys all electronic
data residing on a hard drive or other digital media to ensure that no sensitive data is leaked
when an asset is retired or reused.
The terms information security, computer security and information assurance are frequently used
interchangeably. These fields are often interrelated and share the common goals of protecting the
confidentiality, integrity and availability of information; however, there are some subtle
differences between them.
These differences lie primarily in the approach to the subject, the methodologies used, and the
areas of concentration. Information security is concerned with the confidentiality, integrity and
availability of data regardless of the form the data may take: electronic, print, or other forms.
Computer security can focus on ensuring the availability and correct operation of a computer
system without concern for the information stored or processed by the computer. Information
assurance focuses on the reasons for assurance that information is protected, and is thus
reasoning about information security.
Should confidential information about a business's customers, finances, or new product line fall
into the hands of a competitor, such a breach of security could lead to negative consequences.
Protecting confidential information is a business requirement, and in many cases also an ethical
and legal requirement.
Key concepts
a. Confidentiality
Breaches of confidentiality take many forms. Permitting someone to look over your shoulder at
your computer screen while you have confidential data displayed on it could be a breach of
confidentiality. If a laptop computer containing sensitive information about a company's
employees is stolen or sold, it could result in a breach of confidentiality. Giving out confidential
information over the telephone is a breach of confidentiality if the caller is not authorized to have
the information.
Confidentiality is necessary (but not sufficient) for maintaining the privacy of the people whose
personal information a system holds.
b. Integrity
In information security, integrity means that data cannot be modified undetectably. This
is not the same thing as referential integrity in databases, although it can be viewed as a special
case of Consistency as understood in the classic ACID model of transaction processing. Integrity
is violated when a message is actively modified in transit. Information security systems typically
provide message integrity in addition to data confidentiality.
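One common way to detect modification in transit is a keyed message authentication code. The sketch below uses HMAC-SHA256 from Python's standard library. It is a swapped-in illustration rather than a technique named in these notes, and the key and messages are hypothetical.

```python
import hashlib
import hmac

def sign(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the message with a shared secret key."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign(key, message), tag)

key = b"shared-secret"                 # hypothetical key known to both ends
tag = sign(key, b"transfer 100")

print(verify(key, b"transfer 100", tag))   # unmodified message
print(verify(key, b"transfer 900", tag))   # message altered in transit
```

The receiver accepts the first message and rejects the second, because the altered bytes no longer match the tag.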
c. Availability
For any information system to serve its purpose, the information must be available when it is
needed. This means that the computing systems used to store and process the information, the
security controls used to protect it, and the communication channels used to access it must be
functioning correctly. High availability systems aim to remain available at all times, preventing
service disruptions due to power outages, hardware failures, and system upgrades. Ensuring
availability also involves preventing denial-of-service attacks.
d. Authenticity
In computing, e-Business, and information security, it is necessary to ensure that the data,
transactions, communications or documents (electronic or physical) are genuine. It is also
important for authenticity to validate that both parties involved are who they claim they are.
e. Non-repudiation
In law, non-repudiation implies one's intention to fulfill their obligations to a contract. It also
implies that one party of a transaction cannot deny having received a transaction nor can the
other party deny having sent a transaction.
Electronic commerce uses technology such as digital signatures and public key encryption to
establish authenticity and non-repudiation.
4. Network security[1] consists of the provisions and policies adopted by a network administrator
to prevent and monitor unauthorized access, misuse, modification, or denial of a computer
network and network-accessible resources. Network security involves the authorization of access
to data in a network, which is controlled by the network administrator. Users choose or are
assigned an ID and password or other authenticating information that allows them access to
information and programs within their authority. Network security covers a variety of computer
networks, both public and private, that are used in everyday jobs conducting transactions and
communications among businesses, government agencies and individuals. Networks can be
private, such as within a company, or open to public access. Network
security is involved in organizations, enterprises, and other types of institutions. It does as its title
explains: it secures the network, and protects and oversees the operations being done. The
most common and simple way of protecting a network resource is by assigning it a unique name
and a corresponding password.
Types of Attacks
Networks are subject to attacks from malicious sources. Attacks fall into two categories:
"Passive", in which a network intruder intercepts data traveling through the network, and "Active", in
which an intruder initiates commands to disrupt the network's normal operation.[9]
Passive
o Network
wiretapping
Port scanner
Idle scan
Active
o Denial-of-service attack
o Spoofing
o ARP poisoning
o Smurf attack
o Buffer overflow
SNMP-based management allows for third-party solutions to be used. This includes products
such as HP OpenView and CA Unicenter.
The base component of an SNMP solution is the Management Information Base (MIB). The
MIB is included on the Sun Fire V20z and Sun Fire V40z Servers Network Share Volume CD.
This server management configuration is beneficial when, for example, you have a cluster of
machines serving web content and the platform is connected to the Internet, but the SP is
protected and accessible only on an internal network.
SNMP Integration
SNMP is used to communicate management information between the management stations and
the agents. In other words, SNMP is the protocol by which the agent and the management station
communicate.
Monitoring status through SNMP at any significant level of detail is accomplished primarily by
polling for appropriate information on the part of the management station. Managed nodes can
also provide unsolicited status information to management stations in the form of traps, which
are likely to guide the polling at the management station.
The servers include SNMP agents that allow for out-of-band health and status monitoring. The
SNMP agent runs on the SP and therefore all SNMP-based management of the server should
occur through the SP.
Event management
Inventory management
Sensor and system state monitoring
SP configuration monitoring
The Management Information Base (MIB) is a text file that describes SNMP data as managed
objects. These servers provide SNMP MIBs so that you can manage and monitor your server
using any SNMP-capable network management system, such as HP OpenView Network Node
Manager (NNM), Tivoli, CA Unicenter, IBM Director, and so on. The MIB data describes the
information being managed, reflects current and recent server status, and provides server
statistics.
In typical SNMP uses, one or more administrative computers, called managers, have the task of
monitoring or managing a group of hosts or devices on a computer network. Each managed
system executes, at all times, a software component called an agent which reports information
via SNMP to the manager.
Essentially, SNMP agents expose management data on the managed systems as variables. The
protocol also permits active management tasks, such as modifying and applying a new
configuration through remote modification of these variables. The variables accessible via SNMP
are organized in hierarchies. These hierarchies, and other metadata (such as type and description
of the variable), are described by Management Information Bases (MIBs).
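The hierarchy of variables can be pictured with a toy table keyed by object identifier (OID). The sketch below assumes the well-known MIB-II OIDs for sysDescr, sysUpTime, sysName and ifNumber. The stored values and the helper names are hypothetical, and a real agent would take its definitions from MIB files rather than a hard-coded dictionary.

```python
def parse_oid(text: str) -> tuple:
    """Turn dotted notation such as '1.3.6.1.2.1.1.5.0' into a tuple of integers."""
    return tuple(int(part) for part in text.strip(".").split("."))

SYSTEM_GROUP = parse_oid("1.3.6.1.2.1.1")   # the MIB-II 'system' subtree

# OID -> (name, type, value); the values are made up for illustration.
variables = {
    parse_oid("1.3.6.1.2.1.1.1.0"): ("sysDescr", "OCTET STRING", "example device"),
    parse_oid("1.3.6.1.2.1.1.3.0"): ("sysUpTime", "TimeTicks", 123456),
    parse_oid("1.3.6.1.2.1.1.5.0"): ("sysName", "OCTET STRING", "lab-switch"),
    parse_oid("1.3.6.1.2.1.2.1.0"): ("ifNumber", "INTEGER", 4),
}

def in_subtree(oid: tuple, root: tuple) -> bool:
    """An OID belongs to a subtree when the subtree's OID is its prefix."""
    return oid[:len(root)] == root

def get(oid_text: str):
    """Look up the current value of one variable."""
    return variables[parse_oid(oid_text)][2]

system_names = [variables[oid][0] for oid in sorted(variables)
                if in_subtree(oid, SYSTEM_GROUP)]
print(system_names, get("1.3.6.1.2.1.1.5.0"))
```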
An SNMP-managed network consists of three key components:
Managed device
Agent software which runs on managed devices
Network management system (NMS) software which runs on the manager
A managed device is a network node that implements an SNMP interface that allows
unidirectional (read-only) or bidirectional access to node-specific information. Managed devices
exchange node-specific information with the NMSs. Sometimes called network elements, the
managed devices can be any type of device, including, but not limited to, routers, access servers,
switches, bridges, hubs, IP telephones, IP video cameras, computer hosts, and printers.
A network management system (NMS) executes applications that monitor and control managed
devices. NMSs provide the bulk of the processing and memory resources required for network
management. One or more NMSs may exist on any managed network.
To be managed, a device must have an SNMP agent associated with it. The agent receives
requests for data representing the state of the device and provides an appropriate response. The
agent can also control the state of the device. Additionally, the agent can generate SNMP traps,
which are unsolicited messages sent to selected NMSs to signal significant events relating to the
device.
Protocol details
SNMP operates in the Application Layer of the Internet Protocol Suite (Layer 7 of the OSI
model). The SNMP agent receives requests on UDP port 161. The manager may send requests
from any available source port to port 161 in the agent. The agent response will be sent back to
the source port on the manager. The manager receives notifications (Traps and InformRequests)
on port 162. The agent may generate notifications from any available port. When used with
Transport Layer Security or Datagram Transport Layer Security, requests are received on port
10161 and traps are sent to port 10162.[3]
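The port behaviour above can be tried on the loopback interface with plain UDP sockets. This is a sketch under loud assumptions: the payloads are placeholder bytes, not encoded SNMP messages, and the stand-in agent binds to an arbitrary free port because binding to port 161 normally needs administrator rights.

```python
import socket

AGENT_PORT = 161          # where a real agent listens for requests
NOTIFICATION_PORT = 162   # where a real manager listens for traps and informs

# Stand-in agent on loopback, using a free port instead of 161 (assumption).
agent = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
agent.bind(("127.0.0.1", 0))
agent.settimeout(2)
agent_addr = agent.getsockname()

# The manager does not bind explicitly: the OS picks any available source port.
manager = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
manager.settimeout(2)
manager.sendto(b"request", agent_addr)        # placeholder payload

# The agent answers to whatever source port the request came from.
data, manager_addr = agent.recvfrom(1024)
agent.sendto(b"response", manager_addr)

reply, reply_from = manager.recvfrom(1024)
manager_port = manager.getsockname()[1]
print(reply, manager_addr[1] == manager_port)

agent.close()
manager.close()
```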
SNMPv1 specifies five core protocol data units (PDUs). Two other PDUs, GetBulkRequest and
InformRequest were added in SNMPv2 and carried over to SNMPv3.
GetRequest
SetRequest
GetNextRequest
A manager-to-agent request to discover available variables and their values. Returns a Response
with variable binding for the lexicographically next variable in the MIB. The entire MIB of an
agent can be walked by iterative application of GetNextRequest starting at OID 0. Rows of a
table can be read by specifying column OIDs in the variable bindings of the request.
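The walk described for GetNextRequest can be sketched with a sorted list of OIDs. The toy MIB contents and the function names are hypothetical. OIDs are held as tuples of integers so that Python's tuple comparison gives the lexicographic ordering the walk relies on (comparing dotted strings would wrongly put 10 before 9).

```python
import bisect

# Toy agent MIB: OID tuple -> value (values are made up for illustration).
mib = {
    (1, 3, 6, 1, 2, 1, 1, 1, 0): "example device",
    (1, 3, 6, 1, 2, 1, 1, 5, 0): "lab-switch",
    (1, 3, 6, 1, 2, 1, 1, 10, 0): "hypothetical entry",
    (1, 3, 6, 1, 2, 1, 2, 1, 0): 4,
}

def get_next(sorted_oids, oid):
    """Return the first OID strictly after `oid`, or None at the end of the MIB."""
    index = bisect.bisect_right(sorted_oids, oid)
    return sorted_oids[index] if index < len(sorted_oids) else None

def walk(table, start=(0,)):
    """Apply get_next repeatedly, starting at OID 0, until the MIB is exhausted."""
    oids = sorted(table)
    current = start
    while True:
        current = get_next(oids, current)
        if current is None:
            return
        yield current, table[current]

visited = [oid for oid, _value in walk(mib)]
print(len(visited), visited[0])
```

The request does not have to name an existing variable: asking for the successor of a subtree OID such as 1.3.6.1.2.1.1 returns the first variable beneath it.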
GetBulkRequest
Response
Returns variable bindings and acknowledgement from agent to manager for GetRequest,
SetRequest, GetNextRequest, GetBulkRequest and InformRequest. Error reporting is provided by
error-status and error-index fields. Although it was used as a response to both gets and sets, this
PDU was called GetResponse in SNMPv1.
Trap
InformRequest
commonly runs over UDP where delivery is not assured and dropped packets are not reported,
delivery of a Trap was not guaranteed. InformRequest fixes this by sending back an
acknowledgement on receipt. Receiver replies with Response parroting all information in the
InformRequest. This PDU was introduced in SNMPv2.
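The difference between an unacknowledged Trap and an acknowledged InformRequest can be simulated without a network. Everything here is a hypothetical model: the lossy channel is a plain function that drops the first few messages, and the retransmission loop is an assumption about how a sender might use the acknowledgement, not behaviour stated in these notes.

```python
def lossy_channel(drop_first: int):
    """Return a channel that loses the first `drop_first` messages, plus its state."""
    state = {"sent": 0, "delivered": []}

    def channel(payload):
        state["sent"] += 1
        if state["sent"] <= drop_first:
            return None                     # packet lost, nothing comes back
        state["delivered"].append(payload)
        return payload                      # Response echoes the InformRequest contents

    return channel, state

def send_trap(channel, payload):
    """Fire and forget: the sender never learns whether the trap arrived."""
    channel(payload)

def send_inform(channel, payload, max_attempts: int = 3):
    """Resend until a matching Response arrives; return the attempt count or None."""
    for attempt in range(1, max_attempts + 1):
        if channel(payload) == payload:
            return attempt
    return None

trap_channel, trap_state = lossy_channel(drop_first=1)
send_trap(trap_channel, "linkDown")

inform_channel, inform_state = lossy_channel(drop_first=1)
attempts = send_inform(inform_channel, "linkDown")
print(trap_state["delivered"], inform_state["delivered"], attempts)
```

With one dropped packet, the trap is silently lost while the inform is delivered on the second attempt.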
SNMP locates the network management component on one or more computers and locates the
managed component on multiple managed devices:
SNMP agent. An SNMP agent is any computer or other network device that monitors
and responds to queries from SNMP managers. The agent can also send a trap message to
the manager when specified events, such as a system reboot or illegal access, occur.
A computer on which you install SNMP management software is an SNMP manager, and a
computer on which you install agent software, such as the Microsoft SNMP agent included with
Windows Server 2003, is an SNMP agent. The SNMP manager displays the information it
receives in a user-friendly graphical user interface. You configure SNMP options, including
traps, on the SNMP agent, but the SNMP agent does not display the managed information that it
sends to an SNMP manager. For more information about SNMP requests and trap messages, see
SNMP Messages later in this section.
To enable SNMP communications between an SNMP manager and SNMP agents, you configure
the SNMP manager and the SNMP agents that it manages as members of an SNMP community.
The community name functions like a password to authenticate communications between the
SNMP manager and agent. The SNMP community is an SNMP-defined group, not a group
defined in the Active Directory directory service. For more information about SNMP
communities, see SNMP Communities later in this section.
An SNMP manager can request the following types of information from the SNMP agents that it
monitors:
If you assign the SNMP manager write permission for the SNMP agent, the SNMP manager can
also send a configuration request to the agent (using a Set message) to change a local parameter.
However, Set requests are limited to a small set of client parameters that have read-write access
defined. Most client parameters allow only read-only access.
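The combination of community membership and per-parameter access can be modelled in a few lines. The class ToyAgent, the community names "public" and "private", and the parameter names are hypothetical choices for this sketch. It is not the Microsoft SNMP service's API and it does no network I/O.

```python
class ToyAgent:
    """Minimal model of community-based access checks on an SNMP agent."""

    def __init__(self, communities, objects, writable):
        self.communities = communities    # community name -> "read-only" or "read-write"
        self.objects = objects            # parameter name -> current value
        self.writable = writable          # parameters defined with read-write access

    def get(self, community, name):
        """Return the value, or None when the community is unknown."""
        if community not in self.communities:
            return None
        return self.objects.get(name)

    def set(self, community, name, value):
        """Apply a Set only for a read-write community and a read-write parameter."""
        if self.communities.get(community) != "read-write":
            return False
        if name not in self.writable:
            return False
        self.objects[name] = value
        return True

agent = ToyAgent(
    communities={"public": "read-only", "private": "read-write"},
    objects={"sysContact": "nobody", "sysUpTime": 123456},
    writable={"sysContact"},
)

print(agent.get("public", "sysUpTime"))
print(agent.set("public", "sysContact", "admin"))    # refused: read-only community
print(agent.set("private", "sysUpTime", 0))          # refused: read-only parameter
print(agent.set("private", "sysContact", "admin"))   # accepted
```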
When an SNMP manager requests information from an SNMP agent, the SNMP agent retrieves
the current value of the requested information from the Management Information Base (MIB).
The MIB defines the managed objects that an SNMP manager monitors (or sometimes
configures) on an SNMP agent.
Each system in a network (workstation, server, router, bridge, and so forth) maintains a MIB that
reflects the status of the managed resources on that system, such as the version of the software
running on the device, the IP address assigned to a port or interface, the amount of free hard
drive space, or the number of open files. The MIB does not contain static data, but is instead an
object-oriented, dynamic database that provides a logical collection of managed object
definitions. The MIB defines the data type of each managed object and describes the object.
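The relationship between communities, access permissions, and MIB objects described above can be sketched as a toy in-memory agent. This is only an illustration under stated assumptions: the class names ToyAgent and ManagedObject, the community names "monitor" and "admin", and the "ro"/"rw" labels are hypothetical, and no real SNMP encoding or network traffic is involved. The two object identifiers are the standard sysDescr and sysContact objects.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class ManagedObject:
    """One MIB entry: a description, a data type, and accessor functions."""
    description: str
    data_type: type
    read: Callable[[], object]                         # fetches the current value on demand
    write: Optional[Callable[[object], None]] = None   # None means read-only


class ToyAgent:
    """Hypothetical agent: checks the community, then consults the MIB."""

    def __init__(self, communities: Dict[str, str]):
        # community name -> "ro" (read-only) or "rw" (read-write)
        self.communities = communities
        self.mib: Dict[str, ManagedObject] = {}

    def get(self, community: str, oid: str):
        if community not in self.communities:
            raise PermissionError("unknown community")
        return self.mib[oid].read()

    def set(self, community: str, oid: str, value) -> None:
        if self.communities.get(community) != "rw":
            raise PermissionError("community lacks write permission")
        obj = self.mib[oid]
        if obj.write is None:
            raise PermissionError("object is read-only")
        if not isinstance(value, obj.data_type):
            raise TypeError("value does not match the object's data type")
        obj.write(value)


# Values are looked up when requested, so the MIB holds definitions, not static data.
state = {"contact": "ops@example.com"}
agent = ToyAgent({"monitor": "ro", "admin": "rw"})
agent.mib["1.3.6.1.2.1.1.1.0"] = ManagedObject(
    "sysDescr", str, lambda: "toy agent 1.0")
agent.mib["1.3.6.1.2.1.1.4.0"] = ManagedObject(
    "sysContact", str, lambda: state["contact"],
    lambda v: state.update(contact=v))
```

In this sketch a Set succeeds only when both the community and the object allow writing, which mirrors the statement that most parameters are read-only.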
FIGURE shows a high-level overview of the Sun Netra CT900 server from the SNMP manager's
perspective. Fan trays and power entry modules (PEMs) are just a couple of examples of resources
that are manageable through the ShMM.
The following table describes each type of message that the SNMP manager and agent programs
use to communicate with each other.
SNMP Message | From / To | Message Description
Get | Manager / agent | Accesses and retrieves the current value of one or more MIB objects on an SNMP agent.
GetResponse | Agent / manager | Replies to a Get, GetNext, or Set operation.
GetNext | Manager / agent | Browses the entire tree of MIB objects, reading the values of variables in the MIB sequentially. Typically, you use GetNext to obtain information from selected columns from one or more rows of a table. GetNext is especially useful for browsing dynamic tables, such as an internal IP route table or an ARP table, reading through the table one row at a time.
GetBulk (SNMPv2c only) | Manager / agent | Retrieves data in units as large as possible within the given constraints on the message size. GetBulk, which accesses multiple values at one time without using a GetNext message, minimizes the number of protocol exchanges required to retrieve a large amount of information. To avoid fragmentation, restrict the maximum message size to a size smaller than the path maximum transmission unit (MTU), the largest frame size allowed for a single frame on your network. Typically, when it is not known how many rows are in a table, GetBulk is used (rather than GetNext) to browse all rows in the table.
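The difference between GetNext and GetBulk can be illustrated with a small in-memory model. The sketch below assumes a hypothetical ToyMib class that keeps object identifiers in lexicographic order; it does not speak the real protocol, and its get_bulk is a simplification that takes only a repetition count for a single starting object. The sample identifiers are the standard sysDescr object and the ifDescr and ifType columns of the interfaces table.

```python
from bisect import bisect_right


def oid_key(oid):
    """Turn '1.3.6.1' into (1, 3, 6, 1) so that OIDs sort numerically."""
    return tuple(int(part) for part in oid.split("."))


class ToyMib:
    """Hypothetical ordered store of OID -> value pairs."""

    def __init__(self, values):
        self._items = sorted((oid_key(o), v) for o, v in values.items())
        self._keys = [k for k, _ in self._items]

    def get_next(self, oid):
        """Return (oid, value) of the first object after `oid`, or None."""
        i = bisect_right(self._keys, oid_key(oid))
        if i == len(self._keys):
            return None
        key, value = self._items[i]
        return ".".join(map(str, key)), value

    def get_bulk(self, oid, max_repetitions):
        """Return up to `max_repetitions` successors in a single reply."""
        reply, current = [], oid
        for _ in range(max_repetitions):
            nxt = self.get_next(current)
            if nxt is None:
                break
            reply.append(nxt)
            current = nxt[0]
        return reply


def walk(mib, root):
    """Browse one subtree with repeated GetNext calls, one exchange per row."""
    prefix, rows, current = oid_key(root), [], root
    while True:
        nxt = mib.get_next(current)
        if nxt is None or oid_key(nxt[0])[:len(prefix)] != prefix:
            return rows
        rows.append(nxt)
        current = nxt[0]


mib = ToyMib({
    "1.3.6.1.2.1.1.1.0": "toy agent 1.0",   # sysDescr
    "1.3.6.1.2.1.2.2.1.2.1": "lo",          # ifDescr, row 1
    "1.3.6.1.2.1.2.2.1.2.2": "eth0",        # ifDescr, row 2
    "1.3.6.1.2.1.2.2.1.3.1": 24,            # ifType, row 1
    "1.3.6.1.2.1.2.2.1.3.2": 6,             # ifType, row 2
})
if_descr_rows = walk(mib, "1.3.6.1.2.1.2.2.1.2")
bulk_reply = mib.get_bulk("1.3.6.1.2.1.2.2.1.2", 3)
```

Here walk needs one request per row plus a final request that detects the end of the column, whereas get_bulk returns several successors in one reply, which is the saving in protocol exchanges that the table describes.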
SNMP transmits its messages over the connectionless User Datagram Protocol (UDP) service. UDP is
a simple transport that guarantees neither delivery nor correct sequencing of delivered packets,
and this simplicity lets SNMP continue functioning after many other network services have failed.
By default, UDP port 161 is used to listen for SNMP messages and port 162 is used to listen for
SNMP traps. If necessary (for example, because your organization already uses ports 161 and 162
for some other protocol or service), you can change these port settings by configuring the local
Services file (this Services file is different from the Windows Services snap-in). The following
figure shows messages moving between an SNMP manager and several SNMP agents.
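As a rough illustration of how port settings might be read from a Services file, the sketch below parses text in the usual "name port/protocol" layout and falls back to the defaults 161 and 162. The function name udp_ports_from_services and the sample entries are assumptions made for this example; the sample is not copied from a real system.

```python
DEFAULT_PORTS = {"snmp": 161, "snmptrap": 162}


def udp_ports_from_services(text):
    """Return the UDP ports for snmp and snmptrap, using defaults if absent."""
    ports = dict(DEFAULT_PORTS)
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments
        fields = line.split()
        if len(fields) < 2:
            continue                           # blank or malformed line
        name, port_proto = fields[0], fields[1]
        port, _, proto = port_proto.partition("/")
        if proto.lower() == "udp" and name in ports and port.isdigit():
            ports[name] = int(port)
    return ports


# Illustrative entries only: the SNMP port is moved, the trap port is left alone.
SAMPLE_SERVICES = """\
snmp       1161/udp              # moved off the default port
snmptrap   162/udp   snmp-trap
http       80/tcp
"""
configured = udp_ports_from_services(SAMPLE_SERVICES)
```

With the sample text above, the SNMP listener port changes while the trap port keeps its default value.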
Security implications
1. SNMP versions 1 and 2c are subject to packet sniffing of the clear text community string
from the network traffic, because they do not implement encryption.
2. All versions of SNMP are subject to brute force and dictionary attacks for guessing the
community strings, authentication strings, authentication keys, encryption strings, or
encryption keys, because they do not implement a challenge-response handshake.
3. Although SNMP works over TCP and other protocols, it is most commonly used over
UDP, which is connectionless and vulnerable to IP spoofing attacks. Thus, all versions are
subject to bypassing of device access lists that might have been implemented to restrict
SNMP access, though SNMPv3's other security mechanisms should prevent a successful
attack.
4. SNMP's powerful configuration (write) capabilities are not fully utilized by many
vendors, partly because of a lack of security in SNMP versions before SNMPv3 and
partly because many devices simply cannot be configured via individual
MIB object changes.
5. SNMP tops the list of the SANS Institute's Common Default Configuration Issues with
the issue of default SNMP community strings set to "public" and "private", and it was
number ten on the SANS Top 10 Most Critical Internet Security Threats for the year
2000.
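The default community string issue lends itself to a simple configuration check. The sketch below assumes a hypothetical inventory that maps device names to their configured community strings, and it only flags the two defaults named above; it does not contact any device.

```python
DEFAULT_COMMUNITIES = {"public", "private"}


def audit_communities(inventory):
    """Return a sorted list of (device, community) pairs that use a default string."""
    findings = []
    for device, communities in inventory.items():
        for community in communities:
            if community in DEFAULT_COMMUNITIES:
                findings.append((device, community))
    return sorted(findings)


# Hypothetical inventory for illustration; names and strings are made up.
inventory = {
    "core-switch": ["public", "n0c-r3ad"],
    "edge-router": ["private"],
    "file-server": ["x7-monitoring"],
}
findings = audit_communities(inventory)
```

A check like this addresses only the default strings. Under SNMP versions 1 and 2c, any community string still travels in clear text.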