
PEC, Vaniyambadi

Department of MCA
BX4002 - Problem Solving and Programming in C
UNIT I INTRODUCTION TO COMPUTER PROBLEM SOLVING

Introduction

Computer-based problem solving is a systematic process of designing, implementing
and using programming tools to solve a problem. It lets a solution be expressed in
terms closer to human logic than to machine logic. The final outcome of this process
is a software tool dedicated to solving the problem under consideration. Software is
a collection of computer programs, and a program is a set of instructions that guides
the computer’s hardware. These instructions need to be well specified for solving the
problem. After its creation, the software should be error free and well documented.
Software development is the process of creating such software so that it satisfies
the end user’s requirements and needs.

The following six steps must be followed to solve a problem using a computer.

1. Problem Analysis

Problem analysis is the process of defining a problem and decomposing the overall
system into smaller parts to identify the possible inputs, processes and outputs
associated with the problem. This task is further subdivided into six subtasks, namely:

1. Specifying the Objective :

First, we need to know what problem is actually being solved. How easily a clear
statement of the problem can be made depends on the size and complexity of the
problem. Smaller problems not involving multiple subsystems can easily be stated,
and then we can move on to the next step of “Program Design”. However, a
problem that interacts with various subsystems and a series of programs requires
complex analysis, in-depth research and careful coordination of people,
procedures and programs.

2. Specifying the Output :

Before identifying the inputs required for the system, we need to identify what
comes out of the system. The best way to specify the output is to prepare some
output forms and the required format for displaying results. The best person to
judge an output form is the end user of the system, i.e. the one who uses the
software. The programmer can design various forms, which must then be
examined to see whether they are useful or not.

3. Specifying Input Requirements :

After the outputs have been specified, the inputs and data required for the system
need to be specified as well. One needs to identify the list of inputs required
and the source of the data. For example, in a simple program to keep student
records, the inputs could be the student’s name, address, roll number, etc. The
sources could be the students themselves or the person supervising them.

4. Specifying Processing Requirements :

Once the outputs and inputs are specified, we need to specify the process that
converts the specified inputs into the desired output. If the proposed program is to
replace or supplement an existing one, a careful evaluation of the present processing
procedures needs to be made, noting any improvements that could be made. If the
proposed system is not designed to replace an existing system, then it is well
advised to carefully evaluate another system that addresses a similar problem.

5. Evaluating the Feasibility :

After the successful completion of all the above four steps, one needs to see
whether the things accomplished so far in the process of problem solving are
practical and feasible. To replace an existing system, one needs to determine
how the potential improvements outperform the existing system or other similar
systems.

6. Problem Analysis Documentation

Before concluding the problem analysis stage, it is best to record whatever has
been done so far in this first phase of program development. The record should
contain the statement of program objectives, output and input specifications,
processing requirements and feasibility.

2. Program Design - Algorithm, Flowchart and Pseudocode

The second stage in software development, or in the cycle of problem solving using a
computer, is program design. This stage consists of preparing algorithms, flowcharts
and pseudocode. Generally, this stage aims to make the program more user friendly,
feasible and optimized. The programmer needs only pen and paper in this step, in
which the tasks are first converted into a structured layout without the involvement
of a computer. In structured programming, a given task is divided into a number of
sub-tasks, which are termed modules. Each module is further divided until no
further division is required. This process of dividing a program into modules and
then into sub-modules is known as the “top down” design approach. Dividing a program
into modules (functions) breaks down a given programming task into small,
independent and manageable tasks.

In program design we are mainly interested in designing:

1. Algorithms
2. Flowcharts
3. Pseudocode

3. Coding

In this stage, the actual program is written. A coded program is most commonly
referred to as source code. Coding can be done in any language (high level or low
level). This is the stage in which the computer is actually used: the programmer
writes a sequence of instructions ready for execution. Coding is also known as
programming.

A good program possesses the following characteristics:

1. It contains comments that make the program readable and understandable by
people other than the original programmer.
2. It should be efficient.
3. It must be reliable enough to work under all reasonable conditions to provide a
correct output.
4. It must be able to detect unreasonable error conditions and report them to the
end user or programmer without crashing the system.
5. It should be easy to maintain and support after installation.

4. Compilation and Execution

Generally, coding is done in a high level language or a low level language (assembly
language). For the computer to understand these languages, they must be translated
into machine language. The translation is carried out by a compiler or interpreter
(for a high level language) or an assembler (for an assembly language program). The
machine language code thus created can be saved and run immediately or later on.

In an interpreted program, each program statement is translated and executed
immediately, one statement at a time in sequence, rather than the whole program being
translated before it is run. BASIC is one of the frequently used interpreted
languages. In contrast to an interpreter, a compiler converts the whole source code
into object code before execution. Because the translation is done once in advance,
compiled programs are generally faster and more efficient than interpreted programs.

Compilation Process

Source code must go through several steps before it becomes an executable program.
In the first step, the source code is checked for syntax errors. Once the syntax
errors have been found and corrected, the source file is passed through a compiler,
which translates the high level language into object code (machine code that is not
yet ready to be executed). A linker then links the object code with pre-compiled
library functions, thus creating an executable program. This executable program is
then loaded into memory for execution. The general compilation process is shown in
the figure below:

Figure : Compilation Process

5. Debugging and Testing

To understand debugging and testing more intuitively, let us first look at the
different types of errors that occur while programming.

Error

An error means that the computer program fails to compile or execute, or does not
give the expected results after execution. Debugging and testing are systematic
processes in the program development cycle for finding and removing errors in the
program. The different types of errors that we encounter while programming are
listed below:

Types of Error:

1. Syntax Error : A syntax error is a violation of the rules of the programming
language made while writing the program. A syntax error does not allow the code
to run. Syntax errors are easily detected by the compiler during the
compilation process.
2. Logical Error : A logical error occurs when a programmer has applied
incorrect logic for solving the problem or has left out a programming procedure.
When a logical error occurs, the program executes but fails to produce a correct
result.
3. Run Time Error : A run time error occurs during the execution of the program.
Stack overflow, division by zero, floating point errors, etc. are examples of
run time errors.
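The difference between a logical error and a run time error can be illustrated with a small C sketch. The function names `average_wrong`, `average_right` and `safe_divide` are hypothetical and chosen only for this illustration. A syntax error (for example, a missing semicolon) is not shown, because such code would be rejected by the compiler.

```c
#include <stddef.h>

/* Logical error: the code compiles and runs, but integer division
   discards the fractional part, so 7 / 2 gives 3 instead of 3.5. */
double average_wrong(int sum, int count)
{
    return sum / count;
}

/* Corrected logic: convert to double before dividing. */
double average_right(int sum, int count)
{
    return (double)sum / count;
}

/* Guarding against a run time error: integer division by zero is
   undefined behaviour in C, so the divisor is checked first.
   Returns 1 and stores the quotient on success, 0 on failure. */
int safe_divide(int dividend, int divisor, int *quotient)
{
    if (divisor == 0 || quotient == NULL)
        return 0;
    *quotient = dividend / divisor;
    return 1;
}
```

Calling `average_wrong(7, 2)` returns 3.0 while `average_right(7, 2)` returns 3.5; `safe_divide(1, 0, &q)` reports failure instead of crashing.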

Debugging

Debugging is the process of finding errors in a computer program and removing them;
left uncorrected, they lead to failure of the program. Even after full care is taken
during program design and coding, some errors may remain in the program, and these
errors appear during compilation, linking or execution. Debugging is generally done
by the program developer.

Testing

Testing is performed to verify whether the completed software package works
according to the expectations defined by the requirements. Testing is generally
performed by a testing team, which repeatedly executes the program with the intent
of finding errors. After testing, the list of errors and related information is sent
to the program developer or development team.

Debugging vs Testing

The major differences between debugging and testing are listed below:

1. Debugging is the process of fixing errors. Testing is the process of finding as
many errors as possible.
2. Debugging is done during the program development phase. Testing is done during
the testing phase, which comes after the development phase.
3. Debugging is done by the program developer. Testing is generally carried out by a
separate testing team rather than by the program developer.

6. Program Documentation

Program documentation is the process of collecting information about the
program. The documentation process runs from the problem analysis phase through to
debugging and testing. There are two types of documentation:

1. Programmer's Documentation
2. User's Documentation

Programmer's Documentation

Programmer’s documentation contains all the technical details. Without proper
documentation it is very difficult, even for the original programmer, to update and
maintain the program. A programmer’s documentation contains the information that a
programmer requires to update and maintain the program. This information includes:

1. Program analysis document, with a concise statement of the program’s objectives,
outputs and processing procedures.
2. Program design documents with appropriate flowcharts and diagrams.
3. Program verification documents outlining checking, testing and correction
procedures, along with the list of sample data and results.
4. Log used to document future program revision and maintenance activity.

User's Documentation

User documentation is required for the end user who installs and uses the program. It
consists of instructions for installing the program and a user manual.

The Problem Solving aspect

What is Problem Solving?

Problem solving is the act of defining a problem; determining the cause of the
problem; identifying, prioritizing, and selecting alternatives for a solution; and
implementing a solution.

Problem Solving Chart

The Problem-Solving Process

In order to effectively manage and run a successful organization, leadership must
guide their employees and develop problem-solving techniques. Finding a suitable
solution for issues can be accomplished by following the basic four-step
problem-solving process and methodology outlined below.

Steps and their characteristics:

Step 1. Define the problem
• Differentiate fact from opinion
• Specify underlying causes
• Consult each faction involved for information
• State the problem specifically
• Identify what standard or expectation is violated
• Determine in which process the problem lies
• Avoid trying to solve the problem without data

Step 2. Generate alternative solutions
• Postpone evaluating alternatives initially
• Include all involved individuals in the generating of alternatives
• Specify alternatives consistent with organizational goals
• Specify short- and long-term alternatives
• Brainstorm on others' ideas
• Seek alternatives that may solve the problem

Step 3. Evaluate and select an alternative
• Evaluate alternatives relative to a target standard
• Evaluate all alternatives without bias
• Evaluate alternatives relative to established goals
• Evaluate both proven and possible outcomes
• State the selected alternative explicitly

Step 4. Implement and follow up on the solution
• Plan and implement a pilot test of the chosen alternative
• Gather feedback from all affected parties
• Seek acceptance or consensus by all those affected
• Establish ongoing measures and monitoring
• Evaluate long-term results based on final solution

1. Define the problem

Diagnose the situation so that your focus is on the problem, not just its symptoms.
Helpful problem-solving techniques include using flowcharts to identify the expected
steps of a process and cause-and-effect diagrams to define and analyze root causes.

The sections below help explain key problem-solving steps. These steps support the
involvement of interested parties, the use of factual information, comparison of
expectations to reality, and a focus on root causes of a problem. You should begin by:

• Reviewing and documenting how processes currently work (i.e., who does
what, with what information, using what tools, communicating with what
organizations and individuals, in what time frame, using what format).
• Evaluating the possible impact of new tools and revised policies in the
development of your "what should be" model.

2. Generate alternative solutions

Postpone the selection of one solution until several problem-solving alternatives have
been proposed. Considering multiple alternatives can significantly enhance the value
of your ideal solution. Once you have decided on the "what should be" model, this
target standard becomes the basis for developing a road map for investigating
alternatives. Brainstorming and team problem-solving techniques are both useful tools
in this stage of problem solving.

Many alternative solutions to the problem should be generated before final evaluation.
A common mistake in problem solving is that alternatives are evaluated as they are
proposed, so the first acceptable solution is chosen, even if it’s not the best fit. If we
focus on trying to get the results we want, we miss the potential for learning
something new that will allow for real improvement in the problem-solving process.

3. Evaluate and select an alternative

Skilled problem solvers use a series of considerations when selecting the best
alternative. They consider the extent to which:

• A particular alternative will solve the problem without causing other
unanticipated problems.
• All the individuals involved will accept the alternative.
• Implementation of the alternative is likely.
• The alternative fits within the organizational constraints.

4. Implement and follow up on the solution

Leaders may be called upon to direct others to implement the solution, "sell" the
solution, or facilitate the implementation with the help of others. Involving others in
the implementation is an effective way to gain buy-in and support and minimize
resistance to subsequent changes.

Regardless of how the solution is rolled out, feedback channels should be built into
the implementation. This allows for continuous monitoring and testing of actual
events against expectations. Problem solving, and the techniques used to gain clarity,
are most effective if the solution remains in place and is updated to respond to future
changes.

Top down design

Top-down design is a method of breaking a problem down into smaller, less complex
pieces from the initial overall problem. Most “good” problems are too complex to
solve in just one step, so we divide the problem up into smaller manageable pieces,
solve each one of them and then bring everything back together again. The process of
making the steps more and more specific in top-down design is called stepwise
refinement.

Building a house is a useful analogy. If someone gave you a piece of land and told
you to “Build me a house”, you would not immediately go over and start nailing
2 x 4’s together. Building a house is a very complex undertaking, not to mention that
there are many rules, codes and laws that must be followed. To build a house you could
break the project up into smaller jobs, plan and do each one of these jobs (in the
correct order), and in the end you would have a house. You will notice in the
following top-down design diagram that some jobs get broken down several times,
until they are a manageable size.

We will be using top-down design (and top-down design diagrams, like the one above)
to help us understand a problem and all its components.
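As a programming counterpart to the house example, here is a minimal C sketch of top-down design. The task ("decide whether a student has passed"), the pass mark of 50 and all function names are assumptions made for illustration. The top-level function is written in terms of smaller jobs, and each job is refined into its own function.

```c
/* Level 2: add up the marks. */
int total_marks(const int marks[], int count)
{
    int total = 0;
    for (int i = 0; i < count; i++)
        total += marks[i];
    return total;
}

/* Level 2: average of the marks (0.0 for an empty list). */
double average_marks(const int marks[], int count)
{
    if (count <= 0)
        return 0.0;
    return (double)total_marks(marks, count) / count;
}

/* Level 2: pass/fail decision; the pass mark of 50 is an assumed value. */
int has_passed(double average)
{
    return average >= 50.0;
}

/* Level 1: the overall problem, expressed using the smaller pieces. */
int student_result(const int marks[], int count)
{
    double average = average_marks(marks, count);
    return has_passed(average);
}
```

Each function can be written and tested on its own, which is the practical benefit of stepwise refinement.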

Implementation of algorithm

How to Use Algorithms to Solve Problems?

An algorithm is a process or set of rules which must be followed to complete a
particular task. It is basically a step-by-step procedure for completing the task.
Every task follows some algorithm, from making a cup of tea to building highly
scalable software. An algorithm divides a task into several parts. If we write down
an algorithm for a task, the task becomes easier to complete.

Algorithms are used:

• To develop a framework for instructing computers.
• To introduce a notation of basic functions for performing basic tasks.
• To define and describe a big problem in small parts, so that it is easy
to execute.

Characteristics of Algorithm

1. An algorithm should be defined clearly.
2. An algorithm should produce at least one output.
3. An algorithm should have zero or more inputs.
4. An algorithm should be executed and finished in a finite number of steps.
5. Each step of an algorithm should be basic and easy to perform.
6. Each step should start with a specific label, such as “Step 1”.
7. There must be “Start” as the first step and “End” as the last step of the
algorithm.

Let’s take an example to make a cup of tea,

Step 1: Start

Step 2: Take some water in a bowl.

Step 3: Put the water on a gas burner.

Step 4: Turn on the gas burner

Step 5: Wait for some time until the water is boiled.

Step 6: Add some tea leaves to the water according to the requirement.

Step 7: Wait again for some time until the water takes on the color of tea.

Step 8: Then add some sugar according to taste.

Step 9: Wait again for some time until the sugar has dissolved.

Step 10: Turn off the gas burner and serve the tea in cups with biscuits.

Step 11: End

The steps above form an algorithm for making a cup of tea. Algorithms for computer
science problems are written in the same way.

There are some basic steps in writing an algorithm:

1. Start – Start the algorithm.
2. Input – Take the input values on which the algorithm will operate.
3. Conditions – Apply the required conditions and operations to the inputs to get
the desired output.
4. Output – Print the outputs.
5. End – End the execution.

Let’s take some examples of algorithms for computer science problems.

Example 1. Swap two numbers with a third variable

Step 1: Start

Step 2: Take 2 numbers as input.

Step 3: Declare another variable, “temp”.

Step 4: Store the value of the first variable in “temp”.

Step 5: Store the value of the second variable in the first variable.

Step 6: Store the value of “temp” in the second variable.

Step 7: Print the first and second variables.

Step 8: End
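The swap steps above might be written in C roughly as follows. The function name `swap_with_temp` and the use of pointers (so that the caller's variables are actually changed) are choices made for this sketch; reading input and printing are left to the caller.

```c
/* Swap the values of two integers using a temporary variable. */
void swap_with_temp(int *first, int *second)
{
    int temp = *first;   /* Step 4: store the first value in temp      */
    *first = *second;    /* Step 5: copy the second value to the first */
    *second = temp;      /* Step 6: copy temp to the second            */
}
```

For example, after `int a = 3, b = 7; swap_with_temp(&a, &b);` the variable `a` holds 7 and `b` holds 3.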

Example 2. Find the area of a rectangle

Step 1: Start

Step 2: Take the height and width of the rectangle as input.

Step 3: Declare a variable named “area”.

Step 4: Multiply height and width.

Step 5: Store the product in “area” (that is, area = height x width).

Step 6: Print “area”.

Step 7: End
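A possible C rendering of the rectangle algorithm is shown below. The name `rectangle_area` and the choice of `double` for the measurements are assumptions for this sketch; input and printing are again left to the caller.

```c
/* Area of a rectangle: area = height x width. */
double rectangle_area(double height, double width)
{
    double area = height * width;   /* Steps 3 to 5 of the algorithm */
    return area;
}
```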

Example 3. Find the greatest among 3 numbers.

Step 1: Start

Step 2: Take 3 numbers as input, say A, B, and C.

Step 3: Check if (A >= B and A >= C)

Step 4: Then A is the greatest

Step 5: Print A

Step 6: Else

Step 7: Check if (B >= A and B >= C)

Step 8: Then B is the greatest

Step 9: Print B

Step 10: Else C is the greatest

Step 11: Print C

Step 12: End
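The branching in this example can be sketched in C as below. The function name `greatest_of_three` is hypothetical. The sketch compares with `>=` rather than `>` so that inputs containing equal values (for example 5, 5, 1) still give the right answer.

```c
/* Return the greatest of three integers. */
int greatest_of_three(int a, int b, int c)
{
    if (a >= b && a >= c)
        return a;          /* A is the greatest */
    else if (b >= a && b >= c)
        return b;          /* B is the greatest */
    else
        return c;          /* otherwise C is the greatest */
}
```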

Advantages of Algorithm

• An algorithm uses a definite procedure.
• It is easy to understand because it is a step-by-step definition.
• An algorithm is easy to debug if an error occurs.
• It does not depend on any programming language.
• It is easier for a programmer to convert it into an actual program because the
algorithm divides a problem into smaller parts.

Disadvantages of Algorithms

• Writing an algorithm is time-consuming, and different algorithms have
different time complexities.
• Large tasks are difficult to express as algorithms because their time complexity
may be high, so programmers have to find an efficient way to solve the
task.
• Looping and branching are difficult to define in algorithms.

Program Verification
(See the Program Verification PDF.)

The efficiency of algorithms

Computer resources are limited and should be utilized efficiently. The efficiency of
an algorithm is defined in terms of the amount of computational resources it uses.
An algorithm must be analyzed to determine its resource usage. The efficiency of an
algorithm can be measured based on the usage of different resources.

For maximum efficiency we wish to minimize resource usage. However, different
resources such as time and space cannot be compared directly, so time complexity and
space complexity are usually considered separately when judging the efficiency of an
algorithm.

Method for determining Efficiency

The efficiency of an algorithm depends on how efficiently it uses time and memory
space.

The time efficiency of an algorithm can be measured empirically. For example, write
a program for the algorithm in any programming language, execute it, and measure the
total time it takes to run. The execution time measured in this way depends on a
number of factors such as:

· Speed of the machine
· Compiler and other system software tools
· Operating system
· Programming language used
· Volume of data required

However, to determine how efficiently an algorithm solves a given problem, you would
like to determine how the execution time is affected by the nature of the algorithm.
Therefore, we need to develop fundamental laws that determine the efficiency of a
program in terms of the nature of the underlying algorithm.
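The empirical measurement described above can be sketched in C with the standard `clock()` function from `<time.h>`. The workload `sum_to` and the helper names are hypothetical. The measured time will differ from machine to machine and from run to run, which is exactly why it says little about the algorithm itself.

```c
#include <time.h>

/* Convert two clock() readings into elapsed processor time in seconds. */
double elapsed_seconds(clock_t start, clock_t end)
{
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Example workload: sum the integers 1..n with a simple loop. */
long long sum_to(long long n)
{
    long long total = 0;
    for (long long i = 1; i <= n; i++)
        total += i;
    return total;
}

/* Measure how long sum_to(n) takes; the sum is stored through *result. */
double time_sum_to(long long n, long long *result)
{
    clock_t start = clock();
    *result = sum_to(n);
    clock_t end = clock();
    return elapsed_seconds(start, end);
}
```

Running `time_sum_to` for increasing values of `n` on the same machine gives a rough picture of how the running time grows, but an optimizing compiler may change the numbers considerably.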

Space-Time tradeoff

A space-time or time-memory tradeoff is a way of solving a problem in less time by
using more storage space, or in very little space by spending more time.

To solve a given programming problem, many different algorithms may be used. Some
of these algorithms may be extremely time-efficient and others extremely
space-efficient.

Time/space trade off refers to a situation where you can reduce the use of memory at the
cost of slower program execution, or reduce the running time at the cost of increased
memory usage.
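One common illustration of this tradeoff, chosen here as an example and not taken from the notes, is counting the 1-bits in a 32-bit value. The first function uses almost no extra memory but loops over every bit; the second spends 256 bytes on a precomputed table and then needs only four lookups. All names are hypothetical.

```c
#include <stdint.h>

/* Less space, more time: examine the bits one at a time. */
int count_bits_loop(uint32_t value)
{
    int count = 0;
    while (value != 0) {
        count += (int)(value & 1u);
        value >>= 1;
    }
    return count;
}

/* More space, less time: a 256-entry table holding the bit count of
   every possible byte, filled in on first use. */
static unsigned char bit_table[256];
static int table_ready = 0;

static void build_bit_table(void)
{
    for (int i = 0; i < 256; i++)
        bit_table[i] = (unsigned char)count_bits_loop((uint32_t)i);
    table_ready = 1;
}

int count_bits_table(uint32_t value)
{
    if (!table_ready)
        build_bit_table();
    return bit_table[value & 0xFFu]
         + bit_table[(value >> 8) & 0xFFu]
         + bit_table[(value >> 16) & 0xFFu]
         + bit_table[(value >> 24) & 0xFFu];
}
```

Both functions return the same answers; which one is preferable depends on whether memory or running time is the scarcer resource.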

Asymptotic Notations

Asymptotic notations are a language for making meaningful statements about time and
space complexity. The following three asymptotic notations are mostly used to
represent the time complexity of algorithms:

(i) Big O

Big O describes an upper bound on growth and is often used to describe the worst case
of an algorithm.

(ii) Big Ω

Big Omega is the counterpart of Big O: where Big O is used to describe the upper
bound (worst case) of an asymptotic function, Big Omega is used to describe the lower
bound (best case).

(iii) Big Θ

When the lower bound and the upper bound of an algorithm coincide, for example when
its complexity is both O(n log n) and Ω(n log n), the algorithm has complexity
Θ(n log n). This means that the running time of the algorithm grows in proportion to
n log n in both the best case and the worst case.
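For reference, the three notations are usually defined formally as follows. This is the standard textbook formulation, stated here for non-negative functions $f$ and $g$; the symbols $c$ and $n_0$ are the usual constants and do not appear in the notes above.

```latex
\begin{align*}
f(n) = O(g(n))      &\iff \exists\, c > 0,\ n_0 : \ 0 \le f(n) \le c\,g(n) \ \text{ for all } n \ge n_0 \\
f(n) = \Omega(g(n)) &\iff \exists\, c > 0,\ n_0 : \ 0 \le c\,g(n) \le f(n) \ \text{ for all } n \ge n_0 \\
f(n) = \Theta(g(n)) &\iff f(n) = O(g(n)) \ \text{ and } \ f(n) = \Omega(g(n))
\end{align*}
```

As a small example, $3n + 2 = \Theta(n)$ because $3n \le 3n + 2 \le 5n$ for all $n \ge 1$.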

Best, Worst, and Average Case Efficiency

Let us assume a list of n values stored in an array. Suppose we want to search for a
particular element (the key) in this list, using an algorithm that compares the key
with each element in the list sequentially.

The best case would be if the first element in the list matches with the key element to be
searched in a list of elements. The efficiency in that case would be expressed as O(1)
because only one comparison is enough.

Similarly, the worst case in this scenario would be if the complete list is searched
and the element is found only at the end of the list or is not found in the list. The
efficiency of the algorithm in that case would be expressed as O(n) because n
comparisons are required to complete the search.

The average case efficiency of an algorithm can be obtained by finding the average
number of comparisons as given below:

Minimum number of comparisons = 1

Maximum number of comparisons = n

If the element is not found, the number of comparisons is also n.

Assuming the element is present and is equally likely to be at any of the n
positions, the average number of comparisons = (1 + 2 + ... + n)/n = (n + 1)/2

Hence the average case efficiency will be expressed as O(n).
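The comparison counts discussed above can be checked with a small C sketch of sequential search. The out-parameter `comparisons` and the helper `average_comparisons` are additions made for this illustration, and the average assumes that the list holds distinct values and that every position is equally likely to hold the key.

```c
/* Sequential search: returns the index of key in list[0..n-1], or -1 if
   it is absent. The number of key comparisons made is stored in
   *comparisons. */
int linear_search(const int list[], int n, int key, int *comparisons)
{
    *comparisons = 0;
    for (int i = 0; i < n; i++) {
        (*comparisons)++;
        if (list[i] == key)
            return i;
    }
    return -1;
}

/* Average number of comparisons over all successful searches,
   searching once for each element of the list. */
double average_comparisons(const int list[], int n)
{
    long total = 0;
    int used = 0;
    for (int i = 0; i < n; i++) {
        linear_search(list, n, list[i], &used);
        total += used;
    }
    return n > 0 ? (double)total / n : 0.0;
}
```

For a list of 6 distinct values, the best case uses 1 comparison, the worst case uses 6, and the average over successful searches is (6 + 1)/2 = 3.5.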

The analysis of algorithms

The analysis of an algorithm generally focuses on CPU (time) usage, memory
usage, disk usage, and network usage. All are important, but the main concern is
usually CPU time. Be careful to differentiate between:

• Performance: How much time/memory/disk/etc. is used when a program is
run. This depends on the machine, compiler, etc. as well as the code we write.
• Complexity: How the resource requirements of a program or algorithm
scale, i.e. what happens as the size of the problem being solved by the code
gets larger.

Algorithm Analysis:
Algorithm analysis is an important part of computational complexity theory, which
provides theoretical estimation for the required resources of an algorithm to solve a
specific computational problem. Analysis of algorithms is the determination of the
amount of time and space resources required to execute it.

Why Analysis of Algorithms is important?

• To predict the behavior of an algorithm without implementing it on a specific
computer.
• It is much more convenient to have simple measures for the efficiency of an
algorithm than to implement the algorithm and test the efficiency every time a
certain parameter in the underlying computer system changes.
• It is impossible to predict the exact behavior of an algorithm. There are too
many influencing factors.
• The analysis is thus only an approximation; it is not perfect.
• More importantly, by analyzing different algorithms, we can compare them to
determine the best one for our purpose.


1.1 Why Analyze an Algorithm?

The most straightforward reason for analyzing an algorithm is to discover its
characteristics in order to evaluate its suitability for various applications or compare it
with other algorithms for the same application. Moreover, the analysis of an algorithm
can help us understand it better, and can suggest informed improvements. Algorithms
tend to become shorter, simpler, and more elegant during the analysis process.

1.2 Computational Complexity.

The branch of theoretical computer science where the goal is to classify algorithms
according to their efficiency and computational problems according to their inherent
difficulty is known as computational complexity. Paradoxically, such classifications
are typically not useful for predicting performance or for comparing algorithms in
practical applications because they focus on order-of-growth worst-case performance.
In this book, we focus on analyses that can be used to predict performance and
compare algorithms.

1.3 Analysis of Algorithms.

A complete analysis of the running time of an algorithm involves the following steps:

• Implement the algorithm completely.
• Determine the time required for each basic operation.
• Identify unknown quantities that can be used to describe the frequency of
execution of the basic operations.
• Develop a realistic model for the input to the program.
• Analyze the unknown quantities, assuming the modelled input.
• Calculate the total running time by multiplying the time by the frequency for
each operation, then adding all the products.
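The final step in the list above is a simple sum of products, sketched here in C. The function name and the parallel-array layout (one cost and one frequency per basic operation) are assumptions made for this illustration.

```c
/* Total running time = sum over all basic operations of
   (time per execution) x (number of executions). */
double total_running_time(const double time_per_op[],
                          const long frequency[],
                          int operations)
{
    double total = 0.0;
    for (int i = 0; i < operations; i++)
        total += time_per_op[i] * (double)frequency[i];
    return total;
}
```

For example, with two operations costing 2.0 and 0.5 time units that execute 10 and 4 times, the total is 2.0 x 10 + 0.5 x 4 = 22 time units.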

Classical algorithm analysis on early computers could result in exact predictions of
running times. Modern systems and algorithms are much more complex, but modern
analyses are informed by the idea that exact analysis of this sort could be performed
in principle.

1.4 Average-Case Analysis.

Elementary probability theory gives a number of different ways to compute the
average value of a quantity. While they are quite closely related, it will be convenient
for us to explicitly identify two different approaches to compute the mean.

• Distributional. Let $\Pi_N$ be the number of possible inputs of size $N$ and
$\Pi_{Nk}$ be the number of inputs of size $N$ that cause the algorithm to have cost
$k$, so that $\Pi_N = \sum_k \Pi_{Nk}$. Then the probability that the cost is $k$ is
$\Pi_{Nk}/\Pi_N$ and the expected cost is
$$\frac{1}{\Pi_N}\sum_k k\,\Pi_{Nk}.$$
The analysis depends on "counting." How many inputs are there of size $N$ and how
many inputs of size $N$ cause the algorithm to have cost $k$? These are the steps to
compute the probability that the cost is $k$, so this approach is perhaps the most
direct from elementary probability theory.
• Cumulative. Let $\Sigma_N$ be the total (or cumulated) cost of the algorithm on all
inputs of size $N$. (That is, $\Sigma_N = \sum_k k\,\Pi_{Nk}$, but the point is that
it is not necessary to compute $\Sigma_N$ in that way.) Then the average cost is
simply $\Sigma_N/\Pi_N$. The analysis depends on a less specific counting problem:
what is the total cost of the algorithm, on all inputs? We will be using general
tools that make this approach very attractive.

The distributional approach gives complete information, which can be used directly to
compute the standard deviation and other moments. Indirect (often simpler) methods
are also available for computing moments when using the other approach, as we will
see. In this book, we consider both approaches, though our tendency will be towards
the cumulative method, which ultimately allows us to consider the analysis of
algorithms in terms of combinatorial properties of basic data structures.
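The two approaches can be compared on a small case. The sketch below assumes a toy cost model that is not taken from the text: the cost of an input is the number of times the running maximum is updated in a left-to-right scan. It enumerates all permutations of size N and computes the mean both ways; the class and field names are hypothetical.

```java
// A minimal sketch: average cost over all N! permutations, computed by the
// distributional and the cumulative method. The cost model is an assumption.
class AverageCase
{
    final long[] countByCost;  // countByCost[k] plays the role of Pi_{Nk}
    long totalCost;            // plays the role of Sigma_N
    long inputs;               // plays the role of Pi_N

    AverageCase(int n)
    {
        countByCost = new long[n + 1];
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = i + 1;
        enumerate(a, 0);
    }

    // Assumed cost model: number of updates to the running maximum.
    static int cost(int[] a)
    {
        int updates = 0, max = Integer.MIN_VALUE;
        for (int x : a)
            if (x > max) { max = x; updates++; }
        return updates;
    }

    // Visit every permutation by swapping each candidate into position pos.
    private void enumerate(int[] a, int pos)
    {
        if (pos == a.length)
        {
            int k = cost(a);
            countByCost[k]++;
            totalCost += k;
            inputs++;
            return;
        }
        for (int i = pos; i < a.length; i++)
        {
            swap(a, pos, i);
            enumerate(a, pos + 1);
            swap(a, pos, i);   // restore the array before the next choice
        }
    }

    private static void swap(int[] a, int i, int j)
    { int t = a[i]; a[i] = a[j]; a[j] = t; }

    // Distributional: sum over k of k times the probability that the cost is k.
    double distributional()
    {
        double mean = 0.0;
        for (int k = 0; k < countByCost.length; k++)
            mean += k * ((double) countByCost[k] / inputs);
        return mean;
    }

    // Cumulative: total cost on all inputs divided by the number of inputs.
    double cumulative()
    { return (double) totalCost / inputs; }

    public static void main(String[] args)
    {
        AverageCase ac = new AverageCase(4);
        System.out.printf("distributional: %.4f%n", ac.distributional());
        System.out.printf("cumulative:     %.4f%n", ac.cumulative());
    }
}
```

For N = 4 there are 24 inputs with total cost 50, so both methods give 50/24 ≈ 2.0833. The cumulative method needs only the running total, while the distributional method keeps the whole histogram, which is what makes the standard deviation directly available.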

1.5 Example: Analysis of quicksort.

The classical quicksort algorithm was published by C.A.R. Hoare in 1962:


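public class Quick
{
    // Partition a[lo..hi] around the element a[lo]; return its final index.
    private static int partition(Comparable[] a, int lo, int hi)
    {
        int i = lo, j = hi+1;
        while (true)
        {
            // Scan right while items are less than the partitioning element.
            while (less(a[++i], a[lo])) if (i == hi) break;
            // Scan left while items are greater than the partitioning element.
            while (less(a[lo], a[--j])) if (j == lo) break;
            if (i >= j) break;      // pointers have met or crossed
            exch(a, i, j);
        }
        exch(a, lo, j);             // put the partitioning element in place
        return j;
    }

    private static void sort(Comparable[] a, int lo, int hi)
    {
        if (hi <= lo) return;
        int j = partition(a, lo, hi);
        sort(a, lo, j-1);
        sort(a, j+1, hi);
    }

    // Helpers used above: is v less than w, and exchange a[i] with a[j].
    private static boolean less(Comparable v, Comparable w)
    { return v.compareTo(w) < 0; }

    private static void exch(Comparable[] a, int i, int j)
    { Comparable t = a[i]; a[i] = a[j]; a[j] = t; }
}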

To analyze this algorithm, we start by defining a cost model (running time) and an
input model (randomly ordered distinct elements). To separate the analysis from the
implementation, we define $C_N$ to be the number of compares to sort $N$ elements
and analyze $C_N$ (hypothesizing that the running time for any implementation will
be $\sim aC_N$ for some implementation-dependent constant $a$). Note the following
properties of the algorithm:

• $N+1$ compares are used for partitioning.
• The probability that the partitioning element is the $k$th smallest (counting
from zero) is $1/N$ for $k$ between $0$ and $N-1$.
• The sizes of the two subarrays to be sorted in that case are $k$ and $N-k-1$.
• The two subarrays are randomly ordered after partitioning.

These imply a mathematical expression (a recurrence relation) that derives directly
from the recursive program:
$$C_N = N + 1 + \sum_{0 \le k \le N-1} \frac{1}{N}\left(C_k + C_{N-k-1}\right)$$
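The recurrence can be evaluated directly for small N, which gives concrete values to check the later algebra against. This is a minimal sketch that assumes the base case C_0 = 0; the class and method names are hypothetical.

```java
// A minimal sketch: evaluate the quicksort recurrence term by term.
// Assumes C_0 = 0. Quadratic time, so only suitable for small maxN.
class QuickRecurrence
{
    // Returns c[0..maxN] with c[N] = N + 1 + (1/N) * sum_{k=0}^{N-1} (c[k] + c[N-k-1]).
    static double[] compares(int maxN)
    {
        double[] c = new double[maxN + 1];   // c[0] = 0 by default
        for (int n = 1; n <= maxN; n++)
        {
            double sum = 0.0;
            for (int k = 0; k <= n - 1; k++)
                sum += c[k] + c[n - k - 1];  // costs of the two subarrays
            c[n] = n + 1 + sum / n;          // partitioning cost plus average
        }
        return c;
    }

    public static void main(String[] args)
    {
        double[] c = compares(5);
        for (int n = 1; n <= 5; n++)
            System.out.printf("C_%d = %.4f%n", n, c[n]);
    }
}
```

The first values are C_1 = 2, C_2 = 5 and C_3 = 26/3 ≈ 8.6667.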
This equation is easily solved with a series of simple albeit mysterious algebraic steps.
First, apply symmetry, multiply by $N$, subtract the same equation for $N-1$ and
rearrange terms to get a simpler recurrence.
$$\begin{aligned}
C_N &= N + 1 + \frac{2}{N}\sum_{0 \le k \le N-1} C_k \\
N C_N &= N(N+1) + 2\sum_{0 \le k \le N-1} C_k \\
N C_N - (N-1)C_{N-1} &= N(N+1) - (N-1)N + 2C_{N-1} \\
N C_N &= (N+1)C_{N-1} + 2N
\end{aligned}$$
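Note that this simpler recurrence gives an efficient algorithm to compute the exact
answer. To solve it, divide both sides by $N(N+1)$ and telescope.
$$\begin{aligned}
N C_N &= (N+1)C_{N-1} + 2N \quad \text{for } N > 1 \text{ with } C_1 = 2 \\
\frac{C_N}{N+1} &= \frac{C_{N-1}}{N} + \frac{2}{N+1} \\
&= \frac{C_{N-2}}{N-1} + \frac{2}{N} + \frac{2}{N+1} \\
&= 2H_{N+1} - 2 \\
C_N &= 2(N+1)H_{N+1} - 2(N+1) = 2(N+1)H_N - 2N.
\end{aligned}$$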
The result is an exact expression in terms of the Harmonic numbers.
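As a quick sanity check, the closed form can be compared numerically with the simplified recurrence. The sketch below is only an illustration; the class and method names are hypothetical, and the Harmonic number is computed by direct summation.

```java
// A minimal sketch: compare the closed form 2(N+1)H_N - 2N with the
// simplified recurrence N C_N = (N+1) C_{N-1} + 2N, taking C_0 = 0.
class QuickExact
{
    // H_n = 1 + 1/2 + ... + 1/n, by direct summation.
    static double harmonic(int n)
    {
        double h = 0.0;
        for (int k = 1; k <= n; k++) h += 1.0 / k;
        return h;
    }

    static double closedForm(int n)
    { return 2.0 * (n + 1) * harmonic(n) - 2.0 * n; }

    static double byRecurrence(int n)
    {
        double c = 0.0;                        // C_0
        for (int i = 1; i <= n; i++)
            c = (i + 1) * c / i + 2;           // C_i from C_{i-1}
        return c;
    }

    public static void main(String[] args)
    {
        int[] sizes = { 1, 2, 3, 10, 100 };
        for (int n : sizes)
            System.out.printf("%5d %12.4f %12.4f%n",
                              n, closedForm(n), byRecurrence(n));
    }
}
```

The two columns should agree up to floating-point rounding.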

1.6 Asymptotic Approximations

The Harmonic numbers can be approximated by an integral (see Chapter 3),
$$H_N \sim \ln N,$$
leading to the simple asymptotic approximation
$$C_N \sim 2N\ln N.$$
It is always a good idea to validate our math with a program. This code
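public class QuickCheck
{
    public static void main(String[] args)
    {
        int maxN = Integer.parseInt(args[0]);

        // c[N] is the exact value of C_N, computed from the simplified
        // recurrence N C_N = (N+1) C_{N-1} + 2N with C_0 = 0.
        double[] c = new double[maxN+1];
        c[0] = 0;
        for (int N = 1; N <= maxN; N++)
            c[N] = (N+1)*c[N-1]/N + 2;

        // Print the exact value next to the approximation 2N ln N - 2N.
        for (int N = 10; N <= maxN; N *= 10)
        {
            double approx = 2*N*Math.log(N) - 2*N;
            System.out.printf("%10d %15.2f %15.2f\n", N, c[N], approx);
        }
    }
}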
produces this output.
% java QuickCheck 1000000
        10           44.44           26.05
       100          847.85          721.03
      1000        12985.91        11815.51
     10000       175771.70       164206.81
    100000      2218053.41      2102585.09
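The discrepancy in the table is explained by the lower-order terms that the
approximation omits. The program already subtracts the $2N$ term, so what remains
comes from using $\ln N$ in place of a more accurate approximation to the integral:
$H_N \approx \ln N + \gamma$, with Euler's constant $\gamma \approx 0.5772$, adds a
term of about $2\gamma N \approx 1.154N$.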

1.7 Distributions.

It is possible to use similar methods to find the standard deviation and other moments.
The standard deviation of the number of compares used by quicksort is
$\sqrt{7 - 2\pi^2/3}\,N \approx 0.6482776N$. Because this grows only linearly while
the mean grows as $2N\ln N$, the number of compares is not likely to be far from the
mean for large $N$. Does the number of compares obey a normal distribution? No.
Characterizing this distribution is a difficult research challenge.
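The concentration claim can be made concrete by tabulating the ratio of the standard deviation to the mean. The sketch below uses the leading-term approximation 2N ln N for the mean, so the ratios are only indicative; the class and method names are hypothetical.

```java
// A minimal sketch: ratio of standard deviation to (approximate) mean for
// the number of quicksort compares. Uses mean ~ 2N ln N, leading term only.
class QuickSpread
{
    // Coefficient of N in the standard deviation: sqrt(7 - 2*pi^2/3).
    static final double SIGMA_COEFF =
        Math.sqrt(7.0 - 2.0 * Math.PI * Math.PI / 3.0);

    static double stddev(int n)     { return SIGMA_COEFF * n; }
    static double approxMean(int n) { return 2.0 * n * Math.log(n); }

    // Relative spread; equals SIGMA_COEFF / (2 ln n), so it shrinks as n grows.
    static double ratio(int n)      { return stddev(n) / approxMean(n); }

    public static void main(String[] args)
    {
        for (int n = 1000; n <= 1000000; n *= 10)
            System.out.printf("%10d %10.4f%n", n, ratio(n));
    }
}
```

The ratio decreases like 1/ln N, which is slow but enough to make large relative deviations from the mean increasingly unlikely as N grows.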

1.8 Probabilistic Algorithms.

Is our assumption that the input array is randomly ordered a valid input model? Yes,
because we can randomly order the array before the sort. Doing so turns quicksort
into a randomized algorithm whose good performance is guaranteed by the laws of
probability.
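One way to randomly order the array before sorting is a Fisher-Yates shuffle. The sketch below is an illustration on int arrays rather than the text's Comparable version; the class name and the choice of java.util.Random are assumptions.

```java
import java.util.Random;

// A minimal sketch: shuffle first, then quicksort, so that the random-order
// input model holds regardless of how the caller's array was arranged.
class RandomizedQuick
{
    static void sort(int[] a)
    {
        shuffle(a, new Random());
        sort(a, 0, a.length - 1);
    }

    // Fisher-Yates shuffle: every permutation is equally likely,
    // assuming the generator is uniform.
    static void shuffle(int[] a, Random rnd)
    {
        for (int i = a.length - 1; i > 0; i--)
            swap(a, i, rnd.nextInt(i + 1));   // random index in 0..i
    }

    private static void sort(int[] a, int lo, int hi)
    {
        if (hi <= lo) return;
        int j = partition(a, lo, hi);
        sort(a, lo, j - 1);
        sort(a, j + 1, hi);
    }

    // Same partitioning scheme as in the text, specialized to int keys.
    private static int partition(int[] a, int lo, int hi)
    {
        int i = lo, j = hi + 1, v = a[lo];
        while (true)
        {
            while (a[++i] < v) if (i == hi) break;
            while (v < a[--j]) if (j == lo) break;
            if (i >= j) break;
            swap(a, i, j);
        }
        swap(a, lo, j);
        return j;
    }

    private static void swap(int[] a, int i, int j)
    { int t = a[i]; a[i] = a[j]; a[j] = t; }

    public static void main(String[] args)
    {
        int[] a = { 1, 2, 3, 4, 5, 6, 7, 8 };   // already sorted input
        sort(a);
        System.out.println(java.util.Arrays.toString(a));
    }
}
```

The shuffle costs linear time, which is small next to the sort itself.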

It is always a good idea to validate our models and analysis by running experiments.
Detailed experiments by many people on many computers have done so for quicksort
over the past several decades.

In this case, a flaw in the model for some applications is that the array items need not
be distinct. Faster implementations are possible for this case, using three-way
partitioning.
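The text does not say which three-way scheme it has in mind. The sketch below shows one common form, usually attributed to Dijkstra, on int arrays; the class name is hypothetical.

```java
// A minimal sketch of quicksort with three-way partitioning: keys equal to
// the partitioning element are grouped in the middle and never sorted again.
class Quick3way
{
    static void sort(int[] a)
    { sort(a, 0, a.length - 1); }

    private static void sort(int[] a, int lo, int hi)
    {
        if (hi <= lo) return;
        int lt = lo, gt = hi, i = lo + 1;
        int v = a[lo];                       // partitioning element
        while (i <= gt)
        {
            if      (a[i] < v) swap(a, lt++, i++);   // grow the "less" region
            else if (a[i] > v) swap(a, i, gt--);     // grow the "greater" region
            else               i++;                  // equal key stays in the middle
        }
        // Now a[lo..lt-1] < v, a[lt..gt] == v, and a[gt+1..hi] > v.
        sort(a, lo, lt - 1);
        sort(a, gt + 1, hi);
    }

    private static void swap(int[] a, int i, int j)
    { int t = a[i]; a[i] = a[j]; a[j] = t; }

    public static void main(String[] args)
    {
        int[] a = { 2, 1, 2, 3, 1, 3, 2, 2, 1 };
        sort(a);
        System.out.println(java.util.Arrays.toString(a));
    }
}
```

When many keys are equal, the middle region is large and both recursive calls work on much smaller subarrays.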

Fundamental Algorithms
************** See Fundamental Algorithms PDF***************
