Content Based Search Final Document

Content Based file search is a Java application to find files that contain (or don't contain) a given string. The speed of this program depends upon the speed of computer's hardware and the complexity of the search string.

Uploaded by

sivasatish007
Copyright
© Attribution Non-Commercial (BY-NC)

PROJECT REPORT

CONTENT BASED SEARCH BY
RETRIEVING THE FILES

A thesis submitted in partial fulfillment of the requirement for
the Award of Degree of

BACHELOR OF TECHNOLOGY (Computer Sciences)

NIMRA COLLEGE OF ENGINEERING AND TECHNOLOGY
VIJAYAWADA

Zulfikar Ali.Md (06231A05C0)
Sudeesha.M (06231A05A4)
Ritesh Abhishekh (06231A0570)
Sajida Bhanu (06231A0572)
Pramod.G (06231A0562)

Under the esteemed guidance of

Miss. G.ANITHA, B.Tech (CSE)
Lecturer

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

NIMRA COLLEGE OF ENGINEERING AND TECHNOLOGY


(Affiliated to Jawaharlal Nehru Technological University)
AN ISO 9001-2000 CERTIFIED INSTITUTION

JUPUDI, VIJAYAWADA, AP. MAY, 2009

ABSTRACT

Content Based File Search is a Java application to find files that
contain (or don’t contain) a given string. The string may be in plain
text or it may be a Java regular expression. Such a trivial search
should be part of the operating system, and in fact, once was. As
bigger and more impressive features were added to Windows, it lost
the ability to search files for arbitrary bytes of text. Windows
98/ME/2000 could find words buried in files with unknown formats;
Windows XP and Vista search only their supported file types. As files
are created by applications, downloaded from the Internet, or received
via email, the file system can become quite full of important content
located throughout the system. Whether these files are carefully filed
away in deeply nested hierarchical folders, or haphazardly filed away
in a nearly flat system, at some point that data probably needs to be
accessed again. It is at this point that the problem of desktop search
becomes apparent. In a system consisting of gigabytes of data spread
over thousands or even millions of files, it is important to have a
more efficient search engine for the desktop. The speed of this
program depends upon the speed of the computer’s hardware and the
complexity of the search string. When searching for plain ASCII text
or Unicode characters from 0x20 to 0x7E, the “(raw data bytes)”
encoding is about 40% faster than the local system’s “(default
encoding)”.


Introduction

The capacity of our hard-disk drives has increased tremendously over
the past decade, and so has the number of files we usually store on
our computers. It is no wonder that sometimes we cannot find a
document any more, even when we know we saved it somewhere. The
recent arrival of desktop search applications, which index all data
on a PC, promises to increase search efficiency on the desktop.
Still, these search applications are weaker than their web
counterparts. Unfortunately, they also fall short of utilizing
desktop-specific characteristics, especially context information. For
example, one file might contain a question describing the object one
is looking for, and another file in the same thread might include the
answer to that question in the form of an attached document. The
search functionality in earlier versions of Windows searches all
files for the specified string and may return a large number of
irrelevant files such as program and configuration files. Windows XP
and Vista search only their supported file types. As a result of this
change, the search functionality finds the same set of files whether
the Content Index service is turned on or off. In previous versions
of Windows, the computer exhibited different behavior if you turned
on the Content Index service.

Content-based file retrieval was initially proposed in the 1990s to
overcome the difficulties encountered in keyword-based file search.
Since then, it has been an active research topic, and a lot of
algorithms have been published in the literature. In keyword-based
search, files have to be manually annotated with keywords. As keyword
annotation is a tedious process, it is impractical to annotate a
large number of files. Furthermore, annotation may be inconsistent.
In content-based retrieval, by contrast, the feature extraction can
be performed automatically. Thus, the human labeling process can be
avoided.


Context and Content


This metric brings about two points. First, the context of the
search - what documents and text you have open or have recently
modified - could help immensely, and since this is search done on a
local computer that information could be accessible. Second, it points
out that a text-based keyword search may not be the whole answer. A
content-based information retrieval system that allows you to
construct search queries based on the kind of content you're searching
for could be an important area for research. This isn't the best
example, but consider searching your email for correspondence with
members of a company. Rather than just searching for the company
name, your search tool could notice, from a single email from that
company, that all email from that company comes from the same domain
name. It might rank email to and from that specific person as most
relevant, and email to and from that company as also relevant. When
you think of your documents and content as query statements,
interesting possibilities open up.

When the domain switches from email to media, like music or images,
the possibilities for content-based retrieval seem even more
interesting. Especially considering the relatively impoverished state
of metadata, text-based searching for media content on the desktop is
extremely difficult.


Existing System

As files are created by applications, downloaded from the Internet,
or received via email, the file system can become quite full of
important content located throughout the system. Whether these files
are carefully filed
away in deeply nested hierarchical folders, or haphazardly filed away
in a nearly flat system, at some point that data probably needs to be
accessed again. It is at this point the problem of desktop search
becomes apparent. In a system consisting of gigabytes and gigabytes
of thousands or even millions of files, how does one locate a specific
file? If it is filed away "properly," that is, in a manner the user was
conscious of and remembers, perhaps it will be easily located in that
folder. But what if the user has put the file in a folder he can't
remember? Or software automatically saved it somewhere he does not
expect? Or the folder it is in contains over a hundred files, and the
user can't remember the file's name? Or he knows the folder it is in,
but can't remember where the folder is? There are many reasons to
not be able to instantly remember the folder location of a file,
especially if it was created months or even years earlier.

Disadvantages

Speed is a major issue. By default, neither file metadata nor content
is indexed in such a way that results are returned quickly. Although
Windows XP includes something called "Indexing Service" that will
index files for quick access, it is not enabled by default. It was not
examined for the purposes of this paper since it is so seldom used or
mentioned by normal users. There is also no meaningful ranking of the
results. Although you can re-sort the results by the common file
system metadata (name, folder location, file type, and date
modified), results seem to be returned simply in the order they are
found as Windows XP Search linearly searches through files and
folders.

Proposed System

Content Based File Search is a Java application to find files that
contain (or don’t contain) a given string. The string may be in plain
text or it may be a Java regular expression. Such a trivial search
should be part of the operating system, and in fact, once was. As
bigger and more impressive features were added to Windows, it lost
the ability to search files for arbitrary bytes of text. Windows
98/ME/2000 could find words buried in files with unknown formats;
Windows XP and Vista search only their supported file types. A regular
expression is a way of specifying relationships between elements of a
complex pattern. You don’t need to understand regular expressions to
use this program. This program can be executed from both the command
prompt and the graphical user interface. By implementing regular
expression search, we can overcome the disadvantages of the existing
system.
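
The search described above can be sketched in a few lines of Java. This is only an illustrative sketch, not the project's actual source: the class name ContentSearchSketch and its methods are hypothetical, it uses the java.nio.file API from newer Java versions than the ones named in this report, and it reads each file fully into memory, which would not suit very large files.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Stream;

// Hypothetical helper class; names are illustrative, not taken from the project.
class ContentSearchSketch {

    // Builds a pattern from the search string. Plain text is quoted so that
    // characters such as '.' or '*' are matched literally.
    static Pattern buildPattern(String search, boolean isRegex, boolean caseSensitive) {
        String expr = isRegex ? search : Pattern.quote(search);
        int flags = caseSensitive ? 0 : Pattern.CASE_INSENSITIVE;
        return Pattern.compile(expr, flags);
    }

    // Returns the regular files under root whose content contains a match
    // (wantMatch == true) or contains no match (wantMatch == false).
    static List<Path> search(Path root, Pattern pattern, boolean wantMatch) throws IOException {
        List<Path> hits = new ArrayList<>();
        try (Stream<Path> paths = Files.walk(root)) {
            for (Path p : (Iterable<Path>) paths::iterator) {
                if (!Files.isRegularFile(p)) {
                    continue; // skip directories and special files
                }
                // ISO-8859-1 maps every byte to exactly one character,
                // so files of any format can be decoded without errors.
                String text = new String(Files.readAllBytes(p), StandardCharsets.ISO_8859_1);
                if (pattern.matcher(text).find() == wantMatch) {
                    hits.add(p);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        String needle = args.length > 1 ? args[1] : "TODO";
        for (Path p : search(root, buildPattern(needle, false, true), true)) {
            System.out.println(p);
        }
    }
}
```

Quoting the plain-text string with Pattern.quote lets one code path serve both plain and regular-expression searches, and the wantMatch flag covers the "don't contain" case.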


System Specifications

Hardware Specification:
The speed of this program depends upon the speed of your
computer’s hardware. When searching for plain ASCII text or Unicode
characters from 0x20 to 0x7E, the “(raw data bytes)” encoding is
about 40% faster than the local system’s “(default encoding)”. Even
an old Intel Pentium 4 processor at 3.0 GHz should be able to scan
large files at 15 megabytes per second (MB/s) as raw data bytes with
the “case” option enabled.
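
The difference between the two encodings can be illustrated with a short sketch. It is an assumption about how a "raw data bytes" mode might work, not the project's code: each byte is widened to one character with a mask, so no charset decoder is involved. The class name RawBytesSketch and the value 0xFF for BYTE_MASK are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; names and values are not taken from the project source.
class RawBytesSketch {
    // Assumed mask value: keeps the low 8 bits so bytes map to chars 0..255.
    static final int BYTE_MASK = 0xFF;

    // "Raw data bytes": each byte becomes exactly one char, with no charset decoder.
    static String decodeRaw(byte[] buffer, int length) {
        char[] chars = new char[length];
        for (int i = 0; i < length; i++) {
            chars[i] = (char) (buffer[i] & BYTE_MASK);
        }
        return new String(chars);
    }

    // "Default encoding": the platform charset decoder interprets the bytes.
    static String decodeDefault(byte[] buffer, int length) {
        return new String(buffer, 0, length, Charset.defaultCharset());
    }

    public static void main(String[] args) {
        byte[] data = "Content Based Search".getBytes(StandardCharsets.US_ASCII);
        System.out.println(decodeRaw(data, data.length).contains("Based"));
        System.out.println(decodeDefault(data, data.length).contains("Based"));
    }
}
```

For characters from 0x20 to 0x7E both methods yield the same text on ASCII-compatible systems, which is why the cheaper raw conversion is sufficient for plain ASCII searches.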

PROCESSOR       Pentium Series
RAM             64 MB
KEYBOARD        104 Keys
FLOPPY DISK     1.44 MB
HARD DISK       6 GB
MOUSE           Serial Mouse

Software Specification:


FileSearch was developed with Java 1.4 and should run on later
versions. It may also run on earlier versions, but this has not been
tested. For Macintosh computers, the version of Java is determined by
your version of MacOS. For Windows, Linux, and Solaris, you can
download the JRE from Sun Microsystems:
Sun Java
JRE for end users: http://www.java.com/getjava/
SDK for programmers: http://developers.sun.com/downloads/
IDE for programmers: http://www.netbeans.org/

As the application is developed using Java technology, compiling the
project requires the Java SDK to be installed on the system, but
running the project requires only a JVM. Nowadays most operating
systems come with a JVM installed. If no JVM is found on the system,
it can be downloaded free of charge from the Sun Microsystems web
site. Once the JVM is installed, we can run the application. The
software needed to develop and run this project is as follows:

Operating Systems       Windows XP, 2000
Technologies            Java 5 or 6, JVM
Run time environment    Java Virtual Machine
IDE                     NetBeans IDE


System Analysis


Requirement Analysis

Digital data volume has been increasing at a phenomenal rate during
the past decade. The "Moore's law curve" (doubling every 18
months) no longer refers only to the exponential improvement rate of
processor performance, storage density and network bandwidth, but
also to the data growth rates of many disciplines. The dominating data
types are feature-rich data such as audio, digital photos, videos, and
scientific sensor data. As we are moving into a digital society where all
information is digitized and where the world is interconnected by
digital means, it is highly desirable for next-generation systems to
provide users with abilities to access, search, explore and manage
feature-rich data.

Although several new operating systems attempt to provide users with
content-based search capabilities, they are limited to text documents.
A key challenge in implementing a content-based similarity search
system for feature-rich data is that such data is noisy and complex.
For example, consider two different photographs of an identical scene,
or two separate recordings of a person speaking the same sentence.
Despite the high degree of similarity between the two images or
between the audio recordings, the digital representations are different
at the bit level. Comparing noisy, feature-rich data requires matching
based on similarity instead of exact match, and thus searching for
noisy data requires similarity search instead of exact search. However,
similarity search in high-dimensional spaces is notoriously difficult (the
so-called curse of dimensionality). Hence, practical advanced search
solutions, such as database tools and search engines (e.g. Google),
have been limited to searching for exact matches and tend to work
only for text documents and text annotations. To date, there is no
practical content-based search engine for massive amounts of
inherently noisy, feature-rich data.
A key component in our research is a general-purpose similarity
search engine. To deliver high-quality similarity search results with
minimal CPU cycles and memory resources, we have developed novel
techniques based on dimension-reduction ideas recently developed in
the theory community. We use these to construct sketches -- tiny data
structures that can be used to estimate properties of the original data
-- from feature vectors as highly compact metadata for the similarity
search engine. This approach allows us to attack the "curse of
dimensionality" problem in the design of the similarity search engine
for feature-rich data.
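
The text does not say which sketch construction is used. As a hedged illustration only, the following shows one well-known construction, a bit sketch built from random hyperplane projections: each bit records on which side of a random hyperplane the feature vector lies, and the number of differing bits between two sketches grows with the angle between the vectors. The class name BitSketch and all parameters are hypothetical.

```java
import java.util.Random;

// Hypothetical illustration of a bit sketch built from random projections.
// This is one known technique, not necessarily the one referred to in the text.
class BitSketch {
    private final double[][] planes; // one random direction per sketch bit

    BitSketch(int dimensions, int bits, long seed) {
        Random rng = new Random(seed);
        planes = new double[bits][dimensions];
        for (double[] plane : planes) {
            for (int d = 0; d < dimensions; d++) {
                plane[d] = rng.nextGaussian();
            }
        }
    }

    // Bit i is true when the vector lies on the non-negative side of hyperplane i.
    // The vector is expected to have the same length as the configured dimensions.
    boolean[] sketch(double[] features) {
        boolean[] bits = new boolean[planes.length];
        for (int i = 0; i < planes.length; i++) {
            double dot = 0.0;
            for (int d = 0; d < features.length; d++) {
                dot += planes[i][d] * features[d];
            }
            bits[i] = dot >= 0.0;
        }
        return bits;
    }

    // Number of differing bits; fewer differences suggest a smaller angle
    // between the original feature vectors.
    static int hamming(boolean[] a, boolean[] b) {
        int diff = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i] != b[i]) {
                diff++;
            }
        }
        return diff;
    }
}
```

A sketch of, say, 64 bits is far smaller than the feature vector it summarizes, so many candidates can be compared by Hamming distance before the full vectors are examined.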


2. SYSTEM ANALYSIS:
System analysis is the first stage of the System Development Life
Cycle model. It is a process that starts with the analyst.

Analysis is a detailed study of the various operations performed
by a system and their relationships within and outside the system.
One aspect of analysis is defining the boundaries of the system and
determining whether or not a candidate system should consider other
related systems. During analysis, data is collected from the
available files, decision points, and transactions handled by the
present system. Logical system models and tools are used in analysis.
Training, experience, and common sense are required for collection of
the information needed to do the analysis.

2.1 SYSTEM OBJECTIVES:


1. To automate the selection process.
2. To provide a highly graphical user interface to the user.
3. To provide better functioning and accurate information in time.
4. To provide data maintenance features.
5. To improve efficiency and reduce the overload of work.
6. To generate appropriate and relevant information for the user
using dynamic queries.
7. To generate appropriate reports.
8. To provide security.

2.2 FEASIBILITY STUDY:

All projects are feasible, given unlimited resources and infinite time.
But the development of software is plagued by the scarcity of
resources and difficult delivery dates. It is both necessary and prudent
to evaluate the feasibility of a project at the earliest possible time.
Three key considerations are involved in the feasibility analysis.

2.2.1 Economic Feasibility:

This procedure is to determine the benefits and savings that are
expected from a candidate system and compare them with costs. If
benefits outweigh costs, then the decision is made to design and
implement the system. Otherwise, further justification or alterations
in the proposed system will have to be made if it is to have a chance
of being approved. This is an ongoing effort that improves in accuracy
at each phase of the system life cycle.

2.2.2 Technical Feasibility:

Technical feasibility centers on the existing computer system
(hardware, software, etc.) and to what extent it can support the
proposed addition. If the budget is a serious constraint, then the
project is judged not feasible.

2.2.3 Operational Feasibility:

People are inherently resistant to change, and computers have been
known to facilitate change. It is understandable that the introduction
of a candidate system requires special effort to educate, sell, and train
the staff on new ways of conducting business.

2.3 FEASIBILITY STUDY IN THIS PROJECT:

In this test, the operational scope of the system is checked. The
system under consideration should have enough operational reach. It
is observed that the proposed system is very user friendly, and since
the system is built with enough help, even persons with little
knowledge of Windows can find the system very easy to use.

2.3.1 Technical Feasibility:

This test includes a study of function, performance and
constraints that may affect the ability to achieve an acceptable
system. This test begins with an assessment of the technical viability
of the proposed system. One of the main factors to be assessed is the
need for various kinds of resources for the successful implementation
of the proposed system.

2.3.2 Economical Feasibility:

An evaluation of development cost weighed against the ultimate
income or benefit derived from the development of the proposed
system is made. Care must be taken that the cost incurred in
developing the proposed system does not exceed the income or benefit
derived from it. The income can be in terms of money or goodwill;
since the software brings in both, the system is highly viable.


SYSTEM DESIGN


3.1 SYSTEM DESIGN:


Once software requirements have been analyzed and
specified, software design is the first of three technical activities
(design, code generation and test) that are required to build and
verify the software. Each of the elements of the analysis model
provides necessary information for the specification of the designs.

Systems design goes through two phases of development:

3.1.1 Logical Design:

A data flow diagram (DFD) shows the logical flow of the system and
defines the boundaries of the system. For a candidate system, it
describes the inputs (source), outputs (destination), databases (data
stores) and procedures (data flows), all in a format that meets the
user requirements. The DFDs are already explained in the previous
section.

3.1.2 Physical Design:


This produces the working system by defining the design
specifications that tell programmers exactly what the candidate
system must do. In turn, the programmer writes the necessary programs
or modifies the software package that accepts input from the user,
performs the necessary calculations through the existing file or
database, produces the report on a hard copy or displays it on a
screen, and maintains an updated database at all times.

3.1.3 Design Principles:


Software design is both a process and a model. Basic design
principles enable the analyst to navigate the design process.

•The design process should not suffer from “tunnel vision”.
•The design should be traceable to the analysis model.
•The design should “minimize the intellectual distance” between the
software and the problem as it exists in the real world.
•The design should exhibit uniformity and integration.
•The design should be structured to accommodate change.
•The design should be structured to degrade gently, even when
aberrant data, events or operating conditions are encountered.
•The design should be reviewed to minimize conceptual (semantic)
errors.

3.1.4 Input Design:

Inaccurate input data are the most common cause of
errors in data processing. Errors entered by the data entry operator
can be controlled by input design. Input design is the process of
converting user-originated inputs to a computer-based format.

Once the inputs are defined, appropriate input media are selected for
processing. The goal of designing input data is to make data entry
as easy, logical and free from errors as possible.

3.1.5 Output Design:

Computer output is the most important and direct source
of information to the user. Efficient, intelligible output design
should improve the system's relationship with the user and help in
decision making.

3.3 SYSTEM PLANNING:


Planning information systems in business has become
increasingly important during the past decade. First, information is
recognized as a vital resource and must be managed. Secondly, more
and more financial resources are committed to information systems.
Thirdly, there is a growing need for long-range planning of systems
that use a common database or provide a greater competitive edge.

3.3.1 Initial Investigation:


The user request identifies the need for change and authorizes
the initial investigation. It undergoes several modifications before it
becomes a written commitment. In this system the following are
done: background investigation, fact finding and analysis.

3.3.2 Needs Identification:


User needs identification and analysis is concerned with what
the user needs rather than what they want. Often problems come into
focus after a joint meeting between the user and the analyst.


INTRODUCTION TO JAVA


JAVA

Its creators have called Java “programming for the
Internet”. What makes Java a good language for networking are the
classes defined in the java.net package. These networking classes
encapsulate the “socket” paradigm pioneered in the Berkeley Software
Distribution from the University of California at Berkeley.

The Java programming language is a high-level language that can be
characterized by all the following buzzwords:

Simple              Architecture neutral
Object oriented     Portable
Distributed         High performance
Interpreted         Multithreaded
Robust              Dynamic
Secure

With most programming languages, you either compile
or interpret a program so that you can run it on your computer. The
Java programming language is unusual in that a program is both
compiled and interpreted. With the compiler, first you translate a
program into an intermediate language called Java byte codes, the
platform-independent codes interpreted by the interpreter on the Java
platform. The interpreter parses and runs each Java byte code
instruction on the computer. Compilation happens just once;
interpretation occurs each time the program is executed. The following
figure illustrates how this works.


You can think of Java byte codes as the machine code instructions for
the Java Virtual Machine (Java VM). Every Java interpreter, whether
it's a development tool or a Web browser that can run applets, is an
implementation of the Java VM.

Java byte codes help make "write once, run anywhere" possible. You
can compile your program into byte codes on any platform that has a
Java compiler. The byte codes can then be run on any implementation
of the Java VM. That means that as long as a computer has a Java VM,
the same program written in the Java programming language can run
on Windows 2000, a Solaris workstation, or on an iMac.


4.1 The Java Platform

A platform is the hardware or software environment in
which a program runs. We've already mentioned some of the most
popular platforms like Windows 2000, Linux, Solaris, and MacOS. Most
platforms can be described as a combination of the operating system
and hardware. The Java platform differs from most other platforms in
that it's a software-only platform that runs on top of other hardware-
based platforms.

The Java platform has two components:

• The Java Virtual Machine (Java VM)
• The Java Application Programming Interface (Java API)

JVM is the base for the Java platform and is ported onto various
hardware-based platforms.

The Java API is a large collection of ready-made software components
that provide many useful capabilities, such as graphical user interface
(GUI) widgets. The Java API is grouped into libraries of related classes
and interfaces; these libraries are known as packages. The next
section, What Can Java Technology Do?, highlights what functionality
some of the packages in the Java API provide.

The following figure depicts a program that's running on the Java
platform. As the figure shows, the Java API and the virtual machine
insulate the program from the hardware.


Native code is code that after you compile it, the compiled code runs
on a specific hardware platform. As a platform-independent
environment, the Java platform can be a bit slower than native code.
However, smart compilers, well-tuned interpreters, and just-in-time
bytecode compilers can bring performance close to that of native code
without threatening portability.

What Can Java Technology Do?

The most common types of programs written in the Java
programming language are applets and applications. If you've surfed
the Web, you're probably already familiar with applets. An applet is a
program that adheres to certain conventions that allow it to run within
a Java-enabled browser.

However, the Java programming language is not just for writing cute,
entertaining applets for the Web. The general-purpose, high-level Java
programming language is also a powerful software platform. Using the
generous API, you can write many types of programs.

An application is a standalone program that runs directly on the Java
platform. A special kind of application known as a server serves and
supports clients on a network. Examples of servers are Web servers,
proxy servers, mail servers, and print servers. Another specialized
program is a servlet. A servlet can almost be thought of as an applet
that runs on the server side. Java Servlets are a popular choice for
building interactive web applications, replacing the use of CGI scripts.
Servlets are similar to applets in that they are runtime extensions of
applications. Instead of working in browsers, though, servlets run
within Java Web servers, configuring or tailoring the server.


How does the API support all these kinds of programs? It does so with
packages of software components that provide a wide range of
functionality. Every full implementation of the Java platform gives you
the following features:

• The essentials: Objects, strings, threads, numbers,
input and output, data structures, system properties,
date and time, and so on.
• Applets: The set of conventions used by applets.
• Networking: URLs, TCP (Transmission Control
Protocol), UDP (User Datagram Protocol) sockets, and
IP (Internet Protocol) addresses.
• Internationalization: Help for writing programs that
can be localized for users worldwide. Programs can
automatically adapt to specific locales and be displayed
in the appropriate language.
• Security: Both low level and high level, including
electronic signatures, public and private key
management, access control, and certificates.
• Software components: Known as JavaBeans, these can
plug into existing component architectures.
• Object serialization: Allows lightweight persistence
and communication via Remote Method Invocation
(RMI).
• Java Database Connectivity (JDBC): Provides uniform
access to a wide range of relational databases.

The Java platform also has APIs for 2D and 3D graphics, accessibility,
servers, collaboration, telephony, speech, animation, and more. The
following figure depicts what is included in the Java 2 SDK.


Java Patterns:

Java programs use several design patterns, the Singleton pattern
being the most commonly used. The Singleton pattern belongs to the
family of design patterns that govern the instantiation process. This
design pattern proposes that at any time there can be only one
instance of a singleton (object) created by the JVM.

The class’s default constructor is made private, which prevents the
direct instantiation of the object by other classes. A static
modifier is applied to the access method that returns the object,
which makes it a class-level method that can be called without
creating an object.

We write a public static getter or access method to get the instance
of the singleton object at runtime. The first time this method is
called, the instance reference is null, so the object is created
inside the method. Subsequent calls return the same object, because
the reference is held in a private static field of the class.


It could happen that the access method is called by two different
threads or classes at the same time, and hence more than one object is
created. This would violate the principle of the design pattern. To
prevent the simultaneous invocation of the getter method, we add the
synchronized keyword to the method declaration:

public static synchronized SingletonObjectDemo getSingletonObject()

If the class is cloneable, it is still possible to create a copy of
the object by cloning it using Object’s clone method. This can be done
as shown below:

SingletonObjectDemo clonedObject = (SingletonObjectDemo) obj.clone();

This again violates the Singleton design pattern’s objective. To deal
with this, we need to override Object’s clone method so that it throws
a CloneNotSupportedException.
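
Putting the pieces described above together, a minimal sketch of such a singleton might look as follows. The class and method names follow the fragments quoted in this section; the field name instance and the exception message are assumptions made for illustration.

```java
// Minimal singleton sketch assembled from the description above.
// The field name "instance" and the exception message are illustrative assumptions.
class SingletonObjectDemo {
    // The single shared instance, created lazily on first use.
    private static SingletonObjectDemo instance;

    // Private constructor prevents direct instantiation by other classes.
    private SingletonObjectDemo() {
    }

    // synchronized ensures two threads cannot both see null and create two objects.
    public static synchronized SingletonObjectDemo getSingletonObject() {
        if (instance == null) {
            instance = new SingletonObjectDemo();
        }
        return instance;
    }

    // Cloning is refused so that no second copy of the singleton can be made.
    @Override
    public Object clone() throws CloneNotSupportedException {
        throw new CloneNotSupportedException("singleton cannot be cloned");
    }
}
```

Every caller of SingletonObjectDemo.getSingletonObject() receives the same reference, and any attempt to call clone() on it fails with a CloneNotSupportedException.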

Unified Modeling Language Diagrams

• The Unified Modeling Language (UML) allows the software engineer to
express an analysis model using a modeling notation that is
governed by a set of syntactic, semantic and pragmatic rules.

• A UML system is represented using five different views that
describe the system from distinctly different perspectives. Each
view is defined by a set of diagrams, as follows.

User Model View

i. This view represents the system from the user's perspective.
ii. The analysis representation describes a usage scenario
from the end user's perspective.

Structural model view

i. In this model the data and functionality are viewed from
inside the system.
ii. This model view models the static structures.

Behavioral model view

i. It represents the dynamic or behavioral aspects of the
system, depicting the interactions or collaborations between the
various structural elements described in the user model and
structural model views.

Implementation model view

i. In this view the structural and behavioral aspects of the system
are represented as they are to be built.

Environmental model view

i. In this view the structural and behavioral aspects of the
environment in which the system is to be implemented are
represented.

UML is specifically constructed through two different domains:

• UML analysis modeling, which focuses on the user model and
structural model views of the system.
• UML design modeling, which focuses on the behavioral model,
implementation model and environmental model views.


Use Case Diagrams

[Use case diagram. Actor: User. Use cases:
• Set the Parameter for the Searching Operation (Set Regular text,
Set Case, Sub folders & hidden)
• Enter the search string (“Search String”)
• Select the directory, where to search
• Save the result output]

Use case diagram


Class FileSearch1

static final long BIG_FILE_SIZE;
static final int BUFFER_SIZE;
static final int BYTE_MASK;
static final String LOCAL_ENCODING;
static final String[] FONT_SIZES;
static final String[] REPORT_CHOICES
static final int TIMER_DELAY
ActionListener action;
FontName;
Font Size new font(arial,20);
String textsearch=null;
JButtons open,close,save,cancel;
JTextField textString(10);
JComboBox fontSize,fontName,regularText;
JcheckBoxes nullchk, caseChk;

doOpenButton();
doCancelButton();
doSaveButton();
doOpenRunner();
formatMatchWindow();
makeRegularPlain(String text);
prettyPlural();
processFileOrFolder();
processUnknownFile(File givenFile);
putError(String text);
putOutput(String text);
setStatusMessage(String text);
static void showHelp();
static File[] sortFileList(File[] input);

Class FileSearch1User

ActionPerformed(ActionEvent ae);
public FileSearch1User();
public void actionPerformed(ActionEvent event);
FileSearch1.userButton(event);
public void run ()


[Activity diagram. Steps in order:
1. Run the Program
2. Enter the String
3. Settings
4. Open the Folder and select the folder
5. Search process Starts
6. Stop Search / Cancel Process
7. Write the search file to the output text]


OUTPUT SCREENS


Fig 1: Running of project


Fig 2: Main Page


Fig 3: Entering of String to Search


Fig 4: Opening the Directory where to Search


Fig 5: Select the Drive


Fig 6: Searching Process


Fig 7: Searching process where the Cancel button is enabled


Fig 8: End of File Search


Fig 9: Saving the output to Text file


Fig 10: File1.txt generated by the project and saved in the C drive


Fig 11: Content of saved file


TESTING


5.1 Testing and testing types

Software testing is a critical element of software quality
assurance and represents the ultimate review of specification, design
and coding. It is one element of a broader topic often referred to as
verification and validation, which covers all the activities that
ensure that the software built is traceable to user requirements.
System testing consists of the following steps:
1. Modular Testing
2. Integrated Testing
3. User Acceptance Testing
5.1.1 Modular Testing:

A module represents a logical element of a system. For a module
to run satisfactorily, it must compile, process test data correctly
and tie in properly with other modules. Modular testing checks for
two types of errors: syntax and logic. A syntax error is a program
statement that violates one or more rules of the language in which it
is written. A logic error, on the other hand, deals with incorrect
data fields, out-of-range items and invalid combinations.
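
The kind of logic check described above can be sketched in Java as
follows. This is only an illustration: the class name
SearchSettingsCheck, the method validate and the buffer size limits
are assumptions made for the example and are not taken from the
project code.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical helper: checks search settings for logic errors
// (incorrect fields, out-of-range items, invalid combinations).
class SearchSettingsCheck {
    // Assumed limits, chosen only for this illustration.
    static final int MIN_BUFFER = 1;
    static final int MAX_BUFFER = 1 << 20;

    // Returns null when the settings are acceptable, otherwise a message.
    static String validate(String search, boolean isRegex, int bufferSize) {
        if (search == null || search.isEmpty()) {
            return "search string is empty";
        }
        if (bufferSize < MIN_BUFFER || bufferSize > MAX_BUFFER) {
            return "buffer size out of range";
        }
        if (isRegex) {
            try {
                Pattern.compile(search); // invalid combination: regex mode with a bad pattern
            } catch (PatternSyntaxException e) {
                return "invalid regular expression";
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(validate("abc", false, 4096));
        System.out.println(validate("", false, 4096));
        System.out.println(validate("abc", false, 0));
        System.out.println(validate("(abc", true, 4096));
    }
}
```

Note that the string "(abc" is accepted as plain text and rejected
only when it is treated as a regular expression, which is an example
of an invalid combination rather than an invalid single field.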

5.1.2 Integrated Testing:

Individual modules are invariably related to one another and
interact in a total system. Each portion of the system is tested
against the entire module with both test and live data before the
entire system is ready to be implemented.
When the individual modules were found to work satisfactorily, the
system integration test was carried out. Data was collected in such
a way that all program paths could be covered. Using this data a
complete test was made and all outputs were generated. Different
users were allowed to work on the system to check its performance.

5.1.3 User Acceptance Testing:


An acceptance test has the objective of selling the user on the
validity and reliability of the system. It verifies that the system’s
procedures operate to system specifications and that the integrity of
vital data is maintained. Performance of an acceptance test is
actually the user’s show. User motivation and knowledge are critical
for the successful performance of the system. Then a comprehensive
test report is prepared. The report indicates the system’s tolerance,
performance range, error rate and accuracy. In the development of
the system both verification and validation were done. Testing was
carried out in 2 phases.
Each module is thoroughly tested for various test cases. Testing
methodologies such as input/output testing and validation testing are
conducted at each step so as to get more efficient performance.

5.2 TESTING STRATEGIES:


A strategy for software testing integrates software test cases
into a series of well-planned steps that result in the successful
construction of software. Software testing is one element of a
broader topic that is referred to as verification and validation.
Verification refers to the set of activities that ensure that the
software correctly implements a specific function. Validation refers
to the set of activities that ensure that the software that has been
built is traceable to the customer’s requirements.

5.2.1 Unit Testing:

Unit testing focuses verification effort on the smallest unit of
software design, that is, the module. Using the procedural design
description as a guide, important control paths are tested to uncover
errors within the boundaries of the module. The unit test is normally
white-box oriented, and the step can be conducted in parallel for
multiple modules.
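
As an illustration of testing the control paths of a single module,
the following Java sketch tests a small matching routine without any
test framework. The class LineMatcher and its parameters are
hypothetical and only stand in for a module of the application; the
real project code may be organized differently.

```java
import java.util.regex.Pattern;

// Hypothetical module under test: decides whether one line of text
// matches the search string (plain text or regular expression).
class LineMatcher {
    static boolean matches(String line, String search, boolean isRegex, boolean ignoreCase) {
        int flags = ignoreCase ? Pattern.CASE_INSENSITIVE : 0;
        // Plain text is quoted so that characters such as '.' lose their regex meaning.
        String expr = isRegex ? search : Pattern.quote(search);
        return Pattern.compile(expr, flags).matcher(line).find();
    }
}

// Minimal unit test: one check per control path of the module.
class LineMatcherTest {
    static void check(boolean condition, String name) {
        if (!condition) {
            throw new AssertionError("failed: " + name);
        }
    }

    public static void main(String[] args) {
        check(LineMatcher.matches("Hello World", "World", false, false), "plain, case-sensitive hit");
        check(!LineMatcher.matches("Hello World", "world", false, false), "plain, case-sensitive miss");
        check(LineMatcher.matches("Hello World", "world", false, true), "plain, ignore case");
        check(LineMatcher.matches("file42.txt", "file\\d+", true, false), "regular expression");
        check(!LineMatcher.matches("file42.txt", "file.d", false, false), "plain text is not a regex");
        System.out.println("all unit tests passed");
    }
}
```

Each call to check exercises a different branch (plain or regular
expression, case-sensitive or not), which is the white-box style of
testing described above.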

5.2.2 Integration Testing:

Integration testing is a systematic technique for constructing the
program structure while conducting tests to uncover errors associated
with interfacing. The objective is to take unit-tested modules and
build a program structure that has been dictated by design.

5.2.2.1 Top-down Integration:

Top-down integration is an incremental approach to the
construction of program structure. Modules are integrated by moving
downward through the control hierarchy, beginning with the main
control program. Modules subordinate to the main program are
incorporated in the structure in either a breadth-first or
depth-first manner.
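
A minimal Java sketch of the top-down idea is given below. The main
control module is tested first, while the subordinate module is
replaced by a stub that returns a canned answer. The names
FileScanner, StubFileScanner and SearchController are assumptions
made for this example only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical interface of the subordinate module that scans one file.
interface FileScanner {
    boolean containsMatch(String fileName, String search);
}

// Stub: stands in for the real scanner, which is not integrated yet.
// It ignores the search string and reports every ".txt" file as a match.
class StubFileScanner implements FileScanner {
    public boolean containsMatch(String fileName, String search) {
        return fileName.endsWith(".txt");
    }
}

// Main control module, tested first in top-down integration.
class SearchController {
    private final FileScanner scanner;

    SearchController(FileScanner scanner) {
        this.scanner = scanner;
    }

    // Collects the names of all files for which the scanner reports a match.
    List<String> search(List<String> fileNames, String search) {
        List<String> hits = new ArrayList<>();
        for (String name : fileNames) {
            if (scanner.containsMatch(name, search)) {
                hits.add(name);
            }
        }
        return hits;
    }
}

class TopDownDemo {
    public static void main(String[] args) {
        SearchController controller = new SearchController(new StubFileScanner());
        List<String> hits = controller.search(Arrays.asList("a.txt", "b.bin", "c.txt"), "anything");
        System.out.println(hits);
    }
}
```

Once the controller behaves correctly with the stub, the stub can be
replaced by the real scanner and the same test repeated.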

5.2.2.2 Bottom-up Integration:

This method, as the name suggests, begins construction and
testing with atomic modules, i.e., modules at the lowest level.
Because the modules are integrated in a bottom-up manner, the
processing required for the modules subordinate to a given level is
always available and the need for stubs is eliminated.
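
The bottom-up approach can be sketched in Java as follows. Since the
higher-level modules are not yet available, a small driver program
feeds test data to the atomic module. The driver is a common
companion of bottom-up integration but is an addition of this
example, and the names OccurrenceCounter and OccurrenceCounterDriver
are hypothetical.

```java
// Hypothetical lowest-level (atomic) module: counts non-overlapping
// occurrences of a search string in a piece of text.
class OccurrenceCounter {
    static int count(String text, String search) {
        if (search.isEmpty()) {
            return 0; // an empty search string is treated as "no match"
        }
        int count = 0;
        int from = 0;
        while ((from = text.indexOf(search, from)) >= 0) {
            count++;
            from += search.length(); // skip past this match
        }
        return count;
    }
}

// Driver: a throwaway program that plays the role of the higher-level
// modules by passing test data to the atomic module and checking results.
class OccurrenceCounterDriver {
    public static void main(String[] args) {
        // Each row: text, search string, expected count.
        String[][] cases = {
            {"abcabcabc", "abc", "3"},
            {"aaaa", "aa", "2"},
            {"hello", "xyz", "0"},
            {"hello", "", "0"},
        };
        for (String[] c : cases) {
            int actual = OccurrenceCounter.count(c[0], c[1]);
            int expected = Integer.parseInt(c[2]);
            String verdict = (actual == expected) ? "PASS" : "FAIL";
            System.out.println(verdict + " count(\"" + c[0] + "\", \"" + c[1] + "\") = " + actual);
        }
    }
}
```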

5.2.3 Validation Testing:

At the end of integration testing, the software is completely
assembled as a package. Validation testing is the next stage, which
can be defined as successful when the software functions in the
manner reasonably expected by the customer. Reasonable expectations
are those defined in the software requirements specification. The
information contained in that specification forms the basis for the
validation testing approach.

5.2.4 System Testing:

System testing is actually a series of different tests whose
primary purpose is to fully exercise the computer-based system.
Although each test has a different purpose, all work to verify that all
system elements have been properly integrated to perform allocated
functions.

5.2.5 Security Testing:

Security testing attempts to verify the protection mechanisms built
into the system.

5.2.6 Performance Testing:

This method is designed to test the run-time performance of software
within the context of an integrated system.
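
A simple way to observe run-time performance is to time a search
over a known amount of data. The Java sketch below is only an
assumption of how such a measurement could look: the class
SearchTiming and the generated in-memory lines are invented for the
example, and a single timing run gives only a rough figure.

```java
import java.util.regex.Pattern;

// Sketch of a run-time measurement. The workload (searching generated
// in-memory lines) is chosen only to keep the example self-contained.
class SearchTiming {
    // Counts how many lines contain a match for the given pattern.
    static int countMatchingLines(String[] lines, Pattern pattern) {
        int hits = 0;
        for (String line : lines) {
            if (pattern.matcher(line).find()) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Generate test data: every tenth line contains the word "needle".
        String[] lines = new String[100_000];
        for (int i = 0; i < lines.length; i++) {
            lines[i] = "line " + i + (i % 10 == 0 ? " needle" : " hay");
        }
        Pattern plain = Pattern.compile(Pattern.quote("needle"));

        long start = System.nanoTime();
        int hits = countMatchingLines(lines, plain);
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;

        System.out.println(hits + " matching lines in " + elapsedMillis + " ms");
    }
}
```

The elapsed time depends on the hardware and on the complexity of
the search string, so the same measurement would be repeated with
regular expressions of different complexity to compare results.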

We use these testing Strategies in this project.


