Tourist Place Reviews Sentiment Classification

MCA PROJECT

Uploaded by

sumityana
SUMMER INTERNSHIP PROJECT

ON
“TOURIST PLACE REVIEWS SENTIMENT
CLASSIFICATION”

IN THE PARTIAL FULFILLMENT OF REQUIREMENT OF


MASTER OF COMPUTER APPLICATION (MCA)

UNDER THE GUIDANCE OF


DR. SANGITA PHUNDE

SUBMITTED BY
AFFAN ABUTALHA CHAUS
MITU22MCAD0003

SUBMITTED TO

MIT COLLEGE OF MANAGEMENT, PUNE


YEAR : 2023-2024
MIT College of Management, Pune
This is to certify that Mr. Affan Abutalha Chaus has submitted a Project
Report on "Tourist Place Reviews Sentiment Classification" to MIT – ADT
University, Pune for the partial fulfillment of Master in Computer Application
(Data Science) (MCA) (2022-24 Batch).

We further certify that to the best of our knowledge and belief, the matter
presented in this project has not been submitted to any Degree or Diploma
course.

PRN No: MITU22MCAD0003

Dr. Sangita Phunde Dr. Vijay Gondane Prof. Dr. Sunita Karad
HOD, MCA PG HEAD Executive Director-MITCOM

External Examiner Sign of Examiners:

1.______________________ __________________

Internal Examiner

2.______________________ __________________
DECLARATION

I hereby declare that the project work entitled “Tourist Place Reviews
Sentiment Classification” submitted to the MIT – ADT University, Pune,
is a record of an original work done by me under the guidance of Dr.
Sangita Phunde, and this project work is submitted in the partial
fulfillment of the requirements for the award of the degree of Master of
Computer Application. The project work in this report has not been
submitted to any other University or Institute for the award of any degree
or diploma. This is my own and original work.

Date: Signature of the Candidate


CERTIFICATE OF THE GUIDE

This is to certify that Mr. Affan Abutalha Chaus of MCA Course (Data
Science) has successfully completed his Project Work titled “Tourist
Place Reviews Sentiment Classification” under my guidance during the
Academic Year 2022-2024.

Date:

Prof.: Dr. Sangita Phunde

Project Guide : Name & Signature


ACKNOWLEDGEMENT
I would like to convey my sincere gratitude to all those who have been
instrumental in the development of the project.

I am thankful to the organization ExcelR Solutions for giving me an
opportunity to work with them. I extend my sincere thanks to my project
guide Vasanth of ExcelR Solutions for the motivation, technical support
and inspiration provided to me.

I am greatly thankful to Honorable Dr. Prof. Sunita Karad, Executive Director
of MITCOM, for all her timely support.

I express my gratitude to the PG Head Dr. Vijaya Gondane and the Head of the
MCA Department Dr. Sangita Phunde, who helped me in extreme situations.

I am also thankful to Dr. Sangita Phunde, my internal project guide, for her
invaluable guidance, help and great support during the project work.

I am greatly thankful to the staff of MITCOM, Pune for helping me through the
entire course.

Student Name & Signature: Affan Abutalha Chaus


Date:
Place: MITCOM, MIT-ADT University, Pune
INDEX

TITLES

CONTENTS
ABSTRACT

1. INTRODUCTION

2. LITERATURE SURVEY
2.1 SENTIMENT ANALYSIS: A COMPARATIVE STUDY ON DIFFERENT APPROACHES
2.2 COMPARATIVE ANALYSIS OF TWITTER DATA USING SUPERVISED CLASSIFIERS
2.3 A SURVEY OF SENTIMENT ANALYSIS TECHNIQUES
2.4 A BRIEF SURVEY OF TEXT MINING
2.5 SOFTWARE ENVIRONMENT
2.6 WHY CHOOSE PYTHON

3. SYSTEM ANALYSIS
3.1 EXISTING SYSTEM
3.2 PROPOSED SYSTEM

4. FEASIBILITY STUDY
4.1 ECONOMICAL FEASIBILITY
4.2 TECHNICAL FEASIBILITY
4.3 SOCIAL FEASIBILITY

5. SYSTEM REQUIREMENTS
6. SYSTEM DESIGN
6.1 SYSTEM ARCHITECTURE
6.2 DATA FLOW DIAGRAM
6.3 UML DIAGRAMS
7. IMPLEMENTATION
7.1 MODULES
7.2 SAMPLE CODE

8. SYSTEM TESTING
8.1 UNIT TESTING
8.2 INTEGRATION TESTING
8.3 ACCEPTANCE TESTING

9. INPUT DESIGN AND OUTPUT DESIGN
9.1 INPUT DESIGN
9.2 OUTPUT DESIGN

10. SCREENSHOTS

11. FUTURE WORK

12. CONCLUSION

13. BIBLIOGRAPHY
ABSTRACT:
Social media is a growing trend nowadays. Every day, millions of users review
and rate tourist places on tourism websites. Sentiment analysis performed over
these reviews helps to determine the popularity of a tourist place. Based on
the sentiment analysis results, tourists can easily decide which destination
to visit. In this paper, sentiment analysis has been implemented using a
machine learning approach. The dataset has been collected from various
tourism review websites. We have performed a comparative study of two feature
extraction algorithms, CountVectorization and TFIDFVectorization, along with
three classification algorithms: Naive Bayes (NB), Support Vector Machine (SVM)
and Random Forest (RF). The performance of the algorithms has been compared
using accuracy, recall, precision and f1-score. From the experiment we found
that the TFIDFVectorization feature extraction algorithm improved the accuracy
of the classification algorithms as compared to CountVectorization for the
given review dataset. In sentiment classification of tourist place reviews,
TFIDFVectorization+RF gave the highest accuracy, 86%, on the research dataset
used.
1. INTRODUCTION

Social media is growing rapidly nowadays. Millions of users post reviews and
rate tourist places on tourism websites on a daily basis. Sentiment analysis
can be performed to analyze these reviews. Proper analysis of the reviews
makes it possible to find trends in tourist place popularity. Summarized
results from sentiment analysis help tourists to decide on a tour destination
and to plan the tour. In this research paper, two feature extraction
algorithms have been used: CountVectorization and TFIDFVectorization. Three
classification algorithms, Naive Bayes (NB), Support Vector Machine (SVM)
and Random Forest (RF), have been used for sentiment classification. The
performance of each combination of feature extraction and classification
algorithm has been compared on the basis of execution time, accuracy, recall,
precision and f1-score.
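
The difference between the two feature extraction schemes can be illustrated
with a minimal sketch that uses only the Python standard library. The
tokenizer, the toy reviews and the function names are illustrative assumptions
and not the project's code. The weighting shown is the smoothed IDF with L2
normalisation that scikit-learn's TfidfVectorizer applies by default; other
TF-IDF variants exist.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase a review and split it on whitespace (deliberately simple)."""
    return text.lower().split()


def count_vectors(docs):
    """Return (vocabulary, raw term-count vectors), one vector per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tfidf_vectors(docs):
    """Return (vocabulary, L2-normalised TF-IDF vectors)."""
    vocab, counts = count_vectors(docs)
    n_docs = len(docs)
    # Document frequency: how many documents contain each term.
    df = [sum(1 for vec in counts if vec[j] > 0) for j in range(len(vocab))]
    # Smoothed inverse document frequency: rarer terms get larger weights.
    idf = [math.log((1 + n_docs) / (1 + d)) + 1 for d in df]
    vectors = []
    for vec in counts:
        weighted = [tf * w for tf, w in zip(vec, idf)]
        norm = math.sqrt(sum(x * x for x in weighted)) or 1.0
        vectors.append([x / norm for x in weighted])
    return vocab, vectors


# Toy reviews, invented for illustration only.
reviews = [
    "beautiful beach and clean water",
    "dirty beach and rude staff",
    "beautiful fort with a great view",
]
vocab, counts = count_vectors(reviews)
_, tfidf = tfidf_vectors(reviews)
```

With raw counts, "beach" and "clean" both score 1 in the first review. With
TF-IDF, "clean" receives the larger weight because it occurs in fewer reviews,
which is the property that can help a classifier focus on distinctive words.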
2. LITERATURE SURVEY

2.1 Sentiment Analysis: A Comparative Study on Different Approaches

Authors: M. D. Devika, C. Sunitha, Amal Ganesh
Abstract: Sentiment analysis (SA) is an intellectual process of extricating
users' feelings and emotions. It is one of the pursued fields of Natural
Language Processing (NLP). The evolution of Internet-based applications has
steered a massive amount of personalized reviews for various related
information on the Web. These reviews exist in different forms like social
media, blogs, wikis or forum websites. Both travelers and customers find the
information in these reviews to be beneficial for their understanding and
planning processes. The boom of search engines like Yahoo and Google has
flooded users with copious amounts of relevant reviews about specific
destinations, which is still beyond human comprehension. Sentiment analysis
poses as a powerful tool for users to extract the needful information, as well
as to aggregate the collective sentiments of the reviews. Several methods have
come to the limelight in recent years for accomplishing this task. In this
paper we compare the various techniques used for sentiment analysis by
analyzing various methodologies.

2.2 Comparative analysis of Twitter data using supervised classifiers

Authors: Rohit Joshi, Rajkumar Tekchandani
Abstract: Online microblogging on social networks has been used for indicating
opinions about certain entities in very short messages. There are some popular
microblogs like Twitter, Facebook etc., among which Twitter attains the
maximum amount of attention in research areas related to product and movie
reviews, stock exchange etc. We extracted data from Twitter, i.e. movie
reviews, for sentiment prediction using machine-learning algorithms. We applied
supervised machine-learning algorithms like support vector machines (SVM),
maximum entropy and Naïve Bayes to classify data using unigram, bigram and
hybrid (i.e. unigram + bigram) features. Results show that SVM surpassed the
other classifiers with a remarkable accuracy of 84% for movie reviews.
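
The unigram, bigram and hybrid features mentioned in this abstract can be
sketched as follows. This is an illustrative assumption about how such
features are usually built, not code from the cited paper; the function names
and the sample sentence are hypothetical.

```python
def ngrams(tokens, n):
    """Return all runs of n consecutive tokens, each joined into one string."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def hybrid_features(text):
    """Unigram + bigram ("hybrid") features for one piece of text."""
    tokens = text.lower().split()
    return ngrams(tokens, 1) + ngrams(tokens, 2)


# Hypothetical sample sentence.
features = hybrid_features("not a good movie")
```

Bigrams such as "a good" and "good movie" keep some word order that unigrams
alone discard, which is one plausible reason a hybrid feature set can help.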
2.3 A Survey of Sentiment Analysis techniques
Authors: Harpreet Kaur, Veenu Mangat, Nidh
Abstract: Sentiment analysis is an application of natural language processing. It is
also known as emotion extraction or opinion mining. This is a very popular field of research
in text mining. The basic idea is to find the polarity of the text and classify it into positive,
negative or neutral. It helps in human decision making. To perform sentiment analysis, one
has to perform various tasks like subjectivity detection, sentiment classification, aspect term
extraction, feature extraction etc. This paper presents the survey of main approaches used for
sentiment classification.
2.4 A Brief Survey of Text Mining: Classification, Clustering and Extraction
Techniques
Authors: Mehdi Allahyari, Seyedamin Pouriyeh, Mehdi Assefi, Saied
Safaei, Elizabeth D. Trippe, Juan B. Gutierrez, Krys Kochut
Abstract: The amount of text that is generated every day is increasing
dramatically. This tremendous volume of mostly unstructured text cannot be
simply processed and perceived by computers. Therefore, efficient and effective
techniques and algorithms are required to discover useful patterns. Text mining
is the task of extracting meaningful information from text, which has gained
significant attention in recent years. In this paper, we describe several of
the most fundamental text mining tasks and techniques including text
pre-processing, classification and clustering. Additionally, we briefly explain
text mining in biomedical and health care domains.

2.5 SOFTWARE ENVIRONMENT


Python is a high-level, interpreted scripting language developed in the late 1980s by
Guido van Rossum at the National Research Institute for Mathematics and Computer Science
in the Netherlands. The initial version was published on the alt.sources newsgroup in 1991,
and version 1.0 was released in 1994.

Python 2.0 was released in 2000, and the 2.x versions were the prevalent releases until
December 2008. At that time, the development team released version 3.0, which contained
a few relatively small but significant changes that were not backward compatible with the
2.x versions. Python 2 and 3 are very similar, and some features of Python 3 were
backported to Python 2, but in general they are not fully compatible.

Python 2 and 3 were maintained in parallel for several years. Python 2 reached its official
End of Life on January 1, 2020, and is no longer maintained. Newcomers to Python are
therefore advised to focus on Python 3.

Python is maintained by a core development team. Guido van Rossum was given the title of
BDFL (Benevolent Dictator For Life) by the Python community and held it until he stepped
down from that role in 2018. The name Python, by the way, derives not from the snake, but
from the British comedy troupe Monty Python's Flying Circus, of which Guido was, and
presumably still is, a fan. It is common to find references to Monty Python sketches and
movies scattered throughout the Python documentation.

2.6 WHY CHOOSE PYTHON

If you're going to write programs, there are literally dozens of commonly used languages to
choose from. Why choose Python? Here are some of the features that make Python an
appealing choice.

Python is Popular

Python has been growing in popularity over the last few years. The 2018 Stack Overflow
Developer Survey ranked Python as the 7th most popular and the number one most wanted
technology of the year. World-class software development companies around the globe use
Python every single day.

According to research by Dice, Python is also one of the hottest skills to have, and it is
the most popular programming language in the world based on the Popularity of Programming
Language Index.

Due to the popularity and widespread use of Python as a programming language, Python
developers are sought after and paid well.
Python is Interpreted

Many languages are compiled, meaning the source code you create needs to be translated into
machine code, the language of your computer's processor, before it can be run. Programs
written in an interpreted language are passed straight to an interpreter that runs them directly.

This makes for a quicker development cycle because you just type in your code and run it,
without the intermediate compilation step.

One potential downside to interpreted languages is execution speed. Programs that are
compiled into the native language of the computer processor tend to run more quickly than
interpreted programs. For some applications that are particularly computationally intensive,
like graphics processing or intense number crunching, this can be limiting.

In practice, however, for most programs, the difference in execution speed is measured in
milliseconds, or seconds at most, and not appreciably noticeable to a human user. The
expediency of coding in an interpreted language is typically worth it for most applications.

Python is Free
The Python interpreter is developed under an OSI-approved open-source license, making it
free to install, use, and distribute, even for commercial purposes.

A version of the interpreter is available for virtually any platform there is, including all
flavors of Unix, Windows, macOS, smart phones and tablets, and probably anything else you
ever heard of. A version even exists for the half dozen people remaining who use OS/2.

Python is Portable
Because Python code is interpreted and not compiled into native machine instructions, code
written for one platform will work on any other platform that has the Python interpreter
installed. (This is true of any interpreted language, not just Python.)

Python is Simple
As programming languages go, Python is relatively uncluttered, and the developers have
deliberately kept it that way.
A rough estimate of the complexity of a language can be gleaned from the number of
keywords or reserved words in the language. These are words that are reserved for special
meaning by the compiler or interpreter because they designate specific built-in functionality
of the language.

Python 3 has about 33 keywords (the exact number depends on the release), and Python 2 has
31. By contrast, C++ has 62, Java has 53, and Visual Basic has more than 120, though these
latter examples probably vary somewhat by implementation or dialect.
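
The keyword count for a given interpreter can be checked directly with the
standard library's keyword module, as in this small sketch. The number printed
depends on the Python release it is run on, so no fixed output is shown.

```python
import keyword
import sys

# Report the interpreter version alongside its number of reserved words.
print(sys.version_info[:2], len(keyword.kwlist))

# iskeyword() tells whether a given word is reserved in this interpreter.
print(keyword.iskeyword("lambda"), keyword.iskeyword("review"))
```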

Python code has a simple and clean structure that is easy to learn and easy to read. In fact, as
you will see, the language definition enforces code structure that is easy to read.

But It's Not That Simple

For all its syntactical simplicity, Python supports most constructs
that would be expected in a very high-level language, including complex dynamic data types,
structured and functional programming, and object-oriented programming.

Additionally, a very extensive library of classes and functions is available that provides
capability well beyond what is built into the language, such as database manipulation or GUI
programming.

Python accomplishes what many programming languages don‟t: the language itself is simply
designed, but it is very versatile in terms of what you can accomplish with it.

Conclusion
This section gave an overview of the Python programming language, including:

A brief history of the development of Python
Some reasons why you might select Python as your language of choice

Python is a great option, whether you are a beginning programmer looking to learn the basics,
an experienced programmer designing a large application, or anywhere in between. The
basics of Python are easily grasped, and yet its capabilities are vast.

Python is an open source programming language that was made to be easy-to-read and
powerful. A Dutch programmer named Guido van Rossum made Python in 1991. He named
it after the television show Monty Python's Flying Circus. Many Python examples and
tutorials include jokes from the show.

Python is an interpreted language. Interpreted languages do not need to be compiled to run. A
program called an interpreter runs Python code on almost any kind of computer. This means
that a programmer can change the code and quickly see the results. This also means Python is
slower than a compiled language like C, because it is not running machine code directly.

Python is a good programming language for beginners. It is a high-level language, which
means a programmer can focus on what to do instead of how to do it. Writing programs in
Python takes less time than in some other languages.

Python drew inspiration from other programming languages like C, C++, Java, Perl, and Lisp.

Python has a very easy-to-read syntax. Some of Python's syntax comes from C, because that
is the language that Python was written in. But Python uses whitespace to delimit code:
spaces or tabs are used to organize code into groups. This is different from C. In C, there is
a semicolon at the end of each statement and curly braces ({}) are used to group code. Using
whitespace to delimit code makes Python a very easy-to-read language.
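
A short sketch of how indentation groups code in Python. The function name and
the mapping from star ratings to sentiment labels are hypothetical, chosen only
to illustrate the syntax.

```python
def label_rating(stars):
    # Everything indented under `def` is the function body.
    if stars >= 4:
        # Indented one level deeper: runs only when the condition holds.
        return "positive"
    elif stars == 3:
        return "neutral"
    # Back at function-body level: reached for ratings below 3.
    return "negative"


labels = [label_rating(s) for s in (5, 3, 1)]
```

No braces or semicolons are needed; the indentation level alone decides which
statements belong to the function and which belong to each branch.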

Python use

Python is used by hundreds of thousands of programmers and is used in many
places. Sometimes only Python code is used for a program, but most of the time it is used to
do simple jobs while another programming language is used to do more complicated tasks.

Its standard library is made up of many functions that come with Python when it is installed.
On the Internet there are many other libraries available that make it possible for the Python
language to do more things. These libraries make it a powerful language; it can do many
different things.

Some things that Python is often used for are:

Web development
Scientific programming
Desktop GUIs
Network programming
Game programming
3. SYSTEM ANALYSIS
3.1 EXISTING SYSTEM:

Every day, millions of users review and rate tourist places on tourism websites.
Sentiment analysis performed over these reviews helps to determine the
popularity of a tourist place. Based on the sentiment analysis results,
tourists can easily decide which destination to visit.
Existing work elaborates the different levels of sentiment analysis: document
level, sentence level and aspect level. The approaches used for sentiment
analysis are machine learning based, rule based and lexicon based. Within the
machine learning approach, the techniques include SVM (Support Vector Machine),
NB (Naive Bayes), Maximum Entropy, K-NN and Weighted K-NN; multilingual
sentiment analysis and feature-driven sentiment analysis have also been
described in detail. The various approaches have been compared, and their
corresponding advantages and disadvantages described. Across comparison
parameters such as performance, efficiency and accuracy, the machine learning
approach was found to give the best results. As described in [2], Twitter
sentiment analysis has been performed on movie reviews using supervised
machine learning algorithms such as support vector machine, naive Bayes and
maximum entropy, with feature extraction techniques such as unigram, bigram
and hybrid (unigram + bigram). That study concluded that SVM with the hybrid
feature extractor outperforms the other techniques.
3.2 PROPOSED SYSTEM:
In this paper, sentiment analysis has been implemented using a machine
learning approach. The dataset has been collected from various tourism
review websites. We have performed a comparative study of two feature
extraction algorithms, CountVectorization and TFIDFVectorization, along
with three classification algorithms: Naive Bayes (NB), Support Vector
Machine (SVM) and Random Forest (RF). The performance of the algorithms
has been compared using accuracy, recall, precision and f1-score. From the
experiment we found that the TFIDFVectorization feature extraction
algorithm improved the accuracy of the classification algorithms as
compared to CountVectorization for the given review dataset. In sentiment
classification of tourist place reviews, TFIDFVectorization+RF gave the
highest accuracy, 86%, on the research dataset used.
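
The four comparison parameters can be computed from true and predicted labels
as in the sketch below, which uses the standard binary definitions. The
function name, the choice of "positive" as the class of interest and the
sample labels are illustrative assumptions; they are not the project's data or
results.

```python
def classification_metrics(y_true, y_pred, positive="positive"):
    """Accuracy, precision, recall and F1 for one class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Precision: of the reviews predicted positive, how many really were.
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Recall: of the truly positive reviews, how many were found.
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Invented labels for illustration only.
y_true = ["positive", "positive", "negative", "negative", "positive"]
y_pred = ["positive", "negative", "negative", "positive", "positive"]
metrics = classification_metrics(y_true, y_pred)
```

Here three of five predictions are correct, so accuracy is 0.6, while
precision, recall and f1 each come to 2/3 because there are two true
positives, one false positive and one false negative.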

4. FEASIBILITY STUDY
The feasibility of the project is analyzed in this phase, and a business
proposal is put forth with a very general plan for the project and some cost
estimates. During system analysis, the feasibility study of the proposed
system is carried out. This is to ensure that the proposed system is not a
burden to the company. For feasibility analysis, some understanding of the
major requirements for the system is essential.

Three key considerations involved in the feasibility analysis are:

ECONOMICAL FEASIBILITY
TECHNICAL FEASIBILITY
SOCIAL FEASIBILITY
4.1 ECONOMICAL FEASIBILITY
This study is carried out to check the economic impact that the system
will have on the organization. The amount of funds that the company can pour
into the research and development of the system is limited. The expenditures
must be justified. The developed system is well within the budget, and this
was achieved because most of the technologies used are freely available. Only
the customized products had to be purchased.
4.2 TECHNICAL FEASIBILITY
This study is carried out to check the technical feasibility, that is, the
technical requirements of the system. Any system developed must not place a
high demand on the available technical resources, as this would lead to high
demands being placed on the client. The developed system must have modest
requirements, as only minimal or no changes are required for implementing
this system.
4.3 SOCIAL FEASIBILITY
This aspect of the study checks the level of acceptance of the system by
the user. This includes the process of training the user to use the system
efficiently. The user must not feel threatened by the system, but must instead
accept it as a necessity. The level of acceptance by the users depends solely
on the methods that are employed to educate users about the system and to make
them familiar with it. Their level of confidence must be raised so that they
are also able to offer constructive criticism, which is welcomed, as they are
the final users of the system.
5. SYSTEM REQUIREMENTS

5.1 HARDWARE REQUIREMENTS:

• System : Pentium Dual Core
• Hard Disk : 120 GB
• Monitor : 15'' LED
• Input Devices : Keyboard, Mouse
• RAM : 1 GB

5.2 SOFTWARE REQUIREMENTS:

• Operating System : Windows 10
• Coding Language : Python
• Tool : PyCharm
• Database : MySQL
• Server : Flask
6. SYSTEM DESIGN

6.1 SYSTEM ARCHITECTURE:


6.2 DATA FLOW DIAGRAM:

1. The DFD is also called a bubble chart. It is a simple graphical formalism
that can be used to represent a system in terms of the input data to the
system, the various processing carried out on this data, and the output
data generated by the system.
2. The data flow diagram (DFD) is one of the most important modeling
tools. It is used to model the system components. These components are
the system process, the data used by the process, an external entity that
interacts with the system and the information flows in the system.
3. A DFD shows how the information moves through the system and how it is
modified by a series of transformations. It is a graphical technique that
depicts information flow and the transformations that are applied as data
moves from input to output.
4. A DFD may be used to represent a system at any level of abstraction. It
may be partitioned into levels that represent increasing information flow
and functional detail.
[Figure: data flow diagram. Flow: User → Check (unauthorized users are
rejected) → Upload Tourism Reviews Dataset → Preprocess Dataset → TFIDF
Feature Extraction → Count Vectorization Features Extraction → Run SVM, Naive
Bayes and Random Forest with TFIDF → Run SVM, Naive Bayes and Random Forest
with CountVector → Comparison Graph → Your Review → Predict Sentiments from
Review → Search Places → End process]
6.3 UML DIAGRAMS:
UML stands for Unified Modeling Language. UML is a standardized
general-purpose modeling language in the field of object-oriented software
engineering. The standard was created, and is managed, by the Object
Management Group.
The goal is for UML to become a common language for creating models
of object-oriented computer software. In its current form UML comprises
two major components: a meta-model and a notation. In the future, some form
of method or process may also be added to, or associated with, UML.
The Unified Modeling Language is a standard language for specifying,
visualizing, constructing and documenting the artifacts of a software system,
as well as for business modeling and other non-software systems.
The UML represents a collection of best engineering practices that have
proven successful in the modeling of large and complex systems.
The UML is a very important part of developing object-oriented
software and of the software development process. The UML uses mostly
graphical notations to express the design of software projects.

GOALS:
The Primary goals in the design of the UML are as follows:
1. Provide users a ready-to-use, expressive visual modeling Language so
that they can develop and exchange meaningful models.
2. Provide extendibility and specialization mechanisms to extend the core
concepts.
3. Be independent of particular programming languages and development
process.
4. Provide a formal basis for understanding the modeling language.
5. Encourage the growth of OO tools market.
6. Integrate best practices.
USE CASE DIAGRAM:
A use case diagram in the Unified Modeling Language (UML) is a type
of behavioral diagram defined by and created from a use-case analysis. Its
purpose is to present a graphical overview of the functionality provided by a
system in terms of actors, their goals (represented as use cases), and any
dependencies between those use cases. The main purpose of a use case diagram
is to show what system functions are performed for which actor. The roles of
the actors in the system can be depicted.
CLASS DIAGRAM:
In software engineering, a class diagram in the Unified Modeling Language
(UML) is a type of static structure diagram that describes the structure of a
system by showing the system's classes, their attributes, operations (or
methods), and the relationships among the classes. It explains which class
contains information.

SEQUENCE DIAGRAM:
A sequence diagram in Unified Modeling Language (UML) is a kind of
interaction diagram that shows how processes operate with one another and in
what order. It is a construct of a Message Sequence Chart. Sequence diagrams
are sometimes called event diagrams, event scenarios, and timing diagrams.
ACTIVITY DIAGRAM:
Activity diagrams are graphical representations of workflows of stepwise
activities and actions with support for choice, iteration and concurrency. In the
Unified Modeling Language, activity diagrams can be used to describe the
business and operational step-by-step workflows of components in a system. An
activity diagram shows the overall flow of control.
7. IMPLEMENTATION

7.1 MODULES:
• Upload Tourism Reviews Dataset
• Preprocess Dataset
• TFIDF Feature Extraction
• Count Vectorization Features Extraction
• Run SVM, Naive Bayes and Random Forest with TFIDF
• Run SVM, Naive Bayes and Random Forest with CountVector
• Comparison Graph
• Your Review
• Predict Sentiments from Review
• Search Places

MODULES DESCRIPTION:
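
As an illustration of what the Preprocess Dataset module might do, the sketch
below cleans one review. The specific steps (lowercasing, removing
non-letters, dropping stop words) and the tiny stop-word list are assumptions
typical of text classification work; the report does not state which steps
the project applies.

```python
import re

# Tiny illustrative stop-word list; a real list would be much longer.
STOP_WORDS = {"a", "an", "the", "is", "was", "and", "of", "to", "in", "it"}


def preprocess(review):
    """Lowercase, strip non-letters and remove stop words from one review."""
    text = review.lower()
    # Replace every character that is not a-z or whitespace with a space.
    text = re.sub(r"[^a-z\s]", " ", text)
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]
    return " ".join(tokens)


cleaned = preprocess("The beach was AMAZING, and the water is so clean!!!")
```

For this hypothetical review the result is "beach amazing water so clean",
which is the kind of text the feature extraction modules would then receive.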
7.2 SAMPLE CODE
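
A minimal sketch of one of the three classifiers named in the modules,
multinomial Naive Bayes with add-one (Laplace) smoothing, written with the
standard library only. It is an illustrative assumption rather than the
project's code; the class name and the toy training reviews are hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes over whitespace tokens, add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.word_counts = defaultdict(Counter)   # word counts per class
        self.vocab = set()
        for doc, label in zip(docs, labels):
            tokens = doc.lower().split()
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, doc):
        tokens = doc.lower().split()
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, float("-inf")
        for label, n_docs in self.class_counts.items():
            # Start from the log prior of the class.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for tok in tokens:
                if tok not in self.vocab:
                    continue  # words never seen in training are ignored
                # Add-one smoothed log likelihood of the word in this class.
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for illustration only.
train_docs = [
    "lovely beach clean water",
    "great view amazing fort",
    "dirty rooms rude staff",
    "crowded noisy overpriced place",
]
train_labels = ["positive", "positive", "negative", "negative"]
model = NaiveBayes().fit(train_docs, train_labels)
prediction = model.predict("clean beach great view")
```

Working in log space avoids numeric underflow when many word probabilities are
multiplied, and the smoothing keeps a single unseen word from forcing a class
probability to zero.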
8. SYSTEM TESTING

The purpose of testing is to discover errors. Testing is the process of
trying to discover every conceivable fault or weakness in a work product. It
provides a way to check the functionality of components, sub-assemblies,
assemblies and/or a finished product. It is the process of exercising software
with the intent of ensuring that the software system meets its requirements and
user expectations and does not fail in an unacceptable manner. There are
various types of tests. Each test type addresses a specific testing
requirement.

TYPES OF TESTS

Unit testing:
Unit testing involves the design of test cases that validate that the internal
program logic is functioning properly, and that program inputs produce valid
outputs. All decision branches and internal code flow should be validated. It
is the testing of individual software units of the application. It is done
after the completion of an individual unit, before integration. This is
structural testing that relies on knowledge of the unit's construction and is
invasive. Unit tests perform basic tests at component level and test a
specific business process, application, and/or system configuration. Unit
tests ensure that each unique path of a business process performs accurately
to the documented specifications and contains clearly defined inputs and
expected results.
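
A unit test of this kind can be sketched with Python's built-in unittest
module. The function under test, clean_review, is a hypothetical stand-in for
one small unit of the application, and the test cases are illustrative
assumptions rather than the project's actual test suite.

```python
import unittest


def clean_review(text):
    """Hypothetical unit under test: normalise case and spacing."""
    if not isinstance(text, str):
        raise TypeError("review must be a string")
    return " ".join(text.lower().split())


class CleanReviewTest(unittest.TestCase):
    def test_lowercases_and_collapses_spaces(self):
        self.assertEqual(clean_review("  Great   VIEW "), "great view")

    def test_empty_string_stays_empty(self):
        self.assertEqual(clean_review(""), "")

    def test_rejects_non_string(self):
        # The error branch is a decision path and is exercised as well.
        with self.assertRaises(TypeError):
            clean_review(None)


# Load and run the test case explicitly so the result can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(CleanReviewTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Each test method covers one path through the unit: a normal input, a boundary
input and an invalid input.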
Integration testing:
Integration tests are designed to test integrated software components to
determine if they actually run as one program. Testing is event driven and is
more concerned with the basic outcome of screens or fields. Integration tests
demonstrate that although the components were individually satisfactory, as
shown by successful unit testing, the combination of components is correct
and consistent. Integration testing is specifically aimed at exposing the
problems that arise from the combination of components.
Functional test:
Functional tests provide systematic demonstrations that functions tested
are available as specified by the business and technical requirements, system
documentation, and user manuals.
Functional testing is centered on the following items:
Valid Input : identified classes of valid input must be accepted.
Invalid Input : identified classes of invalid input must be rejected.
Functions : identified functions must be exercised.
Output : identified classes of application outputs must be
exercised.
Systems/Procedures : interfacing systems or procedures must be invoked.
Organization and preparation of functional tests is focused on
requirements, key functions, or special test cases. In addition, systematic
coverage of identified business process flows, data fields, predefined
processes, and successive processes must be considered for testing. Before
functional testing is complete, additional tests are identified and the
effective value of current tests is determined.
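
The valid-input and invalid-input items above can be sketched as a small
table-driven check. The validate_review function, its messages and the
MAX_LEN limit are hypothetical; the report does not specify the application's
actual validation rules.

```python
MAX_LEN = 1000  # hypothetical upper limit on review length


def validate_review(text):
    """Return (is_valid, message) for a review submitted by the user."""
    if not isinstance(text, str):
        return False, "review must be text"
    stripped = text.strip()
    if not stripped:
        return False, "review must not be empty"
    if len(stripped) > MAX_LEN:
        return False, "review is too long"
    return True, "ok"


# Each case pairs an input class with whether it must be accepted.
cases = [
    ("Lovely beach", True),   # valid input: accepted
    ("   ", False),           # invalid input: blank
    (None, False),            # invalid input: wrong type
    ("x" * 2000, False),      # invalid input: exceeds the assumed limit
]
results = [validate_review(text)[0] == expected for text, expected in cases]
```

Listing input classes next to their expected outcome keeps the functional test
tied to the requirement it checks.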
System Test:
System testing ensures that the entire integrated software system meets
requirements. It tests a configuration to ensure known and predictable results.
An example of system testing is the configuration oriented system integration
test. System testing is based on process descriptions and flows, emphasizing
pre-driven process links and integration points.
White Box Testing:
White Box Testing is testing in which the software tester has
knowledge of the inner workings, structure and language of the software, or at
least its purpose. It is used to test areas that cannot be reached
from a black box level.
Black Box Testing:
Black Box Testing is testing the software without any knowledge of the
inner workings, structure or language of the module being tested. Black box
tests, like most other kinds of tests, must be written from a definitive source
document, such as a specification or requirements document. It is testing in
which the software under test is treated as a black box: you cannot “see” into
it. The test provides inputs and responds to outputs without considering how
the software works.
8.1 Unit Testing:
Unit testing is usually conducted as part of a combined code and unit test
phase of the software lifecycle, although it is not uncommon for coding and unit
testing to be conducted as two distinct phases.
Test strategy and approach:
Field testing will be performed manually and functional tests will be
written in detail.

Test objectives:
All field entries must work properly.
Pages must be activated from the identified link.
The entry screen, messages and responses must not be delayed.

Features to be tested
Verify that the entries are of the correct format
No duplicate entries should be allowed
All links should take the user to the correct page.
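The first two features above (correct format, no duplicate entries) can be sketched as unit tests. The EntryRegistry class and the PLACE_PATTERN format rule below are hypothetical and are not taken from the project's code; they only illustrate how such checks could be written.

```python
import re

# Hypothetical format rule: starts with a letter, 2 to 60 characters in total.
PLACE_PATTERN = re.compile(r"^[A-Za-z][A-Za-z .'-]{1,59}$")


class EntryRegistry:
    """Hypothetical store that rejects badly formatted and duplicate entries."""

    def __init__(self):
        self._seen = set()

    def add(self, place_name):
        name = " ".join(place_name.split())  # collapse repeated whitespace
        if not PLACE_PATTERN.match(name):
            raise ValueError(f"invalid format: {place_name!r}")
        key = name.casefold()  # duplicates are compared case-insensitively
        if key in self._seen:
            raise ValueError(f"duplicate entry: {name!r}")
        self._seen.add(key)
        return name


def test_correct_format_is_accepted():
    assert EntryRegistry().add("Chand Baori") == "Chand Baori"


def test_wrong_format_is_rejected():
    for bad in ["", "12345", "<script>"]:
        try:
            EntryRegistry().add(bad)
        except ValueError:
            continue
        raise AssertionError(f"{bad!r} should have been rejected")


def test_duplicates_are_rejected():
    registry = EntryRegistry()
    registry.add("Chand Baori")
    try:
        registry.add("  chand   baori ")
    except ValueError:
        return
    raise AssertionError("duplicate entry was accepted")
```

The test functions follow the naming convention that test runners such as pytest discover automatically; they can also be called directly.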
8.2 Integration Testing
Software integration testing is the incremental integration testing of two
or more integrated software components on a single platform to expose failures
caused by interface defects.
The task of the integration test is to check that components or software
applications (for example, components in a software system or, one step up,
software applications at the company level) interact without error.
Test Results: All the test cases mentioned above passed successfully. No
defects encountered.
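An integration check of the kind described in this section can be sketched as follows. The three components (tokenize, vectorize, predict), the vocabulary and the weights are hypothetical simplifications; they stand in for the project's feature extraction and classification stages only to show where an interface defect would surface.

```python
VOCABULARY = ["beautiful", "clean", "crowded", "dirty"]  # hypothetical vocabulary
WEIGHTS = [1.0, 1.0, -1.0, -1.0]                         # hypothetical term weights


def tokenize(text):
    """Component 1: split text into lowercase tokens without punctuation."""
    return [t.strip(".,!?").lower() for t in text.split()]


def vectorize(tokens):
    """Component 2: count how often each vocabulary term occurs."""
    return [tokens.count(term) for term in VOCABULARY]


def predict(vector):
    """Component 3: map a feature vector to a sentiment label."""
    if len(vector) != len(WEIGHTS):
        raise ValueError("vector length does not match the classifier's weights")
    score = sum(w * x for w, x in zip(WEIGHTS, vector))
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"


def classify(text):
    """The integrated path: tokenizer -> vectorizer -> classifier."""
    return predict(vectorize(tokenize(text)))


def run_integration_checks():
    # The vectorizer output must have the shape the classifier expects.
    vector = vectorize(tokenize("Beautiful and clean, but crowded."))
    assert len(vector) == len(WEIGHTS)
    # The combined components must produce consistent labels.
    assert classify("Beautiful and clean, but crowded.") == "positive"
    assert classify("Dirty, crowded!") == "negative"
    assert classify("We visited on Monday.") == "neutral"
    # An interface defect (wrong vector length) must surface as an error.
    try:
        predict([1, 0])
    except ValueError:
        return True
    return False
```

Each component could pass its own unit tests and the combination could still fail, for example if the vectorizer and the classifier were built from different vocabularies; the length check in predict is what exposes that defect.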

8.3 Acceptance Testing


User Acceptance Testing is a critical phase of any project and requires
significant participation by the end user. It also ensures that the system meets
the functional requirements.
Test Results: All the test cases mentioned above passed successfully. No
defects encountered.
9. INPUT DESIGN AND OUTPUT DESIGN
9.1 INPUT DESIGN:
The input design is the link between the information system and the user. It
comprises developing the specifications and procedures for data preparation,
that is, the steps necessary to put transaction data into a usable form for
processing. This can be achieved by instructing the computer to read data from
a written or printed document, or by having people key the data directly into
the system. The design of input focuses on controlling the amount of input
required, controlling errors, avoiding delay, avoiding extra steps and keeping
the process simple. The input is designed so that it provides security and
ease of use while retaining privacy. Input design considered the following
things:
• What data should be given as input?
• How should the data be arranged or coded?
• The dialog to guide the operating personnel in providing input.
• Methods for preparing input validations and the steps to follow when errors
occur.
OBJECTIVES:
1. Input Design is the process of converting a user-oriented description of the
input into a computer-based system. This design is important to avoid errors in
the data input process and to show the correct direction to the management for
getting correct information from the computerized system.
2. It is achieved by creating user-friendly screens for data entry that can
handle large volumes of data. The goal of designing input is to make data entry
easier and free from errors. The data entry screen is designed in such a way
that all data manipulations can be performed. It also provides record viewing
facilities.
3. When data is entered, it is checked for validity. Data can be entered
with the help of screens. Appropriate messages are provided as and when needed
so that the user is not left confused at any point. Thus the objective of input
design is to create an input layout that is easy to follow.
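The validity check and user messages described in objective 3 can be sketched as below. The function name validate_search_input, the 60-character limit, the allowed characters and the message texts are all assumptions made for illustration; the project's actual rules are not stated in this report.

```python
def validate_search_input(raw, max_length=60):
    """Return (cleaned_value, message); cleaned_value is None when rejected.

    Hypothetical rules: non-empty, at most max_length characters, and only
    letters, spaces, periods, apostrophes and hyphens.
    """
    if raw is None:
        return None, "Please enter a place name."
    cleaned = " ".join(str(raw).split())  # trim and collapse whitespace
    if not cleaned:
        return None, "Please enter a place name."
    if len(cleaned) > max_length:
        return None, f"Place name must be at most {max_length} characters."
    if not all(ch.isalpha() or ch in " .'-" for ch in cleaned):
        return None, "Place name may contain only letters, spaces, and . ' -"
    return cleaned, "OK"


# Each rejected input comes back with a message the screen can display.
for sample in ["  Chand   Baori ", "", "Taj#Mahal"]:
    print(repr(sample), "->", validate_search_input(sample))
```

Returning a message together with the result keeps the validation rules in one place and lets the entry screen show the user exactly what to correct.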
9.2 OUTPUT DESIGN:
A quality output is one which meets the requirements of the end user and
presents the information clearly. In any system, results of processing are
communicated to the users and to other systems through outputs. In output
design it is determined how the information is to be displayed for immediate
need and also as hard copy output. It is the most important and direct source
of information to the user. Efficient and intelligent output design improves
the system's relationship with the user and helps user decision-making.
Designing computer output should proceed in an organized, well thought out
manner; the right output must be developed while ensuring that each output
element is designed so that people can use the system easily and effectively.
When analysts design computer output, they should:
1. Identify the specific output that is needed to meet the requirements.
2. Select methods for presenting information.
3. Create documents, reports, or other formats that contain information
produced by the system.
The output form of an information system should accomplish one or more of the
following objectives.
• Convey information about past activities, current status or projections of
the future.
• Signal important events, opportunities, problems, or warnings.
• Trigger an action.
• Confirm an action.
10. SCREENSHOTS

The output below shows that suggestions also appear in the text box.

The screens below show separate tables for the searched place "Chand Baori". All positive reviews are
shown in the screen below.

All negative reviews are shown in the screen below.


All neutral reviews are shown in the screen below.

The maximum review count is shown in the screen below.


The last small table displays the count of positive, negative and neutral reviews, and its fourth column
shows which class has the majority.
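The summary table described above (three counts plus a majority column) could be computed along the following lines. This is a sketch under assumptions: the function name summarize and the "tie" value for equal top counts are hypothetical, since the report does not say how ties are handled.

```python
from collections import Counter

LABELS = ("positive", "negative", "neutral")


def summarize(predicted_labels):
    """Build one summary row: a count per class and the majority class."""
    counts = Counter(predicted_labels)
    row = {label: counts.get(label, 0) for label in LABELS}
    top = max(row.values())
    winners = [label for label in LABELS if row[label] == top]
    # Assumption: when two or more classes share the top count, report a tie.
    row["majority"] = winners[0] if len(winners) == 1 else "tie"
    return row


print(summarize(["positive", "positive", "negative", "neutral"]))
```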
11. FUTURE ENHANCEMENT
The research study on tourist place review classification using machine
learning algorithms has future scope for handling multilingual review
classification. We will also try different feature selection methods, such as
recursive feature elimination with cross-validation, to improve classification
accuracy. In future work we will try deep learning based techniques for
feature extraction and classification for better performance.

12. CONCLUSION
From this research study, we can infer that TFIDFVectorization outperformed
the CountVectorization feature extraction algorithm in classification
accuracy. However, feature extraction using TFIDFVectorization requires more
execution time than the CountVectorization algorithm. In this research, the
classification algorithms Support Vector Machine (SVM), Naive Bayes (NB) and
Random Forest (RF) were used. It was found that TFIDFVectorization+RF
outperformed the other algorithms on the basis of several evaluation
parameters such as accuracy, precision, recall and f1-score.
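The difference between the two feature extraction methods can be illustrated with a small pure-Python sketch. It is not the project's code: the three sample reviews are made up, and the TF-IDF variant shown (smoothed IDF, ln((1 + n) / (1 + df)) + 1, followed by L2 normalization) is an assumption about the weighting used.

```python
import math
from collections import Counter


def count_vectors(docs):
    """Raw term counts per document over a shared, sorted vocabulary."""
    vocab = sorted({t for d in docs for t in d.split()})
    return vocab, [[Counter(d.split())[t] for t in vocab] for d in docs]


def tfidf_vectors(docs):
    """Counts reweighted by smoothed IDF, then L2-normalized per document."""
    vocab, counts = count_vectors(docs)
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log((1 + n) / (1 + d)) + 1 for d in df]
    rows = []
    for row in counts:
        weighted = [c * w for c, w in zip(row, idf)]
        norm = math.sqrt(sum(x * x for x in weighted)) or 1.0
        rows.append([x / norm for x in weighted])
    return vocab, rows


docs = [
    "the place is beautiful",
    "the place is crowded",
    "the guide is helpful",
]
vocab, counts = count_vectors(docs)
_, tfidf = tfidf_vectors(docs)
i_the, i_beautiful = vocab.index("the"), vocab.index("beautiful")

# Counts treat both words alike; TF-IDF favours the rarer, more telling word.
print(counts[0][i_the], counts[0][i_beautiful])
print(round(tfidf[0][i_the], 3), round(tfidf[0][i_beautiful], 3))
```

The sketch also suggests why TF-IDF takes longer: on top of counting, it needs a pass over the whole collection for document frequencies and a normalization step for every document.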
13. BIBLIOGRAPHY
1. M.D. Devika, C. Sunitha, Amal Ganesh, "Sentiment Analysis: A
Comparative Study on Different Approaches", ScienceDirect, Fourth
International Conference on Recent Trends in Computer Science
Engineering, https://doi.org/10.1016/j.procs.2016.05.124

2. Rohit Joshi, Rajkumar Tekchandani, "Comparative analysis of Twitter
data using supervised classifiers", 2016 International Conference on
Inventive Computation Technologies (ICICT), DOI:
10.1109/INVENTIVE.2016.7830089

3. Harpreet Kaur, Veenu Mangat, Nidhi, "A Survey of Sentiment Analysis
techniques", 2017 International Conference on I-SMAC (IoT in Social,
Mobile, Analytics and Cloud) (I-SMAC), DOI:
10.1109/ISMAC.2017.8058315

4. Mehdi Allahyari, Seyedamin Pouriyeh, Mehdi Assefi, Saied Safaei,
Elizabeth D. Trippe, Juan B. Gutierrez, Krys Kochut, "A Brief Survey of
Text Mining: Classification, Clustering and Extraction Techniques",
arXiv:1707.02919 [cs.CL], July 2017

5. Robert Dzisevič, Dmitrij Šešok, "Text Classification using Different
Feature Extraction Approaches", 2019 Open Conference of Electrical,
Electronic and Information Sciences (eStream)

6. Seyyed Mohammad Hossein Dadgar, Mohammad Shirzad Araghi,
Morteza Mastery Farahani, "A Novel Text Mining Approach Based on
TF-IDF and Support Vector Machine for News Classification", 2nd IEEE
International Conference on Engineering and Technology (ICETECH),
17th-18th March 2016, Coimbatore, TN, India.

7. Rasika Wankhede, Prof. A.N. Thakare, "Design Approach for Accuracy in
Movies Reviews Using Sentiment Analysis", International Conference on
Electronics, Communication and Aerospace Technology (ICECA 2017)

8. Bo Pang, Lillian Lee, Shivakumar Vaithyanathan, "Sentiment
Classification using Machine Learning Techniques", Proceedings of the
Conference on Empirical Methods in Natural Language Processing (EMNLP),
Philadelphia, July 2002, pp. 79-86. Association for Computational
Linguistics.
Muhammad Afzaal, Muhammad Usman, "Novel Framework for Aspect-based
Opinion Classification for Tourist Places", The Tenth International
Conference on Digital Information Management (ICDIM 2015)

9. Upma Kumari, Dr. Arvind K Sharma, Dinesh Soni, "Sentiment analysis of
smart phone product reviews using SVM classification techniques", 2017
International Conference on Energy, Communication, Data Analytics and
Soft Computing (ICECDS)

10. Xing Fang and Justin Zhan, "Sentiment analysis using product review
data", Springer, Journal of Big Data (2015) 2:5, DOI
10.1186/s40537-015-0015-2

11. C. Burges, "A tutorial on support vector machines for pattern
recognition", Data Mining and Knowledge Discovery, vol. 2, pp. 121-167,
1998.

12. Leo Breiman, "Random Forests", Statistics Department, University
of California, Berkeley, CA 94720

13. C. Sheppard, Tree-based Machine Learning Algorithms: Decision Trees,
Random Forests, and Boosting. CreateSpace Independent Publishing
Platform, 2017.

14. Kamal Sarkar, "Using Character N-gram Features and Multinomial Naive
Bayes for Sentiment Polarity Detection in Bengali Tweets", 2018 Fifth
International Conference on Emerging Applications of Information
Technology (EAIT)

15. Text Classification and Naive Bayes,
https://web.stanford.edu/~jurafsky/slp3/slides/7_NB.pdf

16. Dixa Saxena, S. K. Saritha, PhD, K. N. S. S. V. Prasad, "Survey Paper on
Feature Extraction Methods in Text Categorization", International Journal
of Computer Applications (0975-8887), Volume 166, No. 11, May 2017

17. https://www.tripadvisor.in/

18. https://www.mouthshut.com/
