In the current era, social media plays an important role in exchanging data and sharing
thoughts. A person's emotional state plays an important role in their day-to-day life.
Product Rating Analysis is the process of analysing the opinions and the polarity of a
person's thoughts.
Twitter is a major platform for sharing thoughts, opinions and Product Ratings on different
occasions. Twitter Product Rating Analysis is a method of analysing the emotions expressed
in tweets. Tweets are helpful in extracting Product Rating values from the user. The data
provide a polarity indication: positive, negative or neutral. The analysis focuses on a
person's tweets and hashtags to understand the situation in each aspect of the
criteria. This paper analyses a well-known person's ID (@realdonaldtrump) or hashtags
(#IPL2018) to understand the mindset of people in each situation when the person has
The proposed system analyses the Product Rating of people using PHP, the Twitter API and
TextBlob (a library for processing text). As a result, it helps to analyse posts with
better accuracy.
1. INTRODUCTION
In the era of social media, platforms like Twitter have become invaluable resources for
understanding public sentiment towards various topics, including global crises such as the
COVID-19 pandemic. Sentiment analysis enables the automated extraction and categorization
of sentiment from text data. This project aims to apply machine learning techniques to
analyse sentiment in COVID-19-related tweets on Twitter.
1.1.1 Objectives
• Preprocess the data to remove noise, such as URLs, special characters, and stop words.
• Utilize machine learning models for sentiment analysis, including but not limited to:
- Naive Bayes
• Train and fine-tune the chosen models on the pre-processed Twitter data.
• Evaluate the performance of each model using metrics such as accuracy, precision, recall,
and F1-score.
• Visualize the sentiment trends over time using time-series analysis techniques.
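The evaluation metrics named in the objectives can be illustrated with a minimal sketch. It
assumes a single "positive" class of interest and two parallel arrays of labels; the function
name `evaluate` and the sample labels are hypothetical and are not taken from the project code.

```javascript
// Compute accuracy, precision, recall and F1 for one class of interest.
// `actual` and `predicted` are parallel arrays of labels (hypothetical data).
function evaluate(actual, predicted, positiveLabel) {
  if (actual.length !== predicted.length) {
    throw new Error('actual and predicted must have the same length');
  }
  let tp = 0, fp = 0, fn = 0, tn = 0;
  for (let i = 0; i < actual.length; i++) {
    const isPos = actual[i] === positiveLabel;
    const saidPos = predicted[i] === positiveLabel;
    if (isPos && saidPos) tp++;        // true positive
    else if (!isPos && saidPos) fp++;  // false positive
    else if (isPos && !saidPos) fn++;  // false negative
    else tn++;                         // true negative
  }
  const total = tp + fp + fn + tn;
  // Guard every ratio against division by zero.
  const accuracy = total === 0 ? 0 : (tp + tn) / total;
  const precision = tp + fp === 0 ? 0 : tp / (tp + fp);
  const recall = tp + fn === 0 ? 0 : tp / (tp + fn);
  const f1 = precision + recall === 0
    ? 0
    : (2 * precision * recall) / (precision + recall);
  return { accuracy, precision, recall, f1 };
}

// Six hypothetical tweets: true labels versus model output.
const actualLabels = ['pos', 'pos', 'neg', 'neg', 'pos', 'neg'];
const predictedLabels = ['pos', 'neg', 'neg', 'pos', 'pos', 'neg'];
console.log(evaluate(actualLabels, predictedLabels, 'pos'));
```

With these six labels there are 2 true positives, 1 false positive, 1 false negative and 2
true negatives, so all four metrics come out to 2/3. For more than two classes, the same
function can be called once per class.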
1.1.2 Methodology
• Data Collection
- Utilize the Twitter API to collect a large volume of tweets related to COVID-19 using
• Data Preprocessing
- Remove noise from the tweets, including URLs, special characters, and stop words.
- Tokenize the text and perform lemmatization or stemming to normalize the words.
- Encode the text data into numerical representations suitable for machine learning models.
• Model Training:
- Implement and train various machine learning models for sentiment analysis, including
traditional algorithms like Naive Bayes and SVM, as well as deep learning models like RNNs
and BERT.
- Fine-tune the hyperparameters of the models using techniques like grid search or random
search.
• Model Evaluation:
- Evaluate the performance of each model using standard evaluation metrics such as
- Compare the performance of different models to identify the most effective approach for
- Visualize the sentiment trends over time using techniques such as time-series analysis and
- Analyse the findings to gain insights into public sentiment towards COVID-19 and related
topics on Twitter.
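The preprocessing and encoding steps listed in the methodology can be sketched as follows.
This is a minimal illustration rather than the project's actual pipeline: the stop-word list
is a short made-up sample, lemmatization and stemming are left out, and the function names
(`preprocess`, `buildVocabulary`, `encode`) are hypothetical.

```javascript
// A small illustrative stop-word list (an assumption; real lists are longer).
const STOP_WORDS = new Set([
  'a', 'an', 'the', 'is', 'are', 'to', 'of', 'and', 'in', 'on', 'for', 'this',
]);

// Remove URLs and special characters, lowercase, tokenize, drop stop words.
function preprocess(tweet) {
  return tweet
    .toLowerCase()
    .replace(/https?:\/\/\S+/g, ' ') // strip URLs
    .replace(/[^a-z\s]/g, ' ')       // strip digits, punctuation and symbols
    .split(/\s+/)                    // tokenize on whitespace
    .filter((tok) => tok.length > 0 && !STOP_WORDS.has(tok));
}

// Build a vocabulary (token -> column index) from tokenized tweets.
function buildVocabulary(tokenLists) {
  const vocab = new Map();
  for (const tokens of tokenLists) {
    for (const tok of tokens) {
      if (!vocab.has(tok)) vocab.set(tok, vocab.size);
    }
  }
  return vocab;
}

// Encode one token list as a vector of term counts (bag of words).
function encode(tokens, vocab) {
  const vec = new Array(vocab.size).fill(0);
  for (const tok of tokens) {
    const idx = vocab.get(tok);
    if (idx !== undefined) vec[idx] += 1; // ignore tokens outside the vocabulary
  }
  return vec;
}

// Two hypothetical tweets.
const sampleTweets = [
  'Vaccines are working! https://example.com/x',
  'The lockdown is not working :(',
];
const tokenLists = sampleTweets.map(preprocess);
const vocabulary = buildVocabulary(tokenLists);
console.log(tokenLists[0]);                     // ['vaccines', 'working']
console.log(encode(tokenLists[1], vocabulary)); // [0, 1, 1, 1]
```

A count vector of this kind is the simplest numerical representation; the deep learning
models mentioned above would normally use learned embeddings instead.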
1.1.3 Deliverables
• Documentation detailing the methodology, findings, and insights gained from the analysis.
2. SYSTEM ANALYSIS
2.1 EXISTING SYSTEM
Product Rating analysis is the process of analysing text at several levels.
The first level is the document level, where the classification task determines the class
of a document based on its attributes. The text can then be analysed at the sentence level
to classify each sentence as negative, positive or neutral.
Twitter Product Rating analysis comes under the category of text and opinion
mining. It focuses on analysing the Product Ratings of the tweets and feeding the data
to a machine learning model to train it and then check its accuracy, so that we can use
detection, Product Rating classification, training and testing the model. This system
has evolved during the last decade, with models reaching an efficiency of almost
85%-90%. But it still lacks diversity in the data. It also has many application issues
with slang and abbreviated words. Many analysers don't perform well when the number of
classes is increased. Also, it has not yet been tested how accurate the model will be
for topics other than the one under consideration. Hence, Product Rating analysis has
considerable scope for future development.
In this method we use TextBlob to find the polarity of the text.
The tweets are imported from Twitter using the API provided by the Twitter Developer
platform. From this API, various fields such as tweets, source, retweets, likes,
language and user can be scraped. After collecting these data, we can analyse the
3. SYSTEM CONFIGURATION
3.1 HARDWARE CONFIGURATION
HDD: >90GB
RAM: >2GB
PHP
PHP started out as a small open-source project that evolved as more and more people
found out how useful it was. Rasmus Lerdorf unleashed the first version of PHP way back in
1994.
• PHP is pleasingly zippy in its execution, especially when compiled as an Apache
module on the Unix side. The MySQL server, once started, executes even very
complex queries with huge result sets in record-setting time.
• PHP supports a large number of major protocols such as POP3, IMAP, and LDAP.
PHP4 added support for Java and distributed object architectures (COM and
CORBA), making n-tier development a possibility for the first time.
• PHP performs system functions: it can create, open, read, write, and close files on a
system.
• PHP can handle forms: it can gather data from files, save data to a file, send data by
email, and return data to the user.
• You can add, delete, and modify elements within your database through PHP.
• Using PHP, you can restrict user access to some pages of your website.
Characteristics of PHP
• Simplicity
• Efficiency
• Security
• Flexibility
• Familiarity
To get a feel for PHP, first start with simple PHP scripts. Since "Hello, World!" is an
essential example, first we will create a friendly little "Hello, World!" script.
As mentioned earlier, PHP is embedded in HTML. That means that in amongst your normal
HTML (or XHTML if you're cutting-edge) you'll have PHP statements like this −
<html>
<head>
<title>Hello World</title>
</head>
<body>
<?php echo "Hello, World! ";?>
</body>
</html>
This produces the following output:
Hello, World!
If you examine the HTML output of the above example, you'll notice that the PHP code is
not present in the file sent from the server to your Web browser. All of the PHP present in
the Web page is processed and stripped from the page; the only thing returned to the client
from the Web server is pure HTML output.
All PHP code must be included inside one of the three special markup tags that are
recognised by the PHP parser.
<script language = "php"> PHP code goes here </script>
Note that the <script language = "php"> form was removed in PHP 7.0. The most common tag
is <?php...?>, and the same tag is used throughout this document.
MySQL
There are a few elements of MySQL. Let’s take a closer look at them:
Database
users to store, retrieve, update, and delete data. MySQL provides the software framework to
create, maintain, and interact with these databases, making data storage and retrieval
Client-Server Model
Computers that install and run RDBMS software are called clients. Whenever they need to
access data, they connect to the RDBMS server.
MySQL is one of many RDBMS software options. RDBMS and MySQL are often thought to
be the same because of MySQL’s popularity. A few big web applications like Facebook,
Twitter, YouTube, Google, and Yahoo! all use MySQL for data storage purposes. Even
though it was initially created for limited usage, it is now compatible with many important
computing platforms like Linux, macOS, Microsoft Windows, and Ubuntu.
MySQL and SQL are not the same. MySQL is the brand name of one of the most popular RDBMS
software products, which implements a client-server model.
The client and server use a domain-specific language, Structured Query Language (SQL), to
communicate in an RDBMS environment. If you encounter other names that have SQL in them,
like PostgreSQL and Microsoft SQL Server, they are most likely brands that also use
Structured Query Language syntax. RDBMS software is often written in other programming
languages but uses SQL as its primary language to interact with the database. MySQL
itself is written in C and C++.
SQL tells the server what to do with the data. In this case, SQL statements can instruct the
server to perform certain operations:
XAMPP
XAMPP lets developers test a website and its clients on a local machine before releasing
it to the main server. It is a platform that furnishes a suitable environment to test and
verify projects based on Apache, Perl, the MySQL database, and PHP on the host system
itself. Among these technologies, Perl is a programming language used for web development,
PHP is a backend scripting language, and MariaDB is a widely used database created by the
original developers of MySQL. A detailed description of these components is given below.
Components of XAMPP
As described earlier, XAMPP bundles solutions for several technologies. It provides a base
for testing projects built on different technologies through a personal server. XAMPP is
an acronym in which each letter stands for one of its major components. This collection of
software contains a web server named Apache, a database management system named MariaDB
and scripting/programming languages such as PHP and Perl. X denotes cross-platform, which
means that it can work on different platforms such as Windows, Linux, and macOS.
Many other components are also part of this collection of software and are explained below.
• Apache: It is a cross-platform HTTP web server. It is used worldwide for delivering web
content. The server application is free to install and use for the developer community
under the aegis of the Apache Software Foundation. The Apache server delivers the
requested files, images, and other documents to the user.
• MariaDB: Originally, the MySQL DBMS was a part of XAMPP, but it has since been replaced
by MariaDB. It is a widely used relational DBMS, created by the original developers of
MySQL. It offers data storage, manipulation, retrieval, arrangement, and deletion.
• PHP: It is a backend scripting language primarily used for web development. PHP allows
users to create dynamic websites and applications. It can be installed on every major
platform and supports a variety of database management systems. It is implemented in C.
PHP stands for "PHP: Hypertext Preprocessor". The name is derived from Personal Home Page
tools, which explains its simplicity and functionality.
• Perl: It is a family of high-level dynamic languages that has included Perl 5 and Perl 6
(the latter since renamed Raku). Perl can be applied to problems in system administration,
web development, and networking. Perl allows its users to program dynamic web
applications. It is very flexible and robust.
• phpMyAdmin: It is a tool for administering MariaDB. Version 4.0.4 is bundled with the
XAMPP release described here.
• OpenSSL: It is an open-source implementation of the Secure Sockets Layer (SSL) and
Transport Layer Security (TLS) protocols. Version 0.9.8 is part of this XAMPP release.
• XAMPP Control Panel: It is a panel that helps to operate and regulate the other
components of XAMPP. Version 3.2.1 is the version described here.
• Webalizer: It is a web analytics tool that analyses server logs and provides details
about usage.
• Mercury: It is a mail transport system (a mail server) that helps to manage mail across
the web. The version described here is 4.62.
• Tomcat: Version 7.0.42 is bundled with this XAMPP release. It is a Java servlet
container that provides Java web functionality.
• FileZilla: It is a File Transfer Protocol Server, which supports and eases the transfer
operations performed on files. Its recently updated version is 0.9.41.
4. SYSTEM DESIGN
4.1 Detailed design
Detailed design takes the high-level system architecture and breaks it down into intricate
details. It focuses on how each component works internally, how they talk to each other, and
what data they use. This refined blueprint, often with diagrams, ensures a smooth transition
from design to development for a well-functioning system.
4.1.2 UML diagram:
The software was renamed StarUML 5.0 in 2005 with a view to publishing it as
open source. The aim was to provide UML 2.0 support as well as the capability to use
third-party plugins. The first public release was published in August 2006 on
SourceForge under the GNU GPL license. The source code included multiple copyright
notices for the period 2002-2005 by Plastic Software Inc. At that time the software
targeted the Win32 platform and was essentially written in Delphi. The software evolved
over several years as an open-source project and was recognized as an MDA tool with a
capability to assist in reverse-engineering existing code. The last open-source version
was published in 2010. It may still be used today, but according to the owner of the
product, it is no longer maintained or supported.
A crowdfunding campaign was launched in 2014 to finance a revival of the project
under the name StarUML 2. The aim of the initiative was to add support for other
languages than Java and other modeling notations than UML. The campaign failed to
raise the needed funds: less than 1000 USD were collected, that is 1% of the campaign's
target.
StarUML offers object-oriented modelling capabilities. It supports most of the
diagram types specified in UML 2.0:
‣ Class diagrams
‣ Component diagrams
‣ Object diagrams
‣ Package diagrams
‣ Use-case diagrams
‣ Activity diagrams
‣ Sequence diagrams
‣ Communication diagrams
‣ Timing diagrams
‣ State diagrams
‣ Profile diagrams
4.1.2.1 USE CASE DIAGRAM:
Use case diagrams are used for high-level requirement analysis of a system.
When the requirements of a system are analysed, the functionalities are captured in
use cases. In other words, use cases are simply the system functionalities
written in an organized manner.
4.1.2.2 SEQUENCE DIAGRAM:
4.1.2.3 DATAFLOW DIAGRAM:
DFD is the abbreviation for Data Flow Diagram. A DFD represents the flow of data through
a system or a process. It also gives insight into the inputs and outputs of each entity
and of the process itself. A DFD has no control flow: no loops or decision rules are
present. Operations that depend on the type of data can instead be explained by a
flowchart. A DFD is a graphical tool, useful for communicating with users, managers and
other personnel. It is useful for analysing existing as well as proposed systems.
4.1.2.4 TABLE DESIGN:
Table Design is a powerful feature in SAP Analysis for Microsoft Office. It brings more
flexibility in designing and editing crosstabs and enables formatting and laying out
reports while keeping the ability for OLAP navigation. The changes made using
Table Design persist after navigation steps that force a rebuild of the crosstab, such as a
refresh or swapping axes.
The following Table Design options are available to edit the crosstabs:
• New lines
• Formats
• Formulas (also formulas to input-enabled planning data cells)
• Text
5. SYSTEM IMPLEMENTATION
5.1 MODULE LIST
➢ Twitter Extraction
Twitter Extraction - It facilitates interaction between the system and the client. The
client has privileges to create an account and access his/her feeds from the system.
Classification of the tweets - Using the Naïve Bayes algorithm, the data can be analysed
to recognize the patterns used for classification and regression analysis. After
pre-processing, the extracted data is classified into keyword-related tweets. The
classifier predicts the group of each tweet and clusters the tweets according to the user
group. Tweets in the same cluster are grouped under one classification: all positive
tweets are clustered into a positive classification, all negative tweets into a negative
classification, tweets containing a mixture of both into a mixed classification, and all
neutral tweets into a neutral classification. Classifying the tweets makes it easier to
find the score for each classification, and hence the polarity of each classification
can be represented.
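The Naïve Bayes classification described above can be sketched as a small multinomial
classifier with add-one (Laplace) smoothing. This is an illustrative sketch under stated
assumptions, not the project's implementation: the function names `trainNaiveBayes` and
`classify` are hypothetical, the training tweets are made up, and only two of the four
classes are shown for brevity.

```javascript
// Train a multinomial Naive Bayes model from tokenized, labelled tweets.
function trainNaiveBayes(examples) {
  const model = {
    totalDocs: 0,
    docCounts: new Map(),  // label -> number of training tweets
    wordCounts: new Map(), // label -> Map(token -> count)
    totalWords: new Map(), // label -> total tokens seen for that label
    vocab: new Set(),      // every distinct token seen in training
  };
  for (const { tokens, label } of examples) {
    model.totalDocs += 1;
    model.docCounts.set(label, (model.docCounts.get(label) || 0) + 1);
    if (!model.wordCounts.has(label)) model.wordCounts.set(label, new Map());
    const counts = model.wordCounts.get(label);
    for (const tok of tokens) {
      counts.set(tok, (counts.get(tok) || 0) + 1);
      model.totalWords.set(label, (model.totalWords.get(label) || 0) + 1);
      model.vocab.add(tok);
    }
  }
  return model;
}

// Return the label with the highest log-posterior for the given tokens.
function classify(model, tokens) {
  let bestLabel = null;
  let bestScore = -Infinity;
  for (const [label, docCount] of model.docCounts) {
    // Log prior: share of training tweets carrying this label.
    let score = Math.log(docCount / model.totalDocs);
    const counts = model.wordCounts.get(label);
    const total = model.totalWords.get(label) || 0;
    for (const tok of tokens) {
      const count = counts.get(tok) || 0;
      // Add-one smoothing keeps unseen words from zeroing the probability.
      score += Math.log((count + 1) / (total + model.vocab.size));
    }
    if (score > bestScore) {
      bestScore = score;
      bestLabel = label;
    }
  }
  return bestLabel;
}

// Hypothetical training data.
const trainingSet = [
  { tokens: ['great', 'product', 'love'], label: 'positive' },
  { tokens: ['love', 'fast', 'delivery'], label: 'positive' },
  { tokens: ['terrible', 'product', 'broken'], label: 'negative' },
  { tokens: ['bad', 'slow', 'delivery'], label: 'negative' },
];
const nbModel = trainNaiveBayes(trainingSet);
console.log(classify(nbModel, ['love', 'great']));  // positive
console.log(classify(nbModel, ['broken', 'slow'])); // negative
```

Working in log space avoids numeric underflow when many word probabilities are multiplied.
The mixed and neutral classes would be handled the same way, given labelled examples for
them.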
categorise the positive, negative, mixed and neutral comments related to the text
categorization. Product Rating analysis is complex because the same assumption can be
conveyed in different ways. In opinion texts, the lexical content might be misleading.
Intra-textual and sub-sentential reversals and negation must commonly be interpreted. The
following are the possibilities that need to be classified: users, texts, sentences
(paragraphs, chunks of text),
data sets are scope-sensitive, and it is challenging to create or collect data from a
large domain. Representing the Product Rating needs more attention to the elements to
classify and to whether domain-appropriate annotated data is available. This work deals
with the analysis of tweets and checks the behaviour of the tweets posted by the user.
6. SYSTEM TESTING
System testing usually consists of a layered process, including the user interface (UI)
layer, the business layer, the data access layer and the database itself. The UI layer deals with
the interface design of the database, while the business layer includes databases
Databases, the collection of interconnected files on a server, storing information, may not
deal with the same type of data, i.e. databases may be heterogeneous. As a result, many kinds
of implementation and integration errors may occur in large database systems, which
negatively affect the system's performance, reliability, consistency and security. Thus, it is
important to test in order to obtain a database system which satisfies the ACID properties
One of the most critical layers is the data access layer, which deals with databases
directly during the communication process. Database testing mainly takes place at this layer
and involves testing strategies such as quality control and quality assurance of the product
Symantec, who are associated with data storage, need to have a durable and consistent
database system. If database operations such as insert, delete, and update are performed
without testing the database for consistency first, the company risks a crash of the entire
system.
Some companies have different types of databases, and also different goals and
missions. In order to achieve a level of functionality to meet said goals, they need to test their
database system.
The current approach, in which developers formally test the databases, may not be
sufficient. This approach is not sufficiently effective since database
developers are likely to slow down the testing process due to communication gaps. A
Database testing mainly deals with finding errors in the databases so as to eliminate
them. This will improve the quality of the database or web-based system.
Database testing should be distinguished from strategies to deal with other problems
such as database crashes, broken insertions, deletions or updates. Here, database refactoring
The figure indicates the areas of testing involved during different database testing
➢ WHITE BOX TESTING: White-box testing (also known as clear box testing, glass
box testing, transparent box testing, and structural testing) is a method of testing
in which the tester chooses inputs to exercise paths through the code and determines
the expected outputs. This is analogous to testing nodes in a circuit, e.g. in-circuit
testing (ICT). White-box testing can be applied at the unit, integration and system
levels of the software testing process. Although traditional testers tended to think
of white-box testing as being done at the unit level, it is used for integration and
system testing more frequently today. It can test paths within a unit, paths between
units during integration, and paths between subsystems during a system-level test.
Though this method of test design can uncover many errors or problems, it has the
potential to miss unimplemented parts of the specification.
➢ BLACK BOX TESTING: Black-box testing examines the functionality of an application
without peering into its internal structures or workings. This method of test can be
applied to virtually every level of software testing. Test cases are built around
requirements, i.e., what the application is supposed to do. Test cases are generally
derived from requirements and design parameters. The tests used are primarily
functional.
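As a small illustration of the white-box approach, the name-validation pattern used in the
source code of this project (`/^[a-z]{3,15}$/i`) can be tested with inputs chosen from
knowledge of the code: one input for each branch and for each length boundary. The helper
`isInvalidName` is a hypothetical DOM-free version of that check, written only for this
sketch.

```javascript
// DOM-free version of the name check (hypothetical helper for testing).
// Returns true when the value is INVALID, mirroring the listing's convention.
function isInvalidName(value) {
  const re = /^[a-z]{3,15}$/i;
  return !re.test(value);
}

// Inputs chosen from the structure of the code: both branches, each boundary.
const nameCases = [
  { input: 'Al', expected: true },             // below the minimum length of 3
  { input: 'Ann', expected: false },           // exactly the minimum length
  { input: 'a'.repeat(15), expected: false },  // exactly the maximum length
  { input: 'a'.repeat(16), expected: true },   // above the maximum length of 15
  { input: 'Ann3', expected: true },           // a digit is rejected
  { input: '', expected: true },               // an empty value is rejected
];

const nameFailures = nameCases.filter(
  (c) => isInvalidName(c.input) !== c.expected
);
console.log(nameFailures.length === 0 ? 'all paths pass' : nameFailures);
```

A black-box test of the same field would instead be derived from the stated requirement
alone, without looking at the regular expression.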
• SCREENSHOT
Source code
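// Form fields. Only the #username selector survives in the original listing;
// the other ids below are assumed to follow the same naming pattern.
const firstName = document.querySelector('#firstName');
const lastName = document.querySelector('#lastName');
const email = document.querySelector('#email');
const username = document.querySelector('#username');
const bio = document.querySelector('#bio');
const dob = document.querySelector('#dob');
const currentPassword = document.querySelector('#currentPassword');
const password = document.querySelector('#password');
const confirmPassword = document.querySelector('#confirmPassword');
// Re-validate each field as the user types.
firstName.addEventListener('keyup', validateFirstName);
lastName.addEventListener('keyup', validateLastName);
email.addEventListener('keyup', validateEmail);
username.addEventListener('keyup', validateUsername);
bio.addEventListener('keyup', validateBio);
dob.addEventListener('keyup', validateDob);
currentPassword.addEventListener('keyup', validateCurrentPassword);
password.addEventListener('keyup', validatePassword);
confirmPassword.addEventListener('keyup', validateConfirmPassword);
// Each validator returns true when its field is INVALID.
function validateFirstName() {
  return validateNames(firstName);
}
function validateLastName() {
  return validateNames(lastName);
}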
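// Returns true when the name is invalid (3 to 15 letters are required).
function validateNames(elem) {
  const text = elem.value; // assumed: the field's current text
  const re = /^[a-z]{3,15}$/i;
  if (!re.test(text)) {
    elem.classList.add('is-invalid');
    return true;
  } else {
    elem.classList.remove('is-invalid');
    return false;
  }
}
// Returns true when the email address is invalid.
function validateEmail(e) {
  const text = email.value; // assumed: the field's current text
  const re = /^[a-z]([\w\_\-\.])+@([\w\.])+\.(\w){2,5}$/i;
  if (!re.test(text)) {
    email.classList.add('is-invalid');
    return true;
  } else {
    email.classList.remove('is-invalid');
    return false;
  }
}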
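// Returns true when the username is invalid (3 to 10 word characters).
function validateUsername(e) {
  const text = username.value; // assumed: the field's current text
  const re = /^([\w\_]){3,10}$/i;
  if (!re.test(text)) {
    username.classList.add('is-invalid');
    return true;
  } else {
    username.classList.remove('is-invalid');
    return false;
  }
}
// Returns true when the bio is empty (any non-empty text is accepted).
function validateBio() {
  const text = bio.value; // assumed: the field's current text
  const re = /^[\w\W]+$/i;
  if (!re.test(text)) {
    bio.classList.add('is-invalid');
    return true;
  } else {
    bio.classList.remove('is-invalid');
    return false;
  }
}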
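// Returns true when the date of birth does not look like "Jan 01, 2000".
function validateDob(e) {
  const text = dob.value; // assumed: the field's current text
  const re = /^([\w\s])+\,(\s)([0-9]){4}$/i;
  if (!re.test(text)) {
    dob.classList.add('is-invalid');
    return true;
  } else {
    dob.classList.remove('is-invalid');
    return false;
  }
}
// Returns true when the current password is shorter than 6 word characters.
function validateCurrentPassword(e) {
  const text = currentPassword.value; // assumed: the field's current text
  const re = /^[\w]{6,}$/i;
  if (!re.test(text)) { // test restored by analogy with the other validators
    currentPassword.classList.add('is-invalid');
    return true;
  } else {
    currentPassword.classList.remove('is-invalid');
    return false;
  }
}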
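// Returns true when the new password is shorter than 6 word characters.
function validatePassword(e) {
  const text = password.value; // assumed: the field's current text
  const re = /^[\w]{6,}$/i;
  if (!re.test(text)) { // test restored by analogy with the other validators
    password.classList.add('is-invalid');
    return true;
  } else {
    password.classList.remove('is-invalid');
    return false;
  }
}
// Returns true when the confirmation is invalid. The condition is missing
// from the original listing; a mismatch with the new password is assumed.
function validateConfirmPassword(e) {
  if (confirmPassword.value !== password.value) {
    confirmPassword.classList.add('is-invalid');
    return true;
  } else {
    confirmPassword.classList.remove('is-invalid');
    return false;
  }
}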
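// Block submission of the profile form while any field is invalid.
// The original listing does not show which elements these handlers are
// attached to; `profileForm`, `passwordForm` and their ids are hypothetical.
const profileForm = document.querySelector('#profileForm');
profileForm.addEventListener('submit', function(e) {
  if (validateFirstName() || validateLastName() || validateEmail() ||
      validateUsername() || validateBio() || validateDob()) {
    // Re-run every check so that all invalid fields are highlighted.
    validateFirstName();
    validateLastName();
    validateEmail();
    validateUsername();
    validateBio();
    validateDob();
  } else {
    return true;
  }
  e.preventDefault();
});
// The same pattern applies to the password form.
const passwordForm = document.querySelector('#passwordForm');
passwordForm.addEventListener('submit', function(e) {
  if (validateCurrentPassword() || validatePassword() || validateConfirmPassword()) {
    validateCurrentPassword();
    validatePassword();
    validateConfirmPassword();
  } else {
    return true;
  }
  e.preventDefault();
});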
// Short month names, used when formatting the date of birth.
const monthsShort = [
  'Jan',
  'Feb',
  'Mar',
  'Apr',
  'May',
  'Jun',
  'Jul',
  'Aug',
  'Sep',
  'Oct',
  'Nov',
  'Dec'
];
date.getFullYear();
document.addEventListener('DOMContentLoaded', function() {
dob.addEventListener('focus', () => {
instances.open();
});
});
• CONCLUSION.
In machine learning both classifiers achieve the best results when using the features of
Bayes classifier. The best performance on the test set comes from the Logistic Regression
with features from Count_Vectorizer. This can be further implemented using the deep