Major Project
By
Ansh Vyas
20BCM005
Guided By
CERTIFICATE
This is to certify that the Computer Engineering Project entitled Text Analytics submitted by
Ansh Vyas 20BCM005, towards the partial fulfillment of the requirements for the degree of
Integrated B.Tech.(CSE)-MBA of Nirma University is the record of work carried out by
him/her under my supervision and guidance. In my opinion, the submitted work has reached
the level required for being accepted for examination.
ACKNOWLEDGEMENT
I would like to take this opportunity to express my sincere gratitude to the following
individuals and organizations for their support and assistance throughout the course
of this project.
I would like to express my appreciation to Dr. Gaurang Raval, my project guide, for
his constant encouragement, valuable insights, and guidance throughout the course of
this project, and for his constant suggestions to improve my work.
This report includes all the information and tasks that I carried out during the internship
period of 8 weeks.
I wish to express my sincere gratitude to the whole Company and my faculty mentor
for being so supportive and helpful during the whole journey.
Thank you all for your support and encouragement.
ABSTRACT
This project aims to develop a text analytics system that extracts valuable insights from large
volumes of news articles. By analyzing news data from various sources, the system can
provide users with real-time information about trending topics, sentiment analysis, and key
events.
The project leverages web crawling, ECC, and data visualization techniques to classify news
articles into different categories, such as politics, sports, and finance, enabling users to filter
and focus on their areas of interest. Additionally, the system incorporates named entity
recognition to identify and track specific topics mentioned in the news, such as crime rates
and education rates, in different districts of Rajasthan.
The extracted information is then visualized through intuitive dashboards and interactive
charts, enabling users to understand and interpret news trends effectively. This text analytics
system is made for different departments of Govt. of Rajasthan seeking to stay informed about
the latest developments and make data-driven decisions based on comprehensive news
analysis.
CONTENTS
Certificate
Acknowledgment
Abstract
List of Figures
List of Tables
Chapter 1 Introduction
1.1 ABOUT THE COMPANY
1.1.1 Introduction of the company
1.1.2 Quality policy
1.1.3 Communication
1.1.4 Resources
1.2 THE SYSTEM
1.2.1 Definition of system
1.2.2 Purpose and objectives
1.2.3 About present system
1.2.4 Proposed system
1.3 PROJECT PROFILE
1.3.1 Project title
1.3.2 Scope of the project
1.3.3 Project team
1.3.4 Hardware/Software environment in the company
Chapter 2 System Analysis
2.1 FEASIBILITY STUDY
2.1.1 Operational Feasibility
2.1.2 Technical Feasibility
2.1.3 Financial and economic feasibility
2.1.4 Handling infeasible projects
2.2 REQUIREMENT ANALYSIS
2.2.1 Facts-Finding Techniques
2.2.1.1 Interview
2.2.1.2 Questionnaire
2.2.1.3 Record Review
2.2.1.4 Observation
2.3 CONTEXT DIAGRAM
2.4 DATA FLOW DIAGRAMS
2.3.1 First level DFD
2.3.2 Second level DFD
Chapter 3 System Design
3.1 System flow
3.2 Entity-Relationship Diagram
3.3 Data dictionary
Chapter 4 Result and Discussion
4.1 Results
4.2 Discussion
Chapter 6 Testing
Chapter 1 : Introduction
Fig.1
E-Connect has worked on software products such as Citizen CONNECT, WorkX,
Anytime Auction, and E-Prashashan. The company has also worked with the government's
Ministry of Information Technology and Communications. From Rajasthan, it works on major IT
projects such as text analytics.
1.1.2 Quality policy
E-Connect Solution's quality policy is to provide IT solutions that meet or exceed the
expectations of its customers and stakeholders. The company is committed to delivering
high-quality products and services that are reliable, secure, and innovative. It strives to
continuously improve its processes, performance, and customer satisfaction, adheres to the
best practices and standards of the IT industry, and complies with all applicable laws and
regulations. It also fosters a culture of quality, accountability, and excellence among its
employees and partners.
1.1.3 Communication
In our opinion, open and honest communication is critical to a company's success. We strive
to provide accurate and timely information to all stakeholders, including employees and
customers.
Our communication goals include:
• Increase employee engagement
• Improve customer satisfaction
• Establish a strong brand for our company.
We monitor the following to assess how well our communications are working:
1.1.4 Resources
Financial resources: These are the funds and sources of income that the company needs to
operate, invest, grow and innovate. Examples of financial resources are cash, loans, equity,
grants, revenue and profits.
Human resources: These are the people who work for E-Connect and contribute their skills,
knowledge, creativity and motivation. Examples of human resources are employees,
managers, leaders, consultants, contractors and partners.
Material resources: These are the physical items and infrastructure that E-Connect uses to
produce its products or services. Examples of material resources are hardware, software,
equipment, facilities, networks and data centers.
Intellectual resources: These are the intangible assets and knowledge that the company owns
or accesses to gain a competitive advantage. Examples of intellectual resources are patents,
trademarks, copyrights, trade secrets, brand, reputation and expertise.
Collaborative resources: These are the clients for which E-Connect works in addition to its own
projects, such as DOITC, a govt. organization that E-Connect Solutions works for.
1.2 THE SYSTEM
1.2.1 Definition of system
Different departments of the Govt. of Rajasthan track news on topics related to them, for
example crime rates in different districts of Rajasthan. They also track the sentiment of the
news, that is, whether it is positive, negative, or neutral, and then build interactive dashboards
to represent the data visually.
1.2.2 Purpose and objectives
The main purpose of the system is to track events in the state of Rajasthan, such as the
number of rape cases and crime rates. The main objective is to forward the interactive
dashboards to different govt. departments so that they can work efficiently and get a reality
check.
1.2.3 About present system
Employees of different govt. departments had to read all the news sources one by one and
record the relevant data in a data frame. They then had to organize the data according to its
severity and perform data visualization on the resulting dataset.
1.2.4 Proposed System
With the help of modern data analytics tools from SAS, we can crawl data from different
news sources, preprocess it according to our requirements, and then use ECC tools to
categorize it according to sentiment. SAS Visual Analytics is then used to build interactive
dashboards that make the data easy to understand, and these dashboards are sent to DOITC.
Text analytics is the process of applying data analysis techniques to news content to extract
insights and trends from it. It can be used for various purposes by media companies,
government agencies, businesses, and non-profit organizations.
Text analytics can be performed using different methods and tools, such as:
- Text mining, which is the process of extracting information from unstructured text data
- Machine learning, which is the field of computer science that enables machines to learn
from data and make predictions
- Data visualization, which is the presentation of data in graphical or interactive forms
Media companies: Text analytics can be used by media companies to track the performance
of their news outlets and to identify trends in news consumption.
Government agencies: Government agencies can use text analytics to track public opinion on
a variety of issues and to identify emerging threats.
Businesses: Businesses can use text analytics to track their competitors, identify new market
opportunities, and gauge the effectiveness of their marketing campaigns.
Non-profit organizations: Non-profit organizations can use text analytics to track public
opinion on their issues, identify potential donors, and measure the impact of their programs.
Text analytics is a powerful tool that can be used to gain insights into current events, trends,
and public opinion. By using text analytics, organizations can make better decisions, improve
their marketing campaigns, and track the effectiveness of their public relations efforts.
Gain insights into current events: Text analytics can help you to track the latest news and
trends, so that you can stay ahead of the curve.
Identify emerging threats: Text analytics can help you to identify potential threats to your
business or organization, so that you can take steps to mitigate them.
Track your competitors: Text analytics can help you to track your competitors' activities, so
that you can stay ahead of the competition.
Identify new market opportunities: Text analytics can help you to identify new market
opportunities, so that you can expand your business.
Gauge the effectiveness of your marketing campaigns: Text analytics can help you to
gauge the effectiveness of your marketing campaigns, so that you can improve your results.
Track the effectiveness of your public relations efforts: Text analytics can help you to
track the effectiveness of your public relations efforts, so that you can improve your
reputation.
Data collection: The first step in the project is to crawl the data from a variety of
sources, such as news websites, which can be done with SAS Enterprise Guide.
Data cleaning: The next step is to clean the data by removing errors and inconsistencies. This
can be a time-consuming process, but it is important to ensure that the data is accurate and
reliable.
Data analysis: The third step is to analyze the data using a variety of statistical and machine
learning techniques. This can be used to identify trends, patterns, and relationships in the data.
Data visualization: The fourth step is to visualize the data using charts, graphs, and other
visuals, which can be done with SAS Visual Analytics. This makes the data more
understandable and easier to communicate.
Reporting: The final step is to create reports that summarize the findings of the analysis.
These reports can be sent to different govt. authorities.
The resulting reports give govt. departments a reality check, show what needs to be
improved, and help them act accordingly.
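The steps above can be pictured as a chain of small functions. The following is a minimal Python sketch for illustration only, not the SAS implementation used in the project; the sample records, field names, and function names are all hypothetical.

```python
from collections import Counter

# Hypothetical sample records standing in for crawled news articles.
ARTICLES = [
    {"district": "Jaipur", "headline": "Road accident on highway", "sentiment": "negative"},
    {"district": "Jaipur", "headline": "New school opens", "sentiment": "positive"},
    {"district": "Ajmer", "headline": "", "sentiment": "neutral"},  # incomplete record
    {"district": "Udaipur", "headline": "Theft reported in market", "sentiment": "negative"},
]


def clean(records):
    """Data cleaning: keep only records whose fields are all non-empty."""
    return [r for r in records if all(str(v).strip() for v in r.values())]


def analyze(records):
    """Data analysis: count articles per (district, sentiment) pair."""
    return Counter((r["district"], r["sentiment"]) for r in records)


def report(counts):
    """Reporting: turn the counts into sorted, human-readable lines."""
    return [
        f"{district}: {n} {sentiment} article(s)"
        for (district, sentiment), n in sorted(counts.items())
    ]
```

Calling `report(analyze(clean(ARTICLES)))` chains the stages in the same order as the steps listed above. Collection and visualization are left out because they depend on the crawling and dashboard tools.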
News Refiner: Their job was to identify which news sources have the most accurate news and tell
the tech team to crawl those sources.
Data Engineers: Their job was to crawl news from the given sources, pre-process
that data, and export and store it.
Data Analyst: They extract insights from the data and categorize it by sentiment using
the SCC tool and the SAS sentiment analyzer tool.
Data Visualizer: They create interactive dashboards and reports and forward their findings to
different govt. departments.
The project team works collaboratively and efficiently to deliver the best possible results for
the project.
1.3.4 Hardware/Software environment in the company.
SAS Enterprise Guide: SAS Enterprise Guide is a powerful, Windows-based application that
provides a wide range of features for advanced SAS users and also supports SQL through PROC SQL.
SAS Sentiment Analysis Studio: SAS Sentiment Analysis is software that automatically
rates and classifies opinions expressed in electronic text so that they can be understood quickly.
SAS Content Categorization Studio: It is used to categorize the contents of our dataset.
SAS Visual Analytics: It is software used for creating interactive dashboards and reports.
Chapter 2 : System Analysis
There are a number of factors that can affect the operational feasibility of text analytics. One
important factor is the availability of resources. Text analytics can be a complex and data-
intensive process, so it is important to have the necessary resources in place, such as data
storage, computing power, and staff expertise.
Another important factor is the ability to integrate text analytics with existing systems. Text
analytics tools can be used to collect and analyze data from a variety of sources, such as social
media, news websites, and financial data feeds.
Finally, it is important to be able to train staff on how to use text analytics tools. Text
analytics can be a complex and technical process, so it is important to make sure that staff
have the necessary training to use the tools effectively.
There are a number of factors that can affect the technical feasibility of text analytics. One
important factor is the availability of data sources. News data can be collected from a variety
of sources, such as news websites, social media, and financial data feeds. It is important to
have access to a variety of data sources in order to get a comprehensive view of the news
landscape.
Another important factor is the ability to process large amounts of data. News data can be
very large and complex. It is important to have the ability to process this data quickly and
efficiently in order to generate insights in a timely manner.
Finally, it is important to be able to develop and deploy analytical models. Text analytics can
be used to develop a variety of analytical models, such as sentiment analysis models, topic
modeling models, and predictive models. It is important to be able to develop and deploy
these models in a way that is efficient and effective.
Here are some specific examples of how text analytics can be used to improve technical
efficiency:
Sentiment analysis: Sentiment analysis can be used to track public opinion about a company
or product. This information can then be used to improve marketing campaigns and product
development.
Topic modeling: Topic modeling can be used to identify trends and patterns in news data.
This information can then be used to develop new products and services, or to improve
existing products and services.
The cost of a text analytics project can vary depending on the size and scope of the project.
Some of the costs associated with a text analytics project include:
The potential benefits of a text analytics project can also vary depending on the specific goals
of the project. Some of the potential benefits of a text analytics project include:
- Improved decision-making
- Increased customer engagement
- Enhanced brand reputation
- Reduced risk
- Increased revenue
Lack of data strategy and governance: This can lead to data silos, inconsistencies, and
inaccuracies that affect the quality and reliability of text analytics.
Challenges with data availability: This can occur when there are delays or difficulties in
accessing and integrating data from various sources, especially legacy systems.
Poor data quality: This can result from errors, noise, outliers, missing values, or duplication
in the data that can affect the accuracy and validity of text analytics.
Inappropriate or inadequate analytical methods: This can happen when the chosen
methods are not suitable for the type, size, or complexity of the data or the problem at hand.
Scalability issues: This can arise when the hardware or software used for text analytics
cannot handle the increasing volume, variety, or velocity of the data or the demand for the
results.
Developing a data strategy and governance framework: This can help to define the
objectives, roles, responsibilities, standards, and processes for managing and using data for
text analytics.
Improving data availability and integration: This can involve using cloud-based platforms,
APIs, or ETL tools to access and connect data from various sources in a timely and efficient
manner.
Enhancing data quality: This can involve using data cleansing, validation, transformation, or
imputation techniques to detect and correct errors, noise, outliers, missing values, or
duplication in the data.
Choosing appropriate and adequate analytical methods: This can involve using domain
knowledge, literature review, experimentation, or validation techniques to select and apply the
most suitable methods for the data and the problem at hand.
Ensuring scalability: This can involve using distributed computing, parallel processing, or
cloud computing techniques to increase the capacity and performance of the hardware or
software used for text analytics.
There are a number of fact-finding techniques that can be used in text analytics. Some of the
most common methods include:
Data collection: This involves collecting data from a variety of sources, such as news
websites, social media, and financial data feeds.
Data cleaning: This involves cleaning and preparing the data for analysis. This may involve
removing duplicates, correcting errors, and filling in missing values.
Data analysis: This involves using statistical and machine learning techniques to analyze the
data. This may involve identifying trends, patterns, and relationships in the data.
Data visualization: This involves presenting the data in a visually appealing and informative
way. This may involve creating charts, graphs, and maps.
The best fact-finding technique for a particular text analytics project will depend on the
specific goals of the project. However, by considering the options above, we can make
informed decisions about how to collect, clean, analyze, and visualize the data.
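The data cleaning technique listed above (removing duplicates, correcting errors, and filling in missing values) can be sketched as follows. This is a minimal Python illustration under stated assumptions: the record layout, the `clean_records` function, and the "unknown" placeholder are hypothetical and do not come from the project's SAS code.

```python
def clean_records(records, default_source="unknown"):
    """Remove duplicates, fix whitespace errors, and fill missing sources."""
    seen = set()
    cleaned = []
    for rec in records:
        # Correct simple errors: collapse repeated whitespace in the headline.
        headline = " ".join((rec.get("headline") or "").split())
        if not headline:
            continue  # a record without a headline cannot be analyzed
        # Remove duplicates: compare headlines case-insensitively.
        key = headline.lower()
        if key in seen:
            continue
        seen.add(key)
        # Fill in missing values: use a placeholder when the source is absent.
        source = rec.get("source") or default_source
        cleaned.append({"headline": headline, "source": source})
    return cleaned
```

The first occurrence of a duplicated headline is kept, so the order of the input decides which copy survives.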
2.2.2 Interview
Q1. What skills and experience are required in news analysis and data analysis?
Ans. Knowledge of an analytics tool such as SAS Enterprise Guide, and of SQL, is required for data
crawling.
Q2. What are your skills and experience in using data visualization tools?
Ans. A business intelligence tool such as Tableau or SAS Visual Analytics is required.
Q3. How would you use data to improve the quality of news reporting?
Ans. Using the SAS Sentiment Analysis tool, we can categorize the news, thereby improving the
quality of news reporting.
2.3 CONTEXT DIAGRAM
Fig.2
2.3 DATA FLOW DIAGRAM
Fig.3
2.3.2 Second Level Data Flow Diagram
Fig.4
Chapter 3 : System Design
The system design of a text analytics project typically includes the following steps:
Clean and prepare the data: Once we have identified the data sources, we will need to
clean and prepare the data for analysis. This may involve removing duplicate data, correcting
errors, and transforming the data into a format that is compatible with the analysis tools that
we will be using.
Choose the analysis tools: There are a number of different analysis tools that can be used for
text analytics. The tool we will use is SAS Enterprise Guide.
Analyze the data: Once we have chosen the analysis tools, we can begin to analyze the
data. This may involve running statistical tests, creating visualizations, or identifying patterns
in the data.
Communicate the results: Once the data is analyzed, we will need to communicate
the results to the stakeholders. This may involve writing a report.
3.1 System Flow
Fig.5
Chapter 4 : Results And Discussion
4.1 Results
1. Crawling Process
2. Content Categorization Step
3. Sentiment Analysis Step
4. Data Visualization Step
1.Crawling Process : Here, we will crawl the data from all different news sources.
Fig.6
This step was performed in SAS Enterprise Guide to crawl the news data from the source
“The Times of India”. The same code, with the URL changed to that of another news outlet,
can be used to crawl other sources. The following code is written in the SAS language:
proc options;
run;
data stdttm;
id=1;
start_date="&dt";
start_time="&tm";
run;
/*options mlogic mprint symbolgen;*/
%macro weblev1(region);
data _null_;
call symput("st", "'" || 'NavBar-Search-Click' || "'");
call symput("en",
"'" || 'EntertainmentSection_Actions#ArticleClick-1https' || "'");
call symput("hrf1", "'" || 'href="' || "'");
call symput("hrf1", '"' || "href='" || '"');
run;
data work.txt;
length body $32767.;
infile raw_news lrecl=400000 dlm=">";
input body $ @@;
run;
data txt1(drop=kp);
retain kp;
set txt;
end;
data txt2(drop=body);
length link $500.;
set txt1;
body=tranwrd(tranwrd(body, '"', "|"), "'", "|");
/*link = scan(body,3,'|');*/
link=scan(substr(body, find(body, "<a href=") + 9, length(body)), 1, '|');
output;
end;
run;
proc sql;
delete from txt2 where length(link) < 30;
quit;
proc sql;
delete from txtallP where link contains 'photogallery';
quit;
/*%end;*/
%mend weblev1;
/*%weblev1(urll="http://timesofindia.indiatimes.com/articlelist/3012544.cms?curpg=2");*/
%weblev1(ajmer);
%weblev1(jodhpur);
%weblev1(udaipur);
%weblev1(jaipur);
dm log 'clear';
proc sql;
delete from txtallP where link contains 'cfmid';
quit;
proc sql;
delete from txtallP where link contains 'weather';
quit;
proc sql;
delete from txtallP where link contains 'videos';
quit;
data all_links_lev1;
set txtallP;
if link='' then
delete;
run;
%macro weblev2;
data _null_;
set all_links_lev1;
if _n_=&i. then
do;
call symput("uuu", link);
end;
run;
run;
data txt4(drop=kp);
retain kp;
set txt3;
%end;
%mend weblev2;
%weblev2;
headline1=tranwrd(headline1, 'content=|', " ");
headline=scan(headline1, 1, '|');
output;
end;
run;
/*date=input(date1,date9.);*/
/*format date date11.;*/
date=input(date1, anydtdte10.);
format date date9.;
output;
end;
run;
data final1(rename=(link=hyperlink));
merge description head date all_links_lev1;
run;
proc sql;
delete from final1 where date is null or headline=" " or hyperlink=" " or
news=" ";
quit;
by descending date;
run;*/
proc sql;
create table QUERY_FOR_TOI_CLEAN_FINAL as select * from final1 where
date=today() - 1;
data gg.TOI_TEST1;
retain hyperlink date headline news News_Source Row_id;
set WORK.QUERY_FOR_TOI_CLEAN_FINAL;
Row_id=_n_;
News_Source='Times of India';
headline=tranwrd(headline, "'", "");
run;
data endttm;
id=1;
end_date="&dt1";
end_time="&tm1";
News_Source="Times of India";
News_Count=&newscnt.;
Max_Date=&newsdt.;
format Max_Date DATE9.;
run;
WORK.ENDTTM t2 ON(t1.id=t2.id);
QUIT;
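The link extraction and filtering performed by the SAS listing above (pull the `href` value out of each anchor, drop links shorter than 30 characters, and drop links containing 'photogallery', 'cfmid', 'weather', or 'videos') can be sketched in Python for readers who do not use SAS. This is an illustration of the same filtering rules under stated assumptions, not a translation of the full crawler: the `LinkCollector` class and `extract_article_links` function are hypothetical names, and the standard-library HTML parser replaces the string scanning used in the SAS code.

```python
from html.parser import HTMLParser

# Thresholds and substrings taken from the PROC SQL delete statements above.
MIN_LINK_LENGTH = 30
EXCLUDED_SUBSTRINGS = ("photogallery", "cfmid", "weather", "videos")


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_article_links(html):
    """Return the links that pass the length and substring filters."""
    parser = LinkCollector()
    parser.feed(html)
    return [
        link
        for link in parser.links
        if len(link) >= MIN_LINK_LENGTH
        and not any(word in link for word in EXCLUDED_SUBSTRINGS)
    ]
```

The function takes the HTML of a listing page as a string, so fetching the page is left to whatever crawling tool is in use.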
2. Content Categorization Step: In this step, the data is categorized into different topics,
for example Accidents and casualties, Forest, etc.
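A simple way to picture keyword-based categorization is shown below. This is a minimal Python sketch, not the rule set of the SAS Content Categorization tool: the two category names come from the text above, while the keyword lists and the `categorize` function are hypothetical.

```python
# Category names from the report; keyword lists are illustrative assumptions.
CATEGORY_KEYWORDS = {
    "Accidents and casualties": ("accident", "collision", "injured"),
    "Forest": ("forest", "wildlife", "poaching"),
}


def categorize(text):
    """Return every category that has at least one keyword in the text."""
    lowered = text.lower()
    return [
        category
        for category, keywords in CATEGORY_KEYWORDS.items()
        if any(keyword in lowered for keyword in keywords)
    ]
```

An article can fall into more than one category, and an article that matches no keyword gets an empty list. Matching is by substring, which is a simplification.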
Fig.7
3. Sentiment Analysis Step: Here, all the keywords that were categorized are placed in the
Body column and their occurrence counts in the Weight column. Based on these occurrences, we
classify each news item as positive, negative, or neutral.
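The idea of weighting keywords by their occurrences and deriving a label from the totals can be sketched as follows. This is a minimal Python illustration and not the scoring model of SAS Sentiment Analysis Studio: the keyword sets, the `keyword_weights` and `classify` functions, and the tie rule (equal totals count as neutral) are all assumptions.

```python
import re
from collections import Counter

# Hypothetical keyword lists; a real lexicon would be far larger.
POSITIVE = {"improved", "success", "relief", "growth"}
NEGATIVE = {"crime", "accident", "death", "theft"}


def keyword_weights(text):
    """Count how often each sentiment keyword occurs in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w in POSITIVE or w in NEGATIVE)


def classify(text):
    """Label the text by comparing positive and negative keyword totals."""
    weights = keyword_weights(text)
    pos = sum(n for w, n in weights.items() if w in POSITIVE)
    neg = sum(n for w, n in weights.items() if w in NEGATIVE)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"
```

Here `keyword_weights` plays the role of the keyword and weight columns, and `classify` turns those weights into one of the three labels.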
Fig.8
4.Data Visualization Step:
Fig.9
Chapter 5 : User Manual
Table of Contents:
1. Introduction
2. System Requirements
3. Data Collection
4. Data Preprocessing
5. Analysis and Visualization
6. Reporting and Insights
7. Troubleshooting
8. Conclusion
1. Introduction:
The Text analytics Project is a system designed to analyze and extract insights from news
articles. It utilizes various techniques to collect, preprocess, analyze, and visualize news data
for valuable insights and decision-making.
2. System Requirements:
To use the Text analytics Project, ensure that your system meets the following requirements:
- Operating System: Windows, macOS, or Linux
- Software: SAS Enterprise Guide, SAS Sentiment Analysis, SAS Visual Analytics, SAS
Content Categorization Tool
3. Data Collection:
Data Sources: Identify news sources from which you want to collect data. Examples include
news websites, RSS feeds etc.
Data Collection Script: Develop or obtain a script that can retrieve news articles from the
selected sources. The script should fetch relevant information such as the article title,
publication date, content, and source.
4. Data Preprocessing:
Text Cleaning: Preprocess the collected news data by removing unnecessary characters,
HTML tags, punctuation, and special symbols. Normalize the text by converting it to
lowercase.
a. Stop Word Removal: Remove common words such as "a," "the," "is," etc., as they do
not contribute significant meaning to the analysis.
b. Tokenization: Split the text into individual words or tokens to facilitate further analysis.
c. Removing Unwanted News Topics: Removing unwanted topics like entertainment news.
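The text cleaning, tokenization, and common-word removal steps described above can be sketched as a single function. This is a minimal Python illustration under stated assumptions: the `preprocess` function and the short `STOP_WORDS` set are hypothetical, and stripping tags with a regular expression is a simplification that suits only simple markup.

```python
import re

# A short illustrative stop-word list; real lists are much longer.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "in", "to", "and"}


def preprocess(raw_html):
    """Clean raw article markup and return the remaining tokens."""
    text = re.sub(r"<[^>]+>", " ", raw_html)   # remove HTML tags
    text = text.lower()                        # normalize to lowercase
    text = re.sub(r"[^a-z0-9\s]", " ", text)   # drop punctuation and symbols
    tokens = text.split()                      # tokenize on whitespace
    return [t for t in tokens if t not in STOP_WORDS]
```

The returned token list is the input that later steps, such as categorization or sentiment scoring, would work on.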
5. Analysis and Visualization:
a. Sentiment Analysis: Classify the sentiment of each article as positive, negative, or neutral
to gauge public opinion.
b. Content Categorization: Categorize content according to severity.
c. Visualization: Create visualizations such as word clouds, bar charts, and line graphs to
present the sentiment, topics, and named entities in an easily interpretable manner.
6. Reporting and Insights:
a. Generate Reports: Develop reports or dashboards that summarize the analysis results.
Include key metrics, trends, and visualizations to provide actionable insights to stakeholders.
b. Decision Making: Utilize the insights gained from the analysis to inform decision-making
processes, such as adjusting marketing strategies, evaluating market sentiment, or
understanding the impact of news events on your business.
7. Troubleshooting:
If you encounter any issues during installation, data collection, preprocessing, or analysis,
refer to the documentation of the libraries or consult relevant online resources.
8. Conclusion:
The text analytics project is a powerful tool that can be used to track news stories and identify
trends. The project provides a variety of features that can be used to analyze news stories and
generate reports. If you are looking for a way to improve your news coverage, marketing, or other
business activities, the text analytics project is a great option.
Chapter 6 : Testing
Unit testing of the project will be done with the help of SASUnit.
Unit testing is widely used in software engineering to assure software quality in
complex systems. When SAS programs, especially SAS macros, are standardized and reused,
unit testing becomes an imperative requirement, but the SAS system lacks this capability.
For this reason, HMS Analytical Software developed SASUnit for use in its own
projects and later released it under the GPL license to encourage other
SAS users to adopt and improve it. SASUnit can be used to test SAS macros and SAS
programs.
For simplicity, we consider only the application of SAS programs for clinical studies here.
Similar considerations apply to business intelligence applications. Two cases have to be
distinguished:
- One-off SAS programs for data management, statistical evaluation and reporting.
Those programs are developed (often from templates) and run once for a certain task
with a defined set of data. Quality assurance has to be done by log and code reviews,
comparison of results to specification, tracking of sample data records and so on.
There is no need for unit testing here, because the programs are for one time usage
only.
Standardized SAS macros can be controlled by parameter values and can deliver a variety of
result types, including macro variable values, SAS datasets, ODS result files or external data
files. Therefore, unit tests for SAS macros should make it possible to run SAS macros with
different sets of parameter values and to automatically check for correctness of the different
result types.
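The principle described above, running the same routine with different sets of parameter values and checking each result automatically, can be illustrated outside SAS as well. The sketch below uses Python's standard `unittest` module purely as an analogy. It does not show SASUnit's own macros, and the `filter_links` routine under test is hypothetical.

```python
import unittest


def filter_links(links, min_length):
    """Hypothetical routine under test: keep links of at least min_length characters."""
    return [link for link in links if len(link) >= min_length]


class FilterLinksTest(unittest.TestCase):
    # Each case pairs a set of parameter values with the expected result.
    CASES = [
        ((["/a", "/news/story-1"], 5), ["/news/story-1"]),
        ((["/a", "/news/story-1"], 1), ["/a", "/news/story-1"]),
        (([], 5), []),
    ]

    def test_parameter_sets(self):
        for (links, min_length), expected in self.CASES:
            # subTest reports each parameter set separately if it fails.
            with self.subTest(links=links, min_length=min_length):
                self.assertEqual(filter_links(links, min_length), expected)
```

Saved in a file, the test case can be run with `python -m unittest`, which reports any parameter set whose result differs from the expected value.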
General Structure Of Unit Tests
Fig.10
USAGE OF SAS UNIT
Fig.11
Test Report For SASUnit
Fig.12
Chapter 7 Future Enhancement
No software engineering project is ever perfect, and there is always room for future
enhancements. Some future enhancements that can be made are:
Automation Feature: With the help of Microsoft Power Automate, the generated reports can be
sent automatically to the respective govt. departments as soon as they are generated, saving
time.
Streamlined Data: At present, we have to crawl the data and create reports from it manually.
This process can be streamlined so that data is crawled and updated automatically, which
would save a lot of time.
Use more sophisticated models: There are many different types of text analytics models,
and some are more sophisticated than others. A more sophisticated model can yield more
accurate results.
Appendices
Text mining: Text mining was used to identify patterns and trends in the data.
SAS Sentiment Analysis Tool: This tool was used to perform the sentiment analysis for our
project.
Proc SQL: Proc SQL was used for Structured Queries
SAS Content Categorization: This tool was used to categorize our data.
Appendix D: Discussion
The results of this study suggest that text analytics can be a valuable tool for understanding
and predicting the happenings in the city. However, it is important to be aware of the
limitations of these methods and to use them in conjunction with other research methods.