Sentimental Analysis

Uploaded by

shilpa
Copyright
© All Rights Reserved

Sentiment Analysis of Web-Scraped Data

Submitted in partial fulfillment for the award of the degree of
BACHELOR OF ENGINEERING
IN
Artificial Intelligence and Machine Learning
Submitted by:
Jigyasha Sharma – 21BCS5904
Yogesh Janghel – 21BCS5454

Under the Supervision of:
Ms Shilpa Sharma

Department of AIT-CSE
DISCOVER . LEARN . EMPOWER


Outline:
• Introduction to Project
• Problem Formulation
• Objectives of the work
• Methodology used
• Results
• Conclusion

Introduction to Project:
This project aims to develop a system capable of detecting and
classifying sentiments expressed in online discussions using
advanced AI techniques. The system will be designed to identify
various forms of sentiment, including positive, negative, neutral, and
mixed emotions, from diverse sources such as social media
platforms, forums, and blogs.

Problem Formulation:
• With the proliferation of online content, sentiment analysis has
become crucial for understanding public opinion on products,
services, or topics.
• Web scraping enables the collection of vast amounts of
unstructured textual data from websites, forums, and social
media platforms. However, transforming this data into
actionable insights requires effective sentiment analysis
techniques.

Objectives of the Work:
• Primary Objective: To perform sentiment analysis on web-scraped
data to identify and classify the sentiments (positive, negative,
neutral) associated with specific topics, products, or services.

• Secondary Objectives:
  • To scrape and preprocess textual data from various web sources.
  • To develop a machine learning or natural language processing (NLP)
  model for sentiment classification.
  • To visualize and interpret sentiment trends over time or across
  different platforms.
Methodology used:
Web Scraping:
Web scraping is the process of collecting and parsing raw data from
websites.
Steps:
• Scrutinizing the Website: Send HTTP requests using the Requests
library to retrieve the HTML content of the target webpage, then parse
it with BeautifulSoup.
• Understanding URLs: Analyze query parameters (key-value pairs) to
navigate to specific web content.
• Dealing with Anti-scraping Measures: Use Selenium to drive a real
browser for web pages that block plain HTTP requests.
Libraries used: Requests, BeautifulSoup, Selenium WebDriver, Pandas
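The request and query-parameter steps above can be sketched as follows. This is a minimal illustration, not the project's actual scraper: the endpoint `https://example.com/search`, the parameter names `q` and `page`, and the helper names `build_url` and `fetch_html` are all assumptions made for the example.

```python
from urllib.parse import parse_qs, urlsplit

import requests

# Hypothetical search endpoint and parameters, used only for illustration.
BASE_URL = "https://example.com/search"
PARAMS = {"q": "climate change", "page": 2}


def build_url(base_url: str, params: dict) -> str:
    """Return the full URL that Requests would send, without sending it."""
    prepared = requests.Request("GET", base_url, params=params).prepare()
    return prepared.url


def fetch_html(base_url: str, params: dict, timeout: float = 10.0) -> str:
    """Download a page and return its HTML, raising on HTTP error codes."""
    response = requests.get(base_url, params=params, timeout=timeout)
    response.raise_for_status()
    return response.text


url = build_url(BASE_URL, PARAMS)
# Split the query string back into key-value pairs to inspect it.
query = parse_qs(urlsplit(url).query)
print(url)
print(query)
```

Only `build_url` runs here, so the sketch makes no network call. `fetch_html` would be called once per page, for example in a loop that increments `page`.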
Sentiment Analysis:
Sentiment analysis involves determining the emotional tone of
textual data. The text is prepared in the following steps:

• Tokenization: Split text into individual words or tokens.
• Normalization: Reduce tokens to their base forms (lemmas) using
WordNetLemmatizer from nltk.
• Stopword Removal: Filter out common words that carry little meaning
(e.g., "the", "and").
• Count Vectorization: Convert text into a matrix of token counts
(CountVectorizer from sklearn).
• TF-IDF Transformation: Reweight the counts so that tokens common to
most documents count for less (TfidfTransformer from sklearn).
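The vectorization steps above might look like the sketch below. The three example sentences are invented for illustration and are not from the project's data. For brevity the sketch relies on CountVectorizer's built-in tokenizer and English stopword list, and it leaves out the lemmatization step, since WordNetLemmatizer needs the WordNet corpus to be downloaded separately.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer

# Hypothetical example sentences, not taken from the project's dataset.
docs = [
    "The new phone is great and the battery is great",
    "The battery life is terrible",
    "The phone arrived on time",
]

# CountVectorizer lowercases the text, splits it into tokens, drops
# English stopwords, and builds a document-by-token count matrix.
vectorizer = CountVectorizer(stop_words="english")
counts = vectorizer.fit_transform(docs)

# TfidfTransformer reweights the raw counts by inverse document frequency
# and L2-normalizes each row (its default settings).
tfidf = TfidfTransformer().fit_transform(counts)

vocabulary = sorted(vectorizer.vocabulary_)
print(vocabulary)
print(counts.shape, tfidf.shape)
```

Both matrices are sparse, with one row per document and one column per vocabulary token. The TF-IDF matrix is what a classifier would typically be trained on.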
Results:
Types of Data Extracted:
• Headlines: Extracted from the news sections, including titles and
subheadings.
• Article Content: Retrieved text from various news articles, providing
rich data for analysis.
• Publication Dates: Collected for each news article to track the
timeline of topics.
• Categories: Scraped article categories to organize data into relevant
sections (e.g., politics, sports, entertainment).
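Extracting these four fields from one article page could look roughly like this. The HTML snippet, its tag and class names (`headline`, `published`, `category`, `content`), and the helper `parse_article` are hypothetical. A real news site will use different markup, so the lookups would need to be adapted after inspecting the page.

```python
from bs4 import BeautifulSoup

# Hypothetical article markup; real sites use their own tags and classes.
SAMPLE_HTML = """
<article>
  <h1 class="headline">City council approves new park</h1>
  <time class="published" datetime="2024-03-05">5 March 2024</time>
  <span class="category">Politics</span>
  <div class="content">
    <p>The council voted on Tuesday.</p>
    <p>Work starts in May.</p>
  </div>
</article>
"""


def parse_article(html: str) -> dict:
    """Pull headline, date, category and body text out of one article page."""
    soup = BeautifulSoup(html, "html.parser")
    body = soup.find("div", class_="content")
    return {
        "headline": soup.find("h1", class_="headline").get_text(strip=True),
        # Read the machine-readable date from the datetime attribute.
        "published": soup.find("time", class_="published")["datetime"],
        "category": soup.find("span", class_="category").get_text(strip=True),
        # Join the paragraphs into a single string of article text.
        "text": " ".join(p.get_text(strip=True) for p in body.find_all("p")),
    }


record = parse_article(SAMPLE_HTML)
print(record)
```

Each call returns one dictionary per article, which is a convenient shape for building a table of results later.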

Final Dataset:

• The final dataset consisted of over 500 news articles, including
headline titles, publication dates, categories, and full text.
• The dataset was organized using Pandas and exported as a CSV file for
further processing in sentiment analysis.
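A minimal sketch of organizing scraped records with Pandas and exporting them as CSV is given below. The three records and the column names are made up for illustration. The step that drops rows with missing text is one possible way to handle incomplete articles, not necessarily what the project did.

```python
import pandas as pd

# Hypothetical scraped records; the real dataset had over 500 articles.
records = [
    {"headline": "Team wins final", "published": "2024-03-07",
     "category": "Sports", "text": "The home side won 2-1."},
    {"headline": "City council approves new park", "published": "2024-03-05",
     "category": "Politics", "text": "The council voted on Tuesday."},
    {"headline": "Film festival opens", "published": "2024-03-06",
     "category": "Entertainment", "text": None},  # article body missing
]

df = pd.DataFrame(records)
# Parse dates so the articles can be sorted and grouped by time.
df["published"] = pd.to_datetime(df["published"])
# Drop articles with no text, since they cannot be analyzed for sentiment.
df = df.dropna(subset=["text"]).sort_values("published").reset_index(drop=True)

# With no path argument, to_csv returns the CSV as a string;
# passing a file path instead would write the file to disk.
csv_text = df.to_csv(index=False)
print(csv_text)
```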

Conclusion:
• This project will involve:
  • Web scraping of textual data from websites, blogs, or social media.
  • Preprocessing of the scraped data, including cleaning,
  tokenization, and handling missing data.
  • Sentiment analysis using either machine learning models (e.g.,
  SVM, Logistic Regression) or deep learning techniques (e.g., LSTM).
  • Evaluation of the sentiment analysis model's accuracy, precision,
  recall, and F1-score.
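The classification and evaluation steps listed above can be sketched as follows, assuming Logistic Regression on TF-IDF features. The twelve labeled sentences are invented for illustration, and the scores from such a tiny sample say nothing about the project's real performance. Macro averaging is one reasonable choice here, not a setting taken from the slides.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Hypothetical labeled examples; a real run would use the scraped dataset.
texts = [
    "great product, works well", "terrible quality, broke fast",
    "really happy with this", "very disappointed with this",
    "excellent service and fast delivery", "awful service and slow delivery",
    "love the design", "hate the design",
    "good value for money", "bad value for money",
    "fantastic experience overall", "horrible experience overall",
]
labels = ["positive", "negative"] * 6

# Hold out 4 examples for testing, keeping both classes balanced.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=4, stratify=labels, random_state=0
)

# Counts -> TF-IDF weights -> linear classifier, fitted as one pipeline.
model = make_pipeline(CountVectorizer(), TfidfTransformer(), LogisticRegression())
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
# Macro averaging weights each class equally; zero_division=0 avoids
# warnings when a class is never predicted.
precision, recall, f1, _ = precision_recall_fscore_support(
    y_test, y_pred, average="macro", zero_division=0
)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
```

Swapping `LogisticRegression()` for `sklearn.svm.LinearSVC()` would give the SVM variant with the same pipeline and metrics.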
