Sentiment Analysis in Arabic Language Using Machine Learning Iraqi Dialect Case Study

The document describes a study on sentiment analysis of Iraqi dialect movie reviews using machine learning techniques. It presents a dataset of 1189 movie reviews labeled as positive or negative sentiment. The reviews were preprocessed and vectorized using TF-IDF before being classified with models like Naive Bayes, SVM, KNN, and logistic regression. The results showed that the proposed hybrid classifiers achieved the best performance with over 80% accuracy in detecting sentiment polarity in the Iraqi dialect movie reviews.

8th Engineering and 2nd International Conference for College of Engineering

University of Baghdad 24–25 November / 2021

Sentiment Analysis in Arabic Language using Machine Learning: Iraqi Dialect Case Study

Dr. Wameedh N. Flayyih


Hussein A. Nasrullah
Mohammed A. Nasrullah

21/7/2021 1

INTRODUCTION

• Sentiment analysis is an application of natural language processing.
• The rapid growth in the use of social networks increases the importance of natural language processing.
• This work:
o applies machine learning approaches to sentiment analysis of the Iraqi dialect,
o extracts sentiments with positive or negative polarity,
o uses movie review comments as a case study.

Simple example of data classification

Dataset

 1189 movies reviews on Cinemana application


 The dataset was manually extracted and cleaned.
 Almost reviews in Arabic language - Iraqi dialects.
 Each review was labeled as either positive or negative
 Dataset was split into 70% for training and 30% for testing with 10-
fold cross validation.
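
The split described above can be sketched in plain Python. This is only an illustration under stated assumptions: the slides do not say how the 70/30 split and the 10-fold cross validation were combined, so the sketch assumes the folds are drawn from the training portion. The helper names `train_test_split` and `k_fold_indices`, the seed, and the dummy review ids are hypothetical.

```python
import random

def train_test_split(items, test_fraction=0.3, seed=0):
    """Shuffle a copy of items and split it into (train, test)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def k_fold_indices(n_items, k=10):
    """Yield (train_idx, val_idx) pairs for k roughly equal folds."""
    indices = list(range(n_items))
    # The first (n_items % k) folds get one extra item.
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size

# Placeholder data: 1189 dummy ids stand in for the real reviews.
reviews = list(range(1189))
train, test = train_test_split(reviews)
# Assumption: cross validation is run on the training portion only.
folds = list(k_fold_indices(len(train), k=10))
print(len(train), len(test), len(folds))  # 832 357 10
```

Each fold serves once as the validation set while the remaining nine folds are used for fitting, so every training review is validated exactly once.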

Preprocessing

Text Vectorization

• TF-IDF is a typical method for converting text to a numerical vector.
• TF-IDF stands for “Term Frequency - Inverse Document Frequency”.
• It depends on how often a word appears in a single review and across the whole dataset.
• Term Frequency depends on the number of times the word appears in the sentence (review).
• Document Frequency depends on the number of reviews in the whole dataset that contain the word.
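
A minimal sketch of the weighting described above, using the textbook form tf × log(N / df). The slides do not state which TF-IDF variant was used, so this formula is an assumption (libraries such as scikit-learn apply smoothed variants). The function name `tf_idf` and the English toy reviews are hypothetical placeholders for the Iraqi dialect data.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(documents)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in documents for term in set(doc))
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        vectors.append({
            # Term frequency (share of the review) times inverse document frequency.
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors

# Toy reviews, already split into tokens.
docs = [
    "great movie great acting".split(),
    "boring movie".split(),
    "great story".split(),
]
vectors = tf_idf(docs)
for vec in vectors:
    print({term: round(weight, 3) for term, weight in vec.items()})
```

In the first toy review, "acting" receives a higher weight than "great" even though "great" occurs twice, because "acting" appears in only one review of the three.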

Classification

• Multinomial Naïve Bayes (MNB)
• Support Vector Machine (SVM)
• k-Nearest Neighbors (KNN)
• Logistic Regression (LR)
• Proposed hybrid classifiers
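
As an illustration of the first classifier in the list, the following is a small Multinomial Naïve Bayes sketch with add-one smoothing. It is not the authors' implementation: it works on raw token counts rather than TF-IDF features, the class and method names are hypothetical, and the English toy reviews are placeholders. The proposed hybrid classifiers are not sketched because the slides do not describe how they are built.

```python
import math
from collections import Counter

class MultinomialNB:
    """Multinomial Naive Bayes over token counts with add-one smoothing."""

    def fit(self, documents, labels):
        self.classes = sorted(set(labels))
        self.vocab = {term for doc in documents for term in doc}
        # Per-class term counts.
        self.term_counts = {c: Counter() for c in self.classes}
        for doc, label in zip(documents, labels):
            self.term_counts[label].update(doc)
        class_counts = Counter(labels)
        self.log_prior = {
            c: math.log(class_counts[c] / len(labels)) for c in self.classes
        }
        self.total_terms = {
            c: sum(self.term_counts[c].values()) for c in self.classes
        }
        return self

    def predict(self, doc):
        def score(c):
            denom = self.total_terms[c] + len(self.vocab)
            # Sum of log-likelihoods; terms unseen in training are skipped.
            return self.log_prior[c] + sum(
                math.log((self.term_counts[c][term] + 1) / denom)
                for term in doc if term in self.vocab
            )
        return max(self.classes, key=score)

# Toy training reviews with hypothetical labels.
train_docs = [
    "great movie loved it".split(),
    "wonderful acting great story".split(),
    "boring movie".split(),
    "terrible acting boring plot".split(),
]
train_labels = ["positive", "positive", "negative", "negative"]

model = MultinomialNB().fit(train_docs, train_labels)
print(model.predict("great story".split()))   # positive
print(model.predict("boring plot".split()))   # negative
```

Working in log space avoids numeric underflow when a review contains many words, and the add-one smoothing keeps a word seen in only one class from forcing the other class's probability to zero.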

Results

Results

Thank you for listening
