Assignment 1 NLP

1. The document introduces the Superior Arabic Text Categorization Deep Model (SATCDM), which uses a convolutional neural network and word embeddings to classify Arabic text documents into predetermined categories.
2. SATCDM aims to improve the accuracy of classifying Arabic news articles, as traditional machine learning approaches have difficulties with the complex structure and morphology of the Arabic language.
3. The model is evaluated on 15 freely available datasets of Arabic news text in Modern Standard Arabic, achieving classification accuracies ranging from 97.58% to 99.90%, outperforming other models.

Uploaded by

Roshini Rames

Fakulti Teknologi Maklumat dan Komunikasi

Universiti Teknikal Malaysia Melaka

BITI 3413 Natural Language Processing
Assignment 01 (Individual)

Lecturer Name: Ts. Dr. HALIZAH BINTI BASIRON

Group Members:
No.  Name                                   Matric No.
1.   ROSHINI A/P RAMES                      B032110385
2.   HERISH CHAUDRAY A/L RAVANTHERAN        B032110409

INTRODUCTION

The research article “Superior Arabic Text Categorization Deep Model (SATCDM)” by M.
Alhawarat (Member, IEEE) and Ahmad O. Aseeri presents a deep learning methodology for
classifying Arabic text documents using Natural Language Processing (NLP). In NLP, document
classification is a crucial task that involves categorizing documents into predetermined
classes based on their content. The classification of Arabic text documents is regarded as
a significant area of study because the number of Arabic documents is growing drastically
on a daily basis due to the increase in new online pages and social media posts. Thus, it
is crucial for users and future researchers to be able to categorise such documents into
distinct types. Both deep learning techniques and traditional machine learning approaches
are used to extract meaningful patterns from complex data so that Arabic text documents
can be classified accurately.
Classifying Arabic documents poses a unique challenge due to the inherent complexities of
the language. Firstly, Arabic syntax is complex, with word orders and grammatical
structures that differ from those of English, which makes it difficult for traditional
Machine Learning (ML) algorithms to learn the patterns that distinguish different classes
of Arabic text. Secondly, the rich dialects and morphology of the Arabic language add to
this difficulty.
Although classic machine learning (ML) techniques have been applied effectively to Arabic
text categorization, deep learning techniques show great potential for obtaining even
greater levels of accuracy. The study introduces the Superior Arabic Text Categorization
Deep Model (SATCDM), a novel deep learning methodology leveraging a Convolutional Neural
Network (CNN) and word embedding. While deep learning has excelled in Computer Vision and
Speech Recognition, its application to Arabic Natural Language Processing (ANLP) is an
ongoing area of improvement. The SATCDM model, which employs an efficient multi-kernel CNN
architecture and skip-gram word embedding with sub-word information, aims to enhance the
accuracy of classifying Arabic news text documents. The research uses 15 free datasets in
Modern Standard Arabic (MSA) format for evaluation and compares the results with
traditional ML techniques as baseline models. The outcomes are anticipated to contribute
significantly to ANLP by improving the precision of classifying Arabic text documents,
thereby enhancing search engine accuracy and other applications. The authors present the
study as the first to utilize word embedding and CNN for classifying Arabic news text in
MSA format across various freely available datasets. The remainder of the article covers a
literature review, an introduction to CNNs, a description of the data, the details of the
SATCDM model, the methodology, the experimental setup, the results, and a conclusion.
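To make the idea of a multi-kernel CNN over word embeddings more concrete, the following is a minimal sketch of a single forward pass, written in plain Python. It is not the SATCDM implementation: the kernel sizes (2, 3, 4), the number of filters, the embedding dimension, the number of classes, and the random weights are all assumptions chosen only to keep the example small, and no training is performed. It only illustrates the general pattern of parallel convolutions with different kernel sizes, max-over-time pooling, concatenation, and a softmax classifier.

```python
import math
import random

def conv1d_max_pool(embedded, filters, bias):
    """Slide each filter over the token sequence, apply ReLU, then max-pool over time.

    embedded: list of token vectors, shape [seq_len][emb_dim]
    filters:  list of filters, each shaped [kernel_size][emb_dim]
    bias:     one bias value per filter
    Returns one pooled activation per filter.
    """
    pooled = []
    for filt, b in zip(filters, bias):
        k = len(filt)
        best = 0.0  # ReLU output is never below zero
        for start in range(len(embedded) - k + 1):
            total = b
            for offset in range(k):
                total += sum(w * x for w, x in zip(filt[offset], embedded[start + offset]))
            best = max(best, max(0.0, total))
        pooled.append(best)
    return pooled

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def multi_kernel_cnn_forward(embedded, branches, dense_w, dense_b):
    """Concatenate pooled features from every kernel-size branch, then classify."""
    features = []
    for filters, bias in branches:
        features.extend(conv1d_max_pool(embedded, filters, bias))
    logits = [sum(w * f for w, f in zip(row, features)) + b
              for row, b in zip(dense_w, dense_b)]
    return softmax(logits)

def random_matrix(rng, rows, cols):
    """Hypothetical helper: a rows x cols matrix of small random weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

# Hypothetical sizes, not taken from the article.
rng = random.Random(0)
EMB_DIM, SEQ_LEN, FILTERS_PER_KERNEL, NUM_CLASSES = 8, 12, 4, 3
KERNEL_SIZES = (2, 3, 4)

# Stand-in for a document whose words were already mapped to embeddings.
document = random_matrix(rng, SEQ_LEN, EMB_DIM)

# One branch (filters plus biases) per kernel size.
branches = [
    ([random_matrix(rng, k, EMB_DIM) for _ in range(FILTERS_PER_KERNEL)],
     [0.0] * FILTERS_PER_KERNEL)
    for k in KERNEL_SIZES
]
dense_w = random_matrix(rng, NUM_CLASSES, FILTERS_PER_KERNEL * len(KERNEL_SIZES))
dense_b = [0.0] * NUM_CLASSES

probabilities = multi_kernel_cnn_forward(document, branches, dense_w, dense_b)
print(probabilities)  # one probability per class
```

Each kernel size looks at word windows of a different width, so the concatenated feature vector mixes evidence from short and longer phrases before the final classification layer.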
In this study, the proposed model combines a Convolutional Neural Network (CNN) with n-gram
word embedding to enhance classification accuracy. Although deep learning has made
remarkable advances in other fields, its application to the Arabic language, especially in
natural language processing, is still improving gradually. The SATCDM model achieves
accuracies ranging from 97.58% to 99.90%, which the authors report as surpassing similar
studies in Arabic document classification. The research employs 15 freely available
datasets of Arabic news text documents in Modern Standard Arabic format, and the comparison
includes baseline models built with traditional Machine Learning (ML) techniques. The
outcomes are expected to contribute significantly to the accurate classification of Arabic
text documents, benefiting search engine retrieval and various applications in the field of
Arabic Natural Language Processing (ANLP) and Machine Learning (ML).
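The role of n-gram (sub-word) information in the embedding can be illustrated with a short sketch. It assumes the fastText-style scheme, in which a word is wrapped in the boundary markers "<" and ">" and split into character n-grams of length 3 to 6; whether SATCDM uses exactly these lengths is not stated here, so the range is an assumption, and the two Arabic example words are chosen only for illustration.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return the character n-grams of a word wrapped in boundary markers."""
    wrapped = f"<{word}>"
    grams = []
    for n in range(min_n, max_n + 1):
        for start in range(len(wrapped) - n + 1):
            grams.append(wrapped[start:start + n])
    return grams

# Classic English illustration of the scheme.
print(char_ngrams("where", 3, 3))  # ['<wh', 'whe', 'her', 'ere', 're>']

# Two morphologically related Arabic words (illustrative choice).
book = "كتاب"      # "book"
writing = "كتابة"  # "writing"

# N-grams that both words have in common.
shared = set(char_ngrams(book)) & set(char_ngrams(writing))
print(len(shared), "shared n-grams")
```

Because related word forms share many of their n-grams, an embedding built from n-gram vectors can give similar representations to words with a common stem, which is useful for a morphologically rich language such as Arabic.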
