Twitter Bot Detection (K2203674)
DETECTION USING
MACHINE LEARNING
Unveiling the automation in the Twittersphere
Introduction
Social Media's Integral Role:
Social media platforms such as Twitter have become essential for communication,
information sharing, and public discussion in today's digital era. However,
automated accounts (bots) can distort these conversations, so detecting and
mitigating them is vital for upholding the integrity and trustworthiness of
online discourse.
Presentation Focus:
This thesis explores the fusion of artificial intelligence and social media, specifically
the application of machine learning models to effectively identify and combat Twitter
bots.
The presentation will delve into research methods, findings, and implications,
providing insights into the intriguing world of Twitter bot detection.
Dataset
The "Twitter Human-Bots Dataset" is a comprehensive Kaggle dataset
meticulously crafted for the specific purpose of Twitter bot detection.
This dataset encompasses a substantial volume of data, featuring
37,438 rows and 23 columns, each column offering valuable insights
into the characteristics of Twitter accounts.
The "bot" column is the target label, taking values 0/1: 0 signifies a
human-operated account and 1 signifies a bot account.
This dataset was chosen because the literature review made it evident
that it is one of the most extensive datasets available.
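As a rough sketch of how such a dataset might be inspected, the snippet below builds a tiny synthetic stand-in with the same target convention (the real Kaggle file name and column set are assumptions, so the actual `read_csv` call is left commented out):

```python
import pandas as pd

# Hypothetical loading step for the Kaggle "Twitter Human-Bots Dataset";
# the file name below is an assumption, not necessarily the official one.
# df = pd.read_csv("twitter_human_bots_dataset.csv")

# Tiny synthetic stand-in mirroring the target convention: bot = 0 (human) / 1 (bot)
df = pd.DataFrame({
    "followers_count": [10, 5000, 3, 120],
    "friends_count": [50, 40, 2000, 300],
    "verified": [False, True, False, False],
    "bot": [0, 0, 1, 1],
})

# Class balance of the target column
counts = df["bot"].value_counts()
print(counts.to_dict())  # e.g. {0: 2, 1: 2}
```

Checking the class balance first matters because an imbalanced target motivates the class-weight balancing step discussed later.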
Legal and ethical
considerations
Significant ethical and legal factors must be taken into account
while creating and implementing Twitter bot detection
programmes, namely:
• Maintaining user-privacy
• Unbiased models
• Transparency and access to information
• Getting informed consent
• Maintaining user trust and upholding user rights
System flowchart
(Figure: system flowchart showing the pipeline stages — plotting metrics, hyperparameter tuning, class-weight balancing after hyperparameter tuning, and ensemble learning)
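The pipeline stages above can be sketched end to end. This is a minimal illustration, not the thesis's actual implementation: synthetic data stands in for the account features, and the specific models and settings are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the account features (imbalanced, like many bot datasets)
X, y = make_classification(n_samples=500, n_features=10,
                           weights=[0.7, 0.3], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          stratify=y, random_state=0)

# Class-weight balancing step: weight classes inversely to their frequency
rf = RandomForestClassifier(class_weight="balanced", random_state=0)
lr = LogisticRegression(class_weight="balanced", max_iter=1000)

# Ensemble learning step: soft voting over the base models
ensemble = VotingClassifier([("rf", rf), ("lr", lr)], voting="soft")
ensemble.fit(X_tr, y_tr)
print(round(accuracy_score(y_te, ensemble.predict(X_te)), 3))
```

Soft voting averages the predicted probabilities of the base models, which typically smooths out individual models' errors.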
Evaluation Metrics
In the context of a classification problem, such as
Twitter bot detection, we rely on six key
performance metrics to assess the model's quality
and its ability to make accurate predictions.
• Accuracy
• Precision
• Recall
• F1 score
• AUC-ROC
• Confusion Matrix
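All six metrics listed above are available in scikit-learn. The toy labels below (1 = bot, 0 = human) are illustrative, not taken from the thesis:

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score, confusion_matrix)

y_true  = [0, 0, 0, 1, 1, 1, 1, 0]                   # ground truth (1 = bot)
y_pred  = [0, 0, 1, 1, 1, 0, 1, 0]                   # hard predictions
y_score = [0.1, 0.2, 0.6, 0.9, 0.8, 0.4, 0.7, 0.3]   # predicted bot probabilities

print("accuracy :", accuracy_score(y_true, y_pred))   # 0.75
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("f1       :", f1_score(y_true, y_pred))
print("auc      :", roc_auc_score(y_true, y_score))   # 0.9375
print("confusion:\n", confusion_matrix(y_true, y_pred))
```

Note that AUC-ROC is computed from the probability scores, while the other metrics use the thresholded hard predictions.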
Key Findings
Before Tuning:
•Models' performance (accuracy, precision, recall, F1-Score)
was initially acceptable (around 0.85).
•Random Forest and XGBoost outperformed others.
•Logistic Regression had the lowest performance, while
Gaussian Naive Bayes showed high accuracy but low recall.
After Tuning:
•Notable improvements observed post hyperparameter
tuning.
•Decision Tree, Random Forest, and KNN continued to
perform well (accuracy between 0.84 and 0.86).
•Logistic Regression demonstrated some improvement.
•Gaussian Naive Bayes exhibited decreased recall and F1-score.
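The hyperparameter tuning behind these improvements can be sketched with a grid search. The grid below is a hypothetical example; the thesis's actual search space is not shown here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the account features
X, y = make_classification(n_samples=400, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Hypothetical search space for Random Forest
grid = {"n_estimators": [50, 100], "max_depth": [None, 10]}
search = GridSearchCV(RandomForestClassifier(random_state=0),
                      grid, cv=3, scoring="f1")
search.fit(X_tr, y_tr)
print(search.best_params_, round(search.score(X_te, y_te), 3))
```

Scoring on F1 rather than accuracy is a common choice for imbalanced targets like bot labels, since it balances precision and recall.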
ROC curves
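An ROC curve plots the true-positive rate against the false-positive rate across thresholds. A minimal sketch with toy scores (not the thesis's results) follows; the plotting lines are commented out so the computation runs without a display:

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

y_true  = np.array([0, 0, 1, 1, 0, 1])
y_score = np.array([0.2, 0.4, 0.8, 0.45, 0.5, 0.9])  # predicted bot probabilities

fpr, tpr, thresholds = roc_curve(y_true, y_score)
print("AUC:", auc(fpr, tpr))

# To visualise (requires matplotlib):
# import matplotlib.pyplot as plt
# plt.plot(fpr, tpr); plt.xlabel("FPR"); plt.ylabel("TPR"); plt.show()
```

The area under this curve (AUC) summarises how well the scores rank bots above humans, independent of any single threshold.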
Conclusion
Effective Twitter bot detection depends on selecting
suitable models, optimizing parameters, and considering trade-offs.
As bots evolve, the future requires advanced
machine learning, real-time systems, ethical
awareness, and collaboration. This research lays the
foundation for future progress, emphasizing
adaptability and innovation as essential elements in
tackling these evolving challenges.
THANK YOU