Election Prediction Projectfinal

This document describes a project to develop an election prediction model using machine learning. It will analyze datasets containing voter demographics, past election results, candidate information, and other factors. Machine learning algorithms will be used to build predictive models trained on historical data. The models will forecast election winners and provide insights into election dynamics. The project aims to advance the field of political analysis and demonstrate how data science techniques can accurately predict electoral outcomes.

Uploaded by

Dev Venugopal
Copyright
© All Rights Reserved

ELECTION PREDICTION

PROJECT REPORT
Submitted in partial fulfillment of the requirement for the award of the degree of

MASTER OF COMPUTER SCIENCE (DATA ANALYTICS)


OF

MAHATMA GANDHI UNIVERSITY, KOTTAYAM

Submitted By

CHRISTEENA RACHEL MATHEW


Reg. No: 220011023953

DEPARTMENT OF COMPUTER SCIENCE


COLLEGE OF APPLIED SCIENCE, KONNI, KERALA - 689692

DEPARTMENT OF COMPUTER SCIENCE
COLLEGE OF APPLIED SCIENCE, KONNI
(Affiliated to MG University)

CERTIFICATE
Certified that this is a bona fide record of the project work
ELECTION PREDICTION
Done By
CHRISTEENA RACHEL MATHEW (Reg. No: 220011023953)
Submitted in partial fulfillment of the requirement for the award of the
degree of Master of Computer Science of Mahatma
Gandhi University, Kottayam.

Project Guide Head of Department

DEPARTMENT OF COMPUTER SCIENCE
COLLEGE OF APPLIED SCIENCE, KONNI

(Affiliated to MG University)

DECLARATION
I, CHRISTEENA RACHEL MATHEW, register number 220011023953, hereby declare that the
mini project entitled “ELECTION PREDICTION” is a bona fide work carried out during the
period of study 2022-2024 and that, to the best of my knowledge, no similar work has been
submitted to Mahatma Gandhi University or any other institution for the fulfillment of a
course of study.
This project is submitted in partial fulfillment of the requirement for the award of the
degree of Master of Computer Science of Mahatma Gandhi University, Kottayam.

ACKNOWLEDGEMENT
First and foremost, I convey my reverential salutation to Almighty God for enabling me to
take up and complete the project work successfully. I would also like to thank my family
members and friends for their emotional and financial support throughout.
I express my sincere thanks to Mrs. Bindhu S., Principal, College of Applied
Science, Konni (Managed by IHRD), for providing the necessary facilities for the completion
of my project.
I extend my sincere thanks to Asst. Prof. Mr. Arun S., Head of the Department of Master
of Computer Science, College of Applied Science, Konni (Managed by IHRD), for his kind
patronage in allowing me to carry out this project.
I would like to express my profound thanks to Mr. Arun S., my project guide, for his
scholarly advice and support, which have helped me greatly in the accomplishment of the
project.
I would like to express my sincere thanks to Mr. Arun S., Assistant Professor, and to Miss
Sanchula K. S., Mrs. Neethu K. V., Mrs. Renjitha C. P. and Mrs. Shynimol S. S.,
Lecturers, Department of MSc, for their valuable advice and inspiration throughout the
completion of this project.

I would like to thank all staff members of the Department of MSc for their valuable guidance
and suggestions rendered during this project.
Finally, I thank all my friends for their help, encouragement and moral support given to
me during the course of this work.

ABSTRACT
Elections are a fundamental aspect of any democratic society, and predicting election outcomes is of
great interest to political analysts, policymakers, and the general public. In this project, we propose
an innovative approach to predict election winners using Python, machine learning algorithms, and
data analysis techniques. The primary objective of this project is to develop a predictive model that
can accurately forecast the winners of political elections based on historical election data, candidate
profiles, demographic information, and other relevant factors. The model will serve as a valuable
tool for understanding the dynamics of elections and making informed predictions.

Comprehensive election data, including historical election results, candidate information, voter
demographics, campaign expenditure, and social media sentiment analysis, will be collected from
diverse sources. The collected data will undergo rigorous preprocessing, including data cleaning,
feature extraction, and normalization, to ensure its quality and suitability for machine learning.
Relevant features that influence election outcomes, such as candidate popularity, campaign
strategies, and regional factors, will be identified and selected.

Various machine learning techniques, including logistic regression, random forests, and neural
networks, will be used to build predictive models. These models will be trained on historical
election data and fine-tuned for accuracy. The performance of our models will be assessed using
appropriate metrics, such as accuracy, precision, recall, and F1-score. Cross-validation and
ensemble techniques will be employed to enhance model robustness.

Data visualization tools will be employed to create informative graphs and charts, aiding in the
interpretation of election dynamics and trends.

CONTENT
1 Introduction
2 System Analysis
2.1 System Environment
2.1.1 Software Environment
2.2 About the Dataset
2.3 Data Preprocessing
2.3.1 Import Data
2.3.2 Data Cleaning
2.4 Data Analysis
2.5 Data Representation
2.6 Implementation of Algorithm
3 Feasibility Analysis
4 Conclusion
5 Future Scope
Bibliography

List of Figures

2.1 Displaying dataset
2.2 Importing packages and dataset
2.3 Sorting out null values
2.4 Candidates status table
2.5 Data type of all columns
2.6 Data type of each column using info()
2.7 Total votes for the CPI(M) party with the candidate
2.8 Year-wise election result
2.9 Year-by-year votes for the party INC
2.10 Displaying the length of gender/age column
2.11 Relationship between politician category and election result
2.12 Age count distribution
2.13 Implementation of Algorithms
2.14 Summary

INTRODUCTION

Election prediction is a crucial aspect of political analysis, providing insights into the
possible outcomes of an electoral process based on historical data, current trends, and
various influencing factors. The ability to accurately forecast election results is
invaluable for political campaigns, policymakers, and the general public. In recent
years, data science and machine learning techniques have played a significant role in
enhancing the accuracy and efficiency of election predictions.

This project aims to leverage Python programming and data analysis tools to develop
an election prediction model. By harnessing the power of machine learning
algorithms, we can analyze diverse datasets containing information about voter
demographics, past election results, economic indicators, and other relevant factors.
Through this analysis, the project aims to create a predictive model that can estimate
the likelihood of candidates winning in upcoming elections. By undertaking this
project, we aim to contribute to the field of political analysis and election forecasting,
showcasing the power of Python and machine learning in providing valuable insights
into the dynamics of electoral processes. The project's findings and methodology can
be applied to elections at various scales, from local to national levels.

System Analysis
2.1 System Environment

System environment specifies the hardware and software configuration of the new
system. Regardless of how the requirement phase proceeds, it ultimately ends with
the software requirement specification. A good SRS contains all the system
requirements to a level of detail sufficient to enable designers to design a system that
satisfies those requirements. The system specified in the SRS will assist the potential
users to determine if the system meets their needs or how the system must be
modified to meet their needs.
2.1.1 Software Environment

• Front End : Python


• IDE : Chrome
1. Python

Python is an interpreted, high-level programming language for general-purpose
programming. Created by Guido van Rossum and first released in 1991, Python
has a design philosophy that emphasizes code readability, notably through significant
whitespace. It provides constructs that enable clear programming on both small and
large scales. Python features a dynamic type system and automatic memory
management. It supports multiple programming paradigms, including object-oriented,
imperative, functional and procedural, and has a large and comprehensive standard
library.

Python interpreters are available for many operating systems. It has a wide range of
applications, from web development (such as Django and Bottle) and scientific and
mathematical computing (Orange, SymPy, NumPy) to desktop graphical user
interfaces (Pygame, Panda3D). Apart from being an open-source
programming language, Python is an object-oriented, interpreted and interactive
programming language. Python combines remarkable power with very clear syntax.
It has modules, classes, exceptions, very high-level dynamic data types, and dynamic
typing. There are interfaces to many system calls and libraries, as well as to various
windowing systems. New built-in modules can be written in C or C++.

Python also has the following advantages:

• It is simple to learn. Compared to C, C++ and Java, the syntax is simpler, and Python
also comes with many code libraries for ease of use.
• Though it is slower than some other languages, its data handling capacity
is great.
• It is open source. Python, along with R, is gaining momentum and popularity in the
analytics domain since both of these languages are open source.
• It can interact with almost all third-party languages and platforms.

Python Libraries:

NumPy : NumPy is the fundamental package for scientific computing with
Python. It contains:
• A powerful N-dimensional array object
• Sophisticated (broadcasting) functions
• Tools for integrating C/C++ and Fortran code
• Useful linear algebra, Fourier transform, and random number capabilities

Besides its obvious scientific uses, NumPy can also be used as an efficient multi-
dimensional container of generic data. Arbitrary data types can be defined. This allows
NumPy to seamlessly and speedily integrate with a wide variety of databases.
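
The capabilities listed above can be illustrated with a short sketch. The vote counts, ages and the small linear system below are made-up values chosen only for illustration; they are not taken from the project dataset.

```python
import numpy as np

# A 2-D array (3 rows, 2 columns) of made-up vote counts.
votes = np.array([[120, 80],
                  [200, 150],
                  [90, 110]])

# Broadcasting: divide every row by its own total to get vote shares.
row_totals = votes.sum(axis=1, keepdims=True)   # shape (3, 1)
shares = votes / row_totals                      # shape (3, 2)

# Linear algebra: solve the 2x2 system A @ x = b.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])
x = np.linalg.solve(A, b)

# Random numbers from a seeded generator, so repeated runs give the same values.
rng = np.random.default_rng(seed=0)
sample_ages = rng.integers(low=18, high=90, size=5)  # five hypothetical ages

print(shares)
print(x)
print(sample_ages)
```

Each row of `shares` sums to 1 because the (3, 1) column of totals is stretched across both columns of `votes` during the division.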

Matplotlib :

Matplotlib is a visualization library in Python for 2D plots of arrays. It is a
multi-platform data visualization library built on NumPy arrays and
designed to work with the broader SciPy stack. It was introduced by John Hunter in the
year 2002. One of the greatest benefits of visualization is that it gives us visual access
to huge amounts of data in an easily digestible form. Matplotlib provides several plot
types, such as line, bar, scatter and histogram plots, and can create static, animated,
and interactive visualizations. Matplotlib can be used in
Python scripts, the Python and IPython shells, web application servers and various
graphical user interface toolkits such as Tkinter and wxPython.
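
As a minimal sketch of the plot types mentioned above, the following draws a bar chart and a line chart side by side. The party labels, years and vote counts are hypothetical, and the non-interactive `Agg` backend is an assumption made so that the script runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: render without opening a window
import matplotlib.pyplot as plt

parties = ["Party A", "Party B", "Party C"]   # hypothetical labels
votes = [320, 275, 140]                       # made-up counts

fig, (ax_bar, ax_line) = plt.subplots(1, 2, figsize=(8, 3))

# Bar chart: one bar per party.
ax_bar.bar(parties, votes)
ax_bar.set_title("Votes by party")
ax_bar.set_ylabel("Votes")

# Line chart: made-up votes across four election years.
years = [2006, 2011, 2016, 2021]
ax_line.plot(years, [250, 300, 280, 320], marker="o")
ax_line.set_title("Votes by year")
ax_line.set_xlabel("Year")

fig.tight_layout()

# Render to an in-memory PNG; fig.savefig("plots.png") would write a file instead.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```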

Pandas :

Pandas is a Python library used for working with data sets. It has functions for
analyzing, cleaning, exploring, and manipulating data. The name "Pandas" refers
to both "Panel Data" and "Python Data Analysis". It was created by Wes
McKinney in 2008.
Pandas provides:

• Merging, joining and cleaning of data sets
• Easy handling of missing data (represented as NaN) in
floating point as well as non-floating point data
• Insertion and deletion of columns in DataFrame and
higher dimensional objects
• Powerful group by functionality for performing split-apply-
combine operations on data sets
• Data visualization
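
The features in the list above can be sketched on two small tables. The column names (`voter_id`, `gender`, `age`, `ward`) and all values are hypothetical and are not the columns of the project dataset.

```python
import numpy as np
import pandas as pd

# Two small, made-up tables that share the key "voter_id".
voters = pd.DataFrame({
    "voter_id": [1, 2, 3, 4],
    "gender": ["F", "M", "F", "M"],
    "age": [34, np.nan, 52, 41],      # one missing value (NaN)
})
wards = pd.DataFrame({
    "voter_id": [1, 2, 3, 4],
    "ward": [3, 3, 4, 4],
})

# Merging/joining two data sets on a shared key.
merged = voters.merge(wards, on="voter_id", how="left")

# Handling missing data: fill the missing age with the median age.
merged["age"] = merged["age"].fillna(merged["age"].median())

# Inserting and then deleting a column.
merged["is_senior"] = merged["age"] >= 60
merged = merged.drop(columns=["is_senior"])

# Group by: split the rows by ward, apply the mean, combine the results.
mean_age_by_ward = merged.groupby("ward")["age"].mean()
print(mean_age_by_ward)
```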

OS: The os module in Python provides functions for interacting
with the operating system.
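
A minimal sketch of typical uses of this module follows. The directory name `data` and the file name `voters.xlsx` are hypothetical placeholders.

```python
import os

# Current working directory of the running script.
cwd = os.getcwd()

# Build a path in a portable way; "data" and "voters.xlsx" are hypothetical names.
data_path = os.path.join(cwd, "data", "voters.xlsx")

# Check whether the file exists before trying to read it.
file_found = os.path.exists(data_path)

# List the entries in the current directory.
entries = os.listdir(cwd)

print(data_path, file_found, len(entries))
```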

2.2 About The Dataset
The raw data for this study is the Kerala Legislative Assembly list of voters of the third ward of
the Konni Assembly Constituency, gathered from the voter's list on the State Election
Commission Kerala website. This data is used to train the machine learning models. The data is
originally available in English, and it was obtained in Excel sheet format. The dataset used in
this project consists of 867 rows and 9 columns.

Figure 2.1: Displaying dataset

2.3 Data Preprocessing
Data preprocessing is the process of preparing raw data
and making it suitable for a machine learning model. It is the first and a crucial step
in creating a machine learning model. When working on a machine learning project,
we do not always come across clean and formatted data. Before
doing any operation with the data, it is necessary to clean it and put it in a consistent
format. For this, we use data preprocessing.
2.3.1 Import Data
The first stage is to import the dataset. The dataset used for this project
was downloaded from the “State Election Commission” website. Here we have also
imported the essential libraries, such as numpy, pandas, matplotlib and seaborn, and
then read the Excel file.
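
The import step can be sketched as follows. This is an assumption-laden illustration, not the code shown in Figure 2.2: the column names are hypothetical, and an in-memory CSV stands in for the downloaded Excel sheet so that the sketch runs without the original file.

```python
import io

import pandas as pd

# Stand-in for the downloaded sheet; the columns and rows are hypothetical.
raw = io.StringIO(
    "serial_no,name,gender,age\n"
    "1,Voter One,F,34\n"
    "2,Voter Two,M,\n"
    "3,Voter Three,F,52\n"
)
df = pd.read_csv(raw)

# For a real Excel file the equivalent call would be, for example:
# df = pd.read_excel("voters.xlsx")  # hypothetical file name; needs an Excel engine such as openpyxl

print(df.shape)    # (number of rows, number of columns)
print(df.head())   # first rows, to check that the file was read as expected
```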

Figure 2.2: Importing packages and dataset


2.3.2 Data Cleaning
Data cleaning is the process of fixing or removing incorrect, corrupted,
incorrectly formatted, duplicate, or incomplete data within a dataset. When combining
multiple data sources, there are many opportunities for data to be duplicated or
mislabeled. If data is incorrect, outcomes and algorithms are unreliable, even though they
may look correct. There is no one absolute way to prescribe the exact steps in the data
cleaning process because the processes will vary from dataset to dataset. But it is crucial
to establish a template for your data cleaning process so you know you are doing it the
right way every time.
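
One possible cleaning template is sketched below. The columns, the rule that `name` is a required field, and the choice to fill missing ages with the median are all assumptions made for illustration; the right steps depend on the dataset.

```python
import numpy as np
import pandas as pd

# Made-up rows containing a duplicate, inconsistent formatting and missing values.
df = pd.DataFrame({
    "name": ["Voter One", "Voter Two", "Voter Two", "Voter Three", None],
    "gender": ["F", "m ", "m ", "F", "M"],
    "age": [34, np.nan, np.nan, 52, 47],
})

# 1. Count the null values in each column.
null_counts = df.isnull().sum()

# 2. Remove exact duplicate rows.
df = df.drop_duplicates()

# 3. Fix inconsistent formatting (stray spaces, mixed case).
df["gender"] = df["gender"].str.strip().str.upper()

# 4. Drop rows missing a required field, then fill numeric gaps with the median.
df = df.dropna(subset=["name"])
df["age"] = df["age"].fillna(df["age"].median())
df = df.reset_index(drop=True)

print(null_counts)
print(df)
```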

Figure 2.3: Sorting out null values

2.4 Data Analysis
A data set is a collection of related data, and analysis of a dataset refers
to the process of manipulating raw data to uncover useful insights. In this
project, the candidates status table for Kerala's politicians is analysed. By
counting the records for each candidate, we can determine which candidate
most often wins the election. This information is shown in the figure below.

Figure 2.4: Candidates status table

By analysing the data, we can also determine which candidates received the most votes. Data
analysis plays a significant role in this project. This dataset can be used to analyse a wide
range of information.
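
The counting described above can be sketched with pandas. The table below is entirely made up (candidate names, parties, years, votes and the `status` column are hypothetical) and only shows the kind of calls involved.

```python
import pandas as pd

# Hypothetical candidates status table.
results = pd.DataFrame({
    "year": [2011, 2011, 2016, 2016, 2021, 2021],
    "candidate": ["Candidate A", "Candidate B"] * 3,
    "party": ["Party X", "Party Y"] * 3,
    "votes": [52000, 48000, 61000, 45000, 50000, 58000],
    "status": ["Won", "Lost", "Won", "Lost", "Lost", "Won"],
})

# Data type of every column, in two forms.
print(results.dtypes)
results.info()

# Number of wins per candidate.
wins = results[results["status"] == "Won"]["candidate"].value_counts()

# Total votes per candidate, largest first.
total_votes = (
    results.groupby("candidate")["votes"].sum().sort_values(ascending=False)
)

top_winner = wins.idxmax()
top_vote_getter = total_votes.idxmax()
print(top_winner, top_vote_getter)
```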

Figure 2.5: Data type of all columns

Figure 2.6: Data type of each column using info()

Figure 2.7: Total votes for the CPI(M) party with the candidate

2.5 Data Representation
Data representation refers to the form in which data is stored,
processed, and transmitted; data visualization is the graphical representation of
information and data. Data can be stored in digital format on devices such as computers,
smartphones, and iPads. Data visualization gives us a clear idea of what the
information means by giving it visual context through maps or graphs. This makes
the data more natural for the human mind to comprehend. By using visual
elements like charts, graphs, and maps, data visualization tools provide an
accessible way to see and understand trends, outliers, and patterns in large datasets. From the
charts plotted below, we can analyse the year-wise election results of each candidate.
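
A year-wise chart of this kind can be produced directly from a DataFrame. The sketch below uses made-up parties, years and vote counts, and assumes the non-interactive `Agg` backend so that it runs without a display; it is not the code behind the figures in this report.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend: render without opening a window
import pandas as pd

# Hypothetical year-wise results in long format (one row per year and party).
results = pd.DataFrame({
    "year": [2011, 2011, 2016, 2016, 2021, 2021],
    "party": ["Party X", "Party Y"] * 3,
    "votes": [52000, 48000, 61000, 45000, 50000, 58000],
})

# Reshape so that each year is a row and each party is a column.
by_year = results.pivot(index="year", columns="party", values="votes")

# Grouped bar chart: one group per year, one bar per party.
ax = by_year.plot(kind="bar", figsize=(6, 3))
ax.set_xlabel("Year")
ax.set_ylabel("Votes")
ax.set_title("Year-wise votes by party")
ax.get_figure().tight_layout()
```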

Figure 2.8: Year-wise election result

Figure 2.9: Year-by-year votes for the party INC

Figure 2.10: Displaying the length of gender/age column

Figure 2.11: Relationship between politician category and election result

Figure 2.12: Age count distribution

2.6 Implementation of Algorithm
In order to predict the election result for a given set of features, we need to import the machine
learning algorithms and performance metrics from the “scikit-learn” library. In this project, I
have used the Random Forest Classifier and Logistic Regression algorithms. The results are
obtained using the specified attributes, such as Party and Vote Rate.
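
A minimal sketch of this step is given below. It is not the code shown in Figure 2.13: the data is synthetic, the column names `party`, `vote_rate` and `won` are hypothetical, and the 75/25 split and model settings are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

# Synthetic data: a higher vote rate makes a win more likely.
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "party": rng.choice(["Party X", "Party Y", "Party Z"], size=n),
    "vote_rate": rng.uniform(0.05, 0.60, size=n),
})
df["won"] = (df["vote_rate"] + rng.normal(0, 0.05, size=n) > 0.35).astype(int)

# One-hot encode the categorical column so both models can use it.
X = pd.get_dummies(df[["party", "vote_rate"]], columns=["party"])
y = df["won"]

# Hold out 25% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

models = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

# Fit each model on the training rows and score it on the held-out rows.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    scores[name] = {
        "accuracy": accuracy_score(y_test, predictions),
        "f1": f1_score(y_test, predictions),
    }
    print(name, scores[name])
```

Scoring both models on the same held-out rows keeps the comparison between them fair.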

Figure 2.13: Implementation of Algorithms

Figure 2.14: Summary

Feasibility Analysis
A feasibility study is made to see whether the project, on completion, will serve the purpose of the
organization for the amount of work, effort and time spent on it. A feasibility study
lets the developer foresee the future of the project and its usefulness. A system proposal is
studied according to its workability, that is, its impact on the organization, its
ability to meet user needs and its effective use of resources. Thus, when a new application
is proposed, it normally goes through a feasibility study before it is approved for
development. There are three aspects in the feasibility study portion of the preliminary
investigation.

• Technical Feasibility

• Economic Feasibility

• Operational Feasibility

Technical Feasibility :

A technical study is a study of hardware and software requirements. All the technical issues
related to the proposed system are dealt with during the feasibility stage of the preliminary investigation.
The data storage capacity of the equipment proposed for the system is sufficient.
There is no need to develop any hardware to use this system. Election prediction is a
platform-independent system, so there is no need to install any bulky software to
implement it. There is also no need for other special equipment;
thus the system is technically feasible and non-intrusive.

Economic Feasibility :

Economic analysis is the most frequently used method for evaluating the effectiveness of a
candidate system. The election prediction system is cost-effective and fits within budgetary
constraints; it is cheap and quick to implement. There is no extra requirement for
peripherals or software for the development of the system, as it can be completed with the available
resources. No special equipment is needed to use this system. It also does not require bulk
writing. So it is economically feasible.

Operational Feasibility:

Operational feasibility is the measure of how well a proposed system solves the problems
and takes advantage of the opportunities identified during scope definition, and how it
satisfies the requirements identified in the requirements analysis phase of system
development. Election prediction is easy to operate because it uses only simple steps to
predict the result for a given set of attributes. The system is simple for the user because there are
separate forms for prediction, so there are no complicated steps involved in using this system.

Conclusion

Election prediction is a crucial aspect of political analysis, providing insights
into the possible outcomes of an electoral process based on historical data,
current trends, and various influencing factors. The ability to accurately
forecast election results is invaluable for political campaigns, policymakers,
and the general public. In recent years, data science and machine learning
techniques have played a significant role in enhancing the accuracy and
efficiency of election predictions. We aim to contribute to the field of political
analysis and election forecasting, showcasing the power of Python and
machine learning in providing valuable insights into the dynamics of electoral
processes. The project's findings and methodology can be applied to elections
at various scales, from local to national levels. The desired feature
extraction, analysis and representation methods were selected, and the selected
machine learning algorithms, Random Forest Classifier and Logistic
Regression, were applied and evaluated. We gathered 867 rows of data from
the online resource State Election Commission Kerala, voter’s list.
The same 867 rows were used both for training and for testing
in order to quantify accuracy and the other assessment
metrics. To evaluate the trained model, predictions on the training data
were computed and then compared with the original
data to check their accuracy. Based on the accuracy of the results, we
concluded that the election prediction dataset showed the best accuracy
with the Random Forest Classifier.
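
Accuracy measured on rows that the model was trained on tends to be optimistic. The abstract mentions cross-validation as a way to make the evaluation more robust; the sketch below shows one way this could look. It uses synthetic data and assumed settings (five folds, 50 trees) and does not reproduce the evaluation carried out in this project.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic features and labels standing in for a prepared dataset.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

model = RandomForestClassifier(n_estimators=50, random_state=1)

# Accuracy on the same rows the model was fitted on.
train_accuracy = model.fit(X, y).score(X, y)

# 5-fold cross-validation: each fold is scored by a model that never saw it.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)
cv_scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")

print("accuracy on training rows:", train_accuracy)
print("mean cross-validated accuracy:", cv_scores.mean())
```

The gap between the two printed numbers indicates how much the training-set figure overstates performance on unseen rows.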

Future Scope

Predicting election outcomes using Python for data analysis can be a fascinating and
impactful endeavor. Kerala, with its politically vibrant landscape, provides an interesting
case study for such analysis. Python, with its powerful libraries like Pandas, NumPy,
Matplotlib, and Scikit-learn, offers a robust ecosystem for data analysis and prediction
tasks. Election prediction using Python for data analysis holds immense potential for
understanding and forecasting electoral outcomes. With continued advancements in
technology and methodology, coupled with a commitment to ethical practices, the future of
this field looks promising in its ability to provide valuable insights into the democratic
process.

Bibliography
[1] State Election Commission Kerala, voter’s list. https://www.sec.kerala.gov.in/public/voters/list

[2] Paramartha Sengupta, Data Scientist at Customerinsights.ai, Hyderabad, Telangana, India. EDA (Plotly) & Prediction: Indian Elections 2019

[3] Bernd Klein, Python Data Analysis. https://pythoncourse.eu/books/bernd_klein_python_data_analysis_a4.pdf

[4] Jake VanderPlas, Python Data Science Handbook. https://jakevdp.github.io/PythonDataScienceHandbook/
