STC Report 44
On
Amol Funde
Roll No:- 44
Exam seat No.: ____________
CERTIFICATE
This is to certify that Mr. Amol Shahadev Funde from Third Year Computer
Engineering has successfully completed his seminar work titled “DATA MINING FOR
CRIME INVESTIGATION” at Bharati Vidyapeeth’s College of Engineering, Lavale,
Pune in the partial fulfilment of the bachelor’s degree in engineering.
The crime rate is increasing very fast in India. With the existing crime
investigation techniques, officers have to spend a lot of time as well as manpower to
identify suspects and criminals. As a large amount of information is collected during
crime investigation, data mining is an approach which can be useful in this context.
Data mining is a process that extracts useful information from a large amount of crime
data so that possible suspects of the crime can be identified efficiently. The best performing
algorithm will be used against a sample crime and criminal database to identify possible
suspects of the crime.
Crime investigation is a critical aspect of maintaining public safety and ensuring
the rule of law. With the advent of technology and the proliferation of digital data, law
enforcement agencies are increasingly turning to data mining techniques to aid in their
investigative efforts. This paper presents an overview of the application of data mining
in crime investigation, focusing on its role in uncovering hidden patterns, predicting
criminal activities, and enhancing decision-making processes.
The primary objective of this research is to highlight the potential of data mining
as a valuable tool for law enforcement agencies in solving and preventing crimes. We
explore various data sources, including crime reports, geographic data, social media, and
surveillance footage, and discuss the challenges and ethical considerations associated
with collecting and analyzing such data. The paper delves into different data mining
techniques such as clustering, classification, and association rule mining, demonstrating
how they can be applied to identify crime hotspots, suspect profiles, and modus
operandi. Additionally, predictive modeling using machine learning algorithms is
discussed to forecast future criminal activities, allocate resources effectively, and
prioritize investigations.
Ethical and privacy concerns surrounding the use of personal data in crime investigations
are addressed, emphasizing the importance of responsible data handling practices and
adherence to legal and ethical standards. The results of this research illustrate the
significant potential of data mining in crime investigation, highlighting its ability to assist
law enforcement agencies in understanding crime patterns, making informed decisions,
and ultimately enhancing public safety. As technology continues to advance, the
integration of data mining techniques into crime investigation procedures is likely to
become even more crucial in the ongoing effort to combat criminal activities and maintain
a safer society.
Department of Computer Engineering, BVCOE, Lavale Pune
KEYWORDS:
Criminal Investigation, Data Mining, Suspect Prediction, Criminology, Cloud Technology,
Mobile Application, Prediction Algorithms, Crime Analysis
Acknowledgements
I would like to thank my internal guide, Prof. Pratima Kadam, for giving me all the help
and guidance I needed. I am really grateful for her kind support and valuable
suggestions.
I would also like to thank all the faculty members and friends who supported me, directly
or indirectly, from time to time. I also wish to express my deepest gratitude to the head of
the department, Dr. P. V. Rathod, and all the faculty and staff members of the Information
Technology Engg. Department for their valuable support.
Sincerely,
AMOL FUNDE ( 44 )
1 Introduction
2 Literature Survey
3 Problem Definition and Scope
4 Software Requirement Specification
5 Conclusions
6 References
7 Report Documentation
1.Introduction
1.1 Introduction:
• In India, the police department is the largest unit for preventing crime and maintaining
law and order and peace throughout the country. However, the problem with the Indian
police is that they still use traditional manual processes, such as the First
Information Report (FIR), to keep and analyze the records of crimes and criminals.
Criminals, on the other hand, use more sophisticated technologies to commit crimes.
In 2011, 34,305 cases of murder, 31,385 cases of attempted murder, 24,206 cases of rape,
44,664 cases of kidnapping and around 600,000 cases of robbery, theft and dacoity
were recorded. The total number of crime cases recorded was 2,325,575. These are only
the statistics for recorded crimes. The crime rate is increasing very fast in India and
the country is becoming less safe. Dividing the yearly total by 365 days shows that the
police department has to handle about 6,400 cases per day. In order to prevent
crimes, police officers have to identify evidence for those cases.
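The per-day estimate above can be checked with a short calculation. This is only a sketch: the yearly total of 2,325,575 recorded cases is taken as an assumption because it is the value that reproduces the quoted figure of about 6,400 cases per day, and the function name `cases_per_day` is introduced here purely for illustration.

```python
def cases_per_day(total_cases: int, days_in_year: int = 365) -> float:
    """Average number of recorded cases the police must handle per day."""
    return total_cases / days_in_year


# Assumed yearly total of recorded cases for 2011 (illustrative input).
yearly_total = 2_325_575
print(round(cases_per_day(yearly_total)))  # 6371, i.e. roughly 6,400 per day
```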
2.Literature Survey
Crime is a social nuisance and costs society dearly in many ways. Any research that
can help in analyzing and solving crime faster pays for itself. Crime data mining has
the capacity to extract useful information and hidden patterns from large crime data
sets. The challenges of crime data mining are becoming stimulating opportunities for
the coming years. Since the literature on crime data mining has grown rapidly in
recent years, it has become necessary to develop an overview of the state of the art.
This systematic review focuses on crime data mining techniques and technologies
used in previous studies. The existing work is grouped into various categories and is
presented using visualizations. Crime data mining and analysis is an active area of
research. The results of this study may help new potential users in understanding the
range of available crime data mining techniques and technologies.
Frequent pattern (itemset) mining plays a key role in association rule mining. The
Apriori and FP-growth algorithms are the best known algorithms for frequent pattern
mining. This paper surveys different frequent pattern mining and rule mining
algorithms which can be applied to crime pattern mining. The survey shows what has
already been done in the area, what the present trend is and what the other related
areas are. Frequent pattern mining has three important approaches: the candidate
generation approach, the approach without candidate generation, and the vertical
layout approach. The paper also explains different frequent pattern algorithms and
how they can be applied to various areas, especially crime pattern detection. It helps
researchers to get a clear idea of the application of frequent pattern mining
algorithms in various areas. Crime Pattern Theory states that a crime involving an
offender and a victim or target can only occur when the activity spaces of both cross
paths. Simply put, a crime will occur if an area provides the opportunity for crime
and lies within an offender’s awareness space. The paper reviews different research
papers on the use of frequent pattern mining and association rule mining in the field
of crime pattern detection. It gives knowledge about different frequent pattern mining
algorithms and their extensions. It also describes the different application areas, other
than crime patterns, where these frequent patterns can be used.
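To make the candidate generation approach concrete, the following is a minimal sketch of Apriori-style frequent itemset mining. It is not taken from the surveyed papers: the function name `apriori`, the attribute values and the four sample records are hypothetical and chosen only for illustration. Support is counted as an absolute number of records.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support_count} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates: every single item seen in the data.
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        candidates = {a | b for a, b in combinations(list(level), 2)
                      if len(a | b) == k}
        # Prune step: keep a candidate only if all its subsets are frequent.
        current = {c for c in candidates
                   if all(frozenset(s) in level
                          for s in combinations(c, k - 1))}
    return frequent


# Hypothetical crime records, each described by a set of attribute values.
records = [
    {"night", "burglary", "residential"},
    {"night", "burglary", "commercial"},
    {"day", "theft", "commercial"},
    {"night", "burglary", "residential"},
]
result = apriori(records, min_support=2)
for itemset, count in sorted(result.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), count)
```

With these sample records, the pattern night, burglary and residential appears in two of the four records and is therefore reported as frequent at a minimum support of 2. Association rules would then be derived from such itemsets in a separate step.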
In daily life, crime keeps increasing and threatens the lives of people in public.
Tracing becomes more accurate and faster when data mining techniques are used.
The difficulty in the process starts with selecting the relevant variables for analysis
and assessing their sensitivity. Research on crime data is a promising area that
requires optimization. The volume and volatility of the data make the field
challenging. This paper summarizes the various existing techniques. The motivation
for the developed system is the problem faced by the police in tackling a large
volume of crime.
The lack of software suited to police needs paved the way for the OVER project, which
aimed at managing various past records to address these issues. The work started
with Microsoft Access, and the data were analyzed using Structured Query Language.
This was followed by the development of a system that provides mapping and
visualization tools for the existing data, along with a predictive perspective. Data
mining is vital in all areas, and from the above works it is evident that each
researcher worked from a different perspective to ensure the safety of the nation. The
data used for this purpose are varied in nature, and it is important to create a
common ground for storing crime-related data. The use of soft computing algorithms
will increase the rate at which highly critical data are found in a shorter span of time.
Data mining techniques like classification, clustering and association rule mining
serve well the purpose of handling such versatile data. The effective management of
crime-related data, along with optimized algorithms, is a potential area of research.
Profiling offender information and then applying data mining paves the way to
understanding the behavioral patterns of criminals.
With a substantial increase in crime across the globe, there is a need to analyze
crime data in order to lower the crime rate. This helps the police and citizens to take
the necessary actions and solve crimes faster. In this paper, data mining techniques
are applied to crime data to predict the features that contribute to a high crime rate.
Supervised learning uses labeled data sets to train and test a model, whereas
unsupervised learning divides inconsistent, unstructured data into classes or
clusters. Decision trees, Naive Bayes and regression are some of the supervised
learning methods in data mining and machine learning; they are trained on
previously collected data and then used to predict the features responsible for
causing crime in a region or locality. Based on the rankings of the features, the Crime
Records Bureau and the Police Department can take the necessary actions to decrease
the probability of occurrence of crime.
The crime data has many features, including information about immigrants, race, sex,
population, demographics and so on. Analyzing this data not only helps in
recognizing the features responsible for a high crime rate but also helps in taking
the necessary actions to prevent crimes. Data mining provides powerful techniques
and algorithms to analyze data and extract important information from it. One of the
reviewed works covers criminal network analysis, which includes the study of
network boundaries and the definition of actors and their attributes, activities and
affiliations, taking into account data reliability and change over time. The author
concluded that selecting the appropriate methodology depends on the tasks required
or the volume of crime data to be prepared. The paper concludes that the Random
Forest Classifier gave the most balanced results with respect to accuracy, precision,
recall and F1 score out of the three models for predicting the ‘Per Capita Violent
Crimes’ feature. Linear Regression gave the lowest values on these performance
measures, because the data could not be fitted well by a straight line relating the
target to the remaining features.
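The four performance measures named above can be computed directly from the true and predicted labels. The sketch below assumes, for illustration only, that the target has been turned into a binary label (1 for a high crime rate, 0 for a low one); the function name `classification_metrics` and the label lists are hypothetical and are not results from the surveyed paper.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 1 = high crime rate, 0 = low crime rate.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

A model is called balanced in this sense when none of the four values is much lower than the others, which is why comparing them side by side is more informative than accuracy alone.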
The crime rate is increasing very fast in India because of the increase in poverty and
unemployment. With the existing crime investigation techniques, officers have to
spend a lot of time as well as manpower to identify suspects and criminals. However,
the crime investigation process needs to be faster and more efficient. As a large
amount of information is collected during crime investigation, data mining is an
approach which can be useful here. In clustering, the data objects within a group are
very similar to one another and very dissimilar to the objects of other groups.
Traditional crime investigation processes require a lot of skilled manpower and
paperwork. There is a lack of technology use in a sensitive domain like crime
investigation, so crime investigation has become a time-consuming process. Data
mining is the process of extracting useful information or knowledge from large data
sources. A large amount of information is collected during the crime investigation
process, and only the useful information is required for analysis, so data mining can
be used for this purpose. The selection of a particular data mining technique has a
great influence on the results obtained. This is the main reason for comparing
performance and selecting the best performing data mining algorithm.
This paper presents a proposed model for crime and criminal data analysis using the
simple k-means algorithm for data clustering and the Apriori algorithm for
association rules. The paper aims to help specialists in discovering patterns and
trends, making forecasts, finding relationships and possible explanations, mapping
criminal networks and identifying possible suspects. Clustering is based on finding
relationships between different crime and criminal attributes that share some
previously unknown common characteristics. Association rule mining generates rules
from the crime dataset based on frequently occurring patterns, to help security
decision makers take preventive action. The data was collected manually from some
police departments in Libya. This work aims to help the Libyan government make
strategic decisions regarding the prevention of the currently increasing crime rate.
Data for both crimes and criminals were collected from police departments’ datasets
to create and test the proposed model, and then these data were preprocessed to get
clean and accurate data using different pre-processing techniques.
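The clustering step described above can be illustrated with a minimal k-means sketch. It is not the implementation used in the surveyed work: the function name `kmeans`, the choice of the first k points as initial centroids (made here so that the run is reproducible) and the two-dimensional sample points are all assumptions for illustration.

```python
import math


def kmeans(points, k, iterations=100):
    """Simple k-means: returns (centroids, cluster index of each point)."""
    # Assumption: the first k points serve as the initial centroids.
    centroids = [tuple(p) for p in points[:k]]
    assignments = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # no point changed cluster, so the result is stable
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members)
                                     for v in zip(*members))
    return centroids, assignments


# Hypothetical numeric features of six crime records (already scaled).
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0),
          (8.0, 8.5), (1.0, 0.5), (9.5, 8.0)]
centroids, assignments = kmeans(points, k=2)
print(assignments)  # [0, 1, 0, 1, 0, 1]
```

In practice, crime attributes such as place or weapon are categorical and would first have to be encoded as numbers before a distance-based method like this can be applied.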
The purpose of this paper is to suggest ways by which law enforcement and
intelligence agencies can analyze large volumes of data, using data mining as one of
the ways of obtaining active solutions for crime investigation in Nigeria. The problem
is that the general public is extremely concerned about how the activities of terrorists
in Nigeria can be reduced, if they cannot be completely eliminated. The results of this
paper can be significantly useful to the National Security Advisor’s office (NASAO),
State Security Service (SSS), Nigerian Police Force (NPF), Nigerian Army, including
the Air Force (NAF) and Navy, Immigration, Customs, Economic and Financial Crimes
Commission (EFCC), Independent Corrupt Practices Commission (ICPC), and the
general public. Data mining will help these stakeholders to review various data
mining algorithms and then design an automated framework that triggers alarms for
the timely prevention, arrest and investigation of crimes.
Data mining is a way to extract knowledge from usually large data sets; in other
words, it is an approach to discover hidden relationships among data by using
artificial intelligence methods. The wide range of data mining applications has made
it an important field of research. Criminology is one of the most important fields for
applying data mining. Criminology is a process that aims to identify crime
characteristics. Crime analysis includes exploring and detecting crimes and their
relationships with criminals. The high volume of crime datasets and the complexity
of the relationships between these kinds of data have made criminology an
appropriate field for applying data mining techniques.
Identifying crime characteristics is the first step towards further analysis. The
knowledge gained from data mining approaches is a very useful tool which can
help and support police forces. This paper discusses an approach based on data
mining techniques to extract important entities from police narrative reports, which
are written in plain text. By using this approach, crime data can be entered
automatically into a database in law enforcement agencies. The authors also apply a
SOM clustering method in the scope of crime analysis and then use the clustering
results to perform the crime matching process. In this research, some of the most
significant capabilities of data mining techniques were leveraged through a
multi-purpose framework for intelligent crime investigation. The framework exploited
a systematic approach for using SOM and MLP neural networks.
Problem Statement
As the crime rate is increasing day by day, many cases take too long to solve. Because
of insufficient resources and the lack of technology available to investigators, even
cases that could be solved within a short period take too much time. To overcome this
situation, this seminar proposes the use of data mining techniques and algorithms.
• Before going live, the suspect prediction score must be tested using a previous
dataset.
• The prediction algorithm will only predict the most likely suspect among the suspect
descriptions available in the database; it cannot predict a suspect whose data is not available.
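The second constraint can be illustrated with a small sketch that ranks only the suspects already present in a database. This is a simple attribute-matching heuristic assumed for illustration, not the prediction algorithm of this report; the function name `rank_suspects`, the attribute names and the sample records are hypothetical.

```python
def rank_suspects(evidence, suspects):
    """Rank known suspects by the fraction of evidence attributes they match."""
    if not evidence:
        raise ValueError("evidence must contain at least one attribute")
    scores = []
    for name, record in suspects.items():
        matches = sum(1 for key, value in evidence.items()
                      if record.get(key) == value)
        scores.append((name, matches / len(evidence)))
    # Highest score first; suspects absent from the database never appear.
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical suspect database and crime scene evidence.
suspects = {
    "S1": {"weapon": "knife", "area": "north", "method": "break-in"},
    "S2": {"weapon": "gun", "area": "north", "method": "fraud"},
}
evidence = {"weapon": "knife", "area": "north", "method": "break-in"}
ranked = rank_suspects(evidence, suspects)
print(ranked[0])  # ('S1', 1.0)
```

Testing the score against a previous dataset, as required by the first constraint, would mean running such a ranking on solved cases and checking how often the actual offender is ranked first.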
3.3 Outcome
3.4 Applications
• Database management.
• Crime Reporting.
• RAM: 1 GB
• Camera
• GPS
Platform:
They are responsible for the design, testing and maintenance of software programs for
computer operating systems or applications, such as word processing or database management
systems.
The use case scenario considers the goal of publishing a short story. It breaks down
the process of book publishing by describing the actors, the typical workflow in the main
success story, and the things that could go wrong, called extensions. When managing
projects that use UML conventions, there can be a temptation to jump straight into the
use case diagram, with stick figures, ovals, and lots of lines. But if you don’t know your
goals and who’s involved, take a step back and write your goals down in prose.
1. Admin: Can add the FIR with details like name, age, crime, place, weapon, etc.,
which are submitted by the application user.
2. Sub Inspector: Can view the FIR and get updates related to that particular case.
He can also update the reports, with his changes tagged.
3. User: Can add crime details from the crime scene and use the inbuilt camera to
capture crime scene photos.
Data objects and their major attributes and relationships among data objects are
described using an ERD-like form.
2) DATA ANALYSIS AND USEFUL DATA GENERATION: Data stored in the cloud
is used to generate useful information using data mining techniques and criminal
prediction algorithms.
3) USING GENERATED DATA FOR CRIME INVESTIGATION: Generated data
is used for further investigation and stored in records for other requirements.
• PERFORMANCE REQUIREMENTS:
The performance of the system completely depends upon how quickly the system
will be able to run analysis and prepare crime patterns based on the data and the
volume of the data to be extracted. It is necessary to maintain the performance of
the system so that the results are accurate.
• SAFETY REQUIREMENTS:
The system must be safe and should not be susceptible to attacks. Attacks can
change the integrity of the data/system and result in loss of confidentiality or data
loss which can affect the system hugely.
• SECURITY REQUIREMENTS:
Data must be secure from unauthorized access. Data present in the database is highly
confidential and related to crime scenes, and it should not be tampered with or modified
at any cost.
Availability is an important quality attribute, since the system should be available
for use whenever it is needed. The system should be up and running whenever it is
required. The system should run on any device with the same efficiency, time and
accuracy as the original system.
A state diagram is a type of diagram used in computer science and related fields to
describe the behavior of systems. State diagrams require that the system described is
composed of a finite number of states; sometimes, this is indeed the case, while at other
times this is a reasonable abstraction. Many forms of state diagrams exist, which differ
slightly and have different semantics. A transition in a state diagram is a progression
from one state to another and is triggered by an event that is internal or external to the
entity modeled. An action is an operation that is invoked by an entity that is modeled. A
very traditional form of state diagram for a finite-state machine is a directed graph.
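A state diagram of this kind can be written down as a transition table that maps a (state, event) pair to the next state. The sketch below is only an illustration: the states and events of a case record (`reported`, `assign_officer` and so on) are hypothetical and are not taken from the diagrams of this report.

```python
# Directed graph of the state machine: (current state, event) -> next state.
# All state and event names are hypothetical, chosen for illustration.
TRANSITIONS = {
    ("reported", "assign_officer"): "under_investigation",
    ("under_investigation", "identify_suspect"): "suspect_identified",
    ("under_investigation", "close_case"): "closed",
    ("suspect_identified", "close_case"): "closed",
}


def next_state(state, event):
    """Follow one transition, rejecting events not allowed in this state."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state!r} on {event!r}") from None


def run(events, state="reported"):
    """Apply a sequence of events starting from the initial state."""
    for event in events:
        state = next_state(state, event)
    return state


final = run(["assign_officer", "identify_suspect", "close_case"])
print(final)  # closed
```

Because the set of states is finite and every transition is triggered by a named event, an invalid sequence, such as assigning an officer to a closed case, is rejected instead of silently changing the record.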
Any design constraints that will impact the subsystem are noted
5 Conclusions
[1] Dr. Ritu Bhargava, Pramod Singh, Rameshwar Singh Sangwa. ANALYSIS OF
CRIME DATA USING DATA MINING ALGORITHM.
[6] Dr. Zakaria Suliman Zubi, Ayman Altaher Mahmmud. CRIME DATA ANALYSIS
USING DATA MINING TECHNIQUES TO IMPROVE CRIMES PREVENTION.
7. Report Documentation
Address:
E-mail: [email protected]
Roll: 44
Abstract:
The crime rate is increasing very fast in India. With the existing crime investigation
techniques, officers have to spend a lot of time as well as manpower to identify suspects and criminals. As
a large amount of information is collected during crime investigation, data mining is an approach which can
be useful in this context. Data mining is a process that extracts useful information from a large amount of
crime data so that possible suspects of the crime can be identified efficiently. The best performing algorithm will
be used against a sample crime and criminal database to identify possible suspects of the crime.