DAV Proj Proposal
This project will analyze crime patterns in Chicago using publicly available crime data,
with particular emphasis on gun-related violence. Chicago has long struggled with high crime
rates, and gun violence in particular has been a persistent problem. Using a comprehensive
dataset from the Chicago Data Portal, we aim to gain insights into the distribution,
frequency, and severity of crimes within the city.
The dataset contains detailed information on crimes from 2001 to the present, including incident
types, locations, dates, and crime outcomes. Our analysis will focus on uncovering trends over
time, identifying crime hotspots, and determining the most prevalent types of crimes in various
neighborhoods. While we will analyze all types of crimes, gun violence will receive special
attention due to its significant social impact.
The primary dataset will be obtained from the Chicago Open Data Portal [1], supplemented with
crime data resources from data.gov [3] and gun-related incident data from the Gun Violence
Archive [2]. We will preprocess the data to handle missing values, outliers, and
inconsistencies, and we will clean the geographic coordinates to ensure accurate mapping of
crime locations.
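The preprocessing steps described above could be sketched roughly as follows. This is a minimal illustration, not the final pipeline: the column names (ID, Date, Latitude, Longitude) are assumed to match the portal export, the bounding box for Chicago is an approximate assumption, and the sample rows are invented for demonstration.

```python
import pandas as pd

# Approximate bounding box for Chicago (assumed values, for illustration only).
LAT_MIN, LAT_MAX = 41.6, 42.1
LON_MIN, LON_MAX = -87.95, -87.5


def clean_crime_records(df: pd.DataFrame) -> pd.DataFrame:
    """Drop duplicate IDs, parse dates, and keep rows with plausible coordinates."""
    out = df.drop_duplicates(subset="ID").copy()
    # Unparseable dates become NaT so they can be dropped with other missing values.
    out["Date"] = pd.to_datetime(out["Date"], errors="coerce")
    out = out.dropna(subset=["Date", "Latitude", "Longitude"])
    # Coordinates far outside the city are treated as data-entry errors.
    in_bounds = (
        out["Latitude"].between(LAT_MIN, LAT_MAX)
        & out["Longitude"].between(LON_MIN, LON_MAX)
    )
    return out[in_bounds].reset_index(drop=True)


# Invented sample rows covering each problem the function handles.
sample = pd.DataFrame({
    "ID": [1, 1, 2, 3, 4],
    "Date": ["2020-01-05 13:00", "2020-01-05 13:00", "not a date",
             "2021-07-04 23:30", "2021-08-01 02:15"],
    "Latitude": [41.88, 41.88, 41.90, None, 36.62],
    "Longitude": [-87.63, -87.63, -87.65, -87.70, -91.69],
})
cleaned = clean_crime_records(sample)
print(cleaned)
```

In this sample only the record with ID 1 survives: the duplicate row, the unparseable date, the missing latitude, and the out-of-city coordinate are each removed.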
We will begin with an exploratory analysis of the crime data to identify overall trends, peak
crime periods, and regions most affected by gun violence. This will include:
· Time-series analysis to determine how crime rates have evolved over time.
· Geo-spatial analysis to visualize crime distributions across Chicago’s neighborhoods.
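As a rough illustration of the time-series step listed above, incidents can be aggregated into monthly counts with pandas. The column names and the handful of sample rows are assumptions made for this sketch only.

```python
import pandas as pd


def monthly_incident_counts(df: pd.DataFrame, date_col: str = "Date") -> pd.Series:
    """Count incidents per calendar month; months with no incidents appear as 0."""
    indexed = df.set_index(pd.to_datetime(df[date_col]))
    return indexed.resample("MS").size()


# Invented sample incidents (not real data).
events = pd.DataFrame({
    "Date": ["2020-01-03", "2020-01-20", "2020-03-11", "2020-03-12", "2020-03-30"],
    "Primary Type": ["THEFT", "BATTERY", "THEFT", "ASSAULT", "THEFT"],
})
counts = monthly_incident_counts(events)
peak_month = counts.idxmax()  # month start date with the most incidents
print(counts)
print(peak_month)
```

Resampling on month starts keeps empty months in the series, which matters later if the counts are fed to a forecasting model that expects a regular time index.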
Hotspot Detection with Clustering: Using clustering algorithms like DBSCAN, we will detect
crime hotspots, particularly in areas with frequent gun-related incidents.
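A minimal sketch of DBSCAN-based hotspot detection is shown below, using scikit-learn with the haversine metric so that the neighborhood radius can be expressed in kilometers. The radius, the minimum number of incidents, and the synthetic coordinates are all assumptions chosen for illustration; suitable values would have to be tuned on the real data.

```python
import numpy as np
from sklearn.cluster import DBSCAN

EARTH_RADIUS_KM = 6371.0


def find_hotspots(lat_lon_deg: np.ndarray, radius_km: float = 0.5,
                  min_incidents: int = 5) -> np.ndarray:
    """Return a DBSCAN cluster label per incident; -1 marks isolated incidents."""
    # The haversine metric expects [latitude, longitude] in radians,
    # and eps is then an angle: distance divided by the Earth's radius.
    coords_rad = np.radians(lat_lon_deg)
    model = DBSCAN(eps=radius_km / EARTH_RADIUS_KM, min_samples=min_incidents,
                   metric="haversine", algorithm="ball_tree")
    return model.fit_predict(coords_rad)


def grid_around(lat: float, lon: float, half_width_deg: float = 0.001,
                n: int = 4) -> np.ndarray:
    """Build a small synthetic grid of points around a center (illustration only)."""
    offsets = np.linspace(-half_width_deg, half_width_deg, n)
    return np.array([(lat + dy, lon + dx) for dy in offsets for dx in offsets])


# Two dense synthetic groups plus three isolated points (invented coordinates).
isolated = np.array([(41.95, -87.80), (41.70, -87.75), (42.00, -87.70)])
points = np.vstack([grid_around(41.88, -87.63), grid_around(41.75, -87.60), isolated])

labels = find_hotspots(points)
n_hotspots = len(set(labels)) - (1 if -1 in labels else 0)
n_isolated = int((labels == -1).sum())
print(n_hotspots, n_isolated)
```

DBSCAN does not need the number of clusters in advance and labels sparse incidents as noise, which is one reason it is often preferred over K-Means for hotspot detection.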
Classification Models for Crime Severity: We will develop classification models (e.g., Decision
Trees, SVM, and XGBoost) to predict the severity of crimes (minor, major, fatal) based on
historical data. These models will help identify the areas at higher risk of serious crimes.
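The sketch below illustrates the general shape of such a classifier with a scikit-learn decision tree. Everything about the data here is an assumption: the features (hour, firearm_involved), the rule used to generate the severity labels, and the sample size are synthetic and are not findings from the Chicago dataset, which would first need a severity label to be defined.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 400

# Synthetic features (hypothetical names, illustration only).
features = pd.DataFrame({
    "hour": rng.integers(0, 24, size=n),
    "firearm_involved": rng.integers(0, 2, size=n),
})

# Synthetic labels from an invented rule, so the tree has something to learn.
severity = np.where(
    features["firearm_involved"] == 0,
    "minor",
    np.where(features["hour"] >= 20, "fatal", "major"),
)

X_train, X_test, y_train, y_test = train_test_split(
    features, severity, test_size=0.25, random_state=0
)

model = DecisionTreeClassifier(max_depth=4, random_state=0)
model.fit(X_train, y_train)

predictions = model.predict(X_test)
test_accuracy = model.score(X_test, y_test)
print(test_accuracy)
```

The same train/test structure would apply to SVM or XGBoost; only the estimator object would change.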
Time-Series Forecasting for Crime Trends: We will use models like ARIMA or Prophet to
forecast future crime rates and gun violence trends in Chicago.
We will create interactive crime maps, allowing users to visualize crime intensity across different
neighborhoods. Downloadable crime reports in PDF or Excel format will provide detailed
insights into specific crime types and their distribution across the city.
Software Tools:
· Programming Languages: Python for data scraping, analysis, and model implementation.
· Libraries: Pandas, NumPy, Scikit-learn, GeoPandas, Matplotlib, Seaborn, and Folium (for
interactive maps).
We expect to uncover important patterns in Chicago’s crime data, particularly regarding gun
violence. Our key deliverables will include:
· Classification Models: evaluated with Accuracy, Precision, Recall, and F1 Score on the
crime-severity prediction task.
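The metrics named above can be computed with scikit-learn as sketched here. The true and predicted labels are invented for illustration, and macro averaging is an assumption of this sketch; it weights each severity class equally, which may be appropriate if fatal incidents turn out to be rare.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Invented labels for eight incidents (illustration only).
y_true = ["minor", "minor", "major", "major", "fatal", "fatal", "minor", "major"]
y_pred = ["minor", "major", "major", "major", "fatal", "minor", "minor", "major"]

accuracy = accuracy_score(y_true, y_pred)  # 6 of 8 predictions match

# Macro averaging: compute each metric per class, then take the unweighted mean.
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0
)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```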
Initial exploration of the Chicago crime dataset reveals that gun-related crimes are concentrated
in certain neighborhoods. These areas often experience higher rates of violent crimes and show
significant temporal variations. Through further analysis, we plan to detect shifts in crime
patterns and identify which areas are most vulnerable to future incidents.
Outline of Work-to-Do:
· Data Preprocessing and Cleaning: Handling missing values, scaling features, and
transforming geographical data.
· Exploratory Data Analysis (EDA): Visualizing crime distributions and trends, especially
gun-related violence.
· Clustering and Hotspot Detection: Applying DBSCAN and K-Means to detect crime
hotspots.
· Visualization: Building interactive maps and dashboards to visualize crime trends and
severity.
References: