
f7

The document outlines a geospatial analysis task involving the visualization and analysis of restaurant data using Python. It includes creating an interactive map of restaurant locations, analyzing the distribution of restaurants across countries and cities, and determining correlations between restaurant ratings and their geographical locations. The results are presented through various plots and saved outputs.

Uploaded by

sambhaviasingh

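# LEVEL 1 - TASK 3: GEOSPATIAL ANALYSIS

# Imports used by this task. df_processed is assumed to be the cleaned
# DataFrame produced by the earlier tasks.
import folium
from folium.plugins import MarkerCluster
import matplotlib.pyplot as plt
import seaborn as sns

print("LEVEL 1 - TASK 3: GEOSPATIAL ANALYSIS")
print("===================================")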

# 1. Visualize the locations of restaurants on a map
# For this task, we'll use folium to create an interactive map.
# First, filter out rows with missing or zero coordinates.
valid_coords = df_processed.dropna(subset=['Latitude', 'Longitude'])
valid_coords = valid_coords[(valid_coords['Latitude'] != 0) &
                            (valid_coords['Longitude'] != 0)]
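
# A minimal sketch of a stricter per-row coordinate check, in plain Python.
# The name is_valid_coord is hypothetical and not part of the original script.
# Unlike the filter above, it rejects only the (0, 0) placeholder pair rather
# than any row with a single zero coordinate, and it also rejects values
# outside the valid latitude/longitude ranges. Both choices are assumptions.

```python
def is_valid_coord(lat, lon):
    """Return True if (lat, lon) looks like a usable coordinate pair."""
    if lat is None or lon is None:
        return False
    # NaN fails every comparison, so the range check also rejects NaN values.
    in_range = -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0
    # Assumption: (0, 0) is a placeholder for "missing", not a real location.
    is_placeholder = lat == 0 and lon == 0
    return in_range and not is_placeholder
```

# With pandas, such a helper could be applied row by row, at the cost of being
# slower than the vectorized filter above.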

# To avoid overwhelming the map, plot at most 1000 restaurants.
# A fixed random_state keeps the sample reproducible.
if len(valid_coords) > 1000:
    map_sample = valid_coords.sample(1000, random_state=42)
else:
    map_sample = valid_coords

# Calculate the center of the map
center_lat = map_sample['Latitude'].mean()
center_lon = map_sample['Longitude'].mean()

# Create a map centered at the mean coordinates
restaurant_map = folium.Map(location=[center_lat, center_lon], zoom_start=2)

# Add a marker cluster to make the map more manageable
marker_cluster = MarkerCluster().add_to(restaurant_map)

# Add markers for each restaurant
for idx, row in map_sample.iterrows():
    popup_text = f"""
    <b>{row['Restaurant Name']}</b><br>
    Cuisine: {row['Cuisines']}<br>
    Rating: {row['Aggregate rating']}<br>
    Price Range: {row['Price range']}<br>
    """

    # Color markers based on rating
    if row['Aggregate rating'] >= 4.0:
        color = 'green'
    elif row['Aggregate rating'] >= 3.0:
        color = 'blue'
    elif row['Aggregate rating'] >= 2.0:
        color = 'orange'
    else:
        color = 'red'

    folium.Marker(
        location=[row['Latitude'], row['Longitude']],
        popup=folium.Popup(popup_text, max_width=300),
        icon=folium.Icon(color=color)
    ).add_to(marker_cluster)
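
# The rating thresholds used for marker colors can be pulled into a small
# helper so they are easy to test in isolation. This is a sketch only:
# rating_to_color is a hypothetical name, and returning 'gray' for a missing
# rating is an added assumption (a NaN rating fails every comparison in the
# loop above and so ends up 'red').

```python
import math


def rating_to_color(rating):
    """Map an aggregate rating to a marker color using the same thresholds."""
    # Assumption: missing ratings (None or NaN) get a neutral color.
    if rating is None or (isinstance(rating, float) and math.isnan(rating)):
        return 'gray'
    if rating >= 4.0:
        return 'green'
    if rating >= 3.0:
        return 'blue'
    if rating >= 2.0:
        return 'orange'
    return 'red'
```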

# Save the map to an HTML file
restaurant_map.save('restaurant_locations_map.html')
print("Interactive map created and saved as 'restaurant_locations_map.html'")
print("Note: The interactive map can't be displayed directly in this notebook output.")

# 2. Analyze the distribution of restaurants across different cities or countries
print("\nDistribution of restaurants across top countries:")
country_distribution = df_processed['Country Code'].value_counts().head(10)
print(country_distribution)

print("\nDistribution of restaurants across top cities:")
city_distribution = df_processed['City'].value_counts().head(10)
print(city_distribution)

# Visualize the distribution
plt.figure(figsize=(14, 8))
country_distribution.plot(kind='bar', color='teal')
plt.title('Distribution of Restaurants Across Top 10 Countries', fontsize=16)
plt.xlabel('Country Code', fontsize=14)
plt.ylabel('Number of Restaurants', fontsize=14)
plt.xticks(rotation=45, fontsize=12)
plt.yticks(fontsize=12)
plt.grid(axis='y', alpha=0.3)
plt.tight_layout()
plt.show()
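
# Raw counts are easier to compare when shown alongside each value's share of
# the total. The sketch below does this in plain Python with
# collections.Counter. The helper name top_n_share and the sample city list in
# the usage lines are made up for illustration.

```python
from collections import Counter


def top_n_share(values, n=10):
    """Return (value, count, share_of_total) for the n most common values."""
    counts = Counter(values)
    total = sum(counts.values())
    # most_common(n) yields (value, count) pairs ordered by descending count.
    return [(value, count, count / total) for value, count in counts.most_common(n)]


# Usage with made-up data
sample_cities = ["New Delhi"] * 3 + ["Noida"]
for city, count, share in top_n_share(sample_cities, n=2):
    print(f"{city}: {count} restaurants ({share:.0%})")
```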

# 3. Determine if there is any correlation between the restaurant's location and its rating
# Group by city and calculate average rating
city_ratings = df_processed.groupby('City')['Aggregate rating'].agg(['mean', 'count']).reset_index()
# Keep only cities with at least 10 restaurants
city_ratings = city_ratings[city_ratings['count'] >= 10]
city_ratings = city_ratings.sort_values('mean', ascending=False)
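
# The group, filter and sort steps above can be sketched without pandas to
# make the logic explicit. mean_rating_by_city is a hypothetical helper, and
# the (city, rating) pair input is an assumed format, not the original data.

```python
from collections import defaultdict


def mean_rating_by_city(records, min_count=10):
    """Return (city, mean_rating, count) tuples sorted by mean, highest first.

    records is an iterable of (city, rating) pairs; cities with fewer than
    min_count ratings are dropped.
    """
    totals = defaultdict(lambda: [0.0, 0])  # city -> [sum of ratings, count]
    for city, rating in records:
        totals[city][0] += rating
        totals[city][1] += 1
    rows = [
        (city, rating_sum / count, count)
        for city, (rating_sum, count) in totals.items()
        if count >= min_count
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)
```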

print("\nAverage ratings by city (for cities with at least 10 restaurants):")
print(city_ratings.head(15))

# Visualize the relationship
plt.figure(figsize=(14, 10))
sns.barplot(x='mean', y='City', data=city_ratings.head(15), palette='viridis')
plt.title('Average Ratings by City (Top 15)', fontsize=16)
plt.xlabel('Average Rating', fontsize=14)
plt.ylabel('City', fontsize=14)
plt.xticks(fontsize=12)
plt.yticks(fontsize=12)
plt.grid(axis='x', alpha=0.3)
plt.tight_layout()
plt.show()

# Check if there's a correlation between latitude/longitude and rating
print("\nCorrelation between location coordinates and rating:")
location_corr = df_processed[['Latitude', 'Longitude', 'Aggregate rating']].corr()
print(location_corr)

# Visualize the correlation with a heatmap
plt.figure(figsize=(10, 8))
sns.heatmap(location_corr, annot=True, cmap='coolwarm', fmt='.2f', linewidths=0.5)
plt.title('Correlation Between Location and Rating', fontsize=16)
plt.xticks(fontsize=12)
plt.yticks(fontsize=12)
plt.tight_layout()
plt.show()
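
# DataFrame.corr() computes the Pearson correlation coefficient by default.
# The sketch below works through that formula in plain Python to show what
# each cell of the matrix above measures. pearson_r is a hypothetical helper,
# and the latitude/rating values in the usage lines are made up.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float('nan')  # undefined when one variable is constant
    return cov / math.sqrt(var_x * var_y)


# Usage with made-up latitude and rating values
latitudes = [12.9, 19.1, 28.6, 40.7]
ratings = [3.9, 4.1, 3.5, 4.4]
print(f"r = {pearson_r(latitudes, ratings):.2f}")
```

# A value near 0 indicates no linear relationship. It does not rule out
# differences between individual cities, which the per-city averages above
# are better suited to show.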
