0% found this document useful (0 votes)
9 views

Startup Ecosystem Analysis Model

The document provides a comprehensive analysis of the Indian startup funding ecosystem, detailing various aspects such as funding trends over time, industry preferences, and the role of location in startup growth. It includes a project description, dataset information, and methodologies for data cleaning and statistical analysis. The insights aim to guide new investors in making informed decisions based on data-driven visualizations and trends.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
9 views

Startup Ecosystem Analysis Model

The document provides a comprehensive analysis of the Indian startup funding ecosystem, detailing various aspects such as funding trends over time, industry preferences, and the role of location in startup growth. It includes a project description, dataset information, and methodologies for data cleaning and statistical analysis. The insights aim to guide new investors in making informed decisions based on data-driven visualizations and trends.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 21

Table of Contents

1. Understanding The Data


1.1. Project Description

1.2. About the Datasets

1.3. Import The Libraries

1.4. Performing Essential Statistical Analysis on the Dataset

1.5. Data Cleaning and Preparation

1.6. Feature Engineering

2. How Does the Funding Ecosystem changes with respect to Time?

3. What is the General Amount that Startups get in India?

4. Which Kind of Industries are more preferred for Startups?

5. Does Location also play a role, In determining the Growth of a Startup?

6. Who plays the main role in Indian Startups Ecosystem?

7. What are the different Types of Funding for Startups?

1. Understanding The Data


1.1. Project Description
As the startup world keeps changing, figuring out how funding works is essential. This information collection
gives a complete picture of how startups get funded in India, including all the different ways it's been done
over time. From the kind of investments available to who the major players are and what industries get
funded the most, this data is like a detailed map that can help people involved in startups make smart
choices and spot new trends.

#### Purpose: To provide strategic guidance to new investors looking to invest in the Indian startup
ecosystem by analyzing data and visualizing which sectors, cities, and types of investments have the
highest potential

#### Project Importance: Examining this project helps new investors make more informed and strategic
decisions in the Indian startup ecosystem. Data-driven insights and visualizations enable investors to
minimize risks and capitalize on high-potential opportunities.
1.2. About the Datasets
- Dataset Descriptions: 'stocks_daily_prices.csv' / 'stocks_daily_returns.csv.

Content: Daily stock prices for various companies.


Rows: 3044
Columns-: 10
Sr No: Serial number.A unique identifier for each re-cord.
Date dd/mm/yyyy: The date when the funding event took place.
Startup Name: The name of the startup receiving the funding.
Industry Vertical: The primary industry to which the startup belongs.
SubVertical: A more specific category within the primary industry.
City Location: The city where the startup is headquartered.
Investors Name: The names of the investors or investment firms involved in the funding.
InvestmentnType: The type of investment (e.g., Seed, Series A, Series B).
Amount in USD: The amount of funding received in US dollars.
Remarks: Additional comments or details about thels about the funding event.

1.3. Import The Libraries


In [46]: import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import seaborn as sns
import folium
from folium.plugins import MarkerCluster
from wordcloud import WordCloud
%matplotlib inline
import warnings
warnings.filterwarnings("ignore")
warnings.warn("this will not show")
pd.set_option('display.max_columns',None)
pd.set_option('display.max_rows', None)

In [47]: df0=pd.read_csv('startup_funding.csv')
df=df0.copy()

1.4. Performing Essential Statistical Analysis on the Dataset


In [48]: # Dimensions of the Data Set - (rows, columns)
df.shape

(3044, 10)
Out[48]:

In [49]: # Preview of Data Set


df.head()

Out[49]: Sr Date Startup Industry City Investors Amou


SubVertical InvestmentnType
No dd/mm/yyyy Name Vertical Location Name

Tiger Global Private Equity


0 1 9/1/2020 BYJU’S E-Tech E-learning Bengaluru 20,00,00
Management Round

1 2 13/01/2020 Shuttl Transportation App based Gurgaon Susquehanna Series C 80,48


shuttle Growth
service Equity

Retailer of
baby and Sequoia
2 3 9/1/2020 Mamaearth E-commerce Bengaluru Series B 1,83,58
toddler Capital India
products

Online Vinod
3 4 2/1/2020 Wealthbucket FinTech New Delhi Pre-series A 30,00
Investment Khatumal

Embroiled Sprout
Fashion and
4 5 2/1/2020 Fashor Clothes For Mumbai Venture Seed Round 18,00
Apparel
Women Partners

In [50]: # Data Type Properties


df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3044 entries, 0 to 3043
Data columns (total 10 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Sr No 3044 non-null int64
1 Date dd/mm/yyyy 3044 non-null object
2 Startup Name 3044 non-null object
3 Industry Vertical 2873 non-null object
4 SubVertical 2108 non-null object
5 City Location 2864 non-null object
6 Investors Name 3020 non-null object
7 InvestmentnType 3040 non-null object
8 Amount in USD 2084 non-null object
9 Remarks 419 non-null object
dtypes: int64(1), object(9)
memory usage: 237.9+ KB

In [51]: df.describe(include="object").T

Out[51]: count unique top freq

Date dd/mm/yyyy 3044 1035 2/2/2015 11

Startup Name 3044 2459 Ola Cabs 8

Industry Vertical 2873 821 Consumer Internet 941

SubVertical 2108 1942 Online Lending Platform 11

City Location 2864 112 Bangalore 700

Investors Name 3020 2412 Undisclosed Investors 39

InvestmentnType 3040 55 Private Equity 1356

Amount in USD 2084 464 10,00,000 165

Remarks 419 71 Series A 175

In [52]: # Checking Null Values


(df.isnull().sum() / df.shape[0] *
100).sort_values(ascending=False).round(2).astype(str) + ' %'

Remarks 86.24 %
Out[52]:
Amount in USD 31.54 %
SubVertical 30.75 %
City Location 5.91 %
Industry Vertical 5.62 %
Investors Name 0.79 %
InvestmentnType 0.13 %
Sr No 0.0 %
Date dd/mm/yyyy 0.0 %
Startup Name 0.0 %
dtype: object

1.5. Data Cleaning and Preparation


In [53]: #Changing commas in the 'Amount in USD' column
df['Amount in USD']=df['Amount in USD'].apply(lambda x: str(x).replace(',',''))

In [54]: # Correction of incorrect values from 'Amount in USD' column


replace_map={
"Undisclosed":"0",
"unknown":"0",
"undisclosed":"0",
"\\xc2\\xa020000000":"0",
"N/A":"0",
"nan":"0",
"\\xc2\\xa020000000":"0"
}
df['Amount in USD']=df['Amount in USD'].apply(lambda x: replace_map.get(str(x),x))

In [55]: # Conversion to digital data


df['Amount in USD'] = pd.to_numeric(df['Amount in USD'])

In [56]: # Replacing 'Amount in USD' 0 values with empty values


df["Amount in USD"]=df["Amount in USD"].replace(0,np.nan)

In [57]: # Replace empty values with average


df["Amount in USD"].fillna(df["Amount in USD"].mean(), inplace=True)

In [58]: # Correcting incorrect date values


data_replace_map={
'12/05.2015':'12/05/2015',
'13/04.2015':'13/04/2015',
'15/01.2015':'15/01/2015',
'22/01//2015':'22/01/2015/',
'05/072018':'05/07/2018',
'01/07/015':'01/07/2015',
'\\xc2\\xa010/7/2015': '10/07/2015',
'\\\\xc2\\\\xa010/7/2015': '10/07/2015'
}
df['Date dd/mm/yyyy']=df['Date dd/mm/yyyy'].apply(lambda x:data_replace_map.get(x,x))

In [59]: # Convert to datetime type by specifying the date format


df['Date dd/mm/yyyy'] = pd.to_datetime(df['Date dd/mm/yyyy'],
format='%d/%m/%Y',
errors='coerce')

In [60]: # 86.24% of the 'Remarks' column consists of empty values, so we remove this line
df.drop('Remarks', axis=1, inplace=True)

In [61]: # Replacing 'Bengaluru' used in the data set with the more common name 'Bangalore'
df['City Location'][df['City Location'] ==
'Bengaluru'] = 'Bangalore'

In [62]: # Change the name in the 'Undisclosed investors' column to 'Undisclosed Investors'
investor_replace_map={
'Undisclosed investors': 'Undisclosed Investors',
'Undisclosed Investor': 'Undisclosed Investors',
'undisclosed investors': 'Undisclosed Investors',
'Undisclosed': 'Undisclosed Investors'
}
df['Investors Name']=df['Investors Name'].apply(lambda x:investor_replace_map.get(x,x))

In [63]: # Removal of the gap in 'Ola Cabs'.


df['Startup Name'][df['Startup Name'] == 'Ola Cabs'] = 'OlaCabs'

In [64]: # Replace with a more commonly used word


investment_type_replace_map = {
'Seed/ Angel Funding': 'Seed / Angel Funding',
'Seed\\nFunding': 'Seed Funding',
'Seed/Angel Funding': 'Seed / Angel Funding',
'Angel / Seed Funding': 'Seed / Angel Funding'
}
df['InvestmentnType']=df['InvestmentnType'].apply(lambda x:investment_type_replace_map.g

In [65]: # Standardizing common industry terms using regex and string replacement
replacements = {
r'\be[ -]?commerce\b': 'e-commerce',
r'\bfintech\b': 'fintech',
r'\bhealth[ -]?tech\b': 'healthtech',
r'\bedu[ -]?tech\b': 'edtech',
r'\bfood[ -]?(tech|delivery)\b': 'food & beverage',
r'\btransportation|logistics\b': 'transportation & logistics',
r'\bconsumer internet\b': 'consumer internet',
r'\btechnology\b': 'technology',
r'\bagri[ -]?tech\b': 'agritech',
r'\bauto[ -]?tech\b': 'autotech',
r'\bmedia\b': 'media',
r'\bfinance\b': 'finance',
r'\bunknown\b': 'other'
}
# Applying replacements
for pattern, replacement in replacements.items():
df['Industry Vertical']=df['Industry Vertical'].str.replace(pattern, replacement,reg

In [66]: df['Industry Vertical']=df['Industry Vertical'].fillna('unknown')


df['IndustY Vertical']=df['Industry Vertical'].str.lower()
def clean_industry(industry):
parts=industry.split('&')
cleaned_parts=[]
for part in parts:
if part not in cleaned_parts:
cleaned_parts.append(part)
if len(cleaned_parts)==2:
break
return '&'.join(cleaned_parts)
df['Industry Vertical']=df['Industry Vertical'].apply(clean_industry)

1.6. Feature Engineering


In [67]: # Create the 'Year Month' column
df['Year Month']=(df['Date dd/mm/yyyy'].dt.year*100+df['Date dd/mm/yyyy'].dt.month)
# Let's check that the conversion was successful
df[['Date dd/mm/yyyy','Year Month']].head()

Out[67]: Date dd/mm/yyyy Year Month

0 2020-01-09 202001.0

1 2020-01-13 202001.0

2 2020-01-09 202001.0
3 2020-01-02 202001.0

4 2020-01-02 202001.0

In [68]: df.head()

Out[68]: Sr Date Startup Industry City Investors Amou


SubVertical InvestmentnType
No dd/mm/yyyy Name Vertical Location Name

Tiger Global Private Equity


0 1 2020-01-09 BYJU’S E-Tech E-learning Bangalore 2000000
Management Round

App based Susquehanna


1 2 2020-01-13 Shuttl Transportation shuttle Gurgaon Growth Series C 80483
service Equity

Retailer of
baby and Sequoia
2 3 2020-01-09 Mamaearth E-commerce Bangalore Series B 183588
toddler Capital India
products

Online Vinod
3 4 2020-01-02 Wealthbucket FinTech New Delhi Pre-series A 30000
Investment Khatumal

Embroiled Sprout
Fashion and
4 5 2020-01-02 Fashor Clothes For Mumbai Venture Seed Round 18000
Apparel
Women Partners

2. How Does the Funding Ecosystem changes with respect to Time?


In [69]: # Convert 'Date' column to datetime format
df['Date'] = pd.to_datetime(df['Date dd/mm/yyyy'], format='%d/%m/%Y')
# Add Year and Month columns
df['Year']=df['Date'].dt.year
df['Month']=df['Date'].dt.month
# Yearly funding trend
funding_trend_yearly=df.groupby('Year')['Amount in USD'].sum().reset_index()
# Monthly funding trend
funding_trend_monthly=df.groupby(['Year','Month'])['Amount in USD'].sum().reset_index()
funding_trend_monthly['Month']=funding_trend_monthly['Month'].astype(str)
plt.figure(figsize=(10,6))
plt.plot(funding_trend_yearly['Year'], funding_trend_yearly['Amount in USD'], marker='o'
plt.title('Yearly Funding Trend')
plt.xlabel('Year')
plt.ylabel('Total Funding(USD)')
plt.grid(True)
plt.show()
fig=make_subplots(rows=1, cols=1, subplot_titles=('Monthly Funding Trend'))
# Monthly funding trend with selected years highlighted
colors={2015:'red', 2017:'green', 2019:'purple'}
for year in funding_trend_monthly['Year'].unique():
monthly_data=funding_trend_monthly[funding_trend_monthly['Year']==year]
color=colors.get(year,'gray')
fig.add_trace(
go.Scatter(x=monthly_data['Month'], y=monthly_data['Amount in USD'], mode='lines
name=str(year), line=dict(color=color)),
row=1, col=1
)
# Update layout
fig.update_layout(title_text='Monthly Funding Trend', height=600)
fig.update_xaxes(title_text='Month',row=1,col=1)
fig.update_yaxes(title_text='Total Funding Amount (USD)',row=1,col=1)
# Show the plot
fig.show()
Findings:
Yearly Funding Pattern:
The funding amounts demonstrate significant fluctuation year-over-year.
Funding reached a peak in 2017, nearing $10 billion total.
A substantial decrease followed in 2018, with another increase observed in 2019.
The data for 2020 shows a sharp decline, but it's likely incomplete because the year is partial or
ongoing. #### Potential Causes:

Economic cycles, investor sentiment, and macroeconomic factors might be influencing these variations.

Large funding rounds or significant investments in specific years can cause surges.

Monthly Funding Pattern:


The monthly funding trend exposes finer details, with numerous peaks and valleys.
There are substantial spikes in certain months, particularly in mid-2017 and mid-2019.
There appears to be some seasonality, with certain periods consistently exhibiting higher funding
activity.

In [70]: # Extract year and month for monthly funding analysis


df['Year']=df['Date dd/mm/yyyy'].dt.year
df['Month']=df['Date dd/mm/yyyy'].dt.month
# Group by Year and Month to get monthly funding amounts
monthly_funding=df.groupby(['Year','Month'])['Amount in USD'].sum().reset_index()
# Unique years in the dataset
years=monthly_funding['Year'].unique()
# Plotting monthly funding amounts for each year using subplots
fig, axs=plt.subplots(len(years)//2,2,figsize=(14,7*(len(years)//2)))
color='#4878A2'
for i,year in enumerate(years):
row=i//2
col=i%2
sns.barplot(x='Month',
y='Amount in USD',
data=monthly_funding[monthly_funding['Year']==year],
color=color,
ax=axs[row,col])
axs[row,col].set_title(f'Monthly Funding in {year}')
axs[row,col].set_xlabel('Month')
axs[row,col].set_ylabel('Funding Amount (USD)')
axs[row,col].grid(True)
plt.tight_layout()
plt.show()
Findings:
Seasonal Distribution:
Certain months, like January, July, and August, consistently exhibit higher funding levels, hinting at
potential seasonal patterns. Yearly Variations:

Peak funding:
months differ across years, suggesting that specific events or investments may influence funding
activity.

Funding Spikes:
Substantial surges in funding can be explained by large investment rounds or prominent startups
receiving funding.

Analysis Insight:
This examination provides a clear picture of how funding is dispersed throughout the year, emphasizing
significant periods of investment activity.

3. What is the General Amount that Startups get in India?


In [71]: # Preview of the details of the 10 most funded Initiatives
df.sort_values('Amount in USD',ascending=False).head(10)

Out[71]: Sr Date Startup Industry City Investors Am


SubVertical InvestmentnType
No dd/mm/yyyy Name Vertical Location Name

Rapido Bike Westbridge


60 61 2019-08-27 Transportation Bike Taxi Bangalore Series B 3.900
Taxi Capital

Online
651 652 2017-08-11 Flipkart eCommerce Bangalore Softbank Private Equity 2.500
Marketplace

Microsoft,
ECommerce eBay,
966 967 2017-03-21 Flipkart eCommerce Bangalore Private Equity 1.400
Marketplace Tencent
Holdings

Mobile
Wallet & SoftBank
830 831 2017-05-18 Paytm ECommerce Bangalore Private Equity 1.400
ECommerce Group
platform

Vijay
Mobile
31 32 2019-11-25 Paytm FinTech Noida Shekhar Funding Round 1.000
Wallet
Sharma

Steadview
Online Capital and
2648 2649 2015-07-28 Flipkart.com NaN Bangalore Private Equity 7.000
Marketplace existing
investors

Alibaba
2459 2460 2015-09-29 Paytm E-Commerce NaN New Delhi Group, Ant Private Equity 6.800
Financial

Private
188 189 2018-08-30 True North Finance Mumbai NaN Private Equity 6.000
Equity Firm

33 34 2019-10-02 Udaan B2B Business Bangalore Altimeter Series D 5.850


development Capital,
DST
Global

Baillie
Gifford,
Car Falcon
2244 2245 2015-11-18 Ola NaN Bangalore Private Equity 5.000
Aggregator Edge
Capital,
Tiger Gl...

In [72]: # Preview of the least funded initiatives


df.sort_values(by='Amount in USD').head(10)

Out[72]: Sr Date Startup Industry City Investors Amount


SubVertical InvestmentnType
No dd/mm/yyyy Name Vertical Location Name in USD

Hyderabad
Angels (at
3020 3021 2015-01-19 Enabli unknown NaN NaN Startup Seed Funding 16000.0 u
Heroes
event)

Hyderabad
Angels (at
3021 3022 2015-01-19 CBS unknown NaN NaN Startup Seed Funding 16000.0 u
Heroes
event)

Hyderabad
Angels (at
3019 3020 2015-01-19 Yo Grad unknown NaN NaN Startup Seed Funding 16000.0 u
Heroes
event)

Hyderabad
Angels (at
Play your
3018 3019 2015-01-19 unknown NaN NaN Startup Seed Funding 16000.0 u
sport
Heroes
event)

Hyderabad
Angels (at
3017 3018 2015-01-19 Hostel Dunia unknown NaN NaN Startup Seed Funding 16000.0 u
Heroes
event)

Group of
2933 2934 2015-02-02 Faaya unknown NaN NaN Angel Seed\\nFunding 16600.0 u
Investors

Group of
2934 2935 2015-02-02 InstaBounce unknown NaN NaN Angel Seed\\nFunding 16600.0 u
Investors

Group of
Chloroplast
2935 2936 2015-02-02 unknown NaN NaN Angel Seed\\nFunding 16600.0 u
Foods
Investors

Group of
2936 2937 2015-02-02 Dealwithus unknown NaN NaN Angel Seed\\nFunding 16600.0 u
Investors

Group of
2937 2938 2015-02-02 CleverSharks unknown NaN NaN Angel Seed\\nFunding 16600.0 u
Investors

In [73]: # Calculate the average funding amount


average_funding=df['Amount in USD'].mean()
# Log-transform the funding amounts for better visualization
df['Log Amount in USD']=np.log10(df['Amount in USD']+1)
# Plot the log-transformed funding amount distribution
plt.figure(figsize=(14, 8))
sns.histplot(df['Log Amount in USD'], bins=50, kde=True, color='#4878A2')
plt.axvline(np.log10(average_funding + 1), color='r', linestyle='--', label=f'Log Averag
plt.title('Log-Scaled Distribution of Funding Amounts for Startups in India')
plt.xlabel('Log Funding Amount (USD)')
plt.ylabel('Number of Startups')
plt.legend()
plt.grid(True)
plt.show()

4. Which Kind of Industries are more preferred for Startups?


In [74]: # Identify the top 10 industries
top_industries=df[(df['Industry Vertical']!='unknown')]['Industry Vertical'].value_count
top_industries.columns=['Industry Vertical','Count']
# Create the bar plot
plt.figure(figsize=(14,8))
sns.barplot(x='Count',y='Industry Vertical', data=top_industries, color='#4878A2')
plt.title('Top 10 Preferred Indusries for Startups')
plt.xlabel('Number of Starups')
plt.ylabel('Industry Vertical')
plt.grid(True)
plt.show()
In [75]: # Count the number of startups in each city
top_cities_count = df['City Location'].value_counts().head(10).reset_index()
top_cities_count.columns = ['City', 'Count']
# Plot the number of startups by city
plt.figure(figsize=(14, 8))
sns.barplot(x='Count', y='City', data=top_cities_count, color='#4878A2')
plt.title('Top 10 Cities by Number of Startups')
plt.xlabel('Number of Startups')
plt.ylabel('City')
plt.grid(True)
plt.show()

Output: Industry Preferences Analysis


Number of Funding Rounds per Industry
Top Industries:
Consumer Internet: Leading with the highest number of funding rounds (589 rounds).
Technology: Second most active with 310 funding rounds.
E-commerce: Significant presence with 170 funding rounds.

Total Funding Amount per Industry

Top Funded Industries:


E-commerce: Secured the highest total funding amount, indicating large investments in this sector
(7.16 billion Dollars).
Consumer Internet: Close behind with substantial funding (6.25 billion Dollars).
Technology: Also received considerable funding (2.23 billion Dollars). #### Insights

Active Sectors: Consumer Internet and Technology sectors are highly active in terms of funding rounds.

High Investment Sectors: E-commerce and Consumer Internet attract the highest total funding,
reflecting investor confidence and market potential in these sectors.
Industry Dynamics: The analysis highlights which industries are more preferred by investors and which
sectors secure larger investments.

In [76]: # Ensure 'Date' column is in datetime format


df['Date'] = pd.to_datetime(df['Date dd/mm/yyyy'], format='%d/%m/%Y')

# Extract year and group by year and industry


df['Year'] = df['Date'].dt.year
yearly_industry_count = df.groupby(['Year', 'Industry Vertical']).size().unstack().filln
# Plotting the line plots for top industries over time

top_industries_list = top_industries['Industry Vertical'].head(5)

plt.figure(figsize=(14, 8))
for industry in top_industries_list:
plt.plot(yearly_industry_count.index, yearly_industry_count[industry], marker='o', l
plt.title('Number of Startups Founded Over Time by Industry')
plt.xlabel('Year')
plt.ylabel('Number of Startups')
plt.legend()
plt.grid(True)
plt.show()
Output:Top 5 Industry Choice Analysis
Consumer Internet:

Peak in 2016: The number of consumer internet startups saw a significant peak in 2016 with over 500
startups founded. Sharp Decline: After 2016, there is a sharp decline, indicating a reduction in new
consumer internet startups over the subsequent years. Technology:

Steady Growth and Decline: Technology startups grew steadily, peaking in 2016 with around 200 startups,
followed by a decline similar to the consumer internet trend. Consistency: Despite the decline, the number
of technology startups remains relatively consistent compared to other industries.

E-Commerce:

Initial Growth: E-commerce startups showed initial growth, peaking in 2016 with about 150 startups.
Gradual Decline: There is a gradual decline after 2016, but not as steep as the consumer internet sector.

Healthcare:

Stability: The healthcare industry shows relative stability with slight fluctuations, peaking modestly in 2016
and maintaining a lower but consistent presence.

Insights
2016 as a Pivotal Year: Most industries, especially consumer internet, technology, and e-commerce, peaked
in 2016. This indicates a significant year for startup formations across these sectors. Post-2016, there is a
noticeable decline in new startup formations, which could be due to market saturation, changing investment
climates, or shifts in entrepreneurial focus.

Consumer Internet and Technology Leading: These two sectors have the highest peaks, indicating high
interest and investment in these areas during their peak years. The sharp decline post-2016 suggests
potential over-saturation or a shift in investor interest.

Steady but Low Growth in Healthcare: Healthcare startups show steady but lower growth compared to other
sectors, suggesting a more stable but less explosive industry.

Potential Reasons for Trends:


Economic Factors: Changes in the economic environment, funding availability, and investor sentiment
could explain the peak and subsequent decline. Market Saturation: High initial growth could lead to
market saturation, causing a drop in new startup formations in subsequent years.
Shifts in Focus: Emerging technologies and changing market demands might shift entrepreneurial
focus to other areas over time.

5. Does Location also play a role, In determining the Growth


of a Startup?
In [77]: top_cities_funding = df.groupby('City Location')['Amount in USD'].sum().reset_index()
top_cities_funding = top_cities_funding.sort_values(by='Amount in USD', ascending=False)
top_cities_funding.columns = ['City', 'Total Funding Amount']

# Plot the total funding amount by city


plt.figure(figsize=(14, 8))
sns.barplot(x='Total Funding Amount', y='City', data=top_cities_funding, color='#4878A2'
plt.title('Top 10 Cities by Total Funding Amount')
plt.xlabel('Total Funding Amount (USD)')
plt.ylabel('City')
plt.grid(True)
plt.show()

Output:Analysis of the Top 10 Cities by Number of Startups


Bangalore as the Primary Hub:
Bangalore's significant lead in the number of startups highlights its role as the primary tech and innovation
hub in India. The city's infrastructure, talent pool, and supportive ecosystem attract a large number of
startups.

Mumbai and New Delhi's Strong Presence:

Mumbai and New Delhi's high ranks underscore their importance in the Indian startup ecosystem. Mumbai's
financial prowess and New Delhi's political and incubator support contribute to their strong startup cultures.

Emergence of Other Cities:

Cities like Gurgaon, Pune, and Hyderabad show significant numbers of startups, indicating the
diversification of the startup ecosystem beyond the primary hubs. These cities offer favorable conditions
such as talent availability, infrastructure, and government support.

Regional Clusters:

The presence of multiple cities from the National Capital Region (NCR) like New Delhi, Gurgaon, and Noida
highlights the region's attractiveness for startups. Proximity to the capital and good connectivity are key
factors.

Supporting Infrastructure and Ecosystems:

The distribution of startups across these cities suggests that supporting infrastructure, educational
institutions, corporate presence, and government policies play crucial roles in fostering startup growth.

Conclusion
The chart indicates that location significantly influences startup growth. Cities with strong ecosystems,
infrastructure, and support systems tend to have higher concentrations of startups. Understanding these
dynamics can help stakeholders, including investors, entrepreneurs, and policymakers, make informed
decisions about where to focus their efforts and resources.

6. Who plays the main role in Indian Startups Ecosystem?


In [78]: # Investor analysis
investor_funding = df['Investors Name'].value_counts().reset_index()
investor_funding.columns = ['Investor', 'Number of Investments']

# Top 10 most invested investors


top_investors = investor_funding.head(10)

plt.figure(figsize=(14, 7))
plt.barh(top_investors['Investor'],
top_investors['Number of Investments'],
color='#4878A2')
plt.xlabel('Number of Investments')
plt.ylabel('Investor')
plt.title('Top 10 Investors in Indian Startup Ecosystem')
plt.gca().invert_yaxis()
plt.show()
In [79]: # Extract the most active startup founders (this is a simplification)
# Split multiple founders in a single row (assuming founders are listed in 'Startup Name
df['Founders'] = df['Startup Name'].fillna('Unknown').str.split(',')

# Explode the list of founders into separate rows


founders_exploded = df.explode('Founders')
top_founders = founders_exploded['Founders'].value_counts().head(10).reset_index()
top_founders.columns = ['Founder', 'Number of Startups']

# Plot the top founders


plt.figure(figsize=(14, 8))
sns.barplot(x='Number of Startups', y='Founder', data=top_founders, color='#4878A2')
plt.title('Top 10 Founders by Number of Startups')
plt.xlabel('Number of Startups')
plt.ylabel('Founder')
plt.grid(True)
plt.show()
7. What are the different Types of Funding for Startups?
In [40]: df.head()

Out[40]: Sr Date Startup Industry City Investors Amou


SubVertical InvestmentnType
No dd/mm/yyyy Name Vertical Location Name

Tiger Global Private Equity


0 1 2020-01-09 BYJU’S E-Tech E-learning Bangalore 2000000
Management Round

App based Susquehanna


1 2 2020-01-13 Shuttl Transportation shuttle Gurgaon Growth Series C 80483
service Equity

Retailer of
baby and Sequoia
2 3 2020-01-09 Mamaearth E-commerce Bangalore Series B 183588
toddler Capital India
products

Online Vinod
3 4 2020-01-02 Wealthbucket FinTech New Delhi Pre-series A 30000
Investment Khatumal

Embroiled Sprout
Fashion and
4 5 2020-01-02 Fashor Clothes For Mumbai Venture Seed Round 18000
Apparel
Women Partners

In [41]: from wordcloud import WordCloud


investment_types=df['InvestmentnType'].value_counts().reset_index()
investment_types.columns=['Investment Type','Number of Investments']
# Convert the investment types to a dictionary
investment_dict=dict(zip(investment_types['Investment Type'],investment_types['Number of
#generate a word cloud
wordcloud=WordCloud(width=800, height=400,background_color='white',colormap='coolwarm').
# Plot the word cloud
plt.figure(figsize=(14, 8))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis('off')
plt.title('Types of Funding for Startups')
plt.show()

In [42]: # Assuming df is the cleaned dataframe with startup data


# Count the number of each investment type
investment_types = df['InvestmentnType'].value_counts().head(10).reset_index()
investment_types.columns = ['Investment Type', 'Number of Investments']

# Plot the investment types


plt.figure(figsize=(14, 8))
sns.barplot(x='Number of Investments', y='Investment Type', data=investment_types, color
plt.title('Types of Funding for Startups')
plt.xlabel('Number of Investments')
plt.ylabel('Investment Type')
plt.grid(True)
plt.show()

Output:Analysis of the Types of Funding for Startups


Private Equity: Most Common Funding Type: Private Equity is the most common funding type, with
nearly 1,400 instances. This indicates that many startups in the dataset have reached a level of
maturity where they can attract significant private equity investments.

Seed Funding: Second Most Common: Seed Funding is close behind Private Equity, with a similar
number of instances. This suggests that many startups are in the early stages of their lifecycle, seeking
initial capital to develop their ideas and products.

Seed / Angel Funding: Early-Stage Investments: Seed / Angel Funding is also prominent, with a
significant number of instances. This type of funding is crucial for startups to get off the ground and
demonstrates the active role of angel investors in the ecosystem.

Diverse Funding Landscape: The chart demonstrates a diverse landscape of funding types, from early-
stage seed funding to later-stage private equity. This diversity is crucial for catering to the varying
needs of startups at different stages of their growth.

Importance of Early-Stage Funding: The high frequency of Seed Funding and Seed / Angel Funding
underscores the importance of early-stage investments in nurturing new startups. These funding types
are critical for startups to develop their initial ideas and products.
Private Equity's Dominance: The dominance of Private Equity highlights the significant role of large-
scale investments in the startup ecosystem. It suggests that many startups in the dataset have
achieved substantial growth and maturity, making them attractive targets for private equity investors.

Growth Funding Rounds: The presence of Series A, B, C, and D funding rounds, although less
frequent, indicates a structured path for startups to secure additional capital as they grow. Each
subsequent round typically involves larger amounts of funding and is aimed at scaling the business.

Role of Debt Funding: While less common, Debt Funding provides an alternative financing route for
startups. This can be particularly useful for startups that want to avoid equity dilution or have specific
capital requirements that debt can fulfill

You might also like