DA Portfolio Project
PORTFOLIO PROJECT
PREPARED BY DEEPAK YADAV
Email Id:[email protected]
PROFESSIONAL BACKGROUND
EDUCATION
Completed B.E. in 2018.
Completed the Data Analytics Trainee certificate course from Trainity.
Currently learning Python & Tableau.
OTHER SKILLS
Predictive Analytics
Risk Analytics
Behavioral Analytics
TABLE OF CONTENTS
Data Analytics Process: Project description, 6 steps process, Project Link
Instagram User Analytics: Project description, Findings & Approach, Insights, Project Link
Hiring Process Analytics: Project description, Findings & Approach, Insights, Project Link
IMDB Movie Analysis: Project description, Findings & Approach, Insights, Project Link
Bank Loan Case Study: Project description, Business Understanding, Findings & Approach, Insights, Project Link
XYZ Ads Airing Reports: Project description, Findings & Approach, Insights, Project Link
ABC Call Volume Trend: Project description, Findings & Approach, Insights, Project Link
Conclusion: Learning, Portfolio Link
Data Analytics Process
6 Steps Process:
Plan:
Before opening Google, we first decide what we need to search for and which
information we want. For example, we want to know about different machine
learning algorithms.
Prepare:
Next, we check which website will give us the correct information in the simplest way.
Process:
Then we decide how much we want from the data, for example the differences
between the algorithms, or which algorithm is the best choice in different
scenarios.
Analyze:
We then analyze the different algorithms. Suppose we are working on a project and
do not know how to apply the algorithms or which one will be the best choice for
that project.
Share:
Now we search on Google, and Google shows the best results at the top by
using the process of data analytics / data science.
Act:
Finally, we click a result, get the information we need, and use that
information in our project.
Project Link:
https://drive.google.com/file/d/1rB0VmFRLSrMdRUciMiPcUqBf02QRsBnW/view?usp=sharing
Instagram User Analytics
Description:
This project is about how users engage and interact with Instagram.
We analyse these users to derive business insights for the marketing,
product and development teams. Teams across the business then use these
insights to launch new marketing campaigns, decide which features to
build for the app, track the success of the app by measuring user
engagement, and improve the overall experience while helping the
business grow.
Approach:
We are working with the product team of Instagram, and the product
manager has asked us to provide insights on the questions asked by the
management team. We use SQL to derive different insights from the
dataset provided by the management team. First, we run the necessary
commands to create the database to work on. Then, we perform the
analysis to generate valuable insights for the company.
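As a minimal sketch of this workflow (create the database, then query it), the example below uses Python's built-in sqlite3 module. The table names (users, photos, likes), their columns, the sample rows, and the rule "an account that has liked every photo is a suspected fake" are all assumptions made for illustration; the project summary does not describe the actual schema or how fake accounts were identified.

```python
import sqlite3


def build_demo_database():
    """Create an in-memory database with a hypothetical, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users  (id INTEGER PRIMARY KEY, username TEXT NOT NULL);
        CREATE TABLE photos (id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL);
        CREATE TABLE likes  (user_id INTEGER NOT NULL,
                             photo_id INTEGER NOT NULL,
                             PRIMARY KEY (user_id, photo_id));
        """
    )
    # Toy rows, not data from the project.
    conn.executemany("INSERT INTO users VALUES (?, ?)",
                     [(1, "alice"), (2, "bob"), (3, "bot_01")])
    conn.executemany("INSERT INTO photos VALUES (?, ?)",
                     [(10, 1), (11, 1), (12, 2)])
    conn.executemany("INSERT INTO likes VALUES (?, ?)",
                     [(1, 12), (2, 10), (3, 10), (3, 11), (3, 12)])
    conn.commit()
    return conn


def find_suspected_fake_accounts(conn):
    """Return usernames that have liked every photo (assumed heuristic)."""
    query = """
        SELECT u.username
        FROM users AS u
        JOIN likes AS l ON l.user_id = u.id
        GROUP BY u.id, u.username
        HAVING COUNT(DISTINCT l.photo_id) = (SELECT COUNT(*) FROM photos)
        ORDER BY u.username
    """
    return [row[0] for row in conn.execute(query)]


demo_conn = build_demo_database()
suspected = find_suspected_fake_accounts(demo_conn)
print(suspected)
```

The share of suspected accounts would then be the length of this list divided by the number of rows in users.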
Insights:
13% of Instagram IDs are fake and dummy accounts.
Project Link:
https://docs.google.com/document/d/1XEbjny2iJtzAn3jI4lvf8CguvRJgcq-N/edit?usp=sharing&ouid=109059427539152145245&rtpof=true&sd=true
Operation Analytics & Investigating Metric Spikes
Description:
Operation Analytics is the analysis done for the complete end-to-end
operations of a company. With its help, the company finds the areas in
which it must improve.
As one of the most important parts of a company, this kind of analysis
further supports better understanding between cross-functional teams and
more effective workflows.
The number of distinct jobs reviewed per hour per day for November
2020 is 83%.
We used the 7-day rolling average of throughput because it averages over
all seven days, from day 1 to day 7, whereas the daily metric reflects
only that particular day.
Persian has the largest percentage share of all languages (37.5%).
There are two duplicate rows if we partition the data by job_id. But if we
look at all the columns together, all the rows are unique.
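The 7-day rolling average and the two kinds of duplicate check described above can be sketched in plain Python as follows. This is only an illustration: the function names, the column names other than job_id, and the sample values are hypothetical, and the choice to average over a shorter window for the first six days is an assumption, not something the project states.

```python
from collections import Counter


def rolling_average(values, window=7):
    """Trailing mean over the most recent `window` values.

    For the first window-1 entries the mean covers only the days seen so far
    (an assumed convention).
    """
    averages = []
    running_sum = 0.0
    for i, value in enumerate(values):
        running_sum += value
        if i >= window:
            running_sum -= values[i - window]  # drop the day leaving the window
        averages.append(running_sum / min(i + 1, window))
    return averages


def duplicate_keys(rows, key):
    """Values of `key` that occur in more than one row (partition by key)."""
    counts = Counter(row[key] for row in rows)
    return sorted(k for k, n in counts.items() if n > 1)


def duplicate_full_rows(rows):
    """Rows that are identical across all columns."""
    counts = Counter(tuple(sorted(row.items())) for row in rows)
    return [dict(items) for items, n in counts.items() if n > 1]


# Toy daily throughput values, not data from the project.
daily_throughput = [10, 20, 30, 40, 50, 60, 70, 80]
smoothed = rolling_average(daily_throughput)

# Toy rows: job_id 21 repeats, but the rows differ in another column.
job_rows = [
    {"job_id": 21, "event": "skip", "language": "English"},
    {"job_id": 21, "event": "transfer", "language": "English"},
    {"job_id": 22, "event": "decision", "language": "Persian"},
]
repeated_job_ids = duplicate_keys(job_rows, "job_id")
identical_rows = duplicate_full_rows(job_rows)
print(smoothed[6], smoothed[7], repeated_job_ids, identical_rows)
```

A single unusually busy or quiet day shifts the rolling value by only one seventh of its size, which is why the smoothed series is easier to read than the daily one.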
2. Case Study 2 (Investigating metric spike):
Weekly user engagement increased from week 18 to week 31 and declined
from then onward. This suggests that some users did not find much
quality in the product/service in the later weeks.
There are 9381 active users in total from the 1st week of 2013 to the
35th week of 2014.
The overall count of weekly engagement per device is highest for
MacBook and iPhone users.
The email open rate is around 34% and the email click rate is around
15%. Users are engaging with the email service, which is good for the
company's expansion.
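One way to compute email open and click rates is as a share of the emails sent. The sketch below assumes that definition and uses hypothetical event labels ("sent", "open", "click") with toy values; the project summary does not state the exact formula or event names used.

```python
from collections import Counter


def email_rates(events):
    """Open and click rates as a percentage of sent emails (assumed definition)."""
    counts = Counter(events)
    sent = counts["sent"]
    if sent == 0:
        raise ValueError("no sent emails to compute rates against")
    return {
        "open_rate_pct": 100.0 * counts["open"] / sent,
        "click_rate_pct": 100.0 * counts["click"] / sent,
    }


# Toy event log, not data from the project: 5 sent, 2 opened, 1 clicked.
toy_events = ["sent", "sent", "open", "sent", "click", "sent", "open", "sent"]
rates = email_rates(toy_events)
print(rates)
```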
Hiring Process Analytics
Approach:
I am working for an MNC such as Google as a lead Data Analyst. The
company has provided the data records of its previous hirings and has
asked me to answer certain questions by making sense of that data.
We will use EDA (exploratory data analysis) to generate different
insights and to answer the questions asked by the company.
The dataset given by the company contains the details about people
who registered for a particular post in a department of this company. I
used MS Excel to analyze the data with different tables and columns.
Project Link:
https://docs.google.com/presentation/d/1rVQl-934cwQqaRU6lIpw3EpxLdcTN-8D/edit?usp=sharing&ouid=109059427539152145245&rtpof=true&sd=true
Bank Loan Case Study
2. Default:
Men have a relatively higher default rate.
Clients who are on maternity leave or unemployed default at a high
rate.
The bank can consider not approving loans for young people in the
20-40 age group, as they have a higher probability of defaulting.
When the credit amount goes beyond 3M, there is an increase in
defaulters.
People who have less than 5 years of employment have a high default
rate.
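Findings like these come from comparing the default rate across groups. The sketch below shows one way to do that comparison in plain Python. The field names (gender, years_employed, defaulted), the 5-year tenure split expressed as a derived column, and all sample records are hypothetical; they are not the project's dataset or its actual column names.

```python
from collections import defaultdict


def default_rate_by_group(records, group_key):
    """Percentage of defaulted loans within each value of `group_key`."""
    totals = defaultdict(int)
    defaults = defaultdict(int)
    for record in records:
        group = record[group_key]
        totals[group] += 1
        defaults[group] += record["defaulted"]  # 1 if defaulted, else 0
    return {group: 100.0 * defaults[group] / totals[group] for group in totals}


# Toy records, not data from the project.
toy_loans = [
    {"gender": "M", "years_employed": 2, "defaulted": 1},
    {"gender": "M", "years_employed": 8, "defaulted": 0},
    {"gender": "M", "years_employed": 1, "defaulted": 1},
    {"gender": "M", "years_employed": 12, "defaulted": 0},
    {"gender": "F", "years_employed": 3, "defaulted": 0},
    {"gender": "F", "years_employed": 10, "defaulted": 0},
    {"gender": "F", "years_employed": 4, "defaulted": 1},
    {"gender": "F", "years_employed": 7, "defaulted": 0},
]

# Derive a tenure bucket so that employment length can be grouped as well.
with_tenure = [
    dict(loan, tenure="<5 years" if loan["years_employed"] < 5 else "5+ years")
    for loan in toy_loans
]

rate_by_gender = default_rate_by_group(toy_loans, "gender")
rate_by_tenure = default_rate_by_group(with_tenure, "tenure")
print(rate_by_gender, rate_by_tenure)
```

The same function can be reused for other groupings, such as income type, age group, or credit amount band, once those columns are bucketed.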
Project Link:
https://drive.google.com/file/d/1j-GELtBkl6Y50GuImk0uTtAzMTbVTCbc/view?usp=sharing
XYZ Ads Airing Report Analysis
Project Link:
https://drive.google.com/file/d/17Azv3yhCuM2DhOQCzsm_goaIzmM2l0YF/view?usp=sharing
ABC Call Volume Trend Analysis
Description:
The attached dataset contains the inbound calls of ABC, a company in
the insurance category, handled by its Customer Experience (CX)
inbound calling team over 23 days. The data includes Agent_Name,
Agent_ID, Queue_Time [duration for which a customer has to wait
before getting connected to an agent], Time [time of day at which the
customer made the call], Time_Bucket [provided for convenience],
Duration [duration for which a customer and an executive are on the
call], Call_Seconds [the duration converted into seconds for
simplicity], and call status (abandon, answered, transferred).
Findings:
The average call duration for all incoming calls received by agents (in each Time_Bucket).
The total volume/number of calls coming in, shown via charts/graphs [Number of calls v/s Time].
A proposed manpower plan for each time bucket [between 9 am and 9 pm] to reduce the abandon rate to 10%.
A proposed manpower plan for each time bucket at night [9 pm to 9 am], assuming the same maximum abandon rate of 10%.
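The per-bucket figures listed above (call volume, abandon rate, average call duration) can be sketched as a simple group-by in plain Python, similar in spirit to a pivot table. The key name Call_Status, the bucket labels, and the sample rows are assumptions for illustration, as is the choice to average Call_Seconds over answered calls only.

```python
from collections import defaultdict


def summarize_by_bucket(calls):
    """Call count, abandon rate, and mean answered duration per Time_Bucket."""
    stats = defaultdict(
        lambda: {"calls": 0, "abandoned": 0, "answered": 0, "answered_seconds": 0}
    )
    for call in calls:
        bucket = stats[call["Time_Bucket"]]
        bucket["calls"] += 1
        status = call["Call_Status"].lower()
        if status == "abandon":
            bucket["abandoned"] += 1
        elif status == "answered":
            bucket["answered"] += 1
            bucket["answered_seconds"] += call["Call_Seconds"]

    summary = {}
    for name, bucket in stats.items():
        answered = bucket["answered"]
        summary[name] = {
            "calls": bucket["calls"],
            "abandon_rate_pct": 100.0 * bucket["abandoned"] / bucket["calls"],
            # None when no call in the bucket was answered.
            "avg_answered_seconds": (
                bucket["answered_seconds"] / answered if answered else None
            ),
        }
    return summary


# Toy rows, not data from the project.
toy_calls = [
    {"Time_Bucket": "9_10", "Call_Status": "answered", "Call_Seconds": 120},
    {"Time_Bucket": "9_10", "Call_Status": "answered", "Call_Seconds": 180},
    {"Time_Bucket": "9_10", "Call_Status": "abandon", "Call_Seconds": 0},
    {"Time_Bucket": "9_10", "Call_Status": "transferred", "Call_Seconds": 60},
    {"Time_Bucket": "10_11", "Call_Status": "abandon", "Call_Seconds": 0},
    {"Time_Bucket": "10_11", "Call_Status": "abandon", "Call_Seconds": 0},
]
bucket_summary = summarize_by_bucket(toy_calls)
print(bucket_summary)
```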
Approach:
We used pivot tables and pivot charts to get valuable insights from the data.
We assumed that an agent works 6 days a week.
On average, each agent takes 4 days of unplanned leave a month.
An agent's total working time is 9 hours a day, of which 1.5 hours go to lunch and snacks in the office.
On average, an agent is occupied on calls with customers/users for 60% of the actual working hours (i.e. 60% of 7.5 hours).
We also assumed that a month has 28 days for easy calculation.
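The staffing assumptions above translate into a small calculation. The sketch below is one possible reading of them, not necessarily the exact method used in the project: the way leave and weekly days off are folded into a single attendance factor is an assumption, and the call volume and average duration in the example are toy values.

```python
import math

# Assumptions taken from the approach above.
WORK_HOURS_PER_DAY = 9.0
BREAK_HOURS_PER_DAY = 1.5
OCCUPANCY = 0.60
DAYS_IN_MONTH = 28
WORKING_DAYS_PER_WEEK = 6
UNPLANNED_LEAVE_DAYS_PER_MONTH = 4


def talk_hours_per_agent_day():
    """Hours an agent actually spends on calls in one working day."""
    return (WORK_HOURS_PER_DAY - BREAK_HOURS_PER_DAY) * OCCUPANCY


def attendance_factor():
    """Fraction of calendar days on which an agent is present (assumed model)."""
    scheduled_days = DAYS_IN_MONTH * WORKING_DAYS_PER_WEEK / 7
    present_days = scheduled_days - UNPLANNED_LEAVE_DAYS_PER_MONTH
    return present_days / DAYS_IN_MONTH


def agents_required(daily_calls, avg_call_seconds, target_abandon_rate=0.10):
    """Headcount needed so that all but the target share of calls is answered."""
    calls_to_answer = daily_calls * (1 - target_abandon_rate)
    talk_hours_needed = calls_to_answer * avg_call_seconds / 3600
    agents_on_floor = talk_hours_needed / talk_hours_per_agent_day()
    # Scale up to cover weekly days off and unplanned leave.
    return math.ceil(agents_on_floor / attendance_factor())


# Toy inputs, not figures from the project: 1000 calls a day, 200 s per call.
example_headcount = agents_required(daily_calls=1000, avg_call_seconds=200)
print(talk_hours_per_agent_day(), attendance_factor(), example_headcount)
```

To plan per time bucket, the resulting headcount could be distributed in proportion to each bucket's share of the daily call volume.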
Insights:
Customers call the least in the evening, so the company can reduce the number of agents answering calls at that time.
The company can hire 17 customer support agents for the night shift.
The company can move some of the day workers to the night shift.
For employees currently working 9 am to 9 pm, the manager can move some to a 5 am to 2 pm shift and others to a 2 pm to 11 pm shift to get the most calls answered.
The company can also divide the employees into 3 shifts so that agents are available 24/7.
Project Link:
https://drive.google.com/file/d/13BVWMxiomQkblAg0ZfgJKiiy2rUBwh88/view?usp=sharing
Conclusion
Learnings:
The things I learned from the projects are:
Data Analysis 6 Steps Process
How to use advanced SQL concepts in a real-world business case
How to use advanced Excel concepts in a real business case scenario
How to analyze huge datasets in Python, Excel, etc.
How to visualize data to gain valuable insights
Concepts of Operation Analytics & Investigating Metric Spikes
HR Analytics
Predictive Analytics
Risk Analytics
Behavioral Analytics
Business scenario of the Ads Airing Report
Customer Experience Team and Inbound Customer Support
How to use Google to find concepts and answers whenever I get stuck
Contact: [email protected]
Thank You