PATIL - Data Scientist
[email protected]
571-223-4948
Data science and statistics professional, creative thinker, and analytical problem solver. Able to distill
high-performing solutions from data to drive business strategy. Versatile, purposeful, and meticulous
professional in data science and programming. Experience in machine learning and data mining with
large structured and unstructured datasets, performing data acquisition, data validation, predictive
modelling, and data visualization. Experience in time series forecasting, text mining, and deep learning.
Summary
Skilled in using dplyr and ggplot2 in R, and Pandas, NumPy, Matplotlib, Seaborn, and scikit-learn in
Python, for exploratory and predictive data analysis.
Experience with Data Analytics, Data Reporting, Ad-hoc Reporting, Graphs, Scales, PivotTables.
Familiarity with developing, deploying, and maintaining production models with scalability in mind.
Exposure to working on cloud platforms like MS Azure – Machine Learning and Power BI.
Proficient in building dashboards in Tableau, MS Power BI, ggplot2, and R Shiny.
Provides business expertise and supports the development of models and analysis to provide the
organization with insights.
Drives the analytics roadmap proactively by identifying opportunities in the data based on the business
priorities working with all divisions.
Proficient in Agile development methodologies, specifically Scrum.
Technical Skills
Analytics: R, Python, Tableau, Power BI, SPSS Modeler, XLSTAT
Data Science: Generalized Linear Modeling (GLM), Logistic Regression, Decision Trees, Naive Bayes,
Random Forest, Gradient Boosted Models (GBM), SVM, Neural Networks, CHAID, PCA, K-means
clustering, Discriminant Analysis, ARIMA, SARIMA, Holt’s methods, Weighted Moving Average,
Hypothesis Testing, Sentiment Analysis, Topic Modeling, and Named Entity Recognition.
Databases: Azure SQL Database, MS Access 2016, SQL Server 2017, PostgreSQL, and MySQL
Cloud Platforms: Microsoft Azure Machine Learning Studio, IBM SPSS Modeler
IDE: Jupyter Notebook, Spyder, Eclipse and Google Colab.
Certifications: Microsoft Azure ML – Data Science Essentials, IBM SPSS Modeler data analysis for
Business partners v2
Education
JUNE 2012
Master of Computer Applications/M S Ramaiah Institute of Technology, Bengaluru, India
JUNE 2009
Bachelor of Computer Applications/Karnataka University, Vijayapura, India
Professional Experience
SEP 2019 – PRESENT
Data Scientist – Sales Forecast/Avanos Medical, Alpharetta, Georgia
Worked as a Data Scientist for a leading medical technology company focused on delivering clinical
medical device solutions. Avanos provides solutions in Respiratory Health, Pain Management,
Digestive Health, IV Infusion, and Surgical Pain. Collaborated with a small team of data scientists
and analysts to create numerous sales forecasting models from Avanos’s historical data, hosted on
Microsoft SQL Server, to forecast sales for each business and group unit. The forecasts help sales ops
with territory and quota planning, supply chain with material purchases and production capacity, and
sales strategy with channel and partner strategies.
Queried databases with pyodbc in Python and loaded the results into Pandas DataFrames.
Visualized data and performed exploratory data analysis (EDA) with Matplotlib/Seaborn in
Python.
Explored multiple approaches for predicting month-ahead sales demand with Python,
including exponential smoothing, ARIMA, Prophet, TBATS, and LSTM.
Engineered time-series features with NumPy, Pandas, and Featuretools.
Evaluated the performance of predictions using the root mean squared error (RMSE).
Coordinated with the sales team to understand the problem and ensure our predictions were
beneficial.
Improved forecasting accuracy by modeling at the device/product and customer level, which
gave a better understanding of products and customers.
Set up the architecture for integration between Microsoft Azure Machine Learning Studio and
Dynamics 365 Customer Insights. Injected historical data from Dynamics 365 Customer Insights
into Azure ML Studio using a web service API.
The goal of this project is to find out the most striking behaviors of customers through
Exploratory Data Analysis and later use some of the predictive analytics techniques to determine
the customers who are most likely to churn.
Applied several classification techniques, including Gradient Boosting, Random Forest, Logistic
Regression, AdaBoost, and K-Nearest Neighbors (KNN), to predict the probability that a customer
will churn.
Applied cross-validation and hyperparameter tuning to improve the accuracy of the classifiers.
The churn model helped the business identify its loyal customers for a loyalty program.
This unique approach to rewards has helped generate record levels of client satisfaction and
retention with members. Members report more than 85 percent satisfaction with their banking
relationship and have a retention rate of 99 percent.
Set up the architecture for integration between MS Azure ML and Dynamics 365 PSA.
Injected historical data from Dynamics 365 PSA into Azure ML Studio using a web service API key.
The main assumption in analyzing time series is that the successive values of a variable
represent consecutive measurements taken at equally spaced time intervals. Explored multiple
approaches for predicting monthly skill-wise demand with R, including ARIMA (autoregressive
integrated moving average), ETS (exponential smoothing), and STL (seasonal and trend
decomposition using Loess).
Evaluated all models using MAPE (Mean Absolute Percentage Error) and selected the model with
the best aggregated MAPE.
Generated forecast results in the output data structure that Dynamics 365 PSA accepts.
Deployed the experiment or model as a web service, which can be consumed in two modes:
RRS (request-response service) and BES (batch execution service).
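The MAPE-based model selection described above can be sketched as follows. This is a minimal illustration in Python, not the project's code (which used R); the function names `mape` and `best_model` and the sample numbers are hypothetical.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent.

    Assumes no actual value is zero, since MAPE is undefined there.
    """
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


def best_model(actual, forecasts_by_model):
    """Return the name of the model with the lowest MAPE, plus all scores."""
    scores = {name: mape(actual, fc) for name, fc in forecasts_by_model.items()}
    return min(scores, key=scores.get), scores


if __name__ == "__main__":
    # Hypothetical monthly demand and two candidate forecasts.
    demand = [100, 200, 150]
    candidates = {
        "arima": [110, 180, 150],
        "ets": [130, 240, 120],
    }
    winner, scores = best_model(demand, candidates)
    print(winner, scores)
```

A lower MAPE means smaller relative errors, so the selection simply takes the minimum across the candidate models.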
FEB 2015 – FEB 2016
Sr Analyst – Text Mining, Employee Perception Survey/Wipro Technologies, Bengaluru, India
The Employee Perception Survey is conducted annually at Wipro to gather employee feedback about
the organization. The project analyzed feedback comments available as unstructured text data.
Sixteen key drivers, such as Rewards, Role, and Customer Focus, were identified from the employee
comments and provided further insight into employee satisfaction.
Built a taxonomy around the 16 drivers from scratch using IBM SPSS Text Analytics for Surveys.
Reported to the Senior Vice President, HR, on the latest updates and developments of
the project.
Preprocessed raw text data by removing URLs and irrelevant characters (numbers and
punctuation), converting all characters to lowercase, and removing words of length <= 2.
Built a OneVsRest multi-label classification model, which accepts a binary mask over multiple
labels. Naive Bayes, LinearSVC, and Logistic Regression classifiers were used to predict multiple
labels for each instance.
Built a sentiment score model for each driver to identify employees’ positive and negative
feedback.
Crawled online employee review data from Glassdoor and Indeed using Alterian SM2, and
performed competitive analysis on these reviews.
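The text preprocessing steps listed above (URL removal, stripping numbers and punctuation, lowercasing, dropping words of length <= 2) can be sketched as below. This is an illustrative assumption rather than the project's actual pipeline: the `preprocess` function and both regular expressions are hypothetical, and the character filter assumes English-only text.

```python
import re

# Hypothetical patterns; a production pipeline may need broader URL handling.
URL_PATTERN = re.compile(r"(?:https?://|www\.)\S+")
NON_LETTER_PATTERN = re.compile(r"[^a-z\s]")  # assumes ASCII English text


def preprocess(text, min_length=3):
    """Clean one raw comment and return a space-separated string of tokens."""
    text = URL_PATTERN.sub(" ", text)         # 1. remove URLs
    text = text.lower()                       # 2. lowercase everything
    text = NON_LETTER_PATTERN.sub(" ", text)  # 3. drop numbers and punctuation
    # 4. keep only words longer than two characters
    return " ".join(word for word in text.split() if len(word) >= min_length)


if __name__ == "__main__":
    print(preprocess("Great pay at https://example.com, but 2 long hours!!"))
```

URLs are removed before the character filter so that fragments of an address do not survive as ordinary words.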
A wide range of employee data was collected from the central HR team, including employee
profile, leave, manager, appraisal, salary, and training data.
Queried databases with RPostgres in R and loaded the results into data frames.
Built a predictive model using random forest and logistic regression in R to explore the
complex non-linear relationship between attrition and its causal drivers. A scoring algorithm
was also used to generate a list of potential attritors.
Evaluated the model with a confusion matrix, which cross-tabulates actual results against
predicted results, and with AUC (area under the ROC curve); the higher the AUC, the better.
Generated a report on model accuracy and the statistical inference of the model.
Extracted and transformed the data required for visualization tools.
The attrition model helped the business minimize the cost of new talent acquisition based on
employee profiling and company requirements, and supported analysis and assessment of the loss
in expertise and skill sets. It also helped in preparing contingency plans based on the insight
and foresight provided by the model.
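The two evaluation methods used for the attrition model, a confusion matrix and AUC, can be illustrated with the following sketch. It is written in plain Python for illustration only (the project itself used R); the function names, the 0.5 threshold, and the sample data are hypothetical.

```python
def confusion_matrix(actual, predicted):
    """Cross-tabulate binary actual vs. predicted labels (1 = attrition)."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["tp"] += 1
        elif a == 0 and p == 1:
            counts["fp"] += 1
        elif a == 1 and p == 0:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def auc(actual, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case is scored
    higher than a randomly chosen negative case (ties count as half).
    """
    positives = [s for a, s in zip(actual, scores) if a == 1]
    negatives = [s for a, s in zip(actual, scores) if a == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    actual = [1, 0, 1, 0]
    scores = [0.9, 0.4, 0.35, 0.2]                     # hypothetical model scores
    predicted = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold
    print(confusion_matrix(actual, predicted))
    print(auc(actual, scores))
```

The confusion matrix depends on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is why the two are reported together.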
A range of data was collected from sources such as forums, blogs, Twitter, Facebook, and the
open web using Alterian SM2.
Applied a multi-label classification approach to classify customer reviews from different
social media segments into categories based on their products and services. LinearSVC
performed well in this scenario.
Customer retention analysis – Identified key factors driving customers who are willing to switch
(Preferred Banks) and key factors driving customers who are willing to stay.
This helped the client continuously refine its strategy, improve its marketing efforts, stay
on top of emerging trends, and offer its own clients proof of results.
The demand forecasting model helps to predict the skill-wise (top-level) demand for resources
based on historical data and to predict the suite-wise (hierarchical) demand for resources based
on the skill-wise forecast.
Explored multiple approaches for predicting monthly skill-wise demand with R, including
exponential smoothing, ARIMA, Holt’s methods, and Croston.
Continuously validated models using a train-validate-test split to ensure the forecasts were
accurate enough to determine the optimal number of resources (mapped skill-wise) to meet demand.
Coordinated with the central workforce team to understand the problem and ensure our
predictions were beneficial.
Forecasting capacity vs. demand from multiple perspectives (by role, department, location,
skills, etc.) helps identify short- or long-term shortages or excesses of resources ahead of time
and bridge the gap with appropriate resourcing treatments: taking corrective actions such as
retraining current employees and hiring recruits early, and providing shadowing opportunities
to bench resources.
Susan
Fax: 571-291-4522
Email: [email protected]
LinkedIn: linkedin.com/in/susan-s-838a1719b
www.pvkc.com