
Name

Email
Mobile
Location

Data Analyst

Professional summary:
 Experienced Sr. Data Analyst with 7+ years of expertise in manipulating and analyzing large datasets, developing
sophisticated data models, and delivering actionable insights to drive strategic business decisions.
 Advanced proficiency in SQL, consistently writing, optimizing, and maintaining complex queries across various
database systems.
 Demonstrated mastery of data visualization tools, particularly Qlik Sense and Tableau, with a track record of creating
impactful dashboards and reporting systems that increased data-driven decision-making.
 Proven ability to apply data science techniques, including statistical analysis and machine learning algorithms, to
develop predictive models that enhanced forecast accuracy by 30% and uncovered valuable business insights
 Excellent understanding of machine learning algorithms such as Decision Trees, Naive Bayes, SVM, Random Forest,
clustering, PCA, KNN, ANN, and CNN.
 Experience in working with cloud-based data platforms (AWS, Google Cloud, Azure) to build scalable and cost-effective
data solutions.
 Experience with Big Data technologies and a deep understanding of the Hadoop Distributed File System and ecosystem
(HDFS, MapReduce, Hive, Sqoop, Oozie, Zookeeper, HBase, Flume, Pig, Apache Kafka) in a range of industries such as
the Retail and Communication sectors.
 Strong background in ensuring data integrity and accuracy through regular audits and validation checks, reducing data
errors and establishing robust data governance practices
 Proficient in programming languages such as Python and R, leveraging these skills to automate data processes and
conduct advanced statistical analyses
 Familiarity with big data technologies including Hadoop and Spark, as well as cloud-based data platforms like AWS,
Google Cloud, and Azure, enabling efficient handling and analysis of large-scale datasets
 Worked on relational databases (MySQL, PostgreSQL) and non-relational databases (MongoDB, DynamoDB).
 Knowledge of statistical techniques such as descriptive modelling, linear and non-linear models, classification, data
reduction techniques, and predictive analysis.
 Experience in importing and exporting data from RDBMS to HDFS, Hive tables and HBase by using Sqoop.
 Experience using Python to manipulate data for loading and extraction, working with libraries such as Matplotlib,
scikit-learn, NumPy, Seaborn, TensorFlow, Keras, and Pandas for data analytics and predictive modelling, along with
BI tools such as Tableau.
 Capable of developing and deploying predictive models to forecast future trends, make informed decisions, and
identify potential risks.
 Skilled in time series analysis and forecasting, utilizing ARIMA, SARIMA, and Prophet models to predict trends and
seasonality in business metrics
 Excellent command of MS Excel and VBA (macros) across MS Office products (MS Access, MS Word, MS PowerPoint),
using VBA to automate reporting and dashboards, including Pivot Tables and Power Pivot (Data Model).
 Extensive experience in designing and implementing ETL (Extract, Transform, Load) processes using tools like Apache
NiFi and Talend, optimizing data pipelines for improved efficiency and reduced latency
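The predictive-modelling experience summarized above can be illustrated with a minimal scikit-learn sketch. The data here is synthetic, and the model choice (Random Forest, one of the algorithms listed) is only an example, not actual project code:

```python
# Illustrative sketch only: synthetic features and a label derived from them,
# fit with one of the algorithms named in the summary (Random Forest).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(42)
X = rng.normal(size=(500, 4))              # invented "customer" features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # churn-like synthetic label

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
acc = accuracy_score(y_test, model.predict(X_test))
print(f"holdout accuracy: {acc:.2f}")
```

The same train/fit/score shape applies regardless of which listed algorithm is swapped in.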

Technical Skills
 Data Analysis: Entity-Relationship Diagrams, Dimensional Modeling, Schema Design, Data Warehousing, DQM, MDM,
Data Lineage, Data Encryption, Data Visualization
 Big Data: Hadoop (HDFS, YARN), Spark, Hive, Pig, Kafka, Flink, NiFi, Airflow, Zeppelin.
 ML: Scikit-learn, TensorFlow, PyTorch, Keras, NLTK, spaCy
 Statistical Analysis: Hypothesis Testing, Regression, Time Series Analysis, A/B Testing, ANOVA, PCA, Cluster Analysis
 Data Mining: Association Rule Learning, Anomaly Detection, Pattern Recognition, Text Mining, Web Scraping
 Languages: Python, SQL, R, Java, PowerShell
 Cloud & Databases: AWS, GCP, MS Azure, Snowflake, MySQL, PostgreSQL, Oracle, MS SQL, MongoDB
 Skills: BI, GDPR, RDBMS, OOP, API Integration, Predictive Modelling, Scripting & Automation
 Tools: Tableau, Power BI, AWS, GCP, MS Azure, MS Excel, MS Word, Git, SAP BO, IBM Cognos, MicroStrategy

Professional experience:

Sr. Data Analyst


Responsibilities:
 Architected a cloud-native data lake on AWS using S3, Glue, and Athena, enabling seamless querying of petabyte-scale
datasets and reducing data retrieval times by 70%
 Developed a multi-stage ETL pipeline using Apache Spark on Azure Databricks, processing diverse data sources and
improving data transformation efficiency by 55%
 Implemented advanced machine learning algorithms, including gradient boosting and neural networks, using Python's
TensorFlow and scikit-learn libraries to create a churn prediction model, increasing customer retention by 28%
 Designed and deployed a real-time anomaly detection system using Spark Streaming and Kafka on Google Cloud
Dataproc, identifying fraudulent transactions with 96% accuracy
 Created interactive, multi-dimensional dashboards using Qlik Sense and Tableau, visualizing complex KPIs and
enabling data-driven decision making across all organizational levels
 Optimized complex SQL queries and implemented indexing strategies in a multi-terabyte PostgreSQL database,
reducing average query execution time by 65%
 Conducted comprehensive statistical analyses using R, including time series forecasting and multivariate regression, to
identify key drivers of product demand and optimize inventory management
 Led cross-functional data science workshops, fostering collaboration between IT, marketing, and finance teams to
develop predictive models for customer lifetime value
 Implemented a data quality framework using Apache NiFi and custom Python scripts, automating data validation
checks and improving overall data integrity by 40%
 Developed a natural language processing pipeline using NLTK and spaCy to analyze customer feedback, extracting
sentiment and key topics with 88% accuracy
 Led the migration of on-premises data warehouses to Google BigQuery, optimizing data architecture and reducing
data processing costs by 60%
 Designed and implemented a recommendation engine using collaborative filtering techniques in PySpark, increasing
cross-sell opportunities by 35%
 Created an automated reporting system using Airflow and Redash, reducing manual report generation time by 75% and
ensuring timely delivery of insights to stakeholders
 Utilized Hadoop ecosystem tools (Hive, Impala) to perform ad-hoc analyses on historical data, uncovering trends that
informed long-term business strategy
 Implemented data governance policies and conducted regular data audits, ensuring compliance with GDPR and
industry-specific regulations
 Developed a machine learning model deployment pipeline using MLflow and Docker, enabling seamless model updates
and version control
 Created and maintained complex data models in Snowflake, optimizing star schema designs for analytical queries and
improving query performance by 50%
 Utilized SQL queries to extract and manipulate data from various databases, including MySQL, PostgreSQL, MongoDB,
Redshift, and Snowflake, ensuring data accuracy and reliability.
 Conducted A/B tests and statistical experiments to evaluate the impact of product features, providing data-driven
recommendations that increased user engagement by 22%
 Implemented a data catalog using AWS Glue Data Catalog, improving data discovery and enabling self-service analytics
for business users
 Developed predictive maintenance models using IoT sensor data and Azure Machine Learning, reducing equipment
downtime by 40% and saving $3M annually
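The query-optimization work described above (indexing to cut execution time) can be sketched with an in-memory SQLite database standing in for PostgreSQL; the table and index names are invented for illustration:

```python
# Hedged sketch: an index on the filtered column lets the planner avoid a
# full table scan. SQLite is a stand-in here; the idea carries to PostgreSQL.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
cur.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

# Without an index: the plan is a full scan of orders
plan_before = cur.execute(
    "EXPLAIN QUERY PLAN SELECT SUM(total) FROM orders WHERE customer_id = 7"
).fetchall()

cur.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = cur.execute(
    "EXPLAIN QUERY PLAN SELECT SUM(total) FROM orders WHERE customer_id = 7"
).fetchall()

print(plan_before[-1][-1])  # full-scan plan
print(plan_after[-1][-1])   # plan searching via idx_orders_customer
```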

Sr. Data Analyst


Responsibilities:
 Engineered complex SQL queries to extract and manipulate data from multi-terabyte databases, improving data
retrieval efficiency by 45% and enabling real-time analytics
 Developed an end-to-end data pipeline using Apache Spark on AWS EMR, processing daily incremental loads of
500GB+ data and reducing processing time by 60%
 Created interactive, multi-layered dashboards in Qlik Sense and Tableau, providing actionable insights to C-level
executives and driving a 30% increase in data-driven decision making
 Implemented a machine learning-based customer segmentation model using Python's scikit-learn library, resulting in
a 25% uplift in targeted marketing campaign effectiveness
 Utilized Hadoop ecosystem tools (Hive, Impala) to analyze petabytes of historical data, extracting valuable business
insights and supporting long-term strategy development
 Utilized R for time series forecasting and anomaly detection in financial data, identifying potential fraud cases with
92% accuracy
 Orchestrated the migration of on-premises data warehouses to Google Cloud BigQuery, optimizing query performance
and reducing infrastructure costs by 40%
 Designed and implemented a real-time data validation system using Azure Stream Analytics, ensuring 99.9% data
integrity across all data pipelines
 Led cross-functional workshops to identify KPIs and develop performance metrics, aligning data initiatives with
strategic business objectives
 Conducted advanced statistical analyses, including multivariate regression and principal component analysis, to
uncover hidden patterns in customer behavior data
 Developed a predictive maintenance model using IoT sensor data and Spark MLlib, reducing equipment downtime by
35% and saving $2M annually in maintenance costs
 Created an automated reporting system using Python and Airflow, reducing manual report generation time by 80% and
ensuring timely delivery of insights to stakeholders
 Implemented data governance policies and conducted regular data audits, ensuring compliance with GDPR and
industry-specific regulations
 Created ad hoc reports and pivot tables in Excel using VBA and macros.
 Optimized complex database queries (SQL) and data models in Snowflake, improving overall system performance and
reducing data processing costs by 50%
 Developed and delivered data literacy training programs to non-technical teams, fostering a data-driven culture across
the organization
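The Excel pivot-table reporting mentioned above has a direct pandas equivalent. This sketch uses invented sales data with hypothetical column names, not figures from any actual engagement:

```python
# Illustrative sketch: an Excel-style pivot report reproduced in pandas.
import pandas as pd

sales = pd.DataFrame({
    "region":  ["East", "East", "West", "West", "East", "West"],
    "product": ["A", "B", "A", "B", "A", "A"],
    "revenue": [100.0, 150.0, 90.0, 120.0, 110.0, 95.0],
})

# Sum revenue per region (rows) and product (columns), like an Excel pivot
report = pd.pivot_table(
    sales, index="region", columns="product",
    values="revenue", aggfunc="sum", fill_value=0.0,
)
print(report)
```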

Data Analyst
Responsibilities:
 Developed advanced SQL queries and stored procedures to optimize data retrieval from complex relational databases,
reducing query execution time by 60% and improving overall system performance
 Created interactive dashboards using Qlik Sense and Tableau, providing real-time insights into key business metrics
and increasing stakeholder engagement by 40%
 Conducted comprehensive statistical analyses using R, including multivariate regression and ANOVA, to identify
significant factors influencing customer churn and inform retention strategies
 Implemented data mining techniques using Python's scikit-learn library to uncover hidden patterns in customer
behavior, leading to a 25% increase in cross-selling opportunities
 Designed and deployed machine learning models on AWS SageMaker, leveraging cloud computing capabilities to scale
predictive analytics across the organization
 Utilized Hadoop ecosystem tools (HDFS, Hive, Pig) to process and analyze terabytes of unstructured log data,
extracting valuable insights for system optimization
 Developed data pipelines using Apache Spark on Google Cloud Dataproc, enabling efficient processing of big data and
reducing ETL job completion time by 50%
 Created and maintained complex data models in Azure Synapse Analytics, ensuring data consistency and facilitating
advanced analytics across multiple business units
 Loaded the current month's credit card transaction data into Python for data cleaning and wrangling, merging tables,
filling null values, and checking outliers using the NumPy and Pandas packages.
 Created Information Links to populate visualizations from underlying databases such as SQL Server and Oracle.
 Implemented automated data quality checks and cleansing procedures using Python, improving data reliability and
reducing manual data preparation time by 70%
 Proficient in data modeling techniques, including star schema, time-series, and cube models, as well as Continuous
Integration/Continuous Deployment (CI/CD) pipelines.
 Built interactive dashboards in Excel with Pivot Tables, VLOOKUP, and SmartArt, and reported insights to management.
 Created SQL scripts for use in the development of Crystal Reports.
 Developed a predictive maintenance model using Random Forest algorithm, resulting in a 30% reduction in equipment
downtime and significant cost savings
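The merge / fill-nulls / outlier-check steps described above can be sketched in a few lines of pandas. The transaction data and column names here are invented for illustration:

```python
# Minimal sketch of the cleaning steps: merge tables, fill nulls, flag outliers.
import numpy as np
import pandas as pd

txns = pd.DataFrame({
    "txn_id": [1, 2, 3, 4],
    "customer_id": [10, 11, 10, 12],
    "amount": [25.0, np.nan, 40.0, 5000.0],   # one null, one extreme value
})
customers = pd.DataFrame({
    "customer_id": [10, 11, 12],
    "segment": ["retail", "retail", "premium"],
})

# Merge the tables, then fill null amounts with the median amount
df = txns.merge(customers, on="customer_id", how="left")
df["amount"] = df["amount"].fillna(df["amount"].median())

# Flag outliers with the common 1.5 * IQR upper fence
q1, q3 = df["amount"].quantile(0.25), df["amount"].quantile(0.75)
fence = q3 + 1.5 * (q3 - q1)
df["outlier"] = df["amount"] > fence
print(df[["txn_id", "amount", "outlier"]])
```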

Data Analyst
Responsibilities:
 Spearheaded the implementation of a real-time data streaming pipeline using Apache Kafka and Flink, reducing data
processing latency by 70% and enabling near real-time analytics for critical business metrics
 Developed and maintained a distributed data lake architecture on Hadoop HDFS, improving data accessibility and
query performance for petabyte-scale datasets
 Utilized Apache Hive and Presto for large-scale data warehousing and SQL-based analytics, enabling efficient querying
of historical data spanning multiple years
 Implemented advanced anomaly detection algorithms using ensemble methods, resulting in a 40% improvement in
fraud detection rates for the company's financial transactions
 Designed and implemented a patient risk prediction model using gradient boosting algorithms (XGBoost) on electronic
health records, improving early intervention rates by 35%
 Collaborated with the DevOps team to containerize data science workflows using Docker and orchestrate them with
Kubernetes, enhancing reproducibility and scalability of analytical processes
 Migrated a data warehouse to AWS Redshift to improve performance and scalability, resulting in a 30% reduction in
query response times.
 Developed a natural language processing (NLP) pipeline using NLTK and spaCy to extract valuable insights from
unstructured medical notes, enhancing the accuracy of diagnostic coding by 25%
 Utilized time series forecasting techniques (ARIMA, Prophet) to predict hospital resource utilization, optimizing staff
scheduling and reducing operational costs by 15%
 Implemented privacy-preserving machine learning techniques, including federated learning and differential privacy, to
ensure HIPAA compliance while leveraging multi-institutional data
 Created interactive data visualizations using D3.js and Plotly, effectively communicating complex health trends to both
medical professionals and administrative stakeholders
 Proficient in managing relational databases such as MySQL, PostgreSQL, and Oracle, including installation,
configuration, and maintenance tasks.
 Prepared reports using MS Excel (VLOOKUP, HLOOKUP, pivot tables, macros, data points).
 Worked closely with team members to implement the Tableau BI tool across IT projects.
 Collaborated with the IT security team to implement data encryption and access control measures, ensuring the
protection of sensitive patient information in big data environments

Data Analyst
Responsibilities:
 Extensive experience in designing and developing complex DAX calculations and measures to perform advanced
calculations, aggregations, and data transformations within Power BI.
 Created pivot tables and various analysis reports using MS Excel.
 Performed data analysis and data profiling using complex SQL on various source systems, including Oracle and SQL
Server.
 Leveraged SQL extensively for data querying and analysis, enabling detailed reporting on key performance indicators
(KPIs) and healthcare metrics.
 Loaded data using SQL scripts to populate the Alt Data from Excel to Access 2007.
 Performed data mapping and logical data modeling, created class diagrams and ER diagrams, and used SQL queries to
filter data within the Oracle database.
 Maintained and performed process analysis and design of BI reports, as well as opening StoreFront tickets in UAT, DEV,
and PROD for the claims team.
 Used the openpyxl module in Python to format Excel files.
 Wrote SQL queries for ad-hoc analysis reports to analyze customer feedback.
 Designed and developed databases and database solutions using MySQL and Oracle.
 Experienced in data modeling and database design, well versed in server best practices, with hands-on experience
developing stored procedures/routines and reports.
 Published data models in the model mart; skilled in systems analysis, E-R/dimensional data modeling, database design,
and implementing RDBMS-specific features.
 Reviewed Stored Procedures for reports and wrote test queries against the source system (SQL Server) to match the
results with the actual report against the Data mart.
 Created DDL scripts for implementing Data Modeling changes. Designed Star and Snowflake Data Models for
Enterprise Data Warehouse using ERWIN.
 Responsible for gathering data migration requirements.
 Created Technical Design Documents, Unit Test Cases.
 Performed web analytics and reporting via Google Analytics.
 Involved in data mapping and data clean up.
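The ad hoc customer-feedback analysis described above can be sketched against an in-memory SQLite database; the schema and rows are invented for illustration, not production data:

```python
# Hedged sketch: an ad hoc feedback report (average rating and complaint
# count per category) using SQLite as a stand-in for the production RDBMS.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE feedback (id INTEGER PRIMARY KEY, rating INTEGER, category TEXT)"
)
cur.executemany(
    "INSERT INTO feedback (rating, category) VALUES (?, ?)",
    [(5, "billing"), (2, "billing"), (4, "support"), (1, "support"), (3, "support")],
)

rows = cur.execute(
    """
    SELECT category,
           ROUND(AVG(rating), 2)                        AS avg_rating,
           SUM(CASE WHEN rating <= 2 THEN 1 ELSE 0 END) AS complaints
    FROM feedback
    GROUP BY category
    ORDER BY category
    """
).fetchall()
for row in rows:
    print(row)
```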
