Da BD
Email
Mobile
Location
Data Analyst
Professional summary:
Senior Data Analyst with 7+ years of experience manipulating and analysing large datasets, developing
sophisticated data models, and delivering actionable insights to drive strategic business decisions.
Advanced proficiency in SQL, consistently writing, optimizing, and maintaining complex queries across various
database systems.
Demonstrated mastery of data visualization tools, particularly Qlik Sense and Tableau, with a track record of creating
impactful dashboards and reporting systems that increased data-driven decision-making.
Proven ability to apply data science techniques, including statistical analysis and machine learning algorithms, to
develop predictive models that enhanced forecast accuracy by 30% and uncovered valuable business insights
Excellent understanding of machine learning algorithms such as decision trees, Naive Bayes, SVM, random forests,
clustering, PCA, KNN, ANN, and CNN.
Experience in working with cloud-based data platforms (AWS, Google Cloud, Azure) to build scalable and cost-effective
data solutions.
Experience with big data technologies and a deep understanding of the Hadoop Distributed File System and ecosystem (HDFS,
MapReduce, Hive, Sqoop, Oozie, ZooKeeper, HBase, Flume, Pig, Apache Kafka) across industries such as retail
and communications.
Strong background in ensuring data integrity and accuracy through regular audits and validation checks, reducing data
errors and establishing robust data governance practices
Proficient in programming languages such as Python and R, leveraging these skills to automate data processes and
conduct advanced statistical analyses
Familiarity with big data technologies including Hadoop and Spark, as well as cloud-based data platforms like AWS,
Google Cloud, and Azure, enabling efficient handling and analysis of large-scale datasets.
Worked with relational databases (MySQL, PostgreSQL) and non-relational databases (MongoDB, DynamoDB).
Knowledge of statistical techniques such as descriptive modelling, linear and non-linear models, classification, data
reduction techniques, and predictive analysis.
Experience in importing and exporting data from RDBMS to HDFS, Hive tables and HBase by using Sqoop.
Experience using Python to manipulate data for loading and extraction, working with libraries such as
Matplotlib, scikit-learn, NumPy, Seaborn, TensorFlow, Keras, and Pandas for data analytics and predictive modelling, along
with BI tools such as Tableau.
Capable of developing and deploying predictive models to forecast future trends, make informed decisions, and
identify potential risks.
Skilled in time series analysis and forecasting, utilizing ARIMA, SARIMA, and Prophet models to predict trends and
seasonality in business metrics
Excellent experience with MS Excel and VBA (macros) across MS Office products (MS Access, MS Word, MS
PowerPoint), using VBA to automate reporting and dashboards, along with pivot tables and Power Pivot (data model).
Extensive experience in designing and implementing ETL (Extract, Transform, Load) processes using tools like Apache
NiFi and Talend, optimizing data pipelines for improved efficiency and reduced latency
Technical Skills
Data Analysis: Entity-Relationship Diagrams, Dimensional Modeling, Schema Design, Data Warehousing, DQM, MDM,
Data Lineage, Data Encryption, Data Visualization
Big Data: Hadoop (HDFS, YARN), Spark, Hive, Pig, Kafka, Flink, NiFi, Airflow, Zeppelin
ML: Scikit-learn, TensorFlow, PyTorch, Keras, NLTK, spaCy
Statistical Analysis: Hypothesis Testing, Regression, Time Series Analysis, A/B Testing, ANOVA, PCA, Cluster Analysis
Data Mining: Association Rule Learning, Anomaly Detection, Pattern Recognition, Text Mining, Web Scraping
Languages: Python, SQL, R, Java, PowerShell
Cloud & Databases: AWS, GCP, MS Azure, Snowflake, MySQL, PostgreSQL, Oracle, MS SQL, MongoDB
Skills: BI, GDPR, RDBMS, OOP, API Integration, Predictive Modelling, Scripting & Automation
Tools: Tableau, Power BI, AWS, GCP, MS Azure, MS Excel, MS Word, Git, SAP BO, IBM Cognos, MicroStrategy
Professional experience:
Data Analyst
Responsibilities:
Developed advanced SQL queries and stored procedures to optimize data retrieval from complex relational databases,
reducing query execution time by 60% and improving overall system performance
Created interactive dashboards using Qlik Sense and Tableau, providing real-time insights into key business metrics
and increasing stakeholder engagement by 40%
Conducted comprehensive statistical analyses using R, including multivariate regression and ANOVA, to identify
significant factors influencing customer churn and inform retention strategies
Implemented data mining techniques using Python's scikit-learn library to uncover hidden patterns in customer
behavior, leading to a 25% increase in cross-selling opportunities
Designed and deployed machine learning models on AWS SageMaker, leveraging cloud computing capabilities to scale
predictive analytics across the organization
Utilized Hadoop ecosystem tools (HDFS, Hive, Pig) to process and analyse terabytes of unstructured log data,
extracting valuable insights for system optimization
Developed data pipelines using Apache Spark on Google Cloud Dataproc, enabling efficient processing of big data and
reducing ETL job completion time by 50%
Created and maintained complex data models in Azure Synapse Analytics, ensuring data consistency and facilitating
advanced analytics across multiple business units
Loaded current-month credit card transaction data into Python for data cleaning and wrangling, including merging
tables, filling null values, and checking for outliers, using NumPy and Pandas.
Created Information Links for many databases to populate visualizations with data from underlying databases such
as SQL Server and Oracle.
Implemented automated data quality checks and cleansing procedures using Python, improving data reliability and
reducing manual data preparation time by 70%
Applied data modeling techniques, including star schema, time-series, and cube models, and worked with Continuous
Integration/Continuous Deployment (CI/CD) pipelines.
Built an interactive dashboard in Excel with pivot tables, VLOOKUP, and SmartArt, and reported insights to the manager.
Created SQL scripts for use in the development of Crystal Reports.
Developed a predictive maintenance model using Random Forest algorithm, resulting in a 30% reduction in equipment
downtime and significant cost savings
Data Analyst
Responsibilities:
Spearheaded the implementation of a real-time data streaming pipeline using Apache Kafka and Flink, reducing data
processing latency by 70% and enabling near real-time analytics for critical business metrics
Developed and maintained a distributed data lake architecture on Hadoop HDFS, improving data accessibility and
query performance for petabyte-scale datasets
Utilized Apache Hive and Presto for large-scale data warehousing and SQL-based analytics, enabling efficient querying
of historical data spanning multiple years
Implemented advanced anomaly detection algorithms using ensemble methods, resulting in a 40% improvement in
fraud detection rates for the company's financial transactions
Designed and implemented a patient risk prediction model using gradient boosting algorithms (XGBoost) on electronic
health records, improving early intervention rates by 35%
Collaborated with the DevOps team to containerize data science workflows using Docker and orchestrate them with
Kubernetes, enhancing reproducibility and scalability of analytical processes
Migrated a data warehouse to AWS Redshift to improve performance and scalability, resulting in a 30% reduction in
query response times.
Developed a natural language processing (NLP) pipeline using NLTK and spaCy to extract valuable insights from
unstructured medical notes, enhancing the accuracy of diagnostic coding by 25%
Utilized time series forecasting techniques (ARIMA, Prophet) to predict hospital resource utilization, optimizing staff
scheduling and reducing operational costs by 15%
Implemented privacy-preserving machine learning techniques, including federated learning and differential privacy, to
ensure HIPAA compliance while leveraging multi-institutional data
Created interactive data visualizations using D3.js and Plotly, effectively communicating complex health trends to both
medical professionals and administrative stakeholders
Managed relational databases such as MySQL, PostgreSQL, and Oracle, including installation,
configuration, and maintenance tasks.
Prepared reports using MS Excel (VLOOKUP, HLOOKUP, pivot tables, macros, data points).
Worked closely with team members to implement the Tableau BI tool across IT projects.
Collaborated with the IT security team to implement data encryption and access control measures, ensuring the
protection of sensitive patient information in big data environments
Data Analyst
Responsibilities:
Designed and developed complex DAX calculations and measures to perform advanced
calculations, aggregations, and data transformations within Power BI.
Created pivot tables and various analysis reports using MS Excel.
Performed data analysis and data profiling using complex SQL on various source systems, including Oracle and SQL
Server.
Leveraged SQL extensively for data querying and analysis, enabling detailed reporting on key performance indicators
(KPI) and healthcare metrics.
Loaded data from Excel into Access 2007 using SQL scripts to populate the Alt Data.
Performed data mapping and logical data modeling, created class diagrams and ER diagrams, and used SQL queries to filter data
within the Oracle database.
Maintained and performed process analysis and design of BI reports, and opened StoreFront tickets in UAT, DEV,
and PROD for the claims team.
Used the openpyxl module in Python to format Excel files.
Wrote SQL queries for ad hoc analysis reports on customer feedback.
Designed and developed databases and database solutions using MySQL and Oracle.
Experienced in data modeling and database design, well versed in server best practices, with hands-on experience
developing stored procedures, routines, and reports.
Published data models in Model Mart. Skilled in system analysis, E-R/dimensional data modeling, database
design, and implementing RDBMS-specific features.
Reviewed Stored Procedures for reports and wrote test queries against the source system (SQL Server) to match the
results with the actual report against the Data mart.
Created DDL scripts for implementing Data Modeling changes. Designed Star and Snowflake Data Models for
Enterprise Data Warehouse using ERWIN.
Responsible for gathering data migration requirements.
Created Technical Design Documents, Unit Test Cases.
Performed web analytics and reporting via Google Analytics.
Involved in data mapping and data cleanup.