
BSBA F23 BAIT3473 Project1 Powerquery

The document discusses a business database strategy term project focused on HR analytics. The project involves connecting an HR dataset to Power BI, transforming and cleaning the data, and creating dashboards to visualize key metrics like attrition rates and job satisfaction.

Uploaded by

Javeria Umar

UCP (University of Central Punjab)

FOMS (Faculty of Management Sciences)

Business Database Strategy

Term Project

Group Members

• AMNA TAHIR (L1F21BSBA0001)

• SYED ABDUL SAMI (LF21BSBA0011)

• ABDULLAH (L1F21BSBA0014)

• JAVERIA UMAR (L1F21BSBA0025)

• ALY (L1F21BSBA0026)

• AMMAR YASIR (L1F21BSBA0035)

• ZAINAB MUMTAZ (L1F21BSBA0050)

SUBMITTED TO: SIR EHTISHAM UL HAQ

SUBMITTED ON: 25 JANUARY 2024

Executive Summary

In our Business Database Strategy project, our focus is on HR analytics, utilizing a dataset
obtained from Kaggle. The dataset, centered around the Human Resources department, contains
detailed employee information. Our primary goal is to employ Power BI for comprehensive data
analysis and visualization.

We initiated the project by connecting the dataset to Power BI, structuring and cleaning the data
through Power Query. This involved tasks such as removing irrelevant columns, handling
missing values, and transforming numerical ratings into qualitative measures.

Our project emphasizes a strategic approach to data capturing, transformation, and cleaning. The
dataset's structured nature ensures efficient data management and analysis. The main objectives
include achieving company goals, fostering a positive workplace culture, and promoting
employee empowerment within the HR domain.

Moving forward, the project involves creating dashboards that visualize key HR metrics,
including attrition rates, job satisfaction, and employee demographics. This approach enables us
to derive meaningful insights, support informed decision-making, and contribute to the overall
success of our organization by optimizing HR strategies. The project aligns with the broader goal
of leveraging data-driven insights to enhance business database strategy.

Table of Contents
Project scope
Business domain
Nature of data
Objectives and features of the dataset

Step 1 (Data capturing)
What is data capturing?
Why is data capturing important?
How did we perform data capturing?
Step 2 (Data transformation)
What is data transformation?
Why is data transformation important?
What is data cleaning?
Why is data cleaning important?
How did we perform data transformation?
Step 3 (Data visualization)
What is data visualization?
Why is data visualization important?
How did we perform data visualization?
KPIs
FILTERS
VISUALS

Importance of using Power BI for visualization

Project Scope
Business domain:
The dataset we selected concerns the human resources department and contains detailed
employee information relating to work activities. The Human Resources (HR) function
plays a pivotal role in ensuring the effective management of human capital to achieve project
objectives. The primary objectives of HR are achieving company objectives, improving
workplace culture, conducting development and training programs, employee empowerment,
employee motivation, and teamwork.

Nature of data:
The dataset chosen for our analysis is characterized as a structured dataset, distinguished by
well-defined attributes and key identifiers. This dataset exhibits a clear organization, with
distinct variables and identifiable keys that facilitate efficient data management and analysis. The
structured nature of this dataset ensures a systematic arrangement of information, allowing for
precise retrieval and interpretation of data elements. The inclusion of proper attributes and key
identifiers enhances the overall integrity and reliability of the dataset, providing a solid
foundation for meaningful insights and informed decision-making in our analytical endeavors.

Objectives and features of the dataset:


The prime objective of our team project is to connect the HR dataset to Power BI using the
appropriate connector. After a successful connection, the dataset is transformed and cleaned by
removing any anomalies. Once the dataset is transformed and cleaned, the final step is to
create dashboards and perform proper visualization.

Step 1
Data capturing
Data capturing refers to the process of collecting, extracting, or obtaining raw data from various
sources for further analysis. In the context of analytics and business intelligence, data capturing
is a crucial step that involves gathering information to be used for making informed decisions,
identifying trends, and deriving insights. The importance of data capturing lies in its role as the
foundation for any analytical or reporting endeavor.

Why Is Data Capturing Important?

➢ Informed Decision-Making:
Data capturing provides the necessary raw material for organizations to make informed decisions
based on evidence and trends within their datasets.
➢ Performance Evaluation:
Capturing relevant data allows businesses to assess the performance of various processes,
departments, or initiatives, aiding in strategic planning and resource allocation.
➢ Identification of Patterns and Trends:
Analyzing captured data helps in identifying patterns, trends, and anomalies, which is essential
for understanding the dynamics of a business or system.
➢ Predictive Analytics:
Historical data captured over time can be used to develop predictive models, allowing
organizations to anticipate future trends and make proactive decisions.
➢ Resource Optimization:
Understanding data helps in optimizing resources, streamlining processes, and improving
efficiency by eliminating bottlenecks or inefficiencies.

HOW DID WE PERFORM DATA CAPTURING?


1. Dataset Sourcing: The dataset was acquired from Kaggle, a popular platform for sharing
datasets and conducting data science competitions. This dataset specifically focuses on IBM HR
analytics, providing information related to employee attrition, job satisfaction, and other relevant
factors.
The dataset is available at:
https://www.kaggle.com/datasets/pavansubhasht/ibm-hr-analytics-attrition-dataset

2. Extraction and Format Conversion: The dataset, initially in a zip folder, was extracted to
access the CSV file containing the raw data. This extraction process is akin to capturing the data
from its original compressed state, making it accessible for further analysis.
3. Power BI Connection: Connecting the dataset with Power BI is a pivotal step in the data
analytics workflow. Power BI serves as a robust tool for visualization, analysis, and reporting,
making it essential for transforming raw data into meaningful insights.

The dataset was loaded successfully, as shown below:

In summary, the process of capturing data from Kaggle, extracting it from a zip folder,
converting it to a usable format (CSV), and connecting it to Power BI marks the initial steps in
the data analytics journey. This ensures that the dataset is ready for exploration, transformation,
and visualization within the Power BI environment, setting the stage for comprehensive HR
analytics.
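
For readers who prefer code, the extraction step described above can be sketched in Python. This is only an illustrative equivalent and was not part of the Power BI workflow; the archive member name (hr_attrition.csv) and the three sample columns are assumptions made so that the sketch runs without the real download.

```python
import csv
import io
import zipfile

# Build a tiny in-memory archive so the sketch runs without the real file.
# The member name and the sample columns are illustrative assumptions.
sample_csv = "Age,Attrition,Department\n41,Yes,Sales\n49,No,Research & Development\n"
archive = io.BytesIO()
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("hr_attrition.csv", sample_csv)
archive.seek(0)


def load_first_csv(zip_file):
    """Return the rows of the first CSV member of a zip archive as dicts."""
    with zipfile.ZipFile(zip_file) as zf:
        csv_names = [n for n in zf.namelist() if n.lower().endswith(".csv")]
        if not csv_names:
            raise ValueError("archive contains no CSV file")
        with zf.open(csv_names[0]) as raw:
            # utf-8-sig also strips a byte-order mark if the file has one.
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            return list(csv.DictReader(text))


rows = load_first_csv(archive)
print(len(rows), "rows; columns:", list(rows[0].keys()))
```

With a real download, a path to the zip file on disk could be passed to load_first_csv in place of the in-memory archive.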

Step 2
Data transformation

Data Transformation
Data transformation refers to the process of converting raw data from its original format into a
format that is suitable for analysis or other downstream tasks. It involves restructuring,
modifying, or aggregating data to make it more usable and meaningful. Data transformation may
include tasks such as merging datasets, handling missing values, converting data types, creating
new variables, and scaling numerical values. The goal is to prepare the data in a way that makes
it easier to extract valuable insights or feed into machine learning models.

Why Is Data Transformation Important?

➢ Normalization:
Transforming data allows for normalization, ensuring that variables are on a similar scale. This is
important for algorithms that are sensitive to the magnitude of different features.
➢ Feature Engineering:
Data transformation enables the creation of new features or variables that might be more
informative for analysis or modeling. This process enhances the quality of input features.
➢ Consolidation:
It involves merging or consolidating data from multiple sources to create a comprehensive
dataset. This is crucial for a holistic analysis and avoiding data silos.
➢ Handling Missing Data:
Transformation techniques can be applied to handle missing data, such as imputing values based
on certain criteria or removing instances with missing values.
➢ Data Type Conversion:
Converting data types ensures that variables are represented in a format suitable for analysis. For
example, transforming a date string into a datetime format.
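
Two of the transformations listed above, normalization and data type conversion, can be illustrated with a short Python sketch. The sample ages and the date string are hypothetical values, and min-max scaling is only one of several common normalization methods.

```python
from datetime import datetime


def min_max_scale(values):
    """Rescale numbers to the range [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


ages = [20, 30, 60]              # hypothetical sample values
scaled = min_max_scale(ages)     # every value now lies between 0 and 1

# Data type conversion: a date stored as text becomes a datetime object.
hired = datetime.strptime("2024-01-25", "%Y-%m-%d")
print(scaled, hired.year)
```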

Data Cleaning:
Data cleaning, also known as data cleansing or data scrubbing, is the process of identifying and
correcting errors, inconsistencies, and inaccuracies in datasets. It involves tasks such as handling
missing values, removing duplicates, correcting typos, and addressing outliers.

Why Is Data Cleaning Important?

➢ Accuracy and Reliability:
Clean data ensures the accuracy and reliability of analyses. Inaccurate or inconsistent data can
lead to incorrect conclusions and decisions.
➢ Model Performance:
For machine learning models, the quality of training data significantly impacts model
performance. Clean data helps in building more accurate and robust models.
➢ Consistency:
Data cleaning ensures consistency in the representation of variables. Inconsistent data can lead to
confusion and misinterpretation of results.
➢ Compliance:
Clean data is often necessary for regulatory compliance, ensuring that the data used in analyses
or reports meets certain standards and requirements.
➢ Improved Insights:
Data cleaning enhances the quality of insights derived from the data. It reduces noise, allowing
analysts and data scientists to focus on meaningful patterns and trends.

Both data transformation and data cleaning are essential steps in the data preprocessing pipeline.
They contribute to the overall data quality, making the data more suitable for analysis, reporting,
and modeling purposes. Clean and well-transformed data sets the foundation for extracting
meaningful insights and making informed decisions.
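
As a minimal illustration of two cleaning tasks mentioned above (removing duplicates and fixing inconsistent text), the following Python sketch may help. The field names and sample records are assumptions for demonstration, and keeping the first record for each key is one possible policy among several.

```python
def clean_rows(rows, key):
    """Trim stray whitespace in text fields and drop records with a repeated key."""
    seen = set()
    cleaned = []
    for row in rows:
        if row[key] in seen:
            continue  # keep only the first record for each key
        seen.add(row[key])
        cleaned.append(
            {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}
        )
    return cleaned


raw = [
    {"EmployeeNumber": 1, "Department": " Sales "},
    {"EmployeeNumber": 1, "Department": "Sales"},  # duplicate record
    {"EmployeeNumber": 2, "Department": "Human Resources"},
]
cleaned = clean_rows(raw, key="EmployeeNumber")
print(cleaned)
```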

Data transformation and cleaning within our dataset:
➢ Removing Unnecessary Columns:
In the initial exploration of the Kaggle dataset, we identified columns such as relationship status
and relationship satisfaction that were not aligned with the specific analytical goals of our
project. To streamline the dataset and focus on key factors influencing attrition, we utilized
Power Query's 'Remove Columns' functionality. This step ensures that only relevant information
is retained for subsequent analysis.
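
The effect of the 'Remove Columns' step can be sketched in Python as follows. This is an illustrative equivalent rather than the Power Query step itself, and the column names and sample rows are assumptions.

```python
def remove_columns(rows, unwanted):
    """Return copies of the rows without the named columns."""
    unwanted = set(unwanted)
    return [{k: v for k, v in row.items() if k not in unwanted} for row in rows]


rows = [
    {"EmployeeNumber": 1, "Department": "Sales", "RelationshipSatisfaction": 1},
    {"EmployeeNumber": 2, "Department": "Research & Development", "RelationshipSatisfaction": 4},
]
trimmed = remove_columns(rows, ["RelationshipSatisfaction"])
print(trimmed)
```

The original rows are left unchanged, so the removal can be reviewed before the trimmed copy is used.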

➢ Handling Blank Cells
Addressing missing or null values is crucial for ensuring the integrity and completeness of the
dataset. Using Power Query, we applied the 'Replace' and 'Fill' functions to deal with empty
cells. For instance, if some employees had not provided salary information (resulting in null or 0
values), we replaced these instances with meaningful values or filled them down based on the
context of the data. This process ensures a consistent and reliable dataset for analysis.
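
The behaviour of 'Replace' and 'Fill Down' can be approximated in Python as shown below. This is a sketch under the assumption that blanks appear as None or empty strings; the sample values are hypothetical.

```python
def replace_blanks(values, replacement):
    """Replace None or empty-string entries with a fixed value."""
    return [replacement if v in (None, "") else v for v in values]


def fill_down(values):
    """Copy the most recent non-blank value into each blank entry below it."""
    filled, last = [], None
    for v in values:
        if v in (None, ""):
            v = last  # stays None if no earlier value exists
        else:
            last = v
        filled.append(v)
    return filled


departments = replace_blanks(["Sales", "", None], "Unknown")
incomes = fill_down([5000, None, "", 6200, None])
print(departments, incomes)
```

Whether a blank should be replaced with a fixed value or filled down depends on the column; filling down only makes sense when neighbouring rows are related.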

➢ Converting Numerical Ratings to Qualitative Measures

Some columns, like job satisfaction and environment satisfaction, used numerical ratings
(1, 2, 3, ...) that required transformation into qualitative measures for easier interpretation.
Using Power Query's Conditional Column feature, we assigned a descriptive scale to the
numerical ratings, categorizing each rating on a scale from "VERY BAD" to "VERY
GOOD." This not only simplifies the analysis but also makes the data easier to understand
for end-users and stakeholders.
• 1 (VERY BAD)
• 2 (BAD)
• 3 (SATISFACTORY)
• 4 (GOOD)
• 5 (VERY GOOD)
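
The mapping listed above can also be expressed as a small lookup in Python. This is an illustrative sketch, not the Power Query step itself; the "UNKNOWN" fallback for values outside 1 to 5 is an assumption added for robustness.

```python
RATING_LABELS = {
    1: "VERY BAD",
    2: "BAD",
    3: "SATISFACTORY",
    4: "GOOD",
    5: "VERY GOOD",
}


def label_rating(value):
    """Map a numeric rating (int or numeric string) to its qualitative label."""
    try:
        return RATING_LABELS.get(int(value), "UNKNOWN")
    except (TypeError, ValueError):
        return "UNKNOWN"  # assumption: blanks and non-numeric text are flagged


labels = [label_rating(v) for v in [1, "3", 5, None, 9]]
print(labels)
```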
The screenshot illustrates the transformation further:

BEFORE:

AFTER:

In conclusion, the data transformation and cleaning process using Power Query in Power BI was
pivotal for refining the dataset, eliminating irrelevant information, addressing missing values, and
converting numerical ratings into more meaningful qualitative measures. This meticulous
preparation lays the foundation for accurate analysis and insightful visualizations, aligning the
data with the specific requirements of the HR analytics project.

STEP 3
DATA VISUALIZATION

Power BI is a powerful business intelligence tool that allows organizations to transform raw data
into meaningful insights. Its user-friendly interface and robust analytical capabilities make it an
ideal choice for creating interactive and visually compelling dashboards. In the presented Power
BI dashboard for HR analytics, we harnessed the tool's capabilities to visualize key performance
indicators (KPIs) such as total employees, attrition count, attrition rate, active employees, and
average age. Through pie charts, stacked column charts, line charts, stacked bar charts, and
matrices, we visually represented critical HR metrics such as department-wise attrition, age
group distribution, environment satisfaction trends, attrition based on education fields, and job
satisfaction across different departments. By incorporating filters for education, department, and
job role, we enhanced the dashboard's interactivity, allowing users to dynamically explore and
analyze specific aspects of the workforce. The utilization of Power BI facilitated a
comprehensive and intuitive representation of HR analytics data, empowering stakeholders to
make informed decisions and optimize human resource management strategies.

In the HR analytics dashboard, we created KPIs, filters, and visuals accordingly.

KPIs
KPIs are critical metrics that gauge the performance of an organization in achieving its
objectives. In HR analytics, KPIs like attrition rate, total employees, and average age provide a
holistic view of workforce health.
Importance: KPIs serve as benchmarks, offering clear insights into trends, successes, and areas
that require attention. They guide strategic decision-making, enabling organizations to align their
efforts with overarching goals.
THE KPIs OF OUR DATASET ARE AS FOLLOWS:

➢ Total Employees
This KPI provides a baseline understanding of the organization's workforce size. Monitoring
changes in the total number of employees over time helps in assessing the overall growth or
contraction of the company.
➢ Attrition Count
The attrition count KPI is crucial for identifying the number of employees leaving the company.
It is a key metric for HR to track turnover, measure the effectiveness of retention strategies, and
assess the impact on workforce planning.
➢ Attrition Rate
Calculating attrition as a percentage of the total workforce standardizes the measure, making it
easier to compare attrition rates across different time periods or departments. A high attrition
rate may indicate potential issues in employee satisfaction, work environment, or other factors.
➢ Active Employees
Knowing the number of active employees at any given time provides a real-time snapshot of the
workforce. This KPI is essential for day-to-day operational management, ensuring that there are
enough employees to meet business demands.
➢ Average Age
The average age of the workforce is valuable for demographic analysis. It aids in succession
planning, identifying potential skill gaps, and tailoring benefits or training programs to different
age groups.
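
The five KPIs above can be computed from employee records with a few lines of Python. This sketch is illustrative and does not reproduce the dashboard's measures; it assumes an Attrition field holding "Yes" or "No", an Age field holding numbers, and that active employees equal total employees minus the attrition count.

```python
from statistics import mean


def hr_kpis(employees):
    """Compute headline HR KPIs from a list of employee records."""
    total = len(employees)
    attrition_count = sum(1 for e in employees if e["Attrition"] == "Yes")
    return {
        "total_employees": total,
        "attrition_count": attrition_count,
        # Attrition as a percentage of the total workforce.
        "attrition_rate_pct": round(100 * attrition_count / total, 2) if total else 0.0,
        # Assumption: everyone who has not left counts as active.
        "active_employees": total - attrition_count,
        "average_age": round(mean(e["Age"] for e in employees), 1) if total else None,
    }


employees = [
    {"Age": 41, "Attrition": "Yes"},
    {"Age": 49, "Attrition": "No"},
    {"Age": 37, "Attrition": "Yes"},
    {"Age": 33, "Attrition": "No"},
]
kpis = hr_kpis(employees)
print(kpis)
```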

FILTERS
Filters allow users to interactively slice and dice data, focusing on specific subsets. In HR
analytics, filters based on education, department, and job role empower users to tailor their
analysis to particular segments of the workforce.
Importance: Filters enhance the flexibility and depth of analysis. They enable personalized
exploration, facilitating the identification of trends or issues within specific categories and
supporting targeted decision-making.
THE FILTERS OF OUR DATASET ARE AS FOLLOWS:

➢ Education
The education filter enables a deeper analysis based on the educational background of
employees. This filter helps in understanding if education is a factor influencing
attrition, job satisfaction, or performance in specific roles.

➢ Department
The department filter allows for a focused analysis on specific organizational units.
Identifying departments with higher attrition rates or lower job satisfaction
provides insights for targeted interventions, training, or restructuring.

➢ Job Role
The job role filter allows users to narrow down their analysis to specific roles within
the organization. This is essential for understanding whether certain job roles are
more susceptible to attrition or dissatisfaction, guiding role-specific interventions or
adjustments.
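
Conceptually, each filter narrows the records that feed every KPI and visual. A minimal Python sketch of that behaviour follows; the field names (Department, JobRole, EducationField) and the sample rows are assumptions, and treating None as "no selection" is a design choice of this sketch.

```python
def apply_filters(employees, **criteria):
    """Keep employees whose fields match every selected filter value.

    A criterion set to None means 'no selection' and is ignored.
    """
    selected = {k: v for k, v in criteria.items() if v is not None}
    return [e for e in employees if all(e.get(k) == v for k, v in selected.items())]


employees = [
    {"Department": "Sales", "JobRole": "Sales Executive", "EducationField": "Marketing"},
    {"Department": "Sales", "JobRole": "Manager", "EducationField": "Life Sciences"},
    {"Department": "Research & Development", "JobRole": "Research Scientist",
     "EducationField": "Life Sciences"},
]
sales = apply_filters(employees, Department="Sales")
sales_life = apply_filters(employees, Department="Sales", EducationField="Life Sciences")
everyone = apply_filters(employees, Department=None)
print(len(sales), len(sales_life), len(everyone))
```

Combining several criteria mirrors selecting values in more than one slicer at once: only records matching all selections remain.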

Each KPI and filter in our dashboard serves a unique purpose, collectively offering a
comprehensive view of the workforce. This detailed information helps HR professionals make
data-driven decisions, identify trends, and implement targeted strategies to enhance overall
employee satisfaction and retention.

VISUALS
Visuals and graphs transform raw data into compelling and easy-to-understand representations.
They include pie charts, bar graphs, line charts, treemaps, matrices, and donut charts, providing a
visual narrative for complex datasets.
Importance: Visuals are crucial for conveying insights quickly and intuitively. They aid in
pattern recognition, trend identification, and storytelling. Visualizations make data accessible to a
broader audience, fostering better understanding and engagement.

➢ Department-wise Attrition Count (Pie Chart)

This visual presents a pie chart that categorizes attrition counts based on different departments
within the organization.

• Enables quick identification of departments with the highest and lowest attrition.
• Facilitates comparison of attrition rates among departments.
• Helps HR focus on targeted strategies for specific departments, addressing unique challenges they
might face.
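
The aggregation behind such a pie chart is a count of leavers per department, with each slice sized by its share of the total. A hedged Python sketch with hypothetical records:

```python
from collections import Counter


def attrition_by(employees, field):
    """Count employees with Attrition == 'Yes', grouped by the given field."""
    return Counter(e[field] for e in employees if e["Attrition"] == "Yes")


def shares(counts):
    """Convert counts to percentages of the total, as pie slices would show."""
    total = sum(counts.values())
    return {k: round(100 * v / total, 1) for k, v in counts.items()} if total else {}


employees = [
    {"Department": "Sales", "Attrition": "Yes"},
    {"Department": "Sales", "Attrition": "Yes"},
    {"Department": "Research & Development", "Attrition": "Yes"},
    {"Department": "Human Resources", "Attrition": "No"},
]
counts = attrition_by(employees, "Department")
slices = shares(counts)
print(counts, slices)
```

The same helper works for any categorical field, so grouping by a hypothetical EducationField column would give the counts behind an education-field chart.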

➢ Number of Employees by Age Group (Stacked Column Chart)
This visual is a stacked column chart illustrating the distribution of employees across various age groups.

• Provides a clear snapshot of the age demographics within the organization.
• Facilitates identification of age groups with a higher concentration of employees.
• Aids in understanding workforce age distribution for effective succession planning and tailored
training programs.
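
Grouping employees by age requires binning each age into a range first. The sketch below uses ten-year bands, which is an assumption; the report does not state which age boundaries the dashboard uses, and the sample ages are hypothetical.

```python
from collections import Counter


def age_group(age, width=10):
    """Label an age with its band, e.g. 34 falls in '30-39' for width 10."""
    start = (age // width) * width
    return f"{start}-{start + width - 1}"


ages = [22, 34, 38, 45, 51, 59]  # hypothetical sample ages
distribution = Counter(age_group(a) for a in ages)
print(sorted(distribution.items()))
```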

➢ Environment Satisfaction Trend (Line Chart)

This line chart tracks the trend of environment satisfaction over time and compares it to the total and
active employee count.

• Allows visualization of how changes in environment satisfaction correlate with workforce size.
• Helps HR understand the impact of environmental factors on employee retention.
• Provides insights into the effectiveness of initiatives aimed at improving work satisfaction.

➢ Attrition Count per Education Field (Stacked Bar Chart)
This visual is a stacked bar chart representing attrition counts categorized by different education fields.

• Identifies whether certain educational backgrounds are associated with higher attrition rates.
• Assists in tailoring recruitment strategies based on educational preferences.
• Helps HR focus on development programs that resonate with the educational diversity of the
workforce.

➢ Job Satisfaction Matrix (Matrix)

This matrix visualizes job satisfaction levels across different departments.

• Facilitates a quick comparison of job satisfaction levels in various departments.
• Identifies areas of strength and weakness in terms of job satisfaction.
• Guides HR in developing targeted interventions to improve job satisfaction in specific
departments.
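
A matrix of this kind is a cross-tabulation: departments as rows, satisfaction levels as columns, and a value in each cell. The Python sketch below assumes the cell value is an employee count and uses hypothetical records; the dashboard may aggregate differently.

```python
from collections import defaultdict


def satisfaction_matrix(employees):
    """Count employees for every (department, job satisfaction level) pair."""
    matrix = defaultdict(lambda: defaultdict(int))
    for e in employees:
        matrix[e["Department"]][e["JobSatisfaction"]] += 1
    # Convert to plain dicts so the result prints and compares cleanly.
    return {dept: dict(levels) for dept, levels in matrix.items()}


employees = [
    {"Department": "Sales", "JobSatisfaction": "GOOD"},
    {"Department": "Sales", "JobSatisfaction": "GOOD"},
    {"Department": "Sales", "JobSatisfaction": "BAD"},
    {"Department": "Human Resources", "JobSatisfaction": "VERY GOOD"},
]
matrix = satisfaction_matrix(employees)
print(matrix)
```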

Importance of Using Power BI for Visualization

1. Interactive Dashboards:
Power BI enables the creation of interactive and dynamic dashboards. Users can explore data
intuitively, adjusting filters and interacting with visuals to gain deeper insights. This interactivity
enhances the user experience and encourages data-driven exploration.

2. Data Connectivity:
Power BI allows seamless integration with various data sources, both on-premises and in the
cloud. This flexibility enables organizations to consolidate and analyze diverse datasets,
providing a comprehensive view of HR analytics.

3. User-Friendly Interface:
The user-friendly interface of Power BI makes it accessible to individuals with varying levels of
technical expertise. Its drag-and-drop functionality allows users to build complex visualizations
without extensive coding knowledge.

4. Real-time Updates:
Power BI can connect to real-time data sources, ensuring that dashboards are always up-to-date.
This is particularly crucial in HR analytics, where timely insights into workforce dynamics can
drive proactive decision-making.

5. Scalability:
Power BI is scalable, accommodating both small businesses and large enterprises. Its capabilities
make it suitable for handling diverse HR datasets, from basic metrics to advanced analytics,
ensuring flexibility as organizational needs evolve.

In summary, the combination of effective KPIs, filters, and visuals in HR analytics, when
leveraged through Power BI, enhances the interpretation and communication of data-driven
insights. The interactivity, connectivity, user-friendliness, and scalability of Power BI make it an
ideal platform for creating meaningful and actionable visualizations that support informed
decision-making in human resource management.
