
A THEORETICAL

GUIDE OF

UNIVERSAL
DATA
ANALYTICS
ALGORITHM

PRATYUSH PURI
Let’s Discuss More…
Contents
• Introduction

• Chapter 1: Importing Data

• Chapter 2: Data Overview and Inspection

• Chapter 3: Data Cleaning

• Chapter 4: Exploratory Data Analysis (EDA)

• Chapter 5: Data Visualization

• Chapter 6: Feature Engineering

• Chapter 7: Outlier Detection

• Chapter 8: Data Splitting

• Chapter 9: Model Selection (If Doing Machine Learning)

• Chapter 10: Insights & Reporting

• Chapter 11: Save Results


Introduction

Step 1: Importing Data


• Import the dataset into Python using Pandas (read_csv, read_excel, read_json,
etc.).

Step 2: Data Overview and Inspection


• Check the data’s shape, columns, and data types (df.shape, df.columns,
df.dtypes).
• View the head and tail of the data (df.head(), df.tail()) to understand its structure.

Step 3: Data Cleaning


• Check for missing values (df.isnull().sum()).
• Handle missing values: either fill them (df.fillna()) or drop them (df.dropna()).
• Remove duplicate rows (df.drop_duplicates()).
• Correct the data types (astype()).

Step 4: Exploratory Data Analysis (EDA)


• Generate descriptive statistics (df.describe()).
• Check value counts and unique values (df['column'].value_counts(),
df['column'].unique()).
• Create a correlation matrix (df.corr()).

Step 5: Data Visualization


• Create histograms, boxplots, and scatter plots using Matplotlib/Seaborn.
• For categorical data, create bar plots or count plots.
• Visualize correlations using a heatmap.

Step 6: Feature Engineering


• If needed, create new features or transform existing ones (log, scaling, encoding).
• Encode categorical variables (pd.get_dummies() or LabelEncoder).
Step 7: Outlier Detection
• Detect outliers (boxplot, IQR method).
• Remove or treat outliers if necessary.

Step 8: Data Splitting


• If performing predictive analysis, split the data into train and test sets
(train_test_split from scikit-learn).

Step 9: Model Selection (If Doing Machine Learning)


• Select the model based on the problem: regression, classification, clustering, etc.
• Train and evaluate the model (accuracy, confusion matrix, etc.).

Step 10: Insights & Reporting


• Summarize your findings.
• Create a report with visualizations and key metrics.

Step 11: Save Results


• Save cleaned data, models, or reports (to_csv, pickle, etc.).

Quick Recap Table


Chapter 1
Importing Data

1. Setting Up Python Environment and Libraries

a. First, open your Python environment (Jupyter Notebook, VS Code, or any IDE).
b. Import the essential libraries for data analysis:
i. Pandas (for data handling)
ii. NumPy (for numerical operations)
iii. Matplotlib and Seaborn (for visualization)
iv. If you encounter warnings, set up your environment to ignore them.

2. Identify the Data Source

a. Understand the data source: CSV, Excel, JSON, SQL database, or any other format.
b. Keep the file path or database connection details ready.

3. Best Methods for Importing Data

a. CSV File:
i. This is the most common format; use pd.read_csv() to import.
ii. Write the file path correctly (use double backslashes or forward slashes on
Windows).
iii. If the file does not have a header, use header=None.
iv. For large files, use the chunk size parameter to import data in chunks.
v. For encoding issues, use the encoding parameter (e.g., encoding='utf-8').
vi. To treat missing values specifically, use the na_values parameter.
b. Excel File:
i. Use pd.read_excel(), and you can specify the sheet name.
c. JSON File:
i. Import using pd.read_json().
d. SQL Database:
i. Create a connection using pyodbc or sqlalchemy, then use
pd.read_sql_query().
e. Other Formats (SAS, Stata, etc.):
i. Pandas provides functions like read_sas(), read_stata() for these formats.

4. Initial Checks After Importing Data

a. Immediately verify that the data has been imported correctly:


i. Use df.head() to view the top 5 rows.
ii. Use df.tail() to view the last 5 rows.
iii. Check the number of rows and columns with df.shape.
iv. Verify column names with df.columns.
v. Check data types with df.dtypes.

5. Advanced Tips for Data Import (Like a Pro Analyst)

a. If the file is very large, import a sample using the nrows parameter.
b. If the data is compressed (zip/gz), you can import it directly (e.g.,
pd.read_csv('file.csv.gz')).
c. To select specific columns, use the usecols parameter.
d. To set a column as the index, use the index_col parameter.
e. If there are comments or unnecessary rows in the data, use the comment or
skiprows parameter.
6. Documenting the Data Import Process

a. Comment the data import process in your notebook so other analysts can understand
where the data came from and how it was imported.
b. Mention the data source, version, and import date (to maintain data lineage).

Pro Tip:

Understand and use all available parameters during data import (header, index_col, usecols,
na_values, dtype, skiprows, nrows, encoding, etc.). This is what sets apart an average analyst
from the best.

Import Syntax for Each Format
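
A minimal sketch of the import calls discussed above (file paths, column names, and the
connection string are placeholders, not part of the original guide):

    import pandas as pd
    from sqlalchemy import create_engine

    # CSV: the most common case; tune parameters to the file
    df = pd.read_csv(
        "data/sales.csv",
        encoding="utf-8",
        na_values=["?", "--", "N/A"],
        usecols=["date", "region", "amount"],
        parse_dates=["date"],
    )

    # Excel: pick a specific sheet
    df_xl = pd.read_excel("data/sales.xlsx", sheet_name="2024")

    # JSON
    df_js = pd.read_json("data/sales.json")

    # SQL: the connection string is a placeholder
    engine = create_engine("sqlite:///data/sales.db")
    df_sql = pd.read_sql_query("SELECT * FROM sales", engine)

    # Very large CSV: read in chunks and concatenate
    chunks = pd.read_csv("data/big_file.csv", chunksize=100_000)
    df_big = pd.concat(chunks, ignore_index=True)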

Summary:

In Chapter 1, pay attention to every detail while importing data—file format, path, encoding, missing
values, columns, data types, and import parameters. Immediately verify after import that the data is
correct. Following all these steps will ensure your analysis is always professional and reliable.
Chapter 2
Data Overview and Inspection

1. Check DataFrame Shape and Size

• Use df.shape to get the count of rows and columns. This tells you how big the data is.
• Use df.size to find the total number of elements (rows × columns).

2. Understand DataFrame Structure

• Use df.head(n) to view the top n rows (default 5). This helps you understand the structure
and starting values of the data.
• Use df.tail(n) to view the last n rows, so you can catch end values and possible data entry
issues.
• Use df.sample(n) to look at random rows, ensuring you don’t miss any patterns in the data.

3. Inspect Columns, Index, and Data Types

• Use df.columns to check column names.


• Use df.index to see the structure of the index (default integer, or custom).
• Use df.dtypes to find out the data type of each column (int, float, object, bool, category,
datetime, etc.).
• If needed, use pd.set_option('display.max_columns', None) to display all columns at
once.

4. Get DataFrame Info

• Use df.info() to get data types, non-null counts, and memory usage for each column.
• This helps you identify missing values and get an idea of memory optimization.

5. Generate Descriptive Statistics

• Use df.describe() for numerical columns to get count, mean, std, min, max, and quartiles.
• Use df.describe(include='object') for a summary of categorical columns (unique, top,
freq).
• Use df.describe(include='all') for a summary of mixed data types.

6. Check for Missing Values

• Use df.isnull().sum() to find how many missing values each column has.
• Use df.isnull().any() to see which columns contain missing values.

7. View Unique Values and Value Counts

• Use df.nunique() to get the count of unique values in each column.


• Use df['col'].value_counts() to see unique values and their frequency for a specific
column.
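
A minimal inspection sketch covering the checks above (the file and column names are
placeholders):

    import pandas as pd

    df = pd.read_csv("data/sales.csv")

    print(df.shape)                        # rows x columns
    print(df.dtypes)                       # data type of each column
    df.info()                              # non-null counts and memory usage

    print(df.describe())                   # numeric summary
    print(df.describe(include="object"))   # categorical summary

    print(df.isnull().sum())               # missing values per column
    print(df.nunique())                    # unique values per column
    print(df["region"].value_counts())     # frequency of each category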

8. Data Quality Checks

• For numeric columns, check the range (e.g., are negative values allowed?).
• For categorical columns, check for inconsistent entries (e.g., 'Male', 'male', 'MALE').
• For date columns, check if the format is consistent, possibly using regex.

9. Logical Consistency Checks

• Check cross-column dependencies (e.g., the number of bedrooms should not exceed the total number of rooms).
• Check for duplicates: df.duplicated().sum().
10. Visually Inspect the DataFrame

• View the transposed version of the DataFrame (df.T.head()); sometimes seeing columns as rows is helpful.
• If the DataFrame is very large, check the memory footprint
with df.memory_usage(deep=True).

Pro Tips (Like the Best Analysts)

• Always focus on data types and missing values, as these can cause errors in analysis and
modeling.
• Use value_counts() on categorical data to spot rare categories or spelling mistakes.
• Check logical consistency (cross-column rules), which might be missed in normal inspection.
• Save the output of DataFrame info and describe in your notebook for future reference.

Quick Checklist Table


Summary:

In Chapter 2, inspect the data from every angle—structure, types, missing values, unique values,
logical consistency, and data quality. Doing all these checks will make your analysis professional,
reliable, and error-free, just like the best data analysts.
Chapter 3
Data Cleaning

1. Preserve Raw Data

• Always save a separate copy of the raw/original data. Never overwrite it, so you can easily
revert if needed.

2. Remove Unwanted Columns and Rows

• Remove columns/rows not needed for analysis, like IDs, irrelevant logs, or placeholder
columns.
• Use df.drop(columns=['col1', 'col2']) or df = df[df['col'] !=
'unwanted_value'].

3. Handle Missing Values

• Identify missing values using df.isnull().sum() or df.isna().sum().


• If there are few missing values, drop those rows: df.dropna().
• If there are many, fill them:
• For numerical columns: fill with mean/median/mode,
e.g., df['col'].fillna(df['col'].mean()).
• For categorical columns: fill with mode or 'Unknown'.
• You can also use domain-specific logic (like forward fill or backward fill).
• Sometimes, analyze the pattern of missing values—it could itself be an insight.
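
A short sketch of the filling strategies above, assuming hypothetical column names:

    # Numerical column: the median is more robust to outliers than the mean
    df["price"] = df["price"].fillna(df["price"].median())

    # Categorical column: fill with the mode or an explicit 'Unknown' label
    df["city"] = df["city"].fillna("Unknown")

    # Time-ordered data: forward fill, then backward fill any leading gaps
    df["temperature"] = df["temperature"].ffill().bfill()

    # Drop rows only where a critical value is missing
    df = df.dropna(subset=["target"])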

4. Detect and Remove Duplicates


• Duplicate rows can distort your analysis.
• Detect duplicates: df.duplicated().sum().
• Remove duplicates: df.drop_duplicates().
• If needed, merge/aggregate duplicates (e.g., sum, mean).

5. Identify and Correct Wrong/Invalid Data

• Detect out-of-range values, impossible entries (like negative age, future date).
• Standardize values, e.g., convert all 'Male', 'male', 'MALE' to 'male'.
• Convert date/time columns to a uniform format: pd.to_datetime(df['date_col'],
errors='coerce').
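
A brief sketch of these corrections (column names are placeholders):

    # Standardize inconsistent categorical labels
    df["gender"] = df["gender"].str.strip().str.lower()   # 'Male', ' MALE ' -> 'male'

    # Mark impossible values as missing instead of keeping them silently
    df.loc[df["age"] < 0, "age"] = pd.NA

    # Parse dates uniformly; unparseable entries become NaT
    df["order_date"] = pd.to_datetime(df["order_date"], errors="coerce")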

6. Ensure Data Type Consistency

• Check if every column has the correct data type: df.dtypes.


• If not, convert: df['col'] = df['col'].astype('int') or use pd.to_datetime().
• Convert categorical columns to 'category' type for efficiency.

7. Clean String Data

• Remove extra spaces, special characters, and inconsistent casing.


• Use .str.strip(), .str.lower(), and .str.replace().
• Collapse multiple spaces into one and remove special characters.

8. Detect and Treat Outliers

• Identify outliers using boxplot, IQR, or z-score methods.


• Remove, cap, or impute outliers based on domain knowledge.
9. Standardize Inconsistent Data

• Standardize spelling mistakes, abbreviations, or inconsistent labels in categorical values.


• Use a mapping dictionary to replace inconsistent values.

10. Logical Consistency Checks

• Apply cross-column rules (e.g., start_date should not be after end_date, and the number of bedrooms should not exceed the total number of rooms).
• Fix or flag logical errors.

11. Handle Special Values

• Treat special symbols (like '?', '--', 'N/A') as missing values using the na_values parameter
or .replace().

12. Clean and Standardize Column Names

• Fix spaces, special characters, and inconsistent casing in column names.


• Use: df.columns = df.columns.str.strip().str.lower().str.replace(' ', '_').

13. Make Data Cleaning Modular (Reusable Functions/Pipeline)

• Create a function for each cleaning step for reuse.


• Build an automated cleaning pipeline where each step is modular and logs are maintained.
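
One possible shape for such a pipeline, sketched with pandas' pipe (raw_df and the individual
steps are illustrative, not prescribed by the guide):

    import pandas as pd

    def standardize_columns(df: pd.DataFrame) -> pd.DataFrame:
        df = df.copy()
        df.columns = df.columns.str.strip().str.lower().str.replace(" ", "_")
        return df

    def drop_duplicate_rows(df: pd.DataFrame) -> pd.DataFrame:
        return df.drop_duplicates()

    def fill_numeric_missing(df: pd.DataFrame) -> pd.DataFrame:
        df = df.copy()
        for col in df.select_dtypes(include="number").columns:
            df[col] = df[col].fillna(df[col].median())
        return df

    # Chain the steps so the raw DataFrame is never overwritten
    clean_df = (
        raw_df
        .pipe(standardize_columns)
        .pipe(drop_duplicate_rows)
        .pipe(fill_numeric_missing)
    )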

14. Document the Cleaning Process

• Write comments or maintain a cleaning log for every cleaning step.


• Record what was changed, why, and when.

15. Inspect Data Again After Cleaning

• Use df.info(), df.describe(), and visual checks to ensure that after cleaning, the data is
correct and no wrong bias has been introduced.

Quick Checklist Table


Pro Tips (Expert Level)

• Always re-inspect the data after every cleaning step to avoid unintended consequences.
• Build automated cleaning pipelines to save time and reduce errors in large or repeatable
projects.
• After cleaning, check the data’s distribution, mean, median, std, and unique values again.
• Make cleaning functions reusable and well-documented for team sharing.

Summary:

In Chapter 3, an expert data analyst performs every possible data cleaning activity—handling missing
values, duplicates, invalid entries, outliers, string/text issues, data types, logical consistency, and
documentation. Every step should be modular, repeatable, and well-documented. Don't forget to re-inspect the data after cleaning so your analysis is always trustworthy, accurate, and professional.
Chapter 4
Exploratory Data Analysis (EDA)

1. Reconfirm the Objective of Analysis

• Before starting analysis, review your business or research objectives again. This ensures the
analysis stays focused and relevant.

2. Perform Descriptive Analysis

• Summarize the data using mean, median, mode, minimum, maximum, standard deviation,
percentiles, range, and count.
• For categorical columns, check value counts, frequency tables, and unique values.
• For numerical columns, examine distributions using histograms and boxplots.
• Note any outliers or anomalies.

3. Extensive Use of Data Visualization

• Univariate analysis: histograms, bar charts, pie charts, boxplots.


• Bivariate/multivariate analysis: scatter plots, pair plots, heatmaps, violin plots.
• For time series data: line plots and seasonal decomposition.
• For categorical vs numerical: boxplots, violin plots, swarm plots.
• Save visualizations along with insights in your notebook or report.

4. Detect Relationships, Patterns, and Associations

• Create a correlation matrix to identify strong or weak correlations between columns.


• Use scatter plots to visualize relationships.
• Perform group-by analysis with aggregations like mean, sum, and count.
• Build pivot tables for multidimensional summaries.
• Use cross-tabulation to explore categorical relationships.
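
A compact sketch of these techniques (column names are placeholders):

    import pandas as pd

    # Correlation matrix over numeric columns
    corr = df.corr(numeric_only=True)

    # Group-by aggregation
    region_summary = df.groupby("region")["sales"].agg(["mean", "sum", "count"])

    # Pivot table for a multidimensional summary
    pivot = pd.pivot_table(df, values="sales", index="region",
                           columns="product", aggfunc="mean")

    # Cross-tabulation of two categorical columns
    xt = pd.crosstab(df["region"], df["segment"])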

5. Apply Advanced Statistical Analysis

• Conduct hypothesis testing such as t-tests, chi-square tests, and ANOVA.


• Use inferential statistics like confidence intervals, p-values, and effect sizes.
• Perform regression analyses (linear, logistic, multivariate) to identify trends and make
predictions.
• Use clustering algorithms like K-means, hierarchical clustering, or DBSCAN for segmenting
data.
• Apply dimensionality reduction techniques like PCA or t-SNE for high-dimensional datasets.
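
A hedged sketch of a few of these analyses with scipy and scikit-learn (the segment and
sales columns are hypothetical):

    from scipy import stats
    from sklearn.cluster import KMeans
    from sklearn.decomposition import PCA

    # Hypothesis test: do two segments differ in mean sales?
    group_a = df.loc[df["segment"] == "A", "sales"]
    group_b = df.loc[df["segment"] == "B", "sales"]
    t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=False)

    # K-means segmentation on the numeric features
    num = df.select_dtypes(include="number").dropna()
    cluster_labels = KMeans(n_clusters=3, random_state=42, n_init=10).fit_predict(num)

    # PCA for dimensionality reduction
    components = PCA(n_components=2).fit_transform(num)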

6. Feature Engineering and Transformation

• Create new features (e.g., extract month, day, year from dates; calculate text length or
sentiment).
• Scale or normalize numerical features using StandardScaler or MinMaxScaler.
• Encode categorical features using Label Encoding or One-Hot Encoding.
• Apply binning, bucketing, or discretization where appropriate.

7. Data Segmentation and Subgroup Analysis

• Divide data into relevant segments (such as age groups, locations, product categories).
• Analyze each segment separately to gain granular insights.
• Segmentation helps uncover hidden trends not visible in aggregated data.
8. Detect Anomalies, Trends, and Seasonality

• Re-examine outliers and understand their impact.


• Identify trends, seasonal effects, and cyclic patterns in time series data.
• Use anomaly detection algorithms like Isolation Forest or Z-score methods.

9. Integrate Multiple Data Sources (If Applicable)

• Merge different data sources to enrich insights.


• Validate data consistency and join keys during integration.

10. Real-Time or Near-Real-Time Analysis (If Required)

• Use live dashboards or streaming analytics tools (e.g., Apache Kafka, Spark Streaming) to
meet business needs.

11. Clearly Document Insights

• Record every finding, pattern, relationship, and anomaly in your notebook or report.
• Include visualizations, tables, and key metrics.
• Note limitations, data quality issues, and assumptions.

12. Maintain an Iterative Approach

• Treat EDA as an iterative process: as new patterns emerge, re-inspect data, update
visualizations, and test new hypotheses.
Quick Checklist Table

Pro Tips (Expert Level)

• Always write insights alongside every visualization; just plotting graphs is not enough.
• Check statistical significance to ensure findings are reliable.
• Segment data thoroughly; valuable insights often lie in subgroups.
• Make EDA reproducible by maintaining clean code, comments, and outputs.
• Clearly mention limitations and data quality issues.
Summary:

In Chapter 4, an expert data analyst applies descriptive, inferential, statistical, and machine learning
analyses; visualizes every variable; detects relationships, patterns, and outliers; performs
segmentation; and documents all findings thoroughly. Keep EDA iterative and objective-driven so
insights are robust, actionable, and aligned with business or research goals.
Chapter 5
Data Visualization

1. Clarify the Objective of Visualization

• First, decide the purpose of the visualization: showing trends, explaining distributions,
making comparisons, highlighting correlations, or illustrating part-to-whole relationships.
• Understand your audience: are they technical or non-technical, business or research
focused?

2. Choose the Right Visualization Technique

• Univariate Analysis:
• Numerical: Histogram, boxplot, density plot.
• Categorical: Bar chart, pie chart, count plot.
• Bivariate/Multivariate Analysis:
• Numerical vs Numerical: Scatter plot, hexbin plot.
• Categorical vs Numerical: Boxplot, violin plot, swarm plot.
• Multiple variables: Pairplot, heatmap, correlation matrix.
• Time Series:
• Line plot, area chart, seasonal decomposition.
• Geographical Data:
• Map, choropleth map, symbol map.
• Part-to-Whole:
• Pie chart, donut chart, stacked bar chart.
• Ranking/Comparison:
• Bar chart, lollipop chart, dot plot.
• Network/Relationship:
• Network diagram, sankey diagram.
• Text Data:
• Word cloud, frequency bar chart.

3. Use Visualization Tools and Libraries

• Python: matplotlib, seaborn, plotly, altair.


• BI Tools: Power BI, Tableau (for interactive dashboards).
• Custom visuals: Use community or self-made visuals for special needs.

4. Prepare Data Before Visualizing

• Aggregate, filter, or transform data as needed (e.g., groupby, pivot, rolling averages).
• Treat outliers or missing values so the visualization is not misleading.
• Understand the scale and range of the variables you are plotting.
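
A minimal matplotlib/seaborn sketch of the basic chart types (column names are placeholders):

    import matplotlib.pyplot as plt
    import seaborn as sns

    # Distribution of a numeric column
    sns.histplot(df["sales"], bins=30)
    plt.title("Sales distribution")
    plt.show()

    # Categorical vs numerical comparison
    sns.boxplot(data=df, x="region", y="sales")
    plt.show()

    # Correlation heatmap
    sns.heatmap(df.corr(numeric_only=True), annot=True, cmap="coolwarm")
    plt.show()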

5. Visualization Design Best Practices

• Clarity:
• Always keep axis labels, titles, and legends clear and readable.
• Use accessible color palettes (colorblind-friendly, high contrast).
• Avoid unnecessary gridlines, ticks, and decorations.
• Consistency:
• Use the same color, scale, and units for the same variable across visuals.
• Annotation:
• Annotate important points, trends, or outliers.
• Sorting:
• Sort bar charts or rankings in a logical order.
• Interactivity:
• Add filters, slicers, and drill-downs in dashboards so users can explore data.
6. Combine Multiple Visualizations

• Build dashboards or storyboards to show one insight from different angles.


• Use linked visuals: selection in one chart filters another (as in Power BI/Tableau).

7. Enable Hierarchies and Drill-Downs

• Create date, geography, or product hierarchies to allow users to move from high-level to
detailed views.

8. Test and Refine Visualizations

• Show visuals to colleagues or stakeholders for feedback.


• Refine based on clarity, accuracy, and impact.

9. Document and Share Visualizations

• Write a short description or insight with each visualization.


• Share visuals as notebooks, PDFs, dashboards, or interactive web apps.

10. Focus on Data Storytelling

• Arrange visualizations in a sequence that tells a coherent story.


• Each visualization should answer a specific question or convey a key message.
Visualization Techniques Quick Table

Pro Tips (Expert Level)

• The purpose of visualization is to communicate insights, not just for decoration.


• Design every chart for your audience’s level.
• Interactive dashboards empower stakeholders to explore data themselves.
• Always consider accessibility in color and design (colorblind-friendly).
• Avoid misleading scales, truncated axes, or unnecessary 3D effects.

Summary:

In Chapter 5, an expert data analyst selects the right visualization technique, prepares data, follows
design best practices, builds interactive and multi-angle dashboards, documents each visualization,
and uses a story-driven approach. The goal of visualization is to convert complex data into simple,
clear, and actionable insights so that decision-making is fast and effective.
Chapter 6
Feature Engineering

1. Deeply Understand Data and Domain

• Grasp the business context, meaning, and importance of every feature.


• Consult domain experts or read documentation to ensure feature creation is relevant.

2. Handle Missing Values

• For numerical features: fill with mean, median, mode, interpolation, or a domain-specific
value.
• For categorical features: fill with mode, 'Unknown', or predictive imputation.
• Advanced: create a missing indicator feature (e.g., is_missing flag).

3. Detect and Treat Outliers

• Detect outliers using boxplot, IQR, z-score, or visualization.


• Remove, cap, or transform outliers (e.g., log transform for skewed data).

4. Feature Scaling and Normalization

• Apply standardization (mean=0, std=1) or normalization (min-max scaling), especially for distance-based algorithms (KNN, SVM, Neural Networks).
• Use robust scaling (median/IQR) for data with many outliers.
5. Encode Categorical Features

• Use label encoding for ordinal data.


• Use one-hot encoding for nominal data.
• Use frequency or target encoding for advanced cases.
• Group rare categories as 'Other'.

6. Feature Creation – Build New Features

• Interaction Features: Product, ratio, or difference of two or more features (e.g., price ×
quantity = revenue).
• Polynomial Features: Square, cube, etc. of features (e.g., x, x², x³).
• Temporal Features: Extract year, month, day, weekday, or time-delta from dates.
• Aggregated Features: Use groupby to get mean, sum, count, min, max, std, etc. (e.g., total
purchases per customer).
• Text Features: Text length, word count, sentiment score, TF-IDF, embeddings.
• Domain-Specific Features: Create new features based on business logic or expert
knowledge.
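
A short sketch of a few of these constructions (all column names are hypothetical):

    import pandas as pd

    # Interaction feature
    df["revenue"] = df["price"] * df["quantity"]

    # Temporal features from a datetime column
    df["order_date"] = pd.to_datetime(df["order_date"])
    df["order_month"] = df["order_date"].dt.month
    df["order_weekday"] = df["order_date"].dt.weekday

    # Aggregated feature: total revenue per customer, broadcast back to each row
    df["customer_total_revenue"] = df.groupby("customer_id")["revenue"].transform("sum")

    # Simple text feature
    df["comment_length"] = df["comment"].str.len()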

7. Feature Transformation

• Apply log, square root, or Box-Cox transformations for skewed distributions.


• Use binning/discretization to convert continuous features into bins (e.g., age groups).
• Use feature extraction methods like PCA, t-SNE, or autoencoders for dimensionality
reduction.

8. Feature Selection – Choose Relevant Features

• Filter Methods: Correlation, chi-square, ANOVA, mutual information.


• Wrapper Methods: Recursive feature elimination (RFE), forward/backward selection.
• Embedded Methods: Model-based selection (feature importance from tree models, LASSO).
• Remove redundant, irrelevant, or highly correlated features to avoid multicollinearity.
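
One way to sketch a filter plus a wrapper method (X and y are assumed to be a prepared
feature matrix and target):

    import numpy as np
    from sklearn.feature_selection import RFE
    from sklearn.linear_model import LogisticRegression

    # Filter: drop one of each highly correlated pair (the 0.9 threshold is a judgment call)
    corr = X.corr().abs()
    upper = corr.where(np.triu(np.ones(corr.shape), k=1).astype(bool))
    to_drop = [c for c in upper.columns if (upper[c] > 0.9).any()]
    X_reduced = X.drop(columns=to_drop)

    # Wrapper: recursive feature elimination with a simple base model
    rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=10)
    rfe.fit(X_reduced, y)
    selected = X_reduced.columns[rfe.support_]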
9. Feature Benchmarking

• Test the impact of every new feature or selection on the model (cross-validation, A/B
testing).
• Use an iterative approach: add/remove features and evaluate model performance.

10. Balance Interpretability and Simplicity

• Complex features can improve accuracy, but may reduce interpretability.


• For business use-cases, prefer explainable features.

11. Automate the Feature Engineering Pipeline

• Wrap all steps in functions or classes.


• Use pipelines (e.g., scikit-learn’s Pipeline) for repeatability and reproducibility.

12. Documentation and Versioning

• Document the logic, source, and transformation of every feature.


• Maintain version control for feature sets for future reference.

Pro Tips (Expert Level)

• The combination of creativity and domain knowledge is the most powerful tool in feature engineering.
• Always test the impact of every new feature on the model to avoid overfitting.
• Use dimensionality reduction (PCA, autoencoders) to make high-dimensional data
manageable.
• After feature selection, check model interpretability and business explainability.
• Feature engineering is an iterative process—refine features as new patterns emerge.
Feature Engineering Techniques

Summary:

In Chapter 6, an expert data analyst transforms, creates, selects, and optimizes raw data—handling
missing values, outliers, scaling, encoding, interaction/polynomial/temporal features, feature
selection, and automation—all with documentation and benchmarking. Feature engineering is the
real secret to model accuracy, robustness, and explainability, so use creativity, logic, and domain
knowledge at every step.
Chapter 7
Outlier Detection

1. Understand the Objective of Outlier Detection

• First, decide why you are detecting outliers: data cleaning, anomaly detection, fraud
detection, or rare event analysis.
• Use business context and domain knowledge to correctly interpret unusual points.

2. Visualize Data (Initial Inspection)

• Boxplot: Shows outliers as points outside the whiskers.


• Histogram/Density Plot: Check the shape and tails of the distribution.
• Scatter Plot: Identify bivariate or multivariate outliers.
• Pairplot/Heatmap: Spot outlier patterns in high-dimensional data.

3. Univariate Outlier Detection Techniques

• IQR (Interquartile Range) Method:


• Calculate Q1 (25th percentile) and Q3 (75th percentile).
• IQR = Q3 - Q1
• Lower bound = Q1 - 1.5 × IQR
• Upper bound = Q3 + 1.5 × IQR
• Values outside these bounds are outliers.
• Z-Score Method:
• Z = (value - mean) / standard deviation
• Points with Z-score > 3 or < -3 are considered outliers.
• Best for Gaussian (normal) distributions.
• Standard Deviation Method:
• Points more than 2 or 3 standard deviations from the mean are outliers.
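
A small sketch of the IQR and Z-score rules above (the sales column is a placeholder):

    import numpy as np

    # IQR method
    q1, q3 = df["sales"].quantile([0.25, 0.75])
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    iqr_outliers = df[(df["sales"] < lower) | (df["sales"] > upper)]

    # Z-score method (best for roughly normal distributions)
    z = (df["sales"] - df["sales"].mean()) / df["sales"].std()
    z_outliers = df[np.abs(z) > 3]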

4. Multivariate Outlier Detection Techniques

• Mahalanobis Distance:
• Considers correlations between multiple variables.
• Multivariate Analysis (MVA):
• Analyze multiple columns together to detect outliers.
• Pairwise Scatterplots:
• Visually identify outliers in multivariate data.

5. Proximity & Density-Based Methods

• k-Nearest Neighbors (k-NN):


• Points far from their neighbors may be outliers.
• DBSCAN Clustering:
• Points in low-density regions are marked as outliers.
• Local Outlier Factor (LOF):
• Assigns an outlier score based on local density.

6. Machine Learning-Based Methods

• Isolation Forest:
• Isolates data points through random splits; easily isolated points are outliers.
• One-Class SVM:
• Identifies outliers by treating normal data as one class.
• Autoencoders (Deep Learning):
• Detect outliers in high-dimensional data using reconstruction error.
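
A hedged Isolation Forest sketch (the contamination rate is an assumption to tune):

    from sklearn.ensemble import IsolationForest

    num = df.select_dtypes(include="number").dropna()

    iso = IsolationForest(contamination=0.01, random_state=42)
    labels = iso.fit_predict(num)            # -1 = outlier, 1 = normal

    flagged = num.assign(is_outlier=(labels == -1))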

7. Outlier Handling Strategies


• Remove Outliers:
• Remove if they are data entry errors or not justified by business logic.
• Cap/Winsorize Outliers:
• Cap extreme values at a threshold (e.g., 5th and 95th percentiles).
• Impute Outliers:
• Replace outlier values with mean/median/mode, similar to missing value treatment.
• Flag Outliers:
• Flag outlier points in a new column for tracking in downstream analysis.
• Business Review:
• Consult domain experts before removing valuable or rare event outliers.

8. Re-Visualize and Validate Outlier Detection Results

• After removing/capping/flagging outliers, re-plot the data distribution.


• Use summary statistics, boxplots, histograms, and scatter plots to ensure the data is now
balanced and meaningful.

9. Documentation and Transparency

• Document which outlier detection technique was used, what threshold was set, how many
points were detected, and what action was taken.
• Note the impact of outlier removal/capping on the analysis.

10. Maintain an Iterative Approach

• Outlier detection is not a one-time task; check again after each new feature or
transformation.
• Experiment with different techniques (statistical, clustering, ML-based) and choose the best
approach.
Outlier Detection Techniques

Pro Tips (Expert Level)

• Context is most important in outlier detection—sometimes rare but valid business cases may
look like outliers.
• Combine multiple techniques (visual + statistical + ML) for robust detection.
• Always check data distribution and model performance after handling outliers.
• Document and validate with business experts to avoid bias from outlier removal.

Summary:

In Chapter 7, an expert data analyst detects outliers from every angle—visualization, IQR, Z-score,
clustering, ML-based, domain logic—and handles each outlier according to context (remove, cap,
impute, flag, or business review). Every step should be transparent, iterative, and well-documented
to ensure the analysis is accurate, fair, and business-relevant.
Chapter 8
Data Splitting

1. Understand the Objective of Data Splitting

• The main goal of data splitting is to evaluate your model in an unbiased way and prevent
overfitting.
• The training set teaches the model, the validation set is for hyperparameter tuning and
model selection, and the test set evaluates real-world model performance.

2. Plan the Data Splitting Approach

• Decide how many splits you need:


• For simple ML tasks: Training + Testing (2-way split)
• For advanced ML/Deep Learning: Training + Validation + Testing (3-way split)
• For large/complex projects: Cross-validation (K-Fold, Stratified K-Fold,
TimeSeriesSplit)

3. Prepare the Data (Features & Target)

• Separate your data into “Features” (X) and “Target” (y).


• Ensure the data is clean, consistent, and shuffled (unless it’s time series data).

4. Choose the Splitting Strategy

• Random Splitting:
• Randomly split the data (e.g., 70-80% training, 20-30% testing).
• Simple and effective when data is large and balanced.
• Stratified Splitting:
• For imbalanced datasets, use stratified splitting to maintain class proportions in each
split.
• Use the stratify parameter in scikit-learn.
• Time-Based Splitting:
• For time series data, use earlier data for training and later data for testing.
• Use TimeSeriesSplit or custom logic.
• K-Fold Cross-Validation:
• Split data into K equal folds; each fold is used once as a test set, the rest as training.
• Stratified K-Fold is best for imbalanced classes.
• Custom Splitting:
• Use business or domain-specific logic (e.g., recent data for testing, older data for
training).

5. Decide Split Ratios

• Common ratios:
• 70% train, 30% test
• 80% train, 20% test
• 60% train, 20% validation, 20% test
• For K-Fold: K = 5 or 10 is commonly used.
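
A minimal split sketch combining a ratio, stratification, and a fixed seed (X and y are the
prepared features and target):

    from sklearn.model_selection import train_test_split

    X_train, X_test, y_train, y_test = train_test_split(
        X, y,
        test_size=0.2,       # 80% train / 20% test
        stratify=y,          # keep class proportions (classification targets only)
        random_state=42,     # fixed seed for reproducibility
    )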

6. Ensure Reproducibility

• For random splits, fix the random_state parameter to make results reproducible.
• Document the splitting process, code, parameters, and logic.

7. Prevent Data Leakage

• Perform data cleaning, feature engineering, scaling, or encoding only after splitting (fit only
on training, then apply to test/validation).
• Make sure the target variable or future information does not accidentally leak into training or
test sets.
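
A short sketch of leakage-safe preprocessing: fit on the training split only, then apply the
fitted transformer to the test split:

    from sklearn.preprocessing import StandardScaler

    scaler = StandardScaler()
    X_train_scaled = scaler.fit_transform(X_train)
    X_test_scaled = scaler.transform(X_test)   # no fitting on test data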

8. Check Distribution After Splitting

• Check the distribution of the target variable in each split (especially for stratified splits).
• Ensure splits are representative and not biased.

9. Use Advanced Techniques (If Needed)

• Nested Cross-Validation:
• For hyperparameter tuning and unbiased evaluation.
• Group K-Fold:
• When data has groups (e.g., patients, users), ensure each group is only in one split.
• Leave-One-Out (LOO):
• Each observation serves as a test set once (for small datasets).

10. Document the Splitting Process

• Clearly mention split ratios, strategy, random state, and logic in your notebook/report.
• Report summary stats, class balance, and sample sizes after splitting.

Pro Tips (Expert Level)

• Always use stratified splitting for imbalanced datasets to avoid model bias.
• Never include future data in the training set for time series problems.
• Use K-Fold cross-validation to check model stability and robustness.
• After splitting, plot descriptive stats and target distribution for each split.
• Make your data splitting code modular for repeatability and auditability.
Data Splitting Checklist Table

Summary:

In Chapter 8, an expert data analyst carefully plans data splitting—choosing the right strategy, ratios,
ensuring reproducibility, preventing leakage, and documenting the process. They check the
distribution of each split, use advanced techniques if needed, and keep the process transparent. This
ensures model evaluation is fair, unbiased, and real-world ready—just like the best data analysts do.
Chapter 9
Model Selection

1. Problem Formulation & Metric Selection

• Clearly define the problem: classification, regression, clustering, or another task.


• Select evaluation metrics based on the problem type:
• Classification: accuracy, precision, recall, F1-score, ROC-AUC
• Regression: mean squared error (MSE), mean absolute error (MAE), R²
• Clustering: silhouette score, Davies-Bouldin index
• Also consider business-specific KPIs.

2. Candidate Model Selection

• Shortlist multiple algorithms—from simple (linear/logistic regression, decision tree) to complex (random forest, SVM, XGBoost, neural networks, etc.).
• Select models based on data size, feature types, interpretability, scalability, and domain
knowledge.
• Include ensemble methods (bagging, boosting, stacking) as they often improve accuracy.

3. Data Preparation for Modeling

• Separate features and target variable.


• Perform feature engineering, selection, and transformation (encoding, scaling, imputation).
• Prepare train/validation/test splits or cross-validation folds.

4. Model Training
• Train each shortlisted model on the training data.
• Define the loss function (e.g., cross-entropy, MSE) and select the optimization algorithm (e.g.,
gradient descent).
• Apply regularization (L1, L2, dropout) to prevent overfitting.

5. Hyperparameter Tuning

• Tune hyperparameters for each model (e.g., tree depth, learning rate, number of estimators,
regularization strength).
• Use grid search, random search, or Bayesian optimization to find the best combination.
• Perform tuning with cross-validation for unbiased results.
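
A compact tuning sketch with cross-validated grid search (the model, grid, and scoring
metric are illustrative choices):

    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import GridSearchCV

    param_grid = {
        "n_estimators": [100, 300],
        "max_depth": [None, 5, 10],
    }

    search = GridSearchCV(
        RandomForestClassifier(random_state=42),
        param_grid,
        cv=5,            # 5-fold cross-validation
        scoring="f1",    # choose the metric that matches the problem
        n_jobs=-1,
    )
    search.fit(X_train, y_train)
    print(search.best_params_, search.best_score_)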

6. Model Evaluation & Comparison

• Evaluate each model on the validation set or with cross-validation.


• Compare performance using selected metrics (accuracy, F1, ROC-AUC, MSE, etc.).
• Also compare model complexity, interpretability, training/inference time, and resource
usage.

7. Overfitting/Underfitting Analysis

• Plot learning curves and compare training vs validation performance.


• If overfitting, increase regularization, simplify the model, or augment data.
• If underfitting, increase model complexity or improve features.

8. Final Model Selection

• Select the most balanced model—one that performs best on validation and generalizes well.
• Consider model interpretability, business constraints, and deployment feasibility.
9. Final Evaluation on Test Set

• Evaluate the final selected model on the untouched test set.


• Report test set performance (metrics, confusion matrix, ROC curve, etc.) for a real-world
performance estimate.

10. Model Documentation & Reproducibility

• Document each model, hyperparameters, evaluation results, and selection logic.


• Save random seeds, code, and data splits so results are reproducible.

11. Model Explainability (If Needed)

• Use feature importance, SHAP values, or LIME to explain model predictions—especially for
critical domains (healthcare, finance, etc.).

12. Model Deployment Readiness

• Check model size, latency, and integration requirements.


• Prepare the deployment pipeline (pickle, ONNX, API, etc.) for production use.

Pro Tips (Expert Level)

• Always try multiple models—never settle for just one.


• Use cross-validation to check model stability, especially for small or imbalanced datasets.
• Automate hyperparameter tuning (GridSearchCV, RandomizedSearchCV, Optuna).
• Use model explainability tools—especially when you need to explain predictions to
stakeholders.
• Keep every step's code, config, and results under version control.

Model Selection & Training Checklist Table

Summary:

In Chapter 9, an expert data analyst systematically performs model selection, training, tuning,
evaluation, and documentation—tries multiple models, compares on best metrics, analyzes
overfitting/underfitting, checks explainability and deployment readiness, and ensures everything is
reproducible and transparent. This approach ensures your model is always accurate, robust, and
business-ready—just like the best data analysts do.
Chapter 10
Insights & Reporting

1. Clearly Interpret Insights

• Convert your findings from mere observations to actionable insights—focus not just on
“what happened,” but also on “why it happened” and “what should be done next.”
• Explain every major trend, pattern, or anomaly and highlight its business impact.
• Link insights to the business or project objectives.

2. Ensure Recommendations Are Actionable

• For every insight, provide clear, specific, and practical recommendations (e.g., “Streamline
the onboarding process to reduce customer churn”).
• Suggest both short-term quick wins and long-term strategic actions.
• Justify recommendations with data and analysis—avoid opinions, make them data-driven.

3. Practice Honest Communication & Highlight Limitations

• Transparently mention any uncertainty, data limitation, or assumption in your results.


• Flag any ambiguity or incomplete data in your findings.
• Honest communication builds trust and sets realistic expectations for decision-makers.

4. Use Audience-Centric Reporting Structure

• Structure your report, presentation, or dashboard according to the audience—technical, non-technical, management, or client.
• Common structure:
• Executive Summary (key insights, recommendations)
• Objectives & KPIs
• Data Findings (numbers, trends)
• Analysis & Insights (meaning, implications)
• Recommendations (action items)
• Appendix (charts, raw data, methodology)
• Keep every section concise and relevant—avoid unnecessary fluff.

5. Use Visualizations and Storytelling Effectively

• Present complex data in simple, clear visuals—charts, graphs, dashboards, infographics.


• Write a short annotation or explanation with every visualization to provide context.
• Use data storytelling techniques—create a coherent narrative that takes the audience from
the “big picture” to “actionable steps.”

6. Use Multiple Reporting Mediums

• Use written reports (PDF, Word), presentations (PowerPoint), dashboards (Tableau, Power
BI), or web-based reports—choose what works best for your audience.
• Also consider oral presentations, digital reports, and interactive dashboards.
• Ensure accessibility—visuals should be colorblind-friendly, fonts readable, and formats
universally accessible.

7. Ensure Timeliness and Relevance

• Share results in a timely manner so they can actually impact business or project decisions.
• Make sure insights are not outdated and recommendations are relevant to the current
context.
8. Enable Feedback and Iteration

• Collect feedback from stakeholders, address queries, and refine the report/insights as
needed.
• Maintain an iterative approach so findings can be continuously improved.

9. Maintain Documentation & Transparency

• Document every insight, recommendation, and data source.


• Clearly write methodology, assumptions, and limitations in the appendix or footnotes.

Insights Sharing & Reporting Checklist Table


Pro Tips (Expert Level)

• For every insight, answer “So What?”—what does this mean for the business or project?
• Highlight key trends in visuals (annotations, callouts)—just showing data is not enough.
• Prioritize recommendations—separate quick wins, high-impact, and strategic actions.
• Do not cherry-pick data or insights—share all relevant findings, whether positive or negative.
• Always include a “Next Steps” or “Action Plan” section at the end of every report or
presentation.

Summary:

In Chapter 10, an expert data analyst converts analysis results into actionable insights and
recommendations, links them to business objectives, maintains honest and transparent
communication, uses the best visualization and storytelling techniques, follows an audience-centric
structure, shares results in multiple timely mediums, collects feedback, and maintains thorough
documentation. This approach ensures insights are impactful, understandable, and decision-ready—
just like the best data analysts do.
Chapter 11
Save Results

1. Clarify the Objective of Result Saving

• Decide what needs to be saved or backed up: cleaned datasets, processed features, trained
models, code scripts, reports, visualizations, logs, and documentation.
• The objective is reproducibility, audit trail, future reuse, and knowledge transfer.

2. Save Cleaned Data and Outputs

• Save final cleaned datasets in standardized formats (CSV, Parquet, Excel, SQL, etc.).
• Use data versioning (file naming conventions, timestamps, or tools like DVC/Git LFS).
• Encrypt or apply access control to sensitive data.

3. Model Saving & Serialization

• Serialize trained models (pickle, joblib, ONNX, PMML, etc.).


• Save model metadata: training parameters, hyperparameters, version, and environment
details.
• Use a model registry or cloud storage for enterprise projects.
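
A brief saving sketch (file names, paths, and the trained_model/clean_df objects are
placeholders):

    import joblib

    # Serialize the trained model with a versioned file name
    joblib.dump(trained_model, "models/sales_model_v1.joblib")

    # Save the cleaned dataset in a standard format
    clean_df.to_csv("outputs/clean_sales_v1.csv", index=False)

    # Reload later for scoring or auditing
    model = joblib.load("models/sales_model_v1.joblib")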

4. Archive Code, Notebooks, and Scripts

• Push Jupyter notebooks, Python scripts, R scripts, or SQL queries to version control (Git).
• Save README, requirements.txt/environment.yml, and usage instructions with the code.
• Update code documentation and comments.
5. Store Reports, Visualizations, and Dashboards

• Save reports as PDF, PPT, HTML, or on dashboard platforms (Power BI, Tableau).
• Export visualizations as high-resolution images or interactive formats (Plotly, Tableau Public,
etc.).
• Store reports and dashboards on shared drives, SharePoint, or cloud storage.

6. Archive Documentation and Data Dictionary

• Save data dictionaries, methodology docs, decision logs, and process documentation in a
central repository.
• Maintain documentation with version and date.

7. Enable Knowledge Transfer & Sharing

• Share results, code, and documentation with the team (shared folders, GitHub, Confluence,
Notion, etc.).
• Prepare handover notes or conduct walkthrough sessions for new team members or stakeholders.
• Share FAQs, troubleshooting guides, and best practices.

8. Follow Archiving & Retention Policy

• Adhere to your organization’s data retention policy—know how long to keep data/models/reports and when to delete/archive.
• Archive old versions but keep backups of critical outputs.

9. Ensure Security, Privacy, and Compliance

• Encrypt sensitive or PII data and apply access controls.


• Follow compliance requirements (GDPR, HIPAA, etc.) for data sharing, storage, and deletion.
10. Maintain Reproducibility & Audit Trail

• Attach process logs, code version, data version, and environment details to every saved
output.
• This ensures future analysis can be reproduced, troubleshot, or audited.

11. Continuous Improvement & Feedback

• Save feedback, lessons learned, and improvement notes in the archive for future projects.
• Keep documentation updated and review the archive periodically.

Result Saving & Archiving Checklist Table


Pro Tips (Expert Level)

• Always tag every output, model, and code file with version and date.
• Use cloud storage, version control, and data cataloging tools for large teams.
• Keep a “README” or summary file in the archive so anyone can quickly understand what’s
there and how to use it.
• Never store sensitive data in unsecured locations—encryption and access control are a must.
• Periodically audit the archive—clean obsolete files, back up critical outputs.

Summary:

In Chapter 11, an expert data analyst systematically saves, archives, and shares every output—cleaned
data, models, code, reports, and documentation. Every file is versioned, documented, and secured;
knowledge transfer and reproducibility are ensured; and the foundation is set for compliance, audit
trail, and future learning. This approach makes your work sustainable, reusable, and a long-term
asset for the organization—just like the best data analysts do.
