Chatgpt For Data Analytics: Live Online Training
Chatgpt For Data Analytics: Live Online Training
ChatGPT For
Data Analytics
How to 10x your data analysis
productivity with generative AI
Day 1
About me
Tobias Zwingmann
AI Advisor, Author
Connect: /tobias-zwingmann/
2
Agenda
Day 1 - Basics Day 2 - Advanced
2. ChatGPT Fundamentals (45 min) 2. Using ChatGPT with Google Sheets (45 min)
3. Use Cases For ChatGPT in 3. Use Cases for ChatGPT with SQL & Python
Business Analytics (60 min) (75 min)
3
Course Overview
Learning goals:
❑ Understand the potential of ChatGPT for data analytics and how to use it
with the most common data analysis tools such as SQL, Python, Excel,
and Google Sheets.
❑ Discover data analytics use cases for ChatGPT that will 10x your
productivity today
4
Course Overview
Who this is for:
5
Course Overview
Prerequisites
❑ A free OpenAI account for ChatGPT access (ideally ChatGPT Plus subscription)
6
Discussion:
7
Poll:
8
Discussion:
9
Poll:
A) Yes
B) No
C) What?
10
ChatGPT Fundamentals
What is ChatGPT?
- ChatGPT is an AI-powered web
application by OpenAI, a US-based
for-profit company
12
What’s so special about ChatGPT?
- ChatGPT acquired 1 Million users in
only about 5 days
● Source: https://www.sequoiacap.com/article/generative-ai-act-two/
13
What can it do?
- Answer questions
- Provide recommendations
- Simulate conversations
- Generate stories
- Translate languages
- Offer explanations
- Assist with research
- Summarize articles
- Generate poetry
- Edit text
- Proofread documents
- …
● Surce: https://chatgptguide.net
14
Some Terminology
Main features:
15
The Evolution of Generative AI Models
16
How does an LLM work?
● Source: https://thegradient.pub/understanding-evaluation-metrics-for-language-models/
17
How does an LLM work?
18
How does an LLM work?
Until
What is the capital of Germany? Berlin
The
What
A
For
…
19
How does an LLM work?
20
How does an LLM work?
No going back!
21
How does an LLM work?
22
How does an LLM work?
23
How does an LLM work?
24
How does an LLM work?
25
How does an LLM work?
Base model:
26
The ”Secret Sauce” of ChatGPT
- Instruction Fine-Tuned Model with
Reinforcement Learning from Human
Feedback (RLHF): Human AI trainers
provided conversations in which they played
both sides—the user and an AI assistant.
● Source: https://openai.com/blog/chatgpt
27
How does an LLM work?
28
How does an LLM work?
Germany was split into two parts after the second world war.
What is the capital of Germany? Until 1990 the capital used to be Bonn.
29
There’s more than ChatGPT!
Model Chat App Organization Highlights Release URL
GPT-3.5-Turbo ChatGPT OpenAI, USA - 4k and 16k token limit 2022 chat.openai.com
- fast & cheap
GPT-4 ChatGPT OpenAI, USA - 8k and 32k token limit 2023 chat.openai.com
- web browsing
- image inputs
- plugins
PaLM 2 Bard Google, USA - 8k token limit 2023 bard.google.com
- image inputs
- internet access
- plugins announced
Claude-v2-100 Claude Anthropic, USA - 100k (!) token limit 2023 anthropic.com/product
k - constitutional training
Luminous Lumi Aleph Alpha, - image inputs 2023 aleph-alpha.com/luminous
Germany - high privacy standards
- explainable AI
30
Use Case Matrix For Generative AI
31
Use Case Matrix For Generative AI
Augmented
AI Use Cases
32
Before we start…
Be Group 1
33
Q&A
34
Writing Effective Prompts
What is a prompt?
- A question or phrase: Something you ask or - Can be simple or complex: Prompts can
tell the AI to start a conversation or get be straightforward questions or more
information. detailed scenarios for the AI to engage with.
- Guides AI response: The prompt helps the AI - Like a conversation starter: Think of it as
understand what you want and shape its reply an opening line that sparks a discussion
accordingly. between you and the AI.
36
Prompt Engineering: Common mistakes
- If your prompt is not specific, it will be mediocre by definition: ChatGPT can’t read your
mind!
Make this SQL code better I'm using PostgreSQL and would like to
optimize the following SQL query. My goal is
to reduce the execution time and make the
code easier for other developers to
understand. Please use comments where
appropriate. Before you rewrite the code,
please explain the steps you would take to
optimize it.
38
Principle 2: Don't let ChatGPT do the thinking
- Here’s the deal: You do the thinking, ChatGPT does the doing!
- Give ChatGPT the mental framework it should use to solve the task
39
Principle 3: Break large tasks into small pieces
- If you ask for too much, you will get nothing useful in return
40
Principle 4: Use structured prompt formats
41
Principle 5: Build your prompt library
42
Principle 5: Build your prompt library
or…
43
Principle 5: Build your prompt library
44
This is NOT prompt engineering!
● Source: x.com/DrTBehrens
45
Break
46
Use Cases For ChatGPT
in Business Analytics
Recap: The Data Analytics Process (simplified)
- Don’t start with the data, start with the problem
48
ChatGPT For Data Analysis
What people think where ChatGPT can help them:
49
ChatGPT For Data Analysis
Where ChatGPT can actually help them:
50
ChatGPT For Data Analysis
Where we will start:
51
Before you start doing any data analysis:
3 Key Components:
52
Problem Statements & Hypothesis
Define the problem (SMART)
Bad example:
We’re converting only 30% of all leads and miss our goal
on selling 1,000 services per quarter.
🡪 Factual statement, not a SMART problem statement
53
Problem Statements & Hypothesis
Define the problem (SMART)
Good example:
54
Problem Statements & Hypothesis
Define the problem (SMART)
Good example:
55
Use Case 1: SMART Problem
Statements
56
Issue Trees
- Once you have a problem statement, your next step
is to break this problem down into multiple
sub-problems (sub-issues) for further exploration
Sub-issue
- Issue Trees are a great way to do that! Issue 1
Problem/Issue
Sub-issue
- Background:
Sub-issue
- Issue Trees are a popular problem structuring tool Issue 2
used daily by big consulting companies like Sub-issue
McKinsey, BCG, etc.
Problem/Issue
Sub-issue
2. Maintain integrity of problem solving
🡪 Solving the parts will solve the problem
Sub-issue
🡪 Consider all options and test hypothesis Issue 2
🡪 No overlaps, no gaps Sub-issue
3. Communication
🡪 Build a common understanding of the problem
and the approach being taken
58
Issue Trees
”Secret Sauce”:
Problem/Issue
Sub-issue
🡪 No overlaps between different parts of
the tree
Sub-issue
Issue 2
2. Collectively exhaustive (CE)
Sub-issue
🡪 You consider all possible options (no
gaps)
59
Issue Trees
1. Mutually exclusive (ME)
60
Issue Trees
Entire Solution Space
2. Collectively exhaustive (CE)
Problem/Issue
Sub-issue
- Essentially, you are "exhausting" the set
of things to look at.
Sub-issue
Issue 2
- If you looked at each component of a
Sub-issue
collectively exhaustive issue tree, there
isn’t an area left to look at with respect to
the problem.
61
Issue Trees
Let’s say we want to decide which
Pizza to order.
With Cheese
A BAD issue tree would look like
this: Veggie
Which Pizza?
Without Cheese
- It’s not ME (There could also
be beef Pizza without cheese)
62
Issue Trees
Vegan
Example of a GOOD issue tree:
Veggie
- It's ME because there’s no overlap
Non-Vegan
in the different categories.
Which Pizza?
- It’s CE because there’s no other Beef
option left.
63
Issue Trees - Example
Personalized Follow Up
via Phone
Personalized Follow Up
via SMS/Messenger
Hot lead measures
Personalized Follow Up
via E-Mail
Per Channel
Adjust Ad Budgets
Per Course
Get Higher
Quality Leads
Improve SEO
Keep Ad Budgets
Improve Content
64
Use Case 2: Issue Trees
65
Analysis Design: Root Cause Analysis With 5 Why
66
RCA: The 5 Why method - Example
67
Use Case 3: RCA
68
Background: Data Storytelling
● Source: https://www.mckinsey.com/alumni/news-and-events/global-news/alumni-news/barbara-minto-mece-i-invented-it-so-i-get-to-say-how-to-pronounce-it
69
Use Case 4: Improved Storytelling
70
Use Cases For ChatGPT
with Microsoft Excel
How to use ChatGPT with Microsoft Excel
🡪 Excel Copilot
https://www.youtube.com/watch?v=vG
I6VLr8L5w
https://www.youtube.com/watch?v=I-w
aFp6rLc0
Experimental features:
https://www.microsoft.com/en-us/gar
age/profiles/excel-labs/
72
How to use ChatGPT with Excel - today
Upload data to ChatGPT4 (Advanced Data
Analysis)
The Good
A data analyst in your pocket
The Bad
Potentially wrong (need to double check!)
The Ugly
Probably not allowed
73
Use Case 5: Analyze a
Spreadsheet with Advanced Data
Analysis
74
How to use ChatGPT with Excel - today
Main advantages
▪ Quickly explain complex formulas in simple language
▪ Quickly write complex formulas from simple language
▪ Quickly modify complex formulas from simple language
Tip:
Works for VBA, too!
76
How to use ChatGPT with Excel - today
Main advantages
▪ Let ChatGPT help you through tedious
data wrangling tasks
▪ It will give you step-by-step instructions
● Source: https://r4ds.had.co.nz/tidy-data.html
77
Use Case 7: Step-by-Step
Instructions
78
How to use ChatGPT with Excel - today
Main advantages
▪ Let ChatGPT do the tedious stuff
🡪 Write VBA scripts that would
otherwise take hours to write
80
Thank you!
Subscribe to my free
newsletter 🡪
81
LIVE ONLINE TRAINING
ChatGPT For
Data Analytics
How to 10x your data analysis
productivity with generative AI
Day 2
Course Overview
Learning goals:
- Understand the potential of ChatGPT for data analytics and how to use it with the most common
data analysis tools such as SQL, Python, Excel, and Google Sheets.
- Discover data analytics use cases for ChatGPT that will 10x your productivity today
- Understand future applications of ChatGPT and its potential impact on data analytics
83
Recap
Day 1 Learnings:
84
Agenda
Day 2 - Advanced
85
Q&A
86
Using ChatGPT with
Google Sheets
How to use ChatGPT in Google Sheets (and Excel)
88
How to use ChatGPT in Google Sheets (and Excel)
Data Cleaning
Main advantages
▪ Spreadsheet data is often messy
▪ GPT models can help to
dramatically reduce the time it
takes to clean data
89
Use Case 9: Data Cleaning
90
How to use ChatGPT in Google Sheets (and Excel)
Main advantages
▪ Easily turn unstructured data into
structured data
▪ Makes data easier to analyze
91
Use Case 10: Data Classification
92
How to use ChatGPT in Google Sheets (and Excel)
Text-to-numeric
Main advantages
▪ Transform text data to numeric
scales so you can run calculations
(mean, min, max, etc.)
93
Use Case 11: Text-to-Numeric
94
How to use ChatGPT in Google Sheets (and Excel)
Topic Mining
Main advantages
▪ Identify common themes and
patterns from your text data
95
Use Case 12: Topic Mining
96
ChatGPT Use Cases for
SQL & Python
How to use ChatGPT for SQL & Python
- ChatGPT has incredible coding skills, especially
when paired with an (experienced) human
98
Use Cases 13-15: Interactive Labs
99
Use Cases 16-20: Jupyter Notebook
100
Limitations and security concerns
Limitations of ChatGPT
Problem:
- ChatGPT writes very confident, plausible-sounding
answers, that are potentially completely incorrect.
- Fixing this issue is tricky because: (1) during RL training,
there’s currently no single source of truth; (2) training the
model to be more cautious causes it to decline questions
that it can answer correctly; and (3) supervised training
misleads the model because the ideal answer depends on
what the model knows, rather than what the human
knows.
Problem:
- ChatGPT is sensitive to tweaks to the input
context. For example, given one phrasing of a
question, the model can give the wrong or no
answer, but given a slight rephrase, can answer
correctly.
Tactics
- Neutral prompt: Don’t force the model into a
direction (e.g. Instead of “Why has X led to Y”,
ask “How do X and Y relate to each other”)
Source: Open AI
103
Limitations of ChatGPT
Problem:
- The model was framed to act as a helpful assistant.
That makes it often produce answers even if it
doesn’t know the answer or the context is not clear.
Instead, the model usually tries to guess what the
user intended.
Tactics:
- Instruct for questions: In the prompt, define that
the model should ask clarifying questions when it is
not sure about the answer or the user provided an
ambiguous query.
Source: Open AI
104
Security risks
Source: https://x.com/goodside/status/1569128808308957185?s=20
105
Security risks
Source: https://not-just-memorization.github.io/extracting-training-data-from-chatgpt.html
106
How to avoid security risks
- Not exposing data
- Exposing data, but with commercial API Find the balance that works for you!
107
How to avoid security risks
- Not exposing data
Switch this off!
108
How to avoid security risks
- Not exposing data
109
How to avoid security risks
- Not exposing data
Source: https://learn.microsoft.com/en-us/legal/cognitive-services/openai/data-privacy
110
How to avoid security risks
- Not exposing data
Source:https://ai.meta.com/llama/
111
How to avoid security risks
112
Discussion
113
Future Outlook and Closing
What’s next?
��
115
What’s next in AI?
116
Custom GPTs
Source: https://openai.com/blog/introducing-gpts
117
ChatGPT Plugins
118
Multimodal Inputs
Source: https://twitter.com/petergyang/status/1707169696049668472/photo/1
119
Increased Context Limit
- 100k tokens: You can put the “Great Gatsby” into the prompt
Anthropic’s Claude 2
120
Explainable AI
Goal:
123
Thank you!
Subscribe to my free
newsletter 🡪
124