0% found this document useful (0 votes)
110 views124 pages

Chatgpt For Data Analytics: Live Online Training

Uploaded by

William Sun
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
110 views124 pages

Chatgpt For Data Analytics: Live Online Training

Uploaded by

William Sun
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 124

LIVE ONLINE TRAINING

ChatGPT For
Data Analytics
How to 10x your data analysis
productivity with generative AI

Day 1
About me

Tobias Zwingmann

AI Advisor, Author

Managing Partner @ RAPYD.AI

15+ years corporate experience

Author: AI-Powered Business Intelligence

Connect: /tobias-zwingmann/

2
Agenda
Day 1 - Basics Day 2 - Advanced

1. Course Introduction (10 min) 1. Introduction and Recap (15 min)

2. ChatGPT Fundamentals (45 min) 2. Using ChatGPT with Google Sheets (45 min)

3. Use Cases For ChatGPT in 3. Use Cases for ChatGPT with SQL & Python
Business Analytics (60 min) (75 min)

4. Use Cases For ChatGPT with 4. Limitations and security concerns of


Microsoft Excel (60 min) ChatGPT (30 min)

5. Closing (5 min) 5. Future Outlook and Closing (15 min)

3
Course Overview
Learning goals:

❑ Understand the potential of ChatGPT for data analytics and how to use it
with the most common data analysis tools such as SQL, Python, Excel,
and Google Sheets.

❑ Examine the vulnerabilities and risks of using ChatGPT

❑ Discover data analytics use cases for ChatGPT that will 10x your
productivity today

❑ Understand future applications of ChatGPT and its potential impact on


data analytics

4
Course Overview
Who this is for:

- You're a data analyst, data scientist, or BI professional who wants to


be more productive

- Your day-to-day work consists of analyzing data using spreadsheet


software such as Excel or Google Sheets, or a scripting language
such as SQL or Python.

- You have at least a few months of data analytics experience

5
Course Overview
Prerequisites

❑ A free OpenAI account for ChatGPT access (ideally ChatGPT Plus subscription)

❑ A computer set up with Microsoft Excel installed

❑ A free Google account to access Google Sheets

❑ An IDE of your choice (e.g. Google Colab or Jupyter Notebooks)

6
Discussion:

What’s your ChatGPT experience


level?

7
Poll:

A) I use ChatGPT every day


B) I use ChatGPT at least once a week
C) I use ChatGPT at least once a month
D) I’ve used it maybe 1-2 times or less in total

8
Discussion:

Do you have a ChatGPT Plus


Subscription?

9
Poll:

A) Yes
B) No
C) What?

10
ChatGPT Fundamentals
What is ChatGPT?
- ChatGPT is an AI-powered web
application by OpenAI, a US-based
for-profit company

- It can respond to any instructions

- ….and sometimes it’s right! ;-)

- Knowing how to use it is key to get


valuable outcomes!

- ChatGPT is powered by a Large


Language Model (GPT-3.5-Turbo /
GPT-4)

12
What’s so special about ChatGPT?
- ChatGPT acquired 1 Million users in
only about 5 days

- Set a record for the fastest app to reach


100 million users, faster than Google+
(1+ year)

- Meanwhile, growth has slowed down,


partly due to competition and stronger
integration

● Source: https://www.sequoiacap.com/article/generative-ai-act-two/
13
What can it do?
- Answer questions
- Provide recommendations
- Simulate conversations
- Generate stories
- Translate languages
- Offer explanations
- Assist with research
- Summarize articles
- Generate poetry
- Edit text
- Proofread documents
- …

● Surce: https://chatgptguide.net
14
Some Terminology

- Generative AI: AI trained to generate digital content


(text, image, video, audio).

- LLMs: Specific Generative AI application for text

- GPTs are a special category of LLMs, trained using the


transformer architecture and a huge corpus of text.
Examples: GPT-4, PaLM, LLaMA, Claude, …
- ChatGPT is a web app that is built on GPT-3.5/4

Main features:

- Contextual Understanding: The model doesn't just look


at individual words, but also the context around them.

- No Personal Memory: LLMs don't have personal


experiences or memories. They generate responses
based on patterns they've learned from the data.

15
The Evolution of Generative AI Models

16
How does an LLM work?

- ”Fancy text completion” – with some magic!

● Source: https://thegradient.pub/understanding-evaluation-metrics-for-language-models/
17
How does an LLM work?

How does it know which word to predict?

Attention is all you need


“Context window”
The context is all in the prompt! The LLM
has no ”shared memory” after it was
trained!

18
How does an LLM work?

How does it know which word to predict?

Until
What is the capital of Germany? Berlin
The
What
A
For

19
How does an LLM work?

How does it know which word to predict?

What is the capital of Germany? The

20
How does an LLM work?

How does it know which word to predict?

What is the capital of Germany? The capital

No going back!

21
How does an LLM work?

How does it know which word to predict?

What is the capital of Germany? The capital of

22
How does an LLM work?

How does it know which word to predict?

What is the capital of Germany? The capital of Germany

23
How does an LLM work?

How does it know which word to predict?

What is the capital of Germany? The capital of Germany is

24
How does an LLM work?

How does it know which word to predict?

What is the capital of Germany? The capital of Germany is Berlin

25
How does an LLM work?

Base model:

What is the capital of France?


What is the capital of Germany? What is the capital of Italy?

26
The ”Secret Sauce” of ChatGPT
- Instruction Fine-Tuned Model with
Reinforcement Learning from Human
Feedback (RLHF): Human AI trainers
provided conversations in which they played
both sides—the user and an AI assistant.

- Trainers had access to model-written


suggestions to help them compose their
responses.

- Mixed this new dialogue dataset with existing


instructional-based dataset, transformed into a
dialogue format.

● Source: https://openai.com/blog/chatgpt
27
How does an LLM work?

Model fine-tuned for instructions:

What is the capital of France?


What is the capital of Germany? The capital of France is Paris, and the capital of Germany is Berlin.

28
How does an LLM work?

The context makes all the difference!

Germany was split into two parts after the second world war.
What is the capital of Germany? Until 1990 the capital used to be Bonn.

29
There’s more than ChatGPT!
Model Chat App Organization Highlights Release URL
GPT-3.5-Turbo ChatGPT OpenAI, USA - 4k and 16k token limit 2022 chat.openai.com
- fast & cheap
GPT-4 ChatGPT OpenAI, USA - 8k and 32k token limit 2023 chat.openai.com
- web browsing
- image inputs
- plugins
PaLM 2 Bard Google, USA - 8k token limit 2023 bard.google.com
- image inputs
- internet access
- plugins announced
Claude-v2-100 Claude Anthropic, USA - 100k (!) token limit 2023 anthropic.com/product
k - constitutional training
Luminous Lumi Aleph Alpha, - image inputs 2023 aleph-alpha.com/luminous
Germany - high privacy standards
- explainable AI

30
Use Case Matrix For Generative AI

31
Use Case Matrix For Generative AI

Augmented
AI Use Cases

32
Before we start…

● Don’t be too dogmatic about LLMs


● Approach them from a practical angle
● See what works for you

Be Group 1

33
Q&A

34
Writing Effective Prompts
What is a prompt?

- A question or phrase: Something you ask or - Can be simple or complex: Prompts can
tell the AI to start a conversation or get be straightforward questions or more
information. detailed scenarios for the AI to engage with.

- Guides AI response: The prompt helps the AI - Like a conversation starter: Think of it as
understand what you want and shape its reply an opening line that sparks a discussion
accordingly. between you and the AI.

36
Prompt Engineering: Common mistakes

- Even if ChatGPT is designed to be easy to use,


the way you prompt it indeed makes a difference!

- For most beginners, the first few rides with


ChatGPT are exciting, but then they quickly get
frustrated because their answers are mediocre,
repetitive, boring, and not useful.

- Most beginner prompts have the following


problems:
- too vague
- lack context
- ask for too much
- not structured properly

- To get great outputs, there are a few principles


that should be executed well!

No need to become a prompt engineer 37


Principle 1: Be as specific as possible

- If your prompt is not specific, it will be mediocre by definition: ChatGPT can’t read your
mind!

Bad example: Good example:

Make this SQL code better I'm using PostgreSQL and would like to
optimize the following SQL query. My goal is
to reduce the execution time and make the
code easier for other developers to
understand. Please use comments where
appropriate. Before you rewrite the code,
please explain the steps you would take to
optimize it.

38
Principle 2: Don't let ChatGPT do the thinking

- Here’s the deal: You do the thinking, ChatGPT does the doing!
- Give ChatGPT the mental framework it should use to solve the task

Bad example: Good example:

I want to develop a new SaaS My goal is to build a new SaaS application.


application. Give me some ideas. To do this, I need to build something that my
users will find valuable. My ideal users are
social media managers from companies with 5M+
- 50M annual revenue who primarily work with
LinkedIn, Facebook, and Instagram. Please use
the Jobs-To-Be-Done framework to outline the
key features and benefits that would appeal
to this target market.

39
Principle 3: Break large tasks into small pieces

- If you ask for too much, you will get nothing useful in return

Bad example: Good example:

40
Principle 4: Use structured prompt formats

- Using structured prompt formats


can significantly improve the
quality of results obtained from
ChatGPT and also make it easier
to maintain your prompts.

41
Principle 5: Build your prompt library

- Store your best prompts for easy retrieval

42
Principle 5: Build your prompt library

or…

43
Principle 5: Build your prompt library

- Build a custom GPT:

44
This is NOT prompt engineering!

● Source: x.com/DrTBehrens
45
Break

46
Use Cases For ChatGPT
in Business Analytics
Recap: The Data Analytics Process (simplified)
- Don’t start with the data, start with the problem

1. Define 2. Gather 3. Analyze 4. Derive 5. Suggest


Problem Data Data Insights Actions

48
ChatGPT For Data Analysis
What people think where ChatGPT can help them:

1. Define 2. Gather 3. Analyze 4. Derive 5. Suggest


Problem Data Data Insights Actions

49
ChatGPT For Data Analysis
Where ChatGPT can actually help them:

1. Define 2. Gather 3. Analyze 4. Derive 5. Suggest


Problem Data Data Insights Actions

50
ChatGPT For Data Analysis
Where we will start:

1. Define 2. Gather 3. Analyze 4. Derive 5. Suggest


Problem Data Data Insights Actions

51
Before you start doing any data analysis:

3 Key Components:

Problem Statement, ideally SMART


Issue Tree, ideally MECE
Analysis design, ideally scientifically sound

52
Problem Statements & Hypothesis
Define the problem (SMART)

Specific – Clearly identify the problem

Measurable – Quantify the focus

Actionable – Provide actions to tackle the problem

Relevant – Identify actions that will solve the problem

Timebound – Set a firm date by which to solve this problem

Bad example:
We’re converting only 30% of all leads and miss our goal
on selling 1,000 services per quarter.
🡪 Factual statement, not a SMART problem statement

53
Problem Statements & Hypothesis
Define the problem (SMART)

Specific – Clearly identify the problem

Measurable – Quantify the focus

Actionable – Provide actions to tackle the problem

Relevant – Identify actions that will solve the problem

Timebound – Set a firm date by which to solve this problem

Good example:

What opportunities exist to increase our lead conversion rate to 45%


over the next 3 months through an improved marketing strategy in
alignment with the business objective of selling more than 1,000
services per quarter.

54
Problem Statements & Hypothesis
Define the problem (SMART)

Specific – Clearly identify the problem

Measurable – Quantify the focus

Actionable – Provide actions to tackle the problem

Relevant – Identify actions that will solve the problem

Timebound – Set a firm date by which to solve this problem

Good example:

What opportunities exist to increase our lead conversion rate to 45%


over the next 3 months through an improved marketing strategy in
alignment with the business objective of selling more than 1,000
services per quarter.

55
Use Case 1: SMART Problem
Statements

56
Issue Trees
- Once you have a problem statement, your next step
is to break this problem down into multiple
sub-problems (sub-issues) for further exploration
Sub-issue
- Issue Trees are a great way to do that! Issue 1

Problem/Issue
Sub-issue
- Background:
Sub-issue
- Issue Trees are a popular problem structuring tool Issue 2
used daily by big consulting companies like Sub-issue
McKinsey, BCG, etc.

- Issue Trees allow you do decompose a problem


into different sub-problems without overlaps while
still exploring all possible options for the main
hypothesis
57
Issue Trees
Benefits:

1. Make the problem easier to manage:


🡪 Smaller pieces are easier to manage intellectually
Sub-issue
🡪 Prioritize and allocate to different resources Issue 1

Problem/Issue
Sub-issue
2. Maintain integrity of problem solving
🡪 Solving the parts will solve the problem
Sub-issue
🡪 Consider all options and test hypothesis Issue 2
🡪 No overlaps, no gaps Sub-issue

3. Communication
🡪 Build a common understanding of the problem
and the approach being taken

58
Issue Trees
”Secret Sauce”:

MECE criteria: Issue Trees are mutually


exclusive and collectively exhaustive
Sub-issue

1. Mutually exclusive (ME) Issue 1

Problem/Issue
Sub-issue
🡪 No overlaps between different parts of
the tree
Sub-issue
Issue 2
2. Collectively exhaustive (CE)
Sub-issue
🡪 You consider all possible options (no
gaps)

59
Issue Trees
1. Mutually exclusive (ME)

- Mutually exclusive means that


components of an Issue Tree are
completely independent of each other.

- Mutually exclusive entities do not include


each other. Sub-Issue 1 Sub-Issue 2

- (Think of it as a Venn diagram where the


circles don’t overlap.)

60
Issue Trees
Entire Solution Space
2. Collectively exhaustive (CE)

- This means the entire set of possible


Sub-issue
solutions is considered. Issue 1

Problem/Issue
Sub-issue
- Essentially, you are "exhausting" the set
of things to look at.
Sub-issue
Issue 2
- If you looked at each component of a
Sub-issue
collectively exhaustive issue tree, there
isn’t an area left to look at with respect to
the problem.

61
Issue Trees
Let’s say we want to decide which
Pizza to order.
With Cheese
A BAD issue tree would look like
this: Veggie

Which Pizza?
Without Cheese
- It’s not ME (There could also
be beef Pizza without cheese)

- It’s not CE (There are more Beef Only

options, e.g. chicken) Beef

Surf & Turf

62
Issue Trees
Vegan
Example of a GOOD issue tree:
Veggie
- It's ME because there’s no overlap
Non-Vegan
in the different categories.

Which Pizza?
- It’s CE because there’s no other Beef
option left.

- This tree helps us to break down Chicken


the problem fast. (If you're
Non-Veggie
vegetarian, you can immediately
Fish
drop the bottom part of the tree)

- Note: There might be even more


Mixed
types of meat. Looking at a specific
problem is critical!

63
Issue Trees - Example
Personalized Follow Up
via Phone
Personalized Follow Up
via SMS/Messenger
Hot lead measures
Personalized Follow Up
via E-Mail

strategy in alignment with Increase Personalized Follow Up


the business objective of Conversion Efforts via Direct Mail
What opportunities exist
for Edu X to increase its

selling more than 1,000


lead conversion rate to

courses per quarter.


improved marketing
45% over the next 3
months through an

Cold Lead Nurturing


Campaign
Cold lead measures
Introduce Presales
product

Per Channel
Adjust Ad Budgets
Per Course
Get Higher
Quality Leads
Improve SEO
Keep Ad Budgets
Improve Content

64
Use Case 2: Issue Trees

65
Analysis Design: Root Cause Analysis With 5 Why

- Introduced by Sakichi Toyoda and used within


Toyota Motor Corporation in the 1930s
- Problem-solving technique that involves asking
"why" questions to identify the root cause of a
problem
- Simple yet powerful method that can be
applied to almost any type of problem
- Goal: dig deeper into the issue until you reach
the underlying cause, typically after 5 steps
- Received various criticism (more on that later),
but still useful

66
RCA: The 5 Why method - Example

Example: A company has noticed a decline


in sales over the past few months, and
they want to use the 5-Why method to
identify the root cause of the problem. Root Cause: We don't have the
right tools and knowledge to
analyze data

67
Use Case 3: RCA

68
Background: Data Storytelling

Pyramid Principle by McKinsey:

Start by identifying your key message.


Build 3-4 supporting arguments and add
evidence or examples.
Turn this pyramid structure into an outline
for your presentation
Conclude with a call-to-action.
Tailor the pyramid to your audience's
perspective and consider using
mini-pyramids for complex topics.

● Source: https://www.mckinsey.com/alumni/news-and-events/global-news/alumni-news/barbara-minto-mece-i-invented-it-so-i-get-to-say-how-to-pronounce-it
69
Use Case 4: Improved Storytelling

70
Use Cases For ChatGPT
with Microsoft Excel
How to use ChatGPT with Microsoft Excel

The elephant in the room:

🡪 Excel Copilot

https://www.youtube.com/watch?v=vG
I6VLr8L5w

https://www.youtube.com/watch?v=I-w
aFp6rLc0

Experimental features:
https://www.microsoft.com/en-us/gar
age/profiles/excel-labs/

72
How to use ChatGPT with Excel - today
Upload data to ChatGPT4 (Advanced Data
Analysis)

What is Advanced Data Analysis (Code Interpreter)?


▪ Available to Plus users
▪ Python sandbox to write and run code
▪ Subsequent calls can build on top of each other
▪ Supports file uploading and downloading

The Good
A data analyst in your pocket

The Bad
Potentially wrong (need to double check!)

The Ugly
Probably not allowed
73
Use Case 5: Analyze a
Spreadsheet with Advanced Data
Analysis

74
How to use ChatGPT with Excel - today

Generate & explain Excel formulas

Main advantages
▪ Quickly explain complex formulas in simple language
▪ Quickly write complex formulas from simple language
▪ Quickly modify complex formulas from simple language

Tip:
Works for VBA, too!

Image source: https://www.journalofaccountancy.com/issues/2017/jun/how-to-evaluate-complex-excel-formulas.html


75
Use Case 6: Generate & Explain
Formulas

76
How to use ChatGPT with Excel - today

Provide Step-by-Step Instructions

Main advantages
▪ Let ChatGPT help you through tedious
data wrangling tasks
▪ It will give you step-by-step instructions

● Source: https://r4ds.had.co.nz/tidy-data.html
77
Use Case 7: Step-by-Step
Instructions

78
How to use ChatGPT with Excel - today

Write VBA scripts

Main advantages
▪ Let ChatGPT do the tedious stuff
🡪 Write VBA scripts that would
otherwise take hours to write

Image source: https://www.ablebits.com/office-addins-blog/run-macro-excel-create-macro-button/


79
Use Case 8: Write VBA Scripts

80
Thank you!

Subscribe to my free
newsletter 🡪

Let’s connect! ai4bi.rocks


linkedin.com/in/tobias-zwingmann
[email protected]

81
LIVE ONLINE TRAINING

ChatGPT For
Data Analytics
How to 10x your data analysis
productivity with generative AI

Day 2
Course Overview
Learning goals:

- Understand the potential of ChatGPT for data analytics and how to use it with the most common
data analysis tools such as SQL, Python, Excel, and Google Sheets.

- Examine the vulnerabilities and risks of using ChatGPT

- Discover data analytics use cases for ChatGPT that will 10x your productivity today

- Understand future applications of ChatGPT and its potential impact on data analytics

83
Recap
Day 1 Learnings:

❑ Fundamentals about ChatGPT, LLMs & prompt engineering


❑ Business Analytics Use Cases:
❑ SMART Problem Statements
❑ Issue Trees
❑ RCA
❑ Storytelling
❑ Microsoft Excel Use Cases:
❑ Advanced Data Analysis
❑ Generate & Explain Formulas
❑ Step-b—Step Instructions
❑ Writing VBA Scripts

84
Agenda
Day 2 - Advanced

1. Introduction and Recap (15 min)

2. Using ChatGPT with Google Sheets (45 min)

3. Use Cases for ChatGPT with SQL & Python


(75 min)

4. Limitations and security concerns of


ChatGPT (30 min)

5. Future Outlook and Closing (15 min)

85
Q&A

86
Using ChatGPT with
Google Sheets
How to use ChatGPT in Google Sheets (and Excel)

- Google Sheets offers a free plugin, GPT for


Sheets and Docs, to integrate GPT functionality
via an OpenAI API key.

- Use the plugin for non-critical data; it's not ideal


for sensitive information.

- For secure handling of sensitive data, deploy


GPT-4 on Azure and use custom functions in
Google Sheets or Python/R in Excel.

- Or: Wait for Microsoft Copilot.

- Testing with the plugin is a good first step


before moving on to more advanced, secure
options.

88
How to use ChatGPT in Google Sheets (and Excel)

Data Cleaning

Main advantages
▪ Spreadsheet data is often messy
▪ GPT models can help to
dramatically reduce the time it
takes to clean data

89
Use Case 9: Data Cleaning

90
How to use ChatGPT in Google Sheets (and Excel)

Data Sorting & Classification

Main advantages
▪ Easily turn unstructured data into
structured data
▪ Makes data easier to analyze

91
Use Case 10: Data Classification

92
How to use ChatGPT in Google Sheets (and Excel)

Text-to-numeric

Main advantages
▪ Transform text data to numeric
scales so you can run calculations
(mean, min, max, etc.)

93
Use Case 11: Text-to-Numeric

94
How to use ChatGPT in Google Sheets (and Excel)

Topic Mining

Main advantages
▪ Identify common themes and
patterns from your text data

95
Use Case 12: Topic Mining

96
ChatGPT Use Cases for
SQL & Python
How to use ChatGPT for SQL & Python
- ChatGPT has incredible coding skills, especially
when paired with an (experienced) human

- Automate repetitive coding tasks, generate code


snippets, and offer refactoring suggestions, so you
focus on more complex aspects of the task

- Automatically generate documentation and test


cases

- ChatGPT facilitates communication between


technical and non-technical stakeholders by
providing plain-language explanations of complex
code.

- It can also assist in tasks like generating dummy


data, porting code between languages.

98
Use Cases 13-15: Interactive Labs

99
Use Cases 16-20: Jupyter Notebook

100
Limitations and security concerns
Limitations of ChatGPT

Problem:
- ChatGPT writes very confident, plausible-sounding
answers, that are potentially completely incorrect.
- Fixing this issue is tricky because: (1) during RL training,
there’s currently no single source of truth; (2) training the
model to be more cautious causes it to decline questions
that it can answer correctly; and (3) supervised training
misleads the model because the ideal answer depends on
what the model knows, rather than what the human
knows.

Tactics to mitigate risks:


- Ask: Ask ChatGPT to verify its previous answer. Sounds
silly, but it works! LLMs can’t go backwards, just forward.
- Repeat: Run the prompt multiple times and see how
much the result varies.
- Use GPT-4: It has much better cognitive capabilities
Source: Open AI
compared to GPT-3.5
102
Limitations of ChatGPT

Problem:
- ChatGPT is sensitive to tweaks to the input
context. For example, given one phrasing of a
question, the model can give the wrong or no
answer, but given a slight rephrase, can answer
correctly.

Tactics
- Neutral prompt: Don’t force the model into a
direction (e.g. Instead of “Why has X led to Y”,
ask “How do X and Y relate to each other”)

Source: Open AI
103
Limitations of ChatGPT

Problem:
- The model was framed to act as a helpful assistant.
That makes it often produce answers even if it
doesn’t know the answer or the context is not clear.
Instead, the model usually tries to guess what the
user intended.

Tactics:
- Instruct for questions: In the prompt, define that
the model should ask clarifying questions when it is
not sure about the answer or the user provided an
ambiguous query.

Source: Open AI
104
Security risks

- Exposing sensitive data which might be


used to train the model, especially with
regards to the ChatGPT web application.
- External dependencies that might not
always work as expected
- Prompt injections

Source: https://x.com/goodside/status/1569128808308957185?s=20
105
Security risks

- Sometimes, these “hacks” are not even


that obvious!
- Conclusion: Building a safe LLM takes
much more than just “training” a model.

Source: https://not-just-memorization.github.io/extracting-training-data-from-chatgpt.html
106
How to avoid security risks
- Not exposing data

- Exposing data, but with privacy mode

- Exposing data, but with commercial API Find the balance that works for you!

- Exposing data, but in a dedicated instance

- Hosting an on-prem model

107
How to avoid security risks
- Not exposing data
Switch this off!

- Exposing data, but with privacy mode

- Exposing data, but with commercial API

- Exposing data, but in a dedicated instance

- Hosting an on-prem model

108
How to avoid security risks
- Not exposing data

- Exposing data, but with privacy mode

- Exposing data, but with commercial API

- Exposing data, but in a dedicated instance

- Hosting an on-prem model

109
How to avoid security risks
- Not exposing data

- Exposing data, but with privacy mode

- Exposing data, but with commercial API

- Exposing data, but in a dedicated instance

- Hosting an on-prem model

Source: https://learn.microsoft.com/en-us/legal/cognitive-services/openai/data-privacy
110
How to avoid security risks
- Not exposing data

- Exposing data, but with privacy mode

- Exposing data, but with commercial API

- Exposing data, but in a dedicated instance

- Hosting an on-prem model

Source:https://ai.meta.com/llama/
111
How to avoid security risks

● Adapt your architecture


so that the data
exposure to the LLM is
minimized

112
Discussion

113
Future Outlook and Closing
What’s next?

��

115
What’s next in AI?

- Building the next-gen of


foundation models like GPT-4
will probably take a while (12+
months, GPT-5 is not even
training yet)

- Until then: Focus on integration


and enhancing capabilities of
existing models via fine-tuning,
increased token limit or other
variations.

116
Custom GPTs

- Create custom versions


of ChatGPT that
combine instructions,
extra knowledge, and
any combination of
skills.

- Internal (company) and


external use cases
(marketplace)

Source: https://openai.com/blog/introducing-gpts
117
ChatGPT Plugins

- The app store for ChatGPT users is growing every day

118
Multimodal Inputs

● ChatGPT can now see,


hear, and speak

Source: https://twitter.com/petergyang/status/1707169696049668472/photo/1
119
Increased Context Limit

- Context length = Input Tokens + Output Tokens

- Context length will increase as LLM research continues

- 100k tokens: You can put the “Great Gatsby” into the prompt

16k 32k 100k


tokens tokens
4k tokens

OpenAI’s GPT-3 OpenAI’s GPT-3.5-16k OpenAI’s GPT-4-32k

Anthropic’s Claude 2
120
Explainable AI

Goal:

- Trace outputs back


to model inputs
- Make the model
more interpretable
and increase trust

Source: Aleph Alpha


121
Wrap-up
Recap
Day 2 Learnings:

❑ Using ChatGPT with Google Sheets


❑ Data Cleaning
❑ Data Transformation

❑ Use Cases for ChatGPT with SQL & Python


❑ Labs & Notebook exercises

❑ Limitations and security concerns of ChatGPT

❑ Future Outlook and Closing

123
Thank you!

Subscribe to my free
newsletter 🡪

Let’s connect! ai4bi.rocks


linkedin.com/in/tobias-zwingmann
[email protected]

124

You might also like