AI Supplement IX (Web)
Educational Publishers
SULTAN CHAND & SONS (P) LTD
4859/24, Darya Ganj, New Delhi-110 002
Phones : 4354 6000 (100 Lines), 2324 3939
Fax : (011) 4354 6004, 2325 4295
E-mail : [email protected]
Buy books online at : www.sultan-chand.com
ISBN: 978-81-19446-83-4
Edition 2024
No part of this book may be reproduced or copied in any form or by any means (graphic, electronic or mechanical, including
photocopying, recording, taping, or information retrieval system) or reproduced on any disc, tape, perforated media or any other
information storage device, etc., without the prior written permission of the publishers. Breach of this condition is liable for legal
action. Anyone who brings information regarding any such reproduction will be handsomely rewarded.
Every effort has been made to avoid errors or omissions in this publication. In spite of this, some errors might have crept in. Any
mistake, error or discrepancy noted may be brought to our notice which shall be taken care of in the next edition. It is notified that
neither the publishers nor the author or seller will be responsible for any damage or loss of action to anyone, of any kind, in any
manner, therefrom.
For faulty binding, misprints or for missing pages, etc., the publishers’ liability is limited to replacement within one month of the
purchase by a similar edition. All expenses in this connection are to be borne by the purchaser.
AI REFLECTION
Learning Outcome: To recognize, engage and relate with the three realms of AI: Computer Vision, Data Statistics and Natural Language Processing
Recommended Activity: The AI Game
• Learners to participate in three games based on different AI domains
— Game 1: Rock, Paper and Scissors (based on data) https://next.rockpaperscissors.ai/
— Game 2: Semantris (based on Natural Language Processing – NLP) https://research.google.com/semantris/
— Game 3: Quick Draw (based on Computer Vision – CV) https://quickdraw.withgoogle.com/
AI PROJECT CYCLE
Learning Outcome: To identify the AI Project Cycle framework
Session: Introduction to AI Project Cycle
• Problem Scoping
• Data Acquisition
• Data Exploration
• Modeling
• Evaluation
• Deployment
Learning Outcome: To learn problem scoping and ways to set goals for an AI project
Session: Problem Scoping
Activity: Brainstorm around the theme provided and set a goal for the AI project
• Discuss various topics within the given theme and select one.
• Fill in the 4Ws problem canvas and a problem statement to learn more about the problem identified in the community/society
• List down/draw a mind map of problems related to the selected topic and choose one problem to be the goal for the project.
Learning Outcome: To identify stakeholders involved in the problem scoped. Brainstorm on the ethical issues involved around the problem selected
Activity: To set actions around the goal
• List down the stakeholders involved in the problem.
• Search on the current actions taken to solve this problem.
• Think around the ethics involved in the goal of your project.
Learning Outcome: To understand the iterative nature of problem scoping in the AI project cycle. Foresee the kind of data required and the kind of analysis to be done
Activity: Data and Analysis
• What are the data features needed?
• How will the features collected affect the problem?
• Where can you get the data?
• How frequently do you have to collect the data?
• What happens if you don’t have enough data?
• What kind of analysis needs to be done?
• How will it be validated?
• How does the analysis inform the action?
Learning Outcome: Share what the students have discussed so far
Presentation: Presenting the goal, actions and data
Teamwork Activity:
• Brainstorming solutions for the problem statement.
Learning Outcome: To identify data requirements and find reliable sources to obtain relevant data
Session: Data Acquisition
Activity: Introduction to data and its types.
• Students work around the scenarios given to them and think of ways to acquire data.
Activity: Data Features
• Identifying the possible data features affecting the problem.
Activity: System Maps
• Creating system maps considering data features identified.
Learning Outcome: To understand the purpose of Data Visualization
Session: Data Exploration/Data Visualization
• Need of visualizing data
• Ways to visualize data using various types of graphical tools
Quiz Time
Learning Outcome: To use various types of graphs to visualize acquired data
Recommended Activities: Let’s use Graphical Tools
• Selecting an appropriate graphical format and presenting the graph sketched
• Understanding graphs using https://datavizcatalogue.com/
• Listing of newly learnt data visualization techniques
• Top 10 Song Prediction: Identify the data features, collect the data and convert into graphical representation.
• Collect and store data in a spreadsheet and create some graphical representations to understand the data effectively.
Learning Outcome: To understand modeling (Rule-based & Learning-based)
Session: Modeling
• Introduction to modeling and types of models (Rule-based & Learning-based)
Learning Outcome: To understand various evaluation techniques
Session: Evaluation
Learners will understand about new terms
• True Positive
• False Positive
• True Negative
• False Negative
Learning Outcome: Challenge students to think about how they can apply their knowledge of deployment in future AI projects and encourage them to continue exploring different deployment methods.
Session: Deployment
Recommended Case Study: Preventable Blindness
Activity: Implementation of AI project cycle to develop an AI Model for Personalized Education
Learning Outcome: To understand and reflect on the ethical issues around AI
Session: Ethics
Video Session: Discussing about AI Ethics
Recommended Activity: Ethics Awareness
• Students play the role of major stakeholders, and they have to decide what is ethical and what is not for a given scenario.
• Students to explore Moral Machine (https://www.moralmachine.net/) to understand more about the impact of ethical concerns
Learning Outcome: To gain awareness around AI bias and AI access
Session: AI Bias and AI Access
• Discussing about the possible bias in data collection
• Discussing about the implications of AI technology
Learning Outcome: To let the students analyze the advantages and disadvantages of Artificial Intelligence
Recommended Activity: Balloon Debate
• Students divide in teams of 3 and 2 teams are given same theme. One team goes in affirmation to AI for their section while the other one goes against it.
• They have to come up with their points as to why AI is beneficial/harmful for the society.
UNIT 2: DATA LITERACY
SUB-UNIT LEARNING OUTCOMES SESSION / ACTIVITY / PRACTICAL
Basics of data literacy
Learning Outcomes:
• To define data literacy and recognize its importance
• To understand how data literacy enables informed decision-making and critical thinking
• To apply the Data Literacy Process Framework to analyze and interpret data effectively
• To differentiate between Data Privacy and Security
• To identify potential risks associated with data breaches and unauthorized access
• To learn measures to protect data privacy and enhance data security
Session: Basics of data literacy
• Introduction to Data Literacy
• Impact of Data Literacy
• How to become Data Literate?
• What are data security and privacy? How are they related to AI?
• Best Practices for Cyber Security
Recommended Activity: Impact of News Articles
Reference Videos:
• https://www.youtube.com/watch?v=yhO_t-c3yJY
• https://www.youtube.com/watch?v=aO858HyFbKI
• https://www.cbse.gov.in/cbsenew/documents/Cyber%20Safety.pdf
Acquiring, Processing, and Interpreting Data
Learning Outcomes:
• To determine the best methods to acquire data
• To classify different types of data and enlist different methodologies to acquire it
• To define and describe data interpretation
• To enlist and explain the different methods of data interpretation
• To recognize the types of data interpretation
• To realize the importance of data interpretation
Session: Acquiring, Processing, and Interpreting Data
• Types of data
• Data Acquisition/Acquiring Data
• Best Practices for Acquiring Data
• Features of data and Data Preprocessing
• Data Processing and Data Interpretation
• Types of Data Interpretation
• Importance of Data Interpretation
Recommended Activities:
• Trend analysis
• Visualize and Interpret Data
Project Interactive Data Dashboard & Presentation
Learning Outcomes:
• To recognize the importance of data visualization
• To discover different methods of data visualization
Session: Project Interactive Data Dashboard & Presentation
• Data visualization Using Tableau
Reference Links:
• https://public.tableau.com/en-us/s/download
• https://www.datawrapper.de/
Video Links:
• https://www.youtube.com/watch?v=NLCzpPRCc7U
• https://www.youtube.com/watch?v=_M8BnosAD78
PRINT
• To find square of number 7
• To find the sum of two numbers 15 and 20
• To convert length given in kilometers into meters
• To print the table of 5 up to five terms
• To calculate Simple Interest if the principle_amount = 2000, rate_of_interest = 4.5, time = 10
INPUT
• To calculate Area and Perimeter of a rectangle
• To calculate Area of a triangle with Base and Height
• To calculate average marks of 3 subjects
• To calculate discounted amount with discount %
• To calculate Surface Area and Volume of a Cuboid
LIST
• Create a list in Python of children selected for science quiz with following names—Arjun, Sonakshi, Vikram, Sandhya, Sonal, Isha, Kartik. Perform the following tasks on the list in sequence—
— Print the whole list
— Delete the name “Vikram” from the list
— Add the name “Jay” at the end
— Remove the item which is at the second position
• Create a list num = [23,12,5,9,65,44]
— print the length of the list
— print the elements from second to fourth position using positive indexing
— print the elements from position third to fifth using negative indexing
• Create a list of first 10 even numbers, add 1 to each list item and print the final list.
• Create a list List_1 = [10,20,30,40]. Add the elements [14,15,12] using extend function. Now sort the final list in ascending order and print it.
IF, FOR, WHILE
• Program to check if a person can vote
• To check the grade of a student
• Input a number and check if the number is positive, negative or zero and display an appropriate message
• To print first 10 natural numbers
• To print first 10 even numbers
• To print odd numbers from 1 to n
• To print sum of first 10 natural numbers
• Program to find the sum of all numbers stored in a list
Important Links
• https://cbseacademic.nic.in/web_material/Curriculum21/publication/secondary/Python_Content_Manual.pdf
• https://drive.google.com/drive/folders/1qRAckDculA5i164OUFDlilxb8mT65MMb
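As an illustration, two of the practicals listed above (the Simple Interest calculation and the science quiz list) could be written roughly as follows. This is only a sketch of one possible answer: the formula used is the standard Simple Interest formula (P × R × T / 100), and "second position" is taken to mean index 1, which is an assumption about how the task is meant to be read.

```python
# Simple Interest, using the values given in the practical list.
# The variable name is spelled as in the list ("principal" is the usual spelling).
principle_amount = 2000
rate_of_interest = 4.5
time = 10
simple_interest = principle_amount * rate_of_interest * time / 100
print("Simple Interest:", simple_interest)

# Science quiz list: perform the four tasks in sequence.
children = ["Arjun", "Sonakshi", "Vikram", "Sandhya", "Sonal", "Isha", "Kartik"]
print(children)              # print the whole list
children.remove("Vikram")    # delete the name "Vikram"
children.append("Jay")       # add the name "Jay" at the end
children.pop(1)              # remove the item at the second position (index 1)
print(children)
```

Because the tasks are performed in sequence, the item removed in the last step is whichever name sits at index 1 after the earlier changes.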
PART D: Project Work / Field Visit / Student Portfolio (*relate it to Sustainable Development Goals)
SUGGESTED PROJECTS/ FIELD VISIT / PORTFOLIO (ANY ONE HAS TO BE DONE)
Suggested Projects
1. Create an AI Model using tools like:
— Teachable Machine (https://teachablemachine.withgoogle.com/)
— Machine Learning for Kids (https://machinelearningforkids.co.uk/)
2. Choose an issue that pertains to the objectives of sustainable development and carry out the actions listed below.
— To understand more about the problem identified, create a 4Ws problem canvas.
— To identify the data features and create a system map to understand the relationship between them
— To visualize the data collected graphically (spreadsheet software to be used to store and visualize the data)
— Suggest an AI-enabled solution to it (Prototype/Research Work)
Suggested Field Visit
Visit to an industry or IT company or any other place that is creating or using AI applications and present the report for the same. Visit can be in physical or virtual mode.
Suggested Student Portfolio
Maintaining a record of all AI activities and projects (for example, Letter to Future Self, Smart Home Floor Plan, Future Job Advertisement, Research Work on AI for SDGs and AI in Different Sectors, 4Ws canvas, System Map). (Minimum 5 Activities)
CONTENTS
SUBJECT-SPECIFIC SKILLS
1. AI Reflection, Project Cycle and Ethics
EVALUATION AND METRICS IN MACHINE LEARNING
DIFFERENT ML TASKS—DIFFERENT EVALUATION METRICS
• Classification Models
• Regression Models
EVALUATION EXAMPLES
• Evaluation Methods
• ROC-AUC Curve
DEPLOYMENT OF AI MODELS
• Examples of AI Deployment
• AI Deployment in Smartphones
• Mapping the Problem to AI Project Cycle
SOME AI APPLICATIONS
• Plantix: An AI-based Solution
WHAT ARE ETHICS
• Ethics vs Morals
• Why are ethics important
• AI Ethics Principles
4. Generative AI
INTRODUCTION
WHAT IS GENERATIVE AI
• Key Drivers of Generative AI
• Evolution of Generative AI
• Generative AI Applications
• Generative AI: Unlimited Horizons
GENERATIVE AI VS TRADITIONAL AI
TYPES OF GENERATIVE AI
• Generative Adversarial Networks (GANs)
• Recurrent Neural Networks (RNNs)
• Variational Autoencoders (VAEs)
EXAMPLES OF GENERATIVE AI
• OpenAI’s ChatGPT
• Kidgeni
• OpenAI SORA
• MusicLM by Google
• Chrome Music Lab: Song Maker
• This Person Does Not Exist
BENEFITS OF USING GENERATIVE AI
LIMITATIONS OF USING GENERATIVE AI
HOW TO USE GENERATIVE AI TOOLS IN REAL-WORLD SCENARIOS
• Socially Beneficial Uses of Generative AI
ETHICAL CONSIDERATIONS IN USING GENERATIVE AI
POTENTIAL NEGATIVE IMPACT ON SOCIETY
• Energy Usage Concerns
RESPONSIBLE USE OF GENERATIVE AI
• Responsible Use of AI for Students
In machine learning, evaluation and metrics are important to analyze how well a model
performs. These methods help us understand the accuracy and generalizability (how
correctly the model will work on new, unseen data) of ML models. While evaluation is the
process of assessment, metrics are the quantitative measures used to assess the performance
of ML models.
We perform the evaluation of ML models to objectively assess their performance. Evaluation
metrics provide us with concrete measures to understand how well our models are performing
and whether they need improvement. We may also say that evaluation metrics provide
numerical feedback on how well a model is performing in solving a specific task such as
classification, regression, clustering or anomaly detection.
Regression Models
Regression models output numerical values such as predicting a sales figure or a housing
price. The key distinction is that classification is about discrete categories while regression
deals with continuous quantities.
• For example, approximating the distance a car will travel with a certain quantity of fuel
is a prediction we can make on the basis of Regression.
EVALUATION EXAMPLES
In a classification task where we distinguish between apples and oranges based on features
like color and size, a sample evaluation metric could be accuracy, which tells us what
percentage of input fruits is correctly classified out of all the fruits in the dataset.
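The accuracy calculation described above can be sketched in a few lines of Python. The fruit labels below are made up purely for illustration (they are not real data), and the function name `accuracy` is our own choice.

```python
def accuracy(actual, predicted):
    """Return the fraction of predictions that match the actual labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


# Illustrative labels only: 10 fruits, of which 8 are classified correctly.
actual = ["apple", "apple", "orange", "orange", "apple",
          "orange", "apple", "orange", "apple", "orange"]
predicted = ["apple", "orange", "orange", "orange", "apple",
             "orange", "apple", "apple", "apple", "orange"]

fruit_accuracy = accuracy(actual, predicted)
print(f"Accuracy: {fruit_accuracy:.0%}")  # prints "Accuracy: 80%"
```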
Evaluation Methods
1. Train-Test Split
This is the most basic evaluation method in machine learning. The dataset is divided
into two parts: a training set and a testing set. The model is trained on the data from the
training set and is then evaluated for performance on the data from the testing set.
Example: If you have a dataset of 1,000 housing price records, you might use 80% of
the data (800 records) for training and 20% (200 records) for testing. After training the
model on 800 housing price records, you test the trained model’s performance on the
remaining 200 records to evaluate how accurate the price prediction is for any house in
the testing dataset.
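A minimal sketch of the 80/20 split from this example is given below. It uses only Python's standard library; the helper name `train_test_split`, the fixed `seed` and the placeholder records (the numbers 0 to 999 standing in for housing price records) are assumptions made for illustration.

```python
import random


def train_test_split(records, test_fraction=0.2, seed=42):
    """Shuffle a copy of the records and split it into training and testing sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable
    test_size = round(len(shuffled) * test_fraction)
    return shuffled[test_size:], shuffled[:test_size]


# Placeholder data: 1,000 numbers standing in for 1,000 housing price records.
records = list(range(1000))
train_set, test_set = train_test_split(records)
print(len(train_set), len(test_set))  # prints "800 200"
```

Shuffling before splitting helps ensure that the testing set is not just the last 200 records in whatever order the data happened to be stored.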
2. Cross-Validation
Cross-validation is an advanced evaluation method that is especially useful when only a
limited amount of data is available. The dataset is divided into k equal parts (folds). The
model is trained k times, each time using a different fold as the testing set and the
remaining folds as the training set. The average performance across all k tests gives a
more reliable estimate of the model’s performance.
Example: In 5-fold cross-validation, the dataset is split into 5 parts. The model is
trained 5 times, each time using 4 parts for training and the remaining one part for
testing. The results from the 5 tests are averaged to give a final performance metric.
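The way the folds are formed can be sketched as follows. This is a simplified illustration, not a full implementation: the function names are our own, the records are represented only by their index numbers, and no shuffling is done (in practice the data is usually shuffled first).

```python
def k_fold_splits(n_records, k=5):
    """Yield (train_indices, test_indices) once for each of the k folds."""
    indices = list(range(n_records))
    fold_size = n_records // k
    for fold in range(k):
        start = fold * fold_size
        # The last fold takes any leftover records when n_records is not divisible by k.
        end = n_records if fold == k - 1 else start + fold_size
        test_indices = indices[start:end]
        train_indices = indices[:start] + indices[end:]
        yield train_indices, test_indices


def average_score(fold_scores):
    """Average the per-fold scores into one final performance metric."""
    return sum(fold_scores) / len(fold_scores)


# With 100 records and k = 5, every fold trains on 80 records and tests on 20.
fold_sizes = [(len(train), len(test)) for train, test in k_fold_splits(100, k=5)]
print(fold_sizes)
```

Each record appears in the testing set exactly once across the five folds, which is what makes the averaged score more reliable than a single train-test split.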
3. Confusion Matrix
Let us create two classifications—‘Dog’ and ‘Not Dog’—as an example.
Possibility: The model can make correct predictions and may classify the provided input
correctly. There may be two such cases:
1. The model may predict the actual class correctly, i.e., the classifier between ‘Dog’ and
‘Not Dog’ images correctly classifies a dog image provided as input.
                              ACTUAL VALUES
                              POSITIVE          NEGATIVE
PREDICTED VALUES   POSITIVE   TRUE POSITIVE     FALSE POSITIVE
                   NEGATIVE   FALSE NEGATIVE    TRUE NEGATIVE
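The four cells of the confusion matrix can be counted directly from the actual and predicted labels. The sketch below uses the ‘Dog’ and ‘Not Dog’ classes from this section; the six labels are invented for illustration and the function name `confusion_counts` is our own.

```python
def confusion_counts(actual, predicted, positive="Dog"):
    """Count TP, FP, FN and TN, treating `positive` as the positive class."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # predicted Dog, actually Dog
        elif p == positive:
            fp += 1  # predicted Dog, actually Not Dog
        elif a == positive:
            fn += 1  # predicted Not Dog, actually Dog
        else:
            tn += 1  # predicted Not Dog, actually Not Dog
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


# Illustrative labels only.
actual = ["Dog", "Dog", "Not Dog", "Not Dog", "Dog", "Not Dog"]
predicted = ["Dog", "Not Dog", "Not Dog", "Dog", "Dog", "Not Dog"]

counts = confusion_counts(actual, predicted)
print(counts)  # prints {'TP': 2, 'FP': 1, 'FN': 1, 'TN': 2}
```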
Do it Yourself
CONFUSION MATRIX
Consider a Spam Classification ML algorithm which has the following performance: 50
emails correctly identified as spam, 10 emails incorrectly identified as spam (i.e., they are
actually not spam), 5 emails incorrectly identified as not spam (i.e., they are actually spam)
and 35 emails correctly identified as not spam. Count the values of TP, TN, FP and FN for
the above statement.
                              Actual Class
                              Spam        Not Spam
Predicted Class    Spam       TP =        FP =
                   Not Spam   FN =        TN =
ROC-AUC Curve
ROC Curve
The Receiver Operating Characteristic (ROC) curve is a graphical representation used to
evaluate the performance of a binary classification model. It plots the true positive rate
(TPR) against the false positive rate (FPR). The true positive rate and false positive rate
are calculated from the confusion matrix.
[Figure: ROC curve, showing the perfect classifier]
AUC Values
• AUC = 1: Perfect model
• 0.9 ≤ AUC < 1: Excellent model
• 0.8 ≤ AUC < 0.9: Good model
• 0.7 ≤ AUC < 0.8: Fair model
• 0.6 ≤ AUC < 0.7: Poor model
• 0.5 ≤ AUC < 0.6: Failed model
[Figure: area under the ROC curve (AUC), with TPR plotted against FPR]
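The two rates plotted on an ROC curve come from the confusion matrix using the standard formulas TPR = TP / (TP + FN) and FPR = FP / (FP + TN). The sketch below computes them and also looks up the rating for an AUC value from the list above. The function names and the example counts are assumptions chosen only for illustration.

```python
def rates(tp, fp, fn, tn):
    """Return (true positive rate, false positive rate) from confusion matrix counts."""
    tpr = tp / (tp + fn)  # share of actual positives that were caught
    fpr = fp / (fp + tn)  # share of actual negatives wrongly flagged as positive
    return tpr, fpr


def auc_rating(auc):
    """Map an AUC value to the rating given in the AUC Values list."""
    if auc == 1:
        return "Perfect model"
    bands = [(0.9, "Excellent model"), (0.8, "Good model"), (0.7, "Fair model"),
             (0.6, "Poor model"), (0.5, "Failed model")]
    if auc < 1:
        for lower_bound, label in bands:
            if auc >= lower_bound:
                return label
    return "Outside the ranges listed"


# Illustrative counts only: 40 TP, 5 FP, 10 FN, 45 TN.
tpr, fpr = rates(tp=40, fp=5, fn=10, tn=45)
print(tpr, fpr)          # prints "0.8 0.1"
print(auc_rating(0.85))  # prints "Good model"
```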
DEPLOYMENT OF AI MODELS
Deployment in AI refers to the process of taking a developed and tested AI model and
making it available for use in real-world applications. It is the final step in the AI project
cycle, where the model moves from the development environment to being actively used by
people or systems.
Examples of AI Deployment
• Self-Driving Cars: AI models are deployed in self-driving cars to help them navigate
roads, recognize traffic signs and avoid obstacles.
• Medical Diagnosis Systems: In healthcare, AI models assist doctors by analyzing
medical images to detect diseases and recommend treatments.
• Chatbots: Many websites and apps use AI-powered chatbots to answer customer
questions, provide information and help with tasks.
AI Deployment in Smartphones
AI is now commonly deployed in many of the smartphone features and apps that we use
regularly.
Case Study
CROP DISEASE DETECTION AI PLATFORM
Problem: How to Improve Disease Detection Efficiency to Prevent Crop Loss. Agriculture is
a critical industry and crop diseases can lead to significant losses in quality and yield. Crop
diseases can destroy fields, leading to food scarcity and huge losses to farmers.
Challenges:
• Lack of access to expert agronomists in rural and remote areas.
• Delays in diagnosing diseases can result in widespread crop damage.
• Visual symptoms of diseases, such as leaf spots or wilting, can be easily missed.
Example: Early blight adversely affects the tomato crop. The symptoms of the disease can be
identified with the help of AI using leaf images.
• Normal Leaf: Healthy, green leaves
• Diseased Leaf: Leaves with brown spots and yellowing
AI Solution Implementation: An AI-based crop disease detection platform can be developed
in collaboration with agricultural universities and tech companies. AI models are trained to
achieve high accuracy—comparable to expert agronomists—in detecting various crop diseases.
Deployment: The AI model for disease detection can be deployed in a mobile application. This
will allow farmers to click leaf pictures, which can then be inspected by AI to check for a disease
and suggest remedial measures.
How it works:
• Image Collection: Farmers or technicians click pictures of crop leaves using smartphones.
• AI Analysis: The digital images are analyzed by AI to detect signs of disease.
• Real-Time Feedback: The platform provides immediate feedback and suggests treatment.
This AI application helps in quick identification of crop diseases. Moreover, with a mobile app
deployment, even farmers without access to experts can use this technology.
3. Set Up Your Project: Choose ‘Image Project’ from the available project types. Click on ‘Standard
Image Model’.
5. Upload Images: For the ‘Healthy’ class, click on the ‘Upload’ button and select the images of
healthy tomato leaves from the downloaded dataset. Repeat this process for the ‘Diseased’
class by uploading the images of leaves affected by early blight.
6. Train the Model: Once all the images are uploaded, click on the ‘Train Model’ button. The
platform will start training your model using the uploaded images. This may take a few minutes
depending on the size of your dataset.
7. Test the Model: After the training is complete, you can test your model using the
‘Test Model’ section. Upload new images of tomato leaves to see if the model can correctly
classify them as healthy or diseased.
Discussion Questions
1. What challenges did you encounter while training the model?
2. Can you create a confusion matrix for the model?
3. How can the accuracy of the model be improved?
4. How can this AI model be useful for farmers in real-world scenarios?
SOME AI APPLICATIONS
Face Lock in Smartphones: Face lock is a security feature that
uses facial recognition technology to unlock smartphones.
Instead of typing a password or using a fingerprint, users
simply have to look at their devices to gain access.
How It Works:
1. Camera Capture: The phone’s front camera captures an
image of the user’s face.
2. Facial Features Analysis: AI algorithms analyze unique
facial features such as the distance between the eyes,
nose shape or jawline.
3. Comparison: AI compares the captured image with the stored facial data from the
phone’s setup phase.
4. Unlocking: If the features match, the phone unlocks.
Such AI systems provide users with quick and easy access to their devices, without relying on
passwords. They also make it harder for unauthorized users to gain access as compared to
conventional device unlocking systems.
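The comparison and unlocking steps can be illustrated with a toy sketch. Real face lock systems use far more sophisticated models; here each face is simply assumed to be a short list of made-up measurements, and the `threshold` value is a hypothetical cut-off chosen for the example.

```python
import math


def is_match(stored_features, captured_features, threshold=0.1):
    """Unlock only if the captured features are close enough to the stored ones."""
    distance = math.dist(stored_features, captured_features)  # straight-line distance
    return distance <= threshold


# Made-up measurements (e.g., eye distance, nose shape, jawline), scaled to 0-1.
stored_face = [0.42, 0.31, 0.77]     # saved during the phone's setup phase
owner_today = [0.43, 0.30, 0.78]     # the owner, captured slightly differently
someone_else = [0.60, 0.10, 0.50]    # a different person

print(is_match(stored_face, owner_today))   # prints "True"  -> phone unlocks
print(is_match(stored_face, someone_else))  # prints "False" -> phone stays locked
```

A small threshold makes the lock stricter (fewer wrong unlocks but more failed attempts by the owner), while a larger one makes it more lenient.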
Such AI systems help prevent financial losses due to fraud. They are designed to quickly
process large volumes of transactions to detect suspicious activity in real time.
Planning the AI Solution: Start by listing essential factors for maize crop health and
productivity.
This system aims to:
• Detect early signs of maize lethal necrosis and nutrient deficiencies.
• Recommend appropriate actions such as adjusting fertilizer application or
implementing disease-resistant varieties.
• Provide real-time alerts and guidance to farmers through a user-friendly
mobile application.
(Add other outcomes you think are needed.)
• ............................................................................................................................................................................
• ............................................................................................................................................................................
Problem Scoping
Define the scope of the problem to be addressed by your project. Identify various diseases
and nutrient deficiencies affecting maize crops, along with their specific symptoms and
impact on yield.
Data Acquisition
Collect the following data:
• Images of diseased maize plants and symptoms
• Farmer details including location, farm size and cropping practices
• Soil nutrient levels and historical yield data
Ensure data accuracy and reliability for effective decision-making in crop management.
Data Exploration
Analyze the cleaned data to identify patterns in disease outbreaks, nutrient deficiencies and
crop performance. This exploration aids in understanding the underlying factors influencing
maize health.
Modelling
Select AI algorithms best suited for image recognition and data analysis. Develop models that
can accurately identify disease symptoms and nutrient deficiencies from images and other
data inputs.
Deployment
Deploy the AI-driven maize health monitoring system in a mobile application. Ensure
accessibility to farmers, providing timely insights and actionable recommendations.
Impact on Farmers
By using Plantix, farmers gain easy access to expert-level diagnostics and advice. This helps
them to take timely and informed actions to protect their crops from diseases and nutrient
deficiencies. As a result, they can improve crop yields, reduce losses and increase their overall
agricultural productivity.
FUN TIME
To watch an interesting reference video on ethical scenarios, open the link
https://www.youtube.com/watch?v=nyTmeb4vFqE in your web browser or scan
the given QR code.
............................................................................................................................................................................
............................................................................................................................................................................
4. What are the potential consequences of refusing to use the test answers?
............................................................................................................................................................................
............................................................................................................................................................................
............................................................................................................................................................................
............................................................................................................................................................................
............................................................................................................................................................................
............................................................................................................................................................................
Ethics vs Morals
Both ethical and moral questions can be challenging for humans to answer—they present
different types of difficulties based on their nature.
Ethical questions: Focus on what society says is right or wrong. They involve rules and laws,
and affect groups of people.
Moral questions: Focus on personal beliefs about good and bad. They are based on individual
choices and affect an individual and their close circle.
A comparison table to illustrate the differences and challenges associated with ethical and
moral questions is presented below:
Paired Examples
Do it Yourself
1. Parneet takes home the silverware from a restaurant after dinner, which is considered
theft. Is this an ethical or moral concern? Explain why.
2. After Akshit and Aman discuss buying a new mobile phone, focusing on specific features,
Aman starts receiving notifications for mobile models that match their discussion.
Identify the ethical concern in this scenario.
AI Ethics Principles
To make AI better, we need to identify the factors responsible for Human Rights Bias
Human Rights
AI systems should respect and support human rights. This means AI should not harm people
or take away their freedom. AI should not be used to:
• Invade Privacy: AI should not peek into people’s private lives without permission.
• Discriminate: AI should treat everyone fairly, regardless of race, gender, age or religion.
• Limit Freedom: AI should support people’s rights to speak freely and stand up for what
they believe in.
• Take Control: AI should not take away control from humans.
FUN TIME
The video ‘Will Artificial Intelligence Take Over the World?’ provides
an excellent analysis of AI and its control issues. To watch the
video and know more, scan the given QR code or open the link
https://www.youtube.com/watch?v=pnpq69WaRsM in your web browser.
Case Study
AMAZON’S GENDER-BIASED HIRING TOOL
Amazon developed an AI tool to help with the hiring process but it was found to be biased
against female candidates. This bias was due to the data used for training, which contained
mostly male resumes. As a result, the AI system favoured male candidates, thus perpetuating
gender bias in hiring.
You can read more about the same on
https://www.reuters.com/article/us-amazon-com-jobs-automation-insight-idUSKCN1MK08G
To watch a video on the Amazon gender bias fiasco, scan the given QR
code or open the link https://www.youtube.com/watch?v=JOzQjT-hJ8k in your
web browser.
Case Study
RACIAL BIAS IN FACIAL RECOGNITION
Facial recognition systems have shown racial bias, performing less accurately for people with
darker skin tones. This bias is because the training data usually contains more images of lighter-
skinned individuals. This leads to misidentification and unequal treatment of people based on
their skin color.
To read more on this, open the link
https://www.nytimes.com/2019/04/17/technology/facial-recognition-technology-bias.html in
your web browser.
To watch a video on racial bias in algorithms, scan the given QR code or open the
link https://www.youtube.com/watch?v=IzvgEs1wPFQ in your web browser.
To stop AI bias, it is essential to fix the problems in the data that is used to teach AI and to
ensure that AI systems are fair and unbiased.
Inclusion
AI should not discriminate against a particular group of people causing them any kind of
disadvantage. AI systems should be designed to be inclusive and benefit everyone. All AI
developers must strive to:
• Make AI Accessible: Design AI systems such that people with different abilities and
backgrounds can use them.
• Include Diverse Views: Involve people from different backgrounds while creating AI
systems to make sure they work well for everyone.
• Promote Equality: Use AI to create equal opportunities and help everyone, especially
those who are less fortunate.
Do it Yourself
CONFUSION MATRIX
A company developed a confidential AI recruiting tool but their machine learning experts later
discovered a significant issue: the new tool exhibited bias against women. It autonomously
learned to favour male candidates and penalized resumes containing the term ‘female’. As
a result, the tool failed to perform as intended.
(a) Which AI ethics issue is highlighted in this scenario?
(b) What could have caused the ethical concern identified in the situation?
Memory Bytes
Data statistics are crucial for understanding the characteristics of datasets used in AI.
Data preparation involves cleaning and organizing data to make it suitable for training AI models.
Data splitting divides data into training and testing sets to evaluate model performance.
Cross-validation is an advanced evaluation technique for assessing model reliability.
Exercises
Objective Type Questions
I. Multiple Choice Questions (MCQs):
1. Which of the following uses Artificial Intelligence to function?
(a) Toothbrush
(b) Self-driving car
(c) Analog wrist watch
(d) Ceiling fan
2. Which programming language is commonly used today for developing AI applications?
(a) Ruby
(b) Java
(c) C
(d) Python
3. The field of Artificial Intelligence which helps to identify and process human images is:
(a) Face Detection
(b) Computer Vision
(c) Natural Language Processing
(d) Eye-in-Hand System
Learning Objectives
Understanding and explaining the fundamental concepts of data literacy, including the
differences between data, information, knowledge and wisdom
Identifying and applying the steps involved in data collection, cleaning, preprocessing and
analysis to real-world scenarios
Utilizing various data visualization techniques to effectively communicate data insights
through charts, graphs and infographics
Implementing key data security and privacy measures to protect personal and sensitive
information
Developing a critical-thinking approach to evaluate data sources, interpret data accurately
and make evidence-based decisions
Introduction
The world we live in today can rightly be termed as a ‘Golden Era for Data’. Billions of
smartphones, computers and other electronic gadgets, installed with countless different types
of software applications, are generating and storing huge amounts of information on a variety
of topics every second of the day. Imagine the number of web pages, podcasts, images, audio,
videos, spreadsheets, written posts, blogs, articles, ebooks, infographics and comments being
generated every day.
[Figure source: Statista Digital Economy Compass 2019]
Have you ever wondered what happens with this data? Who is this data important for? How do
we utilize this data? Or more importantly, what exactly is data? We shall find the answers to
all these questions and much more in this chapter on Data Literacy.
Welcome to an amazing journey through the wonderful world of data and information,
ultimately leading you towards Data Literacy, which is one of the most essential skills of the
21st Century.
Most people believe that Data Literacy skills are important only for professionals working
with Computer Science. This is a common misconception. Can you guess what a top javelin
star, a meteorologist, a business executive and a farmer have in common? All of them work in
fields where data literacy has become a major factor in success.
Today, data literacy is equally important for any person working in any domain of modern
life—be it entertainment, agriculture, sports, business, journalism, finance, economics,
humanities, healthcare, medicine, astronomy or robotics. Let us begin our journey with
the understanding of some building blocks of Data Literacy, starting with the smallest
block, i.e., data.
WHAT IS DATA
Look around you and try ‘describing’ the place, things and people you see. In the study of
English (or any other language), we learn to describe people, places and things with nouns
(which help us name something or somebody) and adjectives (which help us describe the
properties of a person, place or thing, or even our feelings and emotions).
Additionally, Mathematics helps us to describe those attributes or properties which are either
countable or measurable. The ability to describe is the foundation of human communication.
We may be able to communicate with another human being in Hindi, English, Spanish,
German or any language both parties know well, which helps us in the process of description.
[Figure: A blackboard]
Data in Computing
Data is the ‘describing’ language of computers. Computers use data for describing
anything—be it non-measurable properties like names/feelings or any measurable thing like
count or quantity. Anything we store on a computer is the description of something, which we
call data; similarly, any output we receive from a computer is also called data. Let us look at
the formal definition of data.
Data is a collection of facts such as numbers, words, measurements and observations, or
just descriptions of things.
Whenever we think of data, we may think of numbers. However, data on computers can also
be videos, documents, spreadsheets, audio files, photographs and many other different
formats used by computers. Computers store data in binary digits (0 and 1), popularly known
as bits, which are grouped into bytes of 8 bits each.
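To see bits and bytes in action, here is a short Python sketch. It assumes the text is stored using the common UTF-8 encoding; the sample word "Hi" is chosen only for illustration.

```python
text = "Hi"

# Convert the text into bytes (UTF-8 is assumed here as the encoding)
data = text.encode("utf-8")

# Show each byte as a group of 8 bits
bits = [format(byte, "08b") for byte in data]

print(len(data))  # 2 (one byte per letter in this example)
print(bits)       # ['01001000', '01101001']
```

Each letter in this example takes up exactly one byte, and each byte is written as eight 0s and 1s.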
Do not confuse ‘describing’ language with ‘programming’ language—the latter is the
language of instruction for the computer. A program is a set of instructions (written in
programming language) which tells the computer what to do with data and how to do it.
Do it Yourself
Describe the people below.
Data Literacy 29
Let us illustrate a few examples of data to understand it better. Consider the following table.
Each of the values within columns A to I is an example of data. The data may consist of
numbers, characters, symbols or their combinations as illustrated below:
• The numbers in columns A, G and I are all data values.
• The text stored in columns C, E and H are data values.
• The images in column F are data.
• The character and symbol combinations in column B and the character and number
combinations in column D are also data.
A | B | C | D | E | F | G | H | I
7802452133 | B+ | Sameer | 10A | Pizza | [image] | 10000 | Mumbai | 99.1
Data can exist in various forms as observed in the above example. But does this data have
any meaning?
Let us play a guessing game: What do the data values in each column signify? Think about
the significance of the digits in column A. What may these numbers mean?
All the students in the class may interpret these values differently. The above data table
has no meaning of its own. It is just a collection of numbers, characters and their
combinations.
Data VS Information
Data and information are used almost interchangeably by most people. But for proficiency in
Data Literacy, you must understand that both the terms are very different from each other.
Data, by itself, does not have any meaning at all. It shall remain a meaningless collection
of characters, numbers, words or images. Information, on the other hand, represents
meaningful data. It is important to note that data and information are closely related to each
other, as explained further.
Let us rewrite the data table from the previous section. This time we shall have
some additional details in the table, as shown in the top row of the illustration below.
These additional details are called Context, which help us to understand the meaning
of data.
Is it possible to understand the meaning of data contained in each column now? You may
observe that the first column stores the phone number of the student and second column
contains the grade in the last class and so on. Just by adding context to the data values, it is
very easy to understand the meaning of the numbers or characters stored in the table as data.
Jargon Alert
CONTEXT
Context refers to the situation within which something exists or happens and that can
help explain it.
Booking ID | Blood Group | Passenger Name | Seat No. | In-flight Order | Photo on ID | Ticket Cost (INR) | Destination | Body Temperature (°F)
7802452133 | B+ | Sameer | 10A | Pizza | [image] | 10000 | Mumbai | 99.1
Interestingly, all the data values in the given table are the same as the Student Information
table given before. However, the meaning of the columns has changed completely. We can
now understand that data may be a raw, unorganized and meaningless group of characters or
numbers (or combinations of both) but when it is processed, organized and given a context,
it becomes meaningful information.
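The effect of adding context can also be shown in a few lines of Python. This is only an illustrative sketch: the function name add_context is hypothetical, and the text "<photo>" stands in for the image stored in the table.

```python
# Raw data: a row of values with no meaning of its own
values = ["7802452133", "B+", "Sameer", "10A", "Pizza", "<photo>", 10000, "Mumbai", 99.1]

# Context: the column headings from the flight booking table
flight_labels = [
    "Booking ID", "Blood Group", "Passenger Name", "Seat No.", "In-flight Order",
    "Photo on ID", "Ticket Cost (INR)", "Destination", "Body Temperature (°F)",
]

def add_context(labels, row):
    """Pair every value with its label so that the row becomes information."""
    return dict(zip(labels, row))

flight_record = add_context(flight_labels, values)
print(flight_record["Seat No."])     # 10A
print(flight_record["Destination"])  # Mumbai
```

With a different list of labels, the very same values would tell a different story, which is exactly why context matters.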
The difference between data and information is as follows:
Data | Information
It is a raw collection of numbers or characters. | Processed data presented with context is known as information.
Data is independent of information. | Information is dependent on data from which it is derived.
Data can be collected by observation and from records. | Information is generated from data.
Data may be meaningless. | Information is always meaningful.
Raw data cannot be used for decision-making. | Since information is processed meaningful data, it helps us in decision-making.
Data may be difficult to understand without context. | Information is usually easier to understand.
Data is represented in bits and bytes. | Information is represented with ideas and thoughts.
Example: Raw Data | Example: Information
110001 | The Pin Code of Parliament House in New Delhi is 110001.
We may summarize the relationship between data and information as under:
• Data refers to raw facts, observations, measurements or records that are typically
collected and stored for analysis, reference or for use in decision-making.
• Data can exist in various forms, including text, numbers, images, audio, video or any
other format that can be processed by computers.
• When data is processed and organized into a meaningful structure, it becomes
information that can be used to make predictions or support decision-making.
[Figure: DIKW Pyramid. Levels from base to top: Data, Information, Knowledge, Wisdom,
Decision-making. Side labels: Volume, Usefulness, Adaptive.]
decision-making level at the top, which helps us to convert data into decisions for both
human beings and computers.
[Figure: From data to decisions. Steps from bottom to top: Data, Add Context, Information,
Add Meaning, Knowledge, Add Insights, Wisdom, Add Purpose, Decisions. Side notes: Past
(Understand Patterns, Discover Relationships); Future (What action to take based on
knowledge or wisdom?).]
Case Study
Let us understand the DIKW pyramid with a simple illustration.
Level: 1. Data (Base level)
Composition:
Om, 2, DataRich School, Bus, Car, Home, 10, Petrol, 96.30, Delhi, AQI, bicycle,
Pollution, Distance.
The elements on the base level do not make any sense but represent possible storage
elements called data.

Level: 2. Information (Level 1) (Add Context)
Composition:
The below statements are possible information which we get by adding context to the
data in the base level.
• Om is a student of Class 10 of DataRich School.
• He lives in Delhi. The distance between his School and Home is 2 kilometres.
• Om can commute to school on foot, on bicycle, by car or by bus.
• Delhi is one of the most polluted cities in India, with Air Quality Index (AQI)
close to 400.
• Petrol price is 96.30 rupees/litre.

Level: 3. Knowledge (Level 2) (Combine Information to add meaning)
Composition:
Some statements which we know through our knowledge from information other
than the above statements can be:
• Walking and cycling make us fit.
• Bicycles are pollution-free vehicles.
• Vehicles add to pollution.
• Short distances can be covered on foot or on bicycle.
• To travel long distances, a vehicle is better. Vehicle pooling helps reduce pollution.
Some knowledge statements which can be derived by combining information
statements from level 1 are:
• Om can commute to his school without any vehicle because the distance is just
2 kilometres, which can be easily covered by walking.
• Om can commute to school by car but the expenses will be more and he shall be
adding to pollution in the city, thus making the AQI worse.
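The journey from data to a decision in Om's case can be sketched in Python as shown below. This is only an illustration: the function name suggest_commute and the 3-kilometre limit for a walkable distance are assumptions and are not part of the case study.

```python
# Data: raw values from the base level, with no meaning on their own
data = ["Om", 2, "DataRich School", "Delhi", 400]

# Information: the same values after context has been added
information = {
    "student": "Om",
    "distance_to_school_km": 2,
    "school": "DataRich School",
    "city": "Delhi",
    "city_aqi": 400,
}

# Knowledge: short distances can be covered on foot or on bicycle
WALKABLE_DISTANCE_KM = 3  # hypothetical limit, assumed only for this sketch

def suggest_commute(distance_km):
    """Turn knowledge about distance into a suggested way of commuting."""
    if distance_km <= WALKABLE_DISTANCE_KM:
        return "walk or cycle"
    return "bus or car pool"  # pooling vehicles helps reduce pollution

print(suggest_commute(information["distance_to_school_km"]))  # walk or cycle
```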
Do it Yourself
Match the Data, Information, Knowledge, Wisdom and Decisions to the correct entries in the table.
12 | Decisions
12 degrees Celsius in my city today | Wisdom
Humans feel very cold in 12 degrees Celsius | Knowledge
12 degrees Celsius | Information
I should put on a sweater | Data
Entertainment
The most popular entertainment apps like Instagram, Snapchat, Spotify, Netflix, Facebook,
and YouTube, along with all other social media platforms that you might be using passively,
record the data that you generate. Social media companies view your likes, comments, shares
and browsing data to create your unique profile. With the help of these, they ‘recommend’
new content, friends and channels and even target advertisements based on your interests.
Similarly, the recommendation algorithms of Netflix and YouTube can suggest videos based
on what you have previously watched and liked.
The algorithms are hidden but you produce data for them with your likes and clicks and, in
return, consume data which they recommend.
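Real recommendation algorithms are hidden and far more complex, but the basic idea can be sketched in Python. In this toy example, the video titles, the tags and the function name recommend are all made up for illustration; it simply counts how many tags a video shares with the tags you have liked before.

```python
def recommend(liked_tags, catalogue, top_n=2):
    """Rank videos by how many tags they share with the tags the user liked."""
    scores = {title: len(liked_tags & tags) for title, tags in catalogue.items()}
    # Highest score first; ties are broken alphabetically by title
    ranked = sorted(scores, key=lambda title: (-scores[title], title))
    return [title for title in ranked if scores[title] > 0][:top_n]

# Made-up catalogue of videos and their tags (illustrative data only)
catalogue = {
    "Cricket Highlights": {"sports", "cricket"},
    "Space Documentary": {"science", "space"},
    "Batting Masterclass": {"sports", "cricket", "tutorial"},
    "Cooking Basics": {"food", "tutorial"},
}

liked_tags = {"cricket", "tutorial"}  # data generated by the user's likes and clicks
print(recommend(liked_tags, catalogue))  # ['Batting Masterclass', 'Cooking Basics']
```

Every new like or click changes the set of liked tags, which in turn changes what is recommended next.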
Experience AI
Learn more about use of data in entertainment by scanning the given QR code or by opening
the link https://www.youtube.com/watch?v=OuU3UfRM2pE in your web browser.
Agriculture
Can you believe that farmers are making use of data in agriculture? Agri-tech is helping the
farmers in a big way—data is useful in weather prediction, soil nutrition, making efficient
decisions on what to plant, when to harvest and several other activities related to agriculture
and farming.
Jargon Alert
AGRI-TECH
Agri-tech is the use of technology for farming that is developed to improve efficiency
and profitability in agriculture.
Recall the Sustainable Development Goals adopted by the United Nations given on
https://sdgs.un.org/goals. SDG 2 is End hunger, achieve food security and improved nutrition
and promote sustainable agriculture. Food security is one of the most important sustainable
development goals for humans and data plays a major role in aiding farming and agriculture.
Experience AI
To learn more about how data is helping us in the agriculture domain, scan the
given QR code or open the link https://www.youtube.com/watch?v=A3-GjKOdUGo in your
web browser.
Online Shopping
The use of data in e-commerce and online shopping portals like Amazon allows such
companies to keep track of the buying habits of their customers. Data also helps companies
decide which items to stock by forecasting demand (like gifts and sweets before festivals).
Data is also useful in understanding the interests of customers and offering discounts to
increase sales. Robotics and Artificial Intelligence are ways in which technology and data
help e-commerce.
Experience AI
Scan the given QR code or open the link https://www.youtube.com/watch?v=IMPbKVb8y8s
in your web browser to discover how data and technology are lending a helping hand to
Amazon in its warehouses to maintain the position of a market leader in the online
shopping industry.
Experience AI
Apple watches have been helpful in saving the lives of users! Check out this news article to know the
fascinating application of data and technology in healthcare.
Scan the given QR code or open the link https://economictimes.indiatimes.com/magazines/
panache/apple-watch-saves-haryana-dentists-life-by-detecting-99-9-artery-blockage-ceo-tim-
cook-reacts/articleshow/90319891.cms in your web browser.
To know more features of smartwatches and their use in healthcare, watch the YouTube video by
scanning the given QR code or open the link https://www.youtube.com/watch?v=UZTBIlzGRoA
in your web browser.
Experience AI
It is predicted that data and technology will play a very important role in the future of human
healthcare with 24/7 monitoring of health by using connected devices. Watch this animation
for a sneak peek into the future of healthcare. To open, scan the given QR code or open the
link https://youtu.be/W0li-PI6yWo in your web browser.
Travel
Data is also altering the traditional ways of travel and vacation planning. A large number of
online travel and tourism companies like MakeMyTrip, Goibibo, Airbnb and Trivago are using
data and technology to help people choose flights, reserve hotels and book trains at the best
prices, all from the comfort of their homes!
Data is also used in online maps or GPS applications like Garmin, MapmyIndia and Google
Maps for traffic management which helps to save fuel and manage commute time effectively.
Jargon Alert
GPS
GPS stands for Global Positioning System. It is a satellite navigation system which helps
in pinpointing the location of a person, vehicle or any object on the earth in real time.
Experience AI
Have you ever wondered how Google Maps can use data to inform about traffic conditions
on a route in advance? To learn about the traffic-prediction service of Google, scan the
given QR code or open the link https://www.youtube.com/watch?v=RQJ3HmVtN4w in your
web browser.
Education
Teachers and parents always help students to achieve their goals in the field of education.
With the use of technology in education, teachers can keep a record of students’ interests,
performance, subject-wise scores, attendance, areas of improvement and other important
factors.
Experience AI
Scan the given QR code or open the link https://www.youtube.com/watch?v=cgrfiPvwDBw
in your web browser to view a small animation on how technology and data are helping
teachers to enhance the study experience of their students.
Data use in our everyday life is increasing rapidly and is not restricted to the above examples.
Data finds its applications in endless activities affecting human life, including daily news,
scientific experiments, banking, self-driving cars, robotics, astronomy, industrial automation
and many more.
• Achieving a Data Mindset: Cultivating an understanding of and belief in data’s ubiquity
and utility
• Data-Enabled Questioning: Formulating precise, objective-driven questions to guide data
activities
• Describing Data: Interpreting, summarizing and articulating the meaning of data
• Data Collection and Acquisition: Identifying, sourcing, and compiling relevant, high-quality
data using rigorous methods
• Data Cleaning and Preprocessing: Rectifying issues and transforming data to make it
analysis-ready
• Data Analysis and Interpretation: Applying appropriate techniques to extract meaningful
patterns, trends and insights from data
• Data Visualization and Communication: Conveying data-driven insights effectively for
decision-making through visualization, storytelling and strategic planning
• Critical Thinking & Problem-Solving: Critically evaluating data representations, conclusions
and arguments to ensure robust interpretation
• Ethical and Responsible Data Use: Incorporating reliable, ethically sourced data into choices
using structured frameworks for optimal, socially responsible outcomes
Jargon Alert
• Data Analysis: The process of thoroughly studying data to find patterns or answers.
• Graphs and Charts: These are visual ways to represent data, e.g., bar graphs or
pie charts.
• Algorithm: It is a step-by-step procedure that computers use to solve problems.
Remember
Data literacy is not just for data scientists! Anyone can sharpen their data literacy skills by following two
basic practices:
• Ask Questions: Don’t be afraid to ask ‘dumb’ questions. A good understanding always starts with
curiosity. Follow this up with data supporting the answers.
• Practice Makes Perfect: Look for data in your everyday life—news articles, weather reports, sports
statistics, etc. Try to interpret the data and see what story it tells.
Case Study
CASE I: STARBUCKS
Starbucks, one of the most popular chains of coffee stores, has utilized data
literacy to optimize its operations and make customer experiences better.
Equipped with data literacy, Starbucks employees analyze data from point-of-sale
systems, loyalty programs and customer feedback. By incorporating data literacy,
Starbucks has been able to make the following data-driven decisions to benefit the organization:
• It identified customer preferences for different drink sizes.
• It adjusted its ordering and inventory management accordingly.
• It reduced wastage and improved efficiency.
• Additionally, it used customer data to personalize promotions and rewards, leading to
increased customer loyalty and sales.
This initiative has enabled Mondelez to make more
informed decisions about product development,
supply chain management and marketing campaigns.
For example, by analyzing consumer preferences
and social media data, it was able to develop successful product innovations and customize
marketing strategies for different regions.
• Business Continuity: Businesses also used data to navigate uncertainty. By understanding
data on consumer behaviour and economic trends, they could adapt their strategies
accordingly like shifting to online operations or implementing safety protocols for
in-person work.
One of the most popular dashboards during COVID-19 was watched
by people across all countries to follow the trends about the
pandemic. To view the same, scan the given QR code or open the link
https://www.worldometers.info/coronavirus/ in your web browser.
• Who is the author of the news?
• What is the web link of the news?
• What is the description available in the source of the news?
• What are the facts and figures mentioned in the source?
Prepare one document where all student pairs can share their findings and rate the sources
cited in the news items from 1 (least reliable) to 10 (most reliable), providing a valid reason
for the rating. Name the 5 most reliable news sources and 5 least reliable and share your
views on them.
Becoming data literate involves developing specific skills and knowledge to understand,
analyze and use data effectively.
The step-by-step guide to data literacy is explained in detail below:
Step 1: Understand the basics of data: Start by learning what data is and how it is used. Data
refers to raw pieces of information (numbers, words or images) that can be analyzed
to gain insights. For example, a weather forecast is based on the data collected from
various sources.
‘Big data’ refers to large volumes of data that are too complex to process using traditional
methods.
Step 2: Learn data analysis techniques: Explore methods to analyze data such as creating
charts, graphs and tables. This helps in visualizing and understanding the patterns
and trends within the data.
Actions: Learn Basic Statistics
• Mean, Median, Mode: These are the cornerstones of basic statistics. They
help you understand the central tendency of a dataset, explaining where most of
the data points lie.
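The three measures can be calculated with Python's built-in statistics module. The following is a small sketch; the list of test scores is made up for illustration.

```python
from statistics import mean, median, mode

# Made-up test scores of five students (illustrative data only)
scores = [72, 85, 85, 90, 68]

average_score = mean(scores)     # sum of all values divided by their count
middle_score = median(scores)    # middle value once the scores are sorted
most_common_score = mode(scores) # value that appears most often

print(average_score)      # 80
print(middle_score)       # 85
print(most_common_score)  # 85
```

Here the mean (80) is a little lower than the median (85) because the lowest score, 68, pulls the average down.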
Step 3: Gain hands-on experience: Practise working with data. Start with simple datasets and
use tools like Microsoft Excel or Google Sheets to organize, analyze and visualize data.
Our brain processes visuals much faster than text. A joint study on presentations by 3M and the
University of Minnesota concluded that presentations using visual aids were 43% more
persuasive than unaided presentations. Charts and graphs can make complex data digestible.
FACT CHECK
BE A DATA DETECTIVE
The 3M study also claimed some other statistics as illustrated below. Search the internet and
find out which of the claims still remain unverified and may be false or misleading.
• 50% of your brain is active in visual processing alone
• 90% of information transmitted to the brain is visual
• 40% of people respond better to visuals
Step 4: Develop critical thinking skills: Always question and evaluate data sources. Not all
data is reliable or accurate, so it is important to assess the credibility and relevance of
the information. Let us take an example to understand this. Consider researching your
favourite IPL team online. You will find different websites with conflicting information.
How would you decide which source(s) to trust?
Jargon Alert
BIAS
Bias means data might be misrepresented to favour a particular viewpoint.
Step 5: Stay Curious and Keep Learning: Data literacy is an ever-evolving skill. Stay updated
with new technologies, data visualization tools and trends in data analysis. Take online
courses, read books and explore real-world applications of data in different fields.
Interesting fact
According to an IBM estimate, poor data quality costs the US economy about $3.1 trillion per year.
• 33% of business leaders don’t trust the information they use to make business
decisions (IBM)
• $3.1T is the estimated amount of money that poor data quality costs the US economy
per year
• $15M of organizational losses per year are due to poor data quality (Gartner, 2018)
• ~20% of annual revenue is spent correcting data errors and dealing with business
problems caused by bad data (MIT, 2017)
• >40% of analysis time is spent vetting and validating analytics data before it can be used
for strategic decision-making (Forrester, 2018)
Experience AI
Watch this informative YouTube video to learn what it means to be data literate and how
we can start looking at information and the world a little differently. To play, scan the given
QR code or open the link https://www.youtube.com/watch?v=yhO_t-c3yJY in your web browser.
It is important to remember that data literacy is a journey. The more you practise these steps,
the more comfortable you will become with the skills and knowledge required for understanding
data. So, start exploring, ask questions and use the power of data for an exciting career.
Examples:
• Data Security: Using strong passwords, encrypting data and having firewalls in place are
all examples of data security measures.
• Data Privacy: Choosing what information to share on social media platforms or deciding
whether to allow apps to access your location are some examples of data privacy choices.
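One of the data security measures above, using strong passwords, can be illustrated with Python's built-in secrets module. This is only a sketch: the function name generate_password, the default length of 16 and the rule of mixing four kinds of characters are assumptions made for this example.

```python
import secrets
import string

def generate_password(length=16):
    """Create a random password with lowercase, uppercase, digits and symbols."""
    if length < 8:
        raise ValueError("Use at least 8 characters for a strong password")
    alphabet = string.ascii_letters + string.digits + string.punctuation
    while True:
        candidate = "".join(secrets.choice(alphabet) for _ in range(length))
        # Accept the password only if it mixes all four kinds of characters
        if (any(c.islower() for c in candidate)
                and any(c.isupper() for c in candidate)
                and any(c.isdigit() for c in candidate)
                and any(c in string.punctuation for c in candidate)):
            return candidate

password = generate_password()
print(len(password))  # 16
```

The secrets module is meant for security-related randomness, which makes it a better choice here than the ordinary random module.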
Experience AI
DATA MISUSE AND HOW TO AVOID IT
Watch an informative YouTube video on what data misuse is and how we can avoid
the misuse of our personal data. To play, scan the given QR code or open the link
https://youtu.be/ixzgRHfEkoY in your web browser.
Case Study
WhatsApp
WhatsApp updated its privacy policy in 2016, allowing it to share user data with Facebook and
other companies. Concerns arose and legal cases were filed in Indian courts.
• What happened: The Delhi High Court ruled that WhatsApp’s policy of sharing user data
with Facebook violated privacy rights.
• Insights: This case showcased the importance of user consent and transparency in data
collection practices. It also raised concerns about data transfer across borders.
• Outcome: WhatsApp faced pressure to modify its policies and provide users with more
control over their data.
The focus ultimately shifted to the ongoing debate on national data protection laws in India.
Understanding how to keep your data secure (protected from unauthorized access) and private
(controlled by you) is very important in this digital age. India is a rapidly growing digital nation,
and data security and privacy are an important focus area in this rapid digitalization.
Case Study
DigiLocker
DigiLocker is an Indian government initiative
that aims to make your life easier by providing
a secure digital document wallet. It acts as a
safe deposit box for your important documents
in the cloud.
Working of DigiLocker:
• Stores digital documents: You can upload scanned copies of your government-issued
documents like driving licence, PAN card and educational certificates into your DigiLocker.
• Issued by trusted sources: Government agencies and educational institutions can also
issue documents directly to your DigiLocker. This ensures authenticity and eliminates the
need for physical copies.
• Easy access: You can access your DigiLocker from anywhere and anytime using your
smartphone or computer. No more worries about lost or damaged documents.
• Sharing made convenient: You can easily share your documents electronically with
authorized entities like banks or employers.
Benefits of DigiLocker:
• Reduced paperwork: It eliminates the hassle of carrying physical documents everywhere.
• Increased security: Documents are stored securely in the government’s cloud structure.
• Convenience: Access and share your documents anytime, anywhere.
• Environment-friendly alternative: It saves paper and reduces the need for physical copies.
Do it Yourself
Research about DigiLocker online and make a presentation on its security and privacy
features.
Play Interland by Google to learn the safe practices of cyberworld. It is an interactive, fun game
which teaches you the safety precautions and the risks associated with cyberworld. To play,
scan the given QR code or open the link https://beinternetawesome.withgoogle.com/en_us/interland
in your web browser.
Jargon Alert
DATA PROTECTION LAWS
Data Protection refers to the set of privacy laws, policies and procedures that aim to minimize
intrusion into one’s privacy caused by the collection, storage and dissemination of personal
data. 137 out of 194 countries in the world have already put in place legislation to secure the
protection of data and privacy.
Benefits of DPDPA:
• Increased data privacy for individuals
• More transparency from organizations handling personal data
• Stronger data security measures
• Empowered users with more control over their information
Do it Yourself
Familiarize yourself with your rights under the DPDPA. You can access the official document
on the MeitY (Ministry of Electronics and Information Technology) website by opening the
link https://www.meity.gov.in/content/digital-personal-data-protection-act-2023 in your
web browser.
Case Study
THE IMPORTANCE OF FAIR TRAINING DATA
In 2018, Amazon scrapped an AI hiring tool because it discriminated against female
candidates. The tool learned from past hiring data which reflected gender bias. This case
highlights the importance of secure and unbiased data in AI development.
FOOD FOR THOUGHT—AI AND OUR DATA
With AI systems becoming more and more common, people are increasingly interested in understanding
how they work. This is important because we want to make sure these systems are fair and don’t misuse
our private information.
The future of AI depends on finding a good balance. We want to keep creating new and exciting
technologies but we also need to make sure our personal information is protected. By working together,
we can ensure that AI benefits everyone in a safe and responsible way.
Jargon Alert
BIOMETRICS
The physical characteristics of a person which can uniquely identify them are called
biometrics. Some popular biometric characteristics are fingerprints, eye scans and facial features.
If you have registered for an Aadhaar card with Unique Identification Authority of India (UIDAI),
your biometrics have been recorded and linked to your Aadhaar number to uniquely identify you.
Biometrics are also used for attendance management in offices and schools these days.
13. Talk it Out: If you encounter something suspicious online or if you are unsure about
something, talk to a trusted adult, teacher or parent. They can help you to handle the
situation and keep you safe.
FUN TIME
Learn how to protect yourself online. Follow the rules shared in the YouTube video to avoid
becoming a victim of phishing, scams, ransomware and cyber threats. To play, scan the given
QR code or open the link https://www.youtube.com/watch?v=aO858HyFbKI in your web browser.
You can also refer to the CBSE guidelines on cyber safety by accessing
the link https://www.cbse.gov.in/cbsenew/documents/Cyber%20Safety.pdf
Where does data come from
Data is all around us. The address of your home, the name of your school, the number of
students in your class and the day, date or time that we observe—they are all data. Data
doesn’t always record numbers or words; it may represent the temperature outside, the sound
of music or visuals like pictures and maps.
Quick Question: How much data did you generate today?
In a survey of Class 9 students in a school, the majority of the students believed that since
they did not create any document, presentation or spreadsheet that day, they did not
generate any data. Interestingly, the students later realized they were wrong. Let us find
out how.
Directions: Read each question carefully and answer YES or NO.
1. Do you swipe up on reels?
2. Do you chat with your friends online?
3. Do you use a library card to borrow books?
4. Do you wave your hand to open the automatic doors at the school or grocery store?
5. Do you wear a fitness tracker that counts your steps?
10. Does your school use a system where you tap your ID card or scan your fingerprint to
enter the school building?
(Bonus: Think about what data might be collected in each of the above activities.)
If the answer to any of the above questions is YES, then you are generating data. The level of
data integration in our daily lives is seamless and continually growing. Almost everything we
do generates data or involves the use of data-driven technologies. Data is important across all
aspects of human life.
We already know that data might not be useful by itself. However, when organized and given
some context, it gets converted into useful information. The data which has been converted
into information can help us make decisions.
For example, information can help us answer the following questions:
• Which is the best-selling laptop under INR 40,000?
• Which toy store has the best collection of board games?
• Which country has the coldest climate?
• Who is the best batsman in the Indian cricket team?
This indicates that before we seek information, we always need to ask some questions.
Jargon Alert: RELEVANT INFORMATION
Relevant Information refers to specific data that can be applied to solve a problem.
Before we start using data to make decisions, we usually need to answer a few more questions:
• With so much data present around us, how do we decide on which data to use?
• Which data is relevant?
• What are the different types of data available?
• What are the reliable sources from which relevant data can be acquired?
Before finding answers to these questions, first we must understand how data is stored and
represented as variables in different domains.
• In Programming, a variable is a storage location which has been given a unique name
to identify it to the program code. For example, a programming statement like x = 9
represents a storage location in a computer—its unique name is x and we have stored a
value of 9 in this storage location.
Consider two programming statements in the same program as given below:
x = 9
# The value stored in the variable x is 9
x = 22
# The value stored in variable x is now 22
From the above example, you may observe that the storage location remained the same
but the value stored was changed.
• In Statistics, a variable represents a characteristic which can be recorded, measured
or counted and can have different values. For example, the characteristics of a person—
height, weight, age, eye or hair color—can be called statistical variables because their
values vary from person to person and may even change with time. Similarly, for an
employee, the variables can be designation and salary, both of which can be different for
different people and can also change with time.
In the above meanings of variable, one thing is common. They store observations which may
change. So, let us make a general definition of variable:
A variable is something that can be used to store or record some observations which can
change with time or from one observation to another.
In contrast, a data item, observation or quantity that can assume only one value is called
a constant. For example, the value of Pi (π ≈ 3.14) is considered a constant, the number of
hours in a day (24) is constant and the acceleration due to gravity is also considered a
constant with g ≈ 9.8 m/s².
Depending on the type of values being stored in them, variables can be classified into two
different categories. Let us understand the categories of variables.
Numerical Variables
Numerical variables are used to store measurable values and can be represented by numbers.
• Numerical variables allow measurement.
• We use numbers to represent numerical variables.
• Basic mathematical operations are applicable to numerical variables.
Categorical Variables
Categorical variables are used to classify data into different categories or types.
• Categorical variables do not allow measurement.
• All Yes/No answer types also require categorical variables to store data. For example, the
answer to ‘Do you own a pet?’ can be classified as ‘Yes’ or ‘No’.
• Categorical variables can be represented with both numbers and characters but
mathematical operations are not meaningful when numbers are used only as labels.
For example, in India, PIN or ‘Postal Index Number’ code uniquely identifies an area.
These PIN codes classify different areas with different numbers. However, we cannot
add two PIN codes together or say one PIN code is bigger than the other. Similarly, the
phone number of a person is stored as a categorical variable.
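The difference between the two categories can be sketched in a few lines of Python. This is only an illustration: the record, its field names and the helper is_numerical are made up for this example, and the PIN codes are used purely as sample labels.

```python
# Hypothetical record for one student; all names and values are made up.
student = {
    "height_cm": 152.5,      # numerical: measured, arithmetic makes sense
    "marks": 87,             # numerical: counted, arithmetic makes sense
    "pin_code": "110002",    # categorical: a label, stored as text
    "owns_pet": "Yes",       # categorical: a Yes/No answer
}

def is_numerical(value):
    """Treat int and float values (but not True/False) as numerical."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)

numerical = {k: v for k, v in student.items() if is_numerical(v)}
categorical = {k: v for k, v in student.items() if not is_numerical(v)}

print(sorted(numerical))     # names of the numerical variables
print(sorted(categorical))   # names of the categorical variables

# Arithmetic is meaningful for numerical variables ...
average = (student["marks"] + 93) / 2
# ... but "adding" two PIN codes only glues two labels together.
glued = student["pin_code"] + "110001"
print(average, glued)
```

Storing the PIN code as text is a deliberate choice here: it prevents the program from accidentally adding or averaging values that are only labels.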
TYPES OF DATA
Any variable is meaningless unless it is associated with some data. In fact, it is the data which
decides what kind of variable shall be used to store it. Data may be classified based on the
following criteria:
• Categorization by Data Property
• Categorization by Organization
• Categorization by Applications
Quantitative Data
As the name suggests, quantitative data is associated with the word ‘quantity’. Quantitative
data is used to answer research questions like ‘How many?’, ‘How much?’ or ‘Which is more?’.
Definition: All data values that can be measured, counted and compared to each other (less
or more) on the basis of quantity are called Quantitative Data. Some common examples of
quantitative data are listed below:
• Data which describes a person’s age, height or weight
• Data which describes a book’s number of pages, price or weight
• Data which describes a car’s maximum speed, mileage, weight, etc.
In other words, data which contains a numerical variable is always quantitative data.
Quantitative data can be further categorized as follows:
1. Discrete Data: Discrete data consists of distinct, separate values that are whole numbers
and cannot be subdivided further. This type of data represents countable quantities or
categories with clear boundaries between values. Examples of discrete data include:
• Number of students in a classroom
• Number of items sold in a store
• Number of cars in a parking lot
• Number of siblings in a family
2. Continuous Data: Continuous data represents measurements that can take any value
within a range and can be infinitely subdivided into smaller increments. In continuous
data, values can vary continuously and may include decimal or fractional values.
Examples of continuous data include:
• Height and weight of individuals
• Temperature readings (25.3°C, 27.8°C, 30.1°C, etc.)
• Time taken to complete a task (3.45 minutes, 4.72 minutes, etc.)
• Blood pressure readings
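One way to see the difference is to ask whether another value can lie between two neighbouring values. The short Python sketch below uses two temperature readings from the examples above; the student counts are hypothetical.

```python
# Continuous data: between any two readings there is always another possible value.
low, high = 25.3, 27.8            # temperature readings in °C from the examples above
between = (low + high) / 2        # a valid temperature that lies between the two
print(low < between < high)       # True

# Discrete data: counts move in whole steps, so nothing lies between neighbours.
class_a, class_b = 30, 31         # hypothetical numbers of students in two classes
counts_between = list(range(class_a + 1, class_b))
print(counts_between)             # [] because no whole number lies between 30 and 31
```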
Qualitative Data
Qualitative data is derived from the word ‘quality’. Qualitative data is used to answer
questions like ‘What type?’ or ‘What kind?’.
Definition: All data values that can depict the quality, properties or the distinguishing
characteristics to be classified into one out of two or more categories are called
Qualitative Data.
Some common examples of qualitative data are listed below:
• Data which describes a person’s eye color, hair color, name, etc.
• Data which describes a book’s author, ISBN number, genre, publisher, etc.
• Data which describes a car’s color, fuel type (CNG, petrol, diesel) or body type
(hatchback, SUV, convertible)
In other words, data which contains a categorical variable is always qualitative data.
Object/Person being observed | Numerical Variable | Quantitative Data (Data Collected) | Categorical Variable | Qualitative Data (Data Collected)
Boys                         | Age                | 20                                 | Eye Color            | Brown
                             | Height             | 5.7                                | Hair Color           | Black
                             | Weight             | 65.9                               | Name                 | Aman
Case Study
EXAMPLE OF QUANTITATIVE AND QUALITATIVE DATA
Akash is a delivery boy who works for a gift company called ‘Gift-Shift’. He drives his red-colored
scooter which gives a mileage of 60 kmpl (kilometres per litre) and makes 5 trips a day to deliver
the gifts. He can carry at most three large boxes or several small boxes on each trip. He also
drives safely at a speed of 50 km/hr. Akash is a moderately built guy, weighing 75 kg and has
the same brown hair as the spots on his pet dog, a Beagle. The illustration below separates the
quantitative data from the qualitative data.
[Figure: quantitative data answers ‘How many?’, ‘How much?’ and ‘How often?’ (e.g., speed of vehicle); qualitative data answers ‘What type?’ and ‘Which category?’ (e.g., designation)]
Experience AI
Learn more about qualitative vs quantitative data in healthcare by scanning the given
QR code or by opening the link https://www.youtube.com/watch?v=4iws9XCyTEk in your
web browser.
Categorization by Organization
Data may also be categorized based on its organization. Depending upon how the data has
been organized, we may have the following three categories.
[Figure: types of sensors used to collect data: accelerometer, IR, gas, temperature, chemical, motion detector, proximity and smoke sensors]
Data Collection Methods: Once you have your sources, you need to collect the data. This could
involve web scraping, using sensors, conducting surveys or purchasing datasets.
Do it Yourself
Search online for weather-related datasets and compile the following information about them:
1. Web source/link to the dataset
2. Information stored in the dataset
3. Size of the dataset
4. Whether the dataset is free or paid
[Figure: bar chart of active users (in millions, axis from 0 to 3,000) of social media platforms, including Telegram, Snapchat, Douyin, Kuaishou, X (Twitter), Sina Weibo, QQ and Pinterest. Source: www.backlinko.com]
Let us further explore primary and secondary data sources in the section ahead.
Primary Data
The data which is collected first-hand by researchers, i.e., directly from a data source, is called
Primary Data.
Advantages of Primary Data
1. Primary data is generally considered the most trustworthy data because it is collected
directly from the source.
2. It can be customized to collect only the desired data as per the research question.
3. Primary data is up-to-date, based on real-time data collection.
4. The researcher has complete control over the data collection process.
Surveys
A survey may be conducted among people, but the method applies to any source from which
data can be obtained.
For example, if you want to answer the research question ‘What are the career preferences
of classes 11 and 12 students of your school?’, the choice of criteria is classes 11 and 12 and
the sources of data are all the students studying in classes 11 and 12.
On the other hand, if your research question is ‘Which brands of private cars are the most
polluting?’, the choice of criteria is private cars and the source of data is the pollution check
conducted on all private cars for data collection.
Surveys can contain a variety of questions, based on the research problem, which the sources
may respond to and the responses are recorded to answer the research question.
Surveys are suitable for small groups and are also very effective for data collection amongst
very large groups. Surveys can be conducted in person (for small groups), telephonically or even
online using WhatsApp, email and other social media platforms.
If you could install only one application on your mobile phone, what would it be?
The responses to the poll are shown below, where 68% selected LinkedIn, 8% chose
Facebook, 19% went with Instagram and 6% opted for Other.
[Figure: chart of the poll responses]
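Poll results like these are obtained by converting raw vote counts into percentages. The Python sketch below shows the calculation. The vote counts are hypothetical and were chosen only so that the rounded percentages match the poll above; the text itself reports percentages, not counts.

```python
# Hypothetical vote counts chosen for illustration; the poll reports only percentages.
votes = {"LinkedIn": 676, "Facebook": 76, "Instagram": 188, "Other": 60}

total = sum(votes.values())
percentages = {app: round(100 * count / total) for app, count in votes.items()}

for app, pct in percentages.items():
    print(f"{app}: {pct}%")

# Each share is rounded separately, so the rounded values
# need not add up to exactly 100.
rounded_total = sum(percentages.values())
print("Sum of rounded percentages:", rounded_total)
```

This also suggests one possible reason why the four percentages reported in the poll add up to 101 rather than 100: each value may have been rounded to the nearest whole number.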
Interviews
Interviews are designed to ask respondents questions in a face-to-face environment.
The technique is usually applied when the number of respondents is very limited. Earlier,
interviews were conducted telephonically if a personal meeting was not feasible but
nowadays, online videoconferencing on platforms like Google Meet, Skype and Microsoft
Teams has made interviews a much better experience with live video transmission.
Questionnaires
Questionnaires usually contain a set of questions which may be structured to gather
information for use in a survey.
Feedback forms
Feedback forms are a form of survey containing a variety of questions but with a special use.
Feedback forms are utilized to gather data regarding the satisfaction level of a group after
an event or after the consumption of a product. This technique is different from a survey
in the sense that surveys can be conducted for measuring the expectations while feedback
forms are designed to check if the expectations are met and to seek suggestions for further
improvement.
There are several other techniques for collecting primary data and you can easily identify
them if they fit the basic definition.
Secondary Data
Data which is not collected first-hand by the researcher but is used for answering research
questions is called secondary data. Usually, secondary data refers to data collected in the
past by someone else and made available by sharing. It can also be said that the data that one
considers secondary was, at some point, primary data for someone.
A common example of secondary data is the data collected from various online sources. Some
other sources of secondary data include, but are not limited to, the following:
• Government reports
• Internal records of an organization
• Reports by public research organizations
• Internet websites
• Libraries
• Journals, newspapers and magazines
Do it Yourself
Classify the following data collection methods as primary or secondary.
1. You have been asked to collect food preferences of all the students of your class for
a school picnic. You decide to create a poll and record the food preferences of your
classmates. Which data collection method is this?
Case Study
COLLECTING DATA FROM WEBSITES
Problem Statement: Rajesh has been tasked with
collecting product reviews from websites. What are
the best practices for this data collection?
The process of collecting data from websites using
automated programming methods is called Web
Scraping.
1. Define Clear Objectives:
• Goal: What details do you want to gain from the reviews (e.g., sentiment analysis,
understanding customer preferences)?
• Focus: Specify the type of products, brands or websites that you would target for
reviews.
• Metrics: Determine how you would measure success (e.g., number of reviews collected,
accuracy of sentiment analysis).
2. Data Quality is Key: Choose established and reliable websites known for genuine user
reviews.
• Data Cleaning: Address typos, grammatical errors and inconsistencies in the data to
improve accuracy.
3. Maintain Data Privacy and Security: Avoid collecting personally identifiable information
(PII) unless explicitly authorized.
• Website Terms: Is it legal? Do you need permission? Adhere to the terms and conditions
of the websites that you are scraping data from.
• Secure Storage: Store collected data securely to prevent unauthorized access or
breaches.
4. Data Diversity is Essential: Collect reviews from various websites, review aggregators
and forums to capture diverse perspectives.
• Positive & Negative Reviews: Don’t just focus on positive reviews. Capture a balanced
mix of positive and negative feedback.
• Time Period: Include reviews from different timeframes to account for product updates,
changing customer trends, etc.
5. Consider Ethics and Privacy:
• Data Regulations: Be aware of data privacy regulations as per law that may apply to
your data collection methods.
• Consent & Transparency: If collecting user data directly, obtain informed consent and
be transparent about how it will be used.
• Respectful Scraping: Avoid overloading website servers with too many scraping
requests.
6. Documentation is Essential:
• Data Source Tracking: Maintain a record of the websites that you scraped data from,
including URLs and dates.
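Some of these practices can be checked by a program before any page is downloaded. The Python sketch below is a minimal illustration, not a complete scraper: the robots.txt rules, the website addresses and the agent name review-collector are all hypothetical, and no network request is made. It uses the standard library class urllib.robotparser.RobotFileParser to check which pages a site allows, reads the requested delay between requests, and keeps a dated log of the sources.

```python
from datetime import date
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules for an imaginary review site (not a real website).
ROBOTS_TXT = """\
User-agent: *
Disallow: /account/
Crawl-delay: 5
"""

def build_rules(robots_text):
    """Parse robots.txt text into a rule checker without touching the network."""
    rules = RobotFileParser()
    rules.parse(robots_text.splitlines())
    return rules

def plan_scrape(urls, rules, agent="review-collector"):
    """Log which pages may be fetched, with the date, for documentation."""
    log = []
    for url in urls:
        log.append({
            "url": url,
            "allowed": rules.can_fetch(agent, url),
            "checked_on": date.today().isoformat(),
        })
    return log

rules = build_rules(ROBOTS_TXT)
pages = [
    "https://reviews.example.com/products/laptop-123",
    "https://reviews.example.com/account/settings",
]
scrape_log = plan_scrape(pages, rules)

# Respectful scraping: wait between requests (fall back to 1 second if no delay is given).
delay_seconds = rules.crawl_delay("review-collector") or 1

for entry in scrape_log:
    print(entry["url"], "->", "fetch" if entry["allowed"] else "skip")
print("Wait", delay_seconds, "seconds between requests")
```

A robots.txt check does not replace reading the terms and conditions of a website; it is only one of the signals a careful data collector looks at.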
Features of Data
Data features refer to individual measurable properties or characteristics of data
that are used as inputs in Artificial Intelligence systems. Features represent specific
attributes or variables that provide information about the data points being analyzed.
Data features are used by AI or machine learning models to make predictions or
classifications.
[Figure: independent variables (square footage, number of bedrooms, location, etc.) and the dependent variable (house price) are given to an AI system, which learns the relationship between them and produces a model]
For example, independent variables—square footage, number of bedrooms, location, age of
the house—allow the AI system to predict the house price, which becomes a dependent or
target variable.
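The idea of learning a relationship between an independent variable and a dependent variable can be sketched with a very small example. The Python code below is a simplified illustration that uses only one feature (square footage) and a straight-line fit by least squares; real AI systems use many features and more advanced methods. All data values and the function names fit_line and predict are made up for this sketch.

```python
# Hypothetical training data: one independent variable and the dependent variable.
square_feet = [500, 1000, 1500, 2000]      # independent variable (feature)
price_lakh = [25.0, 50.0, 75.0, 100.0]     # dependent variable, in lakh INR (made up)

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for a single feature."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Use the learned line to estimate the target for a new feature value."""
    slope, intercept = model
    return slope * x + intercept

model = fit_line(square_feet, price_lakh)   # the "model" is just (slope, intercept)
estimate = predict(model, 1200)
print(f"Estimated price for 1200 sq ft: {estimate:.1f} lakh")
```

Here the model is nothing more than two numbers learned from the data. The quality of its predictions depends entirely on the quality of the features it was given.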
Data Preprocessing
The quality and relevance of these features directly impact the ‘goodness’ of the AI system.
To improve the quality of the collected data, we use a method called data preprocessing. The
process of preparing data for analysis by cleaning, transforming and organizing it so that the
AI model can learn from it efficiently is called data preprocessing. Let us understand some
common data preprocessing techniques with examples.
Jargon Alert: IMPUTATION
Imputation refers to replacing the missing values with a statistical measure (e.g., mean, median,
mode) of the feature.
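Imputation can be sketched with Python's built-in statistics module. The feature values below are hypothetical, with None marking the missing entries, and the helper impute is written only for this illustration.

```python
from statistics import mean, median, mode

# Hypothetical feature column: ages of houses in years; None marks a missing value.
house_age = [5, 12, None, 8, 12, None, 3]

def impute(values, strategy=mean):
    """Replace None entries with a statistic computed from the known values."""
    known = [v for v in values if v is not None]
    fill = strategy(known)
    return [fill if v is None else v for v in values]

print(impute(house_age, mean))     # missing values replaced by the mean (8)
print(impute(house_age, median))   # missing values replaced by the median (8)
print(impute(house_age, mode))     # missing values replaced by the mode (12)
```

Which statistic to use is a judgment call: the mean is sensitive to extreme values, the median is less so, and the mode suits features where one value occurs much more often than the others.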
Data Usability
Data usability in ML simply means how well-suited the data is for training a Machine
Learning model. Usable data has the following characteristics:
• Clean and organized: This means the data is formatted properly and is free of errors.
• Accessible: Data is stored in a way that the ML model can readily access it.
• Understandable: The data is clear and documented so the model can learn from it
effectively.
Good data usability helps ML models train faster and make better predictions. Different
datasets may differ in usability, and some dataset platforms rate it with a numeric score, for
example a value between 0 and 10.
Data Processing and Data Interpretation
Now that you understand data features and preprocessing, let us explore the next important
steps: data processing and data interpretation.
Data Processing takes the prepared (pre-processed) data as input and processes it according
to your analysis goals. Data processing helps us to convert our data into meaningful information,
which is suitable for analysis. Here are some common data processing techniques:
• Aggregation: It refers to combining data from various sources and summarizing it for
better decision-making. For example, calculating the average house price per PIN code
in your housing data.
• Sorting: It means arranging data points in a specific order. For example, sorting house
prices from lowest to highest to see the cheapest and costliest houses at a glance.
• Filtering: It refers to selecting a subset of data based on a chosen criterion. For example, in
the dataset, we may focus only on houses built in the last 5 years and ignore all other
house-related data.
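The three techniques above can be sketched in plain Python. The housing records below are hypothetical (PIN codes, prices and years are made up), and the function names are chosen only for this illustration.

```python
from collections import defaultdict

# Hypothetical housing records; prices are in lakh INR.
houses = [
    {"pin": "110002", "price": 80, "built": 2021},
    {"pin": "110002", "price": 60, "built": 2010},
    {"pin": "560001", "price": 90, "built": 2023},
    {"pin": "560001", "price": 70, "built": 2019},
]

def average_price_by_pin(records):
    """Aggregation: summarize many prices into one average per PIN code."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["pin"]].append(record["price"])
    return {pin: sum(prices) / len(prices) for pin, prices in grouped.items()}

def sort_by_price(records):
    """Sorting: arrange the records from lowest to highest price."""
    return sorted(records, key=lambda record: record["price"])

def built_since(records, year):
    """Filtering: keep only the houses built in or after the given year."""
    return [record for record in records if record["built"] >= year]

print(average_price_by_pin(houses))
print([record["price"] for record in sort_by_price(houses)])
print([record["price"] for record in built_since(houses, 2020)])
```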
Data Interpretation is the process of using processed data to extract meaning, identify patterns
and draw conclusions. It helps us to make sense of the information that we have obtained from
the data. Data interpretation also helps to answer questions in our problem statement.
Here is data interpretation in action with an example:
• Scenario: You have processed your house-price data.
• Interpretation: By analyzing trends, you may find that houses with more bedrooms or
with garage parking tend to be more expensive.
Data processing and interpretation are iterative processes. You may need to go back
and forth, refining your processing techniques based on your initial interpretations or
vice versa.
Here are some additional points to remember:
• The choice of data processing and interpretation techniques depends on the type of data
and analysis goals.
• Data visualization tools like charts and graphs can be powerful aids in interpreting data
and communicating insights to others.
• Always be cautious of drawing biased conclusions based on limited data or faulty
processing methods.
Qualitative Data Interpretation: It is easy to gather and interpret quantitative data, which
deals with numbers and measurements. But what about the human element? How do we
understand people’s experiences and motivations? Qualitative data can capture emotions,
feelings and experiences that people have.
The interpretation of qualitative data helps the analysts to:
• Focus on Emotions and Feelings: Qualitative data helps us understand human
experiences. With qualitative data interpretation, we can gain insights into people’s
hopes, fears and emotions.
• Understand Motivations: By interpreting qualitative data, we can try to understand the
‘why’ behind people’s actions. Why did someone choose a particular product? Why did
a group of students react a certain way to a learning activity? Qualitative data helps us
discover the motivations and thought processes that drive human behaviour.
COMPARING QUANTITATIVE AND QUALITATIVE DATA INTERPRETATION
Your school has launched a new app for students to learn the Japanese language. Suppose you are
collecting, processing and interpreting data about it.
Quantitative data may tell you how many students download and use the new app, but qualitative data,
gathered through interviews with students, may reveal their feelings about the look and feel of the app,
the course content, and even their suggestions for improvement.
Do it Yourself
What is the type of interpretation associated with each of the following images? Give reasons.
On choosing to view the trend from 2004 onwards, we get the following graph. It shows that
in India, people’s interest in K-Pop started growing from August 2020 onwards but witnessed a
decline after August 2022.
On the ‘Home’ tab, click on ‘Year in Search 2023’ and prepare a list of the top 5 news events and top
5 sports events in the year 2023. You can also select any previous year and repeat the search.
To summarize, both qualitative and quantitative data interpretation methods are compared in
the table below:
Feature     | Qualitative Data Interpretation | Quantitative Data Interpretation
Focus       | Emotions, feelings, experiences, insights and motivations | Numbers, measurements, statistics and trends
Data Source | Interviews, focus groups, observations and documents | Surveys, experiments, existing datasets and sensors
Analysis    | Themes, narratives and sentiment analysis | Descriptive statistics (mean, median, mode) and inferential statistics (hypothesis testing)
Outcome     | Understanding the ‘why’ behind behaviour and uncovering motivations | Identifying patterns, trends, relationships and making predictions
Example     | Analyzing interview records to understand student experiences with a new teaching method | Calculating the average exam score across different class sections
Remember
Choosing the right data interpretation technique depends on the specific questions that you are trying to answer
and the type of data you have. Data interpretation may be qualitative, quantitative or a combination of both
strategies. By combining different techniques, you can get a better understanding of the story your data tells.
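Both kinds of interpretation can be sketched in Python. In the example below, the exam scores and the interview themes are hypothetical. The quantitative part computes descriptive statistics for each class section, and the qualitative part counts how often each theme was mentioned in interviews (a very simple stand-in for theme analysis).

```python
from collections import Counter
from statistics import mean, median, mode

# Quantitative: hypothetical exam scores for two class sections.
scores = {
    "Section A": [72, 85, 85, 90, 68],
    "Section B": [60, 75, 75, 88, 92],
}
summary = {
    section: {"mean": mean(vals), "median": median(vals), "mode": mode(vals)}
    for section, vals in scores.items()
}
for section, stats in summary.items():
    print(section, stats)

# Qualitative: hypothetical themes noted while reading interview records.
themes = ["engaging", "confusing", "engaging", "too fast", "engaging"]
top_theme = Counter(themes).most_common(1)
print(top_theme)
```

The numbers say how the sections performed, while the themes hint at why students felt the way they did. Combining the two gives a fuller picture than either one alone.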
TYPES OF DATA
For example, consider the following data statement and its textual interpretation:
Statement: A recent survey found that among high school students in India, 62% reported feeling
stressed ‘often’ or ‘all the time’ while only 18% said that they ‘rarely’ or ‘never’ experienced stress.
Furthermore, the survey results indicated that students with high social media usage were more likely
to report stress frequently as compared to those with lower social media usage.
Textual Interpretation: A recent survey found that over 6 out of 10 high schoolers in India often feel
stressed. High social media usage is also associated with more frequent stress, possibly because of
comparisons or too much online information/misinformation.
Do it Yourself
Consider the given tabular interpretation and answer the questions that follow.
Frequency of Stress   | % of Students | Source of Stress    | % of Students | Social Media and Stress          | % of Students              | Coping Mechanisms Used                     | % of Students
Often or All the Time | 58%           | Academic Pressure   | 45%           | High Usage (3+ hours/day)        | 70% Report Frequent Stress | Talking to Friends / Family                | 65%
Occasionally          | 22%           | Peer Pressure       | 28%           | Moderate Usage (1-2 hours/day)   | 55% Report Frequent Stress | Listening to Music / Relaxation Techniques | 42%
Rarely or Never       | 20%           | Family Expectations | 18%           | Low Usage (Less than 1 hour/day) | 40% Report Frequent Stress | Physical Activity                          | 38%
1. What percentage of high school students experience stress often or all the time?
2. How many students report experiencing stress rarely or never?
3. What is the leading source of stress for high school students?
4. How does the prevalence of stress from peer pressure and family expectations compare
to academic pressure?
5. Is there a correlation between social media usage and stress levels?
6. What is the most common coping mechanism used by students to manage stress?
3. Graphical Representation: Graphical presentation of data interpretation represents data
as graphs, charts or infographics, making complex data easier to understand.
Graphical representation is helpful for audiences who are more visually oriented and
for those who are less data literate.
Do it Yourself
PIE CHARTS
Pie charts are a graphical representation of ‘parts of a whole’. As the name suggests, these
charts are shaped like a pie. Each slice of the pie represents the portion of the entire pie allocated
to a specific category.
• Pie charts are circular charts. The circle/pie is divided into as many sections as there
are categories.
• Any category with a bigger proportion in terms of value is shown with a bigger slice of
the pie.
[Figure: pie chart; one of the slices is labelled ‘By Car’ (12%)]
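The size of each slice follows from simple arithmetic: a full circle is 360 degrees, so a category's angle is its share of the total multiplied by 360. The Python sketch below shows the calculation. Only the ‘By Car’ share of 12% comes from the pie chart in the text; the other categories and values are made up to complete the example.

```python
# Shares in percent. Only "By Car": 12 is from the pie chart in the text;
# the remaining categories and values are hypothetical.
shares = {"By Car": 12, "By Bus": 48, "By Cycle": 25, "On Foot": 15}

def slice_angles(values):
    """Convert each category's share of the total into a pie-slice angle in degrees."""
    total = sum(values.values())
    return {label: 360 * value / total for label, value in values.items()}

angles = slice_angles(shares)
for label, angle in angles.items():
    print(f"{label}: {angle:.1f} degrees")
```

Dividing by the total (instead of assuming the values add up to 100) means the same function also works for raw counts, such as the number of students in each category.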
• Graphical interpretation helps in showcasing trends, patterns and relations within the
data.
• It is also helpful in simplifying complex relationships for audiences with different
data literacy levels.
[Figure: bubble chart for three variables, plotting TV advertising, radio advertising and sales; a bigger bubble size means more sales]
• It is easy to highlight key findings and draw immediate attention to most important
data points.
[Figure: two charts: the marks scored by a student in different subjects (including Hindi, English, Maths and Science), and the temperatures of five cities (Srinagar, Secunderabad, Gandhinagar, Visakhapatnam and Sawai Madhopur)]
Can you interpret the given charts? Try answering the following questions:
• In which subject did the student score minimum marks?
• Which city has the highest temperature?
Do it Yourself
LINE GRAPHS
Understanding Line Graphs: A line graph is created by connecting various data points. Line
graphs are created to show the change in quantity over time.
Look at the line graph below and answer the questions that follow.
[Figure: line graph for two variables, with points showing the month-wise sales (0 to 10,000) of coffee and ice cream]
Remember
The best presenters cleverly choose and combine these formats. Use text to explain complex points, tables
for accurately measured data points and graphical charts to show trends. It is also helpful to choose your
interpretation based on the audience, the nature of your data and the message you want to convey. By
understanding the strengths and weaknesses of each format, you can transform your data interpretation
into a compelling story that resonates with your audience.
Data Presentation
Data presentation is a customized data story which communicates a specific message
and persuades or informs an audience. It is most suitable for one-time presentations on a
particular topic.
For example, a campaign presentation that uses data to show the progress made by India in
the last 10 years, along with the projected growth, may be called a data presentation.
A research presentation that shares findings from a study and includes charts, graphs and
data tables with detailed explanations is also an example of data presentation.
Benefits: Data presentation is useful for clearly communicating complex information
by engaging the audience with a narrative. It helps to promote a deeper understanding of
the topic.
Data Visualization using Tableau
Tableau is a powerful data visualization and business intelligence (BI) platform that helps
people see and understand data more effectively. It allows users to connect to various data
sources, explore and analyze information, and create interactive visualizations like charts,
graphs and dashboards.
The following are the two versions of Tableau:
Tableau Desktop (Paid): The full-fledged version, Tableau Desktop, offers a wide range
of functionalities. It allows you to perform complex data analysis and create sophisticated
visualizations. Students and teachers can request access to Tableau Desktop free of cost using
their school-issued email IDs.
Tableau Public (Free): Tableau Public is a free version with a limited feature set. You can
connect to common data sources like spreadsheets and cloud platforms, create basic to
moderate visualizations and publish them on the Tableau Public platform. We shall work with
Tableau Public to get familiar with Data Visualization.
Follow the step-by-step tutorial for learning the basics of Tableau:
Step 1: To begin, visit https://public.tableau.com/
Step 2: Create your account and complete the registration details.
Step 4: Select dataset—You may upload your own data or use the sample data available
on Tableau for experimenting. Click on the Learn tab, which contains simple
self-learning How-To Videos and a Sample Data tab. Download the EU Superstore Sales
dataset by clicking on the Dataset (xls) link.
Step 5: Once the dataset is available, click Create in the navigation bar to access free data
visualization tools of Tableau Public. Choose Web Authoring which will enable you
to create visualizations—viz in short—directly in the web browser. Web authoring
makes it possible to create a viz without installing any software.
Step 6: On clicking Create, Tableau seeks the dataset that you intend to visualize. Click on
Upload from computer to select the downloaded dataset.
Step 7: Drag the Orders table to the canvas.
Step 8: Below the canvas is a data grid. Click Update Now in the data grid to view the first 100
rows of the dataset.
Step 10: Drag the Returns table to the right of the Orders table on the canvas. It opens a
relationship edit window beneath the canvas. Relationships define how the tables
relate to each other. In this case, Orders and Returns are both identified by the
common field Order ID.
Step 11: Rename the Sheet as my_first_tableau and click Publish to publish the viz with
your chosen name.
Step 12: Click Sheet 1 on the bottom-left corner of the screen. This changes our interface
to show us the extracted data fields.
Step 13: Choose Your Data Fields—When Tableau connects to this dataset, it assigns the
fields to either Dimensions or Measures. The qualitative fields that describe
the categories of data are in the top part of the pane under Dimensions. The
quantitative fields that measure the categories of data are in the bottom part of the
pane under Measures.
Step 14: Drag out a quantitative field or measure to find out ‘how many’. We will use Sales.
Notice the field displayed in the Columns shelf in green in the given screenshot.
Tableau creates a long bar and an axis showing a range of values.
Step 16: Organize the data—use the Sort button in the Menu bar to sort your data in the
ascending or descending order.
Congratulations, you have created your first Tableau Visualization!
Step 17: Let us draw an alternative visualization using ‘Show Me’. On the top right corner
of the Tableau canvas, click on the ‘Show Me’ button to see the alternative visualization
options available for the data in use. It opens a list of visualizations which are applicable
and also tells what other requirements, in terms of dimensions or measures, are needed if
we select a specific visualization.
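To understand what Tableau does behind the scenes in these steps, the same ideas can be sketched in plain Python. This is not Tableau's own code or API. The rows below are hypothetical stand-ins for the Orders and Returns tables (the real dataset has many more fields and rows), and the field names Category and Returned are assumed for illustration.

```python
from collections import defaultdict

# Hypothetical rows standing in for the Orders and Returns tables.
orders = [
    {"Order ID": "A-1", "Category": "Furniture", "Sales": 250.0},
    {"Order ID": "A-2", "Category": "Technology", "Sales": 900.0},
    {"Order ID": "A-3", "Category": "Furniture", "Sales": 150.0},
    {"Order ID": "A-4", "Category": "Office Supplies", "Sales": 75.0},
]
returns = [{"Order ID": "A-3", "Returned": "Yes"}]

# Relationship: link the two tables through their common field, Order ID.
returned_ids = {row["Order ID"] for row in returns}
for row in orders:
    row["Returned"] = "Yes" if row["Order ID"] in returned_ids else "No"

# Dimensions are qualitative fields; measures are quantitative fields.
sample = orders[0]
measures = [name for name, value in sample.items() if isinstance(value, (int, float))]
dimensions = [name for name in sample if name not in measures]

# Total the Sales measure for each Category, then sort in descending order.
totals = defaultdict(float)
for row in orders:
    totals[row["Category"]] += row["Sales"]
ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)

print("Dimensions:", dimensions)
print("Measures:", measures)
for category, sales in ranked:
    print(category, sales)
```

Each printed line of the ranked list corresponds to one bar in a sorted bar chart: the dimension gives the label and the measure gives the length of the bar.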
Experience AI
This is just the beginning—you can take your learning to the next level by checking out
the YouTube video from Tableau. To do so, scan the given QR code or open the link
https://youtu.be/iT1iHLGawIM in your web browser.
Data literacy is an essential skill in the modern world, where data-driven decision-making is
increasingly important. With a good conceptual grasp of data literacy, you will be able to
interpret and analyze data effectively, leading to more informed and accurate decisions.
One must also learn to differentiate between reliable and unreliable data sources, which is
vital in an era of information overload and misinformation. Proficiency in tools like Tableau
for creating interactive and visually appealing data presentations is helpful in building
a strong professional profile in any career.
Memory Bytes
Data Literacy: The ability to read, understand, interpret and communicate with data.
Data vs Information: Data is raw facts while information is processed data.
DIKW Model: Data, Information, Knowledge and Wisdom hierarchy explains the value of data processing.
Data in Computing: Refers to data in the context of computer systems and digital formats.
Importance of Data Literacy: Helps in making informed decisions, understanding data presentations and
avoiding misinformation.
Types of Data: Quantitative (numerical) and Qualitative (descriptive)
Quantitative Data: Data that can be measured and counted, e.g., height, weight, age, etc.
Qualitative Data: Data that describes qualities or characteristics, e.g., color, name, type, etc.
Discrete Data: Quantitative data that can be counted and has distinct values like the number of students.
Continuous Data: Quantitative data that can take any value within a range such as the temperature.
Categorical Variables: Variables that represent categories and are often used for ‘Yes/No’ questions.
Data Collection and Acquisition: Methods of gathering data from various sources.
Data Cleaning and Preprocessing: Preparing raw data for analysis by removing errors and inconsistencies.
Data Analysis and Interpretation: Using statistical methods and tools to derive insights from data.
Data Visualization: Presenting data visually using charts, graphs and infographics to communicate findings.
Critical Thinking: Evaluating data sources and the validity of data to make informed decisions.
Ethical Use of Data: Considering privacy, security, confidentiality and bias when working with data.
Data Mindset: Believing that data is abundant and can solve problems when properly utilized.
Data-enabled Questioning: Learning to ask questions that can be answered using data.
Describing Data: Understanding how data is categorized, stored and structured.
Graph Types: Different graphs for data visualization include bar charts, pie charts and line graphs.
Impact of Data Literacy: Helps in various fields like healthcare, education, business and everyday
decision-making.
Steps to Achieve Data Literacy: Understanding the basics, learning analysis techniques, gaining hands-on
experience, developing critical thinking and ethical data use.
Exercises
Objective Type Questions
I. Multiple Choice Questions (MCQs):
1. What is data literacy?
(a) The ability to code
(b) The ability to read, understand, interpret and communicate with data
(c) The ability to create databases
(d) The ability to design websites
11. How is knowledge different from information in the DIKW model?
(a) Knowledge is unprocessed data while information is processed data.
(b) Knowledge adds context and meaning to information.
(c) Information is more useful than knowledge.
(d) Knowledge is less relevant to decision-making than information.
12. What role does critical thinking play in data literacy?
(a) It helps in creating new data.
(b) It involves evaluating data and making evidence-based decisions.
(c) It reduces the amount of data needed.
(d) It eliminates the need for data analysis.
13. What is the function of an algorithm in data analysis?
(a) To create visual representations
(b) To process data step by step
(c) To ignore data patterns
(d) To delete data
14. Why is it important to describe data accurately?
(a) To reduce the amount of data collected
(b) To ensure proper understanding and categorization
(c) To increase data complexity
(d) To avoid data visualization
15. What is a common method to secure data online?
(a) Using weak passwords (b) Encrypting data
(c) Sharing data publicly (d) Avoiding data backups
16. In data literacy, what does ‘data visualization and communication’ entail?
(a) Storing data in physical formats
(b) Presenting data using charts, graphs and infographics
(c) Deleting unnecessary data
(d) Ignoring data analysis
17. What is meant by ‘data preprocessing’?
(a) Discarding irrelevant data
(b) Transforming raw data before analysis
(c) Collecting new data
(d) Ignoring data patterns
18. How can data literacy help in everyday life?
(a) By increasing the amount of data one collects
(b) By understanding and making sense of data in various contexts
(c) By reducing the use of data
(d) By ignoring data in decision-making
19. What is the significance of context in data interpretation?
(a) Makes data meaningless
(b) Helps transform data into meaningful information
(c) Complicates data analysis
(d) Reduces the accuracy of data
30. What is the first step to start using Tableau Public?
(a) Downloading the software
(b) Purchasing a licence
(c) Creating an account on the Tableau Public website
(d) Installing a plug-in
Learning Objectives
Understanding the importance of Mathematics as a foundation of Artificial Intelligence
Understanding Statistics as a means to collect, organize and analyze data using various
statistical tools such as dot plots, tally charts and more
Defining probability, calculating the likelihood of events and interpreting sample spaces in
various probability scenarios
Explaining measures of central tendency (mean, median, mode) and graphical representations
to analyze and communicate data effectively
Demonstrating how mathematical concepts, especially Statistics and Probability, are essential
for developing and enhancing AI applications
IMPORTANCE OF MATHS IN AI
Let us play a game before we begin this chapter. Guess the next number in the sequence for
the illustrations below:
I. 1, 3, 5, 7, ?
II. 4, 8, 12, 16, ? (shown in an accompanying geometric figure)
Easy guesses? In the first case, the pattern is simply an arithmetic series while in the second
case, there is an arithmetic series pattern along with a geometrical pattern, both changing
together. The human mind is a powerful creation. We have a natural gift for recognizing and
understanding patterns when we see them, and we can make inferences about what happens
next in the pattern.
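The way we spot the rule in the first sequence can be sketched in a few lines of Python. This is only an illustration: the function name next_term is our own, and it handles just the simplest case, where consecutive numbers differ by a constant amount.

```python
def next_term(sequence):
    """Return the next term if consecutive terms differ by a constant amount, else None."""
    # Differences between each pair of neighbouring terms.
    differences = [b - a for a, b in zip(sequence, sequence[1:])]
    # A single distinct difference means the rule is "keep adding the same number".
    if len(set(differences)) == 1:
        return sequence[-1] + differences[0]
    return None


print(next_term([1, 3, 5, 7]))  # 9
print(next_term([1, 2, 4, 8]))  # None (the difference keeps changing)
```

A pattern that follows a different rule, such as doubling, needs a different check. This is why recognizing the rule comes before predicting the next value.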
Mathematical patterns are also distributed throughout the natural world—from the intricate
spirals of seashells to the branching patterns of trees.
Consider some examples:
1. Fern: The tiny leaflets echo the shape of the entire fern leaf structure,
creating a beautiful display of self-similarity.
Did you notice a pattern here—the increasing/decreasing size of structure
and the number of leaflets as we go into more detail?
2. Symmetry: Symmetry is the balanced and proportional arrangement
of parts. It is visible in snowflakes, butterfly wings and the intricate
patterns of flowers.
3. Fibonacci Sequence: This sequence, where each number is the sum of the two preceding
ones (0, 1, 1, 2, 3, 5, etc.), appears in the arrangement of seeds in a sunflower head as
illustrated below:
0+1=1
1+1=2
1+2=3
2+3=5
3+5=8
5 + 8 = 13
..............
Figure: Fibonacci spiral with sections labelled 1, 1, 2, 3, 5, 8, 13, 21, 34 and 55
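The rule 'each number is the sum of the two preceding ones' can be written as a short Python sketch. The function name fibonacci is our own choice for this illustration.

```python
def fibonacci(count):
    """Return the first `count` Fibonacci numbers, starting from 0 and 1."""
    numbers = []
    a, b = 0, 1
    for _ in range(count):
        numbers.append(a)
        # The next number is the sum of the two preceding ones.
        a, b = b, a + b
    return numbers


print(fibonacci(11))  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```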
These are just a few examples of the many mathematical patterns that grace the natural
world. Mathematics can be used to explore and explain a lot of patterns that occur not only in
our world but also in data.
These patterns follow specific rules. By understanding these rules, we can accurately predict
what comes next in the pattern. This skill in recognizing patterns is valuable in two key areas:
Mathematics and Artificial Intelligence (AI).
Do it Yourself
Observe the illustrations and answer the given questions.
1. The chart ‘Pollutants in Delhi Air’ shows pollutants present in the Delhi atmosphere a year
before and a year after the COVID-19 lockdown. Portions of the graph are labelled A to G.
(a) Identify which part of the graph shows the COVID-19 lockdown period.
(b) What change in the pattern helped you in identifying the COVID-19 lockdown period?
(c) What do you think is the reason for the change in the pattern?
(d) Do you think that the pollutants in Delhi air follow a seasonal trend? Which portions of
the graph show repeated patterns?
2. Look at the number pyramid carefully. Do you observe a pattern?
1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 20 15 6 1
1 7 21 35 35 21 7 1
(a) Write your observations about the pattern.
(b) What should be the next line in the number pyramid?
3. Which of the following shapes should come in the blank space? Tick the correct one.
—Jeff Hawkins
• Finding Out Unknown or Missing Values (using Linear Algebra): Representing and
manipulating data using matrices to solve equations and model data. For example:
◼ Solving Systems of Equations: How many apples and oranges were sold if the total
number of fruits sold and their combined price are known?
◼ Matrix Operations: How to transform an image using rotation and scaling in
computer graphics?
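To make the apples-and-oranges question concrete, here is a minimal sketch. The numbers are assumed for illustration only: 10 fruits sold in all, apples at Rs 20 each, oranges at Rs 10 each and Rs 160 collected. The helper solve_2x2 is our own and uses Cramer's rule, one of several ways to solve two equations in two unknowns.

```python
def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 using Cramer's rule."""
    determinant = a1 * b2 - a2 * b1
    if determinant == 0:
        raise ValueError("The equations do not have a unique solution.")
    x = (c1 * b2 - c2 * b1) / determinant
    y = (a1 * c2 - a2 * c1) / determinant
    return x, y


# Assumed example values (not from the text):
#   apples + oranges       = 10   (total fruits sold)
#   20*apples + 10*oranges = 160  (total price in rupees)
apples, oranges = solve_2x2(1, 1, 10, 20, 10, 160)
print(apples, oranges)  # 6.0 4.0
```

The coefficients (1, 1) and (20, 10) form the rows of a 2 x 2 matrix, which is how such problems are represented in Linear Algebra.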
STATISTICS
What is Statistics
Statistics is a special branch of Mathematics that is associated with the collection and
organization of facts and figures. Large volumes of unorganized data do not make sense
on their own but when statistical tools are applied to them, we may obtain a wealth of
information. Statistics gives meaning to raw data, allowing us to draw conclusions. Defined
formally, Statistics is the science of collecting, describing, organizing and analyzing data in
order to derive meaningful insights and inferences from it.
The term ‘Statistics’ is derived from the Latin ‘Statisticum Collegium’ which means ‘Council of
the State’ and was used in reference to the council which conveyed to kings the summarized
information about population, military, land, agriculture and the like. The word gave rise to the
Italian ‘Statista’, and later to the German ‘Statistik’ which signified ‘Science of the State’.
Which card is occurring the most number of times? Which is the rarest card in your collection?
How many ‘Gold’ cards do you have?
If you have an organized collection and know which cards are the most common in your
collection (like finding lots of Bumrah cards) or the rarest (maybe a Sachin Tendulkar
Gold Trump card), it would be easier for you to choose which cards to trade with your friends.
Statistics helps you organize your information and find answers to questions based on
data.
3. Analyze: Now you can see the pattern. Which player has the most and the least number
of cards? Are there any players you haven’t found yet?
By collecting data (cards), counting and analyzing information, you are using statistics to
better understand your card collection.
Frequency
Frequency represents the number of times an observation occurs while recording data. Let us
do a simple experiment to understand the concept of ‘frequency’. We shall cast a die 20 times
and record the observations as illustrated below. Recall from the terminology section that our
data may be qualitative or quantitative in nature—frequency is always quantitative in nature.
Observations
Now, count the number of times each value appears on the die and record the results in a table
as shown below:
*After the tally reaches 4 (||||), we draw a diagonal line across the four marks to represent 5.
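The same counting can be sketched in Python. The 20 throws listed here are made-up values used only for illustration; they are not the observations recorded in the book. The tally is printed as plain strokes without the diagonal strike.

```python
from collections import Counter

# Hypothetical results of casting a die 20 times (illustration only).
throws = [3, 1, 6, 4, 2, 3, 5, 6, 3, 2, 4, 1, 6, 3, 5, 2, 4, 6, 3, 1]

# Counter records how many times each value occurs, i.e. its frequency.
frequency = Counter(throws)

print("Value  Tally  Frequency")
for value in range(1, 7):
    count = frequency[value]
    print(f"{value:<6} {'|' * count:<6} {count}")
```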
Dot Plots
Another representation of the numerical count of observations can be made visually and it
is called a dot plot. This representation shows all possible values on a number line and each
observation, corresponding to a value, is placed as a dot over it as shown in the illustration.
Please note that the dots are equal to the number of times an observation occurs (frequency).
The dot plot representation for the 20 throws of die is shown below. It gives an instant idea
about which is the most frequently occurring number on the die.
Figure: Dot plot of the 20 throws; horizontal axis: Number on Die Face (1 to 6)
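A dot plot can also be drawn as plain text. The sketch below assumes a made-up set of 20 throws (not the book's observations) and a helper named dot_plot of our own; each '*' stands for one observation stacked above its value.

```python
from collections import Counter

# Hypothetical results of casting a die 20 times (illustration only).
throws = [3, 1, 6, 4, 2, 3, 5, 6, 3, 2, 4, 1, 6, 3, 5, 2, 4, 6, 3, 1]
frequency = Counter(throws)


def dot_plot(frequency, values):
    """Build a text dot plot: one '*' per observation, stacked above each value."""
    tallest = max(frequency[value] for value in values)
    lines = []
    # Draw from the top row down so that the stacks grow upwards from the axis.
    for level in range(tallest, 0, -1):
        row = ["*" if frequency[value] >= level else " " for value in values]
        lines.append("  ".join(row))
    # The last line is the number line itself.
    lines.append("  ".join(str(value) for value in values))
    return lines


plot_lines = dot_plot(frequency, range(1, 7))
print("\n".join(plot_lines))
```

The tallest stack shows the most frequently occurring number at a glance.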
FUN TIME
1. Count the frequency of each animal in the illustration below and create a table showing
tally and frequency. Create a dot plot to visualize the same.
2. Calculate the frequency of each vowel in the following text and make a table showing
tally and frequency. Create a dot plot to visualize the same.
—Charmaine J. Forde
Applications of Statistics
Climate Action
Statistics play a crucial role in environmental monitoring by
helping us understand and track changes in environmental
factors over time. For example, the image shows the
per-capita CO2 emissions of various countries, indicating
which nations have the highest emissions. By collecting
and analyzing this data, we can identify trends, compare
emissions between countries and check the effectiveness of
policies aimed at reducing carbon footprints.
This type of statistical analysis is directly related to
Sustainable Development Goal (SDG) 13: Climate Action.
SDG 13 aims to combat climate change and its impact by
taking urgent action to reduce greenhouse gas emissions
and strengthen resilience to climate-related hazards.
Monitoring emissions through statistics helps track
progress towards these goals and implement policy
decisions that aid in mitigating climate change.
Source: World Bank
Eradication of Poverty
Statistics are essential in tracking and understanding the changes in poverty levels as shown
in the given image. Statistics help governments in tracking the ratio of people living in
poverty. Consider the following Niti Aayog India illustration:
Figure (Niti Aayog): Steep decline in Poverty Headcount Ratio during the last 9 years, from
29.17% in 2013-14 (Projected) to 11.28% in 2022-23 (Projected); 24.82 crore individuals
estimated to have escaped multidimensional poverty during the last 9 years
The Poverty Headcount Ratio helps measure the percentage of the population living below the poverty
line over time. It also highlights the scale of poverty reduction efforts and
their impact on people’s lives.
The image also indicates that India has likely achieved the SDG 1 target of reducing
multidimensional poverty by at least half well before 2030. This is a significant milestone,
showing that India is ahead of schedule in its efforts to alleviate poverty.
Disaster Management
Statistics are vital in disaster management for understanding and preparing for natural
disasters. Authorities use statistics in disaster management to:
1. Alert Citizens: Predict and warn people in areas likely to be affected by natural
disasters.
2. Understand Impact: Know the number of people, services and buildings in the affected
areas.
3. Allocate Resources: Efficiently plan and provide necessary resources like food, water
and medical aid.
4. Improve Response: Analyze past disasters to improve future response and recovery
efforts.
5. Plan Evacuations: Design safe and effective evacuation routes based on population
data.
Figure: Disaster management cycle around the Disaster Impact, with phases labelled Before,
During and After and the activities Risk Assessment, Prevention & Mitigation, Preparedness,
(Emergency) Response and Recovery
The above image shows different activities associated with a disaster mitigation and recovery
cycle. Both before and after activities, as shown in the disaster management cycle, utilize
statistics for insightful operations.
For example, the following graph from Our World in Data illustrates the number of recorded
natural disaster events over time. This data helps identify trends, predict future disasters and
effectively allocate resources for prevention.
Figure: All disasters, number of recorded events per year from 1900 to 2023 (vertical axis: 0 to 400)
It is important to note that the graph may show fewer events before 1980 due to
underreporting. Thus, careful interpretation is needed to understand the true historical
trends and to ensure accurate planning and response strategies.
1. What trend do you observe in the number of deaths related to climate-based disasters?
2. Why do you think this trend is happening? What factors are possibly leading to this
change?
Disease Prediction
Statisticians track diseases by analyzing data such as the number of cases, deaths and tests
conducted. This data helps create a picture of how a disease is spreading among people. By
analyzing trends and patterns, statisticians can predict how quickly a disease may spread in the
future. This allows for better preparedness and resource allocation. For example, consider the
COVID-19 outbreak timeline in India for the first wave, as shown in the following illustration.
Figure: Timeline of the first COVID-19 wave in India, January 2020 to February 2021, marking
the 1st case (30 Jan 2020), the National Lockdown, the 1000, 10,000, 100,000 and 10,000,000
case milestones, the first wave peak (new cases: 97,860) and 10,916,589 cases on 15 Feb 2021
Sports
Statistical analysis has become significantly important in sports management with
technological advancements. Let us consider the Indian Premier League (IPL) as an
example.
• Picking the best players: Statistics like runs scored, average (AVG), strike rate (SR)
and hundreds (100) help choose the best players. Virat Kohli has the most runs, so he is
ranked high in the given example.
• Knowing a player’s skills: Statistics like average and strike rate show how good a player
is at scoring runs often (average) and how fast they score (strike rate). A high strike
rate means a player scores runs quickly while a high average means they score runs
consistently.
• Knowing how players match up: By looking at a player’s statistics against a specific
bowler, teams can guess how the player will do in that game. For example, they might
see if a player who scores runs fast has trouble against slow bowlers.
• Planning the game: Statistics help coaches make plans for the upcoming matches. By
looking at how well a team scores runs—quickly or slowly—the coach can decide the
best game strategy.
• Finding new stars: Statistics can help find talented young players who are not famous
yet. Talent scouts may look for players with high strike rates or good averages in lower
leagues.
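The two statistics used above can be computed with simple formulas. In cricket, the batting average is the runs scored divided by the number of times the batter was dismissed, and the strike rate is the runs scored per 100 balls faced. The sketch below uses made-up figures for a hypothetical player; the function names are our own.

```python
def batting_average(runs, dismissals):
    """Runs scored per dismissal: a measure of how consistently a batter scores."""
    return runs / dismissals


def strike_rate(runs, balls_faced):
    """Runs scored per 100 balls faced: a measure of how quickly a batter scores."""
    return runs / balls_faced * 100


# Hypothetical season figures (illustration only, not a real player's record).
runs, dismissals, balls_faced = 450, 10, 300

print(batting_average(runs, dismissals))  # 45.0
print(strike_rate(runs, balls_faced))     # 150.0
```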
Data Collection
While watching the video, complete the table given below using the tally method of counting.
Data Analysis
• Which is the most frequently occurring car color in the video?
• How many cars can be seen in all?
Data Interpretation
• Can you infer the most popular color choice of the residents of this area? Elucidate.
Activity 2
SOCIAL MEDIA STATS
Question: How much time do students in your grade spend on social media each day?
• Data Collection: Conduct a survey by asking your classmates how much time they typically
spend on social media daily.
• Data Analysis: Organize the accumulated data into a frequency table showing how many
students spend a similar amount of time on social media daily. Calculate the average time spent
to see how much time a typical student in your class spends on social media every day.
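Once the survey is done, the analysis can be sketched as follows. The ten responses below are invented for illustration; replace them with the answers you actually collect. The sketch uses Python's built-in statistics module for the mean, median and mode.

```python
from collections import Counter
import statistics

# Hypothetical survey answers: minutes spent on social media per day.
minutes = [30, 60, 45, 60, 90, 30, 60, 120, 45, 60]

# Frequency table: how many students gave each answer.
frequency = Counter(minutes)
for value in sorted(frequency):
    print(f"{value} minutes: {frequency[value]} student(s)")

average = statistics.mean(minutes)    # sum of all values / number of values
middle = statistics.median(minutes)   # middle value of the ordered data
most_common = statistics.mode(minutes)  # value that appears most often

print(average, middle, most_common)
```

For this made-up data, all three measures come out as 60 minutes, but with real survey data they will often differ.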
Activity 3
SPORTING CHAMPIONS
Question: Who is the batting leader (with maximum runs) in your school cricket team?
• Data Collection: Find the batting records or statistics of your school’s cricket team and collect
data on the total runs scored by each player.
• Data Analysis: Rank the players based on their total runs. You can also calculate the average
runs scored per game to see who performs consistently.
School Carnival
Problem Statement
You are the Chairman of the organizing committee of DataRich School Carnival, which is due
to be held in December. The carnival, a much-awaited annual event, is attended by all students
from class 6 onwards. The event is big since the school has an intake of 500 students in each
batch, amounting to 3,500 students in all!
The carnival event committee members have proposed the following activities to be held this
year:
Magic Show, Merry-Go-Round, Balloon Shooting, Masquerade Ball and Treasure Hunt
The proposal for the carnival was sent to the budget committee for their consideration and
final approval.
However, the budget committee had already finalized the budget for DJ and Dance Floor
and informed the organizing committee that only three events from the above list could be
accommodated with the remaining funds. As Chairman of the organizing committee, you are
tasked with carefully choosing the activities which shall be liked by all students.
To make a quick but wise decision, you approach the Data Science teacher to help make the
right choice to ensure that everyone is happy. The teacher advises you to use statistics for the
problem at hand.
School Carnival
My Name:
I am a (Gender): Boy / Girl
I Study in Class:
My Roll Number is:
Magic Show
Merry-Go-Round
Balloon Shooting
Masquerade Ball
Treasure Hunt
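| S.No. | Name | Gender | Class | Roll Number | Magic Show | Merry-Go-Round | Balloon Shooting | Masquerade Ball | Treasure Hunt |
| 1 | Aman | Boy | 6 | 20 | 1 | 0 | 1 | 0 | 1 |
| 2 | Anil | Boy | 6 | 21 | 0 | 1 | 0 | 1 | 1 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 350 | Neetu | Girl | 12 | 5 | 1 | 1 | 0 | 0 | 1 |
| Total | | | | | 205 | 180 | 253 | 104 | 308 |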
The data is visualized for presentation to the committee, allowing them to make
inferences from the analysis.
Figure: Bar chart ‘Student Preferences’ (vertical axis: Number of Students, 0 to 400): Magic
Show 205, Merry-Go-Round 180, Balloon Shooting 253, Masquerade Ball 104, Treasure Hunt 308
Step 5: Interpret the Data
As Chairman of the organizing committee, you presented the analysis of statistical
data to the committee members. From the collected data and the subsequent bar chart
visualization, the organizing committee could answer the Statistical Investigative
Question: Which activities should be chosen for the School Carnival to make the maximum
number of students happy? It concluded that this would be achieved if
Magic Show, Balloon Shooting and Treasure Hunt were included in the carnival.
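The committee's choice can be checked with a short Python sketch. It uses the totals from the survey table and simply picks the three activities with the highest counts; the variable names are our own.

```python
# Totals from the survey table: number of students who chose each activity.
totals = {
    "Magic Show": 205,
    "Merry-Go-Round": 180,
    "Balloon Shooting": 253,
    "Masquerade Ball": 104,
    "Treasure Hunt": 308,
}

# Sort the activities from most preferred to least preferred.
ranked = sorted(totals, key=totals.get, reverse=True)

# Only three events fit within the remaining budget.
chosen = ranked[:3]
print(chosen)  # ['Treasure Hunt', 'Balloon Shooting', 'Magic Show']
```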
PROBABILITY
The great Greek philosopher Heraclitus is said to have remarked: ‘Change is the only constant
in life.’ We know for a fact that we cannot tell what may happen in the future. You may
interpret the above statement as saying that what has happened before is not certain to
happen again in the future.
Conclusion: The future is uncertain.
Jargon Alert: INTUITION
Human intuition is a feeling or understanding that makes you believe or know that something
is true without being able to explain why.
Let us combine all the information mentioned previously and formally define probability.
Probability is the branch of Mathematics that deals with uncertainty. While Statistics helps
us understand data from past events to make predictions about the future, probability
complements this process by evaluating the likelihood of these predictions in the face of
uncertainty.
The accuracy of weather prediction is quite high—90% for a five-day period as compared to
80% for a week. For 10 days, it gets as low as 50%.
Probability helps you understand the reasons behind the patterns that you see in your data
and use those insights to make an educated guess about what may come next. It helps with:
• Looking beyond patterns: Probability goes beyond just observing patterns. It tries to
understand why those patterns exist.
• Dealing with uncertainty: There is always some chance involved. Even with a lot of
information, you cannot be 100% sure about the predictions you make.
• Predicting the future based on data: Using the patterns you see, probability helps you
guess the chances of a prediction being correct. The more data you collect, the better
your guess (probability) will be.
Probability: Terminology
Let us get familiar with some terminologies of Probability which we shall use frequently
throughout our studies. Consider the following statement:
• Statement: The prices of gold may fall tomorrow.
• Experiment: The above statement is called an Experiment in the terminology of
Probability. Any uncertain statement, for which it is possible to have multiple outcomes,
is called an experiment.
For example, some other experiments may be written as:
◼ The sky shall remain clear today. (Can you think of multiple outcomes?)
◼ There may be no examination this year.
◼ Rolling a die or tossing a coin.
Mathematics for AI (Statistics & Probability) 115
• Outcome: There is a possibility that gold prices may increase, decrease or remain
the same tomorrow, and each of the above possibilities is called an Outcome of the
Experiment.
Similarly, getting the values 1, 2, 3, 4, 5 or 6 on the face of a die are the outcomes. Also,
heads and tails may be the outcomes of a coin-toss experiment. Each outcome is the
result of a single trial of the experiment.
• Sample Space: The set of all possible outcomes of an experiment is called the
Sample Space. For example, the rolling of a single die has the sample space
{1, 2, 3, 4, 5, 6}.
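Sample spaces can be listed in Python as collections of outcomes. The sketch below writes out the sample spaces for a die and a coin, and then builds the sample space for rolling two dice together as an extra illustration of our own.

```python
from itertools import product

# Sample space of rolling a single die.
die = {1, 2, 3, 4, 5, 6}

# Sample space of tossing a coin.
coin = {"Heads", "Tails"}

# Sample space of rolling two dice: every pair (first die, second die).
two_dice = list(product(sorted(die), repeat=2))

print(len(die), len(coin), len(two_dice))  # 6 2 36
print(two_dice[:3])  # [(1, 1), (1, 2), (1, 3)]
```

Each trial of the experiment produces exactly one outcome from its sample space.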
Activity 4
UNDERSTAND CHANCE EVENTS
The award-winning website Seeing Theory was created by Daniel Kunin while
studying as an undergraduate at Brown University. The goal of this website is to
make statistics more accessible through interactive visualizations. To explore
the meaning of chance events on Seeing Theory, scan the given QR code or
open the link https://seeing-theory.brown.edu/basic-probability/index.html in your
web browser.
Figure: Probability Range from 0 to 1: Impossible (0), Unlikely, Even Chance (0.5), Likely,
Sure! (1, or 100%)
1. Certain: An event that is guaranteed to happen. Its probability is 1. For example, day
turns into night—it is a certain event.
2. Likely: An event that has a high probability of happening. Its probability is
generally greater than 0.5. For example, seeing a white car on a city street is a very
likely event.
3. Even Chance: An event that is just as likely to happen as not. Its probability is 0.5.
When an experiment has two equally likely outcomes, the probability of each is 0.5.
(Guess what will be the probability of each event if four outcomes are equally
likely?) Getting heads when flipping a fair coin is an example of an even-chance
event.
4. Unlikely: An event that has a low probability of happening. Its probability is generally
less than 0.5. For example, seeing a shooting star on any given night is a rare event and
is considered unlikely.
5. Impossible: An event that will never happen. Its probability is 0. For example, it is
impossible to roll a 13 on a pair of standard six-sided dice; the maximum value can be
two 6s, summing to 12.
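The five categories above can be tried out on a pair of dice. The sketch below assumes two fair six-sided dice, lists all 36 equally likely rolls and labels a few events. The function names describe and chance_of are our own, and the cut-off at 0.5 follows the descriptions given above.

```python
from fractions import Fraction
from itertools import product


def describe(probability):
    """Place a probability on the scale from Impossible to Certain."""
    if probability == 0:
        return "Impossible"
    if probability < Fraction(1, 2):
        return "Unlikely"
    if probability == Fraction(1, 2):
        return "Even Chance"
    if probability < 1:
        return "Likely"
    return "Certain"


# All 36 equally likely results of rolling two fair dice.
rolls = list(product(range(1, 7), repeat=2))


def chance_of(condition):
    """Probability that the sum of the two dice satisfies the condition."""
    favourable = [roll for roll in rolls if condition(sum(roll))]
    return Fraction(len(favourable), len(rolls))


events = {
    "sum is 13": lambda total: total == 13,
    "sum is 7": lambda total: total == 7,
    "sum is even": lambda total: total % 2 == 0,
    "sum is more than 4": lambda total: total > 4,
    "sum is at most 12": lambda total: total <= 12,
}

for name, condition in events.items():
    p = chance_of(condition)
    print(f"{name}: {p} -> {describe(p)}")
```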
Riya: Oh, so unlikely means not very probable. What does an even chance mean?
Angela: Think of a coin toss. What is the probability of getting heads?
Riya: So, it’s highly likely, meaning there is a good chance but not every time. But what about something
that is guaranteed?
Angela: The sun will rise every morning. That’s a certain event. It is going to happen, no doubt about it.
The probability of it rising is 100%, for sure.
Riya: Like day turning into night or night turning into day.
Angela: Yup! But probability can get a lot more complex. Hopefully, this is a good starting point for
understanding those Statistics lessons. All the best!
Calculating Probability
The simplest way to calculate probability is to divide the number of favourable outcomes by
the number of total outcomes.
$$P(E) = \frac{\text{Number of Favourable Outcomes}}{\text{Total Number of Outcomes}}$$
Remember
Probability can range between 0.0 and 1.0. The probabilities of all possible outcomes of an
experiment add up to 1.0 (100%).
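The formula can be tried out with a short Python sketch. It assumes that every outcome in the sample space is equally likely, as with a fair die; the function name probability is our own. Fractions are used so that the answers appear exactly as you would write them by hand.

```python
from fractions import Fraction


def probability(event, sample_space):
    """Number of favourable outcomes divided by the total number of outcomes."""
    favourable = [outcome for outcome in sample_space if outcome in event]
    return Fraction(len(favourable), len(sample_space))


die = [1, 2, 3, 4, 5, 6]

print(probability({5}, die))     # 1/6  (rolling a 5)
print(probability({1, 2}, die))  # 1/3  (rolling a number less than 3)

# The probabilities of all the individual outcomes add up to 1.
print(sum(probability({face}, die) for face in die))  # 1
```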
Activity 5
UNDERSTAND CHANCE EVENTS
Now that you can calculate probability, let us play a betting game based on it.
To play, scan the given QR code or open the link https://www.geogebra.org/m/ewtrdaba
in your web browser.
Activity 6
THE WEATHER GAME
Question: How accurate is the weather forecast in your area?
• Data Collection: Track the daily weather forecast for a week and record the predicted highs
and lows. Then record the actual temperatures each day.
• Data Analysis: Calculate the difference between the predicted temperatures and the actual
temperatures. See how often the forecast is accurate and, if off target, by how much.
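The data analysis step can be sketched in Python as follows. The temperatures are invented for illustration; use the values you record during your own week of tracking.

```python
# Hypothetical predicted and actual daily high temperatures (in degrees Celsius) for one week.
predicted_highs = [34, 35, 33, 36, 34, 32, 33]
actual_highs = [33, 35, 34, 38, 34, 31, 33]

# How far off was each forecast, ignoring whether it was too high or too low?
errors = [abs(p - a) for p, a in zip(predicted_highs, actual_highs)]

exact_days = errors.count(0)               # days on which the forecast was exactly right
average_error = sum(errors) / len(errors)  # typical size of the miss

print(errors)       # [1, 0, 1, 2, 0, 1, 0]
print(exact_days)   # 3
print(round(average_error, 2))
```

The same steps can be repeated for the daily lows to see whether one is forecast more accurately than the other.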
Figure: COVID-19 case chart for April to June (vertical axis: 0 to 1 million cases) showing the
best case scenario lower limit and upper limit
Source: COV-IND-19StudyGroup
Source: www.telegraphindia.com
Source: www.ey.com
This relates to SDG 8 (Decent Work and Economic Growth) as it helps create policies that
promote economic stability and growth.
Source: www.hindustantimes.com
Figure: Win probability graphic for India vs Pakistan (ODI 12 of 48): India 68%, Pakistan 32%
6. Traffic Prediction: Most map applications like Google Maps and Apple Maps use
probability to predict the density of traffic between the source and the destination,
and use this to estimate the approximate travel time. Traffic predictions are
also used by app-based cab services to calculate ride fares.
Memory Bytes
Arithmetic series is a sequence of numbers with a constant difference between consecutive terms.
Mathematics is crucial in AI because it helps in modelling, predicting and making decisions based on data.
Data collection involves gathering information systematically for the purpose of analysis.
Organizing data involves structuring the collected information in a way that makes analysis and
interpretation easier.
Dot plot is a simple visual tool that displays data points on a number line.
Tally chart uses tally marks to record the frequency of data occurrence.
Frequency refers to the number of times a particular observation appears in a dataset.
Histogram is a graphical representation that shows the distribution of data using bars.
Probability measures the likelihood of an event occurring.
Sample space is the set of all possible outcomes in a probability experiment.
Probability helps manage uncertainty by making informed guesses based on data patterns.
High frequency in a dataset indicates that a particular observation is very common.
Mean is the average of a dataset and is calculated by summing up all the values and dividing them by the
total number of values.
Median is the middle value in an ordered dataset.
Mode is the value that appears most frequently in a dataset.
Dot plots are useful for small datasets because they show individual data points clearly.
Statistics play a key role in disaster management by helping to predict, plan and allocate resources
efficiently during emergencies.
Independent events are those whose outcomes do not affect each other.
Exercises
Objective Type Questions
I. Multiple Choice Questions (MCQs):
1. Which sequence is an example of simple arithmetic series?
(a) 1, 2, 4, 8, 16 (b) 2, 4, 6, 8, 10
(c) 1, 1, 2, 3, 5, 8 (d) 1, 4, 9, 16, 25
2. What mathematical concept is used in AI to predict future events based on data patterns?
(a) Calculus (b) Probability
(c) Linear Algebra (d) Geometry
3. What is the primary role of statistics in data analysis?
(a) Collecting, describing, organizing and analyzing data
(b) Solving complex mathematical equations
(c) Creating visual art
(d) Writing computer programs
4. Find the value of x in the given equation:
7x + 11 = 4
(a) 4 (b) –1
(c) 1 (d) –7
5. Match the following:
Part I Part II
(A) Exploring Data (i) Linear Algebra
(B) Training and Improving AI Model (ii) Probability
(C) Finding Out Unknown Values (iii) Statistics
(D) Predicting Different Events (iv) Calculus
(a) (A)–(iii) (B)–(ii) (C)–(i) (D)–(iv) (b) (A)–(iii) (B)–(iv) (C)–(i) (D)–(ii)
(c) (A)–(iv) (B)–(iii) (C)–(ii) (D)–(i) (d) (A)–(iv) (B)–(ii) (C)–(i) (D)–(iii)
6. In a tally chart, what symbol represents a count of five?
(a) \\ (b) \\
(c) \\\\ (d) \\\\
7. What is the probability of getting heads in a fair coin toss?
(a) 0.25 (b) 0.5
(c) 0.75 (d) 1
8. When a fair die is thrown, what is the probability of getting a multiple of two?
(a) 1/6 (b) 2/6
(c) 3/6 (d) 1/3
9. Which color will most likely be hit in the dart game?
(a) Yellow
(b) Blue
(c) Green
(d) Pink
126 SUPPLEMENT—Decoding Artificial Intelligence–IX
10. In probability, what does the term ‘sample space’ mean?
(a) The number of trials conducted (b) The set of all possible outcomes
(c) The most frequent outcome (d) The least likely outcome
11. Which mathematical field(s) help in handling uncertainty and making informed guesses using data?
(a) Calculus (b) Probability and Statistics
(c) Geometry (d) Algebra
12. What does a high frequency in a data set indicate?
(a) The observation occurs rarely (b) The observation is very common
(c) The data is inaccurate (d) The data set is small
13. A die is rolled once. What is the probability of rolling a number greater than 4?
(a) 1/6 (b) 1/3
(c) 1/2 (d) 2/3
14. In a dot plot, what does each dot represent?
(a) A single observation (b) The total frequency
(c) The summary of the data (d) The average value
15. How can statistics help in disaster management?
(a) Predicting future disasters (b) Allocating resources efficiently
(c) Planning evacuations (d) All of these
16. A card is drawn from a standard deck of 52 cards. What is the probability of drawing an Ace?
(a) 1/52 (b) 1/13
(c) 1/4 (d) 1/2
17. Find the mode of the following data:
1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 2, 2, 1
(a) 1 (b) 2
(c) 3 (d) 4
18. .......................... and .......................... are examples of descriptive statistics.
(a) Mean, z-test (b) Mean, Median
(c) Variance, Standard Deviation (d) Hypothesis Testing, Mode
Learning Objectives
Defining Generative AI and classifying different kinds of generative models.
Explaining how Generative AI works and recognizing how it learns.
Applying Generative AI tools to create content.
Understanding ethical considerations and potential social impact of using Generative AI.
Demonstrating how mathematical concepts, especially statistics and probability, are essential
for developing and enhancing AI applications.
Introduction
Humans have been fascinated with technology, machines and gadgets
since time immemorial. The ancient Indian scriptures contain
fascinating stories that hint at machines with intelligent or generative
capabilities. We have always sought ways to make our tasks
easier and more efficient. From the invention of the wheel to industrial
revolutions, to the development of self-driving cars and robots, we have
constantly strived to automate processes to make human lives better.
The development from calculating devices to computers has been a
long journey towards automation. However, until the last few years,
computers were found wanting in generating creative content like
designs, art, music, poems or stories.
The idea that ‘machines can create new things’ has been around
for decades, studied by both universities and businesses. One early
example is how computers understand and write language. Since the
beginning of AI research, scientists have been trying to get computers
to write like humans.
The same goes for creating pictures, music and other creative
elements using computers—it is a long-standing area of interest for AI
researchers.
This chapter is an introduction to Generative Artificial Intelligence. The phrase ‘Generative
AI’ has been around in AI circles for a while. Today, Generative AI covers a wide range of
tools and uses, from writing text to creating images to composing music. What do you think
Generative AI is? Is it a magician, a wizard or a fairy? Let us find out!
‘Generative AI is the most powerful tool for creativity that has ever been
created. It has the potential to unleash a new era of human innovation.’
—Elon Musk
WHAT IS GENERATIVE AI
Generative AI refers to the field of Artificial Intelligence that
focuses on developing algorithms and models capable of
generating original content, such as images, videos or text, by
learning from large datasets. Generative AI opens up exciting
possibilities for creativity and innovation. It shows how
technology can be combined with human imagination to create
something entirely new.
Unlike traditional AI applications that analyze and classify
data, Generative AI takes what it has learned and uses that
knowledge to produce novel outputs. You may imagine it as a
student who has studied a vast library of literature, memorized
it all and can now write their own stories, inspired by what
they have read yet entirely original.
Evolution of Generative AI
Early AI: Early AI systems were designed to recognize and categorize things. These
discriminative models could be trained to identify objects or patterns, such as recognizing
pictures of animals or sorting emails into spam and non-spam. These systems could be
categorized as supervised (training data with labels) or unsupervised (training data without
labels).
Supervised Learning: This method involves training a model
using labelled data. For example, a model might be shown many
images labelled “bicycle” so it learns to recognize bicycles in
new images.
The model makes predictions by comparing new inputs to the
examples it has seen. This type of AI is good at “discriminative
modelling” i.e., identifying and categorizing things it has been
trained on.
Generative AI 131
Jargon Alert
DISCRIMINATIVE MODELS
Discriminative models are AI systems trained for specific tasks. They look at data
(like pictures) and try to figure out what category or label (like “bicycle” or “car”) that data
belongs to.
How Do They Work?
1. Learning to Recognize: Let’s say the model is trained on a lot of pictures of bicycles as examples.
Now, when your AI system sees a new picture, it can tell if it is a bicycle by looking for specific
features like two wheels, handlebars, and pedals.
2. Making Fine Distinctions: Discriminative models can also tell the difference between
similar things. For example, even though cars and bicycles both have wheels, a car has
four wheels and an enclosed cabin, while a bicycle has two wheels and is open. The model
learns these differences to make accurate predictions.
3. Making Predictions: Consider a faraway object that appears as a blurry picture. At first, you
might not know what it is, but as you get closer, it gets clearer and you can guess what the
picture shows. Discriminative models work in a similar way. They look at the features of
an image and predict what the image is showing. If the features match those of a bicycle,
the model will predict that the image is of a bicycle. But what happens when they make a
prediction and it’s not correct? Or the picture is not sufficiently clear yet?
4. Learning and Improving:
(a) Iterative Process: Discriminative models get better over time. If they make a wrong
prediction, they are corrected and retrained to improve their accuracy. This is like
practising a skill: the more the model practises and learns from its mistakes, the
better it gets.
(b) Human Help: Even though these models are smart, they still need help from humans to
verify and correct their predictions. This helps prevent mistakes and ensures the model
learns accurately.
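The four steps above can be sketched in a few lines of Python. This is only an illustrative toy, not how real image-recognition systems are built: it assumes each vehicle has already been reduced to two hand-picked features (the number of wheels, and 1 or 0 for an enclosed cabin), and the names train and predict are made up for this sketch.

```python
# Toy discriminative model: learn an average "profile" per label from
# labelled examples, then label new items by the closest profile.
# Assumed features: (number_of_wheels, has_enclosed_cabin)

def train(examples):
    """Return one average feature vector (centroid) per label."""
    sums, counts = {}, {}
    for features, label in examples:
        total = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}

def predict(centroids, features):
    """Pick the label whose centroid is nearest (squared distance)."""
    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(centroids[label], features))
    return min(centroids, key=distance)

training_data = [
    ((2, 0), "bicycle"), ((2, 0), "bicycle"),
    ((4, 1), "car"), ((4, 1), "car"),
]
model = train(training_data)
print(predict(model, (2, 0)))  # bicycle
print(predict(model, (4, 1)))  # car

# Learning and improving: a three-wheeled auto-rickshaw is unknown to the
# model, so its first guess is wrong. A human supplies the correct label
# and the model is retrained.
print(predict(model, (3, 1)))  # car (wrong guess)
training_data.append(((3, 1), "auto-rickshaw"))
model = train(training_data)
print(predict(model, (3, 1)))  # auto-rickshaw
```

Notice that the model never creates anything new; it only chooses among the labels it was shown during training.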
Unsupervised Learning: In unsupervised learning, the model is given data without labels and
must find patterns and structures on its own. It can group similar items, find relationships
between them, and simplify complex data.
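Grouping without labels can be illustrated with a small sketch. The code below assumes a made-up list of student heights and uses a very simple version of a grouping method called k-means with two groups; the function name two_groups and the data are hypothetical and chosen only for illustration.

```python
def two_groups(values, rounds=10):
    """Split numbers into two groups around two moving centres (1-D k-means, k=2)."""
    low, high = min(values), max(values)  # starting guesses for the two centres
    for _ in range(rounds):
        group_a = [v for v in values if abs(v - low) <= abs(v - high)]
        group_b = [v for v in values if abs(v - low) > abs(v - high)]
        if not group_b:  # all values are identical: nothing to separate
            break
        # Move each centre to the average of the values assigned to it.
        low = sum(group_a) / len(group_a)
        high = sum(group_b) / len(group_b)
    return sorted(group_a), sorted(group_b)

heights_cm = [150, 152, 149, 178, 181, 176, 151, 180]
short, tall = two_groups(heights_cm)
print(short)  # [149, 150, 151, 152]
print(tall)   # [176, 178, 180, 181]
```

No one told the program which heights were "short" or "tall"; it found the two groups from the numbers alone.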
Generative Modelling: Generative AI takes the unsupervised learning approach a step further
by not only recognizing patterns but also creating new data based on what it has learned.
For example, after learning from many pictures of bicycles, a generative model can create
a new, realistic image of a bicycle.
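The step from recognizing a pattern to creating new data can be shown with a deliberately tiny sketch. Real generative models learn from images with neural networks; here we assume a made-up list of bicycle wheel sizes, summarise it by its average and spread, and then draw new numbers that follow the same pattern. The names learn and generate are hypothetical.

```python
import random
import statistics

def learn(samples):
    """'Training': summarise the data by its mean and standard deviation."""
    return statistics.mean(samples), statistics.stdev(samples)

def generate(model, count, seed=0):
    """'Generation': draw brand-new values that follow the learned pattern."""
    mean, spread = model
    rng = random.Random(seed)  # fixed seed so the run can be repeated
    return [round(rng.gauss(mean, spread), 1) for _ in range(count)]

wheel_sizes_inches = [26.0, 27.5, 29.0, 26.0, 27.5, 28.0]  # assumed data
model = learn(wheel_sizes_inches)
new_sizes = generate(model, 5)
print(new_sizes)  # five new sizes, similar to the training data but not copied from it
```

The generated values are not taken from the training list; they are new numbers that look like they could belong to it.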
Generative AI Timeline
• Sydney: Microsoft’s new GenAI for Bing is reported to be ‘hallucinating’ in February,
maintaining it is still 2022, announcing it self-identifies as ‘Sydney’ and declaring it is
in love with Kevin Roose, a NYT columnist.
• Guetta X Eminem: French DJ and music producer David Guetta uses GenAI in February to
generate deepfake lyrics and voice of Eminem at his appearance at Future Rave. Crowds love it.
• Drone Shot: An AI-generated image called ‘Drone Shot’ wins the DigiDirect Summer Australian
photo contest in February with prompts from Jamie Sissons.
Generative AI is not limited to image generation; it has a lot of other exciting applications as
presented below.
Generative AI Applications
The table below presents some of the most common applications of Generative AI in the
present-day scenario, with their uses and examples.
Video Generation: Makes new videos or changes the existing ones by adding or taking away
scenes, objects or visual effects.
• Creating new videos
• Finishing videos that are not complete
• Predicting what may happen next in a video
3D Model Generation: Makes 3D models and objects which can be used for new designs, virtual
worlds and simulations.
• Creating 3D models
• Making virtual reality experiences
• Designing simulations
Speech Generation: Imitates human voices and can be used to turn text into speech, create
voice assistants and make automated voice systems.
• Voice assistants on your phone or speaker
FUN TIME
Check how Intel, one of the world’s biggest computer chip manufacturers, is leading
in the domain of Generative AI in this YouTube video.
To play, use the link https://www.youtube.com/watch?v=26fJ_ADteHo in your
web browser or scan the given QR code.
Interesting fact
AI-GENERATED RECIPES CREATE CULINARY DELIGHTS
AI algorithms have been used to generate novel recipes and culinary creations, including unusual
combinations and innovative dishes. For example, IBM’s ‘Chef Watson’ program has generated
recipes such as the Belgian Bacon Pudding and the Caribbean Plantain Soup, inspiring chefs to
experiment with AI-generated cuisine.
GENERATIVE AI VS TRADITIONAL AI
Generative AI and Traditional AI are both subsets of Artificial Intelligence but they approach
tasks in fundamentally different ways.
Traditional AI works within predefined rules and instructions. These systems often excel
at tasks with clear goals and well-defined problem-solving steps. The AI model is either
explicitly programmed with rules, logic or mathematical equations, or learns the rules
from the training data.
The outputs of traditional AI programs are primarily focused on decision-making,
classification, prediction and optimization, all based on input data. These programs have
limited creative capabilities and rely on existing information; they cannot generate entirely
new concepts. Traditional AI is like a highly skilled chef who can follow a recipe perfectly,
ensuring a delicious and consistent dish.
Examples: Spam filtering, game playing (computer chess), facial-recognition software, etc.
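A rule-based spam filter gives a feel for how traditional AI works within predefined rules. The sketch below is a simplified illustration, not a real filter: the word list SPAM_WORDS and the threshold of two words are assumptions chosen for this example.

```python
SPAM_WORDS = {"lottery", "winner", "free", "prize"}  # hypothetical rule list

def is_spam(message, threshold=2):
    """Flag a message when it contains at least `threshold` distinct spam words."""
    words = {word.strip(".,!?").lower() for word in message.split()}
    return len(words & SPAM_WORDS) >= threshold

print(is_spam("You are a WINNER! Claim your free prize now"))  # True
print(is_spam("Are you free for the maths class tomorrow?"))   # False
```

The program only decides between two fixed answers, spam or not spam. It follows its rule exactly and cannot produce anything that is not already described by that rule.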
Generative AI, on the other hand, focuses on creating new content, data or creative text
formats. These AI programs usually deal with tasks requiring imagination and exploration.
Generative AI systems learn from vast amounts of data (text, images, code), known as
training data, by identifying patterns and relationships.
The outputs of Generative AI include entirely new content like images, music, text formats
or 3D models based on training data. These applications have high creative potential. They
can generate novel and surprising outputs that resemble the styles and patterns learned from
training data.
Examples: Generating realistic images of people who don’t exist, composing music in a
specific style, writing different creative text formats like poems or scripts, etc.
Generative AI is like a creative, experimental chef who can use their knowledge of different
ingredients and techniques to come up with entirely new and surprising dishes.
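Learning patterns from training data and then producing new content can be sketched with a very old and simple technique called a Markov chain. This is not how modern Generative AI tools work internally; it is used here only because it fits in a few lines. The training sentence and the names learn_patterns and generate are made up for this illustration.

```python
import random
from collections import defaultdict

def learn_patterns(text):
    """Record, for every word, which words followed it in the training text."""
    words = text.split()
    followers = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return followers

def generate(followers, start, length=8, seed=1):
    """Build a new sentence by repeatedly picking a word that followed the last one."""
    rng = random.Random(seed)  # fixed seed so the run can be repeated
    sentence = [start]
    while len(sentence) < length and followers.get(sentence[-1]):
        sentence.append(rng.choice(followers[sentence[-1]]))
    return " ".join(sentence)

training_text = (
    "the chef cooks a new dish and the chef tastes a new soup "
    "and the guest likes the new dish"
)
model = learn_patterns(training_text)
print(generate(model, "the"))
```

Every pair of neighbouring words in the output was seen during training, yet the sentence as a whole may be one that never appeared in the training text. Changing the seed gives a different sentence.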
The key difference lies in the primary goal. While traditional AI excels in analyzing and
manipulating existing data, Generative AI focuses on creating new content. However, both can
be heavily data-driven, particularly when machine learning is involved. Summarized below
are some similarities between the two approaches, based on how AI systems work.
Jargon Alert
• Generator: It is the part of the GAN that creates new content.
• Discriminator: It is the part of the GAN that checks if the content is real or
generated.
Interesting fact
AI-GENERATED ARTWORK SELLS AT EXORBITANT PRICE
In 2018, an AI-generated artwork titled Portrait of Edmond de Belamy was sold for $432,500
at a Christie’s auction, far exceeding its estimated price of $7,000–10,000. Created by the
art collective Obvious using a GAN, this sale highlighted the growing interest and value of
AI-generated art.
GANs can generate realistic human faces that don’t actually exist. They are used to create
artwork, enhance photos and even generate new video game levels.
Examples:
• Deepfake: Videos where someone’s face is swapped with another person’s face.
• StyleGAN: A tool that creates lifelike images of non-existent people.
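[Figure: How a GAN works. Noise is fed to the Generator, which produces a fake sample. The
Discriminator receives this fake sample along with a sample from the training set and labels
each one as Real or Fake.]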
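The tug-of-war between a generator and a discriminator can be sketched with numbers instead of images. The toy below is a hand-written illustration under strong assumptions: "real" data are numbers close to 4, the generator is a single straight-line formula, the discriminator is a single logistic formula, and all names (train_toy_gan, sigmoid) are made up. Real GANs use neural networks, and training of this kind is known to wobble rather than settle neatly, so no particular final values should be expected.

```python
import math
import random

def sigmoid(s):
    """Logistic function, written to avoid overflow for large |s|."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)

def train_toy_gan(steps=2000, lr=0.01, seed=0):
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # generator: fake = a * noise + b
    w, c = 0.0, 0.0   # discriminator: P(real) = sigmoid(w * x + c)
    for _ in range(steps):
        real = rng.gauss(4.0, 0.5)       # a sample from the training set
        noise = rng.gauss(0.0, 1.0)
        fake = a * noise + b             # the generator's output

        # Discriminator step: push P(real) up for real data, down for fakes.
        d_real, d_fake = sigmoid(w * real + c), sigmoid(w * fake + c)
        w -= lr * ((d_real - 1.0) * real + d_fake * fake)
        c -= lr * ((d_real - 1.0) + d_fake)

        # Generator step: adjust a and b so the fake is rated as more real.
        d_fake = sigmoid(w * fake + c)
        a -= lr * (d_fake - 1.0) * w * noise
        b -= lr * (d_fake - 1.0) * w
    return (a, b), (w, c)

(gen_scale, gen_shift), (disc_w, disc_c) = train_toy_gan()
print("generator shift:", round(gen_shift, 2))
print("P(real) for 4.0:", round(sigmoid(disc_w * 4.0 + disc_c), 2))
```

Each round, the discriminator is corrected using one real and one fake sample, and the generator is then nudged in whatever direction makes its fake look more real to the discriminator.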
Do it Yourself
GANs IN ACTION
Visit the website https://www.whichfaceisreal.com/ and play against the computer. How
many AI-generated faces could you spot as fake out of 10? Note the number.
Jargon Alert
• Sequence Data: It is the data that comes in a specific order like text or music.
• Memory: It refers to RNN’s ability to remember the previous steps in a
sequence.
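The idea of memory in an RNN can be shown with a single number that is carried from one step to the next. This is a minimal sketch, not a trained network: the two weights (0.8 and 0.5) are arbitrary values assumed for illustration, and the names rnn_step and read_sequence are made up.

```python
import math

def rnn_step(hidden, x, w_input=0.8, w_memory=0.5):
    """One recurrent step: the new memory mixes the current input with the old memory."""
    return math.tanh(w_input * x + w_memory * hidden)

def read_sequence(sequence):
    hidden = 0.0  # empty memory before anything is read
    for x in sequence:
        hidden = rnn_step(hidden, x)
    return hidden

# Both sequences end with the same value (1), yet the final memory differs
# because the earlier items were different.
after_rising = read_sequence([0, 0, 1])
after_steady = read_sequence([1, 1, 1])
print(round(after_rising, 3), round(after_steady, 3))
```

Because the result depends on the whole sequence and not just the last item, the order of the data matters, which is exactly what text and music need.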
Interesting fact
AI-AUTHORED NOVEL COMPETES IN LITERARY CONTEST
In 2016, a novel titled The Day a Computer Writes a Novel passed the first round of screening for
the Nikkei Hoshi Shinichi Literary Award, a prestigious Japanese literary contest. The novel was
authored by an AI program developed by the team at Future University Hakodate, demonstrating the
potential of AI to participate in creative areas that are traditionally reserved for humans.
are not exact copies. He can also add new features in addition to the captured features. VAEs
work similarly by compressing data into a simpler form (blueprint) and then creating new
data from it.
For example, consider the following illustration.
[Figure: An encoder compresses a face image into a set of features, each on a scale from –1 to 1:
Smile, Skin tone, Gender, Beard, Glasses and Hair color. A decoder then rebuilds an image from
these values.]
While generating a new image based on the VAE method, we may decrease the ‘beard’
characteristic and increase the ‘glasses’ characteristic, and the decoder will then create a new
image similar to the input person but without the beard and with the glasses. VAEs are used in
generating new images, music and even 3D models.
Examples: Face Reconstruction: Generating new faces based on the learned facial features.
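The editing step described above, where the ‘beard’ value is lowered and the ‘glasses’ value is raised, can be sketched as follows. This shows only the change made to the compressed blueprint between the encoder and the decoder; the encoder and decoder themselves are neural networks and are not shown. The starting values and the names clamp and edit_blueprint are assumptions made for this illustration.

```python
def clamp(value, low=-1.0, high=1.0):
    """Keep a feature value inside the -1 to 1 scale."""
    return max(low, min(high, value))

def edit_blueprint(blueprint, **changes):
    """Return a copy of the blueprint with some features shifted up or down."""
    edited = dict(blueprint)
    for name, delta in changes.items():
        if name not in edited:
            raise KeyError(f"unknown feature: {name}")
        edited[name] = clamp(edited[name] + delta)
    return edited

# Pretend the encoder produced this blueprint for an input photo.
blueprint = {"smile": 0.2, "skin_tone": -0.1, "gender": 0.6,
             "beard": 0.9, "glasses": -0.8, "hair_color": 0.3}

new_blueprint = edit_blueprint(blueprint, beard=-2.0, glasses=2.0)
print(new_blueprint["beard"], new_blueprint["glasses"])  # -1.0 1.0
```

All other features stay the same, which is why the decoded image would still resemble the input person.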
Jargon Alert
• Encoder: It is the part of the VAE that compresses data.
• Decoder: It is the part of the VAE that decompresses data to recreate it.
Do it Yourself
Visit https://kidgeni.com/ and create your own text-to-image generations. Try turning
doodles into images as well and let your imagination run wild!
Caution: It is recommended to use all Generative AI tools under the supervision of your teacher or
parent.
EXAMPLES OF GENERATIVE AI
Let us explore some popular examples of Generative AI. All examples are provided with their
descriptions, along with links to access them.
Caution: Do not experiment with any Generative AI tool without the supervision of your teacher or
parents.
OpenAI’s ChatGPT
Description: ChatGPT is a powerful language model capable of generating
human-like text based on prompts provided by users. You can explore its
capabilities and applications by scanning the given QR code or by opening the link
https://chatgpt.com/ in your web browser.
Kidgeni
Description: Kidgeni helps kids unleash their creativity with lessons and tools that
guide them on their artistic journey. Kids can play around with different words and
ideas to create one-of-a-kind pictures that bring their imagination to life.
Access to Kidgeni is limited but you can learn more about its capabilities and see examples by
scanning the given QR code or by opening the link https://kidgeni.com in your web browser.
OpenAI SORA
Description: SORA is an upcoming Generative AI model developed by OpenAI that
specializes in text-to-video generation. OpenAI released some incredible videos,
generated completely using Artificial Intelligence, based on the capabilities of
SORA. The platform preview is available on https://openai.com/index/sora/.
This Person Does Not Exist
Description: This website showcases AI-generated images of people that do
not exist. Each time you refresh the page, a new image is generated. To view
AI-generated images of fictional people on the website, scan the given QR code or
open the link https://thispersondoesnotexist.com/ in your web browser.
Scalability (Rapid creation of content): Generative AI can help create suitable images,
videos, audio and text rapidly, which helps large projects scale.
Accessibility (Image to Audio and other examples): Generative AI can also help in increasing
accessibility for differently abled people with novel and unique applications.
Do it Yourself
THE CHATBOT WARS
Use the following 6 prompts with ChatGPT, Microsoft Bing Chat and Gemini, and compare the results.
1. Write a summary of the history of the internet.
2. Explain how to code a simple website.
3. Write a blog post about the latest trends in artificial intelligence.
4. Create a presentation about the benefits of cloud computing.
5. Write a research paper about the future of technology.
6. Design an app that solves a real-world problem.
Parameters: Which chatbot out of the above three is better on each of the following parameters
in your opinion?
Parameter 1: Human-Like Response
Parameter 2: Training Dataset and Underlying Technology
Parameter 3: Authenticity of Response
Parameter 4: Access to the Internet
Parameter 5: User Friendliness and Interface
Parameter 6: Text Processing: Summarization, Paragraph Writing, etc.
Parameter 7: Charges and Price
4. Misinformation and Manipulation: Generative AI can create fake content to spread
misinformation or manipulate opinions. Examples: Fake news and AI-generated articles
with false information. Deepfakes are also used to mislead voters or harm reputations.
This can damage democracy, erode trust in the media and cause confusion.
To learn how misinformation can affect democracies, scan the given QR code
or open the link https://www.youtube.com/watch?v=LIPkDso-uHA in your web
browser.
5. Intellectual Property and Authorship: Ownership of AI-generated content is unclear.
For example, there is uncertainty over who owns AI-created art or music.
6. Plagiarism: AI-generated content may unintentionally resemble existing works.
These issues complicate legal matters and can harm creativity and innovation.
Developing Critical Thinking Skills: Generative AI can’t replace the critical thinking skills
you develop by researching, analyzing information and forming your own arguments.
For example, if you use AI to write a persuasive essay, you may miss the opportunity to
learn about the different perspectives on the topic or develop your own writing techniques.
Understanding Concepts is the Key: Memorizing facts generated by AI won’t lead to deep
understanding. True learning happens by actively engaging with the material, asking
questions and making connections. Generative AI can help summarize information or create
visuals but it can’t replace the process of actually learning the concepts.
Memory Bytes
Generative AI creates new content like text, images or music by learning from existing data.
Key algorithms used in Generative AI include GANs, VAEs and RNNs.
GANs consist of a generator and a discriminator competing against each other to produce realistic outputs.
VAEs encode data into a compressed space and then decode it back, generating new data variations.
RNNs generate sequences, making them suitable for tasks like text and music generation.
Generative AI models require large datasets and significant computational power for training.
Applications of Generative AI span creative fields like art, music and literature.
In healthcare, Generative AI helps in drug discovery and personalized treatment plans.
Generative AI enhances accessibility by creating tools for the visually and hearing impaired.
In environmental monitoring, Generative AI predicts ecological changes and disaster risks.
Exercises
Objective Type Questions
I. Multiple Choice Questions (MCQs):
1. What is Generative AI?
(a) AI that follows predefined rules
(b) AI that generates new content like images, videos and text
(c) AI used only in gaming
(d) AI that does not require data for training
2. Which of the following is an example of a Generative AI application?
(a) Spam filtering (b) Facial recognition
(c) Composing music in a specific style (d) Playing chess
3. What is a key characteristic of Generative AI?
(a) Limited to decision-making tasks (b) Requires minimal computational resources
(c) Can create entirely new content (d) Does not learn from data
4. Which type of model is commonly used in Generative AI for creating realistic images?
(a) Recurrent Neural Networks (RNNs) (b) Generative Adversarial Networks (GANs)
(c) Decision Trees (d) Linear Regression Models
5. What is one of the main ethical concerns related to Generative AI?
(a) Cannot learn from data (b) Uses too little data
(c) Bias in generated content (d) Only works with high-quality data
6. How can Generative AI assist in personalized learning?
(a) By standardizing all learning experiences
(b) By analyzing strengths and weaknesses to customize practice problems
(c) By replacing teachers entirely
(d) By providing only theoretical knowledge
7. Which of the following is NOT a use case of Generative AI in creative projects?
(a) Writing imaginative descriptions of a story
(b) Generating ideas for characters in a story
(c) Sorting emails
(d) Composing background music
8. What does RNN stand for?
(a) Random Network Node (b) Recurrent Neural Network
(c) Rapid Neural Network (d) Real-time Neural Network
9. Which of the following tasks is traditional AI better suited for as compared to Generative AI?
(a) Creating new images (b) Composing new music
(c) Predicting stock market trends (d) Writing creative stories
10. What is a potential social impact of Generative AI?
(a) Decreasing computational power usage
(b) Improving scientific discovery processes
(c) Eliminating the need for data in AI
(d) Reducing the need for critical thinking skills
11. How should students use Generative AI responsibly in their studies?
(a) As a complete substitute for doing their homework
(b) To enhance their learning without replacing fundamental skills
(c) By relying solely on AI for all research tasks
(d) By using it to cheat in exams
12. What is one benefit of using Generative AI tools in science experiments?
(a) Avoiding understanding the scientific method
(b) Creating simulations to observe environmental changes
(c) Guaranteeing accurate experimental results
(d) Eliminating the need for human observation
13. Which AI model is used to create background music in multimedia projects?
(a) Generative Adversarial Networks (GANs) (b) Recurrent Neural Networks (RNNs)
(c) Variational Autoencoders (VAEs) (d) Decision Trees
14. What does the term ‘black box’ refer to in the context of Generative AI?
(a) A model with transparent and easily understood processes
(b) A model whose internal workings are not fully understood
(c) A model that operates without data
(d) A model that requires no human oversight
15. What is the role of human oversight in the use of Generative AI tools?
(a) To replace human judgment entirely
(b) To ensure AI outputs are guided and verified by humans
(c) To reduce the quality of AI outputs
(d) To minimize human involvement in AI-generated content
ANSWERS TO OBJECTIVE TYPE QUESTIONS
Chapter–1
I. Multiple Choice Questions (MCQs):
1. (b) 2. (d) 3. (b) 4. (a) 5. (a)
6. (b) 7. (b) 8. (a & c) 9. (a) 10. (a)
11. (b) 12. (b)
II. Fill in the blanks:
1. Ethics 2. Principles of Ethics
3. Biasedness 4. AI Bias
III. True or False:
1. True 2. True 3. True 4. True 5. False
Chapter–2
I. Multiple Choice Questions (MCQs):
1. (b) 2. (c) 3. (b) 4. (a) 5. (b)
6. (b) 7. (b) 8. (b) 9. (c) 10. (b)
11. (b) 12. (b) 13. (b) 14. (b) 15. (b)
16. (b) 17. (b) 18. (b) 19. (b) 20. (b)
21. (b) 22. (b) 23. (b) 24. (b) 25. (b)
26. (b) 27. (b) 28. (c) 29. (b) 30. (c)
Chapter–3
I. Multiple Choice Questions (MCQs):
1. (b) 2. (b) 3. (a) 4. (b) 5. (b)
6. (d) 7. (b) 8. (c) 9. (b) 10. (b)
11. (b) 12. (b) 13. (b) 14. (a) 15. (d)
16. (b) 17. (c) 18. (b)
Chapter–4
I. Multiple Choice Questions (MCQs):
1. (b) 2. (c) 3. (c) 4. (b) 5. (c)
6. (b) 7. (c) 8. (b) 9. (c) 10. (b)
11. (b) 12. (b) 13. (c) 14. (b) 15. (b)
II. True or False:
1. True 2. True 3. False 4. True 5. False
6. False 7. True 8. True 9. False 10. True
III. Fill in the blanks:
1. text, images, music 2. Generative Adversarial Networks (GANs)
3. sequences 4. deepfakes
5. augmentation