INTRODUCTION
Worksheet -1
4) Websites and mobile apps use our search history to provide personalized offers
a) Yes b) No
10) You regret posting a particular picture and want to take it down. Is it possible, and how would you
do that?
a) It is a little tricky but can be done by asking a professional to do it. Then no one can see the photo.
b) You can delete the picture by clicking on the delete button. Then no one can see the photo anymore
c) Only the police can delete a picture uploaded by you
d) A photo can be deleted from your account, but someone might have already saved it or copied it.
INTRODUCTION
Worksheet -2
1. Data can be defined as facts or information that, when stored, can be used as a
basis for decision making, calculation, or discussion.
2. Processed, managed, and structured data is called Information.
3. Information is a collection of data that has a logical sense.
4. Data after transformation to information can also be converted to knowledge and
wisdom. This is called the DIKW model.
5. When we use the internet, we send and receive data through the internet. With
all the activities we do on the internet, we create trails of data. These trails of data
are called data footprints.
6. Data footprints can be classified into two categories: Active and Passive.
7. We regularly use several social media platforms and post images or content which
are stored on the media. This is a form of Active data footprint as we have
knowingly shared information about ourselves.
8. Our browsing history and product searches may be stored by search engines.
Organizations use these records for personalized marketing. This is an example
of a passive data footprint.
9. The process of restoring inaccessible, lost, corrupted, damaged, or deleted data is
called data recovery.
10. Some of the reasons for Data Loss are: System Failure (Power failure, Hardware
failure, System Crash), Disaster (Natural disaster, Fire), Crime (Theft, Hacking,
Computer Virus, Ransomware, etc.), Unintentional actions (Accidental deletion of
files, loss of pen drive or laptop) and Intentional actions (Deletion of files and
programs intentionally).
11. Healthcare, Education, Travel, Online shopping and Online shows are some of the
ways data influences our daily lives.
Chapter-2
1. A school named ABC has recorded the total marks of every student in the class. This is an example of:
a. Qualitative data b. Quantitative data c. Both qualitative and quantitative data d. None of the above
2. A food delivery app has asked for your feedback on the quality of the food. You have written two paragraphs
to describe the food. This is an example of:
a. Qualitative data b. Quantitative data c. Both qualitative and quantitative data d. None of the above
3. You need to predict what the temperature will be for next Friday. Which algorithm will you use?
4. You need to predict if your car tyre will last for the next 1000 km. Which algorithm will you use?
a) Business can utilize outside intelligence while making decisions b) Improved customer service
c) Better optimal efficiency d) All of the above
6. The analysis of large amounts of data to see what patterns or other useful information can be found is
known as
9. The advantages of secondary data are low cost, speed, availability, and flexibility.
a) True b) False
1. Data Collection is defined as the procedure of collecting data for measuring and analyzing accurate
6. At times data is already recorded for some other purpose but then re-used for analysis. These are
7. Online surveys, interviews, feedback forms are some methods of collecting Primary data.
8. Web traffic tracking, Satellite data tracking are some methods of collecting Secondary data.
9. When data volume increases beyond certain limits and specialized systems are required to manage the data,
10. Systems capable of extracting statistical insights from a huge amount of data are called Big Data
Systems.
11. Volume, Variety, and Velocity are some of the key characteristics that can define Big Data
12. Binary classification, regression, anomaly detection, clustering are some of the algorithms used to
15. Big Data techniques are widely used in different sectors. Some of them are Health Care, Retail,
Example – Q) You are checking your car tyre pressure. Is the reading regular?
How many goals will your favorite team score in this football match?
Example - Consider a class of 60 students, students can be categorized into groups based on
their height.
Example - I am a self-driving car. I am at a traffic signal with a red light. What should I do now?
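The height-grouping example above (a class of students categorized by height) can be sketched in Python. This is only a minimal illustration of clustering, using a simple one-dimensional k-means loop. The heights, the starting centers, and the function name cluster_heights are assumptions made for the example and do not come from the worksheet.

```python
def cluster_heights(heights, centers, rounds=10):
    """Group heights around the nearest center (simple 1-D k-means sketch)."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(rounds):
        groups = [[] for _ in centers]
        # Assign every height to the group whose center is closest.
        for h in heights:
            nearest = min(range(len(centers)), key=lambda i: abs(h - centers[i]))
            groups[nearest].append(h)
        # Move each center to the mean of its group (keep it if the group is empty).
        new_centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
        if new_centers == centers:
            break  # Groups have stopped changing.
        centers = new_centers
    return centers, groups


# Hypothetical heights in centimetres and hypothetical starting centers.
heights_cm = [140, 142, 145, 150, 152, 155, 160, 163, 165]
centers, groups = cluster_heights(heights_cm, centers=[140, 152, 165])
print(groups)
```

No labels such as "short" or "tall" are given in advance; the groups emerge from the data, which is what separates clustering from classification.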
Chapter - 3
Data Visualization
Worksheet-1
1. Data can be visualized using:
a. Graphs
b. Maps
c. Charts
d. All of the above
Answer: d
2. Which of the following statements is false?
a. Data visualization can absorb information quickly.
b. Data visualization decreases the insights and takes slower decisions.
c. Data visualization is a type of visual art.
d. None of the above
Answer: b
Answer: d
4. Bar Graph is a
a. One-dimensional graph
b. Two-dimensional graph
c. Graph with no dimension
d. None of the above
Answer: a
5. The data represented through a histogram can help in finding graphically the
a. Median
b. Mean
c. Mode
d. All of the above
Answer: c
6. Pie Chart is a
a. One-dimensional graph
b. Two-dimensional graph
c. Graph with no dimension
d. None of the above
Answer: a
Answer: a
Distribution Answer: b
Chapter - 3
Data Visualization
Worksheet-2
1. Data visualization is the mechanism of representing raw data in the form of graphical
representations that allow users to explore the data and uncover quick insights.
2. Representing data through visualizations like graphs, charts, maps, etc., gives us a
visual context of the data.
3. Data visualization makes complex data simple and enables the human mind to
understand its significance.
5. Data visualization techniques use visual data in a universal, fast, and powerful way to
communicate information.
7. On the Chart Elements section, we may provide the title, subtitle, name of the x-axis
& name of the y axis of the chart.
8. A bar graph is a graphical display of data using bars of different heights. It is possible to
plot the bars vertically or horizontally.
10. The minimum is the smallest value in the data set. The maximum is the largest value
in the data set.
11. The frequency of a data value is the number of times the data value occurs/repeats.
14. Data points in a normal distribution are as likely to occur on one side of the average
as on the other side of the average.
15. A right-skewed distribution occurs when the data has a range boundary on the left-
hand side of the histogram.
16. A right-skewed distribution is also known as a positively skewed distribution
17. A left-skewed distribution usually occurs when the data has a range boundary on the
histogram's right-hand side.
19. A bimodal distribution has two peaks. In a bimodal distribution, the data should be
separated and analyzed as separate normal distributions.
20. A random distribution lacks an apparent pattern and has several peaks.
21. Multi-variable plots are used to display relationships among several variables.
23. Understand the different shapes of a histogram and name the type of distribution.
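The ideas of minimum, maximum, and frequency from the points above can be tried out with a few lines of Python. This is a minimal sketch on a made-up data set; the variable names are assumptions. The comparison of mean and median at the end is only a common rule of thumb for guessing the direction of skew, not a definition from this worksheet.

```python
from collections import Counter
from statistics import mean, median

marks = [2, 3, 3, 4, 4, 4, 5, 9]  # hypothetical data set

smallest = min(marks)        # minimum: the smallest value in the data set
largest = max(marks)         # maximum: the largest value in the data set
frequency = Counter(marks)   # frequency: how many times each value occurs

# The value with the highest frequency corresponds to the tallest histogram bar.
mode_value, mode_count = frequency.most_common(1)[0]

# Rule of thumb (an assumption, not a strict test): a mean above the median
# hints at a right (positive) skew, a mean below it at a left skew.
if mean(marks) > median(marks):
    shape = "right-skewed"
elif mean(marks) < median(marks):
    shape = "left-skewed"
else:
    shape = "roughly symmetric"

print(smallest, largest, dict(frequency), mode_value, shape)
```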
3) If you are done with using the confidential data collected from users, you should:
a) Safely store it. We may need it in future for some analysis or reports
b) Effectively destroy it in a way that it is unreadable
7) Which of the following is not the appropriate way of discarding the confidential
data?
a) Shredding the data
b) Cutting the files which contain confidential data
c) Burning the confidential data
d) Crumpling the papers which contain confidential data and throwing them in the
dustbin
Chapter-4
Ethics in Data Science
Worksheet -2
1. The private data acquired from a person with their consent should never be
exposed for use by different businesses or individuals.
3. Third-party companies should always have restrictions on whether and how that
information is allowed to be passed forward.
4. Customers should always have a clear view of how their data is getting
used or traded and should have the authority to manage the flow of their
confidential information across enormous, third-party systems.
7. Once we are done with the user data, especially confidential data, it is
important that we discard this data in an appropriate way to make sure that it
is not accessed by any unauthorized person and is not misused in any way.
8. There are two ways in which you may have stored the data – in the digital
format or as a physical copy.
11. Do not use confidential customer data for business purposes without
consent.
Computer Virus - A computer virus is a type of malicious software, or malware, that infects
computers and corrupts their data and software.
Ransomware - Ransomware is a type of malware designed to extort money from its victims,
who are blocked or prevented from accessing data on their systems.
Excel
Excel is a spreadsheet program from Microsoft and a component of its Office product group
for business applications.
Cell - A cell is the intersection of a row and a column; in other words, where a row and
column meet. Every cell is identified by its cell address, which contains its column letter
and row number (if a cell is in column B and on the 7th row, its address is B7).
Active cell - The selected cell in which data is entered when you begin typing. Only one cell is
active at a time. The active cell is bounded by a heavy border.
Cell reference - The set of coordinates that a cell occupies on a worksheet. For example, the
reference of the cell that appears at the intersection of column B and row 3 is B3.
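The way a cell address is built from a column and a row can be sketched in Python. This is a minimal illustration; the function name cell_address and the choice to count columns from 1 (A = 1, B = 2, and so on) are assumptions made for the example.

```python
def cell_address(column_number, row_number):
    """Build an address such as B7 from a 1-based column number and a row number."""
    letters = ""
    n = column_number
    # Columns are labelled A..Z, then AA, AB, and so on.
    while n > 0:
        n, remainder = divmod(n - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return f"{letters}{row_number}"


print(cell_address(2, 7))   # column B, row 7
print(cell_address(2, 3))   # column B, row 3
print(cell_address(27, 1))  # the column after Z
```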
Active sheet - The sheet that you're working on in a workbook. The name on the tab of the
active sheet is bold.
Row - In Microsoft Excel, a row runs horizontally across a worksheet's grid structure. Horizontal
rows use Numeric Values such as 1, 2, 3 and 4 as labels.
Column - In Microsoft Excel, a column runs vertically across a worksheet's grid structure.
Vertical columns use letters such as A, B, C and D as labels.
Fill Handle - Fill Handle is a tool that auto-fills the rows/columns following the value pattern of
the selected cells and creates a list or series. It is the small black square in the lower-right corner of
the selection. When you point to the fill handle, the pointer changes to a black cross.
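The pattern-following behaviour of the fill handle can be imitated for simple number series in Python. This is a rough sketch under stated assumptions: it only handles numbers and repeats the step between the last two selected values, whereas the real tool also handles dates, text, and other patterns. The function name fill_series is hypothetical.

```python
def fill_series(selected, count):
    """Extend a numeric series by `count` values, repeating the last step."""
    # With a single selected value there is no pattern, so the value is copied.
    step = selected[-1] - selected[-2] if len(selected) >= 2 else 0
    filled = list(selected)
    for _ in range(count):
        filled.append(filled[-1] + step)
    return filled


print(fill_series([2, 4, 6], 3))  # continues the step of 2
print(fill_series([5], 2))        # copies the single value
```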
Address Bar – It shows the address of the active cell. If you have selected more than one cell,
then it will show the address of the first cell in the range.
Formula Bar – The formula bar is an input bar, below the ribbon. It shows the content of the
active cell, and you can also use it to enter a formula in a cell.
Formula - A sequence of values, cell references, names, functions, or operators in a cell that
together produce a new value. A formula always begins with an equal sign (=).
Function - A prewritten formula that takes a value or values, performs an operation, and
returns a value or values. Use functions to simplify and shorten formulas on a worksheet,
especially those that perform lengthy or complex calculations.
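The difference between a formula and a function can be mirrored in Python. This is an analogy only: the dictionary standing in for worksheet cells, its values, and the helper sum_range are assumptions made for the example. In Excel itself the two versions would be written as =B1+B2+B3 and =SUM(B1:B3).

```python
cells = {"B1": 10, "B2": 20, "B3": 30}  # hypothetical worksheet values

# Formula: values and cell references combined with operators, like =B1+B2+B3
formula_result = cells["B1"] + cells["B2"] + cells["B3"]


def sum_range(cells, column, first_row, last_row):
    """Add up one column of cells between two rows, similar to =SUM(B1:B3)."""
    return sum(cells[f"{column}{row}"] for row in range(first_row, last_row + 1))


# Function: a prewritten operation that shortens the same calculation.
function_result = sum_range(cells, "B", 1, 3)

print(formula_result, function_result)
```

Both approaches give the same total; the function version stays short even when the range grows to hundreds of cells.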