AI DataAcquisition

The document discusses the key steps in the data acquisition process for an AI project. It begins by explaining the importance of problem scoping to define the goals and stakeholders. It then provides an example problem statement around protecting elephants from poaching. It describes identifying relevant data features and acquiring reliable training and testing data from authentic sources. Finally, it discusses using system maps to understand relationships between problem elements and how changes can impact the overall system.

Uploaded by

Pratikshaa R
Copyright
© All Rights Reserved

AI PROJECT LIFE CYCLE

DATA ACQUISITION
• Problem Scoping
• Data Acquisition
• Data Exploration
• Modelling
• Evaluation
• The first step in the lifecycle of an AI project is problem scoping
• Scoping helps in setting the goal of the project
• It identifies the stakeholders
• It defines the scope of the project
• It also sets the measure for evaluating the system
Read the following articles. Let us come
up with a problem statement to develop
an AI system for this purpose.
• Threat to Elephants
• Impact of poaching
• Importance of elephants
• Conservation by WWF
WHO? :
Who are the stakeholders?
Elephants, WWF
What do you know about the stakeholders?
Elephants live in the forest. They need food and water
for living. They move around in search of food. They
cross over to human habitats.
WWF : An organisation that protects wildlife and works to
ensure its safety. It works with the governments of
different countries for this purpose.
WHAT? :
What is the problem?
The problem is that elephants are in danger of
extinction due to poaching and conflict with humans.
How do we know that it is a problem?
The problem has been brought to our notice
through articles in newspapers and on websites.
WHERE? :
What is the context / situation in which
the stakeholders experience the problem?
Elephants are poached because of the high demand
for ivory in China, and they come into conflict
with humans over habitat.
WHY? :
What would be of key value to the stakeholders?
Ensuring the safety of elephants, which play a
major role in the ecological cycle, thereby helping
WWF achieve its goal of protecting elephants.
How would it improve the situation?
A solution would secure habitat for elephants
and allow dependent organisms to thrive as well.
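
The 4Ws answers above can be collected into a single record so that the problem statement is written the same way every time. The following is only a minimal sketch: the class name ProblemScope, its field names, and the sentence template are hypothetical and not part of the original slides; the field values are taken from the answers above.

```python
from dataclasses import dataclass


@dataclass
class ProblemScope:
    """Holds the answers to the four scoping questions (hypothetical structure)."""
    who: str    # stakeholders
    what: str   # the problem
    where: str  # context in which the problem occurs
    why: str    # value of a solution to the stakeholders

    def statement(self) -> str:
        # Assumed template; any wording that covers all four answers would do.
        return (f"{self.who} face the problem that {self.what}, "
                f"in the context of {self.where}. "
                f"A solution would {self.why}.")


scope = ProblemScope(
    who="Elephants and WWF",
    what="elephants are in danger of extinction due to poaching and conflict",
    where="high demand for ivory and conflict with humans for habitat",
    why="ensure habitat for elephants and allow dependent organisms to thrive",
)
print(scope.statement())
```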
The process of collecting accurate and
reliable data, which forms the base of an AI
system, is called Data Acquisition.
Data is information, such as facts and
statistics, collected together for reference
or analysis.
For example, if you want to predict next year’s
salary of an employee, what would you
need to know? The previous salaries of
employees.
An intelligent system uses previous salary data to predict
the future salary. The AI system would be trained using the
previous salary data. We call this data set the training data.
Once the training is over, we use separate data to evaluate the model.
The data used for evaluation is called testing data.

Training Data : Data that is used to train the machine for
building intelligence is called training data.
Testing Data : Data that is provided to a machine for
evaluation of the model is called testing data.
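
The split between training data and testing data can be sketched in a few lines. This is an illustration under stated assumptions, not a prescribed method: the salary records are made up, and the function name split_train_test, the 20% test fraction, and the fixed seed are all hypothetical choices.

```python
import random

# Hypothetical records of (years of experience, salary); the values are made up.
records = [(1, 30000), (2, 34000), (3, 38500), (4, 42000), (5, 47000),
           (6, 51000), (7, 55500), (8, 60000), (9, 64000), (10, 69000)]


def split_train_test(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows, then return (training_data, testing_data)."""
    shuffled = rows[:]                         # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)      # fixed seed makes the split repeatable
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


training_data, testing_data = split_train_test(records)
print(len(training_data), "training rows;", len(testing_data), "testing rows")
```

Keeping the two sets disjoint matters: a model evaluated on rows it was trained on would look better than it really is.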
There are two important steps in Data
acquisition.
– Identifying what data is needed for
the problem at hand
– Identifying where to collect reliable and
authentic data
• The first step in data collection is to visualise the
factors that affect the problem statement.
• List down the factors that affect the problem
statement directly or indirectly. Examine each factor
and identify the type of data you would require to
solve the problem.
• A data feature refers to the type of data you want to
collect.
For example, for predicting salary, we may need salary
amount, increment percentage, years of experience,
bonus etc. Each of these is a data feature.
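
One way to make the data features concrete is to write them down as the fields of a record before any data is collected. The sketch below assumes the four salary features named above; the class name SalaryRecord, the field types, and the example values are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class SalaryRecord:
    """One row of salary data; each field is one data feature."""
    salary_amount: float
    increment_percentage: float
    years_of_experience: int
    bonus: float


# The list of features is simply the list of field names.
feature_names = [f.name for f in fields(SalaryRecord)]

# A made-up example row, for illustration only.
example = SalaryRecord(salary_amount=45000.0, increment_percentage=5.0,
                       years_of_experience=4, bonus=2000.0)
print(feature_names)
```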
• Once Data features have been identified, we need to acquire the data.
• While acquiring data, it should be ensured that data is reliable and authentic
• Data should be acquired by authentic means
• There are various ways in which data can be collected :
– Surveys (in case of poll prediction)
– Web scraping (to know what is trending, opinion on topics)
– Sensors (minimum or maximum temperature of a city)
– Cameras (images)
– Observations (experiments involving chemicals)
– Application Programming Interface (data from companies)
• Data taken from a random website on the Internet may not be reliable or authentic.
• Ensure that the data is open source and does not breach privacy
• Open-source websites hosted by the Government are ideal
• Europe has a strict law governing the collection and use of personal data, called the
General Data Protection Regulation (GDPR)
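
Reliability also has to be checked after the data arrives. As a sketch of the sensor example above, the code below drops missing and implausible readings before computing a city's minimum and maximum temperature. The readings, the plausibility bounds, and the function name clean_readings are all assumptions made for illustration.

```python
# Hypothetical temperature readings in degrees Celsius from one city sensor.
# None marks a missing reading; 999.0 stands in for a sensor fault value.
readings = [31.2, 29.8, None, 33.5, 999.0, 28.4]

# Assumed plausibility bounds for air temperature; adjust for the real sensor.
LOWEST_PLAUSIBLE = -90.0
HIGHEST_PLAUSIBLE = 60.0


def clean_readings(values):
    """Keep only readings that are present and within the plausible range."""
    return [v for v in values
            if v is not None and LOWEST_PLAUSIBLE <= v <= HIGHEST_PLAUSIBLE]


valid = clean_readings(readings)
min_temperature = min(valid)
max_temperature = max(valid)
rejected = len(readings) - len(valid)
print(f"min={min_temperature} max={max_temperature} rejected={rejected}")
```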
• System maps help us identify the relationships
between different elements of the
problem. They help in strategizing the
solution for achieving the goal of a
project.
• Let us take an example of a familiar
system : Water Cycle
• Elements of a system are put in circles
• The map shows the relationships between
different elements
• The arrowhead depicts the direction of effect
• The sign (+ or -) on the arrow indicates
whether the two elements are directly or
inversely related, i.e. whether the second
increases or decreases when the first
increases.
• System maps help us identify elements
in the data which are related to each
other.
• They help us identify loops in the system, to
understand the cause and effect of changes
to an element
• They help us understand how a change in one
element will impact the entire system
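
A system map can be stored as a set of signed arrows, which makes the two ideas above easy to check: the overall effect along a chain of arrows is the product of their signs, and a loop is a chain that returns to its starting element. The map below is a hypothetical one for the elephant problem and does not come from the original slides; the names edges, path_sign and find_loops are likewise illustrative.

```python
# Hypothetical system map: (cause, effect) -> sign of the relationship.
# +1 means the effect rises when the cause rises; -1 means it falls.
edges = {
    ("ivory demand", "poaching"): +1,
    ("poaching", "elephant population"): -1,
    ("elephant population", "human-elephant conflict"): +1,
    ("human-elephant conflict", "elephant population"): -1,
}


def path_sign(path):
    """Multiply the signs of the arrows along a path of elements."""
    sign = 1
    for cause, effect in zip(path, path[1:]):
        sign *= edges[(cause, effect)]
    return sign


def find_loops():
    """Return each loop once, as a list that starts and ends at the same element."""
    graph = {}
    for cause, effect in edges:
        graph.setdefault(cause, []).append(effect)
    loops = []

    def walk(start, node, path):
        for nxt in graph.get(node, []):
            if nxt == start:
                loops.append(path + [start])
            # Only visit names that sort after the start, so that each loop
            # is reported from its alphabetically first element only.
            elif nxt not in path and nxt > start:
                walk(start, nxt, path + [nxt])

    for start in sorted(graph):
        walk(start, start, [start])
    return loops


loops = find_loops()
chain = ["ivory demand", "poaching", "elephant population"]
print("effect of ivory demand on elephant population:", path_sign(chain))
for loop in loops:
    print(" -> ".join(loop), "| loop sign:", path_sign(loop))
```

A loop whose sign is negative counteracts a change to any of its elements, while a loop whose sign is positive amplifies it.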
