Model Lifecycle (XII)
SUBJECT: ARTIFICIAL INTELLIGENCE (843)
UNIT 2: MODEL LIFECYCLE
Q.1 What is a Model Parameter?
Ans: A model parameter is a variable whose value is estimated from the dataset. Parameters
are the values learned during training from historical data. The values of model parameters
are not set manually; they are estimated from the training data and are internal to the
machine learning model. Based on the training, the values of the parameters are set, and
these values are used by the model while making predictions. The accuracy of the parameter
values defines the skill of your model.
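For illustration, here is a minimal scikit-learn sketch (the training values are made up) showing that parameters such as a linear regression's coefficient and intercept are estimated from data rather than set by hand:

# A minimal sketch: the coefficient and intercept of a linear regression
# are model parameters, learned from the (hypothetical) training data.
from sklearn.linear_model import LinearRegression

X = [[1], [2], [3], [4]]   # hours studied (illustrative data)
y = [2, 4, 6, 8]           # marks scored (illustrative data)

model = LinearRegression()
model.fit(X, y)            # parameter values are estimated here

print(model.coef_, model.intercept_)   # the learned parameters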
Problem Scoping: Whenever we start any work, certain problems are always associated with
the work or process. These problems can be small or big; sometimes we ignore them, and
sometimes we need urgent solutions. Problem scoping is the process by which we figure out
the problem that we need to solve.
The 4Ws Canvas: The 4Ws Canvas is a helpful tool in Problem Scoping.
● Who?: Refers to who is facing the problem, who the stakeholders of the problem
are, and who is affected because of the problem.
● What?: Refers to what the problem is and what you know about it. What is the
nature of the problem? Can it be explained simply? How do you know it is a
problem? What evidence supports that the problem exists? What solutions are
possible in this situation? At this stage, you need to determine the exact nature of
the problem.
● Where?: Refers to the context, situation, or location of the problem; the focus is
on where the problem occurs.
● Why?: Refers to the reason we need to solve the problem, the benefits the
stakeholders would get from the solution, and how it would benefit them as well
as society.
Data Acquisition: Data Acquisition consists of two words:
● Data: Data refers to the raw facts, figures, information, or statistics.
● Acquisition: Acquisition refers to acquiring data for the project.
So, Data Acquisition means Acquiring Data needed to solve the problem.
DATA MAY BE THE PROPERTY OF SOMEONE ELSE, AND THE USE OF THAT DATA
WITHOUT THEIR PERMISSION IS NOT ACCEPTABLE.
However, there are some sources from which we can collect data without any hassle. Let's
take a look:
There are two types of data in this case, primary and secondary; let's look at both.
● Primary Data: Primary data is the kind of data that is collected directly from the
data source. It is mostly collected specifically for a research project and may be
shared publicly to be used for other research. Primary data is often reliable and
authentic.
● Secondary Data: Secondary data is the data that has been collected in the past by
someone else and made available for others to use. Secondary data is usually easily
accessible to researchers and individuals because they are shared publicly.
Data Exploration: Data exploration is the first step of data analysis, used to visualize data
to uncover insights from the start or to identify areas or patterns to dive into and explore
further. It allows for a deeper, more detailed, and better understanding of the data. Data
visualization is a part of this, where we present the data in terms of tables, pie charts, bar
graphs, line graphs, bubble charts, choropleth maps, etc.
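As a minimal sketch (the column names and values below are made up), data can be explored with pandas and visualized with matplotlib:

# A minimal sketch of data exploration: summary statistics plus a bar graph.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({"city": ["Delhi", "Mumbai", "Chennai"],
                   "aqi": [310, 165, 120]})   # illustrative values

print(df.describe())                     # summary statistics of numeric columns

df.plot(kind="bar", x="city", y="aqi")   # bar graph of AQI per city
plt.show()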
Deployment: Deployment is the method by which you integrate a machine learning model
into an existing production environment to make practical business decisions based on data.
To start using a model for practical decision-making, it needs to be effectively deployed into
production.
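One common deployment step is exporting the trained model so a production application can load it and serve predictions. A minimal sketch (the file name and model are illustrative assumptions):

# A minimal sketch: save a trained model during the build phase and
# load it back inside the production application.
import joblib
from sklearn.linear_model import LogisticRegression

model = LogisticRegression().fit([[0], [1], [2], [3]], [0, 0, 1, 1])
joblib.dump(model, "model.joblib")     # exported artifact

loaded = joblib.load("model.joblib")   # loaded by the production system
print(loaded.predict([[2.5]]))         # practical decision based on data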
1. AI Project Scoping
The first fundamental step when starting an AI initiative is scoping and selecting the relevant
use case(s) that the AI model will be built to address. In this phase, it's crucial to precisely
define the strategic business objectives and desired outcomes of the project, align all the
different stakeholders' expectations, anticipate the key resources and steps, and define the
success metrics. Selecting the AI or machine learning use cases and being able to evaluate
the return on investment (ROI) is critical to the success of any data project.
2. Building the Model
Once the relevant projects have been selected and properly scoped, the next step of the
machine learning lifecycle is the Design or Build phase, which can take from a few days to
multiple months, depending on the nature of the project. The Design phase is essentially an
iterative process comprising all the steps relevant to building the AI or machine learning
model: data acquisition, exploration, preparation, cleaning, feature engineering, testing and
running a set of models to try to predict behaviors or discover insights in the data.
3. Deploying to Production
In order to realize real business value from data projects, machine learning models must not
sit on the shelf; they need to be operationalized, or deployed into production for use across
the organization. Sometimes the cost of deploying a model into production is higher than the
value it would bring. Ideally, this should be anticipated in the project scoping phase, before
the model is actually built, but this is not always possible.
Another crucial factor to consider in the deployment phase of the machine learning lifecycle
is the replicability of a project: think about how the project can be reused and capitalized on
by teams, departments, or regions other than the ones it was initially built to serve.
Q.5 What is the AI development life cycle?
Ans: It is the development cycle to create AI solutions. It includes 3 parts:
• Project scoping
• Design
• Build phase
Q.6 What are the 7 phases of the system development life cycle?
Ans: • Planning
• Requirements
• Design
• Development
• Testing
• Deployment
• Maintenance.
Q.7 What are the different methods of tuning hyperparameters? Explain with the help
of an example?
Ans: The performance of the machine learning model improves with hyperparameter tuning.
Hyperparameter Optimization Checklist:
● Manual Search.
● Grid Search.
● Randomized Search.
● Halving Grid Search.
● Halving Randomized Search.
● HyperOpt-Sklearn.
● Bayes Search.
An example of a model hyperparameter is the topology and size of a neural network.
Examples of algorithm hyperparameters are the learning rate, batch size, and mini-batch
size. Batch size can refer to the full data sample, whereas mini-batch size refers to a smaller
sample set.
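For example, two methods from the checklist above, Grid Search and Randomized Search, can be tried with scikit-learn; this is a minimal sketch using a built-in dataset and an illustrative hyperparameter grid:

# A minimal sketch of Grid Search and Randomized Search for tuning the
# n_neighbors hyperparameter of a k-nearest-neighbours classifier.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
params = {"n_neighbors": [1, 3, 5, 7, 9]}      # values to try

grid = GridSearchCV(KNeighborsClassifier(), params, cv=5)
grid.fit(X, y)                                 # tries every combination
print("Grid Search best:", grid.best_params_)

rand = RandomizedSearchCV(KNeighborsClassifier(), params, n_iter=3, cv=5)
rand.fit(X, y)                                 # tries a random subset
print("Randomized Search best:", rand.best_params_)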
Q.8 What are some of the open-source frameworks used for AI development?
Ans: We have compiled a list of the best open-source frameworks and libraries that you can
use to build machine learning models.
1) TensorFlow: Developed by Google, TensorFlow is an open-source software library built
for deep learning or artificial neural networks. With TensorFlow, you can create neural
networks and computation models using dataflow graphs. It is one of the most well-maintained
and popular open-source libraries available for deep learning. The TensorFlow framework is
available in C++ and Python (a short sketch appears after this list).
2) Theano: Theano is a Python library designed for deep learning. Using the tool, you can
define and evaluate mathematical expressions involving multi-dimensional arrays.
3) Torch: Torch is an easy-to-use open-source computing framework for ML algorithms.
The tool offers efficient GPU support, an N-dimensional array, numeric optimization
routines, linear algebra routines, and routines for indexing, slicing, and transposing.
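As the sketch promised above for TensorFlow (assuming TensorFlow 2.x with its Keras API; the layer sizes are illustrative), a small neural network can be defined and compiled like this:

# A minimal TensorFlow/Keras sketch: build and compile a tiny network.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(4,)),
    tf.keras.layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()                      # prints the network topology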
The most important thing in the complete machine learning life cycle is to understand the
problem and to know its purpose. Therefore, before starting the life cycle, we need to
understand the problem, because a good result depends on a better understanding of the
problem.
1. Gathering Data: Data gathering is the first step of the machine learning life cycle. The
goal of this step is to identify the data needed for the problem and obtain it.
In this step, we need to identify the different data sources, as data can be collected from
various sources such as files, databases, the internet, or mobile devices.
This step includes the below tasks:
● Identify various data sources
● Collect data
● Integrate the data obtained from different sources
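A minimal pandas sketch of these tasks (the file names and the customer_id column are illustrative assumptions):

# A minimal sketch: collect data from two different sources and integrate it.
import pandas as pd

sales = pd.read_csv("sales.csv")               # data from a CSV file
customers = pd.read_json("customers.json")     # data from a JSON source

combined = sales.merge(customers, on="customer_id")   # integrate the sources
print(combined.head())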
2. Data Preparation: This step can be further divided into two processes:
● Data exploration:
It is used to understand the nature of data that we have to work with. We need to
understand the characteristics, format, and quality of data.
A better understanding of data leads to an effective outcome.
● Data pre-processing:
The next step is pre-processing the data for analysis.
3. Data Wrangling: Data wrangling is the process of cleaning and converting raw data into a
usable format. It involves cleaning the data, selecting the variables to use, and transforming
the data into a proper format to make it more suitable for analysis in the next step. It is one of
the most important steps of the complete process, as cleaning the data is required to address
quality issues.
In real-world applications, collected data may have various issues, including:
● Missing Values
● Duplicate data
● Invalid data
● Noise
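A minimal pandas sketch (the column names and values are made up) of addressing the issues listed above:

# A minimal sketch of data wrangling: duplicates, invalid values, missing values.
import pandas as pd

df = pd.DataFrame({"age": [21, None, 250, 21, 19],
                   "city": ["Delhi", "Pune", "Mumbai", "Delhi", "Pune"]})

df = df.drop_duplicates()                          # remove duplicate data
df = df[~(df["age"] > 120)].copy()                 # drop invalid/noisy values
df["age"] = df["age"].fillna(df["age"].median())   # fill missing values
print(df)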
4. Data Analysis: Now the cleaned and prepared data is passed on to the analysis step. This
step involves:
● Selection of analytical techniques
● Building models
● Review the result
It starts with determining the type of problem, where we select a machine learning technique
such as classification, regression, cluster analysis, or association; we then build the model
using the prepared data and evaluate it.
5. Train Model: The next step is to train the model; in this step we train our model to
improve its performance for a better outcome of the problem.
We use datasets to train the model using various machine learning algorithms. Training a
model is required so that it can understand the various patterns, rules, and features.
6. Test Model: Once our machine learning model has been trained on a given dataset, we
test the model. In this step, we check the accuracy of our model by providing a test dataset
to it.
Testing the model determines the percentage accuracy of the model as per the requirements
of the project or problem.
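A minimal scikit-learn sketch of the train and test steps, using a built-in dataset and an illustrative classifier:

# A minimal sketch: split the data, train a model, test its percentage accuracy.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

model = DecisionTreeClassifier().fit(X_train, y_train)   # train the model
predictions = model.predict(X_test)                      # test the model
print("Accuracy:", accuracy_score(y_test, predictions) * 100, "%")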
7. Deployment: The last step of the machine learning life cycle is deployment, where we
deploy the model in the real-world system.
If the prepared model produces an accurate result as per our requirements with acceptable
speed, we deploy the model in the real system. But before deploying the project, we check
whether it improves its performance using the available data or not. The deployment phase
is similar to making the final report for a project.