Assignment 1

This document is an assignment for a course on the Application of AI and ML in Chemical Engineering, consisting of various sections with short answer, conceptual, analytical, and advanced application questions. It covers topics such as machine learning techniques, data analysis, PCA, and model evaluation in the context of chemical engineering processes. Students are required to provide detailed answers, calculations, and explanations for each question.

Uploaded by

sanskarughadeuna

Assignment 1

Course Code: C653

Course Name: Application of AI and ML in Chemical Engineering

Instructions:
- Answer all questions.
- Clearly mention assumptions, formulas, and steps in calculations.
- Use diagrams, flowcharts, or graphs where applicable.

Section A: Short Answer Questions (1 Mark Each)
1. State the difference between a hyperparameter and a parameter in machine learning.
2. Explain the difference between feature selection and feature extraction.
3. What does an ROC curve represent in model evaluation?
4. Define data scaling and Principal Component Analysis (PCA). How is PCA used for dimensionality reduction?
5. Which machine learning techniques utilize both labelled and unlabelled data?
6. The following are the ages of a group of people: [22, 24, 25, 29, 30, 32, 33, 100]. Identify the outlier(s) in the data using the range method.
7. The following are the scores of 5 students in a math test: [72, 80, 85, 90, 95]. Calculate the mean and the standard deviation of the scores.
8. How do Standard Score (z-score) scaling and Min-Max scaling differ in the algorithms they suit and in their sensitivity to outliers?
9. True/False: “Overfitting can sometimes be reduced using optimization techniques like regularization.”
10. If two features in a dataset have a covariance of 0.85, what does it indicate?
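
As a quick numerical check for question 7 of this section, the sketch below computes the mean and both common forms of the standard deviation with Python's standard library. The question does not say whether the population (divide by N) or sample (divide by N - 1) definition is intended, so both are shown; the variable names are illustrative only.

```python
import statistics

# Scores from question 7.
scores = [72, 80, 85, 90, 95]

mean_score = statistics.mean(scores)

# Population standard deviation: divides the squared deviations by N.
pop_std = statistics.pstdev(scores)

# Sample standard deviation: divides the squared deviations by N - 1.
sample_std = statistics.stdev(scores)

print(f"mean             = {mean_score:.2f}")
print(f"population sigma = {pop_std:.3f}")
print(f"sample s         = {sample_std:.3f}")
```

Whichever definition is used, the answer should state it explicitly, since the two values differ noticeably for only five data points.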

Section B: Conceptual and Calculation-Based Questions (2 Marks Each)

1. What is the difference between classification and regression in machine learning?
2. If a feature in a dataset ranges from 10 to 100, apply min-max normalization to transform a value of 55.

3. If Singular Value Decomposition (SVD) is applied to a 5 × 4 matrix, what are the dimensions of its
decomposed matrices?

4. Given a dataset with eigenvalues (5.0, 3.0, 1.5, 0.5), determine how many components should be selected to
retain 85% of the variance.

5. Explain the working of the Savitzky-Golay filter, specifying the polynomial order and window size, and apply
it to smooth the noisy data points: [2.1, 3.4, 4.0, 4.8, 5.3, 6.0].
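
The calculation-based questions 2, 4 and 5 of this section can be sketched in a few lines of plain Python. This is a minimal illustration, not a model answer. The function names are hypothetical. For the Savitzky-Golay filter the question leaves the window size and polynomial order open, so the sketch assumes a window of 5 and order 2, whose standard smoothing coefficients are (-3, 12, 17, 12, -3)/35, and it leaves the two points at each end unsmoothed.

```python
def min_max(x, lo, hi):
    """Min-max normalization of x to the interval [0, 1]."""
    return (x - lo) / (hi - lo)


def components_for_variance(eigenvalues, target):
    """Smallest number of leading components whose cumulative
    explained-variance ratio reaches the target fraction."""
    total = sum(eigenvalues)
    cumulative = 0.0
    for k, ev in enumerate(sorted(eigenvalues, reverse=True), start=1):
        cumulative += ev
        if cumulative / total >= target:
            return k
    return len(eigenvalues)


# Assumed filter settings: window size 5, polynomial order 2.
SG_COEFFS = (-3, 12, 17, 12, -3)
SG_NORM = 35


def savgol_5_2(data):
    """Savitzky-Golay smoothing for interior points only.
    The first two and last two points are copied unchanged (an assumption)."""
    out = list(data)
    for i in range(2, len(data) - 2):
        window = data[i - 2:i + 3]
        out[i] = sum(c * v for c, v in zip(SG_COEFFS, window)) / SG_NORM
    return out


scaled_55 = min_max(55, 10, 100)
n_components = components_for_variance([5.0, 3.0, 1.5, 0.5], 0.85)
smoothed = savgol_5_2([2.1, 3.4, 4.0, 4.8, 5.3, 6.0])

print(scaled_55)
print(n_components)
print([round(v, 3) for v in smoothed])
```

A different edge-handling rule (for example, fitting the end points with the same polynomial) would change only the first two and last two smoothed values.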

Section C: Analytical and Problem-Solving Questions (4 Marks Each)

1. Describe the application of a reinforcement learning algorithm in the field of chemical engineering for
process optimization, focusing on its implementation in a decision support system for a reverse osmosis
plant. The explanation should cover the components such as the flowchart encompassing all aspects of the
decision support system, the features from the reverse osmosis plant that can be utilized, the formulation of
a loss function incorporating rewards and penalties, and the benefits of utilizing reinforcement learning to
differentiate between suboptimal and ideal operational states, thereby assisting operators in improving
plant performance.
2. Discuss the role of optimization in machine learning model development. Is it always required? Provide two
scenarios, one where optimization is a must and one where it might not be necessary, focusing on the
implications for machine learning model performance and efficiency.
3. Suppose you have an unscaled dataset with two features: 'temperature' (ranging from 0 to 100°C) and
'pressure' (ranging from 1 to 10 atm). Explain how standardization would help in ML model training.

4. Using Principal Component Analysis (PCA), compute the first principal component for the dataset:
X = [2, 3, 5], Y = [1, 4, 6]

Find the eigenvector of the covariance matrix.
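
For the two-variable PCA question above, the covariance matrix is 2 × 2, so its eigenvalues and leading eigenvector have a closed form. The sketch below works through that calculation in plain Python. It assumes the sample covariance (division by N - 1) and no standardization of X and Y, neither of which the question specifies; all names are illustrative.

```python
import math

X = [2, 3, 5]
Y = [1, 4, 6]


def covariance(a, b):
    """Sample covariance (divides by N - 1)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)


sxx, syy, sxy = covariance(X, X), covariance(Y, Y), covariance(X, Y)

# Eigenvalues of the symmetric matrix [[sxx, sxy], [sxy, syy]]
# from its characteristic polynomial.
trace = sxx + syy
det = sxx * syy - sxy ** 2
disc = math.sqrt(trace ** 2 - 4 * det)
lam1 = (trace + disc) / 2  # largest eigenvalue (variance along PC1)
lam2 = (trace - disc) / 2

# Eigenvector for lam1: (sxy, lam1 - sxx), valid because sxy != 0 here.
vx, vy = sxy, lam1 - sxx
norm = math.hypot(vx, vy)
pc1 = (vx / norm, vy / norm)

explained_ratio = lam1 / (lam1 + lam2)

print(f"covariance matrix = [[{sxx:.4f}, {sxy:.4f}], [{sxy:.4f}, {syy:.4f}]]")
print(f"eigenvalues       = {lam1:.4f}, {lam2:.4f}")
print(f"PC1 direction     = ({pc1[0]:.4f}, {pc1[1]:.4f})")
print(f"variance explained by PC1 = {explained_ratio:.4f}")
```

The sign of an eigenvector is arbitrary, so (-0.51, -0.86) describes the same principal direction as (0.51, 0.86).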

5. Given the following dataset of daily sales in a store for one week: [150, 200, 180, 220, 250, 190, 210],
perform a basic Exploratory Data Analysis (EDA) and answer the following:
a) What is the mean sales value for the week?
b) What is the range of sales values?
c) Identify any trends or patterns you notice.

6. Given a training dataset with 1000 samples, explain how stratified sampling can improve training results in
classification models?
7. If an ML model was trained using 80% of the data and tested on 20%, compute how many samples were in
the training set if the dataset contained 500 records.
8. A process control system logs the following pressure values: [80, 85, 83, 87, 90]. Compute the Exponential
Moving Average with α = 0.1.
9. Given the feature ranges specified below for a CSTR, compute the normalized and standardized values for a
data point with a `Temperature` of 200°C, a `Pressure` of 50 bar, and a `Concentration` of 1.0 mol/L. Use the
dataset mean (μ) and standard deviation (σ) of each feature: `Temperature` (μ=200°C, σ=75°C), `Pressure`
(μ=100 bar, σ=50 bar), and `Concentration` (μ=1.05 mol/L, σ=0.5 mol/L). Feature ranges: Temperature:
Min = 100°C, Max = 300°C; Pressure: Min = 20 bar, Max = 180 bar; Concentration: Min = 0.2 mol/L,
Max = 2.2 mol/L. Present a detailed breakdown of your calculations. Further, assume you are asked to create
a machine-learning model to forecast reactor yield. In this context, how do normalization and
standardization affect the predictive accuracy of machine learning models when estimating CSTR yield, given
the distinct challenges presented by the scale, distribution, and units of variables such as Temperature,
Pressure, and Concentration? The scale and distribution of these features can notably impact CSTR yield.
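
The scaling arithmetic in the CSTR question above can be checked with a short sketch. It applies min-max normalization, (x - min) / (max - min), and z-score standardization, (x - μ) / σ, to the stated data point using only the values given in the question; the dictionary layout and function names are illustrative assumptions.

```python
# Values copied from the question: data point, range, mean and standard deviation.
features = {
    "Temperature":   {"x": 200.0, "min": 100.0, "max": 300.0, "mu": 200.0, "sigma": 75.0},
    "Pressure":      {"x": 50.0,  "min": 20.0,  "max": 180.0, "mu": 100.0, "sigma": 50.0},
    "Concentration": {"x": 1.0,   "min": 0.2,   "max": 2.2,   "mu": 1.05,  "sigma": 0.5},
}


def min_max(x, lo, hi):
    """Normalize x to [0, 1] using the feature range."""
    return (x - lo) / (hi - lo)


def z_score(x, mu, sigma):
    """Standardize x to zero mean and unit standard deviation."""
    return (x - mu) / sigma


scaled = {
    name: (
        min_max(f["x"], f["min"], f["max"]),
        z_score(f["x"], f["mu"], f["sigma"]),
    )
    for name, f in features.items()
}

for name, (normalized, standardized) in scaled.items():
    print(f"{name:13s} normalized = {normalized:.4f}, standardized = {standardized:+.2f}")
```

Note that the two methods place the same point differently: the pressure reading sits near the bottom of its range after normalization and exactly one standard deviation below the mean after standardization.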
10. Given a high-dimensional dataset with 200 features, apply PCA to reduce the dimensionality while retaining
90% of the variance. Explain your approach and compute the number of components to retain if the total
variance is 500 and the eigenvalues of the selected components sum to 450.
11. The measured pH of the solution in a CSTR is observed to have high noise; a data sample covering a 1.9-sec
duration is shown in Table 2. To develop an ML model that predicts the pH from the reactor input features,
the noise present in the pH signal has to be suppressed using appropriate smoothing filters.

Table 2: pH variation in the CSTR.

Time (sec)  0     0.1   0.2   0.3   0.4   0.5   0.6   0.7   0.8   0.9
pH          3.81  3.55  3.95  4.05  4.34  3.65  4.19  4.24  4.09  4.56
Time (sec)  1.0   1.1   1.2   1.3   1.4   1.5   1.6   1.7   1.8   1.9
pH          4.48  4.13  4.70  4.30  4.80  4.25  4.72  4.08  4.09  4.48

Apply the following smoothing techniques to suppress the noise in the measured pH data for three time steps:
a) Simple Moving Average with window size = 3
b) Exponential Moving Average with β = 0.25 and β = 0.75
c) Savitzky–Golay (SG) filter of order 2 and window size = 5
Given data: SG filter coefficients
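
The moving-average parts of the pH smoothing question can be sketched as follows. Two conventions are assumed here because the question does not fix them: the simple moving average uses a trailing window (each output averages the current sample and the two before it), and the exponential moving average is written as s_t = β·s_(t-1) + (1 - β)·x_t with s_0 = x_0, so a larger β means heavier smoothing. Function names are illustrative.

```python
# pH readings from Table 2, sampled every 0.1 s from t = 0 to t = 1.9 s.
ph = [3.81, 3.55, 3.95, 4.05, 4.34, 3.65, 4.19, 4.24, 4.09, 4.56,
      4.48, 4.13, 4.70, 4.30, 4.80, 4.25, 4.72, 4.08, 4.09, 4.48]


def simple_moving_average(data, window):
    """Trailing moving average; the first output corresponds to
    the sample at index window - 1."""
    return [
        sum(data[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(data))
    ]


def exponential_moving_average(data, beta):
    """EMA with s_0 = x_0 and s_t = beta * s_(t-1) + (1 - beta) * x_t
    (assumed convention)."""
    smoothed = [data[0]]
    for x in data[1:]:
        smoothed.append(beta * smoothed[-1] + (1 - beta) * x)
    return smoothed


sma3 = simple_moving_average(ph, 3)
ema_025 = exponential_moving_average(ph, 0.25)
ema_075 = exponential_moving_average(ph, 0.75)

# First three computed time steps of each filter.
print([round(v, 4) for v in sma3[:3]])
print([round(v, 4) for v in ema_025[:3]])
print([round(v, 4) for v in ema_075[:3]])
```

If the course defines the EMA with the weight on the new sample instead (s_t = β·x_t + (1 - β)·s_(t-1)), the results for β = 0.25 and β = 0.75 simply swap roles.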

13. An exothermic reaction is taking place in a jacketed continuously stirred tank reactor. The performance of
the reactor depends on the flow rates of the reactant and the coolant. The operator has classified reactor
operation as good or bad based on the yield and energy efficiency of the reactor. Three different linear ML
models are constructed to separate the good and bad operating points. These models are shown in Figure 1
below.
a) Calculate the performance metrics of Model 1, namely precision, recall, and F1 score.
b) For each of the models, estimate the false positive rate and the true positive rate, and plot the Receiver
Operating Characteristic (ROC) curve.

Figure 1: Three different linear ML models to classify the reactor operation category. Panels: A) Model 1, B) Model 2, C) Model 3.
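
The metrics requested in question 13 all follow from the four confusion-matrix counts. The sketch below shows the formulas in code. The counts used in the example call are purely hypothetical placeholders, since the actual points must be read off Figure 1; "good operation" is assumed to be the positive class.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return precision, recall (TPR), F1 score and false positive rate
    from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # also the true positive rate
    f1 = 2 * precision * recall / (precision + recall)
    fpr = fp / (fp + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "fpr": fpr}


# Hypothetical counts for illustration only; replace them with the counts
# obtained from the figure for each model.
metrics = classification_metrics(tp=40, fp=10, fn=5, tn=45)

for name, value in metrics.items():
    print(f"{name:9s} = {value:.4f}")

# Each model contributes one (FPR, TPR) point to the ROC plot.
roc_point = (metrics["fpr"], metrics["recall"])
print("ROC point:", roc_point)
```

Plotting the (FPR, TPR) point of each model together with the corners (0, 0) and (1, 1) gives the ROC comparison asked for in part b).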

Section D: Advanced Applications (10 Marks Each)

1. In the context of operating a petrochemical distillation column, accurate prediction of the quality of the top
product, such as the purity of the distilled compound, is paramount for optimizing the process. Various
operational parameters like feed flow rate, feed composition, feed temperature, reflux ratio, distillate flow
rate, and temperatures of the top two trays play crucial roles in influencing the distillation process. Principal
Component Analysis (PCA) has been conducted on historical data to understand how these variables impact
the purity of the top product.

The results of the PCA revealed two principal components, PC1 and PC2, that collectively explain a significant
portion of the variance in the data: PC1 explains 60% of the variance, while PC2 explains 30%. The
eigenvectors associated with PC1 and PC2 provide insight into the influence of each parameter on the
process dynamics. The eigenvectors for PC1 and PC2, reflecting the influence of each parameter, are as
follows:

• Eigenvector for PC1: [0.5 (feed flow rate), 0.3 (feed composition), 0.4 (feed temperature), -0.3 (reflux
ratio), -0.2 (distillate flow rate), 0.6 (top tray temperature), 0.5 (second top tray temperature)].
• Eigenvector for PC2: [-0.2 (feed flow rate), 0.6 (feed composition), -0.1 (feed temperature), 0.5
(reflux ratio), 0.3 (distillate flow rate), -0.4 (top tray temperature), -0.3 (second top tray
temperature)].

Please provide your interpretation and analysis for the questions below:

a) Discuss the implications of the variance explained by PC1 and PC2 for distillation column operations
and how they contribute to understanding process dynamics.
b) Analyze operational parameters based on their coefficients in the eigenvectors for PC1 and PC2, with
a focus on the importance of the top two tray temperatures.
c) Design a conceptual model architecture leveraging PC1 and PC2 to predict the purity of the distilled
compound. Outline the steps for constructing this model, from inputting operational parameters to
outputting product purity predictions.
I. Incorporate PCA results into the model architecture and explain the roles of PC1 and PC2 in
predicting product purity.
II. Define data preprocessing steps before applying PCA and how the model would handle new
data for real-time or near-real-time predictions of product purity.
d) Propose operational adjustments or monitoring strategies based on PCA analysis to enhance the
purity of the top product, emphasizing how controlling key parameters identified through PCA can
optimize distillation efficiency.
2. In a packed bed column, the pressure drop is measured using a differential pressure gauge while varying the
input flow rate through the column. The measured experimental data is given in Table 3.

Table 3: Pressure drop vs flow rate across the packed bed column

S.No                 1    2    3    4    5    6    7    8
Flow rate (m³/hr)    2    5    8    10   12   14   16   18
Pressure drop (bar)  0.1  0.5  0.8  1.5  1.9  2.0  2.1  2.6

a) Perform the PCA analysis and extract one new feature (i.e., the equation for PC1) that can be used to
identify the process performance.
b) Draw a scree plot and conclude whether the extracted feature (PC1) is good enough to capture the original
process behavior observed from the original data.
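
One way to approach the packed bed PCA question is sketched below. It assumes both variables are standardized before PCA (a choice made here because flow rate and pressure drop have different units; the question does not require it). For two standardized variables the correlation matrix is [[1, r], [r, 1]], whose eigenvalues are 1 + r and 1 - r with eigenvectors (1, 1)/√2 and (1, -1)/√2 when r > 0, so no linear algebra library is needed.

```python
import math

# Data from Table 3.
flow = [2, 5, 8, 10, 12, 14, 16, 18]             # m3/hr
dp = [0.1, 0.5, 0.8, 1.5, 1.9, 2.0, 2.1, 2.6]    # bar


def standardize(values):
    """Z-score each value using the sample standard deviation (N - 1)."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / std for v in values]


z_flow, z_dp = standardize(flow), standardize(dp)

# Pearson correlation between the two standardized variables.
r = sum(a * b for a, b in zip(z_flow, z_dp)) / (len(flow) - 1)

# Eigenvalues of the 2 x 2 correlation matrix (valid for r > 0).
eigenvalues = [1 + r, 1 - r]
explained = [ev / sum(eigenvalues) for ev in eigenvalues]  # scree plot heights

# PC1 = (z_flow + z_dp) / sqrt(2)
pc1_scores = [(a + b) / math.sqrt(2) for a, b in zip(z_flow, z_dp)]

print(f"correlation r     = {r:.4f}")
print(f"explained by PC1  = {explained[0]:.4f}")
print(f"explained by PC2  = {explained[1]:.4f}")
print("PC1 scores:", [round(s, 3) for s in pc1_scores])
```

The two values in `explained` are the bar heights of the scree plot. Running PCA on the unstandardized covariance matrix instead would give different loadings, dominated by the flow rate because of its larger numerical scale.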
3. Explain the methodology to deploy LDA for feature extraction and regression model development.
4. Apply Linear Discriminant Analysis (LDA) on the dataset below to classify two classes and compute the class
separation boundary:
a. Class 1: X = [1, 2, 3], Y = [2, 4, 6]

b. Class 2: X = [4, 5, 6], Y = [8, 10, 12]

6. Implement Ridge Regression on the dataset below and compute the regression coefficients using L2
regularization:

a. X = [1, 2, 3, 4, 5]

b. Y = [2, 4, 6, 8, 10]

c. Regularization Parameter λ = 0.5
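
For a single feature, ridge regression has a closed-form solution, which makes the question above easy to verify by hand. The sketch below assumes the loss Σ(y - w·x - b)² + λ·w² and shows two common variants: a model with no intercept, and a model with an unpenalized intercept b. The question does not say which is intended, and some textbooks scale the loss by 1/2 or 1/N, which changes the effective λ. Function names are illustrative.

```python
X = [1, 2, 3, 4, 5]
Y = [2, 4, 6, 8, 10]
LAMBDA = 0.5


def ridge_no_intercept(x, y, lam):
    """Minimize sum((y - w*x)^2) + lam * w^2  ->  w = sum(x*y) / (sum(x^2) + lam)."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + lam)


def ridge_with_intercept(x, y, lam):
    """Minimize sum((y - w*x - b)^2) + lam * w^2 with b unpenalized.
    Centering the data removes b from the slope calculation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    w = sxy / (sxx + lam)
    b = mean_y - w * mean_x
    return w, b


w_only = ridge_no_intercept(X, Y, LAMBDA)
w, b = ridge_with_intercept(X, Y, LAMBDA)

print(f"no intercept:   w = {w_only:.4f}")
print(f"with intercept: w = {w:.4f}, b = {b:.4f}")
```

In both variants the slope is shrunk below the ordinary least squares value of 2, which is the expected effect of the L2 penalty.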
