Data Science for Civil Engineering Unit 5 Notes
Artificial Intelligence
Regularization of Training data, Neural Networks and Deep Learning, Clustering,
Reinforcement Learning and Generative Adversarial Networks, Transfer Learning,
Unsupervised Learning
Featurisation and Deployment, Dimensionality Reduction, Forward and Backward Chaining,
Waltz Algorithm (Constraint Processing) and Arc Consistency, Pattern Directed and Forward
Chaining Inference and the Rete Algorithm.
Regularization
Regularization in the context of artificial intelligence and machine learning is a technique
used to prevent overfitting in a model. Overfitting occurs when a model learns to perform
exceptionally well on the training data but fails to generalize to unseen or new data. Regularization
methods introduce constraints to the learning process to avoid this problem.
Two common types of regularization are L1 (Lasso) and L2 (Ridge) regularization:
1. L1 Regularization (Lasso): In L1 regularization, a penalty is added to the model's cost function
based on the absolute values of the model's coefficients. This encourages the model to have sparse
weight values by driving some of them to zero. As a result, L1 regularization can be used for feature
selection, as it effectively excludes less important features from the model.
2. L2 Regularization (Ridge): L2 regularization adds a penalty to the model's cost function based on
the square of the model's coefficients. It encourages the model to distribute the weight values more
evenly and not let any single weight dominate the predictions. L2 regularization helps in preventing
overfitting by reducing the magnitude of the coefficients.
The regularization term is typically added to the loss function when training a model. The
strength of regularization is controlled by a hyperparameter called λ (lambda). The choice of λ affects
the trade-off between fitting the training data and preventing overfitting. Cross-validation is often
used to tune this hyperparameter.
Regularization techniques help in creating more robust and generalizable models. They are
commonly used in various machine learning algorithms, including linear regression, logistic
regression, support vector machines, and neural networks. Regularization is particularly useful when
the dataset is small or when the model is complex, as these situations are more prone to overfitting.
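The sketch below (not part of the original notes) shows L1 and L2 regularization using scikit-learn, an assumed library; its alpha parameter plays the role of the regularization strength λ described above, and the toy data is purely illustrative.

import numpy as np
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))            # 10 features, only 2 of them informative
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=100)

lasso = Lasso(alpha=0.1).fit(X, y)        # L1 penalty: drives some coefficients to exactly zero
ridge = Ridge(alpha=1.0).fit(X, y)        # L2 penalty: shrinks all coefficients evenly

print("Lasso coefficients:", np.round(lasso.coef_, 2))   # sparse: many exact zeros
print("Ridge coefficients:", np.round(ridge.coef_, 2))   # small but mostly nonzero

# Cross-validation is the usual way to tune the regularization strength.
print("Ridge CV R^2:", cross_val_score(Ridge(alpha=1.0), X, y, cv=5).mean())

As the output suggests, Lasso performs implicit feature selection, while Ridge keeps all features with reduced weights.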
Clustering
Clustering is a technique in unsupervised machine learning that involves grouping data points
into clusters or groups based on their similarities. The primary goal of clustering is to discover
inherent patterns or structures in data without any prior knowledge of the groups or categories. Here's
an overview of clustering:
1. Clustering Algorithms:
- Several clustering algorithms are available, each with its own strengths and use cases. Some
popular clustering algorithms include:
- K-Means: Divides data into K clusters by minimizing the variance within each cluster. It's
widely used and relatively simple (a short usage sketch follows this list).
- Hierarchical Clustering: Builds a tree-like structure of clusters by iteratively merging or
dividing clusters.
- DBSCAN (Density-Based Spatial Clustering of Applications with Noise): Identifies clusters
of arbitrary shapes based on the density of data points.
- Agglomerative Clustering: A hierarchical clustering approach that starts with each data point as
a separate cluster and merges them until a stopping condition is met.
- Mean-Shift: Finds clusters by iteratively moving towards regions of higher data point density.
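As a quick illustration of the K-Means algorithm listed above, the following sketch uses scikit-learn (an assumed library) on synthetic blob data; the choice of K = 3 is purely illustrative.

from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=42)     # toy 2-D data with 3 groups

kmeans = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)
print("Labels of first 10 points:", kmeans.labels_[:10])         # cluster assignment per point
print("Cluster centres:\n", kmeans.cluster_centers_)             # coordinates of the 3 centres

In practice, the number of clusters K is itself a modelling choice, often guided by methods such as the elbow method or silhouette scores.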
Generative Adversarial Networks (GANs)
- Definition: GANs are a type of neural network architecture consisting of two main components:
a generator and a discriminator. GANs are used for generating new data that is similar to a given
dataset.
- Key Components:
- Generator: It takes random noise as input and produces data samples. The goal of the generator is
to generate data that is indistinguishable from real data.
- Discriminator: It evaluates both generated samples and real data. The goal of the discriminator is
to correctly distinguish real samples from generated ones.
Both RL and GANs are powerful techniques, but they serve different purposes: reinforcement
learning learns decision-making policies through interaction and rewards, while GANs learn to
generate realistic data.
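A minimal GAN sketch in PyTorch (an assumed framework, not specified in these notes) is given below; the network sizes, the toy "real" data, and the number of training steps are all illustrative.

import torch
import torch.nn as nn

latent_dim, data_dim = 16, 2                      # noise size and size of each data sample

generator = nn.Sequential(                        # generator: noise -> fake sample
    nn.Linear(latent_dim, 32), nn.ReLU(),
    nn.Linear(32, data_dim),
)
discriminator = nn.Sequential(                    # discriminator: sample -> P(sample is real)
    nn.Linear(data_dim, 32), nn.ReLU(),
    nn.Linear(32, 1), nn.Sigmoid(),
)

loss_fn = nn.BCELoss()
opt_g = torch.optim.Adam(generator.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-3)

real_batch = torch.randn(64, data_dim) + 3.0      # toy "real" data

for step in range(200):
    # Train the discriminator: real samples labelled 1, generated samples labelled 0.
    fake_batch = generator(torch.randn(64, latent_dim)).detach()
    d_loss = loss_fn(discriminator(real_batch), torch.ones(64, 1)) + \
             loss_fn(discriminator(fake_batch), torch.zeros(64, 1))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Train the generator: try to make the discriminator label its output as real.
    g_loss = loss_fn(discriminator(generator(torch.randn(64, latent_dim))), torch.ones(64, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()

The two losses pull in opposite directions, which is the adversarial game described above.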
Transfer learning
Transfer learning is a machine learning technique in which a model trained on one task is adapted or
fine-tuned for a different but related task. It involves taking a pre-trained model and reusing part or
all of it for a new problem. Transfer learning has become a popular approach in the field of deep
learning and has been highly successful in various applications. Here are some key aspects of transfer
learning:
1. Motivation for Transfer Learning:
- The motivation behind transfer learning is that knowledge acquired in one domain or task can be
valuable for solving related tasks, even if the tasks are not identical.
- It is often used when there is a lack of labeled data for a specific task or when training a model
from scratch is computationally expensive.
2. Types of Transfer Learning:
- Inductive Transfer Learning: In this approach, the knowledge from the source task is used to
help improve the performance of the target task. The source and target tasks can be related but not
necessarily the same.
- Transductive Transfer Learning: In this case, the source and target tasks are the same, but the
distribution of the data may differ. It focuses on adapting the model from the source domain to the
target domain.
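The hedged sketch below illustrates inductive transfer learning with torchvision (an assumed library, reasonably recent version): a ResNet-18 pre-trained on ImageNet is reused, its feature extractor is frozen, and only a new final layer is trained for a hypothetical 3-class target task.

import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)   # pre-trained source model

for param in model.parameters():                  # freeze the pre-trained feature extractor
    param.requires_grad = False

model.fc = nn.Linear(model.fc.in_features, 3)     # new head for the 3-class target task

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# One illustrative fine-tuning step on a dummy batch of 224x224 RGB images.
images, labels = torch.randn(8, 3, 224, 224), torch.randint(0, 3, (8,))
optimizer.zero_grad()
loss = loss_fn(model(images), labels)
loss.backward()
optimizer.step()

Freezing more or fewer layers controls how much of the source knowledge is reused versus re-learned on the target data.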
Deployment:
1. Definition:
- Deployment in the context of machine learning refers to the process of making a trained machine
learning model accessible and operational for real-world applications.
2. Deployment Methods:
- Cloud-Based Deployment: Hosting the model in cloud platforms like AWS, Azure, or Google
Cloud for scalability and easy access through APIs.
- Edge Deployment: Deploying the model directly on edge devices (e.g., mobile phones, IoT
devices) for low-latency inference.
- On-Premises Deployment: Hosting the model on an organization's own servers or data center for
data security and privacy.
- Containerization: Using container technologies (e.g., Docker) to package the model and its
dependencies for consistent deployment across different environments.
- Serverless Computing: Utilizing serverless platforms (e.g., AWS Lambda) for automatic scaling
and minimal infrastructure management.
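As a sketch of the API-based deployment idea above, the following Flask example (an assumed framework) loads a hypothetical serialized model from "model.joblib" and exposes it through a /predict endpoint; the file name, route, and port are illustrative.

import joblib
from flask import Flask, jsonify, request

app = Flask(__name__)
model = joblib.load("model.joblib")       # hypothetical trained model saved earlier with joblib

@app.route("/predict", methods=["POST"])
def predict():
    features = request.get_json()["features"]        # e.g. [[5.1, 3.5, 1.4, 0.2]]
    prediction = model.predict(features).tolist()
    return jsonify({"prediction": prediction})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000)    # in production, run behind a proper WSGI server

The same application can be containerized with Docker or hosted on a cloud platform without changing the model code.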
Dimensionality reduction
Dimensionality reduction is a technique used in machine learning and data analysis to reduce the
number of features (dimensions) in a dataset while preserving important information. High-
dimensional data can be challenging to work with due to issues such as the curse of dimensionality,
increased computational complexity, and the risk of overfitting. Dimensionality reduction methods
aim to address these challenges by transforming the data into a lower-dimensional representation.
Here are key aspects of dimensionality reduction:
1. Motivation:
- High-dimensional data often contains redundant or irrelevant features that can lead to increased
computational costs and reduced model performance.
- Reducing the dimensionality of data can help with data visualization, data exploration, and feature
selection.
2. Common Techniques:
- Principal Component Analysis (PCA): PCA is a linear technique that finds orthogonal axes,
called principal components, along which the variance of the data is maximized. It projects data
points onto these components, effectively reducing the dimensionality.
- t-Distributed Stochastic Neighbor Embedding (t-SNE): t-SNE is a nonlinear technique used for
data visualization. It minimizes the divergence between probability distributions to map high-
dimensional data to a lower-dimensional space, often for visualization.
- Linear Discriminant Analysis (LDA): LDA is a supervised technique used for dimensionality
reduction that seeks to maximize class separability while reducing dimensionality.
- Autoencoders: Autoencoders are neural networks that consist of an encoder and a decoder. They
are used to learn a compressed representation of data, which effectively reduces dimensionality.
- Feature Selection: Feature selection methods aim to identify and keep the most informative
features while discarding irrelevant ones.
3. Trade-offs:
- Dimensionality reduction can result in information loss, as some variation in the data may be
discarded during the process.
- Choosing the right dimensionality reduction technique and the optimal number of dimensions is
often a trade-off between simplicity and data preservation.
4. Applications:
- Dimensionality reduction is applied in various domains, including image processing, natural
language processing, and recommendation systems.
- In computer vision, dimensionality reduction can help with facial recognition or image
compression.
- In natural language processing, techniques like Word2Vec use dimensionality reduction to create
word embeddings.
5. Visualization:
- One common application is data visualization. By reducing data to 2 or 3 dimensions, it becomes
easier to create scatter plots, heatmaps, and other visualizations to understand patterns in the data.
6. Curse of Dimensionality:
- The curse of dimensionality refers to the challenges and issues that arise as the number of
dimensions in the data increases. These challenges include increased data sparsity and computational
complexity.
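The sketch below ties together the PCA technique and the visualization use case described above, using scikit-learn (an assumed library) to project the 4-dimensional Iris dataset down to 2 dimensions.

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data                      # 150 samples, 4 features
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)               # project onto the top 2 principal components

print("Reduced shape:", X_2d.shape)                                  # (150, 2)
print("Explained variance ratio:", pca.explained_variance_ratio_)    # variance kept per component

The explained variance ratio quantifies the trade-off noted above: how much of the original variation survives the reduction.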
Forward Chaining:
1. Definition:
- Forward chaining is a bottom-up or data-driven approach to reasoning and decision-making in a
rule-based system. It starts with the available data and iteratively applies rules to make inferences
until a goal or conclusion is reached.
2. Process:
- The process begins with an initial set of facts or data.
- Rules are applied to these facts, generating new facts or conclusions.
- These new facts are added to the existing set of data.
- The process continues iteratively until a specific goal or conclusion is reached.
3. Use Cases:
- Forward chaining is often used in applications where the initial data is known, and the goal is to
derive further information based on existing knowledge. It's commonly used in expert systems and
diagnostic applications.
4. Example:
- In a medical expert system, forward chaining might start with patient symptoms and apply
medical rules to determine a diagnosis.
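A toy forward-chaining sketch in Python (illustrative only; the rules and facts are invented to match the medical example above):

# Rules are (premises, conclusion) pairs; new facts are derived until nothing changes.
rules = [
    ({"fever", "cough"}, "flu_suspected"),
    ({"flu_suspected", "shortness_of_breath"}, "see_doctor"),
]
facts = {"fever", "cough", "shortness_of_breath"}   # initial patient data

changed = True
while changed:                          # iterate until no rule adds a new fact
    changed = False
    for premises, conclusion in rules:
        if premises <= facts and conclusion not in facts:
            facts.add(conclusion)       # fire the rule and record its conclusion
            changed = True

print(facts)   # now also contains 'flu_suspected' and 'see_doctor'

This naive loop re-checks every rule on every pass; the Rete Algorithm discussed later avoids exactly that redundancy.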
Waltz Algorithm:
1. Definition:
- The Waltz Algorithm is a constraint propagation algorithm used for solving constraint satisfaction
problems (CSPs), especially in the context of computer vision and pattern recognition.
2. Usage:
- The Waltz Algorithm was originally designed for the purpose of interpreting line drawings and
understanding the 3D structure of objects represented in 2D images. It's used to identify and reason
about the geometric relationships between lines, curves, and shapes in images.
3. Method:
- The algorithm iteratively examines line segments in the image and checks for geometric
consistency between them.
- It uses a network of constraint rules that capture geometric relationships such as parallelism,
intersection, collinearity, etc.
- By applying these rules and propagating constraints through the network, the algorithm infers the
most likely interpretations of the image.
4. Applications:
- The Waltz Algorithm has applications in computer vision, particularly in the interpretation of
engineering drawings, architectural plans, and other line drawings.
Arc Consistency:
1. Definition:
- Arc Consistency is a fundamental concept in constraint satisfaction problems. It is a property
enforced by pruning (reducing) the domains of variables in a CSP so that no variable retains a
value that contradicts the constraints with its neighboring variables.
2. Usage:
- Arc Consistency is an important step in the process of solving CSPs. It simplifies the problem by
eliminating values from variable domains that are inconsistent with the constraints, making it easier
to find a solution.
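A hedged sketch of AC-3, the standard arc-consistency algorithm (the variable names and the X < Y example are illustrative, not taken from the notes):

from collections import deque

def revise(domains, constraints, xi, xj):
    """Remove values of xi that have no compatible value left in xj's domain."""
    allowed = constraints[(xi, xj)]
    removed = {a for a in domains[xi] if not any(allowed(a, b) for b in domains[xj])}
    domains[xi] -= removed
    return bool(removed)

def ac3(domains, constraints):
    queue = deque(constraints.keys())             # all arcs (xi, xj)
    while queue:
        xi, xj = queue.popleft()
        if revise(domains, constraints, xi, xj):
            if not domains[xi]:                   # an empty domain means no solution exists
                return False
            # Re-examine every arc that points at xi, since its domain just shrank.
            queue.extend((xk, x) for (xk, x) in constraints if x == xi and xk != xj)
    return True

# Example: constraint X < Y with both domains equal to {1, 2, 3}.
domains = {"X": {1, 2, 3}, "Y": {1, 2, 3}}
constraints = {("X", "Y"): lambda a, b: a < b,    # value of X must be less than value of Y
               ("Y", "X"): lambda a, b: b < a}    # the same constraint seen from Y's side
print(ac3(domains, constraints), domains)         # True {'X': {1, 2}, 'Y': {2, 3}}

Pruning 3 from X and 1 from Y is exactly the kind of domain reduction that makes the subsequent search for a solution easier.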
Rete Algorithm:
1. Definition:
- The Rete Algorithm is a pattern-matching algorithm used to efficiently implement pattern-directed
and forward chaining inference in rule-based systems.
2. Usage:
- The Rete Algorithm is designed to process and match large sets of rules and facts efficiently. It is
used to improve the performance of expert systems that involve complex rule-based reasoning.
3. Operation:
- The Rete Algorithm creates a network of nodes that represent rules and facts. It matches facts
against rules in a systematic way, allowing for incremental and efficient rule execution.
- The algorithm minimizes redundant computations by caching intermediate results and maintaining
a working memory of matched facts.
4. Benefits:
- The Rete Algorithm greatly speeds up the execution of complex rule-based systems by optimizing
the pattern-matching process. It reduces the need to repeatedly reevaluate the same rules and facts.
5. Applications:
- The Rete Algorithm is used in various domains, including expert systems, business rule engines,
natural language processing, and any application where rule-based reasoning is employed.
In summary, Pattern Directed and Forward Chaining Inference are reasoning strategies used
in rule-based systems, where rules are applied based on available data and patterns. The Rete
Algorithm is a critical technology for optimizing the execution of these strategies by efficiently
managing the matching of rules and facts. It plays a key role in improving the performance of rule-
based systems and expert systems.