AI Notes
Markov Decision Process

MDP formulation, utility theory, utility functions, value iteration, policy iteration and partially observable MDPs.

POINTS TO REMEMBER
1. Uncertainty means that many of the simplifications that are possible with deductive inference are no longer valid.
2. Utility theory says that every state has a degree of usefulness, or utility, to an agent, and that the agent will prefer states with higher utility.
3. Utility theory is used to represent and reason with preferences.
4. A belief network is also called a Bayesian network.
5. Default reasoning is a very common form of non-monotonic reasoning. It eliminates the need to explicitly store all facts regarding a situation.
6. Fuzzy logic is a problem-solving control system methodology that lends itself to implementation in systems ranging from simple, small, embedded micro-controllers to large, networked, multi-channel PC or workstation-based data acquisition and control systems.
7. Fuzzy set theory was formalised by Professor Lotfi Zadeh at the University of California in 1965.
8. Utility theory is used in decision analysis to determine the EU (expected utility) of some action based on the U (utility) of its possible results.

QUESTION-ANSWERS

Q 1. What is a Markov Decision Process?
Ans. A Markov Decision Process (MDP) is defined by:
- a countable set of states S,
- a countable set of actions A,
- a transition probability function T : S x A x S -> [0, 1],
- an initial state S0 in S,
- a reward function R : S x A x S -> R.

In other words, if action a is applied from state S, a transition to state S' will occur with probability T(S, a, S'), and a reward R(S, a, S') is collected. In a Markov Decision Process both the transition probabilities and the rewards depend only on the present state, not on the history of states: the future states and rewards are independent of the past, given the present. An MDP has many features in common with Markov chains and transition systems. In an MDP:
- transitions and rewards are stationary,
- the state is known exactly.
MDPs in which the state is not known exactly are called partially observable Markov decision processes (POMDPs); these are very hard problems.

Q 2. Explain decision making in MDPs.
Ans. The actual sequence of states, and hence the actual total reward, is unknown a priori. We could choose a plan, i.e. a sequence of actions A = (a1, a2, ...). In this case the transition probabilities are fixed, and one can compute the probability of being at any given state at each time step (in a similar way to the forward algorithm in HMMs), and hence the expected rewards:

E[R(S_t, a_t, S_{t+1})] = sum over S' of T(S_t, a_t, S') R(S_t, a_t, S').

Such an approach is essentially open loop, i.e. it does not take advantage of the fact that at each time step the actual state reached is known, and that a new feedback strategy can be computed based on this knowledge.

Q 3. What is value iteration? Also explain value iteration with the help of an algorithm.
Ans. Value iteration : Let us assume we have a function V_t : S -> R that associates with each state S a lower bound on the optimal (discounted) total reward V*(S) that can be collected starting from that state. (Note the connection with admissible heuristics in informed search algorithms.) For example, we can start with V0(S) = 0 for all S in S. As a feedback strategy we can do the following: at each state, choose the action that maximizes the expected reward of the present action plus the estimated total reward from the next step onwards.
Using this strategy we can compute an update V_{t+1} of the function V_t, and iterate until convergence.

Value iteration : algorithm
Set V0(S) <- 0, for all S in S.
Iterate:
  for all S in S
    V_{t+1}(S) <- max over a of sum over S' of T(S, a, S') [R(S, a, S') + g V_t(S')]
until max over S of |V_{t+1}(S) - V_t(S)| < e.

Q 4. What is Bellman's equation?
Ans. Under some technical conditions (e.g. a finite state and action space, and discount g < 1), value iteration converges to the optimal value function V*. The optimal value function V* satisfies the following equation, called Bellman's equation, a nice example of the principle of optimality:

V*(S) = max over a of sum over S' of T(S, a, S') [R(S, a, S') + g V*(S')], for all S in S.

In other words, the optimal value function can be seen as a fixed point of value iteration. Bellman's equation can be proven by contradiction.
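As an illustration of the update above, the following is a minimal Python sketch of value iteration on a generic MDP. The dictionary-based representation (T[s][a] as a list of (probability, next state, reward) triples), the discount gamma and the tolerance eps are illustrative assumptions, not notation from the original text.

import math

def value_iteration(states, actions, T, gamma=0.9, eps=1e-6):
    """T[s][a] is a list of (prob, next_state, reward) triples; actions[s] lists legal actions."""
    V = {s: 0.0 for s in states}              # start with V0(S) = 0 for all S
    while True:
        V_new = {}
        for s in states:
            # V_{t+1}(S) = max_a sum_{S'} T(S,a,S') [R(S,a,S') + gamma * V_t(S')]
            V_new[s] = max(
                (sum(p * (r + gamma * V[s2]) for p, s2, r in T[s][a]) for a in actions[s]),
                default=0.0,                  # terminal states (no actions) keep value 0
            )
        if max(abs(V_new[s] - V[s]) for s in states) < eps:
            return V_new                      # approximately satisfies Bellman's equation
        V = V_new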
Q 5. Define the term utility.
Ans. Utility : The term utility is used in the sense of "the quality of being useful". The utility of a state is relative to the agent whose preferences the utility function is supposed to represent.

Q 6. What is the need for utility theory under uncertainty?
Ans. Utility theory says that every state has a degree of usefulness, or utility, to an agent, and that the agent will prefer states with higher utility. Utility theory is used to represent and reason with preferences.

Q 7. What is called decision theory? (PTU, Dec. 2018)
Ans. Preferences, as expressed by utilities, are combined with probabilities in the general theory of rational decisions called decision theory.
Decision Theory = Probability Theory + Utility Theory.

Q 8. What is called the principle of maximum expected utility?
Ans. The basic idea is that an agent is rational if and only if it chooses the action that yields the highest expected utility, averaged over all the possible outcomes of the action. This is known as MEU, i.e. maximum expected utility.

Q 9. Explain utility theory and utility function in detail. (PTU, May 2017 ; Dec. 2018)
OR
Explain the process of decision making using utility functions. (PTU, 2016)
Ans. In the decision making process, the quality of the decision outcome is measured by a utility function, which determines the relationships between the components affecting decision making and the outcomes of such a decision. A utility function reflects preferences individually ascribed to a particular individual. Thus, if we can arrange the information quantitatively, we obtain a very useful decision making tool.

(a) Utility theory : Utility theory is used in decision analysis to determine the EU (expected utility) of some action based on the U (utility) of its possible results. A utility function assigns a real number to express the desirability of a state. U : S -> R is used to denote the utility of a state, where S is the state space of a problem and R is the set of real numbers. These utilities are used in combination with the probabilities of outcomes to get expected utilities for every action. Agents choose the actions that maximize their expected utility.

For example, consider a non-deterministic action A which has possible outcome states Result_i(A); the index i spans over the different outcomes. Each outcome has a probability assigned to it by the agent before the action is performed:

P(Result_i(A) | Do(A), E)

where E corresponds to the evidence variables. Now we want to maximize the expected utility. If U(Result_i(A)) is the utility of state Result_i(A), then the expected utility is:

EU(A | E) = sum over i of P(Result_i(A) | Do(A), E) U(Result_i(A))

Maximum expected utility (MEU) principle : A rational agent will choose the action that maximizes the expected utility. To find this maximum we would have to enumerate all actions, which is not practical for long sequences. Instead, we must find a different way to handle these problems, starting with simple decisions.

Notations :
A > B (A is preferred to B)
A ~ B (the agent is indifferent between A and B)
A >= B (the agent prefers A to B or is indifferent between them)

A and B here are lotteries. A lottery is a probability distribution over a set of outcomes; the outcomes can be called the prizes of the lottery. A lottery can be defined according to the equation below, where C1, ..., Cn are possible outcomes and p1, ..., pn are the probabilities of these outcomes:

L = [p1, C1; p2, C2; ...; pn, Cn]

There are six axioms of utility theory :
1. Orderability : (A > B) or (B > A) or (A ~ B)
2. Transitivity : (A > B) and (B > C) implies (A > C)
3. Continuity : (A > B > C) implies there exists p such that [p, A; 1-p, C] ~ B
4. Substitutability : (A ~ B) implies [p, A; 1-p, C] ~ [p, B; 1-p, C]
5. Monotonicity : (A > B) implies (p >= q if and only if [p, A; 1-p, B] >= [q, A; 1-q, B])
6. Decomposability : [p, A; 1-p, [q, B; 1-q, C]] ~ [p, A; (1-p)q, B; (1-p)(1-q), C]

If the above axioms hold, then there exists a real-valued function U operating on states such that U(A) > U(B) iff A > B, and U(A) = U(B) iff A ~ B.

(b) Utility functions : A utility function U(w) maps from states (w being a world state) to real numbers, e.g. U(w) = 5. Defining utilities is useful in that we can design utility functions that change the behaviour of the agent to a desired behaviour.

Example : One example that everyone can relate to in some form is the utility of money. An agent would exhibit monotone preference, which means that the agent would always prefer more money to less. But this is not always exactly the case. Consider the case where you can either take $1,000,000 or gamble between winning $0 and $3,000,000 on the flip of a coin. The expected monetary value of the gamble is

0.5 ($0) + 0.5 ($3,000,000) = $1,500,000,

while the expected monetary value of the first case is $1,000,000. This does not necessarily mean that gambling is the better deal. Some people would rather take the sure $1,000,000 if it is worth a lot to them, while people with billions of dollars may gamble because $1,000,000 will probably not make a huge difference to them. Studies have shown that the utility of money is roughly proportional to the logarithm of the amount. After a certain amount of money, risk-averse agents prefer a sure payoff that is less than the expected monetary value of a gamble. However, someone who is $10,000,000 in debt may accept a gamble with 50/50 chances of winning $10,000,000 or losing $20,000,000 out of desperation; this type of agent is risk-seeking.

(Figure: utility of money — (a) the concave curve of a risk-averse agent, (b) the convex curve of a risk-seeking agent.)
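A minimal sketch of the EU(A | E) computation and the MEU choice, applied to the money example above. The logarithmic utility of money is only an assumption consistent with the remark in the text, and the function names are illustrative.

import math

def expected_utility(lottery, U):
    """lottery: list of (probability, outcome) pairs; U: utility function over outcomes."""
    return sum(p * U(c) for p, c in lottery)

def meu_action(actions, U):
    """Return the (name, lottery) pair with maximum expected utility."""
    return max(actions.items(), key=lambda kv: expected_utility(kv[1], U))

# Utility of money assumed roughly logarithmic, as discussed above (illustrative only).
U = lambda amount: math.log1p(max(amount, 0))

actions = {
    "take_sure_million": [(1.0, 1_000_000)],
    "coin_flip_gamble":  [(0.5, 0), (0.5, 3_000_000)],
}
best, lottery = meu_action(actions, U)
print(best)    # under log utility the sure million beats the gamble (risk-averse behaviour)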
Q 10. What are the steps associated with knowledge engineering for decision-theoretic expert systems? (PTU, Dec. 20…)
Ans. Decision-theoretic expert systems : There are two roles in decision analysis:
1. The decision maker, who states preferences between outcomes.
2. The decision analyst, who enumerates the possible actions and outcomes and elicits preferences from the decision maker to determine the best course of action.

As an example we will consider the problem of selecting a medical treatment for a kind of congenital heart disease in children. About 0.8% of children are born with a heart anomaly, the most common being aortic coarctation. It can be treated with surgery, angioplasty or medication. The problem is to decide what treatment to use and when to do it: the younger the infant, the greater the risk of certain treatments, but one mustn't wait too long. A decision-theoretic expert system for this problem can be created by a team consisting of at least one domain expert and one knowledge engineer. The process of knowledge engineering for decision-theoretic expert systems can be broken down into the following steps:
1. Create a causal model.
2. Simplify to a qualitative decision model.
3. Assign probabilities.
4. Assign utilities.
5. Verify and refine the model.
6. Perform sensitivity analysis.

1. Create causal model : Determine what the possible symptoms, disorders, treatments and outcomes are. Then draw arcs between them, indicating what disorders cause what symptoms, and what treatments alleviate what disorders. Some of this will be well known to the domain expert, and some will come from the literature. Often the model will match well the informal graphical descriptions given in medical textbooks.

2. Simplify to qualitative decision model : Since we are using the model to make treatment decisions and not for other purposes (such as determining the joint probability of certain symptom/disorder combinations), we can often simplify by removing variables that are not involved in treatment decisions. Sometimes variables will have to be split or joined to match the expert's intuitions. For example, the original aortic coarctation model had a treatment variable with values surgery, angioplasty and medication, and a separate variable for the timing of the treatment, but the expert had a hard time thinking of these separately, so they were combined, with treatment taking on values such as "surgery in 1 month".

3. Assign probabilities : Probabilities can come from patient databases, literature studies, or the expert's subjective assessments. In cases where the wrong kinds of probabilities are given in the literature, techniques such as Bayes' rule and marginalization can be used to compute the desired probabilities. It has been found that experts are best able to assess the probability of an effect given a cause (e.g. P(dyspnoea | heart failure)) rather than the other way around.

4. Assign utilities : When there is a small number of possible outcomes, they can be enumerated and evaluated individually. We would create a scale from best to worst outcome and give each a numeric value, for example 0 for death and 1000 for complete recovery. We would then place the other outcomes on this scale. This can be done by the expert, but it is better if the patient can be involved, because different people have different preferences. If there are exponentially many outcomes, we need some way to combine them using multi-attribute utility functions. For example, we may say that the negative utility of various complications is additive.

5. Verify and refine the model : To evaluate the system we need a set of correct (input, output) pairs, a so-called gold standard to compare against. For medical expert systems this usually means assembling the best available doctors, presenting them with a few cases, and asking them for their diagnosis and recommended treatment plan. We then see how well the system matches their recommendations. If it does poorly, we try to isolate the parts that are going wrong and fix them.
It can be useful to run the system "backwards": instead of presenting the system with symptoms and asking for a diagnosis, we can present it with a diagnosis such as "heart failure" and check whether the predicted probability of the symptoms matches what is reported in the medical literature.

6. Perform sensitivity analysis : This step checks whether the best decision is sensitive to small changes in the assigned probabilities and utilities, by systematically varying those parameters and running the evaluation again. If small changes lead to significantly different decisions, the model must be refined further.

Assessing a learning algorithm in this way produces a set of data that can be processed to give the prediction quality as a function of the size of the training set. This function is called the learning curve for the algorithm on the particular domain.

Noise and overfitting :
Noise : Two or more examples with the same description in terms of attributes but different classifications.
Overfitting : When the algorithm uses the large set of possible hypotheses to find meaningless "regularity" in the data. To overcome the problem of overfitting, decision tree pruning and cross-validation can be applied.

Broadening the applicability of decision trees : For decision tree induction to be applied to a wider variety of problems, the following issues must be addressed:
1. Missing data
2. Multivalued attributes
3. Continuous and integer-valued input attributes
4. Continuous-valued output attributes

Q 14. List some of the practical uses of decision tree learning.
Ans. Some of the practical uses of decision tree learning are:
1. Designing oil platform equipment.
2. Learning to fly.

Q 15. List the advantages of decision trees.
Ans. The advantages of decision trees are:
1. It is one of the simplest and most successful forms of learning algorithm.
2. It serves as a good introduction to the area of inductive learning and is simple to implement.

Q 16. What is a decision tree? (PTU, Dec. 2008)
Ans. The decision tree is one of the most powerful and popular tools for classification and prediction. A decision tree is a flowchart-like tree structure, where each internal node denotes a test on an attribute, each branch represents an outcome of the test, and each leaf node (terminal node) holds a class label.
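To make the structure described in Q 16 concrete, here is a minimal sketch of a decision tree represented as nested dictionaries, with a small classify routine. The attributes and labels are invented for illustration and are not from the original notes.

# A decision tree as nested dicts: internal nodes test an attribute,
# branches are attribute values, leaves hold a class label.
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny":    {"attribute": "humidity",
                     "branches": {"high": "no", "normal": "yes"}},
        "overcast": "yes",
        "rain":     {"attribute": "windy",
                     "branches": {True: "no", False: "yes"}},
    },
}

def classify(node, example):
    """Follow attribute tests down to a leaf and return its class label."""
    while isinstance(node, dict):              # internal node: test an attribute
        value = example[node["attribute"]]     # outcome of the test
        node = node["branches"][value]         # follow that branch
    return node                                # leaf node: class label

print(classify(tree, {"outlook": "sunny", "humidity": "normal", "windy": False}))  # yes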
Reinforcement Learning

Passive reinforcement learning, direct utility estimation, adaptive dynamic programming, temporal difference learning, active reinforcement learning, Q-learning.

POINTS TO REMEMBER
1. Inductive learning is the process of acquiring generalized knowledge from examples or instances of some class.
2. A decision tree represents Boolean functions.
3. A decision tree takes as input an object or situation described by a set of properties, and outputs a yes/no decision.
4. Current-best-hypothesis search was first described by John Stuart Mill in 1843.
5. Current-best-hypothesis search is used to maintain a single hypothesis throughout.
6. An artificial neural network is an information processing paradigm that is inspired by biological nervous systems.
7. A neuron is the basic information processing unit of a neural network.
8. There are three different classes of network structures: (i) single-layer feed-forward, (ii) multi-layer feed-forward, (iii) recurrent.
9. Natural language processing is a theoretically motivated range of computational techniques for analyzing and representing naturally occurring texts at one or more levels of linguistic analysis, for the purpose of achieving human-like language processing for a range of tasks or applications.
10. Reinforcement learning refers to a class of problems in machine learning which postulate an agent exploring an environment in which the agent perceives its current state and takes actions.
11. A utility-based agent learns a utility function and selects actions that maximise expected utility.
12. An agent may learn an action-value function giving the expected utility of taking a given action in a given state; this is called Q-learning.
13. Passive and active reinforcement learning are the two types of reinforcement learning.
14. A genetic algorithm (GA) is a search technique used in computing to find exact or approximate solutions to optimization and search problems.
15. Genetic algorithms are based on the theory of selection.
16. Any situation in which both the inputs and outputs of a component can be perceived is called supervised learning.
17. Learning when there is no hint at all about the correct outputs is called unsupervised learning.

QUESTION-ANSWERS

Q 1. Define inductive learning. How can the performance of inductive learning algorithms be measured? (PTU, Dec. 2019 ; May …)
Ans. Inductive learning : Learning a function from examples of its inputs and outputs is called inductive learning. In other words, inductive learning is the process of acquiring generalized knowledge from examples or instances of some class. This form of learning is accomplished through inductive inference, the process of reasoning from a part to a whole, from particular instances to generalizations, or from the individual to the universal. It is a powerful form of learning which we humans do almost effortlessly.
The performance of inductive learning algorithms can be measured by their learning curve, which shows the prediction accuracy as a function of the number of observed examples.

Q 2. Write a note on the following:
(a) Current-best-hypothesis search
(b) Least-commitment search (PTU, Dec. 201…)
Ans. (a) Current-best-hypothesis search : Current-best-hypothesis search was first described by John Stuart Mill in 1843. It is used to maintain a single hypothesis throughout, updating the hypothesis to maintain consistency as each new example comes in.
Positive example : An instance of the hypothesis.
Negative example : Not an instance of the hypothesis.
False negative example : The hypothesis predicts it should be a negative example, but it is in fact positive.
False positive example : The hypothesis predicts it should be positive, but it is actually negative.
The search uses specialisation/generalisation of the current hypothesis to exclude false positives and include false negatives. The algorithm is extremely simple: if a new example is encountered that the hypothesis misclassifies, change the hypothesis as follows. If it is a false positive, specialize the hypothesis so as not to cover it; this can be done by dropping disjuncts or adding new terms. If it is a false negative, generalize the hypothesis by adding disjuncts or dropping terms. If no consistent specialization/generalization can be found, backtrack.
Algorithm :
1. Pick a random example to define the initial hypothesis.
2. For each example:
   - in case of a false negative, generalize the hypothesis to include it;
   - in case of a false positive, specialize the hypothesis to exclude it.
3. Return the hypothesis.
(b) Least-commitment search : Rather than backtracking, this approach keeps all hypotheses that are consistent with the data seen so far. As more data becomes available, this version space shrinks.
The algorithm for doing this is called the candidate elimination learning algorithm, or the version space learning algorithm [Mitchell, 1977], and consists simply of constraining the version space to be consistent with all data seen so far, updating it as each new instance is seen. This is a least-commitment approach since it never favours one possible hypothesis over another; all remaining hypotheses are consistent. Note that this method implicitly assumes that there is a partial ordering on all of the hypotheses in the space, the more-specific-than ordering. A generalization G1 is more specific than a generalization G2 if it matches a proper subset of the instances that G2 matches.

The obvious problem is that this method potentially requires an enormous number of hypotheses to be recorded. The solution is to use boundary sets to circumscribe the space of possible hypotheses. The two boundary sets are the G-set (the most general boundary) and the S-set (the most specific boundary). Every member of the G-set is consistent and there are no more general consistent hypotheses; every member of the S-set is consistent and there are no more specific consistent hypotheses. The algorithm works as follows. Initially, the G-set is simply True, and the S-set False. For every new instance, there are four possible cases:
- False positive for Si : the hypothesis Si is too general, but as it has no consistent specializations, we throw it out.
- False negative for Si : Si is too specific, so we replace it by all of its immediate generalizations.
- False positive for Gi : Gi is too general, so we replace it by all of its immediate specializations.
- False negative for Gi : the hypothesis Gi is too specific, but as it has no consistent generalizations, we throw it out.

The process is repeated until one of three things happens: either there is only one hypothesis left in the version space, or the version space collapses, meaning that there is no consistent hypothesis, or we run out of examples while the version space still contains several hypotheses, in which case we can use their collective evaluation. The main problems with this approach are two. If there is any noise, or insufficient attributes for classification, the version space will collapse. Also, if unlimited disjunction is allowed in the hypothesis space, S will contain only the most specific hypothesis (the conjunction of the positive examples), and G will contain only the most general hypothesis. The latter problem can partially be solved by using a generalization hierarchy. Such learning systems cannot easily handle noisy data; one solution is to maintain several S and G sets, each consistent with different numbers of training instances.

The pure version-space algorithm was first used in META-DENDRAL [Buchanan & Mitchell, 1978]. It was also used in LEX [Mitchell, 1983], which learned to solve symbolic integration problems.

Q 3. Define supervised learning and unsupervised learning.
Ans. Supervised learning : Any situation in which both the inputs and outputs of a component can be perceived is called supervised learning.
Unsupervised learning : Learning when there is no hint at all about the correct outputs is called unsupervised learning.

Q 4. Write a note on neural networks. (PTU, Dec. 20…)
Ans. Neural network : An artificial neural network (ANN) is an information processing paradigm that is inspired by biological nervous systems. It is composed of a large number of interconnected processing elements called neurons. Neurons in artificial neural networks tend to have fewer connections than biological neurons. Each neuron in an ANN receives a number of inputs.
An activation function is applied to these inputs, which results in the activation level of the neuron (the output value of the neuron). Knowledge about the learning task is given in the form of examples called training examples.

Four parts of a typical nerve cell:
Dendrites : Accept inputs.
Soma : Process the inputs.
Axon : Turn the processed inputs into outputs.
Synapses : The electrochemical contact between neurons.
(Figure: a simple neuron.)

Dendrites are hair-like extensions of the soma which act like input channels. These input channels receive their input through the synapses of other neurons. The soma then processes these incoming signals over time and turns the processed value into an output, which is sent out to other neurons through the axon and the synapses.

Neuron : The neuron is the basic information processing unit of a neural network. It consists of:
1. A set of links describing the neuron inputs, with weights w1, w2, ..., wm.
2. An adder function (linear combiner) for computing the weighted sum of the inputs:
   u = sum over j = 1..m of wj xj
3. An activation function phi for limiting the amplitude of the neuron output. Here b denotes the bias:
   y = phi(u + b)
(Figure: the neuron diagram — inputs x1..xm, weights, summing function, bias b, activation function, output y.)

The bias b has the effect of applying a transformation to the weighted sum u:
   v = u + b
The bias is an external parameter of the neuron; it can be modeled by adding an extra input. v is called the induced field of the neuron:
   v = sum over j = 0..m of wj xj, with w0 = b and x0 = 1.

Network structure : There are three different classes of network structures:
1. Single-layer feed-forward
2. Multi-layer feed-forward
3. Recurrent
In single-layer feed-forward or multi-layer feed-forward networks, neurons are organized in acyclic layers.

Single-layer feed-forward neural networks (perceptrons) : A network with all the inputs connected directly to the outputs is called a single-layer neural network, or a perceptron network. (Figure: input units connected directly to output units.)

Multi-layer feed-forward : In multi-layer feed-forward neural networks there are hidden layers between the input and output layers. Hidden nodes do not directly receive inputs nor send outputs to the external environment. (Figure: input layer, hidden layer, output layer.)

Recurrent network : Recurrent networks contain hidden neuron(s) with a unit delay operator, which implies a dynamic system. A recurrent network can have connections that go backward from output to input nodes, and it models dynamic systems. In this way, a recurrent network's internal state can be altered as new sets of input data are presented; it can be said to have memory. It is useful in solving problems where the solution depends not just on the current inputs but on all previous inputs.
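A minimal sketch of the single neuron described above (weighted sum, bias, activation function). The use of a sigmoid as phi and the example numbers are illustrative choices, not values from the original notes.

import math

def neuron_output(x, w, b, phi=lambda v: 1.0 / (1.0 + math.exp(-v))):
    """y = phi(v), where v = sum_j w_j * x_j + b is the induced field of the neuron."""
    v = sum(wj * xj for wj, xj in zip(w, x)) + b
    return phi(v)

# Example: two inputs with illustrative weights and bias.
print(neuron_output(x=[1.0, 0.5], w=[0.4, -0.3], b=0.1))   # roughly 0.59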
Q 5. Why use neural networks?
Ans. Neural networks, with their remarkable ability to derive meaning from complicated or imprecise data, can be used to extract patterns and detect trends that are too complex to be noticed by either humans or other computer techniques. A trained neural network can be thought of as an "expert" in the category of information it has been given to analyse. This expert can then be used to provide projections given new situations of interest and to answer "what if" questions. Other advantages include:
1. Adaptive learning : An ability to learn how to do tasks based on the data given for training or initial experience.
2. Self-organisation : An ANN can create its own organization or representation of the information it receives during learning time.
3. Real-time operation : ANN computations may be carried out in parallel, and special hardware devices are being designed and manufactured which take advantage of this capability.
4. Fault tolerance via redundant information coding : Partial destruction of a network leads to a corresponding degradation of performance. However, some network capabilities may be retained even with major network damage.

Q 6. Draw the McCulloch-Pitts model.
Ans. (Figure: McCulloch-Pitts neuron — inputs x1, x2, ..., xm with weights w1, w2, ..., wm feeding a summing and threshold unit that produces the output.)

Q 7. Write down the applications of neural networks.
Ans. Applications of neural networks:
1. Character recognition : The idea of character recognition has become very important as handheld devices like the Palm Pilot are becoming increasingly popular. Neural networks can be used to recognize handwritten characters.
2. Image compression : Neural networks can receive and process vast amounts of information at once, making them useful in image compression. With the internet explosion and more sites using more images, using neural networks for image compression is worth a look.
3. Electronic noses : ANNs are used experimentally to implement electronic noses. Electronic noses have several potential applications in telemedicine. Telemedicine is the practice of medicine over long distances via a communication link. The electronic nose would identify odours in the remote surgical environment. These identified odours would then be electronically transmitted to another site where an odour generation system would recreate them. Because the sense of smell can be an important sense to the surgeon, telesmell would enhance telepresent surgery.
4. Medicine, security and loan applications : These are some applications that are in their proof-of-concept stage, with the exception of a neural network that will decide whether or not to grant a loan, something that has already been used more successfully than many humans.
5. Instant physician : An application developed in the mid-1980s called the "instant physician" trained an auto-associative memory neural network to store a large number of medical records, each of which includes information on symptoms, diagnosis, and treatment for a particular case. After training, the net can be presented with input consisting of a set of symptoms; it will then find the full stored pattern that represents the "best" diagnosis and treatment.
6. Stock market prediction : The day-to-day business of the stock market is extremely complicated. Many factors weigh in whether a given stock will go up or down on any given day. Since a neural network can examine a lot of information quickly and sort it all out, it can be used to predict stock prices.

Q 8. Briefly describe the following functions:
(i) Step function (ii) Ramp function (iii) Sigmoid function (iv) Gaussian function
Ans. (i) Step function :
   phi(v) = a if v < c
            b if v >= c
(ii) Ramp function :
   phi(v) = a                                if v <= c
            b                                if v >= d
            a + (v - c)(b - a)/(d - c)       otherwise

(iii) Sigmoid function : A sigmoid function with parameters z, x and y:
   phi(v) = z + 1 / (1 + exp(-xv + y))

(iv) Gaussian function : The Gaussian function is the probability density function of the normal distribution; it is sometimes also called the frequency curve. For the standard normal distribution,
   f(x) = (1 / sqrt(2 pi)) exp(-x^2 / 2).
(Figure: bell-shaped curve of f(x) over x from -3 to 3.)
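A small Python sketch of the four activation functions just described. The parameter names follow the formulas above; the default values are illustrative assumptions only.

import math

def step(v, a=0.0, b=1.0, c=0.0):
    """Step: a below the threshold c, b at or above it."""
    return a if v < c else b

def ramp(v, a=0.0, b=1.0, c=-1.0, d=1.0):
    """Ramp: a up to c, b from d onward, linear interpolation in between."""
    if v <= c:
        return a
    if v >= d:
        return b
    return a + (v - c) * (b - a) / (d - c)

def sigmoid(v, z=0.0, x=1.0, y=0.0):
    """Sigmoid with parameters z, x, y: z + 1 / (1 + exp(-x*v + y))."""
    return z + 1.0 / (1.0 + math.exp(-x * v + y))

def gaussian(v, mu=0.0, sigma=1.0):
    """Gaussian (normal) density with mean mu and standard deviation sigma."""
    return math.exp(-((v - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

print(step(0.2), ramp(0.0), sigmoid(0.0), round(gaussian(0.0), 3))   # 1.0 0.5 0.5 0.399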
groblem is, that the algorithm, by using simple averaging, ignores, important info contained re tials, namely, that state utilities, in neighboring states are relate. For example, in the second trial of the previous example, the algorithm evaluates the y of state (3,2) as the reward-to-go from this trial, but ignores the fact, that the successor , but this approach cannot take advantage of this. (1,1) - 0.04 - (1,2) - 0.04 — (1,3) - 0.04 (1,2) - 0.04 + (1,3) - 0.04 >» (2,3) - 0.04 - (3,3) - 0.04 -4(4,3) + 1 (1,1) - 0.04 - (1,2) - 0.04 + (1,3) - 0.04 (2,3) - 0.04 4 (3,3) - 0.04 -> (3,2) - 0.04 — (3,3) - 0.04 (4,3) + 1 (111) - 0.04 > (2,1) - 0.04 -> (3,1) - 0.04 (3,2) - 0.04 + (4,2) - 1 011. What is utility based agent 7 Ans. Utility based agent : Utility based agent leams a utlty function and select actions ‘al maximise expected utility. Disadvantage ; It must have (or learn) model of environment need to know where “ons lead in order to evaluate actions and make decision. Advantages : It uses “deeper” knowledge about domain. 12, Define Q-Learning. Ans. Q-Learning : The agent learns an action-value function 'Ng @ given action in a given state. This is called Q-Leaming. Advantage : No model required. Disadvantage : Shallow knowledge. Q cannot look ahead. Q can restrict ability to learn. 913, What are the two types of Reinforcement Learning ? Ans. There are two types of reinforcement learning : }- Passive reinforcement learning Active reinforcement learning giving the expected utility Hag _Q 14, Explain passive reinforcement learning agent's with (8) Adaptive dynamle programming (ADP) () Temporal difference (TD) Ans. (a) Adaptive dynamic programming : A Process similar to dynamic prog, Combining it with learning the model of the enviornment, composed of tho state fa, probability distribution and the reward function, Is called adaptive dynamic Prorgran, (ADP). Learing the modal of the environment works by observing the transitions jg, Stale-action pairs to the next states, The trials provide the training series of transi, frequencies of their occurences in the trials. For example, in the presented tials the» [+] (Righd was executed three times in the state (1,3). Two of these times the Succ State was(2,3), so the agent should compute P((2,3)/(1,3), Right) = 2/3. (1,1) = 0.04 > (1,2) - 0.04 -» (1,3) - 0,04 — (1,2) - 0.04 > (1,3) - 0.04 > (2,3) -0.9 (3,3) - 0.04 (4,3) +1 (1,1) ~ 0.04 > (1,2) - 0.04 > (1,3) - 0.04 (2,3) - 0.04 -> (3,3) - 0.04 > (3,2) -o (3,3) = 0.04 (4,3) +4 (1,1) = 0.04 ~» (2,1) - 0.04 > (3,1) - 0.04 - (8,2) - 0.04 (4,2) -1 Alter executing every single action the agent updates the state utilities by song Bellman equation using one of the appropriate methods. Algorithm : function PASSIVE- ADP- AGENT (percept) returns an action inputs : percept, a pero indicating the current state s' and reward signal r Static : x, a fixed policy mdp, an MDP with model P, rewards R, discount y 3s, initially empty Nsa, a table of frequencies for state-action pairs, initially zero Nsas', a table of outcome frequencies given.state-action pairs, initially zet 3,8 the previous state and action, initially null Ifs' is new then Ufs']
Function approximation : In large state spaces the utility function can be represented by a parameterized formula, for example a weighted linear combination of features, and learning then amounts to adjusting the parameters theta so that the evaluation function U_theta approximates the true utility function closely enough. This approach is called function approximation because there is no guarantee that the real evaluation function can be expressed by this kind of formula. However, while it seems doubtful that, for example, the optimal policy for chess can be expressed by a function with just a few coefficients, it is entirely possible that a good level of play can be achieved this way. The main idea of this approach is not the approximation of a function by a few coefficients, which in fact may require many more of them, but generalization: we want to generate a policy valid for all the states based on the analysis of only a small fraction of them. For example, in experiments with the game of backgammon it was possible to train a player to a level of play comparable to human players based on examining only a tiny fraction of the possible states.

Obviously, the success of reinforcement learning in such cases depends on the proper selection of the approximation function. If no combination of the selected features can give a good strategy for the game, then no method of learning the coefficients will lead to one. On the other hand, selecting a very elaborate function, with a large number of features and coefficients, increases the chance of success, but at the expense of slower convergence and, consequently, a slower learning process.

Function parameter correction : In order to facilitate on-line learning, some way of updating the parameters based on the reinforcements obtained after each trial (or each step) is needed. For example, if u_j(s) is the reward-to-go for state s in the j-th trial, then the utility function approximation error can be computed as

E_j(s) = (U_theta(s) - u_j(s))^2 / 2.

The rate of change of this error with respect to a parameter theta_i is dE_j/d theta_i, so in order to adjust the parameter toward decreasing the error, the proper adjustment formula is:

theta_i <- theta_i + alpha (u_j(s) - U_theta(s)) dU_theta(s)/d theta_i.

The above formula is known as the Widrow-Hoff rule, or the delta rule.

As an example, for the 4x3 environment the state utility function could be approximated by a linear combination of the coordinates:

U_theta(x, y) = theta0 + theta1 x + theta2 y.

According to the delta rule the corrections are given by:
theta0 <- theta0 + alpha (u_j(s) - U_theta(s))
theta1 <- theta1 + alpha (u_j(s) - U_theta(s)) x
theta2 <- theta2 + alpha (u_j(s) - U_theta(s)) y

Assuming, for example, theta = <theta0, theta1, theta2> = <0.5, 0.2, 0.1>, we get the initial approximation U_theta(1,1) = 0.8. If, after executing a trial, we compute e.g. u_j(1,1) = 0.72, then the coefficients theta0, theta1 and theta2 would each be reduced by 0.08 alpha, which would decrease the error for the state (1,1). Obviously, all the other values of U_theta(s) would change as well; this is the very idea of generalization.
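A minimal sketch of the delta-rule update for U_theta(x, y) = theta0 + theta1 x + theta2 y, reproducing the numeric example above; the learning rate alpha = 0.1 is an arbitrary illustrative choice.

def U(theta, x, y):
    """Linear approximation U_theta(x, y) = theta0 + theta1*x + theta2*y."""
    return theta[0] + theta[1] * x + theta[2] * y

def delta_rule_update(theta, x, y, u_observed, alpha=0.1):
    """Widrow-Hoff update: theta_i += alpha * (u_j(s) - U_theta(s)) * dU/dtheta_i."""
    error = u_observed - U(theta, x, y)
    grad = (1.0, x, y)                        # partial derivatives of U w.r.t. theta0..theta2
    return [t + alpha * error * g for t, g in zip(theta, grad)]

theta = [0.5, 0.2, 0.1]
print(U(theta, 1, 1))                         # 0.8, the initial approximation
theta = delta_rule_update(theta, 1, 1, 0.72)  # each coefficient drops by 0.08 * alpha
print(theta)                                  # approximately [0.492, 0.192, 0.092]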
Q 18. Write a note on genetic algorithms. (PTU, Dec. 2018 ; May 2017)
Ans. Genetic algorithms : A genetic algorithm (GA) is a search technique used in computing to find exact or approximate solutions to optimization and search problems. Genetic algorithms are global search heuristics. They are a particular class of evolutionary techniques inspired by evolutionary biology, using mechanisms such as inheritance, mutation, selection, and crossover (also called recombination). Genetic algorithms are implemented as a computer simulation in which a population of abstract representations (called chromosomes, or the genotype or the genome) of candidate solutions (called individuals) to an optimization problem evolves toward better solutions.

Traditionally, solutions are represented in binary as strings of 0s and 1s, but other encodings are also possible. The evolution usually starts from a population of randomly generated individuals and proceeds in generations. In each generation the fitness of every individual in the population is evaluated, and multiple individuals are selected from the current population (recombined and possibly mutated) to form a new population. The new population is then used in the next iteration of the algorithm. Commonly, the algorithm terminates when either a maximum number of generations has been produced or a satisfactory fitness level has been reached for the population. If the algorithm has terminated due to a maximum number of generations, a satisfactory solution may or may not have been reached.

Key terms :
Individual : Any possible solution.
Population : Group of all individuals.

Structure of a genetic algorithm :
(Figure: generate initial population -> evaluate fitness -> are the optimization criteria met? If yes, output the best individuals; if no, apply selection, crossover and mutation to produce a new generation and repeat.)

Steps in a genetic algorithm : Genetic algorithms are based on the theory of selection.
(a) A set of random solutions is generated.
(b) Only those solutions survive that satisfy a fitness function.
(c) Each solution in the set is a chromosome.
(d) A set of such solutions forms a population.
The algorithm uses three basic genetic operators, namely (a) reproduction, (b) crossover and (c) mutation, along with a fitness function, to evolve a new population, i.e. the next generation. Thus the algorithm uses these operators and the fitness function to guide its search for the optimal solution; it is a guided random search mechanism.

Significance of the genetic operators :
Reproduction or selection : Selects the fittest parent chromosomes to take part in the generation of offspring. Reproduction ensures that only the fittest of the solutions take part in forming offspring, and forces the GA to search the area which has the highest fitness values.
Crossover or recombination : Crossover ensures that the search proceeds in the right direction by making new chromosomes that possess characteristics of both parents.
Mutation : To avoid local optima, mutation is used to introduce a sudden change in a gene within a chromosome, allowing the algorithm to search for solutions far from the current ones.

Q 19. Explain all the operators of genetic algorithm.
Ans. Operators of genetic algorithm : Genetic operators used in genetic algorithms maintain genetic diversity. Genetic diversity, or variation, is a necessity for the process of evolution. Genetic operators are analogous to those which occur in the natural world:
1. Reproduction (or selection);
2. Crossover (or recombination); and
3. Mutation.
In addition to these operators, there are some parameters of a GA. One important parameter is population size, which says how many chromosomes are in a population. If there are only a few chromosomes, the GA has few possibilities to perform crossover and only a small part of the search space is explored. If there are too many chromosomes, the GA slows down. Research shows that beyond some limit it is not useful to increase the population size, because it does not help in solving the problem faster. The population size depends on the type of encoding and on the problem.

1. Reproduction, or selection : Reproduction is usually the first operator applied to a population. From the population, chromosomes are selected to be parents and produce offspring. The problem is how to select these chromosomes. According to Darwin's evolution theory of "survival of the fittest", the best ones should survive and create new offspring.
The reproduction operators are also called selection operators. Selection extracts a subset of genes from an existing population, according to some definition of quality. Every gene has a meaning, so one can derive from the gene a kind of quality measure called the fitness function. Following this quality (fitness value), selection can be performed. A fitness function quantifies the optimality of a solution so that a particular solution may be ranked against all the other solutions; the function depicts the closeness of a given solution to the desired result. Many reproduction operators exist and they all essentially do the same thing: they pick from the current population the strings of above-average fitness and insert multiple copies of them in the mating pool in a probabilistic manner. The most commonly used methods of selecting chromosomes for parents to crossover are:
(i) Roulette wheel selection
(ii) Boltzmann selection
(iii) Tournament selection
(iv) Rank selection
(v) Steady-state selection

Example of selection : Consider an evolutionary algorithm to maximize the function f(x) = x^2 with x in the integer interval [0, 31], i.e. x = 0, 1, ..., 31.
1. The first step is the encoding of chromosomes; use a binary representation for integers: 5 bits are used to represent integers up to 31.
2. Assume that the population size is 4. Generate the initial population at random; these are the chromosomes or genotypes, e.g. 01101, 11000, 01000, 10011.
3. Calculate the fitness value for each individual:
   (a) Decode the individual into an integer (called the phenotype): 01101 -> 13; 11000 -> 24; 01000 -> 8; 10011 -> 19.
   (b) Evaluate the fitness according to f(x) = x^2: 13 -> 169; 24 -> 576; 8 -> 64; 19 -> 361.
4. Select parents (two individuals) for crossover based on their fitness. Out of the many selection methods, if roulette-wheel selection is used, the probability of the i-th string in the population being selected is
   p_i = F_i / (sum over j = 1..n of F_j)
   where F_i is the fitness of string i in the population (expressed as f(x)), p_i is the probability of string i being selected, and n is the number of individuals in the population (the population size, here n = 4). n * p_i is the expected count.

String no.   Initial population   x value   Fitness f(x) = x^2   p_i    Expected count n*p_i
1            01101                13        169                  0.14   0.58
2            11000                24        576                  0.49   1.97
3            01000                8         64                   0.06   0.22
4            10011                19        361                  0.31   1.23
Sum                                         1170                 1.00   4.00
Average                                     293                  0.25   1.00
Max                                         576                  0.49   1.97

String no. 2 has the maximum chance of selection.

2. Crossover : Crossover is a genetic operator that combines (mates) two chromosomes (parents) to produce a new chromosome (offspring). The idea behind crossover is that the new chromosome may be better than both of the parents if it takes the best characteristics from each of them. Crossover occurs during evolution according to a user-definable crossover probability. Crossover selects genes from the parent chromosomes and creates new offspring. The crossover operators are of many types:
(i) one-point crossover, (ii) two-point, (iii) uniform, (iv) arithmetic, and (v) heuristic crossover.

3. Mutation : After a crossover is performed, mutation takes place. Mutation is a genetic operator used to maintain genetic diversity from one generation of a population of chromosomes to the next. Mutation occurs during evolution according to a user-definable mutation probability, usually set to a fairly low value, say 0.01 (a good first choice). Mutation alters one or more gene values in a chromosome from its initial state. This can result in entirely new gene values being added to the gene pool.
With the new gene values, the genetic algorithm may be able to arrive at a better solution than was previously possible. Mutation is an important part of the genetic search, as it helps to prevent the population from stagnating at any local optimum; it is intended to prevent the search from falling into a local optimum of the state space. The mutation operators are of many types:
(i) flip bit, (ii) boundary, (iii) non-uniform, (iv) uniform, and (v) Gaussian.
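A minimal sketch combining the three operators just described, applied to the f(x) = x^2 example above: roulette-wheel selection, one-point crossover and bit-flip mutation. The number of generations and the random choices are illustrative, not part of the original notes.

import random

def fitness(chrom):
    """Decode a 5-bit string to an integer and evaluate f(x) = x**2."""
    return int(chrom, 2) ** 2

def roulette_select(pop):
    """Pick one chromosome with probability proportional to its fitness."""
    return random.choices(pop, weights=[fitness(c) for c in pop], k=1)[0]

def one_point_crossover(p1, p2):
    point = random.randint(1, len(p1) - 1)
    return p1[:point] + p2[point:]

def mutate(chrom, p_m=0.01):
    """Flip each bit with a small probability."""
    return "".join(b if random.random() > p_m else "10"[int(b)] for b in chrom)

population = ["01101", "11000", "01000", "10011"]    # the example population above
for generation in range(20):                          # a few generations, for illustration
    population = [mutate(one_point_crossover(roulette_select(population),
                                             roulette_select(population)))
                  for _ in population]
best = max(population, key=fitness)
print(best, int(best, 2), fitness(best))              # tends toward x = 31 (11111)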
Q 20. What is a schema?
Ans. A schema characterizes how a population evolves in a GA. As with the rules used in classifier systems, a schema is a string consisting of 1's, 0's, and *'s (don't-cares). For example, the schema
1011*001*0
matches the following four strings:
1011000100, 1011000110, 1011100100, 1011100110.
A schema with n *'s will match a total of 2^n chromosomes. Each chromosome of length m is a representative of 2^m distinct schemata. Example: 0010 is a representative of 2^4 distinct schemas, such as 00**, 0*10, ****, etc.

Q 21. What is fuzzy logic, and where did fuzzy logic come from? (PTU, May 2019)
Ans. Fuzzy logic : Fuzzy logic (FL) is a problem-solving control system methodology that lends itself to implementation in systems ranging from simple, small, embedded micro-controllers to large, networked, multi-channel PC or workstation-based data acquisition and control systems. It can be implemented in hardware, software, or a combination of both. FL provides a simple way to arrive at a definite conclusion based upon vague, ambiguous, imprecise, noisy, or missing input information. FL's approach to control problems mimics how a person would make decisions, only much faster.

The concept of fuzzy logic was conceived by Lotfi Zadeh, a professor at the University of California at Berkeley, and presented not as a control methodology, but as a way of processing data by allowing partial set membership rather than crisp set membership or non-membership. This approach to set theory was not applied to control systems until the 70's, due to insufficient small-computer capability prior to that time. Professor Zadeh reasoned that people do not require precise, numerical information input, and yet they are capable of highly adaptive control. If feedback controllers could be programmed to accept noisy, imprecise input, they would be much more effective and perhaps easier to implement. Unfortunately, US manufacturers have not been so quick to embrace this technology, while the Europeans and Japanese have been aggressively building real products around it.

Q 22. How is fuzzy logic different from conventional control methods?
Ans. Fuzzy logic incorporates a simple, rule-based "IF X AND Y THEN Z" approach to solving a control problem rather than attempting to model the system mathematically. The FL model is empirically based, relying on an operator's experience rather than their technical understanding of the system. For example, rather than dealing with temperature control in terms such as "SP = 500F", "T < 1000F", or "210C < TEMP < 220C", terms like "IF (process is too cool) AND (process is getting colder) THEN (add heat to the process)" or "IF (process is too hot) AND (process is heating rapidly) THEN (cool the process quickly)" are used. These terms are imprecise and yet very descriptive of what must actually happen. Consider what you do in the shower if the temperature is too cold: you will make the water comfortable very quickly with little trouble. Fuzzy logic is capable of mimicking this type of behaviour, but at a very high rate.

Q 23. How does fuzzy logic work?
Ans. Fuzzy logic requires some numerical parameters in order to operate, such as what is considered a significant error and a significant rate-of-change-of-error, but exact values of these numbers are usually not critical unless very responsive performance is required, in which case empirical tuning would determine them. For example, a simple temperature control system could use a single temperature feedback sensor whose data is subtracted from the command signal to compute "error", and then time-differentiated to yield the error slope or rate-of-change-of-error, hereafter called "error-dot". Error might have units of degrees F, with a small error considered to be 2F while a large error is 5F. The "error-dot" might then have units of degs/min, with a small error-dot being 5F/min and a large one being 15F/min. These values don't have to be symmetrical and can be "tweaked" once the system is operating in order to optimize performance. Generally, fuzzy logic is so forgiving that the system will probably work the first time without any tweaking.
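A minimal sketch of the error / error-dot scheme from Q 23: two membership functions and two rules of the "IF error AND error-dot THEN output" form, with AND taken as min. The membership breakpoints (2F / 5F and 5F/min / 15F/min follow the text), the output levels and the weighted-average defuzzification are all illustrative assumptions.

def mu_small(x, small, large):
    """Membership in 'small': 1 at or below `small`, falling to 0 at `large`."""
    if x <= small:
        return 1.0
    if x >= large:
        return 0.0
    return (large - x) / (large - small)

def mu_large(x, small, large):
    """Membership in 'large': complement of 'small'."""
    return 1.0 - mu_small(x, small, large)

def heater_command(error, error_dot):
    """Two illustrative rules combined with min (AND) and a weighted average."""
    # IF error is large (too cool) AND error-dot is large (getting colder) THEN strong heat
    strong = min(mu_large(error, 2.0, 5.0), mu_large(error_dot, 5.0, 15.0))
    # IF error is small AND error-dot is small THEN little heat
    little = min(mu_small(error, 2.0, 5.0), mu_small(error_dot, 5.0, 15.0))
    # Defuzzify as a firing-strength-weighted average of two output levels (1.0 and 0.2)
    total = strong + little
    return (1.0 * strong + 0.2 * little) / total if total else 0.0

print(heater_command(error=4.0, error_dot=10.0))   # a value between 0.2 and 1.0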
Q 24. Why use fuzzy logic?
Ans. Fuzzy logic offers several unique features that make it a particularly good choice for many control problems:
1. It is inherently robust, since it does not require precise, noise-free inputs and can be programmed to fail safely if a feedback sensor quits or is destroyed. The output is a smooth control function despite a wide range of input variations.
2. Since the FL controller processes user-defined rules governing the target control system, it can be modified and tweaked easily to improve or drastically alter system performance. New sensors can easily be incorporated into the system simply by generating appropriate governing rules.
3. FL is not limited to a few feedback inputs and one or two control outputs, nor is it necessary to measure or compute rate-of-change parameters in order for it to be implemented. Any sensor data that provides some indication of a system's actions and reactions is sufficient. This allows the sensors to be inexpensive and imprecise, keeping the overall system cost and complexity low.
4. Because of the rule-based operation, any reasonable number of inputs can be processed (1-8 or more) and numerous outputs (1-4 or more) generated, although defining the rule base quickly becomes complex if too many inputs or outputs are chosen for a single implementation, since the rules defining their interrelations must also be defined. It is better to break the control system into smaller chunks and use several smaller FL controllers, distributed on the system, each with more limited responsibilities.
5. FL can control nonlinear systems that would be difficult or impossible to model mathematically. This opens doors for control systems that would normally be deemed unfeasible for automation.

Q 25. How is fuzzy logic used?
Ans.
1. Define the control objectives and criteria: What am I trying to control? What do I have to do to control the system? What kind of response do I need? What are the possible (probable) system failure modes?
2. Determine the input and output relationships and choose a minimum number of variables for input to the FL engine (typically error and rate-of-change-of-error).
3. Using the rule-based structure of FL, break the control problem down into a series of IF X AND Y THEN Z rules that define the desired system output response for given system input conditions. The number and complexity of the rules depends on the number of input parameters that are to be processed and the number of fuzzy variables associated with each parameter. If possible, use at least one variable and its time derivative. Although it is possible to use a single, instantaneous error parameter without knowing its rate of change, this cripples the system's ability to minimize overshoot for step inputs.
4. Create FL membership functions that define the meaning (values) of the input/output terms used in the rules.
5. Create the necessary pre- and post-processing FL routines if implementing in software; otherwise program the rules into the fuzzy logic hardware engine.
6. Test the system, evaluate the results, tune the rules and membership functions, and retest until satisfactory results are obtained.

Q 26. Explain the concept of fuzzy set theory. (PTU, May 2019, 2017 ; Dec. 2017, 2015)
Ans. Fuzzy sets : Fuzzy set theory was formalised by Professor Lotfi Zadeh at the University of California in 1965. A fuzzy set is defined in terms of a membership function, which is a mapping from the universal set U to the interval [0,1]. A characteristic function is a special case of a membership function, and a regular set, also known as a crisp set, is a special case of a fuzzy set. Thus the concept of a fuzzy set is a natural generalization of the concept of standard set theory. It remains to be proven whether the standard operations of standard set theory, i.e. union, intersection, and complementation, have proper analogues in fuzzy set theory.

Fuzzy set theory in terms of membership functions : A membership function is a function from a universal set U to the interval [0,1]. A fuzzy set A is defined by its membership function muA over U.

Various fuzzy set operations :
Union : The membership function muC(x) of C = A union B is defined as muC(x) = max{muA(x), muB(x)}, x in X.
Intersection : The membership function muC(x) of C = A intersection B is defined as muC(x) = min{muA(x), muB(x)}, x in X.
Complement : The membership function of the complement of a fuzzy set A is defined as muA'(x) = 1 - muA(x), x in X.

Example : Let X = {1, 2, 3, 4, 5, 6, 7},
A = {(3, 0.7), (5, 1), (6, 0.8)} and B = {(3, 0.9), (4, 1), (6, 0.6)}. Then:
A union B = {(3, 0.9), (4, 1), (5, 1), (6, 0.8)}
A intersection B = {(3, 0.7), (6, 0.6)}
A' = {(1, 1), (2, 1), (3, 0.3), (4, 1), (6, 0.2), (7, 1)}

Additional operations :
Equality : A = B if muA(x) = muB(x) for all x in X.
Not equal : A != B if muA(x) != muB(x) for at least one x in X.
Containment : A is contained in B if and only if muA(x) <= muB(x) for all x in X.
Proper subset : A is a proper subset of B if A is contained in B and A != B.
Product : A.B is defined as muA.B(x) = muA(x) muB(x).
Power : A^N is defined as muA^N(x) = (muA(x))^N.
Bold union : A (+) B is defined as mu(x) = min{1, muA(x) + muB(x)}.
Bold intersection : A (.) B is defined as mu(x) = max{0, muA(x) + muB(x) - 1}.
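A minimal sketch of the max/min/complement operations above, with fuzzy sets stored as dictionaries (elements not listed have membership 0). It reproduces the worked example for A and B; only the function names are invented.

def mu(fset, x):
    """Membership degree of x in a fuzzy set stored as {element: degree}."""
    return fset.get(x, 0.0)

def f_union(A, B, X):
    return {x: max(mu(A, x), mu(B, x)) for x in X if max(mu(A, x), mu(B, x)) > 0}

def f_intersection(A, B, X):
    return {x: min(mu(A, x), mu(B, x)) for x in X if min(mu(A, x), mu(B, x)) > 0}

def f_complement(A, X):
    return {x: 1 - mu(A, x) for x in X if 1 - mu(A, x) > 0}

X = {1, 2, 3, 4, 5, 6, 7}
A = {3: 0.7, 5: 1.0, 6: 0.8}
B = {3: 0.9, 4: 1.0, 6: 0.6}
print(f_union(A, B, X))         # {3: 0.9, 4: 1.0, 5: 1.0, 6: 0.8}, as in the example above
print(f_intersection(A, B, X))  # {3: 0.7, 6: 0.6}
print(f_complement(A, X))       # {1: 1, 2: 1, 3: 0.3, 4: 1, 6: 0.2, 7: 1} (up to float rounding)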
Q 27. Discuss various learning methods in neural networks with suitable examples. (PTU, Dec.)
OR
What do you understand by supervised and unsupervised learning? What are the major characteristics and differences between them? (PTU, Dec.)
Ans. Learning methods in neural networks are classified into three basic types :
- Supervised learning
- Unsupervised learning
- Reinforced learning
Supervised learning : A teacher is present during the learning process and presents the expected output. Every input pattern is used to train the network. The learning process is based on comparison between the network's computed output and the correct expected output, generating an "error". The "error" generated is used to change the network parameters, which results in improved performance.
Unsupervised learning : No teacher is present. The expected or desired output is not presented to the network. The system learns on its own by discovering and adapting to structural features in the input patterns.
Reinforced learning : A teacher is present but does not present the expected or desired output; it only indicates whether the computed output is correct or incorrect. The information provided helps the network in its learning process. A reward is given for a correct answer computed and a penalty for a wrong answer.
The supervised and unsupervised learning methods are the most popular forms of learning compared to reinforced learning. These three types are classified based on the presence or absence of a teacher and on the information provided for the system to learn. They are further categorized, based on the rules used, as :
- Hebbian
- Gradient descent
- Competitive
- Stochastic learning
[Figure : Classification of learning algorithms. Neural network learning algorithms split into supervised (error based), reinforced (output based) and unsupervised learning; error-based methods include stochastic learning and error-correction gradient descent (least mean square, back propagation), while unsupervised methods include Hebbian and competitive learning.]
Supervised vs. unsupervised learning :
- Input data : supervised learning deals with labelled data; unsupervised learning handles unlabelled data.
- Analysis : supervised learning is typically performed offline.
- Accuracy : supervised learning produces highly accurate results; unsupervised learning generates moderate results.
- Typical tasks : classification and regression (supervised); clustering and association rule mining (unsupervised).

Q 28. What is perception ? (PTU, May 2016)
Ans. Perception and communication are essential components of intelligent behaviour. They provide the ability to effectively interact with our environment. Humans perceive and communicate through their five basic senses of sight, hearing, touch, smell and taste, and through their ability to generate meaningful utterances. Two of the senses, sight and hearing, are especially complex and require conscious inferencing.

Q 29. Explain the back propagation algorithm for neural nets. (PTU, May 2018, 2016)
Ans. Back propagation is a common method for training a neural network. The algorithm can be decomposed into the following four steps :
1. Feed forward computation.
2. Back propagation to the output layer.
3. Back propagation to the hidden layer.
4. Weight updates.
The algorithm is stopped when the value of the error function has become sufficiently small.
Worked example parameters :
β = learning rate = 0.45
α = momentum term = 0.9
f(x) = 1.0/(1.0 + exp(-x))
The NN in Fig. 1 has two nodes (N0,0 and N0,1) in the input layer, two nodes (N1,0 and N1,1) in the hidden layer and one node (N2,0) in the output layer. Input layer nodes are connected to hidden layer nodes with weights (w0,0 - w0,3). Hidden layer nodes are connected to the output layer node with weights (w1,0 and w1,1). The values that the weights take are chosen randomly and will be changed during the BP iterations. A table with the input records and desired output, together with the learning rate and momentum, is also given. The sigmoid formula f(x) = 1.0/(1.0 + exp(-x)) is used throughout. In NN training, all example sets are processed in the same way; the logic behind the calculation is the same.
1. Feed forward computation : Feed forward computation is done in two parts. The first part is getting the values of the hidden layer nodes, and the second part is using the values from the hidden layer to compute the value (or values) of the output layer. Input values N0,0 and N0,1 are pushed up the network towards the nodes in the hidden layer (N1,0 and N1,1). They are multiplied with the weights of the connecting nodes, and the values of the hidden layer nodes are calculated. The sigmoid function is used for the calculation, f(x) = 1.0/(1.0 + exp(-x)) :
N1,0 = f(x1) = f(w0,0 * n0,0 + w0,1 * n0,1) = f(0.4 + 0.1) = f(0.5) = 0.622459
N1,1 = f(x2) = f(w0,2 * n0,0 + w0,3 * n0,1) = f(-0.1 - 0.1) = f(-0.2) = 0.450166
When the hidden layer values have been calculated, the network propagates forward: it propagates the values from the hidden layer up to the output layer node (N2,0). This is the second step of the forward computation :
N2,0 = f(x3) = f(w1,0 * n1,0 + w1,1 * n1,1) = f(0.06 * 0.622459 + (-0.4) * 0.450166) = f(-0.1427168) = 0.464381
Having calculated N2,0, the forward pass is completed.
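The forward pass just described can be reproduced with a few lines of Python. This is a minimal sketch: the variable names mirror the node and weight labels of the example, and the input values of 1.0 are implied by the arithmetic in the example (0.4 * 1 + 0.1 * 1 = 0.5) rather than stated explicitly.

```python
import math

def f(x):
    """Sigmoid activation used in the example: f(x) = 1/(1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

# Inputs and initial weights from the worked example.
n0_0, n0_1 = 1.0, 1.0
w0_0, w0_1, w0_2, w0_3 = 0.4, 0.1, -0.1, -0.1   # input -> hidden
w1_0, w1_1 = 0.06, -0.4                          # hidden -> output

# Step 1a: hidden layer values.
n1_0 = f(w0_0 * n0_0 + w0_1 * n0_1)   # f(0.5)  ~ 0.622459
n1_1 = f(w0_2 * n0_0 + w0_3 * n0_1)   # f(-0.2) ~ 0.450166

# Step 1b: output layer value.
n2_0 = f(w1_0 * n1_0 + w1_1 * n1_1)   # f(-0.142717) ~ 0.464381
print(n1_0, n1_1, n2_0)
```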
2. Back propagation to the output layer : The next step is to calculate the error of the N2,0 node. From the table, the output should be 1. The predicted value (N2,0) in our example is 0.464381. The error calculation is done as follows :
N2,0error = n2,0 * (1 - n2,0) * (N2,0desired - N2,0) = 0.464381 * (1 - 0.464381) * (1 - 0.464381) = 0.133225
Once the error is known, it is used for backward propagation and for adjusting the weights. It is a two-step process: the error is propagated from the output layer to the hidden layer first. This is where the learning rate and momentum are brought into the equation, so the weights w1,0 and w1,1 are updated first. Before a weight can be updated, its rate of change needs to be found. This is done by multiplying the learning rate, the error value and the node N1,0 value :
Δw1,0 = β * N2,0error * N1,0 = 0.45 * 0.133225 * 0.622459 = 0.037317
Now the new weight for w1,0 can be calculated :
w1,0new = w1,0old + Δw1,0 + (α * Δ(t-1)) = 0.06 + 0.037317 + 0.9 * 0 = 0.097317
Δw1,1 = β * N2,0error * N1,1 = 0.45 * 0.133225 * 0.450166 = 0.026988
w1,1new = w1,1old + Δw1,1 + (α * Δ(t-1)) = -0.4 + 0.026988 = -0.373012
The value Δ(t-1) is the previous delta change of the weight. In our example there is no previous delta change, so it is always 0. If the next iteration were to be calculated, this would have some value.
3. Back propagation to the hidden layer : Now the error has to be propagated from the hidden layer down to the input layer. Let us start by finding the N1,0 error. It is calculated by multiplying the new weight w1,0 value with the error for the node N2,0. In the same way the error for the N1,1 node is found :
N1,0error = N2,0error * w1,0new = 0.133225 * 0.097317 = 0.012965
N1,1error = N2,0error * w1,1new = 0.133225 * (-0.373012) = -0.049706
Once the error for the hidden layer nodes is known, the weights between the input and hidden layer can be updated. The rate of change first needs to be calculated for every weight :
Δw0,0 = β * N1,0error * N0,0 = 0.45 * 0.012965 * 1 = 0.005834
Δw0,1 = β * N1,0error * N0,1 = 0.45 * 0.012965 * 1 = 0.005834
Δw0,2 = β * N1,1error * N0,0 = 0.45 * (-0.049706) * 1 = -0.022368
Δw0,3 = β * N1,1error * N0,1 = 0.45 * (-0.049706) * 1 = -0.022368
Then we calculate the new weights between the input and hidden layer :
w0,0new = w0,0old + Δw0,0 + (α * Δ(t-1)) = 0.4 + 0.005834 + 0.9 * 0 = 0.405834
w0,1new = w0,1old + Δw0,1 + (α * Δ(t-1)) = 0.1 + 0.005834 + 0 = 0.105834
w0,2new = w0,2old + Δw0,2 + (α * Δ(t-1)) = -0.1 - 0.022368 + 0 = -0.122368
w0,3new = w0,3old + Δw0,3 + (α * Δ(t-1)) = -0.1 - 0.022368 + 0 = -0.122368
4. Weight updates : Here is a quick second pass using the new weights, to see if the error has decreased.
N1,0 = f(x1) = f(w0,0 * n0,0 + w0,1 * n0,1) = f(0.406 + 0.1) = f(0.506) = 0.623868314
N1,1 = f(x2) = f(w0,2 * n0,0 + w0,3 * n0,1) = f(-0.122 - 0.122) = f(-0.244) = 0.43930085
N2,0 = f(x3) = f(w1,0 * n1,0 + w1,1 * n1,1) = f(0.097 * 0.623868314 + (-0.373) * 0.43930085) = f(-0.103343991) = 0.474186972
Having calculated N2,0, the forward pass is completed. The next step is to calculate the error of the N2,0 node. From the table, the output should be 1; the predicted value (N2,0) is now 0.474186972. The error calculation is done in the following way :
N2,0error = n2,0 * (1 - n2,0) * (N2,0desired - N2,0) = 0.474186972 * (1 - 0.474186972) * (1 - 0.474186972) = 0.131102901
After the initial iteration the calculated error was 0.133225, and the new calculated error is 0.131102, so the error has decreased.
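The error and weight-update formulas of steps 2 and 3 can be written out directly in Python. This is a minimal sketch of the worked example only (first iteration, so the momentum contribution α * Δ(t-1) is zero); the starting values are the ones carried over from the forward pass above.

```python
beta, alpha = 0.45, 0.9      # learning rate and momentum from the example
target = 1.0
prev_delta = 0.0             # delta(t-1) is 0 on the first iteration

# Values carried over from the forward pass above.
n0_0, n0_1 = 1.0, 1.0
n1_0, n1_1, n2_0 = 0.622459, 0.450166, 0.464381
w0_0, w0_1, w0_2, w0_3 = 0.4, 0.1, -0.1, -0.1
w1_0, w1_1 = 0.06, -0.4

# Step 2: error at the output node and updates for the hidden -> output weights.
n2_0_err = n2_0 * (1 - n2_0) * (target - n2_0)          # ~ 0.133225
w1_0 += beta * n2_0_err * n1_0 + alpha * prev_delta     # 0.06 -> ~ 0.097317
w1_1 += beta * n2_0_err * n1_1 + alpha * prev_delta     # -0.4 -> ~ -0.373012

# Step 3: propagate the error to the hidden layer (using the new weights) and update input -> hidden weights.
n1_0_err = n2_0_err * w1_0                               # ~ 0.012965
n1_1_err = n2_0_err * w1_1                               # ~ -0.0497
w0_0 += beta * n1_0_err * n0_0      # 0.4  -> ~ 0.405834
w0_1 += beta * n1_0_err * n0_1      # 0.1  -> ~ 0.105834
w0_2 += beta * n1_1_err * n0_0      # -0.1 -> ~ -0.122368
w0_3 += beta * n1_1_err * n0_1      # -0.1 -> ~ -0.122368
print(w1_0, w1_1, w0_0, w0_1, w0_2, w0_3)
# Step 4 would repeat the forward pass with these new weights and confirm a slightly smaller error.
```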
Q 30. List and explain various types of intelligent agents in AI. Illustrate the functioning of each agent. (PTU, May 2017)
Ans. 1. Simple reflex agents
2. Model-based reflex agents
3. Goal-based agents
4. Utility-based agents
1. Simple reflex agent : Simple reflex agents act only on the basis of the current percept, ignoring the rest of the percept history. The agent function is based on the condition-action rule: if condition then action (see the sketch after this answer). This agent function only succeeds when the environment is fully observable. Some reflex agents can also contain information on their current state, which allows them to disregard conditions whose actuators are already triggered. Infinite loops are often unavoidable for simple reflex agents operating in partially observable environments; if the agent can randomize its actions, it may be possible to escape from such loops.
2. Model-based reflex agents : A model-based agent can handle a partially observable environment. Its current state is stored inside the agent, maintaining some kind of structure which describes the part of the world which cannot be seen. This knowledge about "how the world works" is called a model of the world, hence the name "model-based agent". A model-based reflex agent should maintain some sort of internal model that depends on the percept history and thereby reflects at least some of the unobserved aspects of the current state. It then chooses an action in the same way as the reflex agent.
3. Goal-based agents : Goal-based agents further expand on the capabilities of the model-based agents by using "goal" information. Goal information describes situations that are desirable. This allows the agent a way to choose among multiple possibilities, selecting the one which reaches a goal state. Search and planning are the subfields of artificial intelligence devoted to finding action sequences that achieve the agent's goals. In some instances the goal-based agent appears to be less efficient, but it is more flexible because the knowledge that supports its decisions is represented explicitly and can be modified.
4. Utility-based agents : Goal-based agents only distinguish between goal states and non-goal states. It is possible to define a measure of how desirable a particular state is. This measure can be obtained through the use of a utility function which maps a state to a measure of the utility of the state. A more general performance measure should allow a comparison of different world states according to exactly how happy they would make the agent. The term utility can be used to describe how "happy" the agent is. A rational utility-based agent chooses the action that maximizes the expected utility of the action outcomes, i.e., the utility the agent expects to derive, on average, given the probabilities and utilities of each outcome. A utility-based agent has to model and keep track of its environment, tasks that have involved a great deal of research on perception, representation, reasoning and learning.
[Figure : Structure of a utility-based agent. Sensors report "what the world is like now"; the models of "how the world evolves" and "what my actions do" predict "what it will be like" after an action; the utility function evaluates "how happy I will be in such a state"; the agent then decides "what action I should do now" and passes it to the actuators acting on the environment.]
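To make the condition-action idea behind the simple reflex agent (item 1 above) concrete, here is a minimal sketch. The two-location vacuum world used for the percepts and rules is a common textbook illustration assumed here for demonstration; it is not taken from these notes.

```python
def simple_reflex_vacuum_agent(percept):
    """Condition-action rules: if condition then action, using only the current percept."""
    location, status = percept        # e.g. ("A", "dirty")
    if status == "dirty":
        return "suck"
    if location == "A":
        return "move_right"
    return "move_left"

# The agent ignores percept history entirely: the same percept always yields the same action.
print(simple_reflex_vacuum_agent(("A", "dirty")))   # suck
print(simple_reflex_vacuum_agent(("A", "clean")))   # move_right
```

A model-based, goal-based or utility-based agent would differ by keeping internal state, checking candidate actions against a goal test, or scoring predicted successor states with a utility function before acting.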
Q 31. Describe any four learning techniques with suitable examples. (PTU, Dec. 2018)
Ans. Learning techniques :
1. Supervised learning categories and techniques :
- Linear classifier
- Parametric
- Non-parametric
- Non-metric
- Aggregation
2. Unsupervised learning categories and techniques :
- Clustering
- Density estimation
- Dimensionality reduction

Q 32. Write a short note on Rule based learning. (PTU, Dec. 2020 ; May 2017)
Ans. Rule based learning : Rule based learning is a term in computer science intended to encompass any machine learning method that identifies, learns or evolves "rules" to store, manipulate or apply. The defining characteristic of a rule-based machine learner is the identification and utilization of a set of relational rules that collectively represent the knowledge captured by the system. This is in contrast to other machine learners that commonly identify a singular model that can be universally applied to any instance in order to make a prediction. Rule based machine learning approaches include learning classifier systems, association rule learning, artificial immune systems and any other method that relies on a set of rules, each covering contextual knowledge. While rule based machine learning is conceptually a type of rule based system, it is distinct from traditional rule based systems, which are often hand crafted, and from other rule based decision makers. This is because rule based machine learning applies some form of learning algorithm to automatically identify useful rules, rather than a human needing to apply prior domain knowledge to manually construct rules and create a rule set.

Q 33. What is an expert system? What are the main advantages in keeping the knowledge base separate from the control module in knowledge based systems? (PTU, Dec. 2017)
Ans. Expert systems solve problems that are normally solved by human "experts". To solve expert-level problems, expert systems need access to a substantial domain knowledge base, which must be built as efficiently as possible. They also need to exploit one or more reasoning mechanisms to apply their knowledge to the problems they are given. Then they need a mechanism for explaining what they have done to the users who rely on them. One way to look at expert systems is that they represent applied AI in a very broad sense.
In a conventional program, domain knowledge is intimately intertwined with the software for controlling the application of that knowledge. In a knowledge based system the two roles are explicitly separated. In the simplest case there are two modules: the knowledge module is called the knowledge base, and the control module is called the inference engine. Within the knowledge base, the programmer expresses information about the problem to be solved. Often this information is declarative, i.e. the programmer states some facts, rules or relationships without having to be concerned with the details of how and when that information should be applied. These latter details are determined by the inference engine, which uses the knowledge base as a conventional program uses a data file. A KBS is analogous to the human brain, whose control processes are approximately unchanging in their nature, like the inference engine, even though individual behaviour is continually modified by new knowledge and experience, like updating the knowledge base.
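The separation of knowledge base and inference engine described above can be sketched in a few lines. This is a toy forward-chaining example; the facts and rules are invented for illustration and are not from the notes.

```python
# Knowledge base: declarative facts and IF-THEN rules, kept separate from control.
facts = {"has_fever", "has_rash"}                       # hypothetical facts
rules = [
    ({"has_fever", "has_rash"}, "suspect_measles"),     # IF fever AND rash THEN suspect measles
    ({"suspect_measles"}, "refer_to_specialist"),       # IF measles suspected THEN refer
]

def inference_engine(facts, rules):
    """Generic control: forward chaining that fires any satisfied rule until nothing new is derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived

print(inference_engine(facts, rules))
```

Because the engine knows nothing about the domain, the knowledge base can be replaced or extended (new facts, new rules) without touching the control code, which is the main advantage of the separation asked about in Q 33.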
Q 34. Explain various applications of reinforcement learning. (PTU, May 2019)
Ans. Various applications of reinforcement learning are as follows :
1. Manufacturing : In Fanuc, a robot uses deep reinforcement learning to pick a device from one box and put it in a container. Whether it succeeds or fails, it memorizes the object, gains knowledge and trains itself to do this job with great speed and precision.
2. Inventory Management : A reinforcement learning algorithm can be built to reduce transit time for stocking as well as retrieving products in the warehouse, optimizing space utilization and warehouse operations.
3. Delivery Management : Reinforcement learning is used to solve the problem of split delivery vehicle routing. Q-learning is used to serve appropriate customers with just one vehicle.
4. Finance Sector : Pit.ai is at the forefront of leveraging reinforcement learning for evaluating trading strategies. It is turning out to be a robust tool for training systems to optimize financial objectives. It has immense applications in stock market trading, where a Q-learning algorithm is able to learn an optimal trading strategy with one simple instruction: maximize the value of our portfolio.
5. Power Systems : Reinforcement learning and optimization techniques are utilized to assess the security of electric power systems and to enhance microgrid performance. Adaptive learning methods are employed to develop control and protection schemes.

Q 35. What are the advantages of Genetic Algorithms ? (PTU, May 2019)
Ans. Various advantages of genetic algorithms are as follows :
1. They can find fit solutions in very little time.
2. The random mutation guarantees, to some extent, that we see a wide range of solutions.

Q 36. What is unsupervised learning ? Why is it used ? (PTU, Dec. 2020)
Ans. Unsupervised learning is a machine learning technique in which the users do not need to supervise the model. Instead, it allows the model to work on its own to discover patterns and information that were previously undetected. It mainly deals with unlabelled data.
Here are the prime reasons for using unsupervised learning :
- Unsupervised machine learning finds all kinds of unknown patterns in data.
- Unsupervised methods help you to find features which can be useful for categorization.
- It takes place in real time, so all the input data can be analyzed and labelled in the presence of learners.
- It is easier to get unlabelled data from a computer than labelled data, which needs manual intervention.
Characteristics of unsupervised learning : The various characteristics of unsupervised learning are as follows :
1. The algorithms are used against data that is not labelled.
2. The algorithms are computationally complex.
3. The algorithms are less accurate.
4. The algorithms discover previously unknown patterns in data.
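Since clustering is the canonical unsupervised technique listed earlier (Q 31), a tiny sketch of k-means on unlabelled data illustrates these characteristics. This is a minimal illustration assuming NumPy is available; the data points and k = 2 are arbitrary choices of mine, not from the notes.

```python
import numpy as np

def kmeans(points, k=2, iterations=10, seed=0):
    """Plain k-means: no labels are supplied; the algorithm discovers groupings on its own."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iterations):
        # Assign each point to the nearest centroid.
        distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Move each centroid to the mean of its assigned points (keep it if the cluster is empty).
        centroids = np.array([points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
                              for j in range(k)])
    return labels, centroids

data = np.array([[1.0, 1.1], [0.9, 1.0], [1.2, 0.8],    # one natural group
                 [8.0, 8.2], [7.9, 8.1], [8.3, 7.8]])   # another natural group
labels, centroids = kmeans(data)
print(labels, centroids)   # the two groups are separated without any labels being provided
```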