DATA MINING AND BUSINESS INTELLIGENCE - C. CHANDRAPRIYA, M.Sc., M.Phil.

UNIT III
Concept Description and Association Rule Mining: What is concept description? - Data generalization and summarization-based characterization - Attribute relevance - Class comparisons. Association Rule Mining: Market basket analysis - Basic concepts - Finding frequent item sets - Apriori algorithm - Generating rules - Improved Apriori algorithm - Incremental ARM - Associative classification - Rule mining.

What is concept description?
Concept description is a descriptive form of data mining. It describes a given set of data, such as frequent buyers or graduate students, through characterization and comparison. It is also known as class description when the concept to be described refers to a class of objects.

Data Generalization and Summarization-Based Characterization
From a data analysis point of view, data mining can be classified into two categories: descriptive mining and predictive mining.
- Descriptive mining describes the data set in a concise and summary manner and presents interesting general properties of the data.
- Predictive mining analyzes the data to construct one or a set of models and attempts to predict the behavior of new data sets.
Databases usually store a large amount of data in great detail, yet users often want to view sets of summarized data in concise, descriptive terms.

What Is Concept Description?
The simplest kind of descriptive data mining is called concept description. A concept usually refers to a collection of data such as frequent_buyers or graduate_students. As a data mining task, concept description is not a simple enumeration of the data; instead, it generates descriptions for the characterization and comparison of the data. It is sometimes called class description when the concept to be described refers to a class of objects.
- Characterization provides a concise and succinct summarization of the given collection of data.
- Comparison provides descriptions comparing two or more collections of data.

Data Generalization and Summarization
Data and objects in databases contain detailed information at the primitive concept level. For example, the item relation in a sales database may contain attributes describing low-level item information such as item ID, name, brand, category, supplier, place made, and price. It is useful to be able to summarize a large set of data and present it at a high conceptual level. For example, summarizing a large set of items relating to Christmas season sales provides a general description of such data, which can be very helpful for sales and marketing managers. This requires an important functionality called data generalization.

Data Generalization
Data generalization is a process that abstracts a large set of task-relevant data in a database from a low conceptual level to higher ones. It summarizes the general features of objects in a target class and produces what are called characteristic rules. The data relevant to a user-specified class are normally retrieved by a database query and run through a summarization module to extract the essence of the data at different levels of abstraction. For example, one may want to characterize the "OurVideoStore" customers who regularly rent more than 30 movies a year. With concept hierarchies on the attributes describing the target class, the attribute-oriented induction (AOI) method can be used to carry out such data summarization. Alternatively, with a data cube containing a summarization of the data, simple OLAP operations fit the purpose of data characterization.
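As a rough illustration of what generalization produces, here is a minimal sketch of the attribute-oriented induction idea, assuming hypothetical concept hierarchies (city to country, exact age to an age band): low-level tuples are generalized one level up, and identical generalized tuples are merged while their counts are accumulated.

```python
from collections import Counter

# Hypothetical concept hierarchy: map low-level city values to countries.
CITY_TO_COUNTRY = {"Vancouver": "Canada", "Toronto": "Canada", "Chicago": "USA"}

def age_band(age):
    # One level up the concept hierarchy for age.
    return "20-29" if age < 30 else ("30-39" if age < 40 else "40+")

# Task-relevant tuples (city, age) retrieved by a query for the target class,
# e.g. OurVideoStore customers who rent more than 30 movies a year.
tuples = [("Vancouver", 25), ("Toronto", 28), ("Chicago", 34), ("Vancouver", 36)]

# Generalize each attribute, then merge identical generalized tuples and count them.
generalized = Counter((CITY_TO_COUNTRY[city], age_band(age)) for city, age in tuples)

for (country, band), count in generalized.items():
    print(country, band, "count =", count)
# Canada 20-29 count = 2   <- one tuple of the generalized (characteristic) relation
```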
Approaches:
- Data cube approach (OLAP approach).
- Attribute-oriented induction approach.

Presentation of Generalized Results
- Generalized relation: relations where some or all attributes are generalized, with counts or other aggregation values accumulated.
- Cross-tabulation: mapping results into cross-tabulation form (similar to contingency tables).
- Visualization techniques: pie charts, bar charts, curves, cubes, and other visual forms.
- Quantitative characteristic rules: mapping generalized results into characteristic rules annotated with quantitative information.

Data Cube Approach
This approach performs the computations and stores the results in data cubes.
Strengths:
- An efficient implementation of data generalization.
- Computation of various kinds of measures, e.g., count(), sum(), average(), max().
- Generalization and specialization can be performed on a data cube by roll-up and drill-down.
Limitations:
- It handles only dimensions of simple non-numeric data and measures of simple aggregated numeric values.
- It lacks intelligent analysis: it cannot tell which dimensions should be used or what level the generalization should reach.

Attribute Relevance
Attribute relevance analysis is a statistical approach for preprocessing data to filter out irrelevant attributes or to rank the relevant ones. Measures of attribute relevance can be used to recognize irrelevant attributes that can be excluded from the concept description process. The incorporation of this preprocessing step into class characterization or comparison is called analytical characterization.

Data discrimination produces discrimination rules, which compare the general features of objects between two classes referred to as the target class and the contrasting class. It compares the general characteristics of target-class data objects with the general characteristics of objects from one or a set of contrasting classes; the user can specify the target and contrasting classes. The methods used for data discrimination are very similar to those used for data characterization, except that data discrimination results include comparative measures.

Reasons for attribute relevance analysis:
- It helps decide which dimensions should be included.
- It can produce a higher level of generalization.
- It reduces the number of attributes, which lets us read patterns more easily.

The basic idea behind attribute relevance analysis is to compute some measure of the relevance of an attribute with respect to a given class or concept. Such measures include information gain, the ambiguity (uncertainty) measure, and the correlation coefficient.
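As an illustration, here is a small sketch (the toy records and attribute names are hypothetical) of computing information gain, one of the relevance measures named above, for each attribute with respect to the class label:

```python
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def info_gain(rows, attr_idx, labels):
    """Expected reduction in class entropy after partitioning on one attribute."""
    base = entropy(labels)
    by_value = {}
    for row, label in zip(rows, labels):
        by_value.setdefault(row[attr_idx], []).append(label)
    remainder = sum(len(part) / len(labels) * entropy(part) for part in by_value.values())
    return base - remainder

# Toy records (status, gpa) labelled as target class vs. contrasting class.
rows = [("grad", "high"), ("grad", "high"), ("undergrad", "low"), ("undergrad", "high")]
labels = ["target", "target", "contrasting", "contrasting"]

print(info_gain(rows, 0, labels))  # status separates the classes perfectly -> gain = 1.0
print(info_gain(rows, 1, labels))  # gpa is less relevant -> smaller gain (about 0.31)
```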
Attribute relevance analysis for concept description is implemented as follows:
1. Data collection: collect data for both the target class and the contrasting class by query processing.
2. Preliminary relevance analysis using conservative AOI: this step identifies the set of dimensions and attributes to which the selected relevance measure will be applied. AOI can perform a preliminary analysis by eliminating attributes with a very large number of distinct values. To remain conservative, the AOI performed here should use attribute generalization thresholds that are set reasonably large, so that more attributes are retained for further relevance analysis by the selected measure.
3. Remove irrelevant and weakly relevant attributes using the selected relevance measure.
4. Generate the concept description using AOI, this time with a less conservative set of attribute generalization thresholds. If the descriptive mining function is class characterization, only the initial target-class working relation is included at this point. If the descriptive mining function is class comparison, both the initial target-class working relation and the initial contrasting-class working relation are included.

Class Comparisons
Class comparison (data discrimination) mines descriptions that distinguish a target class from one or more contrasting classes, following the procedure above with comparative measures added.

Association Rule Mining
Association rule learning is an unsupervised learning technique that checks for the dependency of one data item on another and maps them accordingly so that the result can be made more profitable. It tries to find interesting relations or associations among the variables of a dataset, based on rules discovered from the database. Association rule learning is one of the important concepts of machine learning and is employed in market basket analysis, web usage mining, continuous production, and so on. Market basket analysis is a technique used by large retailers to discover associations between items. We can understand it with a supermarket example: products that tend to be purchased together are placed together. For instance, if a customer buys bread, he is likely to also buy butter, eggs, or milk, so these products are stored on the same shelf or nearby.

Association rule learning can be divided into three types of algorithms:
1. Apriori
2. Eclat
3. FP-Growth
We will look at these algorithms in later chapters.

How does Association Rule Learning work?
Association rule learning works on the concept of an if-then statement, such as "if A then B". Here the "if" element is called the antecedent, and the "then" element is called the consequent. A relationship in which we find an association between two single items is known as single cardinality.
Creating such rules is the core task, and as the number of items increases, the cardinality increases accordingly. So, to measure the associations between thousands of data items, several metrics are used:
- Support
- Confidence
- Lift
Let's understand each of them.

Support
Support is the frequency of an itemset, i.e., how frequently it appears in the dataset. It is defined as the fraction of the transactions T that contain the itemset X:
Support(X) = Freq(X) / T

Confidence
Confidence indicates how often the rule has been found to be true: how often items X and Y occur together relative to how often X occurs. It is the ratio of the number of transactions containing both X and Y to the number of transactions containing X:
Confidence(X -> Y) = Freq(X, Y) / Freq(X)

Lift
Lift measures the strength of a rule. It is the ratio of the observed support to the support expected if X and Y were independent of each other:
Lift(X -> Y) = Supp(X, Y) / (Supp(X) x Supp(Y))
It has three ranges of values:
- Lift = 1: the occurrence of the antecedent and the consequent are independent of each other.
- Lift > 1: the two itemsets are positively dependent; buying X makes Y more likely.
- Lift < 1: the two itemsets are negatively associated; one item tends to substitute for the other.
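A minimal sketch, with an illustrative toy transaction list, of how these three metrics are computed for a candidate rule X -> Y:

```python
def support(itemset, transactions):
    """Fraction of transactions containing every item of the itemset."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)

def confidence(x, y, transactions):
    return support(set(x) | set(y), transactions) / support(x, transactions)

def lift(x, y, transactions):
    return confidence(x, y, transactions) / support(y, transactions)

transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "eggs"},
]

print(support({"bread"}, transactions))                 # 0.75
print(confidence({"bread"}, {"butter"}, transactions))  # 2/3
print(lift({"bread"}, {"butter"}, transactions))        # (2/3) / 0.5 ~ 1.33 -> positive association
```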
Market Basket Analysis
Market basket analysis looks for combinations of items that frequently occur together in purchases. A typical rule has the form {Bread} -> {Butter}. Some terminology to familiarize yourself with:
- Antecedent: an item or itemset found within the data; in simpler words, it is the IF component, written on the left-hand side. In the example above, bread is the antecedent.
- Consequent: an item or set of items found in combination with the antecedent; it is the THEN component, written on the right-hand side. In the example above, butter is the consequent.

Types of Market Basket Analysis
Market basket analysis techniques can be categorized based on how the available data is utilized:
1. Descriptive market basket analysis: this type only derives insights from past data and is the most frequently used approach. The analysis does not make predictions but rates the association between products using statistical techniques. For those familiar with the basics of data analysis, this kind of modelling is known as unsupervised learning.
2. Predictive market basket analysis: this type uses supervised learning models such as classification and regression. It essentially aims to mimic the market in order to analyze what causes what to happen, considering items purchased in a sequence to determine cross-selling. For example, buying an extended warranty is more likely to follow the purchase of an iPhone. While it is not as widely used as descriptive MBA, it is still a very valuable tool for marketers.
3. Differential market basket analysis: this type is useful for competitor analysis. It compares purchase histories between stores, between seasons, between two time periods, between different days of the week, and so on, to find interesting patterns in consumer behaviour. For example, it can help determine why some users prefer to buy the same product at the same price on Amazon versus Flipkart; the answer may be that the Amazon reseller has more warehouses and can deliver faster, or something more profound such as user experience.

Algorithms associated with Market Basket Analysis
In market basket analysis, association rules are used to predict the likelihood of products being purchased together. Association rules count the frequency of items that occur together, seeking associations that occur far more often than expected. Algorithms that use association rules include AIS, SETM and Apriori. The Apriori algorithm is the one most commonly cited by data scientists in research articles about market basket analysis: it identifies frequent items in the database and then evaluates their frequency as the itemsets are expanded to larger sizes. R's arules package is an open-source toolkit for association mining in the R language; it supports the Apriori algorithm and other mining algorithms, including arulesNBMiner, opusminer, RKEEL and RSarules.

With the help of the Apriori algorithm, we can further classify and simplify the item sets that consumers frequently buy. There are three components in the Apriori algorithm:
- Support
- Confidence
- Lift
For example, suppose 5000 transactions have been made through a popular e-commerce site, and we want to calculate the support, confidence, and lift for two products.
Let's say the two products are a pen and a notebook: out of 5000 transactions, 500 contain the pen, 700 contain the notebook, and 100 contain both.

Support
Support is calculated as the number of transactions containing the item divided by the total number of transactions:
Support = freq(A) / N
support(pen) = transactions containing pen / total transactions = 500/5000 = 10 percent

Confidence
Confidence measures whether the products sell together rather than only individually. It is the number of combined transactions divided by the number of transactions for the antecedent:
Confidence = freq(A, B) / freq(A)
confidence(pen -> notebook) = 100/500 = 20 percent

Lift
Lift is calculated to know the strength of the rule:
Lift = Supp(pen, notebook) / (Supp(pen) x Supp(notebook)) = 2% / (10% x 14%) ~ 1.4
When the lift value is below 1, the combination is not frequently bought together by consumers. Here the lift is above 1, so the probability of buying both items together is higher than if they were sold independently.

Examples of Market Basket Analysis
Here are examples of market basket analysis by market segment:
- Retail: the best-known MBA case study is Amazon.com. Whenever you view a product on Amazon, the product page automatically recommends "Items bought together frequently", perhaps the simplest and cleanest example of MBA-driven cross-selling. Apart from e-commerce, MBA is also widely applied to in-store retail: grocery stores pay meticulous attention to product placement and shelving optimization. For example, you are almost always likely to find shampoo and conditioner placed very close to each other, and Walmart's famous beer-and-diapers anecdote is also an example of market basket analysis.
- Telecom: with the ever-increasing competition in the telecom sector, companies pay close attention to the services customers use together. For example, telecom operators have started to bundle TV and Internet packages with other discounted online services to reduce churn.
- IBFS (insurance, banking and financial services): tracing credit card history is a hugely advantageous MBA opportunity for IBFS organizations. For example, Citibank frequently employs sales personnel at large malls to attract potential customers with discounts, and partners with apps such as Swiggy and Zomato to show customers offers they can avail of by purchasing through credit cards. IBFS organizations also use basket analysis to detect fraudulent claims.
- Medicine: basket analysis is used to identify comorbid conditions and to support symptom analysis. It can also help identify which genes or traits are hereditary and which are associated with local environmental effects.

Benefits of Market Basket Analysis
The market basket analysis technique has the following benefits:
- Increasing market share: once a company hits peak growth, it becomes challenging to find new ways of increasing market share. Market basket analysis can combine demographic and gentrification data to determine the location of new stores or geo-targeted ads.
- Behaviour analysis: understanding customer behaviour patterns is one of the foundations of marketing, and MBA can be used for anything from simple catalogue design to UI/UX.
- Optimization of in-store operations: MBA is helpful not only for deciding what goes on the shelves but also for what happens behind the store. Geographical patterns play a key role in the popularity or strength of certain products, so MBA is increasingly used to optimize inventory for each store or warehouse.
- Campaigns and promotions: MBA is used not only to determine which products go together but also which products form the keystones of a product line.
- Recommendations: OTT platforms such as Netflix and Amazon Prime benefit from MBA by understanding what kinds of movies people tend to watch together frequently.

Basic Concepts
Data mining is the process of finding useful new correlations, patterns, and trends by sifting through large amounts of data stored in repositories, using pattern recognition technologies including statistical and mathematical techniques. It is the analysis of observational datasets to discover unsuspected relationships and to summarize the records in novel ways that are both understandable and useful to the data owner. The main concepts of data mining are as follows.

Classification: classification is the process of finding a model that describes and distinguishes data classes or concepts, so that the model can be used to predict the class of objects whose class label is unknown. The derived model is based on the analysis of a set of training records (data objects whose class labels are known).

Prediction: prediction is similar to classification, except that the results lie in the future. Examples of prediction tasks in business and research include:
- predicting the value of a stock three months into the future;
- predicting the percentage increase in traffic deaths next year if the speed limit is raised;
- predicting the winner of this fall's baseball World Series, based on a comparison of team statistics;
- predicting whether a particular molecule in drug discovery will lead to a profitable new drug for a pharmaceutical company.

Association Rules and Recommendation Systems: association rules, or affinity analysis, are designed to find general association patterns between items in large databases. The rules can be used in several ways. For example, grocery stores can use such information for product placement, for weekly promotional offers, or for bundling products. Association rules derived from a hospital database on patients' symptoms during consecutive hospitalizations can help find "which symptom is followed by what other symptom", to help predict future symptoms for returning patients.

Data Reduction: data mining is often applied to databases holding a huge volume of data.
When data analysis and mining are run on such a huge number of records, they can take so long that the task becomes impractical or infeasible. To reduce processing time, data reduction techniques are used to obtain a reduced representation of the dataset that is much smaller in volume while maintaining the integrity of the original data. By reducing the data, the efficiency of the data mining process is improved while producing the same analytical results. Data reduction aims to represent the data more compactly: when the data size is smaller, it is easier to apply mature and computationally expensive algorithms. The reduction may be in terms of the number of rows (records) or the number of columns (dimensions).

Finding Frequent Item Sets
Frequent item sets are a fundamental concept in association rule mining, a technique used in data mining to discover relationships between items in a dataset. The goal of association rule mining is to identify relationships between items that occur frequently together.

A frequent item set is a set of items that occur together frequently in a dataset. The frequency of an item set is measured by the support count, which is the number of transactions or records in the dataset that contain the item set. For example, if a dataset contains 100 transactions and the item set {milk, bread} appears in 20 of them, the support count for {milk, bread} is 20.

Association rule mining algorithms, such as Apriori or FP-Growth, are used to find frequent item sets and generate association rules. These algorithms work by iteratively generating candidate item sets and pruning those that do not meet the minimum support threshold. Once the frequent item sets are found, association rules can be generated using the concept of confidence, which is the ratio of the number of transactions containing the whole item set to the number of transactions containing the antecedent (left-hand side) of the rule.

Frequent item sets and association rules can be used for a variety of tasks such as market basket analysis, cross-selling, and recommendation systems. However, association rule mining can generate a large number of rules, many of which may be irrelevant or uninteresting, so it is important to use measures such as lift and conviction to evaluate the interestingness of the generated rules.

Association mining searches for frequent items in the data set. Frequent mining usually reveals interesting associations and correlations between item sets in transactional and relational databases. In short, frequent mining shows which items appear together in a transaction or relationship.

Need of Association Mining: frequent mining is the generation of association rules from a transactional dataset. If two items X and Y are purchased together frequently, it makes sense to put them together in stores or to offer a discount on one item with the purchase of the other; this can really increase sales. For example, it is likely that if a customer buys milk and bread, he or she also buys butter, giving the association rule [milk]^[bread] => [butter]. The seller can then suggest butter to a customer who buys milk and bread.

Important Definitions
Support: support is one of the measures of interestingness.
It tells about the usefulness and certainty of a rule: a support of 5% means that 5% of all transactions in the database follow the rule.
Support(A -> B) = Support_count(A ∪ B)

Confidence: a confidence of 60% means that 60% of the customers who purchased milk and bread also bought butter.
Confidence(A -> B) = Support_count(A ∪ B) / Support_count(A)
If a rule satisfies both minimum support and minimum confidence, it is a strong rule.

Support_count(X): the number of transactions in which X appears. If X is A ∪ B, then it is the number of transactions in which both A and B are present.

Maximal itemset: an itemset is maximal frequent if none of its supersets is frequent.
Closed itemset: an itemset is closed if none of its immediate supersets has the same support count as the itemset itself.
K-itemset: an itemset containing K items is a K-itemset.
So an itemset is frequent if its support count is greater than or equal to the minimum support count.

Example of Finding Frequent Itemsets
Consider the given dataset:

TID   Items
1     {A, C, D}
2     {B, C, D}
3     {A, B, C, D}
4     {B, D}
5     {A, B, C, D}

Let the minimum support count be 3 transactions. The relation that holds is: maximal frequent => closed => frequent.

1-frequent:
{A} = 3   // not closed due to {A, C}; not maximal
{B} = 4   // not closed due to {B, D}; not maximal
{C} = 4   // not closed due to {C, D}; not maximal
{D} = 5   // closed, since no immediate superset has the same count; not maximal

2-frequent:
{A, B} = 2   // not frequent (support count < minimum support count), so ignore
{A, C} = 3   // not closed due to {A, C, D}
{A, D} = 3   // not closed due to {A, C, D}
{B, C} = 3   // not closed due to {B, C, D}
{B, D} = 4   // closed, but not maximal due to {B, C, D}
{C, D} = 4   // closed, but not maximal due to {B, C, D}

3-frequent:
{A, B, C} = 2   // ignore, not frequent (support count < minimum support count)
{A, B, D} = 2   // ignore, not frequent (support count < minimum support count)
{A, C, D} = 3   // maximal frequent
{B, C, D} = 3   // maximal frequent

4-frequent:
{A, B, C, D} = 2   // ignore, not frequent
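A short sketch that reproduces this classification from the definitions above, assuming the same five transactions and a minimum support count of 3:

```python
from itertools import combinations

transactions = [{"A","C","D"}, {"B","C","D"}, {"A","B","C","D"}, {"B","D"}, {"A","B","C","D"}]
items = sorted(set().union(*transactions))
min_count = 3

def count(itemset):
    return sum(set(itemset) <= t for t in transactions)

# Support count of every non-empty itemset over {A, B, C, D}.
supports = {frozenset(c): count(c)
            for k in range(1, len(items) + 1)
            for c in combinations(items, k)}

frequent = {s: c for s, c in supports.items() if c >= min_count}
closed = {s for s, c in frequent.items()
          if not any(s < t and supports[t] == c for t in supports)}   # no superset keeps the count
maximal = {s for s in frequent
           if not any(s < t for t in frequent)}                       # no frequent superset

print(sorted(map(sorted, frequent)))   # 11 frequent itemsets, from ['A'] up to ['B','C','D']
print(sorted(map(sorted, maximal)))    # [['A','C','D'], ['B','C','D']]
print(sorted(map(sorted, closed)))     # [['A','C','D'], ['B','C','D'], ['B','D'], ['C','D'], ['D']]
```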
Advantages and Disadvantages
Advantages of using frequent item sets and association rule mining include:
- Efficient discovery of patterns: association rule mining algorithms are efficient at discovering patterns in large datasets, making them useful for tasks such as market basket analysis and recommendation systems.
- Easy to interpret: the results are easy to understand and interpret, making it possible to explain the patterns found in the data.
- Wide range of applications: association rule mining can be used in retail, finance, healthcare and many other domains, helping to improve decision-making and increase revenue.
- Handling large datasets: these algorithms can handle large datasets with many items and transactions, which makes them suitable for big-data scenarios.

Disadvantages include:
- Large number of generated rules: association rule mining can generate a large number of rules, many of which may be irrelevant or uninteresting, which can make it difficult to identify the most important patterns.
- Limited in detecting complex relationships: association rule mining only considers the co-occurrence of items within the same transaction, so it is limited in its ability to detect more complex relationships between items.
- Can be computationally expensive: as the number of items and transactions increases, the number of candidate item sets also increases, which can make the algorithm computationally expensive.
- Thresholds must be chosen in advance: the minimum support and confidence thresholds must be set before mining, which can be difficult and requires a good understanding of the data.

Apriori Algorithm
The Apriori algorithm is used to calculate the association rules between objects, i.e., how two or more objects are related to one another. In other words, the Apriori algorithm is an association rule learning method that analyzes whether people who bought product A also bought product B. Its primary objective is to create association rules describing how two or more objects are related. The Apriori algorithm is also called frequent pattern mining. Generally, it is operated on a database consisting of a huge number of transactions; think of going to Big Bazar and buying different products. It helps the customers buy their products with ease and increases the sales performance of Big Bazar. In this section, we discuss the Apriori algorithm with examples.

Introduction
We take an example to understand the concept better. You must have noticed that a pizza shop sells a pizza, soft drink, and breadstick combo, and offers a discount to customers who buy the combo. Why does the seller do so? He reasons that customers who buy pizza also buy soft drinks and breadsticks, so by making combos he makes buying easy for the customers and at the same time increases his own sales. Similarly, in Big Bazar you will find biscuits, chips, and chocolate bundled together, showing that the shopkeeper makes it convenient for customers to buy these products in the same place. These two examples illustrate association rules in data mining and help us understand the idea behind the Apriori algorithm.

What is the Apriori Algorithm?
The Apriori algorithm is used to mine frequent item sets and the relevant association rules. Generally, it operates on a database containing a huge number of transactions, for example the items customers buy at Big Bazar. It helps the customers buy their products with ease and increases the sales performance of the particular store.

Components of the Apriori Algorithm
The algorithm comprises three components:
1. Support
2. Confidence
3. Lift
Let's take an example to understand them. As discussed above, we need a large database containing many transactions. Suppose there are 4000 customer transactions in Big Bazar, and we want to calculate the support, confidence, and lift for two products, say biscuits and chocolate, because customers frequently buy these two items together.
Out of the 4000 transactions, 400 contain biscuits and 600 contain chocolate, and 200 of them contain both biscuits and chocolate. Using these data, we find the support, confidence, and lift.

Support
Support refers to the default popularity of any product. It is the number of transactions containing that product divided by the total number of transactions:
Support(Biscuits) = (transactions containing biscuits) / (total transactions) = 400/4000 = 10 percent.

Confidence
Confidence refers to the likelihood that customers bought both biscuits and chocolate together. It is the number of transactions containing both biscuits and chocolate divided by the number of transactions containing biscuits:
Confidence = (transactions containing both biscuits and chocolate) / (transactions containing biscuits) = 200/400 = 50 percent.
It means that 50 percent of customers who bought biscuits also bought chocolate.

Lift
Lift refers to the increase in the ratio of chocolate sales when biscuits are sold, relative to what would be expected if the two were independent:
Lift = Supp(Biscuits, Chocolate) / (Supp(Biscuits) x Supp(Chocolate)) = 5% / (10% x 15%) ~ 3.3
Because the lift value is well above 1, the probability of people buying biscuits and chocolate together is much higher than expected for independent purchases; a lift value below 1 would mean the combination is rarely bought together. The larger the value, the better the combination.

How does the Apriori Algorithm work in Data Mining?
We will understand the algorithm with the help of an example. Consider a Big Bazar scenario where the product set is P = {Rice, Pulse, Oil, Milk, Apple}. The database comprises six transactions, where 1 represents the presence of a product and 0 its absence:

Transaction ID   Rice   Pulse   Oil   Milk   Apple
t1               1      1       1     0      0
t2               0      1       1     1      0
t3               0      0       0     1      1
t4               1      1       0     1      0
t5               1      1       1     0      1
t6               1      1       1     1      1

The Apriori algorithm makes the following assumptions:
- All subsets of a frequent itemset must be frequent.
- All supersets of an infrequent itemset must be infrequent.
- A threshold support level is fixed; in our case, 50 percent.

Step 1
Make a frequency table of all the products that appear in the transactions, and shortlist only those whose support reaches the 50 percent threshold (at least 3 of the 6 transactions). We get the following frequency table:

Product     Frequency (number of transactions)
Rice (R)    4
Pulse (P)   5
Oil (O)     4
Milk (M)    4

These are the products frequently bought by the customers.

Step 2
Create pairs of the surviving products: RP, RO, RM, PO, PM, OM. Counting them gives the following frequency table:

Itemset   Frequency (number of transactions)
RP        4
RO        3
RM        2
PO        4
PM        3
OM        2

Step 3
Apply the same 50 percent support threshold to the pairs, i.e., keep the itemsets appearing in at least 3 transactions. Thus we keep RP, RO, PO, and PM.

Step 4
Now look for sets of three products that the customers buy together. Combining the surviving pairs gives:
1. RP and RO give RPO.
2. PO and PM give POM.

Step 5
Calculate the frequency of these two itemsets:

Itemset   Frequency (number of transactions)
RPO       3
POM       2

Applying the same threshold (at least 3 transactions), only RPO survives, so the customers' frequent set of three products is RPO. We have considered an easy example to discuss the Apriori algorithm; in reality you find thousands of such combinations.

How to improve the efficiency of the Apriori Algorithm?
Various methods are used to improve the efficiency of the Apriori algorithm.
Hash-based itemset counting: exclude any k-itemset whose corresponding hash bucket count falls below the threshold, since such an itemset cannot be frequent.
Transaction reduction: a transaction that does not contain any frequent itemset is of no value in subsequent scans and can be skipped.

Apriori Algorithm in data mining
We have already discussed an example of the Apriori algorithm for frequent itemset generation; the algorithm has many applications in data mining. The main approaches to finding association rules are given below.

Brute force
Enumerate all the rules and find the support and confidence level for each individual rule, then eliminate the rules whose values fall below the threshold support and confidence levels.

The two-step approach
The two-step approach is a better option for finding association rules than the brute-force method.
Step 1: Create the frequency tables and keep the itemsets whose support value is greater than the threshold support, as discussed above.
Step 2: To create association rules, use binary partitions of the frequent itemsets and choose the ones with the highest confidence levels. In the example above, the RPO combination was the frequent itemset; the rules generated from RPO are:
RP -> O, RO -> P, PO -> R, O -> RP, P -> RO, R -> PO
There are six different combinations. In general, if an itemset has n elements, there are 2^n - 2 candidate association rules.
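A small sketch of this rule-generation step: every binary partition of a frequent itemset becomes a candidate rule, and only rules whose confidence clears a threshold are kept. The support counts below are the ones from the worked example above (R = Rice, P = Pulse, O = Oil); the 0.6 confidence threshold is an illustrative choice.

```python
from itertools import combinations

# Support counts taken from the frequency tables above.
support_count = {
    frozenset("R"): 4, frozenset("P"): 5, frozenset("O"): 4,
    frozenset("RP"): 4, frozenset("RO"): 3, frozenset("PO"): 4,
    frozenset("RPO"): 3,
}

def rules_from_itemset(itemset, min_conf=0.6):
    """All binary partitions antecedent -> consequent of one frequent itemset."""
    itemset = frozenset(itemset)
    rules = []
    for size in range(1, len(itemset)):                # 2^k - 2 candidate rules in total
        for antecedent in combinations(itemset, size):
            antecedent = frozenset(antecedent)
            conf = support_count[itemset] / support_count[antecedent]
            if conf >= min_conf:
                rules.append((set(antecedent), set(itemset - antecedent), round(conf, 2)))
    return rules

for lhs, rhs, conf in rules_from_itemset("RPO"):
    print(lhs, "->", rhs, "confidence", conf)
# e.g. {'R'} -> {'P', 'O'} confidence 0.75 ... {'R', 'O'} -> {'P'} confidence 1.0
```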
Advantages of the Apriori Algorithm
- It can be used to calculate large itemsets.
- It is simple to understand and apply.

Disadvantages of the Apriori Algorithm
- It is an expensive method for finding support, since the calculation has to pass through the whole database.
- Sometimes a huge number of candidate rules is needed, so it becomes computationally expensive.

Association Rules Mining: General Concepts
This is an example of unsupervised data mining: we are not trying to predict a variable, whereas the classification algorithms discussed earlier are supervised techniques. Given a set of transactions, the goal is to find rules that predict the occurrence of an item based on the occurrences of other items in the transaction. Nominal attributes are required.

Affinity analysis is the process of determining which things go together; this is also called market basket analysis. For example, with the products milk, cheese, bread, and eggs, possible associations include:
1. If customers purchase milk they also purchase bread: {milk} -> {bread}
2. If customers purchase bread they also purchase milk: {bread} -> {milk}
3. If customers purchase milk and eggs they also purchase cheese and bread: {milk, eggs} -> {cheese, bread}
4. If customers purchase milk, cheese, and eggs they also purchase bread: {milk, cheese, eggs} -> {bread}
These rules are based on a set of customer transactions. Note that #1 and #2 are not the same, as the confidence ratings of the two rules below demonstrate. Implication here means co-occurrence, not causality.

Itemset
An itemset is a collection of one or more items, e.g., {Bread, Milk, Diaper}. Consider the transactions:

TID   Items
1     Bread, Milk
2     Bread, Diaper, Beer, Eggs
3     Milk, Diaper, Beer, Coke
4     Bread, Milk, Diaper, Beer
5     Bread, Milk, Diaper, Coke

Support Count
The support count, σ, is the frequency count of occurrences of the itemset, e.g., σ({Bread, Milk, Diaper}) = 2.

Support (similar to the idea of coverage with decision rules)
Support is the percentage of instances in the database that contain all items listed in an itemset.
- For the bread-and-milk cases #1 and #2 above, we might have σ(bread and milk) = 5000 out of 50000 instances, for s = 10% support.
- In the tiny five-transaction dataset, σ = 3 out of 5 instances gives s = 60%.

Association Rule
An association rule is an implication expression of the form X -> Y, where X and Y are itemsets. Example: {Milk, Diaper} -> {Beer}.

Confidence (similar to the idea of accuracy with decision rules)
Each rule has an associated confidence: the conditional probability of the association, i.e., the probability that customers who purchase one set of items then purchase another set of items. If there were 10000 recorded transactions purchasing milk, and of those 5000 also purchased bread, rule #1 has 50% confidence. For rule #2, if there were 15000 transactions purchasing bread, of which 5000 also purchased milk, the confidence is 33%.
For the five-transaction example and the rule {Milk, Diaper} -> {Beer}:
s = σ({Milk, Diaper, Beer}) / |T| = 2/5 = 0.4
c = σ({Milk, Diaper, Beer}) / σ({Milk, Diaper}) = 2/3 = 0.67

Item Sets
Item sets are attribute-value combinations that meet a specified coverage requirement (minimum support); item sets that do not make the cut are discarded. We can also impose a minimum confidence.
Association Rules Mining Approach
Given a set of transactions T, the goal of association rule mining is to find all rules having:
- support >= minsup threshold
- confidence >= minconf threshold

Brute-force approach:
- List all possible association rules.
- Compute the support and confidence for each rule.
- Prune the rules that fail the minsup and minconf thresholds.
This is computationally prohibitive: the number of rules grows exponentially, roughly O(3^d). For d unique items the total number of possible rules is
R = 3^d - 2^(d+1) + 1
so for d = 6, R = 602 rules.

For the five transactions above, consider the rules derived from the itemset {Milk, Diaper, Beer}:
{Milk, Diaper} -> {Beer}   (s = 0.4, c = 0.67)
{Milk, Beer} -> {Diaper}   (s = 0.4, c = 1.0)
{Diaper, Beer} -> {Milk}   (s = 0.4, c = 0.67)
{Beer} -> {Milk, Diaper}   (s = 0.4, c = 0.67)
{Diaper} -> {Milk, Beer}   (s = 0.4, c = 0.5)
{Milk} -> {Diaper, Beer}   (s = 0.4, c = 0.5)

Observations:
- All the above rules are binary partitions of the same itemset {Milk, Diaper, Beer}.
- Rules originating from the same itemset have identical support but can have different confidence.
Thus, we may decouple the support and confidence requirements.

Mining the Association Rules
Two-step approach:
1. Frequent itemset generation: generate all itemsets whose support >= minsup.
2. Rule generation: generate high-confidence rules from each frequent itemset, where each rule is a binary partitioning of a frequent itemset.

Frequent itemset generation is still computationally expensive: given d items, there are 2^d possible candidate itemsets. In the brute-force approach, each itemset in the lattice is a candidate frequent itemset; the support of each candidate is counted by scanning the database and matching every transaction against every candidate. The complexity is roughly O(NMw), where N is the number of transactions, M the number of candidates and w the average transaction width, which is expensive since M = 2^d.

Strategies
Reduce the number of candidates (M):
- The complete search space is M = 2^d.
- Use pruning techniques to reduce M (the Apriori principle, below).
Reduce the number of transactions (N):
- Reduce the size of N as the size of the itemset increases.
- Used by DHP and vertical-based mining algorithms.
Reduce the number of comparisons (NM):
- Use efficient data structures to store the candidates or transactions.
- Then there is no need to match every candidate against every transaction.

Apriori Principle
If an itemset is frequent, then all of its subsets must also be frequent. The principle holds because of the following property of the support measure:
for all itemsets X, Y: (X ⊆ Y) => s(X) >= s(Y)
The support of an itemset never exceeds the support of its subsets; this is known as the anti-monotone property of support. Conversely, once an itemset is found to be infrequent, all of its supersets can be pruned.

For the five transactions above, the item counts are: Bread 4, Coke 2, Milk 4, Beer 3, Diaper 4, Eggs 1. If every subset up to size 3 were considered, 6C1 + 6C2 + 6C3 = 6 + 15 + 20 = 41 candidates would have to be counted; with support-based pruning this drops to 6 + 6 + 4 = 16.

The formal Apriori algorithm
Let Fk be the frequent k-itemsets and Lk the candidate k-itemsets.
Algorithm:
- Let k = 1.
- Generate F1 = {frequent 1-itemsets}.
- Repeat until Fk is empty:
  - Candidate generation: generate Lk+1 from Fk.
  - Candidate pruning: prune candidate itemsets in Lk+1 that contain subsets of length k which are infrequent.
  - Support counting: count the support of each candidate in Lk+1 by scanning the database.
  - Candidate elimination: eliminate candidates in Lk+1 that are infrequent, leaving only those that are frequent => Fk+1.
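A compact sketch of this level-wise loop on the five transactions used earlier (candidate generation merges frequent itemsets that share all but their last item, in the spirit of the Fk-1 x Fk-1 method described below); it is an illustrative implementation rather than an optimized one.

```python
from itertools import combinations

def apriori(transactions, min_count):
    """Return {frequent itemset: support count}, found level by level."""
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        return {c: sum(c <= t for t in transactions) for c in candidates}

    # F1: frequent 1-itemsets.
    singletons = {frozenset([item]) for t in transactions for item in t}
    frequent = {c: n for c, n in count(singletons).items() if n >= min_count}
    all_frequent, k = dict(frequent), 1

    while frequent:
        prev = sorted(sorted(s) for s in frequent)      # frequent k-itemsets, lexicographic
        # Candidate generation: merge two k-itemsets that share their first k-1 items.
        candidates = {frozenset(a) | frozenset(b)
                      for a in prev for b in prev
                      if a < b and a[:k - 1] == b[:k - 1]}
        # Candidate pruning: every k-subset of a (k+1)-candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(sub) in frequent for sub in combinations(c, k))}
        # Support counting and candidate elimination.
        frequent = {c: n for c, n in count(candidates).items() if n >= min_count}
        all_frequent.update(frequent)
        k += 1
    return all_frequent

transactions = [{"Bread", "Milk"},
                {"Bread", "Diaper", "Beer", "Eggs"},
                {"Milk", "Diaper", "Beer", "Coke"},
                {"Bread", "Milk", "Diaper", "Beer"},
                {"Bread", "Milk", "Diaper", "Coke"}]

result = apriori(transactions, min_count=3)
for itemset in sorted(result, key=lambda s: (len(s), sorted(s))):
    print(sorted(itemset), result[itemset])
# ['Beer'] 3, ['Bread'] 4, ['Diaper'] 4, ['Milk'] 4, followed by the four frequent pairs
```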
Informally, the algorithm is:
- Finding one-item sets is easy.
- Use one-item sets to generate two-item sets, two-item sets to generate three-item sets, and so on.
- Keep only those item sets that meet the support threshold at each level, so that item sets at higher levels are pruned.
- Then partition the retained item sets into rules and keep only those that meet the confidence threshold.

Example 2: credit card promotion database
This example considers a dataset of nominal values. Although the attributes are binary, both values can be considered interesting, unlike market basket data where only purchases are interesting, so single item sets can be twice as numerous as above.

Magazine Promo   Watch Promo   Life Ins Promo   Credit Card Ins   Sex
Yes              No            No               No                Male
Yes              Yes           Yes              No                Female
No               No            No               No                Male
Yes              Yes           Yes              Yes               Male
Yes              No            Yes              No                Female
No               No            No               No                Female
Yes              No            Yes              Yes               Male
No               Yes           No               No                Male
Yes              No            No               No                Male
Yes              Yes           Yes              No                Female

Single item sets at a 40% coverage threshold (number of records in parentheses):
A. Magazine Promo = Yes (7)
B. Watch Promo = Yes (4)
C. Watch Promo = No (6)
D. Life Ins Promo = Yes (5)
E. Life Ins Promo = No (5)
F. Credit Card Ins = No (8)
G. Sex = Male (6)
H. Sex = Female (4)

Pairing, Step 2
Now begin pairing up combinations at the same 40% coverage threshold. Accumulating the pair counts in a triangular table and keeping the pairs that occur in at least 4 of the 10 records gives the two-item sets {A,C} = 4, {A,D} = 5, {A,F} = 5, {A,G} = 4, {C,E} = 4, {C,F} = 5, {C,G} = 4, {E,F} = 5, {E,G} = 4, {F,G} = 4 and {F,H} = 4.

Resulting rules from the two-item sets, considering rules in both directions:
1. (A -> D): (Magazine Promo = Yes) -> (Life Ins Promo = Yes) at 5/7 confidence
2. (D -> A): (Life Ins Promo = Yes) -> (Magazine Promo = Yes) at 5/5 confidence
3. twenty more rules from the remaining ten two-item sets (A then C, C then A, A then F, F then A, etc.)
Now apply the minimum confidence threshold: if it were 80%, the first rule (A -> D) would be eliminated. Repeat the process for three-item-set rules, then four-item-set rules, and so on, keeping the support and confidence thresholds the same.

Candidate Generation: the Fk-1 x Fk-1 method
Merge two frequent (k-1)-itemsets if their first (k-2) items are identical.
Example: F3 = {ABC, ABD, ABE, ACD, BCD, BDE}, lexicographically ordered. The candidate four-item sets are:
- Merge(ABC, ABD) = ABCD
- Merge(ABC, ABE) = ABCE
- Merge(ABD, ABE) = ABDE
Do not merge ABD and ACD, because they share only a prefix of length 1 instead of length 2. ACDE is not a candidate because CDE is not in F3.
L4 = {ABCD, ABCE, ABDE} is the set of candidate 4-itemsets generated by this method.
Candidate pruning:
- Prune ABCE because ACE and BCE are infrequent.
- Prune ABDE because ADE is infrequent.
After candidate pruning: F4 = {ABCD}.

Alternate Fk-1 x Fk-1 method
Merge two frequent (k-1)-itemsets if the last (k-2) items of the first one are identical to the first (k-2) items of the second.
Example: F3 = {ABC, ABD, ABE, ACD, BCD, BDE, CDE}. With the alternate method:
- Merge(ABC, BCD) = ABCD
- Merge(ABD, BDE) = ABDE
- Merge(ACD, CDE) = ACDE
- Merge(BCD, CDE) = BCDE
L4 = {ABCD, ABDE, ACDE, BCDE} is the set of candidate 4-itemsets generated by this second method. Candidate pruning again results in F4 = {ABCD}; why are the others eliminated?
(Figure 6.8: generating and pruning candidate k-itemsets by merging pairs of frequent (k-1)-itemsets.)

Rule generation
A three-item set is partitioned to generate 6 rules: an item set (A B C) generates the rules
(A & B) -> C, (A & C) -> B, (B & C) -> A, A -> (B & C), B -> (A & C), C -> (A & B).
For a four-item set L = (A B C D), the partitioning results in the rules:
ABC -> D, ABD -> C, ACD -> B, BCD -> A, A -> BCD, B -> ACD, C -> ABD, D -> ABC,
AB -> CD, AC -> BD, AD -> BC, BC -> AD, BD -> AC, CD -> AB.
If |L| = k, then there are 2^k - 2 candidate association rules (we are ignoring L -> True and True -> L; Weka will include the latter!).

In general, confidence does not have an anti-monotone property: conf(ABC -> D) can be larger or smaller than conf(AB -> D). But the confidence of rules generated from the same itemset does have an anti-monotone property. For example, if {A, B, C, D} is a frequent 4-itemset:
conf(ABC -> D) >= conf(AB -> CD) >= conf(A -> BCD)
Confidence is anti-monotone with respect to the number of items on the right-hand side of the rule, because conf = σ(itemset) / σ(lhs).
(Figure: lattice of rules; a low-confidence rule lets every rule below it in the lattice, with the same itemset and a larger consequent, be pruned.)

Weather example
For the standard weather data, item sets of each size are enumerated, for example:
- one-item sets: Outlook = Sunny (5), Temperature = Cool (4), ...
- two-item sets: Outlook = Sunny and Temperature = Hot (2), Outlook = Sunny and Humidity = High (3), ...
- three-item sets: Outlook = Sunny and Temperature = Hot and Humidity = High (2), ...
- four-item sets: Outlook = Sunny and Temperature = Hot and Humidity = High and Play = No (2), Outlook = Rainy and Temperature = Mild and Windy = False and Play = Yes (2), ...
In total (with a minimum support of two): 12 one-item sets, 47 two-item sets, 39 three-item sets, 6 four-item sets, and 0 five-item sets.

Once all item sets with minimum support have been generated, we can turn them into rules.
Example: Humidity = Normal, Windy = False, Play = Yes (4) gives seven (2^3 - 1) potential rules, six of them useful:
If Humidity = Normal and Windy = False -> Play = Yes (4/4)
If Humidity = Normal and Play = Yes -> Windy = False (4/6)
If Windy = False and Play = Yes -> Humidity = Normal (4/6)
If Humidity = Normal -> Windy = False and Play = Yes (4/7)
If Windy = False -> Humidity = Normal and Play = Yes (4/8)
If Play = Yes -> Humidity = Normal and Windy = False (4/9)
If True -> Humidity = Normal and Windy = False and Play = Yes (4/12)

Factors Affecting the Complexity of Apriori
Choice of minimum support threshold:
- Lowering the support threshold results in more frequent itemsets.
- This may increase the number of candidates and the maximum length of frequent itemsets.
Dimensionality (number of items) of the data set:
- More space is needed to store the support count of each item.
- If the number of frequent items also increases, both computation and I/O costs may increase.
Size of the database:
- Since Apriori makes multiple passes, the run time of the algorithm may increase with the number of transactions.
Average transaction width:
- Transaction width increases with denser data sets.
- This may increase the maximum length of frequent itemsets and the number of traversals of the hash tree (the number of subsets of a transaction increases with its width).

Support Counting of Candidate Itemsets
Scan the database of transactions to determine the support of each candidate itemset. Matching every candidate itemset against every transaction (for example, against the five transactions listed earlier) is an expensive operation. To reduce the number of comparisons, store the candidate itemsets in a hash structure: instead of matching each transaction against every candidate, match it only against the candidates contained in the hashed buckets it reaches.

Suppose you have 15 candidate itemsets of length 3:
{1 4 5}, {1 2 4}, {4 5 7}, {1 2 5}, {4 5 8}, {1 5 9}, {1 3 6}, {2 3 4}, {5 6 7}, {3 4 5}, {3 5 6}, {3 5 7}, {6 8 9}, {3 6 7}, {3 6 8}
How many of these itemsets are supported by the transaction (1, 2, 3, 5, 6)?
(Figure: a hash tree over the candidates; at each level the hash function sends items 1, 4, 7 to the left branch, 2, 5, 8 to the middle branch and 3, 6, 9 to the right branch, and the leaves hold the candidate 3-itemsets.)
Matching the transaction (1 2 3 5 6) against the tree leads only to the buckets that could contain its 3-item subsets, and the support counts of the candidates stored there are incremented.
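A simplified sketch of this idea: instead of a real hash tree, the candidates are placed in a flat dictionary of buckets keyed by each item's value mod 3 (mirroring the branching on {1,4,7}, {2,5,8}, {3,6,9} above), and a transaction is matched only against the buckets its own 3-item subsets hash to. The flat-bucket layout is an assumption made for brevity, not the exact tree structure.

```python
from itertools import combinations
from collections import defaultdict

candidates = [(1,4,5), (1,2,4), (4,5,7), (1,2,5), (4,5,8), (1,5,9), (1,3,6), (2,3,4),
              (5,6,7), (3,4,5), (3,5,6), (3,5,7), (6,8,9), (3,6,7), (3,6,8)]

def bucket(itemset):
    # Simplified stand-in for the hash tree: branch on each item's value mod 3.
    return tuple(i % 3 for i in itemset)

buckets = defaultdict(list)
for c in candidates:
    buckets[bucket(c)].append([c, 0])            # [candidate, support count]

transaction = (1, 2, 3, 5, 6)
for subset in combinations(transaction, 3):      # only the 3-subsets of this transaction
    for entry in buckets.get(bucket(subset), []):
        if entry[0] == subset:
            entry[1] += 1                        # increment the matched candidate's count

matched = [entry[0] for bs in buckets.values() for entry in bs if entry[1] > 0]
print(matched)   # the candidates contained in (1,2,3,5,6): (1,2,5), (1,3,6), (3,5,6)
```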
General Considerations

Association rules do not require identification of dependent variables first; this is a good example of information discovery. Not all rules may be useful: we may have a rule that exceeds our confidence level, but if the item sets involved are also high in probability, not much new information is revealed and the lift is low. For example, if customers who purchase milk also purchase bread (confidence level of 50%), but 70% of all purchases involve milk and 50% of purchases include bread, the rule is of little use.

Two types of relationships are of interest:
1. Association rules that show a lift in product sales for a particular product, where the lift is the result of its association with one or more other products. We may conclude that marketing can use this information.
2. Association rules that show a lower-than-expected confidence for a particular association. We may conclude that the products involved in the rule are competing for the same market.

Start with high thresholds to see what rules are found; then reduce the levels as needed.

Improved Apriori algorithm

Methods To Improve Apriori Efficiency
Many methods are available for improving the efficiency of the algorithm.
1. Hash-Based Technique: This method uses a hash-based structure called a hash table for generating the k-itemsets and their corresponding counts. It uses a hash function for generating the table.
2. Transaction Reduction: This method reduces the number of transactions scanned in later iterations. Transactions that do not contain any frequent items are marked or removed (a sketch of this idea follows the list).
3. Partitioning: This method requires only two database scans to mine the frequent itemsets. It relies on the fact that for any itemset to be frequent in the database, it must be frequent in at least one of the partitions of the database.
4. Sampling: This method picks a random sample S from database D and then searches for frequent itemsets in S. A globally frequent itemset may be missed; this risk can be reduced by lowering the minimum support.
5. Dynamic Itemset Counting: This technique can add new candidate itemsets at any marked start point of the database during the scan of the database.
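As a rough illustration of the transaction-reduction idea (item 2 above), the sketch below reuses the five-transaction example from the support-counting discussion and an assumed absolute support threshold of 3; it is an illustration of the idea, not a particular published algorithm.

```python
# A minimal, illustrative sketch of transaction reduction: once the frequent
# 1-itemsets are known, infrequent items can be stripped from every transaction,
# and a transaction left with fewer than k items can never contain a frequent
# k-itemset, so it is dropped from the next pass.
from collections import Counter

transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diapers", "Beer", "Eggs"},
    {"Milk", "Diapers", "Beer", "Coke"},
    {"Bread", "Milk", "Diapers", "Beer"},
    {"Bread", "Milk", "Diapers", "Coke"},
]
min_sup = 3                      # assumed absolute support threshold

item_counts = Counter(item for t in transactions for item in t)
frequent_items = {i for i, c in item_counts.items() if c >= min_sup}

# Reduce: keep only frequent items, then drop transactions too short to matter
# in the pass that counts 2-itemsets (k = 2).
reduced = [t & frequent_items for t in transactions]
reduced = [t for t in reduced if len(t) >= 2]

print(frequent_items)   # e.g. {'Bread', 'Milk', 'Diapers', 'Beer'}
print(reduced)          # smaller transactions to scan in the next iteration
```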
Applications Of Apriori Algorithm

Some fields where Apriori is used:
1. In the education field: extracting association rules in data mining of admitted students through their characteristics and specialties.
2. In the medical field: for example, analysis of a patients database.
3. In forestry: analysis of the probability and intensity of forest fires with forest fire data.
4. Apriori is used by many companies, such as Amazon in its recommender system and Google for the auto-complete feature.

Associative Classification

Bing Liu et al. were the first to propose associative classification, defining a model whose rules are constrained so that "the right-hand side is the attribute of the classification class". An associative classifier is a supervised learning model that uses association rules to assign a target value.

The model generated by the associative classifier, and used to label new records, consists of association rules whose consequents are class labels. They can therefore be thought of as a list of "if-then" clauses: if a record meets certain criteria (specified on the left side of the rule, also known as the antecedent), it is marked (or scored) according to the rule's category on the right. Most associative classifiers read the list of rules sequentially and apply the first matching rule to label a new record. Associative classifier rules inherit metrics from association rules, such as support and confidence, which can be used to rank or filter the rules in the model and to evaluate their quality.

Types of Associative Classification:
There are different types of associative classification methods; some of them are given below.
1. CBA (Classification Based on Associations): It uses association rule techniques to classify data, which proves to be more accurate than traditional classification techniques. It has to face the sensitivity of the minimum support threshold: when a lower minimum support threshold is specified, a large number of rules are generated.
2. CMAR (Classification based on Multiple Association Rules): It uses an efficient FP-tree, which consumes less memory and space compared to Classification Based on Associations. The FP-tree will not always fit in main memory, especially when the number of attributes is large.
3. CPAR (Classification based on Predictive Association Rules): It combines the advantages of associative classification and traditional rule-based classification. CPAR uses a greedy algorithm to generate rules directly from the training data. Furthermore, it generates and tests more rules than traditional rule-based classifiers in order to avoid missing important rules.
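The "apply the first matching rule" behaviour described above can be sketched as follows; the rules, attribute names, and default class are hypothetical and serve only to show the mechanism.

```python
# A minimal sketch of an associative classifier's labelling step. Rules are
# assumed pre-sorted by confidence (then support); the rules below are made up.

# Each rule: (antecedent as a set of (attribute, value) conditions, class label, confidence)
rules = [
    ({("Outlook", "Sunny"), ("Humidity", "High")}, "Play=No",  0.95),
    ({("Humidity", "Normal"), ("Windy", False)},   "Play=Yes", 0.90),
    ({("Outlook", "Overcast")},                    "Play=Yes", 0.85),
]
default_class = "Play=Yes"

def classify(record, rules, default):
    """Return the label of the first rule whose antecedent the record satisfies."""
    conditions = set(record.items())
    for antecedent, label, _conf in rules:
        if antecedent <= conditions:        # antecedent is a subset of the record
            return label
    return default

record = {"Outlook": "Sunny", "Temperature": "Hot", "Humidity": "High", "Windy": False}
print(classify(record, rules, default_class))   # Play=No (first rule matches)
```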
Association rule mining is a procedure which aims to observe frequently occurring patterns, correlations, or associations in datasets found in various kinds of databases, such as relational databases, transactional databases, and other repositories.

But what is an association rule?
An association rule is a learning technique that helps identify the dependencies between two data items; based on the dependency, items are then mapped accordingly so that the result can be more profitable. Association rule mining furthermore looks for interesting associations among the variables of a dataset. It is one of the important concepts of machine learning and has been used in different settings, such as association in data mining and continuous production, among others. However, like all other techniques, association rule mining has its own set of disadvantages.

An association rule has two parts:
+ an antecedent (if), and
+ a consequent (then)

An antecedent is something that is found in the data, and a consequent is an item that is found in combination with the antecedent. Have a look at this rule, for instance:
"If a customer buys bread, he is 70% likely to buy milk."
In the above association rule, bread is the antecedent and milk is the consequent. Simply put, it can be understood as a retail store's association rule for targeting its customers better. If the above rule is the result of a thorough analysis of some data sets, it can be used not only to improve customer service but also to improve the company's revenue.

Association rules are created by thoroughly analyzing data and looking for frequent if/then patterns. Then, depending on the following two parameters, the important relationships are observed:
1. Support: Support indicates how frequently the if/then relationship appears in the database.
2. Confidence: Confidence tells about the number of times these relationships have been found to be true.

So, in a given transaction with multiple items, association rule mining primarily tries to find the rules that govern how or why such products/items are often bought together. For example, peanut butter and jelly are frequently purchased together because a lot of people like to make PB&J sandwiches.

Association rule mining is sometimes referred to as "market basket analysis", as this was the first application area of association mining. The aim is to discover associations of items occurring together more often than you would expect from randomly sampling all the possibilities. The classic anecdote of beer and diapers will help in understanding this better. The story goes like this: young American men who go to the store on Fridays to buy diapers have a predisposition to grab a bottle of beer too. However unrelated and vague that may sound, association rule mining shows us how and why. Let's do a little analytics ourselves.

Suppose a store's retail transactions database includes the following data:
+ Total number of transactions: 600,000
+ Transactions containing diapers: 7,500 (1.25 percent)
+ Transactions containing beer: 60,000 (10 percent)
+ Transactions containing both beer and diapers: 6,000 (1.0 percent)

From the above figures, we can conclude that if there were no relation between beer and diapers (that is, if they were statistically independent), then we would expect only 10% of diaper purchasers to buy beer too. However, the figures tell us that 80% (= 6,000/7,500) of the people who buy diapers also buy beer. This is a jump by a factor of 8 over the expected probability. This factor of increase is known as lift: the ratio of the observed frequency of co-occurrence of the items to the expected frequency.

How did we determine the lift? Simply by counting the transactions in the database and performing simple arithmetic. So, for our example, one plausible association rule states that people who buy diapers will also purchase beer, with a lift factor of 8. Mathematically, the lift is the joint probability of two items x and y divided by the product of their individual probabilities:
Lift = P(x, y) / [P(x) P(y)]

However, if the two items are statistically independent, then the joint probability equals the product of the individual probabilities, P(x, y) = P(x) P(y), which makes the lift factor equal to 1. An interesting point worth mentioning here is that anti-correlation can even yield lift values less than 1, which corresponds to items that rarely (or never) occur together.

Association rule mining has helped data scientists find patterns they never knew existed.
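Those figures can be verified directly; the short sketch below recomputes support, confidence, and lift for the rule "diapers → beer" from the counts given above.

```python
# A minimal sketch that recomputes the diapers -> beer example from the counts above.
n_total   = 600_000   # total transactions
n_diapers =   7_500   # transactions containing diapers
n_beer    =  60_000   # transactions containing beer
n_both    =   6_000   # transactions containing both

support_diapers = n_diapers / n_total          # P(diapers) = 0.0125
support_beer    = n_beer / n_total             # P(beer)    = 0.10
support_both    = n_both / n_total             # P(diapers and beer) = 0.01

confidence = n_both / n_diapers                                 # P(beer | diapers) = 0.80
lift       = support_both / (support_diapers * support_beer)    # = 8.0

print(f"confidence(diapers -> beer) = {confidence:.2f}")   # 0.80
print(f"lift(diapers -> beer)       = {lift:.1f}")         # 8.0
```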
Types Of Association Rules In Data Mining

There are typically four different types of association rules in data mining:
+ Multi-relational association rules
+ Generalized association rules
+ Interval information association rules
+ Quantitative association rules

Multi-Relational Association Rules
Also known as MRAR, multi-relational association rules are a class of association rules usually derived from multi-relational databases. Each rule under this class has one entity with different relationships, which represent the indirect relationships between entities.

Generalized Association Rules
The generalized association rule is largely used for getting a rough idea of the interesting patterns that often tend to stay hidden in the data.

Quantitative Association Rules
This type is one of the most distinctive of the four. What sets it apart from the others is the presence of a numeric attribute in at least one attribute of the rule. This is in contrast to the generalized association rule, where the left and right sides consist of categorical attributes.

Algorithms Of Association Rule Mining

There are mainly three different types of algorithms that can be used to generate association rules in data mining. Let's take a look at them.

+ Apriori Algorithm
The Apriori algorithm identifies the frequent individual items in a given database and then extends them to larger item sets, checking that those item sets appear sufficiently often in the database.

+ Eclat Algorithm
The ECLAT algorithm, also described as Equivalence Class Clustering and bottom-up Lattice Traversal, is another widely used method for mining association rules. Some even consider it a better and more efficient version of the Apriori algorithm.

+ FP-growth Algorithm
Also known as the frequent-pattern growth algorithm, it is particularly useful for finding frequent patterns without the need for candidate generation. It mainly operates in two stages: FP-tree construction and extraction of frequent item sets from the tree.
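To make the contrast with Apriori's horizontal, generate-and-count style concrete, here is a minimal sketch of the vertical tid-list idea behind Eclat, reusing the five example transactions from the support-counting section. It stops at frequent 2-itemsets and is not a full implementation.

```python
# A minimal sketch of Eclat's vertical (tid-list) representation: the support of
# an itemset is the size of the intersection of its items' tid-lists. A full
# Eclat would recurse on equivalence classes; this sketch stops at 2-itemsets.
from itertools import combinations

transactions = {
    1: {"Bread", "Milk"},
    2: {"Bread", "Diapers", "Beer", "Eggs"},
    3: {"Milk", "Diapers", "Beer", "Coke"},
    4: {"Bread", "Milk", "Diapers", "Beer"},
    5: {"Bread", "Milk", "Diapers", "Coke"},
}
min_sup = 3

# Vertical layout: item -> set of transaction ids (tid-list) containing it.
tidlists = {}
for tid, items in transactions.items():
    for item in items:
        tidlists.setdefault(item, set()).add(tid)

freq_items = sorted(i for i, tids in tidlists.items() if len(tids) >= min_sup)
frequent = {(i,): tidlists[i] for i in freq_items}

# Frequent 2-itemsets come from intersecting tid-lists of frequent 1-itemsets.
for a, b in combinations(freq_items, 2):
    tids = tidlists[a] & tidlists[b]
    if len(tids) >= min_sup:
        frequent[(a, b)] = tids

for itemset, tids in frequent.items():
    print(itemset, "support =", len(tids))
```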