Data Mining Practical (1)

The document outlines practical implementations of various machine learning algorithms using Weka, including Naïve Bayes, Decision Tree, Clustering, and Apriori. Each section provides step-by-step instructions for loading datasets, applying algorithms, configuring parameters, and interpreting results. The document serves as a comprehensive guide for users to effectively utilize Weka for data analysis and machine learning tasks.

Uploaded by

Sahil Sayyad

INDEX

Sr. No   Aim                                                   Date   Sign
1        Show the implementation of Naïve Bayes algorithm.
2        Show the implementation of Decision Tree.
3        Show the implementation of Time Series Algorithm.
4        Show the implementation of Clustering Algorithm.
5        Show the implementation of Apriori Algorithm.

Practical No. 1
Aim: Show the implementation of Naïve Bayes algorithm.

Step 1: Install Weka

1. Download Weka from Weka’s official website.


2. Install the application by following the on-screen instructions.

Step 2: Load Your Dataset

1. Open Weka.
2. Click on the "Explorer" button to launch the Weka Explorer interface.
3. In the "Preprocess" tab, click on "Open file".
4. Select your dataset file (usually in .arff or .csv format) and click "Open".
Step 3: Apply the Naïve Bayes Algorithm

1. Go to the "Classify" tab.


2. Click on the "Choose" button to select a classifier.
3. Navigate to:

bayes -> NaiveBayes

Select NaiveBayes.
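
For intuition about what a Naïve Bayes classifier computes, here is a minimal Python sketch. It is not Weka's code: it assumes nominal attributes only and add-one (Laplace) smoothing, and the toy weather data and function names are hypothetical.

```python
from collections import Counter, defaultdict
import math

def train_naive_bayes(rows, labels):
    """Count class frequencies and per-attribute value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    values_seen = defaultdict(set)       # attribute index -> distinct values
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            values_seen[i].add(value)
    return class_counts, value_counts, values_seen

def predict(model, row):
    """Return the class with the highest log-posterior score."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        for i, value in enumerate(row):
            # Add-one smoothing keeps unseen values from zeroing the product.
            numerator = value_counts[(i, label)][value] + 1
            denominator = count + len(values_seen[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy data: (outlook, windy) -> play
rows = [("sunny", "no"), ("sunny", "yes"), ("rainy", "yes"),
        ("overcast", "no"), ("rainy", "no"), ("overcast", "yes")]
labels = ["yes", "no", "no", "yes", "yes", "yes"]
model = train_naive_bayes(rows, labels)
print(predict(model, ("sunny", "no")))
```

Working in log space avoids numeric underflow when many attribute probabilities are multiplied together.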
Step 4: Configure Naïve Bayes (Optional)

• To adjust settings, click the NaiveBayes classifier name after selecting it; its options will appear.
• Adjust parameters if needed, although Naïve Bayes generally requires little tuning.
Step 5: Train and Evaluate the Model

1. Choose an evaluation method:


o Use training set (evaluates on the same data).
o Cross-validation (recommended, e.g., 10-fold).
o Percentage split (e.g., 70% training, 30% testing).
2. Click "Start" to run the algorithm.
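
The difference between a percentage split and k-fold cross-validation can be sketched in Python as below. This is an illustration only, assuming a plain shuffled split; Weka may stratify folds by class, which this sketch does not do. The function names and seed are hypothetical.

```python
import random

def percentage_split(n_instances, train_fraction=0.7, seed=1):
    """Shuffle instance indices once and cut them into train and test parts."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    cut = round(n_instances * train_fraction)
    return indices[:cut], indices[cut:]

def k_fold_splits(n_instances, k=10, seed=1):
    """Return k (train, test) pairs; each instance is tested exactly once."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits

train, test = percentage_split(10)
print(len(train), len(test))

for train, test in k_fold_splits(20, k=10):
    print(len(train), len(test))
```

With cross-validation the reported accuracy is an average over the k test folds, so every instance contributes to the estimate without ever being tested by a model that was trained on it.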
Step 6: Interpret Results

• Once the process is complete, Weka displays the results:
o Classifier output: Shows the model's performance (accuracy, precision, recall, etc.).
o Confusion matrix: Shows which classes were confused with which, helping you understand classification errors.
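
To see how accuracy, precision, and recall follow from a confusion matrix, here is a small Python sketch. The label lists are made-up values for illustration and the function names are hypothetical.

```python
from collections import Counter

def confusion_matrix(actual, predicted):
    """Map (actual class, predicted class) pairs to counts."""
    return Counter(zip(actual, predicted))

def accuracy(matrix):
    """Fraction of instances whose predicted class equals the actual class."""
    correct = sum(n for (a, p), n in matrix.items() if a == p)
    return correct / sum(matrix.values())

def precision_recall(matrix, positive):
    """Precision and recall for one class treated as the positive class."""
    tp = matrix[(positive, positive)]
    fp = sum(n for (a, p), n in matrix.items() if p == positive and a != positive)
    fn = sum(n for (a, p), n in matrix.items() if a == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Made-up labels for illustration
actual = ["yes", "yes", "no", "no", "yes", "no"]
predicted = ["yes", "no", "no", "yes", "yes", "no"]
matrix = confusion_matrix(actual, predicted)
print(accuracy(matrix))
print(precision_recall(matrix, "yes"))
```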
Step 7: Save the Model (Optional)

• If you are satisfied with the model, you can save it:
o Right-click the run's entry in the "Result list" and select "Save model".
o Choose a location and name the file (usually with a .model extension).
Step 8: Visualize the Data

• The Visualize option allows you to visualize your processed data for analysis.
• The data must be clean before you analyze it. In particular, it should not contain null values.
• Raw data collected from the field often contains unwanted content, such as null fields and columns that are irrelevant to the current analysis, which leads to wrong analysis.
Practical No. 2

Aim: Show the implementation of Decision Tree.

Step 1: Load Your Dataset

1. Click "Explorer" on the main Weka interface.


2. Go to the "Preprocess" tab.
3. Click "Open file" and select your dataset (preferably in .arff or .csv format).
4. Click "Open" to load the dataset.
Step 3: Apply the Decision Tree Algorithm

1. Switch to the "Classify" tab.


2. Click "Choose" to open the list of classifiers.
3. Navigate to:

trees -> J48

o J48 is Weka's implementation of the C4.5 algorithm, commonly used for decision trees.
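
C4.5 chooses the attribute to split on by gain ratio, that is, information gain divided by split information. The Python sketch below shows that calculation for nominal attributes only. It is an illustration, not J48's code: numeric attributes, missing values, and pruning are not covered, and the toy data is hypothetical.

```python
from collections import Counter
import math

def entropy(items):
    """Shannon entropy (in bits) of a list of nominal values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(items).values())

def gain_ratio(values, labels):
    """Information gain of splitting on `values`, divided by split information."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy of the class labels after the split
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0

# Hypothetical toy data
play = ["no", "no", "yes", "yes"]
outlook = ["sunny", "sunny", "rainy", "rainy"]  # separates the classes perfectly
windy = ["yes", "no", "yes", "no"]              # tells us nothing about the class
print(gain_ratio(outlook, play))
print(gain_ratio(windy, play))
```

The attribute with the highest gain ratio becomes the test at the current node, and the procedure repeats on each branch.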
Step 4: Configure J48 (Optional)

• Click on J48 after selecting it to open the configuration window.
• Here, you can adjust parameters:
o -C (confidence factor): Controls pruning (default is 0.25); smaller values prune more.
o -M (minimum number of instances per leaf): Defines the minimum number of instances required to create a leaf (default is 2).
• Example settings:
o Confidence factor: 0.1 (more aggressive pruning).
o Minimum instances per leaf: 5 (fewer rules, more generalization).
Step 5: Train and Evaluate the Model

1. Choose an evaluation method:


o Use training set (quick, but the estimate is optimistic because the model is tested on data it has already seen).
o Cross-validation (e.g., 10-fold) – recommended for better generalization.
o Percentage split (e.g., 70% train, 30% test).
2. Click "Start" to run the classifier.

Step 6: Interpret the Results

After execution, Weka will display:

• Classifier output: Shows accuracy, precision, recall, etc.
• Confusion matrix: Displays true positives, false positives, etc.
• Decision tree structure: A readable tree showing how decisions are made (e.g., if-else conditions).
Practical No. 3
Aim: Show the implementation of Time Series Algorithm.

Step 1: Load Your Dataset

1. Click "Explorer" on the main Weka interface.


2. Go to the "Preprocess" tab.
3. Click "Open file" and select your dataset (.arff or .csv file).
4. Click "Open" to load the dataset.

Note: Ensure the time attribute (e.g., a date or index) is set correctly.


Step 2: Prepare Time Series Data

• In Weka, time series data is treated as sequential data.
• Ensure that:
o The data is sorted chronologically.
o The target variable (the one you want to forecast) is set as the class attribute (if applicable).
• Under "Fields to forecast", select the attribute you want to predict (e.g., "Year" or "Pop").
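
One common way to prepare sequential data for a supervised learner is to sort it chronologically and turn it into rows of lagged values. The Python sketch below illustrates the idea; the (year, pop) records and function names are hypothetical, and this is not a description of Weka's internals.

```python
def sort_chronologically(records):
    """Sort (time, value) records by their time field."""
    return sorted(records, key=lambda record: record[0])

def make_lagged_rows(values, n_lags=2):
    """Pair each value with the n_lags values that precede it."""
    return [(tuple(values[i - n_lags:i]), values[i])
            for i in range(n_lags, len(values))]

# Hypothetical (year, pop) records, deliberately out of order
records = [(2003, 130), (2001, 110), (2004, 145), (2002, 118)]
ordered = sort_chronologically(records)
series = [pop for _, pop in ordered]
rows = make_lagged_rows(series)
for lags, target in rows:
    print(lags, "->", target)
```

If the records are not sorted first, the lagged rows mix up past and future values, which is why chronological order matters.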
Step 3: Apply the Time Series Algorithm (ARIMA)

1. Go to the "Classify" tab (since ARIMA is a supervised model).


2. Click "Choose" to open the list of classifiers.
3. Navigate to:

timeSeries -> ARIMA
4. Fine-tune the parameters of the learning algorithm if needed.

Step 4: Configure Time Series Data (Optional)

• Click on ARIMA to open the configuration window.
• Adjust the parameters:
o p (autoregressive order): Number of lag observations included in the model.
o d (differencing order): Number of times the raw observations are differenced.
o q (moving average order): Number of lagged forecast errors included in the model.
• Example settings:
o p: 1
o d: 1
o q: 1
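
To make the roles of d and p concrete, here is a minimal Python sketch of differencing plus a one-lag autoregression. It is a simplification under stated assumptions: it covers d = 1 and p = 1 only, leaves out the moving-average (q) term, fits the coefficient by least squares without an intercept, and uses hypothetical function names and data.

```python
def difference(series, d=1):
    """Apply first-order differencing d times."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series

def fit_ar1(series):
    """Least-squares estimate of phi in x[t] = phi * x[t-1] (no intercept)."""
    numerator = sum(prev * cur for prev, cur in zip(series, series[1:]))
    denominator = sum(prev * prev for prev in series[:-1])
    return numerator / denominator if denominator else 0.0

def forecast_next(series):
    """One-step forecast with d = 1 and p = 1; the q term is omitted."""
    diffs = difference(series, 1)
    phi = fit_ar1(diffs)
    # Undo the differencing by adding the predicted change to the last value.
    return series[-1] + phi * diffs[-1]

# Hypothetical series with a steady upward trend
series = [10, 12, 14, 16, 18]
print(difference(series, 1))
print(forecast_next(series))
```

Differencing removes the trend so that the autoregressive part models the changes between observations rather than the raw levels.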

1. Visualize Predictions:
o Weka provides a graph comparing actual values and predicted values for better interpretability.
OUTPUT
Practical No. 4
Aim: Show the implementation of Clustering Algorithm.

To implement a clustering algorithm in Weka, we'll use the Explorer GUI.
Weka provides several clustering algorithms, such as k-means, EM (Expectation-
Maximization), and DBSCAN. Here's how to apply a basic clustering algorithm,
such as k-means:

Step 2: Load Your Dataset

1. Click "Explorer" on the main Weka interface.


2. Go to the "Preprocess" tab.
3. Click "Open file" and select your dataset (.arff or .csv file).
4. Click "Open" to load the dataset.

Note: For clustering, the dataset should not have a class attribute because clustering
algorithms are unsupervised.

A) K-Means
Step 3: Apply the Clustering Algorithm

1. Go to the "Cluster" tab (next to the "Classify" tab).


2. Click "Choose" to open the list of clustering algorithms.
3. Select SimpleKMeans (for k-means clustering):

clusterers -> SimpleKMeans
Step 4: Configure the Clustering Algorithm (Optional)

• Click on SimpleKMeans to open the configuration window.
• Adjust the parameters:
o -N (number of clusters): Specify the number of clusters (e.g., 3).
o -I (max iterations): Set the maximum number of iterations (default is 500).
o -S (random seed): For reproducibility.
• Example settings:
o Number of clusters: 3
o Max iterations: 100
o Seed: 10
Step 5: Run the Clustering Algorithm

1. Click "Start" to execute the algorithm.
2. Weka will cluster the data based on the specified parameters.
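
The loop that k-means runs can be sketched in Python as below. This is a minimal illustration of the standard algorithm, not SimpleKMeans itself: it assumes numeric points given as tuples, squared Euclidean distance, and random initial centroids drawn from the data. The defaults simply mirror the example settings above, and the sample points are hypothetical.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, k=3, max_iterations=100, seed=10):
    """Alternate assignment and centroid update until assignments stop changing."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids picked from the data
    assignments = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members)
                                     for dim in zip(*members))
    return centroids, assignments

# Hypothetical 2-D points forming two loose groups
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, assignments = k_means(points, k=2)
print(centroids)
print(assignments)
```

Because the starting centroids are random, fixing the seed is what makes two runs return the same clusters.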
B) Hierarchical Clustering

Hierarchical clustering is an unsupervised learning algorithm that groups together unlabeled data points with similar characteristics.

• Step 1 − Treat each data point as a single cluster. With K data points, we therefore start with K clusters.
• Step 2 − Form a bigger cluster by joining the two closest data points. This leaves K-1 clusters.
• Step 3 − Form further clusters by joining the two closest clusters. This leaves K-2 clusters.
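
The merging steps above can be sketched in Python as follows. The sketch assumes single linkage (the distance between two clusters is the distance between their closest members) and Euclidean distance; the text does not fix either choice, and the function name and data are hypothetical.

```python
import math

def agglomerative(points, target_clusters=1):
    """Repeatedly merge the two closest clusters (single linkage)."""
    clusters = [[p] for p in points]  # Step 1: every point is its own cluster
    merges = []
    while len(clusters) > target_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members
                d = min(math.dist(a, b)
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        clusters[i] = clusters[i] + clusters[j]  # join the closest pair
        del clusters[j]
    return clusters, merges

# Hypothetical 1-D points
points = [(0,), (1,), (5,), (6,), (20,)]
clusters, merges = agglomerative(points, target_clusters=2)
print(clusters)
for left, right, distance in merges:
    print(left, "+", right, "at distance", distance)
```

Each pass through the loop reduces the number of clusters by one, so starting from K points it takes K-1 merges to reach a single cluster. The recorded merge distances are the heights a dendrogram would show.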
Practical No. 5

Aim: Show the implementation of Apriori Algorithm.

The Apriori algorithm is commonly used for mining frequent itemsets and
association rule learning. Weka provides an easy-to-use interface to apply the
Apriori algorithm. Here’s a step-by-step guide on how to implement the Apriori
algorithm in Weka:

Step 1: Prepare Your Dataset

• Format: Your data should be in ARFF or CSV format. The dataset must be transactional, where each transaction contains a list of items.

Step 2: Load Dataset in Weka

1. Open Weka GUI Chooser.


2. Click on "Explorer".
3. Load your dataset by clicking "Open file" and selecting your ARFF or CSV file.
Step 3: Apply the Apriori Algorithm

1. In the Weka Explorer, go to the "Associate" tab.


2. In the "Associator" section, click "Choose" and select "Apriori".
3. Configure the parameters:
o Support (lowerBoundMinSupport): Minimum support threshold (e.g., 0.5 for 50%).
o Confidence (minMetric, with metricType set to Confidence): Minimum confidence level (e.g., 0.8 for 80%).
o Number of rules (numRules): How many rules to report (default is 10).
Step 4: Run the Algorithm

• Click "Start" to run Apriori.
• Weka will process the data and display the frequent itemsets and association rules in the output panel.
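
For reference, the level-wise search behind Apriori can be sketched in Python as below. This is an illustration under stated assumptions, not Weka's implementation: it uses one fixed minimum support rather than a range of thresholds, and the shopping-basket transactions and function names are hypothetical.

```python
from itertools import combinations

def apriori(transactions, min_support=0.5):
    """Return {frequent itemset: support}, growing itemsets one item at a time."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        survivors = {s: support(s) for s in current}
        survivors = {s: v for s, v in survivors.items() if v >= min_support}
        frequent.update(survivors)
        k += 1
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in survivors for b in survivors
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {c for c in candidates
                   if all(frozenset(s) in survivors
                          for s in combinations(c, k - 1))}
    return frequent

def association_rules(frequent, min_confidence=0.8):
    """Derive rules antecedent -> consequent that meet the confidence threshold."""
    rules = []
    for itemset, sup in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, size)):
                confidence = sup / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((set(antecedent), set(itemset - antecedent),
                                  confidence))
    return rules

# Hypothetical transactions
transactions = [{"bread", "milk"}, {"bread", "butter"},
                {"bread", "milk", "butter"}, {"milk"}]
frequent = apriori(transactions, min_support=0.5)
rules = association_rules(frequent, min_confidence=0.8)
print(frequent)
print(rules)
```

The prune step is what gives the algorithm its name: an itemset can only be frequent if all of its subsets are frequent, so most candidates are discarded without ever counting them.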
