LPDM Lab Manual

Uploaded by

Aliaa Tarek Ali

LINUX PROGRAMMING AND DATA MINING

LAB MANUAL
IV-BTECH

VID
LINUX PROGRAMMING AND DATA MINING LAB MANUAL

Page 2 ROKESH

Contents

Week 1
1. Write a shell script that accepts a file name, starting and ending line numbers
as arguments and displays all the lines between the given line numbers.
2. Write a shell script that deletes all lines containing a specified word in one or
more files supplied as arguments to it.
3. Write a shell script that displays a list of all the files in the current directory
to which the user has read, write and execute permissions.
4. Write a shell script that receives any number of file names as arguments,
checks if every argument supplied is a file or a directory and reports
accordingly. Whenever the argument is a file, the number of lines in it is also
reported.

Week 2
5. Write a shell script that accepts a list of file names as its arguments, counts
and reports the occurrence of each word that is present in the first argument file
on other argument files.
6. Write a shell script to list all of the directory files in a directory.
7. Write a shell script to find factorial of a given integer.

Week 3
8. Write an awk script to count the number of lines in a file that do not contain
vowels.
9. Write an awk script to find the number of characters, words and lines in a file.
10. Write a C program that makes a copy of a file using standard I/O and system
calls.

Week 4
11. Implement in C the following UNIX commands using system calls:
A. cat B. ls C. mv
12. Write a program that takes one or more file/directory names as command line
input and reports the following information on the file.
A. File type. B. Number of links.
C. Time of last access.
D. Read, Write and Execute permissions.

Week 5
13. Write a C program to emulate the UNIX ls -l command.
14. Write a C program to list for every file in a directory, its inode number and
file name.
15. Write a C program that demonstrates redirection of standard output to a file.
Ex: ls > f1.

Week 6
16. Write a C program to create a child process and allow the parent to display
"parent" and the child to display "child" on the screen.
17. Write a C program to create a Zombie process.
18. Write a C program that illustrates how an orphan is created.

Week 7
19. Write a C program that illustrates how to execute two commands
concurrently with a command pipe.
Ex: ls -l | sort
20. Write C programs that illustrate communication between two unrelated
processes using named pipe.
21. Write a C program to create a message queue with read and write
permissions to write 3 messages to it with different priority numbers.
22. Write a C program that receives the messages (from the above message queue
as specified in (21)) and displays them.

Week 8
23. Write a C program to allow cooperating processes to lock a resource for
exclusive use, using a) Semaphores b) flock or lockf system calls.
24. Write a C program that illustrates suspending and resuming processes using
signals.

Week 9
25. Write a C program that implements a producer-consumer system with two
processes (using Semaphores).
26. Write client and server programs (using C) for interaction between server and
client processes using Unix Domain sockets.

Week 10
27. Write client and server programs (using C) for interaction between server and
client processes using Internet Domain sockets.
28. Write a C program that illustrates two processes communicating using
shared memory.

13. Listing of categorical attributes and the real-valued attributes separately.
14. Rules for identifying attributes.
15. Training a decision tree.
16. Test on classification of decision tree.
17. Testing on the training set.
18. Using cross-validation for training.
19. Significance of attributes in decision tree.
20. Trying generation of decision tree with various number of decision tree.
21. Find out differences in results using decision tree and cross-validation on a data
set.
22. Decision trees.
23. Reduced error pruning for training Decision Trees using cross-validation.
24. Convert a Decision Tree into "if-then-else rules".


Week 1

1. Write a shell script that accepts a file name, starting and ending line numbers as arguments
and displays all the lines between the given line numbers.

Aim: To write a shell script that accepts a file name, starting and ending line numbers as
arguments and displays all the lines between the given line numbers.

Script:
# usage: sh <script> <file> <start> <end>
awk -v s="$2" -v e="$3" 'NR >= s && NR <= e {print $0}' "$1"

I/P: line1
line2
line3
line4
line5

O/P (start = 2, end = 4): line2
line3
line4

2. Write a shell script that deletes all lines containing a specified word in one or more files
supplied as arguments to it.

Aim: To write a shell script that deletes all lines containing a specified word in one or more
files supplied as arguments to it.

Script:
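if [ $# -eq 0 ]; then echo "no argument passed"; exit 1; fi
for file in "$@"
do
echo "the contents before deleting"; echo "$file"; cat "$file"
done
echo "enter the word to be deleted"
read word
for file in "$@"
do
# grep -v keeps only the lines that do not contain the word
grep -v "$word" "$file" > "$file.tmp"
mv "$file.tmp" "$file"
echo "after deleting"; cat "$file"
done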

Output:
$ sh 1b.sh test1
the contents before deleting
test1
hello
hello

bangalore
mysore city
enter the word to be deleted
city
after deleting
hello
hello
bangalore

$ sh 1b.sh
no argument passed

3. Write a shell script that displays a list of all the files in the current directory to which the
user has read, write and execute permissions.

Aim: To write a shell script that displays a list of all the files in the current directory to
which the user has read, write and execute permissions.

Script:
echo "enter the directory name"
read dir
if [ -d "$dir" ]
then
    cd "$dir"
    # loop over the entries directly instead of going through a temporary file
    for file in *
    do
        if [ -f "$file" ]
        then
            if [ -r "$file" ] && [ -w "$file" ] && [ -x "$file" ]
            then
                echo "$file has all permissions"
            else
                echo "$file does not have all permissions"
            fi
        fi
    done
fi

4. Write a shell script that receives any number of file names as arguments checks if every
argument supplied is a file or a directory and reports accordingly. Whenever the argument
is a file, the number of lines on it is also reported

Aim: To write a shell script that receives any number of file names as arguments checks if
every argument supplied is a file or a directory

Script:
for x in "$@"
do
if [ -f "$x" ]
then
echo " $x is a file "
echo " no of lines in the file are "
wc -l < "$x"
elif [ -d $x ]
then
echo " $x is a directory "
else
echo " enter valid filename or directory name "
fi
done


Week 2

5. Write a shell script that accepts a list of file names as its arguments, counts and reports the
occurrence of each word that is present in the first argument file on other argument files.

Aim : To write a shell script that accepts a list of file names as its arguments, counts and
reports the occurrence of each word that is present in the first argument file on other
argument files.

Script:
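if [ $# -lt 2 ]
then
echo "Error : Invalid number of arguments."
exit 1
fi
first=$1
shift
for a in `cat "$first"`
do
# count whole-word occurrences of $a in all the remaining argument files
count=$(cat "$@" | grep -ow "$a" | wc -l)
echo "Word = $a, Count = $count"
done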

Output :
$ cat test
hello ATRI
$ cat test1
hello ATRI
hello ATRI
hello
$ sh 1.sh test test1
Word = hello, Count = 3
Word = ATRI, Count = 2


6. Write a shell script to list all of the directory files in a directory.

Script:
#!/bin/bash
echo "enter directory name"
read dir
if [ -d "$dir" ]
then
echo "list of files in the directory"
ls "$dir"
else
echo "enter proper directory name"
fi
Output:
enter directory name
Atri
list of files in the directory
CSE.txt
ECE.txt

7. Write a shell script to find factorial of a given integer.

Script:
#!/bin/bash
echo "enter a number"
read num
n=$num
fact=1
while [ "$num" -ge 1 ]
do
fact=`expr $fact \* $num`
num=`expr $num - 1`
done
echo "factorial of $n is $fact"

Output:
Enter a number
5

Factorial of 5 is 120


Week 3

8. Write an awk script to count the number of lines in a file that do not contain vowels.
9. Write an awk script to find the number of characters, words and lines in a file.

Aim : To write an awk script to find the number of characters, words and lines in a file.

Script:
BEGIN { print "record.\t characters \t words" }
# BODY section
{
    len = length($0)
    total_len += len
    print NR, ":\t", len, ":\t", NF, $0
    words += NF
}
END {
    print "\n total"
    print "characters :\t" total_len
    print "words :\t" words
    print "lines :\t" NR
}

10. Write a c program that makes a copy of a file using standard I/O and system calls

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
int main(int argc, char *argv[]){
    int fd1, fd2;
    char buffer[100];
    long int n1;
    if(argc != 3){
        fprintf(stderr, "usage: %s source destination\n", argv[0]);
        exit(1);
    }
    if(((fd1 = open(argv[1], O_RDONLY)) == -1) ||
       ((fd2 = open(argv[2], O_CREAT|O_WRONLY|O_TRUNC,
                    0700)) == -1)){
        perror("file problem ");
        exit(1);
    }
    while((n1=read(fd1, buffer, 100)) > 0)
        if(write(fd2, buffer, n1) != n1){
            perror("writing problem ");
            exit(3);
        }
    // Case of an error exit from the loop
    if(n1 == -1){
        perror("Reading problem ");
        exit(2);
    }
    close(fd1);
    close(fd2);
    exit(0);
}


Week 4

11. Implement in C the following UNIX commands using System calls

A. cat B. ls C. mv

AIM: Implement in C the cat Unix command using system calls

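#include<stdio.h>
#include<fcntl.h>
#include<unistd.h>
#include<sys/stat.h>
int main(int argc, char **argv)
{
    int fd1;
    int n;
    char buf;
    if(argc<2)
    {
        fprintf(stderr,"usage: %s file\n",argv[0]);
        return (1);
    }
    if((fd1=open(argv[1],O_RDONLY))<0)
    {
        perror("open");
        return (1);
    }
    printf("Welcome to ATRI\n");
    while((n=read(fd1,&buf,1))>0)
    {
        printf("%c",buf);
        /* or
        write(1,&buf,1); */
    }
    close(fd1);
    return (0);
}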

AIM: Implement in C the following ls Unix command using system calls

Algorithm:
1. Start.
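2. open the directory using the opendir( ) library function.
3. read the directory using the readdir( ) library function.
4. print the name and inode number of each entry (the program below uses
scandir( ), which performs the open and read steps itself, and prints the names only).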
5. repeat above step until end of directory.
6. End
#include <sys/types.h>
#include <sys/param.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define FALSE 0
#define TRUE 1

char pathname[MAXPATHLEN];

int file_select(const struct dirent *entry);

int main(void) {
    int count,i;
    struct dirent **files;

    if (getcwd(pathname, sizeof(pathname)) == NULL )
    {
        printf("Error getting path\n");
        exit(0);
    }
    printf("Current Working Directory = %s\n",pathname);
    count = scandir(pathname, &files, file_select, alphasort);

    if (count <= 0)
    {
        printf("No files in this directory\n");
        exit(0);
    }
    printf("Number of files = %d\n",count);
    for (i=0;i<count;++i)
        printf("%s \n",files[i]->d_name);
    return 0;
}

/* scandir filter: skip the "." and ".." entries */
int file_select(const struct dirent *entry)
{
    if ((strcmp(entry->d_name, ".") == 0) ||(strcmp(entry->d_name, "..") == 0))
        return (FALSE);
    else
        return (TRUE);
}

AIM: Implement in C the Unix command mv using system calls

Algorithm:
1. Start
2. open the existing file and create the new file using the open( ) or creat( )
system call
3. read the contents from the existing file using the read( ) system
call
4. write these contents into the new file using the write( ) system
call
5. repeat the above 2 steps until eof
6. close the 2 files using the close( ) system call
7. delete the existing file using the unlink( ) system call
8. End.

Program:
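#include<fcntl.h>
#include<stdio.h>
#include<unistd.h>
#include<sys/stat.h>
int main(int argc, char **argv)
{
    int fd1,fd2;
    int n;
    char buf[1024];
    if(argc!=3)
    {
        fprintf(stderr,"usage: %s source destination\n",argv[0]);
        return (1);
    }
    fd1=open(argv[1],O_RDONLY);
    fd2=creat(argv[2],S_IRUSR|S_IWUSR);
    if(fd1<0||fd2<0)
    {
        perror("open");
        return (1);
    }
    /* copy the contents, then remove the old name */
    while((n=read(fd1,buf,sizeof(buf)))>0)
        write(fd2,buf,n);
    close(fd1);
    close(fd2);
    unlink(argv[1]);
    printf("file is moved\n");
    return (0);
}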

12. Write a program that takes one or more file/directory names as command line input and
reports the following information on the file.
A. File type. B. Number of links.
C. Time of last access. D. Read, Write and Execute permissions.
#include<stdio.h>
#include<stdlib.h>
int main()
{
    FILE *stream;
    stream=fopen("test","r");
    if(stream==(FILE*)0)
    {
        fprintf(stderr,"Error opening file(printed to standard error)\n");
        exit(1);
    }
    if(fclose(stream)==EOF)
    {
        fprintf(stderr,"Error closing stream.(printed to standard error)\n");
        exit(1);
    }
    return 0;
}


Week 5

13. Write a C program to emulate the UNIX ls –l command.

ALGORITHM :

Step 1: Include necessary header files for manipulating directory.

Step 2: Declare and initialize required objects.
Step 3: Read the directory name from the user.
Step 4: Open the directory using the opendir() library function and report an error if the directory is not
available.
Step 5: Read the next entry available in the directory.
Step 6: Display the directory entry, i.e., the name of the file or subdirectory.
Step 7: Repeat steps 5 and 6 until all the entries have been read.

/* 1. Simulation of ls command */
#include<stdio.h>
#include<stdlib.h>
#include<dirent.h>
int main()
{
    char dirname[256];
    DIR *p;
    struct dirent *d;
    printf("Enter directory name ");
    scanf("%255s",dirname);
    p=opendir(dirname);
    if(p==NULL)
    {
        perror("Cannot find dir.");
        exit(-1);
    }
    while((d=readdir(p))!=NULL)
        printf("%s\n",d->d_name);
    closedir(p);
    return 0;
}

SAMPLE OUTPUT:

enter directory name iii

...

14. Write a C program to list for every file in a directory, its inode number and file name.

The Dirent structure contains the inode number and the name. The maximum length of a
filename component is NAME_MAX, which is a system-dependent value. opendir returns a
pointer to a structure called DIR, analogous to FILE, which is used by readdir and closedir. This
information is collected into a file called dirent.h.

#define NAME_MAX 14 /* longest filename component; */
                    /* system-dependent */

typedef struct {    /* portable directory entry */
    long ino;                 /* inode number */
    char name[NAME_MAX+1];    /* name + '\0' terminator */
} Dirent;

typedef struct {    /* minimal DIR: no buffering, etc. */
    int fd;         /* file descriptor for the directory */
    Dirent d;       /* the directory entry */
} DIR;

DIR *opendir(char *dirname);
Dirent *readdir(DIR *dfd);
void closedir(DIR *dfd);

The system call stat takes a filename and returns all of the information in the inode for that file,
or -1 if there is an error. That is,

char *name;
struct stat stbuf;
int stat(char *, struct stat *);

stat(name, &stbuf);

fills the structure stbuf with the inode information for the file name. The structure describing the
value returned by stat is in <sys/stat.h>, and typically looks like this:

struct stat /* inode information returned by stat */
{
    dev_t st_dev;     /* device of inode */
    ino_t st_ino;     /* inode number */
    short st_mode;    /* mode bits */
    short st_nlink;   /* number of links to file */
    short st_uid;     /* owners user id */
    short st_gid;     /* owners group id */
    dev_t st_rdev;    /* for special files */
    off_t st_size;    /* file size in characters */
    time_t st_atime;  /* time last accessed */
    time_t st_mtime;  /* time last modified */
    time_t st_ctime;  /* time of last inode (status) change */
};

Most of these values are explained by the comment fields. The types like dev_t and ino_t are
defined in <sys/types.h>, which must be included too.

The st_mode entry contains a set of flags describing the file. The flag definitions are also
included in <sys/stat.h>; we need only the part that deals with file type:

#define S_IFMT 0160000 /* type of file: */
#define S_IFDIR 0040000 /* directory */
#define S_IFCHR 0020000 /* character special */
#define S_IFBLK 0060000 /* block special */
#define S_IFREG 0010000 /* regular */
/* ... */

Now we are ready to write the program fsize. If the mode obtained from stat indicates that a file
is not a directory, then the size is at hand and can be printed directly. If the name is a directory,
however, then we have to process that directory one file at a time; it may in turn contain sub-
directories, so the process is recursive.

The main routine deals with command-line arguments; it hands each argument to the
function fsize.

#include <stdio.h>
#include <string.h>
#include "syscalls.h"
#include <fcntl.h>      /* flags for read and write */
#include <sys/types.h>  /* typedefs */
#include <sys/stat.h>   /* structure returned by stat */
#include "dirent.h"

void fsize(char *);

/* print file name */
int main(int argc, char **argv)
{
    if (argc == 1)  /* default: current directory */
        fsize(".");
    else
        while (--argc > 0)
            fsize(*++argv);
    return 0;
}

The function fsize prints the size of the file. If the file is a directory, however, fsize first
calls dirwalk to handle all the files in it. Note how the flag names S_IFMT and S_IFDIR are used
to decide if the file is a directory. Parenthesization matters, because the precedence of & is lower
than that of ==.

int stat(char *, struct stat *);
void dirwalk(char *, void (*fcn)(char *));

/* fsize: print the name of file "name" */
void fsize(char *name)
{
    struct stat stbuf;

    if (stat(name, &stbuf) == -1) {
        fprintf(stderr, "fsize: can't access %s\n", name);
        return;
    }
    if ((stbuf.st_mode & S_IFMT) == S_IFDIR)
        dirwalk(name, fsize);
    printf("%8ld %s\n", stbuf.st_size, name);
}

The function dirwalk is a general routine that applies a function to each file in a directory. It
opens the directory, loops through the files in it, calling the function on each, then closes the
directory and returns. Since fsize calls dirwalk on each directory, the two functions call each
other recursively.

#define MAX_PATH 1024

/* dirwalk: apply fcn to all files in dir */
void dirwalk(char *dir, void (*fcn)(char *))
{
    char name[MAX_PATH];
    Dirent *dp;
    DIR *dfd;

    if ((dfd = opendir(dir)) == NULL) {
        fprintf(stderr, "dirwalk: can't open %s\n", dir);
        return;
    }
    while ((dp = readdir(dfd)) != NULL) {
        if (strcmp(dp->name, ".") == 0
            || strcmp(dp->name, "..") == 0)
            continue; /* skip self and parent */
        if (strlen(dir)+strlen(dp->name)+2 > sizeof(name))
            fprintf(stderr, "dirwalk: name %s %s too long\n",
                    dir, dp->name);
        else {
            sprintf(name, "%s/%s", dir, dp->name);
            (*fcn)(name);
        }
    }
    closedir(dfd);
}

Each call to readdir returns a pointer to information for the next file, or NULL when there are no
files left. Each directory always contains entries for itself, called ".", and its parent, ".."; these
must be skipped, or the program will loop forever.

Down to this last level, the code is independent of how directories are formatted. The next step is
to present minimal versions of opendir, readdir, and closedir for a specific system. The following
routines are for Version 7 and System V UNIX systems; they use the directory information in the
header<sys/dir.h>, which looks like this:

#ifndef DIRSIZ
#define DIRSIZ 14
#endif

struct direct { /* directory entry */
    ino_t d_ino;          /* inode number */
    char d_name[DIRSIZ];  /* long name does not have '\0' */
};

Some versions of the system permit much longer names and have a more complicated directory
structure.

The type ino_t is a typedef that describes the index into the inode list. It happens to be unsigned
short on the systems we use regularly, but this is not the sort of information to embed in a
program; it might be different on a different system, so the typedef is better. A complete set of
``system'' types is found in <sys/types.h>.

opendir opens the directory, verifies that the file is a directory (this time by the system call fstat,
which is like stat except that it applies to a file descriptor), allocates a directory structure, and
records the information:

int fstat(int fd, struct stat *);

/* opendir: open a directory for readdir calls */
DIR *opendir(char *dirname)
{
    int fd;
    struct stat stbuf;
    DIR *dp;

    if ((fd = open(dirname, O_RDONLY, 0)) == -1
        || fstat(fd, &stbuf) == -1
        || (stbuf.st_mode & S_IFMT) != S_IFDIR
        || (dp = (DIR *) malloc(sizeof(DIR))) == NULL)
        return NULL;
    dp->fd = fd;
    return dp;
}

closedir closes the directory file and frees the space:

/* closedir: close directory opened by opendir */
void closedir(DIR *dp)
{
    if (dp) {
        close(dp->fd);
        free(dp);
    }
}

Finally, readdir uses read to read each directory entry. If a directory slot is not currently in use
(because a file has been removed), the inode number is zero, and this position is skipped.
Otherwise, the inode number and name are placed in a static structure and a pointer to that is
returned to the user. Each call overwrites the information from the previous one.

#include <sys/dir.h> /* local directory structure */

/* readdir: read directory entries in sequence */
Dirent *readdir(DIR *dp)
{
    struct direct dirbuf; /* local directory structure */
    static Dirent d;      /* return: portable structure */

    while (read(dp->fd, (char *) &dirbuf, sizeof(dirbuf))
           == sizeof(dirbuf)) {
        if (dirbuf.d_ino == 0) /* slot not in use */
            continue;
        d.ino = dirbuf.d_ino;
        strncpy(d.name, dirbuf.d_name, DIRSIZ);
        d.name[DIRSIZ] = '\0'; /* ensure termination */
        return &d;
    }
    return NULL;
}
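
The routines above show how opendir, readdir and closedir can be built on a Version 7 style file system. On a current Linux system the same exercise is usually written against the library versions declared in <dirent.h>. The following is a minimal sketch under that assumption; list_inodes is a hypothetical name used only for illustration.

```c
#include <dirent.h>
#include <stdio.h>

/* Print "inode  name" for every entry in dirpath.
   Returns the number of entries printed, or -1 if the directory
   cannot be opened. (Hypothetical helper.) */
int list_inodes(const char *dirpath)
{
    DIR *dp = opendir(dirpath);
    struct dirent *entry;
    int count = 0;

    if (dp == NULL) {
        perror(dirpath);
        return -1;
    }
    while ((entry = readdir(dp)) != NULL) {
        /* d_ino is the inode number, d_name the file name */
        printf("%10lu  %s\n", (unsigned long)entry->d_ino, entry->d_name);
        count++;
    }
    closedir(dp);
    return count;
}
```

A main() could call list_inodes(".") or pass a directory name taken from the command line.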

15. Write a C program that demonstrates redirection of standard output to a file.

Ex: ls > f1.
Description:

An Inode number points to an Inode. An Inode is a data structure that stores the following
information about a file:
- Size of file
- Device ID
- User ID of the file
- Group ID of the file
- The file mode information and access privileges for owner, group and others
- File protection flags
- The timestamps for file creation, modification etc.
- Link counter to determine the number of hard links
- Pointers to the blocks storing the file's contents
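
Exercise 15 asks for redirection of standard output, for which no program is listed here. One common approach, sketched below as an assumption rather than as the manual's own answer, is to open the target file and copy its descriptor onto descriptor 1 with dup2(). The names redirect_stdout and restore_stdout are hypothetical.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Point file descriptor 1 (standard output) at path.
   Returns a duplicate of the old stdout so that it can be restored
   later, or -1 on error. */
int redirect_stdout(const char *path)
{
    int saved, fd;

    fflush(stdout);                 /* flush text buffered for the old target */
    saved = dup(STDOUT_FILENO);
    if (saved == -1)
        return -1;
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) {
        close(saved);
        return -1;
    }
    if (dup2(fd, STDOUT_FILENO) == -1) {
        close(fd);
        close(saved);
        return -1;
    }
    close(fd);                      /* descriptor 1 now refers to the file */
    return saved;
}

/* Undo redirect_stdout() using the descriptor it returned. */
void restore_stdout(int saved)
{
    fflush(stdout);                 /* push pending output into the file */
    dup2(saved, STDOUT_FILENO);
    close(saved);
}
```

To imitate ls > f1, a program could call redirect_stdout("f1") and then execlp("ls", "ls", (char *)NULL); the new program inherits descriptor 1 and writes into the file.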


Week 6

16. Write a C program to create a child process and allow the parent to display “parent” and the
child to display “child” on the screen.
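#include<stdio.h>
#include<stdlib.h>
#include<unistd.h>
#include<sys/wait.h>
int main()
{
    int childpid;

    if (( childpid=fork())<0)
    {
        printf("cannot fork");
        exit(1);
    }
    else if(childpid >0)
    {
        printf("Parent process\n");
        wait(NULL);   /* collect the child so that it does not stay a zombie */
    }
    else
        printf("Child process\n");
    return 0;
}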

17. Write a C program to create a Zombie process.

If a child terminates before its parent and the parent has not yet called wait() to collect
its exit status, the terminated child remains in the process table and is called a zombie process.

#include<stdio.h>
#include<stdlib.h>
#include<unistd.h>
int main()
{
    int childpid;

    if (( childpid=fork())<0)
    {
        printf("cannot fork");
    }
    else if(childpid >0)
    {
        /* parent: sleeps without calling wait(), so the exited child stays a zombie */
        sleep(100);
        printf("parent process\n");
    }
    else
    {
        printf("child process\n");
        exit(0);
    }
    return 0;
}
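
For contrast with the zombie program, the sketch below shows the usual way a parent avoids leaving a zombie: it collects the child's exit status with waitpid(). This is an illustration only; spawn_and_reap is a hypothetical name.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits with exit_code; the parent reaps it.
   Returns the child's exit status, or -1 on failure. */
int spawn_and_reap(int exit_code)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(exit_code);           /* child: terminate immediately */

    /* Until this waitpid() returns, the exited child is a zombie. */
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```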

18. Write a C program that illustrates how an orphan is created.

#include<stdio.h>
#include<unistd.h>
int main()
{
int id;
printf("Before fork()\n");
id=fork();

if(id==0)
{
printf("Child has started: %d\n ",getpid());
printf("Parent of this child : %d\n",getppid());
printf("child prints 1 item :\n ");
sleep(25);
printf("child prints 2 item :\n");
}
else
{
printf("Parent has started: %d\n",getpid());
printf("Parent of the parent proc : %d\n",getppid());
}

printf("After fork()");
}


Week 7

19. Write a C program that illustrates how to execute two commands concurrently with a
command pipe.
Ex: - ls –l | sort

AIM: Implementing Pipes

DESCRIPTION:

A pipe is created by calling the pipe() function.

int pipe(int filedesc[2]);
It returns a pair of file descriptors: filedesc[0] is open for reading and filedesc[1] is
open for writing. The function returns 0 on success and -1 on error.

ALGORITHM:

The following is the simple algorithm for creating, writing to and reading from a
pipe.
1) Create a pipe through a pipe() function call.
2) Use the write() function to write data into the pipe. The syntax is as follows
write(filedesc[1], ip_string, size);

filedesc[1] - the write end of the pipe, where int filedesc[2] is the array
filled in by pipe().

ip_string - The string to be written to the pipe.

size - the number of bytes to write.

3) Use the read() function to read the data that has been written to the pipe.
The syntax is as follows
read(filedesc[0], buffer, size);

PROGRAM:

#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#include<unistd.h>
#include<fcntl.h>
#include<sys/wait.h>
void client(int readfd,int writefd);
void server(int readfd,int writefd);
int main()
{
    int pipe1[2],pipe2[2],childpid;
    if(pipe(pipe1)<0 || pipe(pipe2) < 0)
    {
        printf("pipe creation error");
        exit(1);
    }
    if (( childpid=fork())<0)
    {
        printf("cannot fork");
    }
    else
    if(childpid >0)
    {
        /* parent: writes the file name on pipe1, reads the contents from pipe2 */
        close(pipe1[0]);
        close(pipe2[1]);
        client(pipe2[0],pipe1[1]);
        while (wait((int *) 0 ) !=childpid);
        close(pipe1[1]);
        close(pipe2[0]);
        exit(0);
    }
    else
    {
        /* child: reads the file name from pipe1, writes the contents on pipe2 */
        close(pipe1[1]);
        close(pipe2[0]);
        server(pipe1[0],pipe2[1]);
        close(pipe1[0]);
        close(pipe2[1]);
        exit(0);
    }
    return 0;
}
void client(int readfd,int writefd)
{
    int n;
    char buff[1024];
    if(fgets(buff,1024,stdin)==NULL)
    {
        printf("file name read error");
        buff[0]='\0';
    }
    n=strlen(buff);
    if(n>0 && buff[n-1]=='\n')
        n--;
    if(write(writefd,buff,n)!=n)
        printf("file name write error");
    while((n=read(readfd,buff,1024))>0)
        if(write(1,buff,n)!=n)
            printf("data write error");
    if(n<0)
        printf("data error");
}
void server(int readfd,int writefd)
{
    char buff[1024];
    int n,fd;
    n=read(readfd,buff,1023);   /* leave room for the terminator */
    if(n<0)
        n=0;
    buff[n]='\0';
    if((fd=open(buff,O_RDONLY))<0)
    {
        strcpy(buff,"file does not exist");
        write(writefd,buff,strlen(buff));
    }
    else
    {
        while((n=read(fd,buff,1024))>0)
            write(writefd,buff,n);
        close(fd);
    }
}
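
The program above passes a file name and its contents between a parent and a child. The exercise's own example, ls -l | sort, needs two commands connected by one pipe. The following is a minimal sketch of that arrangement using pipe(), fork(), dup2() and execvp(); run_pipeline is a hypothetical name, and the sketch assumes both commands are found on PATH.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run "left | right": left's stdout feeds right's stdin through a pipe.
   Returns the exit status of right, or -1 on failure. */
int run_pipeline(char *const left[], char *const right[])
{
    int fd[2], status;
    pid_t p1, p2;

    if (pipe(fd) == -1)
        return -1;

    p1 = fork();
    if (p1 == 0) {                  /* first child: the writer */
        dup2(fd[1], STDOUT_FILENO);
        close(fd[0]);
        close(fd[1]);
        execvp(left[0], left);
        _exit(127);                 /* reached only if exec fails */
    }
    p2 = (p1 > 0) ? fork() : -1;
    if (p2 == 0) {                  /* second child: the reader */
        dup2(fd[0], STDIN_FILENO);
        close(fd[0]);
        close(fd[1]);
        execvp(right[0], right);
        _exit(127);
    }

    /* The parent must close both ends, or right never sees end-of-file. */
    close(fd[0]);
    close(fd[1]);
    if (p1 > 0)
        waitpid(p1, NULL, 0);
    if (p2 < 0 || waitpid(p2, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With char *left[] = {"ls", "-l", NULL} and char *right[] = {"sort", NULL}, run_pipeline(left, right) runs the two commands concurrently, which is the behaviour the exercise describes.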

20. Write C programs that illustrate communication between two unrelated processes using
named pipe.

AIM: Implementing IPC using a FIFO (or) named pipe.

DESCRIPTION:

Another kind of IPC is the FIFO (First In First Out), sometimes also called a named
pipe. It is like a pipe, except that it has a name. Here the name is that of a file that multiple
processes can open(), read and write to. A FIFO is created using the mknod() system call.
The syntax is as follows

int mknod(char *pathname, int mode, int dev);

The pathname is a normal Unix pathname, and this is the name of the FIFO.

The mode argument specifies the file type (S_IFIFO for a FIFO) and the access permissions.
The dev value is ignored for a FIFO.

Once a FIFO is created, it must be opened for reading (or) writing using either the open
system call, or one of the standard I/O open functions-fopen, or freopen.

ALGORITHM:

The following is a simple algorithm for creating, writing to and reading from a
FIFO.

1) Create a fifo through the mknod() function call.

2) Open the fifo and use the write() function to write data into it. The syntax is as follows
write(wfd, ip_string, size);

wfd - the file descriptor returned by open() when the fifo is opened for writing.

ip_string - The string to be written to the fifo.

size - the number of bytes to write.

3) Use the read() function to read the data that has been written to the fifo.
The syntax is as follows

read(rfd, buffer, size);

PROGRAM:

#define FIFO1 "Fifo1"
#define FIFO2 "Fifo2"
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#include<unistd.h>
#include<sys/types.h>
#include<fcntl.h>
#include<sys/stat.h>
#include<sys/wait.h>
void client(int readfd,int writefd);
void server(int readfd,int writefd);
int main()
{
    int childpid,wfd,rfd;
    mknod(FIFO1,0666|S_IFIFO,0);
    mknod(FIFO2,0666|S_IFIFO,0);
    if (( childpid=fork())==-1)
    {
        printf("cannot fork");
    }
    else
    if(childpid >0)
    {
        wfd=open(FIFO1,O_WRONLY);
        rfd=open(FIFO2,O_RDONLY);
        client(rfd,wfd);
        while (wait((int *) 0 ) !=childpid);
        close(rfd);
        close(wfd);
        unlink(FIFO1);
        unlink(FIFO2);
    }
    else
    {
        rfd=open(FIFO1,O_RDONLY);
        wfd=open(FIFO2,O_WRONLY);
        server(rfd,wfd);
        close(rfd);
        close(wfd);
    }
    return 0;
}
void client(int readfd,int writefd)
{
    int n;
    char buff[1024];
    printf ("enter a file name");
    fflush(stdout);
    if(fgets(buff,1024,stdin)==NULL)
    {
        printf("file name read error");
        buff[0]='\0';
    }
    n=strlen(buff);
    if(n>0 && buff[n-1]=='\n')
        n--;
    if(write(writefd,buff,n)!=n)
        printf("file name write error");
    while((n=read(readfd,buff,1024))>0)
        if(write(1,buff,n)!=n)
            printf("data write error");
    if(n<0)
        printf("data error");
}
void server(int readfd,int writefd)
{
    char buff[1024];
    int n,fd;
    n=read(readfd,buff,1023);   /* leave room for the terminator */
    if(n<0)
        n=0;
    buff[n]='\0';
    if((fd=open(buff,O_RDONLY))<0)
    {
        strcpy(buff,"file does not exist");
        write(writefd,buff,strlen(buff));
    }
    else
    {
        while((n=read(fd,buff,1024))>0)
            write(writefd,buff,n);
        close(fd);
    }
}
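
As a smaller illustration of the same idea, the sketch below sends one message through a single FIFO. It uses mkfifo(), the library wrapper commonly used instead of mknod() for this purpose, and it forks a writer only to stay self-contained; two unrelated programs would simply open the same pathname. fifo_roundtrip is a hypothetical name.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Send msg through a FIFO at path: a child writes, the caller reads.
   Returns the number of bytes placed in out, or -1 on failure. */
ssize_t fifo_roundtrip(const char *path, const char *msg,
                       char *out, size_t outsize)
{
    pid_t pid;
    int fd;
    ssize_t n = -1;

    if (outsize == 0 || mkfifo(path, 0666) == -1)
        return -1;

    pid = fork();
    if (pid < 0) {
        unlink(path);
        return -1;
    }
    if (pid == 0) {                     /* writer process */
        fd = open(path, O_WRONLY);      /* blocks until a reader opens */
        if (fd == -1 || write(fd, msg, strlen(msg)) < 0)
            _exit(1);
        close(fd);
        _exit(0);
    }

    fd = open(path, O_RDONLY);          /* blocks until the writer opens */
    if (fd != -1) {
        n = read(fd, out, outsize - 1);
        if (n >= 0)
            out[n] = '\0';
        close(fd);
    }
    waitpid(pid, NULL, 0);
    unlink(path);                       /* remove the FIFO's name */
    return n;
}
```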

21. Write a C program to create a message queue with read and write permissions to write 3
messages to it with different priority numbers.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#define MAX 255
struct mesg
{
    long type;
    char mtext[MAX];
} *mesg;
char buff[MAX];
int main()
{
    int mid,fd,n,count=0;
    if((mid=msgget(1006,IPC_CREAT | 0666))<0)
    {
        printf("\n Can't create Message Q");
        exit(1);
    }
    printf("\n Queue id:%d", mid);

    mesg=(struct mesg *)malloc(sizeof(struct mesg));
    mesg->type=6;
    fd=open("fact",O_RDONLY);
    while((n=read(fd,buff,25))>0)
    {
        buff[n]='\0';   /* read() does not terminate the string */
        strcpy(mesg->mtext,buff);
        if(msgsnd(mid,mesg,strlen(mesg->mtext),0)== -1)
            printf("\n Message Write Error");
    }
    close(fd);

    if((mid=msgget(1006,0))<0)
    {
        printf("\n Can't open Message Q");
        exit(1);
    }
    while((n=msgrcv(mid,mesg,MAX,6,IPC_NOWAIT))>0)
    {
        write(1,mesg->mtext,n);
        count++;
    }
    if((n== -1)&&(count==0))
        printf("\n No Message on Queue:%d",mid);
    return 0;
}
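
The listing above sends every message with the same type (6). Exercise 21 asks for three messages with different priority numbers; with System V queues the message type is commonly used for that purpose. The sketch below is one possible reading of the exercise, not the manual's own answer: it uses a private queue (IPC_PRIVATE) instead of key 1006, and the names demo_msg and send_and_receive_by_priority are hypothetical.

```c
#include <string.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/types.h>

struct demo_msg {
    long mtype;          /* message type, used here as the priority number */
    char mtext[64];
};

/* Send three messages with types 3, 1 and 2 to a private queue, then
   receive them lowest type first. order[] receives the three types in
   the order they were delivered.
   Returns 0 on success, -1 if any queue operation fails. */
int send_and_receive_by_priority(long order[3])
{
    static const long types[3] = {3, 1, 2};
    struct demo_msg m;
    int qid, i, rc = 0;

    qid = msgget(IPC_PRIVATE, IPC_CREAT | 0666);   /* read and write permissions */
    if (qid == -1)
        return -1;

    for (i = 0; i < 3 && rc == 0; i++) {
        m.mtype = types[i];
        strcpy(m.mtext, "message");
        if (msgsnd(qid, &m, strlen(m.mtext) + 1, 0) == -1)
            rc = -1;
    }
    /* A negative msgtyp asks for the lowest type <= |msgtyp| on the queue. */
    for (i = 0; i < 3 && rc == 0; i++) {
        if (msgrcv(qid, &m, sizeof m.mtext, -3, IPC_NOWAIT) == -1)
            rc = -1;
        else
            order[i] = m.mtype;
    }
    msgctl(qid, IPC_RMID, NULL);                   /* remove the queue */
    return rc;
}
```

Passing a positive type to msgrcv(), as the listing above does with 6, selects only messages of exactly that type.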

22. Write a C program that receives the messages (from the above message queue as specified
in (21)) and displays them.

Aim: To create a message queue

DESCRIPTION:

Message passing between processes is provided by the operating system through message
queues. Messages are stored in the kernel and are associated with a message queue
identifier (“msqid”). Processes read and write messages on an arbitrary queue: a process can
write a message to a queue and exit, and another process can read it at a later time.

ALGORITHM:

Before the message queue structures can be used, the ipc_perm structure must be defined,
which is done by including the following files.

#include <sys/types.h>
#include <sys/ipc.h>
The kernel maintains a structure of information for each message queue; it contains the following.
struct msqid_ds{
    struct ipc_perm msg_perm; /*operation permission*/
    struct msg *msg_first;    /*ptr to first msg on queue*/
    struct msg *msg_last;     /*ptr to last msg on queue*/
    ushort msg_cbytes;        /*current bytes on queue*/
    ushort msg_qnum;          /*current no of msgs on queue*/
    ushort msg_qbytes;        /*max no of bytes on queue*/
    ushort msg_lspid;         /*pid of last msg send*/
    ushort msg_lrpid;         /*pid of last msg recvd*/
    time_t msg_stime;         /*time of last msg snd*/
    time_t msg_rtime;         /*time of last msg rcv*/
    time_t msg_ctime;         /*time of last msg ctl*/
};
To create a new message queue or access an existing message queue, the “msgget()” function is used.
Syntax:
int msgget(key_t key, int msgflag);
Msg flag values
Num val    Symb value    Desc
0400       MSG_R         Read by owner
0200       MSG_W         Write by owner
0040       MSG_R >> 3    Read by group
0020       MSG_W >> 3    Write by group

msgget returns the msqid, or -1 on error.

1. To put a message on the queue, the “msgsnd()” function is used.

Syntax:
int msgsnd(int msqid, struct msgbuf *ptr, int length, int flag);

msqid is the message queue id, a unique id.

ptr points to the actual content to send, a structure which contains the following
struct msgbuf
{
    long mtype;    /*message type >0 */
    char mtext[1]; /*data*/
};
length is the size of the message data in bytes.

Page 38 ROKESH
LINUX PROGRAMMING AND DATA MINING LAB MANUAL

flag is
- IPC_NOWAIT which allows sys call to return immediately when no room on queue,
when this is specified msgsnd will return -1 if no room on queue.
Else flag can be specified as 0
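2. To receive a message, the msgrcv() function is used.

Syntax:
int msgrcv(int msqid, struct msgbuf *ptr, int length, long msgtype, int flag);

ptr is a pointer to the structure where the received message is to be stored.

length is the size of the data portion of the area pointed to by ptr.

msgtype selects the message to receive: 0 requests the first message on the queue, and a value
greater than 0 requests the first message of that type.

flag can include MSG_NOERROR. If the data portion of the received message is greater than length,
the message is truncated and returned when MSG_NOERROR is set; without it, msgrcv() returns an
error. flag can also include IPC_NOWAIT, so that msgrcv() returns -1 immediately when no
message is available.

3. A variety of control operations on a message queue can be done through the msgctl() function.

Syntax:
int msgctl(int msqid, int cmd, struct msqid_ds *buff);

A cmd of IPC_RMID removes the message queue from the system.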

Let us create a header file msgq.h with the following in it.

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <errno.h>      /* declares errno */

#define MKEY1 1234L     /* key of message queue 1 */
#define MKEY2 2345L     /* key of message queue 2 */
#define PERMS 0666      /* read and write permission for everyone */

Server operation algorithm:

/* err_sys() and server() are not defined in this manual; they are assumed to be provided elsewhere. */
#include <stdlib.h>
#include "msgq.h"

int main(void)
{
    int readid, writeid;

    /* create the two queues if they do not exist yet */
    if ((readid = msgget(MKEY1, PERMS | IPC_CREAT)) < 0)
        err_sys("Server: can't get message queue 1");
    if ((writeid = msgget(MKEY2, PERMS | IPC_CREAT)) < 0)
        err_sys("Server: can't get message queue 2");

    server(readid, writeid);
    exit(0);
}

Client process:

/* err_sys() and client() are not defined in this manual; they are assumed to be provided elsewhere. */
#include <stdlib.h>
#include "msgq.h"

int main(void)
{
    int readid, writeid;

    /* open the queues, which the server has already created */
    if ((writeid = msgget(MKEY1, 0)) < 0)
        err_sys("Client: can't msgget message queue 1");
    if ((readid = msgget(MKEY2, 0)) < 0)
        err_sys("Client: can't msgget message queue 2");

    client(readid, writeid);

    /* delete the message queues */
    if (msgctl(readid, IPC_RMID, (struct msqid_ds *) 0) < 0)
        err_sys("Client: can't RMID message queue 2");
    if (msgctl(writeid, IPC_RMID, (struct msqid_ds *) 0) < 0)
        err_sys("Client: can't RMID message queue 1");

    exit(0);
}
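
The calls described above (msgget(), msgsnd(), msgrcv() and msgctl()) can be combined as in the
following minimal sketch. It is not the manual's own program: to stay self-contained it sends and
receives within one process on a private queue (IPC_PRIVATE) instead of using MKEY1 and MKEY2.
The names queue_round_trip, struct text_msg and MAX_TEXT are hypothetical and chosen only for
this illustration.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

#define MAX_TEXT 128   /* hypothetical limit on the message data */

/* Application-defined message layout: a type followed by the data. */
struct text_msg {
    long mtype;              /* message type, must be > 0 */
    char mtext[MAX_TEXT];    /* message data */
};

/*
 * Send `text` on a private queue, receive it back into `out`, display it,
 * and remove the queue. Returns 0 on success, -1 on error.
 */
int queue_round_trip(const char *text, char *out, size_t outsz)
{
    struct text_msg snd, rcv;
    int msqid;
    ssize_t n;

    if (outsz == 0)
        return -1;

    msqid = msgget(IPC_PRIVATE, 0600 | IPC_CREAT);
    if (msqid == -1) { perror("msgget"); return -1; }

    snd.mtype = 1;
    strncpy(snd.mtext, text, MAX_TEXT - 1);
    snd.mtext[MAX_TEXT - 1] = '\0';

    /* the length excludes mtype and includes the terminating '\0' */
    if (msgsnd(msqid, &snd, strlen(snd.mtext) + 1, 0) == -1) {
        perror("msgsnd");
        msgctl(msqid, IPC_RMID, NULL);
        return -1;
    }

    /* msgtype 0: take the first message on the queue, whatever its type */
    n = msgrcv(msqid, &rcv, MAX_TEXT, 0, 0);
    if (n == -1) {
        perror("msgrcv");
        msgctl(msqid, IPC_RMID, NULL);
        return -1;
    }

    printf("Received message of type %ld: %s\n", rcv.mtype, rcv.mtext);
    strncpy(out, rcv.mtext, outsz - 1);
    out[outsz - 1] = '\0';

    /* remove the queue so that it does not linger in the kernel */
    if (msgctl(msqid, IPC_RMID, NULL) == -1) { perror("msgctl"); return -1; }
    return 0;
}
```

A receiver for the queue of program (21) would keep only the msgget() and msgrcv() part, open
the queue with the same key as the sender, and call msgrcv() in a loop.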

Week 8

23. Write a C program to allow cooperating processes to lock a resource for exclusive use,
using a) Semaphores b) flock or lockf system calls.

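PROGRAM:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* on Linux the calling program must define union semun itself */
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

int main(void)
{
    key_t key;
    int semid;
    union semun arg;

    /* the file passed to ftok() must exist */
    if ((key = ftok("semdemo.c", 'J')) == -1)
    {
        perror("ftok");
        exit(1);
    }
    /* create a set containing one semaphore */
    if ((semid = semget(key, 1, 0666 | IPC_CREAT)) == -1)
    {
        perror("semget");
        exit(1);
    }
    /* initialize the semaphore to 1, meaning the resource is free */
    arg.val = 1;
    if (semctl(semid, 0, SETVAL, arg) == -1)
    {
        perror("semctl");
        exit(1);
    }
    return 0;
}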

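OUTPUT:
The program prints nothing when every call succeeds. If a call fails, perror() reports the name of
the failing call (ftok, semget or semctl) followed by the reason.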
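
Part b) of the question asks for locking with flock or lockf. A minimal sketch with flock() is
given below, under the assumption that the cooperating processes agree on a lock file; the
function name with_exclusive_lock and the lock file path passed to it are hypothetical.

```c
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/file.h>

/*
 * Take an exclusive lock on `path`, run the critical section, and release
 * the lock. Returns 0 on success, -1 on error.
 */
int with_exclusive_lock(const char *path)
{
    int fd = open(path, O_CREAT | O_RDWR, 0644);
    if (fd == -1) { perror("open"); return -1; }

    /* blocks until no other process holds a lock on the file */
    if (flock(fd, LOCK_EX) == -1) { perror("flock"); close(fd); return -1; }

    printf("process %ld holds the lock\n", (long)getpid());
    /* ... use the shared resource here ... */

    if (flock(fd, LOCK_UN) == -1) { perror("flock"); close(fd); return -1; }
    close(fd);
    return 0;
}
```

Every process that wants the resource calls the function with the same path. flock() locks are
advisory, so they only protect against processes that also ask for the lock.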

24. Write a C program that illustrates suspending and resuming processes using signals.

#include <sys/types.h>
#include <signal.h>

/* pid is the process ID of the process to suspend */

/* suspend the process (the same effect as pressing Ctrl+Z) */
kill(pid, SIGSTOP);

/* continue the process */
kill(pid, SIGCONT);

Week 9

25. Write a C program that implements a producer-consumer system with two processes. (using
Semaphores).

Algorithm:

1. Start
2. Create a semaphore using the semget() system call
3. If successful, it returns a non-negative semaphore set identifier
4. Create a second process using fork()
5. The first process (the producer) produces items
6. The second process (the consumer) cannot consume an item until the producer has produced it
7. End.

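Source code:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/wait.h>

#define NUM_LOOPS 2

/* on Linux the calling program must define union semun itself */
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

int main(void)
{
    int sem_set_id;
    int child_pid, i;
    struct sembuf sem_op;
    struct timespec delay;
    union semun sem_val;

    /* create a private set with one semaphore that counts the items produced */
    sem_set_id = semget(IPC_PRIVATE, 1, 0600);
    if (sem_set_id == -1)
    {
        perror("main: semget");
        exit(1);
    }
    printf("semaphore set created\nsemaphore set id '%d'\n", sem_set_id);
    fflush(stdout);

    /* nothing has been produced yet */
    sem_val.val = 0;
    if (semctl(sem_set_id, 0, SETVAL, sem_val) == -1)
    {
        perror("main: semctl");
        exit(1);
    }

    child_pid = fork();
    switch (child_pid)
    {
    case -1:
        perror("fork");
        exit(1);
    case 0: /* child process: the consumer */
        for (i = 0; i < NUM_LOOPS; i++)
        {
            /* wait (block) until the producer has produced an item */
            sem_op.sem_num = 0;
            sem_op.sem_op = -1;
            sem_op.sem_flg = 0;
            semop(sem_set_id, &sem_op, 1);
            printf("consumer: '%d'\n", i);
            fflush(stdout);
        }
        break;
    default: /* parent process: the producer */
        for (i = 0; i < NUM_LOOPS; i++)
        {
            printf("producer: '%d'\n", i);
            fflush(stdout);
            /* signal that one more item is available */
            sem_op.sem_num = 0;
            sem_op.sem_op = 1;
            sem_op.sem_flg = 0;
            semop(sem_set_id, &sem_op, 1);
            /* pause briefly about a quarter of the time */
            if (rand() > 3 * (RAND_MAX / 4))
            {
                delay.tv_sec = 0;
                delay.tv_nsec = 10;
                nanosleep(&delay, NULL);
            }
        }
        /* wait for the consumer, then remove the semaphore set */
        wait(NULL);
        semctl(sem_set_id, 0, IPC_RMID);
        break;
    }
    return 0;
}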

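Output:
semaphore set created
semaphore set id '327690'
producer: '0'
consumer: '0'
producer: '1'
consumer: '1'

The semaphore set id and the interleaving of the producer and consumer lines can differ from run
to run. A consumer line never appears before the producer line with the same number.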

26. Write client and server programs (using c) for interaction between server and client
processes using Unix Domain sockets.

Server.c

#include <stdio.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <sys/types.h>
#include <unistd.h>
#include <string.h>

int connection_handler(int connection_fd)

{
int nbytes;
char buffer[256];

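/* leave room for the terminating null byte */
nbytes = read(connection_fd, buffer, sizeof(buffer) - 1);
if(nbytes < 0)
{
    nbytes = 0; /* treat a failed read as an empty message */
}

buffer[nbytes] = 0;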

printf("MESSAGE FROM CLIENT: %s\n", buffer);

nbytes = snprintf(buffer, 256, "hello from the server");
write(connection_fd, buffer, nbytes);

close(connection_fd);
return 0;
}

int main(void)
{
struct sockaddr_un address;
int socket_fd, connection_fd;
socklen_t address_length;
pid_t child;

socket_fd = socket(PF_UNIX, SOCK_STREAM, 0);

if(socket_fd < 0)
{
printf("socket() failed\n");
return 1;
}

unlink("./demo_socket");

/* start with a clean address structure */

memset(&address, 0, sizeof(struct sockaddr_un));

address.sun_family = AF_UNIX;
/* sun_path is a fixed-size array, so its own size is the limit */
snprintf(address.sun_path, sizeof(address.sun_path), "./demo_socket");

if(bind(socket_fd,
(struct sockaddr *) &address,
sizeof(struct sockaddr_un)) != 0)
{
printf("bind() failed\n");
return 1;
}

if(listen(socket_fd, 5) != 0)
{
printf("listen() failed\n");
return 1;
}

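/* accept() expects address_length to hold the size of the address buffer */
address_length = sizeof(address);

while((connection_fd = accept(socket_fd,
(struct sockaddr *) &address,
&address_length)) > -1)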
{
child = fork();
if(child == 0)
{
/* now inside newly created connection handling process */
return connection_handler(connection_fd);
}

/* still inside server process */

close(connection_fd);
}

close(socket_fd);
unlink("./demo_socket");
return 0;
}

Client.c
#include <stdio.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>
#include <string.h>

int main(void)
{
struct sockaddr_un address;
int socket_fd, nbytes;
char buffer[256];

socket_fd = socket(PF_UNIX, SOCK_STREAM, 0);

if(socket_fd < 0)
{
printf("socket() failed\n");
return 1;
}

/* start with a clean address structure */

memset(&address, 0, sizeof(struct sockaddr_un));

address.sun_family = AF_UNIX;
/* sun_path is a fixed-size array, so its own size is the limit */
snprintf(address.sun_path, sizeof(address.sun_path), "./demo_socket");

if(connect(socket_fd,
(struct sockaddr *) &address,
sizeof(struct sockaddr_un)) != 0)
{

printf("connect() failed\n");
return 1;
}

nbytes = snprintf(buffer, 256, "hello from a client");

write(socket_fd, buffer, nbytes);

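/* leave room for the terminating null byte */
nbytes = read(socket_fd, buffer, sizeof(buffer) - 1);
if(nbytes < 0)
{
    nbytes = 0; /* treat a failed read as an empty message */
}

buffer[nbytes] = 0;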

printf("MESSAGE FROM SERVER: %s\n", buffer);

close(socket_fd);
return 0;
}
Week 10

27. Write client and server programs (using c) for interaction between server and client
processes using Internet Domain sockets.

Server.c

#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <time.h>

int main(int argc, char *argv[])

{
int listenfd = 0, connfd = 0;
struct sockaddr_in serv_addr;

char sendBuff[1025];
time_t ticks;

listenfd = socket(AF_INET, SOCK_STREAM, 0);

/* clear the structures with zero bytes (the character '0' would store 0x30 instead) */
memset(&serv_addr, 0, sizeof(serv_addr));
memset(sendBuff, 0, sizeof(sendBuff));

serv_addr.sin_family = AF_INET;
serv_addr.sin_addr.s_addr = htonl(INADDR_ANY);
serv_addr.sin_port = htons(5000);

bind(listenfd, (struct sockaddr*)&serv_addr, sizeof(serv_addr));

listen(listenfd, 10);

while(1)
{
connfd = accept(listenfd, (struct sockaddr*)NULL, NULL);

ticks = time(NULL);
snprintf(sendBuff, sizeof(sendBuff), "%.24s\r\n", ctime(&ticks));
write(connfd, sendBuff, strlen(sendBuff));

close(connfd);
sleep(1);
}
}

Client.c

#include <sys/socket.h>
#include <sys/types.h>
#include <netinet/in.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <arpa/inet.h>

int main(int argc, char *argv[])

{
int sockfd = 0, n = 0;
char recvBuff[1024];
struct sockaddr_in serv_addr;

if(argc != 2)
{
printf("\n Usage: %s <ip of server> \n",argv[0]);
return 1;
}

memset(recvBuff, 0, sizeof(recvBuff));
if((sockfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
{
printf("\n Error : Could not create socket \n");
return 1;
}

memset(&serv_addr, 0, sizeof(serv_addr));

serv_addr.sin_family = AF_INET;
serv_addr.sin_port = htons(5000);

if(inet_pton(AF_INET, argv[1], &serv_addr.sin_addr)<=0)

{
printf("\n inet_pton error occurred\n");
return 1;
}

if( connect(sockfd, (struct sockaddr *)&serv_addr, sizeof(serv_addr)) < 0)

{
printf("\n Error : Connect Failed \n");
return 1;
}

while ( (n = read(sockfd, recvBuff, sizeof(recvBuff)-1)) > 0)

{
recvBuff[n] = 0;
if(fputs(recvBuff, stdout) == EOF)
{

printf("\n Error : Fputs error\n");

}
}

if(n < 0)
{
printf("\n Read error \n");
}

return 0;
}

28. Write a C program that illustrates two processes communicating using shared memory.

DESCRIPTION:

Shared memory is an efficient means of passing data between programs. One program
creates a memory portion which other processes (if permitted) can access.

The problem with the pipes, FIFO’s and message queues is that for two processes to
exchange information, the information has to go through the kernel. Shared memory provides a
way around this by letting two or more processes share a memory segment.

Access to shared memory must be synchronized by the processes themselves. If one process is
reading data into some shared memory, for example, other processes must wait for the read to
finish before processing the data.

A process creates a shared memory segment using shmget(). The original owner of a
shared memory segment can assign ownership to another user with shmctl(). It can also revoke
this assignment. Other processes with proper permission can perform various control functions
on the shared memory segment using shmctl(). Once created, a shared segment can be attached
to a process address space using shmat(). It can be detached using shmdt() (see shmop()). The
attaching process must have the appropriate permissions for shmat(). Once attached, the process
can read or write to the segment, as allowed by the permission requested in the attach operation.
A shared segment can be attached multiple times by the same process. A shared memory
segment is described by a control structure with a unique ID that points to an area of physical
memory. The identifier of the segment is called the shmid. The structure definition for the shared
memory segment control structures and prototypes can be found in <sys/shm.h>.

shmget() is used to obtain access to a shared memory segment. It is prototyped by:

int shmget(key_t key, size_t size, int shmflg);

The key argument is an access value associated with the shared memory segment ID. The size
argument is the size in bytes of the requested shared memory. The shmflg argument specifies the
initial access permissions and creation control flags.

When the call succeeds, it returns the shared memory segment ID. This call is also used to get
the ID of an existing shared segment (from a process requesting sharing of some existing
memory portion).

The following code illustrates shmget():

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
...
key_t key;   /* key to be passed to shmget() */
int shmflg;  /* shmflg to be passed to shmget() */
int shmid;   /* return value from shmget() */
int size;    /* size to be passed to shmget() */
...
key = ...
size = ...
shmflg = ...

if ((shmid = shmget(key, size, shmflg)) == -1) {
    perror("shmget: shmget failed");
    exit(1);
} else {
    (void) fprintf(stderr, "shmget: shmget returned %d\n", shmid);
    exit(0);
}
...

Controlling a Shared Memory Segment

shmctl() is used to alter the permissions and other characteristics of a shared memory segment. It
is prototyped as follows:

int shmctl(int shmid, int cmd, struct shmid_ds *buf);

The process must have an effective ID of owner, creator or superuser to perform this
command. The cmd argument is one of the following control commands:

SHM_LOCK
-- Lock the specified shared memory segment in memory. The process
must have the effective ID of superuser to perform this command.
SHM_UNLOCK
-- Unlock the shared memory segment. The process must have the
effective ID of superuser to perform this command.
IPC_STAT
-- Return the status information contained in the control structure and
place it in the buffer pointed to by buf. The process must have read
permission on the segment to perform this command.
IPC_SET
-- Set the effective user and group identification and access
permissions. The process must have an effective ID of owner, creator
or superuser to perform this command.
IPC_RMID
-- Remove the shared memory segment.

buf is a pointer to a structure of type struct shmid_ds, which is defined in <sys/shm.h>.

The following code illustrates shmctl():

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
...
int cmd;                    /* command code for shmctl() */
int shmid;                  /* segment ID */
int rtrn;                   /* return value from shmctl() */
struct shmid_ds shmid_ds;   /* shared memory data structure to hold results */
...
shmid = ...
cmd = ...
if ((rtrn = shmctl(shmid, cmd, &shmid_ds)) == -1) {
    perror("shmctl: shmctl failed");
    exit(1);
}
...

Attaching and Detaching a Shared Memory Segment

shmat() and shmdt() are used to attach and detach shared memory segments. They are prototyped
as follows:

void *shmat(int shmid, const void *shmaddr, int shmflg);
int shmdt(const void *shmaddr);

shmat() returns a pointer, shmaddr, to the head of the shared segment associated with a valid
shmid. shmdt() detaches the shared memory segment located at the address indicated by shmaddr.
The following code illustrates calls to shmat() and shmdt():
#include <sys/types.h>
#include <sys/ipc.h>

#include <sys/shm.h>
static struct state { /* Internal record of attached segments. */
int shmid; /* shmid of attached segment */
char *shmaddr; /* attach point */
int shmflg; /* flags used on attach */
} ap[MAXnap]; /* State of current attached segments. */
int nap; /* Number of currently attached segments. */
...
char *addr; /* address work variable */
register int i; /* work area */
register struct state *p; /* ptr to current state entry */
...
p = &ap[nap++];
p->shmid = ...
p->shmaddr = ...
p->shmflg = ...
p->shmaddr = shmat(p->shmid, p->shmaddr, p->shmflg);
if(p->shmaddr == (char *)-1) {
perror("shmop: shmat failed");
nap--;
} else
(void) fprintf(stderr, "shmop: shmat returned %p\n",
(void *) p->shmaddr);
...
i = shmdt(addr);
if(i == -1) {
perror("shmop: shmdt failed");
} else {
(void) fprintf(stderr, "shmop: shmdt returned %d\n", i);
for (p = ap, i = nap; i--; p++)
if (p->shmaddr == addr) *p = ap[--nap];

}
...

Algorithm:
1. Start
2. Create a shared memory segment using the shmget() system call
3. If successful, it returns a non-negative segment identifier
4. Attach the created shared memory using the shmat() system call
5. Write to the shared memory through the pointer returned by shmat(), for example with strncpy()
6. Read the contents of the shared memory through the same pointer
7. Detach the shared memory using the shmdt() system call
8. End.
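
Source Code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

#define SHM_SIZE 1024

int main(int argc, char *argv[])
{
    key_t key;
    int shmid;
    char *data;

    if (argc > 2)
    {
        fprintf(stderr, "usage: shmdemo [data_to_write]\n");
        exit(1);
    }
    /* the file passed to ftok() must exist; shmdemo.c is an assumed file name */
    if ((key = ftok("shmdemo.c", 'R')) == -1)
    {
        perror("ftok");
        exit(1);
    }
    /* create the segment, or get it if it already exists */
    if ((shmid = shmget(key, SHM_SIZE, 0644 | IPC_CREAT)) == -1)
    {
        perror("shmget");
        exit(1);
    }
    /* attach the segment to get a pointer to it */
    data = shmat(shmid, (void *) 0, 0);
    if (data == (char *) (-1))
    {
        perror("shmat");
        exit(1);
    }
    /* with an argument, write it to the segment; without one, read the segment */
    if (argc == 2)
    {
        printf("writing to segment: \"%s\"\n", argv[1]);
        strncpy(data, argv[1], SHM_SIZE - 1);
        data[SHM_SIZE - 1] = '\0';
    }
    else
        printf("segment contains: \"%s\"\n", data);
    if (shmdt(data) == -1)
    {
        perror("shmdt");
        exit(1);
    }
    return 0;
}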

Input:
$ ./a.out koteswararao
Output:
writing to segment: "koteswararao"
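
The question asks for two processes that communicate through shared memory. The following
minimal sketch shows one way to do this with a parent and a child process. It assumes a private
segment (IPC_PRIVATE) and uses waitpid() as the only synchronization; the names
share_between_processes and SEG_SIZE are hypothetical.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>

#define SEG_SIZE 1024   /* hypothetical segment size in bytes */

/*
 * The child writes `text` into a private shared memory segment and the
 * parent reads it back into `out`. Returns 0 on success, -1 on error.
 */
int share_between_processes(const char *text, char *out, size_t outsz)
{
    int shmid, status;
    char *data;
    pid_t pid;

    if (outsz == 0)
        return -1;

    shmid = shmget(IPC_PRIVATE, SEG_SIZE, 0600 | IPC_CREAT);
    if (shmid == -1) { perror("shmget"); return -1; }

    data = shmat(shmid, NULL, 0);
    if (data == (char *)-1) {
        perror("shmat");
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }

    fflush(stdout);   /* avoid duplicating buffered output in the child */
    pid = fork();
    if (pid == 0) {
        /* child: the segment stays attached across fork() */
        strncpy(data, text, SEG_SIZE - 1);
        data[SEG_SIZE - 1] = '\0';
        shmdt(data);
        _exit(0);
    }

    /* parent: waiting for the child is the only synchronization used here */
    if (pid == -1 || waitpid(pid, &status, 0) == -1) {
        perror("fork/waitpid");
        shmdt(data);
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }

    strncpy(out, data, outsz - 1);
    out[outsz - 1] = '\0';
    printf("parent read from segment: \"%s\"\n", out);

    shmdt(data);
    shmctl(shmid, IPC_RMID, NULL);   /* remove the segment */
    return 0;
}
```

Two unrelated processes would instead agree on a key, for example through ftok(), and would
need a semaphore or a similar mechanism in place of waitpid().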

Data Mining Lab

Credit Risk Assessment

Description: The business of banks is making loans. Assessing the credit worthiness of an
applicant is of crucial importance. You have to develop a system to help a loan officer
decide whether the credit of a customer is good, or bad. A bank’s business rules
regarding loans must consider two opposing factors. On the one hand, a bank wants to
make as many loans as possible. Interest on these loans is the bank's profit source. On the
other hand, a bank cannot afford to make too many bad loans. Too many bad loans could
lead to the collapse of the bank. The bank's loan policy must involve a compromise: not
too strict, and not too lenient.

To do the assignment, you first and foremost need some knowledge about the world of
credit. You can acquire such knowledge in a number of ways.

1. Knowledge Engineering. Find a loan officer who is willing to talk. Interview her and try
to represent her knowledge in the form of production rules.

2. Books. Find some training manuals for loan officers or perhaps a suitable textbook on
finance. Translate this knowledge from text form to production rule form.

3. Common sense. Imagine yourself as a loan officer and make up reasonable rules which
can be used to judge the credit worthiness of a loan applicant.

4. Case histories. Find records of actual cases where competent loan officers correctly
judged when, and when not to, approve a loan application.

The German Credit Data :

Actual historical credit data is not always easy to come by because of confidentiality rules.
Here is one such dataset: the German credit data, in its original form and as an Excel
spreadsheet version (download from the web).

In spite of the fact that the data is German, you should probably make use of it for this
assignment (unless you really can consult a real loan officer!).

A few notes on the German dataset:

• DM stands for Deutsche Mark, the unit of currency, worth about 90 cents Canadian
(but looks and acts like a quarter).

• Owns_telephone. German phone rates are much higher than in Canada, so fewer
people own telephones.

• Foreign_worker. There are millions of these in Germany (many from Turkey). It is
very hard to get German citizenship if you were not born of German parents.

• There are 20 attributes used in judging a loan applicant. The goal is to classify the
applicant into one of two categories, good or bad.

Subtasks : (Turn in your answers to the following tasks)

Laboratory Manual For Data Mining

EXPERIMENT-1

Aim: To list all the categorical (or nominal) attributes and the real-valued attributes using the
Weka mining tool.

Tools/ Apparatus: Weka mining tool.

Procedure:

1) Open the Weka GUI Chooser.

2) Select EXPLORER present in Applications.

3) Select Preprocess Tab.

4) Go to OPEN file and browse the file that is already stored in the system “bank.csv”.

5) Clicking on any attribute in the left panel will show the basic statistics on that selected
attribute.

Sample output:

EXPERIMENT-2

Aim: To identify the rules with some of the important attributes a) manually and b) using
Weka.

Tools/ Apparatus: Weka mining tool.

Theory:

Association rule mining is defined as follows. Let $I = \{i_1, i_2, \ldots, i_n\}$ be a set of n
binary attributes called items. Let $D = \{t_1, t_2, \ldots, t_m\}$ be a set of transactions called
the database. Each transaction in D has a unique transaction ID and contains a subset of the items
in I. A rule is defined as an implication of the form $X \Rightarrow Y$ where $X, Y \subseteq I$
and $X \cap Y = \emptyset$. The sets of items (for short, itemsets) X and Y are called the
antecedent (left-hand side or LHS) and the consequent (right-hand side or RHS) of the rule
respectively.

To illustrate the concepts, we use a small example from the supermarket domain.

The set of items is I = {milk, bread, butter, beer}, and the example uses a small database of five
transactions over these items, in which 1 codes the presence and 0 the absence of an item in a
transaction. An example rule for the supermarket could be
$\{milk, bread\} \Rightarrow \{butter\}$, meaning that if milk and bread are bought, customers
also buy butter.

Note: this example is extremely small. In practical applications, a rule needs a support of several
hundred transactions before it can be considered statistically significant, and datasets often
contain thousands or millions of transactions.

To select interesting rules from the set of all possible rules, constraints on various measures of
significance and interest can be used. The best-known constraints are minimum thresholds on
support and confidence. The support supp(X) of an itemset X is defined as the proportion of
transactions in the data set which contain the itemset. In the example database, the itemset
{milk, bread} has a support of 2 / 5 = 0.4 since it occurs in 40% of all transactions (2 out of 5
transactions).

The confidence of a rule is defined as
$\mathrm{conf}(X \Rightarrow Y) = \mathrm{supp}(X \cup Y) / \mathrm{supp}(X)$. For example, the
rule $\{milk, bread\} \Rightarrow \{butter\}$ has a confidence of 0.2 / 0.4 = 0.5 in the database,
which means that for 50% of the transactions containing milk and bread the rule is correct.
Confidence can be interpreted as an estimate of the probability P(Y | X), the probability of
finding the RHS of the rule in transactions under the condition that these transactions also
contain the LHS.
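
The support and confidence figures quoted above can be checked with the following sketch. The
five transactions in DB are an assumption: the original table is not reproduced in this manual,
so the rows were chosen to be consistent with the quoted values (support 0.4 and confidence
0.5). Itemsets are stored as bitmasks, and the names support and confidence are hypothetical.

```c
#include <stdio.h>

/* one bit per item */
enum { MILK = 1, BREAD = 2, BUTTER = 4, BEER = 8 };

/* assumed transaction database, one bitmask per transaction */
static const unsigned DB[5] = {
    MILK | BREAD,
    BUTTER,
    BEER,
    MILK | BREAD | BUTTER,
    BREAD
};

/* proportion of transactions that contain every item of `itemset` */
double support(const unsigned *db, int n, unsigned itemset)
{
    int i, count = 0;

    for (i = 0; i < n; i++)
        if ((db[i] & itemset) == itemset)
            count++;
    return n > 0 ? (double)count / n : 0.0;
}

/* conf(X => Y) = supp(X u Y) / supp(X); returns 0 when X never occurs */
double confidence(const unsigned *db, int n, unsigned lhs, unsigned rhs)
{
    double s_lhs = support(db, n, lhs);

    return s_lhs > 0.0 ? support(db, n, lhs | rhs) / s_lhs : 0.0;
}

/* print the two measures for the rule {milk, bread} => {butter} */
void print_example(void)
{
    printf("supp({milk,bread}) = %.2f\n", support(DB, 5, MILK | BREAD));
    printf("conf({milk,bread} => {butter}) = %.2f\n",
           confidence(DB, 5, MILK | BREAD, BUTTER));
}
```

Calling print_example() from main() prints the two measures for the rule used in the text.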

ALGORITHM:

Association rule mining finds the association rules that satisfy a predefined minimum
support and confidence in a given database. The problem is usually decomposed into two
subproblems. The first is to find those itemsets whose occurrences exceed a predefined threshold
in the database; those itemsets are called frequent or large itemsets. The second is to
generate association rules from those large itemsets under the constraint of minimal confidence.

Suppose one of the large itemsets is $L_k = \{I_1, I_2, \ldots, I_k\}$. Association rules with
this itemset are generated in the following way: the first rule is
$\{I_1, I_2, \ldots, I_{k-1}\} \Rightarrow \{I_k\}$, and checking its confidence determines whether
the rule is interesting. Other rules are then generated by deleting the last item in the
antecedent and inserting it into the consequent, and the confidences of the new rules are checked
to determine their interestingness. This process is iterated until the antecedent becomes empty.
Since the second subproblem is quite straightforward, most research focuses on the first
subproblem. The Apriori algorithm finds the frequent sets L in database D.

· Find frequent set $L_{k-1}$.

· Join step.

o $C_k$ is generated by joining $L_{k-1}$ with itself.

· Prune step.

o Any (k − 1)-itemset that is not frequent cannot be a subset of a frequent k-itemset, hence
should be removed.

Where

· $C_k$: candidate itemset of size k

· $L_k$: frequent itemset of size k

Apriori Pseudocode

Apriori(T, ε)
    L1 ← {large 1-itemsets that appear in more than ε transactions}
    k ← 2
    while L(k−1) ≠ ∅
        C(k) ← Generate(L(k−1))
        for transactions t ∈ T
            C(t) ← Subset(C(k), t)
            for candidates c ∈ C(t)
                count[c] ← count[c] + 1
        L(k) ← {c ∈ C(k) | count[c] ≥ ε}
        k ← k + 1
    return ⋃k L(k)
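
The pseudocode above can be sketched in C as follows. This is an illustration under stated
assumptions, not Weka's implementation: itemsets are stored as bitmasks, the sketch is meant for
a handful of items only (MAX_SETS bounds each level), and the names apriori, support_count and
MAX_SETS are hypothetical.

```c
#define MAX_SETS 64   /* capacity per level; enough for about six items */

/* number of items in an itemset stored as a bitmask */
static int size_of(unsigned s)
{
    int n = 0;
    for (; s; s &= s - 1) n++;
    return n;
}

/* 1 if `s` is one of the first n entries of `sets` */
static int contains(const unsigned *sets, int n, unsigned s)
{
    int i;
    for (i = 0; i < n; i++)
        if (sets[i] == s) return 1;
    return 0;
}

/* number of transactions that contain every item of `itemset` */
int support_count(const unsigned *t, int n, unsigned itemset)
{
    int i, c = 0;
    for (i = 0; i < n; i++)
        if ((t[i] & itemset) == itemset) c++;
    return c;
}

/*
 * Level-wise Apriori. Writes every itemset whose support count is at least
 * min_count to `out` (capacity MAX_SETS) and returns how many were found.
 */
int apriori(const unsigned *t, int n, int n_items, int min_count, unsigned *out)
{
    unsigned prev[MAX_SETS], cur[MAX_SETS];
    int n_prev = 0, n_cur, n_out = 0, i, j, k, bit;

    /* L1: frequent 1-itemsets */
    for (i = 0; i < n_items && n_prev < MAX_SETS; i++)
        if (support_count(t, n, 1u << i) >= min_count)
            prev[n_prev++] = 1u << i;

    for (k = 2; n_prev > 0; k++) {
        /* report the frequent (k-1)-itemsets */
        for (i = 0; i < n_prev && n_out < MAX_SETS; i++)
            out[n_out++] = prev[i];

        n_cur = 0;
        for (i = 0; i < n_prev; i++)
            for (j = i + 1; j < n_prev; j++) {
                unsigned c = prev[i] | prev[j];          /* join step */
                int ok = size_of(c) == k && !contains(cur, n_cur, c);

                /* prune step: every (k-1)-subset must be frequent */
                for (bit = 0; ok && bit < n_items; bit++)
                    if ((c & (1u << bit)) &&
                        !contains(prev, n_prev, c & ~(1u << bit)))
                        ok = 0;

                if (ok && n_cur < MAX_SETS &&
                    support_count(t, n, c) >= min_count)
                    cur[n_cur++] = c;
            }

        for (i = 0; i < n_cur; i++) prev[i] = cur[i];
        n_prev = n_cur;
    }
    return n_out;
}
```

With milk, bread, butter and beer encoded as bits 0 to 3, a transaction that contains milk and
bread is the value 3, and a minimum count of 2 corresponds to a minimum support of 40% in a
database of five transactions.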

Procedure:

1) Given the Bank database for mining.

2) Select EXPLORER in WEKA GUI Chooser.

3) Load “Bank.csv” in Weka by Open file in Preprocess tab.

4) Select only Nominal values.

5) Go to Associate Tab.

6) Select the Apriori algorithm using the "Choose" button in the Associator panel:

weka.associations.Apriori -N 10 -T 0 -C 0.9 -D 0.05 -U 1.0 -M 0.1 -S -1.0 -c -1

7) Click the Start button.

8) The sample rules are now shown in the output.

Sample output:

EXPERIMENT-3

Aim: To create a decision tree from a training data set using the Weka mining tool.

Tools/ Apparatus: Weka mining tool.

Theory:

Classification is a data mining function that assigns items in a collection to target categories or
classes. The goal of classification is to accurately predict the target class for each case in the
data. For example, a classification model could be used to identify loan applicants as low,
medium, or high credit risks.

A classification task begins with a data set in which the class assignments are known. For
example, a classification model that predicts credit risk could be developed based on observed
data for many loan applicants over a period of time.

In addition to the historical credit rating, the data might track employment history, home
ownership or rental, years of residence, number and type of investments, and so on. Credit rating
would be the target, the other attributes would be the predictors, and the data for each customer
would constitute a case.

Classifications are discrete and do not imply order. Continuous, floating-point values would
indicate a numerical, rather than a categorical, target. A predictive model with a numerical target
uses a regression algorithm, not a classification algorithm.

The simplest type of classification problem is binary classification. In binary classification, the
target attribute has only two possible values: for example, high credit rating or low credit rating.
Multiclass targets have more than two values: for example, low, medium, high, or unknown
credit rating.

In the model build (training) process, a classification algorithm finds relationships between the
values of the predictors and the values of the target. Different classification algorithms use
different techniques for finding relationships. These relationships are summarized in a model,
which can then be applied to a different data set in which the class assignments are unknown.

Classification models are tested by comparing the predicted values to known target values in a
set of test data. The historical data for a classification project is typically divided into two data
sets: one for building the model; the other for testing the model.

Scoring a classification model results in class assignments and probabilities for each case. For
example, a model that classifies customers as low, medium, or high value would also predict the
probability of each classification for each customer.

Classification has many applications in customer segmentation, business modeling, marketing,
credit analysis, and biomedical and drug response modeling.

Different Classification Algorithms

Oracle Data Mining provides the following algorithms for classification:

· Decision Tree

Decision trees automatically generate rules, which are conditional statements that reveal the
logic used to build the tree.

· Naive Bayes

Naive Bayes uses Bayes' Theorem, a formula that calculates a probability by counting the
frequency of values and combinations of values in the historical data.

Procedure:

1) Open Weka GUI Chooser.

2) Select EXPLORER present in Applications.

3) Select Preprocess Tab.

4) Go to OPEN file and browse the file that is already stored in the system “bank.csv”.

5) Go to Classify tab.

6) Click the Choose button and select J48 under trees. J48 is Weka's implementation of the C4.5
algorithm.

7) Select the Test option "Use training set".

8) If needed, select the attribute.

9) Click Start.

10) The output details now appear in the Classifier output.

11) Right-click the entry in the result list and select the "Visualize tree" option.

Sample output:

The decision tree constructed by using the implemented C4.5 algorithm

EXPERIMENT-4

Aim: To find the percentage of examples that are classified correctly by the decision tree model
created above, i.e., testing on the training set.

Tools/ Apparatus: Weka mining tool.

Theory:

A naive Bayes classifier assumes that the presence (or absence) of a particular feature of a class
is unrelated to the presence (or absence) of any other feature. For example, a fruit may be
considered to be an apple if it is red, round, and about 4" in diameter. Even if these features
depend on each other or on the existence of the other features, a naive Bayes classifier considers
all of these properties to contribute independently to the probability that this fruit is an apple.

An advantage of the naive Bayes classifier is that it requires a small amount of training data to
estimate the parameters (means and variances of the variables) necessary for classification.
Because independent variables are assumed, only the variances of the variables for each class
need to be determined and not the entire covariance matrix.

The naive Bayes probabilistic model:

The probability model for a classifier is a conditional model

$$p(C \mid F_1, \ldots, F_n)$$

over a dependent class variable C with a small number of outcomes or classes, conditional on
several feature variables $F_1$ through $F_n$. The problem is that if the number of features n is
large or when a feature can take on a large number of values, then basing such a model on
probability tables is infeasible. We therefore reformulate the model to make it more tractable.

Using Bayes' theorem, we write

$$p(C \mid F_1, \ldots, F_n) = \frac{p(C)\, p(F_1, \ldots, F_n \mid C)}{p(F_1, \ldots, F_n)}$$

In plain English the above equation can be written as

$$\text{posterior} = \frac{\text{prior} \times \text{likelihood}}{\text{evidence}}$$

In practice we are only interested in the numerator of that fraction, since the denominator does
not depend on C and the values of the features $F_i$ are given, so that the denominator is
effectively constant. The numerator is equivalent to the joint probability model
$p(C, F_1, \ldots, F_n)$, which can be rewritten as follows, using repeated applications of the
definition of conditional probability:

$$\begin{aligned}
p(C, F_1, \ldots, F_n) &= p(C)\, p(F_1, \ldots, F_n \mid C) \\
&= p(C)\, p(F_1 \mid C)\, p(F_2, \ldots, F_n \mid C, F_1) \\
&= p(C)\, p(F_1 \mid C)\, p(F_2 \mid C, F_1)\, p(F_3, \ldots, F_n \mid C, F_1, F_2) \\
&= p(C)\, p(F_1 \mid C)\, p(F_2 \mid C, F_1) \cdots p(F_n \mid C, F_1, F_2, \ldots, F_{n-1})
\end{aligned}$$

Now the "naive" conditional independence assumptions come into play: assume that each feature
$F_i$ is conditionally independent of every other feature $F_j$ for $j \neq i$. This means that

$$p(F_i \mid C, F_j) = p(F_i \mid C)$$

and so the joint model can be expressed as

$$p(C, F_1, \ldots, F_n) = p(C)\, p(F_1 \mid C)\, p(F_2 \mid C) \cdots p(F_n \mid C) = p(C) \prod_{i=1}^{n} p(F_i \mid C)$$

This means that under the above independence assumptions, the conditional distribution over the
class variable C can be expressed like this:

$$p(C \mid F_1, \ldots, F_n) = \frac{1}{Z}\, p(C) \prod_{i=1}^{n} p(F_i \mid C)$$

where Z is a scaling factor dependent only on $F_1, \ldots, F_n$, i.e., a constant if the values
of the feature variables are known.

Models of this form are much more manageable, since they factor into a so-called class prior
p(C) and independent probability distributions $p(F_i \mid C)$. If there are k classes and if a
model for each $p(F_i \mid C = c)$ can be expressed in terms of r parameters, then the
corresponding naive Bayes model has (k − 1) + n r k parameters. In practice, k = 2 (binary
classification) and r = 1 (Bernoulli variables as features) are common, and so the total number of
parameters of the naive Bayes model is 2n + 1, where n is the number of binary features used for
prediction.

$$ P(h \mid D) = \frac{P(D \mid h)\, P(h)}{P(D)} $$

• $P(h)$ : Prior probability of hypothesis h

• $P(D)$ : Prior probability of training data D

• $P(h \mid D)$ : Probability of h given D

• $P(D \mid h)$ : Probability of D given h

Naïve Bayes Classifier : Derivation

• D : Set of tuples

– Each tuple is an n-dimensional attribute vector

– X : (x1, x2, x3, …, xn)

• Let there be m classes : C1, C2, C3, …, Cm

• The NB classifier predicts that X belongs to class Ci iff

– $P(C_i \mid X) > P(C_j \mid X)$ for $1 \le j \le m$, $j \neq i$

• Maximum a posteriori hypothesis

– $P(C_i \mid X) = P(X \mid C_i)\, P(C_i) / P(X)$

– Maximize $P(X \mid C_i)\, P(C_i)$, as $P(X)$ is constant for all classes

• With many attributes, it is computationally expensive to evaluate $P(X \mid C_i)$

• Naïve assumption of “class conditional independence”:

– $P(X \mid C_i) = \prod_{k=1}^{n} P(x_k \mid C_i) = P(x_1 \mid C_i) \times P(x_2 \mid C_i) \times \dots \times P(x_n \mid C_i)$
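
The derivation above can be sketched in Python for categorical attributes. This is a minimal illustration, not Weka's implementation: the function names and the five toy tuples are assumptions made for the example, and no smoothing is applied to zero counts.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Collect the counts needed to estimate P(Ci) and P(xk | Ci)."""
    class_counts = Counter(labels)
    # value_counts[(k, c)][v] = number of class-c tuples whose k-th attribute equals v
    value_counts = defaultdict(Counter)
    for row, c in zip(rows, labels):
        for k, v in enumerate(row):
            value_counts[(k, c)][v] += 1
    return class_counts, value_counts


def predict(x, class_counts, value_counts):
    """Return the class Ci that maximizes P(Ci) * product over k of P(xk | Ci)."""
    total = sum(class_counts.values())
    best_class, best_score = None, -1.0
    for c, n_c in class_counts.items():
        score = n_c / total  # prior P(Ci)
        for k, v in enumerate(x):
            score *= value_counts[(k, c)][v] / n_c  # P(xk | Ci), no smoothing
        if score > best_score:
            best_class, best_score = c, score
    return best_class


# Hypothetical toy tuples (children, income) with class labels; not taken from bank.csv
rows = [("yes", "high"), ("yes", "low"), ("no", "high"), ("no", "low"), ("yes", "high")]
labels = ["YES", "YES", "NO", "NO", "YES"]
model = train_naive_bayes(rows, labels)
print(predict(("yes", "high"), *model))
print(predict(("no", "low"), *model))
```

P(X) is never computed, because it is the same for every class and does not change which class wins.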

Procedure:

1) Given the Bank database for mining.

2) Use the Weka GUI Chooser.

3) Select EXPLORER present in Applications.

4) Select Preprocess Tab.

5) Go to OPEN file and browse the file that is already stored in the system “bank.csv”.

6) Go to Classify tab.


7) Choose Classifier “Tree”.

8) Select “NBTree”, i.e., the Naive Bayesian tree.

9) Select the Test option “Use training set”.

10) If needed, select the attribute to be used as the class.

11) Click Start.

12) The output details appear in the Classifier output panel.

Sample output:

=== Evaluation on training set ===

=== Summary ===

Correctly Classified Instances         554       92.3333 %
Incorrectly Classified Instances        46        7.6667 %
Kappa statistic                          0.845
Mean absolute error                      0.1389
Root mean squared error                  0.2636
Relative absolute error                 27.9979 %
Root relative squared error             52.9137 %
Total Number of Instances              600

=== Detailed Accuracy By Class ===

               TP Rate   FP Rate   Precision   Recall   F-Measure   ROC Area   Class
               0.894     0.052     0.935       0.894    0.914       0.936      YES
               0.948     0.106     0.914       0.948    0.931       0.936      NO
Weighted Avg.  0.923     0.081     0.924       0.923    0.923       0.936

=== Confusion Matrix ===

   a   b   <-- classified as
 245  29 |   a = YES
  17 309 |   b = NO
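
Most of the summary figures can be recomputed from the confusion matrix alone. The following Python sketch shows one way to do that; the function name `summarize` is an assumption for illustration, and rows are taken to be the actual classes and columns the predicted classes, as in the matrix above.

```python
def summarize(matrix):
    """matrix[i][j] = number of instances of actual class i classified as class j."""
    n = sum(sum(row) for row in matrix)
    k = len(matrix)
    row_tot = [sum(row) for row in matrix]
    col_tot = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    accuracy = sum(matrix[i][i] for i in range(k)) / n
    # Agreement expected by chance, used by the kappa statistic
    expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / (n * n)
    kappa = (accuracy - expected) / (1 - expected)
    per_class = []
    for i in range(k):
        tp = matrix[i][i]
        fn = row_tot[i] - tp
        fp = col_tot[i] - tp
        tn = n - tp - fn - fp
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        per_class.append({
            "tp_rate": recall,
            "fp_rate": fp / (fp + tn),
            "precision": precision,
            "f_measure": 2 * precision * recall / (precision + recall),
        })
    return accuracy, kappa, per_class


accuracy, kappa, per_class = summarize([[245, 29], [17, 309]])
print(round(100 * accuracy, 4), round(kappa, 3))
for name, stats in zip(("YES", "NO"), per_class):
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

Rounded to three decimals, the values agree with the accuracy, kappa statistic, TP rate, FP rate, precision and F-measure reported in the sample output. The error measures and the ROC area need the predicted probabilities and cannot be derived from the matrix.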

EXPERIMENT-5

Aim: To examine whether testing on the training set is a good idea, by evaluating the model on a separate test set.

Tools/ Apparatus: Weka Mining tool

Procedure:

1) In Test options, select the “Supplied test set” radio button.

2) Click Set.

3) Choose the file that contains records that were not in the training set used to create
the model.

4) Click Start. (WEKA will run this test data set through the model already created.)

5) Compare the output results with those of the 4th experiment.

Sample output:

The exact figures depend on the data set and the problem being solved, so they will differ from
one exercise to another.

The important numbers to focus on in the training-set output of the 4th experiment are those next
to "Correctly Classified Instances" (92.3 percent) and "Incorrectly Classified Instances"
(7.7 percent). Another important number is in the "ROC Area" column, in the first row (0.936).
Finally, the "Confusion Matrix" shows the number of false positives and false negatives. Taking
YES as the positive class, there are 17 false positives (NO classified as YES) and 29 false
negatives (YES classified as NO) in this matrix.

Based on our accuracy rate of 92.3 percent, we say that upon initial analysis, this is a good
model.

One final step in validating our classification tree is to run our test set through the model
and check that the accuracy of the model holds up.

Compare the "Correctly Classified Instances" from this test set with the "Correctly Classified
Instances" from the training set. If the two accuracies are close, this indicates that the
model will not break down with unknown data, or when future data is applied to it.

EXPERIMENT-6

Aim: To create a decision tree by cross-validation on the training data set using the Weka mining tool.

Tools/ Apparatus: Weka mining tool.

Theory:


Decision tree learning, used in data mining and machine learning, uses a decision tree as a
predictive model which maps observations about an item to conclusions about the item's target
value. In these tree structures, leaves represent classifications and branches represent
conjunctions of features that lead to those classifications. In decision analysis, a decision tree can
be used to visually and explicitly represent decisions and decision making. In data mining, a
decision tree describes data but not decisions; rather, the resulting classification tree can be an
input for decision making. This experiment deals with decision trees in data mining.

Decision tree learning is a common method used in data mining. The goal is to create a model
that predicts the value of a target variable based on several input variables. Each interior node
corresponds to one of the input variables; there are edges to children for each of the possible
values of that input variable. Each leaf represents a value of the target variable given the values
of the input variables represented by the path from the root to the leaf.

A tree can be "learned" by splitting the source set into subsets based on an attribute value test.
This process is repeated on each derived subset in a recursive manner called recursive
partitioning. The recursion is completed when the subset at a node all has the same value of the
target variable, or when splitting no longer adds value to the predictions.

In data mining, trees can be described also as the combination of mathematical and
computational techniques to aid the description, categorisation and generalization of a given set
of data.

Data comes in records of the form:

(x, y) = (x1, x2, x3, ..., xk, y)

The dependent variable, y, is the target variable that we are trying to understand, classify or
generalise. The vector x is composed of the input variables, x1, x2, x3, etc., that are used for that
task.
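
Recursive partitioning on records of the form (x, y) can be sketched as follows. This is a simplified, ID3-style illustration that assumes categorical attributes and entropy as the split criterion; it is not the J48 (C4.5) algorithm used by Weka, and the attribute names and four records are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def build_tree(records, attributes):
    """records: list of (x, y), x being a dict of attribute values. Returns a nested dict or a leaf label."""
    labels = [y for _, y in records]
    majority = Counter(labels).most_common(1)[0][0]
    # Stop when all records at this node share one target value or no attribute is left
    if len(set(labels)) == 1 or not attributes:
        return majority

    def remainder(attr):
        """Weighted entropy of the subsets produced by splitting on attr."""
        groups = {}
        for x, y in records:
            groups.setdefault(x[attr], []).append(y)
        return sum(len(g) / len(records) * entropy(g) for g in groups.values())

    best = min(attributes, key=remainder)
    # Stop when splitting no longer adds value to the predictions
    if remainder(best) >= entropy(labels):
        return majority
    rest = [a for a in attributes if a != best]
    branches = {}
    for x, y in records:
        branches.setdefault(x[best], []).append((x, y))
    return {best: {value: build_tree(subset, rest) for value, subset in branches.items()}}


# Hypothetical records, not taken from bank.csv
records = [
    ({"children": "yes", "income": "high"}, "YES"),
    ({"children": "yes", "income": "low"}, "NO"),
    ({"children": "no", "income": "high"}, "NO"),
    ({"children": "no", "income": "low"}, "NO"),
]
print(build_tree(records, ["children", "income"]))
```

Each interior node of the result is a dict keyed by the attribute it tests, with one child per attribute value; leaves are plain class labels.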

Procedure:

1) Given the Bank database for mining.

2) Use the Weka GUI Chooser.

3) Select EXPLORER present in Applications.

4) Select Preprocess Tab.

5) Go to OPEN file and browse the file that is already stored in the system “bank.csv”.


6) Go to Classify tab.

7) Choose Classifier “Tree”.

8) Select J48.

9) Select the Test option “Cross-validation”.

10) Set “Folds”, e.g., 10.

11) If needed, select the attribute to be used as the class.

12) Click Start.

13) The output details appear in the Classifier output panel.

14) Compare the output results with those of the 4th experiment.

15) Check whether the accuracy increased or decreased.

Sample output:

=== Stratified cross-validation ===

=== Summary ===

Correctly Classified Instances         539       89.8333 %
Incorrectly Classified Instances        61       10.1667 %
Kappa statistic                          0.7942
Mean absolute error                      0.167
Root mean squared error                  0.305
Relative absolute error                 33.6511 %
Root relative squared error             61.2344 %
Total Number of Instances              600

=== Detailed Accuracy By Class ===

               TP Rate   FP Rate   Precision   Recall   F-Measure   ROC Area   Class
               0.861     0.071     0.911       0.861    0.886       0.883      YES
               0.929     0.139     0.889       0.929    0.909       0.883      NO
Weighted Avg.  0.898     0.108     0.899       0.898    0.898       0.883

=== Confusion Matrix ===

   a   b   <-- classified as
 236  38 |   a = YES
  23 303 |   b = NO
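
The idea behind the stratified 10-fold split used in this experiment can be sketched in Python. This is an illustration under simplifying assumptions: the function names are made up, and the instances are dealt out in order without the random shuffling that Weka performs first.

```python
from collections import defaultdict


def stratified_folds(labels, k):
    """Assign each instance index to one of k folds, keeping class proportions similar in every fold."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    position = 0
    for indices in by_class.values():
        for index in indices:
            folds[position % k].append(index)  # deal instances out round-robin
            position += 1
    return folds


def cross_validation_splits(labels, k):
    """Yield (train_indices, test_indices); each fold is used exactly once for testing."""
    folds = stratified_folds(labels, k)
    for i, test in enumerate(folds):
        train = [index for j, fold in enumerate(folds) if j != i for index in fold]
        yield train, test


# Class sizes taken from the confusion matrix above: 274 YES and 326 NO instances
labels = ["YES"] * 274 + ["NO"] * 326
for train, test in cross_validation_splits(labels, 10):
    yes_in_test = sum(labels[i] == "YES" for i in test)
    print(len(train), len(test), yes_in_test)
```

Every instance is tested exactly once by a model that never saw it during training, which is why the cross-validated accuracy is a more honest estimate than the accuracy on the training set.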


EXPERIMENT-7

Aim: To delete one attribute in the GUI Explorer and observe the effect, using the Weka mining tool.

Tools/ Apparatus: Weka mining tool.

Procedure:

1) Given the Bank database for mining.

2) Use the Weka GUI Chooser.

3) Select EXPLORER present in Applications.

4) Select Preprocess Tab.

5) Go to OPEN file and browse the file that is already stored in the system “bank.csv”.

6) In the "Filter" panel, click on the "Choose" button. This will show a popup window with a list
of available filters.

7) Select “weka.filters.unsupervised.attribute.Remove”.

8) Next, click on the text box immediately to the right of the "Choose" button.

9) In the resulting dialog box, enter the index of the attribute to be filtered out. (Make sure that the
"invertSelection" option is set to false.)

10) Then click "OK". Now, in the filter box you will see "Remove -R 1".

11) Click the "Apply" button to apply this filter to the data. This will remove the "id" attribute
and create a new working relation.

12) To save the new working relation as an ARFF file, click on the "Save" button in the top panel.

13) Go to OPEN file and browse to the newly saved file (the one with the attribute deleted).

14) Go to the Classify tab.

15) Choose Classifier “Tree”.

16) Select the J48 tree.

17) Select the Test option “Use training set”.

18) If needed, select the attribute to be used as the class.

19) Click Start.

20) The output details appear in the Classifier output panel.

21) Right-click on the result list and select the “Visualize tree” option.

22) Compare the output results with those of the 4th experiment.

23) Check whether the accuracy increased or decreased.

24) Check whether removing this attribute has any significant effect.
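
What the "Remove -R 1" filter does to the data can be mimicked outside Weka with a short Python sketch. The function name `remove_attributes` and the three-line sample are assumptions made for illustration; the sample only imitates the layout of a CSV file with an "id" column first.

```python
import csv
import io


def remove_attributes(rows, indices_1_based):
    """Drop the listed columns (1-based, as in "Remove -R 1") from every row, header included."""
    drop = {i - 1 for i in indices_1_based}
    return [[v for j, v in enumerate(row) if j not in drop] for row in rows]


# Hypothetical excerpt; the real bank.csv has more attributes and rows
sample = "id,age,income,pep\nID001,48,17546.0,YES\nID002,40,30085.1,NO\n"
rows = list(csv.reader(io.StringIO(sample)))
for row in remove_attributes(rows, [1]):
    print(row)
```

An identifier column such as "id" takes a different value in every record, so a tree that splits on it memorizes the training set instead of learning a rule that generalizes. This is the effect the experiment asks you to look for.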

Sample output:


EXPERIMENT-8

Aim: To select some attributes in the GUI Explorer, perform classification and observe the effect,
using the Weka mining tool.

Tools/ Apparatus: Weka mining tool.

Procedure:

1) Given the Bank database for mining.

2) Use the Weka GUI Chooser.

3) Select EXPLORER present in Applications.

4) Select Preprocess Tab.

5) Go to OPEN file and browse the file that is already stored in the system “bank.csv”.

6) Select the attributes to be removed from the attributes list and remove them. After this step, only
the attributes necessary for classification are left in the attributes panel.

7) Then go to the Classify tab.

8) Choose Classifier “Tree”.

9) Select J48.

10) Select the Test option “Use training set”.

11) If needed, select the attribute to be used as the class.

12) Click Start.

13) The output details appear in the Classifier output panel.

14) Right-click on the result list and select the “Visualize tree” option.

15) Compare the output results with those of the 4th experiment.

16) Check whether the accuracy increased or decreased.

17) Check whether removing these attributes has any significant effect.

Sample output:


EXPERIMENT-9

Aim: To create a decision tree by cross-validation on the training data set after changing the cost
matrix in the Weka mining tool.

Tools/ Apparatus: Weka mining tool.

Procedure:

1) Given the Bank database for mining.

2) Use the Weka GUI Chooser.

3) Select EXPLORER present in Applications.

4) Select Preprocess Tab.

5) Go to OPEN file and browse the file that is already stored in the system “bank.csv”.

6) Go to Classify tab.

7) Choose Classifier “Tree”

8) Select j48

9) Select the Test option “Use training set”.

10) Click on “More options”.

11) Select “Cost-sensitive evaluation” and click on the “Set” button.

12) Set the matrix values and click on “Resize”. Then close the window.

13) Click OK.

14) Click Start.

15) The output details appear in the Classifier output panel.

16) Select the Test option “Cross-validation”.

17) Set “Folds”, e.g., 10.

18) If needed, select the attribute to be used as the class.

19) Click Start.

20) The output details appear in the Classifier output panel.

21) Compare the results of the 15th and 20th steps.

22) Compare the results with those of Experiment 6.
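
The effect of a cost matrix on the evaluation can be sketched in Python. The sketch assumes that rows are actual classes and columns are predicted classes, and the skewed cost values are invented for illustration; the confusion matrix is the cross-validation result from Experiment 6.

```python
def total_cost(confusion, cost):
    """Sum confusion[i][j] * cost[i][j]; rows are actual classes, columns are predicted classes."""
    return sum(
        confusion[i][j] * cost[i][j]
        for i in range(len(confusion))
        for j in range(len(confusion[i]))
    )


confusion = [[236, 38], [23, 303]]  # a = YES, b = NO
default_cost = [[0, 1], [1, 0]]     # every misclassification costs 1
skewed_cost = [[0, 5], [1, 0]]      # assumed: a YES classified as NO costs 5 times as much

instances = sum(sum(row) for row in confusion)
for name, cost in (("default", default_cost), ("skewed", skewed_cost)):
    cost_sum = total_cost(confusion, cost)
    print(name, cost_sum, round(cost_sum / instances, 4))
```

With the default matrix the total cost equals the number of misclassified instances (61). With the skewed matrix the same predictions cost 38 × 5 + 23 = 213, so two models with the same accuracy can differ considerably once errors are weighted.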

Sample output:


EXPERIMENT-10

Aim: To check whether a small rule or a long rule is better (checking the bias), by training on the
data set using the Weka mining tool.

Tools/ Apparatus: Weka mining tool.

Procedure:

The answer depends on the attribute set and on the relationships among the attributes that we
want to study. It should be judged against the database and the user requirements.

EXPERIMENT-11

Aim: To create a decision tree using prune mode and reduced-error pruning, and to show the
accuracy for the cross-validation trained data set using the Weka mining tool.

Tools/ Apparatus: Weka mining tool.

Theory:

Reduced-error pruning

• Each node of the (over-fit) tree is examined for pruning.

• A node is pruned (removed) only if the resulting pruned tree performs no worse than the
original over the validation set.

• Pruning a node consists of:

– Removing the sub-tree rooted at the pruned node

– Making the pruned node a leaf node

– Assigning the pruned node the most common classification of the training instances attached to
that node

• Nodes are pruned iteratively:

– Always select a node whose removal most increases the DT accuracy over the validation set

– Stop when further pruning decreases the DT accuracy over the validation set

IF (Children = yes) ∧ (income >= 30000)

THEN (car = Yes)
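
The pruning loop described above can be sketched in Python. This is a minimal illustration, not Weka's implementation: the tree representation (a dict with "attr", "majority" and "branches" keys, leaves as plain labels), the function names and the toy validation set are all assumptions made for the example.

```python
import copy


def classify(tree, x):
    """Follow branches until a leaf (a plain label) is reached."""
    while isinstance(tree, dict):
        tree = tree["branches"].get(x[tree["attr"]], tree["majority"])
    return tree


def accuracy(tree, data):
    return sum(classify(tree, x) == y for x, y in data) / len(data)


def internal_paths(tree, path=()):
    """Yield the branch-value path leading to every internal node."""
    if isinstance(tree, dict):
        yield path
        for value, child in tree["branches"].items():
            yield from internal_paths(child, path + (value,))


def pruned_at(tree, path):
    """Return a copy of tree in which the node at path is replaced by its majority-class leaf."""
    if not path:
        return tree["majority"]
    new_tree = copy.deepcopy(tree)
    node = new_tree
    for value in path[:-1]:
        node = node["branches"][value]
    node["branches"][path[-1]] = node["branches"][path[-1]]["majority"]
    return new_tree


def reduced_error_prune(tree, validation):
    """Greedily prune while accuracy on the validation set does not decrease."""
    while isinstance(tree, dict):
        candidates = [pruned_at(tree, p) for p in internal_paths(tree)]
        best = max(candidates, key=lambda t: accuracy(t, validation))
        if accuracy(best, validation) < accuracy(tree, validation):
            break
        tree = best
    return tree


# Hypothetical over-fit tree; "majority" holds the most common training class at each node
tree = {
    "attr": "children", "majority": "NO",
    "branches": {
        "yes": {"attr": "income", "majority": "YES", "branches": {"high": "YES", "low": "NO"}},
        "no": "NO",
    },
}
validation = [
    ({"children": "yes", "income": "high"}, "YES"),
    ({"children": "yes", "income": "low"}, "YES"),
    ({"children": "no", "income": "low"}, "NO"),
    ({"children": "no", "income": "high"}, "NO"),
]
print(accuracy(tree, validation))
print(reduced_error_prune(tree, validation))
```

On this toy data the "income" sub-tree is replaced by a leaf because doing so raises the validation accuracy, while pruning the root would lower it, so the loop stops there.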

Procedure:

1) Given the Bank database for mining.

2) Use the Weka GUI Chooser.

3) Select EXPLORER present in Applications.

4) Select Preprocess Tab.

5) Go to OPEN file and browse the file that is already stored in the system “bank.csv”.

6) Select some of the attributes from the attributes list.

7) Go to the Classify tab.

8) Choose Classifier “Tree”.

9) Select “J48”, the decision tree classifier that provides the pruning options used in the steps below.


10) Select the Test option “Use training set”.

11) Right-click on the text box beside the “Choose” button and select “Show properties”.

12) Change the “unpruned” option from “False” to “True”.

13) Change the reduced-error pruning setting as needed.

14) If needed, select the attribute to be used as the class.

15) Click Start.

16) The output details appear in the Classifier output panel.

17) Right-click on the result list and select the “Visualize tree” option.

Sample output:


EXPERIMENT-12

Aim: To compare the OneR classifier, which uses a single attribute and rule, with the J48 and PART
classifiers, by training on the data set using the Weka mining tool.

Tools/ Apparatus: Weka mining tool.

Procedure:

1) Given the Bank database for mining.

2) Use the Weka GUI Chooser.

3) Select EXPLORER present in Applications.

4) Select Preprocess Tab.

5) Go to OPEN file and browse the file that is already stored in the system “bank.csv”.

6) Select some of the attributes from the attributes list.

7) Go to the Classify tab.

8) Choose Classifier “Trees”.

9) Select “J48”.

10) Select the Test option “Use training set”.

11) If needed, select the attribute to be used as the class.

12) Click Start.

13) The output details appear in the Classifier output panel.

14) Right-click on the result list and select the “Visualize tree” option.

(or)

java weka.classifiers.trees.J48 -t c:\temp\bank.arff


Procedure for “OneR”:

1) Given the Bank database for mining.

2) Use the Weka GUI Chooser.

3) Select EXPLORER present in Applications.

4) Select Preprocess Tab.

5) Go to OPEN file and browse the file that is already stored in the system “bank.csv”.

6) Select some of the attributes from the attributes list.

7) Go to the Classify tab.

8) Choose Classifier “Rules”.

9) Select “OneR”.

10) Select the Test option “Use training set”.

11) If needed, select the attribute to be used as the class.

12) Click Start.

13) The output details appear in the Classifier output panel.

Procedure for “PART”:

1) Given the Bank database for mining.

2) Use the Weka GUI Chooser.

3) Select EXPLORER present in Applications.

4) Select Preprocess Tab.

5) Go to OPEN file and browse the file that is already stored in the system “bank.csv”.

6) Select some of the attributes from the attributes list.

7) Go to the Classify tab.

8) Choose Classifier “Rules”.

9) Select “PART”.

10) Select the Test option “Use training set”.

11) If needed, select the attribute to be used as the class.

12) Click Start.

13) The output details appear in the Classifier output panel.

Attribute relevance with respect to the class – relevant attribute (science)

IF accounting=1 THEN class=A (Error=0, Coverage = 7 instances)

IF accounting=0 THEN class=B (Error=4/13, Coverage = 13 instances)
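
The way OneR arrives at such single-attribute rules can be sketched in Python. This is a simplified illustration for categorical attributes only, not Weka's implementation. The function name `one_r` is made up, and the 20 toy instances are constructed to reproduce the error and coverage figures of the two rules above; the second attribute is hypothetical.

```python
from collections import Counter, defaultdict


def one_r(rows, labels):
    """Return (attribute index, {value: predicted class}, training errors) for the best single attribute."""
    best = None
    for k in range(len(rows[0])):
        counts = defaultdict(Counter)
        for row, label in zip(rows, labels):
            counts[row[k]][label] += 1
        # One rule per attribute value: predict the most frequent class for that value
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(sum(c.values()) - max(c.values()) for c in counts.values())
        if best is None or errors < best[2]:
            best = (k, rule, errors)
    return best


# Column 0 plays the role of "accounting"; column 1 is a made-up, less useful attribute
rows = (
    [("1", "p")] * 7      # accounting=1, class A (7 instances, no errors)
    + [("0", "q")] * 2    # accounting=0, class A
    + [("0", "p")] * 2    # accounting=0, class A
    + [("0", "p")] * 5    # accounting=0, class B
    + [("0", "q")] * 4    # accounting=0, class B
)
labels = ["A"] * 11 + ["B"] * 9
print(one_r(rows, labels))
```

OneR keeps only the attribute whose rules make the fewest errors on the training data. Here that is column 0, with 4 errors out of the 13 instances covered by the second rule, whereas J48 and PART can combine several attributes.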

Sample output:

J48

java weka.classifiers.trees.J48 -t c:/temp/bank.arff

OneR

PART
