Experiment No. - 9: Aim: (Tokenizing) - A Program That Reads A Source Code in C From An Unformatted File

The document describes an experiment to tokenize a C source code file. The program reads in a C source code file, extracts various tokens like keywords, variable names, operators, and constant values. It then writes the tokenized output to a new file, identifying each token with its type or category. The program uses separate files containing operator and keyword tokens to classify each tokenized word from the source file.

Uploaded by

Lion

Experiment no. 9

Aim: (Tokenizing). A program that reads a source code in C from an unformatted file
and extracts various types of tokens from it (e.g. keywords, variable names, operators,
constant values).

Description: Lexical analysis is the process of converting a sequence of characters
(such as in a computer program or web page) into a sequence of tokens (strings with
an identified “meaning”). A program that performs lexical analysis may be called a
lexer, tokenizer or scanner.

Token
A token is a structure representing a lexeme that explicitly indicates its categorization
for the purpose of parsing. A category of tokens is what in linguistics might be called a
part of speech. Examples of token categories include “identifier” and “integer
literal”, although the set of token categories differs between programming languages. The
process of forming tokens from an input stream of characters is called tokenization.
Consider this expression in the C programming language: Sum=3 + 2;
It is tokenized and represented by the following table:

Lexeme    Token category
Sum       identifier
=         assignment operator
3         integer literal
+         addition operator
2         integer literal
;         end of statement
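The idea of a token as a lexeme paired with a category can be sketched in C as follows. This is only an illustration and not the program of this experiment: the names Token, TokenType and tokenize are hypothetical, and the four categories are an assumed, simplified set.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token categories, chosen only for this sketch. */
typedef enum { TOK_IDENTIFIER, TOK_INTEGER, TOK_OPERATOR, TOK_SEPARATOR } TokenType;

/* A token: the lexeme text together with its category. */
typedef struct {
    TokenType type;
    char lexeme[32];
} Token;

/* Splits src into at most max tokens and returns how many were stored.
   Every character that is not a letter, digit, underscore or white space
   is treated as a one-character operator (';' as a separator). */
size_t tokenize(const char *src, Token *out, size_t max)
{
    size_t n = 0;
    assert(src != NULL && out != NULL);
    while (*src != '\0' && n < max) {
        size_t len = 0, copy;
        if (isspace((unsigned char)*src)) {
            src++;
            continue;
        }
        if (isalpha((unsigned char)*src) || *src == '_') {
            out[n].type = TOK_IDENTIFIER;
            while (isalnum((unsigned char)src[len]) || src[len] == '_')
                len++;
        } else if (isdigit((unsigned char)*src)) {
            out[n].type = TOK_INTEGER;
            while (isdigit((unsigned char)src[len]))
                len++;
        } else {
            out[n].type = (*src == ';') ? TOK_SEPARATOR : TOK_OPERATOR;
            len = 1;
        }
        /* Copy the lexeme, truncating it if it does not fit. */
        copy = len < sizeof out[n].lexeme ? len : sizeof out[n].lexeme - 1;
        memcpy(out[n].lexeme, src, copy);
        out[n].lexeme[copy] = '\0';
        src += len;
        n++;
    }
    return n;
}
```

Calling tokenize("Sum=3 + 2;", tokens, 8) on an array of eight Token values stores six tokens, one for each row of a lexeme/category table for that expression.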

Program:
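#include <stdio.h>
#include <ctype.h>
#include <string.h>

int main(void)
{
    FILE *fi, *fo, *fop, *fk;
    int c, flag = 0, i = 1;          /* c is int so that EOF can be detected */
    char a[32], ch[32], file[256];

    printf("\n Enter the File Name:");
    if (scanf("%255s", file) != 1)
        return 1;
    fi = fopen(file, "r");           /* source file to be tokenized */
    fo = fopen("inter.c", "w");      /* intermediate file */
    fop = fopen("oper.c", "r");      /* operator symbols and their names */
    fk = fopen("key.c", "r");        /* keywords, one per line */
    if (fi == NULL || fo == NULL || fop == NULL || fk == NULL)
    {
        printf("\n Cannot open one of the files\n");
        return 1;
    }

    /* Pass 1: copy words unchanged and surround every other character
       with tabs, so that each symbol becomes a separate word in inter.c.
       A newline is written as the marker "$". */
    while ((c = getc(fi)) != EOF)
    {
        if (isalnum(c) || c == '[' || c == ']' || c == '.')
            fputc(c, fo);
        else if (c == '\n')
            fprintf(fo, "\t$\t");
        else
            fprintf(fo, "\t%c\t", c);
    }
    fclose(fi);
    fclose(fo);

    /* Pass 2: read inter.c word by word and classify each word. */
    fi = fopen("inter.c", "r");
    if (fi == NULL)
        return 1;
    printf("\n Lexical Analysis");
    printf("\n Line: %d\n", i++);
    while (fscanf(fi, "%31s", a) == 1)
    {
        if (strcmp(a, "$") == 0)     /* end-of-line marker */
        {
            printf("\n Line: %d \n", i++);
            continue;
        }
        /* Operators: only words that start with a punctuation character
           are looked up; the word after a matching symbol is its name. */
        if (ispunct((unsigned char)a[0]))
        {
            while (fscanf(fop, "%31s", ch) == 1)
            {
                if (strcmp(ch, a) == 0 && fscanf(fop, "%31s", ch) == 1)
                {
                    printf("\t\t%s\t:\t%s\n", a, ch);
                    flag = 1;
                }
            }
            rewind(fop);
        }
        /* Keywords */
        while (fscanf(fk, "%31s", ch) == 1)
        {
            if (strcmp(ch, a) == 0)
            {
                printf("\t\t%s\t:\tKeyword\n", a);
                flag = 1;
            }
        }
        rewind(fk);
        /* Anything else is a constant or an identifier. */
        if (flag == 0)
        {
            if (isdigit((unsigned char)a[0]))
                printf("\t\t%s\t:\tConstant\n", a);
            else
                printf("\t\t%s\t:\tIdentifier\n", a);
        }
        flag = 0;
    }
    fclose(fi);
    fclose(fop);
    fclose(fk);
    return 0;
}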
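The classification step of the second pass (keyword, constant or identifier) can also be written as a small function that searches a table in memory, which avoids re-reading the keyword file for every word. The following is a minimal sketch under that assumption; the function name classify and the array keywords are hypothetical, and the table holds only some of the entries of the keyword file.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* A subset of the keyword file, kept in memory (assumed for this sketch). */
static const char *const keywords[] = {
    "int", "void", "main", "char", "if", "for", "while", "else"
};

/* Returns the category name of a word that is not an operator symbol. */
const char *classify(const char *word)
{
    size_t i;
    assert(word != NULL);
    for (i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(word, keywords[i]) == 0)
            return "Keyword";
    /* Same rule as the program: a leading digit means a constant. */
    if (isdigit((unsigned char)word[0]))
        return "Constant";
    return "Identifier";
}
```

For example, classify("while") gives "Keyword", classify("10") gives "Constant" and classify("a") gives "Identifier".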
Input Files:
oper.c
( openpara
) closepara
{ openbrace
} closebrace
< lesser
> greater
" doublequote
' singlequote
: colon
; semicolon
# preprocessor
= assign
== equal
% percentage
^ bitwise
& reference
* star
+ add
- sub
\ backslash
key.c
int
void
main
char
if
for
while
else
printf
scanf
FILE
include
stdio.h
conio.h
iostream.h
Input.c
#include "stdio.h"
#include "conio.h"
void main()
{
int a=10,b,c;
a=b*c;
getch();
}
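To see what the first pass of the program writes to the intermediate file for one line of this input, the same separation rule can be applied to a string in memory. This is a sketch only; the function name separate and its buffer interface are assumptions made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copies in to out, surrounding every character that is not a letter,
   digit, '[', ']' or '.' with tabs, and writing a newline as "$".
   Stops early if out (capacity cap) is full. Returns the output length. */
size_t separate(const char *in, char *out, size_t cap)
{
    size_t n = 0;
    assert(in != NULL && out != NULL && cap > 0);
    for (; *in != '\0'; in++) {
        unsigned char c = (unsigned char)*in;
        if (isalnum(c) || c == '[' || c == ']' || c == '.') {
            if (n + 1 >= cap)
                break;
            out[n++] = (char)c;
        } else {
            if (n + 3 >= cap)
                break;
            out[n++] = '\t';
            out[n++] = (c == '\n') ? '$' : (char)c;
            out[n++] = '\t';
        }
    }
    out[n] = '\0';
    return n;
}
```

For the input line "a=b*c;" followed by a newline, the result is a, =, b, *, c, ; and $ as separate tab-delimited words, which is the form the second pass reads word by word.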

Output:-
