Step1: Start
Step2: Read the input file name
Step3: Initialize the flag and identifier count variables: flag=0, cnt=0
Step4: Check for end of file. If it has been reached, go to Step7
Step5: Check for a data type in the current line. If none is found, go to Step4
Step6: Count the number of identifiers in the line, then go to Step4
Step7: Print the number of identifiers
Step8: Stop
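The steps above can also be sketched in plain C, without Lex, to make the role of the flag and the counter concrete. This is only an illustrative sketch: the names is_type_keyword and count_identifiers are hypothetical, and the code works on a string in memory rather than on a file.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if word[0..len) is a data type keyword. */
static int is_type_keyword(const char *word, size_t len)
{
    static const char *const types[] = {
        "int", "char", "float", "double", "short", "long", "unsigned"
    };
    for (size_t i = 0; i < sizeof types / sizeof types[0]; i++) {
        if (strlen(types[i]) == len && strncmp(types[i], word, len) == 0)
            return 1;
    }
    return 0;
}

/* Hypothetical function: counts the names that follow a data type
   on the same line and are followed by ',', '=', '[' or ';'. */
int count_identifiers(const char *text)
{
    int flag = 0, cnt = 0;
    assert(text != NULL);
    for (const char *p = text; *p != '\0'; ) {
        if (*p == '\n') {
            flag = 0;                 /* a new line ends the declaration */
            p++;
        } else if (isalpha((unsigned char)*p) || *p == '_') {
            const char *start = p;
            while (isalnum((unsigned char)*p) || *p == '_')
                p++;                  /* consume the whole word */
            if (*p == ' ' && is_type_keyword(start, (size_t)(p - start)))
                flag = 1;             /* data type seen on this line */
            else if (flag && *p != '\0' && strchr(",=[;", *p) != NULL)
                cnt++;                /* declared identifier */
        } else {
            p++;                      /* ignore every other character */
        }
    }
    return cnt;
}
```

For example, count_identifiers("int a,b,c[10],d=19;\n") returns 4, and a line such as "x=1;\n" with no data type contributes nothing.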
Explanation:
%{
#include<stdio.h>
#include<string.h>
int flag=0,cnt=0;
char str[50];
%}
Include the header files. Declare a string 'str' to hold the identifiers scanned from the input file; each identifier is concatenated to it as it is matched. 'str' holds at most 49 characters plus the terminator, so it suits only small inputs. 'flag' records whether a data type has been seen on the current line and is initially 0. 'cnt' counts the total number of identifiers and is also initially 0.
%%
(int\ )|(char\ )|(float\ )|(double\ )|(short\ )|(long\ )|(unsigned\ ) {flag=1;}
This rule scans the input file for 'int ', 'char ', 'long ' and so on, which is how data types appear in source code. The alternatives must be separated by '|', otherwise the pattern would match only when several of them appear one after another. If any one of these data types is encountered, we set flag=1.
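[a-zA-Z_][A-Za-z0-9_]*[,=\[;] {if(flag==1)
{ cnt++;
/* copy the name without its trailing delimiter, only if it still fits in str */
if(strlen(str)+yyleng+1<=sizeof(str))
{ strncat(str,yytext,yyleng-1);
strcat(str," ");
}
}
}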
- The first line is the regular expression for a declared variable. [a-zA-Z_] means that a variable name must start with a letter or an underscore.
- [A-Za-z0-9_]* means that the first character can be followed by any number of letters, digits, or underscores.
- [,=\[;] matches the character that follows the name: a comma ',', an equals sign '=', an opening array bracket '[', or a semicolon ';'. A variable in a declaration can be followed by any of these.
- Example: int a,b,c[10],d=19;
- Here a, b, c and d are all variables.
- When a variable is matched, we check whether flag==1 (that is, whether a data type was seen earlier on the same line; see the previous snippet). If so, cnt is incremented and the variable is appended to str. The matched text is in yytext and its length is in yyleng, so copying yyleng-1 characters leaves out the trailing delimiter.
- strcat then appends a space, to separate the variables stored in str.
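To see what this rule matches and why only yyleng-1 characters are copied, the pattern can be imitated by hand in C. This is a rough sketch, not what Lex generates: match_declarator and append_name are hypothetical names, and the capacity check in append_name is an added assumption that the Lex action itself may not have.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of the text at s that matches
   [a-zA-Z_][A-Za-z0-9_]*[,=\[;], or 0 if there is no match. */
size_t match_declarator(const char *s)
{
    size_t n;
    assert(s != NULL);
    if (!(isalpha((unsigned char)s[0]) || s[0] == '_'))
        return 0;                     /* must start with a letter or '_' */
    n = 1;
    while (isalnum((unsigned char)s[n]) || s[n] == '_')
        n++;                          /* rest of the name */
    if (s[n] == ',' || s[n] == '=' || s[n] == '[' || s[n] == ';')
        return n + 1;                 /* name plus one delimiter */
    return 0;
}

/* Hypothetical helper: appends the first len-1 characters of text and a
   space to list, like strncat(str,yytext,yyleng-1); strcat(str," ").
   Returns 0 without changing list if the result would not fit in cap bytes. */
int append_name(char *list, size_t cap, const char *text, size_t len)
{
    size_t used;
    assert(list != NULL && text != NULL && len >= 1);
    used = strlen(list);
    if (used + (len - 1) + 2 > cap)   /* name + space + terminator */
        return 0;
    strncat(list, text, len - 1);     /* drop the trailing delimiter */
    strcat(list, " ");
    return 1;
}
```

For the input "c[10]", match_declarator returns 2 (the text "c["), and append_name with that text and length stores "c " in the list, which is the effect the Lex action has on str.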
[^a-zA-Z_][^A-Za-z0-9_]*[,=\[;] ;
\n {flag=0;}
. ;
%%
- Anything that is not a variable is ignored.
- A newline marks the end of the line, so flag is reset to 0 to mark the end of the previous declaration.
- Any other character, matched by '.', is ignored.
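int main(int argc,char *argv[])
{
/* the input file name is expected as the first command line argument */
if(argc<2 || (yyin=fopen(argv[1],"r"))==NULL)
{ fprintf(stderr,"Could not open input file\n");
return 1;
}
yylex();
printf("\nNo of identifiers=%d \n The identifiers are : %s\n",cnt,str);
fclose(yyin);
return 0;
}
/* returning 1 tells the scanner that there is no more input */
int yywrap(void)
{ return 1; }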
The main function takes the input file name as a command line argument.
Open the file in read mode. The file name is argv[1] (argv[0] is the program name, for example ./a.out).
Call the scanner, yylex().
After scanning, print the number of identifiers (that is, variables) and the list of all matched identifiers, which is stored in the string str.
yywrap() is called when the scanner reaches the end of the input. If it returns 1, there is no more input and scanning ends.