Compiler Principles, Lab 1: Lexical Analyzer (C++ Version)
Consider the following subset of the C language:
| Token | Category code | Mnemonic | Value |
|---|---|---|---|
| break | 1 | BREAK | _ |
| char | 2 | CHAR | _ |
| do | 3 | DO | _ |
| double | 4 | DOUBLE | _ |
| else | 5 | ELSE | _ |
| if | 6 | IF | _ |
| int | 7 | INT | _ |
| return | 8 | RETURN | _ |
| void | 9 | VOID | _ |
| while | 10 | WHILE | _ |
| identifier | 11 | ID | the string that forms the identifier |
| constant | 12 | NUM | the numeric value |
| string | 13 | STRING | the string |
| + | 14 | ADD | _ |
| - | 15 | SUB | _ |
| * | 16 | MUL | _ |
| / | 17 | DIV | _ |
| > | 18 | GT | _ |
| >= | 19 | GE | _ |
| < | 20 | LT | _ |
| <= | 21 | LE | _ |
| == | 22 | EQ | _ |
| != | 23 | NE | _ |
| = | 24 | ASSIGN | _ |
| { | 25 | LB | _ |
| } | 26 | RB | _ |
| ( | 27 | LR | _ |
| ) | 28 | RR | _ |
| , | 29 | COMMA | _ |
| ; | 30 | SEMI | _ |
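The table maps directly onto an enumeration plus a keyword lookup. The following is a minimal sketch, not the lab's reference solution: the names `TokenKind` and `classifyWord` are assumptions introduced here for illustration, while the numeric codes and mnemonics are copied from the table.

```cpp
#include <string>
#include <unordered_map>

// Category codes from the table; enumerators follow the mnemonic column.
// Starting at 1 lets the remaining values follow the table order implicitly.
enum class TokenKind {
    BREAK = 1, CHAR, DO, DOUBLE, ELSE, IF, INT, RETURN, VOID, WHILE,
    ID, NUM, STRING,
    ADD, SUB, MUL, DIV, GT, GE, LT, LE, EQ, NE, ASSIGN,
    LB, RB, LR, RR, COMMA, SEMI
};

// Hypothetical helper: returns the keyword's kind, or TokenKind::ID when the
// word is not one of the ten reserved words.
inline TokenKind classifyWord(const std::string& word) {
    static const std::unordered_map<std::string, TokenKind> keywords = {
        {"break", TokenKind::BREAK},   {"char", TokenKind::CHAR},
        {"do", TokenKind::DO},         {"double", TokenKind::DOUBLE},
        {"else", TokenKind::ELSE},     {"if", TokenKind::IF},
        {"int", TokenKind::INT},       {"return", TokenKind::RETURN},
        {"void", TokenKind::VOID},     {"while", TokenKind::WHILE}};
    const auto it = keywords.find(word);
    return it == keywords.end() ? TokenKind::ID : it->second;
}
```

Keeping keywords in a lookup table means the scanner only needs one rule for words: match an identifier first, then check whether it is reserved.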
The regular definitions of the tokens are as follows:
- D = [0-9]
- L = [a-zA-Z_]
- H = [a-fA-F0-9]
- E = [Ee][+-]?{D}+
- FS = (f|F|l|L)
- IS = (u|U|l|L)*
Identifiers
- id = {L}({L}|{D})*
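The identifier rule can be recognized with a short hand-written loop. This is a sketch under the assumption that the scanner works on a `std::string` and an index; `isLetter`, `isDigit` and `matchIdentifier` are hypothetical helper names, not part of the assignment.

```cpp
#include <cstddef>
#include <string>

// L = [a-zA-Z_]
inline bool isLetter(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
}

// D = [0-9]
inline bool isDigit(char c) { return c >= '0' && c <= '9'; }

// Hypothetical helper: length of the longest prefix of src starting at pos
// that matches {L}({L}|{D})*, or 0 when no identifier starts there.
inline std::size_t matchIdentifier(const std::string& src, std::size_t pos) {
    if (pos >= src.size() || !isLetter(src[pos])) return 0;
    std::size_t end = pos + 1;
    while (end < src.size() && (isLetter(src[end]) || isDigit(src[end]))) {
        ++end;
    }
    return end - pos;
}
```

Returning a length rather than a substring lets the caller decide whether the lexeme is a keyword or an ordinary identifier before copying it.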
Constants
- num:
- 0[xX]{H}+{IS}?
- | 0{D}+{IS}?
- | {D}+{IS}?
- | L?'(\\.|[^\\'])+'
- | {D}+{E}{FS}?
- | {D}*"."{D}+({E})?{FS}?
- | {D}+"."{D}*({E})?{FS}?
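The constant rule can be transcribed into a single `std::regex`. This is a sketch, with `matchNumber` as a hypothetical helper name. A lex-style scanner picks the longest match, whereas the ECMAScript grammar used by `std::regex` picks the first alternative that succeeds, so this sketch lists the floating-point forms before the plain integer form. The octal form `0{D}+` is folded into `{D}+`, since both yield the same NUM token.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: length of the constant starting at pos, or 0 if none.
inline std::size_t matchNumber(const std::string& src, std::size_t pos) {
    static const std::regex numRe(
        R"re(0[xX][a-fA-F0-9]+[uUlL]*)re"                   // hexadecimal
        R"re(|[0-9]*\.[0-9]+([Ee][+-]?[0-9]+)?[fFlL]?)re"   // {D}*"."{D}+
        R"re(|[0-9]+\.[0-9]*([Ee][+-]?[0-9]+)?[fFlL]?)re"   // {D}+"."{D}*
        R"re(|[0-9]+[Ee][+-]?[0-9]+[fFlL]?)re"              // {D}+{E}
        R"re(|[0-9]+[uUlL]*)re"                             // decimal and octal
        R"re(|L?'(\\.|[^\\'])+')re");                       // character constant
    if (pos > src.size()) return 0;
    std::smatch m;
    const auto first = src.cbegin() + static_cast<std::ptrdiff_t>(pos);
    // match_continuous anchors the match at `first` instead of searching ahead.
    if (std::regex_search(first, src.cend(), m, numRe,
                          std::regex_constants::match_continuous)) {
        return static_cast<std::size_t>(m.length(0));
    }
    return 0;
}
```

For the sample program, the lexeme `0.0` is consumed by the second alternative and `100` by the decimal alternative.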
Strings
- string = L?\"(\\.|[^\\"])*\"
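A string literal is easy to scan by hand, because the only subtlety is that a backslash protects the character after it. This is a sketch with a hypothetical helper name, `matchString`. Unlike the `.` in the rule, it accepts any character after a backslash, including a newline.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: length of the string literal starting at pos
// (including the quotes and an optional L prefix), or 0 if there is none
// or the literal is not terminated.
inline std::size_t matchString(const std::string& src, std::size_t pos) {
    std::size_t i = pos;
    if (i < src.size() && src[i] == 'L') ++i;          // optional wide prefix
    if (i >= src.size() || src[i] != '"') return 0;    // opening quote required
    ++i;
    while (i < src.size()) {
        if (src[i] == '\\') {
            if (i + 1 >= src.size()) return 0;         // dangling backslash
            i += 2;                                    // skip the escaped char
        } else if (src[i] == '"') {
            return i + 1 - pos;                        // closing quote found
        } else {
            ++i;
        }
    }
    return 0;                                          // unterminated literal
}
```

Because `L` is also a letter, a scanner built on this helper would try the string rule before the identifier rule.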
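Perform lexical analysis on a given source program and output the result as two-tuples, one token per line.
For example, the following source program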
void main()
{
double sum = 0.0;
double x = 1.0;
while (x <= 100) sum = sum + x;
printf("sum = %fn", sum);
}
produces the following lexical analysis result:
(VOID, _)
(ID, "main")
(LR, _)
(RR, _)
(LB, _)
(DOUBLE, _)
(ID, "sum")
(ASSIGN, _)
(NUM, 0.0)
(SEMI, _)
(DOUBLE, _)
(ID, "x")
(ASSIGN, _)
(NUM, 1.0)
(SEMI, _)
(WHILE, _)
(LR, _)
(ID, "x")
(LE, _)
(NUM, 100)
(RR, _)
(ID, "sum")
(ASSIGN, _)
(ID, "sum")
(ADD, _)
(ID, "x")
(SEMI, _)
(ID, "printf")
(LR, _)
(STRING, "sum = %fn")
(COMMA, _)
(ID, "sum")
(RR, _)
(SEMI, _)
(RB, _)
Write the C++ code
#include
#include
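One possible shape for the analyzer is sketched below. It is a sketch under stated assumptions rather than the original listing: the function name `tokenize`, the use of `std::regex`, and the `(ERROR, c)` tuple for unrecognized characters are all introduced here for illustration. Rules are tried in the order string, constant, identifier, operator, so that the `L` prefix of a wide literal is not mistaken for an identifier.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical entry point: converts source text into "(MNEMONIC, value)" lines.
std::vector<std::string> tokenize(const std::string& src) {
    static const std::map<std::string, std::string> keywords = {
        {"break", "BREAK"}, {"char", "CHAR"}, {"do", "DO"}, {"double", "DOUBLE"},
        {"else", "ELSE"}, {"if", "IF"}, {"int", "INT"}, {"return", "RETURN"},
        {"void", "VOID"}, {"while", "WHILE"}};
    // Two-character operators come first so ">=" is not split into ">" and "=".
    static const std::vector<std::pair<std::string, std::string>> ops = {
        {">=", "GE"}, {"<=", "LE"}, {"==", "EQ"}, {"!=", "NE"}, {"+", "ADD"},
        {"-", "SUB"}, {"*", "MUL"}, {"/", "DIV"}, {">", "GT"}, {"<", "LT"},
        {"=", "ASSIGN"}, {"{", "LB"}, {"}", "RB"}, {"(", "LR"}, {")", "RR"},
        {",", "COMMA"}, {";", "SEMI"}};
    static const std::regex strRe(R"re(L?"(\\.|[^\\"])*")re");
    static const std::regex numRe(
        R"re(0[xX][a-fA-F0-9]+[uUlL]*)re"
        R"re(|[0-9]*\.[0-9]+([Ee][+-]?[0-9]+)?[fFlL]?)re"
        R"re(|[0-9]+\.[0-9]*([Ee][+-]?[0-9]+)?[fFlL]?)re"
        R"re(|[0-9]+[Ee][+-]?[0-9]+[fFlL]?)re"
        R"re(|[0-9]+[uUlL]*)re"
        R"re(|L?'(\\.|[^\\'])+')re");
    static const std::regex idRe(R"re([a-zA-Z_][a-zA-Z_0-9]*)re");
    const auto anchored = std::regex_constants::match_continuous;

    std::vector<std::string> out;
    std::size_t pos = 0;
    while (pos < src.size()) {
        if (std::isspace(static_cast<unsigned char>(src[pos]))) { ++pos; continue; }
        const auto first = src.cbegin() + static_cast<std::ptrdiff_t>(pos);
        std::smatch m;
        if (std::regex_search(first, src.cend(), m, strRe, anchored)) {
            out.push_back("(STRING, " + m.str() + ")");   // lexeme keeps its quotes
        } else if (std::regex_search(first, src.cend(), m, numRe, anchored)) {
            out.push_back("(NUM, " + m.str() + ")");
        } else if (std::regex_search(first, src.cend(), m, idRe, anchored)) {
            const auto kw = keywords.find(m.str());
            out.push_back(kw != keywords.end() ? "(" + kw->second + ", _)"
                                               : "(ID, \"" + m.str() + "\")");
        } else {
            bool matched = false;
            for (const auto& op : ops) {
                if (src.compare(pos, op.first.size(), op.first) == 0) {
                    out.push_back("(" + op.second + ", _)");
                    pos += op.first.size();
                    matched = true;
                    break;
                }
            }
            if (!matched) {  // assumption: report the character and skip it
                out.push_back(std::string("(ERROR, ") + src[pos] + ")");
                ++pos;
            }
            continue;
        }
        pos += static_cast<std::size_t>(m.length(0));     // advance past the lexeme
    }
    return out;
}
```

Printing each element of the returned vector on its own line gives the two-tuple output format described in the task. A hand-written state machine would avoid the overhead of `std::regex`, but the regex version stays closer to the regular definitions.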
Run results