Lexical tokens are defined using the datatype declaration. The syntax is as follows.
    Tokens_Decl ::= datatype Id :: lexeme =
                      Token_Spec | ... | Token_Spec ;
    Token_Spec  ::= lexeme class Id      Include a lexeme class
                 |  Id [ Regexp ]        Single token spec
A token datatype is defined by including one or more lexeme classes
and by defining stand-alone tokens. If a lexeme class is included,
all the tokens defined within that lexeme class are included. A C++
enum type with the same name as the token datatype is generated.
If a token is given an identifier name, then the same name is used
as the enum literal. On the other hand, if a string is used
to denote a token, then it can be referred to by prefixing the
string with a dot (.). For example, the token "=>" can be referenced
as ."=>" within a program.
As an example, the following token datatype definition is used within the Prop translator. Here, the keywords are first partitioned into six different lexeme classes. In addition, the tokens ID_TOK, REGEXP_TOK, etc. are defined.
    datatype PropToken :: lexeme =
         lexeme class MainKeywords
       | lexeme class Keywords
       | lexeme class SepKeywords
       | lexeme class Symbols
       | lexeme class Special
       | lexeme class Literals
       | ID_TOK       /{patvar}/
       | REGEXP_TOK   /{regexp}/
       | QUARK_TOK    /#{string}/
       | BIGINT_TOK   /#{sign}{integer}/
       | PUNCTUATIONS /[\<\>\,\.\;\&\|\^\!\~\+\-\*\/\%\?\=\:\\]/
       ;
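To illustrate the naming rule described earlier, the following is a hand-written C++ sketch of roughly what an enum for the identifier-named tokens above might look like. It is not actual Prop output: the enumerator order and values are assumptions, the tokens contributed by the six lexeme classes are left out because their definitions are not shown here, and token_name is a hypothetical helper added only for illustration.

```cpp
#include <string>

// Approximation only, NOT generated by Prop: an enum with the same name as
// the token datatype and one literal per identifier-named token.
// Tokens coming from the lexeme classes (MainKeywords, Keywords, ...) are
// omitted; the ordering and numeric values below are assumptions.
enum PropToken {
    ID_TOK,
    REGEXP_TOK,
    QUARK_TOK,
    BIGINT_TOK,
    PUNCTUATIONS
};

// Hypothetical helper (not part of Prop): returns the literal's name,
// for example when printing diagnostics from a hand-written driver.
std::string token_name(PropToken t) {
    switch (t) {
        case ID_TOK:       return "ID_TOK";
        case REGEXP_TOK:   return "REGEXP_TOK";
        case QUARK_TOK:    return "QUARK_TOK";
        case BIGINT_TOK:   return "BIGINT_TOK";
        case PUNCTUATIONS: return "PUNCTUATIONS";
    }
    return "unknown";
}
```

Because each stand-alone token carries an identifier name, ordinary C++ code can refer to it directly (for instance, comparing a scanner result against ID_TOK). Tokens denoted by strings have no such identifier, which is why they are written with the dot prefix instead.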