We'll next describe the class LexerBuffer and its subclasses.
Class LexerBuffer is the base class in the lexical buffer hierarchy.
It is defined in the library include file <AD/automata/lexerbuf.h>.
This class is responsible for implementing a string buffer for use
during lexical analysis.
As it stands, it can be used directly when the lexer input comes from a string. Memory management of the buffer is left to the user.
The class LexerBuffer has three constructors. The default
constructor initializes the string buffer to NULL. The two
other constructors initialize the string buffer to a string given
by the user. When the length is not supplied, the buffer
is assumed to be '\0'-terminated. The two set_buffer methods
can be used to set the current string buffer. Notice that
all lexical analysis operations are done in place. The user should
not alter the string buffer directly, but should use the interface
provided by this class instead.
    class LexerBuffer {
    public:
       LexerBuffer();
       LexerBuffer(char *);
       LexerBuffer(char *, size_t);
       virtual ~LexerBuffer();
       virtual void set_buffer (char *, size_t);
       void set_buffer (char *);
    };
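The difference between the two ways of supplying a buffer can be illustrated with a minimal sketch. The class SketchBuffer below is a hypothetical stand-in that only mirrors the documented signatures; it is not the library's implementation, and its data() accessor is invented for the example. Like LexerBuffer, it does not own the memory it is given.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for LexerBuffer (not the library class).
// It stores a pointer into caller-owned memory and never frees it.
class SketchBuffer {
public:
    SketchBuffer() : buf_(nullptr), size_(0) {}
    explicit SketchBuffer(char* s)
        : buf_(s), size_(s ? std::strlen(s) : 0) {}   // '\0'-terminated case
    SketchBuffer(char* s, std::size_t n)
        : buf_(s), size_(n) {}                        // explicit length case
    virtual ~SketchBuffer() {}

    // Explicit length: only the first n characters are considered.
    virtual void set_buffer(char* s, std::size_t n) {
        assert(s != nullptr || n == 0);
        buf_ = s;
        size_ = n;
    }

    // No length given: the buffer is assumed to be '\0'-terminated.
    void set_buffer(char* s) { set_buffer(s, s ? std::strlen(s) : 0); }

    int capacity() const { return static_cast<int>(size_); }
    const char* data() const { return buf_; }  // assumed helper for the demo

private:
    char*       buf_;
    std::size_t size_;
};
```

Because the buffer is scanned in place, the caller has to pass writable storage (for example a char array) and keep it alive for as long as the buffer object refers to it.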
The following methods are used to access the string buffer.
Method capacity returns the size of the buffer. Method
length returns the length of the current matched token.
Methods text can be used to obtain a pointer to the location
of the current matched token. The string returned is guaranteed
to be '\0'-terminated. Methods operator [] return
the ith character of the token. Finally, method lookahead
returns the character code of the next character to be matched.
    int          capacity   () const;
    int          length     () const;
    const char * text       () const;
    char *       text       ();
    char         operator [] (int i) const;
    char&        operator [] (int i);
    int          lookahead  () const;
    void         push_back  (int n);
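One common way to hand out a '\0'-terminated token while scanning in place is to overwrite the character just past the token with '\0' and remember the original (flex uses a similar hold-character technique for yytext). The sketch below shows that idea with a hypothetical TokenView class; whether LexerBuffer works this way internally is an assumption, and match() and restore() are names invented for the example. It also suggests why the buffer should not be altered behind the class's back.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical token view over a caller-owned buffer (not the library class).
// Assumes buf[size] is a valid '\0' terminator, as for a C string.
class TokenView {
public:
    TokenView(char* buf, std::size_t size)
        : buf_(buf), size_(size), start_(0), len_(0),
          held_('\0'), patched_(false) {}

    // Mark buf[start, start+len) as the current token and terminate it
    // in place, saving the character that gets overwritten.
    void match(std::size_t start, std::size_t len) {
        assert(start + len <= size_);
        restore();
        start_ = start;
        len_ = len;
        held_ = buf_[start_ + len_];
        buf_[start_ + len_] = '\0';
        patched_ = true;
    }

    // Put the saved character back so the buffer is unchanged.
    void restore() {
        if (patched_) {
            buf_[start_ + len_] = held_;
            patched_ = false;
        }
    }

    int         capacity() const { return static_cast<int>(size_); }
    int         length()   const { return static_cast<int>(len_); }
    const char* text()     const { return buf_ + start_; }
    char*       text()           { return buf_ + start_; }
    char  operator[](int i) const { return buf_[start_ + i]; }
    char& operator[](int i)       { return buf_[start_ + i]; }

    // Character code of the next character to be matched.
    int lookahead() const {
        char c = patched_ ? held_ : buf_[start_ + len_];
        return static_cast<unsigned char>(c);
    }

private:
    char*       buf_;
    std::size_t size_;
    std::size_t start_, len_;
    char        held_;
    bool        patched_;
};
```

In this sketch the lookahead character is exactly the one hidden by the temporary terminator, so text() and lookahead() stay consistent without copying the token.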
In addition to the string buffer, the class LexerBuffer keeps track of two additional types of information: the current context of the DFA, and whether the next token starts at the beginning of the line, or in our terminology, whether it is anchored. These are manipulated with the following methods:
    int  context      () const;
    void set_context  (int c = 0);
    Bool is_anchored  () const;
    void set_anchored (Bool a = true);
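As a rough illustration of how these two pieces of state might be used, the sketch below tracks a context number and an anchored flag in a hypothetical ScanState class. The context names INITIAL and COMMENT, the consumed() helper, the initial anchored value, and the use of plain bool in place of the library's Bool type are all assumptions made for the example.

```cpp
#include <cassert>

// Hypothetical context numbers; a real lexer would define its own.
enum { INITIAL = 0, COMMENT = 1 };

// Hypothetical stand-in mirroring the documented accessor signatures.
class ScanState {
public:
    // Assumption: scanning starts at the beginning of a line.
    ScanState() : context_(INITIAL), anchored_(true) {}

    int  context() const          { return context_; }
    void set_context(int c = 0)   { context_ = c; }
    bool is_anchored() const      { return anchored_; }
    void set_anchored(bool a = true) { anchored_ = a; }

    // Assumed helper: after a token is consumed, the next token is
    // anchored exactly when the consumed text ended with a newline.
    void consumed(const char* tok, int len) {
        assert(len >= 0);
        if (len > 0) set_anchored(tok[len - 1] == '\n');
    }

private:
    int  context_;
    bool anchored_;
};
```

A lexer action could, for instance, call set_context(COMMENT) on seeing a comment opener and set_context() on the closer, which returns to context 0 through the default argument.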
Finally, the following methods should be redefined by subclasses to alter the behavior of this class. By default, the class LexerBuffer calls fill_buffer() when it reaches the end of the string; subclasses can override this method to refill the buffer and return the number of characters read. Currently, fill_buffer is defined to do nothing and return 0. When the end of the file is reached (i.e. when fill_buffer() fails to refill the buffer and the scanning process finishes), method end_of_file is called. Currently, this is a no-op. The error handling routine error() is called with pointers to the beginning and the end of the buffer region in which the error occurs. By default, this routine prints out the buffer.
    protected:
       virtual size_t fill_buffer();
       virtual void   end_of_file();
       virtual void   error(const char * start, const char * stop);
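The refill protocol described above can be sketched with a small stand-in hierarchy. BufferBase below is a hypothetical base class that only imitates the documented hooks (fill_buffer returning 0 by default, end_of_file as a no-op); its drain() driver, the fixed 8-character chunk, and the StreamBuffer subclass are assumptions for illustration and are not part of the library. The error() hook is left out to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical base imitating the documented hooks (not the library class).
class BufferBase {
public:
    virtual ~BufferBase() {}

    // Assumed driver: keep refilling until fill_buffer() returns 0,
    // then signal end of file, as the text describes.
    std::string drain() {
        std::string all;
        for (;;) {
            std::size_t n = fill_buffer();
            if (n == 0) break;
            assert(n <= sizeof chunk_);
            all.append(chunk_, n);
        }
        end_of_file();
        return all;
    }

protected:
    virtual std::size_t fill_buffer() { return 0; }  // default: nothing read
    virtual void end_of_file() {}                    // default: no-op
    char chunk_[8];                                  // small refill area
};

// Subclass that refills from a std::istream.
class StreamBuffer : public BufferBase {
public:
    explicit StreamBuffer(std::istream& in) : in_(in), eof_seen_(false) {}
    bool eof_seen() const { return eof_seen_; }

protected:
    std::size_t fill_buffer() override {
        in_.read(chunk_, sizeof chunk_);
        // gcount() is the number of characters actually read; 0 at EOF.
        return static_cast<std::size_t>(in_.gcount());
    }
    void end_of_file() override { eof_seen_ = true; }

private:
    std::istream& in_;
    bool          eof_seen_;
};
```

With the default hooks, the base class sees an empty input and finishes immediately, which matches the documented behavior of fill_buffer returning 0.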