It is not easy for beginners to find all the right information on how to create a reentrant parser with tools like flex and bison. Most examples out there are out of date or not very clear.
Once I figured it out, I wrote a small example of how to do it. The objective is not to write yet another flex/bison tutorial, but to focus solely on the reentrant aspects of the code.
It is a simple calculator that evaluates arithmetic expressions, and it is split into three small files:

  • calc.h (common header file for lexer and parser)
  • lexer.l (obviously the lexer)
  • parser.y (the parser and the main function)

First, let's look at the header:

    typedef struct parse_parm_s
    {
        void *yyscanner;
        char *buf;
        int pos;
        int length;
        double result;
    } parse_parm;

    void parse(char *buf, double *result);

    #define YYSTYPE double
    #define YY_EXTRA_TYPE parse_parm *

    int yylex(YYSTYPE *, void *);
    int yylex_init(void **);
    int yylex_destroy(void *);
    void yyset_extra(YY_EXTRA_TYPE, void *);
    int yyparse(parse_parm *, void *);
    void yyerror();

Let's focus on the parse_parm structure. What we need is a way to share data between the lexer and the parser without using global variables. That is where this structure comes in. It contains:

  • yyscanner is an opaque pointer, initialized by yylex_init, that the lexer uses for its internal stuff (instead of globals)
  • buf is the buffer that contains the data to be parsed
  • pos is the current position in the buffer
  • length is the size of the buffer
  • result will be the result of the evaluated expression

parse is our main parsing function. We will see it in the parser.
YYSTYPE is the type of yylval, as in any flex/bison project.
YY_EXTRA_TYPE is the type of the extra data that is handed to the lexer through the yyset_extra function. In this case it is a pointer to our structure.

The prototypes are forward declarations of the lexer/parser functions, there only to silence compiler warnings. I could probably have generated a .h with all of that, but I was lazy.
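Both tools can generate these declarations themselves. As a sketch (the file name lexer.h is an arbitrary choice, not something taken from the example sources), flex writes a header declaring yylex_init, yylex_destroy, yyset_extra and friends when given the option below, while bison -d already produces parser.tab.h:

```
%option header-file="lexer.h"
```

As far as I can tell, such a header has to be included after YYSTYPE and YY_EXTRA_TYPE are defined, since its prototypes refer to them.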

Here comes the lexer:

    %{
    #include <stdlib.h>
    #include <string.h>

    #include "calc.h"
    #include "parser.tab.h"

    #define PARM yyget_extra(yyscanner)

    /* Feed the scanner from PARM->buf instead of a FILE. */
    #define YY_INPUT(buffer, res, max_size) \
        do { \
            if (PARM->pos >= PARM->length) \
                res = YY_NULL; \
            else \
            { \
                res = PARM->length - PARM->pos; \
                if (res > (int)max_size) \
                    res = max_size; \
                memcpy(buffer, PARM->buf + PARM->pos, res); \
                PARM->pos += res; \
            } \
        } while (0)

    %}

    %option reentrant bison-bridge
    %option noyywrap
    %option nounput

    %%

    [+\-*/()] { return (*yytext); }

    [0-9]+ { *yylval = atoi(yytext); return (INT); }

    [0-9]*\.[0-9]+ { *yylval = atof(yytext); return (FLOAT); }

    [ \t\r\n] ;

    %%

The parse_parm structure is extra data so we have to access it with yyget_extra(yyscanner).

Since we read from a buffer and not from stdin or a file, we have to redefine the YY_INPUT macro (see section 10 of the flex manual, 'The generated scanner').
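The macro is easier to follow when written as a plain C function. The sketch below is only an illustration of the same logic: input_state and read_chunk are hypothetical names that do not exist in flex or in the example sources, and the structure keeps just the three fields the macro touches.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the buf/pos/length fields of parse_parm. */
typedef struct {
    const char *buf;  /* data to be scanned */
    int pos;          /* offset of the next unread byte */
    int length;       /* number of valid bytes in buf */
} input_state;

/* Copy at most max_size unread bytes into dst and advance the position.
 * Returns the number of bytes copied; 0 means end of input, which is
 * what the macro reports to flex with YY_NULL. */
static int read_chunk(input_state *in, char *dst, size_t max_size)
{
    int n;

    if (in->pos >= in->length)
        return 0;
    n = in->length - in->pos;          /* bytes still unread */
    if ((size_t)n > max_size)
        n = (int)max_size;             /* never overflow the scanner's buffer */
    memcpy(dst, in->buf + in->pos, (size_t)n);
    in->pos += n;
    return n;
}
```

Each call hands over the next slice of the buffer, so a scanner that asks for small chunks simply calls it several times until it gets 0.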

We want the scanner to be reentrant, and therefore to generate no global variables. That is what the reentrant option is for. bison-bridge is used to create a bison-compatible scanner: yylex receives a pointer to yylval as a parameter instead of relying on a global.
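To see what that buys us, here is a tiny hand-written scanner with the same calling shape as the reentrant yylex(YYSTYPE *, void *). It is not what flex generates, and scan_ctx, next_token and the token codes are made-up names for this sketch. The point is only that all state lives in a structure supplied by the caller.

```c
#include <ctype.h>
#include <stdlib.h>

enum { TOK_END = 0, TOK_NUM = 256 };   /* hypothetical token codes */

/* All scanner state lives here instead of in global variables. */
typedef struct {
    const char *cur;   /* next unread character */
} scan_ctx;

/* The semantic value goes out through lvalp (like yylval with
 * bison-bridge); the state comes in through ctx (like yyscanner). */
static int next_token(double *lvalp, scan_ctx *ctx)
{
    char *end;

    while (isspace((unsigned char)*ctx->cur))
        ctx->cur++;                      /* skip blanks */
    if (*ctx->cur == '\0')
        return TOK_END;
    if (isdigit((unsigned char)*ctx->cur)) {
        *lvalp = strtod(ctx->cur, &end); /* read one number */
        ctx->cur = end;
        return TOK_NUM;
    }
    return (unsigned char)*ctx->cur++;   /* operators are their own token */
}
```

Because nothing is shared, two contexts can be scanned in alternation, or from two threads, without disturbing each other. With a scanner built on globals, starting a second scan would clobber the first.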

Finally the parser:

    %{
    #include <stdio.h>
    #include <string.h>

    #include "calc.h"

    void parse(char *buf, double *result)
    {
        parse_parm pp;

        pp.buf = buf;
        pp.length = strlen(buf);
        pp.pos = 0;
        pp.result = 0; /* stays 0 if the parse fails */
        *result = 0;
        yylex_init(&pp.yyscanner);
        yyset_extra(&pp, pp.yyscanner);
        yyparse(&pp, pp.yyscanner);
        *result = pp.result;
        yylex_destroy(pp.yyscanner);
    }

    %}

    %pure_parser
    %parse-param {parse_parm *parm}
    %parse-param {void *scanner}
    %lex-param {void *scanner}

    %token INT FLOAT

    %left '-' '+'
    %left '*' '/'
    %left NEG POS

    %%

    expr: math { parm->result = $1; }
        ;

    math: math '+' math { $$ = $1 + $3; }
        | math '-' math { $$ = $1 - $3; }
        | math '*' math { $$ = $1 * $3; }
        | math '/' math { $$ = $1 / $3; }
        | '-' math %prec NEG { $$ = -$2; }
        | '+' math %prec POS { $$ = $2; }
        | '(' math ')' { $$ = $2; }
        | INT { $$ = $1; }
        | FLOAT { $$ = $1; }
        ;

    %%

yylex_init and yylex_destroy have to be called to initialize the lexer before parsing and to free its resources afterwards. yyset_extra is used to pass the parse_parm structure to the lexer.

The pure_parser option tells bison to use no global variables and create a reentrant parser. (Newer bison releases deprecate this spelling in favor of %define api.pure.)
yyparse gets two new parameters through parse-param: the parse_parm structure and the pointer for the lexer.
yylex gets a new parameter through lex-param, so the parser can pass it that pointer.

Now the parse function can be called at any time to evaluate an expression and get the result, even from multiple threads.

Here is a link to the sources in this tutorial: calc.tar.gz. You can compile it with:

    $ bison -d parser.y
    $ flex lexer.l
    $ gcc lex.yy.c parser.tab.c -ly -ll

Other links: