parsepp

parsepp is an experimental C++ library for creating custom parsers, similar to Boost.Spirit and YARD. Unlike Spirit, this framework is quite small.

parsepp is based on many insightful discussions with Colin Hirsch regarding his PEGTL library (and my subsequent variant of that code, parse0x).

In short, parsepp is a tool for writing custom parsers using template programming techniques. In some ways it is similar to the venerable lex and yacc tools, but this style of parser is much easier and safer to work with and also has no separate tokenization/parsing steps (they are done at the same time).
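
To give a feel for the style, here is a minimal, self-contained sketch of rules written as templates. It is not parsepp's actual API: the names ch, range, seq, plus, and parse_sum are hypothetical and chosen only for illustration. The point is that a grammar is composed from types, and that matching works directly on characters, so there is no separate tokenizer.

```cpp
#include <string>

// Hypothetical sketch, NOT parsepp's real API. Each rule is a type with a
// static matches() that advances the position only when it succeeds.
using Iter = std::string::const_iterator;

// Matches exactly one character C.
template <char C>
struct ch {
    static bool matches(Iter& pos, Iter end) {
        if (pos != end && *pos == C) { ++pos; return true; }
        return false;
    }
};

// Matches one character in the inclusive range [Lo, Hi].
template <char Lo, char Hi>
struct range {
    static bool matches(Iter& pos, Iter end) {
        if (pos != end && *pos >= Lo && *pos <= Hi) { ++pos; return true; }
        return false;
    }
};

// Matches A followed by B; restores the position if either part fails.
template <typename A, typename B>
struct seq {
    static bool matches(Iter& pos, Iter end) {
        Iter start = pos;
        if (A::matches(pos, end) && B::matches(pos, end)) return true;
        pos = start;
        return false;
    }
};

// Matches R one or more times.
template <typename R>
struct plus {
    static bool matches(Iter& pos, Iter end) {
        if (!R::matches(pos, end)) return false;
        while (R::matches(pos, end)) {}
        return true;
    }
};

// Grammar: digits '+' digits, built purely from types.
using digits = plus<range<'0', '9'>>;
using sum = seq<digits, seq<ch<'+'>, digits>>;

// Returns true only if the whole input matches the grammar.
bool parse_sum(const std::string& input) {
    Iter pos = input.begin();
    return sum::matches(pos, input.end()) && pos == input.end();
}
```

Under these assumptions, parse_sum("12+34") succeeds, while parse_sum("12+") fails and leaves no partial state behind, because seq restores the position on failure.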

License

Public Domain. Do as you will.

Downloading

See the downloads page.

Features

  • A fairly small header-only implementation (template-intensive).
  • Uses only ISO C++ Standards-specified techniques.
  • No 3rd-party code dependencies (only the STL).
  • Fairly well documented, in the form of this web site and the API docs (in the header files).
  • Really easy to use. The core interface revolves around only a small handful of functions and a useful set of parsing rules.
  • Can be used to generate parsers of arbitrary complexity, limited only by system resources (or potentially compiler limitations).
  • Has support for easily inserting matched tokens into standard containers.
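
As a rough illustration of the last feature, the sketch below appends each matched token to a standard container. The names word, append_to, and split_words are hypothetical and are not taken from parsepp's headers; see the API docs for the real rules.

```cpp
#include <cctype>
#include <string>
#include <vector>

using Iter = std::string::const_iterator;

// Hypothetical rule: one or more ASCII letters.
struct word {
    static bool matches(Iter& pos, Iter end) {
        Iter start = pos;
        while (pos != end && std::isalpha(static_cast<unsigned char>(*pos))) ++pos;
        return pos != start;
    }
};

// Hypothetical wrapper: runs Rule and, on success, appends the matched text
// to any container that supports insert(iterator, value).
template <typename Rule>
struct append_to {
    template <typename Container>
    static bool matches(Iter& pos, Iter end, Container& out) {
        Iter start = pos;
        if (!Rule::matches(pos, end)) return false;
        out.insert(out.end(), std::string(start, pos));
        return true;
    }
};

// Collects every word in the input, skipping any non-letter characters.
std::vector<std::string> split_words(const std::string& input) {
    std::vector<std::string> words;
    Iter pos = input.begin();
    Iter end = input.end();
    while (pos != end) {
        if (!append_to<word>::matches(pos, end, words)) ++pos;  // skip a separator
    }
    return words;
}
```

Because the wrapper only calls insert(out.end(), value), the same sketch should work with std::list, std::deque, or std::set in place of std::vector.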

Misfeatures

  • There is some very minimal (and fairly low-level) support for getting at exact parse error positions, but this should be extended.
  • Only works with std::string input, the main implication being that all input has to be buffered before parsing begins.
  • Currently relies on the null character to report EOF, so it cannot be used with binary data. In theory it would be easy to switch it to parse arbitrary integer arrays, which would allow it to parse UTF-16 or binary data as well as ASCII, provided we can reserve one int value to mark EOF.
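
The EOF limitation can be illustrated with a small sketch. This is not parsepp code: scan_until_nul, scan_until_eof, and kEof are hypothetical names. The first function stops at the first null character, as a null-terminated scanner does; the second assumes the input is an array of int with one reserved value appended as the end marker.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Counts characters the way a null-terminated scanner sees them:
// scanning stops at the first '\0', even if more data follows.
std::size_t scan_until_nul(const std::string& input) {
    const char* p = input.c_str();
    std::size_t n = 0;
    while (*p != '\0') { ++p; ++n; }
    return n;
}

// Hypothetical reserved end marker; it must never occur in real input.
const int kEof = -1;

// Counts elements up to the reserved marker. The caller must append kEof.
// A zero element is ordinary data here.
std::size_t scan_until_eof(const std::vector<int>& input_with_eof) {
    std::size_t n = 0;
    while (input_with_eof[n] != kEof) ++n;
    return n;
}
```

With a five-byte input containing an embedded null, the first scanner sees only the bytes before the null, while the int-based scanner sees all five elements.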

TODOs

  • Figure out some reasonable error reporting techniques.
  • Maybe templatize the iterator type so we can work on arbitrary input iterators. This requires a partial-buffering strategy similar to the one from PEGTL, which can figure out "from which point are we guaranteed to not have to backtrack?" and discard all input before that point.
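
A possible first step toward the second item is sketched below: the same kind of rule, but with the iterator type as a template parameter. The names ch, seq, and parse_ab are hypothetical, not parsepp's. Note that seq copies the iterator so it can backtrack, which is only valid for multi-pass (forward) iterators. True single-pass input iterators would still need the partial-buffering strategy described above, which this sketch does not attempt.

```cpp
#include <list>
#include <string>

// Hypothetical rule, templated on the iterator type instead of being tied
// to std::string::const_iterator.
template <char C>
struct ch {
    template <typename Iter>
    static bool matches(Iter& pos, Iter end) {
        if (pos != end && *pos == C) { ++pos; return true; }
        return false;
    }
};

// Matches A followed by B. Saving `start` and restoring it on failure
// requires a multi-pass (forward) iterator.
template <typename A, typename B>
struct seq {
    template <typename Iter>
    static bool matches(Iter& pos, Iter end) {
        Iter start = pos;
        if (A::matches(pos, end) && B::matches(pos, end)) return true;
        pos = start;
        return false;
    }
};

// Works with any container of char that provides forward iterators.
template <typename Container>
bool parse_ab(const Container& input) {
    auto pos = input.begin();
    return seq<ch<'a'>, ch<'b'>>::matches(pos, input.end()) && pos == input.end();
}
```

The same parse_ab call then accepts a std::string or a std::list<char> without any change to the rules.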

Starting points...

First, see test.cpp in the source tree. Once that code makes sense, see parsepp_numeric.hpp, then parsepp_calc.hpp.

Once you're sure that you'd like to try it out, see the RulesOverview wiki page. If that doesn't scare you away, then go visit the typelists wiki page.