parse0x: C++0x parser generator toolkit
parse0x is an experimental parser toolkit for C++0x-capable compilers. It is based conceptually on Dr. Colin Hirsch's PEGTL library, but stripped of that library's error reporting facilities. (This makes it much smaller and somewhat easier to extend, but also not terribly useful for tracking down parsing errors.) It is similar to libraries like Boost.Spirit but much smaller in scope. It requires C++0x support, which is currently available only in beta form. Try gcc 4.3.
parse0x is believed to conform to the rules of Parsing Expression Grammars (PEGs), as described in the Wikipedia article on PEGs (though the formalities are admittedly over my head).
This package's home page is:
http://wanderinghorse.net/computing/parse0x
License
This source code is released into the Public Domain by its author, Stephan Beal (http://wanderinghorse.net/home/stephan/). That is, you may take it and use it for any purpose whatsoever, commercial or otherwise.
Examples
The source tree comes with a test app (test.cpp) which shows how to do things like parse IP addresses, string literals, and various numeric types. There is also that most classic of parsing examples, a calculator.
Aside from the example code we have a very-much-unfinished introduction to parse0x and an overview of the built-in parsing rules.
Features
- Allows creation of parsers using Rules, which are small classes which implement string-matching rules. Rules can be combined to create parsers of nearly arbitrary complexity. Rules have no local state and are never instantiated, so they're memory-light.
- There are no separate tokenization and parse phases - they're combined into a single phase.
- Validating-only parsers (that is, without any client-side actions) can normally be implemented using only typedefs or empty structs which inherit other rules.
- Actions can be tied to any Rules, to handle text a Rule matches.
- Passes a client-defined State type to Rules and Actions, so that custom Rules/Actions can manipulate client-side data during the parsing process.
- A small header-only implementation. The whole core library is implemented in a single header file, and some optional headers are included for handling common parsing cases (e.g. quoted strings and IPv4 addresses). The core implementation is trivial, thanks to C++0x's variadic templates feature.
- Fairly easy to use, at least if one is familiar with template-based programming and writing parsers.
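To make the ideas above concrete, here is a minimal sketch of how stateless Rules, variadic-template combinators, Actions, and a client-defined State can fit together. Every name in it (Digit, Char, Seq, Star, Plus, Act, parse_all and so on) is hypothetical and chosen only for illustration; it is not parse0x's actual API, so consult the headers for the real rule names and signatures.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <vector>

// All names here are hypothetical; this is not parse0x's actual API.
typedef std::string::const_iterator Iter;

// Leaf rule: matches one ASCII digit. Static-only, so it is never instantiated.
struct Digit {
    template <typename State>
    static bool match(Iter& pos, Iter end, State&) {
        if (pos != end && *pos >= '0' && *pos <= '9') { ++pos; return true; }
        return false;
    }
};

// Leaf rule: matches exactly the character C.
template <char C>
struct Char {
    template <typename State>
    static bool match(Iter& pos, Iter end, State&) {
        if (pos != end && *pos == C) { ++pos; return true; }
        return false;
    }
};

// Sequence: every rule must match in order; the position is restored on failure.
template <typename... Rules> struct Seq;
template <> struct Seq<> {
    template <typename State>
    static bool match(Iter&, Iter, State&) { return true; }
};
template <typename Head, typename... Tail>
struct Seq<Head, Tail...> {
    template <typename State>
    static bool match(Iter& pos, Iter end, State& st) {
        const Iter saved = pos;
        if (Head::match(pos, end, st) && Seq<Tail...>::match(pos, end, st)) return true;
        pos = saved;
        return false;
    }
};

// Zero or more repetitions; stops if the rule matches without consuming input.
template <typename Rule>
struct Star {
    template <typename State>
    static bool match(Iter& pos, Iter end, State& st) {
        for (;;) {
            const Iter before = pos;
            if (!Rule::match(pos, end, st) || pos == before) return true;
        }
    }
};

// One or more repetitions, expressed by inheriting from other rules.
template <typename Rule>
struct Plus : Seq<Rule, Star<Rule>> {};

// Runs Action::apply on the matched text when Rule succeeds.
template <typename Rule, typename Action>
struct Act {
    template <typename State>
    static bool match(Iter& pos, Iter end, State& st) {
        const Iter start = pos;
        if (!Rule::match(pos, end, st)) return false;
        Action::apply(std::string(start, pos), st);
        return true;
    }
};

// Client-defined state, and an action that records each matched number in it.
struct Numbers { std::vector<int> values; };
struct PushNumber {
    static void apply(const std::string& text, Numbers& st) {
        st.values.push_back(std::atoi(text.c_str()));
    }
};

// Grammar built from a typedef and an empty struct: Number (',' Number)*
typedef Act<Plus<Digit>, PushNumber> Number;
struct List : Seq<Number, Star<Seq<Char<','>, Number>>> {};

// Succeeds only if Rule matches and consumes the whole input.
template <typename Rule, typename State>
bool parse_all(const std::string& input, State& st) {
    Iter pos = input.begin();
    return Rule::match(pos, input.end(), st) && pos == input.end();
}

// Usage check for the sketch above.
inline void demo() {
    Numbers st;
    assert(parse_all<List>("10,20,30", st) && st.values.size() == 3);
    assert(!parse_all<List>("10,,30", st));
}
```

In this sketch an Action that has already fired is not undone when an enclosing sequence later fails and backtracks, so client state may hold entries from a partial match. This sketch also takes an end iterator rather than relying on a terminating null character, which is a simplification on my part.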
Misfeatures
- Requires C++0x features, which have only limited support, even in the newest compilers.
- Currently provides no way of reporting where/why a parsing error occurs. That is, a parse either fails or succeeds, but if it fails there's no way to know why without adding additional instrumentation to the parser.
- Only works on std::string input, meaning that when reading from streams you must buffer the whole stream into a string first.
- Won't work with binary data, as it relies on a null character to mark the end of input.
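As a workaround for the std::string-only limitation, a stream can be buffered with standard library iterators before parsing. The following is a minimal sketch; the helper name slurp is hypothetical and not part of parse0x. It also reports whether the data contains an embedded null character, since, per the last point above, such input would presumably be treated as ending early.

```cpp
#include <cassert>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper, not part of parse0x.
// Buffers the rest of `in` into `out`. Returns false if the data contains an
// embedded '\0', which a null-terminated scan would mistake for end of input.
inline bool slurp(std::istream& in, std::string& out) {
    out.assign(std::istreambuf_iterator<char>(in),
               std::istreambuf_iterator<char>());
    return out.find('\0') == std::string::npos;
}

// Usage check: text input is buffered whole, including its newline.
inline void slurp_demo() {
    std::istringstream in("1 + 2\n");
    std::string text;
    assert(slurp(in, text) && text == "1 + 2\n");
}
```

The same function works for any std::istream, such as a std::ifstream or std::cin. For very large inputs the whole content is still held in memory, which is the cost this limitation imposes.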
Download
See the downloads page.