parsepp  Artifact Content

Artifact d5dc0ea6038c26d968aceaf7f0097c19e2326b6e:

Wiki page [parsepp] by anonymous 2016-01-02 08:59:03.
parsepp is an experimental C++ library for creating custom parsers,
similar to [http://spirit.sourceforge.net/|Boost.Spirit] and
[http://code.google.com/p/yardparser/|YARD]. Unlike Spirit, this framework
is quite small.

parsepp is based on many insightful discussions with Colin Hirsch
regarding his [http://code.google.com/p/pegtl|PEGTL] library (and my
subsequent variant of that code,
[http://wanderinghorse.net/cgi-bin/parse0x.cgi|parse0x]).

In short, parsepp is a tool for writing custom parsers using
template programming techniques. In some ways it is similar
to the venerable <tt>lex</tt> and <tt>yacc</tt> tools, but
this style of parser is <em>much</em> easier and safer to
work with and also has no separate tokenization/parsing
steps (they are done at the same time).
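
To give a feel for the template style, here is a minimal sketch of how rules can be composed as types. All names in it (<tt>Char</tt>, <tt>Digit</tt>, <tt>Plus</tt>, <tt>Seq</tt>, <tt>parse_all</tt>) are hypothetical and chosen only for illustration; they are not parsepp's actual API, for which see the header files.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Illustrative only: hypothetical rule types, NOT parsepp's real interface.
typedef std::string::const_iterator Iter;

// Matches exactly one character C.
template <char C>
struct Char {
    static bool match(Iter& pos, Iter end) {
        if (pos != end && *pos == C) { ++pos; return true; }
        return false;
    }
};

// Matches one decimal digit.
struct Digit {
    static bool match(Iter& pos, Iter end) {
        if (pos != end && std::isdigit(static_cast<unsigned char>(*pos))) {
            ++pos;
            return true;
        }
        return false;
    }
};

// Matches Rule one or more times. Rule must consume input when it succeeds.
template <typename Rule>
struct Plus {
    static bool match(Iter& pos, Iter end) {
        if (!Rule::match(pos, end)) return false;
        while (Rule::match(pos, end)) {}
        return true;
    }
};

// Matches A followed by B; restores the position if either part fails.
template <typename A, typename B>
struct Seq {
    static bool match(Iter& pos, Iter end) {
        Iter start = pos;
        if (A::match(pos, end) && B::match(pos, end)) return true;
        pos = start;
        return false;
    }
};

// The grammar is just a type: number ',' number
typedef Plus<Digit> Number;
typedef Seq<Number, Seq<Char<','>, Number> > Pair;

// Returns true if Rule matches the entire input string.
template <typename Rule>
bool parse_all(const std::string& input) {
    Iter pos = input.begin();
    return Rule::match(pos, input.end()) && pos == input.end();
}
```

With these definitions, <tt>parse_all&lt;Pair&gt;("12,345")</tt> succeeds and <tt>parse_all&lt;Pair&gt;("12;345")</tt> fails. Each rule works directly on the characters, so there is no separate token stream.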

<h2>License</h2>
Public Domain. Do as you will.

<h2>Downloading</h2>

See the [download|downloads page].

<h2>Features</h2>

<ul>
<li>A fairly small header-only implementation (template-intensive).</li>
<li>Uses only ISO C++ Standards-specified techniques.</li>
<li>No 3rd-party code dependencies (only the STL).</li>
<li>Fairly well documented, in the form of this web site and the
API docs (in the header files).</li>
<li>Really easy to use. The core interface revolves around
only a small handful of functions and a useful set of
parsing rules.</li>
<li>Can be used to generate parsers of arbitrary complexity,
limited only by system resources (or potentially compiler
limitations).</li>
<li>Has support for easily inserting matched tokens into standard
containers.</li>
</ul>
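
As a rough illustration of the last point, the sketch below collects matched tokens into a <tt>std::vector</tt>. The function names <tt>match_number</tt> and <tt>collect_numbers</tt> are made up for this example and are an assumption, not parsepp's interface; the library's own rules for this are documented in the headers.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Illustrative only: hypothetical helpers, NOT parsepp's real interface.
typedef std::string::const_iterator Iter;

// Consumes a run of digits and, on success, appends the matched text to out.
bool match_number(Iter& pos, Iter end, std::vector<std::string>& out) {
    Iter start = pos;
    while (pos != end && std::isdigit(static_cast<unsigned char>(*pos))) ++pos;
    if (pos == start) return false;          // no digits: nothing matched
    out.push_back(std::string(start, pos));  // store the matched token
    return true;
}

// Collects comma-separated numbers from input into out.
// Returns false on malformed input; out may then hold the tokens
// matched before the error.
bool collect_numbers(const std::string& input, std::vector<std::string>& out) {
    Iter pos = input.begin();
    Iter end = input.end();
    if (!match_number(pos, end, out)) return false;
    while (pos != end) {
        if (*pos != ',') return false;
        ++pos;
        if (!match_number(pos, end, out)) return false;
    }
    return true;
}
```

Parsing <tt>"1,22,333"</tt> this way leaves three strings in the vector, in input order.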

<h2>Misfeatures</h2>

<ul>
<li>Error reporting is weak: there is only minimal
(and fairly low-level) support for getting at exact parse error
positions, and this should be extended.</li>
<li>Only works with std::string input, the main implication being
that all input has to be buffered before parsing begins.</li>
<li>Currently relies on the null character to report EOF, so
it cannot be used with binary data. In theory it would be easy to
switch it to parse arbitrary integer arrays, which would allow
it to parse UTF-16 or binary data as well as ASCII, provided
we can reserve one int value to mark EOF.</li>
</ul>
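
The EOF limitation can be shown with a small sketch that does not use parsepp at all. Both functions below are hypothetical and only contrast the two ways of detecting end of input: a null-character sentinel versus an explicit end iterator.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Illustrative only: these functions are not part of parsepp.

// Counts input bytes, treating the first null character as EOF.
std::size_t scan_with_sentinel(const std::string& input) {
    const char* p = input.c_str();  // always null-terminated
    std::size_t n = 0;
    while (p[n] != '\0') ++n;
    return n;
}

// Counts input bytes using an explicit end iterator instead of a sentinel.
std::size_t scan_with_end(const std::string& input) {
    std::size_t n = 0;
    for (std::string::const_iterator it = input.begin();
         it != input.end(); ++it) {
        ++n;
    }
    return n;
}
```

For a five-byte string built as <tt>std::string("ab\0cd", 5)</tt>, the sentinel version stops after two bytes while the end-iterator version sees all five, which is why a sentinel-based parser cannot handle binary input.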

<h2>TODOs</h2>
<ul>
<li>Figure out some reasonable error reporting techniques.</li>
<li>Maybe templatize the iterator type so we can work on
arbitrary input iterators. This requires a partial-buffering
strategy similar to the one from PEGTL, which can
figure out "from which point are we guaranteed to not have to
backtrack?" and discard all input before that point.</li>
</ul>

<h2>Starting points...</h2>

First, see <tt>test.cpp</tt> in the source tree. Once that
code makes sense, see <tt>parsepp_numeric.hpp</tt>,
then <tt>parsepp_calc.hpp</tt>.

Once you're sure that you'd like to try it out, see
the [RulesOverview] wiki page. If that doesn't scare
you away, then go visit the [typelists] wiki page.
