The C-minus Preprocessor (a.k.a. c-pp) is a truly minimal
C-like preprocessor application. Why? Because C preprocessors can
process non-C code but generally make quite a mess of it[1]. This
application aims to be an extremely minimal preprocessor with
only the most basic functionality of a C preprocessor (see below). It
was conceived for use with JavaScript code but is generic enough to be
used with essentially arbitrary UTF-8 text (including C code).
Design note: this tool makes use of SQLite. Though not strictly
needed in order to implement it, this tool was specifically created
for use with the sqlite3 project's own JavaScript code in order to
facilitate creation of different builds, so there's no reason not to
make use of sqlite3 to do some of the heavy lifting. It does not
require any cutting-edge sqlite3 features and should be usable with
any version which supports features as old as WITHOUT ROWID.
Formalities
Project home: https://fossil.wanderinghorse.net/r/c-pp
License: same as sqlite3 (see c-pp.c for the full details)
Author: Stephan Beal https://wanderinghorse.net/home/stephan/
Supported Markup
Like a C preprocessor, this tool reads input text files and
conditionally filters out parts. Unlike CPP, c-pp does no inline
expansion of content. It interprets only lines which start with its
keyword delimiter in column zero and passes all other input through
as-is unless it is elided due to an #if. Like CPP, it accepts spaces
between the keyword delimiter and the keyword, plus it accepts
backslash-escaped newlines (but doesn't have terribly much use for
them).
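The line-recognition rule above can be sketched in a few lines of Python. This is only a model of the behavior as described, not c-pp's implementation: the function name split_directive is hypothetical, backslash-escaped newlines are not handled, and treating a bare delimiter as ordinary text is an assumption.

```python
def split_directive(line, delim="##"):
    """Return (keyword, args) if `line` is a directive, else None.

    Illustrative model only; `split_directive` is a hypothetical name.
    The delimiter must sit in column zero, and spaces may separate it
    from the keyword.
    """
    if not line.startswith(delim):
        return None  # ordinary text: passed through as-is
    parts = line[len(delim):].split(None, 1)
    if not parts:
        return None  # assumption: a bare delimiter is not a directive
    keyword = parts[0]
    args = parts[1].strip() if len(parts) > 1 else ""
    return keyword, args
```

Under this model, "##if foo" and "##  if foo" are both recognized as an if directive, while " ##if foo" (delimiter not in column zero) is treated as plain text.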
- #define accepts one or more arguments, the names of macros. Macros in c-pp have no separate values: the fact that one is defined at all makes it implicitly "true" for purposes of logical operations. Defines may be passed via the CLI in the form -DXYZ. Macro names may contain any non-space characters. Note that the name "macro" is a bit of a misnomer since these are not used in text replacement like CPP macros are, but the term is retained for lack of a better name.
- #undef undefines one or more macros. Undefines may be passed via the CLI in the form -UXYZ.
- #if interprets its one argument as a macro name which resolves to true if it's defined, false if it's not. #ifnot is the inverse. Also supported are #else, #elif, and #elifnot. Note that c-pp does not support any sort of boolean or math operators, and makes no distinction between "defined with a value" and "defined without a value": all defines are implicitly true. An #if chain must be terminated with #endif.
- #include treats its single argument as a filename and recursively processes that file, replacing the #include line with the processed content. The include search path is defined by passing one or more -Idirname flags to the app. If no such flags are provided, -I. is assumed. Regardless of the include path, a filename which matches a file without any path expansion is considered a better match. e.g. #include /foo/bar/baz will match the file /foo/bar/baz before it will match the file baz in the path provided by -I/foo/bar (same file, but different resolution). Achtung: the argument must currently be unquoted. Support for quoted names may be added later.
- #error exits c-pp with an error and uses the rest of that line for the error message.
- #pragma is currently for internal use only and has no well-defined interface.
- #stderr sends all remaining text on the line to stderr, along with the current file's name and line number. This is primarily intended for messages such as "enabling so-and-so", as well as debugging the processing of inputs.
- #// is a single-line comment. Note that the // part is the keyword and there must be a space after it.
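To make the semantics of the conditional directives concrete, here is a small Python model of how an #if chain selects text. It is a sketch under stated assumptions, not how c-pp is implemented (c-pp is written in C and uses SQLite): the name filter_lines and its parameters are hypothetical, directives other than the conditionals, #define, and #undef are simply ignored, and skipping #define/#undef inside an elided block is an assumption.

```python
def filter_lines(text, defines=(), delim="##"):
    """Return the lines of `text` that survive conditional filtering.

    Illustrative model only; `filter_lines` and its parameters are
    hypothetical names, not part of c-pp.
    """
    defined = set(defines)
    out = []
    # One frame per open #if chain: [parent_active, branch_taken, active]
    stack = []

    def active():
        return stack[-1][2] if stack else True

    def is_true(keyword, args):
        if len(args) != 1:
            raise ValueError(f"{keyword} expects exactly one macro name")
        # A macro is "true" simply by being defined; the *not forms invert.
        return (args[0] in defined) != keyword.endswith("not")

    for line in text.splitlines():
        if not line.startswith(delim):
            if active():
                out.append(line)  # ordinary text passes through as-is
            continue
        parts = line[len(delim):].split()
        keyword, args = (parts[0], parts[1:]) if parts else ("", [])
        if keyword in ("if", "ifnot"):
            parent = active()
            hit = parent and is_true(keyword, args)
            stack.append([parent, hit, hit])
        elif keyword in ("elif", "elifnot", "else"):
            if not stack:
                raise ValueError(f"{keyword} without a matching if")
            frame = stack[-1]
            cond = True if keyword == "else" else is_true(keyword, args)
            # A branch is emitted only if no earlier branch was taken.
            frame[2] = frame[0] and not frame[1] and cond
            frame[1] = frame[1] or frame[2]
        elif keyword == "endif":
            if not stack:
                raise ValueError("endif without a matching if")
            stack.pop()
        elif keyword == "define" and active():
            defined.update(args)
        elif keyword == "undef" and active():
            defined.difference_update(args)
        # Assumption: every other directive is ignored by this sketch.
    if stack:
        raise ValueError("if chain not terminated with endif")
    return out
```

Because a macro has no value, the whole decision reduces to set membership, which is why no boolean or math operators are needed in this model.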
Note that "#" above is symbolic. The keyword delimiter is
configurable and defaults to ##. Define CMPP_DEFAULT_DELIM to
a string when compiling to define the default at build-time. The
delimiter may be modified via a command-line flag (-d, as used in the
examples below).
Examples
$ cat my.txt
##if foo
foo
##elif bar
bar
##else
baz
##endif
$ c-pp my.txt -Dfoo
foo
$ c-pp my.txt -Dbar
bar
$ c-pp my.txt
baz
$ cat hosts.txt
%include /etc/hosts
$ c-pp -d % hosts.txt
127.0.0.1 localhost
127.0.1.1 my-computer
...
$ cat hosts.txt
%include hosts
$ c-pp -d % -I/etc hosts.txt
127.0.0.1 localhost
127.0.1.1 my-computer
...
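The two #include examples above reach the same file by different routes. The lookup order described under #include can be modeled roughly as follows. This is a sketch, not c-pp's code: resolve_include and its is_file parameter are hypothetical names, and searching multiple -I directories in the order given is an assumption.

```python
import os


def resolve_include(name, include_dirs=(".",), is_file=os.path.isfile):
    """Return the path an #include argument would resolve to, or None.

    Illustrative model only; `resolve_include` is a hypothetical name.
    `is_file` is injectable so the lookup order can be exercised
    without touching the real filesystem.
    """
    # A name that matches a file without any path expansion wins.
    if is_file(name):
        return name
    # Otherwise try each include directory (assumed: in the order given).
    for directory in include_dirs:
        candidate = os.path.join(directory, name)
        if is_file(candidate):
            return candidate
    return None
```

With only /etc/hosts present, both resolve_include("/etc/hosts") and resolve_include("hosts", ["/etc"]) yield the same file, mirroring the two shell sessions above.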
[1] C preprocessors, when running in comment-retention mode, tend to inject # characters all over the place and may do silly things like automatically include compiler-specific headers and emit the comments from those. e.g. using gcc -E -CC will include a gcc-internal header and emit a GPL license header in the output. To see this, try:
$ echo 'extern int x;' > y.c; gcc -E -CC y.c