If you are reading this on github (a read-only mirror): most of the links in this doc, and various formatting, will not work on github because this page is written for the Fossil SCM repository hosted at this project's canonical home: https://fossil.wanderinghorse.net/r/c-pp
These are the docs for the "trunk" version of
c-pp. See the "lite" branch for the lighter-weight fork referenced by the SQLite JS/WASM docs (which continues to be maintained for that purpose).
The C-minus Preprocessor (a.k.a. c-pp or cmpp) is a minimalistic C-preprocessor-like application. Why? Because C preprocessors can process non-C code but generally make quite a mess of it1. The purpose of this application is to provide a minimal preprocessor with only the most basic functionality of a C preprocessor (see below). It was conceived for use with JavaScript code but is generic enough to be used with essentially arbitrary UTF-8 text (including C code).
Like a C preprocessor, this tool reads input from text-based sources
and conditionally filters out parts. Unlike CPP, c-pp does only the
most basic of inline expansion of content, namely (and optionally)
tokens in the form @TOKEN@.
(Diagram: "Input" → "MAGIC!" → "Output")
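As an illustrative sketch (using the symbolic `#` delimiter described later, and noting that `@token@` expansion is off unless enabled via `#@policy` or the `-@` flag), an input such as:

```
#@policy error
#define name "world"
Hello, @name@!
```

would emit `Hello, world!`.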
Features of potential interest:
- Can perform moderately sophisticated filtering of text inputs in a fashion similar, but not identical, to a C preprocessor.
- Well documented, in the form of this file and libcmpp.h.
- Can stream its input from any source via an input stream abstraction. It provides implementations for `FILE` and file-descriptor sources, and creating custom implementations is usually trivial.
- Can send its output anywhere via an output stream abstraction. It includes implementations for `FILE` and file-descriptor destinations, as well as for strings which it dynamically allocates on demand to buffer the output.
- Can process multiple distinct inputs and outputs in a single invocation, allowing it to be automated in interesting ways. The test script demonstrates how that can be useful.
- Supports registering custom stateful directive handlers, either linked in or loaded dynamically from DLLs. This capability is, to the very best of my fallible knowledge, a world's first in a generic preprocessor. (IBM's COBOL preprocessor is reportedly extensible but is limited to COBOL input.)
- Supports savepointing within scripts, to limit the scope of any given `#define` or to temporarily override it, reverting to its old value when the savepoint is rolled back; i.e. it supports "local variables".
- Its #pipe directive allows it to run external programs to filter input. That is: an HTML template could embed markdown- or pikchr-formatted code directly and preprocess it using an external converter. This also allows it to wrap a C preprocessor, should one ever really want to. (Pikchr is also supported by an optional directive.)
- WASM-friendly. Though it is as-yet untested in WASM/WASI builds, its API is designed to be friendly to those. Still TODO is to optionally eliminate all dependencies on C-level I/O APIs in such builds, to improve WASM portability. (Such I/O routines are currently only used for debug output and as a default output channel. They are not a core component or requirement of the library.)
- Strictly single-threaded and synchronous, if only to provide evidence that not everything needs to be made async.
- Distributed as a single source file of portable C99, making it easy and portable to copy around. It builds as both a library and a standalone CLI application. These docs cover the high-level features of the library, and the app is a very thin wrapper around that. The API docs are in libcmpp.h. (A good Perl hacker could probably implement most or all of this library in about 100 lines of Perl. This implementation is in C, so is significantly larger than that.)
See c-pp --help for usage details of the application (as opposed to
the library interface, which is in libcmpp.h), in particular the fact
that it processes its arguments and flags in the order they're
provided, which allows chaining of multiple input and output files in
a single invocation.
Design note: this tool makes use of SQLite. Though not strictly needed in order to implement it, it was specifically created for use with the sqlite3 project's own JavaScript code in order to facilitate creation of different builds, so there's no reason not to make use of sqlite3 to do some of the heavy lifting (it does much of that lifting). c-pp does not require any cutting-edge sqlite3 features and should be usable with any post-2020 version.
- Building it
- Preprocessor Markup
- Token Types
- "Define" Keys (a.k.a. "Macros")
- Directives
- #arg is kinda odd
- #assert the truth
- #attach and #detach databases
- #define symbols
- #delimiter separates us
- #error when one must
- #expressing boolean conditions
- #if, #else, #elif, #/if are decisive
- #include other files
- #join values together
- #module loader
- #@policy for @tokens@
- #pipe data in and out
- #pragma wants to be left alone
- #query the database
- #savepoints manage scope-local defines
- #stderr accepts your input
- #undefine symbols
- #undefined-policy is fickle
- #// (comments) are sometimes helpful
- Add-on Directives
- "Function calls"
- The Library API
- Background: Why?
- Potential TODOs
- Reminders to self...
Formalities
Dependencies: A C99-capable C compiler, SQLite, and the target system's libc. It includes a copy of SQLite in the source tree but can use any relatively recent version.
Project home: https://fossil.wanderinghorse.net/r/c-pp
License: the SQLite Blessing
Author: Stephan Beal https://wanderinghorse.net/home/stephan/
Contributors are welcomed - please get in touch via the link above or post to this project's forum.
Building It
Grab a copy of the source code from /download or by cloning the repository using fossil:
$ fossil clone https://fossil.wanderinghorse.net/r/c-pp
Then, from its top-most directory:
$ ./configure --prefix=$HOME
$ make
# optionally:
$ make test
$ make install
Markup
c-pp is, like CPP, a line-oriented preprocessor. It looks for lines which start with its current delimiter (see below) and processes them. Other lines are normally passed through unmodified, but enabling @token@ parsing will cause non-preprocessor lines to be filtered. Similarly, specific directives may treat the content of their own block differently than other content (e.g. #define heredocs).
The general syntax for a c-pp directive line is:
DELIMITER DIRECTIVE ...args
Where DELIMITER is the symbolic # described
below and DIRECTIVE is one of the operations supported by the
preprocessor.
The delimiter "#" used in these docs is symbolic only. The delimiter
is configurable and defaults to ##2. Define
CMPP_DEFAULT_DELIM to a string when compiling to set the default at
build-time. The delimiter may also be modified via the
--delimiter=... command-line flag. This documentation, for brevity
and clarity, exclusively uses # unless it's specifically
demonstrating changing the delimiter.
See #directives for examples and more syntax details.
Token Types
c-pp directive arguments must each follow one of the following forms:
- `word`: a near-arbitrary token with no spaces. Most of the time, `word` tokens resolve as define keys. Sometimes a directive will instead treat them as literal values, such that the word `foo` is interpreted as `foo` instead of whatever value `foo` is defined to (if any).
- `int`: if it looks like an integer, with an optional +/- sign, it's tagged as such.
- `"string"` or `'string'`: this token starts out with quotation marks around it, but they're not part of its value. c-pp does not support backslash-escaping within a string. That is, all backslashes are retained as-is and there is no way to escape the outer quote character within the string.
- `@"..."` or `@'...'`: a string which gets passed through `@token@` parsing when it's evaluated.
- Group constructs:
  - `(...)`: currently only used in subexpressions.
  - `[...]`: context-dependent, but the convention is to use this for lists of other tokens. See #query for an example. See also: the "call" syntax.
  - `{...}`: for context-specific free-form content or, sometimes, used like a quoted string. See #query for an example.
- Syntactic quirks and limitations:
  - No group may contain an unbalanced closing character.
  - There is no mechanism for escaping a group opening or closing character; i.e. all openers must be balanced by a closer.
  - Their contents do not require backslash-escaped newlines. If newlines are escaped then the backslashes are stripped from them but the newlines are retained. It is not currently possible to double-backslash newlines to force them to remain backslash-escaped after parsing. Potential TODO: transform the escaped newlines to spaces (like we used to). The main problem with that is that it would affect how all directive lines are parsed, not just grouping tokens, and side effects need to be ruled out.
  - Leading and trailing space characters, up to and including the first resp. last newline, are trimmed, but the content is otherwise left as-is because it may contain text intended for external parsing, e.g. via #pipe or #query. Hard tabs are not considered spaces in this specific context so that they may be used in custom content.
Aside from the backslash-escaped newline case mentioned above, c-pp
does not support backslash escaping of anything. That is: it treats
all other backslashes, in all other contexts (unless explicitly noted
otherwise), just as any other character. It does this primarily to
give directives like #pipe flexibility in passing on
arguments. It is, however, admittedly sometimes a problem and it may
eventually need to be solved (i.e. changed to unescape certain
sequences, perhaps opt-in on a case-by-case basis or via the addition
of an as-yet-hypothetical #unescape function).
"Define" Keys (a.k.a. "Macros")
These docs frequently refer to "define keys". That is this project's
term (for lack of a better one; "macro" doesn't really fit here) for
the names managed via #define, #undef, and
the -D.../-U.../-F... CLI flags.
Define key naming rules are:
- Control characters, spaces, and most punctuation are disallowed; alphanumeric characters are allowed, but a key must not start with a number.
- Any of `-./:_` is allowed (but a key may not start with `-`).
- Any characters with a high bit set are assumed to be UTF-8 and are permitted as well.
- A key's length is limited, rather arbitrarily, to 64 bytes.
- Names with the prefix `cmpp` are reserved for use by the library. It does not loudly impose this rule, but it handles its internal defines such that attempts to override them will silently have no effect.
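As a sketch of these rules (the key names here are invented for illustration):

```
#// legal: alphanumerics plus any of -./:_
#define my_key.v2 1
#define build-mode:wasm 1
#// illegal (would be rejected): keys may not start with a digit or '-'
#// #define 1abc 1
#// #define -abc 1
```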
See #undefined-policy for how the library deals with references to undefined values.
Directives
c-pp directives both look and function a good deal like C preprocessor (CPP) directives do. They begin with a delimiter, followed by a directive, followed by any directive-dependent arguments.
A fundamental difference from a CPP is that c-pp's delimiter is
configurable, rather than being hard-coded to #. These docs use #
for brevity, but they always mean "the currently-configured delimiter"
(which can be changed while processing inputs).
Another fundamental difference is that each c-pp instance starts off with no directives installed. When it finds a directive in an input stream it checks its internal list of candidates and registers them on-demand. If it cannot find one, it falls back to the client-registered auto-loader and, if that isn't set or doesn't yield results, it will try to load them dynamically from a DLL. Client applications are free to register any they like in advance, and that's normally simpler than setting up an auto-loader fallback.
Example:
#if a
That #if is a directive, but the one on this line is not because it has non-space content before it.
#/if
Spaces and tabs before and after the delimiter, and between arguments,
are ignored, so the following #if is equivalent to the previous
one:

# if a
...
# /if
A directive may span lines by backslash-escaping each end-of-line character:
#if this is unusually \
long \
"so we'll wrap it"
...
#/if
No spaces may follow such a backslash. As an exception, the bodies of
(...), {...}, and [...] may span lines without requiring
backslash-escaped newlines:
##assert ( 1 ) and defined x and \
( x=3 )
Backslashes are optional within the confines of each group.
So-called "block" directives, like #if, have both an opening and a
closing line. The closing line is always in the form #/DIRECTIVE,
e.g. #/if, #/query, or #/pipe. The closing tags ignore any
arguments so that they can be decorated with informative comments by
document maintainers:
#if defined foo
... 1000 imaginary lines of text ...
#/if defined foo
Non-block directives have one-time effects which take place when they are parsed. Their effects may change the behavior of further parsing.
The following subsections cover each directive in alphabetical order.
#arg
#arg is primarily intended for use as a function. It
expands its argument and emits it. "The plan" is to add flags to this
to perform meta-operations on arguments, e.g. fetching their type
or raw value instead of their expanded value.
Usage:
#arg ?flags? one-argument
Flags:
- `-trim-left|-trim-right|-trim`: trim the given side(s) of spaces and newlines.
- TODO: `-raw`: do not expand the value before emitting it. This would strip the outer quotes from a string, for example, but not process the contents of an at-string.
It's currently difficult to envision a usage for this outside of testing this library.
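A hypothetical sketch of the trimming flags (the define name is invented and the exact trimming behavior is an assumption based on the flag descriptions above):

```
#define padded "   hello   "
#// emits the expanded value with surrounding whitespace removed:
#arg -trim padded
#// emits it with only the leading whitespace removed:
#arg -trim-left padded
```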
#assert
This works like #expr except that (A) it emits no output
and (B) it fails if its expression is false. #assert is
essentially syntactic sugar for:
#if not foo
#error ...
#/if
Which can be shortened to:
#assert foo
#attach or #detach a Database File
This directive is a thin proxy for SQLite's ATTACH command, which "attaches" a database to the current db connection:
#attach "/path/to/my.db" as "foo"
On its own it's not of much use, but it's intended to be paired with #query.
It won't create a new db without a URL-style db name like
file://foo.db?mode=rwc (assuming the linked-in SQLite has that
feature enabled (most builds do)). We don't really want to create or
administer arbitrary dbs from c-pp (there are much, much better ways
to do that). It is, however, useful as a basic templating system,
e.g.:
#attach "my.db" as "foo"
#query {select a, b, c from foo.t order by a}
a=@a@, b=@b@, c=@c@
#/query
#detach "foo"
#define: Set Preprocessor Symbols
This directive "defines" values, in the same sense that a C preprocessor does, the main difference being that defines in c-pp behave more like variables, in that they can be freely overwritten without first having to undefine them.
Usages:
#define foo
#// the same as #define foo 1
#define foo "this is foo"
#define bar foo
#assert bar="this is foo"
Prior to 2025-09-27, the equal sign was "just another identifier
letter", but it is now no longer permitted by this directive. In the
context of expressions, = is a comparison operation.
If a define is given no value, it has an implicit value of 1.
Multiple defines can be set at once with:
#define {
  x -> 2
  y -> 3
}
#assert (x=2) and (y=3)
This form requires a value for each key - there is no default. Each key is interpreted literally and each value is interpreted in the usual ways:
#define {a -> "hi there" b -> a}
Will define both a and b to hi there.
Values in the form (...) are interpreted as integer
expressions.
Design note: after some experimentation, the -> is required (A)
because it's easier [for me] to read that way than {k v k2 v2...}
is and (B) to avoid over-complicating the parsing by optionally
allowing -> or =. My eyes find = to be less legible in that
context.
To define a variable to the contents of a file, use the function call syntax:
#define x [include -raw the-file]
Potential TODOs:
- Flag(s?) to change how define interprets its value.
#define "Heredocs"
#define can also assign a value from a content block using a
heredoc-like syntax:
#define foo <<
content goes here
#/define
Notes and limitations:
- It must end with `#/define` on a line of its own.
- Its content may contain other `#directives`, but they must be completely contained, not interwoven.
- The final newline in the content is included, but that can be suppressed with the `-chomp` flag (see below).
- It is currently parsed for `@tokens@` when it is read (if the current policy is not "off"). The thinking is that it would normally be more useful to delay that, but currently there is no straightforward way to expand it (or to know whether to expand it) after-the-fact.
#define accepts the following flags immediately before the <<:
- `-chomp`: removes one trailing newline from (a.k.a. "chomps") the block before assigning it. `-chomp` can be given any number of times to chomp that many newlines. Chomping has no effect if the content does not end on a newline, but content blocks will always, because of how c-pp's syntax works, have at least one trailing newline unless they are completely empty. Tip: `<<<` is syntactic sugar for `-chomp <<`.
- Potential TODO: a `-@policy` specific to this define. For that to be useful, e.g. delaying `@token@` expansion until the define's value is later read, we first need a good way to tell c-pp to expand later (e.g. by tagging the value as an @-string and recognizing that when fetching the value). Options are being explored for that, but the most obvious ones would affect the lowest-level routines and i'm not sure this feature belongs there. Maybe it does. Who knows?
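A sketch of the heredoc forms described above:

```
#// x gets the value "hi\n" (the trailing newline is retained):
#define x <<
hi
#/define
#// y gets the value "hi" (<<< is sugar for -chomp <<):
#define y <<<
hi
#/define
```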
#delimiter: Change the Directive Delimiter
This directive changes the directive delimiter. The delimiter is managed as a stack, the same way as #@policy. The stack always starts out with the library's compile-time-defined delimiter on top.
Usages:
- `#delimiter DELIM`: changes the current delimiter to `DELIM`.
- `#delimiter push DELIM`: pushes `DELIM` as a new delimiter on the stack, making it the current delimiter.
- `#delimiter pop`: pops the most-recently pushed delimiter. It is illegal to invoke this unless one has invoked a corresponding `push`.
A DELIM argument of default, predictably enough, uses the
default delimiter (set when the library is compiled).
A final argument of << indicates that the new delimiter remains in
place only until a following #/delimiter directive, noting that the
closing directive has to be delimited by the newly-pushed delimiter.
In this form, it is an error if EOF is encountered before the closing
tag is found.
When used in a function call then (A) << is not permitted
and (B) if given no arguments then it will emit the current delimiter.
Example:
##delimiter push @@
@@delimiter push !! <<
!!expr 1
!!/delimiter
@@expr 2
@@delimiter pop
##expr 3
Results in the output 1\n2\n3.
PS: don't do that.
#error Breaks Things
Immediately stops processing with an error.
#error the rest of the line is an error message
As a special case, if the line both starts and ends with the same
character of " or ' then those are stripped from the result.
#expr Evaluates Things
This directive evaluates an expression (described in the next subsection) and emits its result (typically an integer).
#expr expression...
There are no known practical uses for this directive beyond in testing c-pp itself, but see #assert and #if for more practical uses of expressions.
Expression Rules
An expression, in this context, is a series of operators and operands which evaluate to either true or false. Expressions are used by several directives, most significantly #if.
The general syntax is:
[not] [defined] value [COMPARISON-OPERATOR value] [and ...] [or ...] [glob ...]
- `X COMPARISON-OP Y` compares the define of `X` against `Y`. `Y` may be an integer, a quoted string, a define name, or a `(...)` subexpression. The following comparison operators are supported, and spaces between them and their operands are optional: `=`, `!=`, `<`, `>`, `<=`, `>=`. Value comparisons are, for the most part, internally against strings, but expressions evaluate to an integer value.
- `X`, with no comparison operator, performs a boolean check: empty values and those with a value of `0` (zero) are false. All other values are true. (This is a string comparison, so `000` is true!)
- `"..."` or `'...'` are strings. Strings do not currently support any form of backslash-(un)escaping, so a string may not contain its own quote characters. All backslash characters in strings are retained as-is.
- `@"..."` or `@'...'` are "at-strings". They work like strings but, in most contexts, get the same expansion handling as `@tokens@`.
- `(...)` is a subexpression, the contents of which may be any legal expression. These may be nested. They currently always evaluate to an integer. "The hope" is to also support string expressions at some point, but the addition of function calls may make that unnecessary.
The following unary operators are supported:
- `not` negates the result of the expression. `not` may optionally be written as `!`. It may also be used multiple times in a row, each pair of which cancels the other out.
- `defined` changes the expression such that if the argument refers to a defined value, regardless of its value, the expression evaluates to true. The operand must be a `word` token. It does not accept strings, subexpressions, or other operators as its operand. Tip: the string `#if` is technically a word-type token, so it qualifies here, and `defined #DIRECTIVE-NAME` evaluates to true if a given directive exists. (As an exception to this documentation's conventions: it expects a literal single `#`, not the current directive delimiter!) This can be used to test whether a given custom directive has been installed.

Sidebar: `defined` very specifically does not trigger a search for a dynamically-loadable directive. It may trigger an autoloader, and that may trigger a DLL search. Hmm. (Removing the autoloader from that search causes tests to fail and also fails to give me the semantics i'd prefer.) Maybe `defined` needs a flag to specify whether or not to search the various sources for directives (registered ones, autoloadable ones, DLL-loadable ones), noting that an autoloader may do whatever it likes to load a directive.
The unary operators bind tightly to their RHS argument, but without
consideration for whether it is the beginning of a longer
expression. That is (not a=3) will parse as ((not a)=3). The
workaround is to use a subexpression: not (a=3) or, even simpler,
a!=3. (Patches to fix that, even if it means rewriting the beast of
an expression engine, would be very thoughtfully considered!)
The following binary operators are supported:
- The comparison operators listed above.
- `X and Y`
- `X or Y`
- `X glob Y`: evaluates to true if `X` matches glob pattern `Y`, else false. `X` is currently restricted to a quoted string or a define name. `Y` is required to be a quoted string or an at-string. `(X not glob Y)` is syntactic sugar for `(not (X glob Y))`.
All of the binary operators are evaluated strictly left-to-right, with equal precedence for each.
Sidebar: there is currently no short-circuiting of `and` and `or` because the evaluation and parsing are closely tied together, but none of their operands have visible side effects, so no harm is done in not short-circuiting (and thus there's no rush to change this).
- Most glaring is that chains of binary operators may need subexpressions: `(a=b and b=3)` does not parse how it looks like it should, and currently needs to be written as `(a=b and (b=3))`.
- FIXME: call syntax needs to be permitted for operands.
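A few sketches of the rules and caveats above:

```
#define a 3
#assert a = 3
#assert a != 4
#// (not a=4) would parse as ((not a)=4), so use a subexpression:
#assert not (a = 4)
#// chained binary operators need subexpressions after the first one:
#assert a = 3 and (a > 1)
```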
#if, #else, #elif, #/if
#if and friends cause blocks of the input to be emitted or elided
depending on the result of an expression. The expression syntax
differs from that of a C preprocessor but the end result is the same. This
family of directives includes #elif, #else, and #/if.
#if's arguments must make up an expression. #/if ignores
all of its arguments - it's commonly useful to add a note there saying which
block is being ended.
Example:
#if foo=1
...
#elif foo < 2 or foo > 5
...
#elif bar or baz or not defined charlie
...
#else
...
#/if foo=1
(Any text after /if on that last line will be ignored, which is
useful for annotating the line with the purpose of the block it's
closing.)
#include External Files
This directive emits the contents of other files into the output:
#include ?-raw? filename...
The filename arguments may optionally be quoted, and must be if they contain any quote or space characters.
The -raw flag specifies that each file's contents are to be passed
through to the current output channel with no interpretation,
otherwise each file is filtered through the preprocessor as if it were
part of the current file.
The filenames are searched for in the so-called "include path", which
works just like a C/C++ include path. If no path is provided when
invoking c-pp then it defaults to using a path of "." (only the
current directory). If the -Idirname flag is provided then the
default of "." is not applied. -I... can be used any number of
times to specify search directories and they will be searched in the
order provided.
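A brief sketch of the forms described above (the filenames are invented for illustration):

```
#// processed as if it were part of the current file:
#include header.txt
#// emitted with no preprocessing; quoting is required for
#// names containing spaces or quote characters:
#include -raw "a file with spaces.txt" LICENSE
```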
#join Arguments Together
The #join directive concatenates its arguments and emits the result
to the output stream.
It accepts the following flags:
- `-s SEPARATOR`: sets the separator which gets emitted between the following arguments. It may be used multiple times to change separators. The default is a single space.
- `-nonl`: when running in non-call mode, do not emit a newline. The default is to emit a newline. (In call mode, newlines are trimmed automatically by a higher level.)
$ ./c-pp -Db=2 -e '##join 1 b 3'
1 2 3
$ ./c-pp -e '##join 1 2 3 [join -s X 4 5 6]'
1 2 3 [4X5X6X]
TODO: unescaping of the separator to allow newlines and tabs. This needs to be done at a different level of the API, though.
#module Loads Directives from DLLs
The #module directive can load new directives from DLLs. In safe
mode it will neither register nor run.
#module "dllName" "directive-name"
It tries to open the given DLL, find an entry point with the given
name, which it assumes to be of type cmpp_loadable_module*, and it
invokes the module's callback. The intent is that such callbacks
register new directives. The DLL name argument may include the
platform's conventional DLL extension (".so" on most platforms), but
that's optional - the search includes checking the name both
as-provided and with the DLL extension added to it.
This support currently only works on Unix-esque platforms: those with
either dlopen() or ld_dlopen(). Patches to add support for other
platforms would be welcomed.
Registration of modules is handled via macros named CMPP_MODULE_...
in libcmpp.h.
The directive name can be left off if the module in question is specifically built and registered as the sole module in that DLL (in which case it uses a pre-defined entry point name). Whether that's the case depends on how it is built and which module registration macro(s) it uses.
This directive performs no filename transformation beyond the path lookup and automatic DLL extension.
When this directive is invoked, if the module search path is empty and
the CMPP_MODULE_PATH environment variable is set, it is added to the
module path. If set, it is expected to be in the form of a colon- or
semicolon-delimited list of directories (the former on Unix-like
systems, the latter on Windows).
Example:
./c-pp -Ddll=libcmpp.so \
-e '##assert not defined #dyno' \
-e '##module dll dyno' \
-e '##dyno hi there' \
-e '##assert defined #dyno'
cmpp_dx_f_dyno() arg: cmpp_TT_Word hi
cmpp_dx_f_dyno() arg: cmpp_TT_Word there
When the DLL is built with a singleton module registration the entry point name is not required, as the singleton uses a well-defined name:
$ ./c-pp -e '##module "libcmpp.so"' -e '##dyno hi there'
cmpp_dx_f_dyno() arg: cmpp_TT_Word hi
cmpp_dx_f_dyno() arg: cmpp_TT_Word there
Example module: /file/src/d-dyno.c
Directives in Loadable Module DLLs
If built with DLL support and it's not running in safe mode then the
library will, when encountering an unknown directive, search for a
matching DLL. For purposes of this search, the DLL is expected to be
named libcmpp-d-NAME.so. The module search path defaults to the
$CMPP_MODULE_PATH environment variable, but it can also be set with
the -L flag to c-pp or the C API's cmpp_module_dir_add().
If it finds a matching DLL, it opens it, and, if it finds a loadable module in it, that module's registration function is called. If that call registers the being-sought directive, the library continues processing. If not, then it fails with an "unknown directive" error.
The C API also offers an "autoloader" API which clients can install to load their own statically-linked directives on demand or to implement their own DLL search. That's independent of the library's automatic DLL search (which is, in terms of search priority, last on the list).
@policy Controls Expansion of @tokens@
#@policy ?push? POLICY-NAME
#@policy pop
By default c-pp does no expansion of content beyond the filtering of
content blocks using #if. If it is passed the -@ flag, or if this
directive is used in a script, then it will perform a restricted type
of expansion on content blocks: tokens in the form @TOKEN@ are
processed as described below.
#@policy takes a policy name argument, defaulting to error, which
describes how to deal with @tokens@ in the input:
- `off` (the default): no processing of `@tokens@` is performed.
- `error`: fail if an undefined `@X@` is referenced. This is the default if `#@policy` is used without an argument.
- `retain`: emit any unresolved `@X@` tokens as-is to the output stream.
- `elide`: omit unresolved `@X@` from the output, as if their values were empty.
The push option tells it to set the policy and remember the previous
policy. The pop option restores that previous policy and will error
out if there is no level to pop.
Behavior and limitations:
- `@token@` expansion generally happens only in "content" parts, not preprocessor lines. That is, `#if foo=@bar@` won't try to expand `@bar@` (just use `foo=bar` for that). At-strings can be used in some contexts to perform `@token@` expansion on directive arguments.
- It will not cross line boundaries looking for a closing `@`; i.e. the `X` part of `@X@` may not contain newlines. (The expanded value may contain newlines.)
- The `X` part of `@X@` is treated as a define key. If no match is found, then the current `#@policy` specifies how to deal with it. If a match is found then `@X@` gets replaced by the define's value.
The --no-@ CLI flag or #@policy off both disable expansion until
either a subsequent -@ or @policy flag re-enables it.
A demonstration of the "@policy":
$ echo 'a@x@c' | ./c-pp --@policy=off
a@x@c
$ echo 'a@x@c' | ./c-pp --@policy=retain
a@x@c
$ echo 'a@x@c' | ./c-pp --@policy=elide
ac
$ echo 'a@x@c' | ./c-pp --@policy=error
a
./c-pp: @<stdin>:1: Undefined key: @x@
Predefined @tokens@
- `__FILE__` resolves to the current input file's name.
#pipe Filters Content through External Processes
This directive is not currently available on Windows builds (patches to improve that would be thoughtfully considered!).
#pipe runs an external command, optionally feeds it input from the
script, and emits the output from that command:
#pipe -- /usr/bin/sed -e 's/this/that/'
this content
#/pipe
Will pipe this content\n into sed and get that content\n back.
Similarly:
#define cmd "echo"
#pipe -no-input -chomp-output -- cmd this is from echo
Will emit this is from echo and chomp the trailing newline
from the output.
Arguments and flags:
- `-chomp`: each time this flag is used, it causes one newline to be removed from the directive's input block.
- `-chomp-output`: each time this flag is used, it causes one newline to be removed from the external command's output.
- `-no-input`: tells this directive not to consume the following content looking for a `#/pipe` directive. The external command is sent no input from this directive.
- `-exec-direct`: normally the external command and its arguments are passed to the OS as a suffix of `/bin/sh -c`. This flag tells it to run that command directly, without the intermediary shell. This can only work if the command has no arguments, otherwise the arguments will be treated as part of the command name. (We could optionally implicitly set this if the command has no arguments.)
- `-path`: tells it to search the `$PATH` when looking for the command, as documented for `execlp(3)` and `execvp(3)`. More specifically, it uses `execlp(3)` or `execvp(3)`, depending on the form of the command (see below), instead of `execl(3)` or `execv(3)`.
- `-debug`: emit the post-processed command to stderr before running it.
- `--`: must immediately precede the command name. This tells the directive that we are switching from c-pp's token parsing to near-arbitrary input.
The final argument must be the command and its arguments in one of two forms:
- `command-name ...args`: if the command name is not quoted then it is treated as a define key unless it contains any `/`, `\`, `.`, or `-` characters. In those cases it is assumed to be a filename or command switch and is not subject to any further interpretation. At-strings, as well as define names which do not match the aforementioned patterns, will be expanded appropriately.

  Only the first token of the command string is parsed, so that command names may be runtime-configurable via defines. The remaining arguments, because they may be essentially free-form, are not parsed as arguments by c-pp, but are passed on almost as-is to the command. The only interpretation they go through is (A) to determine where this directive's line ends and (B) any backslash-escaped newlines in the arguments get elided entirely, as if they were not there.
- `[command-name args...]`: in this form, each argument in the given list is treated like a normal directive argument. Each may be a string, at-string, number, or word. The one exception to their normal processing is the same one described for the command name in the previous form, but in this form that rule applies to all unquoted word tokens. The `--` flag is optional for this call form because the `[...]` group unambiguously tells us that it's the command.
The external command gets piped, via its stdin, the contents of the
directive's block unless -no-input is used. The command's stdout
output is collected and emitted in its place. The output is not
currently post-processed in any way except as per the -chomp-output
flag, but should the need arise we can easily add optional
at-token parsing to the output via a flag.
Stupid #pipe trick: run a C preprocessor through it:
```
##pipe -path -- 'cpp' -E
#include <stdio.h>
##/pipe
```
That requires using a directive delimiter other than #
to avoid a conflict with cpp's #.
TODOs:
- BUG: it will hang, waiting on I/O, in some constructs, e.g. the one marked BUG in this file.
- A build option and CLI flag to disable both this and `#include`, to make it safer for use with potentially untrusted inputs.
- Flag(s) to control whether or not to @-parse the command arguments.
- A `-define X` flag which sets X to the piped output instead of emitting it.
- Figure out how to report when the underlying `exec()` call fails due to an invalid command name. "The problem" is that the command is run as an argument to `/bin/sh -c`, and `exec()` succeeds in calling that, but then `/bin/sh` fails to find the command. That happens in the child process, so we can't directly report it to the parent. Currently this situation results in empty output (and maybe a cryptic message from `/bin/sh` on stderr) but no error. (Maybe we should close the child's stderr? Or maybe capture it separately and error if stderr produces any output? How do we do that?)
#pragma Is for Debugging
This directive is undocumented. It changes at the whim of the library's developer, primarily to support testing and debugging.
#query Renders Data from a Database
This directive runs SQL queries. c-pp internally uses only one
(private) database, so #query isn't much use on its own except for
in testing c-pp, but #attach can be used to attach
arbitrary databases (and was added to support #query).
This directive has two forms:
First, it can run an SQL query, set scope-local defines for each result column, and filter its block's contents for @tokens@ using the current @token@ policy:
```
Your list of foo:
#query {select name AS name, price AS price from foo order by name}
@price@ @name@
#query:no-rows
This part is optional and is emitted if the query has no results.
#/query
```
That form requires a terminating #/query directive but the
#query:no-rows sub-directive is optional (and may not appear more
than once).
Secondly, it can define one or more symbols from the first row of an SQL query:
#query define {select a, b from c order by a}
This form does not use a terminating #/query directive.
For the first form the body of the query block is
@token@-expanded to the output stream one time for each
result row. Before each is expanded, defines are set matching the
names of the result columns. The defines are set within
the context of a local savepoint so that after the
query is processed the defines are either unset or reverted to their
previous values. If no rows are found, the (optional) #query:no-rows
block is emitted. If that block is not set, no output is emitted for
queries which have no result rows.
The query block may contain other directives, but any directives
need to be completely enclosed inside the #query...#/query body, not
interwoven.
The "define" form sets corresponding defines for the first row of the result set and does not use a savepoint. If no result rows are found it sets each define to an empty value. (Potential TODO: add a flag to error out in that case, or maybe provide default values.)
Sidebar: remember that the only guaranteed reliable way to get a result column's name is to set it oneself using `SELECT x AS x` (with the "AS" being optional).
Formatting of the results, if needed, can be done using SQLite's
format() function. It is exceedingly unlikely that c-pp will ever be
extended to include formatting-related features. (However, function
calls bring that capability within easy reach.)
Potential TODOs:
- Maybe make the `@token@` policy for the content part configurable for this call, rather than using the current policy. It seems that a mode of "error" is the best fit for this use and it's difficult to imagine wanting any other mode here. However, there's an internal reason which enforces that we use the current policy here, and that still needs to be resolved.
Binding Query Parameters
Query parameters can be bound either by name or index, but not
a mix of both, by adding a bind argument:
```
#query {select :a a, $b b} bind {:a -> 1 $b -> 2}
#query {select ?1 a, ?2 b} bind ["one" "two"]
```
Sidebar: SQLite supports a prefix of `@` in addition to `:` and `$`, but it's not supported here because of syntactic confusion with at-strings.
Bind values may be any of:
- A quoted string (the quotes are not part of the bound value).
- A `{...}` block is treated like a quoted string, supported here solely for the outlier case where a value has to contain both single- and double-quotes.
- A define name gets expanded to its value.
- An at-string gets expanded.
- An integer.
- An integer expression enclosed in `(...)`.
#savepoint: Scoped Defines
Savepoints are like nestable transactions. In c-pp they let us define/undefine values in a scoped manner. That is, a symbol defined in a savepoint will become undefined, or revert to its pre-savepoint value, if that savepoint is rolled back. It might be interesting to someday explore how savepoints might be used for content blocks as well, but the internals are not currently set up to do such a thing (we'd need to buffer all output to the db or memory, rather than sending it directly to the output channel).
#savepoint requires a single argument:
- `begin` starts a new savepoint.
- `commit` saves all changes and closes the savepoint.
- `rollback` discards all changes made since the start of the most recent savepoint and closes that savepoint.
If a savepoint is neither committed nor rolled back by the end of its script file, it will automatically be rolled back. It is an error to try to end a savepoint when none is currently open.
```
$ cat foo
#@policy error
#define bar=2
#savepoint begin
#define foo
#if not foo
# error expecting foo
#/if
foo is @foo@
bar is @bar@
rolling back...
#savepoint rollback
#if foo
# error foo should not be set
#else
foo is gone
bar is @bar@
#/if
#if not bar=2
# error expecting bar=2
#/if
begin again...
#savepoint begin
#define foo=again
#if not foo=again
# error expecting foo=again
#/if
foo is @foo@
bar is @bar@
committing...
#savepoint commit
bar is @bar@
#if not foo=again
# error expecting foo=again
#/if
bar is @bar@
foo is @foo@
the end
$ ./c-pp --delimiter '#' foo
foo is 1
bar is 2
rolling back...
foo is gone
bar is 2
begin again...
foo is again
bar is 2
committing...
bar is 2
bar is 2
foo is again
the end
```
Why was #savepoint added? An idle thought of "wouldn't it be
interesting to automatically undefine these vars at the end of the
file which defined them?" led to "oh, savepoints can do that". Then
it was actually really easy to add.
#stderr
Emits remainder of line to stderr.
#stderr This goes to stderr along with file location info.
#undef
Undefines one or more defines:
#undef foo bar baz
#undefined-policy
Specifies how c-pp should react to references made to undefined keys:
```
#undefined-policy ?push? error|null
#undefined-policy pop
```
The policy values are:
- `null` (the default): treat undefined keys as falsy.
- `error`: trigger an error if resolving an expression would require using an undefined key. This should probably be the default. The `defined` expression operator specifically does not trigger such errors.
push and pop work exactly as described for #@policy.
#//: Comments
Infrequently useful, but...
#// This is a c-pp comment.
There must be a space after the // because that // is, despite
appearances, parsed as a directive name.
Multi-line comments are not supported but #if can be used for the same effect:
```
#if defined nope
...
#/if
```
Add-on Directives
This section describes directives which are not part of the core library but which are in this source tree, available for copy/paste reuse. They may require third-party software. They may or may not also be pre-built into the library or CLI app.
The directives are listed in alphabetical order.
#c-code
This proof of concept directive filters input into C code formats.
Source file: d-c-code.c
```
#c-code -mode byte-array \
        -getter get_mah_bytes {
this is content}
```
Emits something like:
unsigned char const * get_mah_bytes_get(unsigned * pLen){
static unsigned char const _a[] = {
10,116,104,105,115,32,105,115,32,99,111,110,116,101,110,116
};
if(pLen) *pLen=sizeof(_a);
return _a;
}
And:
```
#c-code -mode byte-array -hex -name mah_bytes
...content goes here...
#/c-code
```
Emits:
unsigned char const mah_bytes[] = {
0x23,0x69,<big snip>...
0x0a
};
The block content may contain other directives, which is
especially useful here with `#include -raw`.
-mode cstr has it emit the content as a string literal.
#pikchr
This directive reads pikchr input and emits SVG-format image output.
Source file: d-pikchr.c
Usages:
```
#pikchr ...flags
... pikchr markup...
#/pikchr
```
Or:
```
#pikchr ...flags { ...pikchr markup... }
```
Those differ in the following ways:
- The block form may contain other directives, whereas `{...}` may not.
- The block form's output is implicitly @token@-parsed using the current `@token@` policy. The `{...}` form is not @token@-parsed by default, but see the `-@` flag.
It emits an SVG-format image or an error message. In the case of a
pikchr() error, this directive emits the full pikchr result to the
output stream before setting the error state to something less
verbose than pikchr()'s error dump.
Flags:
- `-@` tells the `{...markup...}` form to @token@-parse the `{...}` block using the current policy. This flag is illegal in the block form (which is implicitly @token@-parsed according to the current policy).
- `-dark` tells pikchr to prefer a "dark-mode" color scheme.
- `-css-class STRING` adds the given CSS class(es) to the generated SVG image.
- `-unchomp` forces an additional newline on the output. May be used multiple times.
- `-chomp` removes one trailing newline from the output. May be used multiple times.
Regarding newlines: it's not specified whether pikchr output always includes a trailing newline. If -unchomp and -chomp are used together, results may be unpredictable.
"Function Calls"
As of 2025-11-11 c-pp supports a limited form of "function call" in
the form [D ...args] where D is the name of a directive. This only
works for directives which can function without a closing directive
(even if they do so only conditionally, e.g. #query, in
which case only the closing-directive-less forms are legal here).
Calls work by doing the following:
1. Copy the input string, prepending the current delimiter to it. We have to copy it because $REASONS.
2. Redirect the current output stream to a buffer.
3. Process the buffer from #1 as an input document.
4. Restore the output stream to its previous state.
5. Any output from that document is now in the buffer from #2, which becomes the result of the call. A single trailing newline is trimmed from the result.
It's still being determined where this syntax should be legal, but here are some examples of where it currently is:
- Expression tokens
- `#query` bind values
- `#define` values
- During @token@ parsing, `@[...]@` is experimentally a form of call, and the `...` part may span lines like `[...]` may.
- The `[sum ...args]` directive was created as a demonstration of this feature, simply adding all integer-looking arguments together.
Some of the functions currently available: #arg, #join.
The Library API
This section demonstrates how to use the library API from client C code. It is not an exhaustive guide (that's what the API docs are for) but is enough to get started with the library.
The first step is getting a preprocessor instance:
```c
#include "libcmpp.h"
...
cmpp * pp = 0;
int rc = cmpp_ctor(&pp, 0/*optional flags*/);
if( rc ){
  // error
  if( pp ){
    // cmpp_err_get() will get the error info.
    cmpp_free(pp);
  }
  return;
}
... use pp ...
```
(Initialization will only fail if an allocation fails or if
optional custom initialization code fails. In the former
case, pp will always be NULL. In the latter case, the pp's error
state holds info about the failure.)
Next, we set up an output channel:
```c
cmpp_outputer out = cmpp_outputer_FILE;
out.state = stdout;
cmpp_outputer_set(pp, &out, "<stdout>");
```
Any output destination which can be wrapped in the cmpp_output_f()
interface is suitable. Implementations are provided for FILE*, file
descriptors, and cmpp_buffer (basically a dynamic string buffer),
and adding one's own is normally trivial, e.g. to send output directly
to a UI widget.
Then we feed it some input:
```c
unsigned char const *input = ...a script full of input...;
int rc = cmpp_process_string(pp, "my-input.txt", input, -1);
if( 0==rc ) {
  ... success ...
}
```
In essence it can take input from anywhere, but it requires that the
input be completely available when parsing starts, so the lowest level
of feeding it input is cmpp_process_string(), where each call
equates to a new input source. cmpp_process_file() and
cmpp_process_stream() are both thin proxies around
cmpp_process_string().
On success, all of the output will show up in the provided output channel. On error, the output may have been partially generated and must not be trusted as being complete or usable. Most errors cannot be recovered from without cleaning up all state, and practice shows that in this context there's little or no reason to attempt it.
When we're done we need to clean up:
cmpp_dtor(pp);
For the most part, that's really all there is to it.
The library can be extended with custom directives and several are
demonstrated in d-demo.c and d-pikchr.c. Custom directives
can perform essentially any jobs the builtin directives do, the
notable exception being flow-control changes (like #if does). More
properly, they can implement flow control but must provide the
infrastructure needed for nesting such constructs and ensuring that
they're closed properly. The internal infrastructure for doing so is
probably not well-suited to general-purpose flow control, e.g. adding
a hypothetical #while or #foreach loop. Similarly, the
expression-evaluation API is not yet in the public API, and it's still
being determined whether to make it so (because it's rather
primitive).
Library Build Options
The library, for client-side use, is distributed in two files,
libcmpp.[ch], which can be created with make libcmpp.c.
The following CPP defines influence how libcmpp.c is built. They
have no effect on client code.
- `-DCMPP_CTOR_INSTANCE_INIT=function_name`: if set, the given function must have a signature of:
  `int function_name(cmpp*)`
  It will be called as part of `cmpp_ctor()` so that any custom directives can be added to each new instance. It will be called before the preprocessor installs any of its built-in directives, so custom directives may override builtin ones.
- `-DCMPP_MAIN`: includes the `main()` impl for the `c-pp` binary.
- `-DCMPP_MAIN_INIT=func`: works just like `CMPP_CTOR_INSTANCE_INIT` (see above) but applies only to the instance which `main()` uses. This is used, e.g., for plugging in custom/non-core/demo directives.
- `-DCMPP_MAIN_AUTOLOADER=func`: if defined then `func` must have the signature of `cmpp_d_autoload_f()`. It is installed as the main instance's directive autoloader.
- `-DCMPP_OMIT_...`: the following features are optional because they give scripts ways to access near-arbitrary content and may, in some uses, be security-relevant:
  - `CMPP_OMIT_D_DB`: omit #query, #attach, and #detach.
  - `CMPP_OMIT_D_PIPE`: omit #pipe.
  - `CMPP_OMIT_D_INCLUDE`: omit #include.
  - `CMPP_OMIT_D_MODULE`: omit #module.
  - `CMPP_OMIT_ALL_UNSAFE`: sets all of the above `OMIT` flags and will include any future directives which access the filesystem, invoke external processes, or similar. This flag currently only affects directives, not other library-level APIs, but an eventual goal is to be able to make all filesystem-specific parts optional. ("Unsafe" is too strongly worded here but its heart is in the right place.)
Background: Why Create c-pp?
In mid-2022 the SQLite project started work on its JS/WASM bindings. It was initially written for "vanilla" JS for the simple reason of personal preference of the guy writing the code, but it was clear we would eventually need to support ESM (ES6 modules) because that's what the modern-day JS ecosystem uses. Vanilla JS and ESM are 99.9% identical but each has tiny context-specific syntactic differences. Most differences in JS can be resolved via runtime introspection but syntactic differences make code outright illegal in one or other of the modes.
We had several options for dealing with this:
- Ignore it. It might go away. This was tried, but pressure eventually mounted and my proverbial white flag had to be raised. (Tip: having a support contract with SQLite greatly increases the odds of one's own specific variety of pressure bearing fruit!)
- Switch to ESM only. That wasn't going to happen (A) for the aforementioned reason about the one doing the coding and (B) because, at the time, some browsers could not yet launch ESM modules as Workers. Since the "killer feature" of the project's JS bindings was expected to be its integration with persistent client-side storage via OPFS, and OPFS is only available in JS Workers, point (B) held significant weight.
- Maintain two copies with slight differences. No way. No. way. Nope.
- Construct the sources dynamically. This could easily turn into a huge mess of scripts but... it still sounded like the best of the available options.
A notable restriction: one rule of the SQLite project is that we cannot simply import random code into it, so any tooling was going to have to be hand-rolled by members of the project. Spoiler alert: only one team member needed this tool, so it was up to them to implement it (double-spoiler alert: 🙋♂️).
First we tried a C preprocessor, as that's precisely the type of
thing we needed, but it didn't take more than 15 minutes to determine
that it was unsuitable for the job. Summary: C preprocessors make a
mess of non-C code by injecting it with C-isms like #line markers
or, in the case of GCC, a GNU license header. If gcc's preprocessor
could have been taught to emit only its filtered inputs, without
irrelevant other content, the story would have ended there and much
subsequent effort could have been spared.
The SQLite project has a strong culture of "keep it simple" and "don't be shy about writing your own tools", instilled the hard way over 2.5 decades, and that culture has seeped into me in my time there. My built-in tendency, however, is to over-engineer everything, even otherwise simple shell scripts, a fault at odds with The SQLite Way. Even so... we needed a preprocessor, or something like it.
For logistical reasons, the choices had to come down to Tcl, dependency-free C, or the core Unix tools like sed, awk, and sh. A large handful of Tcl scripts already generate the core of SQLite, some much like a very-specific-purpose preprocessor. At the time, my Tcl-fu was not strong enough for me to confidently pull off my envisioned tool in Tcl. Maintaining JS code using shell scripts was, and remains, simply unappealing. So C became the implementation route of choice.
Writing dependency-free C code can be somewhat tedious, as one invariably ends up re-inventing the same set of utility code, like a memory buffer class and a function to read in a file's whole contents at once (possibly into one of those buffers). In this case we'd also need a hashtable early on and, sigh, it would have to be written3.
It turns out, though, that we could use sqlite3.h and still be
effectively dependency-free because this tool would be embedded in
SQLite's own tree. How convenient! Long story short: being able to use
an in-memory db as a hashtable was a huge time-saver and had further
downstream benefits.
So work began on the preprocessor with the self-imposed restriction that it do only what we need, and not (contrary to my core nature!) be designed as a generic, client-agnostic, tool. That meant, for example, that it would use only global state, read only from a single file handle, and write only to one file handle. (Whereas my natural tendency would be to abstract the I/O channel into a client-extensible interface, taking up more code, more time, and adding a feature we ultimately wouldn't use. Sigh.)
And thus c-pp was born.
c-pp has proven invaluable for its initial role. SQLite has, as of late 2025, some 8 or 10 different JS builds, all from the same core source files, and that would have been nigh impossible for our tiny team to reliably manage without some sort of source-filtering tool.
It turns out that there's a third build mode we didn't know about at the time: "bundler-friendly builds". "Bundlers" are source code analysis tools which look through the multitudes of dependencies used by modern JS dev approaches and "bundle" them into sets which contain only the reachable parts of that code. One of their limitations is that they cannot resolve dynamically-generated string references to external filenames, which means that they cannot resolve dependent file names which are computed in code. They have to be fed such file names as string literals instead. Sigh. Bundler builds differ from ESM only in their requirement for hard-coded string literals for the parts of SQLite which have to load external scripts (like its OPFS VFS or its "worker1" API). We cannot unilaterally use hard-coded strings because (A) that's icky and (B) we don't know the full paths to some files at compile-time. Bundler builds work around (B) by hard-coding a name which will only work in limited contexts.
At some point my natural urge to over-engineer got the best of me and c-pp was refactored from a single-purpose monolithic app into a client-agnostic library, quickly more than tripling in code and docs. It would be difficult to justify adding that sort of complexity and code bloat to the SQLite tree, given that that tree needs exactly none of it, so the original/"lite" version is maintained over in the lite branch, tweaked only insofar as necessary for SQLite-side JS maintenance.
The trunk branch, contrariwise, is where my over-engineering gets to run rampant, without risk to the SQLite JS builds. Some of the remnants of c-pp's original monolithic-app shape are still visible in its interface and code, but the trunk version has become a significantly different thing than its predecessor.
But why? Why do we need an over-engineered, client-extensible preprocessor?
We don't. Spoiler alert: i don't, either! The world has lots of problems and the ones this project ostensibly solves aren't among them. It is done because it interests me to do, and for no other reason.
Potential TODOs
- Add the ability to persist the db? "The problem" with that is that the schema would then be "public", so it couldn't be modified without some hassle. This would allow us to build up a db of values before processing, e.g. via a configure script. What could we really do with it, anyway?
- Maybe `#/*` and `#*/` as comment blocks. `#if 0` works fine, though.
- Maybe a `#db` directive with operations like:
  - `#db open filename as dbName`
  - `#db trace dbName ?expanded? to filename` (would be especially helpful)
  - `#db trace dbName off`
  - `#db query dbName ...` (as for `#query`)
  - `#db close dbName`
Reminders to self...
Should the @ for token replacement be configurable?
Why would it need to be? Configuring it to a pair of single characters would be an easy change, but changing it to a pair of arbitrary-length strings would require more effort (and for what gain?).
- ^ C preprocessors, when running in comment-retention mode, tend to inject `#` characters all over the place and may do silly things like automatically include compiler-specific headers and emit the comments from those. e.g. using `gcc -E -CC` will include a gcc-internal header and emit a GPL license header in the output. To see it, try:
  ```
  $ echo 'extern int x;' > y.c; gcc -E -CC y.c
  ```
- ^ We do not use a default of `#` because some source files this tool was initially designed to handle have lines which start with that (JavaScript class private members). In that particular tree we use a delimiter of `//#`. Even so, the docs use `#` because it's easier on the eyes than the real default is.
- ^ Writing hashtables is one of those things which becomes tedious the fourth or fifth time around.