- Setup the chroot Jail
- Static Fossil Binary via Docker
- SSL Cert via certbot
- Althttpd HTTP Server
- Fossil Config
- Open Issues
Intro
Historically, all of my Fossil hosting, since 2008, has been on shared hosters via CGI, without any access to dedicated server capabilities. This system represents my first time having access to a full-fledged public-facing server from which to host fossil. This document describes how it is set up and touches on some of the reasons for certain decisions, e.g. the choice of web server.
In short, the server comprises:
- Ubuntu Linux running on a VPS.
- The althttpd web server for HTTP access.
- Optionally, stunnel for HTTPS support (althttpd has had built-in TLS support since January 14, 2022, which makes stunnel superfluous).
- Optionally xinetd for launching the web server as needed.
- certbot for SSL certificates.
- A number of Fossil SCM source code repositories.
- Various websites built from static HTML and CGI apps.
The choice of OS was primarily one of habit and comfort: Linux has been my virtual home since last millennium. The choice of fossil wasn't really a choice at all: it's the reason for setting up this server in the first place. The other choices were made after evaluating several options. This document won't provide an essay on each of those choices but will touch upon them in the relevant sections.
See mailserver.md for how this system's mail server is set up.
Prerequisite Software
This document will not detail the process of installing each piece of software except where it's deemed unusually exotic. The majority of the software it uses is available via the OS-level package repository. In particular, on this system we require:
$ apt install stunnel4 xinetd snapd docker.io
(`snapd` is required for certbot on Ubuntu.)
We also need a copy of fossil somewhere in the PATH. Make sure to get a recent version, or build it from its trunk, rather than relying on a semi-ancient version provided in the OS's package repositories. See below for instructions on building a static fossil binary via docker.
An Administrative User
The content on this system is owned (in terms of filesystem permissions) and administered by a non-root user who has SSH access to the system, as opposed to a system-level user like `www-data`. That user requires root access via `sudo` in order to complete the main setup and perform occasional system-level tasks, but administering the web content requires no root access. If the environment is set up such that the content is owned by a user who cannot log in (like `www-data`) then most administrative tasks will require logging in as root and `chown`'ing the files to the proper user. When these docs refer to `USER`, they mean the non-root user who manages the site content.
Setup the chroot Jail
All of the public-facing content on this server is hosted under a so-called "chroot jail," which is essentially a directory tree in the filesystem in which the web server locks itself before starting to serve data. This security measure means that if someone manages to find a hole in the web server and gain access to the system via that process, they're limited to non-root access within that one branch of the filesystem. As the chroot jail holds only a bare minimum of system-level files, it's not even possible for an attacker to "sudo" their way out of it.
Debian-based Linux systems have an easy approach to installing full-featured chroot environments, but ours is intentionally as minimal as possible. For those interested in full-featured over minimal, see:
https://www.linode.com/docs/guides/use-chroot-for-testing-on-ubuntu/
The chroot environment was set up via a sequence of shell commands almost identical to the following, noting that (1) they require root access and (2) USER refers to a non-root user account on whose behalf most of the web content will be managed.
Initial chroot Setup
$ mkdir -p /jail/dev
$ cd /jail/dev
$ mknod null c 1 3 # <-- values are OS-dependent!
$ mknod urandom c 1 9 # <-- values are OS-dependent!
$ chmod 0666 null urandom
$ cd ..
$ mkdir proc
$ chmod 0555 proc
$ mount -t proc /proc proc/
# ^^^^ see below for fstab entry
# ./tmp is required for chrooted fossil writing temp files:
$ mkdir tmp
$ chown USER tmp
$ mkdir .well-known
# ^^^ part of the ACME protocol for SSL cert renewal
The `/proc` and `/tmp` filesystems require mounting on each reboot, so add these lines to `/etc/fstab`:
/proc /jail/proc proc defaults 0 2
swap /jail/tmp tmpfs defaults,size=500m,uid=USER 0 2
and then:
$ mount /jail/tmp
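To verify that both jail mounts are active, check the mount table. The exact options and uid shown here are examples; output should look similar to:

```
$ mount | grep /jail
proc on /jail/proc type proc (rw,relatime)
swap on /jail/tmp type tmpfs (rw,relatime,size=512000k,uid=1000)
```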
Static Fossil Binary via Docker
For our minimalistic chroot we need a completely static fossil binary.
Though we can ostensibly build a static binary using the --static
flag to fossil's configure script, the resulting binary is not truly
static on Linux environments which use glibc. In order to build a
truly static binary on Linux, we need an environment with a different
libc. Fortunately, this is really easy to do with Docker.
The initial Docker setup looks like:
$ sudo apt install docker.io
# Add USER to the docker group:
$ sudo usermod -a -G docker USER
# ^^^ logout/login will be necessary for the user
# to activate their access to that group.
With that in place we can build a static fossil binary. Copy the Dockerfile and build script shown below somewhere convenient, e.g. `~/tmp`, cd to that directory, then run that script...
Static Build Wrapper Script
#!/bin/sh
set -e
set -x
docker build -t fossil_static \
--build-arg cachebust=$(date +%s) \
"$@" \
.
docker create --name fossil fossil_static
docker cp fossil:/fossil-src/fossil fossil
strip fossil
ls -la fossil
docker container rm fossil
set +x
cat <<EOF
Now maybe do:
docker image rm \$(docker image ls | grep -e fossil_static -e alpine | awk '{print \$3}')
or:
docker system prune --force
EOF
Dockerfile
This file must be saved as `Dockerfile`, or the above wrapper script must be modified to pass the `-f filename` flag to `docker build`.
########################################################################
# Builds a static fossil SCM binary from the latest trunk
# source code.
# Optional --build-arg entries:
# repoUrl=source repo URL (default=canonical tree)
# version=tag or version to pull (default=trunk)
# cachebust=an arbitrary value to invalidate docker's cache
########################################################################
ARG repoUrl=https://fossil-scm.org/home
ARG version=trunk
ARG cachebust=0
# FROM alpine:edge
# >3.13 breaks stuff unduly:
# https://wiki.alpinelinux.org/wiki/Draft_Release_Notes_for_Alpine_3.14.0#faccessat2
FROM alpine:3.13
RUN apk update && apk upgrade \
&& apk add --no-cache \
curl gcc make tcl musl-dev \
openssl-dev zlib-dev openssl-libs-static \
zlib-static
ARG repoUrl
ARG version
ARG cachebust
RUN curl \
"${repoUrl}/tarball/fossil-src.tar.gz?name=fossil-src&uuid=${version}" \
-o fossil-src.tar.gz \
&& tar xf fossil-src.tar.gz \
&& cd fossil-src \
&& ./configure \
--static \
--disable-fusefs \
--json \
&& make
Build the Static Fossil
$ ./build-fossil.sh
...
$ sudo mv fossil /jail/bin/.
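To double-check that the resulting binary is genuinely static, `file` and `ldd` can be consulted. The output will differ in its details, but should resemble:

```
$ file fossil
fossil: ELF 64-bit LSB executable, x86-64, statically linked, stripped
$ ldd fossil
        not a dynamic executable
```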
Setup SSL Cert via certbot
certbot offers free short-lived SSL certificates and automation to renew them.
This step requires that at least one domain or subdomain has already been mapped to this server's static IP(s) and that resolution of that domain has propagated through the DNS system.
Precisely how to set up certbot is platform-specific but trivial. Simply select the operating environment from:
https://certbot.eff.org/instructions
and walk through the steps. For purposes of this server, when selecting a web server for certbot, the proper choice was "other" (as opposed to Apache, nginx, or "hosted").
The certbot process took less than 10 minutes to complete (at a leisurely pace) and ended with output similar to:
Certificate is saved at: /etc/letsencrypt/live/MY-DOMAIN/fullchain.pem
Key is saved at: /etc/letsencrypt/live/MY-DOMAIN/privkey.pem
This certificate expires on 2022-04-04.
These files will be updated when the certificate renews.
Certbot has set up a scheduled task to automatically renew this
certificate in the background.
We need to remember that `/etc/letsencrypt/...` path for later; we'll need it when setting up SSL access to the web content.
On this platform that process sets up automation to renew the certificate as needed, so certbot becomes mostly a background detail. A following section explains a couple of scripts we need for managing certbot, but the software those scripts rely on will not be installed until a later step in the setup process.
Althttpd HTTP Server
Despite my being a long-time user of Apache, setting up this server was seen as an opportunity to try out a different web server solution. Apache, nginx, and althttpd were all evaluated, and althttpd was chosen because:
- The stunnel setup for HTTPS was trivial. A couple of weeks later althttpd got built-in TLS support, making stunnel unnecessary.
- Its virtual host configuration is absolutely trivial. Apache's isn't bad, but simply creating a directory with a matching name is easier and completely sufficient for this system.
- Its support for CGIs is more straightforward and flexible than Apache's, and those are important for me. Apache has to be configured to allow CGIs in specific directories and it then only allows files with extensions mapped to the CGI handler.
- Last, but not least, it's authored and maintained by someone close to this project, and it's been the basis of the primary Fossil SCM site since its inception.
Sidebar: nginx was only "superficially" evaluated, not actually tested. Certainly it would have been a good solution as well.
Sidebar: Another option would have been to use fossil's own embedded HTTP server, which includes SSL support as of January 2022. Though it would have been about 95% sufficient for this task, hosting of certain static content would not have worked as-is in that setup. Namely, though fossil can serve static files in a limited capacity, it does not offer any ability to browse static directories, nor auto-select an `index.html` when given a directory name (on second thought, maybe it can?). althttpd also cannot browse static directories, but the way it supports CGIs makes that easy to do via small CGI scripts which render the contents of such directories.
althttpd is a single-source-file solution which must be downloaded, compiled, and installed:
$ fossil clone https://sqlite.org/althttpd
$ cd althttpd
$ make
$ sudo mv althttpd althttpsd /jail/bin
Adding Websites to althttpd
Althttpd uses the directory specified by its `-root DIR` argument as its virtual root, but it requires that a separate directory exist for each website. The directory name is a normalized form of the (sub)domain name used to access the site, as documented in althttpd's own docs. At a very minimum, it requires a directory named `default.website`, which is the fallback it uses if it cannot find a directory name matching the one used by the client to access the site.

The `-root DIR` is the virtual root for chroot purposes, but the content root is the `*.website` directory corresponding to the site being accessed. CGI scripts must have all of their required resources, e.g. binaries and any shared libraries they need, installed under the `-root DIR` directory.
Setup stunnel4 (for HTTPS access)
Historically, stunnel4 was the go-to solution for wrapping althttpd in a TLS-capable connection. As of January 14, 2022, althttpd can be compiled with built-in TLS support using OpenSSL, making stunnel optional for TLS support.
$ sudo apt install stunnel4
$ sudo emacs /etc/stunnel4/my.conf
pid = /var/run/stunnel4/stunnel.pid
cert = /etc/letsencrypt/live/DOMAINNAME/fullchain.pem
key = /etc/letsencrypt/live/DOMAINNAME/privkey.pem
[https]
accept = :::443
TIMEOUTclose = 0
exec = /jail/bin/althttpd
; Remember that some paths here are relative to chroot'd /jail:
execargs = /jail/bin/althttpd -logfile /log/althttpd.log -root /jail -user USER -https 1
^X^S
$ sudo service stunnel4 restart
As of this writing (Jan 18, 2022), stunnel is no longer deployed on this server.
Optional: Setup xinetd (for HTTP(S) access)
If HTTP access is required or desired, it can be provided by a standalone instance of `althttpd` or, optionally, via a service like `xinetd`. For this server `xinetd` was chosen on the tried-and-true grounds of "a working example was already available."
$ sudo apt install xinetd
$ sudo emacs /etc/xinetd.d/http
service http
{
port = 80
flags = IPv4
bind = PUBLIC_IP_V4_ADDRESS_OF_THIS_MACHINE
socket_type = stream
wait = no
user = root
server = /jail/bin/althttpd
server_args = -logfile /log/althttpd.log -root /jail -user USER
}
service http
{
port = 80
flags = REUSE IPv6
bind = PUBLIC_IP_V6_ADDRESS_OF_THIS_MACHINE
socket_type = stream
wait = no
user = root
server = /jail/bin/althttpd
server_args = -logfile /log/althttpd.log -root /jail -user USER
}
^X^S
$ sudo /etc/init.d/xinetd restart
Note that the `-logfile` path in the xinetd configuration is relative to `/jail`, not `/`.
xinetd can also, as of January 2022, be used to serve HTTPS instances by adding a snippet like this one to the xinetd config file `/etc/xinetd.d/https`:
service https
{
port = 443
flags = IPv4
socket_type = stream
wait = no
user = root
server = /jail/bin/althttpsd
server_args = -logfile /log/althttpsd.log -root /jail -user USER -cert /path/to/cert.pem -pkey /path/to/key.pem
}
If the certificate file contains both a certificate and a key, the `-pkey` flag is optional.
Certbot Renewal Hooks
For certbot auto-renewal to work, we need to arrange for HTTP access to the server to be disabled briefly. We can optionally also disable HTTPS but do not need to, as the renewal process requires only HTTP. The following certbot hook scripts, or equivalents, need to be installed by the root user:
root@www4:/etc/letsencrypt/renewal-hooks# cat pre/stop-www.sh
#!/bin/bash
/usr/bin/systemctl stop xinetd
root@www4:/etc/letsencrypt/renewal-hooks# cat post/start-www.sh
#!/bin/bash
/usr/sbin/service stunnel4 restart
/usr/bin/systemctl start xinetd
Make sure they're executable by root.
With those in place, the certbot renewal process should look like:
root@host:~# certbot --dry-run renew
Saving debug log to /var/log/letsencrypt/letsencrypt.log
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Processing /etc/letsencrypt/renewal/MY-DOMAIN.conf
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Simulating renewal of an existing certificate for MY-DOMAIN
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Congratulations, all simulated renewals succeeded:
/etc/letsencrypt/live/MY-DOMAIN/fullchain.pem (success)
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Web Root(s)
Althttpd is capable of hosting multiple domains via a single setup but
neither requires nor permits separate explicit configurations for each
(sub)domain. Instead, it looks for a directory, relative to the one
specified for its -root PATH
argument, which matches a normalized
form of the domain name with an extension of .website
. The
transformation and naming conventions can be found in the althttpd
docs, but for starters we just need to create a single
content directory named default.website
(the fallback name althttpd
looks for if no domain-specific name is found). Whereas Apache
requires configuring each so-called virtual host (a.k.a. vhost)
separately, althttpd is geared towards servers operated by a single
team with identical setups, where a single configuration can apply to
any number of hosts.
In short, althttpd's content-serving rules are: if an exact file match is found, it is served as-is unless that file is executable, in which case it is run as a CGI and its output becomes the HTTP response (so the output must include any HTTP headers and such). If a directory is requested, althttpd looks in that directory for the first one it finds of (`home`, `index.html`, `index.cgi`), where `home` is typically a CGI script. althttpd does not offer directory-browsing features like Apache's, or like FTP servers typically do, but that is easy to add to any given directory using a CGI script.
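As an illustration, such a directory-listing CGI can be tiny. The following is a sketch, not one of the scripts actually deployed on this server; since a CGI's stdout becomes the complete HTTP response, it must emit its own headers first:

```shell
#!/bin/sh
# Minimal directory-listing CGI sketch: emits an HTML index of the
# files in the given directory (default: the current directory).
list_dir() {
  dir=${1:-.}
  # The CGI must produce its own HTTP headers:
  printf 'Content-Type: text/html\r\n\r\n'
  printf '<html><body><ul>\n'
  for f in "$dir"/*; do
    b=$(basename "$f")
    printf '<li><a href="%s">%s</a></li>\n' "$b" "$b"
  done
  printf '</ul></body></html>\n'
}
list_dir "$@"
```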
Fossil Configuration
This section touches on the configuration of the hosted fossil repositories.
Where to Store Repositories
Ideally, all hosted repositories are stored outside of the space accessible to web clients. On this server they are all under `/jail/museum` and are accessed via CGI wrapper scripts in `/jail/fossil_wanderinghorse_net.website/r` (here). We "could" run fossil in "directory mode," such that it would list out all repository files from a given directory, but i prefer to have the option to configure each repository separately, as well as "hide" some repositories from common view, and one CGI script per repository, plus an index page listing them, provides that level of flexibility. Directory mode also has the disadvantage of needing to open each repository on every HTTP request in order to pull out its name and description, which would be quite expensive for this server (which hosts more than 60 repositories).
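For reference, each such CGI wrapper script is tiny. Here is a sketch of one, with a hypothetical repository name; keep in mind that althttpd chroots into `/jail` before running CGIs, so both paths below are interpreted relative to the jail:

```
#!/bin/fossil
repository: /museum/REPO-NAME.fossil
```

The script must be marked executable, and fossil's output then forms the complete HTTP response.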
Sending Notifications from Fossil Repositories
All repositories on this server have the following configuration option in common:
email-send-method=db
email-send-db=/jail/notifications/repos.db
email-admin=AN-EMAIL-ADDRESS
email-self=THE-SAME-EMAIL-ADDRESS
email-subname=[REPO-NAME]
It is important that `/jail/notifications` and all files in it are writable by the user named in althttpd's `-user USER` flag. On this system that looks something like:
$ ls -la notifications
total 104
drwsrwx--- 2 www-data www-data 4096 Jan 15 16:23 .
drwsrwxr-x 15 root www-data 4096 Jan 16 05:42 ..
-rw-rw---- 1 www-data www-data 8192 Jan 15 12:23 repos.db
-rw-rw---- 1 www-data www-data 32768 Jan 17 05:59 repos.db-shm
-rw-rw---- 1 www-data www-data 53592 Jan 16 15:58 repos.db-wal
(The `www-data` user is the one althttpd is running as.)
With that in place, all notifications for all repositories will be added to that db file. Once email is up and running (see mailserver.md for how this server does it), the notifications can be polled and mailed out via a script such as this one:
#!/usr/bin/tclsh
set POLLING_INTERVAL 60000 ;# milliseconds between polls
set DBFILE /jail/notifications/repos.db
set PIPE "/usr/sbin/sendmail -ti"
package require sqlite3
sqlite3 db $DBFILE
db timeout 5000
catch {db eval {PRAGMA journal_mode=WAL}}
db eval {
CREATE TABLE IF NOT EXISTS email(
emailid INTEGER PRIMARY KEY,
msg TXT
);
}
while {1} {
db transaction immediate {
set n 0
db eval {SELECT msg FROM email} {
set pipe $PIPE
if {[regexp {\nFrom:[^\n]*<([^>]+)>} $msg all addr]} {
append pipe " -f $addr"
}
set out [open |$pipe w]
puts -nonewline $out $msg
flush $out
close $out
incr n
}
if {$n>0} {
db eval {DELETE FROM email}
}
}
after $POLLING_INTERVAL
}
That script runs forever, so it should be started in the background using a system service or a cron job. On Linux systems `crontab` supports a run-time of `@reboot` to start tasks at system boot, so something like this will do in a pinch:
@reboot sleep 30 && /jail/bin/send-fossil-notifications.tcl
On this particular system we require an intermediary script to start that with:
$ cat send-fossil-notifications.sh
#!/bin/sh
# Workaround to start send-fossil-notifications.tcl without
# my local LD_LIBRARY_PATH's incompatible copy of libsqlite3
# getting in the way.
LD_LIBRARY_PATH=
export LD_LIBRARY_PATH
pid=`pgrep -f send-fossil-notifications`
#If pgrep isn't available (e.g. in busybox):
#pid=`ps -ef | grep send-fossil-notifications.tcl | grep -v grep | awk '{print $1}'`
#^^^^ noting that in busybox the PID is field $1 and in /bin/ps it's $2!
if test x != "x$pid"; then
kill -9 $pid
fi
exec nohup /jail/bin/send-fossil-notifications.tcl >/dev/null 2>&1 &
Open Issues
Currently none :-D.