Ezjail host
Background
This is my personal checklist for setting up a new ezjail host. I like my jail hosts configured in a very specific way, and there is a good chance that what is right for me is not right for you. For example, my filesystem layout is basically a zpool on top of a geli device created on a zvol exported from another zpool. If you decide to go with this layout you shouldn't expect blazing filesystem performance. I don't need blazing filesystem performance, but I do need encryption, and I do need ZFS, so this is just right for me :).
Also note that I talk a lot about the German hosting provider Hetzner. If you are using another provider or you are doing this at home, just ignore the Hetzner-specific stuff. Much of the content here can be used with little or no changes outside Hetzner.
Installation
OS install with mfsbsd
After receiving the server from Hetzner I boot it using the rescue system, which puts me at an mfsbsd prompt. I then edit the zfsinstall script /root/bin/zfsinstall and add "usr" to FS_LIST near the top of the script. I do this because I like to have /usr as a separate ZFS dataset.
I then run the zfsinstall script like below. I am going to export the majority of the available diskspace as a ZVOL which will be used for a GELI device with another zfs pool on top. This pool will house the actual jails and data.
Note that the disks are new-ish (Power_On_Hours is 73 on both drives according to smartctl, which the mfsbsd author has been clever enough to include in mfsbsd) but I still found an MBR partition table that needed to be deleted first. This can be done with the destroygeom command, as shown below:
[root@rescue ~]# zfsinstall -d ad4 -d ad6 -r mirror -s 5G -t /nfs/mfsbsd/9.0-amd64-zfs.tar.xz
Error: /dev/ad4 already contains a partition table.

=>        63  5860533105  ad4  MBR  (2.7T)
          63  5860533105       - free -  (2.7T)

You may erase the partition table manually with the destroygeom command
[root@rescue ~]# destroygeom
Usage: /root/bin/destroygeom [-h] -d geom [-d geom ...] [-p zpool ...]
[root@rescue ~]# destroygeom -d ad4 -d ad6
Destroying geom ad4:
Destroying geom ad6:
[root@rescue ~]# zfsinstall -d ad4 -d ad6 -r mirror -s 5G -t /nfs/mfsbsd/9.0-amd64-zfs.tar.xz
Creating GUID partitions on ad4 ... done
Configuring ZFS bootcode on ad4 ... done
=>        34  5860533101  ad4  GPT  (2.7T)
          34        2014       - free -  (1.0M)
        2048         128    1  freebsd-boot  (64K)
        2176    10485760    2  freebsd-swap  (5.0G)
    10487936  5850045199    3  freebsd-zfs  (2.7T)

Creating GUID partitions on ad6 ... done
Configuring ZFS bootcode on ad6 ... done
=>        34  5860533101  ad6  GPT  (2.7T)
          34        2014       - free -  (1.0M)
        2048         128    1  freebsd-boot  (64K)
        2176    10485760    2  freebsd-swap  (5.0G)
    10487936  5850045199    3  freebsd-zfs  (2.7T)

Creating ZFS pool tank on ad4p3 ad6p3 ... done
Creating tank root partition: ... done
Creating tank partitions: var tmp usr ... done
Setting bootfs for tank to tank/root ... done
NAME            USED  AVAIL  REFER  MOUNTPOINT
tank            210K  2.68T    21K  none
tank/root        88K  2.68T    25K  /mnt
tank/root/tmp    21K  2.68T    21K  /mnt/tmp
tank/root/usr    21K  2.68T    21K  /mnt/usr
tank/root/var    21K  2.68T    21K  /mnt/var
Extracting FreeBSD distribution ... done
Writing /boot/loader.conf... done
Writing /etc/fstab...Writing /etc/rc.conf... done
Copying /boot/zfs/zpool.cache ... done

Installation complete.
The system will boot from ZFS with clean install on next reboot

You may type "chroot /mnt" and make any adjustments you need.
For example, change the root password or edit/create /etc/rc.conf for
for system services.

WARNING - Don't export ZFS pool "tank"!
[root@rescue]#
Finally, I fix a small shortcoming in the zfsinstall script: it only enables swap on the first drive even though it creates swap partitions on all drives. I run the commands below to find the gptid of the swap partition that is not yet in use, run swapon on that gptid, and check the swapinfo output again to verify that it worked:
[root@rescue]# swapinfo
Device                           1K-blocks  Used    Avail  Capacity
/dev/gptid/e0f61e4f-a59e-11e1-9    5242880     0  5242880        0%
[root@rescue]# glabel status ada0p2 | grep gptid
gptid/e0f61e4f-a59e-11e1-93ed-e840f2090be0  N/A  ada0p2
[root@rescue]# glabel status ada1p2 | grep gptid
gptid/e20f4cc8-a59e-11e1-93ed-e840f2090be0  N/A  ada1p2
[root@rescue]# swapon /dev/gptid/e20f4cc8-a59e-11e1-93ed-e840f2090be0
[root@rescue]# swapinfo
Device                           1K-blocks  Used     Avail  Capacity
/dev/gptid/e0f61e4f-a59e-11e1-9    5242880     0   5242880        0%
/dev/gptid/e20f4cc8-a59e-11e1-9    5242880     0   5242880        0%
Total                             10485760     0  10485760        0%
[root@rescue]#
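With more than two drives, comparing gptids by eye gets tedious. The following is a minimal sketch of the same check, not part of zfsinstall: the function name inactive_swap is made up for this example, and it assumes swapinfo keeps truncating long device names the way it does in the output above, so it matches on prefixes.

```sh
# Read swapinfo output on stdin and print every candidate device (given as
# arguments) that is not listed there yet. swapinfo truncates long device
# names, so a candidate counts as active when a listed name is a prefix of it.
inactive_swap() {
    # Collect the device column, skipping the header and the "Total" line.
    active=$(awk 'NR > 1 && $1 != "Total" { print $1 }')
    for dev in "$@"; do
        found=no
        for a in $active; do
            case "$dev" in
                "$a"*) found=yes ;;
            esac
        done
        [ "$found" = no ] && echo "$dev"
    done
    return 0
}
```

Used as `swapinfo | inactive_swap /dev/gptid/<first> /dev/gptid/<second>`, it prints the devices that still need a swapon.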
Post install configuration (before reboot)
Before rebooting into the installed FreeBSD I need to make certain I can reach the server through SSH after the reboot. This means:
- Adding network settings to /etc/rc.conf
- Adding sshd_enable="YES" to /etc/rc.conf
- Changing PermitRootLogin to yes in /etc/ssh/sshd_config
- Finally I set the root password.
All of these steps are essential if I am going to have any chance of logging in after reboot. Most of these changes can be done from the mfsbsd shell but the password change requires chroot into the newly installed environment.
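For reference, the rc.conf additions could look like the sketch below. Every value is a placeholder I picked for illustration (documentation addresses, an example hostname, and the em0 interface name); the real values come from the provider.

```sh
# /mnt/etc/rc.conf as seen from the mfsbsd shell (sketch, placeholder values)
hostname="jailhost.example.org"                       # assumed hostname
ifconfig_em0="inet 192.0.2.10 netmask 255.255.255.0"  # assumed interface and address
defaultrouter="192.0.2.1"                             # assumed gateway
sshd_enable="YES"                                     # start sshd at boot
```

From the mfsbsd shell the installed system lives under /mnt, so the files to edit are /mnt/etc/rc.conf and /mnt/etc/ssh/sshd_config.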
I use the chroot command but start csh explicitly, because bash is not installed in /mnt:
[root@rescue ~]# chroot /mnt/ csh
rescue# ee /etc/rc.conf
rescue# ee /etc/ssh/sshd_config
rescue# passwd
New Password:
Retype New Password:
rescue#
So, the network settings are sorted, root password is set, and root is permitted to ssh in. Time to reboot (this is the exciting part).
Creating the encrypted zvol
I export most of the available diskspace as a ZVOL which will be used to house a GELI volume.
[root@ ~]# zfs list
NAME            USED  AVAIL  REFER  MOUNTPOINT
tank            359M  2.68T    21K  none
tank/root       359M  2.68T  84.8M  /
tank/root/tmp    28K  2.68T    28K  /tmp
tank/root/usr   274M  2.68T   274M  /usr
tank/root/var   412K  2.68T   412K  /var
[root@ ~]# zfs create -V 2640G tank/gelizvol
[root@ ~]# zfs list
NAME            USED  AVAIL  REFER  MOUNTPOINT
tank           2.66T  17.0G    21K  none
tank/gelizvol  2.66T  2.68T    16K  -
tank/root       359M  17.0G  84.8M  /
tank/root/tmp    28K  17.0G    28K  /tmp
tank/root/usr   274M  17.0G   274M  /usr
tank/root/var   412K  17.0G   412K  /var
[root@ ~]# ls -l /dev/zvol/tank/gelizvol
crw-r-----  1 root  operator    0, 124 May 24 13:10 /dev/zvol/tank/gelizvol
[root@ ~]#
Then I create a geli key from /dev/random and initialize the geli provider:
[root@ ~]# dd if=/dev/random of=/root/encrypted.key bs=64 count=1
1+0 records in
1+0 records out
64 bytes transferred in 0.000031 secs (2064888 bytes/sec)
[root@ ~]# ls -l /root/encrypted.key
-rw-r--r--  1 root  wheel  64 May 24 13:14 /root/encrypted.key
[root@ ~]# geli init -s 512 -K /root/encrypted.key /dev/zvol/tank/gelizvol
Enter new passphrase:
Reenter new passphrase:

Metadata backup can be found in /var/backups/zvol_tank_gelizvol.eli and
can be restored with the following command:

        # geli restore /var/backups/zvol_tank_gelizvol.eli /dev/zvol/tank/gelizvol

[root@ ~]#
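The listing above shows the key file created with mode -rw-r--r--, which is readable by every local user. A minimal sketch of creating the key with tighter permissions follows; the function name make_geli_key is made up for this example, and the approach (a restrictive umask around the same dd call) is my suggestion rather than something the setup requires.

```sh
# Create a 64-byte key file that only its owner can read or write.
# $1 = path of the key file, e.g. /root/encrypted.key as used above.
make_geli_key() {
    keyfile=$1
    (
        # umask 077 makes newly created files mode 0600. The subshell keeps
        # the umask change from leaking into the calling shell.
        umask 077
        dd if=/dev/random of="$keyfile" bs=64 count=1 2>/dev/null
    )
}
```

The umask only applies when dd creates the file. If the key file already exists with looser permissions, `chmod 600` on it has the same effect.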
Next I attach the newly created geli provider:
[root@ ~]# geli attach -k /root/encrypted.key /dev/zvol/tank/gelizvol
Enter passphrase:
[root@ ~]# ls -l /dev/zvol/tank/
total 0
crw-r-----  1 root  operator    0, 124 May 24 13:20 gelizvol
crw-r-----  1 root  operator    0, 127 May 24 13:22 gelizvol.eli
[root@ ~]#
Now to create the zpool on top of the unlocked geli provider:
[root@ ~]# zpool create cryptopool /dev/zvol/tank/gelizvol.eli
[root@ ~]# zpool list
NAME         SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
cryptopool  2.56T   108K  2.56T     0%  1.00x  ONLINE  -
tank        2.72T   363M  2.72T     0%  1.00x  ONLINE  -
[root@ ~]# zpool status cryptopool
  pool: cryptopool
 state: ONLINE
  scan: none requested
config:

        NAME                      STATE     READ WRITE CKSUM
        cryptopool                ONLINE       0     0     0
          zvol/tank/gelizvol.eli  ONLINE       0     0     0

errors: No known data errors
[root@ ~]#
The last remaining thing is to create a filesystem in the new zfs pool:
[root@ ~]# zfs list
NAME            USED  AVAIL  REFER  MOUNTPOINT
cryptopool       91K  2.52T    31K  /cryptopool
tank           2.66T  17.0G    21K  none
tank/gelizvol  2.66T  2.68T  1.16M  -
tank/root       359M  17.0G  84.8M  /
tank/root/tmp    28K  17.0G    28K  /tmp
tank/root/usr   274M  17.0G   274M  /usr
tank/root/var   419K  17.0G   419K  /var
[root@ ~]# zfs set mountpoint=none cryptopool
[root@ ~]# zfs create -o compression=gzip -o mountpoint=/usr/jails cryptopool/jails
[root@ ~]# zfs list
NAME              USED  AVAIL  REFER  MOUNTPOINT
cryptopool        149K  2.52T    31K  none
cryptopool/jails   31K  2.52T    31K  /usr/jails
tank             2.66T  17.0G    21K  none
tank/gelizvol    2.66T  2.68T  1.33M  -
tank/root         359M  17.0G  84.8M  /
tank/root/tmp      28K  17.0G    28K  /tmp
tank/root/usr     274M  17.0G   274M  /usr
[root@ ~]#
What remains is post-install configuration such as the firewall. The basic install is finished \o/
Upgrade OS (buildworld)
I usually run -STABLE on my hosts, which means I need to build and install a new world and kernel. I also like having rctl available on my jail hosts, so I can limit jail resources in all kinds of neat ways. Additionally, I need the built world to populate ezjail's basejail.
Note: I will need to update the host and the jails many times during the lifespan of this server, which is likely > 2-3 years. As new security problems are found or features are added that I want, I will update host and jails. There is a section about staying up to date later in this page. This section (the one you are reading now) only covers the OS update I run right after installing the server.
Fetching sources
I make a copy of the example config file, and set the server name to cvsup.de.FreeBSD.org using sed:
$ sudo cp /usr/share/examples/cvsup/stable-supfile /etc/
$ sudo sed -i "" "s/=CHANGE_THIS/=cvsup.de/g" /etc/stable-supfile
$ sudo csup -L 2 /etc/stable-supfile
Create kernel config
After the sources finish downloading, I create a new kernel config file /etc/TYKJAIL with the following content:
include GENERIC
ident TYKJAIL

#rctl
options RACCT
options RCTL

#dtrace
options KDTRACE_HOOKS # all architectures - enable general DTrace hooks
options DDB_CTF # all architectures - kernel ELF linker loads CTF data
options KDTRACE_FRAME # amd64 - ensure frames are compiled in
makeoptions DEBUG="-g" # amd64? - build kernel with gdb(1) debug symbols
makeoptions WITH_CTF=1
I then create a symlink to the kernel config file in /etc/:
$ ln -s /etc/TYKJAIL /usr/src/sys/amd64/conf/
$ ls -l /usr/src/sys/amd64/conf/TYKJAIL
lrwxr-xr-x  1 root  wheel  9 Jul 22 16:14 /usr/src/sys/amd64/conf/TYKJAIL -> /etc/TYKJAIL
Building world and kernel
Finally I start the build. I use -j12 because I have 12 cores in this system. Check the number of cores in your system with sysctl:
$ sysctl hw.ncpu
hw.ncpu: 12
To build the new system:
$ cd /usr/src/
$ sudo make -j12 buildworld && sudo make -j12 buildkernel KERNCONF=TYKJAIL && sudo make installkernel KERNCONF=TYKJAIL && date
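The job count and kernel config name can also be taken from variables so the same commands work on the next host. This is a minimal sketch under my own assumptions: the variable names NCPU, KERNCONF and SRCDIR and the function name build_all are made up, KERNCONF defaults to the TYKJAIL config created above, and the function is meant to be run as root.

```sh
# Parallel jobs: use hw.ncpu, falling back to 1 if sysctl cannot report it.
NCPU="${NCPU:-$(sysctl -n hw.ncpu 2>/dev/null || echo 1)}"
KERNCONF="${KERNCONF:-TYKJAIL}"   # kernel config name (assumed default)
SRCDIR="${SRCDIR:-/usr/src}"      # source tree location

# Build world, build the kernel and install the kernel, stopping at the
# first step that fails. The subshell keeps the cd from affecting the caller.
build_all() {
    (
        cd "$SRCDIR" || exit 1
        make -j"$NCPU" buildworld &&
        make -j"$NCPU" buildkernel KERNCONF="$KERNCONF" &&
        make installkernel KERNCONF="$KERNCONF"
    )
}
```

installkernel is deliberately run without -j, as in the commands above.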
After the build finishes, reboot and run mergemaster, installworld, and mergemaster again:
$ cd /usr/src/
$ sudo mergemaster -pFUi && sudo make installworld && sudo mergemaster -FUi
DO NOT OVERWRITE /etc/group AND /etc/master.passwd AND OTHER CRITICAL FILES!
Reboot after the final mergemaster completes, and boot into the newly built world.
Preparing ezjail
ezjail needs to be installed and a bit of configuration is also needed, in addition to bootstrapping /usr/jails/basejail and /usr/jails/newjail.
Installing ezjail
First I install /usr/ports/sysutils/ezjail using Portmaster:
$ sudo portmaster /usr/ports/sysutils/ezjail
Configuring ezjail
Then I edit the ezjail config file /usr/local/etc/ezjail.conf and add/change these two lines near the bottom:
ezjail_use_zfs="YES"
ezjail_jailzfs="cryptopool/jails"
This makes ezjail use separate zfs datasets under cryptopool/jails for the basejail and newjail, as well as for each jail created with the -c zfs option.
Bootstrapping ezjail
Finally I populate basejail and newjail from the world I built earlier:
$ sudo ezjail-admin update -i
The last line of the output is a message saying:
Note: a non-standard /etc/make.conf was copied to the template jail in order to get the ports collection running inside jails.
This is because ezjail defaults to symlinking the ports collection in the same way it symlinks the basejail. I prefer having a separate ports collection in each of my jails though, so I remove the symlink and make.conf from newjail:
$ sudo rm /usr/jails/newjail/etc/make.conf /usr/jails/newjail/usr/ports
ZFS goodness
Note that ezjail has created two new ZFS datasets to hold basejail and newjail:
$ zfs list -r cryptopool/jails
NAME                        USED  AVAIL  REFER  MOUNTPOINT
cryptopool/jails            129M  2.52T    31K  /usr/jails
cryptopool/jails/basejail   128M  2.52T   128M  /usr/jails/basejail
cryptopool/jails/newjail    816K  2.52T   816K  /usr/jails/newjail
Adding -c zfs to the ezjail-admin create command will make ezjail create a separate zfs dataset for each jail, more on that later.
Configuration
This section outlines what I do to further prepare the machine to be a nice ezjail host.
Firewall
One of the first things I do is enable the pf firewall from OpenBSD. I add the following to /etc/rc.conf to enable pf at boot time:
[root@ ~]# grep pf /etc/rc.conf
pf_enable="YES"
pflog_enable="YES"
[root@ ~]#
I also create a very basic /etc/pf.conf:
[root@ ~]# cat /etc/pf.conf
### macros
if="em0"
table <portknock> persist

#external addresses
tykv4="a.b.c.d"
tykv6="2002:ab:cd::/48"
table <allowssh> { $tykv4,$tykv6 }

#local addresses
glasv4="w.x.y.z"

### scrub
scrub in on $if all fragment reassemble

################
### filtering
### block everything
block log all

################
### skip loopback interface(s)
set skip on lo0

################
### icmp6
pass in quick on $if inet6 proto icmp6 all icmp6-type {echoreq,echorep,neighbradv,neighbrsol,routeradv,routersol}

################
### pass outgoing
pass out quick on $if all

################
### pass incoming ssh and icmp
pass in quick on $if proto tcp from { <allowssh> } to ($if) port 22
pass in quick on $if inet proto icmp all icmp-type { 8, 11 }

################
### pass ipv6 fragments (hack to workaround pf not handling ipv6 fragments)
pass in on $if inet6
block in log on $if inet6 proto udp
block in log on $if inet6 proto tcp
block in log on $if inet6 proto icmp6
block in log on $if inet6 proto esp
block in log on $if inet6 proto ipv6
To load pf without rebooting I run the following:
[root@ ~]# kldload pf
[root@ ~]# kldload pflog
[root@ ~]# pfctl -f /etc/pf.conf && sleep 60 && pfctl -d
No ALTQ support in kernel
ALTQ related functions disabled
I get no prompt after this because pf has cut my SSH connection. But I can SSH back in if I did everything right, and if not, I can just wait 60 seconds after which pf will be disabled again. I SSH in and reattach to the screen I am running this in, and press control-c, so the "sleep 60" is interrupted and pf is not disabled. Neat little trick for when you want to avoid locking yourself out :)
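The trick can be wrapped in a small function that also parses the ruleset before loading it, so a typo never reaches the running firewall. A minimal sketch; the function name safe_pf_reload and its arguments are made up for this example, and the parse-only step with pfctl -n is my addition to the procedure above.

```sh
# Parse the ruleset first, then load it with a timed fallback.
# $1 = ruleset file (default /etc/pf.conf)
# $2 = seconds to wait before pf is disabled again (default 60)
safe_pf_reload() {
    conf=${1:-/etc/pf.conf}
    wait_s=${2:-60}
    # -n parses the file without loading it; stop here on a syntax error.
    pfctl -nf "$conf" || return 1
    # Load the rules, then disable pf unless the sleep is interrupted.
    pfctl -f "$conf" && sleep "$wait_s" && pfctl -d
}
```

As with the one-liner, run it inside screen and interrupt the sleep with control-c once the new SSH login works.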
Replacing sendmail with Postfix
I always replace Sendmail with Postfix on every server I manage. See Replacing_Sendmail_With_Postfix for more info.
Listening daemons
When you add an IP alias for a jail, any daemons listening on * will also listen on the jail's IP, which is not what I want. For example, I want the jail's sshd, not the host's sshd, to be able to listen on port 22 on the jail's IP. Check for listening daemons like so:
$ sockstat -l46
USER     COMMAND    PID   FD PROTO  LOCAL ADDRESS         FOREIGN ADDRESS
root     master     1554  12 tcp4   *:25                  *:*
root     master     1554  13 tcp6   *:25                  *:*
root     sshd       948   3  tcp6   *:22                  *:*
root     sshd       948   4  tcp4   *:22                  *:*
root     syslogd    789   6  udp6   *:514                 *:*
root     syslogd    789   7  udp4   *:514                 *:*
[tykling@glas ~]$
This tells me that I need to change Postfix, sshd and syslogd to stop listening on all IP addresses.
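The same check can be scripted. A minimal sketch, assuming the column layout shown above (command in the second field, local address in the sixth); the function name wildcard_listeners is made up for this example.

```sh
# Read `sockstat -l46` output on stdin and print the name of every command
# that listens on a wildcard address (local address starting with "*:").
wildcard_listeners() {
    awk 'NR > 1 && $6 ~ /^\*:/ { print $2 }' | sort -u
}
```

Running `sockstat -l46 | wildcard_listeners` prints one command name per line, and prints nothing once every daemon is bound to a specific address.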
Postfix
The defaults in Postfix are really nice on FreeBSD, and most of the time a completely empty config file is fine for a system mailer (sendmail replacement). However, to make Postfix stop listening on port 25 on all IP addresses, I do need one line in /usr/local/etc/postfix/main.cf:
$ cat /usr/local/etc/postfix/main.cf
inet_interfaces=localhost
$
sshd
To make sshd stop listening on all IP addresses I uncomment and edit the ListenAddress line in /etc/ssh/sshd_config:
$ grep ListenAddress /etc/ssh/sshd_config
ListenAddress x.y.z.226
#ListenAddress ::
(IP address obfuscated..)
syslogd
I don't need my syslogd to listen on the network at all, so I add the following line to /etc/rc.conf:
$ grep syslog /etc/rc.conf
syslogd_flags="-ss"
Restarting services
Finally I restart Postfix, sshd and syslogd:
$ sudo /etc/rc.d/syslogd restart
Stopping syslogd.
Waiting for PIDS: 789.
Starting syslogd.
$ sudo /etc/rc.d/sshd restart
Stopping sshd.
Waiting for PIDS: 948.
Starting sshd.
$ sudo /usr/local/etc/rc.d/postfix restart
postfix/postfix-script: stopping the Postfix mail system
postfix/postfix-script: starting the Postfix mail system
A check with sockstat reveals that no more services are listening on all IP addresses:
$ sockstat -l46
USER     COMMAND    PID   FD PROTO  LOCAL ADDRESS         FOREIGN ADDRESS
root     master     1823  12 tcp4   127.0.0.1:25          *:*
root     master     1823  13 tcp6   ::1:25                *:*
root     sshd       1617  3  tcp4   x.y.z.226:22          *:*
$
Network configuration
Network configuration is a big part of any jail setup. If I have enough IP addresses (IPv4 and IPv6) I can just add IP aliases as needed. If I only have one or a few IPv4 addresses I will need to use RFC 1918 addresses for the jails. In that case, I create a new loopback interface, lo1, and add the IP aliases there. I then use the pf firewall to redirect incoming traffic to the right jail, depending on the port in use.
IPv4
If RFC 1918 jails are needed, I add the following to /etc/rc.conf to create the lo1 interface on boot:
### lo1 interface for ipv4 rfc1918 jails
cloned_interfaces="lo1"
When the lo1 interface is created, or if it isn't needed, I am ready to start adding IP aliases for jails as needed.
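For the RFC 1918 case, the jail aliases on lo1 can be declared in /etc/rc.conf as well. A sketch with placeholder values: the 10.0.0.0/24 range and the two addresses are my own picks, and alias numbers have to start at 0 and stay consecutive.

```sh
# /etc/rc.conf (sketch; addresses are placeholders)
cloned_interfaces="lo1"
ifconfig_lo1_alias0="inet 10.0.0.1 netmask 255.255.255.255"   # first jail
ifconfig_lo1_alias1="inet 10.0.0.2 netmask 255.255.255.255"   # second jail
```

To add an address without rebooting, `ifconfig lo1 alias 10.0.0.1 netmask 255.255.255.255` does the same thing by hand.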
IPv6
On the page Hetzner_ipv6 I've explained how to make IPv6 work on a Hetzner server where the supplied IPv6 default gateway is outside the IPv6 subnet assigned.
When basic IPv6 connectivity works, I am ready to start adding IP aliases for jails as needed.
Allow ping from inside jails
I add the following to /etc/sysctl.conf so the jails are allowed to do icmp ping. This enables raw socket access, which can be a security issue if you have untrusted root users in your jails. Use with caution.
#allow ping in jails
security.jail.allow_raw_sockets=1
Tips & tricks
Get jail info out of top
To make top show a column with the jail ID of the jail each process is running in, I need to pass the -j flag to top. Since this is a multi-CPU server, I also like giving the -P flag to top, to get a separate line of CPU stats per core. Finally, I like -a to get the full command line of the running processes. I add the following to the .bashrc in my home directory on the jail host:
alias top="nice top -j -P -a"
...this way I don't have to remember passing -j -P -a to top every time. Also, I've been told to run top with nice to limit the CPU used by top itself. I took the advice, so the complete alias looks like the one above.
Staying up-to-date
I update my ezjail hosts and jails to track -STABLE regularly. This section describes the procedure I use.
Updating the jail host
First I update world and kernel of the jail host like I normally would. This is described earlier in this guide, see Ezjail_host#Building_world_and_kernel.
Updating ezjail's basejail
To update ezjail's basejail located in /usr/jails/basejail, I run the following commands:
- ezjail-admin update -i
- rm /usr/jails/newjail/etc/make.conf
Running mergemaster in the jails
Finally, to run mergemaster in all jails I use the following small script snippet. Run it in bash as root:
- for jail in `jls -n jid`; do JID=`echo $jail | cut -d "=" -f 2`; echo Processing jail id $JID; JPATH=`jls -j $JID -n path`; JPATH=`echo $JPATH | cut -d "=" -f 2`; mount_nullfs /usr/src $JPATH/usr/src; mount_nullfs /usr/obj $JPATH/usr/obj; jexec $JID mergemaster -FUi; umount $JPATH/usr/src; umount $JPATH/usr/obj; done
NOTE THAT THIS ONLY HANDLES RUNNING JAILS. YOU CANNOT RUN MERGEMASTER IN A STOPPED JAIL!
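For readability, the one-liner above can be spread over several lines. The sketch below is my own variant and not a drop-in copy: the function name update_jails is made up, it asks jls for bare values instead of cutting name=value pairs, and it skips a jail when a nullfs mount fails instead of carrying on. Like the original, it only sees running jails and assumes jail paths without spaces.

```sh
# Run mergemaster inside every running jail, with /usr/src and /usr/obj
# from the host mounted into the jail for the duration of the run.
update_jails() {
    for jid in $(jls jid); do
        jpath=$(jls -j "$jid" path)
        echo "Processing jail id $jid ($jpath)"
        # Skip this jail if the source tree cannot be mounted.
        mount_nullfs /usr/src "$jpath/usr/src" || continue
        if mount_nullfs /usr/obj "$jpath/usr/obj"; then
            jexec "$jid" mergemaster -FUi
            umount "$jpath/usr/obj"
        fi
        # Always unmount the source tree again, even if /usr/obj failed.
        umount "$jpath/usr/src"
    done
}
```

Define the function in a root shell and call `update_jails` after the basejail has been updated.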
To restart all jails I run:
- ezjail-admin restart