Revision as of 17:39, 21 October 2013

Background

This is my personal checklist for when I am setting up a new ezjail host. I like my jail hosts configured in a very specific way. There is a good chance that what is right for me is not right for you. As always, YMMV.

Also note that I talk a lot about the German hosting provider Hetzner. If you are using another provider or you are doing this at home, just ignore the Hetzner-specific stuff. Much of the content here can be used with little or no changes outside Hetzner.

Installation

OS install with mfsbsd

After receiving the server from Hetzner I boot it using the rescue system which puts me at an mfsbsd prompt via SSH. This is perfect for installing a zfs-only server.

Changes to zfsinstall

I edit the zfsinstall script /root/bin/zfsinstall and add "usr" to FS_LIST near the top of the script. I do this because I like to have /usr as a separate ZFS dataset.

Check disks

I create a small zpool using just 30 GB, enough to comfortably install the base OS and so on. The rest of the disk space will be used for GELI, which will have the other ZFS pool on top. This encrypted zpool will house the actual jails and data. This setup allows me to have all the important data encrypted, while still allowing the physical server to boot without the human intervention that full disk encryption would require.

Note that the disks in this server are not new; they have been used for around two years (18023 hours / 24 ≈ 751 days):

[root@rescue ~]# grep "ada[0-9]:" /var/run/dmesg.boot | grep "MB "
ada0: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C)
ada1: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C)
[root@rescue ~]# smartctl -a /dev/ada0 | grep Power_On_Hours
  9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       18023
[root@rescue ~]# smartctl -a /dev/ada1 | grep Power_On_Hours
  9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       18023
[root@rescue ~]#
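
The disk age follows from the SMART value by simple arithmetic. A minimal sketch, assuming the raw value is the last field of the Power_On_Hours line as in the output above (the variable names are made up for illustration):

```sh
#!/bin/sh
# Sample line copied from the smartctl output above.
line='  9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       18023'

# The raw value is the last whitespace-separated field.
hours=$(printf '%s\n' "$line" | awk '{print $NF}')

# Integer division, rounded to the nearest day.
days=$(( (hours + 12) / 24 ))

echo "$hours hours is about $days days"
```

On a live system the sample line would be replaced by the output of smartctl -a piped through grep Power_On_Hours.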

Destroy existing partitions

Any existing partitions need to be deleted first. This can be done with the destroygeom command as shown below:

[root@rescue ~]# destroygeom -d ada0 -d ada1
Destroying geom ada0:
    Deleting partition 3 ... done
Destroying geom ada1:
    Deleting partition 1 ... done
    Deleting partition 2 ... done
    Deleting partition 3 ... done

Install FreeBSD

Installing FreeBSD with mfsbsd is easy. I run the command below, adjusting it for the release I want to install:

[root@rescue ~]# zfsinstall -d ada0 -d ada1 -r mirror -z 30G -t /nfs/mfsbsd/9.1-release-amd64.tbz
Creating GUID partitions on ada0 ... done
Configuring ZFS bootcode on ada0 ... done
=>        34  3907029101  ada0  GPT  (1.8T)
          34        2014        - free -  (1M)
        2048         128     1  freebsd-boot  (64k)
        2176    62914560     2  freebsd-zfs  (30G)
    62916736  3844112399        - free -  (1.8T)

Creating GUID partitions on ada1 ... done
Configuring ZFS bootcode on ada1 ... done
=>        34  3907029101  ada1  GPT  (1.8T)
          34        2014        - free -  (1M)
        2048         128     1  freebsd-boot  (64k)
        2176    62914560     2  freebsd-zfs  (30G)
    62916736  3844112399        - free -  (1.8T)

Creating ZFS pool tank on ada0p2 ada1p2 ... done
Creating tank root partition: ... done
Creating tank partitions: var tmp usr ... done
Setting bootfs for tank to tank/root ... done
NAME            USED  AVAIL  REFER  MOUNTPOINT
tank            270K  29.3G    31K  none
tank/root       127K  29.3G    34K  /mnt
tank/root/tmp    31K  29.3G    31K  /mnt/tmp
tank/root/usr    31K  29.3G    31K  /mnt/usr
tank/root/var    31K  29.3G    31K  /mnt/var
Extracting FreeBSD distribution ... done
Writing /boot/loader.conf... done
Writing /etc/fstab...Writing /etc/rc.conf... done
Copying /boot/zfs/zpool.cache ... done

Installation complete.
The system will boot from ZFS with clean install on next reboot

You may type "chroot /mnt" and make any adjustments you need.
For example, change the root password or edit/create /etc/rc.conf for
for system services.

WARNING - Don't export ZFS pool "tank"!
[root@rescue ~]#
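
The partition numbers in the gpart output can be sanity-checked with a little arithmetic. This is only a sketch, assuming 512-byte sectors as reported by dmesg and a shell with 64-bit arithmetic; the variable names are made up. It confirms the size of the freebsd-zfs partition and that both it and the remaining free space start on a 4K boundary, which is useful to know since the GELI volumes created later use 4096-byte sectors:

```sh
#!/bin/sh
# Numbers copied from the gpart output above (512-byte sectors).
sector=512
zfs_start=2176        # first sector of the freebsd-zfs partition
zfs_sectors=62914560  # its length in sectors
free_start=62916736   # first free sector after it

# Partition size in GiB: sectors * bytes per sector / 2^30.
zfs_gib=$(( zfs_sectors * sector / 1073741824 ))

# A start offset is 4K-aligned when its byte offset is a multiple of 4096.
zfs_misalign=$(( zfs_start * sector % 4096 ))
free_misalign=$(( free_start * sector % 4096 ))

echo "freebsd-zfs size: ${zfs_gib} GiB"
echo "freebsd-zfs start offset mod 4096: $zfs_misalign"
echo "free space start offset mod 4096: $free_misalign"
```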

Post install configuration (before reboot)

Before rebooting into the installed FreeBSD I need to make certain I can reach the server through SSH after the reboot. This means:

Adding network settings to /etc/rc.conf
Adding sshd_enable="YES" to /etc/rc.conf
Changing PermitRootLogin to yes in /etc/ssh/sshd_config (note: this is now the default in the current zfsinstall image that Hetzner provides)
Adding nameservers to /etc/resolv.conf
Setting the root password

All of these steps are essential if I am going to have any chance of logging in after reboot. Most of these changes can be done from the mfsbsd shell but the password change requires chroot into the newly installed environment.
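
The file edits in the list above could be scripted roughly like this. It is a sketch under assumptions, not what the zfsinstall image does: the interface name em0, the 192.0.2.x addresses and the ROOT variable are placeholders, and ROOT defaults to a scratch directory so the snippet can be tried without touching a real system. The nameserver is the one I use later on this page.

```sh
#!/bin/sh
# Sketch only: ROOT defaults to a scratch directory so this can be tried safely.
# On the rescue system it would be ROOT=/mnt.
ROOT=${ROOT:-$(mktemp -d)}
mkdir -p "$ROOT/etc/ssh"
touch "$ROOT/etc/ssh/sshd_config"

# Network settings and sshd (em0 and the 192.0.2.x addresses are placeholders).
cat >> "$ROOT/etc/rc.conf" <<'EOF'
ifconfig_em0="inet 192.0.2.10 netmask 255.255.255.0"
defaultrouter="192.0.2.1"
sshd_enable="YES"
EOF

# Allow root logins over SSH until a regular user exists.
# Drop any existing PermitRootLogin line, then append the wanted one.
grep -v '^#*PermitRootLogin' "$ROOT/etc/ssh/sshd_config" > "$ROOT/etc/ssh/sshd_config.new" || true
echo 'PermitRootLogin yes' >> "$ROOT/etc/ssh/sshd_config.new"
mv "$ROOT/etc/ssh/sshd_config.new" "$ROOT/etc/ssh/sshd_config"

# Nameserver for the installed system.
echo 'nameserver 89.233.43.71' >> "$ROOT/etc/resolv.conf"
```

The root password still has to be set interactively with passwd inside the chroot.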

I use the chroot command but start another shell as bash is not installed in /mnt:

[root@rescue ~]# chroot /mnt/ csh
rescue# ee /etc/rc.conf
rescue# ee /etc/ssh/sshd_config
rescue# passwd
New Password:
Retype New Password:
rescue#

So, the network settings are sorted, root password is set, and root is permitted to ssh in. Time to reboot (this is the exciting part).

Remember to use shutdown -r now and not reboot when you reboot. shutdown -r now performs the proper shutdown process including rc.d scripts and disk buffer flushing. reboot is the "bigger hammer" to use when something is preventing shutdown -r now from working.

Basic config after first boot

If the server boots without any problems, I do some basic configuration before I continue with the disk partitioning.

/etc/resolv.conf

I need DNS to be able to do anything, so I add the following to /etc/resolv.conf:

nameserver 89.233.43.71
nameserver 89.104.194.142

Note: The latest version of the Hetzner FreeBSD image has some nameservers in /etc/resolv.conf, but I still replace them with mine listed above.

Timezone

I run the command tzsetup to set the proper timezone, and set the time using ntpdate if necessary.

Note: The current Hetzner FreeBSD image has the timezone set to CEST; I like my servers configured as UTC.

Basic ports

I also add some basic ports with pkg_add so I can get screen etc. up and running as soon as possible:

# pkg_add -r bash screen sudo portmaster portaudit

I then add the following to /usr/local/etc/portmaster.rc:

ALWAYS_SCRUB_DISTFILES=dopt
PM_DEL_BUILD_ONLY=pm_dbo
SAVE_SHARED=wopt
PM_LOG=/var/log/portmaster.log
PM_IGNORE_FAILED_BACKUP_PACKAGE=pm_ignore_failed_backup_package

An explanation of these options can be found on the Portmaster page.

After a rehash and adding my non-root user with adduser, I am ready to continue with the disk configuration. I also remember to disable root login in /etc/ssh/sshd_config.

Further disk configuration

After the reboot into the installed FreeBSD environment, I need to do some further disk configuration.

Create GELI partitions

First I create the partitions to hold the geli devices:

# gpart add -t freebsd-ufs ada0
ad4p3 added
# gpart add -t freebsd-ufs ada1
ad6p3 added
#

I add them as freebsd-ufs type partitions, as there is no dedicated freebsd-geli type.

Create GELI key

To create a GELI key I copy some data from /dev/random:

# dd if=/dev/random of=/root/geli.key bs=256k count=1
1+0 records in
1+0 records out
262144 bytes transferred in 0.003347 secs (78318372 bytes/sec)
#

Create GELI volumes

I create the GELI volumes with a 4k sector size and 256-bit AES encryption:

root@rescue:/root # geli init -s 4096 -K /root/geli.key -l 256 /dev/ada0p3
Enter new passphrase:
Reenter new passphrase:

Metadata backup can be found in /var/backups/ada0p3.eli and
can be restored with the following command:

        # geli restore /var/backups/ada0p3.eli /dev/ada0p3

root@rescue:/root # geli init -s 4096 -K /root/geli.key -l 256 /dev/ada1p3
Enter new passphrase:
Reenter new passphrase:

Metadata backup can be found in /var/backups/ada1p3.eli and
can be restored with the following command:

        # geli restore /var/backups/ada1p3.eli /dev/ada1p3

root@rescue:/root #

Attach GELI volumes

Now I just need to attach the GELI volumes before I am ready to create the second zpool:

# geli attach -k /root/geli.key /dev/ad4p3
Enter passphrase:
# geli attach -k /root/geli.key /dev/ad6p3
Enter passphrase:
#

Create second zpool

# zpool create gelipool mirror /dev/ad4p3.eli /dev/ad6p3.eli
# zpool status
  pool: gelipool
 state: ONLINE
  scan: none requested
config:

        NAME           STATE     READ WRITE CKSUM
        gelipool       ONLINE       0     0     0
          mirror-0     ONLINE       0     0     0
            ad4p3.eli  ONLINE       0     0     0
            ad6p3.eli  ONLINE       0     0     0

errors: No known data errors

  pool: tank
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ad4p2   ONLINE       0     0     0
            ad6p2   ONLINE       0     0     0

errors: No known data errors
#

Swap partition

Finally I create a ZVOL to use as swap space, and enable it with swapon:

# zfs create -V 5G gelipool/swap
# swapon /dev/zvol/gelipool/swap
# swapinfo
Device 1K-blocks Used Avail Capacity
/dev/zvol/gelipool/swap 5242880 0 5242880 0%
#

I also disable checksums on the zvol, and make sure the swap partition is used after next reboot:

#  zfs set checksum=off gelipool/swap
#  zfs set org.freebsd:swap=on gelipool/swap

I am aware that this swap partition will not be available until the encrypted disk is mounted. I do not expect this to be a problem as swap will not be needed unless the server is heavily loaded, and the server will not be doing any work if the encrypted disks are not mounted.

Create ZFS filesystems on the new zpool

The last remaining thing is to create a filesystem in the new zfs pool:

[tykling@haze ~]$ zfs list
NAME                  USED  AVAIL  REFER  MOUNTPOINT
gelipool             5.16G   652G   144K  /gelipool
gelipool/swap        5.16G   658G    72K  -
tank                  810M  28.5G    31K  none
tank/root             810M  28.5G   610M  /
tank/root/tmp          38K  28.5G    38K  /tmp
tank/root/usr         200M  28.5G   200M  /usr
tank/root/usr/home   40.5K  28.5G  40.5K  /usr/home
tank/root/usr/ports    31K  28.5G    31K  /usr/ports
tank/root/var         519K  28.5G   519K  /var
# sudo zfs set mountpoint=none gelipool
# sudo zfs set compression=on gelipool
# sudo zfs create -o mountpoint=/usr/jails gelipool/jails
# zfs list
NAME                  USED  AVAIL  REFER  MOUNTPOINT
gelipool             5.16G   652G   144K  none
gelipool/jails        144K   652G   144K  /usr/jails
gelipool/swap        5.16G   658G    72K  -
tank                  810M  28.5G    31K  none
tank/root             810M  28.5G   610M  /
tank/root/tmp          38K  28.5G    38K  /tmp
tank/root/usr         200M  28.5G   200M  /usr
tank/root/usr/home   40.5K  28.5G  40.5K  /usr/home
tank/root/usr/ports    31K  28.5G    31K  /usr/ports
tank/root/var         520K  28.5G   520K  /var
#

Disable atime

One last thing I like to do is to disable atime, or access time, on the filesystem. Access times are recorded every time a file is read, and while this can have its use cases, I never use it. Disabling it means a lot fewer write operations, as a read operation no longer automatically triggers a write operation. Disabling it is easy:

# zfs set atime=off tank
# zfs set atime=off gelipool
#

The next things are post-install configuration stuff like firewall and so on. The basic install is finished \o/

Upgrade OS (buildworld)

I usually run -STABLE on my hosts, which means I need to build and install a new world and kernel. I also like having rctl available on my jail hosts, so I can limit jail resources in all kinds of neat ways. Additionally, I need the built world to populate ezjail's basejail.

Note: I will need to update the host and the jails many times during the lifespan of this server, which is likely > 2-3 years. As new security problems are found or features are added that I want, I will update host and jails. There is a section about staying up to date later in this page. This section (the one you are reading now) only covers the OS update I run right after installing the server.

Fetching sources

I make a copy of the example config file, and set the servername to cvsup.de.FreeBSD.org using sed:

# cp /usr/share/examples/cvsup/stable-supfile /etc/
# sed -i "" "s/=CHANGE_THIS/=cvsup.de/g" /etc/stable-supfile
# csup -L 2 /etc/stable-supfile
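
To see why replacing =CHANGE_THIS with =cvsup.de yields the full server name, the substitution can be tried on a single line first. This sketch assumes the example supfile contains the usual default host line shown in the variable below:

```sh
#!/bin/sh
# The host line as it is assumed to appear in the example supfile.
before='*default host=CHANGE_THIS.FreeBSD.org'

# Same substitution as in the sed command above, applied to that one line.
after=$(printf '%s\n' "$before" | sed 's/=CHANGE_THIS/=cvsup.de/g')

echo "$after"
```

The .FreeBSD.org suffix is already in the file, so only the placeholder needs replacing.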

Create kernel config

After the sources finish downloading, I create a new kernel config file /etc/TYKJAIL with the following content:

include GENERIC
ident TYKJAIL

#rctl
options RACCT
options RCTL

#dtrace
options KDTRACE_HOOKS        # all architectures - enable general DTrace hooks
options DDB_CTF              # all architectures - kernel ELF linker loads CTF data
options KDTRACE_FRAME        # amd64 - ensure frames are compiled in
makeoptions DEBUG="-g"       # amd64? - build kernel with gdb(1) debug symbols
makeoptions WITH_CTF=1

I then symlink the kernel config file in /etc/ into the kernel config directory:

# ln -s /etc/TYKJAIL /usr/src/sys/amd64/conf/
# ls -l /usr/src/sys/amd64/conf/TYKJAIL 
lrwxr-xr-x  1 root  wheel  9 Jul 22 16:14 /usr/src/sys/amd64/conf/TYKJAIL -> /etc/TYKJAIL

Building world and kernel

Finally I start the build. I use -j12 because I have 12 cores in this system. Check the number of cores in your system with sysctl:

# sysctl hw.ncpu
hw.ncpu: 12

To build the new system:

# cd /usr/src/
# sudo make -j12 buildworld && sudo make -j12 buildkernel KERNCONF=TYKJAIL && sudo make installkernel KERNCONF=TYKJAIL && date
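
Instead of hardcoding -j12, the job count can be derived from the number of cores. A small sketch: the getconf fallback is only there for trying it on non-FreeBSD systems, and the command is printed rather than run so it can be reviewed first.

```sh
#!/bin/sh
# Number of CPUs: sysctl hw.ncpu on FreeBSD, with a getconf fallback elsewhere.
ncpu=$(sysctl -n hw.ncpu 2>/dev/null) || ncpu=""
[ -n "$ncpu" ] || ncpu=$(getconf _NPROCESSORS_ONLN)

# Print the build command instead of running it.
echo "make -j${ncpu} buildworld && make -j${ncpu} buildkernel KERNCONF=TYKJAIL"
```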

After the build finishes, reboot and run mergemaster, installworld, and mergemaster again:

# cd /usr/src/
# sudo mergemaster -pFUi && sudo make installworld && sudo mergemaster -FUi

DO NOT OVERWRITE /etc/group AND /etc/master.passwd AND OTHER CRITICAL FILES!

Reboot after the final mergemaster completes, and boot into the newly built world.

Ports

Installing the ports tree

I need to bootstrap the ports system. I use portsnap as it is way faster than using c(v)sup. Initially I run portsnap fetch extract, and when I need to update the tree later I use portsnap fetch update.

smartd

I install smartd to monitor the disks for problems:

$ sudo portmaster /usr/ports/sysutils/smartmontools/

I add the following line to /usr/local/etc/smartd.conf:

DEVICESCAN -a -m thomas@gibfest.dk

This makes smartd monitor all disks and send me an email if it finds an error.

Remember to enable smartd in /etc/rc.conf:

smartd_enable="YES"

openntpd

I install net/openntpd to keep the clock in sync. I find this a lot easier to configure than the base ntpd.

$ sudo portmaster /usr/ports/net/openntpd/

I enable openntpd in /etc/rc.conf and add a suitable server to /usr/local/etc/ntpd.conf:

$ grep -v "^#" /usr/local/etc/ntpd.conf | grep -v "^$"
servers de.pool.ntp.org
$

ntpdate

I also enable ntpdate to help set the clock after a reboot. I add the following two lines to /etc/rc.conf:

ntpdate_enable="YES"
ntpdate_hosts="de.pool.ntp.org"

Preparing ezjail

ezjail needs to be installed and a bit of configuration is also needed, in addition to bootstrapping /usr/jails/basejail and /usr/jails/newjail.

Installing ezjail

First I install /usr/ports/sysutils/ezjail using Portmaster:

$ sudo portmaster /usr/ports/sysutils/ezjail

At the time of writing, a couple of bugs have been fixed in the CVS version of ezjail that are not yet released or available in FreeBSD ports. The author of ezjail takes his sweet time between releases, so it is often a good idea to get the latest CVS version from the ezjail website after installing the ports version:

$ cvs -d :pserver:anoncvs@cvs.erdgeist.org:/home/cvsroot co ezjail
cvs checkout: warning: failed to open /home/tykling/.cvspass for reading: No such file or directory
cvs checkout: Updating ezjail
U ezjail/Makefile
U ezjail/ezjail-admin
U ezjail/ezjail-clone.sh
U ezjail/ezjail.conf.sample
U ezjail/ezjail.sh
cvs checkout: Updating ezjail/examples
cvs checkout: Updating ezjail/examples/example
cvs checkout: Updating ezjail/examples/example/etc
U ezjail/examples/example/etc/make.conf
U ezjail/examples/example/etc/periodic.conf
U ezjail/examples/example/etc/rc.conf
cvs checkout: Updating ezjail/examples/example/etc/rc.d
U ezjail/examples/example/etc/rc.d/ezjail.flavour.example
cvs checkout: Updating ezjail/examples/example/pkg
cvs checkout: Updating ezjail/examples/example/usr
cvs checkout: Updating ezjail/examples/example/usr/home
cvs checkout: Updating ezjail/examples/example/usr/home/admin
cvs checkout: Updating ezjail/examples/example/usr/local
cvs checkout: Updating ezjail/examples/example/usr/local/etc
U ezjail/examples/example/usr/local/etc/sudoers
cvs checkout: Updating ezjail/examples/nullmailer-example
U ezjail/examples/nullmailer-example/ezjail.flavour
cvs checkout: Updating ezjail/examples/nullmailer-example/etc
U ezjail/examples/nullmailer-example/etc/rc.conf
cvs checkout: Updating ezjail/examples/nullmailer-example/etc/mail
U ezjail/examples/nullmailer-example/etc/mail/mailer.conf
cvs checkout: Updating ezjail/examples/nullmailer-example/usr
cvs checkout: Updating ezjail/examples/nullmailer-example/usr/local
cvs checkout: Updating ezjail/examples/nullmailer-example/usr/local/etc
cvs checkout: Updating ezjail/examples/nullmailer-example/usr/local/etc/nullmailer
U ezjail/examples/nullmailer-example/usr/local/etc/nullmailer/remotes
cvs checkout: Updating ezjail/man1
cvs checkout: Updating ezjail/man5
U ezjail/man5/ezjail.conf.5
cvs checkout: Updating ezjail/man7
U ezjail/man7/ezjail.7
cvs checkout: Updating ezjail/man8
U ezjail/man8/ezjail-admin.8
cvs checkout: Updating ezjail/share
cvs checkout: Updating ezjail/share/zsh
cvs checkout: Updating ezjail/share/zsh/site-functions
U ezjail/share/zsh/site-functions/ezjail-admin

Now just go into the ezjail folder and install:

$ cd ezjail/
$ sudo make install
mkdir -p /usr/local/etc/ezjail/ /usr/local/man/man1/ /usr/local/man/man5/ /usr/local/man/man7 /usr/local/man/man8 /usr/local/etc/rc.d/ /usr/local/bin/ /usr/local/share/examples/ezjail /usr/local/share/zsh/site-functions
cp -p ezjail.conf.sample /usr/local/etc/
cp -R -p examples/example /usr/local/share/examples/ezjail/
cp -R -p examples/nullmailer-example /usr/local/share/examples/ezjail/
cp -R -p share/zsh/site-functions/ /usr/local/share/zsh/site-functions/
sed s:EZJAIL_PREFIX:/usr/local: ezjail.sh > /usr/local/etc/rc.d/ezjail
sed s:EZJAIL_PREFIX:/usr/local: ezjail-admin > /usr/local/bin/ezjail-admin
sed s:EZJAIL_PREFIX:/usr/local: man8/ezjail-admin.8 > /usr/local/man/man8/ezjail-admin.8
sed s:EZJAIL_PREFIX:/usr/local: man5/ezjail.conf.5 > /usr/local/man/man5/ezjail.conf.5
sed s:EZJAIL_PREFIX:/usr/local: man7/ezjail.7 > /usr/local/man/man7/ezjail.7
chmod 755 /usr/local/etc/rc.d/ezjail /usr/local/bin/ezjail-admin
chown -R root:wheel /usr/local/man/man8/ezjail-admin.8 /usr/local/man/man5/ezjail.conf.5 /usr/local/man/man7/ezjail.7 /usr/local/share/examples/ezjail/
chmod 0440 /usr/local/share/examples/ezjail/example/usr/local/etc/sudoers
$

This was written in September 2012. Things might have changed, so make sure you check the changelog section of the ezjail website before using the CVS version.

Configuring ezjail

Then I go edit the ezjail config file /usr/local/etc/ezjail.conf and add/change these three lines near the bottom:

ezjail_use_zfs="YES"
ezjail_jailzfs="gelipool/jails"
ezjail_use_zfs_for_jails="YES"

This makes ezjail use separate ZFS datasets under gelipool/jails for the basejail and newjail, as well as for each jail created. ezjail_use_zfs_for_jails is supported since ezjail 3.2.2.

Bootstrapping ezjail

Finally I populate basejail and newjail from the world I built earlier:

$ sudo ezjail-admin update -i

The last line of the output is a message saying:

Note: a non-standard /etc/make.conf was copied to the template jail in order to get the ports collection running inside jails.

This is because ezjail defaults to symlinking the ports collection in the same way it symlinks the basejail. I prefer having separate ports collections in each of my jails though, so I remove the symlink and make.conf from newjail:

$ sudo rm /usr/jails/newjail/etc/make.conf /usr/jails/newjail/usr/ports /usr/jails/newjail/usr/src
$ sudo mkdir /usr/jails/newjail/usr/src

ZFS goodness

Note that ezjail has created two new ZFS datasets to hold basejail and newjail:

$ zfs list -r cryptopool/jails
NAME                        USED  AVAIL  REFER  MOUNTPOINT
cryptopool/jails            129M  2.52T    31K  /usr/jails
cryptopool/jails/basejail   128M  2.52T   128M  /usr/jails/basejail
cryptopool/jails/newjail    816K  2.52T   816K  /usr/jails/newjail

ezjail flavours

ezjail has a pretty awesome feature that makes it possible to create templates or flavours which apply common settings when creating a new jail. I always have a basic flavour which adds a user for me, installs an SSH key, adds a few packages like bash, screen, sudo and portmaster - and configures those packages. Basically, everything I find myself doing over and over again every time I create a new jail.

It is also possible, of course, to create more advanced flavours. I've had one that installs a complete nginx+php-fpm server with all the necessary packages and configs.

ezjail flavours are technically pretty simple. By default, they are located in the same place as basejail and newjail, and ezjail comes with an example flavour to get you started. Basically a flavour is a file/directory hierarchy which is copied to the jail, and a shell script called ezjail.flavour which is run once, the first time the jail is started, and then deleted.

For reference, I've included my basic flavour here. First is a listing of the files included in the flavour, and then the ezjail.flavour script which performs tasks beyond copying config files.

$ find /usr/jails/flavours/tykbasic
/usr/jails/flavours/tykbasic
/usr/jails/flavours/tykbasic/ezjail.flavour
/usr/jails/flavours/tykbasic/usr
/usr/jails/flavours/tykbasic/usr/local
/usr/jails/flavours/tykbasic/usr/local/etc
/usr/jails/flavours/tykbasic/usr/local/etc/portmaster.rc
/usr/jails/flavours/tykbasic/usr/local/etc/sudoers
/usr/jails/flavours/tykbasic/usr/home
/usr/jails/flavours/tykbasic/usr/home/tykling
/usr/jails/flavours/tykbasic/usr/home/tykling/.ssh
/usr/jails/flavours/tykbasic/usr/home/tykling/.ssh/authorized_keys
/usr/jails/flavours/tykbasic/usr/home/tykling/.screenrc
/usr/jails/flavours/tykbasic/etc
/usr/jails/flavours/tykbasic/etc/fstab
/usr/jails/flavours/tykbasic/etc/rc.conf
/usr/jails/flavours/tykbasic/etc/periodic.conf
/usr/jails/flavours/tykbasic/etc/resolv.conf

As you can see, the flavour contains files like /etc/resolv.conf and other stuff to make the jail work. The name of the flavour here is tykbasic which means that if I want a file to end up in /usr/home/tykling after the flavour has been applied, I need to put that file in the folder /usr/jails/flavours/tykbasic/usr/home/tykling/ - remember to also chown the files in the flavour appropriately.

Finally, my ezjail.flavour script looks like so:

#!/bin/sh
#
# BEFORE: DAEMON
#
# ezjail flavour example

# Timezone
###########
#
ln -s /usr/share/zoneinfo/Europe/Copenhagen /etc/localtime

# Groups
#########
#
pw groupadd -q -n tykling

# Users
########
#
# To generate a password hash for use here, do:
# openssl passwd -1 "the password"
echo -n '$1$L/fC0UrO$bi65/BOIAtMkvluDEDCy31' | pw useradd -n tykling -u 1001 -s /bin/sh -m -d /usr/home/tykling -g tykling -c 'tykling' -H 0

# Packages
###########
#
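# pkg_add -r only sees the mirror if the variable is exported; PACKAGEROOT takes a URL
export PACKAGEROOT=ftp://ftp.de.freebsd.org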
pkg_add -r bash
pkg_add -r sudo
pkg_add -r screen
pkg_add -r portmaster
pkg_add -r portaudit

#change shell to bash
chsh -s bash tykling

#update /etc/aliases
echo "root:   thomas@gibfest.dk" >> /etc/aliases
newaliases

#remove adjkerntz from crontab
cat /etc/crontab | grep -E -v "(Adjust the time|adjkerntz)" > /etc/crontab.new
mv /etc/crontab.new /etc/crontab

#remove ports symlink
rm /usr/ports

# create symlink to /usr/home in / (adduser defaults to /home/username as homedir)
ln -s /usr/home /home

Creating a flavour is easy: just create a folder under /usr/jails/flavours/ that has the name of the flavour, and start adding files and folders there. The ezjail.flavour script should be placed in the root (see the example further up the page).
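
As a rough sketch of those steps: the flavour name myflavour and the FLAVOURS variable are made up, and FLAVOURS defaults to a scratch directory so nothing under /usr/jails is touched while experimenting.

```sh
#!/bin/sh
# Sketch only: FLAVOURS defaults to a scratch directory; on the jail host it
# would be /usr/jails/flavours. "myflavour" is a made-up flavour name.
FLAVOURS=${FLAVOURS:-$(mktemp -d)}
name=myflavour
dir="$FLAVOURS/$name"

# Directory hierarchy that is copied into the new jail.
mkdir -p "$dir/etc" "$dir/usr/local/etc"

# Example config file that will end up as /etc/resolv.conf in the jail.
echo 'nameserver 89.233.43.71' > "$dir/etc/resolv.conf"

# The script that runs once on first jail start.
cat > "$dir/ezjail.flavour" <<'EOF'
#!/bin/sh
#
# BEFORE: DAEMON
#
# Set the timezone inside the jail.
ln -s /usr/share/zoneinfo/Europe/Copenhagen /etc/localtime
EOF
chmod 755 "$dir/ezjail.flavour"

find "$dir" | sort
```

The flavour is then selected with the -f flag when creating a jail, for example ezjail-admin create -f myflavour followed by the jail name and IP address.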

Configuration

This section outlines what I do to further prepare the machine to be a nice ezjail host.

Firewall

One of the first things I fix is to enable the pf firewall from OpenBSD. I add the following to /etc/rc.conf to enable pf at boot time:

[root@ ~]# grep pf /etc/rc.conf 
pf_enable="YES"
pflog_enable="YES"
[root@ ~]#

I also create a very basic /etc/pf.conf:

[root@ ~]# cat /etc/pf.conf 
### macros
if="em0"
table <portknock> persist

#external addresses
tykv4="a.b.c.d"
tykv6="2002:ab:cd::/48"
table <allowssh> { $tykv4,$tykv6 }

#local addresses
glasv4="w.x.y.z"

### scrub
scrub in on $if all fragment reassemble


################
### filtering
### block everything
block log all

################
### skip loopback interface(s)
set skip on lo0


################
### icmp6
pass in quick on $if inet6 proto icmp6 all icmp6-type {echoreq,echorep,neighbradv,neighbrsol,routeradv,routersol}


################
### pass outgoing
pass out quick on $if all


################
### portknock rule (more than 5 connections in 10 seconds to the port specified will add the "offending" IP to the <portknock> table)
pass in quick on $if inet proto tcp from any to $glasv4 port 32323 synproxy state (max-src-conn-rate 5/10, overload <portknock>)

### pass incoming ssh and icmp
pass in quick on $if proto tcp from { <allowssh>, <portknock> } to ($if) port 22
pass in quick on $if inet proto icmp all icmp-type { 8, 11 }

################
### pass ipv6 fragments (hack to workaround pf not handling ipv6 fragments)
pass in on $if inet6
block in log on $if inet6 proto udp
block in log on $if inet6 proto tcp
block in log on $if inet6 proto icmp6
block in log on $if inet6 proto esp
block in log on $if inet6 proto ipv6

To load pf without rebooting I run the following:

[root@ ~]# kldload pf
[root@ ~]# kldload pflog
[root@ ~]# pfctl -f /etc/pf.conf && sleep 60 && pfctl -d
No ALTQ support in kernel
ALTQ related functions disabled

I get no prompt after this because pf has cut my SSH connection. But I can SSH back in if I did everything right, and if not, I can just wait 60 seconds after which pf will be disabled again. I SSH in and reattach to the screen I am running this in, and press control-c, so the "sleep 60" is interrupted and pf is not disabled. Neat little trick for when you want to avoid locking yourself out :)
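
The same apply, wait, roll back pattern can be wrapped in a small shell function. This is a sketch and not part of pf; the function name apply_with_rollback is made up, and the demonstration uses echo in place of pfctl so it can be run anywhere.

```sh
#!/bin/sh
# apply_with_rollback APPLY ROLLBACK SECONDS
# Runs APPLY, waits SECONDS, then runs ROLLBACK unless interrupted with ctrl-c.
apply_with_rollback() {
    sh -c "$1" || return 1      # stop early if the change itself fails
    sleep "$3" && sh -c "$2"    # interrupt the sleep to keep the change
}

# Harmless demonstration with echo standing in for pfctl:
result=$(apply_with_rollback 'echo rules loaded' 'echo pf disabled' 1)
echo "$result"

# On the jail host, inside screen, the equivalent call would be:
#   apply_with_rollback 'pfctl -f /etc/pf.conf' 'pfctl -d' 60
```

Running pfctl -nf /etc/pf.conf beforehand parses the ruleset without loading it, which catches syntax errors before any connection is at risk.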

Replacing sendmail with Postfix

I always replace Sendmail with Postfix on every server I manage. See Replacing_Sendmail_With_Postfix for more info.

Listening daemons

When you add an IP alias for a jail, any daemons listening on * will also listen on the jail's IP, which is not what I want. For example, I want the jail's sshd to be able to listen on the jail's IP on port 22, instead of the host's sshd. Check for listening daemons like so:

$ sockstat -l46
USER     COMMAND    PID   FD PROTO  LOCAL ADDRESS         FOREIGN ADDRESS      
root     master     1554  12 tcp4   *:25                  *:*
root     master     1554  13 tcp6   *:25                  *:*
root     sshd       948   3  tcp6   *:22                  *:*
root     sshd       948   4  tcp4   *:22                  *:*
root     syslogd    789   6  udp6   *:514                 *:*
root     syslogd    789   7  udp4   *:514                 *:*
[tykling@glas ~]$

This tells me that I need to change Postfix, sshd and syslogd to stop listening on all IP addresses.
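
On a host with many sockets, the wildcard listeners can be picked out with awk rather than by eye. A minimal sketch, using the sockstat output above as sample input and assuming the local address is always the sixth column:

```sh
#!/bin/sh
# Sample taken from the sockstat output above.
sample='USER     COMMAND    PID   FD PROTO  LOCAL ADDRESS         FOREIGN ADDRESS
root     master     1554  12 tcp4   *:25                  *:*
root     master     1554  13 tcp6   *:25                  *:*
root     sshd       948   3  tcp6   *:22                  *:*
root     sshd       948   4  tcp4   *:22                  *:*
root     syslogd    789   6  udp6   *:514                 *:*
root     syslogd    789   7  udp4   *:514                 *:*'

# Field 6 is the local address; keep commands bound to the wildcard address.
wildcard=$(printf '%s\n' "$sample" | awk '$6 ~ /^\*:/ {print $2}' | sort -u)

echo "$wildcard"
# On a live host: sockstat -l46 | awk '$6 ~ /^\*:/ {print $2}' | sort -u
```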

Postfix

The defaults in Postfix are really nice on FreeBSD, and most of the time a completely empty config file is fine for a system mailer (sendmail replacement). However, to make Postfix stop listening on port 25 on all IP addresses, I do need one line in /usr/local/etc/postfix/main.cf:

$ cat /usr/local/etc/postfix/main.cf
inet_interfaces=localhost
$

sshd

To make sshd stop listening on all IP addresses I uncomment and edit the ListenAddress line in /etc/ssh/sshd_config:

$ grep ListenAddress /etc/ssh/sshd_config 
ListenAddress x.y.z.226
#ListenAddress ::

(IP address obfuscated..)

syslogd

I don't need my syslogd to listen on the network at all, so I add the following line to /etc/rc.conf:

$ grep syslog /etc/rc.conf 
syslogd_flags="-ss"

Restarting services

Finally I restart Postfix, sshd and syslogd:

$ sudo /etc/rc.d/syslogd restart
Stopping syslogd.
Waiting for PIDS: 789.
Starting syslogd.
$ sudo /etc/rc.d/sshd restart
Stopping sshd.
Waiting for PIDS: 948.
Starting sshd.
$ sudo /usr/local/etc/rc.d/postfix restart
postfix/postfix-script: stopping the Postfix mail system
postfix/postfix-script: starting the Postfix mail system

A check with sockstat reveals that no more services are listening on all IP addresses:

$ sockstat -l46
USER     COMMAND    PID   FD PROTO  LOCAL ADDRESS         FOREIGN ADDRESS      
root     master     1823  12 tcp4   127.0.0.1:25          *:*
root     master     1823  13 tcp6   ::1:25                *:*
root     sshd       1617  3  tcp4   x.y.z.226:22         *:*
$

Network configuration

Network configuration is a big part of any jail setup. If I have enough IP addresses (IPv4 and IPv6) I can just add IP aliases as needed. If I only have one or a few IPv4 addresses I will need to use RFC 1918 addresses for the jails. In that case, I create a new loopback interface, lo1, and add the IP aliases there. I then use the pf firewall to redirect incoming traffic to the right jail, depending on the port in use.

IPv4

If RFC 1918 jails are needed, I add the following to /etc/rc.conf to create the lo1 interface on boot:

### lo1 interface for ipv4 rfc1918 jails
cloned_interfaces="lo1"

When the lo1 interface is created, or if it isn't needed, I am ready to start adding IP aliases for jails as needed.
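
For illustration, persistent aliases on lo1 could look like the fragment below. The 10.0.0.x addresses are placeholders, and this is just one way to do it using the standard rc.conf alias syntax; the aliases can equally be added by hand with ifconfig.

```sh
# /etc/rc.conf fragment; the 10.0.0.x addresses are placeholders.
cloned_interfaces="lo1"
ifconfig_lo1_alias0="inet 10.0.0.1 netmask 255.255.255.255"
ifconfig_lo1_alias1="inet 10.0.0.2 netmask 255.255.255.255"
```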

IPv6

On the page Hetzner_ipv6 I've explained how to make IPv6 work on a Hetzner server where the supplied IPv6 default gateway is outside the IPv6 subnet assigned.

When basic IPv6 connectivity works, I am ready to start adding IP aliases for jails as needed.

Allow ping from inside jails

I add the following to /etc/sysctl.conf so the jails are allowed to do icmp ping. This enables raw socket access, which can be a security issue if you have untrusted root users in your jails. Use with caution.

#allow ping in jails
security.jail.allow_raw_sockets=1

Tips & tricks

Get jail info out of top

To make top show the jail id of the jail in which the process is running in a column, I need to specify the -j flag to top. Since this is a multi-cpu server I am working on, I also like giving the -P flag to top, to get a separate line of cpu stats per core. Finally, I like -a to get the full command line/info of the running processes. I add the following to my .bashrc in my homedir on the jail host:

alias top="nice top -j -P -a"

...this way I don't have to remember passing -j -P -a to top every time. Also, I've been told to run top with nice to limit the cpu used by top itself. I took the advice so the complete alias looks like above.

ZFS snapshots and backup

Since all this is ZFS based, there are a few tricks I use to make it easier to restore data in case of accidental file deletion or other data loss.

Periodic snapshots using sysutils/zfs-periodic

sysutils/zfs-periodic is a little script that uses the FreeBSD periodic(8) system to make snapshots of filesystems at regular intervals. It supports making hourly snapshots with a small change to periodic(8), but I've settled for daily, weekly and monthly snapshots on my servers.

After installing sysutils/zfs-periodic I add the following to /etc/periodic.conf:

#daily zfs snapshots
daily_zfs_snapshot_enable="YES"
daily_zfs_snapshot_pools="tank cryptopool"
daily_zfs_snapshot_keep=7

#weekly zfs snapshots
weekly_zfs_snapshot_enable="YES"
weekly_zfs_snapshot_pools="tank cryptopool"
weekly_zfs_snapshot_keep=5

#monthly zfs snapshots
monthly_zfs_snapshot_enable="YES"
monthly_zfs_snapshot_pools="tank cryptopool"
monthly_zfs_snapshot_keep=6

#monthly zfs scrub
monthly_zfs_scrub_pools="tank cryptopool"
monthly_zfs_scrub_enable="YES"

Note that the last bit also enables a monthly scrub of the pools. Remember to change the pool names and to set the number of snapshots to retain to something appropriate. These things are always a tradeoff between disk space and safety. Think it over and find some values that make you sleep well at night :)

After this has been running for a few days, you should have a bunch of daily snapshots:

$ zfs list -t snapshot | grep cryptopool@ 
cryptopool@daily-2012-09-02                                 0      -    31K  -
cryptopool@daily-2012-09-03                                 0      -    31K  -
cryptopool@daily-2012-09-04                                 0      -    31K  -
cryptopool@daily-2012-09-05                                 0      -    31K  -
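
A quick sanity check on the snapshot job could look like the sketch below. It reads snapshot names on stdin and prints the date of the newest daily snapshot of a pool. The function name is made up, and it assumes the pool@daily-YYYY-MM-DD naming seen in the listing above.

```shell
#!/bin/sh
# Print the date of the newest daily snapshot of pool $1.
# Reads snapshot names, one per line, on stdin.
latest_daily() {
    pool=$1
    # Keep only "<pool>@daily-<date>" lines, strip the prefix, sort by date.
    grep "^${pool}@daily-" | sed "s/^${pool}@daily-//" | sort | tail -n 1
}

# Example with names from the listing above. On a real host, pipe in
# the output of: zfs list -H -t snapshot -o name
printf '%s\n' cryptopool@daily-2012-09-02 cryptopool@daily-2012-09-05 \
    cryptopool@daily-2012-09-03 | latest_daily cryptopool
```

Because the dates are written year first, a plain text sort puts them in chronological order.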

Back-to-back ZFS mirroring

I am lucky enough to have more than one of these jail hosts, which is the whole reason I started writing down how I configure them. One of the advantages of having more than one is that I can configure zfs send/receive jobs and make server A send its data to server B, and vice versa.

Introduction

The concept is pretty basic, but as it often happens, security considerations turn what was a simple and elegant idea into something... else. To make the back-to-back backup scheme work without sacrificing too much security, I first make a jail on each jailhost called backup.jailhostname. This jail will have control over a designated zfs dataset which will house the backups sent from the other server.

Create ZFS dataset

First I create the zfs dataset:

$ sudo zfs create cryptopool/backups
$ sudo zfs set jailed=on cryptopool/backups

'jail' the new dataset

I create the jail like I normally do, but after creating it, I edit the ezjail config file and tell it which extra zfs dataset to use:

$ grep dataset /usr/local/etc/ezjail/backup_glas_tyknet_dk 
export jail_backup_glas_tyknet_dk_zfs_datasets="cryptopool/backups"

This makes ezjail run the zfs jail command with the proper jail id when the jail is started.

sysctl settings

I also add the following to /etc/sysctl.conf (and set them using sysctl):

### allow zfs in jails
security.jail.mount_allowed=1
security.jail.enforce_statfs=1

security.jail.mount_allowed is self-explanatory. security.jail.enforce_statfs=2 (which is the default) means that a jail cannot see any mountpoints other than its own chroot. security.jail.enforce_statfs=1 means that the jail can see its own chroot and the mountpoints below it, which is what I want. security.jail.enforce_statfs=0 means that all mountpoints are visible to jails, which I can't recommend for numerous reasons.
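
To confirm that the running values match what is in /etc/sysctl.conf, a check along these lines can be used. It is only a sketch: the function name is made up, and it reads "name: value" lines, as printed by sysctl, on stdin.

```shell
#!/bin/sh
# Read "name: value" lines (sysctl output) on stdin and print every
# setting whose value differs from the expected value in $1.
check_value() {
    expected=$1
    while IFS=': ' read -r name value; do
        [ "$value" = "$expected" ] || echo "$name is $value, expected $expected"
    done
}

# On the jail host:
#   sysctl security.jail.mount_allowed security.jail.enforce_statfs | check_value 1
# Example input where enforce_statfs is still at its default of 2:
printf '%s\n' 'security.jail.mount_allowed: 1' \
    'security.jail.enforce_statfs: 2' | check_value 1
```

No output means both values are already set as intended.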

Configuring the backup jail

The jail is ready to run now, and inside the jail a zfs list looks like this:

$ zfs list
NAME                                USED  AVAIL  REFER  MOUNTPOINT
cryptopool                         3.98G  2.52T    31K  none
cryptopool/backups                   62K  2.52T    31K  none
$

I don't want to open up root ssh access to this jail, but the remote servers need to call zfs receive which requires root permissions. zfs allow to the rescue! zfs allow makes it possible to say "user X is permitted to do action Y on dataset Z" which is what I need here. In the backup jail I add a user called tykbackup which will be used as the user receiving the zfs snapshots from the remote servers.

I then run the following commands to allow the user to work with the dataset:

$ sudo zfs allow tykbackup atime,compression,create,mount,mountpoint,readonly,receive cryptopool/backups
$ sudo zfs allow cryptopool/backups
---- Permissions on cryptopool/backups -------------------------------
Local+Descendent permissions:
        user tykbackup atime,compression,create,mount,mountpoint,readonly,receive
$

Testing if it worked:

$ sudo su tykbackup
$ zfs create cryptopool/backups/test
$ zfs list cryptopool/backups/test
NAME                      USED  AVAIL  REFER  MOUNTPOINT
cryptopool/backups/test    31K  2.52T    31K  none
$ zfs destroy cryptopool/backups/test
cannot destroy 'cryptopool/backups/test': permission denied
$

Since the user tykbackup has not been given the destroy permission on the cryptopool/backups dataset (only atime,compression,create,mount,mountpoint,readonly,receive), I get permission denied, as expected, when trying to destroy cryptopool/backups/test. Works like a charm.

To allow automatic SSH operations I add the public ssh key for the root user of the server being backed up to /usr/home/tykbackup/.ssh/authorized_keys:

$ cat /usr/home/tykbackup/.ssh/authorized_keys
from="ryst.tyknet.dk",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty,command="/usr/home/tykbackup/zfscmd.sh $SSH_ORIGINAL_COMMAND" ssh-rsa AAAAB3......KR2Z root@ryst.tyknet.dk

The script called zfscmd.sh is placed on the backup server to allow the ssh client to issue different command line arguments depending on what needs to be done. The script is very simple:

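#!/bin/sh
# drop the first word of the command sent by the ssh client
shift
# pass the remaining words to zfs, keeping each argument intact
/sbin/zfs "$@"
exit $?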

A few notes: Aside from restricting the command this SSH key can run, I've restricted the key (with the from= option) so that it can only log in from the server being backed up. These are very basic restrictions that should always be in place no matter what kind of backup you are using.
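
A stricter variant of the wrapper is possible. The sketch below is not the script shown above; it is an assumption about how zfscmd.sh could refuse everything except a short list of zfs subcommands. The function name and the choice of allowed subcommands are hypothetical.

```shell
#!/bin/sh
# Return 0 if the zfs subcommand in $1 is on the allow list, 1 otherwise.
# The list (receive, recv, list) is an assumption; adjust it to what the
# sending server actually needs.
zfs_subcommand_allowed() {
    case "$1" in
        receive|recv|list) return 0 ;;
        *) return 1 ;;
    esac
}

# Inside the wrapper, after the "shift", this could be used as:
#   zfs_subcommand_allowed "$1" || { echo "refused: $1" >&2; exit 1; }
#   exec /sbin/zfs "$@"
zfs_subcommand_allowed receive && echo "receive is allowed"
zfs_subcommand_allowed destroy || echo "destroy is refused"
```

This adds a second layer on top of zfs allow: even if the permissions on the dataset are widened by mistake, the SSH key still cannot run other subcommands.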

The last part of these instructions is intentionally missing and will be written as soon as possible.

Staying up-to-date

I update my ezjail hosts and jails to track -STABLE regularly. This section describes the procedure I use. It is essential that the jail host and the jails use the same world and kernel version, or bad stuff will happen.

Updating the jail host

First I update world and kernel of the jail host like I normally would. This is described earlier in this guide, see Ezjail_host#Building_world_and_kernel.

Updating ezjail's basejail

To update ezjail's basejail located in /usr/jails/basejail, I run the same commands as when bootstrapping ezjail, see the section Ezjail_host#Bootstrapping_ezjail.

Running mergemaster in the jails

Finally, to run mergemaster in all jails I use the following small script. Run it as root on the host machine. It will nullfs mount /usr/src and /usr/obj from the host machine into each jail and run mergemaster:

[root@ryst ~]# cat jailmergemaster.sh
#!/bin/sh
for jail in $(jls -n jid); do
        # jls -n prints name=value pairs, keep only the value
        JID=$(echo "$jail" | cut -d "=" -f 2)
        JPATH=$(jls -j "$JID" -n path | cut -d "=" -f 2)
        echo "--------------------------------------"
        echo "Processing jail id $JID path $JPATH ..."
        # make the host's source and object trees visible inside the jail
        mount_nullfs /usr/src "$JPATH/usr/src"
        mount_nullfs /usr/obj "$JPATH/usr/obj"
        jexec "$JID" mergemaster -FUi
        umount "$JPATH/usr/src"
        umount "$JPATH/usr/obj"
done

NOTE THAT THIS ONLY HANDLES RUNNING JAILS. YOU CANNOT RUN MERGEMASTER IN A STOPPED JAIL!

To restart all jails I run the command ezjail-admin restart.
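
The script splits the name=value output of jls -n with cut. A small alternative, shown here only as a sketch with a made-up helper name, is to strip the prefix with shell parameter expansion, which also keeps values that themselves contain an "=".

```shell
#!/bin/sh
# Print the value part of a "name=value" pair as emitted by "jls -n".
jls_value() {
    # "${1#*=}" removes the shortest prefix ending in "=", i.e. "name=".
    printf '%s\n' "${1#*=}"
}

jls_value "jid=12"
jls_value "path=/usr/jails/backup.example"   # example path, not a real jail
```

With cut -d "=" -f 2, a path such as /usr/jails/a=b would be cut off at the second "="; the expansion above returns it whole.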

Replacing a defective disk

I had a broken harddisk on one of my servers this evening. This section describes how I replaced the disk to make everything work again.

Booting into the rescue system

After Hetzner staff physically replaced the disk, my server was unable to boot because the disk that died was the first one on the controller. The cheap Hetzner hardware is unable to boot from the secondary disk, probably because of BIOS restrictions. If the other disk had failed, the server would have booted fine and this whole process could have been done with the server running. Anyway, I booted into the rescue system, partitioned the new disk, added a bootloader and added the disk to the root zpool. After this I was able to boot the server normally, so the rest of the work was done without the rescue system.

Partitioning the new disk

The following shows the commands I ran to partition the disk:

[root@rescue ~]# gpart create -s GPT /dev/ad4
ad4 created
[root@rescue ~]# /sbin/gpart add -b 2048 -t freebsd-boot -s 128 /dev/ad4
ad4p1 added
[root@rescue ~]# gpart add -t freebsd-zfs -s 30G /dev/ad4
ad4p2 added
[root@rescue ~]# gpart add -t freebsd-ufs /dev/ad4
ad4p3 added
[root@rescue ~]# gpart show
=>        34  1465149101  ad6  GPT  (698G)
          34        2014       - free -  (1M)
        2048         128    1  freebsd-boot  (64k)
        2176    62914560    2  freebsd-zfs  (30G)
    62916736  1402232399    3  freebsd-ufs  (668G)

=>        34  1465149101  ad4  GPT  (698G)
          34        2014       - free -  (1M)
        2048         128    1  freebsd-boot  (64k)
        2176    62914560    2  freebsd-zfs  (30G)
    62916736  1402232399    3  freebsd-ufs  (668G)

[root@rescue ~]#
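
The numbers in the gpart show output can be checked by hand. Assuming 512-byte sectors (which matches the 64k size shown for the 128-sector boot partition), the sketch below converts 30G to sectors and derives where each partition starts. The function and variable names are my own.

```shell
#!/bin/sh
# Convert GiB to a number of 512-byte sectors.
gib_to_sectors() {
    echo $(( $1 * (1024 * 1024 * 1024 / 512) ))
}

boot_start=2048                              # from "gpart add -b 2048"
boot_size=128                                # from "-s 128"
zfs_start=$(( boot_start + boot_size ))      # 2176
zfs_size=$(gib_to_sectors 30)                # 62914560
third_start=$(( zfs_start + zfs_size ))      # 62916736

echo "freebsd-zfs starts at $zfs_start, size $zfs_size sectors"
echo "third partition starts at $third_start"
```

The results match the start and size columns for ad4 above, which is a cheap way to verify that the new disk is laid out exactly like the surviving one.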

Importing the pool and replacing the disk

Next step is importing the zpool (remember altroot=/mnt !) and replacing the defective disk:

[root@rescue ~]# zpool import
   pool: tank
     id: 3572845459378280852
  state: DEGRADED
 status: One or more devices are missing from the system.
 action: The pool can be imported despite missing or damaged devices.  The
        fault tolerance of the pool may be compromised if imported.
   see: http://www.sun.com/msg/ZFS-8000-2Q
 config:

        tank                      DEGRADED
          mirror-0                DEGRADED
            11006001397618753837  UNAVAIL  cannot open
            ad6p2                 ONLINE
[root@rescue ~]# zpool import -o altroot=/mnt/ tank
[root@rescue ~]# zpool status
  pool: tank
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-2Q
  scan: scrub repaired 0 in 0h2m with 0 errors on Thu Nov  1 05:00:49 2012
config:

        NAME                      STATE     READ WRITE CKSUM
        tank                      DEGRADED     0     0     0
          mirror-0                DEGRADED     0     0     0
            11006001397618753837  UNAVAIL      0     0     0  was /dev/ada0p2
            ad6p2                 ONLINE       0     0     0

errors: No known data errors
[root@rescue ~]# zpool replace tank 11006001397618753837 ad4p2

Make sure to wait until resilver is done before rebooting.

If you boot from pool 'tank', you may need to update
boot code on newly attached disk 'ad4p2'.

Assuming you use GPT partitioning and 'da0' is your new boot disk
you may use the following command:

        gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 da0

[root@rescue ~]#
[root@rescue ~]# zpool status
  pool: tank
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue Nov 27 00:24:41 2012
        823M scanned out of 3.11G at 45.7M/s, 0h0m to go
        823M resilvered, 25.88% done
config:

        NAME                        STATE     READ WRITE CKSUM
        tank                        DEGRADED     0     0     0
          mirror-0                  DEGRADED     0     0     0
            replacing-0             UNAVAIL      0     0     0
              11006001397618753837  UNAVAIL      0     0     0  was /dev/ada0p2
              ad4p2                 ONLINE       0     0     0  (resilvering)
            ad6p2                   ONLINE       0     0     0

errors: No known data errors
[root@rescue ~]# zpool status
  pool: tank
 state: ONLINE
  scan: resilvered 3.10G in 0h2m with 0 errors on Tue Nov 27 01:26:45 2012
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada0p2  ONLINE       0     0     0
            ada1p2  ONLINE       0     0     0

errors: No known data errors
[root@rescue ~]#
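
The hint printed by zpool replace uses da0 as a placeholder for the disk name. As a sketch, the helper below only prints the command with the disk name filled in, so it can be reviewed before it is run. The function name is hypothetical, and ad4 is the new disk from the partitioning step.

```shell
#!/bin/sh
# Print (do not run) the gpart bootcode command for the given disk.
# The paths and the partition index 1 are taken from the zpool hint.
bootcode_cmd() {
    printf 'gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 %s\n' "$1"
}

bootcode_cmd ad4
```

Index 1 is the freebsd-boot partition created first on the new disk.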

Reboot into non-rescue system

At this point I rebooted the machine into the normal FreeBSD system.

Re-create geli partition

To recreate the geli partition on p3 of the new disk, I just follow the same steps as when I originally created it, more info here.

To attach the new geli volume I run geli attach as described here.

Add the geli device to the encrypted zpool

First I check that both geli devices are available, and I check the device name that needs replacing in zpool status output:

[tykling@haze ~]$ geli status
      Name  Status  Components
ada1p3.eli  ACTIVE  ada1p3
ada0p3.eli  ACTIVE  ada0p3

[tykling@haze ~]$ zpool status gelipool
  pool: gelipool
 state: DEGRADED
status: One or more devices has been removed by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: scrub repaired 68K in 0h28m with 0 errors on Thu Nov  1 05:58:08 2012
config:

        NAME                      STATE     READ WRITE CKSUM
        gelipool                  DEGRADED     0     0     0
          mirror-0                DEGRADED     0     0     0
            18431995264718840299  REMOVED      0     0     0  was /dev/ada0p3.eli
            ada1p3.eli            ONLINE       0     0     0

errors: No known data errors
[tykling@haze ~]$

To replace the device and begin resilvering:

[tykling@haze ~]$ sudo zpool replace gelipool 18431995264718840299 ada0p3.eli
Password:
[tykling@haze ~]$ zpool status
  pool: gelipool
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue Nov 27 00:53:40 2012
        759M scanned out of 26.9G at 14.6M/s, 0h30m to go
        759M resilvered, 2.75% done
config:

        NAME                        STATE     READ WRITE CKSUM
        gelipool                    DEGRADED     0     0     0
          mirror-0                  DEGRADED     0     0     0
            replacing-0             REMOVED      0     0     0
              18431995264718840299  REMOVED      0     0     0  was /dev/ada0p3.eli/old
              ada0p3.eli            ONLINE       0     0     0  (resilvering)
            ada1p3.eli              ONLINE       0     0     0

errors: No known data errors
[tykling@haze ~]$

When the resilver is finished, the system is good as new.
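
To avoid rereading the full status output while waiting, the progress line can be picked out with a small filter. This is a sketch with a made-up function name, and it assumes the "resilvered, N% done" wording shown in the output above; other ZFS versions may word the line differently.

```shell
#!/bin/sh
# Read "zpool status" output on stdin and report resilver progress.
# Prints the percentage while a resilver runs, otherwise a short notice.
resilver_progress() {
    pct=$(sed -n 's/.*resilvered, \([0-9.]*%\) done.*/\1/p' | head -n 1)
    if [ -n "$pct" ]; then
        echo "$pct"
    else
        echo "no resilver running"
    fi
}

# Example input copied from the status output above. On a real host:
#   zpool status gelipool | resilver_progress
printf '%s\n' '  scan: resilver in progress since Tue Nov 27 00:53:40 2012' \
    '        759M resilvered, 2.75% done' | resilver_progress
```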
