Ezjail host
<pre>
[root@rescue ~]# zfsinstall -d ada0 -d ada1 -r mirror -z 30G -t /nfs/mfsbsd/10.0-release-amd64.tbz
Creating GUID partitions on ada0 ... done
Configuring ZFS bootcode on ada0 ... done
</pre>
=== Basic ports ===
I also add some basic ports with <code>pkg</code> so I can get screen etc. up and running as soon as possible:
<pre>
# pkg install bash screen sudo portmaster
</pre>
After the reboot into the installed FreeBSD environment, I need to do some further disk configuration.
=== Create swap partitions ===
Swap-on-zfs is not a good idea for various reasons. To keep my swap encrypted but still off ZFS I use geli onetime encryption. To avoid problems if a disk dies I also use gmirror. First I add the partitions with gpart:
<pre>
$ sudo gpart add -t freebsd-swap -s 10G /dev/ada0
ada0p3 added
$ sudo gpart add -t freebsd-swap -s 10G /dev/ada1
ada1p3 added
$
</pre>
Then I make sure gmirror is loaded, and loaded on boot:
<pre>
$ sudo sysrc -f /boot/loader.conf geom_mirror_load="YES"
$ sudo kldload geom_mirror
</pre>
Then I create the gmirror:
<pre>
sudo gmirror label swapmirror /dev/ada0p3 /dev/ada1p3
</pre>
Finally I add the following line to <code>/etc/fstab</code> to get encrypted swap on top of the gmirror:
<pre>
/dev/mirror/swapmirror.eli none swap sw,keylen=256,sectorsize=4096 0 0
</pre>
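The <code>.eli</code> suffix in the fstab entry is what triggers the encryption: the rc scripts notice it and attach the device with a fresh one-time random key at every boot, using the <code>keylen</code> and <code>sectorsize</code> options from the fstab line. As a sketch, this is roughly equivalent to running manually (not commands I actually run):
<pre>
# roughly what the rc swap code does for a .eli fstab entry (sketch only):
geli onetime -l 256 -s 4096 /dev/mirror/swapmirror   # fresh random key, lost at shutdown
swapon /dev/mirror/swapmirror.eli                    # swap on top of the encrypted layer
</pre>
Since the key is never stored anywhere, the swap contents are unrecoverable after a reboot, which is exactly what you want for swap.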
I can enable the new swap partition right away:
<pre>
$ sudo swapon /dev/mirror/swapmirror.eli
$ swapinfo
Device                     1K-blocks     Used    Avail Capacity
/dev/mirror/swapmirror.eli   8388604        0  8388604     0%
$
</pre>
=== Create GELI partitions ===
To create a GELI key I copy some data from <code>/dev/random</code>:
<pre>
$ sudo dd if=/dev/random of=/root/geli.key bs=256k count=1
1+0 records in
1+0 records out
262144 bytes transferred in 0.003347 secs (78318372 bytes/sec)
$
</pre>
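The key file is then used when initializing and attaching the big GELI partitions. As a hedged sketch only — the partition name <code>ada0p4</code> and the AES-XTS algorithm here are illustrative assumptions, not necessarily the exact values used in this setup:
<pre>
# sketch: provider name and algorithm are assumed for illustration
geli init -e AES-XTS -l 256 -s 4096 -K /root/geli.key /dev/ada0p4
geli attach -k /root/geli.key /dev/ada0p4
</pre>
After a successful <code>geli attach</code> the decrypted provider shows up as <code>/dev/ada0p4.eli</code>, which is what the encrypted zpool is built on.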
=== Enable AESNI ===
Most Intel CPUs have hardware acceleration of AES, which helps a lot with GELI performance. I load the <code>aesni</code> module during boot from <code>/boot/loader.conf</code>:
<pre>
$ sudo sysrc -f /boot/loader.conf aesni_load="YES"
</pre>
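After a reboot (or a manual <code>kldload aesni</code>) it is worth checking that the module actually loaded and was detected, for example:
<pre>
$ kldstat | grep aesni
$ dmesg | grep -i aes
</pre>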
I disable atime on both pools:
<pre>
$ sudo zfs set atime=off tank
$ sudo zfs set atime=off gelipool
$
</pre>
I create the file <code>/usr/local/etc/smartd.conf</code> and add this line to it:
<pre>
DEVICESCAN -a -m thomas@gibfest.dk
</pre>
I install <code>openntpd</code> with pkg:
<pre>
sudo pkg install openntpd
</pre>
I enable <code>openntpd</code> in <code>/etc/rc.conf</code>:
<pre>
sudo sysrc openntpd_enable="YES"
</pre>
and add a one-line config file:
<pre>
$ grep -v "^#" /usr/local/etc/ntpd.conf | grep -v "^$"
servers de.pool.ntp.org
$
</pre>
Then I sync the clock and start openntpd:
<pre>
sudo ntpdate de.pool.ntp.org
sudo service openntpd start
</pre>
I also enable <code>ntpdate</code> to help set the clock after a reboot. I set the following two variables in <code>/etc/rc.conf</code>:
<pre>
sudo sysrc ntpdate_enable="YES"
sudo sysrc ntpdate_hosts="de.pool.ntp.org"
</pre>
== Fetching sources ==
First I get the sources:
<pre>
[tykling@jail1 ~]$ sudo git clone -b stable/13 https://git.freebsd.org/src.git /usr/src
Password:
Cloning into '/usr/src'...
remote: Enumerating objects: 4060913, done.
remote: Counting objects: 100% (379329/379329), done.
remote: Compressing objects: 100% (27474/27474), done.
remote: Total 4060913 (delta 373583), reused 351855 (delta 351855), pack-reused 3681584
Receiving objects: 100% (4060913/4060913), 1.38 GiB | 5.94 MiB/s, done.
Resolving deltas: 100% (3217803/3217803), done.
Updating files: 100% (86931/86931), done.
[tykling@jail1 ~]$
</pre>
This takes a while the first time, but subsequent <code>git pull</code> runs are much faster.
'''Note:''' I used to use svn or svnlite here, but since the migration to Git I switched to using the regular git client.
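The clone-once-then-pull workflow can be illustrated with throwaway local repositories; this sketch swaps git.freebsd.org for a local repo so it can run anywhere:

```shell
#!/bin/sh
# Illustration of clone once (slow), then pull incrementally (fast).
# Uses throwaway local repos instead of https://git.freebsd.org/src.git.
set -e
tmp=$(mktemp -d)
git init -q "$tmp/origin"
git -C "$tmp/origin" -c user.name=x -c user.email=x@example.org \
    commit -q --allow-empty -m "initial"
git clone -q "$tmp/origin" "$tmp/src"   # first run: full clone
git -C "$tmp/src" pull -q --ff-only     # later runs: incremental
rm -rf "$tmp"
```

On the real tree the equivalent update is simply running <code>git pull</code> inside /usr/src.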
== Create kernel config ==
I used to create a kernel config to get <code>RACCT</code> and <code>RCTL</code>, but these days both are included in GENERIC, so no need for that anymore. Yay.
== Building world and kernel ==
To build the new system:
<pre>
$ sudo -i bash
# cd /usr/src/
# time (make -j$(sysctl -n hw.ncpu) buildworld && make -j$(sysctl -n hw.ncpu) kernel) && date
</pre>
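The <code>-j$(sysctl -n hw.ncpu)</code> part just matches the number of build jobs to the number of CPUs. A portable sketch of the same idea (the <code>nproc</code> fallback is added here so it also runs on Linux):

```shell
#!/bin/sh
# Pick a parallel job count from the CPU count: hw.ncpu on FreeBSD,
# falling back to nproc on systems without that sysctl.
jobs=$(sysctl -n hw.ncpu 2>/dev/null || nproc)
echo "make -j${jobs} buildworld"
```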
After the build finishes, reboot into the newly built kernel and run mergemaster, installworld, and mergemaster again; and finally delete-old and delete-old-libs:
<pre>
$ sudo -i bash
# cd /usr/src/
# mergemaster -pFUi && make installworld && mergemaster -FUi && make -DBATCH_DELETE_OLD_FILES delete-old && make -DBATCH_DELETE_OLD_FILES delete-old-libs
</pre>
'''PAY ATTENTION DURING MERGEMASTER! DO NOT OVERWRITE /etc/group AND /etc/master.passwd AND OTHER CRITICAL FILES!'''
Reboot after the final mergemaster completes, and boot into the newly built world.
== Installing ezjail ==
Just install it with pkg:
<pre>
sudo pkg install ezjail
</pre>
== Configuring ezjail ==
This part of my jail setup script bootstraps pkg and installs some basic packages:
<pre>
###########
#
env ASSUME_ALWAYS_YES=YES pkg bootstrap
pkg install -y bash
pkg install -y sudo
pkg install -y portmaster
pkg install -y screen

#change shell to bash
</pre>
One of the first things I fix is to enable the pf firewall from OpenBSD. I enable pf at boot time with sysrc:
<pre>
sudo sysrc pf_enable="YES"
sudo sysrc pflog_enable="YES"
</pre>
I get no prompt after this because pf has cut my SSH connection. But I can SSH back in if I did everything right, and if not, I can just wait 60 seconds, after which pf will be disabled again. I SSH in, reattach to the screen session I am running this in, and press control-c so the "sleep 60" is interrupted and pf is not disabled. A neat little trick for when you want to avoid locking yourself out :)
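The lockout guard described above looks roughly like this (a sketch, not the exact command from my history):
<pre>
# load the new ruleset, but automatically disable pf again after 60 seconds
# unless the sleep is interrupted with control-c:
sudo sh -c "pfctl -e -f /etc/pf.conf; sleep 60; pfctl -d"
</pre>
If the new ruleset is broken and locks you out, you just wait out the sleep; if it works, you interrupt the sleep and pf stays enabled.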
== Process Accounting ==
I like to enable process accounting on my jail hosts. It can be useful in a lot of situations. I put the accounting data on a separate ZFS dataset.
I run the following commands to enable it:
<pre>
sudo zfs create tank/root/var/account
sudo touch /var/account/acct
sudo chmod 600 /var/account/acct
sudo accton /var/account/acct
sudo sysrc accounting_enable="YES"
</pre>
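Once accounting is enabled, the collected data can be inspected with <code>lastcomm(1)</code> (per-command records) and summarized with <code>sa(8)</code>, for example:
<pre>
$ lastcomm root          # commands recently run by root
$ sa -m                  # per-user resource usage summary
</pre>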
== Replacing sendmail with Postfix ==
...this way I don't have to remember passing <code>-j -P -a</code> to top every time. Also, I've been told to run top with ''nice'' to limit the CPU used by top itself. I took the advice, so the complete alias looks like the above.
= Base jails =
My jails need various services, so:
* I have a DNS jail with a caching DNS server which is used by all the jails.
* I also have a syslog server on each jail host to collect syslog from all the jails and send it to my central syslog server.
* I have a postgres jail on each jail host, so I only need to maintain one postgres server per host.
* I have a reverse webproxy jail, since many of my jails serve some sort of web application, and I like to terminate SSL for those in one place in an attempt to keep the individual jails as simple as possible.
* Finally, I like to run a tor relay on each box to spend any excess resources (cpu, memory, bandwidth) on something nice. As long as I don't run an exit node I can do this completely without any risk of complaints from the provider.
This section describes how I configure each of these "base jails".
== DNS jail ==
I don't need a public v4 IP for this jail, so I configure it with an RFC1918 v4 IP on a loopback interface, and of course a real IPv6 address.
I am used to using bind but unbound is just as good. Use whatever you are comfortable with. Make sure you permit DNS traffic from the other jails to the DNS jail.
This jail needs to be started first since the rest of the jails need it for DNS. ezjail runs rcorder on the config files in <code>/usr/local/etc/ezjail</code>, which means I can use the normal <code>PROVIDE:</code> and <code>REQUIRE:</code> lines to control the jail dependencies. I change the ezjail config for my DNS jail to have the following <code>PROVIDE:</code> line:
<pre>
# PROVIDE: dns
</pre>
The rest of the jails all get the following <code>REQUIRE:</code> line:
<pre>
# REQUIRE: dns
</pre>
== Syslog jail ==
I don't need a public v4 IP for this jail, so I configure it with an RFC1918 v4 IP on a loopback interface, and of course a real IPv6 address.
This jail gets the following <code>PROVIDE:</code> line in its ezjail config file:
<pre>
# PROVIDE: syslog
</pre>
I use <code>syslog-ng</code> for this. I install the package using <code>pkg install syslog-ng</code> and then add a few things to the default config:
* I use the following options, YMMV:
<code>options { chain_hostnames(off); flush_lines(0); threaded(yes); use_fqdn(yes); keep_hostname(no); use_dns(yes); stats-freq(60); stats-level(0); };</code>
* I remove <code>udp()</code> from the default source and define a new source:
<code>source jailsrc { udp(ip("10.0.0.1")); udp6(ip("w:x:y:z:10::1")); };</code>
* I also remove the line <code>destination console { file("/dev/console"); };</code> since the jail does not have access to <code>/dev/console</code>. I also remove any corresponding <code>log</code> statements that use <code>destination(console);</code>.
* I add a new destination:
<pre>
destination loghost {
    syslog(
        "syslog.tyktech.dk"
        transport("tls")
        port(1999)
        tls(peer-verify(required-untrusted))
        localip("10.0.0.1")
        log-fifo-size(10000)
    );
};
</pre>
* Finally I tell syslog-ng to send all logdata from <code>jailsrc</code> to <code>loghost</code>:
<code>log { source(jailsrc); destination(loghost); flags(flow_control); };</code>
The rest of the jails get the following <code>REQUIRE:</code> line:
<pre>
# REQUIRE: dns syslog
</pre>
... and my jail flavours' default <code>/etc/syslog.conf</code> all get this line, so all the jails send their syslog messages to the syslog jail:
<pre>*.* @10.0.0.1</pre>
== Postgres jail ==
Postgres needs SysV IPC, so I add the following to the jail's config file in <code>/usr/local/etc/ezjail</code>:
<pre>
export jail_postgres_kush_tyknet_dk_parameters="allow.sysvipc=1"
</pre>
While I'm there I also change the jail's <code># PROVIDE:</code> line to:
<pre>
# PROVIDE: postgres
</pre>
And of course the standard <code># REQUIRE:</code> line:
<pre>
# REQUIRE: dns syslog
</pre>
After restarting the jail I can run <code>initdb</code> and start Postgres. When a jail needs a database I need to:
* Add a DB user (with the <code>createuser -P someusername</code> command)
* Add a database with the new user as owner (<code>createdb -O someusername somedbname</code>)
* Add permissions in <code>/usr/local/pgsql/data/pg_hba.conf</code>
* Open a hole in the firewall so the jail can reach the database on TCP port <code>5432</code>
* Add <code>postgres</code> to the <code># REQUIRE:</code> line in the ezjail config file
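As an illustration of the pg_hba.conf step, an entry for a hypothetical client jail at 10.0.0.5, reusing the example user and database names from above, could look like this:
<pre>
# TYPE  DATABASE    USER          ADDRESS       METHOD
host    somedbname  someusername  10.0.0.5/32   md5
</pre>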
== Web Jail ==
I need a public v4 IP for the web jail and I also give it a v6 IP. Since I use a different v6 IP per website, I will need additional v6 addresses when I start adding websites. I add the v6 addresses to the web jail in batches of 10 as I need them. After creating the jail and bootstrapping the ports collection I install <code>security/openssl</code> and <code>www/nginx</code> and configure nginx. More on that later.
== Tor relay jail ==
The Tor relay needs a public IP. It also needs the <code>security/openssl</code> and <code>security/tor</code> ports built (from ports, not packages, to ensure Tor is built with a recent OpenSSL to speed up ECDH). I put the following into the <code>/usr/local/etc/tor/torrc</code> file:
<pre>
Log notice file /var/log/tor/notices.log
ORPort 443 NoListen
ORPort 9090 NoAdvertise
Address torrelay.bong.tyknet.dk
Nickname TykRelay01
ContactInfo Thomas Steen Rasmussen / Tykling <thomas@gibfest.dk> (PGP: 0x772FF77F0972FA58)
DirPort 80 NoListen
DirPort 9091 NoAdvertise
ExitPolicy reject *:*
</pre>
I change the <code>Address</code> and <code>Nickname</code> depending on the server.
A few steps (that should really be done by the port) are needed here:
<pre>
sudo rm -rf /var/db/tor /var/run/tor
sudo mkdir -p /var/db/tor/data /var/run/tor /var/log/tor
sudo chown -R _tor:_tor /var/db/tor /var/log/tor /var/run/tor
sudo chmod -R 700 /var/db/tor
</pre>
I also add the following line to the jail host's <code>/etc/sysctl.conf</code> to make it impossible to predict IP IDs from the server:
<pre>
net.inet.ip.random_id=1
</pre>
Finally I redirect TCP ports 443 and 80 to ports 9090 and 9091 in the jail in <code>/etc/pf.conf</code>:
<pre>
$ grep tor /etc/pf.conf
torv4="85.235.250.88"
torv6="2a01:3a0:1:1900:85:235:250:88"
rdr on $if inet proto tcp from any to $torv4 port 443 -> $torv4 port 9090
rdr on $if inet6 proto tcp from any to $torv6 port 443 -> $torv6 port 9090
rdr on $if inet proto tcp from any to $torv4 port 80 -> $torv4 port 9091
rdr on $if inet6 proto tcp from any to $torv6 port 80 -> $torv6 port 9091
pass in quick on { $if, $jailif } proto tcp from any to { $torv4 $torv6 } port { 9090, 9091 }
$
</pre>
= ZFS snapshots and backup =
The daily, weekly and monthly ZFS snapshots are configured in <code>/etc/periodic.conf</code>:
<pre>
#daily zfs snapshots
daily_zfs_snapshot_pools="tank gelipool"
daily_zfs_snapshot_keep=7
daily_zfs_snapshot_skip="gelipool/backups"

#weekly zfs snapshots
weekly_zfs_snapshot_pools="tank gelipool"
weekly_zfs_snapshot_keep=5
weekly_zfs_snapshot_skip="gelipool/backups"

#monthly zfs snapshots
monthly_zfs_snapshot_pools="tank gelipool"
monthly_zfs_snapshot_keep=6
monthly_zfs_snapshot_skip="gelipool/backups"

#monthly zfs scrub
daily_scrub_zfs_enable="YES"
daily_scrub_zfs_default_threshold=30
</pre>
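The snapshots created by these periodic scripts are named after the period and the date, which is what the mirroring script further down selects on. A small runnable sketch of the naming (the dataset name is hypothetical):

```shell
#!/bin/sh
# Daily periodic snapshots end up with names like tank/root@daily-YYYY-MM-DD;
# the mirror script finds today's snapshots by the @daily-$tday suffix.
tday=$(date +%Y-%m-%d)
snapshot="tank/root@daily-${tday}"   # hypothetical dataset
echo "$snapshot" | grep -q "@daily-${tday}" && echo "matches today"
```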
A few notes: Aside from restricting the command this SSH key can run, I've restricted it to only be able to log in from the IP of the server being backed up. These are very basic restrictions that should always be in place no matter what kind of backup you are using.
=== Add the periodic script ===
I then add the script <code>/usr/local/etc/periodic/daily/999.zfs-mirror</code> to each server being backed up, with the following content:
<pre>
#!/bin/sh
#set -x

### check pidfile
if [ -f /var/run/$(basename $0).pid ]; then
    echo "pidfile /var/run/$(basename $0).pid exists, bailing out"
    exit 1
fi
echo $$ > /var/run/$(basename $0).pid

### If there is a global system configuration file, suck it in.
if [ -r /etc/defaults/periodic.conf ]; then
    . /etc/defaults/periodic.conf
    source_periodic_confs
fi

case "$daily_zfs_mirror_enable" in
    [Yy][Ee][Ss])
        ;;
    *)
        exit
        ;;
esac

pools=$daily_zfs_mirror_pools
if [ -z "$pools" ]; then
    pools='tank'
fi

targethost=$daily_zfs_mirror_targethost
if [ -z "$targethost" ]; then
    echo '$daily_zfs_mirror_targethost must be set in /etc/periodic.conf'
    exit 1
fi

targetuser=$daily_zfs_mirror_targetuser
if [ -z "$targetuser" ]; then
    echo '$daily_zfs_mirror_targetuser must be set in /etc/periodic.conf'
    exit 1
fi

targetfs=$daily_zfs_mirror_targetfs
if [ -z "$targetfs" ]; then
    echo '$daily_zfs_mirror_targetfs must be set in /etc/periodic.conf'
    exit 1
fi

if [ -n "$daily_zfs_mirror_skip" ]; then
    egrep="($(echo $daily_zfs_mirror_skip | sed "s/ /|/g"))"
fi

### get today's date for later use
tday=$(date +%Y-%m-%d)

### check if the destination fs exists
ssh ${targetuser}@${targethost} /usr/home/tykbackup/zfscmd.sh list ${targetfs} > /dev/null 2>&1
if [ $? -ne 0 ]; then
    echo "Creating destination fs on target server"
    echo ssh ${targetuser}@${targethost} /usr/home/tykbackup/zfscmd.sh create ${targetfs}
    ssh ${targetuser}@${targethost} /usr/home/tykbackup/zfscmd.sh create ${targetfs}
fi

echo -n "Doing daily ZFS mirroring - "
date

### loop through the configured pools
for pool in $pools; do
    echo " Processing pool $pool ..."
    ### enumerate datasets with daily snapshots from today
    if [ -n "$egrep" ]; then
        datasets=$(zfs list -t snapshot -o name | grep "^$pool[\/\@]" | egrep -v "$egrep" | grep "@daily-$tday")
    else
        datasets=$(zfs list -t snapshot -o name | grep "^$pool[\/\@]" | grep "@daily-$tday")
    fi
    echo "found datasets: $datasets"
    for snapshot in $datasets; do
        dataset=$(echo -n $snapshot | cut -d "@" -f 1)
        echo "working on dataset $dataset"
        ### find the latest daily snapshot of this dataset on the remote node, if any
        echo ssh ${targetuser}@${targethost} /usr/home/tykbackup/zfscmd.sh list -t snapshot \| grep "^${targetfs}/${dataset}@daily-" \| cut -d " " -f 1 \| tail -1
        lastgoodsnap=$(ssh ${targetuser}@${targethost} /usr/home/tykbackup/zfscmd.sh list -t snapshot | grep "^${targetfs}/${dataset}@daily-" | cut -d " " -f 1 | tail -1)
        if [ -z "$lastgoodsnap" ]; then
            echo "No remote daily snapshot found for local daily snapshot $snapshot - cannot send incremental - sending full backup"
            zfs send -v $snapshot | mbuffer | ssh ${targetuser}@${targethost} /usr/home/tykbackup/zfscmd.sh receive -v -F -u $targetfs/$dataset
            if [ $? -ne 0 ]; then
                echo " Unable to send full snapshot of $dataset to $targetfs on host $targethost"
            else
                echo " Successfully sent a full snapshot of $dataset to $targetfs on host $targethost - future sends will be incremental"
            fi
        else
            ### check if this snapshot has already been sent for some reason, skip if so...
            temp=$(echo $snapshot | cut -d "/" -f 2-)
            lastgoodsnap="$(echo $lastgoodsnap | sed "s,${targetfs}/,,")"
            if [ "$temp" = "$lastgoodsnap" ]; then
                echo " The snapshot $snapshot has already been sent to $targethost, skipping..."
            else
                ### zfs send the difference between latest remote snapshot and todays local snapshot
                echo " Sending the diff between local snapshot $(hostname)@$lastgoodsnap and $(hostname)@$pool/$snapshot to ${targethost}@${targetfs}/${pool} ..."
                zfs send -I $lastgoodsnap $snapshot | mbuffer | ssh ${targetuser}@${targethost} /usr/home/tykbackup/zfscmd.sh receive -v -F -u $targetfs/$dataset
                if [ $? -ne 0 ]; then
                    echo " There was a problem sending the diff between $lastgoodsnap and $snapshot to $targetfs on $targethost"
                else
                    echo " Successfully sent the diff between $lastgoodsnap and $snapshot to $targethost"
                fi
            fi
        fi
    done
done

### remove pidfile
rm /var/run/$(basename $0).pid
</pre>
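The <code>$daily_zfs_mirror_skip</code> handling above turns a space-separated list into an egrep alternation and filters it out of the dataset list. A standalone, runnable illustration of that logic with made-up dataset names:

```shell
#!/bin/sh
# Turn "a b" into the pattern "(a|b)" and filter a snapshot list with it,
# the same way the skip handling in the script above does. Names are made up.
daily_zfs_mirror_skip="gelipool/backups tank/scratch"
egrep="($(echo $daily_zfs_mirror_skip | sed "s/ /|/g"))"

printf '%s\n' \
    "tank/root@daily-2022-01-24" \
    "tank/scratch@daily-2022-01-24" \
    "gelipool/backups/host1@daily-2022-01-24" \
  | grep -Ev "$egrep"
# prints only tank/root@daily-2022-01-24
```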
=== Run the periodic script ===
I usually do the initial run of the periodic script by hand, so I can catch and fix any errors right away. The script will loop over all datasets in the configured pools and zfs send them, including their snapshots, to the backup server. The next time the script runs it will send an incremental diff instead of the full dataset.
=== Caveats ===
This script does not handle deleting datasets (including their snapshots) on the backup server when the dataset is deleted from the server being backed up. You will need to do that manually. This could be considered a feature, or a missing feature, depending on your preferences. :)
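One way such stale datasets could be found is by comparing sorted dataset name lists from both sides with <code>comm(1)</code>. A hedged sketch, where the two lists are stand-ins for real <code>zfs list -H -o name</code> output:

```shell
#!/bin/sh
# Names only present on the backup side are candidates for manual cleanup.
tmp=$(mktemp -d)
printf '%s\n' tank/root tank/root/usr | sort > "$tmp/local"
printf '%s\n' tank/root tank/root/usr tank/oldjail | sort > "$tmp/remote"
comm -13 "$tmp/local" "$tmp/remote"    # prints tank/oldjail
rm -rf "$tmp"
```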
= Staying up-to-date =
I update my ezjail hosts and jails to track -STABLE regularly. This section describes the procedure I use. '''It is essential that the jail host and the jails use the same world version, or bad stuff will happen.'''
== Updating the jail host ==
I use the following script to run <code>mergemaster</code> for all the jails:
<pre>
MM_RC=0
if [ -e /root/.mergemasterrc ]; then
    MM_RC=1
    mv /root/.mergemasterrc /root/.mergemasterrc.old
fi

### loop through jails
for jailname in $(ls -1 /usr/jails/ | grep -Ev "(^basejail$|^newjail$|^flavours$)"); do
    jailroot="/usr/jails/${jailname}"
    echo "processing ${jailroot}:"
    ### check if jailroot exists
    if [ -n "${jailroot}" -a -d "${jailroot}" ]; then
        ### create .mergemasterrc
        cat <<EOF > /root/.mergemasterrc
AUTO_INSTALL=yes
AUTO_UPGRADE=yes
IGNORE_FILES="/boot/device.hints /etc/motd"
EOF
        ### remove backup of /etc from previous run (if it exists)
        if [ -d "${jailroot}/etc.bak" ]; then
            rm -rfI "${jailroot}/etc.bak"
        fi
        ### create backup of /etc as /etc.bak
        cp -pRP "${jailroot}/etc" "${jailroot}/etc.bak"
        ### check if mtree from last mergemaster run exists
        if [ ! -e ${jailroot}/var/db/mergemaster.mtree ]; then
            ### delete /etc/rc.d/*
            rm -rfI ${jailroot}/etc/rc.d/*
        fi
        ### run mergemaster for this jail
        mergemaster -D "${jailroot}"
    else
        echo "${jailroot} doesn't exist"
    fi
    sleep 2
done

### if an existing .mergemasterrc was moved out of the way in the beginning, move it back now
if [ ${MM_RC} -eq 1 ]; then
    mv /root/.mergemasterrc.old /root/.mergemasterrc
else
    rm /root/.mergemasterrc
fi
</pre>
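The jail-enumeration filter at the top of that loop can be sanity-checked on its own: <code>basejail</code>, <code>newjail</code> and <code>flavours</code> are excluded, and everything else under /usr/jails is treated as a jail. A runnable simulation using a temp directory:

```shell
#!/bin/sh
# Simulate /usr/jails with a temp dir and apply the same grep filter.
tmp=$(mktemp -d)
mkdir "$tmp/basejail" "$tmp/newjail" "$tmp/flavours" "$tmp/dns1" "$tmp/web1"
ls -1 "$tmp" | grep -Ev "(^basejail$|^newjail$|^flavours$)"
# prints dns1 and web1 only
rm -rf "$tmp"
```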
Latest revision as of 18:18, 24 January 2022
= Background =
This is my personal checklist for when I am setting up a new ezjail host. I like my jail hosts configured in a very specific way. There is a good chance that what is right for me is not right for you. As always, YMMV.
Also note that I talk a lot about the German hosting provider Hetzner, if you are using another provider or you are doing this at home, just ignore the Hetzner specific stuff. Much of the content here can be used with little or no changes outside Hetzner.
= Installation =
== OS install with mfsbsd ==
After receiving the server from Hetzner I boot it using the rescue system which puts me at an mfsbsd prompt via SSH. This is perfect for installing a zfs-only server.
=== Changes to zfsinstall ===
I edit the zfsinstall script <code>/root/bin/zfsinstall</code> and add "usr" to <code>FS_LIST</code> near the top of the script. I do this because I like to have /usr as a separate ZFS dataset.
=== Check disks ===
I create a small zpool using just 30 gigs, enough to comfortably install the base OS and so on. The rest of the diskspace will be used for GELI, which will have the other zfs pool on top. This encrypted zpool will house the actual jails and data. This setup allows me to have all the important data encrypted, while allowing the physical server to boot without human intervention like full disk encryption would require.
Note that the disks in this server are not new, they have been used for around two years (18023 hours / 24 ≈ 751 days):
<pre>
[root@rescue ~]# grep "ada[0-9]:" /var/run/dmesg.boot | grep "MB "
ada0: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C)
ada1: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C)
[root@rescue ~]# smartctl -a /dev/ada0 | grep Power_On_Hours
  9 Power_On_Hours  0x0032  096  096  000  Old_age  Always  -  18023
[root@rescue ~]# smartctl -a /dev/ada1 | grep Power_On_Hours
  9 Power_On_Hours  0x0032  096  096  000  Old_age  Always  -  18023
[root@rescue ~]#
</pre>
=== Destroy existing partitions ===
Any existing partitions need to be deleted first. This can be done with the destroygeom command as shown below:
<pre>
[root@rescue ~]# destroygeom -d ada0 -d ada1
Destroying geom ada0:
    Deleting partition 3 ... done
Destroying geom ada1:
    Deleting partition 1 ... done
    Deleting partition 2 ... done
    Deleting partition 3 ... done
</pre>
=== Install FreeBSD ===
Installing FreeBSD with mfsbsd is easy. I run the below command, adjusting the release I want to install of course:
<pre>
[root@rescue ~]# zfsinstall -d ada0 -d ada1 -r mirror -z 30G -t /nfs/mfsbsd/10.0-release-amd64.tbz
Creating GUID partitions on ada0 ... done
Configuring ZFS bootcode on ada0 ... done
=>        34  3907029101  ada0  GPT  (1.8T)
          34        2014        - free -  (1M)
        2048         128     1  freebsd-boot  (64k)
        2176    62914560     2  freebsd-zfs  (30G)
    62916736  3844112399        - free -  (1.8T)
Creating GUID partitions on ada1 ... done
Configuring ZFS bootcode on ada1 ... done
=>        34  3907029101  ada1  GPT  (1.8T)
          34        2014        - free -  (1M)
        2048         128     1  freebsd-boot  (64k)
        2176    62914560     2  freebsd-zfs  (30G)
    62916736  3844112399        - free -  (1.8T)
Creating ZFS pool tank on ada0p2 ada1p2 ... done
Creating tank root partition: ... done
Creating tank partitions: var tmp usr ... done
Setting bootfs for tank to tank/root ... done
NAME            USED  AVAIL  REFER  MOUNTPOINT
tank            270K  29.3G    31K  none
tank/root       127K  29.3G    34K  /mnt
tank/root/tmp    31K  29.3G    31K  /mnt/tmp
tank/root/usr    31K  29.3G    31K  /mnt/usr
tank/root/var    31K  29.3G    31K  /mnt/var
Extracting FreeBSD distribution ... done
Writing /boot/loader.conf... done
Writing /etc/fstab...
Writing /etc/rc.conf... done
Copying /boot/zfs/zpool.cache ... done

Installation complete.
The system will boot from ZFS with clean install on next reboot

You may type "chroot /mnt" and make any adjustments you need.
For example, change the root password or edit/create /etc/rc.conf for
system services.

WARNING - Don't export ZFS pool "tank"!
[root@rescue ~]#
</pre>
== Post install configuration (before reboot) ==
Before rebooting into the installed FreeBSD I need to make certain I can reach the server through SSH after the reboot. This means:
* Adding network settings to <code>/etc/rc.conf</code>
* Adding sshd_enable="YES" to <code>/etc/rc.conf</code>
* Changing PermitRootLogin to Yes in <code>/etc/ssh/sshd_config</code>. '''Note:''' This is now the default in the image that Hetzner provides.
* Adding nameservers to <code>/etc/resolv.conf</code>
* Finally I set the root password.
All of these steps are essential if I am going to have any chance of logging in after reboot. Most of these changes can be done from the mfsbsd shell but the password change requires chroot into the newly installed environment.
I use the chroot command but start another shell, as bash is not installed in /mnt:
<pre>
[root@rescue ~]# chroot /mnt/ csh
rescue# ee /etc/rc.conf
rescue# ee /etc/ssh/sshd_config
rescue# passwd
New Password:
Retype New Password:
rescue#
</pre>
So, the network settings are sorted, root password is set, and root is permitted to ssh in. Time to reboot (this is the exciting part).
Remember to use shutdown -r now
and not reboot
when you reboot. shutdown -r now
performs the proper shutdown process including rc.d scripts and disk buffer flushing. reboot
is the "bigger hammer" to use when something is preventing shutdown -r now
from working.
Basic config after first boot
If the server boots without any problems, I do some basic configuration before I continue with the disk partitioning.
Timezone
I run the command tzsetup
to set the proper timezone, and set the time using ntpdate
if necessary.
Note: The current Hetzner FreeBSD image has the timezone set to CEST; I like my servers configured as UTC
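For reference, both steps can be done non-interactively; the NTP server below is just an example:

```shell
# set the timezone to UTC without the interactive menu
sudo tzsetup UTC
# one-off clock sync against an example NTP pool
sudo ntpdate de.pool.ntp.org
```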
Basic ports
I also add some basic ports with pkg
so I can get screen etc. up and running as soon as possible:
# pkg install bash screen sudo portmaster
I then add the following to /usr/local/etc/portmaster.rc
:
ALWAYS_SCRUB_DISTFILES=dopt PM_DEL_BUILD_ONLY=pm_dbo SAVE_SHARED=wopt PM_LOG=/var/log/portmaster.log PM_IGNORE_FAILED_BACKUP_PACKAGE=pm_ignore_failed_backup_package
An explanation of these options can be found on the Portmaster page.
After a rehash
and adding my non-root user with adduser
, I am ready to continue with the disk configuration. I also remember to disable root login in /etc/ssh/sshd_config
.
Further disk configuration
After the reboot into the installed FreeBSD environment, I need to do some further disk configuration.
Create swap partitions
Swap-on-ZFS is not a good idea for various reasons (notably, ZFS itself needs memory to complete writes, so paging out to ZFS-backed swap can deadlock under memory pressure). To keep my swap encrypted but still off ZFS I use geli onetime encryption. To avoid problems if a disk dies I also use gmirror. First I add the partitions with gpart:
$ sudo gpart add -t freebsd-swap -s 10G /dev/ada0 ada0p3 added $ sudo gpart add -t freebsd-swap -s 10G /dev/ada1 ada1p3 added $
Then I make sure gmirror is loaded, and loaded on boot:
$ sudo sysrc -f /boot/loader.conf geom_mirror_load="YES" $ sudo kldload geom_mirror
Then I create the gmirror:
sudo gmirror label swapmirror /dev/ada0p3 /dev/ada1p3
Finally I add the following line to /etc/fstab
to get encrypted swap on top of the gmirror:
/dev/mirror/swapmirror.eli none swap sw,keylen=256,sectorsize=4096 0 0
I can enable the new swap partition right away:
$ sudo swapon /dev/mirror/swapmirror.eli $ swapinfo Device 1K-blocks Used Avail Capacity /dev/mirror/swapmirror.eli 8388604 0 8388604 0% $
Create GELI partitions
First I create the partitions to hold the geli devices:
$ sudo gpart add -t freebsd-ufs ada0 ada0p4 added $ sudo gpart add -t freebsd-ufs ada1 ada1p4 added
I add them as freebsd-ufs
type partitions, as there is no dedicated freebsd-geli
type.
Create GELI key
To create a GELI key I copy some data from /dev/random
:
$ sudo dd if=/dev/random of=/root/geli.key bs=256k count=1 1+0 records in 1+0 records out 262144 bytes transferred in 0.003347 secs (78318372 bytes/sec) $
Create GELI volumes
I create the GELI volumes with a 4k sector size and a 256-bit AES key:
$ sudo geli init -s 4096 -K /root/geli.key -l 256 /dev/ada0p4 Enter new passphrase: Reenter new passphrase: Metadata backup can be found in /var/backups/ada0p4.eli and can be restored with the following command: # geli restore /var/backups/ada0p4.eli /dev/ada0p4 $ sudo geli init -s 4096 -K /root/geli.key -l 256 /dev/ada1p4 Enter new passphrase: Reenter new passphrase: Metadata backup can be found in /var/backups/ada1p4.eli and can be restored with the following command: # geli restore /var/backups/ada1p4.eli /dev/ada1p4 $
Enable AESNI
Most Intel CPUs have hardware acceleration of AES which helps a lot with GELI performance. I load the aesni
module during boot from /boot/loader.conf
:
$ sudo sysrc -f /boot/loader.conf aesni_load="YES"
Attach GELI volumes
Now I just need to attach the GELI volumes before I am ready to create the second zpool:
$ sudo geli attach -k /root/geli.key /dev/ada0p4 Enter passphrase: $ sudo geli attach -k /root/geli.key /dev/ada1p4 Enter passphrase: $
Create second zpool
$ sudo zpool create gelipool mirror /dev/ada0p4.eli /dev/ada1p4.eli $ zpool status pool: gelipool state: ONLINE scan: none requested config: NAME STATE READ WRITE CKSUM gelipool ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 ada0p4.eli ONLINE 0 0 0 ada1p4.eli ONLINE 0 0 0 errors: No known data errors pool: tank state: ONLINE scan: none requested config: NAME STATE READ WRITE CKSUM tank ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 ada0p2 ONLINE 0 0 0 ada1p2 ONLINE 0 0 0 errors: No known data errors $
Create ZFS filesystems on the new zpool
The last remaining thing is to create a filesystem in the new zfs pool:
$ zfs list NAME USED AVAIL REFER MOUNTPOINT gelipool 624K 3.54T 144K /gelipool tank 704M 28.6G 31K none tank/root 704M 28.6G 413M / tank/root/tmp 38K 28.6G 38K /tmp tank/root/usr 291M 28.6G 291M /usr tank/root/var 505K 28.6G 505K /var $ sudo zfs set mountpoint=none gelipool $ sudo zfs set compression=on gelipool $ sudo zfs create -o mountpoint=/usr/jails gelipool/jails $ zfs list NAME USED AVAIL REFER MOUNTPOINT gelipool 732K 3.54T 144K none gelipool/jails 144K 3.54T 144K /usr/jails tank 704M 28.6G 31K none tank/root 704M 28.6G 413M / tank/root/tmp 38K 28.6G 38K /tmp tank/root/usr 291M 28.6G 291M /usr tank/root/var 505K 28.6G 505K /var $
Disable atime
One last thing I like to do is to disable atime
or access time on the filesystem. Access times are recorded every time a file is read, and while this can have its use cases, I never use it. Disabling it means a lot fewer write operations, as a read operation doesn't automatically include a write operation when atime
is disabled. Disabling it is easy:
$ sudo zfs set atime=off tank $ sudo zfs set atime=off gelipool $
The next things are post-install configuration stuff like OS upgrade, ports, firewall and so on. The basic install is finished \o/
Reserved space
Running out of space in ZFS is bad. Stuff will run slowly and may stop working entirely until some space is freed. The problem is that ZFS is a copy-on-write filesystem, which means that every operation, even a deletion, requires writing new data to disk. I've more than once wound up in a situation where I couldn't delete a file to free up disk space because the disk was full.
Sometimes this can be resolved by overwriting a large file, like:
$ echo > /path/to/a/large/file
This truncates the file down to a single newline, freeing up some space, but sometimes even this is not possible. This is where reserved space comes in. I create a new filesystem in each pool, set them readonly and without a mountpoint, and with 1G reserved each:
$ sudo zfs create -o mountpoint=none -o reservation=1G -o readonly=on gelipool/reserved $ sudo zfs create -o mountpoint=none -o reservation=1G -o readonly=on tank/reserved
If I run out of space for some reason, I can just delete the dataset, or remove the reservation property, and I immediately have 1G of disk space available. Yay!
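If a pool does fill up, releasing the reserved space looks something like this (a sketch using the dataset names created above):

```shell
# free 1G immediately by dropping the reservation...
sudo zfs set reservation=none tank/reserved
# ...or by destroying the reserved dataset entirely
sudo zfs destroy gelipool/reserved
```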
Ports
Installing the ports tree
I need to bootstrap the ports system; I use portsnap
as it is way faster than using c(v)sup. Initially I run portsnap fetch extract
and when I need to update the tree later I use portsnap fetch update
.
smartd
I install smartd to monitor the disks for problems:
$ sudo pkg install smartmontools
I create the file /usr/local/etc/smartd.conf
and add this line to it:
DEVICESCAN -a -m thomas@gibfest.dk
This makes smartd monitor all disks and send me an email if it finds an error.
Remember to enable smartd in /etc/rc.conf
and start it:
sudo sysrc smartd_enable="YES" sudo service smartd start
openntpd
I install net/openntpd
to keep the clock in sync. I find this a lot easier to configure than the base ntpd.
sudo pkg install openntpd
I enable openntpd
in /etc/rc.conf
:
sudo sysrc openntpd_enable="YES"
and add a one line config file:
$ grep -v "^#" /usr/local/etc/ntpd.conf | grep -v "^$" servers de.pool.ntp.org $
sync the clock and start openntpd:
sudo ntpdate de.pool.ntp.org sudo service openntpd start
ntpdate
I also enable ntpdate
to help set the clock after a reboot. I add the following two lines to /etc/rc.conf
:
sudo sysrc ntpdate_enable="YES" sudo sysrc ntpdate_hosts="de.pool.ntp.org"
Upgrade OS (buildworld)
I usually run -STABLE on my hosts, which means I need to build and install a new world and kernel. I also like having rctl available on my jail hosts, so I can limit jail resources in all kinds of neat ways, and I like having DTrace available. Additionally, I need the built world to populate ezjail's basejail.
Note: I will need to update the host and the jails many times during the lifespan of this server, which is likely 2-3 years or more. As new security problems are found, or features I want are added, I will update the host and jails. There is a section about staying up to date later on this page. This section (the one you are reading now) only covers the OS update I run right after installing the server.
Fetching sources
First I get the sources:
[tykling@jail1 ~]$ sudo git clone -b stable/13 https://git.freebsd.org/src.git /usr/src Password: Cloning into '/usr/src'... remote: Enumerating objects: 4060913, done. remote: Counting objects: 100% (379329/379329), done. remote: Compressing objects: 100% (27474/27474), done. remote: Total 4060913 (delta 373583), reused 351855 (delta 351855), pack-reused 3681584 Receiving objects: 100% (4060913/4060913), 1.38 GiB | 5.94 MiB/s, done. Resolving deltas: 100% (3217803/3217803), done. Updating files: 100% (86931/86931), done. [tykling@jail1 ~]$
This takes a while the first time, but subsequent git pull
runs are much faster.
Note: I used to use svn or svnlite here but since the migration to Git I switched to using the regular git client
Create kernel config
I used to create a kernel config to get RACCT
and RCTL
but these days both are included in GENERIC, so no need for that anymore. Yay.
Building world and kernel
Finally I start the build. I use -j to start one thread per core in the system. sysctl hw.ncpu
shows the number of available cores:
# sysctl hw.ncpu hw.ncpu: 12
To build the new system:
$ sudo -i bash # cd /usr/src/ # time (make -j$(sysctl -n hw.ncpu) buildworld && make -j$(sysctl -n hw.ncpu) kernel) && date
After the build finishes, reboot into the newly built kernel and run mergemaster, installworld, and mergemaster again; and finally delete-old and delete-old-libs:
$ sudo -i bash # cd /usr/src/ # mergemaster -pFUi && make installworld && mergemaster -FUi && make -DBATCH_DELETE_OLD_FILES delete-old && make -DBATCH_DELETE_OLD_FILES delete-old-libs
PAY ATTENTION DURING MERGEMASTER! DO NOT OVERWRITE /etc/group AND /etc/master.passwd AND OTHER CRITICAL FILES!
Reboot after the final mergemaster completes, and boot into the newly built world.
Preparing ezjail
ezjail needs to be installed and a bit of configuration is also needed, in addition to bootstrapping /usr/jails/basejail
and /usr/jails/newjail
.
Installing ezjail
Just install it with pkg:
sudo pkg install ezjail
Configuring ezjail
Then I go edit the ezjail config file /usr/local/etc/ezjail.conf
and add/change these three lines near the bottom:
ezjail_use_zfs="YES" ezjail_jailzfs="gelipool/jails" ezjail_use_zfs_for_jails="YES"
This makes ezjail use separate ZFS datasets under gelipool/jails
for the basejail
and newjail
, as well as for each jail created. ezjail_use_zfs_for_jails
is supported since ezjail 3.2.2.
Bootstrapping ezjail
Finally I populate basejail
and newjail
from the world I built earlier:
$ sudo ezjail-admin update -i
The last line of the output is a message saying:
Note: a non-standard /etc/make.conf was copied to the template jail in order to get the ports collection running inside jails.
This is because ezjail defaults to symlinking the ports collection in the same way it symlinks the basejail. I prefer having separate/individual ports collections in each of my jails though, so I remove the symlink and make.conf from newjail:
$ sudo rm /usr/jails/newjail/etc/make.conf /usr/jails/newjail/usr/ports /usr/jails/newjail/usr/src $ sudo mkdir /usr/jails/newjail/usr/src
ZFS goodness
Note that ezjail has created two new ZFS datasets to hold basejail
and newjail
:
$ zfs list -r gelipool/jails NAME USED AVAIL REFER MOUNTPOINT gelipool/jails 239M 3.54T 476K /usr/jails gelipool/jails/basejail 236M 3.54T 236M /usr/jails/basejail gelipool/jails/newjail 3.10M 3.54T 3.10M /usr/jails/newjail
ezjail flavours
ezjail has a pretty awesome feature that makes it possible to create templates or flavours which apply common settings when creating a new jail. I always have a basic flavour which adds a user for me, installs an SSH key, adds a few packages like bash, screen, sudo and portmaster - and configures those packages. Basically, everything I find myself doing over and over again every time I create a new jail.
It is also possible, of course, to create more advanced flavours; I've had one that installs a complete nginx+php-fpm server with all the necessary packages and configs.
ezjail flavours are technically pretty simple. By default, they are located in the same place as basejail and newjail, and ezjail comes with an example flavour to get you started. Basically, a flavour is a file/directory hierarchy which is copied to the jail, plus a shell script called ezjail.flavour which is run once, the first time the jail is started, and then deleted.
For reference, I've included my basic flavour here. First is a listing of the files included in the flavour, and then the ezjail.flavour script which performs tasks beyond copying config files.
$ find /usr/jails/flavours/tykbasic /usr/jails/flavours/tykbasic /usr/jails/flavours/tykbasic/ezjail.flavour /usr/jails/flavours/tykbasic/usr /usr/jails/flavours/tykbasic/usr/local /usr/jails/flavours/tykbasic/usr/local/etc /usr/jails/flavours/tykbasic/usr/local/etc/portmaster.rc /usr/jails/flavours/tykbasic/usr/local/etc/sudoers /usr/jails/flavours/tykbasic/usr/home /usr/jails/flavours/tykbasic/usr/home/tykling /usr/jails/flavours/tykbasic/usr/home/tykling/.ssh /usr/jails/flavours/tykbasic/usr/home/tykling/.ssh/authorized_keys /usr/jails/flavours/tykbasic/usr/home/tykling/.screenrc /usr/jails/flavours/tykbasic/etc /usr/jails/flavours/tykbasic/etc/fstab /usr/jails/flavours/tykbasic/etc/rc.conf /usr/jails/flavours/tykbasic/etc/periodic.conf /usr/jails/flavours/tykbasic/etc/resolv.conf
As you can see, the flavour contains files like /etc/resolv.conf and other stuff to make the jail work. The name of the flavour here is tykbasic which means that if I want a file to end up in /usr/home/tykling after the flavour has been applied, I need to put that file in the folder /usr/jails/flavours/tykbasic/usr/home/tykling/ - remember to also chown the files in the flavour appropriately.
Finally, my ezjail.flavour script looks like so:
#!/bin/sh # # BEFORE: DAEMON # # ezjail flavour example # Timezone ########### # ln -s /usr/share/zoneinfo/Europe/Copenhagen /etc/localtime # Groups ######### # pw groupadd -q -n tykling # Users ######## # # To generate a password hash for use here, do: # openssl passwd -1 "the password" echo -n '$1$L/fC0UrO$bi65/BOIAtMkvluDEDCy31' | pw useradd -n tykling -u 1001 -s /bin/sh -m -d /usr/home/tykling -g tykling -c 'tykling' -H 0 # Packages ########### # env ASSUME_ALWAYS_YES=YES pkg bootstrap pkg install -y bash pkg install -y sudo pkg install -y portmaster pkg install -y screen #change shell to bash chsh -s bash tykling #update /etc/aliases echo "root: thomas@gibfest.dk" >> /etc/aliases newaliases #remove adjkerntz from crontab cat /etc/crontab | grep -E -v "(Adjust the time|adjkerntz)" > /etc/crontab.new mv /etc/crontab.new /etc/crontab #remove ports symlink rm /usr/ports # create symlink to /usr/home in / (adduser defaults to /usr/username as homedir) ln -s /usr/home /home
Creating a flavour is easy: just create a folder under /usr/jails/flavours/
that has the name of the flavour, and start adding files and folders there. The ezjail.flavour script should be placed in the root (see the example further up the page).
Finally I add the following to /usr/local/etc/ezjail.conf
to make ezjail always use my new flavour:
ezjail_default_flavour="tykbasic"
Configuration
This section outlines what I do to further prepare the machine to be a nice ezjail host.
Firewall
One of the first things I fix is to enable the pf firewall from OpenBSD. I add the following to /etc/rc.conf
to enable pf at boot time:
sudo sysrc pf_enable="YES" sudo sysrc pflog_enable="YES"
I also create a very basic /etc/pf.conf
:
[root@ ~]# cat /etc/pf.conf ### macros if="em0" table <portknock> persist #external addresses tykv4="a.b.c.d" tykv6="2002:ab:cd::/48" table <allowssh> { $tykv4,$tykv6 } #local addresses glasv4="w.x.y.z" ### scrub scrub in on $if all fragment reassemble ################ ### filtering ### block everything block log all ################ ### skip loopback interface(s) set skip on lo0 ################ ### icmp6 pass in quick on $if inet6 proto icmp6 all icmp6-type {echoreq,echorep,neighbradv,neighbrsol,routeradv,routersol} ################ ### pass outgoing pass out quick on $if all ################ ### portknock rule (more than 5 connections in 10 seconds to the port specified will add the "offending" IP to the <portknock> table) pass in quick on $if inet proto tcp from any to $glasv4 port 32323 synproxy state (max-src-conn-rate 5/10, overload <portknock>) ### pass incoming ssh and icmp pass in quick on $if proto tcp from { <allowssh>, <portknock> } to ($if) port 22 pass in quick on $if inet proto icmp all icmp-type { 8, 11 } ################ ### pass ipv6 fragments (hack to workaround pf not handling ipv6 fragments) pass in on $if inet6 block in log on $if inet6 proto udp block in log on $if inet6 proto tcp block in log on $if inet6 proto icmp6 block in log on $if inet6 proto esp block in log on $if inet6 proto ipv6
To load pf without rebooting I run the following:
[root@ ~]# kldload pf [root@ ~]# kldload pflog [root@ ~]# pfctl -ef /etc/pf.conf && sleep 60 && pfctl -d No ALTQ support in kernel ALTQ related functions disabled
I get no prompt after this because pf has cut my SSH connection. But I can SSH back in if I did everything right, and if not, I can just wait 60 seconds after which pf will be disabled again. I SSH in and reattach to the screen I am running this in, and press control-c, so the "sleep 60" is interrupted and pf is not disabled. Neat little trick for when you want to avoid locking yourself out :)
Process Accounting
I like to enable process accounting on my jail hosts. It can be useful in a lot of situations. I put the accounting data on a separate ZFS dataset.
I run the following commands to enable it:
sudo zfs create tank/root/var/account sudo touch /var/account/acct sudo chmod 600 /var/account/acct sudo accton /var/account/acct sudo sysrc accounting_enable="YES"
Replacing sendmail with Postfix
I always replace Sendmail with Postfix on every server I manage. See Replacing_Sendmail_With_Postfix for more info.
Listening daemons
When you add an IP alias for a jail, any daemons listening on * will also listen on the jail's IP, which is not what I want. For example, I want the jail's sshd to be able to listen on the jail's IP on port 22, instead of the host's sshd. Check for listening daemons like so:
$ sockstat -l46 USER COMMAND PID FD PROTO LOCAL ADDRESS FOREIGN ADDRESS root master 1554 12 tcp4 *:25 *:* root master 1554 13 tcp6 *:25 *:* root sshd 948 3 tcp6 *:22 *:* root sshd 948 4 tcp4 *:22 *:* root syslogd 789 6 udp6 *:514 *:* root syslogd 789 7 udp4 *:514 *:* [tykling@glas ~]$
This tells me that I need to change Postfix, sshd and syslogd to stop listening on all IP addresses.
Postfix
The defaults in Postfix are really nice on FreeBSD, and most of the time a completely empty config file is fine for a system mailer (sendmail replacement). However, to make Postfix stop listening on port 25 on all IP addresses, I do need one line in /usr/local/etc/postfix/main.cf
:
$ cat /usr/local/etc/postfix/main.cf inet_interfaces=localhost $
sshd
To make sshd
stop listening on all IP addresses I uncomment and edit the ListenAddress
line in /etc/ssh/sshd_config
:
$ grep ListenAddress /etc/ssh/sshd_config ListenAddress x.y.z.226 #ListenAddress ::
(IP address obfuscated..)
syslogd
I don't need my syslogd to listen on the network at all, so I add the following line to /etc/rc.conf
:
$ grep syslog /etc/rc.conf syslogd_flags="-ss"
Restarting services
Finally I restart Postfix, sshd and syslogd:
$ sudo /etc/rc.d/syslogd restart Stopping syslogd. Waiting for PIDS: 789. Starting syslogd. $ sudo /etc/rc.d/sshd restart Stopping sshd. Waiting for PIDS: 948. Starting sshd. $ sudo /usr/local/etc/rc.d/postfix restart postfix/postfix-script: stopping the Postfix mail system postfix/postfix-script: starting the Postfix mail system
A check with sockstat
reveals that no more services are listening on all IP addresses:
$ sockstat -l46 USER COMMAND PID FD PROTO LOCAL ADDRESS FOREIGN ADDRESS root master 1823 12 tcp4 127.0.0.1:25 *:* root master 1823 13 tcp6 ::1:25 *:* root sshd 1617 3 tcp4 x.y.z.226:22 *:* $
Network configuration
Network configuration is a big part of any jail setup. If I have enough IP addresses (ipv4 and ipv6) I can just add IP aliases as needed. If I only have one or a few v4 IPs I will need to use rfc1918 addresses for the jails. In that case, I create a new loopback interface, lo1
and add the IP aliases there. I then use the pf firewall to redirect incoming traffic to the right jail, depending on the port in use.
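As a sketch, such a pf redirect could look like this in /etc/pf.conf (the interface, addresses and ports here are example values, not part of my actual ruleset):

```shell
# /etc/pf.conf sketch: forward incoming TCP port 80 on the host's
# public IP to a webserver jail on an rfc1918 alias on lo1 (example values)
ext_if="em0"
hostv4="192.0.2.10"
webjail="10.0.0.5"
rdr pass on $ext_if inet proto tcp from any to $hostv4 port 80 -> $webjail port 80
```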
IPv4
If rfc1918 jails are needed, I add the following to /etc/rc.conf
to create the lo1 interface on boot:
### lo1 interface for ipv4 rfc1918 jails cloned_interfaces="lo1"
When the lo1 interface is created, or if it isn't needed, I am ready to start adding IP aliases for jails as needed.
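Adding the aliases in /etc/rc.conf could look like this (example addresses):

```shell
# rfc1918 alias on lo1 for a jail (example address)
ifconfig_lo1_alias0="inet 10.0.0.5 netmask 255.255.255.255"
# public alias directly on the external interface for another jail (example)
ifconfig_em0_alias0="inet 192.0.2.11 netmask 255.255.255.255"
```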
IPv6
On the page Hetzner_ipv6 I've explained how to make IPv6 work on a Hetzner server where the supplied IPv6 default gateway is outside the IPv6 subnet assigned.
When basic IPv6 connectivity works, I am ready to start adding IP aliases for jails as needed.
Allow ping from inside jails
I add the following to /etc/sysctl.conf
so the jails are allowed to do icmp ping. This enables raw socket access, which can be a security issue if you have untrusted root users in your jails. Use with caution.
#allow ping in jails security.jail.allow_raw_sockets=1
Tips & tricks
Get jail info out of top
To make top show the jail id of the jail in which each process is running, I need to pass the -j flag to top. Since this is a multi-cpu server I am working on, I also like giving the -P flag to top, to get a separate line of cpu stats per core. Finally, I like -a to get the full commandline/info of the running processes. I add the following to my .bashrc in my homedir on the jail host:
alias top="nice top -j -P -a"
...this way I don't have to remember passing -j -P -a
to top every time. Also, I've been told to run top with nice to limit the cpu used by top itself. I took the advice so the complete alias looks like above.
Base jails
My jails need various services, so:
- I have a DNS jail with a caching DNS server which is used by all the jails.
- I also have a syslog server on each jailhost to collect syslog from all the jails and send them to my central syslog server.
- I have a postgres jail on each jail host, so I only need to maintain one postgres server.
- I have a reverse webproxy jail since many of my jails serve some sort of web application, and I like to terminate SSL for those in one place in an attempt to keep the individual jails as simple as possible.
- Finally I like to run a tor relay on each box to spend any excess resources (cpu, memory, bandwidth) on something nice. As long as I don't run an exit node I can do this completely without any risk of complaints from the provider.
This section describes how I configure each of these "base jails".
DNS jail
I don't need a public v4 IP for this jail, so I configure it with an RFC1918 v4 IP on a loopback interface, and of course a real IPv6 address.
I am used to using bind but unbound is just as good. Use whatever you are comfortable with. Make sure you permit DNS traffic from the other jails to the DNS jail.
This jail needs to be started first since the rest of the jails need it for DNS. ezjail runs rcorder on the config files in /usr/local/etc/ezjail
which means I can use the normal PROVIDE:
and REQUIRE:
to control the jail dependencies. I change the ezjail config for my DNS jail to have the following PROVIDE:
line:
# PROVIDE: dns
The rest of the jails all get the following REQUIRE:
line:
# REQUIRE: dns
Syslog jail
I don't need a public v4 IP for this jail, so I configure it with an RFC1918 v4 IP on a loopback interface, and of course a real IPv6 address.
This jail gets the following PROVIDE:
line in its ezjail config file:
# PROVIDE: syslog
I use syslog-ng
for this. I install the package using pkg install syslog-ng
and then add a few things to the default config:
- I use the following options, YMMV:
options { chain_hostnames(off); flush_lines(0); threaded(yes); use_fqdn(yes); keep_hostname(no); use_dns(yes); stats-freq(60); stats-level(0); };
- I remove
udp()
from the default source and define a new source:
source jailsrc { udp(ip("10.0.0.1")); udp6(ip("w:x:y:z:10::1")); };
- I also remove the line:
destination console { file("/dev/console"); };
since the jail does not have access to /dev/console. I also remove any corresponding log statements that use destination(console);.
- I add a new destination:
destination loghost{ syslog( "syslog.tyktech.dk" transport("tls") port(1999) tls(peer-verify(required-untrusted)) localip("10.0.0.1") log-fifo-size(10000) ); };
- Finally tell syslog-ng to send all logdata from
jailsrc
tologhost
:
log { source(jailsrc); destination(loghost); flags(flow_control); };
The rest of the jails get the following REQUIRE:
line:
# REQUIRE: dns syslog
... and my jail flavours' default /etc/syslog.conf
all get this line so all the jails send their syslog messages to the syslog jail:
*.* @10.0.0.1
Postgres jail
I don't need a public v4 IP for this jail, so I configure it with an RFC1918 v4 IP on a loopback interface, and of course a real IPv6 address. I add an AAAA record in DNS for the v6 IP so I have something to point the clients at.
I install the latest Postgres server port, at the time of writing that is databases/postgresql93-server
. But before I can run /usr/local/etc/rc.d/postgresql initdb
I need to permit the use of SysV shared memory in the jail. This is done in the ezjail config file for the jail, in the _parameters line. I need to add allow.sysvipc=1
so I change the line from:
export jail_postgres_kush_tyknet_dk_parameters=""
to:
export jail_postgres_kush_tyknet_dk_parameters="allow.sysvipc=1"
While I'm there I also change the jail's # PROVIDE:
line to:
# PROVIDE: postgres
And of course the standard # REQUIRE:
line:
# REQUIRE: dns syslog
After restarting the jail I can run initdb
and start Postgres. When a jail needs a database I need to:
- Add a DB user (with the createuser -P someusername command)
- Add a database with the new user as owner (createdb -O someusername somedbname)
- Add permissions in /usr/local/pgsql/data/pg_hba.conf
- Open a hole in the firewall so the jail can reach the database on TCP port 5432
- Add postgres to the # REQUIRE: line in the ezjail config file
Web Jail
I need a public V4 IP for the web jail and I also give it a V6 IP. Since I use a different V6 IP per website, I will need additional v6 addresses when I start adding websites. I add the v6 addresses to the web jail in batches of 10 as I need them. After creating the jail and bootstrapping the ports collection I install security/openssl
and www/nginx
and configure it. More on that later.
Tor relay jail
The Tor relay needs a public IP. It also needs the security/openssl
and security/tor
ports built (from ports, not packages, to ensure Tor is built with a recent OpenSSL to speed up ECDH). I put the following into the /usr/local/etc/tor/torrc
file:
Log notice file /var/log/tor/notices.log ORPort 443 NoListen ORPort 9090 NoAdvertise Address torrelay.bong.tyknet.dk Nickname TykRelay01 ContactInfo Thomas Steen Rasmussen / Tykling <thomas@gibfest.dk> (PGP: 0x772FF77F0972FA58) DirPort 80 NoListen DirPort 9091 NoAdvertise ExitPolicy reject *:*
I change the Address
and Nickname
depending on the server.
A few steps (that should really be done by the port) are needed here:
sudo rm -rf /var/db/tor /var/run/tor sudo mkdir -p /var/db/tor/data /var/run/tor /var/log/tor sudo chown -R _tor:_tor /var/db/tor /var/log/tor /var/run/tor sudo chmod -R 700 /var/db/tor
I also add the following line to the jail host's /etc/sysctl.conf
to make it impossible to predict IP IDs from the server:
net.inet.ip.random_id=1
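/etc/sysctl.conf is only read during boot, so I also apply the setting immediately:

```shell
# apply the sysctl right away, without rebooting
sudo sysctl net.inet.ip.random_id=1
```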
Finally I redirect TCP ports 9090 and 9091 to ports 443 and 80 in the jail in /etc/pf.conf
:
$ grep tor /etc/pf.conf torv4="85.235.250.88" torv6="2a01:3a0:1:1900:85:235:250:88" rdr on $if inet proto tcp from any to $torv4 port 443 -> $torv4 port 9090 rdr on $if inet6 proto tcp from any to $torv6 port 443 -> $torv6 port 9090 rdr on $if inet proto tcp from any to $torv4 port 80 -> $torv4 port 9091 rdr on $if inet6 proto tcp from any to $torv6 port 80 -> $torv6 port 9091 pass in quick on { $if, $jailif } proto tcp from any to { $torv4 $torv6 } port { 9090, 9091 } $
ZFS snapshots and backup
So, since all this is ZFS based, there are a few tricks I do to make it easier to restore data in case of accidental file deletion or other data loss.
Periodic snapshots using sysutils/zfs-periodic
sysutils/zfs-periodic
is a little script that uses the FreeBSD periodic(8)
system to make snapshots of filesystems with regular intervals. It supports making hourly snapshots with a small change to periodic(8)
, but I've settled for daily, weekly and monthly snapshots on my servers.
After installing sysutils/zfs-periodic
I add the following to /etc/periodic.conf
:
#daily zfs snapshots daily_zfs_snapshot_enable="YES" daily_zfs_snapshot_pools="tank gelipool" daily_zfs_snapshot_keep=7 daily_zfs_snapshot_skip="gelipool/backups" #weekly zfs snapshots weekly_zfs_snapshot_enable="YES" weekly_zfs_snapshot_pools="tank gelipool" weekly_zfs_snapshot_keep=5 weekly_zfs_snapshot_skip="gelipool/backups" #monthly zfs snapshots monthly_zfs_snapshot_enable="YES" monthly_zfs_snapshot_pools="tank gelipool" monthly_zfs_snapshot_keep=6 monthly_zfs_snapshot_skip="gelipool/backups" #monthly zfs scrub daily_scrub_zfs_enable="YES" daily_scrub_zfs_default_threshold=30
Note that the last bit also enables a periodic zfs scrub: the daily periodic job starts a scrub whenever the last one is more than 30 days old, so in effect a monthly scrub. Remember to change the pool names, and set the number of snapshots to retain to something appropriate. These things are always a tradeoff between disk space and safety. Think it over and find some values that make you sleep well at night :)
After this has been running for a few days, you should have a bunch of daily snapshots:
$ zfs list -t snapshot | grep gelipool@ gelipool@daily-2012-09-02 0 - 31K - gelipool@daily-2012-09-03 0 - 31K - gelipool@daily-2012-09-04 0 - 31K - gelipool@daily-2012-09-05 0 - 31K -
Back-to-back ZFS mirroring
I am lucky enough to have more than one of these jail hosts, which is the whole reason I started writing down how I configure them. One of the advantages to having more than one is that I can configure zfs send/receive
jobs and make server A send its data to server B, and vice versa.
Introduction
The concept is pretty basic, but as it often happens, security considerations turn what was a simple and elegant idea into something... else. To make the back-to-back backup scheme work without sacrificing too much security, I first make a jail on each jailhost called backup.jailhostname
. This jail will have control over a designated zfs dataset which will house the backups sent from the other server.
Create ZFS dataset
First I create the zfs dataset:
$ sudo zfs create cryptopool/backups $ sudo zfs set jailed=on cryptopool/backups
'jail' the new dataset
I create the jail like I normally do, but after creating it, I edit the ezjail config file and tell it which extra zfs dataset to use:
<pre>
$ grep dataset /usr/local/etc/ezjail/backup_glas_tyknet_dk
export jail_backup_glas_tyknet_dk_zfs_datasets="cryptopool/backups"
</pre>
This makes ezjail run the <code>zfs jail</code> command with the proper jail id when the jail is started.
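If the dataset needs to be handed to a running jail by hand (for example while testing), the same thing can be done manually. A sketch, where the jail name <code>backup</code> is just an example:

```shell
# look up the jail id of the running jail (the jail name is an assumption)
jid=$(jls -j backup jid)

# delegate the dataset to the jail, then mount its filesystems from inside
zfs jail "$jid" cryptopool/backups
jexec "$jid" zfs mount -a
```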
jail sysctl settings
I also add the following to the jail's ezjail config:
<pre>
# grep parameters /usr/local/etc/ezjail/backup_glas_tyknet_dk
export jail_backup_glas_tyknet_dk_parameters="allow.mount.zfs=1 enforce_statfs=1"
</pre>
Configuring the backup jail
The jail is ready to run now, and inside the jail a <code>zfs list</code> looks like this:
<pre>
$ zfs list
NAME                 USED  AVAIL  REFER  MOUNTPOINT
cryptopool          3.98G  2.52T    31K  none
cryptopool/backups    62K  2.52T    31K  none
$
</pre>
I don't want to open up root ssh access to this jail, but the remote servers need to call <code>zfs receive</code>, which requires root permissions. <code>zfs allow</code> to the rescue! <code>zfs allow</code> makes it possible to say "user X is permitted to do action Y on dataset Z", which is exactly what I need here. In the backup jail I add a user called <code>tykbackup</code> which will be used as the user receiving the zfs snapshots from the remote servers.
I then run the following commands to allow the user to work with the dataset:
<pre>
$ sudo zfs allow tykbackup atime,compression,create,mount,mountpoint,readonly,receive cryptopool/backups
$ sudo zfs allow cryptopool/backups
---- Permissions on cryptopool/backups -------------------------------
Local+Descendent permissions:
        user tykbackup atime,compression,create,mount,mountpoint,readonly,receive
$
</pre>
Testing if it worked:
<pre>
$ sudo su tykbackup
$ zfs create cryptopool/backups/test
$ zfs list cryptopool/backups/test
NAME                      USED  AVAIL  REFER  MOUNTPOINT
cryptopool/backups/test    31K  2.52T    31K  none
$ zfs destroy cryptopool/backups/test
cannot destroy 'cryptopool/backups/test': permission denied
$
</pre>
Since the user <code>tykbackup</code> was not given the <code>destroy</code> permission on the <code>cryptopool/backups</code> dataset, I get permission denied (as expected) when trying to destroy <code>cryptopool/backups/test</code>. Works like a charm.
To allow automatic SSH operations I add the public ssh key for the root user of the server being backed up to <code>/usr/home/tykbackup/.ssh/authorized_keys</code>:
<pre>
$ cat /usr/home/tykbackup/.ssh/authorized_keys
from="ryst.tyknet.dk",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty,command="/usr/home/tykbackup/zfscmd.sh $SSH_ORIGINAL_COMMAND" ssh-rsa AAAAB3......KR2Z root@ryst.tyknet.dk
</pre>
The script called <code>zfscmd.sh</code> is placed on the backup server to allow the ssh client to issue different command line arguments depending on what needs to be done. The script is very simple:
<pre>
#!/bin/sh
shift
/sbin/zfs $@
exit $?
</pre>
A few notes: aside from restricting the command this SSH key can run, I've restricted it so it can only log in from the IP of the server being backed up. These are very basic restrictions that should always be in place, no matter what kind of backup you are using.
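A possible extra restriction (my own sketch, not part of the setup above) is to whitelist the zfs subcommands the wrapper will pass through, so the key cannot be used to run e.g. <code>zfs destroy</code>. The subcommand list below is an assumption; extend it if your setup needs more:

```shell
#!/bin/sh
# allowed: succeed only for the zfs subcommands the backup scheme uses
# (list, create and receive are what the periodic script below calls)
allowed() {
    case "$1" in
        list|create|receive) return 0 ;;
        *) return 1 ;;
    esac
}

# in zfscmd.sh the check would go right after the shift:
#   allowed "$1" || { echo "subcommand '$1' not allowed" >&2; exit 1; }
#   exec /sbin/zfs "$@"

# quick demonstration of the check itself:
allowed receive && echo "receive: ok"        # prints receive: ok
allowed destroy || echo "destroy: rejected"  # prints destroy: rejected
```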
Add the periodic script
I then add the script <code>/usr/local/etc/periodic/daily/999.zfs-mirror</code> to each server being backed up, with the following content:
<pre>
#!/bin/sh
#set -x

### check pidfile
if [ -f /var/run/$(basename $0).pid ]; then
    echo "pidfile /var/run/$(basename $0).pid exists, bailing out"
    exit 1
fi
echo $$ > /var/run/$(basename $0).pid

### If there is a global system configuration file, suck it in.
if [ -r /etc/defaults/periodic.conf ]; then
    . /etc/defaults/periodic.conf
    source_periodic_confs
fi

case "$daily_zfs_mirror_enable" in
    [Yy][Ee][Ss])
        ;;
    *)
        exit
        ;;
esac

pools=$daily_zfs_mirror_pools
if [ -z "$pools" ]; then
    pools='tank'
fi

targethost=$daily_zfs_mirror_targethost
if [ -z "$targethost" ]; then
    echo '$daily_zfs_mirror_targethost must be set in /etc/periodic.conf'
    exit 1
fi

targetuser=$daily_zfs_mirror_targetuser
if [ -z "$targetuser" ]; then
    echo '$daily_zfs_mirror_targetuser must be set in /etc/periodic.conf'
    exit 1
fi

targetfs=$daily_zfs_mirror_targetfs
if [ -z "$targetfs" ]; then
    echo '$daily_zfs_mirror_targetfs must be set in /etc/periodic.conf'
    exit 1
fi

if [ -n "$daily_zfs_mirror_skip" ]; then
    egrep="($(echo $daily_zfs_mirror_skip | sed "s/ /|/g"))"
fi

### get todays date for later use
tday=$(date +%Y-%m-%d)

### check if the destination fs exists
ssh ${targetuser}@${targethost} /usr/home/tykbackup/zfscmd.sh list ${targetfs} > /dev/null 2>&1
if [ $? -ne 0 ]; then
    echo "Creating destination fs on target server"
    echo ssh ${targetuser}@${targethost} /usr/home/tykbackup/zfscmd.sh create ${targetfs}
    ssh ${targetuser}@${targethost} /usr/home/tykbackup/zfscmd.sh create ${targetfs}
fi

echo -n "Doing daily ZFS mirroring - "
date

### loop through the configured pools
for pool in $pools; do
    echo " Processing pool $pool ..."

    ### enumerate datasets with daily snapshots from today
    if [ -n "$egrep" ]; then
        datasets=$(zfs list -t snapshot -o name | grep "^$pool[\/\@]" | egrep -v "$egrep" | grep "@daily-$tday")
    else
        datasets=$(zfs list -t snapshot -o name | grep "^$pool[\/\@]" | grep "@daily-$tday")
    fi
    echo "found datasets: $datasets"

    for snapshot in $datasets; do
        dataset=$(echo -n $snapshot | cut -d "@" -f 1)
        echo "working on dataset $dataset"

        ### find the latest daily snapshot of this dataset on the remote node, if any
        echo ssh ${targetuser}@${targethost} /usr/home/tykbackup/zfscmd.sh list -t snapshot \| grep "^${targetfs}/${dataset}@daily-" \| cut -d " " -f 1 \| tail -1
        lastgoodsnap=$(ssh ${targetuser}@${targethost} /usr/home/tykbackup/zfscmd.sh list -t snapshot | grep "^${targetfs}/${dataset}@daily-" | cut -d " " -f 1 | tail -1)
        if [ -z $lastgoodsnap ]; then
            echo "No remote daily snapshot found for local daily snapshot $snapshot - cannot send incremental - sending full backup"
            zfs send -v $snapshot | mbuffer | ssh ${targetuser}@${targethost} /usr/home/tykbackup/zfscmd.sh receive -v -F -u $targetfs/$dataset
            if [ $? -ne 0 ]; then
                echo " Unable to send full snapshot of $dataset to $targetfs on host $targethost"
            else
                echo " Successfully sent a full snapshot of $dataset to $targetfs on host $targethost - future sends will be incremental"
            fi
        else
            ### check if this snapshot has already been sent for some reason, skip if so..."
            temp=$(echo $snapshot | cut -d "/" -f 2-)
            lastgoodsnap="$(echo $lastgoodsnap | sed "s,${targetfs}/,,")"
            if [ "$temp" = "$lastgoodsnap" ]; then
                echo " The snapshot $snapshot has already been sent to $targethost, skipping..."
            else
                ### zfs send the difference between latest remote snapshot and todays local snapshot
                echo " Sending the diff between local snapshot $(hostname)@$lastgoodsnap and $(hostname)@$pool/$snapshot to ${targethost}@${targetfs}/${pool} ..."
                zfs send -I $lastgoodsnap $snapshot | mbuffer | ssh ${targetuser}@${targethost} /usr/home/tykbackup/zfscmd.sh receive -v -F -u $targetfs/$dataset
                if [ $? -ne 0 ]; then
                    echo " There was a problem sending the diff between $lastgoodsnap and $snapshot to $targetfs on $targethost"
                else
                    echo " Successfully sent the diff between $lastgoodsnap and $snapshot to $targethost"
                fi
            fi
        fi
    done
done

### remove pidfile
rm /var/run/$(basename $0).pid
</pre>
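The script reads its configuration from <code>/etc/periodic.conf</code>. A minimal example of the entries it expects; the hostname, user and target dataset values are of course placeholders for my particular setup:

```shell
# /etc/periodic.conf entries read by 999.zfs-mirror
daily_zfs_mirror_enable="YES"
daily_zfs_mirror_pools="tank gelipool"
daily_zfs_mirror_targethost="backup.glas.tyknet.dk"
daily_zfs_mirror_targetuser="tykbackup"
daily_zfs_mirror_targetfs="cryptopool/backups"
daily_zfs_mirror_skip="gelipool/backups"
```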
Run the periodic script
I usually do the initial run of the periodic script by hand, so I can catch and fix any errors right away. The script loops over all datasets in the configured pools and <code>zfs send</code>s them, including their snapshots, to the backup server. The next time the script runs it will send an incremental diff instead of the full dataset.
Caveats
This script does not handle deleting datasets (including their snapshots) on the backup server when the dataset is deleted from the server being backed up. You will need to do that manually. This could be considered a feature, or a missing feature, depending on your preferences. :)
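To find datasets that linger on the backup server after being deleted locally, the two dataset lists can be compared. A sketch with sample names; on the real hosts the lists would come from <code>zfs list</code> on each side:

```shell
#!/bin/sh
# orphans LOCAL REMOTE: print dataset names only present in the remote list;
# both files must be sorted, since comm(1) requires sorted input
orphans() {
    comm -13 "$1" "$2"
}

# sample lists; on the real hosts they would come from e.g.
#   zfs list -H -o name -r tank                 (locally)
#   ssh tykbackup@targethost ... list ...       (via zfscmd.sh on the backup jail)
printf 'tank/a\ntank/b\n' | sort > /tmp/local.$$
printf 'tank/a\ntank/b\ntank/gone\n' | sort > /tmp/remote.$$

orphans /tmp/local.$$ /tmp/remote.$$    # prints tank/gone
rm -f /tmp/local.$$ /tmp/remote.$$
```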
Staying up-to-date
I update my ezjail hosts and jails to track -STABLE regularly. This section describes the procedure I use. It is essential that the jail host and the jails use the same world version, or bad stuff will happen.
Updating the jail host
First I update world and kernel of the jail host like I normally would. This is described earlier in this guide, see Ezjail_host#Building_world_and_kernel.
Updating ezjails basejail
To update ezjail's basejail, located in <code>/usr/jails/basejail</code>, I run the same commands as when bootstrapping ezjail, see the section Ezjail_host#Bootstrapping_ezjail.
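Alternatively, ezjail can reuse the world already built on the host. To my knowledge <code>ezjail-admin update -i</code> installs the world built in <code>/usr/src</code> into the basejail, and <code>-b</code> builds it first, but check ezjail-admin(8) on your version before relying on this:

```shell
# install the world already built in /usr/src into /usr/jails/basejail
ezjail-admin update -i

# or build world and install it into the basejail in one go
# ezjail-admin update -b
```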
Running mergemaster in the jails
Finally, to run mergemaster in all jails I use the following script; the script comments should explain the details. When it is finished, the jails can be started:
<pre>
#! /bin/sh

### check if .mergemasterrc exists,
### move it out of the way if so
MM_RC=0
if [ -e /root/.mergemasterrc ]; then
    MM_RC=1
    mv /root/.mergemasterrc /root/.mergemasterrc.old
fi

### loop through jails
for jailname in $(ls -1 /usr/jails/ | grep -Ev "(^basejail$|^newjail$|^flavours$)"); do
    jailroot="/usr/jails/${jailname}"
    echo "processing ${jailroot}:"
    ### check if jailroot exists
    if [ -n "${jailroot}" -a -d "${jailroot}" ]; then
        ### create .mergemasterrc
        cat <<EOF > /root/.mergemasterrc
AUTO_INSTALL=yes
AUTO_UPGRADE=yes
FREEBSD_ID=yes
PRESERVE_FILES=yes
PRESERVE_FILES_DIR=/var/tmp/mergemaster/preserved-files-$(basename ${jailroot})-$(date +%y%m%d-%H%M%S)
IGNORE_FILES="/boot/device.hints /etc/motd"
EOF
        ### remove backup of /etc from previous run (if it exists)
        if [ -d "${jailroot}/etc.bak" ]; then
            rm -rfI "${jailroot}/etc.bak"
        fi
        ### create backup of /etc as /etc.bak
        cp -pRP "${jailroot}/etc" "${jailroot}/etc.bak"
        ### check if mtree from last mergemaster run exists
        if [ ! -e ${jailroot}/var/db/mergemaster.mtree ]; then
            ### delete /etc/rc.d/*
            rm -rfI ${jailroot}/etc/rc.d/*
        fi
        ### run mergemaster for this jail
        mergemaster -D "${jailroot}"
    else
        echo "${jailroot} doesn't exist"
    fi
    sleep 2
done

### if an existing .mergemasterrc was moved out of the way in the beginning, move it back now
if [ ${MM_RC} -eq 1 ]; then
    mv /root/.mergemasterrc.old /root/.mergemasterrc
else
    rm /root/.mergemasterrc
fi

### done, a bit of output
echo "Done. If everything went well the /etc.bak backup folders can be deleted now."
exit 0
</pre>
To restart all jails I run the command <code>ezjail-admin restart</code>.
Replacing a defective disk
I had a broken hard disk in one of my servers this evening. This section describes how I replaced the disk to make everything work again.
Booting into the rescue system
After Hetzner staff physically replaced the disk, my server was unable to boot because the disk that died was the first one on the controller, and the cheap Hetzner hardware is unable to boot from the secondary disk (probably a BIOS restriction). If the other disk had broken, the server would have booted fine and this whole process could have been done with the server running. Anyway, I booted into the rescue system, partitioned the disk, added a bootloader and added the disk to the root zpool. After this I was able to boot the server normally, so the rest of the work was done without the rescue system.
Partitioning the new disk
The following shows the commands I ran to partition the disk:
<pre>
[root@rescue ~]# gpart create -s GPT /dev/ad4
ad4 created
[root@rescue ~]# /sbin/gpart add -b 2048 -t freebsd-boot -s 128 /dev/ad4
ad4p1 added
[root@rescue ~]# gpart add -t freebsd-zfs -s 30G /dev/ad4
ad4p2 added
[root@rescue ~]# gpart add -t freebsd-ufs /dev/ad4
ad4p3 added
[root@rescue ~]# gpart show
=>        34  1465149101  ad6  GPT  (698G)
          34        2014       - free -  (1M)
        2048         128    1  freebsd-boot  (64k)
        2176    62914560    2  freebsd-zfs  (30G)
    62916736  1402232399    3  freebsd-ufs  (668G)

=>        34  1465149101  ad4  GPT  (698G)
          34        2014       - free -  (1M)
        2048         128    1  freebsd-boot  (64k)
        2176    62914560    2  freebsd-zfs  (30G)
    62916736  1402232399    3  freebsd-ufs  (668G)

[root@rescue ~]#
</pre>
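The gpart session above does not show it, but the new disk also needs boot code written to its freebsd-boot partition so the machine can boot from it (the <code>zpool replace</code> output further on hints at the same command):

```shell
# write the protective MBR and the gptzfsboot loader to
# partition index 1 (the freebsd-boot partition) of the new disk
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ad4
```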
Importing the pool and replacing the disk
The next step is importing the zpool (remember <code>altroot=/mnt</code>!) and replacing the defective disk:
<pre>
[root@rescue ~]# zpool import
   pool: tank
     id: 3572845459378280852
  state: DEGRADED
 status: One or more devices are missing from the system.
 action: The pool can be imported despite missing or damaged devices.  The
        fault tolerance of the pool may be compromised if imported.
    see: http://www.sun.com/msg/ZFS-8000-2Q
 config:

        tank                      DEGRADED
          mirror-0                DEGRADED
            11006001397618753837  UNAVAIL  cannot open
            ad6p2                 ONLINE
[root@rescue ~]# zpool import -o altroot=/mnt/ tank
[root@rescue ~]# zpool status
  pool: tank
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-2Q
  scan: scrub repaired 0 in 0h2m with 0 errors on Thu Nov  1 05:00:49 2012
config:

        NAME                      STATE     READ WRITE CKSUM
        tank                      DEGRADED     0     0     0
          mirror-0                DEGRADED     0     0     0
            11006001397618753837  UNAVAIL      0     0     0  was /dev/ada0p2
            ad6p2                 ONLINE       0     0     0

errors: No known data errors
[root@rescue ~]# zpool replace tank 11006001397618753837 ad4p2
Make sure to wait until resilver is done before rebooting.

If you boot from pool 'tank', you may need to update
boot code on newly attached disk 'ad4p2'.

Assuming you use GPT partitioning and 'da0' is your new boot disk
you may use the following command:

        gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 da0
[root@rescue ~]#
[root@rescue ~]# zpool status
  pool: tank
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue Nov 27 00:24:41 2012
        823M scanned out of 3.11G at 45.7M/s, 0h0m to go
        823M resilvered, 25.88% done
config:

        NAME                        STATE     READ WRITE CKSUM
        tank                        DEGRADED     0     0     0
          mirror-0                  DEGRADED     0     0     0
            replacing-0             UNAVAIL      0     0     0
              11006001397618753837  UNAVAIL      0     0     0  was /dev/ada0p2
              ad4p2                 ONLINE       0     0     0  (resilvering)
            ad6p2                   ONLINE       0     0     0

errors: No known data errors
[root@rescue ~]# zpool status
  pool: tank
 state: ONLINE
  scan: resilvered 3.10G in 0h2m with 0 errors on Tue Nov 27 01:26:45 2012
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada0p2  ONLINE       0     0     0
            ada1p2  ONLINE       0     0     0

errors: No known data errors
[root@rescue ~]#
</pre>
Reboot into non-rescue system
At this point I rebooted the machine into the normal FreeBSD system.
Re-create geli partition
To recreate the geli partition on p3 of the new disk, I just follow the same steps as when I originally created it, more info here.
To attach the new geli volume I run <code>geli attach</code> as described here.
Add the geli device to the encrypted zpool
First I check that both geli devices are available, and I check the device name that needs replacing in the <code>zpool status</code> output:
<pre>
[tykling@haze ~]$ geli status
      Name  Status  Components
ada1p3.eli  ACTIVE  ada1p3
ada0p3.eli  ACTIVE  ada0p3
</pre>
<pre>
[tykling@haze ~]$ zpool status gelipool
  pool: gelipool
 state: DEGRADED
status: One or more devices has been removed by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: scrub repaired 68K in 0h28m with 0 errors on Thu Nov  1 05:58:08 2012
config:

        NAME                      STATE     READ WRITE CKSUM
        gelipool                  DEGRADED     0     0     0
          mirror-0                DEGRADED     0     0     0
            18431995264718840299  REMOVED      0     0     0  was /dev/ada0p3.eli
            ada1p3.eli            ONLINE       0     0     0

errors: No known data errors
[tykling@haze ~]$
</pre>
To replace the device and begin resilvering:
<pre>
[tykling@haze ~]$ sudo zpool replace gelipool 18431995264718840299 ada0p3.eli
Password:
[tykling@haze ~]$ zpool status
  pool: gelipool
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue Nov 27 00:53:40 2012
        759M scanned out of 26.9G at 14.6M/s, 0h30m to go
        759M resilvered, 2.75% done
config:

        NAME                        STATE     READ WRITE CKSUM
        gelipool                    DEGRADED     0     0     0
          mirror-0                  DEGRADED     0     0     0
            replacing-0             REMOVED      0     0     0
              18431995264718840299  REMOVED      0     0     0  was /dev/ada0p3.eli/old
              ada0p3.eli            ONLINE       0     0     0  (resilvering)
            ada1p3.eli              ONLINE       0     0     0

errors: No known data errors
[tykling@haze ~]$
</pre>
When the resilver is finished, the system is as good as new.