Friday, March 13, 2015

FreeBSD From the Trenches: Using autofs(5) to Mount Removable Media

This next FreeBSD From the Trenches story come to us from Edward Tomasz Napierała who shares his work on the new FreeBSD automounter.

My big project for 2014 was the new FreeBSD automounter.  Like any proper FreeBSD Foundation sponsored project, it included the usual kind of documentation - man pages and the Handbook chapter.  But there is no document that shows how it works inside, from the advanced system administrator or a power user point of view.

So, here it is.  The article demonstrates how modular the automounter is, and how easy it is to adopt to any mount-related situation you might have, using recently added removable media support as an example.  (And it shows some related mechanisms as a bonus.)

autofs(5) Basics

The purpose of autofs(5) is to mount filesystems on access, in a way that's transparent to the application. In other words, filesystems get mounted when they are first accessed, and then unmounted after some time passes. The application trying to access the filesystem doesn't even notice this event, apart from a slight delay on first access.  It's a mechanism similar to ones available in other systems, in particular to OS X.  It's a completely independent implementation, it's just that OS X is the other operating system I use.

Automounting requires cooperation of four things: the kernel filesystem, autofs.ko, which is responsible, among other things, for "pausing" the application until the filesystem is actually there; the automountd(8) daemon, which is the component that retrieves configuration information from maps (this includes fetching it from remote sources, such as LDAP) and actually mounts the filesystems; the automount(8) utility for various administrative purposes; and then the autounmountd(8) daemon to, well, unmount the filesystems mounted by automountd(8) after a timeout.

Setting it up is fairly simple: you obviously need to have autofs(5) enabled in /etc/rc.conf:
autofs_enable="YES"
And you need to have the autofs(5) daemons running - just like other deamons in FreeBSD those will get started at system bootup if autofs_enable was set; otherwise you need to start them by hand:
# /etc/rc.d/automount start
# /etc/rc.d/automountd start
# /etc/rc.d/autounmountd start
The kernel driver will get loaded automatically, you can see it in kldstat(8) output.

autofs(5) and Removable Media

Note that at the time of this writing, this is only available in FreeBSD 11-CURRENT. This will change soon.

The main configuration file for autofs(5) is /etc/auto_master; you need to uncomment this line:
/media -media -nosuid
This basically says that there is a /media directory, and automount will mount the "-media" map there, and everything that gets mounted there will have the "nosuid" mount option, for security reasons.

If you already had autofs(5) running before uncommenting the line, you must refresh its configuration by running automount(8) as root; run it as "automount -v" for a detailed explanation of what it does.  It looks like this:
# automount -v
automount: parsing auto_master file at "/etc/auto_master"
automount: done parsing "/etc/auto_master"
automount: unmounting stale autofs mounts
automount: skipping /, filesystem type is not autofs
automount: skipping /dev, filesystem type is not autofs
automount: leaving autofs mounted on /net
automount: mounting new autofs mounts
automount: autofs already mounted on /net
automount: nothing mounted on /media; mounting
automount: mounting map -media on /media, prefix "/media", options "nosuid"
If you run mount(8), you will see so called "trigger nodes" of type autofs(5):
# mount
/dev/ada0p2 on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)

Basic usage

With all that done, plug a drive into USB, and here is what happens in a real-world case:
[trasz@brick:~]% ll /media
total 9
drwxr-xr-x  3 root wheel  512 Feb 24 12:54 .
drwxr-xr-x 30 root wheel 1024 Feb 24 12:28 ..
drwxr-xr-x  1 root wheel 4096 Jan  1 1980 ADATA UFD
drwxr-xr-x  3 root wheel  512 Feb 24 12:54 md0
[trasz@brick:~]% cd /media/ADATA\ UFD
[trasz@brick:/media/ADATA UFD]% ll
total 10117
drwxr-xr-x 1 root wheel    4096 Jan  1  1980 .
drwxr-xr-x 3 root wheel     512 Feb 24 12:54 ..
drwxr-xr-x 1 root wheel    4096 Nov 24 00:03 .Spotlight-V100
drwxr-xr-x 1 root wheel    4096 Nov 24 00:03 .Trashes
-rwxr-xr-x 1 root wheel    4096 Nov 24 00:03 ._.Trashes
drwxr-xr-x 1 root wheel    4096 Jan 13 11:24 .fseventsd
drwxr-xr-x 1 root wheel    4096 Nov 22 22:44 Bonus
-rwxr-xr-x 1 root wheel 3309568 Nov 24 14:50 DSC05996.JPG
-rwxr-xr-x 1 root wheel 4063232 Nov 24 14:50 DSC05997.JPG
-rwxr-xr-x 1 root wheel 2953199 Nov 25 21:40 DSC05998.JPG
drwxr-xr-x 1 root wheel    4096 Nov 22 18:24 Meshuggah
drwxr-xr-x 1 root wheel    4096 Nov 22 21:06 System Volume Information
[trasz@brick:/media/ADATA UFD]% mount
/dev/ada0p2 on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
/dev/da0s1 on /media/ADATA UFD (msdosfs, local, nosuid, automounted)
[trasz@brick:/media/ADATA UFD]% cd /
[trasz@brick:/media/ADATA UFD]% sudo automount -u
[trasz@brick:/media/ADATA UFD]% mount
/dev/ada0p2 on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
Two things to notice here: first, the "ADATA UFD" is a factory default
filesystem label on the flash drive.  If there was no filesystem label,
autofs(5) would use device name instead - in this case, that would
be "da0s1".  Second - if you don't want to wait for autounmountd(8)
to unmount the automounted volume, you can use "automount -u".  Or
"automount -fu", if you want to force unmount.

Not So Basic Usage

Take a close look at the directory listing for /media in previous example. Did you notice the "md0" there?  It looks like a device node for memory disk (md(4)), but is a directory.  That's a leftover from my earlier experimentation, and shows an interesting feature of autofs(5)-based automounter: it's not limited to removable media, it can mount everything that's available for mounting.  In this case it's a memory disk (kind of ramdisk, see "man mdconfig").  It can also be an iSCSI lun.  And, of course, a removable media.  How does that work?

GEOM

In FreeBSD, GEOM is a name of what could otherwise be called a block device layer.  It's a piece of code that manages all the "disk-like devices", both physical and virtual: SATA/SAS/FC/NVME/USB drives, memory disks, iSCSI LUNs, partitions, encrypted GELI volumes etc.

GEOM has another meaning: an instance of GEOM class.  The "class" here means the "kind" of device, and the instance is an actual device of that kind. It's easiest to explain it with an example:
# geom disk list
Geom name: cd0
Providers:
1. Name: cd0
   Mediasize: 0 (0B)
   Sectorsize: 2048
   Mode: r0w0e0
   descr: MATSHITA DVD/CDRW UJDA775
   ident: (null)
   fwsectors: 0
   fwheads: 0

Geom name: ada0
Providers:
1. Name: ada0
   Mediasize: 250059350016 (233G)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 0
   Mode: r2w2e3
   descr: Samsung SSD 850 EVO 250GB
   lunid: 5002538da000f602
   ident: S21PNSAFC02149R
   fwsectors: 63
   fwheads: 16

Geom name: da0
Providers:
1. Name: da0
   Mediasize: 7654604800 (7.1G)
   Sectorsize: 512
   Mode: r0w0e0
   descr: ADATA USB Flash Drive
   lunname: USB MEMORY BAR
   lunid: 2020030102060804
   ident: 14A0711312300023
   fwsectors: 63
   fwheads: 255

# geom part list
Geom name: ada0
modified: false
state: OK
fwheads: 16
fwsectors: 63
last: 488397127
first: 34
entries: 128
scheme: GPT
Providers:
1. Name: ada0p1
   Mediasize: 65536 (64K)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 1024
   Mode: r0w0e0
   rawuuid: 42dc1b8b-c49b-11e3-8066-001c257ac65f
   rawtype: 83bd6b9d-7f41-11dc-be0b-001560b84f0f
   label: (null)
   length: 65536
   offset: 17408
   type: freebsd-boot
   index: 1
   end: 161
   start: 34
2. Name: ada0p2
   Mediasize: 236223201280 (220G)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 1024
   Mode: r1w1e1
   rawuuid: 42dc921f-c49b-11e3-8066-001c257ac65f
   rawtype: 516e7cb6-6ecf-11d6-8ff8-00022d09712b
   label: (null)
   length: 236223201280
   offset: 82944
   type: freebsd-ufs
   index: 2
   end: 461373601
   start: 162
3. Name: ada0p3
   Mediasize: 13836045312 (13G)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 1024
   Mode: r1w1e0
   rawuuid: 21a8eef9-a0d4-11e4-ab80-001c257ac65f
   rawtype: 516e7cb5-6ecf-11d6-8ff8-00022d09712b
   label: (null)
   length: 13836045312
   offset: 236223284224
   type: freebsd-swap
   index: 3
   end: 488397127
   start: 461373602
Consumers:
1. Name: ada0
   Mediasize: 250059350016 (233G)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 0
   Mode: r2w2e3

Geom name: da0
modified: false
state: OK
fwheads: 255
fwsectors: 63
last: 14950399
first: 1
entries: 4
scheme: MBR
Providers:
1. Name: da0s1
   Mediasize: 7654576128 (7.1G)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 28672
   Mode: r0w0e0
   rawtype: 12
   length: 7654576128
   offset: 28672
   type: !12
   index: 1
   end: 14950399
   start: 56
Consumers:
1. Name: da0
   Mediasize: 7654604800 (7.1G)
   Sectorsize: 512
   Mode: r0w0e0
See?  I've used the geom(8) command to get the information about two GEOM classes: "disk", and "part".  The first one returned information about three instances of the disk class: the DVD drive, the SSD, and the flash.  The second one returned information on the partitions known to the system. Everything that is potentially mountable - a physical disk, a partition, encrypted ELI volume, multipath device, RAID3 volume, memory disks, even volume labels - it all has its GEOM class and can be queried in a similar way.  To see all the GEOM instances in the running system, use:
# sysctl kern.geom.conftxt
Now, notice the "Mode" lines.  Like the one for ada0: "r2w2e3".  Those are three usage counters for ada0 GEOM: read, write, and exclusive.  They are non-zero, because ada0 is used: there are three partitions on it; three instances of PART GEOM class hold it opened.  The partitions, just like any other GEOM nodes, have their own counters.  Take a look at the first one, ada0p1: the mode there is "r0w0e0".  This means it's not open by anything.  It's, in other words, available for mounting.  If you check the MD geom class:
# geom md list 
Geom name: md0 
Providers: 
1. Name: md0
   Mediasize: 1073741824 (1.0G) 
   Sectorsize: 512
   Mode: r0w0e0
   type: swap
   access: read-write
   compression: off
   length: 1073741824
   fwsectors: 0
   fwheads: 0
   unit: 0
You will see the same thing: it's not opened.  That's the first thing the autofs(5) "-media" map checks for: zero access counts; if the counts are not zero, it means the node is used by something: it's either mounted (like ada0p2, mounted on /), or there is something "on top of it" - like ada0.

But why there is no /media/ada0p1?  Because it's not mountable; there is no filesystem there.  It's a boot loader partition.  How does autofs(5) figure it out?

fstyp(8)

Before we can do anything with a filesystem, we need to determine what kind of filesystem it is - and whether it actually is a supported filesystem in the first place.  That means we need a piece of code that can take a look at it and determine if it has a format it recognizes.

It is possible to use file(1) for this, eg:
# file -s /dev/md0
Vermaden's sysutils/automount port uses this approach.  There are a few problems with doing it this way, though.  First, the output, for a typical FAT filesystem, looks like this:
/dev/md0: DOS/MBR boot sector, code offset 0x3c+2, \
 OEM-ID "BSD4.4  ", sectors/cluster 32, root entries \
 512, sectors/FAT 256, sectors/track 63, heads 255, \
 sectors 2097144 (volumes > 32 MB) , serial number \
 0x668a120e, unlabeled, FAT (16 bit)
It's not particularly easy to parse.  It's even harder to extract the volume label.

Second, file(1) can recognize all kinds of file types, from JPEG to 6502 assembly.  This means that if there are some strange data on the removable media, instead of a filesystem we expect, the file(1) will output something our script wasn't tested against, making the first problem even harder.

Third, file(1) had its share of security bugs, eg CVE-2014-1943, CVE-2014-9620, or CVE-2014-3710.

For this reason I've decided the proper fix would be to just write a new utility. The strange name - "fstyp" - comes from the utility of the same name, installed by default on Solaris, IRIX, OS X, and perhaps most other UNIX systems.

The fstyp(8) addresses the file(1) issues: the output is easily parsable (just a filesystem name, one word), it only recognizes filesystems supported by FreeBSD, and uses Capsicum sandboxing to make sure that even if there is a vulnerability, its impact is limited to incorrectly reporting the filesystem type.  It's a good topic for another article, but in short - in FreeBSD, every process can enter what's called a "capability mode". It's one way - a process can enter it, but there is no way to exit it.  Child processes inherit the mode. In capability mode, kernel will deny all attempts to open new files, create sockets, attach the shared memory segments etc, but the process is pretty much free to do anything it likes with the file descriptors it already had opened before entering the capability mode - and it can receive other file descriptors over a UNIX socket.  So, the fstyp(8) utility opens the device file, then calls cap_enter(2), which switches it into capability mode, and then continues execution, reading from the device to determine what's there.  Should it be compromised, it won't be able to execute /bin/sh, it won't be able to open a socket to transmit the data to some external host, etc.

The "-media" Map

Those are the components underneath the autofs(5), but how does it fit together? Let's start with the actual map.  In FreeBSD, special maps (the ones with names starting with "-") are just executables in /etc/autofs/:
# ls -al /etc/autofs
total 36
drwxr-xr-x   2 root  wheel  512 Feb 14 21:18 .
drwxr-xr-x  25 root  wheel 3072 Feb 24 11:22 ..
-rwxr-xr-x   1 root  wheel 1010 Oct 17 11:26 include_ldap
-rwxr-xr-x   1 trasz wheel   43 Aug 17  2014 include_nis
-rwxr-xr-x   1 root  wheel  367 Oct 17 11:26 special_hosts
-rwxr-xr-x   1 root  wheel 2294 Dec  6 10:15 special_media
-rwxr-xr-x   1 root  wheel  355 Feb 14 21:17 special_noauto
-rwxr-xr-x   1 root  wheel   97 Oct 17 11:26 special_null
-rwxr-xr-x   1 root  wheel  357 Aug 22  2014 special_smb
See the special_media?  That's the one.  It's a shell script.  The reason it's in /etc is that the system administrator can modify it if required, or add new special maps.

Now, let's try to run it by hand, as root:
# /etc/autofs/special_media
ADATA UFD
md0 

# /etc/autofs/special_media md0
-fstype=msdosfs,nosuid  :/dev/md0
That's exactly how automountd(8) uses it, after the kernel component notifies it that it needs the /media directory taken care of.  It's described in more detail in the auto_master(5) manual page.  The shell script is pretty well commented, and I don't think there is any point in explaining it here.

Bottom line:
the core autofs itself doesn't know anything about removable devices; the special map "-media" does: it queries GEOM for the list of all disk-like nodes that are not in use, and then uses fstyp(8) to determine whether they contain a useful filesystem.  UNIX.  Modularity.  Plain text.  ;-)

Cache

Now, let's create a second memory disk, 1GB in size (the "1g" below) to see if it all works as intended:
# mdconfig -s1g
md1
# newfs_msdos /dev/md1
newfs_msdos: cannot get number of sectors per track: \
 Operation not supported
newfs_msdos: cannot get number of heads: \
 Operation not supported
newfs_msdos: trim 8 sectors to adjust to a multiple of 63
/dev/md1: 2096576 sectors in 65518 FAT16 clusters \
 (16384 bytes/cluster)
BytesPerSec=512 SecPerClust=32 ResSectors=1 FATs=2 \
 RootDirEnts=512 Media=0xf0 FATsecs=256 SecPerTrack=63 \
 Heads=255 HiddenSecs=0 HugeSectors=2097144

# ll /media
total 5
drwxr-xr-x   3 root wheel  512 Feb 24 12:25 .
drwxr-xr-x  30 root wheel 1024 Feb 23 09:04 ..
drwxr-xr-x   1 root wheel 4096 Jan  1 1980 ADATA UFD
drwxr-xr-x   3 root wheel  512 Feb 24 12:25 md0
Whoops.  Where is /media/md1?

There is one more mechanism for the whole thing to work correctly: the autofs(5) cache needs to be dealt with.

The first paragraph mentioned that it's automountd(8) that does all the map parsing - including running the /etc/autofs/special_media - and actual mounting. Doing it every time someone accesses the /media directory - or any directory, for that matter - would kill performance.  For this reason, after the kernel component asks the automound(8) to do its magic, it doesn't do that again until some time later.  In most cases it doesn't matter - the list of NFS exports for a given host doesn't change too often - but in case of removable media it's not acceptable.  The cache needs to be flushed, using "automount -c".  After that, the subsequent lookup in /media will trigger automountd(8), which will query the devices list and refresh the directory contents.

This obviously needs to happen automatically.  And if you actually went and opened /etc/auto_master in a text editor, you would have noticed this:
# When using the -media special map, make sure to edit devd.conf(5)
# to move the call to "automount -c" out of the comments section.
The devd(8) is a daemon responsible for listening for notifications from the kernel and running whatever is configured in its config, /etc/devd.conf. There are all kinds of things there, from running utilities to upload firmware for various USB devices, to launch moused(8) when a mouse gets connected, to switching power profiles, to... discarding autofs(5) caches.  It looks like this:
notify 100 {
   match "system" "GEOM";
   match "subsystem" "DEV";
   action "/usr/sbin/automount -c";
}
If you do "man devd.conf", you will see the description of those events. Note that, just like the "-media" map works the same way for flash drives and encrypted volumes over multipath over iSCSI, this mechanism does not care about any specific hardware either.

Caveats

Two, really.  First: you need to run 11-CURRENT.  Second: the nodes in /media never disappear.  I expect to merge this support to 10-STABLE after the second issue is addressed.

3 comments:

  1. I like to use sysutils/automount, but I think it's great the fstyp(8) command given its development based on capsicum. I wish and add parts of this publication to the FreeBSD Handbook, when 11.0-RELEASE arrives.

    ReplyDelete
  2. Note that both fstyp(8) and GEOM devd notifications could be used by sysutils/automount too - for now it still uses file(1) and devd rules to match specific names instead.

    ReplyDelete
  3. Во FreeBSD 10.1-STABLE под Xfce4 4.12 использую automount-1.5.7. При подключении внешнего винчестера к USB-порту почему-то запускается второй экземпляр Xfce. Выхожу из него, попадаю в первый, в Thunar вижу подключенную партицию и работаю с ним обыным образом. Отмонтируется нормально.
    Очень напрягает запуск второго экземпляра Xfce. Как это вылечить?

    ReplyDelete