Out of tree file systems

This page is a list of stuff to take note of if you're maintaining an out-of-tree file system for NetBSD. It may not be complete, but it should list at least everything I've changed around while hacking up the VFS layer.

From netbsd-6 to -7

6.99.16 December 2012
The bread() and breadn() functions were changed so they never return a buffer on error. Callers that used to call brelse() on the error path must be changed to no longer do so.
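The caller-side change can be sketched in userland as follows. This is only an illustration of the control flow: mock_bread(), mock_brelse(), the reduced struct buf, and myfs_read_block() are all hypothetical stand-ins, and the real bread() takes more arguments than shown here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, reduced stand-in for the kernel's struct buf. */
struct buf {
	char b_data[512];
};

static int mock_brelse_calls;	/* counts releases, for illustration only */

/*
 * Hypothetical stand-in for bread(). It follows the new contract:
 * on error it returns nonzero and does NOT hand back a buffer.
 */
static int
mock_bread(int blkno, struct buf **bpp)
{
	*bpp = NULL;
	if (blkno < 0)
		return EIO;
	*bpp = calloc(1, sizeof(**bpp));
	return (*bpp == NULL) ? ENOMEM : 0;
}

/* Hypothetical stand-in for brelse(). */
static void
mock_brelse(struct buf *bp)
{
	mock_brelse_calls++;
	free(bp);
}

/* A caller written for the new semantics. */
static int
myfs_read_block(int blkno, char *first_byte)
{
	struct buf *bp;
	int error;

	error = mock_bread(blkno, &bp);
	if (error) {
		/* Old code called brelse(bp) here; that call must go. */
		return error;
	}
	assert(bp != NULL);
	*first_byte = bp->b_data[0];
	mock_brelse(bp);	/* release only on the success path */
	return 0;
}
```

In a real filesystem the only edit is usually deleting the brelse() call from the error branch.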

6.99.15 November 2012
The interface to the name cache was changed around to disentangle it from struct componentname and namei internals. This is actually two changes committed on top of each other. The first, in typical usage, changes
	if ((error = cache_lookup(vdp, vpp, cnp)) >= 0)
		return (error);
   
to
	if (cache_lookup(vdp, cnp, NULL, vpp)) {
		return *vpp == NULLVP ? ENOENT : 0;
	}
   
that is, the order of the arguments is fixed to put the result parameter last and the sense of the return value changes. The new return value is either true (for a cache hit) or false (for a cache miss); a hit can be either a negative result, in which case the vnode is null, or a positive one, in which case it is not. If your filesystem supports whiteouts, you must fetch an additional "iswhiteout" result and update cnp->cn_flags; see the changes in ufs for details. Most other filesystems can just pass NULL.
The second change passes various members of struct componentname to cache operations instead of the componentname pointer itself. For example, the above call becomes
	if (cache_lookup(vdp, cnp->cn_nameptr, cnp->cn_namelen,
		cnp->cn_nameiop, cnp->cn_flags, NULL, vpp)) {
		return *vpp == NULLVP ? ENOENT : 0;
	}
   
and the changes to other namecache calls are similarly mechanical.

6.99.13 October 2012
Changes to namei were added to support the openat() family of system calls. You shouldn't need to change your filesystem for this, but it is worth noting in case something breaks.

6.99.10 July 2012
The semantics of cache_enter() were changed slightly; in particular, three tests that filesystems were previously supposed to perform themselves to see if cache_enter() should be skipped are now done within cache_enter(). If your filesystem makes any or all of these checks itself before calling cache_enter(), please remove them. Note that most of the existing filesystems checked an arbitrary subset of these conditions, not all of them, so if you copied the logic from somewhere you probably don't have all three tests. Note: this change was also quietly added to netbsd-6 during the beta period and is thus in 6.0.

6.99.7 May 2012
A genfs_rename operation was added to handle locking for rename. You are strongly urged to convert your filesystem's rename locking to use genfs_rename. Most home-rolled solutions are incorrect one way or another. (And recall that ffs had been wrong for years, and we've had to fix zfs too.)

6.99.4 March 2012
A bunch of kauth-related changes went in, some pertaining to vnodes and filesystems. Check your favorite filesystem for how to adapt, and beware because a number of the initial changes were incorrect. If in doubt, ask tech-kern. (Sorry, better documentation is not really available.)

From netbsd-5 to -6

6.0_BETA2 August 2012
The VFS change described above under 6.99.10 (adjustments to cache_enter) was pulled up to the netbsd-6 branch owing to accident/miscommunication; it was decided to keep it rather than revert it because other more desirable changes depended on it and the backward compatibility risks were minor.

5.99.62 (and .63) January 2012
New quota code (again). If you were for some reason trying to support the old ufs-only quota interfaces, or the recent proplib-based quota interface, or any quotas at all, you'll need to do some hacking. Ask tech-kern for help.
Note: it was discovered in November 2012 that the auto-generated prototype for vfs_quotactl() implementations in VFS_PROTOS() was wrong. This slipped by because the only implementation of vfs_quotactl is ufs_quotactl, which isn't covered by a VFS_PROTOS(). The fix will be pulled up to the 6.0_STABLE branch and should be in 6.1.

5.99.56 September 2011
The handling of NAME_MAX was tidied up. Currently NAME_MAX is 511, but filenames cannot actually be longer than 255 characters. If your filesystem uses NAME_MAX, please change it to use a constant belonging to your filesystem instead (e.g. MYFS_NAME_MAX or MYFS_MAXNAMLEN), choose the value of this constant based on the capability of the file system, and, for futureproofing, enforce the limit in VOP_LOOKUP.
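A minimal sketch of such a check, compilable in userland. The struct componentname here is a reduced stand-in carrying only the two members used, the value 255 is just an example, and myfs_check_namelen() is a hypothetical helper that a lookup routine would call before searching the directory.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-filesystem limit; choose it from the on-disk format. */
#define MYFS_MAXNAMLEN	255

/* Reduced stand-in for the kernel's struct componentname. */
struct componentname {
	const char *cn_nameptr;	/* name being looked up */
	size_t cn_namelen;	/* length of that name */
};

/* Reject over-long names using the filesystem's own limit, not NAME_MAX. */
static int
myfs_check_namelen(const struct componentname *cnp)
{
	if (cnp->cn_namelen > MYFS_MAXNAMLEN)
		return ENAMETOOLONG;
	return 0;
}
```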

5.99.55 July 2011:
VOP_BWRITE now takes a vnode as its first argument like all other VOPs. All occurrences of VOP_BWRITE(bp) should be changed to VOP_BWRITE(bp->b_vp, bp), and references to layer_bwrite() can be removed.

5.99.53 June 2011:
UVM locking changed; minor effects on some filesystems, particularly layers. Check genfs for examples.

5.99.51 April 2011:
vflushbuf() can now return an error. Make sure you check for it.

5.99.50 April 2011:
VOP_LINK changed: filesystems are now no longer responsible for checking for cross-device links or links to directories; the FS-independent code does that now. It's recommended on general principles that you KASSERT these properties.
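The recommended assertions might look roughly like this. The sketch builds in userland, so KASSERT is mapped onto assert(), and struct vnode, struct mount, and enum vtype are cut down to the members used; myfs_link_checks() is a hypothetical name for the top of your link routine.

```c
#include <assert.h>
#include <stddef.h>

/* Userland stand-in; in the kernel KASSERT comes from the system headers. */
#define KASSERT(e)	assert(e)

/* Reduced stand-ins for the kernel types. */
enum vtype { VNON, VREG, VDIR };
struct mount { int dummy; };
struct vnode {
	enum vtype v_type;
	struct mount *v_mount;
};

/*
 * The FS-independent code has already rejected these cases, so the
 * filesystem only asserts them instead of returning an error.
 */
static int
myfs_link_checks(struct vnode *dvp, struct vnode *vp)
{
	KASSERT(vp->v_type != VDIR);		/* no links to directories */
	KASSERT(dvp->v_mount == vp->v_mount);	/* no cross-device links */
	return 0;
}
```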

5.99.48 March 2011:
New quota code. If you were for some reason trying to support the old ufs-only quota interfaces, or any quotas at all, you'll need to do some hacking. Ask tech-kern for help.

5.99.43 January 2011:
SAVESTART is history. If you were calling relookup with SAVESTART set (that is, calling relookup in your fs's rename code without explicitly clearing SAVESTART from cn_flags) the directory vnode will no longer gain an extra reference from the relookup call and you need to adjust the reference counting accordingly. If you were explicitly clearing SAVESTART before calling relookup you can prune the code that clears SAVESTART. (Most but not all existing fs code explicitly cleared SAVESTART.) Also make sure that your rename code doesn't drop the last reference to either of the directory vnodes (either to or from) and expect it to magically remain valid. Some legacy code did that, and SAVESTART was apparently originally invented as a workaround. Or something. The signature of relookup was changed: it now needs an extra dummy integer argument, to make sure all code calling relookup gets examined. Consider fixing your locking so you don't need to use relookup. (If on the other hand you have FS code that was setting SAVESTART or making namei calls and using ni_startdir, contact me or ask tech-kern for advice.)

5.99.41 November 2010:
The SAVENAME and HASBUF namei flags have been removed. There is now always a buffer (so HASBUF would be always true) and the pathname in struct componentname is always valid in VOP operations. The buffer is no longer exposed as cn_pnbuf. You can safely remove all logic from your FS that frees cn_pnbuf or sets SAVENAME, and any HASBUF-based logic that remains can be made unconditional. If you were using cn_pnbuf for other purposes, contact me or ask tech-kern for advice.

5.99.40 November 2010:
struct pathbuf was added and the signature of NDINIT() changed to require a pathbuf rather than a string and uio_seg. See pathbuf(9) and namei(9), and example code all over the kernel. Calls to namei_simple_* are not affected.

5.99.38 July 2010:
The VI_FREEING vnode flag was killed off. On-disk inodes should be freed in the reclaim routine. See the ffs code.

5.99.34 July 2010:
vlockmgr() was killed off. Uses of vlockmgr() in file systems should be replaced with VOP_LOCK or VOP_UNLOCK.

5.99.32 June 2010:
The flags argument to VOP_UNLOCK was removed as it served no purpose.

5.99.31 June 2010:
Vnode locks are no longer allowed to be recursive. Hopefully your FS wasn't relying on this.

5.99.30 June 2010:
Layered FSes now pass the locking ops down to the leaf FS. The v_vnlock member of struct vnode is no longer used. If you have a layer FS, check the nullfs diffs; if not, you shouldn't be affected.

5.99.19 September 2009:
The VFS-level lookup() function was actually an abusive interface used only by nfsd. It has been removed; on the off chance you have a network FS that was using it, please look into more appropriate ways of calling namei and if necessary get in touch with me or post on tech-kern. Please don't use the private interfaces currently exposed for nfsd as they aren't meant to be stable.

January 2010:
The VATTR_NULL, VREF, VHOLD, and HOLDRELE vnode macros were killed off. Use the lowercase function forms.

5.99.15 June 2009:
The functions namei_simple_kernel and namei_simple_user were added to cover the most common cases of namei. If your FS calls namei and the usage matches these functions, switching to them will insulate you from upcoming namei interface changes.

February 2009:
ffs+softupdates was trashed and some of its supporting material ripped out along with it, particularly the gross "bioops" function table for FS callbacks for buffer operations. If your FS was relying on this, (1) please post to tech-kern to get started on a better interface and (2) we apologize for allowing this crap to exist in the form it did.

5.99.7 January 2009:
time_t was changed to be 64 bits wide. So was dev_t. Make sure your on-disk structures use explicitly sized types and do not rely on the sizes of system types.
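As an illustration, an on-disk structure along these lines keeps its layout no matter how wide time_t and dev_t are. The structure, its field names, and the helper functions are hypothetical; the static_assert lines need a C11 compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical on-disk inode: every field has an explicit width. */
struct myfs_dinode {
	uint32_t di_mode;
	uint32_t di_nlink;
	int64_t  di_mtime;	/* not time_t */
	uint64_t di_rdev;	/* not dev_t */
	uint64_t di_size;
};

/* Catch accidental layout changes at compile time. */
static_assert(sizeof(struct myfs_dinode) == 32,
    "on-disk inode size changed");
static_assert(offsetof(struct myfs_dinode, di_mtime) == 8,
    "on-disk inode layout changed");

/* Convert between the in-core system type and the on-disk field. */
static void
myfs_set_mtime(struct myfs_dinode *dip, time_t t)
{
	dip->di_mtime = (int64_t)t;
}

static time_t
myfs_get_mtime(const struct myfs_dinode *dip)
{
	return (time_t)dip->di_mtime;
}
```

Keeping the conversions in small helpers like these means a later change to a system type touches one place rather than every reader of the structure.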