XFS online filesystem check and repair

By Jake Edge
June 15, 2023
LSFMM+BPF

Darrick Wong has been doing work on XFS online repair for a number of years and things are getting to the point where most of the filesystem-internal work has been completed and is under review. The work remaining mostly concerns the user-space side to set up a periodic scan and repair cycle, so he wanted to discuss what user space needs from this kind of feature in a filesystem session at the 2023 Linux Storage, Filesystem, Memory-Management and BPF Summit that he led remotely. The session may not have gone quite as he hoped, as it got somewhat derailed by topics that spilled over from the earlier session on unprivileged image mounts.

His current patch set for XFS online repair is "out for review on Dave Chinner's laptop right now", so it is time to start talking about the missing pieces. That means that he will be talking more about user space than he would normally; there is a user-space driver program that controls how often the online fsck mechanism runs. There is nothing yet for notifying user space of problems that were found by an online fsck pass, nor is there a daemon monitoring for notifications to do anything about them, such as to issue repair requests. There is no good infrastructure in the kernel for handling and dispatching such things, he said.

He said that the earlier discussion in the unprivileged-mounts session on using fsck to decide that an image was sound enough to mount made him think that it was a good time to discuss these kinds of issues.

As he noted, there is a command-line program, xfs_scrub, which opens the block device and root directory, then starts issuing the right ioctl() commands, but the real use case is not for running a tool in that fashion. Instead, the idea is that it would do a background check and repair periodically from a systemd service; he is struggling a bit with setting that up, but has something working. It is not, however, much different from the age-old periodic cron job that reports its results to the system log and hopes an administrator is paying attention.
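The cron-style arrangement described above can be sketched as a small wrapper around xfs_scrub. This is only an illustration, not the systemd service Wong is working on: it assumes the `-n` option means "check only, do not repair", as described in xfs_scrub(8), and it treats any nonzero exit status as "needs attention" without decoding it. The names `scrub_command`, `run_scrub`, and the injectable `runner` are hypothetical.

```python
import subprocess
from typing import Callable, Dict, Iterable, List, Sequence


def scrub_command(mountpoint: str, repair: bool = False) -> List[str]:
    """Build an xfs_scrub command line for one mounted filesystem.

    Assumption: "-n" requests a check-only pass (see xfs_scrub(8)).
    """
    cmd = ["xfs_scrub"]
    if not repair:
        cmd.append("-n")
    cmd.append(mountpoint)
    return cmd


def _run(cmd: Sequence[str]) -> int:
    # Run the command and hand back its exit status without raising.
    return subprocess.run(list(cmd), check=False).returncode


def run_scrub(
    mountpoints: Iterable[str],
    runner: Callable[[Sequence[str]], int] = _run,
) -> Dict[str, int]:
    """Check each mountpoint; return {mountpoint: status} for nonzero exits."""
    needs_attention: Dict[str, int] = {}
    for mountpoint in mountpoints:
        status = runner(scrub_command(mountpoint))
        if status != 0:
            needs_attention[mountpoint] = status
    return needs_attention
```

A periodic job could log the returned dictionary, which is exactly the "hope an administrator is paying attention" model that the notification work is meant to improve on.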

He would like to create a notification system that would allow the system to respond dynamically to the events that get reported by the periodic scrubbing. He would also like there to be a way for programs to initiate scrubbing for various reasons, such as a container manager that notices relatively low activity so it kicks off scrubbing on the mounted filesystems. Maybe that could mesh with the unprivileged-mounting use case in some fashion as well, Wong said.
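As a rough sketch of the container-manager case, a policy function might pick which mounted filesystems to scrub when activity is low. Everything here is hypothetical: the `FsState` fields, the utilization metric, and the default thresholds are assumptions made for illustration and were not described in the session.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class FsState:
    mountpoint: str
    io_utilization: float     # hypothetical metric: 0.0 (idle) to 1.0 (saturated)
    hours_since_scrub: float  # time since the last completed scrub


def pick_filesystems_to_scrub(
    states: Iterable[FsState],
    max_utilization: float = 0.10,
    min_hours_between: float = 24.0,
) -> List[str]:
    """Return mountpoints that are quiet and overdue, most overdue first."""
    candidates = [
        s for s in states
        if s.io_utilization <= max_utilization
        and s.hours_since_scrub >= min_hours_between
    ]
    candidates.sort(key=lambda s: s.hours_since_scrub, reverse=True)
    return [s.mountpoint for s in candidates]
```

Keeping the policy separate from the code that actually starts a scrub is one way to let distributions or container managers substitute their own rules.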

So he wondered if any user-space developers had thoughts on how they might want to use this facility. He could continue developing "with my kernel colored glasses on", but he fears that may not produce the best results. There was an effort made to scare up Lennart Poettering, who might have some thoughts on the matter, but who had not made it back to the filesystem room after the coffee break.

Josef Bacik said that he generally relied on people from Fedora and other distributions to give him feedback on features of this sort. The distribution developers often have different ideas on how these things will be used. So, for thoughts on policies that might be applied to the online scrubber, he recommended seeking out people from Linux distributions.

Ted Ts'o replayed some of the earlier discussion around using (offline) fsck to check image files before mounting them. In order to be sure that image files are not modified by user space while the fsck is being done, Ts'o had said that they would need to be copied somewhere inaccessible to user space beforehand. One difference with the in-kernel fsck equivalent that XFS is planning to add, he suggested, might be that the copy/snapshot step would be unnecessary, but that does not really change whether using fsck in that manner is sufficient.

By that point, Poettering had returned so Wong repeated some of what he had said earlier. He said that the work on the online scrubber had quite recently become more urgent because "a lot more distros than the zero I thought there were will actually let you mount XFS filesystems without privilege". There have also been recent efforts in XFS to flag strange problems ("weird-looking metadata or outright bad metadata") that it sees, but that is not connected with fsnotify events (as ext4 is) to notify user space of these kinds of corruption. XFS generally knows exactly what the problem was, which could be encoded in the notification somehow in the hopes that someone is listening and can take appropriate action. For some filesystems that might be to unmount and fsck the filesystem, while XFS could use the online repair facility.

Poettering said that the current practice of having desktops mount removable media automatically is "stupid"; the approach that Chrome OS takes with only mounting certain specific filesystem types (e.g. VFAT), and only through a user-space driver, is much better and one that other desktops should adopt. The desktop use case is generally for USB sticks, and people do not normally put XFS on that kind of media, he said, so those should not be automatically mounted at all.

For mounting filesystem images in containers, though, he thinks trust should come from dm-verity as described in his earlier talk. Ts'o had said that fsck might be sufficient for establishing that an ext4 image would not compromise the kernel, so Poettering wondered if Wong would say the same for XFS. There is a difficult answer to that, Wong said; "as soon as I say 'yes', everybody in the world will watch their fuzzer rigs in order to try to find all of the things that fsck doesn't catch". That said, he generally agrees with Ts'o that fsck, either online or offline, should be robust enough to catch any bad filesystems, but it is not an absolute guarantee since bugs happen.

Poettering noted that the online checking for XFS was not usable for establishing trust, since the filesystem would need to be mounted first. Wong agreed, but wondered about images that had been signed by the distributor. Poettering and Christian Brauner said that signed images are fully trustable or, at least, that it is a user-space problem if they are not. Kent Overstreet said that fsck could not be used to establish trust in any case because a malicious device could change the data out from underneath the check. While that is true for, say, USB devices, the snapshot/copy requirement for a local image file that is getting mounted in a container removes that possibility, Ts'o said.

Overstreet argued that requiring the copy was onerous and unenforceable for users. Instead, he thinks "the responsible thing for us to be doing as filesystem implementers is to start taking it a little bit more seriously to just hardening our code at run time". He said that XFS does a lot of read- and write-time verification of metadata along with fuzzing, as does Overstreet's bcachefs, so "we might not be in as bad a shape as we assume".

Brauner wanted to clarify that the copy and fsck being discussed was not something that is under the user's control, but would be handled by a mount daemon. Overstreet was adamant that it would still be unacceptable to do the copy and "people are going to want to be able to mount images in the cloud untrusted very soon".

Bacik said that the session was "getting off the rails" at that point. He said that Wong is interested in what kinds of notifications would be of interest to user space and how to handle the policy questions around those; Wong agreed with that. Poettering said that he is "not a storage guy" so he does not know what kinds of policies they might want, but he thinks that simply shutting down the affected services when a filesystem they rely on has errors is the safest approach. If systemd were to get a notification of that sort, it could easily be set up to shut down affected services.
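The "shut down affected services" policy can be sketched as a lookup from a corrupted mountpoint to the services that keep data on it. This is a minimal illustration, not how systemd works: the `service_paths` mapping and both function names are hypothetical, and a path is assumed to belong to the deepest mountpoint that contains it.

```python
from pathlib import PurePosixPath
from typing import Dict, Iterable, List, Optional


def owning_mount(path: str, mountpoints: Iterable[str]) -> Optional[str]:
    """Return the deepest mountpoint containing path, or None."""
    target = PurePosixPath(path)
    best: Optional[str] = None
    for mount in mountpoints:
        candidate = PurePosixPath(mount)
        if target == candidate or candidate in target.parents:
            if best is None or len(candidate.parts) > len(PurePosixPath(best).parts):
                best = mount
    return best


def services_to_stop(
    bad_mount: str,
    service_paths: Dict[str, List[str]],
    mountpoints: List[str],
) -> List[str]:
    """List services with at least one path on the corrupted filesystem."""
    return sorted(
        name
        for name, paths in service_paths.items()
        if any(owning_mount(p, mountpoints) == bad_mount for p in paths)
    )
```

A listener that received a corruption notification for a mountpoint could pass the resulting names to whatever stops services on the system, such as `systemctl stop`.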

Ts'o said that those who are running these kinds of services should be consulted about how to handle the events. For example: what do the Kubernetes people actually want? They may want to shut down affected services, but give the services a short time frame to send a "goodbye cruel world" message or similar. The ext4 notifications that Wong mentioned were specifically added for the internal Google Kubernetes-like container manager Borg; the people maintaining those systems wanted to be able to shut down services in the face of filesystem corruption.

Wong said things are a little different working for a large database vendor (Oracle); most of the use of XFS, beyond root filesystems, is for "really large data partitions where we would like to be able to perform at least simple repairs on the 100TB data partition to try to keep the VM running". At any given time, the workload running in the VM or container is probably not accessing the whole 100TB, so there is an opportunity to fix things before the application even notices. "We would at least like to try to grow new engines on the plane while it's flying in order to avoid having to do an emergency landing." Restoring 100TB (or even more) can take a long time, which is best avoided.

Poettering wondered if a mount option that simply instructed XFS to run its online scrubber whenever it detected an anomaly might be a reasonable approach. "Why involve user space to trigger the online filesystem check?" User space is better for performing actions on other parts of the system, such as shutting down relevant services, so it does not really make sense for XFS to notify of a problem and have user space say "go fix yourself". Wong said that he was willing to write an XFS daemon that would receive notifications and schedule scrubbing if need be.
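One piece of such a daemon might be a scheduler that coalesces repeated notifications for the same filesystem and enforces a minimum interval between scrubs. The sketch below is purely illustrative: the `ScrubScheduler` class, its method names, and the one-hour default are assumptions, not part of any existing XFS tool.

```python
from typing import Dict, List, Set


class ScrubScheduler:
    """Coalesce problem notifications and rate-limit scrubs per filesystem."""

    def __init__(self, min_interval: float = 3600.0) -> None:
        self.min_interval = min_interval          # seconds between scrubs
        self._pending: Set[str] = set()           # mountpoints awaiting a scrub
        self._last_scrub: Dict[str, float] = {}   # mountpoint -> last start time

    def notify(self, mountpoint: str) -> None:
        # Repeated notifications for one mountpoint collapse into one request.
        self._pending.add(mountpoint)

    def due(self, now: float) -> List[str]:
        """Return mountpoints to scrub now and mark them as started."""
        ready = sorted(
            m for m in self._pending
            if now - self._last_scrub.get(m, float("-inf")) >= self.min_interval
        )
        for m in ready:
            self._pending.discard(m)
            self._last_scrub[m] = now
        return ready
```

Requests that arrive too soon after a scrub stay pending and are returned by a later call to `due()` once the interval has elapsed.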

He wrapped up by describing some of the fuzzing that is done for XFS, which uses the XFS debugger to "walk every single field of every metadata object in the entire filesystem and fuzz them". That is part of why the XFS QA test suite takes almost a week to run; it spends a lot of time fuzzing and checking to see that the repair tool notices the problems and can fix them, both online and offline. He thinks he added some fuzzing of ext4 metadata blocks to fstests along the way, but not to the same level of precision as the XFS fuzz testing.


Index entries for this article
Kernel: Filesystems/XFS
Conference: Storage, Filesystem, Memory-Management and BPF Summit/2023


Posted Jun 15, 2023 18:38 UTC (Thu) by GhePeU (subscriber, #56133)

>The desktop use case is generally for USB sticks, and people do not normally put XFS on that kind of media, he said, so those should not be automatically mounted at all.

I normally put XFS on USB sticks and I'd like them to keep being mounted automatically, thank you very much.

Posted Jun 15, 2023 19:02 UTC (Thu) by Deltabeard (guest, #152764)

I agree. It's not clear why they would think that some Linux users wouldn't use XFS for removable storage.

Posted Jun 16, 2023 4:59 UTC (Fri) by rsidd (subscriber, #2582)

I don't see why they should be automounted though. I have devices with vfat, ntfs, ext4, and I don't automount anything, it takes a few seconds to type "sudo mount -t vfat /dev/sdc1 /mnt" or whatever it is. Your favourite file manager can make it simpler to mount on request.

Posted Jun 16, 2023 9:10 UTC (Fri) by tedd (subscriber, #74183)

I'm sorry, this sounds ridiculous. It takes a few seconds to ensure that mountpoint is empty/exists, type that long sudo command, and also type your hopefully secure root password?

Posted Jun 16, 2023 20:14 UTC (Fri) by dwest (subscriber, #110523)

Control-R mount to pull it from history
Control-E then Alt-B to go back a few words and change the device if needed; Alt-D to delete the word (e.g. sdc) and type the new one
type 20+ char password with muscle memory and hit enter

Yeah, probably about 5-10 seconds.

Posted Jun 17, 2023 14:23 UTC (Sat) by wtarreau (subscriber, #51152)

And in any case it takes less time to explicitly mount an FS you want than to figure what to unmount because that stupid thing automatically mounted it for you while you only wanted to run fdisk on it or dump an SBC image using dd.

I agree that graphical distros should always ask the user before magically mounting. For me that's extremely irritating every time it happens.

Posted Jun 18, 2023 21:30 UTC (Sun) by mathstuf (subscriber, #69389)

I prefer to have `autofs` set up such that `cd /mnt/auto/usb/sdc1` will do the mount for me (and unmount on inactivity). I should probably have it do UUID-based mounting so that stable symlinks work, but I use them so rarely that I never get around to it.

Posted Jun 16, 2023 12:43 UTC (Fri) by GhePeU (subscriber, #56133)

I got tired of that dance almost 20 years ago and I didn't become more patient with age. In addition to that, if it's a USB stick I may have a few identical ones, I may need to use more than one at the same time, and personally I find it quite useful to have the system automount them on paths that match the filesystem labels.

Posted Jun 17, 2023 2:15 UTC (Sat) by simcop2387 (subscriber, #101710)

I'm pretty similar but I don't let them be automounted, I let my DE handle that part of things for me so it's just a single click and then it "does the right thing" for basically all removable media, putting it into /media/$USER/$LABEL so that it's not in a spot that'll be looked at for $PATH or conflict with other common places. for the rare cases where that isn't what should happen (ZFS, removable backup disks, etc.) and I want them elsewhere, not having them automount means that they don't get impacted at all. it's a pretty nice setup for it all.

Posted Jun 16, 2023 4:59 UTC (Fri) by RobertBrockway (guest, #48927)

Interesting. I've been using xfs as my primary filesystem for more than 20 years and I don't think I've ever put it on a usb stick.

Having said that, the real power of Unix is in not making assumptions about how the system is used. They shouldn't just assume that people don't use xfs on removable media.

Posted Jun 16, 2023 12:51 UTC (Fri) by GhePeU (subscriber, #56133)

If I plan to use removable devices to exchange data with other systems (that these days may not even necessarily be computers, they may be set-top boxes or "smart" TVs) I tend to use NTFS to avoid VFAT limitations, but if I'm the only one who's going to use them I might as well go for a full-featured native filesystem, so why not XFS since I'm already using it on the hard disks?

Said removable devices in this particular case are more likely than not LUKS-encrypted so at that point I've already given up on (wide) interoperability.

Posted Jun 16, 2023 6:10 UTC (Fri) by otaylor (subscriber, #4190)

> I normally put XFS on USB sticks and I'd like them to keep being mounted automatically, thank you very much.

The thing is, it's hard to know that the new webcam, or even USB cable someone bought isn't a "USB stick" in disguise.

For the common case where the logged-in user is an admin (and can be assumed to not be trying to root the laptop), it seems you could get pretty far by simply asking the first time you see a particular partition GUID:

"New removable storage device detected. It is formatted as XFS (this is not normal), do you trust it?"

Hopefully, someone seeing that when they plug in a newly purchased webcam would click "Heck no".

Even better when combined with fuse-mounting VFAT devices and an effective fsck for XFS!

Posted Jun 16, 2023 7:17 UTC (Fri) by eru (subscriber, #2753)

> Hopefully, someone seeing that when they plug in a newly purchased webcam would click "Heck no".

Years ago I encountered a device (a USB 3G modem stick) that when inserted presents a FAT file system that contains the Windows driver. When that has been installed (I think it would be automatic on Windows, at least back then), the driver uses some magic command to set the mode to networking. Fortunately when I needed to set one up for Linux, someone had already figured out the magic, and made a helper program "usb_modeswitch" to set the mode. I think this is now commonly supplied in distributions.

Posted Jun 16, 2023 12:35 UTC (Fri) by GhePeU (subscriber, #56133)

If that's the threat model, you might as well always ask before automounting a device that has never been seen before, without special casing only some filesystems because they're unusual.

Posted Jun 16, 2023 14:06 UTC (Fri) by Wol (subscriber, #4433)

This would probably address pizza's concerns because, even if you auto-click the popup after inserting a USB disk, you are far more likely to do a double-take if you get the popup after inserting, say, a mouse ...

And hopefully, if you are disciplined enough to take a moment to stick every new USB stick into your laptop and accept the warning, you should only get the warning when you expect it, and thus do a double-take when you're not expecting it.

Cheers,
Wol

Posted Jun 16, 2023 12:41 UTC (Fri) by pizza (subscriber, #46)

> "New removable storage device detected. It is formatted as XFS (this is not normal), do you trust it?"

99% of the time the user is going to say "yes" to this, to the point where they'll just click 'yes' without even registering what they just did, because of course they want to access this disk, it's why they plugged it in.

(XFS isn't any more "unusual" than NTFS or HFS+ or whatever on removable storage. And meanwhile the overwhelming common case here is "sure the filesystem itself is reasonably trustable but the data on it is anything but")

Posted Jul 11, 2023 10:33 UTC (Tue) by rbanffy (guest, #103898)

> Hopefully, someone seeing that when they plug in a newly purchased webcam would click "Heck no".

In my Windows days, a webcam that came with a small mountable ROM containing an OS-appropriate driver or other software/manuals would have been a wonderful thing. I believe Blackberry phones did that.

Having said that, a pluggable device that exposes a block device with a file-like interface inside a mountable filesystem (faked by the device itself) would be useful without a device driver. Think, for instance, of a multi-camera or TV tuner interface that exposes the cameras as a directory of mp4 streams. Or a robot that exposes sensors and motors as readable and writable files.

But I digress. I agree that leaving mounting a just-plugged file system to the UI environment would be the best decision, and detailed reporting on trust issues of the filesystem before you mount it would be wise.

My own workstation seems to be configured to present the block devices and partitions as soon as they are plugged (which, from an attack PoV, might be too late already) and only mount them when I open the volumes in the filesystem browser.

Posted Jun 16, 2023 21:01 UTC (Fri) by developer122 (guest, #152928)

It probably makes sense for the filesystem to just go and fix any problems it finds. As it is, filesystems like ZFS and BTRFS fix anything they find. It can both do that *and* send a notification to userspace.

Then userspace can do what it wants, like stopping affected services. That could depend on the notification saying "this is fixable" or "this is fatal, here's what was damaged"

Posted Jun 17, 2023 8:22 UTC (Sat) by strcmp (subscriber, #46006)

I don't want filesystems on possibly defective devices to be auto-fixed AKA auto-shredded.

Posted Jun 17, 2023 22:36 UTC (Sat) by gerdesj (subscriber, #5446)

"I don't want filesystems on possibly defective devices to be auto-fixed AKA auto-shredded."

I'd like to be sure that something like this happens:

(Here I use fs_verify as a stand in for all the fsck things and the like)

SMART and the like understand the hardware level and are responsible for it - fs_verify doesn't and isn't. If SMART and co are ineffective then direct your ire there.

If SMART says broken - do not mount and probably panic, else if fs_unhappy then run fs_verify ....

Now fs_verify will kick in. At this point your hardware is being reported as working correctly and hence the scenario you describe should not happen. fs_verify does its thing ...
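The ordering described in this comment can be written down as a small decision function. It is a sketch of the commenter's proposal only: `fs_verify` is the comment's own stand-in name, the two boolean inputs are hypothetical, and the function assumes that a clean SMART report really does mean healthy hardware.

```python
from enum import Enum


class Action(Enum):
    REFUSE_MOUNT = "refuse to mount"
    RUN_FS_VERIFY = "run fs_verify"
    MOUNT = "mount normally"


def decide(smart_ok: bool, fs_unhappy: bool) -> Action:
    """Check the hardware report first, then the filesystem's own state."""
    if not smart_ok:
        # Hardware is reported as broken: do not touch the filesystem.
        return Action.REFUSE_MOUNT
    if fs_unhappy:
        # Hardware looks healthy, so a check-and-repair pass may proceed.
        return Action.RUN_FS_VERIFY
    return Action.MOUNT
```

The whole scheme rests on the first test being reliable, which is the assumption that the proposal depends on.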

Posted Jul 11, 2023 17:04 UTC (Tue) by someplaceguy (guest, #166012)

There are a huge number of hardware and software problems which can lead to fsck causing data loss instead of repairing the filesystem, no matter how battle-tested fsck is.

It's not just bad disks (which SMART doesn't even always catch) which can cause this problem, there's all kinds of similar issues which can lead to data loss when running fsck: bad SATA cables, bad/buggy hard disk firmware, bad/buggy RAID cards and/or their firmware, bad/buggy SATA controllers, faulty ECC and especially non-ECC memory, networking issues (e.g. for filesystems mounted over a network block device), buggy or faulty CPUs, kernel bugs, bugs in fsck itself, etc.

Many of these hardware faults cannot be automatically detected (not to mention automatically fixed) and the software/firmware bugs are perhaps even more hopeless.

Whenever any kind of hardware fault or broken filesystem is encountered, the safest course of action is always to do the following:

1. Stop modifying the filesystem immediately (to not make things even worse), i.e. either make the filesystem read-only or halt the machine.
2. Diagnose (and repair if encountered) any possible hardware faults, and then either:
3 a. Make a backup of the broken or non-broken filesystem (just in case fsck makes things worse) and only then run fsck to return it to a coherent state, or:
3 b. Create a new filesystem and restore data from backup (if you do have backups, and if they are up-to-date).


Copyright © 2023, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds