Debating overlayfs
LWN looked at the overlayfs filesystem last year. Overlayfs, written by Miklos Szeredi, is distinguished by its relative simplicity. Recently, Miklos asked if overlayfs could be merged for the 3.1 development cycle. He may get his wish, but some worries will have to be addressed first.
Andrew Morton has raised a couple of concerns, one of which is that the problem might be better solved in user space. He dismissed the simplicity of overlayfs, saying "Not merging it would be even smaller and simpler," and suggested that performance problems should be addressed by making the user-space implementation faster. Linus has pretty much ended that aspect of the debate by saying "People who think that userspace filesystems are realistic for anything but toys are just misguided." So the way seems to be clear for a union filesystem implementation in the kernel.
Andrew's other concern is that overlayfs may not be a sufficiently complete solution:
That objection is harder to answer. It has been pointed out that OpenWRT is happily using overlayfs and Ubuntu is considering it. About the only viable alternative project is union mounts, which has not seen much developer attention recently. On the feature front, it doesn't seem like anything else will come along and outshine overlayfs in the near future.
On the technical side, union filesystems have always presented some unique challenges. Valerie Aurora, who has done a fair amount of work in this area, looked at overlayfs in March and seemed to be positive about it:
She has changed her tune a bit in the current discussion, suggesting that there are some difficulties which need to be addressed:
She raised some locking concerns as well, which Miklos addressed in detail; the concern about changing the underlying filesystem has not been answered, though. So it's possible that technical correctness issues may yet delay the merging of overlayfs into the kernel. That said, it seems clear that there is demand for this feature, and that overlayfs appears to satisfy that demand nicely. There will likely come a time when keeping it out of the kernel becomes too hard to justify.
Index entries for this article | |
---|---|
Kernel | Filesystems/Union |
Kernel | Overlayfs |
Posted Jun 16, 2011 19:44 UTC (Thu) by martinfick (subscriber, #4455)
Posted Jun 16, 2011 21:46 UTC (Thu) by ndye (guest, #9947)
Neither do I, and you paint the benefits well . . .
. . . but now your headache has gone viral.
Posted Jun 17, 2011 6:38 UTC (Fri) by neilbrown (subscriber, #359)
When you access (e.g. open) in read-only mode a file (not a directory) which doesn't exist in the upper layer, you get exactly the file from the lower layer. If you fstat the file descriptor it will look exactly like the lower-layer file - st_dev, st_ino and all. It really is the lower-layer file.
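One way to check this claim from user space is to compare the (st_dev, st_ino) pair that fstat reports for a file opened through the overlay mount with the pair reported for the same file opened directly in the lower layer. The sketch below only illustrates that comparison; the helper names and the mount paths in the usage note are hypothetical, and the result depends on the overlayfs version in use.

```python
import os


def file_identity(path):
    """Open path read-only and return the (st_dev, st_ino) pair from fstat."""
    fd = os.open(path, os.O_RDONLY)
    try:
        st = os.fstat(fd)
        return (st.st_dev, st.st_ino)
    finally:
        os.close(fd)


def is_same_file(path_a, path_b):
    """True if both paths resolve to the same inode on the same device."""
    return file_identity(path_a) == file_identity(path_b)
```

With an overlay mounted at a hypothetical /mnt/merged on top of a lower layer at /mnt/lower, `is_same_file("/mnt/merged/etc/motd", "/mnt/lower/etc/motd")` would be expected to return True for a file that has never been copied up, if the behaviour is as described above.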
So much so that if someone else opens the file for 'write', it will get copied into the upper layer and they will get a handle on the file in the upper layer which they can then change, but you will still have a handle on the lower level file which, of course, will not see those changes.
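The copy-up behaviour described here can be modelled in a few lines. This is a toy model only, not the kernel implementation: the class `ToyOverlay` and its method names are invented for illustration, and the "files" are in-memory byte arrays rather than inodes.

```python
class ToyOverlay:
    """Toy two-layer union: a shared lower layer and a private upper layer."""

    def __init__(self, lower):
        self.lower = lower   # name -> bytearray; never modified by this class
        self.upper = {}      # name -> bytearray; receives copied-up files

    def open_read(self, name):
        # Reads prefer the upper layer and fall through to the lower one,
        # so an untouched file is the lower-layer object itself.
        if name in self.upper:
            return self.upper[name]
        return self.lower[name]

    def open_write(self, name):
        # Copy-up: the first writer gets a private copy in the upper layer.
        if name not in self.upper:
            self.upper[name] = bytearray(self.lower.get(name, b""))
        return self.upper[name]


fs = ToyOverlay({"motd": bytearray(b"hello")})
reader = fs.open_read("motd")    # handle on the lower-layer file
writer = fs.open_write("motd")   # triggers the copy-up
writer[:] = b"changed"
```

After these steps `reader` still refers to the unchanged lower-layer data, while a fresh `fs.open_read("motd")` returns the modified upper-layer copy. That is the stale-handle effect described in the comment above.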
Posted Jun 17, 2011 16:14 UTC (Fri) by martinfick (subscriber, #4455)
So with overlayfs, if I have 1000 containers each with their own upper layer mounted separately on top of the same lower layer, and each one of them runs the same copy of apache, will the linux MM system share most of the memory for those apache executables, as much as if they all ran off of the same file in the lower layer directly?
If so, this will be a major boon for "virtualisation" on linux, extremely memory efficient and lightweight containers. This would allow linux containers in the mainline to share some of the ideas and similar benefits to the linux vserver project's "unification".
Posted Jun 19, 2011 22:58 UTC (Sun) by Sho (subscriber, #8956)
Posted Jun 19, 2011 23:43 UTC (Sun) by neilbrown (subscriber, #359)
If the Linux/Unix file hierarchy had been designed with sufficient foresight (which would have been totally impractical in reality) then you probably could do it all with shared subtrees. Those files that might need to be configured per-machine or per-instance would be in one subtree (a bit like /var maybe) and all the other files would be elsewhere. The one subtree would be copied for each instance, the rest would be shared.
But we don't have such a forward looking design .. and it is entirely possible that differing needs are such that such a design would be impossible. So configuration files are often mixed in with non-configuration files. A solution is needed which makes copies of the first type, but shares the second type.
One could imagine a forest-of-symlinks which could map all 'configuration' files into one subtree, but symlinks don't always (ever?) provide perfect semantics. If you update a config file by writing a new copy then renaming it, you break the symlink.
You could do the symlinks in the other direction, with symlinks for all the files that you want to share, but that would have its own problems I suspect.
So overlayfs complements shared subtrees and allows you to selectively have some files shared and some files private within the same directory. And it achieves this almost transparently.
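The symlink breakage mentioned above (updating a config file by writing a new copy and renaming it into place) is easy to reproduce. The sketch below is an illustration under assumed names: the directories `private` and `shared`, the file `app.conf`, and the helper `update_by_rename` are all hypothetical, and the code assumes a POSIX system where rename replaces the directory entry rather than following the link.

```python
import os
import tempfile


def update_by_rename(path, new_text):
    """Replace path by writing a temporary sibling and renaming it over path."""
    tmp = path + ".new"
    with open(tmp, "w") as f:
        f.write(new_text)
    os.rename(tmp, path)  # replaces the directory entry, not the link target


def demo():
    root = tempfile.mkdtemp()
    private = os.path.join(root, "private")  # hypothetical per-instance subtree
    shared = os.path.join(root, "shared")    # hypothetical shared subtree
    os.mkdir(private)
    os.mkdir(shared)

    real = os.path.join(private, "app.conf")
    link = os.path.join(shared, "app.conf")
    with open(real, "w") as f:
        f.write("old\n")
    os.symlink(real, link)  # map the config file into the private subtree

    was_link = os.path.islink(link)
    update_by_rename(link, "new\n")
    is_link = os.path.islink(link)

    with open(real) as f:
        real_text = f.read()
    with open(link) as f:
        link_text = f.read()
    return was_link, is_link, real_text, link_text
```

Calling `demo()` shows that after the update the path in the shared subtree is a regular file holding the new contents, while the file in the private subtree still holds the old ones: the symlink has been silently replaced.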
Posted Feb 25, 2012 3:18 UTC (Sat) by scientes (guest, #83068)
Posted Jun 21, 2011 9:51 UTC (Tue) by nikanth (guest, #50093)
Wouldn't it be better if COW filesystems like btrfs could provide a feature to write new blocks only to a writable disk, instead of going for generalized solutions? Btrfs would need a way to check for the root of the tree (superblock) on the new disk before using the one from the read-only disk.
Posted Aug 2, 2012 11:26 UTC (Thu) by bluss (guest, #47454)
Posted Jun 22, 2011 18:24 UTC (Wed) by rilder (guest, #59804)
Looks like the developer made an effort here to get it into the tree: http://thread.gmane.org/gmane.linux.file-systems/29813 . I am not sure where the discussion went from there.
Shared inodes
Of course, I am not sure how that could actually be done... :(
;-)
Don't shared subtrees get you a long part of the way, too?
IOW hard-links on steroids.
Now, making this work in full-virtualization environments is not exactly the same problem....and certainly can't be as elegant.