
> zfs is technically great

It's only great due to the lack of competitors in the checksummed-CoW-raid category. It lacks a bunch of things: Defrag, Reflinks, On-Demand Dedup, Rebalance (online raid geometry change, device removal, device shrink). It also wastes RAM due to page cache + ARC.




> It's only great due to the lack of competitors in the checksummed-CoW-raid category.

blinks eyes, shakes head

"It's only great because it's the only thing that's figured out how to do a hard thing really well" may be peak FOSS entitlement syndrome.

Meanwhile, btrfs has rapidly gone nowhere, and, if you read the comments to this PR, bcachefs would love to get to simply nowhere/btrfs status, but is still years away.

ZFS fulfills the core requirement of a filesystem, which is to store your data, such that when you read it back you can be assured it was the data you stored. It's amazing we continue to countenance systems that don't do this, simply because not fulfilling this core requirement was once considered acceptable.
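To make "assured it was the data you stored" concrete, here is a toy sketch of checksum-on-read. It is not how ZFS is implemented (ZFS keeps checksums in parent block pointers and can repair a bad block from a redundant copy); the `ChecksummedStore` class and its `corrupt` helper are made up for illustration, and SHA-256 is just a convenient stand-in for whatever checksum a real filesystem uses.

```python
import hashlib


class ChecksummedStore:
    """Toy in-memory block store that keeps a checksum beside each block.

    Hypothetical illustration only; not ZFS's on-disk format.
    """

    def __init__(self):
        self._blocks = {}  # block id -> (data, checksum)

    def write(self, block_id, data: bytes) -> None:
        # Record the checksum at write time, while the data is known good.
        self._blocks[block_id] = (data, hashlib.sha256(data).digest())

    def read(self, block_id) -> bytes:
        # Recompute on every read and refuse to return data that changed.
        data, expected = self._blocks[block_id]
        if hashlib.sha256(data).digest() != expected:
            raise IOError(f"checksum mismatch on block {block_id}")
        return data

    def corrupt(self, block_id) -> None:
        """Simulate silent bit rot: flip one bit, leave the checksum alone."""
        data, checksum = self._blocks[block_id]
        flipped = bytes([data[0] ^ 0x01]) + data[1:]
        self._blocks[block_id] = (flipped, checksum)
```

Write a block, call `corrupt` on it, and the next `read` raises instead of quietly handing back wrong bytes. A filesystem without data checksums has no way to notice that flip.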


> Meanwhile, btrfs has rapidly gone nowhere […]

A reminder that it came out in 2009:

* https://en.wikipedia.org/wiki/Btrfs

(ext4 was declared stable in 2008.)


Yes! File systems are hard. My prediction is that it will be *at least* 10 years before this newfangled FS gains both feature- and stability parity with BTRFS and ZFS.

Also, BTRFS (albeit a modified version) has been used successfully in at least one commercial NAS (Synology), for many years. I don't see how that counts as "gone nowhere".


> Also, BTRFS (albeit a modified version) has been used successfully in at least one commercial NAS (Synology), for many years. I don't see how that counts as "gone nowhere".

Excuse me for sounding glib. My point was btrfs isn't considered a serious competitor to ZFS in many of the spaces ZFS operates. Moreover, its inability to do RAID5/6 after years of effort is just weird now.


Years of effort is a stretch, nobody serious (read: who's willing to pay for it) has been working on raid5/6 pretty much since its inception (since nobody serious needs raid 5/6 at all). Western Digital promised to fix it a couple of years ago, but there doesn't seem to be much progress since then.


raid1c2, raid1c3 and raid1c4 will get you close to RAID5/6 on btrfs (in terms of redundancy), albeit with a tad less disk space, but still more than normal raid 1.

> ZFS in many of the spaces ZFS operates

not a lot.


>> My point was btrfs isn't considered a serious competitor to ZFS in many of the spaces ZFS operates.

> raid1c2, raid1c3 and raid1c4 will get you close

> not a lot.

I guess I just don't understand this take. btrfs doesn't do what ZFS does, and still isn't as reliable as ZFS is. When it is, maybe I'll take another look. But this is really the problem with btrfs stans -- they've been saying it's ready, when it's not, for years.

Fix the small stuff. Make it reliable. Quit making promises about how it's as good as ZFS, when it's clear it doesn't do all the things ZFS does, just some of the things most of the time.


Have all the foot guns described in 2021 been fixed?

* https://arstechnica.com/gadgets/2021/09/examining-btrfs-linu...


Not sure about "all", but apart from that article being more pissy than strictly necessary, RAID1 can now, in fact, survive losing more than one disk. That is, provided you use RAID1C3 or C4 (which keeps 3 or 4 copies, rather than the default 2). Also, not really sure how RAID1 not surviving >1 disk failure is a slight against btrfs, I think most filesystems would have issues there...
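For anyone who wants the arithmetic behind those profiles, here is a rough sketch. It assumes equal-size disks and ignores metadata and allocation overhead, so treat the numbers as ballpark; the function names are made up for illustration and are not part of any btrfs tooling.

```python
def mirror_profile(copies: int, disks: int, disk_tb: float):
    """Return (usable TB, disk failures tolerated) for an N-copy mirror.

    copies=2 corresponds to plain RAID1, 3 to RAID1C3, 4 to RAID1C4.
    """
    if disks < copies:
        raise ValueError("need at least as many disks as copies")
    # Every block is stored `copies` times, so raw space is divided by that.
    return disks * disk_tb / copies, copies - 1


def parity_profile(parity: int, disks: int, disk_tb: float):
    """Return (usable TB, disk failures tolerated) for parity RAID.

    parity=1 corresponds to RAID5, parity=2 to RAID6.
    """
    if disks <= parity:
        raise ValueError("need more disks than parity devices")
    # Parity costs the equivalent of `parity` whole disks.
    return (disks - parity) * disk_tb, parity
```

With six 4 TB disks, `mirror_profile(3, 6, 4.0)` gives 8 TB usable and tolerates two failures, while `parity_profile(2, 6, 4.0)` gives 16 TB for the same two-failure tolerance. So RAID1C3 matches RAID6 on redundancy but costs noticeably more space, and more than plain RAID1 does.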

As for the rest of the article — the tone rubs me the wrong way, and somehow considering a FS shit because you couldn't be bothered to use the correct commands (the scrub vs balance ranty bit) doesn't instill confidence in me that the article is written in good faith.

I believe the writer's biggest hangup/footgunnage with btrfs is still there: it's not zfs. Ymmv.


The author is a Canonical fanboy, a company that put all its effort into ZFS, which now seems to have been in vain.


They put no real effort into ZFS, their own userspace tooling was only half-baked and then thrown aside. Continuing to build and ship the kernel module doesn't cost them much, the hard work of ZFS development is done by others. Quite interesting how you blame others for being fanboys while being a fanboy yourself.


> The author is a Canonical fanboy, a company that put all its effort into ZFS, which now seems to have been in vain.

ZFS is in pretty heavy use?


I don't see what's entitled about the idea that "it fulfills the core requirements" is enough to get it "good" status but not "great" status. Even if that's really rare among filesystems.


> I don't see what's entitled about the idea that "it fulfills the core requirements" is enough to get it "good" status but not "great" status. Even if that's really rare among filesystems.

You don't see? Well. Uh, I think this review would make more sense coming from someone who wrote a "great" filesystem, or at the very least understood how hard it was to write ZFS. "Big whoop", or "I don't understand what the big deal is" is what is entitled about it.


If "good" is an accurate assessment, then "It's only great because of lack of competitors" seems like a fair statement to me, and far from "big whoop". The list of problems they put in the post is real and meaningful, and they didn't say it was bad, they implied something more like big fish in a small pond.

> this review would make more sense coming from someone who wrote a "great" filesystem

That's not a reasonable standard for reviewers.


> That's not a reasonable standard for reviewers.

It is when the attitude is "What's the big deal?" ZFS is two decades on and is, by many metrics, still the state of the art in the traditional filesystem space. What ZFS does is extremely hard, and the reason we know is that no open source competitor can touch it, so I'm saying -- have a little respect.

You don't like it? You prioritize reflinks (ZFS just merged block cloning BTW, so hello reflinks: https://github.com/openzfs/zfs/pull/13392) or offline dedup over RAIDZ? Fine. But make sure your favorite filesystem (or your new filesystem) can do what ZFS does, day in and day out, before you throw that shade. If it does half the things, or breaks sometimes, it's still a toy compared to ZFS.


I think you read the comment as a lot harsher than it actually was.


> [ZFS is] only great due to the lack of competitors in the checksummed-CoW-raid category.

You forgot robust native encryption, network transparent dump/restore (ZFS send/receive) - and broad platform support (not so much anymore).

For a while you could have a solid FS with encryption support for your USB hd that could be safely used with Linux, *BSD, Windows, Open/FOSS Solaris and MacOS.


Is it just the implementation of zfs which is owned by oracle now? I wonder how hard it would be to write a compatible clean room reimplementation of zfs in rust or something, from the spec.

Even if it doesn’t implement every feature from the real zfs, it would still be handy for OS compatibility reasons.


I would suppose it would take years of effort, and a lot of testing in search of performance enhancements and elimination of corner cases. Even if the code of the FS itself is created in a provably correct manner (a very tall order even with Rust), real hardware has a lot of quirks which need to be addressed.


I wish the btrfs (and perhaps bcachefs) projects would collaborate with OpenZFS to rewrite equivalent code that they all used.

It might take years, but washing Sun out of OpenZFS is the only thing that will free it.


OpenZFS is already free and open source. Linux kernel developers should just stop punching themselves in the face.

One way to solve the ZFS issue: Linus Torvalds could call a meeting of project leadership, and say, "Can we all agree that OpenZFS is not a derived work of Linux? It seems pretty obvious to anyone who understands the meaning of the copyright term of art 'derived work' and the origin of ZFS ... Good. We shall add a commit which indicates such to the COPYING file [0], like we have for programs that interface at the syscall boundary to clear up any further confusion."

Can you imagine trying to bring a copyright infringement suit (with no damages!) in such an instance?

The ZFS hair shirt is self-imposed by semi-religious Linux wackadoos.

[0]: See, https://github.com/torvalds/linux/blob/master/LICENSES/excep...


Linus has some words on this matter:

> And honestly, there is no way I can merge any of the ZFS efforts until I get an official letter from Oracle that is signed by their main legal counsel or preferably by Larry Ellison himself that says that yes, it's ok to do so and treat the end result as GPL'd.

> Other people think it can be ok to merge ZFS code into the kernel and that the module interface makes it ok, and that's their decision. But considering Oracle's litigious nature, and the questions over licensing, there's no way I can feel safe in ever doing so.

> And I'm not at all interested in some "ZFS shim layer" thing either that some people seem to think would isolate the two projects. That adds no value to our side, and given Oracle's interface copyright suits (see Java), I don't think it's any real licensing win either.

https://www.realworldtech.com/forum/?threadid=189711&curpost...


> Linus has some words on this matter:

I hate to point this out, but this only demonstrates Linus Torvalds doesn't know much about copyright law. Linus could just as easily say "I was wrong. Sorry! As you all know -- IANAL. It's time we remedied this stupid chapter in our history. After all, I gave similar assurances to the AFS module when it was open sourced under a GPL incompatible license in 2003."

Linus's other words on the matter[0]:

> But one gray area in particular is something like a driver that was originally written for another operating system (ie clearly not a derived work of Linux in origin). At exactly what point does it become a derived work of the kernel (and thus fall under the GPL)?

> THAT is a gray area, and _that_ is the area where I personally believe that some modules may be considered to not be derived works simply because they weren't designed for Linux and don't depend on any special Linux behaviour.

[0]: https://lkml.org/lkml/2003/12/3/228


> I hate to point this out, but this only demonstrates Linus Torvalds doesn't know much about copyright law.

Maybe.

When I was young I had an honestly awful employment contract waved under my nose that I was expected to sign. It included waivers of "moral rights" - like the company was allowed to give credit for my work to someone else and lie and say I never contributed to a project I worked on. I felt weird about it, so I talked to some senior people I respected.

Some of the advice I got was that the existence of a signed contract only gave the employer cover to sue me if they wanted to. But if a company starts suing ex-employees over things that sound capricious and unfair, even if they win the court case it's an incredibly bad look. Doing so would probably cost them employees and customers. So in a very real sense, particularly awful terms would never be enforced anyway.

This cuts the other way when it comes to Linux, ZFS and Oracle. Imagine Linux includes ZFS in the kernel. Oracle decides that maybe they can claim that linux is thus a derived work of ZFS. Ridiculous, but that might be enough cover to start suing companies who use linux. If it went to court they might eventually lose. So they don't go after Google. They sue smaller companies. They sue Notion. They sue banks. They sue random YC companies right after a raise. And then they graciously offer to settle each time for a mere hundreds of thousands of dollars. Much less than the court case would cost.

It doesn't matter that they're legally in the wrong. Without a court case to demonstrate that they're wrong, they get to play mafia boss and make a killing. This really hurts Linux - which gets a reputation as a business liability. And that's what Linus wants to avoid.

I'm sure Apple has fantastic lawyers. It might surprise you to learn that Apple's lawyers came to the same decision as Linus. Apple was in the process of transitioning macOS to ZFS when Oracle bought Sun (and by extension acquired ZFS). They'd done all the technical work to make that happen - and they were set to announce it at WWDC, launching ZFS as a headlining feature of the next version of macOS. But after Oracle got involved, they pulled the plug on the project and threw out all their work. We can only assume Apple's lawyers considered it too big of a legal liability. Even if they might have won the court case, they didn't want to take the risk. Cheaper in the long run to make their own ZFS-like filesystem (APFS) instead. So that's what they did.


> Oracle decides that maybe they can claim that linux is thus a derived work of ZFS.

Whosiwhatsit? That's a non-sequitur.

I get it -- "Be scared of Oracle." But this is fever dreams from r/linux stuff.


It doesn’t need to make sense for Oracle to use the threat of lawsuits to bully small companies. It doesn’t matter if it’s crazy if your pockets aren’t deep enough to survive the legal challenge.

I don’t blame people for deciding zfs isn’t worth the risk.


It's a shame some legalese is holding back this great filesystem in the Linux world. I've used ZFS on FreeBSD for 20 years or so and it's amazing, especially since it's been possible to use it as a root FS since about 10 years ago.

I wish the Sun legacy had been sold to a more ethical company, but at least the original ZFS was fully open source and there's nothing Oracle can do about it.


It's not that simple because the compatibility is from both ends; the CDDL states:

> Any Covered Software that You distribute or otherwise make available in Executable form must also be made available in Source Code form and that Source Code form must be distributed only under the terms of this License. …

> You may not offer or impose any terms on any Covered Software in Source Code form that alters or restricts the applicable version of this License

Now, you can argue up and down and left and right whether this applies to including the CDDL code in a GPL project (with or without exception), but the fact remains that anyone can sue anyone for any reason, and as long as the complaint is not so ludicrous that a judge will throw it out – which this complaint isn't – you're going to end up helping out lawyers with their retirement fund.

Linus in general is far more pragmatic about these sort of license issues than, say, the FSF or Stallman. The problem isn't on the Linux end, the problem is "someone may sue our pants off 10 or 20 years down the road". Remember that BSD lawsuit in the 90s? That kind of stuff. Even if Oracle wasn't the Oracle we know and love today it would still be a risk: things can change in the future, companies get taken over.


> It's not that simple because the compatibility is from both ends; the CDDL states:

After reading this and thinking about it, I don't understand any argument a CDDL copyright holder would have against Linux? It frankly doesn't make any sense. You're going to have to explain the "Why?" of this. Start with -- "The facts are ... so the claim is ..."

> Linus in general is far more pragmatic about these sort of license issues than, say, the FSF or Stallman.

I wish they both, Linus and Stallman, would stop living in the 1980s. Both don't seem to understand modern copyright jurisprudence, from and including Computer Associates International, Inc. v. Altai, Inc.. I'm not sure it's intentional, but they certainly have both misled the public about what the GPLv2 entails, and their misinformation re: ZFS has been particularly egregious.

> Even if Oracle wasn't the Oracle we know and love today it would still be a risk: things can change in the future, companies get taken over.

AFAICT there is nothing special about Oracle as a litigant in this instance. AFAICT any Linux copyright holder would have standing to bring suit.


> I don't understand any argument a CDDL copyright holder would have against Linux?

If you ship ZFS with Linux then those clauses I mentioned may apply. Whether that will hold up in court? I don't know. But do you want to run the risk of trying? I wouldn't, and I certainly can't blame Linux for not wanting to either.

I don't really know much about Computer Associates International. v. Altai, and I freely admit my ignorance on the finer points of US copyright law – it's not a topic I find especially interesting.

However, my point is that this doesn't really matter. Even if we assume you're 100% correct in this regard, that still doesn't mean Oracle can't and won't sue Linux. Anyone can sue anyone, and in the US the defendant is expected to carry their own legal costs regardless of the outcome of that suit unless the suit was filed in spectacular bad faith, which doesn't really apply here.

That is the issue; if ZFS was MIT or GPL then there obviously wouldn't be an issue and any lawsuit would be completely baseless and any judge would throw it out in an instant. But with CDDL this is a lot less clear; even if we assume you're 100% correct on copyright, there still is a real dispute here with enough legal ambiguity that a judge will have to make a ruling (i.e. they're unlikely to throw out the suit).

> AFAICT there is nothing special about Oracle as a litigant in this instance.

The Oracle-Google Java lawsuit stands out here. They argued that all the way to the Supreme Court with a novel and (IMO) creative view on copyright.

But like I said: it doesn't really matter. Even companies with good reputations can change, through change in management, change in owner, or just change of mind.


> If you ship ZFS with Linux then those clauses I mentioned may apply. Whether that will hold up in court? I don't know.

I'm sorry, but this is FUD by another name. This is a "Maybe there is something wrong with the CDDL too?" take, not a "This. See, this here is the problem with the CDDL" take.

FOSS people are supposed to be against this stuff.

> That is the issue; if ZFS was MIT or GPL then there obviously wouldn't be an issue and any lawsuit would be completely baseless and any judge would throw it out in an instant. But with CDDL this is a lot less clear

Let me get this straight -- the law is complex, but it would be less complex if we had a different license, also we don't like this license and Oracle, so we won't even try with this license, even though it may not be an issue.

All of this is a giant turn off for me. I don't find it hard to believe that Canonical made the determination it did, because they're not beholden to a view that we should be scared of Oracle for indeterminate reasons, all the time.

And since you brought it up -- when Oracle sued Google, the basis of the suit was a claim that Google had violated the GPL because of, yes, another ridiculously broad interpretation of copyright law centered around the GPL. These over-broad interpretations, and attendant scared straight FUD campaigns, are not a service to the Linux/FOSS community, and Linux and the FSF and their minions should stop now. We should pick a side and the side should be in the majority of Oracle v. Google, and a long line of cases that say roughly similar things, not FSF, et al., craziness.


Even if you were to be able to say that OpenZFS is not a derived work of Linux, all it would allow you to do is to distribute OpenZFS. You would _still_ not be able to distribute OpenZFS + Linux as a combined work.

(I am one of these guys who thinks what Ubuntu is doing is crossing the line. To package two pieces of software whose license forbids you from distributing their combination in a way that "they are not combined but can be combined with a single click" is stretching it too much. )

It would be much simpler for Oracle to simply relicense older versions of ZFS under another license.


> Even if you were to be able to say that OpenZFS is not a derived work of Linux, all it would allow you to do is to distribute OpenZFS. You would _still_ not be able to distribute OpenZFS + Linux as a combined work.

Why? Linus said such modules and distribution were acceptable re: AFS, an instance which is directly on point. See: https://lkml.org/lkml/2003/12/3/228


Where is he saying that you can distribute the combined work? That would not only violate the GPL, it would also violate AFS's license...

The only thing he's saying that there is that he's not even 100% sure whether AFS module is a derived work or not (if it was, it would be a violation _just to distribute the module by itself_!). Go imagine what his opinion will be on someone distributing a kernel already almost pre-linked with ZFS.

Not that it matters, since he's not the license author, nor even the copyright holder these days...


> Where is he saying that you can distribute the combined work?

What's your reasoning as to why one couldn't, if we grant Linus's reasoning re: AFS as it applies to ZFS?

> Not that it matters, since he's not the license author not even the copyright holder these days...

Linux kernel community has seen fit to give its assurances re: other clarifications/exceptions. See the COPYING file.


> What's your reasoning as to why one couldn't

Simply put, because it's a license incompatible with the GPL, as it literally says so on the page of the creators of the GPL, Wikipedia, etc.

> if we grant Linus's reasoning re: AFS as it applies to ZFS?

You are the one claiming that Linus's reasoning implies the AFS code combined with the Linux kernel _can be distributed_. Linus is not actually saying that in the post you quoted. I am asking for where he says that.

> Linux kernel community has seen fit to give its assurances re: other clarifications/exceptions. See the COPYING file.

It doesn't really matter; not one individual can grant "an exception" unless it was already allowed to begin with (the GPL _explicitly_ grants the system library exception). Linus even says so in the very beginning of the post you quoted ("No such exception exists. There's a clarification [...]").


> Simply put, because it's a license incompatible with the GPL, as it literally says so on the page of the creators of the GPL, Wikipedia, etc.

> not one individual can grant "an exception"

"The creators of the GPL"? You mean the FSF? If you don't think Linus and the kernel devs have standing, what standing or authority do the FSF and Wikipedia have? Neither are a licensor of the Linux kernel either.

I'm saying the Linux kernel devs have seen fit to give their view, and they think it carries weight, and seems to govern the actual practice and use, and this would seem to be another analogous instance where they could give their view again.

> You are the one claiming that Linus's reasoning implies the AFS code combined with the Linux kernel _can be distributed_. Linus is not actually saying that in the post you quoted. I am asking for where he says that.

Linus doesn't explicitly say this in that post. I thought that was clear enough from ... reading the linked post. However, I do feel it naturally follows from what he says. Otherwise, this discussion would have an angels on the head of a pin quality (which we're talking about the GPL on the internet, so...).

Perhaps we simply disagree about the underlying copyright jurisprudence? After all, you might say, the GPLv2 states: "But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License,..." and "a 'work based on the Program' means either the Program or any derivative work under copyright law". Maybe, you say, ZFS is a "derived work" under copyright law, if not a "derived work" given the plain meaning of derived work?

I happen to believe that courts would not be sympathetic to broad claims of what is "derived" software/infringement, like those made by the FSF, and have not been since Computer Associates v. Altai (and a long line of other cases). For example, you might see Sega v. Accolade, where a court of appeals held that Accolade could reverse engineer Sega’s video game system to create games that ran on the system, even though it involved copying Sega’s code. And I happen to believe the FSF has misled the FOSS community as to the state of the underlying copyright law with their (mis)interpretations of their licenses.


> "The creators of the GPL"? You mean the FSF? If you don't think Linus and the kernel devs have standing, what standing or authority do the FSF and Wikipedia have? Neither are a licensor of the Linux kernel either.

They are, at least, actual lawyers.

> I thought that was clear enough from ... reading the linked post. However, I do feel it naturally follows from what he says.

No, it doesn't.

If OpenAFS is a derived work, then it is in violation of the GPL to distribute it at all. i.e. _it cannot exist_ for all practical purposes.

If OpenAFS is not a derived work, then you can distribute it. You can still work on it, distribute it, and individual users may combine it with Linux on their own machines.

Under no circumstances can you distribute a Linux kernel combined with a non-GPL-compatible module and call it a day. Note that this does not depend _at all_ on whether OpenAFS is a derived work or not. The kernel combined with OpenAFS _for sure_ is a derived work of the kernel (you really need the courts to determine that?), and you can't distribute that without violating the GPL. AND the CDDL.

What Ubuntu does is stretching it (they never distribute binaries, only aggregate sources, which supposedly the user can one-click-combine into a derived work).


> They are, at least, actual lawyers.

Are they?

> Under no circumstances you can distribute a Linux kernel combined with a non-GPL-compatible module and call it a day.

That's exactly the Q presented -- whether the ZFS module/s licensed as CDDL is GPL compatible. I am arguing it is.

> From GPLv2, Section 2: "If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it."

If the module is not a derived work, then my understanding is you can distribute them with "the Program". See also GPLv2, Section 3.

> What Ubuntu does is stretching it (they never distribute binaries, only aggregate sources, which supposedly the user can one-click-combine into a derived work).

My understanding is that this is incorrect. Canonical ships a fully built and linked zfs.ko in a separate package from the main kernel image. See: https://packages.ubuntu.com/kinetic/amd64/linux-modules-5.19...


> Are they?

What's this, ELIZA ?

> I am arguing it is.

That's new: none of the arguments you've presented has helped make that case yet.

I don't understand why you keep trying to argue whether the "ZFS module" _by itself_ is a derived work or not. It is irrelevant. You are distributing _the entire work_, which is obviously derived from the Linux kernel since it literally _contains it_ (or an almost verbatim copy of it). The paragraph you yourself quoted literally says the entire distribution must then be on the terms of this license (GPL), which the CDDL _forbids_.

The GPL has exceptions for "aggregation" but as I said in my opinion what Ubuntu is doing crosses the line. It can hardly be claimed to be aggregation when literally the module they ship is strictly designed to work with the kernel they ship, and it is absolutely useless otherwise. Such interpretation basically makes the LGPL meaningless.


> What's this, ELIZA ?

Do you know if any of these FSF "interpretations" are written by attorneys? I'm curious. Here[0], re: ZFS, Stallman says he solicited advice from others, at least one an attorney, but AFAIK Stallman alone is the named author. But I generally don't trust 2nd hand legal opinions of attorneys, who won't publish under their own name, and don't work for me.

> I don't understand why you keep trying to argue whether the "ZFS module" _by itself_ is a derived work or not. You are distributing _the entire work_, which is obviously derived from the Linux kernel since it literally _contains it_ (or an almost verbatim copy of it).

This is obviously a point of disagreement. You might see [1], written by an actual attorney and expert in these issues: "With the ambiguous definitions in the GPL, and rather paltry protections provided to software under the Copyright Act, kernel module code likely falls outside of the definition of 'derivative works.'"

> The paragraph you yourself quoted literally says the entire distribution must then be on the terms of this license (GPL), which the CDDL _forbids_.

Again [1], re: the paragraph I quoted and Section 0, it notes: "...[T]he GPL fails to make the distinction between a work containing the Program and a work based on the Program, or collective and derivative works as Congress defined them under the act. Combining these terms into a single all-encompassing definition is illogical, especially given the GPL’s reference in the same sentence to copyright law and the importance of those legal terms of art under the act."

Which I have to agree with! The GPL says "The "Program", below, refers to any such program or work, and a 1) "work based on the Program" means either the Program or any derivative work under copyright law:" and then says 2) "that is to say, a work containing the Program or a portion of it, either verbatim or with modifications and/or translated into another language." The correct interpretation is not to conflate two distinct copyright terms and conclude that the GPL actually applies to both. It's to read out the clearly illogical, contradictory language.

[0]: https://www.fsf.org/licensing/zfs-and-linux

[1]: https://www.networkworld.com/article/2301697/encouraging-clo...


You are quoting someone who tries to argue whether or not the module by itself is a derivative work, but I have literally said I find that irrelevant. You literally removed the part where I say "it is irrelevant" when quoting me...

You miss the point. I'm not claiming that the GPL applies to ZFS code, most definitely not when it is obviously not derived from any GPL software. Whether the ZFS Linux module is derivative or not I don't care. I am claiming that the GPL applies when you _distribute Linux itself_. What possible point of contention could you have? Linux is obviously GPL...

If you are trying to distribute Linux you have to do it under the terms of the GPL, whatever your opinion of them is, in the same way that when you want to _use_ Windows you have to use it under the terms of the Windows EULA, whatever your opinion of Windows is. Unlike the Windows EULA, the GPL doesn't put any limitations on use, so at your own home you can do whatever you want (including combining it with ZFS, if the CDDL were to allow you), but you can't distribute the combined work (because the GPL forbids you to do so)!


> Unlike the Windows EULA, the GPL doesn't put any limitations to use, so at your own home you can do whatever you want (including combining it with ZFS, if the CDDL were to allow you), but you can't distribute the combined work (because the GPL forbids you to do so)!

This is what I'm arguing is wrong. GPL explicitly allows distributing a collective/combined work, in contrast to a derived work. See, again, GPLv2, Section 2.


Which part exactly do you claim allows you to do that? The only one which even remotely allows you to do so is the "mere aggregation" clause, which I already mentioned like 2 days ago.

The only part you mentioned so far:

> From GPLv2, Section 2: "If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it."

Is literally saying that the whole must be licensed under the GPL (note: regardless of whether there are sections that may be considered independent and/or separate and/or derivative and/or whatever), otherwise you just can't distribute it.

Just think about it. The LGPL would make no sense if this wasn't the case.


> Is literally saying that the whole must be licensed under the GPL (note: regardless of whether there are sections that may be considered independent and/or separate and/or derivative and/or whatever), otherwise you just can't distribute it.

You seem to think it's very clear what the "whole work" is and that it must include ZFS.

See my comment of 2 days ago.[0] Section 2 states "...a whole which is a work based on the Program..." which is, as I've already mentioned, defined at Section 0. I'll say again -- at Section 0, it says "a work based on the Program" is either 1) the licensed program (Linux only), or 2) "any derivative work under copyright law", which ZFS is not, so the combination is such a "mere aggregation."

> Just think about it. The LGPL would make no sense if this wasn't the case.

Whether the LGPL makes sense or not is not really a concern of mine, but the LGPL also only applies to "derived works" as well.

[0]: https://news.ycombinator.com/item?id=35914449


> You seem to think it's very clear what the "whole work" is and that it must include ZFS.

And you disagree with that? Not only is it rather evident (Ubuntu distributes both on the same media, and in the same one-click installer), but you even need to admit this point in order to claim "mere aggregation" as you are now doing for the first time.

> See my comment of 2 days ago.[0] Section 2 states "...a whole which is a work based on the Program..." which is, as I've already mentioned, defined at Section 0. I'll say again -- at Section 0, it says "a work based on the Program" is either 1) the licensed program (Linux only), or 2) "any derivative work under copyright law", which ZFS is not

The "whole" here (Ubuntu) is evidently based on Linux since it literally contains it almost verbatim. You are distributing Linux. I still don't get how you can possibly disagree there. You just keep throwing the argument on whether ZFS is derivative or this or that but it is simply irrelevant.

> so the combination is such a "mere aggregation."

And now we finally get to the real core of the argument, where you are finally claiming that Ubuntu distributing ZFS alongside the kernel is just "mere aggregation", the only exception that the GPL acknowledges that would even be relevant here. It is an important exception -- otherwise all of Ubuntu would have to be GPL-compatible -- but notice that it has absolutely nothing to do with whether the parts are derivatives of each other or not, and in fact the very paragraph which introduces this exception actually mentions this _explicitly_.

Which means that, going back to your assertion, it doesn't follow from the arguments you posit that it is "mere aggregation" -- it cannot be decided based on derivative/not derivative.

As I literally said in my very first message in this discussion, Ubuntu claiming this is "mere aggregation" is stretching it. The ZFS module they ship only works with the very same kernel version they ship and vice versa (this for me is already enough to fail the "it is not derivative" claims -- but that's another story). The installer will one-click link the two for you, so that the module loads into _the same address space as the kernel_, and Ubuntu loses functionality if you don't use the provided ZFS module and use some other filesystem module instead. They could very well just prelink zfs.ko into the kernel and it would be practically indistinguishable. It is, in my view, impossible to claim this as "mere aggregation".

And notice that Linus claiming "OpenZFS is not derivative" won't make this any different. Even if it was God himself claiming it. The above is just legal trickery, no matter what the legal status of ZFS itself is.

> the LGPL also only applies to "derived works" as well.

And? That is the intention (and the limit) of copyright. To protect distribution of these.

A license for X software could claim that you can only distribute X alongside other software which has an even number of lines of source code and you'd have to comply with it. Not because "other software" is derivative from X or not, but because you have to comply with it to copy X.


> You just keep throwing the argument on whether ZFS is derivative or this or that but it is simply irrelevant.

And you keep saying things like this without explaining them. Perhaps I've failed in being as clear as I possibly can be, and for that I'm sorry, but this is the last comment I'll add to this very long thread, because whereas I have provided references to the text and tried to explain my thinking, you have not.

> The installer will one-click link the two for you

AFAIK this is false, as I believe I've said before. The zfs.ko is prelinked.

> The ZFS module they ship only works with the very same kernel version they ship and vice versa (this for me is already enough to fail the "it is not derivative" claims -- but that's another story).

So, your idea is that because a binary module is dependent in some way upon an underlying substrate like a library or a kernel (as opposed to the source being dependent), it is a derivative work? I think that's a fine test to propose as being the test for whether a work is derivative, but you will need to point at some textual basis for that belief. So far, I don't see one from you, and AFAIK the GPLv2 does not propose a specific boundary itself other than that of "derived work."

> It is, in my view, impossible to claim this as "mere aggregation".

Again -- you have to explain the "Why?" of this bare assertion. You use language like "clearly" and "evident" and "impossible" when you haven't explained yourself with reference to the GPL text and/or copyright law.

> A license for X software could claim that you can only distribute X alongside other software which has an even number of lines of source code and you'd have to comply with it. Not because "other software" is derivative from X or not, but because you have to comply with it to copy X.

There are boundaries to what are permissible covenants in a copyright license. Copyright is actually a rather weak IP right, and one example of a very broad exception/exemption would be copyright "fair use". It's fair use to copy API definitions, for instance.


Not exactly ZFS in Rust, but more like a replacement for ZFS in Rust: https://github.com/redox-os/tfs

Work stalled, though. Not compatible, but I was working on overlayfs for freebsd in rust, and it was not pleasant at all. Can't imagine making an entire "real" file system in Rust.


> wonder how hard it would be to write a compatible clean room reimplementation of zfs in rust or something, from the spec

As for every non-trivial application - almost impossible.


If I understand things correctly, the only thing Oracle owns that could impact Open ZFS is the patents the CDDL permits the usage of. Would a clean room implementation even matter?


Reflinks and copy_file_range() are just landing in OpenZFS now I think? (Block cloning)


Block cloning support has indeed recently landed in git and already allows for reflinks under FreeBSD. Still has to be wired up for Linux though.


Really excited about this.

Once support hits in Linux, a little app of mine[0] will support block cloning for its "roll forward" operation, where all previous snapshots are preserved, but a particular snapshot is rolled forward to the live dataset. Right now, data is simply diff copied in chunks. When this support hits, there will be no need to copy any data. Blocks written to the live dataset can just be references to the underlying snapshot blocks, and no extra space will need to be used.
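For the curious, from userland this mostly comes down to `copy_file_range()`. Here's a minimal sketch in Python, not httm's actual code (`clone_or_copy` is a hypothetical helper name): ask the kernel to do the copy, and fall back to an ordinary copy if the call isn't available or fails. Whether the kernel actually shares blocks instead of copying bytes depends on the filesystem underneath, so treat this as an illustration only.

```python
import os
import shutil


def clone_or_copy(src, dst):
    """Copy src to dst, preferring the kernel's copy_file_range().

    Returns "copy_file_range" if the syscall path was used, or
    "fallback" if we had to do a plain userspace copy instead.
    """
    try:
        with open(src, "rb") as fin, open(dst, "wb") as fout:
            remaining = os.fstat(fin.fileno()).st_size
            while remaining > 0:
                # The kernel may copy fewer bytes than requested, so loop.
                copied = os.copy_file_range(fin.fileno(), fout.fileno(), remaining)
                if copied == 0:
                    break  # unexpected end of file
                remaining -= copied
        return "copy_file_range"
    except (AttributeError, OSError):
        # No copy_file_range on this platform, or the filesystem refused it.
        shutil.copyfile(src, dst)
        return "fallback"
```

Either way the destination ends up with the same bytes; the only difference is how much data actually has to move.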

[0]: https://github.com/kimono-koans/httm


What does it mean to roll forward? I read the linked Github and I don't get what is happening

> Roll forward to a previous ZFS snapshot, instead of rolling back (this avoids destroying interstitial snapshots):

     sudo httm --roll-forward=rpool/scratch@snap_2023-04-01-15:26:06_httmSnapFileMount
    [sudo] password for kimono:
    httm took a pre-execution snapshot named: rpool/scratch@snap_pre_2023-04-01-15:27:38_httmSnapRollForward
    ...
    httm roll forward completed successfully.
    httm took a post-execution snapshot named: rpool/scratch@snap_post_2023-04-01-15:28:40_:snap_2023-04-01-15:26:06_httmSnapFileMount:_httmSnapRollForward


From the help and man page[0]:

    --roll-forward="snap_name"

    traditionally 'zfs rollback' is a destructive operation, whereas httm roll-forward is non-destructive.  httm will copy only files and their attributes that have changed since a specified snapshot, from that snapshot, to its live dataset.  httm will also take two precautionary snapshots, one before and one after the copy.  Should the roll forward fail for any reason, httm will roll back to the pre-execution state.  Note: This is a ZFS only option which requires super user privileges.

I might also add 'zfs rollback' is a destructive operation because it destroys snapshots between the current live version of the filesystem and the rollback snapshot target (the 'interstitial' snapshots). Imagine you have ransomware installed and you need to roll back, but you want to view the ransomware's operations through snapshots for forensic purposes. You can do that.

It's also faster than a checksummed rsync, because it makes its determination based on the underlying ZFS checksums, and more accurate than a non-checksummed rsync.
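If it helps, here is the general shape of the idea as a rough Python sketch. This is not how httm is implemented: `roll_forward` here is a hypothetical function, it compares file contents byte by byte rather than using ZFS checksums, it takes no precautionary snapshots, and it leaves alone any files created after the snapshot. It only shows "copy what differs from the snapshot back onto the live tree".

```python
import filecmp
import os
import shutil


def roll_forward(snapshot_dir, live_dir):
    """Copy files that differ from snapshot_dir into live_dir.

    Returns the sorted list of relative paths that were copied.
    """
    copied = []
    for root, _dirs, files in os.walk(snapshot_dir):
        rel_root = os.path.relpath(root, snapshot_dir)
        target_root = os.path.normpath(os.path.join(live_dir, rel_root))
        os.makedirs(target_root, exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.join(target_root, name)
            # shallow=False forces a comparison of the actual contents,
            # not just size and modification time.
            if not os.path.exists(dst) or not filecmp.cmp(src, dst, shallow=False):
                shutil.copy2(src, dst)  # copy2 also preserves timestamps
                copied.append(os.path.normpath(os.path.join(rel_root, name)))
    return sorted(copied)
```

Pointed at a snapshot directory and the live dataset, unchanged files are skipped and only changed or missing ones are restored.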

This is a relatively minor feature re: httm. I recommend installing and playing around with it a bit.

[0]: https://github.com/kimono-koans/httm/blob/master/httm.1


What I don't understand is: aren't zfs snapshots writable, like in btrfs?

If I wanted to roll back the live filesystem into a previous snapshot, why couldn't I just start writing into the snapshot instead? (Or create another snapshot that is a clone of the old one, and write into it)


> What I don't understand is: aren't zfs snapshots writable, like in btrfs?

ZFS snapshots, following the historic meaning of "snapshot", are read-only. ZFS supports cloning of a read-only snapshot to a writable volume/file system.

* https://openzfs.github.io/openzfs-docs/man/8/zfs-clone.8.htm...

Btrfs is actually the one 'corrupting' the already-accepted nomenclature of snapshots meaning a read-only copy of the data.

I would assume the etymology of the file system concept of a "snapshot" derives from photography, where something is frozen at a particular moment of time:

> In computer systems, a snapshot is the state of a system at a particular point in time. The term was coined as an analogy to that in photography. […] To avoid downtime, high-availability systems may instead perform the backup on a snapshot—a read-only copy of the data set frozen at a point in time—and allow applications to continue writing to their data. Most snapshot implementations are efficient and can create snapshots in O(1).

* https://en.wikipedia.org/wiki/Snapshot_(computer_storage)

* https://en.wikipedia.org/wiki/Snapshot_(photography)


Sure, there's lots of room for improvement. IIRC, rebalancing might be a WIP, finally?

But credit where credit is due: for a long time, ZFS has been the only fit for purpose filesystem, if you care about the integrity of your data.


Afaik true rebalancing isn't in the works. Some limited add-device and remove-vdev features are in progress but AIUI they come with additional overhead and aren't as flexible.

btrfs and bcachefs rebalance leave your pool as if you had created it from scratch with the existing data and the new layout.


Yeah, the world decided just replicating data somewhere is far preferable if you want to have resilience, instead of making the separate nodes more resilient.


“Wastes” ram? That’s a tunable my friend.
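On Linux the ceiling is the `zfs_arc_max` module parameter (under `/sys/module/zfs/parameters/`), and the live counters are in `/proc/spl/kstat/zfs/arcstats`. A small sketch for comparing the current ARC size against its cap; `parse_arcstats` and `arc_usage` are hypothetical helper names, and the numbers in `SAMPLE` are made up purely to show the file layout.

```python
def parse_arcstats(text):
    """Parse kstat-style text into a {name: int} dict.

    The first two lines are headers; the rest are 'name type data' rows.
    """
    stats = {}
    for line in text.splitlines()[2:]:
        parts = line.split()
        if len(parts) == 3 and parts[2].isdigit():
            stats[parts[0]] = int(parts[2])
    return stats


def arc_usage(stats):
    """Return (current size, configured max, size as a fraction of max)."""
    size, c_max = stats["size"], stats["c_max"]
    return size, c_max, size / c_max


# Illustrative values only, not taken from a real system.
SAMPLE = """\
13 1 0x01 123 33456 1234567890 9876543210
name                            type data
hits                            4    1000
misses                          4    250
c_max                           4    4294967296
size                            4    1073741824
"""
```

On a real box you would feed it `open("/proc/spl/kstat/zfs/arcstats").read()` instead of `SAMPLE`.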


https://github.com/openzfs/zfs/issues/10516

The data goes through two caches instead of just page cache or just arc as far as I understand it.


Can I totally disable ARC yet?


    zfs set primarycache=none foo/bar
?

Though this will amplify reads as even metadata will need to be fetched from disk, so perhaps "=metadata" may be better.

* https://openzfs.github.io/openzfs-docs/man/7/zfsprops.7.html...


I'm curious what your workflow is that not having any disk caching would have acceptable performance.


A workflow where the person doesn't understand that RAM isn't wasted and it's just that their utility for showing usage is wrong. Imagine being mad at file system cache being stored in RAM.


The problem with ARC in ZFS on Linux is the double caching. Linux already has a page cache. It doesn't need ZFS to provide a second page cache. I want to store things in the Linux page cache once, not once in the page cache and once in ZFS's special-sauce cache.

If ARC is so good, it should be the general Linux page cache algorithm.


I may be wrong, but I remember (circa 2011) that any access to ZFS on Linux entirely ignored page cache buffers unless you use mmap, so by "default" only binaries and libraries get double cached.

ARC is better than page cache in linux right now. It's not used by linux because:

1) Linus's irrational hate towards ZFS (every rant I read shows clearly that he has next to zero knowledge about ZFS)

2) Patents and Licensing

Also, it's Linux's problem that it's doing that, not ZFS's - nothing is cached twice on other platforms that run ZFS. Why? See reason #1.


Are you really saying: “design of user land code Y should be in the kernel”?

ZFS has been run just fine on a system with 1GB of ram. Ram issues with ZFS are just FUD.


>If ARC is so good, it should be the general Linux page cache algorithm

Not possible until IBM's patent expires


Well patent 6,996,676 was filed in November 2002, which should mean it's expired now?

I guess there's a few others listed in various places that were filed up to 2006-2008, I'm not sure how important they are.



