I find this conversation fascinating because there is so much disagreement on the meaning of "functional" and "immutable".
What I've gathered so far from reading the article and the comments is that some people who are in the know about a very specific paper agree that Git is a purely functional data structure. And that others look at the ways you can use Git and point a finger and say, "Look! It can be mutated! Therefore it cannot be functional!" And the response to that is, "Don't be so technical about how you define functional. Or immutable. You know it when you see it."
Is this some kind of Obi-wan Kenobi from a certain point of view stuff? Why is this so difficult to get a handle on?
If a thing says immutable on the tin, and it's mutable, how is that purely functional? I know, read the paper. I know. But still, it's a legit question.
It seems to me that a data structure so amazing as being purely functional shouldn't be so easy to misunderstand as what we're seeing here. And it's clearly being misunderstood. And not only by me.
> some people who are in the know about a very specific paper
This stuff isn't obscure just because you don't happen to know about it already. Chris Okasaki's publications have been cited over 1000 times, mostly for his PhD work -- those papers, especially on functional data structures and amortization analysis in lazy languages, are considered foundational for a whole research area in Computer Science.
> Is this some kind of Obi-wan Kenobi from a certain point of view stuff? Why is this so difficult to get a handle on?
Did you learn calculus overnight, or expect to understand a technical conversation about differential equations and Cauchy sequences without taking two years of classes or reading a couple of really thick books first? Why should this be any different?
> If a thing says immutable on the tin, and it's mutable, how is that purely functional? .... It seems to me that a data structure so amazing as being purely functional shouldn't be so easy to misunderstand as what we're seeing here. And it's clearly being misunderstood. And not only by me.
Sigh. Explaining it properly would involve a tour through the untyped lambda calculus, the simply-typed lambda calculus and the Curry-Howard isomorphism, plus a discussion of denotational vs. operational semantics (Hoare logic, functional interpreters, type-preserving compilation, small-step vs. large-step operational semantics, etc.).
TL;DR: "purely functional" is a description of the program's meaning (in a technical sense), not its implementation.
Okay, so if a program means for an object to be immutable but it actually is mutable, then it’s still immutable if you explain it in terms of basic CS theory.
> "Look! It can be mutated! Therefore it cannot be functional!" And the response to that is, "Don't be so technical about how you define functional. Or immutable. You know it when you see it."
I think I can offer something to the discussion here, as I'm straddling the 2 camps - having a fairly intricate understanding of git (I've held training seminars at my company about it), and basic familiarity with functional (technically "persistent") data structures thanks to a long-standing interest in Clojure.
What the functional camp is talking about, when they refer to git as a "purely functional data structure", is the core of git's implementation - commits are immutable snapshots of the repository arranged in a directed acyclic graph (or a tree, if there are no merge commits), and branches are "just" movable pointers to some commit. In this view, everything other than this core model of history as a graph is extra details layered on top.
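To make that core model concrete, here's a rough sketch in Python. It is a toy, not how git is actually implemented: the `Commit` class, the `branches` table and the `commit` helper are all names I made up for illustration. The point is only that commits never change once created, while a branch is a plain name-to-commit lookup that gets repointed.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Dict, Tuple


@dataclass(frozen=True)
class Commit:
    """Hypothetical immutable snapshot: fields cannot be reassigned."""
    message: str
    snapshot: Tuple[Tuple[str, str], ...]  # (path, content) pairs
    parents: Tuple["Commit", ...] = ()     # two or more parents = merge


# Branches are the only mutable part: a name -> commit lookup table.
branches: Dict[str, Commit] = {}


def commit(branch: str, message: str, snapshot: Dict[str, str]) -> Commit:
    """Build a new commit on top of the branch tip, then move the pointer."""
    tip = branches.get(branch)
    parents = (tip,) if tip is not None else ()
    new = Commit(message, tuple(sorted(snapshot.items())), parents)
    branches[branch] = new  # the pointer moves; no existing commit changes
    return new


first = commit("main", "initial", {"README": "hello"})
second = commit("main", "add notes", {"README": "hello", "NOTES": "todo"})

# Trying to edit an existing commit in place is rejected.
try:
    first.message = "rewritten"  # type: ignore[misc]
    mutated = True
except FrozenInstanceError:
    mutated = False
```

After this runs, `branches["main"]` points at `second`, `second.parents` still leads back to `first`, and `first` is exactly what it was when it was created.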
The other camp (let's call them the "git camp") looks at git in its entirety, including branches, tags, the index, the working tree, stashes, and so on, and they can't help but come to the conclusion you mentioned - "Branches are mutable references! The index is mutable! The working tree is mutable! How can git possibly be called a functional data structure?"
While this 2nd perspective is understandable and technically correct, I think it misses the point the other camp is trying to make: that the immutable commit is the "fundamental unit of git", so to speak, that we interact with every day, and that most of git's history-manipulating commands (git commit, git commit --amend, git rebase, git merge, git cherry-pick) can be described in terms of purely functional operations on the commit graph. Once you understand this, many complex git scenarios (rebase and filter-branch, for example) become easier to understand, so this view does have practical value. Branches, tags, the index, stashes, etc. can then be understood separately.
In fact, this perspective of git as a data structure is so useful, that I rarely use "git log" as it is, preferring these additional flags to view the entire graph instead:
The commit is immutable because when you "amend" a commit, git actually creates a new commit object, assigning it a new SHA1 hash. This means that the original contents from before the amend (i.e., the "old" commit) are still accessible via the old SHA1 hash - proving that an "amended" commit is actually a copy of the old one, not an in-place mutation.
Think of it like this - if I copy a text file and edit the copy, would you say that I've "mutated" the original file? No, right? That's what git does - it never actually mutates a commit :)
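Here's the same idea as a small Python sketch. Again, this is a toy model under my own assumptions (the `Commit` class and the `amend` function are hypothetical, not git's code): "amending" builds a copy with the change applied and repoints the branch, and the old commit is left exactly as it was.

```python
from dataclasses import dataclass, replace
from typing import Dict, Tuple


@dataclass(frozen=True)
class Commit:
    """Hypothetical immutable commit, reduced to a message and parents."""
    message: str
    parents: Tuple["Commit", ...] = ()


def amend(branches: Dict[str, Commit], branch: str, message: str) -> Commit:
    """Copy the branch tip with a new message and repoint the branch."""
    old = branches[branch]
    new = replace(old, message=message)  # a fresh object; `old` is untouched
    branches[branch] = new               # only the pointer is updated
    return new


root = Commit("initial")
typo = Commit("fix tpyo", (root,))
branches = {"main": typo}

fixed = amend(branches, "main", "fix typo")
```

`fixed` and `typo` are two separate objects sharing the same parent. Anyone still holding a reference to `typo` (the analogue of knowing the old SHA1) sees the original message.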
EDIT: to clarify further, git identifies each commit with a unique SHA1 hash which is assigned when the commit is created, and that hash depends on the contents of the files + the current timestamp + the SHA1 of its parent commit + some other things. Which means, if you try to edit the contents of the commit in any way at all (or even if you create another commit with the same exact files at a different time, or as a child of a different commit), its SHA1 hash will be different, i.e., it is a new commit by definition.
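A quick sketch of why that follows, in Python. The layout below is loosely modeled on how git serializes a commit (a header plus tree, parent, author and committer lines, then the message), but treat it as a simplification: `commit_id`, the placeholder IDs and the fixed `+0000` timezone are my assumptions, and I'm not claiming it reproduces real git hashes byte for byte.

```python
import hashlib


def commit_id(tree: str, parent: str, author: str,
              timestamp: int, message: str) -> str:
    """Hash a simplified commit record; every input feeds into the ID."""
    body = (
        f"tree {tree}\n"
        f"parent {parent}\n"
        f"author {author} {timestamp} +0000\n"
        f"committer {author} {timestamp} +0000\n"
        f"\n{message}\n"
    ).encode()
    header = f"commit {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


# Placeholder 40-character IDs, purely for illustration.
TREE = "a" * 40
PARENT_1 = "b" * 40
PARENT_2 = "c" * 40
AUTHOR = "Alice <alice@example.com>"

original = commit_id(TREE, PARENT_1, AUTHOR, 1700000000, "fix typo")
same_again = commit_id(TREE, PARENT_1, AUTHOR, 1700000000, "fix typo")
later = commit_id(TREE, PARENT_1, AUTHOR, 1700000060, "fix typo")
moved = commit_id(TREE, PARENT_2, AUTHOR, 1700000000, "fix typo")
```

Identical inputs give the identical ID, while changing only the timestamp (`later`) or only the parent (`moved`) gives a different one. Since the ID is derived from the content, "editing a commit" can only ever mean "producing a commit with a different ID".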