
> git doesn't actually create new copies of the content for each commit

More precisely, it doesn't create new copies of content that you didn't change. For example, if you have 100 files in your repo and you change one of them and then commit, git creates a new copy of the content of the file you changed--a new blob storing the new file content--and a new tree object that references the new blob instead of the old one, plus the other 99 blobs that store the contents of files you didn't change; the new commit object then references the new tree object (plus the message and metadata). But git never stores diffs between old and new content; it just creates a new blob every time the content of a file changes.
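
To make that concrete, here is a rough Python sketch of the idea, not git's actual code. The blob ID calculation (SHA-1 over a `blob <size>\0` header plus the content) matches what git does, but `snapshot` and the in-memory `store` dict are hypothetical stand-ins for git's tree objects and object database.

```python
import hashlib

def blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID git assigns to a blob with this content."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()

def snapshot(files: dict[str, bytes], store: dict[str, bytes]) -> dict[str, str]:
    """Toy 'tree': map each path to a blob ID, storing only blobs not seen before."""
    tree = {}
    for path, content in files.items():
        oid = blob_id(content)
        store.setdefault(oid, content)  # unchanged content is already in the store
        tree[path] = oid
    return tree

store: dict[str, bytes] = {}
files = {f"file{i}.txt": f"content {i}\n".encode() for i in range(100)}

tree1 = snapshot(files, store)      # first "commit": 100 new blobs
files["file7.txt"] = b"edited\n"    # change exactly one file
tree2 = snapshot(files, store)      # second "commit": 1 new blob, 99 reused

changed = [path for path in tree1 if tree1[path] != tree2[path]]
print(len(store), changed)  # 101 ['file7.txt']
```

Both toy trees list all 100 paths, but the store only grows by one blob for the second snapshot, which is the "new copy only for what changed" behavior described above.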




> But git never stores diffs between old and new content; it just creates a new blob every time the content of a file changes.

Git packfiles compress objects by storing many of them as deltas against other objects, typically going backwards in history. That is, the most recent version is stored in full and older versions are stored as patches against it, because you're more likely to need a recent version in full than an older one.
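
A toy illustration of that "newest in full, older as a delta" layout, in Python. This is only a sketch under loud assumptions: real packfiles use a compact binary delta encoding and git picks delta bases heuristically, whereas here `make_delta`, `apply_delta`, and the `pack` dict are made-up names and `difflib` stands in for git's delta algorithm. The only part borrowed from git is the general shape of a delta as "copy this range from the base" and "insert these literal bytes" instructions.

```python
from difflib import SequenceMatcher

def make_delta(base: bytes, target: bytes) -> list:
    """Encode target as copy/insert instructions relative to base."""
    ops = []
    matcher = SequenceMatcher(None, base, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2 - i1))        # reuse bytes from the base
        elif tag in ("replace", "insert"):
            ops.append(("insert", target[j1:j2]))    # literal bytes not in the base
        # "delete": base bytes that the target lacks, nothing to emit
    return ops

def apply_delta(base: bytes, delta: list) -> bytes:
    """Rebuild the target from the base plus its delta."""
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            _, offset, length = op
            out += base[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)

old = b"line 1\nline 2\nline 3\n"
new = b"line 1\nline 2\nline 3\nline 4 (added)\n"

# Hypothetical pack layout: newest version in full, older one as a delta against it.
pack = {
    "v2": ("full", new),
    "v1": ("delta", "v2", make_delta(new, old)),
}

kind, base_name, delta = pack["v1"]
restored = apply_delta(pack[base_name][1], delta)
assert restored == old
```

Reading `v2` needs no patching at all, while reading `v1` costs one delta application, which is the tradeoff the comment describes.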

https://git-scm.com/book/en/v2/Git-Internals-Packfiles


This is true, but packfiles are an implementation detail.

It's still useful, and conceptually more accurate, to consider every commit as a complete snapshot of the state of the code at that point.


That can be said of every version control system. Restoration of state to any given version is their defining feature. How they achieve that is always an implementation detail, but those details can still be important and interesting.


A Git commit is composed of all of the files in the commit, its parent, and the commit message. This is an important guarantee: each checkout is valid without the rest of the repo. It lets many exotic implementations stay consistent with one another. If you're GitHub, you can distribute commits across many servers; if you're Microsoft, you can build partial checkouts for GVFS. It's also what allows Git LFS to keep many of git's core guarantees while making tradeoffs to improve areas where git is traditionally weak.
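
As a rough Python sketch of why that holds: a commit's ID is a hash over its tree ID, its parent IDs, the author/committer lines, and the message, so any implementation that reproduces those bytes gets the same ID. The object layout below follows git's commit format as I understand it; `commit_id` is a hypothetical helper, the identity string is made up, and `EMPTY_TREE` is git's well-known ID for an empty tree, used here only as a placeholder.

```python
import hashlib

EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # ID of an empty tree object

def commit_id(tree: str, parents: list[str], message: str,
              who: str = "A U Thor <author@example.com> 0 +0000") -> str:
    """Hash a commit object built from a tree ID, parent IDs, identity, and message."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {who}", f"committer {who}", "", message]
    body = "\n".join(lines).encode()
    header = b"commit " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()

root = commit_id(EMPTY_TREE, [], "initial\n")
child = commit_id(EMPTY_TREE, [root], "second\n")

# Same inputs give the same ID, wherever and however the objects are stored.
same_again = commit_id(EMPTY_TREE, [root], "second\n")
# Same tree and message but a different history gives a different ID.
other_history = commit_id(EMPTY_TREE, [], "second\n")

print(child == same_again, child == other_history)  # True False
```

Because the ID covers the whole snapshot and its ancestry, a server farm, a partial checkout, or an LFS-style pointer scheme can verify it is serving the right commit without agreeing on storage details.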


Sorta true but see what bbatha said.

There are people who distinguish changeset-oriented from snapshot-oriented systems and will hotly debate which one is better.

But as you say, restoration of state is a necessary and defining feature.



