To the folks in this thread who think they know better than researchers at the forefront of foundational mathematics: I encourage you to learn type theory, and especially homotopy type theory. In doing so, you'll be forced to confront the various notions of judgmental and propositional equality. You'll see why it's useful to have a formal justification for treating isomorphic objects as equal (this is called "univalence"), which is something you may have even been doing informally/subconsciously all along.
In particular, homotopy type theory is a type theory, which means it looks like a programming language. So people here might find it a bit more familiar than, say, higher category theory. It'll be much easier if you know functional programming, so you may need to learn that first if you haven't already.
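For a concrete first taste, here is the sort of thing you meet almost immediately (a minimal Lean 4 sketch; plain Lean, not HoTT itself, since Lean's kernel doesn't include univalence): once an equality is established, any property transports across it, and univalence extends this from equalities to equivalences.

    -- Transport along an equality: given h : a = b, a proof about a
    -- becomes a proof about b. Univalence generalizes this principle
    -- from equalities to equivalences between types.
    example (P : Nat → Prop) (a b : Nat) (h : a = b) (ha : P a) : P b :=
      h ▸ ha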
This is not "foundational mathematics." This is about the foundations of algebraic topology. The experts in the foundations of mathematics as a whole (set theorists, logicians, model theorists, etc.) don't care about homotopy type theory. It's sort of pointless for the study of the foundations of mathematics. That is, they really just care about ZF or ZFC, because those are already formulated so that they are maximally useful for foundational investigations. (And the e.g. reverse mathematics people who want to work outside of ZF/ZFC already have more direct ways of doing so.)
Some idea of what people in foundations think about day-to-day can be gleaned by searching the Foundations of Math mailing list archive: https://cs.nyu.edu/pipermail/fom/
In particular: don't confuse higher category theory (of interest to the algebraic topology community at large) with homotopy type theory (which is not). Jacob Lurie has a rather pointed conversation with Urs Schreiber about this, linked to here: https://mathematicswithoutapologies.wordpress.com/2015/05/18....
edit: I'd love to hear from the downvoters why they disagree with this take. It's pretty standard, afaik.
I've read rather too much of the fascinating comments thread you've indirectly linked to, but this nugget by Mike Shulman in that thread seems especially relevant to the present one:
> If we distill it down from the categorical heights, this brings us back to John Baez’s pithy quote about equality: “Every interesting equation is a lie.” The only obviously true equation is x=x, but it carries no (or little) information. Any other equation, like x=y, is “a lie” because x and y are not the same thing; yet what is interesting about the equality is that they are nevertheless the same in some way. But knowing the equation doesn’t mean that we should forget all about y and use only x; the point of having the equation is that we can pass back and forth between x and y, according to which is most appropriate in any given context.
> I'd love to hear from the downvoters why they disagree with this take. It's pretty standard, afaik.
Because it's obtuse. There are well-known ways of phrasing all of this stuff in set theory (or rather, "legitimate" varieties of set theory) should anyone really care, replicating type-theoretical constructions by adding increasingly large "inaccessible cardinals" to one's pre-existing notion of set. But it's a pointless exercise when the type-theoretical description of these foundations is so much easier to work with. It's only useful if one believes that "set theory" is the only legitimate language for talking about anything foundational, which is again quite silly.
In what sense is type theory "easier to work with"? For what purpose? ZFC exists largely because it's a useful framework for proving meta-mathematical theorems. In this sense, it is extremely convenient to work with.
Also, I'm not sure what "all of this stuff" refers to here re: your comment about inaccessible cardinals. You don't need large cardinals to do 99.99% of modern mathematics (essentially, everything but set/proof theory). ZFC is good enough (and you can usually get away with something weaker).
Frankly we are sick of set theorist traditionalists complaining about type theory, category theory, and other new things without actually bothering to understand them at all.
After multiple decades, it's clear you all are never going to bother to learn our stuff nearly as well as we bothered to learn your stuff. This "debate" is thus completely unproductive without any good agreed-upon foundation to argue from.
First, I'm not a set theorist. Second, one of the major motivations for introducing type theory in mathematics was to assist with computer-aided formalization efforts, which has had some success and has a promising future (e.g. the stuff the Lean community has been doing lately). This is great! I'm not complaining at all.
But I do think trumpeting HoTT as a new, improved foundation of mathematics evinces a misunderstanding of what people who work on foundations actually want out of a foundation of mathematics; or even more broadly, the mistaken belief that finding a new foundation of mathematics is an interesting project in the 21st century.
(And clearly, Lurie is not some ignorant traditionalist. There's a substantive disagreement that can't be reduced to people not wanting to learn "your stuff.")
I think Mike Shulman's side of the discussion in that thread is also very worth reading, and it's somewhat unfair to (perhaps unintentionally) characterize Lurie as the representative of mathematicians trying to talk sense into the homotopy type theorists. They have a very balanced discussion where each respectfully tries to understand where the other is coming from.
I'm desperately trying to find a lovely philosophical writeup on the different purposes to which a "foundation" is put, and why different foundationalists want those foundations for distinct purposes. I can't seem to hunt it down (Google has become steadily less helpful over time). But the thrust is, I don't think anybody is claiming that HoTT ought to replace ZFC for reverse mathematics in particular; it's "just" a capturing of some pervasive intuitions for concepts at the ground floor of the formalism itself. It takes a perspective that makes grappling with some concepts easier to approach. If you're already a deft hand with a different (very well-established) framework, as Lurie exhibits throughout, maybe it's worth it to keep using the entirely viable tools you have!
There's a well-known line of research into the equiconsistency of set-theoretical and type-theoretical foundations, with many compelling results. So you actually can do "reverse mathematics" of type-theoretical foundations. You aren't losing anything.
You write: "I don't think anybody is claiming that HoTT ought to replace ZFC for reverse mathematics in particular; it's "just" a capturing of some pervasive intuitions for concepts at the ground floor of the formalism itself. It takes a perspective that makes grappling with some concepts easier to approach. If you're already a deft hand with a different (very well-established) framework, maybe it's worth it to keep using the entirely viable tools you have!"
My claim is that "capturing [...] some pervasive intuitions" is not what ZFC is trying to do, or what people in foundations are interested in. Lurie puts the point well when he says a foundation "mean[s] a small number of concepts and assumptions from which the rest of mathematics can be derived." The purposes here are, first, to figure out what exactly one must assume to get the rest of mainstream mathematics, and second, to provide a convenient base for doing metamathematics (since the fewer elementary operations/objects you have to reason with, the easier it is to write proofs). Capturing intuitions is completely orthogonal to this endeavor; no one is claiming that set-theoretic translations of everyday mathematical arguments capture the intuitive content of those arguments, because no one ever intended ZFC to do that in the first place.
Similarly, no one is claiming that handing someone a Turing machine that performs quicksort is a great way to capture the intuition behind quicksort, or explain the algorithm in an understandable way. Turing machines exist as a special-purpose construct to probe the limits of computation, not to reproduce or communicate high-level algorithmic reasoning. ZFC is like a Turing machine for (meta)math.
Yes, hence my desperate desire to relocate the article on what purposes a foundation may be put to. I say this as neutrally as possible, but I don't think "foundation" has such a singular meaning, and misattributing one "foundation's" goals to be those of another's is probably near the root of your frustration. (It would help my case to find the article; at present I can only relay the ghost of what I took from it.)
HoTT/UF share some purposes of a foundation with ZFC, and not others. Each has purposes the other does not. That's about the safest thing I can say with any faith that I'm doing it justice.
I understand this desire but I think people who believe this should develop a new word or description for what you're attempting to do. See my comment here: https://news.ycombinator.com/item?id=30817964
I want to emphasize I'm not policing semantics just for the sake of being ornery. My reaction is approximately the same as it would be to someone who sends me a graph theory paper and calls it number theory. These distinctions are useful.
Wouldn't it be amusing if a common English word like "foundation", usually referring to the thing on which buildings are constructed, were uniquely reserved for all the kinds of things on which you can theoretically build all of mathematics, but where none of the people involved are actually trying to do that, and a different word had to be invented for a similar thing, because you got banned from the club for trying to actually build mathematics on top of it.
Yes, I don't especially disagree; I also think words are most useful when they mean something consistent and precise. Nonetheless, I think it's worth separating the merely terminological disagreement from the deeper semantic disagreement. Please do see the article I found(!) in a cousin comment, by Penelope Maddy. I think it provides a very enlightening perspective on the ways in which ZFC and HoTT/UF both purport to be "foundations" and yet seek to perform different duties.
Thanks. I enjoyed the article you linked. I think the description on page 16 is a good summary of what most people working in foundations take the purpose of foundations to be.
I'm sorry, but this is not gatekeeping; that term has an established meaning and mathematical history, and corresponds to a productive vein of research going back to Gödel (at least). If you'd like to argue that the study of foundations should be expanded to include the things that HoTT people are interested in, that's one thing. But can we at least agree it does not have a ton of relevance to the questions people have traditionally considered in foundational work? (This is almost by definition if we set it up so that it is mutually interpretable with ZFC.)
The fact is both narratives are wrong --- intellectual history is always more pluralistic and meandering than partisans want to admit in hindsight. We, as the underdogs, are ready to admit that. Can you all do the same?
That link talks almost entirely about implications for programming and not mathematics. I recommend the Penelope Maddy article linked in a comment above for a more in-depth discussion of the relevant history and aims of the field. She says things much more clearly than I can.
See, that is the basic misunderstanding here. Foundations now is a wider topic area than it was in Gödel's day. It's like saying that automobiles run on internal combustion engines in a productive vein of manufacturing going back to Henry Ford (at least). Since those days though, there is this guy Elon who you might like to meet...
I absolutely agree that it's a wider, more sophisticated topic today. This doesn't mean that its main concerns have fundamentally changed. The Maddy article linked in a comment above is an extremely clear explanation of this.
Thanks, that looks interesting and I'll try to read it. But it appears to mostly be about philosophy, and it doesn't say much about what (mathematical) logicians are doing today.
I did start to read it last night, and finished today.
My understanding is that regular MLTT can already do all the bold things set theory does, and HoTT (and other developments) aims to make the "Proof Checker" item better realized (no more "setoid hell").
Ultimately what us CS people care about is a unity of theory and practice. What drew us to computers in the first place was trying to make explicit and tangible ideas that previously were only laboriously communicated from one generation of big brains to the next in the ivory tower.
In that regard, I would say don't even look at HoTT or fancy theory developments. Look at Lean, written by a non-type-theorist, and mathlib, written by a non-constructivist.
Lean and mathlib are plainly plowing ahead in ways that no set-theory-based tool has ever achieved, and covering all the set theory bullet points in the process.
Perhaps a new, more elegant type theory will replace Lean soon, but honestly I am not even worried about that. The important thing is making math much more accessible without loss of rigor with these tools. If better theories can follow rather than lead that methodological revolution, that's fine with me.
As to category theory and essential guidance, I have a simple test :). As mathlib (or its successors) matures, let us see how many proofs are refactored to use category theory! I don't want to poo-poo the deeper connections between type theory and category theory, but merely using lots of category theory "library code" in a type-theoretic proof assistant not itself designed with explicit reference to category theory is a good enough success story for me.
In conclusion:

- Risk assessment (depends on how many "large cardinal" type things we want; unlike the 1970s category theorists, I personally don't really care):
  - Regular type theory: *check*
  - HoTT: *check*
  - Set theory: *check*, bonus points for more exotic large cardinals
- Regular type theory: *check*
- Set theory: *check*
- HoTT: WIP, because of the culture shock of weird equality
- Proof checker:
  - Set theory: *FAIL*
  - Regular type theory: *partial support*, full if we discard all constructive side goals (Lean)
  - HoTT: *check*! (very recent cubical results)
- Essential guidance:
  - Category theory does best, but can be used as a "library" rather than a "language"; good enough for now.
I have no objections to claiming success in formalizing mathematics! As I've said, Lean is a big win for type theory.
But formalizing proofs with computers is essentially separate from the rigorous study of the foundations of math (consistency, relative strength of theories, etc.).
You've complained above about set theorists not learning "your stuff," and now it's my turn to complain. You write: 'Risk Assessment: depends how many "large cardinal" type things we want (unlike the 1970s category theorists I personally don't really care).' Large cardinal axioms are essential to the study of the relative consistency strength of mathematical theories. You personally might not care about them, but accommodating them is not a negotiable detail for anything that purports to be useful in the study of foundations.
And yeah, I'm aware the HoTT people have ways around this, fine. My comment is more about the perspective conveyed by your remarks: if you want to have a seat at the table in a discussion about foundational issues, you need to understand basic facts like this. My suspicion is that you might have had bad interactions with set theorists because you don't share certain knowledge they take for granted, and then they get frustrated and end the conversation early. I truly believe that if you went and read a book on set theory and logic, and learned, e.g., why large cardinals are important, you might understand better where I and other people are coming from.
> Large cardinal axioms are essential to the study of the relative consistency strength of mathematical theories. You personally might not care about them, but accommodating them is not a negotiable detail for anything that purports to be useful in the study of foundations.
It's my understanding that some larger cardinals are needed for foundations, but since the foundations problem was "solved", work on large cardinals has continued, and that is more "set theory for set theory's sake" than something needed to accommodate the math that other "regular" mathematicians are doing.
(With the possible exception of the Grothendieck and Voevodsky line of problems, perhaps, as the article relates --- separate and ancillary to Voevodsky's desire for a proof checker, and also separate from whether or not categorical methods alone can suffice.)
I actually don't know of type theory having "ways around this". My understanding is that while type theory doesn't appear to face fundamental obstacles here, set theory "has the lead" in developing large cardinals (or their equivalents) due to being older, and the type theorists' relative disinterest in this area has meant the gap is not closing.
Large cardinals are closely related to what type theorists call "universes". Again, there's a huge amount of equiconsistency and relative consistency results linking type- and set theories. To keep suggesting and implying that type theory cannot possibly be a language for talking about these things is so incredibly careless and obtuse that some people might call it actively misleading.
I'm not sure how that comment is responsive to my post. Sure, type theorists talk about universes, and sure, there are relative consistency results linking type and set theories. My claim is just that large cardinals (or some equivalent concept) are essentially necessary to order theories in consistency strength, so it is hard to take a response of the form "it doesn't matter whether HoTT can talk about large cardinals" seriously.
Not downvoting; in particular, there are useful links in your post.
But note that "foundations" has at least two different meanings. One meaning is examining the consistency of axioms, or finding the weakest set of axioms for a theorem.
The other meaning is finding a foundation in which a given mathematical theory is naturally and easily described, analogous to designing a programming language in which the programs are easy to describe.
Most of the time in mathematics, this is done by defining high level concepts and working from there - no need to change the foundational axioms or rules of inference.
But sometimes we need more - for instance, if you read the book "Synthetic Differential Geometry", infinitesimals are naturally embedded in the theory by working in a cartesian closed category of sets. One can either embed this in ZFC using cumbersome models, or change the logic itself (remove the law of excluded middle).
The logic in HoTT deals with equality in a more careful way than the usual foundations. This is not restricted to algebraic topology alone - it certainly includes algebraic geometry and tackles a general issue which will plausibly have many more applications once the technology is available. The field of noncommutative geometry, in the style of Connes, is also about this, as the problem of taking quotients is another way of stating the issue with equality.
It is a work in progress and might take some time before bearing substantive fruit, but note that the current alternative is that any student learning higher level mathematics at the level of Lurie's books has to go through a huge amount of foundational material while the ideas themselves may not be intrinsically complicated. Instead, this complexity is an artifact of embedding the theory in current foundations like the SDG example mentioned above.
With a new foundations, there is the possibility that we can teach the material at this level to students, with only some tweaks in the foundations.
I don't have much to say here that I didn't already say above about the meta-mathematical issues. However, about object-level mathematical issues, you write: "It is a work in progress and might take some time before bearing substantive fruit, but note that the current alternative is that any student learning higher level mathematics at the level of Lurie's books has to go through a huge amount of foundational material while the ideas themselves may not be intrinsically complicated. Instead, this complexity is an artifact of embedding the theory in current foundations like the SDG example mentioned above."
Higher category theory is not the same thing as homotopy type theory. One does not need to learn anything about the foundations of mathematics to read and understand Lurie's work. As far as I know, issues of ZFC vs type theory simply do not arise. You can just pick up Higher Topos Theory and start reading, assuming you have a strong background in algebraic topology.
The linked article is only about higher category theory and making sure all of its claims are written down in a clear and rigorous fashion to be maximally useful to workers in the field. It has nothing to do with the foundations of mathematics as a whole.
> One does not need to learn anything about the foundations of mathematics to read and understand Lurie's work.
By foundations here, I didn't mean foundations of mathematics, rather the amount of material a student needs to read to become familiar with the basic definitions of what a space means. We can teach topology to undergrads; can we teach higher categorical notions of space? Even schemes themselves had a daunting reputation some time back, and now we have derived stacks, which involve infinity categories.
> You can just pick up Higher Topos Theory and start reading
Yes, but my point, as illustrated in the SDG example, was that definitions can be easy or hard depending on which foundations you use. You haven't mentioned whether you disagree with this; if you look at Kock's book, I think you will agree.
When dealing with higher categories, a foundation which uses the '=' symbol in a different way, with an equivalence relation (modelled on homotopy equivalence) playing a much more prominent role, could make the theory much more natural.
Note that I am not even emphasizing sets vs type theory here, but excluded middle and a good notion of equivalence (and equivalence between equivalences... like in homotopy theory) in the foundations.
Maybe a lot of this debate will get dissolved if we just use two different words for 'examining consistency of axioms' and 'naturality of theory as encoded in a given system' instead of using the same word - foundations - for both of them.
At the very least if discussions begin with an awareness of this distinct usage, there would be lot less talking past each other.
1) Yeah, Lurie's books are advanced tracts aimed at specialists in algebraic topology. Of course there are going to be substantial prerequisites. This is sort of inevitable if you want to understand the motivating applications.
2) Why would we ever want to teach higher categorical notions of space to undergraduates? We can barely get them out in four years knowing real analysis and abstract algebra. The next stop would be the basic theory of manifolds, not higher topos theory.
3) I looked at Kock's book (the draft of the CUP book on his website). The definitions are more abstract than the ones I would give to undergrads, and consequently harder for them to understand. The appeal is not clear to me.
4) You don't have to change your foundations (meaning ZFC vs. type theory, etc.) to do higher category theory in a reasonable way. Again, go pick up Lurie's books. The exposition would not really be affected by whether you choose type theory or ZFC for a foundation. Let me know if you think there's a counterexample to this claim. (The exposition probably would be affected if you denied the law of excluded middle, for the worse; luckily Lurie doesn't do that.)
5) Foundations are sort of irrelevant here anyway. The point of ZFC is that you can forget about it when actually doing math. If you're thinking about foundations when reading Lurie's books, you're doing something wrong. (Consider: most mathematicians have zero opinion about foundational questions or concepts because they have no bearing on their day to day work.)
ZFC is the x86 assembly language of foundations.
Saying that ZFC is good enough because you can do all of mathematics in it is like saying that we don't need Java/Rust/Python because x86 assembly is good enough because x86 assembly is Turing complete. True, but that's missing the point of what people are trying to do with higher level languages.
You can theoretically do everything you want in ZFC, but it's low level and a bit outdated and you don't actually want to write formal proofs in it. We know from experience that it is possible to translate the kinds of proofs that people do on paper into formal ZFC proofs, at least in theory. The mathematicians that study ZFC-like foundations are OK with that "in theory"-qualifier and are instead investigating the theoretical properties of the foundations.
The goals of (homotopy) type theory are very different: they are trying to be suitable for actually doing modern mathematics in, and computer proof assistants are being developed that make it possible to actually write down formal proofs in them, in the form of proof code. For this to work it needs to be possible to translate all the kinds of arguments that mathematicians write down on paper into formal proof steps. It is not sufficient to remark that it is theoretically possible to find some kind of translation: it must actually be done for the computer to accept the validity, and the burden must not lie on the mathematician writing down the proof. Instead, the system must make it easy to write down the steps that the mathematician wants to do. One of the kinds of arguments that mathematicians use on paper is that if A is isomorphic to B, then a theorem about A automatically gives you a corresponding theorem about B. Homotopy type theory is trying to formally enable this kind of reasoning.
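To see the gap concretely, here is what transporting a theorem across an explicit bijection looks like in plain Lean 4 today (a toy sketch with made-up names like `Equiv'` and `transfer`; this is the manual step that univalence is meant to make automatic):

    -- A bare-bones equivalence: a map with a chosen inverse.
    structure Equiv' (A B : Type) where
      toFun     : A → B
      invFun    : B → A
      right_inv : ∀ b, toFun (invFun b) = b

    -- A theorem about A yields a theorem about B, but only by
    -- rewriting through the bijection by hand, point by point.
    theorem transfer {A B : Type} (e : Equiv' A B) (P : B → Prop)
        (h : ∀ a, P (e.toFun a)) : ∀ b, P b := fun b =>
      e.right_inv b ▸ h (e.invFun b)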
In summary, the foundations people interested in ZFC have a very different goal than the foundations people interested in (homotopy) type theory. The fact that the ZFC people are perfectly happy with ZFC and only care about ZFC doesn't mean much. The goals of the type theory people are not achieved by ZFC, not by a long shot.
Note that it is not the case that the general body of mathematicians cares about ZFC and does not care about HoTT. They care neither about HoTT nor about ZFC. The nod to ZFC that mathematicians give when pressed is mere lip service; most mathematicians would not even be able to list the axioms of ZFC. Only foundations people care about ZFC, and the kind of foundations that the ZFC people care about is irrelevant to the practice of general mathematics.
> ZFC is the x86 assembly language of foundations.
> The goals of (homotopy) type theory are very different: they are trying to be suitable for actually doing modern mathematics
> Only foundations people care about ZFC, and the kind of foundations that the ZFC people care about is irrelevant to the practice of general mathematics.
I agree with you that this is really about the foundations of algebraic topology, but what makes you say that these are not the foundations of mathematics? In the community of computer-aided formalization, a lot of interest in modern type theory arose because some things in algebraic topology just couldn't easily be done in the old first-order-logic + ZFC framework: for example, Kevin Buzzard once asked if Isabelle/HOL (which uses higher-order logic) can even be used to formalize schemes! (It turns out that it is possible if you do it in a special way with locales... but the Lean formalization still seems more elegant.) Of course, there is a consistency proof of Lean's type theory using inaccessible cardinals, so the construction of schemes carries over to ZFC + large cardinals, but then it is not very understandable anymore.
I agree with your point about HoTT not being useful to the algebraic topology community, but this seems orthogonal to its usefulness in formalization: I also would not expect Lean's type theory to be directly useful to working mathematicians. Instead it can make formalization easy or even possible: Even Jacob Lurie seems to acknowledge that it could be useful for that and only disputes that it does not bring new insights on an intuitive, informal level. My reading of the discussion is that some HoTT proponents see HoTT as a new language of mathematics (similar to category theory) and that Lurie does not believe that this perspective adds much, but that he does believe that one can use that perspective to understand the theorems.
In the linked discussion Voevodsky points out that "Calculus of Inductive Constructions, the type theory that we [are] currently using to formalize mathematics in the univalent style [and basically what Lean uses] is surprisingly convenient for doing mathematics at the level of sets and categories (and maybe 2-categories). As for mathematics of higher h-levels it is, in practice, of very little use. Note that this is a great advance of what was before as before we did not even have a system for doing abstract mathematics both formally and conveniently at the level of sets." [0]
This highlights how crucial type theory is for doing modern algebraic topology, even as he seems to claim that HoTT actually cannot formalize Lurie's work? Still, he seems to agree that dependent type theories were a step in the right direction (unsurprisingly, since he founded HoTT) and that more work is needed on finding new foundations for mathematics. Indeed, what you describe below as the "mistaken belief that finding a new foundation of mathematics is an interesting project in the 21st century" is something that some people in foundations (and close to the Lean community), like Jeremy Avigad, are actively pursuing: see e.g. https://arxiv.org/pdf/2009.09541.pdf
> It turns out that it is possible if you do it in a special way with locales
What was special about it? From memory, the formalisation [1] proceeded exactly how you would expect. Locales are simply an Isabelle mechanism (in addition to type classes) through which hierarchies of structures are built up.
Curious how approachable you think homotopy type theory is to people who think they know better than researchers? Simple type systems would be understandable (Haskell has a type system similar to System F, and Rust has an affine type system), but if anything, being a programmer may add obstacles to understanding HoTT, since programmers (generally) have never had the opportunity to learn topology or anything else that HoTT builds on. Also, not everything in HoTT has a clear computational interpretation, notably the univalence axiom.
That said I encourage everyone who's interested to investigate this but I don't think it's realistic without having a solid foundation in mathematics.
(And I also agree with the sibling comment that HoTT isn't really used as a foundation of mathematics.)
I don't think it's an easy journey—not because the material is intrinsically difficult, but because almost all resources are written for mathematicians and computer scientists. I think a "homotopy type theory for programmers" book/blog/playlist could be a big hit. I've been considering putting something like that together.
I view the relationship between HoTT and topology as similar to that of functional programming and category theory: you don't need to know the latter to learn the former, but having that extra background can certainly help.
I was also considering writing some introductory stuff about HoTT, but I don't think it's possible to avoid talking about topology, homotopy theory, and category theory in depth like you can with, say, Haskell and category theory. This is because the stuff that makes HoTT so cool cannot be separated from its mathematical foundations. How would you explain truncated types without going fairly deep into the math? Or the circle type? Sure, both of these could be explained as instances of higher inductive types, but that explanation is missing a lot.
As already has been done with category theory, you need to target a specific audience, whether it is programmers, scientists, engineers, or mathematicians, and illustrate the concepts by giving examples from their area of expertise, if possible.
> You'll see why it's useful to have a formal justification for treating isomorphic objects as equal (this is called "univalence"), which is something you may have even been doing informally/subconsciously all along.
Would you be able to discuss how this compares to quotienting? It sounds similar, but I assume has very important differences. I have, very formally, been doing quotienting. I identify an equivalence relation, quotient it, prove that the properties and operations of interest are well-defined when lifted to the equivalence classes, and then work exclusively at the level of the equivalence classes and the lifted operations. Say, when I work exclusively with reals, which are equivalence classes of Cauchy sequences, or with cosets in group theory.
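For reference, that exact workflow is what quotient types mechanize; a toy Lean 4 sketch (the names `Parity` and `isEven` are mine), where the lift constructor demands precisely the well-definedness proof you describe:

    -- Naturals quotiented by the "same parity" equivalence relation.
    def Parity := Quot (fun a b : Nat => a % 2 = b % 2)

    -- Lifting an operation to the quotient requires showing it
    -- respects the relation, i.e. is well-defined on classes.
    def isEven : Parity → Bool :=
      Quot.lift (fun n => n % 2 == 0)
        (fun a b h => by
          have h' : a % 2 = b % 2 := h
          show (a % 2 == 0) = (b % 2 == 0)
          rw [h'])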
I see where you're coming from, but it's hard to appreciate the subtle issues surrounding the notion(s) of equality without actually investing some time into doing formal mathematics. My claim is that type theory is a good avenue to explore this subject matter, since you can check your work and get instant feedback with a type checker.
Anyway, I sympathize with your sentiment that it can be hard to know whether something is worth learning without knowing if or how that knowledge will be useful to you. I feel that way about many topics myself.
Except equality never meant "exactly the same" to begin with, though, right? Like we write 1 / (1 - x) = 1 + x + x^2 + ..., but one of them diverges for x = 2 and the other doesn't, so they're not "exactly the same", just equivalent in some sense. Or heck, even x / x = 1 doesn't mean the two are exactly the same. I'm not sure how far you could take math if you treat equality as meaning "exactly the same"; it seems even middle-school math requires understanding that that's not the case.
Edit: In case it wasn't clear, I'm hoping someone can explain how this notion of equivalence differs from what we've traditionally called equality since middle school, given that the latter doesn't appear to mean exactly the same either.
You’ve hit upon one of the key insights in category theory. It’s been awhile, let me see if I can remember enough to describe the sketch…
“Equality” is a tricky concept, and various branches of mathematics have slightly different ideas (or axioms) of equality. Simple arithmetic is based upon strict equality: two things are equal if and only if a sequence of logical steps is followed precisely and the final answer matches exactly. Other branches have looser (or just different) versions of equality: set theory may use a relation to induce an equivalence relation, topology may see things as equal based on structure-preserving transformations, computer science has notions of program equivalence based on identical outputs given identical inputs, etc.
Category theory is an attempt to identify and unify various similar concepts across mathematical branches, including “equality”. Category theorists are limited to talking about objects (which are opaque), and arrows between those objects. Using those (incredibly) abstract tools, how does one define “equality”? The answer is, one can’t, so what are the closest things possible that capture “equality”, and can be specialized back into more traditional notions as needed.
Part of the confusion is that category theory is described from the “arrows and objects” perspective, when the really interesting stuff happens two layers higher (natural transformations and above). And another part of the confusion is that many mathematicians see category theory as needless nonsense :)
> “Equality” is a tricky concept, and various branches of mathematics have slightly different ideas (or axioms) of equality.
Due to my own circumstances I’m drawn to notions of equality that have pragmatic value. One of my favorite is Leibniz’s rule
x=y => f(x)=f(y)
I use that rule so frequently and effectively when reasoning about programs that pragmatically I’m not interested in a model where it doesn’t hold. It looks very simple, but it’s practically foundational for reasoning about predicate transformers.
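(As an aside, that rule is a core primitive in proof assistants too; in Lean 4 it is the theorem `congrArg`, which needs nothing beyond the definition of equality:)

    -- Leibniz's rule, x = y => f(x) = f(y), as a one-line check.
    example (f : Nat → Nat) (x y : Nat) (h : x = y) : f x = f y :=
      congrArg f h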
Incidentally, I’d be really interested in a rigorous rendition of Dijkstra and Scholten’s work on program semantics in the language of category theory.
Leibniz's rule does hold under these generalized notions of isomorphism and equivalence. Each of these notions implicitly sets apart some class of functions `f` as "not evil", i.e. as respecting Leibniz's rule wrt. that equivalence. So as long as your predicate transformers are not "evil" in that sense, you'll be covered. And in practice, trying to work with possibly "evil" functions without a clear notion of what equivalences would render them "not evil" is generally a waste of effort and leads to results that are not semantically useful.
Observe that when you identify x with y, you are already drawing an equivalence between their formal representations (i.e. x, y), and the rule holds only for f that is blind to the formal representations and distinguishes only between these equivalence classes. But within math topics there are notions of equivalence stronger than just that; the topic is simply uninterested in "piercing through the equivalence veil". For example, computer scientists often talk about Turing machines being equal even when their alphabets are disjoint, as long as their structure is the same, typically reserving equivalence to mean accepting the same language.
You’re not even wrong. I suggest that you at least take the five minutes to read the Wikipedia summary on predicate transformers before posting a reply in which you observably have no idea what you’re talking about.
Briefly, predicate transformers are functions about the behavior of programs in a language, not functions in the language. In that model, predicates are pure functions over the program state space and predicate transformers are functions from predicates to predicates. Clearly your objections are inapplicable.
Decision theory (so, statistics and economics) has also always relied on the idea of equivalence (I believe the notion is the set-theoretic one) for purely practical reasons.
So for instance (drawing on the other reply in this thread about Leibniz), to say that
x = y => f(x) = f(y)
is not really useful when the meaning of the equality is contextual.
If two choices are equal for one person (or prior distribution, utility function, uncertainty aversion etc.), they may not be "equal" if anything changes. And if they are, but other choices are not, the "equality" doesn't help.
By contrast, if in the moment where a context is fixed (we look at one potential set of decisions, one choice relation, etc.) there exists an equivalence relation, then we can analyze the choice situation with math, and it's the equivalence of choices (with respect to this one relation) that does this. And then, naturally, if we conceive of a situation that we can analyze (or we have data on it), we can conceive the relationships that might have given rise to it.
And, as it turns out, there's typically infinitely many of those. And so, even lowly economists or statisticians must - in some simple way - grapple with what I would imagine categories are (?)
Would love to read more on this. Are there materials on category theory for non-mathematicians (and non-computer scientists thinking in terms of language theory)?
Mathematician here. I understand why some people interpreted this post as dismissive of the article's premise. But I wish it could be okay, on a forum that putatively exists for founders and makers and challengers of the status quo, to say "this doesn't make sense when I think about it on a deeper level". Even if the reason it doesn't make sense to the speaker is because the speaker isn't an expert, and ultimately the thing does make sense.
A challenge or contradiction doesn't have to be disparaging. We are all laymen sometimes, and sometimes we just want to scratch an itch. It usually comes from a place of curiosity, not arrogance. We want to understand how something that seems weird to us can actually make sense, even if our understanding is only limited and superficial.
Why not let people express that in a direct way, without a bunch of deferential throat-clearing ("not an expert, just my thoughts here, this doesn't make sense to me...")? Who is hurt?
Great comment, thank you. Also, bravely asking these obvious, "stupid", elephant-in-the-room, naive questions on HN is a great place for a discussion to start, and will attract people who can explain things in simple terms. The comments will be more useful reading to more people than ones on a high, technical level. 'Being wrong on the internet' attracts truth. It's not the experts who will object to such questions!
And as a complete non-expert in this subfield of math (or anything within a hundred miles of it) I would love to hear why people are saying stuff like "equality is a fundamental mistake of mathematics and this approach is fixing it" rather than "this approach is developing a more refined/general/flexible notion of equality".
After a few minutes of deferential throat clearing, I'd like to say "in all seriousness, come on, equality isn't a mistake, and whatever brilliant insight your theory affords still isn't going to do away with it."
Programs and logical proofs are "isomorphic" according to Curry-Howard. So logical theories like math in general are "programs". Two programs can be "equal" in the sense that same inputs produce the same outputs, yet their source-code can be very different.
Therefore, to explain a theory to a layman or anybody else, consider that your theory is a program, and therefore there can be any number of equal/equivalent but different programs, i.e. representations of that theory that are "equivalent" but look different.
Therefore (and this is the point I'm trying to make, and maybe I'm wrong), any good explanation of a theory could also be presented in some other way, and possibly in a simpler way. Programmers know this intuitively: you can write the same program in a complicated way or in a much simpler way, sometimes. Therefore the fact that some math is difficult to understand may in fact not be the fault of the student, but of that "math" itself.
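A tiny Lean 4 illustration of that point (toy propositions, nothing beyond core Lean): two different "source codes" for the very same theorem, one direct and one roundabout.

    -- Same theorem, two different programs (proofs), via Curry-Howard.
    example {A B : Prop} (h : A ∧ B) : B ∧ A := ⟨h.2, h.1⟩   -- direct

    example {A B : Prop} (h : A ∧ B) : B ∧ A := by           -- roundabout
      cases h with
      | intro a b => exact And.intro b a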
And therefore it is useful that people come up with questions about what they find difficult to understand about a given theory. The "stupid" questions reveal something about the theory, they reveal what is difficult about it, what kind of mistakes people make when trying to understand it.
> In case it wasn't clear, I'm hoping someone can explain how this notion of equivalence differs from what we've traditionally called equality since middle school, given that the latter doesn't appear to mean exactly the same either.
The point is that what's called "equality" in current foundations of mathematics is not equality in the middle-school math sense, but something far more restrictive. There are older, property-based notions of "equivalence relation" that broaden this a bit, but not nearly enough for many uses. OP describes a logical formalism that might enable us to make rigorous, watertight inferences about something really, really close to what a middle-schooler would call "equality". It gets rid of a pervasive abuse of mathematical language and notation, a worrying "gap" in previous attempts to make mathematical reasoning rigorous.
In ZFC theory (fundamental maths), "A in B" means "there is a relation between A and B, and this relation is called 'in'". Replace 'in' by the correct math symbol or anything you want, it works the same.
From this, we can define a new relation called "=" such as:
for all x in A and for all y in B, x in B and y in A
We don't care what A, B, x or y are. We don't care what "x in A" means. It's just logical statements that are useful, and we build the rest of maths upon it.
For me "=" does not mean "equal", "equivalent", or anything. It means "there is a relation between 2 objects".
For example:
- in a programming language "a = 3" means that "a" is related to "3"
- in a BNF grammar "rule := term term term" means that "rule" is related to "term term term"
- in a math equation "y = x + 1" means there is a relation between "y" and "x + 1"
- on some old calculator "A <- 1" means there is a relation between "A" and "1"
In each of those domains, the relation is precisely defined. But it does not need to be the same relation (they are different math theories).
Another example, "y = x + 1" could be interpreted as "there is a value for x for which I can replace x + 1 by y". You don't even have to care about what "x + 1" means.
In both of your examples equality holds for x in some specified domain. The first equation is not valid (outside of, say, analytic continuation) for |x| >= 1.
I thought the article was cool. From this comment, you are coming across as someone who is trivializing a potentially important piece of work. I didn’t really understand it, but I take mathematicians at IAS seriously.
Why the ad hominem? I have nothing against the mathematicians or their work. I don't even know the math; that's why I'm reading the article. And I'm just trying to wrap my head around the article's explanation and trying to figure out why its premise seems inconsistent with my understanding. The problem is more likely somewhere in my understanding, so I posted it here in case somebody could clarify whether/what I'm misunderstanding about the difference between what we've called "equality" and what they're calling "equivalent", given that neither of them seems to necessarily mean "exactly the same".
> you are coming across as someone who is trivializing a potentially important piece of work.
I doubt it. The parent is confused, hence the question, obviously. But I get the impression that the parent would benefit from reading an elementary textbook on discrete math, as the post is gibberish (as are a few other posts in this thread). I know it sounds rude, but I don't know how else to put it. It's not only that the parent doesn't have a conceptual idea about equivalence relations and in particular equality, isomorphism, let alone homotopy, but they are also very confused about what it means for a series to converge, which is also very elementary. Before discussing the merits of Dostoevsky in Russian, one must learn and know the Russian alphabet at the very least, or something.
> I'm not sure how far you could take math if you treat equality as meaning "exactly the same"; it seems even middle-school math requires understanding that that's not the case.
I honestly didn't mean that in a rude way, more like a "hey dude, be careful how you're coming across here, as there may be things you don't understand." I read the above phrase as describing a mathematician's work as something a middle schooler should already know. But I take back what I said. I'm all good, everyone's good, I think there was just a misunderstanding here.
In the same vein, I don't see how you could read my message as needlessly insulting, as I thought it was pretty softly worded. But I guess that's the nature of written communication. I think you should have given me the benefit of the doubt, just as I should have done for the original poster.
No, it wasn't softly worded. Now you take back what you said, although it was fine and I'm the one in the wrong to think it was insulting?! Etc. I don't think those are reasonable interpretations. Your comment was good without the last paragraph. Can't help sinking the boot in, eh? That's not a very nice habit. But anyway, I have things to do today! See ya.
What you have up there is an equality. For example, x = y means x and y are the exact same thing. Equality is one example of an equivalence relation, of which there are a ton. Equivalence relation is the more general concept.
A set is a bag of items. We usually call these items elements of a set. For example, {1, 2, 4} is a set containing elements 1, 2, 4.
Ordered pairs are tuples written like so: (a, b), where (a, b) != (b, a) unless a = b.
A relation is an association from one set to another. For example, a -> x, b -> &, c -> apple is a relation from {a, b, c} to {x, &, apple}. Let R stand for this relation; then we write R = {(a, x), (b, &), (c, apple)}, where R associates a with x, b with &, and c with apple.
Consider the sets X = {1, 2, 4, 5} and Y = {6, 9, 1000, 5} with the following association R from X to Y: 1 -> 9, 2 -> 6, 4 -> 1000, 5 -> 5. This R looks like an ordering relation, meaning this R is ≤. Let's see: 1 ≤ 9, 2 ≤ 6, 4 ≤ 1000, 5 ≤ 5. Then ≤ = {(1, 9), (2, 6), (4, 1000), (5, 5)}.
Some relations are called functions if they meet certain criteria.
If a relation R abides by the following rules it is called an equivalence relation. For any a, b, c in a set S:
1. (a, a) in R
2. If (a, b) in R, then (b, a) in R
3. If (a, b) and (b, c) in R, then (a, c) in R.
For example, congruence modulo integer n is an equivalence relation as it abides by the three rules above which can be easily shown. Two integers a, b are congruent modulo n if n divides a - b.
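Here is that example rendered in Lean 4 (a sketch using the equivalent "same remainder" formulation, so the three laws reduce to the corresponding properties of =):

    -- Congruence modulo n, stated via remainders.
    def congMod (n a b : Nat) : Prop := a % n = b % n

    theorem congMod_refl (n a : Nat) : congMod n a a := rfl

    theorem congMod_symm {n a b : Nat} (h : congMod n a b) :
        congMod n b a := Eq.symm h

    theorem congMod_trans {n a b c : Nat} (h₁ : congMod n a b)
        (h₂ : congMod n b c) : congMod n a c := Eq.trans h₁ h₂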
Equality is but one of many different equivalence relations.
What I wrote above doesn't even come close to scratching the surface of this topic. However, any introductory textbook on abstract algebra, discrete math, or elementary number theory will have this material well fleshed out. Also, every intro-to-proofs book dedicates a significant number of pages to the topic of relations. But this stuff is very elementary and many steps removed from category theory, and so you are missing some foundational background to understand the topic broached in the article. You want to have some experience with abstract algebra and general topology before approaching category theory.
As for the equality you gave above,
let X = 1 + x + x^2 + ... and Y = 1/(1 - x). Then X is a geometric series and X = Y only if |x| < 1. If |x| >= 1, then X diverges. When X = Y, you cannot talk about one side of the equality diverging and the other not as they both are the same object.
This topic of geometric series (and many others) can be found in any textbook on elementary real analysis. Textbooks are great because they start from the beginning, don't leave out (or add) some random topics, give airtight definitions, and the statements are given watertight proofs (like the one above about X = Y, which needs proof).
One thing I did not find in the article is what the improvement here is.
If I read it (and some of the comments here) correctly the idea is to replace equality from ZFC (so isomorphic sets) with equality in categories (so isomorphic objects in a k-category).
The article says something about better describing all the isomorphisms, but doesn't explain why that is better.
A comment here suggested to me that this system might make it easier to rigorously 'compose' different types of equivalence relationships. Something like "this field extension has a Galois group 'equivalent' to A. And field extensions of A have this other property, so this field extension must also have this property".
Am I on the right track as to what improvements this system offers?
As a relative layperson alongside you, I'm not sure what precise deficit Lurie is addressing. But having spent some deep time in a proof assistant (Agda) lately, I think there is a related issue.
Formally, equality can be "intensional" or "extensional". An intensional theory says that things are equal only when they are defined the same way. So, for instance, "the set of even primes" and "the set of primes less than 3" are not intensionally the same. In contrast, an extensional theory says that things are the same when they behave the same. Both sets answer the same membership questions the same way -- 2 is the only element of either -- so they are the same set.
ZFC is fundamentally extensional -- one of its axioms is literally called the axiom of extensionality, defining equality in terms of membership. In deeply identifying the above sets, we lose some of the ability to distinguish them formally and model the way in which they are equivalent.
That's one guess.
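A small Lean 4 sketch of the contrast (my own toy definitions; `funext`, function extensionality, is what does the work, and the example assumes a recent Lean with the built-in `omega` tactic):

    -- Intensionally different definitions...
    def double₁ (n : Nat) : Nat := n + n
    def double₂ (n : Nat) : Nat := 2 * n

    -- ...that are extensionally equal: `rfl` alone does not prove this,
    -- since the two bodies are not definitionally identical.
    theorem double_eq : double₁ = double₂ :=
      funext fun n => by unfold double₁ double₂; omega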
Another, less specific guess is that it's much like how the dual of a vector space is isomorphic to the original vector space, but in a way that depends on certain arbitrary choices. Meanwhile, the dual of the dual is naturally isomorphic to the original vector space. Thus, there are qualities of equivalence that we're interested in. (But this example is perfectly well captured by linear algebra as it stands.)
In standard mathematics you cannot identify isomorphic objects without the use of an informal argument or abuse of notation: the issue is that the two equivalent objects aren't actually identical, they just behave the same in some way.
This becomes necessary if you look at more complex objects than the ones equality was originally designed to tackle:
a permutationsort and a quicksort are not the same, therefore you can't (at least in the original sense of equality) write permutationsort=quicksort.
Of course you can invoke the notion of Leibniz equality and say that ∀x. permutationsort(x) = quicksort(x), therefore they are the same.
However, you can't generally replace every occurrence of "quicksort" in your program with "permutationsort" because most likely your program is not going to finish before the heat-death of the universe.
So for your program, you need to be able to express the notion of "the two algorithms have to be the same wrt. their (input, output) pairs and asymptotic runtime".
However, if you ask your friend, the pure mathematician, you will notice he is quite fine with either one of them, because for the soundness of his new algorithm he just needs _some_ sorting algorithm: in fact, he wants every sorting algorithm to work, not just the specific one he chose to prove his algorithm with.
The third person you go to is your old computational-complexity-theory professor: from his point of view, permutationsort and quicksort aren't the same at all, but quicksort and matrix multiplication are: after all, they are both P-TIME algorithms.
Meanwhile your, at this point very confused, mathematics student comes up to you and asks "if we take a vector space with basis {(1,0),(0,1)} and transform it into a new vector space with basis {(1,2),(0,1)}, are they still the same?" and simultaneously two mathematicians stand up, one saying "yes, of course, both describe ℝ²" while the second says "no they are not: clearly one is an orthogonal basis, while the other isn't". Depending on the proof you want to do, both can be right.
Important here is that all of these aren't only considered "equivalent", they are considered "equal", just like you would consider the same algorithm written in Java and C++ as the same, even though they clearly aren't. Therefore, what you want to do is extend the notion of equality such that these notions of equivalence are covered by it. For that, you need a more fine-grained notion of equality, which leads to equality being "indexed" by its proof (after all, you need to know which "equality" you were talking about). This will very quickly lead you to the notion of "paths as equalities" and the univalence axiom.
So you get rid of the uniqueness-of-identity-proofs axiom, and instead build different notions of equality on different "levels", to have all these notions of equality (and many more) in your system natively.
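A sketch of a couple of these coexisting "equalities" as explicit relations in Lean 4 (toy names; the complexity-based one is omitted since it would need a cost model):

    -- Behavioral equality: same (input, output) pairs.
    def sameBehavior {α β : Type} (f g : α → β) : Prop := ∀ x, f x = g x

    -- Two intensionally different programs...
    def idA (n : Nat) : Nat := n
    def idB (n : Nat) : Nat := n + 0

    -- ...identified by the behavioral notion of equality.
    example : sameBehavior idA idB := fun _ => rfl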
For my mathematical bird-brain, the most straightforward way to capture the situation you describe would be in terms of equivalence relations: Each person in your example defines an equivalence relation and each is "valid" and useful in their own way.
My question would be:
1. What makes this informal? Is it because a relation is in some way mathematically informal, because given general objects we either have to enumerate permutations or we have to describe the relation in some informal way? Or is it something else?
2. Starting with the above understanding of equivalence relations, where does category theory come in, and what does it solve (and add)?
My guess would be that it is operating on a higher-order concept than equivalence relations. For instance, seeing that a situation might be described/given by several equivalence relations, we'd probably want to have equivalences between equivalence relations, and then work based on "quotient sets of equivalence relations" and derive theorems based on that?
Doing everything in terms of explicit equivalence relations is perfectly rigorous, but it results in a combinatorial explosion of bookkeeping tasks for anything other than set theory, and so no one actually does it. We want foundations that allow us to engage in idiomatic mathematical reasoning and be perfectly rigorous at the same time.
> For instance, seeing that a situation might be described/given by several equivalence relations, we'd probably want to have equivalences between equivalence relations, and then work based on "quotient sets of equivalence relations"
tl;dr seems to be that some mathematicians think Jacob Lurie's work on infinity categories points to a better concept of equivalence than normal equivalence relations. Those mathematicians have been playing fast and loose with developing this theory, and now a few of them are trying to construct infinity categories of infinity categories that they think might fill holes in some of the less than rigorous work that's been going on.
That sounds interesting, but unless higher category theory has any meaningful applications to non-mathematicians learning math in high school or undergraduate, saying that any of this will replace the equals sign or the standard notion of equivalence relations is overselling it.
> That sounds interesting, but unless higher category theory has any meaningful applications to non-mathematicians learning math in high school or undergraduate, saying that any of this will replace the equals sign or the standard notion of equivalence relations is overselling it.
It's possible that our new understanding of equality as it appears in higher category theory and homotopy type theory will finally make working with proof assistants sufficiently practical for mathematicians, since it enables reasoning in a way that is closer to how mathematicians reason informally (treating isomorphic objects as equal, working with quotients, etc.). This area of research is obviously an active construction site, but we've seen incredible progress in the past two decades (e.g., cubical type theory).
In high school, we often work informally with notions that have a more formal treatment as one progresses to university and graduate school. For example, in high school, one might be introduced to calculus by learning some rules for calculating derivatives and antiderivatives. Later, in university, those calculations are justified by real analysis. One could imagine a similar treatment of equality: in high school, we learn how to do equational reasoning by substitution, and then in university maybe we'd learn about (for example) how equality is defined as an inductive family and how substitution is justified by its corresponding induction principle.
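For instance, that inductive family fits in a few lines of Lean 4 (a renamed toy copy of the built-in `Eq`), and substitution really does fall out of its induction principle:

    -- Equality as an inductive family, generated by reflexivity alone.
    inductive MyEq {α : Type} (a : α) : α → Prop where
      | refl : MyEq a a

    -- Substitution is the induction principle in action.
    theorem mySubst {α : Type} {P : α → Prop} {a b : α}
        (h : MyEq a b) (ha : P a) : P b := by
      cases h
      exact ha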
So, your comment seems a bit short-sighted or closed-minded to me, even if you're correct that this will not have any impact on high school mathematics. I just think that's a poor measuring stick for impact.
This is conflating analytical descriptions of these potential mathematical objects with synthetic ones. Analytical characterizations ("how does this object arise?") are what's sought in this kind of research, but a synthetic one ("here's a mini-language that formally describes this mathematical object, and the exactly-stated 'rules of the game' that ensure you're using it correctly in a logical sense") is all that a high school student needs. High school students don't typically concern themselves with constructing, e.g., the real numbers as formal mathematical objects, but they need to know how to make correct statements in the language of real numbers.
Note that synthetic descriptions are not "informal" in any sense; they're just as watertight logically as analytical ones, they just arrive at the result in a different, and arguably complementary way.
Hopefully someone can explain here because I feel like equivalence fails in the same way equality does.
Is 1 + 2 == 3? Well, I understand they're equivalent, but they're not the same terms.
Is (1+(1+1)) the same amount of equivalence to 3 as 1+2 is? I'd much prefer a notion that could look at how 'equivalent' terms are, as opposed to just looking at equality after β-reduction.
In algebra, monads are used to reify the notion of an "explicit expression". So you have the integers, Z, and the monad m gives m(Z), the set of expressions (let's say sums) over Z. And then you have m(m(Z)) which is nested expressions and so on. An algebra for a monad evaluates an expression m(Z)->Z, and all the monad laws basically enforce the kind of equational reasoning ordinary notation takes for granted (eg sub-expressions can be replaced by their value, 1+(1+1)=1+2).
So those would be related in this way:

                         join
 (1)+(1+1) : m(m(Z)) ----------> 1+1+1 : m(Z)
          |                           |
          | fmap(eval)                | eval
          v                           v
     1+2 : m(Z) ------------------> 3 : Z
                     eval
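Here is that square as runnable code (a Lean 4 toy; `Expr` plays the role of m, `eval` is the algebra, `join` the flattening):

    -- A tiny "sum expressions" monad over the integers.
    inductive Expr (α : Type) where
      | val : α → Expr α
      | add : Expr α → Expr α → Expr α

    def Expr.eval : Expr Int → Int
      | .val n   => n
      | .add x y => x.eval + y.eval

    def Expr.join {α : Type} : Expr (Expr α) → Expr α
      | .val e   => e
      | .add x y => .add x.join y.join

    def Expr.map {α β : Type} (f : α → β) : Expr α → Expr β
      | .val a   => .val (f a)
      | .add x y => .add (x.map f) (y.map f)

    -- (1)+(1+1) as a nested expression, an element of m(m(Z)).
    def e : Expr (Expr Int) :=
      .add (.val (.val 1)) (.val (.add (.val 1) (.val 1)))

    #eval e.join.eval             -- top-right path:  join, then eval => 3
    #eval (e.map Expr.eval).eval  -- left-bottom path: fmap eval, then eval => 3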
That's the "infinity": ordinary (1-)categories have (1-)arrows between objects, 2-categories have 2-arrows between 1-arrows, 3-categories have 3-arrows between 2-arrows, and so on. Infinity-categories have (k+1)-arrows between k-arrows all the way up. (And a 0-category is just a set: objects, but no arrows)
> Is (1+(1+1)) the same amount of equivalence to 3 as 1+2 is? I'd much prefer a notion that could look at how 'equivalent' terms are, as opposed to just looking at equality after β-reduction.
"Amounts" of equivalence between terms isn't really the right notion here: there's not much you can do with "an equivalence exists", which is all that "X and Y are equivalent" really means. A 2-equivalence (which in this context I think is going to end up being something like a rewrite rule?) should be an equivalence between particular programs, not the whole space of programs implementing a given function.
There is such a notion. You write a number on the equality sign saying essentially how many operations can be performed, so 1+2 is equal to 3 if that number is larger than 0 (the special case of 0 is reflexivity, i.e. true equality). This is used a lot in some computer science stuff I've seen.
I bought a book on Category Theory by Emily Riehl that I find myself unable to get through. I both find her talks on YouTube to be more digestible, and I also wish I was better at mathematics.
Imagine if this guy became a trader. He would crush it, like James Simons. I wish there was more research into mathematical finance, but if anyone had something good they would use it to make money, not publish it.