
> Isn't there anybody close to the Feynman of Linear Algebra?

No. The subject is too young (the first book dedicated to Linear Algebra was written in 1942). Since then, there have been at least 3 generations of textbooks (the first one was all about matrices and determinants). That was boring. Each subsequent iteration is worse.

What is a dual space? What motivates the definition? How useful is the concept? After watching no fewer than 10 lectures on the subject on YouTube, I'm more confused than ever.

Why should I care about different forms of matrix decomposition? What do they buy me? (It turns out, some of them are useful in computer algebra, but the math textbook is mum about it)

My overall impression is: the subject is not well understood. Give it another 100 years. :-)




> No

Gilbert Strang (already mentioned by fellow commenters).

> The subject is too young

"The first modern and more precise definition of a vector space was introduced by Peano in 1888; by 1900, a theory of linear transformations of finite-dimensional vector spaces had emerged." (from Wikipedia)


The first book was written in 1942 - it's mentioned explicitly in LADR. It doesn't mean the concepts didn't exist - they did, Frobenius even built a brilliant theory around them (representation theory), but the subject was defined quite loosely - apparently no one cared to collect the results in one place. It doesn't even matter much: I remember taking the course in 1974, and it was totally different from what is being taught today.


What? Linear Algebra is easily one of the best understood fields of mathematics. Maybe elementary number theory has it beat, but the concepts that drive useful higher level number theory aren't nearly so clear or direct as those driving linear algebra. It's used as a lingua franca between all sorts of different subjects because mathematicians of all stripes share an understanding of what it's about.

From what you said there, it seems like you tried to approach linear algebra from nearly random directions, and often from the end rather than the beginning. If you're in it for the computation, Axler definitely isn't for you. There are texts specifically on numeric programming; they'll jump straight to the real-world use. If you want to understand it from a pure math perspective, I'd recommend taking a step back and tackling a textbook of your choosing in order. The definition of a dual space makes a lot more sense once you have a vector space down.


I sympathize with the person you're responding to a lot more than you.

It's very easy to understand what a dual space is. It's very hard to understand why you should care. Many of the constructions that use it seem arbitrary: if finite-dimensional vector spaces are isomorphic to their duals, why bother caring about the distinction? There are answers to this question, but you get them somewhere between 1 and 5 years later. It is a pedagogical nightmare.

Every concept should have both a definition and a clear reason to believe you should bother caring about it, such as a problem with the theory that is solved by the introduction of that concept. Without the motivating examples, definitions are pointless (except, apparently, to a certain breed of mathematicians).

I've read something like 100 math textbooks at this point. I would rate their pedagogical quality between an F and a D+ at best. I have never read a good math textbook. I don't know what it is, but mathematicians are determined to make the subject awful for everybody who doesn't think the way they do.

(I hope someday to prove that it's possible to write a good math textbook by doing it, but I'm a long way away from that goal.)


I absolutely see what you're saying with that. I think I'm definitely the target audience of the abstracted definition, but I've long held that every new object should be introduced with 3 examples and 3 counter-examples. But you said it yourself: that's the style pure math texts are written in! Saying that "we" as a species don't have a good understanding of linear algebra is unbelievable nonsense. I can't conceive of the thought process it would take to say that with a straight face. The fact is, 10 separate YouTube lectures disconnected from anything else is just the wrong way to try to learn a math topic. That's going to have as much or more to do with why dual spaces seem unmotivated as the style of pedagogy does.


It's not that we don't have a good understanding of linear algebra at all. It's that we don't understand how to make it simple. That's a separate technological problem from actually building the theory itself.

I'm not the person you were originally replying to, but I have taken all the appropriate classes and still find the dual space to be mostly inappropriately motivated. There is a style of person for whom the motivation is simply "given V, we can generate V* and it's a vector space, therefore it's worth studying". But that is not, IMO, sufficient. A person new to the subject can't make sense of that without understanding the alternative: what it would mean not to define it, or to discard it, and ultimately why one approach was stolen over the others.

I think in 50 years we will look back on the way pure math is written today as a great tragedy of this age that is thankfully lost to time.


> I think in 50 years we will look back on the way pure math is written today as a great tragedy of this age that is thankfully lost to time.

That could very well be true. I mean, just 100 years ago mathematics (and most education) consisted almost exclusively of the most insane drudgery imaginable. I do sometimes wonder what the world could have been like if we hadn't gated contributions in math or physics behind learning classical Greek.

I do think that some of the issues come down to different learning styles. I personally like getting the definition up front: it keeps me less confused, and I can properly appreciate the examples down the line. The way Axler introduces the dual space was really charming for me, and it clicked in a way that "vectors as columns, covectors as rows" never did. But that's not everyone! It's by no means everyone in pure math, and it's definitely not everyone who needs to use math. I've met people far better than me who struggled just because the resources weren't tuned towards them; there's a huge gap.


*stolen => chosen


more like in 100 years


My argument is: whoever understands linear algebra has to be able to explain it to anyone with a sufficient math background. The failure to do so signals a lack of understanding. Presenting it as a pure algebraic game cleverly avoids the problems of interpretation, but when you proceed to applications, it leads to conceptual confusion. One "discovery" I made while learning LA is that most applications are based on mathematical coincidence. Namely, the formula for the scalar product of 2 vectors is identical to the formula for the correlation between 2 series of data. There's no apparent connection between "orthogonality" in one sense and "orthogonality" (as a lack of correlation) in the other.
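For what it's worth, that coincidence is easy to check numerically. A minimal numpy sketch (the data and names are mine, just for illustration): the Pearson correlation of two series is exactly the dot product of the two series once each is centered and scaled to unit length.

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=100)
    y = 0.5 * x + rng.normal(size=100)

    # Pearson correlation via the standard library routine
    r = np.corrcoef(x, y)[0, 1]

    # The same number as a plain scalar product: center each series,
    # scale to unit norm, then take the dot product (the cosine of the angle).
    def standardize(v):
        v = v - v.mean()
        return v / np.linalg.norm(v)

    r_dot = np.dot(standardize(x), standardize(y))

    print(r, r_dot)  # agree to floating-point precision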

I submit that not only is the subject not well understood, but even the name of the subject is wrong. It should be called "The study of orthogonality". This change of perspective would naturally lead to a discussion of orthogonal polynomials and orthogonal functions, create a bridge to representation theory, and (on the other end) connect to applications in data science. What say you? :-)


I think that "when you proceed to applications" is the issue there. Applications where? For applications in field theory, the spatial metaphor is exactly incorrect! For applications in various spectral theories, it's worse than useless.

What you say regarding the seemingly coincidental nature of "real world" applications is basically correct (with correlation specifically there's some other stuff going on, so it isn't that surprising, but in general, yes), but that's unavoidable for any aspect of pure mathematics. Math is the study of formal systems, and the real world wasn't cooked up on a blackboard. If we can demonstrate that some component of reality obeys laws which map onto our axioms, we can apply math to the world. But re-framing an entire field around one specific real-world use (not even, IMO, the most important real-world use!) is just silly.

I love the idea of encouraging students early on to look at different areas of math and see the connections. But linear algebra is connected in more ways to more things than just using an inner product to pull out a nice basis. Noticing that polynomials, measurable functions, etc. are vectors is possible without reframing the entire field, and there are lots of uses of linear algebra that don't require a norm! Hell, representation theory only does in some situations.


You start with a controversial statement ("Math is the study of formal systems"), and the rest follows. Not everyone agrees with this viewpoint. I think algebraic formalization provides just one perspective for looking at things, but there are other perspectives, and their interplay (superposition) constitutes the "knowledge". Focusing on just the algebraic perspective is a pedagogical mistake IMO. Some say it's all a kind of hangover from Bourbakism, though. (Treating math as a game of symbols is equivalent to an artificial restriction to using just 1% of your brain capacity, IMO.)


Hmm, I do see where you're coming from. To me, saying math is the study of formal systems is a statement of acceptance and neutrality: we can welcome ultrafinitists and non-standard analysts under one big tent. But you correctly point out that it's still a boundary I've drawn, and it happens to be drawn around stuff I enjoy. I'm by no means saying that there isn't room for practical, grounded math pedagogy with less emphasis on rigor.

However, there's plenty of value in the formal-systems stuff. Algebraic formalization is just one way of looking at the simplest forms of linear algebra, but there really isn't any other way of looking at abstract algebra. Or model theory, or the weirder spectral stuff. Or algebraic topology. And when linear algebra comes up in those contexts (which it does often; it's the most well-developed field of mathematics), it's best understood from an abstract, formal perspective.

And, just as a personal note, I personally would never have pursued mathematics if it were presented any other way. I'm not trying to use that as an argument- as we've discussed, the problem with math pedagogy certainly isn't a lack of abstract definitions and rigor. But there are people who think like me, and the reason the textbooks are written like that is because that's what was helpful to the authors when they were learning. It wasn't inflicted on our species from the outside.


> the reason the textbooks are written like that is because that's what was helpful to the authors when they were learning

The author writing a book after 30 years of learning, thinking, and talking with other people cannot easily reconstruct what was helpful and what wasn't. Creating a 1-dimensional representation of the state of mind (which constitutes "understanding") is a virtually impossible task. And here algebraic formalism comes to the rescue. "Definition" - "Theorem" - "Corollary" structure looks like a silver bullet, it fits very well in a linear format of a book. Unfortunately, this format is totally inadequate when it comes to passing knowledge. Very often, you can't understand A before you understand B, and you can't understand B before understanding A - the concepts in math are very often "entangled" (again, I'm talking about understanding, not formal consistency). You need examples, motivations, questions and answers - the whole arsenal of pedagogical tricks.

Some other form of presentation must be found to make it easier to encode the knowledge. Not sure what this form might be. Maybe some annotated book format will do, not sure. It should be a result of a collective effort IMO. Please think about it.

BTW, this is not a criticism of LADR book in particular. The proofs are concise and beautiful. But... the compression is very lossy in terms of representing knowledge.


> "Definition" - "Theorem" - "Corollary" structure looks like a silver bullet, it fits very well in a linear format of a book. Unfortunately, this format is totally inadequate when it comes to passing knowledge.

I really can't emphasize enough that this is exactly how I learn things. I don't claim to be in the majority! But saying that no one can learn from that sort of in-order, definition-first method is like saying no one can do productive work before 6am. It sucks that morning people control the world, but it's hardly a human universal to sleep in.

> Some other form of presentation must be found to make it easier to encode the knowledge. Not sure what this form might be. Maybe some annotated book format will do, not sure. It should be a result of a collective effort IMO.

I 100% agree. Have you seen the napkin project? I don't love the exposition on everything, but it builds up ideas pretty nicely, showing uses and motivation mixed in with the definitions. I've been trying to write some resources of my own intended for interested laymen, with more focus on motivation and examples and less on proofs and such. I like the challenge of trying to cut to the core of why we define things a certain way, though I'm biased towards "because it makes the formal logic nice" as an explanation.


> Have you seen the napkin project?

Will take a look, thanks! I think I understand your point. I want to thank you for the productive discussion!


What do you mean by correlation and orthogonality? Like with signal processing, you might calculate the cross-correlation of two signals, and it basically tells you, at each possible shift, to what extent one signal projects onto the other (so, what's their dot product). Orthogonality is not invariant under permuting/shifting entries in just one of the vectors, obviously (e.g. in your standard 2-d arrow space, x-hat is orthogonal to y-hat but not to x-hat).

Linear algebra studies linearity, not (just) orthogonality. Orthogonality requires an inner product, and there isn't a canonical one on a linear structure, nor is there any one on e.g. spaces over finite fields. Mathematics, like programming, has an interface segregation principle. By writing implementations to a more minimal interface, we can reuse them for e.g. modules or finite spaces. It also makes it clear that questions like "are these orthogonal" depend on "what's the product", which can be useful to make sense of e.g. Hermite polynomials, where you use a weighted inner product.
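To make the "which product?" point concrete, here's a small numerical sketch (the specific polynomials and numbers are my own example, using numpy): the physicists' Hermite polynomials H_2 and H_4 are orthogonal under the weighted inner product <f, g> = integral of f(x) g(x) exp(-x^2) dx, but not under a plain dot product of sampled values.

    import numpy as np
    from numpy.polynomial.hermite import Hermite, hermgauss

    H2 = Hermite([0, 0, 1])        # physicists' Hermite polynomial H_2
    H4 = Hermite([0, 0, 0, 0, 1])  # H_4

    # Gauss-Hermite quadrature: sum(w * f(x)) ~= integral of f(x) * exp(-x^2) dx,
    # exact here since the integrand is a low-degree polynomial.
    x, w = hermgauss(20)
    weighted = np.sum(w * H2(x) * H4(x))   # ~ 0: orthogonal in this inner product

    grid = np.linspace(-1.0, 1.0, 201)
    plain = np.dot(H2(grid), H4(grid))     # clearly nonzero: not orthogonal here

    print(weighted, plain)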


> Namely, the formula for the scalar product of 2 vectors is identical to the formula for the correlation between 2 series of data. There's no apparent connection between the "orthogonality" in one sense and "orthogonality" (as a lack of correlation) in another.

Of course there is. Covariance looks like an L2 inner product (what you're calling the scalar product) because it is an L2 inner product. They're the exact same object.


Why should it buy you something? That's the real question.

You don't need to understand it the way the "initial" author thought about it, even supposing that person had given it more thought...

History of maths is really interesting but it's not to be confused with math.

Concepts are not useful in the "economic opportunity" sense you're thinking of. Think of them as "did you notice that property?" and then you start doing math by playing with these concepts.

Otherwise you'll be tied to someone's way of thinking instead of hacking into it.


I know more math than the average bear, but I think the parent has a point even if I don’t totally agree with them.

Take, for instance, the dual space example. The definition of it, to someone who hasn’t been exposed to a lot of math, seems fine but not interesting without motivation — it looks like just another vector space that’s the same as the original vector space if we’re working in finite dimensions.

However, the distinction starts to get interesting when you provide useful examples of dual spaces. For example, if your vector space is interpreted as functions (even a novice can see that a vector can be interpreted as a function that maps an index to a value), then a dual vector is a measure — a weighting of the inputs of those functions. Even if they are just finite lists of numbers in this simple setting, it’s clear that they represent different objects, and you can use that when modeling. How those differences really manifest can be explored in a later course, but a few bits of motivation as to “why” can go a long way.
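As a toy illustration of that reading (my own sketch; the sensor setup is hypothetical), a vector is "index -> value" and a dual vector is a weighting of those indices, so applying the dual vector is just a weighted sum, e.g. an expectation under a probability weighting:

    import numpy as np

    # A "function": temperature at each of 4 sensors (a vector, index -> value)
    temps = np.array([19.0, 21.5, 22.0, 18.5])

    # A "measure": how much weight each sensor gets (a dual vector).
    # The weights sum to 1 here, so applying it computes an expectation.
    weights = np.array([0.4, 0.3, 0.2, 0.1])

    # Applying the dual vector to the vector is a plain weighted sum.
    avg_temp = weights @ temps
    print(avg_temp)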

Mathematicians don’t really care about that stuff — at least the pure mathematicians who write these books and teach these classes — because they are pure mathematicians. However, the folks taking these classes aren’t going to all grow up and be pure mathematicians, and even if they are, an interesting / useful property or abstraction is a lot more compelling than one that just happens to be there.


There can be several motivations.

Would it be more interesting to present these with the Gelfand triple as an instance?

Does it have “more” to say than the initial raw definition?

The concept can be used in different contexts, and that's what makes algebra algebra.

People have different motivations and usually that’s what brings new light into a field.


Your post represents a common viewpoint, but I don't agree with it. I'm a retired programmer trying to learn algebra for the purposes of education only. I am not supposed to take an exam or use the material in any material way, so to speak. I'd like to understand. Without understanding motivations and (on the opposite end) applications, I simply lose interest. I happen to have a degree in math, and I know for a fact that when you know (or can reconstruct) the intuition behind the theory, it makes a world of difference. If this kind of understanding is not a goal, then what is?

BTW, by "buying" I didn't mean that it should buy me a dinner, but at least it's supposed to tell me something conceptually important within the theory itself. Example: in the LADR book, the chapter on dual spaces has no consequences, and the author even encourages the reader to skip it :).


> Why should I care about different forms of matrix decomposition? What do they buy me?

A natural line of questioning to go down once you're acquainted with linear maps/matrices is "which functions are linear"/"what sorts of things are linear functions capable of doing?"

It's easy to show dot products are linear, and not too hard to show (in finite dimensions) that all linear functions that output a scalar are dot products. And these things form a vector space themselves, the "dual space" (because each element is a dot-product mirror of some vector from the original space). So linear functions from F^n -> F^1 are easy enough to understand.
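If you want to see that claim concretely, here's a quick numpy sketch (my own toy example): any linear functional on R^n equals the dot product with the vector of its values on the standard basis.

    import numpy as np

    n = 5
    rng = np.random.default_rng(1)

    # Some linear functional R^n -> R (here secretly built from a hidden vector,
    # but any linear f would do).
    hidden = rng.normal(size=n)
    f = lambda v: float(hidden @ v)

    # Recover the representing vector by evaluating f on the standard basis.
    representer = np.array([f(e) for e in np.eye(n)])

    v = rng.normal(size=n)
    print(f(v), representer @ v)  # identical: f "is" a dot product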

What about F^n -> F^m? There's rotations, scaling, projections, permutations of the basis, etc. What else is possible?

A structure/decomposition theorem tells you what is possible. For example, the Jordan Canonical Form tells you that with the right choice of basis (i.e. coordinates), matrices all look like a group of independent "blocks" of fairly simple upper triangular matrices that operate on their own subspaces. Polar decomposition says that just like complex numbers can be written in polar form re^it, where multiplication scales by r and rotates by t, so can linear maps be written as a higher-dimensional multiplication/scaling and orthogonal transformation/"rotation". The SVD says that given the correct choice of basis for the source and image, linear maps all look like multiplication on independent subspaces. The coordinate change for SVD is orthogonal, so another interpretation is that roughly speaking, SVD says all linear maps are a rotation, scaling, and another rotation. The singular vectors tell you how space rotates and the singular values tell you how it stretches.
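A small numpy sketch of that SVD reading (my own example): any matrix factors as an orthogonal map, a diagonal scaling, and another orthogonal map.

    import numpy as np

    rng = np.random.default_rng(2)
    A = rng.normal(size=(3, 3))

    U, s, Vt = np.linalg.svd(A)

    # U and Vt are orthogonal ("rotations", possibly with a reflection),
    # s holds the singular values (the axis-wise stretching).
    print(np.allclose(U @ U.T, np.eye(3)), np.allclose(Vt @ Vt.T, np.eye(3)))
    print(np.allclose(A, U @ np.diag(s) @ Vt))  # rotate, scale, rotate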

So the name of the game becomes to figure out how to pick good coordinates and track coordinate changes, and once you do this, linear maps become relatively easy to understand.

Dual spaces come up as a technical thing when solving PDEs for example. You look for "distributional" solutions, which are dual vectors (considering some vector space of functions). In that context people talk about "integrating a distribution with test functions", which is the same thing as saying distributions are dot products (integration defines a dot product) aka dual vectors. There are some technical difficulties here though because now space is infinite dimensional, and not all dual vectors are dot products, e.g. the Dirac delta distribution delta(f) = f(0) can't be written as a dot product <g,f> for any g, but it is a limit of dot products (e.g. with taller/thinner gaussians). One might ask whether all dual vectors are limits of dot products and whether all limits of dual vectors are dual vectors (as limits are important when solving differential equations). The dual space concept helps you phrase your questions.
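To make that last point tangible, here's a small numerical sketch (my own example): pairing ever-narrower normalized Gaussians with a test function under the integral inner product converges to the value of the function at 0, i.e. to the Dirac delta applied to it.

    import numpy as np

    f = np.cos  # a smooth test function, f(0) = 1

    x = np.linspace(-10, 10, 200001)
    dx = x[1] - x[0]
    for eps in [1.0, 0.1, 0.01]:
        # Normalized Gaussian of width eps (it integrates to 1).
        g = np.exp(-x**2 / (2 * eps**2)) / (eps * np.sqrt(2 * np.pi))
        # <g, f> as an integral, approximated by a Riemann sum.
        print(eps, np.sum(g * f(x)) * dx)  # -> f(0) = 1 as eps shrinks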

They also come up a lot in differential geometry. The fundamental theorem of calculus/Stokes theorem more-or-less says that differentiation is the adjoint/dual to the map that sends a space to its boundary. I don't know off the top of my head of more "elementary" examples. It's been like 10 years since I've thought about "real" engineering, but roughly speaking, dual vectors model measurements of linear systems, so one might be interested in studying the space of possible systems (which, as in the previous paragraph, might satisfy some linear differential equations). My understanding is that quantum physics uses a dual space as the state space and the second dual as the space of measurements, which again seems like a fairly technical point that you get into with infinite dimensions.

Note that there's another factoring theorem called the first isomorphism theorem that applies to a variety of structures (e.g. sets, vector spaces, groups, rings, modules) that says that structure-preserving functions can be factored into a quotient (a sort of projection) followed by an isomorphism followed by an injection. The quotient and injection are boring; they just collapse your kernel to zero without changing anything else, and embed your image into a larger space. So the interesting things to study to "understand" linear maps are isomorphisms, i.e. invertible (square) matrices. Another way to say this is that every rectangular matrix has a square matrix at its heart that's the real meat.
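For what it's worth, you can watch that factorization happen numerically (my own sketch, using the SVD to exhibit it; it's not the only way): a rank-r rectangular matrix splits into a surjection onto r coordinates, an invertible r x r core, and an injection into the target space.

    import numpy as np

    rng = np.random.default_rng(3)
    # A 5x4 matrix of rank 2: it collapses a 2-dimensional kernel.
    A = rng.normal(size=(5, 2)) @ rng.normal(size=(2, 4))

    U, s, Vt = np.linalg.svd(A)
    r = int(np.sum(s > 1e-10))   # numerical rank (here 2)

    quotient = Vt[:r, :]         # 2x4: kills the kernel (the "boring" projection)
    core = np.diag(s[:r])        # 2x2: the invertible square "heart"
    inject = U[:, :r]            # 5x2: embeds the image into R^5

    print(r, np.allclose(A, inject @ core @ quotient))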



