I have no specific comment to make on the (first two chapters of the) book recommended by this blog. However I have become wary of people recommending a textbook because it finally helped them understand something they had tried to wrap their head around for years.
I heard a conjecture once that the best textbook you'll find on any given topic is your third. The point, of course, is that it simply takes about three serious attempts to make it click - but it is a fallacy to give all the credit to the third book.
A reinforcing point to this is that I still have a lot of my textbooks from undergrad/grad school. And while I hadn't studied any math for 20 years or so, it was interesting when I went through some of these books -- how much easier it is to understand some of these concepts now. I don't know what it is that helped me understand these concepts, twenty years after graduating, better than I did when I was studying it everyday. But I would NOT attribute it to the textbook, given it is literally the same one.
I had the same experience. I think it's because school dumps on us solutions to problems we haven't had yet. It's difficult to correlate abstract concepts to tangible problems. But 20 years later those formulas are actually painting a picture of things we've experienced. It's like reading someone else putting perfectly into words thoughts you had. It clicks only if you had those thoughts; to other people it is devoid of meaning.
> I don't know what it is that helped me understand these concepts, twenty years after graduating, better than I did when I was studying it everyday. But I would NOT attribute it to the textbook, given it is literally the same one.
Probably additional skills and experience you picked up all these years by solving/understanding hard problems
And more time for it to be mulled over. It was frustrating, but in college I found that the 16 week semester was rarely enough to actually learn something, only enough time to commit it to memory. It was often a year or more later that I'd have a sudden eureka moment while studying a separate topic. Perhaps something that was applying the unlearned material (like Physics I and Calculus, which were a semester apart for me since I took Chem I my first semester in college) or something related but not exactly the same (like Concrete Mathematics by Knuth et al. leading me to a realization about some aspects of calculus).
I had the experience of trying to relearn some math just a few years after first learning it in college and it was similarly much easier to learn it all a second time. I don't think I had a whole bunch of new experience. I think rather there must be some kind of residual understanding, even if it can't all be articulated cold.
This is IMO what makes teaching well incredibly hard: you not only need to understand the topics at hand better than people who just apply them, you also need to remember the time before you understood what you are teaching.
This is something my mentor had to come to grips with when I started, and something I've been working through the last few years as my team grows.
It is very difficult to take that step back and divorce myself from years of experience and tough lessons, and to present the subject matter in a way that can be grasped without an innate understanding that took me years to reach.
Same here as well. What I've found is that I know the problem space much better now: I know how some of it applies in real life, and I can visualize it better than when I was just trying to find the right answer for grades. Maybe that's just me, but that's my reasoning anyway.
Very true. After spending a lot of time looking for the best books to learn algorithms and data structures, and buying more than 10 books, I realised what I lacked was not resources but the rigor and discipline to pursue one of the tons of great resources out there. I am not saying that there aren't bad books, but most likely the limiting factor in acquiring the skills isn't a lack of resources, but the rigor to sit and plod through one (or a couple) of the best resources that you have zeroed in on.
" I am not telling that there aren't bad books, but most likely the limiting factor to acquire the skills isn't lack of resources, but the rigor to sit and plod through one ( or couple) of the best resources that you have zero"
Ah, but there are books that just make you want to go to sleep by looking at them, and some that are able to spark passion (in me).
My point being: it is definitely about motivation and discipline, but a good didactic book helps with that.
And since we are all different (types of learners), there definitely isn't one book to rule them all. It's been a while since I studied from a book, but I could usually tell from skimming over a few pages whether a book could help me or not.
I’ve had a similar experience in my learning journey. This was really clear in high school: the physics taught by my poorly trained teachers, and the “recommended” books, were all targeted towards rote memorization, and I really disliked the subject because of it. When I looked for other books that explained these concepts in a more accessible way, it became a joy to learn the subject.
Yes, you have a point. For example, if we consider algorithms, the Grokking Algorithms book is definitely easier to digest but maybe doesn't go as deep.
So it gets you over a hump which might make the more advanced / detailed books more accessible.
It is free online. It starts with recursion and dynamic programming. I felt like dynamic programming really clicked for me after reading the first few chapters but this was probably the 3rd or 4th book I read on the subject.
This reminds me of Hannah Fry’s TED talk on the mathematics of love, specifically point #2 on how to pick the perfect partner. [1] It was basically about not committing to the very first person you meet, but also not searching for the perfect person in perpetuity. To paraphrase, you should reject the first two people who come along, and then commit to the first person who is better than everyone you previously dated. It’s not a perfect analogy, but I think the point of making a good-faith effort a couple of times and then really going for it once you understand a little bit makes sense.
Seconding this a little. The best way to learn math is to spend a long time with it and see it in as many ways as you can. You have to build an intuition for it so you can move past definitions and theorems and just "get it".
Although I also don't want to discourage anyone from trying this either. Anything that'll get people to learn is better than apathy. :)
While I agree this is definitely a thing, it would be a mistake to swing too far in the other direction and view any texts sharing a subject as roughly equivalent.
I've experienced the '3rd text' phenomenon, but I could also point to specific features affecting the wide variance in math text effectiveness.
And this book is a pretty good example of exactly that: it has specific, unique features that allow it to fulfill its promise of being an effective 'translation' guide for programmers to a bunch of otherwise typically implicit ideas about methods and foundational concepts in mathematics, ideas that can be serious stumbling blocks for the self-taught.
IMO a good strategy: take people's glowing praise of particular texts with a grain of salt, but if specific beneficial features can be pointed out that would be advantageous to you as a learner, know that it can sometimes mean striking gold (in terms of not wasting time).
This is how I learned C++11. I didn't fully understand move semantics and smart pointers until I read the relevant chapter from 3 or 4 books. Whenever an intern/co-op student joined I'd have them read the specific chapters from multiple books, plus a good article on smart pointers, before having them touch any code. All during working hours of course.
I'm learning Rust now and feel like I need to do the same thing. Read "the rust book" plus Rust by Example plus one or two Rust books on the O'Reilly website. It takes a lot longer to read multiple books in parallel but I find I remember better this way.
> I heard a conjecture once that the best textbook you'll find on any given topic is your third
Walter Rudin’s Principles of Mathematical Analysis (chapters 1 through 7). A mental torture on the first exposure, but like a fine wine when the palate is mature.
I'm curious who the target audience for this book is. The chapter "Our Goal" says that the book is meant to teach programmers how to engage with mathematics, but programmers have a wide range of mathematical maturity. The second theorem in the book is as follows:
For any integer n >= 0 and any list of n + 1 points (x[1], y[1]), ... , (x[n+1], y[n+1]) in R^2 with x[1] < x[2] < ... < x[n+1], there exists a unique polynomial p(x) of degree at most n such that p(x[i]) = y[i] for all i.
So, it seems the author assumes that a reader will have the math maturity of a good high-school senior, as most students wouldn't even think to worry about a property like existence. The book also covers the proof of the theorem with formal notation, and the proof is built up from previous theorems -- a pretty standard approach in math books, which nonetheless requires the math maturity of a good high-school senior. The table of contents also shows that the book will cover linear algebra, calculus, and group theory in a whirlwind. Again, such content demands close-to-college-level math maturity. I'm also being generous here, as public schools in the US do not really teach that much formal math.
So, here is the dilemma: people with this level of maturity should already be good at math or have access to other materials to help them with math. People who do not possess such maturity will not get through the book anyway, or have more beginner-friendly materials to read. I'm sure there are exceptions, but I question what percentage of readers those exceptions make up.
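(As an aside, the theorem itself is easy to poke at numerically. A rough sketch, assuming numpy is available: fit a degree-n polynomial through n + 1 points and check that it reproduces them.)

    import numpy as np

    # n + 1 = 4 points with strictly increasing x values
    xs = np.array([0.0, 1.0, 2.0, 3.0])
    ys = np.array([1.0, 2.0, 0.0, 5.0])

    # the unique polynomial of degree at most n = 3 through these points
    coeffs = np.polyfit(xs, ys, deg=len(xs) - 1)

    # p(x[i]) reproduces y[i], up to floating point error
    print(np.allclose(np.polyval(coeffs, xs), ys))  # True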
I have this book. I was pretty excited about it. I have tried 3 times to work through it so far, and haven’t made it through chapter two.
I did not connect with math when I was in High School. I liked geometry, but never took anything after that. I didn’t really learn anything new in college algebra.
I program at an accomplished level- I’m doing senior dev work in a complex ___domain, and have led teams of developers.
But I feel bad when trying to go through this book, like I have huge gaps in knowledge that the author assumed I'd have, which led me to wonder why that is.
I will probably try to struggle through this again, in hopes that eventually it clicks. If anyone knows of a book I could use as a prerequisite or intermediate step, I would appreciate that!
I have struggled with the same thing (and continue to do so). One thing I have found to be very helpful is Ivan Savov's "No bullshit guide to math & physics". It helps that it's written by a person who does a lot of tutoring, unlike the authors of many frightening ex-cathedra maths books. It basically builds its way up to calculus, in a way that's mostly easy to grok and in a way that's useful and interesting (which is where the physics bit comes in).
It certainly hasn't made me a mathematician by any stretch, but it's helped me fill in a lot of gaps left behind by awful maths teachers in school, and it's helped rekindle an interest that I'd long since forgotten.
That is a bit of an uncharitable reading. The subchapter literally preceding the example you cited is titled "Existence & Uniqueness" and explains why those concepts are important to mathematicians, and right after the stated theorem the author explains it in minute detail and gives an informal phrasing ("there is a unique degree n polynomial passing through a choice of n + 1 points"). The entire chapter is devoted to how to take these kinds of complex-looking statements and understand what concepts are behind them, and the author explains the concepts both in normal language and as one would normally see them in a maths textbook.
You are right that this book requires high school level math knowledge, as it's the equivalent of a first (and maybe second) year math course. Most programmers I have met do display that amount of knowledge however. What other starting point (or topic) would you suggest for teaching someone mathematics that can be related to programming?
The article has a good example of exactly the kind of thing I also found useful in the book:
> He also discusses the fact that the language of mathematics is looser than programming in a lot of ways. In code, things have to be expressed a very exact way or they just don’t compile. Variables and functions have to be fully and explicitly defined if you expect the computer to run them. But in math, there’s a lot of tacit agreement and assumptions that go on. Lots of shortcuts and conventions.
This kind of context around how math is done as a human activity in practice, especially in contrast to programming, is extremely helpful orientation for programmers trying to self-teach mathematics.
It would've saved me tons of time and trouble if I'd known the above while trying to work through math texts after graduating with a CS degree: instead I wasted a ton of time writing over-detailed proofs, always feeling as if I were doing something wrong if every tiny step weren't explicit (more closely matching my experience with programming).
For those who liked my book, or want a different angle, or if you're looking for inspiration as to why math is interesting and useful, I'm in the (slow) process of writing another book, called "Practical Math for Programmers." It's more of a broad sample of interesting, short programs that use math, with lots of references. Sort of like the "Programming Gems" books.
I just read the first chapter and wanted to say well done, this is great! This seems to exemplify the maxim that if you really understand something you can explain it clearly to a layman. I am that layman and I learnt a few things today! I look forward to taking a deep dive. :)
Hi, is there a Kindle edition? I can only find an option to purchase a PDF eBook on your site (“pay what you like”, which is great). This is not to say I mistrust your site, but I only enter credit card data into a very small number of sites.
Not that specifically. I switched from standard to PWYW after a year of sales and I felt I had made enough and wanted it to be open. I still get decent sales, but the majority of income was always from print books which are not PWYW
I personally appreciate this approach. I tend to do a first-pass read on ebooks, and if I find one of high value and the subject is not temporal (e.g. learning Flash development), I always purchase the physical book for my library. I don't mind paying twice if I have some confidence that the material will be of value, but it does cause me to pass on some texts that I might have found worth buying in physical form had I not passed on the ebook.
Hard to say. Maybe just paying attention to detail is enough. Or, maybe some understanding of mathematical notation (such as logic and sets, elementary algebra…). You might want to peruse some other of his primers [1], particularly the topology series (of which the one about homology is part).
Homotopy has its origin in topology (but eventually transcended it), so learning about it first in the more “tangible” setting where it came from might indeed be helpful.
I've never read this book in particular, but I had a similar experience with another book (Serge Lang's Basic Mathematics) that changed my life trajectory. If you find this stuff remotely interesting, please give it a go! Math is really wonderful!
Learning math in some prescribed way (book or sequence of books) is the mathematics equivalent of "what programming language should I learn first". The most important thing is simply doing anything at all! Don't get planning paralysis!
If you think you'll actually do or read anything, then give it a try. Definitely push yourself some, but if it becomes a slog don't be afraid to move on to something that looks more interesting.
You'll find your way back to anything that was actually important anyway. :)
I always recommend anybody wanting to learn a language should just start using it. Find a project and learn the language and libraries as you need them.
What's the equivalent for learning mathematics? A lot of mathematics seems only useful for learning more advanced mathematics.
Projects in pure math are basically just research. I would say just follow your curiosity and try to figure things out. You might not be scratching at novel work for a while, but it's still enjoyable.
I don't know exactly what you like to work on, but perhaps there's some related mathematical area you're curious to know more about?
I don't know what level you're at, but if you don't already know how to write proofs, then start with something on that. It's an extremely important foundation for everything else.
I learned in a class and we didn't use a book, so I can't recommend one. I don't think it should matter too much though.
Aside from that, what kind of things are you curious about?
I got a master's in engineering, so a lot of the foundations stuff is missing. Basically the math there is all very applied stuff, not a lot of elegance and overview.
No, tbh I think that's fantastic. Engineering and physics are a great way to get the right intuition. Having a firm grasp of the basics and having lots of possible examples in mind is very useful! Basically I would try to round out the basic pure math stuff for sure, but I think you're better equipped than most.
Another intro book I thought of was Peter Eccles's An Introduction to Mathematical Reasoning. Might be worth looking at.
If you want a nice, leisurely introduction to groups, Nathan Carter's Visual Group Theory is nice.
I got a lot of use out of the Princeton encyclopedias of math. Don't expect them to really teach you anything, but the articles are nice for seeing what's out there.
Definitely try to learn basic analysis and algebra. I don't think any book I know is amazing, but basically any will do the job.
Importantly, don't be afraid to try to learn something out of your depth. In fact I think it's important to try! If something really grabs you, try to read more and backfill what you don't know.
I really love this book, as it shows the value of mathematics in programming.
However, it's an intense read. I strongly recommend it, but if you don't have some college level math under your belt, it can be harder to understand than its title makes it seem.
"in math, there’s a lot of tacit agreement and assumptions that go on. Lots of shortcuts and conventions. So if you’re not steeped in that culture, it all looks like black magic to you." "The text will talk vaguely about an idea and then there will be a formula with all kinds of Greek letters, and no explanation of what those symbols mean. If you aren’t already familiar with them, you don’t stand a chance."
This issue has been bothering me for years. In a typical math formula you find on Wikipedia, there are many unnecessary symbols included, and really critical things are left vague.
I feel like mathematicians are at fault here. They should clean up their shit, maybe write these equations as code. When I translate one of these equations into code it's always much, much shorter, and it has the benefit of being 100% deterministic.
E.g. one example: f(x) (often drawn big and elaborate) just means y!!
Author of the original article here. I feel your pain, but again, it's just a different set of conventions. Mathematicians are used to the f(x) = ... kind of notation. Once you get used to it, it makes total sense. That particular one I got used to ages ago. f(x) is the same as a function in your code. It takes an argument, x, and returns some value.
Often specific symbols have implicit meanings: theta θ is pretty commonly used for some angle, and r is often used to mean a radius. So you'll often see something like "r sin θ" with no explanation. At first it's meaningless, but once you know the conventions, it's crystal clear. It's considered so basic that nobody would waste the space explaining it. Same as if you're reading something about code and it says "const float x = 0.1" or something. The author is probably not going to go into an explanation of what a const or a float is or what x means. You're expected to know.
So what I like about the book is that it helps someone without knowledge of all these conventions begin to understand them.
Thanks for taking the time to write about the book. As someone who had a hard time applying their school-taught knowledge about vectors and matrices when trying to understand OpenGL and Direct3D back in the early days ("why isn’t there a proper 'camera' object I can use") I really appreciate when people make an effort to offer alternative POVs to get deeper into topics they might not be familiar with.
Sometimes the right kind of intuition is all it needs to make it click. Sometimes it's that tiny bit of knowledge one is missing to get the whole picture and suddenly everything makes sense.
(Btw I think we might have met ages ago at a conference or two in Cologne)
I really clicked with this part of your article. For the first part of my undergrad maths modules I had been automatically trying to connect these symbols with some global definition and getting in a muddle. I still remember the combined feeling of outrage + sudden realisation when the penny dropped that they were just "making it up as they went along"! (sort of)
Perhaps the difference is that for programming you could internet search for "C# const" or "C# float" for example, and find the documentation or even easy to understand tutorials explaining what they mean.
Whereas for math it does not seem to be the same. There is no documentation, and no one ever seems to explain those basics online. E.g. this book is a pretty obscure PDF.
Basically, math needs a major refactoring and code rewrite. I've been saying this for years. It's getting to a ridiculous point in some areas. Also, imagine naming functions in code after people ... That's what mathematicians do with some major, often-used concepts, making them hard to grasp and remember (e.g. why is it called a "Laplace distribution" and not a "mirrored exponential"?)
For me the worst part is the nomenclature. Under the guise of paying homage to the inventor, the nomenclature has become borderline dysfunctional. Just use simple words instead of enigmatic mathematician names for defining core concepts in the field.
Let's say I want to draw some weird kind of curve. There's an equation for it on Wikipedia. The equation will be f(x) = 5x + 3x^2 blah blah blah...
In order to draw the curve I iterate through values of x for each pixel, plug those values into the right-hand side, and whatever the right-hand side of the equation is equal to is my y value for that point on the curve. So if they had just written that big fancy f(x) as y, it would have been much clearer and easier for me to understand initially.
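Something like this is all I mean (a rough Python sketch; the mapping from pixel columns to x values is made up):

    def f(x):
        return 5 * x + 3 * x ** 2

    width = 20
    for px in range(width):   # one x value per pixel column
        x = px / 10.0         # map the pixel column to an x value
        y = f(x)              # the right-hand side IS the y value
        print(px, y)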
For that use case the notation isn't strictly necessary, but things become trickier with more elaborate expressions, like when we want to abstract over what f looks like.
E.g. one might have f(x, y, z) = (...). If f is in some class of functions with certain properties (homogeneous, linear, smooth, etc.) we can operate on it abstractly.
We could even derive properties of surfaces f(x,y,z) = C.
Also I should point out that the variable names they use in these formulas are the worst possible: single letters, x, y, some random Greek letter. If a programmer wrote variable names like that they would be fired.
Sometimes things have no particular meaning, or the meaning is related to the convention. In the other example you used, of drawing the curve, then x and y are perfectly fine names thanks to the convention (now long established) of using x and y to represent two orthogonal axes (generally the "horizontal" and "vertical", whatever that may mean in context). Using a more elaborate name would add no value.
With respect to the use of Greek letters and such, I do lament that many writers of mathematics fail to define their terms, instead assuming that the reader is fully conversant in the ___domain, when often a single paragraph at the start would add a great deal of clarity to their work. However, that doesn't mean that the use of such variables is bad; they just need to be defined.
The benefit of the mathematical notation is that it permits conciseness and lends itself well to symbolic manipulation (that is, a large portion of what we do when we do algebra and calculus). The former is a tricky subject, conciseness at the cost of clarity can be a net negative. But the latter is crucial to a lot of work, the way that we write programs does not lend itself well to symbolic manipulation and would be counterproductive for mathematics.
In fact, I've often had to translate programs into a symbolic notation in order to try and decipher them because the long descriptive names, as useful as they are in isolation, ended up rendering the total procedure nearly impenetrable. Or at least unanalyzable. And the conversion to a symbolic notation permitted me to simplify the program substantially because I was able to apply ideas from algebra to the program (often boolean algebra, in particular, this is a very useful practice for condition heavy code with lots of predicates).
What would you use in place of x and y? Mathematicians study equations in the abstract. x is just a real number. It's not distance, or time, or velocity. Would you rather they say "real_number" everywhere? And then what would you do when the formula has more than one? "real_number1" vs "real_number2"?
As to your original criticism, I'll wager that more people in this world will understand f(x) = 5x + 3x^2 than will understand rudimentary code, as it is taught a lot more than programming is. I don't mean anything negative when I say this, but the only people I know who complain about math syntax are programmers. By changing these conventions, you are asking all mathematicians, physicists, chemists, most engineers, economists, etc to change. I will wager that over 95% of them will not prefer your style.
Finally, the trouble with replacing f(x) with Y is that the former tells me I'm dealing with a function. The latter does not. In fact, for many mathematicians, y = 5x + 3x^2 is a constraint, not a function.
I wonder how much of the differences between mathematical and programming notation stem from mathematics traditionally being done by hand on paper vs. programming being typed on a computer.
These are very childish and silly arguments. The letters have strong conventions in mathematics and definitely make sense when the functions are generic. It's like impulsively criticising Haskell or any other formal language for looking stupid and "worst possible" when you haven't put any effort whatsoever into learning it.
I bought a physical copy when it was released a couple of years ago. I've recommended it a few times; I think it's worth reading for anyone who hasn't spent much time writing actual mathematical software or hasn't had a formal CS education.
I’ve come to be somewhat known as a “math guy” in creative coding. It’s one of my impostor syndrome items because I’m really not any kind of expert in the field.
I feel similarly about statistics. I of course, given my line of work, have a solid foundation there. But my area of expertise is... more complicated to explain or define. When it comes to statistics expertise though, apart from a solid foundation I simply know enough to know what tools to use, and how to research and evaluate such tools. For example, in a recent project I knew that LSTM was an appropriate tool, but I don't know more than a high level abstraction of how it works and the ___domain of problems it might help solve.
To give a very basic example: sort of like knowing when to use the Pythagorean theorem, but not knowing enough to prove it from the axioms.
Hey, I’m literally going through this at the moment. Only at the end of the first chapter, but so far I can say it’s written in a very accessible, clear style.
I don't believe that an introduction to
math has to be long, and here I will try
to give an introduction, and overview,
just as blog posts. That is, this
introduction is much shorter than a book.
So I introduce the main topics.
Throughout, nearly always you can get a
lot more from a simple Google search or
Wikipedia article.
By far the most important topics are
calculus and linear algebra. My view is
that a 12 year old with good interest and
some okay or better teaching and some good
texts and access to Google and Wikipedia
can do well with both calculus and linear
algebra.
Both calculus and linear algebra have
enough so that a person can spend their
life in research, maybe in applications,
in advanced parts. This is especially the
case for calculus. E.g., calculus can
include calculus on manifolds, partial
differential equations, differential
geometry, and numerical and algorithmic
topics for all of these.
What is described here could be covered in
a good undergraduate math major. Such a
student would have all or nearly all of a
good math background for graduate work in
math, physics, or other STEM fields.
None of this math is nearly new. So, for
good texts, I recommend ones that have
been regarded as among the best for at
least 20 years. For some texts, 60 years
is not too old. E.g., my favorite text in
linear algebra was first published in
1942. So, look for old texts. Usually you
can buy used copies in good condition for
less than $10.
How to use the texts: (1) Get a stack of
blank paper, a sharp pencil, a big eraser,
a comfortable chair, a good light, and a
quiet room. (2) Read a section of the
text, try to understand all or nearly all
the section, when time is available try to
get a first cut intuitive understanding
good enough to explain to someone who
knows no math at all, and then do the
exercises. If your text proves theorems
and is short on good exercises, then prove
the theorems yourself as a good source of
exercises. (3) For the more important
topics, especially calculus and linear
algebra, get 3-4 well recommended texts,
pick one as your main text, and use the
others for alternative explanations and
more exercises. (4) Occasionally look
back at what you have learned and find a
good, intuitive view of what is going on.
Qualifications: I hold a Ph.D. in
pure/applied math from a world class
research university. While I've published
peer-reviewed original research in
pure/applied math, mathematical
statistics, and artificial intelligence,
my interests in math are mostly for
applications; I've applied math for US
national security, in business, and in
computer science research. I've been
programming for decades, and now I'm doing
a startup, a novel Web site, that has some
old pure and new applied math at its core
(that users won't see) and have written
and run the software for the Web site.
The software is almost entirely in
Microsoft's .NET and consists of 24,000
statements, in 100,000 lines of typing.
Currently I'm collecting initial data
before going live on the Internet.
Below are 17 numbered sections. Calculus
is in section 10, and linear algebra,
section 11. For sections 1-9, plane
geometry in section 8 deserves careful
study, e.g., as in a good high school
course, but for most students for the
other sections of 1-9 just the material
here may be enough.
(1) Sets
A set is a collection, aggregation, box
of, etc. of elements -- we're supposed
to already know what the elements are.
In math, sets are useful in much the same
way as little covered plastic containers
are in cooking, say to store some left
over peas, cookies, pizza sauce, etc.
So, a set is a way to identify, keep track
of what we are talking about.
Given a set A and some element x, x is
either in the set A or it is not -- there
are no other alternatives. If element x
is in set A, then it is in set A exactly
once, e.g., never 2 or 3 times.
The elements in a set are not in any
particular order. E.g.,
{1, 3, -4} = {3, -4, 1}
Given two sets A and B, an element x might
be in both A and B, might be in many sets.
We also permit a special set, the empty
set that has no elements. The empty set
is curiously useful, plays a role roughly
like 0 does for numbers.
We can define sets with notation, e.g.,
A = {x| x > 7}
We read this as "A is the set of all
elements x such that x > 7."
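For programmers, roughly the same idea
in code -- a Python sketch (my own
illustration), restricted to a finite
universe of elements since {x| x > 7}
itself is infinite:

    universe = range(-10, 20)
    # reads like "the set of x such that x > 7"
    A = {x for x in universe if x > 7}

    print(A)        # {8, 9, ..., 19}; order does not matter
    print(9 in A)   # True -- an element is either in or not
    print(5 in A)   # False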
(2) Foundations
Near 1900 a lot of math was known. A
nagging question was, what are the
foundations that let us be confident
that what we are talking about makes
sense?
B. Russell noticed we could write
A = {x| x is not an element of x}
Then A is an element of A if and only if
it is not an element of A. So, we have a
contradiction, angst, tummy aches, heart
burn, head aches, etc. This is the
Russell Paradox.
The problem was solved by first deciding
what the elements were and then making
sets out of just those elements. Net, we
rule out asking whether set A is an element of
set A. Of course this work was all just
conceptual.
Also there was work to set up some
axioms (properties) to be assumed for
sets; this was axiomatic set theory.
Then there was an effort, successful, to
use axiomatic set theory to define
everything else in math. So, roughly, if
the axioms are solid, so is the rest of
math.
Some deep, difficult, and profound
questions were found and some answers were
found for some of these.
In the end, except for people specializing
in foundations, the approach taken for set
theory is Zermelo–Fraenkel set theory
usually assuming also the axiom of
choice. I know that that work is solid
way down in the basement, foundations,
of math, but not many people go down there
very often. I've been there for a good
tour and have no intention of going back.
Roughly, axiomatic set theory is to add
credibility to the math you already knew
in the 8th grade.
It may be that this material gets taught
to give a deeper understanding to people
who intend to teach in K-12. E.g., Newton
invented calculus, and I doubt that he
knew about the Russell Paradox or the
Axiom of Choice.
(3) Numbers
The most important concept in math is
numbers, and the most important numbers
are the real numbers. To understand
these numbers it is nearly always enough
to visualize them as just the points on a
line. That is essentially the same as
using a yard stick. No biggie.
We can talk about the set of real
numbers; maybe we let R denote the set of
real numbers.
We can also consider the whole or
natural numbers -- 1, 2, 3, .... We
might agree that N is the set of natural
numbers.
We might define the set N of natural
numbers by: 1 is in N and, for each n in
N, n+1 is also in N. We will use this in
proofs by mathematical induction.
By including 0 and the negatives
of the whole numbers we get the integers
... -3, -2, -1, 0, 1, 2, ...
Maybe we say that Z is the set of
integers.
The rational numbers are all the p/q
where p and q are integers and q is not
zero. We can say that Q is the set of
rational numbers.
If we let i be, or act something like,
the square root of -1, then the complex
numbers are all the x + iy for reals x
and y. We can let C denote the set of
all the complex numbers.
Of course, no real number is a square
root of -1.
But we can regard the complex numbers as a
clever bookkeeping trick that at times is
curiously useful in defining the sine and
cosine in trigonometry, electrical circuit
theory, analysis of wave motion, quantum
mechanics, etc.
This use of R, N, Z, Q, and C is common,
but there is no rule that says we could
not use other notation.
The most important properties of these
numbers you learned by the 9th grade and
learned most of those in grade school. No
biggies.
There is one more property
that is crucial in
calculus:
The set of real numbers R is
complete.
There are several equivalent
definitions of complete.
Here is maybe the simplest
definition:
Given a subset S of R and some x
in R, if for all a in S
we have that a <= x, then
x is an upper bound of set S.
The completeness property is that
each non-empty subset S of R with an upper
bound has a least upper bound.
E.g., let
S = {x| x in R and x < 2}
Then 3 is an upper bound of S and 2 is the
least upper bound.
Perhaps a better way to
explain completeness is to
say, intuitively,
that
whenever we move
in steps to be as close
as we please to some point
on the line, there is a
real number there we are
getting close to.
A big point is that
the rational numbers
are not complete.
E.g., let
S = {a | a rational and a < square root of 2}
Then the square root of 2 would
be the least upper bound but is
not rational.
So, the rational numbers are not complete.
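A small Python sketch of that example
(my own illustration): bisection with
rational endpoints gets as close as we
please to the least upper bound, even
though that bound is not rational:

    from fractions import Fraction

    lo, hi = Fraction(1), Fraction(2)   # 1^2 < 2 < 2^2
    for _ in range(20):
        mid = (lo + hi) / 2
        if mid * mid < 2:
            lo = mid                    # still in S
        else:
            hi = mid                    # an upper bound of S
    print(float(lo), float(hi))         # both near sqrt(2) ~ 1.41421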
The concept of completeness,
especially viewing it as
converging to something,
is a major concept in
advanced
work in math analysis.
There is something of a
joke that "Calculus is
the elementary consequences
of the completeness property of the
real number system."
During the days near 1900 struggling with
axiomatic set theory, there was also
struggling with the concept of infinity
or infinite.
We say that two sets A and B have the same
cardinality, essentially are the same
size, if they can be put into 1-1
correspondence.
Okay, set B is a subset of set A
provided each element of B is an element
of A. B is a proper subset of A if A
contains some elements not in B, that is,
if B is not all of A.
Then a set is infinite if it can be put
into 1-1 correspondence with a proper
subset of itself. A common example is to
let A be the set of natural numbers and B
be the set of even (can be divided by 2
with no remainder -- 4th grade math)
natural numbers. Then B is a proper
subset of A, and, for each natural number n
in A, let it correspond to the natural
number 2n in B. Presto, bingo, we have put
A and B into 1-1 correspondence. So, set
A is infinite.
A set that is not infinite is finite.
Similarly sets of integers, rationals,
reals, and complex numbers are all
infinite.
Not very difficult to see: the set of
natural numbers has the same cardinality
as the set of integers, that is, the
natural numbers and the integers can be
put into 1-1 correspondence.
Any set that can be put into 1-1
correspondence with the natural numbers,
has the same cardinality as the natural
numbers, is said to be countably
infinite.
Curiously, the rationals are countably
infinite. To show this, use the clever
Cantor diagonal process (lots of hits at
Google).
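A quick Python sketch of one such
enumeration (my own illustration): walk
the grid of fractions p/q along the
diagonals p + q = 2, 3, 4, ..., skipping
repeats:

    from fractions import Fraction

    seen, listing = set(), []
    for s in range(2, 8):          # diagonal p + q = s
        for p in range(1, s):
            q = s - p
            r = Fraction(p, q)
            if r not in seen:      # skip repeats like 2/2 = 1/1
                seen.add(r)
                listing.append(r)
    print([str(r) for r in listing])
    # ['1', '1/2', '2', '1/3', '3', '1/4', '2/3', '3/2', '4', ...]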
Is the set of reals countably infinite?
No, and there is a short,
clever argument that
shows this. So, the cardinality of the
set of reals is larger (after we handle
some details) than the set of natural
numbers.
Is there a set D with cardinality greater
than the natural numbers but smaller than
the cardinality of the set of real
numbers? Difficult question. That there
cannot be such a set D is the continuum
hypothesis (CH). As can see at
Wikipedia, this question was settled by
work of Kurt Gödel in 1940 and Paul Cohen
in 1963: The result is that the CH is
independent of Zermelo–Fraenkel set
theory with the axiom of choice. By
independent we mean that we can assume
that the CH is true or false and cannot
get a contradiction. This work has been a
bit amazing, even philosophical.
Understanding infinity is important, and
we have done a good job at that. Unless
we are down in the basement of
foundations, we at most rarely consider
the continuum hypothesis.
(4) Mathematical Induction
There is a theorem
that for any natural number n,
we have
1^3 + 2^3 + ... + n^3 =
n^2(n + 1)^2 / 4
We can prove this by mathematical induction.
We want a proof that works for
any natural number n, and for this
we use the definition (above) of the
natural numbers.
So, first we check if the equation
is true for n = 1:
1^3 = 1
and when n = 1
n^2(n + 1)^2 / 4 = 1 (2)^2 / 4 = 1
So, the equation holds for n = 1.
Suppose the equation holds for some
n and check the equation for
n + 1:
1^3 + 2^3 + ... + n^3 + (n + 1)^3 =
n^2(n + 1)^2 / 4 + (n + 1)^3 =
n^2(n + 1)^2 / 4 + 4(n + 1)(n + 1)^2 / 4 =
(n^2 + 4(n + 1))(n + 1)^2 / 4 =
(n + 1)^2 (n^2 + 4(n + 1)) / 4 =
But
n^2 + 4(n + 1) =
n^2 + 2n + 1 + 2(n + 1) + 1 =
((n + 1) + 1)^2
So
(n + 1)^2 (n^2 + 4(n + 1)) / 4 =
(n + 1)^2((n + 1) + 1)^2 / 4
and the equation also holds
for n + 1.
Then by the definition of the
natural numbers, the equation
holds for all natural numbers.
Done.
So, that's an example
of proof by mathematical
induction.
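Not a proof, but a quick sanity check of
the formula in Python:

    for n in range(1, 20):
        lhs = sum(k ** 3 for k in range(1, n + 1))
        rhs = n ** 2 * (n + 1) ** 2 // 4
        assert lhs == rhs
    print("formula holds for n = 1, ..., 19")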
(5) Fundamental Theorem of Arithmetic
A prime number is a natural number
greater than 1, like 2, 3, 5, 7, 11, 13,
17, 19, 23, ..., that is, one evenly
divisible by only itself and 1.
Prime numbers have been studied since the
ancient Greeks, and there are many
difficult questions easy to state; now we
have answers to some of the questions.
Consider 45:
45 = 3 * 3 * 5
So, we have written 45 as a product of
prime numbers. It is easy enough to see
that any natural number greater than 1
can be written as a product of prime
numbers.
The Fundamental Theorem of Arithmetic is
that each natural number greater than 1
can be written as a product of prime
numbers in just one way, that is, in just
one way once we neglect the order of the
prime numbers in the product.
So,
45 = 3 * 3 * 5
is the only way to write 45 as a product
of prime numbers.
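A little Python sketch of finding such a
factorization by trial division:

    def prime_factors(n):
        factors, d = [], 2
        while d * d <= n:
            while n % d == 0:
                factors.append(d)
                n //= d
            d += 1
        if n > 1:
            factors.append(n)
        return factors

    print(prime_factors(45))   # [3, 3, 5]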
(6) High School Algebra
A quadratic is an algebraic expression
of the form
ax^2 + bx + c
A polynomial is the same except can have
higher natural number powers of x, e.g.,
7x^5 - 4x^3 + 2x - 1
Polynomials play an important role in
linear algebra -- e.g., via determinants
and eigenvalues.
Polynomials are also used in error
correcting codes and related work.
There have been attempts to use
polynomials in curve fitting, but for
large natural numbers n the x^n term means
that as we go on the real line a long way
from 0, commonly the value of the
polynomial goes a very long way from 0, so
far from 0 that it is usually unrealistic
for applications. That is, polynomials
don't provide good, versatile means of
curve fitting.
Also interesting and useful is the
binomial, for natural number n and
numbers x and y
(x + y)^n
Presumably the name binomial comes from
having the two terms, x and y.
Curiously, surprisingly, this binomial is
important in various approximations and in
combinations, permutations, probability,
and statistics.
(7) Second Year High School Algebra
A root of a polynomial is a value of x
that makes the value of the polynomial 0.
The fundamental theorem of algebra is that
each polynomial of degree n >= 1 can be
written as
a(x - r_1)(x - r_2) ... (x - r_n)
where a is the leading coefficient and
r_1, r_2, ..., r_n are the roots of the
polynomial. The theorem holds for
polynomials with either real or complex
coefficients. The roots, even with only
real coefficients, can be complex.
I've omitted some details due to the
challenge of typing math.
Mathematicians worked for centuries to get
good proofs of this result.
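In code the roots are easy to get
numerically; a sketch, assuming numpy is
available:

    import numpy as np

    # coefficients of 7x^5 - 4x^3 + 2x - 1,
    # highest power first
    p = [7, 0, -4, 0, 2, -1]
    roots = np.roots(p)          # 5 roots, some complex
    print(roots)
    print(np.polyval(p, roots))  # all approximately 0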
Also might be covered are the algorithms
for least common multiple and greatest
common divisor. These were known to the
ancient Greeks and get used currently.
We can define a function f by, for any
number x,
f(x) = ax^2 + bx + c
So, function f is a look-up table -- plug
in a value for x and get back the value of
ax^2 + bx + c
Since the x is a number, we say that the
___domain of the function f is the set of
numbers, usually the set of real numbers.
In this case, the range of function f is
also the set of real numbers.
We can have look-up tables for telephone
numbers -- the ___domain a set of names of
people and the range a set of telephone
numbers. We can have a function for
Internet addresses, ___domain URLs and range,
say, the 64 bit internet addresses.
So, functions have enormous generality --
given any two non-empty sets A and B, we
can have a function with ___domain A and
range B and to indicate that f is a
function with ___domain A and range B we
write
f: A --> B
(8) Plane Geometry
High school plane geometry contains some
nice, useful math and an excellent first
lesson in mathematical proofs.
(9) Abstract Algebra
In grade school, we learned that
2 + 3 = 5
Abstract algebra calls the +, addition, an
operation. Multiplication is also an
operation.
We learned that for real numbers a and b
a + b = b + a
So, addition is a commutative operation.
It is also associative: For real
numbers a, b, c we have
a + (b + c) = (a + b) + c
Multiplication is also commutative and
associative.
For addition, 0 is the identity element
since for any number a we have
a + 0 = 0 + a = a
And -a is the inverse of a since
a + -a = 0.
Etc. with the rest of the usual properties
of the operations of addition and
multiplication we knew in grade school.
Well, abstract algebra studies operations
on sets more general than the numbers we
have considered so far. So, there are
groups, rings, fields, etc. The
rationals, reals, and complex numbers are
all examples of fields.
E.g., a group is a nonempty set with
just one operation. The operation is
associative and there is an identity
element and inverses.
Some groups are commutative, and some are
not. Some groups are on a set that is
infinite and some are on a set that is
finite.
Given a common definition of vectors, with
vector addition the set of all the vectors
forms a commutative group.
Similarly for the rings, fields, etc. --
that is, we have sets with operations and
properties.
There are some important applications,
e.g., in error correcting coding.
(10) Calculus
Calculus was invented mostly by Newton and
mostly for understanding the motions of
the planets etc.
The first part of calculus has to do with
derivatives. So, suppose we have an
object that for each time t is at distance
t^2. Then the speed of the object at time
t is 2t and the acceleration is 2. The
acceleration is directly proportional to
the force on the object.
We got 2 and 2t from t^2 by taking
calculus derivatives.
We write
d/dt t^2 = 2t
d/dt 2t = 2
So at time t, consider an increment in time
h. Then at time t + h, the object is at
position (t + h)^2. So in time h, the
object has moved from t^2 to (t + h)^2,
that is, has moved distance
(t + h)^2 - t^2
= t^2 + 2th + h^2 - t^2
= 2th + h^2
Since speed is distance divided by time,
the object has moved at speed
(2th + h^2) / h
= 2t + h
Then as h is made small and moves to 0,
the speed moves to just
2t
So for very small h, the speed is
approximately just 2t, and as h moves to 0
the speed is exactly 2t.
So, 2t is the instantaneous speed at
time t.
So, 2t is the calculus derivative of
t^2.
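A quick numerical illustration in
Python: shrink h and watch the
difference quotient approach 2t:

    t = 3.0
    for h in [1.0, 0.1, 0.01, 0.001]:
        speed = ((t + h) ** 2 - t ** 2) / h
        print(h, speed)   # 7.0, 6.1, 6.01, 6.001 -> 2t = 6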
To reverse differentiation we can do the
second half of calculus, do integration,
start with the 2t and get back to t^2.
Calculus is the most important math in
physics.
Later chapters in a calculus book show how
to find various areas, arc lengths, and
surface areas.
Calculus with vectors is crucial for heat
flow, fluid flow, electricity and
magnetism, and much more.
Einstein's E = mc^2 can be derived from
just some thought experiments and the
first parts of calculus. There is an
amazing video at
(11) Linear Algebra
This topic starts with systems of linear
equations in several variables, e.g., for
two equations in three variables:
2x + 5y - 2z = 21
3x - 2y + 2z = 12
The system has none, one, or infinitely
many solutions. To see this, we can use
Gauss elimination. To justify that,
multiplying one of the equations by a
non-zero number does not change the set of
solutions. Neither does adding a copy of
one of the equations to another. So, with
these two tools, we just rewrite the
equations so that the set of solutions is
obvious.
To save space here,
I will leave the rest of the
details to the Internet.
A solution can be regarded as a vector,
e.g.,
(x, y, z)
A function f is linear provided for x
and y in the ___domain of f and numbers a and
b we have
f(ax + by) = af(x) + bf(y)
It can be claimed that the two pillars of
the analysis part of math are continuity
and linearity.
Linearity is enormously important,
essentially because it is such a powerful
simplification.
The linear equations of linear algebra are
linear in this sense.
Differentiation and integration in
calculus are linear.
Linearity is a key assumption in physics
and engineering, especially signals and
signal processing.
In linear algebra we take the two
equations in three unknowns and define
matrices so that we can write the
equations as

   [ 2  5 -2 ] [x]   [21]
   [ 3 -2  2 ] [y] = [12]
               [z]
So, left to right, we have three matrices.
The first one has 2 rows and three columns
and, thus, is a 2 x 3 (read, 2 by 3)
matrix. The second one is 3 x 1. The
third one is 2 x 1.
We define matrix multiplication so that
multiplying the first and second matrix
gives the same results as in the linear
equations:
2x + 5y - 2z = 21
3x - 2y + 2z = 12
In principle we could do all of linear
algebra with just such linear equations
and without matrices, but the effort would
be very clumsy. Matrix notation,
multiplication, etc. are terrific
simplifications.
The 3 x 1 and 2 x 1 matrices are examples
of vectors.
Matrix multiplication is linear.
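A small sketch of that matrix form,
assuming numpy, checking one particular
solution of the two equations above:

    import numpy as np

    A = np.array([[2, 5, -2],
                  [3, -2, 2]])    # 2 x 3
    v = np.array([3, 6, 7.5])     # one particular solution
    b = np.array([21, 12])

    print(A @ v)                  # [21. 12.] -- same as the equations
    print(np.allclose(A @ v, b))  # True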
The known properties of matrices and their
multiplication and addition and of
associated vectors are amazing, profound,
and enormously powerful.
There is a joke, surprisingly close to
true, that all math, once applied, becomes
linear algebra.
(12) Differential Equations
In the subject differential equations,
we are given some equations satisfied by
some function and its derivatives and
asked to find the function.
The classic application was for the motion
of a planet: Newton's laws of motion
specified the derivatives of the position
of the planet as a function of time, and
the goal was to find the function, that
is, predict where the planet would be.
Science and engineering have many
applications of differential equations.
Some of the models for the spread of a
disease are differential equations.
We have seen that
d/dt t^2 = 2t
Well, we may have
function
f(s, t) = st^2
We can still write
d/dt f(s, t) = d/dt st^2 = 2st
but usually we say that we are taking the
partial derivative of f(s, t) with
respect to t.
(13) Mathematical Analysis
This topic is really part of calculus and
has the fine details on the assumptions
that guarantee that calculus works. The
main important assumption is continuity.
The next most important assumption is that
the function has ___domain a closed
interval or a generalization to what is
called a compact set; a closed
interval is, for real numbers a and b,
{x| a <= x <= b}
Also important are infinite sequences and
series that make approximations as close
as we please to something we want. So, in
this way we define sine, cosine, etc.
(14) Vector Calculus and Fourier Analysis
This topic is also part of calculus but
concentrates on surfaces, volumes, and
maybe the math of flows of heat, water,
etc.
Commonly this part of calculus makes heavy
use of vectors and matrices.
Also covered can be Fourier series and
integrals.
A pure tone in music, say, from an organ,
has a fundamental frequency, say, 440
cycles per second and also overtones at
natural number multiples of that
fundamental frequency. Each of the
overtone frequencies works like an axis
of orthogonal coordinates and is a
terrific way to analyze or synthesize such
a tone.
The Fourier transform is closely related
and has an orthogonal coordinate axis for
each real number.
Engineering is awash in linear systems
that do not change their properties over
time -- time invariant linear systems.
Such systems merely adjust some Fourier
coefficients. This is a powerful result.
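A small sketch of that idea with numpy's
FFT (my own example): one second of a
440 cycle per second tone plus one
overtone, and the transform picks out
exactly those two frequencies:

    import numpy as np

    rate = 8000                   # samples per second
    t = np.arange(rate) / rate    # one second of time
    tone = (np.sin(2 * np.pi * 440 * t)
            + 0.5 * np.sin(2 * np.pi * 880 * t))

    spectrum = np.abs(np.fft.rfft(tone))
    freqs = np.fft.rfftfreq(len(tone), d=1 / rate)
    print(freqs[spectrum > 100])  # [440. 880.]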
(15) Measure Theory
This topic develops integral calculus with
a lot more generality.
(16) Probability Theory
In probability we imagine that we do
some experimental trials and observe the
results. We have a sample space, our
set of trials.
We regard the set of all trials like a
region with area, and we say that the area
is 1.
Maybe we flip a coin and get Heads. The
set of all trials that yield Heads is one
event Heads.
The event Heads has area, probability
P(Heads)
With a fair coin we have
P(Heads) = 1/2
A random variable X is some outcome of a
trial. Usually X is a number. In this
case we can write
P(X >= 2)
for the probability that random variable X
is >= 2.
Events A and B are independent provided
P(A and B) = P(A)P(B)
In probability we can have algebraic
expressions with random variables, have a
distance between random variables, have
a sequence of random variables converge
to a random variable, etc.
Given a real valued random variable X, for
real number x we can let
F_X(x) = P(X <= x)
Then F_X is the cumulative distribution
of X, and
f_X(x) = d/dx F_X(x)
is the probability density of X.
One of the major results is the central
limit theorem that shows a distribution
converging to the Gaussian bell curve.
Two more results are the weak and strong
laws of large numbers that show
convergence to average values (means,
expectations).
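A tiny Python sketch of the weak law at
work: the fraction of Heads in repeated
fair coin flips settles down near 1/2:

    import random

    for n in [10, 1000, 100000]:
        flips = [random.random() < 0.5 for _ in range(n)]
        print(n, sum(flips) / n)  # tends toward P(Heads) = 1/2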
One more result, amazing, is the
martingale convergence theorem.
The Radon-Nikodym theorem of measure
theory provides an amazing general theory
of conditional expectation which is one
way to look at the value of information.
In statistics we are given the values of
some random variables and try to infer
the values of something related.
The more serious work in probability and
statistics is done on the foundation of
measure theory.
(17) Optimization
We have a furniture factory and a new
shipment of wood. We have some freedom in
what we do with this wood. An
optimization problem is how to use the
freedom to make the greatest revenue from
the wood.
More generally we have some work to do
with some freedom in how we do the work,
and an optimization problem is how best to
exploit the freedom to have the best
outcome, profit,
for the work.
If the profit is linear and if the work
and freedom are described by linear
equations, then we have the topic of
linear programming.
Linear programming has played a role in
the Nobel prizes in economics.
The main means of solving linear
programming problems is the simplex
algorithm which is a not very large
modification of Gauss elimination.
In practice the simplex algorithm is
shockingly fast; in theory its worst case
performance is exponential.
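A toy furniture problem in code (a
sketch; assumes scipy is available, and
the numbers are made up): chairs bring
40 dollars each, tables 90, with limited
wood and labor:

    from scipy.optimize import linprog

    # maximize 40*chairs + 90*tables
    # -> minimize the negative
    c = [-40, -90]
    A_ub = [[3, 10],   # board feet per chair, table
            [2, 5]]    # labor hours per chair, table
    b_ub = [300, 200]

    result = linprog(c, A_ub=A_ub, b_ub=b_ub,
                     bounds=[(0, None), (0, None)])
    print(result.x, -result.fun)  # optimal plan and revenue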
In the furniture making, we can't sell a
fraction, say, 1/3rd, of a dining room
chair. So, we want our solution to
consist of integers. The simplex
algorithm does not promise us integers.
So we have a problem in integer linear
programming.
In practice from the simplex algorithm we
may get to make 601 + 1/3rd dining room
chairs and round that to 601 or just 600.
No biggie.
If we insist on integers, then,
surprisingly, even shockingly, we have one
of the early examples of an NP-complete
problem.
In practice there are still ways to
proceed; often we can get the optimal
integer solutions we want easily;
sometimes such solutions are challenging;
sometimes we have to settle for
approximations; often the approximations
can be quite good, like forgetting about
1/3rd of a chair; in theory the ways have
worst case performance that is exponential
and nearly useless.
One integer linear programming problem I
attacked has 600,000 variables and 40,000
constraints. I wrote some software that
in 600 seconds on a slow computer found an
integer solution within 0.025% of
optimality. The solution made heavy use
of the simplex algorithm, non-linear
duality theory, and something called
Lagrangian relaxation.
In addition to optimization, the simplex
algorithm and related math make nice
contributions to geometry, e.g., prove
theorems of the alternative used in
proving the famous Kuhn-Tucker conditions
necessary for optimality.
Since it is possible that the work we are
trying to optimize is complicated, its
mathematical description may also be
complicated and the optimization effort,
difficult.
Optimization problems are common in animal
feeding, cracking crude oil,
transportation problems, and much more.
Game theory can be regarded as a special
case. One proof of the famous Nash result
(featured in the movie about Nash, A
Beautiful Mind) is via linear
programming.
Some problems are to find curves, say, how
to climb, cruise, and descend an airplane
to minimize fuel burned or the best curve
for a missile to attack an enemy fighter
jet. Newton found the brachistochrone
curve solution to the problem of the
frictionless curve that would let a bead
slide down in minimum time.
Relevant fields are calculus of variations
and deterministic optimal control. One
famous result is the Pontryagin maximum
principle.
Some problems have uncertainty; so the
optimization may be to find the solution
with the best expected outcome. For
such problems there is the field of
stochastic optimal control.
A lot of professional programmers have no significant math background or have it but haven't exercised it so it's as good as absent. I'll even include many CS graduates here, whose college level math experience often ends with Calculus 2 (in the US) and linear algebra, perhaps a discrete math course. But then without any application to most of their other courses this information is quickly forgotten. I work predominantly with EEs and CS majors (my employer does not hire non-degreed persons for programming work, which does eliminate some really good candidates) and outside of the one teaching orbital mechanics, most would be hard pressed to solve even a basic linear algebra problem anymore. I've even had to re-teach boolean algebra to the EEs who seem to have forgotten even Karnaugh maps and how to use them.
And then there are all the non-technical majors who become programmers, like the many philosophy graduates I've worked with. This isn't to say they can't learn the math, but they often have even less exposure than the typical business major in the US.
And globally there are many people who come to professional programming without any degree at all beyond a high school diploma. And given the variance in high school curricula globally there's no way to say what level of math this group possesses, but they almost certainly lack college level academic math exposure, the majority at least.
> I've even had to re-teach boolean algebra to the EEs who seem to have forgotten even Karnaugh maps and how to use them.
Heh heh. Electrical engineer here. My first and last use of Karnaugh maps on the job came 14-15 years after taking my digital logic course. The irony was that it was for a routine programming problem: the customer had given me an ugly flowchart and I felt I had to reduce that monstrosity to something much simpler. I did, but then my fellow coworkers would always wonder if my result was identical to the flowchart. I would tell them to make a truth table and confirm it. Finally one coworker did that and left a comment pointing out he had already verified it. In one sense, it made the code less "readable", but no one wanted the job of doing a direct translation of the flowchart.
I did have to Google to remind myself how to do them, but it took only a few minutes to figure out.
Beyond the answers people have provided, I'd like to add that there is a social element to it as well. Since many/most programmers have forgotten math like linear algebra and calculus, there does tend to be opposition to using math at work. Will other people be able to understand your work if you solve it with applied math? Will your coworkers be willing to admit they don't understand your work? And so on.
Of course, this doesn't apply to domains where math is necessary (graphics, etc).
I remember in one engineering job I had, two engineers were tasked with the problem: Given an ellipse, assume a horizontal/vertical line "cuts" off the ellipse. What is the area of the remaining piece?
This is a standard Calc II problem, and I'm sure everyone there had taken it - we required an MS on that team, and some people had PhDs. The solution takes a bit of work, but in the end you'll have an analytical formula with the exact answer. In code you'd just put one line with the formula.
Is that what they did? No. They wrote a program to approximate the area. I really doubt they took care of any floating point subtleties, but I knew that I wouldn't be popular if I probed their solution.
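For what it's worth, here is roughly what the one-line closed form looks like for a vertical cut (my own derivation of the standard Calc II integral, so treat it as a sketch), next to a brute-force approximation for comparison:

    import math

    def ellipse_area_right_of(c, a, b):
        """Area of x^2/a^2 + y^2/b^2 <= 1 with x >= c, for -a <= c <= a."""
        u = c / a
        return a * b * (math.acos(u) - u * math.sqrt(1 - u * u))

    # crude numerical check with a midpoint sum
    a, b, c = 3.0, 2.0, 1.0
    n = 1_000_000
    dx = (a - c) / n
    approx = sum(2 * b * math.sqrt(1 - (c + (i + 0.5) * dx) ** 2 / a ** 2) * dx
                 for i in range(n))
    print(ellipse_area_right_of(c, a, b), approx)   # agree to several decimals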
It of course would depend on what roles and geographies you include, but from personal anecdata, I'd say that about 50% of developers I've worked with have taken 'some' university-level maths. And amongst these, there's of course significant variance in backgrounds. Again from anecdata, the best at applying maths to programming are physics majors, who seem to often recognize that a software system exhibits some dynamics, and that they could find (or build) some relatively simple model that would explain and predict that behavior.
You may be viewing the field from your own bubble. There are a LOT of programmers who do not have formal CS degrees. I hire them regularly. There are a lot of boot camps and intensive non-degree programming schools out there.
There are also a lot of people who got a degree in some other field and later moved to a programming career (I see a lot of Philosophy majors get into programming, interestingly.)