To everyone arguing against special-casing matrix multiplication: please base your arguments on the PEP's "Motivation" section, to avoid rehashing the obvious too much. It even has a subsection "But isn't matrix multiplication a pretty niche requirement?"
I would argue that almost every single linear algebra routine can be interpreted in some form as a matrix-matrix or matrix-vector multiplication. Considering that matrix-matrix and matrix-vector products are fundamental numerical operations (they are part of BLAS, the basic API functions called by every single linear algebra routine I know of), having a concise notation is key.
And since formulas can get complicated quickly, having a closer 1-1 correspondence with the mathematics is critical for understanding the meaning of the code. The PEP contains a nice example from statistics. And trust me, when you are writing numerical code, being able to read the mathematical formula clearly is essential, especially when you have a lot of formulas and need to figure out why your code is giving you numerical nonsense.
A symmetric group is then a subgroup of the general linear group in any field, where the general linear group GL(K, N) is the set of invertible NxN matrices with entries from a "field" (just think "numbers") and therefore any [finitely generated] group is isomorphic to a group of matrices under matrix multiplication; this is the underlying concept of representation theory.
If we back up a little, a group is a very general sort of algebraic structure; many important concepts have an underlying group structure, such as rotations, permutations, and any sort of reversible computation. This latter case implies that matrix multiplication is Turing complete; the simplest such set of matrices is generated by the Toffoli gate matrix. The relationship of groups to geometry is due to the underlying correspondence between the axioms of a group and those of geometrical transformations. A set of reversible geometric transformations includes an identity element -- do nothing -- and obeys associativity (sorta complicated, but it makes sense if you think about it) and inverse operations (by assumption): this makes it a group, and it can be represented by matrices. If we remove "reversible", we get exceptions -- like the cross product -- but these are usually related to groups (cross product -> quaternion algebra).
So matrix multiplication is actually really, really fundamental in a lot of mathematics. It's also a special case of tensor contraction, which could justify another tower post (but won't).
>And trust me, when you are writing numerical code, being able to read the mathematical formula clearly is essential, especially when you have a lot of formulas and need to figure out why your code is giving you numerical nonsense.
(spent three months chasing a bug where two commands were out of order)
> A symmetric group is then a subgroup of the general linear group in any field, where the general linear group GL(K, N) is the set of invertible NxN matrices with entries from a "field" (just think "numbers") and therefore any [finitely generated] group is isomorphic to a group of matrices under matrix multiplication; this is the underlying concept of representation theory.
As a representation theorist, I am very sympathetic to this point of view, but I'm not sure that it proves that "[almost] every single linear algebra routine can be interpreted as a matrix-matrix multiplication"—unless one first has some reduction from an arbitrary linear-algebra routine to a group action.
> A symmetric group is then a subgroup of the general linear group in any field, where the general linear group GL(K, N) is the set of invertible NxN matrices with entries from a "field" (just think "numbers") and therefore any [finitely generated] group is isomorphic to a group of matrices under matrix multiplication; this is the underlying concept of representation theory.
Also, as a very minor nitpick, I think that you want 'finite' instead of 'finitely generated'; even infinite groups embed in (infinite) symmetric groups, but it's not obvious to me that infinite symmetric groups embed in (finite-dimensional) matrix groups, and it's certainly not true (just by counting cardinality) that infinite but finitely generated groups embed in finite symmetric groups.
What you say about almost all linear algebra routines being multiplication sounds crazy to me. How are the following supposed to be multiplication: (1) computing the transpose, (2) finding a basis for the kernel of a matrix, (3) factorizations, such as QR, LU or SVD, (4) computing the Jordan normal form of a matrix, etc.?
It's not as crazy as it sounds, though I wouldn't make such a blanket statement. Things can be phrased in terms of matrix multiplication (maybe not a single matrix multiplication), it's just not the most efficient way to go about it.
1. Transpose is a linear map on matrices (vectors in R^nm), so in a very concrete sense it is precisely a matrix multiplication (see the sketch below). And it's not hard to figure out the matrix, because it's analogous to the matrix that swaps entries in vectors.
2. Finding the first eigenvalue can be approximated via matrix multiplication, and for sparse matrices with good spectral gap this extends to all of them.
3. Row reduction can be phrased as matrix multiplication, and hence finding the basis for the kernel of a matrix is a byproduct of it, as is computing eigenvectors when given eigenvalues.
4. Computing orthonormal basis is a sequence of projection operators (and all linear maps are matrix multiplications)
5. I'm pretty sure nobody computes the Jordan canonical form on computers.
The point is that the connection between matrices and linear maps is the spirit of linear algebra.
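Items 1 and 2 can be made concrete in a few lines of numpy. This is just an illustrative sketch (the sizes and iteration count are arbitrary), written with the proposed @ operator where today you'd call np.dot:

import numpy as np

# Item 1: transposing an n x m matrix is literally one matrix multiplication
# acting on the "vectorized" matrix.
n, m = 3, 4
A = np.random.rand(n, m)

# P is the nm x nm permutation matrix with P @ vec(A) == vec(A.T), where
# vec() flattens row by row (C order, i.e. ndarray.ravel()).
P = np.zeros((n * m, n * m))
for i in range(n):
    for j in range(m):
        P[j * n + i, i * m + j] = 1.0
assert np.allclose(P @ A.ravel(), A.T.ravel())

# Item 2: power iteration -- the dominant eigenvalue falls out of nothing but
# repeated matrix-vector products plus an elementwise normalization.
S = np.random.rand(n, n)
S = S + S.T                      # symmetric with positive entries
v = np.random.rand(n)
for _ in range(1000):
    v = S @ v
    v /= np.linalg.norm(v)
lam = v @ S @ v                  # Rayleigh quotient
assert np.isclose(lam, np.max(np.linalg.eigvalsh(S)))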
It is as crazy as it sounds because we were talking about implementing an actual language/library, not about doing the linear-algebra equivalent of restating Turing's thesis.
> we were talking about implementing an actual language/library
No, we're talking about language-level syntactic standardization of the most fundamental computational concept in linear algebra (and I would argue all of mathematics). My list just gives evidence to how unifying the concept is.
If you want supercomputing-grade efficiency you won't stop using numpy/scipy just because of this update, or you've already rewritten your Python prototype in C/Fortran anyway.
> If you want supercomputing-grade efficiency you won't stop using numpy/scipy just because of this update
Especially since the update is just creating a language feature for numpy and similar libraries to leverage -- there is still no matrix implementation (and thus no matrix multiplication) in the standard library, just an overloadable operator so that numerical libraries can consistently use it for matrix multiplication and use * for elementwise multiplication, providing a standard API for those libraries.
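To make that concrete, here is a minimal sketch of the hook the PEP adds: @ simply dispatches to a __matmul__ method, so any library can opt in while keeping * elementwise. The toy Matrix class is made up for illustration, not something from the standard library.

class Matrix:
    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def __matmul__(self, other):
        # naive triple loop, just to show the protocol
        n, k, m = len(self.rows), len(other.rows), len(other.rows[0])
        return Matrix([[sum(self.rows[i][x] * other.rows[x][j] for x in range(k))
                        for j in range(m)] for i in range(n)])

    def __mul__(self, other):
        # elementwise, as the PEP recommends for *
        return Matrix([[a * b for a, b in zip(r1, r2)]
                       for r1, r2 in zip(self.rows, other.rows)])

A = Matrix([[1, 2], [3, 4]])
B = Matrix([[5, 6], [7, 8]])
print((A @ B).rows)   # [[19, 22], [43, 50]] -- matrix product
print((A * B).rows)   # [[5, 12], [21, 32]]  -- elementwise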
Lots of factorizations such as the NMF can be implemented as a hill climb where you just multiply things against other things (and normalize elementwise)
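For instance, here is a rough sketch of the classic multiplicative-update rule for NMF (Lee & Seung), assuming we are factoring V into W @ H; the shapes, rank and iteration count below are arbitrary:

import numpy as np

rng = np.random.default_rng(0)
V = rng.random((20, 30))          # nonnegative data
k = 5
W = rng.random((20, k))
H = rng.random((k, 30))
eps = 1e-9                        # avoid division by zero

for _ in range(200):
    # the whole "hill climb" is matrix products plus elementwise scaling
    H *= (W.T @ V) / (W.T @ W @ H + eps)
    W *= (V @ H.T) / (W @ H @ H.T + eps)

print(np.linalg.norm(V - W @ H))  # reconstruction error shrinks over iterations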
1. Numpy people overload * for element-wise multiplication or matrix multiplication inconsistently, confusing people.
2. The prefix function-call convention for matmul makes the formulas difficult to read.
3. The precedent of splitting / into / and // can't be applied to * because ** is already taken.
4. ` is banned, ? and ! suggest unrelated meanings, $ is Perl and PHP baggage, so @ is the only thing left?
I got lost on the choice of @. If ? and ! suggest unrelated meanings, and $ is a Perlism, why doesn't @ suggest some kind of concatenation operator, or Bash-ish/Perl-ish array sigils?
I'd personally much prefer >< . It looks like x, so it's much clearer. I don't understand the PEP's reason for not using ><.
Yes, `backticks` are banned since Python 3 because 1) they are hard to type on many common keyboard layouts and 2) they are hard to read, especially in Python books.
I don't understand the hate for backticks either. I used to dislike them too, when I was just a hobby programmer, and used a Slovenian keyboard layout (even for programming). Needless to say, after switching to English layout, my programming skills quadrupled - ` isn't the only symbol barely accessible (without finger acrobatics) on international keyboards (other examples include: {}][|~\^ ).
I use backticks a lot when writing Markdown and Ruby code and it's easy to type (on a Mac keyboard at least) and is as easy to print as any other character.
Because * has been element-wise multiplication in Python Numeric/NumPy since the Python 1.0 days. Introducing such a massive and fundamental backwards-incompatible syntax change just as Python 3 is starting to settle down and slowly gain acceptance is probably not a good idea.
The change being made to Python exists to enable the fix preferred by much of the community for the API fragmentation problem among Python numeric libraries, given the need for both convenient matrix multiplication and convenient elementwise multiplication.
Since it's a fairly dominant application area that is pretty key for Python (there's a reason there are so many bundled Python distributions that include the common scientific/numeric libraries, and that those environments are often chosen as pedagogical tools even for general-purpose programming), making a fairly modest language-level change to enable a clean resolution to this fragmentation is a sensible thing to do.
Speaking of the obvious, I'm somewhat surprised that the PEP didn't discuss dropping the "dot" function and using the existing function call operator directly. It seems like a natural idea to me. After all, a matrix represents a linear transformation, and a linear transformation is ... a function.
If you add some whitespace and squint just right, it even kinda sorta looks like math. If you further wrap all matrices and vectors in parentheses, you can pretend it's a whole new operator that lets you do matrix multiplication by juxtaposition.
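Something like the following hypothetical sketch, where the LinMap wrapper name is made up for illustration:

import numpy as np

class LinMap:
    """A matrix viewed as a linear map: calling it applies or composes it."""
    def __init__(self, a):
        self.a = np.asarray(a, dtype=float)

    def __call__(self, other):
        if isinstance(other, LinMap):
            return LinMap(self.a @ other.a)   # composition of maps
        return self.a @ np.asarray(other)     # application to a vector

A = LinMap([[0, -1], [1, 0]])   # rotate 90 degrees
B = LinMap([[2, 0], [0, 2]])    # scale by 2
v = [1.0, 0.0]

print(A(B)(v))                  # [0. 2.] -- reads like "A B v"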
This is surprisingly detailed design rationale which weighs many alternatives and gives careful consideration to possible tradeoffs. Great job by Nathaniel Smith and the numerical Python community.
> The nice thing is that it makes a nice symbol `@` available to objects that aren't matrices, to do with as they want ...
The (ab)use seems endless :)
Dozens of operators in Python have been overloadable and open for abuse for decades now, and yet I don't see much abuse, unlike the C++ and Scala communities. This says a lot about the Python community's ability to avoid pitfalls over the years.
@ is a binary operator with the same precedence and left-right associativity as * , /, and //. It's hard to think of places where @ could be abused where those other three couldn't equally well be a source of abuse. There hasn't been much abuse of those other three, so there likely won't be much abuse of @.
I think the only real (ab)use of @ is to echo its use in another language. For example, @ means something in XPath, so X@Y could be used as a short-hand for X.attrib[Y] in an ElementTree-like API:
tree = load_xml_tree(...)
for node in tree.select("//item[@price > 2*@discount]"):
    print(node @ "price", node @ "discount")
The only other hypothetical I can think of right now is evaluation shorthand. Something like an expression group represented by a class which may be subject to arithmetic using standard operators. @ could be used as shorthand for a .evaluate_at() method but it's probably not that useful anyway.
Maybe passing objects with 'argument @ resource' would be interesting...
@ is already used by decorators. I hope decorators will be reimplemented as prefix operators on functions with this change, so that @<foo> becomes available as well.
With @ as a grammatical infix matrix mult operator, there doesn't seem to be any real need beyond exclusivity to redefine the decorator grammar. By the same argument, we shouldn't use * for unpacking.
But the use as an operator and the use as a decorator is easily distinguished by where they appear in the grammar. Neither the parser nor human readers (as far as I can tell) have any difficulty determining which use is which. So let's not confuse the two: they are two different uses that happen to utilize the same symbol.
My point is that if @<func> is re-implemented as, for example, the function.__dec__, then it can be overridden and some cool DSL uses can pop up. As it stands, the use of @ as a (class-)decorator is built in to the grammar, rather than simply being a prefix operator.
How might that implementation be used in code? Would coders be directly accessing and overriding the __dec__ attribute? Since decorators are definition time anyway, I don't really see much potential that decorated function factories can't already provide.
"x = @y" will call y.__dec__. "@foo" behind a function will call foo.__dec__ with the function as an argument, meaning decoration behaviour can then be altered or be even completely different for other classes.
As I said, there is DSL potential. As it stands, it's just something that is not possible.
That seems too esoteric to ever go into the language unfortunately. I love the idea of some sort of __in__ targeting values in mappings but have never been able to think of an elegant mechanism that uses the existing grammar sanely or introduces useful new grammar.
I've always thought that maps and a matching __maps__ would have been a nicer way of referring to the keys of a mapping, with in/__in__ referring to the values. Although the semantics are a bit odd in that case: technically the mapping maps a key, but the language semantics would have it the other way around, as in the key is mapped by the mapping.
foo = {'bar': 42}
assert 42 in foo
but which makes more sense below?
assert 42 maps foo
assert foo maps 42
or even
assert 42 mapped by foo
The benefits of including by in the grammar could be of consequence here too.
That's why none of the alternative solutions are attractive. They don't really improve on the existing semantics.
What bugs me is the implicitness of __in__ defaulting to keys for a mapping. We have .keys() and .values() which are nice and clear since they explicitly grab an object which has an unambiguous definition for __in__.
Double Edit: It's an implementation decision that had to be made and I'm not aware of the rationale or debate behind the original choice. It's directly attributable to the decision to make __iter__ return the keys for a mapping which is the root of my issue. It's presumably useful and convenient (from a language writer's perspective) for iteration of a mapping to be along the keys in whatever order they may be traversed but if that's all it comes down to, why should a mapping be iterable at all when there is obvious ambiguity in what may be iterated? Maybe there is some history I'm not aware of where the .keys(), .values() and .items() methods were introduced post-hoc and the previous behaviour was such due to their non-existence and the need to iterate and then index to get all values in the object.
I think I'm combining issues in a problematic way here and I don't even know what was going on when I typed __in__ (twice! :/) instead of __contains__ - I'll just blame the lack of coffee. I realise why the membership 'in' should operate on the hash table in memory for O(1) performance and it makes perfect sense. I've just never been fully comfortable with the idea that keys are considered the 'members' of a 'mapping' in and of themselves. Maybe because I usually consider one-way mapping as a special case.
The more reasonable (I think) problem I have is one of language semantics which is introduced by in being applicable to a mapping in the first place - why should a mapping be iterable at all if it is ambiguous (as I believe) as to what it should return? Members should be key:val pairs but membership checks refer only to keys and it's probably not unreasonable for a user to want any of .keys(), .values() or .items() when iterating. That's obviously why they're made available so why should __iter__ special case one of them?
I assume this is to match semantics introduced by the membership check and probably historical because without a .values() method, key iteration and lookups would be the usual way to get all values out of a mapping. This just seems like the kind of thing that could have been changed in Python3 (although 2to3 would probably not have been able to handle the syntax changes automatically).
> It's presumably useful and convenient (from a language writer's perspective) for iteration of a mapping to be along the keys in whatever order they may be traversed but if that's all it comes down to, why should a mapping be iterable at all when there is obvious ambiguity in what may be iterated?
It's iterable at all because early Python didn't have generators, so iterating over dict.keys() would be inefficient for large dictionaries (since you would create an intermediate list just to iterate over).
> Maybe there is some history I'm not aware of where the .keys(), .values() and .items() methods were introduced post-hoc and the previous behaviour was such due to their non-existence and the need to iterate and then index to get all values in the object.
Prior to Python 3.x, .keys(), .values(), and .items() returned lists (rather than generators providing a view on the underlying dict), which would generally be inefficient if the only purpose was to iterate over them once.
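For reference, the current Python 3 behaviour being discussed, in a few lines:

d = {'a': 1, 'b': 2}
print(list(iter(d)))      # ['a', 'b'] -- iterating a dict yields its keys
print('a' in d)           # True  -- membership also checks keys
print(1 in d)             # False -- values are not "members"
print(1 in d.values())    # True  -- be explicit to test values
# In Python 2, d.keys()/d.values()/d.items() built full lists, which is why
# "for k in d" (no intermediate list) was the efficient idiom.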
Well, nothing. Actually, I do not support Python having operators other than well-known, standard operators, like + - * / % and bitwise operators. My comment was just a quick and dirty idea which came to my mind after reading the parent comment of it.
That checks whether a key exists. Your parent wants to check whether a value exists. Not that I'm any huge fan of it, considering how it would be inefficient by default.
the `in` keyword uses `__contains__` under the hood, if I remember correctly, so you could always make a `dict` subclass that provides this functionality.
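Right -- for example, a minimal (and probably unwise) sketch of such a subclass:

class ValueDict(dict):
    def __contains__(self, item):
        # membership tests values instead of keys (O(n), unlike a normal dict)
        return item in self.values()

d = ValueDict(bar=42)
print(42 in d)       # True
print('bar' in d)    # False -- keys no longer count as members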
I'd love to see it used for partial application - f@a instead of functools.partial(f, a). If wishes were better functional programming support in Python...
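A hypothetical sketch of how that could look once @ exists; the curry wrapper below is made up for illustration, it is not part of functools:

import functools

class curry:
    def __init__(self, fn):
        self.fn = fn

    def __call__(self, *args, **kwargs):
        return self.fn(*args, **kwargs)

    def __matmul__(self, arg):
        # f @ a == partial application of a single argument
        return curry(functools.partial(self.fn, arg))

@curry
def add3(a, b, c):
    return a + b + c

print((add3 @ 1 @ 2)(3))   # 6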
After a few years in Scala, where all operators are methods and vice versa (2 + 3 is just sugar for 2.+(3)), the method-operator distinction just seems weird.
> in Python, 2 + 3 is just sugar for (2).__add__(3).
Sure, but it doesn't work the other way; there's no nicer way to write 2.florble(3). So you have a few operator methods, and then other methods are second-class citizens. Which in turn encourages people to abuse the operators as shortcuts for commonly-used operations, à la C++'s <<
> Which in turn encourages people to abuse the operators...
In practice, very few languages have suffered problems from operator overloading. I can't think of another language where this is a problem (other than C++), which makes me think it's a cultural/code style problem, rather than a problem with the language itself. C++ seems to encourage abuse because everyone learns its IO library, and its IO has an API design so horrible it should make you retch.
By comparison, I'm not sure that adding more operators makes the resulting code any more palatable. Look at Haskell: libraries define their own operators, and the operators in Haskell have a steep learning curve. Can you tell me, without looking at a chart, whether $ or >>= has higher precedence, and what kind of associativity they have?
> Can you tell me, without looking at a chart, whether $ or >>= has higher precedence, and what kind of associativity they have?
No, which is why I don't think that should be allowed. Scala doesn't do that - all "custom operators" have the same precedence, and the global rule is that anything ending in : associates to the right (and is defined by the thing on its right), otherwise to the left. So I could answer those questions easily for a Scala library.
The difference is that in Scala, expressions of the form `foo bar baz` are in general desugared to foo.bar(baz), and symbols are allowed in method names, so every arity-1 method defines an infix operator, and you don't need language-level changes to add new infix operators.
I don't think it's really that different -- either Scala does some magic to desugar the operators into method calls taking into account operator precedence, or you're going to get results that don't match expectations. If the latter, then like Rebol, operators don't match (mathematical) expectation. If the former, then there is actually a distinction between methods and operators in Scala, so Scala is just like Python.
It means that "overloading" an operator in Scala is not a special case - downthread someone was saying this addition to python was great not because they were going to use @ to multiply matrices but because now they could overload @ to do things with their own types. In Scala you wouldn't have to wait for a new operator to be added to the language - "defining a custom operator" is exactly the same as "overloading an existing operator". If you want to make "a + b" work, you define a "+" method on a. If you want to make "a @ b" work, you define an "@" method on a. If you want to make "a ə b" do something, you define a "ə" method.
You're right that scala does special-case the precedence of a small number of operators (the list is much shorter than the likes of C, but I still wish we could move away from it).
Io uses Operator Shuffling to move those operator method calls back into normal mathematical expectation (or to be exact... C precedence order) prior to building the AST. Ruby's parser must do (or does) something similar. From lmm's answer it sounds like Scala does something similar to Ruby here.
Smalltalk, Self & Rebol have no operator precedence and simply interpret the operators from left to right (using parenthesis to enforce higher precedence).
Haskell also has first-class operators. Operators are infix, surrounding them with parentheses makes them prefix: (+) 1 2. Functions are prefix, surrounding them with backticks makes them infix: 1 `plus` 2. You can define your own, and define precedence and left/right associativity.
Since this is just a special case of a multiply-and-add indexing loop, maybe they should just introduce some form of tensor notation, so that
A[i,j]*B[j,k] is the matrix product of A and B? That would extend to so many more use cases than just a 2d matrix product.
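numpy already exposes that kind of index notation through einsum, so a rough comparison (array sizes arbitrary):

import numpy as np

A = np.arange(6).reshape(2, 3)
B = np.arange(12).reshape(3, 4)

C1 = np.einsum('ij,jk->ik', A, B)   # "A[i,j]*B[j,k]" summed over j
C2 = A @ B                          # the proposed operator (np.dot today)
assert np.array_equal(C1, C2)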
> Matrix multiplication is more of a special case. It's only defined on 2d arrays (also known as "matrices")
but this is not true. Matrix multiplication is just a special case of contraction of indices in a tensor (https://en.wikipedia.org/wiki/Tensor_contraction)—probably the most frequently used case, but not the only one. I'm certainly not arguing for the inclusion of general tensor-manipulating operators in Python, but it does seem to suggest a sensible alternative to:
> For inputs with more than 2 dimensions, we treat the last two dimensions as being the dimensions of the matrices to multiply, and 'broadcast' across the other dimensions.
namely, just contract on the inner indices. That is, arr(n1, ..., nk, m) @ arr(m, p1, ... pl) = arr(n1, ..., nk, p1, ..., pl).
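To see the difference concretely, here is a small sketch using tensordot for the "contract on the inner indices" behaviour (shapes chosen arbitrarily):

import numpy as np

a = np.random.rand(2, 3, 4)    # arr(n1, n2, m) with m = 4
b = np.random.rand(4, 5, 6)    # arr(m, p1, p2)

# Contract on the inner indices: result has shape (2, 3, 5, 6).
inner = np.tensordot(a, b, axes=1)
print(inner.shape)             # (2, 3, 5, 6)

# The PEP's @ instead broadcasts over leading dimensions and multiplies the
# trailing 2-d "matrices"; for these shapes a @ b raises a ValueError, since
# it pairs the last axis of a (length 4) with the second-to-last axis of b
# (length 5) and tries to broadcast the leading 2 against the leading 4.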
> namely, just contract on the inner indices. That is, arr(n1, ..., nk, m) @ arr(m, p1, ... pl) = arr(n1, ..., nk, p1, ..., pl).
No. This is what a mathematician might assume the PEP proposes without actually reading it. It instead proposes an entirely non-obvious definition which is not equivalent to what you wrote.
I know it is not equivalent—that is why I proposed it as a sensible alternative to what the PEP proposes (which I quoted). The punctuation may have made it unclear, but what I was trying to say was:
> … a sensible alternative to [PEP proposal]; namely, just contract on the inner indices.
and not
> … a sensible alternative to contracting on the inner indices.
My argument for why it's sensible is precisely what you mentioned, namely, that it is what a mathematician would expect.
My issue is not with the matrix-multiplication operator—I'm a mathematician before I'm a programmer, and so am all for it—but with the vector-to-matrix promotions: why not just consistently promote vectors to columns (or rows, if BDFL swings that way)? This would achieve almost the same effect as promoting to rows or columns as needed, and would avoid the non-associativity problem that the PEP mentions.
This PEP seems to imply that the cost would be a profusion of superfluous `newaxis`s, but I can't see that: it seems to me that you would need only to remember which kind of promotion works by default, and sprinkle in a `.T` whenever you need the other kind. (Anyone who's uncomfortable with lots of `.T`s in matrix-crunching code is not, I submit, someone who writes or reads lots of matrix-crunching code.)
I'm perfectly fine with defining get and get_in à la Clojure and using them (they work on __getitem__-supporting things, so dicts, lists, strings, most random user-level containers). I go a bit further in my codebases and implement:
def get(obj, k, default=None):
""" safe __getitem__ obj[k] with a default """
def assoc(obj, k, v):
""" obj[k] = v returning obj """
def dissoc(obj, k):
""" safe del obj[k] returning obj without k"""
def get_in(obj, keys, default=None):
""" __getitem__ obj[k0][k1][kn] with a default. """
def assoc_in(obj, keys, v, default=lambda n: dict()):
""" __setitem__ obj[k0][k1][kn] = v. __getitem__ failures handled with default """
def dissoc_in(obj, keys):
""" Return obj asserting that obj[k0][k1][kn] does not exist. """
def update_in(obj, keys, update_fn=None, default=lambda n:dict()):
""" Update the value at obj[k0][k1][kn] with update_fn returning obj. __getitem__ failures handled by default. """
def merge_with(fn, **dictionaries):
""" Merge resolving node conflicts with fn """
def deep_merge_with(fn, **dictionaries):
""" Recursively merge resolving node conflicts with fn"""
as a matter of course in most of python webapp and data munging projects. They are insanely useful when working with the gobs of JSON that is common when interacting with modern web services. I really should throw the implementations into a public library at this point.
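For the curious, a minimal sketch of how two of these might look (my guess at an implementation, not necessarily how a real library would do it):

_MISSING = object()

def get(obj, k, default=None):
    """Safe obj[k] with a default."""
    try:
        return obj[k]
    except (KeyError, IndexError, TypeError):
        return default

def get_in(obj, keys, default=None):
    """Safe obj[k0][k1]...[kn] with a default."""
    for k in keys:
        obj = get(obj, k, _MISSING)
        if obj is _MISSING:
            return default
    return obj

payload = {"user": {"address": {"city": "Springfield"}}}
print(get_in(payload, ["user", "address", "city"]))       # Springfield
print(get_in(payload, ["user", "phone", "home"], "n/a"))  # n/a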
> Why though? I'm of the opinion that the error should be handled at the place that produces the first NULL/None.
Producing NULL (or whatever the language's equivalent is) isn't an error -- if there was an error, it would be throwing an exception, not returning NULL. The idea of "do this chain of operations, propagating nulls encountered at any point" is reasonably useful.
I just hate NULL standing in for a boolean notion of existence, I'd rather see a separate boolean member indicating the validity of the other member (string middle_name; boolean has_middle_name;). Or in a strongly typed language an Option/Maybe type. Just assigning things NULL sometimes has the problem of people looking at the object and assuming everything will always be populated.
> But the complexity of any proposed solution for this puzzle is immense, to me: it requires the parser (or more precisely, the lexer) to be able to switch back and forth between indent-sensitive and indent-insensitive modes, keeping a stack of previous modes and indentation level. Technically that can all be solved (there's already a stack of indentation levels that could be generalized). But none of that takes away my gut feeling that it is all an elaborate Rube Goldberg contraption.
I really like this snippet from the justification for the symbol chosen: "APL apparently used +.×, which by combining a multi-character token, confusing attribute-access-like . syntax, and a unicode character, ranks somewhere below U+2603 SNOWMAN on our candidate list".
One of the reasons I have been such a fan of Python for so long is the relatively no-nonsense approach to design decisions that many others would have rushed through.
Python's syntax is already pretty ponderous, this kind of thing pushes it rapidly toward bash-like incomprehensibility. I've been a big fan of the language for years but this kind of thing makes me consider jumping to Lua or something which is more sparing with new complicated sigils.
How does this reduce comprehensibility? In cases where you're doing matrix muls, "a @ b" is way clearer than "a.dot(b)". In other cases, infix "@" won't appear.
I actually find Python's syntax to be extremely clear, even to someone who hasn't written much Python. Can you give some examples of where it's particularly incomprehensible?
Matlab's approach to using * as the matrix multiplier makes sense, because every numerical variable is an array with integers/floats simply being a 1x1 array. Using .* then as the elementwise version works.
I'd have personally preferred to see Python do type testing on the variables - it already does. For example:
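# * already dispatches on the operand type (one plausible illustration):
print(3 * 4)        # 12             -- numeric multiplication
print(3 * 'ab')     # ababab         -- string repetition
print(3 * [1, 2])   # [1, 2, 1, 2]   -- list repetition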
So why not make it a case where * on an int or float does the 'standard' multiplication that already exists whereas * on an array does matrix multiplication?
You arrive at the problem of then not having an element-wise version of the multiplication but it's not as if this solves that problem anyway.
A Python list is not a matrix, so I don't see how this change would entail your statement. Of course, for very specific use cases, a list looks like a numerical vector; the semantics are very different, though.
For numpy arrays, yes (for pretty much all versions). As apl said, Python lists aren't strictly a numerical type.
"Numerical type" isn't quite the right description either, but the underlying point is that the design choices for lists don't necessarily favor mathematical convenience or consistency.
Python's way of treating * as element-wise multiplication when operands are arrays makes it easier to vectorize algorithms - if f(a, b) = a * b, then you can treat a and b as numbers, a number and an array, or as two (multi-dimensional) arrays. In Matlab, you'd have to use .* for that.
The beginning of the PEP (linked document) has a rationale for it, including an argument that matrix multiplication has a combination of features which distinguish it from other binary operations. Check it out. Like most Python PEPs it is quite readable. There are even alternative proposals and links to where the discussion of this issue has occurred.
It seems that the most fundamental reason is that matrix and memberwise multiplication are two distinct operations which use the same datatypes as operands and both have a very good claim to use of the * operator.
Also, just because a language is general purpose does not mean that it cannot support features which might be restricted in their use.
Two of the python's most popular uses are in scientific and numeric fields (see SciPy, NumPy and ipython). This operator seems like it would be very useful for improving readability in those areas which is probably reason enough to use it.
However, please resist the urge to just overload this new operator for something unrelated that you want. The point of infix operators is to improve the readability of code and lessen the cognitive load on the reader. Overloading an operator for a purpose which is not conventional goes exactly against this for... what benefit exactly?
Thanks. I understand its importance to scientific computing and I welcome the change. I was just wondering if matrix multiplication had more flexible uses, like `sum` does.
And the benefit of overloading is still readability, when carefully used and in a different ___domain. And sparingly. For the `@` operator I will be trying it as an "at" operator for accessing objects from a game map, or some sort of translation. If it performs well I may start using it in my pet projects. If it changes my life and enlightens everyone who gazes upon it, I may start using it in production.
I do understand your worry and I promise to be extra careful.
Here's my uneducated guess: the Python maintainers see NumPy as a bit of a killer app, and the NumPy maintainers would like to be a MATLAB killer. The entire MATLAB language is designed to make life easier when dealing with matrices (people think it stands for MATh LAB but it actually stands for MATrix LAB.)
MATLAB went down the matrix rabbit hole so far, they made ordinary scalars second-class types. If you want to multiply scalars or otherwise do element-wise operations, you have to prefix the '*' operator with a '.'. If Python can make their language more linear-algebra friendly while avoiding that sort of silliness, I'm all for it.
You don't have to use .* and friends when operating on two scalars; 2 * 3 works just fine. But if you have two matrices A and B, A * B does a matrix multiplication, and A .* B is needed for element-wise multiplication.
I'm still new to Matlab, but the place where this has bitten me is when I'm trying to write code that works on either a scalar or a vector of scalars: if you aren't specific, the code may run fine against a single scalar but blow up or do the wrong thing with a vector.
Ehm, Python is not a "scripting language". It is a — as you said — general purpose programming language. And it is becoming more and more popular for scientific applications, where matrix multiplication plays a key role.
I honestly didn't expect a downvote for calling Python a "scripting language". I use the term loosely and always thought Python fit spot on. I still think it applies here, but I've edited the parent post to correct this possibly incorrect usage of the word.
Just for the record, these are the characteristics of Python that made me call it "scripting":
- Interpreted
- Highly expressive
- No boilerplate
- Vast majority of programs are short
- Lots of libraries for gluing stuff together
- Used for Turing-complete customization of software in other languages (e.g.: Sublime Text)
- Recommended as replacement for Bash in some cases
Anyway, I'm happy to see the language evolving to meet its users' needs.
> Indeed, and because it doesn't support closures, it's not a true functional programming language either. And because you have to import all sorts of modules to do the simplest things (e.g., regular expressions), neither is it a true scripting language. Indeed, because it doesn't support labeled break or continue statements, it's not even a true structured programming language.
I think the point is, trying to classify programming languages (except ___domain specific ones like SQL or maybe PHP) is kind of pointless.
But to answer your real question, Python is one of the top languages in number crunching. After web work, that's Python's second biggest niche.
I believe that the common usage of Python has switched away from "scripting usage" - it's now generally used to implement stuff, not glue stuff together or replace bash, and the vast majority of programs aren't short anymore.
It can be used as a scripting language, but that doesn't seem to be the main use-case anymore.
No offense intended. I just have a slightly different understanding of a "scripting language": a — often ___domain-specific — language that is merely used to write small scripts/macros. But anyway, I did not want to start a discussion about that. Sorry for that.
It's addressed in the PEP, even with some convincing empirical data. Python is becoming hugely popular as a scientific data processing language. This paragraph from the motivation summarizes:
"We all live in our own little sub-communities, so some Python users may be surprised to realize the sheer extent to which Python is used for number crunching -- especially since much of this particular sub-community's activity occurs outside of traditional Python/FOSS channels. So, to give some rough idea of just how many numerical Python programmers are actually out there, here are two numbers: In 2013, there were 7 international conferences organized specifically on numerical Python [3] [4]. At PyCon 2014, ~20% of the tutorials appear to involve the use of matrices [6]."
I suspect this was to distance matrix multiplication from traditional scalar multiplication operators. Remember that ** is just shorthand for n *s, whereas matrix multiplication is a different beast entirely. There is also added clutter with three *s in a row, where typos may be non-obvious (already a potential issue with ** I guess).
I think that you're not alone, but this is specifically considered and rejected (in the section on "Rejected alternatives …"):
> Add lots of new operators, or add a new generic syntax for defining infix operators: In addition to being generally un-Pythonic and repeatedly rejected by BDFL fiat, this would be using a sledgehammer to smash a fly.
PEP 225 was an attempt to add a load of new operators that was rejected (well, deferred 14 years ago). I don't know of a proposal to define arbitrary binary operators.
Why not use the existing convention for names of infix operators like __mmul__, but allow them to be infix as well? So we could write A __mmul__ B.
I think this is more clear than @ and is already familiar to programmers that deal with operator overloading methods. It would also enable the addition of arbitrary infix operators to python.
You've got to wonder, does matrix multiplication happen so often and is it so inconvenient to call as a function that it's worth complicating the syntax? I'm really not sure about this one.
Yes! Numpy is hugely popular in (at least) physics and machine learning and the main negative comment I get is "Wow that's horrible syntax". I'm still not sure they'll be convinced, however:
> You've got to wonder, does matrix multiplication happen so often and is it so inconvenient to call as a function that it's worth complicating the syntax?
It is empirically observed (and the identified rationale for this PEP) that matrix multiplication is done by overloading existing operators in popular Python libraries (going so far in Numpy as to have separate types differing primarily in whether the * operator does matrix or elementwise multiplication), and that this causes confusion, API fragmentation, and inefficiency, so, yes, it happens enough and is inconvenient enough that a general solution for doing it as an operator that doesn't interfere with existing operators is desirable.
It's not really "complicating the syntax"; it's just adding two new operators and corresponding magic methods. The structure of the syntax is unchanged, and these operators work just like other Python operators, including their relationship to their corresponding magic methods.
Really, you should read the PEP (Python Enhancement Proposal) that is the linked article. It covers your question in great depth.
When you have a multiplication of 5-10 matrices that has to be done in the correct order, the expression gets horribly unreadable when written as nested function applications. And those kinds of expressions are ubiquitous in numerical computing.
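A small sketch of the readability gap, on a made-up chain of four products:

import numpy as np

a, b, c, d = (np.random.rand(4, 4) for _ in range(4))

r1 = np.dot(np.dot(np.dot(a, b), c), d)   # nested function calls
r2 = a.dot(b).dot(c).dot(d)               # method chaining
r3 = a @ b @ c @ d                        # the proposed operator

assert np.allclose(r1, r3) and np.allclose(r2, r3)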
Agreed that deeply nested function application is hard to read, for any purpose (not just matrix multiplication), which is why anyone who cared about readability and verifying correct readability would split it over multiple lines/expressions. Method chaining, given the functions return the correct types, could be a way to avoid that deep nesting too.
Usually the algorithm is described in another document without several expressions & lines. "Readability" in some of these contexts means verifying that you've correctly typed in the expression.
Currently, we can use * to multiply two numpy matrices. We'll have two choices, * and @, to do matrix multiplication in numpy. Is it pythonic? Or should we deprecate *?
No, we should deprecate numpy.matrix which is the culprit here. If there's only numpy.ndarray, the semantics are absolutely clear: * for element-wise multiplication and @ for "proper" multiplication of two matrices.
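A quick illustration of the duck-type trap, assuming current numpy (which now actively discourages np.matrix):

import numpy as np

a = np.arange(4).reshape(2, 2)   # ndarray
A = np.matrix(a)                 # the legacy matrix type

print(a * a)   # ndarray: * is elementwise       -> [[0 1], [4 9]]
print(A * A)   # np.matrix: * is the matrix product -> [[2 3], [6 11]]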
"In numerical code, there are two important operations which compete for use of Python's * operator: elementwise multiplication, and matrix multiplication"
The idea is to keep using * for elementwise multiplication, and use @ for matrix multiplication.
Adding @ for matrix multiplication is to enable a consistent matrix API for python where * is always elementwise multiplication. Numpy -- and other major projects that use * for matrix multiplication in APIs, as reported in the PEP -- have committed to migrate to the new standard approach once the PEP is implemented.
From the PEP, as to why not use a type that defines __mul__ as matrix multiplication:

> The result is a strong consensus among both numpy developers and developers of downstream packages that numpy.matrix should essentially never be used, because of the problems caused by having conflicting duck types for arrays
Looks to me like an issue with Python's lack of static type checking: if you need to define an operator every time two types conflict, you're in for a long list...
Because the whole problem it is solving is that both matrix multiplication and elementwise multiplication are frequently used enough that there is API fragmentation in popular matrix-using Python libraries between places where * is used for matrix multiplication and places where it is used for elementwise multiplication. Read the linked PEP, this is addressed, at length.