"I expected Go to kick Erlang's ass in the performance department but the message latency was much higher than Erlang's latency and we had 225 unhappy customers."
Go is neat and I hope to see it succeed and thrive, but it is simply unavoidable that Erlang has existed for much longer, has been getting tuned for much longer, and this sort of thing is its monomaniacal focus whereas Go is spread a bit more thinly at the moment. Go is trying to be a systems language, Erlang very much isn't.
What was surprising was how much faster the Java implementation was than Go. I expected it to be competitive but I didn't expect it to perform much better.
His reporting is very misleading. If you look at the number of messages and connections, Java only held ~5k connections, while Go held just shy of 10k. In the same amount of time, Java was only able to facilitate the transmission of half the number of messages, so... it depends what you mean by "fast".
Interestingly, Go's "connection time" was by far the lowest.
He also tested with m1.medium instances, which are single-CPU. No mention of whether the instances were 64-bit or 32-bit (this may matter for Go, as the GC currently has some issues under 32-bit).
Tests are hard. Still nice to see a real-world-like comparison between some popular stacks.
He's using io.Copy in the main websocket handler, which is a generic, buffered routine for manipulating byte buffers. I don't know if the other implementations are buffered in this manner. It's likely that there are ways to improve the Go source.
Erlang is significantly more established, but I'm wary of judging technologies on their technological merits alone. Although the technical merits of Erlang are very interesting, I've been quite happy with my experiences using Go.
That big a difference can't be right. There's something wrong somewhere, possibly in Go's websockets implementation (which I think hasn't been updated for a while).
To you and zemo's point, those are specific instantiations of the general point I'm making. I'm sure Go will get there, as long as it stays alive, but it shouldn't be a surprise that Erlang is more polished at the moment. That's not a criticism of Go per se, it's just an effect of where they are in their lifespans. Erlang is basically mature and just polishing, and Go's still a rambunctious early teen with a wide-open future.
I agree with this; however, I'm specifically pointing out that this is not about the polish of Go in a general sense -- while there are a lot of things to improve in the compiler, etc., in this instance there's something wrong somewhere, perhaps in the websockets library. It doesn't mean that there's something wrong with Go per se. Note that websockets is not included in the standard library yet.
When publishing language benchmark results, it's good practice to explain why one language is faster than the other. In other words, profile and figure out the bottleneck.
If you don't know why, then it's better not to post the results until you have figured out the bottlenecks. There could be factors in your environment, ones you are not aware of, that affect one language and not the other. Publishing such numbers will confuse people.
My guess would be that the poorly performing implementations are using the naive select() solution, instead of one of the more scalable solutions like epoll or kqueue.
Pretty surprised to see node.js behave so poorly. We had discussions recently at where I work as to what server solution we should use for a multiplayer platform, and as I had recently built a socket/connection handling server in C they asked what I thought about node.js, and my only concern was that if the underlying code uses select(), then it's going to be a poor choice...but I don't think anyone got around to testing it.
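To make the select() versus epoll/kqueue distinction concrete, here is a minimal Python sketch (not taken from any of the benchmarked servers). select() makes the kernel scan the whole descriptor set on every call, while epoll and kqueue report only the descriptors that are ready. Python's standard `selectors` module picks the best mechanism available on the platform; the helper names `pick_selector` and `echo_once` are made up for illustration.

```python
import selectors
import socket


def pick_selector():
    """Return the best readiness-notification API the platform offers."""
    # DefaultSelector resolves to epoll on Linux and kqueue on BSD/macOS,
    # and falls back to poll or select where neither is available.
    return selectors.DefaultSelector()


def echo_once(sel, timeout=1.0):
    """Echo back whatever is readable on the registered sockets, once.

    Returns the number of sockets that were echoed to.
    """
    echoed = 0
    for key, _events in sel.select(timeout):
        conn = key.fileobj
        data = conn.recv(4096)
        if data:
            conn.sendall(data)
            echoed += 1
    return echoed


if __name__ == "__main__":
    sel = pick_selector()
    server_side, client_side = socket.socketpair()
    sel.register(server_side, selectors.EVENT_READ)

    client_side.sendall(b"timestamp-record")
    echo_once(sel)
    # The selector class name shows which mechanism was chosen here.
    print(type(sel).__name__, client_side.recv(4096))

    sel.close()
    server_side.close()
    client_side.close()
```

Swapping `selectors.DefaultSelector()` for `selectors.SelectSelector()` gives the select()-based variant with the same interface, which is a cheap way to check whether the poller is really what hurts at 10k connections.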
I did a bunch of these benchmarks a while back and the most important factor in a performance test like this is the VM/Language environment. Simply calling into a VM on every IO request will rapidly destroy performance no matter what your IO setup looks like.
Erlang's IO stack is just fantastic for this and is generally very well optimized. If you want to go even faster you can code the fast path in C/C++ and call out to a higher level language as needed. A little dated but helpful: http://www.metabrew.com/article/a-million-user-comet-applica...
I think that Webbit is also pretty good; this benchmark is a little contrived.
The poller is a very important factor, but there are other things to consider as well depending on how fast you want to go. Sometimes 'fastest' is not what you should be looking at, though.
Solutions like node.js, python, etc. copy strings and buffers all the time and this can slow down the server considerably. They don't have an honest iov layer that won't copy data before passing it to writev and you cannot manage your own memory and create memory pools. These factors can have a bigger impact on performance than selecting the right poller.
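For readers unfamiliar with the copy-versus-iov distinction, here is a small Python sketch of the two ways to send a message made of several pieces. It only shows the shape of the two calls; it does not measure what the runtime copies internally, so it neither confirms nor refutes the claim above. The function names are hypothetical, and `socket.sendmsg` is only available on Unix-like systems.

```python
import socket


def send_joined(sock, parts):
    """Concatenate the parts into one new bytes object, then send it.

    The join allocates a fresh buffer and copies every part into it.
    """
    payload = b"".join(parts)
    sock.sendall(payload)
    return len(payload)


def send_gathered(sock, parts):
    """Hand the parts to the kernel as a list of buffers (scatter-gather).

    sendmsg() takes an iterable of bytes-like objects and returns the
    number of bytes sent, which may be less than the total for large
    messages; a real server would have to handle that partial case.
    """
    return sock.sendmsg(parts)


if __name__ == "__main__":
    left, right = socket.socketpair()
    frame = [b"header:", b"timestamp-record", b"\n"]

    send_joined(left, frame)
    print(right.recv(4096))

    send_gathered(left, frame)
    print(right.recv(4096))

    left.close()
    right.close()
```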
That doesn't sound right to me. There are 10k clients, each sending (and the server echoing) one small timestamp record per second. An AWS medium instance has access to half a DRAM pipe (frankly this test should fit entirely in L3 cache, but let's be conservative). The back of my envelope says that the RAM bandwidth can handle about 100kb of data copying per send or receive event. That's a staggeringly large amount of copying, literally thousands of times larger than the size of the record.
Honestly I think CPU or syscall overhead is a more likely culprit.
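The back-of-envelope estimate above can be written out as a few lines of Python. The client count and message rate come from the comment; everything else is an assumption: "100kb" is read as roughly 100 kilobytes, each message is counted as two events (one receive, one echo send), and the 32-byte record size is a made-up stand-in for "one small timestamp record". The bandwidth figure is not stated in the comment; it is simply what those numbers imply.

```python
CLIENTS = 10_000                 # from the comment
MSGS_PER_CLIENT_PER_SEC = 1      # from the comment
EVENTS_PER_MSG = 2               # assumption: one receive plus one echo send
COPY_BUDGET_PER_EVENT = 100_000  # assumption: "100kb" read as ~100 kB
RECORD_SIZE = 32                 # assumption: hypothetical record size, bytes


def events_per_second(clients=CLIENTS, rate=MSGS_PER_CLIENT_PER_SEC,
                      events_per_msg=EVENTS_PER_MSG):
    """Send and receive events the server handles each second."""
    return clients * rate * events_per_msg


def implied_bandwidth(budget=COPY_BUDGET_PER_EVENT):
    """Bytes per second of copying that the per-event budget adds up to."""
    return events_per_second() * budget


def copy_headroom(budget=COPY_BUDGET_PER_EVENT, record=RECORD_SIZE):
    """How many times the record could be copied within the budget."""
    return budget / record


if __name__ == "__main__":
    print(events_per_second())        # 20000 events per second
    print(implied_bandwidth() / 1e9)  # 2.0 (GB/s)
    print(copy_headroom())            # 3125.0 copies of the record per event
```

With these assumptions the budget works out to about 2 GB/s of copying, and a few thousand copies of each record per event, which matches the "thousands of times larger" remark.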
Write a web server in C and a web server in python/erlang/go/node.js. You'll notice that the one written in C (or C++) runs twice as fast. If it was a syscall bottleneck then they'd achieve the same performance.
Zero-copy request handling is not possible in high level languages without hacking.
For a quick test, run httperf against the dumbest node.js web server that returns 204.
Then run the same test against an nginx server that does nothing but return a 204.
The Python version uses Gevent so likely uses epoll/kqueue for this test. It'd be interesting to profile and see where the issues are in the Node.js and Python code.
Using .apply({}, arguments) is actually a bit slower, but either method is so fast that it won't be a bottleneck. I use it in this case for elegance, because I can. Note that in your original implementation this is impossible, because you have to access the "type" variable and use one of two methods (sendUTF or sendBytes). `ws` has only one method (send) and the parameters of this method match the arguments to the callback for on('message').
Not surprised to see Erlang come out on top. As I understand it, this is precisely the type of problem Erlang is good at - doing lots of little lightweight things with minimal overhead.
Erlang is quietly doing heavy lifting all over the place. Amazon's SDB, Facebook's Chat, Various High Speed Trading Implementations, Ejabberd, CouchDB, RabbitMQ... and far more places quietly and privately.
The good news is, it finally seems like other languages are starting to tool out the way Erlang has (Akka, ZMQ, etc).
A common case where it goes very wrong is disconnection logic. It's rather complex and gevent has a very naive view of it, which means you can get stuck greenlets.
Micro-benchmarks rarely are complex enough to expose design problems, my point was that one should at least use the de-facto standard, if not several alternatives too.
This initial release is not meant to be definitive. I welcome pull requests for better implementations in your language/platform of choice and any suggestion to better tune Linux for these tests.
Looking at https://github.com/ericmoritz/wsdemo it seems he's using his own erlang ws implementation and using a third party library for all the others? If this is correct, then it's no big surprise that he does well in his own benchmark that he's developing against.
No. He is using Cowboy[1], which is the mainstream websocket Erlang implementation (uh.. cowboy actually is a socket acceptor pool which happens to have awesome HTTP and WS handlers).