The core idea is incredibly exciting (to us, anyway). Rather than baking in a specific multicore scheduler, we're allowing pluggable schedulers written in OCaml. They use algebraic effects to allow an independent scheduler to compose concurrency among OCaml threads. This will ensure that the OCaml runtime remains lean, and even allow applications to define their own strategies for concurrent scheduling.
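To make that concrete, here is a minimal sketch of a pluggable round-robin scheduler written as an effect handler. It uses the Effect API that eventually shipped in OCaml 5 (the 2015 multicore branch exposed a similar but older interface), and Yield/Fork are illustrative effect names, not the actual multicore API:

    (* A user-level round-robin scheduler as a deep effect handler.
       Yield suspends the current fiber; Fork enqueues a new one. *)
    open Effect
    open Effect.Deep

    type _ Effect.t += Yield : unit Effect.t
                     | Fork : (unit -> unit) -> unit Effect.t

    let run main =
      let q = Queue.create () in                  (* the run queue *)
      let enqueue k = Queue.push k q in
      let dequeue () = if not (Queue.is_empty q) then (Queue.pop q) () in
      let rec spawn f =
        match_with f ()
          { retc = (fun () -> dequeue ());
            exnc = raise;
            effc = (fun (type a) (eff : a Effect.t) ->
              match eff with
              | Yield -> Some (fun (k : (a, unit) continuation) ->
                  enqueue (continue k); dequeue ())
              | Fork g -> Some (fun (k : (a, unit) continuation) ->
                  enqueue (fun () -> spawn g); continue k ())
              | _ -> None) }
      in
      spawn main

    (* Prints B1, A1, B2, A2: the two fibers interleave cooperatively. *)
    let () =
      run (fun () ->
        perform (Fork (fun () ->
          print_endline "A1"; perform Yield; print_endline "A2"));
        print_endline "B1"; perform Yield; print_endline "B2")

The point is that the scheduling policy lives entirely in user code: swapping the FIFO queue for a priority queue or a work-stealing deque changes the strategy without touching the runtime.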
For those asking "How the hell does OCaml not support multicore in 2015????", this is my reply, crossposted from /r/ocaml:

You can make OS-level threads, but they can't both be running at the same time due to the GIL (Global Interpreter Lock). Then why are they even there, you might ask? Because it allows you to do a blocking call on a thread and keep executing other stuff in the main thread. Other languages that have a GIL (and the same restriction) are Javascript (including Node.js), Ruby and Python.
Now, IN PRACTICE, things are a bit different. You're never gonna make your own thread to block on things. You're gonna use Lwt to manage all your concurrency so you can do tons of blocking stuff at the same time and combine the tasks nicely without ending up in a Node.js-style "callback hell".
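For illustration, here is a small sketch of that style using today's Lwt (Lwt.both postdates this thread; Lwt_unix.sleep stands in for a blocking network call):

    open Lwt.Infix

    (* Two "requests" whose waiting overlaps on a single core. *)
    let fetch name secs =
      Lwt_unix.sleep secs >|= fun () -> name ^ ": done"

    let () =
      Lwt_main.run
        (Lwt.both (fetch "a" 1.0) (fetch "b" 1.0) >>= fun (ra, rb) ->
         Lwt_io.printlf "%s, %s (took ~1s total, not 2s)" ra rb)

Both sleeps overlap, so the whole thing takes about one second, even though no OCaml code ever runs in parallel.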
But still, even with tons of concurrency, you don't have parallelism. It's all you need for 98% of your programs, but if you then need to do heavy number-crunching it won't be enough. This is the exact same situation as in Node.js, Python, etc, except that OCaml is massively faster than those languages, so even some CPU-bound work is acceptable.
Currently, there are two options if you wanna do CPU-bound work. The first: use ctypes to call C code easily, run it on the thread pool automatically managed by Lwt_preemptive, and release the lock from within C with caml_release_runtime_system(), so your C code runs truly in parallel; then call caml_acquire_runtime_system() before returning the result back to OCaml, to take the lock back and merge back with the normal code.
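A rough sketch of the OCaml side of that first option (ocaml_crunch is a hypothetical C stub name, not a real binding):

    (* "ocaml_crunch" is a hypothetical C stub that calls
       caml_release_runtime_system() before its hot loop and
       caml_acquire_runtime_system() before returning. *)
    external crunch : Bytes.t -> int = "ocaml_crunch"

    open Lwt.Infix

    (* Run the stub on Lwt_preemptive's managed thread pool and
       handle its result back on the main Lwt thread. *)
    let crunch_async data =
      Lwt_preemptive.detach crunch data >>= fun n ->
      Lwt_io.printlf "crunched: %d" n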
The second option is to do an oldschool fork() and communicate with message-passing. Or have a master that manages workers and communicates with ZMQ, HTTP, TCP, IPC, etc. Or use a library that does it all for you like parmap, Async Parallel, etc etc.
What this "multicore support" means is that you'll be able to have threads in the same process that run in parallel because the GIL is going away. In practice it'll probably be implemented directly into Lwt so you'll be able to do something with Lwt_preemptive and just tell it to run some function in a separate thread and then use >>= to handle its result. It's gonna be simpler than both options I described above.
Again, more technical information is available in my r/ocaml post
> The second option is to do an oldschool fork() and communicate with message-passing. Or have a master that manages workers and communicates with ZMQ, HTTP, TCP, IPC, etc. Or use a library that does it all for you like parmap, Async Parallel, etc etc.
I work on the Hack language typechecker at Facebook. The typechecker is written in OCaml, and since it needs to operate on the scale of Facebook's codebase (tens of millions of lines of code), it's a pretty performance-sensitive program. We needed real parallelism, but doing it with fork() and IPC was too costly for us, both in terms of storage (if you aren't careful you end up duplicating a bunch of data) and CPU (serializing/deserializing OCaml data structures to send over IPC is CPU-intensive).
We ended up doing something somewhat more interesting. Before we fork(), we mmap a MAP_ANON|MAP_SHARED region of memory -- that region will be backed by the same physical frames in each child after we fork, so writes to it in one child process will be visible in the others. We use a little bit of C code to safely manage the shared-memory concurrency here.
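A rough OCaml analogue of the trick, for illustration only: it maps a temp file MAP_SHARED via Unix.map_file (which arrived in OCaml 4.06, and which can't do MAP_ANON without a C stub, hence the file), forks, and checks that a child's write is visible in the parent:

    (* Map a page of a temp file MAP_SHARED, fork, and observe the
       child's write from the parent. *)
    let () =
      let fd = Unix.openfile "/tmp/shm_demo" [Unix.O_RDWR; Unix.O_CREAT] 0o600 in
      Unix.ftruncate fd 4096;
      let page =
        Bigarray.array1_of_genarray
          (Unix.map_file fd Bigarray.char Bigarray.c_layout true [| 4096 |])
      in
      match Unix.fork () with
      | 0 -> page.{0} <- '!'; exit 0            (* child writes the shared page *)
      | pid ->
          ignore (Unix.waitpid [] pid);
          assert (page.{0} = '!');              (* parent sees the write *)
          print_endline "child's write is visible in the parent"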
> We ended up doing something somewhat more interesting. Before we fork(), we mmap a MAP_ANON|MAP_SHARED region of memory -- that region will be backed by the same physical frames in each child after we fork, so writes to it in one child process will be visible in the others. We use a little bit of C code to safely manage the shared-memory concurrency here.
Isn't that similar to how Linux implemented threads for a long time (before NPTL [1])?
I vaguely recall that for a long time people were complaining about the cost of starting threads in Linux, because it basically amounted to fork()+shared memory.
I don't know the history of threads/NPTL on Linux. However, the distinction between "thread" and "process" in the Linux kernel is mostly a human one, not a technical one. Take a look at the clone() syscall -- spawning a thread vs. forking a process amounts to just passing different flags to that call, to tell it whether to copy pages or not, how to assign an ID to the new thread/process, etc. (Not sure if that's how fork() and friends are actually implemented under the hood.)
When implementing fork(2), the return value from clone(2) is the child's PID from the context of the parent process. When implementing pthread_create(3), the return value for the parent is still an integer value which is unique to the thread; strace uses it as if it were a PID when tracing the system calls of individual threads into separate files, which strace can do because it's awesome.
Some more information:
> Linux has a unique implementation of threads. To the Linux kernel, there is no concept of a thread. Linux implements all threads as standard processes. The Linux kernel does not provide any special scheduling semantics or data structures to represent threads. Instead, a thread is merely a process that shares certain resources with other processes. Each thread has a unique task_struct and appears to the kernel as a normal process (which just happens to share resources, such as an address space, with other processes).
>Other languages that have a GIL (and the same restriction) are Javascript (including Node.js)
Not quite true; JS just doesn't support threads at all. It's asynchronous and single-threaded. In node.js's case, an event loop uses a system call like epoll or kqueue to wait for many events at a time, and dispatches those events to the correct callbacks.
You can do parallelism in JS with Web Workers, and they do use native OS threads, but they lack shared memory, and can only communicate using message passing. So from the perspective of the JS code, they behave more like processes than threads. No GIL, in any case.
I think that's not directly language-related but an implementation detail -- although a very important one. Javascript on the JVM (Nashorn) allows full multithreading -- including, then, the need for all the common synchronization stuff.
But of course - Javascript and also the typical Javascript libraries were not made with multithreading in mind.
The thing is that numbers are usually wrapped in other things, like objects, hashtables, arrays, etc, and OCaml is a beast at dealing with that kind of code.
From a purely numbers perspective, its operations on integers have to use the LEA instruction instead of ADD (for example) because of the 1-bit tag (an int n is represented as the machine word 2n+1, so a+b compiles to repr(a)+repr(b)-1, which LEA computes in one instruction), which slows things down a bit, but the speed at dealing with symbolic code as I explained above more than makes up for it.
> To make things worse, non-blocking I/O is done completely differently
> under Unix and under Win32. I'm not even sure Win32 provides enough
> support for async I/O to write a real user-level scheduler.
sigh, VMS got the link between processes, threads, I/O and waitable events (specifically, the link between tying the completion of future I/O to subsequent computation) right from day one. And by virtue of Cutler, therefore, so did NT, and thus, Windows.
UNIX did not. The core concept of separating the work (computation to be done after an event occurs) from the worker[1] (the thread that performs the work) is absent; the manifestation of that is the lack of good, completion-oriented asynchronous I/O primitives. Instead of being able to say to the kernel "here, do this, then let me know when you're done"[2] and moving on to the next piece of work in the queue, you have to do the elaborate non-blocking multiplex dance for socket I/O, palm file I/O off onto a separate set of threads that can block (or do AIO) and generally manage all threading and concurrency primitives yourself.
It took me ten years of UNIX systems programming to suddenly grasp the elegance of the VMS/NT/Windows approach a few years ago. It provides you with everything you need to optimally exploit all your cores for work that is both heavily compute bound and I/O bound.
It has been fascinating to see the difference in performance between Linux and Windows in practice with PyParallel when Windows kernel primitives are exploited properly.
Did you try to run that code under ReactOS too? I would assume (I've not checked) that they follow the NT kernel design -- so should have similar "architectural" performance -- even if I doubt they've had as much time to hand-tune the details.
It'd be interesting if running under the VMS/NT thread/fork model could be seen as a reason to deploy some apps on ReactOS rather than Linux/BSD. Would also be interesting if one could see any difference running a multi-core KVM guest on ReactOS vs a Linux/BSD guest/container/jail. Although I suppose one would need to dedicate a hw nic to see any real results (avoiding the host OS/VM scheduler etc)?
I haven't tried ReactOS; I'm not sure if they have all of the threadpool stuff (Vista+) implemented, and I use that exclusively for PyParallel. It'd be an interesting experiment nevertheless.
I was also curious to see what would happen if I tried to install it on Wine.
> It'd be interesting if running under the VMS/NT thread/fork model could be seen as a reason to deploy some apps on ReactOS rather than Linux/BSD.
I... couldn't imagine trying to use ReactOS instead of Windows for an actual deployment of anything. Why wouldn't you just use Windows? (Serious question.)
I couldn't imagine deploying anything that's closed source/non foss for anything serious. I know that windows is source available (if you have a 100k?+ contract...) -- but really - why would you risk your platform going away?
This isn't academic -- look at Sun OS/Solaris. Granted we have open Solaris etc... but that appears as an accident of timing more than anything -- in retrospect.
Now, for the more relevant part: ReactOS vs Windows: If all you want is the kernel/thread model I could see going with ReactOS (pending actual research, as in: does it actually work :-). If you're deploying SQL Server/IIS .net (pending the so far seemingly serious effort to open .net) -- I don't know why one wouldn't go with Windows, no. In that scenario you'd be beholden (good and bad) to Redmond either way.
But for something like a python fork -- I could see something like ReactOS (or any other alternate kernel) be an interesting thing. You don't need much from the OS -- just classic services: basic filesystem/persistence, perhaps privilege separation (not so important for micro-service vms), scheduling.
Xavier's first sentence states that the two operating systems have a visibly different philosophy, not that one is better than the other. The second sentence should be interpreted in the context of the first: if you try to emulate Unix's primitives with Windows', and especially if you want to do this and write a user-level scheduler that does not occasionally deadlock without reason, you will get stuck in a couple of places.
This doesn't mean that Windows' philosophy does not give you optimal performance in PyParallel. It simply means that OCaml had chosen for its low-level system primitives a Unix model and that it was difficult to make a Windows version of the same primitives so that OCaml programmers could write this kind of program portably between Windows and Unix.
NOTE: without looking up the full post (given the hour in my timezone), I have to say that I don't think the quoted two sentences have anything to do with the discussion. It seems to me that the two sentences assume that a multicore (multiprocessor, at the time the post was written) OCaml runtime is not available, and discuss the options for still providing threads. A user-level scheduler is one option to provide threads to OCaml programs without a concurrent OCaml runtime. Another option is to use Windows' native threads and superior philosophy for blocking primitives to run each OCaml thread as a native thread (although at most one of these will be running at any given time; all the others will be waiting on the heap mutex).
OCaml ended up providing threads under Windows and a Unix-like “Unix” module around 1996-ish, way before the linked discussion. So thanks for the explanation about VMS, but I think it is off-topic, too.
NOTE 2: I have now read the original post. You should, too. It starts with:
> Threads have at least three different purposes:
>
> 1- Parallelism on shared-memory multiprocessors.
> 2- Overlapping I/O and computation (while a thread is blocked on a network
> read, other threads may proceed).
> 3- Supporting the "coroutine" programming style
> (e.g. if a program has a GUI but performs long computations,
> using threads is a nicer way to structure the program than
> trying to wrap the long computation around the GUI event loop).
>
> The goals of OCaml threads are (2) and (3) but not (1) (for reasons
> that I'll get into later)
What makes it relevant to the current discussion is (1), but Xavier is discussing (2) and (3) at the time of the quote you chose to take out of context.
Oh, no, that's what the sigh was for; Windows has the best model, but there's no equivalent on UNIX, so, you end up having to code to the lowest common denominator (the UNIX model) if you want your software to run somewhere else other than Windows (i.e. almost all open source software).
I'm not disputing any of the technical things he's saying; just ranting about the unfortunate nature of two vastly different kernel models, and the fact that no open source stuff properly exploits Windows facilities, despite them being technically superior.
The principal architect of VMS was David Cutler, purportedly the best engineer at Digital at the time (80s), and the best OS designer in the industry.
Digital dropped the ball in the late 80s with regards to management of Cutler and his team, canceling his PRISM project and leaving him and his team disgruntled.
Elsewhere in Seattle, a chap named Bill Gates was flush with billions of cash and knew that the shelf life of DOS was limited; if Microsoft were to succeed, they needed a new, robust, reliable and high-performance OS that they could "bet the company on".
Gates got word that Cutler was disgruntled at Digital, and a mutual party set up a meeting. Cutler was dismissive of Microsoft's technology stack at the time (DOS and some office apps) -- he was a hardcore OS engineer, and DOS was a toy.
Gates persisted, assuring Cutler that he would have the opportunity to build the next generation of OS from the ground up, with essentially unlimited resources at his disposal to do it. Cutler eventually agreed, and the NT kernel project was born.
I actually just read Show Stopper recently. The author is very non-technical and can't really explain the engineering details behind what he's writing about, but if you know something about basic OS design and concepts, that's ok. And the human stories - the stories behind the developers working on the project - are fascinating.
Reading the book and learning the story behind NT's development, it's just amazing that such a good OS came out of that process - they released years after their initial projections and were rushed the whole time. But of course the really good parts of NT - the kernel, the object manager, the pager, async IO, the threading model - were things Cutler and his cohorts had been working on for years, first with VMS, then with PRISM, and then finally in NT. They had YEARS to ruminate about those things before they ever arrived at Microsoft.
The bits of NT that aren't so well-regarded - the registry, NTFS, the graphical shell, csrss.exe and the 'microkernel' design - were completely new and developed in much less time and with less practical experience behind them than they really deserved.
I sent a tweet to the author saying I was really enjoying the book when I was about half way through it and he actually e-mailed me to say thanks. How nice is that!
Dave Cutler is the real stuff of legends. Obviously this is my opinion but I admire him and his work far far far FAR more than anything Linus Torvalds has done.
Arguably, Linus' greatest work was Git, not Linux. Linux is, architecturally, a piece of shit! Actually, wait, so is Git. Mercurial does everything Git does and does it far better and more elegantly. So yeah, wait... one wonders where Linus gets all his fanatics from!
Heh, after reading Show Stoppers and Just For Fun, I think Cutler and Linus are actually very similar and would potentially get along in real life if it weren't for the epic technology divide.
This is the kind of remark that always gets me downvotes. I couldn't agree more with your opinion about git. I guess the attraction of git comes mainly from the fact that most of its users are too inexperienced to know any better. And then, git makes any old random directory a “repository”. That's waaaay cooler than having to get a repo from a central server and integrate your commits with it...
> Of course, all this SMP support stuff slows down the runtime system even if there is only one processor, which is the case for almost all our users...
> What about hyperthreading? Well, I believe it's the last convulsive movement of SMP's corpse :-)
Oh how things have changed. This was written before it was clear just how much of a disaster the P4 was, so it was a pretty reasonable position at the time.
He was hardly the only one thinking that way - I remember Gabe Newell being scathing about multicore/multiprocessing when the PS3 and Xbox 360 were released. All a fad, waste of time.
"In summary: there is no SMP support in OCaml, and it is very very
unlikely that there will ever be. If you're into parallelism, better
investigate message-passing interfaces."
I have started playing with Rust and have used OCaml on and off, but recently I've been diving hard into it, building a REST-based application from top to bottom in OCaml, including all the infrastructure bits. We are slowly making our stuff open source as it becomes useable: https://github.com/afiniate
In short, OCaml is a mature language that has been used for decades in commercial applications. I feel OCaml is the next progression for the people that got excited about distributed systems via the Erlang path and want more of the safety and reasoning that comes from a strongly/statically-typed language like OCaml. Rust may or may not take off, but I am confident OCaml will remain viable for the foreseeable future, and probably gain slow, but steady popularity as engineers see all the cool things you can do like MirageOS: http://openmirage.org/
Isn't the Unicode situation in OCaml more or less the same as in Erlang and Ruby 1.8, ie. "string" is just a byte string, and there's no native encoding support?
Last I checked, there was decent third-party library support in Batteries. I imagine it would be painful if you were to use Batteries' "UTF8.t" string type and had to interface with some other library that used "string" or some other string solution (like Camomile)?
There's no built-in encoding/decoding stuff, ie. you need to use a library like Batteries, Camomile, uutf/uucp if you want to do something like capitalise, split or count characters.
Writing the appropriate glue isn't very hard, the interfaces either work with bytes or have to/from-bytes functions, but I suppose it's a bit annoying (at least when first starting out with the language) to have to figure out which lib is needed for which type of string operation, e.g. if you're into Batteries you still need Camomile (or uucp) for lowercasing:
    module C = CamomileLibraryDefault.Camomile
    module CM = C.CaseMap.Make(C.UTF8)
    module U = Batteries.UTF8

    let lower_initial bytes =
      U.sub (U.of_string_unsafe bytes) 0 1
      |> U.to_string_unsafe
      |> CM.lowercase

    let () =
      lower_initial "Åge" |> print_endline  (* prints "å" *)
Erlang has no string type. Most strings are a list of integers (of any size); you can put Unicode code points there if you want, or integers below 256 if you prefer. There is also a binary/bitstring type, which is an array of bits (if the length is a multiple of 8, it's a binary). You can put whatever you want in a binary; it's binary.
If you'd like things encoded in some way, that's up to you; there is no type to help you (there is a Unicode module which can help convert between encodings).
F# is a member of the ML family, but I'd really hesitate to call it the .NET OCaml. It would be similarly accurate to say that Objective-C is the Apple C++.
They belong to the same family, and they both share a common ancestor that is not object-oriented. But their object systems are very different from each other.
> It would be similarly accurate to say that Objective-C is the Apple C++.
Well it's not inaccurate. Both of them were designed to make C object-oriented, and they tend to be used for many of the same situations because of that.
A more accurate nomenclature would be ".NET's OCaml-equivalent" or "Apple's C++-equivalent" (much like how C# is characterized as ".NET's Java-equivalent"). This falls apart under thorough inspection, of course, but it's good enough for tongue-in-cheek comparisons like this.
F# has a compiler flag to force it to use OCaml syntax parsing. It basically amounts to a strict mode, because F# has very slightly more forgiving syntax than OCaml but is otherwise almost identical.
To call it the .NET OCaml is not too far from the truth. And it was clearly meant tongue in cheek!
I've long admired OCaml, but never used it really. I picked up Rust a while ago and actually made some use of it. I love the language a lot, but it is still too immature and low level for my current situation. I ended up using F# as the compromise, which has been amazing.
There are some areas where OCaml is more advanced than F# (functors, the codegen from the optimized compiler, lack of the msbuild barf sandwich, less hacky on non-Windows platforms), but there are also plenty of areas where F# is more advanced than OCaml (computation expressions, code interoperability, real 32 and 64 bit integers, agents, multicore runtime).
I would say that if OCaml was ideal for you except for the lack of parallelism, then you should definitely check out F# before you go all the way to Rust. Rust is awesome and for the right use case you should use it, but F# is a lot closer to OCaml than Rust is.
I am the author of one of the major web frameworks for OCaml [0]. One of the nicest things about OCaml is that its system-call interface is identical to the C one: I built my entire OWebl platform by reading through C servers.
On the other hand, Rust (like Erlang) reinvents and wraps a lot of these calls in ways that are not immediately obvious. (Or at least, not AS immediately obvious as they are in OCaml.)
This is such a tremendous aid because there are nearly limitless documents and examples of the Unix API.
I'm in the same place sort of, and Golang and Julia seem like the obvious higher-performance transitions from Python. What about Rust makes you consider it so highly?
One of the main reasons: it sounds lame, but I can justify to a customer rewriting a project in Rust because they've heard about it, and they will be able to hire people who have either used it or will at the very least be interested in learning it. Also, the network/hype means that we are going to see good libraries emerge fairly quickly.
* multicore support
A year ago I was ready to move to OCaml, bought books, started to learn it, but the multicore situation was worse than Python's. Today we hear that "there is a good chance it's going to get multicore support". In Rust, it's already there, and it's not an afterthought.
Rust also happened to be slightly faster for most things according to micro-benchmarks, but we know how reliable those are, and we're not talking orders of magnitude here; although it is early, and we can hope it gets even better.
I've seen Nim going back to when it was Nimrod, but my impression (please correct me here if needed) of their development process isn't especially favorable. I do like the syntax and the inherent 'threadiness' of it, for lack of a better word.
Why not C or C++? OCaml and Rust aren't going to interface with Python nearly as well.
Also, it's trivial to write multithreaded extensions in C or C++ (at least for data processing). You just have to make sure to release the GIL.
IMO the trick is to just use the Python C API, and not use any wrappers like SWIG or ctypes. The Python C API is a little odd but it is fairly explicit. Also, it's better to write plain functions rather than classes, but for data processing that is natural anyway. If you need a class, do that part in Python.
Writing "one way" extension functions (that don't call back into Python) is quite easy and can give you a huge performance boost.
I think people get hung up on Python extensions because they are using TWO unfamiliar languages -- the wrapper language, and C/C++. But if you are just using one additional language, it's not hard to figure out.
> OCaml and Rust aren't going to interface with Python nearly as well.
I don't know about OCaml, but Rust (supposedly) will interface perfectly fine with anything providing a C API (which includes Python, along with quite a few other scripting languages from that era, like Perl and Ruby). Rust's ABI-compatibility with C is one of the primary selling points.
I don't know off the top of my head if anyone's tried writing Python extensions in Rust, but I don't see why it wouldn't be possible to do so with at least as much capability (if not more) as C/C++.
If it has C ABI compatibility, then technically it can. That doesn't mean it's easier to interface to Python than C or C++ though. I'd find it hard to imagine it being easier, and I know it's not simpler.
I'll have to watch the video below, but don't you have to write some sort of Rust wrapper for every definition in Python.h? What about macros like Py_INCREF and Py_DECREF? I'd guess that someone has or will eventually do that work, but it's yet another layer of complexity, which can have the same downsides as SWIG.
For better or worse, the Python C API is tightly coupled to the C implementation. That makes it very difficult to be more natural and understandable than C. It's a fairly huge API: https://docs.python.org/2/c-api/index.html
People always try to cover it up with "nice" abstractions, but they invariably end up being leaky (e.g. with respect to threads, garbage collection, OS portability, etc.)
Another huge can of worms: the build system. With a plain C extension, all I need is a C compiler on the system, and I can just do "python setup.py build". The situation with Windows is also quite messy -- I can't imagine Rust making it better.
I have experimented with creating normal .o files from OCaml and linking them with .o files from C/C++. That is a great feature of the OCaml toolchain. Still, I remember the documentation being sparse and I don't even remember how I did it at the moment.
FWIW, my main language is Python, but I love OCaml for certain things, and have grown to like C++ as well. Rust seems very interesting to me for its security properties and because it has native threads rather than including a mini-OS in the runtime like Go. My understanding is that Go is hopeless for many kinds of Python extensions, precisely because of the runtime issue, and calling from Go back into Python. (at least this was true a year or 2 ago)
I'm always wary of creating more layers than necessary. A system composed of Python and C will necessarily have fewer layers than one composed of Python and Rust, simply because Python is written in C and its interface is defined in C.
EDIT: I left out the BIGGEST point -- a dealbreaker. Python does NOT have an ABI. It has an API. AFAIK, that means you have to write a Rust ABI-compatible wrapper for EVERY VERSION of Python.
To make a generalization, programmers learn about C APIs before they understand what the ABI is. It's just more concepts that you need to know about to write correct code. So if you just want to speed up an existing Python program, I would still recommend using a simple C or C++ extension. This solves single-threaded speed issues and gives you parallelism across multiple cores, so you will get huge speedups.
> That doesn't mean it's easier to interface to Python than C or C++ though. I'd find it hard to imagine it being easier, and I know it's not simpler.
I think Rust can be easier to learn than C. You get high-level features like modules instead of include guards and preprocessor hacks, and you can skip learning the part about how to debug segfaults.
> I'll have to watch the video below, but don't you have to write some sort of Rust wrapper for every definition in Python.h? What about macros like Py_INCREF and Py_DECREF?
We have bindgen to automate most of that for you. For macros, you do need to duplicate them, but people should just do that once (and Google brings up several projects in which people have already written bindings for that).
> People always try to cover it up with "nice" abstractions, but they invariably end up being leaky (e.g. with respect to threads, garbage collection, OS portability, etc.)
But I don't see how Rust makes that any worse than in C.
> Another huge can of worms: the build system. With a plain C extension, all I need is a C compiler on the system, and I can just do "python setup.py build". The situation with Windows is also quite messy -- I can't imagine Rust making it better.
Yes, you do need to add Rust support to the extension's build system. That is fair, but I think Rust's advantages outweigh that cost :)
> A system composed of Python and C will necessarily have fewer layers than one composed of Python and Rust, simply because Python is written in C and its interface is defined in C.
That doesn't make any sense to me. It's all machine code at runtime. The pertinent question is whether Rust adds any measurable abstraction taxes at runtime (it doesn't) and whether Rust uses the same ABI as C (it does).
> EDIT: I left out the BIGGEST point -- a dealbreaker. Python does NOT have an ABI. It has an API. AFAIK, that means you have to write a Rust ABI-compatible wrapper for EVERY VERSION of Python.
That is unfortunate, but someone could write a crate that dynamically determines which Python is in use and chooses the right functions based on that, put it on Cargo, and everyone could link against it.
Anyway, people are already writing Python programs that call into Rust via ctypes and whatnot, so this hasn't stopped people yet.
> So if you just want to speed up an existing Python program, I would still recommend using a simple C or C++ extension. This solves single-threaded speed issues and gives you parallelism across multiple cores, so you will get huge speedups.
At the cost of memory safety, which is a big deal. And you have to learn C, which for a dynamic language programmer can be a lot harder than learning Rust.
Since you mentioned multicore, I should mention that the threading facilities available "out of the box" in C and C++ are much harder to use than the equivalent ones available in Rust, and if you reach for something like TBB and Boost now you have all the build system issues you described previously.
Sure, we have the same understanding of what's going on in each situation. But I can't see how anyone could call the Rust version as simple as C/C++. There are simply more concepts involved.
You can try to cover them up with magic code generators and version wrappers, but in my experience those are precisely the leaky abstractions that make people scared of FFI. They work 99% of the time, then bite you 1% of the time. The built-in Python FFI is somewhat ugly, but simple and debuggable.
I'm not saying you shouldn't use Rust for Python extensions -- just that there is an inherent awkwardness in doing so, given that Python is written in C, and Python 2.x has no stable ABI. Of course Rust offers benefits, so if you want to pay the cost for those benefits, it might be worth it.
I'm considering OCaml for a new project where C++ would be the typical choice. Think algorithms handling massive amounts of data, and some numerics.
I have some experience with ML, Haskell & Lisp. OCaml is appealing because it is quite efficient and predictable. Does it have the bit of laziness Clojure has that makes functional programming easy with large data?
Yes, there is support for laziness (see "streams"). A couple things to keep in mind: floating point values in Ocaml are boxed (floats are actually pointers to float data on the heap), and integers are one bit shorter than native types (31 or 63 bits) due to the way that Ocaml values are tagged internally. The native compiler generates good, predictable, but fairly simple code: few optimizations are applied (although there is active work underway, in the "flambda" project, that will significantly change this). Also, there is of course a garbage collector, though it is quite efficient in most cases. These factors may or may not be a performance issue in your own project.
In practice the generated code is already extremely fast and the 1-bit shorter ints help make the GC one of the fastest I've seen. If you do a lot of floating point calculations, you can put your floats in an array and they'll become unboxed.
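A minimal sketch of both points (the lazy stream type is hand-rolled for illustration, not a stdlib module):

    (* A hand-rolled lazy stream: Clojure-style laziness on demand. *)
    type 'a stream = Nil | Cons of 'a * 'a stream Lazy.t

    let rec ints n : int stream = Cons (n, lazy (ints (n + 1)))

    let rec take k = function
      | Cons (x, rest) when k > 0 -> x :: take (k - 1) (Lazy.force rest)
      | _ -> []

    (* float array is special-cased: elements are stored flat, unboxed. *)
    let sum_squares (a : float array) =
      Array.fold_left (fun acc x -> acc +. (x *. x)) 0.0 a

    let () =
      take 5 (ints 0) |> List.iter (Printf.printf "%d ");
      Printf.printf "\n%.1f\n" (sum_squares [| 1.0; 2.0; 3.0 |])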
Are you from JSC by chance? Is this from an authoritative source, or just an observation of likelihood? Would love it if they do indeed push hard for this.
JavaScript Webworkers can't interact with the page at all, they can only send messages around. This implementation detail leads me to believe the browser is probably doing little more than instantiating another JavaScript interpreter and handling IPC/synchronization for you. This is a far cry from true parallelism.
Further illustrating that JavaScript doesn't have parallelism is that Node.js isn't parallel and in fact encourages its users to use process forking instead.
> And I don't really care if it is from an other era - C has threads (as a library but still).
OCaml has threads. It just doesn't have parallel threads (yet). Threads existed before CPU parallelism so of course C and a bunch of other pre-CPU parallelism languages have them. The difference is C doesn't have a GIL whereas OCaml does/did.
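A small illustration of that split, assuming the standard threads library: the worker makes a blocking call, which releases the runtime lock, so the main thread keeps going even though OCaml code never runs in parallel:

    (* Thread.create gives a real OS thread, but the runtime lock means
       at most one thread executes OCaml code at a time. Blocking calls
       like Unix.sleep release the lock, so others keep running. *)
    let () =
      let worker =
        Thread.create
          (fun () -> Unix.sleep 1; print_endline "worker: done blocking")
          ()
      in
      print_endline "main: running while the worker blocks";
      Thread.join worker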
Looked into a static FP language recently. Was torn between OCaml and Haskell. Leaned more toward Haskell than OCaml. Mainly because OCaml feels like it was hacked together, with a lot of very strange and inconsistent syntax and poorly thought out semantics. That said, I haven't chosen either yet, because Haskell has its own share of oddities that I'm still not comfortable with. But at least it feels more pure and consistent and well thought out in its syntax and semantics.
Ocaml and Haskell are both good languages. Haskell probably has a bigger community and more momentum at this point. I switched from Ocaml to Haskell long ago because I wanted parallelism.
The implementation philosophy of the two languages is pretty different, despite being superficially similar in terms of syntax. Ocaml is pretty predictable -- you can look at code and have a pretty good idea of what kind of code the compiler is going to generate.
Haskell is a lot more opaque. Between laziness and a more rigid type system, ghc can do some pretty crazy code transformations. In general, this is a good thing, but it can make performance questions harder to figure out.
I think that Ocaml is easier to learn, but Haskell is more fun, and I've learned more from using it.
Haskell libraries also tend to have more levels of abstraction than Ocaml ones, in my experience. Ocaml libraries don't tend to use things like monad transformers or lenses.
After many years of struggling with Haskell, I've all but given up on it, because of the broken record semantics. They seem to be able to find time to implement monad comprehensions or whatever paper fodder is most in vogue, but you still can't have two datatypes with the same field name in the same module. It's not a serious language.
That is also true of Ocaml, and probably every other ML derived language that treats accessor functions as ordinary functions (rather than using the C-style dot operator). There is a type-directed name resolution proposal that would remove this limitation in Haskell, but it would probably make the typechecker a lot more complicated.
The can't-reuse-field-names thing is annoying, but claiming that it "isn't a serious language" because they made a design choice that doesn't meet your exact expectations seems kind of closed-minded to me.
> ML derived language that treats accessor functions as ordinary functions
OCaml is not in that tradition—field projection uses dot notation and has historically not been an ordinary function (although maybe that has changed in later versions).
The Ocaml way would be to define the functions in separate modules (A.foo, B.foo, etc.). There are convenient syntaxes for declaring which modules are in scope in a given expression, as well as support for first-class modules (you can define functions that take modules as parameters). The module system is sophisticated, and is arguably the "power feature" of Ocaml.
Yes, this is one of the cases where ML modules are elegant, since you can put the definition of the type into its own module, so it becomes F.t and P.t and you avoid field name clashes.
Having two datatypes with the same field name in scope doesn't work well with type inference. OCaml permits it, but it requires type annotations, and in practice it seems like developers solve the problem the same way Haskell does -- separate modules.
That said, working with modules is a lot nicer in OCaml than in Haskell, so it's a less painful solution.
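For instance, a minimal sketch of the module-based solution (Foo and Bar are made-up modules):

    (* Two record types with a clashing "name" field, kept apart by
       giving each type its own module. *)
    module Foo = struct
      type t = { name : string; size : int }
    end

    module Bar = struct
      type t = { name : string }
    end

    (* Qualified access picks the right field unambiguously. *)
    let describe (f : Foo.t) (b : Bar.t) =
      Printf.sprintf "%s (%d) vs %s" f.Foo.name f.Foo.size b.Bar.name

    let () =
      print_endline
        (describe { Foo.name = "alpha"; size = 3 } { Bar.name = "beta" })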
A record field accessor is just a normal function; are you also expecting to define methods with the same name within the same module? If you want an elegant solution, write a typeclass. You have to write your Java in the Haskell way. https://wiki.haskell.org/Name_clashes_in_record_fields
Sure, for a subset of the language. But the point is, a hand-wavy complaint about PL semantics (which can be precisely defined) doesn't make sense when comparing languages in this context.
That's the one for Standard ML, yes. There are others for SML, some using proof assistants [1]. Other (S)ML extensions have formal semantics as well, and OCaml itself is partially specified [2].
This is not really a point release; OCaml versions work a little differently. A version number has three parts: Super-Major.Major.Patch. Super-Major releases are incredibly rare; the last one was the bump to 4, which was done since the language gained support for GADTs (while staying compatible with OCaml 3.x). I don't even know what caused the bump from 2.x to 3.00. The Major part is a normal release in which many features may be added; it is always two digits, of which the first may well be a 0. The Patch part is just for fixes, stuff that was broken and overlooked when the release was done.
So OCaml 4.03.0 is basically 4.3.0 in a Python-esque versioning scheme (remember how many changes were done between Python 2.2.0 and 2.7.0?).
Does OCaml not already support multicore? Is concurrency green thread based? Even at that, there's nothing stopping a user from starting multiple processes....
Yes, the threads are green threads: only one can run at a time. There's also an Async framework in Jane Street's Core library, and there's Lwt.
My understanding is that GC is hard with multithreading, particularly in a functional language where it's going to do some heavy lifting and needs to be very performant.
That's not really true. Erlang's VM is fantastic at GC with thousands upon thousands of green processes multiplexed onto the system threads, allowing soft realtime performance. Similarly, Haskell's Parallel Strategies library works well with the Parallel GC. Immutability makes this a whole lot easier.
Erlang uses only actors as concurrency mechanism and exploits that fact by giving each actor its own heap. So Erlang's GC does not need to accommodate concurrency, even though Erlang itself does.
As to OCaml in particular: you're right that immutability does make it much easier, but OCaml has mutable values, which is precisely why it's hard.
GC can be hard with single threading as well. Even if you don't have concurrency to worry about, it still has to be incremental - meaning to do its work in small batches and provide a guarantee that the process doesn't get blocked for more than X millis.
Also, if you can rely on the data structures stored in your heap being persistent, then you can tune the GC for it. The problem is that you need to make assumptions about the life cycle of those data structures. For example, the persistent data structures used in Scala or Clojure can be pretty heavy for the JVM's garbage collectors, because they tend to produce junk that is neither short-term nor long-term, thus invalidating the assumptions with which the JVM was built. And generally that's OK, because the JVM's GCs can cope pretty well, and if the need to optimize arises, well, both Scala and Clojure are hybrids (just like OCaml), so you can just use mutable stuff if profiling shows problems. So the theory is known and a decent concurrent GC can be built.
Starting multiple processes sucks though, you know, for the kind of use cases OCaml should be well suited for. This is because OS processes are more expensive than threads, and communication and synchronization between such processes get very expensive.
By this argument you could state that Erlang's processes are the way to go, because OS threads are way more expensive. If I have a problem that I can easily make parallel, then I could just start up N OCaml processes and send each a chunk of work, and this would not be much more inefficient than a thread-based implementation. On top of that, I don't like creating a ton of threads in any application; I'd much rather have a fixed number of threads and use channels to send them work than dynamically (on demand) create threads or processes.
Erlang's processes are actually better than threads for many use cases, yes. But the thing is that with 1:1 multi-threading you can build many abstractions on top and for example on top of the JVM frameworks using Erlang's model like Akka [1] and Quasar [2] are very popular. Either way, you can't compare Erlang with platforms whose notion of parallelism involves POSIX's fork, but lo and behold, I'm comparing it with Java because I can.
MOST problems are not easily parallelizable, and you'll end up with concurrency. Concurrency means synchronizing and sending messages between processes. For example, sending messages between processes means opening a socket of some sort, serializing the data you want to send and deserializing it on the consumer's side. That's extremely inefficient, and there's no way you can end up with a pipeline of tens of millions of messages per second, but with 1:1 multi-threading you can [3]. Erlang can't cope with this load, btw.
One other thing I like about working with threads is the user-friendliness. Yes, skipping over the perils of multi-threading, which you can sort of avoid by using better libraries, you can easily do things like number crunching using "parallel collections", combine actors with reactive streams and futures, or fake asynchronous I/O by blocking threads. People nowadays tend to underestimate the utility of blocking threads, but it's pretty cool having an interface like `def fetchData: Future[Result]` that could be implemented on top of Netty (asynchronous I/O) or with JDBC (blocking I/O), only suffer a very small penalty, and still have your process reach 80% of CPU utilization.
So I think OCaml getting multi-threading support is a pretty big deal.
Thanks. I think Erlang is the grandfather of languages using the actor pattern, and at the same time Joe realized that POSIX threads do not belong in business-logic code and it can be done without leaking the number of actual OS threads into the user's codebase. On top of that, Joe also implemented message passing to avoid the shared memory that is a serious safety problem for applications with threads: one thread can take down the entire process. Addressing these in Erlang made it possible to write the first commercial system with extra-high availability and fault tolerance.
The JVM is obviously a great VM, and Scala brings most of the Erlang features (and many other languages') to the table. I am not sure about the fault tolerance. I need to look into how Akka implements the actors.
> Concurrency means synchronizing and sending messages between
> processes.
I think in Erlang you can only do async sending, and when the receiving process wakes up it gets the message through one of the means you mentioned.
This is what I am curious about. I have seen only one system written in Erlang that was massively big. WhatsApp was also running on Erlang, and they achieved something like 1M connections/server. That was impressive. I am wondering how you could do that with Akka?
I am following the development of these, starting to use it in Clojure soon, I am curious how it works out. Personally I prefer Aleph on the JVM for Clojure projects but wanna see what is happening in Quasar & co.
http://kcsrk.info/ocaml/multicore/2015/05/20/effects-multico...