
In fact, you can access the hardware through native calls via JNI (or JNA). Of course, you then have to embed multiplatform libraries and manage the associated issues. Also, the OpenBLAS implementation is very well optimized for several Intel and AMD processors (you can compile it so that it autodetects which one you're using). It can even reach the efficiency of Intel's MKL implementation in single-threaded mode.



We don't even have to guess, since that's exactly what Neanderthal does. Also, I micro-benchmarked lots of options and have yet to find one that fills a similar use case and is faster than Neanderthal+MKL on the CPU, even with the JNI overhead (apart from the obvious direct use of MKL, but that means much more low-level code). Also, most higher-level libraries have considerable overhead. Neanderthal's overhead is tiny.

OpenBLAS's huge drawback is that it only supports BLAS without LAPACK, sparse, tensors, FFT etc.

Anyway, regarding the OP's comment, I guess they meant to suggest implementing all of that in pure Java rather than Java + FFI, since with FFI the native code has to be written in a language other than Java.
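
To make "pure Java" concrete, here is a rough sketch of what the most basic building block would look like with no FFI at all: a naive column-major matrix multiply. The class and method names (NaiveGemm, dgemm) are made up for illustration and are not from Neanderthal, OpenBLAS, or MKL. Nothing here is blocked, vectorized, or tuned, so treat it as a baseline for comparison and not as a competitive implementation.

```java
import java.util.Arrays;

public class NaiveGemm {
    // Hypothetical helper (not any library's API): computes C = A * B.
    // All matrices are stored column-major, the layout BLAS uses.
    // a is m x k, b is k x n, c is m x n.
    public static void dgemm(int m, int n, int k, double[] a, double[] b, double[] c) {
        for (int j = 0; j < n; j++) {
            // Clear column j of C before accumulating into it.
            for (int i = 0; i < m; i++) {
                c[i + j * m] = 0.0;
            }
            for (int p = 0; p < k; p++) {
                double bpj = b[p + j * k];
                // Innermost loop walks down a column of A and C, so access is contiguous.
                for (int i = 0; i < m; i++) {
                    c[i + j * m] += a[i + p * m] * bpj;
                }
            }
        }
    }

    public static void main(String[] args) {
        double[] a = {1, 3, 2, 4}; // [[1, 2], [3, 4]] in column-major order
        double[] b = {5, 7, 6, 8}; // [[5, 6], [7, 8]] in column-major order
        double[] c = new double[4];
        dgemm(2, 2, 2, a, b, c);
        System.out.println(Arrays.toString(c)); // prints [19.0, 43.0, 22.0, 50.0]
    }
}
```

Everything that makes the native libraries fast (cache blocking, SIMD kernels, per-CPU tuning) would have to be layered on top of something like this, which is the work the FFI route lets you skip.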



