I think my point stands that the criticism in this thread is mostly a surface-level reaction, hung up on meaningless slogans like "software 2.0" or "breakthrough".

Your use of the word "seems" is very apt here.

Have you considered that Google might have hired Lattner precisely because he is the founder of LLVM and Swift, and they hoped to leverage his organizational skills to jump-start next-generation tooling? We know Google is heavily invested in LLVM and C++, but dissatisfied with the direction C++ is heading [0]. They are also designing custom hardware like TPUs that isn't well supported by any current language. To me it seems like they are thinking a generation or two ahead with their tooling while the outside observers can't imagine anything beyond 80s era language design.

[0] https://www.infoworld.com/article/3535795/c-plus-plus-propos...




I'm a deep learning researcher. I have an 8 GPU server, and today I'm experimenting with deformable convolutions. Can you tell me why I should consider switching from Pytorch to Swift? Are there model implementations available in Swift and not available in Pytorch? Are these implementations significantly faster on 8 GPUs? Is it easier to implement complicated models in Swift than in Pytorch (after I spend a couple of months learning Swift)? Are you sure Google will not stop pushing "deep learning in Swift" after a year or two?

If the answer to all these questions is "No", why should I care about this "new generation tooling"?

EDIT: and I'm not really attached to Pytorch either. In the last 8 years I switched from cuda-convnet to Caffe, to Theano, to Tensorflow, to Pytorch, and now I'm curious about Jax. I have also written cuda kernels, and vectorized multithreaded neural network code in plain C (Cilk+ and AVX intrinsics) when it made sense to do so.


I've taken Chris Lattner / Jeremy Howard's lessons on Swift for TensorFlow [0][1]. I'll try to paraphrase their answers to your questions:

There aren't major benefits to using Swift4TensorFlow yet. But (most likely) there will be within the next year or two. You'll be able to do low level research (e.g. deformable convolutions) in a high level language (Swift), rather than needing to write CUDA, or waiting for PyTorch to write it for you.

[0] https://course.fast.ai/videos/?lesson=13 [1] https://course.fast.ai/videos/?lesson=14


> You'll be able to do low level research (e.g. deformable convolutions) in a high level language (Swift), rather than needing to write CUDA

Not sure I understand - will Swift automatically generate efficient GPU kernels for these low level ops, or will it be making calls to CuDNN, etc?


The first one. At least as of last year, Swift4TensorFlow's goal is to go from Swift -> XLA/MLIR -> GPU kernels.


Sounds great! I just looked at https://github.com/tensorflow/swift - where can I find a convolution operation written in Swift?


You can't. It won't be available for at least a year I'm guessing.

Even then I'm not sure what granularity MLIR will allow.

On the other hand, you can do it in Julia today. There is a high-level kernel compiler and array abstractions, but you can also write lower-level code in pure Julia. Check out the JuliaGPU GitHub org.
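
For a flavor of what those array abstractions look like, here is a minimal sketch assuming CuArrays.jl (the GPU array package under the JuliaGPU org at the time); a fused broadcast over GPU arrays gets compiled into a single kernel:

    using CuArrays                    # GPU arrays from the JuliaGPU GitHub org

    x = cu(rand(Float32, 10_000))     # copy data to the GPU
    y = cu(rand(Float32, 10_000))

    # The fused broadcast below is compiled into one GPU kernel.
    z = tanh.(2f0 .* x .+ y)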


If it's not ready, I don't see much sense in discussing it. Google betting on it does not inspire much confidence either. Google managed to screw up Tensorflow so badly that no one I know uses it anymore. So if this Swift project is going to be tied to TF in any way, it's not a good sign.

As for Julia, I like it. Other than the fact that it counts from 1 (that is just wrong!). However, I'm not sure it's got what it'd take to become a Python killer. I feel like it needs a big push to become successful in the long run. For example, if Nvidia and/or AMD decided to adopt it as the official language for GPU programming. Something crazy like that.

Personally, I'm interested in GPU accelerated Numpy with autodiff built in. Because I find pure Numpy incredibly sexy. So basically something like ChainerX or Jax. Chainer is dead, so that leaves Jax as the main Pytorch challenger.


I was looking around for a language to write my own versions of convolution layers or LSTM or various other ideas I have. I thought I would have to learn C++ and CUDA, which from what I hear would take a lot of time. Would this be difficult in Julia if I went through some courses and learned the basics of the language?

This would really give me some incentive to learn the language.


You could just use LoopVectorization.jl on the CPU side. It's been shown to match well-tuned C++ BLAS implementations, for example with the pure-Julia Gaius.jl (https://github.com/MasonProtter/Gaius.jl), so you can follow that as an example for getting BLAS-speed CPU-side kernels. For the GPU side, there's CUDAnative.jl and KernelAbstractions.jl, and benchmarks from NVIDIA show that this at least rivals directly writing CUDA (https://devblogs.nvidia.com/gpu-computing-julia-programming-...), so you won't be missing anything by learning Julia and sticking to pure Julia for researching new kernel implementations.
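
To make that concrete, here is a minimal hand-written vector-add kernel in pure Julia, sketched against the CUDAnative.jl and CuArrays.jl APIs of the time (not a tuned kernel, just the shape of the workflow):

    using CUDAnative, CuArrays

    # Each GPU thread adds one element of the arrays.
    function vadd!(c, a, b)
        i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
        if i <= length(c)
            @inbounds c[i] = a[i] + b[i]
        end
        return nothing
    end

    a = CuArrays.fill(1f0, 4096)
    b = CuArrays.fill(2f0, 4096)
    c = similar(a)

    # 256 threads per block, enough blocks to cover the whole array.
    @cuda threads=256 blocks=cld(length(c), 256) vadd!(c, a, b)

A deformable convolution or a custom LSTM cell follows the same pattern, just with more interesting indexing inside the kernel.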


In that benchmark, was Julia tested against CuDNN accelerated neural network CUDA code? If not, is it possible (and beneficial) to call CuDNN functions from Julia?


That wasn't a benchmark with CuDNN, since it was a benchmark about writing such kernels. However, Julia libraries call into optimized kernels whenever they exist, and things like NNlib.jl (the backbone of Flux.jl) and Knet.jl expose operations like `conv` that dispatch on CuArrays to automatically use CuDNN.
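
Roughly like this (a sketch assuming NNlib.jl together with CuArrays.jl, which provided the CuDNN-backed `conv` methods at the time; the same call on CPU arrays falls back to the pure-Julia implementation):

    using NNlib, CuArrays                  # loading CuArrays enables the CuDNN-backed paths

    x = cu(rand(Float32, 32, 32, 3, 16))   # 32x32 input, 3 channels, batch of 16 (WHCN)
    w = cu(rand(Float32, 3, 3, 3, 8))      # 3x3 kernel, 3 in-channels, 8 out-channels

    y = NNlib.conv(x, w)                   # CuArray inputs dispatch to a CuDNN convolution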


I’m not telling you to switch. I don’t think the S4TF team is telling you to switch anytime soon. At most you might want to be aware of, and curious about, why Google is investing in a statically typed language with built-in differentiation, as opposed to Python.

Those who are interested in machine learning tooling or library development may see an opportunity to join early, especially when people have such an irrational, unfounded bias against a language, as evidenced by the hot takes in this thread. My personal opinion, which I don’t want to force on anyone, is that Swift as a technology is underestimated outside of Apple and Google.


Please read the article. It answers your question pretty straightforwardly as "no, it's not ready yet."

But it also gives reasons why it shows signs of promise.

So you should get involved if you are interested in contributing to and experimenting with a promising new technology, but not if you're just trying to accomplish your current task most efficiently.


Google hopes you will be using their SaaS platform to do ML, not just your own server. This is one of the reasons they are pushing so hard to develop this tooling.


When it’s cheaper for 24/7 training jobs than buying the equivalent hardware from Nvidia, sure, why not.


You should probably just read the article before aggressively rejecting a premise it is not suggesting.


Your point doesn't stand, because what you said was a defensive reaction to what you thought was criticism of Swift.

I think you have bought into the Kool-Aid pretty hard here. Everything you are saying is a hopeful assumption about the future.


> To me it seems like they are thinking a generation or two ahead with their tooling while the outside observers can't imagine anything beyond 80s era language design.

Given the ML and Modula-3 influences in Swift, and the Xerox PARC work on Mesa/Cedar, it looks like quite 80s era language design to me.


Swift inherits some APIs from Objective-C.

You have to use something like CFAbsoluteTimeGetCurrent, while even in something not very modern like C# you would use DateTime.Now.



