Ask HN: Getting into AI?
164 points by juansg on Dec 3, 2022 | 98 comments
I'm a software engineer with ~10 years of experience in robotics and in more traditional full stack development. For the past year I've been looking at all the progress happening in ML/AI and each day I'm more convinced that there's a lot of game-changing stuff that will come out of it (what we're seeing with Stable Diffusion and GPT3 are some examples of this).

I would love to pivot my career from traditional backend/frontend web development type work towards ML/AI, but I'm struggling to put a plan in place.

- What would be the main topics to learn?

- What are potentially relevant companies to apply to? In the past I've been wary of companies throwing AI words around, as in reality many of them were just using some basic ML models and calling it groundbreaking AI to hype investors and potential hires.




If you get into it now, you'll probably be on the losing end of a pork cycle [1].

All of the hype is creating overinvestment on the "producer" side of AI. All that overinvestment will mature at around the same time. When it all hits the market at once, those producers will have to compete fiercely with each other while "reality" kicks in, i.e. they learn the difference between hype and real demand to create real value for real paying customers. There will be massive oversupply.

You'd have to find some way to be short that thing, i.e. to somehow take the other side of that trade.

You want to be on the receiving end of that investment with no exposure to the crash that will follow (if any). For example, if you had an AI background now, you could start an AI school. Your customers would be people taking the hype at face value. You'd take their money now, but when it later turns out that the skill isn't worth in the job market what they thought it would be, you're not exposed to that. ...that's what acting school does for wannabe Hollywood superstars. Running an acting school for wannabe stars is definitely a better business than trying to actually be a star.

[1] https://en.wikipedia.org/wiki/Pork_cycle


I agree. I did my degree at a top-10 university in the world: a mixed Bachelor's, then a Master's focused on AI.

All I can say is, the job market for data science, machine learning engineering and similar roles is heavily overcrowded. Due to that competition (lots of supply, not so much demand), salaries will be (a lot) lower than in e.g. software engineering. I didn't even bother going into the field; I heard enough horror stories from friends, several of whom got into highly prestigious AI companies, for which they had to pass 6+ very challenging interviews and compete against hundreds or thousands of other candidates. Yet they get paid peanuts compared to what I now rake in as an SWE. When I do 6+ challenging interviews with a company for an SWE job, the least I can realistically expect is a TC of > $160k.

Sure, those mythical $500k+ salaries exist for data science / machine learning as well, but they are a lot rarer than for SWE, simply because the market is much smaller. So you're playing the game of trying to become a famous football player, where only the top of the top get dream contracts and the rest aren't close by a long shot.

Money is not everything, true, but at some point you have to ask yourself what's worth more: chasing an elusive dream of meaning, or focusing a little on your well-being as well.

I don't necessarily regret focusing on AI, but from a pragmatic point of view, in hindsight I should have taken a couple more systems and cloud computing classes.


Ugh, it's like we are living on different planets. SWEs seem to be commoditized now; all the money and hiring is in the ML space. It's fairly straightforward to get 2x the rate as an ML eng compared to what one would get as a full-stack SWE.


Note that you mentioned ML eng, while the parent mentioned data scientist.


How does one switch careers from SW eng to ML eng?


Either use your network or become an expert with some proof-of-skill. Or rely on luck.


Can anyone give me some advice that isn't completely generic?


There are no bootcamps for ML. You can get a graduate degree (MS) in 1 year if you get into UTexas' online MSDSO, which has a bunch of ML classes (ML/DL/NLP). UTexas itself is a top-10 school in CS, so that might put you on the map for recruiters, though a PhD is strongly preferred. Another option is to take Stanford's AI graduate classes, which are deeper than UTexas' but also more demanding.


I agree that this is possible, and definitely worth paying attention to.

But some hype cycles are "real", e.g., looking at the internet in early 2000, you might have thought it was about to crash or about to be huge. Either way, you'd be right. It was about to crash in the short term, but in the long term, it still made sense to "get into the internet" in 2000 because it was still a secular trend that ended up making a huge impact on the world.


How that cycle phases with the business cycle and inflation will matter a lot. A rising tide can lift sinking ships.


This is overly simplistic. Some hype cycles are real, and drive real long term value, like mobile and cloud. These technologies rendered many businesses obsolete and created many new opportunities.

I wouldn’t be surprised if there was a new AI version of every SaaS out there, followed by a consolidation cycle in a few years.


This is a stunning description of every technology hype cycle. Remember when everyone wanted to get in on VR development?


...oh, I remember many many hype cycles. I'm old enough to have been around when people ripped out relational databases from underneath applications and replaced them with XML files because if you didn't you were clearly living under a rock.


And repeated years later with the nosql mania.


Or when mainframes were the cloud.


what's stunning about it?


Interesting comment, but I think there might be fluctuations around a growing overall demand.


Given the uncertain job market in ML/AI, I would recommend taking a few online classes aimed at practical applications. If you are a full-stack developer working at a company you like, perhaps there are opportunities to write simple models, using available data, that would be valuable to your employer?

Sorry if I sound jaded in those comments, but I have mostly been employed as an AI practitioner since 1982 and I have seen severe cycles of plentiful AI work and little available work.

I really recommend having a generalist’s mindset, and adding data prep/model building/deploying models as yet another skill.

I mostly use deep learning now, and Andrew Ng’s online classes are very good. If you just want to have fun, you can read all of my recent AI books on my https://markwatson.com site, but to be honest I do less teaching of fundamentals and more just offering fun small programming experiments. In addition to Andrew Ng’s classes (still on Coursera?) I have found https://www.edx.org/ classes really useful and most can be taken for free.

I also recommend signing up for OpenAI’s APIs. I spend very little money for a lot of use, and I have all but given up writing my own NLP code, something I have been doing off and on for over 30 years.

I have a difficult time imagining what the future AI work landscape might look like for you but my guess is that tooling and applying new theories will keep getting easier and more people will at least have ML on their resumes. This is why, for both employability and work enjoyment, I recommend the generalist mindset.


Impressive. And how generous of you to keep the books online as well. Thank you.


Even ten years ago this would have been a difficult career switch without a serious (graduate-school-level) investment in the very specialized mathematical background of deep learning models. Nowadays it's even harder. The mathematics hasn't changed much, but the easy pickings are long gone. Now you'd also have to target a specific sub-___domain, with all its associated context and heuristics, whether that is text, image, audio or more exotic stuff like protein folding.

Nevertheless, the above caveat concerns the overhyped "AI/ML" space. Digitization, quantification and automation of information flows is a much more general phenomenon, and with a more modest investment in statistics / data science you can be part of this general trend of productionizing "analytics". Just don't expect bubble-era FAANG salaries; those were the product of very specific conditions.

If a general data science transition works out for you, and you are still interested in the AI/DL/ML bandwagon once you are in a better position to understand why and how it works, you could more easily drift into that space later, as it is just an extremely specialized subset of that world.


> Even ten years ago this would have been a difficult career switch without a serious (graduate school level) investment in the very specialized mathematical background of deep learning models

Isn't it the case that a big chunk of the AI that became popular 10 years ago (via Norvig, Thrun, Ng) is no longer relevant today (except as background)?


No .. when you do MNIST in a DL tutorial it all looks easy. The problem is when things don't work as expected (real-world TM). I have a freakin' PhD in CS (top school where a bunch of this stuff was invented .. some of my friends are "Gods" now) and did a bunch of basic ML in my thesis, but ... I never got deep into the math (my PhD was in something else .. a math-lite field). I know what a metric space is, but do I understand it well enough? What the hell are kernel smoothing methods? Laplacians, Jacobians, what, what, what??? (Fine .. I was joking with the last ones .. I remember vector calculus.) I work in AI and feel I study ALL-THE-TIME. Please don't torture yourself like I did. Get a proper Masters (but realize you need a PhD in the specific sub-field and strong math chops).

Rant over .. I also had a thought for the OP. In web dev (and other kinds of SW dev), you kick it hard enough and it works. I had a job offer at a famous company where they wanted me to build a vision model to detect theft. I got the offer but didn't take the job .. one factor was: if it didn't work with my existing toolbox, what would that mean for me? Do I get fired, or do I quit? This was a serious question I posed to myself.

In research gigs, you take on hard problems and try your best. In an industrial/startup setting, doing AI on hard problems requires incredible training and self-confidence at the leadership level. A data engineer doesn't need this hard stuff; you just work under some scientist. But be aware the turnaround time for experiments is very fast (like the scientists have 5 new ideas on the whiteboard by the time the dev team has walked from the conference room back to their desks). I don't even know how my engineers keep up with it.

I think AI jobs where you do actual innovation/exploration are incredibly hard and require a ton of investment (personally, in terms of a near-24/7 job, and from your employer). Even then success isn't guaranteed. I think there are much easier jobs out there (like cloud, security) that have an equally bright career outlook.
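For anyone as rusty as I was: a Jacobian is just the matrix of partial derivatives of a vector-valued function. A quick numerical sketch, pure Python with central differences (the example function is made up):

```python
# Numerical Jacobian of f: R^n -> R^m via central differences.
def jacobian(f, x, eps=1e-6):
    fx = f(x)
    J = []
    for i in range(len(fx)):          # one row per output component
        row = []
        for j in range(len(x)):       # one column per input variable
            xp, xm = list(x), list(x)
            xp[j] += eps
            xm[j] -= eps
            row.append((f(xp)[i] - f(xm)[i]) / (2 * eps))
        J.append(row)
    return J

# Toy example: f(x, y) = (x*y, x + y) has Jacobian [[y, x], [1, 1]].
f = lambda v: [v[0] * v[1], v[0] + v[1]]
J = jacobian(f, [2.0, 3.0])  # ~[[3, 2], [1, 1]]
```

In frameworks like PyTorch, autograd computes these exactly; the numerical version is mainly useful for sanity-checking.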


> Laplacians, Jacobians, what, what, what???

Can you really get a PhD in anything even remotely STEM-related these days without at least knowing what those things are? I'm growing old...


The parent said they were joking about that example. Not every STEM PhD is math-heavy. CS itself is a big field, with subareas that may or may not require fluency in topics such as optimization (e.g. HCI, databases). ML/DL is optimization-heavy, no? So the point is valid, I think.


SWE ~20 years here. Curious, I spent about 4 days learning and hacking around that Stable Diffusion hype :) Well, there are three types of "boring" work I discovered there:

- researchers - math and stuff: optimizing training and inference, reading tons of research papers, writing papers, implementing low-level algos, a lot of trial-and-error work - that's like 0.0001% of the entire community,

- integrators - building workflow front-ends, wrappers, PS plugins, optimizing the stack, doing some Dreambooth training, tinkering with high-level Python - the 0.1% guys,

- users - running one-click tools or just clicking through some AI-art generator webapp, participating in reddit d* size contests, spreading the news, writing sensational articles for a mass audience, hype preachers on YT, designers augmenting Photoshop skills; the best are trying to monetize the tutorials and AI artwork - 99%.

All of that reminds me a bit of the "blockchain" career hype cycle, which is now fading. For now it is all "vitamins" (vs. painkillers).

I will definitely play with more of it on the side. But for now I am back to my boring and stable devops/backend web job, I see a lot of K8s pains to kill :)


> All that reminds me a bit of "blockchain" career hype cycle which is now fading

We'll see what the job prospects hold in this field, but it bears no comparison with the blockchain hype. People aren't claiming (yet?) that AI art and dialogue will free us from the Shackles Of The Oppressive Financial Regime, with an anonymous manifesto to stir our dreams of revolution.

No -- people are excited about this new AI art and dialogue because it is mind-blowing, magical, inspiring, and wonderful in itself.

I can't wait to see where it goes.


FWIW, the GP's reference to "hype cycle" might be Gartner's Hype Cycle [0] rather than the dictionary definition of hype.

[0] https://en.wikipedia.org/wiki/Gartner_hype_cycle


I’d rather say it’s impressive. But wonderful?

It produces something impressive, yes, and can be useful for producing "content". So it can definitely be useful (painkiller and vitamin) to artists or producers.

But in itself, what it does produce is void of process, meaning and intent. Those still have to be desired, designed, modeled and injected by _someone_.

Or one takes the risk of publishing something either dull or conveying unexpected meaning (thus being inauthentic in both cases).


> But all in itself, what is does produce is void of process, meaning and intent

Here's another way to look at it. Everyone has imagination; few have the refined skills to execute. What you find with the explosion of Midjourney AI artwork is people imagining concepts, letting the compute apply the technique, and then iterating until the artifact resembles the vision. This is actually quite similar to the film director who tells their art director what they want, then reviews, refines, and iterates until the vision is met.

Myself, last week I created an image that matched a visual idea I had over 10 years ago, which I could never execute with my limited drawing/painting skills.

So yeah, on Midjourney there's a ton of meaningless "darth vader cat" images, but there is also meaning and intent.

And we're just at the beginning. Imagine what this will be like when people can tweak an image as fluidly as you could if you were giving a human artist direction in real-time.


There's one difference between blockchain and AI: AI works.


Hmm I can send cryptocurrency across the earth in seconds while my self driving car is perpetually 5 years away…


That's true, but it's also all Bitcoin did 13 years ago; meanwhile, the hype about taking over the world doesn't seem to be coming true.


That certainly depends on whether you are on the paying or the receiving end of the blockchain scam. For one of them it works really well ...


Blockchain works very well for money laundering too!


What are the remaining 0.8999% doing?


Haha, yeah. These likely just go long with NVDA and AMD ;)


You can do a PhD in the field; depending on your background, this wouldn't necessarily mean you'd have to give up working. Many companies are interested in an "industry PhD", and your experience and background in robotics would probably be a great fit at many places. To give you an example, I recently interacted with someone who was doing a PhD while working for Stabilo (a German pen maker).

Here is a random PhD job opening in Stockholm that I just got sent. I know the professor; he would be very happy if someone with industry experience showed up:

We have a PhD opening in my group, addressing “event-based vision in challenging automotive environments”. https://kth.varbi.com/en/what:job/jobID:560761/type:job/wher...


Wouldn't you need at least a few publications in the field to be considered for your group?

Also, I think you forgot to post one link, the one you got sent.


There is no expectation that you have published; you would be evaluated on your academic record but also your industry experience. I can't tell you how things work in the case of an industry PhD, but I can definitely tell you that, since a lot of academic fields now have connections to AI and could really use people with strong engineering skills, there is a good chance you could find such a position.


Is it OK if I email you?


To clarify: the link I posted above was for a job opening not in our group but in the group of Jörg Conradt; I think he would be happy if you sent him a mail. I'm at a physics institute, so it is hard to hire non-physicists. I'm trying to be semi-anonymous here, so I would prefer not to share my mail, but I can reach out to you if you give me a way to contact you.


I have added an email to my description, you can see it if you click my username.

Could you drop me a hi?


SWE 20++ years. Same thing. And here is the way, I think: get the basic skills - what a model is, how they work, how to train them. Get some product idea, probably a website with, say, image-processing AI. You don't have to invent much at the beginning; there are many almost-ready-to-use models - Stable Diffusion, for example. Check the license, make sure it's legal. You don't have to pay much, or quit your job, so it's just your time you are investing. If the site becomes profitable - excellent, you are in. By that time you will know what you want and what you need.

As for myself, I'm considering several options. First, create a robot with computer vision, probably walking, doing something useless. With this proof of skill, join a robotics startup. (Plus those 20++ years, CS, some experience building / training, reading, PyTorch, and so on.)

Second, create a novel model solving an existing problem which many have tried and nobody does well yet. I tried, with promising results, and have ideas on how to do it better. If it works, then either a paper or a product.

Third, put together models from different domains to create a useful product. I'll probably have to reimplement some of them. There is so much free stuff around that it's almost a crime not to use it.

Fourth, keep my current job and have AI as a hobby which may one day become a job.


I switched from being a SWE of ~10 years to a Research Engineer at DeepMind, so while it is difficult (as other comments have noted), it is absolutely doable. I wrote a blog post about it here: [EDIT: fixed] https://medium.com/@kfedvanilla/switching-from-software-engi.... Happy to answer any specific questions you may have beyond what's in the blog!


FYI, the link to your blog post doesn't appear to work (404 error).



You are correct, thanks so much!


Agh sorry, I'm not very good with computers... Fixed now!


>For the past year I've been looking at all the progress happening in ML/AI and each day I'm more convinced that there's a lot of game-changing stuff that will come out of it (what we're seeing with Stable Diffusion and GPT3 are some examples of this).

Wow, I must admit that of all the big branches of computer science, AI/ML is the least exciting to me. I don't know, but all that unreliability just puts me off.

I do agree that it's better to have something 90% automated than 0% or 40%, but the impossibility of getting it to 100% is annoying.

It's been over a decade of huge hype around AI/ML, and yet I feel like the biggest applied AI/ML that affects my life, directly or indirectly, is the search & ad industry or things like chatbots, and it's mehhh.

I don't believe in autonomous cars based on computer vision.


I really want to like AI research, but ultimately find it kind of boring. For me it's a combination of increasing complexity of the models and the requirement of having lots of appropriate data and computation power. Things are more interesting on the mathematical side, as long as you are happy with not getting anything practical out of it.


> getting anything practical out of it

You need to start low; e.g. my friend is non-IT, but I showed him SD and now he's trying out an AI-art t-shirt business. But I agree, AI is still completely dumb, especially at scale - just ask my Android Auto Google whether there is a paid parking zone at the destination it's navigating to: "Sorry, I don't understand". We are still far from ELI5 usability.


I specialized in ML in my Masters, and I was surprised by how boring my first job in ML was. Sure, the results are "sexy" and the job itself is apparently much sought after, but building the model was frustrating. Like you say, the interesting part for me was what people would think of as boring - getting the data cleaned up and correctly aligned, the scaling work, latency issues with serving the model etc. I knew then that ML engineering was not for me.


A full career switch will likely be hard. For everyone working on Stable Diffusion or GPT-3, there are probably thousands of people working on rather mundane ML tasks (nothing wrong with that).

If I were you I would leverage your existing knowledge in robotics and see how AI might apply there.

For example, computer vision will very likely be a major part of robotics. I could also imagine speech-to-text models where you can just tell a robot what to do. However, the main driver here is not necessarily the AI model but the people who understand how to use it in the context of robotics.

I would argue that plain AI/ML is highly scientific work, and unless you have a PhD in ML or a similar field it will be extremely hard to get into. Applying models, tweaking them for your use case, and all the work around that will create the actual value in most applications.


You essentially have two choices:

- Get your PhD and try to become a researcher at one of the labs. Super crowded now, but basically the only way if you want to do research.

- Become an ML engineer, data scientist, etc. While you will be working with ML models and understand them to some extent, you may find that it's not exactly what you wanted. You'll likely end up spending most of your time on the same old engineering stuff: calling into black-box APIs, data engineering, iterating on hyperparameters, building experiment pipelines, dealing with cloud scaling and setting up GPUs, etc.

For example, if you take a look at GPT, 99% of the work that has gone into it is standard engineering/scaling work, not AI-specific work.
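To make "iterating on hyperparameters" concrete: much of the day-to-day is a loop of this shape. A toy grid search in pure Python, where the made-up score() stands in for a real training run:

```python
import itertools

# Toy grid search. In practice score() would train a model and return a
# validation metric; here it's a hypothetical stand-in (bigger is better),
# peaking at lr=0.01, batch_size=64.
def score(lr, batch_size):
    return -(lr - 0.01) ** 2 - (batch_size - 64) ** 2 / 1e4

grid = {"lr": [0.001, 0.01, 0.1], "batch_size": [32, 64, 128]}

# Try every combination and keep the best-scoring one.
best = max(
    itertools.product(grid["lr"], grid["batch_size"]),
    key=lambda params: score(*params),
)
# best == (0.01, 64)
```

Real pipelines mostly add bookkeeping around this loop: logging runs, caching data, and distributing the trials.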


No, you can definitely publish without a Ph.D. You don't have to work in a research lab.

It's as simple as finding a conference, finding the submission style, writing the paper, submitting, and waiting and praying to the gods of peer review.

Of course, you need to have a good idea, but the fruit in AI is still very low hanging.


I didn't say you cannot publish without a PhD. Rather, it's incredibly difficult to get a position in a good research lab without a PhD due to market forces and competition. Publishing in top venues also requires a certain skillset that's difficult to self-study... and these days it often requires money for running experiments. And if you don't want to get position in a research lab / academia, why would you want to publish at all? Getting those citations for your job is the whole point. Otherwise you can just write blog posts.


What is the point of publishing research papers if you're not a PhD student or professor?

Many groundbreaking ideas come from underpaid PhD students anyway, e.g. YOLO.


My 2 cents: don't learn anything now.

Apply to SWE roles in companies that specialize in AI or ML. Tell them you are a veteran SWE and the work they are doing excites you. Find a place where you can keep using your SWE skills while you get paid to learn AI or ML concepts on a practical level.

There is a huge demand for SWEs who specialize in specific industries rather than being generalized SWEs. So first try to become the SWE with an AI interest, then look into AI-focused roles.


I found the Fast AI course to be a great way to learn the foundations. It was recently updated after some years too:

https://course.fast.ai/


> as in reality many of them were just using some basic ML models and calling it ground breaking AI to hype investors and potential hires

Without a PhD in ML, your chances of any leading AI company hiring you to work on SOTA ground-breaking models as a researcher are near zero, unless you are fine with devops/support roles. As for the normal work of an ML engineer on non-ground-breaking stuff, it's actually really boring: you'll spend a lot of your time just collecting and massaging data, and slightly tweaking parameters to see tiny incremental improvements in your model.


Learn what ML and AI "really" are (start with neural nets and simple classifiers, get into GANs etc.) and just experiment with PyTorch or whatever else you like. Pick up Ray at some point and go on from there. Rebuild models from papers and understand them.

I wouldn't recommend "toying around" with ready-to-use stuff like premodeled and/or pretrained models right at the beginning (especially Stable Diffusion; the hype is really big right now), because that won't really teach an understanding of what is going on. Compare with "script kiddies", as people called them back in the day: they just use ready-to-use tools but have no idea what is happening, and hence never surpass those tools or the level they are on, unless they bother to actually start at the beginning some day after all... Just using Linux won't make you an OS developer. What are all those generated prompts and content worth if you can't even tell at first glance whether 0.05 or 0.0003 is a better learning rate for the Adam optimizer?
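On that last point, even a toy example shows how sensitive training is to the learning rate. This is plain gradient descent on f(x) = x² (not Adam, just the vanilla update), pure Python:

```python
# Vanilla gradient descent on f(x) = x^2, whose gradient is 2x.
# The same learning-rate intuition carries over to Adam and friends.
def descend(lr, steps=50, x=1.0):
    for _ in range(steps):
        x -= lr * 2 * x   # gradient step
    return x

good = descend(0.1)     # shrinks toward 0 quickly
tiny = descend(0.0003)  # barely moves in 50 steps
```

With lr=0.1 the iterate is multiplied by 0.8 each step and is essentially zero after 50 steps; with lr=0.0003 it is still near its starting value. (Too large an lr, e.g. >1.0 here, would diverge instead.)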


If you want to learn AI, try https://www.fast.ai/ - it's probably one of the best places to start. Highly recommended.


I am going to work through this as well, but I have minimal coding experience. I'll just learn what's relevant as I proceed through the course.


It seems like a decent path would be to learn enough to start integrating AI into the type of things you are already working on, for example adding recommendations, AI-based tagging or autocomplete to apps that don’t already have it. Users are coming to expect these in more places. Or perhaps building small projects that use off-the-shelf models with some fine tuning. This seems more commercially valuable and less competitive than trying to publish at the edge of the field or being a full-time researcher like some people are talking about here.
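As a sketch of the recommendations idea: item-to-item similarity over embeddings needs nothing beyond the standard library once you have vectors. The item names and embedding values below are invented; in practice the vectors would come from a model:

```python
import math

# Minimal item-to-item recommendation via cosine similarity.
# Embeddings are made up; a real system would get them from a model.
items = {
    "article_a": [0.9, 0.1, 0.0],
    "article_b": [0.8, 0.2, 0.1],
    "article_c": [0.0, 0.1, 0.9],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def recommend(item, k=1):
    # Rank all other items by similarity to the given one.
    scores = [(other, cosine(items[item], vec))
              for other, vec in items.items() if other != item]
    return [name for name, _ in sorted(scores, key=lambda s: -s[1])[:k]]

# recommend("article_a") -> ["article_b"]
```

At real scale you'd swap the linear scan for an approximate nearest-neighbor index, but the shape of the feature is the same.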


I'm in a similar situation. I have 1.5 years of experience in a full-stack web dev job, but the FOMO is so strong with all the advancements in AI. I fear my skills won't be future-proof and that eventually AI is going to take my job. These thoughts are taking a toll on my mental health, and as a result I can't focus on improving my skills because of the constant fear of missing the hype train / making the wrong career decision.

Is this fear justified? How should I proceed? Is software development dead? So many questions and uncertainties...


One thing you have to consider is how important a role context plays in your job. AI does not have the context of your job/codebase and therefore can only do very small tasks that have probably already been done before (e.g. "Write a function that reverses a string" - which can be found in a few seconds by Googling).

You have nothing to worry about until we're able to feed entire codebases into an AI and have them generate features or fix bugs, and even then, having the context of what the task is on a human level requires even more advanced computing. We're just not there yet.


* if I were college age I'd definitely try to get as much exposure to AI-related courses as I can

* in our line of business nothing is future-proof, in 10 years your resume will retain at most 20% of the buzzwords you put on it now

* In the last 15 years backend development has lost its shine (interesting problems moved to DE/MLE or got replaced with cloud services). You stand a better chance with frontend development and native mobile Swift/Kotlin stuff is probably a natural extension

* it was said 20 years ago that software development was not a promising career and that developers would be replaced by this or that (e.g. RAD/no-code, AI, automation, outsourcing); IMO jobs moving to cheaper locations is actually the highest risk, with WFH normalized at scale for the first time in history


If AI is able to do your SWE job, I can't really see why it wouldn't be able to do basic to mid-level AI work too. In that case lots of data scientists will lose their jobs as well... with perhaps a tiny minority of freak geniuses still developing and experimenting with better AI techniques. I see where you're coming from, but don't make huge changes based purely on fear. You'll keep being afraid even if you move to AI... it's in our DNA to be worried all the time.


1. Watch out for your mental health.

2. Learn about time management.

3. Give yourself time to learn new things and experiment with them; it should be a joyful experience.


I'll be honest, ChatGPT gave me a ton of FOMO. Suddenly it felt like I was irrelevant and useless.


What you’re probably looking for is becoming an AI app developer, leveraging inference to create new types of products. No need to become an expert in ML and training to start with. That’s similar to what some people did about 12-15 years ago, pivoting from web development to smartphone app development.


You can make the transition; there's demand for all kinds of roles along the spectrum, though likely at different companies. Starting to work closely with the ML team as a SWE would be a start, then take over some of the pipelining work, and so on. Just be realistic that it is precisely that, a transition, and that you will have to apply yourself and take initiative. No, you probably won't become a Google AI researcher, but probably that's not your genuine desire anyway.


The other job is data processing for AI.

Collect data from the field, cleanse it, and then feed it to ML. It's a lot of work, especially the organizing and cleansing side, which can be pseudo-automated up to a point, with the rest processed manually.

Data collection for AI already has a large market. If you want to get involved in ML/AI, you either need to buy data from the market or do the heavy lifting of collecting and processing it yourself.
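The cleansing step is mostly unglamorous normalization and filtering. A minimal sketch in Python (field names and rules are hypothetical):

```python
# Typical pre-ML cleansing pass: normalize labels, drop unusable rows,
# and deduplicate by id. Field names and rules are made up for illustration.
raw = [
    {"id": "1", "label": " Cat "},
    {"id": "2", "label": "dog"},
    {"id": "2", "label": "dog"},   # duplicate id
    {"id": "3", "label": ""},      # unusable: empty label
]

def cleanse(rows):
    seen, out = set(), []
    for row in rows:
        label = row["label"].strip().lower()
        if not label or row["id"] in seen:
            continue               # skip empty labels and duplicates
        seen.add(row["id"])
        out.append({"id": row["id"], "label": label})
    return out

clean = cleanse(raw)  # two rows survive
```

Real pipelines add schema validation, unit conversions, and manual review queues on top, but the core is filters and normalizers like this.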


SWE 10 years. I got into ML/AI recently after a background in UI development (from JS GUIs, to Python CLIs and APIs.)

AI was not on my radar as a realistic career option until the startup I was working for got acquired by <big tech company>. A couple years after the acquisition, I started exploring options and ended up taking a transfer to an AI SDK team (even though I had no experience with AI at the time.)

I build tools for the people doing the actual AI work, so in my case, it was more useful to know how to talk to users and build systems to steer people toward best practices. I'm learning more about ML/AI on the job, but I was productive from day 1 because of my software engineering skills.

I'd look at companies doing AI and see what they're looking for. You might be surprised at how much overlap there is with your current skillset. If you learn some AI skills on your own and look for jobs on the tooling side, you can probably stand out from the crowd earlier than you'd expect.


There are already way too many people in the field who see Stable Diffusion, GPT, etc. in the news and want to do something like that. Stable Diffusion looks impressive, but I have no idea what the business use case for it is. Likewise, lots of people want to do "deep learning", but most companies don't have that kind of data, or it doesn't make sense otherwise. 99% of data science work is more boring: what you need is to try to understand the business, be familiar with a wide range of methodologies, and have an exploratory mindset for finding a good solution to a particular problem. This is not to say you can't do "AI" or "deep learning", but a narrow focus, an overcrowded market, and no existing skills are not a good combination. There's plenty of business need for general data engineering and data science, though, which can also serve as a path forward even if it doesn't interest you as an end in itself.


If you are interested in cutting-edge, research-oriented work like the development of models such as GPT-3 and Stable Diffusion, then you should know that such opportunities are rare and mostly limited to large FAANG-type companies. And you need to be super talented, with a PhD from a top-tier college.

The next group of ML practitioners are data scientists or ML engineers at smaller technology companies and startups. Their work involves applying advancements in these areas to improve their products or services; very rarely do they do research-oriented work. You could look into applying for such roles. An ML engineer role would be more suitable given your background. My advice would be to get your hands dirty and try to build a product on top of some ML model. It would give you some experience in model deployment and MLOps-type activities, which companies often look for.
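The deployment side of that advice, stripped to its bones, is: train a model, serialize it, and wrap it behind a validated predict function, just as a web service would. A minimal sketch assuming scikit-learn, with a placeholder model and made-up toy data:

```python
import pickle
from sklearn.tree import DecisionTreeClassifier

# Placeholder model: stand-in for whatever model your product actually uses.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 0, 1, 1]  # label depends only on the first feature
model = DecisionTreeClassifier().fit(X, y)

# "Deployment" in miniature: serialize the trained model, then expose a
# predict function with basic input validation, as an API endpoint would.
blob = pickle.dumps(model)

def predict(features, blob=blob):
    if len(features) != 2:
        raise ValueError("expected exactly 2 features")
    loaded = pickle.loads(blob)
    return int(loaded.predict([features])[0])

print(predict([1, 0]))  # -> 1
```

A real product swaps the pickle round-trip for a model registry and the function for an HTTP endpoint, but the shape of the problem is the same.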


> My advice would be to get your hands dirty and try to build a product on top of some ML model

I second that. Grab Stable Diffusion and build something around it using your current experience. On the side, I would suggest a structured training program such as a Udacity Nanodegree. Of course, you can do that on your own and for free, but having the structure and having paid for it made it easier for me to stick with it.


GPT-3 and Stable Diffusion are just useless toys. Where AI is practically applied nowadays is police and government surveillance and new smart weapons. Go there if you want to make a real impact.


In addition to gaining a strong foundation in math and computer science, there are a few other things you can do to get started in the field of AI. One is to start building your own projects and experimenting with AI technologies. This can be a great way to gain practical experience and develop a deeper understanding of how AI works. There are also many online resources, such as tutorials and courses, that can help you learn about AI and get started with building your own projects. Finally, networking with others in the field and staying up-to-date with the latest developments in AI can also be incredibly valuable. Participating in online communities, attending conferences and workshops, and even collaborating with others on AI projects can all be great ways to learn and grow in the field.


This sounds like a ChatGPT response.


The spectrum is too wide, might need to dabble to see which works for you:

- Mainstream vision/text/speech, the kind of thing Hugging Face et al. do. The classic path: learn SGD, stack network layers, pick a loss, train, deploy on the web, collect and clean data, retrain

- Hardware accelerators for inference and training. Check out startups like Tenstorrent, tinygrad, Mythic. This also involves compilers, net-graph data structures, and good software frontends and abstractions for writing models onto the hardware.

- Anything else: applying ML to finance, manufacturing, oil and gas, all that boring but important stuff
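The "learn SGD, pick a loss, train" loop in the first bullet, reduced to its essentials, is just this (a numpy sketch fitting a toy linear model with full-batch gradient descent; nothing here is specific to any framework):

```python
import numpy as np

# Fit y = 2x + 1 with a mean-squared-error loss and gradient descent.
rng = np.random.default_rng(0)
xs = rng.uniform(-1, 1, size=100)
ys = 2.0 * xs + 1.0

w, b = 0.0, 0.0   # model parameters, randomly-enough initialized at zero
lr = 0.1          # learning rate
for epoch in range(200):
    pred = w * xs + b
    err = pred - ys
    loss = np.mean(err ** 2)
    # Gradients of the MSE loss with respect to w and b.
    grad_w = 2.0 * np.mean(err * xs)
    grad_b = 2.0 * np.mean(err)
    # Step against the gradient.
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 2), round(b, 2))  # converges near 2.0 and 1.0
```

Frameworks like PyTorch automate the gradient computation and swap the two scalars for millions of tensor parameters, but the loop is the same.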


Just in case someone wants to apply: a Computer Vision Engineer role at a robotics startup, already 159 applications in 4 days. Looks like this area is overcrowded. Other than that, it's cool.

https://www.linkedin.com/jobs/collections/recommended/?curre...

Computer Vision Engineer

Lawrence Harvey · Boston, MA · On-site · 4 days ago · 158 applicants

$160,000/yr - $220,000/yr · Full-time · Mid-Senior level · 51-200 employees · Staffing and Recruiting


This sounds like what you’re looking for:

https://github.com/AMAI-GmbH/AI-Expert-Roadmap

Interactive version:

https://i.am.ai/roadmap


> In the past I've been wary of companies throwing AI words around, as in reality many of them were just using some basic ML models and calling it ground breaking AI

What made you change opinion? Afaik, nothing has changed in the field except the amount of hype surrounding it.


> basic ML models

Disclaimer: I don't work in the game changing language model stuff, so maybe I'm the wrong person to answer

But I'd kind of assume basic ml models are one of the essential prerequisites, so go read Elements of Statistical Learning if you haven't already?


You should leverage your SWE skills for the transition.

As things stand today, here are some things to start with:

- Python 3

- PyTorch

- Transformers

Some overarching advice:

Write your own code when possible. Test it with the most realistic data you can find. Iterate over your models and code, always.

There is no magic. It is all software.
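To back up the "no magic" point: even the attention operation at the heart of the Transformer models mentioned above is a few lines of array math. A numpy sketch (single head, no learned projections, made-up random inputs, for illustration only):

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(Q, K, V):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    weights = softmax(scores)
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))   # 4 tokens, dimension 8
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))

out, weights = attention(Q, K, V)
print(out.shape)  # (4, 8); each row of weights sums to 1
```

Everything a real Transformer adds on top (learned projections, multiple heads, masking) is more of the same kind of code.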


Get a PhD in ML if you are serious, otherwise all you'll be doing is data engineering/cleaning. Eventually, take at least online Stanford grad courses for credit to have some credibility and to be able to stand out.


I respectfully disagree. A PhD is for applying knowledge you already have. If you want to go the academic route, go for a Masters; that's what you use to learn. I have a PhD in CompSci, and I can tell you, that's NOT the path to follow to learn a subject.

Now, if you want to go the independent learning route, I can recommend the Udemy course: https://www.udemy.com/course/python-for-data-science-and-mac... It will give you a good hands on overview of all the current topics in Machine Learning.

You first will be learning to USE ML methods, then you can start extending the ML field itself (if you want).


I agree with you in principle; however, it's about perception. For some reason, a PhD (and, even better, papers at top conferences) is the minimal qualification for great ML jobs, so just having an MS in ML is often insufficient. If you want to start your own company doing ML or work on entry-level ML things, then yes, alternatives exist. But if you want to work on important things for a lot of money, the non-PhD route is rare; I've only once met a person who dropped out of a PhD at a top-5 university to work on top-end ML. And unfortunately, given the pace of the industry, not working on top-end stuff means working with significantly worse models/ideas that somebody already used and discarded for something better. I have some private info about what the best ML folks are doing, and many techniques even taught in Stanford grad courses are already obsolete (not just architectures, but whole categories of how certain things are modeled in ML).


This is an excellent and not-often-suggested course: https://bloomberg.github.io/foml/#home


I would like to create one that helps me write. I tried a few and they were very superficial.

Something like: I supply an article, and the AI optimizes it with well-known techniques that help with learning the content, etc.


Find a problem that you think can be solved with AI or ML. Work on solving that problem. Based on your background I would do something in the AI space that intersects robotics.


It's not that simple. To train a net, you need a lot of computing time and lots of labeled samples. Both are expensive. Before you start to "work", you would need to secure funding.


For many problems large companies have, decision trees or SVMs work just fine :) The rest is huge hype, where a hyped solution searches for problems, like blockchain.
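For instance, an SVM in a few lines of scikit-learn covers the kind of tabular problem the comment above has in mind (the data here is made up for illustration):

```python
from sklearn.svm import SVC

# Toy tabular problem: classify points by which side of x = 0.5 they fall on.
X = [[0.1, 0.9], [0.2, 0.4], [0.3, 0.7],
     [0.8, 0.2], [0.9, 0.6], [0.7, 0.1]]
y = [0, 0, 0, 1, 1, 1]

# A linear SVM separates this cleanly; no deep learning required.
clf = SVC(kernel="linear").fit(X, y)
print(clf.predict([[0.15, 0.5], [0.85, 0.5]]))  # -> [0 1]
```

On small, structured business data, models like this are often easier to train, explain, and maintain than a neural net.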


Not generalized advice, but once you start learning, please check out the scikit-learn documentation. It's very concise in its explanation of concepts.



If you’re excited about Stable Diffusion, why not join the Hugging Face Discord and participate in their community?


AI bubble coming!



