I am a biomedical scientist considering a switch to computational biology (specifically computational immunology). I would love to get to the point where I can get a job in it within the next 6 months. Any tips would be much appreciated.
Computational biology is a pretty broad term. Usually things that have to do with computers & biology are called either bioinformatics or computational biology. Briefly, for bioinformatics you’d need things like C++ under your belt and an interest in coming up with ways to make things run really fast and efficiently on huge sequencing datasets.
Comp bio is a super fun field to be in. For me it's mostly using computers to do biology. But it’s a mixture of domain knowledge in bio, a good grasp of stats, and a whole lot of programming (usually not terribly difficult tasks, though).
Basic python and R are what you absolutely need to know (I started with intermediate python, no R). To do comp bio well, you need to learn computational statistics. I can’t stress enough how much knowing statistics matters in this case because there are so many assumptions that all sorts of libraries make about sequencing data and you need to decide for yourself how you’ll go about things and produce good science.
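To make the point about hidden assumptions concrete, here is a minimal sketch. The example is mine, not from the posts above: a plain Poisson model of read counts expects the variance to be roughly equal to the mean, and replicate sequencing counts often vary much more than that. The counts below are made-up numbers and the 1.5 cutoff is an arbitrary illustration, not a formal test.

```python
import statistics

# Hypothetical read counts for one gene across 8 replicate samples
# (made-up numbers, purely for illustration).
counts = [12, 30, 5, 48, 22, 9, 61, 17]


def dispersion_index(values):
    """Variance-to-mean ratio; a Poisson model expects this to be near 1."""
    return statistics.variance(values) / statistics.mean(values)


ratio = dispersion_index(counts)
print(f"mean={statistics.mean(counts):.1f} "
      f"variance={statistics.variance(counts):.1f} "
      f"ratio={ratio:.1f}")

# Arbitrary illustrative cutoff, not a statistical test.
if ratio > 1.5:
    print("Counts look overdispersed relative to a Poisson model")
```

If a quick check like this disagrees with what your library assumes, that's the moment to go read how the library models the data before trusting its p-values.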
On a practical level for comp bio, I suggest:
1. Learning Python & R
2. Basic knowledge
3. Knowing what your fave labs use for techniques (e.g. NGS? What kind of NGS?) and learning how it works
4. Learning probability & statistics (lin alg always helps too)
5. If you get bored, learn clustering methods... Because, good god, people in this field love seeing pretty t-SNE figures and 98% of them have no idea how they just produced what they did but make biological assumptions based on it. You’ll probably have to learn them anyway.
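On the clustering point, one way to stop treating these methods as a black box is to write a simple one yourself. Below is a rough sketch of k-means (Lloyd's algorithm) in plain Python. I'm using k-means rather than t-SNE because it fits in a few lines. The function name, the toy data, and the parameters are all made up for illustration.

```python
import random


def kmeans(points, k, n_iter=20, seed=0):
    """Lloyd's algorithm on a list of (x, y) tuples. Returns (centers, labels)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k randomly chosen data points

    def nearest(p):
        # Index of the center with the smallest squared Euclidean distance to p.
        dists = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centers]
        return dists.index(min(dists))

    for _ in range(n_iter):
        # Assignment step: group points by their nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p)].append(p)
        # Update step: move each center to the mean of its members.
        # A center with no members stays where it is.
        centers = [
            (sum(m[0] for m in members) / len(members),
             sum(m[1] for m in members) / len(members)) if members else c
            for c, members in zip(centers, clusters)
        ]

    labels = [nearest(p) for p in points]
    return centers, labels


# Toy data: two well-separated blobs of 50 points each (synthetic, not real data).
data_rng = random.Random(1)
blob_a = [(data_rng.gauss(0, 0.5), data_rng.gauss(0, 0.5)) for _ in range(50)]
blob_b = [(data_rng.gauss(10, 0.5), data_rng.gauss(10, 0.5)) for _ in range(50)]

centers, labels = kmeans(blob_a + blob_b, k=2)
print("centers:", centers)
```

Worth noticing: the algorithm hands back a partition for whatever k you ask for, whether or not the data really has that many groups. That's exactly why a pretty cluster plot on its own isn't evidence of biology.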
Specifically about immunology, I'm not so sure - I study cancer, but most of my grad companions in comp immunology use similar techniques to mine. There might be very domain-specific immunology techniques that I'm not aware of, though. Regardless of the "shared" techniques, you need a lot of domain knowledge to interpret your results.
But as broad advice:
If you know a broad technique you're interested in, see which immunology lab does that and go from there.
If you already know the types of immunology questions you wanna go after, find the lab that studies that question, then see which techniques they use and learn those.
The resource I recommend to people looking to move from wet lab to dry lab stuff is https://www.biostarhandbook.com/. From your post history it looks like you already have some programming experience, so you could skip the first few chapters which are just a linux intro. I don't think it has all the best practices, but I think it's the most comprehensive overview that starts from square 1 and fills in all the gaps no one tells you when you first start, for example the "Common data types" chapter.
I think one of the easiest ways to get into it is by knowing how to use a software called MaxQuant Perseus (https://maxquant.net/perseus/). It's like advanced Excel that was designed so scientists don't have to learn R but still get the job done. Good luck with your journey!
A biomedical scientist probably has a good base in proteomics already, so learning the ins and outs of the tools used for research would be the best option for a bio scientist wanting to get into the computational side of things.
I think the main question you need to answer for yourself is whether you're more interested in biology or programming, i.e. do you want to use software tools to do biology (computational biology), or do you want to make the tools that others use (bioinformatics)?
If biology, you need to focus on bio, stats, Python, R, and a hundred other specialized tools for working with data.
If you're more interested in programming, you can get away with much less bio/stats knowledge, unless you're working on developing low-level algorithms. A lot of the work has more to do with efficiently storing, moving, and visualizing large datasets. Bonus here is that much of this knowledge is transferable to other (much higher paying) domains if you get burned out or want to sell out.
My current job could be described as bio-aware web development, with an emphasis in data visualization. I need to know a decent amount of biology, but I can almost always defer stats to others in the lab with more expertise.
Knowing how to program is a must: Python and R are both quite popular. Being hands-on with Linux helps as well, as many real-world datasets won’t fit on your laptop, so you’ve got to use high-performance computing infrastructure. But it’s mostly about being able to make inferences from data. You need a solid stats background for that.
Wow, incredible resource! Thanks for linking it up. Looks like the raw dataset is compact enough to run experimental inference right on your laptop ;)
A somewhat related, but certainly left-field question: is there a similar tutorial / library for exploring the frontiers of quantum computing and neuroscience?
Quantum Computing at the Frontiers of Biological Sciences
Are you specifically interested in immunology? Computational biology is a pretty broad umbrella. My personal experience was learning bioinformatics, which was heavy on Python and genomics/proteomics.
A brief search for comp immunology turns up things like data mining and mathematical modeling, so I would assume Python and R would be a good place to start. You may even be able to find some online lectures that cover the basics.