Ask HN: Best way to learn computational biology/immunology?

__keshav · on Jan 16, 2021

Computational biology is a pretty broad term. Usually things that have to do with computers & biology are bioinformatics or computational biology. Briefly, for bioinformatics you’d need things like C++ under your belt and an interest in coming up with ways to make things run really fast and efficiently on huge sequencing datasets.

Comp bio is a super fun field to be in. For me it's mostly using computers to do biology. But it’s a mixture of domain knowledge in bio, a good grasp of stats, and a whole lot of programming (usually not terribly difficult tasks, though).

Basic python and R are what you absolutely need to know (I started with intermediate python, no R). To do comp bio well, you need to learn computational statistics. I can’t stress enough how much knowing statistics matters in this case because there are so many assumptions that all sorts of libraries make about sequencing data and you need to decide for yourself how you’ll go about things and produce good science.

On a practical level for comp bio, I suggest:
1. Learning python & R
2. Basic knowledge
3. Knowing what your fave labs use for techniques (eg NGS? What kind of NGS?) and learning how it works
4. Learning probability & statistics (lin alg always helps too)
5. If you get bored, learn clustering methods... Because, good god, people in this field love seeing pretty tSNE figures, and 98% of them have no idea how they produced what they did but still make biological assumptions based on it. You’ll probably have to learn them anyway.
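To make the clustering point a bit more concrete, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. Note that this is k-means, not tSNE: it is just about the simplest clustering method to read end to end, which is why it is used here. The function name `kmeans`, its parameters, and the toy two-blob data are all made up for illustration; real single-cell pipelines do much more than this.

```python
import random
from math import dist


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm on a list of equal-length tuples (illustrative only)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        # Update step: each center moves to the mean of its assigned points.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(centers[c])  # leave an empty cluster's center in place
        if new_centers == centers:
            break  # nothing moved, so the assignment is stable
        centers = new_centers
    return labels, centers


# Toy data (an assumption for the example): two well-separated 2D blobs.
data_rng = random.Random(1)
blob_a = [(data_rng.gauss(0, 0.3), data_rng.gauss(0, 0.3)) for _ in range(20)]
blob_b = [(data_rng.gauss(10, 0.3), data_rng.gauss(10, 0.3)) for _ in range(20)]
labels, centers = kmeans(blob_a + blob_b, k=2)
print(labels)
```

Even in this tiny version the result depends on choices you make up front (the value of `k`, the distance metric, the starting centers), which is the kind of thing worth understanding before reading biology into a pretty plot.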

__keshav · on Jan 16, 2021

Specifically about immunology, I'm not so sure - I study cancer, but most of my grad companions in comp immunology use similar techniques to mine. But there might be very domain-specific immunology techniques that I'm not aware of. Regardless of the "shared" techniques, though, you need a lot of domain knowledge to interpret your results.

But as broad advice:

If you know a broad technique you're interested in, see which immunology lab does that and go from there.

If you already know the types of immunology questions you wanna go after, find the lab that studies that question, then see which techniques they use and learn those.

__keshav · on Jan 16, 2021

sorry for 2 I meant basic biology knowledge lol

emiller88 · on Jan 16, 2021

The resource I recommend to people looking to move from wet lab to dry lab stuff is https://www.biostarhandbook.com/. From your post history it looks like you already have some programming experience, so you could skip the first few chapters which are just a linux intro. I don't think it has all the best practices, but I think it's the most comprehensive overview that starts from square 1 and fills in all the gaps no one tells you when you first start, for example the "Common data types" chapter.

pandatigox · on Jan 16, 2021

I think one of the easiest ways to get into it is by knowing how to use a piece of software called MaxQuant Perseus (https://maxquant.net/perseus/). It's like advanced Excel that was designed so scientists don't have to learn R but still get the job done. Good luck with your journey!

f6v · on Jan 16, 2021

You probably won’t go far by just looking at MaxQuant. But here’s a good introduction to proteomics https://statomics.github.io/pda/pages/techVideos

netizen-9748 · on Jan 16, 2021

A biomedical scientist probably has a good base in proteomics already, so learning the ins and outs of the tools used for research would be the best option for a bio scientist wanting to get into the computational side of things.

apitman · on Jan 16, 2021

I think the main question you need to answer for yourself is whether you're more interested in biology or programming, i.e. do you want to use software tools to do biology (computational biology), or do you want to make the tools that others use (bioinformatics)?

If biology, you need to focus on bio, stats, Python, R, and a hundred other specialized tools for working with data.

If you're more interested in programming, you can get away with much less bio/stats knowledge, unless you're working on developing low-level algorithms. A lot of the work has more to do with efficiently storing, moving, and visualizing large datasets. Bonus here is that much of this knowledge is transferable to other (much higher paying) domains if you get burned out or want to sell out.

My current job could be described as bio-aware web development, with an emphasis in data visualization. I need to know a decent amount of biology, but I can almost always defer stats to others in the lab with more expertise.

f6v · on Jan 16, 2021

Knowing how to program is a must: Python, R (both are quite popular). Being hands-on with Linux helps as well, as many real-world datasets won’t fit on your laptop, so you’ve got to use high-performance computing infrastructure. But it’s mostly about being able to make inferences from data. You need a solid stats background for that.

There’s a ton of courses online and https://www.edx.org/bio/rafael-irizarry is a good start.

dannykwells · on Jan 16, 2021

The vignettes for Seurat are the place to start.

Also, I'm a founder at Immunai and this is literally what we do. Please dm if you have further questions. Happy to help however I can.

ArtWomb · on Jan 16, 2021

>>> vignettes for Seurat are the place to start

Wow, incredible resource! Thanks for linking it up. Looks like the raw dataset is compact enough to run experimental inference right on your laptop ;)

A somewhat related, but certainly left-field question: is there a similar tutorial / library for exploring the frontiers of quantum computing and neuroscience?

Quantum Computing at the Frontiers of Biological Sciences

https://arxiv.org/abs/1911.07127

netizen-9748 · on Jan 16, 2021

Are you specifically interested in immunology? Computational biology is a pretty broad umbrella. My personal experience was learning bioinformatics, which was heavy on Python and genomics/proteomics.

A brief search for comp immunology turns up things like data mining and mathematical modeling, so I would assume Python and R would be a good place to start. You may even be able to find some lectures that cover some of the basics online.

psyklic · on Jan 16, 2021

There is a very nice set of bioinformatics coding challenges here: http://rosalind.info/problems/locations/
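For a flavor of what the early challenges there tend to look like, here is a rough sketch of a few warm-up style string tasks on DNA: counting bases, reverse complementing a strand, and computing GC content. The function names are my own, and the exact problem statements and input formats on the site may differ, so treat this as an illustration rather than a submission-ready solution.

```python
from collections import Counter

# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def count_nucleotides(dna):
    """Return the counts of A, C, G and T (in that order) in a DNA string."""
    counts = Counter(dna.upper())
    return tuple(counts[base] for base in "ACGT")


def reverse_complement(dna):
    """Complement every base, then reverse the strand."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def gc_content(dna):
    """Fraction of bases that are G or C; 0.0 for an empty string."""
    dna = dna.upper()
    if not dna:
        return 0.0
    return (dna.count("G") + dna.count("C")) / len(dna)


seq = "AAAACCCGGT"
print(count_nucleotides(seq))   # (4, 3, 2, 1)
print(reverse_complement(seq))  # ACCGGGTTTT
print(gc_content(seq))          # 0.5
```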

warlog · on Jan 16, 2021

dn/dt

And then

dn/ds

Learn everything that leads to and comes from these two equations.