Researcher questions years of his own work with a reexamination of fMRI data (duke.edu)
177 points by EndXA on June 4, 2020 | 49 comments



An implicit assumption of the commentary here is that no one has bothered to test how reliable task and resting-state fMRI results are. But that simply is not the case:

https://scholar.google.com/scholar?hl=en&as_sdt=0,47&q=%22te...

Results range from poor to excellent, depending on the task, the sample, and the methodology.

Speaking as someone who has conducted longitudinal fMRI analyses, I would say two things:

1) it's been fairly well established that within-individual correlations of BOLD are generally higher than between-individual correlations

2) within-individual correlation can be weak to moderate, but group-level patterns are very often quite robust. For example, in a cognitive control task you are going to see pre-supplementary motor area, dorsal anterior cingulate, anterior insula, and so on. For risk tasks, you are going to find that anterior insula correlates with risk; for learning tasks, you are going to find striatal signals for reward prediction errors.

I encourage people to visit Neurosynth, which performs automated meta-analyses. Here, for example, are results for "prediction error" after accounting for activation that occurs generally during tasks:

https://neurosynth.org/analyses/terms/prediction%20error/

and here for "interference"

https://neurosynth.org/analyses/terms/interference/

(default mode network and task-positive networks anti-correlate in activity, another super-reliable result)


I do not want to leave the impression that there are no issues with fMRI analysis. It's hard to do right.

https://www.nature.com/articles/s41586-020-2314-9.pdf


This is pretty close to my background. The article is fairly dramatically written. It’s not as if nobody has been looking at reliability and discriminability until now. It’s a big topic in the field, for both task and resting-state fMRI.


Mine too and I agree.

There is a big spread in the quality of work in the field. From thoughtful analyses of large datasets, to pretty bad examples of p-hackery with push-button software.

A root cause of problems is people asking scientific questions of fMRI data whose answers would lie at a much finer spatiotemporal resolution than the medium can support.


>> A root cause of problems is people asking scientific questions of fMRI data whose answers would lie at a much finer spatiotemporal resolution than the medium can support.

Reminds me of the "mirror neurons" nonsense from a while back.


What was nonsensical about mirror neurons, if I might ask?


I am as well, and I had similar thoughts about the article's flair for the dramatic.

And yes, most researchers realize that fMRI can find average differences fairly reliably, but test-retest correlations are poor. We'd all love to improve imaging methods, but we work with what we've got.


The original study: https://journals.sagepub.com/doi/abs/10.1177/095679762091678...

Abstract:

> Identifying brain biomarkers of disease risk is a growing priority in neuroscience. The ability to identify meaningful biomarkers is limited by measurement reliability; unreliable measures are unsuitable for predicting clinical outcomes. Measuring brain activity using task functional MRI (fMRI) is a major focus of biomarker development; however, the reliability of task fMRI has not been systematically evaluated. We present converging evidence demonstrating poor reliability of task-fMRI measures. First, a meta-analysis of 90 experiments (N = 1,008) revealed poor overall reliability—mean intraclass correlation coefficient (ICC) = .397. Second, the test-retest reliabilities of activity in a priori regions of interest across 11 common fMRI tasks collected by the Human Connectome Project (N = 45) and the Dunedin Study (N = 20) were poor (ICCs = .067–.485). Collectively, these findings demonstrate that common task-fMRI measures are not currently suitable for brain biomarker discovery or for individual-differences research. We review how this state of affairs came to be and highlight avenues for improving task-fMRI reliability.


As an fMRI practitioner, I just want to point out that this problem has nothing to do with fMRI. It has to do with the weaknesses inherent in the most common methods for designing fMRI experiments, and for analyzing and modeling the data. The SNR in fMRI depends heavily on the level of blood pressure, arousal, attention and other factors that vary day-to-day. Any analysis that doesn't account for that variability will be unreliable.

Methods for designing experiments, analyzing and modeling fMRI data that are robust to this variability are available. The problem is that most people in the field don't use them.


Which methods are you referring to?


To quantify the value of a method, it is useful to consider the amount of information the method recovers from the data stream, the prediction accuracy of the resulting models, generalization ability outside of the conditions used to fit the model, and decoding/reconstruction accuracy. For all four of those criteria, the best approach is to use linearized encoding models that estimate an FIR filter for each voxel separately.
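
To make that concrete, here is a minimal sketch of what a voxelwise linearized (FIR) encoding model can look like, assuming a hypothetical stimulus feature matrix `features` (time x features) and BOLD matrix `bold` (time x voxels); the names, shapes, and ridge penalty are illustrative, not any lab's actual pipeline:

  import numpy as np

  def make_fir_design(features, n_delays=5):
      # Stack time-delayed copies of the stimulus features so that each voxel's
      # weights form a finite impulse response (FIR) filter over n_delays TRs.
      n_t, n_f = features.shape
      X = np.zeros((n_t, n_f * n_delays))
      for d in range(n_delays):
          X[d:, d * n_f:(d + 1) * n_f] = features[:n_t - d]
      return X

  def fit_ridge(X, Y, alpha=10.0):
      # Closed-form ridge regression, fit separately for every voxel (column of Y).
      n_p = X.shape[1]
      return np.linalg.solve(X.T @ X + alpha * np.eye(n_p), X.T @ Y)

  # Illustrative usage with random data standing in for real features and BOLD:
  rng = np.random.default_rng(0)
  features = rng.standard_normal((300, 8))      # 300 TRs, 8 stimulus features
  bold = rng.standard_normal((300, 1000))       # 300 TRs, 1000 voxels
  weights = fit_ridge(make_fir_design(features), bold)   # (8 * 5) x 1000 FIR weights

In practice the regularization strength would be chosen by cross-validation and the model evaluated by prediction accuracy on held-out runs, which ties back to the prediction and generalization criteria above.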


On the other hand you have the reverse engineering of what people see.

[1] https://news.berkeley.edu/2011/09/22/brain-movies/

[2] https://nuscimag.com/your-brain-on-youtube-fmris-reverse-eng...

[3] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4941940/

[4] http://longnow.org/seminars/02018/oct/29/toward-practical-te...

While this reminds me a little bit of all the back and forth in nutrition, I simply enjoy the ride and think of

[5] https://en.wikipedia.org/wiki/Brainstorm_(1983_film)

:)


The movie reconstruction from brain activity work was from my lab, thanks for the shout out!

The problem with many methods of analyzing and modeling functional MRI data is that the signal-to-noise varies hugely across time, across individuals and across brain regions within an individual. Unfortunately, the most common methods of analyzing and modeling fMRI data do not incorporate any principled method for dealing with this SNR variability.

What makes this frustrating from a practitioner's point of view is that we do have methods for analyzing and modeling these data that can account for these uncontrolled SNR changes. The problem is that most people don't use these methods. (My lab has pioneered many of these techniques, and that is why we could produce those compelling decoding results.)


Does anyone know the mechanism that increases blood flow to more active parts of the brain? Not being an expert, I'm imagining expanding/constricting blood vessels controlled by a signalling pathway from... what?


The following is my understanding, but I could be wrong:

When neurons activate, they release a neurotransmitter (typically glutamate) that then binds to nearby astrocytes. Astrocytes then increase their intracellular calcium levels which in turn causes vascular dilation of the nearby area, thus increasing blood flow to that area.


Thanks, this was what I was looking for. So I guess this pathway is not very predictable, hence the variations in subsequent measurements of the same subject.


Also consider what it takes to align the images so they can all be processed with a single model: all the different sized and shaped heads in the world, plus correcting for their different angles and little bits of movement. That's not even considering differences between brains.

I am continually amazed at the level of automation in the preprocessing tools that deal with this alignment problem. You can play around with fMRI data using some popular packages without thinking too hard about it and produce visualizations almost as nice looking as the ones in the article.
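
As a hedged illustration of how push-button that workflow can be, here is a sketch using nilearn (the file name `subject_zmap.nii.gz` is hypothetical, and real pipelines estimate a proper registration, e.g. with fMRIPrep, rather than just resampling):

  from nilearn import datasets, image, plotting

  template = datasets.load_mni152_template()        # standard MNI152 template
  zmap = image.load_img("subject_zmap.nii.gz")      # hypothetical subject-level stat map
  # Resample onto the template grid; this only matches voxel grids, it does not
  # perform the actual spatial normalization that preprocessing pipelines do.
  zmap_mni = image.resample_to_img(zmap, template)
  plotting.plot_stat_map(zmap_mni, bg_img=template, threshold=3.0,
                         title="Thresholded activation (illustrative)")
  plotting.show()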


The most sensitive fMRI methods don't do alignment at all, they do all data processing in the individual subject's brain space.


Much of the day-to-day variation concerns uncontrolled variability in blood pressure, arousal and attention. Those factors have enormous influence on the blood-oxygen-level-dependent signal.


The term you are looking for is neurovascular coupling [1]. The increase in blood flow allows more oxygenated hemoglobin to flow through the capillary bed where the neuronal activity is taking place. This increases oxygen availability, which is necessary to maintain homeostasis and replenish the membrane potentials of the neurons, given that the brain has (essentially) no energy storage. There are multiple signalling pathways that control the mechanism, many of which are still poorly understood - for a review see [2].

I should add that every full moon a study comes out criticizing fMRI or drawing attention to its limitations, but researchers are very well aware of that. The fMRI response* to a task is neither temporally nor spatially specific, and it is confounded by anything that could remotely influence blood flow and blood volume. Figuring out how to deal with these issues is an active area of research - see an informal discussion here [3].

It's really a shame that the field is portrayed so negatively by the press, given it's the only tool we have to study brain function in humans with sub-millimeter spatial resolution, non-invasively and in vivo.

[1] http://scholarpedia.org/article/Neurovascular_coupling [2] https://www.cell.com/neuron/pdf/S0896-6273(17)30652-9.pdf [3] https://practicalfmri.blogspot.com/2017/08/fluctuations-and-...

* Assuming the conventional Blood Oxygenated Level Dependent contrast used for functional imaging.


There is almost irrational glee expressed whenever an article like this comes out, not just by the press, but also by HN commenters, who seem primed to pounce on any evidence that supports their preconceived negative views on neuroscience. I often wonder why this is the case.


Disclaimer: I have an interest in neuroscience but no expertise in fMRI. I also haven't had time to read the article closely (yet). I'm sure others are better equipped to answer the question, so please jump in.

Anyway, here's my impression at a high level: neurons need energy to spike, and increased brain activity in a particular brain region ought to translate to higher metabolic activity, and ultimately to higher blood flow (to deliver e.g. oxygen). However, what one can infer about neural activity from e.g. blood oxygen level is still a matter of debate, precisely because (again as you suggested) the relevant mechanisms are complicated and AFAIK not completely understood. Other factors besides metabolism may also come into play.

I suppose this article is a reasonable start:

  https://en.wikipedia.org/wiki/Blood-oxygen-level-dependent_imaging

Despite these reservations, fMRI remains popular, at least in part because it is non-invasive and is one of the few tools we have for studying humans.

Edit: more disclaimers. :-)


Presumably activity from background tasks, subconscious processes, will be a changing overlay to whatever current conscious tasks are happening. And, again presumably, other overlays like mood exist?

Poor repeatability seems inherent, the starting brain state will be different, the background processes will be different, the accompanying thoughts will be different?

Is this reanalysis looking at something else beyond this: Like, are people using no brain areas in common across some repeated tasks??


Well, the issue is that if you accept that brain state will always be different then there isn't much predictive power in the measure.

The conceit was always that you could measure it across a bunch of people and find the commonly active areas across enough datasets. Even with different baselines, the areas critical to the task would elevate above that baseline.

This paper finds that even in a single person, activity (above the baseline) is poorly correlated across recording sessions. They use a technique called intraclass correlation to measure this: https://en.wikipedia.org/wiki/Intraclass_correlation
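
For anyone who wants to see what that boils down to, here is a small sketch of the two-way ICC computed from a subjects x sessions table of activation estimates (the data layout and the choice of ICC variant are assumptions; the paper reports ICCs per task and region):

  import numpy as np

  def icc(scores):
      # scores: (n_subjects, n_sessions) matrix, e.g. one ROI activation per subject per scan.
      n, k = scores.shape
      grand = scores.mean()
      ms_rows = k * np.sum((scores.mean(axis=1) - grand) ** 2) / (n - 1)   # between-subject MS
      ms_cols = n * np.sum((scores.mean(axis=0) - grand) ** 2) / (k - 1)   # between-session MS
      sse = np.sum((scores - scores.mean(axis=1, keepdims=True)
                    - scores.mean(axis=0, keepdims=True) + grand) ** 2)
      ms_err = sse / ((n - 1) * (k - 1))
      icc_2_1 = (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err
                                      + k * (ms_cols - ms_err) / n)   # absolute agreement
      icc_3_1 = (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)     # consistency
      return icc_2_1, icc_3_1

  scores = np.array([[1.2, 0.9], [0.4, 0.6], [2.1, 1.5]])   # made-up: 3 subjects, 2 sessions
  print(icc(scores))

An ICC near 1 means subjects keep their rank ordering across sessions; the paper's point is that for task activation the observed values mostly fall well below the commonly cited "good" threshold of about 0.6.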


But it does not contest whether group averages for the sample are reliable, which has been demonstrated repeatedly.


I spent 3 years doing research on (f)MRI. This problem is deeply linked with the models used for the analysis. Basically, the BOLD signal is fit with a new exponential every time; no subject-specific parameters are considered. On the other hand, task-based analyses always treat the stimulus as a discrete step function for the deconvolution.

For example, whenever we enter a room with an annoying clock ticking away, we observe that it doesn't affect the person native to the room as much. Or whenever we watch an ad a second time, it goes by more quickly. Such cases are not accounted for in current models.
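
For reference, this is roughly what the conventional modelling step looks like: one fixed canonical HRF convolved with an on/off step function, with no subject-specific or habituation parameters (the timings and the simulated voxel are made up):

  import numpy as np
  from scipy.stats import gamma

  tr, n_scans = 2.0, 200
  t = np.arange(n_scans) * tr

  # Canonical double-gamma HRF (peak near 6 s, undershoot near 16 s), identical for everyone.
  hrf_t = np.arange(0, 32, tr)
  hrf = gamma.pdf(hrf_t, 6) - gamma.pdf(hrf_t, 16) / 6.0

  boxcar = ((t % 40) < 20).astype(float)           # 20 s on / 20 s off block design
  regressor = np.convolve(boxcar, hrf)[:n_scans]   # predicted BOLD response

  # GLM fit for one simulated voxel: beta says how strongly the voxel tracks the model.
  rng = np.random.default_rng(0)
  voxel = 0.8 * regressor + rng.standard_normal(n_scans)
  X = np.column_stack([regressor, np.ones(n_scans)])
  beta, _, _, _ = np.linalg.lstsq(X, voxel, rcond=None)

Anything the fixed HRF and boxcar fail to capture, such as habituation to a repeated ad or a ticking clock, ends up in the residuals rather than in the model.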


My brother-in-law suffered a severe heart attack about a month ago. He’s 40. His wife told us he showed no brain activity, other than that in his upper spinal cord, in the MRIs done at 1 and 2 weeks after the event. He woke up a day after his second MRI and is currently in rehab. I never saw the scan results, but I don’t understand how they could be so wrong.


I'm under the impression that fMRI for predicting coma outcomes is still quite experimental: https://www.statnews.com/2015/11/11/brain-scans-coma-recover...


The article claims that images of the same people done months apart have poor correlation. It doesn't say how closely they're correlated on the same day, but presumably a poor correlation there would have been noticed before. If there's some sort of activity shift on month time scales, that would be extremely interesting.


I have felt that there’s circular reasoning going on with fMRI results, where the parts of the brain light up for something, showing that the process is in that part of the brain, and then they light up again with something else, which shows that thing is the same process.


A brain region can compute more than one function, so that part isn't problematic. But interpretation of the BOLD signal is inherently tricky because it's like measuring heat off of a processor: it might tell you how hard it's working, but not what it's working on. If you can modulate the temporal dynamics of the processing, though, then maybe you have something to work with.

One approach is to have a generative normative model of a mechanism (e.g. temporal-difference learning) verified by lower-level research (unit recordings in animal models), fit the model's parameters to the task behavior (e.g. the learning rate), and then find the correlates of that (e.g. of the subjective reward prediction errors as they occur in the task at the time of decision feedback). The benefit here is that you already have a plausible mechanism that can recover the behavior, and you are finding changes in the BOLD signal that track it. Doesn't solve the problem entirely, but it's better than just correlating with whatever.
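
A minimal sketch of the first half of that pipeline, assuming a hypothetical sequence of trial outcomes `rewards` and a fixed learning rate (in practice the learning rate would be fit to each subject's choices):

  import numpy as np

  def prediction_errors(rewards, alpha=0.2):
      # Rescorla-Wagner / TD(0) value update: return the trial-wise reward prediction errors.
      v, pes = 0.0, []
      for r in rewards:
          pe = r - v          # reward prediction error at feedback
          v += alpha * pe     # value update with learning rate alpha
          pes.append(pe)
      return np.array(pes)

  rewards = np.array([1, 0, 1, 1, 0, 1])            # made-up trial outcomes
  pes = prediction_errors(rewards, alpha=0.2)

The second half is the fMRI side: the prediction errors would be used as a parametric modulator of the feedback events (scaled, convolved with the HRF, and entered into the GLM), so the analysis looks for voxels whose BOLD signal tracks the model's internal quantity rather than just task-on versus task-off.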


We also have information about what parts of the brain are responsible for what from examining functional loss resulting from injuries and tumors. It's all a bit messy because the boundaries are loose and things can develop differently for different people.


Didn't that already happen with the "brain activity in a dead salmon" affair?


There's a Wired article from almost fifteen years ago taking the piss out of fMRI.


This should be less of a problem as fMRI resolution continues to increase.


Why? It doesn't seem to be a problem of resolution.


I’m not sure if you're referring to temporal or spatial resolution. Increases in temporal resolution have brought their own problems: physiological motion (cardiac and respiratory) that previously averaged out has become more prominent as TRs have dropped.
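
A quick back-of-the-envelope check of where typical physiological frequencies land, assuming ~66 bpm heart rate and ~18 breaths/min (the TR values are just examples):

  cardiac_hz, resp_hz = 1.1, 0.3   # ~66 bpm, ~18 breaths/min

  def apparent_freq(f, tr):
      # Frequency at which a signal of f Hz appears after sampling once every tr seconds.
      fs = 1.0 / tr
      return abs(f - round(f / fs) * fs)

  for tr in (2.0, 0.4):            # classic TR vs. a fast sub-second TR
      print(tr, apparent_freq(cardiac_hz, tr), apparent_freq(resp_hz, tr))

At a 2 s TR the ~1.1 Hz cardiac signal aliases down to about 0.1 Hz, smeared into the band where the signal of interest lives; at sub-second TRs it shows up at or near its true frequency, so it is no longer hidden and has to be modelled or filtered explicitly.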


Is the mesh in the photo the atom network shaping the protein?


Not mentioned: the dead salmon study of 2009 that outed a bunch of fMRI studies as using incorrect statistics, and which went on to win an Ig Nobel Prize.

[1] https://blogs.scientificamerican.com/scicurious-brain/ignobe...


It is not mentioned because it is not relevant. The dead salmon experiment highlighted that some researchers were failing to correct for multiple comparisons sufficiently. This article is talking about the test-retest reliability of the fMRI measure in individuals.

edit: Instead of down voting, perhaps moderators might write how that experiment is actually relevant.


I agree: the dead salmon study highlights flaws in the statistics, while this study highlights the flaws/limitations of poorly formed experimental paradigms, to which fMRI is not immune. It's like comparing apples to oranges.


How gutting for the researcher. It's very admirable for him to be so open about it all.


To be fair, if this paper holds up then he's going to be the center of a scientific controversy which will increase citations, or it's going to become a required cite in pretty much every fMRI paper, which will increase citations.

So from a career perspective, this is actually pretty good (and wonderful work, even if it wasn't going to be good for his career).


How deliciously wonderful... we humans can still have some mystery about us. I find this to be great news.


We still have a whole heap of mystery about us - especially where the brain is involved. :-)


I’m in neuroscience. Don’t buy the media narrative that they’ve figured it all out; many areas are very close to psychology and its vague folk definitions. Ask a neuroscientist to define consciousness and you will see them struggle. You can read deeper starting from there, ask very obvious questions, and the big mystery part will come back.


Good. It's been as dumb as pointing a thermal camera at a CPU and memory chips and getting excited about hotspots.


One of the better HN analogies. Well-played!


"Phrenology (from Ancient Greek φρήν (phrēn), meaning 'mind', and λόγος (logos), meaning 'knowledge') is a pseudoscience which involves the measurement of bumps on the skull to predict mental traits."

https://en.wikipedia.org/wiki/Phrenology



