
Oh, the Simpson's paradox here is that the relative share of the lower-performing groups is increasing as well. For instance, the share of ELL students has increased from 8.1% in 2000 to 9.6% today (https://www.edweek.org/leadership/the-nations-english-learne...).



Simpson's paradox is that several (most, all) groups show an increase, but the average is decreasing.

Imagine a simple chart for 1980:

group | students | avg-score

poor | 10 | .2

non-poor | 100 | .8

If the chart in 2020 looked like this:

group | students | avg-score

poor | 100 | .25

non-poor | 150 | .85

Both groups increased their average score.

But the total average went from (.2x10 + .8x100)/110 = ~.75 to (.25x100 + .85x150)/250 = ~.61.
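For anyone who wants to check the arithmetic, here is a minimal sketch that recomputes the two overall averages from the charts above. The helper name `overall_average` and the dict layout are made up for illustration; only the numbers come from the example.

```python
# Hypothetical helper; the (students, avg-score) pairs are the ones from the example.
def overall_average(groups):
    """Average score over all students, weighting each group by its size."""
    total_students = sum(n for n, _ in groups.values())
    return sum(n * score for n, score in groups.values()) / total_students


chart_1980 = {"poor": (10, 0.20), "non-poor": (100, 0.80)}
chart_2020 = {"poor": (100, 0.25), "non-poor": (150, 0.85)}

avg_1980 = overall_average(chart_1980)
avg_2020 = overall_average(chart_2020)

print(f"1980: {avg_1980:.3f}")  # 1980: 0.745
print(f"2020: {avg_2020:.3f}")  # 2020: 0.610
```

Both group scores go up by .05, yet the overall average drops by roughly .135.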


Which is not actually surprising, because the dominant change is the migration from non-poor to poor, not the change in the average scores: in 1980 only 9% of the results are in the poor group, in 2020 it is 40%.
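One way to make "dominant change" concrete is to split the change in the overall average into a part caused by shifting group shares and a part caused by score changes within groups. This is just one possible split (old scores weighted by the share change, plus score changes weighted by the new shares), not something from the comments above, and the names are made up for illustration.

```python
# (students, avg-score) per group, taken from the 1980 and 2020 charts.
before = {"poor": (10, 0.20), "non-poor": (100, 0.80)}
after = {"poor": (100, 0.25), "non-poor": (150, 0.85)}


def shares(groups):
    """Fraction of all students that falls into each group."""
    total = sum(n for n, _ in groups.values())
    return {name: n / total for name, (n, _) in groups.items()}


w0, w1 = shares(before), shares(after)

# Effect of the changed group shares, holding scores at their 1980 values.
composition = sum((w1[g] - w0[g]) * before[g][1] for g in before)
# Effect of the changed scores, weighted by the 2020 shares.
within = sum(w1[g] * (after[g][1] - before[g][1]) for g in before)

print(f"poor share: {w0['poor']:.0%} -> {w1['poor']:.0%}")  # poor share: 9% -> 40%
print(f"composition effect: {composition:+.3f}")  # composition effect: -0.185
print(f"within-group effect: {within:+.3f}")      # within-group effect: +0.050
print(f"total change: {composition + within:+.3f}")  # total change: -0.135
```

Under this split the shift in group sizes outweighs the within-group improvement by more than three to one.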

When scores change, at first the average scores within groups will change. But if the scores change enough, they eventually cross group boundaries, at which point the group averages become more or less irrelevant and the changes are mostly reflected in group sizes. The group averages are bounded by the range of each group and can only indicate whether the group is moving towards one of its boundaries. As scores cross boundaries, they start contributing to different group averages, and at opposite ends: a decreasing score first contributes at the lower boundary of one group and then at the upper boundary of the next lower group. That can increase the average score of both groups, in one case by losing a score that is low relative to the group range, in the other by gaining a score that is high within the group range.
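A toy illustration of that last point, assuming the groups are defined purely by a score cutoff. The cutoff of 0.5 and the five scores are invented for the sketch; a single score drops from just above the cutoff to just below it.

```python
from statistics import mean

CUTOFF = 0.5  # hypothetical boundary between the "low" and "high" group


def group_means(scores):
    """Return (low-group mean, high-group mean, overall mean)."""
    low = [s for s in scores if s < CUTOFF]
    high = [s for s in scores if s >= CUTOFF]
    return mean(low), mean(high), mean(scores)


# Only the middle score changes: it falls from 0.55 to 0.45 and crosses the cutoff.
scores_before = [0.20, 0.30, 0.55, 0.80, 0.90]
scores_after = [0.20, 0.30, 0.45, 0.80, 0.90]

low_before, high_before, all_before = group_means(scores_before)
low_after, high_after, all_after = group_means(scores_after)

print(f"low group:  {low_before:.3f} -> {low_after:.3f}")
print(f"high group: {high_before:.3f} -> {high_after:.3f}")
print(f"overall:    {all_before:.3f} -> {all_after:.3f}")
```

The high group loses its lowest score and the low group gains a new highest score, so both group averages rise even though the only change was one score getting worse.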



