It's funny: although I have published peer-reviewed papers on the evaluation of ML models and could probably discuss the nuances of the process with you if you woke me at 3 AM, I always find it difficult to remember which of [Type I, Type II] is the false positive and which is the false negative. If I have this problem, I suppose almost everyone does.
I wish people would stop using [Type I, Type II] when clearly superior terminology is available. It feels very similar to amateur software engineers using non-descriptive variable names.
This is exactly my experience. I have never been able to keep straight which type the error I was talking about was...
Imagine calling (for instance) "type I" the groups which are abelian and "type II" those which are not... and then not being a professional mathematician. So, is xy = yx here? Mmmhhh, type I says yes... or was it no?
Specificity, because "how specific is the test? Does it pick out the true positives without getting a whole heap of false positives as well?" You might ask someone to be "more specific" about something so they aren't including irrelevant things in their discussion.
Sensitivity, because "how sensitive is the test? Does it detect the needle in the haystack you need to find? How many does it miss?" In common English you might complain that your car brakes are too sensitive: they are too quick to register the pressure from your foot.
Don't forget these are actually strict mathematical concepts. Hope this helps clarify :)
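Concretely, since they are strict mathematical concepts, here is a minimal sketch in Python (the function and variable names are mine, not from any particular library):

    def sensitivity(tp, fn):
        # Of all actual positives, what fraction did the test catch?
        # Also known as recall, or the true positive rate.
        return tp / (tp + fn)

    def specificity(tn, fp):
        # Of all actual negatives, what fraction did the test correctly
        # leave alone? (The true negative rate.)
        return tn / (tn + fp)

    # Example: a test that catches 90 of 100 real positives and wrongly
    # flags 50 of 900 real negatives:
    print(sensitivity(tp=90, fn=10))   # 0.9
    print(specificity(tn=850, fp=50))  # ~0.944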
I think the issue people have with these names is that their English meanings (as you described them) make sense when the positive class is much less prevalent than the negative class. If the two are equally probable (or worse, if it's the other way around), those everyday meanings quickly stop matching the situation.
"Abelian" is at least a fresh new concept to hang your own associations off of, with no previous interference, and without interference from the similarly-named "Adelian" groups or something equally stupid.
The problem isn't just that the term is something you've never heard before, but that "I" and "II" are not a very good concept to hang them off of. This is relevant to software engineering naming too: in general, you should not use naming schemes that imply properties that don't actually exist in your values. I and II have all sorts of properties that don't apply to the terms in question; most notably, they have an order. But which is "first", false positives or false negatives? They don't have a natural order, so using numbers to name them just gets in the way.
(Especially when there are perfectly serviceable words.)
Math jargon isn't perfect by any means. But it does at least avoid naming things by sheer numbers most of the time, unless there isn't really a choice because it needs a few hundred names right now.
(Similarly, pop quiz: in Kahneman's classification scheme, is System 1 the fast or the slow system? Odds are, even if you get that right, it's because the book title "Thinking, Fast and Slow" stuck with you and the two happen to be listed in order. It probably isn't because you remember them by number.)
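To put the naming point in code, here's a hypothetical sketch (both enums are made up for illustration; neither is from any real codebase):

    from enum import Enum, auto

    # Naming by number: the reader has to memorize which is which.
    class ErrorKindOpaque(Enum):
        TYPE_I = auto()
        TYPE_II = auto()

    # Naming by property: the call site explains itself.
    class ErrorKind(Enum):
        FALSE_POSITIVE = auto()  # flagged something that wasn't there
        FALSE_NEGATIVE = auto()  # missed something that was there

    # Compare reading these two lines six months from now:
    #   if err is ErrorKindOpaque.TYPE_II: ...
    #   if err is ErrorKind.FALSE_NEGATIVE: ...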
> Imagine mathematicians calling commutative groups abelian? How do you remember if xy=yx there?
Actually, even if you ignore jerf's response, this is different in an important way from the "Type I" / "Type II" terminology.
Group in which the group operation is commutative: "Abelian group".
Group with no guarantees except the group axioms: "group".
The special one is marked and the non-special one is unmarked. In contrast, the designations "Type I" and "Type II" are parallel; it's not at all obvious which one is the default and which one deviates from the default.
Maybe I can help you with that ... the mnemonic I developed is:
Type I error, with probability often denoted by α (alpha), which is the first (I) letter of the alphabet. AL-PH-A stands for ALlegedly PHalse Alarm. Or just fALse-Positive-HA!
(Yes, it's stupid and weird, but it has been helping me remember it for nearly three decades now.)
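For reference, the standard definitions the mnemonic encodes (with H0 the null hypothesis; these are textbook facts, nothing specific to this thread):

    P(Type I error)  = P(reject H0 | H0 true)          = α  (false positive)
    P(Type II error) = P(fail to reject H0 | H0 false) = β  (false negative)
    Power            = 1 − β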
As another commenter noted, the story of the Boy Who Cried Wolf is an easy mnemonic: the villagers committed a series of Type I errors (false alarms), and then a Type II error (a miss).
They are certainly better than Type I and Type II, but they are still potentially ambiguous (at least to a non-native speaker). What makes a "false positive" false? Is it called a false positive because it is actually a negative, or is it called a false positive because it is a positive for which you made the error of calling it negative?
That much is fine -- it's ambiguous if you don't know the general idea in the first place -- it's true of most things. The problem with type 1/2 is that it's so utterly devoid of memory hooks that even if you recognize it, and know the idea, you can't confidently identify which is which.
> or is it called a false positive because it is a positive for which you made the error of calling it negative?
Not to pick on the non-nativeness of the problem here, but that's not really a way you can use "false". I'd be a lot more comfortable calling an underlying positive that tested negative an "unidentified positive" or a "misdiagnosed positive" [this one really is ambiguous in exactly the way you suggest] or anything else that suggested that the positive was there and an error occurred in noticing it, as opposed to suggesting that the positive wasn't there in the first place.
So, it's a fair complaint for non-native speakers, but you just can't choose all your terminology to meet their needs. :/