Sunday, June 23, 2024

Note on Epistemology of Outliers

 Outliers generally aren't really part of the intended data set. Imagine someone surveying cats, Felis catus, and it turns out one of them is 700 pounds. "Damn, this cat is huge! What an outlier!" Turns out a bear somehow got into the data. It's not an outlier, it's a mistake. Fuzzy, yes, claws, yes, cat, no.

 If you graph terrorist attacks by death toll, you get a nice bell curve, except for 9/11. This is because 9/11 was basically not a terrorist attack, exactly the way a bear is basically not a cat. It was the organized, disciplined action of a large state, not a disorganized and undisciplined guerrilla or Muslim force. War crime, not terrorism.

 

 Remember that although Cauchy distributions are super weird, they are also still bell-shaped. (Fun fact: IQ is Gaussian on the low end, but Cauchy on the high end.) To get a bear into the bell curve of cat traits, the bell would have to flare up instead of smoothly sloping down, something neither Gaussian nor Cauchy can do. 
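
 A rough sketch of the bell-shape point, for anyone who wants to poke at it. Every number here is invented for illustration (cats centred on 10 pounds, one bear-sized bump at 700 pounds making up 1% of the data), and `flares_up` is just a hypothetical helper name. It walks rightward from the centre of a density and reports whether the curve ever climbs again.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std dev sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def cauchy_pdf(x, x0, gamma):
    """Density of a Cauchy distribution with location x0 and scale gamma."""
    z = (x - x0) / gamma
    return 1.0 / (math.pi * gamma * (1.0 + z * z))

def flares_up(pdf, center, hi, step=1.0, tol=1e-12):
    """Return True if pdf ever rises (by more than tol) while moving
    from center to hi in increments of step."""
    n = int((hi - center) / step)
    prev = pdf(center)
    for i in range(1, n + 1):
        cur = pdf(center + i * step)
        if cur > prev + tol:
            return True
        prev = cur
    return False

# Illustrative assumptions only, not measurements.
CAT_MEAN, CAT_SPREAD = 10.0, 2.0      # pounds
BEAR_MEAN, BEAR_SPREAD = 700.0, 50.0  # pounds
BEAR_SHARE = 0.01                     # fraction of the data that is bear

def cats_gaussian(x):
    return gaussian_pdf(x, CAT_MEAN, CAT_SPREAD)

def cats_cauchy(x):
    return cauchy_pdf(x, CAT_MEAN, CAT_SPREAD)

def cats_plus_bear(x):
    # Mixture: mostly cats, plus a small second population near 700 lb.
    return ((1.0 - BEAR_SHARE) * gaussian_pdf(x, CAT_MEAN, CAT_SPREAD)
            + BEAR_SHARE * gaussian_pdf(x, BEAR_MEAN, BEAR_SPREAD))

for name, pdf in [("gaussian", cats_gaussian),
                  ("cauchy", cats_cauchy),
                  ("cats + bear", cats_plus_bear)]:
    print(name, "flares up:", flares_up(pdf, CAT_MEAN, 800.0))
```

 Under these made-up numbers, neither single bell ever climbs again once you leave the centre, heavy Cauchy tail or not. Only the mixture does, which is the second population showing through.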


 Never forget illiterates can't categorize. The majority of scientists need a gene readout to know what species a thing is, because anything more ambiguous than binary code is too ambiguous for them to deal with. If you can't gene-sequence the bear and show it differs too much from Felis catus, they will argue the bear is in fact an "outlier" kind of cat. They will not notice how many other bears (and lions and foxes oh my) you would have to include in the data to be consistent with this definition.
 Fun: in this case you can't gene-sequence the bear, because the data was already gathered and tabulated. The specimens have already been released. Lolscience.
 More fun: the GAE bureaucracy showed it's perfectly capable of moving quickly when it wants to, because it destroyed all the evidence from 9/11 in under a week. As if it knew it had something serious to hide. I choose to believe them. ¯\_(ツ)_/¯
