Dunning-Kruger and the emperor's new clothes
I don't think I've written a post outright defending my writing thus far, but I think I might as well. One of the things I often do is overstep my area of expertise to point out obvious flaws in areas such as quantitative research or philosophy of language.
I was recently reminded of a popular and silly post I made on /r/ml when I first started seriously working in the field. It was a simple question, one that anyone with a gram of common sense would ask:
How do I know a paper’s results are real if I can't see the code and the data? Furthermore, in many cases the data is open and the paper showcases, in theory, how the code ought to be written. So what’s the downside in including them?
This was before I was aware that a large amount of scientific research, even in STEM-adjacent domains, is in part fake or wrong. In hindsight, my question was stupid. There’s an obvious reason why the code and data aren’t made open in some papers, and it’s not that this ought not to be done. The reason is political: it’s how fake research propagates.
I knew the replication crisis was a thing in sociology and psychology, but those areas were obviously "harder" to get right: more politics, complex environments, no hard benchmarks. Machine learning is almost a purely theoretical affair in many cases. It seemed that, almost by definition, you can't have a replication crisis. After all, reviewers must be running the paper's code on the same data with different seeds to confirm results, and using profilers to confirm that all assumptions are correct (e.g. that the novel layer being introduced is actually differentiable and its weights are changing).
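To make the kind of check I had in mind concrete, here is a minimal sketch. It assumes a toy linear model and a synthetic dataset standing in for a paper's code and data. Every name in it (`make_data`, `train`, `replicate`) is made up for illustration, and it only checks that results are stable across seeds and that the weights actually move during training. It does not test differentiability as such.

```python
import random
import statistics


def make_data(n=200, noise=0.1, seed=0):
    """Synthetic stand-in for a paper's open dataset: y = 3x - 1 + noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    ys = [3.0 * x - 1.0 + rng.gauss(0, noise) for x in xs]
    return xs, ys


def train(xs, ys, seed, epochs=50, lr=0.1):
    """Fit y = w*x + b with plain SGD; the seed controls init and shuffling."""
    rng = random.Random(seed)
    w, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
    w0, b0 = w, b  # remember the initial weights for the sanity check below
    idx = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(idx)
        for i in idx:
            err = w * xs[i] + b - ys[i]
            w -= lr * err * xs[i]
            b -= lr * err
    mse = sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
    # If nothing moved, the "model" never received a usable gradient.
    weights_moved = abs(w - w0) > 1e-6 or abs(b - b0) > 1e-6
    return mse, weights_moved


def replicate(seeds=(0, 1, 2, 3, 4)):
    """Re-run training on the same data with several seeds and summarise."""
    xs, ys = make_data()
    runs = [train(xs, ys, s) for s in seeds]
    losses = [mse for mse, _ in runs]
    return {
        "mean_mse": statistics.mean(losses),
        "stdev_mse": statistics.stdev(losses),
        "all_weights_moved": all(moved for _, moved in runs),
    }


report = replicate()
print(report)
```

A result that only shows up for one seed, or a layer whose weights never change, would be visible in a report like this within minutes. A real review would of course need the paper's actual code and data, not a toy.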
Turns out no, these things that should obviously happen don't happen. Why? I assume the reason mainly has to do with crony academic setups that encourage bunk "research" meant only to increase citation and works-published counts. Having to write down the actual code would make this more difficult, so why bother? I assume this has downstream effects, where people in the industry don't bother to get their code in shape to publish (or deal with the legal hassle of doing so) because they know the state of research and understand that this won’t matter.
Note: there are some papers, such as proofs, reviews, summaries of research, etc., where source code and data needn't be published and indeed may make no sense.
I'm not mentioning this to shit on ML. In my attempt to analyse the amount of fraud and mistakes in the epistemic foundations of various fields, my personal take is that the vague area around machine learning, and deep learning in particular, is close to a gold standard. While I have nowhere near the amount of evidence needed to make this argument now, I mention this since I don’t want this post to come off as an attack on the high-quality research being done in this area, especially since there’s a constant push for more open data and source code as a prerequisite for anyone taking you seriously.
I’m highlighting this as an example of where one might mistake a lack of political knowledge for the Dunning-Kruger effect. You have a relatively inexperienced person come in and say:
Look, like 30% of what this field produces seems to be shit, and could easily be improved by following this one simple trick. You guys should do it.
To which the answer is something like:
Yes, basically everybody knows this, but due to corruption and lack of interest, it doesn’t happen.
Is this the Dunning-Kruger effect? Somebody that’s just learned about a field thinking they know better than experts?
No, this is just common sense mixed with naive optimism. Seeing a problem that’s obviously solvable and assuming that we live in a highly collaborative environment where a good solution should essentially be enough to solve it, ignoring the fact that most “sides” working to “solve the problem” are using it as a proxy for a conflict.
I think this is a rather important distinction. When “being an expert in a field” means “being an expert at navigating the politics/corruption in that field”, this is an indication that something has gone terribly wrong. It indicates that “expertise” has transitioned from something collaborative, useful for advancing the field, to something competitive, useful mainly for personal gain (in money, “status” or whatever other silly things people fool themselves into thinking they want).
Again, I don’t think ML is a good example of this. But I won’t name the obvious examples since everyone already knows them, and I won’t name the contentious ones because I don’t have enough data to defend my current intuitions around those.
Beware of dismissing the kid pointing out that the emperor is naked as someone under the spell of the Dunning-Kruger effect. It may indeed be that she is, but first, look for yourself to make sure the emperor isn’t actually fucking naked.
Published on: 2021-06-28