Theory-first vs. reality-first people and the EM algorithm

Mon, 23 Aug 2021 00:50:34 -0700

Tags: philosophical, political, academic


As the years pass, I have come to realize that people's minds, when approaching a problem, seem to be biased towards either theory-first or reality-first thinking. Theory-first people are more mathematically driven and equate the properties of things with the things themselves. If the model is appealing or beautiful enough, they will fight reality to make it change and move closer to the model. The manifestation of some religious, economic or political beliefs sometimes falls into this category.

On the other hand, we have people who are more observational and describe reality in all its infinite glory. While I see myself more within this camp, I'd say it can lead to a sort of navel-gazing. As the full reality is too complex for our limited cognition to understand, it is difficult to reach actionable conclusions from such an overwhelming jumble of data.

This model-driven gross oversimplification of people's cognition is, however, quite useful: at the personal level to understand our own biases, at the interpersonal level to understand the biases of our collaborators, and even at the community level. I'd argue that Leo Breiman's "Statistical Modeling: The Two Cultures" hinges precisely on this point. (Now, you can be an empiricist who reaches actionable conclusions if you're not constrained by the limits of your own cognition and resort to computers to do the trick, but then the belief in the computers themselves is theory-driven, oops.)

But continuing with my musings mixing computational decision-making and politics, there is a very popular algorithm in statistics / machine learning that intermixes these two views: the expectation-maximization algorithm (EM for short). In this algorithm, we alternate between two steps, each improving the model being built. In the general case, EM makes it possible to solve equations with two sets of interlocking unknowns, and it does so by solving for each set using the values of the other set from the previous step. The algorithm can be proved to find locally optimal estimates (a local maximum of the likelihood), not necessarily globally optimal ones.
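To make the two interlocking sets of unknowns concrete, here is a minimal sketch, assuming a toy setting that is not from any particular library: a mixture of two one-dimensional Gaussians with unit variance and equal weights. The function name `em_two_gaussians`, its parameters and the sample data are all hypothetical, chosen only for illustration. One set of unknowns is how much each point belongs to each component; the other is the two means.

```python
import math

def em_two_gaussians(xs, mu_a, mu_b, steps=50):
    """Toy EM for two unit-variance, equal-weight 1D Gaussians (illustrative only)."""
    for _ in range(steps):
        # E step: holding the means fixed, estimate the probability that
        # each point was generated by component a (soft assignment).
        resp_a = []
        for x in xs:
            pa = math.exp(-0.5 * (x - mu_a) ** 2)
            pb = math.exp(-0.5 * (x - mu_b) ** 2)
            resp_a.append(pa / (pa + pb))
        # M step: holding the soft assignments fixed, re-estimate each mean
        # as the responsibility-weighted average of the data.
        weight_a = sum(resp_a)
        weight_b = len(xs) - weight_a
        mu_a = sum(r * x for r, x in zip(resp_a, xs)) / weight_a
        mu_b = sum((1 - r) * x for r, x in zip(resp_a, xs)) / weight_b
    return mu_a, mu_b

# Hypothetical data: two well-separated groups, near 1 and near 5.
data = [0.8, 1.0, 1.2, 4.9, 5.0, 5.1]
mu_a, mu_b = em_two_gaussians(data, mu_a=0.0, mu_b=6.0)
```

With groups this well separated, the two means should settle close to the centers of the groups. The starting values matter: since only a local optimum is guaranteed, a poor initial guess can leave the algorithm stuck in a worse solution.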

In the case of clustering values, however, one interpretation of EM is that the E step "believes" the model (it takes the cluster centers computed so far and reclusters the data using them), while the M step "believes" the data (it recomputes the cluster centers from the reclustered data). For the current discussion, the point is that without both types of mindsets, progress cannot be achieved. The theory-driven people will push for changes to reality, while the empiricists will force updates to the theory. The parallel with EM might be far-fetched, but I find it quite satisfying.
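The clustering reading can be sketched in a few lines. This is a minimal sketch, assuming the hard-assignment variant usually known as k-means, on one-dimensional values; the name `hard_em_clusters` and the sample data are hypothetical.

```python
def hard_em_clusters(xs, centers, steps=20):
    """Alternate hard assignment and center updates (k-means style, illustrative)."""
    for _ in range(steps):
        # E step: believe the model. Take the current centers as given and
        # assign every point to its nearest center.
        groups = [[] for _ in centers]
        for x in xs:
            nearest = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
            groups[nearest].append(x)
        # M step: believe the data. Move each center to the mean of the
        # points assigned to it; a center with no points stays where it was.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers

# Hypothetical data: two groups of values, near 1 and near 5.
centers = hard_em_clusters([0.8, 1.0, 1.2, 4.9, 5.0, 5.1], [0.0, 6.0])
```

Neither step makes progress alone: assignment without updates only ever restates the initial guess, and updates without reassignment have nothing new to average.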

