4 Comments
May 9, 2023 · Liked by Randy Au

As I read, I kept waiting for you to mention Bayesian inference. The entire Bayesian approach is "attempt to describe the thing that generated your data... which we call a model." That is, the frequentist approach assumes some distribution drives the data, leaning on large-N behavior (pretend it's Gaussian and it'll be OK), while the Bayesian approach says "that's cool, but allow smaller N and focus on that which generates the data, rather than the data itself". Bayesian inference also emphasizes using what you know (or think you know) about the data to nudge the model in the "right" direction... and if the model doesn't fit well, your nudge could be wrong. If you haven't spent much time looking at this stuff, you may enjoy it, as you already think in that way.
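
As a rough illustration of that "nudge", here is a minimal sketch of a conjugate Beta-Binomial update in Python. The sample (3 successes in 10 trials) and both priors are made-up numbers chosen for illustration, and the function names are hypothetical, not taken from any library.

```python
def beta_binomial_update(prior_a, prior_b, successes, trials):
    """Update a Beta(prior_a, prior_b) prior with binomial data.

    Returns the (a, b) parameters of the Beta posterior.
    """
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    return prior_a + successes, prior_b + (trials - successes)


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical small sample: 3 successes in 10 trials.
successes, trials = 3, 10

# Flat prior, Beta(1, 1): no nudge, the data dominates.
flat_posterior = beta_binomial_update(1, 1, successes, trials)

# Informed prior, Beta(2, 18): an assumed belief that the rate is near 10%.
informed_posterior = beta_binomial_update(2, 18, successes, trials)

print("flat:", flat_posterior, round(beta_mean(*flat_posterior), 3))
print("informed:", informed_posterior, round(beta_mean(*informed_posterior), 3))
```

With only 10 observations the two posterior means differ noticeably, which is the point: at small N the prior matters, so a wrong nudge shows up as a poor fit.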

I am always frustrated when analysts don't stop to ask how their data was generated before doing a large analysis. So much wasted work can be avoided if one understands the basics of the process, what was intended, and what was actually tracked/logged. An analysis can be 100% correctly done, with appropriate checks for assumption violations and outliers... and yet be 100% wrong because the data was collected badly, misinterpreted, or unrepresentative of the underlying process. And just because it's "digital" doesn't make the data any better on its own; it just makes it easier to collect bad data in volume.

May 9, 2023 · Liked by Randy Au

I would venture to say that all data analysis has to be done with respect to a model, and the model should contain both signal (what we want to measure) and noise (what gets in the way). Of course, all models are wrong - it's just in what way and to what degree. Sometimes noise can become signal - depending on what we want to measure.

There is an oil well logging tool called a caliper. It simply uses a pair of arms to measure the width of the hole. Sometimes in a particular rock, like a soft shale, the rock will slough off, creating an enlargement of the hole. If the enlargement gets big enough, the caliper will max out. Normally one would call those sections noise - all you know is that the hole diameter is larger than some value. But I used those sections once to pick out salt layers, since those all dissolved out. Noise became signal.
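
A minimal sketch of treating those maxed-out sections as signal might look like the following. The depths, readings, and maximum reading are invented for illustration, `saturated_intervals` is a hypothetical helper rather than part of any logging package, and a pegged interval is only a candidate washout, not proof of salt.

```python
def saturated_intervals(depths, readings, max_reading, tol=1e-6):
    """Return (top, bottom) depth pairs for runs where the caliper is pegged.

    A reading counts as pegged when it is within `tol` of `max_reading`.
    """
    intervals = []
    start = None
    prev_depth = None
    for depth, value in zip(depths, readings):
        pegged = value >= max_reading - tol
        if pegged and start is None:
            start = depth  # a saturated run begins here
        elif not pegged and start is not None:
            intervals.append((start, prev_depth))  # the run ended at the previous depth
            start = None
        prev_depth = depth
    if start is not None:
        intervals.append((start, prev_depth))  # run extends to the last sample
    return intervals


# Hypothetical log: depth samples and hole diameters from a tool
# that is assumed to max out at 16.0.
depths = [100, 101, 102, 103, 104, 105, 106]
readings = [8.5, 9.0, 16.0, 16.0, 9.2, 16.0, 8.7]

intervals = saturated_intervals(depths, readings, max_reading=16.0)
print(intervals)
```

Instead of discarding the censored readings, the sketch keeps their locations, which is the information the salt-layer pick relied on.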
