So very true. I work with publicly available data, and I very often find myself trying to guess why the data was collected. It would be oh so very helpful if, in addition to the data dictionary (though even that is often missing), a short explanation of why the data was collected were documented. That could explain a lot of what I see as oddities.

I worked with local crime incident data once. As I went through cleanup, I noticed that some incidents were from faraway places, other states even. That's when I realized that the data was not gathered to track crime, but to track cops' responses to incidents - a subtle but important difference. It was more a management tool for chiefs to track what their people were doing than a tool to track crime.

