Lack of trust in source data is a common concern with data analytics solutions. A friend of mine is a product manager for a large software company that uses analytics for insights into product sales. He told me the first thing executives and managers do when new analytic products are released in his NYSE-traded, multibillion-dollar company is… manually recalculate key metrics. Why would a busy manager or executive spend valuable time opening up a spreadsheet to recalculate a metric? Because he or she has been burned before by unreliable calculations.
I’ve been exploring the subject of unreliable data since a recent survey of CEOs revealed that only one-third trust their data analytics. I have also been studying for an exam next week to earn a Certified Analytics Professional designation to formalize my knowledge on the subject. While studying each step in the analytics process defined by INFORMS, the sponsoring organization for the Certified Analytics Professional exam, I’ve considered how things could go wrong and result in an unreliable outcome. In the spirit of Lean process improvement (an area I specialized in earlier in my career), I pulled those potential pitfalls together in a fishbone diagram:
The potential pitfalls in this diagram are listed from right to left in the order they would likely occur for decision scientists following INFORMS’ analytics process. They are listed below using my own terminology, chosen so the names would fit on the diagram:
- Business Problem Misinterpretation
- Project/Problem Requirements Are Incomplete
- Insufficient Data Understanding
- Incorrect Model or Method
- Incorrect Model Application
- Unreliable Data: We previously discussed this topic in the Analytic Auditor blog post titled Should you Trust Your Analytics.
- Data Inaccurately Prepared: We previously discussed this topic in the Analytic Auditor blog post titled Should You Trust Your Analytics II: Data Provenance.
- Superior Models Exist
- Champion Model Not Validated
- Model Not Current
- Visualizations Contain Translation Errors
Hopefully this assessment is useful to decision scientists creating data products, to those performing quality control, and to those who rely on the results to make organizational decisions. I will continue to explore the items not yet covered on this list in future posts, so please register using the button on the right!