Should You Trust Analytics III: Analytics Process

Lack of trust in source data is a common concern with data analytic solutions. A friend of mine is a product manager for a large software company that uses analytics for insights into product sales. He told me the first thing executives and managers do when new analytic products are released in his NYSE-traded, multibillion-dollar company is… manually recalculate key metrics. Why would a busy manager or executive spend valuable time opening up a spreadsheet to recalculate a metric? Because he or she has been burned before by unreliable calculations.

I’ve been exploring the subject of unreliable data since a recent survey of CEOs revealed that only one-third trust their data analytics. I have also been studying for an exam next week to earn a Certified Analytics Professional designation and formalize my knowledge of the subject. While studying each step in the analytics process defined by INFORMS, the sponsoring organization for the Certified Analytics Professional exam, I’ve considered how things could go wrong at that step and result in an unreliable outcome. In the spirit of Lean process improvement (an area I specialized in earlier in my career), I pulled those potential pitfalls together in a fishbone diagram:

Analytic Errors Fishbone


Visualized Correlations

One interesting approach to root cause analysis is to correlate descriptive variables about errors with one another. I created this correlogram to visualize the correlation coefficient for every pair of variables observed in a large information system. Each variable is identified by a number, and at the intersection of two numbers is a square that represents the correlation of those two variables across hundreds of observations.

Correlogram (2015-05-12)

Blue shows a positive correlation, red shows a negative correlation, and darker saturation signifies a stronger relationship. Which trends might give insight into the root causes? I chose to explore variables 14 (vertical blue trend), 25 (horizontal), and 27 (horizontal).

The analysis was performed in Excel and also in R using the correlogram package.
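For readers who want to try the idea without Excel or R, here is a minimal Python sketch of the numbers behind a correlogram. It is not the analysis described above: the data is synthetic, and the function names (pearson, correlation_matrix, strongest_variables) are hypothetical helpers made up for illustration. It computes the Pearson correlation for every pair of variables, then ranks variables by their average absolute correlation with the others, which is roughly what the eye does when it picks out a dark row or column.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return float("nan")  # correlation is undefined for a constant column
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Square matrix whose cell [i][j] is the correlation of variables i and j."""
    k = len(columns)
    return [[pearson(columns[i], columns[j]) for j in range(k)] for i in range(k)]


def strongest_variables(matrix, top=3):
    """Rank variables by mean absolute correlation with every other variable."""
    k = len(matrix)
    scores = []
    for i in range(k):
        others = [abs(matrix[i][j]) for j in range(k)
                  if j != i and not math.isnan(matrix[i][j])]
        score = sum(others) / len(others) if others else 0.0
        scores.append((score, i))
    return [i for _, i in sorted(scores, reverse=True)[:top]]


# Synthetic stand-in data (an assumption, not the data from the post):
# variable 1 tracks variable 0, variable 2 moves against it, variable 3 is noise.
random.seed(0)
n_obs = 300
base = [random.gauss(0, 1) for _ in range(n_obs)]
columns = [
    base,
    [b + random.gauss(0, 0.3) for b in base],
    [-b + random.gauss(0, 0.3) for b in base],
    [random.gauss(0, 1) for _ in range(n_obs)],
]

matrix = correlation_matrix(columns)
for row in matrix:
    print(" ".join(f"{value:+.2f}" for value in row))
print("Variables worth a closer look:", strongest_variables(matrix, top=3))
```

Positive cells correspond to the blue squares and negative cells to the red ones; the magnitude plays the role of saturation. Ranking by average absolute correlation is only one way to shortlist variables, and a strong correlation is a prompt for investigation rather than proof of a root cause.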