It’s a cold winter’s night in northern New Hampshire. You go out to the woodshed to grab a couple more logs, but as you approach, you hear a rustling inside the shed. You’ve gotten close enough to know you have a critter in the woodpile. You run back inside, bolt the door, hunker down with your .30–06, and prepare for a cold, fireless night.
Analyzing data using common tools like F-tests, t-tests, transformations, and ANOVA methods is a lot like that scenario. They can tell you that you’ve got a critter in the woodshed, but they can’t tell you whether it’s a possum or a black bear. You need to take a look inside to figure this out. Limiting data analysis to the results that you get from the tools cited above is almost always going to lead to missed information and, often, to wrong decisions. Charting is the way to take a look inside your data.
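To make "taking a look inside" concrete, here is a minimal sketch of the arithmetic behind one common chart, the individuals (XmR) process behavior chart. It is an illustration only, not the method used in this article: the function names and the readings are hypothetical, and the only rule applied is the basic one of flagging points outside the natural process limits (the average plus or minus 2.66 times the average moving range).

```python
from statistics import mean


def xmr_limits(values):
    """Return (center, lower, upper) natural process limits for an
    individuals (XmR) chart. Hypothetical helper for illustration."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    # Moving ranges: absolute differences between successive values.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    avg_mr = mean(moving_ranges)
    # 2.66 is the usual scaling constant for moving ranges of size 2
    # (3 divided by d2, where d2 = 1.128).
    return center, center - 2.66 * avg_mr, center + 2.66 * avg_mr


def points_outside(values):
    """Return (index, value) pairs that fall outside the limits."""
    _, lower, upper = xmr_limits(values)
    return [(i, v) for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical readings, made up for this sketch (not data from the article).
readings = [10, 12, 11, 13, 10, 12, 11, 25, 12, 11]
center, lower, upper = xmr_limits(readings)
print(f"center={center:.2f}, limits=({lower:.2f}, {upper:.2f})")
print("outside limits:", points_outside(readings))
```

With these made-up readings, only the eighth value (25) falls outside the limits. A summary test could say that something is off; the chart shows which point to go and look at.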
…
Comments
Transforming Data
Douglas,
Thanks for the nice article. There is just one point of discomfort for me when you state: "There’s no indication that these data are anything but normal, so I won’t transform the data." In my 40+ years utilizing Process Behavior Charts, I cannot recollect a single time I have had to worry about normality or transform data before constructing a chart to get the understanding of the process I needed. Shewhart himself dispelled the notion that the normal distribution was important to the use of control charts: "The normal distribution is neither a prerequisite for nor a consequence of statistical control."
Transforming Data
Hi Steve,
I guess I didn't get it across well, but that was my point. I think many these days transform their data as the first step. If I ever did (and I never have either), it would only be on a known homogeneous population with strong, and I repeat STRONG, statistical evidence that it was something other than normal, and I would not do it for my control charting. I am sitting here trying to think of a situation where it would be really important to perform a transformation, but even in determining predicted proportions defective at extreme tails (on said homogeneous population), it is really a roll of the dice whether you get a better estimate or not. I think usually it is done these days when people don't like the results of the normal model and want to present prettier numbers.
Transforming
Thank you, Douglas. Well said!
Transforming Data
For anyone who may not have seen it, Dr. Wheeler expressed it well in his response to comments on his latest article:
"The point is not about how we find an estimate of the parameters for a probability model. But rather that regardless of how we estimate our parameters, the whole process is filled with uncertainty, and that these uncertainties will have the greatest impact upon the extreme critical values. The statistical approach and Shewhart's approach are diametrically opposite, and until someone understands this, they cannot begin to understand how Shewhart's distribution free approach can work."
This is the "roll of the dice" that defines any data transformation.
How Elegant, (deceptively) Simple and Eloquently Stated
Hi, Douglas,
All I can say is BRAVO! for a clear article that should, but unfortunately won't, eliminate hours of legalized torture that goes in the name of statistical training for a "belt." As I like to say, there is no "app" for critical thinking applied to a simple plot of data over time.
Regarding the Normal distribution: I saw a live broadcast of a Deming 4-day seminar and someone mentioned the Normal distribution in a question to him. He gave that famous terrifying scowl and GROWLED, "Normal distribution? I've never seen one!" -- end of answer.
I don't feel so alone in my approach to data any more. Thank you!
Kind regards,
Davis Balestracci