Since no two things are exactly alike, variation affects us in every aspect of our lives. Surprisingly, most American managers still don't pay much attention to variation.
When they do look at it, they base their reaction on whether it is good or bad. For example, when the number of accidents goes up one month they start to wonder if something was done wrong in the safety program. What they don't realize is that there are at least two ways of interpreting variation.
Years ago, I presented to an operations manager a simple run chart of the number of workers' compensation injuries per month in his department. During the review of the data, the corporate risk manager interrupted the discussion and asked the manager what he did wrong in the month of June when the number of accidents was higher than the other months of the year.
Somewhat perplexed, the operations manager explained he did nothing different that month. All of the safety meetings were held and all safety duties were carried out the same way they were in all the other months. After an awkward moment, the risk manager replied, “You must have done something different. The figures don't lie.” Which reminded me of something Dr. W. Edwards Deming used to say: “So much misunderstanding in so few words.”
The risk manager was acting as though he could see something in the data the operations manager was missing. He told the operations manager he should redouble his efforts to find out what went wrong in June. The risk manager was working from the premise that variation in the number of accidents (outcomes) is the indicator of good or bad performance. For him, good management involved demanding a detailed explanation about every single data point.
The risk manager's management method relies on judging performance by the numbers but not knowing what the numbers really mean. He had no idea what the statement “absence of a negative doesn't mean you have a positive” means.
When this view of variation by managers is carried further, it becomes the way they determine good or bad performance between departments, divisions or people. These managers focus on interpreting what they see, and labeling these observations as causes of problems. They spend no time examining the system because they have mistaken symptoms for causes. Their world is one of command and control, where it is assumed that when things go wrong, it is not the fault of the system but of the employees.
The risk manager had never learned how variation in outcomes is affected by common and special causes. Understanding this difference would have provided him with a better idea of how to react to variation.
COMMON AND SPECIAL CAUSES
The other way to interpret variation in performance is that variation stems from common and special causes. William Shewhart classified these causes as follows:
Common causes: Those causes that inherently are part of the process (or system) hour after hour, day after day and that affect everyone working in the process.
Special causes: Those causes that are not part of the process under normal circumstances and don't affect everyone.
For example, when employees are watching a safety presentation, their attention is affected by causes common to everyone in the audience. These could include the room temperature, the condition of the chairs, the delivery by the presenter, the lighting and noise from outside the room. These are common causes. There also are causes that affect how each individual will process the information delivered that are special for each person. These include hearing loss, lack of sleep the night before, eyesight and personal problems that are creating a distraction.
If the lack of attention is due to common causes, then the speaker and the program planner need to take these into consideration for future presentations. If lack of attention is from special causes, then the employees in the audience need to address them. It is important for managers to know if they are dealing with common or special causes when fixing problems in a constant cause system.
Common causes exist in any constant cause system. The best way to understand the performance of a system is to track its quality characteristics. Quality characteristics of work include things such as the color of parts, the fit and finish of final products and the number of accidents.
Safety management is directly responsible for a quality characteristic of a work system. The ultimate purpose of safety in any work system is to prevent employee injuries; therefore, the key performance indicator, or quality characteristic measurement, of safety is the number of accidents that occur in manufacturing or service operations. It is part of the system hour after hour, day after day and week after week.
Employee accidents are outcomes of a safety management process of common causes that make up the system, e.g., safety training, supervision, job layout and design, temperature, the methods used to assemble parts, lighting and the age of employees. Most accidents (85 percent to 99 percent) are caused by the interaction of common cause variation or faults in the system. Safety is a system and as such, the variation in its outcomes can be measured to determine if it is stable.
The proper tool for determining if a system is impacted by common or special causes of variation is the control chart developed by William Shewhart and perfected by Deming and statisticians after him. Control charts are made by collecting measurements of a quality characteristic of a process. You can start with measurements of outcomes, such as accidents, to determine if the process is stable.
For purposes of improvement, measurements of the early stages of the process are required. For safety, this might include data on the effectiveness of your safety training. The figure above depicts how an SPC chart actually measures the outcomes of the variation of common causes in the system.
The purpose of a control chart is to determine if a process is stable, i.e., do any special causes exist in the process or is the variation coming from common causes?
If special causes exist, a process is not stable and you cannot predict how it will behave in the future. If a special cause is found that moves the process away from the target and this special cause persists, you will have to determine what is going on and take action. Conversely, if a special cause is discovered that is moving the process toward the target, you will want to find out what has changed and permanently incorporate it into the process.
If the process is stable and variation is due only to common causes in the system, the next question is, is it on target? The target for safety for the number of accidents is zero, but contrary to traditional safety management rhetoric, you can't prevent every single accident. So you must work to bring the average number of accidents as close to zero as possible.
A stable process usually is improved through fundamental changes of common causes in the system. Some examples of changes include improving the effectiveness of safety training, conducting ergonomic assessments for job tasks or encouraging cooperation between the safety department and production.
For the reasons noted above, control charts are important tools for safety management. They are the process talking to you, telling you about the stability of your safety management system. They also provide a common language between management and workers to prevent inappropriate action by managers when safety needs to be improved.
The most common mistake made by managers is treating common causes the same as special causes. This usually ends up with management holding workers accountable for things they can't control, such as the quality of safety training, poor work layout or design, and unsafe actions or at-risk behaviors perpetuated by the pressure of production. This action leads to a breakdown between management and workers.
INTERPRETING DATA POINTS
The run chart on the next page depicts the raw data count of the number of accidents per month vs. the accident rate per 200,000 man-hours worked per month. A manager who interprets variation from the good or bad viewpoint could easily assume there has been a positive result in safety performance based on the last four months of 2009. He also might have assumed in July 2008 that something bad had happened because the facility went from 2.64 to 6.54 in one month!
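The rate mentioned above is a simple normalization of the raw count by hours worked. A minimal sketch, assuming the usual form (accidents times 200,000, divided by hours worked); the function name and the two sample months are hypothetical and are not taken from the article's data:

```python
def accident_rate(accidents: int, hours_worked: float,
                  base_hours: float = 200_000) -> float:
    """Accidents normalized to a fixed number of hours worked."""
    if hours_worked <= 0:
        raise ValueError("hours_worked must be positive")
    return accidents * base_hours / hours_worked


# Hypothetical months, for illustration only (not the article's data):
# the same raw count gives a different rate when hours worked differ.
months = [("Month A", 3, 45_000), ("Month B", 3, 60_000)]
for label, count, hours in months:
    rate = accident_rate(count, hours)
    print(f"{label}: {count} accidents, rate {rate:.2f} per 200,000 hours")
```

This is why the raw count and the rate can tell different stories on the same run chart: the rate accounts for how much work was actually performed in each month.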
A manager who understands common and special causes of variation would take the analysis one step further. He or she would start by generating the appropriate SPC charts to determine if any special causes exist. There are statistical rules for special causes such as a point above or below a control limit or seven consecutive descending or ascending points. Any special causes would be highlighted in red on the lines of the chart. The IMR (Individuals and Moving Range) charts on p. 54 help make this determination.
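The two rules named above can be checked mechanically. The following is a rough sketch, not the procedure of any particular SPC package: the function names are hypothetical, the control limits are assumed to have been computed already, and "seven consecutive ascending or descending points" is read here as seven points in a row that each move in the same direction.

```python
def points_beyond_limits(values, lcl, ucl):
    """Indexes of points above the upper or below the lower control limit."""
    return [i for i, v in enumerate(values) if v > ucl or v < lcl]


def monotone_runs(values, run_length=7):
    """(start, end) index pairs of runs of at least `run_length` points
    that are strictly ascending or strictly descending."""
    runs = []
    start = 0        # index where the current run began
    direction = 0    # +1 rising, -1 falling, 0 no direction yet
    for i in range(1, len(values)):
        diff = values[i] - values[i - 1]
        step = (diff > 0) - (diff < 0)
        if step != 0 and step == direction:
            continue  # the current run keeps going
        # The run (if any) ended at the previous point.
        if direction != 0 and (i - 1) - start + 1 >= run_length:
            runs.append((start, i - 1))
        # A tie breaks the run; otherwise the turning point starts a new one.
        start = i - 1 if step != 0 else i
        direction = step
    if direction != 0 and len(values) - start >= run_length:
        runs.append((start, len(values) - 1))
    return runs


# Hypothetical monthly rates and limits, for illustration only.
rates = [4.1, 3.2, 5.0, 2.7, 4.4, 3.9, 5.3, 2.9]
print(points_beyond_limits(rates, lcl=0.0, ucl=9.5))  # no points flagged
print(monotone_runs(rates))                           # no runs flagged
```

When both checks come back empty, as they do for these made-up figures, there is no statistical signal of a special cause and the month-to-month movement is treated as common cause variation.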
The technique first calls for examining the moving range, which shows the amount of difference between the accident rates on a monthly basis. This chart shows no special causes, so the variation of the common causes of safety is stable. The upper chart shows location of the rates over time vs. zero. Again, no statistical evidence of special causes exists so the safety process is stable. This helps prevent management from making the mistake of thinking something unusual was going on during the last four months of 2009.
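For readers who want to see where the limits on an IMR chart come from, here is a minimal sketch using the conventional individuals-chart constants (2.66 for the individuals limits and 3.267 for the moving range limit). The function name and the sample rates are hypothetical, and flooring the lower limit at zero is an assumption made here because an accident rate cannot be negative.

```python
from statistics import mean


def imr_limits(values):
    """Center line and control limits for an Individuals and Moving Range chart."""
    if len(values) < 2:
        raise ValueError("need at least two points to form a moving range")
    # Moving range: absolute difference between consecutive points.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = mean(moving_ranges)
    center = mean(values)
    return {
        "center": center,
        "ucl": center + 2.66 * mr_bar,
        # Assumption: rates cannot go below zero, so the limit is floored.
        "lcl": max(0.0, center - 2.66 * mr_bar),
        "mr_bar": mr_bar,
        "mr_ucl": 3.267 * mr_bar,
        "moving_ranges": moving_ranges,
    }


# Hypothetical monthly accident rates, for illustration only.
rates = [4.0, 6.0, 5.0, 7.0, 3.0]
limits = imr_limits(rates)
print(f"Individuals: center {limits['center']:.2f}, "
      f"LCL {limits['lcl']:.2f}, UCL {limits['ucl']:.2f}")
print(f"Moving range: average {limits['mr_bar']:.2f}, UCL {limits['mr_ucl']:.2f}")
```

The moving range chart is read first because the individuals limits are built from the average moving range; if the ranges themselves show a special cause, the limits on the upper chart are not trustworthy.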
At this point, the outcomes in October, November and December are normal common cause variation. Unless safety management can prove it has implemented a fundamental change in the safety process, it is not the time to pat the safety department on the back for a job well done. Nor should management beat up the safety department for the months where the numbers went up compared to the previous month.
The problem is that all of the data points represent the worst form of scrap in any type of work process: workers being hurt on the job. The data also show the outcomes of the safety process are off target. This represents a loss to the safety customers (the workers), management and ultimately, the customers who buy the company's goods.
IT TAKES A TEAM
Quality management has taught us it takes a concerted team effort to address common cause variation in the system. As is done for quality, fixing common causes in the safety system involves teamwork and the application of the basic tools of problem solving, including flow charts, run charts, scatter diagrams, cause-and-effect diagrams, Pareto charts, histograms and SPC charts.
The SPC charts show it is the responsibility of management and employees to work on the upstream factors, the faults of common causes built into the system that are driving the safety outcomes. Deming estimated that common causes, not the unsafe actions or at-risk behaviors of workers, may be responsible for as much as 99 percent of all the accidents in work systems.
The problem is that common cause variation is the result of faults built into the system that are not easy to see or correct. More often than not, they are buried deep in the system and require the cooperation of management and technical workers to find them and fix the system so safety performance improves.
Without the aid of statistical thinking and teamwork, managers and workers are severely handicapped and have little chance for success at completing this task.
Thomas A. Smith of Mocal Inc. works with management and hourly employees to help them learn about new theory of management to obtain team skills and work on culture change. His book, System Accidents: Why Americans Are Injured At Work And What Can Be Done About It, has received high praise and can be obtained on Amazon. He can be reached via his company Web site at http://www.mocalinc.com or at 248-391-1818.
Thomas W. Nolan and Lloyd P. Provost, “Understanding Variation,” Quality Progress, May 1990.
Walter A. Shewhart, Economic Control of Quality of Manufactured Product (New York: D. Van Nostrand Company, 1931); reprinted by ASQC Quality Press, Milwaukee, WI, 1980.
W. Edwards Deming, The New Economics, p. 176.
W. Edwards Deming, Out of the Crisis (Cambridge, MA: MIT, 1986), p. 479.