An Introduction to Control Charts

In Measurement and Data Representation, we looked at measurements associated with manufacturing processes and used histograms to visualize them.

In general, continuous measurements resulting from manufacturing processes are approximately normally distributed. A histogram associated with an in-control process remains stationary over time. In other words, the population does not exhibit significant changes in the spread or the mean, marking a process that is stable and predictable.

Even a well-established process will likely deteriorate and become out-of-control after a period of time. This is due to wear and tear, personnel changes, changes in the quality of raw materials, and changes in the environment. Therefore it is crucial to monitor the process through regular sampling. Sample statistics can help us spot early signs of a shift in the mean (loss of accuracy), or a widening of the spread (loss of precision). They can also point to the nature of non-random factors affecting product quality, such as cyclical fluctuations in temperature and humidity.

In this section we will look at how control charts can help us track sample statistics over time and alert us to a process gone wrong, thereby allowing us to halt and fix the process before it starts producing scrap.

Out of Control

A machine at a juice factory is set to fill juice bottles with 300 ml of juice. When the process is operating properly, the amount of juice per bottle is historically known to be normally distributed with mean $\mu = 300$ ml and standard deviation $\sigma$. To check how well the process is running, and to avoid under-filling or over-filling the bottles, the factory routinely samples its output. Six samples of size $n = 4$ are shown below.

Variability is inherent in all manufacturing processes. The first four samples show amounts close to 300 ml with a reasonable amount of variation. Samples 5 and 6 are designed to illustrate a process gone wrong. If you compare Samples 5 and 6 carefully, you will see that each sample points to a different kind of problem with the process.

Samples are collected to gauge the current state of the process. What does Sample 5 indicate about the process?

  • The process is no longer consistent. Standard deviation is too high.
  • The mean is much less than 300 ml.
  • The mean is much greater than 300 ml.

What does Sample 6 indicate about the process?

  • The process is no longer consistent. Standard deviation is too high.
  • The mean is much less than 300 ml.
  • The mean is much greater than 300 ml.

Variability is a natural part of any manufacturing process, but what amount of variability is too much? Samples 5 and 6 visually suggest that the process is out of control; statistical analysis will help us determine whether or not this is the case.

Something significant...

Let’s take a look at the data corresponding to the graphic in the juice bottle exploration above. In the table below, individual volumes are listed for each sample along with numerical summaries. You are familiar with the sample mean ($\bar{x}$) as a measure of center and the sample standard deviation ($s$) as a measure of spread. Another measure of spread is the range of the sample, denoted by $R$. The range is the difference between the largest and the smallest values in the sample.

Both the range ($R$) and the sample standard deviation ($s$) can be used to estimate the population standard deviation ($\sigma$). Because the range takes into account only the two extreme values in the sample, it does not capture features of the spread as fully as the standard deviation does. For small samples, however, either $R$ or $s$ can be used to estimate $\sigma$ with approximately the same efficiency. As a result, when dealing with small samples, the range is often preferred over the standard deviation due to ease of computation (Montgomery, 2009).

Compute $R$ for Samples 3 and 6 and enter your answers in the table.

Statistic         Sample 1  Sample 2  Sample 3  Sample 4  Sample 5  Sample 6
Volume (ml)       300       299       298       302       295       296
                  300       297       299       298       296       305
                  299       299       300       300       297       299
                  302       301       301       300       297       298
Mean $\bar{x}$    300.25    299       299.5     300       296.25    299.5
Std. dev. $s$     1.26      1.63      1.29      1.63      0.96      3.87
Range $R$         3         4         ___       4         2         ___
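
The summary rows of the table can be checked with a few lines of code. The sketch below is one possible way to do it in Python using only the standard library; the names `samples`, `summarize`, and `summary` are illustrative choices, not part of the original exercise.

```python
from statistics import mean, stdev

# Volumes (ml) for the six samples, copied from the table above.
samples = {
    "Sample 1": [300, 300, 299, 302],
    "Sample 2": [299, 297, 299, 301],
    "Sample 3": [298, 299, 300, 301],
    "Sample 4": [302, 298, 300, 300],
    "Sample 5": [295, 296, 297, 297],
    "Sample 6": [296, 305, 299, 298],
}

def summarize(values):
    """Return (sample mean, sample standard deviation, range) for one sample."""
    x_bar = mean(values)
    s = stdev(values)              # sample standard deviation (divides by n - 1)
    r = max(values) - min(values)  # largest value minus smallest value
    return x_bar, s, r

summary = {name: summarize(values) for name, values in samples.items()}

for name, (x_bar, s, r) in summary.items():
    print(f"{name}: mean = {x_bar:.2f}, s = {s:.2f}, R = {r}")
```

Running it reproduces the mean and standard deviation rows and also prints the two ranges the exercise asks for, so it is best used to check answers after working them out by hand.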

Observe that the sample mean for Sample 5 is considerably lower than the other sample means, and also lower than the presumed population mean ($\mu = 300$ ml). Observe also that the volumes in Sample 6 are more spread out than the volumes in the other samples, as indicated by the magnitudes of $s$ and $R$. Is this part of the natural variation inherent in any manufacturing process, or is this an indication that the process is no longer in control? The following questions will guide you to the answer.

Recall that this population is normally distributed with mean $\mu = 300$ ml and standard deviation $\sigma$.

Select the most appropriate graph to represent the historically established population distribution.

One diagram does not have the correct center. To decide between the other two, recall that the empirical rule tells us that, in a normal distribution:
  • Approximately 68% of the data values lie within one standard deviation from the mean.
  • Approximately 95% of the data values lie within two standard deviations from the mean.
  • Approximately 99.7% of the data values lie within three standard deviations from the mean.

Now we turn our attention to the distribution of sample means, also known as the sampling distribution. Given that the population is normally distributed with mean $\mu = 300$ ml and standard deviation $\sigma$, find the mean and standard deviation of the sampling distribution for samples of size 4 ($n = 4$).

Use the formulas $\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \sigma/\sqrt{n}$ from the Central Limit Theorem.
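
As a sketch of how these two formulas are applied, the following Python snippet computes the center and spread of the sampling distribution. The function name `sampling_distribution` is illustrative, and the value $\sigma = 2$ ml is a hypothetical choice made only so the arithmetic can be shown; substitute the process's actual standard deviation.

```python
import math

def sampling_distribution(mu, sigma, n):
    """Mean and standard deviation of the sample mean for samples of size n."""
    mu_xbar = mu                        # the sampling distribution keeps the population mean
    sigma_xbar = sigma / math.sqrt(n)   # the spread shrinks by a factor of sqrt(n)
    return mu_xbar, sigma_xbar

MU = 300.0            # ml, target fill volume from the text
SIGMA_ASSUMED = 2.0   # ml, hypothetical value used only for illustration
N = 4                 # bottles per sample

mu_xbar, sigma_xbar = sampling_distribution(MU, SIGMA_ASSUMED, N)
print(f"mean of sample means = {mu_xbar} ml, standard deviation = {sigma_xbar} ml")
```

With $n = 4$ the standard deviation of the sample mean is exactly half the population standard deviation, which is why the sampling distribution is visibly narrower than the population distribution.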

Select the most appropriate graph to represent the sampling distribution for $\bar{x}$.

Now that we know what the sampling distribution should look like, let’s take a look at our samples and see where each sample mean falls within this distribution.

Observe that all sample means, with the exception of Sample 5, are located in the green zone near the center of the graph. The green zone, also known as Zone 1, is located between $\mu_{\bar{x}} - \sigma_{\bar{x}}$ and $\mu_{\bar{x}} + \sigma_{\bar{x}}$. From the Empirical Rule, we know that approximately 68% of data are located in this zone.

The Sample 5 mean is located more than three standard deviations away from the mean of the sampling distribution. From the Empirical Rule, we know that approximately 99.7% of data are located within three standard deviations of the mean. Therefore, the probability of Sample 5, or a more extreme sample, occurring is very near zero. What can this tell us?
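
To put a number on "very near zero", the sketch below uses Python's `statistics.NormalDist` to compute how far the Sample 5 mean sits from the center and how likely such a low sample mean would be for an in-control process. It again assumes a hypothetical $\sigma = 2$ ml purely for illustration, and it reports the lower-tail probability only; the names `coverage`, `z`, and `p_lower` are illustrative.

```python
import math
from statistics import NormalDist

MU = 300.0            # ml, target fill volume from the text
SIGMA_ASSUMED = 2.0   # ml, hypothetical value used only for illustration
N = 4                 # bottles per sample

sigma_xbar = SIGMA_ASSUMED / math.sqrt(N)   # standard deviation of the sample mean
sampling_dist = NormalDist(MU, sigma_xbar)  # distribution of in-control sample means

def coverage(k):
    """Probability that a sample mean lies within k standard deviations of the center."""
    return sampling_dist.cdf(MU + k * sigma_xbar) - sampling_dist.cdf(MU - k * sigma_xbar)

x_bar_5 = 296.25                      # Sample 5 mean from the table
z = (x_bar_5 - MU) / sigma_xbar       # distance from the center, in standard deviations
p_lower = sampling_dist.cdf(x_bar_5)  # P(sample mean <= 296.25) if the process is in control

print(f"within 3 standard deviations: {coverage(3):.4f}")
print(f"z = {z:.2f}, lower-tail probability = {p_lower:.6f}")
```

Under this assumption the Sample 5 mean lies well beyond three standard deviations below the center, so an in-control process would produce a sample this low only very rarely.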

Turning the normal distribution on its side

To monitor a process over time, we will mark times (or sample numbers) along the horizontal axis and the relevant sample statistic along the vertical axis. Just as we marked the green, yellow, and red zones under the bell curve in the diagram above, we will mark green, yellow, and red zones (Zones 1, 2, and 3) on the chart to help us identify the zone where each sample mean falls. We can even mentally add a bell curve to help us visualize the situation, as shown for Sample 1.

The diagram above is an example of a Control Chart. Control charts are used by technicians and engineers supervising manufacturing processes to track sample means and spread in order to quickly detect a process no longer in control.
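
The zone bookkeeping behind such a chart can be sketched in a few lines of Python. The snippet below assumes each zone is one standard deviation of the sample mean wide, with the control limits three standard deviations from the center, and it uses a hypothetical $\sigma = 2$ ml (so $\sigma_{\bar{x}} = 1$ ml); the function name `zone` and these numeric choices are illustrative assumptions.

```python
def zone(x_bar, center, sigma_xbar):
    """Classify a sample mean by its distance from the center line."""
    z = abs(x_bar - center) / sigma_xbar
    if z <= 1:
        return "Zone 1 (green)"
    if z <= 2:
        return "Zone 2 (yellow)"
    if z <= 3:
        return "Zone 3 (red)"
    return "outside control limits"

CENTER = 300.0            # ml, center line of the chart
SIGMA_XBAR_ASSUMED = 1.0  # ml, hypothetical: sigma = 2 ml with samples of size 4

# Sample means from the table, in sampling order.
sample_means = [300.25, 299, 299.5, 300, 296.25, 299.5]

zones = [zone(x, CENTER, SIGMA_XBAR_ASSUMED) for x in sample_means]
for number, (x, label) in enumerate(zip(sample_means, zones), start=1):
    print(f"Sample {number}: mean = {x} ml -> {label}")
```

Under these assumptions every sample mean except Sample 5 lands in Zone 1, and Sample 5 falls outside the control limits, matching the description above.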

The following is a list of some red flags that technicians look for in a control chart:

  • A point outside of control limits (e.g. Sample 5). Such a sample is highly unlikely if the process is functioning properly, and indicates that the process is out of control.
  • Cyclic patterns may point to environmental fluctuations (e.g. cooler temperatures at night and higher temperatures during the day), or operator fatigue. Such patterns indicate a presence of non-random factors that need to be investigated.
  • A continuous upward/downward trend may indicate gradual equipment wear.
  • A sudden shift in the height of the points signals a shift in the mean. This may correspond to introduction of new workers or a change in raw materials.

This list will give rise to a set of rules for interpreting control charts in the next section.
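
Two of these red flags are easy to check mechanically. The sketch below is a minimal illustration, not the formal rule set: `points_outside_limits` and `longest_trend` are hypothetical helper names, the control limits assume $\sigma_{\bar{x}} = 1$ ml for illustration, and the drifting series is invented data used only to exercise the trend check.

```python
def points_outside_limits(values, lcl, ucl):
    """Return 1-based positions of points below lcl or above ucl."""
    return [i for i, v in enumerate(values, start=1) if v < lcl or v > ucl]

def longest_trend(values):
    """Length of the longest strictly increasing or strictly decreasing run."""
    if not values:
        return 0
    best = up = down = 1
    for prev, cur in zip(values, values[1:]):
        up = up + 1 if cur > prev else 1        # extend or reset the upward run
        down = down + 1 if cur < prev else 1    # extend or reset the downward run
        best = max(best, up, down)
    return best

# Sample means from the juice example; limits assume sigma_xbar = 1 ml (hypothetical).
sample_means = [300.25, 299, 299.5, 300, 296.25, 299.5]
LCL, UCL = 300.0 - 3 * 1.0, 300.0 + 3 * 1.0

outside = points_outside_limits(sample_means, LCL, UCL)
print("points outside control limits:", outside)

# Invented series that drifts upward, as gradual equipment wear might cause.
drifting = [300.1, 300.3, 300.4, 300.8, 301.1, 301.5, 301.2]
trend_length = longest_trend(drifting)
print("longest steady trend:", trend_length, "points")
```

How long a trend must be before it counts as a warning is a matter of the chosen rule set, so the snippet only reports the run length and leaves the threshold to the reader.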

Control Charts

To make sure that a process is in control, it is essential to monitor variability as well as means. Earlier, we presented a control chart for sample means. This kind of chart is called an X-bar chart. To monitor variability, either the range ($R$) or the sample standard deviation ($s$) can be used. As discussed earlier, $R$ is much easier to compute than $s$ and produces similar results for small samples. As a result, R-charts, which monitor the range, are used more often than s-charts.
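
The claim that $R$ and $s$ give similar estimates of $\sigma$ for small samples can be illustrated with the first four juice samples. The sketch below uses the standard bias-correction constants for samples of size 4, $d_2 = 2.059$ and $c_4 = 0.9213$, to estimate $\sigma$ as $\bar{R}/d_2$ and as $\bar{s}/c_4$. Treating Samples 1 through 4 as the in-control data is an assumption made for this illustration.

```python
from statistics import mean, stdev

D2 = 2.059    # standard constant for n = 4: expected range divided by sigma
C4 = 0.9213   # standard constant for n = 4: expected sample std. dev. divided by sigma

# Samples 1-4 from the juice example, assumed here to represent the in-control process.
in_control = [
    [300, 300, 299, 302],
    [299, 297, 299, 301],
    [298, 299, 300, 301],
    [302, 298, 300, 300],
]

r_bar = mean(max(s) - min(s) for s in in_control)   # average range
s_bar = mean(stdev(s) for s in in_control)          # average sample standard deviation

sigma_from_r = r_bar / D2   # range-based estimate of sigma
sigma_from_s = s_bar / C4   # standard-deviation-based estimate of sigma

print(f"sigma estimated from R: {sigma_from_r:.2f} ml")
print(f"sigma estimated from s: {sigma_from_s:.2f} ml")
```

The two estimates come out close to each other for these samples, while the range-based one needs only a subtraction per sample.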

Let’s return to our juice bottle example to examine what the X-bar chart and the R-chart tell us when taken together. The following charts were generated in RStudio using the qcc package.

This example highlights the reason why X-bar charts and R-charts should be used together. The X-bar chart was sufficient to spot the problem with Sample 5. The problem with Sample 6, however, would have been missed if we had used the X-bar chart alone. In the next section we will focus on constructing and interpreting X-bar charts.
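
The same conclusion can be reproduced numerically. The sketch below is plain Python rather than the qcc code behind the charts, and it makes one assumption loudly: the control limits are estimated from Samples 1 through 4 only, treated as the in-control baseline, and then all six samples are checked against them. It uses the standard control chart constants for samples of size 4, $A_2 = 0.729$, $D_3 = 0$, and $D_4 = 2.282$.

```python
from statistics import mean

A2, D3, D4 = 0.729, 0.0, 2.282   # standard control chart constants for n = 4

samples = [
    [300, 300, 299, 302],
    [299, 297, 299, 301],
    [298, 299, 300, 301],
    [302, 298, 300, 300],
    [295, 296, 297, 297],
    [296, 305, 299, 298],
]
BASELINE = 4   # assumption: the first four samples define the in-control limits

x_bars = [mean(s) for s in samples]
ranges = [max(s) - min(s) for s in samples]

x_bar_bar = mean(x_bars[:BASELINE])   # center line of the X-bar chart
r_bar = mean(ranges[:BASELINE])       # center line of the R-chart

xbar_lcl, xbar_ucl = x_bar_bar - A2 * r_bar, x_bar_bar + A2 * r_bar
r_lcl, r_ucl = D3 * r_bar, D4 * r_bar

# 1-based sample numbers that fall outside each chart's limits.
xbar_flags = [i for i, x in enumerate(x_bars, start=1) if not xbar_lcl <= x <= xbar_ucl]
r_flags = [i for i, r in enumerate(ranges, start=1) if not r_lcl <= r <= r_ucl]

print(f"X-bar chart limits: {xbar_lcl:.2f} to {xbar_ucl:.2f} ml, flagged: {xbar_flags}")
print(f"R-chart limits:     {r_lcl:.2f} to {r_ucl:.2f} ml, flagged: {r_flags}")
```

With this baseline, the X-bar chart flags only Sample 5 and the R-chart flags only Sample 6. The choice of baseline matters: limits estimated from a different set of samples, or from a known $\sigma$, can flag different points.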

References

Montgomery, D. C. (2009). Statistical quality control: A modern introduction (6th ed.). Wiley.