Every accredited calibration laboratory is familiar with the term ‘proficiency testing.’ In basic terms, it refers to a method of evaluating how well a laboratory performs a particular calibration with respect to its claimed measurement uncertainty, and how its results compare with those of other participants.
Accrediting agencies require their laboratories to participate in a number of proficiency tests each year so that all of the items on their scope of accreditation are covered over a specified period of time. If a laboratory fails to take these tests, or its results are not up to par, a particular item may be withdrawn from its accredited scope or its measurement uncertainty claim may be adjusted to suit. In severe cases, the laboratory’s accreditation may be withdrawn altogether.
These tests are not cheap, and when staff time is figured in they can be a costly burden for small labs. Along with the other costs of accreditation, they are why labs that are not accredited can get away with lower prices.
A number of agencies offer this service to labs, for a fee of course, and they are referred to as ‘providers’ of proficiency testing or ‘PT providers.’ Some are accredited in their own right and some are not, but in the end the procedure is the same.
Let’s say a laboratory is accredited to calibrate plain plug gages. A PT provider will have a schedule showing when that test is available so the lab orders the test. In due course, the PT provider sends a selection of plain plug gages to the lab with instructions on where the measurements are to be taken on each gage.
Once a number of labs have submitted their results to the PT provider, the number crunching begins and results are supplied to each lab that participated in the test. I’m not steeped in statistical lore so I won’t comment on the data processing aspect of the test. Each participant is advised what their set of numbers is but the other participants are not identified. Some might say this is to protect the guilty but I won’t go there either. However, the range of the results can be a good indicator of the state of the craft.
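For readers curious about what the number crunching can look like, one statistic widely used in calibration proficiency tests is the normalized error, E_n. It divides the difference between a lab’s reading and the reference value by the combined expanded uncertainties of the two, and a result within plus or minus 1 is conventionally treated as satisfactory. Whether a given PT provider uses this statistic is an assumption here. The sketch below is only an illustration: the function name, lab labels, and gage values are made up and do not come from any real test.

```python
import math


def en_score(lab_value, lab_U, ref_value, ref_U):
    """Normalized error E_n.

    All four arguments must be in the same unit. lab_U and ref_U are
    expanded uncertainties (typically k=2), not standard uncertainties.
    """
    return (lab_value - ref_value) / math.sqrt(lab_U**2 + ref_U**2)


# Hypothetical plain plug gage diameters in millimetres (illustrative only).
reference_value, reference_U = 12.70010, 0.00025
labs = {
    "Lab A": (12.70020, 0.00050),  # (reported value, claimed expanded uncertainty)
    "Lab B": (12.70095, 0.00040),
}

for name, (value, U) in labs.items():
    en = en_score(value, U, reference_value, reference_U)
    verdict = "satisfactory" if abs(en) <= 1.0 else "unsatisfactory"
    print(f"{name}: E_n = {en:+.2f} ({verdict})")
```

Note that the claimed uncertainty sits in the denominator, so a lab that claims a tighter uncertainty than it can actually deliver is more likely to fail.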
Within a group of labs being tested, it is not uncommon for the results of one or more of them to be kicked out of the statistical analysis as they are considered ‘outliers.’ This means it is assumed their readings are off the wall and shouldn’t be counted in with the others because theirs would skew the other results unfairly.
This is usually a safe assumption because, while this sub-group may follow the same procedures as the rest in a general way, they just may not know how to use their equipment properly. In this situation, the lab’s accrediting agency will require them to determine why their results are outside the norm and explain how they will fix it. They will then be required to prove that their ‘fix’ did the job, and this usually means repeating the test.
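The outlier screening described above can be sketched in a few lines. This is one common robust approach, not necessarily what any particular PT provider does: it measures each reading’s distance from the group median in units of the scaled median absolute deviation (MAD), so a wild reading cannot hide by inflating the spread it is judged against. The cutoff of 3, the function name, and the readings are all assumptions made for illustration.

```python
import statistics


def flag_outliers(readings, threshold=3.0):
    """Return {lab: robust z-score} for readings far from the group median.

    threshold=3.0 is an assumed cutoff, not a value taken from any PT scheme.
    """
    values = list(readings.values())
    center = statistics.median(values)
    mad = statistics.median(abs(v - center) for v in values)
    # Scaling MAD by 1.4826 makes it comparable to a standard deviation
    # when the data are roughly normally distributed.
    spread = 1.4826 * mad
    if spread == 0:
        return {}  # more than half the readings are identical; nothing to scale by
    scores = {lab: (v - center) / spread for lab, v in readings.items()}
    return {lab: z for lab, z in scores.items() if abs(z) > threshold}


# Hypothetical deviations from nominal size, in micrometres (illustrative only).
readings = {
    "Lab A": 0.10, "Lab B": 0.20, "Lab C": 0.15,
    "Lab D": 0.05, "Lab E": 0.25, "Lab F": 1.40,
}

for lab, z in flag_outliers(readings).items():
    print(f"{lab} flagged as an outlier (robust z = {z:.1f})")
```

A check like this only says that a reading disagrees with the group. It says nothing about whether the group itself is right.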
I do have some reservations about tests of this type that I have participated in over the last twenty or thirty years. Most of these tests do not indicate what equipment and/or masters were used by the participants. This information could help explain some of the ‘outliers.’
Similar studies conducted by the American Measuring Tool Manufacturers Association listed the equipment used by each participant, so when someone’s numbers appeared to be from outer space, you could understand why. This enabled follow-up tests to help determine where the largest source of variation came from for the participants. Another benefit of their tests was NIST’s participation, so if you didn’t believe how your numbers got massaged, you could compare your readings to those from NIST for better or worse.
Being the wide-eyed radical that I am, I often wonder what would happen if NIST participated in a typical PT provider’s test on an undercover basis. I have come to the conclusion that there would be a number of those tests where NIST numbers would be considered ‘outliers.’