Friday, May 22, 2009

How To Measure the Accuracy of a Counting Machine

Someone should point out to the Comelec that a 20,000 optical mark test of their proposed Automated Counting Machines implies a plus or minus 0.7 percent Statistical Margin of Error on the measured reading accuracy. This statistical imprecision of 0.7 percent is much too large for acceptance testing of equipment expected to operate at 99.995% levels of reading performance. Put another way, 20,000 test marks is too few, because with so few we cannot discriminate compliant from noncompliant counting machines. For example, if one counting machine is tested with 20,000 test marks and it makes a single error, the best we can say is that it has a rated accuracy of 19,999 out of 20,000 or 99.995 percent, plus or minus 0.7 percent (at the 95 percent confidence level). In other words, with only 20,000 test marks, such a machine could have an accuracy as low as 99.995% minus 0.7%, or 99.295%, which is below the acceptable spec level, or it could be perfectly acceptable at 100%.

The Commission on Elections has announced that some 80,000 Counting Machines to be used in the 2010 national elections must be capable of 99.995% reading accuracy, which means that for any batch of 20,000 optical marks the Counting Machine attempts to interpret, it is not expected to make more than one error.

The Precinct Count Optical Scan (PCOS) machine to be used in the balloting will be tested this week, according to Comelec spokesperson James Jimenez.

The accuracy requirement of 99.995 percent means a threshold of “one error out of 20,000 markings,” Jimenez said. If it falsely reads two or more ballot markings, the machine will be rejected, he said.

Jimenez said the accuracy rate was based on the number of ballots that would be fed into the voting and counting machine. Each PCOS equipment is expected to process about 1,000 ballots with 35,000 markings, he said.

This Margin of Error depends mainly on the Sample Size, or in this case the number of test marks. Intuitively, the Margin of Error goes down as the number of test marks goes up. It is the exact same animal as the plus or minus 2.8 percent (often rounded to 3 percent) margin of error in the standard 1200-respondent SWS survey, where the margin of error likewise goes down as the number of respondents goes up. Both come from the same formula: the Margin of Error is plus or minus the reciprocal of the square root of the number of survey respondents or test marks.
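
The rule of thumb above can be checked in a few lines of Python. This is only a sketch of the reciprocal-square-root formula as stated in this post; the function name `margin_of_error` is my own label for illustration and does not come from Comelec or SWS.

```python
import math

def margin_of_error(n):
    """Rule-of-thumb margin of error: 1 / sqrt(sample size), as a fraction."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return 1.0 / math.sqrt(n)

# Compare the SWS survey size with the proposed Comelec test size.
for n in (1200, 20000):
    print(f"{n:>6} samples: plus or minus {100 * margin_of_error(n):.2f} percent")
```

With this shortcut the 1200-respondent case comes out a little under 2.9 percent. The commonly quoted 2.8 percent corresponds to multiplying by 0.98 (that is, 1.96 times 0.5) instead of 1, a small difference that does not change the argument.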

The correct sample size depends on the precision desired and how strictly we want to run the test. Careful calculation should now be done by Comelec as to what that correct number of test marks is. I can tell you it is far more than 20,000! [More in a subsequent post!]

An excellent standard statistical reference on accuracy and precision in statistical quality control testing, sample size and margin of error is Intermediate Statistics for Dummies, which explains things mostly in plain English.

Question: What is the required number of test marks to measure the reading accuracy of a given machine with a Margin of Error of, say, 0.0025 percent (half the last digit of precision in the spec of 99.995 percent)?
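
As a rough way to explore this question, the rule of thumb can be inverted: if the margin is 1 / sqrt(n), then n is 1 / margin squared. The sketch below takes that rule at face value, which is an assumption and not a full sample-size calculation; `required_test_marks` is a hypothetical helper name.

```python
import math

def required_test_marks(margin):
    """Invert margin = 1 / sqrt(n) to get n = 1 / margin**2, rounded up.

    `margin` is a fraction, so 0.0025 percent is passed as 0.0025 / 100.
    """
    if margin <= 0:
        raise ValueError("margin must be positive")
    return math.ceil(1.0 / margin ** 2)

target = 0.0025 / 100  # 0.0025 percent expressed as a fraction
print(f"{required_test_marks(target):,} test marks")
```

Under this shortcut the answer is on the order of 1.6 billion test marks, compared with the 20,000 proposed.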


GabbyD said...

how did you calculate .7%?

DJB Rizalist said...

Gabby, 0.7% or 0.007 is the reciprocal of the square root of 20,000. This is the general formula that I use to estimate "Margin of Error" based on sample size.

DJB Rizalist said...

A simple general rule of thumb relates the so-called Statistical Margin of Error to the sample size of respondents in a survey or the number of test points in a test of election counting machines. The Margin of Error is equal to the reciprocal of the square root of the number of respondents or test points. For 1200 respondents in SWS surveys, the margin of error is plus or minus 2.8 percent (often rounded up to 3 percent in news reporting). For the 20,000 proposed test points, the margin of error, or statistical precision of the measurement of accuracy, is 0.7 percent. This is much too coarse: it can only distinguish between a counting machine with accuracy equal to 99.995% and one with 99.295%. Way too coarse!

GabbyD said...

three comments:
1) you may have to be more careful in calculating the standard error at the 95% CI

the standard error of the average is affected by the population standard deviation and the sample size.

effectively, SWS is assuming that the pop standard deviation is root(.5*.5)=.5, from the assumption that it's a binomial probability. [standard deviation of a single binomial trial is root(p*(1-p))]

2) the interpretation of CI is:

if i run 100 sets of 100 ballots on a machine, about 95 of the 100 sets will have an average success rate within the margin of error of 99.995%.

3) let's say that jimenez is correct: the point estimate is 99.995%. note that ANY standard error would mean the min bound of the CI is going to be less than 99.995%.

i wonder therefore, if jimenez means that the point estimate is in fact 99.995%.
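
The first point above, that the standard error depends on p as well as on the sample size, can be made concrete with a short sketch. It uses the textbook normal-approximation margin z * sqrt(p*(1-p)/n) with z = 1.96 for a 95% interval; the function name `binomial_margin` is made up for illustration. The approximation is crude when p is very close to 1 (with p = 0.99995 and n = 20,000 only about one error is expected), so treat the second figure as indicative only.

```python
import math

def binomial_margin(p, n, z=1.96):
    """Normal-approximation margin of error for a proportion p over n trials."""
    if not 0.0 <= p <= 1.0 or n <= 0:
        raise ValueError("need 0 <= p <= 1 and n > 0")
    return z * math.sqrt(p * (1.0 - p) / n)

# Survey case: p = 0.5 is the worst case, as assumed for the SWS figure.
print(f"survey,  p=0.5,     n=1200:  {100 * binomial_margin(0.5, 1200):.2f} percent")
# Machine case: p taken from the 99.995% spec, n from the proposed test.
print(f"machine, p=0.99995, n=20000: {100 * binomial_margin(0.99995, 20000):.4f} percent")
```

The survey case reproduces the familiar figure of about 2.8 percent, while the machine case comes out far smaller than 0.7 percent because p*(1-p) is tiny when p is near 1.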

DJB Rizalist said...

You have a deeper understanding of the margin of error than most, but the simple point is that at 20,000 points, your statistical precision is plus or minus 0.7 percent. That means you can only "bin" the results into boxes of about that width. But if you want to tell the difference between a machine accuracy of 99.995% (acceptable level) and 99.990% (not an acceptable accuracy level), then your statistical test had better have a precision of plus or minus 0.0025% (half the difference you are trying to detect), in order for your claim to be valid that you could reject any machine that falls below the spec.

Yet at 0.7% precision the marks on the ruler are "too fat".