Tax - Statistical tests - Vital statistics

The Inland Revenue may be querying accounts as a result of incorrect statistical analysis, according to Duncan Williamson.

Has a client of yours ever been the subject of an Inland Revenue query that has been based on the Chi Squared statistical test? Have you ever come across Benford's Law?

The Revenue uses the Chi Squared test to check whether taxpayers are acting honestly, or whether they may be falsifying their accounts. This is because the Chi Squared test is a statistical method for comparing expected and actual results. (See Panel 1 for an example exploring the idea that employees may be taking more sick leave than expected.) However, through research I discovered that the Revenue seems to be applying the Chi Squared test without reference to Benford's Law - a mathematical technique for fine-tuning what is meant by expected. This is a potentially problematic approach when analysing business accounts.

Benford's Law

Benford's Law was developed by Simon Newcomb late in the 19th century, forgotten about when he died and then revived, independently, by Frank Benford around 1938. Benford's Law tells us that there is a pattern to the occurrence of the leading digits in a set of numbers. (Note: in the number 43.82 the first leading digit is four, the second leading digit is three and so on.)

As explained in the New York Times in 1998, Benford's Law is surprising.

"Given a string of at least four numbers … the chance that the first digit will be one is not one in nine, as many people would imagine; according to Benford's Law, it is 30.1%, or nearly one in three. The chance that the first number in the string will be two is only 17.6%, and the probabilities that successive numbers will be the first digit decline smoothly up to nine, which has only a 4.6% chance."


The probabilities of occurrence also vary (but less dramatically) in terms of the second leading digit, but for the third and fourth leading digits, there is a virtually equal chance of one, two or any other digit occurring.
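The percentages quoted above can be reproduced with a few lines of Python. This is only a sketch. It relies on the standard statement of Benford's Law, under which the first digit d occurs with probability log10(1 + 1/d), and on the usual extension to later digit positions. The function name `benford_probability` is illustrative and is not taken from any particular tool.

```python
import math

def benford_probability(digit: int, position: int) -> float:
    """Probability that `digit` appears at a leading-digit position (1 = first)."""
    if position == 1:
        if digit == 0:
            return 0.0  # a first leading digit is never zero
        return math.log10(1 + 1 / digit)
    # For later positions, sum over every possible run of preceding digits.
    start = 10 ** (position - 2)
    stop = 10 ** (position - 1)
    return sum(math.log10(1 + 1 / (10 * k + digit)) for k in range(start, stop))

# Print the percentage chance of each digit 0 to 9 in the first four positions.
for position in (1, 2, 3, 4):
    row = [round(100 * benford_probability(d, position), 1) for d in range(10)]
    print(position, row)
```

The first row starts at roughly 30.1% for one and falls to about 4.6% for nine, while the rows for the third and fourth positions are close to 10% for every digit.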

The Inland Revenue's analysis

The Revenue uses a spreadsheet-based data interrogation tool called IDEA, which has the Chi Squared test built into it. What the Revenue does with sales revenue data is to:

•   take the sales values: daily, weekly or whatever basis they have;

•   tabulate them;

•   extract the leading or test digit;

•   create a frequency distribution of the test digits; and

•   apply the Chi Squared test.

The table in Panel 2 shows an extract from tabulated sales data and how the Revenue takes what it calls the test digits. The data is based on a real case involving a small retail business. Note that the Revenue works from right to left when extracting the test digits, ie pence, then tens of pence, then pounds and so on.

When it has extracted all of the test digits it derives a frequency distribution of the pence test digits, then the tens of pence digits and so on. Panel 3 gives an example of a table that the Revenue would prepare for a tens of pence test digit analysis. It shows that the business in question worked for 291 days in the year and that, of the tens of pence sales values that the business has returned, 145 of them are zeros, 14 of them are ones, 21 are twos and so on.
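The right-to-left extraction and the frequency count can be sketched as follows. This is an illustration only, not the Revenue's IDEA routine, whose internals are not described here. The sample values are the five sales figures shown in Panel 2, and the function name `digits_from_right` is hypothetical.

```python
from collections import Counter

# Sample sales values from Panel 2, kept as strings so that no pence are lost.
sales = ["154.00", "79.50", "125.75", "298.31", "1756.93"]

def digits_from_right(value: str, count: int = 4) -> list[int]:
    """Last `count` digits in reading order:
    tens of pounds, pounds, tens of pence, pence."""
    digits = [int(ch) for ch in value if ch.isdigit()]
    return digits[-count:]

# Tabulate the test digits for each sales value.
table = {value: digits_from_right(value) for value in sales}

# Frequency distribution of the tens of pence test digit (third of the four).
tens_of_pence = Counter(row[2] for row in table.values())

for value, row in table.items():
    print(value, row)
print(dict(tens_of_pence))
```

Counting from the right means that any digit beyond the tens of pounds, such as the leading one in 1756.93, never enters the analysis.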

The Revenue assumes that all digits have an equal chance of occurring.

It uses its IDEA tool and finds that there is a one in 3.756 × 10^105 chance of the client's results being a true representation of reality. From what we know of the Revenue's application of IDEA, we can conclude that:

•   the Revenue has applied the Chi Squared test but has failed to apply Benford's Law; and

•   the Revenue found that, on the basis of its analysis of the tens of pence column in the retailer's sales records, the retailer is likely to have falsified his records.
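The Revenue-style calculation on the Panel 3 figures can be sketched as below. It is a generic Chi Squared goodness-of-fit calculation under the equal-chance assumption, not a reproduction of IDEA itself. The critical value of 16.92 is the standard 5% table value for nine degrees of freedom and does not come from the Revenue's working papers.

```python
# Tens of pence digit counts from Panel 3, for digits 0 to 9.
actual = [145, 14, 21, 15, 12, 15, 11, 25, 11, 22]

total = sum(actual)             # 291 values in all
expected = total / len(actual)  # 29.1 per digit if every digit is equally likely

# Chi Squared statistic: sum of (actual - expected)^2 / expected.
chi_squared = sum((a - expected) ** 2 / expected for a in actual)

# Standard 5% critical value for 9 degrees of freedom (ten digits minus one).
CRITICAL_5_PERCENT_DF9 = 16.92

print(f"expected per digit = {expected:.1f}")
print(f"chi squared = {chi_squared:.2f}")
print("flagged" if chi_squared > CRITICAL_5_PERCENT_DF9 else "not flagged")
```

Almost all of the statistic comes from the 145 zeros, which is exactly what one would see if takings were routinely rounded to whole pounds.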

Problem with the numbers

The basic problem with the Revenue's analysis is that it fails to appreciate that sales data comprises numbers of varying length. That is, some sales values are as low as 35 and as high as 3,543. The entire sales data series comprises the following:

•   28 sales values are four digits long;

•   203 are five digits long; and

•   52 are six digits long.

Therefore by working from right to left instead of left to right, the Revenue could be said to be analysing the data incorrectly, which would mean its conclusions are invalid.

If the Revenue had started its analysis properly, it would have applied Benford's Law and worked from left to right. The table of test digits would then look like that in Panel 4. Applying Benford's Law properly to the retail outlet's sales data gives the expected frequencies shown in Panel 5. Interpreting the results, there does appear to be a problem with the first digits; the second digits seem fine; and the third and fourth digits are possibly problematic.
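A left-to-right version of the test can be sketched as follows. The actual frequencies are those in Panel 5. The expected frequencies are computed from the standard Benford formula rather than copied from the panel. The function names are illustrative, and the critical values (15.51 for eight degrees of freedom, 16.92 for nine) are ordinary 5% table values, assumed here as a reasonable threshold.

```python
import math

def digits_from_left(value: str) -> list[int]:
    """Digits of a sales value in reading order, ignoring the decimal point."""
    return [int(ch) for ch in value if ch.isdigit()]

def benford_probability(digit: int, position: int) -> float:
    """Benford probability of `digit` at a leading-digit position (1 = first)."""
    if position == 1:
        return 0.0 if digit == 0 else math.log10(1 + 1 / digit)
    start, stop = 10 ** (position - 2), 10 ** (position - 1)
    return sum(math.log10(1 + 1 / (10 * k + digit)) for k in range(start, stop))

# Actual frequencies from Panel 5, indexed by digit 0 to 9, keyed by position.
actual = {
    1: [0, 148, 79, 22, 7, 9, 4, 4, 6, 4],
    2: [32, 31, 42, 33, 34, 23, 22, 19, 26, 21],
    3: [46, 27, 29, 20, 23, 38, 35, 20, 21, 24],
    4: [135, 12, 21, 16, 11, 17, 10, 25, 16, 20],
}

def chi_squared_vs_benford(counts: list[int], position: int) -> float:
    """Chi Squared statistic of observed counts against Benford expectations."""
    total = sum(counts)
    statistic = 0.0
    for digit, observed in enumerate(counts):
        expected = total * benford_probability(digit, position)
        if expected > 0:  # digit 0 cannot occur in the first position
            statistic += (observed - expected) ** 2 / expected
    return statistic

# Standard 5% critical values: 8 degrees of freedom for the first digit
# (nine possible digits), 9 for later positions (ten possible digits).
CRITICAL_5_PERCENT = {8: 15.51, 9: 16.92}

results = {}
for position, counts in actual.items():
    df = 8 if position == 1 else 9
    results[position] = chi_squared_vs_benford(counts, position)
    flag = "flag for follow-up" if results[position] > CRITICAL_5_PERCENT[df] else "no flag"
    print(f"digit position {position}: chi squared = {results[position]:.2f} ({flag})")
```

On these figures the first, third and fourth positions exceed the threshold and the second does not. A flag at this stage is a prompt for further questions, not a finding.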

First level analysis

At this point we must remind ourselves that Benford's Law is a first level of analysis. We say that Benford has merely flagged potential problems and that we would need to work with the client to see whether we really do have problems. The Revenue, on the other hand, finds discrepancies and immediately assumes that there is a fraudulent or negligent return.

The Revenue ought to be cautious. Clients may, at the end of a day or week, round up or down their takings to the nearest ten pence. This is not accurate; but the level of any error over the course of a year is not significant.

Similarly, we could consider a newsagent, where almost every paperback, CD, DVD or video will have a price that ends in 99p. Most, if not all, newspapers have prices that end in zero. Sandwiches tend to end in a zero.

Chocolate bars tend to end in zero or five. The average purchaser may buy, say, sandwiches, a chocolate bar and a paper: and the average purchase will end in zero or 5p.

Therefore, we might fully expect a retailer to have significant bias in the frequency distribution of test digits where many sales values end in 5p or 10p, and there may be significant duplication in the sales results.

Second level analysis

Having carried out our level one investigation and assuming there may still be problems, we would investigate those problems. Two basic possibilities are that:

•   the client really has cooked the books, or

•   there are factors at play that mean that even though Benford flags a problem, there really is no problem.

Leaving the first possibility aside, let's pursue the second.

The first digit discrepancies are potentially the most serious errors that Benford has flagged, but analysing the sales data for the business indicates there may be mitigating circumstances. When we analyse the first digit day by day, a significant pattern emerges. In the early part of just about any week in the year, Monday, Tuesday and Thursday are the days on which sales are most likely to result in a value beginning with one. In this case that generally means a value in the range 100-199.99 rather than 10-19.99 or 1,000-1,999.99.

As we saw in Panel 5, 148 (or just over half) of the first digits are represented by the number one. What we need to know from the client now is whether it is true that about half of his daily sales values are in the range of 100-199.99. This is quite feasible. Sales for this business are not distributed evenly across the week. In fact, two thirds of the business's sales are made on Saturdays.
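A day-by-day breakdown of this kind can be sketched as below. The figures in `daily_sales` are HYPOTHETICAL and were invented purely to show the grouping step. They are not the retailer's records, and the function name `first_digit` is illustrative.

```python
from collections import Counter, defaultdict

# HYPOTHETICAL (weekday, takings) pairs, invented only to show the grouping step.
daily_sales = [
    ("Mon", "142.10"), ("Tue", "118.40"), ("Wed", "86.25"),
    ("Thu", "163.00"), ("Fri", "274.90"), ("Sat", "1210.35"),
    ("Mon", "171.55"), ("Tue", "95.80"), ("Wed", "104.20"),
    ("Thu", "129.95"), ("Fri", "312.60"), ("Sat", "1388.75"),
]

def first_digit(value: str) -> int:
    """First non-zero digit of a sales value, reading left to right."""
    return next(int(ch) for ch in value if ch in "123456789")

# Count first digits separately for each day of the week.
by_day = defaultdict(Counter)
for day, value in daily_sales:
    by_day[day][first_digit(value)] += 1

for day, counts in by_day.items():
    share_of_ones = counts[1] / sum(counts.values())
    print(f"{day}: {share_of_ones:.0%} of values begin with one")
```

If a real breakdown shows ones concentrated on particular trading days, the excess flagged by Benford may simply reflect the normal level of takings on those days.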

Rounding impact

If the retailer has rounded off the sales figures, this could affect the Chi Squared test and Benford's Law. Rounding up will cause the loss of the original tens of pence digit value and the loss of the related units of pounds value: that is if we round up 15.75 to 16.00, there is an impact on the values of the 2nd place digit and the 3rd place digit.

Rounding down will cause the loss of the original tens of pence digit value but not the loss of the related units of pounds value: that is if we round down 15.25 to 15.00, we can see that there is no impact on the value of the 2nd place digit but there is an impact on the 3rd place digit.
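The two rounding examples can be checked with a short sketch using Python's `decimal` module. The helper names `leading_digits` and `changed_positions` are illustrative.

```python
from decimal import Decimal, ROUND_CEILING, ROUND_FLOOR

def leading_digits(amount: Decimal) -> list[int]:
    """Digits of an amount in reading order, shown to two decimal places."""
    return [int(ch) for ch in f"{amount:.2f}" if ch.isdigit()]

def changed_positions(before: Decimal, after: Decimal) -> list[int]:
    """1-based digit positions that differ between two amounts of equal length."""
    pairs = zip(leading_digits(before), leading_digits(after))
    return [i + 1 for i, (b, a) in enumerate(pairs) if b != a]

# Rounding 15.75 up to the next whole pound.
original_up = Decimal("15.75")
rounded_up = original_up.to_integral_value(rounding=ROUND_CEILING)

# Rounding 15.25 down to the whole pound below.
original_down = Decimal("15.25")
rounded_down = original_down.to_integral_value(rounding=ROUND_FLOOR)

up_changes = changed_positions(original_up, rounded_up)
down_changes = changed_positions(original_down, rounded_down)

print(f"{original_up} -> {rounded_up:.2f}: positions changed {up_changes}")
print(f"{original_down} -> {rounded_down:.2f}: positions changed {down_changes}")
```

Rounding up alters the second and third digits, whereas rounding down leaves the second digit alone. In both of these examples the fourth (pence) digit also becomes zero.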

Even if the overall impact of rounding up and down is zero over an entire financial year, Benford's Law will discover that something unusual has happened to the data. However, provided this is the sole explanation necessary for these digits, I would advise the Revenue to drop its claim of irregularity against a client under these circumstances, purely on the basis of materiality.

Misplaced effort

The Inland Revenue is trying to ensure that its clients are honest in all of their dealings with it. The combined Chi Squared/Benford's Law analysis is potentially an excellent way of keeping us all under control, but only when used properly. The Revenue has developed a spreadsheet-based routine that carries out the Chi Squared analysis and derives a conclusion as to whether expected and actual values are in harmony. However, the Revenue has failed to appreciate the nature, meaning and application of Benford's Law in such situations.

There are possibly many cases currently under review by the Revenue whose foundation is the analysis we have been discussing here. It would be wise for the Revenue to review such work and adjust both its IDEA tool and its follow up analysis.

PANEL 1: CHI SQUARED TEST EXAMPLE

Actual Sick      Expected Sick    Chi Squared Values:
Leave (days)     Leave (days)     (Actual - Expected)^2 / Expected
95               67.9             10.82
47               55.0             1.16
18               37.1             9.83
143              170.1            4.32
146              138.0            0.46
112              92.9             3.93
                                  Total = 30.52

The Total Chi Squared value (30.52) is compared with the relevant number in a Chi Squared table, in this case 5.99; and because 30.52 is greater than 5.99, we can conclude that there is something unexpected about the relationship between the actual and the expected sick leave days.
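The Panel 1 arithmetic can be reproduced with the sketch below. It assumes that the six actual values form a table of two rows and three columns, with each expected value derived from its row and column totals. That layout is an inference: it matches the expected figures shown and the 5.99 critical value (two degrees of freedom at 5%), but the panel does not state it.

```python
# Actual sick leave days from Panel 1, read as two rows of three columns.
# ASSUMPTION: this layout is inferred from the expected values in the panel.
actual = [[95, 47, 18], [143, 146, 112]]

row_totals = [sum(row) for row in actual]
col_totals = [sum(col) for col in zip(*actual)]
grand_total = sum(row_totals)

# Expected value for each cell = row total x column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi Squared statistic: sum of (actual - expected)^2 / expected over all cells.
chi_squared = sum(
    (a - e) ** 2 / e
    for actual_row, expected_row in zip(actual, expected)
    for a, e in zip(actual_row, expected_row)
)

degrees_of_freedom = (len(actual) - 1) * (len(actual[0]) - 1)

print([[round(e, 1) for e in row] for row in expected])
print(f"chi squared = {chi_squared:.2f}, degrees of freedom = {degrees_of_freedom}")
```

Working from unrounded expected values, the statistic comes out at roughly 30.5, in line with the panel's 30.52, against the tabulated 5.99.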

Duncan Williamson is a freelance author and consultant who specialises in management accounting and spreadsheet modelling. He maintains his own website at and can be contacted at

PANEL 2: IR TEST DIGITS

Sales      Digit 1           Digit 2    Digit 3          Digit 4
           Tens of Pounds    Pounds     Tens of Pence    Pence
154.00     5                 4          0                0
79.50      7                 9          5                0
125.75     2                 5          7                5
298.31     9                 8          3                1
1756.93    5                 6          9                3

PANEL 3: REVENUE-STYLE TENS OF PENCE TEST

Digit    Actual tens of    Expected tens of
         pence digit       pence digit
0        145               29.1
1        14                29.1
2        21                29.1
3        15                29.1
4        12                29.1
5        15                29.1
6        11                29.1
7        25                29.1
8        11                29.1
9        22                29.1
Total    291

As far as expectations are concerned, the Revenue uses:

Expected frequencies = total number of values / 10

In this case we have:

Expected frequencies = 291 / 10 = 29.1


PANEL 4: REVISED REVENUE TEST DIGITS - BENFORD'S LAW TEST DIGITS

Sales      Digit 1    Digit 2    Digit 3    Digit 4    Digit 5    Digit 6
154.00     1          5          4          0          0
79.50      7          9          5          0
125.75     1          2          5          7          5
298.31     2          9          8          3          1
1756.93    1          7          5          6          9          3

PANEL 5: REVISED TEST USING BENFORD

         1st Digit          2nd Digit          3rd Digit          4th Digit
Digit    Actual  Benford    Actual  Benford    Actual  Benford    Actual  Benford
0        0       0          32      33.87      46      28.80      135     28.35
1        148     85.19      31      32.23      27      28.69      12      28.34
2        79      49.83      42      30.80      29      28.57      21      28.33
3        22      35.36      33      29.53      20      28.46      16      28.32
4        7       27.43      34      28.39      23      28.35      11      28.31
5        9       22.41      23      27.36      38      28.24      17      28.29
6        4       18.95      22      26.42      35      28.13      10      28.28
7        4       16.41      19      25.57      20      28.02      25      28.27
8        6       14.48      26      24.78      21      27.92      16      28.26
9        4       12.95      21      24.06      24      27.81      20      28.25

Actual is the actual frequency of the test digits: in this case we have found that number 1 is the first digit 148 times, number 2 is the first digit 79 times … and so on. Benford is the expected frequency of each digit in each of the first four positions according to Benford's Law.