Generating Test Credit Card Numbers, Fast!

Published Apr 16, 2017

by Ethar Alali

Abstract

In financial services or retail environments, one problem that almost always appears is the need to generate valid credit card numbers for testing or tokenisation.

Several years ago, an ex-colleague blogged about our solutions to the credit card number problem. I had started the work off as a short R&D exercise and, once I had the team started, handed it over with a little bit of mentoring and a whole load of whiteboard time, confident in their ability to make it work. Sure enough, they were able to generate tokenised credit card numbers from the work we did together.

One of the most overlooked testing points is the need for fast feedback on tests. Slow tests mean slow deployments, and manual tests mean slow deployments too.

Generating Test Credit Card Tokens

Tests are code and teams should respect tests as much as their production code. Of course, automation goes a long way to solving that, but if it doesn't solve the problem fast enough it generates wasted time.

In this case, reviewing the resulting code, I noticed something peculiar about it:

public string GenerateCardToken()
{
    string cardNum = string.Empty;
    for (int j = 0; j < random.Next(3) + 13; j++)
    {
        cardNum += random.Next(0, 10).ToString();
    }

    int c = 0;
    string fullNum;
    // check that the resulting card number generates a Luhn passing number
    while (!(fullNum = string.Format("{0}{1}", cardNum, c)).LuhnCheck())
    {
        c++;
    }
    return fullNum;
}

What's wrong with it? The code works. It generated credit card tokens perfectly! The problem is speed. When you're trying to set up thousands of numbers in a DB for automated UAT tests or in in-memory databases, which is most often the case in large enterprises and web scale systems, you have several options that don't require a snapshot VM or container. Generating the numbers quickly is a key part of that speed, whereas keeping a permanent stock of them sets you up for side effects between interacting tests.

Check Digits

For those unfamiliar with the Luhn check, pull out your credit card and enter all but the rightmost digit into... OK, don't do that! Never give out personal payment information, even when I tell you to 😃

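Joking aside, the rightmost digit on all major credit cards is a check digit. It is chosen so that the modulo-10 of two sums over the digits comes out at zero:

CCNumSet.PNG

  1. A sum of the doubles of the even positioned digits, 2, 4, 6,..., counting from the right with the check digit at position 1 (e.g. 44621123XXXXXX). Where a double runs to two digits, such as 14, its digits are added together (1 + 4 = 5).
  2. A sum of the odd positioned digits, 1, 3, 5, 7,..., which includes the check digit itself.

The two sums are added together and integer divided by 10 (no decimals), keeping the remainder. The check digit is whatever value brings that remainder to zero. On a 16 digit card, it becomes the 16th digit, rightmost on the card.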
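The check described above can be sketched as follows. This is a minimal illustration only: `LuhnSketch.IsValid` is a hypothetical stand-in for the `LuhnCheck()` extension the earlier code calls, whose real implementation is not shown in this post.

```csharp
using System;

public static class LuhnSketch
{
    // Hypothetical stand-in for LuhnCheck(): true when the whole number,
    // check digit included, sums to a multiple of 10.
    public static bool IsValid(string number)
    {
        if (string.IsNullOrEmpty(number)) return false;

        int sum = 0;
        bool doubleIt = false; // the rightmost (check) digit is not doubled

        for (int i = number.Length - 1; i >= 0; i--)
        {
            char ch = number[i];
            if (ch < '0' || ch > '9') return false; // digits only

            int v = ch - '0';
            if (doubleIt)
            {
                v *= 2;
                if (v > 9) v -= 9; // same as adding the two digits: 14 -> 1 + 4 = 5
            }

            sum += v;
            doubleIt = !doubleIt; // alternate as we move left
        }

        return sum % 10 == 0;
    }
}
```

For instance, `LuhnSketch.IsValid("4111111111111111")` is true for the widely used Visa test number, while changing its last digit to 2 makes the check fail.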

Purpose

Luhn checks are error detecting codes that catch errors at a Hamming distance of 1: change any single digit and the check fails. What they can't do is detect two digits being swapped within the same one of the two sequences, for example two doubled digits trading places.
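Both behaviours are easy to try out. The sketch below is an illustration under assumed names (`LuhnErrorDemo`, `Passes` and `Swap` are not from the original code base); it uses a compact Luhn check and a helper that exchanges two digits.

```csharp
using System;

public static class LuhnErrorDemo
{
    // Compact Luhn check: walk from the right, doubling every second digit
    public static bool Passes(string number)
    {
        int sum = 0;
        for (int i = 0; i < number.Length; i++)
        {
            int v = number[number.Length - 1 - i] - '0';
            if (i % 2 == 1)
            {
                v *= 2;
                if (v > 9) v -= 9; // add the two digits of the double
            }
            sum += v;
        }
        return sum % 10 == 0;
    }

    // Returns a copy of the number with the digits at two indices exchanged
    public static string Swap(string number, int i, int j)
    {
        char[] digits = number.ToCharArray();
        char tmp = digits[i];
        digits[i] = digits[j];
        digits[j] = tmp;
        return new string(digits);
    }
}
```

Taking the valid number 79927398713: changing the last digit to give 79927398710 fails the check, and so does swapping the last two digits. Swapping the digits at indices 0 and 2 (a 7 and a 9, both in the undoubled sequence) gives 99727398713, which still passes.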

Applying the Mathematical Treatment

Let us take a look at the while loop, more closely:

    int c = 0;
    string fullNum;

    // check that the resulting card number generates a Luhn passing number
    while (!(fullNum = string.Format("{0}{1}", cardNum, c)).LuhnCheck())
    {
        c++;
    }

What does this code do? It appends a digit to the end of the random sequence (15 digits, for a 16 digit card) and runs a Luhn check to make sure the result is valid. If it isn't, it loops round again, increments the digit and tests again. Only one of the ten possible digits can pass, so up to nine candidates are formatted, checked and thrown away for every number generated. The method is akin to throwing stuff against a wall and seeing what sticks. Some of it will, most of it won't and anything falling on the floor is waste.

The issue with this "trial and error" approach is the amount of waste generated by each 'miss', since a correct check digit turns up only once in every ten candidates after the full summations have been carried out. Trial and error methods are perfectly fine where there isn't a known or analytical solution. However, the Luhn check absolutely does have a unique, directly computable solution for the check digit.

Optimising Generation

There are two steps that cause coders some problems. The first is the check for whether or not the doubling has resulted in a two digit number such as 14 (in which case the Luhn sum would include 1 + 4 = 5). Well, there is an elegant solution: for a two digit number, taking it modulo 9 gives you the sum of its two digits, reduced again to a single digit if need be. The one catch is a multiple of 9, which comes out as 0 rather than 9, so 18 (a doubled 9) needs care. Try it:

11 = 2 (mod 9)
34 = 7 (mod 9)
19 = 1 (mod 9)
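To see how far the shortcut holds for the values that doubling a single digit can actually produce (0 to 18), a small sketch like the following compares it against the plain digit sum. The class and method names are made up for illustration.

```csharp
using System.Collections.Generic;

public static class DoubledDigitSketch
{
    // Sum of the decimal digits of a value between 0 and 18
    public static int DigitSum(int v)
    {
        return v / 10 + v % 10;
    }

    // Digits 0-9 whose doubled value gives a different answer under % 9
    public static List<int> Mismatches()
    {
        var result = new List<int>();
        for (int digit = 0; digit <= 9; digit++)
        {
            int doubled = digit * 2;
            if (doubled % 9 != DigitSum(doubled)) result.Add(digit);
        }
        return result;
    }
}
```

`Mismatches()` returns only the digit 9: 18 % 9 is 0 while 1 + 8 is 9. Every other doubled digit agrees with the shortcut.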

There's a game I use as a party trick once in a while.

  1. Think of a 2 digit number
  2. Add the two digits together
  3. Take it away from the number you first thought of
  4. If you have two digits left, add them together, if not goto step 5
  5. You are left with the number 9

In number theory this works because any number can be expressed as a sum of some parts (units, tens, hundreds...). The kind of stuff you learn in primary school.

Because of that, 28 can be expressed as:

28 = (2)(10) + (8)(1)

And this generalises to any number 'ab' being expressed as:

'ab' = 10a + b

The Luhn algorithm adds the digits together, hence the sum is a + b. This then means that the difference between them is:

'ab' - (a + b) = 10a + b - a - b = 9a

This means that for every two digit number, the difference between the number and the sum of its two digits (effectively what the Luhn check removes when it adds the digits) is a multiple of 9. It's always a multiple of 9.
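The identity is small enough to check by brute force. A throwaway sketch (the names are illustrative only) that walks every two digit number:

```csharp
public static class NineTrick
{
    // True when n - (a + b) == 9a for every two digit number n = 'ab'
    public static bool HoldsForAllTwoDigitNumbers()
    {
        for (int n = 10; n <= 99; n++)
        {
            int a = n / 10; // tens digit
            int b = n % 10; // units digit
            if (n - (a + b) != 9 * a) return false;
        }
        return true;
    }
}
```

It returns true, which is also why the party trick always ends on 9: the digits of 9a for a from 1 to 9 add up to 9.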

If the difference is always a multiple of 9, then the number and the sum of its digits leave the same remainder on division by 9. So taking the modulo 9 is all we have to do once we have doubled the digits at the even indices (0, 2, 4,..., 14) of the zero based sequence, provided a value of 9 or 18 is kept at 9 rather than 0.

Credit card check digits aren't really any harder. You simply find the next multiple of 10 at or above your Luhn sum of the digits at positions 0 to 14; the check digit is the gap between the two. Or, as I preferred to do it, multiply the sum by 9 and take the modulo 10 of that number, which lands on the same digit because 9 is congruent to -1 modulo 10. This becomes the 16th digit (at position 15). After all, that is how the check digit is calculated.
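The two routes to the check digit can be put side by side. This is a sketch with assumed method names, taking the Luhn sum of the first 15 digits as its input:

```csharp
public static class CheckDigitSketch
{
    // Gap between the sum and the next multiple of 10 at or above it
    public static int FromNextMultipleOfTen(int luhnSum)
    {
        return (10 - luhnSum % 10) % 10;
    }

    // Multiply by 9 and keep the units digit
    public static int FromTimesNine(int luhnSum)
    {
        return (luhnSum * 9) % 10;
    }
}
```

With a Luhn sum of 67, the next multiple of 10 is 70, so the check digit is 3, and 67 * 9 = 603 also ends in 3.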

It ultimately turns into the following equation, where L is a Luhn operator:

CheckDigit.PNG
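Judging by the code that follows, the equation presumably has a shape like the one below. The symbols are notation assumed here rather than taken from the figure: d_i is the digit at zero based index i, m_i is the doubling multiplier, and digitsum adds the decimal digits of its argument.

```latex
\[
  d_{15} = \Bigl( 9 \sum_{i=0}^{14} L(d_i) \Bigr) \bmod 10,
  \qquad
  L(d_i) = \operatorname{digitsum}(m_i \, d_i),
  \qquad
  m_i =
  \begin{cases}
    2 & i \text{ even} \\
    1 & i \text{ odd}
  \end{cases}
\]
```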

And even better, simplified code:

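        // Requires: using System.Linq; using System.Text;
        public string GenerateCardTokenOptimised()
        {
            int[] checkArray = new int[15];

            var cardNum = new int[16];

            for (int d = 14; d >= 0; d--)
            {
                // The upper bound of Next is exclusive, so 10 is needed to include the digit 9
                cardNum[d] = _random.Next(0, 10);

                // Digits at even zero based indices are doubled, the rest are taken as they are
                int v = cardNum[d] * (((d + 1) % 2) + 1);

                // Mod 9 digit sum, shifted by one so that 9 and 18 give 9 rather than 0
                checkArray[d] = v == 0 ? 0 : ((v - 1) % 9) + 1;
            }

            // Multiplying by 9 modulo 10 gives the digit that rounds the sum up to a multiple of 10
            cardNum[15] = (checkArray.Sum() * 9) % 10;

            var sb = new StringBuilder();

            for (int d = 0; d < 16; d++)
            {
                sb.Append(cardNum[d]);
            }
            return sb.ToString();
        }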

That's it. No more, no less. I have tried to keep it fairly readable, but you can definitely improve this code and I'd challenge you to find the improvements.

Ultimately though, how does it perform?

Performance Comparison

Well, firstly, does it work?
CCGenCheck.PNG

Yes. As the unit test output shows, it works exactly the same.

Now how much faster? The table below shows the speed comparison for both the new code and the old.

Credit Card Numbers.PNG

These 6.5 to 11.5 order speed increases are simply because we don't generate invalid combinations in the first place.
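If you want to reproduce a comparison like this yourself, a simple timing harness is enough. The one below is a sketch, not the harness used for the figures above; `GeneratorTiming` and `TimeMs` are names invented here, and it accepts any parameterless generator that returns a string.

```csharp
using System;
using System.Diagnostics;

public static class GeneratorTiming
{
    // Times 'count' calls of a generator and returns the elapsed milliseconds
    public static long TimeMs(Func<string> generate, int count)
    {
        generate(); // one warm-up call so JIT compilation is not timed

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < count; i++)
        {
            generate();
        }
        sw.Stop();

        return sw.ElapsedMilliseconds;
    }
}
```

Passing each of the two generator methods in turn with the same count, for example `GeneratorTiming.TimeMs(generator.GenerateCardToken, 100000)` where `generator` is whatever instance holds them, gives two figures to compare. Actual timings will vary with hardware and runtime.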

Conclusion

In the age of cloud computing, operational expenditure such as compute resources is an architectural concern. Anything that reduces compute usage means less blocking and less cost. This sort of optimisation falls into line with that and is crucial to optimising compute resources anywhere, not least in the cloud, since we are not wasting clock cycles, and hence money (in developer or server time), on computation we throw away.

Let me know what you've optimised in the past. I'll also put this code up on GitHub.
