By David Bernstein, Data Privacy Engineer at Privitar

Before diving into credit card tokenization, and when and why you need it, let’s start with an example: a retailer implementing a loyalty program, along with accompanying customer analytics. At first glance, you might think that credit card numbers would be the first thing to drop entirely from any analytics endeavor.

After all, that’s about as sensitive as data can get, and what could be gained?

A lot, it turns out.

Why retain the first 6 credit card digits? 

The first six digits of a credit card number identify the card issuer and, with it, the card network (Visa, Mastercard, American Express). Retaining that information enables valuable customer analytics, such as:

  • Which credit card provider is most used? (Visa)
  • Are average sales higher with a specific card provider? (American Express)
  • Which card is most widely accepted? (Mastercard)
  • How do customer demographics line up? (Turns out ages 18-30 mostly used Visa, while American Express users fell in the 45-65 age range)
  • If the retailer in this case had a co-branded credit card, how was that working?
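As a rough illustration of the questions above, here is a minimal sketch that maps the first six digits to a card network and summarizes sales per network. The prefix ranges are the widely published ones for Visa, Mastercard, and American Express; the function names `card_network` and `summarize`, and the sample transactions, are hypothetical and not taken from the retailer's actual system.

```python
from statistics import mean


def card_network(first_six: str) -> str:
    """Guess the card network from the first six digits of a card number."""
    if len(first_six) != 6 or not (first_six.isascii() and first_six.isdigit()):
        raise ValueError("expected exactly six digits")
    two = int(first_six[:2])
    four = int(first_six[:4])
    if first_six[0] == "4":
        return "Visa"
    if 51 <= two <= 55 or 2221 <= four <= 2720:
        return "Mastercard"
    if two in (34, 37):
        return "American Express"
    return "Other"


def summarize(transactions):
    """Return {network: (transaction count, average sale)} for (first_six, amount) pairs."""
    amounts_by_network = {}
    for first_six, amount in transactions:
        amounts_by_network.setdefault(card_network(first_six), []).append(amount)
    return {
        network: (len(amounts), mean(amounts))
        for network, amounts in amounts_by_network.items()
    }


# Hypothetical sample data: (first six digits, sale amount)
transactions = [
    ("411111", 20.0),
    ("411111", 35.0),
    ("371449", 120.0),
    ("510510", 50.0),
]
summary = summarize(transactions)
```

Note that nothing here needs the remaining digits of the card number, which is why keeping only the first six can look like a complete answer at first.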

So here the solution is to just provide the first six digits…problem solved!

This approach is better, but still lacking.

The stories of the loyalty IDs

In our example, the retailer had a focus on identifying households. The data entered for the loyalty program was often over five years old. Many customer ‘life events’ had occurred in those years…marriages, divorces, kids growing into young adults. Different members of the same family often have different loyalty IDs, so how can you determine that these different loyalty IDs are indeed in the same household?

A shared credit card is often the only thing that ties that family together. When customers in the loyalty program use the same credit card, you can identify a unique ‘household’ across their different loyalty IDs.

So how do you keep the data you need, but protect privacy?

Enter credit card tokenization  

To solve this we turn to tokenization, but not just tokenization — we need consistent tokenization. So, we keep the first six digits of the credit card number to identify which credit card provider was used, and then tokenize the rest of the digits…consistently.

When we apply tokenization this way, the credit card number is ‘masked.’ It is no longer the highly identifying original credit card number. If there is a data leak and a bad actor tries to take advantage of this credit card number, there is little risk of it being misused or of the original cardholder being identified, because that number has been generated – it’s not a valid card number.

Because the number has been generated consistently, the same number is generated for each unique credit card value, and therefore the tracking and analytics information is accurate.
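To make "consistent" concrete, here is a minimal sketch of one way to get that property: keep the first six digits and derive the remaining digits from a keyed hash (HMAC-SHA256) of the full card number. This is an assumption chosen for illustration, not a description of any particular product's method. The function name `tokenize_card` and the demo key are hypothetical, and a real deployment would need proper key management.

```python
import hashlib
import hmac

DIGITS = "0123456789"


def tokenize_card(card_number: str, key: bytes) -> str:
    """Keep the first six digits and replace the rest with keyed-hash digits."""
    digits = "".join(ch for ch in card_number if ch in DIGITS)
    if not 13 <= len(digits) <= 19:
        raise ValueError("expected a card number with 13 to 19 digits")
    prefix, rest = digits[:6], digits[6:]
    # The same key and the same card number always produce the same MAC,
    # which is what makes the token consistent across records.
    mac = hmac.new(key, digits.encode("ascii"), hashlib.sha256).digest()
    replacement = int.from_bytes(mac, "big") % (10 ** len(rest))
    return prefix + str(replacement).zfill(len(rest))


# Hypothetical key for demonstration only.
demo_key = b"demo-key-not-for-production"
token_a = tokenize_card("4111-1111-1111-1111", demo_key)
token_b = tokenize_card("4111 1111 1111 1111", demo_key)
```

Here `token_a` and `token_b` are identical because both inputs contain the same digits. Two caveats on this sketch: a hash-based mapping can in principle send two different cards to the same token, and it does not by itself guarantee that the output fails a card number checksum, so treat it as a way to see the idea rather than a finished design.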

For example:

The actual credit card number 1234-56XX-XXXX-XXXX consistently generates a tokenized version 1234-56YY-YYYY-YYYY every time. So in our original example, when we see multiple loyalty IDs using 1234-56YY-YYYY-YYYY, we know those loyalty IDs are likely part of the same household.
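The household matching described here can be sketched as grouping loyalty IDs that share a tokenized card number. The version below also merges transitively (if one loyalty ID has used two cards, everyone on either card lands in the same group), which is an assumption about how a household should be defined. The function name `group_households` and the sample tokens are hypothetical.

```python
from collections import defaultdict


def group_households(records):
    """Group loyalty IDs linked by shared card tokens.

    records: iterable of (loyalty_id, card_token) pairs.
    Returns a list of sets of loyalty IDs, one set per inferred household.
    """
    parent = {}

    def find(x):
        # Standard union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_id_for_token = {}
    for loyalty_id, token in records:
        find(loyalty_id)  # register the ID even if its card is never shared
        if token in first_id_for_token:
            union(first_id_for_token[token], loyalty_id)
        else:
            first_id_for_token[token] = loyalty_id

    groups = defaultdict(set)
    for loyalty_id in list(parent):
        groups[find(loyalty_id)].add(loyalty_id)
    return sorted(groups.values(), key=sorted)


# Hypothetical records: (loyalty ID, tokenized card number)
records = [
    ("L1", "token-A"),
    ("L2", "token-A"),
    ("L3", "token-B"),
    ("L2", "token-C"),
    ("L4", "token-C"),
]
households = group_households(records)
```

With this sample data, L1, L2, and L4 end up in one group and L3 in its own. As the text says, a shared token only suggests that the loyalty IDs are likely part of the same household; it is not proof.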

There you have it! The credit card number in the data was not the real one; it was safely de-identified with the method of credit card tokenization I outlined. Yet we were still able to enable all the analytics and address the business pain. Data privacy and full analytic capabilities can actually co-exist. You just need to balance them with the proper data privacy techniques.
