Data’s Inigo Montoya Problem (Part II)

If you’ve had campaigns fail because of bad data, then maybe you’ve fallen victim to the “Inigo Montoya problem of data.”  That is, maybe you’ve used data that don’t mean what you think they mean.

Previously, I discussed the origins of bad marketing data.  Now I’d like to discuss how to fix the problem.

First, I recommend something so exciting that you probably will not finish reading this post before trying it: read the dictionary.

Yeah, yeah.  I know.  All characters, no plot.  Wait, no.  That’s the phone book.

I mean, of course, to read your data dictionary.

Every database should have a data dictionary, a central reference that defines the terms of each attribute.  The data dictionary gives detail on how the database collects the data and, if necessary, the values of each attribute.
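If your database has no data dictionary yet, even a bare-bones one helps. Here is a minimal sketch of what an entry might hold, based on the description above. The attribute names, field names, and definitions are hypothetical illustrations, not taken from any real product's documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttributeDefinition:
    """One data dictionary entry (hypothetical structure)."""
    name: str
    definition: str          # what the attribute means
    collection_method: str   # how the database collects it
    allowed_values: tuple = ()  # empty when values are free-form


# Illustrative entries only; a real dictionary comes from the data owner.
DATA_DICTIONARY = {
    "visitor_type": AttributeDefinition(
        name="visitor_type",
        definition="Whether this browser has visited the site before.",
        collection_method="Browser cookie set on the first visit.",
        allowed_values=("new", "returning"),
    ),
    "home_region": AttributeDefinition(
        name="home_region",
        definition="Region code of the customer's home address.",
        collection_method="Typed by the customer at sign-up.",
    ),
}


def describe(attribute):
    """Return a readable summary, or a warning if the attribute is undocumented."""
    entry = DATA_DICTIONARY.get(attribute)
    if entry is None:
        return f"{attribute}: not documented; ask the data owner before using it."
    values = ", ".join(entry.allowed_values) or "free-form"
    return (f"{entry.name}: {entry.definition} "
            f"Collected via: {entry.collection_method} Values: {values}")


print(describe("visitor_type"))
print(describe("loyalty_tier"))
```

The useful part is the second call: an attribute with no entry is a warning sign in itself.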

While data dictionaries won’t compete with the latest from George R. R. Martin, they do help you determine how (or whether) to use an attribute.  For instance, Google Analytics defines returning visitors based on a browser cookie.  As a result, a user who visits a GA-tracked site from her work PC and then from her phone counts as two unique users, not one returning user.  Thus an email capture layer triggered by a new visitor might under-perform expectations, since many “new” visitors are just returning visitors on another platform.
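To see why a cookie-based definition inflates “new” visitors, here is a toy simulation. It is a deliberately simplified model, not Google Analytics’ actual logic: the function name and the idea of using a person-plus-device string as the cookie ID are my assumptions for illustration.

```python
def label_visits(visits):
    """Label each visit 'new' or 'returning' using only a per-browser cookie.

    Each visit is a (person, device) pair. The tracker never sees `person`;
    the person-and-device string stands in for the cookie stored in one browser.
    """
    seen_cookies = set()
    labels = []
    for person, device in visits:
        cookie_id = f"{person}@{device}"  # hypothetical stand-in for a cookie ID
        if cookie_id in seen_cookies:
            labels.append("returning")
        else:
            labels.append("new")
            seen_cookies.add(cookie_id)
    return labels


# One person, two devices, three visits.
visits = [("ana", "work_pc"), ("ana", "phone"), ("ana", "work_pc")]
labels = label_visits(visits)

people = {person for person, _ in visits}
cookies = {f"{person}@{device}" for person, device in visits}

print(labels)        # ['new', 'new', 'returning']
print(len(people))   # 1 actual person
print(len(cookies))  # 2 "unique users" as the tracker counts them
```

The phone visit is labeled “new” even though the same person visited from her work PC moments earlier.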

Second, I recommend the good old stink test for finding faulty or misleading data.

[Image: a skunk.  Caption: Cute, but you don’t want one hiding in your database.]

Run some basic tabs of your data to see the distribution of values.  For instance, if you have a marketing database with home address data, see where your customers live.  Make sure these values correspond to your understanding of your customers.  If, for instance, you see that a lot of customers live in an area where you don’t have stores, it means that either a) the database has captured bad data or b) your understanding of your customer base needs refreshing.
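A basic tab needs nothing fancy. The sketch below counts the values of one attribute and then lists the values you did not expect. The records, the `state` attribute, and the set of store states are made-up sample data, and the helper names are my own.

```python
from collections import Counter


def tabulate(records, attribute):
    """Return (value, count, share) rows, most common value first."""
    # Treat None and empty strings as missing so they show up in the tab.
    counts = Counter(r.get(attribute) or "(missing)" for r in records)
    total = sum(counts.values())
    return [(value, n, n / total) for value, n in counts.most_common()]


def unexpected_values(records, attribute, expected):
    """Return counts for values that fall outside the expected set."""
    return {value: n
            for value, n, _ in tabulate(records, attribute)
            if value not in expected}


# Made-up sample: customers by home state, and the states with stores.
customers = [
    {"state": "OH"}, {"state": "OH"}, {"state": "MI"},
    {"state": "AK"}, {"state": None},
]
store_states = {"OH", "MI"}

for value, n, share in tabulate(customers, "state"):
    print(f"{value:<10} {n:>3} {share:6.1%}")

print(unexpected_values(customers, "state", store_states))
```

Anything in that last dictionary deserves a question: bad data, or customers you did not know you had?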

In a real-world example, I once found that a database of hotel loyalty club members had 20,000 residents of Svalbard.  However, Svalbard, an archipelago well north of the Arctic Circle, has only a few thousand residents and, at the time, the hotel chain had no hotels within about 1,800 miles.  Needless to say, none of those hotels sits within driving distance.  The loyalty club had to take a hard look not just at the completely irrelevant Svalbard market (sorry, Svalbardians, but it’s true), but at ALL of their geographic codes.
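One way to catch this kind of problem automatically is to compare your counts per region against an outside population figure. The sketch below is one possible approach, not the method the loyalty club used. The region names and numbers are placeholders, and the 5% threshold is an arbitrary assumption you would tune to your own business.

```python
def implausible_regions(member_counts, populations, max_share=0.05):
    """Flag regions whose member count looks too large for the population.

    Returns (region, members, share) tuples. The share is None when no
    usable population figure exists, which is itself worth investigating.
    """
    flagged = []
    for region, members in member_counts.items():
        population = populations.get(region)
        if population is None or population <= 0:
            flagged.append((region, members, None))
            continue
        share = members / population
        if share > max_share:  # max_share is an assumed, tunable threshold
            flagged.append((region, members, share))
    return flagged


# Placeholder numbers for illustration only.
member_counts = {"region_a": 1200, "region_b": 20000, "region_c": 50}
populations = {"region_a": 900000, "region_b": 30000}

flagged = implausible_regions(member_counts, populations)
for region, members, share in flagged:
    detail = "no population figure" if share is None else f"{share:.0%} of residents"
    print(f"{region}: {members} members ({detail})")
```

A flagged region does not prove the data are wrong, but it tells you where to look first.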

Take the time to look for the Inigo Montoya problem in your data.  You might not rescue the Princess Bride, but the effort should keep you from falling for the Dread Pirate Roberts trick.
