I’m going to recommend that you read a book I hated.
More accurately, I want you to read one chapter of the book I hated: The Fires by Joe Flood. So don’t buy it; instead, borrow it from the library or find it in a Barnes & Noble and just read chapter 12, “Quantifying the Unquantifiable.” Apart from the awfulness of the rest of the book, chapter 12 gives a master class in how not to use data. The lessons therein pertain to anyone using data, although of course I find it most useful to apply those lessons to marketing data.
The book covers an interesting point in New York City history, the late ’60s through the mid-’70s, when fires ravaged poor neighborhoods despite the city’s best efforts to stop them with better management courtesy of management consultants (what could possibly go wrong?). Yes, this is the era chronicled in The Bronx Is Burning (a much finer book with a broader perspective).
Today, FDNY is well-integrated with the neighborhood. Note the dinosaur skull on the truck that serves the American Museum of Natural History.
Why I hated the book
Before getting to the good part, I should explain why I hated the book overall. Author Flood rehashes mid-20th-century New York City through a highly distorted lens. How distorted? Let me put it this way: he attributes the disaster to the efforts of arch-liberals Mayor John V. Lindsay (who was, in fact, a liberal) and the RAND Corporation (which grew out of the Douglas Aircraft Company, those filthy hippies who built the AC-47 gunship for the Vietnam War). He also intimates that Tammany Hall maintained a strong grip on city politics into the 1960s, which would have come as a surprise to Tammany Hall, had it existed in more than name only at the time.
Finally, and most inexcusably, he includes “beer distilling” as a vital New York industry. Humans have BREWED beer for over six millennia, but they’ve never distilled it. This kind of error makes me crazy; this is beer, folks.
Why you should read chapter 12 anyway
When not making poorly aimed attacks at New York’s Democratic and liberal politics, the book traces the history of firefighting in the postwar Big Apple. The FDNY emerged as the nation’s most authoritative urban fire department, its innovations often spreading to departments coast to coast. Unfortunately, the decline of New York’s industry in the 1960s (and, as noted in the book I reviewed last time, its ports) and the departure of wage earners led to declining tax revenues. At the same time, crumbling housing stock in the poorer sections of town produced a bumper crop of firetraps. Interestingly, according to Flood, arson played a smaller role in the fires than one might think.
To address stretched manpower (and they were all men until 1977), Lindsay hired RAND to figure out how best to place extant fire assets (firefighters, apparatus, firehouses) and where to invest in new ones. To assess the problem, RAND began to collect and analyze data at the firehouse level. And there the son-of-a-bitchery began.
Chapter 12 highlights four major causes and effects of bad data collection and analysis:
- Wrong data. RAND focused on one key variable: response time. As variables go, response time seems like a good choice. It measures how quickly units arrive on scene after an alarm. Unfortunately, this variable does not mean what you might think it means. Response time only measures the time until the unit shows up at the location from which the fire was reported. In the pre-911 era, many calls came from street call boxes, not from a bystander with a cell phone, so the fire trucks might arrive down the block from the fire itself. Also, the time at which firefighters actually started putting water on the fire might have little to do with response time: in addition to the distance between the call box and the fire, firefighters might have to contend with inoperative hydrants or obstacles such as illegally parked cars. Response time, then, told only a small part of the story.
- Inaccurate data. Although response time told only a small part of the story, it still had some validity as a proxy for time-to-fire-attack. However, RAND didn’t always get accurate response time data. FDNY commanding officers received stopwatches to measure response times. Some forgot to use them. Others simply estimated. Still others worried that the data would make them look bad, so they invented times out of whole cloth.
- Poorly conceived segmentation. To RAND’s credit, they did not lump all firehouses into one bucket. They created buckets by type: the high-rise districts of midtown and lower Manhattan, the low-rise districts in the other boroughs, and the relatively suburban districts in Staten Island and the outer reaches of the Bronx, Brooklyn, and Queens. While this segmentation made intrinsic sense, RAND made a mistake with a derived metric: acceptable response time. They scored firehouses by deviation from the acceptable response time for their segment and recommended additional resources for the houses with segment-high response times. As a result, a sparsely populated neighborhood in Staten Island might get more firefighting resources than a densely populated one in the South Bronx, simply because the Staten Island house’s response times were further out of whack within its bracket. Put another way, RAND’s segments failed to take population into account.
- Ignoring institutional wisdom. RAND consultants made decisions based on the numbers and the numbers alone. In some cases, they saw houses with relatively few fires (remember, relative to other houses in their segments) and slated them for reassignment or closure. What RAND didn’t know, but any fire lieutenant could have told them, is that some “houses” represented backup shifts at already busy houses. In other words, one company in the Bronx might exist to back up another company and thus would only fight fires if the initial house had left to fight a fire. As a result, these seemingly under-utilized houses represented, in fact, very busy districts.
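The segmentation mistake is easy to make in any analysis, so here is a minimal sketch of it. The names and numbers are entirely invented for illustration; the point is only that ranking by within-segment deviation can pick a different “neediest” house than a population-aware score would.

```python
# Invented data: (name, response time in min, segment's acceptable time, people served)
houses = [
    ("Staten Island Engine", 8.0, 5.0, 20_000),
    ("South Bronx Engine",   6.0, 4.5, 180_000),
]

# RAND-style score: raw deviation from the segment's acceptable response time.
by_deviation = max(houses, key=lambda h: h[1] - h[2])

# Population-aware score: extra minutes weighted by the people exposed to them.
by_exposure = max(houses, key=lambda h: (h[1] - h[2]) * h[3])

print(by_deviation[0])  # Staten Island Engine (3.0 min over vs. 1.5 min over)
print(by_exposure[0])   # South Bronx Engine (270,000 person-minutes vs. 60,000)
```

Same data, two defensible-looking scoring rules, opposite resourcing decisions, which is exactly the trap the chapter describes.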
What marketers can learn
If the bullet points above haven’t given you any ideas, allow me to connect the dots.
- Check your marketing data for relevance. Do your data actually relate to something important? At minimum, marketers should check their data dictionaries to understand what data points mean. In email marketing, for example, campaign platforms differ in the meaning of something as vital as “clickthrough rate.” Some platforms define it as the number of clicks divided by the number of delivered emails, while others use the number of opened emails as the denominator. Big difference. More importantly, marketers need to ask the key question: does this data point tell me what I really need to know?
- Check your marketing data for accuracy. My friends in digital media strategy have struggled with outright fraud in their industry long enough to take many numbers with a grain of salt. Even stretching back to the pre-digital age, media planners understood that Nielsen ratings often over- or under-counted TV viewership because they relied in part on hand-written diary entries. Make sure to learn enough about how your sources collect data to know what to trust and what not to trust.
- Segment with care and caution. We create segments because we know different groups of consumers respond in different ways. More to the point, they have different needs. Set appropriate goals for each segment. Don’t apply the standards of one segment to the next.
- Ask an old timer. While digital technologies have changed many of the finer points in life, we still have 10 fingers and 10 toes (well, unless you run afoul of the Yakuza, but still). When you draw a seemingly new conclusion from data, make sure to ask an old hand what she thinks. She’ll tell you something between “it’s common knowledge” and “that’s impossible.” Make your best arguments and genuinely ask for hers. Lather. Rinse. Repeat until you both can agree on what the data mean.
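To see how much the clickthrough-rate denominator matters, here is a quick sketch with invented numbers. No real platform’s figures are implied; the arithmetic is the whole point.

```python
# Hypothetical campaign numbers, invented for illustration.
delivered = 10_000   # emails that reached an inbox
opened = 2_000       # emails that were opened
clicks = 300         # link clicks

# Definition 1: clicks over delivered emails.
ctr_delivered = clicks / delivered   # 0.03, i.e. 3.0%

# Definition 2: clicks over opened emails (often called click-to-open rate).
ctr_opened = clicks / opened         # 0.15, i.e. 15.0%

print(f"{ctr_delivered:.1%} vs {ctr_opened:.1%}")  # 3.0% vs 15.0%
```

A fivefold gap from the same campaign, which is why you read the data dictionary before you compare numbers across platforms.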
So there you have what I learned from The Fires. Again, I don’t recommend buying it. Unless you have…[wait for it!]… money to burn.