Big data has moved beyond being a much-talked-about trend, a fad, or even a high-demand area of interest. Today, big data is a fact of life, something we all need to understand and manage. Understanding big data is not as simple as finding a few data scientists and a large storage system. It requires understanding all of the inputs and data streams that lead us to where we are today. Because big data is so, well, big, it can seem overwhelming. In many cases this leads us to focus too narrowly on one type of data, or one data stream, to answer critical business questions. The real way to harness the power of big data is to leverage multiple types of data. A great example from my own area of interest is attributing marketing spend to performance.
Consider this: Two years ago Adobe estimated that it captured over 27 petabytes of digital marketing data for its 5,000 Digital Marketing Suite customers alone.
That statistic is both exciting and overwhelming. With all the focus on big data, however, there is a natural over-focus on the streams with the greatest volume of data. But there are three V’s to big data (volume, velocity and variety), and for good reason. Simply having more of the same data doesn’t provide proportionally more value. More types of data generally provide more incremental value.
My concerns around big data stem from the prevailing notion that access to a large volume of data is some kind of panacea. This has been exacerbated by the growth of another rising trend: the data scientist. Somehow, our community has fallen under the delusion that one can simply comb through a large volume of data and the answers to the universe's questions will emerge. This is not the case.
All hope is not lost; we just need to harness big data's power appropriately. Having spent a long time analyzing all varieties of data, big and small, I am both excited and concerned by the trajectory my little part of the big data world is on. Focusing on ever-increasing volumes of data has driven some great innovation in analytical techniques: techniques that use less memory, that can find patterns in low signal-to-noise data, and that are designed to transcend linear and simple non-linear relationships. But shifting our reliance solely to techniques that find relationships by virtue of data volume, rather than seeking to confirm a sound theory of how something works, is having negative consequences.
The flaw in all this is twofold. First, to answer any question sufficiently, one must cultivate the right question. Second, critical questions are rarely answered using a single type of data...no matter how comprehensive and large. A practical application of my point is the emerging practice in marketing of "attribution modeling". It has become so commonplace in our discussions that we now just call it “attribution”.
Attribution is a way to harness the potential of big data. When you leverage attribution modeling, you can accurately and efficiently identify marketing impacts, gain a greater understanding of consumer behavior, and define future actions to improve your efforts.
If you're not familiar, attribution is the idea of placing a value on a marketing tactic for its ability to drive a desired outcome. For example, “how did that search campaign or search term do, compared to that display ad, at getting people to click through to my website?” Attribution focuses on clickstream data: a large volume of data showing how a single person (or browser, anonymous or known) has interacted with a variety of digital touchpoints from a brand. And depending on whom you talk to, anything from simple rules to complex algorithms is applied to the data to determine what piece of the outcome was due to a given tactic, keyword, site, campaign, etc. It is a great example of both big data and the flaws of big data.
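To make the "simple rules" end of that spectrum concrete, here is a minimal sketch of two common rule-based models: last-click, which gives all credit to the final touchpoint, and linear, which splits credit evenly across the path. The touchpoint names and conversion value are hypothetical, purely for illustration.

```python
def last_click(path, value):
    """Assign all credit for the conversion to the final touchpoint."""
    return {path[-1]: value}

def linear(path, value):
    """Split the conversion value evenly across every touchpoint."""
    share = value / len(path)
    credit = {}
    for touch in path:
        # A channel touched twice accumulates two shares.
        credit[touch] = credit.get(touch, 0.0) + share
    return credit

# One visitor's clickstream before a $100 conversion (hypothetical data).
path = ["display_ad", "paid_search", "email", "paid_search"]

print(last_click(path, 100.0))  # all $100 goes to the last paid_search click
print(linear(path, 100.0))      # $25 per touch; paid_search earns $50 total
```

The interesting debates start where these rules end: the more complex algorithmic models mentioned above try to estimate each touchpoint's contribution from the data rather than asserting it by rule.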
With attribution, the techniques are designed to find relationships in situations where there are billions of impressions but a terribly small number of interactions (see the statistics on display ad click-through rates, for example). This is a noble pursuit and a great big data problem: “what is a click worth?”
Attribution is not the only answer. Even with all that data, attribution models still cannot answer fundamental marketing questions for you, like “If that click didn’t happen, would the person still have gotten to my website?” or “How much should I spend overall on that digital video campaign relative to a TV campaign?” But it is a step in the right direction. Attribution works best when connected to other marketing models that use a larger variety of data to explain how marketing delivers incremental revenue, from the highest-level tactics down to the most granular levels of keywords, websites, and so on. Then a brand can move forward with confidence that it knows how its spend works. So including a variety of data in your next big data project may be the key to success.