"Monetize Data!" we're told. But what does that mean?
There are many different ways to monetize data, either directly or indirectly, and there are many more opportunities beyond simply selling data. Instead of 'data monetization', let's think of it as 'creating value with data.'
The Economist and countless others have used oil as a metaphor for data. It's not perfect, but it does give us a good way to think about how data works.
The crude version is mined (or extracted) and then processed into a refined version (aggregated and cleaned). The refined version is the basis for commodified products like plastics, and advanced products like smartphones use those commodified products in turn.
However, like oil, just because we can extract data doesn't mean we should.
Like oil, sometimes data use makes things worse rather than better.
And, like oil, there is a big gap between the crude version and something useful to the average human.
Similarly, oil producers often look for problems to solve with oil (the proverbial hammer in search of a nail), which can over-emphasize certain kinds of problem-solving. Data dealers and data scientists often look for things to solve with data and don't always work their way back from the problems data can solve to what users actually need.
There are several kinds of data.
First, there is fundamental data (something directly and intentionally measured, like a list of stock investments or an image file on a camera). Second, there are facets or features of the fundamental data, often known as metadata (data about data). This might record, for example, the purchase price of one of the stocks or the date the camera took the photo.
Some secondary datasets may come from 'data exhaust'—something a digital system records or logs without a specific intent in mind.
Monetized datasets can provide either or both types of data. For example, a video stream of a security camera could be considered fundamental data, and logs of web traffic might be secondary data or metadata. What makes them data exhaust is that they are automatically collected, often for unknown future purposes.
Whether fundamental or secondary, these datasets may be monetized as raw, unprocessed data, without structure or labeling, or they may be aggregated and/or processed into a larger, more 'packaged' dataset.
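As a minimal sketch of the difference between raw and 'packaged' data, the Python below turns a few raw web-traffic log lines (data exhaust) into an aggregated per-page summary. The log format, the sample lines, and the function name `aggregate_page_views` are all assumptions made up for illustration, not the output of any real system.

```python
from collections import Counter

# Hypothetical raw "data exhaust": one line per request, as a web server might log it.
# Format assumed for this sketch: "<ISO timestamp> <HTTP method> <path>"
RAW_LOG_LINES = [
    "2024-01-05T09:00:01 GET /pricing",
    "2024-01-05T09:00:07 GET /home",
    "2024-01-05T09:01:15 GET /pricing",
    "malformed line",
    "2024-01-05T09:02:42 GET /home",
    "2024-01-05T09:03:10 GET /pricing",
]


def aggregate_page_views(lines):
    """Clean raw log lines and aggregate them into a count of views per page."""
    views = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # cleaning step: skip lines that do not match the assumed format
        _timestamp, _method, path = parts
        views[path] += 1
    return dict(views)


summary = aggregate_page_views(RAW_LOG_LINES)
print(summary)
```

The raw lines and the summary describe the same events, but only the summary is structured enough for a buyer to use without doing the cleanup themselves.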
For more information, read the first three modules of the Data Supply Chain (Acquire, Store, and Aggregate).
Sometimes datasets are distinguished by the terms 'traditional data' and 'alternative data', a division often discussed in financial services. There, and in parallel industries, traditional data means measurements that directly describe fundamental things about an asset or other item of interest. These tend to be absolute and backward-looking.
Sometimes an alternative dataset can be used to infer something about a ‘traditional’ dataset. Parallel or secondary data can build on traditional data to infer additional detail or provide predictions.
For example, retail stores traditionally report their sales results in the quarter after the holidays, leaving investors to wonder about performance for a time. However, many retail stores have parking lots that can be monitored by security cameras or by satellite. Parking lot occupancy can be cross-referenced to nearby stores to guess what earnings will be.
If we ask the basic question “What will a retail store’s holiday sales numbers be?” we might have to wait until the next quarter for the report. To get some indication of these results sooner, we can use computational thinking: decomposing the result (sales) into the steps leading up to it (number of potential customers in stores, which can be inferred by how full parking lots are).
By pushing ourselves to see the root cause of or correlations to those sales, like an increased number of shoppers, we can get some indication of the outcome sooner.
Such a mindset lets us get an idea of what sales volume we might expect before we get the quarterly report that happens well after the holidays.
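The parking-lot reasoning can be sketched as a simple least-squares fit: learn how past occupancy related to the sales reported later, then apply that relationship to this season's occupancy. Everything specific here is an assumption for illustration: the numbers are invented, the names `fit_line` and `estimate_sales` are hypothetical, and a straight-line relationship is only one possible model.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented illustrative history: average holiday parking-lot occupancy (0 to 1)
# and the sales later reported for that quarter (in millions).
past_occupancy = [0.50, 0.60, 0.70, 0.80]
past_sales = [10.0, 12.0, 14.0, 16.0]

slope, intercept = fit_line(past_occupancy, past_sales)


def estimate_sales(occupancy):
    """Estimate not-yet-reported sales from observed occupancy."""
    return slope * occupancy + intercept


print(round(estimate_sales(0.75), 2))
```

An estimate like this is an early indication built on a correlation, not a replacement for the quarterly report.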
Traditional data:
Directly describes an asset’s market position or fundamentals
Broadly accessible, obvious, usually from within financial markets
Tends to be ‘now’ or ‘after the fact’
Tends to be free or low-cost
Often has a long, consistent history
Alternative data:
Can be used to infer fundamentals or something affecting fundamentals
Is ‘discovered’ or ‘mapped,’ sometimes not obvious; usually from outside financial markets
May be used to predict the future
Tends to be expensive
May have a shorter or less consistent history
Alternative Data offers many opportunities but is also full of ethical considerations.
Using datasets for purposes other than those the people who disclosed the data consented to can damage a brand's reputation and may also be illegal.
For example, selling datasets such as satellite imagery could result in de-anonymizing people or otherwise disclosing sensitive information if not handled correctly. It may be harmless to sell a few cropped satellite images at low resolution, but if those were shared with malevolent parties via resale or a data breach, they could be re-constituted and end up disclosing personally identifiable information, national security vulnerabilities, or even just basic security risks like showing the location of service doors on the roof of a mall.
You can read more about ethical considerations for alternative data in the Data Ethics guidebook.
A wide variety of datasets are monetized directly. For example:
Crunchbase collects data on companies, such as their structure, basic financials, leaders, investors, and key employees. By aggregating, normalizing, and verifying the information, Crunchbase makes it possible for other companies to build products that need reliable company information, such as an investing app that provides additional context to its users about the companies listed on a stock exchange.
Data is often accessible via third-party marketplaces. The Amazon Web Services Marketplace offers many elements useful for data-centric innovation, including both broad and highly focused datasets. For example, users can source data on:
And many other topics. Similarly, organizations can list their own data. Amazon's role is to provide a marketplace for many parties to lease data to each other.
Sometimes, companies both consume and produce data.
Nexar, which makes an advanced dashcam, probably had to access maps and other datasets to build its product.
In turn, it sells the images captured by users' cameras. In this case, Nexar markets street sign data that might otherwise be too complex or expensive to gather.
Suppose another company was attempting to make maps of particularly pedestrian-friendly areas. In that case, they might use Nexar's data to determine which roads had good 'pedestrian awareness' signs for drivers. At the same time, another hypothetical firm could find the most truck-friendly routes by avoiding streets with poor traffic controls.
Clearbit acquires, aggregates, and normalizes data about individuals, especially professionals. It does this by offering a free service to individuals in a two-way model: In exchange for sharing their address books with the company, members' address books are updated with the most up-to-date contact, title, and employment history.
Clearbit then monetizes that data with a premium, one-way version available to larger companies who get enriched data without having to provide data in turn.
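The 'most up-to-date record wins' idea behind a two-way model like this can be sketched in a few lines. This is not Clearbit's actual method, which the text does not describe; the `Contact` fields, the use of email as the matching key, and the sample records are all assumptions invented for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Contact:
    email: str        # used as the matching key in this sketch
    title: str
    employer: str
    updated_on: date  # when this version of the record was last confirmed


def merge_address_books(*books):
    """Keep the most recently updated record for each email address."""
    latest = {}
    for book in books:
        for contact in book:
            current = latest.get(contact.email)
            if current is None or contact.updated_on > current.updated_on:
                latest[contact.email] = contact
    return latest


# Invented sample records from two hypothetical members' address books.
book_a = [Contact("sam@example.com", "Analyst", "Acme", date(2021, 3, 1))]
book_b = [
    Contact("sam@example.com", "Director", "Globex", date(2023, 6, 15)),
    Contact("lee@example.com", "Engineer", "Initech", date(2022, 1, 10)),
]

merged = merge_address_books(book_a, book_b)
print(merged["sam@example.com"].title)
```

Each member contributes a partial, possibly stale view; the merged set is more current than any single address book, which is what makes it worth paying for.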
A vast amount of data is accessible via public or private APIs across almost every industry. It can be hard to choose which data sets to access, but it helps to think of data as domains and records: What domain (or industry) is that data in? And what kind of records might it have? For example, Bloomberg provides data in the 'Finance' domain and offers record types like news stories and stock statistics.
To get started, try using the Data Sources Explorer to explore generic data source types.
When it's time to evaluate actual data types to monetize (or to acquire), you can reference Attributes of Data.
Rather than selling complete data sets, some firms create value by providing insights based on data such as analyses of market activity, research or trends.
'Insights' companies include market research firms like Gartner and Forrester, as well as specific insight products, such as investment bank Credit Suisse's insight into the value of pharmaceuticals for industry benchmarking.
Market research firm Community Marketing Insights specializes in research about LGBTQ people consuming mass-market products.
These firms are not differentiated simply through the data they acquire and/or aggregate but through their unique analysis of it.
Their value propositions may include the underlying data or only the outcomes of their analysis.
Some of the insights for sale are basic mathematical analyses of statistics, while other insights firms construct qualitative frameworks (like Gartner's Hype Cycle), taxonomies, timelines, cause-and-effect studies, and/or integrate editorial perspectives.
Algorithms (mathematical models used for analyzing data) are another way to create value. A lot of energy goes into developing strong algorithms, and the differences in quality are enormous. While some organizations may create data science and machine learning teams, many firms just need access to a good algorithm to analyze their own datasets.
Monetization of algorithms generally falls into two categories: direct sale or licensing of an algorithm, or access to it through an application programming interface (API). The direct approach usually accelerates existing work on machine learning or operates inside a highly secured environment, while the API approach provides a commodity service.
Perhaps one of the best-known examples of an API-monetized algorithm is Google Image Recognition. Users of the service do not have direct access to Google's algorithm and need nearly no technical infrastructure to use it. They simply submit an image via an API call, and Google returns a weighted list of likely keywords. The service can recognize text, everyday objects, and brand logos. Causeit itself does this: we pass images of business cards we've received through Google to quickly identify companies and other relevant information for our internal customer relationship management tool.
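The submit-an-image, get-weighted-keywords pattern can be sketched without touching any real service. In the Python below, `fake_label_service` is a hypothetical stand-in for a hosted image recognition API (it is not Google's API, and its labels and scores are invented); the point is only how a caller might consume a weighted list of keywords.

```python
def fake_label_service(image_bytes):
    """Hypothetical stand-in for a hosted image-recognition API.

    A real service would analyze image_bytes remotely; this stub returns
    invented (keyword, confidence) pairs so the sketch runs offline.
    """
    if not image_bytes:
        raise ValueError("empty image")
    return [("business card", 0.97), ("logo", 0.88), ("paper", 0.41)]


def confident_keywords(image_bytes, min_score=0.8, service=fake_label_service):
    """Return keywords whose confidence meets min_score, best first."""
    labels = service(image_bytes)
    kept = [(keyword, score) for keyword, score in labels if score >= min_score]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [keyword for keyword, _score in kept]


keywords = confident_keywords(b"placeholder image bytes")
print(keywords)
```

Because the service is passed in as a parameter, the stub could be swapped for a real client without changing the calling code.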
Algorithms can be sold as a completely standalone service, like Google Image Recognition, or as a core part of a larger value proposition.
WeGlot, for example, helps website creators easily overlay their primary website with translated (or 'localized') editions and edit the various editions quickly. For example, an English-language website may have a Spanish edition. That functionality is useful on its own, but the real value of WeGlot is that they provide machine translation of your website using an aggregate of three different translation algorithms and a way to blend that machine translation with professional human translation where needed. WeGlot did not need to build an algorithm for translation—this would have been expensive and unnecessary—but instead was able to monetize other algorithms uniquely.
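One way to picture blending several machine translations with human translation is a majority vote with a human override. The text does not describe how WeGlot actually combines its engines, so this is only an assumed approach; `blend_translations` and the sample strings are invented for illustration.

```python
from collections import Counter


def blend_translations(machine_outputs, human_override=None):
    """Pick one translation for a segment.

    A human translation, when present, always wins. Otherwise choose the
    output most engines agree on (ties go to the earliest engine listed).
    """
    if human_override is not None:
        return human_override
    if not machine_outputs:
        raise ValueError("need at least one machine translation")
    counts = Counter(machine_outputs)
    best = max(counts.values())
    for text in machine_outputs:
        if counts[text] == best:
            return text


# Invented outputs from three hypothetical translation engines.
engine_outputs = ["Bienvenido", "Bienvenidos", "Bienvenido"]

print(blend_translations(engine_outputs))
print(blend_translations(engine_outputs, human_override="Le damos la bienvenida"))
```

The value in a design like this sits in the combination logic and the editing workflow, not in owning any of the underlying translation models.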
Many marketplaces offer algorithms as a service. This means that necessary parts of digital value propositions (or just data cleanup) can be browsed and implemented quickly.
Many algorithms are focused on cleaning up and/or labeling raw data, which are time-consuming but essential tasks for most data projects. Others are for functions like processing speech into text or recognizing images. And some are highly specialized, like models for forecasting hospital capacity.
Amazon Web Services' marketplace offers algorithms as a service and is one of the most visible venues for purchasing or licensing algorithms. Other firms like Algorithmia/DataRobot provide similar marketplaces, while GenesisAI is creating a marketplace for connecting various AIs.
To find out more, search for "algorithm API marketplace" plus your preferred domain.
Data can create value by helping you optimize your existing operations. Often, decisions inside organizations are made based on intuition or a limited view of what customers and employees needed in the past. Data can be used to help make decisions inside a business and dynamically optimize stages of production.
Data is often used to optimize:
In these situations, simple data-sharing can be powerful in its own right, such as sharing customer contact information across a company. More advanced implementations of machine learning and recommendation engines can help users operate more efficiently, freeing them to do more valuable parts of their jobs or entirely new ones.
Salesforce has a suite of offerings designed to augment human work through a customer's journey with a company. They have monetized data and related technologies in many ways, among them:
Salesforce strikes a balance between providing data-driven products & services and making it possible for customers to create their own value with data. Salesforce is an important example of data-centric value: a tool for an existing business function like sales contact management can also underpin new offerings.
At the same time, Salesforce's business model is designed to keep their products deeply integrated with the critical business functions of their clients. Salesforce's clients may feel 'locked in' to that ecosystem and dependent on—or even at the mercy of—a company with comparatively high prices and a proprietary mindset. It exposes a critical set of choices for leaders with fledgling data initiatives: Should they focus on what they want to do for their customers and accept the high cost of Salesforce in exchange for a rapid path to market? Or use less-integrated or internally-built components in their own firm?
Data-centric products take many forms. Here, 'data-centric products' mean any offering which uses data to create value that could not otherwise be created or where using data substantially improves the offering. For example:
One of the ways data-centric products create value is through data-informed personalization. Traditional business wisdom says that an organization needs to choose between personalization and scale. A bank might have high-quality, personal financial advice for their biggest clients but only offer cursory, generic accounts to their broader customer base. Using a combination of big data (trends from their entire user base) and little data (specific data points about an individual user), wise firms can vastly increase their service's actual and perceived value while continuing to operate at digital scale.
Whether it's movie recommendations from Netflix, Google's customized slide shows based on location and contact data, or a calendar app that knows when you need to leave to get to your next appointment on time, the best digital tools already personalize things for us. If you look more deeply at their strategies, you'll find that most balance big data and little data elements.
For example, Apple Watches ask users for simple goals around physical activity and then 'nudge' users to achieve those goals with tailored advice and encouragement. The watch might say, "you only need to take a brisk twelve-minute walk to reach your activity goal!" or "you're usually more active by this point in the day, but there's still time." This 'nudge' mentality is based in behavioral psychology and works because it isn't the same reminder at the same time and in the same way for every user. Generally, more personalized strategies like 'reminders that you don't set' justify an app's request for personal information and reinforce goodwill between the user and their app or device.
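A tailored nudge of this kind can be sketched as a small function that compares one user's progress with their own goal and habits. This is not Apple's logic; the function name `nudge`, the 15-minute threshold, and the fallback message are assumptions chosen for illustration.

```python
def nudge(minutes_done, goal_minutes, usual_minutes_by_now):
    """Return a tailored message based on one user's own activity data."""
    remaining = goal_minutes - minutes_done
    if remaining <= 0:
        return "You've reached your activity goal!"
    if remaining <= 15:  # assumed threshold for a "short walk" nudge
        return f"You only need to take a brisk {remaining}-minute walk to reach your activity goal!"
    if minutes_done < usual_minutes_by_now:
        return "You're usually more active by this point in the day, but there's still time."
    return f"{remaining} minutes to go. Keep it up!"


print(nudge(minutes_done=18, goal_minutes=30, usual_minutes_by_now=20))
print(nudge(minutes_done=5, goal_minutes=30, usual_minutes_by_now=20))
```

The same function produces different messages for different users at different times, which is the property that separates a nudge from a fixed reminder.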
Users even gladly participate with firms who are clearly attempting to offer them financial products like credit cards if the strategy is reciprocal enough. Services like Nerdwallet, Mint, and Credit Karma all ask for sensitive, personal information from users (often through secure APIs like Plaid) in order to first feed helpful information back to users, such as tips for improving their credit score or saving money. In that context, recommendations of credit cards and other financial products are more appropriate, better tailored, and more welcome than traditional broadcast ads for credit cards.
Businesses can also offer other companies personalized services:
Many of these personalization strategies help improve the value and reduce the friction of existing experiences while reserving face-to-face human time for truly complex and high-value issues.
Faced with many options but limited resources, it can be challenging to know what to invest in. Data can help with these decisions, whether you work at a financial institution participating in public marketplaces or inside a small firm prioritizing your team's time. Inside an enterprise, business modeling tools can help leaders determine which products and services to expand, pivot, or discontinue, and which markets to enter or withdraw from.
For example, individuals can use data when investing their savings. Automated investing apps like Wealthfront help people select stocks and other opportunities by matching the stock's performance to their users' tolerance for risk and timeline for expected return. At a basic level, so do 'drip-investing' apps like Acorns.
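Matching opportunities to a user's risk tolerance and timeline can be sketched as a simple filter. This is not how Wealthfront or Acorns work internally, which the text does not describe; the `Investment` fields, the 1-to-5 risk scale, and the sample options are all invented assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Investment:
    name: str
    risk_level: int         # 1 (low) to 5 (high); scale assumed for this sketch
    min_horizon_years: int  # shortest holding period the option suits


def suitable_investments(options, risk_tolerance, horizon_years):
    """Keep options within the user's risk tolerance and time horizon."""
    return [
        option
        for option in options
        if option.risk_level <= risk_tolerance
        and option.min_horizon_years <= horizon_years
    ]


# Invented sample options.
options = [
    Investment("Bond fund", risk_level=1, min_horizon_years=1),
    Investment("Index fund", risk_level=3, min_horizon_years=5),
    Investment("Small-cap fund", risk_level=5, min_horizon_years=10),
]

matches = suitable_investments(options, risk_tolerance=3, horizon_years=7)
print([option.name for option in matches])
```

The list of options is the 'big data' side and the two user parameters are the 'little data' side; the filter is where they meet.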
When used well, data can build or improve relationships.
The most apparent application might be relationships between you and your customers or users, but you can also use data to enable third parties to connect, as online social networks do. The most basic data can provide incremental lifts to your relationships with customers, as with online shops that offer a special discount to customers on their birthday.
The real opportunity is to use more advanced data about your users to help them learn both about themselves and what you can do for them. Instead of focusing on transactions and sales, focus on personalization strategies, match-making with other users, recommending helpful content, or other valuable 'a-ha!' moments that delight them.
Just like a good friend or colleague, aim to be generous and helpful, anticipate their needs, and assist them in thinking through decisions. Use data to listen to your customers rather than target them.
There are many strategies to create value with data. Whether evaluating your own opportunities or products on the market, it can be helpful to think through the elements of a digital value proposition. There are a few examples of digital value propositions below, based on the following ad lib.
1. Our [initiative or offering]
2. help(s) [customer group]
3. who want to [jobs to be done]
4. by using data from [sources]
5. to [reduce verb + customer pain]
6. and [increase verb + customer gain]
7. unlike [competing value propositions].
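The ad lib above can be treated as a template with seven slots. The sketch below is one way to capture it in Python so a team can fill it in consistently; the class name `ValueProposition`, its field names, and the sample values (loosely based on the parking-lot example) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ValueProposition:
    offering: str         # 1. initiative or offering
    customer_group: str   # 2. customer group
    jobs_to_be_done: str  # 3. jobs to be done
    sources: str          # 4. data sources
    pain_reduction: str   # 5. reduce verb + customer pain
    gain_increase: str    # 6. increase verb + customer gain
    competing: str        # 7. competing value propositions

    def sentence(self):
        """Assemble the seven slots into the ad lib sentence."""
        return (
            f"Our {self.offering} helps {self.customer_group} "
            f"who want to {self.jobs_to_be_done} "
            f"by using data from {self.sources} "
            f"to {self.pain_reduction} "
            f"and {self.gain_increase} "
            f"unlike {self.competing}."
        )


# Hypothetical fill-in for illustration.
proposition = ValueProposition(
    offering="parking-lot occupancy estimate",
    customer_group="retail investors",
    jobs_to_be_done="anticipate holiday sales results",
    sources="satellite and security-camera imagery",
    pain_reduction="reduce the wait for sales figures",
    gain_increase="increase confidence in earnings estimates",
    competing="waiting for the quarterly report",
)
print(proposition.sentence())
```

Making every slot a required field means an incomplete value proposition fails immediately instead of reading as a finished sentence.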
Stores traditionally report their sales results in the quarter after the holidays, leaving investors to wonder about performance for an uncomfortable length of time. However, many retail stores have parking lots. Parking lot occupancy seen from satellites or security cameras can be cross-referenced to nearby stores to guess future earnings results.
If we ask the basic question of "What will a retail store's holiday sales numbers be?" we might have to wait until a past-looking report is issued. What if we went further?
In the finance world, a focus on responsible investing has resulted in "Environmental, Social and Governance" stocks, or "ESG" stocks. However, it's not always easy to know which companies are doing the right thing, or which will do so in the future. So, someone could create a score that helps predict how well-governed a company might be, using company data such as founders, investors, and press releases to estimate how well a firm might behave and perform in the future:
1. Our governance predictor
2. helps ESG investors
3. who want to predict companies’ future ESG scores
4. by using data from Crunchbase, Angellist, Glassdoor, LinkedIn, and PRweb
5. to reduce the effort and time needed to assess companies’ governance
6. and better predict future ESG scores
7. unlike waiting until third-party assessments have occurred.
There are several roles to consider when creating value with data. This is not a strict set of roles but a prompt to think through the 'thinking capabilities' of your team. Make sure these perspectives (and functions) are present at some point in your development process:
List a few types of data you know your organization has access to. Then think of at least one use case where that data could create value for your customers.
Hint: How could this data be used to make your product or service easier to access or more useful for your customers?