Many organizations hear “third-party data” and jump to the consumption of external data for augmenting their in-house generated and curated datasets. But there’s another aspect of third-party data, which is providing it back to others for their consumption. This is a common monetization strategy for creating new revenue streams.
The continued growth in the availability and diversity of third-party data has more organizations thinking about how they can expose and monetize their own first-party datasets. By exposing your data you create the potential for new revenue, new business partnerships, and an ability to consume data derivatives created by others.
Deciding to monetize your data is only the first step. There are a few aspects to consider before you publish your first dataset, and to keep evaluating afterward, to ensure you are providing a valuable, easily consumable data product.
- Identifying valuable data. The first stage is to identify data within your ecosystem that’s valuable to other organizations. This will typically be data that enables hyper-personalization of other products or allows organizations to de-risk by feeding higher quantities of input data into their own models.
- Right to share. After determining potential data to monetize, you must determine if you have the right to share that data with others. This will typically be a function of your own consumer agreements shown at the time you collected the data.
- Geographic considerations. As more jurisdictions pass consumer data rights laws, the complexity for companies processing data will only increase. Any data product should be built so that consumer data associated with a specific geography can be removed quickly in response to changing laws.
- Confidential data removal. As part of creating the data products for external exposure, automated processes should be built to ensure that confidential customer data is removed from the dataset, and that the dataset has a secondary validation before being made available to data consumers.
- Contractual agreements. As new consumers begin leveraging your dataset, think about the contractual obligations they must commit to before retrieving data. These can include removing datasets after a certain period of time, reporting on how the data has been enriched, sharing derivative products, and honoring consumer-facing obligations to remove data on request.
- Authentication of consumers. Many data products come in different forms, each with its own cost and authentication method. Some combinations allow unauthenticated users to access public datasets, and even secured datasets can be shared if row-level security is enabled. A far less risky path is to publicly share automatically maintained versions of the listings that contain only redacted sample data. For certain data, there must be auditable proof that authentication and access controls are being enforced. This is particularly important for data that contains PII or other sensitive information, which may expose the publisher to risk.
- Locating data. Design data so it’s easy to find. All listings will have a set of keywords, categories, and a URL associated with them. These will be used when returning search results to a potential subscriber. You should develop a solid set of keywords for all data products, ensure the listings are in the correct categories, and have a valid URL for subscribers to contact support teams when necessary.
- Cost. Pricing for data products is often a difficult balance. It must start with the value that the consumer will realize initially and over time by having access to the initial dataset and any updates that come. The price should be low enough to enable quick experimentation and easy decision making to renew access, while high enough to cover the costs of systems, data cleaning, and operational support for the data product.
This is a valuable exercise even when data doesn’t meet the threshold to expose it as a new data product. Performing the above analysis can often enable organizations to discover data which is valuable to internal customers, and develop a methodology to deliver it effectively.
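The geographic and confidential-data considerations above lend themselves to automation. The following is a minimal Python sketch, not a definitive implementation, of a publication step that drops records from blocked geographies, strips confidential fields, and then runs a secondary validation pass. The field names (`email`, `phone`, `full_name`, `region`), the region code, and both function names are hypothetical and would need to match your own schema and legal requirements.

```python
import re

# Hypothetical schema: adjust these to the fields your dataset actually uses.
CONFIDENTIAL_FIELDS = {"email", "phone", "full_name"}
BLOCKED_REGIONS = {"EXAMPLE-REGION"}  # placeholder region code

# Deliberately simple pattern used only by the secondary check.
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def prepare_for_publication(records, blocked_regions=BLOCKED_REGIONS):
    """Drop records from blocked regions and strip confidential fields."""
    published = []
    for record in records:
        if record.get("region") in blocked_regions:
            continue  # the whole record is excluded for this geography
        published.append(
            {k: v for k, v in record.items() if k not in CONFIDENTIAL_FIELDS}
        )
    return published


def validate_published(records):
    """Secondary validation: return a list of problems found, empty if clean."""
    problems = []
    for index, record in enumerate(records):
        leaked = CONFIDENTIAL_FIELDS.intersection(record)
        if leaked:
            problems.append(f"record {index}: confidential fields {sorted(leaked)}")
        for key, value in record.items():
            # Catch confidential values hiding inside free-text fields.
            if isinstance(value, str) and EMAIL_PATTERN.search(value):
                problems.append(f"record {index}: email-like value in '{key}'")
    return problems
```

Keeping the validation separate from the removal step means a leak that slips past the first pass, such as an email address embedded in a free-text note, is still reported before the dataset reaches consumers.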
Once the data definition, legal considerations and cleansing processes have been finalized, you must identify a system or service to host your dataset, allowing others to find it, access it at high speed, and receive notifications of updates.
There’s a growing ecosystem of providers that focus on this type of platform service. AWS, Azure, and Google all have native systems, and third-party offerings such as Snowflake and Nomad Data are growing in popularity. When considering a data exchange, there is some critical functionality that should generally exist.
- Publish complete datasets. A dataset can be much more than a simple collection of tables and views. Often, it includes objects that add further value to the data, such as executable code blocks or machine learning algorithms. These other objects provide easier methods of analyzing the data, and should generally be published (even if the code itself is encrypted) alongside the data.
- Subscriber communication. The ability to easily communicate with subscribers, especially when providing a publicly available dataset, is important. Data isn’t static, and the underlying structures often change over time. The ability to communicate these changes and to gather feedback from the subscribers is often needed.
- Set conditions for charging for access. There are many tiers to consider when charging for access to data. Different customer profiles often require different pricing models and levels of access. For example, an internal customer might not be charged at all; a brand new customer might be given a limited-time free trial of the data; an enterprise customer might negotiate a discount. Regardless, we often see publishers who would like to charge differently based on the level of access or other criteria.
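The customer profiles in the pricing example above can be expressed as a small set of rules. The sketch below is one hypothetical way to model them in Python. The profile names, the list price, and the `monthly_charge` function are illustrative assumptions and do not reflect the API of any particular data exchange.

```python
from dataclasses import dataclass
from decimal import Decimal

LIST_PRICE = Decimal("1000.00")  # hypothetical monthly list price


@dataclass(frozen=True)
class Subscriber:
    name: str
    profile: str  # assumed values: "internal", "trial", "enterprise", "standard"
    negotiated_discount: Decimal = Decimal("0")  # fraction, e.g. Decimal("0.15")


def monthly_charge(subscriber, list_price=LIST_PRICE, trial_active=False):
    """Return the monthly charge for a subscriber under the assumed rules."""
    if subscriber.profile == "internal":
        return Decimal("0.00")  # internal customers are not charged
    if subscriber.profile == "trial" and trial_active:
        return Decimal("0.00")  # limited-time free trial
    if subscriber.profile == "enterprise":
        discounted = list_price * (1 - subscriber.negotiated_discount)
        return discounted.quantize(Decimal("0.01"))
    return list_price  # standard customers, and trials that have expired
```

Using `Decimal` rather than floating point avoids rounding surprises in currency amounts. A real exchange would likely also vary the price by level of access, which this sketch leaves out.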
Leveraging your datasets can be a powerful tool for increasing your own revenue and contributing to the growing ecosystem of third-party data. But while valuable, the process can be complex. That’s where Pythian can help. Contact us at [email protected] to learn how best to analyze, segment, and start monetizing your datasets for long-term business performance.