Getting the Data Out: The Distribution Layer in a People-Measurement Platform
By Gary Angel
July 19, 2023
Lidar is becoming the go-to technology for people-measurement applications, combining accuracy, precision, privacy and excellent real-time performance. A People-Measurement Platform is the software stack that sits downstream from lidar (and other people-measurement) sensors, ingests the data, cleans and contextualizes it, and then provides intelligence, reporting and distribution services.
In my last two posts, I described the data cleaning, mapping & data contextualization that a People-Measurement Platform does to make the raw data useable and interesting. Those cleaning and mapping functions live at the foundational layer of the platform. The next layer up – the Distribution Layer – is responsible for integrating people-measurement data into the rest of your analytics and operations stack.
I was tempted to make the distribution layer the final (bottom of diagram above) layer and I also debated whether the real-time feed should be lumped into distribution or Real-Time. In the end, I chose to put the Distribution layer immediately after the Foundation and NOT include the real-time feed in distribution. Both those decisions are somewhat arbitrary, but they reflect important realities in people-measurement that are worth emphasizing.
I chose to place Distribution at a lower level than Reporting & Analytics to reflect the organizational priorities of many enterprises. In the early days of digital analytics, the web or digital analytics tool was very much a standalone component. As organizations have become more analytically sophisticated, however, there has been an evolution into a broader analytics software stack in which purpose-built tools (like Adobe or GA) are just one component.
Given this, when a mature analytics organization looks at people-measurement, its FIRST priority is often getting the data into its existing analytics technology stack. That may include a data warehouse or data lake solution (e.g., Redshift or BigQuery), a general-purpose BI and analytics workbench (e.g., Tableau or Qlik), and a more powerful and programmatic analytics tool (e.g., R or SAS). This is often more important than whatever analytics or reporting comes in the bespoke tool. After all, most organizations already have full dashboards. They don’t want to build new dashboards with just people-measurement data. They want to blend people-measurement data into those existing systems. Ditto for analytics. It’s almost impossible to build a bespoke platform targeted to a specific domain and give it the kind of general-purpose capabilities that Tableau or R bring to the table.
Since it’s common for data integration to be the primary concern for the enterprise above the foundational level, I chose to emphasize that by putting the distribution layer right above the foundational system. If a people-measurement platform isn’t a good distribution platform, it isn’t worth investing in. Even if you don’t already have that broader analytics and reporting stack well-baked, you will have to get there eventually. So, supporting rich, easy distribution of data is job one for a People-Measurement Platform once the data is ready to go.
A People-Measurement Platform should make it easy for you to get data in both “push” and “pull” scenarios. You’ll typically want to create a regular, fully automated push of data to one or more destinations. It’s almost always essential for the platform to support data pushes at more than one level of granularity. You won’t usually want to push event-level data to BI/Reporting systems. Those systems aren’t performant at that quantity of data, nor are they the right place to process it. On the other hand, just providing a higher-level push (at the visitor level or some mapping level) won’t meet the needs of your data science and integration teams, which need event-level data.
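To make the granularity point concrete, here is a minimal Python sketch of rolling event-level records up to one row per visitor, so that the same data can be pushed at two levels. The `Event` fields (`visitor_id`, `zone`, `timestamp`), the function names and the summary columns are all hypothetical, chosen purely for illustration; a real platform will have its own schema.

```python
from collections import defaultdict
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Event:
    """One event-level record. Field names are assumptions for this sketch."""
    visitor_id: str
    zone: str
    timestamp: float  # seconds since epoch


def summarize_visitors(events):
    """Roll event-level records up to one summary row per visitor."""
    by_visitor = defaultdict(list)
    for event in events:
        by_visitor[event.visitor_id].append(event)

    rows = []
    for visitor_id, visitor_events in sorted(by_visitor.items()):
        times = [e.timestamp for e in visitor_events]
        rows.append({
            "visitor_id": visitor_id,
            "first_seen": min(times),
            "last_seen": max(times),
            "dwell_seconds": max(times) - min(times),
            "zones_visited": len({e.zone for e in visitor_events}),
            "event_count": len(visitor_events),
        })
    return rows


def build_pushes(events):
    """Prepare both payloads: full detail for data science, summary for BI."""
    return {
        "event_level": [asdict(e) for e in events],
        "visitor_level": summarize_visitors(events),
    }


if __name__ == "__main__":
    sample = [
        Event("a", "entrance", 0.0),
        Event("a", "aisle-3", 30.0),
        Event("b", "entrance", 10.0),
    ]
    pushes = build_pushes(sample)
    print(len(pushes["event_level"]), "event rows")
    print(len(pushes["visitor_level"]), "visitor rows")
```

In this sketch the event-level payload would go to the warehouse or data lake, while the much smaller visitor-level payload is what a BI tool would consume.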
It’s also important to support “pull” scenarios via an API (typically REST-based). This allows developers and analysts in your organization to access data specific to some specialized analytics or application need. You may be less concerned with this if you feel that your broader analytics stack and the tools there will cover this need. But a lot of general-purpose tools in the analytics stack are surprisingly poor at supporting generalized API access. If that’s your situation, making sure your People-Measurement Platform makes this easy will significantly improve your time to value.
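As an illustration of the pull side, here is a minimal Python sketch of an analyst script requesting data from a REST API using only the standard library. Everything specific in it is an assumption: the base URL `https://pmp.example.com/api/v1`, the `/visits` endpoint, the query parameters, the bearer-token header and the `rows` key in the response are hypothetical and do not describe any particular platform's API.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical base URL; substitute your platform's documented endpoint.
BASE_URL = "https://pmp.example.com/api/v1"


def build_visits_url(site_id, start, end, granularity="visitor"):
    """Build the request URL for a date range at a chosen granularity."""
    query = urlencode({
        "site": site_id,
        "start": start,
        "end": end,
        "granularity": granularity,
    })
    return f"{BASE_URL}/visits?{query}"


def http_get_json(url, token):
    """Perform an authenticated GET and decode the JSON response body."""
    request = Request(url, headers={
        "Authorization": f"Bearer {token}",
        "Accept": "application/json",
    })
    with urlopen(request, timeout=30) as response:
        return json.load(response)


def pull_visits(site_id, start, end, token,
                granularity="visitor", fetch=http_get_json):
    """Pull rows for one site; `fetch` is injectable so this can run offline."""
    url = build_visits_url(site_id, start, end, granularity)
    payload = fetch(url, token)
    # Assumes the (hypothetical) API wraps results in a "rows" list.
    return payload.get("rows", [])


if __name__ == "__main__":
    # Offline demonstration with a stand-in for the network call.
    def fake_fetch(url, token):
        return {"rows": [{"visitor_id": "a", "dwell_seconds": 30.0}]}

    print(pull_visits("store-12", "2023-07-01", "2023-07-02",
                      token="demo", fetch=fake_fetch))
```

The `granularity` parameter mirrors the push discussion above: the same pull interface could serve visitor-level summaries by default and event-level detail on request.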
Similarly, I broke the real-time feed out of distribution (even though it’s obviously a distribution function) because real-time tends to be a specialized set of use cases. Not every organization will do real-time processing at all, and you may not need real-time functionality of any sort. On the other hand, real-time applications are FAR more common in people-measurement than in, for example, digital analytics. And if you need real-time functionality, chances are you’ll need a real-time feed along with a fair amount of other rather specialized functionality.
But that, as it happens, is the topic for next time.