Bringing Machine Learning to Store Analytics
By Gary Angel
January 31, 2019
Over the last six months we’ve aggressively re-engineered one of the key processes in our business to be Machine-Learning based. We use a variety of vendor technologies to collect data about shopper movement in store. Of those technologies, electronic detection is probably the most common. Electronic detection works by listening for probes (wifi or Bluetooth) from shopper devices and then triangulating the signals to determine location. It’s fairly inexpensive and provides tracking across the entire shopper journey (you can read more about the various technologies here). What electronic detection doesn’t do very well is position the shopper accurately in the store.
That’s kind of a big deal.
If all you want is to know how many shoppers entered a store, positional accuracy doesn’t matter all that much (though it does still matter as we’ve found most systems capture lots of out-of-store shoppers). But our DM1 platform is designed to track and segment on detailed shopper behaviors. So we need to know if a shopper was in Men’s Jeans or Women’s Lingerie.
As I said, it’s kind of a big deal.
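To make the positioning problem concrete, here is a minimal Python sketch of the signal-triangulation idea described above. It is an illustration only, not the DM1 method or any vendor's algorithm. The log-distance path-loss model, the reference power of -40 dBm at 1 m, the path-loss exponent of 2.5, the sensor coordinates, and the function names `rssi_to_distance` and `trilaterate` are all assumptions chosen for the example.

```python
import math

def rssi_to_distance(rssi_dbm, ref_power_dbm=-40.0, path_loss_exponent=2.5):
    """Estimate range in metres from signal strength (assumed log-distance model).

    ref_power_dbm is the assumed signal strength at 1 m; both defaults are
    illustrative and would need calibration in a real store.
    """
    return 10 ** ((ref_power_dbm - rssi_dbm) / (10 * path_loss_exponent))

def trilaterate(sensors, distances):
    """Least-squares 2D position from three or more sensors and range estimates.

    Subtracting the first sensor's circle equation from each of the others
    turns the problem into a linear system, solved here via normal equations.
    """
    (x0, y0), d0 = sensors[0], distances[0]
    sxx = sxy = syy = bx = by = 0.0
    for (xi, yi), di in zip(sensors[1:], distances[1:]):
        ax, ay = 2 * (xi - x0), 2 * (yi - y0)
        rhs = d0**2 - di**2 + xi**2 + yi**2 - x0**2 - y0**2
        sxx += ax * ax
        sxy += ax * ay
        syy += ay * ay
        bx += ax * rhs
        by += ay * rhs
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("sensors are collinear; position is not determined")
    x = (syy * bx - sxy * by) / det
    y = (sxx * by - sxy * bx) / det
    return x, y

# Hypothetical sensor layout (metres) and a shopper standing at (3, 4).
sensors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
true_position = (3.0, 4.0)
exact_ranges = [math.hypot(true_position[0] - sx, true_position[1] - sy)
                for sx, sy in sensors]
estimate = trilaterate(sensors, exact_ranges)
```

With exact ranges the estimate recovers the true position. The trouble is the first step: under these assumed parameters, a 5 dB swing in measured signal strength changes the range estimate by a factor of about 1.6, which is easily the difference between one department and the next.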
We’ve spent a lot of time working with various electronic tracking systems and all of them, frankly, were disappointingly inaccurate. There are technology solutions to fix this (things like Cisco’s Hyper-Location Access Points), but they’re prohibitively expensive for most retail clients. So we decided to create a better solution and that solution involves a LOT of machine learning.
After a lot of work, we’ve started rolling out the ML-based systems to our clients and it’s become our default method of bringing up new clients if they’re using electronics. It’s significantly better in almost every respect than the methods we’ve replaced – and it makes electronic tracking much more useful.
But it was no easy journey and along the way we had to significantly re-engineer and re-create a number of core processes around data collection and analysis. And while some of those lessons are highly specific to our particular application, a lot of them are applicable to some or all ML processes. So even if you’re not focused on store analytics, I hope you’ll stay tuned. ML is coming to every field of analytics and it has implications for the analyst and the analyst manager that go well beyond which tool you’re using or what kind of model you build.
Over the next half-dozen posts, I’m going to walk through our central challenge with positional accuracy, explain why ML was a good direction, and then delve into a host of issues we turned up and had to (mostly) solve. That will include the basic challenge embedded in our data, where lack of a known answer initially prevented us from using supervised learning techniques. It will also include our efforts to tune our ML data to make it both richer and more representative. I’ll cover some of our tool choices and talk a little bit about the skill sets we brought to the table. None of us are ML experts but we have a LOT of programming and analytics chops. Finally, I’m going to spend quite a bit of time on the difficulties in operationalizing our ML models. We actually built two completely independent ML pipelines for production purposes and each had some advantages and some drawbacks. Operationalizing ML processes inevitably draws you deeper into tool questions but there are deep analytics questions involved too.
With the ML-based location analytics going live, we’ve started developing additional ML processes, beginning with Associate identification. We expect that all our work understanding and developing ML processes will pay off in a host of new and improved analytic processes and methods.
Like our machines, we’ve had to learn a lot along the way. And I’m hoping this series of blogs can help you take a similar journey with perhaps a little less pain!