Spot customers early: aha moments, predictive models, and beyond
Marketers usually want to know how their acquisition channels are performing as soon as possible. But if it takes a couple of months for a user to convert, it makes sense to build a model that predicts potential customers early on.
In this post, I share my experience building such a predictive model, explain why it didn’t work as expected, and describe how we plan to work around the problem in the future.
Some numbers were changed or hidden due to privacy concerns.
To begin with, Renetti is an online-to-offline (O2O) business, which means that a customer can buy either online or in-store.
If a customer buys in-store (as the majority do), we try to tie their offline purchase to their online behavior using a bunch of matching points.
Sometimes, when a customer uses different devices before and after the purchase, we end up with a strange situation where the customer’s first recorded visit date is later than the order date (which normally shouldn’t happen).
Obviously, I included only properly tracked customers.
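The filtering step can be sketched in a few lines of pandas. This is a minimal illustration with made-up column names (`first_visit`, `first_order` are not Renetti’s actual schema): keep only customers whose first recorded visit precedes their first order.

```python
import pandas as pd

# Hypothetical example: drop customers whose tracking looks broken,
# i.e. whose recorded first visit happens AFTER their first order.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "first_visit": pd.to_datetime(["2021-01-05", "2021-03-20", "2021-02-01"]),
    "first_order": pd.to_datetime(["2021-02-10", "2021-03-01", "2021-02-15"]),
})

# A properly tracked customer visits the site before ordering
properly_tracked = customers[customers["first_visit"] <= customers["first_order"]]
print(properly_tracked["customer_id"].tolist())  # customer 2 is dropped
```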
As shown in the histogram above, the purchase decision may take months. Since we can’t wait that long to see how our ads are doing, I decided to find a quick way to identify potential customers early on, an insight like: “If a user views 3+ products during week 1, they’ll become a customer with 70% probability”.
The initial idea was to find a micro-conversion (aka “Aha moment”), a single event that would differentiate potential customers from noncustomers. Something like Facebook’s “7 friends in 10 days”.
Here is what I did:
- Picked a list of website events indicating level of engagement;
- Segmented customers by the delay between first visit and first order. Most likely, a person who is going to buy 3 months from now behaves differently from one who is going to buy tomorrow;
- Checked week 1 behaviour across those segments.
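The segmentation step above can be sketched with `pd.cut`. The column names (`days_to_order`, `week1_product_views`) and the bucket boundaries are illustrative assumptions, not the actual ones I used.

```python
import pandas as pd

# Toy data: delay between first visit and first order (days),
# and product views during the first week on the site.
df = pd.DataFrame({
    "days_to_order": [1, 5, 20, 45, 100, 3, 60],
    "week1_product_views": [9, 7, 4, 3, 1, 8, 2],
})

# Bucket customers by how long they took to purchase
df["segment"] = pd.cut(
    df["days_to_order"],
    bins=[0, 7, 30, 90, 365],
    labels=["<1 week", "1-4 weeks", "1-3 months", "3-12 months"],
)

# Compare average week-1 activity across segments
print(df.groupby("segment", observed=True)["week1_product_views"].mean())
```

On real data, a table like this is what shows whether quick buyers really are more active during week 1 than slow ones.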
It turned out they did behave differently: the closer a person is to the purchase, the more active they are on the website. Here are some examples (swipe to see more).
However, when I tried to separate customers from noncustomers by a certain threshold using only one event, I got results that were, well, not super accurate.
The best filter was able to identify 60% of the customers with 1% precision — meaning only 1 out of 100 users flagged by the filter would actually become a customer.
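Here is the arithmetic behind that recall/precision pair, with toy numbers (the real figures are Renetti’s; these are made up to show the shape of the problem):

```python
# Suppose 1,000 customers among 200,000 week-1 visitors, and a filter
# like "viewed 3+ products in week 1". Illustrative numbers only.
customers_total = 1_000
flagged_customers = 600          # customers the filter catches (recall = 60%)
flagged_noncustomers = 59_400    # noncustomers it also catches

recall = flagged_customers / customers_total
precision = flagged_customers / (flagged_customers + flagged_noncustomers)
print(f"recall={recall:.0%}, precision={precision:.1%}")  # recall=60%, precision=1.0%
```

Even though the filter catches most customers, the flagged group is overwhelmingly noncustomers, because noncustomers vastly outnumber customers to begin with.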
The difference in week 1 behavior wasn’t big enough given how disproportionate those segments were: customers constituted only a tiny fraction of all website visitors.
But I decided to go further and try some machine learning.
First, I picked the most relevant features to train the model on.
A quick way to see how different website events correlate with each other is a pair plot. Here is a piece of mine (I added more features later on):
I won’t go deep into the ML details here (I’m not a data scientist, after all); I’ll just say that I tried a bunch of different algorithms and dataset compositions (as I said, the initial dataset was very imbalanced).
After a quick and dirty trial-and-error process, I ended up with the following results for different algorithms:
For example, Random Forest (rf) was able to detect 80% of customers with 20% precision.
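As a hypothetical reconstruction of that experiment, here is how a Random Forest could be trained and scored on a balanced dataset with scikit-learn. Synthetic features stand in for the real website events; the parameters are assumptions, not the ones I actually used:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

# Balanced 50/50 synthetic dataset in place of the real event features
X, y = make_classification(
    n_samples=2_000, n_features=6, n_informative=3,
    weights=[0.5, 0.5], random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y,
)

rf = RandomForestClassifier(n_estimators=100, random_state=42)
rf.fit(X_train, y_train)
pred = rf.predict(X_test)

print(f"recall={recall_score(y_test, pred):.0%}, "
      f"precision={precision_score(y_test, pred):.0%}")
```

The catch, as the next paragraph shows, is that metrics computed on a balanced test set say little about performance on the real, imbalanced traffic.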
Unfortunately, this level of precision was possible only on the balanced dataset (customers = noncustomers). When I tried the model on the real, highly disproportionate data, precision dropped to just 2%: 98 out of 100 predicted customers would be false positives.
Here is a good way to visualize it:
Just to make the point: if the number of customers were equal to the number of noncustomers, the model would be much more accurate.
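The collapse is pure base-rate arithmetic: with the same per-class error rates, precision falls apart as the positive class gets rarer. The rates below are illustrative, not my model’s exact numbers:

```python
def precision(tpr: float, fpr: float, base_rate: float) -> float:
    """Precision = TP / (TP + FP) for a given positive base rate."""
    tp = tpr * base_rate           # share of all users who are true positives
    fp = fpr * (1 - base_rate)     # share of all users who are false positives
    return tp / (tp + fp)

# A model that catches 80% of customers and misflags 20% of noncustomers:
print(f"balanced (50% customers):    {precision(0.8, 0.2, 0.5):.0%}")    # 80%
print(f"imbalanced (0.5% customers): {precision(0.8, 0.2, 0.005):.1%}")  # ~2.0%
```

Nothing about the model changed between the two lines, only the share of customers in the population, and that alone drops precision from 80% to about 2%.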
In our case, it wasn’t possible to accurately predict customers during the first week, mainly because of two reasons:
- The difference in behavior wasn’t big enough;
- Segments were too disproportionate.
So what’s next? Well, if we can’t find a micro-conversion, we’ll create one: an event that falls in the middle of the interest-friction spectrum.
But that’s a subject for another post.