Are Your Models & Data Leaving Money on the Table?

June 2, 2017

By Nate Struckmeier | Advanced Marketing Analytics

89 Degrees' own Sr. Data Analyst, Nate Struckmeier, explores how new modeling techniques are making it easier for companies to see performance gains from their existing data and from new browsing and engagement behavior.

There are few resources more important to a company than its audience. Customer interaction forms the backbone of any business success and, as such, becomes a primary focus within marketing. Customer relationship management manifests in numerous outlets such as loyalty programs, offer testing, and multi-channel analysis; however, probably its most potent application resides within targeting optimization. The days of batch-and-blast are fading with the onset of new software, new techniques, and easier approaches to smartly interacting with one’s customer base. Yet despite improvements in modern statistics and machine learning, many companies are sitting on derelict models in need of an upgrade. The data on which a model is built could be valuable, its integrity sound, its variables applicable, and yet companies shy away from the immediate benefits of remodeling under the false assumption that the process is complex and fraught with peril. Many companies are missing out on an efficient source of increased profit, as new modeling techniques could swiftly be applied to already existing data with minimal effort, providing noticeable improvement in performance.

As a demonstration, let us assume we already have a model in use. For now, we’ll assume it’s a logistic regression model that splits customers into deciles based on likelihood to purchase. Each decile is composed of households classified by the model, and exhibits its own conversion rate. With just a few useful metrics we can quickly calculate the expected revenue from a promotion.
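To make the decile idea concrete, here is a minimal sketch of how scored households might be split into deciles and how a conversion rate per decile could be computed. The function names, household IDs, scores, and purchasers are all hypothetical and exist only for illustration; in practice the scores would come from the model itself.

```python
def assign_deciles(scores):
    """Map each household id to a decile (1 = highest predicted likelihood)."""
    # Rank households from highest to lowest model score.
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    # Position i in the ranking falls into decile floor(10 * i / n) + 1.
    return {hh: (10 * i) // n + 1 for i, hh in enumerate(ranked)}


def conversion_by_decile(deciles, purchased):
    """Return {decile: conversion rate} given the set of households that purchased."""
    totals, buyers = {}, {}
    for hh, decile in deciles.items():
        totals[decile] = totals.get(decile, 0) + 1
        buyers[decile] = buyers.get(decile, 0) + (hh in purchased)
    return {d: buyers[d] / totals[d] for d in sorted(totals)}


# Hypothetical scores for 20 households (illustrative only).
scores = {f"hh{i:02d}": 1 - i / 20 for i in range(20)}
deciles = assign_deciles(scores)
rates = conversion_by_decile(deciles, purchased={"hh00", "hh02"})
```

With real data, `rates` is the table you would read the top-decile conversion figures from.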

We’ve targeted the top two deciles of our model, 1,050,000 households in total, with a carefully tailored offer. The deciles have a combined average conversion rate of 10.38%, so we can expect a response of around 109k purchasers (10.38%*1,050,000 = 108,990). On average, each customer spends $23 per order, providing an expected revenue of about $2.5M ($23*108,990) for current performance (Fig. 1).
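The arithmetic above can be sketched in a few lines. The function name `expected_revenue` is our own; the inputs are the figures quoted in this example (1,050,000 targeted households, a 10.38% conversion rate, and $23 per order).

```python
def expected_revenue(audience_size, conversion_rate, avg_order_value):
    """Return (expected purchasers, expected revenue) for a targeted audience."""
    purchasers = audience_size * conversion_rate
    return purchasers, purchasers * avg_order_value


# Figures from the current model: top two deciles, 10.38% conversion, $23 per order.
purchasers, revenue = expected_revenue(1_050_000, 0.1038, 23)
print(f"{purchasers:,.0f} purchasers, ${revenue:,.0f} expected revenue")
```

Rounded to whole numbers, this gives the roughly 109k purchasers and $2.5M quoted above.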

Not bad. The current model seems to be working, adequately identifying customers belonging in your top two deciles. However, perhaps the relationship between the response and some of the variables within the model is not entirely linear. Perhaps you have missing values within observations, or some of the variables themselves are suspect. The logistic regression model previously used is less robust than you would like, and there’s room for improvement. Using the same data as in the old model, application of a newer technique such as Random Forests, Gradient Boosting, or Neural Networks can improve your accuracy when classifying customers and thus increase overall response.

Let’s say we apply a Random Forest to your modeling data. Random Forests have fewer problems handling non-linearity and cope more easily with null values, thus improving the modeling process and results for this dataset. The model arrived at via application of Random Forests provides a higher level of accuracy when identifying your top deciles, thus improving their conversion when targeted. Deciles 1 & 2 had a 10.38% purchase rate under Model 1 and improved to 11.28% under Model 2!

I know what you’re thinking. Yes, 11.28 – 10.38 is 0.9, and an improvement of less than one percentage point from remodeling looks underwhelming. However, even a small improvement in a single decile can lead to substantial gains in sales. The 0.9-point increase in conversion translates into roughly 9.5k additional converters from the model upgrade (0.9%*1,050,000 = 9,450). These additional customers average $23 per purchase, leading to a potential sales increase of about $217k. Overall, a 0.9-point increase in performance translates into $217k in additional sales, and that assumes no further optimization, which is highly unlikely.
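The same calculation works for the lift between two models. This is a minimal sketch using the figures from the example; the function name `incremental_sales` is our own, and it assumes the audience size and average order value stay the same under both models.

```python
def incremental_sales(audience_size, old_rate, new_rate, avg_order_value):
    """Return (additional converters, additional sales) from a conversion-rate lift."""
    extra_converters = audience_size * (new_rate - old_rate)
    return extra_converters, extra_converters * avg_order_value


# Model 1 (10.38%) versus Model 2 (11.28%) on the same 1,050,000 households.
extra, sales = incremental_sales(1_050_000, 0.1038, 0.1128, 23)
print(f"{extra:,.0f} additional converters, ${sales:,.0f} in additional sales")
```

Note that the result is additional sales, not profit; turning it into profit would require margin and campaign cost figures that this example does not include.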

There is readily available profit within existing data, but it’s only the beginning. One can go even further by using new techniques in customer tracking, which yield browsing and engagement data and, with it, more valuable variables from which to model. Many companies have some form of modeling and segmentation in place, usually based on historical purchase response, loyalty program values, and channel activity. However, with the recent improvement in online visitation detail and visibility into traffic flow, one can create powerful new variables to improve customer insight.

Modeling allows the identification of the key performance variables most important to your audience: the measurable values most closely connected to a desired response such as conversion. Yet a customer’s journey to purchase is much richer than mere transaction data. A customer could realize their need for a product yet browse online for weeks. They could request a catalog, or opt in to certain notifications and newsletters. They could localize their online visits to certain product categories, make return visits to similar items, or adjust their online cart. They could navigate through banner ads related to the desired purchase, or call in to inquire about a promotion. Companies have a greater ability to construct customer profiles than ever before, and it begins with modeling the browsing and engagement variables most closely tied to response.

Let’s stick with our process. We’ve modified our first model by application of Random Forests and observed increased purchase from our top deciles, but you’re looking for another gain. You know there are two flagship items for your company: products with some serious pull in increasing conversion for both new-to-file and existing customers. You add one more binary variable to the new model: “has the customer browsed a category page related to your top items?” This variable proves significant in predicting whether your customer will purchase. The addition of engagement data increases your model’s accuracy in construction of your top two deciles, and performance increases from 11.28% to 12%! Combining the power of machine learning with browsing & engagement leads to immediate success (Fig. 2).
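As a rough illustration of how such a binary variable might be built, the sketch below derives a flag per customer from raw page-view records. The category names, customer IDs, and the `(customer_id, category)` record layout are hypothetical; your own web analytics export will look different.

```python
# Hypothetical category slugs for the two flagship items.
FLAGSHIP_CATEGORIES = {"flagship-a", "flagship-b"}


def browsed_flagship(page_views, flagship_categories=FLAGSHIP_CATEGORIES):
    """Return {customer_id: 1 or 0}; 1 if the customer viewed any flagship category page."""
    flags = {}
    for customer_id, category in page_views:
        hit = category in flagship_categories
        # Once a customer is flagged, later non-flagship views do not reset the flag.
        flags[customer_id] = int(flags.get(customer_id, 0) or hit)
    return flags


# Hypothetical page-view log: (customer_id, category page viewed).
page_views = [
    ("c1", "flagship-a"),
    ("c1", "clearance"),
    ("c2", "clearance"),
    ("c3", "flagship-b"),
]
flags = browsed_flagship(page_views)
```

The resulting 0/1 column can then be joined to the existing modeling data as one more input variable.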

Expanding one’s horizons into browsing and engagement data, or into machine learning, will lead to exciting improvements in customer response, yet it’s the combination of the two areas which leads to the greatest potential for change. Optimizing a model around existing data will provide increased conversion, but providing that same model with additional data on your customers will provide valuable insight into exactly why they purchase. Analyzing new engagement behavior will shed light on customer preference, but using machine learning to establish strong, statistically significant correlations between engagement and response gives the foundational knowledge required for lasting growth.

About the Author


Nate Struckmeier
Sr. Data Analyst

Raised in Oregon, Nate is an experienced data analyst proficient in analytic software and techniques. When it comes to problem solving, he loves to “live in the weeds,” searching at the most granular level for interesting insights. When not working on projects for IKEA, Delhaize, Genzyme, Harry and David, or Talbots, he enjoys sketching, chess and other strategy board games, and binge-watching Star Trek and Cowboy Bebop.
