Machine Learning & The Long Tail Paradox

When does automation suit your business model?

Photo by Murat Ustuntas on Unsplash

It's not that the phone wasn't sharp enough to cut the orange; it's that a phone is not the right tool for the job.

Most of the time, when ML (machine learning) doesn't meet its business potential, it's not because your data is incomplete, or because your data scientists aren't extroverted introverts, or anything of that sort. It's usually because ML is not the right tool for your business model.

In this article, I want to discuss the misalignments that may arise between businesses and their automation efforts.

While discussing a new project, my manager told me that artificial intelligence lives at the long tail. Her remark sent me to re-read Chris Anderson's book, The Long Tail.

I had a different career when I read the book for the first time. This time, I read it and its criticism from a different vantage point. The analogies I found helped me understand when ML offers business value and when it is not the right tool for the job, and I hope to share that understanding with you here.

Let me start by listing the book's key takeaways.

Chris Anderson's Long-Tail Theory

When retailers have limited shelf-space, they choose to sell only the most popular items. But when shelf-space and other distribution costs cease to be the limiting factors, retailers can expand their offering to less popular niches: the long-tail items. Hence the name of the book.

The Short-head vs the Long-tail

For example, when the internet offered an infinite shelf space for retailers like Amazon, Amazon became the everything store.

It's more than the shelf-space

Obviously, shelf-space is not the only limiting factor; there are more forces at play here. The book argues that the cost for consumers to reach the long-tail items drops due to three forces: democratizing production, democratizing distribution, and connecting supply and demand.

Understanding these three forces, and how automation fits into each of them, is key to knowing when ML is essential for a business. So, let's start with the first two forces combined.

Democratizing Production & Distribution

The first secret to creating a thriving long-tail is making everything available to everyone. Aggregators like YouTube or TikTok don't just democratize production (anyone can create a video); they also democratize distribution, by making these videos reachable from your laptop, your iPhone or Google Chrome. The same pattern exists everywhere:

And the list goes on and on.

Now, you may ask: "Why is automation required here?"

For every item you offer as an aggregator, there are overheads. Maybe you have to pay suppliers to ship each item to your warehouse. Maybe you pay for stocking it. Maybe you have a minimum quality bar, and you want to check each item before accepting it. Maybe you want to categorize these items, write descriptions for them and translate those descriptions. These additional costs are more or less fixed per item. But when you check the long-tail graph, you will notice that revenue varies per item. This means that your ROI (return on investment) also varies per item.

This variation exists even when you limit yourself to the short-head, but when the long-tail is the name of the game, the ROI variation becomes extreme. Items towards the end of the tail may never sell, but you still have to bear their overheads.

This extreme variation in ROI means that you either stick to the short-head arena, or you find ways to bring the cost of offering each item close to zero. And automation is a key tool if you opt for the second option.
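To make the argument concrete, here is a minimal sketch of how a fixed overhead per item interacts with long-tail revenue. Everything in it is an assumption made for illustration: the function name `roi_per_item`, the power-law shape of the revenue curve, and the numbers themselves. It is not taken from the book or from any real catalogue.

```python
def roi_per_item(n_items, top_revenue, overhead_per_item, exponent=1.0):
    """Return the ROI of each item, ranked from most to least popular.

    Assumption: revenue decays as a power law of the popularity rank,
    while the overhead is the same for every item.
    """
    rois = []
    for rank in range(1, n_items + 1):
        revenue = top_revenue / rank ** exponent
        rois.append((revenue - overhead_per_item) / overhead_per_item)
    return rois


# Hypothetical catalogue: 1,000 items, the best seller earns 10,000,
# and every item costs 50 to onboard.
rois = roi_per_item(n_items=1000, top_revenue=10_000.0, overhead_per_item=50.0)

# First popularity rank at which an item no longer covers its own overhead.
first_loss_rank = next(i + 1 for i, r in enumerate(rois) if r < 0)

print(f"ROI of the best seller: {rois[0]:.1f}")
print(f"ROI of the last item: {rois[-1]:.2f}")
print(f"Items lose money from rank {first_loss_rank} onwards")
```

Under these made-up numbers, most of the catalogue sits below break-even, which is why the per-item overhead is the thing to attack.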

Automation needs scale

Businesses that stick to the short-head arena can afford to do things manually. If it is all about the most popular items, then having a team check each item manually is justified, and the ROI will be mostly positive. And if an item is not performing well, you can just drop it from your inventory.

Actually, I would advise these businesses to keep their automation efforts minimal, since the cost of automation at small scale will exceed the cost of manual work. You can see what I mean in the figure above.

But once your inventory goes beyond the popular items, you have to adopt a mixture of automation, machine learning and user-generated content. At that scale, the linearly growing cost of manual work overtakes the cost of automation, which is dominated by its initial investment.
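The break-even point between the two approaches can be sketched in a few lines. The cost model here is an assumption: manual work costs a fixed amount per item, while automation has an upfront investment plus a smaller cost per item. The function names and the figures are hypothetical.

```python
import math


def manual_cost(n_items, cost_per_item):
    """Manual work grows linearly with the number of unique items."""
    return n_items * cost_per_item


def automated_cost(n_items, upfront_investment, cost_per_item):
    """Automation pays a one-off investment, then a small cost per item."""
    return upfront_investment + n_items * cost_per_item


def break_even_items(upfront_investment, manual_per_item, automated_per_item):
    """Smallest inventory size at which automation is strictly cheaper."""
    saving_per_item = manual_per_item - automated_per_item
    if saving_per_item <= 0:
        return None  # automation never pays off under this model
    return math.floor(upfront_investment / saving_per_item) + 1


# Hypothetical figures: 100,000 to build the automation,
# 5 per item when done by hand, 1 per item once automated.
threshold = break_even_items(100_000, manual_per_item=5, automated_per_item=1)
print(f"Automation becomes cheaper from {threshold} items onwards")
```

Below that threshold, the short-head business is better off doing things by hand, which is the point made in the previous paragraphs.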

The probabilistic age paradox

It's hard to define quality since, like beauty, it is in the eye of the beholder. Yet, generally speaking, quality varies much more in the long-tail than it does in the short-head. Anderson referred to the era where quality varies that much as the probabilistic age.

"With probabilistic systems there is only a statistical level of quality, which is to say: Some things will be great, some things will be mediocre, and some things will be absolutely crappy" — Chris Anderson

He used Wikipedia to make his argument clearer.

"The point is not that every Wikipedia entry is probabilistic, but that the entire encyclopedia behaves probabilistically. Your odds of getting a substantive, up-to-date, and accurate entry for any given subject are excellent on Wikipedia, even if every individual entry isn't excellent" — Chris Anderson

And here comes the paradox:

Nassim Nicholas Taleb compared two cases. Cases where individual items are more or less the same, he calls Mediocristan. Cases where items vary wildly, he calls Extremistan.

Extremistan vs Mediocristan

People's height is an example of Mediocristan. In a class of 10-year-old students, everyone has a similar height. So if you use their average height to predict the height of a new student, you will not be far off. That's why predictions are easy in Mediocristan. People's wealth is an example of Extremistan. Good luck trying to use the worldwide median income to predict Jeff Bezos' wealth.
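A toy calculation shows the difference. All the numbers below are made up for illustration, and for simplicity the sketch uses the mean for both cases rather than the median.

```python
from statistics import mean


def relative_error(predicted, actual):
    """How far off a prediction is, as a fraction of the actual value."""
    return abs(predicted - actual) / actual


# Mediocristan: hypothetical heights of a class, plus a tall newcomer.
heights_cm = [135, 138, 140, 142, 145]
newcomer_height_cm = 150

# Extremistan: hypothetical yearly incomes, plus a newcomer worth billions.
wealth_usd = [20_000, 35_000, 50_000, 80_000, 120_000]
newcomer_wealth_usd = 100_000_000_000

height_error = relative_error(mean(heights_cm), newcomer_height_cm)
wealth_error = relative_error(mean(wealth_usd), newcomer_wealth_usd)

print(f"Height prediction is off by {height_error:.1%}")
print(f"Wealth prediction is off by {wealth_error:.1%}")
```

Even for an unusually tall newcomer, the height prediction misses by a few percent. The wealth prediction misses by almost the entire value.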

As we have just discussed, the quality of the items in the long-tail belongs to Extremistan. Thus, on the one hand, we need automation to deal with these items, but on the other hand, predictions are harder there since the items are citizens of Extremistan.

This is the paradox machine learning engineers have to deal with. Their work is needed the most exactly where it is hardest to do.

Martin Casado and Matt Bornstein seem to agree, and they offer advice to practitioners on dealing with long-tails. Nevertheless, it's usually the case that in a long-tail world, everyone has to get comfortable dealing with probabilities.

The machine learning algorithms used at the long-tail aren't confident all the time. That's why the models built there have to rely on probabilities. Along with their predictions, they return additional values to show their confidence level in their own prediction.
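One common way to put such confidence values to work is to act automatically only on confident predictions and hand the rest to a person. The sketch below assumes a model that returns a label with a confidence between 0 and 1. The function name `route_prediction`, the threshold of 0.8 and the sample predictions are all hypothetical.

```python
def route_prediction(label, confidence, threshold=0.8):
    """Accept confident predictions; send the rest to a human reviewer."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence >= threshold:
        return ("auto", label)
    return ("manual_review", label)


# Hypothetical output of a product categorization model.
predictions = [("shoes", 0.97), ("books", 0.55), ("toys", 0.80)]
routed = [route_prediction(label, conf) for label, conf in predictions]

for decision, label in routed:
    print(f"{label}: {decision}")
```

Where to set the threshold is a business decision as much as a technical one, which is one more reason to settle it through experimentation.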

The bad news: humans aren't very good at dealing with probabilities. But the good news: your customers are much better at dealing with probabilities than your business stakeholders.

Everyone knows that there is a chance that the article they checked on Wikipedia yesterday could have been vandalized two seconds earlier. They know it, and they deal with it, but your business stakeholders may be living in an ideal world that doesn't exist. That's why as a machine learning engineer, when dealing with the long-tail, the toughest part of your job is not to make your predictions useful, but to convince the different stakeholders at your business to use these predictions in the first place. And thus, user experimentation is your number one friend.

Connecting Supply and Demand

With the increase in supply and the high variation in its quality, customers need help to find what they want.

Of course, everyone wants the best in the world, but there are two bests: It's either what everyone sees as the best, or what is best for you. The former can be found in the short-head, while the latter is in the long-tail. Lists like "the most sold items" will give you the former, but you need a good recommendation engine to get to the latter.

Once more, we meet the same paradox: ML algorithms and recommendation systems are needed at the long-tail, but the high variance there makes it hard for the algorithms to perform well. This is why the common wisdom among practitioners states that recommender systems usually have a hard time beating a simple system that just recommends the most popular items.
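For reference, the "most popular items" baseline is only a few lines long. This is a minimal sketch, assuming interactions arrive as (user, item) pairs. The function name `recommend` and the sample data are made up for illustration.

```python
from collections import Counter


def recommend(user, interactions, k=3):
    """Recommend the most popular items the user has not consumed yet."""
    seen = {item for u, item in interactions if u == user}
    counts = Counter(item for _user, item in interactions)
    ranked = [item for item, _count in counts.most_common()]
    return [item for item in ranked if item not in seen][:k]


# Hypothetical interaction log: item "a" is the hit, "c" sits in the tail.
interactions = [
    ("ann", "a"), ("bob", "a"), ("cat", "a"),
    ("ann", "b"), ("bob", "b"),
    ("cat", "c"),
]

print(recommend("dan", interactions, k=2))  # a new user gets the global hits
print(recommend("cat", interactions, k=2))  # items cat already consumed are skipped
```

Apart from filtering out what a user has already seen, everyone gets the same list. The baseline never reaches for the tail on its own.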

If recommendation systems are hard, and recommending the most popular items just works, why bother building and improving a sophisticated recommendation system then?

This is a valid question that every business should ask itself. Recommendation engines aren't just a feature to add to a product; they have to be thought of strategically. There are two strategic reasons to build a recommendation engine when dealing with the long-tail:

Commoditizing suppliers

What differentiates aggregators from old-fashioned marketplaces is that they are everything stores, whose brands are bigger than the brands of their suppliers. You go to Amazon because of Amazon's brand, not because of the name of this or that seller there. In technical terms, aggregators commoditize their suppliers, and a recommendation engine is part of this commoditization endeavor.

In Spotify's world, we do not listen to artists but lists. We cannot pick a certain driver at Uber or a specific homeowner at Airbnb. You click on the first result on Google, because it is the first result, and not because it comes from a specific website. You can read more about Ben Thompson's Aggregation Theory to see how aggregators commoditize their suppliers.

On the other hand, abandoning personalized recommendations and falling back to suggesting the most popular items means that you are helping your suppliers become bigger than you. Unless you own their distribution channels, their customers will reach them some other way and bypass you.

Infinite supply for a long-lasting attention

Facebook is free. YouTube is mostly free. They are monetized by ads. They need you to scroll more and watch more, because the more you consume, the more revenue they make.

The only way to keep you consuming is to offer you an infinite supply that matches your taste. They need to explore the long-tail to satisfy their infinite supply needs, and they need a recommendation algorithm to find stuff there that matches each user's taste.

Subscription services also want to increase the stickiness of their product by pushing for infinite supply.

Summary

In summary, a new kind of distribution channel exists, known as aggregators.

When everyone is a producer, quality varies. Even if it's hard to measure quality, at least the financial return from each item varies. This has the following impact on the need to automate:

Finally, though businesses that live in the short-head need automation to optimize operations that scale with their customer-base and total inventory size, they can get away without automation for anything that scales with the number of unique inventory items.


Tarek Amr, August 18, 2021
