Friday, January 01, 2021

Does Big Data Drive Netflix Content?

One thing that contributes to the success of Netflix is its recommendation engine, originally based on an algorithm called CineMatch. I discussed this in my earlier post Rhyme or Reason (June 2017).

But that's not the only way Netflix uses data. According to several pundits (Bikker, Dans, Delger, FrameYourTV, Selerity), Netflix also uses big data to create content. However, it's not always clear to what extent these assertions are based on inside information rather than just intelligent speculation.

According to Enrique Dans:
The latest Netflix series is not being made because a producer had a divine inspiration or a moment of lucidity, but because a data model says it will work.
Craig Delger's example looks pretty tame - analysing the overlap between the audiences for existing content to position new content.

The data collected by Netflix indicated there was a strong interest for a remake of the BBC miniseries House of Cards. These viewers also enjoyed movies by Kevin Spacey, and those directed by David Fincher. Netflix determined that the overlap of these three areas would make House of Cards a successful entry into original programming.
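The overlap Delger describes amounts to a simple set intersection. Here is a minimal sketch in Python, assuming three lists of viewers are already available. The viewer IDs, the variable names and the audience_overlap helper are all invented for illustration, and nothing here reflects how Netflix actually performs this analysis.

```python
# Hypothetical viewer IDs; none of this is real Netflix data.
liked_original_series = {"u1", "u2", "u3", "u5", "u8"}
liked_spacey_films = {"u2", "u3", "u4", "u8", "u9"}
liked_fincher_films = {"u1", "u2", "u6", "u8", "u9"}


def audience_overlap(*audiences):
    """Return the viewers who appear in every one of the given audience sets."""
    if not audiences:
        return set()
    return set.intersection(*audiences)


core_audience = audience_overlap(
    liked_original_series, liked_spacey_films, liked_fincher_films
)
print(sorted(core_audience))  # ['u2', 'u8']
```

The size of that core audience relative to the whole subscriber base is the sort of number that could help size an investment, but it says nothing about what the content itself should be.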

This is the kind of thing risk-averse producers have always done, and although data analytics might enable Netflix to do this a bit more efficiently, it doesn’t seem to represent a massive technological innovation. Thomas Davenport and Jeanne Harris discuss some more advanced use of data in the second edition of their book Competing on Analytics.

Netflix ... has used analytics to predict whether a TV show will be a hit with audiences. ... It has used attribute analysis ... to predict whether customers would like a series, and has identified as many as seventy thousand attributes of movies and TV shows, some of which it drew on for the decision whether to create it.

One of the advantages of a content delivery platform is that you can track the consumption of your content. Amazon used the Kindle to monitor how many chapters people actually read, at what times of day, and where and when they got bored. Games platforms (Nintendo, PlayStation, Xbox) can track how far people get with the games, where they get stuck, and where they might need some TLC or DLC. Similarly, Netflix knows where you pause or give up, and which scenes you rewind to watch again. Netflix can also experiment with alternative trailers for the same content.
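To make the idea of tracking consumption concrete, here is a rough sketch in Python of how a platform might find the point where viewers most often give up. The playback log, the 50-minute episode length and both function names are assumptions made up for this example, not a description of any real system.

```python
from collections import Counter

# Hypothetical playback log: (viewer, minute at which they stopped watching).
# A 50-minute episode is assumed; stopping at minute 50 counts as finishing.
EPISODE_MINUTES = 50
stop_events = [
    ("u1", 50), ("u2", 12), ("u3", 12),
    ("u4", 50), ("u5", 31), ("u6", 12),
]


def completion_rate(events, length):
    """Fraction of viewers who watched to the end (0.0 if there are no events)."""
    if not events:
        return 0.0
    finished = sum(1 for _, minute in events if minute >= length)
    return finished / len(events)


def top_drop_off(events, length):
    """Most common minute at which viewers gave up early, or None if nobody did."""
    early = Counter(minute for _, minute in events if minute < length)
    if not early:
        return None
    return early.most_common(1)[0][0]


print(completion_rate(stop_events, EPISODE_MINUTES))
print(top_drop_off(stop_events, EPISODE_MINUTES))  # 12
```

In this made-up log, half the viewers abandon the episode at minute 12, which is the kind of signal a producer could in principle act on.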

In theory, this kind of information can then be used not just by Netflix to decide where to invest, but also by content producers to produce more engaging content. But it's difficult to get clear evidence how much influence this actually has on content creation.

How much other (big) data does Netflix actually collect about its consumers? Craig Delger assumes they operate much like most other data-hungry companies.

Netflix user account data provides verified personal information (sex, age, location), as well as preferences (viewing history, bookmarks, Facebook likes).

However, in a 2019 interview (reported by @dadehayes), Ted Sarandos denied this.

We don’t collect your data. I don’t know how old you are when you join Netflix. I don’t know if you’re black or white. We know your credit card, but that’s just for payment and all that stuff is anonymized.

Sarandos, who is Chief Content Officer at Netflix, also downplayed the role that data (big or otherwise) played in driving content.

Picking content and working with the creative community is a very human function. The data doesn’t help you on anything in that process. It does help you size the investment. … Sometimes we’re wrong on both ends of that, even with this great data. I really think it’s 70, 80% art and 20, 30% science.

But perhaps that's what you'd expect him to say, given that Netflix has always tried to attract content producers with the promise of complete creative freedom. Amazon Studios has made similar claims. See report by Roberto Baldwin.

While there may be conflicting narratives about the difference data makes to content creation, there are some observations that seem relevant if inconclusive.

Firstly, the long tail argument. The original business model for Amazon and Netflix was based on having a vast catalogue, in which most of the entries were of practically no interest to anyone, because the cost of adding something to the catalogue was trivial. Even if the tail doesn't actually contribute as much revenue as the early proponents of the long tail theory suggested, it helps to mitigate uncertainty and risk - not knowing in advance which are going to be hits.

But this effect is countered by the trend towards vertical integration. Amazon and Netflix have moved from distributing content to producing their own, while Disney has moved into streaming. This supports (but doesn't prove) the hypothesis that there may be some data synergies as well as commercial synergies.

And finally, there is an apparent preference for conventional, non-disruptive content, as noted by Alex Shephard, which is pretty much what we would expect from a data-driven approach.

Netflix is content to replicate television as we know it—and the results are deliberately less than spectacular.

Update (June 2023)

I have been reading a detailed analysis in Ed Finn's book, What Algorithms Want (2017).

Finn's answer to my question about data-driven content is no, at least not directly. Although Netflix had used data to commission new content as well as recommend existing content (Finn's example was House of Cards), it had apparently left the content itself to the producers, and then used data and algorithms to promote it.

After making the initial decision to invest in House of Cards, Netflix was using algorithms to micromanage distribution, not production. (Finn, p. 99)

Obviously that doesn't say anything about what Netflix has been doing more recently, but Finn seems to have been looking at the same examples as the other pundits I referenced above.


Roberto Baldwin, With House of Cards, Netflix Bets on Creative Freedom (Wired, 1 February 2013)

Yannick Bikker, How Netflix Uses Big Data to Build Mountains of Money (7 July 2020)

Enrique Dans, How Analytics Has Given Netflix The Edge Over Hollywood (Forbes, 27 May 2018), Netflix: Big Data And Playing A Long Game Is Proving A Winning Strategy (Forbes, 15 January 2020)

Thomas Davenport and Jeanne Harris, Competing on Analytics (Second edition 2017) - see extract here https://www.huffpost.com/entry/how-netflix-uses-analytics-to-thrive_b_5a297879e4b053b5525db82b

Ed Finn, What Algorithms Want: Imagination in the Age of Computing (MIT Press, 2017)

FrameYourTV, How Netflix uses Big Data to Drive Success via Inside BigData (20 January 2018) 

Daniel G. Goldstein and Dominique C. Goldstein, Profiting from the Long Tail (Harvard Business Review, June 2006)

Dade Hayes, Netflix’s Ted Sarandos Weighs In On Streaming Wars, Agency Production, Big Tech Breakups, M+A Outlook (Deadline, 22 June 2019)

Alexis C. Madrigal, How Netflix Reverse-Engineered Hollywood (Atlantic, 2 January 2014)

Selerity, How Netflix used big data and analytics to generate billions (5 April 2019)

Alex Shephard, What Netflix’s Obama Deal Says About the Future of Streaming (New Republic, 23 May 2018)

Related posts: Competing on Analytics (May 2010), Rhyme or Reason - the Logic of Netflix (June 2017)
