Have you found Net Promoter Score (NPS) surveys helpful in building products?

Short answer: NPS is a helpful proxy metric for my talks & essays. But at both Netflix & Chegg, I focused on four sources of data: existing data, qualitative, surveys, and A/B tests.

Gib’s note: Welcome to the 400 new members who joined this past week! After four months, we’re 4,700 strong. Each week, I answer a few questions, drawing from my experience as VP of Product at Netflix and Chief Product Officer at Chegg (the textbook rental and homework help company that went public in 2013). Today’s answer is essay #48!

Here are my upcoming public events and special offers:

  • Click here to purchase my self-paced Product Strategy course on Teachable for $200 off the normal $699 price — the coupon code is 200DISCOUNTGIBFRIENDS but the link will handle this automatically. (You can try the first two modules for free.)



I’ll start with an unusual request. If you have read at least two of my “Ask Gib” essays, please click the link below to provide feedback for this product newsletter series. There are only three questions, and your feedback is incredibly helpful to me:

Click here

Did you complete the survey? Please do it before you read this essay, as it provides helpful context. You’ll also be able to look at the real-time results at the end of this essay, and it’s more fun if your feedback is reflected in the results. It only takes one minute!

Ok, if you’ve read at least two essays before, I assume you’ve provided feedback for my “Ask Gib” product newsletter, so let’s begin…

Intro: NPS— it’s just a number.

Last week on Twitter, I asked this question:

The fact that only 31% found NPS helpful doesn’t surprise me. Comments include:

  • “The question is odd — I never recommend products.”

  • “There are typically no follow-on questions, so you get no insight about why customers gave the score they did.”

  • “It’s gameable. Folks do stuff like not send the survey to customers they know are unhappy.”

What surprised me was the emotion behind some of the comments. As I dug deeper, I realized that NPS takes on almost mythical status as a “single source of truth” within some organizations because it presents a single number. It appears to be quantitative data, but it’s not. It’s a numerical representation of qualitative data.

And wacky things happen. At some companies, NPS is the basis for quarterly objectives, impacting employees’ bonuses and compensation. No wonder there’s such an emotional response.

NPS is a qualitative measure that describes what customers say — not how they behave. But NPS surveys have been helpful to me, so I sometimes use them when I don’t have better proxy metrics for product quality.

In this essay, I outline how NPS works, how I used it at both Netflix and Chegg, and then share the challenge of finding meaningful proxies for product quality using data from my “Ask Gib” essays.

How NPS Works

Bain & Company created the Net Promoter Score in 2003 to measure brand loyalty and a product’s potential for word-of-mouth growth. Below I share the first question in an NPS survey, using an example from a recent “Ask Gib” essay:

Respondents answer the question on a zero to ten scale. A Net Promoter Score is calculated by taking the percentage of customers who rave about the service (9s & 10s) and subtracting the percentage of “detractors” — those who give a rating from zero to six.

Here’s the NPS calculation for my recent “Project-based v. Outcome-based Roadmaps” essay:

Twenty-six readers answered the question. Eighty-eight percent were promoters, and none were detractors, so the NPS is 88. Most consider a score of 50 to be very good/great and 70 to be “world-class.” High scores suggest the potential for strong word-of-mouth growth and retention.
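For readers who want to check the arithmetic, here is a minimal Python sketch of the calculation. The split of 23 promoters and 3 passives is inferred from the reported 88% of 26 responses with no detractors. The individual ratings and the function name `net_promoter_score` are hypothetical, chosen only for illustration.

```python
def net_promoter_score(ratings):
    """Return the NPS (from -100 to 100) for a list of 0-10 ratings."""
    if not ratings:
        raise ValueError("need at least one rating")
    promoters = sum(1 for r in ratings if r >= 9)   # 9s and 10s
    detractors = sum(1 for r in ratings if r <= 6)  # 0 through 6
    # Ratings of 7 and 8 count toward the total but toward neither group.
    return 100 * (promoters - detractors) / len(ratings)


# Hypothetical ratings: 23 promoters, 3 ratings of 8, no detractors.
ratings = [10] * 15 + [9] * 8 + [8] * 3
print(round(net_promoter_score(ratings)))  # 88
```

Note that ratings of 7 and 8 still lower the score, because they enlarge the denominator without adding promoters.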

Beyond this single number, here’s a snapshot of what folks liked about my “Roadmaps” essay:

I also ask, “What would make this essay better?” Answers to this question provide clues for new hypotheses to test in the future:

I find this qualitative feedback incredibly helpful.

So why the controversy about NPS? In this case, it reduces twenty-six responses to a single number, giving it the weight of quantitative data, which it’s not. It’s what folks say; it’s not a measure of their behavior. And often, what folks say and what they choose to do are different. Last, this is just twenty-six responses from a population of roughly 5,000 subscribers. It doesn’t represent the readers who stopped reading before the end of the essay, and there are many other potential sources of bias.

For me, NPS is like a compass — it’s directional. NPS doesn’t present a complete story. I constantly remind myself, “It’s just a number.” I’m happy with high scores and bummed about low scores, but with a small number of respondents, the NPS can be very “noisy,” so I don’t lose sleep when I receive a low score. But I look at the comments to see how I can edit the essay to make it better. I also apply the learning to future “Ask Gib” essays.
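To put a rough number on “noisy,” the sketch below estimates a margin of error for an NPS. It treats each response as +1 (promoter), 0 (passive), or -1 (detractor) and applies a normal approximation, which is crude for samples this small and ignores who chooses to respond. The function name and the 95% level (`z=1.96`) are assumptions for illustration, not something I used at Netflix or Chegg.

```python
import math


def nps_margin_of_error(promoters, passives, detractors, z=1.96):
    """Approximate margin of error, in NPS points, at the given z-level."""
    n = promoters + passives + detractors
    if n == 0:
        raise ValueError("need at least one response")
    p = promoters / n
    d = detractors / n
    # Variance of a +1 / 0 / -1 variable: E[X^2] - E[X]^2.
    variance = (p + d) - (p - d) ** 2
    return 100 * z * math.sqrt(variance / n)


# Counts inferred from the "Roadmaps" essay: 23 promoters, 3 passives.
small = nps_margin_of_error(23, 3, 0)
# The same proportions with 100 times as many responses (hypothetical).
large = nps_margin_of_error(2300, 300, 0)
print(f"26 responses: +/- {small:.1f} points")
print(f"2,600 responses: +/- {large:.1f} points")
```

Under these assumptions, twenty-six responses leave a margin of about twelve points either way, while a sample one hundred times larger shrinks it to roughly a tenth of that.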

Netflix & Chegg: Four Sources of Data

Here are the four data sources I use, along with how they contribute to my efforts to engage in the scientific method by forming hypotheses, testing them, observing the results, deciding what to do, then forming new hypotheses to continue learning:

  • I use existing quantitative data to understand past and current customer behavior.

  • I engage in qualitative research (focus groups, one-on-ones, usability, ethnography) to hear how people think and react to the work. I also use these tactics to get the “voice of the customer” in my head and to discover new ideas.

  • I execute surveys to capture who the customer is and how to think about them — by demographics, competitive product usage, entertainment preferences, etc. NPS is one tool in this survey toolkit.

  • I run A/B tests on hypotheses formed via the inputs above to see what works and what doesn’t. For me, A/B tests are the “big dog.” They’re the only method that helps me reliably measure how different hypotheses affect customer behavior. A/B tests also help me measure trade-offs between customer delight and margin.

Unfortunately, A/B tests are not feasible for all products and organizations, so I’m occasionally forced to rely on other data sources. At both Netflix and Chegg, however, most of our consumer insights came from combining the four sources above.

NPS at Netflix & Chegg

At Netflix, we collected NPS data but rarely looked at it. We had a much better proxy metric for product quality: the percentage of members who canceled each month. Monthly retention beats NPS as a proxy because it measures what customers do, not what they say.

But the NPS for Netflix was consistently around 70, which makes sense for a product that has grown from zero to 200 million members in the past twenty years.

At Chegg, we ran an NPS survey in 2010, and the NPS was 60, which is very good for a startup. Our proxy metric for textbook rental was the percentage of students who returned the next semester to rent another textbook. But this required us to wait a full semester to evaluate whether our product was getting better.

We needed a proxy metric that would give us faster insight, so we implemented ongoing NPS surveys. NPS improved as our selection, pricing, and delivery speed got better, and eventually, we drove NPS into the high 70s. Along the way, we shared this data with potential investors, and our rising NPS was one of the reasons they gave us additional funding rounds.

Fun fact: Reed Hastings (co-founder of Netflix) funded his first startup, Pure, based on 50 survey results from engineers in Silicon Valley in 1991.

Another fun fact: Netflix used NPS when the service launched in South America, and the score was very high. So they were surprised when they had poor retention during their first few months. It turns out that Brazilians are very generous “graders” (while folks in Australia and Germany are stingy with their 9s and 10s). This is one of many reasons you have to be careful not to compare NPS from one industry or country to another.

The “Right” Proxy Metric for “Ask Gib” Essays?

If you were writing essays on Substack, what data would you rely on to determine the quality of an essay? Below, I share the data for the last thirteen essays I have written, from most current to mid-February:

“Ask Gib” member data from 2/16/2021 to 4/14/2021

A few notes:

  • I chose not to include the email open rate for each essay because it’s always around 50%, with little variation. And as much as I’d like to A/B test the three possible landing pages for “Ask Gib” (required sign-up, essay-specific landing page, and a full list of essays), neither Substack nor I have the tools to do this.

There are four potential proxy metrics. Here’s how I prioritize them, from best to worst:

  • If I had to choose one metric to determine the “best” essay, I’d likely go with “Shares.” When readers share an essay, they informally attest to its quality and back up that endorsement with an action. (I normalized shares for subscriber base size in the far-right column: shares per 1,000 subscribers, or (shares / subscribers) × 1,000.)

  • NPS provides an easy-to-understand signal to me, and the “What’s good?/What could be better?” verbatims are incredibly helpful. The comments help me to understand what qualities in an essay inspire a share.

  • Likes (“Hearts”) are a straightforward feedback system. Hearts give a sense of the quality of the essay but give no insight into the “why.” There’s also secondary value in hearts because they help readers decide whether an essay is worth reading. (NPS scores are not nearly as well understood as hearts, and many folks find my publicizing my NPS scores annoying.)

  • Sign-ups after one day are also a reasonable proxy. A great essay can inspire folks to sign up for the newsletter. But this proxy is noisy: when I publish after a long gap in my writing, pent-up demand brings in many new sign-ups.
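The normalization behind the shares metric is simple enough to sketch in a few lines of Python. The function name and the example counts are hypothetical; only the formula, shares divided by subscribers times 1,000, comes from the data table.

```python
def shares_per_thousand(shares, subscribers):
    """Normalize a share count by the subscriber base at publication time."""
    if subscribers <= 0:
        raise ValueError("subscribers must be positive")
    return shares * 1000 / subscribers


# Hypothetical counts: the later essay has more raw shares, but the
# earlier one did better relative to the audience it reached.
earlier = shares_per_thousand(shares=36, subscribers=3000)
later = shares_per_thousand(shares=40, subscribers=4000)
print(earlier, later)  # 12.0 10.0
```

Normalizing this way keeps a growing subscriber list from making every new essay look better than the ones before it.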

Another note: all of the data sources, with the possible exception of NPS, are gameable. If I aggressively ask readers for more “hearts,” for instance, I get more of them. In my “ask” at the end of the last essay, I prioritized “hearts” and “shares” to build credibility for the quality of my essays and to strengthen viral loops. It’s also why I got less NPS survey feedback.

So what’s the best essay on the list above? It’s tricky, but note that the “Project v. Outcome-based Roadmaps” essay tops most of the data points I listed, and the qualitative from the NPS survey gives me a strong sense of “Why?” The essay finds a middle ground between the abstract and the “real world,” has strong examples, and provides a framework that readers can apply to their jobs tomorrow.

In writing this essay and looking closely at all of the quantitative and qualitative data, I find shares to be the most helpful proxy metric. The NPS data reinforces my confidence in shares as a proxy: there’s a reasonable correlation between shares and NPS, especially if you toss out the noisiest NPS scores (those with fewer than 20 responses). The other benefit of shares is that they’re not annoying and require little time or effort from you, my “Ask Gib” readers.
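If you want to run the same check on your own newsletter, the sketch below drops essays with fewer than 20 NPS responses and then computes a Pearson correlation between NPS and normalized shares. Every number in it is made up for illustration; none of it is my actual “Ask Gib” data, and the names `essays`, `pearson`, and `MIN_RESPONSES` are hypothetical.

```python
import math

MIN_RESPONSES = 20  # threshold from the essay: ignore noisier NPS scores

# Made-up rows: (essay label, NPS, NPS responses, shares per 1,000 subscribers)
essays = [
    ("essay_1", 80, 26, 12.0),
    ("essay_2", 70, 31, 9.5),
    ("essay_3", 55, 22, 6.0),
    ("essay_4", 100, 4, 3.0),   # too few responses to trust
    ("essay_5", 20, 5, 8.0),    # too few responses to trust
    ("essay_6", 75, 40, 10.0),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


reliable = [e for e in essays if e[2] >= MIN_RESPONSES]
r_all = pearson([e[1] for e in essays], [e[3] for e in essays])
r_reliable = pearson([e[1] for e in reliable], [e[3] for e in reliable])
print(f"all essays: r = {r_all:.2f}")
print(f"essays with {MIN_RESPONSES}+ responses: r = {r_reliable:.2f}")
```

With only a handful of essays, a correlation like this is directional at best, so treat it the same way as NPS itself.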

Is the Net Promoter Survey helpful to me? Absolutely. Through comments from readers, I have learned:

  • The importance of stories and examples to bring ideas to life.

  • The value of tools, models, and frameworks that product leaders can apply to their job the next day.

  • While keeping essays short is helpful — people are busy — there’s no rule about how long or short an essay should be as long as you “trim the fat” from each.

  • The value of humility in describing my successes and failures, as subscribers (and I) learn from both. If you only talk about success, you aren’t believable. If you dwell too much on failure, you lose credibility.

The open-ended feedback also gives me great ideas to form new hypotheses that I test in future essays.

Does stuff go wrong with NPS? Absolutely. I’ve learned to be careful about comparing results from one talk or essay to another. And with a small sample, NPS is really noisy, in both directions. So when I get a high or a low NPS score, I have to gently remind myself, “It’s just a number,” as I dig into the qualitative feedback to form hypotheses for my next experiment.

Conclusion: NPS in Action

Thanks for participating in my “Ask Gib” surveys. While I used NPS as a secondary data source at Netflix, NPS did help us to get funding at Chegg. I’ve found NPS to be a reasonable proxy metric for product quality for my talks and essays — though I still consider multiple data sources.

Oji Udezue, the Chief Product Officer at Parsable, responded to my essay,

“Axiomatically, if you have high NPS, you love NPS.”

This is a fair description of why I have been an NPS fan, but the exploration in this essay helped me to discover a better proxy metric for my “Ask Gib” essays: shares.

This has been a fun essay for me to write, and I learned a lot along the way. Many thanks to Duncan Schouten, a product manager at XWP in Vancouver, who sifted through all of my “Ask Gib” data. He nicely captured the spirit of this exploration:

“It’s always fun to watch the birth of a new proxy metric.”

I hope you have enjoyed the journey, too.

If you’d like to see the results of the NPS survey for the “Ask Gib” product newsletter series you completed at the beginning of this essay, click on the link below:

Click here

I’m sure there will be plenty of qualitative for me to sift through! Again, many thanks for your feedback.

One last thing!

I hope you found this essay helpful. Before you go, I’d love it if you’d complete the Four Ss below:

1) Subscribe! By joining the “Ask Gib” community, you’ll never miss an update:

2) Share this essay with others! By doing this, we’ll collect more questions and upvotes to ensure more relevant essays. And, of course, it’s a nice signal to me that you enjoyed this essay:

Share

3) Survey it! It only takes one minute to complete the Net Promoter Score survey for this NPS essay, and your feedback helps me to make each essay better:

Give Feedback

4) Star this essay! Click the Heart icon near the top or bottom of this essay. (Yes, I know it’s a stretch to say “star” and not “heart,” but I need an “S” word!)

Many thanks,

Gib

Gibson Biddle

PS. Got a question? Ask and upvote questions here:

Ask & Upvote Questions

PPS. Once I answer a question, I archive it. I have responded to 47 questions so far!