LP Login

Think Big. Move Fast.

In many businesses, repeat purchase behavior is a key driver of value. Many companies track % of repeat purchases as a key business metric. This is useful in steady state, but can sometimes be quite misleading if the company is showing substantial growth. By definition, growth implies many first time customers, and the mix of these new customers can distort the view into how much repeat purchase behavior is actually occuring.

I prefer to try to analyze repeat pruchase behavior, and hence, estimate lifetime value, by doing cohort analysis. This is approximate by definition, but it can give you some sense of lifetime value well before you actually see a full customer lifetime, which can help in accelerating decisions about marketing and customer acquisition.  I recently posted about how you can improve LTV and CAC for your subscription or repeat purchase business.  But how do you estimate Lifetime value?

I’ve uploaded a spreadsheet with a  sample cohort analysis, using representative but dummy data to illustrate how to do this.

In this particular example, I look at a hypothetical subscription business. Assume that the business has been in operation for one year. First, divide the users into cohorts depending on when they initially subscribed to the service.  I calculate retention at the end of month N by dividing the number of subscribers still subscribing after month N by the total number of subscribers that started in each cohort.  These are the numbers in blue. Obviously, for the subscribers that started in month 1, we have 12 months of retention data, for the subscribers that started in month 2 we have 11 months of retention data, and so on.

By averaging across the cohorts, you can get an average retention rate at the end of one month, two months and so on. As the cohorts age, there are fewer datapoints to average over, and hence the potential for error is greater. However, it is still a useful exercise to get an early indication of how the business looks.

A typical pattern found in subscription businesses is that after a steep drop off after an initial period, month-on-month attrition rates tend to level off. You can see a similar pattern in this example, where after the first month, month-on-month attrition rates are around -6% (ie month N subs ~ 94% of month [N-1] subs).

If you see a pattern like this, you can extrapolate forward using the same month-on-month attrition across several years. As you can see in the model, we extrapolate an average lifetime of 9.77 months by extrapolating forward over 5 years of data.

So if you were a subscription business charging $20/month with 90% gross margins (after accounting for customer service costs for example), then you would attribute a lifetime value for a new customer of 9.77 x $20 x 90% = $176. This sets an upper bound of what you would be willing to pay to acquire a customer (although in practice, you would prefer to see a ratio of CAC/LTV in the 25-35% range).

This example is for a subscription business where the key value driver is the number of active subscribers. However, you can conduct similar analysis on any type of repeat behavior business. In a social business the metric might be activity (e.g. how many users posted a photo this month), and in a social game the metric might be dollars spent in virtual goods that period. The measurement periods may vary according to the tempo of the business. Many social games do their cohort analysis on a daily or weekly basis,  whereas some ecommerce companies whose purchases are less frequent may do their cohort analysis on a quarterly basis.  This will dictate how long you have to collect data before you have enough data to project forward.

Different billing mechanisms can complicate this (e.g. an annual billing system will by nature skew average lifetime upwards) and while these can be important levers, it is usually helpful to hold billing constant and compare cohorts on a same-billing basis, at least initially. However, this cohort analysis is also useful tool to see what the impact of changes in billing, registration flow, product features etc can have on retention as you can often see an increase in early month retention from later cohorts.

The spreadsheet for the sample cohort analysis is read only but you can download it to play with it yourself.

I’d love to hear from others how they estimate lifetime value.

UPDATE: June 2012 – I have a new post describing how to estimate lifetime value for an ecommerce business using cohort analysis.

  • https://monotask.com Charlie Park

    Jeremy, thanks for this writeup. I have (hopefully) quick technical question. I’ve seen a handful of articles about cohort analysis, but haven’t seen any technical writeups for how to efficiently gather / maintain the data for them. Do you have advice on this?

    There are two ways I can think of:

    One is to have a “created_at” and “deleted_at” date/time for each subscription, and to create the analysis dynamically each time you load it up: Find all users with a “created_at” in Month 1, then create subsets for each subsequent month, removing the subscriptions as you cross their “deleted_at” months.

    The other is to create a brand new database table (data warehouse) and, each day (or week, or month), to run a cron job or other automated process to sweep the database, collect data, and pass it into the database, and to then call this (static) data up when you’re generating your report.

    Is there another way that I’m missing? Do you have any suggestions for how to actually create these reports? I’d love any insight you have on this.

    Thanks.

  • Larry Mai

    Charlie,

    If you’re doing with a business intelligence solution (you mentioned data warehouse), you should be able to run a report on the subscription data to determine for each customer the first payment and last payment. I assume the first payment would be close to the created_at and the last payment would be deleted_at. This should provide the data to generate the monthly chrun.
    You can do the same in SQL. It’ll be more manual with the sorting and grouping, but that can be done in excel.
    I can help out if you’re interested.

  • Dan

    How do you get to the average of 9.89 months per sub? I don’t see this on the spreadsheet. Thanks

  • http://lsvp.wordpress.com jeremyliew

    @ Dan – thanks for the catch – i’ve changed it to 9.77 as per the spreadsheet. I anonymized some real data with some random number generators to create some noise and it runs each time i do a re-calc, so had an old # in when I wrote the post

  • https://monotask.com Charlie Park

    @Larry, thanks for the offer for help. I’ll dig around for a bit and see what I can piece together, but I’ll probably be in touch down the road to show you what I’ve figured out and see if you have suggestions on it. Thanks again.

  • Pingback: Running The Numbers: Cohorts « Disruptive Growth

  • http://www.graduatetutor.com Senith @ MBA tutor

    Jeremy,

    Thank you for sharing this. This is very helpful.

    I run an online tutoring business and think our CLV computation is pretty easy. Most of our customers sick around for anywhere from 1 week to 3 months. And we have been around for 2+ years now. So if we just average the revenue per customer (except current customers) and multiple this by our margins we should be able to get the CLV. Is this correct?

  • Pingback: Weekend Reading

  • http://healthcarefinancetutor.com/ Health Care Finance Tutor

    In the healthcare business, the LTV is more complicated. We have to segment the spend based on the DRG code because revenue per customer is dependent on the DRG code. This varies significantly.

  • Adam

    Hey, I was wondering if you know what the average expected LTV, CAC and ARPU of a social/casual game on facebook is or if you know where I can find such information?

  • http://www.ebay.com taylor wescoatt

    Ryan Carson took the time to walk a Seedcamp session through his own model for estimating (among other things) LTV for subscription model. http://video.seedcamp.com/video/867124/seedcamp-week-2010-metrics-for

  • Pingback: 2011 Consumer Internet Predictions « Lightspeed Venture Partners Blog

  • Pingback: Inclined to Create » Blog Archive » 2011 Consumer Internet Predictions

  • Pingback: A simple business plan for online retail startups | Nett

  • Chris Kelly

    I was working on my cohort analysis when I stumbled across this. It’s a great help. Shouldn’t Column O be “Percent remaining from last month” though?

  • http://lsvp.wordpress.com jeremyliew

    @ Chris- yes, you’re right it should. thanks for pointing out the error

  • http://www.skyglue.com Cindy

    If you are are interested in cohort stats for your site, you can try http://www.skyglue.com

    They offer cohort web analytics.

  • andrei marinescu

    Jeremy, great post. Thanks for sharing the sample cohort analysis.

    Can you share the rationale behind the methodology you use to calculate expected average customer lifetimes at the bottom of the spreadsheet?

    Thanks,

    Andrei

  • http://lsvp.wordpress.com jeremyliew

    Basically, you average out the period on period decay after the first period, and project it out into the future, then sum over the lifetime

  • Paul

    Hello, thanks for the example. However I have not understood how you got the 9.77 months value. (I understood that you added all averages (which are percentages)). Also, how is the expected cumulative count as a ‘lifetime’ value. Thanks!

  • http://www.onlinebingorendezvous.co.uk OBR

    Hi,

    Thanks for the post. Great analysis and details. Would you know any working model for online retailing / group discount sites?

    And while calculating this, what if our cost for acquiring customer sums up more than LTV? Till what period we can afford to have greater CAC than LTV?

    Please suggest

  • http://lsvp.wordpress.com jeremyliew

    The idea of LTV is that it is over a lifetime, so if CAC>LTV then you’ve got a tough business model.

  • http://none Dave

    Jeremy,

    How would you tweak this analysis if your business operated using a sales model that required a minimum yearly subscription?

  • http://lsvp.wordpress.com jeremyliew

    Basically you should see close to 100% retention for year one, followed by likely high churn in month 13, then start to normalize (vs normalizing in month 2)

  • Pepe

    Great post!

    One math question: Why does adding the average retention RATES results on an estimate of a users subscription life in MONTHS? Shouldn’t that be the average cummulative retention rate for yr1, yr2, etc?

    Lastly, how would your LTV formula change if you were running a cohort analysis for Groupon?

    Thanks!

  • http://lsvp.wordpress.com jeremyliew

    To calculate the average subsciption life you’d want to calculate:

    % of people who only lasted one month x1
    + % of people who lasted exactly two months x 2
    + % of people who lasted exactly three months x3
    +…
    + % of people who lasted exactly 60 months x 60

    = (month one retention – month 2 retention) x1
    + (month 2 retention – month 3 retention) x2
    + (month 3 retention – month 4 retention) x3
    +…
    + (month 60 retention – month 59 retention) x 60

    = month one retention + (-1+2)x month 2 retention + (-2+3)x month three retention+ …+ (-50+60)x month 60 retention
    = month one retention + month 2 retention + … + month 60 retention

    You would do exactly the same thing if you were running a cohort analysis for groupon, except that you would calculate not subscibers but # of people pruchasing that month.

  • Pingback: 举例解析重复购买行为中的LTV计算方法 | GamerBoom.com 游戏邦

  • vincent

    Jeremy,

    Thank you for sharing this. This is very helpful.
    I have one question:Why does the “first month retention data” not 100%?

  • vincent

    Does the 98.1% in first line mean that if I send email to 1000 people and 981 of them subs? Then the 54.7% below the 98.1% is the precentage of the 981 people who still subs in the second month? Is it correct?

  • http://lsvp.wordpress.com jeremyliew

    Some churn can happen in the first month

  • http://lsvp.wordpress.com jeremyliew

    98.1% of people who start month one as subscribers are still subscribers at the end of the first month. and 54.7% of those 981 are still subs at end of month two

  • babo

    why does your analysis not factor in present value of money? Calculating the LTV of a customer without considering present value of the future revenues [which you factor in by multiplying by the average life] would certainly lead to an inflated LTV calculation. Particularly with smaller companies when WACC is often in excess of 20%, including vs. excluding PV will often have huge numerical (& thus operational) impacts.

    Also, for non-subscription based e-commerce companies, the churn is often times higher than 70% in the first month alone, depending in large part on the industry and customer loyalty/trademark value. In my opinion, for what it’s worth, I think further individual customer tracking & clustering of the customer base is the best way to truly connect with each level of customer. Often times there’s a core level of customers that react vastly different than non-core customers in terms of a marketing approach, creating an inherent insight on customer retention and maximizing ROI/spend.

  • Lirit Belisha

    Jeremy, great post.
    What about gaming industry?
    What happens if we have people who haven’t played for certain period and then came back to play? by using your model we get few results of more than 100% retention rate.

    Please suggest

  • http://lsvp.wordpress.com jeremyliew

    I’d treat them as anomalies as at scale you will likely stop seeing these

  • Pingback: Dossier metrics part4, pour aller plus loin: sources et ouvrages | Clement vouillon

  • Raksha

    Hi Jeremy,

    Please could you help me understand how the expected cumulative count is used to calculate the ‘lifetime’.

    cumalative sub after 1 year = 5.23. Does that mean 5.23 months, one average customer subscribes for a year?

    Sorry if this question was addressed earlier. But I am still unclear.

    Your inputs will be valuable.

    Thanks,

  • http://lsvp.wordpress.com jeremyliew

    Yes, Raksha, that is right. “lifetime” is the weighted average of all subscribers lifetimes, which would be 5.23 if you only calculated out 12 months.

  • Pingback: Tech. Entrepreneurs – Recommended Reading « Communications Strategy

  • kitschkid

    This article is very helpful! Thank you for sharing. Do you also write other articles about cohort analysis or data analysis in general? What do would you teach a “noob” in terms of the basics of doing data analysis?

  • http://lsvp.wordpress.com jeremyliew

    I haven’t done anything else so far

  • Pingback: How to estimate lifetime value for an ecommerce business; Sample cohort analysis « Lightspeed Venture Partners Blog

  • http://twitter.com/davidfraga David Fraga

    We use http://www.RJMetrics.com to automate this calculation. Makes it a lot easier to keep LTV and cohort analyses up to date with the most recent data.

  • Kostas Anagnostou

    Hello Jeremy,

    very helpful article indeed. I have a question though:

    “In a social business the metric might be activity (e.g. how many users
    posted a photo this month), and in a social game the metric might be
    dollars spent in virtual goods that period.”

    And how exactly would your model work with dollars spent in virtual goods over a month for example? I mean how can you find the Average Player Lifetime only based on this metric?

    Best,
    Kostas

  • http://lsvp.com/blog/ jeremyliew

    You would look at cohorts of players over time and track their spending on digital goods each month and project out LTV using the same approach

  • dave@hotmail.com

    When calculating the drop in the attrition rate month over month, why would you use an average of the preceeding 12 months? Should you not only look at the last 2-3 months, especially if you experience higher churn in months 1-3 and there after it normalizes?