User Retention Charts
1. Read the data
2. Select the time range
We have data for the last 365 days, but we don't usually need to look so far back. Let's restrict our analysis to a specific time range.
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 190335 entries, 0 to 190334
Data columns (total 3 columns):
 #   Column             Non-Null Count   Dtype
---  ------             --------------   -----
 0   user_id            190335 non-null  object
 1   signed_up_at_week  190335 non-null  datetime64[ns, UTC]
 2   week               190335 non-null  float64
dtypes: datetime64[ns, UTC](1), float64(1), object(1)
memory usage: 4.4+ MB
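The notebook's original filtering code is not shown here, so the following is only a minimal sketch of how the time range could be selected. It assumes the column names from the summary above; the helper `select_time_range`, the tiny `events` frame, and the dates are hypothetical and chosen purely for illustration.

```python
import pandas as pd


def select_time_range(df, start, end):
    """Keep rows whose signup week falls in the half-open range [start, end).

    `start` and `end` are date strings; they are interpreted as UTC so they
    can be compared with the timezone-aware `signed_up_at_week` column.
    """
    start_ts = pd.Timestamp(start, tz="UTC")
    end_ts = pd.Timestamp(end, tz="UTC")
    mask = (df["signed_up_at_week"] >= start_ts) & (df["signed_up_at_week"] < end_ts)
    return df.loc[mask].reset_index(drop=True)


# Hypothetical sample with the same three columns as the real data.
events = pd.DataFrame(
    {
        "user_id": ["a", "b", "c"],
        "signed_up_at_week": pd.to_datetime(
            ["2023-01-02", "2023-05-01", "2023-09-04"], utc=True
        ),
        "week": [0.0, 1.0, 2.0],
    }
)

# Keep only cohorts that signed up from May through September.
recent = select_time_range(events, "2023-05-01", "2023-10-01")
```

A half-open range avoids counting a boundary week twice when adjacent ranges are analyzed separately.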
3. Calculate retention
Let's start by plotting a time series of each cohort's retention rate. This is useful for identifying changes over time. For example, we can see that our cohorts in September are performing much better than the ones in May through August.
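The retention calculation itself is not included in this text, so here is one possible sketch of it. It assumes that each row means "this user was active in the given `week` after signing up" and that a cohort is everyone sharing a `signed_up_at_week`; both are assumptions about the data, and `retention_rates` and the sample `activity` frame are hypothetical.

```python
import pandas as pd


def retention_rates(df):
    """Return one row per (cohort, week) with active users and retention rate."""
    # Cohort size: distinct users who share a signup week.
    cohort_size = (
        df.groupby("signed_up_at_week")["user_id"]
        .nunique()
        .rename("cohort_size")
        .reset_index()
    )
    # Active users: distinct users seen in each week for each cohort.
    active = (
        df.groupby(["signed_up_at_week", "week"])["user_id"]
        .nunique()
        .rename("active_users")
        .reset_index()
    )
    out = active.merge(cohort_size, on="signed_up_at_week")
    out["retention_rate"] = out["active_users"] / out["cohort_size"]
    return out


# Hypothetical sample: a May cohort of four users and a September cohort of two.
may = pd.Timestamp("2023-05-01", tz="UTC")
sep = pd.Timestamp("2023-09-04", tz="UTC")
activity = pd.DataFrame(
    {
        "user_id": ["a", "b", "c", "d", "a", "b", "a", "e", "f", "e", "f"],
        "signed_up_at_week": [may] * 7 + [sep] * 4,
        "week": [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 2.0, 0.0, 0.0, 1.0, 1.0],
    }
)

rates = retention_rates(activity)
```

To get a time series per cohort, the result can be filtered to a single `week` value and plotted against `signed_up_at_week`.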
Sometimes it's helpful to look at the number of retained users, not just the retention rate. It's also useful to visualize retention as a matrix of cohorts by week, which makes outliers easier to spot.
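As a rough sketch of the matrix view, the data can be reshaped so that each row is a cohort and each column is a week, holding either user counts or retention rates. This again assumes one row per user per active week; `retention_matrix`, its `normalize` flag, and the sample data are hypothetical names introduced for this example.

```python
import pandas as pd


def retention_matrix(df, normalize=True):
    """Build a cohort-by-week matrix of user counts or retention rates."""
    # Distinct active users per (cohort, week), with weeks spread into columns.
    counts = (
        df.groupby(["signed_up_at_week", "week"])["user_id"]
        .nunique()
        .unstack("week")
    )
    if not normalize:
        return counts
    # Divide each row by its cohort size to turn counts into rates.
    cohort_size = df.groupby("signed_up_at_week")["user_id"].nunique()
    return counts.div(cohort_size, axis=0)


# Hypothetical sample: a May cohort of four users and a September cohort of two.
may = pd.Timestamp("2023-05-01", tz="UTC")
sep = pd.Timestamp("2023-09-04", tz="UTC")
activity = pd.DataFrame(
    {
        "user_id": ["a", "b", "c", "d", "a", "b", "a", "e", "f", "e", "f"],
        "signed_up_at_week": [may] * 7 + [sep] * 4,
        "week": [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 2.0, 0.0, 0.0, 1.0, 1.0],
    }
)

count_matrix = retention_matrix(activity, normalize=False)
rate_matrix = retention_matrix(activity)
```

Cells for weeks a cohort has not reached yet stay empty (NaN) rather than zero, which keeps young cohorts from looking like they lost every user.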