This documentation page assumes that you already have a SeekTable account. Create your free account by signing up.

Cohort Analysis with SeekTable

Cohort analysis provides insights into the behaviour of users (or other kinds of actors) when they are grouped into cohorts by some criteria. This kind of analytics is performed on a dataset that represents a history of events, such as page view statistics, transactions (say, purchases), or a log of user actions. Typically cohort analysis is used for:

To perform cohort analysis you need an input dataset with the following columns:

Cohort analysis example: user retention report

This article describes how to use SeekTable for cohort analysis and how to create a user retention report (like the one in Google Analytics).
Online demo for this kind of report: user retention report sample.

CSV data source

Let's assume that our event history is stored in the user_events.csv file. After the file is uploaded, SeekTable automatically suggests "year" and "month" dimensions for date columns:

To perform cohort analysis we need to add one custom dimension that calculates the number of months (this may also be days or weeks) between the event date and the registration date (Cube → Edit Configuration). A dimension of type "Expression" can be used for this purpose:

  • Type: Expression
  • Name: month_from_reg
  • Label: Months from Reg Date
  • Parameters:
    1. ( 12*Dimension["Event Date (Year)"]+Dimension["Event Date (Month)"] ) - ( 12*Dimension["Reg Date (Year)"]+Dimension["Reg Date (Month)"] )
    2. Reg Date (Year)
    3. Reg Date (Month)
    4. Event Date (Year)
    5. Event Date (Month)

For a dimension of type "Expression", the first parameter is the formula expression, and the following parameters are the names of the dimensions used as arguments of the formula. See also the reference on calculated fields for more details on this topic.
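The formula turns each date into an absolute month index (12 × year + month) and subtracts the registration index from the event index. The same arithmetic can be sketched in Python outside SeekTable; the function name months_from_reg is only an illustrative choice that mirrors the dimension name, not part of any SeekTable API:

```python
def months_from_reg(event_year, event_month, reg_year, reg_month):
    """Calendar months between the registration date and the event date.

    Mirrors the Expression formula: each date becomes an absolute
    month index (12 * year + month) and the two indexes are subtracted.
    """
    return (12 * event_year + event_month) - (12 * reg_year + reg_month)


# Registered in November 2023, event in February 2024:
# (12*2024 + 2) - (12*2023 + 11) = 24290 - 24287 = 3
print(months_from_reg(2024, 2, 2023, 11))  # prints 3
```

An event in the registration month itself gives 0, so the first column of the retention report corresponds to "month 0".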

Then let's add a measure that counts the number of unique users in each group:

  • Type: CountUnique
  • Parameters: user_id
  • Name: CountUniqueOfUserID
  • Label: Count Unique Users

Now we can configure the monthly user retention report:

  1. select "Reg Date (Year)" and "Reg Date (Month)" for rows
  2. select "Months from Reg Date" for columns
  3. select "Count Unique Users" for values
  4. click on "Apply", you should see a report like this.
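To make clear what this pivot computes, here is a minimal Python sketch of the same grouping: cohorts (registration year and month) on rows, months from registration on columns, and unique users as values. The sample rows and the retention_table helper are hypothetical and only illustrate the logic; SeekTable performs this aggregation itself.

```python
from collections import defaultdict

# Hypothetical sample rows: (user_id, reg (year, month), event (year, month)).
events = [
    ("u1", (2024, 1), (2024, 1)),
    ("u1", (2024, 1), (2024, 2)),
    ("u1", (2024, 1), (2024, 2)),  # repeated activity in the same month
    ("u2", (2024, 1), (2024, 1)),
    ("u3", (2024, 2), (2024, 2)),
    ("u3", (2024, 2), (2024, 3)),
]


def retention_table(rows):
    """Return {(reg_year, reg_month): {months_from_reg: unique_user_count}}."""
    users = defaultdict(set)
    for user_id, (reg_y, reg_m), (ev_y, ev_m) in rows:
        offset = (12 * ev_y + ev_m) - (12 * reg_y + reg_m)
        # A set keeps each user only once per (cohort, offset) cell.
        users[(reg_y, reg_m), offset].add(user_id)
    table = defaultdict(dict)
    for (cohort, offset), ids in users.items():
        table[cohort][offset] = len(ids)
    return dict(table)


table = retention_table(events)
for cohort in sorted(table):
    print(cohort, table[cohort])
# (2024, 1) {0: 2, 1: 1}
# (2024, 2) {0: 1, 1: 1}
```

Note that u1's two events in February 2024 are counted once, which is the reason for a unique-count measure rather than a plain count.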

You can use the options on the "Format" tab to make your user retention report look like the one in Google Analytics:

Now you have a basic cohort report. You can use other dimensions on columns to change the cohort criteria; you can also change the formula and use another period instead of a month, say, a day or a quarter.

SQL data source

Configure SQL count unique measure

For an SQL database all steps are the same as for CSV. The only difference is that instead of the CountUnique measure type (which is available only for CSV cubes) you need to configure a custom measure based on SQL COUNT(DISTINCT):

  • Type: FirstValue
  • Parameters: COUNT(DISTINCT user_id)
  • Name: CountUniqueOfUserID
  • Label: Count Unique Users
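The following sketch only illustrates how COUNT(DISTINCT user_id) behaves inside a grouped query compared with COUNT(*). It uses Python's built-in sqlite3 module; the table layout and the reg_month / event_month columns are assumptions for the demo, and the query SeekTable actually generates may look different.

```python
import sqlite3

# In-memory database with a hypothetical event table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_events (user_id TEXT, reg_month TEXT, event_month TEXT)"
)
conn.executemany(
    "INSERT INTO user_events VALUES (?, ?, ?)",
    [
        ("u1", "2024-01", "2024-01"),
        ("u1", "2024-01", "2024-01"),  # same user, second event in the month
        ("u2", "2024-01", "2024-01"),
        ("u1", "2024-01", "2024-02"),
    ],
)

# COUNT(*) counts events; COUNT(DISTINCT user_id) counts users per group.
rows = conn.execute(
    """
    SELECT reg_month, event_month,
           COUNT(*) AS events,
           COUNT(DISTINCT user_id) AS unique_users
    FROM user_events
    GROUP BY reg_month, event_month
    ORDER BY reg_month, event_month
    """
).fetchall()
print(rows)
# [('2024-01', '2024-01', 3, 2), ('2024-01', '2024-02', 1, 1)]
```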

Configure SQL-calculated dates difference in months dimension

With an SQL data source it makes sense to calculate the number of months (weeks, days) between the registration date and an event at the SQL level, without using an Expression-type dimension for month_from_reg. For example:

  • Type: Field
  • Name: month_from_reg
  • Label: Months from Reg Date
  • Parameters: an SQL expression that calculates the difference in months
    • SQL Server: DATEDIFF(month, reg_date, event_date)
    • MySQL: TIMESTAMPDIFF(MONTH, reg_date, event_date)
    • PostgreSQL: (DATE_PART('year', event_date::date) - DATE_PART('year', reg_date::date))*12 + DATE_PART('month', event_date::date) - DATE_PART('month', reg_date::date)
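These expressions do not necessarily round the same way. SQL Server's DATEDIFF counts the month boundaries crossed, and the PostgreSQL expression uses year/month arithmetic, so both ignore the day of month. MySQL's TIMESTAMPDIFF with MONTH counts complete months elapsed. The Python sketch below is a simplified emulation of the two behaviours for date-only values, and the function names are illustrative:

```python
from datetime import date


def calendar_month_diff(start: date, end: date) -> int:
    """Month boundaries crossed from start to end (day of month ignored)."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def complete_month_diff(start: date, end: date) -> int:
    """Full months elapsed from start to end; assumes start <= end."""
    months = calendar_month_diff(start, end)
    if end.day < start.day:
        months -= 1  # the last month is not complete yet
    return months


reg = date(2024, 1, 31)
event = date(2024, 2, 1)
print(calendar_month_diff(reg, event))  # prints 1
print(complete_month_diff(reg, event))  # prints 0
```

If the report should match the year/month arithmetic of the CSV example, the calendar-month behaviour is the one to aim for.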