This documentation page assumes that you already have a SeekTable account. Create your free account by signing up.

Setup SQL database for pivot table reports

You can configure a 'live' connection to your SQL database and use it as a data source for operational reports (pivot tables, charts, flat tables) . Data is not imported: SeekTable executes aggregate queries (SELECT .. GROUP BY) on-the-fly to retrieve the necessary data for the report.

There are no any limitations on the dataset size, however your database should be able to execute aggregate queries fast enough: ideally in seconds (up to 2 minutes max). In case of moderate size fact tables - up to millions of rows - most popular DBs can do that. For large tables a dataset size can be limited by applying WHERE conditons on indexed columns (see parameters setup section below), or by usage of pre-aggregated tables / materialized views. For a real time big data analytics (billions of rows, TBs in size) you can consider to use specialized columnar databases (data warehouses):

How SQL connector works

SQL-based cubes are defined with a fact table; for flexibility purposes it is configured as a Select Query template, so this could be a resultset returned by a view/function/complex SQL statement. Cube dimensions are mapped to columns (or SQL-expressions), and measures are SQL aggregate functions.

For summary reports (pivot tables/charts) SeekTable ROLAP engine generates GROUP BY queries in this way:

  1. Let's assume that cube's Select Query is simply
    SELECT * FROM facts
  2. For a pivot table report with column1 and column2 dimensions on Rows/Columns and Sum of column3 measure on Values SeekTable executes this query under the hood:
    SELECT column1, column2, SUM(column3) FROM facts
    GROUP BY column1, column2
  3. If Select Query doesn't start with SELECT * generated query will be:
    SELECT column1, column2, SUM(column3) FROM (<Select Query>) t
    GROUP BY column1, column2
    

A star-schema is fully supported: dimensions may be resolved with conditional JOINs, also JOINs may be inside Select Query and it is possible to include them into the final query depending on the report configuration (only when JOIN is needed).

How to configure a SQL data source

  1. Click on "Connect to Database" item at "Cubes" view, or just open this link (ensure that you're logged in).
  2. Select "SQL-compatible database" in Data Source Type selector: SQL database connection settings
  3. In Cube Name enter short title that describes this data source.
  4. Choose your database in Database Connector and configure its Connection String:
  5. Specify Select Query: this is a SELECT command that loads all possible columns for dimensions or measures. In simplest case this might be something like:
    SELECT * FROM some_table_or_dataview
    You can specify a complex SQL query here (with JOINs, WHERE).
    • use simple form SELECT * FROM if possible (in this case SeekTable will not wrap it with the outer SELECT and * will be replaced with actual columns needed for the report)
    • do not specify GROUP BY unless you really want to work with pre-aggregated dataset
    • do not specify ORDER BY, it will be ignored anyway
    • do not use t table alias, it is reserved for the outer SELECT generated by the engine (when needed)
  6. Keep Infer dimensions and measures by columns checked to determine dimensions and measures automatically by the first N rows. You can modify suggested configuration later.
  7. Click on "Save" button.

If everything is fine you should see a new cube dashboard with the list of available dimensions and measures.

In case of connection error you'll see an orange box with an error message; you may click on "Edit Configuration" and apply necessary changes.

Dimensions setup

SQL cube dimensions setup
Type
Field: dimension name refers to table column or result of SQL expression (can be provided as first "Parameter").
Expression: dimension is defined as calculated field with custom formula that uses another dimensions as arguments (formula and arguments should be specified in "Parameters").
Name
Unique dimension identifier. For Type=Field this is column name specifier (possibly with table alias prefix).
Label
User-friendly dimension title (optional).
Format
Custom format template (syntax is for .NET String.Format, only zero-index placeholder {0} can be used). Examples:
  • {0:yyyy-MM-dd} → format date (or timestamp) as 2017-05-25
  • {0:MMM} → format month number (1-12) as short month name (Jan, Feb etc)
  • {0:MMMM} → format month number (1-12) as full month name (January, February etc)
Parameters
For Type=Field: you can specify custom SQL expression for this dimension, or dimension ID column for "Conditional JOIN rule".
For Type=Expression: you can specify formula expression (1-st parameter) and dimension names for the arguments (2-nd, 3-rd etc parameter).

Measures setup

SQL cube measures setup
Type
Count: the number of aggregated rows.
Sum: the total sum of a numeric column.
Average: the average value of a numeric column.
Min: the minimal value of a column.
Max: the maximum value of a column.
FirstValue: custom SQL aggregate expression; for example: 'COUNT(DISTINCT some_column)'.
Note: sub-totals/totals/grand-total will be empty for this type of measure because SeekTable cannot calculate them on its side. Most DBs can return totals with "GROUP BY CUBE" syntax, see How to enable totals for "FirstValue" measures Expression: measure is defined as a formula calculated by SeekTable.
Parameters
For Type=Count: no parameters needed.
For Type=Sum/Average/Min/Max: column name to aggregate.
For Type=FirstValue: first parameter is a custom SQL aggregate expression.
For Type=Expression: first parameter is an expression, and next parameters are names of measures used as arguments in the expression.
Name
Explicit unique measure identifier. You can leave it blank (for any measure types except "Expression") to generate the name automatically.
Label
User-friendly measure caption (optional).
Format
Custom format template (syntax is for .NET String.Format, only zero-index placeholder {0} can be used). Examples:
  • {0:$#.##} → format number as $10.25 (or empty if no value)
  • {0:0.#|k} → if number>1000 shorten it with "k" suffix
  • {0:0.#|M} → if number>1000000 shorten it with "M" suffix
  • {0:0.#|kMB} → shorten large number with appropriate "k"/"M"/"B" suffix

Report parameters setup

Report parameters are used when you need to declare user-defined variable and use it in the SQL template as you want; typical usage is SQL query filtering with WHERE conditions.

SQL cube parameters setup
Name
Unique (for cube) parameter identifier.
Label
User-friendly parameter caption for UI (optional).
Data Type
String: text-based value.
Int32: 32-bit integer (max value is 2,147,483,647).
Int64: 64-bit integer (max value is 9,223,372,036,854,775,807).
Decimal: Fixed-point number with max 28 significant digits. Decimal point is '.' character.
DateTime: datetime or date value (in this case datetime value has 0:00:00 time). Date value should be specified as string in YYYY-MM-DD format.
Boolean: accepts only 'True' or 'False' value.
Multivalue
If checked parameter can accept several values (as array, in UI user can enter them as comma-separated string). Multivalue parameter can be used only with SQL IN condition.
Default Value
Defines default value of this parameter. Empty means 'not defined'.
Expression
Custom expression to evaluate final parameter value. Expression syntax is the same as in calculated cube members; you can access user-entered values with Parameter["param_name"].

When parameter is defined it can be used in Select Query as following:

SELECT * FROM orders o WHERE 1=1 @orderDate[ and o.orderDate>={0} ]

Parameter placeholder syntax notes:

@
identifies that this is a placeholder for the parameter
orderDate
parameter Name
[ ]
expression between square brackets is added to SQL command when parameter is defined. Optionally you may define a part that is added when parameter is not defined in this way:
@paramName[ expression_when_defined ; expression_when_NOT_defined ]
{0}
placeholder for the parameter value. Value is inserted into the SQL as a command parameter → SQL injections are impossible. This also means that parameter value cannot contain SQL expression or used to provide a part of SQL command.

Notes:

Efficient lookups resolution

Data in SQL databases often is organized with star schema: when main facts table contains only numbers and foreign keys, and actual values are stored separately in dimension tables. To resolve these lookups these dimension tables should be joined in the cube's Select Query, for example:

SELECT o.*, c.name as country_name FROM order o
LEFT JOIN countries c ON (c.country_id=o.country_id)

This works fine if you have relatively small number of rows in the 'orders' table and everything works very fast. But what if you have millions of rows?.. Every JOIN adds signficant overhead and excessive database load.

Fortunately, solution exists: JOINs may be applied after facts table aggregation, and only when they are needed for the concrete report. This is possible with Conditional JOINs for lookups setup, for example:

Enable totals for "FirstValue" measures

To get totals for measures defined with a custom SQL expression you can check Use "GROUP BY CUBE" option if your DB supports this SQL syntax (MySql doesn't) and columns used as dimensions are not nullable. For "GROUP BY CUBE" query database adds additional rows for subtotals where NULL values are used for grouping columns. If your columns have NULL values you need to use another value instead; to do that you may define a custom SQL for the dimension, for example:

One more limitation is usage of pivot table's Filter: by default it is applied in-memory and since SeekTable engine cannot calculate "FirstValue" totals you still can get empty totals when this kind of filtering is used. To avoid that you can:

"FirstValue" measure + "Expression" dimension

When a measure is calculated with a custom SQL, the engine cannot perform roll-ups and this leads to some limitations in "Expression"-type dimensions usage. In particular, dimension's formula should produce a 1:1 projection - in other words, unique set of arguments should lead to an unique expression evaluation result (otherwise report can display zeros/empty cells).