This documentation page assumes that you already have a SeekTable account. You can create free account by signing up.

ElasticSearch pivot table reports

With built-in ElasticSearch connector you can use SeekTable for ad-hoc queries without need to write Elastic Query DSL by hands. All reports (pivot tables/charts/datagrids) are generated in the real-time and you always get actual data. SeekTable is a good addition to Kibana, especially if you need:

  • pivot table reports (there are no pivot tables in Kibana yet)
  • work with Elastic index(es) as a data table (with sort/filter), in terms of columns and rows
  • publish live ElasticSearch reports to web in a simple way

There are no any limitations on the ElasticSearch index size and performance is limited only by the power of your ElasticSearch cluster.


Looking for embedded solution? Try PivotData microservice which can be seamlessly integrated into any web application for pivot tables generation by ElasticSearch with simple web API.

How to configure ElasticSearch data source

  1. Click on "Connect to Database" item at "Cubes" view, or just open this link (ensure that you're logged in).
  2. Select "ElasticSearch" in Data Source Type selector: ElasticSearch connection settings
  3. Fill all required fields:
    Cube Name
    short title that describes this data source
    Connection URL
    Base URL of your ElasticSearch API endpoint, for example:
    https://ReadOnly:nxxr8qt7n26c7jwijhwg1ipnarf402yn@thorin-us-east-1.searchly.com
    Index
    the name of the index to query. In ElasticSearch you can query multiple indexes, so you can specify them as index1,index2 or use wildcards like logs-2018-*.
    Doc Type
    you might need to specify mapping type if you use legacy version of ElasitcSearch (5.x or earlier). For ElasticSearch 6.x no need to specify doc type option.
    Filter Expression
    here you may define filtering conditions for Elastic query with SQL-like syntax, for example:
    timestamp >= "2018-10-20":datetime
    • condition syntax: "csv_column_name":field <condition> "value"[:datatype]
      csv_column_name
      Name of the CSV column to filer. Expression-type dimension name cannot be used here.
      condition
      <, <=, >, >=, =, !=, like, in. 'Like' works like in SQL: prepend/append '%' for 'contains' (you can add '%' automatically with parameter Expression).
      value
      hardcoded constant, or parameter name with 'var' datatype: "param_name":var
      datatype
      string, int32, int64, decimal, double, datetime
    • boolean and/or operators: C1 and C2, C1 or C2, C1 or (C2 and C3)
  4. Infer dimensions and measures by columns option: keep it checked to determine dimensions and measures by first N documents - in this case you don't need to define Dimensions and Measures by yourself. You can edit configuration later and remove excessive elements, or customize automatically determined ones.
  5. Click on "Save" button.

If everything is fine you should see a new cube dashboard with the list of available dimensions/measures. In case of connection error you'll see an orange box with the error message; don't forget to ensure that ElasticSearch API can be accessed by SeekTable server and it is not blocked by firewall.
If you specified "Infer dimensions and measures" option and get a cube with no dimensions most likely you've specified non-existing mapping type in "Doc Type".

Dimensions setup

ElasticSearch dimensions setup
Type
Field: dimension value is a document field or script field (script code should be provided as first "Parameter").
Expression: dimension is defined as calculated field with custom formula that uses another dimensions as arguments (formula and arguments should be specified in "Parameters").
Name
Unique dimension identifier. For Type=Field this is document or sub-document field specifier.
Label
User-friendly dimension title (optional).
Format
Custom format string (.NET String.Format) for dimension values (optional). Examples:
  • for number values: ${0:0.##} → $10.25
  • for date values: {0:yyyy-MM-dd} → 2017-05-25
Parameters
For Type=Field: you can specify custom script field with "painless" expression syntax. For example:
(doc["registered"].empty ? null : doc["registered"].date.year)
(extracts year value from "registered" date field). Also you can specify "number" for 2-nd parameter if script result is a number (this affects sorting in flat table reports).
For Type=Expression: you can specify formula expression (1-st parameter) and dimension names for the arguments (2-nd, 3-rd etc parameter).

Measures setup

ElasticSearch measures setup
Type
Count: the number of aggregated documents.
Sum: the total sum of a numeric field.
Average: the average value of a numeric field.
Min: the minimal value of a column.
Max: the maximum value of a column.
FirstValue: custom acummulator aggregation pipeline expression.
Expression: measure defined as calculated field.
Name
Explicit unique measure identifier. You can leave it blank (for any measure types except "Expression") to generate the name automatically.
Label
User-friendly measure caption (optional).
Format
Custom format string (.NET String.Format) for measure values (optional). Example:
  • ${0:0.##} → $10.25
Parameters
For Type=Count: no parameters needed.
For Type=Sum/Average/Min/Max: document field or field path to aggregate.
For Type=Expression: first parameter is formula expression, and next parameters are names of measures used as arguments in the expression.

Report parameters setup

Report parameter is used when you need to specify some filtering condition by user-defined variable and use it in the ElasticSearch query. Typical usage of report parameters:

Name
Unique (for this cube) parameter identifier.
Label
User-friendly parameter caption for UI (optional).
Data Type
String: text-based value.
Int32: 32-bit integer (max value is 2,147,483,647).
Int64: 64-bit integer (max value is 9,223,372,036,854,775,807).
Decimal: Fixed-point number with max 28 significant digits. Decimal point is '.' character.
DateTime: date or datetime value (in case of date this will be datetime value with 0:00:00 time). Date value should be specified as string in YYYY-MM-DD format.
Boolean: accepts only 'True' or 'False' value.
Multivalue
If checked parameter can accept several values (as array, in UI user can enter them as comma-separated string). Multivalue parameter can be used only with IN condition.
Default Value
Defines default value of this parameter. Empty means 'not defined'.
Expression
Custom expression to evaluate final parameter value. Expression syntax is the same as in calculated cube members; you can access user-entered values with Parameter["param_name"].

When parameter is defined it can be used in Filter Expression as following:

"1"="1" @paramName[ and name.keyword="{0}":var ]

Parameter syntax notes:

@
identifies that this is a placeholder for the parameter
paramName
parameter's Name
[ ]
expression between square brackets is added to Filter Expression when parameter is defined.
"{0}":var
placeholder for the parameter value.

In this sample parameter name is paramName and ElasticSearch document field to filter is name.keyword.

Troubleshooting

ERROR: Fielddata is disabled on text fields by default. Set fielddata=true on [some_text_field] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead.

This error appears when you try to aggregate by text field and your ElasticSearch index doesn't have original values for this field. In most cases you can use .keyword suffix and enable unindexed values for aggregation as described in official ElasticSearch documentation.

ERROR: runtime error(script_exception: some_painless_script_expression)

This error happens when ElasticSearch cannot evaluate calculated field defined with painless expression. Typical case is when date values are stored as strings in format that is not recognized as 'Date' datatype by ElasticSearch and as result expression that takes part of the date (like doc["timespamp_field"].date.dayOfMonth) fails. To fix this you need to define a mapping for this field that declares Date datatype and formats to match (if needed).

ERROR: failed to create query (caused_by: { "type": "number_format_exception", "reason": "For input string: \"some_keyword\"" }): ElasticSearch query

You may get an error like this ('caused_by' may be a bit different) with flat table report type and when you specify a keyword to filter without field name hint: in this case SeekTable produces OR condition for all fields selected as columns, and one of them is not comparable with the specified keyword value.
Workaround: specify a hint for the keyword in the filter, for example: name:John.

ERROR: Trying to create too many buckets. Must be less than or equal to: [10000] but was [10001]

This may happen when you selected high-cardinality field in the pivot table report. To fix this you can use report parameter(s) to apply some filtering condition in ElasticSearch query. If you need to get more than 10k unique values in the pivot table you'll need to change [search.max_buckets] option in your cluster level setting.