This is the multi-page printable view of this section. Click here to print.

Return to the regular view of this page.

Searching Data

Searching the data held in Stroom using Dashboards, Queries, Views and Analytic Rules.

Data in stroom (and in external Elastic indexes) can be searched using a number of ways:

  • Dashboard Combines multiple query expressions, result tables and visualisations in one configurable layout.

  • Query Executes a single search query written in StroomQL and displays the results as a table or visualisation.

  • Analytic Rule Executes a StroomQL search query either against data as it is ingested into Stroom or on a scheduled basis.

1 - Data Sources

Stroom has multiple different types of data sources that can be queried by Stroom using Dashboards, Queries and Analytic Rules.

1.1 - Lucene Index Data Source

Stroom’s own Lucene based index for indexing and searching its stream data.

Stroom’s primary data source is its internal Lucene based search indexes. For details of how data is indexed see Lucene Indexes.

1.2 - Statistics

Using Stroom’s statistic stores as a data source.

1.3 - Elasticsearch

Using Elasticsearch as a data source.

Stroom can integrate with external Elasticsearch indexes to allow querying using Stroom’s various mechanisms for querying data sources. These indexes may have been populated using a Stroom pipeline (See here).

Searching using a Stroom dashboard

Searching an Elasticsearch index (or data stream) using a Stroom dashboard is conceptually similar to the process described in Dashboards.

Before you set the dashboard’s data source, you must first create an Elastic Index document to tell Stroom which index (or indices) you wish to query.

Create an Elastic Index document

  1. Right-click a folder in the Stroom Explorer pane ( ).
  2. Select:
    New
    Elastic Index
  3. Enter a name for the index document and click OK .
  4. Click next to the Cluster configuration field label.
  5. In the dialog that appears, select the Elastic Cluster document where the index exists, and click OK .
  6. Enter the name of an index or data stream in Index name or pattern. Data view (formerly known as index pattern) syntax is supported, which enables you to query multiple indices or data streams at once. For example: stroom-events-v1.
  7. (Optional) Set Search slices, which is the number of parallel workers that will query the index. For very large indices, increasing this value up to and including the number of shards can increase scroll performance, which will allow you to download results faster.
  8. (Optional) Set Search scroll size, which specifies the number of documents to return in each search response. Greater values generally increase efficiency. By default, Elasticsearch limits this number to 10,000.
  9. Click Test Connection. A dialog will appear with the result, which will state Connection Success if the connection was successful and the index pattern matched one or more indices.
  10. Click .

Set the Elastic Index document as the dashboard data source

  1. Open or create a dashboard.
  2. Click in the Query panel.
  3. Click next to the Data Source field label.
  4. Select the Elastic Index document you created and click OK .
  5. Configure the query expression as explained in Dashboards. Note the tips for particular Elasticsearch field mapping data types.
  6. Configure the table.

Query expression tips

Certain Elasticsearch field mapping types support special syntax when used in a Stroom dashboard query expression.

To identify the field mapping type for a particular field:

  1. Click in the Query panel to add a new expression item.
  2. Select the Elasticsearch field name in the drop-down list.
  3. Note the blue data type indicator to the far right of the row. Common examples are: keyword, text and number.

After you identify the field mapping type, move the mouse cursor over the mapping type indicator. A tooltip appears, explaining various types of queries you can perform against that particular field’s type.

Searching multiple indices

Using data view (index pattern) syntax, you can create powerful dashboards that query multiple indices at a time. An example of this is where you have multiple indices covering different types of email systems. Let’s assume these indices are named: stroom-exchange-v1, stroom-domino-v1 and stroom-mailu-v1.

There is a common set of fields across all three indices: @timestamp, Subject, Sender and Recipient. You want to allow search across all indices at once, in effect creating a unified email dashboard.

You can achieve this by creating an Elastic Index document called (for example) Elastic-Email-Combined and setting the property Index name or pattern to: stroom-exchange-v1,stroom-domino-v1,stroom-mailu-v1. Click and re-open the dashboard. You’ll notice that the available fields are a union of the fields across all three indices. You can now search by any of these - in particular, the fields common to all three.

1.4 - Internal Data Sources

A set of data sources for querying the inner workings of Stroom.

Stroom provides a number of built in data sources for querying the inner workings of stroom. These data sources do not have a corresponding Document so do not feature in the explorer tree.

These data sources appear as children of the root folder when selecting a data source in a Dashboard , View . They are also available in the list of data sources when editing a Query .

Analytics

Annotations

Annotations are a means of annotating search results with additional information and for assigning those annotations to users. The Annotations data source allows you to query the annotations that have been created.

Field Type Description
annotaion:Id Long Annotation unique identifier.
annotation:CreatedOn Date Date created.
annotation:CreatedBy String Username of the user that created the annotation.
annotation:UpdatedOn Date Date last updated.
annotation:UpdatedBy String Username of the user that last updated the annotation.
annotation:Title String
annotation:Subject String
annotation:AssignedTo String Username the annotation is assigned to.
annotation:Comment String Any comments on the annotation.
annotation:History String History of changes to the annotation.

Dual

The Dual data source is one with a single field that always returns one row with the same value. This data source can be useful for testing expression functions. It can also be useful when combined with an extraction pipeline that uses the stroom:http-call() XSLT function in order to make a single HTTP call using Dashboard parameter values.

Field Type Description
Dummy String Always one row that has the value X

Index Shards

Exposes the details of the index shards that make up Stroom’s Lucene based index. Each index is split up into one or more partitions and each partition is further divided into one or more shards. Each row represents one index shard.

Field Type Description
Node String The name of the node that the index belongs to.
Index String The name of the index document.
Index Name String The name of the index document.
Volume Path String The file path for the index shard.
Volume Group String The name of the volume group the index is using.
Partition String The name of the partition that the shard is in.
Doc Count Integer The number of documents in the shard.
File Size Long The size of the shard on disk in bytes.
Status String The status of the shard (Closed, Open, Closing, Opening, New, Deleted, Corrupt).
Last Commit Date The time and date of the last commit to the shard.

Meta Store

Exposes details of the streams held in Stroom’s stream (aka meta) store. Each row represents one stream.

Field Type Description
Feed String The name of the feed the stream belongs to.
Pipeline String The name of the pipeline that created the stream. [Optional]
Pipeline Name String The name of the pipeline that created the stream. [Optional]
Status String The status of the stream (Unlocked, Locked, Deleted).
Type String The Stream Type , e.g. Events, Raw Events, etc.
Id Long The unique ID (within this Stroom cluster) for the stream .
Parent Id Long The unique ID (within this Stroom cluster) for the parent stream, e.g. the Raw stream that spawned an Events stream. [Optional]
Processor Id Long The unique ID (within this Stroom cluster) for the processor that produced this stream. [Optional]
Processor Filter Id Long The unique ID (within this Stroom cluster) for the processor filter that produced this stream. [Optional]
Processor Task Id Long The unique ID (within this Stroom cluster) for the processor task that produced this stream. [Optional]
Create Time Date The time the stream was created.
Effective Time Date The time that the data in this stream is effective for. This is only used for reference data stream and is the time that the snapshot of reference data was captured. [Optional]
Status Time Date The time that the status was last changed.
Duration Long The time it took to process the stream in milliseconds. [Optional]
Read Count Long The number of records read in segmented streams. [Optional]
Write Count Long The number of records written in segmented streams. [Optional]
Info Count Long The number of INFO messages.
Warning Count Long The number of WARNING messages.
Error Count Long The number of ERROR messages.
Fatal Error Count Long The number of FATAL_ERROR messages.
File Size Long The compressed size of the stream on disk in bytes.
Raw Size Long The un-compressed size of the stream on disk in bytes.

Processor Tasks

Exposes details of the tasks spawned by the processor filters. Each row represents one processor task.

Field Type Description
Create Time Date The time the task was created.
Create Time Ms Long The time the task was created (milliseconds).
Start Time Date The time the task was executed.
Start Time Ms Long The time the task was executed (milliseconds).
End Time Date The time the task finished.
End Time Ms Long The time the task finished (milliseconds).
Status Time Date The time the status of the task was last updated.
Status Time Ms Long The time the status of the task was last updated (milliseconds).
Meta Id Long The unique ID (unique within this Stroom cluster) of the stream the task was for.
Node String The name of the node that the task was executed on.
Pipeline String The name of the pipeline that spawned the task.
Pipeline Name String The name of the pipeline that spawned the task.
Processor Filter Id Long The ID of the processor filter that spawned the task.
Processor Filter Priority Integer The priority of the processor filter when the task was executed.
Processor Id Long The unique ID (unique within this Stroom cluster) of the pipeline processor that spawned this task.
Feed String
Status String The status of the task (Created, Queued, Processing, Complete, Failed, Deleted).
Task Id Long The unique ID (unique within this Stroom cluster) of this task.

Reference Data Store

Reference data is written to a persistent cache on storage local to the node. This data source exposes the data held in the store on the local node only. Given that most Stroom deployments are clustered and the UI nodes are typically not doing processing, this means the UI node will have no reference data.

Task Manager

This data source exposed the back ground tasks currently running across the Stroom cluster. Each row represents a single background server task.

Requires the Manage Tasks application permission.

Field Type Description
Node String The name of the node that the task is running on.
Name String The name of the task.
User String The user name of the user that the task is running as.
Submit Time Date The time the task was submitted.
Age Duration The time the task has been running for.
Info String The latest information message from the task.

2 - Dashboards

A Dashboard document is a way to combine multiple search queries, tables and visualisations in a configurable layout.

2.1 - Queries

How to query the data in Stroom.

Dashboard queries are created with the query expression builder. The expression builder allows for complex boolean logic to be created across multiple index fields. The way in which different index fields may be queried depends on the type of data that the index field contains.

Date Time Fields

Time fields can be queried for times equal, greater than, greater than or equal, less than, less than or equal or between two times.

Times can be specified in two ways:

  • Absolute times

  • Relative times

Absolute Times

An absolute time is specified in ISO 8601 date time format, e.g. 2016-01-23T12:34:11.844Z

Relative Times

In addition to absolute times it is possible to specify times using expressions. Relative time expressions create a date time that is relative to the execution time of the query. Supported expressons are as follows:

  • now() - The current execution time of the query.
  • second() - The current execution time of the query rounded down to the nearest second.
  • minute() - The current execution time of the query rounded down to the nearest minute.
  • hour() - The current execution time of the query rounded down to the nearest hour.
  • day() - The current execution time of the query rounded down to the nearest day.
  • week() - The current execution time of the query rounded down to the first day of the week (Monday).
  • month() - The current execution time of the query rounded down to the start of the current month.
  • year() - The current execution time of the query rounded down to the start of the current year.

Adding/Subtracting Durations

With relative times it is possible to add or subtract durations so that queries can be constructed to provide for example, the last week of data, the last hour of data etc.

To add/subtract a duration from a query term the duration is simply appended after the relative time, e.g.

now() + 2d

Multiple durations can be combined in the expression, e.g.

now() + 2d - 10h

now() + 2w - 1d10h

Durations consist of a number and duration unit. Supported duration units are:

  • s - Seconds
  • m - Minutes
  • h - Hours
  • d - Days
  • w - Weeks
  • M - Months
  • y - Years

Using these durations a query to get the last weeks data could be as follows:

between now() - 1w and now()

Or midnight a week ago to midnight today:

between day() - 1w and day()

Or if you just wanted data for the week so far:

greater than week()

Or all data for the previous year:

between year() - 1y and year()

Or this year so far:

greater than year()

2.2 - Internal Links

Adding links within Stroom to internal features/items or external URLs.

Within Stroom, links can be created in dashboard tables or dashboard text panes that will direct Stroom to display an item in various ways.

Links are inserted in the form:

[Link Text](URL and parameters){Link Type}

In dashboard tables links can be inserted using the link() function or more specialised functions such as data() or stepping().

In dashboard text panes, links can be inserted into the HTML as link attributes on elements.

  <div style="padding: 5px;">
    <span style="text-decoration:underline;color:blue;cursor:pointer" link="[link](uuid=e177cf16-da6c-4c7d-a19c-09a201f5a2da&amp;params=user%3Duser1&amp;title=Details%20For%20user1){dashboard}">Details For user1</span>
  </div>

The link type can be one of the following:

  • dialog : Display the content of a link URL within a stroom popup dialog.
  • tab : Display the content of a link URL within a stroom tab.
  • browser : Display the content of a link URL within a new browser tab.
  • dashboard : Used to launch a Stroom dashboard internally with parameters in the URL.
  • stepping : Used to launch Stroom stepping internally with parameters in the URL.
  • data : Used to show Stroom data internally with parameters in the URL.
  • annotation : Used to show a Stroom annotation internally with parameters in the URL.

Dialog

Dialog links are used to embed any referenced URL in a Stroom popup Dialog. Dialog links look something like this in HTML:

<div style="padding: 5px;">
    <span style="text-decoration:underline;color:blue;cursor:pointer" 
          link="[Show](https://www.somehost.com/somepath){dialog|Embedded In Stroom}">
        Show In Stroom Dialog
    </span>
</div>
{dialog|Embedded In Stroom}

Tab

Tab links are similar to dialog links are used to embed any referenced URL in a Stroom tab. Tab links look something like this in HTML:

<div style="padding: 5px;">
    <span style="text-decoration:underline;color:blue;cursor:pointer" 
          link="[Show](https://www.somehost.com/somepath){tab|Embedded In Stroom}">
        Show In Stroom Tab
    </span>
</div>
{tab|Embedded In Stroom}

Browser

Browser links are used to open any referenced URL in a new browser tab. In most cases this is easily accomplished via a normal hyperlink but Stroom also provides a mechanism to do this as a link event so that dashboard tables are also able to open new browser tabs. This can be accomplished by using the link() table function. In a dashboard text pane the HTML could look like this:

<div style="padding: 5px;">
    <span style="text-decoration:underline;color:blue;cursor:pointer" 
          link="[Show](https://www.somehost.com/somepath){browser}">
        Show In Browser Tab
    </span>
</div>

Dashboard

In addition to viewing/embedding external URLs, Stroom links can be used to direct Stroom to show an internal item or feature. The dashboard link type allows Stroom to open a new tab and show a dashboard with the specified parameters.

The format for a dashboard link is as follows:

[Link Text](uuid=<UUID>&params=<PARAMS>&title=<CUSTOM_TITLE>){dashboard}

The parameters for dashboard links are:

  • uuid - The UUID of the dashboard to open.
  • params - A URL encoded list of params to supply to the dashboard, e.g. params=user%3Duser1.
  • title - An optional URL encoded title to better identify the specific instance of the dashboard, e.g. title=Details%20For%20user1.

An example of this type of link in HTML:

<div style="padding: 5px;">
    <span style="text-decoration:underline;color:blue;cursor:pointer" 
          link="[link](uuid=e177cf16-da6c-4c7d-a19c-09a201f5a2da&amp;params=user%3Duser1&amp;title=Details%20For%20user1){dashboard}">
        Details For user1
    </span>
</div>

Data

A link can be created to open a sub-set of a source of data (i.e. part of a stream) for viewing. The data can either be opened in a popup dialog (dialog) or in another stroom tab (tab). It can also be display in preview form (with formatting and syntax highlighting) or unaltered source form.

The format for a data link is as follows:

[Link Text](id=<STREAM_ID>&partNo=<PART_NO>&recordNo=<RECORD_NO>&lineFrom=<LINE_FROM>&colFrom=<COL_FROM>&lineTo=<LINE_TO>&colTo=<COL_TO>&viewType=<VIEW_TYPE>&displayType=<DISPLAY_TYPE>){data}

Stroom deals in two main types of stream, segmented and non-segmented (see Streams). Data in a non-segmented (i.e. raw) stream is identified by an id, a partNo and optionally line and column positions to define the sub-set of that stream part to display. Data in a segmented (i.e. cooked) stream is identified by an id, a recordNo and optionally line and column positions to define the sub-set of that record (i.e. event) within that stream.

The parameters for data links are:

  • id - The stream ID.
  • partNo - The part number of the stream (one based). Always 1 for segmented (cooked) streams.
  • recordNo - The record number within a segmented stream (optional). Not applicable for non-segmented streams so use null() instead.
  • lineFrom - The line number of the start of the sub-set of data (optional, one based).
  • colFrom - The column number of the start of the sub-set of data (optional, one based).
  • lineTo - The line number of the end of the sub-set of data (optional, one based).
  • colTo - The column number of the end of the sub-set of data (optional, one based).
  • viewType - The type of view of the data (optional, defaults to preview):
    • preview : Display the data as a formatted preview of a limited portion of the data.
    • source : Display the un-formatted data in its original form with the ability to navigate around all of the data source.
  • displayType - The way of displaying the data (optional, defaults to dialog):
    • dialog : Open as a modal popup dialog.
    • tab : Open as a top level tab within the Stroom browser tab.

In preview mode the line and column positions will limit the data displayed to the specified selection. In source mode the line and column positions define a highlight block of text within the part/record.

An example of this type of link in HTML:

<div style="padding: 5px;">
    <span style="text-decoration:underline;color:blue;cursor:pointer" 
          link="[link](id=1822&amp;partNo=1&amp;recordNo=1){data}">
        Show Source</span>
</div>

View Type

The additional parameter viewType can be used to switch the data view mode from preview (default) to source.

In preview mode the optional parameters lineFrom, colFrom, lineTo, colTo can be used to limit the portion of the data that is displayed.

<div style="padding: 5px;">
    <span style="text-decoration:underline;color:blue;cursor:pointer" 
          link="[link](id=1822&amp;partNo=1&amp;recordNo=1&amp;viewType=preview&amp;lineFrom=1&amp;colFrom=1&amp;lineTo=10&amp;colTo=8){data}">
        Show Source Preview
    </span>
</div>

In source mode the optional parameters lineFrom, colFrom, lineTo, colTo can be used to highlight a portion of the data that is displayed.

<div style="padding: 5px;">
    <span style="text-decoration:underline;color:blue;cursor:pointer" 
          link="[link](id=1822&amp;partNo=1&amp;recordNo=1&amp;viewType=source&amp;lineFrom=1&amp;colFrom=1&amp;lineTo=10&amp;colTo=8){data}">
        Show Source
    </span>
</div>

Display Type

Choose whether to display data in a dialog (default) or a Stroom tab.

Stepping

A stepping link can be used to launch the data stepping feature with the specified data. The format for a stepping link is as follows:

[Link Text](id=<STREAM_ID>&partNo=<PART_NO>&recordNo=<RECORD_NO>){stepping}

The parameters for stepping links are as follows:

  • id - The id of the stream to step.
  • partNo - The sub part no within the stream to step (usually 1).
  • recordNo - The record or event number within the stream to step.

An example of this type of link in HTML:

<div style="padding: 5px;">
    <span style="text-decoration:underline;color:blue;cursor:pointer" 
          link="[link](id=1822&amp;partNo=1&amp;recordNo=1){stepping}">
        Step Source</span>
</div>

Annotation

A link can be used to edit or create annotations. To view or edit an existing annotation the id must be known or one can be found using a stream and event id. If all parameters are specified an annotation will either be created or edited depending on whether it exists or not. The format for an annotation link is as follows:

[Link Text](annotationId=<ANNOTATION_ID>&streamId=<STREAM_ID>&eventId=<EVENT_ID>&title=<TITLE>&subject=<SUBJECT>&status=<STATUS>&assignedTo=<ASSIGNED_TO>&comment=<COMMENT>){annotation}

The parameters for annotation links are as follows:

  • annotationId - The optional existing id of an annotation if one already exists.
  • streamId - An optional stream id to link to a newly created annotation, or used to lookup an existing annotation if no annotation id is provided.
  • eventId - An optional event id to link to a newly created annotation, or used to lookup an existing annotation if no annotation id is provided.
  • title - An optional default title to give the annotation if a new one is created.
  • subject - An optional default subject to give the annotation if a new one is created.
  • status - An optional default status to give the annotation if a new one is created.
  • assignedTo - An optional initial assignedTo value to give the annotation if a new one is created.
  • comment - An optional initial comment to give the annotation if a new one is created.

2.3 - Direct URLs

Navigating directly to a specific Stroom dashboard using a direct URL.

It is possible to navigate directly to a specific Stroom dashboard using a direct URL. This can be useful when you have a dashboard that needs to be viewed by users that would otherwise not be using the Stroom user interface.

URL format

The format for the URL is as follows:

https://<HOST>/stroom/dashboard?type=Dashboard&uuid=<DASHBOARD UUID>[&title=<DASHBOARD TITLE>][&params=<DASHBOARD PARAMETERS>]

Example:

https://localhost/stroom/dashboard?type=Dashboard&uuid=c7c6b03c-5d47-4b8b-b84e-e4dfc6c84a09&title=My%20Dash&params=userId%3DFred%20Bloggs

Host and path

The host and path are typically https://<HOST>/stroom/dashboard where <HOST> is the hostname/IP for Stroom.

type

type is a required parameter and must always be Dashboard since we are opening a dashboard.

uuid

uuid is a required parameter where <DASHBOARD UUID> is the UUID for the dashboard you want a direct URL to, e.g. uuid=c7c6b03c-5d47-4b8b-b84e-e4dfc6c84a09

The UUID for the dashboard that you want to link to can be found by right clicking on the dashboard icon in the explorer tree and selecting Info.

The Info dialog will display something like this and the UUID can be copied from it:

DB ID: 4
UUID: c7c6b03c-5d47-4b8b-b84e-e4dfc6c84a09
Type: Dashboard
Name: Stroom Family App Events Dashboard
Created By: INTERNAL
Created On: 2018-12-10T06:33:03.275Z
Updated By: admin
Updated On: 2018-12-10T07:47:06.841Z

title (Optional)

title is an optional URL parameter where <DASHBOARD TITLE> allows the specification of a specific title for the opened dashboard instead of the default dashboard name.

The inclusion of ${name} in the title allows the default dashboard name to be used and appended with other values, e.g. 'title=${name}%20-%20' + param.name

params (Optional)

params is an optional URL parameter where <DASHBOARD PARAMETERS> includes any parameters that have been defined for the dashboard in any of the expressions, e.g. params=userId%3DFred%20Bloggs

Permissions

In order for as user to view a dashboard they will need the necessary permission on the various entities that make up the dashboard.

For a Lucene index query and associated table the following permissions will be required:

  • Read permission on the Dashboard entity.
  • Use permission on any Indexe entities being queried in the dashboard.
  • Use permission on any Pipeline entities set as search extraction Pipelines in any of the dashboard’s tables.
  • Use permission on any XSLT entities used by the above search extraction Pipeline entites.
  • Use permission on any ancestor pipelines of any of the above search extraction Pipeline entites (if applicable).
  • Use permission on any Feed entities that you want the user to be able to see data for.

For a SQL Statistics query and associated table the following permissions will be required:

  • Read permission on the Dashboard entity.
  • Use permission on the StatisticStore entity being queried.

For a visualisation the following permissions will be required:

  • Read permission on any Visualiation entities used in the dashboard.
  • Read permission on any Script entities used by the above Visualiation entities.
  • Read permission on any Script entities used by the above Script entities.

3 - Query

A Query document defines a search query in text form using the Stroom Query Language and displays the results as a table or a visualisation.

3.1 - Stroom Query Language

Stroom Query Language (StroomQL) is a query language for retrieving data in Stroom.

Query Format

Stroom Query Language (StroomQL) is a text based replacement for the existing Dashboard query builder and allows you to express the same queries in text form as well as providing additional functionality. It is currently used on the Query entity as the means of defining a query.

The following shows the supported syntax for a StroomQL query.

from <DATA_SOURCE>
where <FIELD> <CONDITION> <VALUE> [and|or|not]
[and|or|not]
[window] <TIME_FIELD> by <WINDOW_SIZE> [advance <ADVANCE_WINDOW_SIZE>]
[filter] <FIELD> <CONDITION> <VALUE> [and|or|not]
[and|or|not]
[eval...] <FIELD> = <EXPRESSION>
[having] <FIELD> <CONDITION> <VALUE> [and|or|not]
[group by] <FIELD>
[sort by] <FIELD> [desc|asc] // asc by default
[limit] <MAX_ROWS> 
select <FIELD> [as <COLUMN NAME>], ...
[show as] <VIS_NAME> (<VIS_CONTROL_ID_1> = <COLUMN_1>, <VIS_CONTROL_ID_2> = <COLUMN_2>)

Keywords

From

The first part of a StroomQL expression is the from clause that defines the single data source to query. All queries must include the from clause.

Select the data source to query, e.g.

from my_source

If the name of the data source contains white space then it must be quoted, e.g.

from "my source"

Where

Use where to construct query criteria, e.g.

where feed = "my feed"

Add boolean logic with and, or and not to build complex criteria, e.g.

where feed = "my feed"
or feed = "other feed"

Use brackets to group logical sub expressions, e.g.

where user = "bob"
and (feed = "my feed" or feed = "other feed")

Conditions

Supported conditions are:

  • =
  • !=
  • >
  • >=
  • <
  • <=
  • is null
  • is not null

And|Or|Not

Logical operators to add to where and filter clauses.

Bracket groups

You can force evaluation of items in a specific order using bracketed groups.

and X = 5 OR (name = foo and surname = bar)

Window

window <TIME_FIELD> by <WINDOW_SIZE> [advance <ADVANCE_WINDOW_SIZE>]

Windowing groups data by a specified window size applied to a time field. A window inserts additional rows for future periods so that rows for future periods contain count columns for previous periods.

Specify the field to window by and a duration. Durations are specified in simple terms e.g. 1d, 2w etc.

By default, a window will insert a count into the next period row. This is because by default we advance by the specified window size. If you wish to advance by a different duration you can specify the advance amount which will insert counts into multiple future rows.

Filter

Use filter to filter values that have not been indexed during search retrieval. This is used the same way as the where clause but applies to data after being retrieved from the index, e.g.

filter obscure_field = "some value"

Add boolean logic with and, or and not to build complex criteria as supported by the where clause. Use brackets to group logical sub expressions as supported by the where clause.

Eval

Use eval to assign the value returned from an Expression Function to a named variable, e.g.

eval my_count = count()

Here the result of the count() function is being stored in a variable called my_count. Functions can be nested and applied to variables, e.g.

eval new_name = concat(
  substring(name, 3, 5),
  substring(name, 8, 9))

Note that all fields in the data source selected using from will be available as variables by default.

Multiple eval statements can also be used to breakup complex function expressions and make it easier to comment out individual evaluations, e.g.

eval name_prefix = substring(name, 3, 5)
eval name_suffix = substring(name, 8, 9)
eval new_name = concat(
  name_prefix,
  name_suffix)

Variables can be reused, e.g.

eval name_prefix = substring(name, 3, 5)
eval new_name = substring(name, 8, 9)
eval new_name = concat(
  name_prefix,
  new_name)

In this example, the second assignment of new_name will override the value initially assigned to it. Note that that when reusing a variable name, the assignment can depend on the previous value assigned to that variable.

Add boolean logic with and, or and not to build complex criteria, e.g.

where feed = "my feed" or feed = "other feed"

Use brackets to group logical sub expressions, e.g.

where user = "bob" and (feed = "my feed" or feed = "other feed")

Having

A post aggregate filter that is applied at query time to return only rows that match the having conditions.

having count > 3

Group By

Use to group by columns, e.g.

group by feed

You can group across multiple columns, e.g.

group by feed, name

You can create nested groups, e.g.

group by feed
group by name

Sort By

Use to sort by columns, e.g.

sort by feed

You can sort across multiple columns, e.g.

sort by feed, name

You can change the sort direction, e.g.

sort by feed asc

Or

sort by feed desc

Limit

Limit the number of results, e.g.

limit 10

Select

The select keyword is used to define the fields that will be selected out of the data source (and any eval’d fields) for display in the table output.

select feed, name

You can optionally rename the fields so that they appear in the table with more human friendly names.

select feed as 'my feed column',
  name as 'my name column'

Show

The show keyword is used to tell StroomQL how to show the data resulting from the select. A Stroom visualisation can be specified and then passed column values from the select for the visualisation control properties.

show LineChart(x = EventTime, y = count)
show Doughnut(names = Feed, values = count)

For visualisations that contain spaces in their names it is necessary to use quotes, e.g.

show "My Visualisation" (x = EventTime, y = count)

Comments

Single line

StroomQL supports single line comments using //. For example:

from "index_view" // view
where EventTime > now() - 1227d
// and StreamId = 1210
select StreamId as "Stream Id", EventTime as "Event Time"

Multi line

Multiple lines can be commented by surrounding sections with /* and */. For example:

from "index_view" // view
where EventTime > now() - 1227d
/*
eval FirstName = lowerCase(substringBefore(UserId, '.'))
eval FirstName = any(FirstName)
*/
select StreamId as "Stream Id", EventTime as "Event Time"

Examples

The following are various example queries.

// add a where
from "index_view" // view
where EventTime > now() - 1227d
// and StreamId = 1210
eval UserId = any(upperCase(UserId))
eval FirstName = lowerCase(substringBefore(UserId, '.'))
eval FirstName = any(FirstName)
eval Sl = stringLength(FirstName)
eval count = count()
group by StreamId
sort by Sl desc
select Sl, StreamId as "Stream Id", EventId as "Event Id", EventTime as "Event Time", UserId as "User Id", FirstName, count
limit 10
from "index_view" // view
// add a where
where EventTime > now() - 1227d
// and StreamId = 1210
eval UserId = any(upperCase(UserId))
eval FirstName = lowerCase(substringBefore(UserId, '.'))
eval FirstName = any(FirstName)
eval Sl = stringLength(FirstName)
eval count = count()
group by StreamId
sort by Sl desc
select Sl, StreamId as "Stream Id", EventId as "Event Id", EventTime as "Event Time", UserId as "User Id", FirstName, count
limit 10
from "index_view" // view
// add a where
where EventTime > now() - 1227d
// and StreamId = 1210
eval UserId = any(upperCase(UserId))
eval FirstName = lowerCase(substringBefore(UserId, '.'))
eval FirstName = any(FirstName)
eval Sl = stringLength(FirstName)
// eval count = count()
// group by StreamId
// sort by Sl desc
select StreamId as "Stream Id", EventId as "Event Id"
// limit 10
from "index_view" // view
// add a where
where EventTime > now() - 1227d
// and StreamId = 1210
eval UserId = any(upperCase(UserId))
eval FirstName = lowerCase(substringBefore(UserId, '.'))
eval FirstName = any(FirstName)
eval Sl = stringLength(FirstName)
eval count = count()
group by StreamId
sort by Sl desc
select Sl, StreamId as "Stream Id", EventId as "Event Id", EventTime as "Event Time", UserId as "User Id", FirstName, count
limit 10

4 - Analytic Rules

Analytic Rules are queries that can be run against the data either as it is ingested or on a scheduled basis.

5 - Search Extraction

The process of combining data extracted from events with the data stored in an index.

When indexing data it is possible to store (see Stored Fields all data in the index. This comes with a storage cost as the data is then held in two places; the event; and the index document.

Stroom has the capability of doing Search Extraction at query time. This involves combining the data stored in the index document with data extracted using a search extraction pipeline. Extracting data in this way is slower but reduces the data stored in the index, so it is a trade off between performance and storage space consumed.

Search Extraction relies on the StreamId and EventId being stored in the Index. Stroom can then used these two fields to locate the event in the stream store and process it with the search extraction pipeline.

6 - Dictionaries

Creating

Right click on a folder in the explorer tree that you want to create a dictionary in. Choose ‘New/Dictionary’ from the popup menu:

New
Configuration
Dictionary

Call the dictionary something like ‘My Dictionary’ and click OK .

Now just add any search terms you want to the newly created dictionary and click .

You can add multiple terms.

  • Terms on separate lines act as if they are part of an ‘OR’ expression when used in a search.
    apple
    banana
    orange
    
  • Terms on a single line separated by spaces act as if they are part of an ‘AND’ expression when used in a search.
    apple,banana,orange
    

Using the Dictionary

To perform a search using your dictionary, just choose the newly created dictionary as part of your search expression:

TODO: Fix image