1 - Unreleased

As yet unreleased key features and changes.

2 - Version 7.5

Key new features and changes present in v7.5 of Stroom and Stroom-Proxy.

2.1 - New Features

New features in Stroom version 7.5.

This section contains the significant new features or changes in Stroom. For a full list of changes see Change Log.

User Interface

Jobs Screen

images/releases/07.05/Jobs.png

The Jobs screen
  • Rows are now greyed out if one of the following is true:

    • The parent job is disabled.

    • The job is disabled on a specific node.

    • The node itself is disabled.

  • The column Next Scheduled has been added to the detail pane to show the time that the job will next run.

  • The details pane now has an Actions menu ( ) where it is possible to perform the following actions on a job on a specific node:

    • Edit the schedule for the job on the selected node.

    • Run the job on the selected node now. This will run the job within 10s or so once clicked. Monitor the Last Executed column to see when it has run.

    • Show in Server Tasks (selected node). This will open the Server Tasks screen with the quick filter set to the selected job and node.

    • Show in Server Tasks (all nodes). This will open the Server Tasks screen with the quick filter set to the selected job.

    It is now possible to link directly to the Nodes screen by clicking the Open hover button next to the node name.

Nodes Screen

images/releases/07.05/Nodes.png

The Nodes screen
  • A jobs detail pane has been added to show the jobs on that node. This shows all jobs and allows jobs to be enabled/disabled on a node. It also shows the state of the parent job and the node. The button can be used to show only the enabled jobs on that node.

    An auto-refresh button has also been added to keep refreshing the jobs detail pane so the Last Executed and Next Scheduled columns are updated.

    It is also possible to link directly to the Jobs screen by clicking the Open hover button next to the job name.

  • Node rows are now greyed out if the node is disabled.

  • Job rows are now greyed out if one of the following is true:

    • The node is disabled.
    • The parent job is disabled.
    • The job is disabled on a specific node.
  • The column Build Version has been added to show the version of Stroom that the node is running. This is to highlight any nodes running the wrong version.

  • The column Up Date has been added to the Nodes screen to show the time that Stroom was last booted on that node.

  • The Ping column has been changed so an enabled node with no ping stands out while a disabled node does not.

New Look and Feel

The Login, Change Password and Manage Accounts screens have been changed to bring their look and feel in line with the rest of the application. The screens may look a little different but are functionally the same.

images/releases/07.05/Accounts.png

The Manage Accounts screen

Dictionaries

The Dictionary screen has been changed to make it easier to manage the import of other Dictionaries.

The Import sub-tab has been changed to include a detail pane that shows the effective word list for each imported Dictionary. This will show all words in the imported dictionary along with any from Dictionaries that it imports from.

images/releases/07.05/DictionaryImports.png

An Effective Words sub-tab has been added to show the effective list of words in the Dictionary, i.e. combining all words from the dictionary, its imports and any dictionaries imported by those imports.

images/releases/07.05/DictionaryEffectiveWords.png

Authentication Error Screen

If there is an authentication error during user login, e.g. the account is disabled or locked, the user will now be redirected to a configurable error screen rather than back to the login screen.

images/releases/07.05/AuthError.png

The content of the lower part of the dialog is configurable via the property stroom.ui.authErrorMessage. This property accepts HTML content, allowing the message to contain details of how to contact the appropriate Stroom admin team.
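
As an illustration only (the YAML nesting mirrors the property name and the HTML wording is a made-up example, not a recommended message), the property could be set in stroom.yml along these lines:

appConfig:
  ui:
    # Illustrative HTML only; include whatever contact details your users need.
    authErrorMessage: "<p>There was a problem signing you in. Your account may be locked or disabled.</p><p>Please contact the Stroom admin team for assistance.</p>"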

Queries

Editor Code Completion

The code completion in the Query editor has been changed to make the suggestions context aware. For example, if you have just typed in dictionary and then hit Ctrl ^ + Space ␣, it will suggest the names of Dictionary documents that are visible to the user.

images/releases/07.05/QueryCompletion.png

Dictionaries and Visualisations have also been added to the list of Query Help items (in the left hand pane) and to the available code completions.

Table Download

You can now download the results of a Query using the icon.

images/releases/07.05/QueryDownload.png

Functions in select

Functions, e.g. count(), can now be used within the select clause of a StroomQL query.

Dashboards

Sorting

Now when you change an existing table sort on a dashboard it does not require the query to be executed again. The table data will change to reflect the new sort settings. This is particularly useful on complex queries or those operating on large amounts of data.

Dictionary List Input

When using a List Input pane on a Dashboard that is configured with a Dictionary, the drop-down now shows the source of each Dictionary word.

images/releases/07.05/DictionaryListInput.png

Other UI Changes

  • The explorer tree now shows the name of the item in the hover tooltip. This is useful if the name extends beyond the limit of the explorer tree pane.

  • Issue #4339 : Allow user selection of analytic duplicate columns.

  • Issue #3989 : Improve pause behaviour in dashboards and general presentation of busy state throughout UI.

  • The way dialogs can be moved or resized has been improved so that they can be resized on any edge or corner. The area for clicking and dragging to move a dialog has been increased to include all of the title section.

Permissions

  • Document deletion will now also delete all associated document permissions granted to user/groups. This previously did not happen on document delete so orphaned document permissions would build up in the database.

    The DB migration V07_04_00_005__Orphaned_Doc_Perms will delete all document permissions (in table doc_permission) for docs that are not a folder, not the System doc, not a valid doc (i.e. in the doc table) and not a pipeline filter. Deleted document permission records will first be copied to a backup table doc_permission_backup_V07_04_00_005.

  • Document Copy and Move has been changed to check that the user has Owner permission (or admin) on the document being copied/moved if the permissions mode is None, Destination or Combined. This is because those modes will change the permissions which is something only an Owner/admin can do.

Find in Content

The Find in Content screen added in v7.3 has been changed to add Lucene indexing to speed up content searches. The Lucene index will be created when a user first uses the content search. The user may see an error message on first use telling them to wait for the index to be built.

The index is located in <stroom.path.temp>/doc-index, which unless explicitly configured will likely be /tmp/doc-index.
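
If the default temporary directory is not suitable (e.g. because /tmp is cleared on reboot), the temp path can be pointed elsewhere. A minimal sketch, assuming the usual appConfig nesting for stroom.path.temp; the directory shown is a made-up example:

appConfig:
  path:
    # With this setting the content index would live under /data/stroom-temp/doc-index.
    temp: "/data/stroom-temp"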

images/releases/07.05/FindInContent.png

Volumes

images/releases/07.05/Volumes.png
  • The tables on the Data Volumes and Index Volumes screens have been changed to low-light CLOSED/INACTIVE volumes.

  • Tooltips have been added to the Path and Last Updated columns.

  • The Use% column has been changed to a percentage bar.

  • Red/green colouring has been added to the Full column values to make it clearer which volumes are full.

Dependency Documents

Missing Dependencies

Various screens include document pickers to select a dependency document, e.g. selecting an extraction Pipeline in the Dashboard table settings. The document picker will now show an icon to indicate that the previously selected document is no longer visible to the user or has been deleted.

images/releases/07.05/MissingDoc.png

Tagged Documents

Various screens require the selection of an extraction or reference loader Pipeline, i.e.:

  • View - Extraction pipeline
  • Index - Default extraction pipeline
  • Dashboard - Extraction Pipeline
  • Pipeline - Reference loader pipeline

To distinguish processing pipelines from extraction or reference loading pipelines, the Pipeline documents can be tagged with pre-configured tags such as extraction and reference-loader. This means the Pipeline picker screen can be pre-filtered on the appropriate tag to make finding the right document easier.

It is recommended to tag all such pipelines using these tags to make document selection easier for other users.

images/releases/07.05/ExtractionTag.png

This system-defined tagging is configured using the following properties; a configuration sketch follows the list:

  • stroom.explorer.suggestedTags
  • stroom.ui.query.dashboardPipelineSelectorIncludedTags
  • stroom.ui.query.indexPipelineSelectorIncludedTags
  • stroom.ui.query.viewPipelineSelectorIncludedTags
  • stroom.ui.referencePipelineSelectorIncludedTags
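
The sketch below shows how these properties might look in stroom.yml. The nesting mirrors the property names above, but the value format (shown here as simple strings) and the exact tag values are assumptions for illustration:

appConfig:
  explorer:
    # Tags offered when tagging documents (value format is an assumption).
    suggestedTags: "extraction reference-loader"
  ui:
    query:
      # Pre-filter the respective Pipeline pickers on these tags.
      dashboardPipelineSelectorIncludedTags: "extraction"
      indexPipelineSelectorIncludedTags: "extraction"
      viewPipelineSelectorIncludedTags: "extraction"
    referencePipelineSelectorIncludedTags: "reference-loader"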

API Keys

The API keys screen has changed to allow selection of the hashing algorithm used to store a hash of the API key.

images/releases/07.05/ApiKeyHash.png

Processing

S3 Appender

A new S3 pipeline element S3Appender has been added to enable the streaming of data to an S3 bucket.

The S3 Appender requires the creation of an S3 Config document to provide the credentials and role details for connecting to the S3 bucket. The content of the S3 Config document is JSON and the JSON Schema describing its structure can be found here.

Stepping

The stepper has been changed to allow termination of the step. This is useful when stepping large streams or when using filtered steps.

images/releases/07.05/StepperTerminate.png

The fact that the step is in progress is indicated by a label above the pipeline elements.

images/releases/07.05/StepperStepping.png

Other Changes

  • Issue #4444 : Change the hash() expression function to allow the algorithm and salt arguments to be the result of functions rather than just static values, e.g. hash(${field1}, concat('SHA-', ${algoLen}), ${salt}).

2.2 - Preview Features (experimental)

Preview features in Stroom version 7.5. Preview features are somewhat experimental in nature and are therefore subject to breaking changes in future releases.

State Store

  • Issue #2126 : Add experimental state store.

2.3 - Breaking Changes

Changes in Stroom version 7.5 that may break existing processing or ways of working.

There are no breaking changes in v7.5.

2.4 - Upgrade Notes

Required actions and information relating to upgrading to Stroom version 7.5.

Java Version

Stroom v7.5 requires Java 21. This is the same Java version as Stroom v7.4. Ensure the Stroom and Stroom-Proxy hosts are running the latest patch release of Java v21.

Configuration File Changes

The following changes requiring action have been made to the stroom.yml configuration file.

Removed Properties

stroom.ui.theme.backgroundColour

This property has been removed. You will need to remove it (if present) from your configuration files, otherwise Stroom will not boot.
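
For example, if your stroom.yml contains a block like the sketch below (the nesting mirrors the property name and the value is illustrative), the backgroundColour line must be removed before starting v7.5:

appConfig:
  ui:
    theme:
      backgroundColour: "#1E1E1E"   # remove this line before upgrading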

New Properties

stroom.contentIndex

This property has been added. It enables the indexing of Stroom content for fast content searching.

state.*

This block of properties has been added to control the new state store functionality.

appConfig:
  state:
    scyllaDbDocCache:
      expireAfterAccess: null
      expireAfterWrite: "PT10M"
      maximumSize: 100
      refreshAfterWrite: null
    sessionCache:
      expireAfterAccess: "PT1H"
      expireAfterWrite: null
      maximumSize: 10
      refreshAfterWrite: null
    stateDocCache:
      expireAfterAccess: null
      expireAfterWrite: "PT10M"
      maximumSize: 100
      refreshAfterWrite: null

Changed Property Values

stroom.ui.helpSubPathJobs

The value of this property has changed from /user-guide/jobs/ to /reference-section/jobs/.

stroom.ui.helpUrl

The value of this property has changed from https://gchq.github.io/stroom-docs/7.4/docs to https://gchq.github.io/stroom-docs/7.5/docs.
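
If either of these properties has been overridden in stroom.yml then the override will need updating to the new value. A minimal sketch, assuming the usual appConfig nesting:

appConfig:
  ui:
    helpSubPathJobs: "/reference-section/jobs/"
    helpUrl: "https://gchq.github.io/stroom-docs/7.5/docs"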

Servlets

Stroom presents various servlets. The paths to these servlets have changed, however the existing paths remain available for now.

  • /stroom/noauth/datafeed => /datafeed
  • /stroom/noauth/debug => /debug
  • /stroom/noauth/echo => /echo
  • /stroom/noauth/status => /status
  • /stroom/noauth/swagger-ui => /swagger-ui
  • /stroom/sessionList => /sessionList

The /sessionList (and /stroom/sessionList) servlet has been changed to require manage users permission.

Database Migrations

When Stroom boots for the first time with a new version it will run any required database migrations to bring the database schema up to the correct version.

On boot, Stroom will ensure that the migrations are only run by a single node in the cluster. This will be the node that reaches that point in the boot process first. All other nodes will wait until that is complete before proceeding with the boot process.

It is recommended however to use a single node to execute the migration. To avoid Stroom starting up and beginning processing you can use the migrate command to just migrate the database and not fully boot Stroom. See the migrate command for more details.

Migration Scripts

For information purposes only, the following are the database migrations that will be run when upgrading to 7.5.0 from the previous minor version.

Note, the legacy module will run first (if present) then the other modules will run in no particular order.

Module stroom-app

Script V07_05_00_005__Orphaned_Doc_Perms.java

Path: stroom-app/src/main/java/stroom/app/db/migration/V07_05_00_005__Orphaned_Doc_Perms.java

It is not possible to display the content here. The file can be viewed on GitHub.

Module stroom-docstore

Script V07_05_00_005__Add_index_on_doc.sql

Path: stroom-docstore/stroom-docstore-impl-db/src/main/resources/stroom/docstore/impl/db/migration/V07_05_00_005__Add_index_on_doc.sql

-- ------------------------------------------------------------------------
-- Copyright 2022 Crown Copyright
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
--     http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.
-- ------------------------------------------------------------------------

-- stop note level warnings about objects (not)? existing
SET @old_sql_notes=@@sql_notes, sql_notes=0;

-- --------------------------------------------------

DELIMITER $$

-- --------------------------------------------------

DROP PROCEDURE IF EXISTS docstore_run_sql $$

-- DO NOT change this without reading the header!
CREATE PROCEDURE docstore_run_sql (
    p_sql_stmt varchar(1000)
)
BEGIN

    SET @sqlstmt = p_sql_stmt;

    SELECT CONCAT('Running sql: ', @sqlstmt);

    PREPARE stmt FROM @sqlstmt;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;
END $$

-- --------------------------------------------------

DROP PROCEDURE IF EXISTS docstore_create_non_unique_index$$

-- DO NOT change this without reading the header!
CREATE PROCEDURE docstore_create_non_unique_index (
    p_table_name varchar(64),
    p_index_name varchar(64),
    p_index_columns varchar(64)
)
BEGIN
    DECLARE object_count integer;

    SELECT COUNT(1)
    INTO object_count
    FROM information_schema.statistics
    WHERE table_schema = database()
    AND table_name = p_table_name
    AND index_name = p_index_name;

    IF object_count = 0 THEN
        CALL docstore_run_sql(CONCAT(
            'create index ', p_index_name,
            ' on ', database(), '.', p_table_name,
            ' (', p_index_columns, ')'));
    ELSE
        SELECT CONCAT(
            'Index ',
            p_index_name,
            ' already exists on table ',
            database(),
            '.',
            p_table_name);
    END IF;
END $$

-- --------------------------------------------------

DELIMITER ;

-- --------------------------------------------------

-- Improve lookup by name and to remove need to hit the table when we
-- list all docs by type.
CALL docstore_create_non_unique_index(
    "doc",
    "doc_type_name_uuid_idx",
    "type, name, uuid");

-- --------------------------------------------------

DROP PROCEDURE IF EXISTS docstore_create_non_unique_index;

DROP PROCEDURE IF EXISTS docstore_run_sql;

-- --------------------------------------------------


-- Reset to the original value
SET SQL_NOTES=@OLD_SQL_NOTES;

Module stroom-node

Script V07_05_00_005__add_build_ver_last_boot.sql

Path: stroom-node/stroom-node-impl-db/src/main/resources/stroom/node/impl/db/migration/V07_05_00_005__add_build_ver_last_boot.sql

-- ------------------------------------------------------------------------
-- Copyright 2020 Crown Copyright
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
--     http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.
-- ------------------------------------------------------------------------

-- Stop NOTE level warnings about objects (not)? existing
SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0;

-- --------------------------------------------------

DELIMITER $$

DROP PROCEDURE IF EXISTS node_run_sql_v1 $$

-- DO NOT change this without reading the header!
CREATE PROCEDURE node_run_sql_v1 (
    p_sql_stmt varchar(1000)
)
BEGIN

    SET @sqlstmt = p_sql_stmt;

    SELECT CONCAT('Running sql: ', @sqlstmt);

    PREPARE stmt FROM @sqlstmt;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;
END $$

-- --------------------------------------------------

DROP PROCEDURE IF EXISTS node_add_column_v1$$

CREATE PROCEDURE node_add_column_v1 (
    p_table_name varchar(64),
    p_column_name varchar(64),
    p_column_type_info varchar(64) -- e.g. 'varchar(255) default NULL'
)
BEGIN
    DECLARE object_count integer;

    SELECT COUNT(1)
    INTO object_count
    FROM information_schema.columns
    WHERE table_schema = database()
    AND table_name = p_table_name
    AND column_name = p_column_name;

    IF object_count = 0 THEN
        CALL node_run_sql_v1(CONCAT(
            'alter table ', database(), '.', p_table_name,
            ' add column ', p_column_name, ' ', p_column_type_info));
    ELSE
        SELECT CONCAT(
            'Column ',
            p_column_name,
            ' already exists on table ',
            database(),
            '.',
            p_table_name);
    END IF;
END $$

-- --------------------------------------------------

DELIMITER ;

CALL node_add_column_v1(
    'node',
    'build_version',
    'varchar(255) DEFAULT NULL');

CALL node_add_column_v1(
    'node',
    'last_boot_ms',
    'bigint DEFAULT NULL');

DROP PROCEDURE IF EXISTS node_add_column_v1;

DROP PROCEDURE IF EXISTS node_run_sql_v1;

SET SQL_NOTES=@OLD_SQL_NOTES;

-- vim: set tabstop=4 shiftwidth=4 expandtab:

Module stroom-security

Script V07_05_00_005__api_key_relax_uniqeness.sql

Path: stroom-security/stroom-security-impl-db/src/main/resources/stroom/security/impl/db/migration/V07_05_00_005__api_key_relax_uniqeness.sql

-- ------------------------------------------------------------------------
-- Copyright 2020 Crown Copyright
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
--     http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.
-- ------------------------------------------------------------------------

-- Stop NOTE level warnings about objects (not)? existing
SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0;

-- --------------------------------------------------

DELIMITER $$

-- --------------------------------------------------

DROP PROCEDURE IF EXISTS security_run_sql $$

-- DO NOT change this without reading the header!
CREATE PROCEDURE security_run_sql (
    p_sql_stmt varchar(1000)
)
BEGIN

    SET @sqlstmt = p_sql_stmt;

    SELECT CONCAT('Running sql: ', @sqlstmt);

    PREPARE stmt FROM @sqlstmt;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;
END $$

-- --------------------------------------------------

DROP PROCEDURE IF EXISTS security_create_non_unique_index$$

-- DO NOT change this without reading the header!
CREATE PROCEDURE security_create_non_unique_index (
    p_table_name varchar(64),
    p_index_name varchar(64),
    p_index_columns varchar(64)
)
BEGIN
    DECLARE object_count integer;

    SELECT COUNT(1)
    INTO object_count
    FROM information_schema.statistics
    WHERE table_schema = database()
    AND table_name = p_table_name
    AND index_name = p_index_name;

    IF object_count = 0 THEN
        CALL security_run_sql(CONCAT(
            'create index ', p_index_name,
            ' on ', database(), '.', p_table_name,
            ' (', p_index_columns, ')'));
    ELSE
        SELECT CONCAT(
            'Index ',
            p_index_name,
            ' already exists on table ',
            database(),
            '.',
            p_table_name);
    END IF;
END $$

-- --------------------------------------------------

DROP PROCEDURE IF EXISTS security_drop_index $$

-- e.g. security_drop_index('MY_TABLE', 'MY_IDX');
CREATE PROCEDURE security_drop_index (
    p_table_name varchar(64),
    p_index_name varchar(64)
)
BEGIN
    DECLARE object_count integer;

    SELECT COUNT(1)
    INTO object_count
    FROM information_schema.statistics
    WHERE table_schema = database()
    AND table_name = p_table_name
    AND index_name = p_index_name;

    IF object_count = 0 THEN
        SELECT CONCAT(
            'Index ',
            p_index_name,
            ' does not exist on table ',
            database(),
            '.',
            p_table_name);
    ELSE
        CALL security_run_sql(CONCAT(
            'alter table ', database(), '.', p_table_name,
            ' drop index ', p_index_name));
    END IF;
END $$

-- --------------------------------------------------

DELIMITER ;

-- --------------------------------------------------

-- We need to make this column case sensitive (_as_cs) else we limit the range of keys we can generate
-- as the keys have mixed case.
-- Note the api_key_prefix col contains lower case data so can stay as _ai_ci
ALTER TABLE api_key MODIFY
    api_key_hash VARCHAR(255)
      CHARACTER SET utf8mb4
      COLLATE utf8mb4_0900_as_cs
      NOT NULL;

-- Drop the old unique index so we can re-create it as non-unique
CALL security_drop_index(
    "api_key",
    "api_key_prefix_idx");

-- We have to look up records by prefix. This will usually return 1 row
-- but may return >1. We test the hash of all returned rows.
CALL security_create_non_unique_index(
    "api_key",
    "api_key_prefix_idx",
    "api_key_prefix");

-- --------------------------------------------------

DROP PROCEDURE IF EXISTS security_create_non_unique_index;

DROP PROCEDURE IF EXISTS security_drop_index;

DROP PROCEDURE IF EXISTS security_run_sql;

-- --------------------------------------------------

SET SQL_NOTES=@OLD_SQL_NOTES;

-- vim: set shiftwidth=4 tabstop=4 expandtab:

Script V07_05_00_010__api_key_add_algo_column.sql

Path: stroom-security/stroom-security-impl-db/src/main/resources/stroom/security/impl/db/migration/V07_05_00_010__api_key_add_algo_column.sql

-- ------------------------------------------------------------------------
-- Copyright 2020 Crown Copyright
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
--     http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.
-- ------------------------------------------------------------------------

-- Stop NOTE level warnings about objects (not)? existing
SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0;

-- --------------------------------------------------

DELIMITER $$

-- --------------------------------------------------

DROP PROCEDURE IF EXISTS security_run_sql $$

-- DO NOT change this without reading the header!
CREATE PROCEDURE security_run_sql (
    p_sql_stmt varchar(1000)
)
BEGIN

    SET @sqlstmt = p_sql_stmt;

    SELECT CONCAT('Running sql: ', @sqlstmt);

    PREPARE stmt FROM @sqlstmt;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;
END $$

-- --------------------------------------------------

DROP PROCEDURE IF EXISTS security_add_column$$

CREATE PROCEDURE security_add_column (
    p_table_name varchar(64),
    p_column_name varchar(64),
    p_column_type_info varchar(64) -- e.g. 'varchar(255) default NULL'
)
BEGIN
    DECLARE object_count integer;

    SELECT COUNT(1)
    INTO object_count
    FROM information_schema.columns
    WHERE table_schema = database()
    AND table_name = p_table_name
    AND column_name = p_column_name;

    IF object_count = 0 THEN
        CALL security_run_sql(CONCAT(
            'alter table ', database(), '.', p_table_name,
            ' add column ', p_column_name, ' ', p_column_type_info));
    ELSE
        SELECT CONCAT(
            'Column ',
            p_column_name,
            ' already exists on table ',
            database(),
            '.',
            p_table_name);
    END IF;
END $$


-- --------------------------------------------------

DELIMITER ;

-- --------------------------------------------------

-- idempotent
CALL security_add_column(
    "api_key",
    "hash_algorithm",
    'tinyint NOT NULL default 0');

-- --------------------------------------------------

DROP PROCEDURE IF EXISTS security_add_column;

DROP PROCEDURE IF EXISTS security_run_sql;

-- --------------------------------------------------

SET SQL_NOTES=@OLD_SQL_NOTES;

-- vim: set shiftwidth=4 tabstop=4 expandtab:

2.5 - Change Log

Link to the full CHANGELOG.

The following changes are in 7.5 but not in 7.4:

  • Issue #4501 : Fix Query editor syntax highlighting.

  • Add query help and editor completions for Dictionary Docs for use with in dictionary.

  • Issue #4487 : Fix nasty error when running a stats query with no columns.

  • Issue #4498 : Make the explorer tree Expand/Collapse All buttons respect the current Quick Filter input text.

  • Issue #4518 : Change the Stream Upload dialog to default the stream type to that of the feed.

  • Issue #4470 : On import of Feed or Index docs, replace unknown volume groups with the respective configured default volume group (or null if not configured).

  • Issue #4460 : Change the way we display functions with lots of arguments in query help and code completion popup.

  • Issue #4526 : Change Dictionary to not de-duplicate words as this is breaking JSON when used for holding SSL config in JSON form.

  • Issue #4528 : Make the Reindex Content job respond to stroom shutdown.

  • Issue #4532 : Fix Run Job Now so that it works when the job or jobNode is disabled.

  • Issue #4444 : Change the hash() expression function to allow the algorithm and salt arguments to be the result of functions, e.g. hash(${field1}, concat('SHA-', ${algoLen}), ${salt}).

  • Issue #4534 : Fix NPE in include/exclude filter.

  • Issue #4527 : Change the non-regex search syntax of Find in Content to not use Lucene field based syntax so that : works correctly. Also change the regex search to use Lucene and improve the styling of the screen.

  • Issue #4536 : Fix NPE.

  • Issue #4539 : Improve search query logging.

  • Improve the process of (re-)indexing content. It is now triggered by a user doing a content search. Users will get an error message if the index is still being initialised. The stroom.contentIndex.enabled property has been removed.

  • Issue #4513 : Add primary key to doc_permission_backup_V07_05_00_005 table for MySQL Cluster support.

  • Issue #4514 : Fix HTTP 307 when calling /api/authproxy/v1/noauth/fetchClientCredsToken.

  • Issue #4475 : Change mask() function to period() and add using to apply a function to window.

  • Issue #4341 : Allow download from query table.

  • Issue #4507 : Fix index shard permission issue.

  • Issue #4510 : Fix right click in editor pane.

  • Issue #4511 : Fix StreamId, EventId selection in query tables.

  • Issue #4485 : Improve dialog move/resize behaviour.

  • Issue #4492 : Make Lucene behave like SQL for OR(NOT()) queries.

  • Issue #4494 : Allow functions in StroomQL select, e.g. count().

  • Issue #4202 : Fix default destination not being selected when you do Save As.

  • Issue #4475 : Add mask() function and deprecate countPrevious().

  • Issue #4491 : Fix tab closure when deleting items in the explorer tree.

  • Issue #4502 : Fix inability to step an un-processed stream.

  • Issue #4503 : Make the enabled state of the delete/restore buttons on the stream browser depend on the user’s permissions. Now they will only be enabled if the user has the required permission (i.e. DELETE/UPDATE) on at least one of the selected items.

  • Issue #4486 : Fix the format-date XSLT function for date strings with the day of week in, e.g. stroom:format-date('Wed Aug 14 2024', 'E MMM dd yyyy').

  • Issue #4458 : Fix explorer node tags not being copied. Also fix copy/move not selecting the parent folder of the source as the default destination folder.

  • Issue #4454 : Show the source dictionary name for each word in the Dashboard List Input selection box. Add sorting and de-duplication of words.

  • Issue #4455 : Add Goto Document links to the Imports sub-tab of the Dictionary screen. Also add new Effective Words tab to list all the words in the dictionary that include those from its imports (and their imports).

  • Issue #4468 : Improve handling of key sequences and detection of key events from ACE editor.

  • Issue #4472 : Change the User Preferences dialog to cope with redundant stroom/editor theme names.

  • Issue #4479 : Add ability to assume role for S3.

  • Issue #4202 : Fix problems with Dashboard Extraction Pipeline picker incorrectly changing the selected pipeline.

  • Change the DocRef picker so that it shows a warning icon if the selected DocRef no longer exists or the user doesn’t have permission to view it.

  • Change the Extraction Pipeline picker on the Index Settings screen to pre-filter on tag:extraction. This is configured using the property stroom.ui.query.indexPipelineSelectorIncludedTags.

  • Issue #4146 : Fix audit events for deleting/restoring streams.

  • Change the alert dialog message styling to have a max-height of 600px so long messages get a scrollbar.

  • Issue #4468 : Fix selection box keyboard selection behavior when no quick filter is visible.

  • Issue #4471 : Fix NPE with stepping filter.

  • Issue #4451 : Add S3 pipeline appender.

  • Issue #4401 : Improve content search.

  • Issue #4417 : Show stepping progress and allow termination.

  • Issue #4436 : Change the way API Keys are verified. Stroom now finds all valid api keys matching the api key prefix and compares the hash of the api key against the hash from each of the matching records. Support has also been added for using different hash algorithms.

  • Issue #4448 : Fix query refresh tooltip when not refreshing.

  • Issue #4457 : Fix ctrl+enter shortcut for query start.

  • Issue #4441 : Improve sorted column matching.

  • Issue #4449 : Reload Scheduled Query Analytics between executions.

  • Issue #4420 : Make app title dynamic.

  • Issue #4453 : Dictionaries will ignore imports if a user has no permission to read them.

  • Issue #4404 : Change the Query editor completions to be context aware, e.g. it only lists Datasources after a from.

  • Issue #4450 : Fix editor completion in Query editor so that it doesn’t limit completions to 100. Added the property stroom.ui.maxEditorCompletionEntries to control the maximum number of completions items that are shown. In the event that the property is exceeded, Stroom will pre-filter the completions based on the user’s input.

  • Add Visualisations to the Query help and editor completions. Visualisation completion inserts a snippet containing all the data fields in the Visualisation, e.g. TextValue(field = Field, gridSeries = Grid Series).

  • Issue #4424 : Fix alignment of Current Tasks heading on the Jobs screen.

  • Issue #4422 : Don’t show Edit Schedule in actions menu on Jobs screen for Distributed jobs.

  • Issue #4418 : Fix missing css for /stroom/sessionList.

  • Issue #4435 : Fix for progress spinner getting stuck on.

  • Issue #4426 : Add INFO message when an index shard is created.

  • Issue #4425 : Fix Usage Date heading alignment on Edit Volume Group screen for both data/index volumes.

  • Uplift docker image JDK to eclipse-temurin:21.0.4_7-jdk-alpine.

  • Issue #4416 : Allow dashboard table sorting to be changed post query.

  • Issue #4421 : Change session state XML structure.

  • Issue #4419 : Automatically unpause dashboard result components when a new search begins.

  • Rename migration from V07_04_00_005__Orphaned_Doc_Perms to V07_05_00_005__Orphaned_Doc_Perms.

  • Issue #4383 : Add an authentication error screen to be shown when a user tries to login and there is an authentication problem or the user’s account has been locked/disabled. Previously the user was re-directed to the sign-in screen even if cert auth was enabled. Added the new property stroom.ui.authErrorMessage to allow setting generic HTML content to show the user when an authentication error occurs.

  • Issue #4400 : Fix missing styling on sessionList servlet.

  • Fix broken description pane in the stroomQL code completion.

  • Change API endpoint /Authentication/v1/noauth/reset from GET to POST and from a path parameter to a POST body.

  • Fix various issues relating to unauthenticated servlets. Add new servlet paths e.g. /stroom/XXX becomes /XXX and /stroom/XXX. The latter will be removed in some future release. Notable new servlet paths are /dashboard, /status, /swagger-ui, /echo, /debug, /datafeed, /sessionList.

  • Change sessionList servlet to require manage users permission.

  • Issue #4360 : Fix quick time settings popup.

  • Improve styling of Jobs screen so disabled jobs/nodes are greyed out.

  • Add Next Scheduled column to the detail pane of the Job screen.

  • Add Build Version and Up Date columns to the Nodes screen. Also change the styling of the Ping column so an enabled node with no ping stands out while a disabled node does not. Also change the row styling for disabled nodes.

  • Add a Run now icon to the jobs screen to execute a job on a node immediately.

  • Change the FS Volume and Index Volume tables to low-light CLOSED/INACTIVE volumes. Add tooltips to the path and last updated columns. Change the Use% column to a percentage bar. Add red/green colouring to the Full column values.

  • Issue #4327 : Add a Jobs pane to the Nodes screen to view jobs by node. Add linking between job nodes on the Nodes screen and the Jobs screen.

  • Issue #4339 : Allow user selection of analytic duplicate columns.

  • Issue #2126 : Add experimental state store.

  • Issue #4334 : Popup explorer text on mouse hover.

  • Issue #4278 : Make document deletion also delete the permission records for that document. Also run migration V07_04_00_005__Orphaned_Doc_Perms which will delete all document permissions (in table doc_permission) for docs that are not a folder, not the System doc, are not a valid doc (i.e. in the doc table) and are not a pipeline filter. Deleted document permission records will first be copied to a backup table doc_permission_backup_V07_04_00_005.

  • Change document Copy and Move to check that the user has Owner permission (or admin) on the document being copied/moved if the permissions mode is None, Destination or Combined. This is because those modes will change the permissions which is something only an Owner/admin can do.

  • Issue #3989 : Improve pause behaviour in dashboards and general presentation of busy state throughout UI.

  • Issue #2111 : Add index assistance to find content feature.

For a detailed list of all the changes in v7.5 see: v7.5 CHANGELOG

3 - Version 7.4

Key new features and changes present in v7.4 of Stroom and Stroom-Proxy.

3.1 - New Features

New features in Stroom version 7.4.

Scheduler

The scheduler in Stroom that is used to schedule all the background jobs and Analytic Rules has been changed. The existing cron/frequency scheduler was quite simplistic and only supported a limited set of features of a cron schedule.

The cron format has changed from a three value cron expression (e.g. * * * to run every minute) to a six value one. Existing three value cron expressions will be migrated to the new syntax when deploying Stroom v7.4.

For full details of the new scheduler see Scheduler.

images/releases/07.04/schedule-icon.png

New cron schedule format.

General User Interface Changes

Keyboard Shortcuts

Various new keyboard shortcuts have been added for performing actions in Stroom. See Keyboard Shortcuts for details.

  • Add the keyboard shortcut Ctrl ^ + Enter ↵ to the code pane of the stepper to perform a step refresh.
  • Add the keyboard shortcut Ctrl ^ + Enter ↵ to the Dashboards to execute all queries.
  • Add multiple Goto type shortcuts to jump directly to a screen. See Direct Access to Screens for details.

Documentation

The Documentation entity will now default to edit mode if there is no documentation, e.g. on a newly created Documentation entity.

Copy As

A Copy As group has been added to the explorer tree context menu. This replaces, but still includes, the Copy Link to Clipboard menu item.

  • Copy Name to Clipboard - Copies the name of the entity to the clipboard.
  • Copy UUID to Clipboard - Copies the UUID of the entity to the clipboard.
  • Copy link to Clipboard - Copies a URL to the clipboard that will link directly to the selected entity.
images/releases/07.04/copy-as.png

Copy explorer tree entities as name/UUID/link

Previous versions of Stroom have included an interactive user interface for navigating Stroom’s API specification. A link to this has been added to the Help menu.

Dashboard Conditional Formatting

  • Make the Enabled and Hide Row checkboxes clickable in the table without having to open the rule edit dialog.
  • Dim disabled rules in the table.
  • Add colour swatches for the background and text colours.
images/releases/07.04/conditional-formatting.png

Clickable checkboxes and colour swatches.

Help Icons on Dialogs

Functionality has been added to include help icons on dialogs. Clicking the icon will display a popup containing help text relating to the thing the icon is next to.

Currently help icons have only been added to a few dialogs, but more will follow.

images/releases/07.04/help-icons.png

Help icons on dialogs.

Stepping Location

The way you change the step location in the stepper has changed. Previously you could click and then directly edit each of the three numbers. Now the location label is a clickable link that opens an edit dialog.

images/releases/07.04/step-location-link.png

Modified step location label.
images/releases/07.04/step-location-dialog.png

Step location edit dialog.

3.2 - Preview Features (experimental)

Preview features in Stroom version 7.4. Preview features are somewhat experimental in nature and are therefore subject to breaking changes in future releases.

Analytic Rules

Analytic Rules were introduced in Stroom v7.2 as a preview feature. They remain an experimental preview feature but have undergone more changes/improvements.

Analytic rules are a means of writing a query to find matching data.

Processing Types

Analytic rules have three different processing types:

Streaming

A streaming rule uses a processor filter to find streams that match the filter and runs the query against the stream.

Scheduled Query

A scheduled query will run the rule’s query against a time window of data on a scheduled basis. The time window can be absolute or relative to when the scheduled query fires.

Table Builder

Multiple Notifications

Rules now support having multiple notification types/destinations, for example sending an email as well as writing to a stream. Currently Email and Stream are the only notification types supported.

Email Templating

The email notifications have been improved to allow templating of the email subject and body. The template enables static text/HTML to be mixed with values taken from the detection.

images/releases/07.04/email-notifications.png

Email notification settings.

The templating uses a template syntax called Jinja, specifically the JinJava library. The templating syntax includes support for variables, filters, condition blocks, loops, etc. For full details of the syntax see Templating and for details of the templating context available in email subject/body templates see Rule Detections Context.

Example Template

The following is an example of a detection template producing an HTML email body that includes conditions and loops:

<!DOCTYPE html>
<html lang="en">
<meta charset="UTF-8" />
<title>Detector '{{ detectorName | escape }}' Alert</title>
<body>
  <p>Detector <em>{{ detectorName | escape }}</em> {{ detectorVersion | escape }} fired at {{ detectTime | escape }}</p>

  {%- if (values | length) > 0 -%}
  <p>Detail: {{ headline | escape }}</p>
  <ul>
    {% for key, val in values | dictsort -%}
      <li><strong>{{ key | escape }}</strong>: {{ val | escape }}</li>
    {% endfor %}
  </ul>
  {% endif -%}

  {%- if (linkedEvents | length) > 0 -%}
  <p>Linked Events:</p>
  <ul>
    {% for linkedEvent in linkedEvents -%}
      <li>Environment: {{ linkedEvent.stroom | escape }}, Stream ID: {{ linkedEvent.streamId | escape }}, Event ID: {{ linkedEvent.eventId | escape }}</li>
    {% endfor %}
  </ul>
  {% endif %}
</body>

Improved Date Picker

Scheduled queries make use of a new date picker dialog which makes the process of setting a date/time value much easier for the user. This new date picker will be rolled out to other screens in Stroom in a later release.

images/releases/07.04/date-picker.png

New date picker

3.3 - Breaking Changes

Changes in Stroom version 7.4 that may break existing processing or ways of working.

Analytic Rules

Any Analytic Rules created in versions 7.2 or 7.3 will need to be deleted and re-created in v7.4. Analytic Rules have been an experimental feature, therefore there is no migration in place for rules from previous versions of Stroom.

3.4 - Upgrade Notes

Required actions and information relating to upgrading to Stroom version 7.4.

Java Version

Stroom v7.4 requires Java 21. This is the same Java version as Stroom v7.3. Ensure the Stroom and Stroom-Proxy hosts are running the latest patch release of Java v21.

Database Migrations

When Stroom boots for the first time with a new version it will run any required database migrations to bring the database schema up to the correct version.

On boot, Stroom will ensure that the migrations are only run by a single node in the cluster. This will be the node that reaches that point in the boot process first. All other nodes will wait until that is complete before proceeding with the boot process.

It is recommended however to use a single node to execute the migration. To avoid Stroom starting up and beginning processing you can use the migrate command to just migrate the database and not fully boot Stroom. See the migrate command for more details.

Migration Filename Change

If you are deploying onto a v7.4-beta.1 instance you will need to modify the database table used to log migration history. If the job_schema_history table contains version 07.03.00.001 then you will need to run the following SQL against the database to prevent Stroom from failing the migration on boot.

delete from job_schema_history where version = '07.03.00.001';

Migration Scripts

For information purposes only, the following are the database migrations that will be run when upgrading to 7.4.0 from the previous minor version.

Note, the legacy module will run first (if present) then the other modules will run in no particular order.

Module stroom-analytics

Script V07_04_00_001__execution_schedule.sql

Path: stroom-analytics/stroom-analytics-impl-db/src/main/resources/stroom/analytics/impl/db/migration/V07_04_00_001__execution_schedule.sql

-- ------------------------------------------------------------------------
-- Copyright 2020 Crown Copyright
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
--     http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.
-- ------------------------------------------------------------------------

-- Stop NOTE level warnings about objects (not)? existing
SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0;

-- --------------------------------------------------

--
-- Create the table
--
CREATE TABLE IF NOT EXISTS execution_schedule (
   id int NOT NULL AUTO_INCREMENT,
   name varchar(255) NOT NULL,
   enabled tinyint NOT NULL DEFAULT '0',
   node_name varchar(255) NOT NULL,
   schedule_type varchar(255) NOT NULL,
   expression varchar(255) NOT NULL,
   contiguous tinyint NOT NULL DEFAULT '0',
   start_time_ms bigint DEFAULT NULL,
   end_time_ms bigint DEFAULT NULL,
   doc_type varchar(255) NOT NULL,
   doc_uuid varchar(255) NOT NULL,
   PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

CREATE INDEX execution_schedule_doc_idx ON execution_schedule (doc_type, doc_uuid);
CREATE INDEX execution_schedule_enabled_idx ON execution_schedule (doc_type, doc_uuid, enabled, node_name);

CREATE TABLE IF NOT EXISTS execution_history (
   id bigint(20) NOT NULL AUTO_INCREMENT,
   fk_execution_schedule_id int NOT NULL,
   execution_time_ms bigint NOT NULL,
   effective_execution_time_ms bigint NOT NULL,
   status varchar(255) NOT NULL,
   message longtext,
   PRIMARY KEY (id),
   CONSTRAINT execution_history_execution_schedule_id FOREIGN KEY (fk_execution_schedule_id) REFERENCES execution_schedule (id)
) ENGINE=InnoDB DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

CREATE TABLE IF NOT EXISTS execution_tracker (
   fk_execution_schedule_id int NOT NULL,
   actual_execution_time_ms bigint NOT NULL,
   last_effective_execution_time_ms bigint DEFAULT NULL,
   next_effective_execution_time_ms bigint NOT NULL,
   PRIMARY KEY (fk_execution_schedule_id),
   CONSTRAINT execution_tracker_execution_schedule_id FOREIGN KEY (fk_execution_schedule_id) REFERENCES execution_schedule (id)
) ENGINE=InnoDB DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- --------------------------------------------------

SET SQL_NOTES=@OLD_SQL_NOTES;

-- vim: set shiftwidth=2 tabstop=2 expandtab:

Module stroom-job

Script V07_04_00_005__job_node.sql

Path: stroom-job/stroom-job-impl-db/src/main/resources/stroom/job/impl/db/migration/V07_04_00_005__job_node.sql

-- ------------------------------------------------------------------------
-- Copyright 2020 Crown Copyright
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
--     http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.
-- ------------------------------------------------------------------------

-- Stop NOTE level warnings about objects (not)? existing
SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0;

update job_node set schedule = concat('0 ', schedule, ' * ?') where job_type = 1 and regexp_like(schedule, '^[^ ]+ [^ ]+ [^ ]+$');

SET SQL_NOTES=@OLD_SQL_NOTES;

-- vim: set tabstop=4 shiftwidth=4 expandtab:

3.5 - Change Log

Link to the full CHANGELOG.

For a detailed list of all the changes in v7.4 see: v7.4 CHANGELOG

4 - Version 7.3

Key new features and changes present in v7.3 of Stroom and Stroom-Proxy.

4.1 - New Features

New features in Stroom version 7.3.

User Interface

  • Add a Copy button to the Processors sub-tab on the Pipeline screen. This will create a duplicate of an existing filter.

  • Add a Line Wrapping toggle button to the Server Tasks screen. This will enable/disable line wrapping on the Name and Info cells.

  • Allow pane resizing in dashboards without needing to be in design mode.

  • Add Copy and Jump to hover buttons to the Stream Browser screen to copy the value of the cell or (if it is a document) jump to that document.

    images/releases/07.03/meta-links.png

    Copy and jump to links in the Data Browser screen
  • Tagging of individual explorer nodes was introduced in v7.2. v7.3 however adds support for adding/removing tags to/from multiple explorer tree nodes via the explorer context menu.

Explorer Tree

  • Additional buttons on the top of the explorer tree pane.

    images/releases/07.03/explorer-pane.png

    Explorer Tree Pane
    • Add Expand All and Collapse All buttons to the explorer pane to open or close all items in the tree respectively.

    • Add a Locate Current Item button to the explorer pane to locate the currently open document in the explorer tree.

Finding Things

Find

New screen for finding documents by name. Accessed using Shift ⇧ + Alt + f or

Navigation
Find
images/releases/07.03/find.png

Find

Find In Content

Improvements to the Find In Content screen so that it now shows the content of the document and highlights the matched terms. Now accessible using Shift ⇧ + Ctrl ^ + f or

Navigation
Find In Content
images/releases/07.03/find-in-content.png

Find In Content

Recent Items

New screen for finding recently used documents. Accessed using Ctrl ^ + e or

Navigation
Recent Items
images/releases/07.03/recent-items.png

Recent Items

Editor Snippets

Snippets are a way of quickly adding snippets of pre-defined text into the text editors in Stroom. Snippets have been available in previous versions of Stroom however there have been various additions to the library of snippets available which makes creating/editing content a lot easier.

  • Add snippets for Data Splitter. For the list of available snippets see here.

  • Add snippets for XMLFragmentParser. For the list of available snippets see here.

  • Add new XSLT snippets for <xsl:element> and <xsl:message>. For the list of available snippets see here.

  • Add snippets for StroomQL . For the list of available snippets see here.

API Keys

API Keys are a means for client systems to authenticate with Stroom. In v7.2 of Stroom, the ability to use API Keys was removed if you were using an external identity provider as client systems could get tokens from the IDP themselves. In v7.3 the ability to use API Keys with an external IDP has returned as we felt it offered client systems a choice and removed the complexity of dealing with the IDP.

The API Keys screen has undergone various improvements:

  • Look/feel brought in line with other screens in Stroom.
  • Ability to temporarily enable/disable API Keys. A key that is disabled in Stroom cannot be authenticated against.
  • Deletion of an API Key prevents any future authentication against that key.
  • Named API Keys to indicate purpose, e.g. naming a key with the client system’s name.
  • Comments for keys to add additional context/information for a key.
  • API Key prefix to aid with identifying an API Key.
  • The full API key string is no longer stored in Stroom and cannot be viewed after creation.
images/releases/07.03/api-keys.png

API Keys screen

Key Creation

The screens for creating a new API Key are as follows:

Key Format

We have also made changes to the format of the API Key itself. In v7.2, the API Key was an OAuth token so had data baked into it. In v7.3, the API Key is essentially just a dumb random string, like a very long and secure password. The following is an example of a new API Key:

sak_e1e78f6ee0_6CyT2Mpj2seVJzYtAXsqWwKJzyUqkYqTsammVerbJpvimtn4BpE9M5L2Sx6oeG5prhyzcA7U6fyV5EkwTxoXJPfDWLneQAq16i5P75qdQNbqJ99Wi7xzaDhryMdhVZhs

The structure of the key is as follows:

  1. sak - The key type, Stroom API Key.
  2. _ - separator
  3. SHA2-256 hash of the whole API Key, truncated to 10 chars.
  4. _ - separator
  5. 128 crypto random chars in the Base58 character set. This character set ensures no awkward characters that might need escaping and removes some ambiguous characters (0OIl).

Features of the new format are:

  • Fixed length of 143 chars with fixed prefix (sak_) that makes it easier to search for API Keys in config, e.g. to stop API Keys being leaked into online public repositories and the like.
  • Unique prefix (e.g. sak_e1e78f6ee0_) to help link an API Key being used by a client with the API Key record stored in Stroom. This part of the key is stored and displayed in Stroom.
  • The hash part acts as a checksum for the key to ensure it is correct. The following CyberChef recipe shows how you can validate the hash part of a key.

Analytics

  • Add distributed processing for streaming analytics. This means streaming analytics can now run on all nodes in the cluster rather than just one.

  • Add multiple recipients to rule notification emails. Previously only one recipient could be added.

  • Add support for Lucene 9.8.0 and supporting multiple version of Lucene. Stroom now stores the Lucene version used to create an index shard against the shard so that the correct Lucene version is used when reading/writing from that shard. This will allow Stroom to harness new Lucene features in future while maintaining backwards compatibility with older versions.

  • Add support for field in and field in dictionary to StroomQL.

Processing

  • Improve the display of processor filter state. The columns Tracker Ms and Tracker % have been removed and the Status column has been improved to better reflect the state of the filter tracker.

  • Stroom now supports the XSLT standard element <xsl:message>. This element will be handled as follows:

    <!-- Log `some message` at severity `FATAL` and terminate processing of that stream part immediately.
         Note that `terminate="yes"` will trump any severity that is also set. -->
    <xsl:message terminate="yes">some message</xsl:message>
    
    <!-- Log `some message` at severity `ERROR`. -->
    <xsl:message>some message</xsl:message>
    
    <!-- Log `some message` at severity `FATAL`. -->
    <xsl:message><fatal>some message</fatal></xsl:message>
    
    <!-- Log `some message` at severity `ERROR`. -->
    <xsl:message><error>some message</error></xsl:message>
    
    <!-- Log `some message` at severity `WARNING`. -->
    <xsl:message><warn>some message</warn></xsl:message>
    
    <!-- Log `some message` at severity `INFO`. -->
    <xsl:message><info>some message</info></xsl:message>
    
    <!-- Log $msg at severity `ERROR`. -->
    <xsl:message><xsl:value-of select="$msg"/></xsl:message>
    

    The namespace of the severity element (e.g. <info>) is ignored.

  • Add the following pipeline element properties to allow control of logged warnings for removal/replacement respectively.

XSLT Functions

  • Add XSLT function stroom:hex-to-string(hex, charsetName).
  • Add XSLT function stroom:cidr-to-numeric-ip-range.
  • Add XSLT function stroom:ip-in-cidr for testing whether an IP address is within the specified CIDR range.

For details of the new functions, see XSLT Functions.

API

  • Add the un-authenticated API method /api/authproxy/v1/noauth/fetchClientCredsToken which effectively acts as a proxy for the IDP's token endpoint to obtain an access token using the client credentials flow. The request contains the client credentials and looks like { "clientId": "a-client", "clientSecret": "BR9m.....KNQO" }. The response media type is text/plain and contains the access token.
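
    The following is a minimal sketch of calling this endpoint with curl; the host name and credential values are placeholders:

    # Hedged example: obtain an access token via the authproxy endpoint.
    # stroom.example.com and the credentials are placeholder values.
    curl -s \
      -H 'Content-Type: application/json' \
      -d '{ "clientId": "a-client", "clientSecret": "BR9m.....KNQO" }' \
      https://stroom.example.com/api/authproxy/v1/noauth/fetchClientCredsToken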

4.2 - Preview Features (experimental)

Preview experimental features in Stroom version 7.3.

S3 Storage

Integration with S3 storage has been added to allow Stroom to read/write to/from S3 storage, e.g. S3 on AWS. A data volume can now be created as either Standard or S3. If configured as S3, you need to supply the S3 configuration data.

images/releases/07.03/add-volume.png

Add Volume screen.

This is an experimental feature at this stage and may be subject to change. The way Stroom reads and writes data has not been optimised for S3 so performance at scale is currently unknown.

4.3 - Breaking Changes

Changes in Stroom version 7.3 that may break existing processing or ways of working.
  • The Hessian based feed status RPC service /remoting/remotefeedservice.rpc has been removed as it used the legacy javax.servlet dependency, which is incompatible with the jakarta.servlet dependency now in use in Stroom. This service was used by Stroom-Proxy up to v5.

  • The StroomQL keyword combination vis as has been replaced with show.

4.4 - Upgrade Notes

Required actions and information relating to upgrading to Stroom version 7.3.

Java Version

Stroom v7.3 requires Java v21. Previous versions of Stroom used Java v17 or lower. You will need to upgrade Java on the Stroom and Stroom-Proxy hosts to the latest patch release of Java v21.

API Keys

With the change to the format of API Keys (see here), it is recommended to migrate legacy API Keys over to the new format. There is no hard requirement to do this as legacy keys will continue to work as is; however, the new keys are easier to work with and Stroom has more control over the new format keys, making them more secure. You are encouraged to create new keys for client systems and ask them to change the keys over.

Legacy Key Migration

The new API Keys are now stored in a new table api_key. Legacy keys will be migrated into this table and given a key name and prefix like LEGACY_API_KEY_N, where N is a unique number. As the whole API Key was previously visible in v7.2, the API Key string is migrated into the Comments field so it remains visible in the UI.

images/releases/07.03/api-keys.png

API Keys screen

Database Migrations

When Stroom boots for the first time with a new version it will run any required database migrations to bring the database schema up to the correct version.

On boot, Stroom will ensure that the migrations are only run by a single node in the cluster. This will be the node that reaches that point in the boot process first. All other nodes will wait until that is complete before proceeding with the boot process.

It is recommended however to use a single node to execute the migration. To avoid Stroom starting up and beginning processing, you can use the migrate command to just migrate the database without fully booting Stroom. See the migrate command for more details, and the sketch below.
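
The following is a minimal sketch of running the migration in this way, using placeholder paths for the application jar and the YAML configuration file (adjust for your installation):

# Hedged example: run only the database migrations without fully booting Stroom.
# The jar and config paths are placeholders for your deployment.
java -jar /opt/stroom/stroom-app.jar migrate /opt/stroom/config/config.yml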

Migration Scripts

For information purposes only, the following are the database migrations that will be run when upgrading to 7.3.0 from the previous minor version.

Note, the legacy module will run first (if present), then the other modules will run in no particular order.

Module stroom-data

Script V07_03_00_001__fs_volume_s3.sql

Path: stroom-data/stroom-data-store-impl-fs-db/src/main/resources/stroom/data/store/impl/fs/db/migration/V07_03_00_001__fs_volume_s3.sql

-- ------------------------------------------------------------------------
-- Copyright 2020 Crown Copyright
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
--     http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.
-- ------------------------------------------------------------------------

-- Stop NOTE level warnings about objects (not)? existing
SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0;

CREATE TABLE IF NOT EXISTS fs_volume_group (
  id                    int NOT NULL AUTO_INCREMENT,
  version               int NOT NULL,
  create_time_ms        bigint NOT NULL,
  create_user           varchar(255) NOT NULL,
  update_time_ms        bigint NOT NULL,
  update_user           varchar(255) NOT NULL,
  name                  varchar(255) NOT NULL,
  -- 'name' needs to be unique because it is used as a reference
  UNIQUE (name),
  PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;


DROP PROCEDURE IF EXISTS V07_03_00_001;

DELIMITER $$

CREATE PROCEDURE V07_03_00_001 ()
BEGIN
    DECLARE object_count integer;

    -- Add volume type
    SELECT COUNT(1)
    INTO object_count
    FROM information_schema.columns
    WHERE table_schema = database()
    AND table_name = 'fs_volume'
    AND column_name = 'volume_type';

    IF object_count = 0 THEN
        ALTER TABLE `fs_volume` ADD COLUMN `volume_type` int NOT NULL;
        ALTER TABLE `fs_volume` ADD COLUMN `data` longblob;
        UPDATE `fs_volume` set `volume_type` = 0;
    END IF;

    -- Add default group
    SELECT COUNT(*)
    INTO object_count
    FROM fs_volume_group
    WHERE name = "Default";

    IF object_count = 0 THEN
        INSERT INTO fs_volume_group (
          version,
          create_time_ms,
          create_user,
          update_time_ms,
          update_user,
          name)
        VALUES (
            1,
            UNIX_TIMESTAMP() * 1000,
            "Flyway migration",
            UNIX_TIMESTAMP() * 1000,
            "Flyway migration",
            "Default Volume Group");
    END IF;

    -- Add volume group
    SELECT COUNT(1)
    INTO object_count
    FROM information_schema.columns
    WHERE table_schema = database()
    AND table_name = 'fs_volume'
    AND column_name = 'fk_fs_volume_group_id';

    IF object_count = 0 THEN
        ALTER TABLE `fs_volume`
        ADD COLUMN `fk_fs_volume_group_id` int NOT NULL;
        UPDATE `fs_volume` SET `fk_fs_volume_group_id` = (SELECT `id` FROM `fs_volume_group` WHERE `name` = "Default Volume Group");
        ALTER TABLE fs_volume
            ADD CONSTRAINT fs_volume_group_fk_fs_volume_group_id
            FOREIGN KEY (fk_fs_volume_group_id)
            REFERENCES fs_volume_group (id);
    END IF;

END $$

DELIMITER ;

CALL V07_03_00_001;

DROP PROCEDURE IF EXISTS V07_03_00_001;

SET SQL_NOTES=@OLD_SQL_NOTES;

-- vim: set shiftwidth=4 tabstop=4 expandtab:

Module stroom-index

Script V07_03_00_001__index_field.sql

Path: stroom-index/stroom-index-impl-db/src/main/resources/stroom/index/impl/db/migration/V07_03_00_001__index_field.sql

-- ------------------------------------------------------------------------
-- Copyright 2020 Crown Copyright
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
--     http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.
-- ------------------------------------------------------------------------

-- Stop NOTE level warnings about objects (not)? existing
SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0;

DROP PROCEDURE IF EXISTS drop_field_source;
DELIMITER //
CREATE PROCEDURE drop_field_source ()
BEGIN
    IF EXISTS (
        SELECT NULL
        FROM INFORMATION_SCHEMA.TABLES
        WHERE TABLE_SCHEMA = database()
        AND TABLE_NAME = 'field_info') THEN
        DROP TABLE field_info;
    END IF;
    IF EXISTS (
        SELECT NULL
        FROM INFORMATION_SCHEMA.TABLES
        WHERE TABLE_SCHEMA = database()
        AND TABLE_NAME = 'field_source') THEN
        DROP TABLE field_source;
    END IF;
    IF EXISTS (
        SELECT NULL
        FROM INFORMATION_SCHEMA.TABLES
        WHERE TABLE_SCHEMA = database()
        AND TABLE_NAME = 'field_schema_history') THEN
        DROP TABLE field_schema_history;
    END IF;
END//
DELIMITER ;
CALL drop_field_source();
DROP PROCEDURE drop_field_source;

--
-- Create the field_source table
--
CREATE TABLE IF NOT EXISTS `index_field_source` (
    `id`        int NOT NULL AUTO_INCREMENT,
    `type`      varchar(255) NOT NULL,
    `uuid`      varchar(255) NOT NULL,
    `name`      varchar(255) NOT NULL,
    PRIMARY KEY (`id`),
    UNIQUE KEY  `index_field_source_type_uuid` (`type`, `uuid`)
) ENGINE=InnoDB DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

--
-- Create the index_field table
--
CREATE TABLE IF NOT EXISTS `index_field` (
    `id`                        bigint NOT NULL AUTO_INCREMENT,
    `fk_index_field_source_id`  int NOT NULL,
    `type`                      tinyint NOT NULL,
    `name`                      varchar(255) NOT NULL,
    `analyzer`                  varchar(255) NOT NULL,
    `indexed`                   tinyint NOT NULL DEFAULT '0',
    `stored`                    tinyint NOT NULL DEFAULT '0',
    `term_positions`            tinyint NOT NULL DEFAULT '0',
    `case_sensitive`            tinyint NOT NULL DEFAULT '0',
    PRIMARY KEY                 (`id`),
    UNIQUE KEY                  `index_field_source_id_name` (`fk_index_field_source_id`, `name`),
    CONSTRAINT `index_field_fk_index_field_source_id` FOREIGN KEY (`fk_index_field_source_id`) REFERENCES `index_field_source` (`id`)
) ENGINE=InnoDB DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

SET SQL_NOTES=@OLD_SQL_NOTES;

-- vim: set tabstop=4 shiftwidth=4 expandtab:

Script V07_03_00_005__index_field_change_pk.sql

Path: stroom-index/stroom-index-impl-db/src/main/resources/stroom/index/impl/db/migration/V07_03_00_005__index_field_change_pk.sql

-- ------------------------------------------------------------------------
-- Copyright 2020 Crown Copyright
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
--     http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.
-- ------------------------------------------------------------------------

-- Stop NOTE level warnings about objects (not)? existing
SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0;

DELIMITER $$

DROP PROCEDURE IF EXISTS modify_field_source$$

-- The surrogate PK results in fields from different indexes all being mixed together
-- in the PK index, which causes deadlocks in batch upserts due to gap locks.
-- Change the PK to be (fk_index_field_source_id, name) which should keep the fields
-- together.

CREATE PROCEDURE modify_field_source ()
BEGIN

    -- Remove existing PK
    IF EXISTS (
            SELECT NULL
            FROM INFORMATION_SCHEMA.columns
            WHERE TABLE_SCHEMA = database()
            AND TABLE_NAME = 'index_field'
            AND COLUMN_NAME = 'id') THEN

        ALTER TABLE index_field DROP COLUMN id;
    END IF;

    -- Add the new PK
    IF NOT EXISTS (
            SELECT NULL
            FROM INFORMATION_SCHEMA.table_constraints
            WHERE TABLE_SCHEMA = database()
            AND TABLE_NAME = 'index_field'
            AND CONSTRAINT_NAME = 'PRIMARY') THEN

        ALTER TABLE index_field ADD PRIMARY KEY (fk_index_field_source_id, name);
    END IF;

    -- Remove existing index that is now served by PK
    IF EXISTS (
            SELECT NULL
            FROM INFORMATION_SCHEMA.table_constraints
            WHERE TABLE_SCHEMA = database()
            AND TABLE_NAME = 'index_field'
            AND CONSTRAINT_NAME = 'index_field_source_id_name') THEN

        ALTER TABLE index_field DROP INDEX index_field_source_id_name;
    END IF;

END $$

DELIMITER ;

CALL modify_field_source();

DROP PROCEDURE IF EXISTS modify_field_source;

SET SQL_NOTES=@OLD_SQL_NOTES;

-- vim: set tabstop=4 shiftwidth=4 expandtab:

Module stroom-processor

Script V07_03_00_001__processor.sql

Path: stroom-processor/stroom-processor-impl-db/src/main/resources/stroom/processor/impl/db/migration/V07_03_00_001__processor.sql

-- ------------------------------------------------------------------------
-- Copyright 2020 Crown Copyright
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
--     http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.
-- ------------------------------------------------------------------------

-- Stop NOTE level warnings about objects (not)? existing
SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0;

DROP PROCEDURE IF EXISTS modify_processor;
DELIMITER //
CREATE PROCEDURE modify_processor ()
BEGIN
     DECLARE object_count integer;

     SELECT COUNT(1)
     INTO object_count
     FROM information_schema.table_constraints
     WHERE table_schema = database()
     AND table_name = 'processor'
     AND constraint_name = 'processor_pipeline_uuid';

     IF object_count = 1 THEN
         ALTER TABLE processor DROP INDEX processor_pipeline_uuid;
     END IF;

     SELECT COUNT(1)
     INTO object_count
     FROM information_schema.table_constraints
     WHERE table_schema = database()
     AND table_name = 'processor'
     AND constraint_name = 'processor_task_type_pipeline_uuid';

     IF object_count = 0 THEN
         CREATE UNIQUE INDEX processor_task_type_pipeline_uuid ON processor (task_type, pipeline_uuid);
     END IF;
END//
DELIMITER ;
CALL modify_processor();
DROP PROCEDURE modify_processor;

SET SQL_NOTES=@OLD_SQL_NOTES;

-- vim: set shiftwidth=4 tabstop=4 expandtab:

Script V07_03_00_005__processor_filter.sql

Path: stroom-processor/stroom-processor-impl-db/src/main/resources/stroom/processor/impl/db/migration/V07_03_00_005__processor_filter.sql

-- ------------------------------------------------------------------------
-- Copyright 2020 Crown Copyright
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
--     http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.
-- ------------------------------------------------------------------------

-- Stop NOTE level warnings about objects (not)? existing
SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0;

-- --------------------------------------------------

DELIMITER $$

-- --------------------------------------------------

DROP PROCEDURE IF EXISTS processor_run_sql_v1 $$

-- DO NOT change this without reading the header!
CREATE PROCEDURE processor_run_sql_v1 (
    p_sql_stmt varchar(1000)
)
BEGIN

    SET @sqlstmt = p_sql_stmt;

    SELECT CONCAT('Running sql: ', @sqlstmt);

    PREPARE stmt FROM @sqlstmt;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;
END $$

-- --------------------------------------------------

DROP PROCEDURE IF EXISTS processor_add_column_v1$$

-- DO NOT change this without reading the header!
CREATE PROCEDURE processor_add_column_v1 (
    p_table_name varchar(64),
    p_column_name varchar(64),
    p_column_type_info varchar(64) -- e.g. 'varchar(255) default NULL'
)
BEGIN
    DECLARE object_count integer;

    SELECT COUNT(1)
    INTO object_count
    FROM information_schema.columns
    WHERE table_schema = database()
    AND table_name = p_table_name
    AND column_name = p_column_name;

    IF object_count = 0 THEN
        CALL processor_run_sql_v1(CONCAT(
            'alter table ', database(), '.', p_table_name,
            ' add column ', p_column_name, ' ', p_column_type_info));
    ELSE
        SELECT CONCAT(
            'Column ',
            p_column_name,
            ' already exists on table ',
            database(),
            '.',
            p_table_name);
    END IF;
END $$

-- idempotent
CALL processor_add_column_v1(
        'processor_filter',
        'max_processing_tasks',
        'int NOT NULL DEFAULT 0');

-- vim: set shiftwidth=4 tabstop=4 expandtab:

4.5 - Change Log

Link to the full CHANGELOG.

For a detailed list of all the changes in v7.3 see: v7.3 CHANGELOG

5 - Version 7.2

Key new features and changes present in v7.2 of Stroom and Stroom-Proxy.

Stroom v7.1 was not widely adopted so this section may describe features or changes that were part of the v7.1 release.

5.1 - New Features

New features in Stroom version 7.2.

Look and Feel

New User Interface Design

The user interface has had a bit of a re-design to give it a more modern look and to make it conform to accessibility standards.

images/releases/07.02/new-look.png

User Preferences

Now you can customise Stroom with your own personal preferences. From the main menu , select:

User
Preferences

You can also change the following:

  • Layout Density - This controls the layout spacing to fit more or less user interface elements in the available space.

  • Font - Change font used in Stroom.

  • Font Size - Change the font size used in Stroom.

  • Transparency - Enables partial transparency of dialog windows. Entirely cosmetic.

Theme

Choose between the traditional light theme and a new dark theme with light text on a dark background.

Editor Preferences

The Ace text editor used within Stroom is used for editing things like XSLTs and viewing stream data. It can now be personalised with the following options:

  • Theme - The colour theme for the editor. The theme options offered will be appropriate to the main user interface theme, i.e. light/dark editor themes. The theme affects the colours used for the syntax highlighted content.

  • Key Bindings - Allows you to set the editor to use Vim key bindings for more powerful text editing. If you don’t know what Vim is then it is best to stick to Standard. If you would like to learn how to use Vim, install vimtutor. Note: The Ace editor does not fully emulate Vim; not all Vim key bindings work and there is limited command mode support.

  • Live Auto Completion - Set this to On if you want the editor code completion to make suggestions as you type. When set to Off you need to press Ctrl ^ + Space ␣ to show the suggestion dialog.

Date and Time

You can now change the format used for displaying the date and time in the user interface. You can also set the time zone used for displaying the date and time in the user interface.

Dashboard Changes

Design Mode

A Design Mode has been introduced to Dashboards and is toggled using the button . When a Dashboard is in design mode, the following functionality is enabled:

  • Adding components to the Dashboard.
  • Removing components from the Dashboard.
  • Moving Dashboard components within panes, to new panes or to existing panes.
  • Changing the constraints of the Dashboard.

On creation of a new Dashboard, Design Mode will be on so the user has full functionality. On opening an existing Dashboard, Design Mode will be off. This is because typically, Dashboards are viewed more than they are modified.

Visual Constraints

Now it is possible to control the horizontal and vertical constraints of a Dashboard. In Stroom 7.0, a dashboard would always scale to fit the user’s screen. Sometimes it is desirable for the dashboard canvas area to be larger than the screen so that you have to scroll to see it all. For example you may have a dashboard with a large number of component panes and rather than squashing them all into the available space you want to be able to scroll vertically in order to see them all.

It is now possible to change the horizontal and/or vertical constraints to fit the available width/height or to be fixed by clicking the button.

The edges of the canvas can be moved to shrink/grow it.

Explorer Filter Matches

Filtering in the explorer has been changed to highlight the filter matches and to search in folders that themselves are a match. In Stroom v7.0 folders that matched were not expanded. Match highlighting makes it clearer what items have matched.

images/releases/07.02/explorer-filter-matches.png

Document Permissions Screen

The document and folder permissions screens have been re-designed with a better layout and item highlighting to make it clearer which permissions have been selected.

images/releases/07.02/folder-permissions.png

Editor Completion snippets

The number of available editor completion snippets has increased. For a list of the available completion snippets see the Completion Snippet Reference.

Partitioned Reference Data Stores

In Stroom v7.0 reference data is loaded using a reference loader pipeline and the key/value pairs are stored in a single disk backed reference data store on each Stroom node for fast lookup. This single store approach has led to high contention and performance problems when purge operations are running against it at the same time as loads, or when multiple reference Feeds are being loaded at the same time.

In Stroom v7.2 the reference data key/value pairs are now stored in multiple reference data stores on each node, with one store per Feed. This reduces contention as reference data for one Feed can be loading while a purge operation is running on the store for another Feed or reference data for multiple Feeds can be loaded concurrently. Performance is still limited by the file system that the stores are hosted on.

All reference data stores are stored in the directory defined by stroom.pipeline.referenceData.lmdb.localDir.

Improved OAuth2.0/OpenID Connect Support

The support for Open ID Connect (OIDC) authentication has been improved in v7.2. Stroom can be integrated with AWS Cognito, MS Azure AD, KeyCloak and other OIDC Identity Providers.

Data receipt in Stroom and Stroom-Proxy can now enforce OIDC token authentication as well as certificate authentication. The data receipt authentication is configured via the properties:

  • stroom.receive.authenticationRequired
  • stroom.receive.certificateAuthenticationEnabled
  • stroom.receive.tokenAuthenticationEnabled
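
The following is a minimal sketch of token authenticated data receipt, assuming the standard /stroom/datafeed endpoint and the usual Feed header; the host, feed name and token are placeholder values:

# Hedged example: post a file to Stroom data receipt using an OIDC access token.
# Host, feed name and ACCESS_TOKEN are placeholders for your environment.
curl -s --data-binary @events.log \
  -H "Authorization: Bearer ${ACCESS_TOKEN}" \
  -H "Feed: MY_FEED" \
  https://stroom.example.com/stroom/datafeed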

Stroom and Stroom-Proxy have also been changed to use OIDC tokens for API endpoints and inter-node communications. This currently requires the OIDC IDP to support the client credentials flow.

Stroom can still be used with its own internal IDP if you do not have an external IDP available.

User Naming Changes

The changes to add integration with external OAuth 2.0/OpenID Connect identity providers have required some changes to the way users in Stroom are identified.

Previously in Stroom a user would have a unique username that would be set when creating the account in Stroom. This would typically be a human friendly name like jbloggs or similar. It would be used in all the user/permission management screens to identify the user, for functions like current-user(), for simple audit columns in the database (create_user and update_user) and for the audit events Stroom produces.

With the integration to external identity providers this has had to change a little. Typically in OpenID Connect IDPs the unique identity of a principal (user) is a fairly unfriendly UUID. The user will likely also have a more human friendly identity (sometimes called the preferred_username) that may be something like jbloggs or jbloggs@somedomain.x. As per the OpenID Connect specification, this friendly identity may not be unique within the IDP, so Stroom has to assume this also. In reality this identity is typically unique on the IDP though. The IDP will often also have a full name for the user, e.g. Joe Bloggs.

Stroom now stores and can display all of these identities.

images/releases/07.02/application-permissions.png
  • Display Name - This is the (potentially non-unique) preferred user name held by the IDP, e.g. jbloggs or jbloggs@somedomain.x.
  • Full Name - The user’s full name, e.g. Joe Bloggs, if known by the IDP.
  • Unique User Identity - The unique identity of the user on the IDP, which may look like ca650638-b52c-45af-948c-3f34aeeb6f86.

In most screens, Stroom will display the Display Name. This will also be used for any audit purposes. The permissions screen shows all three identities so an admin can be sure which user they are dealing with and can correlate it with one on the IDP.

User Creation

When using an external IDP, a user visiting Stroom for the first time will result in the creation of a Stroom User record for them. This Stroom User will have no permissions associated with it. To improve the experience for a new user it is preferable for the Stroom administrator to pre-create the Stroom User account in Stroom with the necessary permissions.

This can be done from the Application Permissions screen accessed from the Main menu ( ).

Security
Application Permissions

You can create a single Stroom User by clicking the button.

images/releases/07.02/create-single-user.png

Or you can create multiple Stroom Users by clicking the button.

images/releases/07.02/create-multiple-users.png

In both cases the Unique User ID is mandatory, and this must be obtained from the IDP. The Display Name and Full Name are optional, as these will be obtained automatically from the IDP by Stroom on login. It can be useful to populate them initially to make it easier for the administrator to see who is who in the list of users.

Once the user(s) are created, the appropriate permissions/groups can be assigned to them so that when they log in for the first time they will be able to see the required content and be able to use Stroom.

New Document types

The following new types of document can be created and managed in the explorer tree.

Documentation

It is now possible to create a Documentation entity in the explorer tree. This is designed to hold any text or documentation that the user chooses to write in Markdown format. These can be useful for providing documentation within a folder in the tree to collectively describe all the items in that folder, or to provide a useful README type document. It is not possible to add documentation to a folder entity itself, so this is a useful substitute.

Elastic Cluster

Elastic Cluster provides a means to define a connection to an Elasticsearch Cluster. You would create one of these documents for each Elasticsearch cluster that you want to connect to. It defines the location and authentication details for connecting to an elastic cluster.

images/releases/07.02/elastic-cluster.png

Thanks to Pete K for his help adding the new Elasticsearch integration features.

Elastic Index

An Elastic Index document is a data source for searching one or more indexes on Elasticsearch.

images/releases/07.02/elastic-index.png

New Searchables

A Searchable is one of the data sources that appear at the top level of the tree pickers but not in the explorer tree.

Analytics

Adds the ability to query data from Table Builder type Analytic Rules.

New Pipeline Elements

DynamicIndexingFilter

DynamicIndexingFilter

This filter element is used by Views and Analytic Rules. Unlike IndexingFilter, where you have to specify all the fields in the index up front for them to be visible to the user in a Dashboard, DynamicIndexingFilter allows fields to be dynamically created in the XSLT based on the event being indexed. These dynamic fields are then ‘discovered’ after the event has been added to the index.

DynamicSearchResultOutputFilter

DynamicSearchResultOutputFilter

This filter element is used by Views and Analytic Rules. Unlike SearchResultOutputFilter, this element can discover the fields found in the extracted event when the extraction pipeline creates fields that are not present in the index. These discovered fields are then available for the user to pick from in the Dashboard/Query.

ElasticIndexingFilter

ElasticIndexingFilter

ElasticIndexingFilter is used to pass fields from an event to an Elasticsearch cluster to index.

Explorer Tree

Various enhancements have been made to the explorer tree.

Favourites

Users now have the ability to mark explorer tree entities as favourites. Favourites are user specific so each user can define their own favourites. This feature is useful for quick access to commonly used entities. Any entity or Folder at any level in the explorer tree can be set as a favourite. Favourites are also visible in the various entity pickers used in Stroom, e.g. Feed pickers.

An entity/folder can be added or removed from the favourites section using the context menu items :

Add to Favourites
Remove from Favourites

An entity that is a favourite is marked with a in the main tree.

images/releases/07.02/favourites.png

A change to a child item of a folder marked as a favourite will be reflected in both the main tree and the favourites section. All items marked as a favourite will appear as a top level item underneath the Favourites root, even if they have an ancestor folder that is also a favourite.

Thanks to Pete K for adding this new feature.

Document Tagging

You can now add tags to entities or folders in the explorer tree. Tags provide an additional means of searching for entities or folders. It allows entities/folders that reside in different folders to be associated together in one or more ways.

The tags on an entity/folder can be managed from the explorer tree context menu item:

Edit Tags

The explorer tree can be filtered by tag using the field prefix tag:, e.g. tag:extraction.

If multiple entities/folders are selected in the explorer tree then the following menu items are available:

Add Tags
Remove Tags

Pre-populated Tag Filters

Stroom comes pre-configured with some default tags. The property that sets these is stroom.explorer.suggestedTags. The defaults for this property are dynamic, extraction and reference-loader.

These pre-configured tags are also used in some of the tree pickers in Stroom to provide extra filtering of entities in the picker. For example, when selecting a Pipeline on an XSLT Filter, the filter on the tree picker to select the pipeline will be pre-populated with tag:reference-loader so only reference loader pipelines are included.

images/releases/07.02/tree-picker-tag-filter.png

The following properties control the tags used to pre-populate tree picker filters:

  • stroom.ui.query.dashboardPipelineSelectorIncludedTags
  • stroom.ui.query.viewPipelineSelectorIncludedTags
  • stroom.ui.referencePipelineSelectorIncludedTags

It is now possible to easily copy a direct link to a Document from the explorer tree. Direct links are useful if, for example, you want to share a link to a particular Stroom dashboard.

To create a direct link, right click on the document you want a link for in the explorer tree and select:

Copy Link to Clipboard

You can then paste the link into a browser to jump directly to that document (authenticating as required).

Dependencies

It is now possible to jump to the Dependencies screen to see the dependencies or dependants of a particular document. In the explorer tree, right click on a document and select one of:

Dependencies
This will open the Dependencies screen with a filter pre-populated to show all documents that are dependencies of the selected document.

Dependants
This will open the Dependencies screen with a filter pre-populated to show all documents that depend on the selected document.

Broken Dependency Alerts

It is now possible to see alert icons in the explorer tree to highlight documents that have broken dependencies. The user can hover over these icons to display more information about the broken dependency. The explorer tree will show the alert icon against all documents with a broken dependency and all of its ancestor folders.

images/releases/07.02/explorer-dependency-alerts.png

A broken dependency means a document (e.g. an XSLT) has a dependency on another document (e.g. a reference loader Pipeline) but that document does not exist. Broken dependencies can occur when a user deletes a document that other documents depend on, or by a partial import of content.

This feature is disabled by default as it can have a significant effect on performance of the explorer tree with large trees. To enable this feature, set the property stroom.explorer.dependencyWarningsEnabled to true.

Once enabled at the system level by the property, the display of alerts in the tree can be enabled/disabled by the user using the Toggle Alerts button.

Entity Documentation

It is now possible to add documentation to all entities / documents in the explorer tree, e.g. adding documentation on a Feed. Each entity now has a Documentation sub-tab where the user can enter any documentation they choose about that entity. The documentation is written in Markdown syntax. It is not possible to add documentation to a Folder, but you can create one or more Documentation entities as child items of that folder, see Documentation.

Find Content

You can now search the content of entities in the explorer tree, e.g. searching within XSLTs, Dictionaries, Pipeline structure, etc. This feature is available from the main menu :

Navigation
Find Content

It can also be accessed by hitting Ctrl ^ + Shift ⇧ + f (unless an editor pane has focus).

images/releases/07.02/find-content.png

It is useful for finding which pipelines are using a certain element, or what XSLTs are using a certain stroom: function.

This is an early evolution of this feature and it is likely to be improved with time.

Search Result Stores

When a Dashboard/Query search is run, the results are written to a Search Results Store for that query. These stores reside on disk to reduce the memory used by queries. The Search Result Stores are stored on a single Stroom node and get created when a query is executed in a Dashboard, Query or Analytic Rule.

This screen provides an administrator with an overview of all the stores currently in existence in the Stroom cluster, showing details on their state and size. It can also be used to stop queries that are currently running or to delete the store entirely. Stores get deleted when the user closes the Dashboard or Query that created them.

images/releases/07.02/search-result-stores.png

Pipeline Stepper Improvements

The pipeline stepper has had a few user interface tweaks to make it easier to use.

Log Pane

When there are errors, warnings or info messages on a pipeline element they will now also be displayed in a pane at the bottom. This makes it easier to see all messages in one place.

images/releases/07.02/stepper-log-pane.png

The editor still displays icons with hover tips in the gutter on the appropriate line where the message has an associated line number.

The log pane can be hidden by clicking the icon.

Highlighting

The pipeline displayed at the top of the stepper now highlights elements that have log messages against them. This makes it easier to see when there is a problem with an element as you step through the data. The elements are given a coloured border according to the highest severity message on that element:

  • Info - Blue
  • Warning - Yellow
  • Error - Red
  • Fatal Error - Red (pulsating)
images/releases/07.02/stepper-element-highlight.png

Filtering

Stroom has always had the ability to filter the data being stepped, however the feature was a little hidden (the Manage Step Filters icon).

Now you can right click on a pipeline element to manage the filters on that element. You can also clear its filters or the filters on all elements.

images/releases/07.02/stepper-context-menu.png

The pipeline now shows which elements have an active filter by displaying a filter icon.

images/releases/07.02/stepper-element-filter-icons.png

Server Tasks

Auto Refresh

You can now enable/disable the auto-refreshing of the Server Tasks table using the button. Auto refresh is enabled by default. Disabling it is useful when you want to delete a task, as it will stop the table being refreshed just before you hit delete.

Line Wrapping

You can now enable/disable line wrapping in the Name and Info cells using the button. Line wrapping is disabled by default. Enabling this is useful to see long Info cell values.

Info Popup

The Info popup has been changed to include the value from the Info column.

Proxy

Stroom-Proxy v7.2 has undergone a significant re-write in an attempt to address certain performance issues, make it more flexible and to allow data to be forked to many destinations.

5.2 - Preview Features (experimental)

Preview experimental features in Stroom version 7.2.

New Document types

View

A View is a document type that has been added to make using Dashboards and Queries easier. It encapsulates the data source and an optional extraction pipeline.

images/releases/07.02/view.png

Previously a user wanting to create a Dashboard to query Stroom’s indexes would need to first select the Index to use as the data source and then select an extraction pipeline. The indexes do not typically store the full event, so extraction pipelines retrieve the full event from the stream store for each matching event returned by the index. Users should not need to understand the distinction between what is held in the index and what has to be extracted, nor should they need to know how to do that extraction.

A View abstracts the user from this process. They can be configured by an admin or more senior user so that a standard user can just select an appropriate View as the data source in a Dashboard or Query and the View will silently handle the retrieval/extraction of data.

Views are also used by Analytic Rules, so a View needs to define a Meta filter that controls the streams that will be processed by the analytic. This filter should mirror the processor filter expression used to control data processed by the Index that the View is using. These two filters may be amalgamated in a future version of Stroom.

Query

The Query feature provides a new way to query data in Stroom. It is a functional but evolving feature that is likely to be enhanced further in future versions.

Rather than using the query expression builder and table column expressions as used in Dashboards, it uses the new text based Stroom Query Language to define the query.

images/releases/07.02/query.png

Stroom Query Language (StroomQL)

This is an example of a StroomQL query. It replaces the old dashboard expression ’tree’ and table column expressions. StroomQL has the advantage of being quicker to construct and is easier to copy from one query to another (whole or in part) as it is just plain text.

FROM "Example View"                           // Define the View to use as the data source
WHERE Action IN("Search", "View")             // Equivalent to the Dashboard expression tree
EVAL hour = floorHour(EventTime)              // Define named fields based on function expressions
EVAL event_count = count()
GROUP BY Feed, Action                         // Equivalent to Dashboard table column grouping
SELECT Feed, Action, event_count AS "Count"   // Equivalent to adding columns to a Dashboard table

Editing StroomQL queries in the editor is also made easier by the code completion (using ctrl+space) to suggest data sources, fields, functions and StroomQL language terms. StroomQL queries can be executed easily with ctrl+enter or shift+enter.

Analytic Rule

Analytic Rules allow the user to create scheduled or streaming Analytic Rules that will fire alerts when events matching the rule are seen.

Analytic rules rely on the new Stroom Query Language to define what events will match the rule. An Analytic Rule can be created directly from a Query by clicking the Create Analytic Rule icon.

5.3 - Breaking Changes

Changes in Stroom version 7.2 that may break existing processing or ways of working.

Quoted Strings in Dashboard Table Expressions

Quoted strings in dashboard table expressions can now be expressed with single and double quotes. As part of this change apostrophes in text are no longer escaped with an additional apostrophe (''), but instead require a leading \ before them if they are in a single quoted string, i.e:

'O''Neill' must be changed to 'O\'Neill'

In many cases it is preferable to use double quotes if the string in question has an apostrophe. Note that the use of \ as an escape character also means that any existing \ characters will need to be escaped with a preceding \ so \ must now become \\, i.e:

c:\Windows\System32 must be changed to c:\\Windows\\System32

The new Find Content feature can be used to find affected Dashboards.

Search API Change

The APIs for running searches against Stroom data sources have changed in a breaking way. This is due to a change in the way running queries are identified.

Previously the client calling the API would generate a unique key for the query and include it in the searchRequest object. This key would then be used again if the client wanted to make further requests for results for the same running query.

  "key" : {
    "uuid": "e244d45c-4086-463b-b1a8-10c8c7d7d6c7"
  },

In v7.2 the query key is now generated by Stroom rather than the client. In the first search request for a query, the client should now omit the key field from the request. Stroom will generate a unique key for the running query and return it in the response. However, in any subsequent requests for that running query, the client should include the key field, using the value from the previous response.

If you have static request JSON files then you can easily remove the key field using jq as follows:

jq 'del(.key)' req.json > req.new.json
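
Putting the new flow together, the following is a minimal sketch; the search endpoint URL and any authentication are placeholders, and it assumes the response echoes the generated key in the same { "uuid": ... } form shown above:

# Hedged sketch: the first request omits "key"; reuse the key Stroom returns
# in subsequent polls for the same running query. SEARCH_URL is a placeholder.
key_uuid=$(curl -s -H 'Content-Type: application/json' \
  -d @req.json "$SEARCH_URL" | jq -r '.key.uuid')
jq --arg uuid "$key_uuid" '.key = {uuid: $uuid}' req.json > req.poll.json
curl -s -H 'Content-Type: application/json' -d @req.poll.json "$SEARCH_URL"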

5.4 - Upgrade Notes

Required actions and information relating to upgrading to Stroom version 7.2.

Java Version

Stroom v7.2 requires Java v17. Previous versions of Stroom used Java v15 or lower. You will need to upgrade Java on the Stroom and Stroom-Proxy hosts to the latest patch release of Java v17.

Regex Performance Issue in XSLT Processing

v7.2 of Stroom uses a newer version of the Saxon XML processing library. Saxon is used for all pipeline processing. There is a bug in this version of Saxon which means that case insensitive regular expression matching performs very badly, i.e. it can be orders of magnitude slower than a case sensitive regex. This bug has been reported to Saxon and has been fixed but not yet released. It is likely a future release of Stroom will include a new version of Saxon that addresses this issue.

The performance issue will show itself when multiple pipelines with affected XSLTs are being processed concurrently. This impacts XSLT/XPath functions like matches() that use the i flag for case insensitive matching. If you do not use any case-insensitive regular expressions in your XSLTs then you do not need to do anything.

Until Stroom is changed to use a new version of Saxon with a fix, you will have to change the XSLTs that use the i flag in one of the following ways:

  • Re-write the regular expression to use case sensitive matching, E.g:
    matches('CATHODE', '^cat.*', 'i') => matches('CATHODE', '^[cC][aA][tT].*')
    This is the preferred option, but may not be possible for all regular expressions.

  • Add the flag ;j to force Saxon to use the Java regular expression engine instead of the Saxon one, E.g:
    matches('CATHODE', '^cat.*', 'i') => matches('CATHODE', '^cat.*', 'i;j')
    As this involves changing the regular expression engine it is possible that there will be subtle differences in the behaviour between the Saxon and Java engines. Regular expression engines are notorious for having subtle differences as there is no one standard for regular expressions.

Tagging Entities

As described in Document Tagging Stroom now pre-populates the filter of some of the tree pickers with pre-configured tags to limit the entities returned. If you do nothing then after upgrade these tree pickers will show no matching entities to the user.

You have two options:

  1. Tag entities with the pre-configured tags so they are visible in the tree pickers.

    To do this you need to find and tag the following entities:

    • Tag all reference loader pipelines (those using the ReferenceDataFilter pipeline element) with reference-loader (or whatever value(s) is/are set in stroom.ui.referencePipelineSelectorIncludedTags).
    • Tag all extraction pipelines (those using the SearchResultOutputFilter pipeline element) with extraction (or whatever value(s) is/are set in stroom.ui.query.dashboardPipelineSelectorIncludedTags).

    Any new entities matching the above criteria also need to be tagged in this way to ensure users see the correct entities. The new Find Content feature is useful for tracking down Pipelines that contain a certain element.

    The property stroom.ui.query.viewPipelineSelectorIncludedTags is not an issue for an upgrade to v7.2 as Views did not exist prior to this version. All new dynamic extraction pipeline entities (those using the DynamicSearchResultOutputFilter pipeline element) need to be tagged with dynamic and extraction (or whatever value(s) is/are set in stroom.ui.query.viewPipelineSelectorIncludedTags)

  2. Change the system properties to not pre-populate the filters. If you do not want to use this feature then you can just clear the values of the following properties:

    • stroom.ui.query.dashboardPipelineSelectorIncludedTags
    • stroom.ui.query.viewPipelineSelectorIncludedTags
    • stroom.ui.referencePipelineSelectorIncludedTags

Reference Data Store

See Partitioned Reference Data Stores for details of the changes to reference data stores.

No intervention is required on upgrade for this change; this section is for information purposes only. However, it is recommended that you take a backup copy of the existing reference data store files before booting the new version of Stroom. To do this, make a copy of the files in the directory specified by stroom.pipeline.referenceData.lmdb.localDir, as in the sketch below. If there is a problem then you can replace the store with the copy and try again.
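
The following is a minimal sketch of taking that backup, using a placeholder path for the directory configured in stroom.pipeline.referenceData.lmdb.localDir:

# Hedged example: back up the legacy reference data store before upgrading.
# The paths are placeholders for your deployment.
cp -a /stroom/ref_data_store /stroom/ref_data_store.pre-v7.2-backup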

Stroom will automatically migrate reference data from the legacy single data store into multiple Feed specific stores. The legacy store exists in the directory configured by stroom.pipeline.referenceData.lmdb.localDir. Each feed specific store will be in a sub-directory with a name like USER-DETAILS-REFERENCE___309e1ca0-7a5f-4f05-847b-b706805d758c (i.e. a file system safe version of the Feed name and the Feed’s UUID).

The migration happens on an as-needed basis. When a lookup is called from an XSLT, if the required reference stream is found to exist in the legacy store then it will be copied into the appropriate Feed specific store (creating the store if required). After being copied, the stream in the legacy store will be marked as available for purge so will get purged on the next run of the job Ref Data Off-heap Store Purge.

When Stroom boots it will delete a legacy store if it is found to be empty, so eventually the legacy store will cease to exist.

Depending on the speed of the local storage used for the reference data stores, the migration of streams and the subsequent purge from the legacy store may slow down processing until all the required migrations have happened. The migration is a trade-off between the additional time it would take to re-load all the reference streams (rather than just copying them from the legacy store) and the dedicated lock on the legacy store that all migrations need to acquire.

If you experience performance problems with reference data migrations or would prefer not to migrate the data then you can simply delete the legacy store prior to running Stroom v7.2 for the first time. The legacy store can be found in the directory configured by stroom.pipeline.referenceData.lmdb.localDir. Simply delete the files data.mdb and lock.mdb (if present), as in the sketch below. With the store deleted, Stroom will simply load all reference streams as required with no migration.
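
The following is a minimal sketch of removing the legacy store, using the same placeholder path for the localDir directory:

# Hedged example: delete the legacy reference data store so no migration occurs.
# Substitute the directory set in stroom.pipeline.referenceData.lmdb.localDir.
rm -f /stroom/ref_data_store/data.mdb /stroom/ref_data_store/lock.mdb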

Database Migrations

When Stroom boots for the first time with a new version it will run any required database migrations to bring the database schema up to the correct version.

On boot, Stroom will ensure that the migrations are only run by a single node in the cluster. This will be the node that reaches that point in the boot process first. All other nodes will wait until that is complete before proceeding with the boot process.

It is recommended however to use a single node to execute the migration. To avoid Stroom starting up and beginning processing, you can use the migrate command to just migrate the database without fully booting Stroom. See the migrate command for more details.

Migration Scripts

For information purposes only, the following is a list of all the database migrations that will be run when upgrading from v7.0 to v7.2.0. The migration script files can be viewed at github.com/gchq/stroom .

7.1.0
  stroom-config
    V07_01_00_001__preferences.sql                                 - stroom-config/stroom-config-global-impl-db/src/main/resources/stroom/config/global/impl/db/migration/V07_01_00_001__preferences.sql
  stroom-explorer
    V07_01_00_005__explorer_favourite.sql                          - stroom-explorer/stroom-explorer-impl-db/src/main/resources/stroom/explorer/impl/db/migration/V07_01_00_005__explorer_favourite.sql
  stroom-security
    V07_01_00_001__add_stroom_user_cols.sql                        - stroom-security/stroom-security-impl-db/src/main/resources/stroom/security/impl/db/migration/V07_01_00_001__add_stroom_user_cols.sql
    V07_01_00_002__rename_preferred_username_col.sql               - stroom-security/stroom-security-impl-db/src/main/resources/stroom/security/impl/db/migration/V07_01_00_002__rename_preferred_username_col.sql
7.2.0
  stroom-analytics
    V07_02_00_001__analytics.sql                                   - stroom-analytics/stroom-analytics-impl-db/src/main/resources/stroom/analytics/impl/db/migration/V07_02_00_001__analytics.sql
  stroom-annotation
    V07_02_00_005__annotation_assigned_migration_to_uuid.sql       - stroom-annotation/stroom-annotation-impl-db/src/main/resources/stroom/annotation/impl/db/migration/V07_02_00_005__annotation_assigned_migration_to_uuid.sql
    V07_02_00_010__annotation_entry_assigned_migration_to_uuid.sql - stroom-annotation/stroom-annotation-impl-db/src/main/resources/stroom/annotation/impl/db/migration/V07_02_00_010__annotation_entry_assigned_migration_to_uuid.sql
  stroom-config
    V07_02_00_005__preferences_column_rename.sql                   - stroom-config/stroom-config-global-impl-db/src/main/resources/stroom/config/global/impl/db/migration/V07_02_00_005__preferences_column_rename.sql
  stroom-dashboard
    V07_02_00_005__query_add_owner_uuid.sql                        - stroom-dashboard/stroom-storedquery-impl-db/src/main/resources/stroom/storedquery/impl/db/migration/V07_02_00_005__query_add_owner_uuid.sql
    V07_02_00_006__query_add_uuid.sql                              - stroom-dashboard/stroom-storedquery-impl-db/src/main/resources/stroom/storedquery/impl/db/migration/V07_02_00_006__query_add_uuid.sql
  stroom-explorer
    V07_02_00_005__remove_datasource_tag.sql                       - stroom-explorer/stroom-explorer-impl-db/src/main/resources/stroom/explorer/impl/db/migration/V07_02_00_005__remove_datasource_tag.sql
  stroom-security
    V07_02_00_100__query_add_owners.sql                            - stroom-security/stroom-security-impl-db/src/main/resources/stroom/security/impl/db/migration/V07_02_00_100__query_add_owners.sql
    V07_02_00_101__processor_filter_add_owners.sql                 - stroom-security/stroom-security-impl-db/src/main/resources/stroom/security/impl/db/migration/V07_02_00_101__processor_filter_add_owners.sql

5.5 - Change Log

Link to the full CHANGELOG.

For a detailed list of all the changes in v7.2 see: v7.2 CHANGELOG

6 - Version 7.1

Key new features and changes present in v7.1 of Stroom and Stroom-Proxy.

For a detailed list of the changes in v7.1 see the

changelog

7 - Version 7.0

Key new features and changes present in v7.0 of Stroom and Stroom-Proxy.

For a detailed list of the changes in v7.0 see the

changelog

Integrated Authentication

The previously standalone (in v6) stroom-auth-service and stroom-auth-ui services have been integrated into the core stroom application. This simplifies the installation and configuration of stroom.

Configuration Properties Improvements

Configuration is now provided by YAML files on boot

Previously stroom used a flat .conf file to manage the application configuration. Application logging was configured either via a .yml file (in v6) or an .xml file (in v5). Now stroom uses a single .yml file to configure the application and logging. This file is different to the .yml file(s) used in the docker compose configuration. The YAML file provides a more logical hierarchical structure and support for typed values (longs, doubles, maps, lists, etc.).

The YAML configuration is intended for configuration items that are either needed to bootstrap stroom or have values that are specific to a node. Cluster wide configuration properties are still stored in the database and managed via the UI.

There has been a change to the precedence of the configuration properties held in different locations (YAML, database, default) and this is described in Properties.

Stroom Home and relative paths

The concept of Stroom Home has been introduced. Stroom Home allows for one path to be configured and for all other configurable paths to default to being a child of this path. This keeps all configured directories in one place by default. Each configured directory can be set to an absolute path if a location outside Stroom Home is required. If a relative path is used it will be relative to Stroom Home. Stroom Home can be configured with the property stroom.path.home.

Improved Properties UI screens that tell you the values over the cluster

Previously the Properties UI screens could only tell you the values held within the database and not the value that a node was actually using. The Properties screens have been improved to tell you the source of a property value and where multiple values exist across the cluster, which nodes have what values. See Properties.

Validation of Configuration Property Values

Validation of configuration property values is now possible. The validation rules are defined in the application code and allow for things like:

  • Ensuring that a regex pattern is a valid pattern
  • Setting maximum or minimum values to numeric properties.
  • Ensuring a property has a value.

Validation will be enforced on application boot or when a value is edited via the UI.

Hot Loading of Node Configuration

Now that node specific configuration is managed via the YAML configuration file stroom will detect changes to this file and update the configuration properties accordingly. Some properties however do not support being changed at runtime so will still require either the whole system or the UI nodes to be restarted.

Data retention impact summary

The Data Retention screen now provides an Impact Summary tab that will show you a summary of what will be deleted by the currently active rules. The summary is based on the rules as they currently are in the UI, so it allows you to see the impact before saving rule changes. The summary is a count of the number of streams that will be deleted by each rule, broken down by feed and stream type. In very large systems with a lot of data or where complex rules are in place the summary may take some time (minutes) to produce.

See Data Retention for more details.

Fuzzy Finding in Quick Filters and Suggestion Text Fields

A richer fuzzy find algorithm has been added to the Quick filter search fields. It has also been added to some text input fields that provide suggestions, e.g. Feed Name input fields. This makes finding values or rows in a table faster and more precise.

See Finding Things for more details.

New (off-heap) memory-efficient reference data

The reference data feature in previous versions of stroom loaded the reference data on demand and held it in Java’s heap memory. In large systems, or where a pipeline is doing reference data lookups across a wide time range, this can lead to very large heap sizes.

In v7 stroom now uses an off-heap, disk backed store (LMDB) for the reference data. This removes all reference data (with the exception of context lookups) from the Java heap, so the -Xmx value can be reduced. In large systems this can mean keeping the -Xmx value below the 32GB threshold (above which the JVM cannot use compressed object pointers), further reducing memory usage. Because the store is disk backed, frequently used reference data can be kept in the store to reduce the loading overhead. As the reference data is held off-heap, stroom can make use of all available free RAM for the reference data.

See Reference Data

Reference Data API

A RESTful API has been added for the reference data store. This primarily allows reference lookups to be performed by external systems.

See Reference Data API

Text editor improvements

The Ace text editor is used widely in Stroom for such things as editing XSLTs, editing dashboard column expressions, viewing stream data and stepping. There have been a number of improvements to this editor.

See Editing and Viewing Text Data

Editor context menu

Additional options have been added to the context menu in the text editor:

  • Toggle soft line wrapping of long lines.
  • Toggle viewing hidden characters, e.g. tabs, spaces, line breaks.
  • Toggle Vim key bindings. The Ace editor does not implement all Vim functionality but supports the core key bindings.
  • Toggle auto-completion. Completion is triggered using ctrl+space.
  • Toggle live auto-completion. Completion is triggered as you type.
  • Toggle the inclusion of snippets in the auto-complete suggestions.

Auto-completion and snippets

Most editor screens now support basic auto-completion of existing words found in the text. Some editor screens, such as XSLT, dashboard column expressions and JavaScript scripts, also support keyword and snippet completion.

Data viewing improvements

The way data is viewed in Stroom has changed to improve the viewing of large files or files with no line breaks. Previously a set number of lines of data would be fetched for display on the page in the Data Viewer. This did not work for data with no line breaks, as Stroom would then try to fetch all of the data.

In v7 Stroom works at the character level, so it can fetch a reasonable number of characters for display whether they are all on one line or spread over multiple lines.

The viewing of data has been separated into two mechanisms, Data Preview and Source View.

See Editing and Viewing Text Data

Data Preview

This is the default view of the data. It displays the first n characters (configurable) of the data. It will attempt to format the data, e.g. showing pretty-printed XML. You cannot navigate around the data.

Source View

This view is intended for seeing the actual data in its raw un-formatted form and for navigating around it. It provides navigation controls to define the range of data being displayed, e.g. from a character offset, line number, or line and column.

You can now query data, server tasks and processing tasks on dashboards

Data actions such as delete, download and reprocess now provide an impact summary before proceeding.

Index volume groups for easier index volume assignment

Kafka Integration

New Kafka Configuration Entity

Integration with Apache Kafka was introduced in v6; however, the way the connection to Kafka cluster(s) is configured has been improved. We have introduced a new entity type called Kafka Configuration that can be created/managed via the explorer tree. This means stroom can integrate with many Kafka clusters, or connect to a cluster using different sets of Kafka Configuration properties. The Kafka Configuration entity provides an editor for setting all the Kafka-specific configuration properties. Pipeline elements that use Kafka now provide a means to select the Kafka Configuration to use.

An Improved Pipeline Element for Sending Data to Kafka

The previous Kafka pipeline elements in v6 have been replaced with a single StandardKafkaProducer element. The new element allows for the dynamic construction of a Kafka Producer message via an XML document conforming to the kafka-records XmlSchema. With this new element, events can be translated into Kafka records which will then be given to the Kafka Producer to send to the Kafka cluster. This allows for complete control of things like timestamps, topics, keys, values, etc.

No limitations on data reprocessing

Improved REST API

A rich REST API for all UI-accessible functions

The architecture of the stroom UI has been changed such that all communication between the UI and the back end is via REST calls. This means all of these REST calls are available as an API for users of stroom to take advantage of. It opens up the possibility of interacting with stroom via scripts or from other applications.

Swagger UI to document REST API methods

The Swagger UI and specification file have been improved to include all of the API methods available in stroom.

Improved architecture with separate modules, each with individual DB access, to spread load

The architecture of the core stroom application has been fundamentally changed in v7 to internally break up the application into its functional areas. This separation makes for a more logical code base and allows for the possibility of each functional area having its own database instance, if required.

Java 12

stroom v7 now runs on the Java 12 JVM.

MySQL 8 support

stroom v7 has been changed to support MySQL v8, opening up the possibility of using features like group replication.

8 - Version 6.1

Key new features and changes present in v6.1 of Stroom and Stroom-Proxy.

For a detailed list of the changes in v6.1 see the changelog.

9 - Version 6.0

Key new features and changes present in v6.0 of Stroom and Stroom-Proxy.

For a detailed list of the changes in v6.0 see the changelog.

OAuth 2.0/OpenID Connect authentication

Authentication for Stroom is now provided by an external service rather than a service internal to Stroom. This change allows support for broader corporate authentication schemes and is a key requirement for enabling the future microservice architecture for Stroom.

API keys for third party clients

Anyone wishing to make use of the data exposed by Stroom’s services can request an API key. This key acts as a password for their own applications. It allows administrators to secure and manage access to Stroom’s data.

HBase backed statistics store

This new implementation of statistics (Stroom-Stats) provides a vastly more scalable time series DB for large scale collection of Stroom’s data aggregated to various time buckets. Stroom-Stats uses Kafka for ingesting the source data.

Data receipt filtering

Data arriving in Stroom has meta data that can be matched against a policy so that certain actions can be taken. This could be to receive, drop or reject the data.

Filtering of data also applies to Stroom proxy where each proxy can get a filtering policy from an upstream proxy or a Stroom instance.

Data retention policies

The length of time that data will be retained in Stroom's stream store can be defined by creating data retention rules. These rules match streams based on their meta data and will automatically delete data once the retention period associated with the rule is exceeded.

Dashboard linking

Links can be created in dashboards to jump to other dashboards or other external sites that provide additional contextual information.

Search API

The search system used by Dashboards can also be used via a RESTful API. This provides access to data stored in search indices (including the ability to extract data) and statistics stores. The data fetched via the search API can be received and processed by an external system.

Kafka appender and filter

New pipeline elements for writing XML or text data to a Kafka topic. This provides more options for using Stroom’s data in other systems.