Version 7.10

Key new features and changes present in v7.10 of Stroom and Stroom-Proxy.

1 - New Features

New features in Stroom version 7.10.

Dashboard & StroomQL Functions

ceilingTime(..) & floorTime(...)

Two new functions similar to the existing ceilingXXX and floorXXX functions, except that an arbitrary duration can be used.

For example, floorTime($time, 'PT5m') rounds the time down to the most recent 5-minute boundary.
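The rounding behaviour can be illustrated with a minimal Python sketch. It is not Stroom's implementation: the function names floor_time and ceiling_time are hypothetical, and it assumes boundaries are counted from the Unix epoch in UTC, which may differ from how Stroom anchors them or handles time zones.

```python
from datetime import datetime, timedelta, timezone

# Assumption: boundaries are multiples of the duration counted from the Unix epoch (UTC).
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def floor_time(t, step):
    """Round t down to the nearest multiple of step since the epoch."""
    elapsed = t - EPOCH
    return EPOCH + (elapsed // step) * step


def ceiling_time(t, step):
    """Round t up to the nearest multiple of step since the epoch."""
    floored = floor_time(t, step)
    return floored if floored == t else floored + step


t = datetime(2025, 1, 1, 12, 7, 30, tzinfo=timezone.utc)
five_minutes = timedelta(minutes=5)
print(floor_time(t, five_minutes))    # 2025-01-01 12:05:00+00:00
print(ceiling_time(t, five_minutes))  # 2025-01-01 12:10:00+00:00
```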

case(...)

A case function has been added for performing simple if / else if / else type logic. The function takes the arguments:

case(input, test1, result1, testN, resultN, otherwise)

This is equivalent to

if (input == test1) {
    return result1
} else if (input == testN) {
    return resultN
} else {
    return otherwise
}

See case for details.
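As an illustration of how the arguments pair up, the following Python sketch mimics the described semantics. It is only a model of the behaviour above, not Stroom's code; in particular, rejecting a call that has no final otherwise argument is an assumption of the sketch.

```python
def case(value, *args):
    """Model of case(input, test1, result1, ..., testN, resultN, otherwise)."""
    # Assumption: the final argument is always the 'otherwise' value,
    # so the arguments after the input must be odd in number.
    if len(args) % 2 == 0:
        raise ValueError("expected test/result pairs followed by an 'otherwise' value")
    *pairs, otherwise = args
    # Walk the (test, result) pairs in order and return the first match.
    for test, result in zip(pairs[0::2], pairs[1::2]):
        if value == test:
            return result
    return otherwise


print(case(2, 1, "one", 2, "two", "other"))  # two
print(case(9, 1, "one", 2, "two", "other"))  # other
```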

formatIECByteSize(...)

A new function for converting an integer amount of bytes into an appropriate byte size unit, e.g. 1024 bytes becomes 1K. International Electrotechnical Commission (IEC) units with a base of 1024 rather than 1000 are used.

The function has three forms:

formatIECByteSize(bytes)
formatIECByteSize(bytes, omitTrailingZeros)
formatIECByteSize(bytes, omitTrailingZeros, significantFigures)

formatMetricByteSize(...)

A new function for converting an integer amount of bytes into an appropriate byte size unit, e.g. 1000 bytes becomes 1K. Metric units with a base of 1000 rather than 1024 are used.

The function has three forms:

formatMetricByteSize(bytes)
formatMetricByteSize(bytes, omitTrailingZeros)
formatMetricByteSize(bytes, omitTrailingZeros, significantFigures)
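The difference between the two bases can be seen in a small Python sketch. The helper scale_bytes is hypothetical and only shows the unit scaling; it does not model the omitTrailingZeros or significantFigures arguments, and the unit letters beyond K are an assumption.

```python
# Assumption: unit suffixes follow the single-letter style of the "1K" example above.
UNITS = ["B", "K", "M", "G", "T", "P"]


def scale_bytes(num_bytes, base):
    """Return (scaled value, unit) for a byte count using the given base (1024 or 1000)."""
    value = float(num_bytes)
    unit = 0
    # Keep dividing by the base until the value fits the current unit.
    while value >= base and unit < len(UNITS) - 1:
        value /= base
        unit += 1
    return value, UNITS[unit]


print(scale_bytes(1024, 1024))  # (1.0, 'K')    IEC, base 1024
print(scale_bytes(1024, 1000))  # (1.024, 'K')  metric, base 1000
```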

decode(...)

The existing decode(...) function has been changed so that you can use any capture groups in the regular expression patterns in the result arguments. For example:

decode('TestString123','Test(......)(123)','$1-$2','Nothing')

This would output String-123.
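The capture group substitution can be modelled with Python's re module. This is a sketch of the idea only: it assumes the pattern must match the whole input and that groups are referenced as $1, $2 and so on. Stroom's actual matching rules may differ.

```python
import re


def decode(value, *args):
    """Model of decode(input, pattern1, result1, ..., otherwise) with $N group references."""
    *pairs, otherwise = args
    for pattern, result in zip(pairs[0::2], pairs[1::2]):
        # Assumption: the pattern has to match the entire input value.
        match = re.fullmatch(pattern, value)
        if match:
            # Translate $1 style references into Python's \g<1> syntax.
            template = re.sub(r"\$(\d+)", r"\\g<\1>", result)
            return match.expand(template)
    return otherwise


print(decode("user-42", r"user-(\d+)", "id=$1", "unknown"))  # id=42
print(decode("guest", r"user-(\d+)", "id=$1", "unknown"))    # unknown
```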

data(...)

The existing data(...) function has been changed so that you can display the stream info and metadata instead of the stream data by setting the viewType argument to a value of info.

For example:

data('View Cooked', ${StreamId}, 1, ${eventId}, null(), null(), null(), null(), 'info')

Dashboard Embedded Queries

When creating an Embedded Query Dashboard pane, it is now possible to embed a copy of an existing query rather than embedding a reference to one. This decouples the Dashboard from the original Query so the original Query can be changed without impacting the Dashboard.

The embedded Query can be edited via the menu on the Dashboard pane.

Stroom XSLT Functions

parse-dateTime(...)

A parse-dateTime function has been added with the following overloads:

parse-dateTime(ISO8601 string)
parse-dateTime(string, pattern)
parse-dateTime(string, pattern, timezone)

This function will either parse a date/time string in ISO8601 standard date/time format or in a custom date/time format using the supplied pattern and optional timezone.

For details of the pattern syntax see Custom Date Formats.

All forms of the function return an xs:dateTime value for use by standard XSLT/XPath functions that can consume an xs:dateTime value.

format-dateTime(...)

A format-dateTime function has been added with the following overloads:

format-dateTime(DateTimeValue)
format-dateTime(DateTimeValue, pattern)
format-dateTime(DateTimeValue, pattern, timezone)

All three variants take an xs:dateTime value as the first argument. If only one argument is supplied, the function will output the date/time as a standard ISO8601 format xs:string. If two or more arguments are supplied then it will output the date/time formatted using the specified pattern and optionally using the specified timezone. If no timezone is supplied, the date/time is assumed to be in UTC.
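The three call shapes can be mirrored in Python as a rough analogy. Note that this sketch uses Python's strftime pattern syntax, which is not the pattern syntax Stroom uses, and the function name format_date_time is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def format_date_time(value, pattern=None, tz=timezone.utc):
    """One argument: ISO 8601 output. With a pattern: custom output, UTC unless tz is given."""
    localised = value.astimezone(tz)
    if pattern is None:
        return localised.isoformat()
    # Assumption: strftime directives stand in for Stroom's own pattern syntax.
    return localised.strftime(pattern)


moment = datetime(2025, 3, 1, 12, 0, tzinfo=timezone.utc)
print(format_date_time(moment))                      # 2025-03-01T12:00:00+00:00
print(format_date_time(moment, "%Y/%m/%d %H:%M"))    # 2025/03/01 12:00
print(format_date_time(moment, "%Y/%m/%d %H:%M", timezone(timedelta(hours=2))))  # 2025/03/01 14:00
```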

Meta Functions

The following functions have been added for obtaining meta-data relating to the stream being processed or specified streams.

  • manifest() - Returns manifest attributes for current stream
  • manifest-for-id(streamId) - Returns manifest attributes for specified stream
  • meta-stream() - Returns meta stream for current stream
  • meta-stream-for-id(streamId, partNo) - Returns meta stream for specified stream and part no
  • parent-id() - Returns parent stream ID for current stream
  • parent-for-id(streamId) - Returns parent stream ID for specified stream

Content Templates

When creating a content template of type INHERIT_PIPELINE it is now possible to tick a box so that any dependencies of the pipeline being inherited from (e.g. Data Splitter TextConverter documents or XsltFilter XSLT documents) will be copied as siblings of the generated Pipeline.

This allows the Data Splitter or XSLT to be refined/populated for the new content.

Index Shards Searchable

The IndexShards Searchable has been changed to add the fields Shard Id and Index Version to the list of available fields.

  • Shard Id - This is the ID of the shard within the index.
  • Index Version - This is the Lucene version that this index was created with. Currently Stroom supports two different versions of the Lucene search index.

Plan B Changes

Plan B has evolved in 7.10 as a state store capable of storing the following types of state data:

  • State - For a given key provide an unchanging state value.
  • Temporal State - For a given key provide a state value valid at a specific point in time (similar to reference data).
  • Ranged State - For a given numeric key within a key range provide an unchanging state value.
  • Temporal Ranged State - For a given numeric key within a key range provide a state value valid at a specific point in time (similar to reference data for ranges).
  • Session - Record session start and end times, e.g. maintain sessions for each application used by a specific user.
  • Metrics - Record values at points in time, e.g. CPU use %.
  • Histograms - Record counts over time, e.g. number of records per minute, hour etc.

Although still somewhat experimental, Plan B has undergone significant change in 7.10 following feedback from the previous experimental release. The data structure has changed significantly to reduce data store sizes. All previous Plan B LMDB instances must be deleted before this new version can be used.

In addition to data structure changes the following features are now available:

  • Additional Plan B store types for histograms and metrics.
  • Advanced Plan B storage schema settings for specific use cases to improve storage efficiency and performance.
  • Better data retention options allowing for retention based on insert time.
  • Remote query settings for get() and lookup() requests to avoid the need for local snapshots.
  • Plan B shards can now be queried as a Searchable data source to discover stored data and information.
  • Writes can now be synchronised if needed to ensure data presence before query. This option impacts data processing performance.

Improved Dashboard Context

Dashboards now maintain a global context that is available to all dashboard components. The context keeps track of the selection state of each component plus dashboard parameters and time range setting. Context changes can be handled by certain components such as queries and tables by adding selection handlers. Handlers allow components to respond to context changes, e.g. by filtering a table based on a selection in another table.

Annotation Changes

Annotations have been improved in 7.10 and more improvements will be available in 7.11.

For 7.10 the following changes have been made:

  • The annotation edit presenter has been improved so that the layout is clearer.
  • Annotations now have fine-grained permissions for visibility and edit.
  • Creating annotations can now be performed on multiple events just by selecting the events and clicking the annotate button.
  • Users can define custom annotation states.
  • Custom labels and collections can be defined and added to annotations.
  • All states and labels have visibility permissions.
  • An annotations screen is now available for easier annotation browsing.
  • Annotations can now have retention periods.

Open ID Connect Authentication

Various minor changes to the way Open ID Connect authentication is performed.

Audience Validation

Replace the property stroom.security.authentication.openid.validateAudience with stroom.security.authentication.openid.allowedAudiences (defaults to empty) and stroom.security.authentication.openid.audienceClaimRequired (defaults to false). If the IDP is known to provide the aud claim (often populated with the clientId) then set allowedAudiences to contain that value and set audienceClaimRequired to true.

User Full Name

Add the config prop stroom.security.authentication.openId.fullNameClaimTemplate to allow the user’s full name to be formed from a template containing a mixture of static text and claim variables, e.g. ${firstName} ${lastName}. Unknown variables are replaced with an empty string. Default is ${name}.

This provides full control over the source of the user’s full name in stroom and allows it to be formed from multiple claims within the authentication token.
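The templating behaviour described above can be sketched in Python. This is an illustration under stated assumptions, not Stroom's implementation: the function name render_full_name is hypothetical and the claim values are made up.

```python
import re


def render_full_name(template, claims):
    """Replace ${claimName} placeholders with claim values; unknown claims become ''."""
    def lookup(match):
        return str(claims.get(match.group(1), ""))
    return re.sub(r"\$\{(\w+)\}", lookup, template)


claims = {"firstName": "Ada", "lastName": "Lovelace", "name": "A. Lovelace"}
print(render_full_name("${firstName} ${lastName}", claims))  # Ada Lovelace
print(render_full_name("${name}", claims))                   # A. Lovelace
```

A placeholder that names a missing claim leaves any surrounding static text in place, so a template such as "${firstName} ${lastName}" would yield a trailing space if lastName were absent.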

AWS Integration

Change template syntax of openid.publicKeyUriPattern property from positional variables ({}) to named variables (${awsRegion}). Default value has changed to https://public-keys.auth.elb.${awsRegion}.amazonaws.com/${keyId}. If this property has been explicitly set in the config.yml or Properties screen, its value will need to be changed to use named variables instead.
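For anyone converting an explicitly configured value, the change amounts to swapping each positional placeholder for its named equivalent. The Python sketch below shows one way to do that and to check the result; the helper to_named and the sample region and key ID values are hypothetical, and the placeholder order (region first, key ID second) is taken from the old and new default values.

```python
from string import Template


def to_named(pattern, names=("awsRegion", "keyId")):
    """Replace each positional {} with a named ${...} placeholder, left to right."""
    for name in names:
        pattern = pattern.replace("{}", "${" + name + "}", 1)
    return pattern


old_pattern = "https://public-keys.auth.elb.{}.amazonaws.com/{}"
new_pattern = to_named(old_pattern)
print(new_pattern)
# https://public-keys.auth.elb.${awsRegion}.amazonaws.com/${keyId}

# Sanity check with made-up values: string.Template understands the ${name} syntax.
print(Template(new_pattern).substitute(awsRegion="eu-west-2", keyId="abc123"))
# https://public-keys.auth.elb.eu-west-2.amazonaws.com/abc123
```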

Certificate DN Format

Add new property .receive.x509CertificateDnFormat to stroom and proxy to allow extraction of CNs from DNs in legacy OPEN_SSL format. The new property defaults to LDAP, which means no change to behaviour if left as is.

2 - Preview Features (experimental)

Preview features in Stroom version 7.10. Preview features are somewhat experimental in nature and are therefore subject to breaking changes in future releases.

There are no new preview features in v7.10.

3 - Breaking Changes

Changes in Stroom version 7.10 that may break existing processing or ways of working.

Stroom

No Stroom specific breaking changes in v7.10.

Stroom-Proxy

No Stroom-Proxy specific breaking changes in v7.10.

Stroom & Stroom-Proxy

Open ID Connect Configuration

The property stroom.security.authentication.openid.validateAudience has been removed. See Upgrade Notes for details.

AWS Open ID Connect Configuration

If you use AWS for OIDC authentication and you have configured the property stroom.security.authentication.openId.publicKeyUriPattern then you will need to change its value. See Upgrade Notes for details.

4 - Upgrade Notes

Required actions and information relating to upgrading to Stroom version 7.10.

Upgrade Path

You can upgrade to v7.10.x from any v7.x release that is older than the version being upgraded to.

If you want to upgrade to v7.10.x from v5.x or v6.x we recommend you do the following:

  1. Upgrade v5.x to the latest patch release of v6.0.
  2. Upgrade v6.x to the latest patch release of v7.0.
  3. Upgrade v7.x to the latest patch release of v7.10.

Java Version

Stroom v7.10 requires Java 21. This is the same Java version as Stroom v7.9. Ensure the Stroom and Stroom-Proxy hosts are running the latest patch release of Java v21.

Configuration File Changes

Stroom’s config.yml

Git Repo

A new branch has been added to the config for configuring the directory used to store the local git repositories.

appConfig:
  gitRepo:
    localDir: "git_repo"

X509 Certificate Extraction

A new property x509CertificateDnFormat has been added to define the format of the certificate Distinguished Name (DN). Valid values are LDAP, which is comma (,) delimited, or OPEN_SSL, which is slash (/) delimited. The default is LDAP.

appConfig:
  receive:
    x509CertificateDnFormat: "LDAP"
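To show what the two formats look like in practice, here is a minimal Python sketch that pulls the Common Name (CN) out of a DN in either form. It is illustrative only: the function name extract_cn and the sample DNs are hypothetical, and the sketch does not handle escaped delimiters inside attribute values.

```python
def extract_cn(dn, dn_format="LDAP"):
    """Return the CN from a DN that is ',' delimited (LDAP) or '/' delimited (OPEN_SSL)."""
    delimiter = {"LDAP": ",", "OPEN_SSL": "/"}[dn_format]
    for part in dn.split(delimiter):
        key, sep, value = part.strip().partition("=")
        # Skip empty parts, e.g. the one before the leading '/' in the OPEN_SSL form.
        if sep and key.strip().upper() == "CN":
            return value.strip()
    return None


print(extract_cn("CN=some.server.co.uk,OU=servers,O=Some Org,C=GB"))
# some.server.co.uk
print(extract_cn("/C=GB/O=Some Org/OU=servers/CN=some.server.co.uk", "OPEN_SSL"))
# some.server.co.uk
```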

Open ID Connect Authentication

The property stroom.security.authentication.openid.validateAudience has been replaced by two new properties for controlling validation of the aud claim. The allowedAudiences property allows you to supply a list of valid values for the aud claim. If this list is not empty and the aud claim is present, Stroom will ensure that the claim matches one of these values.

If audienceClaimRequired is set to true then Stroom will fail authentication if the aud claim is not present.
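The interaction of the two properties can be summarised as a small decision function. The Python sketch below is a model of the rules described above rather than Stroom's code; the function name validate_audience is hypothetical, and treating a multi-valued aud claim as valid when any one value is allowed is an assumption.

```python
def validate_audience(claims, allowed_audiences, audience_claim_required):
    """Return True if the token's aud claim passes the configured checks."""
    aud = claims.get("aud")
    if aud is None:
        # A missing claim only fails when the claim is required.
        return not audience_claim_required
    if not allowed_audiences:
        # An empty allow list means the claim value is not checked.
        return True
    # The JWT aud claim may be a single string or a list of strings.
    audiences = [aud] if isinstance(aud, str) else list(aud)
    # Assumption: one matching audience is enough.
    return any(a in allowed_audiences for a in audiences)


print(validate_audience({"aud": "stroom-client"}, ["stroom-client"], True))  # True
print(validate_audience({}, ["stroom-client"], True))                        # False
print(validate_audience({}, ["stroom-client"], False))                       # True
```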

The new fullNameClaimTemplate lets you define a template for extracting the user’s full name from the claims in a token. For example, the template could be ${firstName} ${lastName} if those two claims are available. If the named claim is not present then the placeholder will be replaced by an empty string.

The template syntax for publicKeyUriPattern has changed from positional placeholders (e.g. https://public-keys.auth.elb.{}.amazonaws.com/{}) to named placeholders (e.g. https://public-keys.auth.elb.${awsRegion}.amazonaws.com/${keyId}). This makes it more flexible.

appConfig:
  security:
    authentication:
      openId:
        allowedAudiences: []
        audienceClaimRequired: false
        fullNameClaimTemplate: "${name}"
        publicKeyUriPattern: "https://public-keys.auth.elb.${awsRegion}.amazonaws.com/${keyId}"

Stroom-Proxy’s config.yml

HTTP Forward Destinations

The property forwardHeadersAdditionalAllowSet has been added to the forwardHttpDestinations branch. It is a set of case-insensitive HTTP header keys.

When forwarding data to a downstream Stroom/Stroom-Proxy, Stroom-Proxy will set the following headers using values from the Meta associated with the data: accountId, accountName, classification, component, contextEncoding, contextFormat, encoding, environment, feed, format, guid, schema, schemaVersion, system, type. If any additional HTTP headers need to be set then they should be added to this list.

proxyConfig:
  forwardHttpDestinations:
  - forwardHeadersAdditionalAllowSet: []
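The allow-list behaviour can be pictured with a short Python sketch. It is a simplified model, not the proxy's implementation: the function name forward_headers and the sample meta entries are hypothetical.

```python
# Base allow list as given in the text above.
BASE_ALLOW_SET = {
    "accountId", "accountName", "classification", "component", "contextEncoding",
    "contextFormat", "encoding", "environment", "feed", "format", "guid",
    "schema", "schemaVersion", "system", "type",
}


def forward_headers(meta, additional_allow_set=()):
    """Keep only meta entries whose key is on the base or additional allow list."""
    # Keys are compared case-insensitively, as the property description states.
    allowed = {k.lower() for k in BASE_ALLOW_SET} | {k.lower() for k in additional_allow_set}
    return {k: v for k, v in meta.items() if k.lower() in allowed}


meta = {"Feed": "TEST_FEED", "X-Custom": "1", "internalOnly": "secret"}
print(forward_headers(meta))                # {'Feed': 'TEST_FEED'}
print(forward_headers(meta, ["x-custom"]))  # {'Feed': 'TEST_FEED', 'X-Custom': '1'}
```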

X509 Certificate Extraction

The changes described above in Stroom - X509 Certificate Extraction also apply to stroom-proxy.

Open ID Connect Authentication

The changes described above in Stroom - Open ID Connect Authentication also apply to stroom-proxy.

Database Migrations

When Stroom boots for the first time with a new version it will run any required database migrations to bring the database schema up to the correct version.

On boot, Stroom will ensure that the migrations are only run by a single node in the cluster. This will be the node that reaches that point in the boot process first. All other nodes will wait until that is complete before proceeding with the boot process.

It is recommended, however, to use a single node to execute the migration. To stop Stroom from fully starting up and beginning processing, you can use the migrate command to migrate the database without booting Stroom. See migrate command for more details.

Migration Scripts

For information purposes only, the following are the database migrations that will be run when upgrading to 7.10.0 from the previous minor version.

Note, the legacy module will run first (if present) then the other modules will run in no particular order.

Module stroom-index

Script V07_10_00_001__index_field.sql

Path: stroom-index/stroom-index-impl-db/src/main/resources/stroom/index/impl/db/migration/V07_10_00_001__index_field.sql

-- Stop NOTE level warnings about objects (not)? existing
SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0;

ALTER TABLE index_field              CHANGE COLUMN name                name                  varchar(512) NOT NULL;

SET SQL_NOTES=@OLD_SQL_NOTES;

Module stroom-processor

Script V07_10_00_999__processor_filter_data.java

Path: stroom-processor/stroom-processor-impl-db/src/main/java/stroom/processor/impl/db/migration/V07_10_00_999__processor_filter_data.java

It is not possible to display the content here. The file can be viewed on GitHub.

Module stroom-security

Script V07_10_00_005__trim_user_identities.sql

Path: stroom-security/stroom-security-impl-db/src/main/resources/stroom/security/impl/db/migration/V07_10_00_005__trim_user_identities.sql

-- ------------------------------------------------------------------------
-- Copyright 2020 Crown Copyright
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
--     http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.
-- ------------------------------------------------------------------------

-- Stop NOTE level warnings about objects (not)? existing
SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0;

-- --------------------------------------------------

-- Trim existing user identity values so they are consistent with the app
-- that now trims these values.
-- Idempotent.

update stroom_user
set name = trim(name)
where name is not null
and name != trim(name);

update stroom_user
set display_name = trim(display_name)
where display_name is not null
and display_name != trim(display_name);

update stroom_user
set full_name = trim(full_name)
where full_name is not null
and full_name != trim(full_name);

-- --------------------------------------------------

SET SQL_NOTES=@OLD_SQL_NOTES;

-- vim: set tabstop=4 shiftwidth=4 expandtab:

5 - Change Log

Full list of changes in this release.

Features and Changes

  • Change the resource store to not rely on sessions. Resources are now linked to a user.

  • Add ReceiptId to the INFO message on data receipt.

  • Issue #5047 : Replace the property stroom.security.authentication.openid.validateAudience with stroom.security.authentication.openid.allowedAudiences (defaults to empty) and stroom.security.authentication.openid.audienceClaimRequired (defaults to false). If the IDP is known to provide the aud claim (often populated with the clientId) then set allowedAudiences to contain that value and set audienceClaimRequired to true.

  • Issue #5068 : Add the config prop stroom.security.authentication.openId.fullNameClaimTemplate to allow the user’s full name to be formed from a template containing a mixture of static text and claim variables, e.g. ${firstName} ${lastName}. Unknown variables are replaced with an empty string. Default is ${name}.

  • Issue #5066 : Change template syntax of openid.publicKeyUriPattern prop from positional variables ({}) to named variables (${awsRegion}). Default value has changed to https://public-keys.auth.elb.${awsRegion}.amazonaws.com/${keyId}. If this prop has been explicitly set, its value will need to be changed to named variables.

  • Issue #5030 : Add new property .receive.x509CertificateDnFormat to stroom and proxy to allow extraction of CNs from DNs in legacy OPEN_SSL format. The new property defaults to LDAP, which means no change to behaviour if left as is.

  • Issue #5007 : Add ceilingTime() and floorTime().

  • Issue #3083 : Allow data() table function to show the Info pane.

  • Issue #4965 : Add dashboard screen to show current selection parameters.

  • Issue #4496 : Add parse-dateTime xslt function.

  • Issue #4496 : Add format-dateTime xslt function.

  • Issue #4969 : Add a checkbox to Content Templates edit screen to make it copy (and re-map) any xslt/textConverter docs in the inherited pipeline.

  • Issue #4726 : Get meta for parent stream.

  • Issue #4900 : Add histogram and metric stores to Plan B.

  • Issue #3861 : Add Shard Id, Index Version to Index Shards searchable.

  • Issue #4112 : Allow use of Capture groups in the decode() function result.

  • Issue #3955 : Add case expression function.

  • Issue #4484 : Change selection handling to use fully qualified keys.

  • Issue #4742 : Allow embedded queries to be copies rather than references.

  • Issue #4894 : Plan B query without snapshots.

  • Issue #4896 : Plan B option to synchronise writes.

  • Issue #4720 : Add Plan B shards data source.

  • Issue #4919 : Add functions to format byte size strings.

  • Issue #4901 : Add advanced schema selection to Plan B to improve performance and reduce storage requirements.

Bug Fixes

  • Issue #5027 : Allow users to choose run as user for processing.

  • Issue #5137 : Fix how proxy adds HTTP headers when sending downstream. It now only adds received meta entries to the headers if they are on an allow list. This list is made up of a hard-coded base list (accountId, accountName, classification, component, contextEncoding, contextFormat, encoding, environment, feed, format, guid, schema, schemaVersion, system, type) and is supplemented by the new config property forwardHeadersAdditionalAllowSet in the forwardHttpDestinations items.

  • Issue #5135 : Fix proxy multi part gzip handling.

  • Uplift JDK to 21.0.8_9 in docker images and sdkmanrc.

  • Issue #5130 : Fix raw size meta bug.

  • Issue #5132 : Fix missing session when AWS ALB does the code flow.

  • Fix the OpenID code flow to stop the session being lost after redirection back to the initiating URL.

  • Issue #5101 : Fix select-all filtering when doing a reprocess of everything in a folder. It no longer tries to re-process the streams of deleted items.

  • Issue #5086 : Improve stream error handling.

  • Issue #5114 : Improve handling of loss of connection to IDP.

  • Change the way security filter decides whether to authenticate or not, e.g. how it determines what is a static resource that does not need authentication.

  • Issue #5115 : Use correct header during proxy forward requests.

  • Issue #5121 : Proxy aggregation now keeps only common headers in aggregated data.

  • Fix exception handling of DistributedTaskFetcher so it will restart after failure.

  • Issue #5127 : Maintain case for proxy meta attributes when logging.

  • Issue #5091 : Stop reference data loads failing if there are no entries in the stream.

  • Issue #5095 : Lock the cluster to perform pipeline migration to prevent other nodes clashing.

  • Issue #5099 : Fix Plan B session key serialisation.

  • Issue #5090 : Fix Plan B getVal() serialisation.

  • Issue #5106 : Fix ref loads with XML values where the <value> element name is not in lower case.

  • Issue #5042 : Allow the import of processor filters when the existing processor filter is in a logically deleted state. Add validation to the import confirm dialog to ensure the parent doc is selected when a processor filter is selected.

  • Change DocRef Info Cache to evict entries on document creation to stop stroom saying that a document doesn’t exist after import.

  • Issue #5077 : Fix bug in user full name templating where it is always re-using the first value, i.e. setting every user to have the full name of the first user to log in.

  • Issue #5073 : Trim the unique identity, display name and full name values for a user to ensure no leading/trailing spaces are stored. Includes DB migration V07_10_00_005__trim_user_identities.sql that trims existing values in the name, display_name and full_name columns of the stroom_user table.

  • Issue #5046 : Stop feeds being auto-created when there is no content template match.

  • Issue #5062 : Fix permissions issue loading scheduled executors.

  • Allow clientSecret to be null/empty for mTLS auth.

  • Issue #5017 : Fix stuck spinner copying embedded query.

  • Issue #4974 : Fix Plan B condense job.

  • Issue #4977 : Limit user visibility in annotations.

  • Issue #4976 : Exclude deleted annotations.

  • Issue #5002 : Fix Plan B env staying open after error.

  • Issue #5003 : Fix query date time formatting.

  • Issue #4974 : Improve logging.

  • Issue #4974 : NPE debug.

  • Issue #4983 : Upgrade Flyway to work with newer version of MySQL.

  • Issue #3122 : Make date/time rounding functions time zone sensitive.

  • Issue #4984 : Add debug for Plan B tagged keys.

  • Issue #4991 : Add Plan B schema validation to ensure stores remain compatible especially when merging parts.

  • Issue #4854 : Maintain scrollbar position on datagrid.

  • Issue #4957 : Default vis settings are not added to Query pane visualisations.

  • Issue #4456 : Fix selection handling across multiple components by uniquely namespacing selections.

  • Issue #4886 : Fix ctrl+enter query execution for rules and reports.

  • Issue #4884 : Suggest only queryable fields in StroomQL where clause.

  • Issue #4945 : Increase index field name length.