Version 7.11
- 1: New Features
- 2: Preview Features (experimental)
- 3: Breaking Changes
- 4: Upgrade Notes
- 5: Change Log
1 - New Features
Integrated Pipeline Construction and Translation Development (Stepping)
Improvements have been made in how to construct pipelines and step through data with them. A pipeline can either be opened directly from the explorer tree or via the large Enter Stepping Mode button from the Data Preview pane of the data browser.
Clicking the Enter Stepping Mode button also allows the user to create a new pipeline rather than opening an existing one if needed.
Stepping mode is used to develop/test a pipeline against some data and change the translation interactively. Once a pipeline is open, stepping mode can be toggled on and off by pressing the Enter Stepping Mode toggle button on the pipeline Structure tab. Opening a pipeline from the data browser button opens a pipeline and immediately enters Stepping Mode. In stepping mode the user will be able to select individual streams to step through from the meta list by clicking on the Source pipeline element.
When Stepping Mode is off the user can make any changes they wish to the pipeline structure or properties. Going back into stepping mode immediately applies any structure or property changes to the stepping process. The results of any changes made to pipeline structure and referenced docs can be viewed by pressing the Refresh Current Step button.
These improvements allow a user to change any aspect of a pipeline in one place and immediately see the impact applied to some test data without saving pipeline changes or navigating to another tab. The source data being tested in stepping mode can be changed at any time by selecting a different input stream on the Source tab. The user can change the filter on the Source pane to find different data to test.
When entering stepping mode via the Enter Stepping Mode button on the Data Preview pane of the Data Browser, only the data from that context is initially included in the filter. Users must change the filter to choose different data.
Ask Stroom AI
The new Ask Stroom AI feature provides information and AI insights into data contained in search result tables or any other tabular data in Stroom. Ask Stroom AI can be accessed from any table context menu or from the toolbar wherever you see the Ask Stroom AI button.
One or more AI models must be configured to be able to use Ask Stroom AI. The models used must conform to the OpenAI API standard. Once configured users can ask questions such as “Summarise this table” and the Ask Stroom AI feature will provide AI insights for the provided data.
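A minimal sketch of the related configuration, mirroring the new askStroomAi branch described in the Upgrade Notes below (the model name and UUID are illustrative and should reference your own OpenAIModel document):

appConfig:
  askStroomAi:
    # The DocRef of the OpenAIModel document to use for Ask Stroom AI
    modelRef:
      name: "My Model"
      type: "OpenAIModel"
      uuid: "8df699c0-2ce5-48bc-bb1a-e5d26bbd2175"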
At present the Ask Stroom AI feature is a preview feature with many enhancements planned for subsequent versions of Stroom.
Dense Vector LLM (AI)
Dense vector index field support allows Stroom to perform approximation queries to find documents containing similar words, similar meaning or sentiment, similar spellings (including errors), etc.
Dense vector support leverages the AI model configuration provided for the Ask Stroom AI feature. Support is included for LLM embedded tokenisation and dense vector field storage in both Elastic and Lucene search indices. The Elastic and Lucene implementations are different and both are experimental in this release.
For the Lucene version you can now create a new field type in the UI and set it to be a dense vector field. The dense vector field has many configuration options to choose from, such as the AI embedding model to use and limits on context size. Once an index is created with a dense vector field and the LLM has provided the tokens for the field data, Stroom is able to search using the field.
When querying, search terms are converted into tokens using the LLM and then matched against the dense vector field. This mechanism allows for connections between words and tokens in both the data and the query, finding documents approximately related to the search terms.
Plan B Histograms, Metrics, Sessions
Multiple improvements have been made to Plan B so that it can store:
- Histogram data, e.g. counts of things within specific time periods.
- Metric data, e.g. values at points in time, CPU Use %, Memory Use % etc.
- Session data, recording start and end periods for certain events, e.g. user uses application X between time A and B.
Trace Log Support In Plan B
Plan B can now store trace logs supplied via pipelines. There are new additions to the Plan B schema to capture trace log data so it can be supplied to a Plan B store.
Improved Annotations
Annotations have been improved to add several new features and improvements:
- Labels
- Collections
- Data Retention
- Access permissions
- Links between annotations
- Recording table row details
- Instant update of decorated tables after annotation edits
Pipeline Processing Dependencies and Delayed Execution
It is now possible to delay processing of event data until required reference data is present. For each dependency, Stroom will delay processing until reference data arrives that is newer than the event data to be processed, so that it knows it has definitely received effective reference data for those events.
Minimum Delay
Once reference data has been received and processed, the event data can be processed. However, as reference data is cached, you may want to add an additional delay to processing; this is achieved by setting a minimum delay. Even without any processing dependencies, a minimum delay can be set so that Stroom waits the specified period of time before processing.
Maximum Delay
In some cases reference data might be late and you might not want to wait too long before processing. In these cases the user can set a maximum delay so that processing will occur after the specified time even if the dependent data is not yet present.
Git Integration
You can now create Git controlled folders in Stroom that will synchronise all child content with a Git repository. Git content can be pulled and pushed if the correct credentials have been provided via the new Credentials feature.
See Git Repo for details.
Content Store
Stroom now has a mechanism for providing standard content from one or more user defined content stores. Content stores are lists of specific content items described in definition files held in Git repositories. The content store shows a list of items in the user interface that the user can choose to add to their Stroom instance, e.g. standard pipelines, schemas and visualisations.
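A content store is added by listing its definition file in the new contentStore configuration branch (shown in full in the Upgrade Notes below); a minimal sketch using the public stroom-content store definition:

appConfig:
  contentStore:
    # A list of URLs to content store definition files
    urls:
      - "https://raw.githubusercontent.com/gchq/stroom-content/refs/heads/master/source/content-store.yml"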
See Content Store for details.
Credentials
Many features in Stroom require credentials to be provided to authenticate connections to external services, e.g. Git repositories, OpenAI APIs, HTTPS connections. The credentials feature provides a location to store and manage the secrets required to authenticate these services.
Credentials can manage:
- User name and passwords
- API keys and tokens
- SSH keys
- Key stores and trust stores in JKS and PKCS12 format.
See Credentials for details.
Smaller Changes
- Pipeline elements now have user friendly names and can be renamed without breaking pipelines.
- Main Stroom tabs can be reordered by clicking and dragging.
- Main Stroom tabs can be closed to the left or right of the selected tab via a context menu.
- The underline selection colour of a Stroom tab can be customised by a configuration property.
- Dashboards now allow individual panes to be maximised so that a table or visualisation can be viewed full screen.
- Table cells, rows, columns, selected rows and entire tables now have numerous copy and export options.
New XSLT functions
A number of new XSLT functions have been added:
- add-meta(String key, String value) - Add meta to be written to the output destination.
- split-document(String doc, String segmentSize, String overlapSize) - Split a document for LLM tokenisation (experimental for Elastic dense vector indexing).
Dashboard & StroomQL Functions
A number of new dashboard and StroomQL functions have been added:
Create Annotation
Since annotation editing is now performed as a primary Stroom feature via a button and context menu on dashboard and query tables, we no longer need to create or edit annotations via a hyperlink.
There are some remaining use cases where users want to create and initialise some annotation values based on some table row content.
The createAnnotation function can be used for this purpose and shows a hyperlink that will open the annotation edit screen pre-populated with the supplied values ready to create a new annotation.
The function takes the arguments:
createAnnotation(text, title, subject, status, assignedTo, comment, eventIdList)
Example:
createAnnotation('Create Annotation', 'My Annotation Title', ${SubjectField}, 'New', 'UserA', 'Look at this thing', '123:2,123444:3')
See createAnnotation for details.
HostAddress
Returns the host address (IP) for the given host string.
hostAddress(host)
Example
hostAddress('google.com')
> '142.251.29.102'
HostName
Returns the host name for the given host string.
hostName(host)
Example
hostName('142.251.29.102')
> 'google.com'
InRange
Returns true if the value is between lower and upper (inclusive). All parameters must be either numbers or ISO date strings.
- value - The input value to test
- lower - The lower bound (inclusive)
- upper - The upper bound (inclusive)
inRange(value, lower, upper)
Example
inRange(5, 2, 6)
> true
inRange(5, 5, 5)
> true
inRange(5, 6, 7)
> false
inRange(5, 3, 4)
> false
Stroom Proxy
API Key Verification
When you have more than one Stroom Proxy in a chain (e.g. Remote Stroom Proxy -> Local Stroom Proxy -> Stroom), the remote Stroom Proxy may receive requests from clients using API Keys. On its own, Stroom Proxy is unable to authenticate an API Key as they were created by Stroom. Instead it needs to make a request to its downstream Stroom Proxy to authenticate the API Key. That Stroom Proxy can in turn make a similar request to Stroom.
Each Stroom Proxy is now able to cache and persist the verified keys so that it can perform the authentication without having to always make requests downstream. As it can now persist verified keys in a hashed form, it is able to perform authentication even when it is unable to contact its downstream host.
The following configuration controls this caching:
proxyConfig:
  downstreamHost:
    # How long to cache verified keys for in memory
    maxCachedKeyAge: "PT10M"
    # How long keys will be persisted for on disk in case the downstream
    # can't be connected to
    maxPersistedKeyAge: "P30D"
    # The delay to use after there is an error connecting to the downstream
    noFetchIntervalAfterFailure: "PT30S"
    # The hash algorithm used to hash persisted API keys.
    persistedKeysHashAlgorithm: "SHA2_512"
2 - Preview Features (experimental)
Pathways
Trace logs describe the operation of applications and user interactions with applications. Pathways is designed to analyse trace logs over a period of time and remember patterns of activity. The intent is for Pathways to identify unexpected behaviour or changes to services once regular behaviour has been learnt.
Once trace logs have been added to a Plan B store it can be analysed with Pathways. Pathways examines trace logs and identifies unique paths in trace logs between methods or service calls, e.g. A -> B -> C. It also records alternate paths, e.g. A -> B -> D. Each path is remembered by Pathways and logged to an output stream. Whenever a new unique path is found the fact is logged so that it is easy to identify changes.
Pathways also records and monitors changes to span attributes within traces.
Pathways makes no judgement about the changes it logs; it is up to users to add analytic rules to fire against pathway logs.
Data Receipt Rules
Data Receipt Rules is a new feature that serves as an alternative to the existing feed status checking performed by Stroom Proxy and Stroom. It provides a much richer mechanism for controlling which received data streams are Received, Rejected or Dropped. It allows anyone with the Manage Data Receipt Rules Application Permission to create one or more rules to control the receipt of data.
Data Receipt Rules can be accessed from the Administration menu (this requires the Manage Data Receipt Rules application permission).
Each rule is defined by a boolean expression (as used in Dashboards and Stream filtering) and the Action (Receive, Reject, Drop) that will be performed if the data matches the rule. Rules are evaluated in ascending order by Rule Number. The action is taken from the first rule to match.
If no rules match then the data will be rejected by default, i.e. the rules are include rather than exclude filters.
If you want data to be received if no rules match then you can create a rule at the end of the list with an Action of Receive and no expression terms.
If a stream matches a rule that has a Receive action, it will still be subject to a check to see if the Feed actually exists.
This means that the rules do not need to contain a Receive rule to cover all of the Feeds in the system.
They only need to cover the data you want to receive.
If the Feed does not exist, the client will receive a 101 Feed is not defined error.
The screen operates in a similar way to Data Retention Rules in that rules can be moved up/down to change their importance, or enabled/disabled.
Fields
The fields available to use in the expression terms can be defined in the Fields tab. The terms will be evaluated against the stream’s meta data, i.e. a combination of the HTTP headers sent by the client and any that have been populated by Stroom Proxy or Stroom. This allows for the use of custom headers to aid in the filtering of data into Stroom.
Dictionaries are supported for use with the in dictionary condition.
The contents of the dictionary and any of the dictionaries that it inherits will be included in the data fetched by Stroom Proxy.
Note
You cannot use the same dictionary for multiple fields if any one of those fields is obfuscated.
Should you need to use the same dictionary for an obfuscated and a non-obfuscated field, you can create one empty dictionary for each and make them both import from the same source dictionary.
Stroom Configuration
Data Receipt Rules are controlled by the following configuration:
appConfig:
  receiptPolicy:
    # List of fields whose values will be obfuscated when the rules
    # are fetched by Stroom Proxy
    obfuscatedFields:
      - "AccountId"
      - "AccountName"
      - "Component"
      # ... truncated
      - "UploadUserId"
      - "UploadUsername"
      - "X-Forwarded-For"
    # The hash algorithm used to hash obfuscated values, one of:
    # * SHA3_256
    # * SHA2_256
    # * BCRYPT
    # * ARGON_2
    # * SHA2_512
    obfuscationHashAlgorithm: "SHA2_512"
    # The initial list of fields to bootstrap a Stroom environment.
    # Changing this has no effect once an environment has been started up.
    receiptRulesInitialFields:
      AccountId: "Text"
      Component: "Text"
      Compression: "Text"
      content-length: "Text"
      # ... truncated
      Type: "Text"
      UploadUsername: "Text"
      UploadUserId: "Text"
      user-agent: "Text"
      X-Forwarded-For: "Text"
  receive:
    # The action to take if there is a problem with the data receipt rules, e.g.
    # Stroom Proxy has been unable to contact Stroom to fetch the rules.
    fallbackReceiveAction: "RECEIVE"
    # The data receipt checking mode, one of:
    # * FEED_STATUS - Use the legacy Feed Status Check method
    # * RECEIPT_POLICY - Use the new Data Receipt Rules
    # * RECEIVE_ALL - Receive ALL data with no checks
    # * DROP_ALL - Drop ALL data with no checks
    # * REJECT_ALL - Reject ALL data with no checks
    receiptCheckMode: "RECEIPT_POLICY"
Stroom Proxy Configuration
proxyConfig:
  receiptPolicy:
    # Only set this if you need to supply a non-standard full URL.
    # By default Proxy will use the known path for the Data Receipt Rules resource
    # combined with the host/port/scheme from the `downstreamHost` config property.
    receiveDataRulesUrl: null
    # The frequency that the rules will be fetched from the downstream Stroom instance.
    syncFrequency: "PT1M"
  # The receive branch is identical to the Stroom configuration described above.
  # Stroom and Stroom Proxy can use different `receiptCheckMode` values, but typically
  # they will be the same.
  receive:
    receiptCheckMode: "RECEIPT_POLICY"
Stroom Proxy Rule Synchronisation
If Stroom Proxy is configured with receiptCheckMode set to RECEIPT_POLICY and has downstreamHost configured, then it will periodically send a request to Stroom to fetch the latest copy of the Data Receipt Rules.
If Stroom Proxy is unable to contact Stroom it will use the latest copy of the rules that it has.
Given that Stroom Proxy will only synchronise periodically, once a change is made to the rule set, there will be a delay before the new rules take effect.
Term Value Obfuscation
As a Stroom administrator you may not want the values used in the Data Receipt Rule expression terms to be visible when they are fetched by a remote Stroom Proxy (that may be maintained by another team).
It is therefore possible to obfuscate the values used for the expression terms for certain configured fields.
The fields that are obfuscated are controlled by the property stroom.receiptPolicy.obfuscatedFields.
For example, in the default configuration, Feed is an obfuscated field.
Thus a term like Feed != FEED_XYZ would have its value obfuscated when fetched by Stroom Proxy.
Stroom Proxy is able to obfuscate meta data values for obfuscated fields in the same way, allowing it to test the rule expressions.
Warning
Due to the way obfuscation works, you are limited in the expression conditions that can be used, e.g. contains, >, < etc. are not allowed, but == and != are.
Stroom will tell you if you are using an unsupported condition for the field.
This prevents the Stroom Proxy administrator from being able to see the values used in the rules as they are not in plain text.
Each value is salted with its own unique salt then hashed.
The hash algorithm can be configured using stroom.receiptPolicy.obfuscationHashAlgorithm.
Note
Obfuscation is not encryption. The fetched data includes the salt values and given enough compute/time it would be possible to brute force the reversal of the hashing. Strong hashing algorithms such as BCrypt or Argon2 can mitigate against this but not remove the risk. If the rule values are too sensitive then you will have to let the Stroom Proxy accept the data and have Stroom do the full rule based checking.
3 - Breaking Changes
Warning
Please read this section carefully in case any of the changes affect you.
Stroom
Breaking changes relating to Stroom.
Import of Legacy Content
Stroom v7.11 is no longer able to import content that has been exported from a v5/v6 Stroom. Any such content will have to be imported into Stroom v7.10 or lower then exported for import into Stroom v7.11.
Elastic Configuration
The property stroom.elastic.retention.scrollSize has been removed.
See Upgrade Notes.
Stroom-Proxy
Breaking changes relating to Stroom Proxy.
Feed Status Check Configuration
Various properties have been removed from the proxyConfig.feedStatus configuration branch.
See Upgrade Notes.
Content Sync Configuration
The whole contentSync branch has been removed as it is no longer in use.
See Upgrade Notes.
Stroom & Stroom-Proxy
Breaking changes that are common to both Stroom and Stroom Proxy.
Receive Configuration
The property receiptPolicyUuid has been removed.
See Upgrade Notes.
4 - Upgrade Notes
Warning
Please read this section carefully in case any of it is relevant to your Stroom/Stroom-Proxy instance.
Upgrade Path
You can upgrade to v7.11.x from any v7.x release that is older than the version being upgraded to.
If you want to upgrade to v7.11.x from v5.x or v6.x we recommend you do the following:
- Upgrade v5.x to the latest patch release of v6.0.
- Upgrade v6.x to the latest patch release of v7.0.
- Upgrade v7.x to the latest patch release of v7.11.
Warning
v7.11 cannot migrate content in legacy formats, i.e. content created in v5/v6. You must therefore upgrade to v7.0.x first to migrate this content, before upgrading to v7.11.x.
Java Version
Stroom v7.11 requires Java 25.
Warning
This is different to the Java version required for Stroom v7.10 (Java 21).
Ensure the Stroom and Stroom-Proxy hosts are running the latest patch release of Java v25.
Configuration File Changes
Common Configuration Changes
These changes are common to both Stroom and Stroom Proxy.
New receive Branch Properties
proxyConfig:
  receive:
    # The action to take if there is a problem with the data receipt rules, e.g.
    # Stroom Proxy has been unable to contact Stroom to fetch the rules.
    fallbackReceiveAction: "RECEIVE"
    # If defined then states the maximum size of a request (uncompressed for gzip requests).
    # Will return a 413 Content Too Long response code for any requests exceeding this
    # value. If undefined then there is no limit to the size of the request.
    # Defined as an IEC byte value, e.g. 10GiB, 10M, 23TB, 1024, etc.
    maxRequestSize: null
    # The data receipt checking mode, one of:
    # * FEED_STATUS - Use the legacy Feed Status Check method
    # * RECEIPT_POLICY - Use the new Data Receipt Rules
    # * RECEIVE_ALL - Receive ALL data with no checks
    # * DROP_ALL - Drop ALL data with no checks
    # * REJECT_ALL - Reject ALL data with no checks
    receiptCheckMode: "FEED_STATUS"
Removed Property proxyConfig.receive.receiptPolicyUuid
The property receiptPolicyUuid has been removed.
This was for an unused feature so its removal does not require any migration.
If it is in your config file, simply remove it.
proxyConfig:
  receive:
    receiptPolicyUuid:
Stroom’s config.yml
New askStroomAi Branch
A new branch of configuration to configure Ask Stroom AI.
appConfig:
  askStroomAi:
    chatMemory:
      # How long a chat memory entry should exist before being expired
      timeToLive:
        time: 1
        timeUnit: "HOURS"
      # Number of tokens to keep in each chat memory store
      tokenLimit: 30000
    # The DocRef of the OpenAIModel document to use for Ask Stroom AI. e.g.
    # modelRef:
    #   name: "My Model"
    #   type: "OpenAIModel"
    #   uuid: "8df699c0-2ce5-48bc-bb1a-e5d26bbd2175"
    modelRef:
    tableSummary:
      # Maximum number of tokens to pass the AI service at a time
      maximumBatchSize: 16384
      # Maximum number of table result rows to pass to the AI when making requests
      maximumTableInputRows: 100
New contentStore Branch
A new branch of configuration to configure one or more Content Stores.
appConfig:
  contentStore:
    # A list of URLs to content store definition files
    urls:
      - "https://raw.githubusercontent.com/gchq/stroom-content/refs/heads/master/source/content-store.yml"
New credentials Branch
A new branch has been added for configuring the storage and caching of credentials for authenticating with other systems.
credentials:
  # A standard configuration branch for a database connection.
  # Only set this if you want the credentials tables to be located on a different database
  # to the rest of Stroom.
  db:
    connection:
      #...
    connectionPool:
      # ...
  # The path to store cached key stores.
  keyStoreCachePath: "${stroom.home}/keystores"
Removed Elastic Property
The property stroom.elastic.retention.scrollSize has been removed.
appConfig:
  elastic:
    retention:            # <- REMOVED
      scrollSize: 10000   # <- REMOVED
db Branch added to gitRepo
A standard database configuration branch has been added to GitRepo.
You should not need to set this unless you want the Git Repo table data to be stored on a different database.
appConfig:
  gitRepo:
    db:
      connection:
        #...
      connectionPool:
        # ...
New receiptPolicy Branch
A new configuration branch for Data Receipt Rules.
receiptPolicy:
  obfuscatedFields:
    - "AccountId"
    - "AccountName"
    # ...
    - "X-Forwarded-For"
  obfuscationHashAlgorithm: "SHA2_512"
  receiptRulesInitialFields:
    AccountId: "Text"
    Component: "Text"
    # ...
    user-agent: "Text"
    X-Forwarded-For: "Text"
Stroom-Proxy’s config.yml
Removed contentSync
The whole contentSync branch has been removed as it is no longer in use.
proxyConfig:
  contentSync:                # <- REMOVED
    apiKey: null              # <- REMOVED
    contentSyncEnabled: false # <- REMOVED
    syncFrequency: "PT1M"     # <- REMOVED
    upstreamUrl: null         # <- REMOVED
Added downstreamHost
A new downstreamHost branch has been added to configure the default downstream host details and authentication.
In a typical deployment, Stroom Proxy will receive data and forward it on to a downstream Stroom or Stroom Proxy.
There are a number of parts of Stroom Proxy that need to communicate with the downstream host, e.g. feed status checking, forwarding or rule fetching.
This allows for the host’s details and any authentication properties to be set in one place.
Typically you will only need to set the hostname and apiKey properties.
proxyConfig:
  downstreamHost:
    # The API key to use for authentication (unless OpenID Connect is being used)
    apiKey: null
    # Only set this if you need to use a non-standard path
    apiKeyVerificationUrl: null
    enabled: true
    # The hostname of the downstream
    hostname: null
    # How long to cache verified keys for in memory
    maxCachedKeyAge: "PT10M"
    # How long keys will be persisted for on disk in case the downstream
    # can't be connected to
    maxPersistedKeyAge: "P30D"
    # The delay to use after there is an error connecting to the downstream
    noFetchIntervalAfterFailure: "PT30S"
    # Only required if you need a common path prefix in front of the path
    # that Stroom / Stroom Proxy will use.
    pathPrefix: null
    # The hash algorithm used to hash persisted API keys.
    persistedKeysHashAlgorithm: "SHA2_512"
    # The port to connect to the downstream on
    # If not set, will default to 80/443 depending on scheme.
    port: null
    # The scheme to connect to the downstream on
    scheme: "https"
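As noted above, a typical deployment only needs to set the hostname and apiKey properties; a minimal sketch (both values are illustrative placeholders):

proxyConfig:
  downstreamHost:
    # Hostname of the downstream Stroom or Stroom Proxy (placeholder)
    hostname: "downstream.example.com"
    # API key issued by the downstream Stroom (placeholder)
    apiKey: "REPLACE_WITH_API_KEY"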
Removed various feedStatus properties
The following three properties have been removed from the feedStatus branch.
proxyConfig:
  feedStatus:
    apiKey: null             # <- REMOVED
    defaultStatus: "Receive" # <- REMOVED
    enabled: true            # <- REMOVED
The migration for these properties is as follows:
- apiKey => proxyConfig.downstreamHost.apiKey
- defaultStatus => proxyConfig.receive.fallbackReceiveAction
  - Receive => RECEIVE
  - Reject => REJECT
  - Drop => DROP
- enabled => proxyConfig.receive.receiptCheckMode
  - true => FEED_STATUS
  - false => one of RECEIPT_POLICY, RECEIVE_ALL, DROP_ALL, REJECT_ALL
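As an illustrative sketch of this migration (the API key value is a placeholder), a legacy configuration such as:

proxyConfig:
  feedStatus:
    apiKey: "REPLACE_WITH_API_KEY"
    defaultStatus: "Receive"
    enabled: true

would now be expressed as:

proxyConfig:
  downstreamHost:
    apiKey: "REPLACE_WITH_API_KEY"
  receive:
    fallbackReceiveAction: "RECEIVE"
    receiptCheckMode: "FEED_STATUS"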
feedStatus.url
The url property for feed status checking no longer needs to be set unless you need to use a non-standard URL.
The URL for feed status checking will now be derived from the properties in downstreamHost and the static path for the feed status resource.
Therefore in most cases, simply remove this property.
Also, previously, the value for this property was just a path. Now, if you set it, it should be a full URL including host and path.
proxyConfig:
  feedStatus:
    url:
New forwardHttpDestinations property
A new property has been added to enable/disable the liveness checking for HTTP destinations. Liveness checking will periodically check that the destination is live and if not, disable forwarding until the liveness check determines it to be live again. Liveness checking is enabled by default.
proxyConfig:
  forwardHttpDestinations:
    - .....
      livenessCheckEnabled: true
New forwardFileDestinations property
A new property has been added to enable/disable the liveness checking for file destinations. Liveness checking will periodically check that the destination is live and if not, disable forwarding until the liveness check determines it to be live again. Liveness checking is enabled by default.
proxyConfig:
  forwardFileDestinations:
    - .....
      livenessCheckEnabled: true
Changes to forwardHttpDestinations[*].forwardUrl property
This property no longer needs to be set unless you need to forward to a location that is different to that defined by downstreamHost.
The forward URL will now be derived from the properties in downstreamHost and the static path for the datafeed endpoint.
Therefore in most cases, simply remove this property.
Also, if you need to set this property, the value of the forwardUrl property has changed from being just a path to being a full URL including host and path.
proxyConfig:
  forwardHttpDestinations:
    - .....
      forwardUrl:
Added receiptPolicy Branch
This configuration branch has been added to configure the Data Receipt Rules.
proxyConfig:
  receiptPolicy:
    # Stroom Proxy will use downstreamHost to derive the URL to connect to.
    # Only set this if you need to use a non-standard URL.
    receiveDataRulesUrl: null
    # The frequency that Stroom Proxy will connect to the downstream host to obtain updated
    # receipt policy rules.
    syncFrequency: "PT1M"
Database Migrations
When Stroom boots for the first time with a new version it will run any required database migrations to bring the database schema up to the correct version.
Warning
It is highly recommended to ensure you have a database backup in place before booting Stroom with a new version. This is to mitigate against any problems with the migration. It is also recommended to test the migration against a copy of your database to ensure that there are no problems when you do it for real.
On boot, Stroom will ensure that the migrations are only run by a single node in the cluster. This will be the node that reaches that point in the boot process first. All other nodes will wait until that is complete before proceeding with the boot process.
It is recommended, however, to use a single node to execute the migration.
To avoid Stroom starting up and beginning processing you can use the migrate command to just migrate the database and not fully boot Stroom.
See the migrate command for more details.
Migration Scripts
For information purposes only, the following are the database migrations that will be run when upgrading to 7.11.0 from the previous minor version.
Note, the legacy module will run first (if present), then the other modules will run in no particular order.
Module stroom-annotation
Script V07_11_00_001__annotation3.sql
Path: stroom-annotation/stroom-annotation-impl-db/src/main/resources/stroom/annotation/impl/db/migration/V07_11_00_001__annotation3.sql
-- ------------------------------------------------------------------------
-- Copyright 2023 Crown Copyright
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
-- http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.
-- ------------------------------------------------------------------------
-- Stop NOTE level warnings about objects (not)? existing
SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0;
DROP PROCEDURE IF EXISTS V07_11_00_001_annotation;
DELIMITER $$
CREATE PROCEDURE V07_11_00_001_annotation ()
BEGIN
DECLARE object_count integer;
--
-- Add entry parent id
--
SELECT COUNT(1)
INTO object_count
FROM information_schema.columns
WHERE table_schema = database()
AND table_name = 'annotation_entry'
AND column_name = 'parent_id';
IF object_count = 0 THEN
ALTER TABLE `annotation_entry` ADD COLUMN `parent_id` bigint DEFAULT NULL;
END IF;
--
-- Add entry update time
--
SELECT COUNT(1)
INTO object_count
FROM information_schema.columns
WHERE table_schema = database()
AND table_name = 'annotation_entry'
AND column_name = 'update_time_ms';
IF object_count = 0 THEN
ALTER TABLE `annotation_entry` ADD COLUMN `update_time_ms` bigint(20) NOT NULL;
-- Copy all entry times to update times.
SET @sql_str = CONCAT(
'UPDATE annotation_entry a ',
'SET a.update_time_ms = a.entry_time_ms');
PREPARE stmt FROM @sql_str;
EXECUTE stmt;
END IF;
--
-- Add entry update user
--
SELECT COUNT(1)
INTO object_count
FROM information_schema.columns
WHERE table_schema = database()
AND table_name = 'annotation_entry'
AND column_name = 'update_user_uuid';
IF object_count = 0 THEN
ALTER TABLE `annotation_entry` ADD COLUMN `update_user_uuid` varchar(255) NOT NULL;
-- Copy all entry users to update users.
SET @sql_str = CONCAT(
'UPDATE annotation_entry a ',
'SET a.update_user_uuid = a.entry_user_uuid');
PREPARE stmt FROM @sql_str;
EXECUTE stmt;
END IF;
END $$
DELIMITER ;
CALL V07_11_00_001_annotation;
DROP PROCEDURE IF EXISTS V07_11_00_001_annotation;
SET SQL_NOTES=@OLD_SQL_NOTES;
-- vim: set shiftwidth=4 tabstop=4 expandtab:
Module stroom-credentials
Script V07_11_00_001__credential.sql
Path: stroom-credentials/stroom-credentials-impl-db/src/main/resources/stroom/credentials/impl/db/migration/V07_11_00_001__credential.sql
-- ------------------------------------------------------------------------
-- Copyright 2025 Crown Copyright
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
-- http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.
-- ------------------------------------------------------------------------
-- Stop NOTE level warnings about objects (not)? existing
SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0;
--
-- Create the credential tables
--
CREATE TABLE IF NOT EXISTS credential (
uuid varchar(255) NOT NULL,
create_time_ms bigint(20) NOT NULL,
create_user varchar(255) NOT NULL,
update_time_ms bigint(20) NOT NULL,
update_user varchar(255) NOT NULL,
name varchar(255) NOT NULL,
crendential_type varchar(255) NOT NULL,
key_store_type varchar(255) DEFAULT NULL,
expiry_time_ms bigint DEFAULT NULL,
secret_json json NOT NULL,
key_store longblob DEFAULT NULL,
PRIMARY KEY (uuid),
UNIQUE KEY name (name)
) ENGINE=InnoDB DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;
SET SQL_NOTES=@OLD_SQL_NOTES;
-- vim: set shiftwidth=4 tabstop=4 expandtab:
Module stroom-index
Script V07_11_00_001__index_field.sql
Path: stroom-index/stroom-index-impl-db/src/main/resources/stroom/index/impl/db/migration/V07_11_00_001__index_field.sql
-- Stop NOTE level warnings about objects (not)? existing
SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0;
ALTER TABLE index_field ADD COLUMN `dense_vector` json DEFAULT NULL;
SET SQL_NOTES=@OLD_SQL_NOTES;
Module stroom-processor
Script V07_11_00_001__processor_filter.sql
Path: stroom-processor/stroom-processor-impl-db/src/main/resources/stroom/processor/impl/db/migration/V07_11_00_001__processor_filter.sql
-- ------------------------------------------------------------------------
-- Copyright 2020 Crown Copyright
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
-- http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.
-- ------------------------------------------------------------------------
-- Stop NOTE level warnings about objects (not)? existing
SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0;
-- --------------------------------------------------
DELIMITER $$
-- --------------------------------------------------
DROP PROCEDURE IF EXISTS processor_run_sql_v1 $$
-- DO NOT change this without reading the header!
CREATE PROCEDURE processor_run_sql_v1 (
p_sql_stmt varchar(1000)
)
BEGIN
SET @sqlstmt = p_sql_stmt;
SELECT CONCAT('Running sql: ', @sqlstmt);
PREPARE stmt FROM @sqlstmt;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
END $$
-- --------------------------------------------------
DROP PROCEDURE IF EXISTS processor_add_column_v1$$
-- DO NOT change this without reading the header!
CREATE PROCEDURE processor_add_column_v1 (
p_table_name varchar(64),
p_column_name varchar(64),
p_column_type_info varchar(64) -- e.g. 'varchar(255) default NULL'
)
BEGIN
DECLARE object_count integer;
SELECT COUNT(1)
INTO object_count
FROM information_schema.columns
WHERE table_schema = database()
AND table_name = p_table_name
AND column_name = p_column_name;
IF object_count = 0 THEN
CALL processor_run_sql_v1(CONCAT(
'alter table ', database(), '.', p_table_name,
' add column ', p_column_name, ' ', p_column_type_info));
ELSE
SELECT CONCAT(
'Column ',
p_column_name,
' already exists on table ',
database(),
'.',
p_table_name);
END IF;
END $$
-- idempotent
CALL processor_add_column_v1(
'processor_filter',
'export',
'TINYINT(1) NOT NULL DEFAULT 0')$$
-- vim: set shiftwidth=4 tabstop=4 expandtab:
5 - Change Log
New Features and Changes
- Feature #5314 : Add `rowCount`, `fileType` (EXCEL|CSV|TSV), `fileName` to the templating context when generating email reports.
- Feature #5313 : Allow users to prevent empty reports from being sent on a per report basis.
- Feature : Add enabled/disabled styling to table rows in the Report screens.
- Feature : Add column header tool tips to tables in the Report screens.
- Feature : Change the Report > Notifications Max column to be right aligned.
- Feature : Add red/green styling to the Report > Notifications Status column (Complete/Error).
- Feature #5282 : Processor task creation now supports feed dependencies to delay processing until reference data is available.
- Feature #5256 : Add option to omit documentation from rule detection.
- Feature #5309 : Add long support to pathway values.
- Feature #5303 : Make AI HTTP connection configurable.
- Feature #5303 : Make AI HTTP SSL certificate stores configurable.
- Feature #5303 : Add KNN dense vector support to Lucene indexes.
- Feature #5303 : Pass only visible columns to the Ask Stroom AI service.
- Feature #4123 : New pipeline stepping mode.
- Feature #5290 : Plan B trace store now requires data to conform to Plan B trace schema.
- Feature : Make the Quick Filter Help button hide the help popup if it is visible.
- Feature : Add the same Quick Filter help popup as used on the explorer QF to the QFs on Dashboard table columns, Query help and Expression Term editors.
- Feature #5263 : Add copy for selected rows.
- Remove default value for `feedStatus.url` in the proxy config yaml as downstream host should now be used instead.
- Feature #5192 : Support Elasticsearch kNN search on dense_vector fields.
- Feature #5124 : Change cluster lock `tryLock` to use the database record locks rather than the inter-node lock handler.
- Feature #656 : Allow table filters to use dictionaries.
- Feature #672 : Dashboards will only auto refresh when selected.
- Feature #2029 : Add OS memory stats to the node status stats.
- Feature #3799 : Search for tags in Find In Content.
- Feature #3335 : Preserve escape chars not preceding delimiters.
- Feature #1429 : Protect against large file ingests.
- Feature #4121 : Add rename option for pipeline elements.
- Feature #2374 : Add pipeline element descriptions.
- Feature #4099 : Add InRange function.
- Feature #2374 : Add description is now editable for pipeline elements.
- Feature #268 : Add not contains and not exists filters to pipeline stepping.
- Feature #844 : Add functions for hostname and hostaddress.
- Feature #4579 : Add table name/id to conditional formatting exceptions.
- Feature #4124 : Show severity of search error messages.
- Feature #4369 : Add new rerun scheduled execution icon.
- Feature #3207 : Add maxStringFieldLength table setting.
- Feature #1249 : Dashboard links can open in the same tab.
- Feature #1304 : Copy dashboard components between dashboards.
- Feature #2145 : New add-meta xslt function.
- Feature #370 : Perform schema validation on save.
- Feature #397 : Copy user permissions.
- Feature #5088 : Add table column filter dashboard component.
- Feature #2571 : Show Tasks for processor filter.
- Feature #4177 : Add stream id links.
- Feature #2279 : Drag and drop tabs.
- Feature #2584 : Close all tabs to right/left.
- Feature #5013 : Add row data to annotations.
- Feature #3049 : Check for full/inactive/closed index volumes.
- Feature #4070 : Show column information on hover tip.
- Feature #3815 : Add selected tab colour property.
- Feature #4790 : Add copy option for property names.
- Feature #4121 : Add rename option for pipeline elements.
- Feature #2823 : Add `autoImport` servlet to simplify importing content.
- Feature #5013 : Add data to existing annotations.
- Feature #5013 : Add links to other annotations and allow comments to make references to events and other annotations.
- Feature #2374 : Add pipeline element descriptions.
- Feature #2374 : Add description is now editable for pipeline elements.
- Feature #4048 : Add query csv API.
- Feature: Change the proxy config properties `forwardUrl`, `livenessCheckUrl`, `apiKeyVerificationUrl` and `feedStatusUrl` to be empty by default and to allow them to be populated with either just a path or a full URL. `downstreamHost` config will be used to provide the host details if these properties are empty or only contain a path. Added the property `livenessCheckEnabled` to `forwardHttpDestinations` to control whether the forward destination liveness is checked (defaults to true).
- Feature #5028 : Add build info metrics `buildVersion`, `buildDate` and `upTime`.
- Add admin port servlets for Prometheus to scrape metrics from stroom and proxy. Servlet is available as `http://host:<admin port>/(stroom|proxy)Admin/prometheusMetrics`.
- Feature #4735 : Add expand/collapse to result tables.
- Feature #5013 : Allow annotation status update without requery.
- Feature #259 : Maximise dashboard panes.
- Feature #3874 : Add copy context menu to tables.
- Feature : Add the Receive Data Rules screen to the Administration menu which requires the `Manage Data Receipt Rules` app permission. Add the following new config properties to the `receive` branch: `obfuscatedFields`, `obfuscationHashAlgorithm`, `receiptCheckMode` and `receiptRulesInitialFields`. Remove the property `receiptPolicyUuid`. Add the proxy config property `contentSync.receiveDataRulesUrl`.
- Feature: The proxy config property `feedStatus.enabled` has been replaced by `receive.receiptCheckMode` which takes values `FEED_STATUS`, `RECEIPT_POLICY` or `NONE`.
- Feature: In the proxy config, the named Jersey clients CONTENT_SYNC and FEED_STATUS have been removed and replaced with DOWNSTREAM.
Bug Fixes
- Bug #5391 : Fix folder DocRef NPE.
- Bug #5392 : Fix PlanB segfault.
- Bug #5300 : Fix path `millis` parameter.
- Bug : Fix Reports not respecting the start date during execution. It was executing from the last tracker time rather than from the start date, if the start date is later.
- Bug #5384 : Improvements to annotations database code.
- Bug #5360 : Fix NPE when annotation data retention fires change events.
- Bug #5361 : Fix invalid SQL error when annotation data retention runs.
- Bug : Fix ‘Data source already in use’ errors when using annotations.
- Bug : Add missing cluster lock protection for annotation data retention job.
- Bug #5359 : Fix ‘Data source already in use’ errors in the Data Delete job.
- Bug #5185 : Fix dashboard maximise NPE.
- Bug #5370 : Hide rule notification doc checkbox on reports as it is not applicable.
- Bug #5234 : Fix spaces in `createAnnotation()` function link text.
- Bug #5355 : Fix HttpClient being closed by cache while in use.
- Bug #5303 : Add debug to index dense vector retrieval.
- Bug #5344 : Fix credential manager UI.
- Bug #5342 : Fix issue querying Plan B shards where some stores have been deleted.
- Bug #5351 : Make SSH key and known host configuration clearer.
- Bug #5353 : Fix keystore config issue.
- Bug #5339 : Fix NPE thrown when adding vises to dashboards.
- Bug #5337 : Fix analytic doc serialisation.
- Bug #4124 : Fix NodeResultSerialiser and add node name to errors.
- Bug #5317 : Pathways now load current state to not endlessly output mutations.
- Bug : Fix build issue causing Stroom to not boot.
- Bug #5304 : Fix error when unzipping the stroom-app-all jar file. This problem was also leading to AV scan alerts.
- Bug : Fix Files list on Stream Info pane so that you can copy each file path individually.
- Bug #5297 : Fix missing execute permissions and incorrect file dates in stroom and stroom-proxy distribution ZIP files.
- Bug #5259 : Fix PlanB Val.toString() NPE.
- Bug #5291 : Fix explorer item sort order.
- Bug #5152 : Fix position of the Clear icon on the Quick Filter.
- Bug #5293 : Fix pipeline element type and name display.
- Bug #5254 : Fix document NPE.
- Bug #5218 : When `autoContentCreation` is enabled, don’t attempt to find a content template if the `Feed` header has been provided and the feed exists.
- Bug #5244 : Fix proxy throwing an error when attempting to do a feed status check.
- Bug #5149 : Fix context menus not appearing on dashboard tables.
- Bug #4614 : Fix StroomQL highlight.
- Bug #5064 : Fix ref data store discovery.
- Bug #5022 : Fix weird spinner behaviour.
- Bug : Fix NPE when proxy tries to fetch the receipt rules from downstream.
Code Refactor
- Refactor : Remove static imports except in test classes.
Dependency Changes
- Dependency : Uplift base docker images to eclipse-temurin:25.0.1_8-jdk-alpine-3.23.
- Dependency : Uplift dependency com.hubspot.jinjava:jinjava from 2.7.2 to 2.8.2.
- Dependency : Uplift dependency swagger-* from 2.2.38 to 2.2.41.
- Dependency : Uplift com.sun.xml.bind:jaxb-impl from 4.0.5 to 4.0.6.
- Dependency : Uplift langchain4j dependencies from 1.8.0-beta15 to 1.10.0-beta18 and 1.8.0 to 1.10.0.
- Dependency : Uplift Gradle to v9.2.1.
- Dependency : Uplift docker image JDK to `eclipse-temurin:25_36-jdk-alpine-3.22`.
- Dependency #5257 : Upgrade Lucene to 10.3.1.
- Dependency : Uplift dependency java-jwt 4.4.0 => 4.5.0.
- Dependency : Uplift org.apache.commons:commons-csv from 1.10.0 to 1.14.1.
- Dependency : Uplift dependency org.apache.commons:commons-pool2 from 2.12.0 to 2.12.1.
- Dependency : Bumps org.quartz-scheduler:quartz from 2.5.0-rc1 to 2.5.1.
- Dependency : Bumps org.eclipse.transformer:org.eclipse.transformer.cli from 0.5.0 to 1.0.0.
- Dependency : Bumps jooq from 3.20.5 to 3.20.8.
- Dependency : Bumps com.mysql:mysql-connector-j from 9.2.0 to 9.4.0.
- Dependency : Bumps flyway from 11.9.1 to 11.14.0.