Documents

A reference of all the different types of Document that can be created in Stroom. A Document is a user-created piece of content in Stroom that is visible in the explorer tree.

All Documents in Stroom share some common elements:

  • UUID - Uniquely identifies the document within Stroom and when exported into another Stroom instance. A UUID (Universally Unique Identifier) is used as the identifier in Doc Refs. An example of a UUID is 4ffeb895-53c9-40d6-bf33-3ef025401ad3.
  • Type - This is the type as used in the Doc Ref. A Doc Ref (or Document Reference) is an identifier used to identify most documents/entities in Stroom, e.g. an XSLT will have a Doc Ref.
  • Documentation - Every Document has a Documentation tab for recording any documentation that relates to the Document, see Documenting Content.
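
As an illustration of how a UUID and a Type together identify a Document, the following is a minimal Python sketch. The DocRef class here is hypothetical and is not Stroom's own implementation; it only models the two elements described above and checks that the UUID is well formed.

```python
from dataclasses import dataclass
from uuid import UUID


@dataclass(frozen=True)
class DocRef:
    """Hypothetical model of a Doc Ref: a Document's type plus its UUID."""
    type: str
    uuid: str

    def __post_init__(self):
        # UUID() raises ValueError if the string is not a well-formed UUID.
        UUID(self.uuid)


xslt_ref = DocRef(type="XSLT", uuid="4ffeb895-53c9-40d6-bf33-3ef025401ad3")
same_ref = DocRef(type="XSLT", uuid="4ffeb895-53c9-40d6-bf33-3ef025401ad3")

# Two references with the same type and UUID identify the same Document.
print(xslt_ref == same_ref)
```

Because the UUID travels with the Document when it is exported, a reference built this way would still resolve after the content is imported into another Stroom instance.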

Some Documents are very simple with just text content and documentation, e.g. XSLT. Others are much more complex, e.g. Pipeline, with various different tabs to manage the content of the Document.

The following is a list of all Document types in Stroom.

Configuration

Documents that are used as configuration for other documents.

Dictionary

  • Type: Dictionary

A Dictionary is essentially a list of ‘words’, where each ‘word’ is separated by a new line. Dictionaries can be used in filter expressions, i.e. IN DICTIONARY. They allow for the reuse of the same set of values across many search expressions. Dictionaries also support inheritance so one dictionary can import the contents of other dictionaries.
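
The idea of one dictionary importing the contents of others can be sketched as follows. This is a toy Python model, not Stroom's implementation: the data layout, the dictionary names and the ordering (own words first, then imported words) are all assumptions made for illustration.

```python
def resolve_words(name, dictionaries, _seen=None):
    """Return a dictionary's own words followed by the words of its imports.

    `dictionaries` maps a dictionary name to a dict with two keys:
    "words" (newline-separated text) and "imports" (list of dictionary names).
    """
    seen = _seen if _seen is not None else set()
    if name in seen:
        # Guard against circular imports.
        return []
    seen.add(name)

    entry = dictionaries[name]
    # Each non-blank line is one 'word'.
    words = [line for line in entry["words"].splitlines() if line.strip()]
    for imported in entry.get("imports", []):
        words.extend(resolve_words(imported, dictionaries, seen))
    return words


# Hypothetical dictionaries, for illustration only.
dictionaries = {
    "Internal Hosts": {"words": "server1\nserver2", "imports": []},
    "All Hosts": {"words": "gateway1", "imports": ["Internal Hosts"]},
}

print(resolve_words("All Hosts", dictionaries))
# ['gateway1', 'server1', 'server2']
```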

Documentation

  • Type: Documentation

A Document type for simply storing user created documentation, e.g. adding a Documentation document into a folder to describe the contents of that folder.

Elastic Cluster

  • Type: ElasticCluster

Defines the connection details for a single Elasticsearch cluster. This Elastic Cluster Document can then be used by one or more Elastic Index Documents.

Git Repo

  • Type: GitRepo

Contains the configuration for a connection to a Git repository.

Kafka Configuration

  • Type: KafkaConfig

Defines the connection details for a single Kafka cluster. This Kafka Configuration Document can then be used by one or more StandardKafkaProducer pipeline elements.

OpenAI Model

  • Type: OpenAIModel

Defines the settings required to connect to an OpenAI-compatible API and interact with a model.

S3 Configuration

  • Type: S3Config

Defines the configuration for S3.

Script

  • Type: Script

Contains a JavaScript script that is used as the source for a Visualisation Document. Scripts can have dependencies on other Script Documents, e.g. to allow re-use of common code.
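
When scripts depend on other scripts, each dependency has to be loaded before the script that needs it. The following Python sketch shows one way to derive such a load order using a topological sort. The script names are hypothetical and this is not a description of how Stroom itself resolves Script dependencies.

```python
from graphlib import TopologicalSorter

# Hypothetical Script Documents mapped to the scripts they depend on.
dependencies = {
    "BarChart": {"Common", "D3"},
    "Common": {"D3"},
    "D3": set(),
}

# static_order() yields each script only after all of its dependencies.
load_order = list(TopologicalSorter(dependencies).static_order())
print(load_order)  # ['D3', 'Common', 'BarChart']
```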

Scylla DB

  • Type: ScyllaDB

Defines the connection details for a ScyllaDB state store instance.

Visualisation

  • Type: Visualisation

Defines a data visualisation that can be used in a Dashboard Document. The Visualisation defines the settings that will be available to the user when it is embedded in a Dashboard. A Visualisation is dependent on a Script Document for the JavaScript code to make it work.

Data Processing

Documents relating to the processing of data.

Feed

  • Type: Feed

The Feed is Stroom’s way of organising and compartmentalising data that has been ingested or created by a Pipeline. A Feed contains multiple Streams of data that have been ingested into Stroom or output by a Pipeline. Typically a Feed will contain Streams of data that are all from one system and have a common data format. Ingested data must specify the Feed that it is destined for.

The Feed Document defines the character encoding for the data in the Feed, the type of data that will be received into it (e.g. Raw Events) and optionally a Volume Group to use for data storage. The Feed Document can also control the ingest of data using its Feed Status property and be used for viewing data that belongs to that Feed.

Pipeline

  • Type: Pipeline

A Pipeline defines a chain of Pipeline elements that consumes from a source of data (a Stream of raw data or cooked events) then processes it according to the elements used in the chain. Pipelines can be linear or branching and support inheritance of other pipelines to allow re-use of common structural parts.

The Pipeline Document defines the structure of the pipeline and the configuration of each of the elements in that pipeline. It also defines the filter(s) that will be used to control what data is passed through the pipeline and the priority of processing. The Pipeline Document can be used to view the data produced by the pipeline and to monitor its processing state and progress.

Indexing

Documents relating to the process of adding data into an index, such as Lucene or Elasticsearch.

Elastic Index

  • Type: ElasticIndex

Defines an index that exists within an Elasticsearch cluster. This Document is used in the configuration of the ElasticIndexingFilter pipeline element.

Lucene Index

  • Type: Index

Lucene Index is the standard built-in index within Stroom and is one of many data sources. An index is like a catalog in a library and provides a very fast way to access documents/records/events when searching using fields that have been indexed. The index stores the field values and pointers to the document they came from (the Stream and Event IDs). Data can be indexed using multiple indexes to allow fast access in different ways.
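
The library catalog analogy can be made concrete with a toy inverted index. The Python sketch below is not Lucene and does not reflect how Stroom stores its index; the field names, IDs and events are made up. It only shows the principle of mapping indexed field values to pointers (Stream ID and Event ID) back to the source events.

```python
from collections import defaultdict

# Hypothetical events as (stream_id, event_id, fields).
events = [
    (101, 1, {"UserId": "alice", "Action": "login"}),
    (101, 2, {"UserId": "bob", "Action": "login"}),
    (102, 1, {"UserId": "alice", "Action": "logout"}),
]

# Map each (field, value) pair to the events that contain it.
index = defaultdict(list)
for stream_id, event_id, fields in events:
    for field, value in fields.items():
        index[(field, value)].append((stream_id, event_id))

# A lookup on an indexed field returns pointers without scanning every event.
print(index[("UserId", "alice")])  # [(101, 1), (102, 1)]
```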

The Lucene Index Document optionally defines the fields that will be indexed (it is possible to define the fields dynamically) and their types. It also allows for configuration of the way the data in the index will be stored, partitioned and retained.

The Lucene Index Document is used by the IndexingFilter and DynamicIndexingFilter pipeline elements.

Pathways

  • Type: Pathways

TODO - Add description

Plan B

  • Type: PlanB

Defines a place to store state.

Solr Index

  • Type: SolrIndex

Solr Index represents an index on a Solr cluster. It defines the connection details for connecting to that cluster and the structure of the index. It is used by the SolrIndexingFilter pipeline element.

State Store

  • Type: StateStore

Defines a place to store state.

Statistic Store

  • Type: StatisticStore

Defines a logical statistic store used to hold statistical data of a particular type and aggregation window. Statistics in Stroom is a way to capture counts or values from events and record how they change over time, with the counts/values aggregated (sum/mean) across time windows.

The Statistic Store Document configures the type of the statistic (Count or Value), the tags that are used to qualify a statistic event and the size of the aggregation windows. It also supports the definition of roll-ups that allow for aggregation over all values of a tag. Tags can be things like user, node, feed, etc. and can be used to filter data when querying the statistic store in a Dashboard/Query.
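
The aggregation and roll-up behaviour described above can be sketched in Python as follows. This is a simplified model of a Count statistic under stated assumptions: the function names, the tuple layout of the events and the use of "*" to mark a rolled-up tag are illustrative choices, not Stroom's actual storage format.

```python
from collections import defaultdict


def aggregate_counts(events, window_seconds):
    """Sum count events into fixed time windows, keyed by window start and tags."""
    totals = defaultdict(int)
    for timestamp, tags, count in events:
        window_start = timestamp - (timestamp % window_seconds)
        key = (window_start, tuple(sorted(tags.items())))
        totals[key] += count
    return dict(totals)


def roll_up(totals, tag_name):
    """Aggregate over all values of one tag by replacing its value with '*'."""
    rolled = defaultdict(int)
    for (window_start, tags), count in totals.items():
        rolled_tags = tuple((k, "*" if k == tag_name else v) for k, v in tags)
        rolled[(window_start, rolled_tags)] += count
    return dict(rolled)


# Hypothetical statistic events as (epoch seconds, tags, count).
events = [
    (0, {"feed": "FEED_A", "user": "alice"}, 1),
    (30, {"feed": "FEED_A", "user": "bob"}, 2),
    (45, {"feed": "FEED_A", "user": "alice"}, 1),
    (70, {"feed": "FEED_A", "user": "alice"}, 1),
]

per_minute = aggregate_counts(events, window_seconds=60)
all_users = roll_up(per_minute, "user")
print(all_users[(0, (("feed", "FEED_A"), ("user", "*")))])  # 4
```

Querying the rolled-up totals answers "how many events per minute for this feed, regardless of user" without having to sum over every user at query time.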

It is used by the StatisticsFilter pipeline element.

Stroom-Stats Store

  • Type: StroomStatsStore

The Stroom-Stats Store Document is deprecated and should not be used.

Search

Documents relating to searching for data in Stroom.

Analytic Rule

  • Type: AnalyticRule

Defines an analytic rule which can be run to alert on events meeting given criteria. The criteria are defined using a StroomQL query. The analytic can be processed in different ways:

  • Streaming
  • Table Builder
  • Scheduled Query

Annotation

  • Type: Annotation

TODO - Add description

Dashboard

  • Type: Dashboard

The Dashboard Document defines a data querying and visualisation dashboard. The dashboard is highly customisable to allow querying of many different data sources of different types. Queried data can be displayed in tabular form, visualised using interactive charts/graphs or rendered as HTML.

The Dashboard Document can be used for ad-hoc querying/visualising of data, to construct a dashboard for others to view, or simply to view an already constructed dashboard. Dashboards can be parameterised so that, for example, all queries on the dashboard display data for the same user. For ad-hoc querying of data from one data source, it is recommended to use a Query instead.

Query

  • Type: Query

A Query Document defines a StroomQL query and is used to execute that query and view its results. StroomQL (Stroom Query Language) is Stroom’s own query language. It has similarities with Structured Query Language (SQL) as used in databases and is sometimes referred to as sQL to distinguish it from SQL. A Query can query many types of data source including Views, Lucene Indexes and Searchables. Searchables are the special data sources that appear at the root of the explorer tree picker when selecting a data source. Unlike an Index, they are internal data sources and not user-managed content. They provide the means to search various aspects of Stroom’s internals, such as the Meta Store or Processor Tasks.

Report

  • Type: Report

Defines a report that can be run at scheduled intervals and sent to individuals via email. The criteria are defined using a StroomQL query.

View

  • Type: View

A View is an abstraction over a data source (such as a Lucene Index) and optionally an extraction pipeline. Views provide a much simpler way for users to query data, as the user can query against the View without any knowledge of the underlying data source or how the data is extracted.

Transformation

Documents relating to the transformation of data.

Text Converter

  • Type: TextConverter

A Text Converter Document defines the specification for splitting text data into records/fields using Data Splitter or for wrapping fragment XML with an XMLFragmentParser pipeline element. The content of the Document is either XML in the data-splitter:3 namespace or a fragment parser specification (see Pipeline Recipes).

This Document is used by pipeline elements that parse text data, such as the XMLFragmentParser.

XML Schema

  • Type: XMLSchema

This Document defines an XML Schema that can be used within Stroom for validation of XML documents. XML Schema is a language used to define the permitted structure of an XML document. Validating an XML document against a schema ensures that it conforms to that schema, so that onward processing can be done with confidence that the document is correct. The content of the XML Schema Document is the XMLSchema text. This Document also defines the following:

  • Namespace URI - The XML namespace of the XMLSchema and the XML document that the schema will validate.
  • System Id - An ID (that is unique in Stroom) that can be used in the xsi:schemaLocation attribute, e.g. xsi:schemaLocation="event-logging:3 file://event-logging-v3.4.2.xsd".
  • Schema Group - A name to group multiple versions of the same schema. The SchemaFilter can be configured to only use schemas matching a configured group.
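
To show how the Namespace URI and System Id relate to an XML document, the following Python sketch reads the xsi:schemaLocation attribute from a document and splits it into pairs of namespace and location. The root element name Events is used only for illustration, and treating the location part as the System Id is an assumption based on the example above.

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"


def schema_locations(xml_text):
    """Return a mapping of namespace URI to schema location for the root element."""
    root = ET.fromstring(xml_text)
    # xsi:schemaLocation holds whitespace-separated (namespace, location) pairs.
    tokens = root.get(f"{{{XSI_NS}}}schemaLocation", "").split()
    return dict(zip(tokens[0::2], tokens[1::2]))


doc = (
    '<Events xmlns="event-logging:3" '
    'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xsi:schemaLocation="event-logging:3 file://event-logging-v3.4.2.xsd"/>'
)

print(schema_locations(doc))
# {'event-logging:3': 'file://event-logging-v3.4.2.xsd'}
```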

The XML Schema Document also provides a handy interactive viewer for viewing and navigating the XMLSchema in a graphical representation.

This Document is used by the SchemaFilter pipeline element.

XSL Translation

  • Type: XSLT

The content of this Document is an XSLT document for transforming data in a pipeline. XSLT (Extensible Stylesheet Language Transformations) is a language for transforming XML documents into other XML documents and is the primary means of transforming data in Stroom. This Document is used by the XSLTFilter pipeline element.

Last modified April 16, 2026: Merge branch '7.11' into 7.12 (3554796)