The page that you are currently viewing is for an old version of Stroom (7.1). The documentation for the latest version of Stroom (7.6) can be found using the version drop-down at the top of the screen or by clicking here.

Character Encoding

Details of the character encodings supported by Stroom.

When data is sent to Stroom the character encoding of the data should be configured for the Feed.

Supported Character Encodings

The currently supported character encodings are:

UTF-8

This is the default character encoding A variable width character encoding consisting of one to four bytes per ‘character’.

UTF-16

A variable width character encoding consisting of two or four bytes per ‘character’. UTF-16 can be encoded with either Big (UTF16-BE) or Little (UTF16-LE) Endianness depending on the system that encoded it. The Byte Order Mark will specify the endianness.

UTF-32

A fixed width character encoding consisting of four bytes per ‘character’. UTF-32 can be encoded with either Big (UTF32-BE) or Little (UTF32-LE) Endianness depending on the system that encoded it. The Byte Order Mark will specify the endianness.

ASCII

A single byte character encoding supporting only 128 characters. This character encoding has very limited use as it does not support accented characters or emojis so should be avoided for any logs that capture user input where these characters may occur.

Byte Order Mark (BOM)

A Byte Order Mark (BOM) is a special Unicode character at the start of a text stream that indicates the byte order (or endianness) of the stream. It can also be used to determine the character encoding of the stream.

Stroom can handle the presence of BOMs in the stream and can use it to determine the character encoding.

Encoding BOM
UTF8 EF BB BF
UTF16-LE FF FE
UTF16-BE FE FF
UTF32-LE FF FE 00 00
UTF32-BE 00 00 FE FF
Last modified September 3, 2024: Merge branch '7.0' into 7.1 (27ab3d5)