Parsing input data.

The following capabilities are available to parse input data:

  • XML - XML input can be parsed with the XML parser.
  • XML Fragment - Treat input data as an XML fragment, i.e. XML that has no XML declaration or root element.
  • Data Splitter - A delimiter and regular expression based language for turning non-XML data (e.g. CSV) into XML.

1 - XML Fragments

Handling XML data without root level elements.

Some input XML data may be missing an XML declaration and a root-level enclosing element. Such data is not a valid XML document and must be treated as an XML fragment. To use XML fragments, the input type for a translation must be set to ‘XML Fragment’. A fragment wrapper must also be defined in the XML conversion; it tells Stroom which declaration and root element to place around the XML fragment data.

Here is an example:

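<?xml version="1.1" encoding="UTF-8"?>
<!DOCTYPE records [
<!ENTITY fragment SYSTEM "fragment">
]>
<records xmlns="records:2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="records:2 file://records-v2.0.xsd" version="2.0">
  &fragment;
</records>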

During conversion, Stroom replaces the fragment entity in the wrapper with the input XML fragment data. Note that XML fragments must still be well-formed so that they can be parsed correctly.
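
The effect of a fragment wrapper can be illustrated outside Stroom with a small Python sketch. This is not how Stroom implements it: Stroom resolves the entity inside its XML parser, whereas the sketch simply treats the entity reference as a text placeholder. The simplified wrapper, the function name wrap_fragment and the sample record data are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified wrapper. The real wrapper declares an external
# entity; here "&fragment;" is treated as a plain text placeholder.
WRAPPER = '<records xmlns="records:2" version="2.0">&fragment;</records>'


def wrap_fragment(fragment: str, wrapper: str = WRAPPER) -> ET.Element:
    """Substitute the fragment into the wrapper and parse the result."""
    document = wrapper.replace("&fragment;", fragment)
    # Raises ET.ParseError if the combined document is not well-formed.
    return ET.fromstring(document)


# Illustrative fragment: two sibling elements, no declaration, no root.
fragment = (
    "<record><data name='a' value='1'/></record>\n"
    "<record><data name='a' value='2'/></record>"
)

root = wrap_fragment(fragment)
record_count = len(root)  # number of top-level elements from the fragment
```

On its own the fragment cannot be parsed as a document because it has two top-level elements; once wrapped, both become children of the single root. A fragment that is not well-formed, such as one with an unclosed tag, still fails to parse after wrapping.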