Stroom Quick-Start Tutorial
In this quick-start guide you will learn how to use Stroom to get from a CSV file that looks like this:
id,guid,from_ip,to_ip,application
1,10990cde-1084-4006-aaf3-7fe52b62ce06,188.8.131.52,184.108.40.206,Tres-Zap
2,633aa1a8-04ff-442d-ad9a-03ce9166a63a,220.127.116.11,18.104.22.168,Sub-Ex
...
To this XML:
<?xml version="1.1" encoding="UTF-8"?>
<Events xmlns:stroom="stroom" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Event>
    <Id>1</Id>
    <Guid>10990cde-1084-4006-aaf3-7fe52b62ce06</Guid>
    <FromIp>22.214.171.124</FromIp>
    <ToIp>126.96.36.199</ToIp>
    <Application>Tres-Zap</Application>
  </Event>
  <Event>
    <Id>2</Id>
    <Guid>633aa1a8-04ff-442d-ad9a-03ce9166a63a</Guid>
    <FromIp>188.8.131.52</FromIp>
    <ToIp>184.108.40.206</ToIp>
    <Application>Sub-Ex</Application>
  </Event>
  ...
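To make the target shape concrete before any Stroom configuration, here is a minimal Python sketch of the same column-to-element mapping: each CSV row becomes an Event, and each snake_case column name becomes a CamelCase element. This is only an illustration and not how Stroom does the conversion; the tutorial builds a Stroom pipeline for that. The names SAMPLE_CSV, to_element_name, and csv_to_events are hypothetical, and the sketch omits the XML declaration and namespace attributes.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Two rows copied from the sample CSV above (illustrative input only).
SAMPLE_CSV = """id,guid,from_ip,to_ip,application
1,10990cde-1084-4006-aaf3-7fe52b62ce06,188.8.131.52,184.108.40.206,Tres-Zap
2,633aa1a8-04ff-442d-ad9a-03ce9166a63a,220.127.116.11,18.104.22.168,Sub-Ex
"""


def to_element_name(column):
    """Turn a snake_case column name into a CamelCase element name."""
    return "".join(part.capitalize() for part in column.split("_"))


def csv_to_events(text):
    """Build an <Events> tree with one <Event> child per CSV row."""
    events = ET.Element("Events")
    for row in csv.DictReader(io.StringIO(text)):
        event = ET.SubElement(events, "Event")
        # One child element per column, in header order.
        for column, value in row.items():
            ET.SubElement(event, to_element_name(column)).text = value
    return events


if __name__ == "__main__":
    root = csv_to_events(SAMPLE_CSV)
    print(ET.tostring(root, encoding="unicode"))
```

Running the script prints a single-line Events document with two Event children. In Stroom the equivalent mapping is expressed as pipeline configuration rather than as a script, which is what the rest of this guide sets up.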
You will go from a clean, vanilla Stroom to a simple pipeline that takes in CSV data and outputs that data transformed into XML. Stroom is a generic and powerful tool for ingesting and processing data. Its flexibility comes from being generic, so if you want to start processing data we recommend that you follow this tutorial; otherwise you may find yourself struggling.
We're going to do the following:
- Get, configure, and run Stroom
- Get some data into Stroom
- Set up a pipeline to process the data
- Index the data
- Show the data on a dashboard
Everything we create here is also available as a content pack, so if you just want to see it running you can get there quite easily.
Note: The CSV data in mock_stroom_data.csv (linked to above) is randomly generated, and any association with any real-world IP address or name is entirely coincidental.