Installation Guide
- 1: Setup
- 1.1: Apache Forwarding
- 1.2: Java Key Store Setup
- 1.3: MySQL Setup
- 1.4: Processing Users
- 1.5: Securing Stroom
- 2: Stroom 5 Installation
- 3: Stroom 6 Architecture & Deployment
- 4: Stroom 6 Installation
- 5: Stroom Proxy Installation
- 6: Stroom Upgrades
- 7: Upgrades
- 7.1: v6 to v7 Upgrade
- 8: Configuration
- 8.1: Nginx Configuration
- 8.2: Stroom Configuration
- 8.3: Stroom Proxy Configuration
- 8.4: Stroom Log Sender Configuration
- 8.5: MySQL Configuration
1 - Setup
1.1 - Apache Forwarding
Warning
This document refers to v4/5.
Stroom defaults to listening for HTTP on port 8080. It is recommended that Apache is used to listen on the standard HTTP port 80 and forward requests on via the Apache mod_jk module and the AJP protocol (on 8009). Apache can also perform HTTPS on port 443 and pass over requests to Tomcat using the same AJP protocol.
It is additionally recommended that Stroom Proxy is used to front data ingest and so Apache is configured to route traffic to http(s)://server/stroom/datafeed to Stroom Proxy and anything else to Stroom.
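For example, this routing could be expressed with mod_jk mounts along the following lines. This is a sketch: the worker name proxylocal is an assumption for a worker pointing at Stroom Proxy's AJP port, and is not part of the generated workers.properties.

```
# Data ingest goes to Stroom Proxy (assumed worker name "proxylocal")
JkMount /stroom/datafeed* proxylocal
# Everything else under /stroom goes to Stroom itself
JkMount /stroom* local
```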
Prerequisites
- tomcat-connectors-1.2.31-src.tar.gz
Setup Apache
- As root
- Patch mod_jk
cd ~/tmp
tar -xvzf tomcat-connectors-1.2.31-src.tar.gz
cd tomcat-connectors-1.2.31-src/native
./configure --with-apxs=/usr/sbin/apxs
make
sudo cp apache-2.0/mod_jk.so /etc/httpd/modules/
cd
- Put the web server cert, private key, and CA cert into the web server's conf directory /etc/httpd/conf. E.g.
[user@node1 stroom-doc]$ ls -al /etc/httpd/conf
....
-rw-r--r-- 1 root root 1729 Aug 27 2013 host.crt
-rw-r--r-- 1 root root 1675 Aug 27 2013 host.key
-rw-r--r-- 1 root root 1289 Aug 27 2013 CA.crt
....
- Make changes to /etc/httpd/conf.d/ssl.conf as below
JkMount /stroom* local
JkMount /stroom/remoting/cluster* local
JkOptions +ForwardKeySize +ForwardURICompat +ForwardSSLCertChain -ForwardDirectories
SSLCertificateFile /etc/httpd/conf/[YOUR SERVER].crt
SSLCertificateKeyFile /etc/httpd/conf/[YOUR SERVER].key
SSLCertificateChainFile /etc/httpd/conf/[YOUR CA].crt
SSLCACertificateFile /etc/httpd/conf/[YOUR CA APPENDED LIST].crt
SSLOptions +ExportCertData
- Remove /etc/httpd/conf.d/nss.conf to avoid an 8443 port clash
rm /etc/httpd/conf.d/nss.conf
- Create a /etc/httpd/conf.d/mod_jk.conf configuration
LoadModule jk_module modules/mod_jk.so
JkWorkersFile conf/workers.properties
JkLogFile logs/mod_jk.log
JkLogLevel info
JkLogStampFormat "[%a %b %d %H:%M:%S %Y]"
JkOptions +ForwardKeySize +ForwardURICompat +ForwardSSLCertChain -ForwardDirectories
JkRequestLogFormat "%w %V %T"
JkMount /stroom* local
JkMount /stroom/remoting/cluster* local
JkShmFile logs/jk.shm
<Location /jkstatus/>
JkMount status
Order deny,allow
Deny from all
Allow from 127.0.0.1
</Location>
- Set up stroom-setup/cluster.txt, generate the workers file, and copy it into Apache (as root; replace stroomuser with your processing user)
/home/stroomuser/stroom-setup/workers.properties.sh --cluster=/home/stroomuser/cluster.txt > /etc/httpd/conf/workers.properties
- Inspect /etc/httpd/conf/workers.properties to make sure it looks as you expect for your cluster
worker.list=loadbalancer,local,status
worker.stroom_1.port=8009
worker.stroom_1.host=localhost
worker.stroom_1.type=ajp13
worker.stroom_1.lbfactor=1
worker.stroom_1.max_packet_size=65536
....
....
worker.loadbalancer.type=lb
worker.loadbalancer.balance_workers=stroom_1,stroom_2
worker.loadbalancer.sticky_session=1
worker.local.type=lb
worker.local.balance_workers=stroom_1
worker.local.sticky_session=1
worker.status.type=status
- Create a simple redirect page to the stroom web app for the root URL (e.g. DocumentRoot "/var/www/html", index.html)
<html><head><meta http-equiv="Refresh" content="0; URL=stroom"></head></html>
- Restart Apache and then test default http / https access.
sudo /etc/init.d/httpd restart
Advanced Forwarding
Typically Stroom is set up so that traffic sent to /stroom* is routed to Stroom and /stroom/datafeed to Stroom Proxy. It is possible to set up one extra level of datafeed routing so that this traffic can be routed differently based on the URL.
For example to route traffic directly to Stroom under the URL /stroom/datafeed/direct (avoiding any aggregation) the following mod_jk setting could be used.
JkMount /stroom/datafeed/direct* loadbalancer
1.2 - Java Key Store Setup
For the Java process to communicate over HTTPS (for example, Stroom Proxy forwarding onto Stroom), the JVM requires the relevant keystores to be set up.
As the processing user, copy the following files to a directory stroom-jks in the processing user's home directory:
- CA.crt - Certificate Authority
- SERVER.crt - Server certificate with client authentication attributes
- SERVER.key - Server private key
As the processing user perform the following:
- First turn your keys into der format:
cd ~/stroom-jks
SERVER=<SERVER crt/key PREFIX>
AUTHORITY=CA
openssl x509 -in ${SERVER}.crt -inform PEM -out ${SERVER}.crt.der -outform DER
openssl pkcs8 -topk8 -nocrypt -in ${SERVER}.key -inform PEM -out ${SERVER}.key.der -outform DER
- Import Keys into the Key Stores:
Stroom_UTIL_JAR=`find ~/*app -name 'stroom-util*.jar' -print | head -1`
java -cp ${Stroom_UTIL_JAR} stroom.util.cert.ImportKey keystore=${SERVER}.jks keypass=${SERVER} alias=${SERVER} keyfile=${SERVER}.key.der certfile=${SERVER}.crt.der
keytool -import -noprompt -alias ${AUTHORITY} -file ${AUTHORITY}.crt -keystore ${AUTHORITY}.jks -storepass ${AUTHORITY}
- Update Processing User Global Java Settings:
PWD=`pwd`
echo "export JAVA_OPTS=\"-Djavax.net.ssl.trustStore=${PWD}/${AUTHORITY}.jks -Djavax.net.ssl.trustStorePassword=${AUTHORITY} -Djavax.net.ssl.keyStore=${PWD}/${SERVER}.jks -Djavax.net.ssl.keyStorePassword=${SERVER}\"" >> ~/env.sh
Any Stroom or Stroom Proxy instance will now additionally pick up the above JAVA_OPTS settings.
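As a quick sanity check, the JAVA_OPTS value built by the commands above can be previewed before (or after) it is appended to ~/env.sh. This is a sketch: the SERVER and AUTHORITY values shown are example placeholders, not fixed names.

```shell
# Rebuild the JAVA_OPTS string from this section's variables and print it.
# SERVER/AUTHORITY and the stroom-jks directory are example values (assumptions).
STROOM_JKS_DIR="$HOME/stroom-jks"
SERVER=host      # your server crt/key prefix
AUTHORITY=CA
JAVA_OPTS="-Djavax.net.ssl.trustStore=${STROOM_JKS_DIR}/${AUTHORITY}.jks \
-Djavax.net.ssl.trustStorePassword=${AUTHORITY} \
-Djavax.net.ssl.keyStore=${STROOM_JKS_DIR}/${SERVER}.jks \
-Djavax.net.ssl.keyStorePassword=${SERVER}"
echo "$JAVA_OPTS"
```

Compare the printed line with the JAVA_OPTS entry in ~/env.sh; keytool -list -keystore can also be used to confirm each store holds the expected entry.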
1.3 - MySQL Setup
Prerequisites
- MySQL 5.5.y server installed (e.g. yum install mysql-server)
- Processing User Setup
A single MySQL database is required for each Stroom instance. You do not need to setup a MySQL instance per node in your cluster.
Check Database installed and running
[root@stroomdb ~]# /sbin/chkconfig --list mysqld
mysqld 0:off 1:off 2:on 3:on 4:on 5:on 6:off
[root@stroomdb ~]# mysql --user=root -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
...
mysql> quit
The following commands can be used to auto start mysql if required:
[root@stroomdb ~]# /sbin/chkconfig --level 345 mysqld on
[root@stroomdb ~]# /sbin/service mysqld start
Overview
MySQL configuration can range from simple to complex depending on your requirements. For a very simple configuration you just need an out-of-the-box MySQL install and a database user account.
Things get more complicated when considering:
- Security
- Master Slave Replication
- Tuning memory usage
- Running Stroom Stats in a different database to Stroom
- Performance Monitoring
Simple Install
Ensure the database is running, then create the databases and grant access to them:
[stroomuser@host stroom-setup]$ mysql --user=root
Welcome to the MySQL monitor. Commands end with ; or \g.
...
mysql> create database stroom;
Query OK, 1 row affected (0.02 sec)
mysql> grant all privileges on stroom.* to 'stroomuser'@'host' identified by 'password';
Query OK, 0 rows affected (0.00 sec)
mysql> create database stroom_stats;
Query OK, 1 row affected (0.02 sec)
mysql> grant all privileges on stroom_stats.* to 'stroomuser'@'host' identified by 'password';
Query OK, 0 rows affected (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
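To sanity check the account, the grants can be inspected from the MySQL client (the 'host' value matches the grant statements above):

```sql
mysql> show grants for 'stroomuser'@'host';
```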
Advanced Security
It is recommended to run /usr/bin/mysql_secure_installation to remove the test database and anonymous accounts.
./stroom-setup/mysql_grant.sh is a utility script that creates accounts for you to use within a cluster (or single node setup). Run to see the options:
[stroomuser@host stroom-setup]$ ./mysql_grant.sh
usage : --name=<instance name (defaults to my for /etc/my.cnf)>
--user=<the stroom user for the db>
--password=<the stroom password for the db>
--cluster=<the file with a line per node in the cluster>
--user=<db user> Must be set
N.B. name is used when multiple mysql instances are set up (see below).
You need to create a file cluster.txt with a line for each member of your cluster (or single line in the case of a one node Stroom install). Then run the utility script to lock down the server access.
[stroomuser@host ~]$ hostname >> cluster.txt
[stroomuser@host ~]$ ./stroom-setup/mysql_grant.sh --name=mysql56_dev --user=stroomuser --password= --cluster=cluster.txt
Enter root mysql password :
--------------
flush privileges
--------------
--------------
delete from mysql.user where user = 'stroomuser'
--------------
...
...
...
--------------
flush privileges
--------------
[stroomuser@host ~]$
Advanced Install
The example below uses the utility scripts to create 3 custom mysql server instances across 2 servers:
- server1 - master stroom
- server2 - slave stroom, stroom_stats
As root on server1:
yum install "mysql56-mysql-server"
Create the master database:
[root@node1 stroomuser]# ./stroom-setup/mysqld_instance.sh --name=mysqld56_stroom --port=3106 --server=mysqld56 --os=rhel6
--master not set ... assuming master database
Wrote base files in tmp (You need to move them as root). cp /tmp/mysqld56_stroom /etc/init.d/mysqld56_stroom; cp /tmp/mysqld56_stroom.cnf /etc/mysqld56_stroom.cnf
Run mysql client with mysql --defaults-file=/etc/mysqld56_stroom.cnf
[root@node1 stroomuser]# cp /tmp/mysqld56_stroom /etc/init.d/mysqld56_stroom; cp /tmp/mysqld56_stroom.cnf /etc/mysqld56_stroom.cnf
[root@node1 stroomuser]# /etc/init.d/mysqld56_stroom start
Initializing MySQL database: Installing MySQL system tables...
OK
Filling help tables...
...
...
Starting mysql56-mysqld: [ OK ]
Check Start up Settings Correct
[root@node2 stroomuser]# chkconfig mysqld off
[root@node2 stroomuser]# chkconfig mysql56-mysqld off
[root@node1 stroomuser]# chkconfig --add mysqld56_stroom
[root@node1 stroomuser]# chkconfig mysqld56_stroom on
[root@node2 stroomuser]# chkconfig --list | grep mysql
mysql56-mysqld 0:off 1:off 2:off 3:off 4:off 5:off 6:off
mysqld 0:off 1:off 2:off 3:off 4:off 5:off 6:off
mysqld56_stroom 0:off 1:off 2:on 3:on 4:on 5:on 6:off
mysqld56_stats 0:off 1:off 2:on 3:on 4:on 5:on 6:off
Create a text file with all members of the cluster:
[root@node1 stroomuser]# vi cluster.txt
node1.my.org
node2.my.org
node3.my.org
node4.my.org
Create the grants:
[root@node1 stroomuser]# ./stroom-setup/mysql_grant.sh --name=mysqld56_stroom --user=stroomuser --password=password --cluster=cluster.txt
As root on server2:
[root@node2 stroomuser]# yum install "mysql56-mysql-server"
[root@node2 stroomuser]# ./stroom-setup/mysqld_instance.sh --name=mysqld56_stroom --port=3106 --server=mysqld56 --os=rhel6 --master=node1.my.org --user=stroomuser --password=password
--master set ... assuming slave database
Wrote base files in tmp (You need to move them as root). cp /tmp/mysqld56_stroom /etc/init.d/mysqld56_stroom; cp /tmp/mysqld56_stroom.cnf /etc/mysqld56_stroom.cnf
Run mysql client with mysql --defaults-file=/etc/mysqld56_stroom.cnf
[root@node2 stroomuser]# cp /tmp/mysqld56_stroom /etc/init.d/mysqld56_stroom; cp /tmp/mysqld56_stroom.cnf /etc/mysqld56_stroom.cnf
[root@node1 stroomuser]# /etc/init.d/mysqld56_stroom start
Initializing MySQL database: Installing MySQL system tables...
OK
Filling help tables...
...
...
Starting mysql56-mysqld: [ OK ]
Check Start up Settings Correct
[root@node2 stroomuser]# chkconfig mysqld off
[root@node2 stroomuser]# chkconfig mysql56-mysqld off
[root@node1 stroomuser]# chkconfig --add mysqld56_stroom
[root@node1 stroomuser]# chkconfig mysqld56_stroom on
[root@node2 stroomuser]# chkconfig --list | grep mysql
mysql56-mysqld 0:off 1:off 2:off 3:off 4:off 5:off 6:off
mysqld 0:off 1:off 2:off 3:off 4:off 5:off 6:off
mysqld56_stroom 0:off 1:off 2:on 3:on 4:on 5:on 6:off
Create the grants:
[root@node1 stroomuser]# ./stroom-setup/mysql_grant.sh --name=mysqld56_stroom --user=stroomuser --password=password --cluster=cluster.txt
Make the slave database start to follow:
[root@node2 stroomuser]# cat /etc/mysqld56_stroom.cnf | grep "change master"
# change master to MASTER_HOST='node1.my.org', MASTER_PORT=3106, MASTER_USER='stroomuser', MASTER_PASSWORD='password';
[root@node2 stroomuser]# mysql --defaults-file=/etc/mysqld56_stroom.cnf
mysql> change master to MASTER_HOST='node1.my.org', MASTER_PORT=3106, MASTER_USER='stroomuser', MASTER_PASSWORD='password';
mysql> start slave;
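Replication can then be verified on the slave in the same mysql client; the Slave_IO_Running and Slave_SQL_Running columns should both report Yes:

```sql
mysql> show slave status\G
```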
As processing user on server1:
[stroomuser@node1 ~]$ mysql --defaults-file=/etc/mysqld56_stroom.cnf --user=stroomuser --password=password
mysql> create database stroom;
Query OK, 1 row affected (0.00 sec)
mysql> use stroom;
Database changed
mysql> create table test (a int);
Query OK, 0 rows affected (0.05 sec)
As processing user on server2, check the server is replicating OK:
[stroomuser@node2 ~]$ mysql --defaults-file=/etc/mysqld56_stroom.cnf --user=stroomuser --password=password
mysql> show create table test;
+-------+----------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+----------------------------------------------------------------------------------------+
| test | CREATE TABLE `test` (`a` int(11) DEFAULT NULL ) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+-------+----------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
As root on server2:
[root@node2 stroomuser]# /home/stroomuser/stroom-setup/mysqld_instance.sh --name=mysqld56_stats --port=3206 --server=mysqld56 --os=rhel6 --user=statsuser --password=password
[root@node2 stroomuser]# cp /tmp/mysqld56_stats /etc/init.d/mysqld56_stats; cp /tmp/mysqld56_stats.cnf /etc/mysqld56_stats.cnf
[root@node2 stroomuser]# /etc/init.d/mysqld56_stats start
[root@node2 stroomuser]# chkconfig mysqld56_stats on
Create the grants:
[root@node2 stroomuser]# ./stroom-setup/mysql_grant.sh --name=mysqld56_stats --database=stats --user=stroomstats --password=password --cluster=cluster.txt
As processing user create the database:
[stroomuser@node2 ~]$ mysql --defaults-file=/etc/mysqld56_stats.cnf --user=stroomstats --password=password
Welcome to the MySQL monitor. Commands end with ; or \g.
....
mysql> create database stats;
Query OK, 1 row affected (0.00 sec)
1.4 - Processing Users
Processing User Setup
Stroom / Stroom Proxy should be run under a processing user (we assume stroomuser below).
- Setup this user
/usr/sbin/adduser --system stroomuser
- You may want to allow normal accounts to sudo to this account for maintenance (visudo)
- Create a service script to start/stop on server startup (as root).
vi /etc/init.d/stroomuser
#!/bin/bash
#
# stroomuser This shell script takes care of starting and stopping
# the stroomuser subsystem (tomcat6, etc)
#
# chkconfig: - 86 14
# description: stroomuser is the stroomuser sub system
Stroom_USER=stroomuser
case $1 in
start)
/bin/su ${Stroom_USER} /home/${Stroom_USER}/stroom-deploy/start.sh
;;
stop)
/bin/su ${Stroom_USER} /home/${Stroom_USER}/stroom-deploy/stop.sh
;;
restart)
/bin/su ${Stroom_USER} /home/${Stroom_USER}/stroom-deploy/stop.sh
/bin/su ${Stroom_USER} /home/${Stroom_USER}/stroom-deploy/start.sh
;;
esac
exit 0
- Initialise Script
/bin/chmod +x /etc/init.d/stroomuser
/sbin/chkconfig --level 345 stroomuser on
Install Java 8
yum install java-1.8.0-openjdk.x86_64
yum install java-1.8.0-openjdk-devel.x86_64
Setup Deployment Scripts
- As the processing user, unpack the stroom-deploy-X-Y-Z-bin.zip generic deployment scripts in the processing user's home directory.
unzip stroom-deploy-5.0.beta1-bin.zip
- Setup env.sh to include JAVA_HOME to point to the installed directory of the JDK (this will be platform specific). vi ~/env.sh
# User specific aliases and functions
export JAVA_HOME=/usr/lib/jvm/java-1.8.0
export PATH=${JAVA_HOME}/bin:${PATH}
- Setup users profile to include the same. vi ~/.bashrc
# User specific aliases and functions
. ~/env.sh
- Check that java is installed OK
[stroomuser@node1 ~]$ . .bashrc
[stroomuser@node1 ~]$ which java
/usr/lib/jvm/java-1.8.0/bin/java
[stroomuser@node1 ~]$ which javac
/usr/lib/jvm/java-1.8.0/bin/javac
[stroomuser@node1 ~]$ java -version
openjdk version "1.8.0_65"
OpenJDK Runtime Environment (build 1.8.0_65-b17)
OpenJDK 64-Bit Server VM (build 25.65-b01, mixed mode)
- Setup auto deployment crontab script as below (crontab -e)
[stroomuser@node1 ~]$ crontab -l
# Deploy Script
0,5,10,15,20,25,30,35,40,45,50,55 * * * * /home/stroomuser/stroom-deploy/deploy.sh >> /home/stroomuser/stroom-deploy.log
59 0 * * * rm -f /home/stroomuser/stroom-deploy.log
# Clean system
0 0 * * * /home/stroomuser/stroom-deploy/clean.sh > /dev/null
1.5 - Securing Stroom
NOTE This document was written for stroom v4/5. Some parts may not be applicable for v6+.
Firewall
The following firewall configuration is recommended:
- Outside the cluster, drop all access except HTTP 80, HTTPS 443, and any other system ports you require (SSH, etc.)
- Within cluster allow all access
This will enable nodes within the cluster to communicate on:
- Native tomcat HTTP 8080, 9080
- Tomcat AJP 8009, 9009
- MySQL 3306
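The intra-cluster ports above could be opened with commands generated as below. This is a sketch assuming firewalld and the standard MySQL port 3306; adjust the port list for your deployment.

```shell
# Generate firewalld commands for the intra-cluster ports listed above.
# Run the printed commands as root on each cluster node.
for port in 8080 9080 8009 9009 3306; do
  echo "firewall-cmd --zone=public --permanent --add-port=${port}/tcp"
done
echo "firewall-cmd --reload"
```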
MySQL
- It is recommended that you run mysql_secure_installation to set a root password and remove the test database:
mysql_secure_installation (provide a root password)
- Set root password? [Y/n] Y
- Remove anonymous users? [Y/n] Y
- Disallow root login remotely? [Y/n] Y
- Remove test database and access to it? [Y/n] Y
- Reload privilege tables now? [Y/n] Y
- stroom-setup includes a version of this script designed to be run on instances created using mysqld_instance.sh (i.e. non-standard or multiple instances of MySQL)
[stroomuser@stroom_1 stroom-setup]$ ./mysql_secure_installation.sh --name=mysqld_ref1m
2 - Stroom 5 Installation
Warning
This document was written for stroom v4/5. It is not applicable for v6+.
Prerequisites
- Install file stroom-app-distribution-X-Y-Z-bin.zip. All the pre-built binaries are available on GitHub.
- MySQL Server 5.5
- JDK8
- Temporarily allow port 8080, if not relying on Apache Forwarding.
Installing Stroom
Unpack the distribution stroom-app-distribution-X-Y-Z-bin.zip:
unzip stroom-app-distribution-X-Y-Z-bin.zip
In bin are scripts for configuring, starting, and stopping Stroom.
Configuring
The setup.sh script will ask a series of questions to help you configure Stroom.
./bin/setup.sh
The parameters are:
- TEMP_DIR - This is where Stroom will write some temporary files, e.g. imports/exports. Only change this if you do not want to use ‘/tmp’.
- NODE - Each Stroom instance in the cluster needs a unique name; if this is a reinstall, ensure you use the name from the previous deployment. This name needs to match the name used in your workers.properties (e.g. 'node1' in the case of 'node1.my.org')
- RACK - Used to group nodes together (so that, for example, nodes near each other process data held near them)
- PORT_PREFIX - By default Stroom will run on port 8080
- JDBC_CLASSNAME, JDBC URL, DB USERNAME, DB PASSWORD - MySQL connection details for the stroom database
- JPA DIALECT - Leave blank to use MySQL
- JAVA OPTS - By default this is '-Xms1g -Xmx8g'. Stroom performs better if you use most of the server's memory, so change the maximum memory setting (Xmx) accordingly, e.g. -Xmx40g will use 40 GB.
- STROOM_STATISTICS_SQL_JDBC_CLASSNAME, STROOM_STATISTICS_SQL_JDBC URL, STROOM_STATISTICS_SQL_DB USERNAME, STROOM_STATISTICS_SQL_DB PASSWORD - MySQL connection details for the statistics database
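As a sketch, the database answers might end up looking like this. com.mysql.jdbc.Driver is the MySQL Connector/J 5 driver class; the hostnames, ports and credentials are placeholders, and the exact property names written by setup.sh may differ.

```
JDBC_CLASSNAME=com.mysql.jdbc.Driver
JDBC_URL=jdbc:mysql://localhost:3306/stroom?useUnicode=yes&characterEncoding=UTF-8
DB_USERNAME=stroomuser
DB_PASSWORD=password
STROOM_STATISTICS_SQL_JDBC_CLASSNAME=com.mysql.jdbc.Driver
STROOM_STATISTICS_SQL_JDBC_URL=jdbc:mysql://localhost:3306/stroom_stats?useUnicode=yes&characterEncoding=UTF-8
```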
Running
Start the configured instance:
./bin/start.sh
Inspect the logs:
tail -f instance/logs/stroom.log
Other things you might want to configure:
3 - Stroom 6 Architecture & Deployment
The diagram below shows the logical architecture of Stroom v6. It is not concerned with how/where the various services are deployed. This page describes a reference architecture and deployment for stroom, but it is possible to deploy the various services in many different ways, e.g. using a different web server from Nginx or introducing load balancers.
Nginx
In stroom v6, a central Nginx is key to the whole architecture. It acts in the following capacities:
- A reverse proxy to abstract clients from the multiple service instances.
- An API gateway for all service traffic.
- The termination point for client SSL traffic.
Reverse Proxy
Nginx is used to reverse proxy all client connections (even those from within the estate) to the various services that sit behind it.
For example, a client request to https://nginx-host/stroom will be reverse proxied to http://a-stroom-host:8080/stroom.
Nginx is responsible for selecting the upstream server to reverse proxy to.
It is possible to use multiple instances of Nginx for redundancy or improved performance, however care needs to be taken to ensure all requests for a session go to the same Nginx instance, i.e. sticky sessions.
Some requests are stateful and some are stateless but the Nginx config will reverse proxy them accordingly.
API Gateway
Nginx is also used as an API gateway. This means all inter-service calls go via the Nginx gateway so each service only needs to know the location of the Nginx gateway. Nginx will then reverse proxy all requests to the appropriate instance of an upstream service.
The grey dashed lines on the diagram attempt to show the effective inter-service connections that are being made if you ignore the Nginx reverse proxying.
SSL Termination
All SSL termination is handled by Nginx. Nginx holds the server and certificate authority certificate and will authenticate the client requests if the client has a certificate. Any client certificate details will be passed on to the service that is being reverse proxied.
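A minimal sketch of what this looks like in Nginx terms is shown below. The paths, upstream host and the header names used to pass on client certificate details are assumptions for illustration, not Stroom's shipped configuration.

```
server {
    listen 443 ssl;
    ssl_certificate         /etc/nginx/certs/server.pem.crt;   # server certificate
    ssl_certificate_key     /etc/nginx/certs/server.key;
    ssl_client_certificate  /etc/nginx/certs/ca.pem.crt;       # CA used to verify clients
    ssl_verify_client       optional;                          # authenticate a client cert if presented

    location /stroom {
        proxy_pass http://a-stroom-host:8080/stroom;
        # pass the verified client certificate details on to the upstream service
        proxy_set_header X-SSL-CLIENT-S-DN    $ssl_client_s_dn;
        proxy_set_header X-SSL-CLIENT-VERIFIED $ssl_client_verify;
    }
}
```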
Physical Deployment
Single Node Docker Deployment
The simplest deployment of stroom is where all services are on a single host and each service runs in its own docker container. Such a deployment can be achieved by following these instructions.
The following diagram shows how a single node deployment would look.
Multi Node Mixed Deployment
The typical deployment for a large scale stroom is where stroom is run on multiple hosts to scale out the processing. In this deployment stroom and MySQL are run directly on the host OS, i.e. without docker. This approach was taken to gradually introduce docker into the stroom deployment strategies.
The following diagram shows how a multi node deployment would look.
Multi Node All docker Deployment
The aim in future is to run all services in docker in a multi node deployment. Such a deployment is still under development and will likely involve kubernetes for container orchestration.
4 - Stroom 6 Installation
We would welcome feedback on this documentation.
Running on a single box
Running a release
Download a release, for example Stroom Core v6.0 Beta 3, unpack it, and run the start.sh script. When you’ve given it some time to start up go to http://localhost/stroom. There’s a README.md file inside the tar.gz with more information.
Post-install hardening
Before first run
Change database passwords
If you don’t do this before the first run of Stroom then the passwords will already be set and you’ll have to change them on the database manually, and then change the .env.
This change should be made in the .env configuration file. If the values are not there then this service is not included in your Stroom stack and there is nothing to change.
- STROOM_DB_PASSWORD
- STROOM_DB_ROOT_PASSWORD
- STROOM_STATS_DB_ROOT_PASSWORD
- STROOM_STATS_DB_PASSWORD
- STROOM_AUTH_DB_PASSWORD
- STROOM_AUTH_DB_ROOT_PASSWORD
- STROOM_ANNOTATIONS_DB_PASSWORD
- STROOM_ANNOTATIONS_DB_ROOT_PASSWORD
On first run
Create yourself an account
After first logging in as admin you should create yourself a normal account (using your email address) and add yourself to the Administrators group. You should then log out of admin, log in with your new administrator account and then disable the admin account.
If you decide to use the admin account as your normal account you might find yourself locked out. The admin account has no associated email address, so the Reset Password feature will not work if your account is locked. It might become locked if you enter your password incorrectly too many times.
Delete un-used users and API keys
- If you’re not using stats you can delete or disable the following:
  - the user statsServiceUser
  - the API key for statsServiceUser
Change the API keys
First generate new API keys. You can generate a new API key using Stroom, under Tools -> API Keys. The following need to be changed:
- STROOM_SECURITY_API_TOKEN - this is the API token for user stroomServiceUser.
Then stop Stroom and update the API key in the .env configuration file with the new value.
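For example, the relevant line in the .env file would be updated along these lines (the value shown is a placeholder for the key you generated):

```
STROOM_SECURITY_API_TOKEN="<new API key for stroomServiceUser>"
```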
Troubleshooting
I’m trying to use certificate logins (PKI) but I keep being prompted for the username and password!
You need to be sure of several things:
- When a user arrives at Stroom the first thing Stroom does is redirect the user to the authentication service. This is when the certificate is checked. If this redirect doesn’t use HTTPS then nginx will not get the cert and will not send it onwards to the authentication service. Remember that all of this stuff, apart from back-channel/service-to-service chatter, goes through nginx. The env var that needs to use HTTPS is STROOM_AUTHENTICATION_SERVICE_URL. Note that this is the var Stroom looks for, not the var as set in the stack, so you’ll find it in the stack YAML.
- Are your certs configured properly? If nginx isn’t able to decode the incoming cert for some reason then it won’t pass anything on to the service.
- Is your browser sending certificates?
5 - Stroom Proxy Installation
There are 2 versions of the stroom software available for building a proxy server.
There is an ‘app’ version that runs stroom proxy as a Java ARchive (jar) file locally on the server and has settings contained in a configuration file that controls access to the stroom server and database.
The other version runs stroom proxy within docker containers and also has a settings configuration file that controls access to the stroom server and database.
This document covers the installation and configuration of the stroom proxy software for both the docker and ‘app’ versions.
Assumptions
The following assumptions are used in this document.
- the user has reasonable RHEL/CentOS System administration skills.
- installation is on a fully patched minimal CentOS 7 instance.
- the Stroom database has been created and resides on the host stroomdb0.strmdev00.org listening on port 3307.
- the Stroom database user is stroomuser with a password of Stroompassword1@.
- the application user stroomuser has been created.
- the user is or has deployed the two node Stroom cluster described here.
- the user has set up the Stroom processing user as described here.
- the prerequisite software has been installed.
- when a screen capture is documented, data entry is identified by the data surrounded by ‘<’ ‘>’ . This excludes enter/return presses.
Stroom Remote Proxy (docker version)
The build of a stroom proxy where the stroom applications are running in docker containers.
The operating system (OS) build for a ‘dockerised’ stroom proxy is minimal RHEL/CentOS 7 plus the docker-ce & docker-compose packages.
Neither of the prerequisites is available from the CentOS distribution.
It will also be necessary to open additional ports on the system firewall (where appropriate).
Download and install docker
To download and install docker-ce from the internet, first download a new ‘repo’ file that provides access to the docker.com repository, e.g. as root user:
- wget https://download.docker.com/linux/centos/docker-ce.repo -O /etc/yum.repos.d/docker-ce.repo
- yum install docker-ce.x86_64
The packages docker-ce, docker-ce-cli and containerd.io will be installed.
The docker-compose software can be downloaded from GitHub, e.g. as root user, to download docker-compose version 1.25.4 and save it to /usr/local/bin/docker-compose:
- curl -L https://github.com/docker/compose/releases/download/1.25.4/docker-compose-Linux-x86_64 -o /usr/local/bin/docker-compose
- chmod 755 /usr/local/bin/docker-compose
Firewall Configuration
If you have a firewall running, additional ports will need to be opened to allow the Docker containers to talk to each other.
Currently these ports are:
- 3307
- 8080
- 8081
- 8090
- 8091
- 8543
- 5000
- 2888
- 443
- 80
For example on a RHEL/CentOS server using firewalld, the commands would be, as root user:
firewall-cmd --zone=public --permanent --add-port=3307/tcp
firewall-cmd --zone=public --permanent --add-port=8080/tcp
firewall-cmd --zone=public --permanent --add-port=8081/tcp
firewall-cmd --zone=public --permanent --add-port=8090/tcp
firewall-cmd --zone=public --permanent --add-port=8091/tcp
firewall-cmd --zone=public --permanent --add-port=8099/tcp
firewall-cmd --zone=public --permanent --add-port=5000/tcp
firewall-cmd --zone=public --permanent --add-port=2888/tcp
firewall-cmd --zone=public --permanent --add-port=443/tcp
firewall-cmd --zone=public --permanent --add-port=80/tcp
firewall-cmd --reload
Download and install Stroom v7 (docker version)
The installation example below is for stroom version 7.0.beta.45 - but is applicable to other stroom v7 versions.
As a suitable stroom user, e.g. stroomuser, download and unpack the stroom software.
- wget https://github.com/gchq/stroom-resources/releases/download/stroom-stacks-v7.0-beta.41/stroom_proxy-v7.0-beta.45.tar.gz
- tar zxf stroom-stacks…………..
For a stroom proxy, the configuration file stroom_proxy/stroom_proxy-v7.0-beta.45/stroom_proxy.env needs to be edited with the connection details of the stroom server that data files will be sent to.
The default network port for connection to the stroom server is 8080.
The values that need to be set are:
STROOM_PROXY_REMOTE_FEED_STATUS_API_KEY
STROOM_PROXY_REMOTE_FEED_STATUS_URL
STROOM_PROXY_REMOTE_FORWARD_URL
The ‘API key’ is generated on the stroom server and is related to a specific user, e.g. proxyServiceUser.
The 2 URL values also refer to the stroom server and can be a fully qualified domain name (fqdn) or the IP address.
e.g. if the stroom server was stroom-serve.somewhere.co.uk, the URL lines would be:
export STROOM_PROXY_REMOTE_FEED_STATUS_URL="http://stroom-serve.somewhere.co.uk:8080/api/feedStatus/v1"
export STROOM_PROXY_REMOTE_FORWARD_URL="http://stroom-serve.somewhere.co.uk:8080/stroom/datafeed"
To Start Stroom Proxy
As the stroom user, run the ‘start.sh’ script found in the stroom install:
- cd ~/stroom_proxy/stroom_proxy-v7.0-beta.45/
- ./start.sh
The first time the script is run it will download the docker containers for a stroom proxy from GitHub; these are stroom-proxy-remote, stroom-log-sender and nginx.
Once the script has completed the stroom proxy server should be running.
There are additional scripts: status.sh, which will show the status of the docker containers (stroom-proxy-remote, stroom-log-sender and nginx), and logs.sh, which will tail all of the stroom message files to the screen.
Stroom Remote Proxy (app version)
The build of a stroom proxy server, where the stroom application is running locally as a Java ARchive (jar) file.
The operating system (OS) build for an ‘application’ stroom proxy is minimal RHEL/CentOS 7 plus Java.
The Java version required for stroom v7 is 12+. This version of Java is not available from the RHEL/CentOS distribution.
The version of Java used below is the ‘openJDK’ version as opposed to Oracle’s version, and can be downloaded from the internet.
Version 12.0.1:
wget https://download.java.net/java/GA/jdk12.0.1/69cfe15208a647278a19ef0990eea691/12/GPL/openjdk-12.0.1_linux-x64_bin.tar.gz
Or version 14.0.2:
https://download.java.net/java/GA/jdk14.0.2/205943a0976c4ed48cb16f1043c5c647/12/GPL/openjdk-14.0.2_linux-x64_bin.tar.gz
The gzipped tar file needs to be untarred and moved to a suitable location.
- tar xvf openjdk-12.0.1_linux-x64_bin.tar.gz
- mv jdk-12.0.1 /opt/
Create a shell script that will define the Java variables OR add the statements to .bash_profile
e.g. vi /etc/profile.d/jdk12.sh
export JAVA_HOME=/opt/jdk-12.0.1
export PATH=$PATH:$JAVA_HOME/bin
source /etc/profile.d/jdk12.sh
echo $JAVA_HOME
/opt/jdk-12.0.1
java --version
openjdk version "12.0.1" 2019-04-16
OpenJDK Runtime Environment (build 12.0.1+12)
OpenJDK 64-Bit Server VM (build 12.0.1+12, mixed mode, sharing)
**Disable SELinux to avoid issues with access and file permissions.**
Firewall Configuration
If you have a firewall running, additional ports will need to be opened to allow the Docker containers to talk to each other.
Currently these ports are:
- 3307
- 8080
- 8081
- 8090
- 8091
- 8543
- 5000
- 2888
- 443
- 80
For example, on a RHEL/CentOS server using firewalld, the commands (run as the root user) would be:
firewall-cmd --zone=public --permanent --add-port=3307/tcp
firewall-cmd --zone=public --permanent --add-port=8080/tcp
firewall-cmd --zone=public --permanent --add-port=8081/tcp
firewall-cmd --zone=public --permanent --add-port=8090/tcp
firewall-cmd --zone=public --permanent --add-port=8091/tcp
firewall-cmd --zone=public --permanent --add-port=8543/tcp
firewall-cmd --zone=public --permanent --add-port=5000/tcp
firewall-cmd --zone=public --permanent --add-port=2888/tcp
firewall-cmd --zone=public --permanent --add-port=443/tcp
firewall-cmd --zone=public --permanent --add-port=80/tcp
firewall-cmd --reload
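The port list above can also be applied in a loop rather than one command per port. This dry-run sketch (an alternative style, not a requirement) builds each invocation and prints it for review; the output can then be piped to a root shell (e.g. `| sudo sh`) to apply:

```shell
# Dry run: build and print the firewall-cmd invocations so they can be
# reviewed before being executed as root.
ports="3307 8080 8081 8090 8091 8543 5000 2888 443 80"
cmds=""
for port in $ports; do
  cmds="${cmds}firewall-cmd --zone=public --permanent --add-port=${port}/tcp
"
done
cmds="${cmds}firewall-cmd --reload"
printf '%s\n' "$cmds"
```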
Download and install Stroom Proxy v7 (app version)
The installation example below is for stroom proxy version v7.0-beta.45, but is applicable to other stroom v7 versions.
As a suitable stroom user, e.g. stroomuser, download and unpack the stroom proxy software.
wget https://github.com/gchq/stroom/releases/download/v7.0-beta.45/stroom-proxy-app-v7.0-beta.45.zip
unzip stroom-proxy-app..............
The configuration file - stroom-proxy/config/config.yml - is the principal file to be edited, as it contains:
- connection details to the stroom server
- the locations of the proxy server log files
- the directory on the proxy server where data files will be stored prior to forwarding on to stroom
- the location of the PKI Java keystore (jks) files
The log file locations are changed to be relative to where stroom is started, i.e. ~stroomuser/stroom-proxy/logs/…
..
currentLogFilename: logs/events/event.log
archivedLogFilenamePattern: logs/events/event-%d{yyyy-MM-dd'T'HH:mm}.log
currentLogFilename: logs/send/send.log
archivedLogFilenamePattern: logs/send/send-%d{yyyy-MM-dd'T'HH:mm}.log.gz
currentLogFilename: logs/app/app.log
archivedLogFilenamePattern: logs/app/app-%d{yyyy-MM-dd'T'HH:mm}.log.gz
An API key, created on the stroom server for a special proxy user, is added to the configuration file. The API key is used to validate access to the application.
proxyConfig:
  useDefaultOpenIdCredentials: false
  proxyContentDir: "/stroom-proxy/content"
  # If you want to use a receipt policy then the RuleSet must exist
  # in Stroom and have the UUID as specified below in receiptPolicyUuid
  #proxyRequestConfig:
  #  receiptPolicyUuid: "${RECEIPT_POLICY_UUID:-}"
  feedStatus:
    url: "http://stroomserver.somewhere.co.uk:8080/api/feedStatus/v1"
    apiKey: "eyJhbGciOiJSUz……………………….ScdPX0qai5UwlBA"
  forwardStreamConfig:
    forwardingEnabled: true
The location of the jks files must be set; alternatively, comment out all of the lines in the sslConfig: and tls: sections to disable jks checking.
Stroom also needs the client and CA 'jks' files, which by default are located at /stroom-proxy/certs/ca.jks and client.jks.
Their location can be changed in config.yml:
keyStorePath: "/stroom-proxy/certs/client.jks"
trustStorePath: "/stroom-proxy/certs/ca.jks"
keyStorePath: "/stroom-proxy/certs/client.jks"
trustStorePath: "/stroom-proxy/certs/ca.jks"
These could be changed to:
keyStorePath: "/home/stroomuser/stroom-proxy/certs/client.jks"
trustStorePath: "/home/stroomuser/stroom-proxy/certs/ca.jks"
keyStorePath: "/home/stroomuser/stroom-proxy/certs/client.jks"
trustStorePath: "/home/stroomuser/stroom-proxy/certs/ca.jks"
Create a directory /stroom-proxy and ensure that stroom can write to it.
This is where the proxy data files are stored: /stroom-proxy/repo
proxyRepositoryConfig:
  storingEnabled: true
  repoDir: "/stroom-proxy/repo"
  format: "${executionUuid}/${year}-${month}-${day}/${feed}/${pathId}/${id}"
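A quick sanity check that the stroom user can write to the repository location can be scripted as below; REPO_DIR defaults to the repoDir value shown above and can be overridden for testing:

```shell
# Verify the proxy repository directory exists and is writable
# by the current (stroom) user.
REPO_DIR="${REPO_DIR:-/stroom-proxy/repo}"
if mkdir -p "$REPO_DIR" 2>/dev/null && [ -w "$REPO_DIR" ]; then
  echo "repo dir OK: $REPO_DIR"
else
  echo "repo dir NOT writable: $REPO_DIR" >&2
fi
```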
6 - Stroom Upgrades
Stroom v6+
Stroom is designed to detect the version of the database schema present and to run any migrations necessary to bring it up to the version being deployed.
Docker stack deployments
TODO
Non-docker deployments
TODO
Major version upgrades
The following notes are specific for these major version upgrades
Stroom v5
The cleanest way to upgrade or patch is to simply remove the installed content and reinstall. For example:
./stroom-deploy/stop.sh
rm -fr stroom-app*
<unzip new builds as per install instructions>
<run setup.sh as per install instructions>
./stroom-deploy/start.sh
It is extremely important that you enter the configuration parameters correctly. In particular, the node name should match the current node name, otherwise Stroom will create a new node in the system, thinking it is part of a cluster. It is recommended that you copy the parameter values used in the original installation to help with this (e.g. cp stroom-app/bin/~setup.xml /tmp/orig-stroom-app-setup.xml).
You should remove and reinstall all components you originally installed, i.e. stroom-deploy-X-Y-Z, stroom-app-X-Y-Z, as required.
You should temporarily disable the cron auto deploy script, if you have it running, during the above.
Patching
For a minor patch (1-2-X) you can choose to simply copy the new WAR file into the relevant lib directory and run the deploy.sh script (which you may have running on a crontab). However, this would not patch any potential script or Tomcat setting changes.
7 - Upgrades
7.1 - v6 to v7 Upgrade
Warning
Before commencing an upgrade to v7 you should upgrade Stroom to the latest minor and patch version of v6.
Differences between v6 and v7
Stroom v7 has significant differences to v6 which make the upgrade process a little more complicated.
- v6 handled authentication using a separate application, stroom-auth-service, with its own database. In v7 authentication is handled either internally in stroom (the default) or by an external identity provider such as Google or AWS Cognito.
- v6 used a stroom.conf file or environment variables for configuration. In v7 stroom uses a config.yml file for its configuration (see Properties)
- v6 used upper case and heavily abbreviated names for its tables. In v7 clearer, lower case table names are used. As a result ALL v6 tables get renamed with the prefix OLD_, the new tables are created and any content is copied over. As the database will be holding two copies of most data you need to ensure you have space to accommodate it.
Pre-Upgrade tasks
The following steps are required to be performed before migrating from v6 to v7.
Download migration scripts
Download the migration SQL scripts from https://github.com/gchq/stroom/blob/STROOM_VERSION/scripts e.g. https://github.com/gchq/stroom/blob/v7.0-beta.133/scripts
These scripts will be used in the steps below.
Pre-migration database checks
Run the pre-migration checks script on the running database.
docker exec -i stroom-all-dbs mysql --table -u"stroomuser" -p"stroompassword1" stroom < v7_db_pre_migration_checks.sql
This will produce a report of items that will not be migrated or need attention before migration.
Stop processing
Before shutting stroom down it is wise to turn off stream processing and let all outstanding server tasks complete.
TODO clarify steps for this.
Stop the stack
Stop the stack (stroom and the database) then start up the database. Do this using the v6 stack. This ensures that stroom is not trying to access the database.
./stop.sh
./start.sh stroom-all-dbs
Backup the databases
Backup all the databases for the different components. Typically these will be stroom, stats and auth.
If you are running in a docker stack then you can run the ./backup_databases.sh script.
Stop the database
Stop the database using the v6 stack.
./stop.sh
Deploy and configure v7
Deploy the v7 stack. TODO - more detail
Verify the database connection configuration for the stroom and stats databases. Ensure that there is NOT any configuration for a separate auth database as this will now be in stroom.
Running mysql_upgrade
Stroom v6 ran on MySQL v5.6. Stroom v7 runs on MySQL v8. The upgrade path for MySQL is 5.6 => 5.7.33 => 8.x.
To ensure the database is up to date, mysql_upgrade needs to be run using the 5.7.33 binaries, see (external link).
This is the process for upgrading the database. All of these commands are using the v7 stack.
# Set the version of the MySQL docker image to use
export MYSQL_TAG=5.7.33
# Start MySQL at v5.7, this will recreate the container
./start.sh stroom-all-dbs
# Run the upgrade from 5.6 => 5.7.33
docker exec -it stroom-all-dbs mysql_upgrade -u"root" -p"my-secret-pw"
# Stop MySQL
./stop.sh
# Unset the tag variable so that it now uses the default from the stack (8.x)
unset MYSQL_TAG
# Start MySQL at v8.x, this will recreate the container and run the upgrade from 5.7.33=>8
./start.sh stroom-all-dbs
./stop.sh
Rename legacy stroom-auth tables
Run this command to connect to the auth database and run the pre-migration SQL script.
docker exec -i stroom-all-dbs mysql --table -u"authuser" -p"stroompassword1" auth < v7_auth_db_table_rename.sql
This will rename all but one of the tables in the auth database.
Copy the auth database content to stroom
Having run the table rename, perform another backup of just the auth database.
./backup_databases.sh . auth
Now restore this backup into the stroom database. You can use the v7 stack scripts to do this.
./restore_database.sh stroom auth_20210312143513.sql.gz
You should now see the following tables in the stroom database:
OLD_AUTH_json_web_key
OLD_AUTH_schema_version
OLD_AUTH_token_types
OLD_AUTH_tokens
OLD_AUTH_users
This can be checked by running the following in the v7 stack.
echo 'select table_name from information_schema.tables where table_name like "OLD_AUTH%"' | ./database_shell.sh
Drop unused databases
There may be a number of databases that are no longer used that can be dropped prior to the upgrade.
Note the use of the --force argument so it copes with users that are not there.
docker exec -i stroom-all-dbs mysql --force -u"root" -p"my-secret-pw" < v7_drop_unused_databases.sql
Verify it worked with:
echo 'show databases;' | docker exec -i stroom-all-dbs mysql -u"root" -p"my-secret-pw"
Performing the upgrade
To perform the stroom schema upgrade to v7, run the migrate command, which will migrate the database then exit. For a large upgrade like this it is preferable to run the migrate command rather than just starting stroom, as stroom will only migrate the parts of the schema as it needs to use them. Running migrate ensures all parts of the migration are completed when the command is run and no other parts of stroom will be started.
./migrate.sh
Post-Upgrade tasks
TODO remove auth* containers,images,volumes
8 - Configuration
Version Information: Created with Stroom v7.0
Last Updated: 2021-06-23
Stroom and its associated services can be deployed in many ways (single node docker stack, non-docker cluster, kubernetes, etc). This document will cover two types of deployment:
- Single node stroom_core docker stack.
- A mixed deployment with nginx in docker and stroom, stroom-proxy and the database not in docker.
This document will explain how each application/service is configured and where its configuration files live.
Application Configuration
The following sections provide links to how to configure each application.
General configuration of docker stacks
Environment variables
The stroom docker stacks have a single env file, <stack name>.env, that acts as a single point to configure some aspects of the stack.
Setting values in the env file can be useful when the value is shared between multiple containers.
This env file sets environment variables that are then used for variable substitution in the docker compose YAML files, e.g.
environment:
- MYSQL_ROOT_PASSWORD=${STROOM_DB_ROOT_PASSWORD:-my-secret-pw}
In this example the environment variable STROOM_DB_ROOT_PASSWORD is read and used to set the environment variable MYSQL_ROOT_PASSWORD in the docker container. If STROOM_DB_ROOT_PASSWORD is not set then the value my-secret-pw is used instead.
The environment variables set in the env file are NOT automatically visible inside the containers. Only those environment variables defined in the environment section of the docker-compose YAML files are visible. These environment entries can either be hard coded values or use environment variables from outside the container. In some cases the names in the env file and the names of the environment variables set in the containers are the same, in others they are different.
The environment variables set in the containers can then be used by the application running in each container to set its configuration.
For example, stroom's config.yml file also uses variable substitution, e.g.
appConfig:
commonDbDetails:
connection:
jdbcDriverClassName: "${STROOM_JDBC_DRIVER_CLASS_NAME:-com.mysql.cj.jdbc.Driver}"
In this example jdbcDriverClassName will be set to the value of environment variable STROOM_JDBC_DRIVER_CLASS_NAME, or com.mysql.cj.jdbc.Driver if that is not set.
The following example shows how setting MY_ENV_VAR=123 means myProperty will ultimately get a value of 123 and not its default of 789.
env file (stroom<stack name>.env) - MY_ENV_VAR=123
|
|
| environment variable substitution
|
v
docker compose YAML (01_stroom.yml) - STROOM_ENV_VAR=${MY_ENV_VAR:-456}
|
|
| environment variable substitution
|
v
Stroom configuration file (config.yml) - myProperty: "${STROOM_ENV_VAR:-789}"
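The ${VAR:-default} syntax used at each layer above is standard shell parameter expansion, and its behaviour can be checked directly in a shell:

```shell
# With MY_ENV_VAR unset, the default after ':-' is used.
unset MY_ENV_VAR
echo "value=${MY_ENV_VAR:-456}"
# prints: value=456

# With MY_ENV_VAR set, its value wins over the default.
MY_ENV_VAR=123
echo "value=${MY_ENV_VAR:-456}"
# prints: value=123
```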
Note that environment variables are only set into the container on start. Any changes to the env file will not take effect until the container is (re)started.
Configuration files
The following shows the basic structure of a stack with respect to the location of the configuration files:
── stroom_core_test-vX.Y.Z
├── config [stack env file and docker compose YAML files]
└── volumes
└── <service>
        └── conf/config [service specific configuration files]
Some aspects of configuration do not lend themselves to environment variable substitution, e.g. deeply nested parts of stroom's config.yml. In these instances it may be necessary to have static configuration files that have no connection to the env file or only use environment variables for some values.
Bind mounts
Everything in the stack volumes directory is bind-mounted into the named docker container but is mounted read-only to the container. This allows configuration files to be read by the container but not modified.
Typically the bind mounts mount a directory into the container, though in the case of the stroom-all-dbs.cnf file, the file itself is mounted.
The mounts are done using the inode of the file/directory rather than the name, so docker will mount whatever the inode points to even if the name changes. If, for instance, the stroom-all-dbs.cnf file is renamed to stroom-all-dbs.cnf.old, then copied back to stroom-all-dbs.cnf and the new version modified, the container would still see the old file.
Docker managed volumes
When stroom is running various forms of data are persisted, e.g. stroom’s stream store, stroom-all-dbs database files, etc.
All this data is stored in docker managed volumes.
By default these will be located in /var/lib/docker/volumes/<volume name>/_data and root/sudo access will be needed to access these directories.
Docker data root
IMPORTANT
By default Docker stores all its images, container layers and managed volumes in its default data root directory, which defaults to /var/lib/docker.
It is typical in server deployments for the root file system to be kept fairly small, and this is likely to result in the root file system running out of space due to the growth in docker images/layers/volumes in /var/lib/docker.
It is therefore strongly recommended to move the docker data root to another location with more space.
There are various options for achieving this. In all cases the docker daemon should be stopped prior to making the changes, e.g. service docker stop, then started afterwards.
- Symlink - One option is to move the /var/lib/docker directory to a new location then create a symlink to it. For example:
  ln -s /large_mount/docker_data_root /var/lib/docker
  This has the advantage that anyone unaware that the data root has moved will be able to easily find it if they look in the default location.
- Configuration - The location can be changed by adding this key to the file /etc/docker/daemon.json (or creating this file if it doesn't exist):
  { "data-root": "/mnt/docker" }
- Mount - If your intention is to use a whole storage device for the docker data root then you can mount that device to /var/lib/docker. You will need to make a copy of the /var/lib/docker directory prior to doing this, then copy it into the mount once created. The process for setting up this mount will be OS dependent and is outside the scope of this document.
Active services
Each stroom docker stack comes pre-built with a number of different services, e.g. the stroom_core stack contains the following:
- stroom
- stroom-proxy-local
- stroom-all-dbs
- nginx
- stroom-log-sender
While you can pass a set of service names to commands like start.sh and stop.sh, it may sometimes be required to configure the stack instance to only have a set of services active.
You can set the active services like so:
./set_services.sh stroom stroom-all-dbs nginx
In the above example, subsequent use of commands like start.sh and stop.sh with no named services will only act upon the active services set by set_services.sh.
This list of active services is held in ACTIVE_SERVICES.txt and the full list of available services is held in ALL_SERVICES.txt.
Certificates
A number of the services in the docker stacks will make use of SSL certificates/keys in various forms.
The certificate/key files are typically found in the directories volumes/<service>/certs/.
The stacks come with a set of client/server certificates that can be used for demo/test purposes. For production deployments these should be replaced with the actual certificates/keys for your environment.
In general the best approach to configuring the certificates/keys is to replace the existing files with symlinks to the actual files.
For example, in the case of the server certificates for nginx (found in volumes/nginx/certs/) the directory would look like:
ca.pem.crt -> /some/path/to/certificate_authority.pem.crt
server.pem.crt -> /some/path/to/host123.pem.crt
server.unencrypted.key -> /some/path/to/host123.key
This approach avoids the need to change any configuration files to reference differently named certificate/key files and avoids having to copy your real certificates/keys into multiple places.
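Assuming the real certificates live under a path such as /opt/certs (an illustrative path, as are the file names within it), creating the symlinks could look like:

```shell
# Point the stack's nginx certificate names at the real files via symlinks.
# REAL_CERT_DIR and the file names are illustrative; adjust for your deployment.
REAL_CERT_DIR="/opt/certs"
NGINX_CERT_DIR="volumes/nginx/certs"
mkdir -p "$NGINX_CERT_DIR"
ln -sf "$REAL_CERT_DIR/certificate_authority.pem.crt" "$NGINX_CERT_DIR/ca.pem.crt"
ln -sf "$REAL_CERT_DIR/host123.pem.crt"               "$NGINX_CERT_DIR/server.pem.crt"
ln -sf "$REAL_CERT_DIR/host123.key"                   "$NGINX_CERT_DIR/server.unencrypted.key"
ls -l "$NGINX_CERT_DIR"
```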
For examples of how to create certificates, keys and keystores see createCerts.sh (external link)
8.1 - Nginx Configuration
Version Information: Created with Stroom v7.0
Last Updated: 2021-06-07
See Also: Nginx documentation (external link).
Without Docker
The standard way of deploying Nginx with stroom running without docker involves running Nginx as part of the services stack. See below for details of how to configure it. If you want to deploy Nginx without docker then you can, but that is outside the scope of this documentation.
As part of a docker stack
Nginx is included in all the stroom docker stacks.
Nginx is configured using multiple configuration files to aid clarity and allow reuse of sections of configuration.
The main file for configuring Nginx is nginx.conf.template and this makes use of other files via include statements.
The purpose of the various files is as follows:
- nginx.conf.template - Top level configuration file that orchestrates the other files.
- logging.conf.template - Configures the logging output, its content and format.
- server.conf.template - Configures things like SSL settings, timeouts, ports, buffering, etc.
- Upstream configuration
  - upstreams.stroom.ui.conf.template - Defines the upstream host(s) for stroom node(s) that are dedicated to serving the user interface.
  - upstreams.stroom.processing.conf.template - Defines the upstream host(s) for stroom node(s) that are dedicated to stream processing and direct data receipt.
  - upstreams.proxy.conf.template - Defines the upstream host(s) for local stroom-proxy node(s).
- Location configuration
  - locations_defaults.conf.template - Defines some default directives (e.g. headers) for configuring stroom paths.
  - proxy_location_defaults.conf.template - Defines some default directives (e.g. headers) for configuring stroom-proxy paths.
  - locations.proxy.conf.template - Defines the various paths (e.g. /datafeed) that will be reverse proxied to stroom-proxy hosts.
  - locations.stroom.conf.template - Defines the various paths (e.g. /datafeeddirect) that will be reverse proxied to stroom hosts.
Templating
The nginx container has been configured to support using environment variables passed into it to set values in the Nginx configuration files. It should be noted that recent versions of Nginx have templating support built in. The templating mechanism used in stroom’s Nginx container was set up before this existed but achieves the same result.
All non-default configuration files for Nginx should be placed in volumes/nginx/conf/ and named with the suffix .template (even if no templating is needed). When the container starts, any variables in these templates will be substituted and the resulting files will be copied into /etc/nginx. The result of the template substitution is logged to help with debugging.
The files can contain templating of the form:
ssl_certificate /stroom-nginx/certs/<<<NGINX_SSL_CERTIFICATE>>>;
In this example <<<NGINX_SSL_CERTIFICATE>>> will be replaced with the value of environment variable NGINX_SSL_CERTIFICATE when the container starts.
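The exact substitution script inside the container is an implementation detail, but its effect is equivalent to a simple sed replacement, which can be simulated like this:

```shell
# Simulate the <<<VAR>>> substitution that happens at container start.
export NGINX_SSL_CERTIFICATE="server.pem.crt"
template='ssl_certificate /stroom-nginx/certs/<<<NGINX_SSL_CERTIFICATE>>>;'
echo "$template" | sed "s|<<<NGINX_SSL_CERTIFICATE>>>|${NGINX_SSL_CERTIFICATE}|"
# prints: ssl_certificate /stroom-nginx/certs/server.pem.crt;
```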
Upstreams
When configuring a multi node cluster you will need to configure the upstream hosts. Nginx acts as a reverse proxy for the applications behind it so the lists of hosts for each application need to be configured.
For example if you have a 10 node cluster and 2 of those nodes are dedicated for user interface use then the configuration would look like:
upstreams.stroom.ui.conf.template
server node1.stroomhosts:<<<STROOM_PORT>>>
server node2.stroomhosts:<<<STROOM_PORT>>>
upstreams.stroom.processing.conf.template
server node3.stroomhosts:<<<STROOM_PORT>>>
server node4.stroomhosts:<<<STROOM_PORT>>>
server node5.stroomhosts:<<<STROOM_PORT>>>
server node6.stroomhosts:<<<STROOM_PORT>>>
server node7.stroomhosts:<<<STROOM_PORT>>>
server node8.stroomhosts:<<<STROOM_PORT>>>
server node9.stroomhosts:<<<STROOM_PORT>>>
server node10.stroomhosts:<<<STROOM_PORT>>>
upstreams.proxy.conf.template
server node3.stroomhosts:<<<STROOM_PORT>>>
server node4.stroomhosts:<<<STROOM_PORT>>>
server node5.stroomhosts:<<<STROOM_PORT>>>
server node6.stroomhosts:<<<STROOM_PORT>>>
server node7.stroomhosts:<<<STROOM_PORT>>>
server node8.stroomhosts:<<<STROOM_PORT>>>
server node9.stroomhosts:<<<STROOM_PORT>>>
server node10.stroomhosts:<<<STROOM_PORT>>>
In the above example the port is set using templating as it is the same for all nodes. Nodes 1 and 2 will receive all UI and REST API traffic. Nodes 3 to 10 will serve all datafeed(direct) requests.
Certificates
The stack comes with a default server certificate/key and CA certificate for demo/test purposes.
The files are located in volumes/nginx/certs/.
For a production deployment these will need to be changed, see Certificates
Log rotation
The Nginx container makes use of logrotate to rotate Nginx’s log files after a period of time so that rotated logs can be sent to stroom.
Logrotate is configured using the file volumes/stroom-log-sender/logrotate.conf.template. This file is templated in the same way as the Nginx configuration files, see above.
The number of rotated files that should be kept before deleting them can be controlled using the line:
rotate 100
This should be set in conjunction with the frequency that logrotate is called, which is controlled by volumes/stroom-log-sender/crontab.txt. This crontab file drives the logrotate process and by default is set to run every minute.
8.2 - Stroom Configuration
Version Information: Created with Stroom v7.0
Last Updated: 2021-06-23
See Also: Properties.
General configuration
The Stroom application is essentially just an executable JAR (external link) file that can be run when provided with a configuration file, config.yml. This config file is common to all forms of deployment.
config.yml
This file, sometimes known as the DropWizard configuration file (as DropWizard is the java framework on which Stroom runs), is the primary means of configuring stroom. As a minimum, this file should be used to configure anything that needs to be set before stroom can start up (e.g. database connection details) or that is specific to a node in a stroom cluster. If you are using some form of scripted deployment, e.g. Ansible, then it can be used to set all stroom properties for the environment that stroom runs in. If you are not using scripted deployments then you can maintain stroom's node-agnostic configuration properties via the user interface.
For more details on the structure of the file, data types and property precedence see Properties.
Stroom operates on a configuration by exception basis so all configuration properties will have a sensible default value and a property only needs to be explicitly configured if the default value is not appropriate, e.g. for tuning a large scale production deployment or where values are environment specific.
As a result config.yml only contains a minimal set of properties. The full tree of properties can be seen in ./config/config-defaults.yml and a schema for the configuration tree (along with descriptions for each property) can be found in ./config/config-schema.yml. These two files can be used as a reference when configuring stroom.
Key Configuration Properties
The following are key properties that would typically be changed for a production deployment.
All configuration branches are relative to the appConfig root.
The database name(s), hostname(s), port(s), usernames(s) and password(s) should be configured using these properties. Typically stroom is configured to keep its statistics data in a separate database to the main stroom database, as is configured below.
commonDbDetails:
connection:
jdbcDriverUrl: "jdbc:mysql://localhost:3307/stroom?useUnicode=yes&characterEncoding=UTF-8"
jdbcDriverUsername: "stroomuser"
jdbcDriverPassword: "stroompassword1"
statistics:
sql:
db:
connection:
jdbcDriverUrl: "jdbc:mysql://localhost:3307/stats?useUnicode=yes&characterEncoding=UTF-8"
jdbcDriverUsername: "statsuser"
jdbcDriverPassword: "stroompassword1"
In a clustered deployment each node must be given a node name that is unique within the cluster. This is used to identify nodes in the Nodes screen. It could be the hostname of the node or follow some other naming convention.
node:
name: "node1a"
Each node should have its identity on the network configured so that it uses the appropriate FQDNs.
The nodeUri hostname is the FQDN of each node and is used by nodes to communicate with each other, therefore it can be private to the cluster of nodes. The publicUri hostname is the public facing FQDN for stroom, i.e. the address of a load balancer or Nginx. This is the address that users will use in their browser.
nodeUri:
hostname: "localhost" # e.g. node5.stroomnodes.somedomain
publicUri:
hostname: "localhost" # e.g. stroom.somedomain
Deploying without Docker
Stroom running without docker has two files to configure it. The following locations are relative to the stroom home directory, i.e. the root of the distribution zip.
- ./config/config.yml - Stroom configuration YAML file
- ./config/scripts.env - Stroom scripts configuration env file
The distribution also includes these files which are helpful when it comes to configuring stroom.
- ./config/config-defaults.yml - Full version of the config.yml file containing all branches/leaves with default values set. Useful as a reference for the structure and the default values.
- ./config/config-schema.yml - The schema defining the structure of the config.yml file.
scripts.env
This file is used by the various shell scripts like start.sh, stop.sh, etc. This file should not need to be changed unless you want to change the locations where certain log files are written to or need to change the java memory settings.
In a production system it is highly likely that you will need to increase the java heap size as the default is only 2G. The heap size settings and any other java command line options can be set by changing:
JAVA_OPTS="-Xms512m -Xmx2048m"
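For example, a production node might be given a larger heap; the sizes below are illustrative only and should be tuned against your workload and available RAM:

```shell
# scripts.env - illustrative heap sizes for a production node
JAVA_OPTS="-Xms4g -Xmx8g"
echo "JAVA_OPTS=$JAVA_OPTS"
```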
As part of a docker stack
When stroom is run as part of one of our docker stacks, e.g. stroom_core, there are some additional layers of configuration to take into account, but the configuration is still primarily done using the config.yml file.
Stroom's config.yml file is found in the stack in ./volumes/stroom/config/ and this is the primary means of configuring Stroom.
The stack also ships with a default config.yml file baked into the docker image. This minimal fallback file (located in /stroom/config-fallback/ inside the container) will be used in the absence of one provided in the docker stack configuration (./volumes/stroom/config/).
The default config.yml file uses environment variable substitution, so some configuration items will be set by environment variables set into the container by the stack env file and the docker-compose YAML. This approach is useful for configuration values that need to be used by multiple containers, e.g. the public FQDN of Nginx, so that they can be configured in one place.
If you need to further customise the stroom configuration then it is recommended to edit the ./volumes/stroom/config/config.yml file. This can either be a simple file with hard coded values or one that uses environment variables for some of its configuration items.
The configuration works as follows:
env file (stroom<stack name>.env)
|
|
| environment variable substitution
|
v
docker compose YAML (01_stroom.yml)
|
|
| environment variable substitution
|
v
Stroom configuration file (config.yml)
Ansible
If you are using Ansible to deploy a stack then it is recommended that all of stroom's configuration properties are set directly in the config.yml file, using a templated version of the file, and NOT to use any environment variable substitution. When using Ansible, the Ansible inventory is the single source of truth for your configuration, so not using environment variable substitution for stroom simplifies the configuration and makes it clearer when looking at deployed configuration files.
Stroom-ansible has an example inventory for a single node stroom stack deployment. The group_vars/all file shows how values can be set into the env file.
8.3 - Stroom Proxy Configuration
Version Information: Created with Stroom v7.0
Last Updated: 2021-06-23
See Also: Stroom Application Configuration
See Also: Properties.
TODO: This needs updating for v7.1
The configuration of Stroom-proxy is very much the same as for Stroom, with the only difference being the structure of the config.yml file. Stroom-proxy has a proxyConfig key in the YAML while Stroom has appConfig.
It is recommended to first read Stroom Application Configuration to understand the general mechanics of the stroom configuration, as this will largely apply to stroom-proxy.
General configuration
The Stroom-proxy application is essentially just an executable JAR (external link) file that can be run when provided with a configuration file, config.yml. This configuration file is common to all forms of deployment.
config.yml
Stroom-proxy does not have a user interface so the config.yml file is the only way of configuring stroom-proxy.
As with stroom, the config.yml file is split into three sections using these keys:
- server - Configuration of the web server, e.g. ports, paths, request logging.
- logging - Configuration of application logging.
- proxyConfig - Configuration of stroom-proxy.
See also Properties for more details on structure of the config.yml file and supported data types.
Stroom-proxy operates on a configuration by exception basis so all configuration properties will have a sensible default value and a property only needs to be explicitly configured if the default value is not appropriate, e.g. for tuning a large scale production deployment or where values are environment specific.
As a result config.yml only contains a minimal set of properties.
The full tree of properties can be seen in ./config/config-defaults.yml and a schema for the configuration tree (along with descriptions for each property) can be found in ./config/config-schema.yml.
These two files can be used as a reference when configuring stroom.
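A minimal config.yml therefore needs little more than the three top-level keys, with overrides only where the defaults are not appropriate. The sketch below is illustrative; the server and logging keys are standard Dropwizard configuration, while the proxyConfig entries shown should be checked against config-defaults.yml for your version.

```yaml
server:
  applicationContextPath: "/"
  # ports, paths, request logging, etc. go here
logging:
  level: WARN
proxyConfig:
  # only properties that differ from the defaults
  proxyRepositoryConfig:
    storingEnabled: true
```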
Key Configuration Properties
Stroom-proxy has two main functions, storing and forwarding. It can be configured to do either or both of these functions. These functions are enabled/disabled using:
proxyConfig:
forwardStreamConfig:
forwardingEnabled: true
proxyRepositoryConfig:
storingEnabled: true
Stroom-proxy should be configured to check the receipt status of feeds on receipt of data. This is done by configuring the end point of a downstream stroom-proxy or stroom.
feedStatus:
url: "http://stroom:8080/api/feedStatus/v1"
apiKey: ""
The url should be the URL for the feed status API on the downstream stroom(-proxy).
If this is on the same host then you can use the http endpoint, however if it is on a remote host then you should use https and the host of its nginx, e.g. https://downstream-instance/api/feedStatus/v1.
In order to use the API, the proxy must have a configured apiKey.
The API key must be created in the downstream stroom instance and then copied into this configuration.
If the proxy is configured to forward data then the forward destination(s) should be set.
This is the datafeed endpoint of the downstream stroom-proxy or stroom instance that data will be forwarded to.
This may also be the address of a load balancer or similar that is fronting a cluster of stroom-proxy or stroom instances.
See also Feed status certificate configuration.
forwardStreamConfig:
forwardDestinations:
- forwardUrl: "https://nginx/stroom/datafeed"
forwardUrl specifies the URL of the datafeed endpoint on the destination host.
Each forward location can use a different key/trust store pair.
See also Forwarding certificate configuration.
If the proxy is configured to store then the location of the proxy repository may need to be configured, e.g. if it needs to be in a different location to the proxy home directory, such as on another mount point.
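For example, the repository might be moved to a dedicated mount. This is a sketch only; the exact property name for the repository directory should be confirmed against ./config/config-defaults.yml for your version.

```yaml
proxyConfig:
  proxyRepositoryConfig:
    storingEnabled: true
    # Hypothetical path - point this at the desired mount point
    repoDir: "/mnt/proxy_repo"
```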
Deploying without Docker
Apart from the structure of the config.yml file, the configuration in a non-docker environment is the same as for stroom.
As part of a docker stack
The way stroom-proxy is configured is essentially the same as for stroom, with the only real difference being the structure of the config.yml file, as noted above.
As with stroom, the docker stack comes with a ./volumes/stroom-proxy-*/config/config.yml file that will be used in the absence of a provided one.
Also as with stroom, the config.yml file supports environment variable substitution so can make use of environment variables set in the stack env file and passed down via the docker-compose YAML files.
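For example, the feed status configuration can pick up values from the stack env file, falling back to a default when the variable is unset. The variable names here are illustrative, not the stack's actual names.

```yaml
proxyConfig:
  feedStatus:
    # Falls back to the default if STROOM_PROXY_FEED_STATUS_URL is unset
    url: "${STROOM_PROXY_FEED_STATUS_URL:-http://stroom:8080/api/feedStatus/v1}"
    apiKey: "${STROOM_PROXY_API_KEY:-}"
```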
Certificates
Stroom-proxy makes use of client certificates for two purposes:
- Communicating with a downstream stroom/stroom-proxy in order to establish the receipt status for the feeds it has received data for.
- When forwarding data to a downstream stroom/stroom-proxy.
The stack comes with the following files that can be used for demo/test purposes.
volumes/stroom-proxy-*/certs/ca.jks
volumes/stroom-proxy-*/certs/client.jks
For a production deployment these will need to be changed, see Certificates.
Feed status certificate configuration
The configuration of the client certificates for feed status checks is done using:
proxyConfig:
jerseyClient:
timeout: "10s"
connectionTimeout: "10s"
timeToLive: "1h"
cookiesEnabled: false
maxConnections: 1024
maxConnectionsPerRoute: "1024"
keepAlive: "0ms"
retries: 0
tls:
verifyHostname: true
keyStorePath: "/stroom-proxy/certs/client.jks"
keyStorePassword: "password"
keyStoreType: "JKS"
trustStorePath: "/stroom-proxy/certs/ca.jks"
trustStorePassword: "password"
trustStoreType: "JKS"
trustSelfSignedCertificates: false
This configuration is also used for making any other REST API calls.
Forwarding certificate configuration
Stroom-proxy can forward to multiple locations. The configuration of the certificate(s) for the forwarding locations is as follows:
proxyConfig:
forwardStreamConfig:
forwardingEnabled: true
forwardDestinations:
# If you want multiple forward destinations then you will need to edit this file directly
# instead of using env var substitution
- forwardUrl: "https://nginx/stroom/datafeed"
sslConfig:
keyStorePath: "/stroom-proxy/certs/client.jks"
keyStorePassword: "password"
keyStoreType: "JKS"
trustStorePath: "/stroom-proxy/certs/ca.jks"
trustStorePassword: "password"
trustStoreType: "JKS"
hostnameVerificationEnabled: true
forwardUrl specifies the URL of the datafeed endpoint on the destination host.
Each forward location can use a different key/trust store pair.
8.4 - Stroom Log Sender Configuration
Version Information: Created with Stroom v7.0
Last Updated: 2021-06-14
Stroom log sender is a docker image used for sending application logs to stroom. It is essentially just a combination of the send_to_stroom.sh (external link) script and a set of crontab entries to call the script at intervals.
Deploying without Docker
When deploying without docker, stroom and stroom-proxy nodes will need to be configured to send their logs to stroom.
This can be done using the ./bin/send_to_stroom.sh script in the stroom and stroom-proxy zip distributions and some crontab configuration.
The crontab file for the user account running stroom should be edited (crontab -e) and set to something like:
# stroom logs
* * * * * STROOM_HOME=<path to stroom home> ${STROOM_HOME}/bin/send_to_stroom.sh ${STROOM_HOME}/logs/access STROOM-ACCESS-EVENTS <datafeed URL> --system STROOM --environment <environment> --file-regex '.*/[a-z]+-[0-9]{4}-[0-9]{2}-[0-9]{2}T.*\\.log' --max-sleep 10 --key <key file> --cert <cert file> --cacert <CA cert file> --delete-after-sending --compress >> <path to log> 2>&1
* * * * * STROOM_HOME=<path to stroom home> ${STROOM_HOME}/bin/send_to_stroom.sh ${STROOM_HOME}/logs/app STROOM-APP-EVENTS <datafeed URL> --system STROOM --environment <environment> --file-regex '.*/[a-z]+-[0-9]{4}-[0-9]{2}-[0-9]{2}T.*\\.log' --max-sleep 10 --key <key file> --cert <cert file> --cacert <CA cert file> --delete-after-sending --compress >> <path to log> 2>&1
* * * * * STROOM_HOME=<path to stroom home> ${STROOM_HOME}/bin/send_to_stroom.sh ${STROOM_HOME}/logs/user STROOM-USER-EVENTS <datafeed URL> --system STROOM --environment <environment> --file-regex '.*/[a-z]+-[0-9]{4}-[0-9]{2}-[0-9]{2}T.*\\.log' --max-sleep 10 --key <key file> --cert <cert file> --cacert <CA cert file> --delete-after-sending --compress >> <path to log> 2>&1
# stroom-proxy logs
* * * * * PROXY_HOME=<path to proxy home> ${PROXY_HOME}/bin/send_to_stroom.sh ${PROXY_HOME}/logs/access STROOM_PROXY-ACCESS-EVENTS <datafeed URL> --system STROOM-PROXY --environment <environment> --file-regex '.*/[a-z]+-[0-9]{4}-[0-9]{2}-[0-9]{2}T.*\\.log' --max-sleep 10 --key <key file> --cert <cert file> --cacert <CA cert file> --delete-after-sending --compress >> <path to log> 2>&1
* * * * * PROXY_HOME=<path to proxy home> ${PROXY_HOME}/bin/send_to_stroom.sh ${PROXY_HOME}/logs/app STROOM_PROXY-APP-EVENTS <datafeed URL> --system STROOM-PROXY --environment <environment> --file-regex '.*/[a-z]+-[0-9]{4}-[0-9]{2}-[0-9]{2}T.*\\.log' --max-sleep 10 --key <key file> --cert <cert file> --cacert <CA cert file> --delete-after-sending --compress >> <path to log> 2>&1
* * * * * PROXY_HOME=<path to proxy home> ${PROXY_HOME}/bin/send_to_stroom.sh ${PROXY_HOME}/logs/send STROOM_PROXY-SEND-EVENTS <datafeed URL> --system STROOM-PROXY --environment <environment> --file-regex '.*/[a-z]+-[0-9]{4}-[0-9]{2}-[0-9]{2}T.*\\.log' --max-sleep 10 --key <key file> --cert <cert file> --cacert <CA cert file> --delete-after-sending --compress >> <path to log> 2>&1
* * * * * PROXY_HOME=<path to proxy home> ${PROXY_HOME}/bin/send_to_stroom.sh ${PROXY_HOME}/logs/receive STROOM_PROXY-RECEIVE-EVENTS <datafeed URL> --system STROOM-PROXY --environment <environment> --file-regex '.*/[a-z]+-[0-9]{4}-[0-9]{2}-[0-9]{2}T.*\\.log' --max-sleep 10 --key <key file> --cert <cert file> --cacert <CA cert file> --delete-after-sending --compress >> <path to log> 2>&1
where the environment specific values are:
- <path to stroom home> - The absolute path to the stroom home, i.e. the location of the start.sh script.
- <path to proxy home> - The absolute path to the stroom-proxy home, i.e. the location of the start.sh script.
- <datafeed URL> - The URL that the logs will be sent to. This will typically be the nginx host or load balancer and the path will typically be https://host/datafeeddirect to bypass the proxy for faster access to the logs.
- <environment> - The environment name that the stroom/proxy is deployed in, e.g. OPS, REF, DEV, etc.
- <key file> - The absolute path to the SSL key file used by curl.
- <cert file> - The absolute path to the SSL certificate file used by curl.
- <CA cert file> - The absolute path to the SSL certificate authority file used by curl.
- <path to log> - The absolute path to a log file to log all the send_to_stroom.sh output to.
If your implementation of cron supports environment variables then you can define some of the common values at the top of the crontab file and use them in the entries.
cronie, as used by CentOS, does not support environment variables in the crontab file, but variables can be defined at the line level as has been shown with STROOM_HOME and PROXY_HOME.
The above crontab entries assume that stroom and stroom-proxy are running on the same host. If they are not then the entries can be split across the hosts accordingly.
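With the placeholders filled in, a single entry might look like the line below. The paths, hostname and environment name are hypothetical and must match your deployment; note that a crontab entry must be a single line.

```text
* * * * * STROOM_HOME=/opt/stroom ${STROOM_HOME}/bin/send_to_stroom.sh ${STROOM_HOME}/logs/app STROOM-APP-EVENTS https://host.example/datafeeddirect --system STROOM --environment DEV --max-sleep 10 --delete-after-sending --compress >> /var/log/stroom/send_to_stroom.log 2>&1
```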
Service host(s)
When deploying stroom/stroom-proxy without docker you may still be deploying the service stack (nginx and stroom-log-sender) to a host. In this case see As part of a docker stack below for details of how to configure stroom-log-sender to send the nginx logs.
As part of a docker stack
Crontab
The docker stacks include the stroom-log-sender docker image for sending the logs of all the other containers to stroom.
Stroom-log-sender is configured using the crontab file volumes/stroom-log-sender/conf/crontab.txt.
When the container starts this file will be read and any variables in it will be substituted with the values from the corresponding environment variables that are present in the container.
These common values can be set in the config/<stack name>.env file.
As the variables are substituted on container start, you will need to restart the container following any configuration change.
Certificates
The directory volumes/stroom-log-sender/certs contains the default client certificates used for the stack.
These allow stroom-log-sender to send the log files over SSL which also provides stroom with details of the sender.
These will need to be replaced in a production environment.
volumes/stroom-log-sender/certs/ca.pem.crt
volumes/stroom-log-sender/certs/client.pem.crt
volumes/stroom-log-sender/certs/client.unencrypted.key
For a production deployment these will need to be changed, see Certificates.
8.5 - MySQL Configuration
Version Information: Created with Stroom v7.0
Last Updated: 2021-06-07
See Also: MySQL Server Setup
See Also: MySQL Server Administration (external link)
General configuration
MySQL is configured via the .cnf file, which is typically located in one of these locations:
/etc/my.cnf
/etc/mysql/my.cnf
$MYSQL_HOME/my.cnf
<data dir>/my.cnf
~/.my.cnf
Key configuration properties
- lower_case_table_names - This property controls how the tables are stored on the filesystem and the case-sensitivity of table names in SQL. A value of 0 means tables are stored on the filesystem in the case used in CREATE TABLE and SQL is case-sensitive. This is the default in Linux and is the preferred value for deployments of stroom v7+. A value of 1 means tables are stored on the filesystem in lowercase but SQL is case-insensitive. See also (external link)
- max_connections - The maximum permitted number of simultaneous client connections. For a clustered deployment of stroom, the default value of 151 will typically be too low. Each stroom node will hold a pool of open database connections for its use, therefore with a large number of stroom nodes and a big connection pool the total number of connections can be very large. This property should be set taking into account the values of the stroom properties of the form *.db.connectionPool.maxPoolSize. See also (external link)
- innodb_buffer_pool_size / innodb_buffer_pool_instances - Controls the amount of memory available to MySQL for caching table/index data. Typically this will be set to 80% of available RAM, assuming MySQL is running on a dedicated host and the total amount of table/index data is greater than 80% of available RAM. Note: innodb_buffer_pool_size must be set to a value that is equal to or a multiple of innodb_buffer_pool_chunk_size * innodb_buffer_pool_instances. See also (external link)
TODO - Add additional key configuration items
Deploying without Docker
When MySQL is deployed without a docker stack, it should be installed and configured according to the MySQL documentation. How MySQL is deployed and configured will depend on the requirements of the environment, e.g. clustered, primary/standby, etc.
As part of a docker stack
Where a stroom docker stack includes stroom-all-dbs (MySQL), the MySQL instance is configured via the .cnf file, which is located at volumes/stroom-all-dbs/conf/stroom-all-dbs.cnf.
This file is read-only to the container and will be read on container start.
Database initialisation
When the container is started for the first time the database will be initialised with the root user account.
It will also then run any scripts found in volumes/stroom-all-dbs/init/stroom.
The scripts in here will be run in alphabetical order.
Scripts of the form .sh, .sql, .sql.gz and .sql.template are supported.
.sql.template files are proprietary to stroom stacks and are just templated .sql files.
They can contain tags of the form <<<ENV_VAR_NAME>>> which will be replaced with the value of the named environment variable that has been set in the container.
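The substitution can be illustrated with a stand-alone sketch. The real logic lives in 000_stroom_init.sh; the variable name and SQL statement here are hypothetical.

```shell
# Sketch of the <<<VAR>>> substitution applied to .sql.template files.
export STROOM_DB_USERNAME=stroomuser
template='CREATE USER "<<<STROOM_DB_USERNAME>>>"@"%";'
# Replace every <<<STROOM_DB_USERNAME>>> tag with the env var's value
rendered=$(printf '%s' "$template" | sed "s/<<<STROOM_DB_USERNAME>>>/${STROOM_DB_USERNAME}/g")
printf '%s\n' "$rendered"
```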
If you need to add additional database users then either add them to volumes/stroom-all-dbs/init/stroom/001_create_databases.sql.template or create additional scripts/templates in that directory.
The script that controls this templating is volumes/stroom-all-dbs/init/000_stroom_init.sh.
This script MUST NOT have its executable bit set, else it will be executed rather than being sourced by the MySQL entry point scripts and will then not work.