Hi all, great idea for a thread. We’ve been running SC in Docker on AWS for about 6 months in production (using ECS) and have some experience to share. Our setup includes:
- A container for seedlink and slarchive
- Multiple Docker containers for seven real-time processing pipelines, including the modules scautopick, scamp, scmag, and scautoloc
- A container for scmaster and scevent
- Multiple load-balanced containers for fdsnws
At the center we have our own SeisComP base image, built on Ubuntu, which installs all SC dependencies, compiles the software from source, and sets up the environment. Attached is file: base-image.zip (11.6 KB), which contains the Dockerfile for our base image and some example inventory. You will need to add a seiscomp.tar.gz with the SC source code to the folder for it to build (dependency versions are pinned, so they may need to be updated if the image does not build). Each deployment that runs SeisComP inherits from this base image and adds only its specific module configuration (./etc) and a supervisord configuration to start its modules.
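As a rough illustration of that inheritance pattern, a deployment Dockerfile can be as small as the following sketch (the image name, tag, and paths are illustrative, not taken from the attachment):

```dockerfile
# Hypothetical deployment image inheriting from our base image
FROM our-registry/seiscomp-base:latest

# Add only the module configuration this deployment needs
COPY etc/ /home/sysop/seiscomp/etc/

# Add the supervisord config that starts the modules in the foreground
COPY supervisord.conf /etc/supervisord.conf

# supervisord becomes PID 1 and manages the SeisComP modules
CMD ["supervisord", "-c", "/etc/supervisord.conf"]
```

Everything heavy (dependencies, compiled binaries, inventory) lives in the base image, so deployment images rebuild in seconds.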
We never run modules as daemons (seiscomp start); instead we always use the exec command to run them in the foreground. These processes are managed by supervisord, a simple Python process manager that is well suited to run as PID 1 in a container. We often run multiple real-time processing modules in a single container, which is no problem with supervisord. A simple supervisord example config file that we use in our FDSNWS containers:
```ini
[supervisord]
nodaemon=true
user=root
logfile=/dev/null
logfile_maxbytes=0

[program:fdsnws]
user=sysop
command=seiscomp exec fdsnws -d "postgresql://%(ENV_DB_READ_ONLY)s" --inventory-db "file://%(ENV_INVENTORY_FILE)s"
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
redirect_stderr=true
environment=HOME="/home/sysop/",USER="sysop"
autorestart=true
```
Supervisord is started with exec supervisord -c ${CONFIG_FILE} in the container init script and takes over PID 1. In the program section we define the command seiscomp exec fdsnws and pass in the database connection and the inventory to serve.
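For a real-time processing container the same pattern simply extends to several program sections, one per module; a minimal sketch (the module selection here is illustrative):

```ini
[supervisord]
nodaemon=true
logfile=/dev/null
logfile_maxbytes=0

[program:scautopick]
user=sysop
command=seiscomp exec scautopick
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
redirect_stderr=true
autorestart=true

[program:scamp]
user=sysop
command=seiscomp exec scamp
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
redirect_stderr=true
autorestart=true
```

With autorestart=true a crashed module is restarted inside the container without the container itself being replaced.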
We wanted to make all containers stateless where possible. This ensures that we can always swap one container for another, meaning that we can e.g. start a new FDSNWS container before bringing down the old one. Network traffic is then switched between the old and new containers seamlessly, with zero downtime.
There are two exceptions:
- The container running seedlink and slarchive is not stateless, because during shutdown the buffer files and state are written to a persistent disk. The order matters: the old container shuts down and writes its state, and only then does the new container start up and read this state file.
- The container running scmaster cannot be hot-swapped, because only one scmaster should connect to the database at a time. We make sure to bring the first container down, and only then the new container up.
Our SC database is maintained outside of the SeisComP Docker images. We use AWS RDS, but it could just as well be a standalone PostgreSQL or MySQL instance, Dockerized or not. Furthermore, we do not use the database for inventory or configuration at all and never run seiscomp update-config. Instead:
- Inventory is maintained within the base image and not read from the database; it is passed to each module using the --inventory-db flag.
- Configuration files and station bindings are also maintained within the base image and not read from the database. We use the tool bindings2cfg and pass the configuration to each module using the --config-db flag.
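In practice this is a build-time step followed by a runtime flag; a sketch (paths are illustrative, and the bindings2cfg output option should be checked against your SC version's bindings2cfg --help):

```
# Build time: convert key-file bindings from ./etc into an SCML file
seiscomp exec bindings2cfg -o /home/sysop/config.xml

# Runtime: each module reads files instead of querying the database
seiscomp exec scautopick \
    --config-db "file:///home/sysop/config.xml" \
    --inventory-db "file:///home/sysop/inventory.xml"
```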
The reason is that each image we deploy has a version, and we want each version to ship with a specific configuration and inventory. This would be much harder to guarantee if that information were maintained in a database that can be modified outside of the containers.
SC and its messaging system are well suited to running processes in different containers or on different servers. All you need to do is pass the connection string for the messaging system, and sometimes the database, to the various containers.
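For example, a processing module in one container can connect to the scmaster running in another simply via its connection flags; a sketch (hostnames and queue name are illustrative for our setup):

```
# Connect to scmaster in another container/host via the messaging URL,
# plus the shared database for reading
seiscomp exec scautopick \
    -H scmaster.internal/production \
    -d "postgresql://sysop:password@db.internal/seiscomp"
```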