Best practices for containerized SeisComP deployment

Several members of this forum have voiced interest in opening a discussion about containerizing SeisComP. This thread will address these issues.

How should the SeisComP container ecosystem be structured?

  • Database location
    • Should the database typically be included inside the same container as SeisComP, or
    • Is it best practice to run the database in its own container?
  • Documenting a “how to” for containers
  • Using remote data sources
  • etc.

I am leaving this thread “Uncategorized” until we agree on the best place for it. Looking forward to some interesting interaction.

Hi everybody,
I dockerized the scheli service. It does not need a database to run; it gets its configuration from XML files. You can get it at https://github.com/KDFischer/docker-seiscomp
You first have to download or prepare the SeisComP software and make it available as a .tgz file to build the Docker image.
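
For readers new to this pattern, such a build might look like the following Dockerfile sketch. This is an illustration only — the image name, paths, and the final command are assumptions, not taken from the repository above, which has the real Dockerfile:

```dockerfile
# Illustrative sketch only -- see the linked repository for the actual build.
FROM ubuntu:22.04

# Unpack a locally prepared SeisComP tarball into the image
COPY seiscomp.tgz /tmp/seiscomp.tgz
RUN mkdir -p /home/sysop \
 && tar -xzf /tmp/seiscomp.tgz -C /home/sysop \
 && rm /tmp/seiscomp.tgz

ENV SEISCOMP_ROOT=/home/sysop/seiscomp
ENV PATH=${SEISCOMP_ROOT}/bin:${PATH}

# scheli reads its configuration from XML files mounted at runtime
CMD ["seiscomp", "exec", "scheli"]
```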

1 Like

Your repository cannot be found. Is it private?

1 Like

Sorry, I fixed this. It’s public now.

1 Like

Thanks for sharing, Kasper, I’ll take a look.

Hello,

Having a specific thread for this is a nice idea :smiley:.
In IPGP observatories, we recently created a Docker image to deploy scolv or any other GUI client on any OS, including Windows and Apple Silicon Macs. Since it is client oriented, it does not need any database. The config files and locator configuration (travel time tables) are mounted from the host computer to make them easier to update.
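
As a rough illustration of the approach (the image name and mount paths here are hypothetical; the linked repositories contain the actual invocation), running a GUI client with host-mounted configuration could look like:

```shell
# Hypothetical invocation: image name and paths are placeholders.
# X11 forwarding shown for Linux; macOS typically needs XQuartz
# and a TCP DISPLAY instead of the socket mount.
docker run --rm \
  -e DISPLAY="$DISPLAY" \
  -v /tmp/.X11-unix:/tmp/.X11-unix \
  -v "$HOME/.seiscomp":/home/sysop/.seiscomp \
  -v "$HOME/seiscomp-config/etc":/home/sysop/seiscomp/etc \
  seiscomp-gui:latest scolv
```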

Main repository: GitHub - IPGP/seiscomp-gui-docker: SeisComP scolv GUI utility in a docker container

A PR is waiting to make it work with the latest GSM update; this is the working fork: GitHub - jmsaurel/seiscomp-gui-docker: SeisComP scolv GUI utility in a docker container

It is very smooth on Apple Silicon Macs and runs much better than the VirtualBox setup we previously used on old Intel Macs.

Feel free to look at this container and comment.

Jean-Marie.

1 Like

Hello everyone,

I’m looking forward to hearing thoughts and advice from @jabe and others in this thread!

I’ve found Docker to be a practical way to manage and run SeisComP client modules such as scfinder or scdlpicker when their execution environment is particularly complex or difficult to integrate directly into the core SeisComP setup — using SED-specific configurations within a vanilla SeisComP Docker setup.

Sidenote: I think that for similar reasons, the SED group also occasionally deploys SeisComP companion software, such as ShakeMap, as microservice-like Docker containers (@kaestlip should confirm).

Docker might also be quite helpful for running advanced SeisComP tasks in standalone tools, e.g.:

The release of GSM has made all of this considerably easier, many thanks to the team at gempa for that! Also worth noting for @saurel: as of now, python-venv is its only required dependency.

Cheers,

Fred

I don’t have any advice in that regard. I do not use Docker. I see the benefit and power of using it, but I personally stick to the development of SeisComP in general and try to make sure that it compiles and works on several Linux flavours. I leave it to you to make use of the code and to utilize it in the way you find most convenient.

I am interested in running GUI applications inside a Docker container in a performant way; I am not sure if that is possible at all. I am looking forward to following this discussion, an interesting topic indeed.

Just out of curiosity, is there a specific performance issue with the GUIs that might be solved with Docker?

No, Docker should not solve performance issues, but preferably it should not add a performance penalty either. One could use Docker with a tested Qt environment to solve display issues that sometimes occur with a particular desktop manager or a particular Qt version.

Thanks for clarifying.

Hi all, great idea for a thread. We’ve been running SC in Docker on AWS Cloud for about 6 months in production (using ECS) and have some experience to share. This includes:

  • A container for seedlink and slarchive
  • Multiple Docker containers for seven real-time processing pipelines; including modules scautopick, scamp, scmag, scautoloc
  • A container for scmaster and scevent
  • Multiple load balanced containers for fdsnws

At the center we have our own SeisComP base image built on Ubuntu that installs all SC dependencies, compiles the software from source, and sets up the environment. Attached is base-image.zip (11.6 KB), which contains the Dockerfile for our base image and some example inventory. You will need to add a seiscomp.tar.gz with the SC source code to the folder for it to build (dependency versions are pinned, so they may need to be updated if the image does not build). Each deployment that runs SeisComP inherits from this base image and adds only its required module configuration (./etc) and the supervisord configuration to start its modules.
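
A per-deployment image along these lines might be sketched as follows. The registry name, tag, and paths are made up for illustration and are not taken from the attached zip:

```dockerfile
# Illustrative per-deployment image inheriting a shared base image.
FROM registry.example.org/seiscomp-base:latest

# Only the deployment-specific module configuration and bindings
COPY etc/ /home/sysop/seiscomp/etc/

# Process manager configuration to run the modules in the foreground
COPY supervisord.conf /etc/supervisord.conf

CMD ["supervisord", "-c", "/etc/supervisord.conf"]
```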

We never run modules as daemons (seiscomp start); instead we always use the exec command to make them run in the foreground. These processes are managed by supervisord, a simple Python process manager meant to run as pid 1 in a container. We often run multiple modules for real-time processing in a single container, which is no problem with supervisord. A simple example of the supervisord config file that we use in our FDSNWS containers:

[supervisord]
nodaemon=true
user=root
logfile=/dev/null
logfile_maxbytes=0

[program:fdsnws]
user=sysop
command=seiscomp exec fdsnws -d "postgresql://%(ENV_DB_READ_ONLY)s" --inventory-db "file://%(ENV_INVENTORY_FILE)s"
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
redirect_stderr=true
environment=HOME="/home/sysop/",USER="sysop"
autorestart=true

Supervisord is started with exec supervisord -c ${CONFIG_FILE} during the container init script and takes over pid 1. In the program section we define the command seiscomp exec fdsnws and pass in the database connection and the inventory to serve.

We wanted to make all containers stateless where possible. This ensures that we can always swap one container for another, meaning that we can, e.g., start a new FDSNWS container before bringing down the old version. Network traffic is shifted between the old and new containers seamlessly, with zero downtime.

There are two exceptions:

  • The container running seedlink and slarchive is not stateless, because during shutdown the buffer files and state are written to a persistent disk. In order: the old container shuts down and writes its state; only then does the new container start up and read this state file.
  • The container running scmaster cannot be hot swapped, because only one scmaster should connect to the database at a time. We make sure to bring the first container down, and only then bring the new container up.
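
The seedlink exception above can be sketched as a docker-compose fragment. The service and volume names are made up for illustration; the buffer path follows the default SeisComP layout:

```yaml
# Illustrative only: keeps the seedlink buffer on a named volume so its
# state survives container replacement; exact paths depend on your layout.
services:
  seedlink:
    image: seiscomp-acquisition:latest
    volumes:
      - seedlink-buffer:/home/sysop/seiscomp/var/lib/seedlink
    stop_grace_period: 2m   # give seedlink time to write its state on shutdown
volumes:
  seedlink-buffer:
```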

Our SC database is maintained outside of the SeisComP Docker images. We use AWS RDS, but it could be a standalone PostgreSQL or MySQL instance too, Dockerized or not. Furthermore, we do not use the database for inventory or configuration at all, and we never run seiscomp update-config. Instead:

  • Inventory is maintained within the base image, not read from the database, and is passed to each module using the --inventory-db flag.
  • Configuration files and station bindings are also maintained within the base image and not read from the database. We use the tool bindings2cfg and pass the configuration to each module using the --config-db flag.

The reason is that each image we deploy has a version, and we want each version to have a specific configuration and inventory. This would be much harder to guarantee if this information were maintained in a database that can be modified outside of the containers.
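
A sketch of what this database-free approach looks like at build or start time (file names are illustrative; check the bindings2cfg help for the exact options of your SeisComP version):

```shell
# Convert key files / station bindings into a configuration XML
# (assumed invocation -- verify against your SeisComP version):
seiscomp exec bindings2cfg --key-dir ./etc/key -o /opt/config.xml

# Start a module without touching the database for inventory or config
seiscomp exec scautopick \
  --inventory-db "file:///opt/inventory.xml" \
  --config-db "file:///opt/config.xml"
```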

SC and its messaging system are well suited for running processes in different containers or servers. All you need to do is pass the connection string for the messaging system, and sometimes the database, to the various containers.
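
For example, a processing module in one container can reach scmaster in another purely via connection parameters. The host names and credentials below are illustrative placeholders:

```shell
# scmaster runs in a container reachable as "messaging" on the Docker
# network; -H selects the messaging host and queue. The database URI is
# only needed by modules that access the database directly.
seiscomp exec scamp \
  -H messaging/production \
  --database "mysql://sysop:sysop@db/seiscomp"
```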

1 Like

Hi @koymans,

Thank you very much for sharing your Dockerisation approach — it’s incredibly helpful. I’ve been working on a similar containerised SeisComP setup inspired by your architecture, where each module runs in the foreground using seiscomp exec and is managed by supervisord, avoiding update-config by passing configuration explicitly using the --inventory-db and --config-db flags.

However, I’ve encountered an issue when trying to run the seedlink module.

Unlike other modules (e.g., scautopick, scmag), seedlink does not appear to support seiscomp exec seedlink, likely because it doesn’t have a corresponding binary in the $SEISCOMP_ROOT/bin directory. As a result, I can only launch it with seiscomp start seedlink, which doesn’t run in the foreground.

Could you please share how you handled this in your setup?

  • Did you use a custom script or wrapper for seedlink under seiscomp exec?
  • How is your supervisord.conf defined for the seedlink container?
  • Any tweaks required to make seedlink run correctly in foreground mode?

Thanks again for your insights.

Best regards,
Chanthujan

Hi Chan, glad to hear it is useful. Indeed, seedlink does not support seiscomp exec. We run the binary from /sbin directly in the foreground under supervisord.

Our Docker container starts with an init script to update the seedlink and slarchive configuration, and passes pid 1 to supervisord with exec:

# Seedlink and slarchive configuration
su sysop -c "source /home/sysop/.bash_profile && seiscomp update-config seedlink slarchive"

# Start the main process and replace init script using exec
exec supervisord -c ${SUPERVISORD_CONFIG_FILE}

The supervisord config file for seedlink and slarchive:

[supervisord]
nodaemon=true
pidfile=/var/run/supervisord.pid
user=root
logfile=/dev/null
logfile_maxbytes=0
minfds=65535

[program:seedlink]
user=sysop
command=/home/sysop/seiscomp/sbin/seedlink -v -f /home/sysop/seiscomp/var/lib/seedlink/seedlink.ini
stdout_logfile=/home/sysop/.seiscomp/log/seedlink.log
stdout_logfile_maxbytes=0
redirect_stderr=true
environment=HOME="/home/sysop/",USER="sysop"
autorestart=true
stopwaitsecs=120
priority=1

[program:slarchive]
user=sysop
command=seiscomp exec slarchive 127.0.0.1:18000
        -SDS /home/sysop/seiscomp/var/lib/archive
        -b
        -x /home/sysop/seiscomp/var/lib/slarchive/slarchive_127.0.0.1_18000.seq
        -k 0
        -Fi:1
        -Fc:900
        -l /home/sysop/seiscomp/var/lib/slarchive/slarchive.streams
stdout_logfile=/home/sysop/.seiscomp/log/slarchive.log
stdout_logfile_maxbytes=0
redirect_stderr=true
environment=HOME="/home/sysop/",USER="sysop"
autorestart=true
stopwaitsecs=120
priority=1

1 Like

Hi All,

I have created a repository for the Dockerization of SeisComP. Thanks, @koymans, for sharing your knowledge.

Here is the link: GitHub - Chanthujan/DockerizationSeiscomp: This repository provides a modular, containerised deployment for the SeisComP earthquake processing system using Docker. Each SeisComP module is isolated in its own container for flexible scaling, maintainability, and reproducibility.

Please try it and share your thoughts.

Thanks,
Chanthujan

3 Likes

Thank you for posting, @koymans, this is excellent. I need to do some reading on supervisord. I appreciate it.

Thank you so much, @chanthujanc, especially for providing the github link. I will comment again when I’ve had some time to look at your architecture in detail.

1 Like

Thanks everyone for sharing your experiences, lots of valuable stuff here. I’m experimenting with moving our SeisComP pipeline into a docker-compose project (one container/service per module), and it seems promising so far.

For those using --inventory-db/--config-db: are you experiencing issues with memory usage? With this approach I’m seeing each SeisComP module’s RAM usage at 500 MB+ (when it would be ~100 MB otherwise). The merged inventory XML I’m using weighs 55 MB.

Memory is already a concern in some of our AWS deployments, so this is probably a non-starter for us, and I’m looking into alternative approaches. Perhaps baking the inventory + config into a MariaDB image and using that to run a separate database container for --inventory-db.

Parsing XML costs RAM, that is true. It should release that memory again once parsing is done, but maybe it doesn’t. You could also bake your inventory and config into an SQLite3 database file and use that; it should use less memory than XML parsing.
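
A sketch of that idea, using scdb to populate an SQLite file at image build time. The sqlite3 URI form and the exact flags are assumptions here — verify them against the scdb help for your SeisComP version:

```shell
# Populate an SQLite3 database with inventory at image build time
# (illustrative invocation):
seiscomp exec scdb -i inventory.xml -d sqlite3:///opt/seiscomp.sqlite

# Point modules at the file instead of parsing XML on startup
seiscomp exec fdsnws --inventory-db "sqlite3:///opt/seiscomp.sqlite"
```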

That sounds surprising; I wouldn’t expect a difference in memory usage once the inventory has been loaded into the data model, regardless of whether it comes from a database or a file. But as @jabe mentioned, maybe there is some leak in the parsing.

Our 8 MB inventory file occupies about ~400 MB of RAM when running the FDSNWS module. We reduced the memory usage of each module by running invextr first, extracting from the main inventory file only the specific inventory each module needs before launching it.
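
That per-module reduction can be sketched like this; the channel pattern and file names are examples, and the exact invextr options should be checked against your SeisComP version:

```shell
# Extract only the networks/stations a module actually needs
# from the full inventory before starting it (illustrative):
seiscomp exec invextr --chans "NZ.*" inventory_full.xml -o inventory_nz.xml

seiscomp exec scautopick --inventory-db "file:///opt/inventory_nz.xml"
```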