// Stopping Containers Correctly
Stopping a container with
$ podman stop container-name
or
$ docker stop container-name
will send SIGTERM to the first process (PID 1) and shut down the container when the process terminates. If this doesn't happen within a certain time frame (default is 10s), the runtime will send SIGKILL to the process and take the container down.
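The TERM-then-KILL sequence can be tried out with plain processes, no container needed. This is only a rough sketch of the idea: stop_with_grace and STOP_RESULT are made-up names, sleep stands in for the container process, and a real runtime does considerably more than this.

```shell
#!/bin/bash
# Hypothetical helper (not part of podman/docker): send SIGTERM, wait up to
# <grace> seconds, then fall back to SIGKILL. Result is stored in STOP_RESULT.
stop_with_grace() {
  local pid=$1 grace=${2:-10} waited=0
  STOP_RESULT=terminated
  kill -TERM "$pid" 2>/dev/null || return 0   # process is already gone
  while kill -0 "$pid" 2>/dev/null; do        # still alive?
    if [ "$waited" -ge "$grace" ]; then
      kill -KILL "$pid" 2>/dev/null           # grace period is over
      STOP_RESULT=killed
      break
    fi
    sleep 1
    waited=$((waited + 1))
  done
  wait "$pid" 2>/dev/null                     # collect the exit status
  return 0
}

# Case 1: a process that reacts to SIGTERM (default action: terminate).
sleep 30 &
stop_with_grace $! 3
result_polite=$STOP_RESULT

# Case 2: a process that ignores SIGTERM and has to be killed.
bash -c 'trap "" TERM; exec sleep 30' &
stubborn_pid=$!
sleep 0.5                                     # give it time to install the trap
stop_with_grace "$stubborn_pid" 2
result_stubborn=$STOP_RESULT

echo "polite: $result_polite, stubborn: $result_stubborn"
```

The second case is the one you want to avoid in a container: the process never gets a chance to clean up.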
So far so good. Things get interesting when your container process isn't PID 1.
This is already the case if the process is started via a shell script.
#!/bin/bash
...
java $FLAGS $APP
Attempting to stop this container will terminate the script, while the JVM keeps running. The container runtime is usually smart enough to notice that a process is still active after the script has terminated and will wait out the grace period anyway before shutting the container down forcefully. The JVM, however, won't notice anything and won't get the opportunity to run shutdown hooks, write JFR dumps or finish transactions.
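The missing signal forwarding can be reproduced outside of a container. A minimal sketch, assuming sleep as a stand-in for the JVM and a temporary directory for the (hypothetical) wrapper script:

```shell
#!/bin/bash
# Demo outside a container: `sleep` stands in for the JVM (assumption).
workdir=$(mktemp -d)

# Wrapper without any trap, like the start script above.
cat > "$workdir/start.sh" <<'EOF'
#!/bin/bash
# The child records its own PID, then becomes `sleep` via exec.
bash -c 'echo $$ > "$0/child.pid"; exec sleep 30' "$1"
EOF

bash "$workdir/start.sh" "$workdir" &
wrapper_pid=$!

# Wait until the child has written its PID.
until [ -s "$workdir/child.pid" ]; do sleep 0.1; done
child_pid=$(cat "$workdir/child.pid")

kill -TERM "$wrapper_pid"          # the signal a "stop" would send
wait "$wrapper_pid" 2>/dev/null    # the wrapper dies from the signal

if kill -0 "$child_pid" 2>/dev/null; then
  child_survived=yes               # nobody forwarded SIGTERM to the child
else
  child_survived=no
fi
echo "child survived: $child_survived"

kill -KILL "$child_pid" 2>/dev/null  # clean up the orphan
rm -rf "$workdir"
```

The wrapper is gone, the child never saw a signal.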
signal delegation
One way to solve this is by delegating the signal from the shell script to the main process:
...
java $FLAGS $APP & # detach process from script
PID=$! # remember process ID
trap 'kill -TERM $PID' INT TERM # delegate kill signal to JVM
wait $PID # attach script to JVM again; note: TERM signal unblocks this wait
trap - TERM INT
wait $PID # wait for JVM to exit after signal delegation
EXIT_STATUS=$?
The second wait prevents the script from exiting before the JVM has finished terminating. It is required because the first wait is unblocked as soon as the script receives the signal.
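To watch the delegation work end to end without a JVM, here is a small sketch under a few assumptions: app.sh is a hypothetical stand-in for the Java process that simulates a slow shutdown hook, and the file names in the temporary directory are made up for the demo.

```shell
#!/bin/bash
workdir=$(mktemp -d)

# Stand-in for the JVM (assumption): takes a moment to shut down on SIGTERM.
cat > "$workdir/app.sh" <<'EOF'
#!/bin/bash
trap 'sleep 0.5; echo "hooks done" > "$1/result"; exit 0' TERM
echo ready > "$1/ready"
while true; do sleep 0.1; done
EOF

# Start script using the delegation pattern from above.
cat > "$workdir/start.sh" <<'EOF'
#!/bin/bash
bash "$1/app.sh" "$1" &            # detach process from script
PID=$!                             # remember process ID
trap 'kill -TERM $PID' INT TERM    # delegate signal to the child
wait $PID                          # unblocked by the signal
trap - TERM INT
wait $PID                          # wait for the child to really exit
echo $? > "$1/exit_status"
EOF

bash "$workdir/start.sh" "$workdir" &
wrapper_pid=$!
until [ -e "$workdir/ready" ]; do sleep 0.1; done
sleep 0.2                          # make sure both traps are installed

kill -TERM "$wrapper_pid"          # the signal a "stop" would send
wait "$wrapper_pid"

result=$(cat "$workdir/result")
exit_status=$(cat "$workdir/exit_status")
rm -rf "$workdir"
echo "result: $result, exit status: $exit_status"
```

If you remove the second wait from start.sh, the script exits right after forwarding the signal and never sees the child's real exit status.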
it still didn't work
Interestingly, after implementing this (and trying out other variations of the same concept) it still didn't work for some reason - debugging showed the trap never fired.
Turns out that nothing was wrong with the signal delegation - signals just never reached the script :). So I searched around a bit and found this article which basically described the same async/wait/delegate method in greater detail (that's where I stole the EXIT_STATUS line from), so I knew it had to work. Another great article gave me the idea to check the Dockerfile again.
FROM ...
...
CMD ./start.sh
The sh shell interpreting the bash script was the first process!
$ podman ps
CONTAINER ID IMAGE COMMAND ...
de216106ff39 localhost/test:latest /bin/sh -c ./start.sh ...
htop (on the host) in tree view shows it more clearly:
$ htop
1 root ... /sbin/init
15643 podpilot ... - /usr/libexec/podman/conmon --api-version ...
15646 100996 ... | - /bin/sh -c ./start.sh ...
15648 100996 ... | - /bin/bash ./start.sh
15662 100996 ... | - /home/jdk/bin/java -Xshare:on ...
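If htop isn't at hand, the same question can be answered by reading /proc directly. A small sketch, assuming a Linux system with /proc mounted; pid1_cmdline is a made-up helper name:

```shell
#!/bin/sh
# Hypothetical helper: print the command line of PID 1 in the current
# PID namespace. /proc/<pid>/cmdline separates arguments with NUL bytes.
pid1_cmdline() {
  tr '\0' ' ' < /proc/1/cmdline
}

pid1=$(pid1_cmdline)
echo "PID 1 is: $pid1"
```

Run through podman exec inside the container (assuming the image ships sh and tr), this should show the /bin/sh -c wrapper from the listing above rather than the script itself.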
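To fix this, a different CMD (or ENTRYPOINT) syntax is needed. The shell form used above wraps the command in /bin/sh -c, while the exec form (a JSON array) starts the script directly: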
FROM ...
...
CMD [ "./start.sh" ]
Let's check again after rebuild + run:
$ podman ps
CONTAINER ID IMAGE COMMAND ...
72e3e60ed60b localhost/test:latest ./start.sh ...
$ htop
1 root ... /sbin/init
15746 podpilot ... - /usr/libexec/podman/conmon --api-version ...
15749 100996 ... | - /bin/bash ./start.sh ...
15771 100996 ... | - /home/jdk/bin/java -Xshare:on ...
Much better. Since the script is now executed directly, it is able to receive and delegate the signals to the JVM. The Java Flight Recorder records also appeared in the volume, which meant that the JVM had enough time to convert the JFR repository into a single record file. The podman stop command also returned within a fraction of a second.
Since the trap is listening to SIGINT too, even the CTRL+C signal is properly propagated when the container is started in non-detached mode. Nice bonus for manual testing.
alternatives
Starting the JVM with
exec java $FLAGS $APP
will replace the shell process with the JVM process without changing PID or process name. Disadvantage: no java command line in top, and the shell won't execute any lines after the exec line (because it basically doesn't exist anymore).
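The PID part is easy to verify with a throwaway script. In this sketch a second shell stands in for java (assumption), and the script path is a temporary file:

```shell
#!/bin/bash
script=$(mktemp)

cat > "$script" <<'EOF'
#!/bin/bash
echo "shell pid: $$"
exec sh -c 'echo "exec pid: $$"'   # replaces the bash process
echo "never reached"               # bash no longer exists at this point
EOF

output=$(bash "$script")
rm -f "$script"
echo "$output"

# Extract both PIDs for comparison.
shell_pid=$(echo "$output" | sed -n 's/^shell pid: //p')
exec_pid=$(echo "$output" | sed -n 's/^exec pid: //p')
```

Both lines report the same PID and the last echo never runs, so a signal sent to that PID reaches the exec'd process directly.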
... and if you don't care too much about the container life cycle, you can always tell the JVM directly to shut down. This will close all parent shells bottom-up until PID 1 has terminated, which finally stops the container.
podman exec -it container sh -c "kill \$(jps | grep -v Jps | cut -f1 -d' ')"
- - -
lessons learned:
sometimes two square brackets make the difference :)