// Custom Java Runtimes with jlink [and jdeps for classpath applications]

The jlink command line tool can be used to create custom java runtimes, which only include the functionality required by the (modular) java application. However, what if the application isn't modular and still uses the classpath? In this case an extra step is needed to determine which JDK modules are required by the application before jlink can be used.

classic classpaths: finding module dependencies with jdeps

jdeps is excellent for porting classic classpath based applications to java modules. It analyzes jars and list all their dependencies, which can be other jars, or modules, with package granularity. Although we don't want to port the dusty application to the module system for this blog post, listing all the module dependencies is exactly what we need for jlink, to be able to create a custom java runtime.

jdeps produces the following output:


# foo.jar depends on bar.jar
foo.jar -> bar.jar
...
# or foo.jar depends on a module
foo.jar -> module.name
...

Since the tool is intended to assist with porting applications to java modules, the default output will be fairly detailed down to package dependencies. The summary (-s) omits all that and only lists jars or modules.

All we have to do is to go recursively through all jars and remember the module names they depend on.


# -s omit detailed package dependencies
# -R analyze recursively through all found dependencies
# --multi-release 16 for the case that there are multi release jars involved
$JDK/bin/jdeps -s -R --multi-release 16 --class-path 'lib/*' dusty-application.jar
jakarta.persistence-2.2.3.jar -> java.base
jakarta.persistence-2.2.3.jar -> java.instrument
jakarta.persistence-2.2.3.jar -> java.logging
jakarta.persistence-2.2.3.jar -> java.sql
foo.jar -> bar.jar
...

Some greping and deduplication and we have a short list of JDK modules our application depends on.


$JDK/bin/jdeps -s -R --multi-release 16 --class-path 'lib/*' dusty-application.jar\
 | grep -Ev '\.jar$' | cut -d " " -f 3 | sort -u
java.base
java.desktop
java.instrument
java.logging
java.naming
java.net.http
java.security.jgss
java.sql
java.xml
jdk.unsupported

Thats it? Not quite. Analyzing an application like that won't show dependencies which are caused via reflection. So you will have to take a good look at the resulting modules and probably add some manually. A good candidate are jdk.crypto.* modules. jlink can assist with that task too by listing service providers.


$JDK/bin/jlink --suggest-providers java.security.Provider
Suggested providers:
  java.naming provides java.security.Provider used by java.base
  java.security.jgss provides java.security.Provider used by java.base
  jdk.crypto.ec provides java.security.Provider used by java.base
...

You might also want to add modules like jdk.jfr, java.management or jdk.localedata even when the application isn't direclty depending on them. You can experiment with options like --compile-time which will usually list more dependencies (default is runtime analysis). jlink adds transitive dependencies automatically.

Any missing modules should be quickly noticed during integration tests.

custom runtimes with jlink

Once we have the module list we can give it to jlink for the actual heavy lifting.


MODS=...
JDK=/home/mbien/dev/java/jdk-16.0.1+9
DIST=custom-$(basename $JDK)

$JDK/bin/jlink -v --add-modules $MODS\
 --compress=2 --no-header-files --no-man-pages\
 --vendor-version="[mbien.dev pod REv1]"\
 --output $DIST

du -s $DIST

jlink is automatically using the modules of the JDK which contains the tool, which means that the example above will create a runtime based on jdk-16.0.1+9. The flag --module-path would set a path to a alternative module folder. If the application is already modular, the path could also include the application modules, if they should be part of the runtime too.

some noteworthy flags:

  • --strip-debug this is going to strip debug symbols from both the native binaries and bytecode, you probably don't want to use this since this will remove all line numbers from stack traces. Its likely that the binaries of the JDK distribution you are using have most of their symbols already stripped.
  • --strip-native-debug-symbols=objcopy=/usr/bin/objcopy Same as above, but only for native binaries
  • --compress=0|1|2 0 for no compression, 1 for string deduplication, 2 for zip compressed modules. This might influence startup time slightly; see CDS section below
  • --include-locales=langtag[,langtag]* include only a subset of locales instead of the full module
  • --vendor-version="i made this" this looks uninteresting at first glance but it is very useful if you want to recognize your custom runtime again once you have multiple variants in containers. Adding domain name/project name or purpose of the base image helps.
    It will appear on the second line of the output of java -version

full JDK as baseline


MODS=ALL-MODULE-PATH

# --compress=1
138372 (151812 with CDS)

# --compress=2
102988 (116428 with CDS)

# --compress=2 --strip-debug
90848 (102904 with CDS)

custom runtime example


MODS=java.base,java.instrument,java.logging,java.naming,java.net.http,\
java.security.jgss,java.sql,java.xml,jdk.jfr,jdk.unsupported,java.rmi,\
java.management,java.datatransfer,java.transaction.xa,\
jdk.crypto.cryptoki,jdk.crypto.ec

# --compress=1
55996 (69036 with CDS)

# --compress=2
45304 (58344 with CDS)

# --compress=2 --strip-debug
40592 (52240 with CDS)

(this a aarch64 build of OpenJDK, x64 binaries are slightly larger)

Most modules are actually fairly small, the 5 largest modules are java.base, java.desktop, jdk.localedata, jdk.compiler and jdk.internal.vm.compiler. Since java.base is mandatory anyway, adding more modules won't significantly influence the runtime size unless you can't avoid some of the big ones.

Once you are happy with the custom runtime you should add it to your test environment of the project and IDE.

CDS - to share or not to share?

I wrote about class data sharing before so I keep this short. A CDS archive is a file which is mapped into memory by the JVM on startup and is shared between JVM instances. This even works for co-located containers, sharing the same image layer which includes the CDS archive.

Although it adds to the image size, zip compression + CDS seems to be always smaller than uncompressed without CDS. The CDS file should also eliminate the need to decompress modules during startup since it should contain the most important classes already. So the decision seems to be made easy: compact size + improved startup time and potential (small) memory footprint savings as bonus.

Leaving the CDS out frees up ~10 MB of image size. If this matters to your project, benchmark it to see if it makes a difference. It is also possible to put application classes into the shared archive or create a separate archive for the application which extends the runtime archive (dynamic class data sharing). Or go a step further and bundle the application and runtime in a single, AOT compiled, compact, native image with GraalVM (although this might reduce peak throughput due to lack of JIT and have a smaller choice of GCs beside other restrictions) - but this probably won't happen for dusty applications.


# create CDS archive for the custom runtime
$DIST/bin/java -Xshare:dump

# check if it worked, this will fail if it can't map the archive
$DIST/bin/java -Xshare:on -version

# list all modules included in the custom java runtime
$DIST/bin/java --list-modules

summary

Only a single extra step is needed to determine most of the dependencies of an application, even if it hasn't been ported to java modules yet. Maintaining a module list won't be difficult since it should be fairly static (backend services won't suddenly start using swing packages when they are updated). Make sure that the custom runtime is used in your automated tests and IDE.

Stop using java 8, time to move on - even without a modular application :)


- - - sidenotes - - -

If you want to create a runtime which can compile and run single-file-sourcode-programs adding just jdk.compiler isn't enough. This will result in a a little bit misleading "IllegalArgumentException: error: release version 16 not supported" exception. Solution is to add jdk.zipfs too - I haven't investigated it any further.

If jlink has to be run from within a container (can be useful for building for foreign archs, e.g aarch64 on x64), you might have to change the process fork mechanism if you run into trouble (java.io.IOException: Cannot run program "objcopy": error=0, Failed to exec spawn helper: pid: 934, exit value: 1). (export JAVA_TOOL_OPTIONS="-Djdk.lang.Process.launchMechanism=vfork" worked for me)




Post a Comment:
  • HTML Syntax: NOT allowed