Skip to content

Create a custom JRE for the Docker image#993

Open
toindev wants to merge 1 commit into
gaul:masterfrom
toindev:401_custom_docker_JRE
Open

Create a custom JRE for the Docker image#993
toindev wants to merge 1 commit into
gaul:masterfrom
toindev:401_custom_docker_JRE

Conversation

@toindev
Copy link
Copy Markdown

@toindev toindev commented Mar 1, 2026

Fixes #401

Improvement : reduces the attack surfaces on the Docker image (makes our automated tools scream less often).

Custom Java runtime creation:

  • Added a Maven exec-maven-plugin execution to run jdeps during the package phase, outputting the required Java modules to target/jdeps-modules.txt. This automates module dependency analysis for jlink.
  • Updated the Dockerfile to use a multi-stage build: the first stage creates a custom JRE with jlink using modules listed in jdeps-modules.txt, and the second stage builds the final runtime image
  • Added comments in the Dockerfile (helps with AI coding assistants)

CI/CD workflow:

  • Builds a custom stripped down JRE to run the tests, as close as possible to the one used in the Dockerfile (some extra modules are required by Maven).
  • Modified .github/workflows/ci-main.yml to upload and download the new jdeps-modules.txt and the jar-with-dependencies as build artifacts, ensuring these are available for the Docker build step.
  • Updated .dockerignore to allow target/jdeps-modules.txt into the Docker build context, so the custom JRE can be constructed from actual dependencies.

Build artifact:

  • Changed the assembly configuration to exclude module-info.class files from the jar-with-dependencies, preventing conflicts and unnecessary files in the runtime.

Docker build :

  • Kept ubuntu as a base for the final image, so as not to break any downstream build

@gaul
Copy link
Copy Markdown
Owner

gaul commented Apr 16, 2026

Sorry I missed this earlier -- could you resolve the conflicts?

@toindev toindev force-pushed the 401_custom_docker_JRE branch from f45c0bb to 9f55520 Compare May 5, 2026 07:50
@toindev
Copy link
Copy Markdown
Author

toindev commented May 5, 2026

Sorry I missed this earlier -- could you resolve the conflicts?

@gaul : 😄 I understand missing a notification on GH

@gaul
Copy link
Copy Markdown
Owner

gaul commented May 6, 2026

Thanks for tackling this! Good direction overall — jlink meaningfully reduces image size and attack surface. A few things to address before merge:

Should not be in this PR

  • .vscode/settings.json is personal IDE/Peacock theme config, unrelated to the JRE work. Please drop it (and consider adding .vscode/ to .gitignore). It also lacks a trailing newline.

Likely bugs / dead code

  • The s3proxy-jar artifact (jar-with-dependencies) is uploaded and downloaded to target/, but .dockerignore only allows target/s3proxy and target/jdeps-modules.txt into the Docker build context — so the jar never reaches the image. Either drop the upload/download or document why it's needed. target/s3proxy (appassembler) is what's actually copied.
  • RUN chmod +x /opt/s3proxy/run-docker-container.sh is redundant; the script is already 0755 in the repo and COPY preserves mode.

Drift / duplication

  • The CI jlink invocation and the Dockerfile jlink invocation have already diverged (CI adds jdk.compiler,jdk.zipfs,java.instrument,jdk.attach for Surefire). Consider just running tests on the full JDK to avoid drift — or factor the module list into a shared script. Otherwise these will desync silently over time.

Forward compatibility

  • --compress=2 is deprecated in JDK 21 and emits a warning. Prefer --compress=zip-6 (or zip-9) in both places.

Coverage gap

  • Nothing in CI runs the produced Docker image end-to-end. Since jdeps --ignore-missing-deps will silently skip modules used reflectively or via ServiceLoader, a smoke test that boots the container and hits an endpoint would catch missing-module regressions.

Minor

  • --bind-services pulls in service-provider modules transitively; usually desirable but worth confirming the resulting size vs. an explicit allowlist.
  • Stage 1 does a full apt-get update && install dumb-init just to copy one binary out — installing it in stage 2 (which already runs apt) would be more conventional.
  • The cp .../cacerts line is only needed if the source JDK's cacerts are newer than what java.base ships; a one-line comment on intent (or dropping it if unnecessary) would help.
  • "helps with AI coding assistants" comments are noise; prefer comments that explain non-obvious why.

Nice changes

  • Multi-stage Dockerfile, keeping Ubuntu base for downstream compatibility, deriving the module list from jdeps, and excluding module-info.class from the fat jar — all good.

Copy link
Copy Markdown
Owner

@gaul gaul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few inline notes — the must-fix item and the two most important correctness/validation gaps. (Minor nits and optional enhancements shared separately, not posted inline.)

Comment thread .vscode/settings.json
@@ -0,0 +1,22 @@
{
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Must fix — accidental commit. This is personal Peacock editor color settings, unrelated to the custom-JRE change. Please drop it from the PR. The repo doesn't track .vscode/, so consider adding .vscode/ to .gitignore to prevent it being re-added.

Comment thread Dockerfile

# Create a custom Java runtime with jlink
RUN jlink \
--add-modules $(cat /tmp/modules.txt) \
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Runtime modules can be missed — risk of broken outbound TLS. This list comes from static jdeps analysis (+ --bind-services), which misses dependencies loaded by name rather than referenced statically. The notable one is the EC crypto provider jdk.crypto.ec: ECDHE TLS handshakes to S3/Azure/GCS need sun.security.ec.SunEC, but it's loaded via the java.security config — so jdeps --print-module-deps won't report it and --bind-services won't pull it. The image can start fine, then fail on the first HTTPS connection to a backend.

This is compounded because no CI step runs the JRE that actually ships:

  • The Build custom JRE with jlink step builds a different, superset JRE (it adds jdk.compiler,jdk.zipfs,java.instrument,jdk.attach), and the Containerize job builds the image but never runs it (push: false on PRs). A missing runtime module won't surface.
  • Every test backend talks plaintext HTTP to localhost (transient/filesystem/azurite/localstack/fake-gcs), so nothing exercises real ECDHE TLS to a remote endpoint.

Suggestions: add jdk.crypto.ec (and likely jdk.crypto.cryptoki) explicitly to --add-modules; and/or add a smoke test that docker runs the built image and makes one real HTTPS backend call. Ideally the test job uses this same minimal module set so CI validates the artifact that ships.

path: target/s3proxy
- uses: actions/upload-artifact@v7
with:
name: s3proxy-jar
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused artifact. This s3proxy-jar is uploaded here and downloaded again in the Containerize job (around line 289), but the Docker build never uses it: .dockerignore only allows /target/s3proxy, /target/jdeps-modules.txt, and the run script into the build context, and no COPY references *-jar-with-dependencies.jar. (target/s3proxy is already that jar made executable.) Both this upload and the matching download can be removed.

@gaul
Copy link
Copy Markdown
Owner

gaul commented May 27, 2026

A few minor / non-blocking suggestions to accompany the inline notes above (none are correctness blockers):

pom.xmljdeps analysis

  • --multi-release resolves to 17 (${java.version}), but the runtime is JDK 21 (21-jdk/21-jre, CI java-version: 21). jdeps then analyzes the Java-17 view of multi-release jars while the image runs on 21; if a dependency ships META-INF/versions/21/ classes that need an extra module, it'd be under-reported. Prefer passing 21 (or a dedicated property) to match the runtime.
  • jdeps now runs on every mvn package/verify/install, since the exec execution is bound to the package phase. That adds a hard dependency on a full JDK (with jdeps on ${java.home}) for all builds, and runs it in the three other CI jobs (azuriteTests/localstackTests/fakeGcsServerTests) where the output is unused. Consider scoping it to a profile activated only for the Docker build.

Dockerfile

  • --compress=2 is deprecated in JDK 21 (also in the CI jlink step) — it still works but warns and is slated for removal. Use the new form, e.g. --compress=zip-6.
  • RUN chmod +x /opt/s3proxy/run-docker-container.sh is redundant — the file is already mode 100755 in git, and COPY preserves it; this just adds a layer.
  • apt-get upgrade -y makes the image non-reproducible (contents drift by build date). Reasonable as a security trade-off, just a conscious call.
  • Cross-base binary-copy assumptiondumb-init and the JRE are produced on eclipse-temurin:21-jdk and copied/run on ubuntu:24.04; this holds only while both are the same glibc generation (noble). Either pin the build base (eclipse-temurin:21-jdk-noble) or apt-get install dumb-init directly in the final stage instead of copying across stages.

.github/workflows/ci-main.yml

  • Mixed download-artifact versions in Containerize@v8 for s3proxy/s3proxy-jar, @v7 for jdeps-modules/pom. Functionally fine, but worth making consistent.

Optional (out of scope for this PR)

  • The container still runs as root; since this PR is about attack-surface reduction, a non-root USER would align well.
  • --bind-services does inflate the JRE (it pulls all reachable service providers), slightly working against the size goal — fine as a correctness/size trade-off, just worth being intentional about.

Thanks for tackling #401 — the multi-stage layout and the automated jdeps step are a nice direction.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Investigate jlink to avoid bundling entire JDK

2 participants