
F#325 Add NVIDIA NIM service appliance for OpenNebula GPU deployments #335

Open
JavierPalomaresTorrecilla wants to merge 6 commits into master from jpalomares/service-nim-nvcr

Conversation

@JavierPalomaresTorrecilla

This PR adds the service_Nim appliance to enable OpenNebula to provision virtual machines that run the NVIDIA NIM container on GPU-enabled infrastructure.

The appliance expects the following input parameters:

  • ONEAPP_NIM_NVIDIA_REGISTRY (mandatory).
  • ONEAPP_NIM_NVIDIA_REGISTRY_KEY (mandatory).
  • ONEAPP_NIM_NVIDIA_IMAGE_REF (mandatory).
  • ONEAPP_NIM_NVIDIA_REGISTRY_USER (optional when the registry is nvcr.io; otherwise required for registry authentication).

When ONEAPP_NIM_NVIDIA_REGISTRY is set to nvcr.io, the appliance follows the documented NVIDIA Container Registry authentication flow and uses the special username $oauthtoken together with the provided API key, as described in the NVIDIA NGC User Guide. In that case, ONEAPP_NIM_NVIDIA_REGISTRY_USER does not need to be provided through OpenNebula context.
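The username selection described above can be sketched as a small shell helper. This is an illustrative sketch, not the appliance's actual code; the function name is an assumption:

```shell
#!/usr/bin/env bash
# Sketch of the username-selection logic described above, not the
# appliance's actual code. For nvcr.io, the documented NGC flow uses the
# literal username '$oauthtoken' together with the API key; any other
# registry uses the user supplied in ONEAPP_NIM_NVIDIA_REGISTRY_USER.
registry_user() {
    local registry="$1" user="$2"
    if [ "$registry" = "nvcr.io" ]; then
        echo '$oauthtoken'   # literal string, not a variable expansion
    else
        echo "$user"
    fi
}
```

A login would then look like: `echo "$ONEAPP_NIM_NVIDIA_REGISTRY_KEY" | docker login "$ONEAPP_NIM_NVIDIA_REGISTRY" -u "$(registry_user "$ONEAPP_NIM_NVIDIA_REGISTRY" "$ONEAPP_NIM_NVIDIA_REGISTRY_USER")" --password-stdin`.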

The appliance configures the runtime needed to launch the NVIDIA NIM container, including propagation of NGC_API_KEY into the container, cache mounting, shared memory allocation, port exposure, NVIDIA runtime integration, and readiness-gated bootstrap through /v1/health/ready. As a result, the service is only marked READY after container startup completes successfully and the readiness check passes.
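A minimal sketch of the launch and readiness gating described above. The cache path, shm size, and port are illustrative assumptions, not the appliance's exact configuration:

```shell
#!/usr/bin/env bash
# Sketch of the runtime configuration described above; the cache path,
# shm size, and port are assumptions, not the appliance's exact values.
NIM_CACHE=/opt/nim/.cache   # assumed host-side model cache directory

# Print (rather than execute) the docker run command, so the pieces
# mentioned in the PR are visible: NGC_API_KEY propagation, cache mount,
# shared memory allocation, port exposure, and NVIDIA runtime integration.
build_run_cmd() {
    echo docker run -d --name nim \
        --runtime=nvidia --gpus all \
        -e NGC_API_KEY="$ONEAPP_NIM_NVIDIA_REGISTRY_KEY" \
        -v "$NIM_CACHE:/opt/nim/.cache" \
        --shm-size=16g \
        -p 8000:8000 \
        "$ONEAPP_NIM_NVIDIA_REGISTRY/$ONEAPP_NIM_NVIDIA_IMAGE_REF"
}

# Readiness-gated bootstrap: poll /v1/health/ready and only report READY
# once the endpoint answers successfully.
wait_ready() {
    local tries="${1:-60}"
    for _ in $(seq "$tries"); do
        curl -sf http://localhost:8000/v1/health/ready >/dev/null && return 0
        sleep 10
    done
    return 1
}
```

Printing the command as a dry run keeps the sketch inspectable; the real appliance would execute it and tie the service state to the result of the readiness poll.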

The original target image used during validation, nvcr.io/nim/openai/gpt-oss-120b:latest, exceeded the available single-GPU memory during model initialization, and the appliance correctly remained in bootstrap without reporting READY. Validation was then completed successfully with nvcr.io/nim/openai/gpt-oss-20b:latest, which reached readiness and was verified through /v1/health/live, /v1/health/ready, /v1/models, /v1/version, /v1/metadata, and streaming inference via /v1/chat/completions.
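The endpoints exercised during validation can be hit with plain curl once the service reports ready. The commands are printed rather than executed here, since they require a running NIM instance; the port (8000) and the model name in the payload are assumptions based on the validated image:

```shell
#!/usr/bin/env bash
# Curl invocations for the endpoints verified in the PR; printed rather
# than executed, since they need a running NIM instance. Port 8000 is
# an assumption.
list_checks() {
    local base="${1:-http://localhost:8000}"
    local ep
    for ep in /v1/health/live /v1/health/ready /v1/models /v1/version /v1/metadata; do
        echo "curl -sf ${base}${ep}"
    done
}
list_checks

# Streaming inference via /v1/chat/completions (illustrative payload;
# the model name mirrors the validated image tag).
cat <<'EOF'
curl -N http://localhost:8000/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model": "openai/gpt-oss-20b", "messages": [{"role": "user", "content": "Hello"}], "stream": true}'
EOF
```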
