Fix domain parsing for GPU & add Display controller in the supported PCI class #12981
vishesh92 wants to merge 6 commits into apache:4.22
Pull request overview
This PR addresses CloudStack issue #12960 where GPU passthrough VM creation can fail on KVM due to incorrect parsing of PCI addresses (notably when the domain portion is present/expanded), leading to invalid libvirt PCI slot values.
Changes:
- Update KVM GPU discovery script to consistently normalize/handle PCI addresses with and without an explicit domain and to key lookups using the domain-qualified form.
- Update LibvirtGpuDef PCI XML generation to accept both bb:ss.f and dddd:bb:ss.f bus address formats.
- Add unit tests covering full PCI addresses (domain 0, non-zero domain) and legacy short BDF behavior.
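To illustrate the normalization described in the changes above, here is a minimal bash sketch that turns either address form into the domain-qualified one. The function name `normalize_pci_address` and the default domain `0000` for short addresses are assumptions made for this example; this is not the code from gpudiscovery.sh, and it only accepts a four-digit domain.

```bash
#!/usr/bin/env bash
# Illustrative sketch only: normalize_pci_address is a hypothetical helper,
# not a function taken from gpudiscovery.sh.
# Accepts "bb:ss.f" or "dddd:bb:ss.f" and prints the domain-qualified form.
normalize_pci_address() {
    local addr="$1"
    # Regexes are kept in variables so [[ =~ ]] treats them as patterns.
    local full_re='^[0-9A-Fa-f]{4}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}\.[0-7]$'
    local short_re='^[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}\.[0-7]$'

    if [[ "$addr" =~ $full_re ]]; then
        # Already domain-qualified: pass through unchanged.
        printf '%s\n' "$addr"
    elif [[ "$addr" =~ $short_re ]]; then
        # Assumption: a short BDF belongs to domain 0000.
        printf '0000:%s\n' "$addr"
    else
        # Anything else is rejected rather than guessed at.
        return 1
    fi
}

normalize_pci_address "00:1f.0"
normalize_pci_address "0001:af:00.1"
```

Keying caches and lookups on the normalized form means a device reported as bb:ss.f by one tool and as dddd:bb:ss.f by another resolves to the same entry.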
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| scripts/vm/hypervisor/kvm/gpudiscovery.sh | Normalizes PCI address handling (domain-aware) across sysfs access, cache keys, and VM usage mapping to avoid wrong BDF parsing. |
| plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/LibvirtGpuDef.java | Extends PCI address parsing to support domain:bus:slot.func input, preventing slot mis-parsing when a domain is present. |
| plugins/hypervisors/kvm/src/test/java/com/cloud/hypervisor/kvm/resource/LibvirtGpuDefTest.java | Adds regression/unit tests for full PCI addresses and backward compatibility for short BDF inputs. |
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## 4.22 #12981 +/- ##
=============================================
- Coverage 17.60% 3.70% -13.90%
=============================================
Files 5918 448 -5470
Lines 531681 38042 -493639
Branches 65005 7038 -57967
=============================================
- Hits 93589 1409 -92180
+ Misses 427539 36446 -391093
+ Partials 10553 187 -10366
This adds support for the AMD Instinct MI2xx accelerator cards in the discovery script.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
@blueorangutan package
@vishesh92 a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
@@ -882,10 +989,7 @@ for LINE in "${LINES[@]}"; do
    if [[ ${#vlist[@]} -eq 0 && ${#flist[@]} -eq 0 ]]; then
        FP_ENABLED=1
    fi
parse_pci_address assigns DOMAIN/BUS/SLOT/FUNC via bash dynamic scoping; these call sites don’t declare those variables local first, so the values can leak into the global scope (and between loop iterations), making future changes/debugging brittle. Declare local DOMAIN BUS SLOT FUNC in the immediate scope before calling parse_pci_address (as is already done in process_mdev_instances), or refactor parse_pci_address to return values explicitly (e.g., via stdout or nameref parameters).
fi
fi
local DOMAIN BUS SLOT FUNC
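The scoping concern in this review comment can be reproduced in isolation. The sketch below is a hypothetical stand-in, not the script's real `parse_pci_address`: it assumes the function simply assigns DOMAIN, BUS, SLOT and FUNC without declaring them, and the two caller functions (`leaky_caller`, `scoped_caller`) are invented here purely to contrast the behaviors.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for parse_pci_address: it assigns DOMAIN/BUS/SLOT/FUNC
# without declaring them, so bash dynamic scoping decides where they land.
parse_pci_address() {
    local addr="$1"
    local re='^(([0-9A-Fa-f]{4}):)?([0-9A-Fa-f]{2}):([0-9A-Fa-f]{2})\.([0-7])$'
    [[ "$addr" =~ $re ]] || return 1
    DOMAIN="${BASH_REMATCH[2]:-0000}"   # assumed default when no domain is given
    BUS="${BASH_REMATCH[3]}"
    SLOT="${BASH_REMATCH[4]}"
    FUNC="${BASH_REMATCH[5]}"
}

# No local declaration: the callee's assignments reach the global scope.
leaky_caller() {
    parse_pci_address "$1"
}

# Declaring the names local first makes the callee write to these locals,
# which disappear when scoped_caller returns.
scoped_caller() {
    local DOMAIN BUS SLOT FUNC
    parse_pci_address "$1" || return 1
    printf '%s:%s:%s.%s\n' "$DOMAIN" "$BUS" "$SLOT" "$FUNC"
}

scoped_caller "00:1f.0"
echo "SLOT after scoped_caller: '${SLOT:-}'"
leaky_caller "00:1f.0"
echo "SLOT after leaky_caller: '${SLOT:-}'"
```

The alternative mentioned in the comment, returning values via stdout or nameref parameters, removes the dependency on caller-side declarations altogether, at the cost of changing every call site.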
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 17398
Description
This PR fixes #12960 & #12957
Types of changes
Feature/Enhancement Scale or Bug Severity
Feature/Enhancement Scale
Bug Severity
Screenshots (if appropriate):
How Has This Been Tested?
How did you try to break this feature and the system with this change?