Skip to content

Virtual_disk: add migration case for disk with fdgroup#6835

Open
meinaLi wants to merge 1 commit intoautotest:masterfrom
meinaLi:fdgroup_migration
Open

Virtual_disk: add migration case for disk with fdgroup#6835
meinaLi wants to merge 1 commit intoautotest:masterfrom
meinaLi:fdgroup_migration

Conversation

@meinaLi
Copy link
Copy Markdown
Contributor

@meinaLi meinaLi commented Mar 19, 2026

Automate case:
VIRT-296996 - Migrate disks with fdgroup.
This case is to test the dom-fd-associate(virDomainFDAssociate, //disk/source@fdgroup) with migration.

Summary by CodeRabbit

  • Tests
    • Added an end-to-end test suite for disk migration with file-descriptor-group (fdgroup) support.
    • Exercises NFS-backed disk migration, multiple migration flag variants, and destination-side VM preparation.
    • Verifies file-descriptor associations are open on both source and destination during migration.
    • Includes dependency checks, automated cleanup, and remote orchestration for fdgroup setup.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 19, 2026

Walkthrough

Adds a new migration test configuration and a Python test module for libvirt fdgroup-based disk migration. The config defines an NFS/netfs-backed storage, two file-backed virtio disks with fdgroup names and FD IDs, migration URIs/credentials, seclabel variants, and platform exclusions. The test creates backing files on NFS, injects disk XML into the domain, runs virsh dom-fd-associate on source and destination (destination done remotely in background), verifies open FDs with lsof, defines the VM on the destination, executes migration, and performs cleanup.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: adding a migration test case for virtual disks using fdgroup, which matches the primary purpose of the pull request.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
📝 Coding Plan
  • Generate coding plan for human review comments

Comment @coderabbitai help to get the list of available commands and usage tips.

@meinaLi
Copy link
Copy Markdown
Contributor Author

meinaLi commented Mar 19, 2026

# avocado run --vt-type libvirt --vt-omit-data-loss --vt-machine-type q35 migration.migration_with_disk.migration_with_fdgroup
JOB ID     : 92ea090ae9b90e8700545d2f20c6bd90f96bb786
JOB LOG    : /var/log/avocado/job-results/job-2026-03-18T22.30-92ea090/job.log
 (1/4) type_specific.io-github-autotest-libvirt.migration.migration_with_disk.migration_with_fdgroup.flag_none: STARTED
 (1/4) type_specific.io-github-autotest-libvirt.migration.migration_with_disk.migration_with_fdgroup.flag_none: PASS (152.10 s)
 (2/4) type_specific.io-github-autotest-libvirt.migration.migration_with_disk.migration_with_fdgroup.flag_seclabel_restore: STARTED
 (2/4) type_specific.io-github-autotest-libvirt.migration.migration_with_disk.migration_with_fdgroup.flag_seclabel_restore: PASS (153.52 s)
 (3/4) type_specific.io-github-autotest-libvirt.migration.migration_with_disk.migration_with_fdgroup.flag_seclabel_writable: STARTED
 (3/4) type_specific.io-github-autotest-libvirt.migration.migration_with_disk.migration_with_fdgroup.flag_seclabel_writable: PASS (152.49 s)
 (4/4) type_specific.io-github-autotest-libvirt.migration.migration_with_disk.migration_with_fdgroup.flag_seclabel_restore_writable: STARTED
 (4/4) type_specific.io-github-autotest-libvirt.migration.migration_with_disk.migration_with_fdgroup.flag_seclabel_restore_writable: PASS (152.91 s)
RESULTS    : PASS 4 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
JOB HTML   : /var/log/avocado/job-results/job-2026-03-18T22.30-92ea090/results.html
JOB TIME   : 614.31 s

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@libvirt/tests/src/migration/migration_with_disk/migration_with_fdgroup.py`:
- Around line 80-94: verify_fd_open currently always runs lsof locally, so the
post-migration destination check is re-checking source paths; modify
verify_fd_open to accept a remote host/path parameter (e.g., add an optional
host or remote_path argument) and when provided run the lsof command on the
destination host (via SSH or the existing test harness remote execution helper
used elsewhere) instead of locally; update callers that perform the destination
pre-check to pass the destination host/path so the function checks the remote
NFS paths and still logs/fails using the same test.fail and test.log.debug
behavior.
- Around line 138-147: vmxml is created once via VMXML.new_from_dumpxml(vm_name)
then base_steps.sync_cpu_for_mig(params) mutates the domain XML (via
dom_xml.sync()), so prepare_dest_vm(vm_name, vmxml, params, ...) uses stale XML;
reload vmxml after calling base_steps.sync_cpu_for_mig by calling
VMXML.new_from_dumpxml(vm_name) again (or otherwise refresh vmxml) before
prepare_dest_vm to ensure the destination VM is built from the updated, synced
XML.
- Around line 180-181: The call to remote.remote_login is missing the port
argument which shifts parameters (remote.remote_login expects (client, host,
port, user, password, prompt)); update the call that assigns remote_session so
the SSH port (e.g. "22" or the proper port variable) is passed between server_ip
and server_user to restore the correct argument order and prevent server_user
being treated as the port.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 346050a2-db50-4ab9-976f-d9280c380812

📥 Commits

Reviewing files that changed from the base of the PR and between a301673 and 4984bcc.

📒 Files selected for processing (2)
  • libvirt/tests/cfg/migration/migration_with_disk/migration_with_fdgroup.cfg
  • libvirt/tests/src/migration/migration_with_disk/migration_with_fdgroup.py

Comment thread libvirt/tests/src/migration/migration_with_disk/migration_with_fdgroup.py Outdated
Comment thread libvirt/tests/src/migration/migration_with_disk/migration_with_fdgroup.py Outdated
@meinaLi meinaLi force-pushed the fdgroup_migration branch from 4984bcc to 3da30b3 Compare March 19, 2026 07:16
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
libvirt/tests/src/migration/migration_with_disk/migration_with_fdgroup.py (1)

152-160: ⚠️ Potential issue | 🟠 Major

Refresh vmxml after CPU sync; current flow can define destination with stale CPU XML.

At Line 152 base_steps.sync_cpu_for_mig(params) updates domain CPU XML, but Line 159 still uses the previously captured vmxml. Also, Line 153 vmxml.sync() can overwrite the synced CPU XML with stale content.

♻️ Suggested fix
-        base_steps.sync_cpu_for_mig(params)
-        vmxml.sync()
+        base_steps.sync_cpu_for_mig(params)
+        vmxml = vm_xml.VMXML.new_from_dumpxml(vm_name)
         associate_fds(vm_name, params, test, on_dest=False)
         vm.wait_for_login().close()
@@
         verify_fd_open([file_vdb, file_vdc], test)
 
         prepare_dest_vm(vm_name, vmxml, params, remote_runner, test)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libvirt/tests/src/migration/migration_with_disk/migration_with_fdgroup.py`
around lines 152 - 160, The CPU XML is being updated by
base_steps.sync_cpu_for_mig(params) but vmxml.sync() is called in the wrong
order and can reintroduce stale CPU data; after calling
base_steps.sync_cpu_for_mig(params) refresh the vmxml object (or re-read the
domain XML) so it includes the synced CPU changes and only then proceed to
associate_fds/prepare_dest_vm; specifically, remove or move the vmxml.sync() so
that vmxml reflects post-sync CPU state before calling prepare_dest_vm(vm_name,
vmxml, params, remote_runner, test) and before verifying remote fds.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@libvirt/tests/src/migration/migration_with_disk/migration_with_fdgroup.py`:
- Around line 16-17: Module-level list cleanup_files is left populated between
test runs causing cross-case state leakage; reset it at test start by clearing
or reinitializing the module-level variable. Locate the module-level symbol
cleanup_files in migration_with_fdgroup.py and add cleanup_files.clear() (or
cleanup_files = []) at the beginning of the test setup path (e.g., in the module
setup function or at the top of the primary test function that uses it) so each
test run starts with an empty list.
- Around line 63-77: The shell commands in the on_dest and else branches build
strings with unescaped dynamic values (vm_name, fdgroup_vdb, fd_id_vdb,
file_vdb, fdgroup_vdc, fd_id_vdc, file_vdc, flag) which allows injection; fix by
removing direct interpolation into a single shell string and instead pass
arguments safely or properly escape each dynamic value (e.g., use an
argument-list API or wrap each variable with shlex.quote before concatenation)
when calling remote.run_remote_cmd and process.run; update the calls that
construct cmd (the virsh/nohup pipelines) to build a safe list of args or quoted
components so vm_name, fdgroup_*, fd_id_*, file_* and flag cannot break out of
the intended command.

---

Duplicate comments:
In `@libvirt/tests/src/migration/migration_with_disk/migration_with_fdgroup.py`:
- Around line 152-160: The CPU XML is being updated by
base_steps.sync_cpu_for_mig(params) but vmxml.sync() is called in the wrong
order and can reintroduce stale CPU data; after calling
base_steps.sync_cpu_for_mig(params) refresh the vmxml object (or re-read the
domain XML) so it includes the synced CPU changes and only then proceed to
associate_fds/prepare_dest_vm; specifically, remove or move the vmxml.sync() so
that vmxml reflects post-sync CPU state before calling prepare_dest_vm(vm_name,
vmxml, params, remote_runner, test) and before verifying remote fds.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4aa2b38d-4794-455a-b432-a14fa803fb29

📥 Commits

Reviewing files that changed from the base of the PR and between 4984bcc and 3da30b3.

📒 Files selected for processing (2)
  • libvirt/tests/cfg/migration/migration_with_disk/migration_with_fdgroup.cfg
  • libvirt/tests/src/migration/migration_with_disk/migration_with_fdgroup.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • libvirt/tests/cfg/migration/migration_with_disk/migration_with_fdgroup.cfg

Comment on lines +63 to +77
if on_dest:
cmd = (f'nohup sh -c \'(echo "dom-fd-associate {vm_name} {fdgroup_vdb} {fd_id_vdb}{flag}"; '
f'echo "dom-fd-associate {vm_name} {fdgroup_vdc} {fd_id_vdc}{flag}"; '
f'tail -f /dev/null) | virsh\' {fd_id_vdb}<>{file_vdb} {fd_id_vdc}<>{file_vdc} '
f'> /dev/null 2>&1 &')
test.log.debug("Running FD association on dest: %s", cmd)
remote.run_remote_cmd(cmd, params, runner, ignore_status=False)
else:
cmd = (f'virsh "dom-fd-associate {vm_name} {fdgroup_vdb} {fd_id_vdb}{flag}; '
f'dom-fd-associate {vm_name} {fdgroup_vdc} {fd_id_vdc}{flag}; '
f'start {vm_name}" {fd_id_vdb}<>{file_vdb} {fd_id_vdc}<>{file_vdc}')
test.log.debug("Running command on source: %s", cmd)
result = process.run(cmd, ignore_status=True, shell=True)
if result.exit_status != 0:
test.fail(f"Failed to associate FDs and start VM: {result.stderr_text}")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Harden shell command construction; params are interpolated into shell strings unsafely.

At Line 75 and Line 97, shell commands include dynamic values (vm_name, paths, fdgroup fields, flags) without strict quoting/validation. This is an injection and parsing-risk surface (especially if values contain spaces/metacharacters).

🔒 Suggested hardening
+import shlex
@@
-        cmd = (f'virsh "dom-fd-associate {vm_name} {fdgroup_vdb} {fd_id_vdb}{flag}; '
-               f'dom-fd-associate {vm_name} {fdgroup_vdc} {fd_id_vdc}{flag}; '
-               f'start {vm_name}" {fd_id_vdb}<>{file_vdb} {fd_id_vdc}<>{file_vdc}')
+        safe_vm = shlex.quote(vm_name)
+        safe_file_vdb = shlex.quote(file_vdb)
+        safe_file_vdc = shlex.quote(file_vdc)
+        safe_flag = flag if flag in ("", " --seclabel-restore", " --seclabel-writable",
+                                     " --seclabel-restore --seclabel-writable") else ""
+        cmd = (f'virsh "dom-fd-associate {safe_vm} {fdgroup_vdb} {fd_id_vdb}{safe_flag}; '
+               f'dom-fd-associate {safe_vm} {fdgroup_vdc} {fd_id_vdc}{safe_flag}; '
+               f'start {safe_vm}" {fd_id_vdb}<>{safe_file_vdb} {fd_id_vdc}<>{safe_file_vdc}')
@@
-            result = process.run(cmd, ignore_status=True, shell=True)
-            stdout = result.stdout_text
+            result = process.run(["lsof", file_path], ignore_status=True, shell=False)
+            stdout = "\n".join(
+                line for line in result.stdout_text.splitlines()
+                if "qemu" in line
+            )

Also applies to: 91-98

🧰 Tools
🪛 Ruff (0.15.6)

[error] 75-75: Function call with shell=True parameter identified, security issue

(S604)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libvirt/tests/src/migration/migration_with_disk/migration_with_fdgroup.py`
around lines 63 - 77, The shell commands in the on_dest and else branches build
strings with unescaped dynamic values (vm_name, fdgroup_vdb, fd_id_vdb,
file_vdb, fdgroup_vdc, fd_id_vdc, file_vdc, flag) which allows injection; fix by
removing direct interpolation into a single shell string and instead pass
arguments safely or properly escape each dynamic value (e.g., use an
argument-list API or wrap each variable with shlex.quote before concatenation)
when calling remote.run_remote_cmd and process.run; update the calls that
construct cmd (the virsh/nohup pipelines) to build a safe list of args or quoted
components so vm_name, fdgroup_*, fd_id_*, file_* and flag cannot break out of
the intended command.

Automate case:
VIRT-296996 - Migrate disks with fdgroup.
This case is to test the dom-fd-associate(virDomainFDAssociate, //disk/source@fdgroup) with migration.

Signed-off-by: Meina Li <meili@redhat.com>
@meinaLi meinaLi force-pushed the fdgroup_migration branch from 3da30b3 to 0ee10fa Compare March 19, 2026 07:42
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@libvirt/tests/src/migration/migration_with_disk/migration_with_fdgroup.py`:
- Around line 153-154: sync_cpu_for_mig(params) updates the domain CPU XML but
vmxml.sync() called afterward overwrites those updates with a stale cached
object; replace the vmxml.sync() call with a fresh reload of the vmxml object
after base_steps.sync_cpu_for_mig(params) (i.e., re-fetch or reinstantiate the
vmxml used in this test) so the later logic uses the updated CPU XML instead of
the old cached vmxml; reference the functions/variables sync_cpu_for_mig and
vmxml.sync to locate the change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a8bdd2b3-64e7-4be0-9b89-1a22179a4ec0

📥 Commits

Reviewing files that changed from the base of the PR and between 3da30b3 and 0ee10fa.

📒 Files selected for processing (2)
  • libvirt/tests/cfg/migration/migration_with_disk/migration_with_fdgroup.cfg
  • libvirt/tests/src/migration/migration_with_disk/migration_with_fdgroup.py

Comment on lines +153 to +154
base_steps.sync_cpu_for_mig(params)
vmxml.sync()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

vmxml.sync() overwrites CPU changes from sync_cpu_for_mig().

After sync_cpu_for_mig() updates the domain's CPU XML and syncs it, calling vmxml.sync() on the stale cached object (from line 150) overwrites those CPU changes with the old CPU configuration. This can cause migration failures on mixed-CPU hosts.

The fix should reload vmxml after sync_cpu_for_mig() instead of syncing the stale object:

🔧 Proposed fix
         base_steps.sync_cpu_for_mig(params)
-        vmxml.sync()
+        vmxml = vm_xml.VMXML.new_from_dumpxml(vm_name)
         associate_fds(vm_name, params, test, on_dest=False)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libvirt/tests/src/migration/migration_with_disk/migration_with_fdgroup.py`
around lines 153 - 154, sync_cpu_for_mig(params) updates the domain CPU XML but
vmxml.sync() called afterward overwrites those updates with a stale cached
object; replace the vmxml.sync() call with a fresh reload of the vmxml object
after base_steps.sync_cpu_for_mig(params) (i.e., re-fetch or reinstantiate the
vmxml used in this test) so the later logic uses the updated CPU XML instead of
the old cached vmxml; reference the functions/variables sync_cpu_for_mig and
vmxml.sync to locate the change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant