[Feature] Align timeout and task-cancellation handling across parallel verification precompiles #6697

@yanghang8612

Summary

The two parallel-verification precompiles BatchValidateSign and VerifyTransferProof in PrecompiledContracts.java follow the same "submit tasks → CountDownLatch.await(cpuBudget) → collect results" pattern but handle the timeout branch differently. This issue proposes aligning the two call-sites so that the CPU-budget contract is expressed consistently and in-flight worker tasks are released symmetrically. The change is semantically equivalent to current behaviour (see Impact below) and does not require a fork gate.

Problem

Motivation

Shielded transfer verification and batch signature verification are both CPU-intensive precompile paths and both track the remaining CPU budget on a CountDownLatch. For predictability and easier reasoning about the CPU-budget contract, the two sites should respond to an exhausted budget in the same way.

Current State

  • BatchValidateSign.doExecute (actuator/src/main/java/org/tron/core/vm/PrecompiledContracts.java, L1083–L1089) checks the return value of countDownLatch.await(...) and raises Program.Exception.notEnoughTime("call BatchValidateSign precompile method") before touching any Future.
  • VerifyTransferProof.execute (same file, L1471–L1477) computes the same withNoTimeout local but does not read it; control falls through to future.get(), which is invoked without a timeout argument and without cancelling the outstanding tasks.

As a result, once the CountDownLatch budget is exhausted, the two code paths behave differently at the call-site: one surfaces an explicit notEnoughTime signal immediately; the other holds the block-processing thread on future.get() until every worker returns on its own, and only then does the VM main loop's next-opcode CPU check fire.
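The divergence can be reproduced outside the VM with a minimal sketch. Everything here is illustrative: TimeoutBranchDemo and elapsedMillis are hypothetical names, a 300 ms sleep stands in for a slow proof verification, and 50 ms stands in for the remaining CPU budget. The method returns how long the calling thread was held, depending on whether the await(...) result is checked.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not part of java-tron.
public class TimeoutBranchDemo {

  /** Returns how long (in ms) the caller was held for one slow worker. */
  public static long elapsedMillis(boolean checkAwaitResult) throws Exception {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    CountDownLatch latch = new CountDownLatch(1);
    long start = System.nanoTime();
    try {
      Future<Boolean> future = pool.submit(() -> {
        try {
          Thread.sleep(300); // stand-in for a slow verification
          return true;
        } finally {
          latch.countDown();
        }
      });

      // 50 ms stands in for the remaining CPU budget.
      boolean withNoTimeout = latch.await(50, TimeUnit.MILLISECONDS);

      if (checkAwaitResult && !withNoTimeout) {
        // BatchValidateSign-style branch: stop here instead of collecting results.
        future.cancel(true);
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
      }

      // Fall-through branch: get() has no timeout, so it waits for the worker.
      future.get();
      return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    } finally {
      pool.shutdownNow();
    }
  }
}
```

With the result ignored, the caller is held for roughly the full worker duration; with the result checked, it returns shortly after the budget expires.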

Limitations or Risks

The two precompiles are conceptually identical in structure, so the divergence is surprising for readers. On the VerifyTransferProof path, the block-processing thread is held on future.get() until the slowest worker completes even though the CPU budget is already exhausted, and worker threads continue running proof verifications whose results will be discarded by the imminent OutOfTimeException.

Proposed Solution

Proposed Design

Adopt the BatchValidateSign pattern in VerifyTransferProof:

  1. Check the return value of countDownLatch.await(...) and raise notEnoughTime when it is false, before the result-collection loop.
  2. Before raising, cancel outstanding tasks via future.cancel(true) so worker threads are reclaimed promptly and no longer compute results that will be discarded.

Optionally, factor the "submit → await CPU budget → collect" sequence into a small shared helper so that any future parallel-verification precompile inherits the same semantics by default.
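One possible shape for such a helper is sketched below. This is an assumption about the design, not existing java-tron code: BudgetedParallelRun and runWithinBudget are hypothetical names, and java.util.concurrent.TimeoutException stands in for Program.Exception.notEnoughTime(...) so the example compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper; names and signature are illustrative only.
public class BudgetedParallelRun {

  /**
   * Submits all tasks, waits at most budgetNanos for them to finish, and
   * returns their results in submission order. If the budget runs out first,
   * every outstanding task is cancelled and a TimeoutException is thrown.
   */
  public static <T> List<T> runWithinBudget(
      ExecutorService pool, List<Callable<T>> tasks, long budgetNanos) throws Exception {
    CountDownLatch latch = new CountDownLatch(tasks.size());
    List<Future<T>> futures = new ArrayList<>(tasks.size());

    for (Callable<T> task : tasks) {
      futures.add(pool.submit(() -> {
        try {
          return task.call();
        } finally {
          latch.countDown(); // count down on success and on failure
        }
      }));
    }

    // Step 1: check the await result before touching any Future.
    boolean withNoTimeout = latch.await(budgetNanos, TimeUnit.NANOSECONDS);
    if (!withNoTimeout) {
      // Step 2: cancel outstanding work before signalling the timeout.
      for (Future<T> future : futures) {
        future.cancel(true);
      }
      // Stand-in for Program.Exception.notEnoughTime(...).
      throw new TimeoutException("CPU budget exhausted");
    }

    // All tasks have counted down, so get() does not wait on unfinished work.
    List<T> results = new ArrayList<>(futures.size());
    for (Future<T> future : futures) {
      results.add(future.get());
    }
    return results;
  }
}
```

In this sketch both call-sites would pass getCPUTimeLeftInNanoSecond() as budgetNanos and map the timeout to their own notEnoughTime message.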

Key Changes

  • Module: org.tron.core.vm.PrecompiledContracts (VerifyTransferProof.execute, and optionally BatchValidateSign.doExecute to share the helper).
  • API: none; only internal control flow.
  • Test: add coverage for the timeout path (latch exhausted → notEnoughTime, cancelled futures reported as cancelled).
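The timeout-path coverage could start from a plain-Java check like the one below. It is a sketch only: TimeoutPathCheck and run are hypothetical names, it does not use the project's test framework, and an interruptible sleep stands in for the real verification work.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical check; not part of the java-tron test suite.
public class TimeoutPathCheck {

  /** Returns {awaitResult, isCancelled, getThrewCancellation, workerInterrupted}. */
  public static boolean[] run() throws Exception {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    CountDownLatch started = new CountDownLatch(1);
    CountDownLatch done = new CountDownLatch(1);
    AtomicBoolean interrupted = new AtomicBoolean(false);
    try {
      Future<Boolean> future = pool.submit(() -> {
        started.countDown();
        try {
          Thread.sleep(10_000); // far longer than the budget below
          return true;
        } catch (InterruptedException e) {
          interrupted.set(true); // cancel(true) interrupts the worker thread
          throw e;
        } finally {
          done.countDown();
        }
      });

      started.await(); // make sure the worker is running before timing out
      boolean withNoTimeout = done.await(50, TimeUnit.MILLISECONDS);
      future.cancel(true);

      boolean getThrewCancellation = false;
      try {
        future.get();
      } catch (CancellationException e) {
        getThrewCancellation = true;
      }

      // Wait for the worker to unwind so the interrupt flag can be read safely.
      boolean workerExited = done.await(5, TimeUnit.SECONDS);
      return new boolean[] {
        withNoTimeout, future.isCancelled(), getThrewCancellation,
        workerExited && interrupted.get()
      };
    } finally {
      pool.shutdownNow();
    }
  }
}
```

One caveat worth covering in the real test: Future.cancel(true) only interrupts the worker thread, so a task that never checks its interrupt status (for example, while inside a native call) would keep running until that call returns.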

Impact

This change is consensus-neutral. Every observable field of the transaction receipt is identical between the two paths, because when CountDownLatch.await(...) returns false the CPU budget is by construction already exhausted:

  • The timeout argument is getCPUTimeLeftInNanoSecond() = vmShouldEndInUs * 1000 - System.nanoTime() (see PrecompiledContracts.java L458–L462), so await(...) == false implies System.nanoTime() ≥ vmShouldEndInUs * 1000, that is, the VM deadline has already passed.
  • Under the current code, control returns to the VM main loop and the very next iteration calls Program.checkCPUTimeLimit(opName) (see VM.java L85, Program.java L1241–L1260), which throws notEnoughTime for the same reason.
  • Both OutOfTimeException call-sites funnel into the same handler in VMActuator.execute (L272–L278), which calls program.spendAllEnergy() and sets contractResult.OUT_OF_TIME on the receipt; state changes are reverted in both cases (insertLeaves writes on the current path are discarded by the revert).

So the proposed change only relocates where the identical OutOfTimeException is thrown — from the VM's post-precompile check to inside the precompile itself. It does not introduce any new class of transaction outcome and does not interact with TransactionTrace.checkNeedRetry. The --debug and solidity-node paths short-circuit checkCPUTimeLimit and are not on the consensus path.
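The budget arithmetic behind this argument can be modelled in a few lines. The sketch below takes the current time as a parameter so it is deterministic; CpuBudgetModel and deadlineReached are hypothetical names, and only the time-left formula is taken from the issue text.

```java
// Hypothetical model of the CPU-budget arithmetic; not java-tron code.
public class CpuBudgetModel {

  /** Formula quoted above: vmShouldEndInUs * 1000 - System.nanoTime(). */
  public static long cpuTimeLeftNanos(long vmShouldEndInUs, long nowNanos) {
    return vmShouldEndInUs * 1000 - nowNanos;
  }

  /**
   * Assumed predicate for "budget exhausted": no nanoseconds remain.
   * The exact comparison in Program.checkCPUTimeLimit may differ.
   */
  public static boolean deadlineReached(long vmShouldEndInUs, long nowNanos) {
    return cpuTimeLeftNanos(vmShouldEndInUs, nowNanos) <= 0;
  }
}
```

For a deadline of 5 µs, 1000 ns remain at nowNanos = 4000, and the deadline counts as reached at 5000 ns and at every later reading. Assuming the clock does not go backwards, any check made after a timed-out await(...) therefore sees the same exhausted budget.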

Concrete benefits:

  • Reliability: the block-processing thread is no longer held on future.get() waiting for workers whose results are already destined to be discarded.
  • Worker utilization: cancel(true) releases pool threads promptly instead of letting them run orphaned native verification calls.
  • Readability: the two precompiles read the same way, and withNoTimeout stops being an unused local.

Compatibility

  • Breaking Change: No.
  • Default Behavior Change: No. Every consensus-visible field (contractResult, energyUsed, state, logs) is identical to the pre-change path (see Impact).
  • Migration Required: No.

References (Optional)

  • actuator/src/main/java/org/tron/core/vm/PrecompiledContracts.java
    • BatchValidateSign.doExecute: timeout check and notEnoughTime throw (L1083–L1089).
    • VerifyTransferProof.execute: await return value currently unused; future.get() without timeout / cancel (L1471–L1477).
    • getCPUTimeLeftInNanoSecond: helper shared by both call-sites (L458–L462).
  • actuator/src/main/java/org/tron/core/vm/VM.java: per-opcode program.checkCPUTimeLimit(opName) call (L85).
  • actuator/src/main/java/org/tron/core/vm/program/Program.java: checkCPUTimeLimit (L1241–L1260).
  • actuator/src/main/java/org/tron/core/actuator/VMActuator.java: OutOfTimeException handling (L272–L278).

Additional Notes

  • Do you have ideas regarding implementation? Yes.
  • Are you willing to implement this feature? Yes.
