Fix #11450: [FR]: struct.unpack type inference#11494
Conversation
Add a function transform that synthesizes the return type of struct.unpack, struct.unpack_from, and struct.iter_unpack when the format string is a literal, producing a precise tuple type instead of tuple[Any, ...]. Fixes microsoft#11450 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Solid, well-tested transform. One real (non-blocking) gap worth addressing: the per-code count guard rejects |
- Bound the produced element count instead of capping the raw repeat count
during digit parsing, so byte-length codes ('s'/'p') with large counts
(e.g. '1024s', '300p') correctly collapse to a single bytes element.
- Replace the no-op builtInName ternary with a direct assignment.
- Add test coverage for large 's'/'p' byte-length formats.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This comment has been minimized.
This comment has been minimized.
|
|
||
| # A non-literal format string falls back to the declared return type. | ||
| def func1(fmt: str) -> None: | ||
| assert_type(struct.unpack(fmt, buffer), tuple[Any, ...]) |
There was a problem hiding this comment.
📍 packages/pyright-internal/src/tests/samples/structUnpack1.py:1
The sample doesn't directly assert the element-count cap fallback (e.g. "257i", just over maxStructUnpackElementCount, should yield tuple[Any, ...]) nor the 0-count edge ("0i" → tuple[()]). Both are correct by inspection, but explicit assert_type cases would lock in these boundaries cheaply.
There was a problem hiding this comment.
Added explicit assert_type cases for both boundaries in structUnpack1.py:
assert_type(struct.unpack("257i", buffer), tuple[Any, ...])— just overmaxStructUnpackElementCount, exercises the cap fallback.assert_type(struct.unpack("0i", buffer), tuple[()])— the zero repeat-count edge.
Cheap to lock in, thanks for the nudge.
| return 'bytes'; | ||
|
|
||
| case 'b': | ||
| case 'B': |
There was a problem hiding this comment.
📍 packages/pyright-internal/src/analyzer/functionTransform.ts:320
Native-only codes are over-accepted: struct.unpack("<n", ...), "<N", and "<P" raise struct.error: bad char in struct format (native-only codes are invalid under an explicit byte-order prefix), yet getStructElementType maps n/N/P to int unconditionally, so the PR synthesizes tuple[int] for a call that fails at runtime. This doesn't regress vs the declared fallback, but a tighter version would only treat n/N/P as int when the effective mode is native (@ or no prefix). Worth a follow-up.
There was a problem hiding this comment.
Good catch — tightened this. parseStructFormat now tracks the effective mode from the optional prefix and passes it to getStructElementType, which only maps n/N/P to int in native mode (no prefix or @). Under an explicit byte-order prefix they return undefined, so we fall back to the declared tuple[Any, ...] instead of synthesizing a tuple for a call that fails at runtime with struct.error.
Added coverage in structUnpack1.py: nNP / @nNP → tuple[int, int, int], and <n / =N / >P → tuple[Any, ...].
| # `struct.unpack_from` and `struct.iter_unpack` when the format string | ||
| # is a literal. | ||
|
|
||
| import struct |
There was a problem hiding this comment.
📍 packages/pyright-internal/src/tests/samples/structUnpack1.py:1
Issue-linkage convention: the PR description uses Fixes https://github.com/microsoft/pylance-release/issues/11450, but repo convention is Addresses for microsoft/pylance-release issues (Fixes is reserved for microsoft/pyrx issues). Please update the description.
There was a problem hiding this comment.
Updated the PR description to use Addresses https://github.com/microsoft/pylance-release/issues/11450 instead of Fixes, per the repo convention.
|
Main blocker: the struct.unpack transform was added only to the pyright-internal (sync) copy of functionTransform.ts. Pylance's async path uses a separate pylance-internal copy that wasn't updated, so inference diverges between sync and async modes. Please mirror the change (and add Pylance-harness coverage for both modes) per the repo's sync↔async parity policy. |
|
Thanks for flagging the sync↔async parity concern. I've broken that out into a separate follow-up issue so it can be tracked and reviewed independently — this PR stays focused on the pyright change. |
|
Core change looks solid, but it needs the async counterpart updated for parity before merge: the new struct.unpack transform lives only in the pyright (sync) functionTransform.ts, while packages/pylance-internal/src/analyzer/functionTransform.ts still handles only functools.total_ordering. Please mirror the transform (and add a Pylance-harness regression test covering both modes). Two smaller follow-ups noted inline. |
n/N/P now only synthesize int in native mode (no prefix or @); under an explicit byte-order prefix they fall back to the declared type, matching the runtime struct.error. Adds assert_type coverage for the native-mode codes, the zero repeat count, and the element-count cap fallback. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Solid, well-tested fix on the Pyright side. The one blocker is sync↔async parity: the async copy in packages/pylance-internal/src/analyzer/functionTransform.ts still only transforms functools.total_ordering, so struct.unpack results differ when enableAsyncProgram=true. Please mirror the transform there and add Pylance-harness coverage for both modes. |
|
Diff from mypy_primer, showing the effect of this PR on open source code: dulwich (https://github.com/dulwich/dulwich)
+ .../projects/dulwich/dulwich/index.py
+ .../projects/dulwich/dulwich/index.py:816:20 - error: Argument of type "bytes" cannot be assigned to parameter "sha" of type "RawObjectID" in function "sha_to_hex"
+ "bytes" is not assignable to "RawObjectID" (reportArgumentType)
- 9 errors, 22 warnings, 0 informations
+ 10 errors, 22 warnings, 0 informations
aiortc (https://github.com/aiortc/aiortc)
- .../projects/aiortc/src/aiortc/rtcsctptransport.py:1114:36 - error: Argument of type "int | None" cannot be assigned to parameter "a" of type "int" in function "tsn_plus_one"
- Type "int | None" is not assignable to type "int"
- "None" is not assignable to "int" (reportArgumentType)
- .../projects/aiortc/src/aiortc/rtcsctptransport.py:1136:17 - error: Argument of type "int | None" cannot be assigned to parameter "tsn" of type "int" in function "prune_chunks"
- Type "int | None" is not assignable to type "int"
- "None" is not assignable to "int" (reportArgumentType)
+ .../projects/aiortc/src/aiortc/rtcsctptransport.py:1387:14 - error: Cannot assign to attribute "cumulative_tsn" for class "SackChunk"
+ Expression of type "int | None" cannot be assigned to attribute "cumulative_tsn" of class "SackChunk"
+ Type "int | None" is not assignable to type "int"
+ "None" is not assignable to "int" (reportAttributeAccessIssue)
- 278 errors, 4 warnings, 0 informations
+ 277 errors, 4 warnings, 0 informations
sympy (https://github.com/sympy/sympy)
+ .../projects/sympy/sympy/solvers/ode/hypergeometric.py:247:67 - error: Operator "**" not supported for types "Basic" and "Literal[2]" (reportOperatorIssue)
- .../projects/sympy/sympy/solvers/ode/lie_group.py:619:61 - error: Operator "-" not supported for type "Unknown | Basic" (reportOperatorIssue)
- .../projects/sympy/sympy/solvers/ode/nonhomogeneous.py:468:28 - error: Argument of type "Unknown | None" cannot be assigned to parameter "expr" of type "Expr" in function "make_args"
+ .../projects/sympy/sympy/solvers/ode/nonhomogeneous.py:468:28 - error: Argument of type "Expr | Unknown | None" cannot be assigned to parameter "expr" of type "Expr" in function "make_args"
- Type "Unknown | None" is not assignable to type "Expr"
+ Type "Expr | Unknown | None" is not assignable to type "Expr"
- Attribute "pop" is unknown (reportAttributeAccessIssue)
- .../projects/sympy/sympy/solvers/ode/ode.py:610:22 - error: Cannot access attribute "pop" for class "tuple[()]"
- Attribute "pop" is unknown (reportAttributeAccessIssue)
- .../projects/sympy/sympy/solvers/ode/ode.py:610:22 - error: Cannot access attribute "pop" for class "tuple[str, ...]"
+ .../projects/sympy/sympy/solvers/ode/single.py:867:9 - error: Expression with type "tuple[ComplexInfinity | Unknown | Any, list[tuple[Unknown, Unknown]] | list[Unknown]] | tuple[ComplexInfinity | Unknown | Any, list[tuple[Unknown, Unknown]] | list[Unknown], list[tuple[Unknown, Unknown]] | list[Unknown]]" cannot be assigned to target tuple
+ Type "tuple[ComplexInfinity | Unknown | Any, list[tuple[Unknown, Unknown]] | list[Unknown], list[tuple[Unknown, Unknown]] | list[Unknown]]" is incompatible with target tuple
+ Tuple size mismatch; expected 2 but received 3 (reportAssignmentType)
- .../projects/sympy/sympy/solvers/ode/single.py:2646:27 - error: Argument of type "One | NegativeOne | Zero | Integer | NaN | ComplexInfinity | Rational | Infinity | NegativeInfinity | Float | Number | Expr | Unknown | int" cannot be assigned to parameter "stop" of type "SupportsIndex" in function "__new__"
- Type "One | NegativeOne | Zero | Integer | NaN | ComplexInfinity | Rational | Infinity | NegativeInfinity | Float | Number | Expr | Unknown | int" is not assignable to type "SupportsIndex"
- "Expr" is incompatible with protocol "SupportsIndex"
- "__index__" is not present (reportArgumentType)
- .../projects/sympy/sympy/solvers/ode/single.py:2652:9 - error: Operator "+=" not supported for types "One | NegativeOne | Zero | Integer | NaN | ComplexInfinity | Rational | Infinity | NegativeInfinity | Float | Number | Expr | Any | Unknown" and "Basic | Any | Unknown"
+ .../projects/sympy/sympy/solvers/ode/single.py:2652:9 - error: Operator "+=" not supported for types "Unknown | Any | Zero | One | NegativeOne | Integer | NaN | ComplexInfinity | Rational | Infinity | NegativeInfinity | Float | Number | Expr" and "Basic | Any | Unknown"
- .../projects/sympy/sympy/solvers/ode/systems.py:1214:26 - error: Argument of type "dict[str, str | Unknown | Pow | bool | Symbol]" cannot be assigned to parameter "m" of type "Iterable[tuple[str, str]]" in function "update"
+ .../projects/sympy/sympy/solvers/ode/systems.py:1214:26 - error: Argument of type "dict[str, str | Unknown | Expr | bool | Symbol]" cannot be assigned to parameter "m" of type "Iterable[tuple[str, str]]" in function "update"
+ .../projects/sympy/sympy/stats/crv_types.py:2544:9 - error: Method "_cdf" overrides class "SingleContinuousDistribution" in an incompatible manner
+ Return type mismatch: base method returns type "None", override returns type "ComplexInfinity | Unknown"
+ Type "ComplexInfinity | Unknown" is not assignable to type "None"
+ "ComplexInfinity" is not assignable to "None" (reportIncompatibleMethodOverride)
+ .../projects/sympy/sympy/stats/crv_types.py:2723:9 - error: Method "_cdf" overrides class "SingleContinuousDistribution" in an incompatible manner
+ Return type mismatch: base method returns type "None", override returns type "Expr | Unknown"
+ Type "Expr | Unknown" is not assignable to type "None"
+ "Expr" is not assignable to "None" (reportIncompatibleMethodOverride)
- .../projects/sympy/sympy/stats/drv.py:269:22 - error: Argument of type "Generator[tuple[Unknown, ...] | Unknown | Sum | Expr | ZeroMatrix | Add | Zero | NaN | Piecewise | Basic | int | None, None, None]" cannot be assigned to parameter "iterable" of type "Iterable[_SupportsSumNoDefaultT@sum]" in function "sum"
+ .../projects/sympy/sympy/stats/drv.py:269:22 - error: Argument of type "Generator[tuple[Unknown, ...] | Unknown | Sum | Expr | ZeroMatrix | Zero | NaN | Piecewise | Basic | int | None, None, None]" cannot be assigned to parameter "iterable" of type "Iterable[_SupportsSumNoDefaultT@sum]" in function "sum"
- "Generator[tuple[Unknown, ...] | Unknown | Sum | Expr | ZeroMatrix | Add | Zero | NaN | Piecewise | Basic | int | None, None, None]" is not assignable to "Iterable[_SupportsSumNoDefaultT@sum]"
+ "Generator[tuple[Unknown, ...] | Unknown | Sum | Expr | ZeroMatrix | Zero | NaN | Piecewise | Basic | int | None, None, None]" is not assignable to "Iterable[_SupportsSumNoDefaultT@sum]"
- Type parameter "_T_co@Iterable" is covariant, but "tuple[Unknown, ...] | Unknown | Sum | Expr | ZeroMatrix | Add | Zero | NaN | Piecewise | Basic | int | None" is not a subtype of "_SupportsSumNoDefaultT@sum"
+ Type parameter "_T_co@Iterable" is covariant, but "tuple[Unknown, ...] | Unknown | Sum | Expr | ZeroMatrix | Zero | NaN | Piecewise | Basic | int | None" is not a subtype of "_SupportsSumNoDefaultT@sum"
- Type "tuple[Unknown, ...] | Unknown | Sum | Expr | ZeroMatrix | Add | Zero | NaN | Piecewise | Basic | int | None" is not assignable to type "_SupportsSumWithNoDefaultGiven"
+ Type "tuple[Unknown, ...] | Unknown | Sum | Expr | ZeroMatrix | Zero | NaN | Piecewise | Basic | int | None" is not assignable to type "_SupportsSumWithNoDefaultGiven"
- Type "tuple[Unknown, ...] | Unknown | Sum | Expr | ZeroMatrix | Add | Zero | NaN | Piecewise | Basic | int | None" is not assignable to type "_SupportsSumWithNoDefaultGiven"
+ Type "tuple[Unknown, ...] | Unknown | Sum | Expr | ZeroMatrix | Zero | NaN | Piecewise | Basic | int | None" is not assignable to type "_SupportsSumWithNoDefaultGiven"
- .../projects/sympy/sympy/stats/drv_types.py:293:16 - error: Operator "*" not supported for types "Expr" and "tuple[Unknown, ...] | Unknown | Sum | Expr | ZeroMatrix | Add | Zero | NaN | Piecewise | Basic"
+ .../projects/sympy/sympy/stats/drv_types.py:293:16 - error: Operator "*" not supported for types "Expr" and "tuple[Unknown, ...] | Unknown | Sum | Expr | ZeroMatrix | Zero | NaN | Piecewise | Basic"
- .../projects/sympy/sympy/stats/random_matrix_models.py:111:36 - error: Operator "*" not supported for types "Expr" and "tuple[Unknown, ...] | Unknown | Product | Basic | BooleanFalse | BooleanTrue | ComplexInfinity | Equality | Expr | Float | Infinity | Integer | NaN | NegativeInfinity | NegativeOne | Number | One | Rational | Zero"
+ .../projects/sympy/sympy/stats/random_matrix_models.py:111:36 - error: Operator "*" not supported for types "Expr" and "tuple[Unknown, ...] | Unknown | Product | Basic | BooleanFalse | BooleanTrue | Equality | Expr | NaN | One | Zero"
- .../projects/sympy/sympy/stats/rv.py:473:25 - error: Argument of type "Expr | Lambda | Zero | One | Integral | Unknown | Probability | tuple[Unknown, ...] | Sum | ZeroMatrix | Add | NaN | Piecewise | Basic | NegativeOne | Integer | ComplexInfinity | Rational | Infinity | NegativeInfinity | Float | Number | int" cannot be assigned to parameter "args" of type "Expr | complex" in function "__new__"
+ .../projects/sympy/sympy/stats/rv.py:473:25 - error: Argument of type "Expr | Lambda | Zero | One | Integral | Unknown | Probability | tuple[Unknown, ...] | Sum | ZeroMatrix | NaN | Piecewise | Basic | NegativeOne | Integer | ComplexInfinity | Rational | Infinity | NegativeInfinity | Float | Number | int" cannot be assigned to parameter "args" of type "Expr | complex" in function "__new__"
- Type "Expr | Lambda | Zero | One | Integral | Unknown | Probability | tuple[Unknown, ...] | Sum | ZeroMatrix | Add | NaN | Piecewise | Basic | NegativeOne | Integer | ComplexInfinity | Rational | Infinity | NegativeInfinity | Float | Number | int" is not assignable to type "Expr | complex"
+ Type "Expr | Lambda | Zero | One | Integral | Unknown | Probability | tuple[Unknown, ...] | Sum | ZeroMatrix | NaN | Piecewise | Basic | NegativeOne | Integer | ComplexInfinity | Rational | Infinity | NegativeInfinity | Float | Number | int" is not assignable to type "Expr | complex"
- .../projects/sympy/sympy/stats/stochastic_process_types.py:1839:25 - error: Argument of type "tuple[Unknown, ...] | Unknown | Sum | Expr | ZeroMatrix | Add | Zero | NaN | Piecewise | Basic | Integral | Any | int" cannot be assigned to parameter "args" of type "Expr | complex" in function "__new__"
+ .../projects/sympy/sympy/stats/stochastic_process_types.py:1839:25 - error: Argument of type "tuple[Unknown, ...] | Unknown | Sum | Expr | ZeroMatrix | Zero | NaN | Piecewise | Basic | Integral | Any | int" cannot be assigned to parameter "args" of type "Expr | complex" in function "__new__"
- Type "tuple[Unknown, ...] | Unknown | Sum | Expr | ZeroMatrix | Add | Zero | NaN | Piecewise | Basic | Integral | Any | int" is not assignable to type "Expr | complex"
+ Type "tuple[Unknown, ...] | Unknown | Sum | Expr | ZeroMatrix | Zero | NaN | Piecewise | Basic | Integral | Any | int" is not assignable to type "Expr | complex"
... (truncated 1282 lines) ...
tornado (https://github.com/tornadoweb/tornado)
- .../projects/tornado/tornado/websocket.py:1212:51 - error: Argument of type "Any | Literal[0] | None" cannot be assigned to parameter "opcode" of type "int" in function "_handle_message"
+ .../projects/tornado/tornado/websocket.py:1212:51 - error: Argument of type "int | None" cannot be assigned to parameter "opcode" of type "int" in function "_handle_message"
- Type "Any | Literal[0] | None" is not assignable to type "int"
+ Type "int | None" is not assignable to type "int"
mypy (https://github.com/python/mypy)
- .../projects/mypy/mypy/ipc.py:79:19 - error: Operator "+" not supported for "None" (reportOptionalOperand)
- .../projects/mypy/mypy/ipc.py:82:55 - error: Operator "+" not supported for types "Literal[4]" and "int | None"
- Operator "+" not supported for types "Literal[4]" and "None" (reportOperatorIssue)
- .../projects/mypy/mypy/ipc.py:83:35 - error: Operator "+" not supported for types "Literal[4]" and "int | None"
- Operator "+" not supported for types "Literal[4]" and "None" (reportOperatorIssue)
- 1071 errors, 94 warnings, 0 informations
+ 1068 errors, 94 warnings, 0 informations
spark (https://github.com/apache/spark)
- .../projects/spark/python/pyspark/sql/streaming/transform_with_state_driver_worker.py:82:68 - error: Argument of type "str | Unknown | Any | None" cannot be assigned to parameter "state_server_port" of type "int | str" in function "__init__"
+ .../projects/spark/python/pyspark/sql/streaming/transform_with_state_driver_worker.py:82:68 - error: Argument of type "str | Unknown | int | None" cannot be assigned to parameter "state_server_port" of type "int | str" in function "__init__"
- Type "str | Unknown | Any | None" is not assignable to type "int | str"
+ Type "str | Unknown | int | None" is not assignable to type "int | str"
pwndbg (https://github.com/pwndbg/pwndbg)
- .../projects/pwndbg/pwndbg/aglib/macho.py:579:17 - error: Return type of generator function must be compatible with "Generator[tuple[bytearray, Any], Any, Any]"
+ .../projects/pwndbg/pwndbg/aglib/macho.py:579:17 - error: Return type of generator function must be compatible with "Generator[tuple[bytearray, int], Any, Any]"
manticore (https://github.com/trailofbits/manticore)
+ .../projects/manticore/manticore/wasm/executor.py:1194:14 - error: Type "int" is not assignable to declared type "F32"
+ "int" is not assignable to "F32" (reportAssignmentType)
+ .../projects/manticore/manticore/wasm/executor.py:1202:14 - error: Type "int" is not assignable to declared type "F64"
+ "int" is not assignable to "F64" (reportAssignmentType)
+ .../projects/manticore/manticore/wasm/executor.py:1505:14 - error: Type "float" is not assignable to declared type "I32"
+ "float" is not assignable to "I32" (reportAssignmentType)
+ .../projects/manticore/manticore/wasm/executor.py:1513:14 - error: Type "float" is not assignable to declared type "I64"
+ "float" is not assignable to "I64" (reportAssignmentType)
- 26023 errors, 153 warnings, 0 informations
+ 26027 errors, 153 warnings, 0 informations
|
| return 'tuple'; | ||
|
|
||
| case '_struct.iter_unpack': | ||
| return 'iterator'; |
There was a problem hiding this comment.
Review-rule violation (ts-named-types-at-bottom): StructUnpackKind and StructElementType are declared inline, interleaved with the function implementations. Extraction is correct (both appear in multiple signatures, so the tiny-one-off carve-out doesn't apply), but placement is wrong — move both type declarations to the bottom of the file, after getStructElementType.
|
Solid feature with thorough tests. Two blocking items before merge: (1) port the transform into the async functionTransform copy in pylance-internal so sync/async modes agree, and (2) move the two type aliases to the bottom of the file per the repo convention. The native-mode handling for n/N/P is already correct and well-tested. |
|
The struct.unpack synthesis itself looks solid and well-tested, but it was only applied to the pyright/sync copy of applyFunctionTransform. The async copy (packages/pylance-internal/src/analyzer/functionTransform.ts) still handles only functools.total_ordering, which violates the sync↔async parity requirement and yields divergent results across modes. Please port the transform there and add a Pylance-harness regression test covering both modes. |
|
Main blocker: sync↔async parity. The new struct.unpack transform exists only in packages/pyright-internal/src/analyzer/functionTransform.ts but not in the pylance-internal async copy, so results diverge by mode. Please port the transform to the async copy and add a Pylance-harness regression test covering both modes. The implementation itself (literal parsing, native-mode n/N/P handling, element-count cap) looks solid and well-tested. |
Description
Synthesizes a precise return type for
struct.unpack,struct.unpack_from, andstruct.iter_unpackwhen the format string is a literal. Instead of the declaredtuple[Any, ...], callers now get an exact tuple type (e.g.tuple[int, int]) derived from the format codes, so element types and arity are checked and propagated.How you figured out what to do
struct.unpackreturnstuple[Any, ...]in typeshed, losing all element information even when the format string is a known literal. Pyright already applies function-result transforms for special-cased callables (e.g.functools.total_ordering) infunctionTransform.ts, which is the natural hook for synthesizing a more specific return type without touching the wire protocol or typeshed.Implementation
In
applyFunctionTransform, dispatch the_struct.unpack/unpack_from/iter_unpackcallables to a newapplyStructUnpackTransform:str/bytesliteral.parseStructFormatwalks the format string, honoring the optional byte-order character, whitespace, repeat counts, and per-code element kinds (int/float/bool/bytes), withxproducing no element ands/pcollapsing their count into a singlebytes. The native-only codesn/N/Pare only treated asintin native mode (no prefix or@); under an explicit byte-order prefix they fall back to the declared type, matching the runtimestruct.error.makeTupleObject;iter_unpackwraps it inIterator[...].maxStructUnpackElementCount(256) to bound performance.Testing
Added
structUnpack1.pywithassert_typechecks covering numeric codes, repeat counts, mixed kinds,s/c/x, bytes-literal formats, empty/byte-order-only formats, whitespace,unpack_from,iter_unpack, the zero repeat count ("0i"→tuple[()]), the element-count cap fallback ("257i"→tuple[Any, ...]), native-moden/N/Pvs. their invalid prefixed forms, and fallback for unknown codes and non-literal formats. Wired up viaStructUnpack1intypeEvaluator8.test.ts(expects 0 errors).Addresses
Addresses https://github.com/microsoft/pylance-release/issues/11450
Generated by fix_all_my_issues pipeline