S7-1200 FW V4.5: I/Q/M symbolic tag browse via oracle-reconstructed preset dictionary (33/40 tags complete) #757

@tommasofaedo
## Problem

On S7-1200 firmware V4.5 (V3 protocol), EXPLORE requests for the I/Q/M
areas (RIDs 80/81/82) return a zlib blob protected by a Siemens preset
dictionary (magic `78 7D`, FDICT flag set, dict Adler-32 `0xce9b821b`).
Python's `zlib.decompress()` fails with `zlib.error` (zlib status
`Z_NEED_DICT`) because the required dictionary is not supplied: it is
embedded in TIA Portal and not published by Siemens.
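
For reference, the header fields quoted above can be decoded by hand. The sketch below follows RFC 1950; `parse_zlib_header` is a hypothetical helper name (not part of this patch), and the 6-byte input is only the header quoted in this issue, not a real capture.

```python
import zlib


def parse_zlib_header(blob: bytes) -> dict:
    """Decode the 2-byte zlib header and, if present, the 4-byte DICTID."""
    cmf, flg = blob[0], blob[1]
    if ((cmf << 8) | flg) % 31 != 0:
        raise ValueError("not a valid zlib header")
    has_dict = bool(flg & 0x20)           # FDICT is bit 5 of FLG
    dict_id = int.from_bytes(blob[2:6], "big") if has_dict else None
    return {
        "method": cmf & 0x0F,             # 8 = deflate
        "window_bits": (cmf >> 4) + 8,    # 15 means a 32 KiB window
        "fdict": has_dict,
        "dict_id": dict_id,               # Adler-32 of the preset dictionary
        "body_offset": 6 if has_dict else 2,
    }


header = parse_zlib_header(bytes.fromhex("787dce9b821b"))

# A stream built with a stand-in dictionary shows the failure mode:
# without zdict, zlib.decompress() raises zlib.error.
comp = zlib.compressobj(zdict=b"LogicalAddress")
stream = comp.compress(b"LogicalAddress LogicalAddress") + comp.flush()
try:
    zlib.decompress(stream)
    needs_dict = False
except zlib.error:
    needs_dict = True
```

The `body_offset` of 6 is why the raw deflate body starts six bytes after the `78 7D` magic.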

As a result, symbolic tag names, data types, logical addresses, and byte
offsets are unavailable via `browse()` for I/Q/M areas on V3 PLCs
(reported as a known limitation in PR #742 / the browse PR).

## Solution — oracle technique

We reconstructed 594 of the 32768 FDICT bytes using an "oracle" approach:
inflate the same blob four times with four synthetic test dictionaries
(all-zeros, all-0xFF, dictionary A with `A[i] = i%256`, dictionary B with
`B[i] = i>>8`). A byte that is identical in all four outputs is a literal;
a byte that differs was copied from the FDICT, and its values in the A and
B outputs reveal the position it was copied from:
`position = (B_output << 8) | A_output`.
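
A minimal, self-contained sketch of the oracle idea follows. It is an illustration under stated assumptions, not the code in this patch: `oracle_positions` is a hypothetical name, the `secret` dictionary and `plain` payload are made-up stand-ins for the Siemens data, and the probe dictionaries are given the same length as the real one so that positions line up.

```python
import zlib


def oracle_positions(deflate_body: bytes, dict_len: int = 32768) -> list:
    """For each output byte: None if literal, else the dictionary index it came from."""
    probes = [
        bytes(dict_len),                            # all zeros
        b"\xff" * dict_len,                         # all 0xFF
        bytes(i & 0xFF for i in range(dict_len)),   # A: low byte of the index
        bytes(i >> 8 for i in range(dict_len)),     # B: high byte of the index
    ]
    # Inflate the same raw deflate body once per probe dictionary.
    zero, ones, a_out, b_out = (
        zlib.decompressobj(wbits=-15, zdict=d).decompress(deflate_body)
        for d in probes
    )
    return [
        None if z == o == a == b else (b << 8) | a
        for z, o, a, b in zip(zero, ones, a_out, b_out)
    ]


# Stand-in data (hypothetical, not the real blob format).
secret = b'<Tag Name="" DataType="Bool" LogicalAddress="%I0.0"/>'
plain = b'<Tag Name="Start" DataType="Bool" LogicalAddress="%I0.0"/>'
comp = zlib.compressobj(zdict=secret)
blob = comp.compress(plain) + comp.flush()

# Skip the 2-byte header and 4-byte DICTID, then probe the deflate body.
positions = oracle_positions(blob[6:], dict_len=len(secret))

# Where the true plaintext is known, each copied byte pins one dictionary byte.
recovered = {p: plain[k] for k, p in enumerate(positions) if p is not None}
```

In this sketch the dictionary bytes are filled in from a known plaintext; bytes that the compressor never referenced stay unknown, which is consistent with only part of the 32768 positions being recoverable.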

The **same FDICT** (Adler-32 `0xce9b821b`) is used for all three areas
(I, Q, M) — confirmed on three independent Wireshark pcapng captures.

With 594 FDICT positions known, `_extract_tags()` anchors on always-literal
ID values and recovers Name/DataType/LogicalAddress/ByteOffset from a
context window before each ID.

### Byte-type fallback (I/Q areas)

LogicalAddress is reconstructed by elimination over the data type:
- **Bool** tags → FDICT encodes `LogicalAddress="%I43.{bit}"` (garbled area
  letter, correct bit); reconstruct as `%{area}{ByteOffset}.{bit}`.
- **Word/Int** tags → `%IW` / `%QW` are literal in the blob; append
  ByteOffset to get `%IW{N}` / `%QW{N}`.
- **Byte** tags → only remaining type; oracle confirms LogicalAddress value
  is not encoded. Reconstruct as `%IB{ByteOffset}` / `%QB{ByteOffset}`.
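
The three rules above can be sketched as one function. This is a hedged illustration: `reconstruct_logical_address` and its parameters are hypothetical names, and it covers only the data types listed in this section for the I and Q areas.

```python
from typing import Optional


def reconstruct_logical_address(
    area: str, data_type: str, byte_offset: int, bit: Optional[int] = None
) -> str:
    """Rebuild a LogicalAddress string for an I/Q tag from its recovered fields."""
    if area not in ("I", "Q"):
        raise ValueError("fallback applies to I/Q areas only")
    if data_type == "Bool":
        # The bit number is recovered correctly; area letter and byte are rebuilt.
        if bit is None:
            raise ValueError("Bool tags need a bit number")
        return f"%{area}{byte_offset}.{bit}"
    if data_type in ("Word", "Int"):
        # '%IW' / '%QW' are literal in the blob; only the offset is appended.
        return f"%{area}W{byte_offset}"
    if data_type == "Byte":
        # Only remaining type: the address is not encoded, so derive it.
        return f"%{area}B{byte_offset}"
    raise ValueError(f"unsupported data type: {data_type}")


examples = [
    reconstruct_logical_address("I", "Bool", 0, bit=3),
    reconstruct_logical_address("Q", "Word", 2),
    reconstruct_logical_address("I", "Byte", 5),
]
```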

### Structural limit — M area (confirmed by pcapng oracle)

Oracle analysis of Wireshark captures of all 15 M area tags shows the
deflate stream uses an **identical sequence** for Bool, Byte, and Word
addresses. It is not possible to distinguish `%MB` from `%MW` from the
blob alone. The 6 affected tags have correct `ByteOffset` values but
`LogicalAddress = ?`.

## Results (192.168.5.11, S7-1200 CPU 1212C DC/DC/DC, FW V4.5)

| Area | RID | Tags found | Complete | Notes |
|------|-----|-----------|----------|-------|
| I    | 80  | 13/13     | ✅ 100%  | Name, DataType, LogicalAddress, ByteOffset all correct |
| Q    | 81  | 11/11     | ✅ 100%  | Same — includes custom names (0_output, 100_output, output_0_0) |
| M    | 82  | 15 total  | 9/15     | 6 Byte/Word gap tags: ByteOffset correct, LogicalAddress unknown |

Score vs TIA Portal export: **33/40 correct, 6 gap (structural limit), 1 pending, 0 wrong**.

## Changes

### New: `browse_tags.py`

Standalone script. Contains:

- `_build_fdict()` — builds the 32768-byte dict from 594 confirmed positions
- `_fetch_area(rid, fdict)` — connects to PLC, sends EXPLORE, decompresses
- `_extract_tags(data, area_prefix)` — regex extraction anchored on literal IDs
- `main()` — CLI: `python browse_tags.py [I] [Q] [M]`

Requires Patches 1, 5, 6 (SequenceNumber, multi-frame collect, session key)
to be already applied to `s7/connection.py` and `s7/_s7commplus_client.py`.

### `s7/_s7commplus_client.py` — add `browse_tags()` method

```python
def browse_tags(self, areas=('I', 'Q', 'M')) -> dict[str, list[dict]]:
    """Browse symbolic tags in I/Q/M areas using oracle-reconstructed FDICT.

    Returns a dict mapping area letter to list of tag dicts.
    Each tag dict: {Name, DataType, LogicalAddress, ByteOffset, ID}.
    LogicalAddress may be '?' for M-area Byte/Word tags (structural limit).
    """
    import zlib
    from ._browse_fdict import _build_fdict, _extract_tags

    area_rids = {'I': 80, 'Q': 81, 'M': 82}
    fdict = _build_fdict()
    result = {}
    for area in areas:
        rid = area_rids[area]
        payload = _build_explore_payload_v3(rid)
        first = self._connection.send_request(FunctionCode.EXPLORE, payload)
        raw = self._connection._collect_explore_frames(first)
        # Locate the zlib header (CMF 0x78, FLG 0x7D with the FDICT flag set).
        p = raw.find(b'\x78\x7d')
        if p < 0:
            result[area] = []
            continue
        try:
            # Skip the 2-byte header and 4-byte DICTID, then inflate the raw
            # deflate stream with the reconstructed preset dictionary.
            data = zlib.decompressobj(wbits=-15, zdict=fdict).decompress(raw[p + 6:])
        except zlib.error:
            result[area] = []
            continue
        result[area] = _extract_tags(data, area_prefix='%' + area)
    return result
```

## Tested on

- PLC: Siemens S7-1200 CPU 1212C DC/DC/DC
- Firmware: V4.5
- Protocol: V3 (no TLS, no password)
- Tag count: 40 tags in TIA Portal (13 I, 11 Q, 15 M + 1 pending)
- Verified against: TIA Portal export (Full_List_PLC_Tags.xlsx)

## Known limitation

The 6 M-area gap tags (Tag_5/11/16/18/20/22, all Byte or Word type) cannot
have their LogicalAddress recovered from the blob alone. ByteOffset is
always correct. A hardcoded lookup table or a separate READ-based DataType
probe could resolve this, but both approaches are project-specific and are
not included in this patch.

