.. _ncdb-format:

############################
NCDB Coverage File Format
############################

NCDB (*Native Coverage DataBase*) is a compact, ZIP-based binary format for
storing and merging UCIS coverage data. A single ``.cdb`` file is a standard
ZIP archive whose members encode the scope hierarchy, hit counts, test
history, and source file references.

The format is designed to be:

* **Space-efficient** — typically 100–200× smaller than the equivalent SQLite
  ``.cdb`` (see :ref:`ncdb-benchmarks`).
* **Merge-fast** — same-schema merges reduce to element-wise integer addition
  over a flat array, with no SQL overhead.
* **Self-describing** — a ``manifest.json`` at the root of the archive carries
  all metadata needed to read or merge the file without any external schema.
* **Readable without PyUCIS** — every binary encoding is documented here in
  sufficient detail to write an independent parser.

.. contents:: On this page
   :local:
   :depth: 2

-----------

**********************
1. File identification
**********************

Both NCDB and the legacy SQLite backend use the ``.cdb`` extension. Format
discrimination is done by inspecting the first 16 bytes of the file.

.. list-table::
   :header-rows: 1
   :widths: 20 30 50

   * - Format
     - Header (hex)
     - Description
   * - SQLite
     - ``53 51 4C 69 74 65 20 66 6F 72 6D 61 74 20 33 00``
     - Literal ASCII ``SQLite format 3\x00``
   * - NCDB (non-empty)
     - ``50 4B 03 04 …``
     - ZIP local-file header signature ``PK\x03\x04``
   * - NCDB (empty archive)
     - ``50 4B 05 06 …``
     - ZIP end-of-central-directory signature ``PK\x05\x06``

**Detection algorithm:**

1. Read the first 16 bytes of the file.
2. If ``bytes[0:16]`` equals the SQLite magic string → format is ``sqlite``.
3. If ``bytes[0:4]`` is ``PK\x03\x04`` or ``PK\x05\x06``:

   a. Open as ZIP.
   b. Read ``manifest.json``.
   c. If ``manifest["format"] == "NCDB"`` → format is ``ncdb``.

4. Otherwise → format is ``unknown``.

-----------

***********************
2. Archive structure
***********************

An NCDB file is a **standard ZIP archive** (DEFLATE compression) whose members
are named as follows. Members marked *required* must be present in every valid
NCDB file; others are only written when the corresponding data is non-empty or
non-default.

.. list-table::
   :header-rows: 1
   :widths: 25 12 63

   * - Member name
     - Required
     - Contents
   * - ``manifest.json``
     - ✓
     - Format identity, version, statistics, and the schema hash.
   * - ``strings.bin``
     - ✓
     - Deduplicated string table referenced by index throughout other members.
   * - ``scope_tree.bin``
     - ✓
     - DFS-serialized scope hierarchy (V2 encoding). Counts are *not* stored
       here.
   * - ``counts.bin``
     - ✓
     - Flat array of hit counts in the same DFS order as ``scope_tree.bin``.
   * - ``history.json``
     - ✓
     - Array of test-run and merge history records.
   * - ``sources.json``
     - ✓
     - Ordered list of source file paths; indices match file IDs in
       ``scope_tree.bin``.
   * - ``attrs.bin``
     - —
     - User-defined attribute assignments (V2 JSON: scopes, coveritems,
       history nodes, and global attributes).
   * - ``tags.json``
     - —
     - Tag assignments for scopes (sparse, DFS-indexed).
   * - ``toggle.bin``
     - —
     - Per-signal toggle metadata (JSON: canonical name, metric, type,
       direction).
   * - ``fsm.bin``
     - —
     - FSM state-index overrides (JSON, sparse; only written when state
       indices differ from the default 0, 1, 2, … sequence).
   * - ``cross.bin``
     - —
     - Cross-coverpoint link records (JSON: crossed coverpoint sibling names).
   * - ``properties.json``
     - —
     - Typed string property values (DFS scope-indexed).
   * - ``design_units.json``
     - —
     - Design-unit name-to-DFS-index lookup table (name, index, scope type).
   * - ``formal.bin``
     - —
     - Formal-verification assertion data (JSON: status, radius, witness).
   * - ``coveritem_flags.bin``
     - —
     - Per-coveritem non-default flags (sparse delta-encoded binary).
   * - ``contrib/.bin``
     - —
     - Per-test coveritem contribution arrays (delta-encoded, sparse).
       One file per history node that has contributions; the file name
       (without the ``.bin`` suffix) is the integer history-node index
       (not zero-padded).

-----------

***********************
3. Primitive encodings
***********************

.. _ncdb-varint:

3.1 Unsigned LEB128 varint
==========================

All variable-length integers in NCDB are encoded as **unsigned LEB128** (also
called unsigned varint or ULEB128). This is the same encoding used by DWARF,
WebAssembly, and Protocol Buffers (field type ``uint64``).

**Encoding:**

1. Take the 7 least-significant bits of the value; set bit 7 to ``1`` if more
   bytes follow, ``0`` if this is the last byte.
2. Shift the value right by 7. Repeat until the value is zero.

.. code-block:: text

   value        bytes (hex)
   ────────────────────
   0            00
   1            01
   127          7F
   128          80 01
   255          FF 01
   16383        FF 7F
   16384        80 80 01
   2³²−1        FF FF FF FF 0F
   2⁶⁴−1        FF FF FF FF FF FF FF FF FF 01

**Decoding:**

Read bytes one at a time. For each byte, take the low 7 bits and OR them into
the accumulator at the current bit position (starting at 0). Advance the bit
position by 7. If bit 7 of the byte is set, continue reading; otherwise stop.

.. code-block:: python

   def decode_varint(buf: bytes, offset: int = 0):
       result, shift = 0, 0
       while True:
           byte = buf[offset]
           offset += 1
           result |= (byte & 0x7F) << shift
           shift += 7
           if not (byte & 0x80):
               return result, offset

3.2 UTF-8 strings
=================

All text is UTF-8. Strings stored inline (e.g. in JSON members) are standard
JSON strings. Strings stored in binary members (``scope_tree.bin``,
``strings.bin``) are referenced by their **string-table index** (a varint).

-----------

********************
4. manifest.json
********************

A JSON object with the following fields (all present; unknown fields must be
ignored by readers for forward compatibility):

.. code-block:: json

   {
     "format": "NCDB",
     "version": "1.0",
     "ucis_version": "1.0",
     "created": "2026-02-25T21:00:00Z",
     "path_separator": "/",
     "scope_count": 42,
     "coveritem_count": 8800,
     "test_count": 64,
     "total_hits": 155432,
     "covered_bins": 7312,
     "schema_hash": "sha256:a3f1...",
     "generator": "pyucis-ncdb"
   }

.. list-table::
   :header-rows: 1
   :widths: 25 75

   * - Field
     - Description
   * - ``format``
     - Always the string ``"NCDB"``. Readers must reject files where this is
       not ``"NCDB"``.
   * - ``version``
     - Format version string. Currently ``"1.0"``. Readers should check the
       major component; a mismatch should produce a clear error.
   * - ``ucis_version``
     - UCIS standard version the data conforms to. Currently ``"1.0"``.
   * - ``created``
     - ISO 8601 UTC timestamp when the file was written.
   * - ``path_separator``
     - Hierarchical path separator used in scope names. Typically ``"/"``.
   * - ``scope_count``
     - Total number of scopes in ``scope_tree.bin`` (informational).
   * - ``coveritem_count``
     - Total number of coveritems. Must equal the length of the array in
       ``counts.bin``.
   * - ``test_count``
     - Number of TEST-kind entries in ``history.json``.
   * - ``total_hits``
     - Sum of all values in ``counts.bin``.
   * - ``covered_bins``
     - Number of non-zero values in ``counts.bin``.
   * - ``schema_hash``
     - ``"sha256:"`` followed by the lowercase hex SHA-256 digest of the
       **uncompressed** ``scope_tree.bin`` content. Used by the fast-merge
       path to verify schema identity without parsing the scope tree.
       (See :ref:`ncdb-merge`.)
   * - ``generator``
     - Free-form tool identification string.

-----------

********************
5. strings.bin
********************

A deduplicated string table. Every string used anywhere in ``scope_tree.bin``
(scope names, coveritem names) is stored exactly once here and referenced by a
zero-based integer index.

**Binary layout:**

.. code-block:: text

   [count   : varint]           — number of strings
   [len_0   : varint]           — byte length of string 0 (UTF-8 encoded)
   [bytes_0 : len_0 bytes]      — UTF-8 bytes of string 0
   [len_1   : varint]
   [bytes_1 : len_1 bytes]
   ...

* **Index 0** is always the empty string ``""``.
* String indices are stable: the same string always maps to the same index
  within a single file (indices are assigned in first-encounter DFS order).

-----------

************************
6. scope_tree.bin
************************

The complete scope hierarchy encoded as a depth-first traversal. The file
contains a flat sequence of scope records with no explicit end marker; the
count of child scopes embedded in each record defines the nesting.

Counts (hit values) are **not** stored in this member. Instead, each coveritem
encountered during DFS appends its hit count to ``counts.bin`` in the same
traversal order. A reader reconstructs the association by walking
``scope_tree.bin`` and consuming counts from ``counts.bin`` in lockstep.

6.1 Scope record types
=======================

Every scope record begins with a one-byte **marker**:

.. list-table::
   :header-rows: 1
   :widths: 15 20 65

   * - Marker byte
     - Name
     - Description
   * - ``0x00``
     - ``REGULAR``
     - Full scope record with type, name, presence bitfield, and children.
   * - ``0x01``
     - ``TOGGLE_PAIR``
     - Compact 2-field record for BRANCH scopes that carry exactly two
       TOGGLEBIN coveritems with the implicit names ``"0 -> 1"`` and
       ``"1 -> 0"``. Saves ~10 bytes per signal.

6.2 REGULAR scope record
=========================

.. code-block:: text

   [marker       : 1 byte ]  always 0x00
   [scope_type   : varint ]  ScopeTypeT integer value
   [name_ref     : varint ]  index into strings.bin
   [presence     : varint ]  bitfield of optional fields present (see below)

   — optional fields, each present only if the corresponding bit is set —
   [flags        : varint ]  only if PRESENCE_FLAGS       (bit 0) set
   [file_id      : varint ]  only if PRESENCE_SOURCE      (bit 1) set
   [line         : varint ]    "
   [token        : varint ]    "
   [weight       : varint ]  only if PRESENCE_WEIGHT      (bit 2) set
   [at_least     : varint ]  only if PRESENCE_AT_LEAST    (bit 3) set
   [goal         : varint ]  only if PRESENCE_GOAL        (bit 5) set
   [source_type  : varint ]  only if PRESENCE_SOURCE_TYPE (bit 6) set

   — always present —
   [num_children : varint ]  number of child scope records that follow
   [num_covers   : varint ]  number of coveritem records that follow

   — present only when num_covers > 0 —
   [cover_type   : varint ]  CoverTypeT of all coveritems in this scope

   — num_covers coveritem records —
   [name_ref_ci  : varint ] × num_covers   (one per coveritem)

   — num_children child scope records (recursive) —

**Presence bitfield values:**

.. list-table::
   :header-rows: 1
   :widths: 10 20 70

   * - Bit
     - Name
     - Meaning
   * - 0
     - ``PRESENCE_FLAGS``
     - Non-default scope flags are stored.
   * - 1
     - ``PRESENCE_SOURCE``
     - Source location (``file_id``, ``line``, ``token``) is stored.
   * - 2
     - ``PRESENCE_WEIGHT``
     - Non-default scope weight (≠ 1) is stored.
   * - 3
     - ``PRESENCE_AT_LEAST``
     - An ``at_least`` threshold that overrides the cover-type default is
       stored at the scope level (applies to all coveritems in the scope).
   * - 4
     - ``PRESENCE_CVG_OPTS``
     - Reserved for covergroup options (not yet used by the writer).
   * - 5
     - ``PRESENCE_GOAL``
     - Non-default scope goal (≠ −1) is stored.
   * - 6
     - ``PRESENCE_SOURCE_TYPE``
     - Explicit ``SourceT`` enum value is stored. When absent, the source
       type defaults to ``SourceT.NONE``.

**Cover-type defaults** (used when ``PRESENCE_AT_LEAST`` is absent):

.. list-table::
   :header-rows: 1
   :widths: 30 15 15 15

   * - CoverTypeT
     - flags default
     - at_least default
     - weight default
   * - ``CVGBIN``
     - ``0x19``
     - **1**
     - 1
   * - All others (TOGGLEBIN, STMTBIN, BRANCHBIN, …)
     - ``0x01``
     - 0
     - 1

6.3 TOGGLE_PAIR record
=======================

.. code-block:: text

   [marker   : 1 byte ]  always 0x01
   [name_ref : varint ]  scope name index in strings.bin

A TOGGLE_PAIR record implicitly encodes:

* Scope type: ``BRANCH``
* Two TOGGLEBIN coveritems with names ``"0 -> 1"`` and ``"1 -> 0"`` (in that
  order).
* Two consecutive entries are consumed from ``counts.bin``: first the
  ``"0 -> 1"`` count, then the ``"1 -> 0"`` count.

No child scope records follow a TOGGLE_PAIR.

6.4 Scope-type integer values
==============================

The ``scope_type`` varint uses the integer values of ``ScopeTypeT``. The most
common values are:

.. list-table::
   :header-rows: 1
   :widths: 15 45 40

   * - Value
     - ScopeTypeT name
     - Typical context
   * - 2
     - ``DU_MODULE``
     - Design-unit scope for a Verilog module
   * - 16
     - ``INSTANCE``
     - Instantiation of a design unit
   * - 22
     - ``COVERGROUP``
     - SystemVerilog covergroup type or instance
   * - 23
     - ``COVERPOINT``
     - SystemVerilog coverpoint
   * - 28
     - ``CROSS``
     - SystemVerilog cross
   * - 30
     - ``BRANCH``
     - Code-coverage branch (toggle pair or regular)
   * - 32
     - ``TOGGLE``
     - Toggle scope (parent of BRANCH scopes)
   * - 33
     - ``FSM``
     - Finite state machine
   * - 36
     - ``BLOCK``
     - Statement block

The full set of values is defined in ``ucis/scope_type_t.py``.

-----------

********************
7. counts.bin
********************

A flat array of non-negative integers, one per coveritem, in the **same DFS
order** as the coveritems encountered while reading ``scope_tree.bin``.
TOGGLE_PAIR scopes contribute two consecutive counts (``"0 -> 1"`` then
``"1 -> 0"``).

The array length is given by ``coveritem_count`` in ``manifest.json``.

7.1 Binary layout
==================

.. code-block:: text

   [mode   : 1 byte ]  0 = UINT32, 1 = VARINT
   [count  : varint ]  number of integers that follow
   [data   : …      ]  mode-dependent encoding (see below)

**Mode 0 — UINT32:**

Each integer is a 4-byte little-endian unsigned 32-bit value. Used when most
counts are large (i.e. varint encoding would not save space).

.. code-block:: text

   [v_0 : 4 bytes LE] [v_1 : 4 bytes LE] … [v_{n-1} : 4 bytes LE]

**Mode 1 — VARINT:**

Each integer is encoded as an unsigned LEB128 varint (see
:ref:`ncdb-varint`). Used when most counts are small (0–127), which is the
common case for per-test databases.

.. code-block:: text

   [varint_0] [varint_1] … [varint_{n-1}]

**Mode selection:** The writer computes both encodings and selects VARINT when
``len(varint_encoding) < count × 4`` (i.e. when it is strictly smaller),
falling back to UINT32 otherwise. A reader must support both modes.

7.2 Efficient single-byte fast path
=====================================

When mode is VARINT and all values fit in a single byte (0–127), each byte in
the data section is equal to the corresponding count value (the high bit is
never set). A parser can exploit this: scan the data section for any byte
≥ 0x80; if none are found, each byte *is* its value, and the entire section
can be decoded with a single ``bytes → list`` conversion.

-----------

********************
8. history.json
********************

A JSON array of history node records. Each element represents either a test
run (``kind: "TEST"``) or a merge operation (``kind: "MERGE"``).

**Record schema:**

.. code-block:: json

   [
     {
       "logical_name": "regression_seed_42",
       "physical_name": null,
       "kind": "TEST",
       "test_status": 0,
       "tool_category": "sim",
       "date": "2026-02-25",
       "sim_time": 1500.0,
       "time_unit": "ns",
       "run_cwd": "/home/user/sim",
       "cpu_time": 12.3,
       "seed": "42",
       "cmd": "vsim -seed 42 top",
       "args": "",
       "compulsory": null,
       "user_name": "jsmith",
       "cost": 0.0,
       "ucis_version": null,
       "vendor_id": null,
       "vendor_tool": null,
       "vendor_tool_version": null,
       "same_tests": null,
       "comment": null
     }
   ]

.. list-table::
   :header-rows: 1
   :widths: 20 20 60

   * - Field
     - Type
     - Description
   * - ``logical_name``
     - string
     - Unique name for this history node (test name or merge label).
   * - ``physical_name``
     - string | null
     - Physical file name associated with the history node, or ``null``.
   * - ``kind``
     - ``"TEST"`` | ``"MERGE"``
     - History node kind.
   * - ``test_status``
     - integer
     - Test status code: 0 = OK, 1 = WARNING, 2 = ERROR, 3 = FATAL,
       4 = NOTRUN.
   * - ``tool_category``
     - string
     - Free-form tool category (e.g. ``"sim"``, ``"formal"``).
   * - ``date``
     - string
     - Date string (ISO 8601 recommended).
   * - ``sim_time``
     - number
     - Simulation end time in ``time_unit`` units.
   * - ``time_unit``
     - string
     - Simulation time unit (e.g. ``"ns"``, ``"ps"``).
   * - ``run_cwd``
     - string
     - Working directory of the simulation run.
   * - ``cpu_time``
     - number
     - CPU seconds consumed.
   * - ``seed``
     - string
     - Random seed used.
   * - ``cmd``
     - string
     - Simulator command line.
   * - ``args``
     - string
     - Additional arguments.
   * - ``compulsory``
     - any | null
     - Compulsory flag (tool-defined), or ``null`` if unset.
   * - ``user_name``
     - string
     - Username that ran the simulation.
   * - ``cost``
     - number
     - Simulation cost (tool-defined).
   * - ``ucis_version``
     - string | null
     - UCIS version associated with this history node, or ``null``.
   * - ``vendor_id``
     - string | null
     - Vendor identifier, or ``null``.
   * - ``vendor_tool``
     - string | null
     - Vendor tool name, or ``null``.
   * - ``vendor_tool_version``
     - string | null
     - Vendor tool version, or ``null``.
   * - ``same_tests``
     - integer | null
     - Number of identical tests merged, or ``null``.
   * - ``comment``
     - string | null
     - Free-form comment, or ``null``.

-----------

********************
9. sources.json
********************

A JSON array of strings, where each element is an absolute or relative file
path. The position of each path in the array is its **file ID**, which is the
integer used as ``file_id`` in ``scope_tree.bin`` source references.

.. code-block:: json

   [
     "/home/user/design/top.sv",
     "/home/user/design/alu.sv",
     "/home/user/tb/coverage_pkg.sv"
   ]

File ID 0 corresponds to the first element. An empty ``sources.json``
(``[]``) is valid when no source information was recorded.

-----------

.. _ncdb-merge:

**************************
10. Merging NCDB files
**************************

The key performance advantage of NCDB over SQLite is the **same-schema fast
merge path**, which reduces a multi-file merge to element-wise integer
addition.

10.1 Same-schema fast merge
============================

Two NCDB files are *schema-compatible* if and only if their ``schema_hash``
values are equal. The ``schema_hash`` is ``"sha256:"`` followed by the SHA-256
digest of the uncompressed ``scope_tree.bin`` bytes; equal hashes guarantee an
identical scope hierarchy and coveritem ordering.

**Algorithm for merging N same-schema files into one output file:**

1. Read ``manifest.json`` from all N sources. Verify ``schema_hash`` is
   identical for all; if not, fall back to the cross-schema path.
2. Read ``counts.bin`` from all N sources → N lists of integers.
3. Compute the merged count array: element-wise sum of all N lists.
   (In Python: ``list(map(sum, zip(*all_counts)))``)
4. Concatenate all ``history.json`` arrays from all sources. Append a new
   MERGE history node that references all source names.
5. Copy ``strings.bin``, ``scope_tree.bin``, and ``sources.json`` verbatim
   from the first source (they are identical for same-schema files).
6. Write the output ZIP with the merged manifest, the copied schema members,
   the merged ``counts.bin``, and the combined ``history.json``.

The scope tree and string table never need to be decoded for a same-schema
merge.

10.2 Cross-schema merge
========================

When the schema hashes differ, the merger must parse both scope trees, match
scopes by ``(path, type, name)`` key, and add counts for matched coveritems.
Unmatched coveritems from either source are appended with their original
counts. This path is slower but correct for merging databases from designs
that have evolved between runs.

10.3 Merge history node
========================

A merge operation appends a ``"MERGE"``-kind history node to
``history.json``:

.. code-block:: json

   {
     "logical_name": "merge:output.cdb",
     "physical_name": null,
     "kind": "MERGE",
     "test_status": 0,
     "tool_category": "merge",
     "date": "2026-02-25T21:00:00Z"
   }

-----------

*******************************
11. Optional binary members
*******************************

These members are omitted from the archive when the corresponding data is
absent or all-default. Readers must silently skip any optional member they do
not support, and must not fail if an expected optional member is absent.

11.1 attrs.bin
==============

User-defined attribute assignments. Despite the ``.bin`` extension, this
member is JSON-encoded.

**Format v2** (current):

.. code-block:: json

   {
     "version": 2,
     "scopes": [
       {"idx": 0, "attrs": {"key": "value"}}
     ],
     "coveritems": [
       {"scope_idx": 0, "ci_idx": 1, "attrs": {"key": "value"}}
     ],
     "history": [
       {"idx": 0, "kind": "TEST", "attrs": {"key": "value"}}
     ],
     "global": {"key": "value"}
   }

``idx`` / ``scope_idx`` values are DFS scope indices (same ordering as
``scope_tree.bin``). ``ci_idx`` is the zero-based coveritem position within
its parent scope.
Only objects with at least one attribute are included (sparse).

The reader also accepts legacy **v1** files that store only scope-level
attributes.

11.2 tags.json
==============

Tag assignments for scopes (sparse, DFS-indexed).

.. code-block:: json

   {
     "version": 1,
     "entries": [
       {"idx": 0, "tags": ["tag_a", "tag_b"]}
     ]
   }

``idx`` is the DFS scope index. Only scopes with at least one tag are
included.

11.3 toggle.bin
================

Per-signal toggle metadata for ``TOGGLE``-type scopes. Despite the ``.bin``
extension, this member is JSON-encoded.

.. code-block:: json

   {
     "version": 1,
     "entries": [
       {"idx": 5, "canonical": "top.clk", "metric": 0, "type": 1, "dir": 2}
     ]
   }

``idx`` is the DFS scope index. All fields except ``idx`` are optional and are
omitted when they match the defaults (``metric`` =
``ToggleMetricT._2STOGGLE``, ``type`` = ``ToggleTypeT.NET``, ``dir`` =
``ToggleDirT.INTERNAL``). Only ``TOGGLE`` scopes with at least one non-default
value are included.

11.4 fsm.bin
=============

FSM state-index overrides for ``FSM``-type scopes. Despite the ``.bin``
extension, this member is JSON-encoded. State and transition names are already
stored in ``scope_tree.bin`` as FSMBIN coveritems under FSM_STATES and
FSM_TRANS sub-scopes; this member only records non-sequential state indices.

.. code-block:: json

   {
     "version": 1,
     "entries": [
       {"fsm_idx": 3, "states": [{"name": "IDLE", "index": 5}]}
     ]
   }

``fsm_idx`` is the DFS scope index of the ``FSM`` scope. Only FSM scopes whose
state indices differ from the default 0, 1, 2, … sequence are included. The
member is omitted entirely when all indices are sequential.

11.5 cross.bin
===============

Cross-coverpoint link records for ``CROSS``-type scopes. Despite the ``.bin``
extension, this member is JSON-encoded.

.. code-block:: json

   {
     "version": 1,
     "entries": [
       {"idx": 12, "crossed": ["cp_a", "cp_b"]}
     ]
   }

``idx`` is the DFS scope index of the ``CROSS`` scope.
``crossed`` lists the ``getScopeName()`` values of each crossed coverpoint
(sibling scopes within the same parent COVERGROUP/COVERINSTANCE).

11.6 properties.json
=====================

Typed string property values for scopes (DFS-indexed).

.. code-block:: json

   {
     "version": 1,
     "entries": [
       {"kind": "scope", "idx": 0, "key": 1, "type": "str", "value": "comment text"}
     ]
   }

``key`` is the integer value of the ``StrProperty`` enum. Only scopes with
explicitly-set properties are included.

11.7 design_units.json
=======================

Design-unit name-to-DFS-index lookup table.

.. code-block:: json

   {
     "version": 1,
     "units": [
       {"name": "top", "idx": 0, "type": 2}
     ]
   }

``type`` is the integer value of ``ScopeTypeT`` (e.g. 2 = ``DU_MODULE``). Only
DU_ANY scopes are included. The member is omitted when no design units are
present.

11.8 formal.bin
================

Formal-verification assertion data. Despite the ``.bin`` extension, this
member is JSON-encoded.

.. code-block:: json

   {
     "version": 1,
     "entries": [
       {"idx": 42, "status": 1, "radius": 100, "witness": "/path/to/witness.vcd"}
     ]
   }

``idx`` is the flat DFS coveritem index (same ordering as ``counts.bin``).
Fields ``status``, ``radius``, and ``witness`` are each omitted when they
match the defaults (0, 0, ``null`` respectively). Defaults: ``status`` =
``FormalStatusT.NONE`` (0), ``radius`` = 0.

11.9 coveritem_flags.bin
=========================

Per-coveritem non-default flags. This member uses a true binary encoding
(sparse, delta-encoded varint pairs).

.. code-block:: text

   [version     : varint]   always 1
   [num_entries : varint]   number of (index, flags) pairs
   per entry:
     [delta_idx : varint]   coveritem DFS index delta from previous entry
     [flags     : varint]   ucisFlagsT value

Only coveritems whose flags differ from the cover-type default (see cover-type
defaults table in Section 6.2) are included. The member is omitted entirely
when all coveritems use default flags.
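
The ``coveritem_flags.bin`` layout in Section 11.9 can be decoded with a short
loop over varint pairs. The following is a minimal sketch rather than PyUCIS
code: ``read_varint`` and ``decode_coveritem_flags`` are hypothetical helper
names, and the sketch assumes that the first entry's delta is measured from
index 0, which the layout description does not state explicitly.

.. code-block:: python

   def read_varint(data: bytes, offset: int):
       """Decode one unsigned LEB128 varint; return (value, next_offset)."""
       result, shift = 0, 0
       while True:
           byte = data[offset]
           offset += 1
           result |= (byte & 0x7F) << shift
           shift += 7
           if not (byte & 0x80):
               return result, offset


   def decode_coveritem_flags(raw: bytes) -> dict:
       """Hypothetical helper: map coveritem DFS index -> flags value."""
       version, offset = read_varint(raw, 0)
       if version != 1:
           raise ValueError(f"unsupported coveritem_flags version {version}")
       num_entries, offset = read_varint(raw, offset)
       flags_by_index = {}
       index = 0  # assumption: the first delta is relative to index 0
       for _ in range(num_entries):
           delta, offset = read_varint(raw, offset)
           flags, offset = read_varint(raw, offset)
           index += delta  # undo the delta encoding
           flags_by_index[index] = flags
       return flags_by_index

Coveritems that do not appear in the returned dictionary keep the cover-type
default flags, so a reader would typically apply this mapping as an overlay on
top of those defaults.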
11.10 contrib/.bin
=============================

Per-test contribution arrays. One file per history node that recorded
contributions; the file name (without the ``.bin`` suffix) is the integer
history-node index (not zero-padded). Each file encodes a sparse,
delta-encoded array of per-test hit counts, allowing reconstruction of which
tests hit which bins.

.. code-block:: text

   [num_entries : varint]
   per entry (sorted by bin_index, ascending):
     [delta_bin_index : varint]   bin_index − previous bin_index
     [count           : varint]   hit count for this bin from this test

-----------

***********************
12. Version history
***********************

.. list-table::
   :header-rows: 1
   :widths: 15 85

   * - Version
     - Changes
   * - ``1.0``
     - Initial release. Scope-tree V2 encoding with presence bitfield and
       TOGGLE_PAIR optimization. Varint + UINT32 dual-mode counts encoding.
       Same-schema fast-merge path via ``schema_hash``.
   * - ``1.0`` (UCIS compliance update)
     - Added presence bits 4–6 (``PRESENCE_CVG_OPTS``, ``PRESENCE_GOAL``,
       ``PRESENCE_SOURCE_TYPE``) to scope records. Added
       ``coveritem_flags.bin`` member for per-coveritem non-default flags.
       Updated ``history.json`` to UCIS-compliant field names
       (``logical_name``, ``physical_name``, ``test_status``, ``sim_time``,
       ``time_unit``, ``run_cwd``, ``cpu_time``, ``user_name``) and added
       vendor/tool fields (``ucis_version``, ``vendor_id``, ``vendor_tool``,
       ``vendor_tool_version``, ``same_tests``, ``comment``, ``compulsory``).
       Upgraded ``attrs.bin`` to V2 format with sections for scopes,
       coveritems, history nodes, and global attributes. Updated cover-type
       default flags to ``0x01`` (most types) / ``0x19`` (``CVGBIN``).
       Documented ``cross.bin``, ``properties.json``, ``design_units.json``,
       ``formal.bin``, and ``contrib/`` formats.

-----------

.. _ncdb-benchmarks:

*************************************
13. Size and performance reference
*************************************

Measurements using synthetic BM1–BM6 benchmark databases (pure Python, no C
accelerator, median of 3 merge runs):

.. list-table::
   :header-rows: 1
   :widths: 20 10 16 12 16 12 14

   * - Workload
     - Bins
     - SQLite/test
     - NCDB/test
     - Size ratio
     - SQLite merge
     - NCDB merge
   * - BM1 Counter
     - 5
     - 276 KB
     - 1.3 KB
     - **209×**
     - 22 ms
     - 1.2 ms
   * - BM2 ALU
     - 104
     - 276 KB
     - 1.4 KB
     - **196×**
     - 24 ms
     - 1.7 ms
   * - BM3 Protocol
     - 180
     - 276 KB
     - 1.4 KB
     - **195×**
     - 29 ms
     - 3.5 ms
   * - BM4 Hierarchy
     - 117
     - 276 KB
     - 1.4 KB
     - **195×**
     - 28 ms
     - 4.0 ms
   * - BM5 Bins (8K)
     - 8 800
     - 276 KB
     - 2.3 KB
     - **122×**
     - 40 ms
     - 17 ms
   * - BM6 SoC
     - 256
     - 276 KB
     - 1.4 KB
     - **192×**
     - 72 ms
     - 12 ms

*Merge seed counts: BM1=4, BM2=16, BM3=32, BM4=32, BM5=64, BM6=128.*

The SQLite per-test size is dominated by the fixed B-tree page overhead
(minimum 276 KB regardless of design size). NCDB scales with actual data: a
design with 5 bins uses only 1.3 KB.

With a C accelerator for varint encode/decode, BM5 merge time is projected to
drop to ~5 ms (~7.5× faster than SQLite).

-----------

*****************************
14. Implementing a reader
*****************************

To read an NCDB file without PyUCIS:

.. code-block:: python

   import json
   import struct
   import zipfile


   def read_varint(data, offset):
       """Decode one unsigned LEB128 varint; return (value, next_offset)."""
       result, shift = 0, 0
       while True:
           b = data[offset]
           offset += 1
           result |= (b & 0x7F) << shift
           shift += 7
           if not (b & 0x80):
               return result, offset


   def read_ncdb(path):
       # Pull the required members out of the ZIP archive.
       with zipfile.ZipFile(path) as zf:
           manifest = json.loads(zf.read("manifest.json"))
           assert manifest["format"] == "NCDB"
           strings_raw = zf.read("strings.bin")
           counts_raw = zf.read("counts.bin")
           history = json.loads(zf.read("history.json"))
           sources = json.loads(zf.read("sources.json"))

       # Decode string table
       offset = 0
       n_strings, offset = read_varint(strings_raw, offset)
       strings = []
       for _ in range(n_strings):
           length, offset = read_varint(strings_raw, offset)
           strings.append(strings_raw[offset:offset + length].decode("utf-8"))
           offset += length

       # Decode counts
       mode = counts_raw[0]
       offset = 1
       n_counts, offset = read_varint(counts_raw, offset)
       counts = []
       if mode == 1:  # VARINT
           # Fast path: all single-byte values
           payload = counts_raw[offset:offset + n_counts]
           if len(payload) == n_counts and all(b < 0x80 for b in payload):
               counts = list(payload)
           else:
               for _ in range(n_counts):
                   v, offset = read_varint(counts_raw, offset)
                   counts.append(v)
       else:  # UINT32
           counts = list(struct.unpack_from(f"<{n_counts}I", counts_raw, offset))

       return {
           "manifest": manifest,
           "strings": strings,
           "counts": counts,
           "history": history,
           "sources": sources,
       }

.. seealso::

   * :doc:`sqlite-schema` — SQLite backend schema reference
   * :doc:`xml-interchange` — XML interchange format
   * :ref:`working-with-coverage-merging` — How to merge databases using the CLI

-----------

.. _ncdb-format-v2-history:

*************************
7. V2 binary test history
*************************

When ``manifest.json`` contains ``"history_format": "v2"`` the archive holds
six additional binary members. All integers are **little-endian** unless
noted.

7.1 ``history/test_registry.bin``
==================================

Maps stable integer IDs to test names and seed strings.
IDs are assigned by insertion order and never reassigned.

.. code-block:: none

   Header (17 bytes):
     magic        u32   0x54524547 ('TREG')
     version      u8    1
     next_run_id  u32   monotonically-increasing run counter
     num_names    u32
     num_seeds    u32

   Offset tables (immediately after header):
     name_offsets  u32[num_names]   byte offset into name heap
     seed_offsets  u32[num_seeds]   byte offset into seed heap

   Heaps (NUL-terminated UTF-8 strings):
     name_heap   NUL-terminated strings in name_id order
     seed_heap   NUL-terminated strings in seed_id order

7.2 ``history/test_stats.bin``
================================

One 72-byte entry per test name (indexed by name_id).

.. code-block:: none

   Header (9 bytes):
     magic        u32   0x54535453 ('TSTS')
     version      u8    1
     num_entries  u32

   Entry (72 bytes, repeated num_entries times):
     name_id          u32
     total_runs       u32
     pass_count       u32
     fail_count       u32
     error_count      u32
     skip_count       u32
     timeout_count    u32
     _reserved        u32     (padding, always 0)
     mean_ms          f32     Welford running mean of runtime in milliseconds
     m2_ms            f32     Welford running sum-of-squares (variance = m2/n)
     cusum_pos        f32     CUSUM positive accumulator for change detection
     cusum_neg        f32     CUSUM negative accumulator
     _pad1            f32     (reserved, 0.0)
     _pad2            f32     (reserved, 0.0)
     _pad3            f32     (reserved, 0.0)
     flakiness_score  i16     fixed-point 0–10000 representing 0.00–100.00 %
     tag              u8[6]   short ASCII label (NUL-padded)
     last_status      u8      most-recent HIST_STATUS_* value
     _trailing        u8      padding

7.3 ``history/bucket_index.bin``
==================================

Index over the per-bucket run-record files.

.. code-block:: none

   Header (9 bytes):
     magic        u32   0x42494458 ('BIDX')
     version      u8    1
     num_buckets  u32

   Entry (28 bytes, sorted by bucket_seq):
     bucket_seq   u32
     ts_start     u32   Unix timestamp of first record in bucket
     ts_end       u32   Unix timestamp of last record in bucket
     num_records  u32
     fail_count   u32
     min_name_id  u32
     max_name_id  u32

7.4 ``history/NNNNNN.bin``
============================

Each bucket holds up to 10 000 run records, compressed with LZMA (sealed
buckets) or DEFLATE level 1 (current open bucket). After decompression:

.. code-block:: none

   Header (16 bytes):
     magic        u32   0x42434B54 ('BCKT')
     version      u8    1
     num_records  u32
     num_names    u16
     _pad         u8    (padding)
     ts_base      u32   Unix timestamp of first record

   Name index (12 bytes per unique name in this bucket):
     name_id  u32     global name_id from test_registry
     offset   u32     byte offset into name's record data
     count    u16     number of records for this name
     _pad     u8[2]

   Columnar record data (one column per name, name_id order):
     seeds[]         u8[count]       local seed index (≤ 255 unique/bucket)
     ts_deltas[]     varint[count]   delta-encoded seconds from ts_base
     status_flags[]  u8[count]       nibble-packed (high=status, low=flags)

   Seed dictionary (appended after all record data):
     num_local_seeds  u8
     seed_ids[]       u32[num_local_seeds]   global seed_ids

Varint encoding: each value uses 1–5 bytes; the high bit of each byte
indicates that more bytes follow (7 bits of value per byte, little-endian).

7.5 ``history/contrib_index.bin``
====================================

Tracks which test runs contributed coverage so that squash can be replayed.

.. code-block:: none

   Header (12 bytes):
     magic       u32   0x43494458 ('CIDX')
     version     u8    1
     policy      u8    merge-policy constant
     watermark   u32   highest squashed run_id
     num_active  u32

   Entry (16 bytes, one per unsquashed run):
     run_id   u32
     name_id  u32
     status   u8
     flags    u8
     _pad     u8[2]
     ts       u32

7.6 ``history/squash_log.bin``
================================

Append-only provenance log for squash events.

.. code-block:: none

   Header (9 bytes):
     magic        u32   0x53514C47 ('SQLG')
     version      u8    1
     num_entries  u32

   Entry (24 bytes):
     ts         u32     Unix timestamp of squash operation
     policy     u8      merge-policy used
     _pad       u8[3]
     from_run   u32     first run_id squashed
     to_run     u32     last run_id squashed (inclusive)
     num_runs   u32     total runs processed
     pass_runs  u32     runs that passed

----

**********************************
8. Testplan and Waivers JSON
**********************************

``testplan.json`` and ``waivers.json`` are optional UTF-8 JSON members stored
at the ZIP root. They are written by
:class:`~ucis.ncdb.ncdb_writer.NcdbWriter` when the corresponding objects are
attached to the database and are read transparently by
:class:`~ucis.ncdb.ncdb_reader.NcdbReader`.

8.1 ``testplan.json``
======================

.. code-block:: json

   {
     "format_version": 1,
     "source_file": "uart.hjson",
     "import_timestamp": "2025-01-01T00:00:00+00:00",
     "testpoints": [
       {
         "name": "uart_reset",
         "stage": "V1",
         "desc": "Verify reset",
         "tests": ["uart_smoke", "uart_reset_*"],
         "tags": ["smoke"],
         "na": false,
         "source_template": "",
         "requirements": [
           {"id": "REQ-001", "desc": "Reset spec"}
         ]
       }
     ],
     "covergroups": [
       {"name": "cg_reset", "desc": "Reset coverage"}
     ]
   }

.. list-table:: testplan.json — top-level fields
   :header-rows: 1
   :widths: 25 15 60

   * - Field
     - Type
     - Description
   * - ``format_version``
     - int
     - Schema version; currently ``1``
   * - ``source_file``
     - string
     - Path to the Hjson/JSON source that produced this plan
   * - ``import_timestamp``
     - ISO-8601 string
     - UTC timestamp when the plan was last imported
   * - ``testpoints``
     - array
     - Ordered list of :class:`~ucis.ncdb.testplan.Testpoint` objects
   * - ``covergroups``
     - array
     - Ordered list of :class:`~ucis.ncdb.testplan.CovergroupEntry` objects

Merger behaviour
   When merging two ``.cdb`` files that both contain ``testplan.json``:

   * **Same ``source_file``** — the entry with the later ``import_timestamp`` is kept.
   * **Different ``source_file``** — a warning is emitted and the merged output contains no testplan.

8.2 ``waivers.json``
======================

.. code-block:: json

   {
     "format_version": 1,
     "waivers": [
       {
         "id": "W-001",
         "scope_pattern": "top/uart/**",
         "bin_pattern": "reset_*",
         "rationale": "Deferred to V2",
         "approver": "jdoe",
         "approved_at": "2025-01-01T00:00:00",
         "expires_at": "2026-01-01T00:00:00",
         "status": "active"
       }
     ]
   }

.. list-table:: waivers.json — Waiver fields
   :header-rows: 1
   :widths: 25 15 60

   * - Field
     - Type
     - Description
   * - ``id``
     - string
     - Unique waiver identifier
   * - ``scope_pattern``
     - glob string
     - Hierarchy path pattern; ``*`` = single segment, ``**`` = any depth
   * - ``bin_pattern``
     - glob string
     - Coverage bin name pattern; same glob syntax as scope_pattern
   * - ``rationale``
     - string
     - Human-readable reason for the waiver
   * - ``approver``
     - string
     - Name or email of the approver
   * - ``approved_at``
     - ISO-8601 string
     - Approval timestamp
   * - ``expires_at``
     - ISO-8601 string
     - Expiry timestamp; empty string means no expiry
   * - ``status``
     - ``"active"`` | ``"expired"``
     - Current status; :meth:`~ucis.ncdb.waivers.WaiverSet.active_at`
       filters on both this field and ``expires_at``

Merger behaviour
   Waivers are unioned by ``id`` across all source files. When the same
   ``id`` appears in multiple sources the entry with the latest
   ``approved_at`` is kept.
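
The waiver merge rule can be illustrated with a few lines of Python operating
on parsed ``waivers.json`` documents. This is a sketch under stated
assumptions, not the PyUCIS merger: ``merge_waivers`` is a hypothetical
function name, the ``approved_at`` timestamps are assumed to share one
ISO-8601 layout and time zone (so that plain string comparison orders them
correctly), and ties are resolved in favour of the first source seen.

.. code-block:: python

   def merge_waivers(*waiver_docs: dict) -> dict:
       """Hypothetical helper: union waivers by id, keeping the latest approval."""
       merged = {}
       for doc in waiver_docs:
           for waiver in doc.get("waivers", []):
               current = merged.get(waiver["id"])
               # Assumption: same-format ISO-8601 strings sort chronologically.
               if current is None or waiver["approved_at"] > current["approved_at"]:
                   merged[waiver["id"]] = waiver
       return {"format_version": 1, "waivers": list(merged.values())}


   first = {"format_version": 1, "waivers": [
       {"id": "W-001", "approved_at": "2025-01-01T00:00:00", "status": "active"},
   ]}
   second = {"format_version": 1, "waivers": [
       {"id": "W-001", "approved_at": "2025-06-01T00:00:00", "status": "expired"},
       {"id": "W-002", "approved_at": "2024-01-01T00:00:00", "status": "active"},
   ]}
   result = merge_waivers(first, second)

Here ``result`` holds one entry per waiver ``id``; for ``W-001`` the entry from
``second`` is kept because its ``approved_at`` is later. If sources may mix
time zones or timestamp layouts, the strings would need to be parsed before
comparison.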