NCDB Coverage File Format

NCDB (Native Coverage DataBase) is a compact, ZIP-based binary format for storing and merging UCIS coverage data. A single .cdb file is a standard ZIP archive whose members encode the scope hierarchy, hit counts, test history, and source file references.

The format is designed to be:

  • Space-efficient — typically 100–200× smaller than the equivalent SQLite .cdb (see 13. Size and performance reference).

  • Merge-fast — same-schema merges reduce to element-wise integer addition over a flat array, with no SQL overhead.

  • Self-describing — a manifest.json at the root of the archive carries all metadata needed to read or merge the file without any external schema.

  • Readable without PyUCIS — every binary encoding is documented here in sufficient detail to write an independent parser.


1. File identification

Both NCDB and the legacy SQLite backend use the .cdb extension. Format discrimination is done by inspecting the first 16 bytes of the file.

Format                 Header (hex)                                      Description
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
SQLite                 53 51 4C 69 74 65 20 66 6F 72 6D 61 74 20 33 00   Literal ASCII SQLite format 3\x00
NCDB (non-empty)       50 4B 03 04                                       ZIP local-file header signature PK\x03\x04
NCDB (empty archive)   50 4B 05 06                                       ZIP end-of-central-directory signature PK\x05\x06

Detection algorithm:

  1. Read the first 16 bytes of the file.

  2. If bytes[0:16] equals the SQLite magic string → format is sqlite.

  3. If bytes[0:4] is PK\x03\x04 or PK\x05\x06:

    1. Open as ZIP.

    2. Read manifest.json.

    3. If manifest["format"] == "NCDB" → format is ncdb.

  4. Otherwise → format is unknown.
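The detection steps above can be sketched in Python as follows. This is a minimal illustration, not the PyUCIS implementation: the function name detect_format and the choice to return "unknown" for a ZIP without a readable manifest are assumptions.

```python
import json
import zipfile

SQLITE_MAGIC = b"SQLite format 3\x00"          # 16 bytes
ZIP_MAGICS = (b"PK\x03\x04", b"PK\x05\x06")    # non-empty / empty archive

def detect_format(path):
    """Return "sqlite", "ncdb", or "unknown" for the file at path."""
    with open(path, "rb") as fh:
        header = fh.read(16)
    if header == SQLITE_MAGIC:
        return "sqlite"
    if header[:4] in ZIP_MAGICS:
        try:
            with zipfile.ZipFile(path) as zf:
                manifest = json.loads(zf.read("manifest.json"))
        except (zipfile.BadZipFile, KeyError, ValueError):
            # Not a usable ZIP, no manifest.json, or manifest is not valid JSON.
            return "unknown"
        if isinstance(manifest, dict) and manifest.get("format") == "NCDB":
            return "ncdb"
    return "unknown"
```

An empty archive has no manifest.json, so this sketch reports it as "unknown"; a caller that wants to treat empty archives as NCDB would need its own rule.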


2. Archive structure

An NCDB file is a standard ZIP archive (DEFLATE compression) whose members are named as follows. Members marked required must be present in every valid NCDB file; others are only written when the corresponding data is non-empty or non-default.

Member name              Required   Contents
────────────────────────────────────────────────────────────────────────────────────────────────────
manifest.json            yes        Format identity, version, statistics, and the schema hash.
strings.bin              yes        Deduplicated string table referenced by index throughout other members.
scope_tree.bin           yes        DFS-serialized scope hierarchy (V2 encoding). Counts are not stored here.
counts.bin               yes        Flat array of hit counts in the same DFS order as scope_tree.bin.
history.json             yes        Array of test-run and merge history records.
sources.json             yes        Ordered list of source file paths; indices match file IDs in scope_tree.bin.
attrs.bin                no         User-defined attribute assignments (V2 JSON: scopes, coveritems, history nodes, and global attributes).
tags.json                no         Tag assignments for scopes (sparse, DFS-indexed).
toggle.bin               no         Per-signal toggle metadata (JSON: canonical name, metric, type, direction).
fsm.bin                  no         FSM state-index overrides (JSON, sparse; only written when state indices differ from the default 0, 1, 2, … sequence).
cross.bin                no         Cross-coverpoint link records (JSON: crossed coverpoint sibling names).
properties.json          no         Typed string property values (DFS scope-indexed).
design_units.json        no         Design-unit name-to-DFS-index lookup table (name, index, scope type).
formal.bin               no         Formal-verification assertion data (JSON: status, radius, witness).
coveritem_flags.bin      no         Per-coveritem non-default flags (sparse delta-encoded binary).
contrib/<hist_idx>.bin   no         Per-test coveritem contribution arrays (delta-encoded, sparse). One file per history node that has contributions; <hist_idx> is the integer history-node index (not zero-padded).


3. Primitive encodings

3.1 Unsigned LEB128 varint

All variable-length integers in NCDB are encoded as unsigned LEB128 (also called unsigned varint or ULEB128). This is the same encoding used by DWARF, WebAssembly, and Protocol Buffers (field type uint64).

Encoding:

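  1. Take the 7 least-significant bits of the value as the low bits of the next byte, then shift the value right by 7.

  2. If the remaining value is non-zero, set bit 7 of the byte to 1 and repeat from step 1; otherwise leave bit 7 clear and stop. The value 0 is therefore encoded as the single byte 00.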

value     bytes (hex)
────────────────────
0         00
1         01
127       7F
128       80 01
255       FF 01
16383     FF 7F
16384     80 80 01
2³²−1     FF FF FF FF 0F
2⁶⁴−1     FF FF FF FF FF FF FF FF FF 01

Decoding:

Read bytes one at a time. For each byte, take the low 7 bits and OR them into the accumulator at the current bit position (starting at 0). Advance the bit position by 7. If bit 7 of the byte is set, continue reading; otherwise stop.

def decode_varint(buf: bytes, offset: int = 0):
    result, shift = 0, 0
    while True:
        byte = buf[offset]; offset += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not (byte & 0x80):
            return result, offset
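For symmetry with the decoder, a minimal encoder sketch is shown here. The name encode_varint is an assumption chosen for illustration; the logic follows the encoding steps in this section and reproduces the byte sequences in the table above.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("unsigned LEB128 cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F      # low 7 bits
        value >>= 7
        if value:
            out.append(byte | 0x80)   # more bytes follow
        else:
            out.append(byte)          # last byte: bit 7 clear
            return bytes(out)
```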

3.2 UTF-8 strings

All text is UTF-8. Strings stored inline (e.g. in JSON members) are standard JSON strings. Names in scope_tree.bin are not stored inline; each one is a varint index into the string table held in strings.bin.


4. manifest.json

A JSON object with the following fields (all present; unknown fields must be ignored by readers for forward compatibility):

{
  "format":          "NCDB",
  "version":         "1.0",
  "ucis_version":    "1.0",
  "created":         "2026-02-25T21:00:00Z",
  "path_separator":  "/",
  "scope_count":     42,
  "coveritem_count": 8800,
  "test_count":      64,
  "total_hits":      155432,
  "covered_bins":    7312,
  "schema_hash":     "sha256:a3f1...",
  "generator":       "pyucis-ncdb"
}

Field             Description
────────────────────────────────────────────────────────────────────────────────────────────────────
format            Always the string "NCDB". Readers must reject files where this is not "NCDB".
version           Format version string. Currently "1.0". Readers should check the major component; a mismatch should produce a clear error.
ucis_version      UCIS standard version the data conforms to. Currently "1.0".
created           ISO 8601 UTC timestamp when the file was written.
path_separator    Hierarchical path separator used in scope names. Typically "/".
scope_count       Total number of scopes in scope_tree.bin (informational).
coveritem_count   Total number of coveritems. Must equal the length of the array in counts.bin.
test_count        Number of TEST-kind entries in history.json.
total_hits        Sum of all values in counts.bin.
covered_bins      Number of non-zero values in counts.bin.
schema_hash       "sha256:" followed by the lowercase hex SHA-256 digest of the uncompressed scope_tree.bin content. Used by the fast-merge path to verify schema identity without parsing the scope tree. (See 10. Merging NCDB files.)
generator         Free-form tool identification string.
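As an illustration of the schema_hash definition, the following sketch recomputes the hash from the archive and compares it with the manifest. The function names are hypothetical; only hashlib, json, and zipfile from the standard library are used.

```python
import hashlib
import json
import zipfile

def compute_schema_hash(scope_tree_bytes: bytes) -> str:
    """Hash the uncompressed scope_tree.bin bytes as described for schema_hash."""
    return "sha256:" + hashlib.sha256(scope_tree_bytes).hexdigest()

def schema_hash_is_consistent(path) -> bool:
    """Check that manifest.json's schema_hash matches scope_tree.bin."""
    with zipfile.ZipFile(path) as zf:
        manifest = json.loads(zf.read("manifest.json"))
        # ZipFile.read returns the decompressed member content.
        scope_tree = zf.read("scope_tree.bin")
    return manifest.get("schema_hash") == compute_schema_hash(scope_tree)
```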


5. strings.bin

A deduplicated string table. Every string used anywhere in scope_tree.bin (scope names, coveritem names) is stored exactly once here and referenced by a zero-based integer index.

Binary layout:

[count   : varint]          — number of strings
[len_0   : varint]          — byte length of string 0 (UTF-8 encoded)
[bytes_0 : len_0 bytes]     — UTF-8 bytes of string 0
[len_1   : varint]
[bytes_1 : len_1 bytes]
...
  • Index 0 is always the empty string "".

  • String indices are stable: the same string always maps to the same index within a single file (indices are assigned in first-encounter DFS order).
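A writer-side sketch of this layout is shown below. The helper names are illustrative assumptions; the sketch expects the caller to supply the strings already in index order, with the empty string at index 0.

```python
def _uleb128(value: int) -> bytes:
    """Unsigned LEB128 encoding (see 3.1)."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def encode_string_table(strings):
    """Serialize a list of str into the strings.bin layout."""
    if not strings or strings[0] != "":
        raise ValueError("index 0 must be the empty string")
    out = bytearray(_uleb128(len(strings)))
    for text in strings:
        raw = text.encode("utf-8")
        out += _uleb128(len(raw))   # byte length, not character count
        out += raw
    return bytes(out)
```

For example, encode_string_table(["", "top"]) yields the five bytes 02 00 03 74 6F 70 minus none: a count of 2, a zero-length entry, then length 3 followed by the UTF-8 bytes of "top".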


6. scope_tree.bin

The complete scope hierarchy encoded as a depth-first traversal. The file contains a flat sequence of scope records with no explicit end marker; the count of child scopes embedded in each record defines the nesting.

Counts (hit values) are not stored in this member. Instead, each coveritem encountered during DFS appends its hit count to counts.bin in the same traversal order. A reader reconstructs the association by walking scope_tree.bin and consuming counts from counts.bin in lockstep.

6.1 Scope record types

Every scope record begins with a one-byte marker:

Marker byte   Name          Description
────────────────────────────────────────────────────────────────────────────────────────────────────
0x00          REGULAR       Full scope record with type, name, presence bitfield, and children.
0x01          TOGGLE_PAIR   Compact 2-field record for BRANCH scopes that carry exactly two TOGGLEBIN coveritems with the implicit names "0 -> 1" and "1 -> 0". Saves ~10 bytes per signal.

6.2 REGULAR scope record

[marker    : 1 byte  ]  always 0x00
[scope_type: varint  ]  ScopeTypeT integer value
[name_ref  : varint  ]  index into strings.bin
[presence  : varint  ]  bitfield of optional fields present (see below)

— optional fields, each present only if the corresponding bit is set —
[flags       : varint  ]  only if PRESENCE_FLAGS       (bit 0) set
[file_id     : varint  ]  only if PRESENCE_SOURCE      (bit 1) set
[line        : varint  ]     "
[token       : varint  ]     "
[weight      : varint  ]  only if PRESENCE_WEIGHT      (bit 2) set
[at_least    : varint  ]  only if PRESENCE_AT_LEAST    (bit 3) set
[goal        : varint  ]  only if PRESENCE_GOAL        (bit 5) set
[source_type : varint  ]  only if PRESENCE_SOURCE_TYPE (bit 6) set

— always present —
[num_children : varint]  number of child scope records that follow
[num_covers   : varint]  number of coveritem records that follow

— present only when num_covers > 0 —
[cover_type   : varint]  CoverTypeT of all coveritems in this scope

— num_covers coveritem records —
[name_ref_ci  : varint]  × num_covers   (one per coveritem)

— num_children child scope records (recursive) —

Presence bitfield values:

Bit   Name                   Meaning
────────────────────────────────────────────────────────────────────────────────────────────────────
0     PRESENCE_FLAGS         Non-default scope flags are stored.
1     PRESENCE_SOURCE        Source location (file_id, line, token) is stored.
2     PRESENCE_WEIGHT        Non-default scope weight (≠ 1) is stored.
3     PRESENCE_AT_LEAST      An at_least threshold that overrides the cover-type default is stored at the scope level (applies to all coveritems in the scope).
4     PRESENCE_CVG_OPTS      Reserved for covergroup options (not yet used by the writer).
5     PRESENCE_GOAL          Non-default scope goal (≠ −1) is stored.
6     PRESENCE_SOURCE_TYPE   Explicit SourceT enum value is stored. When absent, the source type defaults to SourceT.NONE.

Cover-type defaults (used when PRESENCE_AT_LEAST is absent):

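CoverTypeT                                      flags default   at_least default   weight default
────────────────────────────────────────────────────────────────────────────────────────────────────
CVGBIN                                          0x19            1                  1
All others (TOGGLEBIN, STMTBIN, BRANCHBIN, …)   0x01            0                  1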

6.3 TOGGLE_PAIR record

[marker   : 1 byte ]  always 0x01
[name_ref : varint ]  scope name index in strings.bin

A TOGGLE_PAIR record implicitly encodes:

  • Scope type: BRANCH

  • Two TOGGLEBIN coveritems with names "0 -> 1" and "1 -> 0" (in that order).

  • Two consecutive entries are consumed from counts.bin: first the "0 -> 1" count, then the "1 -> 0" count.

No child scope records follow a TOGGLE_PAIR.
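The two record types can be decoded with a short recursive walker. The sketch below is an independent illustration of the layouts in 6.2 and 6.3, not the PyUCIS reader: the function names and the dict-based result are assumptions, and because bit 4 (PRESENCE_CVG_OPTS) has no documented payload, nothing is read for it. Counts are consumed in lockstep from an already-decoded counts.bin list.

```python
# (presence bit, fields read when that bit is set), in on-disk order.
OPTIONAL_FIELDS = [
    (1 << 0, ("flags",)),
    (1 << 1, ("file_id", "line", "token")),
    (1 << 2, ("weight",)),
    (1 << 3, ("at_least",)),
    (1 << 5, ("goal",)),
    (1 << 6, ("source_type",)),
]

def read_varint(buf, offset):
    result, shift = 0, 0
    while True:
        byte = buf[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not (byte & 0x80):
            return result, offset

def read_scope(buf, offset, strings, counts):
    """Decode one scope record and its subtree; counts is an iterator."""
    marker = buf[offset]
    offset += 1
    if marker == 0x01:  # TOGGLE_PAIR: name only, two implicit coveritems
        name_ref, offset = read_varint(buf, offset)
        covers = [("0 -> 1", next(counts)), ("1 -> 0", next(counts))]
        return {"marker": "TOGGLE_PAIR", "name": strings[name_ref],
                "covers": covers, "children": []}, offset
    if marker != 0x00:
        raise ValueError(f"unknown scope marker 0x{marker:02x}")

    scope = {"marker": "REGULAR", "covers": [], "children": []}
    scope["scope_type"], offset = read_varint(buf, offset)
    name_ref, offset = read_varint(buf, offset)
    scope["name"] = strings[name_ref]
    presence, offset = read_varint(buf, offset)
    for bit, names in OPTIONAL_FIELDS:
        if presence & bit:
            for field in names:
                scope[field], offset = read_varint(buf, offset)
    num_children, offset = read_varint(buf, offset)
    num_covers, offset = read_varint(buf, offset)
    if num_covers:
        scope["cover_type"], offset = read_varint(buf, offset)
        for _ in range(num_covers):
            ci_ref, offset = read_varint(buf, offset)
            # Coveritems precede child scopes, so their counts come first.
            scope["covers"].append((strings[ci_ref], next(counts)))
    for _ in range(num_children):
        child, offset = read_scope(buf, offset, strings, counts)
        scope["children"].append(child)
    return scope, offset

def read_scope_tree(buf, strings, counts):
    """Decode every top-level record until the buffer is exhausted."""
    counts_iter = iter(counts)
    roots, offset = [], 0
    while offset < len(buf):
        scope, offset = read_scope(buf, offset, strings, counts_iter)
        roots.append(scope)
    return roots
```

Passing the counts as a single shared iterator is what keeps scope_tree.bin and counts.bin aligned: every coveritem, explicit or implied by a TOGGLE_PAIR, pulls exactly one value.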

6.4 Scope-type integer values

The scope_type varint uses the integer values of ScopeTypeT. The most common values are:

Value   ScopeTypeT name   Typical context
────────────────────────────────────────────────────────────────────
2       DU_MODULE         Design-unit scope for a Verilog module
16      INSTANCE          Instantiation of a design unit
22      COVERGROUP        SystemVerilog covergroup type or instance
23      COVERPOINT        SystemVerilog coverpoint
28      CROSS             SystemVerilog cross
30      BRANCH            Code-coverage branch (toggle pair or regular)
32      TOGGLE            Toggle scope (parent of BRANCH scopes)
33      FSM               Finite state machine
36      BLOCK             Statement block

The full set of values is defined in ucis/scope_type_t.py.


7. counts.bin

A flat array of non-negative integers, one per coveritem, in the same DFS order as the coveritems encountered while reading scope_tree.bin. TOGGLE_PAIR scopes contribute two consecutive counts ("0 -> 1" then "1 -> 0").

The array length is given by coveritem_count in manifest.json.

7.1 Binary layout

[mode  : 1 byte ]  0 = UINT32, 1 = VARINT
[count : varint ]  number of integers that follow
[data  : …      ]  mode-dependent encoding (see below)

Mode 0 — UINT32: Each integer is a 4-byte little-endian unsigned 32-bit value. Used when most counts are large (i.e. varint encoding would not save space).

[v_0 : 4 bytes LE] [v_1 : 4 bytes LE] … [v_{n-1} : 4 bytes LE]

Mode 1 — VARINT: Each integer is encoded as an unsigned LEB128 varint (see 3.1 Unsigned LEB128 varint). Used when most counts are small (0–127), which is the common case for per-test databases.

[varint_0] [varint_1] … [varint_{n-1}]

Mode selection: The writer computes both encodings and selects VARINT when len(varint_encoding) < count × 4 (i.e. when it is strictly smaller), falling back to UINT32 otherwise. A reader must support both modes.
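The mode-selection rule can be sketched as a small encoder. This is an illustration under stated assumptions: encode_counts is a hypothetical name, and the sketch does not handle the corner case where the UINT32 fallback is chosen but a value exceeds 2³²−1 (struct.pack raises struct.error in that case).

```python
import struct

def _uleb128(value: int) -> bytes:
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def encode_counts(counts):
    """Serialize hit counts as counts.bin, picking the smaller encoding."""
    varint_payload = b"".join(_uleb128(c) for c in counts)
    header_count = _uleb128(len(counts))
    if len(varint_payload) < 4 * len(counts):
        return bytes([1]) + header_count + varint_payload       # mode 1: VARINT
    uint32_payload = struct.pack(f"<{len(counts)}I", *counts)    # little-endian
    return bytes([0]) + header_count + uint32_payload            # mode 0: UINT32
```

Because the comparison is strict, an empty array (0 bytes either way) is written in UINT32 mode by this sketch.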

7.2 Efficient single-byte fast path

When mode is VARINT and all values fit in a single byte (0–127), each byte in the data section is equal to the corresponding count value (the high bit is never set). A parser can exploit this: scan the data section for any byte ≥ 0x80; if none are found, each byte is its value, and the entire section can be decoded with a single bytes list conversion.


8. history.json

A JSON array of history node records. Each element represents either a test run (kind: "TEST") or a merge operation (kind: "MERGE").

Record schema:

[
  {
    "logical_name":  "regression_seed_42",
    "physical_name": null,
    "kind":          "TEST",
    "test_status":   0,
    "tool_category": "sim",
    "date":          "2026-02-25",
    "sim_time":      1500.0,
    "time_unit":     "ns",
    "run_cwd":       "/home/user/sim",
    "cpu_time":      12.3,
    "seed":          "42",
    "cmd":           "vsim -seed 42 top",
    "args":          "",
    "compulsory":    null,
    "user_name":     "jsmith",
    "cost":          0.0,
    "ucis_version":  null,
    "vendor_id":     null,
    "vendor_tool":   null,
    "vendor_tool_version": null,
    "same_tests":    null,
    "comment":       null
  }
]

Field                 Type               Description
────────────────────────────────────────────────────────────────────────────────────────────────────
logical_name          string             Unique name for this history node (test name or merge label).
physical_name         string | null      Physical file name associated with the history node, or null.
kind                  "TEST" | "MERGE"   History node kind.
test_status           integer            Test status code: 0 = OK, 1 = WARNING, 2 = ERROR, 3 = FATAL, 4 = NOTRUN.
tool_category         string             Free-form tool category (e.g. "sim", "formal").
date                  string             Date string (ISO 8601 recommended).
sim_time              number             Simulation end time in time_unit units.
time_unit             string             Simulation time unit (e.g. "ns", "ps").
run_cwd               string             Working directory of the simulation run.
cpu_time              number             CPU seconds consumed.
seed                  string             Random seed used.
cmd                   string             Simulator command line.
args                  string             Additional arguments.
compulsory            any | null         Compulsory flag (tool-defined), or null if unset.
user_name             string             Username that ran the simulation.
cost                  number             Simulation cost (tool-defined).
ucis_version          string | null      UCIS version associated with this history node, or null.
vendor_id             string | null      Vendor identifier, or null.
vendor_tool           string | null      Vendor tool name, or null.
vendor_tool_version   string | null      Vendor tool version, or null.
same_tests            integer | null     Number of identical tests merged, or null.
comment               string | null      Free-form comment, or null.


9. sources.json

A JSON array of strings, where each element is an absolute or relative file path. The position of each path in the array is its file ID, which is the integer used as file_id in scope_tree.bin source references.

[
  "/home/user/design/top.sv",
  "/home/user/design/alu.sv",
  "/home/user/tb/coverage_pkg.sv"
]

File ID 0 corresponds to the first element. An empty sources.json ([]) is valid when no source information was recorded.


10. Merging NCDB files

The key performance advantage of NCDB over SQLite is the same-schema fast merge path, which reduces a multi-file merge to element-wise integer addition.

10.1 Same-schema fast merge

Two NCDB files are schema-compatible if and only if their schema_hash values are equal. The schema_hash is "sha256:" followed by the SHA-256 digest of the uncompressed scope_tree.bin bytes; equal hashes guarantee an identical scope hierarchy and coveritem ordering.

Algorithm for merging N same-schema files into one output file:

  1. Read manifest.json from all N sources. Verify schema_hash is identical for all; if not, fall back to the cross-schema path.

  2. Read counts.bin from all N sources → N lists of integers.

  3. Compute the merged count array: element-wise sum of all N lists. (In Python: list(map(sum, zip(*all_counts))))

  4. Concatenate all history.json arrays from all sources. Append a new MERGE history node that references all source names.

  5. Copy strings.bin, scope_tree.bin, and sources.json verbatim from the first source (they are identical for same-schema files).

  6. Write the output ZIP with the merged manifest, the copied schema members, the merged counts.bin, and the combined history.json.

The scope tree and string table never need to be decoded for a same-schema merge.
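The counting and bookkeeping part of the algorithm (steps 1 to 4) can be sketched on already-decoded data. The function name, the dict-based inputs, and the choice to recompute the manifest statistics are assumptions made for illustration; copying the schema members and writing the ZIP (steps 5 and 6) are left out.

```python
from datetime import datetime, timezone

def merge_same_schema(sources, output_name):
    """sources: list of dicts with "manifest", "counts", and "history" keys."""
    if not sources:
        raise ValueError("need at least one source")
    hashes = {src["manifest"]["schema_hash"] for src in sources}
    if len(hashes) != 1:
        raise ValueError("schema_hash mismatch: use the cross-schema path")
    if len({len(src["counts"]) for src in sources}) != 1:
        raise ValueError("counts arrays differ in length")

    # Element-wise sum over the N count arrays.
    merged_counts = [sum(col) for col in zip(*(src["counts"] for src in sources))]

    # Concatenate histories, then append the MERGE node (see 10.3).
    history = [rec for src in sources for rec in src["history"]]
    history.append({
        "logical_name": f"merge:{output_name}",
        "physical_name": None,
        "kind": "MERGE",
        "test_status": 0,
        "tool_category": "merge",
        "date": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    })

    manifest = dict(sources[0]["manifest"])
    manifest["coveritem_count"] = len(merged_counts)
    manifest["total_hits"] = sum(merged_counts)
    manifest["covered_bins"] = sum(1 for c in merged_counts if c)
    manifest["test_count"] = sum(1 for r in history if r.get("kind") == "TEST")
    return {"manifest": manifest, "counts": merged_counts, "history": history}
```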

10.2 Cross-schema merge

When the schema hashes differ, the merger must parse both scope trees, match scopes by (path, type, name) key, and add counts for matched coveritems. Unmatched coveritems from either source are appended with their original counts. This path is slower but correct for merging databases from designs that have evolved between runs.
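Once both scope trees have been parsed, the matching step reduces to a keyed addition. The sketch below assumes each database has already been flattened into a mapping from a hashable key to a hit count; the exact key shape used here, a (scope path, scope type, coveritem name) tuple, is an illustrative assumption rather than the merger's actual data structure.

```python
def merge_cross_schema(counts_a, counts_b):
    """Add counts for matching keys; keep unmatched entries from both sides."""
    merged = dict(counts_a)                  # preserves the order of source A
    for key, count in counts_b.items():
        # Matched keys are summed; keys only in B are appended after A's keys.
        merged[key] = merged.get(key, 0) + count
    return merged
```

Rebuilding scope_tree.bin, strings.bin, and counts.bin from the merged mapping is the expensive part of this path and is not shown.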

10.3 Merge history node

A merge operation appends a "MERGE"-kind history node to history.json:

{
  "logical_name": "merge:output.cdb",
  "physical_name": null,
  "kind":   "MERGE",
  "test_status": 0,
  "tool_category": "merge",
  "date":   "2026-02-25T21:00:00Z"
}

11. Optional binary members

These members are omitted from the archive when the corresponding data is absent or all-default. Readers must silently skip any optional member they do not support, and must not fail if an expected optional member is absent.
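A reader can honour this rule with a small helper that treats a missing member as "no data". The helper name and the default argument are assumptions for illustration; it applies to the JSON-encoded optional members described in the following subsections.

```python
import json
import zipfile

def read_optional_json(zf: zipfile.ZipFile, name: str, default=None):
    """Return the parsed JSON member, or default when the member is absent."""
    try:
        raw = zf.read(name)          # raises KeyError if the member is missing
    except KeyError:
        return default
    return json.loads(raw)
```

Typical use would be tags = read_optional_json(zf, "tags.json", default={"entries": []}), so downstream code never has to special-case an absent member.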

11.1 attrs.bin

User-defined attribute assignments. Despite the .bin extension, this member is JSON-encoded.

Format v2 (current):

{
  "version": 2,
  "scopes": [
    {"idx": 0, "attrs": {"key": "value"}}
  ],
  "coveritems": [
    {"scope_idx": 0, "ci_idx": 1, "attrs": {"key": "value"}}
  ],
  "history": [
    {"idx": 0, "kind": "TEST", "attrs": {"key": "value"}}
  ],
  "global": {"key": "value"}
}

idx / scope_idx values are DFS scope indices (same ordering as scope_tree.bin). ci_idx is the zero-based coveritem position within its parent scope. Only objects with at least one attribute are included (sparse). The reader also accepts legacy v1 files that store only scope-level attributes.

11.2 tags.json

Tag assignments for scopes (sparse, DFS-indexed).

{
  "version": 1,
  "entries": [
    {"idx": 0, "tags": ["tag_a", "tag_b"]}
  ]
}

idx is the DFS scope index. Only scopes with at least one tag are included.

11.3 toggle.bin

Per-signal toggle metadata for TOGGLE-type scopes. Despite the .bin extension, this member is JSON-encoded.

{
  "version": 1,
  "entries": [
    {"idx": 5, "canonical": "top.clk", "metric": 0, "type": 1, "dir": 2}
  ]
}

idx is the DFS scope index. All fields except idx are optional and are omitted when they match the defaults (metric = ToggleMetricT._2STOGGLE, type = ToggleTypeT.NET, dir = ToggleDirT.INTERNAL). Only TOGGLE scopes with at least one non-default value are included.

11.4 fsm.bin

FSM state-index overrides for FSM-type scopes. Despite the .bin extension, this member is JSON-encoded. State and transition names are already stored in scope_tree.bin as FSMBIN coveritems under FSM_STATES and FSM_TRANS sub-scopes; this member only records non-sequential state indices.

{
  "version": 1,
  "entries": [
    {"fsm_idx": 3, "states": [{"name": "IDLE", "index": 5}]}
  ]
}

fsm_idx is the DFS scope index of the FSM scope. Only FSM scopes whose state indices differ from the default 0, 1, 2, … sequence are included. The member is omitted entirely when all indices are sequential.

11.5 cross.bin

Cross-coverpoint link records for CROSS-type scopes. Despite the .bin extension, this member is JSON-encoded.

{
  "version": 1,
  "entries": [
    {"idx": 12, "crossed": ["cp_a", "cp_b"]}
  ]
}

idx is the DFS scope index of the CROSS scope. crossed lists the getScopeName() values of each crossed coverpoint (sibling scopes within the same parent COVERGROUP/COVERINSTANCE).

11.6 properties.json

Typed string property values for scopes (DFS-indexed).

{
  "version": 1,
  "entries": [
    {"kind": "scope", "idx": 0, "key": 1, "type": "str", "value": "comment text"}
  ]
}

key is the integer value of the StrProperty enum. Only scopes with explicitly-set properties are included.

11.7 design_units.json

Design-unit name-to-DFS-index lookup table.

{
  "version": 1,
  "units": [
    {"name": "top", "idx": 0, "type": 2}
  ]
}

type is the integer value of ScopeTypeT (e.g. 2 = DU_MODULE). Only DU_ANY scopes are included. The member is omitted when no design units are present.

11.8 formal.bin

Formal-verification assertion data. Despite the .bin extension, this member is JSON-encoded.

{
  "version": 1,
  "entries": [
    {"idx": 42, "status": 1, "radius": 100, "witness": "/path/to/witness.vcd"}
  ]
}

idx is the flat DFS coveritem index (same ordering as counts.bin). The status, radius, and witness fields are each omitted when they match their defaults: status = FormalStatusT.NONE (0), radius = 0, witness = null.

11.9 coveritem_flags.bin

Per-coveritem non-default flags. This member uses a true binary encoding (sparse, delta-encoded varint pairs).

[version      : varint]  always 1
[num_entries  : varint]  number of (index, flags) pairs
per entry:
    [delta_idx : varint]  coveritem DFS index delta from previous entry
    [flags     : varint]  ucisFlagsT value

Only coveritems whose flags differ from the cover-type default (see cover-type defaults table in Section 6.2) are included. The member is omitted entirely when all coveritems use default flags.
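A decoder sketch for this member is shown below. It assumes that the first delta_idx is relative to index 0, which the layout above does not state explicitly; the function name is also an illustrative choice.

```python
def read_varint(buf, offset):
    result, shift = 0, 0
    while True:
        byte = buf[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not (byte & 0x80):
            return result, offset

def decode_coveritem_flags(buf):
    """Return {coveritem DFS index: flags} for the non-default entries."""
    version, offset = read_varint(buf, 0)
    if version != 1:
        raise ValueError(f"unsupported coveritem_flags.bin version {version}")
    num_entries, offset = read_varint(buf, offset)
    flags_by_index = {}
    index = 0                      # assumption: deltas start from index 0
    for _ in range(num_entries):
        delta, offset = read_varint(buf, offset)
        index += delta
        flags, offset = read_varint(buf, offset)
        flags_by_index[index] = flags
    return flags_by_index
```

Coveritems missing from the returned mapping keep the cover-type default flags.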

11.10 contrib/<hist_idx>.bin

Per-test contribution arrays. One file per history node that recorded contributions; <hist_idx> is the integer history-node index (not zero-padded). Each file encodes a sparse, delta-encoded array of per-test hit counts, allowing reconstruction of which tests hit which bins.

[num_entries      : varint]
per entry (sorted by bin_index, ascending):
    [delta_bin_index : varint]  bin_index − previous bin_index
    [count           : varint]  hit count for this bin from this test
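The writer side of this layout can be sketched as follows. The function name and the dict input are assumptions for illustration, and the first delta is taken relative to 0 because the layout does not define the starting point.

```python
def _uleb128(value: int) -> bytes:
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def encode_contrib(counts_by_bin):
    """Serialize {bin_index: hit count} for one test as a contrib member."""
    out = bytearray(_uleb128(len(counts_by_bin)))
    previous = 0                              # assumption: first delta is from 0
    for bin_index in sorted(counts_by_bin):   # entries are sorted ascending
        out += _uleb128(bin_index - previous)
        out += _uleb128(counts_by_bin[bin_index])
        previous = bin_index
    return bytes(out)
```

Sorting before delta-encoding keeps every delta non-negative, which is what allows an unsigned varint to be used for the index column.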

12. Version history

Version                        Changes
────────────────────────────────────────────────────────────────────────────────────────────────────
1.0                            Initial release. Scope-tree V2 encoding with presence bitfield and TOGGLE_PAIR optimization. Varint + UINT32 dual-mode counts encoding. Same-schema fast-merge path via schema_hash.
1.0 (UCIS compliance update)   Added presence bits 4–6 (PRESENCE_CVG_OPTS, PRESENCE_GOAL, PRESENCE_SOURCE_TYPE) to scope records. Added coveritem_flags.bin member for per-coveritem non-default flags. Updated history.json to UCIS-compliant field names (logical_name, physical_name, test_status, sim_time, time_unit, run_cwd, cpu_time, user_name) and added vendor/tool fields (ucis_version, vendor_id, vendor_tool, vendor_tool_version, same_tests, comment, compulsory). Upgraded attrs.bin to V2 format with sections for scopes, coveritems, history nodes, and global attributes. Updated cover-type default flags to 0x01 (most types) / 0x19 (CVGBIN). Documented cross.bin, properties.json, design_units.json, formal.bin, and contrib/ formats.


13. Size and performance reference

Measurements using synthetic BM1–BM6 benchmark databases (pure Python, no C accelerator, median of 3 merge runs):

Workload        Bins    SQLite/test   NCDB/test   Size ratio   SQLite merge   NCDB merge
──────────────────────────────────────────────────────────────────────────────────────────
BM1 Counter     5       276 KB        1.3 KB      209×         22 ms          1.2 ms
BM2 ALU         104     276 KB        1.4 KB      196×         24 ms          1.7 ms
BM3 Protocol    180     276 KB        1.4 KB      195×         29 ms          3.5 ms
BM4 Hierarchy   117     276 KB        1.4 KB      195×         28 ms          4.0 ms
BM5 Bins (8K)   8 800   276 KB        2.3 KB      122×         40 ms          17 ms
BM6 SoC         256     276 KB        1.4 KB      192×         72 ms          12 ms

Merge seed counts: BM1=4, BM2=16, BM3=32, BM4=32, BM5=64, BM6=128.

The SQLite per-test size is dominated by the fixed B-tree page overhead (minimum 276 KB regardless of design size). NCDB scales with actual data: a design with 5 bins uses only 1.3 KB.

With a C accelerator for varint encode/decode, BM5 merge time is projected to drop to ~5 ms (roughly 8× faster than the 40 ms SQLite merge).


14. Implementing a reader

To read an NCDB file without PyUCIS:

import zipfile, json, struct, hashlib

def read_varint(data, offset):
    result, shift = 0, 0
    while True:
        b = data[offset]; offset += 1
        result |= (b & 0x7F) << shift
        shift += 7
        if not (b & 0x80):
            return result, offset

def read_ncdb(path):
    with zipfile.ZipFile(path) as zf:
        manifest = json.loads(zf.read("manifest.json"))
        assert manifest["format"] == "NCDB"

        strings_raw = zf.read("strings.bin")
        counts_raw  = zf.read("counts.bin")
        history     = json.loads(zf.read("history.json"))
        sources     = json.loads(zf.read("sources.json"))

    # Decode string table
    offset = 0
    n_strings, offset = read_varint(strings_raw, offset)
    strings = []
    for _ in range(n_strings):
        length, offset = read_varint(strings_raw, offset)
        strings.append(strings_raw[offset:offset+length].decode("utf-8"))
        offset += length

    # Decode counts
    mode = counts_raw[0]; offset = 1
    n_counts, offset = read_varint(counts_raw, offset)
    counts = []
    if mode == 1:  # VARINT
        # Fast path: all single-byte values
        payload = counts_raw[offset:offset + n_counts]
        if len(payload) == n_counts and all(b < 0x80 for b in payload):
            counts = list(payload)
        else:
            for _ in range(n_counts):
                v, offset = read_varint(counts_raw, offset)
                counts.append(v)
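    elif mode == 0:  # UINT32: n_counts little-endian 32-bit values
        counts = list(struct.unpack_from(f"<{n_counts}I", counts_raw, offset))
    else:
        raise ValueError(f"unknown counts.bin mode: {mode}")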

    return {
        "manifest": manifest,
        "strings":  strings,
        "counts":   counts,
        "history":  history,
        "sources":  sources,
    }



7. V2 binary test history

When manifest.json contains "history_format": "v2" the archive holds six additional binary members. All integers are little-endian unless noted.

7.1 history/test_registry.bin

Maps stable integer IDs to test names and seed strings. IDs are assigned by insertion order and never reassigned.

Header (17 bytes):
  magic       u32   0x54524547  ('TREG')
  version     u8    1
  next_run_id u32   monotonically-increasing run counter
  num_names   u32
  num_seeds   u32

Offset tables (immediately after header):
  name_offsets  u32[num_names]  byte offset into name heap
  seed_offsets  u32[num_seeds]  byte offset into seed heap

Heaps (NUL-terminated UTF-8 strings):
  name_heap  NUL-terminated strings in name_id order
  seed_heap  NUL-terminated strings in seed_id order

7.2 history/test_stats.bin

One 72-byte entry per test name (indexed by name_id).

Header (9 bytes):
  magic      u32   0x54535453  ('TSTS')
  version    u8    1
  num_entries u32

Entry (72 bytes, repeated num_entries times):
  name_id      u32
  total_runs   u32
  pass_count   u32
  fail_count   u32
  error_count  u32
  skip_count   u32
  timeout_count u32
  _reserved    u32   (padding, always 0)
  mean_ms      f32   Welford running mean of runtime in milliseconds
  m2_ms        f32   Welford running sum of squared deviations from the mean (variance = m2/n)
  cusum_pos    f32   CUSUM positive accumulator for change detection
  cusum_neg    f32   CUSUM negative accumulator
  _pad1        f32   (reserved, 0.0)
  _pad2        f32   (reserved, 0.0)
  _pad3        f32   (reserved, 0.0)
  flakiness_score i16  fixed-point 0–10000 representing 0.00–100.00 %
  tag          u8[6] short ASCII label (NUL-padded)
  last_status  u8    most-recent HIST_STATUS_* value
  _trailing    u8    padding
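The mean_ms and m2_ms fields are described as Welford accumulators. The sketch below shows the standard Welford update for one new runtime sample. It is an assumption that the writer applies exactly this update with n equal to the run count; the function name is illustrative, and f32 rounding is ignored.

```python
def welford_update(n, mean, m2, sample):
    """Fold one sample into (n, mean, m2) using Welford's online algorithm."""
    n += 1
    delta = sample - mean
    mean += delta / n
    m2 += delta * (sample - mean)   # uses the updated mean
    return n, mean, m2

def variance(n, m2):
    """Population variance, matching the m2/n formula in the entry layout."""
    return m2 / n if n else 0.0
```

Only three numbers per test are needed to maintain a running mean and variance, which is why the entry can stay fixed-size no matter how many runs are recorded.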

7.3 history/bucket_index.bin

Index over the per-bucket run-record files.

Header (9 bytes):
  magic       u32   0x42494458  ('BIDX')
  version     u8    1
  num_buckets u32

Entry (28 bytes, sorted by bucket_seq):
  bucket_seq  u32
  ts_start    u32   Unix timestamp of first record in bucket
  ts_end      u32   Unix timestamp of last record in bucket
  num_records u32
  fail_count  u32
  min_name_id u32
  max_name_id u32
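Fixed-size layouts like this one map directly onto Python's struct module. The sketch below parses the header and entries; the function name is hypothetical, and it assumes the magic is read as a little-endian u32 and compared against the numeric value 0x42494458 given above.

```python
import struct

HEADER = struct.Struct("<IBI")   # magic, version, num_buckets: 9 bytes
ENTRY = struct.Struct("<7I")     # seven u32 fields: 28 bytes
FIELDS = ("bucket_seq", "ts_start", "ts_end", "num_records",
          "fail_count", "min_name_id", "max_name_id")
BIDX_MAGIC = 0x42494458

def parse_bucket_index(buf):
    """Return a list of dicts, one per bucket entry."""
    magic, version, num_buckets = HEADER.unpack_from(buf, 0)
    if magic != BIDX_MAGIC:
        raise ValueError("not a bucket_index.bin payload")
    if version != 1:
        raise ValueError(f"unsupported bucket index version {version}")
    buckets = []
    offset = HEADER.size
    for _ in range(num_buckets):
        values = ENTRY.unpack_from(buf, offset)
        buckets.append(dict(zip(FIELDS, values)))
        offset += ENTRY.size
    return buckets
```

The "<" prefix selects little-endian byte order with no alignment padding, so the struct sizes match the 9-byte header and 28-byte entry stated in the layout.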

7.4 history/NNNNNN.bin

Each bucket holds up to 10 000 run records, compressed with LZMA (sealed buckets) or DEFLATE level 1 (current open bucket). After decompression:

Header (16 bytes):
  magic       u32   0x42434B54  ('BCKT')
  version     u8    1
  num_records u32
  num_names   u16
  _pad        u8    (padding)
  ts_base     u32   Unix timestamp of first record

Name index (12 bytes per unique name in this bucket):
  name_id     u32   global name_id from test_registry
  offset      u32   byte offset into name's record data
  count       u16   number of records for this name
  _pad        u8[2]

Columnar record data (one column per name, name_id order):
  seeds[]         u8[count]           local seed index (≤ 255 unique/bucket)
  ts_deltas[]     varint[count]       delta-encoded seconds from ts_base
  status_flags[]  u8[count]           nibble-packed (high=status, low=flags)

Seed dictionary (appended after all record data):
  num_local_seeds u8
  seed_ids[]      u32[num_local_seeds]  global seed_ids

Varint encoding: each value uses 1–5 bytes; the high bit of each byte indicates that more bytes follow (7 bits of value per byte, little-endian).

7.5 history/contrib_index.bin

Tracks which test runs contributed coverage so that squash can be replayed.

Header (12 bytes):
  magic        u32   0x43494458  ('CIDX')
  version      u8    1
  policy       u8    merge-policy constant
  watermark    u32   highest squashed run_id
  num_active   u32

Entry (16 bytes, one per unsquashed run):
  run_id    u32
  name_id   u32
  status    u8
  flags     u8
  _pad      u8[2]
  ts        u32

7.6 history/squash_log.bin

Append-only provenance log for squash events.

Header (9 bytes):
  magic      u32   0x53514C47  ('SQLG')
  version    u8    1
  num_entries u32

Entry (24 bytes):
  ts        u32   Unix timestamp of squash operation
  policy    u8    merge-policy used
  _pad      u8[3]
  from_run  u32   first run_id squashed
  to_run    u32   last run_id squashed (inclusive)
  num_runs  u32   total runs processed
  pass_runs u32   runs that passed

8. Testplan and Waivers JSON

testplan.json and waivers.json are optional UTF-8 JSON members stored at the ZIP root. They are written by NcdbWriter when the corresponding objects are attached to the database and are read transparently by NcdbReader.

8.1 testplan.json

{
  "format_version": 1,
  "source_file": "uart.hjson",
  "import_timestamp": "2025-01-01T00:00:00+00:00",
  "testpoints": [
    {
      "name": "uart_reset",
      "stage": "V1",
      "desc": "Verify reset",
      "tests": ["uart_smoke", "uart_reset_*"],
      "tags": ["smoke"],
      "na": false,
      "source_template": "",
      "requirements": [
        {"id": "REQ-001", "desc": "Reset spec"}
      ]
    }
  ],
  "covergroups": [
    {"name": "cg_reset", "desc": "Reset coverage"}
  ]
}

testplan.json: top-level fields

Field              Type              Description
────────────────────────────────────────────────────────────────────────────────────────────
format_version     int               Schema version; currently 1
source_file        string            Path to the Hjson/JSON source that produced this plan
import_timestamp   ISO-8601 string   UTC timestamp when the plan was last imported
testpoints         array             Ordered list of Testpoint objects
covergroups        array             Ordered list of CovergroupEntry objects

Merger behaviour

When merging two .cdb files that both contain testplan.json:

  • Same source_file: the entry with the later import_timestamp is kept.

  • Different source_file: a warning is emitted and the merged output contains no testplan.
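The two rules can be sketched as a small function over the parsed testplan.json objects. The function name and the tie-breaking choice (keeping the first plan when timestamps are equal) are assumptions; the sketch only covers the case where both inputs carry a testplan.

```python
import warnings
from datetime import datetime

def merge_testplans(plan_a, plan_b):
    """Return the surviving testplan dict, or None when the sources differ."""
    if plan_a["source_file"] != plan_b["source_file"]:
        warnings.warn("testplan source_file differs; merged output has no testplan")
        return None
    # fromisoformat accepts the "+00:00" offset form used in the example above.
    ts_a = datetime.fromisoformat(plan_a["import_timestamp"])
    ts_b = datetime.fromisoformat(plan_b["import_timestamp"])
    return plan_b if ts_b > ts_a else plan_a
```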

8.2 waivers.json

{
  "format_version": 1,
  "waivers": [
    {
      "id": "W-001",
      "scope_pattern": "top/uart/**",
      "bin_pattern": "reset_*",
      "rationale": "Deferred to V2",
      "approver": "jdoe",
      "approved_at": "2025-01-01T00:00:00",
      "expires_at": "2026-01-01T00:00:00",
      "status": "active"
    }
  ]
}

waivers.json: Waiver fields

Field           Type                   Description
────────────────────────────────────────────────────────────────────────────────────────────────────
id              string                 Unique waiver identifier
scope_pattern   glob string            Hierarchy path pattern; * = single segment, ** = any depth
bin_pattern     glob string            Coverage bin name pattern; same glob syntax as scope_pattern
rationale       string                 Human-readable reason for the waiver
approver        string                 Name or email of the approver
approved_at     ISO-8601 string        Approval timestamp
expires_at      ISO-8601 string        Expiry timestamp; empty string means no expiry
status          "active" | "expired"   Current status; active_at() filters on both this field and expires_at

Merger behaviour

Waivers are unioned by id across all source files. When the same id appears in multiple sources the entry with the latest approved_at is kept.
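A minimal sketch of this union follows. The function name is hypothetical, and keeping the first-seen entry when two approved_at values are equal is an assumption, since the rule above does not cover ties.

```python
from datetime import datetime

def merge_waivers(*waiver_lists):
    """Union waiver dicts by id, keeping the entry with the latest approved_at."""
    merged = {}
    for waivers in waiver_lists:
        for waiver in waivers:
            current = merged.get(waiver["id"])
            if current is None:
                merged[waiver["id"]] = waiver
                continue
            newer = datetime.fromisoformat(waiver["approved_at"])
            older = datetime.fromisoformat(current["approved_at"])
            if newer > older:
                merged[waiver["id"]] = waiver
    return list(merged.values())
```

Parsing the timestamps instead of comparing strings avoids surprises when sources format the same instant differently, although mixing timestamps with and without a UTC offset would still raise a TypeError.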