NNRP/1-preview3 Protocol Design
1. Positioning
NNRP/1-preview3 does not continue the NNRP/1 line by piling on isolated capabilities. Its purpose is to freeze the protocol topics that are still missing from the next stage of NNRP/1: a multi-session connection container, a profile-neutral schema/profile registry, and unified runtime semantics for operation/workflow lifecycles.
preview3 focuses on three protocol problems:
- Connection and session model: upgrade the single-active-session mental model into a unified container that can carry multiple sessions, multiple priorities, and multiple concurrent operations.
- Extension and data semantics: bring typed payloads, schema/profile interpretation, and cache/lease behavior into one stable public protocol boundary rather than continuing to patch them per implementation.
- Lifecycle and observability: freeze operation/workflow lifecycle, flow-control, recovery, and observability semantics so they remain consistent across implementations.
Therefore, this document defines only protocol-layer boundaries: message types, fixed metadata, body layout, error vocabulary, state-machine semantics, and the conformance baseline.
The formal role of preview3 is to serve as the next protocol-freeze document for NNRP/1, not to prescribe any concrete implementation shape.
1.1 Overview Diagram
This diagram compresses the core protocol objects of preview3 into one view: the connection is now a multi-session container, operations become explicit lifecycle objects, and extensibility is governed by the schema/profile registry.
2. Topics Explicitly Covered by preview3
NNRP/1-preview3 explicitly covers the following topics:
- Upgrade the connection model from preview2's primary mental model of a single active session into a unified connection container capable of carrying multiple active sessions, multiple priority streams, and multi-workflow operations.
- Upgrade preview2's object cache into an AI runtime cache with lease, versioning, dependency, and observability.
- Upgrade typed payloads from static enums to a negotiable type system driven by a schema/profile registry.
- Elevate tool deltas, structured events, and multi-step inference results into explicit operation / workflow runtime semantics, rather than treating them only as payload frames.
- Freeze cross-implementation conformance, golden vectors, error codes, descriptor layouts, and state-machine vocabulary so all implementations consume the same protocol baseline.
- Define the public boundary of flow control, recovery, and observability fields so implementations do not invent a second runtime semantics in parallel.
preview3 still keeps the core constraints of preview1/preview2:
- The hot path remains binary, fixed-layout, explicitly sized, and directly locatable.
- The protocol continues to serve real-time AI runtime semantics rather than browser media stacks or general-purpose RPC.
- The public layer remains profile-neutral in preview3. The first-round standard profiles include at least tensor and token; tensor is no longer treated as the default privileged profile.
3. Topics Not Covered by preview3
preview3 explicitly does not cover the following content:
- Traditional media-stack problems such as browser media capture, playback, A/V sync, AEC, ABR, and SFU/MCU.
- Private GPU memory-page layout, KV-cache page encoding, or runtime internal thread models of a specific model or inference framework.
- UI / game-engine / notebook / web-framework wrapping conventions of specific host environments.
- Concrete implementation-facing API shape, handle management, callback/polling drive modes, packaging, and release strategy.
- Hard-coding all upper-layer AI business semantics into public protocol enums; preview3 defines only public runtime semantics and standard extension mechanisms.
The mistake preview3 must avoid is misunderstanding "making multi-language integration easier" as "the protocol layer must freeze all upper-layer business objects at once." What should actually be frozen are extension boundaries, object lifecycle, and cross-language consistency requirements, rather than the upper-layer object tree of a single product form.
4. Design Principles
preview3 adopts the following design principles:
- Single protocol semantic source: cross-implementation public semantics must be frozen first in the protocol document and conformance baseline; no implementation may privately redefine them.
- No text on the hot path: FRAME_SUBMIT, RESULT_PUSH, typed payload frames, cache objects, and schema descriptions all continue to follow fixed-layout binary paths, with no JSON/Protobuf hot-path fallback.
- Protocol concepts first: once multi-session, priority, cache lease, schema registry, operation lifecycle, and similar concepts affect cross-language interoperability, they must first become protocol concepts rather than being left to private extensions in a single implementation.
- Profile/schema layering: the public layer freezes connection, session, cache, budget, priority, and operation semantics; concrete payload structures are extended through the profile/schema registry rather than continuously bloating public enums.
- Overwrite-in-place evolution within one major line: preview iterations inside the same major line directly overwrite the current NNRP/1 semantics and do not preserve parallel preview-compatibility paths.
- Implementation neutrality: the protocol constrains messages, metadata, error vocabulary, state machines, and descriptor boundaries, not concrete implementation shape.
5. Implementation Boundary
This document freezes only protocol objects, fixed layouts, state machines, error codes, and the conformance vocabulary.
Concrete implementation-facing API shape, handle management, callback/polling drive modes, packaging, and release strategy are not frozen in this protocol document.
The cross-version design of the protocol conformance suite itself, including its layering, case-status model, and integration boundary for implementations, is specified separately in the conformance-suite design document. preview3 only declares here that it must consume one shared public conformance baseline rather than redefining the whole testing framework inline.
6. NNRP/1 Code-Level Identity and Inherited Constraints
6.1 Code-Level Version Identity
preview3, as a development-stage document inside the NNRP/1 line, freezes the emitted code-level identity as NNRP/1.0:
- version_major = 1
- wire_format = 0
- ALPN nnrp/1
This does not mean the design-stage name stops being preview3. It means preview3 must not introduce a new preview-only code-level stage byte or preview-only ALPN. Although the connection and session model of preview3 reuses some concepts of preview2, the new semantics of multiple sessions, multiple priorities, and schema registry added by preview3 must be negotiated through explicit protocol concepts and capability windows rather than being encoded as a preview-only version number.
6.2 Common Header and Length Model
preview3 continues to retain the 40-byte common header and the self-describing length model of meta_len + body_len. The main evolution points of preview3 still lie in:
- Upgrading the field semantics of the metadata table.
- Dividing responsibilities across control-plane message families.
- The extension capability of body regions and the binding relationship between typed payloads and schema.
- Connection and session state-machine semantics.
6.2A Wire Integer Encoding and Fixed-Metadata Packing Rules
preview3 freezes the following binary encoding rules in the first round:
- All multi-byte integers are encoded little-endian.
- The common header and fixed metadata use compact fixed layouts with no language ABI padding.
- Field offsets are byte offsets from the beginning of the structure; every offset in the tables below is a wire offset that all implementations must follow.
- All reserved fields must be cleared by senders; receivers must reject non-zero reserved fields on strict / conformance paths.
- Bitmap fields may only set bits frozen by this document; receivers must reject unknown set bits on strict / conformance paths.
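The encoding rules above can be sketched with Python's struct module, whose "<" prefix selects little-endian byte order with no alignment padding. The record layout, the KNOWN_FLAG_BITS mask, and the function names below are hypothetical and exist only to illustrate the rules; they are not part of the frozen NNRP/1 layouts.

```python
import struct

# Hypothetical record, used only to illustrate the packing rules:
# value:u16, flags:u8, reserved0:u8, length:u32.
RECORD = struct.Struct("<HBBI")  # "<" = little-endian, no ABI padding

KNOWN_FLAG_BITS = 0x03  # assumption: only bits 0 and 1 are frozen


def encode(value, flags, length):
    # Senders always clear reserved fields to 0.
    return RECORD.pack(value, flags, 0, length)


def decode_strict(buf):
    # Strict / conformance path: reject non-zero reserved fields
    # and any bitmap bit that is not frozen.
    value, flags, reserved0, length = RECORD.unpack(buf)
    if reserved0 != 0:
        raise ValueError("non-zero reserved field")
    if flags & ~KNOWN_FLAG_BITS:
        raise ValueError("unknown bitmap bit set")
    return value, flags, length
```

Because the format string has no native alignment, RECORD.size equals the sum of the field widths, which is the property the compact fixed layouts rely on.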
6.2B Freezing of preview3 Top-Level Message Type Assignments
preview3 freezes the following public msg_type:u8 assignments in the first round. Existing preview2 assignments remain unchanged, while the new preview3 session-container messages occupy previously reserved control-plane slots.
| Value | Name | Metadata Length | Body Shape | Description |
|---|---|---|---|---|
| 0x01 | CLIENT_HELLO | existing NNRP/1 length | optional extension blocks | Connection-level hello, capability window, and authentication entry |
| 0x02 | SERVER_HELLO_ACK | existing NNRP/1 length | optional extension blocks | Connection-level hello ack and capability-negotiation result |
| 0x03 | SESSION_PATCH | existing NNRP/1 length | optional extension blocks | Low-frequency session update path |
| 0x04 | SESSION_PATCH_ACK | existing NNRP/1 length | optional extension blocks | Low-frequency session update acknowledgment |
| 0x05 | CLOSE | 0 or existing close metadata | optional extension blocks | Connection-level close; no longer means closing one session |
| 0x06 | ERROR | existing NNRP/1 length | optional error extension | Stable protocol error response |
| 0x07 | SESSION_OPEN | 48 | resume_token_block + auth_block + session_extension_block | Explicitly open one session |
| 0x08 | SESSION_OPEN_ACK | 56 | resume_token_block + session_extension_block | Confirm or reject a session open |
| 0x09 | SESSION_CLOSE | 24 | 0 or close extension block | Close one session |
| 0x0A | SESSION_CLOSE_ACK | 16 | 0 or close extension block | Acknowledge session close state |
| 0x10 | FRAME_SUBMIT | existing NNRP/1 submit length | submit body / typed payload region | Data-plane submission; preview3 upgrades semantics through session/profile/schema |
| 0x11 | FRAME_CANCEL | existing NNRP/1 length | optional extension blocks | Cancel a submission or operation |
| 0x12 | RESULT_PUSH | existing NNRP/1 result length | result body / typed payload region | Result, partial, terminal, stale/degraded delivery |
| 0x13 | RESULT_DROP | existing NNRP/1 length | optional extension blocks | Explicit result drop or delivery failure |
| 0x14 | CACHE_PUT | existing NNRP/1 length | cache object body | Cache-object write |
| 0x15 | CACHE_ACK | existing NNRP/1 length | optional extension blocks | Cache operation acknowledgment |
| 0x16 | CACHE_INVALIDATE | existing NNRP/1 length | optional extension blocks | Cache or schema invalidation |
| 0x17 | FLOW_UPDATE | 32 | 0 | Three-scope credit / backpressure update |
| 0x18 | RESULT_HINT | existing NNRP/1 length | optional extension blocks | Low-frequency result hint |
| 0x19 | TRANSPORT_PROBE | existing NNRP/1 length | probe body | Transport probing |
| 0x1A | TRANSPORT_PROBE_ACK | existing NNRP/1 length | probe ack body | Transport probing acknowledgment |
| 0x1B | SESSION_MIGRATE | existing NNRP/1 length | migration body | Existing migration control path |
| 0x1C | SESSION_MIGRATE_ACK | existing NNRP/1 length | migration ack body | Existing migration acknowledgment path |
| 0x20 | PING | 0 | 0 | Keepalive / smoke |
| 0x21 | PONG | 0 | 0 | Keepalive / smoke |
First-round constraints:
- 0x07-0x0A are the preview3 session-container message assignments; implementations must not use these values for private messages.
- Any msg_type value not listed above is reserved; receivers must reject unknown message types on strict / conformance paths.
- Messages marked as "existing NNRP/1 length" continue to use the currently frozen NNRP/1 layouts; preview3 does not reorder those message assignments or old field offsets in this document.
- The body segment order of SESSION_OPEN / SESSION_OPEN_ACK is determined by the length fields in metadata; a segment is absent when its corresponding length is 0.
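As an illustration of the assignment table and the reserved-value rule, the following sketch maps msg_type:u8 values to an enum and rejects unlisted values on the strict path. The names MsgType and parse_msg_type are hypothetical, and the non-strict behavior (returning None) is an assumption; this document only defines what strict / conformance paths must do.

```python
from enum import IntEnum


class MsgType(IntEnum):
    CLIENT_HELLO = 0x01
    SERVER_HELLO_ACK = 0x02
    SESSION_PATCH = 0x03
    SESSION_PATCH_ACK = 0x04
    CLOSE = 0x05
    ERROR = 0x06
    SESSION_OPEN = 0x07
    SESSION_OPEN_ACK = 0x08
    SESSION_CLOSE = 0x09
    SESSION_CLOSE_ACK = 0x0A
    FRAME_SUBMIT = 0x10
    FRAME_CANCEL = 0x11
    RESULT_PUSH = 0x12
    RESULT_DROP = 0x13
    CACHE_PUT = 0x14
    CACHE_ACK = 0x15
    CACHE_INVALIDATE = 0x16
    FLOW_UPDATE = 0x17
    RESULT_HINT = 0x18
    TRANSPORT_PROBE = 0x19
    TRANSPORT_PROBE_ACK = 0x1A
    SESSION_MIGRATE = 0x1B
    SESSION_MIGRATE_ACK = 0x1C
    PING = 0x20
    PONG = 0x21


def parse_msg_type(raw, strict=True):
    """Map a raw u8 to MsgType; unlisted values are reserved."""
    try:
        return MsgType(raw)
    except ValueError:
        if strict:
            raise ValueError(f"reserved msg_type 0x{raw:02X}")
        return None  # assumption: lenient paths may ignore the message
```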
6.3 Continued Principles from Earlier Frozen Work
The following design principles of preview2 continue to hold in preview3:
- The normative host shape remains submit pump + result pump + control path, rather than synchronous request-response.
- The distinction among partial / stale_reuse / degraded / drop must still be explicitly preserved at the protocol layer.
- Low-frequency object caching and object references continue to be first-class citizens and do not fall back to "stable objects fully inlined on every frame."
- Typed payloads and extension frames continue to be retained, but the extension mechanism no longer primarily proceeds by "continuing to expand the payload-kind bitmap."
7. preview3 Connection and Session Model
7.1 Connection-Level Bootstrap and Multi-Session Container
preview3 explicitly treats a connection as a session container rather than a dedicated channel for a single active session.
Minimum requirements:
- A single connection may carry multiple active sessions.
- CLIENT_HELLO / SERVER_HELLO_ACK are responsible for connection-level capability negotiation, authentication, feature window negotiation, and declaration of baseline cache and schema capabilities.
- Add SESSION_OPEN / SESSION_OPEN_ACK as an explicit session-creation flow for declaring profile, schema, budget window, priority class, and cache/lease requirements.
- SESSION_PATCH / SESSION_PATCH_ACK continue to be retained as the low-frequency session-update path.
- CLOSE can still be used for connection-level closure. preview3 additionally requires explicit session-close semantics so that the preview1/2 habit of "closing one session equals closing the whole connection" does not continue leaking into the multi-session model.
7.1A Freezing of SESSION_OPEN / SESSION_OPEN_ACK Fixed Metadata
In the first round, preview3 freezes SESSION_OPEN and SESSION_OPEN_ACK as minimally implementable yet extensible session-open metadata, rather than letting implementations privately assemble their own session-open body.
The fixed metadata of SESSION_OPEN is fixed at 48 bytes in the first round:
| Field | Type | Description |
|---|---|---|
| requested_session_id | u32 | Session id desired by the client; 0 means assigned by the server |
| profile_id | u16 | Requested standard or extension profile |
| priority_class | u8 | Session priority class; values are frozen later in this document |
| session_flags | u8 | Session-level capability/behavior flags |
| schema_id | u32 | Default schema id; 0 if absent |
| schema_version | u32 | Default schema version; 0 if absent |
| default_deadline_ms | u32 | Default operation deadline or latency budget |
| max_in_flight_operations | u16 | Maximum number of parallel operations expected by the client |
| reserved0 | u16 | Reserved; sender clears to 0 |
| lease_ttl_hint_ms | u32 | Default lease TTL expected by the client; 0 if unspecified |
| resume_token_bytes | u32 | Length of resume_token_block; 0 if absent |
| auth_bytes | u32 | Length of auth_block; 0 if absent |
| session_extension_bytes | u32 | Length of session_extension_block; 0 if absent |
| client_session_tag | u64 | Client-local observable tag for logs and cross-layer correlation |
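
The 48-byte layout above can be checked with a short packing sketch. It assumes the fields are laid out in table order with no padding, as the packing rules in 6.2A require; the function name pack_session_open, its default argument values, and the example profile_id are hypothetical. The 40-byte common header is not included.

```python
import struct

# Fields in table order: u32 u16 u8 u8 | u32 u32 u32 | u16 u16 | u32 x4 | u64
SESSION_OPEN_META = struct.Struct("<IHBBIIIHHIIIIQ")


def pack_session_open(*, profile_id, priority_class, session_flags=0,
                      requested_session_id=0, schema_id=0, schema_version=0,
                      default_deadline_ms=0, max_in_flight_operations=1,
                      lease_ttl_hint_ms=0, resume_token=b"", auth=b"",
                      session_extension=b"", client_session_tag=0):
    """Return fixed metadata followed by the body segments in frozen order."""
    meta = SESSION_OPEN_META.pack(
        requested_session_id, profile_id, priority_class, session_flags,
        schema_id, schema_version, default_deadline_ms,
        max_in_flight_operations, 0,  # reserved0 is always cleared
        lease_ttl_hint_ms, len(resume_token), len(auth),
        len(session_extension), client_session_tag,
    )
    # Body order: resume_token_block + auth_block + session_extension_block.
    # A segment with length 0 is simply absent.
    return meta + resume_token + auth + session_extension
```

For example, pack_session_open(profile_id=1, priority_class=0, auth=b"tok") yields 48 metadata bytes followed by a 3-byte auth_block; the profile_id value 1 here is a placeholder, not a frozen registry assignment.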
The fixed metadata of SESSION_OPEN_ACK is fixed at 56 bytes in the first round:
| Field | Type | Description |
|---|---|---|
| session_id | u32 | Actually allocated or confirmed session id |
| accepted_profile_id | u16 | Profile id accepted by the server |
| accepted_priority_class | u8 | Priority class accepted by the server |
| session_status | u8 | Session-open result status |
| schema_id | u32 | Default schema id confirmed by the server |
| schema_version | u32 | Default schema version confirmed by the server |
| granted_operation_credit | u16 | Initially granted operation credit |
| max_in_flight_operations | u16 | Maximum number of parallel operations allowed by the server |
| lease_ttl_ms | u32 | Default lease TTL accepted by the server |
| resume_window_ms | u32 | Resume window; 0 if absent |
| resume_token_bytes | u32 | Length of resume_token_block; 0 if absent |
| session_extension_bytes | u32 | Length of session_extension_block; 0 if absent |
| server_session_tag | u64 | Server-local observable tag |
| route_scope_id | u32 | Minimum routing scope confirmed by the server |
| session_error_code | u32 | Stable error code returned if session_status is not success |
| session_flags_ack | u32 | Session flags accepted by the server |
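
A receiver-side sketch of the 56-byte SESSION_OPEN_ACK metadata follows. It assumes table-order packing with no padding and takes as input the metadata plus body bytes (without the common header); the names SessionOpenAck and parse_session_open_ack are hypothetical, and the exact-length body check is an assumption about what a strict receiver might enforce.

```python
import struct
from collections import namedtuple

SESSION_OPEN_ACK_META = struct.Struct("<IHBBIIHHIIIIQIII")

SessionOpenAck = namedtuple("SessionOpenAck", [
    "session_id", "accepted_profile_id", "accepted_priority_class",
    "session_status", "schema_id", "schema_version",
    "granted_operation_credit", "max_in_flight_operations",
    "lease_ttl_ms", "resume_window_ms", "resume_token_bytes",
    "session_extension_bytes", "server_session_tag", "route_scope_id",
    "session_error_code", "session_flags_ack",
])


def parse_session_open_ack(payload):
    """Split payload into (metadata, resume_token_block, extension_block)."""
    size = SESSION_OPEN_ACK_META.size
    if len(payload) < size:
        raise ValueError("truncated SESSION_OPEN_ACK metadata")
    ack = SessionOpenAck._make(SESSION_OPEN_ACK_META.unpack_from(payload))
    body = payload[size:]
    # Segment boundaries come from the length fields in metadata.
    if len(body) != ack.resume_token_bytes + ack.session_extension_bytes:
        raise ValueError("body length does not match metadata lengths")
    resume_token = body[:ack.resume_token_bytes]
    extension = body[ack.resume_token_bytes:]
    return ack, resume_token, extension
```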
Additional constraints in the first round:
- SESSION_OPEN is responsible only for establishing the default session context; it does not carry the body of the first operation submission.
- schema_id / schema_version indicate the default schema of the session rather than forbidding later operation-level overrides.
- Higher-level profile-private session parameters still enter through session_extension_block or schema/profile object extensions rather than continuing to bloat fixed metadata.
7.1B Freezing of session-open status bits and error codes
preview3 freezes the following session_flags:u8 bit definitions in the first round:
| bit | Mask | Meaning |
|---|---|---|
| 0 | 0x01 | allow_resume: the client requests that the session support resume tokens / resume windows |
| 1 | 0x02 | allow_background_results: background result/event pumps are allowed to continue delivering outside submit calls |
| 2 | 0x04 | allow_cache_leases: the session is allowed to create or renew cache/schema leases |
| 3 | 0x08 | allow_schema_override: operation-level override of the session default schema is allowed |
| 4-7 | Reserved | The sender clears them to 0; the receiver must reject non-zero reserved bits |
preview3 freezes the following session_status:u8 enum values in the first round:
| Value | Name | Meaning |
|---|---|---|
| 0 | opened | Session established successfully |
| 1 | rejected | The server rejected establishing the session |
| 2 | retry_later | The session cannot currently be established; it may be retried later according to retry/reuse-related policy |
| 3 | resumed | The session was established successfully in resume mode |
preview3 freezes the following session_flags_ack:u32 bit definitions in the first round:
| bit | Mask | Meaning |
|---|---|---|
| 0 | 0x00000001 | resume_enabled: resume is allowed by the server |
| 1 | 0x00000002 | background_results_enabled: background result/event pumps are allowed by the server |
| 2 | 0x00000004 | cache_leases_enabled: cache/schema leases are allowed by the server |
| 3 | 0x00000008 | schema_override_enabled: operation-level schema override is allowed by the server |
| 4 | 0x00000010 | priority_downgraded: the requested priority was downgraded by the server |
| 5-31 | Reserved | The sender clears them to 0; the receiver must reject unknown set bits |
preview3 freezes the following session_error_code:u32 family in the first round:
| Value | Name | Meaning |
|---|---|---|
| 0x00000000 | none | No error |
| 0x00010001 | auth_failed | Authentication failed |
| 0x00010002 | profile_unsupported | The requested profile is unsupported |
| 0x00010003 | schema_unsupported | The requested schema or version is unsupported |
| 0x00010004 | priority_rejected | The requested priority class is not allowed |
| 0x00010005 | lease_policy_rejected | The requested lease policy is not allowed |
| 0x00010006 | resume_rejected | The requested resume mode or token is not allowed |
| 0x00010007 | session_limit_reached | The current connection or server session limit has been reached |
Constraints in the first round:
- session_error_code returns a non-zero value only when session_status != opened or when there is a downgrade/recovery-related abnormality.
- session_flags_ack may only confirm or downgrade what the client requested, and may not privately introduce new capabilities that were not requested.
- If this error-code family needs to be extended later, it must continue to expand according to a high-bit family-reservation strategy and must not reorder already frozen values.
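The reserved-bit and confirm-or-downgrade rules can be sketched as two checks. The sketch assumes that session_flags_ack bits 0-3 mirror session_flags bits 0-3 one-to-one, which the two tables suggest but do not state outright, and that priority_downgraded (bit 4) is informational and needs no matching request bit. All constant and function names are hypothetical.

```python
SESSION_FLAGS_KNOWN = 0x0F            # bits 0-3 of session_flags:u8
ACK_REQUEST_MIRROR = 0x0000000F       # assumed mirror of request bits 0-3
ACK_PRIORITY_DOWNGRADED = 0x00000010  # bit 4, set only by the server
ACK_KNOWN = ACK_REQUEST_MIRROR | ACK_PRIORITY_DOWNGRADED


def check_session_flags(flags):
    """Receiver-side check for session_flags:u8 in SESSION_OPEN."""
    if flags & ~SESSION_FLAGS_KNOWN:
        raise ValueError("reserved session_flags bit set")


def check_flags_ack(requested, ack):
    """Client-side check for session_flags_ack:u32 in SESSION_OPEN_ACK."""
    if ack & ~ACK_KNOWN:
        raise ValueError("unknown session_flags_ack bit set")
    granted = ack & ACK_REQUEST_MIRROR
    # The server may confirm or downgrade, never add a capability.
    if granted & ~requested:
        raise ValueError("server enabled a capability that was not requested")
```

For instance, a client that requested allow_resume and allow_background_results (0x03) accepts an acknowledgment of 0x01, while a client that requested only allow_resume (0x01) rejects an acknowledgment of 0x02.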
7.1C Freezing of SESSION_CLOSE / SESSION_CLOSE_ACK and the minimum routing fields
In the first round, preview3 freezes session close as a standard control-message pair rather than reusing the implicit habit of connection CLOSE.
SESSION_CLOSE fixed metadata is frozen to 24 bytes in the first round:
| Field | Type | Description |
|---|---|---|
| close_reason | u16 | Close reason; values are frozen below |
| in_flight_policy | u8 | How existing in-flight operations are handled |
| reserved0 | u8 | Reserved; sender clears to 0 |
| drain_timeout_ms | u32 | Timeout window for draining existing operations; 0 means apply immediately |
| last_operation_id | u64 | The last operation watermark acknowledged by the sender; 0 if absent |
| session_error_code | u32 | Stable error code when the session is being closed because of an error; otherwise 0 |
| session_close_tag | u32 | Local observability correlation tag for the close |
SESSION_CLOSE_ACK fixed metadata is frozen to 16 bytes in the first round:
| Field | Type | Description |
|---|---|---|
| close_status | u8 | Close-ack status; values are frozen below |
| reserved0 | u8 | Reserved; sender clears to 0 |
| reserved1 | u16 | Reserved; sender clears to 0 |
| last_operation_id | u64 | Operation watermark confirmed by the server |
| session_error_code | u32 | Stable error code if closing itself encountered an error; otherwise 0 |
close_reason:u16 is frozen in the first round as:
| Value | Name | Semantics |
|---|---|---|
| 0 | normal | Normal close |
| 1 | client_shutdown | Client-initiated shutdown |
| 2 | server_shutdown | Server-initiated shutdown |
| 3 | idle_timeout | Closed because of idle timeout |
| 4 | protocol_error | Closed because of a stable protocol error |
| 5 | auth_revoked | Closed because authentication was revoked or expired |
in_flight_policy:u8 is frozen in the first round as:
| Value | Name | Semantics |
|---|---|---|
| 0 | drain | Allow existing in-flight operations to drain within drain_timeout_ms |
| 1 | abort | Abort existing in-flight operations immediately |
close_status:u8 is frozen in the first round as:
| Value | Name | Semantics |
|---|---|---|
| 0 | acknowledged | The close request has been accepted and is being executed |
| 1 | draining | Existing operations are still being drained |
| 2 | closed | The session is fully closed |
| 3 | rejected | The close request was rejected |
Additional first-round constraints:
- Connection-scope control messages must use header.session_id = 0.
- Session-scope control, data, and result messages must use header.session_id = target session.
- Operation-scope messages must carry both header.session_id and an operation_id in their fixed metadata.
- SESSION_CLOSE closes only one session and does not imply closing the whole connection.
- If SESSION_CLOSE_ACK.close_status = draining, the sender must continue processing subsequent terminal events against the known operation watermark until closed or a later close acknowledgment is received.
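The session-close pair can be sketched as one packer and one strict parser. The sketch assumes table-order packing with no padding for both the 24-byte and 16-byte layouts and omits the common header; the class and function names are hypothetical.

```python
import struct
from enum import IntEnum


class CloseReason(IntEnum):
    NORMAL = 0
    CLIENT_SHUTDOWN = 1
    SERVER_SHUTDOWN = 2
    IDLE_TIMEOUT = 3
    PROTOCOL_ERROR = 4
    AUTH_REVOKED = 5


class InFlightPolicy(IntEnum):
    DRAIN = 0
    ABORT = 1


class CloseStatus(IntEnum):
    ACKNOWLEDGED = 0
    DRAINING = 1
    CLOSED = 2
    REJECTED = 3


SESSION_CLOSE_META = struct.Struct("<HBBIQII")    # 24 bytes
SESSION_CLOSE_ACK_META = struct.Struct("<BBHQI")  # 16 bytes


def pack_session_close(reason, policy, drain_timeout_ms=0,
                       last_operation_id=0, session_error_code=0,
                       session_close_tag=0):
    # reserved0 is always cleared by the sender.
    return SESSION_CLOSE_META.pack(reason, policy, 0, drain_timeout_ms,
                                   last_operation_id, session_error_code,
                                   session_close_tag)


def parse_session_close_ack(meta):
    status, r0, r1, last_op, err = SESSION_CLOSE_ACK_META.unpack(meta)
    if r0 != 0 or r1 != 0:
        raise ValueError("non-zero reserved field")
    # CloseStatus() raises ValueError for values outside the frozen enum.
    return CloseStatus(status), last_op, err
```

A caller that receives CloseStatus.DRAINING keeps consuming terminal events up to the returned watermark rather than treating the session as gone.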
7.2 Priorities and Stream Classes
preview3 introduces explicit priority and stream-class semantics for scheduling multiple sessions on the same connection and multiple operations within the same session.
At minimum, the protocol layer must be able to express:
- Session priority classes, such as interactive / balanced / background.
- Operation priorities and deadline windows.
- The dual-layer constraints of dynamic credit at the session level and the connection level.
- Explicit acknowledgments from the server for priority downgrade, rate limiting, or preemption.
preview3 does not require any specific scheduling algorithm to be hard-coded as the only implementation, but it must freeze these semantic objects and error vocabularies so that different implementations no longer diverge in their interpretation of "backpressure," "preemption," and "expiration."
7.2A Freezing of standard scheduling enums
preview3 freezes the following standard enum values in the first round:
session_priority_class:u8
| Value | Name | Semantics |
|---|---|---|
| 0 | interactive | For foreground low-latency interaction, prioritizing deadlines and responsiveness |
| 1 | balanced | Default priority, balancing throughput and latency |
| 2 | background | For background tasks or prefetch work, which may be preempted by higher priorities |
operation_state:u8
| Value | Name | Semantics |
|---|---|---|
| 0 | accepted | Accepted and entered the scheduling system |
| 1 | running | Execution has started |
| 2 | partial | Consumable but non-terminal partial results have been produced |
| 3 | waiting_tool | Waiting for a tool or external dependency before continuing |
| 4 | superseded | Superseded by a new operation or new context |
| 5 | cancelled | Explicitly cancelled |
| 6 | failed | Terminated with an error |
| 7 | completed | Completed normally |
cancel_scope:u8
| Value | Name | Semantics |
|---|---|---|
| 0 | operation | Cancel only a single operation |
| 1 | subtree | Cancel that operation and its child-operation tree |
| 2 | group | Cancel all operations under the same operation_group_id |
| 3 | session | Cancel all still-active operations under the entire session |
Constraints in the first round:
- All implementations must treat these numeric values as protocol enums rather than private local status codes.
- partial and completed may appear in sequence within the same operation lifecycle; failed / cancelled / superseded / completed are terminal states.
- interactive expresses only scheduling priority and credit preference, and does not guarantee absolute resource exclusivity.
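The operation_state enum and its terminal set can be expressed directly. The check_transition helper below encodes only one inference, that nothing may follow a terminal state; this document does not freeze a full transition table here, so the helper deliberately allows every other transition. All names are hypothetical.

```python
from enum import IntEnum


class OperationState(IntEnum):
    ACCEPTED = 0
    RUNNING = 1
    PARTIAL = 2
    WAITING_TOOL = 3
    SUPERSEDED = 4
    CANCELLED = 5
    FAILED = 6
    COMPLETED = 7


TERMINAL_STATES = frozenset({
    OperationState.SUPERSEDED,
    OperationState.CANCELLED,
    OperationState.FAILED,
    OperationState.COMPLETED,
})


def is_terminal(state):
    return state in TERMINAL_STATES


def check_transition(current, new):
    # Assumption: a terminal state ends the lifecycle, so any later
    # state report for the same operation is rejected.
    if is_terminal(current):
        raise ValueError(f"{current.name} is terminal; got {new.name}")
```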
8. preview3 Advanced Cache Model
preview2 already has cache objects and object references; preview3 needs to upgrade them into a lease-capable, versioned, dependency-trackable AI runtime cache.
The preview3 cache model contains at least the following capabilities:
- lease: cache objects or schema objects must be able to declare TTL, renewal, expiration policy, and owner scope.
- version: object references must distinguish object_id from object_version, no longer leaving "changed content but reused old key" to host-private conventions.
- dependency: objects, results, and schemas must be able to declare dependency relationships for result reuse, cache invalidation, and consistency checks.
- observability: the protocol layer must be able to express stable error reasons such as cache miss, lease expired, dependency invalid, and schema mismatch.
- host-visible policy: the client may proactively declare preferences such as prefetch, touch, lease renew, eviction hints, or result reuse.
preview3 does not require the public layer to directly freeze private model KV-cache page encodings; such objects should still exist as profile-local or runtime-private object kinds. The public layer is responsible for freezing the lease contract, version semantics, dependency semantics, and error behavior.
8.1 Freezing of lease, version, and cache-error vocabulary
In the first round, preview3 further freezes the following cache-level public semantics:
- object_id is the logical object identity; it remains stable when content changes but the logical identity does not.
- object_version is the content-revision number under the same object_id; it must increase monotonically on semantic changes.
- lease_id:u64 is the stable identity of one granted lease; renewal preserves the same lease_id, while a new grant creates a new lease_id.
- lease_owner_scope:u8 is frozen as connection=0 / session=1 / operation=2.
- Host-visible policy hints such as prefetch / touch / renew / evict_hint / reuse_preference are explicit hints and must not be interpreted as mandatory overrides of version, dependency, or schema validation.
cache_error_code:u32 is frozen in the first round as:
| Value | Name | Meaning |
|---|---|---|
| 0x00030000 | none | No cache error |
| 0x00030001 | cache_miss | The referenced object does not exist |
| 0x00030002 | lease_expired | The referenced lease has expired |
| 0x00030003 | version_mismatch | The requested object_version does not match the current available version |
| 0x00030004 | dependency_invalid | A dependent object or schema has become invalid |
| 0x00030005 | schema_mismatch | The object is incompatible with the required schema/profile interpretation |
First-round constraints:
- cache_miss / lease_expired / version_mismatch / dependency_invalid / schema_mismatch must stay as stable error vocabulary across implementations and must not be rewritten into local private string errors.
- If result reuse depends on a specific object_id + object_version or schema version, that dependency must enter the observable dependency graph; invalidation must return either a stable error code or an explicit invalidation event.
- Runtime-private object kinds may still exist, but they may not bypass the public object_id / object_version / lease_id / cache_error_code semantics above.
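A minimal sketch of how a reference lookup might map onto the frozen cache_error_code values is shown below. The in-memory dictionary, the CacheEntry fields, the absolute lease_expires_at_ms timestamp, and the order in which the checks run are all assumptions made for illustration; only the numeric error codes come from this document. Dependency and schema checks are omitted.

```python
from dataclasses import dataclass
from enum import IntEnum


class CacheError(IntEnum):
    NONE = 0x00030000
    CACHE_MISS = 0x00030001
    LEASE_EXPIRED = 0x00030002
    VERSION_MISMATCH = 0x00030003
    DEPENDENCY_INVALID = 0x00030004
    SCHEMA_MISMATCH = 0x00030005


@dataclass
class CacheEntry:
    object_version: int
    lease_id: int
    lease_expires_at_ms: int  # hypothetical absolute expiry timestamp


def resolve(cache, object_id, object_version, now_ms):
    """Return the stable error code for one object reference."""
    entry = cache.get(object_id)
    if entry is None:
        return CacheError.CACHE_MISS
    if now_ms >= entry.lease_expires_at_ms:
        return CacheError.LEASE_EXPIRED
    if entry.object_version != object_version:
        return CacheError.VERSION_MISMATCH
    return CacheError.NONE
```

Returning an enum value rather than a message string keeps the result comparable across implementations, which is the point of freezing the vocabulary.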
9. preview3 Schema / Profile Registry
preview3 no longer treats "continuing to add payload-kind enums" as the primary extension path, but introduces a standard schema/profile registry.
Design goals:
- The public layer does not presuppose a single default profile. The first-round standard profiles include at least tensor and token, and continue to allow payload families such as structured_event, tool_delta, and opaque_bytes to hang off the schema/profile registry.
- Concrete payload semantics are bound through schema_id + schema_version + profile_id + stream_semantics rather than adding a new public payload kind every time a new data type appears.
- Schema objects enter the cache / lease lifecycle and may be preinstalled, referenced, invalidated, and version-rolled back.
- Different implementations must not interpret descriptor-private fields independently; they must follow one unified schema-registry contract.
preview3 therefore needs to standardize at least the following information:
- The common header of schema descriptor objects.
- Negotiation, installation, invalidation, and version-conflict handling of the schema registry.
- Standard fields related to schema/profile binding in typed payload descriptors.
- Error handling for unknown schema, unknown version, and critical schema incompatibility.
9.3 Freezing of the common header of schema descriptors
preview3 fixes the common header of schema descriptors to 32 bytes in the first round, so version, applicability, and routing decisions can be completed without parsing the profile-private body.
| Field | Type | Description |
|---|---|---|
| schema_id | u32 | Schema identifier |
| schema_version | u32 | Schema version |
| profile_id | u16 | Profile to which this schema belongs |
| schema_flags | u16 | Schema behavior flags |
| min_version_major | u8 | Minimum applicable major version |
| max_version_major | u8 | Maximum applicable major version |
| reserved0 | u16 | Reserved; sender clears to 0 |
| body_bytes | u32 | Length of the schema body |
| dependency_count | u16 | Number of dependent schema/object entries |
| default_stream_semantics | u16 | Default stream semantics |
| schema_hash | u64 | Stable digest of the schema body |
Constraints in the first round:
- The common header addresses only public questions such as "what this schema is, which profile it belongs to, which major version it applies to, how long the body is, and how many objects it depends on."
- Any profile-private interpretation fields must enter the schema body and must not continue to bloat the common header.
- schema_hash is used for consistency checks and cache deduplication; it does not directly replace the logical identity of schema_id + schema_version.
- default_stream_semantics provides only default semantics; a payload descriptor may still override it on a per-frame or per-operation basis.
schema_flags:u16 freezes the following bit definitions in the first round:
| bit | Mask | Meaning |
|---|---|---|
| 0 | 0x0001 | cacheable: this schema may enter the cache / lease lifecycle |
| 1 | 0x0002 | critical: unknown or incompatible handling must reject |
| 2 | 0x0004 | default_bindable: this schema may be used as a session default schema |
| 3 | 0x0008 | hash_stable: the same schema_id + schema_version must bind to the same schema_hash |
| 4-15 | Reserved | The sender clears them to 0; the receiver must reject unknown set bits |
stream_semantics:u16 / default_stream_semantics:u16 are frozen in the first round as:
| Value | Name | Semantics |
|---|---|---|
| 0 | default | Inherit the default profile/schema interpretation |
| 1 | snapshot | The current payload is a full snapshot |
| 2 | append | The current payload appends to an existing sequence or stream |
| 3 | replace | The current payload replaces an existing logical segment |
| 4 | event | The current payload carries discrete event semantics |
| 5 | tool_update | The current payload carries tool-call or tool-result incremental semantics |
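The value table above maps naturally onto an enumeration. The sketch below also shows one possible reading of how a descriptor value combines with a schema's default_stream_semantics: a descriptor value of 0 (default) falls back to the schema default, and any other value overrides it. That resolution rule and the names StreamSemantics and resolve_stream_semantics are assumptions made for illustration.

```python
from enum import IntEnum


class StreamSemantics(IntEnum):
    """Values from the stream_semantics table."""
    DEFAULT = 0
    SNAPSHOT = 1
    APPEND = 2
    REPLACE = 3
    EVENT = 4
    TOOL_UPDATE = 5


def resolve_stream_semantics(descriptor_value: int, schema_default: int) -> StreamSemantics:
    """Pick the effective semantics for one payload.

    Assumption: a descriptor value of DEFAULT inherits the schema default;
    any other assigned value overrides it. Unassigned values raise ValueError.
    """
    from_descriptor = StreamSemantics(descriptor_value)
    if from_descriptor is not StreamSemantics.DEFAULT:
        return from_descriptor
    return StreamSemantics(schema_default)
```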
9.1 Freezing of the first-round standard profiles
In the first round, preview3 freezes two standard profiles, the tensor profile and the token profile, both of which are equally valid at the public layer.
The minimum standard semantics of tensor profile remain:
- It is oriented toward blockized or regionized numeric payloads rather than being forcibly bound to rendering scenarios.
- It allows shape, dtype, layout, section/layout interpretation, and coverage semantics to be declared through schema/profile descriptors.
- partial / degraded / stale_reuse under the tensor profile may still carry coverage semantics, but coverage is no longer the default requirement for all profiles.
The minimum standard semantics of token profile are frozen as:
- It is oriented toward incremental output of discrete tokens or token chunks and does not require token sequences to masquerade as tensor sections.
- The standard result path must at least be able to express incremental token fragments, sequence position/range, completion status, and stop/reason vocabulary.
- In the first round, token profile does not require logits, full candidate distributions, or model-private sampling state to enter mandatory public fields; such content may only enter through schema/profile extensions.
- Under the token profile, the default meaning of partial is "the sequence is not yet complete but the current chunk is consumable," rather than a tensor-style coverage gap.
9.1A Freezing of the first-round standard registry assignments
Freezing structure alone is not sufficient. If the public numeric assignments for standard profiles and standard schemas are left implicit, canonical vectors, conformance baselines, and host-visible helpers will still end up allocating identifiers independently. In the first round, preview3 therefore freezes the minimum standard registry assignments that have already entered the public interoperability surface.
profile_id:u16 freezes the following public assignments in the first round:
| Value | Name | Meaning |
|---|---|---|
| 0x0000 | unspecified | The current session or payload is not explicitly bound to a standard profile |
| 0x0001 | tensor | Standard tensor profile |
| 0x0002 | token | Standard token profile |
First-round constraints:
- A new standard profile must be assigned in this table before it is allowed to appear in canonical vectors, conformance baselines, or public multi-language SDK APIs.
- profile_id = 0 only means "no standard profile was explicitly bound"; implementations must not silently reinterpret it as "tensor by default" or as some other runtime-private profile.
- If a language binding exposes public tensor/token constants, it must use the assignments above rather than reordering them locally.
schema_id:u32 + schema_version:u32 freeze at least the following public registry anchor in the first round:
| profile_id | schema_id | schema_version | Name | Default stream_semantics | Description |
|---|---|---|---|---|---|
| 0x0002 | 0x00001001 | 3 | llm.chat.delta.v1 | append | First-round standard token incremental schema used by the minimal public token-chunk interpretation path |
First-round constraints:
- The table above is the minimum standard schema anchor already consumed by canonical vectors and cross-language conformance; no other "standard schema" may be assigned privately before it is added here.
- schema_id = 0 continues to mean "no default schema is bound in the current context"; it is not an alias for some implicit standard schema.
- If a future change wants to add a public tensor schema or any other standard-profile schema, that assignment must first be added to the protocol design before it enters conformance or SDK surfaces.
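A language binding could expose the registry assignments above roughly as follows. This is a sketch only: the constant names, the StandardSchema tuple, and lookup_standard_schema are hypothetical, and the numeric values are copied from the two tables. The lookup deliberately returns nothing for profile_id = 0 and schema_id = 0 instead of guessing a default.

```python
from typing import NamedTuple, Optional

# profile_id assignments from the table above.
PROFILE_UNSPECIFIED = 0x0000
PROFILE_TENSOR = 0x0001
PROFILE_TOKEN = 0x0002

STREAM_APPEND = 2  # stream_semantics value named "append"


class StandardSchema(NamedTuple):
    profile_id: int
    schema_id: int
    schema_version: int
    name: str
    default_stream_semantics: int


# The single first-round registry anchor, keyed by (schema_id, schema_version).
STANDARD_SCHEMAS = {
    (0x00001001, 3): StandardSchema(
        PROFILE_TOKEN, 0x00001001, 3, "llm.chat.delta.v1", STREAM_APPEND
    ),
}


def lookup_standard_schema(
    profile_id: int, schema_id: int, schema_version: int
) -> Optional[StandardSchema]:
    """Strict lookup: no fallback for schema_id 0 or a mismatched profile."""
    if schema_id == 0:
        return None  # "no default schema is bound", not an implicit alias
    entry = STANDARD_SCHEMAS.get((schema_id, schema_version))
    if entry is None or entry.profile_id != profile_id:
        return None
    return entry
```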
9.2 Boundary of minimal fields in the first-round descriptors
In the first round, preview3 requires typed payload descriptors to be able to stably bind at least the following public fields:
- profile_id
- schema_id
- schema_version
- stream_semantics
- offset
- length
- flags
On top of that, the minimum semantic-field boundary of different standard profiles is frozen as follows:
- A tensor-profile descriptor must be able to uniquely determine the numeric interpretation of the payload, including the entry point for shape/layout interpretation, the entry point for dtype interpretation, and whether profile-local coverage/section semantics exist.
- A token-profile descriptor must be able to uniquely determine the sequence interpretation of the payload, including token units, position/range vocabulary, incremental/terminal semantics, and whether stop-reason is explicitly given in this frame.
- Beyond the minimal fields above, any higher-level profile-private field must enter through schema/profile extensions, and must not directly elevate a private sampling or tensor-layout field of a single runtime into a mandatory public field.
9.2A Freezing of the fixed layout of typed payload descriptors
preview3 fixes the public layout of typed payload descriptors to 24 bytes in the first round:
| Field | Type | Description |
|---|---|---|
| profile_id | u16 | The profile to which this payload belongs |
| descriptor_flags | u16 | Descriptor behavior flags |
| schema_id | u32 | The schema id that interprets this payload |
| schema_version | u32 | The schema version that interprets this payload |
| stream_semantics | u16 | The stream semantics of this payload |
| reserved0 | u16 | Reserved; sender clears to 0 |
| offset | u32 | Byte offset relative to the typed-payload frame region |
| length | u32 | Byte length of the payload |
descriptor_flags:u16 freezes the following bit definitions in the first round:
| bit | Mask | Meaning |
|---|---|---|
| 0 | 0x0001 | terminal: this payload carries the terminal fragment of the current profile/operation |
| 1 | 0x0002 | partial: this payload is an incremental fragment that is consumable but non-terminal |
| 2 | 0x0004 | schema_override: this descriptor explicitly overrides the session default schema |
| 3 | 0x0008 | profile_hint_present: additional hints required for profile-local interpretation are present in the schema/profile body |
| 4-15 | Reserved | The sender clears them to 0; the receiver must reject unknown set bits |
Constraints in the first round:
- All standard profiles must use the same 24B descriptor public header and must not independently change byte layout by language or profile.
- The minimum interpretation entry of tensor and token is jointly determined by profile_id + schema_id + schema_version + stream_semantics + descriptor_flags; finer-grained fields continue to go through the schema/profile body.
- offset / length are always interpreted relative to the typed-payload frame region. No binding may change them to be relative to the entire packet body or some private subregion.
- terminal and partial may both be zero, but they must not simultaneously express mutually conflicting terminal semantics; profile-private terminal detail continues to be interpreted through schema/profile.
This allows preview3 to support more data types without needing to freeze an ever-expanding public bitmap table each time.
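The 24-byte descriptor and its offset / length rule can be illustrated with a short parsing sketch. It assumes little-endian fields as stated in the offset registry section. Treating terminal and partial both set as a conflict is one reading of the constraint above, not a confirmed rule, and all helper names are hypothetical.

```python
import struct
from typing import NamedTuple

# u16 u16 u32 u32 u16 u16 u32 u32, little-endian, no padding (24 bytes).
DESCRIPTOR = struct.Struct("<HHIIHHII")

FLAG_TERMINAL = 0x0001
FLAG_PARTIAL = 0x0002
FLAG_SCHEMA_OVERRIDE = 0x0004
FLAG_PROFILE_HINT_PRESENT = 0x0008
KNOWN_DESCRIPTOR_FLAGS = 0x000F


class PayloadDescriptor(NamedTuple):
    profile_id: int
    descriptor_flags: int
    schema_id: int
    schema_version: int
    stream_semantics: int
    reserved0: int
    offset: int
    length: int


def parse_descriptor(buf: bytes) -> PayloadDescriptor:
    """Decode one descriptor and reject reserved or conflicting flag bits."""
    d = PayloadDescriptor(*DESCRIPTOR.unpack_from(buf, 0))
    if d.descriptor_flags & ~KNOWN_DESCRIPTOR_FLAGS:
        raise ValueError("unknown descriptor_flags bits set")
    # Assumption: terminal + partial on the same descriptor is contradictory.
    if d.descriptor_flags & FLAG_TERMINAL and d.descriptor_flags & FLAG_PARTIAL:
        raise ValueError("terminal and partial are both set")
    return d


def payload_bytes(d: PayloadDescriptor, frame_region: bytes) -> bytes:
    """Slice the payload; offsets are relative to the typed-payload frame region."""
    end = d.offset + d.length
    if end > len(frame_region):
        raise ValueError("descriptor points outside the frame region")
    return frame_region[d.offset:end]
```

The caller passes only the typed-payload frame region to payload_bytes, never the whole packet body, which keeps the offset base unambiguous.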
9.4 Freezing of schema-registry flow and error behavior
In the first round, preview3 freezes the minimum schema-registry flow as:
- install: install a new schema when schema_id + schema_version + schema_hash is not yet present.
- update: install a higher schema_version under the same schema_id; policy may decide whether old versions remain available, but it must not mutate the schema_hash of an already installed version.
- invalidate: explicitly invalidate a schema by schema_id + schema_version or through dependency invalidation.
- version_conflict: if the same schema_id + schema_version arrives with a different schema_hash, the receiver must reject it and return a stable error.
schema_error_code:u32 is frozen in the first round as:
| Value | Name | Meaning |
|---|---|---|
| 0x00040000 | none | No schema error |
| 0x00040001 | schema_unknown | The requested schema_id does not exist |
| 0x00040002 | schema_version_unknown | The requested schema_version does not exist |
| 0x00040003 | schema_hash_conflict | The same schema_id + schema_version was presented with a different schema_hash |
| 0x00040004 | schema_incompatible | The schema is incompatible with the current profile, major version, or critical constraints |
| 0x00040005 | schema_dependency_missing | A schema dependency is missing or unavailable |
| 0x00040006 | schema_update_rejected | A schema update or invalidation request was rejected by policy |
First-round constraints:
- When schema_flags.critical is set and the receiver cannot recognize the schema, version, or dependency, it must return a stable schema_error_code and must not silently skip the schema.
- The binding between a typed payload descriptor and schema_id / schema_version / profile_id is strict; implementations must not rewrite it into a "looks close enough" heuristic.
- The standard install / update / invalidate / version_conflict flow must remain consistent across implementations; local integration layers must not privately reinterpret conflict outcomes.
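The registry flow and its error codes can be sketched with an in-memory table. This is an illustrative model only: the SchemaRegistry class and its methods are hypothetical, policy decisions and dependency invalidation are left out, and update is modeled as installing a further version under the same schema_id.

```python
from typing import Dict

# schema_error_code values from the table above.
SCHEMA_NONE = 0x00040000
SCHEMA_UNKNOWN = 0x00040001
SCHEMA_VERSION_UNKNOWN = 0x00040002
SCHEMA_HASH_CONFLICT = 0x00040003


class SchemaRegistry:
    """Hypothetical in-memory registry: schema_id -> {schema_version: schema_hash}."""

    def __init__(self) -> None:
        self._versions: Dict[int, Dict[int, int]] = {}

    def install(self, schema_id: int, schema_version: int, schema_hash: int) -> int:
        """Install or update; never mutate the hash of an installed version."""
        versions = self._versions.setdefault(schema_id, {})
        existing = versions.get(schema_version)
        if existing is not None and existing != schema_hash:
            return SCHEMA_HASH_CONFLICT  # version_conflict: reject, keep old hash
        versions[schema_version] = schema_hash
        return SCHEMA_NONE

    def resolve(self, schema_id: int, schema_version: int) -> int:
        """Report whether an exact schema_id + schema_version is available."""
        versions = self._versions.get(schema_id)
        if versions is None:
            return SCHEMA_UNKNOWN
        if schema_version not in versions:
            return SCHEMA_VERSION_UNKNOWN
        return SCHEMA_NONE

    def invalidate(self, schema_id: int, schema_version: int) -> int:
        """Explicitly invalidate one installed version."""
        status = self.resolve(schema_id, schema_version)
        if status == SCHEMA_NONE:
            del self._versions[schema_id][schema_version]
        return status
```

Re-installing an identical schema_id + schema_version + schema_hash is treated as a no-op here, which is a design choice of the sketch rather than something the text above mandates.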
10. preview3 Agent / Workflow Runtime Semantics
preview2 can already carry structured_event and tool_delta; what preview3 adds is their lifecycle semantics at runtime.
preview3 needs to express at least the following objects:
- operation_id: the unique identifier of an inference, generation, tool call, or multi-step workflow operation.
- parent_operation_id: used to express operation trees, subtasks, and dependency chains.
- operation_group_id: used for scheduling, canceling, or subscribing to results of a group of operations.
- operation_state: such as accepted / running / partial / waiting_tool / superseded / cancelled / failed / completed.
- cancel_scope: allows canceling a single operation, a subtree, a group, or an entire session.
The goal of these semantics is not to write all agent frameworks into a unified DSL, but to provide unified, cross-language lifecycle semantics for "multi-step AI workflows running in a single long-lived connection session."
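To make the lifecycle objects above concrete, the following sketch models an operation table and the four cancel scopes. Everything here is hypothetical: the string spellings of the scopes, the use of 0 for "no parent" and "no group", and the choice of which states count as final are assumptions for illustration, not frozen values.

```python
from dataclasses import dataclass
from typing import Dict, Set

# Assumption: these states are treated as final and are never re-cancelled.
FINAL_STATES = {"superseded", "cancelled", "failed", "completed"}


@dataclass
class Operation:
    operation_id: int
    session_id: int
    parent_operation_id: int = 0  # assumption: 0 means "no parent"
    operation_group_id: int = 0   # assumption: 0 means "no group"
    operation_state: str = "accepted"


def select_scope(ops: Dict[int, Operation], cancel_scope: str, target: int) -> Set[int]:
    """Return the operation ids covered by a cancel request."""
    if cancel_scope == "operation":
        return {oid for oid in ops if oid == target}
    if cancel_scope == "subtree":
        selected = {target} if target in ops else set()
        grew = bool(selected)
        while grew:  # repeatedly pull in children of already selected nodes
            grew = False
            for op in ops.values():
                if op.parent_operation_id in selected and op.operation_id not in selected:
                    selected.add(op.operation_id)
                    grew = True
        return selected
    if cancel_scope == "group":
        return {o.operation_id for o in ops.values() if o.operation_group_id == target}
    if cancel_scope == "session":
        return {o.operation_id for o in ops.values() if o.session_id == target}
    raise ValueError(f"unknown cancel_scope: {cancel_scope}")


def cancel(ops: Dict[int, Operation], cancel_scope: str, target: int) -> Set[int]:
    """Mark every non-final operation in scope as cancelled; return their ids."""
    cancelled = set()
    for oid in select_scope(ops, cancel_scope, target):
        if ops[oid].operation_state not in FINAL_STATES:
            ops[oid].operation_state = "cancelled"
            cancelled.add(oid)
    return cancelled
```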
10.1 Freezing the ownership boundary of structured_event / tool_delta
preview3 explicitly freezes the following boundaries in the first round:
- structured_event and tool_delta still belong to payload families by default and are not automatically elevated into standalone profiles.
- Only when an event affects an operation lifecycle that must interoperate across languages does its minimum semantics enter the public operation model; otherwise it remains in the schema/profile payload layer.
- operation_id, parent_operation_id, operation_group_id, operation_state, and cancel_scope belong to public lifecycle semantics and must be interpretable independently of concrete payloads.
- Higher-level content such as tool-call parameters, tool-result bodies, and rich event payloads continue by default to remain in structured_event / tool_delta payloads and are interpreted through the schema/profile registry.
- Therefore, preview3 does not hard-code tool-call bodies or event bodies into public fixed metadata; the public layer freezes only lifecycle, routing, cancellation, and state-transition semantics.
11. preview3 Flow Control, Recovery, and Observability
preview3 needs to push the flow control and migration of preview2 one level further.
At minimum, it should add:
- Dual-layer acknowledgment of connection-level credit and session-level credit.
- Priority-aware FLOW_UPDATE, allowing the server to adjust windows independently for different sessions / operations.
- Recovery, resume token, and resume_from_operation semantics under multi-session scenarios.
- Unified result / event / control observability fields so multi-language hosts can stably record queue, compute, transport, backpressure, cache-hit, and lease events.
In the first round, preview3 explicitly freezes the recovery object as the session; a frame is not a recovery object, and an operation is only a watermark and observability boundary within session recovery.
11.1 Freezing the three-scope FLOW_UPDATE and its metadata
In the first round, preview3 fixes FLOW_UPDATE to 32 bytes of fixed metadata for uniformly expressing connection-, session-, and operation-level credit and backpressure updates, rather than allowing different implementations to define private credit packets independently.
| Field | Type | Description |
|---|---|---|
| scope_kind | u8 | Update scope; values are frozen below |
| update_reason | u8 | Reason for the update; values are frozen below |
| backpressure_level | u8 | Current backpressure level; values are frozen below |
| reserved0 | u8 | Reserved; sender clears to 0 |
| connection_credit | u16 | Connection-level parallel credit |
| session_credit | u16 | Session-level parallel credit |
| operation_credit | u16 | Operation-level parallel credit |
| reserved1 | u16 | Reserved; sender clears to 0 |
| operation_id | u64 | Points to the target operation when scope_kind=operation; otherwise 0 |
| retry_after_ms | u32 | Suggested retry or reprobe window; 0 if absent |
| credit_epoch | u32 | Monotonically increasing credit-update epoch |
| flow_flags | u32 | Flow-control behavior bitmap |
scope_kind:u8 is frozen in the first round as:
| Value | Name | Semantics |
|---|---|---|
| 0 | connection | Update the total credit or total backpressure state of the entire connection |
| 1 | session | Update the credit or backpressure state of a specific session |
| 2 | operation | Update the credit or backpressure state of a specific operation |
update_reason:u8 is frozen in the first round as:
| Value | Name | Semantics |
|---|---|---|
| 0 | grant | Newly grant credit or relax restrictions |
| 1 | reduce | Tighten the credit window |
| 2 | pause | Pause sending new operations |
| 3 | resume | Resume from the paused state |
| 4 | congestion | Enter rate limiting or backpressure due to congestion |
backpressure_level:u8 is frozen in the first round as:
| Value | Name | Semantics |
|---|---|---|
| 0 | none | No backpressure |
| 1 | soft | The sender is advised to slow down proactively, but is not forced to stop |
| 2 | hard | The sender should stop submitting new operations until a later relaxed update is received |
flow_flags:u32 freezes the following bit definitions in the first round:
| bit | Mask | Meaning |
|---|---|---|
| 0 | 0x00000001 | credit_valid: the credit field for the corresponding scope is valid |
| 1 | 0x00000002 | retry_after_valid: retry_after_ms is valid |
| 2 | 0x00000004 | background_only: only background or low-priority operations may continue progressing |
| 3 | 0x00000008 | drain_in_flight_only: only existing in-flight operations may drain; no new operations are accepted |
| 4-31 | Reserved | The sender clears them to 0; the receiver must reject unknown set bits |
Constraints in the first round:
- When scope_kind=connection, header session_id must be 0, operation_id must be 0, and the sender reads only connection_credit.
- When scope_kind=session, header session_id must be the target session, operation_id must be 0, and the sender prioritizes reading session_credit.
- When scope_kind=operation, header session_id must be the target session, operation_id must be non-zero, and the sender prioritizes reading operation_credit.
- credit_epoch must be monotonically increasing on the same scope; the receiver must not accept updates older than the current epoch.
- hard backpressure is not an error. It indicates that the new submission window has been temporarily tightened; the sender should wait for a later grant / resume or a FLOW_UPDATE with a higher epoch.
- This fixed metadata solves only unified routing and control of credit/backpressure and does not carry profile-private queueing metrics. More fine-grained observability data should still be extended through schema/profile or dedicated observability paths.
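The scope and epoch constraints above can be sketched as a parser plus a small epoch tracker. This assumes little-endian fields as stated in the offset registry section. The names FlowUpdate, parse_flow_update, and EpochTracker are hypothetical, the header session_id is passed in separately because it lives in the common header, and the text does not say how an equal epoch is handled, so this sketch accepts it.

```python
import struct
from typing import Dict, NamedTuple, Tuple

# u8 x4, u16 x4, u64, u32 x3: little-endian, no padding (32 bytes).
FLOW_UPDATE = struct.Struct("<BBBBHHHHQIII")

SCOPE_CONNECTION, SCOPE_SESSION, SCOPE_OPERATION = 0, 1, 2
KNOWN_FLOW_FLAGS = 0x0000000F


class FlowUpdate(NamedTuple):
    scope_kind: int
    session_id: int
    operation_id: int
    credit: int  # the credit field that applies to this scope
    credit_epoch: int
    backpressure_level: int


def parse_flow_update(header_session_id: int, meta: bytes) -> FlowUpdate:
    """Decode FLOW_UPDATE metadata and enforce the per-scope id rules."""
    (scope_kind, _reason, level, _r0, conn_credit, sess_credit, op_credit,
     _r1, operation_id, _retry_after_ms, credit_epoch, flow_flags) = FLOW_UPDATE.unpack(meta)
    if flow_flags & ~KNOWN_FLOW_FLAGS:
        raise ValueError("unknown flow_flags bits set")
    if scope_kind == SCOPE_CONNECTION:
        if header_session_id != 0 or operation_id != 0:
            raise ValueError("connection scope requires session_id=0 and operation_id=0")
        credit = conn_credit
    elif scope_kind == SCOPE_SESSION:
        if operation_id != 0:
            raise ValueError("session scope requires operation_id=0")
        credit = sess_credit
    elif scope_kind == SCOPE_OPERATION:
        if operation_id == 0:
            raise ValueError("operation scope requires a non-zero operation_id")
        credit = op_credit
    else:
        raise ValueError("unassigned scope_kind")
    return FlowUpdate(scope_kind, header_session_id, operation_id, credit, credit_epoch, level)


class EpochTracker:
    """Remember the latest credit_epoch seen per (scope, session, operation)."""

    def __init__(self) -> None:
        self._latest: Dict[Tuple[int, int, int], int] = {}

    def accept(self, update: FlowUpdate) -> bool:
        key = (update.scope_kind, update.session_id, update.operation_id)
        latest = self._latest.get(key)
        if latest is not None and update.credit_epoch < latest:
            return False  # older than the current epoch: must not be applied
        self._latest[key] = update.credit_epoch
        return True
```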
11.2 Freezing of the recovery object and resume_from_operation
In the first round, preview3 freezes the following recovery semantics:
- resume_token is always bound to a session rather than a connection or frame.
- resume_from_operation_id is an optional watermark within session recovery and declares "resume terminal results and events after this operation".
- On successful recovery, SESSION_OPEN_ACK.session_status must return resumed, and the server must continue delivering unfinished or unacknowledged operation lifecycle events on the resumed session.
- If the resume_token is invalid, expired, unauthorized, or incompatible with the requested profile/schema/session capabilities, the server must return session_error_code = resume_rejected.
- Recovery must not treat historical frames as independent recovery objects; any frame-level or packet-level compensation remains subordinate to session recovery semantics.
12. preview3 Fixed-Layout Offset Registry
This section collects the fixed metadata / descriptor layouts frozen above into one offset registry for implementations, golden vectors, and conformance runners. All offsets are byte offsets from the beginning of the corresponding structure, and all multi-byte fields are little-endian.
12.1 SESSION_OPEN Metadata, 48 Bytes
| offset | Field | Type |
|---|---|---|
| 0 | requested_session_id | u32 |
| 4 | profile_id | u16 |
| 6 | priority_class | u8 |
| 7 | session_flags | u8 |
| 8 | schema_id | u32 |
| 12 | schema_version | u32 |
| 16 | default_deadline_ms | u32 |
| 20 | max_in_flight_operations | u16 |
| 22 | reserved0 | u16 |
| 24 | lease_ttl_hint_ms | u32 |
| 28 | resume_token_bytes | u32 |
| 32 | auth_bytes | u32 |
| 36 | session_extension_bytes | u32 |
| 40 | client_session_tag | u64 |
12.2 SESSION_OPEN_ACK Metadata, 56 Bytes
| offset | Field | Type |
|---|---|---|
| 0 | session_id | u32 |
| 4 | accepted_profile_id | u16 |
| 6 | accepted_priority_class | u8 |
| 7 | session_status | u8 |
| 8 | schema_id | u32 |
| 12 | schema_version | u32 |
| 16 | granted_operation_credit | u16 |
| 18 | max_in_flight_operations | u16 |
| 20 | lease_ttl_ms | u32 |
| 24 | resume_window_ms | u32 |
| 28 | resume_token_bytes | u32 |
| 32 | session_extension_bytes | u32 |
| 36 | server_session_tag | u64 |
| 44 | route_scope_id | u32 |
| 48 | session_error_code | u32 |
| 52 | session_flags_ack | u32 |
12.3 SESSION_CLOSE Metadata, 24 Bytes
| offset | Field | Type |
|---|---|---|
| 0 | close_reason | u16 |
| 2 | in_flight_policy | u8 |
| 3 | reserved0 | u8 |
| 4 | drain_timeout_ms | u32 |
| 8 | last_operation_id | u64 |
| 16 | session_error_code | u32 |
| 20 | session_close_tag | u32 |
12.4 SESSION_CLOSE_ACK Metadata, 16 Bytes
| offset | Field | Type |
|---|---|---|
| 0 | close_status | u8 |
| 1 | reserved0 | u8 |
| 2 | reserved1 | u16 |
| 4 | last_operation_id | u64 |
| 12 | session_error_code | u32 |
12.5 FLOW_UPDATE Metadata, 32 Bytes
| offset | Field | Type |
|---|---|---|
| 0 | scope_kind | u8 |
| 1 | update_reason | u8 |
| 2 | backpressure_level | u8 |
| 3 | reserved0 | u8 |
| 4 | connection_credit | u16 |
| 6 | session_credit | u16 |
| 8 | operation_credit | u16 |
| 10 | reserved1 | u16 |
| 12 | operation_id | u64 |
| 20 | retry_after_ms | u32 |
| 24 | credit_epoch | u32 |
| 28 | flow_flags | u32 |
12.6 Schema Descriptor Header, 32 Bytes
| offset | Field | Type |
|---|---|---|
| 0 | schema_id | u32 |
| 4 | schema_version | u32 |
| 8 | profile_id | u16 |
| 10 | schema_flags | u16 |
| 12 | min_version_major | u8 |
| 13 | max_version_major | u8 |
| 14 | reserved0 | u16 |
| 16 | body_bytes | u32 |
| 20 | dependency_count | u16 |
| 22 | default_stream_semantics | u16 |
| 24 | schema_hash | u64 |
12.7 Typed Payload Descriptor, 24 Bytes
| offset | Field | Type |
|---|---|---|
| 0 | profile_id | u16 |
| 2 | descriptor_flags | u16 |
| 4 | schema_id | u32 |
| 8 | schema_version | u32 |
| 12 | stream_semantics | u16 |
| 14 | reserved0 | u16 |
| 16 | offset | u32 |
| 20 | length | u32 |
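A conformance runner could cross-check the registry tables in this section mechanically. The sketch below encodes each layout as a Python struct format string (a transcription of the Type columns, with "<" selecting little-endian and no padding) and derives sizes and field offsets from it. The LAYOUTS table and field_offsets helper are hypothetical names.

```python
import struct
from typing import Dict, List, Tuple

# Format codes: B = u8, H = u16, I = u32, Q = u64.
# Each entry: (format string transcribed from the table, declared size in bytes).
LAYOUTS: Dict[str, Tuple[str, int]] = {
    "SESSION_OPEN": ("<IHBBIIIHHIIIIQ", 48),
    "SESSION_OPEN_ACK": ("<IHBBIIHHIIIIQIII", 56),
    "SESSION_CLOSE": ("<HBBIQII", 24),
    "SESSION_CLOSE_ACK": ("<BBHQI", 16),
    "FLOW_UPDATE": ("<BBBBHHHHQIII", 32),
    "SCHEMA_DESCRIPTOR_HEADER": ("<IIHHBBHIHHQ", 32),
    "TYPED_PAYLOAD_DESCRIPTOR": ("<HHIIHHII", 24),
}


def field_offsets(fmt: str) -> List[int]:
    """Byte offset of every field for a '<'-prefixed, single-code-per-field format."""
    offsets, position = [], 0
    for code in fmt[1:]:
        offsets.append(position)
        position += struct.calcsize("<" + code)
    return offsets


def check_layouts() -> None:
    """Raise AssertionError if any transcribed layout disagrees with its declared size."""
    for name, (fmt, declared) in LAYOUTS.items():
        actual = struct.calcsize(fmt)
        assert actual == declared, f"{name}: {actual} != {declared}"
```

Running check_layouts() once at test start is a cheap guard against a binding drifting from the frozen sizes; field_offsets can then be compared row by row against the offset columns.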
13. Protocol Freeze Summary
This document currently freezes the following protocol topics:
- The NNRP/1.0 code-level identity, the 40-byte common header, the msg_type assignment table, and the meta_len + body_len length model.
- The layered boundary among connection, session, and operation, including SESSION_OPEN / SESSION_OPEN_ACK, explicit session close, recovery objects, and routing semantics.
- Runtime semantics such as FLOW_UPDATE, priority, cancel scope, operation lifecycle, and recovery watermarks.
- Cache lease, schema/profile registry, the 32-byte schema descriptor, the 24-byte typed payload descriptor, the fixed offset registry, and their standard error vocabulary.
- The boundary between structured_event / tool_delta payload families and the public lifecycle model, as well as the cross-implementation baseline of conformance, golden vectors, enum values, and error codes.
Concrete implementation-facing API shape, packaging, and release strategy are outside the freeze scope of this document.