# ONNX Runtime

Informally, the ONNX runtime consists of a computation graph and a heap-like memory where tensors reside. The runtime evaluates the graph in dataflow fashion: each node executes once its inputs are ready and writes its outputs back to memory. Together, the graph and the memory form the runtime machine.
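The "execute each node once its inputs are ready" rule can be illustrated with a small sketch. It uses Python's standard `graphlib.TopologicalSorter`; the three-node graph and the `dataflow_order` helper are hypothetical names chosen for illustration, not part of any ONNX runtime API.

```python
from graphlib import TopologicalSorter

# Hypothetical three-node graph: each node maps to the set of nodes whose
# outputs it consumes (its predecessors).
deps = {
    "matmul": set(),       # reads only graph inputs
    "add": {"matmul"},     # consumes the MatMul output
    "relu": {"add"},       # consumes the Add output
}

def dataflow_order(deps):
    """Return node names in an order where every node follows its producers."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not a DAG
    order = []
    while sorter.is_active():
        # get_ready() yields the nodes whose inputs are all available;
        # sorting only makes the order deterministic for this example.
        for name in sorted(sorter.get_ready()):
            order.append(name)   # stand-in for executing the node
            sorter.done(name)    # its outputs are now available to consumers
    return order

order = dataflow_order(deps)
```

Any order produced this way is a valid topological order of the graph, which is the property the step transition defined later relies on.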

**Definition 1.1 (ONNX Runtime State).** *The machine state of an ONNX runtime is a pair $(G, M)$. $G$ is a directed acyclic computation graph whose nodes represent operator invocations (e.g., Add, MatMul, Relu) and whose edges represent tensor data dependencies. $M$ is a linear, read-write, byte-addressable memory array of size $N$ bytes ($M: [0..N) \to u8$), used to store all tensor data (inputs, intermediate results, and outputs). Each tensor occupies a contiguous region within $M$, and tensor metadata (such as shape and type) is tracked separately in a tensor table.*
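As a rough illustration of the memory half of this definition, the sketch below models $M$ as a Python `bytearray` and the tensor table as a dictionary. The `TensorEntry` class, the `store`/`load` helpers, the little-endian row-major layout, and the use of `struct` format characters as dtypes are all assumptions made for this example; the definition itself does not prescribe them.

```python
import struct
from dataclasses import dataclass

@dataclass(frozen=True)
class TensorEntry:
    """Hypothetical tensor-table entry: where a tensor lives and how to read it."""
    offset: int   # byte offset of the first element within M
    dtype: str    # struct format character, e.g. "f" for float32 (assumed encoding)
    shape: tuple  # tensor dimensions

    @property
    def num_elements(self):
        n = 1
        for d in self.shape:
            n *= d
        return n

    @property
    def nbytes(self):
        # Size of the contiguous region the tensor occupies in M.
        return self.num_elements * struct.calcsize("<" + self.dtype)

N = 64
M = bytearray(N)  # M: [0..N) -> u8, zero-initialised
T = {"x": TensorEntry(offset=0, dtype="f", shape=(2, 2))}  # tensor table

def store(M, entry, values):
    """Write a flat list of values into the tensor's contiguous region."""
    fmt = "<%d%s" % (entry.num_elements, entry.dtype)
    struct.pack_into(fmt, M, entry.offset, *values)

def load(M, entry):
    """Read the tensor's region back as a flat list of values."""
    fmt = "<%d%s" % (entry.num_elements, entry.dtype)
    return list(struct.unpack_from(fmt, M, entry.offset))

store(M, T["x"], [1.0, 2.0, 3.0, 4.0])
values = load(M, T["x"])
```

Keeping shape and type in the table rather than in $M$ means the memory itself stays a plain byte array, as the definition requires.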

**Definition 1.2 (ONNX node format).** Any ONNX node (operator) can be written in the following format: $[op\_type, inputs, outputs, attributes]$, where:

* $op\_type$: a string identifying the operator (e.g., "$Add$", "$Relu$")
* $inputs$: a list of input tensor names (each mapped to memory)
* $outputs$: a list of output tensor names
* $attributes$: constant parameters specific to the operator (e.g., axis in $Softmax$)
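The four-field format can be mirrored directly in code. The following is a minimal sketch using a hypothetical `Node` dataclass; the class, its `as_list` method, and the tensor names are illustrative assumptions rather than the representation used by any particular ONNX library.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """Hypothetical record mirroring [op_type, inputs, outputs, attributes]."""
    op_type: str                                     # operator name, e.g. "Add"
    inputs: list                                     # input tensor names
    outputs: list                                    # output tensor names
    attributes: dict = field(default_factory=dict)   # constant operator parameters

    def as_list(self):
        # The bracketed form used in Definition 1.2.
        return [self.op_type, self.inputs, self.outputs, self.attributes]

# An element-wise addition has no attributes.
add = Node("Add", ["a", "b"], ["sum"])

# Softmax carries its axis as a constant attribute.
softmax = Node("Softmax", ["sum"], ["probs"], {"axis": -1})
```

Note that the node refers to tensors only by name; locating the bytes behind a name is the job of the tensor table.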

**Definition 1.3 (ONNX Step Transition).** Given:

* A machine state $(G, M)$, where $G$ is the current computation graph and $M$ is linear memory
* A topologically sorted node list $[n_0, n_1, \ldots, n_k]$
* A tensor table $T$ mapping tensor names to $(offset, dtype, shape)$ in memory

We define the step transition for ONNX as:

1. Select the next node $n_i = [op\_type, inputs, outputs, attributes]$ in topological order, in the format of Definition 1.2. Any subgraph of a control-flow operator (e.g., If, Loop) is carried among its $attributes$.
2. Read the input tensors $x_0, \ldots, x_p$ named in $inputs$ from memory $M$ using the metadata in $T$: each tensor is loaded from its $offset$ and interpreted according to its $dtype$ and $shape$.
3. Apply the operator function $f_{op}$ selected by $op\_type$ to the input tensors and $attributes$: $[y_0, \ldots, y_m] \leftarrow f_{op}(x_0, \ldots, x_p, attributes)$
4. Write the outputs $y_0, \ldots, y_m$ to memory $M$ at the offsets that $T$ assigns to the tensor names in $outputs$.
5. Advance to the next node $n_{i+1}$ and repeat until all nodes in $G$ have been executed.
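The five steps above can be sketched as a tiny interpreter. This is an illustration under stated assumptions, not the runtime's actual implementation: nodes use the four-field format of Definition 1.2, the tensor table `T` is pre-populated with `(offset, dtype, shape)` tuples for every tensor, dtypes are `struct` format characters, values are stored little-endian, and the hypothetical operator table `OPS` covers only element-wise `Add` and `Relu` on equally shaped tensors (no broadcasting).

```python
import struct

# Hypothetical operator table: each f_op takes a list of flat input tensors
# and the attribute dict, and returns a list of flat output tensors.
OPS = {
    "Add": lambda xs, attrs: [[a + b for a, b in zip(xs[0], xs[1])]],
    "Relu": lambda xs, attrs: [[max(v, 0.0) for v in xs[0]]],
}

def count(shape):
    n = 1
    for d in shape:
        n *= d
    return n

def load(M, T, name):
    offset, dtype, shape = T[name]
    return list(struct.unpack_from("<%d%s" % (count(shape), dtype), M, offset))

def store(M, T, name, values):
    offset, dtype, shape = T[name]
    struct.pack_into("<%d%s" % (count(shape), dtype), M, offset, *values)

def step(node, M, T):
    op_type, inputs, outputs, attributes = node   # step 1: the selected node
    xs = [load(M, T, name) for name in inputs]    # step 2: read inputs via T
    ys = OPS[op_type](xs, attributes)             # step 3: apply f_op
    for name, y in zip(outputs, ys):              # step 4: write outputs into M
        store(M, T, name, y)

def run(nodes, M, T):
    for node in nodes:                            # step 5: advance in sorted order
        step(node, M, T)

# Four float32 tensors of four elements each, laid out back to back in M.
M = bytearray(64)
T = {
    "a": (0, "f", (4,)),
    "b": (16, "f", (4,)),
    "sum": (32, "f", (4,)),
    "out": (48, "f", (4,)),
}
store(M, T, "a", [1.0, -2.0, 3.0, -4.0])
store(M, T, "b", [0.5, 0.5, -5.0, 4.0])

nodes = [
    ["Add", ["a", "b"], ["sum"], {}],
    ["Relu", ["sum"], ["out"], {}],
]
run(nodes, M, T)
result = load(M, T, "out")   # [1.5, 0.0, 0.0, 0.0]
```

Because the node list is topologically sorted, the `sum` tensor is already in memory by the time `Relu` reads it, so `step` never needs to check readiness itself.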
