48 changes: 46 additions & 2 deletions config.md
@@ -18,7 +18,7 @@ config JSON file. Model config JSON files MUST be valid JSON objects.
Contains metadata describing the model.
- **format**: string, REQUIRED

-The packaging format of the model file(s). Currently the only supported value is `gguf`.
+The packaging format of the model file(s). Supported values are `gguf` and `safetensors`.

- **format_version**: string, OPTIONAL

@@ -30,6 +30,13 @@ config JSON file. Model config JSON files MUST be valid JSON objects.
standardized [key-value pairs](https://github.com/ggml-org/ggml/blob/master/docs/gguf.md#general) defined in
the GGUF specification.

- **safetensors**: object, OPTIONAL

Contains metadata specific to the `safetensors` format. May include fields such as:
- **architecture**: string, OPTIONAL - The model architecture (e.g., "llama", "qwen2", "mistral")
- **parameter_count**: string, OPTIONAL - The total number of parameters (e.g., "7.24 B", "13 B")


- **size**: string, REQUIRED

The total size of the model in bytes.
@@ -45,7 +52,9 @@ config JSON file. Model config JSON files MUST be valid JSON objects.

The media type of the file. This indicates the type of the file and how it should be interpreted.
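
As a rough illustration of the field rules above, the following Python sketch checks the `config` object of a parsed model config. The function name `validate_model_config` is hypothetical, and the rule that `size` contains only ASCII decimal digits is an assumption made for this example; the text above only says that `size` is a string giving the size in bytes.

```python
SUPPORTED_FORMATS = {"gguf", "safetensors"}


def validate_model_config(doc: dict) -> list[str]:
    """Return a list of problems found in the `config` object (empty if none)."""
    config = doc.get("config")
    if not isinstance(config, dict):
        return ["missing or non-object `config`"]

    problems = []

    # `format` is REQUIRED and must be one of the supported values.
    fmt = config.get("format")
    if fmt not in SUPPORTED_FORMATS:
        problems.append(f"unsupported format: {fmt!r}")

    # `size` is REQUIRED and is a string. Assumption: it holds decimal digits only.
    size = config.get("size")
    if not (isinstance(size, str) and size.isascii() and size.isdigit()):
        problems.append("`size` must be a string of decimal digits")

    # `safetensors` is OPTIONAL; when present, its listed fields are strings.
    st = config.get("safetensors")
    if st is not None:
        if not isinstance(st, dict):
            problems.append("`safetensors` must be an object")
        else:
            for key in ("architecture", "parameter_count"):
                if key in st and not isinstance(st[key], str):
                    problems.append(f"`safetensors.{key}` must be a string")

    return problems
```

The sketch deliberately ignores unknown keys, since the description of the `safetensors` object says it "may include fields such as" the two listed, which suggests the set is open-ended.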

-## Example
+## Examples

### GGUF Model

```json
{
@@ -79,3 +88,38 @@ config JSON file. Model config JSON files MUST be valid JSON objects.
}
```

### Safetensors Model (Sharded)

```json
{
  "descriptor": {
    "createdAt": "2025-01-01T00:00:00Z"
  },
  "config": {
    "format": "safetensors",
    "safetensors": {
      "architecture": "qwen2",
      "parameter_count": "3.09 B"
    },
    "size": "6171926992"
  },
  "files": [
    {
      "diffID": "sha256:67347b23fb4165b652eb6611f5e1f2a06dfcddba8e909df1b2b0b1857bee06c2",
      "type": "application/vnd.docker.ai.safetensors"
    },
    {
      "diffID": "sha256:a40d941d0e7e0b966ad8b62bb6d6b7c88cce1299197b599d9d0a4ce59aabfc1d",
      "type": "application/vnd.docker.ai.safetensors"
    },
    {
      "diffID": "sha256:5acfb0cc82593273b8c9032239bbe897b80d17b185d8e7ae148afe21cb188067",
      "type": "application/vnd.docker.ai.vllm.config.tar"
    },
    {
      "diffID": "sha256:d0ce8fae4da6de6e5a4b85ebee156ac8f3ab6d8407caf4493968d34e9bc3939e",
      "type": "application/vnd.docker.ai.license"
    }
  ]
}
```
46 changes: 45 additions & 1 deletion spec.md
@@ -11,11 +11,20 @@ All layers blobs SHOULD contain the contents of a single file. Layers SHOULD NOT
- `application/vnd.docker.ai.gguf.v3` - A file adhering to version 3 of the [GGUF specification](https://github.com/ggml-org/ggml/blob/master/docs/gguf.md), containing a tensor model.
- `application/vnd.docker.ai.gguf.v3.lora` - A file adhering to version 3 of the GGUF specification, containing a LoRA adapter.
- `application/vnd.docker.ai.gguf.v3.mmproj` - A file containing multimodal projector weights in GGUF format, used to bridge vision and language models by projecting visual features into the language model's embedding space.
- `application/vnd.docker.ai.safetensors` - A file adhering to the [safetensors specification](https://github.com/huggingface/safetensors), a safe and fast serialization format for machine learning tensors.
- `application/vnd.docker.ai.vllm.config.tar` - A tar archive containing configuration files (`*.json`) and metadata files (e.g., `merges.txt`) used by inference engines.
- `application/vnd.docker.ai.license` - A plain text file containing a software license.
- `application/vnd.docker.ai.chat.template.jinja` - A text file containing a [Jinja](https://jinja.palletsprojects.com/en/stable/) prompt template, used to define chat/inference formatting.

-## Example Manifest
### Sharded Models
Both the GGUF and safetensors formats support sharded models, in which the model weights are split across multiple files. In such cases:
- Multiple layers with the same media type (e.g., `application/vnd.docker.ai.safetensors` or `application/vnd.docker.ai.gguf.v3`) represent different shards of the same model.
- The order of layers in the manifest defines the shard sequence.
- Shards typically follow naming conventions such as `model-00001-of-00002.safetensors` or `model-00001-of-00002.gguf`.
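
The ordering rule above can be sketched in a few lines of Python. The helper name `shard_layers` is hypothetical, and the example manifest is trimmed to the fields the helper reads; the sketch simply keeps the layers of one media type in the order the manifest lists them, which by the rule above is the shard sequence.

```python
def shard_layers(manifest: dict, media_type: str) -> list[dict]:
    """Return the layers with the given media type, in manifest order.

    List order is preserved, so the result is the shard sequence
    (first shard first).
    """
    return [
        layer
        for layer in manifest.get("layers", [])
        if layer.get("mediaType") == media_type
    ]


# A manifest trimmed to media type and size (digests omitted for brevity).
manifest = {
    "layers": [
        {"mediaType": "application/vnd.docker.ai.safetensors", "size": 3968658944},
        {"mediaType": "application/vnd.docker.ai.safetensors", "size": 2203268048},
        {"mediaType": "application/vnd.docker.ai.license", "size": 13},
    ]
}

shards = shard_layers(manifest, "application/vnd.docker.ai.safetensors")
print([layer["size"] for layer in shards])  # [3968658944, 2203268048]
```

Because the manifest carries no file names, a consumer that needs names such as `model-00001-of-00002.safetensors` would have to derive them from the position in this list or from other metadata; how that is done is not specified here.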

## Example Manifests

### GGUF Model
```json
{
"schemaVersion": 2,
@@ -39,3 +48,38 @@ All layers blobs SHOULD contain the contents of a single file. Layers SHOULD NOT
]
}
```

### Safetensors Model (Sharded)
```json
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "config": {
    "mediaType": "application/vnd.docker.ai.model.config.v0.1+json",
    "size": 465,
    "digest": "sha256:2ea258562df7df57407d739f3215419dd1827093d4a8057386b0c723fa011305"
  },
  "layers": [
    {
      "mediaType": "application/vnd.docker.ai.safetensors",
      "size": 3968658944,
      "digest": "sha256:67347b23fb4165b652eb6611f5e1f2a06dfcddba8e909df1b2b0b1857bee06c2"
    },
    {
      "mediaType": "application/vnd.docker.ai.safetensors",
      "size": 2203268048,
      "digest": "sha256:a40d941d0e7e0b966ad8b62bb6d6b7c88cce1299197b599d9d0a4ce59aabfc1d"
    },
    {
      "mediaType": "application/vnd.docker.ai.vllm.config.tar",
      "size": 11530752,
      "digest": "sha256:5acfb0cc82593273b8c9032239bbe897b80d17b185d8e7ae148afe21cb188067"
    },
    {
      "mediaType": "application/vnd.docker.ai.license",
      "size": 13,
      "digest": "sha256:d0ce8fae4da6de6e5a4b85ebee156ac8f3ab6d8407caf4493968d34e9bc3939e"
    }
  ]
}
```
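
The sharded safetensors manifest above and the matching config example in `config.md` line up in three ways: the `files` entries appear in the same order as the `layers`, each `diffID` equals the corresponding layer `digest`, and the config `size` equals the sum of the two safetensors shard sizes (3968658944 + 2203268048 = 6171926992). The Python sketch below checks those three relationships. They are observations about these two examples, not rules stated in the text, and `check_consistency` is a hypothetical helper. In particular, a `diffID` would be expected to match the layer digest only when the layer blob is stored uncompressed.

```python
WEIGHT_MEDIA_TYPE = "application/vnd.docker.ai.safetensors"

# (media type, size in bytes, digest) for each layer of the example manifest.
ENTRIES = [
    (WEIGHT_MEDIA_TYPE, 3968658944,
     "sha256:67347b23fb4165b652eb6611f5e1f2a06dfcddba8e909df1b2b0b1857bee06c2"),
    (WEIGHT_MEDIA_TYPE, 2203268048,
     "sha256:a40d941d0e7e0b966ad8b62bb6d6b7c88cce1299197b599d9d0a4ce59aabfc1d"),
    ("application/vnd.docker.ai.vllm.config.tar", 11530752,
     "sha256:5acfb0cc82593273b8c9032239bbe897b80d17b185d8e7ae148afe21cb188067"),
    ("application/vnd.docker.ai.license", 13,
     "sha256:d0ce8fae4da6de6e5a4b85ebee156ac8f3ab6d8407caf4493968d34e9bc3939e"),
]

# Both documents trimmed to the fields the check reads.
manifest = {
    "layers": [{"mediaType": t, "size": s, "digest": d} for t, s, d in ENTRIES]
}
config_doc = {
    "config": {"format": "safetensors", "size": "6171926992"},
    "files": [{"diffID": d, "type": t} for t, _, d in ENTRIES],
}


def check_consistency(config_doc: dict, manifest: dict) -> list[str]:
    """Compare a model config with its manifest; return a list of mismatches."""
    files = config_doc["files"]
    layers = manifest["layers"]
    problems = []

    if len(files) != len(layers):
        problems.append(f"{len(files)} files but {len(layers)} layers")

    # Assumption: files[i] describes layers[i], and layers are uncompressed.
    for i, (f, layer) in enumerate(zip(files, layers)):
        if f["type"] != layer["mediaType"]:
            problems.append(f"entry {i}: type differs from layer mediaType")
        if f["diffID"] != layer["digest"]:
            problems.append(f"entry {i}: diffID differs from layer digest")

    # Assumption: `size` counts only the weight shards, as in the example.
    weight_bytes = sum(
        layer["size"] for layer in layers if layer["mediaType"] == WEIGHT_MEDIA_TYPE
    )
    if int(config_doc["config"]["size"]) != weight_bytes:
        problems.append(f"config size does not equal shard total {weight_bytes}")

    return problems


print(check_consistency(config_doc, manifest))  # []
```

Note that the config tar and license layers are excluded from the total; including them would give a different figure than the `size` shown in the config example.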