# Data tiers: gold, silver, raw

> What gold, silver, and raw rows mean in Apprentice, and which tiers count for optimization versus eval gates.

Apprentice assigns a tier to every row in a task's dataset. The tier decides what a row is allowed to do: drive optimization, certify quality, both, or neither. Getting this right is the difference between a real result and a flattering one.

## The three tiers that matter

* **Gold** is human-verified. A person confirmed the output is correct for the input. Gold is the only tier trusted to certify quality.
* **Silver** is frontier-model output that passed a deterministic check (for example, valid JSON for a JSON task). It is plausible but not human-confirmed.
* **Raw** is everything else: live captured traffic that no one has verified yet.

A captured trace arrives as **raw**. You promote it to gold by verifying it, or you upload already-curated rows as silver. A row you reject during review is marked `rejected` and drops out of every count.
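The lifecycle above can be sketched in a few lines of Python. This is an illustrative model only, not Apprentice's implementation or SDK: the `Tier` enum, the `Row` dataclass, and the `review` and `counts` helpers are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    # Hypothetical enum; the string values mirror the tier names on this page.
    GOLD = "gold"
    SILVER = "silver"
    RAW = "raw"
    REJECTED = "rejected"


@dataclass
class Row:
    # Hypothetical row shape, used only for illustration.
    input: str
    output: str
    tier: Tier = Tier.RAW  # a captured trace arrives as raw


def review(row: Row, approved: bool) -> Row:
    """Human review: approval promotes the row to gold, rejection marks it rejected."""
    row.tier = Tier.GOLD if approved else Tier.REJECTED
    return row


def counts(rows: list[Row]) -> dict[str, int]:
    """Count rows per tier; rejected rows drop out of every count."""
    result = {"gold": 0, "silver": 0, "raw": 0}
    for row in rows:
        if row.tier is not Tier.REJECTED:
            result[row.tier.value] += 1
    return result
```

In this sketch, a rejected row stays in the list but contributes to none of the three counts, which matches the behavior described above.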

## What each tier is allowed to do

Two rules hold across the product, and each governs a different activity:

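* **Optimization uses verified rows: gold plus silver.** "Verified" here covers both kinds of check: human confirmation for gold and the deterministic check for silver. More verified rows give the optimizer more signal, so silver helps here.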
* **Eval gates and model promotion use gold only.** Quality is certified against human-verified rows, never against model-generated ones.

Raw never counts toward either. It is a candidate for verification, not evidence.
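As a minimal sketch, the two rules amount to two different filters over the same rows. The dictionary row shape and the helper names below are assumptions made for this example, not part of the Apprentice SDK.

```python
# Hypothetical tier sets that encode the two rules on this page.
OPTIMIZATION_TIERS = {"gold", "silver"}  # verified rows drive optimization
GATE_TIERS = {"gold"}                    # only human-verified rows certify quality


def rows_for_optimization(rows: list[dict]) -> list[dict]:
    """Keep gold and silver rows; raw and rejected rows are excluded."""
    return [row for row in rows if row["tier"] in OPTIMIZATION_TIERS]


def rows_for_gate(rows: list[dict]) -> list[dict]:
    """Keep gold rows only; silver never certifies quality."""
    return [row for row in rows if row["tier"] in GATE_TIERS]


if __name__ == "__main__":
    # Made-up rows, for illustration only.
    dataset = [
        {"id": 1, "tier": "gold"},
        {"id": 2, "tier": "silver"},
        {"id": 3, "tier": "raw"},
        {"id": 4, "tier": "rejected"},
    ]
    print([row["id"] for row in rows_for_optimization(dataset)])
    print([row["id"] for row in rows_for_gate(dataset)])
```

Keeping the two tier sets as separate constants makes it hard to widen the gate by accident: adding silver to the gate would be a visible one-line change, not a side effect of changing the optimizer's inputs.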

## Why the split matters

If silver could certify quality, you would be grading the model with the model's own homework: frontier output checking frontier output. That hides regressions instead of catching them. By letting silver help you optimize but never letting it certify, the gate stays honest. A run that does not beat the baseline on gold is a real result, not a tuning artifact.

The `DatasetStatus` returned by the SDK reports the counts directly: `gold`, `silver`, `raw`, and `ready_for_optimization`. See the [Python SDK reference](/reference/python-sdk#datasets) for the fields.
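To show how those counts relate to the two rules, the sketch below uses a stand-in dataclass, not the SDK's `DatasetStatus`. It makes no SDK calls. `StatusCounts` and the helper functions are hypothetical, and `ready_for_optimization` is left out because its type is not described on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatusCounts:
    """Stand-in holding three of the count fields named above (not the real SDK type)."""
    gold: int
    silver: int
    raw: int


def optimization_row_count(status: StatusCounts) -> int:
    """Rows the optimizer may use: gold plus silver."""
    return status.gold + status.silver


def gate_row_count(status: StatusCounts) -> int:
    """Rows that can certify quality: gold only."""
    return status.gold


def awaiting_verification(status: StatusCounts) -> int:
    """Raw rows are candidates for verification and count toward neither rule."""
    return status.raw
```

With made-up counts of 40 gold, 160 silver, and 800 raw rows, the optimizer would see 200 rows, an eval gate would see 40, and 800 would still be waiting for review.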
