Install
Set your keys
Run it
golden.csv needs an input column and an output column. For a JSON task, the output is the exact JSON you want back.
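As a rough illustration of that layout, the sketch below builds a two-row golden file with Python's standard library. The header names `input` and `output`, the sample rows, and the helper `build_golden_csv` are assumptions made for this example, not names confirmed by the tool; check the exact headers it expects before relying on them.

```python
import csv
import io
import json

# Hypothetical example rows: free-text input paired with the exact JSON we want back.
rows = [
    ("Invoice 1042 from Acme, total $310.50",
     {"vendor": "Acme", "invoice_id": "1042", "total": 310.5}),
    ("Invoice 77 from Globex, total $12.00",
     {"vendor": "Globex", "invoice_id": "77", "total": 12.0}),
]


def build_golden_csv(rows):
    """Return CSV text with one input column and one output column.

    Column names are an assumption for illustration.
    """
    buf = io.StringIO(newline="")
    writer = csv.writer(buf)
    writer.writerow(["input", "output"])
    for text, expected in rows:
        # Serialize the expected object once so the output cell holds exact JSON.
        # csv.writer quotes the cell and escapes the embedded double quotes.
        writer.writerow([text, json.dumps(expected, sort_keys=True)])
    return buf.getvalue()


golden_csv = build_golden_csv(rows)
```

To put the file on disk, write `golden_csv` to `golden.csv` opened with `newline=""` so the row terminators are not translated. Serializing with `json.dumps` rather than typing JSON by hand avoids quoting mistakes in the output cell.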
What you get
report.baseline_score and report.optimized_score are field-level scores computed on rows the optimizer held out. Treat the result as evidence for this dataset, not as a universal benchmark. report.optimized_prompt is the rewritten instruction, ready to paste back into your app.
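To make "field-level score" concrete, here is one plausible way such a score could be computed for a JSON task: the fraction of expected fields whose values match exactly, averaged over held-out rows. This is an assumption for illustration only; the tool's actual metric may differ, and `field_score` and `mean_score` are hypothetical helpers, not part of its API.

```python
import json


def field_score(expected_json: str, actual_json: str) -> float:
    """Fraction of expected top-level fields reproduced exactly (assumed metric)."""
    expected = json.loads(expected_json)
    try:
        actual = json.loads(actual_json)
    except json.JSONDecodeError:
        return 0.0  # unparseable model output earns no credit
    if not isinstance(actual, dict):
        return 0.0
    if not expected:
        return 1.0 if actual == expected else 0.0
    matched = sum(1 for key, value in expected.items()
                  if key in actual and actual[key] == value)
    return matched / len(expected)


def mean_score(pairs) -> float:
    """Average field_score over (expected_json, actual_json) held-out pairs."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    return sum(field_score(e, a) for e, a in pairs) / len(pairs)
```

Under this sketch, a row where one of two expected fields is wrong scores 0.5, so the score degrades gradually instead of dropping to zero on a single mismatch.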
If optimized_score does not beat baseline_score, that is a real result, not a failure of the tool. Add cleaner verified rows, or pick a metric that fits the task, then run again. We never report a gain that is not there.
Next
JSON extraction, end to end
The same flow with a real dataset and the optimized prompt pulled back into code.
Capture from LangChain
Build the dataset from your live traffic instead of a CSV.