GEPAzilla

Optimizer Console

Blend scorer metrics, track latency, and let GEPA iterate your system prompt on a lightweight dataset. GEPAzilla keeps every token local while it crunches through your evaluation stack.

System Prompt
Add your system prompt below. It is the starting point that GEPA iterates on and improves.
Scoring Criteria
Add evaluation metrics for the outputs the model generates with the system prompt above. These metrics judge the outputs and steer GEPA toward better prompts.

GEPA already optimizes request latency internally; the Results tab always includes the latency metric.
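
For intuition, each criterion can be thought of as a small function that maps a model output to a score between 0 and 1, and the total as a blend of those scores. The Python sketch below assumes hypothetical metrics named after the columns in the dataset table (length ratio, summary heading, action items heading); the names, signatures, and equal weighting are illustrative, not GEPAzilla's scorer API.

```python
# Hypothetical scorers, loosely mirroring the criteria columns in the dataset
# table. Each returns a score in [0, 1]; names and signatures are illustrative.

def length_ratio(output: str, source: str, target: float = 0.25) -> float:
    """Score how close len(output) / len(source) is to a target ratio."""
    if not source:
        return 0.0
    ratio = len(output) / len(source)
    return max(0.0, 1.0 - abs(ratio - target) / target)

def has_summary_heading(output: str) -> float:
    """Reward outputs whose first non-empty line is a 'Summary' heading."""
    lines = [line for line in output.splitlines() if line.strip()]
    return 1.0 if lines and lines[0].strip().lower().startswith("summary") else 0.0

def has_action_items_heading(output: str) -> float:
    """Reward outputs that contain an 'Action items' heading."""
    return 1.0 if "action items" in output.lower() else 0.0

def blended_score(output: str, source: str) -> float:
    """Equal-weight blend of the criteria, analogous to the Total column."""
    parts = [
        length_ratio(output, source),
        has_summary_heading(output),
        has_action_items_heading(output),
    ]
    return sum(parts) / len(parts)
```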

Dataset
Add training data for GEPA to improve the system prompt. Mark rows as validation to test GEPA without exposing them to the model. Expected outputs are optional and help benchmark changes.
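
As a rough illustration of the row shape, the sketch below assumes a plain dictionary per row with an input, an optional expected output, and a train/validation flag; the field names are hypothetical, not GEPAzilla's import format.

```python
# A minimal sketch of dataset rows, assuming a plain dict shape; the field
# names ("input", "expected_output", "split") are hypothetical, not the
# actual GEPAzilla schema.
dataset = [
    {
        "input": "Raw meeting transcript ...",
        "expected_output": "Summary\n...\n\nAction items\n...",  # optional target
        "split": "train",
    },
    {
        "input": "Another transcript ...",
        "expected_output": None,  # expected outputs may be omitted
        "split": "validation",
    },
]

# Training rows feed GEPA's prompt edits; validation rows are held out.
train_rows = [row for row in dataset if row["split"] == "train"]
validation_rows = [row for row in dataset if row["split"] == "validation"]
```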

Training rows fuel prompt edits; validation rows stay untouched and are used only to rank candidates on the Pareto frontier. Click a row’s Use pill to switch it between Training and Validation.
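
To make that gating concrete, here is a generic Pareto-dominance filter over per-metric validation scores: a candidate prompt survives only if no other candidate is at least as good on every metric and strictly better on one. This is a minimal sketch of the idea, not GEPA's internal selection logic, and the candidate names and scores are made up.

```python
from typing import Sequence

def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b on every metric and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_frontier(candidates: dict[str, Sequence[float]]) -> list[str]:
    """Keep only candidate prompts whose validation scores are not dominated."""
    return [
        name
        for name, scores in candidates.items()
        if not any(
            dominates(other, scores)
            for other_name, other in candidates.items()
            if other_name != name
        )
    ]

# Hypothetical per-candidate validation scores (higher is better) for three metrics,
# e.g. length ratio, summary heading, action items heading.
candidates = {
    "seed prompt": (0.55, 0.60, 0.40),
    "candidate A": (0.80, 0.70, 0.35),
    "candidate B": (0.70, 0.90, 0.50),
    "candidate C": (0.60, 0.65, 0.30),  # dominated by candidate A
}
print(pareto_frontier(candidates))  # ['candidate A', 'candidate B']
```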

Dataset table columns: #, Use, Input, Expected Output (optional), Length ratio, Summary heading, Action items heading, Uses [REDACTED], Total, Actions.
Five rows: Train 3, Validation 2 (Pareto gate).
Run status: Idle, elapsed 0:00.