GEPAzilla

Optimizer Console

Blend scorer metrics, track latency, and let GEPA iterate your system prompt on a lightweight dataset. GEPAzilla keeps every token local while it crunches through your evaluation stack.

System Prompt
Add your system prompt below. It is the starting point that GEPA iterates on and improves.
Scoring Criteria
Add evaluation metrics for the outputs the model generates with the system prompt above. These metrics judge the outputs and steer GEPA toward better prompts.

GEPA already optimizes request latency internally; the Results tab always includes the latency metric.
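
For intuition, each criterion can be thought of as a small function that maps a model output to a score between 0 and 1, and the total as a blend of those scores. The Python sketch below assumes hypothetical metrics named after the columns in the dataset table (length ratio, summary heading, action items heading); the names, signatures, and equal weighting are illustrative, not GEPAzilla's scorer API.

```python
# Hypothetical scorers, loosely mirroring the criteria columns in the dataset
# table. Each returns a score in [0, 1]; names and signatures are illustrative.

def length_ratio(output: str, source: str, target: float = 0.25) -> float:
    """Score how close len(output) / len(source) is to a target ratio."""
    if not source:
        return 0.0
    ratio = len(output) / len(source)
    return max(0.0, 1.0 - abs(ratio - target) / target)

def has_summary_heading(output: str) -> float:
    """Reward outputs whose first non-empty line is a 'Summary' heading."""
    lines = [line for line in output.splitlines() if line.strip()]
    return 1.0 if lines and lines[0].strip().lower().startswith("summary") else 0.0

def has_action_items_heading(output: str) -> float:
    """Reward outputs that contain an 'Action items' heading."""
    return 1.0 if "action items" in output.lower() else 0.0

def blended_score(output: str, source: str) -> float:
    """Equal-weight blend of the criteria, analogous to the Total column."""
    parts = [
        length_ratio(output, source),
        has_summary_heading(output),
        has_action_items_heading(output),
    ]
    return sum(parts) / len(parts)
```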

Dataset
Add training data for GEPA to improve the system prompt. Mark rows as validation to test GEPA without exposing them to the model. Expected outputs are optional and help benchmark changes.
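
As a rough illustration of the row shape, the sketch below assumes a plain dictionary per row with an input, an optional expected output, and a train/validation flag; the field names are hypothetical, not GEPAzilla's import format.

```python
# A minimal sketch of dataset rows, assuming a plain dict shape; the field
# names ("input", "expected_output", "split") are hypothetical, not the
# actual GEPAzilla schema.
dataset = [
    {
        "input": "Raw meeting transcript ...",
        "expected_output": "Summary\n...\n\nAction items\n...",  # optional target
        "split": "train",
    },
    {
        "input": "Another transcript ...",
        "expected_output": None,  # expected outputs may be omitted
        "split": "validation",
    },
]

# Training rows feed GEPA's prompt edits; validation rows are held out.
train_rows = [row for row in dataset if row["split"] == "train"]
validation_rows = [row for row in dataset if row["split"] == "validation"]
```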

Training rows fuel prompt edits; validation rows stay untouched and are used only to rank candidates on the Pareto frontier. Click a row’s Use pill to switch it between Training and Validation.
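
To make that gating concrete, here is a generic Pareto-dominance filter over per-metric validation scores: a candidate prompt survives only if no other candidate is at least as good on every metric and strictly better on one. This is a minimal sketch of the idea, not GEPA's internal selection logic, and the candidate names and scores are made up.

```python
from typing import Sequence

def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b on every metric and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_frontier(candidates: dict[str, Sequence[float]]) -> list[str]:
    """Keep only candidate prompts whose validation scores are not dominated."""
    return [
        name
        for name, scores in candidates.items()
        if not any(
            dominates(other, scores)
            for other_name, other in candidates.items()
            if other_name != name
        )
    ]

# Hypothetical per-candidate validation scores (higher is better) for three metrics,
# e.g. length ratio, summary heading, action items heading.
candidates = {
    "seed prompt": (0.55, 0.60, 0.40),
    "candidate A": (0.80, 0.70, 0.35),
    "candidate B": (0.70, 0.90, 0.50),
    "candidate C": (0.60, 0.65, 0.30),  # dominated by candidate A
}
print(pareto_frontier(candidates))  # ['candidate A', 'candidate B']
```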

Dataset table columns: #, Use, Input, Expected Output (optional), Length ratio, Summary heading, Action items heading, Uses [REDACTED], Total, Actions.
Five rows: Train 3, Validation 2 (Pareto gate).
Run status: Idle, elapsed 0:00.