Use case

Text-to-SQL Prompt Review

Compare generated SQL text before anything is executed, then check each query for expected fragments and forbidden operations.

What to evaluate

Schema-aware prompts

Include the relevant schema in each row so model outputs can be reviewed against the same table and column context.
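As a minimal sketch of what carrying the schema in each row can look like, the helper below renders one row into a prompt. The function name `build_prompt` and the prompt wording are assumptions for illustration; only the `request` and `schema_context` field names come from the example dataset row on this page, and none of this is a PromptLens API.

```python
# Hypothetical helper: not part of PromptLens. Field names mirror the example dataset row.
def build_prompt(row: dict) -> str:
    """Render one dataset row into a prompt that carries its own schema context."""
    return (
        "Write a single read-only SQL query.\n"
        f"Schema: {row['schema_context']}\n"
        f"Request: {row['request']}\n"
        "Use only the tables and columns listed in the schema."
    )

row = {
    "request": "Show revenue by month for paid accounts.",
    "schema_context": "accounts(id, plan), invoices(account_id, paid_at, amount)",
}
prompt = build_prompt(row)
```

Because the schema travels with the row, every model under comparison sees the same table and column context for the same request.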

Safety checks

Use regex checks to fail generated queries that contain destructive operations or ignore explicit safety instructions.
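A regex safety check of this kind could look roughly like the sketch below. The keyword list and the function names are assumptions, so extend or trim them to match your own safety instructions.

```python
import re

# Assumed deny-list of destructive keywords; adjust to your own safety rules.
DESTRUCTIVE = re.compile(
    r"\b(DELETE|DROP|TRUNCATE|ALTER|UPDATE|INSERT)\b", re.IGNORECASE
)

def destructive_keywords(sql: str) -> list[str]:
    """Return the destructive keywords found in the query text, upper-cased and sorted."""
    return sorted({match.group(1).upper() for match in DESTRUCTIVE.finditer(sql)})

def passes_safety(sql: str) -> bool:
    """A query passes when no destructive keyword appears as a whole word."""
    return not destructive_keywords(sql)
```

The word boundaries keep column names such as `updated_at` from tripping the check. This is a pure text match, so a keyword inside a string literal or a comment is flagged too, which is usually the conservative choice for a review step.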

Expected query shape

Check whether the query text references required tables, filters, joins, and aggregation fragments.
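One way to express a fragment check is a case-insensitive substring match after whitespace is collapsed, sketched below. The helper names are hypothetical, and the sample query uses `date_trunc`, which assumes a PostgreSQL-style dialect.

```python
import re

def normalize(sql: str) -> str:
    """Lower-case the text and collapse whitespace so formatting differences do not matter."""
    return re.sub(r"\s+", " ", sql).strip().lower()

def missing_fragments(sql: str, must_include: list[str]) -> list[str]:
    """Return the expected fragments that do not appear in the query text."""
    text = normalize(sql)
    return [frag for frag in must_include if normalize(frag) not in text]

# Sample output to check; the dialect and aliases are illustrative only.
sql = """
SELECT date_trunc('month', paid_at) AS month, SUM(amount) AS revenue
FROM invoices
GROUP BY
    month
"""
missing = missing_fragments(sql, ["paid_at", "amount", "GROUP BY month"])
```

A substring match only shows that a fragment is present in the text; it does not show that the query is correct, which is why the check is framed around query shape.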

Regression review

Compare a new SQL prompt against a known-good baseline before reviewers allow it into an agent flow.

Checks

Build the comparison around observable failures.

PromptLens works best when each model is judged against the same dataset rows and pass criteria.

References expected tables
Uses expected filters
Avoids destructive statements
Includes requested aggregation
Explains assumptions clearly

Evaluation example

A practical SQL prompt comparison

Use PromptLens to compare the SQL text and reasoning, not to execute production database queries.

The report helps reviewers see which model followed the supplied schema context and which one produced unsafe or unexpected query text.

A candidate model is viable only when it matches your baseline on every text-safety and expected-fragment check.
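The baseline comparison can be sketched as a search for regressions, meaning checks the baseline passed that the candidate did not. The result shape used here, a mapping from row id to check name to pass or fail, is an assumption for illustration and not a PromptLens export format.

```python
# Hypothetical result shape: {row_id: {check_name: passed}}.
Profile = dict[str, dict[str, bool]]

def regressions(baseline: Profile, candidate: Profile) -> list[tuple[str, str]]:
    """List (row_id, check) pairs that the baseline passed but the candidate did not."""
    found = []
    for row_id, checks in baseline.items():
        for check, passed in checks.items():
            # A missing row or check on the candidate side counts as a failure.
            if passed and not candidate.get(row_id, {}).get(check, False):
                found.append((row_id, check))
    return found

def is_viable(baseline: Profile, candidate: Profile) -> bool:
    """Under this sketch, a candidate is viable when it introduces no regressions."""
    return not regressions(baseline, candidate)

baseline = {"row-1": {"safety": True, "fragments": True}}
candidate = {"row-1": {"safety": True, "fragments": False}}
```

Here `regressions(baseline, candidate)` returns the single pair `("row-1", "fragments")`, so the candidate would not be treated as viable.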

Example dataset row

{
  "request": "Show revenue by month for paid accounts.",
  "schema_context": "accounts(id, plan), invoices(account_id, paid_at, amount)",
  "must_include": ["paid_at", "amount", "GROUP BY month"],
  "fail_if": ["DELETE", "DROP", "unknown table"]
}
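The `unknown table` entry in `fail_if` reads as a condition rather than a literal string to search for. A rough sketch of how it might be evaluated is to compare the tables named after `FROM` and `JOIN` with the tables in `schema_context`. The helper names and the parsing approach are assumptions, not documented PromptLens behavior.

```python
import re

def schema_tables(schema_context: str) -> set[str]:
    """Extract table names from text like 'accounts(id, plan), invoices(...)'."""
    return {name.lower() for name in re.findall(r"(\w+)\s*\(", schema_context)}

def referenced_tables(sql: str) -> set[str]:
    """Collect identifiers that directly follow FROM or JOIN (a rough heuristic)."""
    pattern = r"\b(?:FROM|JOIN)\s+(\w+)"
    return {name.lower() for name in re.findall(pattern, sql, re.IGNORECASE)}

def unknown_tables(sql: str, schema_context: str) -> set[str]:
    """Return referenced tables that are absent from the supplied schema context."""
    return referenced_tables(sql) - schema_tables(schema_context)

schema = "accounts(id, plan), invoices(account_id, paid_at, amount)"
good = "SELECT SUM(amount) FROM invoices JOIN accounts ON accounts.id = invoices.account_id"
bad = "SELECT * FROM payments"
```

This heuristic is deliberately simple. Constructs such as `EXTRACT(month FROM paid_at)` or schema-qualified names would produce false positives, so treat a hit as a prompt for human review rather than a verdict.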

Turn this workflow into a report.

Compare the model outputs, score the failures, and share the decision record with the team.