Use case

Text-to-SQL Prompt Review

Compare generated SQL as text before anything is executed, then check each output for expected query fragments and forbidden operations.


What to evaluate

Schema-aware prompts

Include the relevant schema in each dataset row so every model output is reviewed against the same table and column context.

Safety checks

Use regex checks to fail generated queries that contain destructive operations or ignore explicit safety instructions.
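
A regex safety check of this kind might look like the sketch below. It is plain Python and does not use any PromptLens API. The keyword list is an assumption for illustration (only DELETE and DROP appear on this page), and the function name is hypothetical.

```python
import re

# Hypothetical keyword list; extend or trim it to match your own safety policy.
DESTRUCTIVE_KEYWORDS = ["DELETE", "DROP", "TRUNCATE", "ALTER", "UPDATE", "INSERT"]

# \b anchors the match to whole words, so a column such as "dropped_at"
# or "updated_at" does not trigger a failure.
DESTRUCTIVE_PATTERN = re.compile(
    r"\b(" + "|".join(DESTRUCTIVE_KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def find_destructive_keywords(sql_text: str) -> list[str]:
    """Return the destructive keywords found in the SQL text, upper-cased and sorted."""
    return sorted({m.group(1).upper() for m in DESTRUCTIVE_PATTERN.finditer(sql_text)})


# A row fails the safety check when the returned list is non-empty.
print(find_destructive_keywords("SELECT dropped_at FROM accounts"))
print(find_destructive_keywords("drop table invoices; DELETE FROM accounts"))
```

Because this matches on text alone, a keyword inside a string literal or SQL comment would also be flagged. For a pre-execution review that is usually an acceptable false positive, but it is worth keeping in mind.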

Expected query shape

Check whether the query text references required tables, filters, joins, and aggregation fragments.
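
One way to check for expected fragments is a case-insensitive substring test after whitespace is normalized. This is a minimal sketch in plain Python, not PromptLens's own matching logic, and the helper names are hypothetical.

```python
import re


def normalize(text: str) -> str:
    """Collapse runs of whitespace and lower-case the text for comparison."""
    return re.sub(r"\s+", " ", text).strip().lower()


def missing_fragments(sql_text: str, required: list[str]) -> list[str]:
    """Return the required fragments that do not appear in the SQL text."""
    haystack = normalize(sql_text)
    return [frag for frag in required if normalize(frag) not in haystack]


sql = """
SELECT date_trunc('month', paid_at) AS month, SUM(amount) AS revenue
FROM invoices
GROUP BY   month
"""
# An empty list means every required fragment was found.
print(missing_fragments(sql, ["paid_at", "amount", "GROUP BY month"]))
```

Substring matching is deliberately loose: `amount` would also match a column named `amount_due`. Where that matters, a word-boundary regex per fragment is a stricter alternative.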

Regression review

Compare a new SQL prompt against a known-good baseline before reviewers allow it into an agent flow.
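
The regression comparison can be reduced to a per-row question: which rows passed for the baseline but fail for the candidate? The sketch below assumes each run has already been reduced to a mapping from row id to pass or fail. That data shape and the function name are assumptions for illustration, not a PromptLens export format.

```python
def find_regressions(baseline: dict[str, bool], candidate: dict[str, bool]) -> list[str]:
    """Return ids of rows that passed for the baseline but fail, or are missing, for the candidate."""
    return sorted(
        row_id
        for row_id, passed in baseline.items()
        if passed and not candidate.get(row_id, False)
    )


# Hypothetical results for three dataset rows.
baseline_results = {"row-1": True, "row-2": True, "row-3": False}
candidate_results = {"row-1": True, "row-2": False, "row-3": True}

# row-2 regressed; row-3 improved and is therefore not reported.
print(find_regressions(baseline_results, candidate_results))
```

Treating a missing row as a failure is a conservative choice: a candidate that was never run on a row has not shown that it is safe on that row.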

Checks

Build the comparison around observable failures.

PromptLens works best when each model is judged against the same dataset rows and pass criteria.

References expected tables

Uses expected filters

Avoids destructive statements

Includes requested aggregation

Explains assumptions clearly

Evaluation example

A practical SQL prompt comparison

Use PromptLens to compare the SQL text and reasoning, not to execute production database queries.

The report helps reviewers see which model followed the supplied schema context and which one produced unsafe or unexpected query text.

A candidate model is viable only when it matches your baseline on both the text-safety checks and the expected-fragment checks.

Example dataset row

{
  request: "Show revenue by month for paid accounts.",
  schema_context: "accounts(id, plan), invoices(account_id, paid_at, amount)",
  must_include: ["paid_at", "amount", "GROUP BY month"],
  fail_if: ["DELETE", "DROP", "unknown table"]
}
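
The `"unknown table"` entry in `fail_if` appears to describe a condition, not a literal string to search for. One possible reading is to compare the tables a query references against the tables named in `schema_context`. The sketch below takes that reading as an assumption. The helper names are hypothetical, and the FROM/JOIN regex is a rough heuristic, not a SQL parser.

```python
import re


def tables_in_schema(schema_context: str) -> set[str]:
    """Extract table names from a string such as 'accounts(id, plan), invoices(...)'."""
    return {name.lower() for name in re.findall(r"(\w+)\s*\(", schema_context)}


def referenced_tables(sql_text: str) -> set[str]:
    """Collect identifiers that directly follow FROM or JOIN."""
    matches = re.findall(r"\b(?:FROM|JOIN)\s+(\w+)", sql_text, re.IGNORECASE)
    return {name.lower() for name in matches}


def unknown_tables(sql_text: str, schema_context: str) -> list[str]:
    """Return referenced tables that are not declared in the schema context."""
    return sorted(referenced_tables(sql_text) - tables_in_schema(schema_context))


schema_context = "accounts(id, plan), invoices(account_id, paid_at, amount)"
good_sql = (
    "SELECT date_trunc('month', i.paid_at) AS month, SUM(i.amount) "
    "FROM invoices i JOIN accounts a ON a.id = i.account_id GROUP BY month"
)
bad_sql = "SELECT SUM(total) FROM payments"

print(unknown_tables(good_sql, schema_context))
print(unknown_tables(bad_sql, schema_context))
```

This heuristic has known blind spots. `EXTRACT(MONTH FROM paid_at)` would report `paid_at` as a table, CTE names would be flagged as unknown, and a schema-qualified name such as `public.invoices` would be read as `public`. Flagged rows are best treated as prompts for human review, not as definitive failures.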

Turn this workflow into a report.

Compare the model outputs, score the failures, and share the decision record with the team.