LLM Testing Knowledge Base
Everything you need to know about prompt engineering, LLM evaluation, and building reliable AI applications.
Prompt Regression
A prompt regression occurs when changes to a prompt or its context cause the LLM to produce lower-quality outputs than before. This can happen due to prompt edits, model updates, or changes to system context.
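As a rough illustration, a regression check might run a fixed set of test inputs through both the old and new prompt versions and flag any case whose score drops. The sketch below is hypothetical: run_old, run_new, and score stand in for whatever model-calling and scoring functions your project already uses.

```python
from typing import Callable, Iterable

def find_regressions(
    run_old: Callable[[str], str],   # calls the model with the old prompt
    run_new: Callable[[str], str],   # calls the model with the new prompt
    score: Callable[[str, str], float],  # scores an output against its input
    test_inputs: Iterable[str],
    tolerance: float = 0.05,
) -> list[dict]:
    """Flag test inputs where the new prompt scores worse than the old one."""
    regressions = []
    for text in test_inputs:
        old_score = score(run_old(text), text)
        new_score = score(run_new(text), text)
        # A drop larger than the tolerance counts as a regression.
        if new_score < old_score - tolerance:
            regressions.append({"input": text, "old": old_score, "new": new_score})
    return regressions
```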
LLM Evaluation Metrics
LLM evaluation metrics are quantitative and qualitative measures used to assess the quality, accuracy, and usefulness of large language model outputs.
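Quantitative metrics can be as simple as string-level checks. The two functions below are illustrative examples only (exact match rate and keyword coverage), not a specific library's API.

```python
def exact_match_rate(outputs: list[str], references: list[str]) -> float:
    """Fraction of outputs that exactly match their reference answer."""
    matches = sum(o.strip() == r.strip() for o, r in zip(outputs, references))
    return matches / len(outputs) if outputs else 0.0

def keyword_coverage(output: str, required_keywords: list[str]) -> float:
    """Fraction of required keywords that appear in the output (case-insensitive)."""
    text = output.lower()
    hits = sum(kw.lower() in text for kw in required_keywords)
    return hits / len(required_keywords) if required_keywords else 1.0
```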
Prompt Engineering Glossary
A comprehensive reference of terms and concepts used in prompt engineering and LLM application development.
Quality Scoring
Quality scoring is the process of systematically evaluating LLM outputs against defined criteria to produce numerical scores that enable comparison and threshold-based decisions.
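One common pattern, sketched below under assumed criteria, is to combine per-criterion scores into a weighted overall score and compare it against a pass threshold. The criterion names, weights, and threshold here are placeholders, not recommended values.

```python
def quality_score(criterion_scores: dict[str, float], weights: dict[str, float]) -> float:
    """Combine per-criterion scores (each 0-1) into a weighted overall score."""
    total_weight = sum(weights.values())
    return sum(criterion_scores[name] * w for name, w in weights.items()) / total_weight

# Illustrative criteria, weights, and scores.
weights = {"accuracy": 0.5, "completeness": 0.3, "tone": 0.2}
scores = {"accuracy": 0.9, "completeness": 0.7, "tone": 1.0}

overall = quality_score(scores, weights)   # 0.86
passes = overall >= 0.8                    # threshold-based decision: accept or review
```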
Prompt Testing Best Practices
A collection of proven methods and workflows for systematically testing LLM prompts to ensure quality, reliability, and safety in production applications.
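One widely used practice is table-driven testing: encode representative inputs and expectations as data and run them with a standard test runner. The pytest sketch below assumes a placeholder generate function standing in for whatever code calls your model with the prompt under test.

```python
import pytest

def generate(user_input: str) -> str:
    # Placeholder: in a real suite this would call the model with the prompt under test.
    return f"Thanks for asking about {user_input}. Please contact support for refunds."

# Each case pairs a user input with keywords the output is expected to contain.
CASES = [
    ("refund policy", ["refund"]),
    ("contact support", ["support"]),
]

@pytest.mark.parametrize("user_input,expected_keywords", CASES)
def test_prompt_covers_expected_keywords(user_input, expected_keywords):
    output = generate(user_input).lower()
    for keyword in expected_keywords:
        assert keyword in output
```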
LLM Output Validation
LLM output validation is the process of checking model outputs for correctness, safety, format compliance, and quality before presenting them to users or using them in downstream systems.
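For outputs expected to be JSON, a minimal validation pass might check parseability, required fields, and length before anything downstream consumes the result. The helper below is a sketch; the field names and limits are assumptions, not fixed rules.

```python
import json

def validate_output(raw_output: str, required_fields: list[str], max_chars: int = 2000):
    """Return (is_valid, reasons) for a model output expected to be a JSON object."""
    reasons = []
    if len(raw_output) > max_chars:
        reasons.append("output exceeds length limit")
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError:
        return False, ["output is not valid JSON"]
    if not isinstance(data, dict):
        reasons.append("top-level JSON value is not an object")
    else:
        for field in required_fields:
            if field not in data:
                reasons.append(f"missing required field: {field}")
    return (not reasons), reasons
```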
Put Your Knowledge Into Practice
Use PromptLens to apply professional LLM testing techniques to your projects.
Start Free