LLM Testing Knowledge Base
Everything you need to know about prompt engineering, LLM evaluation, and building reliable AI applications.
Prompt Regression
A prompt regression occurs when changes to a prompt or its context cause the LLM to produce lower-quality outputs than before. This can happen due to prompt edits, model updates, or changes to system context.
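As a rough illustration, a regression check might run a fixed set of test inputs through both the old and new prompt versions and flag any case whose score drops. The sketch below is hypothetical: run_old, run_new, and score stand in for whatever model-calling and scoring functions your project already uses.

```python
from typing import Callable, Iterable

def find_regressions(
    run_old: Callable[[str], str],   # calls the model with the old prompt
    run_new: Callable[[str], str],   # calls the model with the new prompt
    score: Callable[[str, str], float],  # scores an output against its input
    test_inputs: Iterable[str],
    tolerance: float = 0.05,
) -> list[dict]:
    """Flag test inputs where the new prompt scores worse than the old one."""
    regressions = []
    for text in test_inputs:
        old_score = score(run_old(text), text)
        new_score = score(run_new(text), text)
        # A drop larger than the tolerance counts as a regression.
        if new_score < old_score - tolerance:
            regressions.append({"input": text, "old": old_score, "new": new_score})
    return regressions
```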
LLM Evaluation Metrics
LLM evaluation metrics are quantitative and qualitative measures used to assess the quality, accuracy, and usefulness of large language model outputs.
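Quantitative metrics can be as simple as string-level checks. The two functions below are illustrative examples only (exact match rate and keyword coverage), not a specific library's API.

```python
def exact_match_rate(outputs: list[str], references: list[str]) -> float:
    """Fraction of outputs that exactly match their reference answer."""
    matches = sum(o.strip() == r.strip() for o, r in zip(outputs, references))
    return matches / len(outputs) if outputs else 0.0

def keyword_coverage(output: str, required_keywords: list[str]) -> float:
    """Fraction of required keywords that appear in the output (case-insensitive)."""
    text = output.lower()
    hits = sum(kw.lower() in text for kw in required_keywords)
    return hits / len(required_keywords) if required_keywords else 1.0
```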
Prompt Engineering Glossary
A comprehensive reference of terms and concepts used in prompt engineering and LLM application development.
Quality Scoring
Quality scoring is the process of systematically evaluating LLM outputs against defined criteria to produce numerical scores that enable comparison and threshold-based decisions.
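One common pattern, sketched below under assumed criteria, is to combine per-criterion scores into a weighted overall score and compare it against a pass threshold. The criterion names, weights, and threshold here are placeholders, not recommended values.

```python
def quality_score(criterion_scores: dict[str, float], weights: dict[str, float]) -> float:
    """Combine per-criterion scores (each 0-1) into a weighted overall score."""
    total_weight = sum(weights.values())
    return sum(criterion_scores[name] * w for name, w in weights.items()) / total_weight

# Illustrative criteria, weights, and scores.
weights = {"accuracy": 0.5, "completeness": 0.3, "tone": 0.2}
scores = {"accuracy": 0.9, "completeness": 0.7, "tone": 1.0}

overall = quality_score(scores, weights)   # 0.86
passes = overall >= 0.8                    # threshold-based decision: accept or review
```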
Prompt Testing Best Practices
A collection of proven methods and workflows for systematically testing LLM prompts to ensure quality, reliability, and safety in production applications.
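One widely used practice is table-driven testing: encode representative inputs and expectations as data and run them with a standard test runner. The pytest sketch below assumes a placeholder generate function standing in for whatever code calls your model with the prompt under test.

```python
import pytest

def generate(user_input: str) -> str:
    # Placeholder: in a real suite this would call the model with the prompt under test.
    return f"Thanks for asking about {user_input}. Please contact support for refunds."

# Each case pairs a user input with keywords the output is expected to contain.
CASES = [
    ("refund policy", ["refund"]),
    ("contact support", ["support"]),
]

@pytest.mark.parametrize("user_input,expected_keywords", CASES)
def test_prompt_covers_expected_keywords(user_input, expected_keywords):
    output = generate(user_input).lower()
    for keyword in expected_keywords:
        assert keyword in output
```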
LLM Output Validation
LLM output validation is the process of checking model outputs for correctness, safety, format compliance, and quality before presenting them to users or using them in downstream systems.
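For outputs expected to be JSON, a minimal validation pass might check parseability, required fields, and length before anything downstream consumes the result. The helper below is a sketch; the field names and limits are assumptions, not fixed rules.

```python
import json

def validate_output(raw_output: str, required_fields: list[str], max_chars: int = 2000):
    """Return (is_valid, reasons) for a model output expected to be a JSON object."""
    reasons = []
    if len(raw_output) > max_chars:
        reasons.append("output exceeds length limit")
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError:
        return False, ["output is not valid JSON"]
    if not isinstance(data, dict):
        reasons.append("top-level JSON value is not an object")
    else:
        for field in required_fields:
            if field not in data:
                reasons.append(f"missing required field: {field}")
    return (not reasons), reasons
```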
Put Your Knowledge Into Practice
Use PromptLens to apply professional LLM testing techniques to your projects.
Start Free