# DeepEval
> DeepEval is an open-source LLM evaluation framework designed to unit-test LLM-powered applications such as agents, chatbots, and RAG pipelines. DeepEval incorporates the latest research to evaluate LLM outputs on metrics such as G-Eval, hallucination, answer relevancy, and fluency, which use LLMs and various other NLP models that run locally on your machine for evaluation. DeepEval offers a free cloud platform, Confident AI, for teams to incorporate LLM observability, tracing, and organization-wide collaboration into their LLM evals.
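
For orientation, here is a minimal sketch of the unit-testing workflow described above, assuming a recent `deepeval` release (the prompt and response strings are illustrative placeholders):

```python
from deepeval import assert_test
from deepeval.metrics import AnswerRelevancyMetric
from deepeval.test_case import LLMTestCase

def test_answer_relevancy():
    # Wrap a single LLM interaction as a test case
    test_case = LLMTestCase(
        input="What if these shoes don't fit?",  # illustrative user prompt
        actual_output="You have 30 days for a full refund.",  # illustrative LLM response
    )
    # Fails the test if the metric's score falls below the threshold
    assert_test(test_case, [AnswerRelevancyMetric(threshold=0.7)])
```

A file like this is typically executed with `deepeval test run test_example.py`, which runs the test cases through pytest and reports each metric's score.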
- [DeepEval LLM Evaluation](https://deepeval.com/): Open-source framework for evaluating large language models effectively.
- [DeepEval Framework Quickstart](https://deepeval.com/docs/getting-started): DeepEval is an open-source framework for evaluating LLM applications.
- [DeepEval LLM Evaluation](https://deepeval.com/docs/evaluation-introduction): Learn how to evaluate LLM applications using DeepEval.
- [DeepEval Metrics Overview](https://deepeval.com/docs/metrics-introduction): DeepEval provides 40+ metrics for evaluating LLM performance effectively.
- [G-Eval Framework](https://deepeval.com/docs/metrics-llm-evals): G-Eval framework for evaluating LLM outputs with custom metrics.
- [DAG Metric Overview](https://deepeval.com/docs/metrics-dag): Explore the versatile DAG metric for LLM evaluations.
- [Top G-Eval Use Cases](https://deepeval.com/blog/top-5-geval-use-cases): Explore top G-Eval use cases for custom LLM metrics.
- [Answer Relevancy Metrics](https://deepeval.com/docs/metrics-answer-relevancy): Evaluate answer relevancy using LLM metrics for RAG.
- [Faithfulness Metric Overview](https://deepeval.com/docs/metrics-faithfulness): Evaluate RAG pipeline quality using faithfulness metrics.
- [Contextual Relevancy Metric](https://deepeval.com/docs/metrics-contextual-relevancy): Explore the Contextual Relevancy Metric for evaluating RAG pipelines.
- [Contextual Precision Metric](https://deepeval.com/docs/metrics-context