# Together AI
> Together AI is the AI Native Cloud — a platform for running, fine-tuning, and deploying
> open-source and frontier AI models at scale. Developers use Together for serverless
> inference, dedicated endpoints, GPU clusters, fine-tuning, and code sandboxes via an
> OpenAI-compatible API.
## Getting Started
- [Quickstart](https://docs.together.ai/docs/quickstart): Get your first API call working in minutes
- [OpenAI Compatibility](https://docs.together.ai/docs/openai-api-compatibility): Drop-in replacement for the OpenAI SDK (see the example after this list)
- [Pricing](https://www.together.ai/pricing): Per-token and per-compute pricing for all products
- [API Authentication](https://docs.together.ai/reference/authentication-1): How to authenticate API requests
- [Introduction](https://docs.together.ai/docs/introduction): Platform overview and capabilities
- [Integrations](https://docs.together.ai/docs/integrations): Connect Together AI with your existing tools
- [Multiple API Keys](https://docs.together.ai/docs/multiple-api-keys): Manage API keys across teams and projects
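
The pages above describe the OpenAI-compatible flow: point the standard OpenAI SDK at Together's endpoint and authenticate with a Together API key. A minimal sketch, assuming the `openai` Python package is installed and `TOGETHER_API_KEY` is set; the model slug is illustrative and can be any model from the catalog:

```python
import os

from openai import OpenAI

# Point the standard OpenAI client at Together's OpenAI-compatible endpoint.
client = OpenAI(
    api_key=os.environ["TOGETHER_API_KEY"],
    base_url="https://api.together.xyz/v1",
)

# Issue a chat completion exactly as you would against OpenAI.
response = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct-Turbo",  # illustrative model slug
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)
```

Because only the base URL and API key change, existing OpenAI-based code can usually be pointed at Together without further modification.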
## Products
- [Serverless Inference](https://www.together.ai/serverless-inference): Pay-per-token access to 200+ open-source and frontier models
- [Dedicated Model Inference](https://www.together.ai/dedicated-model-inference): Reserved model capacity for production workloads
- [Dedicated Container Inference](https://www.together.ai/dedicated-container-inference): Deploy custom containers on dedicated infrastructure
- [Batch Inference](https://www.together.ai/batch-inference): Async large-scale inference at lower cost
- [Fine-tuning](https://www.together.ai/fine-tuning): Full fine-tuning and LoRA adapters for custom models
- [Accelerated Compute / GPU Clusters](https://www.together.ai/accelerated-compute): On-demand H100/H200/GB200/B300 clusters for training and inference
- [Sandbox](https://www.together.ai/sandbox): Secure code interpreter and execution environment
- [Managed Storage](https://www.together.ai/managed-s