# Octomil
> Octomil is the on-device AI platform for local inference, edge deployment, staged rollouts, routing, and fleet operations across phones, browsers, laptops, and edge hardware.
Octomil provides:
- **Local inference**: `octomil serve phi-4-mini` starts an OpenAI-compatible API server. Auto-selects the fastest engine for your hardware (MLX on Apple Silicon, llama.cpp on x86/CUDA).
- **Edge deployment**: `octomil deploy model --phone` pushes models to iOS (CoreML) and Android (TFLite) devices with automatic format conversion.
- **Routing and rollouts**: Keep common requests on-device, fall back to cloud only when needed, and ship model changes with canaries and rollback guardrails.
- **Fleet management**: Dashboard for monitoring inference metrics, device health, model versions, and rollouts across your entire device fleet.
- **Cross-platform SDKs**: Python, iOS (Swift), Android (Kotlin), and Browser SDKs.
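Because `octomil serve` exposes an OpenAI-compatible API, any OpenAI-style HTTP client can talk to the local server. A minimal sketch of the request shape, assuming the server listens on `http://localhost:8000` with the standard `/v1/chat/completions` path (the host, port, and path are assumptions here, not documented above — check `octomil serve --help` for the actual bind address):

```python
import json
import urllib.request

# Assumed local endpoint; the host/port octomil serve binds to may differ.
BASE_URL = "http://localhost:8000/v1/chat/completions"

def build_chat_request(url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a standard OpenAI-style chat completion request."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request(BASE_URL, "phi-4-mini", "Summarize edge inference in one sentence.")
# urllib.request.urlopen(req)  # uncomment once `octomil serve phi-4-mini` is running
```

Because the wire format is the stock OpenAI chat-completions schema, existing OpenAI client libraries should also work by pointing their base URL at the local server.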
## Enterprise Add-Ons
- **Federated learning**: Train models across devices without centralizing data. Available on Enterprise tier.
## Links
- [Documentation](https://docs.octomil.com/): Full platform documentation
- [Quickstart](https://docs.octomil.com/guides/partner-pilot-quickstart): From zero to running in 10 minutes
- [Dashboard](https://octomil.com/dashboard): Fleet management dashboard
- [GitHub](https://github.com/octomil): SDKs and examples