Use Cases
llms.py is versatile and can be used in many different scenarios. Here are some common use cases with practical examples.
For Developers
API Gateway
Use llms.py as a centralized gateway for all LLM provider access:
from openai import OpenAI

# Point all your code to llms.py
client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="not-needed"
)

# Use any model from any provider
response = client.chat.completions.create(
    model="kimi-k2",  # Free via Groq
    messages=[{"role": "user", "content": "Hello!"}]
)

Benefits:
- Single integration point
- Easy to switch providers
- Automatic failover
- Cost optimization
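Because the gateway exposes the standard OpenAI-compatible API, the usual client features should carry over unchanged. For instance, streaming responses (a minimal sketch, assuming the server supports the standard stream=True option):

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="not-needed"
)

# Stream tokens as they arrive, OpenAI-style
stream = client.chat.completions.create(
    model="kimi-k2",
    messages=[{"role": "user", "content": "Write a haiku about gateways"}],
    stream=True
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)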
Cost Management
Route to the cheapest available providers automatically:
{ "providers": { "groq": {"enabled": true}, // Free tier, tried first "openrouter_free": {"enabled": true}, // Free tier, tried second "openai": {"enabled": true} // Paid, tried last }}Benefits:
- Minimize API costs
- Use free tiers first
- Fall back to paid providers when needed
Testing & Comparison
Easily switch between models for comparison:
# Test different models
llms -m grok-4 "Explain quantum computing"
llms -m claude-sonnet-4-0 "Explain quantum computing"
llms -m gemini-2.5-pro "Explain quantum computing"

# Compare response times
llms --check groq
llms --check anthropic
llms --check google

Benefits:
- Quick model comparison
- Performance testing
- Quality evaluation
Local Development
Use local models for development and cloud models for production:
# Development: Use Ollama (free, private)
export ENV=development
llms --enable ollama
llms -m llama3.3 "test query"

# Production: Use premium models
export ENV=production
llms --enable openai anthropic
llms -m gpt-4o "production query"

For ComfyUI Users
Hybrid Workflows
Combine local Ollama models with cloud APIs in ComfyUI:
- Install llms.py in ComfyUI environment
- Enable both Ollama and cloud providers (see the config sketch after this list)
- Use llms.py node in workflows
- Automatic routing based on availability
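A provider configuration for this hybrid setup could look like the sketch below (the same llms.json format as the other examples on this page; which providers you enable is up to you):

{
  "providers": {
    "ollama": {"enabled": true},  // Local models, preferred when available
    "groq": {"enabled": true},    // Free cloud tier
    "openai": {"enabled": true}   // Paid cloud fallback
  }
}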
Benefits:
- Zero dependency conflicts
- Mix local and cloud models
- Cost control
- Provider flexibility
Zero Setup
No dependency management headaches:
# Just one dependency
pip install llms-py

# Works immediately in ComfyUI

For Enterprises
Vendor Independence
Avoid lock-in to any single LLM provider:
{ "providers": { "openai": {"enabled": true}, "anthropic": {"enabled": true}, "google": {"enabled": true}, "ollama": {"enabled": true} }}Benefits:
- No vendor lock-in
- Easy migration
- Negotiating leverage
- Risk mitigation
Scalability
Distribute load across multiple providers:
# Enable multiple providers for the same model
llms --enable groq openrouter openai

# Automatic load distribution with failover

Benefits:
- Higher throughput
- Better reliability
- Load balancing
- Reduced rate limiting
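From the client's perspective this is transparent: concurrent requests all target the same gateway endpoint, and llms.py picks an available provider for each. A minimal sketch (reusing the chat completions endpoint shown under Integration Examples; the prompts are placeholders):

import concurrent.futures
import requests

def ask(prompt):
    # Every request hits the same gateway; llms.py routes it to a provider
    r = requests.post(
        "http://localhost:8000/v1/chat/completions",
        json={
            "model": "kimi-k2",
            "messages": [{"role": "user", "content": prompt}]
        }
    )
    return r.json()["choices"][0]["message"]["content"]

# Four requests in flight at once, spread across enabled providers
with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
    answers = list(pool.map(ask, ["Q1", "Q2", "Q3", "Q4"]))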
Compliance
Keep sensitive data local while using the cloud for general tasks:
# Sensitive data: Use local Ollama
llms -m llama3.3 --file sensitive-doc.pdf "Analyze"

# General tasks: Use cloud
llms -m gpt-4o "General query"

Benefits:
- Data sovereignty
- Privacy compliance
- Flexible deployment
- Cost optimization
Budget Control
Route requests intelligently to optimize costs:
{ "providers": { "groq": {"enabled": true}, // Free tier "google_free": {"enabled": true}, // Free tier "openai": {"enabled": false} // Disabled to control costs }}Track spending with analytics:
- View cost analytics by day/month
- Monitor per-model costs
- Track per-provider spending
Research & Education
Model Comparison
Compare the capabilities of different models:
# Test reasoning
llms -m o3 "Solve this logic puzzle..."
llms -m qwq-plus "Solve this logic puzzle..."

# Test vision
llms -m gpt-4o --image chart.png "Analyze"
llms -m gemini-2.5-pro --image chart.png "Analyze"

# Test multilingual
llms -m qwen3-max "Translate to Chinese..."
llms -m gemini-2.5-pro "Translate to Chinese..."

Cost Analysis
Track costs for research projects:
# Enable analytics
llms --serve 8000

# View cost analytics in UI
# Export data for analysis

Experimentation
Experiment easily with different providers:
# Try new models quickly
llms --enable new_provider
llms -m new-model "test"

# Compare with existing
llms -m existing-model "test"

Content Creation
Section titled “Content Creation”Writing Assistant
Use different models for different tasks:
# Brainstorming: Fast, creative model
llms -m gemini-2.5-flash "Blog post ideas about AI"

# Drafting: Balanced model
llms -m claude-sonnet-4-0 "Write blog post about..."

# Editing: Detail-oriented model
llms -m gpt-4o "Improve this text..."

Image Analysis
Analyze images for content creation:
# Describe images
llms -m qwen2.5vl --image photo.jpg "Describe for alt text"

# Extract text
llms -m gemini-2.5-pro --image screenshot.png "Extract text"

# Analyze charts
llms -m gpt-4o --image chart.png "Summarize data"

Audio Transcription
Transcribe and summarize audio content:
# Transcribe
llms -m gpt-4o-audio-preview --audio interview.mp3 "Transcribe"

# Summarize
llms -m gemini-2.5-flash --audio podcast.mp3 "Key points"

Data Analysis
Document Processing
Process and analyze documents:
# Summarize PDFs
llms -m gpt-5 --file report.pdf "Executive summary"

# Extract data
llms -m qwen3-max --file invoice.pdf "Extract line items"

# Compare documents
llms -m claude-opus-4-1 --file doc1.pdf "Compare with doc2"

Code Analysis
Analyze code and generate documentation:
# Explain code
llms -m grok-4 "Explain this Python code: ..."

# Generate docs
llms -m claude-sonnet-4-0 "Document this function: ..."

# Code review
llms -m gpt-4o "Review this code for issues: ..."

Personal Use
Learning Assistant
Learn new topics with AI help:
# Explanations
llms -s "You are a patient teacher" "Explain quantum computing"

# Practice
llms "Give me a Python coding challenge"

# Feedback
llms "Review my solution: ..."

Productivity
Boost productivity with AI:
# Email drafting
llms "Draft email to client about project delay"

# Meeting notes
llms --audio meeting.mp3 "Summarize and extract action items"

# Research
llms "Summarize recent developments in AI"

Privacy-Focused
Use local models for private queries:
# Enable only Ollama
llms --disable openai anthropic google
llms --enable ollama

# All queries stay local
llms -m llama3.3 "Private query"

Integration Examples
With LangChain
from langchain_openai import ChatOpenAI

# Point LangChain's OpenAI-compatible client at llms.py
llm = ChatOpenAI(
    base_url="http://localhost:8000/v1",
    api_key="not-needed",
    model="kimi-k2"
)

# Use in chains, agents, etc.
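From here the model drops into normal LangChain usage. For example, a direct call (a minimal sketch, assuming the llms.py server is running on port 8000):

# Single-turn call; returns an AIMessage
reply = llm.invoke("Hello!")
print(reply.content)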
With LlamaIndex
from llama_index.llms.openai import OpenAI

# Point LlamaIndex's OpenAI client at llms.py
llm = OpenAI(
    api_base="http://localhost:8000/v1",
    api_key="not-needed",
    model="kimi-k2"
)

# Use in indexes, queries, etc.
With Custom Code
Section titled “With Custom Code”import requests
response = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "kimi-k2",
        "messages": [{"role": "user", "content": "Hello!"}]
    }
)

print(response.json()["choices"][0]["message"]["content"])

Next Steps
- CLI Usage - Learn CLI commands
- Web UI - Use the web interface
- Providers - Explore available providers
- Configuration - Customize your setup