llms.py gets a UI 🚀

ChatGPT, but Local 🎯
Simple ChatGPT-like UI to access ALL Your LLMs, Locally or Remotely!
In keeping with the simplicity and goals of llms.py, the new UI still requires only its single aiohttp Python dependency for all client and server features.
The /ui is small, fast and lightweight, and follows the Simple Modern JavaScript approach of leveraging native JS Modules support in browsers to avoid needing any npm dependencies or build tools.
Install
To get both llms.py and its UI it's recommended to install from PyPI:
pip install llms-py
Start an OpenAI /v1/chat/completions server and UI on port 8000:
llms --serve 8000
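Since the server exposes a standard OpenAI-compatible endpoint, any HTTP client can talk to it. Here's a minimal sketch using only the Python standard library (the model name llama3:8b is just a placeholder for one of your configured models):

```python
import json
from urllib import request

def build_payload(prompt, model="llama3:8b"):
    # Standard OpenAI-compatible chat completion request body
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def chat(prompt, model="llama3:8b", base_url="http://localhost:8000"):
    # POST the request to the local llms.py server and return the reply text
    req = request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_payload(prompt, model)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```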
If no UI is needed, download just llms.py for all other client & server features:
curl -O https://raw.githubusercontent.com/ServiceStack/llms/main/llms.py
chmod +x llms.py
mv llms.py ~/.local/bin/llms
Simple and Flexible UI
This starts the Chat UI, from which you can interact with any of your configured OpenAI-compatible chat providers: a single unified interface for accessing both local and premium cloud LLMs.
Configuration
You can configure which OpenAI-compatible providers and models you want to use by adding them to your llms.json in ~/.llms/llms.json.
The UI's own configuration is maintained in ~/.llms/ui.json, where you can configure your preferred system prompts and other defaults.
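The exact provider schema is generated for you on first run, so check your own ~/.llms/llms.json for the real shape; the fragment below is only an illustrative sketch (all field names are assumptions) of the general idea of associating models with an OpenAI-compatible endpoint:

```json
{
  "providers": [
    {
      "id": "ollama",
      "enabled": true,
      "base_url": "http://localhost:11434/v1",
      "models": ["llama3:8b"]
    }
  ]
}
```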
Fast, Local and Private
OSS & free: no sign-ups, no ads, no tracking. All data is stored locally in the browser's IndexedDB and can be exported and imported to transfer chat histories between browsers.
A goal for llms.py is to limit itself to its single Python aiohttp dependency to minimize the risk of conflicts and friction across multiple Python environments, e.g. llms.py is an easy drop-in inside a ComfyUI Custom Node since it requires no additional dependencies.
Import / Export
All data is stored locally in the browser's IndexedDB. Because IndexedDB is tied to the browser's origin, you can maintain multiple independent conversation databases simply by running the server on a different port.
When needed you can backup and transfer your entire chat history between different browsers using the Export and Import features on the home page.
Rich Markdown & Syntax Highlighting
To maximize readability there's full support for Markdown and Syntax highlighting for the most popular programming languages.
To quickly and easily make use of AI Responses, Copy Code icons are readily available on all messages and code blocks.
Rich, Multimodal Inputs
The Chat UI goes beyond just text and can take advantage of the multimodal capabilities of modern LLMs with support for Image, Audio and File inputs.
🖼️ 1. Image Inputs & Analysis
Images can be uploaded directly into your conversations with vision-capable models for comprehensive image analysis.
Visual AI responses are highly dependent on the model used. This is a typical example of the latest Gemini Flash's visual analysis of our ServiceStack Logo:
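Under the hood, image inputs in OpenAI-compatible APIs are typically sent as content parts containing a base64 data URL. A sketch of building such a message, assuming the common `image_url` content-part format (individual providers may differ):

```python
import base64

def image_message(prompt, image_bytes, mime="image/png"):
    """Build an OpenAI-compatible user message pairing text with an inline image."""
    b64 = base64.b64encode(image_bytes).decode()
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            # Inline the image as a base64 data URL content part
            {"type": "image_url",
             "image_url": {"url": f"data:{mime};base64,{b64}"}},
        ],
    }
```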
🎤 2. Audio Input & Transcription
Likewise you can upload Audio files and have them transcribed and analyzed by multi-modal models with audio capabilities.
Example of processing audio input: audio files can be uploaded alongside system and user prompts instructing the model to transcribe and summarize their content, with the model's multi-modal capabilities integrated right within the chat interface.
📎 3. File and PDF Attachments
In addition to images and audio, you can also upload documents, PDFs and other files to capable models to extract insights, summarize content or analyze data.
Document Processing Use Cases:
- PDF Analysis: Upload PDF documents for content extraction and analysis
- Data Extraction: Extract specific information from structured documents
- Document Summarization: Get concise summaries of lengthy documents
- Query Content: Ask questions about specific content in documents
- Batch Processing: Upload multiple files for comparative analysis
Perfect for research, document review, data analysis and content extraction.
Search History
Quickly find past conversations with built-in search:
Enable / Disable Providers
Manage which providers are enabled or disabled at runtime.
Providers that support the requested model are invoked in the order they're defined in llms.json. If a provider fails, the next available provider is tried.
By default, providers with free tiers are enabled first, followed by local providers, then premium cloud providers, all of which can be enabled or disabled in the UI:
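The fallback behavior can be sketched as follows. This is an illustrative model of the logic described above, not the actual llms.py implementation, and the provider dict fields are assumptions:

```python
def eligible(providers, model):
    """Enabled providers that support `model`, in config order."""
    return [p for p in providers if p["enabled"] and model in p["models"]]

def complete(providers, model, prompt, send):
    """Try each eligible provider in order; the first success wins."""
    for provider in eligible(providers, model):
        try:
            return send(provider, model, prompt)
        except Exception:
            continue  # provider failed, fall through to the next one
    raise RuntimeError(f"no provider could serve {model}")
```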
Smart Autocomplete for Models & System Prompts
Autocomplete components let you quickly find and select your preferred model and system prompt.
Only models from enabled providers appear in the dropdown; they become available immediately after a provider is enabled.
Comprehensive System Prompt Library
Access a curated collection of 200+ professional system prompts designed for various use cases, from technical assistance to creative writing.
System prompts can be added, removed & sorted in ~/.llms/ui.json:
{
  "prompts": [
    {
      "id": "it-expert",
      "name": "Act as an IT Expert",
      "value": "I want you to act as an IT expert. You will be responsible..."
    },
    ...
  ]
}
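Since prompts follow the simple id/name/value schema above, resolving one programmatically is straightforward. A small sketch (the helper function is ours, not part of llms.py):

```python
import json
from pathlib import Path

def get_prompt(prompts, prompt_id):
    """Return the system prompt text for a given id, or None if not found."""
    return next((p["value"] for p in prompts if p["id"] == prompt_id), None)

# Usage: load the UI config and look up a prompt by id
ui_path = Path("~/.llms/ui.json").expanduser()
if ui_path.exists():
    ui = json.loads(ui_path.read_text())
    print(get_prompt(ui["prompts"], "it-expert"))
```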
Reasoning
Access the thinking process of advanced AI models with specialized rendering for reasoning and chain-of-thought responses:
Get Started Today
The new llms.py UI makes powerful AI accessible and private. Whether you're a developer, researcher or AI enthusiast, this UI helps you harness the potential of both local and cloud-based language models.
Why llms.py UI?
- 🔒 Privacy First: All data stays local - no tracking, ads or external deps
- ⚡ Lightning Fast: Fast, async aiohttp client and server
- 🌐 Universal Compatibility: Works with any OpenAI-compatible API
- 💰 Cost Effective: Mix free local models with premium APIs as needed
- 🎯 Feature Rich: Multimodal support, search, autocomplete, and more
- 🛠️ Developer Friendly: Simple config, easily modifiable implementation
Try llms.py UI:
pip install llms-py
llms --serve 8000
Open your browser to http://localhost:8000 to start chatting with your AI models.