FREE Gemini, Minimax M2, GLM 4.6, Kimi K2


To give AI Chat instant utility, we're making available a free servicestack OpenAI Chat provider that can be enabled with:

services.AddPlugin(new ChatFeature {
    EnableProviders = [
        "servicestack",
        // "groq",
        // "google_free",
        // "openrouter_free",
        // "ollama",
        // "google",
        // "anthropic",
        // "openai",
        // "grok",
        // "qwen",
        // "z.ai",
        // "mistral",
        // "openrouter",
    ]
});

The servicestack provider is configured with a default llms.json which enables access to Gemini and the best value OSS models for FREE:

{
  "providers": {
    "servicestack": {
      "enabled": false,
      "type": "OpenAiProvider",
      "base_url": "http://okai.servicestack.com",
      "api_key": "$SERVICESTACK_LICENSE",
      "models": {
        "gemini-flash-latest": "gemini-flash-latest",
        "gemini-flash-lite-latest": "gemini-flash-lite-latest",
        "kimi-k2": "kimi-k2",
        "kimi-k2-thinking": "kimi-k2-thinking",
        "minimax-m2": "minimax-m2",
        "glm-4.6": "glm-4.6",
        "gpt-oss:20b": "gpt-oss:20b",
        "gpt-oss:120b": "gpt-oss:120b",
        "llama4:400b": "llama4:400b",
        "mistral-small3.2:24b": "mistral-small3.2:24b"
      }
    }
  }
}
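
Because the servicestack provider is declared as a standard OpenAiProvider, the endpoint above speaks the OpenAI-compatible chat API and can also be called directly over HTTP. A minimal sketch, assuming the conventional /v1/chat/completions route and your License Key in SERVICESTACK_LICENSE:

using System.Net.Http.Headers;
using System.Net.Http.Json;

using var http = new HttpClient();
// Authenticate with your ServiceStack License Key (the provider's api_key)
http.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue(
    "Bearer", Environment.GetEnvironmentVariable("SERVICESTACK_LICENSE"));

// POST an OpenAI-compatible chat completion request
var res = await http.PostAsJsonAsync("https://okai.servicestack.com/v1/chat/completions", new {
    model = "glm-4.6",
    messages = new[] { new { role = "user", content = "Hello!" } },
});
var json = await res.Content.ReadAsStringAsync();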

Clean, Lightweight & Flexible AI Integration

ServiceStack's AI Chat delivers a production-ready solution for integrating AI capabilities into your applications with minimal overhead and maximum flexibility. The llms.json configuration approach provides several key advantages:

Unified Provider Abstraction

Define the exact models you want your application to use through a single, declarative configuration file. This thin abstraction layer eliminates vendor lock-in and allows seamless switching between providers without code changes, enabling you to:

  • Optimize for cost - Route requests to the most economical provider for each use case
  • Maximize performance - Leverage faster models for latency-sensitive operations while using more capable models for complex tasks
  • Ensure reliability - Configure automatic failover between providers to maintain service availability (a manual client-side fallback is sketched after this list)
  • Control access - Specify which models are available to users in your preferred priority order
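
A minimal sketch of such a manual fallback, assuming ChatAsync returns a ChatResponse DTO (AI Chat's actual provider failover is handled by configuration, not client code):

// Try models in priority order until one answers (model names are taken
// from the default llms.json above)
async Task<ChatResponse?> ChatWithFallback(IChatClient client, string prompt)
{
    string[] candidates = ["glm-4.6", "kimi-k2", "gpt-oss:120b"];
    foreach (var model in candidates)
    {
        try
        {
            return await client.ChatAsync(new ChatCompletion {
                Model = model,
                Messages = [Message.Text(prompt)],
            });
        }
        catch (Exception)
        {
            // e.g. rate-limited or provider outage: fall through to the next model
        }
    }
    return null;
}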

Hybrid Deployment Flexibility

Mix and match local and cloud providers to meet your specific requirements. Deploy privacy-sensitive workloads on local models while leveraging cloud providers for scale, or combine premium models for critical features with cost-effective alternatives for routine tasks.
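
For example, a hybrid setup could enable both the local ollama provider and the free servicestack provider from the list above:

services.AddPlugin(new ChatFeature {
    EnableProviders = [
        "ollama",        // local models for privacy-sensitive workloads
        "servicestack",  // free hosted models for everything else
    ]
});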

Zero-Dependency Architecture

The lightweight implementation adds minimal footprint to your application while providing enterprise-grade AI capabilities. No heavy SDKs or framework dependencies required, just clean, direct, and performant integrations.

The servicestack provider requires the SERVICESTACK_LICENSE environment variable, although any ServiceStack License Key can be used, including expired and Free ones.
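
A minimal sketch of setting it in Program.cs before ChatFeature is registered, where the "ServiceStackLicense" appsettings key is a hypothetical place to store your key:

// Hypothetical appsettings key; any configuration source works
Environment.SetEnvironmentVariable("SERVICESTACK_LICENSE",
    builder.Configuration["ServiceStackLicense"]);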


FREE for Personal Usage

To be able to maintain this as a free service, we're limiting it to development, personal assistance, and research usage with a rate limit of 60 requests per hour, which should be more than enough for most personal usage and research whilst deterring its use in automated tools or in production.

info

Rate limiting is implemented with a sliding Token Bucket algorithm that replenishes 1 additional request every 60s
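
An illustrative sketch of the idea (not AI Chat's actual implementation): a bucket with a 60-request capacity that refills 1 token every 60 seconds:

class TokenBucket
{
    const int Capacity = 60;              // 60 requests per hour
    const double RefillPerSec = 1.0 / 60; // 1 token replenished every 60s
    double tokens = Capacity;
    DateTime last = DateTime.UtcNow;
    readonly object sync = new();

    public bool TryAcquire()
    {
        lock (sync)
        {
            var now = DateTime.UtcNow;
            // Slide the window: refill tokens for the elapsed time, capped at capacity
            tokens = Math.Min(Capacity, tokens + (now - last).TotalSeconds * RefillPerSec);
            last = now;
            if (tokens < 1)
                return false; // over the limit, reject (e.g. HTTP 429)
            tokens -= 1;
            return true;
        }
    }
}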

Effortless AI Integration

In addition to providing a UI with ChatGPT-like features, AI Chat also makes it trivially simple to access AI features from within your own App, which is as simple as sending a populated ChatCompletion Request DTO with the IChatClient dependency:

public class MyService(IChatClient client) : Service
{
    public async Task<object> Any(DefaultChat request)
    {
        // Forward the user's prompt as an OpenAI-compatible ChatCompletion request
        return await client.ChatAsync(new ChatCompletion {
            Model = "glm-4.6",
            Messages = [
                Message.Text(request.UserPrompt)
            ],
        });
    }
}

It also makes it easy to send Image, Audio & Document inputs to AI Models that support them, e.g:

var image = new ChatCompletion
{
    Model = "qwen2.5vl",
    Messages = [
        Message.Image(imageUrl:"https://example.org/image.webp",
            text:"Describe the key features of the input image"),
    ]
};

var audio = new ChatCompletion
{
    Model = "gpt-4o-audio-preview",
    Messages = [
        Message.Audio(data:"https://example.org/speaker.mp3",
            text:"Please transcribe and summarize this audio file"),
    ]
};

var file = new ChatCompletion
{
    Model = "gemini-flash-latest",
    Messages = [
        Message.File(
            fileData:"https://example.org/order.pdf",
            text:"Please summarize this document"),
    ]
};
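
Each of these requests is sent the same way via ChatAsync. A hedged sketch of reading back the reply, assuming the OpenAI-compatible response shape:

var response = await client.ChatAsync(image);
// Assumes the OpenAI-compatible response DTO where the first choice holds the answer
var answer = response.Choices[0].Message.Content;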

Learn more about AI Chat

To dive deeper into what AI Chat can do:

  • Read the AI Chat API docs to integrate AI into your own services and apps.
  • Explore the AI Chat UI guide to customize the built-in experience.
  • Use the Admin UI to inspect analytics, monitor usage, and review audit history.