
Introducing Platformatic AI-Warp 1.0.0

The Ultimate AI Gateway for Modern Applications


The future of AI integration is here, and it's built for scale.

Today, we're thrilled to announce the official launch of Platformatic AI-Warp 1.0.0, a comprehensive AI gateway that transforms how developers integrate multiple AI providers into their applications. After extensive development and real-world testing, we're ready to revolutionize AI development with a unified, scalable, and intelligent platform.

The Problem We Solved

Building production-ready AI applications has been a nightmare. Developers face a maze of challenges:

  • Provider Lock-in: Choosing one AI provider means being stuck with their limitations
  • Complex Integration: Each provider has different APIs, authentication, and response formats
  • Reliability Issues: No built-in fallback when your primary AI service goes down
  • Session Management: Maintaining conversation context across requests is a headache
  • Scaling Bottlenecks: Managing rate limits, timeouts, and connection pooling manually

Sound familiar? We've been there too.

Enter AI-Warp: Your AI Gateway

Platformatic AI-Warp isn't just another AI wrapper – it's a complete AI operations platform that handles the complexity so you can focus on building great features.

🔄 Seamless Session Resume

Never lose conversation context again. AI Warp maintains perfect conversation continuity across service restarts, load balancer switches, and distributed deployments:

// Resume any conversation instantly
const response = await client.ask({
  prompt: 'What did we discuss about AI safety?',
  sessionId: 'sess_abc123xyz'  // Continues exactly where you left off
})

🌐 One API to Rule Them All

Connect to OpenAI, DeepSeek, and Google Gemini through a single, unified interface. Switch between providers without changing a single line of application code:

// Simple string format
models: ["openai:gpt-4", "gemini:gemini-2.0-flash", "deepseek:deepseek-chat"]

// Or detailed configuration
models: [
  {
    provider: 'openai',
    model: 'gpt-4o',
    limits: { maxTokens: 4000, rate: { max: 100, timeWindow: '1m' } }
  },
  {
    provider: 'gemini',
    model: 'gemini-2.0-flash',
    limits: { maxTokens: 8000 }
  }
]

🔄 Intelligent Automatic Fallback

AI Warp automatically tries the next model in your chain when your primary model hits rate limits or fails. No more 5am alerts about broken AI features:

// Configure automatic restoration after failures
restore: {
  rateLimit: '5m',        // Retry after 5 minutes
  timeout: '2m',          // Retry after 2 minutes
  providerError: '10m'    // Retry after 10 minutes
}
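
To make the fallback behavior concrete, here is a small, self-contained sketch of the idea: try each model in order, and skip any model that failed recently until its restore window has elapsed. The function names, error codes, and in-memory bookkeeping are illustrative assumptions, not AI-Warp internals:

```javascript
// Sketch of the fallback idea: try each model in order, and skip any model
// that failed recently until its restore window has elapsed.
// Model names, error codes, and windows below are illustrative assumptions.
const RESTORE_MS = {
  rateLimit: 5 * 60_000,     // mirrors restore.rateLimit: '5m'
  timeout: 2 * 60_000,       // mirrors restore.timeout: '2m'
  providerError: 10 * 60_000 // mirrors restore.providerError: '10m'
}

function createFallbackChain(models) {
  const retryAt = new Map() // model -> earliest timestamp it may be tried again

  return async function ask(prompt, callModel, now = Date.now()) {
    for (const model of models) {
      if (now < (retryAt.get(model) ?? 0)) continue // still parked, skip it

      try {
        return await callModel(model, prompt)
      } catch (err) {
        // Park the failed model for its restore window, then try the next one
        retryAt.set(model, now + (RESTORE_MS[err.code] ?? RESTORE_MS.providerError))
      }
    }
    throw new Error('All models in the chain are unavailable')
  }
}
```

The key design point is that a rate-limited model is not retried on every request; it stays parked until its window expires, so traffic flows straight to the healthy fallback in the meantime.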

💾 Enterprise-Grade Session Management

Maintain conversation context effortlessly with built-in session storage. Choose between lightning-fast in-memory storage or distributed Valkey/Redis for multi-instance deployments:

// Distributed session storage
storage: {
  type: 'valkey',
  valkey: {
    host: 'localhost',
    port: 6379,
    database: 0
  }
}

How Valkey Session Storage Works

AI Warp's Valkey integration provides enterprise-grade session management through a sophisticated distributed storage architecture:

Key Features:

  • Distributed State: Sessions are immediately available across all service instances
  • Automatic Expiration: Configurable TTL prevents memory bloat (historyExpiration: '1d')
  • JSON Serialization: Efficient storage of conversation history with Redis Lists
  • Connection Pooling: Optimized connection management with automatic reconnection
  • Cross-Instance Continuity: Users can continue conversations on any instance

Session Lifecycle:

  1. Session Creation: Auto-generated unique session ID (sess_abc123xyz)
  2. History Storage: Each conversation turn is stored as JSON in Redis Lists using LPUSH
  3. Context Retrieval: Previous conversation retrieved with LRANGE for AI context
  4. Automatic Cleanup: Sessions expire based on configured TTL (default: 24 hours)

Session Resume: Seamless Conversation Continuity

One of AI Warp's most powerful features is seamless session resumption. Here's how it works:

Resume Flow:

// Client resumes conversation with existing session ID
const response = await client.ask({
  prompt: 'What did we discuss about earlier?',
  sessionId: 'sess_abc123xyz'  // Previously stored session ID
})

Behind the Scenes:

  1. Session Validation: AI Warp validates the session ID exists in Valkey
  2. History Retrieval: Complete conversation history is fetched using LRANGE sess_abc123xyz 0 -1
  3. Context Reconstruction: Previous conversation turns are provided to the AI model as context
  4. Contextual Response: AI responds with full awareness of previous conversation
  5. History Update: New conversation turn is added to the existing session

Cross-Instance Resume:

  • User starts conversation on Instance A
  • Session stored in shared Valkey cluster
  • User's next request hits Instance B
  • Instance B seamlessly retrieves full conversation history
  • Response maintains perfect conversation continuity

Error Handling:

  • Invalid session IDs return clear error messages
  • Expired sessions gracefully start new conversations
  • Storage failures fallback to stateless operation
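
Client-side, that graceful degradation can be handled with a small wrapper: try to resume with the stored session ID, and on a stale-session error retry the same prompt statelessly. The `SESSION_*` error codes checked here are illustrative assumptions, not documented AI-Warp error names:

```javascript
// Client-side degradation sketch: try to resume with the stored sessionId,
// and on a stale-session error retry the same prompt statelessly.
// The client shape and SESSION_* error codes are illustrative assumptions.
async function askWithFallback(client, prompt, sessionId) {
  if (sessionId) {
    try {
      return await client.ask({ prompt, sessionId })
    } catch (err) {
      if (err.code !== 'SESSION_EXPIRED' && err.code !== 'SESSION_NOT_FOUND') {
        throw err // real failures still surface to the caller
      }
      // Stale session: fall through and start a fresh conversation
    }
  }
  return client.ask({ prompt })
}
```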

This architecture enables true horizontal scaling - add more AI Warp instances without losing conversation state, and users can seamlessly continue conversations regardless of which instance handles their request.

🌊 Real-Time Streaming with Automatic Resume

Deliver instant, responsive AI experiences with Server-Sent Events (SSE) streaming, now with automatic resume capability for fault-tolerant streaming:

const client = buildClient({ url: 'http://localhost:3042' })

// Start streaming with session management
const response1 = await client.ask({ 
  prompt: 'Write a long story about AI', 
  stream: true 
})

const sessionId = response1.sessionId

// Process stream until connection interruption
for await (const chunk of response1.stream) {
  console.log(chunk.content) // Real-time response chunks
  // Connection interrupted here...
  break
}

// Automatically resume from where you left off
const response2 = await client.ask({ 
  prompt: 'Continue the story',  // Ignored during resume
  sessionId: sessionId,          // Triggers automatic resume
  stream: true 
})

// Continue receiving remaining content seamlessly
for await (const chunk of response2.stream) {
  console.log(chunk.content) // Continues exactly where it left off
}

Stream Resume Benefits:

  • Zero Configuration: Resume happens automatically with sessionId + streaming
  • Fault Tolerance: Recover from network interruptions, timeouts, and connection drops
  • Bandwidth Efficiency: Only streams remaining content, not the full response
  • Graceful Fallback: Automatically falls back to normal requests if resume fails
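
Putting those properties together, a consumer loop only needs to re-issue the request with the same session ID when the stream breaks. Here is a hedged sketch of such a loop; `client.ask` and the chunk shape follow the example above, while `maxAttempts` and the retry structure are assumptions:

```javascript
// Fault-tolerant consumption sketch: collect a streamed answer, re-asking
// with the same sessionId whenever the stream breaks so resume delivers
// only the remaining chunks. client.ask and the chunk shape follow the
// example above; maxAttempts and the loop structure are assumptions.
async function collectStream(client, prompt, { maxAttempts = 3 } = {}) {
  let sessionId
  let text = ''

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const response = await client.ask({ prompt, sessionId, stream: true })
      sessionId = response.sessionId
      for await (const chunk of response.stream) {
        text += chunk.content
      }
      return { sessionId, text } // stream finished cleanly
    } catch (err) {
      if (!sessionId || attempt === maxAttempts) throw err
      // Interrupted mid-stream: loop again; sessionId triggers the resume
    }
  }
}
```

Because resume only streams the remaining content, the accumulated `text` ends up complete without duplicating chunks received before the interruption.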

Built for Production from Day One

πŸ›‘οΈ Rock-Solid Reliability

  • Optimized HTTP Client: Efficient connection reuse with undici
  • Connection Pooling: Efficient resource management
  • Graceful Degradation: Automatic failover between providers
  • Timeout Management: Configurable timeouts with automatic cleanup
  • Error Recovery: Sophisticated retry logic with exponential backoff
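
The retry-with-exponential-backoff pattern can be illustrated with a classic helper; the delays and the injectable `sleep` parameter here are illustrative, not AI-Warp's actual tuning:

```javascript
// Classic exponential backoff sketch: wait baseMs, 2*baseMs, 4*baseMs, ...
// between attempts. Delay values are illustrative, not AI-Warp's tuning;
// `sleep` is injectable so the helper is easy to test.
async function withRetry(fn, {
  retries = 3,
  baseMs = 100,
  sleep = ms => new Promise(resolve => setTimeout(resolve, ms))
} = {}) {
  let lastError
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn()
    } catch (err) {
      lastError = err
      if (attempt < retries) await sleep(baseMs * 2 ** attempt)
    }
  }
  throw lastError
}
```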

📈 Deploy Anywhere, Scale Everywhere

  • Self-Hostable: Complete control over your AI infrastructure
  • Kubernetes Ready: Native container support for cloud-native deployments
  • Universal Deployment: Runs anywhere Node.js runs - Docker, VMs, serverless, edge
  • Distributed Architecture: Share sessions across multiple instances
  • Rate Limit Management: Automatic handling of provider limits
  • Resource Optimization: Configurable limits per model and provider

Get Started in Minutes

Quick Start with Wattpm

mkdir my-ai-app
cd my-ai-app
npx wattpm@latest create

Select @platformatic/ai-warp, configure your API keys, and you're running a production-ready AI service in under 2 minutes.

Or Install the Client

npm install @platformatic/ai-client

import { buildClient } from '@platformatic/ai-client'

const client = buildClient({
  url: 'http://localhost:3042'
})

const response = await client.ask({
  prompt: 'Hello AI, how are you today?',
  stream: false
})

console.log(response.content.text)
// "Hello! I'm doing well, thank you for asking..."

What's Next?

This is just the beginning. Our roadmap includes:

  • More Providers: Anthropic Claude, Cohere, and custom model support
  • Advanced Routing: Cost-based and performance-based model selection
  • Analytics Dashboard: Real-time monitoring and usage analytics
  • Plugin Ecosystem: Custom providers and middleware support

Join the AI Revolution

Platformatic AI-Warp 1.0.0 is available now. Whether you're building a chatbot, content generator, or the next breakthrough AI application, we've got you covered.

Ready to transform your AI development?

Built with ❤️ by the Platformatic Team

Ready to warp into the future of AI development? Get started today and experience the difference of having a true AI operations platform at your fingertips.


Follow us on Twitter for updates and join our Discord community to connect with other developers building the future with AI.