Introducing Platformatic AI-Warp 1.0.0
The Ultimate AI Gateway for Modern Applications

The future of AI integration is here, and it's built for scale.
Today, we're thrilled to announce the official launch of Platformatic AI-Warp 1.0.0, a comprehensive AI gateway that transforms how developers integrate multiple AI providers into their applications. After extensive development and real-world testing, we're ready to change how you build with AI: one unified, scalable, intelligent platform.
The Problem We Solved
Building production-ready AI applications has been a nightmare. Developers face a maze of challenges:
- Provider Lock-in: Choosing one AI provider means being stuck with their limitations
- Complex Integration: Each provider has different APIs, authentication, and response formats
- Reliability Issues: No built-in fallback when your primary AI service goes down
- Session Management: Maintaining conversation context across requests is a headache
- Scaling Bottlenecks: Managing rate limits, timeouts, and connection pooling manually
Sound familiar? We've been there too.
Enter AI-Warp: Your AI Gateway
Platformatic AI-Warp isn't just another AI wrapper; it's a complete AI operations platform that handles the complexity so you can focus on building great features.
Seamless Session Resume
Never lose conversation context again. AI Warp maintains perfect conversation continuity across service restarts, load balancer switches, and distributed deployments:
// Resume any conversation instantly
const response = await client.ask({
  prompt: 'What did we discuss about AI safety?',
  sessionId: 'sess_abc123xyz' // Continues exactly where you left off
})
One API to Rule Them All
Connect to OpenAI, DeepSeek, and Google Gemini through a single, unified interface. Switch between providers without changing a single line of application code:
// Simple string format
models: ["openai:gpt-4", "gemini:gemini-2.0-flash", "deepseek:deepseek-chat"]

// Or detailed configuration
models: [
  {
    provider: 'openai',
    model: 'gpt-4o',
    limits: { maxTokens: 4000, rate: { max: 100, timeWindow: '1m' } }
  },
  {
    provider: 'gemini',
    model: 'gemini-2.0-flash',
    limits: { maxTokens: 8000 }
  }
]
Intelligent Automatic Fallback
AI Warp automatically tries the next model in your chain when your primary model hits rate limits or fails. No more 5am alerts about broken AI features:
// Configure automatic restoration after failures
restore: {
  rateLimit: '5m',      // Retry after 5 minutes
  timeout: '2m',        // Retry after 2 minutes
  providerError: '10m'  // Retry after 10 minutes
}
Enterprise-Grade Session Management
Maintain conversation context effortlessly with built-in session storage. Choose between lightning-fast in-memory storage or distributed Valkey/Redis for multi-instance deployments:
// Distributed session storage
storage: {
  type: 'valkey',
  valkey: {
    host: 'localhost',
    port: 6379,
    database: 0
  }
}
How Valkey Session Storage Works
AI Warp's Valkey integration provides enterprise-grade session management through a sophisticated distributed storage architecture:

Key Features:
- Distributed State: Sessions are immediately available across all service instances
- Automatic Expiration: Configurable TTL prevents memory bloat (historyExpiration: '1d')
- JSON Serialization: Efficient storage of conversation history with Redis Lists
- Connection Pooling: Optimized connection management with automatic reconnection
- Cross-Instance Continuity: Users can continue conversations on any instance
Session Lifecycle:
- Session Creation: Auto-generated unique session ID (sess_abc123xyz)
- History Storage: Each conversation turn is stored as JSON in Redis Lists using LPUSH (see the sketch after this list)
- Context Retrieval: Previous conversation history is retrieved with LRANGE for AI context
- Automatic Cleanup: Sessions expire based on the configured TTL (default: 24 hours)
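To make the lifecycle concrete, here is a minimal sketch of the storage pattern using ioredis. The key layout, helper name, and TTL value are illustrative assumptions, not AI Warp internals:

// Minimal sketch of the lifecycle above, using ioredis.
// Key layout and helper names are illustrative, not AI Warp internals.
import Redis from 'ioredis'

const valkey = new Redis({ host: 'localhost', port: 6379, db: 0 })

async function appendTurn (sessionId, turn) {
  // Each conversation turn is stored as JSON in a Redis List (newest first)
  await valkey.lpush(sessionId, JSON.stringify(turn))
  // Refresh the TTL so idle sessions expire (default: 24 hours)
  await valkey.expire(sessionId, 24 * 60 * 60)
}

await appendTurn('sess_abc123xyz', {
  prompt: 'What is AI safety?',
  response: 'AI safety is the field that...'
})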
Session Resume: Seamless Conversation Continuity
One of AI Warp's most powerful features is seamless session resumption. Here's how it works:
Resume Flow:
// Client resumes a conversation with an existing session ID
const response = await client.ask({
  prompt: 'What did we discuss earlier?',
  sessionId: 'sess_abc123xyz' // Previously stored session ID
})
Behind the Scenes:
- Session Validation: AI Warp validates the session ID exists in Valkey
- History Retrieval: The complete conversation history is fetched using LRANGE sess_abc123xyz 0 -1
- Context Reconstruction: Previous conversation turns are provided to the AI model as context
- Contextual Response: AI responds with full awareness of previous conversation
- History Update: The new conversation turn is added to the existing session (see the sketch below)
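Continuing the ioredis sketch from above, the retrieval side could look like this (again, the helper is an illustration, not AI Warp's internal code):

// Sketch of the resume flow, continuing the ioredis example above
async function loadHistory (sessionId) {
  // LRANGE sess_abc123xyz 0 -1 returns every stored turn
  const raw = await valkey.lrange(sessionId, 0, -1)
  if (raw.length === 0) return null // unknown or expired session
  // LPUSH stores newest first, so reverse into chronological order
  return raw.map(s => JSON.parse(s)).reverse()
}

const history = await loadHistory('sess_abc123xyz')
// history is then supplied to the model as conversation context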
Cross-Instance Resume:
- User starts conversation on Instance A
- Session stored in shared Valkey cluster
- User's next request hits Instance B
- Instance B seamlessly retrieves full conversation history
- Response maintains perfect conversation continuity, as the sketch below shows
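From the client's point of view, cross-instance resume needs nothing special. A sketch using the documented client API, assuming the non-streaming response carries sessionId just as the streaming one does (the two instance URLs are placeholders):

import { buildClient } from '@platformatic/ai-client'

// Two AI Warp instances sharing one Valkey cluster (URLs are placeholders)
const instanceA = buildClient({ url: 'http://ai-warp-a.internal:3042' })
const instanceB = buildClient({ url: 'http://ai-warp-b.internal:3042' })

// Start the conversation on Instance A
const first = await instanceA.ask({ prompt: 'Tell me about AI safety.' })

// Continue on Instance B with the same session ID
const second = await instanceB.ask({
  prompt: 'Summarize what we just covered.',
  sessionId: first.sessionId // full history is retrieved from Valkey
})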
Error Handling:
- Invalid session IDs return clear error messages
- Expired sessions gracefully start new conversations
- Storage failures fall back to stateless operation (see the sketch below)
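The stateless fallback could be as simple as this sketch, reusing the hypothetical loadHistory helper from above:

// If Valkey is unreachable, degrade to a stateless request instead of failing
async function getContext (sessionId) {
  try {
    return await loadHistory(sessionId)
  } catch (err) {
    console.warn('Session storage unavailable, continuing stateless', err)
    return null // no history: the request proceeds without prior context
  }
}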
This architecture enables true horizontal scaling: add more AI Warp instances without losing conversation state, and users can continue conversations regardless of which instance handles their request.
Real-Time Streaming with Automatic Resume
Deliver instant, responsive AI experiences with Server-Sent Events (SSE) streaming, now with automatic resume capability for fault-tolerant streaming:
const client = buildClient({ url: 'http://localhost:3042' })

// Start streaming with session management
const response1 = await client.ask({
  prompt: 'Write a long story about AI',
  stream: true
})
const sessionId = response1.sessionId

// Process stream until connection interruption
for await (const chunk of response1.stream) {
  console.log(chunk.content) // Real-time response chunks
  // Connection interrupted here...
  break
}

// Automatically resume from where you left off
const response2 = await client.ask({
  prompt: 'Continue the story', // Ignored during resume
  sessionId: sessionId, // Triggers automatic resume
  stream: true
})

// Continue receiving remaining content seamlessly
for await (const chunk of response2.stream) {
  console.log(chunk.content) // Continues exactly where it left off
}
Stream Resume Benefits:
- Zero Configuration: Resume happens automatically with sessionId + streaming
- Fault Tolerance: Recover from network interruptions, timeouts, and connection drops
- Bandwidth Efficiency: Only streams remaining content, not the full response
- Graceful Fallback: Automatically falls back to normal requests if resume fails
Built for Production from Day One
Rock-Solid Reliability
- Optimized HTTP Client: Efficient connection reuse with undici
- Connection Pooling: Efficient resource management
- Graceful Degradation: Automatic failover between providers
- Timeout Management: Configurable timeouts with automatic cleanup
- Error Recovery: Sophisticated retry logic with exponential backoff (sketched below)
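To give a flavor of what exponential backoff means in practice, here is a generic hand-rolled sketch; it is not AI Warp's internal implementation:

// Generic retry with exponential backoff: 500 ms, 1 s, 2 s, ...
async function withRetry (fn, { attempts = 3, baseMs = 500 } = {}) {
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn()
    } catch (err) {
      if (i === attempts - 1) throw err // out of retries
      await new Promise(resolve => setTimeout(resolve, baseMs * 2 ** i))
    }
  }
}

const reply = await withRetry(() => client.ask({ prompt: 'Hello!' }))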
Deploy Anywhere, Scale Everywhere
- Self-Hostable: Complete control over your AI infrastructure
- Kubernetes Ready: Native container support for cloud-native deployments
- Universal Deployment: Runs anywhere Node.js runs, from Docker and VMs to serverless and edge
- Distributed Architecture: Share sessions across multiple instances
- Rate Limit Management: Automatic handling of provider limits
- Resource Optimization: Configurable limits per model and provider
Get Started in Minutes
Quick Start with Wattpm
mkdir my-ai-app
cd my-ai-app
npx wattpm@latest create
Select @platformatic/ai-warp, configure your API keys, and you're running a production-ready AI service in under 2 minutes.
Or Install the Client
npm install @platformatic/ai-client
import { buildClient } from '@platformatic/ai-client'

const client = buildClient({
  url: 'http://localhost:3042'
})

const response = await client.ask({
  prompt: 'Hello AI, how are you today?',
  stream: false
})

console.log(response.content.text)
// "Hello! I'm doing well, thank you for asking..."
What's Next?
This is just the beginning. Our roadmap includes:
- More Providers: Anthropic Claude, Cohere, and custom model support
- Advanced Routing: Cost-based and performance-based model selection
- Analytics Dashboard: Real-time monitoring and usage analytics
- Plugin Ecosystem: Custom providers and middleware support
Join the AI Revolution
Platformatic AI-Warp 1.0.0 is available now. Whether you're building a chatbot, content generator, or the next breakthrough AI application, we've got you covered.
Ready to transform your AI development?
- Read the Documentation
- Join our Community
- Report Issues
Built with ❤️ by the Platformatic Team
Ready to warp into the future of AI development? Get started today and experience the difference of having a true AI operations platform at your fingertips.
Follow us on Twitter for updates and join our Discord community to connect with other developers building the future with AI.