
Introducing Platformatic AI-Warp 1.0.0

The Ultimate AI Gateway for Modern Applications


The future of AI integration is here, and it's built for scale.

Today, we're thrilled to announce the official launch of Platformatic AI-Warp 1.0.0, a comprehensive AI gateway that transforms how developers integrate multiple AI providers into their applications. After extensive development and real-world testing, we're ready to revolutionize AI development with a unified, scalable, and intelligent platform.

The Problem We Solved

Building production-ready AI applications has been a nightmare. Developers face a maze of challenges:

  • Provider Lock-in: Choosing one AI provider means being stuck with their limitations
  • Complex Integration: Each provider has different APIs, authentication, and response formats
  • Reliability Issues: No built-in fallback when your primary AI service goes down
  • Session Management: Maintaining conversation context across requests is a headache
  • Scaling Bottlenecks: Managing rate limits, timeouts, and connection pooling manually

Sound familiar? We've been there too.

Enter AI-Warp: Your AI Gateway

Platformatic AI-Warp isn't just another AI wrapper – it's a complete AI operations platform that handles the complexity so you can focus on building great features.

🔄 Seamless Session Resume

Never lose conversation context again. AI Warp maintains perfect conversation continuity across service restarts, load balancer switches, and distributed deployments:

// Resume any conversation instantly
const response = await client.ask({
  prompt: 'What did we discuss about AI safety?',
  sessionId: 'sess_abc123xyz'  // Continues exactly where you left off
})

🌐 One API to Rule Them All

Connect to OpenAI, DeepSeek, and Google Gemini through a single, unified interface. Switch between providers without changing a single line of application code:

// Simple string format
models: ["openai:gpt-4", "gemini:gemini-2.0-flash", "deepseek:deepseek-chat"]

// Or detailed configuration
models: [
  {
    provider: 'openai',
    model: 'gpt-4o',
    limits: { maxTokens: 4000, rate: { max: 100, timeWindow: '1m' } }
  },
  {
    provider: 'gemini',
    model: 'gemini-2.0-flash',
    limits: { maxTokens: 8000 }
  }
]

🔄 Intelligent Automatic Fallback

AI Warp automatically tries the next model in your chain when your primary model hits rate limits or fails. No more 5am alerts about broken AI features:

// Configure automatic restoration after failures
restore: {
  rateLimit: '5m',        // Retry after 5 minutes
  timeout: '2m',          // Retry after 2 minutes
  providerError: '10m'    // Retry after 10 minutes
}
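
To make the fallback behavior concrete, here is a small, self-contained sketch of the idea: try each model in order, and skip any model that failed recently until its restore window has elapsed. The function names, error codes, and in-memory bookkeeping are illustrative assumptions, not AI-Warp internals:

```javascript
// Sketch of the fallback idea: try each model in order, and skip any model
// that failed recently until its restore window has elapsed.
// Model names, error codes, and windows below are illustrative assumptions.
const RESTORE_MS = {
  rateLimit: 5 * 60_000,     // mirrors restore.rateLimit: '5m'
  timeout: 2 * 60_000,       // mirrors restore.timeout: '2m'
  providerError: 10 * 60_000 // mirrors restore.providerError: '10m'
}

function createFallbackChain(models) {
  const retryAt = new Map() // model -> earliest timestamp it may be tried again

  return async function ask(prompt, callModel, now = Date.now()) {
    for (const model of models) {
      if (now < (retryAt.get(model) ?? 0)) continue // still parked, skip it

      try {
        return await callModel(model, prompt)
      } catch (err) {
        // Park the failed model for its restore window, then try the next one
        retryAt.set(model, now + (RESTORE_MS[err.code] ?? RESTORE_MS.providerError))
      }
    }
    throw new Error('All models in the chain are unavailable')
  }
}
```

The key design point is that a rate-limited model is not retried on every request; it stays parked until its window expires, so traffic flows straight to the healthy fallback in the meantime.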

💾 Enterprise-Grade Session Management

Maintain conversation context effortlessly with built-in session storage. Choose between lightning-fast in-memory storage or distributed Valkey/Redis for multi-instance deployments:

// Distributed session storage
storage: {
  type: 'valkey',
  valkey: {
    host: 'localhost',
    port: 6379,
    database: 0
  }
}

How Valkey Session Storage Works

AI Warp's Valkey integration provides enterprise-grade session management through a sophisticated distributed storage architecture:

Key Features:

  • Distributed State: Sessions are immediately available across all service instances
  • Automatic Expiration: Configurable TTL prevents memory bloat (historyExpiration: '1d')
  • JSON Serialization: Efficient storage of conversation history with Redis Lists
  • Connection Pooling: Optimized connection management with automatic reconnection
  • Cross-Instance Continuity: Users can continue conversations on any instance

Session Lifecycle:

  1. Session Creation: Auto-generated unique session ID (sess_abc123xyz)
  2. History Storage: Each conversation turn is stored as JSON in Redis Lists using LPUSH
  3. Context Retrieval: Previous conversation retrieved with LRANGE for AI context
  4. Automatic Cleanup: Sessions expire based on configured TTL (default: 24 hours)

Session Resume: Seamless Conversation Continuity

One of AI Warp's most powerful features is seamless session resumption. Here's how it works:

Resume Flow:

// Client resumes conversation with existing session ID
const response = await client.ask({
  prompt: 'What did we discuss about earlier?',
  sessionId: 'sess_abc123xyz'  // Previously stored session ID
})

Behind the Scenes:

  1. Session Validation: AI Warp validates the session ID exists in Valkey
  2. History Retrieval: Complete conversation history is fetched using LRANGE sess_abc123xyz 0 -1
  3. Context Reconstruction: Previous conversation turns are provided to the AI model as context
  4. Contextual Response: AI responds with full awareness of previous conversation
  5. History Update: New conversation turn is added to the existing session

Cross-Instance Resume:

  • User starts conversation on Instance A
  • Session stored in shared Valkey cluster
  • User's next request hits Instance B
  • Instance B seamlessly retrieves full conversation history
  • Response maintains perfect conversation continuity

Error Handling:

  • Invalid session IDs return clear error messages
  • Expired sessions gracefully start new conversations
  • Storage failures fallback to stateless operation
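
Client-side, that graceful degradation can be handled with a small wrapper: try to resume with the stored session ID, and on a stale-session error retry the same prompt statelessly. The `SESSION_*` error codes checked here are illustrative assumptions, not documented AI-Warp error names:

```javascript
// Client-side degradation sketch: try to resume with the stored sessionId,
// and on a stale-session error retry the same prompt statelessly.
// The client shape and SESSION_* error codes are illustrative assumptions.
async function askWithFallback(client, prompt, sessionId) {
  if (sessionId) {
    try {
      return await client.ask({ prompt, sessionId })
    } catch (err) {
      if (err.code !== 'SESSION_EXPIRED' && err.code !== 'SESSION_NOT_FOUND') {
        throw err // real failures still surface to the caller
      }
      // Stale session: fall through and start a fresh conversation
    }
  }
  return client.ask({ prompt })
}
```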

This architecture enables true horizontal scaling - add more AI Warp instances without losing conversation state, and users can seamlessly continue conversations regardless of which instance handles their request.

🌊 Real-Time Streaming with Automatic Resume

Deliver instant, responsive AI experiences with Server-Sent Events (SSE) streaming, now with automatic resume capability for fault-tolerant streaming:

const client = buildClient({ url: 'http://localhost:3042' })

// Start streaming with session management
const response1 = await client.ask({ 
  prompt: 'Write a long story about AI', 
  stream: true 
})

const sessionId = response1.sessionId

// Process stream until connection interruption
for await (const chunk of response1.stream) {
  console.log(chunk.content) // Real-time response chunks
  // Connection interrupted here...
  break
}

// Automatically resume from where you left off
const response2 = await client.ask({ 
  prompt: 'Continue the story',  // Ignored during resume
  sessionId: sessionId,          // Triggers automatic resume
  stream: true 
})

// Continue receiving remaining content seamlessly
for await (const chunk of response2.stream) {
  console.log(chunk.content) // Continues exactly where it left off
}

Stream Resume Benefits:

  • Zero Configuration: Resume happens automatically with sessionId + streaming
  • Fault Tolerance: Recover from network interruptions, timeouts, and connection drops
  • Bandwidth Efficiency: Only streams remaining content, not the full response
  • Graceful Fallback: Automatically falls back to normal requests if resume fails
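
Putting those properties together, a consumer loop only needs to re-issue the request with the same session ID when the stream breaks. Here is a hedged sketch of such a loop; `client.ask` and the chunk shape follow the example above, while `maxAttempts` and the retry structure are assumptions:

```javascript
// Fault-tolerant consumption sketch: collect a streamed answer, re-asking
// with the same sessionId whenever the stream breaks so resume delivers
// only the remaining chunks. client.ask and the chunk shape follow the
// example above; maxAttempts and the loop structure are assumptions.
async function collectStream(client, prompt, { maxAttempts = 3 } = {}) {
  let sessionId
  let text = ''

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const response = await client.ask({ prompt, sessionId, stream: true })
      sessionId = response.sessionId
      for await (const chunk of response.stream) {
        text += chunk.content
      }
      return { sessionId, text } // stream finished cleanly
    } catch (err) {
      if (!sessionId || attempt === maxAttempts) throw err
      // Interrupted mid-stream: loop again; sessionId triggers the resume
    }
  }
}
```

Because resume only streams the remaining content, the accumulated `text` ends up complete without duplicating chunks received before the interruption.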

Built for Production from Day One

πŸ›‘οΈ Rock-Solid Reliability

  • Optimized HTTP Client: Efficient connection reuse with undici
  • Connection Pooling: Efficient resource management
  • Graceful Degradation: Automatic failover between providers
  • Timeout Management: Configurable timeouts with automatic cleanup
  • Error Recovery: Sophisticated retry logic with exponential backoff
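
The retry-with-exponential-backoff pattern can be illustrated with a classic helper; the delays and the injectable `sleep` parameter here are illustrative, not AI-Warp's actual tuning:

```javascript
// Classic exponential backoff sketch: wait baseMs, 2*baseMs, 4*baseMs, ...
// between attempts. Delay values are illustrative, not AI-Warp's tuning;
// `sleep` is injectable so the helper is easy to test.
async function withRetry(fn, {
  retries = 3,
  baseMs = 100,
  sleep = ms => new Promise(resolve => setTimeout(resolve, ms))
} = {}) {
  let lastError
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn()
    } catch (err) {
      lastError = err
      if (attempt < retries) await sleep(baseMs * 2 ** attempt)
    }
  }
  throw lastError
}
```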

📈 Deploy Anywhere, Scale Everywhere

  • Self-Hostable: Complete control over your AI infrastructure
  • Kubernetes Ready: Native container support for cloud-native deployments
  • Universal Deployment: Runs anywhere Node.js runs - Docker, VMs, serverless, edge
  • Distributed Architecture: Share sessions across multiple instances
  • Rate Limit Management: Automatic handling of provider limits
  • Resource Optimization: Configurable limits per model and provider

Get Started in Minutes

Quick Start with Wattpm

mkdir my-ai-app
cd my-ai-app
npx wattpm@latest create

Select @platformatic/ai-warp, configure your API keys, and you're running a production-ready AI service in under 2 minutes.

Or Install the Client

npm install @platformatic/ai-client

import { buildClient } from '@platformatic/ai-client'

const client = buildClient({
  url: 'http://localhost:3042'
})

const response = await client.ask({
  prompt: 'Hello AI, how are you today?',
  stream: false
})

console.log(response.content.text)
// "Hello! I'm doing well, thank you for asking..."

What's Next?

This is just the beginning. Our roadmap includes:

  • More Providers: Anthropic Claude, Cohere, and custom model support
  • Advanced Routing: Cost-based and performance-based model selection
  • Analytics Dashboard: Real-time monitoring and usage analytics
  • Plugin Ecosystem: Custom providers and middleware support

Join the AI Revolution

Platformatic AI-Warp 1.0.0 is available now. Whether you're building a chatbot, content generator, or the next breakthrough AI application, we've got you covered.

Ready to transform your AI development?

Built with ❤️ by the Platformatic Team

Ready to warp into the future of AI development? Get started today and experience the difference of having a true AI operations platform at your fingertips.


Follow us on Twitter for updates and join our Discord community to connect with other developers building the future with AI.