retellai-performance-tuning (Claude Skill)
Optimize Retell AI API performance with caching, batching, and connection pooling.
| Field | Value |
|---|---|
| name | retellai-performance-tuning |
| description | Optimize Retell AI API performance with caching, batching, and connection pooling. Use when experiencing slow API responses, implementing caching strategies, or optimizing request throughput for Retell AI integrations. Trigger with phrases like "retellai performance", "optimize retellai", "retellai latency", "retellai caching", "retellai slow", "retellai batch". |
| allowed-tools | Read, Write, Edit |
| version | 1.0.0 |
| license | MIT |
| author | Jeremy Longshore <jeremy@intentsolutions.io> |
Retell AI Performance Tuning
Overview
Optimize Retell AI API performance with caching, batching, and connection pooling.
Prerequisites
- Retell AI SDK installed
- Understanding of async patterns
- Redis or in-memory cache available (optional)
- Performance monitoring in place
Latency Benchmarks
| Operation | P50 | P95 | P99 |
|---|---|---|---|
| Read | 50ms | 150ms | 300ms |
| Write | 100ms | 250ms | 500ms |
| List | 75ms | 200ms | 400ms |
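The P50/P95/P99 columns above can be derived from raw latency samples. The following is a minimal sketch using the nearest-rank method; `percentile` and `summarizeLatency` are hypothetical helper names, and how you collect the samples is left to your monitoring setup.

```typescript
// Nearest-rank percentile over latency samples in milliseconds.
// `percentile` and `summarizeLatency` are illustrative helpers, not SDK functions.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  // Nearest rank: the smallest value with at least p% of samples at or below it.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(0, rank - 1)];
}

function summarizeLatency(samples: number[]): { p50: number; p95: number; p99: number } {
  return {
    p50: percentile(samples, 50),
    p95: percentile(samples, 95),
    p99: percentile(samples, 99),
  };
}
```

Comparing the summary for each operation against the table gives a quick signal of whether a given call path is within budget.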
Caching Strategy
Response Caching
    import { LRUCache } from 'lru-cache';

    // In-memory LRU cache: up to 1000 entries, 1-minute default TTL.
    const cache = new LRUCache<string, any>({
      max: 1000,
      ttl: 60000, // 1 minute
      updateAgeOnGet: true,
    });

    async function cachedRetellAIRequest<T>(
      key: string,
      fetcher: () => Promise<T>,
      ttl?: number
    ): Promise<T> {
      const cached = cache.get(key);
      // Compare against undefined so falsy values (0, '', false) still count as hits.
      if (cached !== undefined) return cached as T;

      const result = await fetcher();
      cache.set(key, result, { ttl });
      return result;
    }
Redis Caching (Distributed)
    import Redis from 'ioredis';

    // REDIS_URL must be set in the environment.
    const redis = new Redis(process.env.REDIS_URL!);

    async function cachedWithRedis<T>(
      key: string,
      fetcher: () => Promise<T>,
      ttlSeconds = 60
    ): Promise<T> {
      const cached = await redis.get(key);
      // redis.get resolves to null on a miss.
      if (cached !== null) return JSON.parse(cached) as T;

      const result = await fetcher();
      // setex stores the value with an expiry in seconds.
      await redis.setex(key, ttlSeconds, JSON.stringify(result));
      return result;
    }
Request Batching
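    import DataLoader from 'dataloader';

    // `retellaiClient.batchGet` is a placeholder for whatever bulk-fetch call
    // your client exposes; confirm that a batch endpoint exists before relying on it.
    const retellaiLoader = new DataLoader<string, any>(
      async (ids) => {
        // Batch fetch from Retell AI
        const results = await retellaiClient.batchGet(ids);
        // DataLoader requires results in the same order as `ids`.
        return ids.map(id => results.find(r => r.id === id) || null);
      },
      {
        maxBatchSize: 100,
        // Wait 10 ms so loads issued close together share one batch.
        batchScheduleFn: callback => setTimeout(callback, 10),
      }
    );

    // Usage - automatically batched
    const [item1, item2, item3] = await Promise.all([
      retellaiLoader.load('id-1'),
      retellaiLoader.load('id-2'),
      retellaiLoader.load('id-3'),
    ]);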
Connection Optimization
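    import { Agent } from 'https';

    // Keep-alive connection pooling
    const agent = new Agent({
      keepAlive: true,
      maxSockets: 10,    // max concurrent sockets per host
      maxFreeSockets: 5, // idle sockets kept open for reuse
      timeout: 30000,    // socket timeout in ms
    });

    // `RetellAIClient` and `httpAgent` are placeholders; check your SDK version
    // for the actual client class and the option that accepts a custom agent.
    const client = new RetellAIClient({
      apiKey: process.env.RETELLAI_API_KEY!,
      httpAgent: agent,
    });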
Pagination Optimization
    // Lazily walks a cursor-paginated endpoint, yielding one item at a time.
    async function* paginatedRetellAIList<T>(
      fetcher: (cursor?: string) => Promise<{ data: T[]; nextCursor?: string }>
    ): AsyncGenerator<T> {
      let cursor: string | undefined;
      do {
        const { data, nextCursor } = await fetcher(cursor);
        for (const item of data) {
          yield item;
        }
        cursor = nextCursor;
      } while (cursor);
    }

    // Usage (`handleItem` is a placeholder for your own per-item logic)
    for await (const item of paginatedRetellAIList(cursor =>
      retellaiClient.list({ cursor, limit: 100 })
    )) {
      await handleItem(item);
    }
Performance Monitoring
    // Times an async call and logs its duration in milliseconds on success or failure.
    async function measuredRetellAICall<T>(
      operation: string,
      fn: () => Promise<T>
    ): Promise<T> {
      const start = performance.now();
      try {
        const result = await fn();
        const duration = performance.now() - start;
        console.log({ operation, duration, status: 'success' });
        return result;
      } catch (error) {
        const duration = performance.now() - start;
        console.error({ operation, duration, status: 'error', error });
        throw error;
      }
    }
Instructions
Step 1: Establish Baseline
Measure current latency for critical Retell AI operations.
Step 2: Implement Caching
Add response caching for frequently accessed data.
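A cache is only as good as its keys: two calls that ask for the same data should map to the same entry. The sketch below builds a deterministic key from an operation name and its parameters. `buildCacheKey`, the `retellai:` prefix, and the operation names in the usage are illustrative assumptions, not part of the Retell AI SDK.

```typescript
// Build a deterministic cache key from an operation name and its parameters.
// Parameter names are sorted so { a: 1, b: 2 } and { b: 2, a: 1 } share one entry.
// `buildCacheKey` and the "retellai:" prefix are illustrative choices.
type Primitive = string | number | boolean | null;

function buildCacheKey(operation: string, params: Record<string, Primitive> = {}): string {
  const parts = Object.keys(params)
    .sort()
    .map((k) => `${k}=${encodeURIComponent(String(params[k]))}`);
  return `retellai:${operation}${parts.length ? ':' + parts.join('&') : ''}`;
}

// Usage with hypothetical operation names
const agentKey = buildCacheKey('getAgent', { id: 'a1' });
const listKey = buildCacheKey('listCalls', { limit: 100, agent: 'a1' });
```

Encoding each value keeps separators such as `&` or `=` inside a parameter from colliding with the key structure.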
Step 3: Enable Batching
Use DataLoader or similar for automatic request batching.
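Where a full batching library is not needed, bounding the batch size by hand is often enough. This is a minimal sketch of a chunking helper; `chunk` is a hypothetical name, and the batch size of 2 in the usage is only for illustration.

```typescript
// Split a list of IDs into fixed-size batches so no single request grows unbounded.
// `chunk` is an illustrative helper, not part of DataLoader or any SDK.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('size must be a positive integer');
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Usage: five IDs in batches of two; the last batch holds the remainder.
const batches = chunk(['id-1', 'id-2', 'id-3', 'id-4', 'id-5'], 2);
```

Each batch can then be sent as one request, sequentially or with limited concurrency, depending on the rate limits you are working under.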
Step 4: Optimize Connections
Configure connection pooling with keep-alive.
Output
- Reduced API latency
- Caching layer implemented
- Request batching enabled
- Connection pooling configured
Error Handling
| Issue | Cause | Solution |
|---|---|---|
| Cache miss storm | TTL expired | Use stale-while-revalidate |
| Batch timeout | Too many items | Reduce batch size |
| Connection exhausted | No pooling | Configure max sockets |
| Memory pressure | Cache too large | Set max cache entries |
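The stale-while-revalidate fix for cache miss storms can be sketched without any dependencies. This is one possible shape under stated assumptions, not a definitive implementation: `swrGet`, `refresh`, `Entry`, and the injectable `now` clock are all hypothetical names, and the in-process `Map` stands in for whatever cache you actually use.

```typescript
// Stale-while-revalidate sketch: serve the cached value immediately, even when
// stale, and refresh it in the background with at most one refresh per key.
// `swrGet`, `refresh`, `Entry`, and the `now` parameter are illustrative names.
interface Entry<T> { value: T; expiresAt: number }

const entries = new Map<string, Entry<unknown>>();
const inflight = new Map<string, Promise<unknown>>();

function refresh<T>(
  key: string,
  fetcher: () => Promise<T>,
  ttlMs: number,
  now: () => number
): Promise<T> {
  const existing = inflight.get(key);
  if (existing) return existing as Promise<T>; // coalesce concurrent refreshes
  const pending = fetcher()
    .then((value) => {
      entries.set(key, { value, expiresAt: now() + ttlMs });
      return value;
    })
    .finally(() => { inflight.delete(key); });
  inflight.set(key, pending);
  return pending;
}

async function swrGet<T>(
  key: string,
  fetcher: () => Promise<T>,
  ttlMs = 60_000,
  now: () => number = Date.now
): Promise<T> {
  const entry = entries.get(key) as Entry<T> | undefined;
  // Cold miss: nothing to serve, so the caller has to wait for the fetch.
  if (!entry) return refresh(key, fetcher, ttlMs, now);
  if (entry.expiresAt <= now()) {
    // Stale: start a background refresh, ignore its errors, return the old value now.
    refresh(key, fetcher, ttlMs, now).catch(() => {});
  }
  return entry.value;
}
```

Because concurrent refreshes for the same key share one in-flight promise, an expired TTL triggers a single upstream request rather than one per caller.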
Examples
Quick Performance Wrapper
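    // Combine timing and caching. The cache key is derived from `name` only,
    // so use a distinct name per resource (for example, include the ID).
    const withPerformance = <T>(name: string, fn: () => Promise<T>) =>
      measuredRetellAICall(name, () =>
        cachedRetellAIRequest(`cache:${name}`, fn)
      );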
Next Steps
For cost optimization, see retellai-cost-tuning.