Monitoring and Logging
Your AI app is deployed and running. But how do you know if it's actually working? Are users getting errors? Is the AI responding too slowly? Are you burning through your API budget? Monitoring and logging answer these questions.
Why Monitoring Matters for AI Apps
AI apps face challenges that traditional web apps don't:
| Challenge | Why It Matters |
|---|---|
| API outages | OpenAI, Anthropic, and other providers sometimes go down |
| Slow responses | AI API calls can take 2-30+ seconds |
| Cost spikes | A bug could trigger thousands of unnecessary API calls |
| Quality degradation | Model updates can change response quality |
| Rate limiting | You might hit API rate limits without knowing |
| Token overuse | Badly constructed prompts waste tokens and money |
Without monitoring, you'll only know about problems when users complain — and by then, you might have a $500 API bill.
Logging API Calls and Responses
The most important thing to log in an AI app is every API call. Here's a logging wrapper pattern:
```typescript
// src/lib/aiLogger.ts
interface AILogEntry {
  timestamp: string;
  model: string;
  promptTokens: number;
  completionTokens: number;
  totalTokens: number;
  latencyMs: number;
  status: "success" | "error";
  error?: string;
  userId?: string;
  endpoint: string;
}

export function logAICall(entry: AILogEntry): void {
  // Log to console (visible in hosting platform logs)
  console.log(JSON.stringify({
    type: "ai_api_call",
    ...entry,
  }));

  // Optionally: save to database for analytics
  // await db.collection("ai_logs").add(entry);
}
```
Wrapping Your AI API Calls
```typescript
// src/services/aiService.ts
import OpenAI from "openai";
import { logAICall } from "@/lib/aiLogger";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

export async function generateResponse(
  prompt: string,
  userId?: string
): Promise<string> {
  const startTime = Date.now();

  try {
    const response = await openai.chat.completions.create({
      model: "gpt-4o",
      messages: [{ role: "user", content: prompt }],
      max_tokens: 1000,
    });

    const latencyMs = Date.now() - startTime;
    const usage = response.usage;

    logAICall({
      timestamp: new Date().toISOString(),
      model: "gpt-4o",
      promptTokens: usage?.prompt_tokens ?? 0,
      completionTokens: usage?.completion_tokens ?? 0,
      totalTokens: usage?.total_tokens ?? 0,
      latencyMs,
      status: "success",
      userId,
      endpoint: "chat.completions",
    });

    return response.choices[0]?.message?.content ?? "";
  } catch (error) {
    const latencyMs = Date.now() - startTime;

    logAICall({
      timestamp: new Date().toISOString(),
      model: "gpt-4o",
      promptTokens: 0,
      completionTokens: 0,
      totalTokens: 0,
      latencyMs,
      status: "error",
      error: error instanceof Error ? error.message : "Unknown error",
      userId,
      endpoint: "chat.completions",
    });

    throw error;
  }
}
```
This pattern gives you a complete picture of every AI interaction: how long it took, how many tokens it used, and whether it succeeded.
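Rate-limit and timeout errors from the challenges table are usually transient, so it's worth retrying them before surfacing a failure. Here's a minimal sketch of a generic retry helper with exponential backoff; `withRetry` and its default values are illustrative, not part of any SDK:

```typescript
// Retry a failing async call with exponential backoff — a minimal sketch.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts - 1) {
        // Wait 500ms, 1000ms, 2000ms, ... before the next attempt
        const delay = baseDelayMs * 2 ** attempt;
        await new Promise((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}
```

Usage: `const reply = await withRetry(() => generateResponse(prompt, userId));` — each retry still goes through the logging wrapper, so retried calls show up in your logs.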
What to ask your AI: "Create a logging wrapper for my [OpenAI/Anthropic/Google AI] API calls that tracks latency, token usage, and errors."
Error Tracking
Console logs work, but dedicated error tracking tools give you much more: stack traces, user context, error frequency, and alerts.
Sentry
Sentry is the most popular error-tracking tool; its free tier includes 5,000 errors per month.
```bash
npm install @sentry/nextjs
# or for plain React:
npm install @sentry/react
```
```typescript
// src/lib/sentry.ts
import * as Sentry from "@sentry/nextjs";

Sentry.init({
  dsn: process.env.NEXT_PUBLIC_SENTRY_DSN,
  environment: process.env.NODE_ENV,
  tracesSampleRate: 0.1, // Track 10% of transactions for performance
});
```
Now errors are automatically captured and sent to your Sentry dashboard with full context.
Custom Error Handling for AI
```typescript
// src/lib/errorHandler.ts
import * as Sentry from "@sentry/nextjs";

export function handleAIError(error: unknown, context: Record<string, unknown>) {
  // Add context for debugging
  Sentry.withScope((scope) => {
    scope.setTag("service", "ai_api");
    scope.setContext("ai_call", context);

    if (error instanceof Error) {
      // Categorize common AI API errors
      if (error.message.includes("rate_limit")) {
        scope.setTag("error_type", "rate_limit");
      } else if (error.message.includes("insufficient_quota")) {
        scope.setTag("error_type", "quota_exceeded");
      } else if (error.message.includes("timeout")) {
        scope.setTag("error_type", "timeout");
      }
    }

    // Capture non-Error values too, so nothing is silently dropped
    Sentry.captureException(error);
  });
}
```
LogRocket
LogRocket records user sessions so you can replay exactly what happened when an error occurred. Great for debugging "it doesn't work" reports.
```bash
npm install logrocket
```

```typescript
import LogRocket from "logrocket";

LogRocket.init("your-app-id/your-project");

// Identify users (assumes a `user` object from your auth layer)
LogRocket.identify(userId, {
  name: user.name,
  email: user.email,
});
```
What to ask your AI: "Set up Sentry error tracking for my Next.js app. Include custom error handling for AI API failures."
Performance Monitoring
Slow AI responses kill user experience. Monitor performance to catch issues early.
What to Track
| Metric | Target | Why |
|---|---|---|
| AI API latency | < 3 seconds | Users leave if responses are too slow |
| Time to First Token | < 1 second | For streaming responses |
| Page load time | < 2 seconds | Standard web performance |
| API error rate | < 1% | Reliability target |
| P95 latency | < 10 seconds | Worst-case experience |
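Time to First Token only matters for streaming responses, and measuring it just means timestamping the first chunk. Here's an illustrative sketch; `measureTTFT` is a hypothetical helper, not part of any SDK, and it works on any async iterable of text chunks (with the OpenAI SDK you would feed it the text deltas from a streamed completion):

```typescript
// Measure time to first token from a streamed response — a sketch.
async function measureTTFT(
  stream: AsyncIterable<string>
): Promise<{ text: string; ttftMs: number }> {
  const start = Date.now();
  let ttftMs = -1;
  let text = "";
  for await (const chunk of stream) {
    if (ttftMs === -1) ttftMs = Date.now() - start; // first chunk arrived
    text += chunk;
  }
  return { text, ttftMs };
}
```

Log `ttftMs` alongside the total latency from your logging wrapper so you can track both against the targets above.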
Vercel Analytics
If you're on Vercel, enable Web Analytics and Speed Insights:
```bash
npm install @vercel/analytics @vercel/speed-insights
```
```tsx
// app/layout.tsx (Next.js)
import { Analytics } from "@vercel/analytics/react";
import { SpeedInsights } from "@vercel/speed-insights/next";

export default function RootLayout({ children }: { children: React.ReactNode }) {
  return (
    <html>
      <body>
        {children}
        <Analytics />
        <SpeedInsights />
      </body>
    </html>
  );
}
```
Firebase Performance Monitoring
```typescript
import { getPerformance, trace } from "firebase/performance";
import { generateResponse } from "@/services/aiService";

// Initialize (requires an already-initialized Firebase app)
const perf = getPerformance();

// Custom trace for AI calls
async function trackedAICall(prompt: string) {
  const t = trace(perf, "ai_response");
  t.start();
  t.putAttribute("model", "gpt-4o");

  const result = await generateResponse(prompt);

  t.putMetric("response_length", result.length);
  t.stop();
  return result;
}
```
AI-Specific Monitoring
Beyond standard web monitoring, AI apps need specialized tracking.
Token Usage Dashboard
Build a simple dashboard to track token consumption:
```typescript
// src/services/tokenTracker.ts
interface TokenUsage {
  date: string;
  model: string;
  promptTokens: number;
  completionTokens: number;
  estimatedCost: number;
}

const COSTS_PER_1K_TOKENS: Record<string, { input: number; output: number }> = {
  "gpt-4o": { input: 0.0025, output: 0.01 },
  "gpt-4o-mini": { input: 0.00015, output: 0.0006 },
  "claude-3-5-sonnet": { input: 0.003, output: 0.015 },
};

export function calculateCost(
  model: string,
  promptTokens: number,
  completionTokens: number
): number {
  const costs = COSTS_PER_1K_TOKENS[model];
  if (!costs) return 0;
  return (
    (promptTokens / 1000) * costs.input +
    (completionTokens / 1000) * costs.output
  );
}
```
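To turn those per-call costs into a dashboard, group log entries by day. Here's a sketch; `summarizeUsage` and `LogRow` are illustrative names, and the hardcoded price mirrors the table above (prices drift over time, so treat it as a placeholder):

```typescript
// Aggregate raw log entries into per-day cost totals — a sketch.
interface LogRow {
  date: string; // "YYYY-MM-DD"
  model: string;
  promptTokens: number;
  completionTokens: number;
}

const PRICES: Record<string, { input: number; output: number }> = {
  "gpt-4o": { input: 0.0025, output: 0.01 }, // USD per 1K tokens
};

function summarizeUsage(rows: LogRow[]): Map<string, number> {
  const costByDay = new Map<string, number>();
  for (const row of rows) {
    const p = PRICES[row.model];
    if (!p) continue; // unknown model: skip rather than guess a price
    const cost =
      (row.promptTokens / 1000) * p.input +
      (row.completionTokens / 1000) * p.output;
    costByDay.set(row.date, (costByDay.get(row.date) ?? 0) + cost);
  }
  return costByDay;
}
```

For example, two gpt-4o calls of 1,000 prompt + 500 completion tokens each cost $0.0075 apiece, so the day's total comes out to $0.015.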
Latency Monitoring
Track how long AI calls take over time:
```typescript
// Log latency percentiles across a batch of recent calls
function trackLatency(latencies: number[]) {
  const sorted = [...latencies].sort((a, b) => a - b);
  const p50 = sorted[Math.floor(sorted.length * 0.5)];
  const p95 = sorted[Math.floor(sorted.length * 0.95)];
  const p99 = sorted[Math.floor(sorted.length * 0.99)];
  console.log(`Latency: p50=${p50}ms, p95=${p95}ms, p99=${p99}ms`);
}
```
Cost Alerts
Set up alerts when spending crosses thresholds:
```typescript
// In your logging function
const DAILY_BUDGET = 10; // $10 per day

async function checkBudget() {
  const today = new Date().toISOString().split("T")[0];
  const todayLogs = await db.collection("ai_logs")
    .where("date", "==", today)
    .get();

  let totalCost = 0;
  todayLogs.forEach((doc) => {
    totalCost += doc.data().estimatedCost;
  });

  if (totalCost > DAILY_BUDGET * 0.8) {
    // 80% of daily budget used — send alert
    console.warn(
      `⚠️ AI spending alert: $${totalCost.toFixed(2)} of $${DAILY_BUDGET} daily budget used`
    );
    // Send email/Slack notification
  }
}
```
What to ask your AI: "Create a monitoring dashboard for my AI app that tracks token usage, costs, latency, and error rates by day."
Setting Up Alerts
Don't wait for users to tell you something is broken. Set up alerts:
What to Alert On
| Alert | Threshold | Action |
|---|---|---|
| Error rate spike | > 5% of requests | Check AI API status page |
| High latency | P95 > 15 seconds | Check if model is overloaded |
| Daily cost exceeded | > 80% of budget | Review usage patterns |
| AI API down | Health check fails | Switch to fallback or show maintenance page |
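The "Action" column is only useful if the alert actually reaches you. One common pattern is posting to a chat channel; here's a minimal sketch assuming a Slack incoming webhook, where `SLACK_WEBHOOK_URL`, `formatAlert`, and `sendAlert` are all illustrative names for your own setup:

```typescript
// Push an alert to a chat channel — a sketch using a Slack incoming webhook.
function formatAlert(metric: string, value: number, threshold: number): string {
  return `🚨 ${metric}: ${value} exceeded threshold ${threshold}`;
}

async function sendAlert(metric: string, value: number, threshold: number) {
  const url = process.env.SLACK_WEBHOOK_URL;
  if (!url) return; // alerting not configured — fail quietly

  // Slack incoming webhooks accept a JSON body with a `text` field
  await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text: formatAlert(metric, value, threshold) }),
  });
}
```

You'd call `sendAlert("error_rate_pct", 7, 5)` from the budget check above or from a scheduled job that evaluates these thresholds.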
Simple Health Check Endpoint
```typescript
// app/api/health/route.ts (Next.js)
import { NextResponse } from "next/server";

export async function GET() {
  const checks = {
    server: "ok",
    aiApi: "unknown",
    database: "unknown", // wire up your own database ping here
  };

  // Check AI API
  try {
    const response = await fetch("https://api.openai.com/v1/models", {
      headers: { Authorization: `Bearer ${process.env.OPENAI_API_KEY}` },
    });
    checks.aiApi = response.ok ? "ok" : "degraded";
  } catch {
    checks.aiApi = "down";
  }

  // Only fail on known-bad checks; "unknown" (not yet implemented) passes
  const anyFailing = Object.values(checks).some(
    (v) => v === "down" || v === "degraded"
  );
  return NextResponse.json(checks, { status: anyFailing ? 503 : 200 });
}
```
Monitoring Checklist
✅ AI API calls are logged with latency, tokens, and status
✅ Error tracking is set up (Sentry or similar)
✅ Performance monitoring is active
✅ Token usage is tracked per user/day
✅ Cost estimates are calculated and logged
✅ Alerts are set for error spikes and budget thresholds
✅ Health check endpoint is available
✅ Logs are structured (JSON) for easy parsing
What's Next?
You know how to monitor your app. The next tutorial focuses on the business side: cost management and scaling — keeping your AI app affordable as it grows.
What to ask your AI: "Help me set up a monitoring stack for my AI app. I want to track errors, performance, and AI API costs."