
AI Streaming: Patterns That Matter in Production

2025-05-18

AI · Streaming · TypeScript · Vercel AI SDK



Streaming feels great when it’s smooth. It feels broken when it janks, duplicates text, or can’t cancel. These are a few patterns I reach for when building streaming AI features with the Vercel AI SDK.


1) The core loop: async iteration


Start here. Keep it boring.


import { streamText } from "ai"
import { openai } from "@ai-sdk/openai"

const result = streamText({
  model: openai("gpt-4-turbo"),
  prompt: "Explain recursion simply.",
})

for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}

This works because the consumer pulls chunks at its own pace.
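You can see the pull-based pacing without a model call. Here an async generator stands in for `textStream` (an assumption for illustration, not an SDK API), and an event log shows that the producer never runs ahead of the consumer:

```typescript
// Record the interleaving of production and consumption.
const events: string[] = []

// A stand-in stream: the generator body only advances when the
// consumer asks for the next chunk.
async function* fakeTextStream(): AsyncGenerator<string> {
  for (const chunk of ["a", "b"]) {
    events.push(`produced ${chunk}`)
    yield chunk
  }
}

for await (const chunk of fakeTextStream()) {
  events.push(`consumed ${chunk}`)
}
// events: produced a, consumed a, produced b, consumed b —
// each chunk is produced only when the consumer pulls it.
```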


2) Cancellation: treat it as a first-class feature


If users can’t stop a response, streaming isn’t “nice to have” — it’s frustrating.


const controller = new AbortController()

// don’t await here — keep the request in flight while the UI stays responsive
const response = fetch("/api/chat", {
  method: "POST",
  body: JSON.stringify({ prompt }),
  signal: controller.signal,
})

// call this on “Stop”; the pending fetch rejects with an AbortError
controller.abort()

Rule: wire cancel on day one. Everything else can iterate.
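In practice I keep one controller per in-flight request and recreate it on every send. A minimal sketch — `createChat`, the injectable fetcher, and the `/api/chat` route are my own stand-ins, not SDK APIs:

```typescript
// One AbortController per request; sending again cancels the previous one.
type Fetcher = (input: string, init: RequestInit) => Promise<Response>

function createChat(doFetch: Fetcher = fetch) {
  let controller: AbortController | null = null

  async function send(prompt: string): Promise<Response> {
    controller?.abort() // starting a new request cancels the old one
    controller = new AbortController()
    return doFetch("/api/chat", {
      method: "POST",
      body: JSON.stringify({ prompt }),
      signal: controller.signal,
    })
  }

  // wire this to the “Stop” button
  function stop() {
    controller?.abort()
    controller = null
  }

  return { send, stop }
}
```

Injecting the fetcher is optional, but it makes the abort wiring testable without a server.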


3) Buffering: a small queue solves most edge cases


You only need a queue when you want to decouple producer and consumer (batching, scheduling, multiple consumers).


class TokenBuffer {
  private queue: string[] = []
  private waiters: Array<(value: string) => void> = []

  push(token: string) {
    // hand the token straight to a waiting consumer, else queue it
    const waiter = this.waiters.shift()
    if (waiter) waiter(token)
    else this.queue.push(token)
  }

  async pull(): Promise<string> {
    // check length, not truthiness — an empty-string token is still a token
    if (this.queue.length > 0) return this.queue.shift()!
    return new Promise((resolve) => this.waiters.push(resolve))
  }
}

This is the producer/consumer pattern you’ll reuse everywhere.
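One wrinkle the buffer above skips is completion: the consumer has no way to know the stream is done. A sketch of a variant with a close signal — the class name and the `null` end-of-stream marker are my own choices, not part of any SDK:

```typescript
// A TokenBuffer variant that can signal end-of-stream with null.
class ClosableTokenBuffer {
  private queue: string[] = []
  private waiters: Array<(value: string | null) => void> = []
  private closed = false

  push(token: string) {
    const waiter = this.waiters.shift()
    if (waiter) waiter(token)
    else this.queue.push(token)
  }

  close() {
    this.closed = true
    // wake every pending consumer with the end-of-stream marker
    for (const waiter of this.waiters.splice(0)) waiter(null)
  }

  // resolves with the next token, or null once the stream is finished
  async pull(): Promise<string | null> {
    if (this.queue.length > 0) return this.queue.shift()!
    if (this.closed) return null
    return new Promise((resolve) => this.waiters.push(resolve))
  }
}
```

Consumers then loop with `while ((token = await buf.pull()) !== null)` and exit cleanly when the producer closes.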


4) Transforms: modify the stream without rewriting your pipeline


If you need redaction, formatting, or annotations, transforms are clean and composable.


const uppercaseStream = result.textStream.pipeThrough(
  new TransformStream<string, string>({
    transform(chunk, controller) {
      controller.enqueue(chunk.toUpperCase())
    },
  })
)

for await (const chunk of uppercaseStream) {
  console.log(chunk)
}
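Redaction fits the same shape. A sketch with a stand-in source stream — the e-mail regex is illustrative only, and a real redactor must also handle matches that span chunk boundaries (for example by buffering a small tail):

```typescript
// Per-chunk redaction transform (does not handle cross-chunk matches).
const redactEmails = new TransformStream<string, string>({
  transform(chunk, controller) {
    controller.enqueue(chunk.replace(/\S+@\S+\.\S+/g, "[redacted]"))
  },
})

// A stand-in source stream, in place of result.textStream.
const source = new ReadableStream<string>({
  start(controller) {
    controller.enqueue("Contact me at alice@example.com ")
    controller.enqueue("for details.")
    controller.close()
  },
})

let out = ""
for await (const chunk of source.pipeThrough(redactEmails)) {
  out += chunk
}
// out === "Contact me at [redacted] for details."
```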

5) Accumulation: keep the full text only when you need it


Streaming UI usually only needs append-only rendering. Accumulate only when you need the full response for parsing, logging, or post-processing.


let full = ""

for await (const chunk of result.textStream) {
  full += chunk
  // stream chunk to UI
}

// full response available here

The takeaway


Most streaming issues are not “AI problems”. They are UI and state problems:


  • stream append-only
  • support cancellation
  • buffer only when you must
  • transform in a pipeline
  • accumulate only when needed