Streaming

Parse Server-Sent Events and handle interrupted model responses.

Overview

Streaming returns model output incrementally as a series of small Server-Sent Events (SSE) instead of waiting for the full response. This improves perceived latency and lets the UI render tokens as they arrive.

SSE format

Each event is a line that starts with `data:`, events are separated by a blank line, and the stream ends with a `[DONE]` sentinel:

data: {"choices":[{"delta":{"content":"Hello"}}]}

data: {"choices":[{"delta":{"content":" world"}}]}

data: [DONE]
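
As a rough sketch of how a single event block in the format above might be classified, the helper below separates content deltas, the `[DONE]` sentinel, and anything else. `parseSseEvent` and the `SseResult` type are hypothetical names introduced for illustration, not part of any SDK, and the sketch assumes each event carries exactly one `data:` line.

```typescript
// Hypothetical helper for illustration; not part of any SDK.
type SseResult =
  | { kind: "delta"; text: string }
  | { kind: "done" }
  | { kind: "skip" };

// Classify one event block (the text between two blank lines).
// Assumes a single `data:` line per event, as in the examples above.
function parseSseEvent(block: string): SseResult {
  const line = block.trim();
  if (!line.startsWith("data:")) return { kind: "skip" };

  const payload = line.slice("data:".length).trim();
  if (payload === "[DONE]") return { kind: "done" };

  // JSON.parse throws on malformed input; callers should only pass
  // complete event blocks.
  const json = JSON.parse(payload);
  return { kind: "delta", text: json.choices?.[0]?.delta?.content ?? "" };
}
```

Returning a tagged result rather than a bare string lets the caller tell an empty delta apart from the end of the stream.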

JavaScript parser

const response = await fetch("https://uouo.cloud/v1/chat/completions", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.UOUODUO_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "gpt-4o-mini",
    stream: true,
    messages: [{ role: "user", content: "Explain streaming." }],
  }),
});

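if (!response.ok || !response.body) {
  throw new Error(`Streaming request failed: HTTP ${response.status}`);
}

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = "";
let finished = false;

while (!finished) {
  const { value, done } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });
  // Events are separated by a blank line. The last piece may be an
  // incomplete event, so keep it in the buffer until more data arrives.
  const parts = buffer.split("\n\n");
  buffer = parts.pop() ?? "";
  for (const part of parts) {
    if (!part.startsWith("data: ")) continue;
    const data = part.slice(6).trim();
    if (data === "[DONE]") {
      // Stop reading entirely, not just this batch of events.
      finished = true;
      break;
    }
    const json = JSON.parse(data);
    process.stdout.write(json.choices?.[0]?.delta?.content ?? "");
  }
}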

Failure handling

Streaming can end early if the client disconnects, the network fails, or the upstream provider times out. Treat partial output as incomplete unless your application can safely resume or regenerate.
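
One way to detect an early end, sketched below, is to record whether the `[DONE]` sentinel was actually received and to treat the accumulated text as incomplete otherwise. `StreamAccumulator` is a hypothetical class written for this guide, not an SDK type. It assumes the event format shown in the SSE format section.

```typescript
// Hypothetical helper for illustration; the names are not from any SDK.
class StreamAccumulator {
  text = "";
  private sawDone = false;
  private buffer = "";

  // Feed decoded text as it arrives; chunks may split an event anywhere.
  push(chunk: string): void {
    this.buffer += chunk;
    const parts = this.buffer.split("\n\n");
    this.buffer = parts.pop() ?? ""; // keep the trailing partial event
    for (const part of parts) {
      if (this.sawDone || !part.startsWith("data: ")) continue;
      const data = part.slice(6).trim();
      if (data === "[DONE]") {
        this.sawDone = true;
        continue;
      }
      const json = JSON.parse(data);
      this.text += json.choices?.[0]?.delta?.content ?? "";
    }
  }

  // Call when the stream closes or errors. Returns true only if the
  // sentinel was received; a leftover partial event is discarded.
  end(): boolean {
    if (this.buffer.trim() === "data: [DONE]") this.sawDone = true;
    return this.sawDone;
  }
}
```

A caller would `push` each decoded chunk and call `end()` in a `finally` block. If it returns `false`, `text` holds only partial output and the application can decide whether to regenerate.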

Billing note

Final usage may arrive in the last chunk when supported. If that chunk is missing, use `/app/logs` as the authoritative record.
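
The check described above might look like the following sketch. It assumes the final chunk carries a top-level `usage` object with token counts. That field name and its members are assumptions, not confirmed by this guide, so verify them against the actual response. `findUsage` is a hypothetical helper that takes the `data:` payloads already collected from the stream.

```typescript
// Assumed shape of the usage object; field names are illustrative only.
interface Usage {
  prompt_tokens?: number;
  completion_tokens?: number;
  total_tokens?: number;
}

// Hypothetical helper: scan collected `data:` payloads from last to first
// and return the first usage object found, or null if none arrived.
function findUsage(payloads: string[]): Usage | null {
  for (const payload of [...payloads].reverse()) {
    if (payload === "[DONE]") continue;
    try {
      const usage = JSON.parse(payload).usage;
      if (usage && typeof usage === "object") return usage as Usage;
    } catch {
      // Ignore truncated or malformed payloads from an interrupted stream.
    }
  }
  return null;
}
```

A `null` result is the signal to fall back to `/app/logs` rather than estimating usage from the streamed text.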