
Server-Sent Events in .NET 10 — Streaming LLM Responses Without WebSockets

.NET 10 ships SSE as a first-class Minimal API result. Streaming OpenAI tokens to a browser is now a one-line endpoint.

I've carried the same half-rotten SSE utility class through three different projects. .NET 10 finally killed it: TypedResults.ServerSentEvents is built in, and the result is the smallest streaming endpoint I've ever shipped.

using System.Net.ServerSentEvents;     // SseItem<T>
using System.Runtime.CompilerServices; // [EnumeratorCancellation]
using Microsoft.Extensions.AI;         // IChatClient, ChatMessage

app.MapPost("/chat/stream", (ChatRequest req, IChatClient chat, CancellationToken ct) =>
    TypedResults.ServerSentEvents(StreamReply(req, chat, ct)));

static async IAsyncEnumerable<SseItem<string>> StreamReply(
    ChatRequest req, IChatClient chat, [EnumeratorCancellation] CancellationToken ct)
{
    // ct is bound to the request's abort token, so a client disconnect
    // flows straight into the LLM call.
    await foreach (var update in chat.GetStreamingResponseAsync(req.Messages, cancellationToken: ct))
    {
        if (!string.IsNullOrEmpty(update.Text))
            yield return new SseItem<string>(update.Text);
    }
}

// Assumed shape of the request body:
record ChatRequest(List<ChatMessage> Messages);

That's the whole server. Cancellation propagates from a client disconnect through to the LLM call, the Content-Type and data: framing are handled for you, and non-string payloads get JSON-serialised. One caveat: SSE has no error frames, so if the enumerable throws mid-stream the connection simply ends and the browser's onerror fires.
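For context, IChatClient comes from Microsoft.Extensions.AI. One possible registration, assuming the Microsoft.Extensions.AI.OpenAI adapter package (method names like AsIChatClient have moved between preview versions, so treat this as a sketch, not the canonical wiring):

using Microsoft.Extensions.AI;
using OpenAI;

var builder = WebApplication.CreateBuilder(args);

// "gpt-4o-mini" and the config key are placeholders; swap in your own.
builder.Services.AddChatClient(
    new OpenAIClient(builder.Configuration["OpenAI:ApiKey"]!)
        .GetChatClient("gpt-4o-mini")
        .AsIChatClient());

var app = builder.Build();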

The HTML client is short. One catch the snippet glosses over: EventSource can only issue GET requests, so it can't call the MapPost route above. Pair it with a GET variant (sketched after the script) or switch the client to fetch:

<script>
// EventSource reconnects automatically on error; close explicitly so a
// finished stream doesn't turn into a reconnect loop.
const es = new EventSource("/chat/stream?...");
es.onmessage = (e) => append(e.data);
es.onerror = () => es.close();
</script>
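A GET variant EventSource can actually reach might look like this; the query-string prompt parameter is my own simplification of ChatRequest:

app.MapGet("/chat/stream", (string prompt, IChatClient chat, CancellationToken ct) =>
    TypedResults.ServerSentEvents(StreamPrompt(prompt, chat, ct)));

static async IAsyncEnumerable<SseItem<string>> StreamPrompt(
    string prompt, IChatClient chat, [EnumeratorCancellation] CancellationToken ct)
{
    // Microsoft.Extensions.AI provides a string convenience overload.
    await foreach (var update in chat.GetStreamingResponseAsync(prompt, cancellationToken: ct))
    {
        if (!string.IsNullOrEmpty(update.Text))
            yield return new SseItem<string>(update.Text);
    }
}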

When to use SSE vs the alternatives:

  • WebSockets if you genuinely need bidirectional. You usually don't — chat UIs send one request, then receive a stream. SSE is the right shape.
  • gRPC streaming if both ends are .NET and you want type safety. SSE wins on browser support and proxy compatibility.
  • NDJSON over chunked HTTP if your client can't use EventSource (a minimal endpoint is sketched after this list). Rare.
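For that last case, a minimal NDJSON sketch. The route name is hypothetical, and it assumes System.Text.Json plus the string overload of GetStreamingResponseAsync:

app.MapGet("/chat/ndjson", async (string prompt, IChatClient chat, HttpResponse res, CancellationToken ct) =>
{
    res.ContentType = "application/x-ndjson";
    await foreach (var update in chat.GetStreamingResponseAsync(prompt, cancellationToken: ct))
    {
        if (string.IsNullOrEmpty(update.Text)) continue;
        // One JSON object per line, newline-delimited.
        await res.WriteAsync(JsonSerializer.Serialize(new { delta = update.Text }) + "\n", ct);
        await res.Body.FlushAsync(ct); // flush per line so tokens arrive as they're produced
    }
});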

Two gotchas to know:

If your reverse proxy buffers responses, SSE is dead in the water: tokens pile up server-side and arrive as one lump. nginx needs X-Accel-Buffering: no (or proxy_buffering off). Cloudflare streams responses that arrive without a Content-Length; the .NET 10 SSE result sends chunked responses, so that side is handled. Test through your real proxy, not just localhost.
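If you can't touch the nginx config, the app can emit the header itself. A minimal sketch:

// Tell nginx not to buffer event streams; other proxies ignore the header.
app.Use(async (ctx, next) =>
{
    ctx.Response.OnStarting(() =>
    {
        if (ctx.Response.ContentType?.StartsWith("text/event-stream") == true)
            ctx.Response.Headers["X-Accel-Buffering"] = "no";
        return Task.CompletedTask;
    });
    await next();
});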

Over HTTP/1.1, browsers cap concurrent connections per origin at six, and every open EventSource holds one. A multi-tab chat app will hit the cap. Either share one connection across tabs via a SharedWorker, serve over HTTP/2 (where streams multiplex and the cap effectively disappears), or read the stream with fetch + ReadableStream.

The thing I appreciate most about the built-in version: SseItem<T> lets you ship typed events to the browser, so the client gets JSON like { delta, type, finish_reason } instead of a raw string. The streaming endpoint and the response shape stay co-located, and that's one less utility class in everyone's codebase.
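A sketch of that shape, with ChatDelta as a hypothetical record of my own (non-string payloads are serialised with your app's configured JSON options, so the exact casing may differ):

record ChatDelta(string Delta, string Type, string? FinishReason);

static async IAsyncEnumerable<SseItem<ChatDelta>> StreamTyped(
    ChatRequest req, IChatClient chat, [EnumeratorCancellation] CancellationToken ct)
{
    await foreach (var update in chat.GetStreamingResponseAsync(req.Messages, cancellationToken: ct))
    {
        if (!string.IsNullOrEmpty(update.Text))
            yield return new SseItem<ChatDelta>(new(update.Text, "delta", null), eventType: "delta");
    }
    // A terminal event so the client knows the stream finished cleanly.
    yield return new SseItem<ChatDelta>(new("", "done", "stop"), eventType: "done");
}

On the client, named events arrive via es.addEventListener("delta", ...) rather than onmessage.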