From Prototype to Production: Deploying Microsoft Agent Framework on Azure
Your console app works on a developer laptop. Production needs auth, telemetry, secrets, scaling, and a way to deploy without `dotnet run`. Here's the smallest Azure setup that actually works.
The CTO loved the demo. Now ops wants a deployment plan, security wants a key rotation story, and finance wants to know how cost scales. The console app that worked on your laptop is suddenly a project with stakeholders. Welcome to production.
I've now shipped a handful of Agent Framework workloads on Azure. Here's the minimum viable setup that doesn't fall over the first time you get traffic.
What "production" means here
For an Agent Framework workload, production is the smallest set of Azure services that gives you identity (no API keys in code), observability (you can debug a bad run after the fact), and scale (you can serve more than one user without a hot reload). Three services cover the floor: Azure Container Apps, Azure OpenAI with managed identity, and Application Insights.
Everything else (Front Door, Key Vault references, Azure AI Search for RAG) is an addition, not the foundation.
The minimal production wiring
using Azure.Core;
using Azure.Identity;

var builder = WebApplication.CreateBuilder(args);

// Managed identity → Azure OpenAI. No keys in config.
var credential = new DefaultAzureCredential();
builder.Services.AddSingleton<TokenCredential>(credential);

builder.Services.AddSingleton(sp => new ChatAgent(
    name: "SupportAgent",
    instructions: builder.Configuration["Agent:Instructions"],
    model: builder.Configuration["Agent:Model"]!,
    endpoint: new Uri(builder.Configuration["AzureOpenAI:Endpoint"]!),
    credential: sp.GetRequiredService<TokenCredential>()));

// Application Insights for every agent run.
builder.Services.AddApplicationInsightsTelemetry();
builder.Services.AddSingleton<IAgentTelemetry, AppInsightsAgentTelemetry>();

var app = builder.Build();

app.MapPost("/run", async (RunRequest r, ChatAgent a, IAgentTelemetry t) =>
{
    using var span = t.StartRun(a.Name, r.UserId);
    var result = await a.RunAsync(r.Input);
    span.RecordTokens(result.UsageTokens);
    return Results.Ok(new { result.FinalOutput });
});

app.Run();
Why this works
DefaultAzureCredential picks up the Container Apps managed identity at runtime: no keys, no secrets, no Key Vault hop for the OpenAI call. Application Insights gives you per-run traces with tokens and latency, so when finance asks "what did agent X cost yesterday," you have a query, not a guess. Container Apps scales to zero when idle and out to multiple replicas under load without you writing a single line of scaling logic. That last bit alone has paid for this setup more than once.
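Answering the finance question is one Log Analytics query away. This sketch assumes the `AppInsightsAgentTelemetry` implementation emits an `AgentRun` custom event with the agent name in `customDimensions` and a `totalTokens` measurement; adjust the names to whatever your telemetry class actually records, and plug in your model's real per-token rate.

```
// Assumes AppInsightsAgentTelemetry emits "AgentRun" custom events
// with agentName in customDimensions and totalTokens in customMeasurements.
customEvents
| where name == "AgentRun" and timestamp > ago(1d)
| extend agent = tostring(customDimensions.agentName)
| summarize runs = count(),
            tokens = sum(todouble(customMeasurements.totalTokens)) by agent
| extend estCostUsd = tokens / 1000.0 * 0.01  // placeholder: your model's per-1K-token rate
| order by tokens desc
```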
When this setup is enough
Internal tools, B2B workloads, anything where p95 latency under 5 seconds is acceptable and traffic is bursty. Container Apps is a great fit if you don't want to think about Kubernetes but need more than Functions can give you. It's the boring choice, and that's the point.
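Getting the service onto Container Apps in the first place is a couple of CLI calls. This is a sketch, not a pipeline: the resource names (`agent-api`, `rg-agents`) and the placeholder IDs are yours to substitute, and in real life you'd run the role assignment from IaC rather than by hand.

```shell
# Build from source and deploy to Container Apps (names are placeholders).
az containerapp up \
  --name agent-api \
  --resource-group rg-agents \
  --location westeurope \
  --ingress external --target-port 8080 \
  --source .

# Enable the system-assigned managed identity DefaultAzureCredential will use.
az containerapp identity assign \
  --name agent-api --resource-group rg-agents --system-assigned

# Grant that identity data-plane access to Azure OpenAI.
az role assignment create \
  --assignee <principal-id-from-previous-command> \
  --role "Cognitive Services OpenAI User" \
  --scope <azure-openai-resource-id>
```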
When to outgrow it
Latency-critical consumer apps where cold starts are unacceptable: switch to always-warm minimum replicas (no longer scale-to-zero, but still fully managed). Heavy multi-tenant SaaS will eventually want AKS for finer control over node SKUs and GPU pools. And anything in a regulated industry that needs ground-truth audit logs needs more than App Insights alone; you'll want immutable storage.
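The always-warm switch is a one-line change on the existing app. Same placeholder names as before; the trade-off is that you now pay for at least one idle replica around the clock.

```shell
# Keep one replica warm to eliminate cold starts.
# This gives up scale-to-zero: the minimum replica bills even when idle.
az containerapp update \
  --name agent-api \
  --resource-group rg-agents \
  --min-replicas 1 \
  --max-replicas 10
```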
Deployment topology
flowchart LR
User[Client] --> FD["Front Door / APIM"]
FD --> ACA["Container Apps<br/>(your .NET service)"]
ACA -- managed identity --> AOAI[(Azure OpenAI)]
ACA --> AI[(Application Insights)]
ACA -- managed identity --> Search[(Azure AI Search)]
Search -. optional .- ACA
style ACA fill:#dbeafe,stroke:#1d4ed8
style AOAI fill:#dcfce7,stroke:#16a34a
Conclusion
Wire Application Insights before you wire anything else. The first time the agent does something inexplicable in production, the trace you didn't think you needed is the only thing between a five-minute fix and a four-hour mystery. I've lived both of those. The trace is better.