OpenAI’s blog post of 2026‑04‑22 provides a technical overview of how WebSockets and connection‑scoped caching can accelerate agentic workflows in the Responses API.
The article focuses on the Codex agent loop, showing that a persistent WebSocket connection avoids the overhead of establishing a new HTTP connection for each request. Caching data scoped to the lifetime of that connection avoids redundant transfer of the same data, which in turn lowers the latency the model experiences between turns.
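The effect of connection reuse can be sketched with a toy model. This is not the actual Responses API client; the `Transport` class and its methods are illustrative, and the "handshake" is simply a counter standing in for TCP/TLS/HTTP setup cost:

```python
class Transport:
    """Toy transport that counts how many connection handshakes occur."""

    def __init__(self):
        self.handshakes = 0
        self.open = False

    def connect(self):
        # A handshake cost is paid every time a new connection is established.
        self.handshakes += 1
        self.open = True

    def request(self, payload):
        if not self.open:
            self.connect()
        return f"response:{payload}"

    def close(self):
        self.open = False


def per_request_http(transport, payloads):
    """One connection per request: handshake cost scales with request count."""
    results = []
    for p in payloads:
        transport.connect()
        results.append(transport.request(p))
        transport.close()
    return results


def persistent_websocket(transport, payloads):
    """One long-lived connection: the handshake cost is paid once."""
    transport.connect()
    results = [transport.request(p) for p in payloads]
    transport.close()
    return results


http = Transport()
per_request_http(http, ["a", "b", "c"])

ws = Transport()
persistent_websocket(ws, ["a", "b", "c"])

print(http.handshakes, ws.handshakes)  # 3 handshakes vs 1
```

In an agent loop that issues many short requests in sequence, the per-request column grows linearly while the persistent-connection column stays constant, which is the core of the latency argument.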
Key takeaways from the post include:
- WebSocket usage – A persistent WebSocket connection keeps the communication channel open, eliminating the handshake cost of repeated HTTP requests.
- Connection‑scoped caching – Data that is relevant for the duration of a single connection is cached, preventing repeated retrieval of the same information.
- Reduced API overhead – The combination of persistent connections and caching cuts the number of round‑trips required, directly impacting the overall latency of the agentic workflow.
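The interplay of the points above can be sketched as follows. The `ConnectionScopedCache` class here is hypothetical, not the API's actual implementation; it only illustrates the scoping rule the post describes, where cached entries live exactly as long as the connection and a reconnect starts cold:

```python
class ConnectionScopedCache:
    """Cache whose entries live only for the duration of one connection."""

    def __init__(self):
        self._store = {}
        self.misses = 0

    def get_or_fetch(self, key, fetch):
        # Within one connection, repeated lookups hit the cache and skip
        # re-transferring the same data over the wire.
        if key not in self._store:
            self.misses += 1
            self._store[key] = fetch(key)
        return self._store[key]

    def on_disconnect(self):
        # The cache is scoped to the connection: dropping it clears all state.
        self._store.clear()


def fetch_context(key):
    """Stand-in for retrieving data that would otherwise be re-sent."""
    return f"context-bytes-for-{key}"


cache = ConnectionScopedCache()
for _ in range(5):
    cache.get_or_fetch("system-prompt", fetch_context)
print(cache.misses)  # one fetch serves five lookups on the same connection

cache.on_disconnect()  # the connection drops
cache.get_or_fetch("system-prompt", fetch_context)
print(cache.misses)  # cold again after reconnect
```

This also makes the trade-off visible: the caching win depends on the connection staying up, which is why it pairs naturally with a persistent WebSocket rather than with per-request HTTP.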
For a deeper dive, refer to the original OpenAI blog post: Speeding up agentic workflows with WebSockets in the Responses API.