Today, we're launching a research preview of GPT-5.3 Codex Spark, a smaller version of GPT-5.3 Codex and our first model built for real-time coding. Codex Spark is the first result of our partnership with Cerebras, announced in January. It's designed to feel nearly instant: running on ultra-low-latency hardware, it delivers over 1,000 tokens per second while remaining highly effective for real programming tasks.

We are making Codex Spark on Cerebras available as a research preview to ChatGPT Pro users. This lets developers start experimenting early while we work with Cerebras to increase data center capacity, improve the user experience, and prepare for the launch of our larger models.  

Our latest models are especially good at handling long-running tasks, working on their own for hours, days, or even weeks. Codex Spark is our first model built for instant work with Codex, so you can make targeted edits, adjust logic, or refine interfaces and see results right away. Now Codex supports both big, ongoing projects and quick, in-the-moment work.  

We look forward to learning from developers and using your feedback as we expand access.  

At launch, Codex Spark has a 128K context window and supports only text. During the research preview, it will have its own rate limits, and usage won’t count toward standard limits. If demand is high, you might see limited access or short waits as we keep things reliable for everyone.  

Speed and Intelligence 

Codex Spark is built for interactive work where speed is just as important as intelligence. You can work with the model in real time, interrupt or redirect it as needed, and quickly try out new ideas with fast responses. Since it is tuned for speed, Codex Spark keeps things simple by making only minimal targeted changes and running tests only when you ask.  

Coding 

Codex Spark is a powerful small model. On SWE-Bench Pro and Terminal-Bench 2.0, benchmarks that test software engineering skills, it performs well and completes tasks much faster than GPT-5.3 Codex.

Latency Improvements for All Models 

While training Codex Spark, we realized that speed alone wasn’t enough for instant collaboration. We also needed to reduce latency throughout the entire request and response process.  
 
We made improvements that will help all models, such as:  

  • streamlining how responses move between client and server  
  • updating our inference stack  
  • making sessions start faster so you see the first token sooner  

By adding a persistent WebSocket connection and upgrading the Responses API, we reduced per-token client/server round-trip overhead by 80% and cut end-to-end time to first token on a fresh session by 50%. Codex Spark uses the WebSocket path by default, and soon all models will too.
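The persistent-connection idea can be pictured with a toy streaming server: instead of paying a connection round trip per chunk, the client sends one request and then reads every token as a frame over the same socket. This is an illustrative asyncio sketch, not the actual Codex transport or its framing:

```python
import asyncio

# Toy model of token streaming over one persistent connection: the client
# sends a single request line, then reads one newline-delimited frame per
# token from the same socket. Tokens and framing are illustrative stand-ins.

TOKENS = ["def", " hello", "():", " return", " 42"]

async def serve(reader, writer):
    await reader.readline()                   # one request starts the stream
    for tok in TOKENS:
        writer.write(tok.encode() + b"\n")    # one frame per token, no new connection
        await writer.drain()
    writer.write(b"[DONE]\n")
    await writer.drain()
    writer.close()

async def main():
    server = await asyncio.start_server(serve, "127.0.0.1", 8899)
    reader, writer = await asyncio.open_connection("127.0.0.1", 8899)
    writer.write(b"complete: hello\n")        # single round trip to kick off
    await writer.drain()
    received = []
    while True:
        frame = (await reader.readline()).decode().rstrip("\n")
        if frame == "[DONE]":
            break
        received.append(frame)
    writer.close()
    server.close()
    await server.wait_closed()
    return received

tokens = asyncio.run(main())
print("".join(tokens))  # prints "def hello(): return 42"
```

The point of the sketch: after the one-time connection setup, each token costs only a socket write, which is where the per-token overhead savings come from.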

Powered by Cerebras 

Codex Spark runs on the Cerebras Wafer-Scale Engine 3 (WSE-3), a specialized AI accelerator built for high-speed inference, giving Codex a low-latency serving option. We worked with Cerebras to integrate this fast path into our main production stack, so Codex works smoothly today and is ready to support future models.

"What excites us most about GPT-5.3 Codex Spark is partnering with OpenAI and the developer community to discover what fast inference makes possible: new interaction methods, new use cases, and a fundamentally different model experience. This preview is just the beginning," said Sean Lie, CTO and co-founder of Cerebras.

GPUs are still the backbone of our training and inference systems, providing the most cost-effective solution for broad adoption. Cerebras adds to this by handling tasks that require very low latency, making Codex feel more responsive as you work. You can also combine GPUs and Cerebras for the best performance on single workloads.  
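One way to picture this split is a simple per-request router: interactive, latency-sensitive requests take the Cerebras path when capacity is available, and everything else stays on the GPU fleet. The backend names, throughput numbers, and routing rule below are assumptions for illustration, not OpenAI's actual scheduling logic:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Backend:
    name: str
    tokens_per_sec: int  # rough throughput; illustrative numbers only

GPU = Backend("gpu-fleet", 150)
CEREBRAS = Backend("cerebras-wse3", 1000)

def route(interactive: bool, cerebras_capacity: bool) -> Backend:
    # Latency-sensitive interactive work takes the fast path when there is
    # capacity; long-running or batch work stays on the cost-effective GPUs.
    if interactive and cerebras_capacity:
        return CEREBRAS
    return GPU

print(route(True, True).name)    # cerebras-wse3
print(route(True, False).name)   # gpu-fleet (falls back under load)
print(route(False, True).name)   # gpu-fleet
```

The fallback branch mirrors the preview's stated behavior: when demand exceeds low-latency capacity, requests still complete, just on the standard path.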

Availability & Details 

Codex Spark is launching today as a research preview for ChatGPT Pro users in the latest versions of the Codex app, CLI, and VS Code extension. Since it runs on special low-latency hardware, it has its own rate limit that may change based on demand. During the preview, we are also making Codex Spark available in the API for a small group of design partners to see how developers want to use it in their products. We’ll expand access in the coming weeks as we continue improving our integration.  

Right now, Codex Spark is text-only, with a 128K context window, and is the first in a new line of ultra-fast models. As we learn from the developer community about where fast models work best for coding, we’ll add more features, such as:  

  • larger models  
  • longer context windows  
  • support for different types of input  

Codex Spark received the same safety training as our main models, including cybersecurity-specific training. We reviewed it under our standard deployment process, which evaluates cyber and other capabilities, and found no realistic possibility that it reaches our Preparedness Framework threshold for High capability in cybersecurity or biology.

What’s Next 

Codex Spark is just the beginning. Our goal is for Codex to operate in two main modes:

  1. One for longer-term reasoning and execution
  2. Another for live collaboration and quick changes

Over time, these modes will come together. Codex will let you stay closely involved while it handles longer tasks in the background or spreads work across many models at once when you need speed and variety. This way, you won’t have to pick just one mode from the start.  
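The "spread work across many models at once" idea can be sketched as a simple fan-out: dispatch one prompt to several model variants concurrently and collect every draft, so total wall time tracks the slowest model rather than the sum. The model names and the `call_model` stub are hypothetical placeholders for real API calls:

```python
import asyncio

async def call_model(name: str, prompt: str) -> str:
    # Stand-in for a real API call; a faster model simply answers sooner.
    await asyncio.sleep(0.01 if "spark" in name else 0.05)
    return f"{name}: draft for {prompt!r}"

async def fan_out(prompt: str, models: list[str]) -> list[str]:
    # Run every model concurrently; results come back in input order.
    return await asyncio.gather(*(call_model(m, prompt) for m in models))

drafts = asyncio.run(
    fan_out("rename this function", ["gpt-5.3-codex", "gpt-5.3-codex-spark"])
)
for d in drafts:
    print(d)
```

In this picture, a fast model like Codex Spark supplies the quick first draft while larger models keep working in the background, which is the "speed and variety" trade the section describes.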

As models get better, the speed of interaction becomes more important. Faster responses make Codex easier to use and open new possibilities for everyone who wants to turn an idea into working software.

Source: Introducing GPT‑5.3‑Codex‑Spark 
