GPT-5's performance charts in the launch video appear chaotic, leading one to suspect the charts were created by GPT-5 itself; OpenAI's attempted corrections only heighten the puzzling questions
OpenAI's latest chatbot, GPT-5, has been met with a significant backlash from users following reports of performance issues, misleading presentation in launch materials, and abrupt removal of prior models.
The controversy began with noted inconsistencies in charts used to showcase GPT-5’s capabilities during the launch video. These charts, it was claimed, overstated GPT-5's improvements or glossed over its weakening in certain aspects, contributing to user mistrust and fueling claims that OpenAI exaggerated the model's readiness and quality.
The SWE-bench performance comparison between GPT-5 and Anthropic's Claude Opus 4.1 has raised questions, as GPT-5 scored 74.9%, only marginally higher than Claude Opus 4.1's 74.5%. However, there are 23 missing problems in the SWE-bench comparison, leading to speculation that some inconvenient tasks may have been left out to allow GPT-5 to edge ahead.
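To see why the 23 missing problems matter for such a narrow margin, the following minimal sketch bounds GPT-5's score over the full task set. It assumes the full benchmark has 500 tasks (the size of SWE-bench Verified, a figure not given in this article) and that the Claude Opus 4.1 score covers all of them; both are assumptions, and the function names are illustrative only.

```python
# Assumption: the full benchmark has 500 tasks (SWE-bench Verified).
# The article itself only states that 23 problems were missing.
TOTAL_TASKS = 500
SKIPPED_TASKS = 23

REPORTED_GPT5 = 74.9   # percent, measured on the attempted subset
REPORTED_OPUS = 74.5   # percent, assumed to cover the full task set


def worst_case_score(reported_pct: float, total: int, skipped: int) -> float:
    """Score over the full set if every skipped task counts as a failure."""
    attempted = total - skipped
    solved = reported_pct / 100 * attempted
    return solved / total * 100


def best_case_score(reported_pct: float, total: int, skipped: int) -> float:
    """Score over the full set if every skipped task counts as a success."""
    attempted = total - skipped
    solved = reported_pct / 100 * attempted
    return (solved + skipped) / total * 100


low = worst_case_score(REPORTED_GPT5, TOTAL_TASKS, SKIPPED_TASKS)
high = best_case_score(REPORTED_GPT5, TOTAL_TASKS, SKIPPED_TASKS)
print(f"GPT-5 over all {TOTAL_TASKS} tasks: between {low:.2f}% and {high:.2f}%")
print(f"Claude Opus 4.1 (as reported): {REPORTED_OPUS}%")
```

Under these assumptions the possible range for GPT-5 straddles the Claude Opus 4.1 figure, so the 0.4-point lead depends on how the omitted tasks would have gone.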
Furthermore, the initial deception rate chart labeled GPT-5 with a 50% deception rate and OpenAI o3 with a 47.4% deception rate. Yet the bar for OpenAI o3 was drawn roughly three times as tall as GPT-5's, even though o3's labeled value was the lower of the two. The chart has since been revised to show GPT-5's coding deception rate as 16.5%.
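The mismatch in the deception chart can be checked with simple arithmetic. The sketch below, which assumes bar height is meant to be proportional to the labeled value, compares the o3 bar to the GPT-5 bar under the original label and under the corrected one; the function name is illustrative.

```python
O3_RATE = 47.4             # percent, as labeled on the chart
GPT5_ORIGINAL_LABEL = 50.0  # percent, as first labeled
GPT5_CORRECTED = 16.5       # percent, after OpenAI's revision


def height_ratio(taller: float, shorter: float) -> float:
    """Expected ratio of bar heights if height is proportional to value."""
    return taller / shorter


ratio_original = height_ratio(O3_RATE, GPT5_ORIGINAL_LABEL)
ratio_corrected = height_ratio(O3_RATE, GPT5_CORRECTED)

print(f"o3 vs GPT-5 with the original label:  {ratio_original:.2f}x")
print(f"o3 vs GPT-5 with the corrected value: {ratio_corrected:.2f}x")
```

The corrected value gives a ratio close to the roughly threefold difference in bar heights, which is consistent with the bars having been drawn from the 16.5% figure while the 50% label was the error. That reading is an inference from the numbers, not something OpenAI has stated here.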
The launch video's inconsistencies are part of a broader issue of poor communication and transparency during the rollout. OpenAI removed access to older models like GPT-4o without adequate notification, alienating many users and developers who relied on them for their workflows or personal uses.
In response to the uproar, OpenAI CEO Sam Altman publicly acknowledged mistakes in the rollout and restored access to older models like GPT-4o, albeit primarily for paid subscribers. OpenAI also admitted “totally screwing up” aspects of the rollout and emphasized this as a learning moment to improve communication, change management, and user engagement going forward.
However, the original launch video with the flawed charts remains available on OpenAI's YouTube channel, so the errors have not been corrected at their source. The controversy has also raised broader concerns about how OpenAI handles AI, particularly with regard to safety and the management of AI-related risks.
Elon Musk, who has long been outspoken on AI, has also taken an interest in the controversy surrounding GPT-5's performance. User reception of the chatbot itself has been mixed: some praise its improvements, while others express disappointment and frustration.
OpenAI has since posted updated charts on its website to correct the initial errors. The company is under pressure to address these issues and regain the trust of its user base, as the GPT-5 launch serves as a cautionary tale about managing large AI model releases in community-dependent platforms.
- The controversy surrounding GPT-5's performance has raised questions about transparency and integrity in AI product launches.
- Elon Musk, a prominent figure in the AI community, has expressed interest in the debacle surrounding GPT-5's launch, which may drive further discussion about the future of AI development.
- Despite criticism and mistrust from users, GPT-5 remains a significant step in AI development, and its impact on the field of artificial intelligence has yet to be fully determined.