AMD AI Chief Raises Concerns Over Claude Code's Performance Regression
Stella Laurenzo, the AI chief at AMD, has publicly expressed disappointment with the performance of Claude Code, an AI coding tool developed by Anthropic. In a detailed post on her "stellaraccident" GitHub account, Laurenzo stated that the model has "regressed to the point it cannot be trusted to perform complex engineering." Her critique draws on an extensive internal analysis covering more than 6,800 coding sessions, nearly 235,000 tool calls, and close to 18,000 reasoning blocks.
Internal Data Reveals Significant Issues
Laurenzo highlighted that multiple senior engineers on her team have reported similar experiences, pointing to a notable increase in "stop-hook violations." These violations occur when the model prematurely exits tasks or requests unnecessary permissions, disrupting workflow efficiency. She noted that such incidents escalated from zero to approximately 10 per day over the last month, attributing the regression to the implementation of thinking redaction, specifically the "redact-thinking-2026-02-12" feature.
According to Laurenzo, extended reasoning is crucial for complex engineering workflows, and its reduction has negatively impacted performance. She also observed a behavioral shift in Claude Code, moving from a research-first to an edit-first approach. This change, she argued, has resulted in lower-quality code, weaker adherence to coding conventions, and reduced reliability during extended sessions.
Anthropic's Response to the Criticism
In response to Laurenzo's concerns, Anthropic addressed the claims through engineer Boris Cherny. He clarified that the redact-thinking setting only hides reasoning from the user interface and does not diminish the model's actual reasoning capabilities. Cherny emphasized the introduction of adaptive thinking in Claude Opus 4.6, where the system dynamically determines the duration of thinking based on task requirements.
"Some people want the model to think for longer, even if it takes more time and tokens. To improve intelligence more, set effort=high via `/effort` or in your settings.json," Cherny wrote. He explained that while the default medium effort setting (effort=85) balances performance and efficiency, Anthropic is testing higher effort configurations for Teams and Enterprise users. This adjustment aims to allow them to "benefit from extended thinking even if it comes at the cost of additional tokens & latency."
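Based on Cherny's quoted instructions, the setting can be placed in `settings.json`. The snippet below is a minimal sketch; the post confirms only that the option is called `effort` and accepts `high`, so the exact placement of the key within the file's schema is an assumption:

```json
{
  "effort": "high"
}
```

Inside an active session, the same change could reportedly be made with the `/effort` slash command instead of editing the file.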
Cherny also acknowledged Laurenzo's analysis, stating, "I appreciate the depth of thinking & care that went into this," indicating a respectful engagement with the feedback.
Broader Implications for AI Development
This exchange underscores ongoing challenges in AI tool optimization, particularly in balancing efficiency with depth of reasoning. As companies like Anthropic refine their models, feedback from engineering leaders at companies like AMD plays a critical role in shaping improvements. The debate highlights the importance of customizable settings to meet diverse engineering needs, ensuring AI tools remain reliable for complex tasks.