Anthropic redesigns hiring tests after Claude 4.5 "aces" human interview

Anthropic’s performance optimization team, which has been evaluating job applicants since 2024, has repeatedly revised its technical interview test to counter AI-assisted cheating, according to team lead Tristan Hume.

The team uses a take-home test to assess candidates’ skills. As AI coding tools have improved, the test has needed frequent revisions. Hume detailed the challenges in a blog post on Wednesday.

“Each new Claude model has forced us to redesign the test,” Hume wrote. “When given the same time limit, Claude Opus 4 outperformed most human applicants.” Later, he added, “Claude Opus 4.5 matched even those,” referring to the strongest human candidates.

The result is a serious candidate-assessment problem. Without in-person proctoring, the company cannot ensure that applicants are not using AI during the test. “Under the constraints of the take-home test, we no longer had a way to distinguish between the output of our top candidates and our most capable model,” Hume explained.

AI cheating, already widespread in schools and universities around the world, is now a problem for AI labs as well. Anthropic, however, is unusually well placed to address it.

Hume ultimately designed a new test. The revised assessment focuses less on hardware optimization, which makes it harder for current AI tools to solve. Alongside his post, he released the original test and invited readers to propose better solutions. “If you can best Opus 4.5, we’d love to hear from you,” the post stated.
