xAI’s Grok chatbot correctly predicted the date of US and Israeli military strikes against Iran three days before they occurred. The prediction came from a Jerusalem Post test, published on February 25, that asked four AI models when the strikes would happen.

The newspaper tested Anthropic’s Claude, Google’s Gemini, xAI’s Grok, and OpenAI’s ChatGPT. Only Grok identified the correct date of February 28. Grok predicted “a limited US strike on February 28, 2026,” while the other models suggested dates in early March. Claude settled on March 7 or 8, Gemini projected March 4 to March 6, and ChatGPT revised its forecast to March 3.

The US and Israel launched the coordinated attacks on February 28 as Grok predicted. Israel’s operation was codenamed “Roaring Lion” and the US operation was “Operation Epic Fury.” President Donald Trump announced the strikes in a video address. Explosions were reported in Tehran, Isfahan, Qom, Karaj, and Kermanshah. Iran’s Supreme Leader Ayatollah Ali Khamenei was killed in the strikes, according to the Associated Press and Reuters.

Iran launched retaliatory strikes against Israel and US facilities in Bahrain, the United Arab Emirates, and Qatar.

Elon Musk commented on the prediction on X, stating, “Prediction of the future is the best measure of intelligence.”

The Jerusalem Post framed the exercise as a stress test rather than a forecasting service. The article noted that Grok’s prediction drew on publicly available signals, including Geneva diplomatic talks and Trump’s stated deadline from February 19. Reuters reporting at the time had cited a senior US official who suggested that all forces would not be in place until mid-March.

The Jerusalem Post concluded that when the internet asked for a date, the robots supplied one.

The newspaper stated that pushing the AI models harder produced more specific answers, even though real-world clarity did not improve. Grok’s prediction circulated quickly on X through screenshots. The result may reflect either analytical capability or coincidence in an exercise designed to test the models’ limits.

