TechBriefly
MIT CSAIL unveils PDDL-Instruct for LLM planning

by Aytun Çelebi
22 September 2025

Researchers from MIT CSAIL have developed PDDL-INSTRUCT, an instruction-tuning framework designed to improve the multi-step planning capabilities of large language models (LLMs). The method combines logical chain-of-thought reasoning with an external plan validator so that models produce logically valid plans rather than plausible but incorrect ones.

The framework trains models to recognize and explain why a candidate plan has failed. Failures can include unsatisfied preconditions, incorrectly applied effects, frame violations (changes to facts that an action should leave untouched), or an unmet goal. This training is paired with logical chain-of-thought prompts that guide the LLM through step-by-step inference over states and actions. The result is a traceable chain of state→action→state transitions, written as ⟨sᵢ, aᵢ₊₁, sᵢ₊₁⟩.
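To make the ⟨sᵢ, aᵢ₊₁, sᵢ₊₁⟩ notation concrete, here is a minimal Python sketch of a STRIPS-style trace over a two-block Blocksworld problem. It is an illustration only, not the researchers' code: the `Action` class, the `trace_plan` function, and the fact strings are all hypothetical names chosen for this example. The function walks a plan step by step, records each transition, and stops with an explanation when a precondition is unsatisfied or the goal is not reached.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """Hypothetical STRIPS-style action, defined only for this sketch."""
    name: str
    pre: frozenset      # facts that must hold before the action
    add: frozenset      # facts the action makes true
    delete: frozenset   # facts the action makes false


def trace_plan(init, plan, goal):
    """Return (transitions, error); error is None when the plan is valid."""
    state = frozenset(init)
    transitions = []
    for step, action in enumerate(plan, start=1):
        missing = action.pre - state
        if missing:
            return transitions, (
                f"step {step} ({action.name}): "
                f"unsatisfied preconditions {sorted(missing)}"
            )
        # Facts not named in add/delete carry over unchanged (the frame).
        next_state = (state - action.delete) | action.add
        transitions.append((state, action.name, next_state))
        state = next_state
    unmet = frozenset(goal) - state
    if unmet:
        return transitions, f"goal not reached: missing {sorted(unmet)}"
    return transitions, None


# Toy problem: blocks A and B start on the table; the goal is A on B.
INIT = {"ontable(A)", "ontable(B)", "clear(A)", "clear(B)", "handempty"}
GOAL = {"on(A,B)"}
PICKUP_A = Action(
    "pickup(A)",
    pre=frozenset({"ontable(A)", "clear(A)", "handempty"}),
    add=frozenset({"holding(A)"}),
    delete=frozenset({"ontable(A)", "clear(A)", "handempty"}),
)
STACK_A_B = Action(
    "stack(A,B)",
    pre=frozenset({"holding(A)", "clear(B)"}),
    add=frozenset({"on(A,B)", "clear(A)", "handempty"}),
    delete=frozenset({"holding(A)", "clear(B)"}),
)

for candidate in ([PICKUP_A, STACK_A_B], [STACK_A_B]):
    trace, error = trace_plan(INIT, candidate, GOAL)
    names = [a.name for a in candidate]
    print(names, "->", "valid" if error is None else error)
```

The second candidate skips the pickup, so the trace stops at step 1 and names the missing fact. That is the kind of explanation the article describes the model being trained to produce.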

For external validation, PDDL-INSTRUCT integrates the VAL plan validator, which checks each step of the generated plan. The validator's feedback can be binary (valid or invalid) or detailed, and detailed feedback produced better results. Training uses a two-stage optimization process: the first stage penalizes errors in the reasoning chains, and the second optimizes for final planning accuracy.
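The difference between binary and detailed feedback can be sketched as a small generate, validate, retry loop. Everything below is an assumption made for illustration: `check_plan` is a toy stand-in for a validator and does not call VAL, and `propose_plan` is a hypothetical stub in place of an LLM. The stub is rigged to repair its plan only when the feedback names a missing fact, so the sketch shows the control flow and not the reported results.

```python
# Toy domain: action name -> (preconditions, add effects, delete effects).
ACTIONS = {
    "pickup(A)": (
        {"ontable(A)", "clear(A)", "handempty"},
        {"holding(A)"},
        {"ontable(A)", "clear(A)", "handempty"},
    ),
    "stack(A,B)": (
        {"holding(A)", "clear(B)"},
        {"on(A,B)", "clear(A)", "handempty"},
        {"holding(A)", "clear(B)"},
    ),
}
INIT = {"ontable(A)", "ontable(B)", "clear(A)", "clear(B)", "handempty"}
GOAL = {"on(A,B)"}


def check_plan(plan, detailed):
    """Stand-in validator (not VAL): returns (is_valid, feedback)."""
    state = set(INIT)
    for step, name in enumerate(plan, start=1):
        pre, add, delete = ACTIONS[name]
        missing = pre - state
        if missing:
            if detailed:
                return False, f"step {step} {name}: missing {sorted(missing)}"
            return False, "invalid"
        state = (state - delete) | add
    if not GOAL <= state:
        if detailed:
            return False, f"goal facts missing: {sorted(GOAL - state)}"
        return False, "invalid"
    return True, "valid"


def propose_plan(feedback_history):
    """Hypothetical stub for the LLM; repairs only on a named missing fact."""
    if any("holding(A)" in message for message in feedback_history):
        return ["pickup(A)", "stack(A,B)"]
    return ["stack(A,B)"]


def refine(budget, detailed):
    """Retry up to `budget` times, feeding validator output back each time."""
    history = []
    for attempt in range(1, budget + 1):
        plan = propose_plan(history)
        is_valid, feedback = check_plan(plan, detailed)
        if is_valid:
            return plan, attempt
        history.append(feedback)
    return None, budget


print("detailed:", refine(budget=3, detailed=True))
print("binary:  ", refine(budget=3, detailed=False))
```

In this rigged setup the detailed run repairs the plan on its second attempt, while the binary run exhausts its budget because "invalid" gives the stub nothing to act on.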

The system was evaluated on the PlanBench benchmark, which includes planning domains known to challenge LLMs, such as Blocksworld, Mystery Blocksworld, and Logistics. In the Blocksworld domain, a tuned Llama-3-8B model generated valid plans 94% of the time. In Mystery Blocksworld, where predicate names are obfuscated to prevent pattern matching, previous models had near-zero validity, and PDDL-INSTRUCT achieved up to a 64-fold improvement.

Significant performance gains were also recorded in the Logistics domain. Across all test domains, the framework delivered up to a 66% absolute improvement (that is, 66 percentage points) over untuned baseline models. The researchers also noted that performance improved with longer feedback budgets and more detailed output from the validator.

The current implementation of PDDL-INSTRUCT applies to classical PDDL domains and depends on the VAL validator as an external oracle. The results point to a way of grounding LLM reasoning in formal semantics, which suits agent systems that can include a verifier in the planning loop. Extending the framework to long-horizon, temporal, numeric, and cost-sensitive planning tasks remains an area for further work.

Aytun Çelebi

Aytun started coding on a Commodore 64 in elementary school and moved on to web programming in his teenage years. He has been around technology for over 30 years and has been a tech journalist for more than 20. He has worked at, and in some cases managed, many major Turkish outlets (newspapers, magazines, TV channels and websites). Besides journalism, he has worked in agencies as a copywriter and PR manager for Lenovo, HP and many other international brands. He founded his own agency, Linkmedya, in 2019 to produce content his own way. He has recently taken an interest in AI, automation and MarTech.

© 2021 TechBriefly is a Linkmedya brand.
