Researchers from MIT CSAIL have developed PDDL-INSTRUCT, an instruction-tuning framework designed to improve the multi-step planning capabilities of large language models (LLMs). The method combines logical chain-of-thought reasoning with an external plan validator, steering models toward logically valid plans rather than plausible but incorrect outputs.
The framework trains models to recognize and explain why a candidate plan fails, whether through unsatisfied preconditions, incorrect effects, frame violations, or an unmet goal. This is paired with logical chain-of-thought prompts that guide the LLM through step-by-step inference over state and action transitions, producing traceable state→action→state sequences written as ⟨sᵢ, aᵢ₊₁, sᵢ₊₁⟩, as sketched in the example below.
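To make the reasoning-chain idea concrete, here is a minimal sketch of precondition/effect checking over a single STRIPS-style step. The class names and Blocksworld facts are illustrative assumptions, not the paper's actual data format:

```python
from dataclasses import dataclass

# Hypothetical STRIPS-style action; structure is illustrative only.
@dataclass(frozen=True)
class Action:
    name: str
    preconditions: frozenset  # facts that must hold before execution
    add_effects: frozenset    # facts made true by execution
    del_effects: frozenset    # facts made false by execution

def apply(state: frozenset, action: Action):
    """Return the successor state, or an explanation of why the step fails."""
    missing = action.preconditions - state
    if missing:
        # Unsatisfied precondition: one of the failure modes the tuned
        # model is trained to detect and explain.
        return None, f"unsatisfied preconditions: {sorted(missing)}"
    successor = (state - action.del_effects) | action.add_effects
    return successor, None

# A Blocksworld-style step: pick up block a from the table.
pickup_a = Action(
    name="pickup(a)",
    preconditions=frozenset({"clear(a)", "ontable(a)", "handempty"}),
    add_effects=frozenset({"holding(a)"}),
    del_effects=frozenset({"clear(a)", "ontable(a)", "handempty"}),
)

s0 = frozenset({"clear(a)", "ontable(a)", "handempty"})
s1, error = apply(s0, pickup_a)  # the ⟨s0, a1, s1⟩ triple's successor state
print(s1 or error)
```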
For external validation, PDDL-INSTRUCT integrates the VAL plan validator, which checks each step of a generated plan. VAL's feedback is either binary (valid/invalid) or detailed, with detailed feedback yielding superior performance. Training uses a two-stage optimization process: the first stage penalizes errors in the reasoning chains, and the second stage optimizes for final planning accuracy.
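The following sketch shows what such a validate-and-refine loop could look like, invoking VAL's Validate binary as a subprocess. The flag handling and output parsing are assumptions that may differ across VAL builds, and generate_plan is a hypothetical stand-in for the LLM call:

```python
import subprocess

def validate_plan(domain: str, problem: str, plan: str, detailed: bool = True):
    """Run the VAL 'Validate' binary on a plan file and return its feedback."""
    cmd = ["Validate"]
    if detailed:
        cmd.append("-v")  # verbose error reporting; flag may vary by build
    cmd += [domain, problem, plan]
    result = subprocess.run(cmd, capture_output=True, text=True)
    valid = "Plan valid" in result.stdout  # assumed success marker
    # Binary feedback: just the flag. Detailed feedback: VAL's diagnosis,
    # e.g., which precondition failed at which step.
    return valid, result.stdout if detailed else None

def refine(generate_plan, domain, problem, budget=5):
    """Feedback-driven loop: generate, validate, feed the diagnosis back."""
    feedback = None
    for _ in range(budget):
        plan_file = generate_plan(feedback)  # LLM call, not shown
        valid, feedback = validate_plan(domain, problem, plan_file)
        if valid:
            return plan_file
    return None  # budget exhausted without a valid plan
```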
The system was evaluated on the PlanBench benchmark, which includes planning domains known to challenge LLMs: Blocksworld, Mystery Blocksworld, and Logistics. In the Blocksworld domain, a tuned Llama-3-8B model generated valid plans 94% of the time. On Mystery Blocksworld, a domain whose predicate names are obfuscated to prevent pattern matching, previous models had near-zero validity; PDDL-INSTRUCT achieved up to a 64-fold improvement.
Significant performance gains were also recorded in the Logistics domain. Across all test domains, the framework delivered up to a 66% absolute improvement over untuned baseline models. The researchers also noted that performance improved with larger feedback budgets and more detailed validator output.
The current implementation of PDDL-INSTRUCT applies to classical PDDL domains and depends on the VAL validator as an external oracle. The results demonstrate a method for grounding LLM reasoning in formal semantics, applicable to agent systems that can consult a verifier during planning. Extending the framework to long-horizon, temporal, numeric, and cost-sensitive planning tasks remains an area for further work.