Backtracking through Agent Memory
Adventures with LLMs, Prolog and DeepClause
Over the past two to three weeks I have spent a considerable amount of time simplifying the design of DeepClause. While my original version eventually felt like an early version of LangChain for Prolog, the new version is much more agent-centric and better suited to a new generation of models. One of its core goals is to find more elegant ways to manage LLM context. Quite naturally, this leads one to look at one of Prolog’s core features: backtracking. Before I explain this in more depth, let us first look at what commonly happens in an agentic loop.
It goes something like this:
Your agent tries something.
It doesn’t work.
The agent stuffs its memory with error messages, irrelevant or misleading content, and apologizes profusely.
It tries to fix whatever approach it was taking (but now it’s biased by its own wrong answer).
It fails again.
Before you know it, your context window is full of apologies and stack traces, and the agent has completely forgotten what it was trying to do in the first place.
In standard Python or other imperative languages, fixing this is a headache. You end up writing messy code to manually slice and dice the messages[] array, trying to prune out the failures while keeping the useful bits, all while fighting token limits and implementing the latest paper on compacting LLM memory. But there’s a surprisingly elegant solution to this, and it comes from a language that’s older than most of us: Prolog. Specifically, through a feature called backtracking.
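To make that concrete, here is roughly what the manual approach tends to look like in Python. This is a hypothetical sketch, not any real framework’s API: bookmark the length of the messages[] array before an attempt, then splice the failure back out by hand.

```python
# Manual context pruning, the imperative way. All names here are
# illustrative: `run` stands in for whatever calls the LLM / tools.

def attempt_with_rollback(messages, strategy, run):
    checkpoint = len(messages)  # remember where this attempt started
    messages.append({"role": "system", "content": f"Strategy: {strategy}"})
    try:
        return run(messages)    # run() may append errors, tool output, etc.
    except Exception:
        del messages[checkpoint:]  # manually prune the failed attempt
        return None
```

This works for one flat attempt; once attempts spawn sub-attempts, or you want to keep some pieces of a failed exploration, the bookkeeping multiplies quickly.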
In DeepClause / Prolog, when we define an agent’s workflow using DML (our Prolog dialect), we are usually not writing if/else statements, but rather defining possible execution paths. The Prolog runtime then tries every path and - if it fails - goes back to a point where things were still ok.
More importantly, when Prolog goes down a path and hits a dead end (a fail condition), it doesn’t error out, but restores the state of the world to exactly how it was before that wrong turn was taken.
In DeepClause all DML code is run through a meta interpreter (implemented in Prolog) that tracks the state of the messages[] array. Once backtracking happens, the messages[] array gets automatically reset to where we last branched off.
This allows us to elegantly implement things like letting an agent try a strategy, burn 5,000 tokens exploring it, fail, and then be sure that those 5,000 tokens disappear from the prompt context. The agent gets a fresh start for its next attempt, completely unburdened by its previous failure.
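The mechanism can be sketched in a few lines of Python. This is a toy analogue of what the meta-interpreter does, not DeepClause’s actual implementation: snapshot the message log before each alternative, and restore the snapshot whenever an alternative fails.

```python
# Toy "meta-interpreter": try alternatives in order, snapshotting the
# message log before each one and restoring it when an alternative fails.
# Names are illustrative, not DeepClause's actual API.

class Fail(Exception):
    """Raised by a goal to signal failure (like Prolog's `fail`)."""

def solve(messages, alternatives):
    for goal in alternatives:          # each alternative is a choice point
        checkpoint = list(messages)    # snapshot the context window
        try:
            return goal(messages)      # a goal may append to messages
        except Fail:
            messages[:] = checkpoint   # backtrack: restore the snapshot
    raise Fail("no alternative succeeded")
```

The snapshot/restore pair is exactly the “rewind” described above: whatever a failed alternative appended, including those 5,000 exploratory tokens, never reaches the next attempt.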
Examples in DML / Prolog
Let’s say we want an agent to solve a coding problem. Ideally, it should try using standard libraries first. If that doesn’t work, maybe a shell script. If that fails, maybe a custom algorithm. The problem is, if it fails at the Python approach, we usually don’t want it to know about that failure when it tries the shell script. We want it confident, not apologizing for the Python error.
agent_main(Requirement) :-
    system("You are a senior engineer."),
    % Prolog will try these strategies in order.
    % If one fails, the state rolls back, and it tries the next.
    ( attempt_strategy("Use only Python standard libraries", Requirement)
    ; attempt_strategy("Use shell commands via subprocess", Requirement)
    ; attempt_strategy("Write a purely algorithmic solution", Requirement)
    ),
    answer("Code generated and verified.").
attempt_strategy(Strategy, Req) :-
    % This context is temporary!
    % If we backtrack, this system message disappears.
    system("Current Strategy: {Strategy}"),
    task("Write a script for: {Req}. Output pure code.", Code),
    % We run the code. If the exit code isn't 0, this line fails.
    % DeepClause automatically triggers the backtrack.
    exec(vm_exec(command: "python3 -c '{Code}'"), Result),
    get_dict(exitCode, Result, 0).

If the first strategy fails, DeepClause rewinds time. When the second strategy starts, the context window contains zero traces of the failed Python code. The agent is fresh and ready to try the next approach without baggage.
Choice points (the places we backtrack to) are created not only by the disjunction ";", but also by predicates like member. This is illustrated in the second example:
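In imperative terms, a member-driven choice point is roughly a loop that discards each failed iteration’s context before moving on. A rough Python analogue (all names hypothetical) of the DML example below:

```python
# Imperative sketch of `member/2` plus backtracking: try each URL in
# turn, dropping the per-attempt context whenever verification fails.
# `verify` stands in for the fetch-and-check step.

def member_choice(urls, verify):
    for url in urls:                  # one choice point per element
        attempt_context = []          # this attempt's messages live here
        price = verify(url, attempt_context)
        if price is not None:
            return url, price         # success: keep this binding
        # failure: attempt_context is simply dropped, so the huge
        # webpage content never reaches the main context
    return None
```

The DML version expresses the same thing declaratively, with the runtime handling the "drop the attempt's context" part for you.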
agent_main(Topic) :-
    exec(web_search(query: Topic), Results),
    % 'member' picks one URL from the list. If we backtrack, it picks the next.
    % (one choice point for each member)
    member(Url, Results),
    % We check this specific URL.
    % If this fails, we backtrack to 'member'.
    % The huge webpage content is instantly wiped from memory.
    verify_pricing_page(Url, Price),
    answer("Found the price: {Price}").
verify_pricing_page(Url, Price) :-
    exec(fetch_page(url: Url), Content),
    % This task has a huge context (the webpage).
    % We don't want to keep this if it's useless.
    task("Is this a pricing page? If yes, output price. If no, output 'NO'.", Result),
    Result \= "NO", % If it's NO, this line fails. Backtrack happens here.
    Price = Result. % If yes, we're done!

So we need to write weird Prolog code instead of Markdown files??
Not exactly. While I’ve grown to love the (somewhat weird) elegance of Prolog, I realize most people don’t want to spend their afternoon debugging choice points and cuts. This is where the DeepClause CLI comes in. You don’t actually have to start with DML; you start with a simple Markdown file that specifies what you want the agent to do, the tools it should use, and the constraints it should follow.
When you run deepclause compile [markdown file], the CLI uses a specialized agentic loop to translate your natural-language intent into the rigorous DML code we just looked at. It is essentially “Spec-Driven Development”, with DML files becoming “executable specs”. You get to keep the “vibes” and readability of a Markdown spec, while the compiler handles the heavy lifting of mapping that intent onto a deterministic logic engine.
Once compiled, you simply use deepclause run [dml file] to execute. You’re getting the best of both worlds: the flexibility of natural language for the “what,” and the 50-year-old reliability of Prolog’s memory for the “how.”
Some final remarks
We’re so used to treating LLM context as a permanent, append-only log file. But that’s not how humans solve problems. We hit dead ends, we scrap bad drafts, and we forget irrelevant details.
With DeepClause, our agents may be able to do the same. By combining Prolog’s control flow with the LLM’s understanding, we might get agents that are cleaner, cheaper, and a lot less confused.
Sometimes the best way to move forward is to know exactly how to go back.

