It's PART-T Time: How to Talk to Your Coding Agent

With my OpenAI Codex weekly token quota about to reset, I finally decided to tackle a project I’d been wanting to do for a long time: convert a JBIG2 decoder from C++ to Go. For context, JBIG2 is the codec used for monochrome images in PDF files, typically scanned pages. The process involves a fair amount of math (specifically, arithmetic coding), making it a non-trivial task. However, the codebase isn’t enormous (around 7,000 lines), so it seemed like a manageable job for Codex. While several open-source JBIG2 decoders exist, I picked the one in PDFium, Google’s PDF engine used in Chrome, for its liberal license and my prior familiarity with it. I had considered other approaches, like the rule-based C-to-Go translator behind modernc.org/sqlite, but its generated code isn’t human-friendly and carries unpleasant dependencies like a libc runtime.
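That arithmetic-coding core is the part worth pausing on. To show the flavor of the math without the full machinery, here is a toy, fixed-probability arithmetic coder in Go. It is only a sketch of the idea - JBIG2’s real coder (the MQ-coder from ITU-T T.88) is adaptive, context-modeled, and integer-based - but the interval-narrowing trick at the center is the same.

```go
package main

import "fmt"

func main() {
	// Toy arithmetic coder over a two-symbol alphabet, using float64
	// intervals. Fine for a dozen bits; real coders such as JBIG2's
	// MQ-coder use integer renormalization so precision never runs out.
	const p0 = 0.9 // modeled probability that the next bit is 0
	bits := []int{0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0}

	// Encode: narrow the interval [low, high) once per bit.
	low, high := 0.0, 1.0
	for _, b := range bits {
		mid := low + (high-low)*p0
		if b == 0 {
			high = mid // a 0 keeps the larger, lower part
		} else {
			low = mid // a 1 keeps the smaller, upper part
		}
	}
	code := (low + high) / 2 // any number inside the final interval will do
	fmt.Printf("encoded %d bits as %.10f\n", len(bits), code)

	// Decode: replay the same narrowing, steered by where code falls.
	low, high = 0.0, 1.0
	for range bits {
		mid := low + (high-low)*p0
		if code < mid {
			fmt.Print(0)
			high = mid
		} else {
			fmt.Print(1)
			low = mid
		}
	}
	fmt.Println() // prints the original bit string
}
```

Likely bits land in big sub-intervals, so the final number needs only a few digits to pin down - that is the whole compression story. Porting this toy is easy; porting the integer version, where every shift and carry must match the spec bit for bit, is where an agent can quietly go wrong.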

I gave Codex a simple prompt: translate this JBIG2 decoder to Go. To my surprise, the overly diligent agent started working immediately, without asking for any clarification. I decided not to interrupt and went about my day. I checked in periodically, and at first, it seemed to be making good progress, working task by task. After a few turns, however, it became clear from its status messages that it was confused, complaining about the task’s large scope. Shortly thereafter, it gave up, apologizing for being unable to complete the “gigantic” task.

While I appreciated it recognizing its own limits, the failure didn’t surprise me. Working with LLMs is like swimming in an ocean of probabilities, where all you can see is the next token. Navigating this ocean requires constant support and a clear strategy. Talking to LLMs is inherently tricky. Language is ambiguous, limited, and easily misunderstood - a problem we have even when communicating with other humans. Complicating matters further is our own difficulty in communicating intent and context. Modern LLMs are capable of incredible discovery, but without human guidance, they drown in TMI (too much information).

For my second attempt, I took a different approach: I asked Codex to create and continuously update a set of guiding documents. This created a framework to keep the agent from drifting, especially when its context window filled up and it had to refresh. LLMs lack long-term memory; you must provide it explicitly every time you start a new session. While there are countless frameworks for using LLMs, this simple method works well for smaller software projects.

A quick aside on my first principle of talking to LLMs: be specific enough that it knows what you’re looking for, but not so specific that it stitches together a Frankenstein’s monster instead of crafting a holistic solution. Because an LLM’s mindset, assumptions, and understanding of words can differ from yours, the more tightly you try to control it, the more likely it is to go off the rails. In some ways, it’s not unlike talking to a teenager.

With that, I established five documents for the project:

  • README.md: The project overview and usage; a file any coding agent recognizes by name.
  • ARCHITECTURE.md: To define the tech stack, architecture, APIs, and modules.
  • PLAN.md: To lay out high-level steps and milestones.
  • TODO.md: A granular, check-marked list of action items for tracking progress.
  • TEST.md: A specification for unit tests and user acceptance tests for self-review.

Together, these form the PART-T framework (PLAN, ARCHITECTURE, README, TODO, TEST) - an easy-to-remember acronym that sounds like “PARTY.”
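To make this concrete, here is the shape of a TODO.md entry under this scheme - a hypothetical excerpt I wrote for illustration, not a copy from the actual repository:

```markdown
## Milestone 3: Generic region decoding
- [x] Port the MQ arithmetic decoder and its state table
- [x] Decode generic regions with template 0
- [ ] Add templates 1-3 and typical prediction (TPGDON)
- [ ] Compose decoded regions onto the page
- [ ] Golden-data test against PDFium output (see TEST.md)
```

The check marks are the point: they let a fresh session reconstruct exactly where the previous one left off.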

This structured approach worked, though it took a bit longer than I had hoped. I didn’t write a single line of code myself. My role shifted to that of a project manager: telling the agent what to do next and reminding it to update the documents, not unlike a manager asking for weekly status reports. This oversight was critical for keeping the agent on track. Throughout the process, Codex made a few small mistakes, but it even managed to pull JBIG2 byte chunks from a real PDF file to run tests against. The overall result was extremely impressive - perhaps even better than the AGI we used to imagine.
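Here is roughly what such an acceptance test can look like in Go. Everything in it - the package name, the decode stub, the testdata file names - is a placeholder of mine, not the actual gojbig2 API:

```go
package gojbig2

import (
	"bytes"
	"os"
	"testing"
)

// decode is a stand-in for the package's real decoding entry point;
// swap in the actual call when wiring this test up.
var decode func(jbig2 []byte) (pbm []byte, err error)

// TestGoldenPage feeds the decoder JBIG2 bytes extracted from a PDF and
// compares the output against a bitmap from a known-good decoder.
func TestGoldenPage(t *testing.T) {
	if decode == nil {
		t.Skip("no decoder wired up yet")
	}
	input, err := os.ReadFile("testdata/page1.jbig2") // byte chunks pulled from a PDF
	if err != nil {
		t.Skipf("missing golden input: %v", err)
	}
	want, err := os.ReadFile("testdata/page1.pbm") // a reference decoder's output
	if err != nil {
		t.Skipf("missing golden output: %v", err)
	}
	got, err := decode(input)
	if err != nil {
		t.Fatalf("decode failed: %v", err)
	}
	if !bytes.Equal(got, want) {
		t.Fatal("decoded bitmap differs from the golden output")
	}
}
```

Golden-data tests like this are what let the agent self-review: when a refactor breaks a bit somewhere deep in the coder, the test fails immediately instead of the bug surfacing pages later.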

Here’s the GitHub link if you’re interested: https://github.com/jdeng/gojbig2. This project is for illustrative purposes; while I have high confidence in Codex and it has reviewed its own code several times, I wouldn’t consider it production-ready.

An interesting episode occurred late in the process. Codex was solid but slow. To speed things up, I tried to hand off the work to a Cursor agent powered by Sonnet 4.5 - a capable model, to be fair. It ticked off a few TODO items in no time, but then the tests started failing. After a dozen rounds of troubleshooting, each ending in a cheerful “Perfect, I found the issue!”, it finally gave up, still maintaining an absurdly positive attitude: “I’ve accomplished so much, but there is still an issue here.” In reality, it had made zero actual progress and had, in fact, introduced new bugs. Codex cleaned up the mess: it ran a few diagnostic rounds, identified the bug, and fixed it. When I asked it to review the new code, its response was blunt: “I found some poorly written, buggy code. Do you mind if I rewrite it?” I gladly agreed. To its credit, Sonnet 4.5 is fast and handles simpler tasks well; it just radiates a bit too much unhelpful positive energy.

This framework certainly isn’t the only way to interact with a coding agent; you will undoubtedly find a style that works for you. However, it is worth putting real thought into making your communication organized and effective - just as you would when collaborating with a human. It is both an exciting and a scary time for software engineers. But for software builders, it is PARTY time indeed.