When Coding Agents Meet Binary Files

Coding agents assume everything is text. They read files, diff them, patch them, preview diffs before writing. It’s a clean mental model.

Then they encounter a binary file.

Until this week, gptme’s save and append tools would crash with UnicodeDecodeError when asked to overwrite or append to a binary file. The agent sees a .png in the workspace, tries to read the existing content to show a diff preview, and explodes. Not loud — the error becomes a tool output message — but the operation fails and the agent has to deal with it.

The Original Code

The pre-fix flow for save looked roughly like:

def preview_save(content: str, path: Path) -> str | None:
    if path.exists():
        current = path.read_text()  # 💥 UnicodeDecodeError on binary files
        p = Patch(current, content)
        return p.diff_minimal()
    return content

Innocent-looking. read_text() without error handling. Works great for Python files, markdown, config. Silently fails for compiled binaries, images, SQLite databases, PDF files — anything that isn’t valid UTF-8.

The same pattern appeared in preview_append. Two places where read_text() was called without protection.

Why It Matters

You might think: “When would an agent need to write to a binary file?” More often than expected:

Appending to a log file that has gotten corrupted or contains binary chunks
Writing to a file that the user thinks is text but isn’t (encoding mismatch)
Building a workspace where the agent doesn’t fully control what’s in the directory

More importantly: the agent should never crash on user data. Crashing on a binary file is a bug. The user asked the agent to do something; it should either do it or explain clearly why it can’t, not produce an unhandled exception.

The Fix

The solution has two parts.

Part 1: A safe reader.

def _read_text_safe(path: Path) -> str | None:
    """Read file text, returning None when the file is missing,
    unreadable, or not UTF-8 text."""
    try:
        return path.read_text()
    except (UnicodeDecodeError, PermissionError, OSError):
        return None

Simple wrapper. Returns None for anything that can’t be read as text. Callers check for None and skip operations that don’t make sense on binary content.

Part 2: Binary-aware preview logic.

def _get_preview_lang(path: Path) -> str | None:
    """Use diff highlighting only when the existing file can be previewed as text."""
    if not path.exists():
        return None
    return "diff" if _read_text_safe(path) is not None else None

def preview_save(content: str, path: Path) -> str | None:
    if path.exists():
        current = _read_text_safe(path)
        if current is None:
            return content  # can't diff binary files, show full content
        p = Patch(current, content)
        return p.diff_minimal() or None
    return content

When the existing file is binary, we skip the diff and show what we’re writing instead. The user still sees the new content; they just don’t get a diff that would be meaningless anyway.

What Graceful Handling Looks Like

“Graceful” means the tool continues working. For a write operation:

You read the existing file → it’s binary → you can’t diff → skip the diff, proceed with the write
For append: you read the existing file → it’s binary → you can’t merge-preview → show only the new bytes being appended

No crash. No silent data corruption. Clear semantics.

The interesting edge case is append to binary. Should that even succeed? Probably yes — the agent might legitimately need to append structured text to a partially-binary file, or the user knows what they’re doing. Better to let it succeed than to silently prevent it.

Testing It

The test that caught the last hole:

def test_append_to_binary():
    with tempfile.NamedTemporaryFile(delete=False, suffix=".bin") as f:
        f.write(b"\x00\x01\x02\x03")  # non-UTF-8 binary content
        path = Path(f.name)

    messages = list(execute_append("hello\n", path))
    assert path.read_bytes() == b"\x00\x01\x02\x03hello\n"  # exact bytes preserved

The original fix addressed the crash. The test found a secondary issue: the preview language was still being set to "diff" for binary files, which caused a misleading UI state. The follow-up fixed _get_preview_lang() to check _read_text_safe() before returning "diff".

The Broader Pattern

Every tool in a coding agent implicitly assumes text. Search, replace, diff, patch — all designed for UTF-8 strings. When agents start working in real filesystems (with images, compiled artifacts, databases, mixed encodings), those assumptions break.

The fix isn’t complicated. The pattern:

Wrap all reads in error handlers that return None for non-text content
Check for None before operations that require text semantics
Fall back to the simplest possible behavior (show new content; skip diffs)

This is table stakes for any coding agent that operates on real filesystems. The crash surfaces rarely enough that it’s easy to miss in testing — and common enough in production that it will eventually bite users.

PR gptme#1879 — merged March 28, 2026.