Fix replay rendering for `exec_command` and `update_plan` by kokoro-aya · Pull Request #215 · zed-industries/codex-acp

kokoro-aya · 2026-03-29T15:00:29Z

Context

I was using Codex ACP in Zed for various ongoing tasks. I was able to reopen threads from previous sessions or previous projects and see structured output for the CLI commands and the plans drafted during those threads.

After a thread in which a prompt pulled in some messy input (cross-references to local files and previous threads), rendering in the Codex CLI panel degraded: all CLI commands and plans rendered as generic exec_command and update_plan entries, which made the threads hard to read.

I cloned this repo, worked on it locally, and fixed the rendering issue in my fork, which lets me keep working on my projects without a degraded UX.

Summary

This PR fixes two history replay parity issues in codex-acp:

replayed FunctionCall(name="exec_command") entries were shown as generic exec_command tool calls instead of meaningful command titles such as Read ..., Search ..., or List ...
replayed FunctionCall(name="update_plan") entries were shown as generic tool calls instead of restoring the ACP plan UI

Live rendering already handled both cases better; the replay path did not.

Problem

When sessions were restored from history, replay did not fully reconstruct the richer ACP presentation used during live execution.

In practice this caused two visible regressions in old threads:

shell-backed commands stored as FunctionCall(name="exec_command") were replayed as generic tool calls
plan updates stored as FunctionCall(name="update_plan") were replayed as generic tool calls instead of plan updates

This made restored history less useful and less consistent with live session behavior.

Root cause

The replay path in thread.rs handled only a subset of historical tool-call shapes as structured replay events.

`exec_command`

Replay already special-cased a few shell-like function names:

shell
container.exec
shell_command

But historical FunctionCall(name="exec_command") fell through to the generic fallback path, so replay lost semantic tool metadata.
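To make the fall-through concrete, here is a minimal sketch of the pre-fix name check. The constant and function names are hypothetical and are not taken from thread.rs; only the three function names come from the list above.

```rust
/// Function names that replay special-cased before this PR (from the list above).
/// The constant itself is a hypothetical stand-in for the real match arms.
const SHELL_LIKE_BEFORE: &[&str] = &["shell", "container.exec", "shell_command"];

/// Returns true when a historical function call would have reached the
/// generic fallback path under the old name check.
fn falls_through_to_generic(name: &str) -> bool {
    !SHELL_LIKE_BEFORE.contains(&name)
}
```

Under this sketch, `falls_through_to_generic("exec_command")` is true while `falls_through_to_generic("shell")` is false, which mirrors the behavior described above.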

`update_plan`

Live plan updates already had a dedicated path:

PlanUpdate events were translated into SessionUpdate::Plan

But during replay, historical FunctionCall(name="update_plan") entries were not translated back into plan updates and instead fell through to the generic function-call fallback.

What changed

`exec_command`

Replay now treats exec_command as a shell-like function call.

It parses:

cmd
workdir

and reuses the existing command parsing logic to recover structured tool-call metadata during replay.

If parsing fails, replay still falls back to the previous generic behavior.
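The parse-then-fall-back behavior can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: the `ReplayedCall` type and the `replay_exec_command` function are hypothetical, and the arguments are modeled as an already-decoded string map rather than the stored JSON payload.

```rust
use std::collections::HashMap;

/// Simplified, hypothetical view of what replay can emit for a function call.
#[derive(Debug, PartialEq)]
enum ReplayedCall {
    /// Structured shell call: the command line plus an optional working directory.
    Shell { command: String, workdir: Option<String> },
    /// Generic fallback: only the raw function name is available.
    Generic { name: String },
}

/// Tries to recover a structured call from `exec_command` arguments.
/// Assumption: `args` stands in for the decoded arguments of the stored call.
fn replay_exec_command(name: &str, args: &HashMap<String, String>) -> ReplayedCall {
    if name == "exec_command" {
        // A missing or blank `cmd` counts as a parse failure.
        if let Some(cmd) = args.get("cmd").filter(|c| !c.trim().is_empty()) {
            return ReplayedCall::Shell {
                command: cmd.clone(),
                workdir: args.get("workdir").cloned(),
            };
        }
    }
    // Previous behavior is preserved for anything that cannot be parsed.
    ReplayedCall::Generic { name: name.to_string() }
}
```

The point of the design is that the structured path is purely additive: an entry that cannot be parsed renders exactly as it did before.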

`update_plan`

Replay now special-cases FunctionCall(name="update_plan").

It:

parses the stored plan arguments
emits SessionUpdate::Plan
tracks the corresponding call_id only during the current replay pass
suppresses the matching FunctionCallOutput during that same replay pass so replay does not emit a stray generic tool update afterward

This replay bookkeeping is local to the replay pass and does not become persistent session state.
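The bookkeeping described above can be illustrated with a small, self-contained sketch. The `HistoryItem` and `Update` types and the `replay` function are hypothetical simplifications, not the real `ResponseItem` or `SessionUpdate` types, and the parsed plan is modeled as an optional list of step strings (`None` meaning the stored arguments could not be parsed).

```rust
use std::collections::HashSet;

/// Hypothetical, simplified stand-in for stored history entries.
#[derive(Debug, Clone, PartialEq)]
enum HistoryItem {
    FunctionCall { name: String, call_id: String, plan_steps: Option<Vec<String>> },
    FunctionCallOutput { call_id: String },
}

/// Hypothetical, simplified stand-in for the updates replay emits.
#[derive(Debug, PartialEq)]
enum Update {
    Plan(Vec<String>),
    GenericToolCall { call_id: String },
    GenericToolOutput { call_id: String },
}

fn replay(history: &[HistoryItem]) -> Vec<Update> {
    // Lives only for this replay pass; never stored on the session.
    let mut plan_call_ids: HashSet<String> = HashSet::new();
    let mut updates = Vec::new();
    for item in history {
        match item {
            HistoryItem::FunctionCall { name, call_id, plan_steps } => match plan_steps {
                Some(steps) if name.as_str() == "update_plan" => {
                    // Remember the call so its output can be suppressed later.
                    plan_call_ids.insert(call_id.clone());
                    updates.push(Update::Plan(steps.clone()));
                }
                // Unparsable plans and other calls keep the generic path.
                _ => updates.push(Update::GenericToolCall { call_id: call_id.clone() }),
            },
            HistoryItem::FunctionCallOutput { call_id } => {
                // Skip outputs that belong to a call already shown as a plan.
                if !plan_call_ids.contains(call_id) {
                    updates.push(Update::GenericToolOutput { call_id: call_id.clone() });
                }
            }
        }
    }
    updates
}
```

Because the set is created inside `replay` and dropped when it returns, a second replay of the same history starts from a clean slate.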

Result

After rebuilding and reopening older threads in Zed:

replayed shell-backed tool calls render with meaningful titles again
replayed plan updates render as proper plan UI again

Examples observed after the fix include:

Read Foo.scala
Read Bar.scala
Read Baz.scala
List /Users/irony/Developer/some-project/src/some-module

All these commands were previously shown as exec_command.
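As a rough illustration of how a command line could map to titles like these, consider the sketch below. It is purely hypothetical: the `command_title` function and its program-to-verb table are assumptions made for this example and do not describe the existing command parsing logic that replay reuses.

```rust
/// Hypothetical title derivation for a simple command line.
/// Returns None when no meaningful title can be derived, in which case a
/// caller would keep the generic rendering.
fn command_title(cmd: &str) -> Option<String> {
    let mut parts = cmd.split_whitespace();
    let program = parts.next()?;
    // Treat the first argument that is not a flag as the target.
    let target = parts.find(|p| !p.starts_with('-'))?;
    match program {
        "cat" => Some(format!("Read {target}")),
        "ls" => Some(format!("List {target}")),
        "rg" | "grep" => Some(format!("Search {target}")),
        _ => None,
    }
}
```

For instance, `command_title("cat Foo.scala")` yields `Some("Read Foo.scala")`, while an unrecognized program such as `make build` yields `None`.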

Scope

This PR only changes history replay behavior.

It does not change:

live tool-call rendering
live plan update handling
stored rollout / session history data
authentication, session lifecycle, or prompt submission behavior

Risk

the changes are limited to replay
existing generic fallback behavior remains in place for unrecognized or unparsable function calls
no historical data is rewritten
the additional replay bookkeeping for update_plan is local to a single replay pass

Testing

Automated

Ran cargo test

Added 2 tests:

replaying FunctionCall(name="exec_command") produces a structured tool call instead of a generic one:

cargo test test_replay_exec_command_function_call_is_structured

replaying FunctionCall(name="update_plan") produces a plan update and suppresses the matching generic function-call output:

cargo test test_replay_update_plan_function_call_emits_plan_update

Ran cargo fmt --check

Manual

Built an artefact with cargo build --release
Added it as a custom agent in Zed
Opened older threads in Zed containing replayed exec_command calls
Opened older threads in Zed containing replayed update_plan entries
Compared the rendering of this custom agent against Codex CLI

Observed that:

command history now renders with meaningful titles
plan history now renders as plan UI

Issue relevance

This branch addresses the following replay downgrade problems:

generic replay of historical exec_command
generic replay of historical update_plan

It does not solve every rendering anomaly found during investigation, especially those related to unusual embedded transcript/context content.

I used Codex to help me investigate the issue, draft the code, and write the PR.

This commit switches the replay of `update_plan` from a generic tool call to a structured "plan update", matching how it is displayed while the thread is running.

What has been changed:

- A local `HashSet<String>` was added in the `handle_replay_history` function
- This local state is only used during the current replay, to index the `call_id`s associated with updated plans
- Also adjusted the signature of `replay_response_item`
- Updated the `FunctionCall` and `FunctionCallOutput` branches of `ResponseItem`
- in `ResponseItem::FunctionCall`, we first try to reconstruct the plan from the generic tool call; if that succeeds, we note down the `call_id` and use this plan
- in `ResponseItem::FunctionCallOutput`, we omit the outputs of these generic tool updates

Commit drafted by Codex.

kokoro-aya added 4 commits March 29, 2026 16:36

Extend replay for FunctionCall regarding exec_command

15250a0

Previously, only the following cases were supported:

- `shell`
- `container.exec`
- `shell_command`

This commit adds `exec_command` to this path for shell-like function calls.

Commit drafted by Codex.

Added a test case for the exec_command replay

61e2519

See test case `test_replay_exec_command_function_call_is_structured`. - Commit drafted by Codex.

Added a test case for update_plan replay

96e3abb

See test case `test_replay_update_plan_function_call_emits_plan_update`. - Commit drafted by Codex.

kokoro-aya wants to merge 4 commits into zed-industries:main from kokoro-aya:kokoro-aya/fix-replay-for-exec_command-and-update_plan
