fix: clear error state on disabled-transitively cells when ancestor recovers by VishakBaddur · Pull Request #8784 · marimo-team/marimo

VishakBaddur · 2026-03-20T01:23:07Z

Root Cause

When a disabled-transitively cell's ancestor had an error and then recovered, the disabled cell permanently showed the ancestor's error state.

run_stale_cells() in runtime.py only re-queues non-disabled cells:

if cell_impl.stale and not self.graph.is_disabled(cid):
    cells_to_run.add(cid)

So disabled-transitively cells never got re-queued and never had a chance to reset their run_result_status from "exception" to "disabled".
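The effect can be reproduced with a toy model. This is a minimal sketch, not marimo's code: ToyCell, ToyGraph, stale_cells_to_run and the three-cell scenario are hypothetical stand-ins, and only the filter condition mirrors the snippet above.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCell:
    """Hypothetical stand-in for a notebook cell (not marimo's CellImpl)."""

    disabled: bool = False  # explicitly disabled by the user
    stale: bool = False
    run_result_status: str = "success"
    parents: list = field(default_factory=list)


class ToyGraph:
    """Hypothetical stand-in for the dataflow graph."""

    def __init__(self, cells):
        self.cells = cells

    def is_disabled(self, cid):
        """True if the cell or any of its ancestors is explicitly disabled."""
        stack, seen = [cid], set()
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            if self.cells[current].disabled:
                return True
            stack.extend(self.cells[current].parents)
        return False


def stale_cells_to_run(graph):
    """Same filter as the snippet above: stale and not disabled."""
    cells_to_run = set()
    for cid, cell_impl in graph.cells.items():
        if cell_impl.stale and not graph.is_disabled(cid):
            cells_to_run.add(cid)
    return cells_to_run


# Assumed scenario: "child" reads from "src" (which raised) and from "off"
# (explicitly disabled), so "child" is disabled transitively.
graph = ToyGraph(
    {
        "src": ToyCell(stale=True, run_result_status="exception"),
        "off": ToyCell(disabled=True, run_result_status="disabled"),
        "child": ToyCell(
            stale=True, run_result_status="exception", parents=["src", "off"]
        ),
    }
)

queued = stale_cells_to_run(graph)

# Simulate a successful re-run of everything that was queued.
for cid in queued:
    graph.cells[cid].run_result_status = "success"
    graph.cells[cid].stale = False

print(queued)  # only "src" is queued
print(graph.cells["child"].run_result_status)  # still "exception"
```

In this sketch "src" recovers, but "child" is never queued, so nothing ever overwrites its "exception" status.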

Fix

- Added is_any_ancestor_errored() to DirectedGraph
- In run_stale_cells(), after building cells_to_run, reset run_result_status to "disabled" for any disabled-transitively cell whose ancestor no longer has an error
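The two steps above can be sketched on a toy graph. This is an illustration under stated assumptions, not the PR's implementation: ToyCell, the free functions and the cell ids are hypothetical, and the graph is a plain dict of parent links rather than marimo's DirectedGraph.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCell:
    """Hypothetical stand-in for a notebook cell."""

    disabled: bool = False  # explicitly disabled by the user
    run_result_status: str = "success"
    parents: list = field(default_factory=list)


def ancestors(cells, cid):
    """All transitive parents of a cell (the cell itself is excluded)."""
    seen = set()
    stack = list(cells[cid].parents)
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(cells[current].parents)
    return seen


def is_any_ancestor_errored(cells, cid):
    """True if some ancestor currently has status "exception"."""
    return any(
        cells[a].run_result_status == "exception" for a in ancestors(cells, cid)
    )


def is_disabled(cells, cid):
    """True if the cell or any ancestor is explicitly disabled."""
    return cells[cid].disabled or any(
        cells[a].disabled for a in ancestors(cells, cid)
    )


def clear_recovered_errors(cells):
    """Reset stale "exception" statuses on disabled-transitively cells."""
    cleared = []
    for cid, cell in cells.items():
        if (
            is_disabled(cells, cid)
            and not cell.disabled  # transitively disabled, not explicitly
            and cell.run_result_status == "exception"
            and not is_any_ancestor_errored(cells, cid)
        ):
            cell.run_result_status = "disabled"
            cleared.append(cid)
    return cleared


cells = {
    "src": ToyCell(run_result_status="exception"),
    "off": ToyCell(disabled=True, run_result_status="disabled"),
    "child": ToyCell(run_result_status="exception", parents=["src", "off"]),
}

cleared_before = clear_recovered_errors(cells)  # "src" is still errored
cells["src"].run_result_status = "success"  # the ancestor recovers
cleared_after = clear_recovered_errors(cells)
```

While "src" is still errored nothing is reset; once it recovers, only "child" is reset, because "off" is explicitly disabled and is skipped by the second condition.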

Testing

Added test_is_any_ancestor_errored to tests/_runtime/test_dataflow.py, verifying that the new graph method reports an errored ancestor and stops reporting it once the ancestor's status is reset.

…ecovers

Fixes marimo-team#8072

When a disabled-transitively cell's ancestor had an error and then recovered, the disabled cell permanently showed the ancestor's error state. This happened because run_stale_cells() only re-queues non-disabled cells, so disabled-transitively cells never got a chance to reset their run_result_status from 'exception' to 'disabled'.

Fix:
- Add is_any_ancestor_errored() to DirectedGraph
- In run_stale_cells(), after building cells_to_run, reset run_result_status to 'disabled' for any disabled-transitively cell whose ancestor no longer has an error


Copilot

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

Copilot · 2026-03-30T20:31:01Z

marimo/_runtime/dataflow/graph.py

+    def is_any_ancestor_errored(self, cell_id: CellId_t) -> bool:
+        """Check if any ancestor of a cell has an error."""
+        return any(
+            self.topology.cells[cid].run_result_status == "exception"


is_any_ancestor_errored() only treats run_result_status == "exception" as an error, but the runtime also uses other error-like statuses (e.g. "marimo-error" is set for semantic/registration errors). The method name/docstring says “has an error”, so this narrow check is likely to be reused incorrectly and can cause false negatives when an ancestor is still in an error state.

Consider either (a) broadening the predicate to include all statuses that should be treated as “errored” (at least "exception" and "marimo-error", possibly "interrupted" depending on intent), or (b) renaming the method or tightening its docstring to make it explicit that it only checks for raised exceptions.

Suggested change

-            self.topology.cells[cid].run_result_status == "exception"
+            self.topology.cells[cid].run_result_status in ("exception", "marimo-error")
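One way to keep the broadened check in a single place is a named set of statuses. A minimal sketch, assuming only the two status strings named in this comment; ERRORED_STATUSES, is_errored_status and any_errored are hypothetical names, not part of marimo.

```python
from typing import Iterable, Optional

# Statuses treated as "errored". The two values come from the review comment;
# whether "interrupted" belongs here is left open, as in the comment.
ERRORED_STATUSES = frozenset({"exception", "marimo-error"})


def is_errored_status(status: Optional[str], errored=ERRORED_STATUSES) -> bool:
    """True if the status counts as an error under the given set."""
    return status in errored


def any_errored(statuses: Iterable[Optional[str]]) -> bool:
    """True if at least one status in the iterable counts as an error."""
    return any(is_errored_status(s) for s in statuses)
```

Passing a different set as the second argument, for example one that adds "interrupted", changes the policy without touching the call sites.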

Copilot · 2026-03-30T20:31:02Z

marimo/_runtime/runtime.py

+        # Clear stale error state from disabled-transitively cells whose
+        # ancestor has recovered from an error. Without this, the disabled
+        # cell permanently shows the ancestor error even after it is fixed.
+        for cid, cell_impl in self.graph.cells.items():
+            if (
+                self.graph.is_disabled(cid)
+                and not cell_impl.config.disabled
+                and cell_impl.run_result_status == "exception"
+                and not self.graph.is_any_ancestor_errored(cid)
+            ):


This loop calls self.graph.is_disabled(cid) for every cell on every run_stale_cells() invocation. DirectedGraph.is_disabled() walks parents (BFS) and can become a noticeable hot path for large notebooks.

Since this block only targets disabled-transitively cells, consider using the already-tracked runtime state (cell_impl.runtime_state == "disabled-transitively" / cell_impl.disabled_transitively) instead of recomputing is_disabled() each time, or precomputing a disabled set once and reusing it in both loops.
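The "precompute a disabled set once" option could look roughly like this. It is a sketch on a plain children map, not marimo's graph API: disabled_closure and the cell ids are hypothetical, and the traversal runs once from the explicitly disabled cells instead of walking parents per cell.

```python
from collections import deque


def disabled_closure(children, explicitly_disabled):
    """Return explicitly disabled cells plus all of their descendants.

    children maps a cell id to the ids of cells that depend on it.
    """
    disabled = set(explicitly_disabled)
    queue = deque(explicitly_disabled)
    while queue:
        cid = queue.popleft()
        for child in children.get(cid, ()):
            if child not in disabled:
                disabled.add(child)
                queue.append(child)
    return disabled


# Assumed example graph: "off" is explicitly disabled, "child" depends on
# "src" and "off", and "leaf" depends on "child".
children = {
    "src": {"child"},
    "off": {"child"},
    "child": {"leaf"},
    "leaf": set(),
    "other": set(),
}
explicit = {"off"}

all_disabled = disabled_closure(children, explicit)
disabled_transitively = all_disabled - explicit
```

Each cell is enqueued at most once, so the set costs one pass over the reachable edges and can then be shared by both loops as a constant-time membership test.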

Copilot · 2026-03-30T20:31:02Z

marimo/_runtime/runtime.py

+        # Clear stale error state from disabled-transitively cells whose
+        # ancestor has recovered from an error. Without this, the disabled
+        # cell permanently shows the ancestor error even after it is fixed.
+        for cid, cell_impl in self.graph.cells.items():
+            if (
+                self.graph.is_disabled(cid)
+                and not cell_impl.config.disabled
+                and cell_impl.run_result_status == "exception"
+                and not self.graph.is_any_ancestor_errored(cid)
+            ):
+                cell_impl.set_run_result_status("disabled")
+


This block updates cell_impl.run_result_status but does not emit any CellNotification to the frontend. The frontend’s “errored”/error UI is driven by received cell-op messages (especially error outputs), and it doesn’t observe backend run_result_status directly.

If the goal is to clear the user-visible error state for disabled-transitively cells, this likely also needs an explicit UI update (e.g., clearing/replacing the error output and/or sending a status transition that resets the frontend’s errored flag). An alternative is to include these cells in the normal _run_cells queue so they go through the runner’s standard status transitions, plus explicitly clearing their error output when they’re skipped as disabled.
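The point about pairing the backend reset with an outbound message can be illustrated generically. This sketch does not use marimo's notification classes: notify is a hypothetical callback standing in for whatever sends a cell update to the frontend, and statuses is a plain dict.

```python
def reset_and_notify(statuses, recovered, notify):
    """Reset stale "exception" statuses and report each change.

    statuses:  dict mapping cell id -> run_result_status
    recovered: cell ids whose ancestors no longer have an error
    notify:    hypothetical callback (cell_id, new_status) -> None
    """
    for cid in recovered:
        if statuses.get(cid) == "exception":
            statuses[cid] = "disabled"
            # Without this call the change stays invisible to any observer
            # that only sees messages, not backend state.
            notify(cid, "disabled")


sent = []
statuses = {"child": "exception", "other": "success"}
reset_and_notify(
    statuses, ["child", "other"], lambda cid, status: sent.append((cid, status))
)
```

Only cells whose status actually changes produce a message, which keeps the update idempotent if the function runs on every invocation.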

Copilot · 2026-03-30T20:31:02Z

tests/_runtime/test_dataflow.py

+def test_is_any_ancestor_errored() -> None:
+    """Test that is_any_ancestor_errored correctly detects ancestor errors."""
+    graph = dataflow.DirectedGraph()
+    # Create a chain: 0 -> 1 -> 2
+    code = "x = 0"
+    first_cell = parse_cell(code)
+    graph.register_cell("0", first_cell)
+    code = "y = x"
+    second_cell = parse_cell(code)
+    graph.register_cell("1", second_cell)
+    code = "z = y"
+    third_cell = parse_cell(code)
+    graph.register_cell("2", third_cell)
+
+    # No errors initially
+    assert not graph.is_any_ancestor_errored("0")
+    assert not graph.is_any_ancestor_errored("1")
+    assert not graph.is_any_ancestor_errored("2")
+
+    # Set cell 0 to exception state
+    graph.cells["0"].set_run_result_status("exception")
+    assert not graph.is_any_ancestor_errored("0")  # no ancestors
+    assert graph.is_any_ancestor_errored("1")  # parent 0 has error
+    assert graph.is_any_ancestor_errored("2")  # grandparent 0 has error
+
+    # Fix cell 0 - clear the error
+    graph.cells["0"].set_run_result_status("success")
+    assert not graph.is_any_ancestor_errored("0")
+    assert not graph.is_any_ancestor_errored("1")
+    assert not graph.is_any_ancestor_errored("2")


This test validates the new DirectedGraph.is_any_ancestor_errored() helper, but the PR’s user-facing behavior change is in Kernel.run_stale_cells() (clearing disabled-transitively cells’ stale error state when an ancestor recovers). Consider adding an integration-style runtime test that reproduces #8072 end-to-end (ancestor errors → downstream disabled-transitively cell shows error → ancestor fixed + run_stale_cells() → downstream cell no longer shows error/exception state). This would help ensure the run_stale_cells() logic stays correct as execution/notification behavior evolves.

VishakBaddur requested a review from dmadisetti as a code owner March 20, 2026 01:23

vercel bot deployed to Preview March 20, 2026 01:24

[pre-commit.ci] auto fixes from pre-commit.com hooks

387a4fa

for more information, see https://pre-commit.ci

vercel bot deployed to Preview March 20, 2026 01:26

mscolnick added the bug (Something isn't working) label Mar 20, 2026

mscolnick requested a review from Copilot March 20, 2026 17:44

Copilot AI reviewed Mar 20, 2026

mscolnick requested a review from Copilot March 30, 2026 20:22

Copilot started reviewing on behalf of mscolnick March 30, 2026 20:22

Copilot AI reviewed Mar 30, 2026

VishakBaddur wants to merge 2 commits into marimo-team:main from VishakBaddur:fix/disabled-cell-error-state-not-cleared
