Fix dangling‐capture in TestNodeConfig::complete_func to eliminate flaky IDLE returns #967

pepeRossRobotics · 2025-04-23T01:17:19Z

Summary

When TestNode is used for JSON-driven substitution rules, the default complete_func lambda in TestNodeConfig captured this, which at that point was the address of a temporary. After the temporary was move-constructed into the member _test_config, the captured pointer dangled, so complete_func() often made onRunning() return IDLE, triggering a LogicError. This patch changes the capture to copy return_status by value, which removes the UB and makes the test nodes reliably return SUCCESS or FAILURE as configured.

Background / Motivation

Reported in issue #927: “TestNode sometimes returns IDLE onRunning()”
Reproduced in a minimal repo: test nodes intermittently returned IDLE even when configured for SUCCESS
Added file/minitrace loggers, saw onRunning() → IDLE and confirmed it came from complete_func()
Debugging revealed complete_func = [this](){…} pointed at a destroyed temporary
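
The mechanism can be shown without the library. The sketch below uses hypothetical stand-in types (Status and Config are not the real BehaviorTree.CPP declarations) and keeps both objects alive so the behavior stays well defined: the copied closure still reads the source object's field, which is what dangles once the source is a destroyed temporary.

```cpp
#include <functional>

// Hypothetical stand-ins for illustration; not the real BehaviorTree.CPP types.
enum class Status { IDLE, RUNNING, SUCCESS, FAILURE };

struct Config {
  Status return_status = Status::SUCCESS;
  // The default member initializer runs while *this* object is constructed,
  // so the closure stores the address of that particular object.
  std::function<Status()> complete_func = [this]() { return return_status; };
};

// Copies a Config, changes only the copy, then asks the copy for its status.
Status statusSeenByCopy() {
  Config original;
  Config copy = original;                 // closure is copied, captured `this` still points at `original`
  copy.return_status = Status::FAILURE;   // not visible through the closure
  return copy.complete_func();            // reads original.return_status
}
```

Here statusSeenByCopy() reports the original's SUCCESS rather than the copy's FAILURE. If `original` had been a temporary that was already destroyed, the same call would read freed storage.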

Changes

--- a/include/behaviortree_cpp/test_node.h
+++ b/include/behaviortree_cpp/test_node.h
@@ struct TestNodeConfig
-  std::function<NodeStatus(void)> complete_func = [this]() { return return_status; };
+  /// Capture return_status by value to avoid dangling this-pointer
+  std::function<NodeStatus(void)> complete_func = [ret = return_status]() { return ret; };

Before:

onStart() → RUNNING
onRunning() → complete_func() returns IDLE  ← UB, triggers LogicError

After:

onStart() → RUNNING
onRunning() → complete_func() returns SUCCESS ← correct
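
A by-value init-capture takes a snapshot when the struct is constructed. The sketch below, again with hypothetical stand-in types rather than the library's, shows both sides of that: the closure survives a move intact, but it does not follow a return_status assigned after construction. Whether the second point matters depends on how TestNodeConfig is populated in practice, which this description does not show.

```cpp
#include <functional>
#include <utility>

// Hypothetical stand-ins for illustration; not the real BehaviorTree.CPP types.
enum class Status { IDLE, RUNNING, SUCCESS, FAILURE };

struct ConfigByValue {
  Status return_status = Status::SUCCESS;
  // Members are initialized in declaration order, so return_status is
  // already SUCCESS when this closure copies it into `ret`.
  std::function<Status()> complete_func = [ret = return_status]() { return ret; };
};

// The closure owns its own copy, so moving the struct cannot leave it dangling.
Status statusAfterMove() {
  ConfigByValue temporary;
  ConfigByValue owner = std::move(temporary);
  return owner.complete_func();
}

// Assigning the field later does not update the snapshot held by the closure.
Status statusAfterLateAssignment() {
  ConfigByValue cfg;
  cfg.return_status = Status::FAILURE;
  return cfg.complete_func();
}
```

Both functions return SUCCESS: the first because the copy is safe to move, the second because the closure still holds the value captured at construction.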

Testing

Added file- and minitrace-logger in user code to capture every tick.
Wrote a small BT with a single <Action ID="Test" …/> substituted via JSON.
Verified 1st tick RUNNING, 2nd tick SUCCESS.
Ran full scenario suite — no more LogicError.
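
The two-tick check described above can be approximated without the library. The following is a minimal sketch that assumes a hypothetical MiniTestNode standing in for TestNode; the names and the exception type are illustrative, not the BehaviorTree.CPP API.

```cpp
#include <functional>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; not the real BehaviorTree.CPP types.
enum class Status { IDLE, RUNNING, SUCCESS, FAILURE };

class MiniTestNode {
public:
  explicit MiniTestNode(std::function<Status()> complete_func)
    : complete_func_(std::move(complete_func)) {}

  Status tick() {
    if (!started_) {
      started_ = true;
      return Status::RUNNING;                 // plays the role of onStart()
    }
    const Status result = complete_func_();   // plays the role of onRunning()
    if (result == Status::IDLE) {
      // Mirrors the reported failure: IDLE is not a legal result here.
      throw std::logic_error("onRunning() must not return IDLE");
    }
    started_ = false;
    return result;
  }

private:
  std::function<Status()> complete_func_;
  bool started_ = false;
};

// Ticks a fresh node twice and records both results in order.
std::vector<Status> tickTwice(Status configured) {
  MiniTestNode node([configured]() { return configured; });
  const Status first = node.tick();
  const Status second = node.tick();
  return {first, second};
}
```

With a correctly captured status, tickTwice(Status::SUCCESS) yields RUNNING followed by SUCCESS, matching the sequence verified above.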

Related issues:
#927
and maybe #930

Thanks for reviewing 🙏 Let me know if you’d like any tweaks!

facontidavide · 2025-04-24T06:35:33Z

CI is red

facontidavide · 2025-04-24T06:49:55Z

I opted for a different solution, but thanks for finding the issue 😄

3540b4a

facontidavide · 2025-04-24T07:02:59Z

also, thinking more carefully about it, I decided to make a more radical change: 2da7906

Change the default capture of complete_func to capture return_status …

a4d5848

…by value: Now every TestNodeConfig instance has its own copy of return_status, and no dangling pointer ever occurs, eliminating the illegal IDLE return in onRunning().

facontidavide closed this Apr 24, 2025
