Skip to content

DRAFT: Control Flow Retrace feature #8636

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 3 commits into
base: develop
Choose a base branch
from

Conversation

lks9
Copy link
Contributor

@lks9 lks9 commented May 8, 2025

Description

Adds a feature to retrace and verify a single control flow path. The path can be given using --retrace as a string of 0s and 1s. From a very brief evaluation test, it seems to be faster than bounded loop unwinding. Only that it might be challenging to find a path to retrace. The 0s and 1s in the input retrace string follow the goto decisions in cbmc's internal goto program representation. During symbolic execution, files and line numbers for each decision are logged to the console, so the interactive use should be intuitive.

Some examples and evaluations are here: examples_and_evaluation.zip

Use Cases

  • Trying out different traces for interactive debugging with cbmc. The restriction to a single control flow path can make interactive debugging more worthwhile because there is no state splitting/merging. When there is a failed assertion (or similar), we know exactly the path on which the assertion is violated.
  • Symbolic testing using a control flow path as generalization to normal testing using concrete inputs.
  • Debugging when bugreports contain control flow traces. The approach is to record control flow paths (outside of cbmc) and retrace it using cbmc. This can be useful when we observe a crash of the recorded program and retrace it on a different computer (I found two papers describing related approaches, "Better bug reporting with better privacy" and REPT). The recording could have a minimal overhead, using for example the src-tracer we are developing.

Open problems/questions

  • Documentation and regression or unit tests are still missing, we are also a bit unsure on the way and extent to do that.
  • We are currently working on how to record a trace externally and then retrace it in cbmc.
  • Retracing line numbers instead of 0s and 1s: We think this would not be very predictable and only doable by using a complicated heuristics. So we did not try to implement this.
  • Recording a control flow path in cbmc: Modify cbmc, so that a control flow trace is recorded for a normal call with --unwind and --paths. We tried to implement this also in cbmc, but we faced several problems, like where to store the intermediate tracing decisions, as multiple branches are explored and states are sometimes copied, sometimes reused. Furthermore, the method symex_goto() is very complicated so that it is not trivial, when and which tracing decisions were taken.
  • How stable is the goto program translation in cbmc? If there is some kind of optimization in the goto-program translation that would make retracing more unpredictable? What about internal function calls like sqrt() in cbmc? Are the respective goto functions stable?
  • What about threads? They are supported by cbmc but we have not yet looked into them.
  • What about mixing BMC and retracing? For example, internal functions could be done by BMC and only selected parts by retracing.

Technical remarks

The implementation is mainly in the new method goto_retrace() in symex_goto.cpp.

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • n/a My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • White-space or formatting changes outside the feature-related changed lines are in commits of their own.

Copy link

codecov bot commented May 8, 2025

Codecov Report

Attention: Patch coverage is 82.44275% with 23 lines in your changes missing coverage. Please review.

Project coverage is 80.37%. Comparing base (3c915eb) to head (e99e11d).

Files with missing lines Patch % Lines
src/goto-symex/symex_goto.cpp 82.14% 20 Missing ⚠️
src/cbmc/cbmc_parse_options.cpp 62.50% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff            @@
##           develop    #8636    +/-   ##
=========================================
  Coverage    80.37%   80.37%            
=========================================
  Files         1686     1687     +1     
  Lines       206885   207009   +124     
  Branches        73       73            
=========================================
+ Hits        166276   166376   +100     
- Misses       40609    40633    +24     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@lks9 lks9 force-pushed the retrace_feature branch 3 times, most recently from 61240e2 to d7e20dd Compare May 8, 2025 17:58
lks9 and others added 2 commits May 8, 2025 20:17
Adds a feature to retrace and verify a single control flow path. The path
can be given using `--retrace` as a string of 0s and 1s. From a very brief
evaluation test, it seems to be faster than bounded loop unwinding. Only
that it might be challenging to find a path to retrace. The 0s and 1s in
the input retrace string follow the goto decisions in cbmc's internal goto
program representation. During symbolic execution, files and line numbers
for each decision are logged to the console, so the interactive use should
be intuitive.

Co-authored-by: Yosephine Setiawan <[email protected]>
I have not changed these lines but check-clang-format failed nonetheless.
Thus, I added this extra commit.
@lks9 lks9 force-pushed the retrace_feature branch 3 times, most recently from 978abb8 to 7554fc5 Compare May 9, 2025 13:40
This adds one test file for the new --retrace feature.
@lks9 lks9 force-pushed the retrace_feature branch from 7554fc5 to e99e11d Compare May 9, 2025 13:54
@lks9
Copy link
Contributor Author

lks9 commented May 9, 2025

I added the test file simple_goto.c but I think in the incorrect folder under regression/book-examples/simple-goto/. Please, could someone help me to find/create the correct location for the test file?

Apart from that, testing seems to work well. I would also add the max_while.c. For the documentation, I also don't know: Where and how? Thank you in advance!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant