[October 2nd 2022] Bug in backward reasoning, back to 3,574,222 machines

cosmo · October 2, 2022, 6:02pm

Unfortunately, a new bug has been found by @TonyG in the decider for [Decider] Backward reasoning. Hence, the results have been un-applied and we are back to 3,574,222 machines to decide.

The decider will be debugged, but we now require that it go through the [Debate & Vote] Deciders’ validation process before it is applied again, to minimise the chance of further backward reasoning bugs.

In particular, we will require a fully independent reproduction of the results produced by the debugged decider. Don’t hesitate to jump in if you are up for this challenge!

TonyG · October 3, 2022, 10:11am

I will post my code when I have tidied it up a little and worked out how to use GitHub.

cosmo · October 3, 2022, 12:22pm

Exciting! If you are interested, we have a Discord channel which is useful for real-time research discussions and also for sorting out technical stuff such as GitHub. We’d be happy to see you there, but no worries at all if you’d rather not!

sligocki · October 4, 2022, 3:06pm

I have a backtracking implementation that I can run to cross-validate. If you can (and aren’t already saving them), can you add printing of the debug params the decider discovers when proving? I am using these stats: max_tree_depth (max # of steps backwards before the proof completes), num_tree_nodes (number of nodes traversed in the BFS/DFS), and max_tree_width (the largest width at any step backwards … I use BFS, so this might not make sense for your DFS implementation). If we get 2-3 different implementations all agreeing on all the same TMs with the same params, I think we’ll have a lot more confidence!
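
For anyone following along, here is a minimal sketch of how a BFS backward-reasoning search could track these three stats. It is not the code of any decider in this thread. The machine encoding, the function name `backward_reasoning`, and the exact counting conventions (for example, whether depth 0 counts as a node) are assumptions made for illustration, and real implementations may count differently. It also assumes the machine has already been simulated for more steps than `max_depth` without halting.

```python
from typing import Dict, Optional, Tuple

# Hypothetical encoding: (state, read) -> (write, move, next_state),
# with move = +1 (right) or -1 (left). Missing keys are halting transitions.
Machine = Dict[Tuple[str, int], Tuple[int, int, str]]


def backward_reasoning(tm: Machine, states: str, max_depth: int) -> Optional[dict]:
    """Return search stats if every backward branch dies out, else None."""
    # Depth 0: partial configurations sitting on an undefined transition.
    # Each node is (state, head position, known tape cells).
    frontier = [(s, 0, {0: r}) for s in states for r in (0, 1) if (s, r) not in tm]
    stats = {
        "max_tree_depth": 0,
        "num_tree_nodes": len(frontier),
        "max_tree_width": len(frontier),
    }
    for depth in range(1, max_depth + 1):
        next_frontier = []
        for state, pos, tape in frontier:
            for (prev_state, read), (write, move, nxt) in tm.items():
                if nxt != state:
                    continue
                prev_pos = pos - move  # where the head was before this step
                # The step wrote `write` at prev_pos; a different known
                # symbol there means this predecessor is impossible.
                if tape.get(prev_pos, write) != write:
                    continue
                prev_tape = dict(tape)
                prev_tape[prev_pos] = read  # undo the write
                next_frontier.append((prev_state, prev_pos, prev_tape))
        if not next_frontier:
            return stats  # no predecessors left: halting is unreachable
        stats["max_tree_depth"] = depth
        stats["num_tree_nodes"] += len(next_frontier)
        stats["max_tree_width"] = max(stats["max_tree_width"], len(next_frontier))
        frontier = next_frontier
    return None  # undecided within max_depth


# Toy 2-state machine: (B, 1) is the only halting transition.
toy: Machine = {("A", 0): (1, +1, "B"), ("A", 1): (1, +1, "B"), ("B", 0): (0, -1, "A")}
toy_stats = backward_reasoning(toy, "AB", max_depth=10)

# Variant whose backward tree keeps growing, so the search gives up.
growing: Machine = {("A", 0): (1, +1, "B"), ("A", 1): (0, +1, "A"), ("B", 0): (0, -1, "A")}
growing_stats = backward_reasoning(growing, "AB", max_depth=5)
```

Under these conventions the toy machine is decided with max_tree_depth 1, num_tree_nodes 3 and max_tree_width 2, while the second machine returns None. If implementations disagree on stats but agree on which TMs are decided, differing conventions like these are the first thing to check.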

TonyG · October 4, 2022, 5:13pm

@sligocki, I have updated my BackwardReasoning decider to output these stats. See bbchallenge/BackwardReasoning at main · TonyGuil/bbchallenge · GitHub.