An interesting parsing problem arises from @savask’s bouncers code.
We try to match the format s(n) = u1 w1^n u2 w2^n ... uk wk^n S> being given s(0), s(1), s(2). Example from @savask:
For example, if we have
s0 = 101 S>
s1 = 110011 S>
s2 = 110100111 S>
then
s(n) = 1 10^n 0 1^n 1 S>
and the tape formula will be
1 (10) 0 (1) 1 S>.
This is currently solved using dynamic programming, but we explore if there is a more efficient algo.
First, idea was the following greedy algorithm by @savask:
if all strings start with one symbol, throw it away. If not, try finding a longest prefix w of s1 such that w w is a prefix of s2, and throw those away.
Unfortunately, this fails on examples like the following:
I wrote another implementation of savask’s algorithm, in Rust. Like all my other deciders, it generates certificates that then get checked by a verifier proven correct in Coq. Thus, the results are almost guaranteed correct. This should brings us closer to a Bouncers decider that meets the verifiability & reproducibility standards – I think only the section of the decider-correctness document is missing?
When ran on the 32632 undecided machines, it decides 29799 machines in about 10 seconds. This is the same 29799 as Tony’s decider, according to @UncombedCoconut’s data. If I am understanding savask’s comments correctly, this also agrees with their implementation – since it’s slower by a few orders of magnitude, and requires a different I/O format, I didn’t bother actually running it to confirm.
This is very impressive work, you are really pushing bbchallenge on the tracks of automatic proofs at a faster rate than what I could have imagined in my dreams!!
I am going to read your code in depth as well as starting to write about bouncers in parallel.