Background:
Skelet’s bbfind is the Swiss-Army Gun of decider programs.
One of its methods, BL_Proof, gets two beavers of extreme difficulty. (The Finite Automata Reduction decider is either too slow or useless, HS/CPS are useless, and zoology-based deciders besides this Skelet one don’t apply.)
bbfind be tested via @Wizord’s adaptation: something like fpc -O3 bbfind-stdin.pp to build it with the Free Pascal Compiler, and echo 1RB0RB_1RC0LB_1LD1RE_1LB0LD_0RA--- | ./bbfind-stdin > 4049716.out to run it and capture the proof. It is then necessary to clean up the inconsistent usage of carriage-return characters, and to carefully undo the line breaking if the next step is to parse rather than print it.
Skelet describes the method as follows:
linear-binary sequence: In this case k is unlimited, there is only 1 exone, number of different intrones is limited by some small constant B.
If we ignore exone sequences (e|n), the remaining part of the tape form some B-nary counter during the tape transitions.
Infinitness of these machines is proved by the procedure TryBL_Proof.
It’s a nice explanation, except for being false, so actually let’s ignore it. Once a proof is fully reverse engineered, a “besides those exons, which don’t count” interpretation of these words, which is not false, may become available.
The non-status output for https://bbchallenge.org/4049716 starts with
List of the Special rules:
ind p c pattern1 pattern2
7 2 8: *--+C>(--)* -> *(++)--+C>* sp>
9 -2 2: *(++)<B* -> *<B(--)* sp>
These represent rewrite-rules which you can get by composing multiple TM transitions, are chainable, and are used below. The rule-index numbers in the “ind” are reused below to show where these rules are applied. Here, - means 0 and + means 1. The C> and <B are widespread oriented-head notations. Parentheses are simply grouping, but they also show what repeats n times when you form the n-fold composition of the rule. The symbol * just means arbitrary tape content (so long as it’s consistent on both sides of the ->) — for now. In other words, these are infix rewrite rules for the TM.
Next, First 400 macro steps are given. A macro step is the application of one of the above rules (if possible) or else a direct TM transition. This section is not essential to the proof, but it illustrates some features of Skelet’s simulation.
It also shows a notational point, literally: the . is notation for an edge of a config. (We’re no longer dealing with infixes.)
Anyway, first it introduces something Skelet mentions on the homepage: “Notation ‘(a|n)’ means that word a is repeated n times on tape.” This notation is used on-demand, when one of “the Special rules” leaves an instance behind. Second, there’s a curious variation, where it’s instead written like (n|a) in left-of-head occurrences. This is an internal implementation detail leaking out: the program internally stores tape content with a consistent head-to-edge ordering. Outputting it that way is out of the question, since there’s a clear orthography for TM configs, so this is “fixed” using mirror-writing on the left side. It’s mainly important to understand this quirk in other settings, where the “n” expressions can involve multi-digit numbers or non-commutative operations, and you’re fucked if you take the notation at face value.
Next up, the victorious BL_Proof succeed ! and some definitions:
Left * consists of the expressions:
a=+-++++--(++|p+3)
b=--(++|p+3)
Right * consists of the expressions:
a=+--+-+(--|p+3)
b=+----+-(--|p+3)
c=+-+-+--++-(--|p+3)
d=+-+-(--|p+2)
So now the * notation will refer to a repetition. Like a Kleene-star, though based on what follows the * pattern means one or more occurrences.
Remember how I said that violating the orthography is out of the question, and internally reversed left-tape configs will be “fixed” by mirror-writing? Well, not here, because fuck you.
So: from now on, a * symbol on the head’s left stands for the same pattern as the regexp ((11){3,}00111101 | (11){3,}00)+, and a * symbol on the head’s right stands for the same pattern as the regexp (100101(00){3,} | 1000010(00){3,} | 1010100110(00){3,} | 1010(00){2,})+.
Next, Closed transition graph proof follows:
I believe this can pretty much be taken at face value.
Once the * definition is figured out, the only real decision to make is where to start building a closed language.
Skelet’s program found this to be an advantageous choice:
ind rule LsdR
0 -1: .<A. --> .*<B(--|p+3)+--+-+--+--+.
The (composite) transitions between configs in the closed language are numbered here, with an indicator of “the Special rule” used (if any). LsdR hopefully stands for left/state/direction/right.
When you add a transition, the “after” side of it should always show the head entering a *.
If this * pattern “consists of” B expressions—in this case B=2, based on the Left * definition—we need to have transitions defined for 2B specific patterns that * can represent (depending on which sub-expression the head is running into, and whether or not there additional repetitions beyond that).
Indeed, 4 lines follow, with
.*(3+p|++)--++++-+<B(--|p+3)+--+-+--+--+.,
. (3+p|++)--++++-+<B(--|p+3)+--+-+--+--+.,
.*(3+p|++)-- <B(--|p+3)+--+-+--+--+.,
. (3+p|++)-- <B(--|p+3)+--+-+--+--+.
before the -->. (Spacing added to show the pattern.)
Each time we need to add a transition, we should be able to simulate the “before” side until the head is about to enter a *, making that the “after” side and continuing until the transition graph is closed.
Note that this process (expanding a *, in particular) may add a pattern like (++|p+3) onto a tape which already contains a pattern using the variable p. An example in this proof is the line
178 7: .(3+p|++)--+C>+-+-+--++-(--|p+3)+-+-+--++---+--+-+--+--+. --> .*(3+p|++)--++++-+<B(--|p+3)+----+---+--+.
Because of where these expressions came from, we should understand the 2 p variables on the same tape to be independent. (Of course, the first p on the “before” side corresponds to the first p on the “after” side, ditto the second.)
For the 2 machines which (so far) need this, I’m seeking to translate this output into a CTL representation which will be hundreds of times easier to parse and verify.
And finally, does Skelet’s homepage description of the method work better in hindsight?
Maybe, if the one-exon-B-introns description is understood to apply independently to the two sides of the tape. So (++) would be the one exon on the left, and (--) would be the one exon on the right.
Pursuing the interpretation further might make sense for anyone trying to get a rule-proving decider to handle these critters… which could have a greater benefit than 2 TMs if CPSable TMs also count, or for going beyond BB5.