Background:
Skelet’s bbfind
is the Swiss-Army Gun of decider programs.
One of its methods, BL_Proof
, gets two beavers of extreme difficulty. (The Finite Automata Reduction decider is either too slow or useless, HS/CPS are useless, and zoology-based deciders besides this Skelet one don’t apply.)
bbfind
be tested via @Wizord’s adaptation: something like fpc -O3 bbfind-stdin.pp
to build it with the Free Pascal Compiler, and echo 1RB0RB_1RC0LB_1LD1RE_1LB0LD_0RA--- | ./bbfind-stdin > 4049716.out
to run it and capture the proof. It is then necessary to clean up the inconsistent usage of carriage-return characters, and to carefully undo the line breaking if the next step is to parse rather than print it.
Skelet describes the method as follows:
linear-binary sequence: In this case k is unlimited, there is only 1 exone, number of different intrones is limited by some small constant B.
If we ignore exone sequences (e|n), the remaining part of the tape form some B-nary counter during the tape transitions.
Infinitness of these machines is proved by the procedure TryBL_Proof.
It’s a nice explanation, except for being false, so actually let’s ignore it. Once a proof is fully reverse engineered, a “besides those exons, which don’t count” interpretation of these words, which is not false, may become available.
The non-status output for https://bbchallenge.org/4049716 starts with
List of the Special rules:
ind p c pattern1 pattern2
7 2 8: *--+C>(--)* -> *(++)--+C>* sp>
9 -2 2: *(++)<B* -> *<B(--)* sp>
These represent rewrite-rules which you can get by composing multiple TM transitions, are chainable, and are used below. The rule-index numbers in the “ind” are reused below to show where these rules are applied. Here, -
means 0
and +
means 1
. The C>
and <B
are widespread oriented-head notations. Parentheses are simply grouping, but they also show what repeats n
times when you form the n
-fold composition of the rule. The symbol *
just means arbitrary tape content (so long as it’s consistent on both sides of the ->
) — for now. In other words, these are infix rewrite rules for the TM.
Next, First 400 macro steps
are given. A macro step is the application of one of the above rules (if possible) or else a direct TM transition. This section is not essential to the proof, but it illustrates some features of Skelet’s simulation.
It also shows a notational point, literally: the .
is notation for an edge of a config. (We’re no longer dealing with infixes.)
Anyway, first it introduces something Skelet mentions on the homepage: “Notation ‘(a|n)’ means that word a is repeated n times on tape.” This notation is used on-demand, when one of “the Special rules” leaves an instance behind. Second, there’s a curious variation, where it’s instead written like (n|a)
in left-of-head occurrences. This is an internal implementation detail leaking out: the program internally stores tape content with a consistent head-to-edge ordering. Outputting it that way is out of the question, since there’s a clear orthography for TM configs, so this is “fixed” using mirror-writing on the left side. It’s mainly important to understand this quirk in other settings, where the “n” expressions can involve multi-digit numbers or non-commutative operations, and you’re fucked if you take the notation at face value.
Next up, the victorious BL_Proof succeed !
and some definitions:
Left * consists of the expressions:
a=+-++++--(++|p+3)
b=--(++|p+3)
Right * consists of the expressions:
a=+--+-+(--|p+3)
b=+----+-(--|p+3)
c=+-+-+--++-(--|p+3)
d=+-+-(--|p+2)
So now the *
notation will refer to a repetition. Like a Kleene-star, though based on what follows the *
pattern means one or more occurrences.
Remember how I said that violating the orthography is out of the question, and internally reversed left-tape configs will be “fixed” by mirror-writing? Well, not here, because fuck you.
So: from now on, a *
symbol on the head’s left stands for the same pattern as the regexp ((11){3,}00111101 | (11){3,}00)+
, and a *
symbol on the head’s right stands for the same pattern as the regexp (100101(00){3,} | 1000010(00){3,} | 1010100110(00){3,} | 1010(00){2,})+
.
Next, Closed transition graph proof follows:
I believe this can pretty much be taken at face value.
Once the *
definition is figured out, the only real decision to make is where to start building a closed language.
Skelet’s program found this to be an advantageous choice:
ind rule LsdR
0 -1: .<A. --> .*<B(--|p+3)+--+-+--+--+.
The (composite) transitions between configs in the closed language are numbered here, with an indicator of “the Special rule” used (if any). LsdR
hopefully stands for left/state/direction/right.
When you add a transition, the “after” side of it should always show the head entering a *
.
If this *
pattern “consists of” B
expressions—in this case B=2
, based on the Left *
definition—we need to have transitions defined for 2B
specific patterns that *
can represent (depending on which sub-expression the head is running into, and whether or not there additional repetitions beyond that).
Indeed, 4 lines follow, with
.*(3+p|++)--++++-+<B(--|p+3)+--+-+--+--+.
,
. (3+p|++)--++++-+<B(--|p+3)+--+-+--+--+.
,
.*(3+p|++)-- <B(--|p+3)+--+-+--+--+.
,
. (3+p|++)-- <B(--|p+3)+--+-+--+--+.
before the -->
. (Spacing added to show the pattern.)
Each time we need to add a transition, we should be able to simulate the “before” side until the head is about to enter a *
, making that the “after” side and continuing until the transition graph is closed.
Note that this process (expanding a *
, in particular) may add a pattern like (++|p+3)
onto a tape which already contains a pattern using the variable p
. An example in this proof is the line
178 7: .(3+p|++)--+C>+-+-+--++-(--|p+3)+-+-+--++---+--+-+--+--+. --> .*(3+p|++)--++++-+<B(--|p+3)+----+---+--+.
Because of where these expressions came from, we should understand the 2 p
variables on the same tape to be independent. (Of course, the first p
on the “before” side corresponds to the first p
on the “after” side, ditto the second.)
For the 2 machines which (so far) need this, I’m seeking to translate this output into a CTL representation which will be hundreds of times easier to parse and verify.
And finally, does Skelet’s homepage description of the method work better in hindsight?
Maybe, if the one-exon-B
-introns description is understood to apply independently to the two sides of the tape. So (++)
would be the one exon on the left, and (--)
would be the one exon on the right.
Pursuing the interpretation further might make sense for anyone trying to get a rule-proving decider to handle these critters… which could have a greater benefit than 2 TMs if CPSable TMs also count, or for going beyond BB5.