[Decider] Cyclers

cosmo · March 27, 2022, 1:37pm

Cyclers: principle

The decider keeps in memory all configurations (i.e. memory tape + head position + machine state) seen during the execution of the machine and returns true if the same configuration is seen twice.

Decider examples and counterexamples

Examples: #279,081; #4,231,819; #279,081
Counterexamples: #59,090,563; #9,281,450; #76,727,755

Decider code

https://github.com/bbchallenge/bbchallenge-deciders/tree/main/decider-cyclers

Decider tests

https://github.com/bbchallenge/bbchallenge-deciders/blob/main/decider-cyclers/main_test.go

Results

28/01/22: 11,229,238 cyclers were decided. Indices of decided machines available http://docs.bbchallenge.org/bb5_decided_indexes/cyclers-run-11c0ef00e9c2-time-1000-space-500-minIndex-0-maxIndex-14322029.

Subset of application

The decider was applied only on the first 14,322,029 machines of the seed database as they are the ones that exceeded the time limit of 47M steps. It is unlikely that a cycler would exceed the 12k space limit.

This corresponds to 14,322,029 machines tested.

Decider correctness

Proof available at https://github.com/bbchallenge/bbchallenge-proofs/blob/build-latex-pdf/deciders/correctness-deciders.pdf.

Nathan-Fenner · September 4, 2022, 6:59pm

I’ve created a second implementation of the Cyclers decider(*) in Dafny. Dafny (GitHub - dafny-lang/dafny: Dafny is a verification-aware programming language) is a “verification-aware programming language”. This means that it’s possible to not only write up an implementation, it is possible to ask Dafny to prove that the implementation is correct.

In addition to writing plain code, Dafny allows you to write specifications that describe the expected behavior of your code. The compiler then confirms that all of your specifications are correct. In this case, this culminates in writing:

// This method returns the nth TM state for a program, starting from the
// initial state.
function method machine_iter_n(program: Program, n: nat): MachineStep {
  machine_step_n(program, initial, n)
}

// By definition, a machine halts if there's a step such that at that point, it halts.
predicate program_eventually_halts(program: Program) {
  exists n :: n >= 0 && machine_iter_n(program, n).Halt?
}

predicate program_loops_forever(program: Program) {
  !program_eventually_halts(program)
}

method cyclers_decider(program: Program, time_limit: nat)
returns (finds_loop: bool)
ensures finds_loop ==> program_loops_forever(program) {
  ...
}

In particular, assuming the implementation of the single-step TM function machine_step_n can be trusted, this ensures that when cyclers_decider returns true, the corresponding TM really does halt. This means that bugs in the cyclers_decider function cannot cause false positives!

Differences

In this program, the configuration is described relative to the head of the machine, not relative to the starting position. That means that https://bbchallenge.org/59645887 is a cycler, not a translated cycler!

However, it will still be defeated by other translated cyclers like https://bbchallenge.org/15167997 since the configuration relative to the head of the machine grows without bound.

For this reason, I expect the resolved machines to be a strict superset of the existing cycler output, and a strict subset of the existing translated cycler output.

Decider code

github.com

Nathan-Fenner/bbchallenge-dafny-deciders/blob/main/cyclers-decider.dfy

// A bit is either 0 or 1.
datatype Bit = B0 | B1
// A tape is a cons-list of bits. InfiniteZeros conceptually represents an infinite sequence of 0s.
// Hence, it is a "half-infinite" tape of bits, describing "one side" of the Turing machine
// configuration.
// We use the name 'RawTape' here since these are not normalized; we want to avoid
// having tapes like "010..." since there is no difference between "010..." and "01...".
datatype RawTape = InfiniteZeros | Cons(head: Bit, tail: RawTape)

// A tape is "good" if it doesn't end in a B0 bit.
// Therefore, it must be empty, or if its head is B0, the next one is not empty.
// Additionally, its tail must be "good".
predicate good_tape(r: RawTape) {
  r == InfiniteZeros || (good_tape(r.tail) && (r.head == B0 ==> r.tail != InfiniteZeros))
}

// A `Tape` is any `RawTape` which does not end in a B0.
// This ensures that each Tape represents a genuinely different state.
type Tape = t: RawTape | good_tape(t) witness InfiniteZeros

This file has been truncated. show original

Remaining Tasks
I’ve updated the file to include a basic test; it runs the current BB(5) candidate for 47_176_869 steps and confirms that it has not halted, and then runs for 47_176_870 steps and confirms that it has.

I still need to run the decider itself to confirm the results and make sure that there’s no bugs in the specification.

cosmo · September 4, 2022, 7:08pm

That is fantastic thank you. This is the first effort for formally verifying code so far in bbchallenge.

Thank you for having started this trend!

Really cool. I assume that in order to trust the single-step TM function, a good unittest would be to verify that the 5 state champion, 1RB1LC_1RC1RB_1RD0LE_1LA1LD_1RZ0LA, does halt after 47,176,870 steps.

TonyG · October 5, 2022, 10:22am

As a kind of warm-up exercise, I wrote my own Cyclers Decider in C++, which outputs a Decider Verification File (dvf) as well as an Undecided Machines File (umf). I also wrote a Verifier, which checks the dvf for correctness. You can find them at bbchallenge/Cyclers at main · TonyGuil/bbchallenge · GitHub.

The Decider was run with time limit 1,000 and space limit 200, and it found 11,229,238 Cyclers, as expected. Afterwards I ran it with time limit 10,000 and space limit 5,000, and no more Cyclers were found; so we can be failry confident that there are no more out there.

See the README file at the link for the format of the dvf.

cosmo · October 5, 2022, 10:43am

Great news, thank you!

so we can be failry confident that there are no more out there.

To mitigate a bit that statement, @sligocki has found a 4-state translated cycler (yes, not a cycler, a translated cycler so not directly comparable) with period of 7,129,704 steps.

So there is a little chance that some monstrous cyclers or translated cyclers are still lingering around.

TonyG · October 5, 2022, 11:20am

@cosmo: I am having trouble verifying the machines at that link. It seems to me that the repeated states are not occurring at record values of the tape head, which means they can’t be Translated Cyclers. Have I misunderstood something? If only they provided proper Verification Data!

cosmo · October 5, 2022, 11:46am

I move the conversation to [Decider] Translated cyclers

meithecatte · July 27, 2023, 2:07am

Greetings! I noticed that there isn’t very much formal verification in this project, so I decided to apply some Coq.

Writing a Cyclers decider seemed like a good starting point, so here it is: GitHub - meithecatte/busycoq: Busy Beaver deciders backed by Coq proof

I decided to approach this with an architecture of untrusted decider written in Rust + a verifier proven correct checking certificates produced by the decider.

Needless to say, the results match the index list linked upthread.

Within the Coq development, I define Turing machines in two ways:

through small-step semantics, using abstract infinite lists to represent the tape: TM.v
as an executable function: Compute.v

Of course, both definitions have been proven equivalent. The main reason for this apparent duplication is to make the proof easier, but it also has the nice aspect that each definition gives more reason to trust the other one.

The main correctness theorem can be found in Cyclers.v:

Theorem verify_cycler_correct : forall tm n k,
  verify_cycler tm n k = true -> ~ halts tm c0.