Skelet #1 is a translated cycler – Coq agrees

Remember when I said that I’d probably be formalizing some more deciders next? Well, looks like I lied, because Skelet #1 managed to pull me in the very next day.

The Coq development stands at a chonky 1628 lines – almost as much as all the other Coq code in the repository combined. The main theorems are as follows:

  • first, we show that the machine quickly enters a configuration that can be described with the high-level symbolic representation:
    Lemma init : c0 -->* L <: C1 |> P :> R.
    Proof. unfold L, C1, R. execute. Qed.
    Definition initial: conf := (right, [l_C1], [r_P]).
    Lemma init' : c0 -->* lift initial.
    Proof. exact init. Qed.
  • then, we have the function fullstep, which attempts to make progress using either the @uni-cycle acceleration, the stride acceleration, or a simple symbolic rule. Here is a theorem that states that whatever this function returns is a valid transition:
    Lemma fullstep_spec : forall c c',
      fullstep c = Some c' ->
      lift c -->* lift c'.
  • the final cycle is described as follows:
    Lemma infinite_cycle : forall l,
      lift (right, l_C0 :: l, K) -->+ lift (right, l_C0 :: F ++ l, K).
    Three things to note here:
    • -->+ indicates that we have taken special care to make sure that this transition takes at least one step – this is necessary for the proof of non-halting to go through
    • my simulator only 982 steps to simulate this cycle, as opposed to @uni’s 1357. This is because the stride acceleration doesn’t require a C3 symbol to trigger – something like |> D x^n D x^n P $ is enough.
    • I have chosen a different point in the cycle than the one in Shawn’s blogpost, as my simulator was skipping right over that point (because, again, the stride acceleration is more general).
  • the cycling configuration is detected by the function is_cycling. If it returns true, then the machine doesn’t halt when started in this configuration.
    Theorem is_cycling_spec : forall c,
      is_cycling c = true ->
      ~ halts tm (lift c).

This is all extracted to OCaml, and driven by the following tiny program:

open Skelet1

let rec doit n c =
    if is_cycling c then
        Printf.printf "cycle found after %d steps\n" n
    else begin
        if n mod 1_000_000 == 0 then
            Printf.printf "%d steps done\n%!" n;
        match fullstep c with
        | Some c' -> doit (n + 1) c'
        | None ->
            Printf.printf "unknown transition after %d steps\n" n

let _ = doit 0 initial

The results are as follows:

$ time ./skelet1
87000000 steps done
cycle found after 87637653 steps

real    1m36.329s
user    1m36.240s
sys     0m0.086s

Perhaps one of the more interesting parts of the proof is the uni-cycle, because it involves computation (simple_step evaluates to this, so consider what happens in that configuration now) significantly interleaved with more abstract reasoning (we have an assumption that says we can perform T strides on r, so we can do a stride now and r will advance by (say) 4096 strides).

This is all automated by perhaps the most complicated Ltac code I have written so far, culminating with a single line that takes over 10 seconds to execute (and then roughly another 10s at Qed, when the Coq kernel type-checks the proof term). The general approach for strides look like this:

  • compute stride 0 1 (...) for the current right side of the tape
  • if we can’t perform a stride, this will have evaluated to None, plain and simple. If we can, then it is some term that contains stride _ _ r.
  • we have an assumption about r that we can use to rewrite this and finish evaluating

My first attempt was taking around 7 seconds for each stride, because the intermediate term had the call to stride _ _ r buried under dozens of match expressions. Apparently Coq doesn’t like those.

This was solved with a “tail-recursive” implementation of the stride function, which uses manual continuation passing. This allows you to run the computation as far as it can go, and then capture the continuation into an Ltac variable.

Of course, all this complex machinery doesn’t need to be understood and checked to trust the proof, because the Coq kernel makes sure that whatever we did conforms to the rules of the particular type theory in use.

As a somewhat simpler example of a similar phenomenon, we have the max_stride function, which computes how many strides can be performed on a tape. The value which it returns doesn’t matter for correctness, as the result is only used to decide what stride count to ask of stride – if max_stride returns a value that is too high, then stride will return None and the relevant acceleration won’t be applied, affecting only performance.

I could talk at length about the formalization, with tidbits like the funky induction needed to prove the stride rule or the headache that is performant arithmetic in programs extracted from Coq (I make the slow but trivially correct option of using binary numbers directly from the Coq standard library, extracted to an OCaml ADT like any other Inductive definition). But I probably should stop right about now – it is 6 AM, and therefore my sleep schedule is shifting forwards even more. God, Coq can be way too addictive.


That’s fantastic. Congrats. Fresh congrats also to uni and Shawn for the discovery you’ve decisively confirmed.


Wow, this is amazing! It is such a relief to have this formally verified so that we can once and for all say that we didn’t bungle something deep in the simulation abstraction. Thanks so much @meithecatte , I think this will be an amazing addition to our eventual paper: Proving the holdouts using Coq is so much more satisfying than the previous handwavy arguments about holdouts being (examined by hand).

Congrats!!! This is such exciting news :slight_smile: Thank you for the hard work! Also congrats to @sligocki and @univerz!!