Remember when I said that I’d probably be formalizing some more deciders next? Well, looks like I lied, because Skelet #1 managed to pull me in the very next day.
The Coq development stands at a chonky 1628 lines – almost as much as all the other Coq code in the repository combined. The main theorems are as follows:
- first, we show that the machine quickly enters a configuration that can be described with the high-level symbolic representation:
Lemma init : c0 -->* L <: C1 |> P :> R. Proof. unfold L, C1, R. execute. Qed. Definition initial: conf := (right, [l_C1], [r_P]). Lemma init' : c0 -->* lift initial. Proof. exact init. Qed.
- then, we have the function
fullstep, which attempts to make progress using either the @uni-cycle acceleration, the stride acceleration, or a simple symbolic rule. Here is a theorem that states that whatever this function returns is a valid transition:
Lemma fullstep_spec : forall c c', fullstep c = Some c' -> lift c -->* lift c'.
- the final cycle is described as follows:
Three things to note here:
Lemma infinite_cycle : forall l, lift (right, l_C0 :: l, K) -->+ lift (right, l_C0 :: F ++ l, K).
-->+indicates that we have taken special care to make sure that this transition takes at least one step – this is necessary for the proof of non-halting to go through
- my simulator only 982 steps to simulate this cycle, as opposed to @uni’s 1357. This is because the stride acceleration doesn’t require a
C3symbol to trigger – something like
|> D x^n D x^n P $is enough.
- I have chosen a different point in the cycle than the one in Shawn’s blogpost, as my simulator was skipping right over that point (because, again, the stride acceleration is more general).
- the cycling configuration is detected by the function
is_cycling. If it returns true, then the machine doesn’t halt when started in this configuration.
Theorem is_cycling_spec : forall c, is_cycling c = true -> ~ halts tm (lift c).
This is all extracted to OCaml, and driven by the following tiny program:
open Skelet1 let rec doit n c = if is_cycling c then Printf.printf "cycle found after %d steps\n" n else begin if n mod 1_000_000 == 0 then Printf.printf "%d steps done\n%!" n; match fullstep c with | Some c' -> doit (n + 1) c' | None -> Printf.printf "unknown transition after %d steps\n" n end let _ = doit 0 initial
The results are as follows:
$ time ./skelet1 [...] 87000000 steps done cycle found after 87637653 steps real 1m36.329s user 1m36.240s sys 0m0.086s
Perhaps one of the more interesting parts of the proof is the uni-cycle, because it involves computation (
simple_step evaluates to this, so consider what happens in that configuration now) significantly interleaved with more abstract reasoning (we have an assumption that says we can perform
T strides on
r, so we can do a stride now and
r will advance by (say) 4096 strides).
This is all automated by perhaps the most complicated Ltac code I have written so far, culminating with a single line that takes over 10 seconds to execute (and then roughly another 10s at
Qed, when the Coq kernel type-checks the proof term). The general approach for strides look like this:
stride 0 1 (...)for the current right side of the tape
- if we can’t perform a stride, this will have evaluated to
None, plain and simple. If we can, then it is some term that contains
stride _ _ r.
- we have an assumption about
rthat we can use to rewrite this and finish evaluating
My first attempt was taking around 7 seconds for each stride, because the intermediate term had the call to
stride _ _ r buried under dozens of
match expressions. Apparently Coq doesn’t like those.
This was solved with a “tail-recursive” implementation of the
stride function, which uses manual continuation passing. This allows you to run the computation as far as it can go, and then capture the continuation into an Ltac variable.
Of course, all this complex machinery doesn’t need to be understood and checked to trust the proof, because the Coq kernel makes sure that whatever we did conforms to the rules of the particular type theory in use.
As a somewhat simpler example of a similar phenomenon, we have the
max_stride function, which computes how many strides can be performed on a tape. The value which it returns doesn’t matter for correctness, as the result is only used to decide what stride count to ask of
stride – if
max_stride returns a value that is too high, then
stride will return None and the relevant acceleration won’t be applied, affecting only performance.
I could talk at length about the formalization, with tidbits like the funky induction needed to prove the stride rule or the headache that is performant arithmetic in programs extracted from Coq (I make the slow but trivially correct option of using binary numbers directly from the Coq standard library, extracted to an OCaml ADT like any other
Inductive definition). But I probably should stop right about now – it is 6 AM, and therefore my sleep schedule is shifting forwards even more. God, Coq can be way too addictive.