Closed Tape Languages: asymmetric automata construction

UncombedCoconut · October 9, 2022, 4:18am

This is yet another implementation of the Filter - CTL concept.
The novelty is that we can just enumerate finite state machines to classify one side of the tape – which gets us to useful search depths and a bit beyond before the combinatorics kill us.
If we fix a DFA classifying left half-tapes, and consider the TM modulo its equivalence relation, it becomes a pushdown automaton. It gets better: though the languages recognized by PDAs are famously the context-free languages, the language of halting (or reachable) configurations is regular – and straightforwardly applicable algorithms in the literature give us a recognizer.

The earliest and most-cited version I found is [BEM97], which formalizes PDAs as always consuming the top stack symbol and shows how to recognize predecessor configurations. [FWW97] from a little later does it with successor states and PDAs which either push or pop one symbol, and is maybe a little easier to read. The first paper nails the use case of solving TMs directly.

The solver’s in a working/promising but incomplete state, pending a good way to compare it to others’ best CTL deciders. The competition includes Shawn’s toolbox, a polished inductive search by @Iijil, a z3-powered search by @djmati1111, and (maybe more of a complement) a verifier / guided solver by @Frans_Faase.

It’s a currently 176-line Python script, including comments/docstrings which prove the algorithm is sound, but not unit tests to prove it’s implemented correctly.
It takes a DB file as input plus optional parameters as follows:

$ ./decide_closed_tape_language_l2r.py --help
usage: decide_closed_tape_language_l2r.py [-h] [--db DB] [-l L] [-q] [seeds ...]

If a Closed Tape Language of given complexity proves a TM cannot halt, show it.

positional arguments:
  seeds        DB seed numbers

optional arguments:
  -h, --help   show this help message and exit
  --db DB      Path to DB file
  -l L         State limit for the left half-tape DFA. At worst, 5 (default) should be sub-second and 6 sub-minute.
  -q, --quiet  Do not output regexp proofs (for speed or to avoid depending on automata-lib)

The regexps it spits out to prove its decisions are disgustingly large, worse than I was getting out of my otherwise inferior brute-force CTL search, but they ought to be verifiable. AFAIK, outputting regexps I’d be happy with is a harder problem than proving the value of BB(5).

UncombedCoconut · October 9, 2022, 7:12am

It has also occurred to me that:

Some emitted expressions contain terms like (0 | 1 | other crap)* – simplifying to [01]* – so a theoretically bad simplification method (like converting to minimal DFA and then to RE) might actually work.
To avoid confusion it may be necessary to adhere to a convention like “a CTL contains the start state and can’t be exited; a co-CTL contains halting configurations and can’t be entered” – and potentially output the complement of the languages I’m outputting now.