# [Decider] Finite Automata Reduction

Finite Automata Reduction: principle

This program finds and verifies proofs of the form:

The following finite-state machine, by construction, recognizes all halting tape configurations for this TM. It doesn’t recognize the initial configuration. Therefore the TM doesn’t halt from its initial configuration.

Thus, it deals in generic co-CTL arguments (“co-CTL” being the complement of a CTL, characterized by a time-reversed version of the latter’s closure criteria).
Any CTL or co-CTL solution to a TM is representable in this decider’s format. (There may be translation overhead.)
The README explains in more detail how the automata, CTL, and co-CTL perspectives are connected.

For complete TM proofs, I impose a specific state-machine architecture, such that the proof goes through as soon as the invariants are checked.

Fortunately, the decider includes an efficient way to reconstruct proofs from just a DFA describing how a language – either CTL or co-CTL – handles one side of the tape.
This is also the heart of the program’s first prover, which incrementally builds a DFA and checks whether it can work in a proof.
A second prover prescribes a more restrictive form of FA (also fully general with translation overhead), such that the search problem translates well to a boolean satisfiability (SAT) problem that we can delegate to an excellent incremental solver.
Because all proofs are checked, the algorithms can get complex without raising correctness concerns about the decider.

Decider examples and counterexamples

From a zoology perspective, this decider’s coverage is cross-cutting.
It doesn’t particularly care what the machines do, just what they won’t do (set themselves up to halt).
For grins, we can ask it to search the entire Skelet list at low depth (abbreviating `--limit` and `--ad-hoc` on the command line):

``````\$ target/release/decider-finite-automata-reduction -l 6 -l 0 -a 68329601 -a 55767995 -a 5950405 -a 6897876 -a 60581745 -a 58211439 -a 7196989 -a 7728246 -a 12554268 -a 3810716 -a 3810169 -a 4982511 -a 7566785 -a 31357173 -a 2204428 -a 20569060 -a 1365166 -a 15439451 -a 14536286 -a 347505 -a 9980689 -a 45615747 -a 6237150 -a 60658955 -a 47260245 -a 13134219 -a 7163434 -a 5657318 -a 6626162 -a 4986661 -a 56967673 -a 6957734 -a 11896833 -a 11896832 -a 11896831 -a 13609549 -a 7512832 -a 35771936 -a 9914965 -a 3841616 -a 5915217 -a 57874080 -a 5878998
``````

On a laptop and single-threaded, this takes about 10 seconds to solve 8/43 of them:

This command also outputs JSON-formatted proofs.

• Counterexamples:

Skelet #34 in particular is conjectured to be immune to pure CTL solutions, due to the way it writes all bit patterns on both sides – and the lack of obvious other reasons it wouldn’t halt.

@Iijil reported success on 7046428 using the Bruteforce-CTL decider. My decider would need an unreasonable amount of time to solve this.

Decider code

Decider tests

The unit tests are in-line, and the `src/core/` files responsible for proof-checking are extensively tested.

Results

In progress – number of machines solved depends on one’s patience.

If you have only 17 minutes (and one thread on a laptop) to spare, the `direct` solver with a depth limit of `5` will bring today’s undecided index (~1.5 million) down to just 81494 unsolved.

The deeper search’s status as of this edit is: 1529124 verifiably solved, 9500 unsolved.

When ready, provide the results of your decider under the form of a file containing the IDs of the machines it decided in the seed database.

Database subset of application

index revision 645dc9f with 1538624 machines.

Decider correctness

2 Likes

Here are point-in-time index and dvf files, if anyone wants to cross-reference or try any independent verification.

At this time, the decider had solved 1526738 of the 1538624 machines.
There are 11868 to go. (That’s not a math error; it just kept solving while I realized I’d need to mess with an external file-sharing service. )

1 Like

Judging by click counts, the community doesn’t need many of these, but here’s a checkpoint.
The dropbox links should be updated in-place, with 1529124 verifiably solved and 9500 unsolved.

The SAT-based prover is hitting more of a performance wall than I expected (based on smaller samples and Mateon1’s successes). I’m looking for a natural time to stop the program, add at least `direct-9` on the smaller unsolved set, and call the solved-count final… at least until we know what new decider(s) shall cover the hard parts of BB(5) and whether slow FA-reduction can still fill any gaps.

BTW, I’ve pushed README updates which describe the `direct` algorithm in more detail, and I’m making cosmetic edits to the OP in this thread.

1 Like

I totally collided with a Decider Verification File “Decider Type” value TonyG was using already. Incremented it, and updated the proven machines (1531525 solved, 7099 unsolved).
For the record, Halting Segment (iijil) has 88 machines outside this solved-set. TonyG’s has those 88 plus 173 more. Mateon1’s cps_8_16 file has 4170 outside this set. Of course this decider gets tons of stuff those don’t, and its deficits are dropping, but not to 0.

1 Like

OK, turns out TonyG tried a while ago to coordinate a DeciderType with me, but the email got eaten. So I’ve pushed another update (now at 1531683 solved, 6941 unsolved).

1 Like

Update #1:
PR for derivation / correctness proof is here.

Update #2:
I rebased to the post-halting-segment set of 535816 (available here).
I’ve optimized the pipeline of proof searches, and done a reproducer run to clarify the cost-benefit tradeoffs of reproducing a given search depth and applying the results.

Progress indicator (NOT FOR REPRODUCTION, EVEN THE COUNTS, JUST FYI)
530852+ 4964 ████████████████████████████████████████████████████████████████████████████████████████████████████████░ All Deciders: 2d 16:49:59
535816/ 535816 ~00:00:00 ████████████████████████████████████████████████████████████████████████████████████████████████████████████ direct-1 : 00:00:00
482842/ 482842 ~00:00:00 ████████████████████████████████████████████████████████████████████████████████████████████████████████████ direct-2 : 00:00:00
411355/ 411355 ~00:00:00 ████████████████████████████████████████████████████████████████████████████████████████████████████████████ direct-3 : 00:00:03
315080/ 315080 ~00:00:00 ████████████████████████████████████████████████████████████████████████████████████████████████████████████ direct-4 : 00:00:13
167339/ 167339 ~00:00:00 ████████████████████████████████████████████████████████████████████████████████████████████████████████████ direct-5 : 00:01:10
71221/ 71221 ~00:00:00 ████████████████████████████████████████████████████████████████████████████████████████████████████████████ direct-6 : 00:10:50
32744/ 32744 ~00:00:00 ████████████████████████████████████████████████████████████████████████████████████████████████████████████ mitm_dfa-8 : 00:20:00
21059/ 21059 ~00:00:00 ████████████████████████████████████████████████████████████████████████████████████████████████████████████ mitm_dfa-9 : 00:56:32
15554/ 15554 ~00:00:00 ████████████████████████████████████████████████████████████████████████████████████████████████████████████ direct-7 : 00:45:05
12369/ 12369 ~00:00:00 ████████████████████████████████████████████████████████████████████████████████████████████████████████████ mitm_dfa-10 : 02:29:52
9468/ 9468 ~00:00:00 ████████████████████████████████████████████████████████████████████████████████████████████████████████████ mitm_dfa-11 : 09:13:37
7360/ 7360 ~00:00:00 ████████████████████████████████████████████████████████████████████████████████████████████████████████████ direct-8 : 08:45:49
6058/ 6058 ~00:00:00 █████████████████████████████████████████████████████████████████████████████████████████████████████████ mitm_dfa-12 : 1d 14:50:22
119/ 4991 ~7d 07:11:19 ██░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ direct-9 : 03:19:34^C

This was with a 14-core “cluster” of laptops + 1 PC + 1 paperweight, aged 11y to 1y. I cancelled `direct-9.. I don't think we'll want to run `direct-9` on 4991 TMs anyway: better to pre-filter.

The below counts are exact and exactly reproducible by running the same pipeline:

pass found Σfound left
direct-1 57644 57644 478172
direct-2 75823 133467 402349
direct-3 102241 235708 300108
direct-4 140760 376468 159348
direct-5 88779 465247 70569
direct-6 37931 503178 32638
mitm_dfa-8 11584 514762 21054
mitm_dfa-9 5505 520267 15549
direct-7 3182 523449 12367
mitm_dfa-10 2900 526349 9467
mitm_dfa-11 2111 528460 7356
direct-8 1299 529759 6057
mitm_dfa-12 1066 530825 4991
direct-9 759 531584 4232

Individual stage results and combined repro.{index,dvf} files are available here.

1 Like

I have put your 4232 hold-out machines through my Deciders, and got the number down to 694. See bbchallenge/Holdouts at main · TonyGuil/bbchallenge · GitHub for details, including Decider Verification Files.

My broader segment-matching criterion for the HaltingSegments Decider fails to find any new non-halting machines here. So if @UncombedCoconut’s results are correct, there is no need for it.

1 Like

Great news, thank you!

Seven more Bouncers found with improved Decider code. So the number of holdouts is now down to 687. See my github Holdouts page for details.

1 Like

I have written a C++ Verifier for UncombedCoconut’s binary Decider Verification File format. It is available at bbchallenge/NFA at main · TonyGuil/bbchallenge · GitHub. It verifies all 531,584 machines in 30 seconds.

1 Like

My C++ Decider and Verifier are now available on https://github.com/TonyGuil/bbchallenge/tree/main/FAR. Basically, all the Decider does is reproduce Justin’s results for smaller values of `n` (up to `n=6`, say). It can generate DFA-only and DFA-with-NFA output. The Verifier can handle both these formats, for any value of `n`.

1 Like

Thanks very much, Tony - I’m excited.

Thanks again, Tony, I’m getting around to trying your version.
In the near future I will also ask for a favor, namely implementing the heuristic I wrote about on Discord – it can wait, if you like, til I’ve committed the same to my repo and the pseudocode/results subsections of the bbchallenge-proofs write-up.
This will most likely get you reproducing up to n=7 comfortably.

Intended next steps:

Depending on how beloved or otherwise the writeup is, we might have the ingredients very soon to apply the results of up to “direct-7” (reproduced by Tony’s program) and “mitm_dfa-11” (reproduced by djmati1111’s program). I think that that will come to 528254 decided and 7547 to go, relative to the Dec. 25 undecided index.

The maximum potential of this decider family is greater – my proofs file (index/DVF) has 531669 machines in the current index, leaving 4132. It takes high and costly parameters (or knowing what you’re looking for) to find these, so most likely we finalize the considerable progress on CPS and Bouncers before returning to these. It’s just good to know what we’ve got in reserve.

For now I will try to get the new heuristic committed/documented properly, then refresh the original post - updating a few minor points, the command to run, and the timing/result details.

Hi Justin, sorry about that – the header files are up now. I have two questions:

As I understand it, your new heuristic (unlike the original version) doesn’t find all machines, only about 99% of them. Is this right? And does that 99% figure stay more or less the same as you increase the depth? I would certainly like to try it myself, but I’m not sure I have all the info – is your Discord description (“if the first k states of the n-state DFA aren’t sinks, that k <= 2^floor(log2(n))”) all there is to it? I know you posted your Rust code, but I am not fluent in that direction.

Secondly: why is it important to reproduce the search process for direct-7 and higher? Surely it’s enough to reproduce the process of verifying your dvf’s?

Oh, and thirdly: my GPU idea was a washout – any time the execution flow depends on the input data, the threads desynchronize and you lose all the benefit. Oh well, I learned a lot.

Hi, thanks!

Right - it’s an assumption about which searches are futile to continue, which is false but has two redeeming qualities: (1) it can’t be wrong enough to block a solution entirely, only at fixed depths, (2) statistically, it saves far more time than it wastes by making you re-search machines.

I doubt it, but it was a vast majority at all depths up to 9 (100% for power-of-2 depths).

Pretty much! For state `i` to be a “sink state” means `DFA[i][0] == i && DFA[i][1] == i`. (Consequence: if `DFA[i][0]` is known and `DFA[i][1]` is under construction, you can’t be sure `i` is a sink but you might be sure it isn’t.)
The Rust code computes this power-of-2 bound by extracting the high bit of `n`, tracks the run of non-sink-states starting at `i=0`, and rejects the DFA (same as if `ExtendNFA` returned false) if the count exceeds the power-of-2 bound. For fear of off-by-one errors I confirm that “the count” is e.g. 1 if the run is from `i=0` to `i=0` inclusive.

This same debate arose in HS acceptance discussions. To summarize: this thread and sub-forum is the place to debate that. I think everyone’s trying to set fair and non-dogmatic rules, just with varying ideas about what balance to strike. I’ll say this, though: when it’s practical, it’s very clean for the BB challenge’s ultimate work product to say: “we followed the method described here, deciding 1,234,567 of the remaining machines, and so can you, and nothing more needs to be said about those machines.”
I can’t find the quote right now, but I think even cosmo was open to shifting to a proof-data approach – but for a small machine population.

Drat. I wasn’t really expecting it to work, but you’d already beaten my expectations, so I was hoping.

@TonyG – I see why `n <= 6` was the expectation you wanted to set.
But the code is quite reasonable - I suspect performance differences are mainly about abstraction penalties the compiler couldn’t eliminate in BoolAlgebra. (For that matter, `-march=native -flto` compiler flags seem to help, ~10% faster here - and it’s cheating if my program can do it and yours can’t. )

I want to be careful with code feedback on an independent reproducer.
I’m very sure this much is OK, giving it here for transparency though:

1. It improves compiler compatibility if we use the `&` operator on `FiniteAutomataReduction::ThreadFunction` in `DecideFAR.cpp`, as for the sibling programs.
2. Any flat array of dimension `MaxDFA_States` or `MaxDFA_States + 1` (`RStack`, `aStack`, `m`) should be double-checked – the question literally being, “am I indexing this by up to double `MaxDFA_States`?”

Best,
Justin

I’ve implemented your two suggestions. There were indeed array indexing bugs. Thanks for the feedback! BTW, I have left this topic for now, and am thinking about Bells.

Fair enough, thanks for the work done, once this decider submission is on solid footing I will turn to a bouncers-repro. (We shall see if Uni’s ideas for a simpler inductive prover pan out or not - I’ll give it some space.)

Refreshed decider post: (I can’t edit the OP)

Finite Automata Reduction: principle

This program finds and verifies proofs of the form:

The following finite-state machine, by construction, recognizes all halting tape configurations for this TM.
It doesn’t recognize the initial configuration. Therefore, the TM doesn’t halt from its initial configuration.

The decider correctness paper, under construction here, derives its algorithms from this requirement.

It’s also fair to call it a general co-CTL decider, where the “co-” means complement.
(The complement of a CTL is characterized by closure under time-reversed TM steps. Regardless, the goal is to tell the start config apart from halting ones.)

The basis of its proof format is a specific state-machine architecture.
If the required conditions hold, the state machine works as a proof.
It’s known that all possible (co-)CTL solutions are expressible in this way, though there may be translation overhead.
Moreover, the full automaton is efficiently constructible from the DFA it uses to scan one side of the tape.
This has three good consequences:

1. The DFA serves as verification data, in a compact and easily parsed format. @TonyG has helpfully implemented a verifier in an independent code base.
2. We get a powerful decider by searching the space of DFAs for one that works. That’s implemented in my code (`--prover=direct`) and also Tony’s.
3. A second decider, built on a state-of-the-art SAT solver (powered by cats), is automatically sound if its DFAs check out.

This second prover actually looks for for meet-in-the-middle DFA solutions—which translate into the above framework but are best understood on their own terms.
Historically, that one was developed earlier. The first public SAT-based implementation is this one by `@djmati1111` on Discord.
@Mateon1 also shared many insights, discoveries, results, and a public but not freestanding implementation.
Huge thanks and credit to the above authors!

The methods coexist because `direct` is far superior at lower search depths, whereas `mitm_dfa` scales far better at high search depths.
(When feasible, `direct-n` covers `mitm_dfa-n` in scope.)

Decider examples and counterexamples

From a zoology perspective, this decider’s coverage is cross-cutting.
It doesn’t particularly care what the machines do, just what they won’t do (set themselves up to halt).
For grins, we can ask it to search the entire Skelet list single-threaded at low depth (abbreviating `--limit` and `--ad-hoc` on the command line):

``````\$ target/release/decider-finite-automata-reduction -l 6 -l 0 -a 68329601 -a 55767995 -a 5950405 -a 6897876 -a 60581745 -a 58211439 -a 7196989 -a 7728246 -a 12554268 -a 3810716 -a 3810169 -a 4982511 -a 7566785 -a 31357173 -a 2204428 -a 20569060 -a 1365166 -a 15439451 -a 14536286 -a 347505 -a 9980689 -a 45615747 -a 6237150 -a 60658955 -a 47260245 -a 13134219 -a 7163434 -a 5657318 -a 6626162 -a 4986661 -a 56967673 -a 6957734 -a 11896833 -a 11896832 -a 11896831 -a 13609549 -a 7512832 -a 35771936 -a 9914965 -a 3841616 -a 5915217 -a 57874080 -a 5878998
``````

this takes about 10 seconds to solve 8/43 of them and output JSON-formatted proofs:

Despite the halting-problem obstacles, proving any specific 5-state TM immune to CTL proofs is an open question.

The yet-unsolved Skelet machines are likely contenders.
Above 5 states, TMs may for instance manipulate pairs of unary numbers, either growing both or halting based on a comparison.
This would allow a pumping lemma proof that regular languages can’t tell reachable configs apart from halting ones.

Sometimes, CTL proofs exist but are only visible to my program at absurd search depths.
@Iijil reports that Bruteforce-CTL solved 7046428, whereas `direct-10` and `mitm_dfa-16` (~1 thread-hour each) can’t.
Similarly, the entire Halting Segment and CPS methods are covered in theory, but my program couldn’t replace them practice. (In CPS’s case, the idea is laughable.)

Decider code

Decider tests

The unit tests are in-line, and the `src/core/` files responsible for proof-checking are extensively tested.

Results

The maximum potential is unknown, and not up for submission today—but I’m sharing a live proofs file (index/DVF).
This would leave at most 2958 undecided.
(Note: to verify this DVF with Tony’s program now requires raising `MaxDFA_States` in `FAR.h`.)

I propose a low-depth proof of concept first, using both decider modes.
Request: can an independent operator please confirm they can match my results in this this Results Reproduction thread?
Upshot: `direct` up to 6 plus `mitm_dfa` up to 9 solve the attached 520258, leaving 15543.
FAR_direct6_mitm9.index.zst (1.1 MB)
(This is nothing! But it’s a good start: it makes sense to switch modes at these depths even if we’re going deeper. It also clarifies how deep we can realistically go using these 3 programs.)

Database subset of application

index revision 758324e with 535801 machines.

Decider correctness

Drafted and undergoing collaborative editing here.