Here are point-in-time index and dvf files, if anyone wants to cross-reference or try any independent verification.
At this time, the decider had solved 1526738 of the 1538624 machines.
There are 11868 to go. (Thatβs not a math error; it just kept solving while I realized Iβd need to mess with an external file-sharing service. )
Judging by click counts, the community doesnβt need many of these, but hereβs a checkpoint.
The dropbox links should be updated in-place, with 1529124 verifiably solved and 9500 unsolved.
The SAT-based prover is hitting more of a performance wall than I expected (based on smaller samples and Mateon1βs successes). Iβm looking for a natural time to stop the program, add at least direct-9 on the smaller unsolved set, and call the solved-count finalβ¦ at least until we know what new decider(s) shall cover the hard parts of BB(5) and whether slow FA-reduction can still fill any gaps.
BTW, Iβve pushed README updates which describe the direct algorithm in more detail, and Iβm making cosmetic edits to the OP in this thread.
I totally collided with a Decider Verification File βDecider Typeβ value TonyG was using already. Incremented it, and updated the proven machines (1531525 solved, 7099 unsolved).
For the record, Halting Segment (iijil) has 88 machines outside this solved-set. TonyGβs has those 88 plus 173 more. Mateon1βs cps_8_16 file has 4170 outside this set. Of course this decider gets tons of stuff those donβt, and its deficits are dropping, but not to 0.
OK, turns out TonyG tried a while ago to coordinate a DeciderType with me, but the email got eaten. So Iβve pushed another update (now at 1531683 solved, 6941 unsolved).
Update #1:
PR for derivation / correctness proof is here.
Update #2:
I rebased to the post-halting-segment set of 535816 (available here).
Iβve optimized the pipeline of proof searches, and done a reproducer run to clarify the cost-benefit tradeoffs of reproducing a given search depth and applying the results.
This was with a 14-core βclusterβ of laptops + 1 PC + 1 paperweight, aged 11y to 1y. I cancelled direct-9.. I don't think we'll want to run direct-9` on 4991 TMs anyway: better to pre-filter.
The below counts are exact and exactly reproducible by running the same pipeline:
pass
found
Ξ£found
left
direct-1
57644
57644
478172
direct-2
75823
133467
402349
direct-3
102241
235708
300108
direct-4
140760
376468
159348
direct-5
88779
465247
70569
direct-6
37931
503178
32638
mitm_dfa-8
11584
514762
21054
mitm_dfa-9
5505
520267
15549
direct-7
3182
523449
12367
mitm_dfa-10
2900
526349
9467
mitm_dfa-11
2111
528460
7356
direct-8
1299
529759
6057
mitm_dfa-12
1066
530825
4991
direct-9
759
531584
4232
Individual stage results and combined repro.{index,dvf} files are available here.
My C++ Decider and Verifier are now available on https://github.com/TonyGuil/bbchallenge/tree/main/FAR. Basically, all the Decider does is reproduce Justinβs results for smaller values of n (up to n=6, say). It can generate DFA-only and DFA-with-NFA output. The Verifier can handle both these formats, for any value of n.
Thanks again, Tony, Iβm getting around to trying your version.
Could you please double-check if the FAR.h header was successfully committed?
In the near future I will also ask for a favor, namely implementing the heuristic I wrote about on Discord β it can wait, if you like, til Iβve committed the same to my repo and the pseudocode/results subsections of the bbchallenge-proofs write-up.
This will most likely get you reproducing up to n=7 comfortably.
Intended next steps:
Depending on how beloved or otherwise the writeup is, we might have the ingredients very soon to apply the results of up to βdirect-7β (reproduced by Tonyβs program) and βmitm_dfa-11β (reproduced by djmati1111βs program). I think that that will come to 528254 decided and 7547 to go, relative to the Dec. 25 undecided index.
The maximum potential of this decider family is greater β my proofs file (index/DVF) has 531669 machines in the current index, leaving 4132. It takes high and costly parameters (or knowing what youβre looking for) to find these, so most likely we finalize the considerable progress on CPS and Bouncers before returning to these. Itβs just good to know what weβve got in reserve.
For now I will try to get the new heuristic committed/documented properly, then refresh the original post - updating a few minor points, the command to run, and the timing/result details.
Hi Justin, sorry about that β the header files are up now. I have two questions:
As I understand it, your new heuristic (unlike the original version) doesnβt find all machines, only about 99% of them. Is this right? And does that 99% figure stay more or less the same as you increase the depth? I would certainly like to try it myself, but Iβm not sure I have all the info β is your Discord description (βif the first k states of the n-state DFA arenβt sinks, that k <= 2^floor(log2(n))β) all there is to it? I know you posted your Rust code, but I am not fluent in that direction.
Secondly: why is it important to reproduce the search process for direct-7 and higher? Surely itβs enough to reproduce the process of verifying your dvfβs?
Oh, and thirdly: my GPU idea was a washout β any time the execution flow depends on the input data, the threads desynchronize and you lose all the benefit. Oh well, I learned a lot.
Right - itβs an assumption about which searches are futile to continue, which is false but has two redeeming qualities: (1) it canβt be wrong enough to block a solution entirely, only at fixed depths, (2) statistically, it saves far more time than it wastes by making you re-search machines.
I doubt it, but it was a vast majority at all depths up to 9 (100% for power-of-2 depths).
Pretty much! For state i to be a βsink stateβ means DFA[i][0] == i && DFA[i][1] == i. (Consequence: if DFA[i][0] is known and DFA[i][1] is under construction, you canβt be sure i is a sink but you might be sure it isnβt.)
The Rust code computes this power-of-2 bound by extracting the high bit of n, tracks the run of non-sink-states starting at i=0, and rejects the DFA (same as if ExtendNFA returned false) if the count exceeds the power-of-2 bound. For fear of off-by-one errors I confirm that βthe countβ is e.g. 1 if the run is from i=0 to i=0 inclusive.
This same debate arose in HS acceptance discussions. To summarize: this thread and sub-forum is the place to debate that. I think everyoneβs trying to set fair and non-dogmatic rules, just with varying ideas about what balance to strike. Iβll say this, though: when itβs practical, itβs very clean for the BB challengeβs ultimate work product to say: βwe followed the method described here, deciding 1,234,567 of the remaining machines, and so can you, and nothing more needs to be said about those machines.β
I canβt find the quote right now, but I think even cosmo was open to shifting to a proof-data approach β but for a small machine population.
Drat. I wasnβt really expecting it to work, but youβd already beaten my expectations, so I was hoping.
@TonyG β I see why n <= 6 was the expectation you wanted to set.
But the code is quite reasonable - I suspect performance differences are mainly about abstraction penalties the compiler couldnβt eliminate in BoolAlgebra. (For that matter, -march=native -flto compiler flags seem to help, ~10% faster here - and itβs cheating if my program can do it and yours canβt. )
I want to be careful with code feedback on an independent reproducer.
Iβm very sure this much is OK, giving it here for transparency though:
It improves compiler compatibility if we use the & operator on FiniteAutomataReduction::ThreadFunction in DecideFAR.cpp, as for the sibling programs.
Any flat array of dimension MaxDFA_States or MaxDFA_States + 1 (RStack, aStack, m) should be double-checked β the question literally being, βam I indexing this by up to double MaxDFA_States?β
Iβve implemented your two suggestions. There were indeed array indexing bugs. Thanks for the feedback! BTW, I have left this topic for now, and am thinking about Bells.
Fair enough, thanks for the work done, once this decider submission is on solid footing I will turn to a bouncers-repro. (We shall see if Uniβs ideas for a simpler inductive prover pan out or not - Iβll give it some space.)
Yes. For that matter, my program also searches DFAs with exactly 6 states on the βdirect-6β iteration.
Itβs just a matter of what we encourage in our respective CLIs. I make it so youβre always asking for a range of βdirectβ or βmitm_dfaβ runs, so just βdirect-6β looks like β-p direct -x 5 -l 6β. Itβs more convenient to do βdirectβ from 1 to 6, which is hopefully what the user wants anyway. Tonyβs approach is simpler, so good in a different way.