Here are point-in-time index and dvf files, if anyone wants to cross-reference or try any independent verification.
At this time, the decider had solved 1526738 of the 1538624 machines.
There are 11868 to go. (That’s not a math error; it just kept solving while I realized I’d need to mess with an external file-sharing service. )
Judging by click counts, the community doesn’t need many of these, but here’s a checkpoint.
The dropbox links should be updated in-place, with 1529124 verifiably solved and 9500 unsolved.
The SAT-based prover is hitting more of a performance wall than I expected (based on smaller samples and Mateon1’s successes). I’m looking for a natural time to stop the program, add at least direct-9 on the smaller unsolved set, and call the solved-count final… at least until we know what new decider(s) shall cover the hard parts of BB(5) and whether slow FA-reduction can still fill any gaps.
BTW, I’ve pushed README updates which describe the direct algorithm in more detail, and I’m making cosmetic edits to the OP in this thread.
I totally collided with a Decider Verification File “Decider Type” value TonyG was using already. Incremented it, and updated the proven machines (1531525 solved, 7099 unsolved).
For the record, Halting Segment (iijil) has 88 machines outside this solved-set. TonyG’s has those 88 plus 173 more. Mateon1’s cps_8_16 file has 4170 outside this set. Of course this decider gets tons of stuff those don’t, and its deficits are dropping, but not to 0.
OK, turns out TonyG tried a while ago to coordinate a DeciderType with me, but the email got eaten. So I’ve pushed another update (now at 1531683 solved, 6941 unsolved).
Update #1:
PR for derivation / correctness proof is here.
Update #2:
I rebased to the post-halting-segment set of 535816 (available here).
I’ve optimized the pipeline of proof searches, and done a reproducer run to clarify the cost-benefit tradeoffs of reproducing a given search depth and applying the results.
This was with a 14-core “cluster” of laptops + 1 PC + 1 paperweight, aged 11y to 1y. I cancelled direct-9.. I don't think we'll want to run direct-9` on 4991 TMs anyway: better to pre-filter.
The below counts are exact and exactly reproducible by running the same pipeline:
pass
found
Σfound
left
direct-1
57644
57644
478172
direct-2
75823
133467
402349
direct-3
102241
235708
300108
direct-4
140760
376468
159348
direct-5
88779
465247
70569
direct-6
37931
503178
32638
mitm_dfa-8
11584
514762
21054
mitm_dfa-9
5505
520267
15549
direct-7
3182
523449
12367
mitm_dfa-10
2900
526349
9467
mitm_dfa-11
2111
528460
7356
direct-8
1299
529759
6057
mitm_dfa-12
1066
530825
4991
direct-9
759
531584
4232
Individual stage results and combined repro.{index,dvf} files are available here.
My C++ Decider and Verifier are now available on https://github.com/TonyGuil/bbchallenge/tree/main/FAR. Basically, all the Decider does is reproduce Justin’s results for smaller values of n (up to n=6, say). It can generate DFA-only and DFA-with-NFA output. The Verifier can handle both these formats, for any value of n.
Thanks again, Tony, I’m getting around to trying your version.
Could you please double-check if the FAR.h header was successfully committed?
In the near future I will also ask for a favor, namely implementing the heuristic I wrote about on Discord – it can wait, if you like, til I’ve committed the same to my repo and the pseudocode/results subsections of the bbchallenge-proofs write-up.
This will most likely get you reproducing up to n=7 comfortably.
Intended next steps:
Depending on how beloved or otherwise the writeup is, we might have the ingredients very soon to apply the results of up to “direct-7” (reproduced by Tony’s program) and “mitm_dfa-11” (reproduced by djmati1111’s program). I think that that will come to 528254 decided and 7547 to go, relative to the Dec. 25 undecided index.
The maximum potential of this decider family is greater – my proofs file (index/DVF) has 531669 machines in the current index, leaving 4132. It takes high and costly parameters (or knowing what you’re looking for) to find these, so most likely we finalize the considerable progress on CPS and Bouncers before returning to these. It’s just good to know what we’ve got in reserve.
For now I will try to get the new heuristic committed/documented properly, then refresh the original post - updating a few minor points, the command to run, and the timing/result details.
Hi Justin, sorry about that – the header files are up now. I have two questions:
As I understand it, your new heuristic (unlike the original version) doesn’t find all machines, only about 99% of them. Is this right? And does that 99% figure stay more or less the same as you increase the depth? I would certainly like to try it myself, but I’m not sure I have all the info – is your Discord description (“if the first k states of the n-state DFA aren’t sinks, that k <= 2^floor(log2(n))”) all there is to it? I know you posted your Rust code, but I am not fluent in that direction.
Secondly: why is it important to reproduce the search process for direct-7 and higher? Surely it’s enough to reproduce the process of verifying your dvf’s?
Oh, and thirdly: my GPU idea was a washout – any time the execution flow depends on the input data, the threads desynchronize and you lose all the benefit. Oh well, I learned a lot.
Right - it’s an assumption about which searches are futile to continue, which is false but has two redeeming qualities: (1) it can’t be wrong enough to block a solution entirely, only at fixed depths, (2) statistically, it saves far more time than it wastes by making you re-search machines.
I doubt it, but it was a vast majority at all depths up to 9 (100% for power-of-2 depths).
Pretty much! For state i to be a “sink state” means DFA[i][0] == i && DFA[i][1] == i. (Consequence: if DFA[i][0] is known and DFA[i][1] is under construction, you can’t be sure i is a sink but you might be sure it isn’t.)
The Rust code computes this power-of-2 bound by extracting the high bit of n, tracks the run of non-sink-states starting at i=0, and rejects the DFA (same as if ExtendNFA returned false) if the count exceeds the power-of-2 bound. For fear of off-by-one errors I confirm that “the count” is e.g. 1 if the run is from i=0 to i=0 inclusive.
This same debate arose in HS acceptance discussions. To summarize: this thread and sub-forum is the place to debate that. I think everyone’s trying to set fair and non-dogmatic rules, just with varying ideas about what balance to strike. I’ll say this, though: when it’s practical, it’s very clean for the BB challenge’s ultimate work product to say: “we followed the method described here, deciding 1,234,567 of the remaining machines, and so can you, and nothing more needs to be said about those machines.”
I can’t find the quote right now, but I think even cosmo was open to shifting to a proof-data approach – but for a small machine population.
Drat. I wasn’t really expecting it to work, but you’d already beaten my expectations, so I was hoping.
@TonyG – I see why n <= 6 was the expectation you wanted to set.
But the code is quite reasonable - I suspect performance differences are mainly about abstraction penalties the compiler couldn’t eliminate in BoolAlgebra. (For that matter, -march=native -flto compiler flags seem to help, ~10% faster here - and it’s cheating if my program can do it and yours can’t. )
I want to be careful with code feedback on an independent reproducer.
I’m very sure this much is OK, giving it here for transparency though:
It improves compiler compatibility if we use the & operator on FiniteAutomataReduction::ThreadFunction in DecideFAR.cpp, as for the sibling programs.
Any flat array of dimension MaxDFA_States or MaxDFA_States + 1 (RStack, aStack, m) should be double-checked – the question literally being, “am I indexing this by up to double MaxDFA_States?”
I’ve implemented your two suggestions. There were indeed array indexing bugs. Thanks for the feedback! BTW, I have left this topic for now, and am thinking about Bells.
Fair enough, thanks for the work done, once this decider submission is on solid footing I will turn to a bouncers-repro. (We shall see if Uni’s ideas for a simpler inductive prover pan out or not - I’ll give it some space.)
Yes. For that matter, my program also searches DFAs with exactly 6 states on the “direct-6” iteration.
It’s just a matter of what we encourage in our respective CLIs. I make it so you’re always asking for a range of “direct” or “mitm_dfa” runs, so just “direct-6” looks like “-p direct -x 5 -l 6”. It’s more convenient to do “direct” from 1 to 6, which is hopefully what the user wants anyway. Tony’s approach is simpler, so good in a different way.