[Decider] Backward reasoning

Update: @atticuscull explained that my second critique was wrong, we are just backtracking forever while there are still possible previous states (but if there are no possible previous states to all states that could reach the halting state, then the machine runs forever)

Thank to both of you for looking at this in more details.

I need to wrap my head around it soon.

Here is an argument of why the algorithm works. This has in mind the go implementation, as opposed to the above python, seeing as the go code is more relevant. Not much is changed besides that a stack is used instead of a queue, making it depth first, and some of the variable names have changed. Let me know if this is not what you asked for; I’m happy to try and shorten it, or fill in other details that I breezed over.

The algorithm works backwards from all halting states to see if there is no possible way to reach a halting state. Abstractly, we are exploring a digraph where each node is a local configuration, that is a small section of the tape where no assumptions are made about the rest of the tape (put another way, each node represents all possible extensions of the local section of the tape). This differs from the usual notation wherein the rest of the unseen tape is assumed to be zeroes. The goal of the algorithm is to show that this digraph of predecessor configurations is self-contained finite by exploring it entirely. Technically we also need to show that the digraph does not contain a local configuration of state A and only zeroes on the tape, however I do not imagine that we will run the decider for 47 million steps on every undecided machine, and we already know that the digraph doesn’t contain a local configuration of the starting state if it has size less than 47 million.

Concretely, the algorithm implements a depth first search on the digraph (limited by a search depth). stack contains the current leaves of our search. seenStates stores the configurations we’ve seen already so that we don’t repeat unnecessary computation. That is, once we’ve found all of the predecessors of a configuration, we don’t need to look at it again.

To start the search, we add all of the local configurations consisting of a state s along with a single cell of exposed tape with bit b, where (s b) is an undefined transition. Next, as long as there are possible local configurations that we haven’t checked yet, we take the next one from the stack, and find all of the predecessor configurations. We do that by looking at all state + bit combinations (s’, b’) that transition to the current state s, and for each one checking that the place where that transition moved from (i.e. if (s’ b’) moves right then we look at the cell to the left of the current head), has the same bit as what the transition (s’ b’) writes. If the current head is at the far left (respectively right) and (s’ b’) moves right (respectively left) then the local configuration expands in scope; this is always acceptable as we made no assumptions about the tape outside of the scope of the local configuration.

For example, if we are in state A with tape 000[1]101, then we want any transition rules that transition to A and either
1. move right and write 0, or
2. move left and write 1
Then, for each pair (s’ b’) satisfying this, we add
1. s’: 00[b’]1101 or
2. s’: 0001[b]01
to stack respectively, if it is not in seenStates.

If this process terminates in 47 million steps or fewer, we know that the halting states are in a different connected component in the configuration graph than the starting configuration (A with all 0s), and so it is not possible for the machine to halt.

2 Likes

Thanks to the work of @modderme123 and @atticuscull, the decider was debugged and successfully applied!

It decided 2,243,340 machines :partying_face:

The index file of undecided machines has been updated: https://github.com/bbchallenge/bbchallenge-undecided-index

A new bug has been found and hopefully fixed: Debuggin Shawn's bug by tcosmo · Pull Request #7 · bbchallenge/bbchallenge-deciders · GitHub

Don’t hesitate to review the code!

Hence this voids the results claimed in the previous message, the new result is 2,035,610 machines decided!

The index file of undecided machines has been updated: https://github.com/bbchallenge/bbchallenge-undecided-index

I have been getting into this project by attempting to reproduce existing results. I have succeeded with Cyclers and Translated Cyclers (in fact I found several hundred new Translated Cyclers at runtimes of up to 3,037,316 steps, but that is for another post). But I have been unable to reproduce your Backward Reasoning results. Specifically, I have been unable to use Backward Reasoning to prove that the following machines don’t halt:

3102389, 4864409, 5367930, 5367934, 5725043, 5852433,
5852438, 6932470, 7998506, 7998514, 8660490, 9430801

So I was able to decide only 2,035,598 machines instead of 2,035,610. All the above machines are flagged in the database as having been decided by Backward Reasoning. But for each of them, I can exhibit an explicit starting configuration that halts after 500 steps. This surely means that they can’t be decided by Backward Reasoning at a depth less than 500?

For instance, if machine 4864409 starts in state A with tape 001100001101, then it runs for 500 steps before halting. You can check this for yourself at https://bbchallenge.org/4864409. I can supply starting configurations for the other eleven machines if you like.

Am I missing something obvious? Or has the database perhaps miscategorised these machines?

1 Like

Waw. Thank you very much, it seems that you have uncovered yet another bug in my implementation of [Decider] Backward reasoning. We are going to un-apply the results (:frowning: ), solve this bug and then require that the decider goes through the new verification process which was not in place at the time of this decider.

Thank you very much for this contribution.

Concerning Translated Cyclers, we are aware that there are a few thousands machine left in the DB when you run the decider with higher parameters (for instance see Skelet machines that are Translated Cyclers) but we did not apply the results officially because it concerned so few machines in proportion to the number of undecided machines.

Update: backward reasoning results have been un-applied on the offical github repository of the undecided index, we are back to 3,574,222 machines on https://bbchallenge.org/.

I wonder if it might have somehting to do with the detection of already-visited configs. I don’t do this in my code. What do you see if you disable this feature?

1 Like

Yes I believe you are right, in the function that transforms a configuration to a string, I forgot to put a separator between the tape, the state and the head position:

func (c ConfigurationAndDepth) toString() string {
	return c.tapeToString() + string(c.State) + strconv.Itoa(c.Head)
}

This makes an ambiguous encoding instead of 1-to-1… This feature was there only for optimization but it seems to be at the root of the bug (thanks to @Iijil for finding out the details).

EDIT: another, more important reason for the bug is that the function that encodes the tape does not include a marker for the origin:

func (c ConfigurationAndDepth) tapeToString() string {
	tapeString := ""

	// Assuming that Configurations are correctly
	// propagating the invariant that any pos
	// between min and max are well defined
	for pos := c.minTapePos; pos <= c.maxTapePos; pos += 1 {
		tapeString += string('0' + c.Tape[pos])
	}
	return tapeString
}

Hence another source of ambiguity…

EDIT2: The irony is that its been noted that this mechanism of configuration detection is useless in practice because a well formed backward reasoning tree cannot have any loops: there cant be a loop within the same branch by construction and there cant be a loop across different branches otherwise the same configuration would two different successors which is not possible (thanks @Iijil again for outlining this).

So the piece of buggy code was trying to solve a non-problem and should be safely removed in order to fix the bug.

My BackwardReasoning source code is now available on GitHub - TonyGuil/bbchallenge: Busy Beaver Challenge code and resultss, along with Deciders and Verifiers for Cyclers and TranslatedCyclers. I have several worries:

  1. Being new to github, I don’t even know if this repository is accessible by the general public. Please let me know if you can’t access it.
  2. githb won’t let me upload files bigger than 100Mb. So interested parties will have to generate various files (e.g. TranslatedCyclers.dvf) themselves.
  3. I use g++ for MinGW, which might already cause problems for many of you. But also I use the boost thread library, and I don’t know whether I am free to upload boost header files and libraries to hithub. What is common practice here?
1 Like

I have this great simplifying idea: that instead of trying to prove correctness of Deciders, the Deciders emit Verification Data for each machine that they decide. For many deciders, this Verification Data can be verified much more easily, making the acceptance process much more straightforward. For instance, it takes 12 hours to find all the new Translated Cyclers up to 4,000,000 steps, but only 0.4 seconds to verify the resulting 568 machines. See the README file GitHub - TonyGuil/bbchallenge: Busy Beaver Challenge code and resultss for more info.

My problem is: where do I post this, so that everybody sees it? This site is not easy to navigate! I did look at your Discord link, but the traffic is so high, anything posted there is likely to be lost in the noise.

1 Like

Thank you very much for sharing your work!

  1. Being new to github, I don’t even know if this repository is accessible by the general public. Please let me know if you can’t access it.

Yes, thank you we can access it well!

  1. githb won’t let me upload files bigger than 100Mb. So interested parties will have to generate various files (e.g. TranslatedCyclers.dvf) themselves.

Yes, we tend to avoid to store very large files on github as it would be inefficient for version control, for instance, we store the official files of decided indices here http://docs.bbchallenge.org/bb5_decided_indexes/.

  1. I use g++ for MinGW, which might already cause problems for many of you. But also I use the boost thread library, and I don’t know whether I am free to upload boost header files and libraries to hithub. What is common practice here?

Thank you for letting us know! Hopefully, we should be able to compile the code by downloading boost independently. We have little experience with cpp deciders because its not the language that has been the most commonly used around here so far (but we are language agnostic so any language is welcome) and I personally don’t know how cpp projects handles their boost dependency on github. I believe it should be ok as it.

I have this great simplifying idea: that instead of trying to prove correctness of Deciders, the Deciders emit Verification Data for each machine that they decide.

Yes! I really like this idea. If we push it to the extreme, a similar idea I had, was that deciders would output parameters for templated formal proofs written in Coq or Lean which are languages to verify mathematical proofs. That way, we would not even have to trust the code of the decider but just use Coq/Lean to verify the outputted proofs.

My problem is: where do I post this, so that everybody sees it? This site is not easy to navigate!

I am not sure I understand your question, what would you like to publish? Any file < 4Mb can be published on the forum (use the upload button), what is the size of your file?

I did look at your Discord link, but the traffic is so high, anything posted there is likely to be lost in the noise.

Yes, discord is better suited for live conversation and we use this forum for keeping records.

I appreciate your positive feedback very much. Thank you! Here is my idea in a bit more detail:

Many of the (present and future) Deciders expend a lot of time and energy in finding non-halting machines, but it would be a simple matter for them to output the results of their search in a way that is easily verifiable. This means that we don’t have to prove that the deciders do their job of finding every possible candidate; we just have to verify that their proposed candidates are valid. Then we can remove them from the list of undecided machines.

As an easy example, the Cyclers decider can output the number of steps to the initial match, and the number of steps to the final match. Then it is easy for a Verifier to take this data and verify that the machine in question really does generate the same configuration at those two points.

Only slightly more complicated is the TranslatedCyclers verification data: as well as the above, it must specify the tape head positions at the initial and final points, and the length of the matched tape.

I have implemented both of these Verifiers. You can find the details in the README files of the various sub-directories in my repository. As a vindication of my approach, I ran the TranslatedCyclers Decider up to 4,000,000 steps, and found 566 new Translated Cyclers. This took about twelve hours on my laptop. But verifying these 566 machines takes just 0.4 seconds. And anybody can write a Verifier – it’s easy! So there is no reason not to accept these 566 new Translated Cyclers as having been decided. The relevant dvf (Decider Verification File) is TC_4000000.dvf in the TranslatedCyclers sub-directory.

This idea is not applicable to the BackwardReasoning or HaltingSegment Deciders, at least not as far as I can see. But it will come in useful for the various Bouncers. It will mean that there is no reason to reject a Decider just because it decides less than 10% of the undecided machines. As long as the Verifier can be trusted, we don’t need to know anything at all about the Decider, so we can just accept anything that can be verified.

The reason I expressed uncertainty about where to post all this was that it seems too general to post under [Decider] Backward Reasoning, but I couldn’t find a more suitable discussion branch for it.

Thank you very much for the details on your idea. I think it is very promising and, in my opinion, especially once we can plug this verification data into formal Lean/Coq proofs to get an absolute certificate of non-halting for the machine.

At the moment, we have put in place a Deciders’ validation process which requires that a decider decides at least 10% of the currently undecided machines which is why, under this “rule”, we would not officially apply the new 566 Translated Cyclers or other Translated Cyclers found in Skelet machines that are Translated Cyclers because its such a tiny fraction of the millions of machines that are left. Note that we have been consistent with this rule since the 5,647 machines of Closed state/transition cluster were not applied because it does not go above the 10% threshold.

The incentive behind this rule is to focus on new deciders, hopefully conceptually simple, that will slash the number of machines by a lot at once.

I believe that we could shift to your idea (i.e. immediately considering that a machine is decided given its verification data) once we get somewhere under 100k machines left.

Long story short: I think your idea is very certainly the way to go once we have reached a “small” set of undecided machines (in my mind, around 100,000).

The reason I expressed uncertainty about where to post all this was that it seems too general to post under [Decider] Backward Reasoning, but I couldn’t find a more suitable discussion branch for it.

I believe that you could post it here if it is about Backward Reasoning or in [Decider] Translated cyclers if it is about translated cyclers :slight_smile:

Thank you again!

@TonyG I have debugged my implementation of backward reasoning and it decides the same 2,035,598 machines as you :partying_face:

Hence, we satisfy the requirements of [Debate & Vote] Deciders’ validation process and I have re-applied the decider officially, there remain 1,538,624 to decide!

When time permits I will check that my decider can output the exact same dvf as you did for extra safety.

Thank you again for this contribution!

That 's good to hear! If you find you can’t duplicate my dvf file, it probably just means that we’re counting the nodes differently, or something. But let me know either way.

1 Like

Hello! Considering the history of bugs in this decider, I’d like to report that my Coq development now includes Backward Reasoning, and reproduces the set of decided machines exactly. :tada:

4 Likes

Great news, thank you so much for your work!

I translated the proof’s pseudocode into C++, except I didn’t include the optimization with configurationsSeen. Here are the results I got for a few values of maxDepth:

  • maxDepth = 10 decided 2034251/3574222 machines.
  • maxDepth = 20 decided 2035586/3574222 machines.
  • maxDepth = 25 decided 2035595/3574222 machines.
  • maxDepth = 30 decided 2035598/3574222 machines. These were the same 2035598 machines in @cosmo’s post.
  • maxDepth = 40 and maxDepth = 50 didn’t decide any new machines.
1 Like

Thank you very much for this reproduction :slight_smile: !!