# How Close Are We to God's Algorithm?

First, I think a distinction must be made that I really hadn't thought about very much. It occurs to me that Tom's method or something akin to it might eventually be able to determine the diameter of the cube group in the face turn metric without actually enumerating the complete God's algorithm. Which is to say, Tom's method is a near-optimal solver for cosets. Generally speaking, it doesn't determine optional solutions and it doesn't determine solutions for individual positions. But that's ok for determining the diameter. If Tom's program could reduce the overall upper limit for the diameter down to some N and if at least one position could be found for which the optional solution required N moves, then the diameter would be proven to be N. And the best candidate we have for N right now is 20. So I think it's possible or even likely that the diameter problem will be solved before the full God's Algorithm problem will be solved. That possibility hadn't occurred to me until recently.

Second, how fast does does a full God's Algorithm program have to run? As a first approximation, let's say that the run time can be reduced to about a million computer years and that we desire to solve the problem in about a year. That means that a million volunteers on the Internet would have to be willing to run a cube solver for a year to achieve that desired result. Whether it would be possible to get that many volunteers is problematic, but let's proceed on the assumption it would be possible.

It is well known that the cube space of about 4.3E19 positions can be reduced by about 96 times using symmetry and antisymmetry. That reduces the search space to about 4.5E17 positions that would have to be solved optimally. Each of our 1,000,000 computers would have to solve about 4.5E11 positions. There are 31536000 seconds in a non-leap year. So each of our million computers would have to solve optimally a little better than 14,000 positions per second. (I calculate 14,286.5 positions per second to be precise, but such precision is silly when there are so many approximations that have gone into the calculations.)

So that raises the question of how close are we to solving 14,000 positions per second. Tom's program is reported to solve 16 million positions per second. However, I think we are comparing apples and oranges, and maybe Tom can clarify. I think he is reporting near-optimal solutions rather than optimal solutions. And I think he is multiplying the number of cosets solved by the number of positions in the coset to come up with the 16 million figure. Actually, I get a slightly different answer than 16 million, but it's very close. The coset size is about 19.5 billion. The average time to solve a coset is reported to be 18 minutes, which yields about 975 million positions per minute. Dividing by 60 yields about 18 million positions per second, which is close to Tom's 16 million. I'm probably doing something wrong in my calculations. If so, someone please correct me.

In any case, I think that if it would be possible to achieve optimal solutions for about 14,000 positions per second, and if it would be possible to have about a million volunteers on the Internet to run such a program, the the full God's Algorithm could be calculated in about a year. I can think of three basic algorithms that could be used to accomplish such a calculation. In the interest of space, I'll discuss those three algorithms in a separate message.

## Comment viewing options

### Distributed computing for the cube

I have generally assumed the diameter would be found before a complete distance distribution was determined. I think some people might say that a true "God's algorithm calculation" would give you for each position, at least one move that gets you closer to "solved" (or some specified "start" postion). If the distance of each position is calculated and stored, that might also be called a God's algorithm calculation. In fact, storing the distance mod 3 for every position would let one determine the distance of each position with a quick and simple procedure, as has been previously mentioned on this forum. A short series of depth one searches provides an optimal sequence for a position. I would consider merely tabulating the distances for all positions another step further from a true God's algorithm calculation. With the size of the cube group, any distance lookup table would probably require many internet-connected computers to hold all the data. A mere tabulation of the distances is a much more feasible task and allows you to calculate the average distance as well as giving you the maximum.

Merely finding the diameter (sometimes called God's number), I would not really consider to be a God's algorithm calculation. However, that would probably be of much more interest to the general public than a complete tabulation of distances of all positions.

On the topic of carrying out a Rubiks's cube analysis via distributed computing, I thought I would mention that a speedcuber named Lars Vandenbergh recently did a couple of Rubik's cube analyses using distributed computing. The analyses were to compute the number of moves needed to solve the 4 edges of a face, with the face chosen to give the result that minimizes the number for each position. He did one analysis for FTM and one for QTM. Corners, of course, could be ignored so there was slightly under a trillion total positions to consider. He did not use symmetry to reduce the amount of calculations, but he broke the problem down into 2048 orientation cases.

The client program only needed to build a relatively small lookup table, and only needed to be told which orientation case to calculate. The results it needed to pass back to the host was just a small set of numbers giving the number of positions of each distance (and a number identifying the orientation case, of course). So the problem was well suited to distributed computing.

See this link for more infomation about it:

http://www.speedsolving.com/showthread.php?t=3040

(There was also a thread about this on the Yahoo speedsolvingrubikscube forum.)

### Three Possible Distributed Computing Strategies

1. The program that would be run on numerous machines in a distributed fashion would be an optimal solver that would solve individual positions rather than cosets. A central server would upon request serve out a range of positions to be worked on, and when those positions were completed the results would be reported back to the central server. The search space would of course be reduced by symmetry and antisymmetry.

The chief problem with this strategy right now is simply speed. If I understand correctly, the best optimal solvers for individual positions currently can solve a few hundred positions per second. The speed needs to be increased to about 14,000 positions per second to get the time to solve the problem down to 1,000,000 computer years.

The chief advantages of this strategy are that most existing desktop machines are capable of running existing optimal solvers, and that this strategy decomposes the overall problem beautifully. That is, none of the numerous machines working on the problem would need to communicate with each other at all. The machines would simply need to communicate with a central server to pick up a bunch of work to do, and much later (possibly months later) they would need to report their results back to the central server.

2. The program that would be run on numerous machines in a distributed fashion would be a variation on my current depth first solver that produces positions in lexicographic order. A central server would upon request serve out a range of positions to be worked on, and when those positions were completed the results would be reported back to the central server. The search space would of course be reduced by symmetry and antisymmetry.

"Never" is a long time, but I do not think my algorithm could ever enumerate the entire cube space. However, my algorithm does have some advantages. It already produces about 20,000 optimal and individual positions per second. And I think this could be increased considerably because the program was originally developed in 1995 to optimize memory usage rather than speed. I think a rewrite to optimize speed might possibly result an as much as a 10 times speed up while still using the same algorithm. Also, the algorithm decomposes beautifully because the positions are produced in lexicographic order. It's very much like a program running in distributed fashion to process a dictionary. One machine could process the A's, one machine could process the B's, one machine could process the C's, etc., and none of the machines would have to communicate with each other at all.

My algorithm does not require storing any of the positions that are being calculated. However, in order to calculate out to a distance of 2N, my algorithm must store all positions out to a distance of N. For example, to calculate the entire cube space in the face turn metric out to a distance of 20f, all positions out to a distance of 10f must be stored. That's about 2.5E11 positions, and the actual positions themselves have to be stored not just the distances from Start. The positions must be stored in memory, not on disk. No existing desktop machine can store all the positions out to 10f. The positions stored out to 10f cannot be reduced by symmetry and antisymmetry (or at least, I've not figured out a way to do so). However, the positions being calculated out to 20f could be reduced be symmetry and antisymmetry.

But my algorithm has an even worse problem, which is why I say "never". The number of positions my algorithm has to calculate for any level in the face turn metric is about 13^N. That's fine as long as the actual branching factor is about 13. But for example when going from 18f to 19f or in going from 19f to 20f, the actual branching factor is much less than 13. Indeed, it is less than 1. So my algorithm is very efficient for example at 12f from Start or 13f from Start or 14f from Start. My algorithm starts becoming less efficient about 17f or 18f from Start, although I think it would be acceptable at those distances if all the other problems associated with calculating those distances could be solved. But my algorithm becomes totally and horribly inefficient at 19f and 20f from Start because the actual branching factor is so small that virtually none of the positions being calculated are useful.

Indeed, if I can make a confession, I worked on solving the edges of the cube with my lexicographic solver for several years before Tom Rokicki succeeded. My program worked great for enumerating God's algorithm for the edges until it got out into the tail of the distribution. And at that point, the run time became totally unreasonable and I never succeeded.

3. The program that would be run on numerous machines in a distributed fashion would be a standard depth first solver that stores all the positions. I've worked out a way to do this that would require a little less than 1,000,000 machines. Each of the machines would represent a unique configuration of the corner cubies, reduced by symmetry and antisymmetry. Each of the machines would have to store a number of positions equal to half the number of edge positions, not reduced by symmetry and antisymmetry. All that would have to be stored would be the distances. The required disk space would be about 500GB. The distances could be stored on disk. The storage requirement could be reduced to about 125GB by storing distances mod 3. However, storing distances mod 3 would require that the problem be run from beginning to end with no interruptions whatsoever. I don't think that's a practical approach, so I think actual distances would need to be stored.

Current desktop machines typically don't have 500GB of disk space, but I think they will in a very few years. The bigger problem with the approach I have worked out is that all the machines solving the problem would have to be running at the same time and they would have to communicate with each other all during the solution. Each machine would only have to communicate with 12 other machines in the quarter turn metric, and with 18 other machines in the face turn metric. But the fact that all the machines would all have to be running at the same time in a cooperative fashion means I think that this approach is not really practical.

### Best optimal solvers

> positions currently can solve a few hundred positions per second.

Hmm, what solvers are these? If you really do arbitrary individual positions, the best I know of average on the order of 15 minutes *per* solution, maybe a bit less. This is based on results from a few years ago, so maybe things have sped up, but not that much.

Can you describe these solvers? Maybe there's some hidden assumption I'm unaware of. Certainly if such solvers do exist I could assuredly use them.

Right now the "hard but solvable" problems that I know of for Rubik's cube include proving a position is a depth 19 or depth 20 position. Proving a single suspected depth-20 position takes, for me, something like 8-10 CPU hours.

### Best optimal solvers

Solving cosets is in my opinion the most promising way to gods algorithm. Within a few years a standard PC will solve a H-coset within a day and a million pc need a few months to solve all 138,639,780 symmetry-reduced cosets. The only question is if a million people are interested in solving this problem.

### I think I mixed up the perfor

I have to confess to having more interest in general in solutions for individual positions than in solutions for cosets -- except that coset solvers are almost certainly the way to go do determine the group's diameter.

## The distinction between the d

My coset solver can be run in a mode to generate optimal solutions only (and coset distances), and it does so fairly efficiently; I have previously reported on some of these results in this forum. It is certainly faster than 14,000 positions per second in this mode but how much faster depends on the coset. (And it does require 4.7GB of RAM, but I am about to reduce that to 3.5GB of RAM.) I have solved 23 cosets to completion in this way, and they typically take a couple of days to complete (that was on older hardware; I'll have to test it with newer hardware).

Your calculations are accurate; both the 18 min average and the 16 million positions per second are relatively rough and relatively conservative.