# 3x3x3 cube ignoring edge locations

I have done an analysis of the 3x3x3 Rubik's cube ignoring the locations of the edges (but not ignoring the orientations). That is, the locations and orientations of the corners were used, as well as the orientation of the edges. I used symmetry to reduce the number of positions from 180,592,312,320 to 3,772,354,560 (1,841,970 corner sym-coordinate values * 2048).

I have created files giving the distances for each of the positions of this problem space for the face-turn and quarter-turn metrics. I have listed the summary of the results I got below (where the numbers given are not symmetry-reduced). I was wondering if anyone has done this analysis before as I haven't been able to find any other such data on the internet to verify my results against.

face turn metric positions cumulative distance 0f: 1 1 distance 1f: 18 19 distance 2f: 243 262 distance 3f: 3,090 3,352 distance 4f: 38,666 42,018 distance 5f: 480,654 522,672 distance 6f: 5,926,677 6,449,349 distance 7f: 71,470,701 77,920,050 distance 8f: 825,654,684 903,574,734 distance 9f: 8,503,518,654 9,407,093,388 distance 10f: 60,902,212,428 70,309,305,816 distance 11f: 106,280,694,040 176,589,999,856 distance 12f: 4,002,310,460 180,592,310,316 distance 13f: 2,004 180,592,312,320 quarter turn metric distance 0q: 1 1 distance 1q: 12 13 distance 2q: 114 127 distance 3q: 1,068 1,195 distance 4q: 9,939 11,134 distance 5q: 91,668 102,802 distance 6q: 836,232 939,034 distance 7q: 7,515,984 8,455,018 distance 8q: 65,904,748 74,359,766 distance 9q: 545,333,600 619,693,366 distance 10q: 3,985,539,719 4,605,233,085 distance 11q: 22,921,454,744 27,526,687,829 distance 12q: 71,219,543,521 98,746,231,350 distance 13q: 66,815,043,888 165,561,275,238 distance 14q: 15,024,321,886 180,585,597,124 distance 15q: 6,715,196 180,592,312,320

## Comment viewing options

### 3x3x3 cube ignoring edge locations

That would need 90Gbytes of storage !

How long did you take to run all this ?

Have you considered using this table as a base for

the heuristic function used IDE-A* search methods?

great work!

Claude

### Re: 3x3x3 cube ignoring edge locations

> That would need 90Gbytes of storage !

I used the "coordinate space" of size 1841970*2048 to reduce the number of values stored to 3,772,354,560. This requires about 1.886*(10^9) bytes of disk storage at 4 bits per element. This space uses symmetry in the corner cubies. (Note, the "real" number of positions, using symmetry of the edge orientations as well as the corners, according to brute-force counting by some code I added to the program, appears to be 3,769,762,452.)

Note that I used a certain edge orientation convention that allows use of the full 48 symmetries of the cube. Another edge orientation convention might provide a higher average distance, but using fewer symmetries would require a larger table.

> How long did you take to run all this ?The face turn metric took about 16h 17m for the main loop to execute (including one extra iteration to verify no new positions were found). The initialization I believe was under 10 minutes. I had optimized the file I/O when I ran the quarter turn metric, and used an internal hard drive instead of an external USB2 drive. It took about 10h 20m to run the first fourteen iterations. (I initially did the quarter-turn metric in two steps, but found that I really didn't need to since all the remaining positions were reached in the 15th iteration.) I reran the quarter-turn metric in a single step, but I don't seem to have the timing information from that. I think the main reason the quarter-turn metric was faster is that there are only 12 moves instead of 18 to process for each position. (Note: I used a Pentium4, 2.4GHz, 1GB RAM).

Note, I am not mentioning time spent to build and save lookup table data that was done prior to the final "runs" that produced the results. The final runs loaded some of the lookup tables from disk files.

> Have you considered using this table as a base for> the heuristic function used IDE-A* search methods?

Yes, I am working on incorporating this with some code of mine based roughly on the IDA* pseudo-code I found on Jaap's web site. (I am not quite sure if you mean exactly the same thing by IDE-A*.)

- Bruce Norskog### Simple optimal solver results

I have implemented a couple of simple optimal solvers that use the distance table I generated as a pruning table. (I used the face-turn metric distance table, although with a few changes to my code, I should be able to use the quarter-turn metric table.) The good news is that the correct optimal solutions were obtained for the test cases I tried, providing some additional validation to the distance tables. The programs are not any where near the performance needed to compete with optimal solvers already available.

The full distance table has a size in excess of 1.886E+9 bytes (about 1.9 GB), so it is too large to use as is in a computer with 1GB of RAM. I have used a couple of techniques to reduce the amount of memory used for the pruning table.

First, I used a technique I've experimented with where I used a "perimeter search" in combination with the IDA* algorithm, and a pruning table with reduced information. The second approach used a modulo 3 pruning table.

The first method essentially used an IDA* algorithm, except when the remaining search depth was reduced to a distance of 6, I used a table lookup to determine if the position was actually distance 6 from the solved cube state. This used a totally separate table that was a hash table storing all the positions of the cube of distance 6 (or less) from the solved state. As a result, a pruning table value in the range of 1 to 7 becomes essentially useless. This means I could reduce the pruning table to 3 bits wide (per position) and have the three bits represent the values 0, 8, 9, 10, 11, 12, and 13 with one bit-combination left over. (13 is the maximum value in my table for the face-turn metric.) But this would still require more than 1GB. I wanted to compress 5 pruning table values into a byte, so instead I used the equivalent of one ternary digit to represent only the values 0, 10, and 11. If the value in the original table is 0 to 9, 0 is used. 11 is used for any values of 11 or more in the original table. This unfortunately reduces the effectiveness of the pruning table. The increased memory for the "perimeter table" seemed to be enough to cause the computer to use the computer swap file to some extent, perhaps due to background processes waking up.

The 2nd approach uses a trick mentioned in the documentation of the Cube Explorer program by H. Kociemba. In memory, I store the distance value from the original table modulo 3. With this method, there is no effective loss of information in the pruning table, but the time it takes to determine the actual value is much longer. Generally, many lookups and cube move operations must be performed to arrive at the proper value. See the Cube Explorer documentation for more details on this method. My code for performing a basic move is not currently optimized (it translates a symmetry-reduced representation to a non-symmetry-reduced representation, performs the move in that representation, and then converts back to the symmetry-reduced representation), further increasing the time to calculate the proper pruning table value.

The table below gives a summary of results I obtained using three test cases using cubes of distances 14 to 16 from the solved cube. (The last test case using the second method was not run to completion, but progress information allowed me estimate what the total execution would have been.) It can be seen how the reduced effectiveness of the pruning table in the first method increased the number of nodes generated. The node count would have been much higher without the perimeter search were not employed. However, since the pruning table lookups were much faster with this method, the overall execution time was actually much less than with the second method.

(Note: the number of nodes is for the final iteration of the IDA* algorithm, and the final iteration continued to completion after the first solution was found to look for other possible optimal solutions. The execution times were for all iterations. Note that with the first method, a hash table lookup checked if the distance was less than or equal to 6, so the IDA* algorithm started with a search depth of 7.)

first method (with perimeter search) second method (with mod 3 table) distance nodes execution time nodes execution time 14 14,017,014 55s 6,305,748 4m 0s 15 244,926,576 10m 41s 115,281,765 1h 11m 0s 16 2,848,008,711 2h 03m 36s ? 13h (estimated)

The use of a 2nd pruning table that uses edge permutation information would probably help to improve the performance of the program, but for a machine with 1 GB of RAM, there is not a lot of memory to spare.

- Bruce Norskog### I doubt that anybody has done

It would probably be useful if you could post the symmetry reduced numbers. I usually consider them more significant than then non-reduced numbers because I tend to think in terms of the "real size" of cube space rather than the "group size".

It would also be useful if you could describe a few more details of your program. The search space is pretty large because 1841970x2048 is 3772354560. I think the best you can do is 2 bits per position, storing the distance from Start mod 3, with one other bit pattern indicating that you haven't calculated the position yet. At that rate, you would need 943088640 bytes to store all the results, which is about a gigabyte. Were you able to keep all the results in memory, or did you have to use disk? I also assume that you would have to treat the "times 1841970" index as sort of an associative array, because the symmetry classes are so ill-behaved. I would assume you would treat the "times 2048" index as a bit string, with each bit representing flipped or not. It's 2^11 rather than 2^12 because the flip of the first eleven edge cubies dictates the flip of the twelfth cubie.

### Where are the symmetry-reduced numbers?

My program output contained symmetry-reduced numbers, but the numbers were in relation to the "coordinate space" I used. My program did not calculate the "real" symmetry-reduced numbers.

There are "real" positions that do not map to a unique value in my coordinate space. I put in code to ensure that when a "new" position was found, any other position corresponding to the same "real" position was also marked "new." So it looks like to me I could have easily have generated the "real" symmetry-reduced numbers. It should not be necessary to rerun from scratch to get these numbers, so I think I can do this in the next day or so.

- Bruce Norskog### Symmetry-reduced results

Finally, I have "real" symmetry-reduced numbers for the 3x3x3 cube when corner locations and orientations are considered along with edge orientation using (UF,UB,DF,DB,LU,LD,RU,RD,FL,FR,BL,BR) to define the edge orientation reference framework. These results (and the results of my original post) are unconfirmed.

By "real" symmetry-reduced numbers, I mean that positions that are equivalent under one of the 48 cube symmetries are considered the same position. Anti-symmetry was not considered."real" positions cumulative total face-turn metric distance 0f 1 1 distance 1f 2 3 distance 2f 9 12 distance 3f 73 85 distance 4f 839 924 distance 5f 10,100 11,024 distance 6f 123,829 134,853 distance 7f 1,490,278 1,625,131 distance 8f 17,206,529 18,831,660 distance 9f 177,181,959 196,013,619 distance 10f 1,268,898,678 1,464,912,297 distance 11f 2,214,419,000 3,679,331,297 distance 12f 83,438,084 3,762,769,381 distance 13f 71 3,762,769,452 quarter-turn metric distance 0q 1 1 distance 1q 1 2 distance 2q 5 7 distance 3q 25 32 distance 4q 218 250 distance 5q 1,934 2,184 distance 6q 17,530 19,714 distance 7q 156,835 176,549 distance 8q 1,374,168 1,550,717 distance 9q 11,364,530 12,915,247 distance 10q 83,047,697 95,962,944 distance 11q 477,575,701 573,538,645 distance 12q 1,483,874,938 2,057,413,583 distance 13q 1,392,144,562 3,449,558,145 distance 14q 313,070,169 3,762,628,314 distance 15q 141,138 3,762,769,452- Bruce Norskog

### Re: I have verified these numbers

By the way, have you considered trying this analysis with a different edge orientation convention, such as the one used in Mike Reid's program? That doesn't allow full M-symmetry to be used, but I speculate that it may result with a higher average distance. But it also has a cost of approximately 3 times the number of "real" positions.

- Bruce Norskog

### edge orientation convention and distance distributions

In the absence of sufficient information about which edge cubie is where, I believe it does make a difference what edge orientation convention is used. In this case, there is no information about the permutation of the edges. Try doing God's algorithm calculations where only edge orientation is considered (2048 positions). Try different edge orientation conventions (such as Reid's convention and an M-symmetric convention) and see if the results have the same distribution. (I just tried this myself, and if my quickly hacked program worked correctly, the results are close, but not exactly the same.)

I remember awhile back I tried duplicating Mike Reid's pruning table with my own code. I ran my program and compared the distribution of distances with those given by Reid's program. My results were different, so I assumed I must have had a bug in my program. After analyzing the discrepancy, I concluded that the reason my results were different was not because my program was working incorrectly, but because I was defining edge orientation differently. I believe I had totally correct results, but for a different subgroup. After changing the edge orientation convention, I got the same results. I suspect Mike Reid chose the edge orientation that he did because of the quality of the resulting pruning table.

- Bruce Norskog### sym-coordinates and edge orientation coordinates

My program needed to convert between "raw" coordinates and sym-coordinates (both directions). I used a procedure analogous to that used in Mike Reid's optimal solver program to do that, only I was dealing with 8 corner cubies whereas Mike's code was dealing with 4 edge cubies.

In my case, the arrays required over 700 million bytes. This was too much, of course. So I looked for patterns within the arrays, and eventually came up with a way to do the same conversions with 14 lookup tables using a total of about 48 million bytes. This could probably be improved, but I felt it was good enough.

After verifying my new conversion code produced the same results as the direct conversion tables for all possible values, I could eliminate the direct conversion tables from the program. This provided room for the "main array" of the program.

Yes, my edge orientation coordinate was an 11-bit number with a bit containing a 1 corresponding to a flipped edge. The MSB represented the UF edge position, and the LSB represented the BL edge position. The BR edge position was omitted from the 11-bit number.

- Bruce Norskog### I used less than 2 bits per position in memory, but ...

I was working with a computer with 1GB of memory, so I couldn't afford two bits per position in memory, and had to perform some file I/O at the end of each iteration. Here are the details.

I figured using two bits for each of the 3,772,354,560 elements would use up too much memory. I needed to find a way to use less than two bits. Packing 5 elements per byte would reduce the memory requirement for this data to less than 755 million bytes. This would mean I would have essentially a ternary digit of storage for each element. That is, each element could have a value of 0, 1, or 2. I came up with a way to make this work, using a 4-bit per element disk file, along with the large RAM array.

My program performs has a main loop that determines the elements at a distance one more than the previous iteration. Each iteration starts with a disk file containing the distances of each of the 3.772+ billion positions (except a distance value of 15 is used for any element for which the distance is still unknown) and the RAM array has all ternary digits being 0 except for those elements that have the maximum distance determined so far. Let's say the previous iteration determined the elements at distance d from the solved state. Then, the new iteration takes each element whose array value is 1, and computes all positions one move away (quarter-turn, face-turn, or whatever metric is used), and sets the corresponding array elements to the value 2 (except for those where the existing value was 1). This means the elements with an array value of 2 will be either distance d+1 or d-1 from the solved state. Thus, more "new" elements are generated than we really want. This is remedied by using the disk file.

After all the elements are processed for the iteration, the disk file from the previous iteration is read. If the disk file contains a value other than 15, the distance value is valid and is merely copied to the disk file for this new iteration. If the disk file contains the value 15, then the RAM array is checked. If the RAM array value is 2, the value d+1 is written to the new disk file. We know the distance in this case must be d+1, not d-1; otherwise the old file would have had the distance d-1 for it, and the value d-1 would have been copied to the new file. Also, the RAM array values are fixed up for the next iteration during this processing.

In summary, data for 5 positions were packed into a byte of memory, with the penalty of having to concurrently read one large file and write one large file at the end of each iteration. No random access files were used.

- Bruce Norskog### A note about the edge orientation

In my previous post (3x3x3 cube ignoring edge locations), I neglected to mention how I defined edge orientation.

My definition of edge orientation is based upon this list: (UF,UB,DF,DB,LU,LD,RU,RD,FL,FR,BL,BR).

The letter pairs indicate the various positions of the edge cubes in terms of the two faces of the cube the edge position is part of. The first letter of each pair is considered the "marked" face or reference face. For example, the "U" face is considered the reference face of the UF edge position. (Likewise the edge cubie with the "U" and "F" colors has the "U"-colored face as its reference face.) If the reference face for an edge position is aligned with the reference face of the cubie located in that position, then the cubie is considered aligned. Else it is considered flipped.

This convention for defining edge orientations allows all 48 cube symmetries to be applied without knowledge about which edge cubie is in what location. Applying a particular symmetry either preserves all the edge orientation reference faces, or flips all of them. So in either case, the edge orientation coordinate (after applying the symmetry) can be determined without regard to what cubie is in each position. This type of edge orientation convention was necessary for me to take full advantage of the 48 symmetries of the cube in order to reduce the number of effective cube positions to under 4 billion.

- Bruce Norskog### This note is remindful of som

Occasionally a proof of conservation of flip is offered that claims not to establish such a frame of reference. It may be only a matter of terminology, but it always seems to me that such proofs really do establish a frame of reference. It's just that with some proofs the frame of reference is stated very explicitly, and with other proofs the frame of reference is much more implicit. In particular, a fairly abstract proof for conservation of flip can be given involving a wreath product representation of the edges group, and a wreath product representation makes the frame of reference fairly implicit rather than explicit.

But I think the frame of reference always there. The issue is that when an edge cubie is in its home cubicle, it's easy to tell if it's flipped or not. But if an edge cubie is not in it's home cubicle, then you can't tell if it's flipped or not without a frame of reference.

## Pruning table comparison

I wrote a small program to simulate performing pruning table lookups for 100 million cube positions, based upon the distribution of the distance values in the table. The program shows the results for the distance table I generated for the face-turn metric as well as the results for the pruning table used in the Huge Optimal Solver in the Cube Explorer program by H. Kociemba. Note that the Cube Explorer program makes use of the pruning table 3 times per position, making that table significantly more effective than if only 1 lookup per position were performed. (The maximum value from the three lookups can be used as the pruning table value.)

So while it calculated the average value for the Huge Optimal Solver pruning table to be about 10.306, the effective value came out to be about 10.804. It is not clear to me whether or not the Huge Optimal Solver in Cube Explorer can and does incorporate a further technique attributed to Michiel de Bondt. If that is the case, my simulation indicates the effective average pruning table value for a cube position increases to 10.991. (The readme file distributed with the program appears to say it's only used with the program's standard solver, not the huge optimal solver.)

The pruning table I generated that considers the permutation of the corners, and the orientations of both the corners and the edges, comes out to an average distance of approximately 10.575. So although my table has a higher average distance, the pruning table used in the Huge Optimal Solver of the Cube Explorer appears to be somewhat better when you take into account how it is used. (Since Cube Explorer uses the face-turn metric, all numbers are in terms of that metric.)

I thought I would make one additional comment here. In the Huge Optimal Solver, the sym-coordinate used related only to edge permutation information (as compared to my program, where two raw cooordinates, corner permutation and corner orientation, are used in creating a sym-coordinate). I assume this greatly simplifies the translation from "raw" coordinates to "sym" coordinates. I believe if I had used only corner permutation information to create a sym-coordinate, the number of positions for my table would needed to have been 984*2187*2048 = 4,407,312,384, I believe. At 5 positions per byte, the amount of RAM for the main table would have been over 881 million bytes, about 127 million bytes extra, and I suspect it may have been a little too much for the program to run completely from RAM on a 1GB Windows XP machine.

Finally, I would like to thank other people for their prior work and articles/documentation that helped me in creating the program, including Mike Reid, Herbert Kociemba, Jaap Scherphuis, and others. Also thanks to Jerry and Claude for their comments.

- Bruce Norskog