# New estimate for 20f*: 300,000,000

Over the past few weeks I've run 250 random cosets of the Kociemba group, each with 19,508,428,800 positions, out to completion, calculating an optimal solution to all. This is a total of 4,877,107,200,000 positions, almost all unique modulo M+inv. I ran these to try to get a better estimate of the total count of 20f* positions and also to compare the overall histogram of position depths against my previous work to calculate the exact count of 13f* positions, and Kociemba's recent experiment to optimally solve a one hundred thousand random cube positions.

Of these 250 random cosets, 232 had absolutely no distance-20 positions. The 18 cosets with distance-20 positions had a total of 33 distance-20 positions; 10 had a single distance-20 position, 5 had two distance-20 positions, and the remaining distance-20 positions were contained in cosets containing 4, 5, and 6 such positions, respectively.

With these new results, I now estimate that there are about 300,000,000 distance-20 positions overall. Luckily, we know which cosets they are concentrated in, and I am currently exploring those cosets out to completion to try to find as many of the distance-20 positions as I can.

In the table below, you'll see pretty good agreement between the exact values at the 250 coset values at distances 12 and 13. At lesser distances, the accuracy falls off because the "hit" sample size is much smaller.

The overall agreement between the 100K positions solved by Kociemba and the 250 cosets I have solved is very good.

The estimate for the count of 20f* has fallen markedly from my initial experiment with only 25 cosets. Because of this it is fairly likely that the current estimate of about 300 million is also not extremely accurate, but I believe it is in the correct order of magnitude.

``` d           exact  percentage     250 cosets  percentage    100K   percen
-- ---------------  ----------- -------------  ------------  -----  ------
0               1  2.31203e-18
1              18  4.16166e-17
2             243  5.61824e-16
3            3240  7.49098e-15
4           43239  9.99699e-14
5          574908  1.32921e-12
6         7618438  1.76141e-11
7       100803036  2.3306e-10
8      1332343288  3.08042e-09            30   6.15119e-10
9     17596479795  4.06836e-08           990   2.02989e-08
10    232248063316  5.36965e-07         18603   3.81435e-07
11   3063288809012  7.08242e-06        291034   5.96735e-06
12  40374425656248  9.33469e-05       4191486   8.59421e-05
13 531653418284628  0.0012292        57843281   0.00118602
14               -                  779147743   0.0159756       18   0.018
15               -                10312034775   0.211438       197   0.197
16               -               131362659765   2.69345       2710   2.71
17               -              1315341261754  26.9697       26673  26.673
18               -              3261117757529  66.8658       67099  67.099
19               -               158131992975   3.24233       3303   3.303
20               -                         35   7.17639e-10
```

## Comment viewing options

### Interesting results. How much

Interesting results. How much time you needed on average per coset? You say we know, which cosets they are concentrated in. There are exactly 1,091,994 20f* cubes with symmetries, but is there any evidence that the ~300,000,000 20f* cubes are concentrated in the cosets where these cube are in? Or is there some other criterion?

### Time per coset

To run the coset to a depth of 18 averaged 100 minutes each on an i7-920. But to prove the potential 20f*'s were either 20f* or 19f* took additional time that is a bit more problematic to account for, since I did it on the various random P4 machines I have hanging around. That time was substantial. Perhaps it was another 100 minutes on average, with some cosets (maybe 30%?) needed no extra work but a handful requiring many hours of additional work. It doesn't help that my optimal solver is much slower than your new one (but your new one requires a Windows machine and all my machines are linux).

On the concentration of 20f*'s: we can calculate how many length-19 "canonical sequences" (sequences where there are no adjacent turns of the same face, and commuting generators are in a specific order) there are for each coset using a simple dynamic programming program. I've run such a program, and found that some cosets have tremendously more such sequences than others, and empirically found that the 20f*'s tend to be concentrated in the cosets with fewer length-19 sequences. This makes intuitive sense, but also is borne out in practice.

So I'm doing the cosets in order from the ones with the fewest number of length-19 sequences (these have the side effect of also being faster to run, so I'm doing a full depth-19 search on them). So far I've run about 36 of these cosets; each is taking about eight hours of time on my i7-920, and I've found over 10,000 new (mod M+inv) 20f* positions.

The average number of length-19 canonical sequences over the 138M cosets (reduced by symmetry) is 1.59e12. There are more than 100 cosets however that have fewer than 6e11 sequences, and more than 9500 of the cosets that have fewer than 7e11 sequences. On the other end, there is a single coset that has 1.50e16 sequences, and another that has 4.38e15 sequences; in other words, the distance-19 sequences do not spread themselves evenly over the cosets very well.