Prime numbers and Rubik cube

The most fascinating thing about prime numbers is how to predict one.

For example in the following list:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19

only 2, 5, 7, 11, 13, 17 and 19 are prime numbers.

Now if we look at the Rubik cube's positions, we can also count them, something like:

U D F B L R U' D' F' B' L' R' UU UD UF UB UL UR UU' UD' UF' UB' UL' UR'....

Similar to the prime number concept we can come up with the prime cube position concept. So in the above list UU' is not a prime position.

In prime numbers theory, the following theorem has been proved:

P(x)~x/ln(x) where P(x) is the prime numbers counting function.

Finding Rubik's cube diameter D is somehow equivalent to finding the formula for R(x) where R(x) is the prime positions counting function.

In fact:

D = ln(R'(43,252,003,274,489,856,000))/ln(12) Where R' is the inverse function of R and ln is the natural logarithm.

The counting function for prime numbers is not easy to prove and I would guess the same is true for R, the counting function for prime positions of Rubik's cube.

Comment viewing options

Select your preferred way to display the comments and click 'Save settings' to activate your changes.

Primes and Cubes

First of all, a sequence like U D F B L R U' D' F' B' L' R' UU UD UF UB UL UR UU' UD' UF' UB' UL' UR'.... is not a position. It is a mapping from one position to another. Maybe you mean to consider the position obtained by applying the sequence of moves to the solved position.

Moreover, if the notion of prime position is limited to exclude sub-sequences like U U', it seems like an incredibly weak notion. The property of prime numbers p is that there exists a unique way of representing them as a product p=a*b of integers such that a<=b. I do not see anything similar to this notion in your definition of prime positions. Please elaborate if you want us to understand the relevance of primality to the cube...

Actually, since Rubik's cube

Actually, since Rubik's cube positions form a group, a sequence (a "word") represents *both* a position (a group element) and a mapping from one position to another (left multiplication by the operator in the group). That is, the positions and the mappings from one position to another are the same thing.

We might consider using the term "prime" to mean "a sequence that is both the shortest sequence for its position, and unique (that is, the only sequence that short to lead to that position, unique up to transposed commutative moves)". In that case, one could legitimately ask, what is the "greatest prime" position in the cube group---that is, what is the furthest position from the identity such that there is only a single solution?

Investigating this problem might lead to some very useful insights.

counting function for the cube

Mathematicians have used the term "word" (as Rokicki mentioned) to talk about sequences of a set of generators and their inverses. The number of elements of Rubik's cube is finite, but the number of possible words to describe positions of the cube group is infinite, of course.

What mdlazreg seems to be saying is that you can define a lexicographic ordering of the words created from the generators U, D, L, R, F, and B. The number of words of length n or less would thus be sum over k from 0 to n of 12k. You could then count the number of unique cube positions that you get starting with the zero-length word and ending with the n'th word of the lexicographic ordering. This mapping from the natural numbers to a cube position counts could be thought of as a sort of "prime counting function for the cube." Find the lowest value of k for which the counting function gives 43252003274489856000, and then you easily find the word length of that the k'th element of the lexicographic ordering. That word length would be the diameter of the cube (quarter-turn metric).

This still doesn't give us any way of efficiently calculating the value of this counting function or its inverse. Since we simply don't have any way of carrying out such calculations accurately and in a feasible time frame, it doesn't seem to me that this will lead us to finding the diameter. I think some prior posts on this forum have expressed similar position counting ideas, but we have no "magic" formula or algorithm for carrying them out.

In 1859, Bernhard Riemann published a paper introducing a technique for calculating the value of the prime counting function using calculus as a tool. The prime counting function could then be calculated without having to find all the relevant primes and counting them. The methods he introduced were key to the eventual proof of the prime number theorem. But these methods were specific to the distribution of prime numbers. They don't provide us with a method for efficiently evaluating a counting function for the cube.

Also, last time I checked, 3 was still considered a prime. :-)

Uniqueness of Solutions

I've thought a lot about this question without making any real progress. One major stumbling block is simply coming up with an acceptable definition for uniqueness. For example, is the position effected by applying the sequence RL to the Start position a "prime" position?

In one sense, RL is obviously not "prime". It's not unique because RL=LR. But it might be reasonable to define uniqueness in terms of what Dan Hoey calls syllables - successive moves of the same or opposite face. So instead of thinking of RL or LR as a sequence of two moves, we think of it as a single unique permutation of length two that can be accomplished equally well by the sequence RL or by the sequence LR.

If one is not willing to define uniqueness in terms of syllables, then prime positions would be constrained to be positions where every syllable is of length one. In other words, having made a move of the R or L face, then the next move in a prime position could not be on the R or L face. And in this model, a position such as R2 would probably be considered prime in the face turn metric but not prime in the quarter turn metric.

On the other hand, if one is willing to define uniqueness in terms of syllables, then might there be some more general concept than syllables that one might be willing to accept and still stay within the bounds of uniqueness? But if so, would the concept of uniqueness thereby become so watered down as to be rendered meaningless?

We can define a sequence to b

We can define a sequence to be trivially equal to another sequence if the only difference is transposition of adjacent commutative moves---or Hoey's syllables. That's what I was getting at in my earlier response.

In any case, we can define the following properties of a sequence:

optimal: length is shortest to attain that position (or equivalently, length of sequence is equal to distance of position)

prime: unique optimal sequence for that position

composite: optimal but not prime

So really, prime and composite are characteristics of a position, but we can extend them to the optimal sequences that reach those positions.

maximal: no single move appended increases the distance

strictly maximal: every single move appended decreases the distance

Clearly if a position is strictly maximal, it is also composite.

In the QTM metric, the set of maximal positions is the same as the set of strictly maximal positions, so all maximal positions are composite.

If a position is composite, all positions one move away, that are farther from the identity position, are also composite.

The shortest composite position I could come up with is U2D2F2B2 which is the same position as F2B2U2D2. (The squares group is full of these.)

I presently have no candidate for the longest prime position. In the face turn metric, it would be very interesting if there were a distance-20 prime position (in which case all neighbors but one are also distance 20 or higher).

We can exhaustively explore and characterise positions pretty easily to a reasonable depth; that might be something worth doing.

Hi Tom, Your definition of

Hi Tom,

Your definition of a prime position is quite interesting. It is not what I had in mind. What I had in mind has been explained by Bruce in better words. Thanks Bruce and yes 3 is still a prime:).

If fact following your definition we can list the positions in QTM in a table as follow:

positions = prime_positions + composite_positions

1 = 1 + 0

12 = 12 + 0

114 = 96 + 18

The third level has 18 composite positions because:

18 = 6x1 + 3x4


6 is for each face M
1 is for the identity MM=M'M'
3 is for each face M and its mirror face N [for example R and L]
4 is for the 4 identities MN=NM MN'=N'M M'N=N'M' M'N'=N'M'

I believe your definition of a prime position in the QTM can be stated in recurrent terms as follow:

A prime position is a position:

that has 0 neighbors at a lower distance. [The identity]


that has 1 neighbor at a lower distance and all the rest of its 11 neighbors are at higher distance and the neighbor at the lower distance is also a prime position.

It is important that the neighbor N at the lower distance of a prime position P be also a prime position because otherwise it means they are multiple optimal sequences leading to N which means they are multiple optimal sequences leading to P, thus violating the definition of a prime position of having a unique optimal sequence.

This means that if the unique optimal sequence to the prime position is:

X1X2X3X4.........XN where Xi is one of the 12 moves then:

X1X2X3X4....Xj for every j less than N is also a prime position.

Once this concept of prime position is understood, the question of what is the longest prime position [LP] is indeed a very interesting question and might hold the key to some deeper insights. One thing is sure is that LP distance is strictly lower than D the diameter of the cube because all antipodes are for sure composite positions...

Syllables and Adjacent Commutative Moves

I meant to mention that Hoey's syllables are almost the same thing as adjacent commutative moves, but not quite.

As an example of adjacent commutative moves, I would simply repeat the example that RL=LR.  RL=LR is also one of Hoey's syllables.

As an example of Hoey's syllables that do not involve adjacent commutative moves, I would cite FF=F-1F-1.  In this case, FF and F-1F-1 are of course the same syllable.

The way I think of the latter case is as follows. From FF, one can go "back to Start" via (FF)(F-1F-1). Or from FF, one can go "forward to Start" via (FF)(FF).

(FF)(F-1F-1) and (FF)(FF) are of course both relations, and both consist of four quarter turns. Also, for both relations it is the case that halfway through the relation you are two moves from Start. Indeed, one might even think of the FF=(F-1F-1) position as being a local maximum with respect to the F face, although it certainly isn't a local maximum with respect to the other five faces. In any case (and as short as it is), I think the FFFF relation is much more interesting than is the FFF-1F-1 relation. The FFF-1F-1 relation trivially collapses into FF-1 via F(FF-1)F-1, and then it collapses trivially into the identity permutation. The FFFF relation doesn't collapse in such a trivial manner. To get it to collapse, you have to do something very simple such as associate it as (FF)(FF) and replace the second FF with F-1F-1 to yield (FF)(F-1F-1). But this very simple manipulation of FFFF doesn't seem "trivial" to me in the same sense that the collapsing of FFF-1F-1 does.

The difference between (FF)(F

The difference between (FF)(F'F') and (FF)(FF) is a very big difference. In fact (FF)(FF) is a relation that characterize the group. The relation (FF)(F'F') is a trivial one because it is true for any group. If you take a generator M of any group, then you have:


such relations are true for any group and stem from the fact that ABC=A(BC)=(AB)C and MM'=I.

In other words FFF'F' is not a useful relation at all.

FFFF=I is however very important and is a characteristic of the cube group. Without it the cube would not be the cube...

Regarding the prime positions definition, I think that defining the uniqueness in terms of syllables gives a weak definition of a prime position.

I prefer to define a prime position as a position that its optimal sequence is unique in an absolute way [no other sequence exist that leads to the prime position from the START position not even one with adjacent commutative moves to the unique sequence].

This means that a prime position is connected with the START position in a unique way. And all positions between the prime position and the START position are also prime.

With this prime position definition in mind I calculated the following table for the cube's corners permutations only:

level positions prime_positions
0         1           1
1        12          12
2       114          96
3       876         480
4      4931        1440
5     12972        1344
6     15066         192
7      6300           0
8        48           0
      40320        3565

This means that they are 192 longest prime positions.

It is interesting to note that the longest prime positions are two levels below the antipodes exactly at the start of the tail...

To be one level below the antipodes is obvious since the antipodes can not be prime positions but to be two levels below is not an obvious thing I think...

The other interesting thing is that the number of prime positions at each level , except level 0 and 1, is a 48 multiplier... I wonder what is the mathematical truth behind it if it turns out to be a general rule or probably it is true only for this specific case [ the cube corners only permutations].

Ps: My table above is the output of a quick code I wrote so the numbers can be very wrong.

Number of Primes is Divisible by 48

The number of prime positions in the quarter turn metric at distance N > 1 from Start is divisible by 48.  This can be proven as follows.

Symm(x) is the group of symmetries for a position x.  Symm(x) is a subgroup of M, where M is the set of rotations and reflections of the cube.  Symm(x) may be calculated as the set of all m in M such that xm=x.  If we have a position x for which Symm(x)={i} (where x may or not be prime and where i is the identity or Start position), then |xM|=48.

Let any position x be given (x may or may not be prime) and let p be any sequence of quarter turns that yields x (p may or may not be minimal).  Hence, we can write

     x = p = qi1 qi2 .... qin

where each qik is a quarter turn and where the actual distance from Start which is N may be less than n. Then, we can calculate xm as pm = (qi1)m(qi2)m .... (qin)m.  That's just a fancy way of saying that a sequence such as RU is M-conjugate with sequences such as UF or R-1U-1, and that a sequence such as RU-1 is M-conjugate with sequences such as UF-1 or R-1U.

Now we suppose that instead of being any old position, x is a prime position that is more than 1 move from Start.  And instead of being any old process, p is the unique process for x.  We can say that either p or one of its M-conjugates must begin either with RU or with RU-1.  So without loss of generality, we can choose an x that begins either with RU or with RU-1.  In order to calculate Symm(x), we must find all m in M such that xm=x.  Because x is prime, xm must preserve p.  The only m in M that preserves RU or RU-1 is i.  Therefore, Symm(x)={i} and |xM|=48.  Notice that we must have N > 1 because for a one move sequence such as x = p = F, we have |Symm(x)| = 4 so that |xM| = 12.

Prime positions for the whole cube

Based on some work done by Jerry Bryan and posted in "Starts-with and Ends-With" I extracted what I believe is the prime positions distribution for QTM up to level 13:
Level        prime_positions
  0               1
  1              12
  2              96
  3             768
  4            7296
  5           68352
  6          639456
  7         5972256
  8        55725744
  9       519810912
 10      4848717648
 11     45219930720
 12    421618316448
 13   3929530575552
Remember that the prime position definition I am adopting is that its optimal sequence is unique. Meaning that it is connected with the START position in a unique optimal way. I verified what I have noticed in my previous post that all levels, except 0 and 1, are 48 multiplier.

Not Necessarily Prime

These are positions for which both |StartsWith(x)| = 1 and |EndsWith(x)| = 1. So a minimal sequence for each one of these positions could be described as "prime at both ends". However, there is no guarantee that such a sequence is "prime in the middle", and therefore there is no guarantee that such a sequence is prime at all. For example, such a sequence could begin URL.... and fulfill the condition that |StartsWith(x)| = 1. That simply means that any minimal sequence for x must start with U. But in this example, the sequence clearly could also begin ULR... Hence, it is not prime.


I see! thanks for the correction.

I think this StartsWith and EndsWith concepts are artifacts of your God's algorithm code.

I was wondering if it can be refined to produce more details, in such a way that it describes also what happens in the "middle".

For example in level 4 the number of positions that have StartsWith=1 and EndsWith=1 is 7296.

It would be great if 7296 is broken down to: 6144 + 1152

6144 is then the number of prime positions for which the "middle" is also unique. 1152 is the number of positions that are "prime at both ends" but not in the "middle".

I think visually this can be described by saying that once we choose one of the 12 generators, we have 11 choices after that, 11 is 8 + 3 where 3 is for staying with the same face [one choice] or its mirror face [2 choices]. We have then:

7296 = 12[ 8*8*8 + 8*3*8 / 2 ]

The 6144 = 12*8*8*8 prime positions can then be described by the following graph:


and the 1152 = 12*8*3*8/2 positions can be described by the graph:


I think this can be generalized for all levels. For example for level 5 your program produces:

68352 as the number of positions for which StartsWith=1 and EndsWith=1

which can be broken down to:

68352 = 12*[8*8*8*8 + 8*3*8*8/2 + 8*8*3*8/2 + 8*3*3*8/9]

The prime positions at level 5 is then:

49152 = 12*8*8*8*8 which can be represented by:

while the other terms can be represented by:


Things get complicated at level 6 because new identities other than the trivial ones come to play... But probably your program can be tweaked to break down the numbers to what happen in the middle instead of only StartsWith and EndsWith.

Ahh, that's true; I omitted t

Ahh, that's true; I omitted those implicitly by focusing attention only on the optimal sequences, but that's sort of cheating. (And truthfully it's not complete either because it permits both FF and F'F', which hurt our notion of primeness in the QTM. I guess I've been working FTM too long.)

Sorry for hijacking this story.