Some Thoughts on Representing the Cube
I wanted to post a number of miscellaneous items about representing the cube, and I will also include a few other related items.
I'll start with the group S3 as an example. I will treat the group S3 as acting on the set {0, 1, 2}. As I have been doing recently, I'll use the notation (a b c) to represent the permutation 0→a, 1→b, 2→c.
In this notation, the entirety of S3 can be listed as follows:
(0 1 2) (0 2 1) (1 0 2) (1 2 0) (2 0 1) (2 1 0)
This basic idea, or something very similar to it, is probably the way most people represent the cube in a computer program. Variations on the theme could include an S54 model, an S48 model, an S24 × S24 model, and some sort of wreath product model. The most common wreath product model would probably be something like (S8 wr C3) × (S12 wr C2). In the wreath product model, S8 and S12 represent the corner cubies and the edge cubies, respectively, and C3 and C2 represent the twists of the corner cubies and the flips of the edge cubies, respectively. Of course, none of these various models are isomorphic to the cube group. Rather, the cube group is a subgroup of whatever group is chosen as the computer model.
I listed S3 in lexicographic order. Obviously, it doesn't have to be listed in lexicographic order to be a group. Nevertheless, lexicographic order is useful in a couple of respects. For one thing, if positions in a search space for a cube problem can be produced in lexicographic order, then duplicate positions can be detected and counted properly without the necessity of having to store all the positions.
For another thing, lexicographic order suggests one of many ways that an easy to calculate bijection can be established between a group and the integers from 0 to N-1, where N is the order of the group. For example, the following is a very natural bijection between S3 and the integers from 0 to 5.
0 ↔ (0 1 2) 1 ↔ (0 2 1) 2 ↔ (1 0 2) 3 ↔ (1 2 0) 4 ↔ (2 0 1) 5 ↔ (2 1 0)
The calculation of this bijection would use mixed base arithmetic. To calculate the first digit, an index of 0, 1, or 2 (essentially, base 3 arithmetic) would chose between the digits in the obvious fashion. To calculate the second digit, an index of 0 or 1 (essentially, base 2 arithmetic) would chose between whatever digits were remaining -- between 1 and 2 if the first digit were 0, between 0 and 2 if the first digit were 1, and between 0 and 1 if the first digit were 2.
The fact that this kind of calculation can work is a consequence of several of the basic Cauchy theorems such as the one about the order of a subgroup dividing exactly the order of the group. In the case of S3, we have that
(0 1 2) (0 2 1)
is a subgroup and that
(1 0 2) (1 2 0)
and
(2 0 1) (2 1 0)
are cosets. We may think of the first digit as selecting a coset, and successive digits really work in the same way with each digit selecting cosets of successively smaller subgroups. For example,
(0 1 2)
is a subgroup of
(0 1 2) (0 2 1)
and
(0 2 1)
is the other coset.
The same kind of easy to calculate bijection is possible for any finite permutation group. S3 is perhaps deceptively simple in this regard. It is both very small and very symmetric. For a group that is not of the form Sn such as the cube group, the indexing of the cosets required for the calculation of the bijection is not quite so straightforward as is the indexing of the cosets in Sn. But the indexing is still quite straightforward.
Indeed, a person who is much better mathematician than I am could probably state precisely and prove the following highly informal "theorem": Let Sn be given and let H be a non-empty subset of Sn. Then, H is a group if and only if there exists an easy to calculate bijection between H and the integers between 0 and N-1 with N=|H|, where the bijection uses mixed base arithmetic, and where the bijection is not simply a list of the elements of H.
Lexicographic order is not the only ordering of a group that leads very naturally to an easy to calculate bijection between H and the integers between 0 and N-1. For example, the following ordering of S3 would do just as well as lexicographic order. The same basic programming could be used to calculate this bijection as for calculating the bijection in lexicographic order. The difference is that the calculations for this bijection would need to proceed right to left rather than left to right.
0 ↔ (2 1 0) 1 ↔ (1 2 0) 2 ↔ (2 0 1) 3 ↔ (0 2 1) 4 ↔ (1 0 2) 5 ↔ (0 1 2)
The method that's described in Cube-Lovers for calculating a bijection between a group and the numbers from 0 to N-1 is called a Sims table. I confess I have never understood the Sims algorithm completely. The algorithm is very powerful in that it does not presume any a priori knowledge of the "structure" of a group such as I have been demonstrating for S3. Rather, all the Sims algorithm requires is a set of generators for the group in question. Given the generators, the algorithm determines a sequence of nested subgroups, with the nesting based on fixing the items that are being permuted successively from right to left. That is, the first subgroup fixes the right-most item, the second subgroup fixes the two right-most items, the third subgroup fixes the three right-most items, etc.
I really don't know for sure, but I believe the bijection calculated for S3 by the Sims algorithm would be as follows. If anybody knows for sure, please post how it works.
0 ↔ (0 1 2) 1 ↔ (1 0 2) 2 ↔ (0 2 1) 3 ↔ (2 0 1) 4 ↔ (1 2 0) 5 ↔ (2 1 0)
The notation (a b c) does not have to represent the permutation 0→a, 1→b, 2→c. Instead, the notation (a b c) could just as well represent the permutation a→0, b→1, c→2. The two interpretations of the notation are clearly inverses of each other. David Singmaster remarked about this possibility in some of his earliest writings on the cube.
For example, consider the cube permutation F, which means to rotate the Front face clockwise by 90 degrees. With respect to the edges, F may be written in standard Singmaster cycle notation as F=(lf,uf,rf,df) meaning left-front to up-front to right-front to down-front to left-front. As Singmaster pointed out, this could mean left-front cubie to up-front cubie to right-front cubie to down-front cubie to left-front cubie. Or it could mean left-front cubicle to up-front cubicle to right-front cubicle to down-front cubicle to left-front cubicle. Either interpretation is entirely valid. You just have to pick one interpretation and stick with it.
It is my personal opinion that with respect to standard mathematical group theory, the most natural interpretation of (a b c) is 0→a, 1→b, 2→c. But it is also my personal opinion that with respect to the cube itself, the most natural interpretation of (a b c) is a→0, b→1, c→2. That's because when we think of applying a move such as F to a scrambled cube, it's the contents of the edge cubicles that get moved in the fashion (lf,uf,rf,df), not the lf, uf, rf, and df cubies themselves.
Consider any random, scrambled cube position X and apply F to it. Given X, most people can readily picture what XF will look like. But can you just as readily picture what FX will look like?
I think it's harder to picture going from X to XF than it is to picture going from X to FX. But to go from X to FX, it's necessary simply to permute the four edge cubies (not cubicles) in the fashion (lf,uf,rf,df), irrespective of where they are in the scrambled cube X (and of course we make a similar permutation for the corner cubies).
The multiplication of two permutations is a binary operation. But in a certain sense, any binary operation can be converted into a unary operation by subsuming one of the operands into the operation. For example, if we multiply the integer 2 by the integer 3 using plain old second grade multiplication, the answer is 6. So far, our operation appears to be quite binary. But consider the following.
0 × 3 = 0 1 × 3 = 3 2 × 3 = 6 3 × 3 = 9 4 × 3 = 12 . . .
If we so chose, we could think of what's going on as numerous applications of the binary operation "times". But we could also think of what's going on as numerous applications of the unary operation "times 3". And to be quite specific, we would really need to think of our unary operation as "times 3 on the right". In the case of second grade integer multiplication, this nicety isn't really required because integer multiplication commutes. But the multiplication of permutations does not in general commute. So if we wanted to apply this concept of making a binary operation into a unary one to the multiplication of permutations, we would have to be very careful to specify right multiplication or left multiplication.
The example will become tedious, but let's define the unary operation "times 2 on the left" according to the following table.
2 × 0 = 0 2 × 1 = 2 2 × 2 = 4 2 × 3 = 6 2 × 4 = 8 . . .
Obviously, the operation 2 × 3 appears in both tables. That simply means that we can regard the operation as consisting of left multiplying 3 by 2, and equivalently we can regard the operation as consisting of right multiplying 2 by 3. We get the same answer either way. The equivalence is not the same thing as commuting because 2 is on the left and 3 is on the right in either case. And, the equivalence has nothing to do with associativity.
The exact same concept applies to the multiplication of permutations. Suppose you have the permutations x and y and suppose you form the product xy. You can think of it as left multiplying y by x and you can equally well think of it as right multiplying x by y.
The question is, why do we care? To answer that question, I will use as an example a subset of S4 that is not a subgroup of S4. In particular, I will use the following subset of S4 which I will call A:
(a b c d) means 0→a, 1→b, 2→c, 3→d (0 1 2 3) (0 3 2 1) A = (2 3 0 1) (2 3 1 0) (3 2 0 1)
It will prove useful at times to think of this set of five permutations as a 5 × 4 matrix. That way, we can speak of rows and columns. For example, column 0 (the left-most column) contains 0, 0, 2, 2, 3.
Consider the permutation x = (1 3 2 0) which means 0→1, 1→3, 2→2, 3→0. We will form the product of x with A four different ways. Namely, we will form the products both as Ax and as xA, and we will form the product using both using the notation (a b c d) to represent the permutation 0→a, 1→b, 2→c, 3→d. and also using the notation (a b c d) to represent the permutation a→0, b→1, c→2 d→3. The results will be instructive.
Recall that when we use the notation (a b c d) to represent the permutation a→0, b→1, c→2 d→3. , the notation (1 3 2 0) is really the permutation x-1 rather than the permutation x. Similarly, when we use the notation (a b c d) to represent the permutation a→0, b→1, c→2 d→3. , what we have denoted as A is really the set of permutations A-1 rather than the set of permutations A. Here, we use A-1 to mean the set of all inverses of the permutations in A. We do not mean that A is a matrix with inverse A-1.
Case #1 : (a b c d) means 0→a, 1→b, 2→c 3→d. Calculate Ax. Essentially, the matrix can be thought of as having been "re-colored". The "pattern" or "structure" of the matrix is not changed by the permutation. The numbers have been colored to emphasize this point. Gray has become yellow, yellow has become pink, aqua has remained as aqua, and pink has become gray. (0 1 2 3) (1 3 2 0) (0 3 2 1) (1 0 2 3) (2 3 0 1) × (1 3 2 0) = (2 0 1 3) (2 3 1 0) (2 0 3 1) (3 2 0 1) (0 2 1 3)
Case #2 : (a b c d) means 0→a, 1→b, 2→c 3→d. Calculate xA. Essentially, the matrix can be thought of as having its columns permuted. The columns have been colored to emphasize this point. Column 0 has been moved to column 3, column 3 has been moved to column 1, column 1 has been moved to column 0, and column 2 has not been moved. (0 1 2 3) (1 3 2 0) (0 3 2 1) (3 1 2 0) (1 3 2 0) x (2 3 0 1) = (3 1 0 2) (2 3 1 0) (3 0 1 2) (3 2 0 1) (2 1 0 3)
Case #3 : (a b c d) means a→0, b→1, c→2 d→3. Calculate A-1x-1. Essentially, the matrix can be thought of as having its columns permuted. The columns have been colored to emphasize this point. Column 0 has been moved to column 3, column 3 has been moved to column 1, column 1 has been moved to column 0, and column 2 has not been moved. (0 1 2 3) (1 3 2 0) (0 3 2 1) (3 1 2 0) (2 3 0 1) × (1 3 2 0) = (3 1 0 2) (2 3 1 0) (3 0 1 2) (3 2 0 1) (2 1 0 3)
Case #4 : (a b c d) means a→0, b→1, c→2 d→3. Calculate x-1A-1. Essentially, the matrix can be thought of as having been "re-colored". The "pattern" or "structure" of the matrix is not changed by the permutation. The numbers have been colored to emphasize this point. Gray has become yellow, yellow has become pink, aqua has remained as aqua, and pink has become gray. (0 1 2 3) (1 3 2 0) (0 3 2 1) (1 0 2 3) (1 3 2 0) × (2 3 0 1) = (2 0 1 3) (2 3 1 0) (2 0 3 1) (3 2 0 1) (0 2 1 3)
It is no accident that Case #1 yields an answer that "looks the same" as Case #4, and that Case #2 yields an answer that "looks the same" as Case #3. Case #4 is calculating the inverse of Case #1, and Case #4 is using a representation of permutations that is the inverse of Case #1. Informally, the inverse of the inverse yields the same representation. Similar comments apply to Case #2 and Case #3.
The coloring that I did in the examples above is analogous to but is not identical to the coloring of the cube. Notwithstanding the imperfect coloring analogy, the examples above have helped me to understand a longstanding mystery.
In particular, while the coloring analogy above is somewhat imperfect when considering face moves on the cube, the coloring analogy is actually quite good when considering rotations and reflections of the cube. In their seminal paper about symmetry and local maxima in Cube-Lovers, Jim Saxbe and Dan Hoey described left multiplying a cube position by a rotation or a reflection as equivalent to a re-coloring of the cube. In turn, they described right multiplying a cube position by a rotation or a reflection as actually rotating or reflecting the cube. In my programs, I have always found it to be just the opposite.
I have never found it wise to disagree with Dan Hoey. At least 99% of the time that I disagree with him, I later find that I am wrong. But in this case, I think we are both right.
In my programs, I have always interpreted (a b c d) to mean 0→a, 1→b, 2→c, 3→d, because that's what makes sense to me mathematically. Having adopted that convention, left multiplying a cube position by a rotation or a reflection is actually rotating or reflecting the cube, and right multiplying a cube position by a rotation or a reflection is equivalent to a re-coloring of the cube.
But as I said earlier, interpreting (a b c d) to mean a→0, b→1, c→2, d→3 makes sense in terms of the cube. And when you make sense in terms of the cube, then it is left multiplying a cube position by a rotation or a reflection that is equivalent to a re-coloring of the cube, just like Dan Hoey said.
Jerry Bryan