Mosaic Making

This page is a work in progress!

With enough decks in my collection, I increasingly wanted to show them off (both on my shelves and in pictures) in a more pleasing manner than just randomly assorted. I could manually arrange the cards into some grid, but doing so is tedious. Plus, what happens when I get more decks? What happens when I want to change the ordering? I needed a better solution.

To start, I took high-quality pictures of each deck, doing my best to reduce reflections and flaring and capture the true colors of each deck. I then cropped each to the full front face of the tuck box.

The problem here is that each deck is a slightly different size, resulting in different crops for each. Plus, I have two double-decks that are about twice as wide as a normal deck. To that end, I decided to reshape each to a consistent size and aspect ratio. This makes them far easier to display nicely; the result is shown on the right.
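As a rough illustration of reshaping every crop to one size, here is a dependency-free nearest-neighbor resize in NumPy. It is only a sketch: the function name `resize_nearest` is made up here, and a real pipeline would more likely use an image library's resize with proper interpolation. Note that forcing one output shape stretches or squashes crops with a different aspect ratio, such as the double-decks.

```python
import numpy as np

def resize_nearest(image, out_h, out_w):
    """Resize an (height, width, 3) array by nearest-neighbor sampling.

    The aspect ratio is NOT preserved: the image is stretched to fit.
    """
    image = np.asarray(image)
    h, w = image.shape[:2]
    # For each output row/column, pick the source row/column it maps to.
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return image[rows[:, None], cols]
```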

Unfortunately, these are ordered essentially at random, following the order in which I took the photos. While some might appreciate the chaos of this mosaic, I don't think it's particularly pleasing to look at. So my next step was to arrange them by their color schemes.

Also, there are 87 decks here, so an 8x11 grid leaves one cell empty, which I fill in with white.
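A minimal sketch of assembling equally sized tiles into a grid and padding the leftover cells with white. The function name `build_mosaic` is hypothetical, and placing the empty cell at the end of the grid is an assumption made for this example.

```python
import numpy as np

def build_mosaic(tiles, n_cols, fill=255):
    """Arrange equally sized RGB tiles into a grid, row by row.

    tiles: array of shape (n_tiles, height, width, 3).
    Cells left over at the end of the grid are filled with `fill` (white).
    """
    tiles = np.asarray(tiles)
    n, h, w, c = tiles.shape
    n_rows = -(-n // n_cols)              # ceiling division
    n_empty = n_rows * n_cols - n
    padding = np.full((n_empty, h, w, c), fill, dtype=tiles.dtype)
    cells = np.concatenate([tiles, padding])
    grid = cells.reshape(n_rows, n_cols, h, w, c)
    # (rows, cols, h, w, c) -> (rows, h, cols, w, c) -> one big image
    return grid.transpose(0, 2, 1, 3, 4).reshape(n_rows * h, n_cols * w, c)
```

With 87 tiles and `n_cols=11` (assuming the 8x11 grid means 8 rows of 11), this gives 88 cells, the last of which is white.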

Extracting the colors from an image is simple: each pixel is an array of three numbers, (R, G, B), giving the amounts of red, green, and blue in the pixel. In my case, because these were originally JPGs, there is no fourth value (A for alpha, the opacity of the pixel). But how can I take all of these color values and extract some kind of color palette?

My initial thought was to simply take the average of all the red, green, and blue channels and get some kind of "average" color for the entire deck. For decks in which one or two colors dominate, this works like a charm. But for decks with many colors, I would often end up with some shade of black, white, grey, or brown. I tried using the median, to marginally better results.
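To see why averaging washes out multi-colored decks, consider a tiny made-up image that is half pure red and half pure cyan. This is a toy example with a hypothetical helper `mean_color`, not one of the actual deck photos.

```python
import numpy as np

def mean_color(image):
    """Average R, G, and B over all pixels of an (height, width, 3) array."""
    return np.asarray(image, dtype=float).reshape(-1, 3).mean(axis=0)

# Hypothetical 2x2 "deck": left column pure red, right column pure cyan.
image = np.array([[[255, 0, 0], [0, 255, 255]],
                  [[255, 0, 0], [0, 255, 255]]], dtype=np.uint8)

print(mean_color(image))  # [127.5 127.5 127.5]
```

The average is a mid grey that appears nowhere in the image, which is the same effect that turns colorful decks into muddy greys and browns.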

So, what's the solution? Enter K-Means Clustering!

K-Means Clustering is a simple, widely used algorithm for grouping sets of data. It mimics the way humans pick out groups in data quite well, and it works on an arbitrary number of dimensions. I won't bore you with a wall of text about how it works (you can read here for that), so let's just look at an example, of sorts. In the images below, I've made plots showing the RGB values for all the pixels in the image: one for reds vs greens, one for reds vs blues, and one for blues vs greens.

For an image that's entirely white, you would expect to see all the points collect at the top right of each plot: white (in this context) is the combination of all the colors at high intensity. Conversely, a black image would have points entirely in the bottom left: black (in this context) is the lack of any color. And if we combine those, an image with both black and white pixels, like the "Painter & Ghost" deck, should show data at both corners, which it does!

Finding clusters in each plot by eye is simple. For example, "Painter & Ghost" has two clusters: one black, one white-ish. "The Beatles" and "Grateful Dead" decks have lots of clusters scattered throughout, some big, some small. But now what if I asked you to find exactly 2 or 4 or 10 clusters in each plot? That's not so simple. Even more complex is finding any clusters for all three colors at once. How do you know which blue pixels belong to which red and green values at the same time? This is where human ability breaks down and a computer must take over.

K-Means Clustering looks at all three dimensions (or more, for other problems) at once and finds some number of clusters within those data. Some algorithms allow this number of clusters to vary, balancing the number of clusters with how bunched together they are. In my case, I force it to find a certain number of clusters. This allows me to find the black, white, and grey colors and filter them out, if desired.

Long story short, I can run this algorithm for each image and extract an n-color palette from each. I typically use a 6-color palette: fewer colors runs the risk of missing colors, while more often ends up with slightly different shades of the same colors.
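The page does not say which K-Means implementation was used, so the following is only a sketch of the idea: plain Lloyd's algorithm written in NumPy, with a fixed number of clusters. The function name `kmeans_palette` and its parameters are made up for this example; starting the clusters from distinct pixel colors is a choice made here to keep the sketch simple.

```python
import numpy as np

def kmeans_palette(pixels, k=6, n_iter=20, seed=0):
    """Return (palette, fractions) for an image's pixels.

    palette:   cluster colors, most common first, shape (k, 3)
    fractions: share of pixels assigned to each cluster
    """
    pixels = np.asarray(pixels, dtype=float).reshape(-1, 3)
    rng = np.random.default_rng(seed)
    # Start from k distinct colors so no two clusters begin identical.
    unique = np.unique(pixels, axis=0)
    k = min(k, len(unique))
    centers = unique[rng.choice(len(unique), size=k, replace=False)]

    def assign(centers):
        # Squared distance from every pixel to every center: (n_pixels, k)
        dists = ((pixels[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return dists.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centers)
        for j in range(k):
            members = pixels[labels == j]
            if len(members):              # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)

    counts = np.bincount(assign(centers), minlength=k)
    order = np.argsort(-counts)           # most common color first
    return centers[order], counts[order] / len(pixels)
```

The distance array holds one value per pixel per cluster, so this sketch is best run on downsampled images.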

Each image on the right is the deck itself with its color palette next to it, with the height of each bar corresponding to the fraction of pixels in the image that belong to that color cluster. The number of clusters varies, just to illustrate the effect that number can have.

Except for the simply-colored decks, a 3-color palette just doesn't capture the true colors at all. The bright greens and oranges of "The Beatles" deck get smeared into an ugly greenish-brown and the cyan and yellow of the "Grateful Dead" into either the light grey or muted pink. But then the 10-color palette picks up far too many shades of the same colors, especially for "Painter & Ghost"; how many shades of light grey do we need? Again, the 6-color palette feels like the best balance of finding the right number and shade of colors.

Doing this for each and every deck, I wind up with... (Click here for a full-res version)

Great! Now I have color palettes (and corresponding color frequencies) for each image. Now comes the tricky part, the part that I've definitely not perfected: arranging the images according to those color palettes. To make a long story short, I tried a few different things but have settled onto simply sorting everything according to

The process is as follows:

1. Reshape the rectangular array of images/palettes into a 1D array. This makes sorting and arranging things much simpler.

2. Next, define some "score" for each color in the palette for each image.

- This could be something as simple (yet effective) as total brightness, R + G + B. Lighter decks get a higher score while darker decks get a lower score.
- It could be the difference between the red and blue values, R - B, so red decks score high whereas blue decks score low. This has the aesthetically pleasing effect of separating red and blue nicely, but it has trouble separating shades of grey, so those get jumbled up in a rather displeasing way.
- Or how about a mix of the two, where red and bright are favored, but blue and dark are disfavored? Sort by 4R + G - 4B. This is admittedly not much different from the above, but it definitely helps keep some of the random darker decks from appearing too early. By eye, this is the best way I've come up with to sort the 1D array of decks.

3. Now, gather these scores for all 6 colors in the palette of each image, making sure to scale by the fractional occurrence of that color. Here I use the scoring method from immediately above, score = 4R + G - 4B.

All of the palettes. Each column is a palette, and each row is a color in that palette, with the most common on top.

The scores for each color. Dark pixels (blue, dark decks) have low scores while light pixels (red, white decks) have higher scores.

The scaled scores for each color. Recall that the top pixels are more prevalent in the image, and are weighted more heavily.

4. Combine the scores for each column (i.e. each deck) and sort from highest scores (redder, brighter decks) to lower scores (bluer, darker decks). The result? Hard to see much detail, but it clearly trends from red + bright on the left to blue + dark on the right. Sure, there are some decks with a generally red color scheme towards the right side, but they're dominated by darker colors and are thus suitable there.

5. Finally, I need to reshape this 1D array into some 2D array whose size is basically arbitrary (though I'll choose 15 columns like further above). Traditional ways of reshaping 1D arrays wouldn't exactly order things the way I want; it would basically be sliced into chunks and stacked on top of one another. See the figure on the left. I want to more carefully order the images, essentially in diagonals, like the order shown on the right.
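Steps 2 through 5 can be sketched roughly as follows. This is a NumPy sketch, not the code behind the figures: the function names are made up here, the palettes are assumed to be stored as an (n_decks, n_colors, 3) array with matching pixel fractions, and the exact diagonal traversal (starting at the top left, with the empty cells taking the first positions) is a guess based on the description.

```python
import numpy as np

def color_score(rgb):
    """Step 2: score = 4R + G - 4B (red/bright high, blue/dark low)."""
    r, g, b = np.moveaxis(np.asarray(rgb, dtype=float), -1, 0)
    return 4 * r + g - 4 * b

def deck_scores(palettes, fractions):
    """Steps 3-4: weight each color's score by its pixel fraction, sum per deck.

    palettes:  (n_decks, n_colors, 3)
    fractions: (n_decks, n_colors)
    """
    return (color_score(palettes) * np.asarray(fractions)).sum(axis=1)

def sort_decks(palettes, fractions):
    """Step 4: deck indices ordered from highest to lowest combined score."""
    return np.argsort(-deck_scores(palettes, fractions), kind="stable")

def diagonal_layout(n_items, n_rows, n_cols):
    """Step 5: grid of positions in the sorted 1D order, filled along
    anti-diagonals from the top left. Empty cells are marked with -1."""
    n_empty = n_rows * n_cols - n_items
    if n_empty < 0:
        raise ValueError("grid is too small for the number of items")
    # Visit cells diagonal by diagonal (constant row + column), top to bottom.
    cells = sorted(((r, c) for r in range(n_rows) for c in range(n_cols)),
                   key=lambda rc: (rc[0] + rc[1], rc[0]))
    grid = np.full((n_rows, n_cols), -1, dtype=int)
    for position, (r, c) in enumerate(cells[n_empty:]):
        grid[r, c] = position
    return grid
```

Each entry of the returned grid is a position in the sorted order, so `sort_decks(...)[grid[r, c]]` gives the deck that belongs in cell (r, c) whenever `grid[r, c]` is not -1.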

And finally, the "perfectly" sorted mosaic of my playing cards looks like this!

I'm quite pleased with this method, but it's far from perfect.

Pros

It's pretty quick. For 87 images it takes a minute or two to do everything, including reading and downsampling the images, extracting the color palettes via K-Means, scoring, sorting, and reshaping. Half of that time comes from Spyder displaying the mosaics. I could disable that and get it done even faster.
Works for an arbitrary number of images and dimensions. As I collect more cards, I can take pictures and expand these mosaics with ease. I can also change the aspect ratio however I see fit. I could even do this for my thousands of other pictures.
The final result is pretty great. There is an obvious gradient in the above image, and at first glance it looks like something I would make myself.

Cons

The empty cells. With most numbers of images, the mosaic will have some empty spots. Putting them in the top left works well enough here, but what if I wanted to arrange the cards not by brightness but by color alone? I want to experiment with making them some color instead of white and placing them near like-colored decks so they blend in nicely.
It's only pretty great, not perfect. Upon close inspection, there are clear outliers that would be easy to move to more logical places by hand, but are not easy to find computationally.
- For example, the white/green "Bicycle Snowman" deck in the lowest row could switch with either the "Erotica" deck or its neighbor, creating a more pleasing collection of green decks towards the top right.
- I've experimented with minimizing a global "color smoothness" parameter by switching random cells, but have yet to find a technique that doesn't introduce outliers in other ways.
The sorting method is too simple. Stratifying by reds and blues works well here, but what if I get a bunch of green decks? I'll then have to change the sorting method, which is tedious and unreliable. I need either a single, robust sorting method, or a way to automatically choose which colors/shades to stratify.
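For the "color smoothness" idea mentioned in the cons above, one possible form is a greedy search that swaps random cells and keeps only the swaps that reduce the total color difference between neighbors. Everything here is an assumption made for illustration: the smoothness measure, the function names, and the use of a single representative color per cell. Being greedy, it can get stuck in arrangements that still contain outliers.

```python
import numpy as np

def smoothness(grid_colors):
    """Sum of color distances between horizontally and vertically adjacent cells.

    grid_colors: (n_rows, n_cols, 3), one representative color per cell.
    Lower values mean a smoother mosaic.
    """
    g = np.asarray(grid_colors, dtype=float)
    horizontal = np.linalg.norm(g[:, 1:] - g[:, :-1], axis=-1).sum()
    vertical = np.linalg.norm(g[1:, :] - g[:-1, :], axis=-1).sum()
    return horizontal + vertical

def swap_cells(grid, a, b):
    """Swap the colors of cells a and b in place; a and b are (row, col)."""
    (ra, ca), (rb, cb) = a, b
    grid[[ra, rb], [ca, cb]] = grid[[rb, ra], [cb, ca]]

def swap_search(grid_colors, n_steps=1000, seed=0):
    """Try random swaps, keeping only those that lower the smoothness value."""
    rng = np.random.default_rng(seed)
    grid = np.array(grid_colors, dtype=float)
    n_rows, n_cols = grid.shape[:2]
    best = smoothness(grid)
    for _ in range(n_steps):
        i, j = (int(x) for x in rng.integers(0, n_rows * n_cols, size=2))
        a, b = divmod(i, n_cols), divmod(j, n_cols)
        swap_cells(grid, a, b)
        cost = smoothness(grid)
        if cost < best:
            best = cost
        else:
            swap_cells(grid, a, b)        # undo a swap that did not help
    return grid, best
```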
