⇥ Optimizing image display using a bin-packing algorithm – Part deux
In the first blog post in this series, I introduced the concept of bin packing and showed how it can be used to optimize the arrangement of an arbitrary group of images so that they occupy the list amount of space possible.
The result, however, is—well, ugly, as those who had the guts to give my code a try will have found out. So, good concept, ugly output.
In this second post, I’m going to show you how to improve the quality of the image display and, in the process, we’re going to stretch the concept of bin packing a little in order to show that, despite its name, it’s not necessarily always about fitting boxes inside a bin.
Images are not boxes
There is a hidden assumption in the very basis of the bin-packing problem that I have not, so far discussed: the fact that all the objects that are to be packed are inflexible1.This is perfectly normal if you’re trying to pack a container full of chairs; images, on the other hand, are not physical objects, and can, therefore, be stretched without any major damage, so long as we (a) don’t resample them to a bigger size than the original and (b) we do not change their aspect ratio
By taking advantage of this feature, we could now decide to, say, shrink an image when we cannot find a fit, thus giving our algorithm a better chance at filling the screen. Using such a naïve approach, however, will simply result in an even more grotesque display than the one that came out of the first attempt. This is because our bin-packing algorithm starts from the largest image, which is likely to occupy a large portion of the available real estate; by the time we get to an image that doesn’t fit without resizing, we’ll have to shrink an already small image, which won’t work well.
Stretching in one dimension
What we could do, instead, is to fix the maximum size of an image in such a way as to give them a uniform appearance. This can be easily accomplished by just deciding on a maximum height and then arranging all the images in a row, allowing our page to grow and stretch vertically as the rows get filled optimally.This, incidentally, opens up a way to simplify our code. Since the height is now fixed, all we really need to do is pack the images horizontally, based exclusively on their width. Thus, we are still solving the bin-packing problem, except that, instead of dealing with two-dimensional objects, we now only have to worry about a single dimension—we are, for all intents and purposes, packing a line segment.
Dealing with a single dimension means that we need to track fewer pieces of data. In fact, as long as the images are packed along the line, we don’t really care about their position either—we’ll just add them to the horizontal row until it’s full, and then move on to a new row.
All in all, what used to be a complex bi-dimensional object whose position and area we were tracking has been reduced to a single, mono-dimensional line segment that is always anchored to a fixed point. This reduces our algorithm to simply tracking the width of each bin; effectively, every time we decide to place an image on a given row, we will just remove a segment from the corresponding bin equal to the image’s width.
It’s important to keep in mind that, despite the significant simplification to the process, we are still packing bins, which should result in a final arrangement of the images that is as close to optimal as possible (as opposed to, say, just shoving the images one next to the other)2. We have just created some additional constraints that have simplified our algorithm considerably.
Great power, and great separation
With this approach comes an additional consequence that is going to come in very handy in Part 3. The structure that we now use to track the space occupied by each image no longer has any real physical connection with the real estate it represents on the screen. We are tracking a row number and the width that hasn’t been used, but, on the screen, we will be drawing bi-dimensional objects that will, in addition to being placed inside some sort of container, have to be arranged so that they are equidistant and properly aligned.In other words, we have separated the logical representation of a bin from its physical appearance. This means that, given the appropriate set of circumstances, we can now discard the physical representation of a row and reconstruct it without any loss of information from the logical data that we have saved. We are not quite there yet, because we are not quite tracking which images we are placing in a particular bin.
In Part 3, we will extend this concept (which separates the model from the view and is at the basis of a model-view-controller architecture) in such a way that we will be able to infinitely extend our real estate vertically while only ever loading as many images as necessary to fill our screen, thereby saving considerable amounts of memory and providing a fast and precise scrolling experience.
May the best fit win
We still haven’t solved the problem of dealing with more than eight images, which is the most we can extract from Google at any given time3. This is a bigger problem than it seems—a best-fit bin-packing algorithm only works when you can determine the tracking data (that is, the value that determines the best fit) ahead of time. This works while we’re dealing with eight images—but what if we wanted to load another eight?There are two possible solutions here. The first is to perform a complete layout every time an image is added or removed from the list. This always gives us a best-fit optimization, but it has two significant drawbacks: it’s computationally intensive (especially once we have a lot of images to deal with) and it is likely to move existing images around to pack more efficiently, thus confusing the user, who will see things move without their input.
The second solution is to pack each group of images that are downloaded independently of the others, adding a new group to the previous one. This approach has the advantage of being computationally efficient and of avoiding to move images without the user’s intervention, but also the drawback of being only locally efficient—the end result will be a layout that will be less optimized than if we had had all the images from the very beginning. Alas, beggars can’t be choosers, but that’s a problem that we’ll deal with in Part 3.
The code, part deux
Finally, before moving on to the code, we still need to lay the images out properly, so that they will be spaced evenly across each row and so that they will be centered vertically. In Flex, we can do this by using an HBox, which automatically positions its children horizontally. To guarantee even spacing, we can use a Spacer between the images whose width we set to 100 percent; this will cause the Spacers to take up as much horizontal room as possible, thus spacing out the images evenly (we also add a Spacer at the beginning and end of each row to create a bit of margin).protected var binContainer:Array = [];
protected function getBinContainerAtIndex(index:int):HBox { while (index >= binContainer.length) { var hbox:HBox = new HBox();
hbox.percentWidth = 100; hbox.height = 150; addChild(hbox);
hbox.addChild(safeSpacer());
binContainer.push(hbox); }
return binContainer[index];}
As you can see above, we keep an array of row containers (which we don’t currently use, but will in Part 3), which are just properly-size and styled HBoxes. The getBinContainerAtIndex() method grows this array as needed—thus giving us the beginning of our bottomless page display.
Finally, our fitImagesInClientArea() method also has to change to reflect our new packing strategy:
protected function fitImagesInClientArea():void {
// Create the main bin
var bins:Array = [new Bin(width, 0)];
// Sort the images
boxes.sort(function(a:Box, b:Box):int { if (a.width > b.width) { return -1; }
if (b.width > a.width) { return 1; }
return 0; });
// Iterate through each image, finding the best possible // bin for it based on its area.
for each (var box:Box in boxes) { var bestFit:Bin = null;
for each (var bin:Bin in bins) { if (bin.width >= box.width) { // The box fits in the width of the bin. But is // it a better fit?
if (!bestFit || (bestFit.width - box.width > bin.width - box.width)) { bestFit = bin; } } }
if (bestFit) {
// Position the box in the bin and add it to the screen
var hbox:HBox = getBinContainerAtIndex(bestFit.rowNumber);
hbox.addChild(box); hbox.addChild(safeSpacer());
// Reclaim any unused space bestFit.width -= box.width; } else { trace("Unable to fit box with width: " + box.width + " and height: " + box.height); } }
}
Here, we have changed the Bin implementation to only track the row number and a width4, and the packing algorithm has become similarly simpler—in fact, the bin-splicing operation now only consists of removing the width of the new image from the overall width of the bin.
The results, compared to the Part 1 attempts, are dramatically better. You can try it for yourself, and take a look at the current code in the mono-packing tag of the GitHub repo for the project.
- In reality, we’re also assuming that they’re rectangles—but that’s fine for our purposes. ↩
- This mono-dimensional approach, incidentally, is one way of defragmenting a hard drive: one first collects all the fragmented files so that they are contiguous, sorts them in descending order by some arbitrary value—for example, how often they are used—and then bin-packs them by another arbitrary best-fit value, such as how quickly the hard drive can read them into memory, thus optimizing the apparent speed of the filesystem. ↩
- In my project, I get images from a different source—I switched to Google here just so that I could build a public demo. Therefore, it’s entirely possible that I am misreading the docs for Google’s API and it is possible to get more than eight results at a time. If that’s the case, I’d appreciate a comment from someone who knows better! ↩
- If you remember, in Part 1 I mentioned that we would be using the Flex built-in Rectangle class, but, as it turns out, that’s unnecessary. ↩

