Wednesday, June 07, 2006

Convex Hull

I don't know how do mathematicians define a Convex Hull, but informally, the (planar) convex hull of a set of point on the plain is defined as the smallest convex polygon that includes all the points. If you think about it, the verteces of such polygon will have to be in the given set, and the convex hull will be unique (if we add the restriction that no three consecutive vertecs of the resulting hull can be on the same line. Which is not much of a "restriction", because if they are, we just simply remove the middle point of the three and voila!) So, the problem becomes finding the smallest area, convex polygon one can construct with the vertices from the set that encompasses all the given points. Finding a convex hull is an interesting problem by itself, and quite useful in many other planar geometry and pattern recognition problems. Now, many algorithms exist for finding the convex hull efficiently but not too simple (except the divide and conquer one which I'm not going to go into here.) When it comes to implementing, most people may try the naive O(n2), which is not easy to get right in one go (if you're that kind of person who tries that and gets it right in the first try, leave my weblog immidiately. I can't stand people better than I here!) The other easy-to-comprehend-but-hard-to-implement method is "Graham's Scan", but that needs a special and none-trivial sort. Graham's Scan runs in O(n log n). The method I'm going to discuss here is a O(n log n), and easy to implement algorithm. I first saw it in a Python book, the title of which I can't remember. It starts with a straightforward sorting of the points, based on the X coordinate then the Y coordinate. Then, we'll start from the leftmost point (least X) and work our way to the rightmost point, maintaining two lists of points. One is the botton run of the points between the min- and max-X points, and the other is the top run. Here's C++ code that implements the algorithm:
#define TURN_DIR(p1,p2,p3)  (p1.x * p2.y - p1.y * p2.x + \
                             p2.x * p3.y - p2.y * p3.x + \
                             p3.x * p1.y - p3.y * p1.x)
#define LAST(cntnr)         (cntnr).back()
#define BEFORE_LAST(cntnr)  (cntnr)[(cntnr).size() - 2]

vector<Point> ConvexHull (vector<Point> & pts)
{
    sort (pts.begin(), pts.end());

    vector<Point> lower, upper;
    for (unsigned i = 0; i < pts.size(); ++i)
    {
        while (lower.size() >= 2 &&
         TURN_DIR(BEFORE_LAST(lower), LAST(lower), pts[i]) <= 0
        )
            lower.pop_back ();
        while (upper.size() >= 2 &&
         TURN_DIR(BEFORE_LAST(upper), LAST(upper), pts[i]) >= 0
        )
            upper.pop_back ();

        lower.push_back (pts[i]); upper.push_back (pts[i]);
    }

    lower.insert (lower.end(), upper.rbegin() + 1, upper.rend() - 1);
    return lower;
}
You should be able to make sense of this code without much difficulty, but be warned, I changed my implemented code to put it here, and I have not tested it (my data structures for points and for return values were different, and I removed the comments! >:-) ) So use at your own risk. Now, the challenge is this. Can you make it shorter? (It's possible to use recursion to do so, I think.)

No comments: