Tuesday, October 7, 2014

Trying to use math to solve my problems

I recently encountered a moderately hard problem that required geometry to solve it. The basic idea is that I hit an API request that returns a list of points. These points when connecting the dots draw several irregular polygons:


But the idea behind these points is that people want to see it as a "Heatmap" where the data represents something round, not square. But if you apply a technique such as interpolating using Bezier curves, without pruning, looks something like this:


Definitely curvy, but not exactly what we are looking for. A previous colleague of mine encountered this problem before I did, and they tried to apply a strategy to discard points that are unneeded. This sounds easier said than done. From what I could tell, the strategy was to figure out a ratio of distances between 3 points. If the ratio exceeded a discard threshold, we would discard the middle point:


This turned out to be too aggressive and would prune points we wanted to keep:


If you look at the original picture, some of the polygons were just squares. If the polygon we were interpolating through was a regular convex polygon, like a square, we shouldn't remove any points. Using the Bezier interpolation from square points should make a more oval shape. But previous algorithm was converting the squares to an irregular shape for no reason.

I had to think of a new strategy. There needed to be a way to get rid of the inner jagged corners. I kept thinking there was some text book algorithm out there that already solved this problem, but I couldn't find one. The problem set I had seemed simple enough. All the points were a perimeter of a shape.

So since we are dealing with a perimeter, I don't think this problem is a convex hull problem. For a convex hull problem you are trying to remove all inner points. But it looked like all we are trying to do is remove bad jagged points.

I tweaked the pruning algorithm to just walked over every 3 points (current, previous, next). With three points you can determine two vectors: One going from the previous to current, and one going from the current to next. Using this, you can determine the dot product, which can tell you if they form a right angle.

So now I was able to determine which points formed a right angle, but you still need to figure out which right angles are bad. I tried to enumerate all the cases which were bad angles:


This was on the right path, but not exactly correct.  It turned out that sometimes the inverse behavior was observed.  The algorithm appeared to be pruning the outside angles instead of the inside angles.

Eventually I determined that the direction of the points matters.  If you generally are going in a clockwise direction, then you  should prune locally counter clockwise angles.  But if you are generally going in a counter clockwise direction, you need to prune locally clockwise angles.

If you took the same picture above and traversed it backwards and removed locally counter clockwise angles, you would remove the wrong middle points:


So it seemed that direction mattered.  I had to find an algorithm to figure out which direction the points were going.  I ended up finding something similar to the Shoelace Algorithm on this stack overflow answer.  When you calculate this, if the sums add up to a positive number you are going clockwise, and if its negative, you are going counter clockwise (this is flipped on a browser since the y axis is inverted).

Ok so, after using the shoelace algorithm to determine direction and the above clockwise or counter clockwise angles to determine pruning, the results still didn't give reasonable shapes:



Some shapes still looked weird.  After much debugging, the next best option available was to prune points of duplicate slope.  What was happening was that too many points were sequentially going in the same direction.  Removing the extra points and interpolating between the outer points looks something like this:

It still doesn't feel perfect, but these were the results:

The question I have now is:  is there an algorithm I'm missing here?  Has this problem been solved before and I don't have the right vocabulary to google hard enough?

From what I can see, for each polygon I draw, I iterate over N points to calculate the clockwise/counter clockwise direction.  Then I iterate N points again to prune inner angles.  Then iterate over a subset of N points (n) to remove duplicate slopes.  This totals up to N + N + n, or equals O(N).

So from a performance standpoint, I feel like the algorithm is solid.  From a math standpoint, I feel like I'm doing it wrong.  Maybe some sort of calculus or trigonometry needs to be applied?  Or is it good enough?  What do you think?