Tuesday, October 7, 2014

Trying to use math to solve my problems

I recently encountered a moderately hard problem that required geometry to solve. The basic idea is that I hit an API that returns a list of points. When you connect the dots, these points draw several irregular polygons:

But the idea behind these points is that people want to see them as a "heatmap", where the data represents something round, not square. If you apply a technique such as interpolating with Bezier curves, without pruning, the result looks something like this:

Definitely curvy, but not exactly what we are looking for. A former colleague of mine encountered this problem before I did, and they tried to apply a strategy to discard points that are unneeded. This is easier said than done. From what I could tell, the strategy was to figure out a ratio of distances between 3 points. If the ratio exceeded a discard threshold, we would discard the middle point:

This turned out to be too aggressive and would prune points we wanted to keep:

If you look at the original picture, some of the polygons were just squares. If the polygon we were interpolating through was a regular convex polygon, like a square, we shouldn't remove any points. Using the Bezier interpolation on the points of a square should make a more oval shape. But the previous algorithm was converting the squares to an irregular shape for no reason.
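To make the square case concrete, here is a minimal sketch of one common way to round off a closed polygon with Bezier curves. The post doesn't say which Bezier construction was actually used, so this is an assumption: each vertex acts as a quadratic Bezier control point and the edge midpoints are the on-curve points. The name `smoothClosedPath` and the `{x, y}` point shape are made up for illustration.

```javascript
// Hypothetical helper: build an SVG path string that rounds a closed polygon.
// Each vertex is used as a quadratic Bezier control point, and the midpoints
// of the edges are the points the curve actually passes through.
function smoothClosedPath(points) {
  const n = points.length;
  const mid = (a, b) => ({ x: (a.x + b.x) / 2, y: (a.y + b.y) / 2 });

  // Start halfway along the edge that leads into the first vertex.
  const start = mid(points[n - 1], points[0]);
  let d = `M ${start.x} ${start.y}`;

  for (let i = 0; i < n; i++) {
    const control = points[i];
    const end = mid(points[i], points[(i + 1) % n]);
    d += ` Q ${control.x} ${control.y} ${end.x} ${end.y}`;
  }
  return d + ' Z';
}

const square = [
  { x: 0, y: 0 }, { x: 10, y: 0 }, { x: 10, y: 10 }, { x: 0, y: 10 },
];

// prints "M 0 5 Q 0 0 5 0 Q 10 0 10 5 Q 10 10 5 10 Q 0 10 0 5 Z"
console.log(smoothClosedPath(square));
```

With all four corners kept, the square turns into a symmetric rounded shape. Drop one corner and the symmetry is gone, which is why pruning points from a plain square is a bad idea.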

I had to think of a new strategy. There needed to be a way to get rid of the inner jagged corners. I kept thinking there was some textbook algorithm out there that already solved this problem, but I couldn't find one. The problem set I had seemed simple enough. All the points lay on the perimeter of a shape.

So since we are dealing with a perimeter, I don't think this problem is a convex hull problem. For a convex hull problem you are trying to remove all inner points. But it looked like all we are trying to do is remove bad jagged points.

I tweaked the pruning algorithm to just walk over every 3 points (current, previous, next). With three points you can determine two vectors: one going from the previous to current, and one going from the current to next. Using these, you can compute the dot product, which tells you if they form a right angle (the dot product of two perpendicular vectors is zero).
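A minimal sketch of that check in JavaScript, assuming points are `{x, y}` objects. The function name `isRightAngle` and the `epsilon` tolerance are made up for illustration.

```javascript
// Returns true when the corner prev -> curr -> next is a right angle.
// The two vectors are (prev -> curr) and (curr -> next); perpendicular
// vectors have a dot product of zero.
function isRightAngle(prev, curr, next, epsilon = 1e-9) {
  const ax = curr.x - prev.x;
  const ay = curr.y - prev.y;
  const bx = next.x - curr.x;
  const by = next.y - curr.y;

  // A zero-length vector (duplicate point) would also give a dot product
  // of zero, so rule that case out explicitly.
  if ((ax === 0 && ay === 0) || (bx === 0 && by === 0)) {
    return false;
  }
  return Math.abs(ax * bx + ay * by) <= epsilon;
}

console.log(isRightAngle({ x: 0, y: 0 }, { x: 1, y: 0 }, { x: 1, y: 1 })); // true
console.log(isRightAngle({ x: 0, y: 0 }, { x: 1, y: 0 }, { x: 2, y: 0 })); // false
```

Note that the dot product only says the corner is square. It says nothing about which way the corner turns, which is the next problem.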

So now I was able to determine which points formed a right angle, but you still need to figure out which right angles are bad. I tried to enumerate all the cases which were bad angles:

This was on the right path, but not exactly correct.  It turned out that sometimes the inverse behavior was observed.  The algorithm appeared to be pruning the outside angles instead of the inside angles.

Eventually I determined that the direction of the points matters. If you generally are going in a clockwise direction, then you should prune locally counter clockwise angles. But if you are generally going in a counter clockwise direction, you need to prune locally clockwise angles.
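Here is a rough sketch of that rule, assuming points are `{x, y}` objects and that the overall direction is already known and passed in. The local turn direction is taken from the sign of the 2D cross product, which is my assumption about how to detect it (the post doesn't spell out the mechanism). The names `turn` and `pruneInnerRightAngles` are hypothetical, and each corner is judged against its original neighbours in a single pass.

```javascript
// Z component of the cross product of (prev -> curr) and (curr -> next).
// With the y axis pointing up, positive means a locally counter clockwise
// turn and negative means a locally clockwise turn. On a y-down canvas the
// visual sense is mirrored.
function turn(prev, curr, next) {
  return (curr.x - prev.x) * (next.y - curr.y)
       - (curr.y - prev.y) * (next.x - curr.x);
}

function dot(prev, curr, next) {
  return (curr.x - prev.x) * (next.x - curr.x)
       + (curr.y - prev.y) * (next.y - curr.y);
}

// Drops right-angle corners whose local turn opposes the overall direction.
function pruneInnerRightAngles(points, overallClockwise, epsilon = 1e-9) {
  const n = points.length;
  return points.filter((curr, i) => {
    const prev = points[(i - 1 + n) % n];
    const next = points[(i + 1) % n];
    const t = turn(prev, curr, next);
    const isRight = Math.abs(dot(prev, curr, next)) <= epsilon;
    const opposes = t !== 0 && (t < 0) !== overallClockwise;
    return !(isRight && opposes);
  });
}

// An L shape listed counter clockwise (y up); (1, 1) is the inner corner.
const lShape = [
  { x: 0, y: 0 }, { x: 2, y: 0 }, { x: 2, y: 1 },
  { x: 1, y: 1 }, { x: 1, y: 2 }, { x: 0, y: 2 },
];

console.log(pruneInnerRightAngles(lShape, false).length); // 5
```

Passing the wrong overall direction flips the result: the same L shape with `overallClockwise` set to `true` keeps only the inner corner and throws away the five outer ones.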

If you took the same picture above and traversed it backwards and removed locally counter clockwise angles, you would remove the wrong middle points:

So it seemed that direction mattered. I had to find an algorithm to figure out which direction the points were going. I ended up finding something similar to the Shoelace Algorithm in this Stack Overflow answer. When you calculate this, if the sums add up to a positive number you are going clockwise, and if it's negative, you are going counter clockwise (this is flipped in a browser since the y axis is inverted).
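A minimal sketch of that direction check, assuming the formula in question is the commonly cited edge sum of (x2 - x1) * (y2 + y1) over every edge of the polygon. The names `edgeSum` and `isClockwise`, and the `yAxisDown` flag for browser coordinates, are made up for illustration.

```javascript
// Sums (x2 - x1) * (y2 + y1) over every edge, wrapping from the last point
// back to the first. The result is twice the signed area of the polygon.
function edgeSum(points) {
  let sum = 0;
  for (let i = 0; i < points.length; i++) {
    const a = points[i];
    const b = points[(i + 1) % points.length];
    sum += (b.x - a.x) * (b.y + a.y);
  }
  return sum;
}

// With the y axis pointing up, a positive sum means clockwise. In browser
// coordinates (y axis pointing down) the meaning of the sign is flipped.
function isClockwise(points, yAxisDown = false) {
  const sum = edgeSum(points);
  return yAxisDown ? sum < 0 : sum > 0;
}

const ccwSquare = [
  { x: 0, y: 0 }, { x: 1, y: 0 }, { x: 1, y: 1 }, { x: 0, y: 1 },
];

console.log(edgeSum(ccwSquare));           // -2
console.log(isClockwise(ccwSquare));       // false
console.log(isClockwise(ccwSquare, true)); // true
```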

Ok so, after using the shoelace algorithm to determine direction and the above clockwise or counter clockwise angles to determine pruning, the results still didn't give reasonable shapes:

Some shapes still looked weird.  After much debugging, the next best option available was to prune points of duplicate slope.  What was happening was that too many points were sequentially going in the same direction.  Removing the extra points and interpolating between the outer points looks something like this:

It still doesn't feel perfect, but these were the results:

The question I have now is:  is there an algorithm I'm missing here?  Has this problem been solved before and I don't have the right vocabulary to google hard enough?

From what I can see, for each polygon I draw, I iterate over N points to calculate the clockwise/counter clockwise direction. Then I iterate over N points again to prune inner angles. Then I iterate over the remaining subset of n points to remove duplicate slopes. This totals N + N + n steps, which is O(N).
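As a rough illustration of that last pass, here is a sketch of dropping points of duplicate slope in a single walk over the points. It assumes `{x, y}` points on a closed perimeter and treats a point as redundant when it is collinear with its neighbours and the path keeps heading the same way. The name `removeDuplicateSlopes` and the `epsilon` tolerance are hypothetical.

```javascript
// Removes points that sit on a straight run between their two neighbours.
// A zero cross product means the two segments share a slope; a positive dot
// product means the path continues in the same direction rather than
// doubling back on itself.
function removeDuplicateSlopes(points, epsilon = 1e-9) {
  const n = points.length;
  return points.filter((curr, i) => {
    const prev = points[(i - 1 + n) % n];
    const next = points[(i + 1) % n];
    const ax = curr.x - prev.x;
    const ay = curr.y - prev.y;
    const bx = next.x - curr.x;
    const by = next.y - curr.y;
    const cross = ax * by - ay * bx;
    const dot = ax * bx + ay * by;
    return !(Math.abs(cross) <= epsilon && dot > 0);
  });
}

const withExtras = [
  { x: 0, y: 0 }, { x: 1, y: 0 }, { x: 2, y: 0 },
  { x: 3, y: 0 }, { x: 3, y: 3 }, { x: 0, y: 3 },
];

// prints [{"x":0,"y":0},{"x":3,"y":0},{"x":3,"y":3},{"x":0,"y":3}]
console.log(JSON.stringify(removeDuplicateSlopes(withExtras)));
```

Each point is compared only with its immediate neighbours, so the pass stays linear in the number of points.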

So from a performance standpoint, I feel like the algorithm is solid.  From a math standpoint, I feel like I'm doing it wrong.  Maybe some sort of calculus or trigonometry needs to be applied?  Or is it good enough?  What do you think?

Monday, February 17, 2014

Keyword/Named arguments in programming languages that don't support it

Let's say you come from a scripting background like Python or Perl. Those languages have a cool feature called keyword arguments, which essentially allows you to pass a hash/dictionary of key/values to a function without declaring an object:

This language feature saves you the effort of creating an on the fly data structure to pass arguments to a function:

This is essentially what you have to do in JavaScript to accomplish keyword args:

But what if you wanted to not have to always create an on the fly object? What other tools do we have available? Well, you could still use variable arguments:

So, knowing this, we could write a shim for JavaScript to parse named args with arguments:

You may be thinking, "Big deal, I don't need to use arguments, I can just pass on the fly objects in JavaScript thank you very much."
But this example was really a fake out. You can actually use this example with a non-scripting language, like Java, since it also allows you to use variable length arguments:

Why would you want something like this in Java? Well, let's say you want to create a function that creates a test object for a JUnit test. The object is essentially a POJO, but you don't want to write code like this:

Wouldn't it be better if you could write it like this?

Then all you need to do is create the getTestObject method. One caveat is that you either have to look up the public setters by name, or you have to expose the private variables. I am for exposing private variables since we are in test land. You probably shouldn't change field access control in production code.

So with a little bit of utility code, you can create static constructor functions that use variable length Object arguments to simulate named arguments. I'm sure there are Java purists who will say this violates object oriented programming principles, but I argue that this is just for test code and is just a means to an end of creating on-the-fly objects with less typing/effort.
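Going back to the JavaScript shim described earlier, here is a minimal sketch of what parsing named args out of `arguments` could look like. It assumes callers pass alternating name/value pairs. The names `namedArgs` and `makePerson` are made up for illustration.

```javascript
// Hypothetical shim: turns alternating name/value arguments into an object,
// e.g. namedArgs('a', 1, 'b', 2) gives { a: 1, b: 2 }.
function namedArgs() {
  if (arguments.length % 2 !== 0) {
    throw new Error('Expected alternating name/value arguments');
  }
  const result = {};
  for (let i = 0; i < arguments.length; i += 2) {
    result[arguments[i]] = arguments[i + 1];
  }
  return result;
}

// A function that accepts "named" arguments without the caller having to
// build an object literal first.
function makePerson() {
  const opts = namedArgs.apply(null, arguments);
  return opts.name + ' is ' + opts.age;
}

console.log(makePerson('name', 'Ada', 'age', 36)); // Ada is 36
```

The same alternating name/value idea is what carries over to a varargs method in Java.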

Sunday, January 12, 2014

SVG Element Transparencies

I solved a really interesting problem the other day that eluded me previously.  I hope this will help someone in the future so they don't make my mistake.

Let's say you have an SVG tag with multiple transparent shapes in it. The way you set a shape's transparency is by adding a fill-opacity attribute to the element.

<svg height="50" width="100">
<line x1="0" y1="20" x2="100" y2="20" stroke="black"></line>
<circle cx="50" cy="25" r="10" fill="red" fill-opacity="0.5"></circle>
</svg>

The problem with adding the fill-opacity to multiple overlapping elements is that their fills blend together and sometimes this ruins the purpose of your image. If you draw one element over another, it is hard to say which color is on top of another color.

<svg height="50" width="100">
<line x1="0" y1="20" x2="100" y2="20" stroke="black"></line>
<circle cx="50" cy="25" r="10" fill="red" fill-opacity="0.5"></circle>
<circle cx="55" cy="25" r="10" fill="yellow" fill-opacity="0.5"></circle>
<circle cx="60" cy="25" r="10" fill="blue" fill-opacity="0.5"></circle>
<circle cx="65" cy="25" r="10" fill="green" fill-opacity="0.5"></circle>
</svg>

In the end, what I really wanted was to group all these elements and then apply the opacity to the parent element. This made each individual color stand out, yet the overall shapes were transparent so you could see a background behind it.

<svg height="50" width="100">
<line x1="0" y1="20" x2="100" y2="20" stroke="black"></line>
<g style="opacity: 0.5">
  <circle cx="50" cy="25" r="10" fill="red"></circle>
  <circle cx="55" cy="25" r="10" fill="yellow"></circle>
  <circle cx="60" cy="25" r="10" fill="blue"></circle>
  <circle cx="65" cy="25" r="10" fill="green"></circle>
</g>
</svg>

Why would you want to use one strategy versus the other? If you want the colors to appear distinct but transparent, you group together a bunch of elements and add a transparency to their parent group. But if you want the colors to blend, then you should add the opacities individually.
By the way, this problem isn't really an SVG problem. You see the same results in plain old HTML. :-D