Sunday, January 30, 2011

Precision, the cause of spikes


Precision, the cause of spikes


This is a follow-up on my last blog. The spikey effects we have seen are perfectly explainable and understandable.

What we did was:
  1. cut a parcel in two pieces
    1. create a cutline and make a (nearly linear) cutting polygon of it (using buffer)
    2. subtract that cutline from the geometry, using STDifference
  2. keep one piece, using STGeometryN
  3. subtract that piece from the parcel, using STDifference
The result should be the discarded piece, but we got back two free spikes extra.

This short blog will explain this a little more and, as a side note, use just another SQL syntax for the same result.

Side note: with

The query from my previous blog can be rewritten much more elegantly using the with syntax:
with myquery as 
(
  select
    geometry::STGeomFromWKB(0x0103000000010000001A00000023DBF97EE316064146B6F3FDD32513415A643BDFD216064175931804E225134196438B6C52150641C976BE9F34261341F6285C8F06150641022B871641261341F853E3A5D3140641E17A14AE50261341B81E85EBFB120641A4703D8A142713414E6210584D120641AC1C5A64602713414E621058431106414E621058C9271341A01A2FDD1B11064121B07268D8271341F4FDD478571006411904560ECF271341448B6CE7DD0F06418195438BC1271341F6285C8F6A0F06413BDF4F0DA72713410E2DB29D360F06416F1283C091271341E5D022DB070F0641F6285C0F7227134160E5D022DA0E06416F1283404F271341448B6CE7D30E0641F853E3A52427134154E3A59BE60E06411904568E07271341643BDF4F0C0F0641D7A3703DFC2613414A0C022B100F064125068115FB26134191ED7C3F310F0641B6F3FDD4F42613414C378941F11006414A0C022BA0261341EC51B81ECC1206413BDF4F0D40261341022B87167514064125068115F1251341B4C876BE8C160641C74B37897F25134121B07268C6160641508D976E7525134123DBF97EE316064146B6F3FDD3251341,28992)
      as parcel,
    geometry::STGeomFromText('LINESTRING(180955 313700,180920 313740)', 28992).STBuffer(0.1)
      as cutline
)
select parcel.STDifference((select parcel.STDifference(cutline).STGeometryN(2) from myquery))
from myquery

The with clause is useful to avoid repeating pieces (as having the WKB twice), and is essential to do recursive queries. You define an inlined viewy thingy (here "myquery") and use it.

Both SQL Server and PostGreSQL support with queries. See also this link which explains with in depth.

Floating point precision

Floating points are really awful. Everyone knows that if you divide one by six (1/6) and multiply the result with six ((1/6) * 6) you should get one back. However, with floating points this is not always the case. You might get 0.9999999 back. If you try it, you might also get the correct 1.0 back. That is (according to this book) because intermediate results are stored in the floating point register of the CPU. So it is by chance that it is right. In reality it will be wrong.

See also this link and scroll to "Rounding errors can play havoc with math-intense programs". Adding ten 0.1 values will result into 0.99999999999999989. I repeated this and indeed this is the case with a double. With a float the result is 1.0000001192092896. With ttmath it is 1.0, but as soon as you do the same trick with eleven 1/11 additions, you might get 0.9999999999999999999999999999999 back, or 1.0, depending on which precision you specify. Using Boost.Rational you get always 1/1, guaranteed, because it calculates in another way. It uses fractions and keeps fractions all the time. That is great in this use case.

What then caused the spikes

Now that we realize again that floating points are awful, we investigate how imprecision causes the spikes. The damage happens in the first phase. The first STDifference is used to cut the parcel into two pieces.

See the picture below. The grid is the so-called floating point grid. The smallest difference in a 32 bits floating point value (1.175e-38) is depicted. Any point (with float) must be located on one of the gridpoints. (In reality the FP-grid is irregular, but the idea will be clear).

So in this case (in most cases), the intersection point happens to fall between the gridpoints... All the vertices of both input polygons are located on the gridpoints, of course. But the real intersection point does not. It is rounded to one of the four gridpoints. It is just by chance to which one it is rounded. So the rounded intersection point is located on a gridpoint.

a


Suppose it rounds to the point labeled "3". We then get a resulting polygon as partly depicted here:


b

The resulting polygon (in blue) is too small. There is a small piece of the original polygon which is not covered by this piece, at the north side, upperleft of point 3. Subtract this resulting polygon from the original and of course, you will get the spike. There is also a similar small piece at the east side but that will not create a visual effect in this use case.

There is no obvious solution. It is the same as adding ten times 0.1. You get too much or too little. These spikes are too little in the first phase, and after the second difference, the two resulting spikes are a bit too much. This explains that sometimes you get zero, sometimes you get one and sometimes you get two spikes. It is by chance. If the rounding would go to point "1" in the picture above, there would be no spike there.

The spikes are not the result of flawed logic, or wrong decisions somewhere. It is just the rounding.

Use ttmath if you want to minimize the chance on it. Or work around it, e.g. by buffering the result with a small value e.g. -0.000001. Boost.Geometry now has some algorithms to find spikes and remove them.

By the way, the nice trick on cutting a parcel into two pieces was introduced in our project by our partners from B3Partners. The project where this study is from extracted is the Portaal Natuur en Landschap

Conclusions

In this use case, spikes are the geometric visualization of floating point imprecision.

3 comments:

  1. > Boost.Geometry now has some algorithms to find spikes and remove them.

    Is it transparent? I mean, is it applyed automatically after each action?

    ReplyDelete
  2. No, it is not applied automatically.

    ReplyDelete
  3. What fucntion removes spikes? Can you point me to documentation about this?

    ReplyDelete

Note: Only a member of this blog may post a comment.