Friday, April 29, 2011

SqlGeometry types and LINQ

SqlGeometry types and LINQ


Last year I started to use the Microsoft .NET SqlGeometry types and I like them very much. Thanks to Bert Temme who attended us (at Geodan) on their existance. See also this page with an introduction.

In short, the SqlGeometry is the type used for the SQL Server Spatial geometry objects. But it can also be used outside of SQL Server environments: it is contained in a pure .NET assembly. The download page says: "This component can be installed separately from the server to allow client applications to use these types outside of the server."

So the SqlGeometry type is comparable to other Geometry libraries, such as Boost.Geometry, GEOS and NTS (the JTS for .NET).

SqlGeometry functionality follows the OGC/ISO specifications quite closely. This means there is not much more than OGC functionality. Sometimes there is the wish to do other things than that. For example: select only the polygons from a geometry collection. Or: select only the polygons larger than X square meters from a MultiPolygon. There comes LINQ into place.

Installing the assemblies to use SqlGeometry

To use SQLGeometry, you need to have a reference to Microsoft.SqlServer.Types, which can be found in C:\Program Files\Microsoft SQL Server\100\SDK\Assemblies (on my (64 bit) machine in c:\Program Files (x86)\..., indicating 32 bits). If not there, download it from here (there is a 64 bits version there).
Note that you have to add the reference manually, by browsing on the file system, it does not appear in the reference tab (at least not on my system, using Microsoft Visual C# 2010 Express).

After that, just add:
using Microsoft.SqlServer.Types;

The geometry collection

A geometry collection is one geometry consisting of several sub-geometries, which possibly can have different types (e.g. linestrings and polygons).

A geometry collection is sometimes not convenient. For example: MapServer cannot display them. And there are probably more GIS packages which do not display geometry collections, because most mapping software is based on layers, and most layers are defined either on a layer with points (or multi-points), or a layer with lines (or multi-linestrings), or on a layer with polygons (or multi-polygons), but in most cases not on a layer with polygons and lines, and in even more cases not on a layer with geometry collections.

Geometry collections can come into existence during an intersection. If you overlay, for example, two valid polygons which share a part of their boundaries in opposite directions, that shared boundary will form a linestring, while the rest of the intersection (if any) will form a polygon. The polygon and the linestring form together a geometry collection. The next SQL statement generates a (probably unwanted) geometry collection:
select geometry::STGeomFromText('POLYGON((0 0,0 8,8 8,8 4,4 4,4 0,0 0))', 0) 
    .STIntersection(geometry::STGeomFromText('POLYGON((4 0,4 8,8 8,8 0,4 0))', 0))

The result is: GEOMETRYCOLLECTION (POLYGON ((4 4, 8 4, 8 8, 4 8, 4 4)), LINESTRING (4 4, 4 0))

Because geometry collections are often not displayed, functionality to remove all linestrings from a geometry collection would be very useful, or crucial. SqlGeometry does not offer this functionality, but it is possible to build it. It can be built by creating WKT's of all sub-geometries, and parsing and recombining them, but LINQ offers a more elegant way.

Selecting polygons from a geometry collection

We design our algorithm as:
  • create a list of geometries from the geometry collection
  • remove the linestrings and points from this list
  • build a Polygon or a MultiPolygon from the polygons kept in the list

Creating the list

Creating a list of geometries can be done using the STGeometryN and STNumGeometries functions, which are available. We can iterate through the single geometries and add them to the list, using:

var list = new List<SqlGeometry>(); 
for (int i = 1; i <= geometry.STNumGeometries().Value; i++)
{
    list.Add(geometry.STGeometryN(i));
}

Note that STGeometryN uses a one-based index, instead of the usual zero-based index.

Because we think this functionality is general, and will often be reused, we make an extension method of it (I really like them!). So we make a static class called GeometryExtensions and we add a static method containing the piece above, and we can just call:

IList<SqlGeometry> list = geom.ToList();

on any variable geom of type SqlGeometry. See source code below for the extension method.

Creating a geometry from a list

Of course, we want to have the reverse as well, making an SqlGeometry from a list of polygons. The key is that an STUnion is required to make a valid result. If two polygons of the list mutually overlap, this overlap will be resolved by STUnion. Even if there is no overlap, the STUnion has to be called because there is no STAdd function available. Alternatively we could make use of the SqlGeometryBuilder type which is available in the SqlGeometry library. However, using STUnion guarantees a valid result. The next loop does this for us:
SqlGeometry result = null; 
foreach (SqlGeometry g in list)
{
   result = result == null ? g : result.STUnion(g);
}

Using LINQ

So having programmed step 1 and step 3 as extension methods, and having a convenient list, we can now implement step 2 using LINQ:

var result = (from g in geom.ToList() where g.STGeometryType().ToString().ToLower() == "polygon" select g).ToSqlGeometry(); 
The magic. This one line statement with LINQ selects the polygons from an on-the-fly list, and put them via another list back into an SqlGeometry. In the listing below it is formatted as three lines.

The complete listing

using System; 
using System.Collections.Generic;
using System.Linq;

using Microsoft.SqlServer.Types;

namespace SqlGeometryBlog1
{
    public static class GeometryExtensions
    {
        // Converts a multi-geometry to a list of single-geometries
        public static List<SqlGeometry> ToList(this SqlGeometry geometry)
        {
          var list = new List<SqlGeometry>();
          for (int i = 1; i <= geometry.STNumGeometries().Value; i++)
          {
         
    list.Add(geometry.STGeometryN(i));
          }
          return list;
        }

        // Converts a list of geometries to a multi-geometry
        public static SqlGeometry ToSqlGeometry(this IEnumerable<SqlGeometry> list)
        {
          SqlGeometry result = null;
          foreach (SqlGeometry g in list)
          {
         
    result = result == null ? g : result.STUnion(g);
          }
          return result;
        }

        // For convenience, we add two extra methods
        public static SqlGeometry FromWkt(string text, int srid)
        {
          return SqlGeometry.STGeomFromText(new System.Data.SqlTypes.SqlChars(text), srid);
        }
        public static String ToWkt(this SqlGeometry geometry)
        {
          return geometry.STAsText().ToSqlString().Value;
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
          var geom1 = GeometryExtensions.FromWkt("POLYGON((0 0,0 8,8 8,8 4,4 4,4 0,0 0))", 0);
          var geom2 = GeometryExtensions.FromWkt("POLYGON((4 0,4 8,8 8,8 0,4 0))", 0);

          var geom = geom1.STIntersection(geom2);
          System.Console.WriteLine(geom.STIsValid().Value);
          System.Console.WriteLine(geom.ToWkt());

          var list = geom.ToList();
          var selection = from g in list where g.STGeometryType().ToString().ToLower() == "polygon" select g;
          geom = selection.ToSqlGeometry();
          System.Console.WriteLine(geom.ToWkt());
        }
    }
}

The output

True
GEOMETRYCOLLECTION (POLYGON ((4 4, 8 4, 8 8, 4 8, 4 4)), LINESTRING (4 4, 4 0))
POLYGON ((4 4, 8 4, 8 8, 4 8, 4 4))



Friday, April 8, 2011

Extent of a SQL Server Spatial table


Extent of a SQL Server Spatial table


PostGIS does have a convenient spatial aggregation function: ST_Extent. SQL Server Spatial does not have that functionality. This little blog shows how to achieve its effect.

PostGIS


Given a table (e.g. called world1) with a geometry column (e.g. called geom), a call to the PostGIS function ST_Extent will give a neat box containing all geometries. For example:

select ST_Extent(geom) from world1;

returns a bounding box:

BOX(-180 -89.9,180 83.674733)

SQL Server Spatial


SQL Server Spatial follows OGC closely and therefore does not implement this functionality. Neither it has implemented it as a so-called extended geometry method. I've seen various creative solutions on the web, e.g. here and here and here. Most of these efforts are based on iterating through the rows, which is possible but often inconvenient (because of the required loop) and (also therefore) probably less performant. I will not say that my solution is performing better, but at least it is one SQL statement.

Be prepared on use of with, my current favorite SQL clause.  I described it earlier here.

So what happens: we define a sort of on-the-fly view of all the envelopes (select geom.MakeValid().STEnvelope() as envelope from world1). (The call to MakeValid() is only necessary if there might be invalid geometries in the table). Then we ask for the corner points of that viewy thingy. An envelope is a rectangle, always containing five points, where the fifth point is closing point and therefore the same as the first point. So either point 1 and 3, or point 2 and 4 are the opposite corners of a rectangle. So we take point 1 and point 3 (as done in first creative solution I referenced above). We define a second on-the-fly view of this, using the first view as input. To get both diagonally opposite points, we union them (union all is required). Note that I'm referring to an SQL union here, not to a spatial union.

Those on-the-fly views can be (very conveniently) implemented using with. I name them then viewy but that's up to you. So envelope_viewy is the first one and corner_viewy is the second one.

The complete SQL statement is then, for a column called geom in a table called world:

with
  envelope_viewy as
  (
    select geom.MakeValid().STEnvelope() as envelope from world1
  ),
  corner_viewy as
  (
    select envelope.STPointN(1) as point from envelope_viewy
    union all select envelope.STPointN(3) from envelope_viewy
  )
select min(point.STX) as min_x,min(point.STY) as min_y,max(point.STX) as max_x,max(point.STY) as max_y
from corner_viewy

And voilĂ , you will get:
min_x    min_y    max_x    max_y
-180    -89.9    180    83.674733

The same! (Of course, I used the same table, world1, used from my shapefile research-still-to-be-finished).

You can also create a stored procedure of this (e.g. called ST_Extent).

PostGIS remark


Writing this blog, I noticed a strange thing of the PostGIS ST_Extent. It only works for 2D, there is explicitly written in the documentation that it does not work for 3D (use a ST_Extent3D for that). But it delivers a 3D box. And indeed it is explicitly typed as a 3D box. I wonder why.