In computational geometry, convex hull is the boundary of minimal convex set containing a given finite set of points in a plane. Otherwise, it is a simple closed polygonal chain encompassing all the points. As an example it looks like the following.

In a recent Proof of Concept at the Microsoft Technology Centre, we needed to build such a polygon. Dealing with around 40 million points with each Point available in SQL Server 2008 with a Geography data type, Longitude and Latitude as decimals. (We had various predicate operations to use so we cannot use Geometry data type even though Geometry has got various indexing options which improves performance).
Okay so, what are the approaches?.
To start with we quite liked the ConvexHull function which is available in SQL Server Spatial Tools here as this is what we wanted.
Approach 1:
- Use GeographyUnionAggregate function to get a single Geography object out of all the points
- Use ConvexHullGeography function to produce a Geography object which is essentially a polygon
1: DECLARE @aggregate Geography
2: DECLARE @convexHull Geography
3: SELECT @aggregate = dbo.GeographyUnionAggregate(Location) FROM Locations
4: SET @convexHull = dbo.ConvexHullGeography(@aggregate)
- Once we have the convexhull object we can get the points on the polygon using geography::STAsText(@convexHull) which can be processed further as required.
What is the problem?
As the data we are dealing is huge, the GeographyUnionAggregate is too slow and response times are not in an acceptable range even though the spatialindex is available. On the same note it is to be understood that it is not guaranteed that the spatial index is used always and you cannot supply a query hint unless you are using geospatial predicate operations. Also you can use the query hint only if there exists a Join condition in the query.
So we needed another solution for which the response time is high. So the next approach.
Approach 2:
For this approach we:
- Get a MultiPoint object from all the points available.
- Use ConvexHullGeography to produce the convexhull
The problem here is, how can we produce the multipoint object from the database rows. One approach is using a cursor loop through the database rows and construct the MultiPoint object. We hated this and did not want to use this approach. We came up with the following solution.
We produced a XML string from the Longitude and Latitude using the FOR XML EXPLICIT inside a SQL Server stored procedure which produces a string as below. For example (I have taken couple of points only for simplicity).
<Point><pos>-2.514545000 51.567890000</pos></Point>
<Point><pos>-2.524545000 51.557890000</pos></Point>
<Point><pos>-2.535656000 51.575656560</pos></Point>
Then by doing some prefixing and suffixing make a valid representation of MultiPoint. Once we have a string representation of MultiPoint, we can easily make a Geography object out of it using GeomFromGML method. So the solution looks as below.
1: DECLARE @xml xml
2: SELECT @xml = ( SELECT 1 AS Tag, NULL AS Parent, ''
3: + RTRIM(CAST(longitude as CHAR(20)))
4: + ' ' + RTRIM(CAST(latitude as CHAR(20)))
5: + '' AS [Point!1!pos!element]
6: FROM address
7: FOR XML EXPLICIT )
8:
9: DECLARE @MultiPoint nvarchar(max) =
10: '<MultiPoint xmlns=' + '"'
11: + 'http://www.opengis.net/gml' + '"'
12: + '><pointMembers>' + convert(nvarchar(max),@outputXML)
13: + '</pointMembers></MultiPoint>'
14:
15: DECLARE @g Geography
16: SET @g = geography::GeomFromGml(@MultiPoint, 4326)
17:
18: IF(@g IS NOT NULL)
19: BEGIN
20: DECLARE @boundary geography = dbo.ConvexHullGeography(@g);
21: SELECT @boundary.STAsText() AS Boundary
22: END
With this approach, the overall execution time came down drastically and it executed just like that.