Thursday, December 2, 2010

Free Shapefile of Countries of the World



Note: see also part 2 and part 3 of this blog.

I'm always surprised at how hard it is to find a good shapefile of the countries of the world. I need country vector data as sample data for Boost.Geometry. There is already some there, but I want another set. I want it as WKT (Well-Known Text), and I can use a shapefile as input.
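To make the target format concrete, here is a minimal Python sketch of what a country outline looks like as WKT. The function names and the sample coordinates are made up for illustration; they are not taken from any of the datasets discussed here, and reading the shapefile itself is left out.

```python
def ring_to_wkt(ring):
    """Format one ring, a sequence of (x, y) tuples, as 'x y, x y, ...'."""
    return ", ".join(f"{x} {y}" for x, y in ring)


def polygon_to_wkt(rings):
    """Format a polygon (outer ring first, then any holes) as WKT."""
    closed = []
    for ring in rings:
        ring = list(ring)
        # WKT rings must be closed: repeat the first point if it is missing.
        if ring[0] != ring[-1]:
            ring.append(ring[0])
        closed.append(ring)
    return "POLYGON(" + ", ".join(f"({ring_to_wkt(r)})" for r in closed) + ")"


def multipolygon_to_wkt(polygons):
    """Format several polygons (e.g. a mainland plus islands) as one WKT value."""
    # Reuse polygon_to_wkt and strip its leading 'POLYGON' keyword.
    parts = [polygon_to_wkt(p)[len("POLYGON"):] for p in polygons]
    return "MULTIPOLYGON(" + ", ".join(parts) + ")"


if __name__ == "__main__":
    # A made-up square, not a real border.
    print(polygon_to_wkt([[(0, 0), (4, 0), (4, 4), (0, 4)]]))
```

A country with islands would typically end up as a MULTIPOLYGON, which is why the sketch includes both forms.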

Asking Google for "free world map shapefile", you get:
  • aprs (1), which also offers a modified version
  • blue marble, which requires registration, but (probably) the same set can be downloaded from the next entry
  • brothersoft, serving blue marble data (2)
  • mapcruzin (leading to no countries, but to several other interesting shapefiles)
Via my company, Geodan, you can also download a world map here (this requires mailing your contact information) (3)

Via free gis data, here, there is a hit on ESRI, leading to this site, which in turn leads to annual subscriptions etc. I skipped this one.

Via similar terms ("countries") we also find:
  • mapping hacks, serving three world files (4)
  • these actually originate from this site, which has a more recent version (5)
  • a file from openmap here, called cntry02 (6)
  • and finally (via "cntry08") we find ESRI data here (7)
I downloaded these shapefiles. Going by the file dates:
  1. is from 2002, modified in 2009 for Antarctica
  2. is from 1996
  3. is from 2003
  4. is from 2004
  5. is from 2008
  6. is from 2002
  7. is from 2008

The Netherlands
Let's first show my country, the Netherlands:
netherlands

All the borders drawn on top of each other look quite messy... But let's concentrate on the data.
Southern Flevoland, a polder that has existed since 1968, is still missing from maps 1 (aprsworld, green) and 2 (blue marble, so blue). It is present on the map of my company, 3 (geodan, red), because Geodan of course takes care of its own country. Southern Flevoland is also present on all the newer maps (5, pink; 6, orange; 7, gray). But 5 and 6 also include a polder that was planned but never realized (Markerwaard). So the best for the Netherlands are 3 and 7.

Of the five inhabited islands along the Waddenzee, 7 depicts only two. So here 3 is the best choice (of course, it is our country; apart from that, note that this is not a sales pitch).

For the whole world, 6 and 7 are quite similar in nearly all aspects.

Uruguay
Let's now look at a country outside Europe and the US. I pick Uruguay:
Uruguay

The blue vectors are quite rough and shifted to the west and north. The green lines look nice for a map at this scale, but deviate a bit from all the other maps.

Let's now take a look in Google Earth. For this, we need to create a KML file... hmm, shp2kml cannot be downloaded (blank website saying Missing ID). OK, we use PostGIS then; it can create both KML and WKT (which we need below).

shp2pgsql -s4326 world.shp world1 > world1.sql

etc

So I created a database called blog and executed these SQL files.
Five minutes later (PostGIS is great) I run:
psql -F " " -A -n -t -q -U postgres -d blog -c "select '<Placemark>',ST_AsKml(the_geom),'</Placemark>' from world2 where name='Uruguay'" -o tworld2.kml
 

Doing this for all tables and reworking the KML files a bit (adding headers etc.) gives me:
Uruguay with Google Maps
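The "adding headers" step can be sketched as follows. This is a minimal Python illustration, assuming the psql output consists of one `<Placemark>...</Placemark>` fragment per line; the function name is hypothetical and this is not necessarily how the files for this post were reworked.

```python
KML_HEADER = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<kml xmlns="http://www.opengis.net/kml/2.2">\n'
    '<Document>\n'
)
KML_FOOTER = '</Document>\n</kml>\n'


def wrap_placemarks(fragment_lines):
    """Wrap bare <Placemark> fragments in a complete KML document.

    fragment_lines: an iterable of strings, one Placemark fragment per line,
    as written by the psql command above. Blank lines are skipped.
    """
    body = "".join(line.strip() + "\n" for line in fragment_lines if line.strip())
    return KML_HEADER + body + KML_FOOTER


if __name__ == "__main__":
    # A made-up point placemark stands in for the real country geometry.
    sample = ["<Placemark> <Point><coordinates>-56,-33</coordinates></Point> </Placemark>\n"]
    print(wrap_placemarks(sample))
```

With the real output, the fragments would be read from the file written by psql and the result saved under a new name before opening it in Google Earth.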

Considering Google Maps as our Single Point of Truth, we can discard the blue lines from Blue Marble, and we might think that the green lines are also wrong. The rest follow the borders more or less, but we need to zoom in to judge them better.

Here is a detail of one of the borders:
Uruguay Google Detailed

The yellow line is Google's line. The red line follows it closely, probably a bit too closely. For a global scale, the other ones (4, 5, 6, 7) are doing well here.

Caspian Sea
Looking at the global level, one thing stands out immediately, and we zoom in on it here: surprisingly, the Caspian Sea is not included in maps 6 and 7. So the country borders run straight across the sea...
Caspian Sea

Not so good. So let's discard 6 and 7. We also see that the red lines (3) are too detailed, here as well. Discard 3. Blue (2) and green (1) were already discarded before. So we keep 4 and 5, which are roughly the same. I decide to keep the more recent one: 5.

Decision
So we conclude, based on a few samples (and some more not described here), that the file TM_WORLD_BORDERS-0.3.zip from Thematic Mapping is the best shapefile to use for world-scale countries (even though it contains the Markerwaard).

Country number and validity
We did not consider attributes, and our research was still rough.

An important aspect we want to consider is geometric validity. Even more important is the number of countries: it can differ over the years, but the differences are sometimes still surprising.
Dataset  Invalid (SQL Server)  Invalid (PostGIS)  #Countries  Year
1        1                     1                  244         2002 / 2009
2        4                     3                  239         1996
3        34                    5                  253         2003
4        5                     6 polygons         251 (*)     2004
5        5                     4                  246         2008
6        3                     0                  251         2002
7        2                     7                  249         2008
(*) 3784 polygons; this dataset contains polygons, not multi-polygons
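The counts above come from the validity checks built into SQL Server and PostGIS. To give an idea of what "invalid" can mean, here is a deliberately simplified Python sketch that flags only two problems in a single ring: a ring that is not closed, and a ring whose edges cross each other. It is an assumption-laden illustration, not the rule set either database applies; touching edges, overlaps and holes are not handled.

```python
def _orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a): 1 left turn, -1 right turn, 0 collinear."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)


def _properly_cross(p1, p2, p3, p4):
    """True if segments p1-p2 and p3-p4 cross at a point interior to both."""
    return (_orient(p1, p2, p3) * _orient(p1, p2, p4) < 0
            and _orient(p3, p4, p1) * _orient(p3, p4, p2) < 0)


def ring_looks_valid(ring):
    """Rough check: closed, at least 4 points, and no two edges properly cross."""
    if len(ring) < 4 or ring[0] != ring[-1]:
        return False
    segments = list(zip(ring, ring[1:]))
    # Compare every pair of non-adjacent edges (quadratic, fine for a sketch).
    for i in range(len(segments)):
        for j in range(i + 2, len(segments)):
            if _properly_cross(*segments[i], *segments[j]):
                return False
    return True


if __name__ == "__main__":
    square = [(0, 0), (2, 0), (2, 2), (0, 2), (0, 0)]
    bowtie = [(0, 0), (2, 2), (2, 0), (0, 2), (0, 0)]  # self-intersecting
    print(ring_looks_valid(square), ring_looks_valid(bowtie))
```

Because the two databases count different numbers of invalid geometries for the same files, a home-grown check like this should be expected to disagree with both.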

As for the countries themselves, I just glanced through the differences with a query comparing ISO3 country codes, which gives this table:
Compare 5 and 7

We see some inconsistent and some missing ISO codes, and some countries that are present in one dataset but not in the other, and vice versa.
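The comparison itself was done with a database query. The same idea can be sketched in Python with set differences; the function name is hypothetical and the sample codes below are made up for illustration, they are not the actual differences between datasets 5 and 7.

```python
def compare_codes(codes_a, codes_b):
    """Return (only_in_a, only_in_b) for two collections of ISO3 country codes.

    Codes are trimmed and upper-cased first; empty or missing codes are dropped,
    so they do not show up as spurious differences.
    """
    a = {c.strip().upper() for c in codes_a if c and c.strip()}
    b = {c.strip().upper() for c in codes_b if c and c.strip()}
    return sorted(a - b), sorted(b - a)


if __name__ == "__main__":
    # Made-up sample lists; 'XXX' and 'YYY' are placeholder codes.
    only_a, only_b = compare_codes(["NLD", "URY", "XXX", ""], ["nld", "URY", "YYY", None])
    print("only in first: ", only_a)
    print("only in second:", only_b)
```

Dropping empty codes hides the "missing ISO code" cases, so those would need to be counted separately.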


Conclusion

I think a good and free shapefile of the world would still be welcome.


4 comments:

  1. I'm quite happy with naturalearth.org

  2. Have you checked out VMAP0 VMAP1?

  3. gislars, mustafadog, thanks for your comments I'll process them.

  4. Excellent piece of analysis!! Can't wait to see the other ones analyzed too!
