Gav's Blog

Shorter of breath and one day closer to death

Archive for the ‘OpenStreetMap’ tag

Why Government should support open and free geospatial data

without comments

Back in July, I posted about dc.gov releasing some data. I was a bit slow replying to a comment made by Nat Torkington then, and felt that a reply actually required a new post to elaborate further on why I’m so supportive of governments – be they local or national – releasing data that has been paid for by the rate/tax-payer. Nat said:

“Isn’t it the case that the USA doesn’t have an authoritative roading database, either? That’s why Navteq, TeleAtlas, and Google have to drive the roads.”

Whilst the US doesn’t have an authoritative roading database either, the release of the TIGER/Line shapefiles has spurred the development of free and open maps – e.g. the inclusion of TIGER data in OpenStreetMap, and the production of free and open maps for GPS units. This mirrors what has occurred in New Zealand with the likes of the NZ Open GPS Maps project, utilising the free information made available from Land Information NZ.

However, this leaves us with two broad types of maps, both with their problems – commercial datasets with restrictive usage conditions, and free datasets maintained by volunteers that may not be sustainable in the long term. In New Zealand, the commercial dataset providers are primarily Terralink, Critchlow’s and Eagle Technology, with some more affordable sets made available by Kim Ollivier. The free maps are primarily catered for by the New Zealand OpenStreetMap project and the NZ Open GPS Maps project.

My problem is that there is a lot of inefficiency in the current way that mapping data is managed in New Zealand (and this probably applies internationally). Why do we have four+ commercial sources for roading data and two volunteer driven projects all duplicating each other, as well as Government agencies that have legislative responsibilities for roading infrastructure?

Well, it is because LINZ is not currently funded to provide a centralised repository for all this information – they are too busy focusing on the cadastral database where they make their money. Instead we are producing inefficient silos of information that are all subtly different. I have been prodding a few people to get the NZ OpenStreetMap and Open GPS Maps projects to consolidate the underlying database into OSM, and I believe that this will occur over the long term, but there are a number of issues to work through before this will happen.

As Nat indicated in the original post – in the US Navteq, TeleAtlas and Google drive the roads there, and we’ve got at least Terralink, Google and probably others driving the roads here. In addition we have active volunteers also driving roads and correcting errors in OpenStreetMap and the Open GPS Maps project – I personally provide GPS tracklogs to OSM, and have also placed the 2007/8 High Speed Data Survey in there. The interesting part is that all of the errors are being corrected from the original LINZ roading dataset. So, because the New Zealand Government has not funded LINZ to maintain the roading dataset, make it widely available under permissive licensing terms, and allow feedback and corrections to be suggested for review and possible inclusion, we now have a massively inefficient approach to mapping roads in New Zealand.

All of these projects have sprung up because LINZ is not funded to provide the correct road dataset in the first place.

We can’t support that in a small country like New Zealand, where only corporates, local authorities, and central government agencies can afford the commercial roading datasets. I know at least one of the commercial datasets costs over $100,000 to license. What this means is that small-and-medium sized businesses are being left out in the cold from using geospatial information to improve the way they do business as it is too expensive, and rate/tax-payers do not have affordable access to the information for tourism, recreational and safety purposes.

As the Immediate Past President of the NZ Recreational GPS Society, I’ve seen people balking in our forums at having to pay extra for decent road or topographical maps. Some of these are expensive because the GPS map vendor has needed to license the underlying data from a commercial provider. In addition to the cost, vendors also have to implement measures to stop the reverse-engineering and redistribution of this licensed data. However, like most forms of Digital Rights Management (some may say Restrictions), the technical mechanisms cause their own problems. I’ve just been helping one person who has been suffering through Garmin’s Map Unlock process, which is poorly communicated to customers and puts nothing but roadblocks in the way of setting up the maps on the user’s computer and GPS. And even when he hopefully does have the maps unlocked, he will only be able to install them on one GPS!

Perhaps as a comparison, I am not able to download and install a copy of the Yellow Pages on my iPhone so that I can use it in a disconnected manner, but I can download the free and open Zenbu iPhone application that bundles all the data – so if for whatever reason I am out of mobile coverage, I can still use this data as it is stored locally on the device. I don’t believe that commercial directory services would be very comfortable about releasing their datasets to be installed on mobile devices, as they would risk the loss of their database in which the perceived value of their business resides. So having data released under permissive licenses is also essential for new applications such as storing massive geospatial resources in our pockets.

That said, I’m not really in favour any more of the Government attempting to build a single massive dataset, as I think Government has proven that it cannot build these IT things effectively because there is too much management by committee, and the commercial vendors that provide the infrastructure are just looking for a jackpot if they win the tender (e.g. tender prices of $9-48 million for the failed National Address Register (NAR) project). I don’t see the need for the Government to build what is effectively their own OpenStreetMap infrastructure when we can just use something like OSM. Honestly, NZ Govt should just approach OpenStreetMap and look at an arrangement where Government can publish geospatial datasets into OSM with the ability to set some layers (such as, say, electoral and property boundaries, which shouldn’t be editable) as read only, and the rest as editable – e.g. roads and walking tracks that can be maintained by everyone. If the publisher of a layer doesn’t want the original layer edited, then in some circumstances editable child layers should be allowed – e.g. so I can add a new walking track to a layer that hasn’t yet been updated to reflect it, and the owner of the original dataset can then look at whether they want to accept the change back into their layer.

Commercial geospatial datasets put nothing but roadblocks in the way for new and creative uses of geospatial data. I have no problem with commercial datasets providing value-add to the data, but the fundamental data such as roads and the like should be made as open and accessible as possible to encourage adoption and standardisation upon that dataset – this will also consolidate feedback and error correction. If I find an error now, I can’t report it to LINZ – they won’t listen. What benefit do I have in reporting a roading error to a commercial provider? Indeed the only benefit I get is if I report the error to a free and open project.

Adoption and standardisation of fundamental datasets are important to ensure consistency between map sets. Right now on my GPS I have two map sets that both provide roads, and you don’t have to look far to find discrepancies between the two datasets – but guess what, they are both derived from the LINZ road centrelines.

If left to commercial providers, geospatial data will be left as an expensive tool that only large organisations can afford.

The sooner governments in general recognise this and start funding the publishing and maintenance of fundamental datasets, the sooner we will see a real renaissance in how spatial information is used by the average organisation and individual. That is why I am so supportive of dc.gov releasing all their data.

Written by Gavin Treadgold

October 29th, 2008 at 11:10 am

More mainstream media coverage for Sahana

without comments

This time it is BusinessWeek promoting the “Do-Good Imperative” – including free and open source software. There is also a sister article on collaborative map-making during emergencies using the likes of OpenStreetMap. Naturally, this is an area we are working hard on, building these geospatial capabilities into Sahana as well.

Open source and collaborative approaches are certainly starting to get the mainstream attention that they deserve. Now we just need funding to support these developments.

Written by Gavin Treadgold

July 13th, 2008 at 1:20 am

Posted in Emergency Management


2.2 Million Trackpoints!

with one comment

I’ve been meaning to blog about this sooner, but have been pretty busy with work. Two weeks ago, a chance email on a NZ GIS list that I belong to inspired me to go out on a limb and see if I could get some Government data. I saw a post from someone within Transit (soon to be merged into the New Zealand Transport Agency) referring to working with 2.2 million trackpoints from a roading survey. I started a private email discussion, and after a couple of exchanges, I soon had 2.2 million trackpoints from the 2008 High Speed Data Collection survey of the New Zealand State Highway network.

My intention of obtaining this data was to be able to convert it to GPX files and upload it as a raw data survey layer to OpenStreetMap (OSM) so that it could be used as the basis for mapping New Zealand’s State Highway network in OSM.

I had some help from John McCombs from Integrated Mapping in Christchurch who very kindly reprojected all the points to WGS84. I then spent 4 evenings last week converting to GPX and uploading the files to OSM.
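
A minimal sketch of what the conversion step can look like, assuming the reprojected points arrive as a headerless CSV of `lat,lon` in WGS84 decimal degrees. The actual layout of the survey files is not described here, and `csv_to_gpx` is a hypothetical helper name, so treat this as an illustration and not the exact process used.

```shell
#!/bin/sh
# Hypothetical helper: read "lat,lon" rows on stdin, write a GPX 1.1 track on stdout.
# Assumes WGS84 decimal degrees, Unix line endings and no header row.
csv_to_gpx() {
  printf '%s\n' '<?xml version="1.0" encoding="UTF-8"?>'
  printf '%s\n' '<gpx version="1.1" creator="csv_to_gpx" xmlns="http://www.topografix.com/GPX/1/1">'
  printf '%s\n' '<trk><trkseg>'
  # One <trkpt> per CSV row; rows with fewer than two fields are skipped.
  awk -F, 'NF >= 2 { printf "<trkpt lat=\"%s\" lon=\"%s\"></trkpt>\n", $1, $2 }'
  printf '%s\n' '</trkseg></trk>'
  printf '%s\n' '</gpx>'
}
```

It would be run as `csv_to_gpx < points.csv > points.gpx`. The sketch emits positions only; if the site you are uploading to expects a timestamp on each point, a `<time>` element would have to be added from a column that this example does not assume exists.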

Was this data essential to mapping the highways in OSM? No. But it was a great experiment to see if a New Zealand Government agency was willing to release data under acceptable terms and conditions – this dataset is licensed under the Creative Commons v3 Attribution ShareAlike license – and effectively turn the raw data over for public consumption. Naturally, this doesn’t contain all of the detailed geometry that is collected during the survey, so not all of the data was made available, but we got the most important part – latitude and longitude, and a lot of them!

For more information, see the following links.

One of the key points I was trying to make was that citizens are actually interested in accessing government data such as this, and that agencies should take a more proactive approach to releasing data for the world. After all, data is global these days – put it on the Internet and anyone can access it.

Written by Gavin Treadgold

July 1st, 2008 at 10:45 pm

Protecting your privacy uploading tracklogs to public sites

with 2 comments

I have become interested in the ways that you can protect potentially private or sensitive information that may be contained in tracklogs uploaded to any public site. I am primarily writing this article from an OSM perspective, but it is really valid for any site that you may upload a tracklog to.

A GPX tracklog consists of a lot of sections of code that look something like this – a trackpoint.

<trkpt lat="-43.502053000" lon="172.576317000">
<ele>16.480000</ele>
<time>2008-05-06T08:37:46Z</time>
</trkpt>

A trackpoint contains two key pieces of information – the time (in UTC – the Z after the time refers to this), and the location in latitude and longitude. A whole pile of these trackpoints are then added together to produce a tracklog. This of course presents a privacy risk as anyone that has access to the tracklog might be able to assume that the person that uploaded the tracklog was at that location at the time specified. And with GPS, this can be recorded to a high level of accuracy.
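
To see exactly what a tracklog reveals, it can help to flatten it to one line per trackpoint. A rough sketch, assuming each `<trkpt>` opening tag and each `<time>` element sits on its own line as in the sample above (`list_trackpoints` is a hypothetical helper name, and a real XML parser would be more robust than this line-based approach):

```shell
#!/bin/sh
# Hypothetical helper: print "time lat lon" for each trackpoint in a GPX file on stdin.
# Assumes one element per line, as in the sample trackpoint above.
list_trackpoints() {
  awk '
    /<trkpt / {
      # Pull the lat and lon attribute values out of the opening tag.
      lat = $0; sub(/.*lat="/, "", lat); sub(/".*/, "", lat)
      lon = $0; sub(/.*lon="/, "", lon); sub(/".*/, "", lon)
    }
    /<time>/ && lat != "" {
      # Only report <time> elements that belong to a trackpoint.
      t = $0; sub(/.*<time>/, "", t); sub(/<\/time>.*/, "", t)
      print t, lat, lon
    }
    /<\/trkpt>/ { lat = ""; lon = "" }
  '
}
```

Running `list_trackpoints < in.gpx` gives a plain list of when and where, which is precisely the information anyone downloading the tracklog gets.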

So, what we need to do is look at ways to protect some of this information. I’ll write here about two techniques that I have used to protect information in tracklogs by editing them before uploading them to public websites. For most public websites, the most important information is location, and time is less important. So we need to take a two-pronged approach to tracklog privacy protection.

1. Delete track points that we might have privacy concerns with.
2. Remove timestamps where we don’t want people knowing the time we were there.

Deleting Trackpoints
1. Protecting it manually. I have been using the free GPSTrackMaker to load and edit tracklogs before uploading them to OSM. This is a manual and sometimes laborious process. I use this to remove any trackpoints around the final locations of puzzles/multi-caches that I have visited, and also to remove trackpoints close to home/work/friends etc. I also use it to touch up the tracklogs, for example in areas where trackpoints are sprayed meaninglessly over a wide area – such as in urban canyons in Wellington. This can result in quite a ‘rich’ tracklog, especially if you delete those areas where the trackpoints are not that accurate due to GPS signal error.

2. Automating deletion of trackpoints. There are also a number of locations that one may always want to remove from a tracklog before making it publicly available. Locations such as home and work spring to mind. I was looking for a way to automate the removal of these locations using GPSBabel. Using nothing more than co-ordinates near your home and a radius, you can easily set up a filter to remove all points that fall within the circle using the following GPSBabel command. Note that the following command is needlessly complex as a little workaround is required to use the radius filter on trackpoints (you have to convert tracks to waypoints, do the radius filter on waypoints, and then convert the waypoints back to tracks – ugly but it works).

gpsbabel -t -i gpx -f in.gpx -x transform,wpt=trk -x nuketypes,tracks -x radius,distance=0.3K,lat=-43.0,lon=172.5,exclude,nosort -x transform,trk=wpt -x nuketypes,waypoints -x track,pack,split=30m,title="LOG %Y%m%d" -o gpx -F out.gpx

It is possible to build a batch file that removes multiple locations such as home, work and friends, that requires very little input. Note that this process does not destroy the original tracklog that you keep, rather it creates a new tracklog with the sensitive data removed.
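
As a sketch of what such a batch file might look like on a unix system, the function below chains one `radius` filter per private location into a single GPSBabel run, reusing the same filters as the command above. The coordinates and radii are placeholders, and `PRIVATE_PLACES`, `clean_tracklog` and the `GPSBABEL` override are names made up for this example.

```shell
#!/bin/sh
# Sketch: strip several private locations from one tracklog in a single gpsbabel run.
# One "lat lon radius" triple per line; these values are placeholders, not real places.
PRIVATE_PLACES='
-43.0 172.5 0.3K
-41.3 174.8 0.5K
'

# Usage: clean_tracklog in.gpx out.gpx
# Set GPSBABEL to another command name to preview the arguments instead of running gpsbabel.
clean_tracklog() {
  src=$1
  dst=$2
  # Tracks must become waypoints before the radius filter can be applied.
  set -- -t -i gpx -f "$src" -x transform,wpt=trk -x nuketypes,tracks
  while read -r lat lon dist; do
    [ -n "$lat" ] || continue   # skip blank lines
    set -- "$@" -x "radius,distance=$dist,lat=$lat,lon=$lon,exclude,nosort"
  done <<EOF
$PRIVATE_PLACES
EOF
  # Convert back to tracks and re-split them, as in the single-location command.
  set -- "$@" -x transform,trk=wpt -x nuketypes,waypoints \
    -x "track,pack,split=30m,title=LOG %Y%m%d" -o gpx -F "$dst"
  "${GPSBABEL:-gpsbabel}" "$@"
}
```

Calling `clean_tracklog in.gpx out.gpx` leaves `in.gpx` untouched and writes the filtered copy to `out.gpx`. Adding a new location is then just a matter of adding another line to the list.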

Removing Timestamps
For whatever reason, it makes some sense to also remove timestamp information from tracklogs – I won’t go into the reasons here. Here is a little unix script that I use to change the timestamp information. Usually I don’t mind people knowing what day I was somewhere, but I’m not that keen on them always knowing the time. So, I will remove either minutes/seconds, or hours/minutes/seconds so that every timestamp appears as midnight.

If you want to set it so that all times are set to the start of the hour e.g. hh:00:00, use this.

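#!/bin/sh
# Round every timestamp down to the start of the hour (hh:00:00Z).
for f in *.gpx; do
  [ -e "$f" ] || continue                      # no .gpx files here
  case "$f" in *-clean.gpx) continue ;; esac   # skip output from earlier runs
  sed 's/:[0-9][0-9]:[0-9][0-9]Z/:00:00Z/g' < "$f" > "${f%.gpx}-clean.gpx"
done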

If you want to set it so that all times are set to midnight e.g. 00:00:00, use this.

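#!/bin/sh
# Set every timestamp to midnight (00:00:00Z), keeping only the date.
for f in *.gpx; do
  [ -e "$f" ] || continue                      # no .gpx files here
  case "$f" in *-clean.gpx) continue ;; esac   # skip output from earlier runs
  sed 's/T[0-9][0-9]:[0-9][0-9]:[0-9][0-9]Z/T00:00:00Z/g' < "$f" > "${f%.gpx}-clean.gpx"
done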
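
Before running either script over a whole directory, the two substitutions can be checked against a single timestamp. A small sketch, where `to_hour` and `to_midnight` are just illustrative wrappers around the same two `sed` expressions:

```shell
#!/bin/sh
# Illustrative wrappers around the two sed expressions above; both filter stdin to stdout.
to_hour()     { sed 's/:[0-9][0-9]:[0-9][0-9]Z/:00:00Z/g'; }
to_midnight() { sed 's/T[0-9][0-9]:[0-9][0-9]:[0-9][0-9]Z/T00:00:00Z/g'; }

sample='<time>2008-05-06T08:37:46Z</time>'
printf '%s\n' "$sample" | to_hour       # <time>2008-05-06T08:00:00Z</time>
printf '%s\n' "$sample" | to_midnight   # <time>2008-05-06T00:00:00Z</time>
```

Note that both patterns expect whole seconds followed directly by `Z`. A tracklog that records fractional seconds (for example `46.000Z`) would pass through unchanged, so it is worth checking a line or two of the output before uploading.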

Naturally, this isn’t the easiest to do, but it is getting easier. It would be great if someone was able to write a tool/webpage that was able to do this sort of cleaning of tracklog data before uploading it to public websites.

Written by Gavin Treadgold

May 12th, 2008 at 6:58 pm