Thursday, July 28, 2011

Making maps with Fusion Tables

Inspired by the Guardian Data Blog, I decided to explore Fusion Tables and Google Maps with Australian data. To start with, I selected a set of Socio-Economic Indexes for Areas, created from 2006 Census data, and postal area boundaries from the Australian Bureau of Statistics (ABS). The results are presented in an earlier post titled Mapping social diversity in NSW. Today I would like to share a few observations about creating maps with data from Fusion Tables.

As outlined in my earlier posts, Fusion Tables is not yet a fully featured thematic mapping, analysis and publishing application. However, with a little bit of effort, anyone can create informative maps which are visually attractive and fast to deploy. The best thing about Fusion Tables is that you don’t need to manage any complex infrastructure yourself and that the application is free (although with some limitations on data storage volumes, currently capped at 250MB per account).

Spatial Data

Since Fusion Tables supports spatial data only in KML format, you have to convert your dataset before uploading it to the server or, alternatively, find a publicly available table that has already been uploaded by someone else.

Google provides a tool to translate SHP data into KML format and import it directly into Fusion Tables, but it didn’t work for me with complex data. There are some free alternatives available (QGIS, for example, is really easy to use), but loading anything other than KML data into Fusion Tables will always be a multi-step process.
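For readers who have not looked inside a KML file, here is a minimal sketch of what a converter has to produce for a single polygon. It uses only the Python standard library. The function name `polygon_to_kml` and the sample coordinates are made up for illustration, and the sketch ignores inner rings, multi-part polygons and attribute columns that a real postal area file would need.

```python
import xml.etree.ElementTree as ET


def polygon_to_kml(name, ring):
    """Return a KML string with one Placemark that holds one polygon.

    `ring` is a list of (longitude, latitude) pairs. KML writes each
    vertex as "lon,lat" and expects the ring to be closed, i.e. the
    first vertex is repeated at the end.
    """
    ring = list(ring)
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # close the ring if the caller did not

    kml = ET.Element("kml", xmlns="http://www.opengis.net/kml/2.2")
    document = ET.SubElement(kml, "Document")
    placemark = ET.SubElement(document, "Placemark")
    ET.SubElement(placemark, "name").text = name
    polygon = ET.SubElement(placemark, "Polygon")
    boundary = ET.SubElement(polygon, "outerBoundaryIs")
    linear_ring = ET.SubElement(boundary, "LinearRing")
    ET.SubElement(linear_ring, "coordinates").text = " ".join(
        f"{lon},{lat}" for lon, lat in ring
    )
    return ET.tostring(kml, encoding="unicode")


# Hypothetical triangle standing in for a postal area boundary.
print(polygon_to_kml("2000", [(0, 0), (1, 0), (1, 1)]))
```

Each row of a spatial table in Fusion Tables then carries a geometry of this kind alongside its other columns.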

If you decide to upload your own data, please note a couple of annoying limitations of Fusion Tables. Firstly, complex polygon structures are not supported (for example, I could not upload postcode number 0822 in Northern Territory at full resolution, yet it works perfectly with Google Maps). Secondly, some larger polygons, and/or polygons with many parts, get generalised automatically as you load them into Fusion Tables, for example postal area 7255 in Tasmania (compare the results below: the same KML file as imported to Fusion Tables, on the left, and as displayed directly on Google Map; note the green outlines on all, even the smallest, islands):

Table search functionality in Fusion Tables is rather crude, so it may not be an easy task to locate what you are looking for. Not to mention that the concept of metadata is non-existent in Fusion Tables, so it is hard to know if the data you find is appropriate for your purposes.

Numeric data

Uploading tabular numeric information in CSV format is very straightforward, but if you disallow the “Export” option up front, you will not be able to edit the data in Fusion Tables. My suggestion is to import the data as “Private” (the default option) and allow “Export”, then add new columns with formulas (if required), and disallow export only when you are ready to publish the data (if at all).

Table Operations

You can easily create a map based on data from numeric tables if those tables contain a “spatial reference” column, for example, postcode numbers (provided you can find an equivalent spatial data set in Fusion Tables). To combine numeric and spatial data tables you have to use the “Merge” function. My suggestion is to use the “smaller table” as a starting point. For example, to create a thematic map with postcodes for the Sydney area only, select the relevant numeric table first and then merge it with a table containing postal areas for the entire NSW. Only relevant boundaries will be included in the merged table (ie. the subset of NSW postcodes). If you do the operation in the reverse order, the merged table will contain all postcodes for NSW but only a handful will have the data that can be used in creating a thematic map.
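The effect of merge order can be sketched outside Fusion Tables. The snippet below assumes that “Merge” behaves like a left join on the shared column, which is what the behaviour described above suggests. It is not Fusion Tables code, and the table contents (postcodes, scores, geometry strings) are invented for illustration.

```python
def merge(first, second, key):
    """Keep every row of `first`; attach the matching row of `second`, if any."""
    lookup = {row[key]: row for row in second}
    # Values from `first` win if both tables share a column name.
    return [{**lookup.get(row[key], {}), **row} for row in first]


# Hypothetical numeric table: scores for two Sydney postcodes only.
sydney_values = [
    {"postcode": "2000", "score": 1050},
    {"postcode": "2010", "score": 1100},
]

# Hypothetical spatial table: boundaries for "all" of NSW (four rows here).
nsw_boundaries = [
    {"postcode": p, "geometry": f"<Polygon {p}>"}
    for p in ("2000", "2010", "2480", "2850")
]

values_first = merge(sydney_values, nsw_boundaries, "postcode")
boundaries_first = merge(nsw_boundaries, sydney_values, "postcode")

print(len(values_first))      # rows when starting from the smaller table
print(len(boundaries_first))  # rows when starting from the boundary table
```

Starting from the numeric table gives one row per Sydney postcode, each with a geometry. Starting from the boundary table keeps every NSW row, and most of them have no score to map.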

When you “Create View” (ie. copy a table, your own or another user’s, to your account) or “Merge” tables with a spatial geometry column, you will lose map formatting parameters (eg. colour settings for polygon fills, etc.). This is very unfortunate, especially when you need to retain the colour scheme from the original table.

Styling Map

Handling “No data” fields is not easy in Fusion Tables. The problem is that polygons with “no value” in the table default to red fill when rendered on the map (as in the example below, where there was no data for postcode 2006 in the merged numeric table). A workaround is to include some value in the table for the missing record (eg. the traditional -9999) if you can. Then you can specify map settings to colour only that value, for example, as white and/or fully transparent.
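If the numeric table is prepared with a script before upload, the placeholder can be filled in at that stage. This is a minimal sketch. The column names and the helper `fill_missing` are hypothetical, and it treats empty strings and missing keys as “no data”, which may not match every source file.

```python
NO_DATA = -9999  # the traditional placeholder mentioned above


def fill_missing(rows, column, sentinel=NO_DATA):
    """Return copies of `rows` with blank or absent `column` values replaced."""
    filled = []
    for row in rows:
        value = row.get(column)
        is_missing = value is None or str(value).strip() == ""
        filled.append({**row, column: sentinel if is_missing else value})
    return filled


# Hypothetical rows: postcode 2006 has no score in the source table.
rows = [
    {"postcode": "2000", "score": 1050},
    {"postcode": "2006", "score": ""},
]
print(fill_missing(rows, "score"))
```

A single styling rule that targets -9999 can then paint those polygons white or fully transparent.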

Fully transparent overlays (eg. if fill is set to 0% transparency) are not clickable. It is a very handy feature for handling polygons with missing data in the numeric table (ie. no information window will be displayed when the polygon is clicked). However, when your objective is to present only the outlines of the polygons on the map but you still want to display information about those polygons when they are clicked, you have to change the transparency parameter to a value greater than 0.


If you are eager to start playing with Fusion Tables, Google has produced an easy-to-follow tutorial on how to create thematic maps (note: if you are working with your own data, choose the “Map” option and not “Intensity Map” in the relevant step).

Tuesday, July 26, 2011

Mapping social diversity in NSW

The concept of relative socio-economic advantage or disadvantage is neither simple nor well defined. The Australian Bureau of Statistics attempts to quantify socio-economic diversity for geographic locations with a suite of four summary measures called Socio-Economic Indexes for Areas (SEIFA).

The four indexes in SEIFA 2006 are:

Index of Relative Socio-economic Disadvantage: derived from Census variables related to disadvantage, such as low income, low educational attainment, unemployment, and dwellings without motor vehicles.

Index of Relative Socio-economic Advantage and Disadvantage: a continuum of advantage (high values) to disadvantage (low values) which is derived from Census variables related to both advantage and disadvantage, like households with low income and people with a tertiary education.

Index of Economic Resources: focuses on Census variables like the income, housing expenditure and assets of households.

Index of Education and Occupation: includes Census variables relating to the educational and occupational characteristics of communities, like the proportion of people with a higher qualification or those employed in a skilled occupation.

While a SEIFA score represents an average of all people living in an area, it does not represent the individual situation of each person. Larger areas are more likely to have greater diversity of people and households.

A SEIFA score is created using information about people and households in a particular area. This score is standardised against a mean of 1000 with a standard deviation of 100. This means that the average SEIFA score will be 1000 and the middle two-thirds of SEIFA scores will fall between 900 and 1100 (approximately).
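The rescaling step can be illustrated in a few lines. This sketch covers only the standardisation described above and says nothing about how the ABS derives the underlying raw scores. The raw values and the function name `standardise` are made up.

```python
from statistics import mean, pstdev


def standardise(raw_scores, target_mean=1000, target_sd=100):
    """Rescale raw scores so that they have the target mean and standard deviation."""
    mu = mean(raw_scores)
    sigma = pstdev(raw_scores)  # population standard deviation of the raw scores
    return [target_mean + target_sd * (x - mu) / sigma for x in raw_scores]


# Hypothetical raw index values for five areas.
raw = [-1.5, -0.5, 0.0, 0.5, 1.5]
scores = standardise(raw)
print([round(s) for s in scores])
```

An area whose raw value equals the average ends up at exactly 1000, and an area one standard deviation above the average ends up at 1100.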

To determine the SEIFA rank, all the areas are ordered from lowest score to highest score. The area with the lowest score is given a rank of 1, the area with the second-lowest score is given a rank of 2 and so on, up to the area with the highest score, which is given the highest rank, being 2615 for the postal areas (POA) index.

Deciles divide a distribution into ten equal groups. In the case of SEIFA, the distribution of scores is divided into ten equal groups. The lowest scoring 10% of areas are given a decile number of 1, the second-lowest 10% of areas are given a decile number of 2 and so on, up to the highest 10% of areas which are given a decile number of 10.
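Ranking and decile assignment as described above can be sketched as follows. The area names and scores are invented, and the handling of tied scores (left to the sort order here) is an assumption rather than the ABS rule.

```python
def rank_and_decile(scores):
    """Map each area to (rank, decile); rank 1 and decile 1 are the lowest scores."""
    ordered = sorted(scores, key=scores.get)  # area names, lowest score first
    n = len(ordered)
    result = {}
    for position, area in enumerate(ordered):
        rank = position + 1
        decile = position * 10 // n + 1  # ten groups of (nearly) equal size
        result[area] = (rank, decile)
    return result


# Twenty hypothetical areas with scores 900, 910, ..., 1090.
scores_by_area = {f"A{i}": 900 + 10 * i for i in range(20)}
table = rank_and_decile(scores_by_area)
print(table["A0"], table["A19"])
```

With 20 areas each decile holds two of them, so the two lowest-scoring areas fall in decile 1 and the two highest in decile 10.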

For more information about SEIFA and its potential uses please refer to the following document: 2039.0 - Information Paper: An Introduction to Socio-Economic Indexes for Areas (SEIFA), 2006

Data tables and maps are available for reference and further reuse via Google’s Fusion Tables:

SEIFA 2006 for NSW Index of Disadvantage
SEIFA 2006 for NSW Advantage-Disadvantage
SEIFA 2006 for NSW Economic Resources
SEIFA 2006 for NSW Education-Occupation
SEIFA for Postal Areas Census 2006 (data table)
Postal Areas NSW Census 2006 Edition (postal area boundaries)

Related posts:
Interactive Atlas of NSW
Free postcodes with reference map
More free data with reference map
Postcode maps and population statistics

Monday, July 25, 2011

Shp data and Fusion Tables

A lot of geographic data in the public domain is distributed in SHP format. However, the Fusion Tables application supports geographic data only in KML format. Google has recognised the opportunity and is now providing a link to a “translator/loader” application to facilitate uploading of SHP files into Fusion Tables. Shpescape has been implemented with the GeoDjango framework and is aimed at facilitating the process of converting and loading that vast resource of GIS data from SHP format into Fusion Tables. It should potentially improve uptake of Fusion Tables by the GIS community as well as the broader application development community.

The concept behind Shpescape is great but, for now, it fails in terms of performance. I tried the application with a modest-size SHP dataset (40MB) and the result was not impressive. It took an extremely long time to upload the data to the server, process it into KML and load it into Fusion Tables (just short of an hour!). I know from my own experiments that converting SHP into KML takes only a few seconds with a basic PHP script. Allowing for download and upload time (since 2 separate servers are involved), the whole process should be finished in a matter of minutes and not almost an hour. The biggest disappointment was that the algorithm used in Shpescape enforces generalisation of polygons and does not process “point for point” from SHP to KML [correction: it is actually an undesirable Fusion Tables feature and not Shpescape’s fault]. It resulted in some polygons being converted incorrectly and/or corrupted in the process (as per image below).

Shpescape will work with small SHP files with simple geometries but, as it stands, Fusion Tables is unable to handle full-resolution datasets. Therefore it may be better to generalise SHP files before loading them into Fusion Tables via Shpescape.
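One common way to generalise a boundary yourself is the Douglas-Peucker algorithm, which drops vertices that lie within a chosen tolerance of the simplified line. The sketch below is a plain Python version for a single open line of (x, y) points. It is not what Shpescape or Fusion Tables does internally, and the tolerance and sample points are arbitrary. A closed polygon ring would normally be split into two open lines first, and a GIS package such as QGIS offers the same operation without any coding.

```python
import math


def _distance_to_segment(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:  # a and b coincide
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Douglas-Peucker: keep the end points plus any vertex farther than `tolerance`."""
    if len(points) < 3:
        return list(points)
    index, farthest = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _distance_to_segment(points[i], points[0], points[-1])
        if d > farthest:
            index, farthest = i, d
    if farthest <= tolerance:
        return [points[0], points[-1]]
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right  # the split vertex appears in both halves


line = [(0, 0), (1, 0.01), (2, 0), (3, 2), (4, 0)]
print(simplify(line, 0.1))
```

In this example the nearly collinear vertex at (1, 0.01) is dropped while the sharp peak at (3, 2) is kept.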

Related Posts:
Converting shapefile to KML
Converting csv data into shapefile

Thursday, July 21, 2011

New style for Google Maps

There is a good chance that you haven’t noticed the subtle changes to the cartographic design of Google Maps that the company is continuously implementing. However, if you put different versions of Google Maps side by side, it becomes very obvious how dramatically the appearance has changed over the last few years. The key objective behind those changes is “… to make the map cleaner, more focused, more visually harmonious, and easier to use.

…Some highlights to look out for are a brighter and more cheerful colour palette, a more integrated and less visually noisy labelling style, subtle improvements to footpaths and minor roads, and cleaner building and land parcel rendering.”

One thing Google cannot be accused of is failing to put continuous effort into upgrading its products and services. In fact, that constant tinkering with features and functionality gives the impression that all Google products are in a permanent state of development. With Google we never know what functionality is coming and when it will be available, or whether the product or service will survive in the long run, as the company is not afraid to pull down underperforming applications. The most recent announcement is the closure of Google Labs with 56 experimental products. Product-specific Labs sites, like Gmail Labs, Google Maps Labs and Search Experiments, aren't affected by the decision.

First spotted on Google Maps Mania

Monday, July 18, 2011

Fusion Tables yet to ignite

Google is making consistent but slow progress with Fusion Tables, gradually enabling various functionality options to turn the application into a comprehensive data visualisation and sharing package. The idea behind Fusion Tables is simple: allow people to upload data in a tabular format, then present that data with graphs, or geocode/match it to spatial data and display it on Google Maps as thematic overlays or location points. Undoubtedly, the integration of tables, maps and graphs is Google’s response to the emerging trend for “data marts” and “data journalism”.

I wrote about Fusion Tables with great excitement a year ago, concluding that the application has the potential to evolve into a formidable competitor to PostGIS, ArcSDE, Oracle Spatial or SQL Server for basic GIS applications. Although the recent addition of dynamic styling capabilities takes Fusion Tables closer to that goal, the application still has a long way to go to reach that point. Unfortunately, the implementation of Fusion Tables is in what has lately become typical Google fashion: unattractive and rather complex to follow, so most likely only the “hard core” developer community will be taking advantage of it. The limit of 250MB of data per account is not helping either. There is no catalogue of available data (although basic text search is enabled) and no metadata for public tables, which does not facilitate sharing.

Nevertheless, you can already make nice and very responsive maps with Fusion Tables, as in this example from the Guardian’s Data Blog:

Related Posts:
Google launches Fusion Tables
Ingenuity of Google Map architecture also its main limitation