Once we have found a source image of a network on the web, we trace it using the yEd graph drawing program. Adding nodes and edges captures the network topology. We also annotate the traced network with node and edge labels.
If applicable, we can mark different types of nodes (such as external transit providers) using different shapes. Finally, we can categorise nodes (such as core or edge) using colours. The traced network is then saved in the GML graph exchange format, suitable for processing.

We have written a set of Python scripts to process the networks traced using yEd. These scripts make extensive use of the NetworkX graph library.

The merge script processes the yEd GML file, extracting topology and relevant metadata, and discarding graphics information.
It can also merge in metadata provided in an external comma-separated values (CSV) file, such as the location of the source image, date of the network, or network type.
The GeoNames web service can be used to look up the geographical location of each node from its name.
The GML files produced by the merge script are then suitable for inclusion in the Zoo, and can be found in the dataset.
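
The merge step can be illustrated with a minimal sketch. It is not the actual merge script: the GML snippet, the CSV column names (`Network`, `Source`, `Date`, `Type`), and the helper functions `strip_graphics` and `merge_metadata` are hypothetical, and it assumes yEd stores drawing details under the `graphics` and `LabelGraphics` keys. Only standard NetworkX and `csv` calls are used.

```python
import csv
import io

import networkx as nx

# Hypothetical yEd-style GML: topology plus drawing attributes.
YED_GML = """
graph [
  node [
    id 0
    label "Sydney"
    graphics [ x 10.0 y 20.0 type "ellipse" ]
    LabelGraphics [ text "Sydney" ]
  ]
  node [
    id 1
    label "Melbourne"
    graphics [ x 30.0 y 40.0 type "rectangle" ]
  ]
  edge [
    source 0
    target 1
    graphics [ width 2 ]
  ]
]
"""

# Hypothetical external metadata; the column names are assumptions.
METADATA_CSV = """Network,Source,Date,Type
ExampleNet,http://example.org/map.png,2010-08,REN
"""

# Keys assumed to hold drawing information only.
GRAPHICS_KEYS = ("graphics", "LabelGraphics")


def strip_graphics(graph):
    """Remove drawing attributes from all nodes and edges, in place."""
    for _, data in graph.nodes(data=True):
        for key in GRAPHICS_KEYS:
            data.pop(key, None)
    for _, _, data in graph.edges(data=True):
        for key in GRAPHICS_KEYS:
            data.pop(key, None)
    return graph


def merge_metadata(graph, csv_text, network_name):
    """Copy the CSV row for network_name into the graph-level attributes."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["Network"] == network_name:
            graph.graph.update(
                {key: value for key, value in row.items() if key != "Network"}
            )
    return graph


# label="id" keys nodes by their GML id, so repeated labels cannot clash.
graph = nx.parse_gml(YED_GML, label="id")
strip_graphics(graph)
merge_metadata(graph, METADATA_CSV, "ExampleNet")
```

After these calls, each node keeps only its `label`, and the network-level metadata sits in `graph.graph`, ready to be written out with the topology.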

The convert script converts GML files from the Zoo into other file formats, including GraphML and adjacency matrices suitable for use in Matlab. It also generates the table shown on the dataset page.
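
As a rough illustration of this kind of conversion, the sketch below turns a small GML graph into GraphML text and a comma-separated adjacency matrix. It is an assumption-laden example rather than the convert script itself: the sample GML and the helpers `to_graphml`, `to_adjacency_matrix`, and `matrix_to_text` are hypothetical, and the matrix is built with plain lists to avoid extra dependencies.

```python
import networkx as nx

# Hypothetical Zoo-style GML with graphics already removed.
ZOO_GML = """
graph [
  node [ id 0 label "Adelaide" ]
  node [ id 1 label "Melbourne" ]
  node [ id 2 label "Sydney" ]
  edge [ source 0 target 1 ]
  edge [ source 1 target 2 ]
]
"""


def to_graphml(graph):
    """Serialise the graph as a GraphML string."""
    return "\n".join(nx.generate_graphml(graph))


def to_adjacency_matrix(graph):
    """Return (node order, 0/1 adjacency matrix as a list of rows)."""
    nodes = list(graph.nodes)
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for u, v in graph.edges():
        matrix[index[u]][index[v]] = 1
        if not graph.is_directed():
            # Undirected links appear symmetrically in the matrix.
            matrix[index[v]][index[u]] = 1
    return nodes, matrix


def matrix_to_text(matrix):
    """Format the matrix as comma-separated rows, one row per line."""
    return "\n".join(",".join(str(value) for value in row) for row in matrix)


graph = nx.parse_gml(ZOO_GML)  # nodes are keyed by their label
graphml_text = to_graphml(graph)
nodes, matrix = to_adjacency_matrix(graph)
matrix_text = matrix_to_text(matrix)
```

Saving `matrix_text` to a file gives a plain numeric table that Matlab can load as a matrix; the row and column order follows `nodes`.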

The geoplot script uses the geographical information obtained by the merge script to plot network maps. It relies on NetworkX and the Basemap Toolkit for matplotlib.
An example plot is shown below.
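
The coordinate handling behind such a plot might look like the following sketch. It is not the geoplot script: it assumes nodes carry `Latitude` and `Longitude` attributes, the helpers `node_positions` and `plot_network` are hypothetical, and Basemap is deliberately left out, so longitude and latitude are drawn directly as x and y with no map projection or coastline.

```python
import networkx as nx


def node_positions(graph, lon_key="Longitude", lat_key="Latitude"):
    """Map each located node to an (x, y) = (longitude, latitude) pair.

    Nodes without both coordinates are skipped.
    """
    positions = {}
    for node, data in graph.nodes(data=True):
        if lon_key in data and lat_key in data:
            positions[node] = (float(data[lon_key]), float(data[lat_key]))
    return positions


def plot_network(graph, path):
    """Draw the located part of the graph and save it to path."""
    # Imported lazily so the position helper works without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    positions = node_positions(graph)
    located = graph.subgraph(positions)  # drop nodes with no coordinates
    fig, ax = plt.subplots()
    nx.draw_networkx(located, pos=positions, ax=ax, node_size=50, font_size=8)
    fig.savefig(path)
    plt.close(fig)


# Hypothetical example network; coordinates are approximate.
graph = nx.Graph()
graph.add_node("Sydney", Latitude=-33.87, Longitude=151.21)
graph.add_node("Melbourne", Latitude=-37.81, Longitude=144.96)
graph.add_node("Transit")  # external node with no known location
graph.add_edges_from([("Sydney", "Melbourne"), ("Sydney", "Transit")])

positions = node_positions(graph)
```

Calling `plot_network(graph, "example.png")` would then write the figure, assuming matplotlib is installed. A map projection such as the one Basemap provides would transform each longitude and latitude pair before drawing.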