CartoType Map Data Format Type 1 (CTM1)

Introduction

CartoType Map Data Format 1, known for short as CTM1, is a binary file format for storing vector map data. Map Data Format 1 is one of the data formats supported by CartoType. It is the format used for demonstration software on desktop and laptop computers and mobile devices. The format is platform-independent.

Map data is taken from OSM (OpenStreetMap) files, SHP files and other data sources and converted to CartoType Map Data Format 1 by tools delivered as part of the CartoType toolset. The main tool is generate_map_data_type1.

Copyright and license

This format is copyright © 2004-2008 Cartography Ltd. For unrestrictive licensing please contact Cartography Ltd. However, you may use this documentation and the data format described in it under the following conditions: any programs or other computer software components that use the format must contain an acknowledgement in the 'Help' or 'About' menu or splash screen or documentation, using text of an easily legible size and style, in the words "The CTM1 data format is licensed from Cartography Ltd (http://www.cartotype.com)."; further, any changes to the CTM1 format or documentation must be made available under the same license and conditions.

CTM1 versions

28th April 2004 0.0  
4th May 2004 1.0 strings are now preceded by their lengths instead of null-terminated
5th May 2004 2.0 map objects are now preceded by their lengths in bytes
12th November 2004 2.1 object coordinates are now stored compactly if possible using 16-bit offsets
28th February 2005 2.2 added table 2 (label index)
9th September 2005 2.3 added table 3 (projections)
16th April 2006 2.4 if there are 3 sub-regions the third is the overlap region
25th August 2008 3.0 object names are replaced by null-separated named string attributes; the change allows older (version 2.*) CTM1 files to be read by later CartoType versions but not vice versa

Data types

The following data types are used. All multi-byte numbers are big-endian, which means more significant bytes come first.

int8: 8-bit signed integer stored as one byte

uint8: 8-bit unsigned integer stored as one byte

int16: 16-bit signed integer stored as two bytes

uint16: 16-bit unsigned integer stored as two bytes

int32: 32-bit signed integer stored as four bytes

uint32: 32-bit unsigned integer stored as four bytes

fixed: 32-bit signed integer stored as four bytes, representing a fixed-point number equal to 1/65536 of this number (e.g., 65536 represents 1.0; 32768 represents 0.5).

latitude: latitude stored as degrees in a fixed-point number, taking degrees North of the equator as positive and degrees South as negative: see the ‘data shift’ in table 0 to fully understand this

longitude: longitude stored as degrees in a fixed-point number, taking degrees East of Greenwich as positive and degrees West as negative: see the ‘data shift’ in table 0 to fully understand this

metres: projected metres; used instead of degrees if the map data has already been projected

string: a length followed by either a UTF-8 or a UTF-16 string, depending on the string format entry in the global information table; the length is a single byte unless that byte is 255, in which case a four-byte length follows that sentinel
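
As a rough illustration of the types above, the following Python sketch decodes a fixed value and a string length prefix from a byte buffer. The function names read_fixed and read_string_length are hypothetical and not part of the CartoType toolset. The sketch relies only on what is stated above (big-endian storage, a scale of 1/65536, and the 255 sentinel), plus one assumption: that the four-byte length is unsigned.

```python
import struct
from typing import Tuple

def read_fixed(buf: bytes, pos: int) -> Tuple[float, int]:
    """Decode a big-endian 'fixed' value; return (value, next position)."""
    (raw,) = struct.unpack_from(">i", buf, pos)
    return raw / 65536.0, pos + 4

def read_string_length(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a string length prefix; return (length, next position).

    Assumption: the four-byte length after the 255 sentinel is unsigned.
    """
    length = buf[pos]
    pos += 1
    if length == 255:
        # The single byte was the sentinel; the real length follows.
        (length,) = struct.unpack_from(">I", buf, pos)
        pos += 4
    return length, pos
```

The string body that follows the length would then be decoded as UTF-8 or UTF-16 according to the string format entry in the global information table.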

File structure

The file format consists of a header followed by some tables, some of which are mandatory. This is a very similar pattern to that of a TrueType font file. To find a table, look it up in the table index, which contains table IDs and offsets to the tables from the start of the file.

The header

The format of the header is:

uint8 * 4 signature the file signature, which must be the bytes ‘CTm1’
uint16 version major the major version number of the file format
uint16 version minor the minor version number of the file format
uint16 tables number of tables in the file

for each table:

uint16 table ID unique ID of the table
uint32 table offset offset of the table from the start of the file

Tables may appear in any order in the table index. A non-existent table may either be absent from the table index, or may be indicated by a table offset of zero.
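
The header layout above could be read along the following lines. This is a minimal Python sketch, not CartoType's own reader; the function name read_header and the choice to return the table index as a dictionary are assumptions made for illustration. Tables with an offset of zero are treated as non-existent, as described above.

```python
import struct
from typing import Dict, Tuple

def read_header(buf: bytes) -> Tuple[int, int, Dict[int, int]]:
    """Return (version major, version minor, {table ID: table offset})."""
    if buf[:4] != b"CTm1":
        raise ValueError("not a CTM1 file: bad signature")
    major, minor, table_count = struct.unpack_from(">HHH", buf, 4)
    tables = {}
    pos = 10  # 4 signature bytes + three uint16 fields
    for _ in range(table_count):
        table_id, offset = struct.unpack_from(">HI", buf, pos)
        pos += 6
        if offset != 0:  # an offset of zero marks a non-existent table
            tables[table_id] = offset
    return major, minor, tables
```

A caller would then look up, for example, tables.get(1) to find the layer table, and treat a missing key as an absent table.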

Global information (table ID = 0)

uint8 string format 0 for UTF-16 strings, 1 for UTF-8 strings
string data set name name describing the data (e.g., ‘Hertfordshire’)
string copyright copyright and ownership details
uint16 version major the major version number of this data
uint16 version minor the minor version number of this data
fixed * 4 bounding box bounds of the data, in the order left longitude, top latitude, right longitude, bottom latitude
uint32 map datum ID 0 = unknown, 1 = WGS 84, etc.
uint8 point format 0 = unknown, 1 = degrees, 2 = metres
uint8 data shift number of bits by which to shift latitude and longitude data: if 0, the numbers represent degrees; if 1, half-degrees; if 2, quarter-degrees, etc.; this field is always zero in current practice
uint8 x axis direction direction of x axis: 0 = left, 1 = right
uint8 y axis direction direction of y axis: 0 = down, 1 = up
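
One possible reading of the data shift field, combined with the fixed-point storage of latitude and longitude, is sketched below in Python. The function name raw_to_degrees is hypothetical, and the interpretation (each shift bit halves the unit) is an assumption drawn from the description above rather than a confirmed rule; since the field is always zero in current practice, the shift has no effect on existing files.

```python
def raw_to_degrees(raw: int, data_shift: int = 0) -> float:
    """Convert a raw 32-bit coordinate to degrees.

    Assumption: the raw value is a 16.16 fixed-point number whose unit is
    degrees when data_shift is 0, half-degrees when 1, quarter-degrees
    when 2, and so on.
    """
    return raw / 65536.0 / (1 << data_shift)
```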

Layers (table ID = 1)

A map data file can contain data for any number of layers. The layer name must be 'road' for the road layer. Other layer names may be chosen freely. The layer table enumerates the layers and provides offsets to sub-tables giving the data for each layer.

uint16 layers number of layers

For each layer:

string layer name layer name
uint32 layer data offset from the start of the layer table to the data for this layer
uint8 integer attributes number of integer attributes in this layer

For each integer attribute:

string attribute name name of the attribute
uint8 attribute type type of attribute. 1 = integer; that is the only value used at the moment
uint32 attribute default attribute's default value

Layer data

Each layer has a layer data table that is part of table 1 and accessed at a known offset from the start of that table.

A layer is optionally divided into two or more clip regions, each containing only objects that intersect it. Every map object occurs only once in the data. If a map object intersects more than one region in divided data, it is not stored in the divided data but directly in the higher-level region.

There is a special convention if the number of divisions is three. In that case, the containing region is split into two halves which become sub-regions 0 and 1. The third sub-region, which is the same size, is the overlap region and is centred on the boundary between sub-regions 0 and 1. Objects intersecting both sub-regions 0 and 1 and entirely contained in sub-region 2 are placed in sub-region 2. This technique reduces the average size of a region that contains an object and thus allows the data accessor to read fewer objects that will ultimately not be drawn because they do not intersect the clip area.
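
The three-division convention can be sketched for a single axis as follows. This Python example is illustrative only: the function name choose_sub_region is hypothetical, coordinates are assumed to increase along the axis being divided, and the treatment of objects that exactly touch the boundary is an assumption, since the text above does not specify it.

```python
from typing import Optional

def choose_sub_region(obj_min: float, obj_max: float,
                      lo: float, hi: float) -> Optional[int]:
    """Pick the sub-region (0, 1 or 2) for an object spanning
    [obj_min, obj_max] inside a region spanning [lo, hi].

    Returns None if the object must stay in the containing region.
    """
    mid = (lo + hi) / 2        # boundary between sub-regions 0 and 1
    quarter = (hi - lo) / 4    # half the width of one sub-region
    if obj_max < mid:          # lies wholly in the first half
        return 0
    if obj_min >= mid:         # lies wholly in the second half
        return 1
    # The object straddles the boundary: it goes in the overlap region
    # only if that region, centred on the boundary, contains it entirely.
    if obj_min >= mid - quarter and obj_max <= mid + quarter:
        return 2
    return None
```

With a region from 0 to 100, an object spanning 45 to 55 lands in sub-region 2, whereas one spanning 10 to 90 stays in the containing region.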

Because of the clipping system, the data format is recursive. It consists of a horizontal clip region optionally divided into vertical regions, optionally divided into horizontal regions, etc. Each region is defined as follows:

uint16 divisions number of region dividers (0 or more)

If divisions = 0, this is undivided data and the data follows immediately.

If divisions = n > 0, for each of the n + 1 regions we have

uint32 clip boundary latitude or longitude of the boundary following (that is, right of or below) the region
uint32 data offset from the start of the layer table (table 1) to the data in this region, which may be further divided

The region data is followed by the data.

The data itself is stored as:

uint32 items number of map objects

followed by the map objects themselves. If the layer has attributes, map objects are sorted in ascending order of the integer value of their first attribute.

Map object structure

Map objects start with their size in bytes as a four-byte integer. This allows the accessor to skip objects at high speed if their attributes don’t match what is required.

Each object in the layer may optionally possess one or more integer attributes. The presence of the integer attributes is marked by a flag bit in the object type. If the bit is set, all the integer attributes are present. If not, none are.

String attributes are stored as null-delimited concatenations of <attrib>=<value> pairs, where <attrib> can contain any characters except ‘=’ and null and <value> can contain any character except null. However, the first attribute is the name (also known as the ‘label’) and is stored as a plain value, without ‘<attrib>=’. The presence of one or more string attributes is also marked by a flag bit.
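
The string attribute convention might be unpacked as in the following Python sketch, which works on an already decoded string. The function name parse_string_attributes is hypothetical. Skipping empty pieces and pieces without an ‘=’ is an assumption, since the text above does not say how such cases should be handled.

```python
from typing import Dict, Tuple

def parse_string_attributes(text: str) -> Tuple[str, Dict[str, str]]:
    """Split a null-delimited attribute string into (name, attributes)."""
    parts = text.split("\x00")
    name = parts[0]  # the first attribute is the plain name (label)
    attributes = {}
    for part in parts[1:]:
        # Split at the first '=' only: values may themselves contain '='.
        key, sep, value = part.partition("=")
        if sep:  # assumption: ignore empty or malformed pieces
            attributes[key] = value
    return name, attributes
```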

Map object types are identified by the next byte: the format is different for each type.

The object type byte has the following structure:

bits 0…1 basic type: 0 = point, 1 = line, 2 = polygon, 3 = elevation grid
bit 2 set if the point data is stored compactly – see below
bit 3 set if this object has integer attributes (as listed in the layer definition)
bit 4 set if this object has string attributes
bit 5 set if this object has more than one contour (more than one point for type 0)
bit 6 set if this object has more than 255 points
bit 7 set if this object has more than 65535 points

Numbers of points, numbers of contours and contour ends are stored in one byte if bit 6 is clear, two bytes if bit 6 is set but not bit 7, and four bytes in the unlikely event that bit 7 is set.
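
The bit layout of the object type byte can be decoded as in this Python sketch. The class and function names (ObjectType, decode_object_type) and the field names are hypothetical; the bit meanings are taken from the list above.

```python
from typing import NamedTuple

class ObjectType(NamedTuple):
    basic_type: int              # 0 = point, 1 = line, 2 = polygon, 3 = elevation grid
    compact: bool                # bit 2: point data stored compactly
    has_int_attributes: bool     # bit 3
    has_string_attributes: bool  # bit 4
    has_multiple: bool           # bit 5: more than one contour (or point, for type 0)
    count_size: int              # bytes used for point counts, contour counts and contour ends

def decode_object_type(type_byte: int) -> ObjectType:
    if type_byte & 0x80:         # bit 7: more than 65535 points
        count_size = 4
    elif type_byte & 0x40:       # bit 6: more than 255 points
        count_size = 2
    else:
        count_size = 1
    return ObjectType(
        basic_type=type_byte & 0x03,
        compact=bool(type_byte & 0x04),
        has_int_attributes=bool(type_byte & 0x08),
        has_string_attributes=bool(type_byte & 0x10),
        has_multiple=bool(type_byte & 0x20),
        count_size=count_size,
    )
```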

Storage of data points

Data points represent either degrees stored as 16:16 fixed-point numbers (that is, 32-bit quantities that represent 65536ths of a degree) or, if the data has been pre-projected according to a map projection, projected metres.

All points are stored as longitude followed by latitude – x coordinates precede y coordinates.

If the compact bit is set in the object type, the first point is a 32-bit number and following points are 16-bit signed differences from the preceding point ('deltas'). If the compact bit is not set all the points are 32-bit numbers.
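
A reader for both point encodings might look like the following Python sketch. The function name read_points is hypothetical. The sketch assumes that each point occupies two values, x then y, so that the first point is a pair of 32-bit integers and each compact delta is a pair of 16-bit signed integers; the text above does not spell out this per-coordinate layout.

```python
import struct
from typing import List, Tuple

def read_points(buf: bytes, pos: int, count: int,
                compact: bool) -> Tuple[List[Tuple[int, int]], int]:
    """Read `count` points starting at `pos`; return (points, next position)."""
    points = []
    x = y = 0
    for i in range(count):
        if compact and i > 0:
            # Assumption: a delta is two int16 values, dx then dy.
            dx, dy = struct.unpack_from(">hh", buf, pos)
            pos += 4
            x, y = x + dx, y + dy
        else:
            # Full 32-bit coordinates, x (longitude) before y (latitude).
            x, y = struct.unpack_from(">ii", buf, pos)
            pos += 8
        points.append((x, y))
    return points, pos
```

Under these assumptions, compact storage uses 4 bytes per point after the first instead of 8.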

Type 0 (point)

int32 * number of attributes for this layer (defined in the layer table) integer attributes integer attribute values, only present if bit 3 of the type was set
string string attributes string attribute values as described above; only present if bit 4 of the type was set
uint points number of points; only present if bit 5 of the type is set

for each point:

longitude x coordinate
latitude y coordinate

or, if the data has been projected, which is the more usual case

metres x coordinate
metres y coordinate

stored either as 32-bit integers or (if the compact object type bit is set) a 32-bit initial point followed by 16-bit deltas.

Type 1 (line)

int32 * number of attributes for this layer (defined in the layer table) integer attributes integer attribute values, only present if bit 3 of the type was set
string string attributes string attribute values as described above; only present if bit 4 of the type was set
uint contours number of contours; only present if bit 5 of the type is set
uint * number of contours contour ends end index of each contour

for each point:

longitude x coordinate
latitude y coordinate

or, if the data has been projected, which is the more usual case

metres x coordinate
metres y coordinate

stored either as 32-bit integers or (if the compact object type bit is set) a 32-bit initial point followed by 16-bit deltas.

Type 2 (polygon)

Identical to type 1.

Type 3 (elevation grid)

int32 bottom of linear height range, in metres, represented by a height data byte
int32 top of height range
int32 * 4 bounds of area covered by data, in the order left, top, right, bottom
int32 width: number of data points in each row
int32 height: number of data points in each column
int32 size of each cell in the same units as the bounds; strictly speaking, unnecessary, but included for convenience

int8[width * height] the data as a row-wise array of single data bytes

Label index (table ID = 2)

The label index is stored as a packed array trie. The array contains nodes and each node is identified by an index in the trie. The entries of the node correspond to the alphabet indexes. An entry belongs to the node if the alphabet index stored in the entry equals the alphabet index leading to the entry. The last entry of a node with alphabet index = alphabet_size can be used to store any intermediate references to objects. In the following description an object ID is the offset in bytes of a map object from the start of the CTM1 file.

uint32 array trie offset offset of the array trie table from the start of the file
uint32 suffix table offset offset of the suffix table from the start of the file
uint32 object id table offset offset of the object id table from the start of the file

Alphabet

uint16 alphabet size size of the alphabet used in the index

For each entry:

uint16 alphabet char Unicode value of alphabet character

The type alphabet_index is derived from alphabet size: uint8 if alphabet size < 255, uint16 otherwise.

Packed array trie

For each entry:

alphabet_index index of alphabet character for this entry
uint32 link:
the two most significant bits indicate link type, the other bits the value:
0 = value is index in array trie table
1 = value is byte offset in suffix table
2 = value is byte offset in object id table
3 = value is object id
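
Splitting a link into its type and value is a matter of masking, as this Python sketch shows. The constant and function names are hypothetical labels for the four link types listed above.

```python
from typing import Tuple

LINK_TRIE_INDEX = 0       # value is index in array trie table
LINK_SUFFIX_OFFSET = 1    # value is byte offset in suffix table
LINK_OBJECT_ID_TABLE = 2  # value is byte offset in object id table
LINK_OBJECT_ID = 3        # value is object id

def decode_link(link: int) -> Tuple[int, int]:
    """Return (link type, value) for a 32-bit trie link."""
    # Top two bits carry the type, the remaining 30 bits the value.
    return (link >> 30) & 0x3, link & 0x3FFFFFFF
```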

Suffix table

For each entry:

alphabet_index * suffix length alphabet indexes of suffix
alphabet_index 0xFF(FF) is the suffix terminator
uint32 * objects object ids of map objects with resulting label. The last one has the highest bit set

Object id table

For each entry:

uint32 * objects object ids of map objects with resulting label. The last one has the highest bit set
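
Because the last object id in a list is flagged by its highest bit, a list can be read without a separate count. The Python sketch below illustrates this; the function name read_object_ids is hypothetical, and clearing the flag bit to recover the id of the last entry is an assumption.

```python
import struct
from typing import List, Tuple

def read_object_ids(buf: bytes, pos: int) -> Tuple[List[int], int]:
    """Read object ids until one has the highest bit set.

    Returns (ids, next position).
    """
    ids = []
    while True:
        (value,) = struct.unpack_from(">I", buf, pos)
        pos += 4
        # Assumption: the flag bit is not part of the id itself.
        ids.append(value & 0x7FFFFFFF)
        if value & 0x80000000:  # highest bit marks the last id
            return ids, pos
```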

Projection (table ID = 3)

If the data is pre-projected the projection used to convert the data to map metres can optionally be stored in this table. This is used for converting between latitude and longitude and map points, for example when synchronising the map to a position received from a GPS receiver.

uint32 projection type The type of the projection. Legal types are: 2 = transverse Mercator; 3 = universal transverse Mercator; 4 = cylindrical equidistant; 5 = universal transverse Mercator (ellipsoidal); 6 = Mercator.

The projection follows in serialised form.

to do: document the serialised form of the projections