
CartoType Map Data Format Type 1 (CTM1)

Introduction

CartoType Map Data Format 1, known for short as CTM1, is CartoType's standard format for storing vector and image map data. The format is platform-independent.

Map data is taken from OSM (OpenStreetMap) files, SHP files and other data sources and converted to CartoType Map Data Format 1 by tools delivered as part of the CartoType toolset. The main tool is makemap.

Copyright and license

This format is copyright © 2004-2013 CartoType Ltd. For unrestrictive licensing please contact CartoType Ltd. However, you may use this documentation and the data format described in it under the following conditions: any programs or other computer software components that use the format must contain an acknowledgement in the 'Help' or 'About' menu or splash screen or documentation, using text of an easily legible size and style, in the words "The CTM1 data format is licensed from CartoType Ltd (http://www.cartotype.com)." and the CTM1 format and documentation must be made available under the same license and conditions.

CTM1 versions

28th April 2004 0.0  
4th May 2004 1.0 strings are now preceded by their lengths instead of null-terminated
5th May 2004 2.0 map objects are now preceded by their lengths in bytes
12th November 2004 2.1 object coordinates are now stored compactly if possible using 16-bit offsets
28th February 2005 2.2 added table 2 (label index)
9th September 2005 2.3 added table 3 (projections)
16th April 2006 2.4 if there are 3 sub-regions the third is the overlap region
25th August 2008 3.0 object names are replaced by null-separated named string attributes; the change allows older (version 2.*) CTM1 files to be read by later CartoType versions but not vice versa
2nd December 2009 3.1 table 3 is replaced by table 4, which fulfils the same purpose but has a new format allowing higher-resolution point formats to be used
21st September 2010 3.2 re-introduced type 3 map objects, which are now general bitmaps that can be used for terrain shading textures, or for anything else
29th June 2011 4.0 object bounds are now stored at the start of every object, at a cost of 16 bytes per object, or 8 bytes for single-point objects; generate_map_data_type1 can still generate CTM1 files in format version 3.2, and CartoType can still read them, if this overhead is not desired
3rd December 2011 4.1 added table 7, the palette table, for shared palettes used by raster image objects
24th February 2013 4.2 the global information table now stores the extent of the map in degrees as well as in map coordinates; useful for projections where it is hard to derive the former from the latter

Data types

The following data types are used. All multi-byte numbers are big-endian, which means more significant bytes come first.

int8: 8-bit signed integer stored as one byte

uint8: 8-bit unsigned integer stored as one byte

int16: 16-bit signed integer stored as two bytes

uint16: 16-bit unsigned integer stored as two bytes

int32: 32-bit signed integer stored as four bytes

uint32: 32-bit unsigned integer stored as four bytes

fixed: 32-bit signed integer stored as four bytes, representing a fixed-point number equal to 1/65536 of this number (e.g., 65536 represents 1.0; 32768 represents 0.5).

latitude: latitude stored as degrees in a fixed-point number, taking degrees North of the equator as positive and degrees South as negative: see the ‘data shift’ in table 0 to fully understand this

longitude: longitude stored as degrees in a fixed-point number, taking degrees West of Greenwich as negative and degrees East as positive: see the ‘data shift’ in table 0 to fully understand this

metres: projected metres; used instead of degrees if the map data has already been projected

string: a length followed by either a UTF8 or a UTF16 string, depending on the string format entry in the global information table; the length is a single byte unless that byte is 255, in which case a four-byte length follows that sentinel
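The two non-trivial types above, fixed and string, can be read along the following lines. This is a minimal sketch in Python, not CartoType's own code. The function names are hypothetical, and two points are assumptions because the text does not state them: that the string length counts bytes, and that UTF-16 text is big-endian like the numeric types.

```python
import struct

def read_fixed(buf: bytes, pos: int) -> tuple[float, int]:
    """Read a big-endian 16.16 'fixed' value; return (value, next position)."""
    (raw,) = struct.unpack_from(">i", buf, pos)
    return raw / 65536.0, pos + 4

def read_string(buf: bytes, pos: int, utf8: bool = True) -> tuple[str, int]:
    """Read a length-prefixed string; return (text, next position).

    Assumptions (not stated in the text): the length counts bytes, and
    UTF-16 data is big-endian like the numeric types.
    """
    length = buf[pos]
    pos += 1
    if length == 255:  # sentinel: the real length follows as four bytes
        (length,) = struct.unpack_from(">I", buf, pos)
        pos += 4
    raw = buf[pos:pos + length]
    return raw.decode("utf-8" if utf8 else "utf-16-be"), pos + length
```

Each reader returns the position of the next unread byte, so calls can be chained while walking a table.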

File structure

The file format consists of a header followed by some tables, some of which are mandatory. This is a very similar pattern to that of a TrueType font file. To find a table, look it up in the table index, which contains table IDs and offsets to the tables from the start of the file.

The header

The format of the header is:

uint8 * 4 signature the file signature, which must be the bytes ‘CTm1’
uint16 version major the major version number of the file format
uint16 version minor the minor version number of the file format
uint16 tables number of tables in the file

for each table:

uint16 table ID unique ID of the table
uint32 table offset offset of the table from the start of the file

Tables may appear in any order in the table index. A non-existent table may either be absent from the table index, or may be indicated by a table offset of zero.
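As an illustration, the header and table index might be read as follows. This is a sketch only; read_header is a hypothetical name, and the signature bytes are taken exactly as written in the header description above.

```python
import struct

def read_header(buf: bytes) -> tuple[tuple[int, int], dict[int, int]]:
    """Return ((major, minor), {table ID: offset}) from the start of a file."""
    signature, major, minor, count = struct.unpack_from(">4sHHH", buf, 0)
    if signature != b"CTm1":  # signature bytes as given in the text
        raise ValueError("not a CTM1 file")
    tables = {}
    pos = 10  # 4 signature bytes + three uint16 fields
    for _ in range(count):
        table_id, offset = struct.unpack_from(">HI", buf, pos)
        pos += 6
        if offset != 0:  # a zero offset marks a non-existent table
            tables[table_id] = offset
    return (major, minor), tables
```

Looking up a table is then a dictionary access, for example tables.get(4) for the projection table, which yields None when the table is missing.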

Global information (table ID = 0)

uint8 string format 0 for UTF-16 strings, 1 for UTF-8 strings
string data set name name describing the data (e.g., ‘Hertfordshire’)
string copyright copyright and ownership details
uint16 version major the major version number of this data
uint16 version minor the minor version number of this data
uint32 * 4 bounding box bounds of the data in map coordinates, in the order minimum x, minimum y, maximum x, maximum y
uint32 map datum ID 0 = unknown, 1 = WGS 84, etc.
uint8 point format 0 = unknown, 1 = degrees, 2 = metres
uint8 data shift number of bits by which to shift latitude and longitude data: if 0, the numbers represent degrees; if 1, half-degrees; if 2, quarter-degrees, etc. This field is always zero in current practice, and values other than zero are not supported by CartoType
uint8 x axis direction direction of x axis: 0 = left, 1 = right
uint8 y axis direction direction of y axis: 0 = down, 1 = up
fixed * 4 bounding box bounds of the data in degrees, in the order left longitude, top latitude, right longitude, bottom latitude
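The field list above could be parsed as in the following sketch. The names read_global_info and read_string are hypothetical. It assumes that string lengths count bytes, that UTF-16 text is big-endian, and, going by the version history, that the bounding box in degrees is present only in format version 4.2 and later.

```python
import struct

def read_string(buf, pos, utf8):
    """Length byte (or 255 plus a four-byte length), then the text."""
    length = buf[pos]
    pos += 1
    if length == 255:
        (length,) = struct.unpack_from(">I", buf, pos)
        pos += 4
    # Assumption: the length counts bytes and UTF-16 is big-endian.
    text = buf[pos:pos + length].decode("utf-8" if utf8 else "utf-16-be")
    return text, pos + length

def read_global_info(buf, pos, file_version):
    """Parse table 0 starting at pos; file_version is (major, minor)."""
    utf8 = buf[pos] == 1  # 0 = UTF-16, 1 = UTF-8
    pos += 1
    name, pos = read_string(buf, pos, utf8)
    copyright_text, pos = read_string(buf, pos, utf8)
    data_major, data_minor = struct.unpack_from(">HH", buf, pos)
    pos += 4
    bounds = struct.unpack_from(">4I", buf, pos)  # min x, min y, max x, max y
    pos += 16
    datum, point_format, data_shift, x_dir, y_dir = struct.unpack_from(">IBBBB", buf, pos)
    pos += 8
    info = {
        "utf8": utf8, "name": name, "copyright": copyright_text,
        "data_version": (data_major, data_minor), "bounds": bounds,
        "datum": datum, "point_format": point_format, "data_shift": data_shift,
        "x_axis_right": x_dir == 1, "y_axis_up": y_dir == 1,
    }
    if file_version >= (4, 2):  # assumption: degree bounds exist from 4.2 on
        raw = struct.unpack_from(">4i", buf, pos)
        info["degree_bounds"] = tuple(v / 65536 for v in raw)
    return info
```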

Layers (table ID = 1)

A map data file can contain data for any number of layers. The layer name must be 'road' for the road layer. Other layer names may be chosen freely. The layer table enumerates the layers and provides offsets to sub-tables giving the data for each layer.

uint16 layers number of layers

For each layer:

string layer name layer name
uint32 layer data offset from the start of the layer table to the data for this layer
uint8 integer attributes number of integer attributes in this layer

For each integer attribute:

string attribute name name of the attribute
uint8 attribute type type of attribute. 1 = integer; that is the only value used at the moment
uint32 attribute default attribute's default value

Layer data

Each layer has a layer data table that is part of table 1 and accessed at a known offset from the start of that table.

A layer is optionally divided into two or more clip regions, each containing only the objects that intersect that region. Every map object occurs only once in the data. If a map object intersects more than one sub-region, it is not stored in any of the sub-regions but directly in the containing (higher-level) region.

There is a special convention if the number of divisions is three. In that case, the containing region is split into two halves which become sub-regions 0 and 1. The third sub-region, which is the same size, is the overlap region and is centred on the boundary between sub-regions 0 and 1. Objects intersecting both sub-regions 0 and 1 and entirely contained in sub-region 2 are placed in sub-region 2. This technique reduces the average size of a region that contains an object and thus allows the data accessor to read fewer objects that will ultimately not be drawn because they do not intersect the clip area.
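The three-region convention can be illustrated along a single axis. The sketch below is an illustration, not the logic used by the CartoType tools. It assumes that the split is at the midpoint of the containing region, that the overlap region therefore spans a quarter of the containing region on each side of the boundary, and that an object touching the boundary counts as lying inside a half. The name choose_subregion is hypothetical.

```python
def choose_subregion(obj_min: int, obj_max: int, lo: int, hi: int):
    """Pick a sub-region for an object spanning obj_min..obj_max on the split axis.

    lo..hi is the extent of the containing region on that axis. Returns 0 or 1
    for the two halves, 2 for the overlap region, or None when the object has
    to stay in the containing region.
    """
    mid = (lo + hi) // 2       # assumed boundary between sub-regions 0 and 1
    quarter = (hi - lo) // 4   # the overlap region is half as wide as the parent
    if obj_max <= mid:
        return 0
    if obj_min >= mid:
        return 1
    # The object straddles the boundary: use the overlap region if it fits.
    if obj_min >= mid - quarter and obj_max <= mid + quarter:
        return 2
    return None
```

A small object straddling the boundary thus lands in sub-region 2 rather than in the much larger containing region, which is the saving the paragraph above describes.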

Because of the clipping system, the data format is recursive. The top-level region may be divided vertically using boundaries which are y coordinates, which in turn may be divided horizontally using boundaries which are x coordinates, etc. Each region is defined as follows:

uint16 divisions number of region dividers (0 or more)

If divisions = 0, this is undivided data and the data follows immediately.

If divisions = n > 0, for each of the n + 1 regions we have

uint32 boundary x or y coordinate of the boundary following (right of or below) the region
uint32 data offset from the start of the layer table (table 1, or one of the low-resolution tables in table 6) to the data in this region, which may be further divided; a zero offset means there is no data in this region

The region data is followed by the data.

The data itself is stored as:

uint32 items number of map objects

followed by the map objects themselves. If the layer has attributes map objects are sorted in ascending order of the integer value of their first attribute.
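A recursive walk over this structure might look like the sketch below, which collects the position of every 'items' count reachable from a region. It follows the wording above literally: n dividers give n + 1 entries, each data offset leads to a nested region that starts with its own divisions field, and the objects held at the current level follow the region entries. These readings are assumptions, and item_list_positions is a hypothetical name.

```python
import struct

def item_list_positions(buf: bytes, table_start: int, region_pos: int) -> list[int]:
    """Return the positions of all 'items' counts under the region at region_pos."""
    (divisions,) = struct.unpack_from(">H", buf, region_pos)
    pos = region_pos + 2
    found = []
    if divisions > 0:
        for _ in range(divisions + 1):
            _boundary, data_offset = struct.unpack_from(">II", buf, pos)
            pos += 8
            if data_offset != 0:  # zero means the sub-region has no data
                found += item_list_positions(buf, table_start, table_start + data_offset)
    # The objects stored directly at this level follow the region entries
    # (or follow the divisions field immediately when divisions is 0).
    found.append(pos)
    return found
```

A reader that clips to a rectangle would compare each boundary with the clip area and skip sub-regions that cannot intersect it; the sketch visits everything to keep the structure visible.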

Map object structure

Map objects start with their size in bytes as a four-byte integer. This allows the accessor to skip objects at high speed if their attributes don’t match what is required.

Each object in the layer may optionally possess one or more integer attributes. The presence of the integer attributes is marked by a flag bit in the object type. If the bit is set, all the integer attributes are present. If not, none are.

String attributes are stored as null-delimited concatenations of <attrib>=<value> pairs, where <attrib> can contain any characters except ‘=’ and null and <value> can contain any character except null. However, the first attribute is the name (also known as the ‘label’) and is stored as a plain value, without ‘<attrib>=’. The presence of one or more string attributes is also marked by a flag bit in the object type.

If table 5 (Strings) is present, an object may contain compressed string attributes. If a map object has a final pseudo-attribute of the form '=xxx', where xxx is a series of characters, each character is used as an index into the string table; the index is reduced by one before being used so that no null characters need occur in the pseudo-attribute. Compressed attributes can be used where many objects share the same attribute and value - for example, thousands of streets each with the attribute-value pair 'city=London'.
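The string attribute rules above, including the compressed pseudo-attribute, can be sketched as follows. The function name is hypothetical, the attribute text is assumed to have been decoded to a Python string already, and the string table is assumed to have been loaded as a list of '<attrib>=<value>' strings.

```python
def parse_string_attributes(raw: str, string_table=None):
    """Split a null-delimited attribute string into (label, {attrib: value})."""
    parts = raw.split("\0")
    label, attributes = parts[0], {}
    for i, part in enumerate(parts[1:], start=1):
        is_last = i == len(parts) - 1
        if is_last and part.startswith("=") and string_table is not None:
            # Pseudo-attribute: each character is a string-table index plus one.
            for ch in part[1:]:
                name, _, value = string_table[ord(ch) - 1].partition("=")
                attributes[name] = value
        elif part:
            name, _, value = part.partition("=")
            attributes[name] = value
    return label, attributes
```

For example, with a string table of ['city=London'], the attribute text 'High Street', a null, then '=' followed by the character with value 1 yields the label 'High Street' and the attribute city with the value London.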

Map object types are identified by the next byte: the format is different for each type.

The object type byte has the following structure:

bits 0…1 basic type: 0 = point, 1 = line, 2 = polygon, 3 = array (a general bitmap, for example an elevation grid or terrain shading texture)
bit 2 set if the point data is stored compactly – see below
bit 3 set if this object has integer attributes (as listed in the layer definition)
bit 4 set if this object has string attributes
bit 5 set if this object has more than one contour (more than one point for type 0)
bit 6 set if this object has more than 255 points
bit 7 set if this object has more than 65535 points

Numbers of points, numbers of contours and contour ends are stored in one byte if bit 6 is clear, two bytes if bit 6 is set but not bit 7, and four bytes in the unlikely event that bit 7 is set.
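The bit layout of the object type byte can be unpacked as in this sketch. The class and field names are hypothetical; count_size is the width in bytes of the point, contour and contour-end counts derived from bits 6 and 7.

```python
from dataclasses import dataclass

@dataclass
class ObjectType:
    basic_type: int              # 0 = point, 1 = line, 2 = polygon, 3 = type 3
    compact: bool                # bit 2: points stored as 16-bit deltas
    has_int_attributes: bool     # bit 3
    has_string_attributes: bool  # bit 4
    multiple: bool               # bit 5: more than one contour (or point)
    count_size: int              # bytes used for counts: 1, 2 or 4

def decode_object_type(type_byte: int) -> ObjectType:
    if type_byte & 0x80:
        count_size = 4
    elif type_byte & 0x40:
        count_size = 2
    else:
        count_size = 1
    return ObjectType(
        basic_type=type_byte & 0x03,
        compact=bool(type_byte & 0x04),
        has_int_attributes=bool(type_byte & 0x08),
        has_string_attributes=bool(type_byte & 0x10),
        multiple=bool(type_byte & 0x20),
        count_size=count_size,
    )
```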

Object bounds (format version 4 and above)

In format versions 4 and above the object bounds follow the object type byte. If the object is a single point object the object's point is stored here as two four-byte integers, otherwise the object's bounding box is stored as four four-byte integers, in the order min-x, min-y, max-x, max-y.

Storage of data points

Data points represent either degrees stored as 16:16 fixed-point numbers (that is, 32-bit quantities that represent 65536ths of a degree) or, if the data has been pre-projected according to a map projection, projected metres.

All points are stored as longitude followed by latitude – x coordinates precede y coordinates.

If the compact bit is set in the object type, the first point is a 32-bit number and following points are 16-bit signed differences from the preceding point ('deltas'). If the compact bit is not set all the points are 32-bit numbers.
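Both point encodings can be decoded as sketched below. The sketch assumes that each coordinate of a full point is a signed 32-bit integer and that compact deltas are stored as interleaved (dx, dy) pairs; read_points is a hypothetical name.

```python
import struct

def read_points(buf: bytes, pos: int, count: int, compact: bool):
    """Return ([(x, y), ...], next position) for count points starting at pos."""
    points = []
    if count == 0:
        return points, pos
    if not compact:
        for _ in range(count):
            points.append(struct.unpack_from(">ii", buf, pos))
            pos += 8
        return points, pos
    # Compact form: one full 32-bit point, then 16-bit signed deltas,
    # each relative to the preceding point.
    x, y = struct.unpack_from(">ii", buf, pos)
    pos += 8
    points.append((x, y))
    for _ in range(count - 1):
        dx, dy = struct.unpack_from(">hh", buf, pos)
        pos += 4
        x += dx
        y += dy
        points.append((x, y))
    return points, pos
```

In the compact form a run of n points takes 8 + 4(n - 1) bytes instead of 8n, which is where the saving introduced in version 2.1 comes from.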

Type 0 (point)

Note: in version 4 or above, if there is only one point it has already been supplied, so neither the number of points nor the point data is present here.

int32 * number of attributes for this layer (defined in the layer table) integer attributes integer attribute values, only present if bit 3 of the type was set
string string attributes string attribute values as described above; only present if bit 4 of the type was set
uint points number of points; only present if bit 5 of the type is set

for each point

longitude x coordinate
latitude y coordinate

or, if the data has been projected, which is the more usual case

metres x coordinate
metres y coordinate

stored either as 32-bit integers or (if the compact object type bit is set) a 32-bit initial point followed by 16-bit deltas.

Type 1 (line)

int32 * number of attributes for this layer (defined in the layer table) attributes attribute values, only present if bit 3 of the type was set
string string attributes string attribute values as described above; only present if bit 4 of the type was set
uint contours number of contours; only present if bit 5 of the type is set
uint * number of contours contour ends end index of each contour

for each point:

longitude x coordinate
latitude y coordinate

or, if the data has been projected, which is the more usual case

metres x coordinate
metres y coordinate

stored either as 32-bit integers or (if the compact object type bit is set) a 32-bit initial point followed by 16-bit deltas.

Type 2 (polygon)

Identical to type 1.

Type 3 (array)

uint32 the format: 0 = uncompressed byte array, 1 = an array of delta values compressed using zlib, 2 = uncompressed byte array with a palette, 3 = an array of values compressed using zlib, with a palette
uint32 total number of data bytes in the array
int32 * 4 (not present for format version 4 and later) bounds of area covered by data, in the order left, top, right, bottom
int32 number of bytes per point
int32 the width of the array
int32 the height of the array
int8[data bytes] the data; if there is a palette (formats 2 and 3), the first four bytes are an index into table 7, the palette table

Label index (table ID = 2)

This table is now obsolete and has been replaced by table 8: text index.

The label index is stored as a packed array trie. The array contains nodes and each node is identified by an index in the trie. The entries of the node correspond to the alphabet indexes. An entry belongs to the node if the alphabet index stored in the entry equals the alphabet index leading to the entry. The last entry of a node with alphabet index  = alphabet_size can be used to store any intermediate references to objects. In the following description an object ID is the offset in bytes of a map object from the start of the CTM1 file.

uint32 array trie offset offset of the array trie table from the start of the file
uint32 suffix table offset offset of the suffix table from the start of the file
uint32 object id table offset offset of the object id table from the start of the file

Alphabet

uint16 alphabet size size of the alphabet used in the index

For each entry:

uint16 alphabet char Unicode value of alphabet character

The type alphabet_index is derived from alphabet size: uint8 if alphabet size < 255, uint16 otherwise.

Packed array trie

For each entry:

alphabet_index index of alphabet character for this entry
uint32 link:
the two most significant bits indicate link type, the other bits the value:
0 = value is index in array trie table
1 = value is byte offset in suffix table
2 = value is byte offset in object id table
3 = value is object id
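Splitting a link into its type and value is a matter of two bit operations, as this small sketch shows (the names are hypothetical):

```python
LINK_TYPES = (
    "array trie index",        # 0
    "suffix table offset",     # 1
    "object id table offset",  # 2
    "object id",               # 3
)

def decode_link(link: int) -> tuple[str, int]:
    """Return (link type, value) for a 32-bit trie link."""
    return LINK_TYPES[link >> 30], link & 0x3FFFFFFF
```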

Form of a suffix table entry

An entry in the suffix table is a string of alphabet indexes representing the suffix, terminated by either 0xFF (for single-byte indexes) or 0xFFFF (for two-byte indexes). After that come the object IDs for those objects with a label resulting from appending the suffix to the start of the label. The last object ID has its high bit set.

In tabular form a suffix table entry looks like this:

alphabet_index  * suffix length alphabet indexes of suffix
alphabet_index 0xFF(FF) is the suffix terminator
uint32 * objects object ids of map objects with resulting label. The last one has the highest bit set

Form of an object ID table entry

An object ID table entry is a list of IDs, with the high bit set on the last one.

uint32 * objects object ids of map objects with resulting label. The last one has the highest bit set
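A list terminated by a high bit, as used here and in the suffix table, can be read like this. The sketch assumes that the high bit is only a terminator flag and is not part of the ID; read_object_ids is a hypothetical name.

```python
import struct

def read_object_ids(buf: bytes, pos: int) -> tuple[list[int], int]:
    """Read uint32 object IDs up to and including the one with the high bit set."""
    ids = []
    while True:
        (value,) = struct.unpack_from(">I", buf, pos)
        pos += 4
        ids.append(value & 0x7FFFFFFF)  # strip the terminator flag
        if value & 0x80000000:
            return ids, pos
```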

Projection (table ID = 4)

If the data is pre-projected the projection used to convert the data to map metres can optionally be stored in this table. It is loaded by the CartoType library and used for converting between latitude and longitude and map coordinates or display pixels, for example when synchronizing the map to a position received from a GPS receiver or converting a point on the display to latitude and longitude.

uint32 projection type The type of the projection. Legal types are: 2 = Universal Transverse Mercator (spherical); 3 = Transverse Mercator (spherical); 4 = Cylindrical equidistant; 5 = Universal Transverse Mercator (ellipsoidal); 6 = Web Mercator; 7 = Miller Cylindrical; 8 = Ordnance Survey of Great Britain; 9 = Plate Carrée; 10 = General Projection defined by Proj.4 parameters

The projection follows in serialized form.

For all projection types:

This data is written for all projection types:

The transform applied to the output after the projection is applied, as 6 32-bit numbers which are the affine projection parameters A, B, C, D, Tx, and Ty. The first four are in 16.16 fixed-point format; the last two are plain integers. In practice these values will always be 65536, 0, 0, 65536, 0, 0.

The scale value as a 32-bit number. This has a complicated definition, but is the value used in TCoordinateTransformParam::iScale.

A 32-bit flags value: bit 0 indicates whether there is an input shift. Bit 1 is a boolean value representing TCoordinateTransformParam::iProjectFromMeters. Bits 2..4 store the value of TCoordinateTransformParam::iInputShift.

For Transverse Mercator and Universal Transverse Mercator

fixed    central meridian in degrees
fixed    latitude of origin or central parallel in degrees
fixed    false easting
fixed    false northing
5.27 fixed point    scale factor

For Cylindrical Equidistant and Plate Carrée

fixed    central meridian (lambda 0) in degrees
fixed    latitude of origin (phi 0) in degrees
fixed    false easting (always 0)
fixed    false northing (always 0)

For Universal Transverse Mercator (Ellipsoidal)

uint8    zone
uint8    north (1) or south (0)
uint8    ellipsoid code: WGS84 (0), WGS72 (1) or WGS66 (2)

For Web Mercator and Miller Cylindrical

uint32    prime meridian in ten millionths of a degree

For Ordnance Survey of Great Britain

no extra parameters beyond those written for all projection types

For General Projections defined by Proj.4 parameters

string    the Proj.4 parameters, space separated, each one preceded by a +, as required by the pj_init_plus function

Strings (table ID = 5)

The string table is optional and is used for expanding compressed string attributes. It consists of a 32-bit integer giving the number of strings in the table, followed by the strings themselves, stored in the standard way described at the start of this document. Each string is of the form <attrib>=<value>. If a map object has a final pseudo-attribute of the form '=xxx', where xxx is a series of characters, each character is used as an index into the string table; the index is reduced by one before being used so that no null characters need occur in the pseudo-attribute.

Low-resolution layer data (table ID = 6)

An optional table containing layer data at lower resolutions for small-scale maps. This table saves large amounts of time when drawing small-scale maps because the generalisation and reduction in size of the data has been done at data preparation time and need not be done at runtime.

For each layer in the main layer table:

uint32 resolution: number of metres per pixel for which the main layer data was created
uint32 flags: flags indicating the type of data present: the value 0xFFFFFFFF is used if the information is not known, otherwise the bit values used are 1 (points are present), 2 (lines are present), 4 (polygons are present), 8 (arrays are present).

uint32 extra_resolution_count: the number of data sets at extra resolutions supplied

For each extra-resolution data set:

uint32 resolution: number of metres per pixel for which this data set was created
uint32 flags: flags indicating the type of data present: the value 0xFFFFFFFF is used if the information is not known, otherwise the bit values used are 1 (points are present), 2 (lines are present), 4 (polygons are present), 8 (arrays are present).
uint32 data_offset: offset of the data for this data set from the start of the low-resolution table
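The layout above might be read as follows. This is a sketch: the function names are hypothetical, and layer_count is assumed to have been read from the main layer table (table 1) beforehand, since this table does not repeat it.

```python
import struct

FLAG_NAMES = {1: "points", 2: "lines", 4: "polygons", 8: "arrays"}

def decode_flags(flags: int):
    """Return the set of object kinds present, or None if not known."""
    if flags == 0xFFFFFFFF:
        return None
    return {name for bit, name in FLAG_NAMES.items() if flags & bit}

def read_low_resolution_table(buf: bytes, table_start: int, layer_count: int):
    """Return one dictionary per layer, in the order of the main layer table."""
    pos = table_start
    layers = []
    for _ in range(layer_count):
        resolution, flags, extra_count = struct.unpack_from(">III", buf, pos)
        pos += 12
        data_sets = []
        for _ in range(extra_count):
            res, fl, data_offset = struct.unpack_from(">III", buf, pos)
            pos += 12
            data_sets.append({
                "resolution": res,
                "contents": decode_flags(fl),
                "position": table_start + data_offset,  # offset is table-relative
            })
        layers.append({
            "resolution": resolution,
            "contents": decode_flags(flags),
            "data_sets": data_sets,
        })
    return layers
```

When drawing, a reader would typically pick the coarsest data set whose resolution is still fine enough for the current scale; that selection policy is not specified here.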

Palettes (table ID = 7)

An optional table containing shared palettes used by raster image objects.

uint32 palette_count: the number of palettes in the table

For each palette:

uint32 color_count: the number of colors in the palette
uint32 color[color_count]: the colors as 32-bit integers, each of four 8 bit components stored in the order ABGR (alpha, blue, green, red)
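Reading the palette table and splitting each colour into components could look like this sketch. It assumes that ABGR describes the integer from most significant byte to least significant byte, so that alpha is the top byte and red the bottom byte; the function names are hypothetical.

```python
import struct

def abgr_to_rgba(color: int) -> tuple[int, int, int, int]:
    """Split a 32-bit ABGR colour into (red, green, blue, alpha)."""
    alpha = (color >> 24) & 0xFF  # assumption: alpha is the most significant byte
    blue = (color >> 16) & 0xFF
    green = (color >> 8) & 0xFF
    red = color & 0xFF
    return red, green, blue, alpha

def read_palettes(buf: bytes, pos: int) -> list[list[tuple[int, int, int, int]]]:
    """Return every palette in table 7 as a list of (r, g, b, a) tuples."""
    (palette_count,) = struct.unpack_from(">I", buf, pos)
    pos += 4
    palettes = []
    for _ in range(palette_count):
        (color_count,) = struct.unpack_from(">I", buf, pos)
        pos += 4
        colors = struct.unpack_from(f">{color_count}I", buf, pos)
        pos += 4 * color_count
        palettes.append([abgr_to_rgba(c) for c in colors])
    return palettes
```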

Text index (table ID = 8)

The text index is a trie allowing fast text searching. According to the options used when creating the CTM1 file it may contain just the text attributes, or be a full text index containing sub-phrases.

The attribute names. These are the names of string attributes of map objects. An empty name indicates the standard label of an object. The purpose of this table is to associate attribute names with numeric indexes used when searching the table.

uint32 attribute_count

For each attribute:

string name

The nodes. The nodes form a trie in which each indexed string starts with two non-character values: the layer index and the attribute index. Strings are converted to lower case. Thus the string 'London', if it is in layer 7 and attribute 0, is stored as the text { 7, 0, 'l', 'o', 'n', 'd', 'o', 'n' }. Strings are Unicode and are stored in UTF-16 or UTF-8 in the standard way; see the start of this page for an explanation.

Each node is stored as follows:

string    text; the text stored at this node, which is a substring of the whole text, and is often a single character

uint32    object_count; the number of map objects containing the string terminating at this node

For each map object:

uint32    object_position; the position in the file of a map object

uint32    branches; the number of branch nodes coming from this node

For each branch:

uint32    branch_position; the position in the file of the branch node
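A lookup in this trie might proceed as in the sketch below. It is an illustration under several assumptions: strings are UTF-8 with lengths counted in bytes, the layer and attribute indexes are stored as the first two characters of the indexed text, and a search starts at a known root node position. The function names are hypothetical.

```python
import struct

def read_node(buf: bytes, pos: int):
    """Return (text, object positions, branch positions) for the node at pos."""
    length = buf[pos]
    pos += 1
    if length == 255:  # long-string sentinel
        (length,) = struct.unpack_from(">I", buf, pos)
        pos += 4
    text = buf[pos:pos + length].decode("utf-8")  # assumption: UTF-8 strings
    pos += length
    (object_count,) = struct.unpack_from(">I", buf, pos)
    pos += 4
    objects = list(struct.unpack_from(f">{object_count}I", buf, pos))
    pos += 4 * object_count
    (branch_count,) = struct.unpack_from(">I", buf, pos)
    pos += 4
    branches = list(struct.unpack_from(f">{branch_count}I", buf, pos))
    return text, objects, branches

def make_key(layer_index: int, attribute_index: int, text: str) -> str:
    """Build the indexed form of a string: two index values, then lower case text."""
    return chr(layer_index) + chr(attribute_index) + text.lower()

def find_objects(buf: bytes, node_pos: int, key: str) -> list[int]:
    """Return the file positions of map objects whose indexed string equals key."""
    text, objects, branches = read_node(buf, node_pos)
    if not key.startswith(text):
        return []
    rest = key[len(text):]
    if not rest:
        return objects
    for branch_pos in branches:
        found = find_objects(buf, branch_pos, rest)
        if found:
            return found
    return []
```

For the example in the text, find_objects(buf, root_pos, make_key(7, 0, 'London')) would return the positions of objects in layer 7 whose standard label is 'London', given a suitable root_pos.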