CartoType Map Data Format 1, known for short as CTM1, is a binary file format for storing vector map data. Map Data Format 1 is one of the data formats supported by CartoType. It is the format used for demonstration software on desktop and laptop computers and mobile devices. The format is platform-independent.
Map data is taken from OSM (OpenStreetMap) files, SHP files and other data sources and converted to CartoType Map Data Format 1 by tools delivered as part of the CartoType toolset. The main tool is generate_map_data_type1.
This format is copyright © 2004-2008 Cartography Ltd. For unrestricted licensing please contact Cartography Ltd. However, you may use this documentation and the data format described in it under the following conditions: any programs or other computer software components that use the format must contain an acknowledgement in the 'Help' or 'About' menu or splash screen or documentation, using text of an easily legible size and style, in the words "The CTM1 data format is licensed from Cartography Ltd (http://www.cartotype.com)."; further, any changes to the CTM1 format or documentation must be made available under the same license and conditions.
| 28th April 2004 | 0.0 | |
| 4th May 2004 | 1.0 | strings are now preceded by their lengths instead of null-terminated |
| 5th May 2004 | 2.0 | map objects are now preceded by their lengths in bytes |
| 12th November 2004 | 2.1 | object coordinates are now stored compactly if possible using 16-bit offsets |
| 28th February 2005 | 2.2 | added table 2 (label index) |
| 9th September 2005 | 2.3 | added table 3 (projections) |
| 16th April 2006 | 2.4 | if there are 3 sub-regions the third is the overlap region |
| 25th August 2008 | 3.0 | object names are replaced by null-separated named string attributes; the change allows older (version 2.*) CTM1 files to be read by later CartoType versions but not vice versa |
The following data types are used. All multi-byte numbers are big-endian, which means more significant bytes come first.
int8: 8-bit signed integer stored as one byte
uint8: 8-bit unsigned integer stored as one byte
int16: 16-bit signed integer stored as two bytes
uint16: 16-bit unsigned integer stored as two bytes
int32: 32-bit signed integer stored as four bytes
uint32: 32-bit unsigned integer stored as four bytes
fixed: 32-bit signed integer stored as four bytes, representing a fixed-point number equal to 1/65536 of this number (e.g., 65536 represents 1.0; 32768 represents 0.5).
latitude: latitude stored as degrees in a fixed-point number, taking degrees North of the equator as positive and degrees South as negative: see the ‘data shift’ in table 0 to fully understand this
longitude: longitude stored as degrees in a fixed-point number, taking degrees West of Greenwich as negative and degrees East as positive: see the ‘data shift’ in table 0 to fully understand this
metres: projected metres; used instead of degrees if the map data has already been projected
string: a length followed by either a UTF8 or a UTF16 string, depending on the string format entry in the global information table; the length is a single byte unless that byte is 255, in which case a four-byte length follows that sentinel
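As an illustration of the data types above, here is a minimal Python sketch that reads a 'fixed' value and a length-prefixed string. The function names are hypothetical. The sketch assumes that the string length counts bytes for UTF-8 and 16-bit code units for UTF-16, and that UTF-16 strings are big-endian like the numbers; the text above does not state either point.

```python
import struct

def read_fixed(buf, offset):
    """Read a big-endian 16:16 'fixed' value and return it as a float."""
    (raw,) = struct.unpack_from(">i", buf, offset)
    return raw / 65536.0

def read_string(buf, offset, utf8=True):
    """Return (text, next_offset) for a length-prefixed string.

    Assumption: the length counts bytes for UTF-8 and 16-bit code units
    for UTF-16, and UTF-16 data is big-endian.
    """
    length = buf[offset]
    offset += 1
    if length == 255:
        # 255 is the sentinel: the real length follows as four bytes.
        (length,) = struct.unpack_from(">I", buf, offset)
        offset += 4
    if utf8:
        size = length
        text = buf[offset:offset + size].decode("utf-8")
    else:
        size = length * 2
        text = buf[offset:offset + size].decode("utf-16-be")
    return text, offset + size
```

For example, the four bytes 00 00 80 00 read as a fixed value give 0.5.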
The file format consists of a header followed by some tables, some of which are mandatory. This is a very similar pattern to that of a TrueType font file. To find a table, look it up in the table index, which contains table IDs and offsets to the tables from the start of the file.
The format of the header is:
| uint8 * 4 | signature | the file signature, which must be the bytes ‘CTm1’ |
| uint16 | version major | the major version number of the file format |
| uint16 | version minor | the minor version number of the file format |
| uint16 | tables | number of tables in the file |
for each table:
| uint16 | table ID | unique ID of the table |
| uint32 | table offset | offset of the table from the start of the file |
Tables may appear in any order in the table index. A non-existent table may either be absent from the table index, or may be indicated by a table offset of zero.
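The header and table index described above could be read along the following lines. This is a sketch, not the CartoType implementation; `read_header` is a hypothetical name, and the signature bytes are taken as written in the header description.

```python
import struct

def read_header(buf):
    """Return ((major, minor), {table_id: offset}) from the start of a file."""
    signature, major, minor, count = struct.unpack_from(">4sHHH", buf, 0)
    if signature != b"CTm1":
        raise ValueError("not a CTM1 file")
    tables = {}
    offset = 10  # 4 signature bytes + three uint16 fields
    for _ in range(count):
        table_id, table_offset = struct.unpack_from(">HI", buf, offset)
        offset += 6
        # An offset of zero marks a table that does not exist.
        if table_offset != 0:
            tables[table_id] = table_offset
    return (major, minor), tables
```

A table is then located by looking up its ID in the returned dictionary and seeking to that offset from the start of the file.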
The format of table 0, the global information table, is:
| uint8 | string format | 0 for UTF-16 strings, 1 for UTF-8 strings |
| string | data set name | name describing the data (e.g., ‘Hertfordshire’) |
| string | copyright | copyright and ownership details |
| uint16 | version major | the major version number of this data |
| uint16 | version minor | the minor version number of this data |
| fixed * 4 | bounding box | bounds of the data, in the order left longitude, top latitude, right longitude, bottom latitude |
| uint32 | map datum ID | 0 = unknown, 1 = WGS 84, etc. |
| uint8 | point format | 0 = unknown, 1 = degrees, 2 = metres |
| uint8 | data shift | number of bits by which to shift latitude and longitude data: if 0, the numbers represent degrees; if 1, half-degrees; if 2, quarter-degrees, etc.; this field is always zero in current practice |
| uint8 | x axis direction | direction of x axis: 0 = left, 1 = right |
| uint8 | y axis direction | direction of y axis: 0 = down, 1 = up |
A map data file can contain data for any number of layers. The layer name must be 'road' for the road layer. Other layer names may be chosen freely. The layer table enumerates the layers and provides offsets to sub-tables giving the data for each layer.
| uint16 | layers | number of layers |
For each layer:
| string | layer name | layer name |
| uint32 | layer data | offset from the start of the layer table to the data for this layer |
| uint8 | integer attributes | number of integer attributes in this layer |
For each integer attribute:
| string | attribute name | name of the attribute |
| uint8 | attribute type | type of attribute. 1 = integer; that is the only value used at the moment |
| uint32 | attribute default | attribute's default value |
Each layer has a layer data table that is part of table 1 and accessed at a known offset from the start of that table.
A layer is optionally divided into two or more clip regions. A clip region contains only objects that intersect it. Every map object occurs only once in the data. If a map object intersects more than one sub-region of a divided region, it is not stored in any of the sub-regions but directly in the containing region.
There is a special convention if the number of divisions is three. In that case, the containing region is split into two halves which become sub-regions 0 and 1. The third sub-region, which is the same size, is the overlap region and is centred on the boundary between sub-regions 0 and 1. Objects intersecting both sub-regions 0 and 1 and entirely contained in sub-region 2 are placed in sub-region 2. This technique reduces the average size of a region that contains an object and thus allows the data accessor to read fewer objects that will ultimately not be drawn because they do not intersect the clip area.
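The three-way convention can be sketched in one dimension as follows. This is an illustration only: `choose_sub_region` is a hypothetical helper, and the treatment of objects that exactly touch a boundary is an assumption, since the text does not specify it.

```python
def choose_sub_region(lo, hi, obj_min, obj_max):
    """Pick the sub-region for an object spanning [obj_min, obj_max].

    The containing region spans [lo, hi] along the split axis.
    Returns 0 or 1 for the two halves, 2 for the overlap region,
    or None if the object must stay in the containing region.
    """
    mid = (lo + hi) / 2
    quarter = (hi - lo) / 4
    if obj_max <= mid:      # entirely in the first half
        return 0
    if obj_min >= mid:      # entirely in the second half
        return 1
    # The object crosses the boundary. The overlap region has the same
    # size as a half and is centred on the boundary.
    if obj_min >= lo + quarter and obj_max <= hi - quarter:
        return 2
    return None
```

With a containing region from 0 to 100, an object spanning 40 to 60 goes into sub-region 2, while one spanning 10 to 90 stays in the containing region.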
Because of the clipping system, the data format is recursive. It consists of a horizontal clip region optionally divided into vertical regions, optionally divided into horizontal regions, etc. Each region is defined as follows:
| uint16 | divisions | number of sub-regions (0 if the data is undivided, otherwise 2 or more) |
If divisions = 0, this is undivided data and the data follows immediately.
If divisions = n > 0, for each of the n regions we have
| uint32 | clip boundary | latitude or longitude of the boundary following (that is, right of or below) the region |
| uint32 | data | offset from the start of the layer table (table 1) to the data in this region, which may be further divided |
The region data is followed by the data.
The data itself is stored as:
| uint32 | items | number of map objects |
followed by the map objects themselves. If the layer has attributes, map objects are sorted in ascending order of the integer value of their first attribute.
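The recursive structure can be walked with a short function such as the one below. It is a sketch under several assumptions that the text does not settle: that the divisions field counts the sub-region entries that follow (consistent with the three-sub-region convention), that each data offset points at another region record starting with its own divisions field, and that a divided region's own objects follow its entries. The name `walk_region` is hypothetical.

```python
import struct

def walk_region(buf, table_start, region_offset):
    """Yield (offset, item_count) for each block of map objects.

    Offsets are relative to the start of the layer table (table 1),
    which begins at table_start in buf.
    """
    pos = table_start + region_offset
    (divisions,) = struct.unpack_from(">H", buf, pos)
    pos += 2
    children = []
    for _ in range(divisions):
        # Assumption: one (clip boundary, data offset) pair per sub-region.
        _boundary, child_offset = struct.unpack_from(">II", buf, pos)
        children.append(child_offset)
        pos += 8
    # The region's own data: a uint32 item count, then the objects.
    (items,) = struct.unpack_from(">I", buf, pos)
    yield pos - table_start, items
    for child_offset in children:
        yield from walk_region(buf, table_start, child_offset)
```

A real accessor would compare each clip boundary with the area being drawn and descend only into the sub-regions that intersect it.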
Map objects start with their size in bytes as a four-byte integer. This allows the accessor to skip objects at high speed if their attributes don’t match what is required.
Each object in the layer may optionally possess one or more integer attributes. The presence of the integer attributes is marked by a flag bit in the object type. If the bit is set, all the integer attributes are present. If not, none are.
String attributes are stored as null-delimited concatenations of <attrib>
Map object types are identified by the next byte: the format is different for each type.
The object type byte has the following structure:
| bits 0…1 | basic type: 0 = point, 1 = line, 2 = polygon, 3 = elevation grid |
| bit 2 | set if the point data is stored compactly – see below |
| bit 3 | set if this object has integer attributes (as listed in the layer definition) |
| bit 4 | set if this object has string attributes |
| bit 5 | set if this object has more than one contour (more than one point for type 0) |
| bit 6 | set if this object has more than 255 points |
| bit 7 | set if this object has more than 65535 points |
Numbers of points, numbers of contours and contour ends are stored in one byte if bit 6 is clear, two bytes if bit 6 is set but not bit 7, and four bytes in the unlikely event that bit 7 is set.
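The object type byte can be unpacked as in this sketch. It assumes that bit 0 is the least significant bit; the function and key names are hypothetical.

```python
BASIC_TYPES = ("point", "line", "polygon", "elevation grid")

def decode_object_type(type_byte):
    """Split the object type byte into its fields."""
    if type_byte & 0x80:        # bit 7: more than 65535 points
        count_size = 4
    elif type_byte & 0x40:      # bit 6: more than 255 points
        count_size = 2
    else:
        count_size = 1
    return {
        "basic_type": BASIC_TYPES[type_byte & 0x03],           # bits 0-1
        "compact": bool(type_byte & 0x04),                     # bit 2
        "has_integer_attributes": bool(type_byte & 0x08),      # bit 3
        "has_string_attributes": bool(type_byte & 0x10),       # bit 4
        "multiple": bool(type_byte & 0x20),                    # bit 5
        "count_size": count_size,  # bytes per count or contour end
    }
```

For example, a type byte of 0x4D describes a compactly stored line with integer attributes whose counts occupy two bytes each.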
Data points represent either degrees stored as 16:16 fixed-point numbers (32-bit quantities that represent 65536ths of a degree) or, if the data has been pre-projected according to a map projection, projected metres.
All points are stored as longitude followed by latitude – x coordinates precede y coordinates.
If the compact bit is set in the object type, the first point is a 32-bit number and following points are 16-bit signed differences from the preceding point ('deltas'). If the compact bit is not set all the points are 32-bit numbers.
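The two point encodings can be decoded as sketched below. The sketch assumes that each delta is stored as an x difference followed by a y difference, mirroring the x-before-y order of full points; `decode_points` is a hypothetical name.

```python
import struct

def decode_points(buf, offset, count, compact):
    """Return (points, next_offset); points is a list of (x, y) integers."""
    if count == 0:
        return [], offset
    if not compact:
        # Every point is a pair of 32-bit signed integers.
        values = struct.unpack_from(f">{2 * count}i", buf, offset)
        return list(zip(values[0::2], values[1::2])), offset + 8 * count
    # Compact form: a full 32-bit first point, then 16-bit signed deltas.
    x, y = struct.unpack_from(">ii", buf, offset)
    offset += 8
    points = [(x, y)]
    for _ in range(count - 1):
        dx, dy = struct.unpack_from(">hh", buf, offset)
        offset += 4
        x += dx
        y += dy
        points.append((x, y))
    return points, offset
```

In the compact form, n points take 8 + 4(n - 1) bytes instead of 8n.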
After the size and the type byte, a type 0 (point) object contains:
| int32 * number of attributes for this layer (defined in the layer table) | integer attributes | integer attribute values, only present if bit 3 of the type was set |
| string | string attributes | string attribute values as described above; only present if bit 4 of the type was set |
| uint | points | number of points; only present if bit 5 of the type is set |
for each point:
| longitude | x coordinate |
| latitude | y coordinate |
or, if the data has been projected, which is the more usual case
| metres | x coordinate |
| metres | y coordinate |
stored either as 32-bit integers or (if the compact object type bit is set) a 32-bit initial point followed by 16-bit deltas.
After the size and the type byte, a type 1 (line) object contains:
| int32 * number of attributes for this layer (defined in the layer table) | integer attributes | integer attribute values, only present if bit 3 of the type was set |
| string | string attributes | string attribute values as described above; only present if bit 4 of the type was set |
| uint | contours | number of contours; only present if bit 5 of the type is set |
| uint * number of contours | contour ends | end index of each contour |
for each point:
| longitude | x coordinate |
| latitude | y coordinate |
or, if the data has been projected, which is the more usual case
| metres | x coordinate |
| metres | y coordinate |
stored either as 32-bit integers or (if the compact object type bit is set) a 32-bit initial point followed by 16-bit deltas.
A type 2 (polygon) object is stored identically to a type 1 (line) object.
After the size and the type byte, a type 3 (elevation grid) object contains:
| int32 | bottom of linear height range, in metres, represented by a height data byte |
| int32 | top of height range |
| int32 * 4 | bounds of area covered by data, in the order left, top, right, bottom |
| int32 | width: number of data points in each row |
| int32 | height: number of data points in each column |
| int32 | size of each cell in the same units as the bounds; strictly speaking, unnecessary, but included for convenience |
| int8[width * height] | the data as a row-wise array of single data bytes |
The label index is stored as a packed array trie. The array contains nodes and each node is identified by an index in the trie. The entries of the node correspond to the alphabet indexes. An entry belongs to the node if the alphabet index stored in the entry equals the alphabet index leading to the entry. The last entry of a node with alphabet index = alphabet_size can be used to store any intermediate references to objects. In the following description an object ID is the offset in bytes of a map object from the start of the CTM1 file.
| uint32 | array trie offset | offset of the array trie table from the start of the file |
| uint32 | suffix table offset | offset of the suffix table from the start of the file |
| uint32 | object id table offset | offset of the object id table from the start of the file |
| uint16 | alphabet size | size of the alphabet used in the index |
For each entry in the alphabet:
| uint16 | alphabet char | Unicode value of alphabet character |
The type alphabet_index is derived from alphabet size: uint8 if alphabet size < 255, uint16 otherwise.
For each entry in the array trie table:
| alphabet_index | index of alphabet character for this entry |
| uint32 | link: the two most significant bits indicate link type, the other bits the value: 0 = value is index in array trie table; 1 = value is byte offset in suffix table; 2 = value is byte offset in object id table; 3 = value is object id |
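A trie entry's link field can be split into its type and value as in the following sketch. The names are hypothetical; the width of alphabet_index follows the rule given above.

```python
LINK_TYPES = (
    "array trie index",        # 0
    "suffix table offset",     # 1
    "object id table offset",  # 2
    "object id",               # 3
)

def decode_link(link):
    """Return (link_type, value) for a 32-bit link field."""
    # The top two bits are the type; the low 30 bits are the value.
    return LINK_TYPES[link >> 30], link & 0x3FFFFFFF

def alphabet_index_size(alphabet_size):
    """Bytes per alphabet_index: uint8 if alphabet size < 255, else uint16."""
    return 1 if alphabet_size < 255 else 2
```

Each array trie entry therefore occupies 4 bytes plus the size of one alphabet_index.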
For each entry in the suffix table:
| alphabet_index * suffix length | alphabet indexes of suffix |
| alphabet_index | 0xFF(FF) is the suffix terminator |
| uint32 * objects | object ids of map objects with resulting label. The last one has the highest bit set |
For each entry in the object id table:
| uint32 * objects | object ids of map objects with resulting label. The last one has the highest bit set |
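A list of object ids terminated by the high bit can be read as sketched here. The function name is hypothetical, and the sketch assumes the terminator bit is not part of the id itself.

```python
import struct

def read_object_ids(buf, offset):
    """Return (ids, next_offset), reading until an id has its top bit set."""
    ids = []
    while True:
        (value,) = struct.unpack_from(">I", buf, offset)
        offset += 4
        # Assumption: mask off the terminator bit to recover the id.
        ids.append(value & 0x7FFFFFFF)
        if value & 0x80000000:
            return ids, offset
```

Each returned id is the byte offset of a map object from the start of the CTM1 file, as described above.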
If the data is pre-projected the projection used to convert the data to map metres can optionally be stored in this table. This is used for converting between latitude and longitude and map points, for example when synchronising the map to a position received from a GPS receiver.
| uint32 | projection type | The type of the projection. Legal types are: 2 = transverse Mercator; 3 = universal transverse Mercator; 4 = cylindrical equidistant; 5 = universal transverse Mercator (ellipsoidal); 6 = Mercator. |
The projection follows in serialised form.
to do: document the serialised form of the projections