A common challenge we’ve heard from businesses, governments, and academics who work with place data is the lack of standardization when referring to a place. A given place may be referred to across various data sets by name, address, geocode, or any number of different data-provider IDs. Often these pieces of identifying information are messy and unstable over time (e.g., a business may change its name, or a street name may be changed), some pieces of information are not unique to a given place (e.g., a new business moves in at an address), or some pieces of information may not be present for all places (e.g., a park without a street address). Placekey was created to serve as a standard universal identifier for any physical place, so that information pertaining to those places can be shared easily across organizations and data sets.
Each Placekey is divided into two parts: What and Where, written as “What@Where”. The What part encodes information about the place and its address, while the Where part situates that place on Earth. The What part of a Placekey is optional: a What-less Placekey like “@5vg-7qg-tvz” refers to a region on the Earth, while a What part plus a Where part specifies a particular place within that region.
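The What@Where layout can be illustrated with a minimal sketch. The function name split_placekey is hypothetical and is not part of any Placekey library; the sketch only splits on the “@” separator and does not check that either part is a valid encoding.

```python
from typing import Optional, Tuple


def split_placekey(placekey: str) -> Tuple[Optional[str], str]:
    """Split a Placekey string into its (What, Where) parts.

    Hypothetical helper for illustration only: it splits on "@" and
    performs no validation of the encodings themselves.
    """
    what, separator, where = placekey.partition("@")
    if not separator or not where:
        raise ValueError(f"expected 'What@Where' or '@Where', got {placekey!r}")
    # An empty What part means the Placekey names a region, not a specific place.
    return (what or None, where)


# The What-less example from the text: the What part comes back as None.
print(split_placekey("@5vg-7qg-tvz"))
```

Returning None for a missing What part keeps the "region only" case distinct from a Placekey that identifies a particular place.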
The Where part of a Placekey encodes a hexagon of approximately 15,000 m² on the surface of the Earth. These hexagons have an edge length of 66 m on average, and it can be helpful to think of them as roughly circular, with a diameter of about 132 m. The exact area and edge length of the hexagon vary by location. In particular, these hexagons are given by resolution 10 H3 indices. This document will not cover H3 in full detail, as that is covered by the documentation for H3.
Each Where part is specified by 9 characters split into three triplets for legibility. The triplets do not explicitly code exact spatial distances, but the code does become more specific when reading left to right. There is an upper bound on how far apart two Where parts can be based on the length of their shared prefix (see the table below).
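As a rough illustration of comparing Where parts by prefix, the sketch below counts how many leading characters two Where parts share. The helper name shared_prefix_length is hypothetical, and the second Where part in the usage line is a made-up string chosen only to show the mechanics; the sketch does not translate the prefix length into a distance bound.

```python
def shared_prefix_length(where_a: str, where_b: str) -> int:
    """Count the leading characters two Where parts have in common.

    Hypothetical helper: hyphens are ignored because they only group the
    characters into triplets, and anything up to "@" is discarded.
    """
    def normalize(value: str) -> str:
        # Keep only the text after the last "@", then drop the hyphens.
        return value.rpartition("@")[2].replace("-", "")

    count = 0
    for char_a, char_b in zip(normalize(where_a), normalize(where_b)):
        if char_a != char_b:
            break
        count += 1
    return count


# "5vg-7qg-xyz" is an invented Where part used purely for illustration.
print(shared_prefix_length("@5vg-7qg-tvz", "@5vg-7qg-xyz"))
```

A longer shared prefix means a tighter bound on how far apart the two hexagons can be; a short shared prefix says nothing about distance, since neighboring hexagons can still have dissimilar codes.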
However, it is important to be aware that nearby hexagons may have codes that are not very similar. This occurs when Placekey grid cells are near the edges of larger (i.e., lower resolution) hexagons in H3’s spatial hierarchy. Pictured below are three neighboring Placekeys whose shared vertex is also shared by three resolution 5 (with edge length roughly 8.5 km, in orange) hexagons. Each of these Placekey hexagons is nested under the resolution 5 hexagon that contains most of its area, which is why their encodings are so different.
The What part of a Placekey encodes an address (when it is just about an address) or an address and a POI (when a POI is present). Encodings that start with a zero “0” refer to addresses and those that start with a one “1” refer to a POI. In the future, there might be other types that start with a 2, 3, etc. What parts are only unique up to the Where part of a Placekey.
There are a number of ways to convey information about a location on the Earth. The standard and most basic is a latitude and longitude pair, but such pairs define points along a continuum rather than regions, so multiple pairs of coordinates are required to specify a region. They also require at least 5 decimal places to specify something the size of a big-box store.
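To give a rough sense of how much ground each decimal place of a coordinate covers, the sketch below converts a number of decimal places into an approximate distance. It assumes a spherical Earth with the mean radius shown, and the function name decimal_place_resolution_m is hypothetical; the latitude in the usage loop is an arbitrary example.

```python
import math

# Assumption: spherical Earth with the mean radius in metres.
EARTH_RADIUS_M = 6_371_008.8


def decimal_place_resolution_m(decimals: int, latitude_deg: float = 0.0):
    """Approximate ground distance spanned by one unit in the last decimal place.

    Returns (north_south_m, east_west_m). East-west distances shrink with the
    cosine of the latitude because meridians converge toward the poles.
    """
    metres_per_degree = math.pi * EARTH_RADIUS_M / 180.0
    step_deg = 10.0 ** -decimals
    north_south = step_deg * metres_per_degree
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


# Arbitrary example latitude of 40 degrees.
for places in range(3, 7):
    ns, ew = decimal_place_resolution_m(places, latitude_deg=40.0)
    print(f"{places} decimals: {ns:.2f} m north-south, {ew:.2f} m east-west")
```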
One desirable property of latitude and longitude is that it is easy to tell relative spatial relationships between multiple points (often referred to as “proximity”). This motivates the use of hierarchical systems for geocoding so that, outside of boundary conditions, nearby locations will have similar codes and vice versa. These boundary conditions are inescapable, as codes are one-dimensional while the surface of the Earth is not. Some geocoding solutions, such as what3words, are ruled out because they do not have this proximity property.
There are several candidate geocoding systems such as Geohash, Open Location Code (also known as Plus Codes), S2, and H3. The first three of these use a rectangular grid system to tile the globe, while H3 uses a hexagonal grid system. Neither of these grid systems is completely regular, since rectangles and hexagons tile the plane but not the sphere. In the case of rectangular grid systems, the grid breaks down at the poles, where triangles must be used instead of rectangles and a large number of grid cells must touch. H3 handles the grid breakdown by starting with an icosahedral projection of the surface of the Earth (i.e., a 3D shape with 20 faces and 12 vertices), where each face can be regularly tiled by hexagons and boundary conditions are introduced at each of the twelve vertices of the icosahedron. These boundary conditions require the use of pentagons rather than hexagons for the cells containing these vertices, which has less impact on the adjacency structure of cells than the boundary cases of the rectangular grids. H3 is also designed so that these pentagonal cells are centered over bodies of water.
Another benefit of H3 over rectangular grid systems is the aforementioned adjacency structure of cells. In a hexagonal grid, each cell has 6 neighbors with which it shares an edge, and the centers of these neighbors are all equidistant from the center of the given cell. In a rectangular grid, each cell has 4 neighbors with which it shares an edge and 4 neighbors with which it shares only a vertex. Furthermore, the centers of the edge-sharing neighbors are closer to the center of the given cell than those of the vertex-sharing neighbors. The simpler adjacency structure of the hexagonal grid makes analyses of spatial data easier than with rectangular grids (see ESRI’s Why Hexagons? for instance). This property also makes it easier to approximate complicated shapes such as various governmental boundaries using hexagons than with rectangles of a similar size.
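The difference in adjacency structure can be checked numerically on an idealized flat grid. The sketch below is not H3 itself: it assumes a regular pointy-top hexagonal grid in axial coordinates and a unit square grid, and all function names are hypothetical.

```python
import math

# Axial-coordinate offsets of the 6 edge-sharing neighbors of a hexagon.
HEX_NEIGHBOR_OFFSETS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]

# Offsets of the 8 neighbors of a square cell (4 share an edge, 4 only a vertex).
SQUARE_NEIGHBOR_OFFSETS = [
    (dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)
]


def hex_center(q: int, r: int, size: float = 1.0):
    """Planar center of a pointy-top hexagon at axial coordinates (q, r)."""
    return (size * math.sqrt(3) * (q + r / 2), size * 1.5 * r)


def hex_neighbor_distances(size: float = 1.0):
    """Center-to-center distances from the hexagon at (0, 0) to its neighbors."""
    return [math.hypot(*hex_center(q, r, size)) for q, r in HEX_NEIGHBOR_OFFSETS]


def square_neighbor_distances(side: float = 1.0):
    """Center-to-center distances from a square cell to its 8 neighbors."""
    return [math.hypot(dx * side, dy * side) for dx, dy in SQUARE_NEIGHBOR_OFFSETS]


# Hexagons yield a single distinct distance; squares yield two.
print(sorted({round(d, 9) for d in hex_neighbor_distances()}))
print(sorted({round(d, 9) for d in square_neighbor_distances()}))
```

With one neighbor distance instead of two, operations such as "all cells within k steps" approximate a disc on the hexagonal grid without needing to decide whether diagonal neighbors count.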
A further benefit of using H3 is that it has low distortion of hexagons across the globe when compared to a rectangular grid system (e.g., expansion of landmasses near the poles in a Mercator projection). For H3 the entire range of distortion occurs across each face of the icosahedron, where hexagons in the center of each face have larger area than those near the edges of the face.
A final reason we opted to use H3 for Placekey is that it is open source and there is already a community of other libraries, tools, and services using H3 (e.g., kepler.gl, deck.gl, h3-py, h3-js, h3-node, geojson2h3, geo (Clojure), pgh3, bigquery-jslibs, H3 Indexes, Logstash H3 filter plugin). We wish to make Placekey as widely usable as possible, which means bundling its functionality into open source libraries and including Placekeys as part of other services and data sets.
In order to maintain the What part of a Placekey, Placekey maintains databases of addresses and POI. Incoming addresses and POI are either matched against pre-existing places in our databases, or they are assigned new Placekeys in our database. Since the address, location, or name of a place may change over time we also keep a historical record of related Placekeys to enable easy deduping of places in a data set.
The Placekey service API provides the ability to look up Placekeys based on location (latitude and longitude), address, or address plus location name.
There are Python and JavaScript libraries for working with Placekeys. These cannot provide or validate the What part of a Placekey, but they can compute and validate Where parts of Placekeys. These libraries also contain additional functionality that allows for the conversion of Where parts into various geometric formats, and vice versa. Example notebooks for the Python library are hosted in a separate repository.
To learn more about Placekey or to try it for yourself, visit our website at placekey.io.