Monday, January 7, 2013

Visualizing Geohash

I recently had to process data about places, or points of interest, around the globe. It seemed natural to organize these records by their location. The standard way to group Hadoop records is to make the records in the same group share a key prefix. I needed to somehow convert a (latitude, longitude) pair into a string of characters, and that is when I found Geohash. It is a well-known dimensionality reduction technique that transforms a two-dimensional spatial point (latitude, longitude) into an alphanumeric string, or hash.
I'll describe the details of the points-of-interest processing in a future post. In this post, I will describe Geohash visually, because I believe that is easier for some people (like myself) to understand, and it would have saved me some time had anyone else done it.

First, the whole globe is divided into 32 cells, 4 rows by 8 columns, and each cell is assigned an alphanumeric character.
World Geohash'd.
The Geohash algorithm maps a point (latitude, longitude) to one of the characters above depending on which cell it falls into. For instance, most points in Alaska would be mapped to B. The reverse mapping is also possible, i.e., converting B back to a point yields the latitude and longitude of the center of the B cell.
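As a rough sketch of this first step, the Python below maps a point to its cell in the 4x8 world grid and back to the cell center. The function names `first_level_cell` and `cell_center` are made up for illustration. The code assumes the standard Geohash alphabet and bit order (longitude bit first), and it returns lowercase characters, whereas this post writes them in uppercase.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard Geohash alphabet (no a, i, l, o)


def first_level_cell(lat, lon):
    """Return the one-character Geohash of the world-grid cell containing the point."""
    col = min(int((lon + 180.0) / 45.0), 7)  # 8 columns, 45 degrees wide
    row = min(int((lat + 90.0) / 45.0), 3)   # 4 rows, 45 degrees tall, row 0 at the south
    # Interleave the bits as: lon, lat, lon, lat, lon (most significant first).
    index = (
        ((col >> 2) & 1) << 4
        | ((row >> 1) & 1) << 3
        | ((col >> 1) & 1) << 2
        | (row & 1) << 1
        | (col & 1)
    )
    return BASE32[index]


def cell_center(ch):
    """Reverse mapping: return (lat, lon) of the center of a first-level cell."""
    index = BASE32.index(ch.lower())
    col = ((index >> 4) & 1) << 2 | ((index >> 2) & 1) << 1 | (index & 1)
    row = ((index >> 3) & 1) << 1 | ((index >> 1) & 1)
    return (-90.0 + (row + 0.5) * 45.0, -180.0 + (col + 0.5) * 45.0)


print(first_level_cell(61.2, -149.9))  # a point near Anchorage, Alaska -> b
print(cell_center("B"))                # -> (67.5, -157.5)
```

The coordinates for Anchorage are approximate and only serve as an example input.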

At this level, the Geohash is not very precise because the cells are too big. To improve precision, each cell can be recursively subdivided into 32 parts until the desired precision is achieved. The final Geohash is the concatenation of the characters of the cells chosen along the way. For instance, the picture below shows cell D (from the picture above) subdivided, with the labels of its new sub-cells.

The next picture goes a few steps further and shows the DR5RU cell, which is in Midtown NYC. It is a level 5 cell, i.e., the cell DR above was divided into 32 and cell 5 was chosen, making the new cell DR5. That cell was then divided into 32 again and R was chosen, making the new cell DR5R. Finally, it was divided into 32 one more time and U was selected, ending in DR5RU.
Midtown NYC level 5 cell
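The recursive subdivision can be sketched in Python as repeated halving: each character is 5 bits, and each bit halves either the longitude or the latitude range, so 5 bits pick one of 32 sub-cells. Because 5 is odd, the sub-grid alternates between 8 columns by 4 rows and 4 columns by 8 rows from one level to the next. The name `geohash_encode` and the sample coordinate (roughly Times Square) are my own choices for illustration, and the output is lowercase as in the standard alphabet.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard Geohash alphabet


def geohash_encode(lat, lon, precision=5):
    """Encode a point as a Geohash with `precision` characters (levels)."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    index = 0  # value of the 5-bit group currently being built
    for bit in range(5 * precision):
        # Even bits halve the longitude range, odd bits halve the latitude range.
        rng, value = (lon_range, lon) if bit % 2 == 0 else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        index <<= 1
        if value >= mid:
            index |= 1      # point is in the upper half
            rng[0] = mid
        else:
            rng[1] = mid    # point is in the lower half
        if bit % 5 == 4:    # 5 bits collected: emit one character
            chars.append(BASE32[index])
            index = 0
    return "".join(chars)


for level in range(1, 6):
    print(level, geohash_encode(40.758, -73.9855, level))
# 1 d
# 2 dr
# 3 dr5
# 4 dr5r
# 5 dr5ru
```

Each level only appends a character, which is why records in the same cell share a key prefix.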

Lastly, one may need to understand how the cells are laid out spatially. In my case, I needed to calculate all the neighbors of a given cell, and having this visual representation helped me create test cases. The cells are organized in a Z-like order that starts at the bottom left corner and finishes at the top right corner, as displayed in the picture below.
Z-like order of the Geohashes
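One simple way to compute neighbors, not necessarily the one I used, is to decode a cell to its bounding box, step one cell width or height away from its center, and encode the resulting point at the same level. The Python sketch below assumes the standard Geohash bit layout; `encode`, `bounding_box`, and `neighbors` are illustrative names. It wraps around the 180° meridian and skips neighbors that would fall beyond a pole.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard Geohash alphabet


def encode(lat, lon, precision):
    """Encode a point by alternately halving the longitude and latitude ranges."""
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    chars, index = [], 0
    for bit in range(5 * precision):
        rng, value = (lon_range, lon) if bit % 2 == 0 else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        index <<= 1
        if value >= mid:
            index |= 1
            rng[0] = mid
        else:
            rng[1] = mid
        if bit % 5 == 4:
            chars.append(BASE32[index])
            index = 0
    return "".join(chars)


def bounding_box(geohash):
    """Return ([lat_min, lat_max], [lon_min, lon_max]) of a cell."""
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    bit = 0
    for ch in geohash.lower():
        index = BASE32.index(ch)
        for shift in (4, 3, 2, 1, 0):
            rng = lon_range if bit % 2 == 0 else lat_range
            mid = (rng[0] + rng[1]) / 2
            if (index >> shift) & 1:
                rng[0] = mid
            else:
                rng[1] = mid
            bit += 1
    return lat_range, lon_range


def neighbors(geohash):
    """Return the up-to-8 cells surrounding `geohash`, at the same level."""
    (lat_lo, lat_hi), (lon_lo, lon_hi) = bounding_box(geohash)
    lat_c, lon_c = (lat_lo + lat_hi) / 2, (lon_lo + lon_hi) / 2
    result = []
    for dlat in (1, 0, -1):          # north row, same row, south row
        for dlon in (-1, 0, 1):      # west, center, east
            if dlat == 0 and dlon == 0:
                continue
            lat = lat_c + dlat * (lat_hi - lat_lo)
            if not -90.0 < lat < 90.0:
                continue             # no cell beyond the poles
            lon = (lon_c + dlon * (lon_hi - lon_lo) + 180.0) % 360.0 - 180.0
            result.append(encode(lat, lon, len(geohash)))
    return result


print(neighbors("d"))  # -> ['c', 'f', 'g', '9', 'e', '3', '6', '7']
```

The first-level result can be checked by eye against the world map above, which is the kind of test case the pictures made easy to write.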
It is worth noting that alphanumeric proximity of Geohashes does not always mean geospatial proximity. Points in 7 and 8 are very far apart even though the two characters are numerically adjacent.
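To make this concrete, the sketch below rebuilds the first-level grid from the bit interleaving (assuming the standard Geohash alphabet and longitude-first bit order) and looks up where a character sits. `world_grid` and `position` are hypothetical helper names.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard Geohash alphabet


def world_grid():
    """Return the first-level grid as 4 strings, listed north to south like a map."""
    rows = []
    for row in (3, 2, 1, 0):  # row 3 is the northernmost
        cells = []
        for col in range(8):
            # Interleave column (lon) and row (lat) bits: lon, lat, lon, lat, lon.
            index = (
                ((col & 4) << 2)
                | ((row & 2) << 2)
                | ((col & 2) << 1)
                | ((row & 1) << 1)
                | (col & 1)
            )
            cells.append(BASE32[index])
        rows.append("".join(cells))
    return rows


def position(ch):
    """Return (row_from_top, column) of a character in the world grid."""
    for r, line in enumerate(world_grid()):
        if ch in line:
            return r, line.index(ch)
    raise ValueError(f"{ch!r} is not a Geohash character")


print("\n".join(world_grid()))
# bcfguvyz
# 89destwx
# 2367kmqr
# 0145hjnp
print(position("7"), position("8"))  # -> (2, 3) (1, 0)
```

Cells 7 and 8 end up three columns apart and on different rows, while characters that are far apart in the alphabet can share an edge.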
Geohash is a very useful concept and I hope this post helps people to understand it.

 

18 comments:

  1. Hi,

    I've noticed that A is not in the map. I assume that A is all the map.

    Is that right?

    Greetings

    Asterios

    1. Yeah, there is no A, I, L, among others, in decoding. see here for details

  2. Hello Paulo,

    thanks for the post. What i noticed in the 'World Geohash'd.'-Map is that there is no 'A'-letter.

    Can you say us where is the 'A'-letter?

    I suppose that it is all the 'World Geohash'd.'-Map. Is that right?

    Thanks

  3. Nice article! Thanks! Looking forward to a more detailed post on geohash.

  4. Hello Paulo! I am very interested in your presentation. So, I would like to know how you will use the Z-order to visualize the geohash because similarities in the hash is not synonym of proximity. Thank you ! Anned

  5. Hello Paulo! I am very interested in your presentation. So, I would like to know how you will use the Z-order to visualize the geohash because similarities in the hash is not synonym of proximity. Thank you ! Anned

  6. Thank you, it was very useful :)

  7. Very usefull, thanks!

    But... Why there is no letter 'a' ?

  8. Thank you for this, it was so useful to read this before diving into ES Geohashing.

  9. Hello Thanks for the useful info.......

  10. Very helpful, thanks. It is really good to see how one geohash value is mapped to one particular cell in the world map.
