Since my last post I've found out that using the 'morton' number to index spatial number is also referred to as the Z-order.
To index using this order, you can use this stored function:
- CREATE FUNCTION getGeoMorton(lat DOUBLE, lng DOUBLE) RETURNS BIGINT UNSIGNED DETERMINISTIC
- BEGIN
-
- -- 11930464 is round(maximum value of a 32bit integer / 360 degrees)
-
- DECLARE bit, morton, pos BIGINT UNSIGNED DEFAULT 0;
-
- SET @lat = CAST((lat + 90) * 11930464 AS UNSIGNED);
- SET @lng = CAST((lng + 180) * 11930464 AS UNSIGNED);
- SET bit = 1;
-
- WHILE bit <= @lat || bit <= @lng DO
-
- IF(bit & @lat) THEN SET morton = morton | ( 1 << (2 * pos + 1)); END IF;
- IF(bit & @lng) THEN SET morton = morton | ( 1 << (2 * pos)); END IF;
-
- SET pos = pos + 1;
-
- SET bit = 1 << pos;
-
- END WHILE;
-
- RETURN morton;
- END;
Some caveats
- Since the function is using floating-point numbers, there will be rounding errors. These are generally very small, and for us well within the acceptable margin of error.
- More significantly, this function assumes euclidean geometry (i.e.: a flat surface). The earth obviously isn't, so as you get closer to the poles you might get back results from outside your rectangle, or miss results from within.
- It's not recommended to use this function directly in your WHERE clause. Even though the function is marked deterministic (i.e.: it will always yield the same results for the same arguments), the MySQL query optimizer currently ignored that modifier. So, set it in a temporary variable first (SET @number =) and use the variable in your where clause.
- Don't forget dealing with queries spanning over the -180°, 180° longitude line.
- Also store your actual longitude and latitude values in the DB. This function is not intended to be exact.
- When you do your queries, select both on the morton number, and longitude and latitude.
A better way to do it?
In my research I've found the Hilbert Curve to be an even better algorithm, but haven't yet gone through the effort of trying to express it in SQL.

Hi!
I just read your posts on geo indexing, and from there i ended up reading wikipedia on the matter. Interesting stuff, thanks!
I just want to point out there is an error in your MySQL getGeoMorton function. The line
RETURN @lng;
Should not be there. I also think there are a number of things that could be improved in your implementation. For example, there is no need to use the user-defined varibles (the @ variables). You will find that these are quite a bit slower than proper local variables.
There are more performance optimizations possible (and with MySQL stored functions, you need all the speed you can get). Here's what I came up with:
DELIMITER go
DROP FUNCTION IF EXISTS f_morton;
go
CREATE FUNCTION f_morton(
p_longitude DOUBLE
, p_latitude DOUBLE
)
RETURNS BIGINT UNSIGNED
DETERMINISTIC
BEGIN
DECLARE v_bit BIGINT UNSIGNED
DEFAULT 1;
DECLARE v_shift BIGINT UNSIGNED
DEFAULT 1;
DECLARE v_morton BIGINT UNSIGNED
DEFAULT 0;
DECLARE v_latitude BIGINT UNSIGNED
DEFAULT CAST((p_latitude + 90) * 11930464 AS UNSIGNED);
DECLARE v_longitude BIGINT UNSIGNED
DEFAULT CAST((p_longitude + 180) * 11930464 AS UNSIGNED);
WHILE v_bit <= v_latitude="" ||="" v_bit="" <="v_longitude" do="" if="" &="" v_longitude="" then="" set="" v_morton="" :="v_bit" |="" v_shift;="" end="" if;="" v_shift="" <<="" 1;="" while;="" return="" v_morton;="" end;="" go="" delimiter="" ;="" some="" testing="" indicated="" this="" is="" about="" 20-25%="" faster.="" i="" hope="" it="" helps!="" cheers,="" roland="">
Dropped that return line. I must have left it in after debugging; Thanks for that Roland!
jbland,
From what I read there is that geohashing uses the exact same concept for encoding lat/long and shares the same limitations. The difference is that I encode to a 64bit integer, geohashing encodes this number further as a string.
Using the integer will give you better performance.
Evert
Evert,
im thinking of going with morton numbers to help with clustering, but im unsure about how to dial up/down precision... any pointers
It's not really easy to calculate range, what you could do is simply drop bits off the bottom, like: morton XOR 0x1 to drop the last bit, morton XOR 0x2 for the 2 last bits, etc..
That way you might not get the best possible clusters, but it could definitely push you in the right direction.