Continuing with the changes in PostGIS 3.1, in this post I’ll cover the performance improvements to many functions that output geometries either as binary or as text. I will talk about several changes that were all kicked off by a single question: “Now what?”
After the release of 3.0 back in October 2019, I wasn’t working on anything directly related to PostGIS, with the exception of using MVTs to make maps. Since I had recently improved the vector tile functions and I was ok with their performance (spoiler alert: I ended up improving them again in 3.1), I didn’t have any good ideas about what to work on next. But then two events happened around the same time: CppCon released their 2019 talks on YouTube, and a coworker mentioned moving large amounts of information between PostgreSQL and BigQuery, which was not only slow but also prone to adding inaccuracies to the data.
It turned out that the only way to stop BigQuery from altering the geometries on input was to use the GeoJSON format (ST_AsGeoJSON), which outputs a JSON document with the geometry in text form. I had also just watched Stephan T. Lavavej’s talk, where he explained how they had improved the performance of the conversion from floating point numbers to strings in Microsoft’s C++ Standard Library. Those two things clicked in my head: if I could apply a similar approach inside PostGIS, we would print geometries 10x faster. Once I started exploring the code and deciding how best to approach the issue, I also found several quick enhancements that could be applied to both text and binary output functions, so the task grew but the spirit remained: let’s make getting geometries out of PostGIS faster.
Floating point to string
The foundation on which Microsoft’s <charconv> change rested was Ryū, an algorithm developed by Ulf Adams that greatly improved the speed of float to string conversion, which is likely to be one of the most used functions in computer programs. Think about how many times we print numbers every day: anything from logs, reports, showing them on screen, ETLs… Improving a function like this means a direct reduction in the energy we use on our devices and data centers, which means that it literally avoids the emission of tons of CO2, which reduces the impact of mankind on climate, slightly tipping the scale in our favor in the fight to save the world from ourselves.
Most developers don’t need to know how to convert a floating point number to a string, since the standard libraries or the programming languages themselves give us that functionality, and in an ideal world the enhancements introduced by Ryū would be automagically implemented everywhere, solving all of our problems. Sadly for us, reality is different: the changes in the output (when compared with the traditional printf output) and the extremely long process to update core system libraries mean that there is always more work to do. So although not every programmer has to care about this kind of stuff, some do. For example, Andrew Gierth (a.k.a. our beloved RhodiumToad on IRC) brought Ryū’s improvements to PostgreSQL 12, and I set out to emulate him and do the same in PostGIS.
First implementation: A hack
The first step in the process was to confirm my estimate of how impactful the change would be. I wasn’t looking for a perfect match; I just wanted something that was close enough to the existing coordinate output, with much better performance.
Since PostgreSQL 12 had introduced a similar change, I modified PostGIS’ lwprint_double to use those functions directly, with an ugly hack that involved hardcoding compiler and linker flags relative to my local installation. This confirmed that I was on the right path, as I saw a 3-4x improvement in performance in ST_AsText. From that moment, I knew that working on a clean integration was indeed worth it.
Second implementation: Ryū
With the thesis proven, I moved on to integrating the library into PostGIS itself. In the proposed change (#523) I used Ryū’s d2exp_buffered_n to print coordinates and it worked great: depending on the function (and what percentage of the CPU time was actually spent printing doubles), the output functions became 1.2x to 8x as fast as they were in 3.0. Nevertheless, I did find several points of friction when trying to match the previous output:
The previous output printed as many decimal digits as the caller requested. This was in part because it was hard to know how many digits you needed to ensure the number you printed would be reimported as exactly the same binary number, and on many occasions it led to a lot of meaningless data: for example,
The previous implementation had some bugs with the precision parameter (which determines how many decimal digits are included) and sometimes returned fewer characters than requested. Those issues had to be addressed.
I couldn’t simply replace the old call to sprintf with Ryū’s functions because 3.0 did extra operations like trimming trailing zeros. I initially added this truncation to our version of Ryū, but I wasn’t very happy with the change since I was adding characters to the buffer only to remove them moments later.
d2fixed_buffered_n doesn’t provide a way to limit the precision of the output. My initial implementation truncated the string result, which was slow and wrong as it didn’t do proper rounding.
With all this in mind, I decided that I would rather break compatibility and change the coordinate output format to provide a faster and more human-friendly alternative. Hopefully with fewer bugs.
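To make the truncation problem mentioned above concrete, here is a minimal sketch in Python (not PostGIS' C code; all function names are hypothetical and only illustrate the idea). It contrasts cutting an already printed string with letting the formatter round, and shows the kind of trailing-zero trimming described for 3.0.

```python
def truncate_decimals(value: float, precision: int) -> str:
    """Print many digits, then cut the string: no rounding happens."""
    integer_part, decimals = f"{value:.17f}".split(".")
    if precision <= 0:
        return integer_part
    return f"{integer_part}.{decimals[:precision]}"


def round_decimals(value: float, precision: int) -> str:
    """Let the formatter round correctly to `precision` decimals."""
    return f"{value:.{precision}f}"


def trim_trailing_zeros(text: str) -> str:
    """Drop trailing zeros after the decimal point (and a dangling point)."""
    if "." in text:
        text = text.rstrip("0").rstrip(".")
    return text


truncated = truncate_decimals(1.999, 1)        # "1.9": plainly wrong
rounded = round_decimals(1.999, 1)             # "2.0": properly rounded
trimmed = trim_trailing_zeros(rounded)         # "2": zeros written, then removed
print(truncated, rounded, trimmed)
```

The last line also hints at the inefficiency: the zeros are written into the string first and stripped afterwards.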
Third implementation: Custom Ryū
The final step, which will be part of PostGIS 3.1, was focused on getting a better coordinate output while keeping the performance improvements of the second iteration. After multiple tests and discussions, the final decision was to use the following rules:
Use the shortest representation, which is enough to guarantee round-trip safety. 0.3 will always be represented as 0.3 for any precision greater than 0.
Use scientific notation for absolute numbers smaller than 1e-8. The previous behaviour was to output 0 for absolute values smaller than 1e-12, which meant a precision loss around zero.
Use scientific notation for absolute numbers greater than 1e+15, which was the same behaviour as before.
The precision parameter still limits only the number of decimal digits of the output, but now it is applied with proper rounding to the shortest representation; that is, it will only trim meaningful decimal digits. It is also applied in exactly the same way to all text output functions.
The precision parameter now also affects the scientific notation, whereas before the precision for large numbers was fixed to between 5 and 8 digits.
The default precision value remains unchanged: 9 for GeoJSON and 15 for everything else.
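The rules above can be approximated in a few lines of Python. This is only a sketch under stated assumptions, not the PostGIS implementation: `format_coordinate` and `_round_and_trim` are hypothetical names, Python's `repr()` stands in for Ryū's shortest representation, the rounding mode (half-even) and the exact spelling of the exponent are my guesses, and non-finite values are not handled.

```python
from decimal import Decimal, ROUND_HALF_EVEN, localcontext

SCI_MIN = 1e-8   # smaller magnitudes use scientific notation
SCI_MAX = 1e15   # larger magnitudes use scientific notation


def _round_and_trim(number: Decimal, precision: int) -> str:
    """Round to `precision` decimals, then drop trailing zeros."""
    with localcontext() as ctx:
        ctx.prec = 100  # wide enough for quantize() at any sane precision
        step = Decimal(1).scaleb(-precision)
        rounded = number.quantize(step, rounding=ROUND_HALF_EVEN)
    text = format(rounded, "f")
    if "." in text:
        text = text.rstrip("0").rstrip(".")
    return "0" if text == "-0" else text


def format_coordinate(value: float, precision: int = 15) -> str:
    """Shortest round-trip digits, limited to `precision` decimals."""
    if value == 0:
        return "0"
    shortest = Decimal(repr(value))  # repr() gives the shortest round-trip form
    magnitude = abs(value)
    if magnitude < SCI_MIN or magnitude > SCI_MAX:
        exponent = shortest.adjusted()
        mantissa = _round_and_trim(shortest.scaleb(-exponent), precision)
        if mantissa in ("10", "-10"):  # rounding carried into a new digit
            mantissa = mantissa[:-1]
            exponent += 1
        return f"{mantissa}e{exponent:+d}"
    return _round_and_trim(shortest, precision)


for precision in (1, 13, 15, 19):
    print(precision, format_coordinate(0.3, precision),
          format_coordinate(22.200000000000003, precision))
```

Because the precision is applied to the shortest representation rather than to a long expansion of the binary value, asking for more digits than the number needs never adds noise.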
The new code is based on Ryū’s d2s, modified to handle the format defined above, and has a faster and more consistent output than before. Let’s see an example of the same geometry at different precision levels:
PostGIS 3.0:
SELECT x, ST_AsText(ST_MakePoint(0.3, 22.200000000000003), x)
FROM generate_series(1, 20, 2) x;
 x  | st_astext
----+---------------------------------------------------
  1 | POINT(0.3 22.2)
  3 | POINT(0.3 22.2)
  5 | POINT(0.3 22.2)
  7 | POINT(0.3 22.2)
  9 | POINT(0.3 22.2)
 11 | POINT(0.3 22.2)
 13 | POINT(0.3 22.2)
 15 | POINT(0.3 22.2)
 17 | POINT(0.29999999999999999 22.200000000000003)
 19 | POINT(0.2999999999999999889 22.20000000000000284)
(10 rows)
PostGIS 3.1:
SELECT x, ST_AsText(ST_MakePoint(0.3, 22.200000000000003), x)
FROM generate_series(1, 20, 2) x;
 x  | st_astext
----+-------------------------------
  1 | POINT(0.3 22.2)
  3 | POINT(0.3 22.2)
  5 | POINT(0.3 22.2)
  7 | POINT(0.3 22.2)
  9 | POINT(0.3 22.2)
 11 | POINT(0.3 22.2)
 13 | POINT(0.3 22.2)
 15 | POINT(0.3 22.200000000000003)
 17 | POINT(0.3 22.200000000000003)
 19 | POINT(0.3 22.200000000000003)
(10 rows)
We can see the two most noticeable changes here:
0.29999999999999999 represents the same binary value as 0.3, so in 3.1 we always prefer 0.3 as it’s shorter and still safe for a round trip.
At precision 15, you already have enough digits to show 22.200000000000003 which, due to a bug, wouldn’t show until precision 17 in 3.0. As is the case for 0.3, that number already has as many digits as needed to uniquely identify a binary floating point number, so there is no need to add more digits at higher precision levels.
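The round-trip claim is easy to check outside PostGIS. The following Python snippet (an illustration only, relying on the fact that Python's `repr()` also prints the shortest string that parses back to the same double) shows that both spellings of 0.3 map to the same binary value, while 22.2 and 22.200000000000003 do not.

```python
shortest = repr(0.3)       # shortest string that round-trips
padded = f"{0.3:.17f}"     # the same double printed with 17 decimals
same_double = float(padded) == float(shortest)

neighbour = 22.200000000000003
distinct = float("22.2") != neighbour   # a different double than 22.2
neighbour_text = repr(neighbour)        # already the shortest form

print(shortest, padded, same_double)
print(neighbour_text, distinct)
```

So printing 0.29999999999999999 adds characters but no information, whereas the trailing 3 in 22.200000000000003 is needed to tell it apart from 22.2.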
Once you speed up the slowest wheel in the process, others rise in importance or even become the new bottleneck, so aside from introducing Ryū and changing the coordinate output format I also applied several other performance improvements:
In many output functions, both in text and binary output, we now generate the exact buffer that we are going to return to PostgreSQL instead of a temporary one that later needs to be copied to add a header (#541).
In functions that need the SRS for the output (like ST_AsGML or optionally ST_AsGeoJSON) we cache it instead of generating it for each row (#557). We also avoid SQL inlines where the cache is destroyed after each row, which made it useless (#561). This also affects ST_GeomfromGeoJSON and the equivalent
I also adapted the cost of the SQL functions to help the PostgreSQL planner make better decisions (#556).
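The benefit of caching the SRS can be pictured with a small Python analogy. This is not how PostGIS implements its cache: `srs_name` and the `EPSG:` string it builds are hypothetical stand-ins for whatever per-SRID lookup the output function needs.

```python
from functools import lru_cache

lookup_count = 0  # how many times the "expensive" lookup really runs


@lru_cache(maxsize=None)
def srs_name(srid: int) -> str:
    """Hypothetical stand-in for resolving the SRS text of a SRID."""
    global lookup_count
    lookup_count += 1
    return f"EPSG:{srid}"


# One SRID per row; a real table typically repeats the same value.
srids = [4326] * 1000 + [3857] * 1000
names = [srs_name(srid) for srid in srids]
print(lookup_count)  # 2 lookups for 2000 rows
```

If the cache were thrown away after every row, as happened with the inlined SQL, the counter would reach 2000 and the cache would bring no benefit.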
For most of the benchmarks I’ll use the Boundaries of Canada Provinces dataset, which contains only 13 multipolygons with an average of 260k points and over 2000 rings. It’s my favourite dataset when I want to do any performance test.
$ shp2pgsql -D -s 4326 -I lpr_000b16a_e/lpr_000b16a_e canada | psql -U postgres benchmarks
For ST_GeoHash, which only works with points, I’ll use the 2015 NYC tree census with 683788 points.
$ shp2pgsql -D -s 4326 -I geo_export_d02b464e-5a77-4dc8-b5d1-f92150d03a11 trees | psql -U postgres benchmarks
And for ST_AsEncodedPolyline, which only works on single LineStrings, I’ll use the first line of each of the 4248 MultiLineStrings of the US coastline from Tiger 2019.
$ shp2pgsql -D -s 4326 -I tl_2019_us_coastline coastline | psql -U postgres benchmarks
To run the tests I’m using pgbench and running one query over and over for 30 seconds and comparing the query latency between PostGIS 3.0 and 3.1.
$ head -n 5 file.pgbench
-- Binary output functions
EXPLAIN (ANALYZE , TIMING OFF) SELECT ST_AsBinary(geom) FROM canada;
--EXPLAIN (ANALYZE , TIMING OFF) SELECT ST_AsTWKB(geom) FROM canada;
--EXPLAIN (ANALYZE , TIMING OFF) SELECT ST_GeoHash(geom) FROM trees;
--EXPLAIN (ANALYZE , TIMING OFF) SELECT ST_AsEncodedPolyline(geom) FROM coastline_simple;
$ pgbench -c 1 -T 30 -r -f file.pgbench -U postgres benchmarks
...
statement latencies in milliseconds:
       154.625  EXPLAIN (ANALYZE , TIMING OFF) SELECT ST_AsBinary(geom) FROM canada;
Binary output functions
These functions were the least impacted by the changes, as they were only affected by some of the minor improvements to avoid unnecessary copies. Nevertheless, we see an improvement of 4-9%:
| Function | 3.0 latency (ms) | 3.1 latency (ms) | Change |
Text output functions
In text output functions, we see the full impact of all the changes, with functions that are 2x to more than 40x as fast as in the previous release.
| Function | 3.0 latency (ms) | 3.1 latency (ms) | Change |
|---|---|---|---|
| ST_AsText default (polygons) | 4962.685 | 551.860 | 9x |
| ST_AsText default (points) | 590.086 | 222.566 | 2.65x |
| ST_AsText precision=0 (polygons) | 3192.996 | 556.264 | 5.74x |
| ST_AsText precision=20 (polygons) | 5914.018 | 552.952 | 10.7x |
| ST_AsEWKT default (polygons) | 4954.474 | 555.070 | 8.93x |
| ST_AsGeoJSON default (polygons) | 4068.112 | 496.636 | 8.19x |
| ST_AsGeoJSON short CRS (poly) | 4043.459 | 500.971 | 8.07x |
| ST_AsGeoJSON short CRS (points) | 10920.411 | 279.894 | 39.02x |
| ST_AsGML v3 default (polygons) | 5267.115 | 893.432 | 5.9x |
| ST_AsGML v3 default (points) | 11073.952 | 322.920 | 34.29x |
| ST_AsSVG default (polygons) | 5135.232 | 924.468 | 5.55x |
| ST_AsSVG default (points) | 576.794 | 242.407 | 2.38x |
| ST_AsX3D default (polygons) | 5409.824 | 1157.401 | 4.67x |
| ST_AsX3D default (points) | 10580.443 | 242.407 | 43.65x |
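For reference, the "Change" column is simply the 3.0 latency divided by the 3.1 latency. A small Python sketch reproduces it for a few rows copied from the table (the `speedup` helper is just an illustrative name):

```python
# (3.0 latency, 3.1 latency) in milliseconds, taken from the table above
latencies_ms = {
    "ST_AsText default (polygons)": (4962.685, 551.860),
    "ST_AsText default (points)": (590.086, 222.566),
    "ST_AsX3D default (points)": (10580.443, 242.407),
}


def speedup(old_ms: float, new_ms: float) -> float:
    """How many times as fast the new release is."""
    return old_ms / new_ms


for name, (old_ms, new_ms) in latencies_ms.items():
    print(f"{name}: {speedup(old_ms, new_ms):.2f}x")
```

This prints 8.99x, 2.65x and 43.65x, in line with the rounded values listed.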