The most important tool for geospatial performance optimization:
Single-stage query engine (SSQE):
EXPLAIN PLAN FORSELECT store_name, address, ST_Distance(location, ST_Point(-122.4194, 37.7749)) as distanceFROM stores WHERE ST_Distance(location, ST_Point(-122.4194, 37.7749)) < 5000ORDER BY distance;
Multi-stage query engine MSQE:
set explainAskingServers=true;EXPLAIN PLAN FORSELECT store_name, address, ST_Distance(location, ST_Point(-122.4194, 37.7749)) as distanceFROM storesWHERE ST_Distance(location, ST_Point(-122.4194, 37.7749)) < 5000ORDER BY distance;
Setting explainAskingServers query option is necessary for MSQE to show usage of segment-level indexes.
Shows as FilterH3Index in MSQE.
When: ST_Distance with range conditions
WHERE ST_Distance(location_column, literal_point) < distanceWHERE ST_Distance(location_column, literal_point) > distance WHERE ST_Distance(location_column, literal_point) BETWEEN min_dist AND max_dist
-- Correct argument order (column first, literal second)WHERE ST_Distance(location_column, ST_Point(-122.4194, 37.7749)) < 5000-- Range queries work wellWHERE ST_Distance(location_column, ST_Point(-122.4194, 37.7749)) BETWEEN 1000 AND 5000-- Use with ORDER BY for nearest neighborSELECT * FROM stores WHERE ST_Distance(location, ST_Point(-122.4194, 37.7749)) < 10000ORDER BY ST_Distance(location, ST_Point(-122.4194, 37.7749))LIMIT 10;
Non-Optimized Patterns:
-- Wrong argument order (literal first)WHERE ST_Distance(ST_Point(-122.4194, 37.7749), location_column) < 5000-- Both arguments are columns (can't use index)WHERE ST_Distance(pickup_location, dropoff_location) < 5000-- Complex expressions in spatial functionWHERE ST_Distance(location, ST_Point(lon_column, lat_column)) < 5000
-- Combine spatial and non-spatial filters efficientlySELECT store_name, categoryFROM stores WHERE category = 'restaurant' -- Fast filter first AND ST_Distance(location, ST_Point(-122.4194, 37.7749)) < 2000 -- Spatial filter AND rating > 4.0; -- Additional filters after spatial
Subquery Optimization:
-- Use spatial filtering in subquery to reduce datasetSELECT s.store_name, s.address, r.avg_ratingFROM ( SELECT store_id, store_name, address FROM stores WHERE ST_Distance(location, ST_Point(-122.4194, 37.7749)) < 5000) sJOIN ratings r ON s.store_id = r.store_id;
Spatial Aggregations:
-- Efficient spatial groupingSELECT region, COUNT(*) as store_count, AVG(ST_Distance(location, ST_Point(-122.4194, 37.7749))) as avg_distanceFROM stores WHERE ST_Distance(location, ST_Point(-122.4194, 37.7749)) < 20000GROUP BY regionORDER BY avg_distance;
-- Enable query execution metricsSET enableQueryExecutionMetrics = true;-- Run your spatial query and check execution timeSELECT store_name FROM stores WHERE ST_Distance(location, ST_Point(-122.4194, 37.7749)) < 5000;
Measure: Ratio of filtered docs to total docs
EXPLAIN PLAN FORSELECT COUNT(*) FROM stores WHERE ST_Distance(location, ST_Point(-122.4194, 37.7749)) < 5000;
Good Ratios:
>90% filtered: Excellent index effectiveness
50-90% filtered: Good performance
<50% filtered: Consider different resolutions
Check which resolutions are being used:Test different query radii:
-- Small radius - should use high resolutionEXPLAIN PLAN FOR SELECT COUNT(*) FROM stores WHERE ST_Distance(location, ST_Point(-122.4194, 37.7749)) < 500;-- Large radius - should use low resolution EXPLAIN PLAN FOR SELECT COUNT(*) FROM storesWHERE ST_Distance(location, ST_Point(-122.4194, 37.7749)) < 50000;
-- Benchmark different query patterns-- Small radius (high precision)SELECT 'Small Radius' as test_type, COUNT(*) as result_countFROM stores WHERE ST_Distance(location, ST_Point(-122.4194, 37.7749)) < 1000;-- Medium radius (typical use case)SELECT 'Medium Radius' as test_type, COUNT(*) as result_count FROM storesWHERE ST_Distance(location, ST_Point(-122.4194, 37.7749)) < 5000;-- Large radius (coarse filtering)SELECT 'Large Radius' as test_type, COUNT(*) as result_countFROM stores WHERE ST_Distance(location, ST_Point(-122.4194, 37.7749)) < 25000;
Performance Comparison:
-- Compare indexed vs non-indexed performance-- (Temporarily disable index for comparison)-- Time indexed querySELECT store_name FROM stores_indexedWHERE ST_Distance(location, ST_Point(-122.4194, 37.7749)) < 5000;-- Time non-indexed query SELECT store_name FROM stores_no_indexWHERE ST_Distance(location, ST_Point(-122.4194, 37.7749)) < 5000;
-- For very large datasets, use approximate filtering firstWITH nearby_candidates AS ( SELECT store_id, location FROM stores WHERE ST_Distance(location, ST_Point(-122.4194, 37.7749)) < 6000 -- Slightly larger radius)SELECT s.store_name, s.addressFROM nearby_candidates ncJOIN stores s ON nc.store_id = s.store_idWHERE ST_Distance(nc.location, ST_Point(-122.4194, 37.7749)) < 5000; -- Exact filter
Batch Spatial Operations:
-- Process multiple locations efficientlyWITH locations_of_interest(location_name, lon, lat) AS ( VALUES ('SF', -122.4194, 37.7749), ('Oakland', -122.2711, 37.8044), ('San Jose', -121.8863, 37.3382))SELECT loi.location_name, COUNT(*) as nearby_storesFROM locations_of_interest loiCROSS JOIN stores sWHERE ST_Distance(s.location, ST_Point(loi.lon, loi.lat)) < 5000GROUP BY loi.location_name;
-- Select only needed columnsSELECT store_name, addressFROM stores WHERE ST_Distance(location, ST_Point(-122.4194, 37.7749)) < 5000;-- Avoid SELECT * with spatial dataSELECT * FROM stores -- Includes large spatial columnsWHERE ST_Distance(location, ST_Point(-122.4194, 37.7749)) < 5000;
Result Set Limiting:
-- Use LIMIT for interactive queriesSELECT store_name, ST_Distance(location, ST_Point(-122.4194, 37.7749)) as distanceFROM stores WHERE ST_Distance(location, ST_Point(-122.4194, 37.7749)) < 10000ORDER BY distanceLIMIT 20; -- Only get top 20 results
Symptoms: Queries taking >1 second, high CPU usageDiagnosis:
EXPLAIN PLAN FOR your_slow_query;
Solutions:
Verify H3 index operators are present
Check argument order in spatial functions
Ensure resolutions match query patterns
Add additional non-spatial filters
Index Not Being Used
Symptoms: EXPLAIN PLAN shows FILTER instead of FILTER_H3_INDEXDiagnosis Checklist:
Column configured with encodingType: "RAW"
Column in noDictionaryColumns array
H3 index configured with appropriate resolution
Query uses correct argument order
Spatial function is supported (ST_Distance, ST_Within, ST_Contains)
Query distance is not >100x the hexagon size (auto-reverts to full scan)
Distance vs Resolution Check:
If your query distance is much larger than hexagon size, the H3IndexFilterOperator automatically reverts to non-indexed expression for performance reasons.Fix Example:
{ "fieldConfigList": [{ "name": "location", "encodingType": "RAW", // Must be RAW "indexes": { "h3": { "resolution": [8] // Single resolution only } } }]}
High Memory Usage
Symptoms: Out of memory errors, slow ingestionSolutions:
Reduce number of H3 resolutions (keep 2-3 max)
Remove very high resolutions (13-15) if not needed