You open the Uber app, tap "Request Ride", and within two seconds you see nearby drivers on a map. Behind that tap is a real-time geospatial query across millions of moving data points — answered in under 100 milliseconds. No brute-force scan. No city-wide database sweep. Just smart engineering built on top of spatial indexing, in-memory stores, and persistent connections.
This post breaks down exactly how Uber and Lyft solve this problem — the actual data structures, algorithms, and architecture choices that make sub-100ms driver lookup possible at global scale. The two companies solved the same problem in genuinely different ways, and both approaches are worth understanding.
The core problem: At peak hours, Uber has 5–6 million active drivers worldwide, each updating their GPS location every 4–5 seconds. A rider needs the nearest available driver in <100ms. A naive SQL query like SELECT * FROM drivers ORDER BY distance(lat, lon) LIMIT 5 would scan millions of rows on every request — completely infeasible.
Uber's Approach — H3 Hexagonal Indexing
Uber built and open-sourced H3 — a geospatial indexing system that divides the entire surface of Earth into a hierarchy of hexagonal cells. Hexagons are chosen deliberately: unlike squares, every hexagon has exactly 6 equidistant neighbors, which makes radius searches and traversals mathematically clean and uniform.
Step 1 — The Driver's Persistent WebSocket
Every driver's app maintains a persistent WebSocket connection to Uber's Location Service. Every 4–5 seconds, the driver's GPS coordinates (lat, long) are packaged as a lightweight Protocol Buffer (protobuf) and pushed over that connection.
Why WebSocket over HTTP polling? Persistent connections eliminate the TCP + TLS handshake overhead on every update. At 5 million active drivers pinging every 5 seconds, that's 1 million location updates per second — HTTP polling at that volume would collapse any server fleet under connection overhead alone.
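That throughput figure is just arithmetic — a quick sanity check:

```python
# Back-of-envelope throughput: drivers reporting every `interval_s` seconds.
def updates_per_second(active_drivers: int, interval_s: float) -> float:
    return active_drivers / interval_s

# ~5 million drivers on a 5-second cadence:
rate = updates_per_second(5_000_000, 5.0)
print(f"{rate:,.0f} updates/sec")  # 1,000,000 updates/sec
```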
The protobuf payload is tiny on purpose:
```protobuf
// driver_location.proto
syntax = "proto3";

message DriverLocation {
  string driver_id = 1;
  double latitude  = 2;
  double longitude = 3;
  float  heading   = 4;  // direction in degrees
  int64  timestamp = 5;  // Unix ms
  bool   available = 6;
}
```
Binary protobuf is ~3–5x smaller than the equivalent JSON. Over millions of updates per second, this saves significant bandwidth and parse time.
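To see the gap concretely, here's a rough comparison using the stdlib `struct` module as a stand-in for protobuf (real protobuf uses varint encoding and field tags, so exact sizes differ, but the order of magnitude holds):

```python
import json
import struct

# The same update as JSON text vs a fixed-width binary record.
update = {
    "driver_id": "drv_0001",
    "latitude": 37.7749,
    "longitude": -122.4194,
    "heading": 270.0,
    "timestamp": 1712345678901,
    "available": True,
}

json_bytes = json.dumps(update).encode("utf-8")

# 8-byte id + two doubles + float + int64 + bool: little-endian, no padding
binary = struct.pack(
    "<8sddfq?",
    update["driver_id"].encode(),
    update["latitude"],
    update["longitude"],
    update["heading"],
    update["timestamp"],
    update["available"],
)

print(len(json_bytes), len(binary))  # ~135 bytes vs 37 bytes — roughly 3.6x
```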
Step 2 — H3 Cell Encoding
When the Location Service receives a driver's GPS, it immediately converts the (lat, long) into an H3 cell ID — a 64-bit integer that uniquely identifies which hexagonal cell on Earth's surface that coordinate falls in.
H3 operates at 16 resolution levels (0–15). Higher resolution = smaller cells:
| Resolution | Approximate scale |
|---|---|
| 0 | Country-level |
| 7–8 | Neighborhood |
| 9 | Uber's primary |
| 10–11 | City block |
| 15 | Max precision (~1 m²) |
Uber primarily uses resolution 9 for driver lookup — each cell averages roughly 0.1 km² (hexagon edge length ~174 m): small enough to be precise, large enough to group multiple drivers per cell for efficient batch lookups.
```python
import h3

# Convert driver GPS to an H3 cell ID at resolution 9
def encode_driver_location(lat: float, lng: float) -> str:
    return h3.geo_to_h3(lat, lng, 9)

# Example
driver_lat, driver_lng = 37.7749, -122.4194  # San Francisco
cell_id = encode_driver_location(driver_lat, driver_lng)
# Returns: "8928308280fffff" — a unique hex cell identifier
```
Step 3 — Redis as the Live Driver Store
The H3 cell ID becomes a Redis key. The value is the set of driver IDs currently inside that cell. When a driver moves into a new cell, they're removed from the old key and added to the new one — atomically.
```python
# Location Service pseudocode — on each driver update
def update_driver_location(driver_id, lat, lng, available):
    new_cell = h3.geo_to_h3(lat, lng, 9)
    old_cell = redis.get(f"driver:{driver_id}:cell")
    old_cell = old_cell.decode() if old_cell else None  # redis-py returns bytes

    pipe = redis.pipeline()
    if old_cell and old_cell != new_cell:
        pipe.srem(f"cell:{old_cell}:drivers", driver_id)   # remove from old cell
    pipe.sadd(f"cell:{new_cell}:drivers", driver_id)       # add to new cell
    pipe.set(f"driver:{driver_id}:cell", new_cell, ex=30)  # auto-expire if driver goes offline
    pipe.set(f"driver:{driver_id}:loc", f"{lat},{lng}", ex=30)        # store exact coords
    pipe.set(f"driver:{driver_id}:available", int(available), ex=30)
    pipe.execute()  # single round-trip, run as one MULTI/EXEC transaction
```
Redis pipelines batch all these operations into a single round-trip. The EXPIRE on the cell key ensures stale drivers are automatically evicted — if a driver's app crashes and stops sending updates, they disappear from the index within 30 seconds.
Step 4 — K-Ring Search (Rider Requesting a Ride)
When a rider taps "Request Ride", the system runs a k-ring traversal. It converts the rider's location to an H3 cell, then expands outward in rings of neighboring cells, querying Redis for available drivers at each ring until enough candidates are found.
```python
def find_nearby_drivers(rider_lat, rider_lng, max_rings=3):
    rider_cell = h3.geo_to_h3(rider_lat, rider_lng, 9)
    candidates = []

    for ring in range(max_rings + 1):
        # Cells in this ring only (ring 0 is just the rider's own cell)
        ring_cells = [rider_cell] if ring == 0 else h3.hex_ring(rider_cell, ring)

        # Batch query Redis for all drivers in these cells
        pipe = redis.pipeline()
        for cell in ring_cells:
            pipe.smembers(f"cell:{cell}:drivers")
        cell_results = pipe.execute()

        for driver_ids in cell_results:
            for raw_id in driver_ids:
                driver_id = raw_id.decode()
                if redis.get(f"driver:{driver_id}:available") == b'1':
                    lat_s, lng_s = redis.get(f"driver:{driver_id}:loc").decode().split(',')
                    dist = haversine(rider_lat, rider_lng, float(lat_s), float(lng_s))
                    candidates.append((driver_id, dist))

        if len(candidates) >= 10:
            break  # enough candidates found — stop expanding

    # Sort by distance, return top 5
    return sorted(candidates, key=lambda x: x[1])[:5]
```
K-ring geometry: Ring 0 = 1 cell. Ring 1 = 7 cells total (6 neighbors + center). Ring 2 = 19 cells. Ring 3 = 37 cells. The search expands in O(k²) cells but each Redis lookup is O(1), so even searching 37 cells takes single-digit milliseconds.
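Those ring sizes follow directly from hex-grid geometry — ring k has 6k cells (for k > 0), so the cumulative count through ring k is 3k(k+1) + 1:

```python
def hex_ring_size(k: int) -> int:
    """Number of cells in ring k of a hex grid (ring 0 is the center cell)."""
    return 1 if k == 0 else 6 * k

def cells_within(k: int) -> int:
    """Total cells in rings 0..k: 1 + 6*(1 + 2 + ... + k) = 3k(k+1) + 1."""
    return 3 * k * (k + 1) + 1

print([cells_within(k) for k in range(4)])  # [1, 7, 19, 37]
```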
Uber's Complete Architecture
Key architectural decisions in Uber's system:
- Redis sharded by city: Driver data for New York is never on the same shard as San Francisco. This bounds the size of each Redis node and keeps query fan-out local.
- Location Service is stateless: Any instance can handle any driver's update — no sticky sessions. The state lives entirely in Redis.
- Dispatch is separate from Location: The service writing location data is completely decoupled from the service reading it for matching. This lets each scale independently.
Lyft's Approach — Google S2 Sphere Geometry
Lyft took a different path: instead of hexagons, they built on top of Google's S2 geometry library, which subdivides the surface of a sphere (Earth) into a hierarchy of square-ish cells derived from projecting a cube onto the sphere. S2 is the same library Google Maps uses internally.
S2 vs H3 — The Core Difference
S2 and H3 both solve the same problem — map a GPS coordinate to a discrete spatial cell — but with different geometry:
| Property | H3 (Uber) | S2 (Lyft / Google) |
|---|---|---|
| Cell shape | Hexagon | Quad (square-ish) |
| Neighbor count | 6 (uniform distance) | 4 edge + 4 corner = 8 |
| Hierarchy levels | 16 (res 0–15) | 31 (level 0–30) |
| Cell ID type | 64-bit index (hex string in APIs) | 64-bit integer |
| Area uniformity | ~±4% variance | ~2× variance pole-to-equator |
| Range queries | K-ring traversal | Cell unions / coverings |
| Open source | Yes (Uber) | Yes (Google) |
Step 1 — Driver Location via HTTP + WebSocket Hybrid
Lyft's driver app sends location updates using a combination of approaches depending on network conditions. On stable connections: WebSocket like Uber. On degraded networks: HTTP long-polling with backoff. The fallback matters because rideshare drivers often operate in areas with poor signal (underground garages, rural zones).
Lyft's driver location payload looks roughly like this (simplified); `status` is one of `available`, `on_trip`, or `offline`:

```json
{
  "driver_id": "drv_abc123",
  "lat": 37.3382,
  "lng": -121.8863,
  "bearing": 270,
  "speed_mph": 18.4,
  "timestamp_ms": 1712345678901,
  "status": "available"
}
```
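The long-polling fallback typically pairs with exponential backoff, so a driver in a dead zone doesn't hammer the server on every reconnect. A generic sketch of the pattern (this is the standard full-jitter approach, not Lyft's actual client code):

```python
import random

def backoff_schedule(base_s: float = 1.0, cap_s: float = 60.0, attempts: int = 6):
    """Exponential backoff with full jitter: before each retry, sleep a
    random duration up to min(cap, base * 2^attempt)."""
    for attempt in range(attempts):
        ceiling = min(cap_s, base_s * (2 ** attempt))
        yield random.uniform(0, ceiling)

delays = list(backoff_schedule())
# Delays are bounded by the doubling ceilings: 1, 2, 4, 8, 16, 32 seconds.
```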
Step 2 — S2 Cell Encoding
Lyft's Location Service converts GPS to an S2 cell ID. They primarily use level 13 for driver indexing — each cell at level 13 covers roughly 1.2 km², similar in scale to Uber's resolution 9 hex cells.
| Level | Approximate scale |
|---|---|
| 4–5 | Country |
| 9–10 | Metro area |
| 13 | Lyft's primary |
| 16–17 | City block |
| 30 | Max precision (~1 cm²) |
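The level-13 figure can be sanity-checked from S2's construction: each of the cube's 6 faces is subdivided 4x per level, so the average cell area is Earth's surface divided by 6 · 4^level (actual cells vary around this mean because of the cube-to-sphere projection):

```python
EARTH_SURFACE_KM2 = 510_072_000  # total surface area of Earth

def avg_s2_cell_area_km2(level: int) -> float:
    """Average S2 cell area at a given level: 6 cube faces, each split
    into 4 children per level -> surface / (6 * 4^level)."""
    return EARTH_SURFACE_KM2 / (6 * 4 ** level)

print(round(avg_s2_cell_area_km2(13), 2))  # 1.27 km² — matching the text
```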
```python
import s2sphere

def encode_driver_s2(lat: float, lng: float, level: int = 13) -> int:
    latlng = s2sphere.LatLng.from_degrees(lat, lng)
    cell_id = s2sphere.CellId.from_lat_lng(latlng)
    return cell_id.parent(level).id()  # returns 64-bit integer

# Example
cell_id = encode_driver_s2(37.3382, -121.8863, level=13)
# Returns: 3832992110956265472 — unique S2 cell ID
```
Step 3 — S2 Cell Union for Region Search
The key S2 concept Lyft uses for driver lookup is a cell union — a compact set of S2 cells that together cover a geographic region (like a 2km radius circle). Instead of ring-by-ring expansion like Uber's k-ring, S2 pre-computes which cells cover the search area and queries them in parallel.
```python
import math
import s2sphere

def find_nearby_drivers_lyft(rider_lat, rider_lng, radius_km=2.0):
    center = s2sphere.LatLng.from_degrees(rider_lat, rider_lng)
    center_point = s2sphere.CellId.from_lat_lng(center).to_point()

    # Build a spherical cap (circle on the sphere's surface)
    earth_radius_km = 6371.0
    angle = s2sphere.Angle.from_degrees(math.degrees(radius_km / earth_radius_km))
    cap = s2sphere.Cap.from_axis_angle(center_point, angle)

    # Get the level-13 S2 cells that cover this cap
    region_coverer = s2sphere.RegionCoverer()
    region_coverer.min_level = 13
    region_coverer.max_level = 13
    region_coverer.max_cells = 20  # limit to 20 cells max
    covering = region_coverer.get_covering(cap)

    # Single pipelined Redis round-trip for all covering cells
    pipe = redis.pipeline()
    for cell in covering:
        pipe.smembers(f"s2cell:{cell.id()}:drivers")
    all_results = pipe.execute()

    candidates = []
    for driver_ids in all_results:
        for raw_id in driver_ids:
            driver_id = raw_id.decode()
            info = redis.hgetall(f"driver:{driver_id}")
            if info.get(b'status') == b'available':
                dlat, dlng = float(info[b'lat']), float(info[b'lng'])
                dist = haversine(rider_lat, rider_lng, dlat, dlng)
                if dist <= radius_km:
                    candidates.append((driver_id, dist))

    return sorted(candidates, key=lambda x: x[1])[:10]
```
Lyft's Zone-Based Dispatch
Beyond the cell indexing, Lyft overlays a zone-based dispatch system. Cities are divided into logical zones (e.g., "SFO Airport Zone", "Downtown SF Zone"). Drivers and riders are always tagged to a zone. This lets Lyft do coarse-grained filtering before the precise geo query — ruling out entire zones instantly if they're far away, then running S2 cell lookups only within candidate zones.
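A minimal sketch of that zone-first filter, with entirely hypothetical zone names and cell IDs (Lyft's real schema isn't public): only zones near the rider contribute cells to the precise S2 lookup.

```python
# Hypothetical mapping of logical zones to their member S2 cell IDs.
ZONE_CELLS = {
    "downtown_sf": {9260949, 9260950, 9260951},
    "sfo_airport": {9261200, 9261201},
    "oakland":     {9270100, 9270101},
}

def candidate_cells(rider_zone: str, neighbor_zones: dict) -> set:
    """Coarse filter: search only the rider's zone and its adjacent zones;
    all other zones are ruled out before any geo query runs."""
    zones = [rider_zone] + neighbor_zones.get(rider_zone, [])
    cells = set()
    for zone in zones:
        cells |= ZONE_CELLS.get(zone, set())
    return cells

neighbors = {"downtown_sf": ["sfo_airport"]}
print(sorted(candidate_cells("downtown_sf", neighbors)))
# [9260949, 9260950, 9260951, 9261200, 9261201] — Oakland is never queried
```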
Side-by-Side Comparison
| System Component | UBER | LYFT |
|---|---|---|
| Spatial Index | H3 Hexagonal (Res 9) | S2 Sphere Cells (Level 13) |
| Cell shape | Hexagon — 6 uniform neighbors | Quad — 8 neighbors, slight distortion |
| Cell area | ~0.1 km² at Res 9 | ~1.2 km² at Level 13 |
| Search strategy | K-ring expansion (ring by ring) | S2 cap covering (parallel batch) |
| Driver transport | WebSocket + Protobuf | WebSocket + HTTP fallback + JSON |
| In-memory store | Redis Cluster (city-sharded) | Redis Cluster (zone-sharded) |
| Driver TTL / eviction | 30s key expiry on last update | Status flag + heartbeat timeout |
| Dispatch model | Pure geo proximity then ETA rank | Zone-first then S2 geo proximity |
| ETA model | ML model (road network + traffic) | ML model (zones + surge + road) |
| Open source library | H3 (github.com/uber/h3) | S2 (github.com/google/s2geometry) |
| End-to-end latency | < 100ms | < 100ms |
What Both Systems Share
Despite the different indexing approaches, Uber and Lyft converge on the same foundational architecture patterns:
1. Location Updates Never Touch a Database
Neither system writes driver GPS updates to a relational database at write time. Every update goes directly to Redis. The relational DB (Postgres, MySQL) is only used for non-real-time data: trip history, billing, driver profiles. For the live geospatial queries that matter for latency, it's all in-memory.
2. Exact Distance Computed Only for Candidates
The cell-based lookup gives a shortlist of maybe 50–200 nearby drivers. The exact haversine distance (or road-network distance for ETA) is computed only for these candidates — not for all 5 million active drivers. This is the key insight that makes it fast: the cell index does the heavy spatial filtering; the precise math runs on a small set.
```python
import math

def haversine(lat1, lon1, lat2, lon2) -> float:
    """Great-circle distance in km between two GPS points."""
    R = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi/2)**2 + math.cos(phi1)*math.cos(phi2)*math.sin(dlambda/2)**2
    return 2 * R * math.asin(math.sqrt(a))
```
3. Driver State Expiry as a Safety Net
Both systems use TTL-based expiry on driver keys. If a driver's app crashes, loses signal, or the phone dies, they automatically disappear from the index after ~30 seconds. Without this, dead entries would accumulate in Redis and be served to riders as "available drivers" that never respond.
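The mechanics of passive TTL eviction — what Redis `EXPIRE` (or `SET ... EX`) gives you for free — can be sketched in a few lines, using an injectable clock to make the timeline explicit:

```python
# Minimal sketch of passive TTL eviction, not a Redis client.
class TTLStore:
    def __init__(self, clock):
        self._clock = clock  # callable returning current time in seconds
        self._data = {}      # key -> (value, expires_at)

    def set(self, key, value, ttl_s):
        self._data[key] = (value, self._clock() + ttl_s)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:  # stale: evict lazily on read
            del self._data[key]
            return None
        return value

now = [0.0]
store = TTLStore(lambda: now[0])
store.set("driver:42:cell", "8928308280fffff", ttl_s=30)
now[0] = 10.0
print(store.get("driver:42:cell"))  # still fresh: "8928308280fffff"
now[0] = 31.0
print(store.get("driver:42:cell"))  # None — evicted, like a crashed driver app
```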
4. Write Path and Read Path are Completely Separate
The Location Service (writes) and Dispatch Service (reads) are separate microservices. They share the Redis store but nothing else. This lets them scale independently — during peak rider demand, the Dispatch Service scales horizontally without touching the Location Service, and vice versa during high driver-registration periods.
The Numbers That Matter
| Metric | Value | Why It Matters |
|---|---|---|
| Driver GPS update interval | 4–5 seconds | Balances freshness vs battery drain |
| Active drivers at peak (Uber) | ~5–6 million | Sets Redis memory requirements |
| Location updates / second | ~1 million | WebSocket write throughput needed |
| Driver match latency | <100ms p99 | User-perceived responsiveness SLA |
| Redis lookup latency | ~0.5ms per key | 100 cell lookups = ~50ms max |
| H3 cells at Res 9 globally | ~4.8 billion | Only occupied cells become Redis keys |
| Avg drivers per H3 cell (busy city) | 2–8 | Small sets = fast SMEMBERS |
| Driver eviction TTL | 30 seconds | Max staleness in the index |
What Actually Happens in 100ms
From rider tap to driver notification, the latency budget breaks down roughly like this: the geospatial lookup in Redis takes ~10ms, while the ETA service is the slowest stage — the ML model computing road-network arrival times, factoring in live traffic, turn restrictions, and one-way streets, takes 40–60ms. Most of the 100ms budget is spent on making the ETA accurate, not finding the drivers.
Key Takeaways
- Spatial cell indexing converts an O(n) scan into O(1) lookup. The core insight: encode coordinates into a fixed cell ID, use the cell ID as a Redis key. Driver lookup becomes a key lookup, not a distance scan.
- Hexagons (H3) are geometrically cleaner for k-ring searches. Every hexagon neighbor is equidistant. Squares have corner neighbors that are √2× farther than edge neighbors — this creates search asymmetry that hexagons avoid.
- S2 cell unions are better for irregular region queries. If you need to cover a non-circular area (a city district, an airport zone, a geofenced region), S2 coverings are more expressive than hexagonal k-rings.
- The write path is the hard part. Reading 100 Redis keys is easy. Writing 1 million location updates per second to Redis without dropping any or causing replication lag — that's the engineering challenge.
- TTL-based eviction > explicit cleanup. At this scale, you cannot guarantee a "driver went offline" event is always delivered. A passive TTL ensures correctness without requiring a reliable disconnect signal.
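The hexagon-vs-square neighbor asymmetry in the second takeaway is easy to verify numerically: a unit square grid has two distinct neighbor distances, while a unit hex grid has only one.

```python
import math

# Square grid with unit spacing: edge neighbors are 1 unit away,
# corner neighbors are sqrt(2) ~= 1.414 units away.
square_neighbors = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                    if (dx, dy) != (0, 0)]
square_dists = sorted({round(math.hypot(dx, dy), 3) for dx, dy in square_neighbors})
print(square_dists)  # [1.0, 1.414] — two distinct neighbor distances

# Hex grid with unit spacing: all six neighbors sit on a unit circle
# at 60-degree intervals, so every neighbor is equidistant.
hex_neighbors = [(math.cos(math.radians(60 * i)), math.sin(math.radians(60 * i)))
                 for i in range(6)]
hex_dists = {round(math.hypot(x, y), 3) for x, y in hex_neighbors}
print(hex_dists)  # {1.0} — one distance
```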