You open the Uber app, tap "Request Ride", and within two seconds you see nearby drivers on a map. Behind that tap is a real-time geospatial query across millions of moving data points — answered in under 100 milliseconds. No brute-force scan. No city-wide database sweep. Just smart engineering built on top of spatial indexing, in-memory stores, and persistent connections.
This post breaks down exactly how Uber and Lyft solve this problem — the actual data structures, algorithms, and architecture choices that make sub-100ms driver lookup possible at global scale. The two companies solved the same problem in genuinely different ways, and both approaches are worth understanding.
The core problem: At peak hours, Uber has 5–6 million active drivers worldwide, each updating their GPS location every 4–5 seconds. A rider needs the nearest available driver in <100ms. A naive SQL query like SELECT * FROM drivers ORDER BY distance(lat, lon) LIMIT 5 would scan millions of rows on every request — completely infeasible.
Uber's Approach — H3 Hexagonal Indexing
Uber built and open-sourced H3 — a geospatial indexing system that divides the entire surface of Earth into a hierarchy of hexagonal cells. Hexagons are chosen deliberately: unlike squares, every hexagon has exactly 6 equidistant neighbors, which makes radius searches and traversals mathematically clean and uniform.
Step 1 — The Driver's Persistent WebSocket
Every driver's app maintains a persistent WebSocket connection to Uber's Location Service. Every 4–5 seconds, the driver's GPS coordinates (lat, long) are packaged as a lightweight Protocol Buffer (protobuf) and pushed over that connection.
Why WebSocket over HTTP polling? Persistent connections eliminate the TCP + TLS handshake overhead on every update. At 5 million active drivers pinging every 5 seconds, that's 1 million location updates per second — HTTP polling at that volume would collapse any server fleet under connection overhead alone.
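That throughput figure is just arithmetic — a quick sanity check:

```python
# Back-of-envelope throughput: drivers reporting every `interval_s` seconds.
def updates_per_second(active_drivers: int, interval_s: float) -> float:
    return active_drivers / interval_s

# ~5 million drivers on a 5-second cadence:
rate = updates_per_second(5_000_000, 5.0)
print(f"{rate:,.0f} updates/sec")  # 1,000,000 updates/sec
```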
The protobuf payload is tiny on purpose:
```protobuf
// driver_location.proto
syntax = "proto3";

message DriverLocation {
  string driver_id = 1;
  double latitude  = 2;
  double longitude = 3;
  float  heading   = 4;  // direction in degrees
  int64  timestamp = 5;  // Unix ms
  bool   available = 6;
}
```
Binary protobuf is ~3–5x smaller than the equivalent JSON. Over millions of updates per second, this saves significant bandwidth and parse time.
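To see the gap concretely, here's a rough comparison using the stdlib `struct` module as a stand-in for protobuf (real protobuf uses varint encoding and field tags, so exact sizes differ, but the order of magnitude holds):

```python
import json
import struct

# The same update as JSON text vs a fixed-width binary record.
update = {
    "driver_id": "drv_0001",
    "latitude": 37.7749,
    "longitude": -122.4194,
    "heading": 270.0,
    "timestamp": 1712345678901,
    "available": True,
}

json_bytes = json.dumps(update).encode("utf-8")

# 8-byte id + two doubles + float + int64 + bool: little-endian, no padding
binary = struct.pack(
    "<8sddfq?",
    update["driver_id"].encode(),
    update["latitude"],
    update["longitude"],
    update["heading"],
    update["timestamp"],
    update["available"],
)

print(len(json_bytes), len(binary))  # ~135 bytes vs 37 bytes — roughly 3.6x
```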
Step 2 — H3 Cell Encoding
When the Location Service receives a driver's GPS, it immediately converts the (lat, long) into an H3 cell ID — a 64-bit integer that uniquely identifies which hexagonal cell on Earth's surface that coordinate falls in.
H3 operates at 16 resolution levels (0–15). Higher resolution = smaller cells:
| Resolution | Approximate scale |
|---|---|
| 0 | Country-level |
| 7–8 | Neighborhood |
| 9 | Uber's primary |
| 10–11 | City block |
| 15 | Max precision (~1 m²) |
Uber primarily uses resolution 9 for driver lookup — each cell averages roughly 0.1 km² (hexagon edge length ~174 m): small enough to be precise, large enough to group multiple drivers per cell for efficient batch lookups.
```python
import h3

# Convert driver GPS to an H3 cell ID at resolution 9
def encode_driver_location(lat: float, lng: float) -> str:
    return h3.geo_to_h3(lat, lng, 9)

# Example
driver_lat, driver_lng = 37.7749, -122.4194  # San Francisco
cell_id = encode_driver_location(driver_lat, driver_lng)
# Returns: "8928308280fffff" — a unique hex cell identifier
```
Step 3 — Redis as the Live Driver Store
The H3 cell ID becomes a Redis key. The value is the set of driver IDs currently inside that cell. When a driver moves into a new cell, they're removed from the old key and added to the new one — atomically.
```python
# Location Service pseudocode — on each driver update
def update_driver_location(driver_id, lat, lng, available):
    new_cell = h3.geo_to_h3(lat, lng, 9)
    old_cell = redis.get(f"driver:{driver_id}:cell")
    old_cell = old_cell.decode() if old_cell else None  # redis-py returns bytes

    pipe = redis.pipeline()
    if old_cell and old_cell != new_cell:
        pipe.srem(f"cell:{old_cell}:drivers", driver_id)   # remove from old cell
    pipe.sadd(f"cell:{new_cell}:drivers", driver_id)       # add to new cell
    pipe.set(f"driver:{driver_id}:cell", new_cell, ex=30)  # auto-expire if driver goes offline
    pipe.set(f"driver:{driver_id}:loc", f"{lat},{lng}", ex=30)        # store exact coords
    pipe.set(f"driver:{driver_id}:available", int(available), ex=30)
    pipe.execute()  # single round-trip, run as one MULTI/EXEC transaction
```
Redis pipelines batch all these operations into a single round-trip. The EXPIRE on the cell key ensures stale drivers are automatically evicted — if a driver's app crashes and stops sending updates, they disappear from the index within 30 seconds.
Step 4 — K-Ring Search (Rider Requesting a Ride)
When a rider taps "Request Ride", the system runs a k-ring traversal. It converts the rider's location to an H3 cell, then expands outward in rings of neighboring cells, querying Redis for available drivers at each ring until enough candidates are found.
```python
def find_nearby_drivers(rider_lat, rider_lng, max_rings=3):
    rider_cell = h3.geo_to_h3(rider_lat, rider_lng, 9)
    candidates = []

    for ring in range(max_rings + 1):
        # Cells in this ring only (ring 0 is just the rider's own cell)
        ring_cells = [rider_cell] if ring == 0 else h3.hex_ring(rider_cell, ring)

        # Batch query Redis for all drivers in these cells
        pipe = redis.pipeline()
        for cell in ring_cells:
            pipe.smembers(f"cell:{cell}:drivers")
        cell_results = pipe.execute()

        for driver_ids in cell_results:
            for raw_id in driver_ids:
                driver_id = raw_id.decode()
                if redis.get(f"driver:{driver_id}:available") == b'1':
                    lat_s, lng_s = redis.get(f"driver:{driver_id}:loc").decode().split(',')
                    dist = haversine(rider_lat, rider_lng, float(lat_s), float(lng_s))
                    candidates.append((driver_id, dist))

        if len(candidates) >= 10:
            break  # enough candidates found — stop expanding

    # Sort by distance, return top 5
    return sorted(candidates, key=lambda x: x[1])[:5]
```
K-ring geometry: Ring 0 = 1 cell. Ring 1 = 7 cells total (6 neighbors + center). Ring 2 = 19 cells. Ring 3 = 37 cells. The search expands in O(k²) cells but each Redis lookup is O(1), so even searching 37 cells takes single-digit milliseconds.
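Those ring sizes follow directly from hex-grid geometry — ring k has 6k cells (for k > 0), so the cumulative count through ring k is 3k(k+1) + 1:

```python
def hex_ring_size(k: int) -> int:
    """Number of cells in ring k of a hex grid (ring 0 is the center cell)."""
    return 1 if k == 0 else 6 * k

def cells_within(k: int) -> int:
    """Total cells in rings 0..k: 1 + 6*(1 + 2 + ... + k) = 3k(k+1) + 1."""
    return 3 * k * (k + 1) + 1

print([cells_within(k) for k in range(4)])  # [1, 7, 19, 37]
```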
Uber's Complete Architecture
Key architectural decisions in Uber's system:
- Redis sharded by city: Driver data for New York is never on the same shard as San Francisco. This bounds the size of each Redis node and keeps query fan-out local.
- Location Service is stateless: Any instance can handle any driver's update — no sticky sessions. The state lives entirely in Redis.
- Dispatch is separate from Location: The service writing location data is completely decoupled from the service reading it for matching. This lets each scale independently.
Lyft's Approach — Google S2 Sphere Geometry
Lyft took a different path: instead of hexagons, they built on top of Google's S2 geometry library, which subdivides the surface of a sphere (Earth) into a hierarchy of square-ish cells derived from projecting a cube onto the sphere. S2 is the same library Google Maps uses internally.
S2 vs H3 — The Core Difference
S2 and H3 both solve the same problem — map a GPS coordinate to a discrete spatial cell — but with different geometry:
| Property | H3 (Uber) | S2 (Lyft / Google) |
|---|---|---|
| Cell shape | Hexagon | Quad (square-ish) |
| Neighbor count | 6 (uniform distance) | 4 edge + 4 corner = 8 |
| Hierarchy levels | 16 (res 0–15) | 31 (level 0–30) |
| Cell ID type | 64-bit index (hex string in APIs) | 64-bit integer |
| Area uniformity | ~±4% variance | ~2× variance pole-to-equator |
| Range queries | K-ring traversal | Cell unions / coverings |
| Open source | Yes (Uber) | Yes (Google) |
Step 1 — Driver Location via HTTP + WebSocket Hybrid
Lyft's driver app sends location updates using a combination of approaches depending on network conditions. On stable connections: WebSocket like Uber. On degraded networks: HTTP long-polling with backoff. The fallback matters because rideshare drivers often operate in areas with poor signal (underground garages, rural zones).
Lyft's driver location payload looks roughly like this (simplified); `status` is one of `available`, `on_trip`, or `offline`:

```json
{
  "driver_id": "drv_abc123",
  "lat": 37.3382,
  "lng": -121.8863,
  "bearing": 270,
  "speed_mph": 18.4,
  "timestamp_ms": 1712345678901,
  "status": "available"
}
```
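The long-polling fallback typically pairs with exponential backoff, so a driver in a dead zone doesn't hammer the server on every reconnect. A generic sketch of the pattern (this is the standard full-jitter approach, not Lyft's actual client code):

```python
import random

def backoff_schedule(base_s: float = 1.0, cap_s: float = 60.0, attempts: int = 6):
    """Exponential backoff with full jitter: before each retry, sleep a
    random duration up to min(cap, base * 2^attempt)."""
    for attempt in range(attempts):
        ceiling = min(cap_s, base_s * (2 ** attempt))
        yield random.uniform(0, ceiling)

delays = list(backoff_schedule())
# Delays are bounded by the doubling ceilings: 1, 2, 4, 8, 16, 32 seconds.
```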
Step 2 — S2 Cell Encoding
Lyft's Location Service converts GPS to an S2 cell ID. They primarily use level 13 for driver indexing — each cell at level 13 covers roughly 1.2 km², similar in scale to Uber's resolution 9 hex cells.
| Level | Approximate scale |
|---|---|
| 4–5 | Country |
| 9–10 | Metro area |
| 13 | Lyft's primary |
| 16–17 | City block |
| 30 | Max precision (~1 cm²) |
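The level-13 figure can be sanity-checked from S2's construction: each of the cube's 6 faces is subdivided 4x per level, so the average cell area is Earth's surface divided by 6 · 4^level (actual cells vary around this mean because of the cube-to-sphere projection):

```python
EARTH_SURFACE_KM2 = 510_072_000  # total surface area of Earth

def avg_s2_cell_area_km2(level: int) -> float:
    """Average S2 cell area at a given level: 6 cube faces, each split
    into 4 children per level -> surface / (6 * 4^level)."""
    return EARTH_SURFACE_KM2 / (6 * 4 ** level)

print(round(avg_s2_cell_area_km2(13), 2))  # 1.27 km² — matching the text
```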
```python
import s2sphere

def encode_driver_s2(lat: float, lng: float, level: int = 13) -> int:
    latlng = s2sphere.LatLng.from_degrees(lat, lng)
    cell_id = s2sphere.CellId.from_lat_lng(latlng)
    return cell_id.parent(level).id()  # returns 64-bit integer

# Example
cell_id = encode_driver_s2(37.3382, -121.8863, level=13)
# Returns: 3832992110956265472 — unique S2 cell ID
```
Step 3 — S2 Cell Union for Region Search
The key S2 concept Lyft uses for driver lookup is a cell union — a compact set of S2 cells that together cover a geographic region (like a 2km radius circle). Instead of ring-by-ring expansion like Uber's k-ring, S2 pre-computes which cells cover the search area and queries them in parallel.
```python
import math
import s2sphere

def find_nearby_drivers_lyft(rider_lat, rider_lng, radius_km=2.0):
    center = s2sphere.LatLng.from_degrees(rider_lat, rider_lng)
    center_point = s2sphere.CellId.from_lat_lng(center).to_point()

    # Build a spherical cap (circle on the sphere's surface)
    earth_radius_km = 6371.0
    angle = s2sphere.Angle.from_degrees(math.degrees(radius_km / earth_radius_km))
    cap = s2sphere.Cap.from_axis_angle(center_point, angle)

    # Get the level-13 S2 cells that cover this cap
    region_coverer = s2sphere.RegionCoverer()
    region_coverer.min_level = 13
    region_coverer.max_level = 13
    region_coverer.max_cells = 20  # limit to 20 cells max
    covering = region_coverer.get_covering(cap)

    # Single pipelined Redis round-trip for all covering cells
    pipe = redis.pipeline()
    for cell in covering:
        pipe.smembers(f"s2cell:{cell.id()}:drivers")
    all_results = pipe.execute()

    candidates = []
    for driver_ids in all_results:
        for raw_id in driver_ids:
            driver_id = raw_id.decode()
            info = redis.hgetall(f"driver:{driver_id}")
            if info.get(b'status') == b'available':
                dlat, dlng = float(info[b'lat']), float(info[b'lng'])
                dist = haversine(rider_lat, rider_lng, dlat, dlng)
                if dist <= radius_km:
                    candidates.append((driver_id, dist))

    return sorted(candidates, key=lambda x: x[1])[:10]
```
Lyft's Zone-Based Dispatch
Beyond the cell indexing, Lyft overlays a zone-based dispatch system. Cities are divided into logical zones (e.g., "SFO Airport Zone", "Downtown SF Zone"). Drivers and riders are always tagged to a zone. This lets Lyft do coarse-grained filtering before the precise geo query — ruling out entire zones instantly if they're far away, then running S2 cell lookups only within candidate zones.
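A minimal sketch of that zone-first filter, with entirely hypothetical zone names and cell IDs (Lyft's real schema isn't public): only zones near the rider contribute cells to the precise S2 lookup.

```python
# Hypothetical mapping of logical zones to their member S2 cell IDs.
ZONE_CELLS = {
    "downtown_sf": {9260949, 9260950, 9260951},
    "sfo_airport": {9261200, 9261201},
    "oakland":     {9270100, 9270101},
}

def candidate_cells(rider_zone: str, neighbor_zones: dict) -> set:
    """Coarse filter: search only the rider's zone and its adjacent zones;
    all other zones are ruled out before any geo query runs."""
    zones = [rider_zone] + neighbor_zones.get(rider_zone, [])
    cells = set()
    for zone in zones:
        cells |= ZONE_CELLS.get(zone, set())
    return cells

neighbors = {"downtown_sf": ["sfo_airport"]}
print(sorted(candidate_cells("downtown_sf", neighbors)))
# [9260949, 9260950, 9260951, 9261200, 9261201] — Oakland is never queried
```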
Side-by-Side Comparison
| System Component | UBER | LYFT |
|---|---|---|
| Spatial Index | H3 Hexagonal (Res 9) | S2 Sphere Cells (Level 13) |
| Cell shape | Hexagon — 6 uniform neighbors | Quad — 8 neighbors, slight distortion |
| Cell area | ~0.1 km² at Res 9 | ~1.2 km² at Level 13 |
| Search strategy | K-ring expansion (ring by ring) | S2 cap covering (parallel batch) |
| Driver transport | WebSocket + Protobuf | WebSocket + HTTP fallback + JSON |
| In-memory store | Redis Cluster (city-sharded) | Redis Cluster (zone-sharded) |
| Driver TTL / eviction | 30s key expiry on last update | Status flag + heartbeat timeout |
| Dispatch model | Pure geo proximity then ETA rank | Zone-first then S2 geo proximity |
| ETA model | ML model (road network + traffic) | ML model (zones + surge + road) |
| Open source library | H3 (github.com/uber/h3) | S2 (github.com/google/s2geometry) |
| End-to-end latency | < 100ms | < 100ms |
What Both Systems Share
Despite the different indexing approaches, Uber and Lyft converge on the same foundational architecture patterns:
1. Location Updates Never Touch a Database
Neither system writes driver GPS updates to a relational database at write time. Every update goes directly to Redis. The relational DB (Postgres, MySQL) is only used for non-real-time data: trip history, billing, driver profiles. For the live geospatial queries that matter for latency, it's all in-memory.
2. Exact Distance Computed Only for Candidates
The cell-based lookup gives a shortlist of maybe 50–200 nearby drivers. The exact haversine distance (or road-network distance for ETA) is computed only for these candidates — not for all 5 million active drivers. This is the key insight that makes it fast: the cell index does the heavy spatial filtering; the precise math runs on a small set.
```python
import math

def haversine(lat1, lon1, lat2, lon2) -> float:
    """Great-circle distance in km between two GPS points."""
    R = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi/2)**2 + math.cos(phi1)*math.cos(phi2)*math.sin(dlambda/2)**2
    return 2 * R * math.asin(math.sqrt(a))
```
3. Driver State Expiry as a Safety Net
Both systems use TTL-based expiry on driver keys. If a driver's app crashes, loses signal, or the phone dies, they automatically disappear from the index after ~30 seconds. Without this, dead entries would accumulate in Redis and be served to riders as "available drivers" that never respond.
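The mechanics of passive TTL eviction — what Redis `EXPIRE` (or `SET ... EX`) gives you for free — can be sketched in a few lines, using an injectable clock to make the timeline explicit:

```python
# Minimal sketch of passive TTL eviction, not a Redis client.
class TTLStore:
    def __init__(self, clock):
        self._clock = clock  # callable returning current time in seconds
        self._data = {}      # key -> (value, expires_at)

    def set(self, key, value, ttl_s):
        self._data[key] = (value, self._clock() + ttl_s)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:  # stale: evict lazily on read
            del self._data[key]
            return None
        return value

now = [0.0]
store = TTLStore(lambda: now[0])
store.set("driver:42:cell", "8928308280fffff", ttl_s=30)
now[0] = 10.0
print(store.get("driver:42:cell"))  # still fresh: "8928308280fffff"
now[0] = 31.0
print(store.get("driver:42:cell"))  # None — evicted, like a crashed driver app
```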
4. Write Path and Read Path are Completely Separate
The Location Service (writes) and Dispatch Service (reads) are separate microservices. They share the Redis store but nothing else. This lets them scale independently — during peak rider demand, the Dispatch Service scales horizontally without touching the Location Service, and vice versa during high driver-registration periods.
The Numbers That Matter
| Metric | Value | Why It Matters |
|---|---|---|
| Driver GPS update interval | 4–5 seconds | Balances freshness vs battery drain |
| Active drivers at peak (Uber) | ~5–6 million | Sets Redis memory requirements |
| Location updates / second | ~1 million | WebSocket write throughput needed |
| Driver match latency | <100ms p99 | User-perceived responsiveness SLA |
| Redis lookup latency | ~0.5ms per key | 100 cell lookups = ~50ms max |
| H3 cells at Res 9 globally | ~4.8 billion | Only occupied cells become Redis keys |
| Avg drivers per H3 cell (busy city) | 2–8 | Small sets = fast SMEMBERS |
| Driver eviction TTL | 30 seconds | Max staleness in the index |
What Actually Happens in 100ms
From rider tap to driver notification, the latency budget breaks down roughly like this: the geospatial lookup in Redis takes ~10ms, while the ETA service is the slowest stage — the ML model computing road-network arrival times, factoring in live traffic, turn restrictions, and one-way streets, takes 40–60ms. Most of the 100ms budget is spent on making the ETA accurate, not finding the drivers.
Key Takeaways
- Spatial cell indexing converts an O(n) scan into O(1) lookup. The core insight: encode coordinates into a fixed cell ID, use the cell ID as a Redis key. Driver lookup becomes a key lookup, not a distance scan.
- Hexagons (H3) are geometrically cleaner for k-ring searches. Every hexagon neighbor is equidistant. Squares have corner neighbors that are √2× farther than edge neighbors — this creates search asymmetry that hexagons avoid.
- S2 cell unions are better for irregular region queries. If you need to cover a non-circular area (a city district, an airport zone, a geofenced region), S2 coverings are more expressive than hexagonal k-rings.
- The write path is the hard part. Reading 100 Redis keys is easy. Writing 1 million location updates per second to Redis without dropping any or causing replication lag — that's the engineering challenge.
- TTL-based eviction > explicit cleanup. At this scale, you cannot guarantee a "driver went offline" event is always delivered. A passive TTL ensures correctness without requiring a reliable disconnect signal.
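The hexagon-vs-square neighbor asymmetry in the second takeaway is easy to verify numerically: a unit square grid has two distinct neighbor distances, while a unit hex grid has only one.

```python
import math

# Square grid with unit spacing: edge neighbors are 1 unit away,
# corner neighbors are sqrt(2) ~= 1.414 units away.
square_neighbors = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                    if (dx, dy) != (0, 0)]
square_dists = sorted({round(math.hypot(dx, dy), 3) for dx, dy in square_neighbors})
print(square_dists)  # [1.0, 1.414] — two distinct neighbor distances

# Hex grid with unit spacing: all six neighbors sit on a unit circle
# at 60-degree intervals, so every neighbor is equidistant.
hex_neighbors = [(math.cos(math.radians(60 * i)), math.sin(math.radians(60 * i)))
                 for i in range(6)]
hex_dists = {round(math.hypot(x, y), 3) for x, y in hex_neighbors}
print(hex_dists)  # {1.0} — one distance
```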