Between January 2018 and January 2025, Uber grew from handling roughly one in five for-hire vehicle trips in New York City to three in four. This project uses NYC Taxi and Limousine Commission (TLC) trip records to document how the spatial distribution of pickups shifted over that period, and evaluates whether the observed changes align with predictions derived from value-based dispatch algorithms described in the academic literature.
Ride-hailing platforms must decide which driver to assign to each incoming passenger request. This dispatch rule has consequences not just for the individual ride, but for how driver supply distributes itself across a city over time. Two approaches represent opposite ends of the design spectrum, and each implies a different spatial footprint.
Greedy matching assigns the closest available driver to each request, minimizing immediate pickup time. The rule is simple but myopic: it ignores where the driver ends up after the trip. A driver two minutes away in Midtown might take a fare to an isolated area, earning well on that single ride but then waiting 30 minutes for the next one. Meanwhile, a more distant driver could have served the same trip and ended up better positioned for subsequent pickups.
MDP-based matching (Markov Decision Process), described in Xu et al. (2018), takes the opposite approach. Instead of minimizing pickup distance, it maximizes long-run system value. The algorithm learns a value for each geographic zone at each time of day, built from millions of historical trip observations. Each dispatch decision evaluates: Immediate Fare + Future Position Value − Current Position Value. Drivers in low-value zones receive priority for trips to high-value destinations; trips to high-demand areas are valued because they position drivers well for the next ride.
A passenger in Midtown requests a trip to JFK Airport. Two drivers are available. Driver A is 2 minutes away, already in Midtown. Under greedy matching, Driver A takes the fare but then sits in the JFK queue for 30 minutes, and Midtown loses a well-positioned driver. Driver B is 5 minutes away, on the Upper East Side. Under MDP matching, Driver B gets the airport fare because Driver B has less to lose by leaving a lower-value zone. Driver A stays in high-demand Midtown and picks up the next ride quickly. Both end up better off. Xu et al. (2018) report that MDP-based dispatch improved trip completion rates by 0.5% to 5% across 20+ cities.
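The contrast between the two rules can be sketched in a few lines of Python. This is a minimal illustration, not the published algorithm or any platform's implementation: the zone values, the fare, and the names `ZONE_VALUE`, `Driver`, `greedy_choice`, `advantage`, and `mdp_choice` are all invented for this example. The scoring function simply applies the expression quoted above (immediate fare plus destination value minus the driver's current zone value).

```python
from dataclasses import dataclass

# Hypothetical zone values (expected future earnings, in dollars).
# Illustrative assumptions only, not values learned from trip data.
ZONE_VALUE = {"Midtown": 40.0, "Upper East Side": 25.0, "JFK": 30.0}


@dataclass
class Driver:
    name: str
    zone: str
    pickup_minutes: float


def greedy_choice(drivers):
    """Greedy rule: pick the driver with the shortest pickup time."""
    return min(drivers, key=lambda d: d.pickup_minutes)


def advantage(driver, fare, destination):
    """Immediate fare + destination zone value - driver's current zone value."""
    return fare + ZONE_VALUE[destination] - ZONE_VALUE[driver.zone]


def mdp_choice(drivers, fare, destination):
    """Value-based rule: pick the driver with the highest advantage."""
    return max(drivers, key=lambda d: advantage(d, fare, destination))


drivers = [
    Driver("A", "Midtown", 2.0),
    Driver("B", "Upper East Side", 5.0),
]
fare, destination = 60.0, "JFK"  # hypothetical fare for the airport trip

print(greedy_choice(drivers).name)                  # A
print(mdp_choice(drivers, fare, destination).name)  # B
```

With these made-up numbers, Driver A scores 60 + 30 − 40 = 50 and Driver B scores 60 + 30 − 25 = 65, so the value-based rule sends Driver B even though Driver A is closer. The sketch handles a single request in isolation; a real dispatcher would score many requests and drivers jointly.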
The spatial implication is that MDP matching should redistribute supply toward zones with high future value (like airports, which position drivers for lucrative return fares) and away from zones where drivers are already saturated. Over time, this should produce a more even demand surface, lower spatial clustering of high-activity neighbourhoods, and more trips completed locally as drivers are matched to nearby riders more efficiently.
How did the spatial distribution of Uber pickups in NYC change between 2018 and 2025, and are the observed patterns consistent with value-based algorithmic dispatch?
The analysis proceeds in three stages. First, geographic analysis compares the spatial distribution of Uber pickups and dropoffs across NYC's 263 TLC taxi zones. This uses K-means clustering (a method that groups zones into geographic demand centres based on their coordinates), the Gini coefficient (a standard measure of inequality, here applied to how unevenly pickups are distributed across zones), Lorenz curves, and origin-destination flow analysis between clusters.
Second, spatial autocorrelation analysis tests whether high-demand zones cluster together geographically and how that clustering changed. This uses global Moran's I (a statistic measuring the degree to which nearby zones have similar demand levels) and LISA maps (Local Indicators of Spatial Association, which identify individual hot spots and cold spots at the zone level).
Third, temporal analysis examines whether hourly and daily demand patterns shifted between years, both at the aggregate level and within each geographic cluster. This serves as a control: if the geographic changes were driven by shifts in rider behaviour (remote work, post-pandemic commuting), then when people use the service should have shifted alongside where they use it. Stable temporal patterns alongside shifting geography would point more toward supply-side reallocation. Jensen-Shannon divergence quantifies the degree of temporal shift at the cluster level.
Before interpreting any spatial changes, it is important to understand what happened to the broader for-hire vehicle (FHV) market between 2018 and 2025. Uber's growth did not occur in a vacuum: it coincided with the elimination of the traditional FHV sector, most of whose volume Uber absorbed. This has implications for how the geographic redistribution should be interpreted.
In 2018, traditional FHV operators (community car services, black cars, luxury limousines) handled 61% of all for-hire trips. By 2025, that category had been entirely eliminated. Lyft also grew, from 16% to 25%.
The elimination of the "Other FHV" category is a structural confounder. Uber in 2025 is not simply doing more of what Uber did in 2018; it is also handling trips that previously belonged to an entirely different operator pool with different geographic coverage. Traditional car services were disproportionately concentrated in outer-borough neighbourhoods (community livery bases served specific ethnic and geographic communities). Some of the spatial dispersion observed in Uber's 2025 pickup distribution may therefore reflect the absorption of these trip patterns rather than algorithmic redistribution.
If Uber deployed value-based matching between 2018 and 2025, six specific spatial changes should follow. Each prediction derives from a concrete mechanism in the dispatch model and can be tested against the data.
| Prediction | Mechanism |
|---|---|
| Increased airport demand | Airports carry high destination value: drivers completing airport trips are positioned for lucrative return fares into Manhattan |
| Manhattan core decline | Driver saturation in central Manhattan reduces the marginal value of sending additional supply there |
| Lower geographic concentration | By routing supply toward underserved zones, the system reduces the gap in activity levels across the city |
| Lower spatial autocorrelation | Active supply redistribution narrows the divide between high-activity and low-activity neighbourhoods |
| More localised trip completion | Efficient matching pairs riders with nearby drivers, increasing the share of trips that start and end within the same geographic area |
| Stable geographic structure | The algorithm reallocates supply within existing demand patterns rather than reorganizing the city's spatial layout |
In January 2018, Uber accounted for 4.5 million of 19.8 million for-hire vehicle trips in NYC, roughly one in five. Pickups were heavily concentrated in Manhattan: Midtown Center, East Village, and Union Square led in volume, and Manhattan zones accounted for approximately 60% of all Uber pickups. K-means clustering on zone centroid coordinates identified six geographic demand centres. The two Manhattan clusters alone comprised nearly 59% of trips.
Map of taxi zones, each coloured by its dominant pickup cluster, with trip counts and cluster assignments per zone.
Top pickup zones (January 2018):
1. Midtown Center (1.9% of all pickups)
2. East Village (1.8%)
3. Union Square (1.8%)

Manhattan zones accounted for ~60% of total pickups.

Temporal patterns (January 2018):
- Peak hour: 6 PM (evening commute)
- Busiest day: Saturday
- Morning rush: 7–9 AM
- Evening peak comprised roughly 18% of daily demand
By January 2025, Uber handled 15.4 million trips, three in four for-hire vehicle rides in the city. Total FHV volume remained roughly flat at 20.4 million, so Uber's growth came almost entirely at the expense of other operators. Two geographic shifts stand out. First, JFK Airport displaced Midtown Center as the top pickup zone, and the combined airport share of pickups rose 48% relative to 2018. Second, the Gini coefficient fell from 0.511 to 0.440, meaning demand became more evenly distributed across zones rather than more concentrated. Uber's expanded market share did not simply amplify the existing Manhattan-centric pattern; it dispersed pickups toward airports and outer-borough areas that had lower activity in 2018.
Compare with the 2018 map above. The cluster structure is broadly preserved, but the relative weight of airport and outer-borough clusters increased.
Airports (January 2025):
- Combined share: +48% vs 2018
- JFK Airport: top pickup zone in 2025
- LaGuardia: substantial growth
- Airports displaced the Manhattan core as top demand centres

Manhattan core (January 2025):
- Average decline: −35% across top zones
- Union Sq, Midtown, East Village: −34 to −39%
- Net effect: retained volume but lost relative share
- Central zones still active, but no longer dominant
The choropleth below maps the change in each zone's share of total pickups (in percentage points) between 2018 and 2025. Blue zones gained share; red zones lost it. Airport zones in Queens are the largest gainers, several central Manhattan zones show the steepest declines, and parts of Brooklyn and the Bronx picked up modest share, consistent with the falling Gini coefficient.
The Lorenz curve plots the cumulative share of pickups against the cumulative share of taxi zones, ranked from lowest to highest demand. A perfectly even distribution would trace the diagonal; the further the curve bows below it, the more concentrated demand is. The Gini coefficient quantifies this gap. Between 2018 and 2025, the Gini fell from 0.511 to 0.440: the 2025 curve sits closer to the diagonal across its range, indicating that pickups spread more evenly across zones.
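The relationship between the Lorenz curve and the Gini coefficient can be made concrete with a short sketch. The function names `lorenz_points` and `gini` and the toy counts are assumptions for illustration; the Gini is computed here as one minus twice the area under the Lorenz curve (trapezoid rule), which may differ slightly in convention from the estimator used in the analysis. The input is assumed to have a positive total.

```python
def lorenz_points(counts):
    """Cumulative (share of zones, share of pickups), lowest-demand zones first."""
    ordered = sorted(counts)
    total = sum(ordered)
    n = len(ordered)
    points = [(0.0, 0.0)]
    running = 0
    for i, c in enumerate(ordered, start=1):
        running += c
        points.append((i / n, running / total))
    return points


def gini(counts):
    """Gini = 1 - 2 * (area under the Lorenz curve), via the trapezoid rule."""
    pts = lorenz_points(counts)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return 1 - 2 * area


# Toy pickup counts for four hypothetical zones (not TLC data).
even = [25, 25, 25, 25]
skewed = [5, 10, 20, 65]

print(gini(even))    # perfectly even: curve lies on the diagonal
print(gini(skewed))  # concentrated: curve bows below the diagonal
```

For the skewed toy example, the lowest-demand half of the zones accounts for only 15% of pickups, which is what pulls the curve below the diagonal.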
To confirm that this decline is not a sampling artifact, a bootstrap procedure resampled zone-level trip counts 1,000 times for each year to produce 95% confidence intervals around the Gini coefficient. The 2018 interval (0.475–0.541) and the 2025 interval (0.402–0.473) do not overlap, indicating that the concentration decline is statistically robust (see Appendix for the full bootstrap distribution).
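A percentile bootstrap of this kind might look like the sketch below. It assumes zones are resampled with replacement and the Gini recomputed each time, as described above; the function names, the seed, and the synthetic lognormal counts are illustrative assumptions rather than the analysis code or data.

```python
import random


def gini(counts):
    """Gini coefficient via the sorted-rank formula (assumes a positive total)."""
    ordered = sorted(counts)
    n = len(ordered)
    total = sum(ordered)
    weighted = sum(rank * c for rank, c in enumerate(ordered, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def bootstrap_gini_ci(counts, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap interval: resample zones with replacement."""
    rng = random.Random(seed)
    n = len(counts)
    stats = []
    for _ in range(n_boot):
        sample = rng.choices(counts, k=n)
        if sum(sample) == 0:
            continue  # skip degenerate resamples with no trips at all
        stats.append(gini(sample))
    stats.sort()
    lo_idx = int((1 - level) / 2 * len(stats))
    hi_idx = int((1 + level) / 2 * len(stats)) - 1
    return stats[lo_idx], stats[hi_idx]


# Synthetic zone-level counts for 263 zones (illustrative, not TLC data).
data_rng = random.Random(42)
counts = [int(data_rng.lognormvariate(6, 1)) + 1 for _ in range(263)]

low, high = bootstrap_gini_ci(counts)
print(f"Gini = {gini(counts):.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

Comparing whether two such intervals overlap is a conservative check; a bootstrap of the difference between years would be a more direct test.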
Aggregating by borough provides a macro view of the redistribution. Because Uber's total trip volume more than tripled, absolute counts rose in every borough. The relevant comparison is the shift in relative shares.
Manhattan retained the majority of pickups (~68%) but with reduced dominance. Brooklyn grew from 18% to 20%, Queens held at around 9%, and the Bronx picked up modest share. The outer-borough gains represent expansion into areas that were less served in 2018, not a substitution away from Manhattan in absolute terms.
While density maps show where trips start and end, they do not capture how trips flow between areas. The heatmap below shows the change in trip share for each pickup-dropoff cluster pair between 2018 and 2025. Each cell represents the change (in percentage points) in the fraction of all trips flowing between that origin and destination. The 2025 clusters are aligned to their 2018 geographic matches.
The Gini coefficient measures overall concentration but is indifferent to geography: it treats zones as interchangeable regardless of where they sit on the map. Moran's I addresses this by testing whether high-demand zones tend to cluster together geographically. A high Moran's I means nearby zones have similar demand levels (high near high, low near low), indicating spatial polarization. A lower value means demand is more patchy, with less of a clean divide between core and periphery.
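To illustrate what the statistic responds to, the sketch below computes global Moran's I with row-standardised k-nearest-neighbour weights in plain Python. The helper names, the planar Euclidean distance, and the toy line of ten zones are assumptions for illustration; significance testing (for example by permutation) is omitted.

```python
import math


def knn_weights(coords, k):
    """Row-standardised k-nearest-neighbour weights: each neighbour gets 1/k."""
    weights = []
    for i, (xi, yi) in enumerate(coords):
        dists = sorted(
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(coords)
            if j != i
        )
        weights.append({j: 1.0 / k for _, j in dists[:k]})
    return weights


def morans_i(values, weights):
    """Global Moran's I: (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    num = sum(
        w * dev[i] * dev[j]
        for i, row in enumerate(weights)
        for j, w in row.items()
    )
    s0 = sum(w for row in weights for w in row.values())
    denom = sum(d * d for d in dev)
    return (n / s0) * (num / denom)


# Ten hypothetical zones along a line, two neighbours each.
coords = [(float(i), 0.0) for i in range(10)]
w = knn_weights(coords, k=2)

clustered = [10, 10, 10, 10, 10, 0, 0, 0, 0, 0]    # high next to high
alternating = [10, 0, 10, 0, 10, 0, 10, 0, 10, 0]  # high next to low

print(morans_i(clustered, w))    # 0.8
print(morans_i(alternating, w))  # -0.8
```

Both toy layouts have exactly the same set of values, so a location-blind measure such as the Gini would score them identically; only the spatial arrangement differs, and that is what Moran's I picks up.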
LISA (Local Indicators of Spatial Association) maps showing zone-level hot spots and cold spots, along with pickup/dropoff balance statistics, are available in the Appendix. The key finding from that analysis is that the contiguous cold-spot block across eastern Queens and the Bronx contracted substantially between 2018 and 2025, as many outer-borough zones gained enough activity to become statistically indistinguishable from their neighbours.
To measure how demand centres shifted geographically, each 2018 K-means cluster centroid was matched to its nearest 2025 counterpart by geographic proximity (not index order), and the displacement was computed using the Haversine formula. The map below shows arrows from each 2018 centroid to its 2025 match, with line thickness proportional to shift distance.
The six centroids shifted 4.22 km on average. The largest displacement (11.6 km) was the Brooklyn/Borough Park cluster, which moved toward Staten Island. Most other shifts were modest (under 3 km), directed toward airports and outer-borough transit nodes. The scale of these movements, relative to the full extent of NYC, indicates reallocation within an existing spatial structure rather than a wholesale geographic reorganization.
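The matching and displacement steps can be sketched as follows. The Haversine formula is standard; the function names and the two pairs of centroid coordinates are made up for illustration. Note that nearest-neighbour matching, as sketched here, does not by itself guarantee a one-to-one pairing, so the result should be checked for duplicate matches.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def match_centroids(old, new):
    """Match each old centroid to its nearest new one: (old_idx, new_idx, km)."""
    matches = []
    for i, (lat, lon) in enumerate(old):
        j, dist = min(
            (
                (j, haversine_km(lat, lon, nlat, nlon))
                for j, (nlat, nlon) in enumerate(new)
            ),
            key=lambda pair: pair[1],
        )
        matches.append((i, j, dist))
    return matches


# Hypothetical centroids (lat, lon); the index order differs between years.
centroids_2018 = [(40.75, -73.98), (40.65, -73.78)]
centroids_2025 = [(40.66, -73.80), (40.76, -73.97)]

matches = match_centroids(centroids_2018, centroids_2025)
for old_idx, new_idx, km in matches:
    print(f"2018 cluster {old_idx} -> 2025 cluster {new_idx}: {km:.2f} km")

mean_shift = sum(km for _, _, km in matches) / len(matches)
print(f"mean shift: {mean_shift:.2f} km")
```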
A direct test of whether trips are completing more locally is to measure, for each pickup zone, what fraction of its dropoffs land in the same zone or in one of the five geographically nearest zones (based on centroid distance). Unlike the intra-cluster share metric used elsewhere, this measure is boundary-independent: it does not depend on how K-means clusters are drawn, only on fixed zone geography.
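A minimal version of this metric is sketched below. The function names, the planar distance between centroids, and the toy zones and trips are assumptions for illustration; the real analysis uses the 263 TLC zones with the five nearest neighbours.

```python
import math
from collections import defaultdict


def nearest_zones(centroids, k=5):
    """For each zone: the set containing itself and its k nearest zones."""
    near = {}
    for z, (x, y) in centroids.items():
        others = sorted(
            (math.hypot(x - ox, y - oy), o)
            for o, (ox, oy) in centroids.items()
            if o != z
        )
        near[z] = {z} | {o for _, o in others[:k]}
    return near


def localization_share(trips, near):
    """Per pickup zone: fraction of trips dropping off within its local set."""
    total = defaultdict(int)
    local = defaultdict(int)
    for pickup, dropoff in trips:
        total[pickup] += 1
        if dropoff in near[pickup]:
            local[pickup] += 1
    return {z: local[z] / total[z] for z in total}


# Eight hypothetical zones on a line; k=2 keeps the toy example readable.
centroids = {z: (float(z), 0.0) for z in range(1, 9)}
near = nearest_zones(centroids, k=2)

# (pickup_zone, dropoff_zone) pairs, invented for illustration.
trips = [(1, 2), (1, 3), (1, 8), (1, 1), (4, 5), (4, 8)]

shares = localization_share(trips, near)
print(shares)                              # {1: 0.75, 4: 0.5}
print(sum(shares.values()) / len(shares))  # 0.625
```

Because the neighbour sets depend only on fixed zone geography, the same `near` mapping can be reused for both years, which is what makes the comparison independent of how clusters are drawn.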
Mean localization rose from 29.9% in 2018 to 33.4% in 2025, and the distribution shifted modestly rightward across zones. This is a real but small increase, and weaker evidence for the "more localised matching" prediction than the cluster-based metric (63% to 72%) would suggest. The discrepancy indicates that part of the intra-cluster increase was an artifact of differently shaped cluster boundaries.
Percentage of each zone's trips where the dropoff falls in the same zone or one of its 5 nearest neighbours.
As a complementary measure, the great-circle distance between pickup and dropoff zone centroids was computed for all trips with valid origin and destination data. If MDP matching were producing shorter, more efficient local trips, average distances should decline or remain stable.
In practice, mean trip distance increased slightly from 4.96 km to 5.27 km, and median distance from 3.46 km to 3.56 km. This cuts against the "more localised matching" prediction, and suggests that at least some of the spatial redistribution involved longer trips (plausibly airport rides, which are among the longest in the dataset).
The temporal dimension serves as a diagnostic. If the geographic redistribution documented above were driven by changes in rider behaviour (remote work patterns, post-pandemic commuting, evolving leisure habits), then when people use the service should have shifted alongside where they use it. If temporal patterns remain stable while geographic patterns change, the evidence points more toward supply-side reallocation.
The hourly curves track each other closely. The peak hour remained at 6 PM in both years. The busiest day shifted from Saturday (2018) to Friday (2025), but the overall weekly shape changed only modestly. The 2025 curve sits higher in absolute terms (reflecting Uber's tripled trip volume), but the relative distribution across hours is nearly identical.
Rather than comparing the aggregate hour-by-day heatmap, the panels below show the change in temporal demand distribution for each geographic cluster separately. Each cell shows how the share of that cluster's trips at a given hour and day shifted between 2018 and 2025 (in percentage points). Most cells are near zero, confirming temporal stability at the cluster level.
To move beyond visual comparison, the Jensen-Shannon divergence (JSD) was computed for each cluster. JSD is a standard measure of how different two probability distributions are: a value of 0 means two distributions are identical, and higher values indicate more divergence. Applied here, it compares the 24-hour demand profile of each cluster in 2018 with its matched cluster in 2025. The aggregate JSD across all trips was 0.0025, confirming minimal change.
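For reference, the divergence can be computed directly from two hourly count vectors, as in the sketch below. The function names and toy profiles are illustrative assumptions. The sketch uses base-2 logarithms, which bound the divergence between 0 and 1; the log base used in the analysis is not stated, and some libraries (for example SciPy's `jensenshannon`) return the square root of this quantity, so absolute values are only comparable under the same convention.

```python
import math


def normalize(counts):
    """Convert raw hourly counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """KL divergence in bits; terms with p_i = 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: mean KL of p and q to their midpoint."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Toy 24-hour pickup counts (invented): same shape, different volume,
# plus a version with some demand moved from 6 PM to 9 AM.
base = [20] * 24
base[18] = 80
scaled = [3 * c for c in base]
shifted = list(base)
shifted[18] -= 30
shifted[9] += 30

print(js_divergence(normalize(base), normalize(scaled)))   # 0.0
print(js_divergence(normalize(base), normalize(shifted)))  # small, > 0
```

Because the counts are normalised first, tripling the volume without changing its shape yields a divergence of zero, which is why the measure suits a comparison between years with very different trip totals.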
The per-cluster breakdown reveals that this stability was not perfectly uniform. Brooklyn (JSD = 0.0099) and the Bronx (JSD = 0.0057) shifted roughly two to four times more than Manhattan clusters (JSD ≈ 0.002). While all values remain very low in absolute terms, the pattern suggests that outer-borough areas experienced more temporal change than central Manhattan.
Aggregate temporal stability can mask compositional shifts: if Manhattan demand fell and outer-borough demand rose by similar proportions across all hours, the aggregate hourly curve would look identical even though the spatial composition of each hour changed. The stacked bars below decompose each hour's demand by cluster.
The geographic clusters used throughout this analysis group zones by physical location. An alternative approach clusters zones by their hourly demand shape: the 24-dimensional vector describing what fraction of a zone's trips occur at each hour. This identifies zones that behave similarly regardless of where they sit on the map. Each zone's hourly profile was standardized and grouped using K-means (k=4). Four demand types emerged in both years.
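The profile-based grouping can be sketched in plain Python as below. This is an illustrative simplification: the function names are hypothetical, the toy data has only two demand shapes (so k=2 rather than 4), and the centres are initialised deterministically with a farthest-first rule instead of the randomised initialisation a library K-means would typically use.

```python
def hourly_profile(hour_counts):
    """Share of a zone's trips in each of the 24 hours."""
    total = sum(hour_counts)
    return [c / total for c in hour_counts]


def standardize(rows):
    """Z-score each hour (column) across zones; constant columns become 0."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        (sum((v - m) ** 2 for v in c) / len(c)) ** 0.5
        for c, m in zip(cols, means)
    ]
    return [
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(r, means, stds)]
        for r in rows
    ]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, n_iter=50):
    """Lloyd's algorithm with deterministic farthest-first initialisation."""
    centres = [rows[0]]
    while len(centres) < k:
        centres.append(max(rows, key=lambda r: min(sq_dist(r, c) for c in centres)))
    labels = [0] * len(rows)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda c: sq_dist(r, centres[c])) for r in rows]
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old centre if a cluster empties out
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def toy_zone(peak_hour, volume):
    """Invented hourly counts: flat demand with a single peak hour."""
    return [volume * (10 if h == peak_hour else 1) for h in range(24)]


# Three morning-peak and three evening-peak zones with different volumes.
zones = [toy_zone(8, 1), toy_zone(8, 5), toy_zone(8, 20),
         toy_zone(18, 2), toy_zone(18, 7), toy_zone(18, 30)]

profiles = standardize([hourly_profile(z) for z in zones])
print(kmeans(profiles, k=2))  # [0, 0, 0, 1, 1, 1]
```

Zones with very different trip volumes but the same hourly shape land in the same group, which is the property that distinguishes this typology from the location-based clusters.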
Each line shows the average hourly demand profile for zones of that type. Morning-peak zones (airports, transit hubs) have demand concentrated before 10 AM; evening-peak zones see activity build through the afternoon.
Map of taxi zones coloured by demand type, with zone name and type assignment per zone.
Each of the six predictions derived from MDP theory was evaluated against the data. Five are directionally consistent with the observed patterns; the sixth, more localised matching, is only partly supported, because zone-level localization rose while mean trip distance also rose. The strength of evidence varies across predictions, which are rated as Strong (large effect, less vulnerable to alternative explanations), Moderate (directionally consistent but with meaningful confounders or methodological caveats), or Weak (fragile metric or evidence too sensitive to analytical choices).
| Prediction | Result | Strength | Notes |
|---|---|---|---|
| Airport growth | JFK + LaGuardia combined pickup share rose 48% | Strong | Clear directional change with large magnitude. However, airport ground transportation policies also changed over this period (terminal access restrictions for non-app hails), which could independently drive this. |
| Manhattan core decline | Top-3 zones lost 34–39% of their pickup share | Strong | Large, consistent decline across multiple central zones. Post-pandemic remote work is an equally plausible explanation that cannot be ruled out with this design. |
| Lower concentration | Gini: 0.511 to 0.440; bootstrap 95% CIs do not overlap | Moderate | Direction is correct and statistically confirmed by bootstrap. Without a counterfactual, it is unclear whether this exceeds what market share growth alone would produce (more trips = more zones with non-trivial activity). |
| Reduced spatial clustering | Moran's I: 0.55 to 0.28 (both p<0.001) | Strong | Nearly halved. This is the most robust finding because Moran's I is less mechanically affected by volume growth than the Gini coefficient. |
| More localised matching | Zone localization: 29.9% to 33.4%; trip distance: 4.96 to 5.27 km | Weak | The boundary-independent localization metric shows a modest increase, but mean trip distance rose, contradicting the prediction. The intra-cluster metric (63% to 72%) overstates the effect due to cluster boundary differences. |
| Stable structure | Mean centroid shift: 4.22 km; aggregate temporal JSD: 0.0025 | Moderate | Centroid shifts are small on average, but one cluster shifted 11.6 km. Temporal JSD is low overall, but outer-borough clusters (Brooklyn, Bronx) shifted 2–4x more than Manhattan clusters. Compositional analysis shows the spatial mix within each hour did change. |
The geographic, temporal, spatial autocorrelation, and trip flow results point to five main observations.
Temporal patterns (peak hours, daily distribution) remained stable between years while geographic patterns shifted substantially. The aggregate Jensen-Shannon divergence of 0.0025 between the 2018 and 2025 hourly profiles confirms this quantitatively. If changes in rider behaviour were the primary driver, both dimensions should have moved. The stability of the temporal dimension points toward mechanisms operating on where drivers are dispatched, not on when or why riders request trips. However, compositional analysis reveals that the spatial mix within each hour did shift, with outer-borough clusters contributing more to each hour in 2025.
Both concentration and spatial clustering declined: the Gini fell from 0.511 to 0.440 (bootstrap CIs non-overlapping) and Moran's I from 0.55 to 0.28. Previously underserved outer-borough zones gained enough activity to become statistically indistinguishable from their surroundings. The 4.22 km average cluster centroid shift indicates these changes occurred within the existing geographic structure of NYC demand, not through fundamental reorganization.
JFK displaced Midtown Center as the highest-volume pickup zone. Combined airport pickup share rose 48%, while the top-3 Manhattan zones declined 34–39%. Under MDP theory, airports carry high future position value because drivers completing airport trips are well-positioned for return fares, making them attractive dispatch destinations for the algorithm.
The boundary-independent localization metric rose modestly (29.9% to 33.4%), but mean trip distance increased from 4.96 km to 5.27 km. These results partially contradict the prediction that MDP matching should produce shorter, more local trips. The increase in airport trips, which are among the longest in the dataset, likely accounts for the distance increase and represents a trade-off: MDP matching may improve system-wide efficiency while producing longer individual trips to high-value destinations.
The "Other FHV" category went from 61% of all for-hire trips to 0%. This is the largest structural confounder. Traditional car services served specific outer-borough communities; their absorption by Uber means that some of the geographic dispersion reflects the inheritance of pre-existing trip patterns rather than algorithmic redistribution. Lyft also grew (16% to 25%), further complicating the attribution of spatial changes to Uber's dispatch algorithm specifically.
The data covers January only in each year. Seasonal variation is not captured, and results may differ in summer months or during holiday periods. A fuller analysis would compare multiple months across years. Intermediate January snapshots (2019–2024) would show whether the spatial changes were gradual or discontinuous around specific events.
The elimination of the traditional FHV sector (61% of 2018 trips) is the dominant confounder. Uber's 2025 geographic footprint partly reflects the absorption of trips that previously belonged to community car services with different spatial coverage. Separating algorithmic effects from market consolidation effects would require data on the geographic profile of the absorbed operators.
Uber's actual dispatch algorithm is proprietary. The MDP framework used here derives from Xu et al. (2018), a published research paper. The analysis tests predictions from that framework against observable patterns, but cannot confirm which specific algorithm Uber deployed or when.
Trip distances are computed as great-circle distances between zone centroids, not actual trip routes. This approximation understates true distances (roads are not straight lines) and treats every trip that starts and ends in the same zone as having zero distance. Actual trip duration and fare data, where available, would provide better measures of trip length and efficiency.
K-means clusters are fit independently per year. The intra-cluster trip share increase (63% to 72%) is partly an artifact of differently drawn boundaries. The zone-level localization metric (29.9% to 33.4%) is boundary-independent and provides a more conservative estimate of localization change.
Post-pandemic commuting behaviour, airport ground transportation policy changes, the 2019 TLC vehicle cap, Uber's own pricing adjustments, and the competitive dynamics between Uber and Lyft are all plausible contributors that cannot be isolated in a two-period comparison.
NYC TLC public trip records (Parquet format). January 2018: 19.8M total FHV trips (4.5M Uber, 3.1M Lyft, 12.2M other). January 2025: 20.4M total FHV trips (15.4M Uber, 5.0M Lyft, 0 other). 263 TLC taxi zones. Uber identified via dispatching base number (2018) and HVFHS license number (2025). Zone centroids from NYC Open Data shapefile.
K-means clustering (k=6) on zone centroid lat/lon. Clusters matched across years by geographic proximity (Haversine distance). Gini coefficient and Lorenz curves for concentration. Bootstrap confidence intervals on Gini (1,000 iterations resampling zones). Zone-level localization (K=5 nearest zones by centroid distance). Trip distance via vectorized Haversine on centroid coordinates.
Global Moran's I and local LISA (KNN weights, k=6, row-standardized) for spatial autocorrelation. Kolmogorov-Smirnov test for distributional comparison of pickup/dropoff mismatch ratios. Levene's test for variance equality.
Jensen-Shannon divergence (JSD) between 2018 and 2025 hourly pickup distributions, computed at aggregate and per-cluster levels. Demand typology clustering via K-means (k=4) on standardized 24-dimensional hourly profile vectors per zone.
Reference: Xu et al. (2018). "Large-scale order dispatch in on-demand ride-hailing platforms." KDD '18. MDP framework based on published research. Analysis independent of any proprietary system.
Zone-level trip counts were resampled 1,000 times with replacement, and the Gini coefficient recomputed for each iteration. The resulting 95% confidence intervals for 2018 (0.475–0.541) and 2025 (0.402–0.473) do not overlap, indicating that the observed decline is unlikely to be a sampling artifact.
Local Indicators of Spatial Association (LISA) identify statistically significant spatial clusters. Red zones are hot spots (high demand surrounded by high demand); blue zones are cold spots. The spatial weights matrix uses K=6 nearest neighbours with row standardization.
The pickup/dropoff balance per zone, computed as PU/(PU+DO), measures whether a zone is a net origin or destination. A value of 0.5 indicates perfect balance. The KS test detected a significant distributional shift (D = 0.21, p < 0.001), while Levene's test found no significant change in variance (p = 0.95).