Uber NYC Demand Analysis | Leonardo Luksic

Background

Dispatch Algorithms and Spatial Theory

Ride-hailing platforms must decide which driver to assign to each incoming passenger request. This dispatch rule has consequences not just for the individual ride, but for how driver supply distributes itself across a city over time. Two approaches represent opposite ends of the design spectrum, and each implies a different spatial footprint.

Greedy matching assigns the closest available driver to each request, minimizing immediate pickup time. The rule is simple but myopic: it ignores where the driver ends up after the trip. A driver two minutes away in Midtown might take a fare to an isolated area, earning well on that single ride but then waiting 30 minutes for the next one. Meanwhile, a more distant driver could have served the same trip and ended up better positioned for subsequent pickups.

MDP-based matching (Markov Decision Process), described in Xu et al. (2018), takes the opposite approach. Instead of minimizing pickup distance, it maximizes long-run system value. The algorithm learns a value for each geographic zone at each time of day, built from millions of historical trip observations. Each dispatch decision evaluates: Immediate Fare + Future Position Value − Current Position Value. Drivers in low-value zones receive priority for trips to high-value destinations; trips to high-demand areas are valued because they position drivers well for the next ride.

Worked Example

A passenger in Midtown requests a trip to JFK Airport. Two drivers are available. Driver A is 2 minutes away, already in Midtown. Under greedy matching, Driver A takes the fare but then sits in the JFK queue for 30 minutes, and Midtown loses a well-positioned driver. Driver B is 5 minutes away, on the Upper East Side. Under MDP matching, Driver B gets the airport fare because Driver B has less to lose by leaving a lower-value zone. Driver A stays in high-demand Midtown and picks up the next ride quickly. Both end up better off. Xu et al. (2018) report that MDP-based dispatch improved trip completion rates by 0.5% to 5% across 20+ cities.
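The comparison in this example can be sketched in a few lines of Python. This is a minimal illustration, not the production algorithm: the zone values, the fare, and the names `ZONE_VALUE` and `dispatch_score` are hypothetical, and any time discounting of future value is omitted.

```python
# Hypothetical zone values: expected future earnings for a driver idle in that zone.
ZONE_VALUE = {"Midtown": 40.0, "Upper East Side": 25.0, "JFK": 35.0}

def dispatch_score(fare, origin_zone, dest_zone, zone_value=ZONE_VALUE):
    """Immediate fare + future position value - current position value."""
    return fare + zone_value[dest_zone] - zone_value[origin_zone]

# One request (Midtown -> JFK) and two candidate drivers.
fare = 60.0  # hypothetical fare
drivers = {
    "A": {"zone": "Midtown", "pickup_minutes": 2},
    "B": {"zone": "Upper East Side", "pickup_minutes": 5},
}

# Greedy matching: the closest driver wins.
greedy_choice = min(drivers, key=lambda d: drivers[d]["pickup_minutes"])

# Value-based matching: the driver with the highest score wins.
scores = {d: dispatch_score(fare, info["zone"], "JFK") for d, info in drivers.items()}
mdp_choice = max(scores, key=scores.get)

print(greedy_choice, mdp_choice, scores)
```

With these made-up numbers the greedy rule picks Driver A and the value-based rule picks Driver B, because B gives up less position value by leaving the Upper East Side.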

The spatial implication is that MDP matching should redistribute supply toward zones with high future value (like airports, which position drivers for lucrative return fares) and away from zones where drivers are already saturated. Over time, this should produce a more even demand surface, lower spatial clustering of high-activity neighbourhoods, and more trips completed locally as drivers are matched to nearby riders more efficiently.

Research Question

How did the spatial distribution of Uber pickups in NYC change between 2018 and 2025, and are the observed patterns consistent with value-based algorithmic dispatch?

Analytical Approach

The analysis proceeds in three stages. First, geographic analysis compares the spatial distribution of Uber pickups and dropoffs across NYC's 263 TLC taxi zones. This uses K-means clustering (a method that groups zones into geographic demand centres based on their coordinates), the Gini coefficient (a standard measure of inequality, here applied to how unevenly pickups are distributed across zones), Lorenz curves, and origin-destination flow analysis between clusters.

Second, spatial autocorrelation analysis tests whether high-demand zones cluster together geographically and how that clustering changed. This uses global Moran's I (a statistic measuring the degree to which nearby zones have similar demand levels) and LISA maps (Local Indicators of Spatial Association, which identify individual hot spots and cold spots at the zone level).

Third, temporal analysis examines whether hourly and daily demand patterns shifted between years, both at the aggregate level and within each geographic cluster. This serves as a control: if the geographic changes were driven by shifts in rider behaviour (remote work, post-pandemic commuting), then when people use the service should have shifted alongside where they use it. Stable temporal patterns alongside shifting geography would point more toward supply-side reallocation. Jensen-Shannon divergence quantifies the degree of temporal shift at the cluster level.

Context

Market Context

Before interpreting any spatial changes, it is important to understand what happened to the broader for-hire vehicle market between 2018 and 2025. Uber's growth did not occur in a vacuum: it coincided with the disappearance of the entire traditional FHV sector, most of whose volume Uber absorbed. This has implications for how the geographic redistribution should be interpreted.

FHV Market Share: Uber vs. Lyft vs. Other Operators

In 2018, traditional FHV operators (community car services, black cars, luxury limousines) handled 61% of all for-hire trips. By 2025, that category had been entirely eliminated. Lyft also grew, from 16% to 25%.

The elimination of the "Other FHV" category is a structural confounder. Uber in 2025 is not simply doing more of what Uber did in 2018; it is also handling trips that previously belonged to an entirely different operator pool with different geographic coverage. Traditional car services were disproportionately concentrated in outer-borough neighbourhoods (community livery bases served specific ethnic and geographic communities). Some of the spatial dispersion observed in Uber's 2025 pickup distribution may therefore reflect the absorption of these trip patterns rather than algorithmic redistribution.

Predictions

Testable Predictions from MDP Theory

If Uber deployed value-based matching between 2018 and 2025, six specific spatial changes should follow. Each prediction derives from a concrete mechanism in the dispatch model and can be tested against the data.

Prediction	Mechanism
Increased airport demand	Airports carry high destination value: drivers completing airport trips are positioned for lucrative return fares into Manhattan
Manhattan core decline	Driver saturation in central Manhattan reduces the marginal value of sending additional supply there
Lower geographic concentration	By routing supply toward underserved zones, the system reduces the gap in activity levels across the city
Lower spatial autocorrelation	Active supply redistribution narrows the divide between high-activity and low-activity neighbourhoods
More localised trip completion	Efficient matching pairs riders with nearby drivers, increasing the share of trips that start and end within the same geographic area
Stable geographic structure	The algorithm reallocates supply within existing demand patterns rather than reorganizing the city's spatial layout

Part I

Geographic Analysis

2018 Baseline

Total FHV Trips (Jan 2018): 19.8M

Uber Market Share: 22.7%

Gini Coefficient: 0.511

In January 2018, Uber accounted for 4.5 million of 19.8 million for-hire vehicle trips in NYC, just over one in five. Pickups were heavily concentrated in Manhattan: Midtown Center, East Village, and Union Square led in volume, and Manhattan zones accounted for approximately 60% of all Uber pickups. K-means clustering on zone centroid coordinates identified six geographic demand centres. The two Manhattan clusters alone comprised nearly 59% of trips.

2018 Demand Clusters (K-Means, k=6)

Each zone is coloured by its dominant pickup cluster.

Top Pickup Zones

1. Midtown Center (1.9% of all pickups)

2. East Village (1.8%)

3. Union Square (1.8%)

Manhattan zones accounted for ~60% of total pickups

Temporal Profile

Peak hour: 6 PM (evening commute)

Busiest day: Saturday

Morning rush: 7–9 AM

Evening peak comprised roughly 18% of daily demand

2025 Snapshot

Total FHV Trips (Jan 2025): 20.4M

Uber Market Share: 75.3%

Gini Coefficient: 0.440 (−14% vs. 2018)

Market Share: 22.7% (2018) → 75.3% (2025)

Geographic Concentration (Gini): 0.511 (2018) → 0.440 (2025)

By January 2025, Uber handled 15.4 million trips, three in four for-hire vehicle rides in the city. Total FHV volume remained roughly flat at 20.4 million, so Uber's growth came almost entirely at the expense of other operators. Two geographic shifts stand out. First, JFK Airport displaced Midtown Center as the top pickup zone, and combined airport pickup share rose 48%. Second, the Gini coefficient fell from 0.511 to 0.440, meaning demand became more evenly distributed across zones rather than more concentrated. Uber's expanded market share did not simply amplify the existing Manhattan-centric pattern; it dispersed pickups toward airports and outer-borough areas that had lower activity in 2018.

2025 Demand Clusters (K-Means, k=6)

Compare with the 2018 map above. The cluster structure is broadly preserved, but the relative weight of airport and outer-borough clusters increased.

Airport Zones

Combined share: +48% vs 2018

JFK Airport: Top pickup zone in 2025

LaGuardia: Substantial growth

Airports displaced Manhattan core as top demand centres

Manhattan Core

Average decline: −35% across top zones

Union Sq, Midtown, East Village: −34 to −39%

Net effect: Retained volume but lost relative share

Central zones still active, but no longer dominant

Pickup Density Change

The choropleth below maps the change in each zone's share of total pickups (in percentage points) between 2018 and 2025. Blue zones gained share; red zones lost it. Airport zones in Queens are the largest gainers, several central Manhattan zones show the steepest declines, and parts of Brooklyn and the Bronx picked up modest share, consistent with the falling Gini coefficient.

Change in Pickup Share by Zone (2018 → 2025)
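The share-change calculation behind a map like this can be sketched as follows. The zone names and counts are invented for illustration, and `share_change_pp` is a hypothetical helper name. The toy data shows how a zone's absolute pickups can rise while its share of the total falls.

```python
def share_change_pp(counts_before, counts_after):
    """Change in each zone's share of total pickups, in percentage points."""
    total_before = sum(counts_before.values())
    total_after = sum(counts_after.values())
    zones = set(counts_before) | set(counts_after)
    return {
        z: 100 * (counts_after.get(z, 0) / total_after
                  - counts_before.get(z, 0) / total_before)
        for z in zones
    }

# Invented counts: every zone grows, but the central zone grows more slowly.
pickups_2018 = {"Midtown": 60, "JFK": 20, "Bronx": 20}
pickups_2025 = {"Midtown": 120, "JFK": 90, "Bronx": 90}

change = share_change_pp(pickups_2018, pickups_2025)
for zone in sorted(change):
    print(zone, round(change[zone], 1))
```

Because shares always sum to 100%, the changes across zones sum to zero: any gain in one zone's share is matched by losses elsewhere.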

Geographic Concentration

The Lorenz curve plots the cumulative share of pickups against the cumulative share of taxi zones, ranked from lowest to highest demand. A perfectly even distribution would trace the diagonal; the further the curve bows below it, the more concentrated demand is. The Gini coefficient quantifies this gap. Between 2018 and 2025, the Gini fell from 0.511 to 0.440: the 2025 curve sits closer to the diagonal across its range, indicating that pickups spread more evenly across zones.
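The relationship between the Lorenz curve and the Gini coefficient can be made concrete with a short sketch. It uses the standard definition (Gini equals one minus twice the area under the Lorenz curve); the function names and the toy counts are illustrative and not taken from the analysis code.

```python
def lorenz_curve(counts):
    """Cumulative zone share and cumulative pickup share, lowest-demand zones first."""
    xs = sorted(float(c) for c in counts)
    total = sum(xs)
    n = len(xs)
    zone_share = [i / n for i in range(n + 1)]
    pickup_share = [0.0]
    running = 0.0
    for x in xs:
        running += x
        pickup_share.append(running / total)
    return zone_share, pickup_share

def gini(counts):
    """Gini = 1 - 2 * (area under the Lorenz curve), via the trapezoid rule."""
    zone_share, pickup_share = lorenz_curve(counts)
    area = 0.0
    for i in range(1, len(zone_share)):
        width = zone_share[i] - zone_share[i - 1]
        area += width * (pickup_share[i] + pickup_share[i - 1]) / 2
    return 1 - 2 * area

even = [100, 100, 100, 100]   # every zone has the same demand
skewed = [0, 0, 0, 400]       # all demand in a single zone

print(round(gini(even), 3), round(gini(skewed), 3))
```

An even distribution gives a Gini of zero; with all demand in one of n zones the value is (n − 1)/n, which approaches 1 as the number of zones grows.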

Lorenz Curves: Pickup Concentration (2018 vs. 2025)

To confirm that this decline is not a sampling artifact, a bootstrap procedure resampled zone-level trip counts 1,000 times for each year to produce 95% confidence intervals around the Gini coefficient. The 2018 interval (0.475–0.541) and the 2025 interval (0.402–0.473) do not overlap, indicating that the concentration decline is statistically robust (see Appendix for the full bootstrap distribution).
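A bootstrap of this kind might look like the sketch below. It assumes a percentile interval, which the source does not specify, and it runs on synthetic log-normal counts rather than the TLC data. The function names and the seed are illustrative.

```python
import random

def gini(counts):
    """Closed-form Gini from ranked values (equivalent to the Lorenz-area definition)."""
    xs = sorted(counts)
    n = len(xs)
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * sum(xs)) - (n + 1) / n

def bootstrap_gini_ci(counts, n_boot=1000, alpha=0.05, seed=0):
    """Percentile confidence interval from resampling zones with replacement."""
    rng = random.Random(seed)
    n = len(counts)
    stats = sorted(gini(rng.choices(counts, k=n)) for _ in range(n_boot))
    lo = stats[round((alpha / 2) * n_boot)]
    hi = stats[round((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Synthetic, right-skewed zone counts standing in for the 263 TLC zones.
data_rng = random.Random(42)
counts = [int(data_rng.lognormvariate(8, 1)) + 1 for _ in range(263)]

lo, hi = bootstrap_gini_ci(counts)
print(round(gini(counts), 3), round(lo, 3), round(hi, 3))
```

Running the same function on each year's zone counts yields two intervals that can then be compared for overlap, as described above.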

Borough-Level Distribution

Aggregating by borough provides a macro view of the redistribution. Because Uber's total trip volume more than tripled, absolute counts rose in every borough. The relevant comparison is the shift in relative shares.

Pickup Counts by Borough (2018 vs. 2025)

Manhattan retained the majority of pickups (~68%) but with reduced dominance. Brooklyn grew from 18% to 20%, Queens held at around 9%, and the Bronx picked up modest share. The outer-borough gains represent expansion into areas that were less served in 2018, not a substitution away from Manhattan in absolute terms.

Part I (continued)

Spatial Autocorrelation

The Gini coefficient measures overall concentration but is indifferent to geography: it treats zones as interchangeable regardless of where they sit on the map. Moran's I addresses this by testing whether high-demand zones tend to cluster together geographically. A high Moran's I means nearby zones have similar demand levels (high near high, low near low), indicating spatial polarization. A lower value means demand is more patchy, with less of a clean divide between core and periphery.

Global Moran's I fell from 0.55 in 2018 to 0.28 in 2025 (both significant at p<0.001). Demand is still spatially structured, but the divide between high-activity and low-activity neighbourhoods narrowed considerably. Combined with the falling Gini, this means demand became both less concentrated overall and less spatially polarized.
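For readers unfamiliar with the statistic, the sketch below computes global Moran's I with row-standardized k-nearest-neighbour weights on a toy grid. The source lists PySAL among its tools, which offers this statistic directly; the pure-Python version here is only meant to expose the arithmetic. The grid, the values, and the choice of k=4 are hypothetical, and the permutation-based significance test is omitted.

```python
import math

def knn_weights(coords, k):
    """Row-standardized k-nearest-neighbour weights: each neighbour gets weight 1/k."""
    weights = []
    for i, (xi, yi) in enumerate(coords):
        dists = sorted(
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(coords) if j != i
        )
        weights.append({j: 1.0 / k for _, j in dists[:k]})
    return weights

def morans_i(values, weights):
    """I = (n / S0) * sum_ij(w_ij * z_i * z_j) / sum_i(z_i^2), with z = x - mean."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row.values()) for row in weights)
    cross = sum(w * z[i] * z[j] for i, row in enumerate(weights) for j, w in row.items())
    return (n / s0) * cross / sum(zi * zi for zi in z)

# Toy 5x5 grid of "zones" with planar coordinates (hypothetical data).
coords = [(col, row) for row in range(5) for col in range(5)]
w = knn_weights(coords, k=4)

clustered = [10 if x < 2 else 1 for x, y in coords]  # high values grouped on one side
checker = [(x + y) % 2 for x, y in coords]           # high and low values alternate

print(round(morans_i(clustered, w), 2), round(morans_i(checker, w), 2))
```

The clustered layout produces a positive value and the alternating layout a negative one, which is the contrast the statistic is designed to capture.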

LISA (Local Indicators of Spatial Association) maps showing zone-level hot spots and cold spots, along with pickup/dropoff balance statistics, are available in the Appendix. The key finding from that analysis is that the contiguous cold-spot block across eastern Queens and the Bronx contracted substantially between 2018 and 2025, as many outer-borough zones gained enough activity to become statistically indistinguishable from their neighbours.

Cluster Movement

To measure how demand centres shifted geographically, each 2018 K-means cluster centroid was matched to its nearest 2025 counterpart by geographic proximity (not index order), and the displacement was computed using the Haversine formula. The map below shows arrows from each 2018 centroid to its 2025 match, with line thickness proportional to shift distance.

Cluster Centroid Shifts: 2018 → 2025

The six centroids shifted 4.22 km on average. The largest displacement (11.6 km) was the Brooklyn/Borough Park cluster, which moved toward Staten Island. Most other shifts were modest (under 3 km), directed toward airports and outer-borough transit nodes. The scale of these movements, relative to the full extent of NYC, indicates reallocation within an existing spatial structure rather than a wholesale geographic reorganization.
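The matching and displacement steps can be sketched as follows. The Haversine formula is standard; the centroid coordinates are invented, and `match_centroids` is a hypothetical helper. Nearest-neighbour matching as written does not force a one-to-one assignment, so two 2018 centroids could in principle map to the same 2025 centroid.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def match_centroids(old, new):
    """For each old centroid, return (index of nearest new centroid, distance in km)."""
    matches = []
    for lat, lon in old:
        dists = [haversine_km(lat, lon, nlat, nlon) for nlat, nlon in new]
        j = min(range(len(new)), key=dists.__getitem__)
        matches.append((j, dists[j]))
    return matches

# Invented centroids, listed in a different order in each year.
centroids_2018 = [(40.75, -73.98), (40.65, -73.78)]
centroids_2025 = [(40.66, -73.79), (40.76, -73.97)]

matches = match_centroids(centroids_2018, centroids_2025)
mean_shift_km = sum(d for _, d in matches) / len(matches)
print(matches, round(mean_shift_km, 2))
```

Matching by proximity rather than by index matters because K-means labels are arbitrary: cluster 0 in one year need not correspond to cluster 0 in another.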

Trip Localization

A direct test of whether trips are completing more locally is to measure, for each pickup zone, what fraction of its dropoffs land in the same zone or in one of the five geographically nearest zones (based on centroid distance). Unlike the intra-cluster share metric used elsewhere, this measure is boundary-independent: it does not depend on how K-means clusters are drawn, only on fixed zone geography.

Mean localization rose from 29.9% in 2018 to 33.4% in 2025 (+3.5 percentage points), and the distribution shifted modestly rightward across zones. This is a real but small increase, and weaker evidence for the "more localised matching" prediction than the cluster-based metric (63% to 72%) would suggest. The discrepancy indicates that part of the intra-cluster increase was an artifact of differently shaped cluster boundaries.

Zone-Level Localization Distribution (2018 vs. 2025)

Percentage of each zone's trips where the dropoff falls in the same zone or one of its 5 nearest neighbours.
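The localization metric can be sketched on toy data as follows. The zone layout, trip counts, and function names are hypothetical. The example ranks neighbours by planar distance for brevity, whereas the analysis ranks them by centroid distance on real coordinates, and it uses k=1 so the result is easy to check by hand.

```python
import math
from collections import defaultdict

def nearest_zones(centroids, k):
    """Map each zone to the set containing itself and its k nearest zones."""
    near = {}
    for zone, (x, y) in centroids.items():
        others = sorted(
            (math.hypot(x - ox, y - oy), other)
            for other, (ox, oy) in centroids.items() if other != zone
        )
        near[zone] = {zone} | {other for _, other in others[:k]}
    return near

def localization_by_zone(trips, centroids, k=5):
    """Share of each pickup zone's trips that end in the same or a neighbouring zone."""
    near = nearest_zones(centroids, k)
    local = defaultdict(int)
    total = defaultdict(int)
    for pickup, dropoff, n in trips:
        total[pickup] += n
        if dropoff in near[pickup]:
            local[pickup] += n
    return {zone: local[zone] / total[zone] for zone in total}

# Four zones on a line; zone D is far from the others.
centroids = {"A": (0, 0), "B": (1, 0), "C": (2, 0), "D": (10, 0)}
# (pickup zone, dropoff zone, trip count)
trips = [("A", "B", 30), ("A", "D", 10), ("D", "A", 5), ("D", "D", 5)]

shares = localization_by_zone(trips, centroids, k=1)
mean_localization = sum(shares.values()) / len(shares)
print(shares, mean_localization)
```

Because the neighbour sets depend only on fixed zone geometry, the same sets apply to both years, which is what makes the metric independent of cluster boundaries.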

Trip Distance

As a complementary measure, the great-circle distance between pickup and dropoff zone centroids was computed for all trips with valid origin and destination data. If MDP matching were producing shorter, more efficient local trips, average distances should decline or remain stable.

In practice, mean trip distance increased slightly from 4.96 km to 5.27 km, and median distance from 3.46 km to 3.56 km. This cuts against the "more localised matching" prediction, and suggests that at least some of the spatial redistribution involved longer trips (plausibly airport rides, which are among the longest in the dataset).

Trip Distance Distribution (2018 vs. 2025)

Median Trip Distance by Cluster

Part II

Temporal Demand Patterns

The temporal dimension serves as a diagnostic. If the geographic redistribution documented above were driven by changes in rider behaviour (remote work patterns, post-pandemic commuting, evolving leisure habits), then when people use the service should have shifted alongside where they use it. If temporal patterns remain stable while geographic patterns change, the evidence points more toward supply-side reallocation.

Hourly Profile

Hourly Pickup Distribution (2018 vs. 2025)

The hourly curves track each other closely. The peak hour remained at 6 PM in both years. The busiest day shifted from Saturday (2018) to Friday (2025), but the overall weekly shape changed only modestly. The 2025 curve sits higher in absolute terms (reflecting Uber's tripled trip volume), but the relative distribution across hours is nearly identical.

Temporal Change by Cluster

Rather than comparing the aggregate hour-by-day heatmap, the panels below show the change in temporal demand distribution for each geographic cluster separately. Each cell shows how the share of that cluster's trips at a given hour and day shifted between 2018 and 2025 (in percentage points). Most cells are near zero, confirming temporal stability at the cluster level.

Temporal Change by Cluster (2025 minus 2018, pp)

Quantifying Temporal Stability

To move beyond visual comparison, the Jensen-Shannon divergence (JSD) was computed for each cluster. JSD is a standard measure of how different two probability distributions are: a value of 0 means two distributions are identical, and higher values indicate more divergence. Applied here, it compares the 24-hour demand profile of each cluster in 2018 with its matched cluster in 2025. The aggregate JSD across all trips was 0.0025, confirming minimal change.
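A minimal JSD computation is sketched below. The log base (2 here, which bounds the divergence between 0 and 1) is an assumption, since the source does not state which base was used, and the hourly profiles are invented. Note that SciPy's `scipy.spatial.distance.jensenshannon` returns the Jensen-Shannon distance, which is the square root of the divergence computed here.

```python
import math

def jsd(p, q, base=2):
    """Jensen-Shannon divergence between two discrete distributions (counts or shares)."""
    p_total, q_total = sum(p), sum(q)
    p = [x / p_total for x in p]
    q = [x / q_total for x in q]
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Kullback-Leibler divergence; terms with a zero numerator contribute nothing.
        return sum(x * math.log(x / y, base) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Invented 24-hour pickup profiles (relative counts per hour, midnight first).
hourly_2018 = [3, 2, 1, 1, 1, 2, 4, 6, 7, 6, 5, 5, 5, 5, 5, 6, 7, 8, 9, 8, 7, 6, 5, 4]
hourly_2025 = [3, 2, 1, 1, 1, 2, 4, 6, 7, 6, 5, 5, 5, 5, 5, 6, 7, 9, 9, 8, 7, 6, 5, 4]

print(jsd(hourly_2018, hourly_2025))
print(jsd([1, 0], [0, 1]))  # completely disjoint distributions
```

The function normalizes its inputs, so raw hourly counts from years with very different trip volumes can be compared directly.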

The per-cluster breakdown reveals that this stability was not perfectly uniform. Brooklyn (JSD = 0.0099) and the Bronx (JSD = 0.0057) shifted roughly three to five times as much as the Manhattan clusters (JSD ≈ 0.002). While all values remain very low in absolute terms, the pattern suggests that outer-borough areas experienced more temporal change than central Manhattan.

Temporal Shift by Cluster (Jensen-Shannon Divergence)

Hourly Compositional Breakdown

Aggregate temporal stability can mask compositional shifts: if Manhattan demand fell and outer-borough demand rose by similar proportions across all hours, the aggregate hourly curve would look identical even though the spatial composition of each hour changed. The stacked bars below decompose each hour's demand by cluster.

Cluster Share of Demand by Hour (2018 vs. 2025)

Part III

Prediction Scorecard

Each of the six predictions derived from MDP theory was evaluated against the data. Five are directionally consistent with the observed patterns; the sixth (more localised matching) received mixed support, because zone-level localization rose while mean trip distance also rose. The strength of evidence varies. Predictions are rated as Strong (large effect, less vulnerable to alternative explanations), Moderate (directionally consistent but with meaningful confounders or methodological caveats), or Weak (fragile metric or evidence too sensitive to analytical choices).

Prediction	Result	Strength	Notes
Airport growth	JFK + LaGuardia combined pickup share rose 48%	Strong	Clear directional change with large magnitude. However, airport ground transportation policies also changed over this period (terminal access restrictions for non-app hails), which could independently drive this.
Manhattan core decline	Top-3 zones lost 34–39% of their pickup share	Strong	Large, consistent decline across multiple central zones. Post-pandemic remote work is an equally plausible explanation that cannot be ruled out with this design.
Lower concentration	Gini: 0.511 to 0.440; bootstrap 95% CIs do not overlap	Moderate	Direction is correct and statistically confirmed by bootstrap. Without a counterfactual, it is unclear whether this exceeds what market share growth alone would produce (more trips = more zones with non-trivial activity).
Reduced spatial clustering	Moran's I: 0.55 to 0.28 (both p<0.001)	Strong	Nearly halved. This is the most robust finding because Moran's I is less mechanically affected by volume growth than the Gini coefficient.
More localised matching	Zone localization: 29.9% to 33.4%; trip distance: 4.96 to 5.27 km	Weak	The boundary-independent localization metric shows a modest increase, but mean trip distance rose, contradicting the prediction. The intra-cluster metric (63% to 72%) overstates the effect due to cluster boundary differences.
Stable structure	Mean centroid shift: 4.22 km; aggregate temporal JSD: 0.0025	Moderate	Centroid shifts are small on average, but one cluster shifted 11.6 km. Temporal JSD is low overall, but outer-borough clusters (Brooklyn, Bronx) shifted roughly 3–5x as much as Manhattan clusters. Compositional analysis shows the spatial mix within each hour did change.

The combination of stable temporal patterns and reduced spatial autocorrelation is the most informative signal. When people use the service did not change substantially, but where demand concentrates, and how sharply it clusters geographically, shifted. Both the Gini coefficient and Moran's I fell, indicating that demand became more evenly distributed in aggregate and less spatially polarized. These patterns are consistent with a dispatch system that routes supply toward underserved zones, but the evidence is qualified by the elimination of the traditional FHV sector, rising trip distances, and the difficulty of separating algorithmic effects from post-pandemic behavioural shifts.

Synthesis

Interpretation

The geographic, temporal, spatial autocorrelation, and trip flow results point to five main observations.

Supply-side reallocation, not demand-side reorganization

Temporal patterns (peak hours, daily distribution) remained stable between years while geographic patterns shifted substantially. The aggregate Jensen-Shannon divergence of 0.0025 between the 2018 and 2025 hourly profiles confirms this quantitatively. If changes in rider behaviour were the primary driver, both dimensions should have moved. The stability of the temporal dimension points toward mechanisms operating on where drivers are dispatched, not on when or why riders request trips. However, compositional analysis reveals that the spatial mix within each hour did shift, with outer-borough clusters contributing more to each hour in 2025.

Spatial dispersion within a stable structure

Both concentration and spatial clustering declined: the Gini coefficient fell from 0.511 to 0.440 (bootstrap CIs non-overlapping) and Moran's I from 0.55 to 0.28. Previously underserved outer-borough zones gained enough activity to become statistically indistinguishable from their surroundings. The 4.22 km average cluster centroid shift indicates these changes occurred within the existing geographic structure of NYC demand, not through fundamental reorganization.

Airport zones replaced Manhattan core as the top demand centres

JFK displaced Midtown Center as the highest-volume pickup zone. Combined airport pickup share rose 48%, while the top-3 Manhattan zones declined 34–39%. Under MDP theory, airports carry high future position value because drivers completing airport trips are well-positioned for return fares, making them attractive dispatch destinations for the algorithm.

Trip localization is weak and distances increased

The boundary-independent localization metric rose modestly (29.9% to 33.4%), but mean trip distance increased from 4.96 km to 5.27 km. These results partially contradict the prediction that MDP matching should produce shorter, more local trips. The increase in airport trips, which are among the longest in the dataset, likely accounts for the distance increase and represents a trade-off: MDP matching may improve system-wide efficiency while producing longer individual trips to high-value destinations.

The traditional FHV sector was eliminated

The "Other FHV" category went from 61% of all for-hire trips to 0%. This is the largest structural confounder. Traditional car services served specific outer-borough communities; their absorption by Uber means that some of the geographic dispersion reflects the inheritance of pre-existing trip patterns rather than algorithmic redistribution. Lyft also grew (16% to 25%), further complicating the attribution of spatial changes to Uber's dispatch algorithm specifically.

Limitations

Temporal Coverage

The data covers January only in each year. Seasonal variation is not captured, and results may differ in summer months or during holiday periods. A fuller analysis would compare multiple months across years. Intermediate January snapshots (2019–2024) would show whether the spatial changes were gradual or discontinuous around specific events.

Market Structure

The elimination of the traditional FHV sector (61% of 2018 trips) is the dominant confounder. Uber's 2025 geographic footprint partly reflects the absorption of trips that previously belonged to community car services with different spatial coverage. Separating algorithmic effects from market consolidation effects would require data on the geographic profile of the absorbed operators.

Algorithm Opacity

Uber's actual dispatch algorithm is proprietary. The MDP framework used here derives from Xu et al. (2018), a published research paper. The analysis tests predictions from that framework against observable patterns, but cannot confirm which specific algorithm Uber deployed or when.

Distance Measurement

Trip distances are computed as great-circle distances between zone centroids, not actual trip routes. This approximation understates true distances (roads are not straight lines) and assigns a distance of zero to every trip that starts and ends in the same zone. Actual trip duration and fare data, where available, would provide better measures of trip length and efficiency.

Cluster Boundaries

K-means clusters are fit independently per year. The intra-cluster trip share increase (63% to 72%) is partly an artifact of differently drawn boundaries. The zone-level localization metric (29.9% to 33.4%) is boundary-independent and provides a more conservative estimate of localization change.

Alternative Explanations

Post-pandemic commuting behaviour, airport ground transportation policy changes, the 2019 TLC vehicle cap, Uber's own pricing adjustments, and the competitive dynamics between Uber and Lyft are all plausible contributors that cannot be isolated in a two-period comparison.

Methodological Notes

Data

NYC TLC public trip records (Parquet format). January 2018: 19.8M total FHV trips (4.5M Uber, 3.1M Lyft, 12.2M other). January 2025: 20.4M total FHV trips (15.4M Uber, 5.0M Lyft, 0 other). 263 TLC taxi zones. Uber identified via dispatching base number (2018) and HVFHS license number (2025). Zone centroids from NYC Open Data shapefile.

Geographic Methods

K-means clustering (k=6) on zone centroid lat/lon. Clusters matched across years by geographic proximity (Haversine distance). Gini coefficient and Lorenz curves for concentration. Bootstrap confidence intervals on Gini (1,000 iterations resampling zones). Zone-level localization (K=5 nearest zones by centroid distance). Trip distance via vectorized Haversine on centroid coordinates.

Spatial Methods

Global Moran's I and local LISA (KNN weights, k=6, row-standardized) for spatial autocorrelation. Kolmogorov-Smirnov test for distributional comparison of pickup/dropoff mismatch ratios. Levene's test for variance equality.

Temporal Methods

Jensen-Shannon divergence (JSD) between 2018 and 2025 hourly pickup distributions, computed at aggregate and per-cluster levels. Demand typology clustering via K-means (k=4) on standardized 24-dimensional hourly profile vectors per zone.

Reference: Xu et al. (2018). "Large-Scale Order Dispatch in On-Demand Ride-Hailing Platforms: A Learning and Planning Approach." KDD '18. The MDP framework is based on this published research; the analysis is independent of any proprietary system.

K-Means Clustering, Geospatial Analysis, Spatial Autocorrelation (LISA), Markov Decision Process, Gini Coefficient, Bootstrap CI, Jensen-Shannon Divergence, OD Flow Analysis, NYC TLC Data, Python, PySAL / GeoPandas, Plotly

Appendix

Bootstrap Gini Confidence Intervals

Zone-level trip counts were resampled 1,000 times with replacement, and the Gini coefficient was recomputed for each iteration. The resulting 95% confidence intervals for 2018 (0.475–0.541) and 2025 (0.402–0.473) do not overlap, supporting the conclusion that the observed decline is not a sampling artifact.

Bootstrap Distribution of Gini Coefficients

LISA Cluster Maps

Local Indicators of Spatial Association (LISA) identify statistically significant spatial clusters. Red zones are hot spots (high demand surrounded by high demand); blue zones are cold spots. The spatial weights matrix uses K=6 nearest neighbours with row standardization.

LISA Map: 2018 (Global Moran's I = 0.55, p < 0.001)

LISA Map: 2025 (Global Moran's I = 0.28, p < 0.001)

Pickup/Dropoff Balance

The pickup share of each zone's combined activity, PU/(PU+DO), measures whether a zone is a net origin or a net destination. A value of 0.5 indicates perfect balance; values above 0.5 mark a net origin and values below 0.5 a net destination. The KS test detected a significant distributional shift (D = 0.21, p < 0.001), while Levene's test found no significant change in variance (p = 0.95).