Diversity in cities

What is Diversity in Cities?

  • People > Population Diversity

  • Firms, businesses & institutions > Economic Diversity

  • Urban landscape > Morphological Diversity

  • Animals & Plants > Species diversity

What is Diversity in Cities?

  • In general: ‘The state of being diverse’

  • With spatial context: ‘The state of being diverse at one location OR throughout a geographical area’

  • In context of cities: ‘The state of being diverse within and between urban places’

Why is diversity important?

  • The concentration of diverse entities (people, firms and others) at a location promotes creativity and innovation

In practice

  1. Concentration/number & lack thereof

  2. Spread - homogeneity and heterogeneity

  3. Spillover - Geographical relation to concentration results in

General diversity measures

For example, species richness…

  • … aka variety

  • \(D = \sum_{i=1}^n p_{i}^0\)

  • \(p_i\) is the proportion of data points in the \(i\)th category

  • \(n\) is the number of total categories

  • A count of different species / categories / …


  • Plurality

  • Availability of options
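A minimal sketch of the richness measure in R, assuming a hypothetical vector of land-use categories (`land_use` and its values are made up for illustration). Because every observed category has \(p_i > 0\), each one contributes \(p_i^0 = 1\), so the sum is simply a count of categories.

```r
# Hypothetical example data: land-use category of ten parcels in one neighbourhood
land_use <- c("residential", "retail", "residential", "office", "park",
              "retail", "residential", "office", "residential", "park")

# Proportion of observations in each observed category (p_i)
p <- prop.table(table(land_use))

# Richness: every category with p_i > 0 contributes p_i^0 = 1
richness <- sum(p^0)
richness  # identical to length(unique(land_use))
```

Richness ignores how evenly the observations are spread across categories; it only counts how many categories are present.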

General diversity measures

OR Shannon entropy

  • \(H = -\sum_{i=1}^n p_{i} \ln{p_{i}}\)

  • \(n\) is the number of total categories

  • \(p_i\) is the proportion of data points in the \(i\)th category

  • Probably the most common diversity index.

  • Interpretation:

    • If one category dominates ➔ less surprise ➔ low entropy

    • No category dominates ➔ more surprise ➔ high entropy
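The interpretation above can be checked with a small sketch in R. The helper `shannon_entropy()` and the two example vectors are hypothetical and only illustrate the formula; empty categories are dropped because \(0 \ln 0\) is treated as 0.

```r
# Shannon entropy of a categorical vector (natural logarithm)
shannon_entropy <- function(x) {
  p <- as.numeric(prop.table(table(x)))  # proportions p_i
  p <- p[p > 0]                          # drop empty categories
  -sum(p * log(p))
}

# One category dominates: little surprise, low entropy
dominated <- c(rep("A", 97), "B", "C", "D")

# No category dominates: more surprise, high entropy
even <- rep(c("A", "B", "C", "D"), each = 25)

shannon_entropy(dominated)
shannon_entropy(even)  # equals log(4), the maximum for four categories
```

Both vectors have the same richness (four categories), which is why entropy is often reported alongside a simple count.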

Economic Diversity

  • High concentration and high diversity promote collaboration and allow for economies of scale and economic growth

Economic Diversity

  • Marshallian externalities - benefits gained from geographical agglomeration

    For example: Knowledge spillover, production spillover, …

  • Jacobian externalities - benefits gained from the diversity of economic activities within geography

    For example: Knowledge concentration correlated with production concentration

Knowledge spillovers

Example of method: Spatial weights

From Rowe (2021) and Li, Ma, and Liu (2019)
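As a rough illustration of what spatial weights do, the sketch below builds a row-standardised distance-band weights matrix in base R and uses it to compute a spatial lag, i.e. the average value among each area's neighbours. The coordinates, the `patents` indicator and the 2-unit distance band are all hypothetical choices for illustration; this is not necessarily the specification used in Rowe (2021) or Li, Ma, and Liu (2019).

```r
# Hypothetical centroids of five areas and a knowledge indicator per area
coords  <- cbind(x = c(0, 1, 2, 5, 6), y = c(0, 0, 1, 5, 5))
patents <- c(10, 20, 30, 5, 15)

d <- as.matrix(dist(coords))   # pairwise Euclidean distances
W <- (d > 0 & d <= 2) * 1      # distance band: neighbours within 2 units
W <- W / rowSums(W)            # row-standardise so each row sums to 1

# Spatial lag: mean value of the indicator among each area's neighbours
spatial_lag <- as.numeric(W %*% patents)
spatial_lag
```

An area with no neighbour inside the band would produce a division by zero here, so real applications usually check for such "islands" first or use a k-nearest-neighbour definition instead.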

Population Diversity

Cities are generators of cosmopolitanism

  • ‘cosmos’ + ‘polis’

  • ‘world’ + ‘city’

  • city of the world

  • cosmopolite = citizen of the world

  • cosmopolitan = being part of the world, free from local attachments and prejudices

Population Diversity

Chicago School - mosaic - spatial ecology

massive number of segregation studies > ‘The ethnic city’

link to interactive map

Population Diversity

Later on shift >

  • exchange
  • conviviality
  • multiculture
  • spaces of difference
  • engaging strangers

Population Diversity

Steven Vertovec - Diversity and Contact

Watch Steven Vertovec’s lecture on diversity

Diversity is not just about the ‘cosmo’

It can also have negative effects, such as the ‘halo effect’: xenophobic populism is highest in areas close to highly diverse or changing areas

Mapping Diversity

  1. Plotting the diversity metrics (Shannon entropy, rates, …)

  2. Clustering



  • Reducing the dimensions of the observation space

  • Classification of observations into (exclusive) groups

  • Distance or (dis)similarity between each pair of observations is used to create a distance or dissimilarity matrix

  • Observations within the same group are as similar as possible

  • Based on Boehmke and Greenwell (2019) available here

  • Plenty of other resources online and in textbooks


  1. Hierarchical

  2. k-means

  3. DBSCAN


Hierarchical clustering

  1. Agglomerative clustering (AGNES – AGglomerative NESting)

  2. Divisive hierarchical clustering (DIANA – DIvisive ANAlysis)

Dissimilarity (distance) of observations
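A minimal sketch of agglomerative clustering in base R, assuming hypothetical data (two indicators for 20 neighbourhoods). The Euclidean distance and Ward linkage are illustrative choices rather than a recommendation; `dist()`, `hclust()` and `cutree()` are all part of the `stats` package.

```r
set.seed(42)

# Hypothetical data: two indicators for 20 neighbourhoods forming two loose groups
x <- rbind(matrix(rnorm(20, mean = 0, sd = 0.3), ncol = 2),
           matrix(rnorm(20, mean = 3, sd = 0.3), ncol = 2))

d  <- dist(scale(x), method = "euclidean")  # dissimilarity matrix on standardised data
hc <- hclust(d, method = "ward.D2")         # agglomerative (bottom-up) clustering

groups <- cutree(hc, k = 2)                 # cut the tree into two groups
table(groups)
```

Calling `plot(hc)` draws the dendrogram, which helps when deciding where to cut the tree.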



K-means clustering

  1. k is the number of clusters and is pre-defined

  2. The algorithm selects k random observations (starting centres)

  3. The remaining observations are assigned to the nearest centre

  4. The centres are recalculated as the mean of the observations in each cluster

  5. Cluster assignments are re-checked against the new centres

  6. The process iterates to minimise within-cluster variation until convergence

\(SS_{within} = \sum_{j=1}^k W(C_{j}) = \sum_{j=1}^k \sum_{x_i\in C_j}(x_i-\mu_j)^2\)
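The steps above can be sketched by hand in base R. `simple_kmeans()` is a hypothetical teaching helper (a plain Lloyd-style iteration), not the algorithm behind `stats::kmeans`, which defaults to Hartigan-Wong; the example data are simulated.

```r
simple_kmeans <- function(x, k, iter_max = 100) {
  # Step 2: pick k random observations as starting centres
  centres <- x[sample(nrow(x), k), , drop = FALSE]
  assignment <- rep(0L, nrow(x))
  for (iter in seq_len(iter_max)) {
    # Steps 3 and 5: squared distance from every observation to every centre
    d2 <- sapply(seq_len(k), function(j) rowSums(sweep(x, 2, centres[j, ])^2))
    new_assignment <- max.col(-d2, ties.method = "first")  # index of nearest centre
    # Step 6: stop once assignments no longer change
    if (all(new_assignment == assignment)) break
    assignment <- new_assignment
    # Step 4: recompute each centre as the mean of its members
    for (j in seq_len(k)) {
      members <- x[assignment == j, , drop = FALSE]
      if (nrow(members) > 0) centres[j, ] <- colMeans(members)
    }
  }
  # Total within-cluster sum of squares
  ss_within <- sum((x - centres[assignment, , drop = FALSE])^2)
  list(cluster = assignment, centres = centres, ss_within = ss_within)
}

set.seed(1)
x <- rbind(matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
           matrix(rnorm(40, mean = 4, sd = 0.3), ncol = 2))
fit <- simple_kmeans(x, k = 2)
fit$ss_within
```

Because the starting centres are random, different runs can end in different local minima, which is why practical implementations usually try several random starts.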


K-Means in practice

stats::kmeans(x, centers = k, iter.max = 10, nstart = 1, algorithm = c("Hartigan-Wong", "Lloyd", "Forgy", "MacQueen"))
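A short usage sketch of the call above on simulated data (the data and the choice of `nstart = 25` are assumptions for illustration). It also checks that the reported `tot.withinss` matches the within-cluster sum of squares computed by hand from the fitted centres.

```r
set.seed(123)

# Hypothetical data: 60 observations in two well-separated groups
x <- rbind(matrix(rnorm(60, mean = 0, sd = 0.5), ncol = 2),
           matrix(rnorm(60, mean = 4, sd = 0.5), ncol = 2))

fit <- stats::kmeans(x, centers = 2, iter.max = 10, nstart = 25,
                     algorithm = "Hartigan-Wong")

fit$size           # number of observations per cluster
fit$centers        # cluster centres
fit$tot.withinss   # total within-cluster sum of squares

# Same quantity computed directly from the definition
manual <- sum((x - fit$centers[fit$cluster, ])^2)
```

Raising `nstart` above its default of 1 reruns the algorithm from several random starts and keeps the best solution.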

How to choose k ?

  1. The elbow method

  2. Silhouette score/coefficient

  3. Gap statistics

How to choose k ?

  1. The elbow method

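    • Compute k-means clustering for different values of k

    • Calculate \(SS_{within}\) - the sum of squared distances between each point and the centroid of its cluster.

    • Plot \(SS_{within}\) against k and choose the k at the bend (the ‘elbow’), beyond which adding clusters brings little improvement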
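A minimal sketch of the elbow method in R, assuming simulated data with three groups and an arbitrary search range of k = 1 to 8.

```r
set.seed(2024)

# Hypothetical data: 90 observations in three groups
x <- rbind(matrix(rnorm(60, mean = 0, sd = 0.4), ncol = 2),
           matrix(rnorm(60, mean = 3, sd = 0.4), ncol = 2),
           cbind(rnorm(30, mean = 0, sd = 0.4), rnorm(30, mean = 3, sd = 0.4)))

ks  <- 1:8
# Total within-cluster sum of squares for each candidate k
wss <- sapply(ks, function(k) kmeans(x, centers = k, nstart = 20)$tot.withinss)

plot(ks, wss, type = "b",
     xlab = "Number of clusters k",
     ylab = "Total within-cluster sum of squares")
```

The elbow is read off the plot by eye, so the choice remains a judgement call; the silhouette score and the gap statistic offer more formal criteria.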

How to choose k ?

  2. Silhouette score
  • is a measure of how similar an object is to its own cluster (cohesion) compared to other clusters (separation)

  • ranges from −1 to +1; higher values indicate that objects are well matched to their own cluster
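For an observation \(i\), the silhouette width is \(s(i) = \frac{b(i) - a(i)}{\max\{a(i), b(i)\}}\), where \(a(i)\) is its mean distance to the other members of its own cluster and \(b(i)\) is its mean distance to the members of the nearest other cluster. The sketch below averages \(s(i)\) for several values of k, assuming the `cluster` package (a recommended package shipped with most R installations) is available; the data are simulated.

```r
library(cluster)
set.seed(7)

# Hypothetical data: 90 observations in three groups
x <- rbind(matrix(rnorm(60, mean = 0, sd = 0.4), ncol = 2),
           matrix(rnorm(60, mean = 3, sd = 0.4), ncol = 2),
           cbind(rnorm(30, mean = 0, sd = 0.4), rnorm(30, mean = 3, sd = 0.4)))

ks <- 2:6
avg_sil <- sapply(ks, function(k) {
  cl <- kmeans(x, centers = k, nstart = 20)$cluster
  # silhouette() returns one row per observation; average the widths
  mean(silhouette(cl, dist(x))[, "sil_width"])
})
names(avg_sil) <- ks

avg_sil
best_k <- as.integer(names(which.max(avg_sil)))  # k with the highest average width
```

The silhouette is undefined for a single cluster, which is why the search starts at k = 2.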

How to choose k ?

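  3. Gap statistic
  • based on a metric that describes how compact the clusters are (within-cluster dispersion), which clustering tries to minimise

  • the dispersion is computed from the pairwise distances between points within each cluster, averaged per cluster

  • the observed dispersion is compared with the dispersion expected under a reference distribution with no cluster structure; a larger gap suggests a better k

  • read the original paper Tibshirani, Walther, and Hastie (2001)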
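A hedged sketch of the gap statistic using `cluster::clusGap()`, again assuming the `cluster` package is installed and using simulated data. The number of bootstrap reference sets (`B = 50`) and the upper limit `K.max = 6` are arbitrary illustrative values.

```r
library(cluster)
set.seed(99)

# Hypothetical data: 90 observations in three groups
x <- rbind(matrix(rnorm(60, mean = 0, sd = 0.4), ncol = 2),
           matrix(rnorm(60, mean = 3, sd = 0.4), ncol = 2),
           cbind(rnorm(30, mean = 0, sd = 0.4), rnorm(30, mean = 3, sd = 0.4)))

# Gap statistic for k = 1..6; nstart is passed on to kmeans()
gap <- clusGap(x, FUNcluster = kmeans, K.max = 6, B = 50,
               nstart = 20, verbose = FALSE)

gap$Tab  # columns: logW, E.logW, gap, SE.sim

# Selection rule from Tibshirani, Walther, and Hastie (2001)
k_hat <- maxSE(gap$Tab[, "gap"], gap$Tab[, "SE.sim"], method = "Tibs2001SEmax")
k_hat
```

Unlike the silhouette score, the gap statistic can also be evaluated at k = 1, so it can indicate that the data have no cluster structure at all.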

How to choose k ?

From the optional practical on clustering


DBSCAN or HDBSCAN

  • Identifies clusters by the density of the points

  1. For each point, construct a buffer with radius r

  2. Count the other points within each buffer (N); points with enough neighbours are core points

  3. Keep constructing buffers around the points within the first buffer and iterate

  4. Stop when the cluster cannot expand any more
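The procedure above can be sketched in base R. `simple_dbscan()` is a hypothetical teaching helper, not the implementation in the dbscan package: `eps` plays the role of the buffer radius r, and `min_pts` is the assumed minimum number of points (counting the point itself) needed inside the buffer for a core point. Points that no cluster reaches keep the label 0 (noise).

```r
simple_dbscan <- function(x, eps, min_pts) {
  d <- as.matrix(dist(x))
  # Step 1: buffer of radius eps around every point (each point is in its own buffer)
  neighbours <- lapply(seq_len(nrow(x)), function(i) which(d[i, ] <= eps))
  # Step 2: core points have at least min_pts points inside their buffer
  is_core <- lengths(neighbours) >= min_pts
  cluster <- rep(0L, nrow(x))  # 0 = noise or not yet assigned
  current <- 0L
  for (i in seq_len(nrow(x))) {
    if (cluster[i] != 0L || !is_core[i]) next
    current <- current + 1L     # start a new cluster from an unassigned core point
    cluster[i] <- current
    queue <- neighbours[[i]]
    # Steps 3 and 4: keep expanding through the buffers until nothing new is reached
    while (length(queue) > 0) {
      j <- queue[1]
      queue <- queue[-1]
      if (cluster[j] != 0L) next
      cluster[j] <- current
      if (is_core[j]) queue <- c(queue, neighbours[[j]])  # only core points expand
    }
  }
  cluster
}

set.seed(10)
# Hypothetical data: two dense groups plus one isolated point
x <- rbind(matrix(rnorm(60, mean = 0, sd = 0.2), ncol = 2),
           matrix(rnorm(60, mean = 3, sd = 0.2), ncol = 2),
           c(10, 10))
cl <- simple_dbscan(x, eps = 0.5, min_pts = 5)
table(cl)
```

Unlike k-means, the number of clusters is not set in advance; it follows from the radius and the density threshold. For real analyses, the dbscan package provides an optimised `dbscan::dbscan(x, eps, minPts)`.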


DBSCAN or HDBSCAN

Resources: scikit-learn docs, dbscan package, YouTube video, example k-means vs DBSCAN

Population Diversity

Today the field is more concerned with the process of diversification.

How are diverse environments created?

  • ‘Route-ines’ are patterns of encounter that arise from fleeting interactions

  • Through ‘route-ines’ people observe changes in their neighbourhoods and become more familiar with the people around them

Based on Vertovec (2015)

Population Diversity

  1. Rooms without walls - urban spaces where interactions create social spaces and communities > patterns of social interaction

  2. Corridors of dissociation - urban places where people are prevented from interacting, whether by another person, by an institution or by themselves > patterns of social exclusion

Based on Vertovec (2015)

Try out clustering yourself


Boehmke, Brad, and Brandon Greenwell. 2019. Hands-On Machine Learning with R. Chapman and Hall/CRC.
Li, Ma, and Liu. 2019. “A New Trend in the Space-Time Distribution of Cultivated Land Occupation for Construction in China and the Impact of Population Urbanization.” Sustainability 11 (18): 5089. https://doi.org/10.3390/su11185089.
Rowe, Francisco. 2021. “Spatial Weights.” Geographic Data Science for Public Policy. https://fcorowe.github.io/udd_gds_course/02-spatial_weights.html.
Tibshirani, Robert, Guenther Walther, and Trevor Hastie. 2001. “Estimating the Number of Clusters in a Data Set Via the Gap Statistic.” Journal of the Royal Statistical Society Series B: Statistical Methodology 63 (2): 411–23. https://doi.org/10.1111/1467-9868.00293.
Vertovec, Steven. 2015. “Route-Ines.” In Diversities Old and New: Migration and Socio-Spatial Patterns in New York, Singapore and Johannesburg, edited by Steven Vertovec, 171–92. Global Diversities. London: Palgrave Macmillan UK. https://doi.org/10.1057/9781137495488_11.