People > Population Diversity
Firms, businesses & institutions > Economic Diversity
Urban landscape > Morphological Diversity
Animals & Plants > Species diversity
In general: ‘The state of being diverse’
With spatial context: ‘The state of being diverse at one location OR throughout a geographical area’
In context of cities: ‘The state of being diverse within and between urban places’
Concentration/number & lack thereof
Spread - homogeneity and heterogeneity
Spillover - Geographical relation to concentration results in
For example Species richness…
… aka variety
\(D = \sum_{i=1}^{n} p_{i}^0\)
\(p_i\) is the proportion of data points in the \(i\)th category
\(n\) is the number of total categories
A count of different species / categories / …
Interpretation:
Plurality
Availability of options
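A minimal sketch of the richness count above, assuming Python and a hypothetical list of category labels (`richness` and `sample` are illustrative names, not part of the slides). Because \(p_i^0 = 1\) for every category that is present, the sum reduces to a count of categories.

```python
from collections import Counter


def richness(labels):
    """Return D = sum(p_i ** 0): the number of categories present in `labels`."""
    counts = Counter(labels)
    total = sum(counts.values())
    # Each observed category has p_i > 0, so p_i ** 0 contributes exactly 1.
    return sum((count / total) ** 0 for count in counts.values())


# Hypothetical sample: land-use categories observed in one neighbourhood.
sample = ["retail", "office", "retail", "housing", "housing", "housing"]
print(richness(sample))
```

Richness ignores how evenly the observations are spread across categories: one dominant category and two rare ones still count as three.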
OR Shannon entropy
\(H = -\sum_{i=1}^{n} p_{i} \ln{p_{i}}\)
\(n\) is the number of total categories
\(p_i\) is the proportion of data points in the \(i\)th category
Probably the most common diversity index.
Interpretation:
If one category dominates ➔ less surprise ➔ low entropy
No category dominates ➔ more surprise ➔ high entropy
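The interpretation above can be checked with a short sketch, assuming Python and hypothetical label lists (`shannon_entropy`, `dominated` and `even` are illustrative names). It computes \(H\) with the natural logarithm, as in the formula.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Return H = -sum(p_i * ln(p_i)) over the categories present in `labels`."""
    counts = Counter(labels)
    total = sum(counts.values())
    proportions = [count / total for count in counts.values()]
    # Only observed categories are included, so every p_i is > 0 and ln is defined.
    return -sum(p * math.log(p) for p in proportions)


# Hypothetical samples with two categories each.
dominated = ["a"] * 9 + ["b"]      # one category dominates
even = ["a"] * 5 + ["b"] * 5       # no category dominates

print(shannon_entropy(dominated))  # lower entropy
print(shannon_entropy(even))       # higher entropy, equal to ln(2) for two even categories
```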
Marshallian externalities - benefits gained from geographical agglomeration
For example: Knowledge spillover, production spillover, …
Jacobian externalities - benefits gained from the diversity of economic activities within geography
For example: Knowledge concentration correlated with production concentration
Example of method: Spatial weights
Cities are generators of cosmopolitanism
‘cosmos’ + ‘polis’
‘world’ + ‘city’
city of the world
cosmopolite = citizen of the world
cosmopolitan = being part of the world, free from local attachments and prejudices
Chicago School - mosaic - spatial ecology
massive number of segregation studies > ‘The ethnic city’
Later on shift >
Steven Vertovec - Diversity and Contact
Diversity is not just about the ‘cosmo’
It can have negative effects such as the ‘halo effect’ = xenophobic populism is highest in areas close to highly diverse or changing areas
Plotting the diversity metrics (shannon entropy, rates,…)
Clustering
Reducing the dimensions of the observation space
Classification of observations into (exclusive) groups
Distance or (dis)similarity between each pair of observations is used to create a distance or dissimilarity matrix
Observations within the same group are as similar as possible
Plenty of other resources online and in textbooks
Hierarchical
k means
dbscan
Hierarchical clustering
Source: @boehmke2019hands
Agglomerative clustering (AGNES – AGglomerative NESting)
Divisive hierarchical clustering (DIANA – DIvisive ANAlysis)
Dissimilarity (distance) of observations
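As a rough sketch of the agglomerative (bottom-up) idea, assuming Python, Euclidean distance and single linkage. The linkage choice, the function name `agglomerative` and the sample points are assumptions made for brevity, not taken from the slides or from @boehmke2019hands.

```python
import math


def agglomerative(points, n_clusters):
    """Merge the two closest clusters repeatedly until `n_clusters` remain."""
    # Pairwise dissimilarity (Euclidean distance) matrix between observations.
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Start with every observation in its own cluster (stored as index lists).
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: cluster distance = distance of the closest pair.
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected by the deletion
    return clusters


# Hypothetical observations: two tight pairs and one isolated point.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
print(agglomerative(points, n_clusters=3))
```

Recording the order and distance of each merge, instead of stopping at a fixed number of clusters, would give the information needed to draw a dendrogram.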
K-Means
k is the number of clusters and is pre-defined
The algorithm selects k random observations (starting centres)
The remaining observations are assigned to the nearest centre
Recalculates the new centres
Re-check cluster assignment
Iterative process to minimise within-cluster variation until convergence
\(SS_{within} = \sum_{j=1}^{k} W(C_{j}) = \sum_{j=1}^{k} \sum_{x_i\in C_j}(x_i-\mu_j)^2\)
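The steps above can be sketched in plain Python. This is a simple Lloyd-style loop under stated assumptions, not the Hartigan-Wong default of `stats::kmeans`; the function name `kmeans`, the `seed` and `iters` parameters and the sample points are hypothetical.

```python
import random


def kmeans(points, k, iters=100, seed=0):
    """Basic k-means: returns (assignment, centres, ss_within)."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)  # k random observations as starting centres
    assignment = None
    for _ in range(iters):
        # Assign every observation to its nearest centre (squared Euclidean distance).
        new_assignment = [
            min(range(k), key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centres[c])))
            for p in points
        ]
        if new_assignment == assignment:
            break  # assignments no longer change: converged
        assignment = new_assignment
        # Recalculate each centre as the mean of its current members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous centre
                centres[c] = tuple(sum(col) / len(members) for col in zip(*members))
    # Within-cluster sum of squares for the final solution.
    ss_within = sum(
        sum((x - m) ** 2 for x, m in zip(p, centres[a]))
        for p, a in zip(points, assignment)
    )
    return assignment, centres, ss_within


# Hypothetical observations forming two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (100, 100), (100, 101), (101, 100)]
assignment, centres, ss_within = kmeans(points, k=2)
print(assignment, ss_within)
```

Because the starting centres are random, different seeds can converge to different local optima on less clear-cut data, which is why running several random starts and keeping the best solution is common practice.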
K-Means in practice
stats::kmeans(x, centers = k, iter.max = 10, nstart = 1, algorithm = c("Hartigan-Wong", "Lloyd", "Forgy", "MacQueen"))
The elbow method
Silhouette score/coefficient
Gap statistics
The elbow method
Compute k-means clustering for different values of k
Calculate \(SS_{within}\) for each k - the sum of squared distances between each point and the centroid of its cluster.
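A self-contained sketch of the elbow computation, assuming Python and a compact Lloyd-style k-means helper. The helper `ss_within`, its `seed` and `iters` parameters and the sample points are illustrative assumptions, not the method used in the practical.

```python
import random


def ss_within(points, k, seed=0, iters=100):
    """Run a basic k-means and return the within-cluster sum of squares."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)  # random starting centres

    def sq_dist(p, c):
        return sum((x - m) ** 2 for x, m in zip(p, c))

    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centres[c])) for p in points]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        for c in range(k):
            members = [p for p, a in zip(points, labels) if a == c]
            if members:  # an empty cluster keeps its previous centre
                centres[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return sum(sq_dist(p, centres[a]) for p, a in zip(points, labels))


# Hypothetical observations forming two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (100, 100), (100, 101), (101, 100)]

# Compute SS_within for k = 1 .. number of observations.
curve = {k: ss_within(points, k) for k in range(1, len(points) + 1)}
for k, value in curve.items():
    print(k, round(value, 2))
```

Plotting `curve` against k and looking for the point where the decrease flattens (the "elbow") suggests a number of clusters; in this toy data the large drop occurs between k = 1 and k = 2.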
The silhouette score is a measure of how similar an object is to its own cluster (cohesion) compared to other clusters (separation)
It ranges from −1 to +1
The gap statistic builds on a metric that describes how compact the clusters are > minimization problem
The metric computes all the pairwise distances between points within a cluster and averages these distances
For details, read the original paper: Tibshirani, Walther, and Hastie (2001)
From the optional practical on clustering
dbscan or hdbscan
for each point constructs buffer with radius r
Counts all the other points within each buffer = N > points with enough neighbours are core points
Keep constructing buffers to points within the first buffer > iterates
Stops when it cannot expand any more
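The buffer-and-expand procedure above can be sketched in plain Python. This is an illustrative version under stated assumptions: the parameter names `r` and `min_pts`, the label `-1` for noise and the convention that a point counts itself among its neighbours are choices made here, and the all-pairs distance computation is only suitable for small data.

```python
import math


def dbscan(points, r, min_pts):
    """Return one cluster label per point; -1 marks noise."""
    n = len(points)
    # Buffer of radius r around each point: indices of all points inside it
    # (each point is included in its own buffer).
    neighbours = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= r]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_pts:
            labels[i] = -1  # provisional noise; may later become a border point
            continue
        cluster += 1  # i is a core point: start a new cluster
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:  # stops when the cluster cannot expand any more
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point becomes border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_pts:
                queue.extend(neighbours[j])  # j is also core: keep expanding
    return labels


# Hypothetical observations: two dense squares and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
print(dbscan(points, r=1.5, min_pts=3))
```

Unlike k-means, the number of clusters is not fixed in advance; it follows from the radius and the minimum number of points, and isolated observations are left unassigned as noise.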
dbscan or hdbscan
Resources: scikit-learn docs, dbscan package, YouTube video, example K-means vs DBSCAN
Today the field is more concerned about the process of diversification.
How are diverse environments created?
‘Route-ines’ are patterns of encounter that arise from fleeting interactions
Through ‘route-ines’ people observe changes in their neighbourhoods and become more familiar with the people around them
Based on Vertovec (2015)
Rooms without walls - urban spaces where interactions create social spaces and communities > patterns of social interactions
Corridors of dissociation - urban places where people do not interact, whether barred from doing so by someone, by an institution or by themselves > patterns of social exclusion
Based on Vertovec (2015)
Optional practical on github