Introduction
Web data
Methods
Results
Conclusions
Because new digital activities are rarely—if ever—captured in official state data, researchers must rely on information gathered from alternative sources (Zook and McCanless 2022).
Guide policies for deployment of new technologies
Predictions of introduction times for future technologies (Meade and Islam 2021):
Network operators
Suppliers of network equipment
Regulatory authorities
As in temporal diffusion models, an S-shaped pattern in the cumulative level of adoption
A hierarchy effect: from main centres to secondary ones – central places
A neighborhood effect: diffusion proceeds outwards from innovation centres, first “hitting” nearby rather than far-away locations (Grubler 1990)
Hägerstrand (1965): from innovative centres (core) through a hierarchy of sub-centres, to the periphery
Diffusion of an intangible, digital technology [web]
Map active engagement with digital technologies
Over time, early stages of the internet [1996-2012]
Granular and multi-scale spatial processes
Data from the Internet Archive, the oldest web archive
Observe UK commercial websites (.co.uk), 1996-2012
Geolocation: postcode references in the text
Timestamp: archival year
Counts
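As a rough illustration of the geolocation step, the sketch below pulls postcode-like strings out of page text. The regular expression is a simplified assumption, not the validation rule used to build the dataset, and `find_postcodes` and the sample text are hypothetical.

```python
import re

# Simplified UK postcode pattern (an assumption, not the dataset's validator):
# outward code (e.g. "IG8", "SW1A"), optional whitespace, inward code (e.g. "8HD").
POSTCODE_RE = re.compile(r"\b([A-Z]{1,2}[0-9][A-Z0-9]?)\s*([0-9][A-Z]{2})\b")


def find_postcodes(text: str) -> list[str]:
    """Return normalised 'OUTWARD INWARD' postcodes found in text, in order."""
    return [f"{out} {inw}" for out, inw in POSTCODE_RE.findall(text.upper())]


# Hypothetical page text for illustration only.
page_text = "Contact us: 1 Example Street, IG8 8HD"
print(find_postcodes(page_text))
```

A pattern this loose will also match strings that merely look like postcodes, so a real pipeline would presumably check candidates against a reference list of valid postcodes.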
JISC UK Web Domain Dataset: all archived webpages from the .uk domain 1996-2012
Curated by the British Library
All .uk archived webpages which contain a UK postcode in the web text
Circa 0.5 billion URLs with valid UK postcodes
20080509162138 | http://www.website1.co.uk/contact_us | IG8 8HD
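A record like the one above could be split into its three fields as follows. This is a minimal sketch that assumes the pipe-delimited layout shown on the slide and the 14-digit `YYYYMMDDhhmmss` archive timestamp; the `Record` type and `parse_record` function are hypothetical names.

```python
from datetime import datetime
from typing import NamedTuple


class Record(NamedTuple):
    year: int       # archival year taken from the timestamp
    url: str        # archived webpage URL
    postcode: str   # postcode found in the page text


def parse_record(line: str) -> Record:
    """Split 'timestamp | url | postcode' and keep only the archival year."""
    timestamp, url, postcode = (field.strip() for field in line.split("|"))
    # Parsing the full timestamp also rejects malformed values.
    archived = datetime.strptime(timestamp, "%Y%m%d%H%M%S")
    return Record(archived.year, url, postcode)


record = parse_record(
    "20080509162138 | http://www.website1.co.uk/contact_us | IG8 8HD"
)
print(record)
```

The sketch assumes URLs never contain a `|` character; a line that does would fail to unpack and raise an error.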
All the archived .uk webpages
Archived during 1996-2012
Commercial webpages (.co.uk)
From webpages to websites:
- http://www.website1.co.uk/webpage1 and
- http://www.website1.co.uk/webpage2 are part of the same website
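One way to collapse webpages into websites is to group URLs by host name. The sketch below assumes the host name is the website key, which the slides do not confirm; `website_of` and the `website2` URL are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def website_of(url: str) -> str:
    """Return the host name of a URL, used here as the website key."""
    return (urlsplit(url).hostname or "").lower()


pages = [
    "http://www.website1.co.uk/webpage1",
    "http://www.website1.co.uk/webpage2",
    "http://www.website2.co.uk/",  # hypothetical second site
]

# Group every archived page under the website it belongs to.
pages_by_site = defaultdict(list)
for page in pages:
    pages_by_site[website_of(page)].append(page)

for site, site_pages in pages_by_site.items():
    print(site, len(site_pages))
```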
One vs. multiple postcodes per website
Postcodes per website | Websites | Share | Cumulative websites | Cumulative share |
---|---|---|---|---|
(0,1] | 41,596 | 0.718 | 41,596 | 0.718 |
(1,2] | 6,451 | 0.111 | 48,047 | 0.830 |
(2,10] | 6,163 | 0.106 | 54,210 | 0.936 |
(10,100] | 2,975 | 0.051 | 57,185 | 0.988 |
(100,1000] | 646 | 0.011 | 57,831 | 0.999 |
(1000,10000] | 62 | 0.001 | 57,893 | 1.000 |
(10000,100000] | 4 | 0.000 | 57,897 | 1.000 |
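A table of this shape can be built by binning the number of postcodes per website into right-closed intervals and accumulating the counts. The sketch below reuses the interval edges from the table; the input list is toy data, not the study's counts, and the function names are hypothetical.

```python
from bisect import bisect_left
from collections import Counter

# Interval edges taken from the table: (0,1], (1,2], (2,10], ...
EDGES = [0, 1, 2, 10, 100, 1000, 10000, 100000]


def bin_label(n: int) -> str:
    """Return the right-closed interval '(lo,hi]' that contains n."""
    i = bisect_left(EDGES, n)  # index of the first edge >= n
    if i == 0 or i == len(EDGES):
        raise ValueError(f"{n} is outside the binned range")
    return f"({EDGES[i - 1]},{EDGES[i]}]"


def frequency_table(postcode_counts):
    """Rows of (interval, count, share, cumulative count, cumulative share)."""
    freq = Counter(bin_label(n) for n in postcode_counts)
    total = sum(freq.values())
    rows, cumulative = [], 0
    for lo, hi in zip(EDGES, EDGES[1:]):
        label = f"({lo},{hi}]"
        cumulative += freq[label]
        rows.append((label, freq[label], freq[label] / total,
                     cumulative, cumulative / total))
    return rows


# Toy input: number of distinct postcodes on each of six websites.
rows = frequency_table([1, 1, 1, 2, 5, 250])
for row in rows:
    print(row)
```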
Websites with a large number of postcodes: e.g. directories, real estate websites
Focus on websites with one unique postcode per year
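The filter to websites with one unique postcode per year might look like the sketch below. It assumes records have already been reduced to (website, year, postcode) triples; the function name and all sample values other than the slide's own example are hypothetical.

```python
from collections import defaultdict


def single_postcode_sites(records):
    """Keep (website, year) pairs that reference exactly one distinct postcode."""
    postcodes = defaultdict(set)
    for website, year, postcode in records:
        postcodes[(website, year)].add(postcode)
    return {
        key: next(iter(found))
        for key, found in postcodes.items()
        if len(found) == 1
    }


# Toy records; the postcodes for website2 are made up for illustration.
records = [
    ("www.website1.co.uk", 2008, "IG8 8HD"),
    ("www.website1.co.uk", 2008, "IG8 8HD"),  # repeated mention, same postcode
    ("www.website2.co.uk", 2008, "AB1 2CD"),
    ("www.website2.co.uk", 2008, "EF3 4GH"),  # second postcode: dropped for 2008
    ("www.website2.co.uk", 2009, "AB1 2CD"),
]

kept = single_postcode_sites(records)
print(kept)
```

Because the test is applied per year, a website can be kept in one year and dropped in another, as `website2` is here.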
S-shaped pattern in the cumulative level of adoption
A hierarchy effect: from main centres to secondary ones
A neighborhood effect: first “hitting” nearby locations
Cumulative adoption: self-starting logistic growth model [nls and SSlogis]
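For reference, R's `SSlogis` parameterises the logistic curve as Asym / (1 + exp((xmid - t) / scal)), where Asym is the saturation level, xmid the inflection point, and scal the time scale. In R the parameters would typically be estimated with a call such as `nls(y ~ SSlogis(t, Asym, xmid, scal))`. The Python sketch below only evaluates the curve and does not fit it; the parameter values are hypothetical, not results from the study.

```python
import math


def sslogis(t, asym, xmid, scal):
    """Logistic curve in the SSlogis form: asym / (1 + exp((xmid - t) / scal))."""
    return asym / (1.0 + math.exp((xmid - t) / scal))


# Hypothetical parameters: saturation at 1000 websites, inflection in 2004.
asym, xmid, scal = 1000.0, 2004.0, 2.0

# Cumulative adoption over the study window, 1996 to 2012.
curve = [(year, sslogis(year, asym, xmid, scal)) for year in range(1996, 2013)]
for year, value in curve:
    print(year, round(value, 1))
```

At t = xmid the curve reaches half the asymptote, which is where the S-shape switches from accelerating to decelerating growth.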
Descriptive statistics & ESDA
Machine learning framework [random forests]
Two scales: