\[trade_{ijt} = hyperlinks_{ijt} + distance_{ij} + \\ pop.density_{it} + pop.density_{jt} + empl_{it} + empl_{jt} + e_{ijt}\]
\[\begin{align} R^2 = 1 - \frac{\sum_{k = 1}^{N} (y_{k} - \hat{y}_{k})^2} {\sum_{k = 1}^{N} (y_{k} - \bar{y})^2} \label{eq:rsquared} \end{align}\]
\[\begin{align} MAE = \frac{1}{N} \sum_{k = 1}^{N} |\hat{y}_{k} - y_{k}| \label{eq:mae} \end{align}\]
\[\begin{align} RMSE = \sqrt{\frac{\sum_{k = 1}^{N} (\hat{y}_{k} - y_{k})^2} {N}} \label{eq:rmse} \end{align}\]
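As a minimal sketch, the three accuracy metrics defined above can be computed directly from paired observed and predicted values. The function names and the toy vectors `y_obs` and `y_pred` are illustrative assumptions and are not taken from the study's data or code.

```python
import math


def r_squared(y, y_hat):
    """R^2 = 1 - SS_res / SS_tot, with SS_tot measured around the mean of y."""
    mean_y = sum(y) / len(y)
    ss_res = sum((obs - pred) ** 2 for obs, pred in zip(y, y_hat))
    ss_tot = sum((obs - mean_y) ** 2 for obs in y)
    return 1 - ss_res / ss_tot


def mae(y, y_hat):
    """Mean absolute error: average of |prediction - observation|."""
    return sum(abs(pred - obs) for obs, pred in zip(y, y_hat)) / len(y)


def rmse(y, y_hat):
    """Root mean squared error: square root of the mean squared residual."""
    return math.sqrt(sum((pred - obs) ** 2 for obs, pred in zip(y, y_hat)) / len(y))


# Toy values for illustration only (hypothetical, not from the study).
y_obs = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 6.0]

print(r_squared(y_obs, y_pred), mae(y_obs, y_pred), rmse(y_obs, y_pred))
```

For these toy values the single residual of 2 gives an MAE of 0.5 and an RMSE of 1.0, while R² is approximately 0.2. Because RMSE squares residuals before averaging, it weights a few large errors more heavily than MAE does.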
interval | frequency | share | cumulative frequency | cumulative share |
(0,1] | 41596 | 0.718 | 41596 | 0.718 |
(1,2] | 6451 | 0.111 | 48047 | 0.830 |
(2,10] | 6163 | 0.106 | 54210 | 0.936 |
(10,100] | 2975 | 0.051 | 57185 | 0.988 |
(100,1000] | 646 | 0.011 | 57831 | 0.999 |
(1000,10000] | 62 | 0.001 | 57893 | 1.000 |
(10000,100000] | 4 | 0.000 | 57897 | 1.000 |
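The share and cumulative columns of the table above follow arithmetically from the raw counts. The sketch below recomputes them from the counts listed in the table; the variable names are illustrative assumptions, and rounding to three decimals is assumed from the table's printed precision.

```python
from itertools import accumulate

# Counts per right-closed interval, copied from the table above.
intervals = ["(0,1]", "(1,2]", "(2,10]", "(10,100]",
             "(100,1000]", "(1000,10000]", "(10000,100000]"]
freq = [41596, 6451, 6163, 2975, 646, 62, 4]

total = sum(freq)                                  # number of observations
cumfreq = list(accumulate(freq))                   # running total of counts
perc = [round(f / total, 3) for f in freq]         # share of each interval
cumperc = [round(c / total, 3) for c in cumfreq]   # cumulative share

for row in zip(intervals, freq, perc, cumfreq, cumperc):
    print(" | ".join(str(value) for value in row))
```

The recomputed columns match the table: the first two intervals alone account for about 83% of all observations, which shows how strongly right-skewed the distribution is.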
year | hyperlinks | distance |
2000 | 0.539 | -0.219 |
2001 | 0.578 | -0.221 |
2002 | 0.793 | -0.221 |
2003 | 0.483 | -0.220 |
2004 | 0.807 | -0.223 |
2005 | 0.643 | -0.219 |
2006 | 0.585 | -0.219 |
2007 | 0.598 | -0.214 |
2008 | 0.491 | -0.205 |
2009 | 0.922 | -0.207 |
2010 | 0.674 | -0.205 |
\[trade_{ijt} = hyperlinks_{ijt} + distance_{ij} + \\ pop.density_{it} + pop.density_{jt} + empl_{it} + empl_{jt} + e_{ijt}\]
year | RMSE | R² | MAE |
2002 | 951.04 | 0.96 | 166.99 |
2003 | 1254.95 | 0.94 | 230.47 |
2004 | 1019.69 | 0.95 | 179.42 |
2005 | 1852.54 | 0.89 | 310.94 |
2006 | 1713.55 | 0.92 | 307.53 |
2007 | 1974.77 | 0.90 | 210.49 |
2008 | 1534.67 | 0.92 | 248.84 |
2009 | 1237.98 | 0.93 | 215.63 |
2010 | 3165.46 | 0.63 | 302.44 |