Scale-adjusted metrics for predicting the evolution of urban indicators and quantifying the performance of cities
Luiz G. A. Alves1,2, Renio S. Mendes1,2, Ervin K. Lenzi1,2,3, Haroldo V. Ribeiro1,4,*
1 Departamento de Física, Universidade Estadual de Maringá, Maringá, PR 87020-900, Brazil
2 National Institute of Science and Technology for Complex Systems, CNPq, Rio de Janeiro, RJ 22290-180, Brazil
3 Departamento de Física, Universidade Estadual de Ponta Grossa, Ponta Grossa, PR 84030-900, Brazil
4 Departamento de Física, Universidade Tecnológica Federal do Paraná, Apucarana, PR 86812-460, Brazil
More than half of the world population now lives in cities, and this share is expected to reach two-thirds by 2050. Fostered by the relevance of a scientific characterization of cities and by the availability of an unprecedented amount of data, academics have recently immersed themselves in this topic, and one of the most striking and universal findings was the discovery of robust allometric scaling laws between several urban indicators and the population size. Despite that, most governmental reports and several academic works still ignore these nonlinearities by analyzing the raw or the per capita values of urban indicators, a practice that biases the urban metrics towards small or large cities depending on whether the allometries are sublinear or superlinear. Following the ideas of Bettencourt et al. [PLoS ONE 5 (2010) e13541], we account for this bias by evaluating the difference between the actual value of an urban indicator and the value expected from the allometry with the population size. We show that this scale-adjusted metric provides a more appropriate and informative summary of the evolution of urban indicators and reveals patterns that do not appear in the evolution of per capita values of indicators obtained from Brazilian cities. We also show that these scale-adjusted metrics are strongly correlated with their past values through a linear correspondence and that they also display cross-correlations among themselves. Simple linear models account for 31%-97% of the observed variance in the data and correctly reproduce the average of the scale-adjusted metric when cities are grouped into those above and those below the allometric laws. We further employ these models to forecast future values of urban indicators and, by visualizing the predicted changes, we verify the emergence of spatial clusters characterized by regions of the Brazilian territory where we expect an increase or a decrease in the values of urban indicators.
Over the past six decades the world has gone through a period of quick and remarkable urbanization. According to the United Nations [1], in 2007 the world urban population surpassed the rural one for the first time and, if this process persists, two-thirds of the world population is expected to be living in urban areas by 2050. On the one hand, cities are usually associated with higher levels of literacy and health care and with better opportunities; on the other hand, unplanned urbanization and bad political decisions also lead to pollution, environmental degradation, growth of crime, unequal opportunities and an increase in the number of people living in substandard conditions. In this sense, there is a pressing need for finding patterns in, quantifying and predicting the evolution of urban indicators, since these investigations may provide guidance for better political decisions and resource allocation.
Fostered by this need and also due to the availability of an unprecedented amount of data at the city level, several researchers have recently made impressive progress in what has been called the Science of Cities. These new data allowed researchers to probe patterns of cities to a degree not possible before, and one of the most striking and universal findings of these studies was the discovery of robust allometric scaling laws between several urban indicators and the population size. Patents, gasoline stations, gross domestic product [3, 4, 5, 6], crime [6, 7, 8, 9, 10, 11], indicators of education, suicides, number of election candidates [13, 14], transportation networks [15, 16], employees from several sectors and measures of social interaction are just a few examples where robust scaling laws have been found. Despite these intrinsic nonlinear relationships, it is common to find works that try to unveil relationships between population size and urban indicators by applying linear regressions to the raw data; also, per capita indicators (that is, indicators divided by the population size) are ubiquitous in reports of government agencies and are often used as a guide for public policies when analyzing the temporal evolution of a given city or when comparing or ranking the performance of cities with different population sizes. However, these linear regressions may result in misguided or controversial relationships [19, 9], and per capita indicators are oblivious to the allometric scaling laws that make the city a complex agglomeration that cannot be modeled as a linear combination of its individual components [6, 21, 20]. In this sense, there is a paucity of alternative metrics for urban indicators that may overcome the foregoing problems and provide a fairer comparison between cities of different sizes as well as a better understanding of the evolution of cities. Bettencourt et al. have recently proposed a simple alternative to overcome these nonlinearities by evaluating the difference between the actual value of the urban indicators and the value expected from the allometries with the population size (that is, the residuals in the allometric relationships). This scale-adjusted metric explicitly considers the allometric relationships and has already proved useful in the economic context [23, 22] and for unveiling relationships between crime and urban metrics that are not properly captured by regression analysis.
In this article, we follow the evolution of this relative metric for eight urban indicators from Brazilian cities in the three years (1991, 2000 and 2010) in which the national census took place. By grouping cities into those above and those below the allometric laws, we argue that the average of this scale-adjusted metric provides a more appropriate and informative summary of the evolution of the urban indicators than the per capita values; it also reveals patterns that do not appear when analyzing only the evolution of the per capita values. For instance, while the per capita values of homicides have systematically increased over the last three decades, the averages of the scale-adjusted metric for cities above and for cities below the allometric law are both approaching zero; that is, cities where the number of homicides is above the value expected from the allometry have managed to reduce this crime, whereas this crime has increased in those cities where the number of homicides is below the allometry. We argue that the nonlinearities may affect the per capita indicators by creating a bias towards large cities for superlinear allometries and a bias towards small cities for sublinear allometries. We further show that these scale-adjusted metrics are strongly correlated with their past values through a linear correspondence, making them particularly good for predicting future values of urban indicators through linear regressions. We have tested this hypothesis via a linear model in which the scale-adjusted metric for one indicator in a given census was predicted by a linear combination of all eight metrics evaluated from the preceding census. These simple models account for 31%-97% of the observed variance in the data and correctly reproduce the average value of the scale-adjusted metric when cities are grouped into those above and those below the allometric laws.
Motivated by this good agreement, we present a prediction for the values of urban indicators in the year 2020 by assuming that the linear coefficients are constant over time. By visualizing the predicted changes, we verify the emergence of spatial clusters characterized by regions of the Brazilian territory where the model predicts an increase or a decrease in the values of urban indicators. We further report a list containing all the scale-adjusted metrics as well as the predictions for each city, in the hope that government agencies find this information useful (S1 Dataset).
Results and Discussion
We start by considering the average of the per capita values for eight urban indicators described in the Methods Section. This is a common practice of government agencies for tracking the evolution of a particular city or for comparing a group of cities with different populations. We observe in Fig. 1 that almost all per capita indicators show a clear temporal trend: elderly population, female population, homicides and family income have increased over the years; whereas child labor, illiteracy and male population have decreased (the unemployment rates have evolved in a more complex manner, exhibiting no clear tendency). We could also list the cities in which these indicators have considerably changed or rank the cities that have made more progress in reducing, for instance, the homicide or illiteracy rates.
One of the main problems with this analysis is that it completely ignores the hypothesis that most urban indicators $Y$ display allometric relationships (or scale invariance) with the population $N$, that is,
$$Y(t) = \mathcal{A}\, N(t)^{\beta}, \tag{1}$$
where $\mathcal{A}$ is a constant, $\beta$ is the allometric (or scaling) exponent and $t$ stands for time. This simple relationship summarizes the (average) effects of increasing the population size on the urban indicators; it states that cities are self-similar in terms of their population, in the sense that average properties of a given city can be inferred from knowledge of its population alone. Urban indicators are thus expected to display a deterministic component emerging from very few and general properties of the urban networks related to social and infrastructure aspects of cities. When analyzing per capita values, it is implicitly assumed that the value of an urban indicator is proportional to the population size ($\beta = 1$) or, in other words, that cities are extensive systems. This idea is opposed to the complex systems approach to cities: complex systems are non-extensive ($\beta \neq 1$), meaning that their isolated parts do not behave in the same manner as when they are interacting. Cities have similar properties and only make sense as an entire “organism” and, in fact, there is robust evidence favoring the non-extensive and universal nature of cities (that is, the urban scaling hypothesis) across different cultures and historical periods [21, 24, 25]. Thus, several properties of a city of a given size cannot be linearly scaled to another city with a larger or smaller population. The dynamical processes mediated by the urban networks make the scale operation a nonlinear transformation for several urban indicators, often resulting in per capita savings of material infrastructure and in gains of socio-economic productivity.
From a more technical point of view, whenever the allometric exponent $\beta$ is different from one, there is a remaining component related to the population size $N$ when evaluating the per capita values of these urban indicators, that is, $Y/N \sim N^{\beta-1}$, which creates a bias towards large cities for $\beta > 1$ and towards small cities for $\beta < 1$; per capita measures only remove the effect of the population size on an urban indicator correctly for $\beta = 1$. For our data, a complete description of the allometric relationships between the eight urban indicators and the population is presented in the Methods Section, where we have confirmed the presence of allometries in our data as summarized in Table 1 (see also S1 and S2 Figs.).
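The size bias of per capita values can be checked numerically. The following is only a minimal sketch under stated assumptions: the function name `per_capita`, the prefactor of one, the two population sizes and the three exponents are illustrative choices, not values taken from the data.

```python
def per_capita(population, beta, prefactor=1.0):
    """Per capita value of an indicator that follows Y = prefactor * N**beta."""
    indicator = prefactor * population ** beta
    return indicator / population  # equals prefactor * N**(beta - 1)


# Hypothetical small and large cities and illustrative exponents.
small, large = 1e4, 1e6
for beta in (0.85, 1.0, 1.15):
    ratio = per_capita(large, beta) / per_capita(small, beta)
    print(f"beta = {beta}: per capita ratio (large / small) = {ratio:.3f}")
```

Under these assumptions the ratio equals $(N_{\rm large}/N_{\rm small})^{\beta-1}$, so it is larger than one for a superlinear exponent, smaller than one for a sublinear exponent, and equal to one only when the exponent is one.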
The problem with per capita indicators thus prompts the question of how we can account for these allometries and correctly remove the effect of the population size on the urban indicators. Bettencourt et al. have proposed a simple and efficient procedure for overcoming this problem by defining the so-called scale-adjusted urban indicator. The approach consists in evaluating the logarithmic difference between the actual value of an urban indicator and the value expected from the allometric relationship with the population (that is, the residuals in the allometric relationships) in a given year $t$, namely,
$$D_Y(t) = \log Y(t) - \log\left[\mathcal{A}\, N(t)^{\beta}\right]. \tag{2}$$
The previous quantity explicitly considers the allometry between an urban indicator and the population size, creating a relative measure that is not biased by the population size (size-independent) for any value of $\beta$. The scale-adjusted metric captures the exceptionality (either good or bad) of a city, which is the result of the nonlinear agglomeration process related to the socio-economic choices and historical path of a city. Furthermore, $D_Y$ establishes a more “natural” scale for ranking cities by identifying whether an urban indicator of a given city is above ($D_Y > 0$) or below ($D_Y < 0$) the value expected for cities of similar size. This approach has already proved useful in the economic context [23, 22] and was also employed for unveiling relationships between crime and urban metrics that are not properly captured by linear regression analysis. Figure 2 illustrates the definition of $D_Y$ and shows an example of allometry between homicides and population size (see also S1 and S2 Figs. for all urban indicators).
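As an illustration of the definition above, the scale-adjusted metric is simply the residual of a city with respect to the fitted allometry in log-log scale. The sketch below assumes base-10 logarithms and uses hypothetical parameter values; the function name `scale_adjusted_metric` and its arguments are illustrative and are not part of the original analysis.

```python
import math


def scale_adjusted_metric(indicator, population, log_prefactor, beta):
    """Logarithm of the actual value minus the logarithm expected from the allometry.

    The allometry is assumed to be log10(Y) = log_prefactor + beta * log10(N).
    """
    expected_log = log_prefactor + beta * math.log10(population)
    return math.log10(indicator) - expected_log


# Hypothetical allometry log10(Y) = -4.0 + 1.2 * log10(N): a city with
# 100,000 inhabitants is expected to have Y = 100.
below = scale_adjusted_metric(50, 100_000, log_prefactor=-4.0, beta=1.2)
above = scale_adjusted_metric(200, 100_000, log_prefactor=-4.0, beta=1.2)
print(below, above)  # negative for the city below the law, positive for the one above
```

A city lying exactly on the allometric law has a metric equal to zero regardless of its population, which is what makes the measure comparable across city sizes.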
In order to show the information that this scale-adjusted metric provides, we study the evolution of the average of the scale-adjusted metric, $\langle D_Y \rangle$, by creating two groups of cities: those whose urban indicator was above ($D_Y > 0$) and those whose indicator was below ($D_Y < 0$) the allometric law in the initial year. Figure 3 shows these averages for the eight urban indicators over the three years in our database. For the majority of the urban indicators, the average displays a statistically significant decreasing tendency for cities initially above the allometric law, whereas an increasing tendency is observed for those cities that were initially below the allometric law. For instance, cities where the number of laboring children, homicides and unemployed people were above the value expected from the allometric law have been successful (on average) in reducing them; in contrast, cities where these indicators were below the allometric laws proved unable (on average) to improve or maintain this situation. The case of illiteracy is an exception to the previous pattern, since the average has increased for cities initially above the allometric law and is almost constant for those cities initially below. Thus, cities where illiteracy was initially above the allometric law have failed (on average) to increase the number of literate people; on the other hand, those cities initially below the allometric law have not only kept this feature (on average) but also managed to further improve their literacy levels. In the case of population metrics, particularly the female and male populations, the approach to the allometric laws (together with the decrease in the proportion of males in the population) may represent a positive aspect with respect to the reduction of violence, since an excessive contingent of men can drive an increase in antisocial behavior; moreover, both male population above and female population below the allometric law correlate with a number of homicides above the allometric law. Similarly, family income above the allometric law correlates with homicides above it, while family income below the allometric law correlates with homicides below it. Remarkably, this information remains hidden or distorted when we look only at the per capita values of these urban indicators.
The trends in the average values of the scale-adjusted metrics prompt the question of how the values of this relative metric are affected by their previous values, that is, are the values of $D_Y(t)$ and $D_Y(t+\Delta t)$ correlated in some particular fashion? To answer this question, we analyze the scatter plots of the scale-adjusted metric in a given year versus its past values for each urban indicator and for all cities. Figure 4 and S3 Fig. show these scatter plots for the two combinations of consecutive census years (2000 versus 1991 and 2010 versus 2000). We observe that, despite the different degrees of scattering, linear functions are good approximations for the average tendency of these relationships. We thus adjust the linear model
$$D_Y(t+\Delta t) = A_0 + A_1\, D_Y(t) \tag{3}$$
to each urban indicator via the least squares method, and the best fitting parameter $A_1$ (together with the Pearson correlations) is shown in Table 2 for the two combinations of years. We have omitted the values of $A_0$ because they are very small. Despite the increasing tendency observed for the values of $A_1$ over the years (except for homicides and family income), we observe that $A_1 < 1$ for almost all urban indicators, the exception being illiteracy (see Table 2 for its values for 2000 versus 1991 and for 2010 versus 2000). These results agree with the evolution of the average $\langle D_Y \rangle$, that is, indicators characterized by $A_1 < 1$ present a tendency of approaching the allometric laws, while for $A_1 > 1$ there should be a departing tendency from the allometric laws.
|Child labor||–||0.54 (2)||0.53|
|Elderly population||–||0.84 (1)||0.92|
|Female population||–||0.66 (1)||0.83|
|Family income||–||0.89 (1)||0.91|
|Male population||–||0.70 (1)||0.84|
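The least-squares adjustment of a straight line relating the scale-adjusted metric in one census to its value in the preceding census can be sketched as follows. This is a minimal illustration on synthetic numbers, not on the DATASUS data; the function name `fit_line` and the variables `d_past` and `d_next` are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a0 + a1 * x; returns (a0, a1, pearson_r)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a1 = sxy / sxx                 # slope
    a0 = mean_y - a1 * mean_x      # intercept
    return a0, a1, sxy / (sxx * syy) ** 0.5


# Synthetic values of the metric in two consecutive censuses (not real data):
# every city moves 40% of the way back towards the allometric law.
d_past = [-0.4, -0.2, -0.1, 0.0, 0.1, 0.3, 0.5]
d_next = [0.6 * d for d in d_past]
a0, a1, r = fit_line(d_past, d_next)
print(a0, a1, r)
```

In this synthetic case the slope is smaller than one, which corresponds to cities approaching the allometric law from both sides.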
In order to better understand the role of the parameter $A_1$, we consider the limit where $\Delta t$ is small to rewrite Eq. 3, $D_Y(t+\Delta t) = A_0 + A_1 D_Y(t)$, as the following differential equation
$$\frac{d D_Y(t)}{dt} = \frac{A_0 - (1 - A_1)\, D_Y(t)}{\Delta t}, \tag{4}$$
whose solution is
$$D_Y(t) = \frac{A_0}{1 - A_1} + C \exp\left[-\frac{(1 - A_1)}{\Delta t}\, t\right], \tag{5}$$
where $C$ is an integration constant. Thus, Eq. 5 predicts that $D_Y(t)$ will exponentially approach the value $A_0/(1-A_1)$ for $A_1 < 1$ and that it will exponentially increase over time when $A_1 > 1$. For $A_1 = 1$ we have a linear behavior for the evolution of $D_Y(t)$. It is worth remembering that $A_0$ is a very small number for all urban indicators, and hence the values of $D_Y(t)$ are actually approaching zero for $A_1 < 1$ (that is, the values of the urban indicators are getting closer to the value expected from the allometric law). We further observe that $\Delta t/(1-A_1)$ plays the role of a characteristic time and, for $A_1 < 1$ and a fixed $\Delta t$, the smaller the value of $A_1$ is, the faster we expect the changes in the urban indicator to be. For population-related indicators, we observe an increasing tendency in the values of $A_1$ and also that these values are among the largest ones, which agrees with the reduction of the Brazilian population growth in the last decades. For socio-economic indicators, we have, on the one hand, that the values of $A_1$ for child labor and unemployment have also increased (pointing to a slowdown of their dynamics); on the other hand, homicides and family income had their values of $A_1$ reduced, suggesting that their rates of change have increased. Apart from the evolution in the values of $A_1$, population-related indicators have (in general) larger values of $A_1$ than those observed for socio-economic indicators, indicating that the latter are more susceptible to changes driven by natural processes or public policies. Illiteracy is unique because its value of $A_1$, which was very close to one (for 2000 versus 1991), has increased, a result that agrees with the detachment from the allometry reported in Fig. 3 and suggests an acceleration in this process.
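The approach to (or departure from) the allometric law can be illustrated numerically. The sketch below assumes that the relationship between consecutive values of the metric takes the linear form D(t + Δt) = a0 + a1 D(t); the names `a0`, `a1`, `iterate_map` and `continuous_solution`, as well as all numerical values, are illustrative assumptions. It compares the census-by-census iteration with the continuous-time approximation obtained by treating Δt as small.

```python
import math


def iterate_map(d0, a0, a1, steps):
    """Apply D -> a0 + a1 * D repeatedly (one step per census interval)."""
    values = [d0]
    for _ in range(steps):
        values.append(a0 + a1 * values[-1])
    return values


def continuous_solution(d0, a0, a1, t, dt=1.0):
    """Solution of dD/dt = (a0 - (1 - a1) * D) / dt with D(0) = d0, for a1 != 1."""
    fixed_point = a0 / (1.0 - a1)
    return fixed_point + (d0 - fixed_point) * math.exp(-(1.0 - a1) * t / dt)


# Illustrative values: with a1 < 1 the metric decays towards a0 / (1 - a1),
# with a1 > 1 its magnitude grows at every step.
print(iterate_map(0.3, 0.0, 0.8, 3))
print([continuous_solution(0.3, 0.0, 0.8, t) for t in range(4)])
print(iterate_map(0.3, 0.0, 1.1, 3))
```

The two descriptions agree in their qualitative behavior but not step by step: the iteration multiplies the distance to the fixed point by a1 at each census, whereas the continuous approximation multiplies it by exp[-(1 - a1)], and these factors are close only when a1 is near one.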
In addition to being autocorrelated with its past values, the scale-adjusted metric $D_{Y_i}(t+\Delta t)$ for the urban indicator $Y_i$ also displays statistically significant cross-correlations with $D_{Y_k}(t)$ for other indicators $Y_k$ (S4 Fig.). These memory effects, together with the fact that the residuals surrounding the relationships $D_{Y_i}(t+\Delta t)$ versus $D_{Y_i}(t)$ are very close to Gaussian distributions (S5 Fig.) with practically constant standard deviations across windows (S6 Fig.), make these scale-adjusted metrics particularly well suited to linear regressions aimed at forecasting. We have thus adjusted the linear model (via the ordinary least-squares method)
$$D_{Y_i}(t+\Delta t) = C_0 + \sum_{k=1}^{8} C_k\, D_{Y_k}(t) + \eta(t) \tag{6}$$
by considering the relationships 2000 versus 1991 and 2010 versus 2000. In Eq. 6, $C_k$ is the linear coefficient quantifying the predictive power of $D_{Y_k}(t)$ on $D_{Y_i}(t+\Delta t)$ ($C_0$ is the intercept coefficient) and $\eta(t)$ is the noise term accounting for the effect of unmeasurable factors. The linear coefficients of each linear regression for the two combinations of years are shown in S1 Text. We note that these simple models account for 31%–97% of the observed variance in $D_{Y_i}(t+\Delta t)$ and that they correctly reproduce the average values of the scale-adjusted metric above and below the allometric laws for the years 2000 and 2010 using only data from the years 1991 and 2000, respectively (see S7 Fig.). We have further compared the distributions of the empirical values of $D_{Y_i}$ with the predictions of these linear models and observed that the agreement is remarkably good for the indicators elderly, female and male population as well as for illiteracy and income (S8 and S9 Figs.). Motivated by this good agreement, we propose to forecast the values of $D_{Y_i}$ in the year 2020 (the next Brazilian national census). In order to do so, we have considered that the linear coefficients $C_k$ are constant over time and employed the average value of $C_k$ over the two combinations of years used in Eq. 6 for predicting the values of $D_{Y_i}(2020)$. It is worth noting that by assuming $C_k$ constant, we are ignoring the evolution of socio-economic and policy factors. In an ideal scenario, one could track the evolution of the values of $C_k$ to achieve more reliable predictions. However, our data (that is, the two values for each $C_k$) do not enable us to probe possible evolutionary behaviors in the values of $C_k$. Even so, as pointed out by Bettencourt, the dynamics of the urban metrics seems to be dominated by long timescales (30 years), and thus the approach of constant coefficients should be seen as a first approximation. The gray bars in Fig. 3 show the predicted averages after grouping the cities into those with $D_{Y_i} > 0$ and those with $D_{Y_i} < 0$. We observe that the predictions for the average values basically keep the trends presented in the previous years; for unemployment, for which the trend was not very clear, the predictions place this indicator together with most other indicators, for which the average has been decreasing for cities initially above the allometric law and increasing for those initially below.
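The multilinear regression described above can be sketched in a few lines. The example below is a toy version under stated assumptions: it uses two predictors instead of eight, noise-free synthetic numbers instead of census data, and solves the normal equations by Gaussian elimination rather than calling a statistics library; the function names and all values are hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def fit_multilinear(predictors, target):
    """Least squares for target = c0 + sum_k c_k * predictors[k].

    Returns the coefficients [c0, c1, ...] and the fraction of variance explained.
    """
    rows = [[1.0] + list(values) for values in zip(*predictors)]
    m = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(m)]
    coeffs = solve_linear_system(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coeffs, r)) for r in rows]
    mean_y = sum(target) / len(target)
    ss_res = sum((y - f) ** 2 for y, f in zip(target, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in target)
    return coeffs, 1.0 - ss_res / ss_tot


# Synthetic metrics of two indicators in the earlier census (six hypothetical cities)
# and the metric of one indicator in the later census.
d_first = [-0.3, -0.1, 0.0, 0.2, 0.4, 0.1]
d_second = [0.2, -0.4, 0.1, 0.3, -0.1, 0.0]
d_target = [0.01 + 0.7 * a - 0.2 * b for a, b in zip(d_first, d_second)]
coeffs, r2 = fit_multilinear([d_first, d_second], d_target)
print(coeffs, r2)
```

A forecast in the spirit of the text would then apply the fitted coefficients to the most recent metrics, keeping the coefficients fixed; with real data the explained variance would of course be below one.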
In order to gain further information on the predictions, we build a geographic visualization of the expected changes in the scale-adjusted metric $D_Y$ between the years 2010 and 2020. The circles over the maps in Fig. 5 show the geographic location of Brazilian cities; the radii of these circles are proportional to and the circles are colored with shades of azure for cities where $D_Y(2020) < D_Y(2010)$ (the indicator is expected to decrease) and with shades of red for cities where $D_Y(2020) > D_Y(2010)$ (the indicator is expected to increase); in both cases, the darker the shade, the larger the absolute value of the difference $D_Y(2020) - D_Y(2010)$. Perhaps the most striking feature of these visualizations is the fact that the predicted changes appear spatially clustered for almost all indicators, which reflects the geographic inequalities existing in Brazil; however, some intriguing patterns are indicator-dependent.
For child labor, the scale-adjusted metric $D_Y$ is expected to increase around three of the most densely populated regions, which contain the metropolitan areas of São Paulo, Rio de Janeiro and almost all northeastern capitals; we further observe that a decrease in child labor cases is expected in most of the inner and southern cities. For the indicators elderly, female and male populations, the clustering of the changes in $D_Y$ is quite evident: elderly and female populations are foreseen to decrease in most of the northeastern cities and display an increasing tendency in a large part of the other regions (male population behaves anti-symmetrically to the female population). In the case of homicides, our model predicts a decrease in $D_Y$ for the vast majority of southern cities, and we further observe a stripe near the east coast where $D_Y$ is expected to decrease for most cities (both densely populated areas); on the other hand, inner cities (especially inner cities from the state of São Paulo and the northeastern region) are expected to increase $D_Y$, suggesting that this violent crime may be “moving” towards less populated areas of the interior of Brazil.
For illiteracy, again, the clustering of the changes is easily perceptible: we expect an increase in $D_Y$ for most of the northeastern and northernmost cities, while a decrease is predicted for the majority of the cities from other regions (excluding several inner cities of the southernmost region). For family income, we also observe a clustering in the changes, where most northern cities are expected to increase the value of $D_Y$ (especially the inner cities of these regions), while most cities in the central part of Brazil are expected to decrease the value of $D_Y$; these expected changes may be (at least in part) related to the “bolsa família” (family allowance) program, a large-scale social welfare program of the Brazilian government for providing financial aid to poor families via direct cash transfer (more than 14 million families were beneficiaries in 2013), because a large part of the families receiving this aid are from the north and northeast regions. It is worth noting that, in order to participate in the program, families must ensure that their children attend school, and thus one would expect a reduction in $D_Y$ for illiteracy in the same regions that concentrate the beneficiaries of the program, which was not predicted by the model. This result suggests that simply enforcing school attendance may not be efficient for reducing illiteracy, which can also be explained by the commonly observed poor conditions of the public school system of those regions. Finally, for unemployment, there is no clear clustering of the changes in $D_Y$; instead, we expect a decrease in $D_Y$ widespread throughout the Brazilian territory (except in the southernmost region, where we note the prevalence of light shades of red).
Our article argues that, despite being widely used in government reports and also in academic works, per capita indicators completely ignore the universal allometric laws that appear to rule the growth dynamics of cities. We argue that per capita indicators can be biased towards small or large cities depending on whether we have sublinear or superlinear allometries with the population size. We thus employed a scale-adjusted metric obtained by evaluating the difference between the actual values of urban indicators and the ones that are expected from the allometric relationships. When investigating the evolution of the scale-adjusted metrics, we have reported patterns that do not appear in the per capita indicators. The scale-adjusted metrics also display a linear correspondence with their past values, a feature that facilitates the use of linear regressions for modeling the urban indicators. By employing simple linear models for describing the scale-adjusted metrics based on their past values, we verified that these models account for 31%–97% of the observed variance and correctly reproduce the average values of the scale-adjusted metric. Assuming the linear coefficients to be constant over time, we present predictions for the values of the scale-adjusted metrics in the year 2020, when the next Brazilian census will happen. We observe that the predicted changes for the urban indicators appear (in most cases) spatially clustered, that is, forming regions where most cities are expected to increase or decrease the value of the scale-adjusted metric. Apart from this visualization of the predicted changes, we also provide in the supplementary materials a table (S1 Dataset) with the values of the scale-adjusted metrics for the three past censuses as well as the predictions for the next one. We believe that our analysis may find potential applications in the development of new policies and in resource allocation in the context of urban planning.
Finally, we note that the methods worked out here can also be directly applied in other contexts where allometries are present such as for economic indexes and biological quantities.
The data we analyzed consist of the population size and eight urban indicators for each Brazilian city in the years 1991, 2000 and 2010, in which the national census took place. We filter these data by selecting the 1605 cities for which all eight urban indicators were available, which corresponds to 28.8% of the total number of Brazilian cities but accounts for 76.5% of the total population of Brazil. These data are maintained and made freely available by the Department of Informatics of the Brazilian Public Health System (DATASUS). The eight urban metrics are defined as follows: Child labor: the proportion of the population aged 10 to 15 years who are working or looking for work during the reference week, in a given geographic area, in the current year; Elderly population: the number of inhabitants of a given city aged 60 years or older; Female population: the number of inhabitants of a given city who are female; Homicides: injuries inflicted by another person with intent to injure or kill, by any means; Illiteracy: the number of inhabitants in a given geographic area, in the current year, aged 15 years or older, who cannot read and write at least a simple note in the language they know; Family income: the average household income of residents in a given geographic area, in the current year, where family income is taken to be the sum of the monthly income of the household, in Reais (Brazilian currency), divided by the number of its residents; Male population: the number of inhabitants of a given city who are male; Unemployment: the number of inhabitants aged 16 years or older who are without work or looking for work during the reference week, in a given geographic area, in the current year.
Although other definitions exist, the results presented here have been obtained by considering that cities are the smallest administrative units with a local government (municipalities or municípios). The other commonly employed definition is the metropolitan area, which is composed of more than one municipality and is usually associated with the coalescence of several municipalities. As discussed by Bettencourt et al., the choice of the “unit of analysis” is crucial when studying properties of cities. Regarding the scaling analysis: on the one hand, the disaggregation of the correct urban definition can introduce a bias in the value of the scaling exponent (either by reducing or increasing its expectation value); on the other hand, the aggregation of the correct urban definition usually makes the allometry more linear. In fact, changes in the scaling exponents have been reported when choosing different definitions of city [29, 30, 31]. However, there is no fail-safe procedure for defining the correct boundaries of a city, and some urban indicators are actually more spatially restricted than others (for instance, homicides versus family income). We have also analyzed our data after aggregating the municipalities of the 39 metropolitan areas existing in Brazil and considering them together with the municipalities that do not belong to any metropolitan area. Despite the observation of relatively small changes in the scaling exponents, our conclusions remain unaltered under this scenario.
Fitting allometric laws between urban indicators and population
As we previously mentioned, several works have reported the existence of robust allometric relationships between several urban indicators and the population size. Regarding Brazilian cities, scaling laws have already been identified for several urban indicators [5, 7, 8, 9, 10, 13, 14, 12, 11], mainly because of the existence of reliable data made freely available by Brazilian agencies (DATASUS and IBGE). Here, we want to confirm that the urban scaling hypothesis holds for all our urban indicators and to check whether these allometries have changed over time. Specifically, we test the hypothesis that an urban indicator $Y$ can be described by a power-law function of the population size $N$, that is, $Y = \mathcal{A}\, N^{\beta}$, where $\beta$ is the allometric (or scaling) exponent and $\mathcal{A}$ is a constant. In order to do so, we have plotted the logarithm of each urban indicator against the logarithm of the population and adjusted a linear model via orthogonal distance regression (as implemented in the package scipy.odr of the Python library SciPy) to all these relationships. Although the empirical relationships present different degrees of scattering, they all display good-quality linear relationships (see the Pearson correlations in Table 1), which are well described by linear models in log-log scale (see Fig. 2 and S1 and S2 Figs.). From Table 1, we further note that the values of $\beta$ for illiteracy, family income and unemployment display a weak decreasing tendency over the years, while the other indicators show only small fluctuations (no clear evolutionary tendency). A weak decreasing tendency for unemployment also appears in the work of Ignazzi on the same data from the years 2000 and 2010. The values of $\beta$ thus classify our indicators into two groups: female population, homicides and unemployment have superlinear relationships with the population ($\beta > 1$), while child labor, elderly population, illiteracy, family income and male population have sublinear ones ($\beta < 1$).
It is worth noting that, although the allometric exponents are close to one for elderly, female and male populations, the allometries between these indicators and total population are almost perfectly correlated, producing values of $\beta$ that are very close to, but statistically different from, one. We further observe that the values of the allometric exponents reported here may differ slightly from previously reported ones due to the different fitting procedures as well as the different urban definitions; however, these discrepancies are often very small (see, for instance, the exponents for unemployment in the years 2000 and 2010 obtained by Ignazzi [11] by considering generalized least squares via the Cochrane-Orcutt procedure and another definition of city).
H.V.R. and L.G.A.A. designed the research, analyzed the data and prepared the figures. All authors wrote and reviewed the manuscript.
- 1. United Nations, Department of Economic and Social Affairs, Population Division (2014). World Urbanization Prospects: The 2014 Revision, Highlights (ST/ESA/SER.A/352). Available: http://esa.un.org/unpd/wup/Highlights/WUP2014-Highlights.pdf. Date of access: 2015 Jun 01.
- 2. Louf R, Barthelemy M (2014) How congestion shapes cities: from mobility patterns to scaling. Sci. Rep. 4: 5561.
- 3. Bettencourt LMA, Lobo J, Helbing D, Kuhnert C, West GB (2007) Growth, innovation, scaling, and the pace of life in cities. Proc. Natl. Acad. Sci. U. S. A. 104: 7301.
- 4. Arbesman S, Kleinberg JM, Strogatz SH (2009) Superlinear scaling for innovation in cities. Phys. Rev. E 79: 016115.
- 5. Bettencourt LMA, West GB (2010) A unified theory of urban living. Nature 467: 912.
- 6. Bettencourt LMA, Lobo J, Strumsky D, West GB (2010) Urban scaling and its deviations: revealing the structure of wealth, innovation and crime across cities. PLoS ONE 5: e13541.
- 7. Gomez-Lievano A, Youn H, Bettencourt LMA (2012) The statistics of urban scaling and their connection to Zipf’s law. PLoS ONE 7: e40393.
- 8. Alves LGA, Ribeiro HV, Mendes RS (2013) Scaling laws in the dynamics of crime growth rate. Physica A 392: 2672.
- 9. Alves LGA, Ribeiro HV, Lenzi EK, Mendes RS (2013) Distance to the scaling law: a useful approach for unveiling relationships between crime and urban metrics. PLoS ONE 8: e69580.
- 10. Alves LGA, Ribeiro HV, Lenzi EK, Mendes RS (2014) Empirical analysis on the connection between power-law distributions and allometries for urban indicators. Physica A 409: 175.
- 11. Ignazzi AC (2014) Scaling laws, economic growth, education and crime: evidence from Brazil. L’Espace Géographique 4: 324.
- 12. Melo HPM, Moreira AA, Batista E, Makse HA, Andrade JS (2014) Statistical signs of social influence on suicides. Sci. Rep. 4: 6239.
- 13. Mantovani MC, Ribeiro HV, Moro MV, Picoli S, Mendes RS (2011) Scaling laws and universality in the choice of election candidates. EPL 96: 48001.
- 14. Mantovani MC, Ribeiro HV, Lenzi EK, Picoli S, Mendes RS (2013) Engagement in the electoral processes: scaling laws and the role of political positions. Phys. Rev. E 88: 024802.
- 15. Samaniego H, Moses ME (2008) Cities as organisms: Allometric scaling of urban road networks. J. Transp. Land. Use 1: 21.
- 16. Louf R, Roth C, Barthelemy M (2014) Scaling in transportation networks. PLoS ONE 9: e102007.
- 17. Pumain D, Paulus F, Vacchiani-Marcuzzo C, Lobo J (2006) An evolutionary theory for interpreting urban scaling laws. Cybergeo 343: 20.
- 18. Pan W, Ghoshal G, Krumme C, Cebrian M, Pentland A (2013) Urban characteristics attributable to density-driven tie formation. Nat. Commun. 4: 1961.
- 19. Gordon MB (2010) A random walk in the literature on criminality: a partial and critical view on some statistical analysis and modeling approaches. Eur. J. Appl. Math. 21: 283.
- 20. Pumain D (2004) Scaling laws and urban systems. SFI Working Paper: 2004-02-002. Available: http://www.santafe.edu/media/workingpapers/04-02-002.pdf. Date of access: 2015 Jun 01.
- 21. Bettencourt LMA (2013) The origins of scaling in cities. Science 340: 1438.
- 22. Podobnik B, Horvatic D, Kenett DY, Stanley HE (2012) The competitiveness versus the wealth of a country. Sci. Rep. 2: 678.
- 23. Lobo J, Bettencourt LMA, Strumsky D, West GB (2013) Urban Scaling and the Production Function for Cities. PLoS ONE 8: e58407.
- 24. Bettencourt LMA, Lobo J, Youn H (2013) The hypothesis of urban scaling: formalization, implications and challenges. SFI Working Paper: 2013-01-004. Available: http://arxiv.org/abs/1301.5919. Date of access: 2015 Jun 01.
- 25. Ortman SG, Cabaniss AHF, Sturm JO, Bettencourt LMA (2014) The Pre-History of Urban Scaling. PLoS ONE 9: e87902.
- 26. Hesketh T, Xing ZW (2006) Abnormal sex ratios in human populations: causes and consequences. Proc. Natl. Acad. Sci. U. S. A. 103: 13271.
- 27. Brazil’s Public Healthcare System (SUS), Department of Data Processing (DATASUS), 2011. Available: http://www.datasus.gov.br/. Date of access: 2015 Jun 01.
- 28. Angel S, Sheppard SC, Civco DL, Buckley R, Chabaeva A, et al. (2005) The Dynamics of Global Urban Expansion. Washington DC: World Bank.
- 29. Schläpfer M, Bettencourt LMA, Grauwin S, Raschke M, Claxton R, Smoreda Z, West GB, Ratti C (2014) The scaling of human interactions with city size. J. R. Soc. Interface 11: 20130789.
- 30. Louf R, Barthelemy M (2014) Scaling: lost in the smog. Environ. Plann. B 41: 767.
- 31. Arcaute E, Hatna E, Ferguson P, Youn H, Johansson A, Batty M (2015) Constructing cities, deconstructing scaling laws. J. R. Soc. Interface 12: 20140745.
- 32. Jones E, Oliphant T, Peterson P, et al. (2001) SciPy: Open Source Scientific Tools for Python. Available: http://www.scipy.org/. Date of access: 2015 Jun 01.
Scale-adjusted metrics for Brazilian cities. Values of the scale-adjusted metrics for the eight urban indicators of Brazilian cities in the years 1991, 2000 and 2010, as well as the predictions obtained via the linear model (Eq. 6) for the year 2020.
Allometric laws with the population size. The scatter plots show the allometric relationships between the urban indicators (from top to bottom: child labor, elderly population, female population and homicides) and population size for the years 1991 (red dots), 2000 (blue dots) and 2010 (green dots) in log-log scale. The allometric exponents $\beta$ (see Methods Section for details on the calculation of $\beta$) are shown in the figures. See S2 Fig. for the other indicators.
Allometric laws with the population size. The same as S1 Fig. for the indicators illiteracy, family income, male population and unemployment.
Memory effects in the evolution of the scale-adjusted metrics. The purple dots show the values of the scale-adjusted metric in two consecutive census years for each city. The dashed lines are fits of the linear model (Eq. 3) obtained via ordinary least-squares regression. The values of the fitted parameters and their standard errors are shown in the plots and also summarized in Table 2.
Cross-correlations between the urban indicators. The matrix plot on the left shows the values of the Pearson correlation coefficient between the scale-adjusted metric for a given indicator (one indicator per row) in the year 2000 and all the other indicators in the year 1991 (one indicator per column). The right panel does the same for the years 2010 and 2000. The value inside each cell is the Pearson correlation, and each cell is also colored according to this value. We note that all indicators are strongly correlated with their own past values; furthermore, all indicators also display relevant correlations with at least one other indicator.
Cumulative distributions of the normalized fluctuations surrounding the relationships between the scale-adjusted metrics in consecutive census years. The plots show the cumulative distributions of the normalized residuals of the linear regressions between the scale-adjusted metrics in consecutive census years (Fig. 4 and S3 Fig.) for the years 2000-1991 (blue lines) and 2010-2000 (green lines) in comparison with the standard Gaussian (dashed lines). We also show the $p$-values of the Cramér von Mises method for testing the null hypothesis that the residuals are normally distributed. We observe that the normality of the data is rejected in most cases (probably due to the small heteroskedasticity present in these relationships; see S6 Fig.). However, no large differences are observed between the Gaussian cumulative curve and the empirical cumulative distributions, suggesting that the normalized fluctuations can be approximately described as standard Gaussian noise.
Window-evaluated standard deviation over the relationship between the scale-adjusted metrics in consecutive census years. These plots show the standard deviation of the scale-adjusted metrics versus the window-averaged value of the scale-adjusted metric, evaluated in five equally spaced windows taken from the relationship between the scale-adjusted metrics in consecutive census years (Fig. 4 and S3 Fig.) for the years 2000-1991 (left panel) and 2010-2000 (right panel). We note that the standard deviation can be approximated by a constant for most indicators in both combinations of years. We further observe that the small fluctuations in the standard deviation are probably the reason why the Cramér von Mises test has rejected the normality of the fluctuations shown in S5 Fig. When fitting the linear models of Eq. 6, we have also taken into account this small heteroskedasticity (as implemented in Stata 13, http://www.stata.com, via the robust option in the regress function), but the linear coefficients remain practically the same.
Comparisons between the average values of the scale-adjusted metrics obtained from the linear models and the empirical ones. We have applied the linear model of Eq. 6 for predicting the values of the scale-adjusted metrics in the year 2000 only using data from the year 1991, as well as for predicting them in the year 2010 only using data from the year 2000. In both cases, we have calculated the average for the predictions (gray bars) after grouping the cities into those above (A) and below (B) the allometric laws (in that year) and compared these results with the same averages evaluated using the empirical data (blue bars for the year 2000 and green bars for the year 2010). The error bars are 95% bootstrap confidence intervals for the average values. We observe that the predicted average values are in very good agreement with the empirical values for all urban indicators in both years.
Comparisons between the cumulative distributions of the scale-adjusted metrics obtained from the linear models and the empirical ones. We have obtained the values of the scale-adjusted metrics in the year 2000 using the linear model of Eq. 6 and by employing data from the year 1991. We then calculated the cumulative distribution functions (CDF) of the scale-adjusted metrics for the predicted values (black lines) and the empirical ones (blue lines). We observe that the agreement is very good for the population indicators, illiteracy and family income; for the other indicators, the model fails to reproduce the tails of the distributions.
Comparisons between the cumulative distributions of the scale-adjusted metrics obtained from the linear models and the empirical ones. The same as S8 Fig. for the predictions in the year 2010 obtained by employing data from the year 2000.