<p><em>Urban sensing: quantifying the sensing power of vehicle fleets (Gevorg Yeghikyan, 2020-09-25)</em></p>
<h2 id="introduction">Introduction</h2>
<p>Smart cities are increasingly being equipped with sensors measuring a variety of quantities indicative of the quality of the urban environment: air pollution, traffic congestion, air temperature, humidity, road quality, pedestrian density, parking spot occupancy, Wi-Fi accessibility, etc. All these measurements will fuel advanced urban analytics, and become a routine component in urban planning, policy making, and management.</p>
<p>However, the city-wide deployment of sensors has limited spatial coverage and comes at a significant cost, and the question of their optimal placement arises naturally. As a solution, the <a href="http://nrlweb.cs.ucla.edu/nrlweb/publication/download/498/vsnsurvey10.pdf">“drive-by”</a> paradigm has been recently proposed, whereby sensors are installed on third-party vehicles “scanning” the city. While most of the research on drive-by sensing has focused on engineering aspects, the key question of <strong>how many vehicles are required to adequately scan a city</strong> remained unanswered until recently. The answer intuitively depends on the mobility patterns of the vehicles in question. Among many candidates for the vehicle fleets, such as private cars, buses, and taxis, taxis are the natural choice for deploying the sensors as they are pervasive in the city and do not follow fixed routes. Is it possible to find out whether attaching sensors to taxi vehicles is a good idea? This is what we are going to do in this post.</p>
<h2 id="why-is-it-important">Why is it important?</h2>
<p>If we discover that a small number of taxis covers a large portion of the city, attaching sensors to those taxis could provide a cheap way of monitoring the various quantities mentioned in the introduction. In this post, we define a measure of this “covering” and discuss its analytic description, which agrees surprisingly well with empirical data from a ride sharing company called GG based in Yerevan. We will follow the arguments described in a recent <a href="https://www.pnas.org/content/116/26/12752">paper</a> from the <a href="http://senseable.mit.edu/">MIT Senseable Lab</a> to demonstrate the surprisingly huge potential of this method: just 30 randomly chosen taxi vehicles (less than 1% of the entire fleet) during a typical day cover on average <strong>more than a third</strong> of Yerevan city and <strong>almost 70%</strong> of the centrally located districts (Kentron, Ajapnyak, Kanaker-Zeytun, Nork-Marash)! This has huge implications for smart city projects as it demonstrates that drive-by sensing can be readily implemented in real projects at a relatively small cost.</p>
<h2 id="how-to-measure-urban-sensing">How to measure urban sensing?</h2>
<p>In order for a vehicle fleet to scan as large a portion of the urban street network as possible, a dense exploration of what the <a href="https://www.pnas.org/content/116/26/12752">authors</a> call the city’s spatio-temporal “volume” is required. The extent to which a vehicle fleet does this is called its <em>sensing power</em>.</p>
<p>Imagine a fleet of sensor-equipped vehicles \(\mathcal{V}\) moving in a city, scanning its street network \(S\) during a reference period \(\mathcal{T}\). Below you can see such a fleet of randomly chosen 30 taxis traversing the streets of Yerevan:
<img src="/images/urban sensing/Yerevan_30_gg.gif" alt="Alt Text" /></p>
<p>We represent the nodes of the street network \(S\) as potential pick-up and drop-off locations for the vehicles. Since we are going to work with street segments, we convert the set of GPS coordinates of a given vehicle to a trajectory \(T_{r}\), defined as a sequence of street segments \(T_{r} = (S_{i_1} , S_{i_2} , . . . )\). In order to achieve this, we need to match the taxi trajectories to the OpenStreetMap driving network segments. However, this is not an easy task, since a naive projection to the closest road segment yields incorrect results. Fortunately, this problem has been solved in <a href="https://www.microsoft.com/en-us/research/publication/hidden-markov-map-matching-noise-sparseness/">this</a> paper, which builds a Hidden Markov Model for identifying the most probable street network path, given a sequence of GPS points.</p>
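The cited paper builds a full Hidden Markov Model over the road network; the core idea, however, can be illustrated with a toy Viterbi decoder: emission scores favour segments close to each GPS fix, transition scores penalize implausible segment switches, and the jointly most probable segment sequence is returned. The distance-squared emission and the constant switch penalty below are illustrative simplifications of that paper's probabilities, not its actual implementation:

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b (all 2-D tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))  # clamp projection onto the segment
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def match_trajectory(points, segments, switch_penalty=2.0):
    """Viterbi decoding: emission log-prob = -distance^2 (Gaussian up to
    constants); transitions penalize switching to a different segment."""
    n_seg = len(segments)
    emit = lambda p, s: -point_segment_distance(p, *segments[s]) ** 2
    scores = [emit(points[0], s) for s in range(n_seg)]
    back = []
    for p in points[1:]:
        prev, scores, ptr = scores, [], []
        for s in range(n_seg):
            best = max(range(n_seg),
                       key=lambda r: prev[r] - (0 if r == s else switch_penalty))
            ptr.append(best)
            scores.append(prev[best] - (0 if best == s else switch_penalty)
                          + emit(p, s))
        back.append(ptr)
    # backtrack from the best final state
    path = [max(range(n_seg), key=lambda s: scores[s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]
```

With two parallel streets and one noisy GPS fix lying nearer the wrong street, naive nearest-segment projection jumps to the wrong segment for that fix, while the transition penalty keeps the Viterbi path on the correct one.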
<p>In order to measure the <em>sensing power</em> of a fleet, we quantify it as its covering fraction \(\langle C\rangle\), defined in the paper as the average fraction of street segments in \(S\) that are “covered” or sensed by a taxi during time period \(\mathcal{T}\) , assuming that \(N_{V}\) vehicles are selected
uniformly at random from the vehicle fleet \(\mathcal{V}\).</p>
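Before deriving anything analytically, this definition can be estimated directly by Monte Carlo once trajectories are available. A minimal sketch (names illustrative; trajectories are lists of segment ids):

```python
import random

def mean_covered_fraction(vehicle_trajs, n_segments, n_v, n_samples=500, seed=0):
    """Estimate <C>: the average fraction of the n_segments street segments
    covered by n_v vehicles drawn uniformly at random from the fleet."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        fleet = rng.sample(vehicle_trajs, n_v)  # random subfleet
        covered = set()
        for traj in fleet:
            covered.update(traj)
        total += len(covered) / n_segments
    return total / n_samples
```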
<p>What we want is an understanding of how the <em>sensing power</em> of a vehicle fleet in a city changes as a function of the size of the fleet.</p>
<h2 id="can-we-simulate-the-taxi-movements">Can we simulate the taxi movements?</h2>
<p>To model the taxis’ movements, the paper introduces the taxi-drive process. The model makes very simple (and wrong) assumptions:</p>
<ul>
<li>Taxis travel to randomly chosen destinations via shortest paths, with ties between multiple shortest paths broken at random.</li>
<li>Once a destination is reached, another destination is chosen, again at random, and the process repeats.</li>
<li>To reflect heterogeneities in real taxi trajectories, destinations are not selected uniformly at random. Instead, already visited nodes are chosen preferentially: the probability \(q_{n}(t)\) of selecting a node \(n\) is proportional to \(1+v_{n}^{\beta}(t)\), where \(v_{n}(t)\) is the total number of times node \(n\) has been visited during the time interval \((t_{start}, t)\) and \(\beta\) is a city-specific parameter to be tuned. This <a href="https://en.wikipedia.org/wiki/Preferential_attachment">“preferential attachment”</a> mechanism, colloquially known as the rich-get-richer effect and discussed mainly within the context of wealth distribution and tie formation in social networks, has been <a href="http://barabasi.com/publications/10/human-dynamics">shown</a> to capture the statistical properties of human mobility and, as it turns out, also captures those of taxis.</li>
</ul>
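The taxi-drive process above is easy to prototype. Below is a stdlib-only sketch on a toy graph (adjacency-list representation; all names illustrative): destinations are drawn with probability proportional to \(1+v_{n}^{\beta}(t)\), and shortest-path ties are broken at random via a shuffled BFS.

```python
import random
from collections import deque

def shortest_path(adj, src, dst, rng):
    """BFS shortest path from src to dst; neighbour order is shuffled so
    that ties between multiple shortest paths are broken at random."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        neighbours = list(adj[node])
        rng.shuffle(neighbours)
        for nxt in neighbours:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    path = [dst]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path[::-1]

def taxi_drive(adj, n_trips, beta=1.0, start=0, seed=0):
    """Taxi-drive process: repeatedly pick a destination with probability
    proportional to 1 + visits**beta, travel there via a random shortest
    path, and count traversals of each street segment (edge)."""
    rng = random.Random(seed)
    visits = {node: 0 for node in adj}
    segment_counts = {}
    position = start
    for _ in range(n_trips):
        candidates = [node for node in adj if node != position]
        weights = [1.0 + visits[node] ** beta for node in candidates]
        destination = rng.choices(candidates, weights=weights)[0]
        path = shortest_path(adj, position, destination, rng)
        for u, v in zip(path, path[1:]):
            edge = tuple(sorted((u, v)))
            segment_counts[edge] = segment_counts.get(edge, 0) + 1
        for node in path[1:]:
            visits[node] += 1
        position = destination
    return segment_counts
```

Normalizing the returned counts by their sum yields the segment popularities \(p_i\) used throughout the rest of the derivation.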
<p>We first calculate the street segment popularities \(p_{i}\), the relative frequency each street segment in the city is traversed by the vehicles in the fleet \(\mathcal{V}\) during \(\mathcal{T}\) (the \(p_{i}\) sum to 1) from the GG taxi data (<strong>spoiler</strong>: we are going to use the \(p_{i}\) to compute our target \(\langle C\rangle\)). Then, we run the simulation of the taxi-drive process with the same fleet size and the same number of trips for each vehicle. We discover that, despite the unrealistic assumptions in the model, the taxi-drive process captures quite well the statistical properties of real taxis’ trajectories.</p>
<p>In particular, the simulation produces surprisingly realistic distributions of segment popularities \(p_{i}\), in agreement with that obtained from the GG data:</p>
<p><img src="https://lexparsimon.github.io/images/urban sensing/Yerevan_segment_pops.jpg" alt="Yerevan segment popularity distribution" /></p>
<p>As one might expect from a preferential attachment mechanism, the distributions are heavy-tailed and closely follow <a href="https://en.wikipedia.org/wiki/Zipf%27s_law">Zipf’s law</a>. This good agreement between the model and the data is rather surprising. One might expect that the simplistic assumptions in the taxi-drive model miss many important factors in the real world, such as variations in street-segment lengths and driving speeds, the spatial arrangement of attractive destinations specific to a city, human-routing decisions, as well as heterogeneities in passenger-pickup and -dropoff times and locations. However, the results show that, at the macro level of segment popularity distributions, these complexities are irrelevant. Moreover, the original paper finds this agreement to be true across many cities varying in size and continent.</p>
<h2 id="analytic-derivation-of-langle-c_n_vrangle">Analytic derivation of \(\langle C_{N_V}\rangle\)</h2>
<p>Having computed the segment popularities \(p_{i}\), we now proceed to estimating the sensing power \(\langle C_{N_V}\rangle\) analytically by recognizing the connection with the <a href="https://en.wikipedia.org/wiki/Urn_problem">urn</a> problem in probability theory (this is why knowledge of basic probability theory is so useful!). The street segments are considered as “bins” into which “balls” are placed every time a taxi vehicle traverses that street segment. Using the segment popularities \(p_{i}\) as the bin probabilities, we can derive the analytic expression for \(\langle C_{N_V}\rangle\).</p>
<p>As stated in the paper, given the nontrivial topology of \(S\) and the non-Markovian nature of the taxi-drive process (this essentially means that the future depends not only on the present, as the Markov property would require, but also on the past), it is difficult to solve for \(\langle C_{N_V}\rangle\) exactly. However, it is possible to derive a very good approximation.</p>
<p>As we will see in a bit, it is actually easier to first solve for the trip-level covered fraction \(\langle C_{N_T}\rangle\), i.e., coverage as a function of the total number of trips \(N_T\), so we begin the derivation with this simpler case; the vehicle-level expression for \(\langle C_{N_V}\rangle\) will then be trivial to obtain.</p>
<p>Imagine we have a set \(\mathcal{P}\) of taxi trajectories. We define a taxi trajectory \(T_r\) as a sequence of street segments \(T_{r} = (S_{i_1} , S_{i_2} , . . . )\). The trajectories can be those obtained from the empirical data or from the simulation - this is unimportant for now. Given \(\mathcal{P}\), our strategy for finding \(\langle C_{N_T}\rangle\) will be to imagine street segments as bins into which balls are added every time they are traversed by a trajectory from \(\mathcal{P}\). Note that, in contrast to the traditional urn problem, a random number of balls is placed at each time step, since taxis’ trajectories have random length.</p>
<h3 id="trajectories-with-unit-length">Trajectories with unit length</h3>
<p>Let \(L\) be the random length of a trajectory. The special case of \(L=1\) is trivial to solve, since it corresponds exactly to the classical case: drawing \(N_T\) trips at random from \(\mathcal{P}\) amounts to adding \(N_T\) balls into \(N_S\) bins, where \(N_S\) is the total number of street segments, and each bin is selected with probability \(p_i\). Let \(\vec{M}=(M_{1}, M_{2}, \dots, M_{N_S})\), where \(M_i\) is the number of balls in the \(i\)-th bin. It is well known that the \(M_i\) are multinomially distributed:</p>
<p>\(\vec{M} \sim \operatorname{Multi}\left(N_{T}, \vec{p}\right)\),</p>
<p>where \(\vec{p}=\left(p_{1}, p_{2}, \dots p_{N_{s}}\right)\). Now, since a street segment is considered covered during time period \(\mathcal{T}\) if it is traversed by at least one vehicle from \(\mathcal{V}\), the (random) fraction of street segments covered is</p>
<p>\(C=\frac{1}{N_{S}} \sum_{i=1}^{N_{S}} 1_{\left(M_{i} \geq 1\right)}\),</p>
<p>where \(1_A\) is the indicator function of event \(A\). The expectation of this fraction \(C\) is</p>
<p>\(\langle C\rangle_{\left(N_{T}, L=1\right)}=\frac{1}{N_{S}} \sum_{i=1}^{N_{S}} \mathbb{P}_{N_{T}}\left(M_{i} \geq 1\right)\).</p>
<p>Now the trick is to note that the number of balls in each bin is a binomial random variable \(M_{i} \sim B i\left(N_{T}, p_{i}\right)\). The survival function of the binomial distribution is \(\mathbb{P}\left(M_{i} \geq 1\right)=1-\left(1-p_{i}\right)^{N_{T}}\). Finally, we substitute this into the previous equation and obtain:</p>
<p>\(\langle C\rangle_{\left(N_{T}, L=1\right)}=1-\frac{1}{N_{S}} \sum_{i=1}^{N_{S}}\left(1-p_{i}\right)^{N_{T}}\).</p>
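This closed form is cheap to evaluate; a minimal sketch:

```python
def expected_coverage(p, n_balls):
    """<C> = 1 - (1/N_S) * sum_i (1 - p_i)**n_balls: the expected fraction
    of bins (street segments) hit at least once after n_balls placements."""
    return 1.0 - sum((1.0 - pi) ** n_balls for pi in p) / len(p)
```

As a quick check, uniform popularities \(p_i = 1/N_S\) recover the textbook result \(1-(1-1/N_S)^{N_T}\), and coverage grows monotonically with the number of trips.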
<h3 id="trajectories-with-fixed-length">Trajectories with fixed length</h3>
<p>Trajectories of fixed (i.e., nonrandom) length \(L>1\) lead to spatial correlations between the counts \(M_i\) (note: in the classic urn problem, there is already a correlation among the \(M_i\), since their sum is constant and equal to the total number of balls placed). This is due to the fact that real trajectories are contiguous in space: a trajectory that covers a given segment cannot jump to a distant segment, but must continue onto one of the neighbouring segments at a node. Given the nontrivial topology of the street network \(S\), the correlations between bins are difficult to model (this would require working with the adjacency matrix of the street network \(S\)). To overcome this, the authors of the paper made the strong assumption that for \(N_{T} \gg 1\), the spatial correlations between bins are asymptotically zero. This assumption makes our task much easier: adding a trajectory of length \(L\) is now the same as adding \(L\) balls into (not necessarily contiguous) bins chosen <em>randomly</em> according to \(p_i\). This means that choosing \(N_T\) trajectories of fixed length \(L\) from \(\mathcal{P}\) is equivalent to placing \(L \cdot N_T\) balls into \(N_S\) bins: \(\langle C\rangle_{\left(N_{T}, L_{fixed}\right)}=\langle C\rangle_{\left(N_{T} \cdot L, L=1\right)}\). Hence, the expected value of the coverage can be obtained by modifying the result obtained in the unit-length case:</p>
<p>\(\langle C\rangle_{\left(N_{T}, L_{fixed}\right)}=1-\frac{1}{N_{S}} \sum_{i=1}^{N_{S}}\left(1-p_{i}\right)^{L \cdot N_{T}}\).</p>
<p>The assumption that neighbouring street segments are spatially uncorrelated is a gross simplification which essentially removes the spatial component from the model. However, as we will see in a bit, it produces results remarkably consistent with the data.</p>
<h3 id="trajectories-of-random-lengths">Trajectories of random lengths</h3>
<p>Now, generalizing to random \(L\) is easy.
Let \(S_{N_{T}}=\sum_{i=1}^{N_{T}} L_{i}\) be the number of segments covered by \(N_T\)
trajectories. By the <a href="https://en.wikipedia.org/wiki/Law_of_total_expectation">law of total expectation</a></p>
<p>\(\langle C\rangle_{\left(N_{T}, L\right)}=\sum_{n=0}^{\infty}\langle C\rangle_{\left(n, L_{f i x e d}\right)} \mathbb{P}\left(S_{N_{T}}=n\right)\),</p>
<p>where \(\langle C\rangle_{\left(n, L_{\text{fixed}}\right)}\) is given by the expression for fixed \(L\). Finding the probabilities \(\mathbb{P}\left(S_{N_{T}}=n\right)\) is trickier: we need to know the distribution of the trajectory lengths. In the paper, the authors show that \(L \sim \operatorname{Lognormal}\left(\tilde{\mu}, \tilde{\sigma}^{2}\right)\). We confirm this finding on the GG vehicle trajectories by fitting a lognormal distribution to the trajectory lengths, obtaining \((\tilde{\mu}, \tilde{\sigma},\langle L\rangle)=(3.00,0.86,27.80)\):</p>
<p><img src="https://lexparsimon.github.io/images/urban sensing/Yerevan_traj_length_distribution.jpg" alt="Yerevan trajectory length distribution" /></p>
<p>As a reminder: \(L\) measures the number of segments covered by a vehicle, not the distance of the trip.
Further, it has been shown that a sum of lognormal random variables is itself approximately lognormal, \(S_{N_{T}} \sim \operatorname{Lognormal}\left(\mu_{S}, \sigma_{S}^{2}\right)\) for some \(\mu_{S}\) and \(\sigma_{S}\). In order to choose appropriate \(\mu_{S}\) and \(\sigma_{S}\), the authors follow the <a href="http://leo.ugr.es/pgm2012/submissions/pgm2012_submission_6.pdf">Fenton-Wilkinson method</a>, in which \(\sigma_{S}^{2}=\ln \left(\frac{\exp \tilde{\sigma}^{2}-1}{N_{T}}+1\right)\) and \(\mu_{S}=\ln \left(N_{T} \exp (\tilde{\mu})\right)+\left(\tilde{\sigma}^{2}-\sigma_{S}^{2}\right) / 2\).
Plugging into the lognormal pdf formula, we obtain</p>
<p>\(\mathbb{P}\left(S_{N_{T}}=n\right)=\frac{1}{n \sigma_{S} \sqrt{2 \pi}} \mathrm{e}^{-\frac{\left(\ln n-\mu_{S}\right)^{2}}{2 \sigma_{S}^{2}}}\).</p>
<p>And substituting this into the previous equation above, we obtain the nasty</p>
<p>\(\langle C\rangle_{\left(N_{T}, L\right)}=\frac{1}{N_{S}} \sum_{n=1}^{\infty} \sum_{i=1}^{N_{S}}\left(1-\left(1-p_{i}\right)^{n}\right) \frac{1}{n \sigma_{S} \sqrt{2 \pi}} \mathrm{e}^{-\frac{\left(\ln n-\mu_{S}\right)^{2}}{2 \sigma_{S}^{2}}}\).</p>
<p>This equation completely models the trip-level fraction \(\langle C\rangle_{\left(N_{T}, L\right)}\). However, there is a non-trivial thing to notice: the sum over \(n\) is dominated by terms near the expected value of \(S_{N_T}\), so we simplify matters by replacing \(n\) with its expected value \(\langle L\rangle \cdot N_{T}\). This gives us the much simpler and nicer formula \(\langle C\rangle_{\left(N_{T}, L\right)}=\langle C\rangle_{\left(N_{T} \cdot \langle L\rangle,\ L=1\right)}\), or</p>
<p>\(\langle C\rangle_{N_{T}} \approx 1-\frac{1}{N_{S}} \sum_{i=1}^{N_{S}}\left(1-p_{i}\right)^{\langle L\rangle \cdot N_{T}}\).</p>
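As a sanity check on the Fenton-Wilkinson parameters used above: they are chosen by moment matching, so the approximating lognormal reproduces the exact mean and variance of a sum of \(N_T\) i.i.d. lognormals. A small sketch:

```python
import math

def fenton_wilkinson(mu, sigma, n_t):
    """Fenton-Wilkinson parameters (mu_S, sigma_S) of the lognormal that
    approximates a sum of n_t i.i.d. Lognormal(mu, sigma**2) variables."""
    sigma_s2 = math.log((math.exp(sigma ** 2) - 1.0) / n_t + 1.0)
    mu_s = math.log(n_t * math.exp(mu)) + (sigma ** 2 - sigma_s2) / 2.0
    return mu_s, math.sqrt(sigma_s2)
```

Plugging in the fitted Yerevan values \((\tilde{\mu}, \tilde{\sigma}) = (3.00, 0.86)\), one can verify numerically that the mean \(e^{\mu_S+\sigma_S^2/2}\) and variance \((e^{\sigma_S^2}-1)e^{2\mu_S+\sigma_S^2}\) of the approximation equal \(N_T\) times the mean and variance of a single trajectory length.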
<h3 id="extension-to-vehicle-level">Extension to vehicle level</h3>
<p>As promised, now we do the final step of obtaining the fraction \(\langle C\rangle_{N_V}\) as a function of the number of vehicles. Let \(B\) be the random number of segments that a random taxi in \(\mathcal{V}\) covers during the period \(\mathcal{T}\). As we can see in the plot below, \(B\) is also lognormally distributed:</p>
<p><img src="https://lexparsimon.github.io/images/urban sensing/Yerevan_trip_length_distribution.jpg" alt="Yerevan trip length distribution" /></p>
<p>Now, in the expression for \(\langle C\rangle_{N_{T}}\) we simply replace \(\langle L\rangle\) with \(\langle B\rangle\) and obtain our desired \(\langle C\rangle_{N_{V}}\):</p>
<p>\(\langle C\rangle_{N_{V}} \approx 1-\frac{1}{N_{S}} \sum_{i=1}^{N_{S}}\left(1-p_{i}\right)^{\langle B\rangle \cdot N_{V}}\).</p>
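The vehicle-level formula also makes the diminishing-returns behaviour easy to see numerically. A sketch with illustrative inputs (Zipf-like popularities over 1,000 segments and an assumed \(\langle B\rangle = 25\) segments per vehicle, not the Yerevan values):

```python
def fleet_coverage(p, mean_b, n_v):
    """<C>_{N_V} ~= 1 - (1/N_S) * sum_i (1 - p_i)**(mean_b * n_v)."""
    return 1.0 - sum((1.0 - pi) ** (mean_b * n_v) for pi in p) / len(p)

# Illustrative Zipf-like segment popularities: p_i proportional to 1/rank
weights = [1.0 / rank for rank in range(1, 1001)]
z = sum(weights)
p = [w / z for w in weights]
gains = [fleet_coverage(p, 25, n) for n in (10, 20, 30)]
```

Coverage rises with fleet size but each additional batch of vehicles adds less than the previous one: the popular segments are covered almost immediately, while the long Zipf tail of rarely visited segments is expensive to reach.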
<h2 id="results">Results</h2>
<p>Now that we have the analytical expression for \(\langle C\rangle_{N_{V}}\), we compute it in two ways:</p>
<ol>
<li>using the \(p_i\) obtained from the data</li>
<li>using the \(p_i\) obtained from the taxi-drive simulation</li>
</ol>
<p>Finally, we plot them against \(N_{V}\):</p>
<p><img src="https://lexparsimon.github.io/images/urban sensing/Yerevan_fraction.jpg" alt="Yerevan sensing power" /></p>
<p>We see that the covered fraction computed the first way is in near-perfect agreement with the data, while the second way, despite the oversimplified assumptions, also produces impressive results!
The rapid increase of the \(\langle C\rangle_{N_{V}}\) curves reveals that taxi fleets have large sensing power, easily covering the popular street segments, while the rarely visited, unpopular segments are more and more difficult to cover. We see a law of diminishing returns: while covering the entire city is difficult, a considerable fraction can be covered with relative ease at a low cost. As mentioned in the beginning, only about 30 taxi vehicles are needed to cover more than a third of the entire city, and up to 70% of the central districts.</p>
<p>Here is a nice animation summarizing the study in Yerevan:
<img src="/images/urban sensing/Yerevan_sensing.gif" alt="Alt Text" /></p>
<h2 id="conclusion">Conclusion</h2>
<p>Of course, there are many implicit assumptions and scenarios not considered in this study. For instance, we considered \(\mathcal{T}\) to be one day; that is, a segment is considered covered if it is traversed at least once by a vehicle from \(\mathcal{V}\) during one day. For many sensing purposes this is too coarse a temporal resolution. For a discussion of finer temporal resolutions, feel free to read the original study.
Further, since taxis are concentrated in commercial and touristic neighbourhoods, taxi-based urban sensing displays an inherent spatial bias. This bias could have negative effects, such as underscanning socioeconomically disadvantaged neighbourhoods. To overcome this, a hybrid approach to sensing could be attempted. Further yet, our analysis has focused on the fraction of raw segments covered: \(C=\frac{1}{N_{S}}\sum_{i} 1_{\left(M_{i} \geq 1\right)}\). A more accurate approach would be to weight segments by their lengths \(b_i\): \(C=\sum_{i} b_{i} 1_{\left(M_{i} \geq 1\right)} / \sum_{i} b_{i}\).</p>
<p>Despite these and other shortcomings, this work has shown the great potential of taxi-based urban sensing. It will furnish urban practitioners with large amounts of useful data and make possible the development of effective monitoring tools. We have revealed this to be possible with a surprisingly small number of sensors.</p>
<p><em>Urban policy in the time of <del>Cholera</del> Coronavirus (Gevorg Yeghikyan, 2020-02-03)</em></p>
<p><strong>You can learn the entire modelling, simulation and spatial visualisation of the epidemic spreading in cities using just Python in <a href="https://www.udemy.com/course/covid-19-urban-epidemic-modelling-in-python/?referralCode=220EF2D17E80758E78B5">this online course</a> or in <a href="https://skl.sh/30Vdq7J">this one</a>.</strong></p>
<h2 id="are-cities-prepared-for-epidemics">Are cities prepared for epidemics?</h2>
<p>The recent <a href="https://en.wikipedia.org/wiki/Novel_coronavirus_(2019-nCoV)">2019-nCoV Wuhan coronavirus</a> outbreak in China has sent shocks through financial markets and entire economies, and has duly triggered panic among the general population around the world. On 30 January 2020, 2019-nCoV was even <a href="https://www.bbc.com/news/world-51318246">designated</a> a global health emergency by the World Health Organization (WHO). At the time of this writing, no specific treatment verified by medical research standards has yet been discovered. Moreover, some key epidemiological metrics such as the <a href="https://en.wikipedia.org/wiki/Basic_reproduction_number">basic reproduction number</a> (the average number of people infected by an ill individual) are still unknown.
In our times of unprecedented global connectedness and mobility, such epidemics are a major threat on a global scale due to <a href="https://en.wikipedia.org/wiki/Small-world_network">small world network</a> effects. One could conjecture that conditional on a global catastrophic event (loosely defined as > 100mln casualties) happening in 2020, the most likely cause would be precisely some pandemic - not a nuclear disaster, not climate catastrophe, etc. This is further aggravated by worldwide rapid urbanisation, with our densely populated dynamic cities turning into propagation nodes in the disease diffusion network, thus becoming extremely vulnerable and fragile.</p>
<p>In this post, <strong>we will discuss what can happen when an epidemic strikes a city, what measures should immediately be taken, and what implications this has for urban planning, policy making, and management</strong>. We will take the city of Yerevan as our case study and will mathematically model and simulate the spread of the coronavirus in the city, looking at how urban mobility patterns affect the spread of the disease.</p>
<h2 id="urban-mobility">Urban mobility</h2>
<p>Effective, efficient, and sustainable urban mobility is of crucial importance for the functioning of modern cities. It has been <a href="https://www.nature.com/articles/s41467-019-12809-y">shown</a> to directly affect livability and economic output (GDP) of cities. However, <strong>in the event of an epidemic, it will add fuel to the fire</strong>, amplifying and propagating the disease spread.</p>
<p>So let’s begin by looking at the network of aggregated origin-destination (\(OD\)) flows on a uniform Cartesian grid in Yerevan to get an idea about the spatial structure of mobility patterns in the city:</p>
<p><img src="https://lexparsimon.github.io/images/coronavirus/OD.jpg" alt="Yerevan OD network" /></p>
<p>Further, if we look at the total inflow to the grid cells, we see a more or less monocentric spatial organisation with some cells with high daily inflow located off the center:</p>
<p><img src="https://lexparsimon.github.io/images/coronavirus/Yerevan_inflow.jpg" alt="Yerevan inflow" /></p>
<p>Now, imagine that an epidemic breaks out at a random location in the city. <strong>How will it spread? What can be done to contain it?</strong></p>
<h2 id="modelling-the-epidemic">Modelling the epidemic</h2>
<p>To answer these questions, we will build a simple <a href="https://en.wikipedia.org/wiki/Compartmental_models_in_epidemiology#The_SIR_model">compartmental model</a> to simulate the spread of the infectious disease in the city. As an epidemic breaks out, its <a href="https://www.ncbi.nlm.nih.gov/pubmed/23864593">transmission dynamics vary significantly</a>, depending on the geographical location of the initial infection and its connectivity with the rest of the city. This is one of the most important insights gained from recent, data-driven studies on epidemics in urban populations. However, as we will see further below, the various outcomes call for similar measures to contain the epidemic and to account for such a possibility in planning and managing cities.</p>
<p>Since running individual-based epidemic models is <a href="https://www.sciencedirect.com/science/article/pii/S0198971514001367">challenging</a>, and since our goal is to show general principles of epidemic spread in cities, and not to build a minutely calibrated and accurate epidemic model, we will follow the approach described in this <a href="https://www.nature.com/articles/s41467-017-02064-4">Nature article</a>, modifying the described classical <a href="https://en.wikipedia.org/wiki/Compartmental_models_in_epidemiology#The_SIR_model">SIR model</a> for our needs.</p>
<p>The model divides the population in four compartments. For each location \(i\) at time \(t\), the four compartments are as follows:</p>
<ul>
<li>\(S_{i, t}\): the number of individuals not yet infected or susceptible to the disease.</li>
<li>\(E_{i, t}\): the number of individuals infected but not yet infectious.</li>
<li>\(I_{i, t}\): the number of individuals infected with the disease and capable of spreading the disease to those in the susceptible group.</li>
<li>\(R_{i, t}\): the number of individuals who have been infected and then removed from the infected group, either due to recovery or due to death. Individuals in this group are not capable of contracting the disease again or transmitting the infection to others.</li>
</ul>
<p>In our simulations, time will be a discrete variable as the state of the system is modelled on a daily basis. In a fully susceptible population at location \(j\) at time \(t\), an outbreak happens with probability:</p>
\[h(t, j)=\frac{\beta_{t} S_{j, t}\left(1-\exp \left(-\sum_{k} m_{j, k}^{t} x_{k, t} y_{j, t}\right)\right)}{1+\beta_{t} y_{j, t}},\]
<p>where \(\beta_{t}\) is the transmission rate on day \(t\); \(m_{j, k}^{t}\) reflects mobility from location \(k\) to location \(j\); and \(x_{k, t}\) and \(y_{j, t}\) denote the fraction of the infected population at location \(k\) and the fraction of the susceptible population at location \(j\) on day \(t\), given by \(x_{k, t}=\frac{I_{k, t}}{N_{k}}\) and \(y_{j, t}=\frac{S_{j, t}}{N_{j}}\), where \(N_k\) and \(N_j\) are the population sizes at locations \(k\) and \(j\). We then simulate a stochastic process introducing the disease into locations with entirely susceptible populations, with \(I_{j, t+1}\) being a Bernoulli random variable with probability \(h(t, j)\).</p>
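The hazard formula can be transcribed directly. A sketch (for it to behave as a probability in \([0,1]\), the susceptible population is passed here as a fraction, an assumption of this sketch rather than the paper's exact convention):

```python
import math

def outbreak_probability(beta, s_j, y_j, m_to_j, x):
    """h(t, j): outbreak probability at location j, per the formula above.
    m_to_j[k] = mobility m_{j,k} from location k into j; x[k] = infected
    fraction at k; s_j, y_j = susceptible fraction at j."""
    exponent = sum(m * xk * y_j for m, xk in zip(m_to_j, x))
    return beta * s_j * (1.0 - math.exp(-exponent)) / (1.0 + beta * y_j)
```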
<p>Once the infections are introduced at random locations, the disease spreads both within those locations and is carried and transmitted in other locations by travelling individuals. <strong>This is where the urban mobility patterns characterised by the \(OD\) flow matrix play a crucial role</strong>.</p>
<p>Further, to formalise how the disease is transmitted by an infected person, we need the <em>basic reproduction number</em>, \(R_0\). It is defined as \(R_0 = \beta_{t}/\gamma\) where \(\gamma\) is the recovery rate, and can be thought of as the expected number of secondary infections after an infected individual comes into contact with a susceptible population. At the time of this writing, the basic reproduction number for the Wuhan coronavirus <a href="https://www.nejm.org/doi/full/10.1056/NEJMoa2001316">has been estimated</a> to be between 1.4 and 4. Let’s take the most frequently used average value of 2.4. However, we should note that it’s actually a random variable and the reported number is but the <em>expected</em> number.</p>
<p>We can now proceed to the model dynamics:</p>
\[\begin{equation}
\begin{aligned}
S_{j, t+1} &=S_{j, t} - S_{j, t}\frac{ I_{j, t}}{P_{j}} \frac{R_0}{D_{I}} + \sum_{k} s_{j, k}^{t} \alpha_{j, k}^{t} - \sum_{k} s_{k, j}^{t} \alpha_{k, j}^{t} \\
E_{j, t+1} &=E_{j, t} + S_{j, t}\frac{ I_{j, t}}{P_{j}} \frac{R_0}{D_{I}} - \frac{E_{j, t}}{D_{E}} + \sum_{k} e_{j, k}^{t} \alpha_{j, k}^{t} - \sum_{k} e_{k, j}^{t} \alpha_{k, j}^{t} \\
I_{j, t+1} &=I_{j, t} + \frac{E_{j, t}}{D_{E}} - \frac{I_{j, t}}{D_{I}} + \sum_{k} i_{j, k}^{t} \alpha_{j, k}^{t} - \sum_{k} i_{k, j}^{t} \alpha_{k, j}^{t} \\
R_{j, t+1} &=R_{j, t} + \frac{I_{j, t}}{D_{I}} + \sum_{k} r_{j, k}^{t} \alpha_{j, k}^{t} - \sum_{k} r_{k, j}^{t} \alpha_{k, j}^{t},
\end{aligned}
\end{equation}\]
<p>where</p>
<ul>
<li>\(P_{j}\) is the population in cell \(j\),</li>
<li>\(R_0\) is the <a href="https://en.wikipedia.org/wiki/Basic_reproduction_number">basic reproduction number</a>,</li>
<li>\(D_{E}\) is the <strong>incubation period</strong>, i.e., \(t_{\text{first symptom}} - t_{\text{infected}}\), with the assumption that during the incubation period the disease can’t be transmitted (which is not the case in real life!)</li>
<li>\(D_{I}\) is the <strong>infection period</strong>, i.e., the period the person can infect others,</li>
<li>\(s_{j, k}^{t}\) is the number of susceptible people that went from cell \(k\) to cell \(j\) at time \(t\),</li>
<li>\(\alpha_{j, k}^{t}\) is a parameter specifying the overall mobility intensity, which can be interpreted as the quarantine strength or as the <a href="https://en.wikipedia.org/wiki/Modal_share">modal share</a> of public transport vs. private car travel in the city.</li>
</ul>
<p>The model dynamics described in the above equations are very simple: on day \(t+1\) at location \(j\), we <em>subtract</em> from the susceptible population \(S_{j, t}\) the number of people infected within location \(j\) (the second term in the first equation) and the number of susceptible people that have left location \(j\) for other locations in the city (the last term in the first equation), and we <em>add</em> the number of susceptible people that have arrived from other locations in the city (the third term in the first equation). The other equations follow the same logic for the remaining <em>E</em>, <em>I</em>, and <em>R</em> groups.</p>
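The daily update can be sketched directly from the equations. Two simplifying assumptions in this sketch: travellers carry each compartment in proportion to its local share of the population, and \(\alpha\) is a single scalar rather than the pairwise \(\alpha_{j,k}^{t}\). Note that total population is conserved by construction, which makes a handy correctness check:

```python
def seir_step(S, E, I, R, P, OD, R0, D_E, D_I, alpha):
    """One daily update of the metapopulation SEIR dynamics (sketch).
    OD[j][k]: daily travellers from location j to k; alpha scales mobility."""
    n = len(S)

    def net_flow(comp, j):
        # each compartment travels in proportion to its local share
        inflow = sum(alpha * OD[k][j] * comp[k] / P[k] for k in range(n) if k != j)
        outflow = sum(alpha * OD[j][k] * comp[j] / P[j] for k in range(n) if k != j)
        return inflow - outflow

    nS, nE, nI, nR = [], [], [], []
    for j in range(n):
        new_exposed = S[j] * I[j] / P[j] * R0 / D_I  # local transmission term
        nS.append(S[j] - new_exposed + net_flow(S, j))
        nE.append(E[j] + new_exposed - E[j] / D_E + net_flow(E, j))
        nI.append(I[j] + E[j] / D_E - I[j] / D_I + net_flow(I, j))
        nR.append(R[j] + I[j] / D_I + net_flow(R, j))
    return nS, nE, nI, nR
```

Iterating this function day by day, with the \(OD\) matrix extracted from the mobility data, produces the simulations shown below.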
<h2 id="simulation-setup">Simulation setup</h2>
<p>For this analysis, we will use the aggregated \(OD\) flow matrix of a typical day obtained from GPS data provided by local ride sharing company <a href="https://www.ggtaxi.com">gg</a> as a proxy for the mobility patterns in Yerevan city. Next, we need the population counts in each \(250 \times 250m\) grid cell, which we approximate by proportionally scaling the extracted flow counts so that the total inflows in different locations sum up to approximately half of Yerevan’s population of 1.1 million. This is actually a bold assumption, but since varying this portion yielded very similar results, we will stick to it.</p>
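The scaling step can be sketched in a couple of lines (the 1.1 million total and the one-half share are the assumptions stated above):

```python
def scale_populations(inflows, city_population=1_100_000, share=0.5):
    """Approximate per-cell populations by scaling raw OD inflow counts so
    that they sum to a chosen share of the total city population."""
    factor = share * city_population / sum(inflows)
    return [f * factor for f in inflows]
```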
<h3 id="reduce-public-transport">Reduce public transport?</h3>
<p>For our first simulation, we will imagine a sustainable public transport-dominated future urban mobility with \(\alpha=0.9\):</p>
<p><img src="https://lexparsimon.github.io/images/coronavirus/virus_public_transport.jpg" alt="Yerevan high public transport share simulation" /></p>
<p>We see how fast the infected fraction of the population climbs, reaching the epidemic’s peak around days 8-10, with <strong>almost 70% of the population infected</strong>, while only a small portion (~10%) of the population has recovered from the disease. Towards day 100, when the epidemic has receded, we see <strong>the fraction of recovered individuals reach a staggering 90%</strong>! Now let’s see if reducing the intensity of public transport travel to something like \(\alpha=0.2\) has any effect on mitigating the epidemic spread. This can either be interpreted as <strong>taking drastic measures to reduce urban mobility (e.g., by issuing a curfew)</strong> or as <strong>increasing the share of private car travel to reduce chances of infection <em>during</em> travel</strong>.</p>
<p><img src="https://lexparsimon.github.io/images/coronavirus/virus_normal.jpg" alt="Yerevan low public transport share simulation" /></p>
<p>We see how the peak of the epidemic comes somewhere between day 16 and 20, with a <strong>significantly smaller infected group</strong> (~45%) and twice as many recovered (~20%). Towards the end of the epidemic, the fraction of susceptible individuals is also twice as big (~24% vs. ~12%), meaning that more people have escaped the disease. As expected, <strong>we see that the introduction of dramatic measures to temporarily bring urban mobility down has a big impact on the disease spreading dynamics</strong>.</p>
<h3 id="quarantine-popular-locations">Quarantine popular locations?</h3>
<p>Now, let’s see whether another intuitive idea, completely cutting off a few key popular locations, has the desired effect. To do this, let’s pick the locations in the top 1 percentile of mobility flows,</p>
<p><img src="https://lexparsimon.github.io/images/coronavirus/Yerevan_top_locs.jpg" alt="Yerevan top locations" /></p>
<p>and <strong>completely block all flow to and from those locations</strong>, effectively establishing there a quarantine regime. As we can see from the plot, in Yerevan these locations are mostly in the city center, with two other locations being the two largest shopping malls. Choosing a moderate \(\alpha = 0.5\), we obtain:</p>
<p><img src="https://lexparsimon.github.io/images/coronavirus/virus_without_malls.jpg" alt="Yerevan simulation without top locations" /></p>
<p>We see an even smaller fraction of infected individuals at the epidemic’s peak (~35%), and, most importantly, we see that towards the end of the epidemic, <strong>around half of the population remains susceptible, effectively escaping from contracting the infection!</strong></p>
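<p>The quarantine experiment above amounts to zeroing out all rows and columns of the flow tensor that correspond to the top-percentile locations. A minimal sketch, with a randomly generated flow tensor standing in for the real data:</p>

```python
import numpy as np

rng = np.random.default_rng(2)
r, n = 4, 200
flow = rng.random((r, n, n))   # flow[t, origin, destination]

# Total mobility (inflow + outflow) per cell, summed over time.
totals = flow.sum(axis=(0, 1)) + flow.sum(axis=(0, 2))
top = totals >= np.percentile(totals, 99)   # top 1 percentile of cells

flow[:, top, :] = 0.0   # block all outgoing flow from quarantined cells
flow[:, :, top] = 0.0   # block all incoming flow to quarantined cells
```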
<details>
<summary>Python code for running the epidemic spread model</summary>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">from</span> <span class="nn">collections</span> <span class="kn">import</span> <span class="n">namedtuple</span>
<span class="n">Param</span> <span class="o">=</span> <span class="n">namedtuple</span><span class="p">(</span><span class="s">'Param'</span><span class="p">,</span> <span class="s">'R0 DE DI I0 HospitalisationRate HospitalIters'</span><span class="p">)</span>
<span class="c1"># I0 is the distribution of infected people at time t=0, if None then randomly choose inf number of people
</span>
<span class="c1"># flow is a 3D matrix of dimensions r x n x n (i.e., 84 x 549 x 549),
# flow[t mod r] is the desired OD matrix at time t.
</span>
<span class="k">def</span> <span class="nf">seir</span><span class="p">(</span><span class="n">par</span><span class="p">,</span> <span class="n">distr</span><span class="p">,</span> <span class="n">flow</span><span class="p">,</span> <span class="n">alpha</span><span class="p">,</span> <span class="n">iterations</span><span class="p">,</span> <span class="n">inf</span><span class="p">):</span>
<span class="n">r</span> <span class="o">=</span> <span class="n">flow</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
<span class="n">n</span> <span class="o">=</span> <span class="n">flow</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span>
<span class="n">N</span> <span class="o">=</span> <span class="n">distr</span><span class="p">[</span><span class="mi">0</span><span class="p">].</span><span class="nb">sum</span><span class="p">()</span> <span class="c1"># total population, we assume that N = sum(flow)
</span>
<span class="n">Svec</span> <span class="o">=</span> <span class="n">distr</span><span class="p">[</span><span class="mi">0</span><span class="p">].</span><span class="n">copy</span><span class="p">()</span>
<span class="n">Evec</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">zeros</span><span class="p">(</span><span class="n">n</span><span class="p">)</span>
<span class="n">Ivec</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">zeros</span><span class="p">(</span><span class="n">n</span><span class="p">)</span>
<span class="n">Rvec</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">zeros</span><span class="p">(</span><span class="n">n</span><span class="p">)</span>
<span class="k">if</span> <span class="n">par</span><span class="p">.</span><span class="n">I0</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span>
<span class="n">initial</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">zeros</span><span class="p">(</span><span class="n">n</span><span class="p">)</span>
<span class="c1"># randomly choose inf infections
</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">inf</span><span class="p">):</span>
<span class="n">loc</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">randint</span><span class="p">(</span><span class="n">n</span><span class="p">)</span>
<span class="k">if</span> <span class="p">(</span><span class="n">Svec</span><span class="p">[</span><span class="n">loc</span><span class="p">]</span> <span class="o">></span> <span class="n">initial</span><span class="p">[</span><span class="n">loc</span><span class="p">]):</span>
<span class="n">initial</span><span class="p">[</span><span class="n">loc</span><span class="p">]</span> <span class="o">+=</span> <span class="mf">1.0</span>
<span class="k">else</span><span class="p">:</span>
<span class="n">initial</span> <span class="o">=</span> <span class="n">par</span><span class="p">.</span><span class="n">I0</span>
<span class="k">assert</span> <span class="p">((</span><span class="n">Svec</span> <span class="o"><</span> <span class="n">initial</span><span class="p">).</span><span class="nb">sum</span><span class="p">()</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span>
<span class="n">Svec</span> <span class="o">-=</span> <span class="n">initial</span>
<span class="n">Ivec</span> <span class="o">+=</span> <span class="n">initial</span>
<span class="n">res</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">zeros</span><span class="p">((</span><span class="n">iterations</span><span class="p">,</span> <span class="mi">5</span><span class="p">))</span>
<span class="n">res</span><span class="p">[</span><span class="mi">0</span><span class="p">,:]</span> <span class="o">=</span> <span class="p">[</span><span class="n">Svec</span><span class="p">.</span><span class="nb">sum</span><span class="p">(),</span> <span class="n">Evec</span><span class="p">.</span><span class="nb">sum</span><span class="p">(),</span> <span class="n">Ivec</span><span class="p">.</span><span class="nb">sum</span><span class="p">(),</span> <span class="n">Rvec</span><span class="p">.</span><span class="nb">sum</span><span class="p">(),</span> <span class="mi">0</span><span class="p">]</span>
<span class="n">realflow</span> <span class="o">=</span> <span class="n">flow</span><span class="p">.</span><span class="n">copy</span><span class="p">()</span> <span class="c1"># copy!
</span>
<span class="c1"># The two lines below normalise the flows and then multiply them by the alpha values.
</span> <span class="c1"># This is actually the "wrong" the way to do it because alpha will not be a *linear* measure
</span> <span class="c1"># representing lockdown strength but a *nonlinear* one.
</span> <span class="c1"># The normalisation strategy has been chosen for demonstration purposes of numpy functionality.
</span> <span class="c1"># (Optional) can you rewrite this part so that alpha remains a linear measure of lockdown strength? :)
</span> <span class="n">realflow</span> <span class="o">=</span> <span class="n">realflow</span> <span class="o">/</span> <span class="n">realflow</span><span class="p">.</span><span class="nb">sum</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="mi">2</span><span class="p">)[:,:,</span> <span class="n">np</span><span class="p">.</span><span class="n">newaxis</span><span class="p">]</span>
<span class="n">realflow</span> <span class="o">=</span> <span class="n">alpha</span> <span class="o">*</span> <span class="n">realflow</span>
<span class="n">history</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">zeros</span><span class="p">((</span><span class="n">iterations</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="n">n</span><span class="p">))</span>
<span class="n">history</span><span class="p">[</span><span class="mi">0</span><span class="p">,</span><span class="mi">0</span><span class="p">,:]</span> <span class="o">=</span> <span class="n">Svec</span>
<span class="n">history</span><span class="p">[</span><span class="mi">0</span><span class="p">,</span><span class="mi">1</span><span class="p">,:]</span> <span class="o">=</span> <span class="n">Evec</span>
<span class="n">history</span><span class="p">[</span><span class="mi">0</span><span class="p">,</span><span class="mi">2</span><span class="p">,:]</span> <span class="o">=</span> <span class="n">Ivec</span>
<span class="n">history</span><span class="p">[</span><span class="mi">0</span><span class="p">,</span><span class="mi">3</span><span class="p">,:]</span> <span class="o">=</span> <span class="n">Rvec</span>
<span class="n">eachIter</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">zeros</span><span class="p">(</span><span class="n">iterations</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span>
<span class="c1"># run simulation
</span> <span class="k">for</span> <span class="nb">iter</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">iterations</span> <span class="o">-</span> <span class="mi">1</span><span class="p">):</span>
<span class="n">realOD</span> <span class="o">=</span> <span class="n">realflow</span><span class="p">[</span><span class="nb">iter</span> <span class="o">%</span> <span class="n">r</span><span class="p">]</span>
<span class="n">d</span> <span class="o">=</span> <span class="n">distr</span><span class="p">[</span><span class="nb">iter</span> <span class="o">%</span> <span class="n">r</span><span class="p">]</span> <span class="o">+</span> <span class="mi">1</span>
<span class="k">if</span> <span class="p">((</span><span class="n">d</span><span class="o">></span><span class="n">N</span><span class="o">+</span><span class="mi">1</span><span class="p">).</span><span class="nb">any</span><span class="p">()):</span> <span class="c1">#assertion!
</span> <span class="k">print</span><span class="p">(</span><span class="s">"Houston, we have a problem!"</span><span class="p">)</span>
<span class="k">return</span> <span class="n">res</span><span class="p">,</span> <span class="n">history</span>
<span class="c1"># N = S + E + I + R
</span>
<span class="n">newE</span> <span class="o">=</span> <span class="n">Svec</span> <span class="o">*</span> <span class="n">Ivec</span> <span class="o">/</span> <span class="n">d</span> <span class="o">*</span> <span class="p">(</span><span class="n">par</span><span class="p">.</span><span class="n">R0</span> <span class="o">/</span> <span class="n">par</span><span class="p">.</span><span class="n">DI</span><span class="p">)</span>
<span class="n">newI</span> <span class="o">=</span> <span class="n">Evec</span> <span class="o">/</span> <span class="n">par</span><span class="p">.</span><span class="n">DE</span>
<span class="n">newR</span> <span class="o">=</span> <span class="n">Ivec</span> <span class="o">/</span> <span class="n">par</span><span class="p">.</span><span class="n">DI</span>
<span class="n">Svec</span> <span class="o">-=</span> <span class="n">newE</span>
<span class="n">Svec</span> <span class="o">=</span> <span class="p">(</span><span class="n">Svec</span>
<span class="o">+</span> <span class="n">np</span><span class="p">.</span><span class="n">matmul</span><span class="p">(</span><span class="n">Svec</span><span class="p">.</span><span class="n">reshape</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span><span class="n">n</span><span class="p">),</span> <span class="n">realOD</span><span class="p">)</span>
<span class="o">-</span> <span class="n">Svec</span> <span class="o">*</span> <span class="n">realOD</span><span class="p">.</span><span class="nb">sum</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="p">)</span>
<span class="n">Evec</span> <span class="o">=</span> <span class="n">Evec</span> <span class="o">+</span> <span class="n">newE</span> <span class="o">-</span> <span class="n">newI</span>
<span class="n">Evec</span> <span class="o">=</span> <span class="p">(</span><span class="n">Evec</span>
<span class="o">+</span> <span class="n">np</span><span class="p">.</span><span class="n">matmul</span><span class="p">(</span><span class="n">Evec</span><span class="p">.</span><span class="n">reshape</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span><span class="n">n</span><span class="p">),</span> <span class="n">realOD</span><span class="p">)</span>
<span class="o">-</span> <span class="n">Evec</span> <span class="o">*</span> <span class="n">realOD</span><span class="p">.</span><span class="nb">sum</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="p">)</span>
<span class="n">Ivec</span> <span class="o">=</span> <span class="n">Ivec</span> <span class="o">+</span> <span class="n">newI</span> <span class="o">-</span> <span class="n">newR</span>
<span class="n">Ivec</span> <span class="o">=</span> <span class="p">(</span><span class="n">Ivec</span>
<span class="o">+</span> <span class="n">np</span><span class="p">.</span><span class="n">matmul</span><span class="p">(</span><span class="n">Ivec</span><span class="p">.</span><span class="n">reshape</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span><span class="n">n</span><span class="p">),</span> <span class="n">realOD</span><span class="p">)</span>
<span class="o">-</span> <span class="n">Ivec</span> <span class="o">*</span> <span class="n">realOD</span><span class="p">.</span><span class="nb">sum</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="p">)</span>
<span class="n">Rvec</span> <span class="o">+=</span> <span class="n">newR</span>
<span class="n">Rvec</span> <span class="o">=</span> <span class="p">(</span><span class="n">Rvec</span>
<span class="o">+</span> <span class="n">np</span><span class="p">.</span><span class="n">matmul</span><span class="p">(</span><span class="n">Rvec</span><span class="p">.</span><span class="n">reshape</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span><span class="n">n</span><span class="p">),</span> <span class="n">realOD</span><span class="p">)</span>
<span class="o">-</span> <span class="n">Rvec</span> <span class="o">*</span> <span class="n">realOD</span><span class="p">.</span><span class="nb">sum</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="p">)</span>
<span class="n">res</span><span class="p">[</span><span class="nb">iter</span> <span class="o">+</span> <span class="mi">1</span><span class="p">,:]</span> <span class="o">=</span> <span class="p">[</span><span class="n">Svec</span><span class="p">.</span><span class="nb">sum</span><span class="p">(),</span> <span class="n">Evec</span><span class="p">.</span><span class="nb">sum</span><span class="p">(),</span> <span class="n">Ivec</span><span class="p">.</span><span class="nb">sum</span><span class="p">(),</span> <span class="n">Rvec</span><span class="p">.</span><span class="nb">sum</span><span class="p">(),</span> <span class="mi">0</span><span class="p">]</span>
<span class="n">eachIter</span><span class="p">[</span><span class="nb">iter</span> <span class="o">+</span> <span class="mi">1</span><span class="p">]</span> <span class="o">=</span> <span class="n">newI</span><span class="p">.</span><span class="nb">sum</span><span class="p">()</span>
<span class="n">res</span><span class="p">[</span><span class="nb">iter</span> <span class="o">+</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">4</span><span class="p">]</span> <span class="o">=</span> <span class="n">eachIter</span><span class="p">[</span><span class="nb">max</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="nb">iter</span> <span class="o">-</span> <span class="n">par</span><span class="p">.</span><span class="n">HospitalIters</span><span class="p">)</span> <span class="p">:</span> <span class="nb">iter</span><span class="p">].</span><span class="nb">sum</span><span class="p">()</span> <span class="o">*</span> <span class="n">par</span><span class="p">.</span><span class="n">HospitalisationRate</span>
<span class="n">history</span><span class="p">[</span><span class="nb">iter</span> <span class="o">+</span> <span class="mi">1</span><span class="p">,</span><span class="mi">0</span><span class="p">,:]</span> <span class="o">=</span> <span class="n">Svec</span>
<span class="n">history</span><span class="p">[</span><span class="nb">iter</span> <span class="o">+</span> <span class="mi">1</span><span class="p">,</span><span class="mi">1</span><span class="p">,:]</span> <span class="o">=</span> <span class="n">Evec</span>
<span class="n">history</span><span class="p">[</span><span class="nb">iter</span> <span class="o">+</span> <span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">,:]</span> <span class="o">=</span> <span class="n">Ivec</span>
<span class="n">history</span><span class="p">[</span><span class="nb">iter</span> <span class="o">+</span> <span class="mi">1</span><span class="p">,</span><span class="mi">3</span><span class="p">,:]</span> <span class="o">=</span> <span class="n">Rvec</span>
<span class="k">return</span> <span class="n">res</span><span class="p">,</span> <span class="n">history</span>
</code></pre></div> </div>
</details>
<p><br /></p>
<p>Here is a small animation visualising the dynamics of the high public transport share scenario:</p>
<p><img src="/images/coronavirus/coronavirus_60_small.gif" alt="Yerevan epidemic spread animation, high public transport share scenario" /></p>
<h2 id="conclusion">Conclusion</h2>
<p>While by no means claiming accurate epidemic modelling (or even any substantial knowledge in epidemiology beyond the basics), we aimed in this post to gain a first insight into how network effects come into play in an urban setting during an infectious disease outbreak. With ever increasing population densities, mobility, and dynamics, our cities become more exposed to “black swans” and more fragile. And since <strong>you can’t fetch the coffee if you’re dead, smart and sustainable cities will be meaningless without effective and efficient crisis handling capabilities and mechanisms.</strong> For instance, we saw that introducing quarantine regimes in key locations, or taking draconian measures to curb mobility, can be instrumental during such a health crisis. However, a further important question is <strong>how to implement such measures while minimizing damage and loss to the functioning of the city and its economy?</strong></p>
<p>Further yet, the exact epidemic spreading mechanisms of infectious diseases are <a href="https://link.springer.com/chapter/10.1007/978-1-4614-4496-1_4">still an active area of research</a>, and advances in this field will have to be communicated to and integrated into urban planning, policy making, and management to make our cities safe and <a href="https://en.wikipedia.org/wiki/Antifragility">antifragile</a>.</p>Medieval urbanism: lessons for contemporary cities2020-02-03T00:00:00+00:002020-02-03T00:00:00+00:00https://lexparsimon.github.io/medieval<h2 id="how-do-cities-grow">How do cities grow?</h2>
<p>Whether large cities are just scaled-up versions of small cities is still an <a href="https://www.pnas.org/content/115/10/2317">open question</a>, and an important one at that.
In order to plan better cities, we need an improved understanding of how geography, urban economy, social and physical networks and political institutions come together to bring about growth or decline of cities. To achieve this, a thorough outlook on how urban settlements evolved over time is required. <a href="https://www.ucpress.edu/book/9780520081154/civilization-and-capitalism-15th-18th-century-vol-ii">Substantial research</a> claims that the social, economic, political, and organizational structures and their innovations that would triumph in “modernity,” “capitalism,” and the Industrial Revolution are rooted in and developed from medieval European settlements.
On the other hand, medieval cities were qualitatively different: much smaller, agrarian, with a lower productivity, simpler technologies, and no organized market economy. Strictly hierarchical institutions such as the feudal government, Church, and guilds exerted <a href="https://uncpress.org/book/9780807844984/wage-labor-and-guilds-in-medieval-europe/">a much stronger influence</a> on the organization of social and economic life in medieval cities compared to analogous structures in modern cities. This influence manifested itself in segregated, corporate societies in which social groupings limited social and economic interactions and opportunities for individuals and families. The modern Western “free flow” of ideas, people, and goods was alien to medieval cities.</p>
<p><strong>These two different and at first sight conflicting perspectives raise the empirical question: Were medieval cities fundamentally different from modern cities? Or is there a continuum of urban processes, form and function from medieval to modern cities?</strong></p>
<p>Far from being of merely historical interest, such questions matter for building a broad, historically informed theory of urbanization. In this post, we will narrow them down and try to <strong>find out whether hierarchical institutions such as the Church or guilds limited and constrained social mixing and interaction in cities, thereby affecting urban economic and spatial growth</strong>.
We will look at a dataset of built-up area and resident populations of 173 medieval cities circa 1300 AD, uncover their relationship with simple statistical techniques, and interpret the results in relation to two models from <a href="https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0162678#:~:text=Despite%20their%20many%20structural%20differences,within%20a%20given%20urban%20system.">settlement scaling theory</a>.
The <a href="https://science.sciencemag.org/content/340/6139/1438/tab-pdf">first model</a>, originally developed for describing modern cities, constructs the built-up area of cities as a function of their population size, considering the city as an unconstrained non-hierarchical socio-economic network embedded in space. The <a href="https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0162678#:~:text=Despite%20their%20many%20structural%20differences,within%20a%20given%20urban%20system.">second model</a> builds on top of the first model by adding restrictions to social and economic interactions due to the impact of medieval institutions on social networking. This latter model predicts that, if this restrictive impact is strong, cities will have weak agglomeration effects, which would be visible in the relationship between built-up area and population.</p>
<h2 id="spatial-and-social-organisation-models-in-cities">Spatial and social organisation models in cities</h2>
<p>On the one hand, one can imagine a medieval city as a self-organized, non-hierarchical social and spatial network entity. If so, interactions among people are only limited by transaction, transportation, and <a href="https://en.wikipedia.org/wiki/Opportunity_cost">opportunity costs</a>, and the so-called “<a href="https://en.wikipedia.org/wiki/Matching_theory_(economics)">matching costs</a>” between the needs, skills, and resources of individuals. In such a network, interactions between any two persons are not restricted and occur freely. We will call this view of cities the <em>social reactor model</em>.
However, as already mentioned, such hierarchically organized medieval institutions as the Church, feudal authorities, guilds, family ties, etc., exerted huge power and control in medieval cities. This forms the basis for the hypothesis that these institutions strongly regulated the contacts between people in different social groups, hence restricting the social interactions that boost economic
productivity, flows of ideas, and the creation of knowledge and innovation. We shall call this view of cities the <em>structured social interaction model</em> and discuss the corresponding mathematical model based on hierarchical graphs capturing the mentioned social and economic regulation. We will then statistically analyse the available data and test our hypothesis by interpreting the results of the data analysis from the perspective of these models.</p>
<h3 id="the-social-reactor-model">The Social Reactor Model</h3>
<p>The fundamental idea behind most urban socioeconomic theories, including classical models of geography and urban planning, is the concept of a “spatial equilibrium”. In its <a href="https://en.wikipedia.org/wiki/Bid_rent_theory">simplest form</a> formulated by William Alonso, it states that within a city, individuals and firms choose their location by balancing land rents, transportation costs, and economic preferences, given the available resources. This typically leads to a monocentric spatial organisation with the highest land prices at the city core (the most favourable location in terms of accessibility and low transportation costs), decreasing as one moves away from the city center.
Urban or settlement scaling theory, on the other hand, has the same basic elements as the Alonso model, except that it offers a more refined modelling framework relating the microprocesses and physical structures in cities to their population through a scaling exponent, as described in what follows.
The core idea behind settlement scaling theory is that all human settlements — irrespective of scale or socio-economic complexity — share essential quantitative similarities in terms of general form and function. This comes from the <a href="https://journals.openedition.org/cybergeo/2519">multilayered gains</a> from social agglomeration, whether for economic specialization, innovation, shared infrastructure, common pool of workforce, defence, religion, or trade. In their essence, <strong>they model the relationship between various socio-economic characteristics of a city to its population</strong>. A <a href="https://science.sciencemag.org/content/340/6139/1438/tab-pdf">key difference</a> between urban economics models and urban scaling theory is the replacement of production or utility functions common in economics with a socio-economic network of interactions.</p>
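<p>The monocentric outcome of the Alonso model can be illustrated with a one-line bid-rent sketch; the core rent \(r_0\) and transport cost \(t\) per unit distance below are hypothetical values:</p>

```python
import numpy as np

r0, t = 100.0, 2.0                  # hypothetical core rent and transport cost
x = np.linspace(0.0, 60.0, 121)     # distance from the city center
rent = np.maximum(r0 - t * x, 0.0)  # linear bid-rent curve, floored at zero
```

<p>Rent is highest at \(x=0\) and declines with distance, vanishing at the city edge \(x = r_0 / t\).</p>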
<p>That said, let’s derive the expected relationship between the population of a city, \(N\), and its built-up area, \(A\). Unlike classical urban economic models, we <em>do not</em> have to assume a radial monocentric city structure. To do this, we balance the average benefit to an individual from social-economic interactions against the associated transportation costs:</p>
\[\frac{GN}{A} = \epsilon A^{\frac{H}{2}}.\]
<p>In the equation, \(G\) is the net <a href="https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0162678#:~:text=Despite%20their%20many%20structural%20differences,within%20a%20given%20urban%20system.">benefit per interaction</a> times the area an individual covers over time. Given the differences in social relationships, economic productivity, and innovation in medieval vs. modern cities, \(G\) can vary significantly across time. \(\epsilon\) is the cost of movement per unit length, which is a function of technology (walking, horse riding, etc.). \(H\), another ingredient in specifying the cost associated to individual interactions in a city, is a fractal dimension characterising how people explore a city. For \(H=1\), individuals move in the city through line-like trajectories, while for \(H \rightarrow 2\), they explore the city as an area, exhaustively. In the opposite limit of \(H \rightarrow 0\), individuals remain constrained to a single place (the trajectory is a static point), and the city effectively ceases to be a socio-economic network of interactions.
With some simple high school algebra tweaking, we obtain:</p>
\[GN = \epsilon A^{\frac{H+2}{2}}, \
A^{\frac{H+2}{2}} = \frac{G}{\epsilon}N, \
A = aN^\alpha,\]
<p>where \(\alpha := \frac{2}{H+2}\) and \(a := (\frac{G}{\epsilon})^\alpha\).
This simple result describes some of the most important properties of urban settlements. First, if benefits from social interactions are small relative to transportation costs, which was the case in medieval cities, then the prefactor, \(a\), will be small and the city will be quite dense. Note also that if \(H = 0\), as could have happened in a segregated settlement, the total built-up area, \(A\), becomes proportional to population, with no agglomeration effects whatsoever. For \(H = 1\), one obtains the special exponent value, \(\alpha = 2/3\), a situation <a href="https://journals.plos.org/plosone/article/file?type=printable&id=10.1371/journal.pone.0087902">described</a> as the amorphous settlement model, because it neglects any spatial structure and organisation in cities. In such settlements, population density, \(n\), increases
rapidly with population size as \(n(N)=\frac{N}{A(N)}=a^{-1} N^{1 / 3}\).</p>
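<p>A quick numerical sanity check of these relations for \(H = 1\), with hypothetical values for \(G\) and \(\epsilon\):</p>

```python
import numpy as np

H = 1.0
G, eps = 2.0, 1.0            # hypothetical benefit and movement-cost values
alpha = 2.0 / (H + 2.0)      # expected: 2/3
a = (G / eps) ** alpha

N = np.logspace(3, 6, 50)    # population sizes
A = a * N ** alpha           # built-up area A = a * N^alpha
density = N / A

# The log-log slope of density vs. population should be 1/3.
slope = np.polyfit(np.log(N), np.log(density), 1)[0]
```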
<p>However, so far we have not considered the fact that urban settlements are not blank isotropic canvases, but are organised around locations of interest and the networks that make access to them possible (streets, canals, paths). This means that the effective space for socio-economic interactions in a city is defined by its <em>access network</em>. The total network area, \(A_n\), can be derived from the population density \(N/A\):</p>
<p>\(A_{n}(N) \sim Nd = A^{1/2}N^{1/2}\) where \(d\) is the average distance between individuals \(d = (A/N)^{1/2}\). Further,</p>
\[A_{n} = l a^{1/2} N^{\alpha/2} N^{1/2} = l a^{1/2} N^{\frac{\alpha + 1}{2}} = a_{0} N^{1-\delta},\]
<p>where \(\delta=\frac{H}{2(H+2)}\) and \(a_{0}=l a^{1 / 2}\). Here \(l\) is a length scale capturing the width of the network at each location of interest (e.g., doors and
entrance-ways). For \(H = 1\), we obtain an exponent of \(1-\delta=5/6\).</p>
<p>For the remaining analysis, two aspects of the social reactor model are of particular interest. First, the model predicts that the built-up area of a city should increase with its population, on average, with an exponent \(\frac{2}{3} \leq \alpha \leq \frac{5}{6}\), where the lower bound of 2/3 corresponds to amorphous, isotropic settlements, typically small towns, while the upper bound of 5/6 reflects the role of physical infrastructure networks in structuring a city.
The second aspect has to do with the hierarchical institutions regulating social interactions, and which we will turn our attention to next.</p>
<h3 id="the-structured-social-interaction-model">The Structured Social Interaction Model</h3>
<p>Now, let’s see how political institutions and social groups can introduce further constraints on social and economic interactions among individuals in cities, and how their influence can change the scaling laws described in the previous section. Essentially, we will build a model that moves away from complete freedom with full social mixing towards hierarchical structures of segregated social groups that reduce free social interaction between those groups. These structures can be formal, such as guilds and parishes, or informal, such as family ties or ethnicity.</p>
<p><img src="https://lexparsimon.github.io/images/Medieval/network.JPG" alt="Schematic social networks in cities" /></p>
<p>In the image above, <strong>A</strong> shows an unstructured social network where anyone is free to connect with anyone else, the only limitation being the cost of movement. Such a network structure means that connectivity increases rapidly with city population size, with a mean degree of \(k(N)=k_{0} N^{\delta}, \delta \sim 1 / 6\). Conversely, <strong>B</strong> shows a structured socio-economic network in which social interactions are regulated by social groups and political institutions (black squares) and, at each level of the hierarchy, may be damped by a factor \(s<1\). The further up the hierarchy the connection to another individual sits, the more the positive effects of the social interaction are dampened. When \(s<1\), the overall effect of hierarchical institutions is to reduce social possibilities and hence agglomeration effects, forcing the exponent of area scaling with population closer to one.</p>
<p>In fact, the hierarchical structures can also be modelled as <strong>contributing</strong> to social interaction by <strong>reducing</strong> interaction costs, for example, by reducing crime or by acting as central places, fostering innovation, such as universities in modern cities. But this is a different story and we will not discuss it here.</p>
<p>The hierarchical structure in the above figure <strong>B</strong> is parameterized by \(h\) levels, and at each level we assume that \(b\) connections are possible, similar to the <a href="https://en.wikipedia.org/wiki/Branching_factor">branching factor</a> in tree graphs. The number of levels needed to cover a city of population \(N\) is then \(h(N)=\log_{b} N\).</p>
<p>We can now derive the number of interactions an individual can have when their contacts are mediated by higher-level groups or social institutions. The important parameter here is the <strong>social horizon</strong>, \(r=sb\), the <em>effective</em> contact rate. If \(r>1\), the city holds together as a socio-economic network of interactions; for \(r<1\), it falls apart into distinct groups dictated by the respective institutions.</p>
<p>We begin by considering the number of connections of a typical person. At the first level of the hierarchy, there are \(b\) possible connections, at the second there are \(b+ sb\), at the third \(b+sb+(sb)^2\) and so on. Then, the total number of interactions of a person, at a given level of dampening, \(s\), is the sum of the finite geometric series</p>
\[k_{s}(N)=b\left[1+s b+(s b)^{2}+\ldots\right]=b \frac{1-r^{h}}{1-r}.\]
<p>If you don’t remember this from your high school algebra class, you can find a refresher in the Appendix below. For very small \(s\), we have \(r<<1\) and \(k_{s}(N)=b /(1-s b) \cong b\), which means that all interactions stay essentially limited to the households <strong>and there is no city as a network of socio-economic relations at all!</strong> In contrast, for \(s\) close to 1, we have \(r>1\), and we can write:</p>
\[k_s(N) = b\frac{(sb)^{h}-1}{sb -1} \approx s^{h}b^h = s^{h}N.\]
<p>Since the city population, \(N\), is proportional to the total infrastructure network area, \(A_n\) times the average connectivity of a typical person, \(k_s(N)\), and since \(h = \log_{b} N\), we obtain</p>
\[A_n \sim \frac{N}{k_s(N)} = s^{-h} = s^{- \log_{b} N} = N^{- \log_{b} s} = N^{\theta},\]
<p>where \(\theta=\left|\frac{\ln s}{\ln b}\right|\). Notice that as \(s \rightarrow 1\), the exponent \(\theta\) goes to zero, rendering the social grouping influence irrelevant.
We remember also that as <a href="https://www.jstor.org/stable/20008031?seq=1">city productivity is proportional to its connectivity</a>, \(A_n = a_0N^{1-\delta} \sim N^{1-\delta}\), and we thus obtain</p>
\[A_n \sim N^{1-\delta + \theta}.\]
<p>In the opposite limit, when social institutions are overly restrictive, we get \(A_n \sim N\), resulting in no population
densification (\(n\) = constant) as the city grows. This shows how hierarchical institutions that restrict social opportunities reduce and might even eliminate socio-economic agglomeration effects, spatial densification, and, by implication, cities themselves. Bottom line: If we observe linear scaling of built-up area with population in any urban system, we can identify a situation in which socio-economic restrictions by formal or informal hierarchical institutions <em>could</em> be at play.</p>
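<p>The mechanics of the structured interaction model are easy to check numerically. Below is a small sketch (plain Python, with hypothetical values of \(b\), \(s\), and \(h\)) comparing the explicit geometric sum with its closed form and computing the exponent \(\theta\):</p>

```python
import math

def k_s(b, s, h):
    """Closed-form contacts: b * (1 - (s*b)**h) / (1 - s*b)."""
    r = s * b
    if r == 1.0:
        return float(b * h)
    return b * (1.0 - r**h) / (1.0 - r)

def k_s_bruteforce(b, s, h):
    """Explicit sum b * [1 + sb + (sb)**2 + ... + (sb)**(h-1)]."""
    return b * sum((s * b)**k for k in range(h))

def theta(b, s):
    """Exponent theta = |ln s / ln b| in A_n ~ N**(1 - delta + theta)."""
    return abs(math.log(s) / math.log(b))

print(k_s(4, 0.9, 5), k_s_bruteforce(4, 0.9, 5))  # the two should agree
print(theta(4, 0.9))   # mild dampening: small theta
print(theta(4, 0.99))  # s -> 1: theta -> 0, institutions become irrelevant
```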
<h2 id="statistical-data-analysis-of-medieval-urban-data">Statistical data analysis of medieval urban data</h2>
<p>With a bit of theory under our belt, let’s uncover statistical relations between urban built-up area and population in a <a href="https://journals.plos.org/plosone/article/file?type=supplementary&id=info:doi/10.1371/journal.pone.0162678.s003">dataset of 173 urban settlements</a> in four different political “urban systems”: Germany, Northern Italy, France & Belgium, and England:</p>
<p><img src="https://lexparsimon.github.io/images/Medieval/map.JPG" alt="Urban system map" /></p>
<p>Let’s plot built-up area vs. population for all cities in the dataset to begin with:</p>
<p><img src="https://lexparsimon.github.io/images/Medieval/joint_plot.jpg" alt="Joint plot" /></p>
<p>As is typical for <a href="https://en.wikipedia.org/wiki/Rank-size_distribution">city size</a>, we see highly skewed distributions for both area and population. Since both variables are strictly positive and span several orders of magnitude, we can safely take the natural logarithm of both, thus “normalising” them and avoiding biased regression results caused by the fat tails.</p>
<p><img src="https://lexparsimon.github.io/images/Medieval/joint_plot_log.jpg" alt="Joint plot log" /></p>
<p>Much better! Since the four geopolitical groups could have had different socio-economic structures, we ought to control for them. Let’s break the plot down by the urban system:</p>
<p><img src="https://lexparsimon.github.io/images/Medieval/joint_plot_by_urb_sys.jpg" alt="Joint plot by urban system" /></p>
<p>Now, we essentially define the following <a href="https://en.wikipedia.org/wiki/Ordinary_least_squares">ordinary least squares</a> model:</p>
\[\ln \left(\operatorname{area}_{i}\right)=\alpha+\beta \ln \left(\text {population}_{i}\right)+\epsilon,\]
<p>where \(i\) is the index of a city within a specified urban system and \(\epsilon\) denotes i.i.d. Gaussian distributed error. Note that this is simply a log-transformed version of the social reactor model discussed above, and that \(\beta\) is the same as the scaling coefficient \(\alpha\) in the original model.
Having the data prepared, let’s first run the model for all cities together. Here are the regression results:</p>
<p><img src="https://lexparsimon.github.io/images/Medieval/reg_results.JPG" alt="Regression results" /></p>
<p>We see that the estimated parameter of interest, <strong>the scaling coefficient, \(\beta\), is 0.714, which lies between 2/3 and 5/6!</strong> This is a first hint that the statistical distributions of population and area of medieval cities are in acceptable agreement with urban scaling theory.
Before we move on, let’s first run some diagnostic tests, to make sure we don’t have <a href="https://en.wikipedia.org/wiki/Ordinary_least_squares#Assumptions">heteroskedasticity or normality</a> issues. We can begin with the residual plot:</p>
<p><img src="https://lexparsimon.github.io/images/Medieval/residual_plot.jpg" alt="Residual plot" /></p>
<p>Good, it looks fairly random. For heteroskedasticity, a situation in which the variance of the error term is not the same across observations, we will use the <a href="https://en.wikipedia.org/wiki/Breusch%E2%80%93Pagan_test">Breusch-Pagan test</a> which is essentially a chi-squared test. Skipping the maths, what matters is that if we obtain a small p-value (<0.05), it’s an indication that we have heteroskedasticity and that it should be addressed. Luckily, we obtain a p-value of 0.644, so we can safely assume homoskedasticity.</p>
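<p>For readers who want to reproduce the diagnostics without a statistics package, the Breusch-Pagan statistic can be computed by hand: regress the squared residuals on the regressor, take \(LM = nR^2\), and compare against a chi-squared distribution with one degree of freedom, whose survival function reduces to <code>erfc(sqrt(x/2))</code>. The sketch below runs on synthetic, homoskedastic toy data, not the medieval dataset:</p>

```python
import math
import numpy as np

def ols_fit(x, y):
    """OLS of y on [1, x]; returns intercept, slope, and residuals."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[0], beta[1], y - X @ beta

def breusch_pagan_pvalue(x, y):
    """LM = n * R^2 from regressing squared residuals on x (df = 1)."""
    _, _, resid = ols_fit(x, y)
    u2 = resid**2
    _, _, aux_resid = ols_fit(x, u2)
    ss_tot = np.sum((u2 - u2.mean())**2)
    r2 = 1.0 - np.sum(aux_resid**2) / ss_tot
    lm = max(len(x) * r2, 0.0)  # guard tiny negative r2 from float error
    return math.erfc(math.sqrt(lm / 2.0))  # chi-squared survival, df = 1

# Synthetic homoskedastic data standing in for the log-log variables
rng = np.random.default_rng(0)
x = rng.uniform(6, 11, 200)                   # stand-in for ln(population)
y = 0.7 * x - 2.0 + rng.normal(0, 0.3, 200)   # stand-in for ln(area)
print(breusch_pagan_pvalue(x, y))             # large p => homoskedastic
```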
<p>To check for normality, we look at the <a href="https://en.wikipedia.org/wiki/Q%E2%80%93Q_plot">Q-Q plot</a>:</p>
<p><img src="https://lexparsimon.github.io/images/Medieval/qq_plot_all.jpg" alt="QQ plot all cities" /></p>
<p>and see a remarkably good agreement with the normality assumption.</p>
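<p>If you want to build such a plot by hand, a Q-Q plot is just the sorted standardised residuals paired against theoretical normal quantiles; with Python 3.8+ the quantiles come from the standard library’s <code>NormalDist</code>. The data below is an illustrative normal sample, not the post’s actual residuals:</p>

```python
import numpy as np
from statistics import NormalDist

def qq_points(residuals):
    """Sorted standardised residuals vs. theoretical normal quantiles."""
    z = (residuals - residuals.mean()) / residuals.std()
    observed = np.sort(z)
    n = len(z)
    nd = NormalDist()
    # Plotting positions (k + 0.5) / n, a common convention
    theoretical = np.array([nd.inv_cdf((k + 0.5) / n) for k in range(n)])
    return theoretical, observed

# Illustrative normal sample; points should hug the 45-degree line
rng = np.random.default_rng(1)
t, o = qq_points(rng.normal(size=500))
print(np.corrcoef(t, o)[0, 1])  # close to 1 for normal residuals
```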
<p>Let’s now move on to running four different models, one for each geopolitical urban system.
All p-values obtained from the Breusch-Pagan tests for the four urban systems are much larger than 0.05 and the Q-Q plots for all four models are shown below:</p>
<p><img src="https://lexparsimon.github.io/images/Medieval/qq_plot_by_urb_sys.jpg" alt="QQ plot by urban system" /></p>
<p>With no violations of our regression assumptions, we plot the regression lines in log space for the four urban systems:</p>
<p><img src="https://lexparsimon.github.io/images/Medieval/regression_plots.jpg" alt="Regression plots" /></p>
<p>The estimated scaling coefficients are</p>
<ul>
<li>England: <strong>0.73</strong></li>
<li>France & Belgium: <strong>0.79</strong></li>
<li>Germany: <strong>0.75</strong></li>
<li>Northern Italy: <strong>0.72</strong></li>
</ul>
<p>Even with the confidence intervals taken into account, the estimated scaling coefficients are quite similar across geopolitical urban systems with \(2 / 3 \leq \alpha \leq 5 / 6\) as predicted by the social reactor model, and <strong>none of the coefficients are \(\geq 5 / 6.\)</strong> For this reason, <strong>we cannot detect statistically significant evidence that hierarchical institutions limit socio-economic interactions in medieval cities</strong>. Of course, since the coefficients fall roughly in the middle of the \(2 / 3 \leq \alpha \leq 5 / 6\) range, some socio-economic dampening might still have occurred. Nonetheless, the consistently estimated scaling coefficients fall <a href="https://science.sciencemag.org/content/340/6139/1438/tab-pdf">close to those of modern cities</a> and were not dampened towards one. This suggests that the hierarchical institutions of medieval cities <strong>did not have a visibly restrictive influence on urban socio-economic interactions, at least within the framework predicted by the structured interactions model.</strong></p>
<h2 id="conclusion">Conclusion</h2>
<p>Despite many structural, social and economic differences, medieval urban settlements have at least one basic property in common with modern cities: <strong>larger cities are denser than smaller cities within a given urban system</strong>. Overall, the data showed that city areas did not grow with population faster than the social reactor or structured interaction models would predict, yielding no evidence for restrictive dampening of socio-economic connectivity and agglomeration effects in medieval cities. Given the hierarchical institutions prevalent in medieval cities, which are seemingly less dominant in Western cities today, we can interpret this result as rejecting the intuitive idea of a strongly segregating role of medieval social institutions. This means that <strong>the hierarchical institutions of Western European cities ca. 1300 AD did not considerably restrict social mixing, economic integration, or the free flow of people, ideas, and information.</strong> These findings indicate that at a basic structural level, <strong>the micro-level socio-economic processes of medieval cities were fundamentally similar to those of modern cities</strong>. Even though there are many structural, functional, and cultural differences, both medieval and contemporary cities seem to be described by social networks that become increasingly denser in space as they grow. All this suggests that past cities can be better understood through modern urbanism, but also that modern cities can be better understood through medieval or perhaps even ancient urbanism. With increasing quantity and quality of historical data, we can expect this to become ever more the case.</p>
<p>The jupyter notebook with the code for this post can be found <a href="https://github.com/lexparsimon/Urban-Data-Science/blob/master/Medieval%20cities.ipynb">here</a>.</p>
<h2 id="appendix">Appendix</h2>
<p>For \(r \neq 1,\) the sum of the first \(n\) terms of a geometric series is</p>
\[a+a r+a r^{2}+a r^{3}+\cdots+a r^{n-1}=\sum_{k=0}^{n-1} a r^{k}=a\left(\frac{1-r^{n}}{1-r}\right),\]
<p>where \(a\) is the first term of the series, and \(r\) is the common ratio. We can derive the formula for the sum, \(s\), as follows:</p>
\[\begin{aligned}
s &=a+a r+a r^{2}+a r^{3}+\cdots+a r^{n-1} \\
r s &=a r+a r^{2}+a r^{3}+\cdots+a r^{n-1}+a r^{n} \\
s-r s &=a-a r^{n} \\
s(1-r) &=a\left(1-r^{n}\right) \\
s &=a\left(\frac{1-r^{n}}{1-r}\right) \quad(\text { if } r \neq 1)
\end{aligned}\]
<p><em>Did hierarchical institutions suppress urban social and economic interactions in medieval cities?</em></p>
<h1 id="urban-drones-the-facility-location-problem">Urban drones: the facility location problem</h1>
<p><em>2020-01-19 · <a href="https://lexparsimon.github.io/drones">https://lexparsimon.github.io/drones</a></em></p>
<h2 id="do-cities-need-drones">Do cities need drones?</h2>
<p>According to a <a href="https://www.faa.gov/data_research/aviation/aerospace_forecasts/media/Unmanned_Aircraft_Systems.pdf">report</a> by the United States Federal Aviation Administration (FAA), 4.47 million small drones are expected to operate in the United States by 2021, up from today’s 2.75 million. Since 2017, more than 1 million drone owners have already registered with the FAA. So far, <a href="https://link.springer.com/article/10.1007/s10846-017-0483-z">as expected</a>, the main driving force behind this rapid increase in commercial drone purchases is their high mobility and applications in computer vision: taking pictures in dangerous areas, building inspection, traffic monitoring, photogrammetry, etc.</p>
<p>However, <strong>this is just the beginning</strong>. Drones are expected to carry out important tasks in future cities: They will provide a bird’s eye view from the sky reporting in no time if a bridge is about to collapse, a fire is spreading or a human being is in trouble. They will <a href="https://www.nesta.org.uk/report/flying-high-challenge-future-of-drone-technology-in-uk-cities/exploring-urban-drone-integration/">supplement our transport systems</a> by moving things around or getting someone somewhere quickly. In fact, <a href="https://en.wikipedia.org/wiki/Amazon_Prime_Air">Amazon’s Prime Air</a> drone delivery service is already in its final development stage before beginning operations. Drones will also be robots on wings, performing such tasks as repairing bridges or fighting fires.</p>
<p>In this post, we will discuss how drones will affect cities, and how urban planners will have to extend their scope of expertise to be able to deal with urban airspace, urban air traffic, and its interplay with traditional urban space. We will do this by solving the problem of the efficient placement of drone stations in a city and seeing how different urban air traffic configurations lead to different operation zones and urban mobility.</p>
<h2 id="why-should-urban-planners-be-concerned">Why should urban planners be concerned?</h2>
<p>So far, most of us have seen drones only occasionally. But what will happen when swarms of drones flood our cities? <strong>We can confidently expect the deployment of drones at scale to pose some significant challenges to planning, managing, and designing sustainable cities</strong>.
The proliferation of drone technology at scale would have an impact on the very nature of our cities. Certain types of drone applications will require new physical infrastructure.</p>
<p>Drone systems could affect the way buildings are designed and built. For instance, if <strong>drone docking stations</strong> will be placed on building roofs, the rooftop will have to be easily accessible for humans, but also for transporting goods to and from it. New buildings would have to be designed to accommodate this (for example, with additional internal or external elevator shafts). It could also be a serious challenge retrofitting old buildings to the new conditions. The visual impact on the built environment would also need to be addressed.</p>
<p>New types of infrastructure will emerge, such as <strong>passenger and logistic hubs</strong> forming a drone mobility network, as well as ground-based <strong>counter-drone systems</strong> equipped with <strong>radars, <a href="https://uavcoach.com/drone-jammer/">signal jammers</a> and <a href="https://fortemtech.com/products/dronehunter/">drone capturing technology</a></strong> for combatting drones with rogue or dangerous behaviour. No-fly zones over critical infrastructure and buildings such as government buildings, airports or prisons will have to be designated and enforced via <a href="https://en.wikipedia.org/wiki/Geo-fence">geo-fencing</a>. Integrating all of this with the existing built environment and creating the necessary regulatory framework for it to function will have a great impact on the daily practice of architects, urban planners and policy makers.</p>
<p>Highly automated drone operations will require fixed docking stations for take off and landing, integrated with charging or refuelling systems. This could be a mobile or permanent station placed on the top of buildings, at street level, <a href="https://www.nesta.org.uk/report/flying-high-challenge-future-of-drone-technology-in-uk-cities/exploring-urban-drone-integration/">integrated into existing transport infrastructure</a> or other types of infrastructure such as lampposts for smaller stations. These docking stations will most likely be integrated with electric battery charging. Given the energy policies of most developed countries, the impact of the soaring electricity demand on the grid and associated emissions will have to be considered and carefully planned for.</p>
<p>For instance, privately owning drones can be <a href="https://ieeexplore.ieee.org/document/8288882">considerably more expensive than renting them</a>, especially for companies who may only require drones for a single task. However, with a city-wide rental system in place, the city, companies, and users will not be required to purchase drones, effectively distributing the cost amongst them.
Planning a drone rental service based on a system of drone ports (or stations) distributed across the city is therefore necessary. A public rental service of distributed autonomous drones waiting at drone stations can reduce both the total number of drones in the sky and the total cost of completing tasks requested by city services, citizens, and other users.
This implies the necessity of addressing the following questions among many others:</p>
<ul>
<li><strong>Where to place the docking stations?</strong></li>
<li><strong>How many of them?</strong></li>
<li><strong>With what capacity?</strong></li>
</ul>
<h2 id="the-facility-location-problem">The facility location problem</h2>
<p>The above questions can be answered with the help of mathematical optimization, particularly with <a href="https://en.wikipedia.org/wiki/Linear_programming">linear programming</a> if formulated as a <a href="https://en.wikipedia.org/wiki/Facility_location_problem#Capacitated_facility_location">capacitated facility location problem</a>.</p>
<p>The latter is a classical optimization problem for choosing the sites of factories, warehouses, power stations, or other infrastructure. A typical facility location problem deals with determining the best among potentially available sites, subject to constraints requiring that demand at several locations be serviced by the opened facilities. The associated costs typically include a part proportional to the sum of distances from the facilities to the demand locations, plus the costs of opening and maintaining the facilities.
<strong>The objective of the problem is to select facility sites in order to minimize total costs</strong>. The facilities may or may not have limited servicing capacities, which classifies the problem into capacitated and uncapacitated variants.</p>
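<p>Before the full mathematical formulation, the cost tradeoff can be felt on a toy instance of the <em>uncapacitated</em> variant, small enough to brute-force over every subset of candidate sites (all numbers below are invented for illustration):</p>

```python
from itertools import combinations

# Toy instance: 3 candidate sites, 3 demand points (all numbers invented)
f = [4.0, 3.0, 5.0]        # cost of opening site j
c = [[2.0, 7.0, 8.0],      # c[i][j]: cost of serving customer i
     [5.0, 3.0, 9.0],      # from site j
     [9.0, 6.0, 2.0]]

def total_cost(open_sites):
    """Opening costs plus each customer's cheapest open site."""
    opening = sum(f[j] for j in open_sites)
    service = sum(min(c[i][j] for j in open_sites) for i in range(len(c)))
    return opening + service

# Enumerate every non-empty subset of sites and keep the cheapest
best = min(
    (frozenset(s) for r in range(1, len(f) + 1)
     for s in combinations(range(len(f)), r)),
    key=total_cost,
)
print(sorted(best), total_cost(best))
```

<p>Opening everything minimises service cost but pays all the opening costs; the optimum balances the two, which is exactly the tension the linear program below resolves at city scale.</p>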
<h3 id="lets-give-some-context">Let’s give some context!</h3>
<p>Imagine a situation in which we are tasked with sketching a concept plan for the placement of drone docking stations in Yerevan city. Having the spatial distribution of daily (hourly, weekly, or monthly for that matter) nominal demand in the city (for demonstration purposes taxi demand has been used as a proxy):</p>
<p><img src="https://lexparsimon.github.io/images/drones/Yerevan_drone_demand.jpg" alt="Yerevan drone demand" /></p>
<p>as well as a number of potential sites to install drone stations (30 in our example):</p>
<p><img src="https://lexparsimon.github.io/images/drones/Yerevan_drone_facilities.jpg" alt="Yerevan drone facilities" /></p>
<p><strong>how do we determine the best locations for placing the docking stations?</strong> Well, it obviously depends on the costs.</p>
<p><strong>The first type of cost</strong> is that of purchasing and installing the docking stations. To simplify matters, let’s assume <a href="https://en.wikipedia.org/wiki/Bid_rent_theory">Alonso’s monocentric city model</a> according to which real estate prices drop according to some exponential or power decay with increasing distance from the central business district:</p>
<p><img src="https://lexparsimon.github.io/images/drones/Prices.jpg" alt="Yerevan prices" /></p>
<p><em>There is clearly a tradeoff</em>: while most demand is spatially concentrated in the center (see demand plot), and it would therefore make sense to also install docking stations in central locations to reduce transfer costs, we would, conversely, incur higher costs for choosing central locations (while prices have been chosen with some common sense, they are presented for demonstration purposes only).</p>
<p><strong>The second type of cost</strong> is that of operating the drones which can be safely assumed to be proportional to the distances travelled. However, and here things get interesting, <strong>these distances will depend on the underlying (actually <em>upper</em>lying) urban air traffic path ways!</strong> In the below image, we can see two examples of simple configurations for air traffic, a Cartesian grid (<strong>A</strong>) and a labyrinthine arrangement (<strong>B</strong>):</p>
<p><img src="https://lexparsimon.github.io/images/drones/airstreets1.jpg" alt="Yerevan air streets 1" /></p>
<p>Let’s check out two more examples: a <a href="https://en.wikipedia.org/wiki/Minimum_spanning_tree">minimum spanning tree</a> of the grid above, typically used in building the physical lines in communication networks for reducing the total line length (<strong>C</strong>), and the actual Yerevan street network, since one of the realistic scenarios is that drones will operate directly above the existing street network for reducing visual pollution and for safety reasons (<strong>D</strong>).</p>
<p><img src="https://lexparsimon.github.io/images/drones/airstreets2.jpg" alt="Yerevan air streets 2" /></p>
<p>Since the four air traffic path way systems are so different, we can intuitively expect the solutions to the drone station location problem to yield differences as well.</p>
<h3 id="mathematical-formulation">Mathematical formulation</h3>
<p>In order to see whether this is true, how the solutions compare, and what this means for future urban planners and traffic engineers, we need to solve a linear optimization problem. Consider \(n\) customers \(i=1,2,...,n\) and \(m\) sites for placing the drone docking stations \(j=1,2,...,m\). Define continuous variables \(x_{ij} \geq 0\) as the amount serviced from drone station \(j\) to customer \(i\), and binary variables \(y_j=1\) if a drone station is installed at location \(j\), and \(y_j=0\) otherwise. An integer-optimization model for the capacitated facility location problem can now be formulated as follows:</p>
\[\begin{array}{lll}
{\text { minimize }} & {\sum_{j=1}^{m} f_{j} y_{j}+\sum_{i=1}^{n} \sum_{j=1}^{m} c_{i j} x_{i j}} & {} \\
{\text { subject to: }} & {\sum_{j=1}^{m} x_{i j}=d_{i}} & {\text { for } i=1, \cdots, n} \\
{} & {\sum_{i=1}^{n} x_{i j} \leq M_{j} y_{j}} & {\text { for } j=1, \cdots, m} \\
{} & {x_{i j} \leq d_{i} y_{j}} & {\text { for } i=1, \cdots, n ; j=1, \cdots, m} \\
{} & {x_{i j} \geq 0} & {\text { for } i=1, \cdots, n ; j=1, \cdots, m} \\
{} & {y_{j} \in\{0,1\}} & {\text { for } j=1, \cdots, m}
\end{array}\]
<p>The objective of the problem is to minimize the sum of drone station installation costs and drone operation costs. The first set of constraints require each customer’s demand to be strictly satisfied. The capacity of each drone station \(j\) is limited by the second set of constraints: if drone station \(j\) is installed, its capacity restriction is taken into account; if it is not installed, the demand satisfied by \(j\) is zero. The third set of constraints impose variable upper bounds. Even though they are redundant, they yield a much tighter linear programming relaxation than the equivalent weaker formulation (a very good introduction to linear programming can be found <a href="https://www.math.ucla.edu/~tom/LP.pdf">here</a>). For an intuition of how such problems are solved, this <a href="https://en.wikipedia.org/wiki/Linear_programming">Wikipedia article</a> provides a fairly good first overview.
Fortunately, there are many good mathematical optimization solvers out there. Let’s see how to code it ourselves:</p>
<details>
<summary>Python code for solving the capacitated facility location problem</summary>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="kn">from</span> <span class="nn">pulp</span> <span class="kn">import</span> <span class="o">*</span>
<span class="n">COSTUMERS</span> <span class="o">=</span> <span class="n">positive_flow_inds</span> <span class="c1"># the demand vector
</span> <span class="n">potential_stations</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">choice</span><span class="p">(</span><span class="n">COSTUMERS</span><span class="p">,</span><span class="mi">30</span><span class="p">)</span> <span class="c1">#choose 30 potential sites
</span> <span class="n">STATION</span> <span class="o">=</span> <span class="p">[</span><span class="s">'STATION {}'</span><span class="p">.</span><span class="nb">format</span><span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">potential_stations</span><span class="p">]</span>
<span class="n">demand</span> <span class="o">=</span> <span class="nb">dict</span><span class="p">(</span><span class="n">polyflows</span><span class="p">[</span><span class="n">polyflows</span><span class="p">.</span><span class="n">inflow</span><span class="o">></span><span class="mi">0</span><span class="p">][</span><span class="s">'inflow'</span><span class="p">])</span> <span class="c1">#setting up the demand dictionary
</span> <span class="n">STATION_dict</span> <span class="o">=</span> <span class="nb">dict</span><span class="p">(</span><span class="nb">zip</span><span class="p">(</span><span class="n">STATION</span><span class="p">,</span> <span class="n">potential_stations</span><span class="p">))</span>
<span class="c1">#installation cost decay from center (given a vector of distances from a central location)
</span> <span class="n">costs_from_cen</span> <span class="o">=</span> <span class="mi">150000</span> <span class="o">-</span> <span class="mf">1.5</span><span class="o">*</span><span class="n">dists_from_cen</span><span class="o">**</span><span class="mf">1.22</span>
<span class="n">install_cost</span> <span class="o">=</span> <span class="p">{</span><span class="n">key</span><span class="p">:</span> <span class="n">costs_from_cen</span><span class="p">[</span><span class="n">val</span><span class="p">]</span> <span class="k">for</span> <span class="n">key</span><span class="p">,</span> <span class="n">val</span> <span class="ow">in</span> <span class="n">STATION_dict</span><span class="p">.</span><span class="n">items</span><span class="p">()}</span> <span class="c1">#installation costs
</span>	<span class="n">max_capacity</span> <span class="o">=</span> <span class="p">{</span><span class="n">key</span><span class="p">:</span> <span class="mi">100000</span> <span class="k">for</span> <span class="n">key</span> <span class="ow">in</span> <span class="n">STATION</span><span class="p">}</span> <span class="c1">#maximum capacity per station</span>
</span>
<span class="c1">#setting up the transportation costs given a distance cost matrix (computed as pairwise shortest paths in the underlying air
</span> <span class="c1">#traffic network)
</span> <span class="n">transp</span> <span class="o">=</span> <span class="p">{}</span>
<span class="k">for</span> <span class="n">loc</span> <span class="ow">in</span> <span class="n">potential_stations</span><span class="p">:</span>
<span class="n">cost_dict</span> <span class="o">=</span> <span class="nb">dict</span><span class="p">(</span><span class="nb">zip</span><span class="p">(</span><span class="n">COSTUMERS</span><span class="p">,</span> <span class="n">costmat_full</span><span class="p">[</span><span class="n">loc</span><span class="p">][</span><span class="n">COSTUMERS</span><span class="p">]))</span>
<span class="n">transp</span><span class="p">[</span><span class="s">'STATION {}'</span><span class="p">.</span><span class="nb">format</span><span class="p">(</span><span class="n">loc</span><span class="p">)]</span> <span class="o">=</span> <span class="n">cost_dict</span>
<span class="c1"># SET PROBLEM VARIABLE
</span> <span class="n">prob</span> <span class="o">=</span> <span class="n">LpProblem</span><span class="p">(</span><span class="s">"STATIONLocation"</span><span class="p">,</span> <span class="n">LpMinimize</span><span class="p">)</span>
<span class="c1"># DECISION VARIABLES
</span> <span class="n">serv_vars</span> <span class="o">=</span> <span class="n">LpVariable</span><span class="p">.</span><span class="n">dicts</span><span class="p">(</span><span class="s">"Service"</span><span class="p">,</span>
<span class="p">[</span>
<span class="p">(</span><span class="n">i</span><span class="p">,</span><span class="n">j</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">COSTUMERS</span> <span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="n">STATION</span>
<span class="p">],</span> <span class="mi">0</span><span class="p">)</span>
<span class="n">use_vars</span> <span class="o">=</span> <span class="n">LpVariable</span><span class="p">.</span><span class="n">dicts</span><span class="p">(</span><span class="s">"Uselocation"</span><span class="p">,</span> <span class="n">STATION</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span> <span class="n">LpBinary</span><span class="p">)</span>
<span class="c1"># OBJECTIVE FUNCTION
</span>
<span class="n">prob</span> <span class="o">+=</span> <span class="n">lpSum</span><span class="p">(</span><span class="n">actcost</span><span class="p">[</span><span class="n">j</span><span class="p">]</span><span class="o">*</span><span class="n">use_vars</span><span class="p">[</span><span class="n">j</span><span class="p">]</span> <span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="n">STATION</span><span class="p">)</span> <span class="o">+</span> <span class="n">lpSum</span><span class="p">(</span><span class="n">transp</span><span class="p">[</span><span class="n">j</span><span class="p">][</span><span class="n">i</span><span class="p">]</span><span class="o">*</span><span class="n">serv_vars</span><span class="p">[(</span><span class="n">i</span><span class="p">,</span><span class="n">j</span><span class="p">)]</span>
<span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="n">STATION</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">COSTUMERS</span><span class="p">)</span>
<span class="c1"># CONSTRAINTS
</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">COSTUMERS</span><span class="p">:</span>
<span class="n">prob</span> <span class="o">+=</span> <span class="n">lpSum</span><span class="p">(</span><span class="n">serv_vars</span><span class="p">[(</span><span class="n">i</span><span class="p">,</span><span class="n">j</span><span class="p">)]</span> <span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="n">STATION</span><span class="p">)</span> <span class="o">==</span> <span class="n">demand</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="c1">#CONSTRAINT 1
</span>
<span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="n">STATION</span><span class="p">:</span>
<span class="n">prob</span> <span class="o">+=</span> <span class="n">lpSum</span><span class="p">(</span><span class="n">serv_vars</span><span class="p">[(</span><span class="n">i</span><span class="p">,</span><span class="n">j</span><span class="p">)]</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">COSTUMERS</span><span class="p">)</span> <span class="o"><=</span> <span class="n">maxam</span><span class="p">[</span><span class="n">j</span><span class="p">]</span><span class="o">*</span><span class="n">use_vars</span><span class="p">[</span><span class="n">j</span><span class="p">]</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">COSTUMERS</span><span class="p">:</span>
<span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="n">STATION</span><span class="p">:</span>
<span class="n">prob</span> <span class="o">+=</span> <span class="n">serv_vars</span><span class="p">[(</span><span class="n">i</span><span class="p">,</span><span class="n">j</span><span class="p">)]</span> <span class="o"><=</span> <span class="n">demand</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="o">*</span><span class="n">use_vars</span><span class="p">[</span><span class="n">j</span><span class="p">]</span>
<span class="c1"># SOLUTION
</span>
<span class="n">prob</span><span class="p">.</span><span class="n">solve</span><span class="p">()</span>
<span class="k">print</span><span class="p">(</span><span class="s">"Status: "</span><span class="p">,</span> <span class="n">LpStatus</span><span class="p">[</span><span class="n">prob</span><span class="p">.</span><span class="n">status</span><span class="p">])</span>
<span class="c1"># PRINT DECISION VARIABLES
</span> <span class="n">TOL</span> <span class="o">=</span> <span class="p">.</span><span class="mi">0001</span> <span class="c1">#tolerance for identifying which locations the algorithm has chosen
</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">STATION</span><span class="p">:</span>
<span class="k">if</span> <span class="n">use_vars</span><span class="p">[</span><span class="n">i</span><span class="p">].</span><span class="n">varValue</span> <span class="o">></span> <span class="n">TOL</span><span class="p">:</span>
<span class="k">print</span><span class="p">(</span><span class="s">"Establish drone station at site "</span><span class="p">,</span> <span class="n">i</span><span class="p">)</span>
<span class="k">for</span> <span class="n">v</span> <span class="ow">in</span> <span class="n">prob</span><span class="p">.</span><span class="n">variables</span><span class="p">():</span>
<span class="k">print</span><span class="p">(</span><span class="n">v</span><span class="p">.</span><span class="n">name</span><span class="p">,</span> <span class="s">" = "</span><span class="p">,</span> <span class="n">v</span><span class="p">.</span><span class="n">varValue</span><span class="p">)</span>
<span class="c1"># PRINT OPTIMAL SOLUTION
</span> <span class="k">print</span><span class="p">(</span><span class="s">"The total cost of installing and operating drones = "</span><span class="p">,</span> <span class="n">value</span><span class="p">(</span><span class="n">prob</span><span class="p">.</span><span class="n">objective</span><span class="p">))</span>
</code></pre></div> </div>
</details>
<p><br /></p>
<h3 id="the-four-solutions">The four solutions</h3>
<p>Solving the optimization problem above yields the following solutions for configurations <strong>A</strong> and <strong>B</strong>:</p>
<p><img src="https://lexparsimon.github.io/images/drones/solutions1.jpg" alt="Yerevan solutions 1" /></p>
<p>and for <strong>C</strong> and <strong>D</strong>:</p>
<p><img src="https://lexparsimon.github.io/images/drones/solutions2.jpg" alt="Yerevan solutions 2" /></p>
<p>A small animation for the existing street network configuration illustrates the system dynamics:
<img src="/images/drones/drones_streets_animation.gif" alt="Alt Text" /></p>
<p>We see from the plots that while there is a common optimization pattern across all four arrangements with roughly similar optimal numbers of drone stations (from 8 to 11) and similar spatial clustering of large supplier stations (larger circles denote larger supply), the optimal locations and demand coverage areas (shown in colors) vary a lot!</p>
<p>This means that small variations in air traffic pathway designs can have a huge impact on the optimal locations of drone stations, the geographical area covered by each station, and the volume of air traffic in different areas.
We can also compare the optimal total cost of each air pathway system. In particular, we find that the best pathway system is the Cartesian grid (~$2.5 million), followed by the existing street network (~$4.1 million) and the labyrinthine system (~$4.5 million), while the worst is its minimum spanning tree (~$6 million). These results hint at the tradeoffs future planners will have to face: <strong>cost vs. safety vs. aesthetics vs. practicality</strong>.</p>
<h2 id="conclusion">Conclusion</h2>
<p>In this post, we attempted to raise awareness and alert architects, urban planners, and policy makers to the dramatic changes unmanned aerial vehicle technology is about to bring to our cities. We looked at the case of Yerevan and solved a simple mathematical optimization problem for finding the best locations for drone docking stations to serve demand (both commercial and city operations such as emergency services). We saw how differences in planning the urban air space could have a large impact on the solution of this problem. Unlike the urban street space, which for most of its existence has evolved gradually, the urban air space will have to be planned, designed, and deployed in comparably minuscule time frames.</p>
<p>There will obviously be many more variables to consider when solving the facility location problem in a real setting, making such optimization problems <a href="https://link.springer.com/article/10.1007/s10479-019-03385-x">much harder to solve</a>. For instance, in our example we assumed a one-off demand structure, while in most real-world settings the demand is volatile, calling for <a href="https://en.wikipedia.org/wiki/Stochastic_optimization">stochastic optimization</a>.</p>
<p>However, the objective of this post was not to provide a manual for solving the drone station location problem (let’s leave it to <a href="https://en.wikipedia.org/wiki/Operations_research">operations research</a> professionals), but to show its impact on urban space and mobility, and to point at the necessity to extend urban planners’ and other city enthusiasts’ scope of interest.</p>Gevorg Yeghikyangevorg.yeghikyan@sns.itWhat will happen when swarms of drones flood our cities and why should urban planners care?Note on beneficial AI and urban planning2020-01-04T00:00:00+00:002020-01-04T00:00:00+00:00https://lexparsimon.github.io/GiniAI<p>This is a short note reflecting on the implications of the current AI paradigms for the future of urban planning.</p>
<h2 id="ai-planning-assistant">AI planning assistant</h2>
<p>Let’s say it’s 2050 and you have an AI planner assistant system helping you to achieve certain targets in the new city masterplan. For instance, one of the objectives as part of the urban economic policy for the masterplan is to achieve a low <a href="https://en.wikipedia.org/wiki/Gini_coefficient">Gini coefficient</a> (i.e. low inequality).</p>
<p>If the current AI paradigm - the so-called <a href="https://futureoflife.org/2019/10/08/ai-alignment-podcast-human-compatible-artificial-intelligence-and-the-problem-of-control-with-stuart-russell/?cn-reloaded=1">“standard model”</a>, in which an algorithm optimizes a fixed objective (e.g. minimizes a defined loss function) - becomes the cornerstone for future developments, <strong>we will have a problem</strong>: in simple terms, a machine more “intelligent” than us will achieve the “fixed objective”, i.e. the purpose we put into the machine, by whatever means available, without caring about any side effects.</p>
<p>For example*, following our <a href="https://lexparsimon.github.io/Gini/">previous post</a>, the spatial distribution of some values in London, shown in the top left in the below image (<strong>A</strong>) has the <strong>exact same</strong> Gini coefficient as its spatially reshuffled versions (<strong>B</strong>, <strong>C</strong>, <strong>D</strong>).
While the measure of inequality is the same in all cases, the reshuffled versions <strong>B</strong> and <strong>C</strong> clearly show a spatial segregation of high vs. low values.</p>
<p><img src="https://lexparsimon.github.io/images/gini/shuffled.jpg" alt="London parking demand reshuffled distributions" /></p>
<p>In the case of income, this would imply achieving the fixed objective through what urban planners call <a href="https://en.wikipedia.org/wiki/Exclusionary_zoning">exclusionary</a> rather than <a href="https://en.wikipedia.org/wiki/Inclusionary_zoning">inclusionary</a> masterplanning.
The machine could have come up with such a policy proposal resulting in spatial segregation by optimizing for many important things (e.g. resources, various economic indicators, etc.) typically set as objective functions in economic optimization problems.
Therefore, it is crucial to have AI systems that defer to humans in the process of high impact decision making.</p>
<h2 id="beneficial-ai">Beneficial AI</h2>
<p>While <a href="https://en.wikipedia.org/wiki/Explainable_artificial_intelligence">explainable AI</a> is gaining momentum nowadays, I find the so called <a href="https://en.wikipedia.org/wiki/Apprenticeship_learning">apprenticeship learning</a> via <a href="https://arxiv.org/abs/1806.06877">inverse reinforcement learning</a> described by initially by Stuart Russell and Andrew Ng very promising in making sure humans remain in control.
In a traditional reinforcement learning setting, the goal is to learn a behavior that maximizes a predefined reward function. Inverse reinforcement learning (IRL), turns the problem on its head: <strong>it tries to extract an approximation of the reward function given the observed behavior of an agent</strong>.</p>
<p>Since humans often have irrational and inconsistent preferences, the IRL system will have a hard time learning them, and the only way to deal with this is to introduce uncertainty into the learnt reward function - for example, via a distribution of net utilities to the human(s) in charge.</p>
<p>Coming back to the AI planning assistant system, imagine it has come up with a policy achieving the desired Gini coefficient in a resource-efficient way resulting in a certain spatial distribution, as discussed above. However, the system is unsure whether such an outcome is desirable to the city, and this uncertainty is described by a Gaussian distribution of net utilities \(U\) with mean 20 and standard deviation of 20, as shown in the left of the below image.</p>
<p><img src="https://lexparsimon.github.io/images/gini/normal_dists.jpg" alt="Net utility distributions" /></p>
<p>In such a setting, should the AI system decide to act on the policy, the expected net utility to the city would be +20. Alternatively, the system can decide to “switch itself off”, i.e. get out of the decision-making process, effectively deferring to the human urban planner. We define such an action to have 0 net utility to the city.
If these were the only two choices, the AI system would go ahead and act on the proposed policy, incurring a considerable risk (in fact, 15.9%) of a negative net utility for the city. (If the distribution were Gaussian with mean -20, the system would switch itself off.)</p>
<p>However, we present the system with a third choice: <strong>explain the policy and its implications, wait, and let the human urban planner switch it off</strong>.</p>
<p>The whole purpose of this choice - letting the human expert switch it off or allow it to go ahead - is to provide the system with new information about human preferences. If the urban planner allows the AI system to proceed with the policy, it simply means that they judge the proposed urban policy to have a positive net utility to the city. This tells the system to update its uncertainty by discarding the negative part of the distribution, as shown in the right of the above image. It is fairly easy to verify that the conditional expected utility is then 25.7.</p>
<p>Thus, the AI system faces the following choices:</p>
<ul>
<li>Acting now and proceeding with the proposed urban policy has a net utility of +20.</li>
<li>Switching itself off has a net utility of 0.</li>
<li>Waiting and allowing the human urban planner to switch it off can have two outcomes:
<ul>
<li>There is a 15.9% chance that the policy will not be approved by the human urban planner, with net utility 0</li>
<li>There is a 84.1% chance that the policy will be approved and the system will be allowed to proceed, with net utility of +25.7.
Thus, the expected net utility from waiting is 15.9% x 0 + 84.1% x 25.7 = +21.61, which is, as we can see, better than acting now with net utility of +20.</li>
</ul>
</li>
</ul>
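<p>The numbers above can be reproduced with a few lines of Python. This is a sketch using only the stated Gaussian belief \(\mathcal{N}(20, 20^2)\); the function and variable names are mine:</p>

```python
import math

def phi(z):   # standard normal pdf
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def Phi(z):   # standard normal cdf
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu, sigma = 20.0, 20.0        # belief over net utility: N(20, 20^2)
z = (0 - mu) / sigma          # standardized switch-off threshold at u = 0

p_reject = Phi(z)                                  # planner says no: ~15.9%
cond_mean = mu + sigma * phi(z) / (1 - Phi(z))     # E[U | U > 0], ~25.7

eu_act  = mu                          # act now
eu_off  = 0.0                         # switch itself off
eu_wait = (1 - p_reject) * cond_mean  # defer to the planner

print(f"{p_reject:.3f} {cond_mean:.2f} {eu_wait:.2f}")
```

<p>Waiting dominates acting now (roughly 21.7 vs. +20, matching the post's arithmetic up to rounding), so the system prefers to defer.</p>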
<p>The main purpose of this method, <a href="https://arxiv.org/abs/1611.08219">proposed</a> by Stuart Russell, is to make sure that the AI system has a <em>positive incentive to allow humans to switch it off</em>. In fact, it is possible to prove the same result in a general setting: as long as there is the slightest uncertainty about whether the proposed action will be approved by humans, the system will allow humans to switch it off (see Appendix for proof). The human’s decision to let it proceed or to switch it off provides the system with new information, and this is always useful for improving the system’s decisions. On the contrary, if the system is completely certain about the urban planner’s decision, that decision will not provide any new information, and hence the AI system will have no incentive to let the urban planner decide.
<h2 id="conclusion">Conclusion</h2>
<p>In this post, we examined the pitfalls of the current AI paradigm projected into the future, on the specific example of urban policy making, where optimizing a fixed objective may result in an exclusionary master plan.
In particular, the two main takeaways are:</p>
<ol>
<li>Optimizing a fixed objective function is problematic and should be replaced by a provably beneficial AI.</li>
<li>Even if we create AI systems that provably defer to humans, human knowledge still remains crucial, and thus domain knowledge (in urbanism in our case) should be heavily pursued and improved.</li>
</ol>
<p>*A more amusing example: an “intelligent” vacuum cleaner (with the fixed objective of making sure there is no dust in the house) might at some point come up with another way to fight dust: get rid of its source! This is clearly something you don’t want to happen without you being in control, since a good part of indoor dust comes from human hair and skin cells.</p>
<h2 id="appendix">Appendix</h2>
<p>Here we present the general result.</p>
<p>Let \(P(u)\) be the AI system’s prior probability density over the net utility to the city for the proposed policy \(a\). Then the expected utility of proceeding with \(a\) is:</p>
\[E U(a)=\int_{-\infty}^{\infty} P(u) \cdot u d u=\int_{-\infty}^{0} P(u) \cdot u d u+\int_{0}^{\infty} P(u) \cdot u d u\]
<p>(It will become clear in a bit why the integral is split in this way.) Conversely, the action \(d\), namely deferring to the urban planner, has two components:</p>
<ol>
<li>if \(u>0\), then the urban planner allows the AI system to proceed, and so the net utility is \(u\)</li>
<li>if \(u<0\), then the urban planner switches the AI system off, and so the net utility is 0:</li>
</ol>
\[E U(d)=\int_{-\infty}^{0} P(u) \cdot 0 d u+\int_{0}^{\infty} P(u) \cdot u d u\]
<p>Comparing the two above expressions, we notice that \(EU(d) \geq EU(a)\), since \(EU(d)\) has the negative utility part effectively removed. The two options result in equal net utility to the city only when the negative region has zero probability - i.e., when the AI system is already completely certain that the proposed solution will be liked and approved by the urban planner.</p>Gevorg Yeghikyangevorg.yeghikyan@sns.itA short note on the implications of current AI paradigms on the future of urban planning.Why measuring urban inequality with the Gini index is a bad idea2019-12-28T00:00:00+00:002019-12-28T00:00:00+00:00https://lexparsimon.github.io/Gini<h2 id="the-gini-coefficient">The Gini coefficient</h2>
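<p>The inequality \(EU(d) \geq EU(a)\) can also be verified numerically. The sketch below (an illustration of mine, not part of the original argument) approximates both integrals on a grid for the Gaussian belief used earlier in the post:</p>

```python
import math

mu, sigma = 20.0, 20.0   # the belief N(20, 20^2) used in the post

def p(u):
    """Gaussian density of the net-utility belief."""
    return math.exp(-((u - mu) / sigma) ** 2 / 2) / (sigma * math.sqrt(2 * math.pi))

h = 0.01
grid = [-200 + i * h for i in range(44001)]        # covers essentially all the mass

eu_a = sum(u * p(u) * h for u in grid)             # act: integrate u*P(u) over all u
eu_d = sum(u * p(u) * h for u in grid if u > 0)    # defer: the u < 0 part contributes 0

assert eu_d >= eu_a   # deferring never loses expected utility
```

<p>Here the negative tail drags \(EU(a)\) down to its mean of +20, while \(EU(d)\) keeps only the positive part and comes out higher, as the proof predicts.</p>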
<p>In urban policy making, we are often confronted with the need to assess the income inequality of the urban population for such purposes as granting tax cuts to businesses targeting certain income groups, or identifying low-income households for offering housing subsidies in the form of cheap credit.
However, wealth and income are not the only quantities whose inequality or heterogeneity an urban planner would be interested in. For example, urban mobility flows are often concentrated in a few areas capturing a disproportionately large portion of the overall city flows, and knowing how severe this heterogeneity is, along with monitoring its trends over time, would be the first step towards a meaningful transportation policy, allocation of services and infrastructure such as parking, as well as masterplanning more broadly.</p>
<p>That said, the most common way to measure inequality is the <a href="https://en.wikipedia.org/wiki/Gini_coefficient">Gini coefficient</a> which has been in use by economists for more than a hundred years already.</p>
<p>For any distribution of values of interest \(X\) in a city, the Gini coefficient can be defined as:</p>
\[G=\frac{\sum_{i=1}^{n} \sum_{j=1}^{n}\left|x_{i}-x_{j}\right|}{2 n^{2} \bar{x}}\]
<p>where \(x_i\) is the \(X\) value at location \(i=[1,2, \ldots, n]\) and \(\bar{x}=(1 / n) \sum_{i} x_{i}\).</p>
<p>As already mentioned, the Gini coefficient, originally used to measure wealth and income inequality, can be applied to quantify the heterogeneity of other variables too. In the case of characterising the heterogeneity of values at different locations in a city, as can be seen from the above equation, the Gini coefficient takes on the value of zero if the variable of interest is distributed uniformly across city locations. Conversely, it takes on its maximum value when the entire quantity of interest is concentrated in a single location, leading to a Gini coefficient of \(G=1-1/n\), which is very close to 1 for large \(n\).</p>
<h2 id="computing-the-gini-coefficient">Computing the Gini coefficient</h2>
<p>Let’s make things clear with an example. Let’s say we want to understand how unequal parking demand is distributed in London, and use the Gini coefficient as a measure of this inequality. Below is a plot of what it looks like for the available data at a resolution of \(500 \times 500\)m grid.</p>
<p><img src="https://lexparsimon.github.io/images/gini/sorted0.jpg" alt="London parking demand" /></p>
<p>As one might expect, we see hotspots of high parking demand. Indeed, if we look at the distribution of the number of trips ending in a given location over a week (essentially weekly aggregate parking demand),</p>
<p><img src="https://lexparsimon.github.io/images/gini/Flow_dist.jpg" alt="London parking demand distribution" /></p>
<p>we see an asymmetric Pareto-like distribution, with a few locations displaying very high, and most locations very low, demand. If we compute the Gini coefficient with the expression above, we obtain a value of roughly 0.6. Although the temporal evolution of this measure would be more meaningful to track, this value indicates medium-high inequality if thought of in economic terms.</p>
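<p>A direct implementation of the definition takes only a few lines (a sketch; in practice the \(500 \times 500\)m demand grid would first be flattened into a one-dimensional array):</p>

```python
import numpy as np

def gini(x):
    """Gini coefficient: sum of |x_i - x_j| over all pairs / (2 n^2 mean)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    diff_sum = np.abs(x[:, None] - x[None, :]).sum()   # all pairwise differences
    return diff_sum / (2 * n**2 * x.mean())

g_uniform = gini([1, 1, 1, 1])   # perfectly even demand -> 0
g_extreme = gini([1, 0, 0, 0])   # everything in one location -> 1 - 1/n = 0.75
```

<p>The two edge cases recover the limits stated above: 0 for a uniform distribution, and \(1 - 1/n\) when all of the demand sits in a single cell.</p>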
<h2 id="so-whats-wrong-with-it">So what’s wrong with it?</h2>
<p>In the definition of the Gini coefficient, we mentioned a key word: <strong>location</strong>. Urban planning is first and foremost about <em>space</em>. Whether it’s design, management, logistics, or planning, practitioners are working with <em>space</em>. But look carefully at the definition of the widely used Gini coefficient: space - in this case geographical - does not figure in it. The Gini coefficient is completely agnostic to the spatial arrangement of the location of the values of interest. The following four arrangements - the true parking demand and its spatially reshuffled configurations have <strong>the exact same Gini coefficient</strong>:</p>
<p><img src="https://lexparsimon.github.io/images/gini/shuffled.jpg" alt="London parking demand reshuffled distributions" /></p>
<p>In other words, the Gini coefficient fails to capture any spatial information about our variable of interest.</p>
<h2 id="what-should-we-do">What should we do?</h2>
<p>In the field of <a href="https://en.wikipedia.org/wiki/Spatial_analysis">spatial statistics</a> there have been proposed many measures indicative of the spatial component of the variable under study. We will discuss two of them which I find particularly useful to combine with the Gini coefficient when studying the urban environment.</p>
<h3 id="spatial-gini">Spatial Gini</h3>
<p>In order to obtain a Gini coefficient that carries meaningful spatial information, we further use the <a href="https://www.researchgate.net/publication/233650148_A_spatial_decomposition_of_the_Gini_coefficient">Spatial Gini index</a>. In essence, it is a decomposition of the classical Gini with the aim of considering the joint effects of inequality and spatial autocorrelation. More specifically, it exploits the fact that the sum of all pairwise differences can be decomposed into sums of geographical neighbours and non-neighbours:</p>
\[G I=\frac{\sum_{i=1}^{n} \sum_{j=1}^{n} w_{i, j}^{A}\left| x_{i}- x_{j}\right|}{2 n^{2} \bar{x}}+\frac{\sum_{i=1}^{n} \sum_{j=1}^{n}\left(1-w_{i, j}^{A}\right)\left| x_{i}- x_{j}\right|}{2 n^{2} \bar{x}}\]
<p>where \(w_{i, j}^{A}\) is an element of the binary spatial adjacency matrix.
The Spatial Gini index can be interpreted as follows: as positive spatial autocorrelation increases, the second term in the equation above increases relative to the first, since geographically adjacent values will tend to take on similar values. On the contrary, negative spatial autocorrelation will cause the opposite decomposition, since the differences between non-neighbours will tend to be smaller than those between geographical neighbours. In either case, this offers the possibility to quantify the relative contributions of these two terms. The results obtained from this approach can further be tested for statistical significance by using random spatial permutations to obtain a sampling distribution under the null hypothesis that the variable of interest is randomly distributed in space.</p>
<p>In essence, we are interested in finding how much of the Gini coefficient is due to non-neighbour heterogeneity. To achieve this, we use the non-neighbour term in the Gini decomposition above as a statistic to test for spatial autocorrelation:</p>
\[G I_{2}=\frac{\sum_{i=1}^{n} \sum_{j=1}^{n}\left(1-w_{i, j}^{A}\right)\left| x_{i} - x_{j}\right|}{2 n^{2} \bar{x}}\]
<p>This expression can be interpreted as the portion of overall heterogeneity associated with non-neighbour pairs of grid cells. Inference on this statistic is carried out by computing a pseudo p-value, comparing the \(GI_2\) obtained from the observed data to the distribution of \(GI_2\) values obtained from random spatial permutations. It should be noted that this inference based on random spatial permutations concerns the spatial decomposition of the Gini coefficient given by the expression above, and not the value of the Gini coefficient itself.</p>
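<p>A minimal sketch of this decomposition and the permutation test (the function and variable names are mine; a binary adjacency matrix \(W\) is assumed to have been built beforehand, e.g. from a neighbourhood radius over the grid-cell centroids):</p>

```python
import numpy as np

def spatial_gini(x, W):
    """Split the Gini coefficient into neighbour and non-neighbour terms.

    x : values per grid cell
    W : binary adjacency matrix, W[i, j] = 1 if cells i and j are neighbours
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    D = np.abs(x[:, None] - x[None, :])      # |x_i - x_j| for all pairs
    denom = 2 * n**2 * x.mean()
    gi_neigh = (W * D).sum() / denom         # neighbour component
    # the diagonal of (1 - W) is harmless here, since D's diagonal is zero
    gi_non = ((1 - W) * D).sum() / denom     # non-neighbour component, GI_2
    return gi_neigh, gi_non                  # they sum to the plain Gini

def gi2_pseudo_pvalue(x, W, n_perm=999, seed=0):
    """Pseudo p-value: share of random reshufflings with GI_2 at least as large."""
    rng = np.random.default_rng(seed)
    obs = spatial_gini(x, W)[1]
    perm = [spatial_gini(rng.permutation(x), W)[1] for _ in range(n_perm)]
    return (1 + sum(g >= obs for g in perm)) / (1 + n_perm)

# toy example: four cells on a ring, with all the excess demand in one cell
x = np.array([9.0, 1.0, 1.0, 1.0])
W = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]])
gi_neigh, gi_non = spatial_gini(x, W)
```

<p>The \(+1\) terms in the pseudo p-value are the usual correction counting the observed configuration among the permutations. Rebuilding \(W\) for increasing neighbourhood radii traces out a decomposition profile like those computed for London.</p>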
<p>Following the described approach, we proceed to computing the spatial decompositions of the Gini coefficient, varying the neighbourhood radius in the adjacency matrix from 0 (original Gini) to 6 kilometers:</p>
<p><img src="https://lexparsimon.github.io/images/gini/Ginis.jpg" alt="London parking demand spatial Gini" /></p>
<p>The random spatial permutation approach yielded a statistically significant spatial decomposition (p = 0.01). From the plot we can see that as the neighbourhood radius increases, the inequality due to non-neighbour parking demand values decreases, since the growing neighbourhood captures more and more of the inequality. What is interesting, however, is that the observed value distribution and the randomized one have similar spatial Gini profiles (A and D in the plot), while the two reshufflings with Gaussian distributions of the parking values (B and C) display the exact same profiles, which decline more slowly than those of A and D. This is completely expected, since in a Gaussian decay the decline is “smooth”, and thus increasing the radius does not make the neighbourhood capture as much diversity; the inequality associated with the non-neighbour component therefore remains relatively high.</p>
<h3 id="spreading-index">Spreading index</h3>
<p>Despite their informative relevance, the Gini coefficient and its spatial variant rely on the mean \(\bar{x}\), which, under fat-tailed distributions - as many socio-economic variables tend to be - may be undefined. In such cases, the Gini coefficient <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3005184">cannot be reliably estimated</a> with non-parametric methods, and a downward bias emerges under fat tails.</p>
<p>Another downside of measuring the heterogeneity of parking demand with the Gini approach is that it does not offer the possibility to study the spatial arrangement of the “hotspots” - locations with very large demand. The hotspots are defined as the grid cells with a parking demand above a certain threshold \(\bar{x^{\star}}\). The intuitive first choice of threshold would be the city-wide average demand. However, this is often too low a threshold, and a better approach <a href="https://www.nature.com/articles/srep05276">has been proposed</a>. Once the threshold has been chosen and the hotspots identified as cells with parking demand values larger than \(\bar{x^{\star}}\), we can use the <a href="https://arxiv.org/abs/1804.00855">recently proposed</a> spreading index: the ratio between the average distance between hotspots and the average distance between all city locations, the latter serving as a measure of city size:</p>
\[\eta\left(x^{\star}\right) = \frac{\frac{1}{N\left(x^{\star}\right)} \sum_{i, j} d(i, j) 1_{\left(x_{i}>x^{\star}\right)} 1_{\left(x_{j}>x^{\star}\right)}}{\frac{1}{N} \sum_{i, j} d(i, j)}\]
<p>where \(N(x^{\star})\) is the number of pairwise distances between grid cells with a parking demand greater than \(\bar{x^{\star}}\), \(N\) is the number of pairwise distances between all grid cells covering the city, \(d(i,j)\) is the distance between cell \(i\) and cell \(j\), and \(1_{\left(x_{i}>x^{\star}\right)}\) is the indicator function identifying the cells with parking demand greater than \(\bar{x^{\star}}\). The spreading index is essentially the average distance between cells with \(x_{i}>x^{\star}\), divided by the average distance between all city cells. If the cells with large parking demand are spread across the city, this ratio will be large. Conversely, if the high-demand cells are concentrated close to each other, as in a monocentric city, this ratio will be small.</p>
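<p>A sketch of the spreading index (names are illustrative; grid-cell centroids are assumed as coordinate pairs, and the zero self-distances are kept in both averages, matching the double sums above):</p>

```python
import numpy as np

def spreading_index(coords, x, x_star):
    """eta(x*): average distance between hotspot cells over average city distance."""
    coords = np.asarray(coords, dtype=float)
    x = np.asarray(x, dtype=float)
    # full pairwise distance matrix between grid-cell centroids
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    hot = x > x_star                 # the indicator 1_(x_i > x*)
    if hot.sum() < 2:
        raise ValueError("need at least two hotspot cells")
    return d[np.ix_(hot, hot)].mean() / d.mean()

# toy example: two pairs of cells 10 units apart
coords = np.array([[0, 0], [0, 1], [10, 0], [10, 1]])
x_mono = np.array([5.0, 5.0, 0.0, 0.0])   # hotspots next to each other
x_poly = np.array([5.0, 0.0, 5.0, 0.0])   # hotspots far apart
eta_mono = spreading_index(coords, x_mono, 1.0)
eta_poly = spreading_index(coords, x_poly, 1.0)
```

<p>As expected, the monocentric arrangement yields the smaller index (here roughly 0.10 vs. 0.95 for the polycentric one).</p>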
<p>Instead of choosing one particular threshold value, we will set it as a parameter and see how the <em>spreading index</em> behaves as a function of the threshold \(\bar{x^{\star}}\) for the four types of spatial arrangements.</p>
<p><img src="https://lexparsimon.github.io/images/gini/etas.jpg" alt="London parking demand spreading index" /></p>
<p>As we can see from the plot, the completely random reshuffling (<strong>D</strong>) displays the highest <em>spreading index</em> profile, followed by the observed data (<strong>A</strong>). The two-peak Gaussian reshuffling (<strong>C</strong>) follows next, with the monocentric Gaussian reshuffling profile dropping rapidly as the threshold increases.
These four types of <em>spreading index</em> profiles make up a more or less complete classification of broad mono- versus poly-centric structures to be found in the spatial arrangements of socio-economic quantities in cities. A monocentric spatial configuration will result in a rapid decline of the profile and an overall low <em>spreading index</em>, while a polycentric configuration will have an overall high <em>spreading index</em>.</p>
<p>In the use case of working with the spatial distribution of parking demand in London, we see that it has hotspots spread across the city, making for a polycentric spatial structure.</p>
<h2 id="conclusion">Conclusion</h2>
<p>In the coming era of rich data streams from a myriad of sources in cities, it becomes ever more important to devise and apply simple indicators capturing and providing meaningful information to the urban planner and policy maker. In this post, we have discussed the Gini coefficient as a measure of heterogeneity of a distribution of values, shown its shortcomings with a simple trick, and presented methods for complementing it with other metrics capable of capturing spatial information.</p>
<p>The jupyter notebook with the code for this post can be found <a href="https://github.com/lexparsimon/Urban-Data-Science/blob/master/Gini%20coefficient.ipynb">here</a>.</p>Gevorg Yeghikyangevorg.yeghikyan@sns.itNote on how the Gini coefficient is agnostic to space and how to fix it.