Visualization for Areal Unit Data

Spring 2023

Author

Serge Rey

Published

February 23, 2023

Statistical Visualization of Areal Unit Data

Geovisualization

Choropleths

Areal Unit Data

import geopandas
import libpysal

south = libpysal.examples.load_example('South')

libpysal.examples.explain('South')

south_gdf = geopandas.read_file(south.get_path('south.shp'))

south_gdf.plot()

import seaborn

seaborn.displot(south_gdf, x='HR60')

south_gdf.explore(column='HR60')

south_gdf.HR60.describe()

count    1412.000000
mean        7.292144
std         6.421018
min         0.000000
25%         3.213471
50%         6.245125
75%         9.956272
max        92.936803
Name: HR60, dtype: float64

south_gdf.plot(column='HR60')

south_gdf.plot(column='HR60', scheme='Quantiles')

south_gdf.plot(column='HR60', scheme='Quantiles', legend=True)

Classification Schemes

\[c_j \lt y_i \le c_{j+1} \forall y_i \in C_j\]

where \(y_i\) is the value of the attribute at location \(i\), \(j\) is a class index, \(c_j\) is the lower bound of interval \(j\), \(c_{j+1}\) is its upper bound, and \(C_j\) is the set of observations assigned to class \(j\).
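The rule above can be sketched in a few lines of plain Python. This is only an illustration: `assign_class` and the break values are hypothetical, and the first class is treated as closed on the left so that it includes the minimum.

```python
from bisect import bisect_left


def assign_class(value, upper_bounds):
    """Return the index j of the class whose interval (c_j, c_{j+1}] holds value.

    upper_bounds must be sorted in increasing order; its last entry is the maximum.
    """
    # bisect_left returns the position of the first upper bound >= value,
    # so a value equal to a bound lands in the class that the bound closes.
    return bisect_left(upper_bounds, value)


# Hypothetical upper bounds for three classes: [min, 2.5], (2.5, 5.0], (5.0, 10.0]
upper_bounds = [2.5, 5.0, 10.0]
classes = [assign_class(v, upper_bounds) for v in (0.0, 2.5, 2.6, 10.0)]
print(classes)  # [0, 0, 1, 2]
```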

import mapclassify

mapclassify.Quantiles(south_gdf.HR60)

Quantiles

   Interval      Count
----------------------
[ 0.00,  2.50] |   283
( 2.50,  5.10] |   282
( 5.10,  7.62] |   282
( 7.62, 10.98] |   282
(10.98, 92.94] |   283

mapclassify.Quantiles(south_gdf.HR60, k=10)

Quantiles

   Interval      Count
----------------------
[ 0.00,  0.00] |   180
( 0.00,  2.50] |   103
( 2.50,  3.93] |   141
( 3.93,  5.10] |   141
( 5.10,  6.25] |   141
( 6.25,  7.62] |   141
( 7.62,  9.19] |   141
( 9.19, 10.98] |   141
(10.98, 14.31] |   141
(14.31, 92.94] |   142
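Quantile breaks place roughly the same number of observations in each class. The sketch below computes them with the standard library, using linear interpolation between order statistics; `quantile_breaks` is a hypothetical helper, and mapclassify's internals may differ in detail.

```python
from statistics import quantiles


def quantile_breaks(values, k):
    """Upper bounds of k quantile classes (the last bound is the maximum)."""
    # quantiles() returns the k - 1 interior cut points.
    cuts = quantiles(values, n=k, method="inclusive")
    return cuts + [max(values)]


values = list(range(1, 11))  # toy data: 1, 2, ..., 10
breaks = quantile_breaks(values, 5)
print(breaks)
```

When many observations share a value, cut points can coincide. This is why the decile table above has a class [0.00, 0.00] holding the 180 zero values, and why the class counts are not all equal.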

mapclassify.EqualInterval(south_gdf.HR60, k=10)

EqualInterval

   Interval      Count
----------------------
[ 0.00,  9.29] |  1000
( 9.29, 18.59] |   358
(18.59, 27.88] |    39
(27.88, 37.17] |     8
(37.17, 46.47] |     4
(46.47, 55.76] |     2
(55.76, 65.06] |     0
(65.06, 74.35] |     0
(74.35, 83.64] |     0
(83.64, 92.94] |     1
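Equal-interval breaks divide the range of the data into k classes of the same width. As a minimal sketch (the helper name `equal_interval_breaks` is hypothetical), the bounds in the table above can be recovered from the minimum and maximum of HR60 reported by `describe()`:

```python
def equal_interval_breaks(minimum, maximum, k):
    """Upper bounds of k classes of equal width spanning [minimum, maximum]."""
    width = (maximum - minimum) / k
    return [minimum + width * j for j in range(1, k + 1)]


# Minimum and maximum of HR60 from the describe() output
breaks = equal_interval_breaks(0.0, 92.936803, 10)
print([round(b, 2) for b in breaks])
```

Because the width depends only on the extremes, a single outlier (here the maximum of 92.94) pushes most observations into the first two classes.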

mapclassify.MaximumBreaks(south_gdf.HR60, k=10)

MaximumBreaks

   Interval      Count
----------------------
[ 0.00, 29.42] |  1400
(29.42, 30.74] |     1
(30.74, 33.40] |     1
(33.40, 35.94] |     1
(35.94, 39.00] |     4
(39.00, 43.29] |     1
(43.29, 48.96] |     1
(48.96, 52.69] |     1
(52.69, 73.12] |     1
(73.12, 92.94] |     1
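Maximum breaks places class boundaries in the largest gaps between consecutive sorted values. The following is a sketch of that idea under the assumption that each break sits at the midpoint of a gap; `maximum_breaks` is a hypothetical helper, not mapclassify's implementation.

```python
def maximum_breaks(values, k):
    """Upper bounds of k classes, split at the k - 1 largest gaps."""
    xs = sorted(values)
    # Pair each gap with the index of the value on its left.
    gaps = [(xs[i + 1] - xs[i], i) for i in range(len(xs) - 1)]
    largest = sorted(gaps, reverse=True)[: k - 1]
    # Put each break halfway across its gap, then restore increasing order.
    mids = sorted((xs[i] + xs[i + 1]) / 2 for _, i in largest)
    return mids + [xs[-1]]


print(maximum_breaks([1, 2, 3, 10, 11, 20], 3))  # [6.5, 15.5, 20]
```

With a long right tail, the largest gaps all lie among the extreme values, which is why the table above puts 1,400 counties in the first class.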

mapclassify.FisherJenks(south_gdf.HR60, k=10)

FisherJenks

   Interval      Count
----------------------
[ 0.00,  1.71] |   216
( 1.71,  4.45] |   278
( 4.45,  7.08] |   287
( 7.08, 10.02] |   288
(10.02, 13.59] |   176
(13.59, 19.60] |   121
(19.60, 28.77] |    34
(28.77, 40.74] |     8
(40.74, 53.30] |     3
(53.30, 92.94] |     1
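Fisher-Jenks chooses the breaks that minimize the within-class sum of squared deviations from the class means. The sketch below solves that problem with a plain dynamic program. It is meant for small inputs and is not the optimized routine mapclassify uses; `fisher_jenks_breaks` is a hypothetical name.

```python
def fisher_jenks_breaks(values, k):
    """Upper bounds of the k classes that minimize within-class squared deviation."""
    xs = sorted(values)
    n = len(xs)
    # Prefix sums give the squared deviation of any run xs[i:j] in O(1).
    s1, s2 = [0.0], [0.0]
    for x in xs:
        s1.append(s1[-1] + x)
        s2.append(s2[-1] + x * x)

    def sse(i, j):
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[c][j]: best cost of splitting the first j values into c classes.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    split = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                candidate = cost[c - 1][i] + sse(i, j)
                if candidate < cost[c][j]:
                    cost[c][j] = candidate
                    split[c][j] = i
    # Walk back through the stored split points to recover the upper bounds.
    breaks, j = [], n
    for c in range(k, 0, -1):
        breaks.append(xs[j - 1])
        j = split[c][j]
    return breaks[::-1]


print(fisher_jenks_breaks([1, 2, 3, 10, 11, 12, 50, 51], 3))  # [3, 12, 51]
```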

mapclassify.BoxPlot(south_gdf.HR60)

BoxPlot

   Interval      Count
----------------------
( -inf, -6.90] |     0
(-6.90,  3.21] |   353
( 3.21,  6.25] |   353
( 6.25,  9.96] |   353
( 9.96, 20.07] |   311
(20.07, 92.94] |    42
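The box plot classes are built from the quartiles and two fences placed 1.5 interquartile ranges beyond the first and third quartiles. As a sketch (the helper `box_plot_breaks` and its `hinge` argument are named here for illustration), the bounds above follow from the quartiles reported by `describe()`:

```python
def box_plot_breaks(q1, median, q3, maximum, hinge=1.5):
    """Upper bounds of the six box plot classes."""
    iqr = q3 - q1
    lower_fence = q1 - hinge * iqr  # values at or below this are low outliers
    upper_fence = q3 + hinge * iqr  # values above this are high outliers
    return [lower_fence, q1, median, q3, upper_fence, maximum]


# Quartiles and maximum of HR60 from the describe() output
breaks = box_plot_breaks(3.213471, 6.245125, 9.956272, 92.936803)
print([round(b, 2) for b in breaks])
```

The lower fence is negative while the minimum rate is 0, so the first class is empty. The 42 counties above the upper fence of 20.07 are the high outliers.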

mapclassify.HeadTailBreaks(south_gdf.HR60)

HeadTailBreaks

   Interval      Count
----------------------
[ 0.00,  7.29] |   802
( 7.29, 12.41] |   405
(12.41, 18.18] |   147
(18.18, 26.87] |    40
(26.87, 38.73] |    13
(38.73, 56.98] |     4
(56.98, 92.94] |     1
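Head/tail breaks suits heavy-tailed data: split at the mean, keep the values above it (the head), and repeat. The first break in the table above, 7.29, is the mean of HR60. A minimal sketch of the recursion follows, written as a loop; `head_tail_breaks` is a hypothetical helper rather than mapclassify's code.

```python
def head_tail_breaks(values):
    """Break at the mean, then repeat on the values above the mean."""
    breaks = []
    current = list(values)
    while True:
        mean = sum(current) / len(current)
        breaks.append(mean)
        if len(set(current)) <= 1:
            # All remaining values are equal, so the mean is the maximum.
            break
        current = [v for v in current if v > mean]
    return breaks


print(head_tail_breaks([1, 1, 1, 1, 2, 2, 4, 8, 16]))  # [4.0, 12.0, 16.0]
```

The number of classes is not chosen in advance. It is determined by how many times the head can be split.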

Map Customization

Legends
Color Schemes

Legends

south_gdf[['STATE_NAME', 'HR60', 'HR90']].head()

	STATE_NAME	HR60	HR90
0	West Virginia	1.682864	0.946083
1	West Virginia	4.607233	1.234934
2	West Virginia	0.974132	2.621009
3	West Virginia	0.876248	4.461577
4	Delaware	4.228385	6.712736

south_gdf['increased'] = south_gdf.HR90 > south_gdf.HR60

south_gdf.plot(column='increased', categorical=True, legend=True);

v = south_gdf.increased.map({True: 'Increased', False: 'Decreased'})

south_gdf['Increased'] = v

south_gdf.plot(column='Increased', categorical=True, legend=True);
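Because `increased` is a strict comparison, a county whose rate did not change (for example, 0 in both years) is labeled 'Decreased'. If that distinction matters, a three-way label is one option. The sketch below is plain Python; `label_change` and the 'Unchanged' label are hypothetical and not part of the original notebook.

```python
def label_change(hr60, hr90):
    """Label the change in homicide rate between 1960 and 1990."""
    if hr90 > hr60:
        return 'Increased'
    if hr90 < hr60:
        return 'Decreased'
    return 'Unchanged'


# First and third rows of the table above, plus a county with no change
print(label_change(1.682864, 0.946083))  # Decreased
print(label_change(0.974132, 2.621009))  # Increased
print(label_change(0.0, 0.0))            # Unchanged
```

Applied row by row, such a function would produce a column that can be passed to `plot` with `categorical=True` in the same way as `Increased`.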

south_gdf.plot(column='Increased', categorical=True, legend=True,
               legend_kwds={'bbox_to_anchor': (1.3, 1)});

south_gdf.plot(column='Increased', categorical=True, legend=True,
               legend_kwds={'bbox_to_anchor': (1.3, 1),
                            'title': 'Homicide Rates 1960-1990'});

south_gdf.plot(column='Increased', categorical=True, legend=True,
               legend_kwds={'bbox_to_anchor': (0, 1),
                            'title': 'Homicide Rates 1960-1990'});

south_gdf.plot(column='Increased', categorical=True, legend=True,
               legend_kwds={'bbox_to_anchor': (-0.1, 1),
                            'title': 'Homicide Rates 1960-1990'});

Color Schemes

For more information on the available colormaps, see the matplotlib documentation.

Sequential Color Schemes

south_gdf.plot(column='HR60', scheme='Quantiles', legend=True,
               legend_kwds={'bbox_to_anchor': (1.3, 1)},
               cmap='Blues');

south_gdf.plot(column='HR60', scheme='Quantiles', legend=True,
               legend_kwds={'bbox_to_anchor': (1.3, 1)},
               cmap='Greens');

south_gdf.plot(column='HR60', scheme='Quantiles', legend=True,
               legend_kwds={'bbox_to_anchor': (1.3, 1)},
               cmap='YlGnBu');

Diverging Color Scheme

south_gdf.plot(column='Increased', categorical=True, legend=True,
               legend_kwds={'bbox_to_anchor': (-0.1, 1),
                            'title': 'Homicide Rates 1960-1990'},
               cmap='coolwarm');

south_gdf.plot(column='Increased', categorical=True, legend=True,
               legend_kwds={'bbox_to_anchor': (-0.1, 1),
                            'title': 'Homicide Rates 1960-1990'},
               cmap='bwr');

Qualitative Color Scheme

south_gdf.plot(column='STATE_NAME', categorical=True)

south_gdf.plot(column='STATE_NAME', categorical=True, legend=True)

south_gdf.plot(column='STATE_NAME', categorical=True, legend=True,
               legend_kwds={'bbox_to_anchor': (0, 1)})

import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_axes([0, 0, 1, 1])
ax.axis('off')

south_gdf.plot(column='STATE_NAME', categorical=True, legend=True,
               legend_kwds={'bbox_to_anchor': (0, 1)}, ax=ax);

Comparisons (Sequential)

south_gdf.plot(column='HR60', scheme='Quantiles', legend=True,
               legend_kwds={'bbox_to_anchor': (1.3, 1)},
               cmap='YlGnBu', k=10);

south_gdf.plot(column='HR60', scheme='MaximumBreaks', legend=True,
               legend_kwds={'bbox_to_anchor': (1.3, 1)},
               cmap='YlGnBu', k=10);

south_gdf.plot(column='HR60', scheme='FisherJenks', legend=True,
               legend_kwds={'bbox_to_anchor': (1.3, 1)},
               cmap='YlGnBu', k=10);