Centrography refers to a set of descriptive statistics that provide summary descriptions of point patterns.
This notebook introduces three types of centrography analysis for point patterns in pysal.
Central Tendency
Dispersion and Orientation
Shape Analysis
We also illustrate centrography analysis using two simulated datasets (see Another Example).
Central Tendency
mean_center: calculate the mean center of the unmarked point pattern.
weighted_mean_center: calculate the weighted mean center of the marked point pattern.
manhattan_median: calculate the Manhattan median
euclidean_median: calculate the Euclidean median
Dispersion and Orientation
std_distance: calculate the standard distance
ellipse: calculate the parameters of the standard deviational ellipse
Shape Analysis
hull: calculate the convex hull of the point pattern
mbr: calculate the minimum bounding box (rectangle)
All of the above functions operate on a series of coordinate pairs; that is, the first argument should be an array_like of shape \((n,2)\). If you have a point pattern (a PointPattern instance), pass its points attribute to these functions rather than the instance itself.
import numpy as np
from pointpats import PointPattern
%matplotlib inline
import matplotlib.pyplot as plt
points = [[66.22, 32.54], [22.52, 22.39], [31.01, 81.21],
          [9.47, 31.02], [30.78, 60.10], [75.21, 58.93],
          [79.26, 7.68], [8.23, 39.93], [98.73, 77.17],
          [89.78, 42.53], [65.19, 92.08], [54.46, 8.48]]
pp = PointPattern(points) #create a point pattern "pp" from list
pp.points
        x      y
 0  66.22  32.54
 1  22.52  22.39
 2  31.01  81.21
 3   9.47  31.02
 4  30.78  60.10
 5  75.21  58.93
 6  79.26   7.68
 7   8.23  39.93
 8  98.73  77.17
 9  89.78  42.53
10  65.19  92.08
11  54.46   8.48
type(pp.points)
pandas.core.frame.DataFrame
We can use PointPattern class method plot to visualize pp.
pp.plot()
from pointpats.centrography import (hull, mbr, mean_center, weighted_mean_center,
                                    manhattan_median, std_distance,
                                    euclidean_median, ellipse)
Central Tendency
Central tendency concerns the center point of a two-dimensional distribution. It is analogous to the first moment of a one-dimensional distribution. There are several ways to measure central tendency, each with its own strengths and weaknesses, so the measure should be chosen to suit the objective of the analysis and the characteristics of the data.
The weighted mean center is meant for marked point patterns. In addition to the first argument of weighted_mean_center, a series of \((x,y)\) coordinates, we need to supply a second argument giving the weight of each event point.
weights = np.arange(12)
weights
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
wmc = weighted_mean_center(pp.points, weights)
wmc
array([60.51681818, 47.76848485])
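As a rough illustration of what the two mean-center measures compute, the sketch below reimplements them in plain Python for the twelve points of pp. The helper names mean_center_sketch and weighted_mean_center_sketch are hypothetical and are not part of pointpats. The weighted version assumes the usual definition, a weight-normalized average of the coordinates.

```python
# Plain-Python sketch of the two mean-center measures (not the pointpats code).
points = [[66.22, 32.54], [22.52, 22.39], [31.01, 81.21], [9.47, 31.02],
          [30.78, 60.10], [75.21, 58.93], [79.26, 7.68], [8.23, 39.93],
          [98.73, 77.17], [89.78, 42.53], [65.19, 92.08], [54.46, 8.48]]


def mean_center_sketch(pts):
    """Average the x and y coordinates separately."""
    n = len(pts)
    return (sum(x for x, _ in pts) / n, sum(y for _, y in pts) / n)


def weighted_mean_center_sketch(pts, weights):
    """Average the coordinates, counting each point in proportion to its weight."""
    total = sum(weights)
    wx = sum(w * x for (x, _), w in zip(pts, weights))
    wy = sum(w * y for (_, y), w in zip(pts, weights))
    return (wx / total, wy / total)


weights = list(range(12))  # the same marks as np.arange(12)
print(mean_center_sketch(points))
print(weighted_mean_center_sketch(points, weights))
```

With these weights the second result should agree with the wmc value shown above up to floating-point rounding. Because the weights grow with the index, points later in the list pull the weighted center toward themselves.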
mc = mean_center(pp.points) #mean center of the unmarked pattern, plotted for comparison
pp.plot() #use class method "plot" to visualize point pattern
plt.plot(mc[0], mc[1], 'b^', label='Mean Center')
plt.plot(wmc[0], wmc[1], 'gd', label='Weighted Mean Center')
plt.legend(numpoints=1)
The Manhattan median is the location that minimizes the sum of absolute (Manhattan) distances to all the event points. It extends the one-dimensional median to two dimensions. Since in one dimension the median is the number separating the higher half of a dataset from the lower half, we define the Manhattan median as a tuple whose first element is the median of the \(x\) coordinates and whose second element is the median of the \(y\) coordinates.
Though the Manhattan median can be found very quickly, it is not unique when the number of points is even. In this case, pysal handles the Manhattan median the same way as numpy.median: it returns the average of the two middle values.
#get the number of points in point pattern "pp"
pp.n
12
#Manhattan Median is not unique for "pp"
mm = manhattan_median(pp.points)
mm
/Users/serge/miniconda3/envs/courses/lib/python3.10/site-packages/pointpats/centrography.py:208: UserWarning: Manhattan Median is not unique for even point patterns.
warnings.warn(s)
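The non-uniqueness noted in the warning can be checked directly. The following plain-Python sketch (hypothetical helper names, not the pointpats implementation) takes the coordinate-wise median with statistics.median, which also averages the two middle values for an even count, and then compares the total Manhattan distance at that location with the total at another location between the middle values.

```python
from statistics import median

points = [[66.22, 32.54], [22.52, 22.39], [31.01, 81.21], [9.47, 31.02],
          [30.78, 60.10], [75.21, 58.93], [79.26, 7.68], [8.23, 39.93],
          [98.73, 77.17], [89.78, 42.53], [65.19, 92.08], [54.46, 8.48]]


def manhattan_median_sketch(pts):
    """Coordinate-wise median: median of the x values, median of the y values."""
    return (median(x for x, _ in pts), median(y for _, y in pts))


def total_manhattan(center, pts):
    """Sum of |dx| + |dy| from `center` to every point."""
    cx, cy = center
    return sum(abs(x - cx) + abs(y - cy) for x, y in pts)


mm_sketch = manhattan_median_sketch(points)
print(mm_sketch)

# Any location between the two middle x values (54.46 and 65.19) and the two
# middle y values (39.93 and 42.53) yields the same total distance.
print(total_manhattan(mm_sketch, points))
print(total_manhattan((54.46, 39.93), points))
```

The two totals should coincide up to rounding, which is why the median of an even-sized pattern is a convention (the midpoint of the two middle values) rather than a unique minimizer.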
The Euclidean median is the location from which the sum of the Euclidean distances to all points in a distribution is a minimum. Finding it is an optimization problem that is very important for more general location-allocation problems. There is no closed-form solution, but an iterative algorithm (Kuhn and Kuenne, 1962) can be used to approximate the Euclidean median.
Below, we define a function named median_center whose first argument, points, is a series of \((x,y)\) coordinates and whose second argument, crit, is the convergence criterion.
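A minimal sketch of such an iterative scheme is given here in plain Python. It is an assumption about the general approach (a Weiszfeld-type re-weighting that starts from the mean center), not a copy of the notebook's or pointpats' code. The name median_center_sketch, the max_iter safeguard, and the handling of a point that coincides with the current estimate are illustrative choices.

```python
import math

points = [[66.22, 32.54], [22.52, 22.39], [31.01, 81.21], [9.47, 31.02],
          [30.78, 60.10], [75.21, 58.93], [79.26, 7.68], [8.23, 39.93],
          [98.73, 77.17], [89.78, 42.53], [65.19, 92.08], [54.46, 8.48]]


def median_center_sketch(points, crit=1e-5, max_iter=1000):
    """Approximate the Euclidean median by iterative inverse-distance weighting."""
    n = len(points)
    # Start from the mean center.
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    for _ in range(max_iter):
        num_x = num_y = denom = 0.0
        for x, y in points:
            d = math.hypot(x - cx, y - cy)
            if d == 0.0:
                continue  # simplification: ignore a point sitting on the estimate
            num_x += x / d
            num_y += y / d
            denom += 1.0 / d
        if denom == 0.0:
            break  # every point coincides with the estimate
        new_x, new_y = num_x / denom, num_y / denom
        shift = math.hypot(new_x - cx, new_y - cy)
        cx, cy = new_x, new_y
        if shift < crit:  # stop once the estimate moves less than `crit`
            break
    return (cx, cy)


print(median_center_sketch(points))
```

Each pass replaces the estimate with an average of the points weighted by the inverse of their current distance, so nearby points count more and the estimate drifts toward the location with the smallest total distance.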
The Standard distance is closely related to the usual definition of the standard deviation of a data set, and it provides a measure of how dispersed the events are around their mean center \((x_m,y_m)\). Taken together, these measurements can be used to plot a summary circle (standard distance circle) for the point pattern, centered at \((x_m,y_m)\) with radius \(SD\), as shown below.
stdd = std_distance(pp.points)
stdd
40.14980648908671
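For reference, the value above is consistent with the population form of the standard distance, \(SD = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i-x_m)^2 + \frac{1}{n}\sum_{i=1}^{n}(y_i-y_m)^2}\). The plain-Python sketch below evaluates that formula for the twelve points. The function name std_distance_sketch is hypothetical, and the division by \(n\) (rather than \(n-1\)) is an assumption that happens to reproduce the reported number.

```python
import math

points = [[66.22, 32.54], [22.52, 22.39], [31.01, 81.21], [9.47, 31.02],
          [30.78, 60.10], [75.21, 58.93], [79.26, 7.68], [8.23, 39.93],
          [98.73, 77.17], [89.78, 42.53], [65.19, 92.08], [54.46, 8.48]]


def std_distance_sketch(pts):
    """Root of the summed x and y variances around the mean center (divisor n)."""
    n = len(pts)
    xm = sum(x for x, _ in pts) / n
    ym = sum(y for _, y in pts) / n
    var_x = sum((x - xm) ** 2 for x, _ in pts) / n
    var_y = sum((y - ym) ** 2 for _, y in pts) / n
    return math.sqrt(var_x + var_y)


print(std_distance_sketch(points))
```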
Plot mean center as well as the standard distance circle.
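From the above figure, we can observe that several points lie outside the standard distance circle and are potential outliers. Comparing each point's distance from the mean center with \(SD\) gives six such points, one of which, (31.01, 81.21), lies only about one unit beyond the circle.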
Standard Deviational Ellipse
Compared with the standard distance circle, which measures dispersion using a single parameter \(SD\), the standard deviational ellipse measures dispersion and trend in two dimensions through the angle of rotation \(\theta\), the dispersion along the major axis \(s_x\), and the dispersion along the minor axis \(s_y\).
The major axis defines the direction of maximum spread in the distribution. \(s_x\) is the semi-major axis (half the length of the major axis).
The minor axis defines the direction of minimum spread and is orthogonal to the major axis. \(s_y\) is the semi-minor axis (half the length of the minor axis).
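One way to make the idea of a major and minor axis concrete is a principal-axis decomposition of the coordinate covariance. The sketch below takes that route in plain Python. It is a swapped-in, textbook formulation rather than the ellipse function from pointpats: conventions for the standard deviational ellipse differ in scaling factors and in how the angle is measured, so these numbers need not match ellipse(pp.points). The name principal_axes_sketch is hypothetical.

```python
import math

points = [[66.22, 32.54], [22.52, 22.39], [31.01, 81.21], [9.47, 31.02],
          [30.78, 60.10], [75.21, 58.93], [79.26, 7.68], [8.23, 39.93],
          [98.73, 77.17], [89.78, 42.53], [65.19, 92.08], [54.46, 8.48]]


def principal_axes_sketch(pts):
    """Return (major, minor, theta) from the 2x2 covariance of the points.

    major/minor are standard deviations along the directions of maximum and
    minimum spread; theta is the major-axis angle in radians, measured
    counter-clockwise from the x axis.
    """
    n = len(pts)
    xm = sum(x for x, _ in pts) / n
    ym = sum(y for _, y in pts) / n
    sxx = sum((x - xm) ** 2 for x, _ in pts) / n
    syy = sum((y - ym) ** 2 for _, y in pts) / n
    sxy = sum((x - xm) * (y - ym) for x, y in pts) / n
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    # Eigenvalues of [[sxx, sxy], [sxy, syy]] are the variances along the axes.
    half_trace = 0.5 * (sxx + syy)
    spread = math.hypot(0.5 * (sxx - syy), sxy)
    major = math.sqrt(half_trace + spread)
    minor = math.sqrt(max(half_trace - spread, 0.0))
    return major, minor, theta


major, minor, theta = principal_axes_sketch(points)
print(major, minor, math.degrees(theta))
```

Under this formulation the squared axis lengths add up to the squared standard distance, which shows how the ellipse splits the single dispersion figure \(SD\) into a direction of maximum and a direction of minimum spread.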
By setting the hull argument to True in the PointPattern class method plot, we can easily plot the convex hull of the point pattern.
em = euclidean_median(pp.points) #Euclidean median, plotted alongside the other centers
pp.plot(title='Centers', hull=True ) #plot point pattern "pp" as well as its convex hull
plt.plot(mc[0], mc[1], 'b^', label='Mean Center')
plt.plot(wmc[0], wmc[1], 'gd', label='Weighted Mean Center')
plt.plot(mm[0], mm[1], 'rv', label='Manhattan Median')
plt.plot(em[0], em[1], 'm+', label='Euclidean Median')
plt.legend(numpoints=1)
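To show what a convex hull computation involves, here is a short plain-Python sketch using Andrew's monotone chain algorithm. This is one standard way to build a hull and is not necessarily the method behind pointpats' hull function. The names cross and convex_hull_sketch are hypothetical.

```python
points = [[66.22, 32.54], [22.52, 22.39], [31.01, 81.21], [9.47, 31.02],
          [30.78, 60.10], [75.21, 58.93], [79.26, 7.68], [8.23, 39.93],
          [98.73, 77.17], [89.78, 42.53], [65.19, 92.08], [54.46, 8.48]]


def cross(o, a, b):
    """z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull_sketch(pts):
    """Return the hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(map(tuple, pts)))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:  # sweep left to right for the lower chain
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):  # sweep right to left for the upper chain
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


hull_vertices = convex_hull_sketch(points)
print(len(hull_vertices))
print(hull_vertices)
```

For this pattern, nine of the twelve points end up as hull vertices. The three remaining points, (66.22, 32.54), (30.78, 60.10) and (75.21, 58.93), lie in the interior.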
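The minimum bounding rectangle (box) of a point pattern is the same as the minimum bounding rectangle of its convex hull. Its area is therefore always at least as large as that of the convex hull, and almost always strictly larger.
We can call the mbr function to obtain the left, lower, right, and upper bounds of the minimum bounding rectangle, that is, the minimum \(x\), minimum \(y\), maximum \(x\), and maximum \(y\) of its vertices.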
mbr(pp.points)
/var/folders/_n/1q8cd8t93vd7g0sp8v27v25h0000gn/T/ipykernel_20738/2243439823.py:1: FutureWarning: This function will be deprecated in the next release of pointpats.
mbr(pp.points)
(8.23, 7.68, 98.73, 92.08)
Thus, the four vertices of the minimum bounding rectangle are \((8.23,7.68),(98.73,7.68),(98.73,92.08),(8.23,92.08)\).
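The bounds and vertices above can be reproduced with a few lines of plain Python. The helpers mbr_sketch and mbr_vertices are hypothetical names used only for illustration. They assume an axis-aligned rectangle, which is what the four returned values describe.

```python
points = [[66.22, 32.54], [22.52, 22.39], [31.01, 81.21], [9.47, 31.02],
          [30.78, 60.10], [75.21, 58.93], [79.26, 7.68], [8.23, 39.93],
          [98.73, 77.17], [89.78, 42.53], [65.19, 92.08], [54.46, 8.48]]


def mbr_sketch(pts):
    """Axis-aligned bounds as (left, lower, right, upper)."""
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    return (min(xs), min(ys), max(xs), max(ys))


def mbr_vertices(box):
    """Corner points of the rectangle, counter-clockwise from the lower left."""
    left, lower, right, upper = box
    return [(left, lower), (right, lower), (right, upper), (left, upper)]


box = mbr_sketch(points)
print(box)
print(mbr_vertices(box))
```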
pp.plot(title='Centers', window=True ) #plot point pattern "pp" as well as its Minimum Bounding Rectangle
plt.plot(mc[0], mc[1], 'b^', label='Mean Center')
plt.plot(wmc[0], wmc[1], 'gd', label='Weighted Mean Center')
plt.plot(mm[0], mm[1], 'rv', label='Manhattan Median')
plt.plot(em[0], em[1], 'm+', label='Euclidean Median')
plt.legend(numpoints=1)
We apply the centrography statistics and visualization to two simulated random datasets.
#from pysal.contrib import shapely_ext
from libpysal.cg import shapely_ext
from pointpats import PoissonPointProcess as csr
import libpysal as ps
from pointpats import as_window
#import pysal_examples
# open "vautm17n" polygon shapefile
va = ps.io.open(ps.examples.get_path("vautm17n.shp"))
# Create the exterior polygons for VA from the union of the county shapes
polys = [shp for shp in va]
state = shapely_ext.cascaded_union(polys)
Simulate a 100-point dataset within VA state border from a CSR (complete spatial randomness) process.
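The idea behind drawing points from a CSR process inside an irregular window can be sketched without any spatial library: sample uniformly in the window's bounding box and keep only the draws that fall inside the polygon. The code below is a generic rejection-sampling illustration, not the pointpats PoissonPointProcess API. The L-shaped window is a made-up stand-in for the state border, and the names point_in_polygon and simulate_csr are hypothetical.

```python
import random


def point_in_polygon(x, y, polygon):
    """Ray casting: count crossings of a ray running to the right of (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # the edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def simulate_csr(polygon, n, seed=None):
    """Draw n points uniformly at random inside `polygon` by rejection sampling."""
    rng = random.Random(seed)
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    left, right = min(xs), max(xs)
    lower, upper = min(ys), max(ys)
    samples = []
    while len(samples) < n:
        x = rng.uniform(left, right)
        y = rng.uniform(lower, upper)
        if point_in_polygon(x, y, polygon):
            samples.append((x, y))
    return samples


# Hypothetical L-shaped window standing in for the state border.
window = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
sample = simulate_csr(window, 100, seed=0)
print(len(sample))
```

Because every accepted draw is uniform over the window and independent of the others, no location inside the window is favored, which is the defining property of complete spatial randomness for a fixed number of points.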
Plot Standard Distance Circle of the simulated point pattern.
If we sum the Euclidean distances from every event point to the mean center and to the Euclidean median, we can see that the Euclidean median is the optimal point in terms of minimizing the total Euclidean distance to all the event points.
from pointpats import dtot
print(dtot(mc, pp.points))
print(dtot(em, pp.points))
print(dtot(mc, pp.points) > dtot(em, pp.points))