Halloween Kid Counts: 2007 Results

As promised, I have gathered the Halloween 2007 kid count data and mapped it. The original data is available, and the methodology is described below. The following images are 2D representations of the data:

The Zombie Outbreak View:


(blended gradients with a count-dependent radius)

Inverse-Distance View:


(each point is calculated using an inverse-distance weighted mean of the seed points)

Nearest Neighbor View:


(each point is equal to the value of its nearest neighbor)

Explanation

Because of the dramatic difference between the highest and lowest counts, and because I was a little suspicious of the really high values, these maps are generated using the log of the counts, rather than the counts themselves. This allows minor variations on the low end to be displayed, while still illustrating the overall data.
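
To illustrate why the log helps, here is a minimal sketch comparing a linear and a log-based scaling of counts to a 0..1 color intensity. The counts are hypothetical, and the choice of base-10 logs and min/max normalization is an assumption for illustration, not necessarily how the maps above were produced. It also assumes every count is at least 1, since the log of zero is undefined.

```python
import math

# Hypothetical counts for illustration only (not the collected data).
counts = [3, 12, 45, 150, 900]

def linear_scale(values):
    """Map values to 0..1 in proportion to the raw counts."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def log_scale(values):
    """Map values to 0..1 in proportion to the log of each count.

    Assumes every count is positive.
    """
    logs = [math.log10(v) for v in values]
    lo, hi = min(logs), max(logs)
    return [(v - lo) / (hi - lo) for v in logs]

for count, lin, lg in zip(counts, linear_scale(counts), log_scale(counts)):
    print(f"{count:4d}  linear={lin:.2f}  log={lg:.2f}")
```

On a linear scale the low counts are crowded together near zero by the single large value; on the log scale they are spread across the range, so differences among them remain visible.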

It is difficult to meaningfully interpolate the data between points and to extrapolate the counts beyond the reporting areas. Any method of mapping the data will have strengths and weaknesses, and none is 'more correct' than the others. I tried many different views, and I have chosen to display the three most interesting views of the data:

  • The Zombie Outbreak View: blended gradients with a count-dependent radius. Nicely displays 'hot spots', and identifies each seed point. Does not include any interpolation for areas between seed points, or extrapolate values very far beyond each seed.
  • Inverse-Distance View: each point is calculated using an inverse-distance weighted mean of the seed points. Very similar to the Nearest Neighbor View. Presents smooth transitions between zones, and extrapolated points farther from seed points approach the mean value of the seeded area.
  • Nearest Neighbor View: each point is equal to the value of its nearest neighbor. Has sharp transitions between zones, and extrapolated points are equal to the nearest value (unlike the Inverse-Distance View).
The maps display the counts via relative color, but I have not included a numeric scale, because it would be confusing (the scale would not be linear), and the distribution of the values is more interesting than the values themselves.
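
To make the difference between the Inverse-Distance and Nearest Neighbor views concrete, here is a minimal sketch of both calculations for a single map point. The function names, the seed format (x, y, value), and the distance exponent `power=2` are assumptions for illustration; the maps above may have used different details.

```python
import math

def nearest_neighbor(x, y, seeds):
    """Return the value of the seed point closest to (x, y).

    seeds is a list of (seed_x, seed_y, value) tuples.
    """
    closest = min(seeds, key=lambda s: math.hypot(x - s[0], y - s[1]))
    return closest[2]

def inverse_distance(x, y, seeds, power=2):
    """Return the inverse-distance weighted mean of the seed values at (x, y).

    Each seed's weight is 1 / distance**power (the exponent is an assumption).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in seeds:
        d = math.hypot(x - sx, y - sy)
        if d == 0:
            return value  # exactly on a seed point: use its value directly
        w = 1.0 / d ** power
        weighted_sum += w * value
        weight_total += w
    return weighted_sum / weight_total

# Two hypothetical seed points with (log) values 1.0 and 3.0.
seeds = [(0, 0, 1.0), (10, 0, 3.0)]
print(nearest_neighbor(2, 0, seeds))        # value of the seed at (0, 0)
print(inverse_distance(2, 0, seeds))        # between the two, nearer to 1.0
print(inverse_distance(1000, 1000, seeds))  # far away: close to the mean
```

Far from every seed the distances become nearly equal, so the weights even out and the inverse-distance value drifts toward the plain mean, while the nearest-neighbor value stays pinned to whichever seed happens to be closest.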

Not all of the data collected was mapped. One data point (1000 N. Duke) was excluded because the Durham GIS doesn't include the address. Five data points were excluded because they were distant from the primary cluster (positional outliers), and there were not enough reporting addresses near them to warrant expanding the map. I apologize for omitting your data. If you can get your neighbors to add their counts next year, I promise to include them.

It would have been nice to have more data points; that would have allowed for a lot more detail (finer granularity) in the maps. Hopefully, more data points will be collected next year.

I will leave any interpretation of these maps to you.