Hover over (or tap, on mobile) a red marker to reveal a tooltip listing the graduates practicing in that city. A circle next to a name indicates an academic position.

This is an interactive map showing where alumni who completed Washington University in St. Louis's Otolaryngology Residency Training program between 1990 and 2017 currently practice. Alumni are grouped by city: those in nearby cities share a tooltip, ordered by graduation date. Those with an academic rank are indicated by a colored circle to the left of their name in the tooltip.

Resources:
  • Tooltips produced using the tippy.js plugin
  • General map, bubbles, and visualization produced with d3.js, everyone's favorite visualization library
  • Mapping each zip code to its nearest city was done with the turf.js geospatial analysis library. Given a collection of points, group, turf.nearestPoint(p, group) returns the member of that collection closest to the point of interest p
  • This awesome article from Dudley Storey on making SVG-based visualizations more responsive

Trickiest Part

Each graduate's current location came as a zip code in an Excel spreadsheet. Because zip codes carry none of the ambiguity of street addresses, converting them to lat/lon pairs is fairly reliable with any common geocoder (Nominatim/OSM, the Google or Bing geocoding APIs, etc.).
However, I soon realized that the tricky part was the distribution of the grads' locations. I wanted only one tooltip per city (not per grad), so the goal was to bin graduates who are close together (in the same city), but not those who are far apart. That meant detecting whether a lat/lon was in or near a large city, so it could be combined with others nearby and markers would not stack on top of each other in cities.

Stacking is a problem for a few reasons. It:
  • Prevents the user from accessing markers at the bottom of the stack. The list can always be sorted to plot smaller markers on top and larger ones on the bottom (improving access to small markers), but it's still tough to reach the bottom marker and its tooltip, and sorting doesn't solve the next problem.
  • Gives a false impression of distribution. A stack of many markers on the same point will be about the same size as one marker by itself; likely only the opacity will differ, depending on presentation. This undermines the cartographer's ability to communicate accurately how the data is distributed spatially, and in what quantities.
  • Makes the map seem cluttered and complicated. On mobile devices this is even more of a problem: stacked and clustered areas can become 'dead zones', where users don't even attempt to access the data because it's just too messy.
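
The sorting workaround mentioned in the first bullet could look roughly like this. It is only a sketch: the marker records and the `count` field are made up for illustration, and it relies on the fact that SVG paints elements in document order, so whatever is drawn last ends up on top.

```javascript
// Hypothetical marker records; `count` stands in for whatever drives marker size.
const markers = [
  { city: "A", count: 2 },
  { city: "B", count: 9 },
  { city: "C", count: 5 },
];

// Return a copy sorted largest-first. Drawing in this order puts the
// smallest markers last, and therefore on top of the larger ones.
function drawOrder(list) {
  return [...list].sort((a, b) => b.count - a.count);
}

const ordered = drawOrder(markers);
console.log(ordered.map((m) => m.city).join(","));
```

This helps small markers stay reachable, but markers at the exact same point still hide each other, which is why binning was needed.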

Solution

My approach was to create two lists of cities: one for the largest ones (top 50 in the US), and one for medium and small ones. The idea is to bin grads into cities known to be big first, then fall back to a small city if needed.

  1. A graduate is first matched to the nearest city in the large city list. If they're within 100 km of it, they're grouped into that city's tooltip.
  2. If they're more than 100 km away, they're probably not in the large city or its suburbs, and a smaller city is probably more accurate. They're then matched to the nearest city in the small city list, and are included in that city's tooltip.
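
The two steps above can be sketched in plain JavaScript as follows. This is not the code behind the map: a hand-rolled haversine distance stands in for turf.nearestPoint so the example runs without dependencies, and the function names, record shapes, and approximate coordinates are all assumptions made for illustration.

```javascript
// Great-circle distance in km between two [lon, lat] pairs (haversine formula).
function haversineKm([lon1, lat1], [lon2, lat2]) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const R = 6371; // mean Earth radius in km
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * R * Math.asin(Math.sqrt(a));
}

// Closest city to `coord` and its distance; null if `cities` is empty.
function nearestCity(coord, cities) {
  let best = null;
  for (const city of cities) {
    const km = haversineKm(coord, city.coord);
    if (best === null || km < best.km) best = { city, km };
  }
  return best;
}

// Two-tier binning: try the large city list first, and fall back to the
// small city list when the nearest large city is beyond the cutoff.
// Assumes the small city list is non-empty.
function binGraduates(grads, largeCities, smallCities, cutoffKm = 100) {
  const bins = new Map();
  for (const grad of grads) {
    const large = nearestCity(grad.coord, largeCities);
    const match =
      large !== null && large.km <= cutoffKm
        ? large
        : nearestCity(grad.coord, smallCities);
    const key = match.city.name;
    if (!bins.has(key)) bins.set(key, []);
    bins.get(key).push(grad);
  }
  // Order each tooltip's graduates by graduation year.
  for (const members of bins.values()) members.sort((a, b) => a.year - b.year);
  return bins;
}

// Illustrative data only; coordinates are approximate [lon, lat].
const largeCities = [
  { name: "St. Louis", coord: [-90.1994, 38.627] },
  { name: "Chicago", coord: [-87.6298, 41.8781] },
];
const smallCities = [
  { name: "Columbia", coord: [-92.3341, 38.9517] },
  { name: "Springfield", coord: [-89.6501, 39.7817] },
];
const grads = [
  { name: "Grad A", year: 2005, coord: [-90.3237, 38.6426] }, // St. Louis suburb
  { name: "Grad B", year: 1995, coord: [-90.25, 38.63] },     // St. Louis
  { name: "Grad C", year: 2010, coord: [-92.33, 38.95] },     // mid-Missouri
  { name: "Grad D", year: 2001, coord: [-87.6877, 42.0451] }, // Chicago suburb
];

const bins = binGraduates(grads, largeCities, smallCities);
for (const [city, members] of bins) {
  console.log(city, members.map((g) => g.name));
}
```

With this sample data, the two St. Louis-area graduates share one bin, the Chicago-suburb graduate lands in Chicago, and the mid-Missouri graduate, well over 100 km from any large city, falls through to the small city list.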

The advantage of this approach is that it allows fine-tuning of which cities are binned first, establishing a sort of 'binning priority'. This flexibility was important for special cases like St. Louis, which has a large number of grads and needed two tooltips and markers: one for academics at WashU and one for the rest of St. Louis. To accomplish this, I simply moved a St. Louis suburb into the large city list, which manually forced a second marker, and a second bin of graduates, to appear.