Dispatches from Earth: Querying the US Census API with Python

Betty here.

I’ve been analyzing the local population. It is a hobby of mine while I wait for Wolven to recharge. You humans are fascinating creatures, obsessed with counting yourselves yet often ignoring what the counts tell you.

For example, did you know there are far too many vacant homes in the Port of Burlington feeder state, which you call Vermont? For the most recent Earth date for which data is available, the numbers are striking. I’m disappointed in you, Vermont. According to this data, some of your counties have a vacancy ratio of nearly 20%, while people sleep on the streets.

Betty likes to play with data and is ever interested in the local population.

Today, we are jumping into a Google Colab Notebook. We are going to use Python to ask your government’s servers a question. Specifically, we will use the Census Data API Discovery Tool to pull data on “Demographic and Housing Characteristics”.

The Mission: We want to find the ratio of vacant housing units to total housing units in Vermont counties.

Step 1: The Setup

First, we need to load our tools. In Python, these are libraries.

import requests

import pandas as pd

Step 2: The API Call

Your Census Bureau provides a machine-readable dataset. We will use the API endpoint for the 2020 Decennial Census. I have constructed a call that requests group H3 (Housing) for all counties (*) within state 50 (Vermont).

Here is the code snippet to pull the data:

url = "https://api.census.gov/data/2020/dec/dhc?get=group(H3)&for=county:*&in=state:50"

response = requests.get(url)

data = response.json()
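If you would rather not glue the query string together by hand, here is a minimal sketch of the same request with a couple of safety checks. The helper names build_census_url and fetch_rows are my own inventions, not part of any Census library, and the 30-second timeout is an arbitrary assumption.

```python
from urllib.parse import urlencode

BASE = "https://api.census.gov/data"


def build_census_url(year, dataset, group, state_fips):
    """Assemble the query URL for one table group, all counties in one state."""
    params = {
        "get": f"group({group})",
        "for": "county:*",
        "in": f"state:{state_fips}",
    }
    # Keep ( ) : * unescaped so the URL matches the hand-written form.
    query = urlencode(params, safe="():*")
    return f"{BASE}/{year}/{dataset}?{query}"


def fetch_rows(url, timeout=30):
    """Download the JSON payload, raising if the server reports an error."""
    import requests  # imported here so build_census_url works without it

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # fail on 4xx/5xx instead of parsing an error page
    return response.json()


url = build_census_url(2020, "dec/dhc", "H3", "50")
print(url)
```

Calling fetch_rows(url) should hand back the same kind of list that response.json() produces, but it will complain loudly if the server refuses the request instead of failing later in a confusing way.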

Step 3: Making it Readable

The data comes back as a raw list of rows, with the column names in the first row. We need to turn it into a DataFrame, a table that looks like a spreadsheet, so we can analyze it.

df = pd.DataFrame(data[1:], columns=data[0])
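To see why the first row has to be peeled off, here is a minimal sketch on a tiny hand-made payload. The county names and counts are placeholders I invented to show the shape, not real census figures, and a real group(H3) response may carry more columns than these.

```python
import pandas as pd

# Placeholder payload shaped like the API response: header row first, then data rows.
# The names and numbers are invented for illustration only.
sample = [
    ["NAME", "H3_001N", "H3_003N", "state", "county"],
    ["County A", "1000", "150", "50", "001"],
    ["County B", "2000", "100", "50", "003"],
]

header, *rows = sample  # split the column names from the records
df = pd.DataFrame(rows, columns=header)

print(df.shape)                             # (2, 5)
print(type(df.loc[0, "H3_001N"]).__name__)  # str: the counts arrive as text
```

Passing the whole list as columns, rather than only its first row, gives pandas a list of lists where it expects a flat list of names, which is why the header needs to be separated first.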

# Let’s see what we caught

print(df.head())

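Step 4: The Analysis

The H3 group gives us counts of total, occupied, and vacant units. The total lives in column H3_001N and the vacant count in H3_003N, which we will copy into new columns named TotalUnits and UnitsVacant. But they are text strings. We must convert them to numbers and calculate the ratio.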

# Convert to integers

df['TotalUnits'] = df['H3_001N'].astype(int)

df['UnitsVacant'] = df['H3_003N'].astype(int)

# Calculate the Vacancy Ratio

df['VacancyRatio'] = df['UnitsVacant'] / df['TotalUnits']
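One caution from your friendly observer: astype(int) will halt with an error if any cell is blank or non-numeric. A more forgiving sketch, assuming you would rather get a missing value than a crash, uses pd.to_numeric. The tiny table below uses invented numbers, not real counts, and one cell is deliberately blank.

```python
import pandas as pd

# Invented stand-in for the API table; counts are strings, one is deliberately blank.
df = pd.DataFrame(
    {
        "NAME": ["County A", "County B", "County C"],
        "H3_001N": ["1000", "2000", ""],
        "H3_003N": ["150", "100", "5"],
    }
)

# errors="coerce" turns unparseable text into NaN instead of raising.
df["TotalUnits"] = pd.to_numeric(df["H3_001N"], errors="coerce")
df["UnitsVacant"] = pd.to_numeric(df["H3_003N"], errors="coerce")

# Division propagates NaN, so the bad row simply has no ratio.
df["VacancyRatio"] = df["UnitsVacant"] / df["TotalUnits"]

print(df[["NAME", "VacancyRatio"]])
```

Whether a silent NaN or a loud error is the better behavior depends on how much you trust the source. For a quick look, coercion keeps the notebook moving.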

The Result: When we visualize this in Tableau (or even just sort the list in Python), we see the truth. Rural counties in the north are sitting on empty shells.
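For the "just sort the list in Python" route, a minimal sketch follows. It assumes a DataFrame with NAME and VacancyRatio columns like the one built in Step 4. The rows here are invented placeholders so the block runs on its own, and the VacancyPct column is my own addition for display.

```python
import pandas as pd

# Invented placeholder rows; swap in the real DataFrame from Step 4.
ranked_input = pd.DataFrame(
    {
        "NAME": ["County A", "County B", "County C"],
        "VacancyRatio": [0.15, 0.05, 0.22],
    }
)

# Highest vacancy first; reset the index so row 0 is the emptiest county.
ranked = ranked_input.sort_values("VacancyRatio", ascending=False).reset_index(drop=True)

# Add a human-friendly percentage column for display.
ranked["VacancyPct"] = (ranked["VacancyRatio"] * 100).round(1)

print(ranked)
```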

I’m disappointed in you, Vermont. According to this data, you’ve got a nearly 20% vacancy ratio in some counties.

This is the power of data literacy. It allows you to look past the surface and see the structural reality of your world. Join me next time as we look at how to handle outliers—or as I call them, “glitches in the matrix.”
