Blog

Food Pantries in the US in 2020

Below is a map that shows the locations of food pantries in the U.S. in 2020 by county. There are over 48,500 in total. If you zoom in and click on a marker, you can see details for that pantry, including its name, street address, latitude, and longitude. If you zoom out, you can also see Alaska and Hawaii. A few things to consider: If you are interested, here is an academic paper I wrote that is based on these data (although there isn’t a perfect

Read More »
Blog

Not Feeling So Great? A Random Forest Analysis of Demographics and Health

Each of us has an intuition about our own health and understands when we are feeling great, or not so great. Indeed, self-perceived health status even predicts mortality. I started to wonder how well basic demographics, such as age, gender, and income, predict self-perceived health status. We know demographics strongly correlate with lots of other health outcomes, but I have never seen them used to predict self-perceived health status. In this post I use a Random Forest classifier to predict self-perceived health status using only demographic variables. Random Forests are

Read More »
Blog

How To Cross Validate and Bootstrap in R

“Life is like a box of samples, you never know what you’re gonna get.” (Forrest Gump)

Good ol’ Forrest Gump. He is right: you never know what you’re going to get, and that is why we use resampling techniques in data science! In this post, I will review two popular resampling techniques for predictive models and give examples of how to implement them in R. But first, let’s review the basics. What does “resample” mean? Often we only have one dataset to examine and use to build prediction models.

Read More »
Blog

How to Use Survey Weights in R

Survey weights are common in large-scale government-funded data collections. For example, NHIS and NHANES are two large-scale surveys with survey weights that track the health and well-being of Americans. These data collections use complex, multi-stage survey sampling to ensure that results are representative of the U.S. population. Although the use of survey weights in regression analyses is sometimes contested, they are needed for simple means and proportions. The general guidance is that if analysts can control for the factors that were used to create the weights in their

Read More »
Blog

Differences in Household Food Insecurity by Region and State in 2023

Download a full-resolution version here. This visualization shows regional and state variation in household food insecurity rates in 2023. Food insecure households experience disruptions in the quality and quantity of the household food supply due to a lack of resources. Translation: food insecure households struggle to afford the food they need to feed their family. I’ve been thinking about state and regional variation in the U.S. For most issues, we can imagine that they will vary by state and region. Oftentimes, however, national-level estimates get the limelight. For example, unemployment, poverty, and rates

Read More »