stevedata: Steve’s Toy Data for Teaching About a Variety of Methodological, Social, and Political Topics


{stevedata} is an R package of toy data sets that you may find useful for various purposes. I’ve created probably over a hundred toy data sets along the way, either to riff on some topic on my blog, to show my students something in one of my many classes, or just to entertain myself. I had stuffed a lot of these into {stevemisc}, but I want to keep that package mostly about the functions (and whatever data are necessary for showing off the functions). {stevedata} will hold all my toy data going forward.

I anticipate two sets of R users may find these data useful. First, instructors can use them for classes on a variety of topics, especially quantitative methods and international relations. Many of the toy data sets included in this R package are data I’ve acquired or assembled to teach about topics in quantitative methods or international relations in a reproducible way. Users should see my Github repositories for my classes on introduction to international relations, quantitative methods in political science, and foundations of social science research for public policy to see how I’ve used these data (or development versions of them). Topics here are diverse, including (but not limited to) carbon dioxide emissions over 800,000 years (as an illustration of climate change), coffee prices (as an illustration of the worsening terms of trade), the justifiability of bribe-taking (as an illustration of information-poor and discrete variables that a researcher may be tempted to treat as drawn from a normal distribution), the canonical case of illiteracy rates in the 1930 U.S. Census (as an illustration of an ecological fallacy), and many, many more.

Second, my students in these classes (especially my methods classes) should find this R package useful. I will have my methods students (undergraduate and graduate) download this package to work through problem sets in the R programming language. It benefits them (and means less hassle/headache for me) to download this package from CRAN rather than work through potential curl issues by installing through Github.

In almost all instances, each data set has an underlying script that generates it. These scripts are in a data-raw directory that is (increasingly) included in the Github repository (but not the R package).


This package is now on CRAN. You can download it as you would any other R package.
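For readers new to R, the standard CRAN installation call would look like the following. This is a minimal sketch using base R's `install.packages()` and assumes a working internet connection and a configured CRAN mirror.

```r
# Install the released version of {stevedata} from CRAN (requires network access).
install.packages("stevedata")

# Attach the package so its data sets are available by name.
library(stevedata)
```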


You can also install the development version of {stevedata} from Github via the {devtools} package. I suppose using the {remotes} package would work as well.
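A sketch of the development install follows. The repository path `svmiller/stevedata` is an assumption on my part, since the text above does not spell it out; substitute the actual Github path if it differs. Both `devtools::install_github()` and `remotes::install_github()` take the same `"user/repo"` argument.

```r
# Assumed repository path; replace with the actual Github "user/repo" if different.
repo <- "svmiller/stevedata"

# Option 1: via {devtools} (run install.packages("devtools") first if needed).
devtools::install_github(repo)

# Option 2: via {remotes}, a lighter dependency with the same function name.
# remotes::install_github(repo)
```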



The package already has a lot to offer those who might be curious about its contents. Run the following to see what is in it.

data(package = "stevedata")
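The call above opens a listing in a pager or viewer. If you would rather work with the listing as a data frame (to count or search the data sets, say), something like the following may help. `list_datasets()` is a hypothetical helper written for this illustration, not a function in {stevedata}; it relies only on the `results` matrix that base R's `data()` returns.

```r
# Hypothetical helper (not part of {stevedata}): return a package's data set
# listing as a data frame with the item names and their titles.
list_datasets <- function(pkg) {
  results <- data(package = pkg)$results
  as.data.frame(results[, c("Item", "Title"), drop = FALSE])
}

# Only run against {stevedata} if it is actually installed.
if (requireNamespace("stevedata", quietly = TRUE)) {
  toy_data <- list_datasets("stevedata")
  print(nrow(toy_data))   # how many data sets the package ships
  print(head(toy_data))   # first few names and titles
}
```

From there, `?name_of_data_set` brings up the documentation for any item in the listing.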

You can also check the website for more information. There is an informal vignette that describes these data in some detail.