rfordatascience/tidytuesday

About TidyTuesday

  • TidyTuesday is a weekly social data project. All are welcome to participate! Please remember to share the code used to generate your results!
  • TidyTuesday is organized by the R4DS Online Learning Community. Join our Slack for free online help with R and other data-related topics, or to participate in a data-related book club!

Goals

Our overarching goal for TidyTuesday is to make learning to work with data easier by providing real-world datasets.

Our goal for 2023-2024 is to increase usage of #TidyTuesday within classrooms. We would like TidyTuesday to be used in at least 10 courses by September 2024. If you are using TidyTuesday to teach data-related skills, please let us know!


How to Participate

  • Data is posted to social media every Monday morning. Follow the instructions in the new post for how to download the data.
  • Explore the data, watching out for interesting relationships. We would like to emphasize that you should not draw conclusions about causation in the data. There are various moderating variables that affect all data, many of which might not have been captured in these datasets. As such, our suggestion is to use the data provided to practice your data tidying and plotting techniques, and to consider for yourself what nuances might underlie these relationships.
  • Create a visualization, a model, a shiny app, or some other piece of data-science-related output, using R or another programming language.
  • Share your output and the code used to generate it on social media with the #TidyTuesday hashtag.

Datasets

Week | Date | Data | Source | Article
1 | 2024-01-02 | Bring your own data to start 2024! | |
2 | 2024-01-09 | Canadian NHL Player Birth Dates | Statistics Canada, NHL team list endpoint, NHL API | Are Birth Dates Still Destiny for Canadian NHL Players?
3 | 2024-01-16 | US Polling Places 2012-2020 | Center for Public Integrity | National data release sheds light on past polling place changes
4 | 2024-01-23 | Educational attainment of young people in English towns | The UK Office for National Statistics | Why do children and young people in smaller towns do better academically than those in larger towns?
5 | 2024-01-30 | Groundhog predictions | Groundhog-day.com API | Groundhog-day.com Predictions by Year
6 | 2024-02-06 | World heritage sites | UNESCO World Heritage Sites | 1 dataset 100 visualizations
7 | 2024-02-13 | Valentine's Day consumer data | Valentine's Days consumer survey data | National Retail Federation Valentine's Day Data Center
8 | 2024-02-20 | R Consortium ISC Grants | R Consortium ISC Funded Projects | R Consortium ISC Call for Proposals 2024
9 | 2024-02-27 | Leap Day | Wikipedia: February 29 | Wikipedia: February 29
10 | 2024-03-05 | Trash Wheel Collection Data | Healthy Harbor Trash Wheel Collection Data | Mr. Trash Wheel
11 | 2024-03-12 | Fiscal Sponsors | Fiscal Sponsor Directory | Fiscal Sponsor Directory facts
12 | 2024-03-19 | X-Men Mutant Moneyball | Mutant Moneyball Data | Mutant moneyball: a data-driven ultimate X-Men
13 | 2024-03-26 | NCAA Men's March Madness | Men's March Madness Data | Bracketology: predicting March Madness
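
The dates in the table above follow a fixed pattern: week 1 falls on 2024-01-02, and each later week is seven days after the previous one. As a minimal sketch, the date for a given 2024 week number could be computed as shown below. The helper name `tidytuesday_date` is hypothetical and is not part of any TidyTuesday tooling.

```python
from datetime import date, timedelta

# First TidyTuesday of 2024, taken from week 1 of the table above.
FIRST_TUESDAY_2024 = date(2024, 1, 2)


def tidytuesday_date(week: int) -> date:
    """Return the date of a 2024 TidyTuesday week (hypothetical helper)."""
    if week < 1:
        raise ValueError("week numbers start at 1")
    # Each week is exactly seven days after the previous one.
    return FIRST_TUESDAY_2024 + timedelta(weeks=week - 1)


print(tidytuesday_date(2).isoformat())   # 2024-01-09
print(tidytuesday_date(13).isoformat())  # 2024-03-26
```

This only reproduces the schedule listed here; it makes no claim about how the project itself assigns dates.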

Citing TidyTuesday

To cite the TidyTuesday repo/project in publications, use:

R4DS Online Learning Community (2023). Tidy Tuesday: A weekly social data project. https://github.com/rfordatascience/tidytuesday.

A BibTeX entry for LaTeX users is:

  @misc{tidytuesday, 
    title = {Tidy Tuesday: A weekly social data project}, 
    author = {R4DS Online Learning Community}, 
    url = {https://github.com/rfordatascience/tidytuesday}, 
    year = {2023} 
  }

Note: If you would like to cite the tidytuesdayR package, you should use citation("tidytuesdayR") instead.


Submitting Datasets

TidyTuesday is built around open datasets that are found in the "wild" or submitted as Issues on our GitHub.

If you find a dataset that you think would be interesting, you can approach it in one of two ways:

Submit the dataset as an Issue

  1. Find an interesting dataset
  2. Find a report, blog post, article, etc., relevant to the data
  3. Submit the dataset as an Issue along with a link to the article (and, ideally, 2 images from the article, with alt text)

Create an entire TidyTuesday challenge!

  1. Find an interesting dataset
  2. Find a report, blog post, article, etc., relevant to the data (or create one yourself!)
  3. Let us know you've found something interesting and are working on it by filing an Issue on our GitHub
  4. Provide a link to the data (or the raw data itself) and a cleaning script for the data
  5. Write a basic readme.md file using a recent readme.md as a template. Make sure to give yourself credit!
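
Step 4 above asks for a cleaning script. The following is a minimal Python sketch of what such a script might do, assuming a small raw CSV held in memory. The column names and values are made up for illustration, the function names are hypothetical, and cleaning scripts in the repository may be structured quite differently.

```python
import csv
import io
import re
from typing import Dict, List, Optional

# Hypothetical raw data, standing in for a downloaded file.
RAW_CSV = """Player Name, Birth Date ,Team
 Alice Example ,1990-01-15,AAA
Bob Example,,BBB
"""


def snake_case(name: str) -> str:
    """Convert a header such as ' Birth Date ' to 'birth_date'."""
    return re.sub(r"[^0-9a-z]+", "_", name.strip().lower()).strip("_")


def clean(raw_text: str) -> List[Dict[str, Optional[str]]]:
    """Tidy headers, trim whitespace, and turn empty cells into None."""
    reader = csv.reader(io.StringIO(raw_text))
    header = [snake_case(col) for col in next(reader)]
    rows = []
    for record in reader:
        if not record:  # skip blank lines
            continue
        values = [cell.strip() or None for cell in record]
        rows.append(dict(zip(header, values)))
    return rows


cleaned = clean(RAW_CSV)
for row in cleaned:
    print(row)
```

Keeping the cleaning steps in small functions makes it easier for others to see exactly how the shared data differs from the raw source.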