23  Team project work

In this last session of the workshop, you will apply what you learned to create a reproducible project while working in a team of two or three people. This will help reinforce what you learned in the previous sessions. The aim of the project is to create a reproducible data analysis report in HTML using Quarto and R and to share it on GitHub. The open dataset you will all be working on is from the study “Associations of serum uric acid with cardiovascular disease risk factors: a retrospective cohort study in Southeastern China”.

During the last 15-20 minutes of this session, the lead teacher will download your project from GitHub and re-generate your report to check that it is reproducible. The teacher will run approximately these functions after downloading your project:
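
As a rough sketch of what those steps could look like, based on the checks described under “Checking reproducibility” and assuming the reports are saved as .qmd files in the docs/ folder (the exact calls the teacher uses may differ):

```r
# Run from the root of the downloaded R project.

# Re-style all code in the project; files that were already
# correctly styled are left unchanged.
styler::style_dir()

# Find every Quarto report in docs/ (assumed location) and render each one.
report_files <- list.files("docs", pattern = "\\.qmd$", full.names = TRUE)
for (report_file in report_files) {
  quarto::quarto_render(report_file)
}
```

If every report renders without errors on a computer other than your own, that is a good sign the project is reproducible.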

23.1 Specific tasks

You will be collaborating as a team using Git and GitHub to manage your team assignment. We will set up the project with Git and GitHub for you so you can quickly start collaborating together on the project. You will be pushing and pulling a lot of content, so you will need to maintain regular and open communication with your team members.

Below is a list of tasks to complete in order. Read through each task first before starting any of them.

23.1.1 Clone team repository

All team members need to clone (download) the team’s repository to their own computer. The steps are found in RStudio in the “File” menu on the top menu bar. Under the “File” menu, click “New Project…”, then select “Version Control”, then “Git”. In the next screen, paste your team’s repository URL (copied from the team’s repository page on GitHub) into the “Repository URL” box. Don’t change the name of the project in the “Project directory name” box. Select the “Browse…” button next to the “Create project as subdirectory of” box and choose to save the project to your Desktop/ folder. Finally, click the “Create Project” button. After you clone the repository in RStudio, it should open up as an R project for you.

23.1.2 Download the dataset

Each team member needs to download the dataset for the project work and save the file into the data/ folder of the R project. We already set the project to ignore the data/ folder, so the data will not be tracked by Git. There are two ways to download the dataset:

  • Manually download the dataset and move it from your Downloads/ folder of your computer into the R project’s data/ folder by using your file manager (like Finder or File Explorer). Use this link to directly download the data.

  • Or, run the code below in the Console while in the R project to download the data directly into the data/ folder:

    Console
    download.file(
      url = "https://zenodo.org/records/8292712/files/SUA_CVDs_risk_factors.csv",
      destfile = here::here("data/SUA_CVDs_risk_factors.csv")
    )

23.1.3 Create a Quarto file

Each team member creates a Quarto file by running prodigenr::create_report() in the Console while in the team’s RStudio R project. This will create a file called docs/report.qmd. Rename this file report-YOURNAME.qmd (replacing YOURNAME with your actual name) in the docs/ folder of your project.

This means that each team member will have their own document to work in without conflicting with the other team members. If there are several members with the same name, make sure your .qmd documents are unique, for example, by using your full name.

Inside each team member’s .qmd file, write code in the setup code chunk at the top to load the tidyverse with library(tidyverse) and to import your data found in the data/ folder. You can use the read_csv() function as you learned in an earlier session.
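
As a minimal sketch, the contents of the setup chunk could look like the code below. It assumes the dataset was saved under the file name used in the download step, and the object name `sua` is only a placeholder; pick any name you like.

```r
# Load the packages used for wrangling and plotting.
library(tidyverse)

# Import the dataset from the data/ folder of the R project.
# `sua` is a hypothetical object name chosen for this sketch.
sua <- read_csv(here::here("data/SUA_CVDs_risk_factors.csv"))
```

Using here::here() builds the file path from the project root, so the import works no matter which folder the .qmd file is rendered from.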

23.1.4 Create some figures and tables

This task will take the most time. We want each team member to create a few figures and tables. Ideally, each team member would create different figures and tables, so that you get practice on your own. This will require some coordination and communication between team members.

Tip

You can do any wrangling, plotting, and table-making that you want, as long as you use the code and techniques you learned during the workshop. Just to remind you, you can wrangle and analyse the data any way you want, even if technically it might not make sense or be wrong. The point of this project is to practice what you learned and have fun along the way.

Important

Keep the figures and tables simple. The simpler the better. The goal is to practice what you learned, not to create a perfect analysis.

To help you get started or to give you some ideas, here are some plots and tables you could make:

  • A table of the mean of Age and BMI (using summarise()) grouped by Sex. Make the table with knitr::kable().
  • A table of the mean of Age, BMI, SBP, and DBP (using summarise()) grouped by hypertension (or dyslipidemia or hyperglycemia). Make the table with knitr::kable().
  • A histogram of BMI with geom_histogram(), with hyperglycemia as the fill (in aes()) or using facet_grid().
  • A scatter plot of SBP vs DBP with geom_point(), colored by hypertension (maybe with a geom_smooth() line).
  • A bar plot of the counts of hyperglycemia status with geom_bar(), making facets of Times (using facet_grid()).
  • A boxplot of TC (total cholesterol) by dyslipidemia status with geom_boxplot().
  • Find the mean BMI by hyperglycemia and pipe to create a bar plot with geom_col().
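
To show the general shape of such code, here is a sketch of the first table idea and the histogram idea. So that it runs on its own, it uses a tiny made-up tibble called `toy_data` whose values are invented for illustration; only the column names (Sex, Age, BMI, hyperglycemia) come from the list above, and the real dataset may code these variables differently. In your report, use the data you imported in the setup chunk instead.

```r
library(tidyverse)

# Toy stand-in for the real dataset: all values are invented.
toy_data <- tibble(
  Sex = c("Female", "Male", "Female", "Male", "Female", "Male"),
  Age = c(34, 51, 47, 62, 29, 58),
  BMI = c(21.5, 27.3, 24.8, 30.1, 19.9, 26.4),
  hyperglycemia = c("No", "Yes", "No", "Yes", "No", "No")
)

# Table: mean of Age and BMI, grouped by Sex.
mean_by_sex <- toy_data |>
  group_by(Sex) |>
  summarise(
    mean_age = mean(Age),
    mean_bmi = mean(BMI)
  )

# Print the summary as a formatted table in the rendered report.
knitr::kable(mean_by_sex, digits = 1)

# Figure: histogram of BMI, with hyperglycemia status as the fill colour.
bmi_histogram <- toy_data |>
  ggplot(aes(x = BMI, fill = hyperglycemia)) +
  geom_histogram(bins = 10)

bmi_histogram
```

The other ideas follow the same pattern: wrangle with group_by() and summarise() where needed, then pass the result to knitr::kable() for a table or to ggplot() for a figure.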

We suggest that you talk with your team members to get some other ideas or to decide who does what. As you code, get or give help to your teammates.

23.1.5 Regularly style your code

Make sure your code is styled correctly by regularly using the Palette (Ctrl-Shift-P, then type “style file”) to style your code as you work in your Quarto document. Nicely formatted code is just as important as code that runs, as it makes it easier to read and understand. So practice styling your code often!
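
To see what styling does, here is a small sketch using styler::style_text(), which styles code given as text. The messy line is made up for this example.

```r
# A deliberately messy line of code, stored as text so it can be styled.
messy_code <- "mean_bmi<-mean(c(21.5,27.3,24.8),na.rm=TRUE)"

# style_text() returns the same code with consistent spacing added.
styled_code <- styler::style_text(messy_code)
styled_code

# To style a whole file from the Console instead of the Palette
# (replace YOURNAME as in the earlier task):
# styler::style_file("docs/report-YOURNAME.qmd")
```

The styled version adds spaces around `<-` and `=` and after each comma, which changes nothing about how the code runs but makes it easier to read.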

23.1.6 Regularly commit the changes

Use the “Git workflow” by adding to the staged area, committing, pushing, and pulling the changes you and your teammates have made. You may encounter merge conflicts. If you do, get one of the helpers or teachers to help you out!

23.1.7 Regularly render the Quarto document

Render your Quarto document to HTML often to ensure reproducibility of your document as well as to see what your report looks like. You don’t need to commit the rendered HTML file if you don’t want to.

23.2 Quick “checklist” for a good project

  • Project used Git and is on GitHub.
  • Used Quarto for making figures and tables.
  • Used {tidyverse} for data wrangling and plotting.
  • Code is correctly styled.

23.3 Checking reproducibility

At the end, the lead teacher will download each of the teams’ projects from GitHub to their computer and will render the documents to test their reproducibility. The teacher will also run styler::style_dir() to check that the code is correctly styled.

23.4 Expectations for the project

What we expect you to do for the team project:

  • Use Git and GitHub throughout your work.
  • Work collaboratively as a team and share responsibilities and tasks.
  • Use as much of what we covered in the workshop as you can to practice what you learned.

What we don’t expect:

  • Complicated analysis or coding. The simpler it is, the easier it is for you to do the coding and understand what is going on. It also helps us to see that you’ve practiced what you’ve learned.
  • Clever or overly concise code. Clearly written and readable code is always better than clever or concise code. Keep it simple and understandable!

Essentially, the team project is a way to reinforce what you learned during the workshop, but in a more relaxed and collaborative setting.