23 Team project work
In this last session of the workshop, you will take what you learned and apply it to create a reproducible project while working in a team (of 2 or 3 people). This will help you reinforce what you learned in the previous sessions. The aim of this project is to create a reproducible data analysis report in HTML using Quarto and R and to share it on GitHub. The open dataset you will all be working on is from the study “Associations of serum uric acid with cardiovascular disease risk factors: a retrospective cohort study in Southeastern China”.
During the last 15-20 minutes of this session, the lead teacher will download your project from GitHub and re-generate your report to check that it is reproducible. The teacher will run approximately these functions after downloading your project:
- styler::style_dir() to check that your code is styled.
- quarto::quarto_render() on the Quarto documents to render your report.
23.1 Specific tasks
You will be collaborating as a team using Git and GitHub to manage your team assignment. We will set up the project with Git and GitHub for you so you can quickly start collaborating together on the project. You will be pushing and pulling a lot of content, so you will need to maintain regular and open communication with your team members.
Below is a list of tasks to complete in order. Read through each task first before starting any of them.
23.1.1 Clone team repository
All team members need to clone (download) your team’s repository to their own computer. The steps are found in RStudio in the “File” menu on the top menu bar. Under the “File” menu, click “New Project…”, then select “Version Control”, then “Git”. In the next screen, paste your team’s repository URL (copied from the team’s repository page on GitHub) into the “Repository URL” box. Don’t change the name of the project in the “Project directory name” box. Select the “Browse…” button next to the “Create project as subdirectory of” box and choose to save the project to your Desktop/ folder. Finally, click “Create Project”. After you clone the repository in RStudio, it should open up as an R project for you.
23.1.2 Download the dataset
Each team member needs to download the dataset for the project work and save the file into the data/ folder of the R project. We already set the project to ignore the data/ folder, so the data will not be tracked by Git. There are two ways to download the dataset:
- Manually download the dataset and move it from the Downloads/ folder of your computer into the R project’s data/ folder by using your file manager (like Finder or File Explorer). Use this link to directly download the data.
- Or, run the code below in the Console while in the R project to download the data directly into the data/ folder:

  Console

      download.file(
        url = "https://zenodo.org/records/8292712/files/SUA_CVDs_risk_factors.csv",
        destfile = here::here("data/SUA_CVDs_risk_factors.csv")
      )
23.1.3 Create a Quarto file
Each team member creates a Quarto file by running prodigenr::create_report() in the Console while in the team’s RStudio R project. This will create a file called docs/report.qmd. Rename this file to report-YOURNAME.qmd (replacing YOURNAME with your actual name), keeping it in the docs/ folder of your project.
This means that each team member will have their own document to work in without conflicting with the other team members. If there are several members with the same name, make sure your .qmd documents are unique, for example, by using your full name.
Inside each team member’s .qmd file, write code in the setup code chunk at the top to load the tidyverse with library(tidyverse) and to import your data found in the data/ folder. You can use the read_csv() function like you learned in Chapter 17.
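As a rough sketch, the code in the setup chunk could look something like the following. The object name sua is only an illustrative choice, and the path assumes the file was saved in data/ under the same name used in the download step.

```r
library(tidyverse)

# Build the path from the project root so it works no matter where
# the .qmd file sits. Assumes the dataset was saved into data/.
sua <- read_csv(here::here("data/SUA_CVDs_risk_factors.csv"))

# Quick look at the column names and types after importing.
glimpse(sua)
```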
23.1.4 Create some figures and tables
This task will take the most time. We want each team member to create a few figures and tables. Ideally, each team member would create different figures and tables, so that you get practice on your own. So this will require some coordination and communication between team members.
You can do any wrangling, plotting, and table-making that you want, as long as you use the code and techniques you learned during the workshop. Just to remind you, you can wrangle and analyse the data any way you want, even if technically it might not make sense or be wrong. The point of this project is to practice what you learned and have fun along the way.
Keep the figures and tables simple. The simpler the better. The goal is to practice what you learned, not to create a perfect analysis.
To help you get started or to give you some ideas, here are some plots and tables you could make:
- A table of the mean of Age and BMI (using summarise()) grouped by Sex. Make the table with knitr::kable().
- A table of the mean of Age, BMI, SBP, and DBP (using summarise()) grouped by hypertension (or dyslipidemia or hyperglycemia). Make the table with knitr::kable().
- A histogram of BMI with geom_histogram(), with hyperglycemia as the fill (in aes()) or using facet_grid().
- A scatter plot of SBP vs DBP with geom_point(), colored by hypertension (maybe with a geom_smooth() line).
- A bar plot of the counts of hyperglycemia status with geom_bar(), making facets of Times (using facet_grid()).
- A boxplot of TC (total cholesterol) by dyslipidemia status with geom_boxplot().
- Find the mean BMI by hyperglycemia and pipe to create a bar plot with geom_col().
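To make two of these ideas more concrete, here is a minimal sketch of one table and one plot. It uses a tiny made-up data frame (called toy here) so that it runs on its own. Only the column names come from the dataset, while the values and their coding are invented and may differ from the real data. In your own report you would use your imported data instead.

```r
library(tidyverse)

# Made-up rows standing in for the real dataset. Only the column names
# (Sex, Age, BMI, hyperglycemia) are taken from the project description.
toy <- tibble(
  Sex = c("Female", "Female", "Male", "Male"),
  Age = c(40, 50, 45, 55),
  BMI = c(22, 24, 26, 28),
  hyperglycemia = c("No", "Yes", "No", "Yes")
)

# Table idea: mean Age and BMI grouped by Sex, shown with knitr::kable().
mean_by_sex <- toy |>
  group_by(Sex) |>
  summarise(
    mean_age = mean(Age),
    mean_bmi = mean(BMI)
  )
knitr::kable(mean_by_sex)

# Plot idea: mean BMI by hyperglycemia, piped into a bar plot with geom_col().
bmi_plot <- toy |>
  group_by(hyperglycemia) |>
  summarise(mean_bmi = mean(BMI)) |>
  ggplot(aes(x = hyperglycemia, y = mean_bmi)) +
  geom_col()
bmi_plot
```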
We suggest that you talk with your team members to get some other ideas or to decide who does what. As you code, get or give help to your teammates.
23.1.5 Regularly style your code
Make sure your code is styled correctly by regularly using the Palette (Ctrl-Shift-P, then type “style file”) to style your code as you work in your Quarto document. Nicely formatted code is just as important as code that runs, as it makes it easier to read and understand. So practice styling your code often!
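To get a feel for what styling changes, here is a minimal sketch using styler::style_text(), which styles code given as text rather than a file. The messy line is a made-up example, not code from the project.

```r
# Deliberately messy code, stored as text so no file is needed.
messy <- "mean_bmi=mean(c(21,25,30))"

# style_text() applies the tidyverse style guide by default:
# `=` assignment becomes `<-` and spaces are added after commas.
styled <- styler::style_text(messy)
styled
# mean_bmi <- mean(c(21, 25, 30))
```

If you prefer the Console over the Palette, styler::style_file() does the same for a whole file when given its path.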
23.1.6 Regularly commit the changes
Use the “Git workflow” by adding to the staged area, committing, pushing, and pulling the changes you and your teammates have made. You may encounter merge conflicts. If you do, get one of the helpers or teachers to help you out!
23.1.7 Regularly render the Quarto document
Render your Quarto document to HTML often to ensure reproducibility of your document as well as to see what your report looks like. You don’t need to commit the rendered HTML file if you don’t want to.
23.2 Quick “checklist” for a good project
- Project used Git and is on GitHub.
- Used Quarto for making figures and tables.
- Used {tidyverse} for data wrangling and plotting.
- Code is correctly styled.
23.3 Checking reproducibility
At the end, the lead teacher will download each of the teams’ projects from GitHub to their computer and will render the documents to test their reproducibility. The teacher will also run styler::style_dir() to check that the code is correctly styled.
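If you want to run a similar check yourselves before the teacher does, a sketch like the one below could be run from the Console while in the R project. It is only an approximation of what the teacher will run. The dry = "on" argument and the loop over the docs/ folder are assumptions made for this sketch, and it assumes a recent version of styler that can style .qmd files.

```r
# Dry run: reports which files would be restyled without changing them.
# The returned data frame has one row per file that was checked.
style_check <- styler::style_dir(dry = "on")
style_check

# Re-render every Quarto document found in docs/.
qmd_files <- list.files("docs", pattern = "\\.qmd$", full.names = TRUE)
for (qmd_file in qmd_files) {
  quarto::quarto_render(qmd_file)
}
```

If the dry run reports that a file would change, style that file and commit the result before the check.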
23.4 Expectations for the project
What we expect you to do for the team project:
- Use Git and GitHub throughout your work.
- Work collaboratively as a team and share responsibilities and tasks.
- Use as much as possible of what we covered in the workshop to practice what you learned.
What we don’t expect:
- Complicated analysis or coding. The simpler it is, the easier it is for you to do the coding and understand what is going on. It also helps us to see that you’ve practiced what you’ve learned.
- Clever or overly concise code. Clearly written and readable code is always better than clever or concise code. Keep it simple and understandable!
Essentially, the team project is a way to reinforce what you learned during the workshop, but in a more relaxed and collaborative setting.