Appendix D — Extras: Additional exercises
D.1 Discussion: How might you use Git in your work?
Time: ~10 minutes.
One of the most common comments people make when first learning about Git is: “I see some of the benefits, but can’t see how I would use it in my own work. How or why would I?” To tackle that question right away, take some time to brainstorm and discuss with your neighbour how you might use Git right now or in the near future.
- Take 1 minute to think to yourself.
- Take 6 minutes to discuss and brainstorm with your neighbour.
- In the final few minutes, we will all share some thoughts with the group.
D.2 Discussion: How can you use Git to collaborate better?
Time: ~10 minutes.
Before learning how to collaborate with others (and future you) using Git and GitHub, let’s brainstorm and discuss how you might do it. Based on what you have learned so far, take some time to think about how Git and GitHub could make collaboration easier, both with other people and with future you.
- Take 1 minute to think to yourself.
- Take 6 minutes to brainstorm and discuss with your neighbour.
- In the final few minutes, we will share what you discussed with the group.
D.3 Coding: Calculate some basic statistics
Practice using `summarise()` by calculating various summary statistics. Copy and paste the code below into the `R/learning.R` script file.
Then, start replacing the `___` with the appropriate code to complete the tasks below. Don’t forget to use `na.rm = TRUE` in the basic statistic functions.
- Calculate the mean of `bp_sys_ave` and `age`.
- Calculate the max and min of `bp_dia_ave`.
- Lastly, add and commit any changes made to the Git history with the RStudio Git interface.
Click for the solution. Only click if you are struggling or are out of time.
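The original solution code is not reproduced here. As a minimal sketch of what the two `summarise()` calls could look like, the example below uses a small made-up tibble, `toy_nhanes`, as a hypothetical stand-in for `nhanes_small` so that it runs on its own. The diastolic column name `bp_dia_ave` is an assumption taken from the D.4 solution, and the output column names are only illustrative.

```r
library(dplyr)

# Hypothetical stand-in for nhanes_small, with a few missing values.
toy_nhanes <- tibble(
  age = c(34, 51, NA, 67),
  bp_sys_ave = c(110, NA, 125, 140),
  bp_dia_ave = c(70, 82, NA, 90)
)

# 1. Mean of systolic blood pressure and age, ignoring missing values.
toy_means <- toy_nhanes %>%
  summarise(
    mean_bp_sys = mean(bp_sys_ave, na.rm = TRUE),
    mean_age = mean(age, na.rm = TRUE)
  )

# 2. Max and min of diastolic blood pressure, ignoring missing values.
toy_range <- toy_nhanes %>%
  summarise(
    max_bp_dia = max(bp_dia_ave, na.rm = TRUE),
    min_bp_dia = min(bp_dia_ave, na.rm = TRUE)
  )

toy_means
toy_range
```

In the exercise itself, you would swap `toy_nhanes` for `nhanes_small`. Without `na.rm = TRUE`, each of these statistics would return `NA` as soon as the column contains a single missing value.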
D.4 Coding: Answer some statistical questions with group by and summarise
Copy and paste the code below into the `R/learning.R` script file.
Then, start replacing the `___` with the appropriate code, including `group_by()` with `summarise()`, to answer these questions:
- What are the mean, max, and min differences in age between active and inactive persons with or without diabetes?
- What are the mean, max, and min differences in systolic BP and diastolic BP between active and inactive persons with or without diabetes?
- Once done, add and commit the changes in the file to the Git history.
Click for the solution. Only click if you are struggling or are out of time.
```r
# 1.
nhanes_small %>%
  filter(!is.na(diabetes)) %>%
  group_by(diabetes, phys_active) %>%
  summarise(
    mean_age = mean(age, na.rm = TRUE),
    max_age = max(age, na.rm = TRUE),
    min_age = min(age, na.rm = TRUE)
  )

# 2.
nhanes_small %>%
  filter(!is.na(diabetes)) %>%
  group_by(diabetes, phys_active) %>%
  summarise(
    mean_bp_sys = mean(bp_sys_ave, na.rm = TRUE),
    max_bp_sys = max(bp_sys_ave, na.rm = TRUE),
    min_bp_sys = min(bp_sys_ave, na.rm = TRUE),
    mean_bp_dia = mean(bp_dia_ave, na.rm = TRUE),
    max_bp_dia = max(bp_dia_ave, na.rm = TRUE),
    min_bp_dia = min(bp_dia_ave, na.rm = TRUE)
  )
```
D.5 Coding: Practicing the dplyr functions
Practice using dplyr by taking the `NHANES` dataset and wrangling the data into a summary output. Don’t create any intermediate objects; instead, use only the pipe operator to link each task below with the next one.
- Rename all columns to use snake case.
- Select the columns `gender`, `age`, and `bmi`.
- Exclude missing values (`NA`) from all of the selected columns.
- Rename `gender` to `sex`.
- Create a new column called `age_class`, where anyone under 50 years old is labeled `"under 50"` and those 50 years and older are labeled `"over 50"`.
- Group the data according to `sex` and `age_class`.
- Calculate the `mean` and `median` BMI according to the grouping to determine the difference in BMI between age classes and sex.
- Run styler on the file with the Palette (Ctrl-Shift-P, then type “style file”).
- Add and commit changes to the Git history with the RStudio Git interface using Ctrl-Alt-M or with the Palette (Ctrl-Shift-P, then type “commit”).
Click for the solution. Only click if you are struggling or are out of time.
```r
NHANES %>%
  rename_with(snakecase::to_snake_case) %>%
  select(gender, age, bmi) %>%
  filter(!is.na(gender) & !is.na(age) & !is.na(bmi)) %>%
  rename(sex = gender) %>%
  mutate(age_class = if_else(age < 50, "under 50", "over 50")) %>%
  group_by(age_class, sex) %>%
  summarize(
    bmi_mean = mean(bmi, na.rm = TRUE),
    bmi_median = median(bmi, na.rm = TRUE)
  )
```
D.6 Coding: Adding figures and changing the theme
Time: 20 minutes.
Complete these tasks in the `doc/learning.qmd` file.
- Search online for a picture that you like and want to put into the Quarto file. Download it and save the image in `doc/`.
- In the Quarto file, insert the image somewhere (e.g. under `# Results`) and give it an appropriate caption.
- Render the document using Ctrl-Shift-K or with the Palette (Ctrl-Shift-P, then type “render”) to make sure the image gets inserted.
- Go to Quarto’s HTML Theming page and look through the different themes that are available for the HTML output. Find a theme you like by changing the `theme` option in the YAML header and re-render the document with Ctrl-Shift-K or with the Palette (Ctrl-Shift-P, then type “render”).
- Finally, add and commit the changes you’ve made to the document using Ctrl-Alt-M or with the Palette (Ctrl-Shift-P, then type “commit”). For now, don’t add and commit the HTML output file.