Want to help out or contribute?

If you find any typos, errors, or places where the text may be improved, please let us know by providing feedback either in the feedback survey (given during class) or by using GitHub.

On GitHub open an issue or submit a pull request by clicking the " Edit this page" link at the side of this page.

Appendix D — Extras: Additional exercises

D.1 Discussion: How might you use Git in your work?

Time: ~10 minutes.

One of the biggest comments people make when first learning about Git is: “I see some of the benefits, but can’t see how I would use it in my own work. How or why would I?” So, to tackle that question right away, take some time to brainstorm and discuss with your neighbour how you might use Git right now or in the near future.

  1. Take 1 minute to think to yourself.
  2. Take 6 minutes to discuss and brainstorm with your neighbour.
  3. In the final few minutes, we will all share with the group some thoughts.

D.2 Discussion: How can you use Git to collaborate better?

Time: ~10 minutes.

Before actually learning about how you might collaborate with others (and future you) by using Git and GitHub, let’s brainstorm and discuss how you might do it. Based on what you have learned so far, take some time to think about how you might use Git and GitHub to collaborate more easily between collaborators and future you.

  1. Take 1 minute to think to yourself.
  2. Take 6 minutes to brainstorm and discuss with your neighbour.
  3. In the final few minutes, we will share what you discussed with the group.

D.3 Coding: Calculate some basic statistics

Practice using summarise() by calculating various summary statistics. Copy and paste the code below into the R/learning.R script file.

# 1.
nhanes_small %>%
    summarise(mean_bp_sys = ___,
              mean_age = ___)

# 2.
nhanes_small %>%
    summarise(max_bp_dia = ___,
              min_bp_dia = ___)

Then, start replacing the ___ with the appropriate code to complete the tasks below. Don’t forget to use na.rm = TRUE in the basic statistic functions.

  1. Calculate the mean of bp_sys_ave and age.
  2. Calculate the max and min of bp_dia.
  3. Lastly, add and commit any changes made to the Git history with the RStudio Git interface.
Click for the solution. Only click if you are struggling or are out of time.
# 1.
nhanes_small %>%
    summarise(mean_bp_sys = mean(bp_sys_ave, na.rm = TRUE),
              mean_age = mean(age, na.rm = TRUE))

# 2.
nhanes_small %>%
    summarise(max_bp_dia = max(bp_dia_ave, na.rm = TRUE),
              min_bp_dia = min(bp_dia_ave, na.rm = TRUE))

D.4 Coding: Answer some statistical questions with group by and summarise

Copy and paste the code below into the R/learning.R script file.

# 1. 
nhanes_small %>% 
    filter(!is.na(diabetes)) %>% 
    ___(___, ___) %>% 
    ___(
        ___,
        ___,
        ___
    )

# 2. 
nhanes_small %>% 
    filter(!is.na(diabetes)) %>% 
    ___(___, ___) %>% 
    ___(
        ___,
        ___,
        ___,
        ___,
        ___,
        ___
    )

Then, start replacing the ___ with the appropriate code including group_by() with summarise(), to answer these questions:

  1. What is the mean, max, and min differences in age between active and inactive persons with or without diabetes?
  2. What is the mean, max, and min differences in systolic BP and diastolic BP between active and inactive persons with or without diabetes?
  3. Once done, add and commit the changes to the file to the Git history.
Click for the solution. Only click if you are struggling or are out of time.
# 1. 
nhanes_small %>% 
    filter(!is.na(diabetes)) %>% 
    group_by(diabetes, phys_active) %>% 
    summarise(
        mean_age = mean(age, na.rm = TRUE),
        max_age = max(age, na.rm = TRUE),
        min_age = min(age, na.rm = TRUE)
    )

# 2. 
nhanes_small %>% 
    filter(!is.na(diabetes)) %>% 
    group_by(diabetes, phys_active) %>% 
    summarise(
        mean_bp_sys = mean(bp_sys_ave, na.rm = TRUE),
        max_bp_sys = max(bp_sys_ave, na.rm = TRUE),
        min_bp_sys = min(bp_sys_ave, na.rm = TRUE),
        mean_bp_dia = mean(bp_dia_ave, na.rm = TRUE),
        max_bp_dia = max(bp_dia_ave, na.rm = TRUE),
        min_bp_dia = min(bp_dia_ave, na.rm = TRUE)
    )

D.5 Coding: Practicing the dplyr functions

Practice using dplyr by using the NHANES dataset and wrangling the data into a summary output. Don’t create any intermediate objects by only using the pipe operator to link each task below with the next one.

  1. Rename all columns to use snakecase.
  2. Select the columns gender, age and BMI.
  3. Exclude "NAs" from all of the selected columns.
  4. Rename gender to sex.
  5. Create a new column called age_class, where anyone under 50 years old is labeled "under 50" and those 50 years and older are labeled "over 50".
  6. Group the data according to sex and age_class.
  7. Calculate the mean and median BMI according to the grouping to determine the difference in BMI between age classes and sex.
  8. Run styler on the file with the Palette (Ctrl-Shift-P, then type “style file”).
  9. Add and commit changes to the Git history with the RStudio Git interface using Ctrl-Alt-M or with the Palette (Ctrl-Shift-P, then type “commit”).
Click for the solution. Only click if you are struggling or are out of time.
NHANES %>% 
    rename_with(snakecase::to_snake_case) %>% 
    select(gender, age,  bmi) %>% 
    filter(!is.na(gender) & !is.na(age) & !is.na(bmi)) %>% 
    rename(sex = gender) %>% 
    mutate(age_class = if_else(age < 50, "under 50", "over 50")) %>%
    group_by(age_class, sex) %>% 
    summarize(bmi_mean = mean(bmi, na.rm = TRUE), 
              bmi_median = median(bmi, na.rm = TRUE))

D.6 Coding: Adding figures and changing the theme

Time: 20 minutes.

Complete these tasks in the doc/learning.qmd file.

  • Search online for a picture that you like and want to put into the Quarto file. Download it and save the image in doc/.
  • In the Quarto file, insert the image somewhere (e.g. under # Results) and give it an appropriate caption.
    • Use the Markdown syntax ![](){} to insert the image, where the file path is in between the ().
    • Include a caption in between the [], center align by using fig-align="center" within the {}, and resize the image to "75%" using width within the {}.
  • Render the document using Ctrl-Shift-K or with the Palette (Ctrl-Shift-P, then type “render”) to make sure the image gets inserted.
  • Go to Quarto’s HTML Theming page and look through the different themes that are available for the HTML output. Find a theme you like by changing the theme option in the YAML header and re-render the document with Ctrl-Shift-K or with the Palette (Ctrl-Shift-P, then type “render”).
  • Finally, add and commit the changes you’ve made to the document using Ctrl-Alt-M or with the Palette (Ctrl-Shift-P, then type “commit”). For now, don’t add and commit the HTML output file.