Advanced statistics in R

class: center, middle, inverse, title-slide

.title[
# Advanced statistics in R
]
.subtitle[
## Regression - univariate and stratified
]
.author[
### 
]
.date[
### <a href="mailto:contact@appliedepi.org">contact@appliedepi.org</a>
]

---

# Regression

.pull-left[
Regression analysis is one of the most useful tools in our toolbox.

It allows us to establish statistical relationships between an outcome and an exposure, or exposures, *in our dataset*.

And this allows us to do a number of different things, such as...
]

.pull-right[
<img src="../../images/regression/regression_figure.png" width="100%" />
]

---

# Regression

- Testing a theory or association.
 
<img src="../../images/regression/testing_test_tube.jpg" width="50%" />

---

# Regression

- Testing a theory or association.
- Predicting what could happen in a new dataset if the relationships remain the same.
 
<img src="../../images/regression/testing_test_tube.jpg" width="50%" />

---

# Regression

- Testing a theory or association.
- Predicting what could happen in a new dataset if the relationships remain the same.
- Controlling for confounding and effect modification.
  - Are the relationships between the outcome and the predictor real?
  - Or are they an artifact of another variable we did not previously adjust for?
 
<img src="../../images/regression/testing_test_tube.jpg" width="50%" />

---

# gtsummary

There are many ways to carry out regressions in R. Here we will use the **gtsummary** package, which lets us analyse data quickly and produce publication-ready tables with little extra code.

<img src="../../images/regression/gt_logo.png" width="30%" />

---

# gtsummary syntax

While the **gtsummary** regression functions accept many arguments (see `?tbl_uvregression` for examples of *univariate* regression), we are primarily concerned with only four of them:

---

# gtsummary syntax

* `method = `
  * The type of regression we want to run, set to `glm` for our purposes

---

# gtsummary syntax

* `method = `
  * The type of regression we want to run, set to `glm` for our purposes
* `y = `
  * The dependent (outcome) variable we want to estimate

---

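# gtsummary syntax

* `method = `
  * The type of regression we want to run, set to `glm` for our purposes
* `y = `
  * The dependent (outcome) variable we want to estimate
* `method.args = `
  * The type of glm we want to run; for a logistic regression this is `method.args = list(family = binomial)`
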
---

# gtsummary syntax

* `method = `
  * The type of regression we want to run, set to `glm` for our purposes
* `y = `
  * The dependent (outcome) variable we want to estimate
* `method.args = `
  * The type of glm we want to run; for a logistic regression this is `method.args = list(family = binomial)`
* `exponentiate = `
  * Whether to exponentiate the estimates, producing odds ratios rather than log odds (relevant for models such as logistic regression, whose coefficients are on a log scale)

---
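
# gtsummary syntax

As a rough illustration of what these four arguments control, the sketch below fits a single logistic regression with base R `glm()` and exponentiates the coefficients by hand. The data frame `sim` and its values are simulated for this example only; they are not the course `linelist`.

```r
# Simulated data (hypothetical; not the course linelist)
set.seed(42)
n <- 200
sim <- data.frame(cough = factor(sample(c("no", "yes"), n, replace = TRUE)))

# Simulate a higher probability of death among those with a cough
p_death <- ifelse(sim$cough == "yes", 0.5, 0.3)
sim$outcome_death <- rbinom(n, size = 1, prob = p_death)

# Corresponds to method = glm, y = outcome_death,
# method.args = list(family = binomial)
fit <- glm(outcome_death ~ cough, data = sim, family = binomial)

log_odds    <- coef(fit)      # estimates on the log-odds scale
odds_ratios <- exp(log_odds)  # the scale that exponentiate = TRUE switches to

print(round(cbind(log_odds, odds_ratios), 2))
```

The `coughyes` row compares people with a cough to the reference level `no`.

---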

# gtsummary syntax

.pull-left[

``` r
linelist %>%
```
]

---

# gtsummary syntax

.pull-left[

``` r
linelist %>%
  select(age, 
         gender, 
         temp, 
         cough, 
         outcome_death)
```
]
---

# gtsummary syntax

.pull-left[

``` r
linelist %>%
  select(age, 
         gender, 
         temp, 
         cough, 
         outcome_death) %>%
  drop_na()
```
]

---

# gtsummary syntax

.pull-left[

``` r
linelist %>%
  select(age, 
         gender, 
         temp, 
         cough, 
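         outcome_death) %>%
  drop_na() %>%
  tbl_uvregression(
    method = ,        # regression function
    y = ,             # outcome variable
    method.args = ,   # extra arguments for the regression function
    exponentiate =    # TRUE or FALSE
  )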
```
]

---

# Univariate regression

.pull-left[

``` r
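# Packages used in this pipeline
library(dplyr)      # select(), %>%
library(tidyr)      # drop_na()
library(gtsummary)  # tbl_uvregression()

linelist %>%
  select(age,
         gender,
         temp,
         cough,
         outcome_death) %>%
  drop_na() %>%                              # keep complete rows only
  tbl_uvregression(
    method = glm,                            # generalised linear model
    y = outcome_death,                       # outcome variable
    method.args = list(family = binomial),   # logistic regression
    exponentiate = TRUE                      # report odds ratios
  )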
```
]

.pull-right[
<div id="ydgabwknrs" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
<table class="gt_table" data-quarto-disable-processing="false" data-quarto-bootstrap="false">
 <thead>
 <tr class="gt_col_headings">
 <th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1" scope="col" id="label">Characteristic</th>
 <th class="gt_col_heading gt_columns_bottom_border gt_center" rowspan="1" colspan="1" scope="col" id="stat_n">N</th>
 <th class="gt_col_heading gt_columns_bottom_border gt_center" rowspan="1" colspan="1" scope="col" id="estimate">OR<sup>1</sup></th>
 <th class="gt_col_heading gt_columns_bottom_border gt_center" rowspan="1" colspan="1" scope="col" id="conf.low">95% CI<sup>1</sup></th>
 <th class="gt_col_heading gt_columns_bottom_border gt_center" rowspan="1" colspan="1" scope="col" id="p.value">p-value</th>
 </tr>
 </thead>
 <tbody class="gt_table_body">
 <tr><td headers="label" class="gt_row gt_left">age</td>
<td headers="stat_n" class="gt_row gt_center">291</td>
<td headers="estimate" class="gt_row gt_center">0.99</td>
<td headers="conf.low" class="gt_row gt_center">0.97, 1.01</td>
<td headers="p.value" class="gt_row gt_center">0.3</td></tr>
 <tr><td headers="label" class="gt_row gt_left">gender</td>
<td headers="stat_n" class="gt_row gt_center">291</td>
<td headers="estimate" class="gt_row gt_center"> </td>
<td headers="conf.low" class="gt_row gt_center"> </td>
<td headers="p.value" class="gt_row gt_center"> </td></tr>
 <tr><td headers="label" class="gt_row gt_left">    female</td>
<td headers="stat_n" class="gt_row gt_center"> </td>
<td headers="estimate" class="gt_row gt_center">—</td>
<td headers="conf.low" class="gt_row gt_center">—</td>
<td headers="p.value" class="gt_row gt_center"> </td></tr>
 <tr><td headers="label" class="gt_row gt_left">    male</td>
<td headers="stat_n" class="gt_row gt_center"> </td>
<td headers="estimate" class="gt_row gt_center">0.81</td>
<td headers="conf.low" class="gt_row gt_center">0.50, 1.29</td>
<td headers="p.value" class="gt_row gt_center">0.4</td></tr>
 <tr><td headers="label" class="gt_row gt_left">temp</td>
<td headers="stat_n" class="gt_row gt_center">291</td>
<td headers="estimate" class="gt_row gt_center">1.08</td>
<td headers="conf.low" class="gt_row gt_center">0.83, 1.39</td>
<td headers="p.value" class="gt_row gt_center">0.6</td></tr>
 <tr><td headers="label" class="gt_row gt_left">cough</td>
<td headers="stat_n" class="gt_row gt_center">291</td>
<td headers="estimate" class="gt_row gt_center"> </td>
<td headers="conf.low" class="gt_row gt_center"> </td>
<td headers="p.value" class="gt_row gt_center"> </td></tr>
 <tr><td headers="label" class="gt_row gt_left">    no</td>
<td headers="stat_n" class="gt_row gt_center"> </td>
<td headers="estimate" class="gt_row gt_center">—</td>
<td headers="conf.low" class="gt_row gt_center">—</td>
<td headers="p.value" class="gt_row gt_center"> </td></tr>
 <tr><td headers="label" class="gt_row gt_left">    yes</td>
<td headers="stat_n" class="gt_row gt_center"> </td>
<td headers="estimate" class="gt_row gt_center">1.92</td>
<td headers="conf.low" class="gt_row gt_center">1.01, 3.68</td>
<td headers="p.value" class="gt_row gt_center">0.046</td></tr>
 </tbody>
 
 <tfoot class="gt_footnotes">
 <tr>
 <td class="gt_footnote" colspan="5"><sup>1</sup> OR = Odds Ratio, CI = Confidence Interval</td>
 </tr>
 </tfoot>
</table>
</div>
]

---
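
# Univariate regression

"Univariate" here means that each predictor gets its own model with the outcome, rather than all predictors sharing one model. The sketch below mimics that idea in base R on simulated data. The data frame `sim` and its values are assumptions made for illustration, so the numbers will not match the table from the course `linelist`.

```r
# Simulated data (hypothetical; not the course linelist)
set.seed(1)
n <- 300
sim <- data.frame(
  age    = round(runif(n, min = 1, max = 80)),
  gender = factor(sample(c("female", "male"), n, replace = TRUE)),
  temp   = round(rnorm(n, mean = 38, sd = 1), 1),
  cough  = factor(sample(c("no", "yes"), n, replace = TRUE)),
  outcome_death = rbinom(n, size = 1, prob = 0.4)
)

# Every column except the outcome is treated as a predictor
predictors <- setdiff(names(sim), "outcome_death")

# Fit one logistic regression per predictor: outcome_death ~ <predictor>
univariate_fits <- lapply(predictors, function(var) {
  glm(reformulate(var, response = "outcome_death"),
      data = sim, family = binomial)
})
names(univariate_fits) <- predictors

# Second coefficient is the predictor's; the first is the intercept
odds_ratios <- sapply(univariate_fits, function(fit) unname(exp(coef(fit)[2])))
print(round(odds_ratios, 2))
```

---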

# Stratified regression

Here we define stratified regression as the process of carrying out separate regression analyses on **different "groups" of data**.

We do this because there may be plausible reasons for the relationship between the dependent and independent variables to **differ** between **groups**.

---
# Groups we might want to stratify by

Can you think of any groups you might want to separate in your analysis?

---

# Groups we might want to stratify by

Age

---

# Groups we might want to stratify by

Age

Sex

---

# Groups we might want to stratify by

Age

Sex

Race/ethnicity

---

# Groups we might want to stratify by

Age

Sex

Race/ethnicity

Geographic area

---

# Stratified regression

* `filter()` our dataset to the group we want
  - `gender == "male"` and `gender == "female"`
* We then _remove_ `gender` after we filter
  - Each subset contains only one gender, so `gender` no longer varies and cannot be used as a predictor

---
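
# Stratified regression

The steps above could be sketched as follows. This is a minimal example on simulated data, not the course `linelist`: the data frame `linelist_sim` and the helper `uv_by_gender()` are hypothetical names introduced here, and combining the two tables with `tbl_merge()` is one possible way to present the strata side by side.

```r
library(dplyr)
library(gtsummary)

# Simulated stand-in for the course linelist (hypothetical values)
set.seed(7)
n <- 400
linelist_sim <- tibble(
  age    = round(runif(n, min = 1, max = 80)),
  gender = sample(c("female", "male"), n, replace = TRUE),
  cough  = sample(c("no", "yes"), n, replace = TRUE),
  outcome_death = rbinom(n, size = 1, prob = 0.4)
)

# Hypothetical helper: univariate regression within one gender stratum
uv_by_gender <- function(data, level) {
  data %>%
    filter(gender == level) %>%   # keep one stratum
    select(-gender) %>%           # gender is now constant, so drop it
    tbl_uvregression(
      method = glm,
      y = outcome_death,
      method.args = list(family = binomial),
      exponentiate = TRUE
    )
}

tbl_female <- uv_by_gender(linelist_sim, "female")
tbl_male   <- uv_by_gender(linelist_sim, "male")

# Place the two stratum-specific tables side by side
tbl_stratified <- tbl_merge(
  tbls = list(tbl_female, tbl_male),
  tab_spanner = c("**Female**", "**Male**")
)
tbl_stratified
```

---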