Day 10: AE Domain Mastery & SAE Logic
Deep Dive into Severity, Causality, and Outcomes
1 Learning Objectives
By the end of Day 10, you will be able to:
- Understand the AE domain at a production level - beyond the basics covered in Day 7
- Distinguish between severity grading (MILD/MODERATE/SEVERE) and toxicity grading (CTCAE Grade 1-5)
- Work with all SAE-related variables: AESER, AESDTH, AESLIFE, AESHOSP, AESDISAB, AESCONG, AESMIE
- Derive treatment-emergent adverse events (TEAEs) - one of the most critical derivations in clinical programming
- Calculate AE duration and handle ongoing AEs correctly
- Understand how AE data maps to the ADaM ADAE dataset
2 Why This Day Matters
On Day 7 we built a basic AE domain as part of the capstone. Today we go much deeper - because in production clinical programming, AE data is arguably the most scrutinized dataset by regulatory agencies. Getting AE logic wrong can delay a submission or trigger FDA queries.
Adverse event data directly impacts:
- Patient safety decisions - Should the trial continue?
- Drug labeling - What warnings go on the label?
- Regulatory approval - Is the risk-benefit profile acceptable?
- Post-marketing surveillance - What to watch for after approval
Every variable, every flag, every derivation matters.
3 Package Installation & Loading
3.1 Required Packages
| Package | Purpose |
|---|---|
| dplyr | Data manipulation (filter, mutate, joins) |
| lubridate | Date/time arithmetic for AE durations |
| pharmaversesdtm | Example SDTM datasets including AE and DM |
3.2 Install & Load
if (!requireNamespace("dplyr", quietly = TRUE)) install.packages("dplyr")
if (!requireNamespace("lubridate", quietly = TRUE)) install.packages("lubridate")
if (!requireNamespace("pharmaversesdtm", quietly = TRUE)) install.packages("pharmaversesdtm")
library(dplyr)
library(lubridate)
library(pharmaversesdtm)
4 Exploring the AE Domain from pharmaversesdtm
4.1 Load and Inspect
# Load AE and DM domains
data("ae", package = "pharmaversesdtm")
data("dm", package = "pharmaversesdtm")
cat("AE domain dimensions:", nrow(ae), "rows x", ncol(ae), "columns\n")
AE domain dimensions: 1191 rows x 35 columns
cat("Number of unique subjects:", n_distinct(ae$USUBJID), "\n")
Number of unique subjects: 225
cat("Variables available:\n")
Variables available:
cat(paste(names(ae), collapse = ", "), "\n")
STUDYID, DOMAIN, USUBJID, AESEQ, AESPID, AETERM, AELLT, AELLTCD, AEDECOD, AEPTCD, AEHLT, AEHLTCD, AEHLGT, AEHLGTCD, AEBODSYS, AEBDSYCD, AESOC, AESOCCD, AESEV, AESER, AEACN, AEREL, AEOUT, AESCAN, AESCONG, AESDISAB, AESDTH, AESHOSP, AESLIFE, AESOD, AEDTC, AESTDTC, AEENDTC, AESTDY, AEENDY
4.2 Understanding What Each AE Variable Means
Let’s look at the actual data - every column tells a story:
dplyr::glimpse(ae)
Rows: 1,191
Columns: 35
$ STUDYID <chr> "CDISCPILOT01", "CDISCPILOT01", "CDISCPILOT01", "CDISCPILOT01…
$ DOMAIN <chr> "AE", "AE", "AE", "AE", "AE", "AE", "AE", "AE", "AE", "AE", "…
$ USUBJID <chr> "01-701-1015", "01-701-1015", "01-701-1015", "01-701-1023", "…
$ AESEQ <dbl> 1, 2, 3, 3, 1, 2, 4, 1, 2, 1, 2, 4, 1, 2, 3, 4, 10, 3, 1, 9, …
$ AESPID <chr> "E07", "E08", "E06", "E10", "E08", "E09", "E08", "E04", "E05"…
$ AETERM <chr> "APPLICATION SITE ERYTHEMA", "APPLICATION SITE PRURITUS", "DI…
$ AELLT <chr> "APPLICATION SITE REDNESS", "APPLICATION SITE ITCHING", "DIAR…
$ AELLTCD <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
$ AEDECOD <chr> "APPLICATION SITE ERYTHEMA", "APPLICATION SITE PRURITUS", "DI…
$ AEPTCD <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
$ AEHLT <chr> "HLT_0617", "HLT_0317", "HLT_0148", "HLT_0415", "HLT_0284", "…
$ AEHLTCD <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
$ AEHLGT <chr> "HLGT_0152", "HLGT_0338", "HLGT_0588", "HLGT_0086", "HLGT_019…
$ AEHLGTCD <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
$ AEBODSYS <chr> "GENERAL DISORDERS AND ADMINISTRATION SITE CONDITIONS", "GENE…
$ AEBDSYCD <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
$ AESOC <chr> "GENERAL DISORDERS AND ADMINISTRATION SITE CONDITIONS", "GENE…
$ AESOCCD <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
$ AESEV <chr> "MILD", "MILD", "MILD", "MILD", "MILD", "MODERATE", "MILD", "…
$ AESER <chr> "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "…
$ AEACN <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
$ AEREL <chr> "PROBABLE", "PROBABLE", "REMOTE", "POSSIBLE", "POSSIBLE", "PR…
$ AEOUT <chr> "NOT RECOVERED/NOT RESOLVED", "NOT RECOVERED/NOT RESOLVED", "…
$ AESCAN <chr> "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "…
$ AESCONG <chr> "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "…
$ AESDISAB <chr> "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "…
$ AESDTH <chr> "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "…
$ AESHOSP <chr> "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "…
$ AESLIFE <chr> "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "…
$ AESOD <chr> "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "…
$ AEDTC <chr> "2014-01-16", "2014-01-16", "2014-01-16", "2012-08-27", "2012…
$ AESTDTC <chr> "2014-01-03", "2014-01-03", "2014-01-09", "2012-08-26", "2012…
$ AEENDTC <chr> NA, NA, "2014-01-11", NA, "2012-08-30", NA, "2012-08-30", NA,…
$ AESTDY <dbl> 2, 2, 8, 22, 3, 3, 3, 3, 21, 58, 125, 27, 1, 1, 23, 52, 52, 5…
$ AEENDY <dbl> NA, NA, 10, NA, 26, NA, 26, NA, NA, NA, NA, NA, 1, 1, NA, NA,…
Here’s what the key variables represent in plain English:
| Variable | What it means | Example |
|---|---|---|
| AETERM | AE as reported by investigator | “Headache” |
| AEDECOD | Standardized preferred term (MedDRA) | “HEADACHE” |
| AEBODSYS | Body system (MedDRA SOC) | “NERVOUS SYSTEM DISORDERS” |
| AESEV | Severity: MILD, MODERATE, SEVERE | “MODERATE” |
| AESER | Is it serious? Y/N | “N” |
| AEREL | Related to study drug? | “POSSIBLY RELATED” |
| AEACN | Action taken with study drug | “DOSE NOT CHANGED” |
| AEOUT | Outcome | “RECOVERED/RESOLVED” |
| AESTDTC | Start date (ISO 8601) | “2014-01-03” |
| AEENDTC | End date (ISO 8601) | “2014-01-12” |
5 Severity vs. Toxicity Grading
This is a concept that confuses many programmers. Let’s clarify it with code.
5.1 Severity Grading (AESEV)
Severity describes the intensity of the adverse event. It’s a clinical judgment:
- MILD: Awareness of event but easily tolerated
- MODERATE: Discomfort causing interference with usual activity
- SEVERE: Incapacitating; unable to do usual activities
# What severity values exist in our data?
ae %>%
  count(AESEV, name = "Count") %>%
  mutate(Percent = round(100 * Count / sum(Count), 1)) %>%
  arrange(match(AESEV, c("MILD", "MODERATE", "SEVERE")))
# A tibble: 3 × 3
AESEV Count Percent
<chr> <int> <dbl>
1 MILD 770 64.7
2 MODERATE 378 31.7
3 SEVERE 43 3.6
5.2 Toxicity Grading (AETOXGR - CTCAE Scale)
Toxicity grading uses the CTCAE (Common Terminology Criteria for Adverse Events) scale and is more granular:
| Grade | Description |
|---|---|
| 1 | Mild; asymptomatic or mild symptoms |
| 2 | Moderate; minimal, local, or non-invasive intervention indicated |
| 3 | Severe or medically significant; hospitalization indicated |
| 4 | Life-threatening; urgent intervention indicated |
| 5 | Death related to AE |
# Check if AETOXGR exists in our data
if ("AETOXGR" %in% names(ae)) {
  ae %>%
    count(AETOXGR, AESEV) %>%
    arrange(AETOXGR)
} else {
  cat("AETOXGR is not present in the pharmaversesdtm AE dataset.\n")
  cat("This is common - not all studies use CTCAE grading.\n\n")
  cat("When AETOXGR IS available, it goes in the AE domain as:\n")
  cat(" AETOXGR = Toxicity grade (1-5)\n")
  cat(" AETOXGRS = Toxicity grade from source\n")
}
AETOXGR is not present in the pharmaversesdtm AE dataset.
This is common - not all studies use CTCAE grading.
When AETOXGR IS available, it goes in the AE domain as:
AETOXGR = Toxicity grade (1-5)
AETOXGRS = Toxicity grade from source
A related and very common source of confusion in clinical programming is severity versus seriousness:
- AESEV (Severity) = How intense is the event? (MILD/MODERATE/SEVERE)
- AESER (Seriousness) = Does it meet SAE criteria? (Y/N)
A MILD rash could be an SAE if it requires hospitalization. A SEVERE headache may NOT be an SAE if it resolves quickly with OTC medication.
Severity describes intensity. Seriousness describes regulatory significance.
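To see that the two variables move independently, a cross-tabulation helps. The sketch below uses a small made-up data frame (the records are illustrative, not taken from pharmaversesdtm) and base R only, so it runs without extra packages.

```r
# Toy records (hypothetical values, not from the pharmaversesdtm data)
ae_toy <- data.frame(
  AEDECOD = c("RASH", "HEADACHE", "SYNCOPE", "NAUSEA"),
  AESEV   = c("MILD", "SEVERE", "MODERATE", "MILD"),
  AESER   = c("Y", "N", "Y", "N"),
  stringsAsFactors = FALSE
)

# Fix the display order of severity so the table reads MILD -> SEVERE
ae_toy$AESEV <- factor(ae_toy$AESEV, levels = c("MILD", "MODERATE", "SEVERE"))

# Cross-tabulate intensity (rows) against seriousness (columns)
sev_by_ser <- table(Severity = ae_toy$AESEV, Serious = ae_toy$AESER)
print(sev_by_ser)

# Records that are serious without being severe
serious_not_severe <- ae_toy[ae_toy$AESER == "Y" & ae_toy$AESEV != "SEVERE", ]
print(serious_not_severe$AEDECOD)
```

Any non-zero cell off the "SEVERE and serious" corner of the table is a reminder that AESEV cannot be used to derive AESER, or the other way round.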
6 Serious Adverse Events (SAEs)
6.1 What Makes an AE “Serious”?
An adverse event is classified as serious (AESER = “Y”) if it meets any of the following criteria:
┌─────────────────────────────────────────────────────────────────────┐
│ SAE CRITERIA VARIABLES │
├─────────────────────────────────────────────────────────────────────┤
│ AESER = "Y" if ANY of the following are "Y": │
│ │
│ AESDTH = Results in Death │
│ AESLIFE = Is Life-Threatening │
│ AESHOSP = Requires or Prolongs Hospitalization │
│ AESDISAB = Results in Persistent/Significant Disability │
│ AESCONG = Congenital Anomaly/Birth Defect │
│ AESMIE = Other Medically Important Event │
└─────────────────────────────────────────────────────────────────────┘
6.2 Exploring SAE Data
# How many SAEs in our data?
ae %>%
  count(AESER, name = "Count") %>%
  mutate(Percent = round(100 * Count / sum(Count), 1))
# A tibble: 2 × 3
AESER Count Percent
<chr> <int> <dbl>
1 N 1188 99.7
2 Y 3 0.3
6.3 SAE Criteria Breakdown
# Check which SAE criteria variables are available
sae_vars <- c("AESDTH", "AESLIFE", "AESHOSP", "AESDISAB", "AESCONG", "AESMIE")
available_sae_vars <- sae_vars[sae_vars %in% names(ae)]
cat("SAE criteria variables available:", paste(available_sae_vars, collapse = ", "), "\n\n")
SAE criteria variables available: AESDTH, AESLIFE, AESHOSP, AESDISAB, AESCONG
if (length(available_sae_vars) > 0) {
  # Show SAE details
  ae %>%
    filter(AESER == "Y") %>%
    select(USUBJID, AEDECOD, AESEV, AESER, any_of(sae_vars)) %>%
    head(10)
} else {
  cat("No SAE criteria sub-variables found in this dataset.\n")
  cat("In production, you would create them from raw CRF data.\n")
}
# A tibble: 3 × 9
USUBJID AEDECOD AESEV AESER AESDTH AESLIFE AESHOSP AESDISAB AESCONG
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 01-709-1424 SYNCOPE MODE… Y N Y N N N
2 01-718-1170 SYNCOPE SEVE… Y N N Y N N
3 01-718-1371 PARTIAL SEIZU… SEVE… Y N N Y N N
6.4 Simulating SAE Logic from Scratch
Since the practice dataset may not have all SAE sub-variables, let’s build the logic ourselves. This is exactly what you’d do on a real study:
# Create simulated AE data with SAE criteria
set.seed(42)
ae_sample <- tibble(
  USUBJID = rep(paste0("CDISC01-001-00", 1:5), each = 4),
  AESEQ = rep(1:4, 5),
  AEDECOD = sample(c("HEADACHE", "NAUSEA", "RASH", "FALL", "PNEUMONIA",
                     "MYOCARDIAL INFARCTION", "SEIZURE", "ANEMIA"), 20, replace = TRUE),
  AESEV = sample(c("MILD", "MODERATE", "SEVERE"), 20, replace = TRUE,
                 prob = c(0.5, 0.35, 0.15)),
  # SAE criteria - simulate realistic probabilities
  AESDTH = sample(c("Y", "N"), 20, replace = TRUE, prob = c(0.02, 0.98)),
  AESLIFE = sample(c("Y", "N"), 20, replace = TRUE, prob = c(0.05, 0.95)),
  AESHOSP = sample(c("Y", "N"), 20, replace = TRUE, prob = c(0.10, 0.90)),
  AESDISAB = sample(c("Y", "N"), 20, replace = TRUE, prob = c(0.03, 0.97)),
  AESCONG = "N", # Very rare in adult trials
  AESMIE = sample(c("Y", "N"), 20, replace = TRUE, prob = c(0.05, 0.95))
)
# Derive AESER: "Y" if ANY criterion is "Y"
ae_with_ser <- ae_sample %>%
  mutate(
    AESER = case_when(
      AESDTH == "Y" ~ "Y",
      AESLIFE == "Y" ~ "Y",
      AESHOSP == "Y" ~ "Y",
      AESDISAB == "Y" ~ "Y",
      AESCONG == "Y" ~ "Y",
      AESMIE == "Y" ~ "Y",
      TRUE ~ "N"
    )
  )
cat("SAE derivation results:\n")
SAE derivation results:
ae_with_ser %>%
  count(AESER, name = "Count") %>%
  mutate(Percent = round(100 * Count / sum(Count), 1))
# A tibble: 2 × 3
AESER Count Percent
<chr> <int> <dbl>
1 N 16 80
2 Y 4 20
# View SAE records with their criteria
cat("\nSAE records detail:\n")
SAE records detail:
ae_with_ser %>%
  filter(AESER == "Y") %>%
  select(USUBJID, AEDECOD, AESEV, AESER, AESDTH, AESLIFE, AESHOSP, AESDISAB, AESMIE)
# A tibble: 4 × 9
USUBJID AEDECOD AESEV AESER AESDTH AESLIFE AESHOSP AESDISAB AESMIE
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 CDISC01-001-001 PNEUMONIA MILD Y N Y N N N
2 CDISC01-001-002 NAUSEA SEVERE Y N N N N Y
3 CDISC01-001-004 FALL MODERA… Y N N Y N N
4 CDISC01-001-004 HEADACHE MILD Y N N Y N N
The case_when() above uses a waterfall approach - it checks each criterion in sequence and returns "Y" at the first match. This works well for deriving AESER, but in production you’d typically use a more explicit approach:
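AESER = if_else(
  AESDTH == "Y" | AESLIFE == "Y" | AESHOSP == "Y" |
    AESDISAB == "Y" | AESCONG == "Y" | AESMIE == "Y",
  "Y", "N"
)
Both approaches give the same result when every criterion is populated. They differ on missing values: case_when() treats a comparison against NA as no match and falls through to "N", whereas if_else() returns NA when a criterion is missing and no other criterion is "Y". The if_else() version makes the OR-logic more explicit.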
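Missing criteria are where an OR-based derivation can surprise you, because `NA | FALSE` is `NA` in R while `NA | TRUE` is `TRUE`. The following is a minimal base R sketch on made-up records; the choice to derive "N" and flag the record for review when a criterion is missing is an assumption for illustration, and the actual rule belongs in your study specifications.

```r
# Hypothetical records; rows 3 and 4 each have one missing criterion
crit <- data.frame(
  AESDTH  = c("N", "N", "N", "N"),
  AESLIFE = c("N", "Y", "N", NA),
  AESHOSP = c("N", "N", NA,  "Y"),
  stringsAsFactors = FALSE
)

# Plain OR-logic: NA propagates unless another criterion is already "Y"
or_logic <- crit$AESDTH == "Y" | crit$AESLIFE == "Y" | crit$AESHOSP == "Y"
print(or_logic)

# NA-safe derivation: count "Y" values per row, ignoring missing criteria
n_yes     <- unname(rowSums(crit == "Y", na.rm = TRUE))
n_missing <- unname(rowSums(is.na(crit)))
AESER     <- ifelse(n_yes > 0, "Y", "N")
print(AESER)

# No "Y" found but at least one criterion missing: candidate for a data query
needs_review <- n_yes == 0 & n_missing > 0
print(which(needs_review))
```

Row 4 is still serious because hospitalization is "Y" regardless of the missing life-threatening flag; row 3 cannot be settled from the data alone, which is why it is flagged rather than silently set to "N".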
7 Treatment-Emergent Adverse Events (TEAEs)
7.1 What is a TEAE?
A treatment-emergent adverse event is an AE that:
- Started on or after the first dose of study drug, OR
- Was present before treatment but worsened after the first dose
This is one of the most important derivations in clinical programming because most safety analyses focus exclusively on TEAEs.
7.2 The TEAE Decision Tree
┌─────────────────────────────────────────────────────────────────────┐
│ TEAE DECISION LOGIC │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ Is AE Start Date (AESTDTC) >= First Dose Date (RFSTDTC)? │
│ YES → TEAE = "Y" │
│ NO → Was severity worse than pre-treatment? │
│ YES → TEAE = "Y" │
│ NO → TEAE = "N" (Pre-treatment AE, not worsened) │
│ │
│ Is AE Start Date missing? │
│ → Compare AE End Date with First Dose Date │
│ → If AEENDTC >= RFSTDTC, treat as potentially TEAE │
│ → Flag for medical review │
└─────────────────────────────────────────────────────────────────────┘
7.3 Deriving TEAEs in Code
# Load DM for reference dates
data("dm", package = "pharmaversesdtm")
data("ae", package = "pharmaversesdtm")
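# Get reference start date per subject
# (RFSTDTC is used as the first-dose date here; some studies use RFXSTDTC,
# the date of first study treatment exposure, instead)
ref_dates <- dm %>%
  select(USUBJID, RFSTDTC) %>%
  filter(!is.na(RFSTDTC))
cat("Reference dates (first dose) for sample subjects:\n")
Reference dates (first dose) for sample subjects:
head(ref_dates)
# A tibble: 6 × 2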
USUBJID RFSTDTC
<chr> <chr>
1 01-701-1015 2014-01-02
2 01-701-1023 2012-08-05
3 01-701-1028 2013-07-19
4 01-701-1033 2014-03-18
5 01-701-1034 2014-07-01
6 01-701-1047 2013-02-12
# Join AE with reference dates and derive TEAE flag
ae_with_teae <- ae %>%
  left_join(ref_dates, by = "USUBJID") %>%
  mutate(
    # Parse dates - handle potential partial dates
    ae_start = ymd(AESTDTC),
    ref_start = ymd(RFSTDTC),
    # Core TEAE derivation
    TRTEMFL = case_when(
      # Case 1: AE starts on or after first dose
      !is.na(ae_start) & !is.na(ref_start) & ae_start >= ref_start ~ "Y",
      # Case 2: AE start date is missing - conservative approach
      is.na(ae_start) & !is.na(AEENDTC) & !is.na(ref_start) &
        ymd(AEENDTC) >= ref_start ~ "Y",
      # Case 3: Both dates available, AE started before treatment
      !is.na(ae_start) & !is.na(ref_start) & ae_start < ref_start ~ "N",
      # Case 4: Cannot determine
      TRUE ~ NA_character_
    )
  )
# Summary
cat("TEAE derivation results:\n")
TEAE derivation results:
ae_with_teae %>%
  count(TRTEMFL, name = "Count") %>%
  mutate(Percent = round(100 * Count / sum(Count), 1))
# A tibble: 3 × 3
TRTEMFL Count Percent
<chr> <int> <dbl>
1 N 45 3.8
2 Y 1131 95
3 <NA> 15 1.3
# Show examples of TEAE vs non-TEAE
cat("\nSample of TEAE records (started on/after first dose):\n")
Sample of TEAE records (started on/after first dose):
ae_with_teae %>%
  filter(TRTEMFL == "Y") %>%
  select(USUBJID, AEDECOD, AESTDTC, RFSTDTC, TRTEMFL) %>%
  head(5)
# A tibble: 5 × 5
USUBJID AEDECOD AESTDTC RFSTDTC TRTEMFL
<chr> <chr> <chr> <chr> <chr>
1 01-701-1015 APPLICATION SITE ERYTHEMA 2014-01-03 2014-01-02 Y
2 01-701-1015 APPLICATION SITE PRURITUS 2014-01-03 2014-01-02 Y
3 01-701-1015 DIARRHOEA 2014-01-09 2014-01-02 Y
4 01-701-1023 ATRIOVENTRICULAR BLOCK SECOND DEGREE 2012-08-26 2012-08-05 Y
5 01-701-1023 ERYTHEMA 2012-08-07 2012-08-05 Y
cat("\nSample of non-TEAE records (started before first dose):\n")
Sample of non-TEAE records (started before first dose):
ae_with_teae %>%
  filter(TRTEMFL == "N") %>%
  select(USUBJID, AEDECOD, AESTDTC, RFSTDTC, TRTEMFL) %>%
  head(5)
# A tibble: 5 × 5
USUBJID AEDECOD AESTDTC RFSTDTC TRTEMFL
<chr> <chr> <chr> <chr> <chr>
1 01-701-1111 ERYTHEMA 2012-09-02 2012-09-07 N
2 01-701-1111 ERYTHEMA 2012-09-02 2012-09-07 N
3 01-701-1111 LOCALISED INFECTION 2012-07-08 2012-09-07 N
4 01-701-1111 PRURITUS 2012-09-02 2012-09-07 N
5 01-701-1111 PRURITUS 2012-09-02 2012-09-07 N
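The derivation above covers the start-date branch of the decision tree only. The worsening branch needs severity information from before and after first dose, which studies collect in different ways. The sketch below assumes two hypothetical columns, SEV_PRE and SEV_POST (worst severity before and on/after first dose for the same event); these are illustrative names, not SDTM variables, and the records are made up.

```r
# Numeric ranks so severities can be compared
sev_rank <- c(MILD = 1, MODERATE = 2, SEVERE = 3)

# Hypothetical events; SEV_PRE and SEV_POST are illustrative column names
events <- data.frame(
  AEDECOD  = c("HEADACHE", "RASH", "NAUSEA", "FATIGUE"),
  AESTDTC  = c("2014-01-05", "2013-12-20", "2013-12-28", "2013-12-30"),
  RFSTDTC  = c("2014-01-02", "2014-01-02", "2014-01-02", "2014-01-02"),
  SEV_PRE  = c(NA, "MILD", "MODERATE", "MILD"),
  SEV_POST = c("MILD", "SEVERE", "MODERATE", NA),
  stringsAsFactors = FALSE
)

# Branch 1: event started on or after first dose
started_on_treatment <- as.Date(events$AESTDTC) >= as.Date(events$RFSTDTC)

# Branch 2: severity after first dose is higher than before first dose
# (NA when either severity is unavailable)
worsened <- unname(sev_rank[events$SEV_POST] > sev_rank[events$SEV_PRE])

events$TRTEMFL <- ifelse(
  started_on_treatment | (!is.na(worsened) & worsened), "Y", "N"
)
print(events[, c("AEDECOD", "AESTDTC", "SEV_PRE", "SEV_POST", "TRTEMFL")])
```

RASH started before treatment but went from MILD to SEVERE, so it is flagged; NAUSEA stayed MODERATE and FATIGUE has no post-dose severity, so neither is treatment-emergent under this rule.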
In real clinical data, you will frequently encounter partial dates like:
- "2014-01" (month known, day unknown)
- "2014" (only year known)
- "" or NA (completely missing)
The lubridate::ymd() function will return NA for partial dates. In production, you’d implement date imputation rules as specified in the Statistical Analysis Plan (SAP). Common approaches:
- Conservative: Impute to the latest possible date (e.g., last day of the month)
- Non-conservative: Impute to the earliest possible date (e.g., first day of the month)
- Rule-based: Use other available information to make the best guess
Which direction counts as conservative depends on the variable and the analysis, so follow the SAP rather than a fixed default.
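As a rough illustration of the earliest/latest options, here is a base R sketch. The function name `impute_partial_date()` and its `rule` argument are hypothetical, and the function only handles the YYYY and YYYY-MM forms; real imputation rules come from the SAP and usually also consider the treatment start date.

```r
# Impute partial ISO 8601 dates (YYYY or YYYY-MM) to complete dates.
# `impute_partial_date()` and `rule` are hypothetical names for illustration.
impute_partial_date <- function(dtc, rule = c("earliest", "latest")) {
  rule <- match.arg(rule)
  out <- rep(as.Date(NA), length(dtc))

  is_full <- !is.na(dtc) & grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", dtc)
  is_ym   <- !is.na(dtc) & grepl("^[0-9]{4}-[0-9]{2}$", dtc)
  is_year <- !is.na(dtc) & grepl("^[0-9]{4}$", dtc)

  # Complete dates pass through unchanged
  out[is_full] <- as.Date(dtc[is_full], format = "%Y-%m-%d")

  if (rule == "earliest") {
    out[is_ym]   <- as.Date(paste0(dtc[is_ym], "-01"), format = "%Y-%m-%d")
    out[is_year] <- as.Date(paste0(dtc[is_year], "-01-01"), format = "%Y-%m-%d")
  } else {
    # Last day of a month = first day of the next month minus one day
    y <- as.integer(substr(dtc[is_ym], 1, 4))
    m <- as.integer(substr(dtc[is_ym], 6, 7))
    next_y <- ifelse(m == 12L, y + 1L, y)
    next_m <- ifelse(m == 12L, 1L, m + 1L)
    out[is_ym]   <- as.Date(sprintf("%04d-%02d-01", next_y, next_m),
                            format = "%Y-%m-%d") - 1
    out[is_year] <- as.Date(paste0(dtc[is_year], "-12-31"), format = "%Y-%m-%d")
  }
  out  # anything else (empty string, NA, other formats) stays NA
}

partial  <- c("2014-01-03", "2014-02", "2014", NA, "2016-02", "2014-12")
earliest <- impute_partial_date(partial, "earliest")
latest   <- impute_partial_date(partial, "latest")
print(data.frame(partial, earliest, latest))
```

Keeping the original character value next to the imputed date, as in the printed data frame, makes it easy to trace which dates were imputed.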
8 AE Duration Calculations
8.1 Computing Duration in Days
# Calculate AE duration
ae_duration <- ae %>%
  mutate(
    ae_start = ymd(AESTDTC),
    ae_end = ymd(AEENDTC),
    # Duration = end - start + 1 (inclusive of start and end day)
    AEDUR = as.numeric(ae_end - ae_start) + 1,
    # Flag ongoing AEs (no end date)
    AEONGO = if_else(is.na(ae_end), "Y", "N")
  )
# Summary of durations
cat("AE Duration Summary (days):\n")
AE Duration Summary (days):
ae_duration %>%
  filter(!is.na(AEDUR)) %>%
  summarise(
    N = n(),
    Mean = round(mean(AEDUR), 1),
    Median = median(AEDUR),
    Min = min(AEDUR),
    Max = max(AEDUR),
    SD = round(sd(AEDUR), 1)
  )
# A tibble: 1 × 6
N Mean Median Min Max SD
<int> <dbl> <dbl> <dbl> <dbl> <dbl>
1 714 23.8 11 1 444 40.2
# Ongoing AEs
cat("\nOngoing AEs (no end date):\n")
Ongoing AEs (no end date):
ae_duration %>%
  count(AEONGO, name = "Count") %>%
  mutate(Percent = round(100 * Count / sum(Count), 1))
# A tibble: 2 × 3
AEONGO Count Percent
<chr> <int> <dbl>
1 N 718 60.3
2 Y 473 39.7
Notice we calculate duration as (end - start) + 1. This is the standard clinical convention:
- An AE that starts on Day 5 and ends on Day 5 lasted 1 day (not 0)
- An AE that starts on Day 5 and ends on Day 7 lasted 3 days (not 2)
This is the same “inclusive” counting rule used for study day calculations.
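The two bullet examples can be checked directly. This is a small base R sketch on made-up dates; the fourth record, with an end date before the start date, is added to show a simple data check that the +1 rule makes possible (any duration below 1 signals a date problem).

```r
# Made-up dates covering the two examples above, an ongoing AE,
# and a record whose end date precedes its start date
ae_dates <- data.frame(
  AESTDTC = c("2014-01-05", "2014-01-05", "2014-01-05", "2014-01-05"),
  AEENDTC = c("2014-01-05", "2014-01-07", NA, "2014-01-03"),
  stringsAsFactors = FALSE
)

ae_start <- as.Date(ae_dates$AESTDTC, format = "%Y-%m-%d")
ae_end   <- as.Date(ae_dates$AEENDTC, format = "%Y-%m-%d")

# Inclusive duration: same-day events last 1 day; ongoing AEs stay NA
ae_dates$AEDUR <- as.numeric(ae_end - ae_start) + 1

# A duration below 1 means the end date is before the start date
ae_dates$DATE_ISSUE <- !is.na(ae_dates$AEDUR) & ae_dates$AEDUR < 1
print(ae_dates)
```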
9 Causality Assessment
9.1 How Causality Is Determined
Causality (AEREL) indicates whether the AE is related to the study drug. The exact wording differs between studies; the pilot data below, for instance, uses NONE, REMOTE, POSSIBLE, and PROBABLE. Common categories:
| AEREL Value | Description |
|---|---|
| NOT RELATED | No reasonable possibility of relationship |
| UNLIKELY | Doubtful relationship |
| POSSIBLE | Cannot rule out relationship |
| PROBABLE | Likely related |
| DEFINITE | Clearly related |
# Causality distribution
cat("Causality (AEREL) distribution:\n")
Causality (AEREL) distribution:
ae %>%
  count(AEREL, name = "Count") %>%
  mutate(Percent = round(100 * Count / sum(Count), 1)) %>%
  arrange(desc(Count))
# A tibble: 5 × 3
AEREL Count Percent
<chr> <int> <dbl>
1 PROBABLE 361 30.3
2 POSSIBLE 343 28.8
3 NONE 322 27
4 REMOTE 161 13.5
5 <NA> 4 0.3
10 Action Taken and Outcome
10.1 AEACN - Action Taken with Study Drug
The action taken in response to the AE is captured in AEACN:
ae %>%
  count(AEACN, name = "Count") %>%
  mutate(Percent = round(100 * Count / sum(Count), 1)) %>%
  arrange(desc(Count))
# A tibble: 1 × 3
AEACN Count Percent
<chr> <int> <dbl>
1 <NA> 1191 100
10.2 AEOUT - Outcome of the AE
ae %>%
  count(AEOUT, name = "Count") %>%
  mutate(Percent = round(100 * Count / sum(Count), 1)) %>%
  arrange(desc(Count))
# A tibble: 3 × 3
AEOUT Count Percent
<chr> <int> <dbl>
1 NOT RECOVERED/NOT RESOLVED 723 60.7
2 RECOVERED/RESOLVED 465 39
3 FATAL 3 0.3
All of these variables - AEACN, AEOUT, AESEV, AESER - use CDISC Controlled Terminology. You can’t invent your own values. The allowed values are specified in the CDISC CT Package.
For example, valid AEOUT values are:
- RECOVERED/RESOLVED
- RECOVERING/RESOLVING
- NOT RECOVERED/NOT RESOLVED
- RECOVERED/RESOLVED WITH SEQUELAE
- FATAL
- UNKNOWN
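A quick way to enforce this in practice is to compare collected values against the codelist before mapping. The sketch below hard-codes the AEOUT values listed above and checks a made-up vector of raw values; in a real study the codelist would be read from the CT package version named in your specifications rather than typed in.

```r
# AEOUT values as listed above (hard-coded here for illustration)
aeout_ct <- c(
  "RECOVERED/RESOLVED", "RECOVERING/RESOLVING", "NOT RECOVERED/NOT RESOLVED",
  "RECOVERED/RESOLVED WITH SEQUELAE", "FATAL", "UNKNOWN"
)

# Hypothetical collected values, two of which are not in the codelist
aeout_raw <- c("RECOVERED/RESOLVED", "Resolved", "FATAL", NA, "ONGOING")

# Non-missing values outside the codelist need to be mapped or queried
invalid <- unique(aeout_raw[!is.na(aeout_raw) & !(aeout_raw %in% aeout_ct)])
print(invalid)
```

The comparison is case-sensitive on purpose: "Resolved" is reported as invalid because controlled terms must match exactly.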
11 Complete Example: Production AE Processing
Let’s put everything together into a production-quality AE processing pipeline:
# ---- Production AE Processing Pipeline ----
# Step 1: Start with raw AE data
data("ae", package = "pharmaversesdtm")
data("dm", package = "pharmaversesdtm")
# Step 2: Get reference dates
ref <- dm %>%
  select(STUDYID, USUBJID, RFSTDTC, RFENDTC) %>%
  mutate(
    ref_start = ymd(RFSTDTC),
    ref_end = ymd(RFENDTC)
  )
# Step 3: Enhance AE data with all derivations
ae_production <- ae %>%
  # Join reference dates
  left_join(ref %>% select(USUBJID, ref_start, ref_end, RFSTDTC), by = "USUBJID") %>%
  mutate(
    # Parse dates
    ae_start = ymd(AESTDTC),
    ae_end = ymd(AEENDTC),
    # ---- TEAE Flag ----
    TRTEMFL = case_when(
      !is.na(ae_start) & !is.na(ref_start) & ae_start >= ref_start ~ "Y",
      is.na(ae_start) & !is.na(ae_end) & !is.na(ref_start) & ae_end >= ref_start ~ "Y",
      !is.na(ae_start) & !is.na(ref_start) & ae_start < ref_start ~ "N",
      TRUE ~ NA_character_
    ),
    # ---- Study Days ----
    AESTDY = case_when(
      !is.na(ae_start) & !is.na(ref_start) & ae_start >= ref_start ~
        as.numeric(ae_start - ref_start) + 1,
      !is.na(ae_start) & !is.na(ref_start) & ae_start < ref_start ~
        as.numeric(ae_start - ref_start),
      TRUE ~ NA_real_
    ),
    AEENDY = case_when(
      !is.na(ae_end) & !is.na(ref_start) & ae_end >= ref_start ~
        as.numeric(ae_end - ref_start) + 1,
      !is.na(ae_end) & !is.na(ref_start) & ae_end < ref_start ~
        as.numeric(ae_end - ref_start),
      TRUE ~ NA_real_
    ),
    # ---- Duration ----
    AEDUR = if_else(!is.na(ae_start) & !is.na(ae_end),
                    as.numeric(ae_end - ae_start) + 1,
                    NA_real_),
    # ---- Ongoing Flag ----
    AEONGO = if_else(is.na(ae_end), "Y", "N"),
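    # ---- Binary Relatedness ----
    # Missing AEREL is counted as not related here; check the SAP, as some
    # studies treat missing causality as related
    RELFL = case_when(
      AEREL %in% c("POSSIBLE", "PROBABLE", "DEFINITE") ~ "Y",
      # Catch values such as "RELATED" or "POSSIBLY RELATED", but not
      # "NOT RELATED", "UNRELATED", or "UNLIKELY RELATED"
      grepl("RELAT", AEREL, ignore.case = TRUE) &
        !grepl("^(NOT|UN)", AEREL, ignore.case = TRUE) ~ "Y",
      TRUE ~ "N"
    )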
  ) %>%
  # Clean up helper columns
  select(-ae_start, -ae_end, -ref_start, -ref_end)
cat("Production AE dataset:\n")
Production AE dataset:
cat("Total AE records:", nrow(ae_production), "\n")
Total AE records: 1191
cat("TEAEs:", sum(ae_production$TRTEMFL == "Y", na.rm = TRUE), "\n")
TEAEs: 1131
cat("SAEs:", sum(ae_production$AESER == "Y", na.rm = TRUE), "\n")
SAEs: 3
cat("Drug-related:", sum(ae_production$RELFL == "Y", na.rm = TRUE), "\n\n")
Drug-related: 704
# Preview the enhanced dataset
ae_production %>%
  select(USUBJID, AEDECOD, AESEV, AESER, TRTEMFL, AESTDY, AEDUR, AEONGO, RELFL) %>%
  head(15)
# A tibble: 15 × 9
USUBJID AEDECOD AESEV AESER TRTEMFL AESTDY AEDUR AEONGO RELFL
<chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <chr> <chr>
1 01-701-1015 APPLICATION SITE E… MILD N Y 2 NA Y Y
2 01-701-1015 APPLICATION SITE P… MILD N Y 2 NA Y Y
3 01-701-1015 DIARRHOEA MILD N Y 8 3 N N
4 01-701-1023 ATRIOVENTRICULAR B… MILD N Y 22 NA Y Y
5 01-701-1023 ERYTHEMA MILD N Y 3 24 N Y
6 01-701-1023 ERYTHEMA MODE… N Y 3 NA Y Y
7 01-701-1023 ERYTHEMA MILD N Y 3 24 N Y
8 01-701-1028 APPLICATION SITE E… MILD N Y 3 NA Y Y
9 01-701-1028 APPLICATION SITE P… MILD N Y 21 NA Y Y
10 01-701-1034 APPLICATION SITE P… MILD N Y 58 NA Y Y
11 01-701-1034 FATIGUE MILD N Y 125 NA Y Y
12 01-701-1047 BUNDLE BRANCH BLOC… MILD N Y 27 NA Y N
13 01-701-1047 HIATUS HERNIA MODE… N Y 1 1 N N
14 01-701-1047 HIATUS HERNIA MODE… N Y 1 1 N N
15 01-701-1047 UPPER RESPIRATORY … MILD N Y 23 NA Y N
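The study-day rule used in the pipeline (no Day 0: the reference date is Day 1 and the day before it is Day -1) is easy to get wrong, so it can help to isolate it in a helper and test it on known values. The function name `study_day()` is hypothetical; the sketch uses base R only.

```r
# Study day relative to a reference date; there is no Day 0.
# `study_day()` is a hypothetical helper name for illustration.
study_day <- function(dtc, ref_dtc) {
  event_date <- as.Date(dtc, format = "%Y-%m-%d")
  ref_date   <- as.Date(ref_dtc, format = "%Y-%m-%d")
  diff_days  <- as.numeric(event_date - ref_date)
  # On or after the reference date: add 1; before it: use the raw difference
  ifelse(diff_days >= 0, diff_days + 1, diff_days)
}

# Reference date itself, the day after, the day before, and a missing date
days <- study_day(c("2014-01-02", "2014-01-03", "2014-01-01", NA), "2014-01-02")
print(days)
```

The second value matches the first subject in the preview above, whose AE started on 2014-01-03 against a reference date of 2014-01-02 and has AESTDY = 2.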
12 Key AE Counts for Safety Reporting
In clinical study reports, AE tables almost always include these counts:
# ---- Summary Table: AE Incidence ----
cat("=== AE Incidence Summary ===\n\n")
=== AE Incidence Summary ===
# Total subjects
n_total <- n_distinct(dm$USUBJID)
n_ae <- n_distinct(ae_production$USUBJID)
cat("Total subjects enrolled:", n_total, "\n")
Total subjects enrolled: 306
cat("Subjects with any AE:", n_ae,
    sprintf("(%.1f%%)", 100 * n_ae / n_total), "\n\n")
Subjects with any AE: 225 (73.5%)
# TEAE summary
teae_data <- ae_production %>% filter(TRTEMFL == "Y")
n_teae <- n_distinct(teae_data$USUBJID)
cat("Subjects with any TEAE:", n_teae,
    sprintf("(%.1f%%)", 100 * n_teae / n_total), "\n")
Subjects with any TEAE: 218 (71.2%)
# SAE summary
sae_data <- ae_production %>% filter(AESER == "Y")
n_sae <- n_distinct(sae_data$USUBJID)
cat("Subjects with any SAE:", n_sae,
    sprintf("(%.1f%%)", 100 * n_sae / n_total), "\n")
Subjects with any SAE: 3 (1.0%)
# Drug-related TEAE
rel_teae <- ae_production %>% filter(TRTEMFL == "Y", RELFL == "Y")
n_rel <- n_distinct(rel_teae$USUBJID)
cat("Subjects with drug-related TEAE:", n_rel,
    sprintf("(%.1f%%)", 100 * n_rel / n_total), "\n")
Subjects with drug-related TEAE: 185 (60.5%)
# TEAE by maximum severity
cat("\nTEAE by Maximum Severity per Subject:\n")
TEAE by Maximum Severity per Subject:
teae_data %>%
  mutate(SEV_NUM = case_when(
    AESEV == "MILD" ~ 1,
    AESEV == "MODERATE" ~ 2,
    AESEV == "SEVERE" ~ 3
  )) %>%
  group_by(USUBJID) %>%
  summarise(MAX_SEV = max(SEV_NUM, na.rm = TRUE), .groups = "drop") %>%
  mutate(MAX_AESEV = case_when(
    MAX_SEV == 1 ~ "MILD",
    MAX_SEV == 2 ~ "MODERATE",
    MAX_SEV == 3 ~ "SEVERE"
  )) %>%
  count(MAX_AESEV, name = "N_Subjects") %>%
  mutate(Percent = round(100 * N_Subjects / n_total, 1)) %>%
  arrange(match(MAX_AESEV, c("MILD", "MODERATE", "SEVERE")))
# A tibble: 3 × 3
MAX_AESEV N_Subjects Percent
<chr> <int> <dbl>
1 MILD 77 25.2
2 MODERATE 112 36.6
3 SEVERE 29 9.5
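The key point of the maximum-severity table is that each subject is counted once, under their worst event. The base R sketch below shows the same idea on a made-up data set with three subjects, using `match()` against an ordered vector of levels instead of two `case_when()` calls. One caveat applies to both versions: a subject whose severities are all missing would make `max(..., na.rm = TRUE)` return `-Inf` with a warning, so such subjects should be handled before this step.

```r
# Made-up TEAE records for three subjects
teae_toy <- data.frame(
  USUBJID = c("S1", "S1", "S2", "S3", "S3", "S3"),
  AESEV   = c("MILD", "SEVERE", "MILD", "MODERATE", "MILD", NA),
  stringsAsFactors = FALSE
)

# Position in this vector doubles as the numeric severity code
sev_levels <- c("MILD", "MODERATE", "SEVERE")
sev_code   <- match(teae_toy$AESEV, sev_levels)

# Worst severity code per subject, ignoring missing severities
max_code <- tapply(sev_code, teae_toy$USUBJID, max, na.rm = TRUE)

max_sev <- data.frame(
  USUBJID   = names(max_code),
  MAX_AESEV = sev_levels[as.vector(max_code)],
  stringsAsFactors = FALSE
)
print(max_sev)

# One count per subject, in clinical order
counts <- table(factor(max_sev$MAX_AESEV, levels = sev_levels))
print(counts)
```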
13 Preview: From AE to ADAE
The SDTM AE domain feeds into the ADaM ADAE dataset. Here’s a preview of the key mappings:
| SDTM AE | ADaM ADAE | Description |
|---|---|---|
| AEDECOD | AEDECOD | Preferred term (carried forward) |
| AEBODSYS | AEBODSYS | Body system (carried forward) |
| AESEV | AESEV | Severity (carried forward) |
| AESTDTC → parsed | ASTDT | Analysis start date (numeric) |
| AEENDTC → parsed | AENDT | Analysis end date (numeric) |
| derived | TRTEMFL | Treatment-emergent flag |
| derived | AOCCPIFL | Flags the worst (maximum-severity) occurrence per subject and preferred term |
| from ADSL | TRT01A | Actual treatment (merged from ADSL) |
In Week 3, when we build ADaM datasets with admiral, the derive_var_trtemfl() function will handle TEAE derivation automatically - but understanding the logic behind it (as we’ve done today) is essential for debugging and validation.
14 Deliverable Summary
Today you completed the following:
| Task | Status |
|---|---|
| Understood severity vs. toxicity grading | ✓ Done |
| Explored all SAE criteria variables (AESER, AESDTH, etc.) | ✓ Done |
| Derived SAE flag from sub-criteria using OR-logic | ✓ Done |
| Derived treatment-emergent AE (TEAE) flag | ✓ Done |
| Calculated AE duration and identified ongoing AEs | ✓ Done |
| Analyzed causality, action taken, and outcome | ✓ Done |
| Built a production AE processing pipeline | ✓ Done |
| Generated safety summary counts | ✓ Done |
15 Key Takeaways
- Severity ≠ Seriousness - A mild AE can be serious; a severe AE may not be serious
- SAE criteria are additive - AESER = “Y” if ANY sub-criterion is “Y”
- TEAEs are the focus - Most safety analyses exclude pre-treatment AEs
- Duration uses the +1 rule - Inclusive of both start and end day
- Causality is simplified - Binary related/not-related flags are common in ADaM
- Controlled Terminology is mandatory - Use only CDISC-approved values
16 Resources
- CDISC SDTM Implementation Guide - AE Domain - Official AE specification
- MedDRA Terminology - Medical Dictionary for Regulatory Activities
- CTCAE v5.0 - Common Terminology Criteria for Adverse Events
- ICH E2A Guidelines - Clinical Safety Data Management
- Admiral ADAE Vignette - Building ADAE with admiral
17 What’s Next?
In Day 11, we will focus on Disposition (DS) & Trial Design Domains:
- Understanding the DS domain for screen failures, completers, early terminators
- Trial Design domains: TA, TE, TV, TI, TS
- Working with EPOCH and milestone variables
- Subject flow and disposition summaries