Welcome to Hands-On ADSL Creation
Today you’ll build ADSL Part 1 - the foundation dataset that every other ADaM depends on.
ADSL is critical because:
One record per subject
All other ADaMs merge from ADSL
Treatment dates define analysis windows
Population flags control analyses
Today’s focus: Treatment variables and dates from EX and DS domains.
Learning Objectives
By the end of Day 16, you will:
Derive first dose date (TRTSDTM, TRTSDT) using derive_vars_merged() with mode = "first"
Derive last dose date (TRTEDTM, TRTEDT) using mode = "last"
Calculate treatment duration (TRTDURD) with derive_var_trtdurd()
Derive treatment variables (TRT01P, TRT01A)
Handle screen failures and never-dosed subjects
Export ADSL Part 1 as XPT
ADSL Variables Created Today
Treatment Dates:
TRTSDT: Date of First Exposure to Treatment
TRTSDTM: Datetime of First Exposure to Treatment
TRTEDT: Date of Last Exposure to Treatment
TRTEDTM: Datetime of Last Exposure to Treatment
TRTDURD: Total Treatment Duration (Days)
Treatment Assignment:
TRT01P: Planned Treatment for Period 01
TRT01A: Actual Treatment for Period 01
Disposition:
EOSDT: End of Study Date
EOSSTT: End of Study Status
DCSREAS: Reason for Discontinuation
DCSREASP: Reason Specified for Discontinuation
Population Flag:
SAFFL: Safety Population Flag
Setup
library(admiral)
library(pharmaversesdtm)
library(dplyr)
library(lubridate)
library(stringr)
# Load SDTM domains
dm <- pharmaversesdtm::dm
ex <- pharmaversesdtm::ex
ds <- pharmaversesdtm::ds
# Check dimensions
cat("DM:", nrow(dm), "subjects\n")
cat("EX:", nrow(ex), "records\n")
cat("DS:", nrow(ds), "records\n")
Step 1: Initialize ADSL from DM
# Start with Demographics domain
adsl <- dm %>%
  select(-DOMAIN)
cat("\nADSL initialized with", nrow(adsl), "subjects and", ncol(adsl), "variables\n")
ADSL initialized with 306 subjects and 25 variables
# Preview first few subjects
adsl %>%
  select(USUBJID, AGE, SEX, RACE, ARM, ACTARM) %>%
  head(5)
# A tibble: 5 × 6
USUBJID AGE SEX RACE ARM ACTARM
<chr> <dbl> <chr> <chr> <chr> <chr>
1 01-701-1015 63 F WHITE Placebo Placebo
2 01-701-1023 64 M WHITE Placebo Placebo
3 01-701-1028 71 M WHITE Xanomeline High Dose Xanomeline High Dose
4 01-701-1033 74 M WHITE Xanomeline Low Dose Xanomeline Low Dose
5 01-701-1034 77 F WHITE Xanomeline High Dose Xanomeline High Dose
Why start with DM?
DM has one record per subject (ADSL structure)
Contains demographics, treatment assignment, reference dates
Foundation for all other derivations
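The one-record-per-subject claim is worth checking before building on it. A minimal sketch, assuming a toy stand-in for DM (the data frame `dm_toy` and its subject IDs are made up for illustration); the same check could be pointed at the real `dm`:

```r
library(dplyr)

# Toy stand-in for DM; subject IDs are invented for illustration
dm_toy <- data.frame(
  STUDYID = "STUDY01",
  USUBJID = c("01-001", "01-002", "01-003"),
  ARM = c("Placebo", "Xanomeline Low Dose", "Screen Failure")
)

# Count records per subject and keep any subject that appears more than once
dup_subjects <- dm_toy %>%
  count(STUDYID, USUBJID, name = "n_records") %>%
  filter(n_records > 1)

# Zero rows means the one-record-per-subject structure holds
nrow(dup_subjects)
```

If this returns anything other than zero, every later merge by STUDYID and USUBJID would risk duplicating subjects in ADSL.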
Step 2: Derive Treatment Variables
# Derive planned and actual treatment
adsl <- adsl %>%
  mutate(
    TRT01P = ARM,
    TRT01A = ACTARM
  )
# Check treatment distribution
adsl %>%
  count(TRT01P, TRT01A)
# A tibble: 5 × 3
TRT01P TRT01A n
<chr> <chr> <int>
1 Placebo Placebo 86
2 Screen Failure Screen Failure 52
3 Xanomeline High Dose Xanomeline High Dose 72
4 Xanomeline High Dose Xanomeline Low Dose 12
5 Xanomeline Low Dose Xanomeline Low Dose 84
TRT01P vs TRT01A:
TRT01P: Planned treatment (from randomization)
TRT01A: Actual treatment received
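May differ due to treatment switches or dosing errors; in the output above, 12 subjects planned for Xanomeline High Dose actually received Xanomeline Low Dose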
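To list the subjects whose planned and actual treatment disagree, a simple filter is usually enough. A minimal sketch on toy data (`adsl_toy` and its IDs are hypothetical, not taken from the study data):

```r
library(dplyr)

# Toy ADSL with one subject whose actual treatment differs from planned
adsl_toy <- data.frame(
  USUBJID = c("01-001", "01-002", "01-003"),
  TRT01P = c("Placebo", "Xanomeline High Dose", "Xanomeline High Dose"),
  TRT01A = c("Placebo", "Xanomeline High Dose", "Xanomeline Low Dose")
)

# Keep only subjects where planned and actual treatment differ
mismatch <- adsl_toy %>%
  filter(TRT01P != TRT01A)

mismatch
```

Reviewing this list matters because safety analyses are normally run by actual treatment, while efficacy analyses are normally run by planned treatment.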
Step 3: Prepare EX with Date Conversion
# Convert exposure dates to datetime
ex_ext <- ex %>%
  derive_vars_dtm(
    dtc = EXSTDTC,
    new_vars_prefix = "EXST"
  ) %>%
  derive_vars_dtm(
    dtc = EXENDTC,
    new_vars_prefix = "EXEN",
    time_imputation = "last"
  )
cat("\nEX dataset extended with datetime variables\n")
EX dataset extended with datetime variables
# Check one subject
ex_ext %>%
  filter(USUBJID == "01-701-1015") %>%
  select(USUBJID, EXSTDTC, EXSTDTM, EXENDTC, EXENDTM, EXDOSE, VISIT) %>%
  head(5)
# A tibble: 3 × 7
USUBJID EXSTDTC EXSTDTM EXENDTC EXENDTM EXDOSE VISIT
<chr> <chr> <dttm> <chr> <dttm> <dbl> <chr>
1 01-701-1… 2014-0… 2014-01-02 00:00:00 2014-0… 2014-01-16 23:59:59 0 BASE…
2 01-701-1… 2014-0… 2014-01-17 00:00:00 2014-0… 2014-06-18 23:59:59 0 WEEK…
3 01-701-1… 2014-0… 2014-06-19 00:00:00 2014-0… 2014-07-02 23:59:59 0 WEEK…
What does derive_vars_dtm() do?
Converts ISO 8601 character dates (EXSTDTC) to POSIXct datetime (EXSTDTM)
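Imputes a missing time part: 00:00:00 by default, 23:59:59 with time_imputation = "last"
Leaves the datetime missing when date parts are absent, unless highest_imputation is raised from its default of "h"
Adds time imputation flags (EXSTTMF, EXENTMF) that mark which values were imputed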
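The effect of time imputation is easiest to see on two hand-made records, one with a date only and one with a complete datetime. This is a small sketch assuming a toy data frame `ex_toy` (hypothetical values); it uses the same `derive_vars_dtm()` call as Step 3:

```r
library(admiral)
library(dplyr)

# Toy exposure records: one date-only value, one complete datetime
ex_toy <- data.frame(
  USUBJID = c("01-001", "01-002"),
  EXENDTC = c("2014-01-16", "2014-01-16T08:30:15")
)

# Impute a missing time to the end of the day, as done for EXENDTC in Step 3
ex_toy_dtm <- ex_toy %>%
  derive_vars_dtm(
    dtc = EXENDTC,
    new_vars_prefix = "EXEN",
    time_imputation = "last"
  )

# Show the datetime as text next to the time imputation flag
ex_toy_dtm %>%
  mutate(EXENDTM_CHR = format(EXENDTM, "%Y-%m-%d %H:%M:%S")) %>%
  select(USUBJID, EXENDTC, EXENDTM_CHR, EXENTMF)
```

The date-only record should come back as 23:59:59 with a populated EXENTMF, while the complete datetime is kept as collected and its flag stays missing.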
Step 4: Derive First Dose Date
# Derive first dose datetime
adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ex_ext,
    filter_add = (EXDOSE > 0 |
      (EXDOSE == 0 & str_detect(EXTRT, "PLACEBO"))) &
      !is.na(EXSTDTM),
    new_vars = exprs(TRTSDTM = EXSTDTM),
    order = exprs(EXSTDTM, EXSEQ),
    mode = "first",
    by_vars = exprs(STUDYID, USUBJID)
  )
cat("\nFirst dose date derived\n")
# Check results
adsl %>%
  select(USUBJID, TRT01A, TRTSDTM) %>%
  filter(!is.na(TRTSDTM)) %>%
  head(5)
# A tibble: 5 × 3
USUBJID TRT01A TRTSDTM
<chr> <chr> <dttm>
1 01-701-1015 Placebo 2014-01-02 00:00:00
2 01-701-1023 Placebo 2012-08-05 00:00:00
3 01-701-1028 Xanomeline High Dose 2013-07-19 00:00:00
4 01-701-1033 Xanomeline Low Dose 2014-03-18 00:00:00
5 01-701-1034 Xanomeline High Dose 2014-07-01 00:00:00
Understanding derive_vars_merged():
dataset_add = ex_ext: Merge from EX dataset
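filter_add: Keep active doses (EXDOSE > 0) and placebo records (EXDOSE == 0 with EXTRT containing "PLACEBO"), and require a non-missing EXSTDTM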
mode = "first": Take earliest record
order = exprs(EXSTDTM, EXSEQ): Order by datetime, then sequence
by_vars: Join on STUDYID and USUBJID
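The arguments above map onto a familiar filter, sort, pick-one, join sequence. The following is a conceptual sketch in plain dplyr on toy data (`adsl_toy` and `ex_toy` are hypothetical); it is meant to illustrate the logic, not to reproduce how admiral implements it, and it uses base `grepl()` in place of `str_detect()`:

```r
library(dplyr)

# Toy ADSL: subject 01-003 has no exposure records at all
adsl_toy <- data.frame(
  STUDYID = "S1",
  USUBJID = c("01-001", "01-002", "01-003")
)

# Toy EX: subject 01-002 has an early zero-dose active record that must be ignored
ex_toy <- data.frame(
  STUDYID = "S1",
  USUBJID = c("01-001", "01-001", "01-002", "01-002"),
  EXSEQ = c(1, 2, 1, 2),
  EXTRT = c("PLACEBO", "PLACEBO", "XANOMELINE", "XANOMELINE"),
  EXDOSE = c(0, 0, 0, 54),
  EXSTDTM = as.POSIXct(
    c("2014-01-02 00:00:00", "2014-01-17 00:00:00",
      "2012-08-01 00:00:00", "2012-08-05 00:00:00"),
    tz = "UTC"
  )
)

first_dose <- ex_toy %>%
  # filter_add: active doses or placebo records, with a known start datetime
  filter((EXDOSE > 0 | (EXDOSE == 0 & grepl("PLACEBO", EXTRT))) & !is.na(EXSTDTM)) %>%
  # order: sort by datetime, then sequence, within each subject
  arrange(STUDYID, USUBJID, EXSTDTM, EXSEQ) %>%
  group_by(STUDYID, USUBJID) %>%
  # mode = "first": keep the earliest record per subject
  slice(1) %>%
  ungroup() %>%
  # new_vars: rename the selected value
  select(STUDYID, USUBJID, TRTSDTM = EXSTDTM)

# by_vars: left join keeps every ADSL subject, dosed or not
adsl_toy <- left_join(adsl_toy, first_dose, by = c("STUDYID", "USUBJID"))
adsl_toy
```

Replacing `slice(1)` with `slice(n())` and sorting on the end datetime would mirror mode = "last". Subjects without a qualifying record keep a missing TRTSDTM because of the left join.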
Step 5: Derive Last Dose Date
# Derive last dose datetime
adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ex_ext,
    filter_add = (EXDOSE > 0 |
      (EXDOSE == 0 & str_detect(EXTRT, "PLACEBO"))) &
      !is.na(EXENDTM),
    new_vars = exprs(TRTEDTM = EXENDTM),
    order = exprs(EXENDTM, EXSEQ),
    mode = "last",
    by_vars = exprs(STUDYID, USUBJID)
  )
cat("\nLast dose date derived\n")
# Check results
adsl %>%
  select(USUBJID, TRTSDTM, TRTEDTM) %>%
  filter(!is.na(TRTSDTM)) %>%
  head(5)
# A tibble: 5 × 3
USUBJID TRTSDTM TRTEDTM
<chr> <dttm> <dttm>
1 01-701-1015 2014-01-02 00:00:00 2014-07-02 23:59:59
2 01-701-1023 2012-08-05 00:00:00 2012-09-01 23:59:59
3 01-701-1028 2013-07-19 00:00:00 2014-01-14 23:59:59
4 01-701-1033 2014-03-18 00:00:00 2014-03-31 23:59:59
5 01-701-1034 2014-07-01 00:00:00 2014-12-30 23:59:59
Key difference from Step 4:
mode = "last": Take latest record instead of earliest
Same function, different parameter
Step 6: Convert Datetime to Date
# Extract date from datetime
adsl <- adsl %>%
  derive_vars_dtm_to_dt(source_vars = exprs(TRTSDTM, TRTEDTM))
cat("\nDate variables created from datetime\n")
Date variables created from datetime
# Check both formats
adsl %>%
  select(USUBJID, TRTSDTM, TRTSDT, TRTEDTM, TRTEDT) %>%
  filter(!is.na(TRTSDT)) %>%
  head(5)
# A tibble: 5 × 5
USUBJID TRTSDTM TRTSDT TRTEDTM TRTEDT
<chr> <dttm> <date> <dttm> <date>
1 01-701-1015 2014-01-02 00:00:00 2014-01-02 2014-07-02 23:59:59 2014-07-02
2 01-701-1023 2012-08-05 00:00:00 2012-08-05 2012-09-01 23:59:59 2012-09-01
3 01-701-1028 2013-07-19 00:00:00 2013-07-19 2014-01-14 23:59:59 2014-01-14
4 01-701-1033 2014-03-18 00:00:00 2014-03-18 2014-03-31 23:59:59 2014-03-31
5 01-701-1034 2014-07-01 00:00:00 2014-07-01 2014-12-30 23:59:59 2014-12-30
Why both datetime and date?
TRTSDTM/TRTEDTM: Precise datetime for sorting, merging
TRTSDT/TRTEDT: Date for calculations, display in tables
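The practical difference shows up in arithmetic: datetimes carry the imputed time of day, dates do not. A small base R sketch with made-up values (it illustrates the idea only and does not call admiral):

```r
# A last-dose datetime with the time imputed to the end of the day
trtedtm <- as.POSIXct("2014-07-02 23:59:59", tz = "UTC")

# Dropping the time part gives the calendar date
trtedt <- as.Date(trtedtm, tz = "UTC")

# Datetime arithmetic returns a fractional number of days
gap_datetime <- as.numeric(
  difftime(trtedtm, as.POSIXct("2014-07-02 00:00:00", tz = "UTC"), units = "days")
)

# Date arithmetic returns whole days
gap_date <- as.numeric(trtedt - as.Date("2014-07-02"))

c(gap_datetime = gap_datetime, gap_date = gap_date)
```

Here the datetime gap is just under one day while the date gap is exactly zero, which is why day counts such as TRTDURD are computed from the date variables.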
Step 7: Derive Treatment Duration
# Calculate treatment duration in days
adsl <- adsl %>%
  derive_var_trtdurd()
cat("\nTreatment duration calculated\n")
Treatment duration calculated
# Summary statistics
adsl %>%
  filter(!is.na(TRTDURD)) %>%
  summarise(
    N = n(),
    Mean = round(mean(TRTDURD), 1),
    SD = round(sd(TRTDURD), 1),
    Median = median(TRTDURD),
    Min = min(TRTDURD),
    Max = max(TRTDURD)
  )
# A tibble: 1 × 6
N Mean SD Median Min Max
<int> <dbl> <dbl> <dbl> <dbl> <dbl>
1 252 115. 70.7 132 1 212
Formula: TRTDURD = TRTEDT - TRTSDT + 1
The +1 is CDISC convention: same-day treatment = 1 day, not 0.
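The formula can be verified by hand in base R. This sketch uses two illustrative date pairs: one taken from the Step 6 output (2014-01-02 to 2014-07-02) and one made-up same-day case:

```r
# Start and end dates for two subjects; the second is a same-day treatment
trtsdt <- as.Date(c("2014-01-02", "2014-03-18"))
trtedt <- as.Date(c("2014-07-02", "2014-03-18"))

# TRTDURD = TRTEDT - TRTSDT + 1, so both the first and last day are counted
trtdurd <- as.numeric(trtedt - trtsdt) + 1

trtdurd
# → 182 1
```

Without the +1, the same-day subject would get a duration of 0 days even though a dose was taken.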
Step 8: Derive Disposition Date (EOSDT)
# Convert DS date
ds_ext <- derive_vars_dt(
  ds,
  dtc = DSSTDTC,
  new_vars_prefix = "DSST"
)
# Merge end of study date to ADSL
adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ds_ext,
    by_vars = exprs(STUDYID, USUBJID),
    new_vars = exprs(EOSDT = DSSTDT),
    filter_add = DSCAT == "DISPOSITION EVENT" & DSDECOD != "SCREEN FAILURE"
  )
cat("\nEnd of study date derived\n")
End of study date derived
# Check
adsl %>%
  filter(!is.na(EOSDT)) %>%
  select(USUBJID, TRTEDT, EOSDT) %>%
  head(5)
# A tibble: 5 × 3
USUBJID TRTEDT EOSDT
<chr> <date> <date>
1 01-701-1015 2014-07-02 2014-07-02
2 01-701-1023 2012-09-01 2012-09-02
3 01-701-1028 2014-01-14 2014-01-14
4 01-701-1033 2014-03-31 2014-04-14
5 01-701-1034 2014-12-30 2014-12-30
Step 9: Derive Disposition Status (EOSSTT)
# Helper function for status formatting
format_eosstt <- function(x) {
  case_when(
    x %in% c("COMPLETED") ~ "COMPLETED",
    x %in% c("SCREEN FAILURE") ~ NA_character_,
    TRUE ~ "DISCONTINUED"
  )
}
# Derive end of study status
adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ds,
    by_vars = exprs(STUDYID, USUBJID),
    filter_add = DSCAT == "DISPOSITION EVENT",
    new_vars = exprs(EOSSTT = format_eosstt(DSDECOD)),
    missing_values = exprs(EOSSTT = "ONGOING")
  )
cat("\nEnd of study status derived\n")
End of study status derived
# Distribution
adsl %>%
  count(EOSSTT)
# A tibble: 3 × 2
EOSSTT n
<chr> <int>
1 COMPLETED 110
2 DISCONTINUED 144
3 <NA> 52
Status categories:
COMPLETED: Finished study per protocol
DISCONTINUED: Withdrew before completion
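ONGOING: No disposition event recorded yet (for data cuts); no subject in this dataset falls in this category
NA: Screen failures, whose SCREEN FAILURE disposition record is mapped to missing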
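Running the helper on a handful of DSDECOD values makes the mapping concrete. The sketch below repeats `format_eosstt()` from this step so it runs on its own; the input vector is made up for illustration:

```r
library(dplyr)

# Same helper as in Step 9, repeated so the sketch is self-contained
format_eosstt <- function(x) {
  case_when(
    x %in% c("COMPLETED") ~ "COMPLETED",
    x %in% c("SCREEN FAILURE") ~ NA_character_,
    TRUE ~ "DISCONTINUED"
  )
}

# Illustrative decoded disposition terms, including a missing value
dsdecod <- c("COMPLETED", "ADVERSE EVENT", "SCREEN FAILURE", "DEATH", NA)

eosstt <- format_eosstt(dsdecod)
eosstt
# → "COMPLETED" "DISCONTINUED" NA "DISCONTINUED" "DISCONTINUED"
```

Two things stand out. A missing DSDECOD falls through to the catch-all branch and is reported as DISCONTINUED, so it may be worth handling explicitly if missing terms can occur. Separately, the "ONGOING" value comes from missing_values in `derive_vars_merged()`, which applies only to subjects with no matching disposition record, not from this helper.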
Step 10: Derive Discontinuation Reason
# Derive reason for discontinuation
adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ds,
    # Join on the same keys as the earlier steps
    by_vars = exprs(STUDYID, USUBJID),
    new_vars = exprs(DCSREAS = DSDECOD, DCSREASP = DSTERM),
    filter_add = DSCAT == "DISPOSITION EVENT" &
      !(DSDECOD %in% c("SCREEN FAILURE", "COMPLETED", NA))
  )
cat("\nDiscontinuation reason derived\n")
Discontinuation reason derived
# Check discontinued subjects
adsl %>%
  filter(!is.na(DCSREAS)) %>%
  select(USUBJID, EOSSTT, DCSREAS, DCSREASP) %>%
  head(10)
# A tibble: 10 × 4
USUBJID EOSSTT DCSREAS DCSREASP
<chr> <chr> <chr> <chr>
1 01-701-1023 DISCONTINUED ADVERSE EVENT ADVERSE EVENT
2 01-701-1033 DISCONTINUED STUDY TERMINATED BY SPONSOR SPONSOR DECISION (STUDY…
3 01-701-1047 DISCONTINUED ADVERSE EVENT ADVERSE EVENT
4 01-701-1111 DISCONTINUED ADVERSE EVENT ADVERSE EVENT
5 01-701-1115 DISCONTINUED ADVERSE EVENT ADVERSE EVENT
6 01-701-1146 DISCONTINUED ADVERSE EVENT ADVERSE EVENT
7 01-701-1180 DISCONTINUED ADVERSE EVENT ADVERSE EVENT
8 01-701-1181 DISCONTINUED ADVERSE EVENT ADVERSE EVENT
9 01-701-1188 DISCONTINUED ADVERSE EVENT ADVERSE EVENT
10 01-701-1211 DISCONTINUED DEATH DEATH
Step 11: Derive Safety Population Flag
# Flag subjects who received any study treatment
adsl <- adsl %>%
  derive_var_merged_exist_flag(
    dataset_add = ex,
    by_vars = exprs(STUDYID, USUBJID),
    new_var = SAFFL,
    condition = (EXDOSE > 0 | (EXDOSE == 0 & str_detect(EXTRT, "PLACEBO")))
  )
cat("\nSafety population flag derived\n")
Safety population flag derived
# Population counts
adsl %>%
  count(SAFFL)
# A tibble: 2 × 2
SAFFL n
<chr> <int>
1 Y 254
2 <NA> 52
Safety Population (SAFFL = "Y"):
Received at least one dose of study treatment
Used for all safety analyses (AEs, labs, vitals)
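Conceptually, the flag asks whether a subject has at least one qualifying exposure record. A plain dplyr sketch of that idea on toy data (`adsl_toy` and `ex_toy` are hypothetical, and this is not admiral's implementation):

```r
library(dplyr)

# Toy ADSL: subject 01-003 was never dosed
adsl_toy <- data.frame(
  STUDYID = "S1",
  USUBJID = c("01-001", "01-002", "01-003")
)

# Toy EX: one placebo subject and one active-dose subject
ex_toy <- data.frame(
  STUDYID = "S1",
  USUBJID = c("01-001", "01-002", "01-002"),
  EXTRT = c("PLACEBO", "XANOMELINE", "XANOMELINE"),
  EXDOSE = c(0, 54, 81)
)

# Subjects with at least one active dose or placebo record
dosed_ids <- ex_toy %>%
  filter(EXDOSE > 0 | (EXDOSE == 0 & grepl("PLACEBO", EXTRT))) %>%
  distinct(USUBJID) %>%
  pull(USUBJID)

# "Y" for dosed subjects, missing otherwise
adsl_toy <- adsl_toy %>%
  mutate(SAFFL = if_else(USUBJID %in% dosed_ids, "Y", NA_character_))

adsl_toy
```

The placebo clause matters: without it, every placebo subject would have EXDOSE == 0 on all records and drop out of the safety population.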
Validation Checks
cat("\n=== ADSL Part 1 Validation ===\n\n")
=== ADSL Part 1 Validation ===
# Check 1: TRTSDT ≤ TRTEDT
check1 <- adsl %>%
  filter(!is.na(TRTSDT) & !is.na(TRTEDT) & TRTSDT > TRTEDT)
cat("Check 1 - TRTSDT > TRTEDT (should be 0):", nrow(check1), "\n")
Check 1 - TRTSDT > TRTEDT (should be 0): 0
# Check 2: TRTDURD calculation
check2 <- adsl %>%
  filter(!is.na(TRTDURD)) %>%
  mutate(TRTDURD_check = as.numeric(TRTEDT - TRTSDT) + 1) %>%
  filter(TRTDURD != TRTDURD_check)
cat("Check 2 - TRTDURD mismatch (should be 0):", nrow(check2), "\n")
Check 2 - TRTDURD mismatch (should be 0): 0
# Check 3: Safety flag consistency
check3 <- adsl %>%
  filter(!is.na(TRTSDT) & is.na(SAFFL))
cat("Check 3 - Treated but no SAFFL (should be 0):", nrow(check3), "\n")
Check 3 - Treated but no SAFFL (should be 0): 0
# Check 4: Screen failures
check4 <- adsl %>%
  filter(ARM == "Screen Failure" & !is.na(TRTSDT))
cat("Check 4 - Screen failures with TRTSDT (should be 0):", nrow(check4), "\n")
Check 4 - Screen failures with TRTSDT (should be 0): 0
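# Report success only if every check returned zero records
if (nrow(check1) + nrow(check2) + nrow(check3) + nrow(check4) == 0) {
  cat("\n✓ All validation checks passed\n")
} else {
  cat("\nOne or more validation checks failed\n")
}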
✓ All validation checks passed
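In a production script, printed counts are easy to overlook, so it can help to stop the run when a check finds records. A minimal sketch, assuming a hypothetical helper `assert_no_rows()` (not part of admiral) and a toy ADSL:

```r
# Hypothetical helper: stop with a message if a check returned any rows
assert_no_rows <- function(check_df, label) {
  if (nrow(check_df) > 0) {
    stop(label, ": ", nrow(check_df), " record(s) failed", call. = FALSE)
  }
  invisible(TRUE)
}

# Toy ADSL with consistent treatment dates
adsl_toy <- data.frame(
  USUBJID = c("01-001", "01-002"),
  TRTSDT = as.Date(c("2014-01-02", "2012-08-05")),
  TRTEDT = as.Date(c("2014-07-02", "2012-09-01"))
)

# Same logic as Check 1: start date after end date
bad_dates <- adsl_toy[
  !is.na(adsl_toy$TRTSDT) & !is.na(adsl_toy$TRTEDT) &
    adsl_toy$TRTSDT > adsl_toy$TRTEDT,
]

# Passes silently here; would raise an error if bad_dates had rows
check_passed <- assert_no_rows(bad_dates, "Check 1 - TRTSDT > TRTEDT")
```

Whether to stop or only warn is a project decision; stopping is the safer default when the dataset feeds downstream ADaMs.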
Final ADSL Part 1 Summary
cat("\n=== ADSL Part 1 Summary ===\n\n")
=== ADSL Part 1 Summary ===
cat("Total subjects:", nrow(adsl), "\n")
cat("Variables:", ncol(adsl), "\n\n")
# Treatment summary by arm
adsl %>%
  group_by(TRT01A) %>%
  summarise(
    N = n(),
    N_Treated = sum(!is.na(TRTSDT)),
    Pct_Treated = round(100 * sum(!is.na(TRTSDT)) / n(), 1),
    Mean_Duration = round(mean(TRTDURD, na.rm = TRUE), 1),
    SD_Duration = round(sd(TRTDURD, na.rm = TRUE), 1),
    .groups = "drop"
  )
# A tibble: 4 × 6
TRT01A N N_Treated Pct_Treated Mean_Duration SD_Duration
<chr> <int> <int> <dbl> <dbl> <dbl>
1 Placebo 86 86 100 150. 60.4
2 Screen Failure 52 0 0 NaN NA
3 Xanomeline High Dose 72 72 100 112. 65.5
4 Xanomeline Low Dose 96 96 100 86.8 70.5
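The learning objectives mention exporting ADSL Part 1 as XPT, which the steps above do not show. One possible approach, offered here as an assumption rather than the course's own method, is `write_xpt()` from the haven package; the toy data frame and the temporary file path are illustrative only:

```r
library(haven)

# Toy ADSL standing in for the dataset built above
adsl_toy <- data.frame(
  STUDYID = "S1",
  USUBJID = c("01-001", "01-002"),
  TRTDURD = c(182, 1)
)

# Write a SAS transport file; version 5 is the format commonly used for submissions
xpt_path <- file.path(tempdir(), "adsl.xpt")
write_xpt(adsl_toy, xpt_path, version = 5, name = "ADSL")

# Read it back to confirm the round trip
adsl_back <- read_xpt(xpt_path)
dim(adsl_back)
```

For the real dataset, the same call would take `adsl` and a project-specific output path. Variable labels and lengths are not handled in this sketch.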
Screen Failures and Never-Dosed Subjects
# Subjects with no treatment
no_treatment <- adsl %>%
  filter(is.na(TRTSDT))
cat("\nSubjects with no treatment dates:", nrow(no_treatment), "\n")
Subjects with no treatment dates: 52
# Distribution by ARM
no_treatment %>%
  count(ARM)
# A tibble: 1 × 2
ARM n
<chr> <int>
1 Screen Failure 52
Expected behavior:
Screen failures have ARM = "Screen Failure"
TRTSDT, TRTEDT, TRTDURD all missing
SAFFL = NA
Will be excluded from efficacy/safety analyses
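In this dataset every subject without treatment dates is a screen failure, but other studies can also contain randomized subjects who were never dosed. A sketch of how those could be separated, using a toy ADSL with made-up subjects:

```r
library(dplyr)

# Toy ADSL: 01-002 is a screen failure, 01-003 was randomized but never dosed
adsl_toy <- data.frame(
  USUBJID = c("01-001", "01-002", "01-003", "01-004"),
  ARM = c("Placebo", "Screen Failure", "Xanomeline Low Dose", "Placebo"),
  TRTSDT = as.Date(c("2014-01-02", NA, NA, "2014-03-18"))
)

# Randomized to an arm but with no first dose date
never_dosed <- adsl_toy %>%
  filter(is.na(TRTSDT) & ARM != "Screen Failure")

never_dosed
```

Such subjects would typically stay in ADSL with missing treatment dates and a missing SAFFL, the same handling the screen failures receive above.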
Key Takeaways
Technical:
ADSL starts with DM (one record per subject)
Date workflow: derive_vars_dtm() → derive_vars_dtm_to_dt()
derive_vars_merged() with mode = "first" or "last"
Filter carefully: keep active doses (EXDOSE > 0) and placebo records (EXDOSE == 0 with EXTRT containing "PLACEBO")
TRTDURD formula: TRTEDT - TRTSDT + 1 (CDISC convention)
Strategic:
ADSL is the foundation - all other ADaMs merge from it
Validation is critical - check logical consistency
Missing treatment dates for screen failures is correct
Population flags (SAFFL) control analysis populations
Next Steps
Day 17: Complete ADSL with:
Population flags (ITTFL, PPSFL)
Demographics groupings (AGEGR1, RACEN, SEXN)
Baseline measurements (HEIGHTBL, WEIGHTBL, BMIBL)
Resources
Admiral Documentation:
CDISC Standards:
Pharmaverse:
End of Day 16
Tomorrow: ADSL Part 2 - Population Flags & Demographics