# Load packages
library(admiral)
library(dplyr)
library(pharmaversesdtm) # For example SDTM data
library(tibble)
# Check Admiral version
packageVersion("admiral")
[1] '1.4.0'
Week 3, Day 15: Understanding ADaM Structures, Admiral Philosophy, and Core Derivation Patterns
Week 3 Overview: After mastering SDTM creation and validation in Weeks 1-2, we now transition to ADaM (Analysis Data Model) - the analysis-ready datasets that feed directly into regulatory Tables, Listings, and Figures (TLFs).
This week is intensive - we’ll use the {admiral} package to build every major ADaM dataset type. Admiral was co-developed by multiple pharmaceutical companies as a tidyverse-based, modular, non-black-box toolbox for ADaM creation, and is now maintained by 50+ pharma company programmers across the pharmaverse community.
Day 15 is foundational - today we learn the architecture before building datasets tomorrow.
By the end of Day 15, you will be able to:
- Distinguish the four Admiral function families: derive_vars_*, derive_var_*, derive_param_*, and derive_extreme_*
Clinical Context: Today is conceptual but critical. Understanding ADaM architecture prevents hours of debugging later, and knowing which Admiral function fits which pattern makes Days 16-30 far more efficient.
Clinical Trial → EDC Data → SDTM (Tabulation) → ADaM (Analysis) → TLFs → Regulatory Submission
Admiral creates the ADaM (Analysis) step of this pipeline.
ADaM (Analysis Data Model) is the CDISC standard for analysis-ready datasets used in regulatory submissions to FDA, EMA, and other health authorities.
SDTM limitations for analysis:
- One record per observation (not per analysis unit)
- No derived variables (change from baseline, treatment-emergent flags)
- No baseline or analysis visit structure
- Not optimized for statistical procedures
ADaM solves this by:
- Structuring data for analysis (one row per subject/parameter/timepoint)
- Adding derived variables (flags, categories, calculated values)
- Ensuring traceability back to SDTM
- Enabling direct input to SAS PROC or R statistical functions
Clinical Example:
In SDTM (LB - Laboratory):
| USUBJID | LBTESTCD | LBSTRESN | LBDTC | VISIT |
|---|---|---|---|---|
| 001 | ALT | 45 | 2024-01-15 | Baseline |
| 001 | ALT | 52 | 2024-02-15 | Week 4 |
| 001 | ALT | 48 | 2024-03-15 | Week 8 |
In ADaM (ADLB - BDS structure):
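| USUBJID | PARAMCD | AVISIT | AVAL | ABLFL | BASE | CHG | ANRIND |
|---|---|---|---|---|---|---|---|
| 001 | ALT | Baseline | 45 | Y | 45 | NA | NORMAL |
| 001 | ALT | Week 4 | 52 |  | 45 | 7 | NORMAL |
| 001 | ALT | Week 8 | 48 |  | 45 | 3 | NORMAL |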
Analysis benefit:
- Baseline flag (ABLFL) clearly identifies reference measurement
- Change from baseline (CHG) pre-calculated
- Reference range indicator (ANRIND) ready for shift tables
- Structure enables lm(CHG ~ AVISIT + BASE, data = adlb_filtered)
CDISC defines three primary ADaM structures. Understanding which structure to use is the first decision in any ADaM creation.
ADSL (Subject-Level Analysis Dataset)
Structure: One record per subject
Purpose:
- Demographics and baseline characteristics
- Treatment assignment (planned and actual)
- Population flags (Safety, ITT, Per-Protocol)
- Study dates and disposition
- Foundation for all other ADaM datasets
Key Variables:
- Identifiers: USUBJID, SUBJID, SITEID
- Demographics: AGE, SEX, RACE, ETHNIC
- Treatment: TRT01P, TRT01A, TRTSDT, TRTEDT
- Flags: SAFFL, ITTFL, COMPLFL
- Disposition: EOSSTT, DCSREAS
SDTM Sources:
- Primary: DM (Demographics)
- Secondary: EX (Exposure for treatment dates), DS (Disposition), VS/LB (baseline values)
Clinical Use:
- Demographic tables (Table 14.1.1 in submissions)
- Population definitions for all analyses
- Subgroup definitions (age groups, sex, baseline severity)
Example Record:
USUBJID: "01-701-1015"
AGE: 75, SEX: "F", RACE: "WHITE"
TRT01P: "Xanomeline High Dose", TRT01A: "Xanomeline High Dose"
TRTSDT: 2024-01-20, TRTEDT: 2024-07-15, TRTDURD: 178
SAFFL: "Y", ITTFL: "Y", COMPLFL: "Y"
EOSSTT: "COMPLETED"
BDS (Basic Data Structure)
Structure: One record per subject per parameter per analysis timepoint
Purpose:
- Repeated measures over time (labs, vital signs, efficacy scales)
- Baseline identification and change from baseline
- Reference range and shift analysis
Key Variables:
- PARAMCD/PARAM: Parameter identifier (e.g., “ALT”, “DIABP”)
- AVAL/AVALC: Analysis value (numeric/character)
- AVISIT/AVISITN: Analysis visit
- ABLFL: Baseline flag (Y for the baseline record, otherwise missing)
- BASE: Baseline value (carried forward to all post-baseline records)
- CHG: Change from baseline (AVAL - BASE)
- ANRIND: Analysis reference range indicator (LOW/NORMAL/HIGH)
SDTM Sources:
- LB → ADLB (Lab results)
- VS → ADVS (Vital signs)
- QS → ADQS (Questionnaires)
- EG → ADEG (ECG results)
BDS Formula:
Number of Records = N subjects × N parameters × N timepoints (for complete data; missed or unscheduled visits change the actual count)
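As a quick check of the formula, the sketch below enumerates a complete subject × parameter × visit grid in base R. The subject IDs and visit names are made up for illustration, and real studies rarely form a complete grid, so treat the product as a planning estimate rather than an exact count.

```r
# Hypothetical study: 3 subjects, 2 lab parameters, 3 analysis visits
subjects   <- c("001", "002", "003")
parameters <- c("ALT", "ALKPH")
visits     <- c("Baseline", "Week 4", "Week 8")

# expand.grid() builds every subject/parameter/visit combination
bds_skeleton <- expand.grid(
  USUBJID = subjects,
  PARAMCD = parameters,
  AVISIT  = visits,
  stringsAsFactors = FALSE
)

# Record count equals the product of the three dimensions (3 * 2 * 3)
n_records <- nrow(bds_skeleton)
n_records
```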
Example: ADLB
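| USUBJID | PARAMCD | AVISIT | AVAL | ABLFL | BASE | CHG | PCHG | ANRIND |
|---|---|---|---|---|---|---|---|---|
| 001 | ALT | Baseline | 45 | Y | 45 | NA | NA | NORMAL |
| 001 | ALT | Week 4 | 52 |  | 45 | 7 | 15.6 | NORMAL |
| 001 | ALT | Week 8 | 48 |  | 45 | 3 | 6.7 | NORMAL |
| 001 | ALKPH | Baseline | 78 | Y | 78 | NA | NA | NORMAL |
| 001 | ALKPH | Week 4 | 82 |  | 78 | 4 | 5.1 | NORMAL |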
Clinical Use:
- Change from baseline analysis
- Shift tables (baseline NORMAL → Week 4 HIGH)
- Time-course plots
- Repeated measures ANCOVA
OCCDS (Occurrence Data Structure)
Structure: One record per subject per event occurrence
Purpose:
- Adverse events
- Medications
- Medical history events
- Any event that can occur 0, 1, or multiple times per subject
Key Variables:
- ASTDT/AENDT: Analysis start/end dates
- ADURN: Analysis duration
- TRTEMFL: Treatment-emergent flag
- AOCCFL: First occurrence flag
- ANL01FL: Primary analysis flag
- Event-specific variables (e.g., severity, causality for AE)
SDTM Sources:
- AE → ADAE (Adverse events)
- CM → ADCM (Concomitant medications)
- MH → ADMH (Medical history)
OCCDS Formula:
Number of Records = Variable (0 to N events per subject)
Example: ADAE
| USUBJID | AESEQ | AETERM | ASTDT | AENDT | TRTEMFL | AOCCFL | ASEV |
|---|---|---|---|---|---|---|---|
| 001 | 1 | Headache | 2024-01-22 | 2024-01-23 | Y | Y | MILD |
| 001 | 2 | Nausea | 2024-02-10 | 2024-02-12 | Y |  | MODERATE |
| 001 | 3 | Dizziness | 2024-03-05 | 2024-03-05 | Y |  | MILD |
| 002 | 1 | Headache | 2024-01-18 | 2024-01-19 |  |  | MILD |
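To make the "0 to N events per subject" point concrete, here is a small dplyr sketch that counts events per subject from the ADAE example and joins the counts back to a subject list. Subject 003 is a hypothetical addition with no events, included only to show that zero-event subjects are absent from an OCCDS dataset and must be recovered from ADSL.

```r
library(dplyr)
library(tibble)

# Subject list (one row per subject); subject 003 is hypothetical and has no AEs
adsl <- tibble(USUBJID = c("001", "002", "003"))

# Event records taken from the ADAE example
adae <- tribble(
  ~USUBJID, ~AESEQ, ~AETERM,
  "001", 1, "Headache",
  "001", 2, "Nausea",
  "001", 3, "Dizziness",
  "002", 1, "Headache"
)

# Count events per subject, then join to ADSL so zero-event subjects appear
ae_counts <- adsl %>%
  left_join(count(adae, USUBJID, name = "N_AE"), by = "USUBJID") %>%
  mutate(N_AE = coalesce(N_AE, 0L))

ae_counts
```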
Clinical Use:
- Adverse event frequency tables
- Time-to-first-event analysis
- Treatment-emergent vs pre-existing events
- Causality and severity analysis
Pre-Admiral Era (2010-2020):
- Every pharma company had proprietary SAS macro libraries
- Black-box functions (couldn’t see or modify code)
- Duplicated effort (50 companies solving the same problems)
- Difficult to validate (macros hidden in compiled code)
- Knowledge siloed within companies
Admiral Vision (2020-present):
- Open-source: Code is fully transparent and inspectable
- Cross-company collaboration: Multiple pharmaceutical companies co-develop
- Tidyverse-native: Leverages dplyr, tidyr, lubridate - familiar to R users
- Modular: Small, composable functions instead of monolithic macros
- Non-black-box: Every derivation is explicit in your script
- Validated: Extensive test coverage and regulatory use
Bad (monolithic):
Good (modular Admiral):
Every step is visible and modifiable.
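A minimal sketch of what the modular style can look like, assuming toy `dm` and `ex` data frames with made-up values (not pharmaversesdtm data) and exposure dates that are already Date variables. Each derivation is one explicit pipeline step, so any step can be changed or removed without touching the others.

```r
library(admiral)
library(dplyr)
library(tibble)

# Toy inputs with hypothetical values
dm <- tibble(STUDYID = "XYZ", USUBJID = c("001", "002"))
ex <- tribble(
  ~STUDYID, ~USUBJID, ~EXSTDT,               ~EXENDT,
  "XYZ",    "001",    as.Date("2024-01-20"), as.Date("2024-03-31"),
  "XYZ",    "001",    as.Date("2024-04-01"), as.Date("2024-07-15"),
  "XYZ",    "002",    as.Date("2024-02-01"), as.Date("2024-02-14")
)

adsl <- dm %>%
  # Step 1: first exposure start date becomes TRTSDT
  derive_vars_merged(
    dataset_add = ex,
    by_vars = exprs(STUDYID, USUBJID),
    order = exprs(EXSTDT),
    mode = "first",
    new_vars = exprs(TRTSDT = EXSTDT)
  ) %>%
  # Step 2: last exposure end date becomes TRTEDT
  derive_vars_merged(
    dataset_add = ex,
    by_vars = exprs(STUDYID, USUBJID),
    order = exprs(EXENDT),
    mode = "last",
    new_vars = exprs(TRTEDT = EXENDT)
  ) %>%
  # Step 3: TRTDURD = TRTEDT - TRTSDT + 1
  derive_var_trtdurd()

adsl
```

Subject 001 reproduces the ADSL example record shown earlier (2024-01-20 to 2024-07-15, 178 days).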
Admiral functions work seamlessly with %>% pipes and dplyr verbs:
Every Admiral function:
- Returns a modified data frame (never a hidden state)
- Has explicit input and output
- Can be inspected mid-pipeline with glimpse() or View()
Admiral functions are parameterized - same function, different arguments:
Admiral encourages inline documentation:
| Aspect | SAS Macros (Legacy) | Admiral (Modern) |
|---|---|---|
| Source code | Hidden in compiled .sas7bcat | Fully open on GitHub |
| Collaboration | Company-specific | Cross-company |
| Debugging | Difficult (macro expansion errors) | Easy (standard R debugging) |
| Testing | Manual | Automated with testthat |
| Community | Internal only | Public GitHub repository, Slack |
| Learning curve | Steep (macro language) | Moderate (tidyverse knowledge) |
| Validation | Company validates internally | Shared validation evidence |
Admiral has 170+ functions, but they fall into 4 main categories. Learning these categories is like learning grammar - it lets you construct any derivation.
derive_vars_* - Add Multiple Variables
Pattern: derive_vars_*(dataset, dataset_add, by_vars, new_vars, ...)
Purpose: Add multiple variables to a dataset by merging from another dataset
Key Functions:
derive_vars_merged(): Merge variables from another dataset (most versatile function in Admiral)
Use cases:
- Treatment dates from EX
- Disposition reasons from DS
- Baseline values from LB/VS
derive_vars_dt() and derive_vars_dtm(): Convert character dates to Date or POSIXct
Use cases:
- All date conversions in ADSL, ADAE, ADLB, etc.
- Handles partial dates with imputation
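The sketch below shows partial-date handling with derive_vars_dt(), assuming a toy AE dataset with made-up start dates. With `highest_imputation = "M"`, missing day and month components are imputed, and the imputation flag variable (here ASTDTF) records which component was imputed.

```r
library(admiral)
library(tibble)

# Toy AE records with complete and partial ISO 8601 start dates (hypothetical)
ae <- tribble(
  ~USUBJID, ~AESTDTC,
  "001",    "2024-01-22", # complete date
  "001",    "2024-03",    # day missing
  "002",    "2024"        # month and day missing
)

adae <- derive_vars_dt(
  ae,
  new_vars_prefix = "AST",   # creates ASTDT (and ASTDTF when imputing)
  dtc = AESTDTC,
  highest_imputation = "M",  # allow imputation up to the month
  date_imputation = "first"  # impute to the first day / first month
)

adae
```

Which imputation rule is appropriate (first, last, or mid) is a study-level decision, so the settings here are illustrative only.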
derive_vars_merged_lookup(): Add variables from a lookup table (e.g., parameter metadata)
Use cases:
- Parameter labels in BDS datasets
- Decode values from reference tables
derive_var_* - Add One Variable
Pattern: derive_var_*(dataset, ...)
Purpose: Add a single derived variable to a dataset
Key Functions:
derive_var_trtdurd(): Treatment duration (days)
derive_var_merged_exist_flag(): Flag if a condition exists in another dataset
Use cases:
- Population flags (SAFFL, ITTFL)
- Event flags (had SAE, had dose interruption)
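A hedged sketch of a population flag with derive_var_merged_exist_flag(). The rule used here ("at least one exposure record with a positive dose") is a simplified, hypothetical safety-population definition, and the data are made up; a real study takes the definition from its analysis plan.

```r
library(admiral)
library(tibble)

# Toy inputs (hypothetical): subject 003 has no exposure records at all
dm <- tibble(STUDYID = "XYZ", USUBJID = c("001", "002", "003"))
ex <- tribble(
  ~STUDYID, ~USUBJID, ~EXDOSE,
  "XYZ",    "001",    54,
  "XYZ",    "001",    81,
  "XYZ",    "002",    0  # only a zero-dose record
)

adsl <- derive_var_merged_exist_flag(
  dm,
  dataset_add = ex,
  by_vars = exprs(STUDYID, USUBJID),
  new_var = SAFFL,
  condition = EXDOSE > 0, # simplified, assumed definition
  false_value = "N",      # records exist, but none meets the condition
  missing_value = "N"     # no records in ex for this subject
)

adsl
```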
derive_var_extreme_flag(): Flag first/last/min/max record within a group
Use cases:
- Baseline flags (ABLFL)
- First occurrence flags (AOCCFL)
derive_param_* - Add Computed Parameters
Pattern: derive_param_*(dataset, ...)
Purpose: Add new rows with computed parameters (BDS-specific)
Key Functions:
derive_param_computed(): Create a parameter computed from other parameters
Use cases:
- Derived parameters (MAP from BP, BMI from height/weight)
- Ratios (LDL/HDL)
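As an illustration of the "new rows, not new columns" idea, the sketch below derives mean arterial pressure from systolic and diastolic records with derive_param_computed(). The vital-sign values are made up, and the formula (SYSBP + 2 × DIABP) / 3 is the common approximation, stated here as an assumption about the study's definition.

```r
library(admiral)
library(tibble)

# Toy ADVS with two source parameters at two visits (hypothetical values)
advs <- tribble(
  ~USUBJID, ~PARAMCD, ~AVISIT,    ~AVAL,
  "001",    "SYSBP",  "Baseline", 120,
  "001",    "DIABP",  "Baseline", 80,
  "001",    "SYSBP",  "Week 4",   132,
  "001",    "DIABP",  "Week 4",   84
)

# One new MAP record is added per subject and visit where both sources exist
advs_map <- derive_param_computed(
  advs,
  by_vars = exprs(USUBJID, AVISIT),
  parameters = c("SYSBP", "DIABP"),
  set_values_to = exprs(
    AVAL = (AVAL.SYSBP + 2 * AVAL.DIABP) / 3,
    PARAMCD = "MAP"
  )
)

advs_map
```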
derive_param_exist_flag(): Create a parameter indicating if an event occurred
derive_extreme_records() and Utilities
Pattern: derive_extreme_records(dataset, dataset_add, ...)
Purpose: Add new records selected from existing ones (e.g., the first, last, minimum, or maximum observation per group)
Key Functions:
derive_extreme_records(): Create records by selecting extremes (first, last, min, max)
Use cases:
- Last-observation or worst-observation records (e.g., an endpoint or worst post-baseline record)
- Summary records such as the average of multiple readings are handled by the separate derive_summary_records()
call_derivation(): Apply the same derivation with different parameters
restrict_derivation(): Apply a derivation only to a subset of records
slice_derivation(): Apply different derivations to different slices
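A short sketch of one higher-order function, call_derivation(), assuming a toy AE dataset with made-up complete dates. The same derivation (derive_vars_dt()) is called twice: the arguments that differ go into `params()`, and the argument shared by both calls is passed once.

```r
library(admiral)
library(tibble)

# Toy AE records with complete start and end dates (hypothetical)
ae <- tribble(
  ~USUBJID, ~AESTDTC,     ~AEENDTC,
  "001",    "2024-01-22", "2024-01-23",
  "001",    "2024-02-10", "2024-02-12"
)

adae <- call_derivation(
  dataset = ae,
  derivation = derive_vars_dt,
  # Arguments that differ between the two calls
  variable_params = list(
    params(dtc = AESTDTC, new_vars_prefix = "AST"),
    params(dtc = AEENDTC, new_vars_prefix = "AEN")
  ),
  # Argument shared by both calls: no imputation of partial dates
  highest_imputation = "n"
)

adae
```

restrict_derivation() and slice_derivation() follow the same idea of passing a derivation function plus its arguments; they are not shown here.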
This table is your Rosetta Stone for ADaM creation. Memorize this mapping.
| ADaM Dataset | SDTM Source(s) | Structure | Key Admiral Functions | Primary Use |
|---|---|---|---|---|
| ADSL | DM, EX, DS, AE, VS, LB | Subject-level (1 row/subject) | derive_vars_merged(), derive_var_merged_exist_flag(), derive_var_trtdurd() | Demographics, population flags, treatment dates |
| ADAE | AE, ADSL | OCCDS (1 row/event) | derive_vars_merged() (ADSL), derive_var_trtemfl() (TRTEMFL), derive_var_extreme_flag() (AOCCFL), derive_vars_dt() | Safety tables, AE listings, time-to-first-AE |
| ADLB | LB, ADSL | BDS (1 row/subject/param/visit) | derive_var_extreme_flag() (ABLFL), derive_var_base(), derive_var_chg(), derive_var_pchg() | Lab result tables, shift tables, change analysis |
| ADVS | VS, ADSL | BDS | derive_var_extreme_flag() (ABLFL), derive_var_base(), derive_var_chg(), derive_vars_joined() (visit mapping) | Vital signs tables, plots |
| ADTTE | DM, DS, AE, ADSL | BDS-like (1 row/subject/endpoint) | event_source(), censor_source(), derive_param_tte() | Kaplan-Meier, Cox regression, time-to-event |
| ADCM | CM, ADSL | OCCDS | derive_vars_merged() (ADSL), derive_vars_dt(), period flags (CMPREFL, CMCONFL) | Concomitant meds tables |
| ADEX | EX, ADSL | BDS (1 row/subject/param/dose) | derive_vars_merged() (ADSL), derive_param_exposure() (cumulative dose, intensity) | Dose intensity, compliance |
| ADRS | RS, TR, TU, ADSL | BDS (oncology-specific) | admiralonco package: derive_param_response(), derive_param_confirmed_resp() | Tumor response, RECIST criteria |
Let’s load Admiral and explore its structure.
[1] '1.4.0'
derive_vars_* functions (add multiple variables):
[1] "derive_vars_aage" "derive_vars_atc" "derive_vars_cat"
[4] "derive_vars_computed" "derive_vars_crit_flag" "derive_vars_dt"
[7] "derive_vars_dtm" "derive_vars_dtm_to_dt" "derive_vars_dtm_to_tm"
[10] "derive_vars_duration"
derive_var_* functions (add one variable):
[1] "derive_var_age_years" "derive_var_analysis_ratio"
[3] "derive_var_anrind" "derive_var_atoxgr"
[5] "derive_var_atoxgr_dir" "derive_var_base"
[7] "derive_var_chg" "derive_var_dthcaus"
[9] "derive_var_extreme_dt" "derive_var_extreme_dtm"
derive_param_* functions (add computed parameters):
[1] "derive_param_bmi" "derive_param_bsa"
[3] "derive_param_computed" "derive_param_doseint"
[5] "derive_param_exist_flag" "derive_param_exposure"
[7] "derive_param_extreme_record" "derive_param_framingham"
[9] "derive_param_map" "derive_param_qtc"
derive_extreme_* functions:
[1] "derive_extreme_event" "derive_extreme_records"
Higher-order functions (HOFs):
[1] "call_derivation" "restrict_derivation" "slice_derivation"
derive_vars_merged() arguments:
dataset, dataset_add, by_vars, order, new_vars, filter_add, mode, exist_flag, true_value, false_value, missing_values, check_type, duplicate_msg, relationship
derive_var_extreme_flag() arguments:
dataset, by_vars, order, new_var, mode, true_value, false_value, flag_all, check_type
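Listings like the ones above can be produced by filtering a package's exported names by prefix and reading a function's formal arguments. A minimal sketch using only base R follows; the helper names `list_exports()` and `arg_names()` are hypothetical, and the exact output depends on the installed Admiral version.

```r
# List a package's exported names that match a pattern (hypothetical helper)
list_exports <- function(pkg, pattern) {
  sort(grep(pattern, getNamespaceExports(pkg), value = TRUE))
}

# Argument names of an exported function (hypothetical helper)
arg_names <- function(pkg, fun) {
  names(formals(getExportedValue(pkg, fun)))
}

# Only query Admiral if it is installed
if (requireNamespace("admiral", quietly = TRUE)) {
  print(head(list_exports("admiral", "^derive_vars_"), 10))
  print(list_exports("admiral", "^derive_extreme_"))
  print(arg_names("admiral", "derive_var_extreme_flag"))
}
```

The same helpers work for any installed package, for example `arg_names("stats", "rnorm")`.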
Scenario: “I need to flag the first adverse event per subject.”
Thought process:
1. First occurrence → need a flag → derive_var_*
2. First/extreme → derive_var_extreme_flag()
3. Check help: ?derive_var_extreme_flag
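Following that thought process, a sketch of the resulting call might look like this. The data reuse the ADAE example values, and the choice to order by start date and then AESEQ is an assumption. Note that this flags the first AE per subject regardless of treatment emergence; limiting the flag to treatment-emergent events would need an extra filter, for example through restrict_derivation().

```r
library(admiral)
library(tibble)

# Toy ADAE, already sorted by subject and start date
adae <- tribble(
  ~USUBJID, ~AESEQ, ~AETERM,     ~ASTDT,
  "001",    1,      "Headache",  as.Date("2024-01-22"),
  "001",    2,      "Nausea",    as.Date("2024-02-10"),
  "001",    3,      "Dizziness", as.Date("2024-03-05"),
  "002",    1,      "Headache",  as.Date("2024-01-18")
)

adae_flagged <- derive_var_extreme_flag(
  adae,
  by_vars = exprs(USUBJID),      # one flag per subject
  order = exprs(ASTDT, AESEQ),   # earliest start date first, AESEQ breaks ties
  new_var = AOCCFL,
  mode = "first"                 # flag the first record in each group
)

adae_flagged
```

Unflagged records get a missing AOCCFL by default rather than "N".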
Your deliverable today is a mapping matrix documenting the architecture of your future ADaM package.
# Create mapping matrix (one row per ADaM dataset, seven cells per row)
adam_mapping <- tribble(
  ~ADAM_Dataset, ~SDTM_Sources, ~Structure_Type, ~Admiral_Function_1, ~Admiral_Function_2, ~Admiral_Function_3, ~Clinical_Purpose,
  "ADSL",
  "DM (primary), EX, DS, AE, VS, LB",
  "Subject-level",
  "derive_vars_merged()",
  "derive_var_merged_exist_flag()",
  "derive_var_trtdurd()",
  "Demographics tables, population flags, treatment assignments",
  "ADAE",
  "AE (primary), ADSL",
  "OCCDS",
  "derive_vars_merged() [ADSL]",
  "derive_var_trtemfl() [TRTEMFL]",
  "derive_vars_dt() [dates]",
  "Adverse event frequency tables, AE listings, safety summaries",
  "ADLB",
  "LB (primary), ADSL",
  "BDS",
  "derive_var_extreme_flag() [ABLFL]",
  "derive_var_base()",
  "derive_var_chg()",
  "Lab result tables, shift tables, change from baseline analysis",
  "ADVS",
  "VS (primary), ADSL",
  "BDS",
  "derive_var_extreme_flag() [ABLFL]",
  "derive_var_base()",
  "derive_vars_joined() [visit mapping]",
  "Vital signs tables, BP plots, vital sign changes",
  "ADTTE",
  "DS, AE, ADSL",
  "BDS-like (time-to-event)",
  "event_source()",
  "censor_source()",
  "derive_param_tte()",
  "Kaplan-Meier curves, Cox regression, survival analysis",
  "ADCM",
  "CM (primary), ADSL",
  "OCCDS",
  "derive_vars_merged() [ADSL]",
  "derive_vars_dt() [dates]",
  "Period flags (CMPREFL, CMCONFL)",
  "Concomitant medications tables, prior/concomitant distinction",
  "ADEX",
  "EX (primary), ADSL",
  "BDS (exposure metrics)",
  "derive_vars_merged() [ADSL]",
  "derive_param_exposure() [cumulative dose]",
  "derive_vars_duration()",
  "Dose intensity tables, treatment compliance, exposure summaries",
  "ADRS",
  "RS, TR, TU, ADSL",
  "BDS (oncology)",
  "admiralonco::derive_param_response()",
  "admiralonco::derive_param_confirmed_resp()",
  "admiralonco::derive_param_clinbenefit()",
  "Tumor response tables, RECIST assessments, best overall response"
)
# Display mapping
adam_mapping %>%
  select(ADAM_Dataset, SDTM_Sources, Structure_Type, Clinical_Purpose) %>%
  knitr::kable(caption = "ADaM Dataset Architecture Overview")

| ADAM_Dataset | SDTM_Sources | Structure_Type | Clinical_Purpose |
|---|---|---|---|
| ADSL | DM (primary), EX, DS, AE, VS, LB | Subject-level | Demographics tables, population flags, treatment assignments |
| ADAE | AE (primary), ADSL | OCCDS | Adverse event frequency tables, AE listings, safety summaries |
| ADLB | LB (primary), ADSL | BDS | Lab result tables, shift tables, change from baseline analysis |
| ADVS | VS (primary), ADSL | BDS | Vital signs tables, BP plots, vital sign changes |
| ADTTE | DS, AE, ADSL | BDS-like (time-to-event) | Kaplan-Meier curves, Cox regression, survival analysis |
| ADCM | CM (primary), ADSL | OCCDS | Concomitant medications tables, prior/concomitant distinction |
| ADEX | EX (primary), ADSL | BDS (exposure metrics) | Dose intensity tables, treatment compliance, exposure summaries |
| ADRS | RS, TR, TU, ADSL | BDS (oncology) | Tumor response tables, RECIST assessments, best overall response |
| ADAM_Dataset | Admiral_Function_1 | Admiral_Function_2 | Admiral_Function_3 |
|---|---|---|---|
| ADSL | derive_vars_merged() | derive_var_merged_exist_flag() | derive_var_trtdurd() |
| ADAE | derive_vars_merged() [ADSL] | derive_var_trtemfl() [TRTEMFL] | derive_vars_dt() [dates] |
| ADLB | derive_var_extreme_flag() [ABLFL] | derive_var_base() | derive_var_chg() |
| ADVS | derive_var_extreme_flag() [ABLFL] | derive_var_base() | derive_vars_joined() [visit mapping] |
| ADTTE | event_source() | censor_source() | derive_param_tte() |
| ADCM | derive_vars_merged() [ADSL] | derive_vars_dt() [dates] | Period flags (CMPREFL, CMCONFL) |
| ADEX | derive_vars_merged() [ADSL] | derive_param_exposure() [cumulative dose] | derive_vars_duration() |
| ADRS | admiralonco::derive_param_response() | admiralonco::derive_param_confirmed_resp() | admiralonco::derive_param_clinbenefit() |
✓ Mapping matrix saved to adam_dataset_mapping.csv
This matrix will guide your ADaM builds in Days 16-30!
Understanding Admiral’s input-output contract prevents common errors.
Each function:
- Takes a data frame as first argument
- Returns a modified data frame
- Can be inspected mid-pipeline
Admiral functions only add new variables or modify explicitly named variables. They never silently change existing variables.
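One way to check this contract yourself is to compare a dataset before and after a derivation. The sketch below does that for derive_var_trtdurd() on a toy ADSL with made-up dates; it is a spot check for one function, not proof for the whole package.

```r
library(admiral)
library(tibble)

# Toy ADSL with treatment start and end dates (hypothetical values)
adsl_in <- tibble(
  USUBJID = c("001", "002"),
  TRTSDT  = as.Date(c("2024-01-20", "2024-02-01")),
  TRTEDT  = as.Date(c("2024-07-15", "2024-02-14"))
)

adsl_out <- derive_var_trtdurd(adsl_in)

# Which columns were added?
added <- setdiff(names(adsl_out), names(adsl_in))

# Are the original columns unchanged?
unchanged <- isTRUE(all.equal(
  as.data.frame(adsl_out[names(adsl_in)]),
  as.data.frame(adsl_in)
))

added
unchanged
```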
URL: https://pharmaverse.github.io/admiral/
Key sections:
- Get Started: Quick introduction
- Vignettes: Topic-based guides (ADSL, ADAE, ADLB, dates, etc.)
- Reference: Alphabetical function list with examples
- Articles: Advanced topics (Higher-Order Functions, metadata, programming strategy)
URL: https://github.com/pharmaverse/admiral
Memorize these patterns - they appear in most ADaM derivations.
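derive_vars_merged() - Use for: Treatment dates, disposition reasons, any variable from another SDTM domain
derive_var_merged_exist_flag() - Use for: Population flags (SAFFL, ITTFL), event occurrence flags
derive_var_extreme_flag() - Use for: Baseline flags, first occurrence flags, min/max flags
derive_var_base() - Use for: Carrying baseline forward in BDS datasets
derive_var_chg() and derive_var_pchg() - Use for: CHG, PCHG in all BDS datasets
derive_vars_dt() and derive_vars_dtm() - Use for: All date conversions in all ADaM datasets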
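The baseline and change patterns chain naturally. A minimal sketch, assuming a toy ADLB where ABLFL has already been derived and using the ALT values from the earlier example:

```r
library(admiral)
library(dplyr)
library(tibble)

# Toy ADLB with the baseline flag already in place
adlb <- tribble(
  ~USUBJID, ~PARAMCD, ~AVISIT,    ~AVAL, ~ABLFL,
  "001",    "ALT",    "Baseline", 45,    "Y",
  "001",    "ALT",    "Week 4",   52,    NA_character_,
  "001",    "ALT",    "Week 8",   48,    NA_character_
)

adlb_chg <- adlb %>%
  # BASE: copy AVAL from the ABLFL == "Y" record to every record in the group
  derive_var_base(by_vars = exprs(USUBJID, PARAMCD)) %>%
  # CHG = AVAL - BASE
  derive_var_chg() %>%
  # PCHG = 100 * (AVAL - BASE) / |BASE|
  derive_var_pchg()

adlb_chg
```

Called this way, CHG and PCHG are computed on every record, so the baseline row gets 0 rather than a missing value. If the analysis needs them missing at baseline, the derivations can be limited to post-baseline records with restrict_derivation().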
Standard sequence for any ADaM dataset:
1. Start with primary SDTM domain
↓
2. Merge ADSL (for demographics, treatment, flags)
↓
3. Derive dates (convert DTC → DT/DTM)
↓
4. Derive analysis values (AVAL, AVALC)
↓
5. Derive baseline flags (ABLFL for BDS)
↓
6. Derive baseline values (BASE for BDS)
↓
7. Derive change from baseline (CHG, PCHG for BDS)
↓
8. Derive analysis flags (TRTEMFL, ANL01FL, etc.)
↓
9. Derive computed parameters (for BDS)
↓
10. Add metadata (variable labels, formats)
↓
11. Export to XPT
Admiral makes each step explicit in your code.
- Pipes: Admiral functions chain with %>%
- Function families: derive_vars_* (multi-var), derive_var_* (single-var), derive_param_* (computed parameters), derive_extreme_* (record computations)
- Community: the #admiral channel on Slack
Prepare for Day 16 (hands-on ADSL) by:
- Exploring the pharmaversesdtm datasets: Load DM, EX, DS and examine their structure
Tomorrow we build ADSL from scratch!
End of Day 15
Today you mastered ADaM architecture and Admiral philosophy.
Tomorrow you’ll put it into practice building your first ADaM dataset: ADSL!