# Install packages if not already installed
if (!requireNamespace("dplyr", quietly = TRUE)) suppressMessages(install.packages("dplyr"))
if (!requireNamespace("tidyr", quietly = TRUE)) suppressMessages(install.packages("tidyr"))
if (!requireNamespace("pharmaversesdtm", quietly = TRUE)) suppressMessages(install.packages("pharmaversesdtm"))
Day 9: VS (Vital Signs) & Repeated Measures
Visit-Level Data and Positional Readings
1 Learning Objectives
By the end of Day 9, you will be able to:
- Understand the VS (Vital Signs) domain structure and its role as a Findings class domain
- Handle repeated measures - multiple readings per subject per visit per timepoint
- Work with positional variables like VSPOS (SITTING/STANDING) and VSLOC (ARM)
- Derive summary statistics (mean, min, max) across multiple readings
- Understand how VS data maps to the ADaM BDS (Basic Data Structure) for ADVS
2 Introduction to the VS Domain
2.1 What is the VS Domain?
The VS (Vital Signs) domain captures measurements of a subject’s vital functions during a clinical trial. Like the LB domain we covered on Day 8, VS belongs to the Findings class of SDTM domains.
Common vital signs measured include:
- Blood Pressure (Systolic and Diastolic)
- Heart Rate (Pulse)
- Temperature
- Respiratory Rate
- Weight and Height
- Oxygen Saturation
2.2 Why is VS Special?
The VS domain introduces an important concept: repeated measures at the same timepoint. Unlike demographics (one row per subject) or adverse events (one row per event), vital signs are often measured multiple times:
- Multiple blood pressure readings at the same visit (e.g., 3 readings taken 2 minutes apart)
- Position-dependent measurements (sitting vs. standing blood pressure)
- Location-specific readings (left arm vs. right arm)
When analyzing vital signs, we need to decide:
- Do we use the first reading?
- Do we calculate the mean of all readings?
- Do we report the minimum or maximum?
This is a key consideration when building ADaM datasets (ADVS) from SDTM VS data.
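As a minimal sketch of these choices, the snippet below collapses three systolic readings into the first, mean, minimum, and maximum value. The `readings` table and its values are made up for illustration and do not come from any dataset used in this lesson.

```r
library(dplyr)

# Hypothetical data: three systolic BP readings for one subject at one visit
readings <- tibble(
  USUBJID  = "SUBJ-001",
  VISIT    = "BASELINE",
  VSTESTCD = "SYSBP",
  VSTPTNUM = 1:3,
  VSSTRESN = c(128, 124, 120)
)

# Collapse the readings into one candidate value per selection rule
reading_choices <- readings %>%
  arrange(VSTPTNUM) %>%                      # make "first" well defined
  group_by(USUBJID, VISIT, VSTESTCD) %>%
  summarise(
    FIRST = first(VSSTRESN),
    MEAN  = mean(VSSTRESN),
    MIN   = min(VSSTRESN),
    MAX   = max(VSSTRESN),
    .groups = "drop"
  )

print(reading_choices)
```

Which column ends up as the analysis value is a protocol and analysis-plan decision; the code only shows that all four are cheap to compute from the same long-format records.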
2.3 The VS Domain Structure
┌──────────────────────────────────────────────────────────────────────────────┐
│ VS DOMAIN - KEY VARIABLES │
├──────────────────────────────────────────────────────────────────────────────┤
│ IDENTIFIER VARIABLES │
│ STUDYID = Study identifier │
│ USUBJID = Unique subject identifier (links to DM) │
│ VSSEQ = Sequence number within subject │
├──────────────────────────────────────────────────────────────────────────────┤
│ TOPIC VARIABLE │
│ VSTESTCD = Vital sign test short name (e.g., "SYSBP", "DIABP", "HR") │
│ VSTEST = Vital sign test name (e.g., "Systolic Blood Pressure") │
├──────────────────────────────────────────────────────────────────────────────┤
│ RESULT VARIABLES │
│ VSORRES = Result as originally collected (character) │
│ VSORRESU = Unit of original result │
│ VSSTRESC = Standardized result (character) │
│ VSSTRESN = Standardized result (numeric) │
│ VSSTRESU = Standardized unit │
├──────────────────────────────────────────────────────────────────────────────┤
│ POSITIONAL VARIABLES (prominent in VS; --POS and --LOC also occur in other domains) │
│ VSPOS = Position of subject (SITTING, STANDING, SUPINE) │
│ VSLOC = Location on body (LEFT ARM, RIGHT ARM) │
├──────────────────────────────────────────────────────────────────────────────┤
│ TIMING VARIABLES │
│ VISITNUM = Visit number │
│ VISIT = Visit name (e.g., "SCREENING", "WEEK 2") │
│ VSDY = Study day of measurement │
│ VSTPT = Planned time point name (PRE-DOSE, 1 HOUR POST-DOSE) │
│ VSTPTNUM = Planned time point number │
│ VSDTC = Date/time of measurement (ISO 8601) │
└──────────────────────────────────────────────────────────────────────────────┘
3 Package Installation & Loading
3.1 Required Packages
| Package | Purpose |
|---|---|
| dplyr | Data manipulation (filter, mutate, summarise) |
| tidyr | Data reshaping (pivot operations) |
| pharmaversesdtm | Example SDTM datasets including VS |
3.2 Install Packages (if needed)
The installation snippet at the top of this page installs any of these packages that are missing.
3.3 Load Packages
library(dplyr)
library(tidyr)
library(pharmaversesdtm)
4 Exploring pharmaversesdtm VS Data
Let’s start by loading and exploring the VS domain from pharmaversesdtm.
4.1 Load the VS Domain
# Load VS domain from pharmaversesdtm
data("vs", package = "pharmaversesdtm")
# Quick overview
cat("VS domain dimensions:", nrow(vs), "rows x", ncol(vs), "columns\n")
VS domain dimensions: 29643 rows x 24 columns
cat("Number of unique subjects:", n_distinct(vs$USUBJID), "\n")
Number of unique subjects: 254
cat("Number of unique tests:", n_distinct(vs$VSTESTCD), "\n")
Number of unique tests: 6
4.2 Explore VS Structure
# View the structure of the VS domain
dplyr::glimpse(vs)
Rows: 29,643
Columns: 24
$ STUDYID <chr> "CDISCPILOT01", "CDISCPILOT01", "CDISCPILOT01", "CDISCPILOT01…
$ DOMAIN <chr> "VS", "VS", "VS", "VS", "VS", "VS", "VS", "VS", "VS", "VS", "…
$ USUBJID <chr> "01-701-1015", "01-701-1015", "01-701-1015", "01-701-1015", "…
$ VSSEQ <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18…
$ VSTESTCD <chr> "DIABP", "DIABP", "DIABP", "DIABP", "DIABP", "DIABP", "DIABP"…
$ VSTEST <chr> "Diastolic Blood Pressure", "Diastolic Blood Pressure", "Dias…
$ VSPOS <chr> "SUPINE", "STANDING", "STANDING", "SUPINE", "STANDING", "STAN…
$ VSORRES <chr> "64", "83", "57", "68", "59", "71", "56", "51", "61", "67", "…
$ VSORRESU <chr> "mmHg", "mmHg", "mmHg", "mmHg", "mmHg", "mmHg", "mmHg", "mmHg…
$ VSSTRESC <chr> "64", "83", "57", "68", "59", "71", "56", "51", "61", "67", "…
$ VSSTRESN <dbl> 64, 83, 57, 68, 59, 71, 56, 51, 61, 67, 61, 65, 56, 50, 54, 6…
$ VSSTRESU <chr> "mmHg", "mmHg", "mmHg", "mmHg", "mmHg", "mmHg", "mmHg", "mmHg…
$ VSSTAT <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
$ VSLOC <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
$ VSBLFL <chr> NA, NA, NA, NA, NA, NA, "Y", "Y", "Y", NA, NA, NA, NA, NA, NA…
$ VISITNUM <dbl> 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.5, 3.5, 3.5, 4…
$ VISIT <chr> "SCREENING 1", "SCREENING 1", "SCREENING 1", "SCREENING 2", "…
$ VISITDY <dbl> -7, -7, -7, -1, -1, -1, 1, 1, 1, 13, 13, 13, 14, 14, 14, 28, …
$ VSDTC <chr> "2013-12-26", "2013-12-26", "2013-12-26", "2013-12-31", "2013…
$ VSDY <dbl> -7, -7, -7, -2, -2, -2, 1, 1, 1, 13, 13, 13, 15, 15, 15, 29, …
$ VSTPT <chr> "AFTER LYING DOWN FOR 5 MINUTES", "AFTER STANDING FOR 1 MINUT…
$ VSTPTNUM <dbl> 815, 816, 817, 815, 816, 817, 815, 816, 817, 815, 816, 817, 8…
$ VSELTM <chr> "PT5M", "PT1M", "PT3M", "PT5M", "PT1M", "PT3M", "PT5M", "PT1M…
$ VSTPTREF <chr> "PATIENT SUPINE", "PATIENT STANDING", "PATIENT STANDING", "PA…
4.3 Key VS Variables
Let’s examine the vital sign tests available in our data:
# View unique vital sign tests
vs %>%
  distinct(VSTESTCD, VSTEST, VSSTRESU) %>%
  arrange(VSTESTCD)
# A tibble: 9 × 3
VSTESTCD VSTEST VSSTRESU
<chr> <chr> <chr>
1 DIABP Diastolic Blood Pressure mmHg
2 DIABP Diastolic Blood Pressure <NA>
3 HEIGHT Height cm
4 PULSE Pulse Rate BEATS/MIN
5 PULSE Pulse Rate <NA>
6 SYSBP Systolic Blood Pressure mmHg
7 SYSBP Systolic Blood Pressure <NA>
8 TEMP Temperature C
9 WEIGHT Weight kg
Common CDISC vital sign test codes:
- SYSBP = Systolic Blood Pressure
- DIABP = Diastolic Blood Pressure
- PULSE = Pulse Rate / Heart Rate
- TEMP = Temperature
- RESP = Respiratory Rate
- WEIGHT = Weight
- HEIGHT = Height
- BMI = Body Mass Index (often derived)
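The `distinct()` output above lists DIABP, PULSE, and SYSBP twice because some of their records have a missing VSSTRESU. A quick way to size this up is to count missing units and missing results per test. The sketch below does so on a small made-up table (`vs_toy` is hypothetical); replacing `vs_toy` with `vs` should work the same way, assuming the column names shown in the `glimpse()` output.

```r
library(dplyr)

# Hypothetical data: in this toy table, missing units coincide with missing
# results. Whether that holds in real data is something to check, not assume.
vs_toy <- tibble(
  VSTESTCD = c("SYSBP", "SYSBP", "SYSBP", "PULSE", "PULSE", "TEMP"),
  VSSTRESN = c(120, NA, 131, 72, NA, 36.6),
  VSSTRESU = c("mmHg", NA, "mmHg", "BEATS/MIN", NA, "C")
)

# Count records, missing units, and missing numeric results per test
missing_units <- vs_toy %>%
  group_by(VSTESTCD) %>%
  summarise(
    N_RECORDS      = n(),
    N_MISSING_UNIT = sum(is.na(VSSTRESU)),
    N_MISSING_RES  = sum(is.na(VSSTRESN)),
    .groups = "drop"
  )

print(missing_units)
```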
5 Understanding Visit Structure
5.1 Visits and Timepoints
In clinical trials, vital signs are collected at specific visits (e.g., Screening, Week 2, Week 8) and sometimes at specific timepoints within a visit (e.g., pre-dose, 1 hour post-dose).
# View visit structure
vs %>%
  distinct(VISITNUM, VISIT) %>%
  arrange(VISITNUM)
# A tibble: 16 × 2
VISITNUM VISIT
<dbl> <chr>
1 1 SCREENING 1
2 2 SCREENING 2
3 3 BASELINE
4 3.1 UNSCHEDULED 3.1
5 3.5 AMBUL ECG PLACEMENT
6 4 WEEK 2
7 5 WEEK 4
8 6 AMBUL ECG REMOVAL
9 7 WEEK 6
10 8 WEEK 8
11 9 WEEK 12
12 10 WEEK 16
13 11 WEEK 20
14 12 WEEK 24
15 13 WEEK 26
16 201 RETRIEVAL
5.2 SDTM Timing Variables Explained
| Variable | Description | Example |
|---|---|---|
| VISITNUM | Numeric visit identifier | 1, 2, 3 |
| VISIT | Visit name | “BASELINE”, “WEEK 2” |
| VSDY | Study day of measurement | 1, 15, 29 |
| VSTPT | Planned time point name | “PRE-DOSE”, “1HR POST-DOSE” |
| VSTPTNUM | Planned time point number | 1, 2, 3 |
| VSDTC | Date/time (ISO 8601) | “2023-03-15T09:30:00” |
# Example of timing variables
vs %>%
  filter(USUBJID == first(vs$USUBJID), VSTESTCD == "SYSBP") %>%
  select(USUBJID, VSTESTCD, VISITNUM, VISIT, VSDY, VSTPT, VSTPTNUM) %>%
  arrange(VISITNUM, VSTPTNUM) %>%
  head(10)
# A tibble: 10 × 7
USUBJID VSTESTCD VISITNUM VISIT VSDY VSTPT VSTPTNUM
<chr> <chr> <dbl> <chr> <dbl> <chr> <dbl>
1 01-701-1015 SYSBP 1 SCREENING 1 -7 AFTER LYING… 815
2 01-701-1015 SYSBP 1 SCREENING 1 -7 AFTER STAND… 816
3 01-701-1015 SYSBP 1 SCREENING 1 -7 AFTER STAND… 817
4 01-701-1015 SYSBP 2 SCREENING 2 -2 AFTER LYING… 815
5 01-701-1015 SYSBP 2 SCREENING 2 -2 AFTER STAND… 816
6 01-701-1015 SYSBP 2 SCREENING 2 -2 AFTER STAND… 817
7 01-701-1015 SYSBP 3 BASELINE 1 AFTER LYING… 815
8 01-701-1015 SYSBP 3 BASELINE 1 AFTER STAND… 816
9 01-701-1015 SYSBP 3 BASELINE 1 AFTER STAND… 817
10 01-701-1015 SYSBP 3.5 AMBUL ECG PLACEMENT 13 AFTER LYING… 815
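VSDY is not collected; it is derived from VSDTC and the subject's reference start date (RFSTDTC in the DM domain). The SDTM convention has no day 0: dates on or after the reference date get the difference plus 1, and earlier dates get the plain difference. The sketch below shows that rule in base R. The helper name `derive_study_day()` is hypothetical, and the reference date 2014-01-02 is an assumption chosen so that the first two results line up with the VSDTC and VSDY values shown in the `glimpse()` output (2013-12-26 with -7, 2013-12-31 with -2).

```r
# Hypothetical helper: SDTM study day from ISO 8601 date strings
derive_study_day <- function(dtc, ref_dtc) {
  # Keep only the date part so date-times such as "2014-01-14T09:30" also work;
  # partial dates (e.g. "2014-01") become NA instead of raising an error
  dt  <- as.Date(substr(dtc, 1, 10), format = "%Y-%m-%d")
  ref <- as.Date(substr(ref_dtc, 1, 10), format = "%Y-%m-%d")
  diff_days <- as.integer(dt - ref)
  # No day 0: add 1 on or after the reference date, otherwise keep the difference
  ifelse(diff_days >= 0, diff_days + 1L, diff_days)
}

vsdtc <- c("2013-12-26", "2013-12-31", "2014-01-02", "2014-01-14T09:30")
study_day <- derive_study_day(vsdtc, ref_dtc = "2014-01-02")
print(study_day)
```

Note the jump from -1 straight to 1 around the reference date, which is why a simple date subtraction is not enough.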
6 Positional Variables: VSPOS and VSLOC
6.1 Why Position and Location Matter
Blood pressure measurements can vary significantly based on:
- Position: Sitting, standing, or lying down (supine)
- Location: Which arm the reading was taken from
Clinical protocols often specify:
“Blood pressure should be measured after the patient has been seated for at least 5 minutes, using the right arm.”
6.2 VSPOS - Position of Subject
The VSPOS variable captures the subject’s position during measurement:
| VSPOS Value | Description |
|---|---|
| SITTING | Subject is seated |
| STANDING | Subject is standing |
| SUPINE | Subject is lying flat |
| RECLINING | Subject is partially reclined |
# Check position values in our data
vs %>%
  filter(VSTESTCD %in% c("SYSBP", "DIABP")) %>%
  count(VSPOS, name = "Count") %>%
  arrange(desc(Count))
# A tibble: 2 × 2
VSPOS Count
<chr> <int>
1 STANDING 10942
2 SUPINE 5473
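Because the data hold both supine and standing readings, one natural follow-up is to compare the two positions within the same subject and visit. The sketch below averages readings per position and then subtracts supine from standing. The `bp_toy` table and its values are invented for illustration; the column names follow the VS variables used above.

```r
library(dplyr)
library(tidyr)

# Hypothetical readings: one supine and two standing systolic values per visit
bp_toy <- tibble(
  USUBJID  = "SUBJ-001",
  VISIT    = rep(c("BASELINE", "WEEK 2"), each = 3),
  VSTESTCD = "SYSBP",
  VSPOS    = rep(c("SUPINE", "STANDING", "STANDING"), times = 2),
  VSSTRESN = c(130, 122, 124, 128, 126, 124)
)

# Mean per position, then one column per position and their difference
position_diff <- bp_toy %>%
  group_by(USUBJID, VISIT, VSTESTCD, VSPOS) %>%
  summarise(MEAN_VAL = mean(VSSTRESN), .groups = "drop") %>%
  pivot_wider(names_from = VSPOS, values_from = MEAN_VAL) %>%
  mutate(STANDING_MINUS_SUPINE = STANDING - SUPINE)

print(position_diff)
```

Averaging the two standing readings is only one option; a protocol might instead single out the 1-minute or the 3-minute standing value.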
6.3 VSLOC - Location on Body
The VSLOC variable specifies where on the body the measurement was taken:
| VSLOC Value | Description |
|---|---|
| LEFT ARM | Blood pressure cuff on left arm |
| RIGHT ARM | Blood pressure cuff on right arm |
| ARM | Arm (unspecified) |
| ORAL | Oral temperature |
| AXILLARY | Under the armpit (temperature) |
# Check location values
vs %>%
  filter(VSTESTCD %in% c("SYSBP", "DIABP", "TEMP")) %>%
  count(VSTESTCD, VSLOC, name = "Count") %>%
  arrange(VSTESTCD, desc(Count))
# A tibble: 4 × 3
VSTESTCD VSLOC Count
<chr> <chr> <int>
1 DIABP <NA> 8207
2 SYSBP <NA> 8208
3 TEMP ORAL CAVITY 1765
4 TEMP EAR 955
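When a test has more than one VSLOC value, as TEMP does above, it can be worth checking whether any single subject was measured at more than one location. The sketch below flags such subjects on a made-up table (`loc_toy` is hypothetical); it makes no claim about what the real data contain.

```r
library(dplyr)

# Hypothetical data: SUBJ-002 has temperature taken at two different locations
loc_toy <- tibble(
  USUBJID  = c("SUBJ-001", "SUBJ-001", "SUBJ-002", "SUBJ-002", "SUBJ-002"),
  VSTESTCD = "TEMP",
  VSLOC    = c("ORAL CAVITY", "ORAL CAVITY", "ORAL CAVITY", "EAR", NA)
)

# Keep subject/test combinations with more than one non-missing location
mixed_loc <- loc_toy %>%
  filter(!is.na(VSLOC)) %>%
  group_by(USUBJID, VSTESTCD) %>%
  summarise(
    N_LOC = n_distinct(VSLOC),
    LOCS  = paste(sort(unique(VSLOC)), collapse = ", "),
    .groups = "drop"
  ) %>%
  filter(N_LOC > 1)

print(mixed_loc)
```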
7 Handling Multiple Readings
7.1 The Problem: Multiple Measurements
In many studies, vital signs are measured multiple times at the same visit to ensure accuracy. For blood pressure, it’s common to take 3 readings and use the average.
7.2 Creating Sample Data with Multiple Readings
Let’s create a dataset that demonstrates this scenario:
# Create sample data with 3 BP readings per visit
set.seed(42)
subjects <- c("CDISC01-001-001", "CDISC01-001-002", "CDISC01-001-003")
visits <- c("SCREENING", "BASELINE", "WEEK 2", "WEEK 4")
visit_nums <- c(-1, 1, 2, 3)
# Generate VS data with multiple readings per visit
vs_multiple <- expand_grid(
  USUBJID = subjects,
  VISIT_INFO = tibble(VISIT = visits, VISITNUM = visit_nums)
) %>%
  unnest(VISIT_INFO) %>%
  # Create 3 readings per vital sign per visit
  crossing(
    READING = 1:3,
    tibble::tribble(
      ~VSTESTCD, ~VSTEST, ~VSSTRESU, ~BASE_VALUE, ~SD_VALUE,
      "SYSBP", "Systolic Blood Pressure", "mmHg", 125, 10,
      "DIABP", "Diastolic Blood Pressure", "mmHg", 80, 8,
      "PULSE", "Pulse Rate", "beats/min", 72, 8
    )
  ) %>%
  # Generate values with some variation between readings
  rowwise() %>%
  mutate(
    VSORRES = as.character(round(rnorm(1, BASE_VALUE, SD_VALUE), 0)),
    VSSTRESN = as.numeric(VSORRES),
    VSSTRESC = VSORRES,
    VSORRESU = VSSTRESU,
    VSPOS = "SITTING",
    VSLOC = ifelse(VSTESTCD %in% c("SYSBP", "DIABP"), "RIGHT ARM", NA_character_)
  ) %>%
  ungroup() %>%
  # Add identifiers
  mutate(
    STUDYID = "CDISC01",
    VSTPT = paste("READING", READING),
    VSTPTNUM = READING
  ) %>%
  group_by(USUBJID) %>%
  mutate(VSSEQ = row_number()) %>%
  ungroup() %>%
  select(STUDYID, USUBJID, VSSEQ, VSTESTCD, VSTEST,
         VSORRES, VSORRESU, VSSTRESC, VSSTRESN, VSSTRESU,
         VSPOS, VSLOC, VISITNUM, VISIT, VSTPT, VSTPTNUM)
cat("Sample VS data with multiple readings:\n")
Sample VS data with multiple readings:
cat("Total records:", nrow(vs_multiple), "\n")
Total records: 108
cat("Subjects:", n_distinct(vs_multiple$USUBJID), "\n")
Subjects: 3
cat("Visits:", n_distinct(vs_multiple$VISIT), "\n\n")
Visits: 4
# Show example for one subject at one visit
vs_multiple %>%
  filter(USUBJID == first(vs_multiple$USUBJID), VISIT == "BASELINE") %>%
  print()
# A tibble: 9 × 16
STUDYID USUBJID VSSEQ VSTESTCD VSTEST VSORRES VSORRESU VSSTRESC VSSTRESN
<chr> <chr> <int> <chr> <chr> <chr> <chr> <chr> <dbl>
1 CDISC01 CDISC01-001-… 1 DIABP Diast… 91 mmHg 91 91
2 CDISC01 CDISC01-001-… 2 PULSE Pulse… 67 beats/m… 67 67
3 CDISC01 CDISC01-001-… 3 SYSBP Systo… 129 mmHg 129 129
4 CDISC01 CDISC01-001-… 4 DIABP Diast… 85 mmHg 85 85
5 CDISC01 CDISC01-001-… 5 PULSE Pulse… 75 beats/m… 75 75
6 CDISC01 CDISC01-001-… 6 SYSBP Systo… 124 mmHg 124 124
7 CDISC01 CDISC01-001-… 7 DIABP Diast… 92 mmHg 92 92
8 CDISC01 CDISC01-001-… 8 PULSE Pulse… 71 beats/m… 71 71
9 CDISC01 CDISC01-001-… 9 SYSBP Systo… 145 mmHg 145 145
# ℹ 7 more variables: VSSTRESU <chr>, VSPOS <chr>, VSLOC <chr>, VISITNUM <dbl>,
# VISIT <chr>, VSTPT <chr>, VSTPTNUM <int>
Notice that for each subject at each visit, we have:
- 3 readings × 3 tests = 9 records per subject per visit
- Each reading is identified by VSTPT and VSTPTNUM
- This granular data allows for various analysis approaches
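Real data rarely arrive this tidy, so before averaging it helps to confirm that every subject, visit, and test really has the expected number of readings. The sketch below counts readings per group and keeps the groups that deviate. The `readings_toy` table is made up, and `expected_n` is an assumed protocol value.

```r
library(dplyr)

# Hypothetical data: SUBJ-002 is missing the third reading
readings_toy <- tibble(
  USUBJID  = c(rep("SUBJ-001", 3), rep("SUBJ-002", 2)),
  VISIT    = "BASELINE",
  VSTESTCD = "SYSBP",
  VSTPTNUM = c(1, 2, 3, 1, 2),
  VSSTRESN = c(128, 124, 120, 135, 131)
)

expected_n <- 3  # assumption: the protocol calls for three readings

# Groups whose number of readings differs from the expected count
incomplete_sets <- readings_toy %>%
  count(USUBJID, VISIT, VSTESTCD, name = "N_READINGS") %>%
  filter(N_READINGS != expected_n)

print(incomplete_sets)
```

Groups that show up here need a documented rule, for example averaging whatever readings are available or treating the visit value as missing.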
8 Deriving Statistics Across Readings
Now let’s derive mean, minimum, and maximum values across the multiple readings.
8.1 Calculate Summary Statistics
# Calculate statistics across readings for each subject/visit/test
vs_summary <- vs_multiple %>%
  group_by(STUDYID, USUBJID, VSTESTCD, VSTEST, VSSTRESU,
           VSPOS, VSLOC, VISITNUM, VISIT) %>%
  summarise(
    N_READINGS = n(),
    VSSTRESN_MEAN = round(mean(VSSTRESN, na.rm = TRUE), 1),
    VSSTRESN_MIN = min(VSSTRESN, na.rm = TRUE),
    VSSTRESN_MAX = max(VSSTRESN, na.rm = TRUE),
    VSSTRESN_SD = round(sd(VSSTRESN, na.rm = TRUE), 2),
    .groups = "drop"
  )
cat("Summary statistics by subject/visit/test:\n")
Summary statistics by subject/visit/test:
vs_summary %>%
  filter(USUBJID == first(vs_summary$USUBJID)) %>%
  print()
# A tibble: 12 × 14
STUDYID USUBJID VSTESTCD VSTEST VSSTRESU VSPOS VSLOC VISITNUM VISIT
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <chr>
1 CDISC01 CDISC01-001-001 DIABP Diastol… mmHg SITT… RIGH… -1 SCRE…
2 CDISC01 CDISC01-001-001 DIABP Diastol… mmHg SITT… RIGH… 1 BASE…
3 CDISC01 CDISC01-001-001 DIABP Diastol… mmHg SITT… RIGH… 2 WEEK…
4 CDISC01 CDISC01-001-001 DIABP Diastol… mmHg SITT… RIGH… 3 WEEK…
5 CDISC01 CDISC01-001-001 PULSE Pulse R… beats/m… SITT… <NA> -1 SCRE…
6 CDISC01 CDISC01-001-001 PULSE Pulse R… beats/m… SITT… <NA> 1 BASE…
7 CDISC01 CDISC01-001-001 PULSE Pulse R… beats/m… SITT… <NA> 2 WEEK…
8 CDISC01 CDISC01-001-001 PULSE Pulse R… beats/m… SITT… <NA> 3 WEEK…
9 CDISC01 CDISC01-001-001 SYSBP Systoli… mmHg SITT… RIGH… -1 SCRE…
10 CDISC01 CDISC01-001-001 SYSBP Systoli… mmHg SITT… RIGH… 1 BASE…
11 CDISC01 CDISC01-001-001 SYSBP Systoli… mmHg SITT… RIGH… 2 WEEK…
12 CDISC01 CDISC01-001-001 SYSBP Systoli… mmHg SITT… RIGH… 3 WEEK…
# ℹ 5 more variables: N_READINGS <int>, VSSTRESN_MEAN <dbl>,
# VSSTRESN_MIN <dbl>, VSSTRESN_MAX <dbl>, VSSTRESN_SD <dbl>
8.2 Create Analysis-Ready Dataset
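When creating ADaM ADVS, we typically select one representative value from the multiple readings. Common approaches include the mean of all readings, the first or last reading, and the worst (minimum or maximum) value. The code below uses the mean: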
# Create analysis-ready dataset using MEAN of readings
vs_analysis <- vs_summary %>%
  mutate(
    AVAL = VSSTRESN_MEAN,
    AVALC = as.character(AVAL),
    DTYPE = "MEAN" # Derivation type
  ) %>%
  select(STUDYID, USUBJID, VSTESTCD, VSTEST,
         VISITNUM, VISIT, VSPOS, VSLOC,
         AVAL, AVALC, VSSTRESU, N_READINGS, DTYPE)
cat("Analysis-ready dataset (using mean values):\n")
Analysis-ready dataset (using mean values):
head(vs_analysis, 12)
# A tibble: 12 × 13
STUDYID USUBJID VSTESTCD VSTEST VISITNUM VISIT VSPOS VSLOC AVAL AVALC
<chr> <chr> <chr> <chr> <dbl> <chr> <chr> <chr> <dbl> <chr>
1 CDISC01 CDISC01-001-0… DIABP Diast… -1 SCRE… SITT… RIGH… 77.7 77.7
2 CDISC01 CDISC01-001-0… DIABP Diast… 1 BASE… SITT… RIGH… 89.3 89.3
3 CDISC01 CDISC01-001-0… DIABP Diast… 2 WEEK… SITT… RIGH… 73.7 73.7
4 CDISC01 CDISC01-001-0… DIABP Diast… 3 WEEK… SITT… RIGH… 75 75
5 CDISC01 CDISC01-001-0… PULSE Pulse… -1 SCRE… SITT… <NA> 74 74
6 CDISC01 CDISC01-001-0… PULSE Pulse… 1 BASE… SITT… <NA> 71 71
7 CDISC01 CDISC01-001-0… PULSE Pulse… 2 WEEK… SITT… <NA> 74.3 74.3
8 CDISC01 CDISC01-001-0… PULSE Pulse… 3 WEEK… SITT… <NA> 76.7 76.7
9 CDISC01 CDISC01-001-0… SYSBP Systo… -1 SCRE… SITT… RIGH… 123. 123.3
10 CDISC01 CDISC01-001-0… SYSBP Systo… 1 BASE… SITT… RIGH… 133. 132.7
11 CDISC01 CDISC01-001-0… SYSBP Systo… 2 WEEK… SITT… RIGH… 127 127
12 CDISC01 CDISC01-001-0… SYSBP Systo… 3 WEEK… SITT… RIGH… 121. 120.7
# ℹ 3 more variables: VSSTRESU <chr>, N_READINGS <int>, DTYPE <chr>
When building ADVS (Analysis Dataset for Vital Signs):
- AVAL = Analysis Value (numeric)
- AVALC = Analysis Value (character)
- DTYPE = Derivation Type (documents how AVAL was calculated)
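Common DTYPE values: “MEAN”, “MEDIAN”, “LAST”, “FIRST”, “WORST”. These labels are illustrative; CDISC controlled terminology for DTYPE uses terms such as “AVERAGE”, “MINIMUM”, “MAXIMUM”, “LOCF”, and “WOCF”, so check the current codelist before using a value in submission work.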
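To make the link between the selection rule and DTYPE concrete, here is a small sketch of a helper that collapses readings with a chosen rule and records that rule alongside the result. The function name `collapse_readings()`, the toy table, and the three supported labels are all assumptions for illustration; the labels follow this lesson's informal wording rather than any official codelist.

```r
library(dplyr)

# Hypothetical helper: collapse repeated readings and record how it was done
collapse_readings <- function(data, method = c("MEAN", "FIRST", "LAST")) {
  method <- match.arg(method)
  # Map the label to the function that reduces a vector of readings
  reducer <- switch(
    method,
    MEAN  = function(x) mean(x, na.rm = TRUE),
    FIRST = function(x) x[1],
    LAST  = function(x) x[length(x)]
  )
  data %>%
    arrange(USUBJID, VISIT, VSTESTCD, VSTPTNUM) %>%  # order defines FIRST/LAST
    group_by(USUBJID, VISIT, VSTESTCD) %>%
    summarise(AVAL = reducer(VSSTRESN), .groups = "drop") %>%
    mutate(DTYPE = method)
}

# Made-up readings, deliberately out of order
readings_demo <- tibble(
  USUBJID  = "SUBJ-001",
  VISIT    = "BASELINE",
  VSTESTCD = "SYSBP",
  VSTPTNUM = c(2, 1, 3),
  VSSTRESN = c(124, 128, 120)
)

mean_res  <- collapse_readings(readings_demo, "MEAN")
first_res <- collapse_readings(readings_demo, "FIRST")
last_res  <- collapse_readings(readings_demo, "LAST")
print(bind_rows(mean_res, first_res, last_res))
```

Keeping the rule and its label in one place reduces the chance that AVAL and DTYPE drift apart when the derivation changes.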
9 Creating a Wide Format View
For some analyses, having one row per subject per visit (with BP columns side by side) is useful.
9.1 Pivot to Wide Format
# Create wide format with one row per subject/visit
vs_wide <- vs_summary %>%
  select(USUBJID, VISITNUM, VISIT, VSTESTCD, VSSTRESN_MEAN) %>%
  pivot_wider(
    names_from = VSTESTCD,
    values_from = VSSTRESN_MEAN,
    names_prefix = "MEAN_"
  )
cat("Wide format - One row per subject per visit:\n")
Wide format - One row per subject per visit:
print(vs_wide)
# A tibble: 12 × 6
USUBJID VISITNUM VISIT MEAN_DIABP MEAN_PULSE MEAN_SYSBP
<chr> <dbl> <chr> <dbl> <dbl> <dbl>
1 CDISC01-001-001 -1 SCREENING 77.7 74 123.
2 CDISC01-001-001 1 BASELINE 89.3 71 133.
3 CDISC01-001-001 2 WEEK 2 73.7 74.3 127
4 CDISC01-001-001 3 WEEK 4 75 76.7 121.
5 CDISC01-001-002 -1 SCREENING 78 76 133.
6 CDISC01-001-002 1 BASELINE 80 68.3 111
7 CDISC01-001-002 2 WEEK 2 79.7 65 130.
8 CDISC01-001-002 3 WEEK 4 86.7 70 132
9 CDISC01-001-003 -1 SCREENING 77 76.3 127.
10 CDISC01-001-003 1 BASELINE 81 68.3 130
11 CDISC01-001-003 2 WEEK 2 84.3 63.7 125.
12 CDISC01-001-003 3 WEEK 4 79.3 79.3 126.
Wide format is useful for:
- Correlation analysis (e.g., systolic vs. diastolic BP)
- Patient profiles (view all vitals at once)
- Exploratory data analysis
SDTM VS is always in long format (one row per measurement).
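Since SDTM expects the long layout, it helps to know the way back from a wide table. The sketch below reverses the pivot with `tidyr::pivot_longer()` on a small made-up table (`wide_toy` is hypothetical), stripping the `MEAN_` prefix so the test code is recovered.

```r
library(dplyr)
library(tidyr)

# Hypothetical wide table: one row per subject/visit, one column per test
wide_toy <- tibble(
  USUBJID    = c("SUBJ-001", "SUBJ-002"),
  VISIT      = "BASELINE",
  MEAN_SYSBP = c(124, 131.3),
  MEAN_DIABP = c(80, 77.7)
)

# Back to long format: one row per subject/visit/test
long_toy <- wide_toy %>%
  pivot_longer(
    cols         = starts_with("MEAN_"),
    names_to     = "VSTESTCD",
    names_prefix = "MEAN_",
    values_to    = "VSSTRESN_MEAN"
  )

print(long_toy)
```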
10 Complete Example: Building a VS Domain
Let’s build a complete VS domain from simulated raw data, similar to what you might receive from an EDC system.
10.1 Step 1: Create Raw Vital Signs Data
# Simulate raw vital signs data as it might come from EDC
set.seed(123)
subjects <- paste0("CDISC01-001-00", 1:5)
visits <- c("SCREENING", "BASELINE", "WEEK 2", "WEEK 4", "WEEK 8")
visit_nums <- c(-1, 1, 2, 3, 4)
# Generate comprehensive VS data
raw_vs <- expand_grid(
  USUBJID = subjects,
  VISIT_INFO = tibble(VISIT = visits, VISITNUM = visit_nums)
) %>%
  unnest(VISIT_INFO) %>%
  # Add vital sign tests with 3 readings each for BP, 1 for others
  crossing(
    tibble::tribble(
      ~TEST, ~TESTCD, ~UNIT, ~MEAN_VAL, ~SD_VAL, ~N_READ,
      "Systolic Blood Pressure", "SYSBP", "mmHg", 125, 12, 3,
      "Diastolic Blood Pressure", "DIABP", "mmHg", 78, 8, 3,
      "Pulse Rate", "PULSE", "beats/min", 72, 8, 1,
      "Temperature", "TEMP", "C", 36.8, 0.3, 1,
      "Respiratory Rate", "RESP", "breaths/min", 16, 2, 1,
      "Weight", "WEIGHT", "kg", 75, 10, 1,
      "Height", "HEIGHT", "cm", 170, 10, 1
    )
  ) %>%
  # Expand for number of readings
  rowwise() %>%
  mutate(data = list(tibble(READING = 1:N_READ))) %>%
  unnest(data) %>%
  ungroup() %>%
  # Generate values
  rowwise() %>%
  mutate(
    VALUE = round(rnorm(1, MEAN_VAL, SD_VAL), 1),
    # Height: whole-number draw per row; only the SCREENING record is kept
    # below, so each subject ends up with a single height
    VALUE = ifelse(TESTCD == "HEIGHT", round(rnorm(1, 170, 10), 0), VALUE),
    # Weight: small spread (SD 2) around a mean that shifts 0.5 kg per visit number
    VALUE = ifelse(TESTCD == "WEIGHT", round(rnorm(1, 75 + (VISITNUM * 0.5), 2), 1), VALUE)
  ) %>%
  ungroup() %>%
  # Clean up - HEIGHT only at SCREENING
  filter(!(TESTCD == "HEIGHT" & VISIT != "SCREENING")) %>%
  mutate(
    STUDYID = "CDISC01",
    VSPOS = case_when(
      TESTCD %in% c("SYSBP", "DIABP", "PULSE") ~ "SITTING",
      TRUE ~ NA_character_
    ),
    VSLOC = case_when(
      TESTCD %in% c("SYSBP", "DIABP") ~ "RIGHT ARM",
      TESTCD == "TEMP" ~ "ORAL",
      TRUE ~ NA_character_
    ),
    VSTPT = ifelse(N_READ > 1, paste("READING", READING), NA_character_),
    VSTPTNUM = ifelse(N_READ > 1, READING, NA_integer_)
  )
cat("Raw VS data generated:\n")
Raw VS data generated:
cat("Total records:", nrow(raw_vs), "\n")
Total records: 255
10.2 Step 2: Transform to SDTM VS Format
# Transform to proper SDTM VS format
sdtm_vs <- raw_vs %>%
  # Add required SDTM variables
  mutate(
    DOMAIN = "VS",
    VSTESTCD = TESTCD,
    VSTEST = TEST,
    VSORRES = as.character(VALUE),
    VSORRESU = UNIT,
    VSSTRESC = VSORRES,
    VSSTRESN = VALUE,
    VSSTRESU = UNIT,
    VSDTC = "2023-03-15", # Simplified for this example
    EPOCH = case_when(
      VISITNUM < 1 ~ "SCREENING",
      TRUE ~ "TREATMENT" # BASELINE (VISITNUM 1) and all later visits
    )
  ) %>%
  # Add sequence number
  group_by(USUBJID) %>%
  mutate(VSSEQ = row_number()) %>%
  ungroup() %>%
  # Select and order SDTM variables
  select(
    STUDYID, DOMAIN, USUBJID, VSSEQ,
    VSTESTCD, VSTEST,
    VSORRES, VSORRESU,
    VSSTRESC, VSSTRESN, VSSTRESU,
    VSPOS, VSLOC,
    VISITNUM, VISIT, VSTPT, VSTPTNUM,
    EPOCH, VSDTC
  ) %>%
  arrange(STUDYID, USUBJID, VISITNUM, VSTESTCD, VSTPTNUM)
cat("SDTM VS domain created:\n")
SDTM VS domain created:
cat("Total records:", nrow(sdtm_vs), "\n\n")
Total records: 255
# Preview
head(sdtm_vs, 15)
# A tibble: 15 × 19
STUDYID DOMAIN USUBJID VSSEQ VSTESTCD VSTEST VSORRES VSORRESU VSSTRESC
<chr> <chr> <chr> <int> <chr> <chr> <chr> <chr> <chr>
1 CDISC01 VS CDISC01-001-0… 11 DIABP Diast… 80.9 mmHg 80.9
2 CDISC01 VS CDISC01-001-0… 12 DIABP Diast… 81.2 mmHg 81.2
3 CDISC01 VS CDISC01-001-0… 13 DIABP Diast… 78.9 mmHg 78.9
4 CDISC01 VS CDISC01-001-0… 14 HEIGHT Height 172 cm 172
5 CDISC01 VS CDISC01-001-0… 15 PULSE Pulse… 86.3 beats/m… 86.3
6 CDISC01 VS CDISC01-001-0… 16 RESP Respi… 17 breaths… 17
7 CDISC01 VS CDISC01-001-0… 17 SYSBP Systo… 101.4 mmHg 101.4
8 CDISC01 VS CDISC01-001-0… 18 SYSBP Systo… 133.4 mmHg 133.4
9 CDISC01 VS CDISC01-001-0… 19 SYSBP Systo… 119.3 mmHg 119.3
10 CDISC01 VS CDISC01-001-0… 20 TEMP Tempe… 36.5 C 36.5
11 CDISC01 VS CDISC01-001-0… 21 WEIGHT Weight 73 kg 73
12 CDISC01 VS CDISC01-001-0… 1 DIABP Diast… 73.5 mmHg 73.5
13 CDISC01 VS CDISC01-001-0… 2 DIABP Diast… 76.2 mmHg 76.2
14 CDISC01 VS CDISC01-001-0… 3 DIABP Diast… 90.5 mmHg 90.5
15 CDISC01 VS CDISC01-001-0… 4 PULSE Pulse… 73 beats/m… 73
# ℹ 10 more variables: VSSTRESN <dbl>, VSSTRESU <chr>, VSPOS <chr>,
# VSLOC <chr>, VISITNUM <dbl>, VISIT <chr>, VSTPT <chr>, VSTPTNUM <int>,
# EPOCH <chr>, VSDTC <chr>
10.3 Step 3: Validate the VS Domain
# Validation checks
cat("=== VS Domain Validation ===\n\n")
=== VS Domain Validation ===
# Check 1: Required variables
required_vars <- c("STUDYID", "DOMAIN", "USUBJID", "VSSEQ",
"VSTESTCD", "VSTEST", "VSORRES", "VSORRESU")
present <- required_vars %in% names(sdtm_vs)
cat("Required variables check:\n")
Required variables check:
for (i in seq_along(required_vars)) {
  cat(" ", required_vars[i], ":", ifelse(present[i], "✓", "✗"), "\n")
}
STUDYID : ✓
DOMAIN : ✓
USUBJID : ✓
VSSEQ : ✓
VSTESTCD : ✓
VSTEST : ✓
VSORRES : ✓
VSORRESU : ✓
# Check 2: Data completeness
cat("\nData completeness:\n")
Data completeness:
cat(" Missing VSSTRESN:", sum(is.na(sdtm_vs$VSSTRESN)), "\n")
Missing VSSTRESN: 0
cat(" Missing VSPOS (for BP/pulse):",
    sum(is.na(sdtm_vs$VSPOS) & sdtm_vs$VSTESTCD %in% c("SYSBP", "DIABP", "PULSE")), "\n")
Missing VSPOS (for BP/pulse): 0
# Check 3: Value ranges
cat("\nValue ranges by test:\n")
Value ranges by test:
sdtm_vs %>%
  group_by(VSTESTCD, VSTEST, VSSTRESU) %>%
  summarise(
    Min = min(VSSTRESN, na.rm = TRUE),
    Max = max(VSSTRESN, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  print()
# A tibble: 7 × 5
VSTESTCD VSTEST VSSTRESU Min Max
<chr> <chr> <chr> <dbl> <dbl>
1 DIABP Diastolic Blood Pressure mmHg 61.6 96.3
2 HEIGHT Height cm 172 189
3 PULSE Pulse Rate beats/min 60.3 86.3
4 RESP Respiratory Rate breaths/min 11.4 20.2
5 SYSBP Systolic Blood Pressure mmHg 101. 151.
6 TEMP Temperature C 36.4 37.8
7 WEIGHT Weight kg 70.5 81.1
# Check 4: Coverage
cat("\nData coverage:\n")
Data coverage:
cat(" Subjects:", n_distinct(sdtm_vs$USUBJID), "\n")
Subjects: 5
cat(" Visits:", n_distinct(sdtm_vs$VISIT), "\n")
Visits: 5
cat(" Tests:", n_distinct(sdtm_vs$VSTESTCD), "\n")
Tests: 7
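One check the list above does not cover is whether VSSEQ is unique within each subject, which is what lets USUBJID plus VSSEQ identify a single record. The sketch below looks for repeated sequence numbers on a made-up table (`seq_toy` is hypothetical); pointing it at `sdtm_vs` should work the same way, assuming the same column names.

```r
library(dplyr)

# Hypothetical data: VSSEQ 2 is used twice for SUBJ-001
seq_toy <- tibble(
  USUBJID  = c("SUBJ-001", "SUBJ-001", "SUBJ-001", "SUBJ-002"),
  VSSEQ    = c(1, 2, 2, 1),
  VSTESTCD = c("SYSBP", "DIABP", "PULSE", "SYSBP")
)

# Any USUBJID/VSSEQ pair that occurs more than once is a problem
dup_seq <- seq_toy %>%
  count(USUBJID, VSSEQ, name = "N") %>%
  filter(N > 1)

print(dup_seq)
```

An empty result means the key is clean; any rows returned point to the exact subject and sequence number to investigate.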
11 Preview: ADaM BDS Structure
The VS domain maps directly to the ADaM ADVS dataset using the BDS (Basic Data Structure) format. Here’s a preview of how this transition works:
11.1 Key ADaM Concepts for ADVS
| SDTM Variable | ADaM Variable | Description |
|---|---|---|
| VSSTRESN | AVAL | Analysis value (numeric) |
| VSSTRESC | AVALC | Analysis value (character) |
| VISIT | AVISIT | Analysis visit |
| VISITNUM | AVISITN | Analysis visit number |
| - | ABLFL | Baseline flag (“Y”) |
| - | BASE | Baseline value |
| - | CHG | Change from baseline |
| - | PCHG | Percent change from baseline |
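The last two rows of the table are simple arithmetic: CHG is AVAL minus BASE, and PCHG is CHG expressed as a percentage of BASE. The sketch below works this through in base R on illustrative numbers (the values are made up and rounded to one decimal, as in this lesson).

```r
# Illustrative values: one baseline and three analysis values
base_val <- 119
aval     <- c(119, 127.1, 119.9)

# Change and percent change from baseline, rounded to one decimal
chg  <- round(aval - base_val, 1)
pchg <- round(100 * (aval - base_val) / base_val, 1)

print(data.frame(AVAL = aval, BASE = base_val, CHG = chg, PCHG = pchg))
```

PCHG is undefined when BASE is 0, so a real derivation needs a guard for that case.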
11.2 Mock ADVS Creation
# Create a simple ADVS preview
# First, aggregate to one value per subject/visit/test (using mean for BP)
advs_prep <- sdtm_vs %>%
  group_by(STUDYID, USUBJID, VSTESTCD, VSTEST, VSSTRESU, VISITNUM, VISIT) %>%
  summarise(
    AVAL = round(mean(VSSTRESN, na.rm = TRUE), 1),
    .groups = "drop"
  ) %>%
  rename(PARAMCD = VSTESTCD, PARAM = VSTEST, AVISITN = VISITNUM, AVISIT = VISIT)
# Add baseline flag and value
advs_preview <- advs_prep %>%
  group_by(USUBJID, PARAMCD) %>%
  mutate(
    # Flag baseline (VISITNUM = 1)
    ABLFL = ifelse(AVISITN == 1, "Y", NA_character_),
    # Get baseline value
    BASE = AVAL[AVISITN == 1][1],
    # Calculate change from baseline
    CHG = ifelse(AVISITN >= 1, round(AVAL - BASE, 1), NA_real_),
    # Calculate percent change
    PCHG = ifelse(AVISITN >= 1 & BASE != 0,
                  round(100 * (AVAL - BASE) / BASE, 1), NA_real_)
  ) %>%
  ungroup()
cat("ADVS Preview (BDS Structure):\n")
ADVS Preview (BDS Structure):
advs_preview %>%
  filter(USUBJID == first(advs_preview$USUBJID), PARAMCD == "SYSBP") %>%
  select(USUBJID, PARAMCD, AVISITN, AVISIT, AVAL, ABLFL, BASE, CHG, PCHG) %>%
  print()
# A tibble: 5 × 9
USUBJID PARAMCD AVISITN AVISIT AVAL ABLFL BASE CHG PCHG
<chr> <chr> <dbl> <chr> <dbl> <chr> <dbl> <dbl> <dbl>
1 CDISC01-001-001 SYSBP -1 SCREENING 118 <NA> 119 NA NA
2 CDISC01-001-001 SYSBP 1 BASELINE 119 Y 119 0 0
3 CDISC01-001-001 SYSBP 2 WEEK 2 127. <NA> 119 8.1 6.8
4 CDISC01-001-001 SYSBP 3 WEEK 4 120. <NA> 119 0.9 0.8
5 CDISC01-001-001 SYSBP 4 WEEK 8 126. <NA> 119 6.7 5.6
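The mock above fixes baseline at the visit with AVISITN equal to 1. Many analysis plans instead define baseline as the last non-missing value on or before the first dose. The sketch below shows that alternative in plain dplyr. It is only an illustration under stated assumptions: the `bl_toy` table is made up, ADY stands in for the analysis relative day, and day 1 is assumed to be the first dosing day.

```r
library(dplyr)

# Hypothetical data: the day -2 value is missing, so day 1 becomes baseline
bl_toy <- tibble(
  USUBJID = "SUBJ-001",
  PARAMCD = "SYSBP",
  ADY     = c(-7, -2, 1, 15),
  AVAL    = c(118, NA, 119, 127.1)
)

bl_flagged <- bl_toy %>%
  group_by(USUBJID, PARAMCD) %>%
  mutate(
    # Candidate baseline records: non-missing values on or before day 1
    IS_CAND = !is.na(AVAL) & ADY <= 1,
    # Flag the latest candidate (this toy example always has at least one)
    ABLFL = ifelse(IS_CAND & ADY == max(ADY[IS_CAND]), "Y", NA_character_),
    # Carry the flagged value to every record of the parameter
    BASE = AVAL[which(ABLFL == "Y")[1]]
  ) %>%
  ungroup() %>%
  select(-IS_CAND)

print(bl_flagged)
```

If a subject had no candidate records, `max()` would return `-Inf` with a warning and BASE would be missing, so production code would need an explicit rule for that case.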
In Week 3 (Days 15-21), we’ll use the admiral package to properly build ADaM datasets like ADVS. The key admiral functions include:
- derive_var_extreme_flag() - Create baseline flags
- derive_var_base() - Derive baseline values
- derive_var_chg() - Calculate change from baseline
Learn more at: Admiral ADVS Vignette
12 Deliverable Summary
Today you completed the following:
| Task | Status |
|---|---|
| Understood VS domain structure and repeated measures | ✓ Done |
| Explored positioning variables (VSPOS, VSLOC) | ✓ Done |
| Handled multiple readings per visit | ✓ Done |
| Derived summary statistics (mean, min, max) | ✓ Done |
| Created wide format view for analysis | ✓ Done |
| Built a complete VS domain with 5+ vital signs | ✓ Done |
| Previewed ADaM BDS structure (ADVS) | ✓ Done |
13 Key Takeaways
- VS is a Findings class domain - It shares structure with LB (--TESTCD, --ORRES, --STRESN)
- Repeated measures are common - Blood pressure often has 3 readings per visit
- Position matters - VSPOS (SITTING/STANDING) affects measurements significantly
- Location matters - VSLOC (LEFT ARM/RIGHT ARM) should be consistent per protocol
- Summary statistics are essential - Choose mean, min, max based on analysis needs
- VS maps to ADVS - BDS structure includes baseline, change, and percent change
14 Resources
- CDISC SDTM Implementation Guide - VS Domain - Official VS specification
- Admiral ADVS Vignette - Building vital signs analysis datasets
- Pharmaverse.org - R packages for clinical data
- tidyr pivot documentation - Wide/long format transformations
- CDISC ADaM BDS Structure - Basic Data Structure specification
15 What’s Next?
In Day 10, we will focus on AE Domain Mastery & SAE Logic:
- Deep dive into severity, causality, and outcomes
- Understanding SAE criteria (AESER, AESDTH, AESHOSP)
- Deriving AE duration and treatment-emergent flags
- Preparing for ADaM ADAE creation