library(dplyr)
library(tidyr)
library(tibble)
library(lubridate)
Day 6: Introduction to sdtm.oak
EDC-to-SDTM Transformation Patterns
1 Learning Objectives
By the end of Day 6, you will be able to:
- Understand the philosophy of sdtm.oak for SDTM generation
- Simulate “Raw” EDC datasets for both VS and LB
- Apply algorithm-based transformations (mapping, pivoting, hardcoding)
- Perform unit standardization (e.g., mg/dL to mmol/L)
- Create a complete SDTM LB domain with sequence numbers
2 Introduction
2.1 What is sdtm.oak?
sdtm.oak is a package from the pharmaverse that helps you create SDTM datasets in a standardized, repeatable way. Instead of writing custom code for every study, you use a set of reusable algorithms and rules. This makes your code easier to maintain, test, and share with others.
2.1.1 Key concepts:
- Algorithm-based: You use small, well-defined steps (algorithms) to transform your data. For example, you might have an algorithm to assign subject IDs, another to map test codes, and another to standardize units.
- Metadata-driven: The rules for how to transform the data are defined in a specification (metadata), not hard-coded in your script. This means you can update the rules without rewriting your code.
- Modular: Each transformation is a small, testable function. You can chain them together to build complex workflows.
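To make the metadata-driven idea concrete, here is a minimal sketch in plain dplyr/tibble. It is not the sdtm.oak API: the `spec` table and the `apply_hardcodes()` helper are hypothetical names invented for this illustration. The point is only that the constants live in a table, so changing a rule means editing data rather than code.

```r
library(dplyr)
library(tibble)

# Hypothetical specification: one row per target variable and its constant value
spec <- tribble(
  ~target,   ~value,
  "STUDYID", "DEMO-001",
  "DOMAIN",  "VS"
)

# Hypothetical helper: add every constant listed in the spec to the data
apply_hardcodes <- function(data, spec) {
  for (i in seq_len(nrow(spec))) {
    data[[spec$target[i]]] <- spec$value[i]
  }
  data
}

raw <- tibble(SubjectID = c("001", "002"))
out <- apply_hardcodes(raw, spec)
print(out)
```

Switching the domain to LB would then only require changing the `DOMAIN` row of `spec`; the helper itself stays untouched.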
2.1.2 Why is this important?
This approach saves time, reduces errors, and makes it easier to follow CDISC standards. It also helps new programmers understand what each step is doing, because the code is organized and well-documented.
2.2 The sdtm.oak Philosophy
Here are some common algorithms used in SDTM transformations:
| Algorithm | Description | Example Use |
|---|---|---|
| Assign | Copy source to target | Subject ID → USUBJID |
| Hardcode | Set a constant value | DOMAIN = “LB” |
| Condition | Apply logic based on condition | If severity >= 3 then “Y” |
| Assign CT | Map to Controlled Terminology | “Male” → “M” |
Each algorithm does one thing, and you can combine them to build your SDTM domains step by step.
Note: We’ll simulate sdtm.oak patterns using dplyr/tidyr to understand the concepts.
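As a rough sketch of that simulation, each of the four algorithms in the table can be written as a small dplyr function. The helper names (`algo_assign()`, `algo_hardcode()`, `algo_condition()`, `algo_assign_ct()`), the toy `raw` data, and the target variable names are all assumptions made for this example; they are not functions from sdtm.oak.

```r
library(dplyr)
library(tibble)

# Hypothetical helpers, one per algorithm in the table above
algo_assign <- function(data, target, source) {
  mutate(data, "{target}" := .data[[source]])
}
algo_hardcode <- function(data, target, value) {
  mutate(data, "{target}" := value)
}
algo_condition <- function(data, target, condition, value) {
  # Rows that do not meet the condition are left missing
  mutate(data, "{target}" := if_else({{ condition }}, value, NA_character_))
}
algo_assign_ct <- function(data, target, source, codelist) {
  # codelist is a named vector: names are raw values, values are coded terms
  mutate(data, "{target}" := unname(codelist[.data[[source]]]))
}

# Toy raw data (illustrative only)
raw <- tibble(
  Subject  = c("001", "002", "003"),
  Sex      = c("Male", "Female", "Male"),
  Severity = c(1, 3, 4)
)
sex_ct <- c(Male = "M", Female = "F")

out <- raw %>%
  algo_hardcode("DOMAIN", "LB") %>%
  algo_assign("USUBJID", "Subject") %>%
  algo_assign_ct("SEX", "Sex", sex_ct) %>%
  algo_condition("SEVFL", Severity >= 3, "Y")
print(out)
```

Because each helper does one thing, it can be tested in isolation and chained in any order, which is the modularity the table is describing.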
3 Package Loading
4 Part 1: Simulating Raw EDC Data
4.1 Raw Vital Signs Data
First, let’s create data that looks like it came from an Electronic Data Capture (EDC) system like Rave or Veeva.
# Simulated Raw Vital Signs (EDC export format)
raw_vs <- tribble(
~SubjectID, ~Site, ~Visit, ~Date, ~SysBP, ~DiaBP, ~Pulse, ~Temp_C,
"001", "101", "Screening", "2024-01-01", 120, 80, 72, 36.5,
"001", "101", "Baseline", "2024-01-15", 118, 78, 70, 36.8,
"001", "101", "Week 4", "2024-02-12", 115, 76, 68, 36.6,
"002", "101", "Screening", "2024-01-02", 130, 85, 88, 37.0,
"002", "101", "Baseline", "2024-01-16", 128, 82, 85, 36.7,
"003", "102", "Screening", "2024-01-03", 145, 92, 95, 36.9,
"003", "102", "Baseline", "2024-01-17", 140, 88, 90, 36.5
)
print(raw_vs)
# A tibble: 7 × 8
SubjectID Site Visit Date SysBP DiaBP Pulse Temp_C
<chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 001 101 Screening 2024-01-01 120 80 72 36.5
2 001 101 Baseline 2024-01-15 118 78 70 36.8
3 001 101 Week 4 2024-02-12 115 76 68 36.6
4 002 101 Screening 2024-01-02 130 85 88 37
5 002 101 Baseline 2024-01-16 128 82 85 36.7
6 003 102 Screening 2024-01-03 145 92 95 36.9
7 003 102 Baseline 2024-01-17 140 88 90 36.5
4.2 Raw Laboratory Data
Now let’s create lab data with values in different units that need standardization.
# Simulated Raw Lab Data (with different units per site)
raw_lb <- tribble(
~SubjectID, ~Site, ~Visit, ~Date, ~Test, ~Result, ~Unit,
# Site 101: Uses US units
"001", "101", "Screening", "2024-01-01", "Glucose", 95, "mg/dL",
"001", "101", "Screening", "2024-01-01", "Cholesterol", 180, "mg/dL",
"001", "101", "Screening", "2024-01-01", "ALT", 25, "U/L",
"001", "101", "Baseline", "2024-01-15", "Glucose", 92, "mg/dL",
"001", "101", "Baseline", "2024-01-15", "Cholesterol", 175, "mg/dL",
"001", "101", "Baseline", "2024-01-15", "ALT", 28, "U/L",
"002", "101", "Screening", "2024-01-02", "Glucose", 110, "mg/dL",
"002", "101", "Screening", "2024-01-02", "Cholesterol", 220, "mg/dL",
"002", "101", "Screening", "2024-01-02", "ALT", 45, "U/L",
# Site 102: Uses SI units
"003", "102", "Screening", "2024-01-03", "Glucose", 5.5, "mmol/L",
"003", "102", "Screening", "2024-01-03", "Cholesterol", 4.8, "mmol/L",
"003", "102", "Screening", "2024-01-03", "ALT", 30, "U/L",
"003", "102", "Baseline", "2024-01-17", "Glucose", 5.2, "mmol/L",
"003", "102", "Baseline", "2024-01-17", "Cholesterol", 4.5, "mmol/L",
"003", "102", "Baseline", "2024-01-17", "ALT", 32, "U/L"
)
print(raw_lb)
# A tibble: 15 × 7
SubjectID Site Visit Date Test Result Unit
<chr> <chr> <chr> <chr> <chr> <dbl> <chr>
1 001 101 Screening 2024-01-01 Glucose 95 mg/dL
2 001 101 Screening 2024-01-01 Cholesterol 180 mg/dL
3 001 101 Screening 2024-01-01 ALT 25 U/L
4 001 101 Baseline 2024-01-15 Glucose 92 mg/dL
5 001 101 Baseline 2024-01-15 Cholesterol 175 mg/dL
6 001 101 Baseline 2024-01-15 ALT 28 U/L
7 002 101 Screening 2024-01-02 Glucose 110 mg/dL
8 002 101 Screening 2024-01-02 Cholesterol 220 mg/dL
9 002 101 Screening 2024-01-02 ALT 45 U/L
10 003 102 Screening 2024-01-03 Glucose 5.5 mmol/L
11 003 102 Screening 2024-01-03 Cholesterol 4.8 mmol/L
12 003 102 Screening 2024-01-03 ALT 30 U/L
13 003 102 Baseline 2024-01-17 Glucose 5.2 mmol/L
14 003 102 Baseline 2024-01-17 Cholesterol 4.5 mmol/L
15 003 102 Baseline 2024-01-17 ALT 32 U/L
5 Part 2: Creating SDTM VS Domain
5.1 Step 1: Hardcode Standard Variables
# Algorithm: Hardcode
vs_step1 <- raw_vs %>%
mutate(
STUDYID = "DEMO-001",
DOMAIN = "VS",
# Algorithm: Assign (concatenate)
USUBJID = paste(STUDYID, Site, SubjectID, sep = "-")
)
head(vs_step1)
# A tibble: 6 × 11
SubjectID Site Visit Date SysBP DiaBP Pulse Temp_C STUDYID DOMAIN USUBJID
<chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr> <chr>
1 001 101 Screeni… 2024… 120 80 72 36.5 DEMO-0… VS DEMO-0…
2 001 101 Baseline 2024… 118 78 70 36.8 DEMO-0… VS DEMO-0…
3 001 101 Week 4 2024… 115 76 68 36.6 DEMO-0… VS DEMO-0…
4 002 101 Screeni… 2024… 130 85 88 37 DEMO-0… VS DEMO-0…
5 002 101 Baseline 2024… 128 82 85 36.7 DEMO-0… VS DEMO-0…
6 003 102 Screeni… 2024… 145 92 95 36.9 DEMO-0… VS DEMO-0…
5.2 Step 2: Pivot to Long Format (Findings Algorithm)
SDTM Findings domains are long: one row per subject, per test, per visit. The raw EDC export is wide, with one column per test, so it has to be pivoted.
# Algorithm: Transpose/Pivot
vs_step2 <- vs_step1 %>%
pivot_longer(
cols = c(SysBP, DiaBP, Pulse, Temp_C),
names_to = "RAW_TEST",
values_to = "VSORRES_NUM"
)
head(vs_step2, 10)
# A tibble: 10 × 9
SubjectID Site Visit Date STUDYID DOMAIN USUBJID RAW_TEST VSORRES_NUM
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <dbl>
1 001 101 Screening 2024-0… DEMO-0… VS DEMO-0… SysBP 120
2 001 101 Screening 2024-0… DEMO-0… VS DEMO-0… DiaBP 80
3 001 101 Screening 2024-0… DEMO-0… VS DEMO-0… Pulse 72
4 001 101 Screening 2024-0… DEMO-0… VS DEMO-0… Temp_C 36.5
5 001 101 Baseline 2024-0… DEMO-0… VS DEMO-0… SysBP 118
6 001 101 Baseline 2024-0… DEMO-0… VS DEMO-0… DiaBP 78
7 001 101 Baseline 2024-0… DEMO-0… VS DEMO-0… Pulse 70
8 001 101 Baseline 2024-0… DEMO-0… VS DEMO-0… Temp_C 36.8
9 001 101 Week 4 2024-0… DEMO-0… VS DEMO-0… SysBP 115
10 001 101 Week 4 2024-0… DEMO-0… VS DEMO-0… DiaBP 76
5.3 Step 3: Map to Controlled Terminology
# Algorithm: Assign CT (Map test codes)
vs_step3 <- vs_step2 %>%
mutate(
# Controlled Terminology mapping
VSTESTCD = case_when(
RAW_TEST == "SysBP" ~ "SYSBP",
RAW_TEST == "DiaBP" ~ "DIABP",
RAW_TEST == "Pulse" ~ "PULSE",
RAW_TEST == "Temp_C" ~ "TEMP"
),
VSTEST = case_when(
VSTESTCD == "SYSBP" ~ "Systolic Blood Pressure",
VSTESTCD == "DIABP" ~ "Diastolic Blood Pressure",
VSTESTCD == "PULSE" ~ "Pulse Rate",
VSTESTCD == "TEMP" ~ "Temperature"
),
VSORRESU = case_when(
VSTESTCD %in% c("SYSBP", "DIABP") ~ "mmHg",
VSTESTCD == "PULSE" ~ "BEATS/MIN",
VSTESTCD == "TEMP" ~ "C"
)
)
head(vs_step3)
# A tibble: 6 × 12
SubjectID Site Visit Date STUDYID DOMAIN USUBJID RAW_TEST VSORRES_NUM
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <dbl>
1 001 101 Screening 2024-01… DEMO-0… VS DEMO-0… SysBP 120
2 001 101 Screening 2024-01… DEMO-0… VS DEMO-0… DiaBP 80
3 001 101 Screening 2024-01… DEMO-0… VS DEMO-0… Pulse 72
4 001 101 Screening 2024-01… DEMO-0… VS DEMO-0… Temp_C 36.5
5 001 101 Baseline 2024-01… DEMO-0… VS DEMO-0… SysBP 118
6 001 101 Baseline 2024-01… DEMO-0… VS DEMO-0… DiaBP 78
# ℹ 3 more variables: VSTESTCD <chr>, VSTEST <chr>, VSORRESU <chr>
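An alternative to chained `case_when()` calls is to hold the mapping as a lookup table and join it on. The sketch below assumes a hypothetical `vs_ct` table whose values are copied from the mapping above, and a small stand-in dataset (`demo_long`) that deliberately includes one raw test name with no mapping, to show how unmapped values can be detected.

```r
library(dplyr)
library(tibble)

# Hypothetical codelist table; values copied from the case_when() mapping above
vs_ct <- tribble(
  ~RAW_TEST, ~VSTESTCD, ~VSTEST,                    ~VSORRESU,
  "SysBP",   "SYSBP",   "Systolic Blood Pressure",  "mmHg",
  "DiaBP",   "DIABP",   "Diastolic Blood Pressure", "mmHg",
  "Pulse",   "PULSE",   "Pulse Rate",               "BEATS/MIN",
  "Temp_C",  "TEMP",    "Temperature",              "C"
)

# Small stand-in for the pivoted data, with one unexpected raw test name
demo_long <- tibble(
  RAW_TEST    = c("SysBP", "Pulse", "Weight"),
  VSORRES_NUM = c(120, 72, 80)
)

mapped <- left_join(demo_long, vs_ct, by = "RAW_TEST")

# Raw test names without a codelist entry come through as NA and can be listed
unmapped <- mapped %>%
  filter(is.na(VSTESTCD)) %>%
  distinct(RAW_TEST)
print(unmapped)
```

With `case_when()`, an unexpected raw name also yields `NA`, but the join version keeps the mapping in one reviewable table and makes the check for gaps a one-liner.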
5.4 Step 4: Add Sequence Number
Every SDTM record needs a sequence number (--SEQ) that is unique within each subject in the domain.
# Algorithm: Derive sequence
vs_step4 <- vs_step3 %>%
arrange(USUBJID, Date, VSTESTCD) %>%
group_by(USUBJID) %>%
mutate(VSSEQ = row_number()) %>%
ungroup()
vs_step4 %>%
select(USUBJID, VSSEQ, VSTESTCD, Visit) %>%
head(10)
# A tibble: 10 × 4
USUBJID VSSEQ VSTESTCD Visit
<chr> <int> <chr> <chr>
1 DEMO-001-101-001 1 DIABP Screening
2 DEMO-001-101-001 2 PULSE Screening
3 DEMO-001-101-001 3 SYSBP Screening
4 DEMO-001-101-001 4 TEMP Screening
5 DEMO-001-101-001 5 DIABP Baseline
6 DEMO-001-101-001 6 PULSE Baseline
7 DEMO-001-101-001 7 SYSBP Baseline
8 DEMO-001-101-001 8 TEMP Baseline
9 DEMO-001-101-001 9 DIABP Week 4
10 DEMO-001-101-001 10 PULSE Week 4
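A quick way to confirm the sequence derivation behaved as intended is to check that no subject has a repeated sequence number. The sketch below uses a small made-up dataset (`demo`) rather than the tutorial data, and repeats the same arrange/group/row_number pattern before running the check.

```r
library(dplyr)
library(tibble)

# Made-up records, deliberately out of order
demo <- tibble(
  USUBJID  = c("S-2", "S-1", "S-2", "S-1", "S-2"),
  VSDTC    = c("2024-01-02", "2024-01-01", "2024-01-02", "2024-01-01", "2024-01-16"),
  VSTESTCD = c("SYSBP", "SYSBP", "DIABP", "DIABP", "DIABP")
)

with_seq <- demo %>%
  arrange(USUBJID, VSDTC, VSTESTCD) %>%
  group_by(USUBJID) %>%
  mutate(VSSEQ = row_number()) %>%
  ungroup()

# Any USUBJID/VSSEQ pair that appears more than once is a defect
dupes <- with_seq %>%
  count(USUBJID, VSSEQ) %>%
  filter(n > 1)

stopifnot(nrow(dupes) == 0)
print(with_seq)
```

Sorting before numbering matters: the sort keys decide which record gets which number, so a fixed key order keeps the numbering reproducible between runs.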
5.5 Step 5: Final SDTM VS Domain
sdtm_vs <- vs_step4 %>%
mutate(
VSORRES = as.character(VSORRES_NUM),
VSSTRESN = VSORRES_NUM,
VSSTRESU = VSORRESU,
VSDTC = Date,
VISIT = Visit
) %>%
select(
STUDYID, DOMAIN, USUBJID, VSSEQ, VSTESTCD, VSTEST,
VSORRES, VSORRESU, VSSTRESN, VSSTRESU, VSDTC, VISIT
)
cat("SDTM VS Domain:\n")
SDTM VS Domain:
cat("Records:", nrow(sdtm_vs), "\n\n")
Records: 28
head(sdtm_vs, 10)
# A tibble: 10 × 12
STUDYID DOMAIN USUBJID VSSEQ VSTESTCD VSTEST VSORRES VSORRESU VSSTRESN
<chr> <chr> <chr> <int> <chr> <chr> <chr> <chr> <dbl>
1 DEMO-001 VS DEMO-001-101… 1 DIABP Diast… 80 mmHg 80
2 DEMO-001 VS DEMO-001-101… 2 PULSE Pulse… 72 BEATS/M… 72
3 DEMO-001 VS DEMO-001-101… 3 SYSBP Systo… 120 mmHg 120
4 DEMO-001 VS DEMO-001-101… 4 TEMP Tempe… 36.5 C 36.5
5 DEMO-001 VS DEMO-001-101… 5 DIABP Diast… 78 mmHg 78
6 DEMO-001 VS DEMO-001-101… 6 PULSE Pulse… 70 BEATS/M… 70
7 DEMO-001 VS DEMO-001-101… 7 SYSBP Systo… 118 mmHg 118
8 DEMO-001 VS DEMO-001-101… 8 TEMP Tempe… 36.8 C 36.8
9 DEMO-001 VS DEMO-001-101… 9 DIABP Diast… 76 mmHg 76
10 DEMO-001 VS DEMO-001-101… 10 PULSE Pulse… 68 BEATS/M… 68
# ℹ 3 more variables: VSSTRESU <chr>, VSDTC <chr>, VISIT <chr>
6 Part 3: Creating SDTM LB Domain with Unit Standardization
The LB domain is more complex because we need to standardize units across sites.
6.1 Unit Conversion Reference
| Test | Original Unit | Standard Unit | Conversion Factor |
|---|---|---|---|
| Glucose | mg/dL | mmol/L | ÷ 18.0182 |
| Cholesterol | mg/dL | mmol/L | ÷ 38.67 |
| ALT | U/L | U/L | None (unit unchanged) |
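The conversion table can also be expressed as data and joined onto the results, which keeps the factors in one place. This is a minimal sketch: the `conv` table and `demo_lb` rows are hypothetical, the divisors are the ones listed above, and rows that are already in the standard unit use a divisor of 1.

```r
library(dplyr)
library(tibble)

# Hypothetical conversion table; divisors taken from the reference table above
conv <- tribble(
  ~LBTESTCD, ~LBORRESU, ~LBSTRESU, ~divisor,
  "GLUC",    "mg/dL",   "mmol/L",  18.0182,
  "GLUC",    "mmol/L",  "mmol/L",  1,
  "CHOL",    "mg/dL",   "mmol/L",  38.67,
  "CHOL",    "mmol/L",  "mmol/L",  1,
  "ALT",     "U/L",     "U/L",     1
)

# A few illustrative results in mixed units
demo_lb <- tibble(
  LBTESTCD = c("GLUC", "GLUC", "CHOL", "ALT"),
  Result   = c(95, 5.5, 180, 25),
  LBORRESU = c("mg/dL", "mmol/L", "mg/dL", "U/L")
)

converted <- demo_lb %>%
  left_join(conv, by = c("LBTESTCD", "LBORRESU")) %>%
  mutate(LBSTRESN = round(Result / divisor, 2))
print(converted)
```

For example, 95 mg/dL of glucose becomes 95 / 18.0182 ≈ 5.27 mmol/L. A test/unit combination that is missing from `conv` ends up with `NA` in `divisor`, which makes unexpected units easy to spot.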
6.2 Step 1: Initial Mapping
lb_step1 <- raw_lb %>%
mutate(
STUDYID = "DEMO-001",
DOMAIN = "LB",
USUBJID = paste(STUDYID, Site, SubjectID, sep = "-"),
# Map test codes
LBTESTCD = case_when(
Test == "Glucose" ~ "GLUC",
Test == "Cholesterol" ~ "CHOL",
Test == "ALT" ~ "ALT"
),
LBTEST = case_when(
LBTESTCD == "GLUC" ~ "Glucose",
LBTESTCD == "CHOL" ~ "Cholesterol",
LBTESTCD == "ALT" ~ "Alanine Aminotransferase"
),
# Original results
LBORRES = as.character(Result),
LBORRESU = Unit
)
head(lb_step1)
# A tibble: 6 × 14
SubjectID Site Visit Date Test Result Unit STUDYID DOMAIN USUBJID LBTESTCD
<chr> <chr> <chr> <chr> <chr> <dbl> <chr> <chr> <chr> <chr> <chr>
1 001 101 Scre… 2024… Gluc… 95 mg/dL DEMO-0… LB DEMO-0… GLUC
2 001 101 Scre… 2024… Chol… 180 mg/dL DEMO-0… LB DEMO-0… CHOL
3 001 101 Scre… 2024… ALT 25 U/L DEMO-0… LB DEMO-0… ALT
4 001 101 Base… 2024… Gluc… 92 mg/dL DEMO-0… LB DEMO-0… GLUC
5 001 101 Base… 2024… Chol… 175 mg/dL DEMO-0… LB DEMO-0… CHOL
6 001 101 Base… 2024… ALT 28 U/L DEMO-0… LB DEMO-0… ALT
# ℹ 3 more variables: LBTEST <chr>, LBORRES <chr>, LBORRESU <chr>
6.3 Step 2: Unit Standardization
This is where the real work happens: every result is converted to a single standard unit per test (mmol/L for glucose and cholesterol, U/L for ALT).
lb_step2 <- lb_step1 %>%
mutate(
# Standard unit is SI (mmol/L for glucose/cholesterol, U/L for ALT)
LBSTRESU = case_when(
LBTESTCD %in% c("GLUC", "CHOL") ~ "mmol/L",
LBTESTCD == "ALT" ~ "U/L"
),
# Convert to standard units
LBSTRESN = case_when(
# Glucose: mg/dL to mmol/L
LBTESTCD == "GLUC" & LBORRESU == "mg/dL" ~ round(Result / 18.0182, 2),
LBTESTCD == "GLUC" & LBORRESU == "mmol/L" ~ Result,
# Cholesterol: mg/dL to mmol/L
LBTESTCD == "CHOL" & LBORRESU == "mg/dL" ~ round(Result / 38.67, 2),
LBTESTCD == "CHOL" & LBORRESU == "mmol/L" ~ Result,
# ALT: Already in U/L
LBTESTCD == "ALT" ~ Result
),
# Character version of standardized result
LBSTRESC = as.character(LBSTRESN)
)
# Show the conversion
lb_step2 %>%
select(USUBJID, LBTESTCD, LBORRES, LBORRESU, LBSTRESN, LBSTRESU) %>%
head(10)
# A tibble: 10 × 6
USUBJID LBTESTCD LBORRES LBORRESU LBSTRESN LBSTRESU
<chr> <chr> <chr> <chr> <dbl> <chr>
1 DEMO-001-101-001 GLUC 95 mg/dL 5.27 mmol/L
2 DEMO-001-101-001 CHOL 180 mg/dL 4.65 mmol/L
3 DEMO-001-101-001 ALT 25 U/L 25 U/L
4 DEMO-001-101-001 GLUC 92 mg/dL 5.11 mmol/L
5 DEMO-001-101-001 CHOL 175 mg/dL 4.53 mmol/L
6 DEMO-001-101-001 ALT 28 U/L 28 U/L
7 DEMO-001-101-002 GLUC 110 mg/dL 6.1 mmol/L
8 DEMO-001-101-002 CHOL 220 mg/dL 5.69 mmol/L
9 DEMO-001-101-002 ALT 45 U/L 45 U/L
10 DEMO-001-102-003 GLUC 5.5 mmol/L 5.5 mmol/L
Notice how subjects from Site 101 (US units) and Site 102 (SI units) now have comparable values in LBSTRESN. This is essential for cross-site analysis!
6.4 Step 3: Add Sequence and Reference Ranges
lb_step3 <- lb_step2 %>%
arrange(USUBJID, Date, LBTESTCD) %>%
group_by(USUBJID) %>%
mutate(LBSEQ = row_number()) %>%
ungroup() %>%
# Add reference ranges (in standard units)
mutate(
LBSTNRLO = case_when(
LBTESTCD == "GLUC" ~ 3.9,
LBTESTCD == "CHOL" ~ 0.0,
LBTESTCD == "ALT" ~ 7.0
),
LBSTNRHI = case_when(
LBTESTCD == "GLUC" ~ 5.6,
LBTESTCD == "CHOL" ~ 5.2,
LBTESTCD == "ALT" ~ 56.0
),
# Normal range indicator (left missing when there is no standardized result)
LBNRIND = case_when(
is.na(LBSTRESN) ~ NA_character_,
LBSTRESN < LBSTNRLO ~ "LOW",
LBSTRESN > LBSTNRHI ~ "HIGH",
TRUE ~ "NORMAL"
)
)
lb_step3 %>%
select(USUBJID, LBTESTCD, LBSTRESN, LBSTNRLO, LBSTNRHI, LBNRIND) %>%
head(10)
# A tibble: 10 × 6
USUBJID LBTESTCD LBSTRESN LBSTNRLO LBSTNRHI LBNRIND
<chr> <chr> <dbl> <dbl> <dbl> <chr>
1 DEMO-001-101-001 ALT 25 7 56 NORMAL
2 DEMO-001-101-001 CHOL 4.65 0 5.2 NORMAL
3 DEMO-001-101-001 GLUC 5.27 3.9 5.6 NORMAL
4 DEMO-001-101-001 ALT 28 7 56 NORMAL
5 DEMO-001-101-001 CHOL 4.53 0 5.2 NORMAL
6 DEMO-001-101-001 GLUC 5.11 3.9 5.6 NORMAL
7 DEMO-001-101-002 ALT 45 7 56 NORMAL
8 DEMO-001-101-002 CHOL 5.69 0 5.2 HIGH
9 DEMO-001-101-002 GLUC 6.1 3.9 5.6 HIGH
10 DEMO-001-102-003 ALT 30 7 56 NORMAL
6.5 Step 4: Final SDTM LB Domain
sdtm_lb <- lb_step3 %>%
mutate(
LBDTC = Date,
VISIT = Visit
) %>%
select(
STUDYID, DOMAIN, USUBJID, LBSEQ, LBTESTCD, LBTEST,
LBORRES, LBORRESU, LBSTRESC, LBSTRESN, LBSTRESU,
LBSTNRLO, LBSTNRHI, LBNRIND, LBDTC, VISIT
)
cat("SDTM LB Domain:\n")
SDTM LB Domain:
cat("Records:", nrow(sdtm_lb), "\n")
Records: 15
cat("Variables:", ncol(sdtm_lb), "\n\n")
Variables: 16
head(sdtm_lb, 10)
# A tibble: 10 × 16
STUDYID DOMAIN USUBJID LBSEQ LBTESTCD LBTEST LBORRES LBORRESU LBSTRESC
<chr> <chr> <chr> <int> <chr> <chr> <chr> <chr> <chr>
1 DEMO-001 LB DEMO-001-101… 1 ALT Alani… 25 U/L 25
2 DEMO-001 LB DEMO-001-101… 2 CHOL Chole… 180 mg/dL 4.65
3 DEMO-001 LB DEMO-001-101… 3 GLUC Gluco… 95 mg/dL 5.27
4 DEMO-001 LB DEMO-001-101… 4 ALT Alani… 28 U/L 28
5 DEMO-001 LB DEMO-001-101… 5 CHOL Chole… 175 mg/dL 4.53
6 DEMO-001 LB DEMO-001-101… 6 GLUC Gluco… 92 mg/dL 5.11
7 DEMO-001 LB DEMO-001-101… 1 ALT Alani… 45 U/L 45
8 DEMO-001 LB DEMO-001-101… 2 CHOL Chole… 220 mg/dL 5.69
9 DEMO-001 LB DEMO-001-101… 3 GLUC Gluco… 110 mg/dL 6.1
10 DEMO-001 LB DEMO-001-102… 1 ALT Alani… 30 U/L 30
# ℹ 7 more variables: LBSTRESN <dbl>, LBSTRESU <chr>, LBSTNRLO <dbl>,
# LBSTNRHI <dbl>, LBNRIND <chr>, LBDTC <chr>, VISIT <chr>
6.6 Summary Statistics
sdtm_lb %>%
group_by(LBTESTCD, LBTEST) %>%
summarise(
N = n(),
Mean = round(mean(LBSTRESN, na.rm = TRUE), 2),
SD = round(sd(LBSTRESN, na.rm = TRUE), 2),
Low = sum(LBNRIND == "LOW", na.rm = TRUE),
Normal = sum(LBNRIND == "NORMAL", na.rm = TRUE),
High = sum(LBNRIND == "HIGH", na.rm = TRUE),
.groups = "drop"
)
# A tibble: 3 × 8
LBTESTCD LBTEST N Mean SD Low Normal High
<chr> <chr> <int> <dbl> <dbl> <int> <int> <int>
1 ALT Alanine Aminotransferase 5 32 7.71 0 5 0
2 CHOL Cholesterol 5 4.83 0.49 0 4 1
3 GLUC Glucose 5 5.44 0.4 0 4 1
7 🎯 Practice Exercise
7.1 Your Turn: Add LOINC Codes
LOINC codes are standard lab test identifiers. Add LBLOINC to the LB domain.
# LOINC Reference
loinc_lookup <- tribble(
~LBTESTCD, ~LBLOINC,
"GLUC", "2345-7",
"CHOL", "2093-3",
"ALT", "1742-6"
)
# TODO: Join the LOINC codes to sdtm_lb
sdtm_lb_loinc <- sdtm_lb %>%
# Your code here...
head(sdtm_lb_loinc)
8 Deliverable Summary
Today you completed the following:
| Task | Status |
|---|---|
| Created simulated Raw VS and LB data | ✓ Done |
| Built SDTM VS with sequence numbers | ✓ Done |
| Performed unit standardization (mg/dL → mmol/L) | ✓ Done |
| Added reference ranges and normal indicators | ✓ Done |
| Created complete SDTM LB domain | ✓ Done |
9 Key Takeaways
- sdtm.oak Philosophy: Algorithm-based, modular, traceable.
- Pivoting: Raw data is wide; SDTM Findings are long.
- Unit Standardization: Critical for multi-site studies.
- Sequence Numbers: Every record needs a unique --SEQ.
- Reference Ranges: LBSTNRLO, LBSTNRHI, and LBNRIND are key for flagging abnormal results.
10 Resources
11 What’s Next?
In Day 7, we will complete the Week 1 Capstone:
- Build DM, AE, EX domains from scratch (20+ subjects)
- Apply all concepts learned this week
- Export to submission-ready .xpt files