# Install packages if not already installed
if (!requireNamespace("dplyr", quietly = TRUE)) suppressMessages(install.packages("dplyr"))
if (!requireNamespace("tidyr", quietly = TRUE)) suppressMessages(install.packages("tidyr"))
if (!requireNamespace("pharmaversesdtm", quietly = TRUE)) suppressMessages(install.packages("pharmaversesdtm"))
Day 8: Complex SDTM Domains - LB (Lab Results)
Findings Class with Unit Standardization
1 Learning Objectives
By the end of Day 8, you will be able to:
- Understand the LB (Laboratory) domain structure and its role as a Findings class domain
- Identify and explain the key LB variables: LBTESTCD, LBORRES, LBORRESU, LBSTRESC, LBSTRESN, LBSTRESU
- Perform unit conversions (e.g., mg/dL → mmol/L) for standardized results
- Derive reference range flags (LBNRIND) and understand normal range variables
- Create a complete LB domain from simulated raw data using dplyr and tidyr
2 Introduction to the LB Domain
2.1 What is the LB Domain?
The LB (Laboratory) domain is one of the most complex and data-rich domains in SDTM. It belongs to the Findings class of SDTM domains, which are designed to capture observations or measurements collected during a study.
Findings domains share a common structure:
- --TESTCD: Short name for the test (e.g., “ALT”, “GLUC”)
- --TEST: Full descriptive name (e.g., “Alanine Aminotransferase”, “Glucose”)
- --ORRES: Original result as collected (character)
- --ORRESU: Original unit of the result
- --STRESC: Standardized result in character format
- --STRESN: Standardized result in numeric format
- --STRESU: Standardized unit
2.2 Why is LB Important?
Laboratory data is fundamental to clinical trials because:
- It provides objective measurements of safety (liver enzymes, kidney function)
- It helps assess efficacy (biomarkers, disease markers)
- Regulatory agencies require standardized lab data for safety reviews
- Lab abnormalities often determine eligibility and safety signals
2.3 The LB Domain Structure
┌──────────────────────────────────────────────────────────────────────────────┐
│ LB DOMAIN - KEY VARIABLES │
├──────────────────────────────────────────────────────────────────────────────┤
│ IDENTIFIER VARIABLES │
│ STUDYID = Study identifier │
│ USUBJID = Unique subject identifier (links to DM) │
│ LBSEQ = Sequence number within subject │
├──────────────────────────────────────────────────────────────────────────────┤
│ TOPIC VARIABLE AND QUALIFIERS │
│ LBTESTCD = Lab test short name (e.g., "ALT", "BILI", "GLUC") │
│ LBTEST = Lab test full name (e.g., "Alanine Aminotransferase") │
│ LBCAT = Category (e.g., "CHEMISTRY", "HEMATOLOGY") │
├──────────────────────────────────────────────────────────────────────────────┤
│ RESULT VARIABLES (Original) │
│ LBORRES = Result as originally collected (character) │
│ LBORRESU = Unit of original result │
├──────────────────────────────────────────────────────────────────────────────┤
│ RESULT VARIABLES (Standardized) │
│ LBSTRESC = Standardized result (character) │
│ LBSTRESN = Standardized result (numeric) │
│ LBSTRESU = Standardized unit │
├──────────────────────────────────────────────────────────────────────────────┤
│ REFERENCE RANGE VARIABLES │
│ LBSTNRLO = Standard normal range lower limit │
│ LBSTNRHI = Standard normal range upper limit │
│ LBNRIND = Normal range indicator (LOW/NORMAL/HIGH) │
├──────────────────────────────────────────────────────────────────────────────┤
│ TIMING VARIABLES │
│ VISITNUM = Visit number │
│ VISIT = Visit name (e.g., "SCREENING", "WEEK 2") │
│ LBDTC = Date/time of specimen collection (ISO 8601) │
│ LBDY = Study day of specimen collection │
└──────────────────────────────────────────────────────────────────────────────┘
3 Package Installation & Loading
3.1 Required Packages
| Package | Purpose |
|---|---|
| dplyr | Data manipulation (filter, mutate, joins) |
| tidyr | Data reshaping (pivot operations) |
| pharmaversesdtm | Example SDTM datasets including LB |
3.2 Install Packages (if needed)
3.3 Load Packages
library(dplyr)
library(tidyr)
library(pharmaversesdtm)
4 Exploring pharmaversesdtm LB Data
Let’s start by loading and exploring the LB domain from pharmaversesdtm. This test dataset is based on the CDISC pilot study (note the CDISCPILOT01 study identifier below) and is widely used across pharmaverse packages such as admiral for examples and testing.
4.1 Load the LB Domain
# Load LB domain from pharmaversesdtm
data("lb", package = "pharmaversesdtm")
# Quick overview
cat("LB domain dimensions:", nrow(lb), "rows x", ncol(lb), "columns\n")
LB domain dimensions: 59580 rows x 23 columns
cat("Number of unique subjects:", n_distinct(lb$USUBJID), "\n")
Number of unique subjects: 254
cat("Number of unique tests:", n_distinct(lb$LBTESTCD), "\n")
Number of unique tests: 47
4.2 Explore LB Structure
# View the structure of the LB domain
dplyr::glimpse(lb)
Rows: 59,580
Columns: 23
$ STUDYID <chr> "CDISCPILOT01", "CDISCPILOT01", "CDISCPILOT01", "CDISCPILOT01…
$ DOMAIN <chr> "LB", "LB", "LB", "LB", "LB", "LB", "LB", "LB", "LB", "LB", "…
$ USUBJID <chr> "01-701-1015", "01-701-1015", "01-701-1015", "01-701-1015", "…
$ LBSEQ <dbl> 1, 39, 74, 104, 134, 164, 199, 229, 259, 294, 2, 40, 75, 105,…
$ LBTESTCD <chr> "ALB", "ALB", "ALB", "ALB", "ALB", "ALB", "ALB", "ALB", "ALB"…
$ LBTEST <chr> "Albumin", "Albumin", "Albumin", "Albumin", "Albumin", "Album…
$ LBCAT <chr> "CHEMISTRY", "CHEMISTRY", "CHEMISTRY", "CHEMISTRY", "CHEMISTR…
$ LBORRES <chr> "3.8", "3.9", "3.8", "3.7", "3.8", "3.8", "3.7", "3.7", "3.8"…
$ LBORRESU <chr> "g/dL", "g/dL", "g/dL", "g/dL", "g/dL", "g/dL", "g/dL", "g/dL…
$ LBORNRLO <chr> "3.3", "3.3", "3.3", "3.3", "3.3", "3.3", "3.3", "3.3", "3.3"…
$ LBORNRHI <chr> "4.9", "4.9", "4.9", "4.9", "4.9", "4.9", "4.9", "4.9", "4.9"…
$ LBSTRESC <chr> "38", "39", "38", "37", "38", "38", "37", "37", "38", "38", "…
$ LBSTRESN <dbl> 38, 39, 38, 37, 38, 38, 37, 37, 38, 38, 34, 50, 41, 43, 47, 5…
$ LBSTRESU <chr> "g/L", "g/L", "g/L", "g/L", "g/L", "g/L", "g/L", "g/L", "g/L"…
$ LBSTNRLO <dbl> 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 35, 35, 35, 35, 35, 3…
$ LBSTNRHI <dbl> 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 115, 115, 115, 115, 1…
$ LBNRIND <chr> "NORMAL", "NORMAL", "NORMAL", "NORMAL", "NORMAL", "NORMAL", "…
$ LBBLFL <chr> "Y", NA, NA, NA, NA, NA, NA, NA, NA, NA, "Y", NA, NA, NA, NA,…
$ VISITNUM <dbl> 1, 4, 5, 7, 8, 9, 10, 11, 12, 13, 1, 4, 5, 7, 8, 9, 10, 11, 1…
$ VISIT <chr> "SCREENING 1", "WEEK 2", "WEEK 4", "WEEK 6", "WEEK 8", "WEEK …
$ VISITDY <dbl> -7, 14, 28, 42, 56, 84, 112, 140, 168, 182, -7, 14, 28, 42, 5…
$ LBDTC <chr> "2013-12-26T14:45", "2014-01-16T13:17", "2014-01-30T08:50", "…
$ LBDY <dbl> -7, 15, 29, 42, 63, 84, 126, 140, 168, 182, -7, 15, 29, 42, 6…
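The LBDY values above are consistent with the usual SDTM study-day rule: there is no day 0, so dates on or after the reference start date count from 1, while earlier dates use the plain difference. The sketch below reproduces the first three values in base R. The helper `derive_study_day` is hypothetical, and the reference date of 2014-01-02 is an assumption inferred from the LBDTC/LBDY pairs shown above rather than read from DM.

```r
# Hypothetical helper: SDTM-style study day (there is no day 0)
derive_study_day <- function(dtc, ref_dtc) {
  # Keep only the date part of the ISO 8601 string (first 10 characters)
  dt  <- as.Date(substr(dtc, 1, 10))
  ref <- as.Date(substr(ref_dtc, 1, 10))
  diff_days <- as.integer(dt - ref)
  # On or after the reference date: add 1; before it: use the raw difference
  ifelse(diff_days >= 0, diff_days + 1L, diff_days)
}

# First three LBDTC values from the glimpse() output above
lbdtc <- c("2013-12-26T14:45", "2014-01-16T13:17", "2014-01-30T08:50")

# Assumed reference start date for subject 01-701-1015 (inferred, not from DM)
lbdy <- derive_study_day(lbdtc, "2014-01-02")
lbdy
```

With the assumed reference date this gives -7, 15, and 29, matching the first three LBDY values. The sketch expects complete dates; partial dates such as "2014-01" would need separate handling.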
4.3 Key LB Variables Explained
Let’s examine some critical LB variables:
# View unique lab tests
lb %>%
distinct(LBTESTCD, LBTEST, LBCAT) %>%
arrange(LBCAT, LBTESTCD) %>%
head(15)
# A tibble: 15 × 3
LBTESTCD LBTEST LBCAT
<chr> <chr> <chr>
1 ALB Albumin CHEMISTRY
2 ALP Alkaline Phosphatase CHEMISTRY
3 ALT Alanine Aminotransferase CHEMISTRY
4 AST Aspartate Aminotransferase CHEMISTRY
5 BILI Bilirubin CHEMISTRY
6 BUN Blood Urea Nitrogen CHEMISTRY
7 CA Calcium CHEMISTRY
8 CHOL Cholesterol CHEMISTRY
9 CK Creatine Kinase CHEMISTRY
10 CL Chloride CHEMISTRY
11 CREAT Creatinine CHEMISTRY
12 GGT Gamma Glutamyl Transferase CHEMISTRY
13 GLUC Glucose CHEMISTRY
14 K Potassium CHEMISTRY
15 PHOS Phosphate CHEMISTRY
Notice how LBTESTCD is a short, standardized code while LBTEST is the full descriptive name. This follows CDISC conventions where:
- Short codes (LBTESTCD) enable efficient data processing
- Full names (LBTEST) provide human-readable descriptions
- Categories (LBCAT) group related tests (CHEMISTRY, HEMATOLOGY, etc.)
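Short codes are only useful if they follow the naming rules for --TESTCD values: at most 8 characters, letters, digits, and underscores only, and no leading digit. A minimal base R check is sketched below. The function name `is_valid_testcd` is hypothetical, and requiring uppercase letters is an assumption based on how CDISC controlled terminology codes are written.

```r
# Hypothetical check for --TESTCD naming rules:
# at most 8 characters, letters/digits/underscore only, not starting with a digit
is_valid_testcd <- function(x) {
  grepl("^[A-Z_][A-Z0-9_]{0,7}$", x)
}

# A mix of acceptable and unacceptable codes (illustrative values)
codes <- c("ALT", "GLUC", "SODIUM", "2HGLUC", "CREATININE", "alt")
testcd_check <- data.frame(LBTESTCD = codes, valid = is_valid_testcd(codes))
testcd_check
```

Here "2HGLUC" fails because it starts with a digit, "CREATININE" because it is longer than 8 characters, and "alt" because of the uppercase assumption.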
5 Understanding Original vs Standardized Results
One of the most important concepts in the LB domain is the distinction between original and standardized results.
5.1 Why Standardization Matters
Labs from different sites may report the same test in different units:
- Site A reports Glucose as 90 mg/dL
- Site B reports Glucose as 5.0 mmol/L
Both values describe essentially the same glucose concentration, just in different units. SDTM requires us to standardize these values so they can be analyzed together.
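The arithmetic behind the two site values can be checked directly. This is a small sketch using the commonly quoted glucose factor of 18.02 (mg/dL per mmol/L); the variable names are illustrative only.

```r
# Site A reports in mg/dL, Site B in mmol/L
site_a_mgdl  <- 90
site_b_mmoll <- 5.0

# Convert Site A's value to mmol/L
site_a_mmoll <- site_a_mgdl / 18.02
round(site_a_mmoll, 2)

# The two sites agree to within rounding
abs(site_a_mmoll - site_b_mmoll) < 0.05
```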
5.2 The Result Variable Pairs
| Original Variable | Standardized Variable | Description |
|---|---|---|
| LBORRES | LBSTRESC | Result (character format) |
| - | LBSTRESN | Result (numeric format) |
| LBORRESU | LBSTRESU | Unit of measurement |
5.3 Viewing Original vs Standardized Values
# Compare original and standardized values for a specific test
lb %>%
filter(LBTESTCD == "GLUC") %>%
select(USUBJID, VISIT, LBORRES, LBORRESU, LBSTRESC, LBSTRESN, LBSTRESU) %>%
head(10)
# A tibble: 10 × 7
USUBJID VISIT LBORRES LBORRESU LBSTRESC LBSTRESN LBSTRESU
<chr> <chr> <chr> <chr> <chr> <dbl> <chr>
1 01-701-1015 SCREENING 1 85 mg/dL 4.71835 4.72 mmol/L
2 01-701-1015 WEEK 2 84 mg/dL 4.66284 4.66 mmol/L
3 01-701-1015 WEEK 4 79 mg/dL 4.38529 4.39 mmol/L
4 01-701-1015 WEEK 6 92 mg/dL 5.10692 5.11 mmol/L
5 01-701-1015 WEEK 8 82 mg/dL 4.55182 4.55 mmol/L
6 01-701-1015 WEEK 12 87 mg/dL 4.82937 4.83 mmol/L
7 01-701-1015 WEEK 16 86 mg/dL 4.77386 4.77 mmol/L
8 01-701-1015 WEEK 20 88 mg/dL 4.88488 4.88 mmol/L
9 01-701-1015 WEEK 24 81 mg/dL 4.49631 4.50 mmol/L
10 01-701-1015 WEEK 26 92 mg/dL 5.10692 5.11 mmol/L
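One way to see how a dataset was standardized is to back out the conversion factor from the original and standardized columns. The sketch below does this in base R with values copied by hand from the first three rows of the output above, so it runs without loading the package.

```r
# Values copied from the first three rows of the output above
lborres  <- c("85", "84", "79")                 # original results, mg/dL
lbstresc <- c("4.71835", "4.66284", "4.38529")  # standardized results, mmol/L

# Implied multiplier from mg/dL to mmol/L
implied_factor <- as.numeric(lbstresc) / as.numeric(lborres)
round(implied_factor, 5)
```

For these rows the implied multiplier is about 0.05551 (roughly 1/18.016), which is very close to, but not identical with, the 18.02 divisor often quoted for glucose. Differences like this are worth documenting when a study defines its own conversion factors.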
6 Unit Conversion: A Hands-On Example
Let’s create a practical example of unit conversion. This is one of the most common tasks when building an LB domain from raw lab data.
6.1 Common Lab Unit Conversions
| Test | Original Unit | Standard Unit | Conversion Factor |
|---|---|---|---|
| Glucose | mg/dL | mmol/L | ÷ 18.02 |
| Creatinine | mg/dL | µmol/L | × 88.42 |
| Bilirubin | mg/dL | µmol/L | × 17.10 |
| Cholesterol | mg/dL | mmol/L | ÷ 38.67 |
For a comprehensive list of clinical lab unit conversions, see the NIH Unit Conversion Tables listed under Resources.
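The factors in the table are not arbitrary: each follows from the analyte's molar mass. A sketch of the derivation is shown below, assuming approximate molar masses (in g/mol) that are not part of the original material. Going from mg/dL to mmol/L means multiplying by 10 (dL to L) and dividing by the molar mass; going to µmol/L adds a further factor of 1000.

```r
# Approximate molar masses in g/mol (assumed reference values)
molar_mass <- c(GLUC = 180.16, CREAT = 113.12, BILI = 584.66, CHOL = 386.65)

# mg/dL -> mmol/L: multiply by 10 (dL -> L), divide by molar mass (mg per mmol)
to_mmol_per_l <- 10 / molar_mass
# mg/dL -> umol/L: a further factor of 1000
to_umol_per_l <- 10000 / molar_mass

# Divisors for glucose and cholesterol (mg/dL -> mmol/L)
round(1 / to_mmol_per_l[c("GLUC", "CHOL")], 2)
# Multipliers for creatinine and bilirubin (mg/dL -> umol/L)
round(to_umol_per_l[c("CREAT", "BILI")], 2)
```

The results land within a fraction of a percent of the table's factors. Small differences (for example in the second decimal for creatinine) come from the molar mass and rounding chosen, which is one reason a study should fix its factors in the specification.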
6.2 Step-by-Step: Creating Standardized Results
Let’s simulate a scenario where we receive raw lab data and need to standardize it.
# Create simulated raw lab data with mixed units
raw_lab <- tibble::tribble(
~USUBJID, ~LBTESTCD, ~LBTEST, ~LBORRES, ~LBORRESU, ~VISITNUM, ~VISIT,
"CDISC01-001-001", "GLUC", "Glucose", "95", "mg/dL", 1, "BASELINE",
"CDISC01-001-001", "GLUC", "Glucose", "5.1", "mmol/L", 2, "WEEK 2",
"CDISC01-001-001", "CREAT", "Creatinine", "1.2", "mg/dL", 1, "BASELINE",
"CDISC01-001-001", "CREAT", "Creatinine", "98", "umol/L", 2, "WEEK 2",
"CDISC01-001-002", "GLUC", "Glucose", "102", "mg/dL", 1, "BASELINE",
"CDISC01-001-002", "GLUC", "Glucose", "88", "mg/dL", 2, "WEEK 2",
"CDISC01-001-002", "BILI", "Bilirubin Total", "0.8", "mg/dL", 1, "BASELINE",
"CDISC01-001-002", "BILI", "Bilirubin Total", "1.1", "mg/dL", 2, "WEEK 2"
)
cat("Raw lab data with mixed units:\n")
Raw lab data with mixed units:
print(raw_lab)
# A tibble: 8 × 7
USUBJID LBTESTCD LBTEST LBORRES LBORRESU VISITNUM VISIT
<chr> <chr> <chr> <chr> <chr> <dbl> <chr>
1 CDISC01-001-001 GLUC Glucose 95 mg/dL 1 BASELINE
2 CDISC01-001-001 GLUC Glucose 5.1 mmol/L 2 WEEK 2
3 CDISC01-001-001 CREAT Creatinine 1.2 mg/dL 1 BASELINE
4 CDISC01-001-001 CREAT Creatinine 98 umol/L 2 WEEK 2
5 CDISC01-001-002 GLUC Glucose 102 mg/dL 1 BASELINE
6 CDISC01-001-002 GLUC Glucose 88 mg/dL 2 WEEK 2
7 CDISC01-001-002 BILI Bilirubin Total 0.8 mg/dL 1 BASELINE
8 CDISC01-001-002 BILI Bilirubin Total 1.1 mg/dL 2 WEEK 2
6.3 Create a Unit Conversion Function
# Define a reusable conversion function
convert_lab_units <- function(testcd, value, from_unit, to_unit) {
# Convert character value to numeric
val <- as.numeric(value)
# Define conversion factors (from original to standard)
# Standard units: Glucose = mmol/L, Creatinine = umol/L, Bilirubin = umol/L
result <- case_when(
# Glucose: mg/dL to mmol/L
testcd == "GLUC" & from_unit == "mg/dL" & to_unit == "mmol/L" ~ val / 18.02,
# Glucose: already in mmol/L
testcd == "GLUC" & from_unit == "mmol/L" & to_unit == "mmol/L" ~ val,
# Creatinine: mg/dL to umol/L
testcd == "CREAT" & from_unit == "mg/dL" & to_unit == "umol/L" ~ val * 88.42,
# Creatinine: already in umol/L
testcd == "CREAT" & from_unit == "umol/L" & to_unit == "umol/L" ~ val,
# Bilirubin: mg/dL to umol/L
testcd == "BILI" & from_unit == "mg/dL" & to_unit == "umol/L" ~ val * 17.10,
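# Units already match: no conversion needed
from_unit == to_unit ~ val,
# Unknown test/unit combination: return NA instead of an unconverted value
TRUE ~ NA_real_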
)
return(round(result, 2))
}
6.4 Apply Unit Conversions
# Define standard units for each test
standard_units <- tibble::tribble(
~LBTESTCD, ~LBSTRESU,
"GLUC", "mmol/L",
"CREAT", "umol/L",
"BILI", "umol/L"
)
# Apply standardization
lab_standardized <- raw_lab %>%
# Join to get target standard units
left_join(standard_units, by = "LBTESTCD") %>%
# Apply conversion
mutate(
LBSTRESN = convert_lab_units(LBTESTCD, LBORRES, LBORRESU, LBSTRESU),
LBSTRESC = as.character(LBSTRESN)
) %>%
# Reorder columns for clarity
select(USUBJID, LBTESTCD, LBTEST, VISITNUM, VISIT,
LBORRES, LBORRESU, LBSTRESC, LBSTRESN, LBSTRESU)
cat("Standardized lab data:\n")
Standardized lab data:
print(lab_standardized)
# A tibble: 8 × 10
USUBJID LBTESTCD LBTEST VISITNUM VISIT LBORRES LBORRESU LBSTRESC LBSTRESN
<chr> <chr> <chr> <dbl> <chr> <chr> <chr> <chr> <dbl>
1 CDISC01-001… GLUC Gluco… 1 BASE… 95 mg/dL 5.27 5.27
2 CDISC01-001… GLUC Gluco… 2 WEEK… 5.1 mmol/L 5.1 5.1
3 CDISC01-001… CREAT Creat… 1 BASE… 1.2 mg/dL 106.1 106.
4 CDISC01-001… CREAT Creat… 2 WEEK… 98 umol/L 98 98
5 CDISC01-001… GLUC Gluco… 1 BASE… 102 mg/dL 5.66 5.66
6 CDISC01-001… GLUC Gluco… 2 WEEK… 88 mg/dL 4.88 4.88
7 CDISC01-001… BILI Bilir… 1 BASE… 0.8 mg/dL 13.68 13.7
8 CDISC01-001… BILI Bilir… 2 WEEK… 1.1 mg/dL 18.81 18.8
# ℹ 1 more variable: LBSTRESU <chr>
Notice how the original values are preserved (LBORRES, LBORRESU) while standardized values are derived (LBSTRESC, LBSTRESN, LBSTRESU). This is a fundamental SDTM principle: never lose the original data!
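Preserving the original data also matters when a result is not a plain number, such as a below-limit value ("<0.5") or a text result ("NEGATIVE"). A common convention is to leave LBSTRESN missing in those cases and carry the text into LBSTRESC. The sketch below illustrates that convention on hypothetical records; it skips unit conversion to keep the focus on the character/numeric pair.

```r
library(dplyr)

# Hypothetical raw results, including a below-limit value and a text result
raw_results <- tibble::tibble(
  LBTESTCD = c("GLUC", "CRP", "KETONES"),
  LBORRES  = c("95", "<0.5", "NEGATIVE")
)

standardized <- raw_results %>%
  mutate(
    # Numeric where possible; non-numeric strings become NA (warning suppressed)
    LBSTRESN = suppressWarnings(as.numeric(LBORRES)),
    # Keep the original text when there is no numeric value
    LBSTRESC = if_else(is.na(LBSTRESN), LBORRES, as.character(LBSTRESN))
  )
standardized
```

Calling `as.numeric()` without a guard, as in the simplified function above, would still produce NA for such values but would also emit a coercion warning, so it helps to make the handling explicit.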
7 Reference Ranges and Normal Range Indicators
7.1 What are Reference Ranges?
Reference ranges (also called normal ranges) define the expected values for a lab test in a healthy population. Values outside this range may indicate:
- LOW: Value below the normal range (potentially abnormal)
- NORMAL: Value within expected range
- HIGH: Value above the normal range (potentially abnormal)
7.2 Key Reference Range Variables
| Variable | Description |
|---|---|
| LBSTNRLO | Standard normal range - lower limit |
| LBSTNRHI | Standard normal range - upper limit |
| LBNRIND | Normal range indicator (LOW/NORMAL/HIGH) |
| LBORNRLO | Original normal range - lower limit |
| LBORNRHI | Original normal range - upper limit |
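The original and standard range limits are related by the same conversion as the result itself. The albumin rows in the pharmaversesdtm output shown earlier illustrate this: a result of 3.8 g/dL with limits 3.3 to 4.9 g/dL becomes 38 g/L with limits 33 to 49 g/L. A minimal base R sketch, with illustrative variable names:

```r
# Albumin values taken from the pharmaversesdtm output shown earlier
lborres  <- "3.8"   # original result, g/dL
lbornrlo <- "3.3"   # original lower limit, g/dL
lbornrhi <- "4.9"   # original upper limit, g/dL

# g/dL -> g/L: multiply by 10
conv_factor <- 10

# Apply the same factor to the result and to both range limits
albumin_std <- c(
  LBSTRESN = as.numeric(lborres)  * conv_factor,
  LBSTNRLO = as.numeric(lbornrlo) * conv_factor,
  LBSTNRHI = as.numeric(lbornrhi) * conv_factor
)
albumin_std
```

Converting the result but forgetting the limits (or the reverse) is an easy mistake that silently corrupts LBNRIND, so it is worth deriving all three in the same step.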
7.3 Deriving Reference Range Flags
# Define reference ranges (typically from lab vendor or specifications)
reference_ranges <- tibble::tribble(
~LBTESTCD, ~LBSTNRLO, ~LBSTNRHI, ~LBSTRESU,
"GLUC", 3.9, 5.6, "mmol/L", # Fasting glucose
"CREAT", 62, 106, "umol/L", # Creatinine (male)
"BILI", 5.1, 17.0, "umol/L" # Total bilirubin
)
# Join reference ranges and derive LBNRIND
lab_with_ranges <- lab_standardized %>%
left_join(
reference_ranges %>% select(LBTESTCD, LBSTNRLO, LBSTNRHI),
by = "LBTESTCD"
) %>%
mutate(
# Derive normal range indicator
LBNRIND = case_when(
is.na(LBSTRESN) ~ NA_character_, # Missing result
LBSTRESN < LBSTNRLO ~ "LOW", # Below range
LBSTRESN > LBSTNRHI ~ "HIGH", # Above range
TRUE ~ "NORMAL" # Within range
)
)
# Display results
lab_with_ranges %>%
select(USUBJID, LBTESTCD, VISIT, LBSTRESN, LBSTRESU,
LBSTNRLO, LBSTNRHI, LBNRIND)
# A tibble: 8 × 8
USUBJID LBTESTCD VISIT LBSTRESN LBSTRESU LBSTNRLO LBSTNRHI LBNRIND
<chr> <chr> <chr> <dbl> <chr> <dbl> <dbl> <chr>
1 CDISC01-001-001 GLUC BASELINE 5.27 mmol/L 3.9 5.6 NORMAL
2 CDISC01-001-001 GLUC WEEK 2 5.1 mmol/L 3.9 5.6 NORMAL
3 CDISC01-001-001 CREAT BASELINE 106. umol/L 62 106 HIGH
4 CDISC01-001-001 CREAT WEEK 2 98 umol/L 62 106 NORMAL
5 CDISC01-001-002 GLUC BASELINE 5.66 mmol/L 3.9 5.6 HIGH
6 CDISC01-001-002 GLUC WEEK 2 4.88 mmol/L 3.9 5.6 NORMAL
7 CDISC01-001-002 BILI BASELINE 13.7 umol/L 5.1 17 NORMAL
8 CDISC01-001-002 BILI WEEK 2 18.8 umol/L 5.1 17 HIGH
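One edge case in the derivation above: if a test has no reference range, the comparisons with NA do not match any branch, so `case_when()` falls through to "NORMAL". The sketch below shows one possible way to make that case explicit. The helper name `derive_lbnrind` is hypothetical, and returning NA when no range is available is an assumption about the desired behavior, not a rule from the source.

```r
library(dplyr)

# Hypothetical helper: LBNRIND with explicit handling of missing limits
derive_lbnrind <- function(stresn, lo, hi) {
  case_when(
    is.na(stresn)            ~ NA_character_,  # no numeric result
    !is.na(lo) & stresn < lo ~ "LOW",          # below the lower limit
    !is.na(hi) & stresn > hi ~ "HIGH",         # above the upper limit
    is.na(lo) & is.na(hi)    ~ NA_character_,  # no range to compare against
    TRUE                     ~ "NORMAL"        # within range (limits inclusive)
  )
}

nrind <- derive_lbnrind(
  stresn = c(3.5, 5.6, 6.1, 5.0, NA),
  lo     = c(3.9, 3.9, 3.9, NA,  3.9),
  hi     = c(5.6, 5.6, 5.6, NA,  5.6)
)
nrind
```

A value exactly on a limit (5.6 here) counts as NORMAL, matching the strict `<` and `>` comparisons used in the tutorial code.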
The LBNRIND variable is critical for safety analyses. It helps identify:
- Out-of-range values that may need clinical review (whether they are clinically significant is a separate medical judgment)
- Trends in lab values over time (e.g., worsening liver function)
- Treatment-emergent abnormalities (normal at baseline, abnormal post-treatment)
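As an illustration of the last point, the sketch below flags subjects whose result was normal at baseline and out of range afterwards. The data and the flag name `TE_ABNORMAL` are hypothetical. A flag like this belongs to the analysis layer (ADaM) rather than to SDTM, so treat it as a preview, not as part of the LB domain.

```r
library(dplyr)

# Hypothetical range indicators for two subjects and one test
nrind_data <- tibble::tribble(
  ~USUBJID, ~LBTESTCD, ~VISIT,     ~LBNRIND,
  "001",    "ALT",     "BASELINE", "NORMAL",
  "001",    "ALT",     "WEEK 2",   "HIGH",
  "002",    "ALT",     "BASELINE", "HIGH",
  "002",    "ALT",     "WEEK 2",   "HIGH"
)

treatment_emergent <- nrind_data %>%
  group_by(USUBJID, LBTESTCD) %>%
  summarise(
    # Normal at baseline?
    base_normal   = any(VISIT == "BASELINE" & LBNRIND == "NORMAL"),
    # Any out-of-range value after baseline?
    post_abnormal = any(VISIT != "BASELINE" & LBNRIND %in% c("LOW", "HIGH")),
    .groups = "drop"
  ) %>%
  mutate(TE_ABNORMAL = base_normal & post_abnormal)
treatment_emergent
```

Subject 001 is flagged; subject 002 is not, because the value was already high at baseline.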
8 LOINC Coding Concepts
8.1 What is LOINC?
LOINC (Logical Observation Identifiers Names and Codes) is a universal standard for identifying medical laboratory observations. While not always required in SDTM, it enhances interoperability.
8.2 How LOINC Integrates with SDTM
| SDTM Variable | LOINC Equivalent |
|---|---|
| LBLOINC | LOINC code for the test |
| LBTESTCD | Mapped to LOINC component |
8.3 Example LOINC Codes
| Test | LBTESTCD | LOINC Code | Description |
|---|---|---|---|
| Glucose | GLUC | 2339-0 | Glucose [Mass/volume] in Blood |
| Creatinine | CREAT | 2160-0 | Creatinine [Mass/volume] in Serum or Plasma |
| ALT | ALT | 1742-6 | Alanine aminotransferase [Enzymatic activity/volume] in Serum or Plasma |
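A lookup join is one simple way to populate LBLOINC from a mapping table. The sketch below uses the three codes from the table above on hypothetical records. Mapping by LBTESTCD alone is a simplification: a LOINC code also depends on specimen, property, and method, and in practice the code usually arrives with the lab vendor's data.

```r
library(dplyr)

# LOINC codes from the table above; each is valid only for its stated specimen and property
loinc_map <- tibble::tribble(
  ~LBTESTCD, ~LBLOINC,
  "GLUC",    "2339-0",
  "CREAT",   "2160-0",
  "ALT",     "1742-6"
)

# Hypothetical lab records; SODIUM has no entry in the map
lab <- tibble::tibble(LBTESTCD = c("ALT", "GLUC", "SODIUM"))

lab_loinc <- lab %>%
  left_join(loinc_map, by = "LBTESTCD")
lab_loinc
```

Tests without a mapping keep a missing LBLOINC, which makes gaps in the map easy to spot.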
9 Complete Example: Building an LB Domain
Let’s put everything together and build a more complete LB domain from simulated raw data.
9.1 Step 1: Create Comprehensive Raw Data
# Create more comprehensive raw lab data
set.seed(42)
subjects <- c("CDISC01-001-001", "CDISC01-001-002", "CDISC01-001-003",
"CDISC01-001-004", "CDISC01-001-005")
visits <- c("SCREENING", "BASELINE", "WEEK 2", "WEEK 4", "WEEK 8")
visit_nums <- c(-1, 1, 2, 3, 4)
# Generate raw lab data for 5 subjects across 5 visits
raw_lb_data <- expand_grid(
USUBJID = subjects,
VISIT_INFO = tibble(VISIT = visits, VISITNUM = visit_nums)
) %>%
unnest(VISIT_INFO) %>%
# Add lab tests
crossing(
tibble::tribble(
~LBTESTCD, ~LBTEST, ~LBCAT, ~LBORRESU,
"GLUC", "Glucose", "CHEMISTRY", "mg/dL",
"CREAT", "Creatinine", "CHEMISTRY", "mg/dL",
"ALT", "Alanine Aminotransferase", "CHEMISTRY", "U/L",
"AST", "Aspartate Aminotransferase", "CHEMISTRY", "U/L",
"BILI", "Bilirubin Total", "CHEMISTRY", "mg/dL",
"WBC", "Leukocytes", "HEMATOLOGY","10^9/L",
"RBC", "Erythrocytes", "HEMATOLOGY","10^12/L",
"HGB", "Hemoglobin", "HEMATOLOGY","g/dL",
"PLT", "Platelets", "HEMATOLOGY","10^9/L",
"SODIUM", "Sodium", "CHEMISTRY", "mmol/L"
)
) %>%
# Generate random values within typical ranges
rowwise() %>%
mutate(
LBORRES = as.character(round(case_when(
LBTESTCD == "GLUC" ~ rnorm(1, 100, 15),
LBTESTCD == "CREAT" ~ rnorm(1, 1.0, 0.2),
LBTESTCD == "ALT" ~ rnorm(1, 30, 10),
LBTESTCD == "AST" ~ rnorm(1, 28, 8),
LBTESTCD == "BILI" ~ rnorm(1, 0.8, 0.3),
LBTESTCD == "WBC" ~ rnorm(1, 7.0, 2.0),
LBTESTCD == "RBC" ~ rnorm(1, 4.8, 0.5),
LBTESTCD == "HGB" ~ rnorm(1, 14.5, 1.5),
LBTESTCD == "PLT" ~ rnorm(1, 250, 50),
LBTESTCD == "SODIUM" ~ rnorm(1, 140, 3),
TRUE ~ NA_real_
), 2))
) %>%
ungroup() %>%
# Add STUDYID and sequence
mutate(STUDYID = "CDISC01") %>%
group_by(USUBJID) %>%
mutate(LBSEQ = row_number()) %>%
ungroup()
cat("Raw LB data created:", nrow(raw_lb_data), "records\n")
Raw LB data created: 250 records
cat("Tests per subject per visit:", n_distinct(raw_lb_data$LBTESTCD), "\n")
Tests per subject per visit: 10
9.2 Step 2: Apply Standardization
# Define standard units and reference ranges for all tests
lab_standards <- tibble::tribble(
~LBTESTCD, ~LBSTRESU, ~LBSTNRLO, ~LBSTNRHI, ~CONV_FACTOR,
"GLUC", "mmol/L", 3.9, 5.6, 0.0555, # mg/dL to mmol/L
"CREAT", "umol/L", 62, 106, 88.42, # mg/dL to umol/L
"ALT", "U/L", 7, 56, 1.0, # No conversion
"AST", "U/L", 10, 40, 1.0, # No conversion
"BILI", "umol/L", 5.1, 17.0, 17.10, # mg/dL to umol/L
"WBC", "10^9/L", 4.5, 11.0, 1.0, # No conversion
"RBC", "10^12/L", 4.2, 5.4, 1.0, # No conversion
"HGB", "g/dL", 12.0, 17.0, 1.0, # No conversion
"PLT", "10^9/L", 150, 400, 1.0, # No conversion
"SODIUM", "mmol/L", 136, 145, 1.0 # No conversion
)
# Create complete LB domain
lb_complete <- raw_lb_data %>%
# Join standards
left_join(lab_standards, by = "LBTESTCD") %>%
# Calculate standardized numeric result
mutate(
LBSTRESN = round(as.numeric(LBORRES) * CONV_FACTOR, 2),
LBSTRESC = as.character(LBSTRESN),
# Derive normal range indicator
LBNRIND = case_when(
is.na(LBSTRESN) ~ NA_character_,
LBSTRESN < LBSTNRLO ~ "LOW",
LBSTRESN > LBSTNRHI ~ "HIGH",
TRUE ~ "NORMAL"
),
# Add specimen type (normally taken from raw data; hard-coded to SERUM here for simplicity, although hematology tests are usually run on whole blood)
LBSPEC = "SERUM"
) %>%
# Select and order columns per SDTM standard
select(
STUDYID, USUBJID, LBSEQ,
LBTESTCD, LBTEST, LBCAT, LBSPEC,
LBORRES, LBORRESU,
LBSTRESC, LBSTRESN, LBSTRESU,
LBSTNRLO, LBSTNRHI, LBNRIND,
VISITNUM, VISIT
) %>%
arrange(STUDYID, USUBJID, VISITNUM, LBTESTCD)
# Preview
cat("\nComplete LB domain preview:\n")
Complete LB domain preview:
head(lb_complete, 15)
# A tibble: 15 × 17
STUDYID USUBJID LBSEQ LBTESTCD LBTEST LBCAT LBSPEC LBORRES LBORRESU LBSTRESC
<chr> <chr> <int> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 CDISC01 CDISC01… 11 ALT Alani… CHEM… SERUM 19.97 U/L 19.97
2 CDISC01 CDISC01… 12 AST Aspar… CHEM… SERUM 23.97 U/L 23.97
3 CDISC01 CDISC01… 13 BILI Bilir… CHEM… SERUM 0.8 mg/dL 13.68
4 CDISC01 CDISC01… 14 CREAT Creat… CHEM… SERUM 0.9 mg/dL 79.58
5 CDISC01 CDISC01… 15 GLUC Gluco… CHEM… SERUM 98.71 mg/dL 5.48
6 CDISC01 CDISC01… 16 HGB Hemog… HEMA… SERUM 13.3 g/dL 13.3
7 CDISC01 CDISC01… 17 PLT Plate… HEMA… SERUM 254.24 10^9/L 254.24
8 CDISC01 CDISC01… 18 RBC Eryth… HEMA… SERUM 4.08 10^12/L 4.08
9 CDISC01 CDISC01… 19 SODIUM Sodium CHEM… SERUM 135.22 mmol/L 135.22
10 CDISC01 CDISC01… 20 WBC Leuko… HEMA… SERUM 9.17 10^9/L 9.17
11 CDISC01 CDISC01… 1 ALT Alani… CHEM… SERUM 33.63 U/L 33.63
12 CDISC01 CDISC01… 2 AST Aspar… CHEM… SERUM 25.77 U/L 25.77
13 CDISC01 CDISC01… 3 BILI Bilir… CHEM… SERUM 1.37 mg/dL 23.43
14 CDISC01 CDISC01… 4 CREAT Creat… CHEM… SERUM 1.14 mg/dL 100.8
15 CDISC01 CDISC01… 5 GLUC Gluco… CHEM… SERUM 103.09 mg/dL 5.72
# ℹ 7 more variables: LBSTRESN <dbl>, LBSTRESU <chr>, LBSTNRLO <dbl>,
# LBSTNRHI <dbl>, LBNRIND <chr>, VISITNUM <dbl>, VISIT <chr>
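In the preview above, the first row has LBSEQ 11 because sequence numbers were assigned before the final sort. LBSEQ only has to be unique within a subject, so this is not wrong, but many teams prefer to number records after sorting so the sequence follows the dataset order. A minimal sketch on hypothetical records:

```r
library(dplyr)

# Hypothetical records in arbitrary order
lb_unsorted <- tibble::tribble(
  ~USUBJID, ~VISITNUM, ~LBTESTCD,
  "001",    2,         "GLUC",
  "001",    1,         "GLUC",
  "001",    1,         "ALT",
  "002",    1,         "ALT"
)

lb_sequenced <- lb_unsorted %>%
  # Sort first, so the sequence number reflects the final record order
  arrange(USUBJID, VISITNUM, LBTESTCD) %>%
  group_by(USUBJID) %>%
  mutate(LBSEQ = row_number()) %>%
  ungroup()
lb_sequenced
```

The sort keys used here (subject, visit number, test code) are one reasonable choice; the actual key order should come from the study's specification.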
9.3 Step 3: Validate the LB Domain
# Validation checks
cat("=== LB Domain Validation ===\n\n")
=== LB Domain Validation ===
# Check 1: Key variables present (a submission-ready LB dataset would also include DOMAIN, which this simplified example omits)
required_vars <- c("STUDYID", "USUBJID", "LBSEQ", "LBTESTCD", "LBTEST",
"LBORRES", "LBORRESU", "LBSTRESN", "LBSTRESU")
present <- required_vars %in% names(lb_complete)
cat("Required variables check:\n")
Required variables check:
for (i in seq_along(required_vars)) {
cat(" ", required_vars[i], ":", ifelse(present[i], "✓", "✗"), "\n")
}
STUDYID : ✓
USUBJID : ✓
LBSEQ : ✓
LBTESTCD : ✓
LBTEST : ✓
LBORRES : ✓
LBORRESU : ✓
LBSTRESN : ✓
LBSTRESU : ✓
# Check 2: No missing USUBJID
cat("\nMissing USUBJID:", sum(is.na(lb_complete$USUBJID)), "\n")
Missing USUBJID: 0
# Check 3: Distribution of LBNRIND
cat("\nNormal Range Indicator distribution:\n")
Normal Range Indicator distribution:
table(lb_complete$LBNRIND, useNA = "ifany")
HIGH LOW NORMAL
24 12 214
# Check 4: Unique subjects and visits
cat("\nData coverage:\n")
Data coverage:
cat(" Unique subjects:", n_distinct(lb_complete$USUBJID), "\n")
Unique subjects: 5
cat(" Unique visits:", n_distinct(lb_complete$VISIT), "\n")
Unique visits: 5
cat(" Unique tests:", n_distinct(lb_complete$LBTESTCD), "\n")
Unique tests: 10
cat(" Total records:", nrow(lb_complete), "\n")
Total records: 250
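A further check worth adding is key uniqueness: each combination of USUBJID and LBSEQ should identify exactly one record. The sketch below runs the check on hypothetical records that contain one deliberate duplicate; the same pattern could be pointed at `lb_complete`.

```r
library(dplyr)

# Hypothetical LB records with one duplicated key (subject 001, LBSEQ 2)
lb_check <- tibble::tribble(
  ~USUBJID, ~LBSEQ, ~LBTESTCD,
  "001",    1,      "ALT",
  "001",    2,      "GLUC",
  "001",    2,      "CREAT",
  "002",    1,      "ALT"
)

# Count records per key and keep only keys that occur more than once
duplicate_keys <- lb_check %>%
  count(USUBJID, LBSEQ, name = "n_records") %>%
  filter(n_records > 1)
duplicate_keys
```

An empty result means the key is unique; any rows returned point directly at the offending subject and sequence number.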
10 Summary Statistics for LB Data
When working with lab data, it’s often useful to generate summary statistics.
# Summary by test and visit
lb_summary <- lb_complete %>%
group_by(LBTESTCD, LBTEST, VISIT, LBSTRESU) %>%
summarise(
N = n(),
Mean = round(mean(LBSTRESN, na.rm = TRUE), 2),
SD = round(sd(LBSTRESN, na.rm = TRUE), 2),
Min = round(min(LBSTRESN, na.rm = TRUE), 2),
Max = round(max(LBSTRESN, na.rm = TRUE), 2),
N_Low = sum(LBNRIND == "LOW", na.rm = TRUE),
N_High = sum(LBNRIND == "HIGH", na.rm = TRUE),
.groups = "drop"
)
# Display chemistry panel at BASELINE
cat("Chemistry Panel Summary at BASELINE:\n")
Chemistry Panel Summary at BASELINE:
lb_summary %>%
filter(VISIT == "BASELINE", LBTESTCD %in% c("GLUC", "CREAT", "ALT", "AST", "BILI")) %>%
print()
# A tibble: 5 × 11
LBTESTCD LBTEST VISIT LBSTRESU N Mean SD Min Max N_Low N_High
<chr> <chr> <chr> <chr> <int> <dbl> <dbl> <dbl> <dbl> <int> <int>
1 ALT Alanine A… BASE… U/L 5 27.2 10.8 12.8 39.7 0 0
2 AST Aspartate… BASE… U/L 5 26.6 5.4 21.0 35.3 0 0
3 BILI Bilirubin… BASE… umol/L 5 20.2 7.16 11.1 30.3 0 4
4 CREAT Creatinine BASE… umol/L 5 84 12.1 69.0 101. 0 0
5 GLUC Glucose BASE… mmol/L 5 4.95 0.56 4.51 5.72 0 1
11 Deliverable Summary
Today you completed the following:
| Task | Status |
|---|---|
| Understood LB domain structure and Findings class | ✓ Done |
| Explored key LB variables (LBTESTCD, LBORRES, LBSTRESN, etc.) | ✓ Done |
| Performed unit conversions (mg/dL → mmol/L) | ✓ Done |
| Derived reference range flags (LBNRIND) | ✓ Done |
| Created a complete LB domain with 10+ tests | ✓ Done |
| Generated summary statistics | ✓ Done |
12 Key Takeaways
- LB is a Findings class domain - It follows the --TESTCD, --ORRES, --STRESC pattern shared by VS, LB, and EG
- Original vs Standardized - Always preserve original results while creating standardized versions
- Unit conversions are critical - Labs from different sites may use different units
- Reference ranges enable safety analysis - LBNRIND flags abnormal values
- LOINC provides universal identification - Enhances data interoperability
13 Resources
- CDISC SDTM Implementation Guide - LB Domain - Official LB specification
- LOINC Official Website - Medical laboratory codes
- NIH Unit Conversion Tables - Lab unit conversions
- Pharmaverse.org - R packages for clinical data
- sdtm.oak Documentation - SDTM creation package
14 What’s Next?
In Day 9, we will focus on VS (Vital Signs) & Repeated Measures:
- Understanding visit-level data and multiple readings per timepoint
- Working with positional variables (VSPOS, VSLOC)
- Deriving statistics across multiple readings
- Preparing for ADaM BDS structure (baseline, change from baseline)