Scan model output — scan_output • llmshieldr

Scans LLM output for sensitive data, unsafe code, agency claims, system prompt leakage, misinformation markers, and optional NLP intent signals.

Usage

scan_output(
  text,
  policy = "enterprise_default",
  reviewer = NULL,
  checks = "rules",
  redaction = NULL,
  scanners = scanner_options(),
  show_tokens = FALSE
)

Arguments

text: Model output text.
policy: A shieldr_policy object or the name of a built-in policy, such as "enterprise_default" (the default) or "comprehensive".
reviewer: Optional reviewer function or object with $chat().
checks: One of "rules" (the default), "nlp", "llm", or "both".
redaction: Optional redaction strategy from redaction_strategy().
scanners: Optional scanner configuration from scanner_options().
show_tokens: Whether to attach token counts when ellmer is available.

Value

A shieldr_report.

Details

Output scanning is the last guardrail before model text is displayed, stored, or passed to another tool. It runs the policy rule set over the full output and adds output-specific checks for common failure modes:

unsafe code and command patterns inside fenced code blocks
excessive-agency language such as "I will now" or "I have deleted"
system-prompt structural markers such as "# System" or role declarations
high-confidence medical or financial claim markers
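
To make the checks listed above concrete, here is a minimal base R sketch of how such heuristics could be approximated with regular expressions. It is not llmshieldr's implementation: the helper names extract_fenced_code() and check_output(), and every pattern in them, are hypothetical and chosen only for illustration.

```r
# Illustrative only: hypothetical helpers and patterns, not llmshieldr's rules.

# Build the three-backtick fence marker programmatically.
fence <- strrep("`", 3)

# Return the bodies of fenced code blocks found in a single string.
extract_fenced_code <- function(text) {
  block_pattern <- paste0("(?s)", fence, "[^\\n]*\\n.*?", fence)
  blocks <- regmatches(text, gregexpr(block_pattern, text, perl = TRUE))[[1]]
  # Strip the opening fence line (with optional language tag) and closing fence.
  body_pattern <- paste0("(?s)^", fence, "[^\\n]*\\n(.*?)", fence, "$")
  sub(body_pattern, "\\1", blocks, perl = TRUE)
}

# Return one logical flag per output-specific check.
check_output <- function(text) {
  code <- extract_fenced_code(text)
  c(
    # Unsafe commands are only searched for inside fenced code.
    unsafe_code = any(grepl("rm\\s+-rf|\\beval\\(", code, perl = TRUE)),
    # Claims of having acted, or being about to act.
    excessive_agency = grepl("\\bI (will now|have deleted)\\b", text, perl = TRUE),
    # Line-initial markers that resemble a leaked system prompt.
    system_prompt_marker = grepl(
      "(?mi)^(# system\\b|(system|assistant)\\s*:)", text, perl = TRUE
    ),
    # Overconfident medical or financial wording.
    confident_claim = grepl(
      "\\b(guaranteed (cure|returns?)|will definitely cure)\\b",
      text, ignore.case = TRUE, perl = TRUE
    )
  )
}

sample_output <- paste(
  "I have deleted the old logs.",
  paste0(fence, "sh"),
  "rm -rf /tmp/cache",
  fence,
  sep = "\n"
)

flags <- check_output(sample_output)
print(flags)
```

For this sample, unsafe_code and excessive_agency are TRUE and the other two flags are FALSE. Real scanners need far broader pattern sets and scoring; the sketch only shows why code blocks are inspected separately from the surrounding prose.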

Use checks = "nlp" for a lightweight, local, NLP-only pass over model output. The return value is a shieldr_report with the same scoring and action semantics as the report returned by scan_prompt().
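
A short sketch of an NLP-only pass follows, using only the arguments documented in Usage. It assumes llmshieldr is installed and is skipped otherwise. The sample text is made up, and no output is shown because the result depends on the installed version and scanner configuration.

```r
# Hypothetical model output used only for illustration.
model_text <- "I will now delete the staging database."

# Run the call only when the package is available.
if (requireNamespace("llmshieldr", quietly = TRUE)) {
  # checks = "nlp" requests the NLP-only pass instead of the default "rules".
  report <- llmshieldr::scan_output(model_text, checks = "nlp")
  print(report)
}
```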

Examples

scan_output("A concise answer.")
#> llmshieldr report
#> action: allow
#> risk_score: 0.000
#> findings: 0
scan_output("A concise answer.", show_tokens = TRUE)
#> llmshieldr report
#> action: allow
#> risk_score: 0.000
#> findings: 0
#> tokens: 5