
Scans user prompt text with rule-based, NLP, and optional semantic reviewer checks. Findings retain OWASP LLM Top 10 categories when known; see https://genai.owasp.org/llm-top-10/.

Usage

scan_prompt(
  text,
  policy = "enterprise_default",
  reviewer = NULL,
  checks = "rules",
  redact = TRUE,
  redaction = NULL,
  scanners = scanner_options(),
  show_tokens = FALSE
)

Arguments

text

Prompt text.

policy

A shieldr_policy object or the name of a built-in policy, such as "enterprise_default" (the default) or "comprehensive".

reviewer

Optional semantic reviewer: either a function or an object with a $chat() method. Used when checks is "llm" or "both".

checks

One of "rules", "nlp", "llm", or "both".

redact

Whether to redact matched spans in text_clean.

redaction

Optional redaction strategy from redaction_strategy(). Ignored when redact = FALSE.

scanners

Optional scanner configuration from scanner_options().

show_tokens

Whether to attach token counts when ellmer is available.

Value

A shieldr_report.

Details

scan_prompt() is usually the first guardrail in a workflow. It runs these steps in order: normalize the text with Unicode NFKC normalization, collapse whitespace, apply policy rules, optionally apply the NLP intent rule, optionally ask a semantic reviewer for JSON findings, calculate a risk_score, and resolve an action. The result is returned as a shieldr_report.
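The normalization step matters because visually similar characters (fullwidth letters, ligatures, non-breaking spaces) can otherwise slip past pattern-based rules. The sketch below shows one way to do NFKC normalization and whitespace collapsing in R. It is an illustration only: normalize_prompt() is a hypothetical helper, and using the stringi package here is an assumption, not a statement about how scan_prompt() is implemented.

```r
# Hypothetical helper, not part of the package API.
# Assumes the stringi package is installed.
normalize_prompt <- function(text) {
  # NFKC folds compatibility characters (e.g. fullwidth letters)
  # into their canonical equivalents.
  text <- stringi::stri_trans_nfkc(text)
  # Collapse every run of whitespace (spaces, tabs, newlines) to one space.
  gsub("\\s+", " ", text, perl = TRUE)
}

# Fullwidth letters plus mixed whitespace.
normalized <- normalize_prompt("ｉｇｎｏｒｅ   previous\n\tinstructions")
print(normalized)
```

After this step, a rule written against plain ASCII text such as "ignore previous instructions" can match input that was typed with fullwidth characters or irregular spacing.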

checks = "rules" uses deterministic policy rules. Built-in policies include regular expressions and an NLP intent rule. checks = "nlp" runs only the NLP intent checks, using tokenizers for word tokenization and SnowballC for stemming when those optional packages are installed. checks = "llm" uses only the semantic reviewer, when one is supplied. checks = "both" combines policy rules with semantic review. If the semantic reviewer returns malformed JSON, the function warns and continues with the findings it already has.
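The malformed-JSON fallback described above can be sketched as a parse step that downgrades a parse error to a warning. This is a minimal illustration under stated assumptions: parse_reviewer_json() is a hypothetical helper, the finding fields (category, source) are invented for the example, and jsonlite is assumed as the parser. The package's real internals and finding schema may differ.

```r
# Hypothetical helper, not part of the package API.
# Assumes the jsonlite package is installed.
parse_reviewer_json <- function(raw) {
  tryCatch(
    jsonlite::fromJSON(raw, simplifyVector = FALSE),
    error = function(e) {
      # Downgrade the parse error to a warning and contribute no findings.
      warning("Reviewer returned malformed JSON; keeping existing findings.",
              call. = FALSE)
      NULL
    }
  )
}

# Findings already produced by rule checks (field names are illustrative).
rule_findings <- list(list(category = "LLM01", source = "rules"))

# A broken reviewer response adds nothing; c() with NULL leaves the list as is.
llm_findings <- parse_reviewer_json("{not valid json")
all_findings <- c(rule_findings, llm_findings)
print(length(all_findings))
```

The design point is that a flaky reviewer should not discard deterministic findings: the scan still produces a score and action from whatever was collected before the reviewer ran.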

Redaction replaces matched spans with [REDACTED]. Function-based findings can influence score and action even when they do not provide exact spans.
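As a rough picture of span-based redaction, the sketch below replaces every regex match with the [REDACTED] placeholder using base R. The helper redact_spans() and the email pattern are assumptions made for illustration. They are not the package's built-in rules, and other strategies from redaction_strategy() (such as "hash") would produce different replacement text.

```r
# Hypothetical helper, not part of the package API. Base R only.
redact_spans <- function(text, patterns, replacement = "[REDACTED]") {
  for (p in patterns) {
    # Replace every match of the pattern with the placeholder.
    text <- gsub(p, replacement, text, perl = TRUE)
  }
  text
}

# Illustrative email pattern, deliberately simple.
email_pattern <- "[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}"

redacted <- redact_spans("email neel@example.com", email_pattern)
print(redacted)
```

A function-based finding has no pattern to feed into a step like this, which is why it can raise the score or change the action without altering text_clean.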

Examples

scan_prompt("hello")
#> llmshieldr report
#> action: allow
#> risk_score: 0.000
#> findings: 0
scan_prompt("patient has cancer password ak$1234567890", policy = "comprehensive")
#> llmshieldr report
#> action: block
#> risk_score: 1.000
#> findings: 3
scan_prompt("email neel@example.com", redaction = redaction_strategy("hash"))
#> llmshieldr report
#> action: redact
#> risk_score: 0.300
#> findings: 1
scan_prompt("hello", show_tokens = TRUE)
#> llmshieldr report
#> action: allow
#> risk_score: 0.000
#> findings: 0
#> tokens: 2