Scans user prompt text with rule-based, NLP, and optional semantic reviewer checks. Findings retain OWASP LLM Top 10 categories when known; see https://genai.owasp.org/llm-top-10/.
Usage
scan_prompt(
  text,
  policy = "enterprise_default",
  reviewer = NULL,
  checks = "rules",
  redact = TRUE,
  redaction = NULL,
  scanners = scanner_options(),
  show_tokens = FALSE
)
Arguments
- text
Prompt text.
- policy
A shieldr_policy or built-in policy name such as "comprehensive".
- reviewer
Optional reviewer function or object with $chat().
- checks
One of "rules", "nlp", "llm", or "both".
- redact
Whether to redact matched spans in text_clean.
- redaction
Optional redaction strategy from redaction_strategy(). Ignored when redact = FALSE.
- scanners
Optional scanner configuration from scanner_options().
- show_tokens
Whether to attach token counts when ellmer is available.
Details
scan_prompt() is usually the first guardrail in a workflow. It normalizes
text with Unicode NFKC normalization, collapses whitespace, applies policy
rules, optionally applies the NLP intent rule, optionally asks a semantic
reviewer for JSON findings, calculates a risk_score, resolves an action,
and returns a shieldr_report().
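The normalization step described above can be approximated outside the package. The following is a minimal sketch, not llmshieldr's actual implementation: normalize_prompt() is a hypothetical helper name, and the sketch assumes the stringi package is installed.

```r
# Sketch of the normalization step only; not llmshieldr's own code.
# Assumes the stringi package is installed.
library(stringi)

# Hypothetical helper: NFKC-normalize, then collapse whitespace.
normalize_prompt <- function(text) {
  # NFKC folds compatibility forms (ligatures, fullwidth letters, etc.)
  # into their canonical equivalents.
  text <- stri_trans_nfkc(text)
  # Collapse any run of whitespace to one space, then trim both ends.
  text <- gsub("\\s+", " ", text, perl = TRUE)
  trimws(text)
}

normalize_prompt("\ufb01le   name\n\tcheck")
#> [1] "file name check"
```

Normalizing before rule matching means a pattern written against plain ASCII text can still match input that uses visually equivalent compatibility characters or irregular spacing.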
checks = "rules" uses deterministic policy rules. Built-in policies include
regular expressions and an NLP intent rule. checks = "nlp" runs only NLP
intent checks, using tokenizers for word tokenization and SnowballC for
stemming when those optional packages are installed. checks = "llm" uses
only the semantic reviewer when one is supplied. checks = "both" combines
policy rules with semantic review. If LLM review returns malformed JSON, the
function warns and continues with the findings it already has.
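The warn-and-continue behavior for malformed reviewer output can be illustrated with a short sketch. This is not the package's internal code: merge_reviewer_findings() and the rule field are hypothetical names chosen for illustration, and the sketch assumes the jsonlite package is installed.

```r
# Sketch only: hypothetical helper, not part of llmshieldr.
# Assumes the jsonlite package is installed.
library(jsonlite)

merge_reviewer_findings <- function(existing, raw_json) {
  parsed <- tryCatch(
    parse_json(raw_json, simplifyVector = FALSE),
    error = function(e) {
      # Malformed JSON: warn, then fall back to what we already have.
      warning("Reviewer returned malformed JSON; keeping existing findings.",
              call. = FALSE)
      NULL
    }
  )
  if (is.null(parsed)) existing else c(existing, parsed)
}

# Hypothetical finding structure, for illustration only.
existing <- list(list(rule = "email"))

# Well-formed reviewer output is appended to the rule findings.
length(merge_reviewer_findings(existing, '[{"rule": "intent"}]'))
#> [1] 2

# Truncated output triggers a warning and leaves the findings unchanged.
length(suppressWarnings(merge_reviewer_findings(existing, '{"rule": ')))
#> [1] 1
```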
Redaction replaces matched spans with [REDACTED]. Function-based findings
can influence score and action even when they do not provide exact spans.
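Span-based redaction with the default placeholder can be sketched in base R. The pattern and the redact_spans() helper below are hypothetical and are not one of llmshieldr's built-in rules; the sketch only shows the idea of replacing each matched span with [REDACTED].

```r
# Sketch only: a hypothetical email rule, not a built-in llmshieldr pattern.
email_pattern <- "[[:alnum:]._%+-]+@[[:alnum:].-]+\\.[[:alpha:]]{2,}"

# Hypothetical helper: replace every match of `pattern` with the placeholder.
redact_spans <- function(text, pattern) {
  gsub(pattern, "[REDACTED]", text, perl = TRUE)
}

redact_spans("email neel@example.com", email_pattern)
#> [1] "email [REDACTED]"
```

A finding that comes from a function rather than a pattern has no span to substitute, which is why such findings can affect the score and action without changing text_clean.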
Examples
scan_prompt("hello")
#> llmshieldr report
#> action: allow
#> risk_score: 0.000
#> findings: 0
scan_prompt("patient has cancer password ak$1234567890", policy = "comprehensive")
#> llmshieldr report
#> action: block
#> risk_score: 1.000
#> findings: 3
scan_prompt("email neel@example.com", redaction = redaction_strategy("hash"))
#> llmshieldr report
#> action: redact
#> risk_score: 0.300
#> findings: 1
scan_prompt("hello", show_tokens = TRUE)
#> llmshieldr report
#> action: allow
#> risk_score: 0.000
#> findings: 0
#> tokens: 2
