Scan streamed output chunks with rolling context

scan_stream() scans character chunks as they arrive from a streaming model API. Each scan covers the current chunk plus a configurable number of trailing characters from the preceding text, so rules can catch phrases that are split across chunk boundaries.

Usage

scan_stream(
  chunks,
  policy = "enterprise_default",
  reviewer = NULL,
  checks = "rules",
  chunk_size = 1000L,
  overlap = 200L,
  on_block = c("stop", "return"),
  redaction = NULL,
  scanners = scanner_options(),
  show_tokens = FALSE
)

Arguments

chunks: Character vector of streamed text chunks, or one long string.
policy: A shieldr_policy or built-in policy name.
reviewer: Optional reviewer function or object with $chat().
checks: One of "rules", "nlp", "llm", or "both".
chunk_size: Maximum chunk size used when splitting a single long string. Ignored when chunks has more than one element.
overlap: Number of trailing characters from prior output to include when scanning the next chunk.
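on_block: One of "stop" (listed first in Usage) or "return". "stop" aborts as soon as a window resolves to block; "return" returns the full result object instead.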
redaction: Optional redaction strategy from redaction_strategy().
scanners: Optional scanner configuration from scanner_options().
show_tokens: Whether to attach token counts when ellmer is available.

Value

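A list of class shieldr_stream_result with three elements: action (the combined action), text (the concatenated text), and reports (one report per scanned window).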

Details

This helper is intentionally transport-agnostic: pass the text chunks you receive from an SDK, callback, or websocket handler. It returns per-window reports and a combined action. If on_block = "stop", the function aborts as soon as a window resolves to block; use on_block = "return" when you want a full report object instead.
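
The rolling context described above can be illustrated with a minimal base R sketch. rolling_windows() is a hypothetical helper written for this illustration, not a function from the package, and the way it applies overlap is an assumption about the general technique rather than a description of the internals of scan_stream().

```r
# Hypothetical helper (not part of the package): build the text each scan
# would see, i.e. the current chunk prefixed by up to `overlap` trailing
# characters of everything received before it.
rolling_windows <- function(chunks, overlap = 200L) {
  seen <- ""
  windows <- character(length(chunks))
  for (i in seq_along(chunks)) {
    n <- nchar(seen)
    # Trailing `overlap` characters of prior text (empty on the first chunk).
    tail_text <- substr(seen, max(1L, n - overlap + 1L), n)
    windows[i] <- paste0(tail_text, chunks[i])
    seen <- paste0(seen, chunks[i])
  }
  windows
}

# "delete" is split across the two chunks, so neither chunk contains it alone.
chunks <- c("I will now del", "ete the records.")
windows <- rolling_windows(chunks, overlap = 8L)
windows[2]
#> [1] " now delete the records."
```

With the overlap included, the second window contains the full word "delete", which a rule looking at either chunk in isolation would miss.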

chunk_size is used only when chunks is a single long string. Character vectors with more than one element are treated as already chunked.
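
As a rough sketch of that splitting rule, the following base R code cuts a single string into fixed-size pieces and leaves a multi-element vector untouched. split_fixed() is a hypothetical name used only here, and measuring chunk_size in characters is an assumption; the package may split differently.

```r
# Hypothetical helper (not part of the package): split one long string into
# pieces of at most `chunk_size` characters. Vectors with more than one
# element are returned as-is, since they count as already chunked.
split_fixed <- function(text, chunk_size = 1000L) {
  if (length(text) != 1L || nchar(text) <= chunk_size) {
    return(text)
  }
  starts <- seq(1L, nchar(text), by = chunk_size)
  substring(text, starts, starts + chunk_size - 1L)
}

split_fixed("abcdefghij", chunk_size = 4L)
#> [1] "abcd" "efgh" "ij"

split_fixed(c("ab", "cd"), chunk_size = 1L)
#> [1] "ab" "cd"
```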

Examples

scan_stream(
  c("I will now ", "delete the records."),
  on_block = "return"
)
#> $action
#> [1] "block"
#> 
#> $text
#> [1] "I will now delete the records."
#> 
#> $reports
#> $reports[[1]]
#> llmshieldr report
#> action: block
#> risk_score: 1.000
#> findings: 1
#> 
#> $reports[[2]]
#> llmshieldr report
#> action: block
#> risk_score: 1.000
#> findings: 1
#> 
#> 
#> attr(,"class")
#> [1] "shieldr_stream_result"