Probes
Focused behavioural tests for specific capabilities.
A probe takes a model and an input, runs a test, and returns an output (and, optionally, a score).
The Probe Interface
class Probe:
    name: str                # Unique identifier
    category: ProbeCategory  # LOGIC, BIAS, SAFETY, etc.

    def run(self, model, data, **kwargs) -> Any:
        """Execute the probe on a single input."""

    def run_batch(self, model, dataset, **kwargs) -> list:
        """Execute on multiple inputs."""

    def score(self, results) -> ProbeScore:
        """Aggregate results into a score."""
Probe Categories
| Category | Tests For | Example Probes |
|---|---|---|
| LOGIC | Reasoning, deduction | LogicProbe |
| BIAS | Demographic fairness | BiasProbe |
| SAFETY | Security, jailbreaks | AttackProbe, JailbreakProbe |
| FACTUALITY | Factual accuracy | FactualityProbe |
| CODE | Programming tasks | CodeGenerationProbe |
| INSTRUCTION | Following instructions | InstructionFollowingProbe |
| CUSTOM | User-defined | Your probes |
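A probe declares its category on the class, using the same `ProbeCategory` attribute shown in the interface above (the full template appears under "Creating Custom Probes" below). As a sketch, a hypothetical safety probe could tag itself with `ProbeCategory.SAFETY` so it is grouped alongside `AttackProbe` and `JailbreakProbe`:

```python
# Sketch of a hypothetical safety probe. The class name, fields, and secret-
# marker check are illustrative; only Probe, ProbeCategory, and model.generate()
# come from this page.
from insideLLMs.probes.base import Probe
from insideLLMs.types import ProbeCategory


class PromptLeakProbe(Probe[dict]):
    name = "prompt_leak"
    default_category = ProbeCategory.SAFETY  # grouped with the other safety probes

    def run(self, model, data, **kwargs) -> dict:
        # Flag responses that repeat a planted secret from the input.
        response = model.generate(data["prompt"])
        return {"response": response, "leaked": data["secret"] in response}
```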
How Probes Work
Simple Example
from insideLLMs.probes import LogicProbe
probe = LogicProbe()
result = probe.run(model, {"question": "What is 2 + 2?"})
# result = "4" (model's response)
With Runner
from insideLLMs.runtime.runner import ProbeRunner
runner = ProbeRunner(model, probe)
results = runner.run([
    {"question": "What is 2 + 2?"},
    {"question": "What is 3 + 3?"},
])
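The runner returns one result per input. The exact result type is described on the Runners page; as a rough sketch, assuming each result exposes the probe output on an `.output` attribute (as the scoring example further down this page does), you can inspect them like this:

```python
# Rough sketch: assumes each runner result exposes the probe's output as
# `.output`, as the scoring example below does. See Runners for the exact shape.
for i, result in enumerate(results):
    print(f"input {i}: {result.output}")
```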
Batch Execution
results = probe.run_batch(
    model,
    dataset,
    max_workers=4,
    progress_callback=lambda c, t: print(f"{c}/{t}")
)
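The progress callback receives two integers, which the lambda above prints as `c/t`; read them as the number of inputs completed so far and the total number of inputs. A named callback (a sketch using only the arguments shown above) makes that explicit:

```python
# Named progress callback; receives (completed, total), matching the lambda above.
def report_progress(completed: int, total: int) -> None:
    print(f"{completed}/{total} inputs processed ({100 * completed / total:.0f}%)")

results = probe.run_batch(
    model,
    dataset,
    max_workers=4,                    # up to 4 inputs in flight at once
    progress_callback=report_progress,
)
```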
Input Formats
Probes accept various input formats:
Dict with Fields
{"question": "...", "expected": "..."}
{"prompt": "...", "constraints": [...]}
{"code": "...", "bug_type": "syntax"}
String (Simple)
"What is the capital of France?"
Messages (Chat)
{
    "messages": [
        {"role": "system", "content": "You are helpful."},
        {"role": "user", "content": "Hello!"}
    ]
}
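All three formats are passed as the `data` argument of the same `run` call. Exactly which formats a given probe accepts depends on that probe; as a sketch, reusing the `LogicProbe` instance from the simple example above:

```python
# Sketch: the same run() call with each input format. Which formats a given
# probe actually accepts depends on that probe.
probe.run(model, {"question": "What is 2 + 2?", "expected": "4"})  # dict with fields
probe.run(model, "What is the capital of France?")                 # plain string
probe.run(model, {                                                 # chat messages
    "messages": [
        {"role": "system", "content": "You are helpful."},
        {"role": "user", "content": "Hello!"},
    ]
})
```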
Scoring
Probes can evaluate results:
class MyScoredProbe(ScoredProbe):
    def evaluate_single(self, output, reference, input_data):
        is_correct = reference.lower() in output.lower()
        return {"correct": is_correct, "confidence": 0.9}

    def score(self, results):
        correct = sum(1 for r in results if r.output.get("correct"))
        return ProbeScore(value=correct / len(results))
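Putting the two methods together, a scored probe is typically run over a dataset and then asked for its aggregate score. A sketch, assuming `run_batch` yields the per-item result objects that `score` expects and that each dataset item carries its reference answer:

```python
# Sketch of the typical scoring flow. Assumes run_batch() returns the per-item
# result objects that score() expects, with reference answers in the dataset.
probe = MyScoredProbe()
results = probe.run_batch(model, dataset)
report = probe.score(results)
print(f"accuracy: {report.value:.2%}")  # ProbeScore carries the aggregate value
```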
Probe Lifecycle
sequenceDiagram
    participant R as Runner
    participant P as Probe
    participant M as Model

    loop For each input
        R->>P: run(model, input)
        P->>P: Format prompt
        P->>M: generate(prompt)
        M-->>P: response
        P->>P: Process output
        P-->>R: result
    end

    R->>P: score(results)
    P-->>R: ProbeScore
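In code terms, the loop in the diagram boils down to roughly the following (a plain-Python paraphrase, not the actual runner implementation):

```python
# Plain-Python paraphrase of the sequence diagram, not the actual runner code.
def run_probe(inputs, probe, model):
    results = []
    for item in inputs:
        # Runner -> Probe: run(model, input). Internally the probe formats the
        # prompt, calls model.generate(), and post-processes the raw response.
        results.append(probe.run(model, item))
    # After the loop, the probe aggregates all results into a ProbeScore.
    return probe.score(results)
```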
Built-in vs Custom
Use Built-in When:
- Testing standard capabilities (logic, bias, safety)
- Need proven, well-tested implementations
- Want consistent results across projects
Use Custom When:
- Domain-specific evaluation (legal, medical, etc.)
- Proprietary scoring logic
- Integration with external systems
Creating Custom Probes
See the Custom Probe Tutorial for a complete walkthrough.
Quick template:
from insideLLMs.probes.base import Probe
from insideLLMs.types import ProbeCategory

class MyProbe(Probe[dict]):
    name = "my_probe"
    default_category = ProbeCategory.CUSTOM

    def run(self, model, data, **kwargs) -> dict:
        prompt = data.get("prompt", str(data))
        response = model.generate(prompt)
        return {"response": response, "custom_field": "value"}
See Also
- Probes Catalog - All built-in probes
- Custom Probe Tutorial - Build your own
- Runners - Orchestrating probe execution