Frequently Asked Questions

Quick answers to common questions. For detailed troubleshooting, see Troubleshooting.

Installation

Do I need API keys to get started?

No. Use DummyModel for offline testing:

insidellms quicktest "Hello" --model dummy

API keys are only needed for hosted providers (OpenAI, Anthropic, etc.).

What Python version do I need?

Python 3.10 or higher. Check with:

python --version

How do I install optional features?

pip install -e ".[nlp]"           # NLP features
pip install -e ".[visualisation]" # Charts and reports
pip install -e ".[all]"           # Everything

Configuration

Why can’t insideLLMs find my dataset file?

Relative paths are resolved from the config file’s directory, not your current directory.

# If config is at /project/configs/harness.yaml
dataset:
  path: ../data/test.jsonl  # Resolves to /project/data/test.jsonl

How do I use environment variables in configs?

model:
  type: openai
  args:
    api_key: ${OPENAI_API_KEY}
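
The variable just needs to be set in the shell environment that launches the run, for example:

export OPENAI_API_KEY="sk-..."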

What’s the difference between run and harness?

Command    Models     Probes     Use Case
run        Single     Single     Simple tests
harness    Multiple   Multiple   Comparisons
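
For a quick illustration, both are invoked the same way; only the config shape differs:

insidellms run config.yaml        # one model, one probe
insidellms harness config.yaml    # multiple models and/or probes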

Running

Why does --overwrite refuse to overwrite?

This is a safety guard: insideLLMs only overwrites directories that contain a .insidellms_run marker file.

Solutions:

  1. Use --overwrite with a valid run directory (see the example after this list)
  2. Delete the directory manually
  3. Use a new directory name
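
For example, a directory created by a previous run (and therefore containing the marker) can be overwritten directly:

insidellms harness config.yaml --run-dir ./baseline --overwrite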

How do I resume an interrupted run?

insidellms run config.yaml --run-dir ./my_run --resume

This continues from where it left off using existing records.jsonl.

Can I run multiple models in parallel?

Yes, with async execution:

async: true
concurrency: 10

Or via CLI: --async --concurrency 10


Models

Can I run local models?

Yes! Supported options:

Runner      Setup
Ollama      ollama pull llama3
llama.cpp   Download a GGUF model
vLLM        pip install vllm

See Local Models Guide.
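
As a rough sketch, an Ollama-backed model in a config might look like the following (the exact type and argument names here are assumptions; check the Models Catalog for the real ones):

model:
  type: ollama          # assumed type name
  args:
    model_name: llama3  # assumed argument name, matching the model pulled above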

How do I compare different models?

Use a harness config:

models:
  - type: openai
    args: {model_name: gpt-4o}
  - type: anthropic
    args: {model_name: claude-3-5-sonnet-20241022}

What models are supported?

OpenAI, Anthropic, Google/Gemini, Cohere, HuggingFace, Ollama, vLLM, llama.cpp, and custom implementations. See Models Catalog.


Cost & Performance

How do I reduce API costs?

  1. Limit examples: max_examples: 50
  2. Enable caching: cache: {enabled: true} (combined with option 1 in the sketch after this list)
  3. Use cheaper models: start with gpt-4o-mini
  4. Test with DummyModel: no cost for framework testing
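
A minimal config sketch combining options 1 and 2, using the same keys shown elsewhere in this FAQ (their exact nesting may differ in your config):

max_examples: 50
cache:
  enabled: true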

How do I speed up runs?

async: true
concurrency: 20
cache:
  enabled: true

I’m hitting rate limits. What do I do?

rate_limit:
  enabled: true
  requests_per_minute: 60

concurrency: 5  # Lower this

See Rate Limiting Guide.


Outputs

What files does insideLLMs create?

File            Purpose
records.jsonl   Raw results
manifest.json   Run metadata
summary.json    Aggregated stats
report.html     Visual report
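
records.jsonl is JSON Lines (one JSON object per line), so a few lines of Python are enough to inspect it; the run directory name here is just an example:

import json

# Load every record from the run's raw results file
with open("./my_run/records.jsonl") as f:
    records = [json.loads(line) for line in f if line.strip()]

print(f"{len(records)} records")
print(sorted(records[0].keys()))  # see which fields each record carries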

How do I keep outputs out of ~/.insidellms?

# Per-run
insidellms run config.yaml --run-dir ./my_output

# Global default
export INSIDELLMS_RUN_ROOT=./runs

How do I generate just a report?

insidellms report ./my_run

CI Integration

How do I detect behavioural changes in CI?

insidellms diff ./baseline ./candidate --fail-on-changes

Exit code 1 = changes detected, 0 = identical.
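
A minimal CI step sketch (GitHub Actions syntax assumed; any CI system that checks exit codes works the same way), with the baseline committed as described below:

# Assumes the repository contains config.yaml and a committed ./baseline run
- name: Behavioural diff against baseline
  run: |
    insidellms harness config.yaml --run-dir ./candidate
    insidellms diff ./baseline ./candidate --fail-on-changes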

Why do my CI runs produce different outputs?

Model responses are non-deterministic. For deterministic CI:

models:
  - type: dummy
    args:
      response: "Fixed response"

How do I update the baseline after intentional changes?

insidellms harness config.yaml --run-dir ./baseline --overwrite
git add ./baseline
git commit -m "Update baseline: [describe changes]"

Troubleshooting

“command not found: insidellms”

Activate your virtual environment:

source .venv/bin/activate

Or run it as a module: python -m insideLLMs.cli

“Invalid API key”

  1. Check key format (OpenAI: sk-..., Anthropic: sk-ant-...)
  2. Verify key in provider dashboard
  3. Ensure env var is set: echo $OPENAI_API_KEY

How do I turn off coloured output?

export NO_COLOR=1

Where can I find example datasets?

  • data/ directory in the repo
  • benchmarks/ for standard benchmarks
  • insideLLMs.benchmark_datasets for built-in datasets
  • HuggingFace datasets via config

Advanced

Can I create custom probes?

Yes! See Custom Probe Tutorial.

from insideLLMs.probes.base import Probe

class MyProbe(Probe):
    def run(self, model, data, **kwargs):
        return model.generate(data["prompt"])

Can I create custom models?

Yes! Implement the Model interface:

from insideLLMs.models.base import Model

class MyModel(Model):
    def generate(self, prompt: str, **kwargs) -> str:
        return "response"

How do I integrate with LangChain?

See LangChain and LangGraph.


Getting Help