Provider Setup Guide

This guide walks you through setting up different model providers with insideLLMs.

Quick Start

For offline testing (no API keys needed):

from insideLLMs import DummyModel

model = DummyModel()
response = model.generate("Hello, world!")
print(response)
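
Because DummyModel needs no network access, it also works well in unit tests. A minimal sketch, assuming only that generate() returns a string:

from insideLLMs import DummyModel

def test_generate_returns_text():
    model = DummyModel()
    response = model.generate("Hello, world!")
    assert isinstance(response, str)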

For production use, configure one or more of the providers below.


OpenAI

Prerequisites

  • An OpenAI account with an API key (create one at https://platform.openai.com/api-keys)

Setup

1. Install dependencies:

pip install "insidellms[providers]"
# Or if already installed:
pip install openai

2. Set environment variable:

export OPENAI_API_KEY=your_api_key_here

Or in Python:

import os
os.environ["OPENAI_API_KEY"] = "your_api_key_here"

3. Use in code:

from insideLLMs import OpenAIModel

# With default model (gpt-4o-mini)
model = OpenAIModel()

# With specific model
model = OpenAIModel(model_name="gpt-4o")

# With custom temperature
model = OpenAIModel(
    model_name="gpt-4o",
    temperature=0.7,
    max_tokens=1000
)

# Generate response
response = model.generate("What is 2+2?")
print(response)

Available Models

  • gpt-4o - Latest GPT-4 Omni model
  • gpt-4o-mini - Cost-effective GPT-4 Omni (default)
  • gpt-4-turbo - GPT-4 Turbo
  • gpt-3.5-turbo - Legacy GPT-3.5

Rate Limits

  • Free tier: Very limited (for testing only)
  • Paid tier: Varies by model (see OpenAI pricing)
  • Use insideLLMs.rate_limiting for automatic rate limit handling

Troubleshooting

Error: “Incorrect API key provided”

  • Check your API key is correct
  • Ensure OPENAI_API_KEY environment variable is set
  • Verify your API key hasn’t been revoked

Error: “Rate limit exceeded”

Throttle your requests with the built-in rate limiter:

from insideLLMs import OpenAIModel, RateLimiter

rate_limiter = RateLimiter(requests_per_minute=10)
model = OpenAIModel(rate_limiter=rate_limiter)

Error: “Insufficient quota”

  • Check usage and billing at https://platform.openai.com/usage
  • Add credit to your account or upgrade your plan

Anthropic (Claude)

Prerequisites

  • An Anthropic account with an API key (create one at https://console.anthropic.com)

Setup

1. Install dependencies:

pip install "insidellms[providers]"
# Or:
pip install anthropic

2. Set environment variable:

export ANTHROPIC_API_KEY=your_api_key_here

3. Use in code:

from insideLLMs import AnthropicModel

# With default model (claude-3-5-sonnet-20241022)
model = AnthropicModel()

# With specific model
model = AnthropicModel(model_name="claude-3-opus-20240229")

# With custom parameters
model = AnthropicModel(
    model_name="claude-3-5-sonnet-20241022",
    temperature=0.7,
    max_tokens=2000
)

response = model.generate("Explain quantum computing")
print(response)

Available Models

  • claude-3-5-sonnet-20241022 - Latest Claude 3.5 Sonnet (default)
  • claude-3-opus-20240229 - Most capable Claude 3 model
  • claude-3-sonnet-20240229 - Balanced performance
  • claude-3-haiku-20240307 - Fastest, most cost-effective

Rate Limits

Limits vary by usage tier; see Anthropic's rate limit documentation for current values.
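
A minimal sketch, assuming AnthropicModel accepts the same rate_limiter argument shown for OpenAIModel above:

from insideLLMs import AnthropicModel, RateLimiter

# Assumed: the rate_limiter kwarg mirrors the OpenAI example
rate_limiter = RateLimiter(requests_per_minute=10)
model = AnthropicModel(rate_limiter=rate_limiter)
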
Troubleshooting

Error: “Invalid API key”

  • Ensure the ANTHROPIC_API_KEY environment variable is set
  • Verify the key in your Anthropic console and that it hasn't been revoked

Error: “Model not found”

  • Check you’re using the full model identifier (e.g., claude-3-5-sonnet-20241022)
  • Some models require specific access permissions

Ollama (Local Models)

Prerequisites

  • A machine with enough RAM (and ideally a GPU) for your chosen model (local setup)
  • An Ollama Cloud API key (cloud setup only)

Setup (Local)

1. Install Ollama:

# macOS/Linux
curl -fsSL https://ollama.ai/install.sh | sh

# Or download from https://ollama.ai/download

2. Pull a model:

ollama pull llama3.2
ollama pull mistral
ollama pull qwen2.5

3. Start Ollama server:

ollama serve
# Server runs on http://localhost:11434 by default
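
To verify the server is reachable, query its tags endpoint, which lists the models you have pulled:

curl http://localhost:11434/api/tags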

4. Use in code:

from insideLLMs import OllamaModel

# Local Ollama instance
model = OllamaModel(
    model_name="llama3.2",
    base_url="http://localhost:11434"
)

response = model.generate("Write a haiku about coding")
print(response)

Setup (Ollama Cloud)

1. Get an API key from Ollama Cloud

2. Set environment variable:

export OLLAMA_API_KEY=your_api_key_here

3. Use in code:

import os

from insideLLMs import OllamaModel

model = OllamaModel(
    model_name="deepseek-r1:7b",
    base_url="https://api.ollama.cloud",
    api_key=os.getenv("OLLAMA_API_KEY")
)

response = model.generate("Explain machine learning")
print(response)

Available Models

  • llama3.2 - Meta’s Llama 3.2
  • mistral - Mistral AI’s flagship model
  • qwen2.5 - Alibaba’s Qwen 2.5
  • deepseek-r1:7b - DeepSeek reasoning model

Troubleshooting

Error: “Connection refused”

  • Ensure Ollama server is running: ollama serve
  • Check server URL (default: http://localhost:11434)

Error: “Model not found”

  • Pull the model first: ollama pull <model_name>
  • List available models: ollama list

HuggingFace

Prerequisites

  • Enough RAM (CPU) or VRAM (GPU) for your chosen model (see System Requirements below)
  • (Optional) A HuggingFace account and access token for gated models

Setup

1. Install dependencies:

pip install "insidellms[huggingface]"
# Or:
pip install transformers torch

2. (Optional) Set HF token for gated models:

export HF_TOKEN=your_token_here

3. Use in code:

from insideLLMs import HuggingFaceModel

# Small model for testing
model = HuggingFaceModel(model_name="gpt2")

# Larger instruction-tuned model
model = HuggingFaceModel(
    model_name="meta-llama/Llama-3.2-3B-Instruct",
    device="cuda",  # or "cpu"
    max_length=512
)

response = model.generate("What is Python?", max_length=100)
print(response)

Available Models

  • gpt2 - Small, fast, good for testing
  • meta-llama/Llama-3.2-3B-Instruct - Llama 3.2 3B (requires acceptance)
  • google/flan-t5-base - T5-based instruction model
  • mistralai/Mistral-7B-Instruct-v0.2 - Mistral 7B

System Requirements

  • CPU models: 8GB+ RAM recommended
  • GPU models: CUDA-capable GPU, 8GB+ VRAM for 7B models
  • Use device="cpu" for CPU-only inference (or auto-detect, as in the sketch below)
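
A minimal sketch for picking the device automatically, using the same device argument shown above:

import torch

from insideLLMs import HuggingFaceModel

# Fall back to CPU when no CUDA-capable GPU is available
device = "cuda" if torch.cuda.is_available() else "cpu"
model = HuggingFaceModel(model_name="gpt2", device=device)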

Troubleshooting

Error: “Model requires authentication”

  • Create HuggingFace account
  • Accept model license on model page
  • Set HF_TOKEN environment variable

Error: “CUDA out of memory”

# Use smaller model or CPU
model = HuggingFaceModel(model_name="gpt2", device="cpu")

# Or use quantization (requires bitsandbytes)
model = HuggingFaceModel(
    model_name="meta-llama/Llama-3.2-3B-Instruct",
    load_in_8bit=True
)

OpenRouter

Prerequisites

  • An OpenRouter account with an API key (create one at https://openrouter.ai/keys)

Setup

1. Set environment variable:

export OPENROUTER_API_KEY=your_api_key_here

2. Use in code:

import os

from insideLLMs import OpenRouterModel

model = OpenRouterModel(
    model_name="anthropic/claude-3-5-sonnet",
    api_key=os.getenv("OPENROUTER_API_KEY")
)

response = model.generate("Summarise machine learning")
print(response)

Available Models

OpenRouter provides access to 100+ models from multiple providers:

  • anthropic/claude-3-5-sonnet
  • openai/gpt-4o
  • google/gemini-2.0-flash-exp
  • meta-llama/llama-3.2-90b-vision-instruct

See full list: https://openrouter.ai/models
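
Because every model is reached through the same OpenRouterModel class, comparing providers is just a loop over model names. A minimal sketch using the identifiers listed above:

import os

from insideLLMs import OpenRouterModel

prompt = "Summarise machine learning in one sentence."

# One API key, many upstream providers
for name in ["anthropic/claude-3-5-sonnet", "openai/gpt-4o"]:
    model = OpenRouterModel(model_name=name, api_key=os.getenv("OPENROUTER_API_KEY"))
    print(f"=== {name} ===")
    print(model.generate(prompt))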


Gemini (Google)

Prerequisites

  • A Google API key (create one in Google AI Studio at https://aistudio.google.com)

Setup

1. Install dependencies:

pip install "insidellms[providers]"
# Or:
pip install google-generativeai

2. Set environment variable:

export GOOGLE_API_KEY=your_api_key_here

3. Use in code:

import os

from insideLLMs import GeminiModel

model = GeminiModel(
    model_name="gemini-2.0-flash-exp",
    api_key=os.getenv("GOOGLE_API_KEY")
)

response = model.generate("Explain neural networks")
print(response)

Available Models

  • gemini-2.0-flash-exp - Latest Gemini 2.0 Flash (experimental)
  • gemini-1.5-pro - Gemini 1.5 Pro
  • gemini-1.5-flash - Fast and efficient

Provider Comparison

Provider      Best For                          Cost                   Setup Difficulty   Local Option
OpenAI        Production quality, reliability   $$$                    Easy               No
Anthropic     Long context, reasoning           $$$                    Easy               No
Ollama        Privacy, local inference          Free                   Medium             Yes
HuggingFace   Customisation, research           Free (compute costs)   Medium-Hard        Yes
OpenRouter    Multi-model access, flexibility   Variable               Easy               No
Gemini        Google ecosystem integration      $$                     Easy               No
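
To switch providers from configuration at runtime, a small factory keeps the choice in one place. A minimal sketch using only the constructors shown in this guide (PROVIDERS and make_model are illustrative names, not part of insideLLMs):

from insideLLMs import AnthropicModel, OllamaModel, OpenAIModel

# Illustrative helper: map a config string to a provider constructor
PROVIDERS = {
    "openai": lambda: OpenAIModel(model_name="gpt-4o-mini"),
    "anthropic": lambda: AnthropicModel(model_name="claude-3-5-sonnet-20241022"),
    "ollama": lambda: OllamaModel(model_name="llama3.2", base_url="http://localhost:11434"),
}

def make_model(provider: str):
    return PROVIDERS[provider]()

model = make_model("openai")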

Multi-Provider Configuration

Using multiple providers in a single harness:

from insideLLMs import (
    OpenAIModel,
    AnthropicModel,
    OllamaModel,
    ProbeRunner,
    LogicProbe
)

models = [
    OpenAIModel(model_name="gpt-4o-mini"),
    AnthropicModel(model_name="claude-3-5-sonnet-20241022"),
    OllamaModel(model_name="llama3.2", base_url="http://localhost:11434")
]

probe = LogicProbe()
test_data = ["What is 5 + 3?", "If A implies B, and A is true, what about B?"]

for model in models:
    print(f"\n=== Testing {model.model_name} ===")
    runner = ProbeRunner(model, probe)
    results = runner.run(test_data)
    for result in results:
        print(f"  {result}")
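
If one provider is misconfigured (a missing key, an unreachable Ollama server), the loop above stops at the first exception. A variant that skips failing providers instead:

for model in models:
    print(f"\n=== Testing {model.model_name} ===")
    try:
        results = ProbeRunner(model, probe).run(test_data)
    except Exception as exc:  # e.g. authentication or connection errors
        print(f"  Skipped: {exc}")
        continue
    for result in results:
        print(f"  {result}")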

Environment Variable Summary

Provider       Variable             Example
OpenAI         OPENAI_API_KEY       sk-proj-...
Anthropic      ANTHROPIC_API_KEY    sk-ant-...
Ollama Cloud   OLLAMA_API_KEY       ollama_...
HuggingFace    HF_TOKEN             hf_...
OpenRouter     OPENROUTER_API_KEY   sk-or-...
Gemini         GOOGLE_API_KEY       AIza...

Setting multiple at once:

export OPENAI_API_KEY=your_openai_key
export ANTHROPIC_API_KEY=your_anthropic_key
export OLLAMA_API_KEY=your_ollama_key

Or use a .env file:

# .env
OPENAI_API_KEY=your_openai_key
ANTHROPIC_API_KEY=your_anthropic_key
OLLAMA_API_KEY=your_ollama_key

Then load with python-dotenv:

from dotenv import load_dotenv
load_dotenv()  # Loads .env file

# Now environment variables are set
from insideLLMs import OpenAIModel
model = OpenAIModel()  # Uses OPENAI_API_KEY from .env
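
To check at a glance which providers are configured, test the variables from the summary table (plain standard library, no insideLLMs API assumed):

import os

# Variables from the summary table above
KEYS = [
    "OPENAI_API_KEY", "ANTHROPIC_API_KEY", "OLLAMA_API_KEY",
    "HF_TOKEN", "OPENROUTER_API_KEY", "GOOGLE_API_KEY",
]

for key in KEYS:
    print(f"{key}: {'set' if os.getenv(key) else 'missing'}")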

Next Steps