Ollama (Local Models)
Ollama lets you run large language models locally on your own machine, giving you privacy, control, and freedom from API costs. This guide covers how to use Ollama through EasilyAI.
Getting Started
Installation
First, install Ollama on your system:
macOS:
# Download from https://ollama.ai or use Homebrew
brew install ollama
Linux:
curl -fsSL https://ollama.ai/install.sh | sh
Windows: Download the installer from ollama.ai
Start Ollama Service
# Start Ollama service
ollama serve
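Before making requests, you can confirm from Python that the service is reachable. This is a minimal sketch, assuming Ollama is listening on its default local address (http://localhost:11434):
import urllib.request

def ollama_is_running(url="http://localhost:11434"):
    """Return True if the local Ollama server responds at the given URL."""
    try:
        with urllib.request.urlopen(url, timeout=2) as response:
            return response.status == 200
    except OSError:
        # Connection refused or timed out: the service is not running
        return False

print(ollama_is_running())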
Pull a Model
# Pull a model (e.g., Llama 2)
ollama pull llama2
# See available models
ollama list
Basic Usage with EasilyAI
from easilyai import create_app
# Create Ollama app (no API key needed)
app = create_app("Ollama", "ollama", "", "llama2")
# Generate text
response = app.request("Explain machine learning in simple terms")
print(response)
Available Models
Ollama supports many open-source models. Here are some popular options:
Llama 2
- Model IDs: llama2 (7B), llama2:13b, llama2:70b
- Best for: General-purpose tasks, instruction following
- Strengths: Well-balanced, good performance
# Pull different sizes
ollama pull llama2 # 7B parameters
ollama pull llama2:13b # 13B parameters
ollama pull llama2:70b # 70B parameters (requires more RAM)
# Use different Llama 2 variants
llama2_7b = create_app("Llama7B", "ollama", "", "llama2")
llama2_13b = create_app("Llama13B", "ollama", "", "llama2:13b")
Code Llama
- Model IDs: codellama, codellama:13b, codellama:34b
- Best for: Code generation and programming tasks
- Strengths: Specialized for coding tasks
ollama pull codellama
code_app = create_app("CodeLlama", "ollama", "", "codellama")
code_response = code_app.request("Write a Python function to calculate fibonacci numbers")
Mistral
- Model IDs: mistral, mistral:7b
- Best for: Tasks where response speed matters
- Strengths: Strong performance relative to its small size
ollama pull mistral
mistral_app = create_app("Mistral", "ollama", "", "mistral")
response = mistral_app.request("What are the benefits of renewable energy?")
Neural Chat
- Model ID: neural-chat
- Best for: Conversational AI
- Strengths: Optimized for chat applications
ollama pull neural-chat
chat_app = create_app("NeuralChat", "ollama", "", "neural-chat")
chat_response = chat_app.request("Let's discuss the future of artificial intelligence")
Dolphin Models
- Model IDs: dolphin-mistral, dolphin-llama2
- Best for: Uncensored, helpful responses
- Strengths: Less restricted outputs
ollama pull dolphin-mistral
dolphin_app = create_app("Dolphin", "ollama", "", "dolphin-mistral")
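As with the other models, requests go through the same interface; for example:
dolphin_response = dolphin_app.request("Summarize the arguments for and against open-sourcing large models")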
Parameters
Text Generation Parameters
response = app.request(
"Write a creative story",
temperature=0.8, # Controls randomness (0.0 to 2.0)
num_predict=500, # Number of tokens to predict
top_p=0.9, # Nucleus sampling
top_k=40, # Top-k sampling
repeat_penalty=1.1 # Penalty for repetition
)
Advanced Parameters
# Fine-tune model behavior
advanced_response = app.request(
"Explain quantum computing",
temperature=0.7,
num_predict=1000,
top_p=0.95,
top_k=50,
repeat_penalty=1.05,
presence_penalty=0.5,
frequency_penalty=0.5
)
Use Cases
Local Development
Perfect for development without API costs:
from easilyai import create_app
dev_app = create_app("LocalDev", "ollama", "", "codellama")
# Code generation
function_code = dev_app.request(
"Write a Python class for a simple todo list with add, remove, and list methods"
)
# Code review
review = dev_app.request(
"Review this code for improvements:\n"
"def bubble_sort(arr):\n"
" for i in range(len(arr)):\n"
" for j in range(len(arr)-1):\n"
" if arr[j] > arr[j+1]:\n"
" arr[j], arr[j+1] = arr[j+1], arr[j]\n"
" return arr"
)
# Documentation
docs = dev_app.request(
"Write documentation for this function:\n"
"def calculate_distance(lat1, lon1, lat2, lon2):\n"
" # Implementation here\n"
" pass"
)
Private AI Assistant
Run a personal AI assistant locally:
from easilyai import create_app
assistant_app = create_app("LocalAssistant", "ollama", "", "llama2")
# Personal planning
planning = assistant_app.request(
"Help me plan a productive workday. I have meetings at 10 AM and 3 PM, "
"and I need to finish a presentation."
)
# Learning assistance
learning = assistant_app.request(
"I'm learning React. Can you explain the concept of hooks and give me "
"a simple example?"
)
# Creative writing
creative = assistant_app.request(
"Help me brainstorm ideas for a short story about time travel"
)
Content Creation
Generate content without sending data to external services:
from easilyai import create_app
content_app = create_app("ContentCreator", "ollama", "", "mistral")
# Blog posts
blog_post = content_app.request(
"Write a 500-word blog post about the benefits of local AI models"
)
# Social media content
social_content = content_app.request(
"Create 5 engaging social media posts about sustainable technology"
)
# Email drafts
email_draft = content_app.request(
"Draft a professional email to introduce our new software product to potential clients"
)
Research and Analysis
Analyze data and documents privately:
from easilyai import create_app
research_app = create_app("LocalResearch", "ollama", "", "llama2:13b")
# Document analysis
analysis = research_app.request(
"Analyze this market research data and provide insights:\n"
"[Your private data here]"
)
# Summarization
summary = research_app.request(
"Summarize the key points from this internal meeting transcript:\n"
"[Private transcript here]"
)
# Trend analysis
trends = research_app.request(
"Based on this sales data, what trends do you see?\n"
"[Private sales data here]"
)
Model Management
Listing Models
import subprocess
def list_ollama_models():
"""List all downloaded Ollama models"""
result = subprocess.run(['ollama', 'list'], capture_output=True, text=True)
return result.stdout
print(list_ollama_models())
Model Information
def get_model_info(model_name):
"""Get information about a specific model"""
result = subprocess.run(['ollama', 'show', model_name], capture_output=True, text=True)
return result.stdout
print(get_model_info("llama2"))
Switching Models
from easilyai import create_app
# Switch between models for different tasks
code_model = create_app("Coder", "ollama", "", "codellama")
chat_model = create_app("Chat", "ollama", "", "neural-chat")
general_model = create_app("General", "ollama", "", "llama2")
# Use appropriate model for each task
code_response = code_model.request("Write a sorting algorithm")
chat_response = chat_model.request("Let's discuss AI ethics")
general_response = general_model.request("Explain photosynthesis")
Performance Optimization
Hardware Considerations
import psutil
def check_system_resources():
"""Check if system can handle larger models"""
ram_gb = psutil.virtual_memory().total / (1024**3)
cpu_count = psutil.cpu_count()
print(f"RAM: {ram_gb:.1f} GB")
print(f"CPU cores: {cpu_count}")
# Model recommendations based on RAM
if ram_gb >= 32:
print("Can run: llama2:70b, codellama:34b")
elif ram_gb >= 16:
print("Can run: llama2:13b, codellama:13b")
else:
print("Recommended: llama2, mistral, codellama")
check_system_resources()
Model Selection by Task
from easilyai import create_app
class LocalAIManager:
def __init__(self):
self.models = {
"code": create_app("Code", "ollama", "", "codellama"),
"chat": create_app("Chat", "ollama", "", "neural-chat"),
"general": create_app("General", "ollama", "", "llama2"),
"fast": create_app("Fast", "ollama", "", "mistral")
}
def request(self, prompt, task_type="general", **kwargs):
"""Route request to appropriate model"""
if task_type not in self.models:
task_type = "general"
return self.models[task_type].request(prompt, **kwargs)
# Usage
ai_manager = LocalAIManager()
# Automatically use best model for task
code_response = ai_manager.request("Write a web scraper", task_type="code")
chat_response = ai_manager.request("How are you today?", task_type="chat")
quick_response = ai_manager.request("What is 2+2?", task_type="fast")
Temperature Control
from easilyai import create_app
app = create_app("TempControl", "ollama", "", "llama2")
# Consistent, factual responses
factual = app.request(
"What is the capital of Japan?",
temperature=0.1
)
# Balanced responses
balanced = app.request(
"Explain machine learning",
temperature=0.5
)
# Creative responses
creative = app.request(
"Write a poem about the ocean",
temperature=0.9
)
Troubleshooting
Common Issues
from easilyai import create_app
from easilyai.exceptions import EasilyAIException
def safe_ollama_request(prompt, model="llama2"):
"""Make a safe Ollama request with error handling"""
try:
app = create_app("Ollama", "ollama", "", model)
return app.request(prompt)
except EasilyAIException as e:
error_msg = str(e).lower()
if "connection" in error_msg:
return "Error: Ollama service not running. Start with 'ollama serve'"
elif "model" in error_msg:
return f"Error: Model {model} not found. Pull with 'ollama pull {model}'"
else:
return f"Error: {e}"
# Usage
response = safe_ollama_request("Hello!")
print(response)
Model Loading Issues
import subprocess
from easilyai import create_app
def ensure_model_available(model_name):
"""Ensure a model is available, pull if necessary"""
# Check if model exists
result = subprocess.run(['ollama', 'list'], capture_output=True, text=True)
if model_name not in result.stdout:
print(f"Model {model_name} not found. Downloading...")
        pulled = subprocess.run(['ollama', 'pull', model_name])
        if pulled.returncode != 0:
            print(f"Failed to download {model_name}")
            return False
        print(f"Model {model_name} downloaded successfully")
return True
# Ensure model is available before use
ensure_model_available("llama2")
app = create_app("Ollama", "ollama", "", "llama2")
Performance Issues
def optimize_ollama_performance():
"""Tips for optimizing Ollama performance"""
tips = [
"Use smaller models (7B) for faster responses",
"Increase system RAM for larger models",
"Use SSD storage for better model loading",
"Close other applications to free up resources",
"Use GPU acceleration if available (CUDA/Metal)"
]
for tip in tips:
print(f"• {tip}")
optimize_ollama_performance()
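To see how these tips play out on your hardware, you can time the same prompt against different local models. This is only a rough wall-clock sketch built on the EasilyAI interface used throughout this guide (note that the first request to a model also includes its load time):
import time
from easilyai import create_app

def time_model(model_name, prompt):
    """Measure wall-clock time for one request to a local model."""
    app = create_app("Benchmark", "ollama", "", model_name)
    start = time.perf_counter()
    response = app.request(prompt)
    elapsed = time.perf_counter() - start
    print(f"{model_name}: {elapsed:.1f}s, {len(response)} characters")
    return elapsed

for model in ["mistral", "llama2"]:
    time_model(model, "Summarize the benefits of local AI models in two sentences.")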
Batch Processing
Processing Multiple Requests
from easilyai import create_app
import time
def batch_process_local(prompts, model="llama2"):
"""Process multiple prompts with local model"""
app = create_app("BatchOllama", "ollama", "", model)
results = []
for i, prompt in enumerate(prompts):
print(f"Processing {i+1}/{len(prompts)}")
try:
response = app.request(prompt)
results.append({"prompt": prompt, "response": response, "success": True})
except Exception as e:
results.append({"prompt": prompt, "error": str(e), "success": False})
# Small delay to prevent overwhelming the system
time.sleep(0.1)
return results
# Usage
prompts = [
"What is Python?",
"Explain machine learning",
"Write a hello world program"
]
results = batch_process_local(prompts, "codellama")
for result in results:
if result["success"]:
print(f"✓ {result['prompt']}: {result['response'][:50]}...")
else:
print(f"✗ {result['prompt']}: {result['error']}")
Best Practices
1. Choose the Right Model
# For code tasks
code_app = create_app("Code", "ollama", "", "codellama")
# For general conversation
chat_app = create_app("Chat", "ollama", "", "neural-chat")
# For quick responses
fast_app = create_app("Fast", "ollama", "", "mistral")
2. Manage System Resources
import psutil
def check_resources_before_request():
"""Check system resources before making requests"""
ram_usage = psutil.virtual_memory().percent
cpu_usage = psutil.cpu_percent(interval=1)
if ram_usage > 90:
print("Warning: High RAM usage. Consider using a smaller model.")
if cpu_usage > 90:
print("Warning: High CPU usage. Wait before making more requests.")
return ram_usage < 90 and cpu_usage < 90
# Check before making requests
if check_resources_before_request():
response = app.request("Your prompt here")
3. Use Appropriate Parameters
# For consistent outputs
consistent = app.request("Explain AI", temperature=0.1)
# For creative outputs
creative = app.request("Write a story", temperature=0.8)
# For longer responses
detailed = app.request("Detailed analysis", num_predict=1000)
4. Privacy and Security
Ollama keeps everything local, but consider:
- Regularly update models for security patches (see the sketch after this list)
- Be aware of model capabilities and limitations
- Monitor system resources and performance
- Keep sensitive data processing local
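For the first point, one way to keep local models current is to re-pull everything you have installed, since `ollama pull` fetches the latest version of a model you already have. A minimal sketch, assuming the Ollama CLI is on your PATH:
import subprocess

def update_installed_models():
    """Re-pull every installed model so it matches the latest published version."""
    listing = subprocess.run(['ollama', 'list'], capture_output=True, text=True)
    # Skip the header row; the first column of each row is the model name
    for line in listing.stdout.splitlines()[1:]:
        if not line.strip():
            continue
        model = line.split()[0]
        print(f"Updating {model}...")
        subprocess.run(['ollama', 'pull', model])

update_installed_models()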
Advantages of Local Models
Privacy
- No data sent to external servers
- Complete control over your information
- Compliance with data protection regulations
Cost
- No API fees or usage limits
- One-time setup cost for hardware
- Unlimited usage once installed
Control
- Choose specific models for your needs
- Customize parameters without restrictions
- No rate limits or quotas
Availability
- Works offline
- No dependency on external services
- Consistent availability
Limitations
Performance
- Generally slower than cloud APIs
- Limited by local hardware
- Larger models require significant RAM
Model Selection
- Limited to open-source models
- May not have latest cutting-edge capabilities
- Updates require manual model downloads
Maintenance
- Requires system administration
- Model management and updates
- Hardware requirements planning
Ollama provides an excellent way to run AI models locally, offering privacy, control, and cost savings for many use cases. It's particularly valuable for development, private data processing, and scenarios where data security is paramount.