OLLAMA_API_KEY and start using powerful models instantly.
Key Features
- Dual Deployment Options: Choose between local hosting for privacy and control, or cloud hosting for scalability
- Seamless Switching: Easy transition between local and cloud deployments with minimal code changes
- Auto-configuration: When using an API key, the host automatically defaults to Ollama Cloud
- Wide Model Support: Access to an extensive library of open-source models, including GPT-OSS, Llama, Qwen, DeepSeek, and Phi
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| id | str | "llama3.2" | The name of the Ollama model to use |
| name | str | "Ollama" | The name of the model |
| provider | str | "Ollama" | The provider of the model |
| host | str | "http://localhost:11434" | The host URL for the Ollama server |
| timeout | Optional[int] | None | Request timeout in seconds |
| format | Optional[str] | None | The format to return the response in (e.g., "json") |
| options | Optional[Dict[str, Any]] | None | Additional model options (temperature, top_p, etc.) |
| keep_alive | Optional[Union[float, str]] | None | How long to keep the model loaded (e.g., "5m", or 3600 for seconds) |
| template | Optional[str] | None | The prompt template to use |
| system | Optional[str] | None | The system message to use |
| raw | Optional[bool] | None | Whether to return the raw response without formatting |
| stream | bool | True | Whether to stream the response |
| retries | int | 0 | Number of retries to attempt before raising a ModelProviderError |
| delay_between_retries | int | 1 | Delay between retries, in seconds |
| exponential_backoff | bool | False | If True, the delay between retries doubles each time |
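To make the parameters concrete, here is a minimal sketch of how they might map onto a request body for Ollama's REST API (`POST /api/generate`), together with the retry-delay rule from the last three rows. The `build_payload` and `retry_delay` helpers are hypothetical, not the library's actual internals; the payload fields (`model`, `format`, `options`, `keep_alive`, `system`, `raw`, `stream`) follow Ollama's documented API, and the parameter names mirror the table above.

```python
from typing import Any, Dict, Optional, Union


def build_payload(
    prompt: str,
    id: str = "llama3.2",              # names shadow builtins to match the table
    format: Optional[str] = None,
    options: Optional[Dict[str, Any]] = None,
    keep_alive: Optional[Union[float, str]] = None,
    system: Optional[str] = None,
    raw: Optional[bool] = None,
    stream: bool = True,
) -> Dict[str, Any]:
    """Assemble a request body for Ollama's /api/generate endpoint,
    omitting any parameter left at None."""
    payload: Dict[str, Any] = {"model": id, "prompt": prompt, "stream": stream}
    for key, value in [
        ("format", format), ("options", options),
        ("keep_alive", keep_alive), ("system", system), ("raw", raw),
    ]:
        if value is not None:
            payload[key] = value
    return payload


def retry_delay(attempt: int, delay_between_retries: int = 1,
                exponential_backoff: bool = False) -> float:
    """Seconds to wait before retry number `attempt` (0-based): a fixed
    delay, or one that doubles on each attempt when backoff is enabled."""
    if exponential_backoff:
        return delay_between_retries * (2 ** attempt)
    return delay_between_retries


payload = build_payload("Why is the sky blue?",
                        options={"temperature": 0.2}, keep_alive="5m")
print(payload["model"], retry_delay(3, 1, True))  # llama3.2 8
```

With `exponential_backoff=True` the waits grow 1s, 2s, 4s, ... between attempts, which is gentler on an overloaded server than retrying at a fixed interval.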