Code
cookbook/11_models/vllm/basic.py
from agno.agent import Agent
from agno.models.vllm import VLLM

agent = Agent(
    model=VLLM(id="Qwen/Qwen2.5-7B-Instruct", top_k=20, enable_thinking=False),
    markdown=True,
)

agent.print_response("Share a 2 sentence horror story")
Setup vLLM Server
uv pip install vllm
python -m vllm.entrypoints.openai.api_server \
    --model Qwen/Qwen2.5-7B-Instruct \
    --port 8000
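Loading the model can take a while, so the agent script may fail if it runs before the server is ready. The following is a minimal sketch of a readiness check, assuming the server is on localhost at the port used above and exposes the OpenAI-compatible `/v1/models` route. The helper names (`models_url`, `extract_model_ids`, `served_models`) and the `BASE_URL` constant are hypothetical and not part of Agno or vLLM.

```python
import http.client
import json
import urllib.request
from typing import List, Optional

# Assumption: the server started above listens on localhost:8000.
BASE_URL = "http://localhost:8000/v1"


def models_url(base_url: str) -> str:
    """Build the model-listing URL from an OpenAI-style base URL."""
    return base_url.rstrip("/") + "/models"


def extract_model_ids(payload: dict) -> List[str]:
    """Pull model ids out of an OpenAI-style model-list response."""
    return [entry["id"] for entry in payload.get("data", []) if "id" in entry]


def served_models(base_url: str = BASE_URL, timeout: float = 2.0) -> Optional[List[str]]:
    """Return the served model ids, or None if the server cannot be reached."""
    try:
        with urllib.request.urlopen(models_url(base_url), timeout=timeout) as resp:
            return extract_model_ids(json.load(resp))
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error, or a non-JSON body.
        return None


if __name__ == "__main__":
    ids = served_models()
    if ids is None:
        print("vLLM server not reachable yet")
    else:
        print("Serving:", ids)
```

If the check prints the model id passed to `--model`, the server should be ready for the agent example in the Code section.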