Desi Vocal - Agno

Generate text-to-speech audio using Agno agents integrated with Desi Vocal, a service that specializes in high-quality, multilingual text-to-speech (TTS) and voice cloning, with a particular emphasis on Indian and Western languages.

Prerequisites

uv pip install requests
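
The example below calls both OpenAI and the Desi Vocal API, so it most likely needs credentials for each. The following pre-flight check is a minimal sketch, not part of the cookbook: the helper `missing_env_vars` is hypothetical, and the variable names `OPENAI_API_KEY` and `DESI_VOCAL_API_KEY` are assumptions that should be confirmed against your own setup.

```python
import os
from typing import Mapping, Sequence

# Assumed variable names; confirm them against the OpenAI and Desi Vocal setup you use.
REQUIRED_ENV_VARS = ("OPENAI_API_KEY", "DESI_VOCAL_API_KEY")


def missing_env_vars(
    names: Sequence[str], env: Mapping[str, str] = os.environ
) -> list[str]:
    """Return the names that are unset or blank in env, in their original order."""
    return [name for name in names if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars(REQUIRED_ENV_VARS)
    if missing:
        print("Set these environment variables before running the agent: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Running this before the agent script surfaces a missing key as a clear message rather than as an authentication error partway through a tool call.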

from agno.agent import Agent
from agno.models.openai import OpenAIChat
from agno.tools.desi_vocal import DesiVocalTools

# ---------------------------------------------------------------------------
# Create Agent
# ---------------------------------------------------------------------------


audio_agent = Agent(
    model=OpenAIChat(id="gpt-4o"),
    tools=[DesiVocalTools()],
    description="You are an AI agent that can generate audio using the DesiVocal API.",
    instructions=[
        "When the user asks you to generate audio, use the `text_to_speech` tool to generate the audio.",
        "You'll generate the appropriate prompt to send to the tool to generate audio.",
        "You don't need to find the appropriate voice first; the voice has already been specified.",
        "Return the audio file name in your response. Don't convert it to markdown.",
        "Generate the text prompt we send in Hindi.",
    ],
    markdown=True,
)

# ---------------------------------------------------------------------------
# Run Agent
# ---------------------------------------------------------------------------
if __name__ == "__main__":
    audio_agent.print_response(
        "Generate a very short audio clip on the history of the French Revolution"
    )

Run the Example

# Clone and setup repo
git clone https://github.com/agno-agi/agno.git
cd agno/cookbook/91_tools

# Create and activate virtual environment
./scripts/demo_setup.sh
source .venvs/demo/bin/activate

python desi_vocal_tools.py

For details, see the Desi Vocal cookbook.
