LiteLLM
LiteLLM is an LLM gateway, simplifying access to multiple LLM models or providers within a single proxy service. LiteLLM allows you to manage spending and rate limits across providers while configuring fallback and other access control configurations. They offer a free open-source, self-hosted code package that allows for the creation of a simple proxy service.
This guide demonstrates how to configure the Venice API within a LiteLLM proxy server.
Prerequisites
- Venice API key obtained through VVV staking or Venice Pro (more restrictive rate limits apply to Pro)
- Installed system dependencies including: “python3” and “pip”
Steps
- Grab your API key from Venice.ai (follow instructions on https://docs.venice.ai/welcome/guides/generating-api-key for more information)
- Go to https://docs.venice.ai/api-reference/endpoint/models/list and click “try it” next to GET /models to see all of the available models through the API. We recommend to use the default model as a starting place (llama-3.3-70b).
- Create new directory on your machine. For this, we will create a folder called “litellmtest” on our desktop
- Run the following command to install the LiteLLM package
- Create a file called “config.yaml“ with the following. We are using “llama-3.3-70b” as the modelID for this example.
- Start the LiteLLM server using your configuration file. We chose to have “detailed_debug” enabled to help us identify any errors throughout the process
- The process will start running. You will see the following response within the terminal, amongst other process statuses. This shows that the configuration and model have loaded properly
- Now that your configuration is working properly, we will send our first request to the server. LiteLLM allows you to utilize standard OpenAI API format. First open a new terminal window, and head over to the same directory where you ran the process. You can input this basic call to the API to confirm proper functionality
- If you check the original terminal window where the LiteLLM server is running, you will see the following response. This confirms that the request was passed properly to the server.
- The response from the Venice API will be returned in the same window in which you made the API call. Here is the response from the request above.
- Congratulations on completing your Venice API integration with LiteLLM
Getting the most out of your LiteLLM and Venice API integration
Venice's API integration with LiteLLM provides several key advantages:
- Complete privacy: Your prompts and conversations are never stored
- Uncensored responses: Receive answers without artificial restrictions
- Free ongoing inference: Through VVV staking for high-volume usage
- Multiple model access: Through a single, unified interface
For more information and support:
- Check the Venice API documentation
- Review detailed model specifications
- Join the Venice Discord for developer discussions and support
- Explore LiteLLM's documentation for advanced configurations
Venice's API access through VVV staking provides ongoing, private access to AI capabilities without per-request fees or data collection.
Back to all posts
Venice.ai