Usage Limits

Managing Usage Limits with Vortex

Vortex provides a toolkit for managing and optimising the use of Large Language Models (LLMs), helping you balance the capabilities of LLMs against the practicalities of cost management.

Setting Token-Based Consumption Limits

A cornerstone of Vortex's usage management is the ability to set token-based limits for each API key.

This lets you define a clear cap on the number of tokens an API key can consume within a set timeframe, giving you a proactive way to manage usage rates.

Insights into Token-Based Limits:

  • Model-Specific Control: Usage limits are applied per model for each API key, allowing for precise resource consumption management.
  • Mandatory Limits for Model Access: To prevent unexpected charges, access to any model is contingent upon having an established limit. Without a defined limit for a model, requests to it will be automatically blocked.
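
The two rules above can be sketched as a small accounting structure. This is an illustrative model only, not Vortex's internal implementation: a limit is stored per (API key, model) pair, a rolling window tracks consumption, and any model without a configured limit is blocked outright. All names (`UsageTracker`, `TokenLimit`, `allow`) are hypothetical.

```python
import time
from dataclasses import dataclass, field

@dataclass
class TokenLimit:
    max_tokens: int        # tokens allowed per window
    window_seconds: float  # window duration (e.g. 86400 for 1 day)

@dataclass
class UsageTracker:
    # (api_key, model) -> configured limit; absence means the model is blocked
    limits: dict = field(default_factory=dict)
    # (api_key, model) -> (window_start, tokens_used_so_far)
    usage: dict = field(default_factory=dict)

    def set_limit(self, api_key, model, max_tokens, window_seconds):
        self.limits[(api_key, model)] = TokenLimit(max_tokens, window_seconds)

    def allow(self, api_key, model, tokens, now=None):
        """Admit the request only if it fits within the model's limit."""
        limit = self.limits.get((api_key, model))
        if limit is None:
            return False  # no limit defined for this model -> request blocked
        now = time.time() if now is None else now
        start, used = self.usage.get((api_key, model), (now, 0))
        if now - start >= limit.window_seconds:
            start, used = now, 0  # window elapsed: reset the counter
        if used + tokens > limit.max_tokens:
            return False  # request would exceed the cap for this window
        self.usage[(api_key, model)] = (start, used + tokens)
        return True
```

Note how the "mandatory limits" rule falls out naturally: an absent entry in `limits` means `allow` returns `False` before any accounting happens.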

Real-World Application:

Consider a scenario where you aim to cap the daily token consumption for the gpt-4 model at 2000 tokens. When generating your API key, you would set up the limit as follows:

  • Model: gpt-4
  • Token Limit: 2000
  • Duration: 1 day
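
The three fields above could be represented as a simple limit entry; the field names here are illustrative and do not reflect Vortex's actual schema. A small helper shows how remaining headroom for the current day would be computed from such an entry:

```python
# Hypothetical limit entry matching the scenario above (names are illustrative).
limit = {"model": "gpt-4", "token_limit": 2000, "duration_seconds": 86400}

def remaining_tokens(limit, tokens_used_in_window):
    """Tokens still available in the current window (never negative)."""
    return max(0, limit["token_limit"] - tokens_used_in_window)
```

For example, after consuming 1,500 tokens of the daily cap, 500 tokens remain; once the window rolls over, the full 2,000 become available again.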

This setup gives you fine-grained control over each model's usage, so you can harness the capabilities of LLMs efficiently while keeping operational costs in check.
