FAQs


Vortex aims to be transparent about how costs are calculated and offers tools to monitor and manage your usage. Understanding the aspects of LLM cost calculation covered below will help you make informed decisions about model selection and usage patterns, and optimize your investment in AI-powered applications.

Cost data on Vortex

  • Please note that the cost data presented are close estimates, not exact figures.
  • These estimates are derived from the latest rates publicly listed for each model on the provider's website.
  • Because these rates change over time, we strive to update our data regularly to keep the cost estimates accurate.

In Vortex, information related to costs is accessible in multiple sections:

  • On the Dashboard, you can view both Cost and Average Cost/Req.
  • The Request Logs Table displays the Cost associated with each individual request.
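
As a point of reference, the Average Cost/Req metric is simply total cost divided by the number of requests in the selected period. The sketch below illustrates that aggregation over a handful of log entries; the field names are hypothetical and do not reflect Vortex's actual schema or API.

```python
# Hypothetical request log entries; the "cost" field name is
# illustrative, not Vortex's actual schema.
logs = [
    {"cost": 0.0021},
    {"cost": 0.0008},
    {"cost": 0.0043},
]

total_cost = sum(entry["cost"] for entry in logs)
avg_cost_per_req = total_cost / len(logs)

print(f"Cost: ${total_cost:.4f}")                    # Cost: $0.0072
print(f"Average Cost/Req: ${avg_cost_per_req:.4f}")  # Average Cost/Req: $0.0024
```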

General LLM Cost

The cost of using Large Language Models (LLMs) is influenced by several factors, primarily the specific model chosen and the volume of tokens processed. Each model available through Vortex comes with its unique capabilities, performance characteristics, and pricing structure.

Price Per Token

Pricing is often expressed as a cost per token, where a token can represent a piece of a word, an entire word, or sometimes even multiple words, depending on the language and context. On average, 1,000 tokens correspond to roughly 750 words, making this paragraph around 65 tokens.
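
For a back-of-the-envelope estimate, that ratio can be turned into a small helper. This is an illustrative sketch only; actual token counts depend on the model's tokenizer, so for exact figures you would count tokens with the tokenizer for your specific model.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~750 words per 1,000 tokens
    rule of thumb. Illustrative only; real counts depend on the
    model's tokenizer and the text itself."""
    words = len(text.split())
    return round(words / 0.75)  # ~0.75 words per token

print(estimate_tokens("Pricing is often expressed as a cost per token."))  # 12
```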

Model Selection

Selecting a more advanced or specialized model typically incurs higher costs. For instance, newer generations of models that offer improved accuracy, comprehension, and context handling capabilities may be priced higher than their predecessors.

Token Consumption

The total cost of a request also depends on the number of tokens processed. This includes both the tokens that make up the prompt itself (input tokens) and the tokens generated in response (output tokens). Optimizing prompt design is crucial for managing costs effectively, especially for applications with high request volumes.
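
Putting the pieces together, a request's cost is the input token count times the input rate plus the output token count times the output rate. Below is a minimal sketch of that arithmetic; the model name and per-million-token rates are placeholders, not Vortex's actual pricing data.

```python
# Hypothetical per-million-token rates in USD. Real rates come from each
# provider's published pricing; input and output tokens are usually
# billed at different rates.
RATES = {
    "example-model": {"input": 3.00, "output": 15.00},
}

def request_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Cost of one request: prompt (input) and completion (output)
    tokens are billed separately at the model's per-million-token rates."""
    rate = RATES[model]
    return (prompt_tokens * rate["input"]
            + completion_tokens * rate["output"]) / 1_000_000

# A request with a 1,200-token prompt and a 300-token completion:
print(f"${request_cost('example-model', 1_200, 300):.4f}")  # $0.0081
```

Trimming prompt length directly reduces the input term of this sum, which is why prompt optimization matters most for high-volume applications.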

