
Provide a tokenizer in Java for counting input tokens #359


Open
lukehutch opened this issue Mar 25, 2025 · 6 comments

Comments

@lukehutch

Since billing is based on tokens, it would be very helpful to be able to measure how many input and output tokens are used by a given request. I don't see documentation about how to track that. Is this possible? If so, can you please create an example of how to measure it?

@TomerAberbach
Collaborator

As far as I know, there's no way to do this in Java right now.

https://platform.openai.com/tokenizer gives a rule of thumb:

A helpful rule of thumb is that one token generally corresponds to ~4 characters of text for common English text. This translates to roughly ¾ of a word (so 100 tokens ~= 75 words).
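The heuristic above is easy to sketch in Java. This is only the ~4-characters-per-token rule of thumb from the quote, not a real tokenizer; the class and method names are made up for illustration.

```java
// Rough token estimate using the "~4 characters per token" rule of thumb
// quoted above. English-only heuristic; real counts need a proper tokenizer.
public class TokenEstimate {
    static int estimateTokens(String text) {
        // ceil(length / 4): roughly one token per 4 characters
        return (text.length() + 3) / 4;
    }

    public static void main(String[] args) {
        String prompt = "The quick brown fox jumps over the lazy dog.";
        System.out.println(estimateTokens(prompt)); // 44 chars -> 11
    }
}
```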

But the only programmatic tokenizer libraries are in Python and JS.

@kwhinnery-openai are there plans to provide support for all languages?

@IvanLuchkin

IvanLuchkin commented Mar 27, 2025

Use https://github.com/knuddelsgmbh/jtokkit, and port https://github.com/hmarr/openai-chat-tokens. If your requests include function calls or schemas, that will give you at least a ballpark value; for plain text, the results will be accurate.

Also, don't forget something along the lines of

EncodingType encType = request.getModel() == GPT4o_mini || request.getModel() == GPT4o ? O200K_BASE : CL100K_BASE;

since different models use different tokenizers. Don't rely on the rule of thumb from the documentation if you work with multiple languages: tokenizer efficiency varies drastically with the language of the text.
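The model-to-encoding choice in the snippet above can be sketched as follows. This stands alone using plain strings so it needs no dependency; with jtokkit you would pass the corresponding `EncodingType` (`O200K_BASE` / `CL100K_BASE`) to its encoding registry. The class and method names here are hypothetical.

```java
// Sketch of the model-to-encoding mapping from the comment above.
// GPT-4o-family models use o200k_base; older chat models such as
// gpt-4 and gpt-3.5-turbo use cl100k_base.
public class EncodingChooser {
    static String encodingFor(String model) {
        if (model.startsWith("gpt-4o")) {
            return "o200k_base";
        }
        return "cl100k_base";
    }

    public static void main(String[] args) {
        System.out.println(encodingFor("gpt-4o-mini")); // o200k_base
        System.out.println(encodingFor("gpt-4"));       // cl100k_base
    }
}
```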

@lukehutch
Author

Wouldn't it make sense to just make available the count that was measured by the billing engine?

@IvanLuchkin

It is available in the ResponseUsage object, but you only have access to it after the request completes.

If you need to estimate prompt token usage before sending the request (for rate-limit handling, for example), you have to use the method I described. Otherwise, it's straightforward to just use ResponseUsage or its Chat Completions API equivalent.
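For context, the usage block in a (non-streaming) Chat Completions response body carries the billed counts under `prompt_tokens`, `completion_tokens`, and `total_tokens`; the SDK's usage object is a typed view of these fields. The sketch below pulls them out of a sample body with a regex purely to show where they live; in real code you would use the SDK accessors or a JSON parser. The class and method names are made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrates where token counts sit in a Chat Completions response body.
// Regex extraction is for demonstration only -- use a real JSON parser
// (or the SDK's typed accessors) in production.
public class UsageFields {
    static long field(String json, String name) {
        Matcher m = Pattern.compile("\"" + name + "\"\\s*:\\s*(\\d+)").matcher(json);
        if (!m.find()) throw new IllegalArgumentException(name + " not found");
        return Long.parseLong(m.group(1));
    }

    public static void main(String[] args) {
        String body = "{\"usage\": {\"prompt_tokens\": 12, "
                + "\"completion_tokens\": 34, \"total_tokens\": 46}}";
        System.out.println(field(body, "prompt_tokens"));     // 12
        System.out.println(field(body, "completion_tokens")); // 34
    }
}
```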

@lukehutch
Author

I can't see how to get a ResponseUsage object from the new Responses API (in streaming mode)... Can you please give me a pointer?

@IvanLuchkin

Unfortunately, I don't have experience with streaming in the Chat Completions or Responses APIs. From what I was able to find, you should be looking at ResponseStreamEvent.completed.response.usage.
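To make the suggestion above concrete: in the streaming Responses API, usage arrives only on the final `response.completed` event, whose `response.usage` carries `input_tokens` and `output_tokens`. This dependency-free sketch scans raw event payloads for that event and reads the count; the JSON strings are simplified stand-ins and the class/method names are hypothetical.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Usage appears only on the terminal "response.completed" stream event
// (ResponseStreamEvent.completed -> response.usage). This sketch finds
// that event among raw payloads and extracts input_tokens.
public class StreamUsage {
    static long inputTokensFromEvents(List<String> eventPayloads) {
        Pattern p = Pattern.compile("\"input_tokens\"\\s*:\\s*(\\d+)");
        for (String payload : eventPayloads) {
            if (!payload.contains("\"type\": \"response.completed\"")) continue;
            Matcher m = p.matcher(payload);
            if (m.find()) return Long.parseLong(m.group(1));
        }
        throw new IllegalStateException("no response.completed event with usage");
    }

    public static void main(String[] args) {
        List<String> events = List.of(
            "{\"type\": \"response.output_text.delta\", \"delta\": \"Hi\"}",
            "{\"type\": \"response.completed\", \"response\": "
                + "{\"usage\": {\"input_tokens\": 9, \"output_tokens\": 5}}}");
        System.out.println(inputTokensFromEvents(events)); // 9
    }
}
```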

@TomerAberbach TomerAberbach changed the title Please document how to measure the number of input and output tokens Provide a tokenizer in Java for counting input tokens Apr 4, 2025