OpenAI API-compatible server (vllm) & tokenizer #559
Unanswered · micael-git asked this question in Q&A
-
Hello,

I'm currently testing gptme with a local model behind a liteLLM/vLLM stack, but I'm facing an issue with the tokenizer. It seems gptme wants to use the OpenAI tokenizer:

```
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url: /encodings/cl100k_base.tiktoken (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f5401a95450>, 'Connection to openaipublic.blob.core.windows.net timed out. (connect timeout=None)'))
```

I searched the documentation but haven't found a way to disable the tokenizer (is it used for counts and costs only?) or to point it at my own local tokenizer. Can anyone help, or point out what I'm misunderstanding?

Thank you
Replies: 1 comment

-

I found this answer on SO, which covers how to deal with this generally by keeping a copy of the tokenizer locally: https://stackoverflow.com/a/76107077/965332
You could also modify this function to not use tiktoken and instead simply do …
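For the local-copy route, here is a minimal sketch of pre-seeding tiktoken's cache so it never reaches the network: tiktoken honors the `TIKTOKEN_CACHE_DIR` environment variable and names cached files after the SHA-1 of the source URL. The `/opt/tiktoken_cache` path is just an example, and it assumes you fetched `cl100k_base.tiktoken` some other way:

```python
# Sketch: pre-seed tiktoken's local cache so it never needs to reach
# openaipublic.blob.core.windows.net. Assumes cl100k_base.tiktoken was
# downloaded on a machine with internet access and copied over.
import hashlib
import os
import shutil

blob_url = "https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken"
cache_dir = "/opt/tiktoken_cache"  # example path; any writable directory works
os.makedirs(cache_dir, exist_ok=True)

# tiktoken names cached files after the SHA-1 hex digest of the source URL
cache_key = hashlib.sha1(blob_url.encode()).hexdigest()
shutil.copy("cl100k_base.tiktoken", os.path.join(cache_dir, cache_key))

# Must be set before the encoding is loaded
os.environ["TIKTOKEN_CACHE_DIR"] = cache_dir

import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # resolved from the local cache, no network
```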
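For the tiktoken-free route, a minimal sketch assuming a rough characters-per-token heuristic is acceptable for counts and costs; `len_tokens` is a hypothetical stand-in here, not gptme's actual function:

```python
# Hypothetical fallback: skip tiktoken entirely and approximate token counts.
# ~4 characters per token is a common rule of thumb for English text, so
# counts and costs will be rough, but nothing is downloaded.
def len_tokens(text: str) -> int:
    return max(1, len(text) // 4)


print(len_tokens("Hello, how are you today?"))  # -> 6
```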