Insights: triton-inference-server/server
Overview
3 Pull requests merged by 2 people
- TPRD-1554: update readme and versions (#8244, merged Jun 11, 2025)
- Update default branch to track development for 2.60.0 / 25.07 (#8243, merged Jun 11, 2025)
- ci: fix the trtllm tests after the repo migration of trtllm backend (#8241, merged Jun 9, 2025)
2 Pull requests opened by 2 people
- feat: Add guided decoding support to OpenAI frontend (#8245, opened Jun 11, 2025)
- docs: fix capitalization of Triton Inference Server (#8252, opened Jun 13, 2025)
1 Issue closed by 1 person
- Not loaded: No model version was found (#7420, closed Jun 12, 2025)
6 Issues opened by 6 people
- Spike in Failed Inference Requests During Triton Server Shutdown (gRPC Endpoint) (#8253, opened Jun 15, 2025)
- Real latency is much higher; queue time is high (#8251, opened Jun 12, 2025)
- Triton deploys CPU service without releasing memory usage (#8250, opened Jun 12, 2025)
- CPU-only Docker base image is not available (#8249, opened Jun 12, 2025)
- How can I build the TensorRT engine for TensorFlow models in SavedModel format? (#8242, opened Jun 11, 2025)
- UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc4 in position 40692 (#8240, opened Jun 9, 2025)
2 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
- fix: Support max_completion_tokens option in OpenAI frontend (#8226, commented on Jun 12, 2025; 1 new comment; see the sketch after this list)
- GPU memory leak when loading/unloading models (#5841, commented on Jun 11, 2025; 0 new comments)
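For readers following #8226: max_completion_tokens is the OpenAI Chat Completions parameter that supersedes the deprecated max_tokens, capping how many tokens the response may generate. Below is a minimal sketch of exercising it against an OpenAI-compatible endpoint with the openai Python client; the base URL, port, and model name are assumptions for illustration, not values confirmed by this digest.

```python
from openai import OpenAI

# Assumed: an OpenAI-compatible frontend running locally on port 9000.
# The frontend may not validate the API key, but the client requires one.
client = OpenAI(base_url="http://localhost:9000/v1", api_key="unused")

completion = client.chat.completions.create(
    model="llama-3.1-8b-instruct",  # hypothetical model name; use one your server exposes
    messages=[
        {"role": "user", "content": "Summarize Triton Inference Server in one sentence."}
    ],
    max_completion_tokens=64,  # cap on generated tokens; the option #8226 adds support for
)
print(completion.choices[0].message.content)
```

If the server predates the fix, an unsupported max_completion_tokens would typically be rejected or silently ignored; comparing completion.usage.completion_tokens against the cap is a quick way to verify the option took effect.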