Insights: triton-inference-server/tensorrtllm_backend
Overview
- 1 Merged pull request
- 0 Open pull requests
- 3 Closed issues
- 4 New issues
1 Pull request merged by 1 person
- Update TensorRT-LLM backend (#742), merged Apr 23, 2025
3 Issues closed by 3 people
- docker export then import the image, run error (#738), closed Apr 25, 2025
- convert_checkpoint.py fails for the Whisper model (#741), closed Apr 24, 2025
- Inconsistent batch index order in decoupled mode with TRT-LLM and the Triton trtllm backend (#705), closed Apr 22, 2025
4 Issues opened by 4 people
- Whisper TensorRT-LLM backend drops accuracy for the small.en model (#743), opened Apr 24, 2025
- Max requests capped at 228 while load testing with Locust (#740), opened Apr 23, 2025
- No /app folder in container nvcr.io/nvidia/tritonserver:24.12-trtllm-python-py3 (#739), opened Apr 22, 2025
- Vision preprocessor not initialized for LLaVA in the Triton workflow (#737), opened Apr 18, 2025
1 Unresolved conversation
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
- Can you provide an example of a visual language model or multimodal model launched by Triton Server? (#463), commented on Apr 22, 2025 (0 new comments)