docs/quick_start_new_user.md
+4-4
@@ -7,7 +7,7 @@ type: explainer
# Trial in 30 mins (new users)
-TorchPipe is a multi-instance pipeline parallel library that provides a seamless integration between lower-level acceleration libraries (such as TensorRT and OpenCV) and RPC frameworks. It guarantees high service throughput while meeting latency requirements. This document is mainly for new users, that is, users who are in the introductory stage of acceleration-related theoretical knowledge, know some python grammar, and can read simple codes. This content mainly includes the use of torchpipe for accelerating service deployment, complemented by performance and effect comparisons. The complete code of this document can be found at [resnet50_thrift](https://github.com/torchpipe/torchpipe/-/blob/develop/examples/resnet50_thrift/)。
+TorchPipe is a multi-instance pipeline-parallel library that provides seamless integration between lower-level acceleration libraries (such as TensorRT and OpenCV) and RPC frameworks. It guarantees high service throughput while meeting latency requirements. This document is mainly for new users: those at the introductory stage of acceleration-related theory who know some Python and can read simple code. It mainly covers using TorchPipe to accelerate service deployment, complemented by performance and accuracy comparisons. The complete code for this document can be found at [resnet50_thrift](https://github.com/torchpipe/torchpipe/blob/develop/examples/resnet50_thrift/).
-The overall online service deployment can be found at [main_trt.py](https://github.com/torchpipe/torchpipe/-/blob/develop/examples/resnet50_thrift/main_trt.py)
+The overall online service deployment can be found at [main_trt.py](https://github.com/torchpipe/torchpipe/blob/develop/examples/resnet50_thrift/main_trt.py)
:::tip
Since TensorRT is not thread-safe, when using this method for model acceleration, it is necessary to handle locking (with self.lock:) during the service deployment process.
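The locking pattern this tip describes can be sketched as follows. The class and method names here are hypothetical, and the doubling computation is only a placeholder for a real `context.execute_v2(...)` call:

```python
import threading

class TrtInferServer:
    """Hypothetical sketch: serialize access to a single TensorRT
    execution context, which is not thread-safe."""

    def __init__(self, engine=None):
        self.engine = engine          # placeholder for a real TensorRT engine
        self.lock = threading.Lock()  # one lock per execution context

    def infer(self, batch):
        # Only one RPC handler thread may touch the context at a time.
        with self.lock:
            # Stands in for context.execute_v2(bindings) in a real service.
            return [x * 2 for x in batch]
```

The cost of this approach is that all requests serialize behind one lock, which motivates the multi-instance scheduling discussed later.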
@@ -104,7 +104,7 @@ From the above process, it's clear that when accelerating a single model, the fo
-We've made adjustments to the deployment of our service using TorchPipe.The overall online service deployment can be found at [main_torchpipe.py](https://github.com/torchpipe/torchpipe/-/blob/develop/examples/resnet50_thrift/main_torchpipe.py).
+We've adjusted the service deployment to use TorchPipe. The overall online service deployment can be found at [main_torchpipe.py](https://github.com/torchpipe/torchpipe/blob/develop/examples/resnet50_thrift/main_torchpipe.py).
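To illustrate why multi-instance scheduling helps, here is a minimal sketch of the idea (pure Python, not the TorchPipe API): give each model instance its own lock, so independent requests no longer serialize behind a single context.

```python
import itertools
import threading

class InstancePool:
    """Illustrative sketch (not the TorchPipe API): round-robin requests
    across several independent model instances, each with its own lock,
    instead of serializing everything behind one TensorRT context."""

    def __init__(self, instance_num=2):
        self.locks = [threading.Lock() for _ in range(instance_num)]
        self._next = itertools.cycle(range(instance_num))
        self._pick = threading.Lock()  # protects the round-robin counter

    def infer(self, batch):
        with self._pick:
            idx = next(self._next)
        # Other instances stay free to serve other threads concurrently.
        with self.locks[idx]:
            # Placeholder compute standing in for one instance's forward pass.
            return [x * 2 for x in batch]
```

With `instance_num` contexts, up to that many requests run concurrently, which is the throughput gain a multi-instance pipeline library provides over the single-lock pattern.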
docs/tools/quantization.mdx
+11-11
@@ -29,7 +29,7 @@ For detection models, you can consider using the [official complete tutorial](ht
In addition to the pre-trained parameters used for normal training, training-based quantization (QAT) also requires quantization parameters produced by post-training quantization (PTQ).
-We have integrated [calib_tools](https://github.com/torchpipe/torchpipe/-/blob/develop/examples/int8/qat/calib_tools.py) for reference.
+We have integrated [calib_tools](https://github.com/torchpipe/torchpipe/blob/develop/examples/int8/qat/calib_tools.py) for reference.
- Define calibrator:
@@ -100,17 +100,17 @@ The official training format is very simple and is only used as an example.
#### Direct Quantization without Modifying Backbone
Following the official example, we conducted step-by-step experiments on resnet:
-- Download training data: [code](https://github.com/torchpipe/torchpipe/-/blob/develop/examples/int8/qat/download_data.py)
-- Train for 10 epochs to obtain the resnet50 model: [code](https://github.com/torchpipe/torchpipe/-/blob/develop/examples/int8/qat/fp32_train.py), accuracy 98.44%
+- Download training data: [code](https://github.com/torchpipe/torchpipe/blob/develop/examples/int8/qat/download_data.py)
+- Train for 10 epochs to obtain the resnet50 model: [code](https://github.com/torchpipe/torchpipe/blob/develop/examples/int8/qat/fp32_train.py), accuracy 98.44%
The above ResNet training uses the max quantization method and does not fuse the Add layer, so TensorRT speed falls short of expectations. The following are the results after fusing Add under int8 and switching to the mse calibration mode:
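The max calibration method mentioned above can be illustrated with a minimal sketch (pure Python, not the pytorch-quantization or calib_tools API): max calibration derives the int8 scale from the largest absolute activation value seen during calibration.

```python
class MaxCalibrator:
    """Illustrative sketch (not a real library API): track the maximum
    absolute value ("amax") over calibration batches; symmetric int8
    quantization then uses scale = amax / 127."""

    def __init__(self):
        self.amax = 0.0

    def collect(self, batch):
        # Feed calibration data through and record the activation amax.
        self.amax = max(self.amax, max(abs(x) for x in batch))

    def scale(self, num_bits=8):
        # One quantization step for symmetric signed quantization.
        return self.amax / (2 ** (num_bits - 1) - 1)
```

Histogram-based methods such as mse instead choose a clipping threshold that minimizes quantization error, which typically tolerates activation outliers better than the raw max, explaining the accuracy/speed difference reported here.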