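DeepSparkInference, the inference model zoo and a core project of the DeepSpark open-source community, was open-sourced in March 2024. The first release selected 48 inference model examples covering computer vision, natural language processing, speech recognition, and other fields. More AI domains will be added over time.

The models in DeepSparkInference come with inference examples and guide documents for running on the domestically developed inference engines IGIE or IxRT. Some models also include evaluation results on the domestically developed general-purpose GPU Zhikai 100 (智铠100).

IGIE (Iluvatar GPU Inference Engine) is a high-performance, highly general, end-to-end AI inference engine built on the TVM framework. It supports multi-framework model import, quantization, graph optimization, multiple operator libraries, multiple backends, and automatic operator tuning, giving inference scenarios a complete solution that is easy to deploy, high in throughput, and low in latency.

IxRT (Iluvatar CoreX RunTime) is a high-performance inference engine developed in-house by Iluvatar CoreX (天数智芯). It focuses on getting the most out of Iluvatar CoreX general-purpose GPUs to deliver high-performance inference for models across domains. IxRT supports dynamic-shape inference, plugins, and INT8/FP16 inference.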
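
To make the INT8/FP16 distinction concrete, here is a minimal sketch of symmetric per-tensor INT8 quantization in plain Python. It only illustrates the general idea of trading precision for a compact integer representation. It is not the calibration or quantization method used by IxRT or IGIE, and the function names `quantize_int8` and `dequantize_int8` are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # An all-zero tensor has no dynamic range, so any positive scale works.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the signed 8-bit range.
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized: List[int], scale: float) -> List[float]:
    """Map the integers back to approximate float values."""
    return [q * scale for q in quantized]


# Illustrative values only, not taken from any real model.
weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

The round-trip error stays within half a quantization step (`scale / 2`), which is the precision cost of storing each value in 8 bits.
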
DeepSparkInference will be updated quarterly. Future releases will gradually add more model categories and expand large-model inference.

| Models | Precision | IGIE | IxRT |
|---|---|---|---|
| AlexNet | FP16 | Supported | Supported |
| | INT8 | Supported | Supported |
| CLIP | FP16 | Supported | - |
| | INT8 | - | - |
| Conformer-B | FP16 | Supported | - |
| | INT8 | - | - |
| ConvNeXt-Small | FP16 | Supported | - |
| | INT8 | - | - |
| CSPDarkNet53 | FP16 | Supported | - |
| | INT8 | - | - |
| CSPResNet50 | FP16 | - | Supported |
| | INT8 | - | Supported |
| DeiT-tiny | FP16 | Supported | - |
| | INT8 | - | - |
| DenseNet121 | FP16 | Supported | Supported |
| | INT8 | - | - |
| DenseNet161 | FP16 | Supported | - |
| | INT8 | - | - |
| DenseNet169 | FP16 | Supported | - |
| | INT8 | - | - |
| EfficientNet-B0 | FP16 | Supported | Supported |
| | INT8 | - | Supported |
| EfficientNet-B1 | FP16 | Supported | Supported |
| | INT8 | - | Supported |
| EfficientNet-B2 | FP16 | Supported | - |
| | INT8 | - | - |
| EfficientNetV2 | FP16 | Supported | Supported |
| | INT8 | - | Supported |
| EfficientNetv2_rw_t | FP16 | Supported | - |
| | INT8 | - | - |
| GoogLeNet | FP16 | Supported | Supported |
| | INT8 | Supported | Supported |
| HRNet-W18 | FP16 | Supported | Supported |
| | INT8 | - | Supported |
| InceptionV3 | FP16 | Supported | Supported |
| | INT8 | Supported | Supported |
| Inception_ResNet_V2 | FP16 | - | Supported |
| | INT8 | - | Supported |
| MobileNetV2 | FP16 | Supported | Supported |
| | INT8 | Supported | Supported |
| MobileNetV3_Large | FP16 | Supported | - |
| | INT8 | - | - |
| MobileNetV3_Small | FP16 | Supported | Supported |
| | INT8 | - | - |
| RegNet_x_1_6gf | FP16 | Supported | - |
| | INT8 | - | - |
| RepVGG | FP16 | Supported | Supported |
| | INT8 | - | - |
| Res2Net50 | FP16 | Supported | Supported |
| | INT8 | - | Supported |
| ResNeSt50 | FP16 | Supported | - |
| | INT8 | - | - |
| ResNet101 | FP16 | Supported | Supported |
| | INT8 | Supported | Supported |
| ResNet152 | FP16 | Supported | - |
| | INT8 | Supported | - |
| ResNet18 | FP16 | Supported | Supported |
| | INT8 | Supported | Supported |
| ResNet34 | FP16 | - | Supported |
| | INT8 | - | Supported |
| ResNet50 | FP16 | Supported | Supported |
| | INT8 | Supported | - |
| ResNet_V1_D50 | FP16 | - | Supported |
| | INT8 | - | Supported |
| ResNeXt50_32x4d | FP16 | Supported | - |
| | INT8 | - | - |
| SEResNet50 | FP16 | Supported | - |
| | INT8 | - | - |
| ShuffleNetV1 | FP16 | - | Supported |
| | INT8 | - | - |
| ShuffleNetV2_x0_5 | FP16 | Supported | - |
| | INT8 | - | - |
| ShuffleNetV2_x1_0 | FP16 | Supported | - |
| | INT8 | - | - |
| SqueezeNet 1.0 | FP16 | - | Supported |
| | INT8 | - | Supported |
| SqueezeNet 1.1 | FP16 | - | Supported |
| | INT8 | - | Supported |
| Swin Transformer | FP16 | Supported | - |
| | INT8 | - | - |
| Swin Transformer Large | FP16 | - | Supported |
| | INT8 | - | - |
| VGG16 | FP16 | Supported | Supported |
| | INT8 | Supported | - |
| Wide ResNet50 | FP16 | Supported | Supported |
| | INT8 | Supported | Supported |

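
The support matrix can also be queried programmatically. The following is a minimal sketch, assuming the rows have been normalized so that every record repeats its model name (`model | precision | IGIE | IxRT`). The excerpt is copied from the table above; `parse_rows` and `supported_models` are hypothetical helper names, not part of DeepSparkInference.

```python
from typing import Dict, List

# Excerpt copied from the table above, one record per (model, precision) pair.
SUPPORT_ROWS = """
AlexNet | FP16 | Supported | Supported
AlexNet | INT8 | Supported | Supported
CLIP | FP16 | Supported | -
CLIP | INT8 | - | -
ResNet50 | FP16 | Supported | Supported
ResNet50 | INT8 | Supported | -
"""


def parse_rows(text: str) -> List[Dict[str, str]]:
    """Turn pipe-separated rows into dictionaries keyed by column name."""
    records = []
    for line in text.strip().splitlines():
        model, precision, igie, ixrt = [cell.strip() for cell in line.split("|")]
        records.append(
            {"model": model, "precision": precision, "IGIE": igie, "IxRT": ixrt}
        )
    return records


def supported_models(
    records: List[Dict[str, str]], engine: str, precision: str
) -> List[str]:
    """List models marked Supported for the given engine and precision."""
    return [
        r["model"]
        for r in records
        if r["precision"] == precision and r[engine] == "Supported"
    ]


records = parse_rows(SUPPORT_ROWS)
print(supported_models(records, "IxRT", "INT8"))
print(supported_models(records, "IGIE", "INT8"))
```

Keeping one record per model and precision makes each row self-contained, so filtering by engine or precision needs no knowledge of the row above it.
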

| Models | Precision | IGIE | IxRT |
|---|---|---|---|
| ATSS | FP16 | Supported | - |
| | INT8 | - | - |
| CenterNet | FP16 | Supported | - |
| | INT8 | - | - |
| DETR | FP16 | - | Supported |
| | INT8 | - | - |
| FCOS | FP16 | Supported | Supported |
| | INT8 | - | - |
| FoveaBox | FP16 | Supported | - |
| | INT8 | - | - |
| FSAF | FP16 | Supported | - |
| | INT8 | - | - |
| HRNet | FP16 | Supported | - |
| | INT8 | - | - |
| RetinaFace | FP16 | Supported | - |
| | INT8 | - | - |
| RetinaNet | FP16 | Supported | - |
| | INT8 | - | - |
| RTMDet | FP16 | Supported | - |
| | INT8 | - | - |
| YOLOv3 | FP16 | Supported | Supported |
| | INT8 | Supported | Supported |
| YOLOv4 | FP16 | Supported | Supported |
| | INT8 | Supported | Supported |
| YOLOv5 | FP16 | Supported | Supported |
| | INT8 | Supported | Supported |
| YOLOv5s | FP16 | - | Supported |
| | INT8 | - | Supported |
| YOLOv6 | FP16 | Supported | Supported |
| | INT8 | - | Supported |
| YOLOv7 | FP16 | Supported | Supported |
| | INT8 | Supported | Supported |
| YOLOv8 | FP16 | Supported | Supported |
| | INT8 | Supported | Supported |
| YOLOX | FP16 | Supported | Supported |
| | INT8 | Supported | Supported |


| Models | Precision | IGIE | IxRT |
|---|---|---|---|
| FaceNet | FP16 | - | Supported |
| | INT8 | - | Supported |


| Models | Precision | IGIE | IxRT |
|---|---|---|---|
| RTMPose | FP16 | Supported | - |
| | INT8 | - | - |


| Models | Precision | IGIE | IxRT |
|---|---|---|---|
| Mask R-CNN | FP16 | - | Supported |
| | INT8 | - | - |
| SOLOv1 | FP16 | - | Supported |
| | INT8 | - | - |


| Models | Precision | IGIE | IxRT |
|---|---|---|---|
| FastReID | FP16 | Supported | - |
| | INT8 | - | - |
| DeepSort | FP16 | Supported | - |
| | INT8 | Supported | - |
| RepNet-Vehicle-ReID | FP16 | Supported | - |
| | INT8 | - | - |


| Models | Precision | IGIE | IxRT |
|---|---|---|---|
| ALBERT | FP16 | - | Supported |
| | INT8 | - | - |
| BERT Base NER | FP16 | - | - |
| | INT8 | Supported | - |
| BERT Base SQuAD | FP16 | Supported | Supported |
| | INT8 | - | Supported |
| BERT Large SQuAD | FP16 | Supported | Supported |
| | INT8 | Supported | Supported |
| DeBERTa | FP16 | - | Supported |
| | INT8 | - | - |
| RoBERTa | FP16 | - | Supported |
| | INT8 | - | - |
| RoFormer | FP16 | - | Supported |
| | INT8 | - | - |
| VideoBERT | FP16 | - | Supported |
| | INT8 | - | - |


| Models | vLLM | TensorRT-LLM | TGI |
|---|---|---|---|
| Baichuan2-7B | Supported | - | - |
| ChatGLM-3-6B | Supported | - | - |
| Llama2-7B | - | Supported | - |
| Llama3-70B | Supported | - | - |
| Qwen-7B | - | - | Supported |
| Qwen1.5-7B | Supported | - | Supported |
| Qwen1.5-14B | Supported | - | - |
| Qwen1.5-72B | Supported | - | - |


| Models | Precision | IGIE | IxRT |
|---|---|---|---|
| Conformer | FP16 | Supported | Supported |
| | INT8 | - | - |
| Transformer ASR | FP16 | - | Supported |
| | INT8 | - | - |


See the DeepSpark Code of Conduct on Gitee or on GitHub.

See the DeepSparkInference Contributing Guidelines.

This project is licensed under Apache-2.0.