简体中文

ByteMLPerf Inference General Perf 概览

ByteMLPerf Inference General Perf 是一个 AI 加速器基准测试工具，专注于从实际生产的角度评估 AI 加速器，包括软件和硬件的易用性及多功能性。

特点：

模型和运行环境会更贴近真实业务；
对于新硬件，除了评估性能和精度之外，同时也会评估图编译的易用性、覆盖率等指标；
在开放 Model Zoo 上测试所得的性能和精度，会作为新硬件引入评估的参考；

整体结构

ByteMLPerf Inference General Perf 架构如下图所示：

模型库列表

ByteMLPerf Inference General Perf 支持的模型都收集在模型库(Model Zoo)中。从访问权限的角度看，它们目前分为内部模型和公开模型。与 ByteMLPerf Inference General Perf 一起发布的是相应版本中包含的公开模型。

公开模型的收集原则：

基本模型：包括 Resnet50、Bert 和 WnD；
流行模型：包括目前在行业中广泛使用的模型；
SOTA: 包括与业务领域对应的 SOTA 模型；

除完整模型结构外，ByteMLPerf Inference General Perf 还将添加一些典型的模型子结构子图或 OPs（前提是公开模型中找不到包含此类经典子结构的合适模型），例如具有不同序列长度的 transformer 编码器/解码器，各种常见的卷积操作，如组卷积、深度卷积、点对点卷积，以及常见的 rnn 结构，如 gru/lstm 等。

Model	Domain	Purpose	Framework	Dataset	Precision
resnet50-v1.5	cv	regular	tensorflow, pytorch	imagenet2012	fp32
bert-base	nlp	regular	tensorflow, pytorch	squad-1.1	fp32
wide&deep	rec	regular	tensorflow	criteo	fp32
videobert	mm	popular	onnx	cifar100	fp32
albert	nlp	popular	pytorch	squad-1.1	fp32
conformer	nlp	popular	onnx	none	fp32
roformer	nlp	popular	tensorflow	cail2019	fp32
yolov5	cv	popular	onnx	none	fp32
roberta	nlp	popular	pytorch	squad-1.1	fp32
deberta	nlp	popular	pytorch	squad-1.1	fp32
swin-transformer	cv	popular	pytorch	imagenet2012	fp32
stable diffusion	cv	sota	onnx	none	fp32

支持的厂商列表

ByteMLPerf Inference General Perf 目前支持的厂商后端列表如下所示：

Vendor	SKU	Key Parameters	Supplement
Intel	Xeon	-	-
Stream Computing	STC P920	Computation Power:128 TFLOPS@FP16 Last Level Buffer: 8MB, 256GB/s Level 1 Buffer: 1.25MB, 512GB/s Memory: 16GB, 119.4GB/S Host Interface：PCIe 4, 16x, 32GB/s TDP: 160W	STC Introduction
Graphcore	Graphcore® C600	Compute: 280 TFLOPS@FP16, 560 TFLOPS@FP8 In Processor Memory: 900 MB, 52 TB/s Host Interface: Dual PCIe Gen4 8-lane interfaces, 32GB/s TDP: 185W	IPU Introduction
Moffett-AI	Moffett-AI S30	Compute: 1440 (32x-Sparse) TFLOPS@BF16, 2880 (32x-Sparse) TOPS@INT8, Memory: 60 GB, Host Interface: Dual PCIe Gen4 8-lane interfaces, 32GB/s TDP: 250W	SPU Introduction

With ByteIR

ByteIR 项目是字节跳动的模型编译解决方案。ByteIR 包括编译器、运行时和前端，并提供端到端的模型编译解决方案。尽管所有的 ByteIR 组件（编译器/runtime/前端）一起提供端到端的解决方案，并且都在同一个代码库下，但每个组件在技术上都可以独立运行。

更多信息请查看ByteIR

ByteIR 编译支持的模型列表:

Model	Domain	Purpose	Framework	Dataset	Precision
resnet50-v1.5	cv	regular	mhlo	imagenet2012	fp32
bert-base	nlp	regular	mhlo	squad-1.1	fp32

ON THIS PAGE

#ByteMLPerf Inference General Perf 概览

#整体结构

#模型库列表

#支持的厂商列表

#With ByteIR

ByteMLPerf Inference General Perf 概览

整体结构

模型库列表

支持的厂商列表

With ByteIR