# benchdnn
**benchdnn** is a standalone correctness and performance benchmark for the
[Intel(R) Math Kernel Library for Deep Neural Networks (Intel(R) MKL-DNN)](/intel/mkl-dnn) library.
The purpose of the benchmark is extended and robust correctness verification of
the primitives provided by MKL-DNN. So far **benchdnn** supports convolutions
and inner products of different data types. It also implicitly tests reorders.
## License
**benchdnn** is licensed under
[Apache License Version 2.0](http://www.apache.org/licenses/LICENSE-2.0).
## Usage (main driver)
**benchdnn** itself is a driver for different implementation-specific
harnesses. So far it has harnesses for Intel MKL-DNN convolution, inner product,
reorder, and batch normalization, plus a harness for testing itself.
The usage:
```
$ ./benchdnn [--HARNESS] [--mode=MODE] [-vN|--verbose=N] HARNESS-OPTS
```
where:
- `HARNESS` is either `conv` [default], `ip`, `reorder`, `bnorm`, `rnn` or `self`
- `MODE` -- string that contains flags for benchmark mode. Use `C` or `c` for correctness (used by default), and `P` or `p` for performance
- `N` -- verbose level (integer from 0 [default] to ...)
- `HARNESS-OPTS` are passed to the chosen harness
Returns `0` on success (all tests passed), and non-zero if any error
occurred.
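As a minimal sketch of how the driver command line above is put together, the Python helper below assembles an argument list from the documented pieces. The function name `build_argv` and its parameters are hypothetical and only mirror the usage line; they are not part of **benchdnn**.

```python
# Hypothetical helper that mirrors the documented usage line:
#   ./benchdnn [--HARNESS] [--mode=MODE] [-vN|--verbose=N] HARNESS-OPTS
HARNESSES = ("conv", "ip", "reorder", "bnorm", "rnn", "self")


def build_argv(harness="conv", mode="C", verbose=0, harness_opts=(),
               binary="./benchdnn"):
    """Return the argument list for one benchdnn invocation."""
    if harness not in HARNESSES:
        raise ValueError("unknown harness: %r" % harness)
    argv = [binary, "--" + harness, "--mode=" + mode, "-v%d" % verbose]
    # Everything after the driver options is passed to the chosen harness.
    return argv + list(harness_opts)


# The list can be handed to subprocess.run(argv); a zero return code
# then means that all tests passed.
argv = build_argv("conv", mode="P", verbose=1, harness_opts=["--cfg=f32"])
```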
## Usage (convolution harness)
The usage:
```
[harness-knobs] [conv-desc] ...
```
where *harness-knobs* are:
- `--cfg={f32, u8s8u8s32, ...}` configuration (see below), default `f32`
- `--dir={FWD_D (forward data), FWD_B (forward data + bias), BWD_D (backward data), BWD_W (backward weights), BWD_WB (backward weights + bias)}` direction, default `FWD_B`
- `--alg={DIRECT, WINO}` convolution algorithm, default DIRECT
- `--merge={NONE, RELU}` merged primitive, default NONE (nothing merged)
- `--attr="attr_str"` convolution attributes (see in the section below), default `""` (no attributes set)
- `--mb=N` override minibatch that is specified in convolution description, default `0` (use mb specified in conv desc)
- `--match=regex` check only convolutions that match with regex, default is `".*"`. Notice: Windows may only interpret string arguments surrounded by double quotation marks.
- `--skip-impl="str1[:str2]..."` skip implementation (see mkldnn_query_impl_info_str), default `""`
- `--allow-unimpl=true|false` do not treat unimplemented configuration as an error, default `false`
- `--perf-template=template-str` set template for performance report (see section *Performance measurements*)
- `--reset` reset all previously set parameters to their defaults
- `-vN|--verbose=N` verbose level, default `0`
- `--batch=file` use options from the given file (see the `inputs` subdirectory)
and *conv-desc* is the convolution description. The canonical form is:
```
gXmbXicXihXiwXocXohXowXkhXkwXshXswXphXpwXdhXdwXnS
```
Here X is a number and S is a string (n stands for name). Some of the parameters
may be omitted if there is a default value (e.g. if g is not specified,
**benchdnn** uses 1) or if they can be computed automatically (e.g. the output
shape can be derived from the input shape and the kernel). Also, if either width
or height is not specified, then height == width is assumed. The special symbol
`_` is ignored, hence it may be used as a delimiter. See `str2desc()` in
conv/conv_aux.cpp for more details and implicit rules :^)
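The rules above can be sketched in a few lines of Python. This is an illustration only, not a port of `str2desc()`: the names `parse_conv_desc` and `out_size` are hypothetical, and the output-size formula assumes that a dilation of 0 means a dense kernel and that stride 1 and padding 0 are used when nothing else is given.

```python
import re

# Keys of the canonical form; "n" (name) is handled separately.
_KEYS = ("mb", "ic", "ih", "iw", "oc", "oh", "ow",
         "kh", "kw", "sh", "sw", "ph", "pw", "dh", "dw", "g")
_TOKEN = re.compile(r"(%s)(\d+)" % "|".join(_KEYS))


def parse_conv_desc(desc):
    """Parse a conv-desc string into a dict (hypothetical helper)."""
    out = {}
    pos = 0
    while pos < len(desc):
        if desc[pos] == "_":      # delimiter, ignored
            pos += 1
            continue
        if desc[pos] == "n":      # the rest of the string is the name
            out["n"] = desc[pos + 1:]
            break
        m = _TOKEN.match(desc, pos)
        if m is None:
            raise ValueError("cannot parse %r at offset %d" % (desc, pos))
        out[m.group(1)] = int(m.group(2))
        pos = m.end()
    out.setdefault("g", 1)        # documented default for groups
    # Documented rule: a missing height or width mirrors the other one.
    for prefix in ("i", "o", "k", "s", "p", "d"):
        h, w = prefix + "h", prefix + "w"
        if h in out and w not in out:
            out[w] = out[h]
        elif w in out and h not in out:
            out[h] = out[w]
    return out


def out_size(i, k, s=1, p=0, d=0):
    """Output extent along one axis (assumes d=0 means no dilation)."""
    return (i + 2 * p - ((k - 1) * (d + 1) + 1)) // s + 1


desc = parse_conv_desc("mb2_ic3ih224oc64kh7sh2ph3_nconv1")
oh = out_size(desc["ih"], desc["kh"], desc["sh"], desc["ph"])
```

Here only `ih`, `kh`, `sh` and `ph` are given, so the width values are copied from the heights and `g` falls back to 1.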
The attribute string *attr_str* is defined as (new lines for readability):
```
[irmode={nearest,down};]
[oscale={none,common,per_oc}[:scale];]
[post_ops='[{relu,sum[:sum_scale]};]...';]
```
Here `irmode` defines the rounding mode for integer output (default is nearest).
Next, `oscale` stands for output_scales. The first parameter is the policy,
as defined below. The second, optional parameter is a scale that specifies
either the single common output scale (for the `none` and `common` policies) or
a starting point for the `per_oc` policy, which uses many scales. The default
scale is 1.0. Known policies are:
- `none` (default) means no output scales set (i.e. scale = 1.)
- `common` corresponds to `mask=0` with common scale factor
- `per_oc` corresponds to `mask=1<<1` (i.e. output channels) with different scale factors
Next, `post_ops` stands for post operation sequence. Currently supported post
ops are:
- `relu` with no parameters (i.e. corresponding scale is 1., alg = eltwise_relu, alpha = beta = 0.)
- `sum` with optional parameter scale (default 1.)
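To make the attribute grammar concrete, here is a small Python sketch that splits an *attr_str* into its parts. It is a hypothetical reader written against the grammar shown above, not the parser used by **benchdnn**; the names `parse_attr` and `split_outside_quotes` are made up for this example.

```python
def split_outside_quotes(text, sep=";"):
    """Split on sep, ignoring separators inside single quotes."""
    parts, current, quoted = [], [], False
    for ch in text:
        if ch == "'":
            quoted = not quoted          # quotes are dropped
        elif ch == sep and not quoted:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return [p for p in parts if p]


def parse_attr(attr_str):
    """Return the documented defaults, overridden by attr_str."""
    attr = {"irmode": "nearest", "oscale": ("none", 1.0), "post_ops": []}
    for item in split_outside_quotes(attr_str):
        key, _, value = item.partition("=")
        if key == "irmode":
            attr["irmode"] = value
        elif key == "oscale":
            policy, _, scale = value.partition(":")
            attr["oscale"] = (policy, float(scale) if scale else 1.0)
        elif key == "post_ops":
            ops = []
            for op in split_outside_quotes(value):
                kind, _, scale = op.partition(":")
                # relu has a fixed scale of 1., sum defaults to 1.
                ops.append((kind, float(scale) if scale else 1.0))
            attr["post_ops"] = ops
        else:
            raise ValueError("unknown attribute: %r" % key)
    return attr


attr = parse_attr("irmode=down;oscale=per_oc:2.25;post_ops='sum:0.5;relu';")
```

The single quotes around the `post_ops` value matter: they keep the inner `;` separators from being read as top-level attribute separators.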
### convolution configurations (aka precision specification)
The `--cfg` option specifies which convolution is used in terms of data types.
It also defines how the data is filled internally. For integer types,
saturation is implied.
Finally, the configuration defines the threshold for computation errors (ideally
we want to keep it at 0, and that seems to work for now).
The table below shows cases supported by Intel MKL-DNN and corresponding
configurations for **benchdnn**:
|src type | wei type | dst type | acc type | cfg | notes
|:--- |:--- |:--- |:--- |:--- |:---
| f32 | f32 | f32 | f32 | f32 | inference optimized for sse4.2+, training avx2+
| s16 | s16 | s32 | s32 | s16s16s32s32 | optimized for processors with support of 4vnni, forward pass only (aka FWD_D, FWD_B)
| s32 | s16 | s16 | s32 | s32s16s16s32 | optimized for processors with support of 4vnni, backward wrt data only (aka BWD_D)
| s16 | s32 | s16 | s32 | s16s32s16s32 | optimized for processors with support of 4vnni, backward wrt weights (aka BWD_W, BWD_WB)
| u8 | s8 | f32 | s32 | u8s8f32s32 | optimized for processors with support of avx512vl, forward pass only (aka FWD_D, FWD_B)
| u8 | s8 | s32 | s32 | u8s8s32s32 | same notes as for u8s8f32s32
| u8 | s8 | s8 | s32 | u8s8s8s32 | same notes as for u8s8f32s32
| u8 | s8 | u8 | s32 | u8s8u8s32 | same notes as for u8s8f32s32
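The saturation implied for integer destination types can be pictured as rounding followed by clamping to the range of the type. The sketch below is an illustration under assumptions: the helper `saturate` is hypothetical, `nearest` is modelled with Python's `round()` (which rounds halves to even), and `down` is modelled with `math.floor()`; the library's exact rounding behaviour is not described in this document.

```python
import math

# Value ranges of the integer data types that appear in the table above.
_RANGES = {
    "u8": (0, 255),
    "s8": (-128, 127),
    "s16": (-32768, 32767),
    "s32": (-2**31, 2**31 - 1),
}


def saturate(value, dtype, irmode="nearest"):
    """Round value according to irmode, then clamp it to dtype's range."""
    if irmode == "nearest":
        rounded = round(value)        # assumption: halves go to even
    elif irmode == "down":
        rounded = math.floor(value)   # assumption: round toward -inf
    else:
        raise ValueError("unknown irmode: %r" % irmode)
    lo, hi = _RANGES[dtype]
    return max(lo, min(hi, rounded))
```

For example, an accumulated value of 300.7 written to a `u8` destination would become 255 under this model rather than wrapping around.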
## Performance measurements
**benchdnn** supports a custom performance report. The template is passed via
the command line and consists of terminal and nonterminal symbols. Nonterminal
symbols are printed as is. A description of the terminal symbols is given below.
There is also a notion of modifiers (marked as @) that change the meaning of
terminal symbols, e.g. the sign '-' selects the minimum (in terms of time). See
the table of modifiers below.
> **caution:** threads have to be pinned in order to get consistent frequency

| abbreviation | description
|:------------ |:-----------
| %d | problem descriptor
| %D | expanded problem descriptor (conv parameters in csv format)
| %n | problem name
| %z | direction
| %@F | effective cpu frequency computed as clocks[@] / time[@]
| %O | number of ops required (padding is not taken into account)
| %@t | time in ms
| %@c | time in clocks
| %@p | ops per second

| modifier | description
|:-------- |:-----------
| | default
| - | min (time) -- default
| 0 | avg (time)
| + | max (time)
| |
| K | Kilo (1e3)
| M | Mega (1e6)
| G | Giga (1e9)
The definition of expanded problem descriptor is:
`g,mb,ic,ih,iw,oc,oh,ow,kh,kw,sh,sw,ph,pw`.
The default template is defined in conv/bench_conv.cpp as
`perf,%n,%d,%GO,%GF,%-t,%-Gp,%0t,%0Gp`. It produces the following output
in CSV format:
```
string: perf
convolution name
full conv-desc
number of giga ops calculated
effective cpu frequency in GHz (amb clocks[min] / time[min])
minimum time spent in ms
best gigaops (since it corresponds to minimum time)
average time spent in ms
average gigaops (since it corresponds to average time)
```
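The way a template expands can be sketched in Python as follows. This is a simplified model, not the code in conv/bench_conv.cpp: the function `render`, the layout of the `res` dictionary and the `%g` number formatting are assumptions, `%D` is left out for brevity, and the unit modifiers are applied as plain divisors to every numeric symbol.

```python
import re

_UNITS = {"": 1.0, "K": 1e3, "M": 1e6, "G": 1e9}
_STATS = {"": "min", "-": "min", "0": "avg", "+": "max"}
# %[time modifier][unit modifier]symbol
_SYMBOL = re.compile(r"%([-0+]?)([KMG]?)([dnzOFtcp])")


def render(template, res):
    """Expand the symbols in template using the measurements in res."""
    def expand(m):
        stat, unit, sym = _STATS[m.group(1)], _UNITS[m.group(2)], m.group(3)
        if sym in "dnz":
            return str(res[{"d": "desc", "n": "name", "z": "dir"}[sym]])
        ms, clocks = res["ms"][stat], res["clocks"][stat]
        value = {
            "O": res["ops"],
            "t": ms,                        # time in ms
            "c": clocks,                    # time in clocks
            "p": res["ops"] * 1e3 / ms,     # ops per second
            "F": clocks * 1e3 / ms,         # clocks per second
        }[sym]
        return "%g" % (value / unit)
    return _SYMBOL.sub(expand, template)


# Made-up measurements, for illustration only.
res = {
    "name": "conv1", "desc": "ic3ih224", "dir": "FWD_B", "ops": 2e9,
    "ms": {"min": 1.0, "avg": 2.0, "max": 4.0},
    "clocks": {"min": 3e6, "avg": 6e6, "max": 1.2e7},
}
line = render("perf,%n,%d,%GO,%GF,%-t,%-Gp,%0t,%0Gp", res)
```

Text that does not match a symbol, such as the leading `perf` and the commas, passes through unchanged, which is what makes the result a CSV line.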
## Examples
Run the set of f32 forward convolutions from the inputs/conv_all file with bias and the default minibatch:
```
$ ./benchdnn --conv \
--cfg=f32 --dir=FWD_B --batch=inputs/conv_all
```
Run t