
Keras|Tensorflow: Computing a Model's FLOPs

程序员文章站 2022-05-26 19:01:27

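I have recently been looking into the computational cost of models. PyTorch has libraries that compute a model's cost directly, so I wanted an equivalent that works with Keras and Tensorflow: pass the Model to a function, print the result, and you get the FLOPs.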

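FLOPS (all uppercase) stands for floating point operations per second. It describes computing speed and is a measure of hardware performance.

FLOPs (lowercase s, marking the plural) stands for floating point operations. It describes the amount of computation and can be used to measure the complexity of an algorithm or model.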
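To make the distinction concrete, the following minimal sketch divides a model's FLOPs (work) by a device's FLOPS (speed) to get an idealized lower bound on the time of one forward pass. The function name and both numbers are hypothetical, chosen only for illustration. Real runtimes are usually higher, because memory access and hardware utilization are ignored here.

```python
def min_runtime_seconds(model_flops: float, hardware_flops_per_second: float) -> float:
    """Idealized lower bound on runtime: amount of work divided by peak speed."""
    if hardware_flops_per_second <= 0:
        raise ValueError("hardware_flops_per_second must be positive")
    return model_flops / hardware_flops_per_second


# Hypothetical values, not measurements.
model_flops = 4e9       # FLOPs: 4 billion float operations per forward pass
hardware_peak = 10e12   # FLOPS: 10 trillion float operations per second

seconds = min_runtime_seconds(model_flops, hardware_peak)
print(f"lower bound: {seconds * 1e3:.3f} ms per forward pass")
```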

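Two units are commonly used for computational cost: MAdds and MFLOPs. The ShuffleNet paper reports FLOPs, while MobileNet reports MAdds. FLOPs should be about twice MAdds, because one multiply-add counts as one multiplication plus one addition. For details, see: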

  https://blog.csdn.net/shwan_ma/article/details/84924142

   https://www.zhihu.com/question/65305385/answer/451060549
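The factor of two can be checked by hand on a single convolution layer. The sketch below assumes the usual counting convention for a standard 2D convolution without bias (one multiply-add per kernel weight per output position). The function names and the layer dimensions are hypothetical and serve only as an illustration.

```python
def conv2d_madds(out_h: int, out_w: int, kernel_h: int, kernel_w: int,
                 in_channels: int, out_channels: int) -> int:
    """Multiply-adds of a standard 2D convolution, ignoring bias."""
    return out_h * out_w * kernel_h * kernel_w * in_channels * out_channels


def madds_to_flops(madds: int) -> int:
    """One multiply-add is counted as two float operations (a mul and an add)."""
    return 2 * madds


# Hypothetical layer: 3x3 kernel, 64 -> 128 channels, 56x56 output feature map.
madds = conv2d_madds(56, 56, 3, 3, 64, 128)
flops = madds_to_flops(madds)
print(f"MAdds: {madds / 1e6:.2f}M, FLOPs: {flops / 1e6:.2f}M")
```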

The function is as follows:

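import tensorflow as tf
import keras.backend as K


def get_flops(model):
    """Return the total float operations of the graph in the current Keras session.

    Uses the TensorFlow 1.x profiler API. `model` is not read directly; it only
    has to be built already so that its ops are part of the session graph.
    """
    run_meta = tf.RunMetadata()
    opts = tf.profiler.ProfileOptionBuilder.float_operation()

    # We use the Keras session graph in the call to the profiler.
    flops = tf.profiler.profile(graph=K.get_session().graph,
                                run_meta=run_meta, cmd='op', options=opts)

    return flops.total_float_ops  # Total float operations counted by the profiler.


# .... Define your model here ....
print(get_flops(model))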

Here is the result for Mask_RCNN:

  

Doc:
op: The nodes are operation kernel type, such as MatMul, Conv2D. Graph nodes belonging to the same type are aggregated together.
flops: Number of float operations. Note: Please read the implementation for the math behind it.

Profile:
node name | # float_ops
Conv2D                   95.74b float_ops (100.00%, 90.16%)
Conv2DBackpropInput      10.28b float_ops (9.84%, 9.68%)
Mul                      63.89m float_ops (0.16%, 0.06%)
Add                      63.88m float_ops (0.10%, 0.06%)
BiasAdd                  46.49m float_ops (0.04%, 0.04%)
ArgMax                   80.00k float_ops (0.00%, 0.00%)
Minimum                  4.10k float_ops (0.00%, 0.00%)
Maximum                  4.10k float_ops (0.00%, 0.00%)
Sub                      2.33k float_ops (0.00%, 0.00%)
GreaterEqual             1.00k float_ops (0.00%, 0.00%)
Greater                  1.00k float_ops (0.00%, 0.00%)
Equal                      400 float_ops (0.00%, 0.00%)
RealDiv                    202 float_ops (0.00%, 0.00%)
Log                        102 float_ops (0.00%, 0.00%)
Less                         2 float_ops (0.00%, 0.00%)
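To read the profile above as plain numbers, the rough sketch below parses the profiler lines and recomputes each op type's share of the total. It assumes that the suffixes k, m and b mean thousand, million and billion, and that the second percentage in each row is that op type's own share; both assumptions appear consistent with the numbers shown. The helper `parse_float_ops` is hypothetical and not part of TensorFlow. Only the five largest rows are included; the remaining rows are too small to change the rounded percentages.

```python
import re

# Assumed meaning of the profiler's suffixes.
SUFFIXES = {"k": 1e3, "m": 1e6, "b": 1e9}
LINE_PATTERN = re.compile(r"^(\w+)\s+([\d.]+)([kmb]?)\s+float_ops")


def parse_float_ops(text: str) -> dict:
    """Map op type -> float_ops for every line that looks like a profiler row."""
    result = {}
    for line in text.splitlines():
        match = LINE_PATTERN.match(line.strip())
        if match:
            name, value, suffix = match.groups()
            result[name] = float(value) * SUFFIXES.get(suffix, 1.0)
    return result


profile_text = """
node name | # float_ops
Conv2D                   95.74b float_ops (100.00%, 90.16%)
Conv2DBackpropInput      10.28b float_ops (9.84%, 9.68%)
Mul                      63.89m float_ops (0.16%, 0.06%)
Add                      63.88m float_ops (0.10%, 0.06%)
BiasAdd                  46.49m float_ops (0.04%, 0.04%)
"""

ops = parse_float_ops(profile_text)
total = sum(ops.values())
print(f"total: {total / 1e9:.2f}b float_ops")
for name, count in ops.items():
    print(f"{name:<22}{100 * count / total:6.2f}%")
```

Under the rule of thumb that FLOPs are about twice MAdds, halving the total gives a rough MAdds estimate for the same graph.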
