Tflite | CineNeural

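TFLite components

The TensorFlow Lite interpreter runs inference on the deployment hardware, which can include mobile phones, microcontrollers, and embedded devices.
The TensorFlow Lite converter converts a model so that it is smaller and runs inference faster.

TFLite Converter

The TensorFlow Lite converter turns a model into the FlatBuffers format. FlatBuffers is a cross-platform serialization library that stores structured data in binary form; on a microcontroller, this shows up as memory savings.

Python Keras converter:

    import tensorflow as tf

    # `model` is assumed to be an already built/trained tf.keras model.
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    tflite_model = converter.convert()

    # Write the serialized FlatBuffer to disk.
    with tf.io.gfile.GFile('mode1.tflite', 'wb') as f:
        f.write(tflite_model)

Command-line converter:

    tflite_convert --saved_model_dir=$modelDir --output_file=mode1.tflite

TFLite model quantization

Quantizing models for CPU model size

Quantizing the weights from the original 32 bits down to 8 bits shrinks the stored weights to about a quarter of their size and can shorten inference time.

Python:

    # Reuses the `converter` created above; DEFAULT enables post-training quantization.
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    tflite_quantize_model1 = converter.convert()

    with tf.io.gfile.GFile('mode1-default-quant.tflite', 'wb') as f:
        f.write(tflite_quantize_model1)

Full integer quantization of weights and activations

This restricts all computation to integers, which further reduces the model size and speeds up inference. The figure below lists the efficiency obtained for the same model under the different quantization methods.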
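To make the 32-bit to 8-bit step more concrete, here is a minimal pure-Python sketch of the affine mapping that 8-bit quantization schemes of this kind are built on, real ≈ scale × (q − zero_point). It is an illustration of the arithmetic only, not the converter's actual code: the helper names `quant_params`, `quantize`, and `dequantize` and the sample weights are made up for this example, and real converters add details (such as per-channel scales) that are left out here.

```python
def quant_params(values, qmin=-128, qmax=127):
    """Pick scale and zero point so the value range maps onto [qmin, qmax]."""
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    if hi == lo:  # all values are zero; any positive scale works
        return 1.0, 0
    scale = (hi - lo) / (qmax - qmin)
    zero_point = round(qmin - lo / scale)
    return scale, max(qmin, min(qmax, zero_point))


def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    """Map floats to integers in [qmin, qmax], clamping anything out of range."""
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]


def dequantize(q_values, scale, zero_point):
    """Recover approximate floats: real ~= scale * (q - zero_point)."""
    return [scale * (q - zero_point) for q in q_values]


weights = [-0.5, -0.1, 0.0, 0.7, 1.5]  # made-up sample weights (assumption)
scale, zero_point = quant_params(weights)
q_weights = quantize(weights, scale, zero_point)
restored = dequantize(q_weights, scale, zero_point)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q_weights, zero_point, max_error)
```

Each weight is now stored as one signed 8-bit integer plus a shared scale and zero point, and the round-trip error stays within one quantization step. That small, bounded loss is the price paid for the smaller model and the integer-only arithmetic.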