TensorFlow Lite for Microcontrollers

Usage

See the examples under tensorflow/lite/micro/examples/xxx for usage. This walkthrough uses hello_world as the example.

File: hello_world_test.cc

1. Create a MicroErrorReporter object

tflite::MicroErrorReporter micro_error_reporter;

2. Obtain the tflite::Model struct from the tflite model data

const tflite::Model* model = ::tflite::GetModel(g_model);

3. Create an OpResolver object

This creates a resolver that contains all operators:

tflite::AllOpsResolver resolver;

To create an OpResolver that registers only the operators the model actually uses:

static tflite::MicroMutableOpResolver<5> micro_op_resolver;
micro_op_resolver.AddConv2D();
micro_op_resolver.AddDepthwiseConv2D();
micro_op_resolver.AddFullyConnected();
micro_op_resolver.AddMaxPool2D();
micro_op_resolver.AddSoftmax();

4. Provide a contiguous block of memory to hold the model's runtime data (objects are constructed in it with placement new)

uint8_t tensor_arena[tensor_arena_size];
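As a rough illustration of what "placement new" into a caller-provided buffer means, here is a minimal sketch. Counter, arena, kArenaSize and MakeCounterInArena are hypothetical names invented for this example; they are not part of TensorFlow Lite Micro.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical object that should live inside the arena.
struct Counter {
  explicit Counter(int start) : value(start) {}
  int value;
};

// Caller-provided memory block; alignas keeps it suitably aligned for any type.
constexpr std::size_t kArenaSize = 64;
alignas(std::max_align_t) std::uint8_t arena[kArenaSize];

// Constructs a Counter at the start of the arena; no heap allocation happens.
Counter* MakeCounterInArena(int start) {
  static_assert(sizeof(Counter) <= kArenaSize, "arena too small for Counter");
  Counter* counter = new (arena) Counter(start);
  assert(static_cast<void*>(counter) == static_cast<void*>(arena));
  return counter;
}
```

The object's storage is the arena itself, so its lifetime is bounded by the arena's and nothing has to be freed on a heap.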

5. Create a MicroInterpreter object

MicroInterpreter(const Model* model, const MicroOpResolver& op_resolver,
uint8_t* tensor_arena, size_t tensor_arena_size,
ErrorReporter* error_reporter,
tflite::Profiler* profiler = nullptr);
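The constructor's parameter types are base classes, while the arguments passed when creating the object are references or pointers to derived classes. This is how polymorphism is achieved.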

tflite::MicroInterpreter interpreter(
model, resolver, tensor_arena, tensor_arena_size, &micro_error_reporter);

6. Allocate memory for all tensors and scratch buffers, as well as for the supporting variables

interpreter.AllocateTensors();

// The arena size is a constant that has to be tuned by hand; adjust it based on actual usage, for example by checking arena_used_bytes()

interpreter.arena_used_bytes();

7. Get the input tensor

TfLiteTensor* input = interpreter.input(0);

Check some properties of the input tensor:

TF_LITE_MICRO_EXPECT_EQ(2, input->dims->size);
// The value of each element gives the length of the corresponding tensor.
// We should expect two single element tensors (one is contained within the
// other).
TF_LITE_MICRO_EXPECT_EQ(1, input->dims->data[0]);
TF_LITE_MICRO_EXPECT_EQ(1, input->dims->data[1]);
// The input is a 32 bit floating point value
TF_LITE_MICRO_EXPECT_EQ(kTfLiteFloat32, input->type);

8. Provide an input value by assigning it to the input tensor

input->data.f[0] = 0.;

9. Run inference

interpreter.Invoke();

10. Get the inference result

// Obtain a pointer to the output tensor and make sure it has the
// properties we expect. It should be the same as the input tensor.
TfLiteTensor* output = interpreter.output(0);

// Check the properties of the inference result
TF_LITE_MICRO_EXPECT_EQ(2, output->dims->size);
TF_LITE_MICRO_EXPECT_EQ(1, output->dims->data[0]);
TF_LITE_MICRO_EXPECT_EQ(1, output->dims->data[1]);
TF_LITE_MICRO_EXPECT_EQ(kTfLiteFloat32, output->type);

// Obtain the output value from the tensor
float value = output->data.f[0];

Following the steps above, the rest of this post walks through the code in order to understand how a reference AI engine is implemented.

ErrorReporter

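The ErrorReporter class is used to output logs. The concrete way of printing a log differs from platform to platform (for example a PC console versus a serial port), so inheritance with a virtual function is a reasonable design. The interface class is class ErrorReporter, declared in lite/core/api/error_reporter.h.

The virtual function is virtual int Report(const char*, va_list args). The ordinary (non-virtual) member functions call this virtual function, so they too act as a cross-platform interface.

The public interface is a macro, where ... stands for all the arguments passed in: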

#define TF_LITE_REPORT_ERROR(reporter, ...) \
do { \
static_cast<tflite::ErrorReporter*>(reporter)->Report(__VA_ARGS__); \
} while (false)

TF_LITE_REPORT_ERROR(error_reporter_,
"Tensor index %d", index);

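This expands to:

static_cast<tflite::ErrorReporter*>(error_reporter_)->Report("Tensor index %d", index);

The static_cast is an explicit type conversion: it converts the reporter to a pointer to the base class and calls Report(const char* format, ...), whose implementation in turn calls the virtual function to reach the platform-specific logging code.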
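The pattern described here, a non-virtual variadic function forwarding to a virtual va_list function, can be sketched in isolation. Reporter, StringReporter, REPORT_ERROR and FormatTensorIndex below are hypothetical stand-ins written for this post, not the TensorFlow Lite classes. Note that a derived class overriding the va_list overload hides the variadic overload by name, so calling through a base-class pointer (as the cast does) keeps the variadic overload reachable.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical stand-in for the interface class.
class Reporter {
 public:
  virtual ~Reporter() = default;
  // Platform-specific hook that each backend overrides.
  virtual int Report(const char* format, va_list args) = 0;
  // Non-virtual convenience overload: packs "..." into a va_list and forwards.
  int Report(const char* format, ...) {
    va_list args;
    va_start(args, format);
    int code = Report(format, args);
    va_end(args);
    return code;
  }
};

// Hypothetical backend that "logs" into a string instead of a console or UART.
class StringReporter : public Reporter {
 public:
  int Report(const char* format, va_list args) override {
    assert(format != nullptr);
    char buffer[128];
    std::vsnprintf(buffer, sizeof(buffer), format, args);
    last_message = buffer;
    return 0;
  }
  std::string last_message;
};

// Same shape as the macro above, but for the hypothetical Reporter class.
#define REPORT_ERROR(reporter, ...)                        \
  do {                                                     \
    static_cast<Reporter*>(reporter)->Report(__VA_ARGS__); \
  } while (false)

// Formats a message through the macro and returns what the backend received.
std::string FormatTensorIndex(int index) {
  StringReporter reporter;
  REPORT_ERROR(&reporter, "Tensor index %d", index);
  return reporter.last_message;
}
```

Swapping StringReporter for a class that writes to a serial port would change only the overridden function; every call site using the macro stays the same.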

tflite::Model

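The model file contains subgraphs. Each subgraph holds the tensor indices of its inputs and outputs, together with its tensors.

The model's buffers store the data that belongs to tensors. The model's operator_codes identify each kind of operator by its op_code, and the operator entries of a subgraph record each operator's inputs/outputs (as tensor indices) and its custom/builtin data.

The model also carries quantization information.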
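To make the relationships between these pieces easier to picture, here is a simplified sketch using plain structs. The struct and field names are hypothetical and only loosely mirror the model layout described above; the real model is a FlatBuffer accessed through generated accessors, and the op code value 9 is arbitrary.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified mirror of the model layout.
struct OperatorCode { int builtin_code; };
struct Operator {
  std::size_t opcode_index;  // index into Model::operator_codes
  std::vector<int> inputs;   // tensor indices
  std::vector<int> outputs;  // tensor indices
};
struct Tensor {
  std::vector<int> shape;
  std::size_t buffer_index;  // index into Model::buffers
};
struct SubGraph {
  std::vector<Tensor> tensors;
  std::vector<int> inputs;
  std::vector<int> outputs;
  std::vector<Operator> operators;
};
struct Model {
  std::vector<OperatorCode> operator_codes;
  std::vector<SubGraph> subgraphs;
  std::vector<std::vector<std::uint8_t>> buffers;
};

// Follows operator -> opcode_index -> operator_codes to find the op code.
int BuiltinCodeOf(const Model& model, std::size_t subgraph_index,
                  std::size_t operator_index) {
  const Operator& op =
      model.subgraphs.at(subgraph_index).operators.at(operator_index);
  assert(op.opcode_index < model.operator_codes.size());
  return model.operator_codes[op.opcode_index].builtin_code;
}

// Builds a one-operator model: tensor 0 -> op (code 9) -> tensor 1.
Model MakeExampleModel() {
  Model model;
  model.operator_codes.push_back(OperatorCode{9});
  model.buffers.resize(2);
  SubGraph graph;
  graph.tensors.push_back(Tensor{{1, 1}, 0});
  graph.tensors.push_back(Tensor{{1, 1}, 1});
  graph.inputs.push_back(0);
  graph.outputs.push_back(1);
  graph.operators.push_back(Operator{0, {0}, {1}});
  model.subgraphs.push_back(graph);
  return model;
}
```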

OpResolver

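Operator resolution means finding the concrete implementation for a given opcode.

It is implemented by calling AddBuiltin, which registers the opcode together with its TfLiteRegistration and BuiltinParseFunction in an array. Later, GetRegistrationFromOpCode looks up and returns the matching TfLiteRegistration.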
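The register-then-look-up idea can be sketched with a small fixed-capacity table. TinyOpResolver, Registration, InvokeFn, Double and Negate are hypothetical names for this illustration only; the real resolver stores TfLiteRegistration entries and parse functions rather than a single function pointer.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified stand-ins; not the real TfLiteRegistration.
using InvokeFn = int (*)(int input);
struct Registration {
  int op_code;
  InvokeFn invoke;
};

// Fixed-capacity table: the maximum number of ops is a template parameter,
// so no dynamic allocation is needed.
template <std::size_t kMaxOps>
class TinyOpResolver {
 public:
  // Registers an implementation; returns false when full or already present.
  bool Add(int op_code, InvokeFn invoke) {
    assert(invoke != nullptr);
    if (count_ == kMaxOps || Find(op_code) != nullptr) return false;
    registrations_[count_++] = Registration{op_code, invoke};
    return true;
  }

  // Linear search by op code; returns nullptr when nothing is registered.
  const Registration* Find(int op_code) const {
    for (std::size_t i = 0; i < count_; ++i) {
      if (registrations_[i].op_code == op_code) return &registrations_[i];
    }
    return nullptr;
  }

 private:
  Registration registrations_[kMaxOps] = {};
  std::size_t count_ = 0;
};

// Two toy "kernels" to register.
int Double(int x) { return 2 * x; }
int Negate(int x) { return -x; }
```

A linear search is a reasonable choice in a sketch like this because the table usually holds only a handful of entries.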

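In the code, an operator from the tflite::Model is represented as a NodeAndRegistration, that is, the operator's inputs together with the code that executes it.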

MicroInterpreter constructor

MicroInterpreter::MicroInterpreter(const Model* model,
const MicroOpResolver& op_resolver,
uint8_t* tensor_arena,
size_t tensor_arena_size,
ErrorReporter* error_reporter,
tflite::Profiler* profiler)
: model_(model),
op_resolver_(op_resolver),
error_reporter_(error_reporter),
allocator_(*MicroAllocator::Create(tensor_arena, tensor_arena_size,
error_reporter)),
tensors_allocated_(false),
initialization_status_(kTfLiteError),
eval_tensors_(nullptr),
context_helper_(error_reporter_, &allocator_, model),
input_tensor_(nullptr),
output_tensor_(nullptr) {
Init(profiler);
}
The initializer list creates objects of two classes: MicroAllocator and ContextHelper.

MicroAllocator

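MicroAllocator allocates memory from the arena. The actual work is done by its member variable of class SimpleMemoryAllocator.

This is the most complex part of the code and is described in detail below.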

ContextHelper

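ContextHelper exists to encapsulate the implementation of the APIs in Context.

TfLiteContext contains quite a few function pointers. ContextHelper wraps the implementations of these functions, or put differently, it provides them.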

context_.AllocatePersistentBuffer = context_helper_.AllocatePersistentBuffer;

context_.RequestScratchBufferInArena = context_helper_.RequestScratchBufferInArena;

context_.GetScratchBuffer = context_helper_.GetScratchBuffer;
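The wiring above, where a C-style struct of function pointers is filled in by a helper class, can be sketched as follows. Context and Helper are hypothetical simplifications made up for this example; in particular the opaque impl pointer and the Bind function are assumptions about how such a helper can recover its own state, and alignment is ignored for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical C-style context: plain function pointers plus an opaque pointer.
struct Context {
  void* impl;
  void* (*AllocatePersistentBuffer)(Context* ctx, std::size_t bytes);
};

// Hypothetical helper that owns the state and supplies the implementation.
class Helper {
 public:
  Helper(std::uint8_t* buffer, std::size_t size)
      : buffer_(buffer), size_(size) {}

  // Static, so its address fits an ordinary function pointer. The helper
  // instance is recovered from the opaque pointer stored in the context.
  static void* AllocatePersistentBuffer(Context* ctx, std::size_t bytes) {
    assert(ctx != nullptr && ctx->impl != nullptr);
    Helper* self = static_cast<Helper*>(ctx->impl);
    if (bytes > self->size_ - self->used_) return nullptr;
    void* result = self->buffer_ + self->used_;
    self->used_ += bytes;
    return result;
  }

  // Fills in the context so that kernels can call back into this helper.
  void Bind(Context* ctx) {
    ctx->impl = this;
    ctx->AllocatePersistentBuffer = Helper::AllocatePersistentBuffer;
  }

  std::size_t used() const { return used_; }

 private:
  std::uint8_t* buffer_;
  std::size_t size_;
  std::size_t used_ = 0;
};
```

Code that only sees the Context struct can request memory without knowing anything about the Helper class behind it.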

MicroInterpreter::AllocateTensors()

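AllocateTensors is this complex because it reuses memory. For example, if the memory occupied by a tensor of operator 1 is no longer needed by the time operator 3 runs, that block can be reused. Working out which blocks can be reused is the job of the memory planner, implemented by GreedyMemoryPlanner using a greedy algorithm.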
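To show how lifetime information enables reuse, here is a minimal greedy planning sketch: buffers are visited largest first, and each is placed at the lowest offset that does not collide with any already-placed buffer whose lifetime overlaps. BufferRequest, Plan and PlanGreedily are hypothetical names, and this is a simplified illustration in the spirit of a greedy planner, not the library's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one buffer and the ops between which it is live.
struct BufferRequest {
  std::size_t size;
  int first_used;  // index of the first op that touches the buffer
  int last_used;   // index of the last op that touches it (inclusive)
};

struct Plan {
  std::vector<std::size_t> offsets;  // offsets[i] belongs to requests[i]
  std::size_t arena_size;
};

Plan PlanGreedily(const std::vector<BufferRequest>& requests) {
  // Visit the largest buffers first.
  std::vector<std::size_t> order(requests.size());
  for (std::size_t i = 0; i < order.size(); ++i) order[i] = i;
  std::stable_sort(order.begin(), order.end(),
                   [&requests](std::size_t a, std::size_t b) {
                     return requests[a].size > requests[b].size;
                   });

  Plan plan;
  plan.offsets.assign(requests.size(), 0);
  plan.arena_size = 0;
  std::vector<std::size_t> placed;
  for (std::size_t index : order) {
    const BufferRequest& current = requests[index];
    assert(current.first_used <= current.last_used);
    // Only buffers that are alive at the same time can conflict.
    std::vector<std::size_t> conflicts;
    for (std::size_t other : placed) {
      const BufferRequest& o = requests[other];
      if (current.first_used <= o.last_used &&
          o.first_used <= current.last_used) {
        conflicts.push_back(other);
      }
    }
    std::sort(conflicts.begin(), conflicts.end(),
              [&plan](std::size_t a, std::size_t b) {
                return plan.offsets[a] < plan.offsets[b];
              });
    // First fit: take the first gap between conflicting buffers that is
    // large enough, otherwise land just past the last conflicting buffer.
    std::size_t offset = 0;
    for (std::size_t other : conflicts) {
      if (offset + current.size <= plan.offsets[other]) break;
      offset = std::max(offset, plan.offsets[other] + requests[other].size);
    }
    plan.offsets[index] = offset;
    plan.arena_size = std::max(plan.arena_size, offset + current.size);
    placed.push_back(index);
  }
  return plan;
}
```

With three buffers of 100, 50 and 100 bytes, where the first and third are never alive at the same time, the first and third share offset 0 and the arena needs 150 bytes instead of 250.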

Introducing TfLiteEvalTensor

To reduce memory usage, TfLiteEvalTensor was introduced. It keeps only the members that an op must use.
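The saving comes simply from carrying fewer fields per tensor. The sketch below uses two hypothetical structs, FullTensor and SlimTensor, whose field lists are invented for illustration and do not reproduce the real TfLiteTensor or TfLiteEvalTensor definitions.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical "full" tensor record with bookkeeping fields.
struct FullTensor {
  int type;
  void* data;
  const int* dims;
  float scale;
  int zero_point;
  std::size_t bytes;
  const char* name;
  bool is_variable;
};

// Hypothetical slim record: only what a kernel needs while evaluating.
struct SlimTensor {
  void* data;
  const int* dims;
  int type;
};

static_assert(sizeof(SlimTensor) < sizeof(FullTensor),
              "the slim record should be smaller than the full one");

// Copies over just the fields that evaluation needs.
SlimTensor ToSlim(const FullTensor& full) {
  assert(full.dims != nullptr);
  SlimTensor slim;
  slim.data = full.data;
  slim.dims = full.dims;
  slim.type = full.type;
  return slim;
}
```

Multiplied by the number of tensors in a model, the per-tensor difference in size adds up on a device with only a few kilobytes of RAM.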

SimpleMemoryAllocator

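SimpleMemoryAllocator manages a linear array (the memory arena).

The memory is divided into three parts:

head (scratch buffers: the memory laid out by the GreedyMemoryPlanner strategy),

tail (the persistent part, whose contents have a persistent lifetime),

temp (the memory between head and tail, released right after use).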
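The head/tail/temp split can be sketched as a two-ended bump allocator. TwoEndedArena and its member names are hypothetical and written only to illustrate the layout; alignment handling and error reporting are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical two-ended bump allocator over a caller-provided arena.
// Layout: [ head | temp ... free ... | tail ]
class TwoEndedArena {
 public:
  TwoEndedArena(std::uint8_t* buffer, std::size_t size)
      : start_(buffer), head_(buffer), temp_(buffer), tail_(buffer + size) {
    assert(buffer != nullptr);
  }

  // Reserves the first `bytes` of the arena as the head section. Fails while
  // temp allocations are outstanding or if the head would run into the tail.
  bool SetHeadSize(std::size_t bytes) {
    if (temp_ != head_) return false;
    if (bytes > static_cast<std::size_t>(tail_ - start_)) return false;
    head_ = start_ + bytes;
    temp_ = head_;
    return true;
  }

  // Persistent allocations grow downward from the end and are never freed.
  std::uint8_t* AllocateFromTail(std::size_t bytes) {
    if (bytes > available()) return nullptr;
    tail_ -= bytes;
    return tail_;
  }

  // Temporary allocations stack up just above the head section.
  std::uint8_t* AllocateTemp(std::size_t bytes) {
    if (bytes > available()) return nullptr;
    std::uint8_t* result = temp_;
    temp_ += bytes;
    return result;
  }

  // Releases every temporary allocation at once.
  void ResetTemp() { temp_ = head_; }

  // Free bytes between the temp region and the tail.
  std::size_t available() const {
    return static_cast<std::size_t>(tail_ - temp_);
  }

 private:
  std::uint8_t* start_;
  std::uint8_t* head_;
  std::uint8_t* temp_;
  std::uint8_t* tail_;
};
```

Because both ends only move toward each other, running out of memory shows up as a failed allocation rather than as fragmentation.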

online/offline

How is the memory plan obtained? One way is offline: the metadata in the model file already provides the memory plan. The other is online: MicroInterpreter computes the plan at runtime from the contents of the model file.
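A mix of the two modes can be sketched as follows: offsets supplied ahead of time are honored, and anything left unplanned gets an offset at runtime. ResolveOffsets and the kPlanOnline marker (-1) are assumptions made for this illustration, and the online step here is a naive "append after everything else" placement rather than a real planner.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption for this sketch: -1 marks a tensor with no offline offset.
constexpr int kPlanOnline = -1;

// Returns one arena offset per tensor. Offline offsets are used as given;
// the remaining tensors are appended after the highest offline end.
std::vector<std::size_t> ResolveOffsets(
    const std::vector<std::size_t>& sizes,
    const std::vector<int>& offline_offsets) {
  assert(sizes.size() == offline_offsets.size());
  std::vector<std::size_t> offsets(sizes.size(), 0);
  std::size_t next_free = 0;
  // Pass 1: honor the offsets that came with the model.
  for (std::size_t i = 0; i < sizes.size(); ++i) {
    if (offline_offsets[i] != kPlanOnline) {
      offsets[i] = static_cast<std::size_t>(offline_offsets[i]);
      next_free = std::max(next_free, offsets[i] + sizes[i]);
    }
  }
  // Pass 2: place everything else at runtime (no reuse in this sketch).
  for (std::size_t i = 0; i < sizes.size(); ++i) {
    if (offline_offsets[i] == kPlanOnline) {
      offsets[i] = next_free;
      next_free += sizes[i];
    }
  }
  return offsets;
}
```

For tensor sizes 16, 8 and 4 with offline offsets 0, unplanned and 16, the unplanned tensor ends up at offset 20, right after the last offline-planned byte.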

Tensors are allocated differently depending on the type of tensor.

Weight tensors are located in the flatbuffer, which is allocated by the application that calls TensorFlow Lite Micro.

evalTensors are allocated in the tensor arena, either offline planned as specified in the flatbuffers metadata, or allocated during runtime by the memory planner (online planned). The tensor arena is allocated by MicroAllocator in TensorFlow Lite Micro, and the model buffer (represented by a .tflite-file) is allocated by the application using TensorFlow Lite Micro.
