TF-MPI-Distributed-Training

TensorFlow MPI-based Distributed Training

6-Minute Read

I previously wrote an article about building a distributed training cluster on Raspberry Pi 4 using the distributed training system built into TF 2.0. That approach has a drawback: the training program must be started on each node by hand, and distributed training only begins once every node is running. MPI is mainly used in the field of supercomputing. Building an MPI cluster on Raspberry Pi, firstly, is a way to learn the distributed computing techniques used in supercomputing, and secondly, it can…
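As a rough sketch of how an MPI launcher removes the need to start the program on every node by hand, here is a minimal example assuming Horovod as the MPI-backed layer on top of TF 2.x Keras (the article itself may use a different tool, and the hostnames in the launch comment are placeholders):

```python
# Sketch only: assumes Horovod built with MPI support is installed on every node.
# Launched once from a single machine, e.g.:
#   mpirun -np 4 -H pi1:1,pi2:1,pi3:1,pi4:1 python train.py
# (pi1..pi4 are placeholder hostnames for the Raspberry Pi nodes.)
import tensorflow as tf
import horovod.tensorflow.keras as hvd

# One process per node; Horovod reads rank/size from the MPI runtime.
hvd.init()

# Each worker trains on its own shard of MNIST.
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
dataset = (
    tf.data.Dataset.from_tensor_slices((x_train[..., tf.newaxis] / 255.0, y_train))
    .shard(num_shards=hvd.size(), index=hvd.rank())
    .shuffle(10000)
    .batch(64)
)

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28, 1)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Scale the learning rate by the worker count and wrap the optimizer so
# gradients are averaged across processes with MPI allreduce.
opt = hvd.DistributedOptimizer(tf.keras.optimizers.SGD(0.01 * hvd.size()))
model.compile(loss="sparse_categorical_crossentropy", optimizer=opt,
              metrics=["accuracy"])

callbacks = [
    # Make sure all workers start from the same initial weights.
    hvd.callbacks.BroadcastGlobalVariablesCallback(0),
]

# Only rank 0 prints progress to keep the output readable.
model.fit(dataset, epochs=1, callbacks=callbacks,
          verbose=1 if hvd.rank() == 0 else 0)
```

The point of the sketch is the launch model: a single mpirun invocation starts one worker per node, instead of logging into each Raspberry Pi and starting the training script separately as with TF's built-in multi-worker setup.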
