How Many Cluster Nodes Should I Use for a Simulation?
It is important to scale the number of cluster nodes with the size of the simulation. The optimal scaling is determined by the ratio of the time spent computing to the time spent exchanging data between cluster nodes.
If the simulation is distributed across too many cluster nodes, the time spent on the necessary data exchange dominates the total time for each simulation iteration. Consequently, no benefit is gained from using more cluster nodes.
The time that it takes to exchange data between cluster nodes is determined by the size of the simulation, the number of cluster nodes that are used, and the speed at which the network hardware can exchange data.
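The trade-off between compute time and data exchange time can be illustrated with a rough model. The following Python sketch is not taken from any solver; every name and default value in it (cost per cell, bytes exchanged per halo cell, bandwidth, latency, and the assumption that the exchanged data scales with the surface of each partition) is a hypothetical placeholder chosen only to show the trend.

```python
def iteration_time(total_cells, nodes, seconds_per_cell=1e-6,
                   bytes_per_halo_cell=64.0, bandwidth_bytes_per_s=1.25e7,
                   latency_s=1e-4):
    """Toy estimate of wall time per iteration; all defaults are hypothetical."""
    cells_per_node = total_cells / nodes
    # Compute time shrinks in proportion to the number of nodes.
    compute = cells_per_node * seconds_per_cell
    if nodes == 1:
        return compute  # a single node exchanges no data
    # Assumption: each partition is roughly cubic, so the cells on its
    # boundary (the halo) scale with the partition volume to the power 2/3.
    halo_cells = 6.0 * cells_per_node ** (2.0 / 3.0)
    exchange = latency_s + halo_cells * bytes_per_halo_cell / bandwidth_bytes_per_s
    return compute + exchange


def parallel_efficiency(total_cells, nodes, **kwargs):
    """Speed-up relative to one node, divided by the number of nodes."""
    t_one = iteration_time(total_cells, 1, **kwargs)
    t_many = iteration_time(total_cells, nodes, **kwargs)
    return t_one / (nodes * t_many)


if __name__ == "__main__":
    for n in (1, 2, 8, 32, 128):
        print(n, round(parallel_efficiency(2_000_000, n), 3))
```

In this model the efficiency falls as nodes are added, because the data exchange takes up a growing share of each iteration. Passing a larger `bandwidth_bytes_per_s` slows that decline, which matches the dependence on network hardware described above.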
CFD calculations require pulling large amounts of data through each core continuously. Since the typical calculations performed on the data are relatively simple, the key limitation on overall performance comes from the ability to access the data. Multiple cores have to share memory bandwidth, which may not be able to keep up with the data flow requirements.
Therefore, the optimal number of computational processes depends on the specification of the nodes and the interconnects used. With 10/100BaseT Ethernet cards, it is recommended to use at least 100,000 mesh cells per process; on higher-performance network hardware, this minimum can be lower.
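As a quick illustration of the guideline, the sketch below turns a mesh size into an upper limit on the number of processes. The function name `max_processes` is hypothetical, and the default of 100,000 cells per process is the figure quoted above for 10/100BaseT Ethernet; a smaller value would be an assumption to validate on the actual hardware.

```python
def max_processes(total_cells, min_cells_per_process=100_000):
    """Largest process count that keeps every process at or above the minimum load.

    Hypothetical helper: the default reflects the 10/100BaseT Ethernet guideline.
    """
    if total_cells < 0 or min_cells_per_process <= 0:
        raise ValueError("cell counts must be non-negative and the minimum positive")
    # Integer division rounds down so that no process falls below the minimum;
    # a mesh smaller than the minimum still runs on one process.
    return max(1, total_cells // min_cells_per_process)


if __name__ == "__main__":
    print(max_processes(5_000_000))
```

For example, a mesh of 5,000,000 cells gives an upper limit of 50 processes under this guideline, while a mesh of 50,000 cells is best run on a single process.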