How Many Cluster Nodes Should I Use for a Simulation?

It is important to match the number of cluster nodes to the size of the simulation. The optimal number of nodes is determined by the ratio of the time spent computing to the time spent exchanging data between cluster nodes.

If the simulation is distributed across too many cluster nodes, the time taken to perform the necessary data exchange dominates the total time taken to compute each simulation iteration. Beyond that point, adding more cluster nodes yields no further benefit.

The time that it takes to exchange data between cluster nodes is determined by the size of the simulation, the number of cluster nodes that are used, and the speed at which the network hardware can exchange data.
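
The trade-off described above can be illustrated with a toy cost model. This is only a sketch under stated assumptions, not how Simcenter STAR-CCM+ measures or predicts performance: the function names, the surface-to-volume estimate for exchanged cells, the logarithmic latency term, and all hardware figures are hypothetical values chosen for illustration.

```python
import math


def iteration_time(cells, nodes, cells_per_second, bytes_per_second,
                   latency_seconds, bytes_per_halo_cell=100.0):
    """Estimate the time for one iteration under a toy model (all inputs hypothetical)."""
    # Compute time shrinks in proportion to the number of nodes.
    compute = cells / (nodes * cells_per_second)
    if nodes == 1:
        return compute  # a single node exchanges no data
    # Assumption: cells on partition boundaries scale like the surface of a
    # cube-shaped partition, i.e. (cells per node) ** (2/3) on each of 6 faces.
    halo_cells = 6.0 * (cells / nodes) ** (2.0 / 3.0)
    # Assumption: a latency cost that grows slowly (log2) with the node count,
    # plus the time to push the boundary data through the network.
    exchange = (latency_seconds * math.log2(nodes)
                + halo_cells * bytes_per_halo_cell / bytes_per_second)
    return compute + exchange


def parallel_efficiency(cells, nodes, **hardware):
    """Speed-up relative to one node, divided by the number of nodes."""
    t_one = iteration_time(cells, 1, **hardware)
    t_many = iteration_time(cells, nodes, **hardware)
    return t_one / (nodes * t_many)


if __name__ == "__main__":
    # Hypothetical hardware figures, for illustration only.
    hardware = dict(cells_per_second=1e6, bytes_per_second=1e8,
                    latency_seconds=1e-4)
    for n in (1, 2, 8, 64, 1024):
        t = iteration_time(1_000_000, n, **hardware)
        e = parallel_efficiency(1_000_000, n, **hardware)
        print(f"nodes={n:5d}  time={t:.6f} s  efficiency={e:.2f}")
```

In this model, efficiency falls as nodes are added because the exchange term shrinks more slowly than the compute term, and for a small enough mesh the total time eventually increases with the node count.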

CFD calculations require pulling large amounts of data through each core continuously. Because the calculations performed on the data are typically relatively simple, overall performance is limited mainly by how quickly the data can be accessed. Multiple cores have to share memory bandwidth, which may not be able to keep up with the data flow requirements.

Therefore, the optimal number of computational processes depends on the specification of the nodes and the interconnects used. When using 100/10BaseT Ethernet cards, a minimum of 100,000 mesh cells per process is recommended; on higher-performance network hardware, this minimum can be lower.
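
As a quick worked example of this guideline, the following sketch derives an upper bound on the process count from the mesh size. The function name `max_processes` and the example mesh sizes are hypothetical; the default of 100,000 cells per process is the figure quoted above for 100/10BaseT Ethernet, and any lower threshold for faster interconnects is an assumption you would need to establish for your own hardware.

```python
def max_processes(total_cells, min_cells_per_process=100_000):
    """Largest process count that keeps every process at or above the minimum cell count."""
    if total_cells < 0:
        raise ValueError("total_cells must be non-negative")
    if min_cells_per_process <= 0:
        raise ValueError("min_cells_per_process must be positive")
    # Integer division rounds down so no process falls below the minimum;
    # a mesh smaller than the minimum still runs on one process.
    return max(1, total_cells // min_cells_per_process)


if __name__ == "__main__":
    # Hypothetical mesh sizes, using the 100,000-cell default threshold.
    for cells in (50_000, 2_500_000, 40_000_000):
        print(cells, "cells ->", max_processes(cells), "processes at most")
```

The result is a ceiling rather than a target: memory bandwidth per node, as discussed above, can make a smaller process count the better choice.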