clusters:cluster-overview

A cluster is a group of computers organized to work together to solve problems. The computers are also called nodes or hosts, and the problems are broken down into tasks called jobs.

The Grid.UP cluster is constructed from the following types of node:

  • compute - These nodes do the actual work for the jobs.
  • management - These nodes help run the cluster, deciding which compute nodes the jobs should run on and scheduling when the jobs should run.
  • submit - These nodes are where you give (or submit) your jobs to the cluster.

To run a job, you first describe what the job will do and what resources it needs. You then copy the job, along with all the information it will need, to a submit node. Finally, you submit the job from that node.

After the job is submitted, the management nodes decide whether the cluster can run it. There are three possible outcomes. If the job can run right away, it starts immediately. If it can't run right now (perhaps because you asked for more resources than are available at the moment), it waits until it can. If it can never run, the cluster tells you so.
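
The three outcomes can be illustrated with a small Python sketch. This is not how the Grid.UP scheduler is implemented; it is a simplified model that assumes a job must fit on a single node and that only cores and memory matter. The names Resources, Decision and decide are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    RUN_NOW = "run now"
    WAIT = "wait in the queue"
    REJECT = "can never run"


@dataclass(frozen=True)
class Resources:
    """Hypothetical resource description: CPU cores and memory in GB."""
    cores: int
    memory_gb: int

    def fits_within(self, other: "Resources") -> bool:
        # A request fits if it needs no more of each resource than is offered.
        return self.cores <= other.cores and self.memory_gb <= other.memory_gb


def decide(requested: Resources, free_now: Resources, largest_node: Resources) -> Decision:
    """Simplified model of the decision made after a job is submitted.

    Assumption: the job must fit on one node. `free_now` is what is free
    at the moment, `largest_node` is the most a node could ever offer.
    """
    if not requested.fits_within(largest_node):
        # Asking for more than the cluster could ever provide.
        return Decision.REJECT
    if requested.fits_within(free_now):
        return Decision.RUN_NOW
    # Possible in principle, but not with what is free right now.
    return Decision.WAIT
```

For example, with 8 cores and 16 GB free on a node that has 16 cores and 64 GB in total, a request for 4 cores would start immediately, a request for 12 cores would wait, and a request for 32 cores would be refused.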

  • Last modified: 2024/03/01 16:03
  • by ptsilva