Optimal Scheduling for Precedence-Constrained Applications on Heterogeneous Machines

Abstract. High-Performance Computing (HPC) is a growing necessity of our technological society. HPC systems run heavy loads of parallel computing jobs, so scheduling the tasks of parallel applications optimally is a priority for meeting users' demands on time. Branch-and-bound (BB) algorithms and Mathematical Programming (MP) solve complex optimization problems to optimality, and some MP and BB methods even have parallel computing capabilities, which makes them suitable for real-world problems. In this paper, we propose two exact algorithms, a BB and an MP model, for scheduling precedence-constrained applications on heterogeneous computing systems; as far as we know, they are the first of their kind in the state of the art. One major contribution of the work is the proposed formulation of the objective function in both methods. The experiments obtained more than twenty optimal values for synthetic applications from the literature.


Introduction
Normally, a heterogeneous computing system (HCS) provides high-performance machines in parallel and/or distributed systems that work cooperatively to solve problems with intensive and diverse computational requirements. Heterogeneous computing systems have been used to solve a wide variety of problems that require high computing power [1].
The principal issue in an HCS is finding the best schedule of tasks on machines, one that satisfies requirements related to efficiency, workload, economic benefits, costs, and others. This is a scheduling problem that considers several operations per job and dependencies between jobs. Typically, dependencies are modeled by a directed graph [2], [3]. Task scheduling problems, both with and without precedence constraints, as in parallel application scheduling, are known to be NP-hard [4]. That means no deterministic algorithm is known to solve them in polynomial time, hence the relevance of solving them efficiently.

Parallel program representation
Generally, a parallel application is represented by a Directed Acyclic Graph (DAG). A DAG $G = (T, E)$ consists of a set $T$ of nodes, each corresponding to a task $t_i$ of the parallel program, and a set $E$ of edges. In general, the nodes represent segments of an application that can be computed independently; each edge $(t_i, t_j) \in E$ represents a precedence constraint such that task $t_j$ cannot start until $t_i$ finishes its execution. The edge $(t_i, t_j) \in E$ also represents intertask communication between $t_i$ and $t_j$. The HCS is represented by a set of machines $M = \{m_1, m_2, \dots, m_{|M|}\}$ with different processing times for every task $t_i \in T$. The problem formulation is represented by the next three equations.
In these equations, for each task $t_i$ and machine $m_k$, the formulation uses the computation time of $t_i$ on $m_k$; the start time of an available window on $m_k$ that is long enough for that computation and begins after the predecessors of $t_i$ have finished; and the communication cost from a predecessor $t_j$ to machine $m_k$. This communication cost equals the weight of the edge $(t_j, t_i)$ when tasks $t_i$ and $t_j$ are executed on different machines, and is 0 otherwise. Each task $t_i$ also has a starting time and an ending time.
Although the above formulation is correct, the computed makespan can vary if the tasks are not executed in the best priority order. This is because the system can execute the tasks in any feasible order that does not violate the precedence constraints. Verifying the best priority requires evaluating different execution priorities for the tasks, which is computationally expensive. For example, when two tasks are assigned to the same machine and are ready at the same time, which one should be executed first? The obvious answer is the one that minimizes the overall makespan, but this must be computed before it is known. To avoid these inconsistencies, priority-based scheduling heuristics have been developed; a popular one is list scheduling. These heuristics make use of two attributes: 1) the b-level, which is the length of the longest path from the task to the exit task of the DAG; and 2) the t-level, which is the length of the longest path from the entry task to the task [5]. Algorithm 1 shows a recursive computation of the b-level attribute. The heuristic used in this work is the Heterogeneous Earliest-Finish-Time (HEFT) [6]. The heuristic computes the priority of all the tasks and schedules each task on its best processor, the one that minimizes the task's finish time.
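The b-level recursion can be illustrated with a short Python sketch. It is not the paper's Algorithm 1 listing: the function name `b_levels`, the dictionary-based graph encoding, and the four-task diamond instance are all hypothetical, and each task is assumed to have a single cost (for example, its average computation time over the machines, as HEFT does when ranking tasks).

```python
from functools import lru_cache


def b_levels(succ, cost, comm):
    """Return the b-level of every task in a DAG.

    succ: dict task -> list of successor tasks
    cost: dict task -> computation cost (assumed: one number per task)
    comm: dict (task, successor) -> communication cost of that edge
    """

    @lru_cache(maxsize=None)
    def b_level(task):
        # Longest remaining path through any successor; an exit task has
        # no successors, so its b-level is just its own cost.
        tail = max(
            (comm.get((task, s), 0) + b_level(s) for s in succ.get(task, [])),
            default=0,
        )
        return cost[task] + tail

    return {t: b_level(t) for t in cost}


# Hypothetical diamond-shaped instance: t1 -> {t2, t3} -> t4.
succ = {"t1": ["t2", "t3"], "t2": ["t4"], "t3": ["t4"], "t4": []}
cost = {"t1": 2, "t2": 3, "t3": 4, "t4": 1}
comm = {("t1", "t2"): 1, ("t1", "t3"): 2, ("t2", "t4"): 1, ("t3", "t4"): 1}

levels = b_levels(succ, cost, comm)
# Sorting by decreasing b-level gives a priority list for list scheduling.
priority = sorted(levels, key=levels.get, reverse=True)
```

With positive task costs, a predecessor always has a larger b-level than its successors, so the resulting priority list respects the precedence constraints.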

Algorithm 1. Recursive computation of b-level.
The problem formulation from Equation 1 becomes simpler when the tasks are executed in a (feasible) topological order, which lowers the computational cost of the objective function. Table 1 shows the b-level values for each task of the example instance. Algorithm 2 shows the computation of the objective value with a topological order.

Algorithm 2. Objective function with a topological order.
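A minimal Python sketch of evaluating the makespan in a topological order is given here for illustration. It is not the paper's Algorithm 2 listing: the function name `makespan`, its parameters, and the example data are hypothetical. As a simplifying assumption, each task is appended after the last task already placed on its machine, rather than searching for an earlier idle window.

```python
def makespan(order, assign, pred, p, comm):
    """Evaluate a schedule by visiting tasks in a topological order.

    order:  list of tasks, every predecessor before its successors
    assign: dict task -> machine
    pred:   dict task -> list of predecessor tasks
    p:      dict (task, machine) -> computation time
    comm:   dict (predecessor, task) -> edge weight (communication cost)
    """
    machine_free = {}  # machine -> time at which it becomes idle
    finish = {}        # task -> finish time
    for t in order:
        m = assign[t]
        # Data from a predecessor is available at its finish time, plus the
        # edge weight only if the predecessor ran on a different machine.
        ready = max(
            (
                finish[q] + (comm.get((q, t), 0) if assign[q] != m else 0)
                for q in pred.get(t, [])
            ),
            default=0,
        )
        start = max(ready, machine_free.get(m, 0))
        finish[t] = start + p[(t, m)]
        machine_free[m] = finish[t]
    return max(finish.values())


# Hypothetical instance: diamond DAG t1 -> {t2, t3} -> t4 on two machines.
pred = {"t1": [], "t2": ["t1"], "t3": ["t1"], "t4": ["t2", "t3"]}
comm = {("t1", "t2"): 1, ("t1", "t3"): 2, ("t2", "t4"): 1, ("t3", "t4"): 1}
p = {
    ("t1", "m1"): 2, ("t1", "m2"): 3,
    ("t2", "m1"): 3, ("t2", "m2"): 2,
    ("t3", "m1"): 4, ("t3", "m2"): 5,
    ("t4", "m1"): 1, ("t4", "m2"): 2,
}
order = ["t1", "t3", "t2", "t4"]

two_machines = makespan(
    order, {"t1": "m1", "t2": "m2", "t3": "m1", "t4": "m1"}, pred, p, comm
)
one_machine = makespan(order, {t: "m1" for t in order}, pred, p, comm)
```

Because every predecessor appears earlier in `order`, each finish time is already known when it is needed, so a single pass over the tasks is enough to evaluate one assignment.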

Experimental results
The experimentation consists of solving a set of 21 instances from the literature. The reference for each instance can be found in Table 2. The time limit to solve each instance was set to 37297 seconds. Table 2 shows the results obtained by the branch-and-bound and the MILP model. The first column is the instance name. The second column is the optimal value of the instance. The third column gives the reference for the instance. The fourth and fifth columns are the objective value found by the branch-and-bound and the time it required, respectively. The sixth and seventh columns are the objective value found by the MILP and the time it required.

Conclusions
In this work, a novel model for the Heterogeneous Computing Scheduling Problem is presented. The mixed integer linear programming model is compared against a branch-and-bound based on HEFT. Both strategies achieve the optimal values, but the MILP requires less computational time.