SHADOW: A workflow scheduling algorithm reference and testing framework

As the scale of science projects increases, so does the demand on computing infrastructures. The complexity of science processing pipelines, and the heterogeneity of the environments on which they are run, continue to increase; the algorithmic approaches to executing these applications must therefore be adapted and improved to deal with this increased complexity. An example of this is workflow scheduling, algorithms for which are continually being developed; however, in many systems that are used to deploy science workflows for major science projects, the same algorithms and heuristics are used for scheduling. We have developed SHADOW, a workflow-oriented scheduling algorithm framework built to address an absence of open implementations of these common algorithms, and to facilitate the development and testing of new algorithms against these 'industry standards'. SHADOW has implementations of common scheduling heuristics, with the intention of continually updating the framework with heuristics, metaheuristics, and mathematical optimisation approaches in the near future. In addition to the algorithm implementations, there are also a number of workflow and environment generation options, using the companion utility SHADOWGen; this has been provided to improve the productivity of algorithm developers in experimenting with their new algorithms over a large variety of workflows and computing environments. SHADOWGen also has translation utilities that will convert from other formats, like the Pegasus DAX file, into the SHADOW-JSON configuration. SHADOW is open-source and uses key SciPy libraries; the intention is for the framework to become a reference implementation of scheduling algorithms, and provide algorithm designers an opportunity to develop and test their own algorithms with the framework.
SHADOW code is hosted on GitHub at https://github.com/myxie/shadow; documentation for the project is available in the repository, as well as at https://shadowscheduling.readthedocs.org.


Introduction
To obtain useful results from the raw data produced by science experiments, a series of scripts or applications is often required. These application pipelines are referred to as Science Workflows [ALRP16], which are typically represented as a Directed Acyclic Graph (DAG) of the dependency relationships between application tasks in a pipeline. An example of science workflow usage is Montage, which re-projects, background corrects and adds astronomical images into custom mosaics of the sky [BCD+08], [JCD+13]. A Montage pipeline may consist of more than 10,000 jobs, perform more than 200GB of I/O (read and write), and take 5 hours to run [JCD+13]. This would be deployed using a workflow management system (for example, Pegasus [DVJ+15]), which coordinates the deployment and execution of the workflow. It is this workflow management system that passes the workflow to a workflow scheduling algorithm, which will pre-allocate the individual application tasks to nodes on the execution environment (e.g. a local grid or a cloud environment) in preparation for the workflow's execution.
The processing of Science Workflows is an example of the DAG-Task scheduling problem, a classic problem at the intersection of operations research and high performance computing [KA99a]. Science workflow scheduling is a field with varied contributions in algorithm development and optimisation, which address a number of different sub-problems within the field [WWT15], [CCAT14], [BÇRS13], [HDRD98], [RB16], [Bur]. Unfortunately, implementations of these contributions are difficult to find; some can only be found embedded in the code that uses them, such as in simulation frameworks like WorkflowSim [THW02], [CD12], while others are not implemented in any public way at all [YB06], [ANE10]. These algorithms are also typically used as benchmarks or stepping stones for new algorithms; for example, the Heterogeneous Earliest Finish Time (HEFT) heuristic continues to be used as the foundation for scheduling heuristics [DFP12], [CCCR18], meta-heuristics, and even mathematical optimisation procedures [BBL+16], despite being 20 years old. The lack of a consistent testing environment and implementation of algorithms makes it hard to reproduce and verify the results of published material, especially when a common workflow model cannot be verified.
Researchers benefit as a community from having open implementations of algorithms, as it improves reproducibility and accuracy of benchmarking and algorithmic analysis [CHI14]. There exist a number of open-source frameworks designed to test and benchmark algorithms, demonstrate typical implementations, and provide an infrastructure for the development and testing of new algorithms; examples include NLOPT for nonlinear optimisation in a number of languages (C/C++, Python, Java) [Joh], NetworkX for graph and network implementations in Python, MOEA for Java, and DEAP for distributed EAs in Python [DRFG+12]. SHADOW (Scheduling Algorithms for DAG Workflows) is our answer to the absence of a workflow-scheduling-based algorithm and testing framework like those discussed above. It is an algorithm repository and testing environment, in which the performance of single- and multi-objective workflow scheduling algorithms may be compared to implementations of common algorithms. The intended audience of SHADOW is those developing and testing novel workflow scheduling algorithms, as well as those interested in exploring existing approaches within an accessible framework.
Fig. 1: A sample DAG; vertices represent compute tasks, and edges show precedence relationships between nodes. Vertex- and edge-weights are conventionally used to describe computational and data costs, respectively. This is adapted from [THW02], and is a simple example of the DAG structure of a science workflow; a typical workflow in deployment will often be more complex and contain many hundreds of nodes and edges.
To the best of our knowledge, there is no single-source repository of implementations of DAG or Workflow scheduling algorithms. The emphasis in SHADOW is on reproducibility and accuracy in algorithm performance analysis, rather than a simulated demonstration of the application of a particular algorithm in certain environments. Additionally, with the popularity of Python in other domains that are also growing within the workflow community, such as Machine and Deep Learning, SHADOW provides a frictionless opportunity to integrate with the frameworks and libraries commonly used in those domains.

Workflow Scheduling
A workflow is commonly represented in the literature as a Directed Acyclic Graph (DAG) [CK88], [CA93], [Ull75], [KA99a]; a sequence of tasks will have precedence constraints that limit when a task may start. A DAG task-graph is represented formally as a graph G = (V, E), where V is a set of v vertices and E is a set of e edges [KA99a]; an example is featured in Figure 1, which will be built upon as the paper progresses. Vertices and edges represent computation and communication costs, respectively. The objective of the DAG-scheduling problem is to map tasks to a set of resources in an order and combination that minimises the execution length of the final schedule; this is referred to as the makespan.
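As a minimal sketch of the G = (V, E) model, assuming a small made-up task graph (the task names and weights below are illustrative and are not the values in Figure 1), the following computes the length of the longest weighted path through the DAG. This is the makespan of a schedule in which every task runs on its own identical machine and every communication cost is paid; it is not a SHADOW function.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical task graph: vertex weights are computation costs,
# edge weights are communication (data transfer) costs.
comp_cost = {"A": 3, "B": 2, "C": 4, "D": 1}
comm_cost = {("A", "B"): 1, ("A", "C"): 2, ("B", "D"): 3, ("C", "D"): 1}


def critical_path_length(comp_cost, comm_cost):
    """Return the longest path length, counting vertex and edge weights."""
    # Map each task to the set of tasks that must finish before it starts.
    predecessors = {task: set() for task in comp_cost}
    for parent, child in comm_cost:
        predecessors[child].add(parent)

    finish = {}
    # static_order() yields every task after all of its predecessors.
    for task in TopologicalSorter(predecessors).static_order():
        earliest_start = max(
            (finish[p] + comm_cost[(p, task)] for p in predecessors[task]),
            default=0,
        )
        finish[task] = earliest_start + comp_cost[task]
    return max(finish.values())


print(critical_path_length(comp_cost, comm_cost))  # prints 11
```

The longest path here is A, C, D (3 + 2 + 4 + 1 + 1 = 11). A scheduling algorithm working with fewer machines than tasks, or with heterogeneous machines, has to trade these costs off against each other.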
The complexity and size of data products from modern science projects necessitate dedicated infrastructure for compute, in a way that requires re-organisation of existing tasks and processes. It is therefore often not enough to run a sequence of tasks in series, or submit them to batch processing; this would likely be computationally inefficient, as well as taking much longer than necessary. As a result, science projects that have interrelated, computationally- and data-intensive programs have adopted the DAG-scheduling model for representing their compute pipelines; it is from this that science workflow scheduling is derived.

Design and Core Architecture
Design
SHADOW adopts a workflow-oriented design approach, where workflows are at the centre of all decisions made within the framework; environments are assigned to workflows, algorithms operate on workflows, and the main object that is manipulated and interacted with when developing an algorithm is likely to be a workflow object.
By adopting a workflow-oriented model for developing and testing algorithms, three important outcomes are achieved:
• Freedom of implementation; for users wishing to develop their own algorithms, there is no prohibition of additional libraries or data-structures, provided the workflow structure is used within the algorithm.
• Focus on the workflow and reproducibility; when running analysis and benchmarking experiments, the same workflow model is used by all algorithms, which ensures comparisons between differing approaches (e.g. a single-objective model such as HEFT vs. a dynamic implementation of a multi-objective heuristic model) are applied to the same workflow.
• Examples; we have implemented popular and well-documented algorithms that are commonly used to benchmark new algorithms and approaches. There is no need to follow the approaches taken by these implementations, but they provide a useful starting point for those interested in developing their own.
SHADOW is not intended to accurately simulate the execution of a workflow in a real-world environment; for example, working with delays in processing, or node failure in a cluster. Strategies to mitigate these are often implemented secondary to the scheduling algorithms, especially in the case of static scheduling, and would not be a fair approach to benchmarking the relative performance between each application. Instead, it provides algorithms that may be used, statically or dynamically, in a larger simulation environment, where one would be able to compare the specific environmental performance of one algorithm over another.

Architecture
SHADOW is split into three main components that are separated by their intended use case, whether that is designing new algorithms or benchmarking against the existing implementations. These components are:
• models
• algorithms
• visualiser
The models module is likely the main entry point for researchers or developers of algorithms; it contains a number of key components of the framework, the uses of which are demonstrated both in the examples directory and in the implemented sample algorithms in the algorithms module. The algorithms module is concerned with the implementations of algorithms, with the intention of providing both a recipe for implementing algorithms using SHADOW components, and benchmark implementations for performance analysis and testing. The visualiser is a useful way to add graphical components to a benchmarking recipe, or can be invoked using the command line interface to quickly run one of the in-built algorithms.
These components are all contained within the main shadow directory; there is also additional code located in utils, which is covered in the Additional tools section.
Fig. 2: An extension of the workflow DAG in Figure 1; weights on the edges describe data products from the respective parent node being sent to the child. In SHADOW, task computation cost is represented by the total number of Floating Point Operations required to run the task (see Table 1). This is intended to alleviate the difficulty of converting the run-time between different test environment configurations.

Models
The models module provides the Workflow class, the foundational data structure of SHADOW. Currently, a Workflow object is initialised using a JSON configuration file that represents the underlying DAG structure of the workflow, and stores the different attributes of the task-nodes and edges shown in Figure 2 (which is an extension of Figure 1).
These attributes are implicitly defined within the configuration file; for example, if the task graph has compute demand (as total number of FLOPs/task) but not memory demand (as average GB/task), then the Workflow object is initialised without memory, requiring no additional input from the developer.
Using the example workflow shown in Figures 1 and 2, edges in the graph, which are the dependency relationships between tasks, are described by links, along with the related data-products.
NetworkX is used to form the base-graph structure for the workflow; it allows the user to specify nodes as Python objects, so tasks are stored using the SHADOW Task object structure. By using the NetworkX.DiGraph as the storage object for the workflow structure, users familiar with NetworkX may interact with the SHADOW Workflow object in any way they would normally interact with a NetworkX Graph.
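The shape of such a configuration can be illustrated with a short sketch. The key names below ("nodes", "id", "comp", "source", "target", "data_size") and all values are assumptions chosen for illustration, not the confirmed SHADOW-JSON schema; only the use of links to describe edges is taken from the text above. The sketch uses the standard library alone and does not construct a SHADOW Workflow object.

```python
import json

# Hypothetical configuration text; key names and values are illustrative only.
config_text = """
{
  "nodes": [
    {"id": 0, "comp": 119000},
    {"id": 1, "comp": 92000},
    {"id": 2, "comp": 95000}
  ],
  "links": [
    {"source": 0, "target": 1, "data_size": 18},
    {"source": 0, "target": 2, "data_size": 12}
  ]
}
"""


def load_task_graph(text):
    """Parse a node/link description into task attributes and an adjacency map."""
    config = json.loads(text)
    # Keep whatever attributes each node declares; nothing is required
    # beyond an identifier, so absent attributes are simply absent.
    tasks = {
        node["id"]: {key: value for key, value in node.items() if key != "id"}
        for node in config["nodes"]
    }
    successors = {task_id: {} for task_id in tasks}
    for link in config["links"]:
        successors[link["source"]][link["target"]] = link["data_size"]
    return tasks, successors


tasks, successors = load_task_graph(config_text)
print(successors[0])  # prints {1: 18, 2: 12}
```

Because the attribute dictionary is built from whatever keys are present, a task graph that declares compute demand but no memory demand needs no extra handling, which mirrors the implicit attribute definition described earlier.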
In addition to the JSON configuration for the workflow DAG, a Workflow object also requires an Environment object. Environment objects represent the compute platform on which the Workflow is executed; they are added to Workflow objects so that different environments can be analysed against the same workflow. The environment is also specified in JSON; currently, there is no prescribed way to specify an environment in code, although it is possible to do so if using JSON is not an option.
In our example, we have three machines on which we are attempting to schedule the workflow from Figure 2. The different performance of each machine is described in Table 1, which has an equivalent JSON specification. The Workflow class calculates task run-time and other values based on its current environment (when the environment is passed to the Workflow); however, users of the environment class may interact with these compute values if necessary. Configuration files may be generated in a number of ways, following a variety of specifications, using the SHADOWGen utility.
Fig. 3: This is a replication of the costs provided in [THW02]. The table shows a different run-time for each task-machine pairing. It is the same structure as Figure 2; however, the JSON specification is different to cater for the pre-calculated run-time on separate machines.
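The relationship between a FLOP-based task cost and a machine-specific run-time can be sketched as a simple division. The machine names, FLOP counts and rates below are made up for illustration and are not the values in Table 1; the function is a hypothetical helper, not the calculation performed inside the Workflow class.

```python
# Hypothetical task costs (total FLOPs) and machine rates (FLOP/s).
task_flops = {0: 42000, 1: 84000}
machine_flops_per_sec = {"m0": 7000.0, "m1": 6000.0}


def runtime_table(task_flops, machine_rates):
    """Return run-time in seconds for every task-machine pairing."""
    return {
        task: {machine: flops / rate for machine, rate in machine_rates.items()}
        for task, flops in task_flops.items()
    }


table = runtime_table(task_flops, machine_flops_per_sec)
print(table[0])  # prints {'m0': 6.0, 'm1': 7.0}
```

Describing tasks by FLOPs means the same workflow file can be paired with any environment, whereas a table of pre-calculated run-times is tied to the machines it was measured on.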
It is also possible to use pre-calculated costs (i.e. completion time in seconds) when scheduling with SHADOW. This approach is less flexible for scheduling workflows, but is common in the scheduling algorithm literature [KA99a], [KA99b], [?], [BM08], [YB06]; an example of this is shown in Figure 3. This can be achieved by adding a list of costs-per-task to the workflow specification JSON file, in addition to a header. For example, if instead of the total FLOPs provided in Table 1 we had timed the run-time of the applications on each machine separately, the JSON for the workflow would list these timed costs for each task.
The final class that may be of interest to algorithm developers is the Solution class. For single-objective heuristics like HEFT or min-min, the final result is a single solution, which is a set of machine-task pairs. However, for population- and search-based metaheuristics, multiple solutions must be generated, and then evaluated, often for two or more (competing) objectives. These solutions also need to be sanity-checked in order to ensure that randomly generated task-machine pairs still follow the precedence constraints defined by the original workflow DAG. The Solution provides a basic object structure that stores machine and task pairs as a dictionary of Allocations; allocations store the task-ID and its start and finish time on the machine. This makes life easier for developers, who can interact with allocations using intuitive attributes (rather than navigating a dictionary of stored keywords). The Solution currently stores a single objective (makespan) but can be expanded to include other, algorithm-specific requirements. For example, NSGAII* ranks each generated solution using the non-dominated rank and crowding distance operator; as a result, the SHADOW implementation creates a class, NSGASolution, that inherits the basic Solution class and adds these additional attributes. This reduces the complexity of the global solution class whilst providing the flexibility for designers to create more elaborate solutions (and algorithms).
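The idea of a solution as a dictionary of allocations, together with a precedence sanity check, can be sketched as follows. The class and method names (SketchSolution, add_allocation, respects_precedence) are hypothetical and are not SHADOW's Solution API; the check also ignores communication delays for brevity and assumes every task named in an edge has been allocated.

```python
from dataclasses import dataclass


@dataclass
class Allocation:
    """A task placed on a machine, with its start and finish time."""
    task_id: str
    start: float
    finish: float


class SketchSolution:
    """Hypothetical stand-in for a solution object: machine -> allocations."""

    def __init__(self, machines):
        self.allocations = {machine: [] for machine in machines}
        self.makespan = 0

    def add_allocation(self, machine, task_id, start, finish):
        self.allocations[machine].append(Allocation(task_id, start, finish))
        # The makespan is the latest finish time seen so far.
        self.makespan = max(self.makespan, finish)

    def respects_precedence(self, edges):
        """Check that every parent finishes before its child starts."""
        start, finish = {}, {}
        for allocs in self.allocations.values():
            for alloc in allocs:
                start[alloc.task_id] = alloc.start
                finish[alloc.task_id] = alloc.finish
        return all(finish[parent] <= start[child] for parent, child in edges)


solution = SketchSolution(["m0", "m1"])
solution.add_allocation("m0", "A", 0, 3)
solution.add_allocation("m1", "B", 3, 5)
print(solution.makespan)                          # prints 5
print(solution.respects_precedence([("A", "B")]))  # prints True
```

A metaheuristic that generates task-machine pairs at random could run a check of this kind before evaluating a candidate, discarding or repairing candidates that violate the DAG.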

Algorithms
The algorithms in this module may be extended by others, or used when running comparisons and benchmarking. The examples directory gives an overview of recipes that one can follow to use the algorithms to perform benchmarking.
The SHADOW approach to describing an algorithm presents the algorithm as a single entity (e.g. heft()), with an initialised workflow object passed as a function parameter. The typical structure of a SHADOW algorithm function is as follows:
• The main algorithm (the function to which a Workflow will be passed) is titled using its publication name or title (e.g. HEFT, PCP, NSGAII* etc.). Following PEP8, this is (ideally) in lower-case.
• Within the main algorithm function, effort has been made to keep it structured in a similar way to the pseudo-code as presented in the respective paper. For example, HEFT has two main components: the Upward Ranking of tasks in the workflow, and the Insertion Policy allocation scheme. This is presented in SHADOW as:

    def heft(workflow):
        """
        Implementation of the original 1999 HEFT algorithm.

        :params workflow: The workflow object to schedule
        :returns: The solution object from the scheduled workflow
        """
        # Rank every task, order the tasks by rank, then allocate them
        upward_rank(workflow)
        workflow.sort_tasks('rank')
        insertion_policy(workflow)
        return workflow.solution

Complete information of the final schedule is stored in the HEFTWorkflow.solution object, which provides additional information, such as task-machine allocation pairs. In keeping with the generic requirements of DAG-based scheduling algorithms, the base Solution class prioritises makespan over other objectives; however, this may be amended (or even ignored) for other approaches. For example, the NSGAII algorithm uses a sub-class for this purpose, as it generates multiple solutions before ranking each solution using the crowded distance or non-dominated sort [SD94]:

    class NSGASolution(Solution):
        """
        A simple class to store each solution's related information
        """

        def __init__(self, machines):
            super().__init__(machines)
            self.dom_counter = 0
            self.nondom_rank = -1
            self.crowding_dist = -1
            self.solution_cost = 0

Visualiser
SHADOW provides wrappers to matplotlib that are structured around the Workflow and Solution classes. The Visualiser uses the Solution class to retrieve allocation data, and generates a plot based on that information. For example, Figure 4 is the result of visualising the HEFTWorkflow example mentioned previously. This can be achieved by creating a script using the algorithms as described above, and then passing the scheduled workflow to one of the Visualiser classes.
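To make the first step of the heft() function shown earlier in this section more concrete, the sketch below computes HEFT's upward rank on a small made-up graph. The upward rank of a task is its mean computation cost across machines plus the largest value of (mean communication cost + upward rank) over its successors. The data and the function name upward_ranks are illustrative assumptions; this is a standalone sketch of the published ranking rule, not SHADOW's upward_rank implementation.

```python
from functools import lru_cache

# Hypothetical costs: comp[task] lists the run-time on each machine,
# comm[(parent, child)] is the mean communication cost of the edge.
comp = {"A": [4, 6], "B": [3, 5], "C": [2, 2]}
comm = {("A", "B"): 2, ("A", "C"): 3}


def upward_ranks(comp, comm):
    """Return the HEFT upward rank of every task in the DAG."""
    successors = {task: [] for task in comp}
    for (parent, child), cost in comm.items():
        successors[parent].append((child, cost))

    @lru_cache(maxsize=None)
    def rank(task):
        mean_cost = sum(comp[task]) / len(comp[task])
        # Exit tasks have no successors, so their tail contribution is 0.
        tail = max(
            (cost + rank(child) for child, cost in successors[task]),
            default=0,
        )
        return mean_cost + tail

    return {task: rank(task) for task in comp}


ranks = upward_ranks(comp, comm)
order = sorted(ranks, key=ranks.get, reverse=True)
print(ranks)  # prints {'A': 11.0, 'B': 4.0, 'C': 2.0}
print(order)  # prints ['A', 'B', 'C']
```

Sorting by decreasing rank gives an order in which every task appears after its predecessors, which is the order the insertion policy then uses to place tasks on machines.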

Additional tools
Command-line interface
SHADOW provides a simple command-line interface (CLI) that allows users to run algorithms on workflows without composing a separate Python file to do so; this provides more flexibility and allows users to use a scripting language of their choice to run experiments and analysis. It is also possible to use the unittest module from the script to run through all tests if necessary:

    python3 shadow.py test --all

SHADOWGen
SHADOWGen is a utility built into the framework to generate workflows that are reproducible and interpretable. It is designed to generate a variety of workflows that have been documented and characterised in the literature in a way that augments current techniques, rather than replacing them entirely.
This includes the following:
• Python code that runs the GGen graph generator, which produces graphs in a variety of shapes and sizes based on provided parameters. This was originally designed to produce task graphs to test the performance of DAG scheduling algorithms.
• DAX Translator: This takes the commonly used Directed Acyclic XML (DAX) file format, used to generate graphs for Pegasus, and translates it into the SHADOW format. Future work will also interface with the Workflow-Generator code that is based on the work conducted in [BCD+08], which generates DAX graphs.
• DALiuGE/EAGLE Translator [WTV+17]: EAGLE logical graphs must be unrolled into Physical Graph Templates (PGT) before they are in a DAG that can be scheduled in SHADOW. SHADOWGen will run the DALiuGE unroll code, and then convert this PGT into a SHADOW-based JSON workflow.
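The kind of translation the DAX Translator performs can be sketched with the standard library XML parser. The DAX fragment below is hand-written for illustration (real DAX files carry more attributes and file-usage elements), and the output dictionary layout is an assumption rather than the SHADOW format; dax_to_node_link is a hypothetical helper, not the SHADOWGen function.

```python
import xml.etree.ElementTree as ET

# Hand-written, minimal DAX-style fragment for illustration only.
DAX_SAMPLE = """
<adag xmlns="http://pegasus.isi.edu/schema/DAX" name="sample">
  <job id="ID00000" name="mProjectPP" runtime="13.59"/>
  <job id="ID00001" name="mDiffFit" runtime="10.50"/>
  <child ref="ID00001">
    <parent ref="ID00000"/>
  </child>
</adag>
"""


def local_name(tag):
    """Strip an XML namespace prefix such as '{uri}job' down to 'job'."""
    return tag.rsplit("}", 1)[-1]


def dax_to_node_link(dax_text):
    """Convert job and child/parent elements into nodes and links."""
    root = ET.fromstring(dax_text.strip())
    nodes, links = [], []
    for element in root:
        kind = local_name(element.tag)
        if kind == "job":
            nodes.append({
                "id": element.get("id"),
                "name": element.get("name"),
                "runtime": float(element.get("runtime", 0)),
            })
        elif kind == "child":
            # Each <parent> inside a <child> is one precedence edge.
            for parent in element:
                if local_name(parent.tag) == "parent":
                    links.append({
                        "source": parent.get("ref"),
                        "target": element.get("ref"),
                    })
    return {"nodes": nodes, "links": links}


graph = dax_to_node_link(DAX_SAMPLE)
print(graph["links"])  # prints [{'source': 'ID00000', 'target': 'ID00001'}]
```

The DAX format records dependencies from the child's point of view, so the translation step inverts each child/parent pair into a source/target edge.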

Cost generation in SHADOWGen
A majority of work published in workflow scheduling will use workflows generated using the approach laid out in [BCD+08]. The five workflows described in the paper (Montage, CyberShake, Epigenomics, SIPHT and LIGO) had their task run-time, memory and I/O rates profiled, from which the authors created a WorkflowGenerator tool. This tool uses the distribution sizes for run-time etc., without requiring any information on the hardware on which the workflows are being scheduled. This means that, if those values are used across the board, the characterisation is only accurate for the hardware on which the workflows were profiled; testing on heterogeneous systems, for example, is not possible unless the values are changed. This is dealt with in varied ways across the literature. For example, [RB18] use the distributions from the [BCD+08] paper, and change the units from seconds to MIPS, rather than doing a conversion between the two. Others use the values taken from the distribution and workflow generator without further explanation.
SHADOWGen differs from the literature by using a normalised-cost approach, in which the values calculated for the run-time, memory, and I/O of each task are derived from the normalised size as profiled in [JCD+13] and [BCD+08]. This way, the costs per-workflow are indicative of the relative length and complexity of each task, and are more likely to transpose across different hardware configurations than the varied approaches in the literature.
The distribution of values is derived from a table of normalised values using a variation on min-max feature scaling for each mean or standard deviation column in Table 2. The formula to calculate each task's normalised values is described in Equation 1; the results of applying this to Table 2 are shown in Table 3.
This approach allows algorithm designers and testers to describe what units they are interested in (e.g. seconds, MIPS, or FLOP seconds for run-time, MB or GB for Memory etc.) whilst still retaining the relative costs of that task within the workflow. In the example of Table 3, it is clear that mAdd and mBackground are still the longest running and I/O intensive tasks, making the units less of a concern.
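For reference, standard min-max feature scaling maps each value x in a column to (x - min) / (max - min). The sketch below implements that textbook form only; the variation used by SHADOWGen is not reproduced here, and the input values are made up rather than taken from Table 2.

```python
def min_max_scale(values, lower=0.0, upper=1.0):
    """Rescale values linearly so the smallest maps to lower and the largest to upper."""
    smallest, largest = min(values), max(values)
    if largest == smallest:
        # A constant column carries no relative information.
        return [lower for _ in values]
    span = largest - smallest
    return [lower + (value - smallest) * (upper - lower) / span for value in values]


# Hypothetical mean run-times for three task types, in arbitrary units.
mean_runtimes = [2.0, 4.0, 10.0]
print(min_max_scale(mean_runtimes))  # prints [0.0, 0.25, 1.0]
```

Because the scaled values keep the same ordering and relative spacing as the profiled values, they can then be multiplied into whichever unit the designer chooses without changing which tasks dominate the workflow.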
Alternatives to SHADOW
It should be noted that existing work already addresses testing workflow scheduling algorithms in real-world environments; tools like SimGrid [CLQ], BatSim [DMPR17], GridSim [BM02], and its extensions, CloudSim [CRB+11] and WorkflowSim [CD12], all feature strongly in the literature. These are excellent resources for determining the effectiveness of the implementations at the application level; however, they do not provide a standardised repository of existing algorithms, or a template workflow model that can be used to ensure consistency across performance testing. Current implementations of workflow scheduling algorithms may be found in a number of different environments; for example, HEFT and dynamic-HEFT implementations exist in WorkflowSim, but one must traverse large repositories in order to reach them. There are also a number of implementations that are present on open-source repositories such as GitHub, but these are not always official releases from papers, and it is difficult to keep track of multiple implementations to ensure quality and consistency. The algorithms that form the algorithms module in SHADOW are open and continually updated, and share a consistent workflow model. Kwok and Ahmad [KA99a] provide a comprehensive overview of the metrics and foundations of what is required when benchmarking DAG-scheduling algorithms; Maurya et al. [MT18] extend this work and describe key features of a potential framework for scheduling algorithms. SHADOW takes inspiration from, and extends, both approaches.

Conclusion
SHADOW is a development framework that addresses the absence of a repository of workflow scheduling algorithms, which is important for benchmarking and reproducibility [MT18]. This repository continues to be updated, providing a resource for future developers. SHADOWGen extends existing research on graph generation from both the task- and workflow-scheduling communities by wrapping existing techniques into a simple and flexible utility. The adoption of a JSON data format complements the move towards JSON as a standardised way of representing workflows, as demonstrated by the Common Workflow Language [CCH+16] and WorkflowHub.

Future work
Moving forward, heuristics and metaheuristics will continue to be added to the SHADOW algorithms module to facilitate broader benchmarking and to provide a living repository of workflow scheduling algorithms. Further investigation into workflow visualisation techniques will also be conducted. There are plans to develop a tool that uses the specifications in hpconfig, a Python class-based specification of different hardware (e.g. class XeonPhi) and High Performance Computing facilities (e.g. class PawseyGalaxy). The motivation behind hpconfig is that classes can be quickly unwrapped into a large cluster or system, without having large JSON files in the repository or on disk; they also improve readability, as specification data is represented clearly as class attributes.