A General-purpose Task-parallel Programming System using Modern C++; Tsung-Wei Huang (2018).
Installation
Install taskflow.cxx from npm:
```bash
$ npm i taskflow.cxx
```
And then include taskflow.hpp as follows:
```cxx
// main.cxx
#include <taskflow/taskflow.hpp>
int main() { /* ... */ }
```
Finally, compile with the path node_modules/taskflow.cxx added to your compiler's include paths:
```bash
$ clang++ -I./node_modules/taskflow.cxx main.cxx  # or use g++
$ g++ -I./node_modules/taskflow.cxx main.cxx
```
A simpler approach is to use the cpoach tool, which automatically adds the include paths of all installed dependencies in your project:
```bash
$ cpoach clang++ main.cxx  # or use g++
$ cpoach g++ main.cxx
```
Start Your First Taskflow Program
The following program (simple.cpp) creates a taskflow of four tasks
A, B, C, and D, where A runs before B and C, and D
runs after B and C.
When A finishes, B and C can run in parallel.
Try it live on Compiler Explorer (godbolt)!
```cpp
#include <taskflow/taskflow.hpp>  // Taskflow is header-only
int main(){
  tf::Executor executor;
  tf::Taskflow taskflow;
  auto [A, B, C, D] = taskflow.emplace(  // create four tasks
    [] () { std::cout << "TaskA\n"; },
    [] () { std::cout << "TaskB\n"; },
    [] () { std::cout << "TaskC\n"; },
    [] () { std::cout << "TaskD\n"; }
  );
  A.precede(B, C);  // A runs before B and C
  D.succeed(B, C);  // D runs after B and C
  executor.run(taskflow).wait();
  return 0;
}
```
Taskflow is header-only, so there is no installation to wrangle with.
To compile the program, clone the Taskflow project and
tell the compiler to include the headers.
```bash
~$ git clone https://github.com/taskflow/taskflow.git  # clone it only once
~$ g++ -std=c++20 examples/simple.cpp -I. -O2 -pthread -o simple
~$ ./simple
TaskA
TaskC
TaskB
TaskD
```
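Since B and C have no dependency between each other, they may run in parallel and print in either order (the sample output above happens to show TaskC before TaskB); only A is guaranteed to print first and D last.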
Visualize Your First Taskflow Program
Taskflow comes with a built-in profiler,
TFProf,
for you to profile and visualize taskflow programs
in an easy-to-use web-based interface.

```bash
# run the program with the environment variable TF_ENABLE_PROFILER enabled
~$ TF_ENABLE_PROFILER=simple.json ./simple
~$ cat simple.json
[
{"executor":"0","data":[{"worker":0,"level":0,"data":[{"span":[172,186],"name":"0_0","type":"static"},{"span":[187,189],"name":"0_1","type":"static"}]},{"worker":2,"level":0,"data":[{"span":[93,164],"name":"2_0","type":"static"},{"span":[170,179],"name":"2_1","type":"static"}]}]}
]
# paste the profiling json data to https://taskflow.github.io/tfprof/
```
In addition to the execution diagram, you can dump the graph to a DOT format
and visualize it using a number of free GraphViz tools.
```cpp
// dump the taskflow graph to a DOT format through std::cout
taskflow.dump(std::cout);
```
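For example, assuming you redirect the dumped output to a file such as graph.dot, you can render it locally with GraphViz (e.g., `dot -Tpng graph.dot -o graph.png`) or paste it into an online GraphViz viewer.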

Express Task Graph Parallelism
Taskflow empowers users with both static and dynamic task graph constructions
to express end-to-end parallelism in a task graph that
embeds in-graph control flow.
- Create a Subflow Graph
- Integrate Control Flow to a Task Graph
- Offload a Task to a GPU
- Compose Task Graphs
- Launch Asynchronous Tasks
- Execute a Taskflow
- Leverage Standard Parallel Algorithms
Create a Subflow Graph
Taskflow supports dynamic tasking for you to create a subflow
graph from the execution of a task to perform dynamic parallelism.
The following program spawns a task dependency graph parented at task B.
```cpp
tf::Task A = taskflow.emplace([](){}).name("A");
tf::Task C = taskflow.emplace([](){}).name("C");
tf::Task D = taskflow.emplace([](){}).name("D");
tf::Task B = taskflow.emplace([] (tf::Subflow& subflow) {
  tf::Task B1 = subflow.emplace([](){}).name("B1");
  tf::Task B2 = subflow.emplace([](){}).name("B2");
  tf::Task B3 = subflow.emplace([](){}).name("B3");
  B3.succeed(B1, B2);  // B3 runs after B1 and B2
}).name("B");
A.precede(B, C);  // A runs before B and C
D.succeed(B, C);  // D runs after B and C
```
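Note that the subflow is built at runtime, when B executes, and by default it joins its parent task: B does not complete until B1, B2, and B3 finish, so D still waits for the entire subflow.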

Integrate Control Flow to a Task Graph
Taskflow supports conditional tasking for you to make rapid
control-flow decisions across dependent tasks to implement cycles
and conditions in an end-to-end task graph.
```cpp
tf::Task init = taskflow.emplace([](){}).name("init");
tf::Task stop = taskflow.emplace([](){}).name("stop");
// creates a condition task that returns a random binary
tf::Task cond = taskflow.emplace(
  [](){ return std::rand() % 2; }
).name("cond");
init.precede(cond);
// creates a feedback loop {0: cond, 1: stop}
cond.precede(cond, stop);
```
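The integer returned by a condition task selects which successor runs next: returning 0 jumps back to the first successor (cond itself, forming the loop), while returning 1 proceeds to the second successor (stop).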

Offload a Task to a GPU
Taskflow supports GPU tasking for you to accelerate a wide range of scientific computing applications by harnessing the power of CPU-GPU collaborative computing using Nvidia CUDA Graph.
```cpp
__global__ void saxpy(size_t N, float alpha, float* dx, float* dy) {
  int i = blockIdx.x*blockDim.x + threadIdx.x;
  if (i < N) {
    dy[i] = alpha*dx[i] + dy[i];
  }
}

// create a CUDA Graph task
tf::Task cudaflow = taskflow.emplace([&]() {
  tf::cudaGraph cg;

  // copy data between host and device
  tf::cudaTask h2d_x = cg.copy(dx, hx.data(), N);
  tf::cudaTask h2d_y = cg.copy(dy, hy.data(), N);
  tf::cudaTask d2h_x = cg.copy(hx.data(), dx, N);
  tf::cudaTask d2h_y = cg.copy(hy.data(), dy, N);

  // launch the saxpy kernel with 256 threads per block
  tf::cudaTask kernel = cg.kernel((N+255)/256, 256, 0, saxpy, N, 2.0f, dx, dy);

  kernel.succeed(h2d_x, h2d_y)
        .precede(d2h_x, d2h_y);

  // instantiate an executable CUDA graph and run it through a stream
  tf::cudaGraphExec exec(cg);
  tf::cudaStream stream;
  stream.run(exec).synchronize();
}).name("CUDA Graph Task");
```
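In this snippet, hx and hy are assumed to be pre-allocated host buffers (e.g., std::vector<float> of size N) and dx and dy device pointers allocated beforehand (e.g., with cudaMalloc); the surrounding lambda captures them by reference.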

Compose Task Graphs
Taskflow is composable.
You can create large parallel graphs through composition of modular
and reusable blocks that are easier to optimize at an individual scope.
```cpp
tf::Taskflow f1, f2;

// create taskflow f1 of two tasks
tf::Task f1A = f1.emplace([]() { std::cout << "Task f1A\n"; })
                 .name("f1A");
tf::Task f1B = f1.emplace([]() { std::cout << "Task f1B\n"; })
                 .name("f1B");

// create taskflow f2 with one module task composed of f1
tf::Task f2A = f2.emplace([]() { std::cout << "Task f2A\n"; })
                 .name("f2A");
tf::Task f2B = f2.emplace([]() { std::cout << "Task f2B\n"; })
                 .name("f2B");
tf::Task f2C = f2.emplace([]() { std::cout << "Task f2C\n"; })
                 .name("f2C");

tf::Task f1_module_task = f2.composed_of(f1)
                            .name("module");
f1_module_task.succeed(f2A, f2B)
              .precede(f2C);
```
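When f2 runs, the module task executes the entire f1 graph (f1A and f1B) after f2A and f2B finish and before f2C starts.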

Launch Asynchronous Tasks
Taskflow supports asynchronous tasking.
You can launch tasks asynchronously to dynamically explore task graph parallelism.
```cpp
tf::Executor executor;

// create asynchronous tasks directly from an executor
std::future<int> future = executor.async([](){
  std::cout << "async task returns 1\n";
  return 1;
});
executor.silent_async([](){ std::cout << "async task does not return\n"; });

// create asynchronous tasks with dynamic dependencies
tf::AsyncTask A = executor.silent_dependent_async([](){ printf("A\n"); });
tf::AsyncTask B = executor.silent_dependent_async([](){ printf("B\n"); }, A);
tf::AsyncTask C = executor.silent_dependent_async([](){ printf("C\n"); }, A);
tf::AsyncTask D = executor.silent_dependent_async([](){ printf("D\n"); }, B, C);

executor.wait_for_all();
```
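The std::future returned by executor.async lets you retrieve the task's result later (here, future.get() yields 1), whereas the silent variants do not return a result.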
Execute a Taskflow
The executor provides several thread-safe methods to run a taskflow.
You can run a taskflow once, multiple times, or until a stopping criterion is met.
These methods are non-blocking and return a tf::Future<void>
that lets you query the execution status.
```cpp
// runs the taskflow once
tf::Future<void> run_once = executor.run(taskflow);
// wait on this run to finish
run_once.get();
// run the taskflow four times
executor.run_n(taskflow, 4);
// runs the taskflow five times
executor.run_until(taskflow, [counter=5]() mutable { return --counter == 0; });
// block the executor until all submitted taskflows complete
executor.wait_for_all();
```
Leverage Standard Parallel Algorithms
Taskflow defines algorithms for you to quickly express common parallel
patterns using standard C++ syntax,
such as parallel iterations, parallel reductions, and parallel sort.
```cpp
tf::Task task1 = taskflow.for_each( // assign each element to 100 in parallel
  first, last, [] (auto& i) { i = 100; }
);
tf::Task task2 = taskflow.reduce(   // reduce a range of items in parallel
  first, last, init, [] (auto a, auto b) { return a + b; }
);
tf::Task task3 = taskflow.sort(     // sort a range of items in parallel
  first, last, [] (auto a, auto b) { return a < b; }
);
```
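To make the iterator arguments concrete, below is a minimal, self-contained sketch (the container name data and the result variable sum are illustrative) that fills a vector in parallel and then reduces it, chaining the two algorithm tasks with a dependency:
```cpp
#include <taskflow/taskflow.hpp>
#include <vector>
#include <iostream>

int main() {
  tf::Executor executor;
  tf::Taskflow taskflow;

  std::vector<int> data(1000);   // range to iterate over
  int sum = 0;                   // reduction result

  // assign each element to 100 in parallel
  tf::Task fill = taskflow.for_each(
    data.begin(), data.end(), [](int& i){ i = 100; }
  );

  // sum all elements in parallel, accumulating into `sum`
  tf::Task total = taskflow.reduce(
    data.begin(), data.end(), sum, [](int a, int b){ return a + b; }
  );

  fill.precede(total);  // the reduction must see the filled values

  executor.run(taskflow).wait();
  std::cout << "sum = " << sum << '\n';  // prints 100000
}
```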
Additionally, Taskflow provides composable graph building blocks for you to
efficiently implement common parallel algorithms, such as parallel pipelines.
```cpp
// create a pipeline to propagate five tokens through three serial stages
tf::Pipeline pl(num_parallel_lines,
  tf::Pipe{tf::PipeType::SERIAL, [](tf::Pipeflow& pf) {
    // stage 1: stop the pipeline after five tokens
    if(pf.token() == 5) {
      pf.stop();
    }
  }},
  tf::Pipe{tf::PipeType::SERIAL, [](tf::Pipeflow& pf) {
    printf("stage 2: input buffer[%zu] = %d\n", pf.line(), buffer[pf.line()]);
  }},
  tf::Pipe{tf::PipeType::SERIAL, [](tf::Pipeflow& pf) {
    printf("stage 3: input buffer[%zu] = %d\n", pf.line(), buffer[pf.line()]);
  }}
);
taskflow.composed_of(pl);
executor.run(taskflow).wait();
```