Workflow Syntax

Path: doc/Workflow Syntax
Last Update: Tue May 04 23:26:16 -0600 2010


Tap uses a custom syntax for specifying workflows. The syntax consists of argument vectors separated by various breaks (for example '-', '--', and '-:'). The argument vectors define tasks, joins, and other workflow objects, while the breaks determine what is done with those objects.

Note this documentation is written from the perspective of ruby-1.9.*, which changes the output of Array#to_s (among other things). See the Ruby to Ruby document for more details.

Basics

Argument vectors start with a constant identifier and subsequent arguments indicate inputs and/or options. Inputs are passed to the task as an array while options are parsed out as configurations.

  % tap dump 'goodnight moon'
  ["goodnight moon"]

  % tap dump --help
  Tap::Tasks::Dump -- dump data
  ...

Multiple objects can be specified using delimiters: a single dash (-) delimits an object without enqueuing it, and a double dash (--) delimits an object and enqueues it. The first object is implicitly enqueued, but that behavior can be changed by starting the workflow with an explicit delimiter.

  % tap dump a - dump b -- dump c
  ignoring args: ["b"]
  ["a"]
  ["c"]

  % tap - dump a -- dump b - dump c
  ignoring args: ["a"]
  ignoring args: ["c"]
  ["b"]

Each object in a workflow is stored in the application by index, allowing them to be referenced from contexts that specifically need them, such as during joins or in signaling (note the context chooses when this happens; normally indexes like 0 are just another character).

  % tap - dump -/0/enq 'enqued by a signal'
  ["enqued by a signal"]

Constants are identified by matching strings right-to-left as if the constant were underscored into a path. Alternatively, the full constant name can be provided. These all resolve to Tap::Tasks::Load.

  % tap load
  % tap tap/tasks/load
  % tap /tap/tasks/load
  % tap Tap::Tasks::Load

In addition, a left:right identifier can be specified to match both sides of the constant path:

  % tap tap:load
  % tap /tap/tasks:tasks/load
  % tap /tap/tasks/load:
  % tap :/tap/tasks/load

The workflow syntax reserves dash and all dash-nonword pairs as breaks. Use the escape begin (-.) and escape end (.-) sequences to provide breaks as inputs.

  % tap dump begin -. - -- --- -: -/ --/ .- end
  ["begin", "-", "--", "---", "-:", "-/", "--/", "end"]

Not all possible breaks are used, but reserving them all makes for one simple rule. The various breaks and common use cases are described below.
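As a rough illustration of how breaks and escapes divide a command line into argument vectors, here is a minimal Ruby sketch. It is not Tap's parser: the `BREAK` pattern is a simplified assumption covering only the breaks shown in this document, and `split_argv` is a hypothetical helper.

```ruby
# Simplified, hypothetical break test; Tap's real parser may differ.
# Matches "-", "--", "---", and tokens starting with "-:", "-/", or "--/".
BREAK = %r{\A(?:-{1,3}|-:.*|--?/.*)\z}

ESCAPE_BEGIN = "-."
ESCAPE_END   = ".-"

# Split argv into [break, args] pairs. The first vector has a nil break
# when argv does not start with one.
def split_argv(argv)
  vectors = [[nil, []]]
  escaped = false

  argv.each do |arg|
    if escaped
      if arg == ESCAPE_END
        escaped = false
      else
        vectors.last[1] << arg   # inside an escape, breaks are plain inputs
      end
    elsif arg == ESCAPE_BEGIN
      escaped = true
    elsif arg.match?(BREAK)
      vectors << [arg, []]       # a break starts a new argument vector
    else
      vectors.last[1] << arg
    end
  end

  vectors.reject { |brk, args| brk.nil? && args.empty? }
end

argv = %w[dump begin -. - -- --- -: .- end -- dump c]
split_argv(argv).each { |vector| p vector }
```

Note that options such as `--help` or `--limit` are not treated as breaks by this pattern, because the dashes are followed by a word character.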

Sequence (-:)

Objects can be linked into sequences using a -: break. Here load processes the input into a string, which gets passed to dump and printed (hence the output looks like a string, not an array).

  % tap load 'goodnight moon' -: dump
  goodnight moon

The sequence break is a convenient shorthand for manually building two objects and a join.

  % tap load 'goodnight moon' - dump - join 0 1
  goodnight moon

The longer syntax supports multi-way joins like fork and merge.

  % tap load 'goodnight moon' - dump - dump - join 0 1,2
  goodnight moon
  goodnight moon

  % tap load goodnight -- load moon - dump - join 0,1 2
  goodnight
  moon

It also allows for more complex join types. For example this is a gate join where the outputs are collected into a size-limited array before being passed along.

  % tap load a -- load b -- load c - dump - gate 0,1,2 3 --limit 2
  ["a", "b"]
  ["c"]
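The batching behavior of a size-limited gate can be modeled with a short Ruby sketch. `GateSketch` is a hypothetical stand-in, not Tap's gate join; in particular, the final `flush` assumes that leftover outputs are passed along once the run completes.

```ruby
# Hypothetical stand-in for a gate join; not Tap's implementation.
class GateSketch
  def initialize(limit, &target)
    @limit = limit
    @target = target
    @results = []
  end

  # Collect one output; pass the batch along once it reaches the limit.
  def call(result)
    @results << result
    flush if @results.length >= @limit
  end

  # Pass along whatever remains (assumed to happen when the run finishes).
  def flush
    return if @results.empty?

    batch, @results = @results, []
    @target.call(batch)
  end
end

batches = []
gate = GateSketch.new(2) { |batch| batches << batch }
%w[a b c].each { |output| gate.call(output) }
gate.flush
batches.each { |batch| p batch }
```

With a limit of 2 and three outputs, the target receives one full batch and one partial batch, mirroring the shape of the transcript above.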

Variations

Joins can iterate or arrayify outputs before passing them to join targets. These options are useful for changing the input/output signatures of objects across a join. For example (borrowing the yaml loading task from tap-tasks):

  % tap load/yaml "[1, 2, 3]" - dump - join 0 1
  [1, 2, 3]

  % tap load/yaml "[1, 2, 3]" - dump - join 0 1 --iterate
  1
  2
  3
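The effect of the iterate option can be sketched in Ruby as a dispatch step that either passes a result through whole or passes its items one at a time. `dispatch` is a hypothetical function for illustration, not Tap's join code, and the arrayify option is omitted.

```ruby
# Hypothetical sketch of a join's dispatch step; not Tap's implementation.
# Each target is any callable that accepts one input.
def dispatch(result, targets, iterate: false)
  if iterate
    # Pass each item of the result to every target separately.
    result.each { |item| targets.each { |target| target.call(item) } }
  else
    # Pass the result through unchanged.
    targets.each { |target| target.call(result) }
  end
end

received = []
dump = ->(input) { received << input }

dispatch([1, 2, 3], [dump])                 # dump is called once, with the array
dispatch([1, 2, 3], [dump], iterate: true)  # dump is called three times
p received
```

The first call delivers a single input `[1, 2, 3]`; the second delivers `1`, `2`, and `3` as separate inputs.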

Sequences can specify flags and a join class using the form '-:flags.class'. For example, these are equivalent workflows:

  % tap load/yaml "[1, 2, 3]" - dump - dump - join 0 1 --iterate - gate 1 2
  1
  2
  3
  [1, 2, 3]

  % tap load/yaml "[1, 2, 3]" -:i dump -:.gate dump
  1
  2
  3
  [1, 2, 3]

Signals (-/, --/)

Internally tap parses workflows into signals that build and control workflow objects. These signals can be specified manually on the command line.

  % tap -/set 0 load -/set 1 dump -/bld join 0 1 -/enq 0 'goodnight moon'
  goodnight moon

Signals can also be invoked from a prompt.

  % tap prompt
  /set 0 load
  /set 1 dump
  /bld join 0 1
  /enq 0 'goodnight moon'
  /run
  goodnight moon

Alternatively, signals can be specified in a taprc file, one or more of which can be loaded with a triple-dash (---). Complex workflows are normally written this way when they become too cumbersome for the single-line syntax.

  [taprc]
  set 0 load
  set 1 dump
  bld join 0 1
  enq 0 'goodnight moon'

  % tap --- taprc
  goodnight moon

  % tap --- taprc taprc taprc
  goodnight moon
  goodnight moon
  goodnight moon

Signals are the basis for controlling apps and objects. As mentioned before, objects in the application space can be signaled directly by name.

  % tap - dump -/0/exe 'goodnight moon'
  ["goodnight moon"]

Note that the dash-slash break (-/) enqueues signals, so they run in the order they are declared in the workflow. To execute a signal immediately, use a double-dash-slash (--/).

  % tap -- dump a -/0/exe b --/0/exe c
  ["c"]
  ["a"]
  ["b"]
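The ordering in this transcript can be modeled with a plain queue. The Ruby sketch below is a hypothetical model, not Tap's internals: enqueued work is deferred until the run, while an immediate signal executes as soon as it is parsed.

```ruby
# Hypothetical model of signal ordering; not Tap's implementation.
queue = []   # work deferred until the workflow runs
log = []     # what the dump task has printed, in order

dump = ->(input) { log << [input] }

# "-- dump a": the task is enqueued with its input.
queue << -> { dump.call("a") }

# "-/0/exe b": the signal itself is enqueued, so it runs in declared order.
queue << -> { dump.call("b") }

# "--/0/exe c": the signal executes immediately, during parsing.
dump.call("c")

# Run the workflow: drain the queue in first-in, first-out order.
queue.shift.call until queue.empty?

log.each { |entry| p entry }
```

The immediate signal produces its output first, followed by the enqueued task and the enqueued signal in declaration order.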
