# Delivery: A system for packaging images and data

What is delivery? Delivery is what happens after the active processing portion of the workflow concludes. It is the
step that moves the retrieved or generated products from the processing area to a place where they can be accessed by
the requesting user.

Most workflows proceed by retrieving some files from NGAS and running CASA on those files to produce new products. The
files are large and CASA is quite heavy, so we retrieve the files into a spool area on the Lustre filesystem and then
launch the CASA jobs on the cluster. Once CASA is finished, the files the user wants are still sitting in that spool
area on Lustre. Delivery is what gets the files from there to where the user can retrieve them.

## Concept

Delivery starts from a directory with some products in it. Delivery then identifies the products in that directory.
Using knowledge from Capo about different destinations, delivery copies the data into those destinations, in the
correct format for the product type. Delivery also accepts some arguments to filter out products that aren't
interesting or to perform simple packaging steps like creating tar archives containing the data.

## Usage

```
usage: deliver [-h] [--prefix PREFIX] [-p | -P] [-l LOCAL_DESTINATION] [-t] [-r] SOURCE_DIRECTORY

positional arguments:
  SOURCE_DIRECTORY      The directory where the products to be delivered are located

optional arguments:
  -h, --help            show this help message and exit
  --prefix PREFIX       Prefix for the destination (a request ID perhaps)
  -p, --use-piperesults
                        Use the CASA piperesults file, if present
  -P, --ignore-piperesults
                        Ignore the CASA piperesults file

Destination options:
  -l LOCAL_DESTINATION, --local-destination LOCAL_DESTINATION
                        Deliver to this local directory instead of the appropriate web root
  -t, --tar             Archive the delivered items as a tar file

Product filtering options:
  -r, --rawdata         Deliver the rawdata instead of the products
```

The `deliver` command requires a source directory: the location containing the files to be delivered.
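
The usage text above maps naturally onto Python's `argparse`. The following is a minimal sketch of a parser that
would accept the same arguments; it is reconstructed from the help output only, and the function name
`build_parser` is an assumption, not necessarily how the real `deliver` command is implemented.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Sketch of a parser matching the documented usage (not the real implementation)."""
    parser = argparse.ArgumentParser(prog="deliver")
    parser.add_argument("--prefix", help="Prefix for the destination (a request ID perhaps)")

    # -p and -P are mutually exclusive, matching "[-p | -P]" in the usage line.
    piperesults = parser.add_mutually_exclusive_group()
    piperesults.add_argument("-p", "--use-piperesults", action="store_true",
                             help="Use the CASA piperesults file, if present")
    piperesults.add_argument("-P", "--ignore-piperesults", action="store_true",
                             help="Ignore the CASA piperesults file")

    destination = parser.add_argument_group("Destination options")
    destination.add_argument("-l", "--local-destination",
                             help="Deliver to this local directory instead of the appropriate web root")
    destination.add_argument("-t", "--tar", action="store_true",
                             help="Archive the delivered items as a tar file")

    filtering = parser.add_argument_group("Product filtering options")
    filtering.add_argument("-r", "--rawdata", action="store_true",
                           help="Deliver the rawdata instead of the products")

    # The source directory is the only required argument.
    parser.add_argument("source_directory", metavar="SOURCE_DIRECTORY",
                        help="The directory where the products to be delivered are located")
    return parser
```

With this sketch, `build_parser().parse_args(["-t", "--prefix", "1234", "/tmp/spool"])` yields a namespace with
`tar` set, `prefix` equal to `"1234"`, and `source_directory` equal to `"/tmp/spool"`; the path and prefix here are
made-up values.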

If the user has specified a destination, pass `-l <dir>` to tell delivery where to write the files. Without this
argument, delivery uses the path from the Capo setting `edu.nrao.workspaces.DeliverySettings.downloadDirectory`,
which is currently set to `/lustre/aoc/cluster/pipeline/$PROFILE/downloads`, and generates a download URL based on
the Capo setting `edu.nrao.workspaces.DeliverySettings.downloadUrl`, which is currently set to
`https://dl-nrao.aoc.nrao.edu`.
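
The choice between a local destination and the Capo-configured download area can be sketched as below. This is an
illustration only: the `SETTINGS` dictionary stands in for a real Capo lookup, `resolve_destination` and its
`profile` parameter are hypothetical names, and the substitution of `$PROFILE` by plain string replacement is an
assumption. How the full download URL is composed from the base URL is not described here, so the sketch returns
only the base.

```python
from pathlib import Path
from typing import Mapping, Optional, Tuple

# Stand-in for Capo: the keys and values are the settings quoted in the text,
# but a real deployment would read them from Capo rather than a dict.
SETTINGS = {
    "edu.nrao.workspaces.DeliverySettings.downloadDirectory":
        "/lustre/aoc/cluster/pipeline/$PROFILE/downloads",
    "edu.nrao.workspaces.DeliverySettings.downloadUrl":
        "https://dl-nrao.aoc.nrao.edu",
}


def resolve_destination(
    local_destination: Optional[str],
    profile: str,
    settings: Mapping[str, str] = SETTINGS,
) -> Tuple[Path, Optional[str]]:
    """Return (delivery root, base download URL or None)."""
    if local_destination is not None:
        # An explicit -l destination wins; no download URL applies in that case.
        return Path(local_destination), None

    # Assumption: $PROFILE is replaced with the active Capo profile name.
    directory = settings["edu.nrao.workspaces.DeliverySettings.downloadDirectory"]
    root = Path(directory.replace("$PROFILE", profile))
    base_url = settings["edu.nrao.workspaces.DeliverySettings.downloadUrl"]
    return root, base_url
```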

If the user has requested a single tar archive, call delivery with `-t` to generate one. The default behavior is to
simply copy the files.
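
The difference between the two modes can be sketched with the standard library. This is a simplified illustration,
not delivery's actual code: the function name `deliver_files` and the choice to name the archive after the source
directory are assumptions.

```python
import shutil
import tarfile
from pathlib import Path


def deliver_files(source: Path, destination: Path, as_tar: bool = False) -> Path:
    """Copy `source` into `destination`, or pack it into one tar archive there."""
    destination.mkdir(parents=True, exist_ok=True)

    if as_tar:
        # Hypothetical naming: <source directory name>.tar inside the destination.
        archive = destination / f"{source.name}.tar"
        with tarfile.open(archive, "w") as tar:
            # arcname keeps paths in the archive relative to the source directory name.
            tar.add(source, arcname=source.name)
        return archive

    # Default behavior: a plain recursive copy of the files (Python 3.8+).
    shutil.copytree(source, destination, dirs_exist_ok=True)
    return destination
```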

Delivery can use CASA's "piperesults" file to discern the location and type of generated products. If you want this
behavior, call `deliver -p`. To have delivery ignore the file instead, use `deliver -P`.

Delivery also supports a `--prefix` argument, which allows you to generate intermediate directories between the
requested or implied delivery root and where the files are ultimately placed.
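
As a rough illustration of the effect of `--prefix`, the sketch below joins a prefix onto a delivery root. The
helper name `apply_prefix` and the decision to strip leading slashes (so that the prefix can never escape the root)
are assumptions, not documented behavior.

```python
from pathlib import Path
from typing import Optional


def apply_prefix(root: Path, prefix: Optional[str]) -> Path:
    """Return the directory under `root` where files would be placed."""
    if not prefix:
        return root
    # Strip leading slashes: joining an absolute path would otherwise discard `root`.
    return root / prefix.lstrip("/")
```

For example, with a made-up root of `/downloads` and a prefix of `request-1234`, the files would land in
`/downloads/request-1234`.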

## References

Discussion of the new design can be found in Confluence under
[Proposed Delivery Redesign](https://open-confluence.nrao.edu/display/AAT/Proposed+Delivery+Redesign)
and some important supporting documentation around how the directories are built can be found at
[Delivery Directory Improvements](https://open-confluence.nrao.edu/display/SPR/Delivery+Directory+Improvements).