# Ingest Envoy: The Workspaces NGAS & Metadata Ingestion System

Ingest Envoy is responsible for setup and launch of all types of file ingestion for the Workspaces System.
Currently, this includes standard calibration and standard image ingestion.

```
usage: ingest_envoy [-h] [--calibration CALIBRATION] [--image IMAGE] [--observation OBSERVATION OBSERVATION] [--seci SECI SECI]

Workspaces Ingestion System

options:
  -h, --help            show this help message and exit
  --calibration CALIBRATION
                        run ingestion for a calibration product
  --image IMAGE         run ingestion for an image product
  --observation OBSERVATION OBSERVATION
                        run ingestion for an observation
  --seci SECI SECI      run ingestion for VLASS SECI image products

```
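For reference, a parser that accepts the options listed above could be sketched with `argparse` as follows. This is a reconstruction from the help text only, not the actual Ingest Envoy source; `build_parser` is a hypothetical name, and the meaning of each option's arguments is not stated in this document.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the options shown in the help text (sketch only)."""
    parser = argparse.ArgumentParser(
        prog="ingest_envoy", description="Workspaces Ingestion System"
    )
    # Single-argument options
    parser.add_argument("--calibration", help="run ingestion for a calibration product")
    parser.add_argument("--image", help="run ingestion for an image product")
    # Two-argument options (the help text shows each metavar twice)
    parser.add_argument("--observation", nargs=2, help="run ingestion for an observation")
    parser.add_argument("--seci", nargs=2, help="run ingestion for VLASS SECI image products")
    return parser
```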

Ingest Envoy makes use of the existing *ingest* functionality of the AAT-PPI, which simply takes an
*ingestion manifest* as input. While this interface is consistent regardless of ingestion type, the manifest itself
and the ingestion staging requirements differ between the types of files to be ingested. For this reason,
Ingest Envoy's functionality can be broken into two underlying parts: Setup and Launch.

## Setup
Setup can be further divided into two essential components: Product Staging and Manifest Generation.

### Product Staging
Product Staging is the step that collects all ingestible products and places them in the workspaces staging area located at:
``` /lustre/aoc/cluster/pipeline/<capo-profile>/workspaces/staging ```

This collection is most often performed via a shell script such as the ```calibration-table-collector.sh```
for calibration ingestion or the ```image-product-collector.sh``` for image ingestion.

In the case of calibration ingestion, the collection script creates a tar file containing all the calibration tables
and then creates a new weblog tar file to ensure that only the most recent version is ingested with the tables.
Both tar files are then placed in the staging area.
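The calibration staging step is performed by a shell script in practice, but its logic can be illustrated in Python. The following is a minimal sketch under stated assumptions: the function name, the output file names, and the `*.tbl` naming pattern for calibration tables are all hypothetical and are not taken from `calibration-table-collector.sh`.

```python
import tarfile
from pathlib import Path


def stage_calibration(products_dir: Path, weblog_dir: Path, staging_dir: Path) -> list:
    """Bundle calibration tables and a fresh weblog archive into staging_dir."""
    staging_dir.mkdir(parents=True, exist_ok=True)

    # Hypothetical output names; the real collector script chooses its own.
    tables_tar = staging_dir / "calibration_tables.tar"
    weblog_tar = staging_dir / "weblog.tgz"

    # Assumption: calibration tables can be recognized by a ".tbl" suffix.
    with tarfile.open(tables_tar, "w") as tar:
        for table in sorted(products_dir.glob("*.tbl")):
            tar.add(table, arcname=table.name)

    # Re-create the weblog archive so only the current weblog is staged.
    with tarfile.open(weblog_tar, "w:gz") as tar:
        tar.add(weblog_dir, arcname=weblog_dir.name)

    return [tables_tar, weblog_tar]
```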

In the case of image ingestion, the collection script copies the image FITS files to the staging area along with a new
weblog tar file, as with calibration ingestion, but it also creates a *pipeline artifacts* tar file which contains other
files produced during a CASA imaging run, such as *casa_pipescript.py* and the CASA-produced PPR file
*unknown.hifv_contimage.pprequest.xml*. Image ingestion also requires an extra metadata file, *aux_image_metadata.json*,
which must be transferred to the staging area.

Once product staging is done, the envoy is ready to produce the ingestion manifest file.

### Manifest Generation
The Ingestion Manifest is essentially the master instruction list for an ingestion request. It contains the names,
locations, and types of all products to be ingested into NGAS and the NRAO metadata database for retrieval via
the new NRAO Archive.

There are three main sections to an ingestion manifest: Parameters, Input Group, and Output Group.
The Parameters section sets parameters such as the ingestion path, the telescope, and whether there is an additional metadata file.

The Input Group section defines the input group association for the files to be ingested. This section contains the
input science product that was used to create the file to be ingested. For calibrations, this should be an execution
block locator, and for images, this should be the calibration locator.

The Output Group section defines all files related to the main science product being ingested. This section contains
the type and file name of the main science product and the type and file name of any associated ancillary products.
An ancillary product is anything that is related to the main science product and is worth ingesting, such as weblogs
and the pipeline and ingestion artifacts tar files.
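To make the three-section layout concrete, here is a minimal sketch that assembles a manifest as a Python dictionary. Every key name, the `build_manifest` helper, and the idea of representing products as `{"type", "filename"}` pairs are assumptions made for illustration; the actual manifest schema is defined by *ingest* and is not reproduced in this document.

```python
def build_manifest(
    ingestion_path,
    telescope,
    input_locator,
    science_product,
    ancillary_products,
    additional_metadata=None,
):
    """Assemble the three manifest sections described above (hypothetical keys)."""
    return {
        # Parameters: where the staged files live, which telescope, extra metadata
        "parameters": {
            "ingestion_path": ingestion_path,
            "telescope": telescope,
            "additional_metadata": additional_metadata,
        },
        # Input Group: the science product the new files were derived from
        "input_group": {"science_products": [{"locator": input_locator}]},
        # Output Group: the main science product plus its ancillary products
        "output_group": {
            "science_products": [science_product],
            "ancillary_products": list(ancillary_products),
        },
    }
```

For a calibration, `input_locator` would be an execution block locator and `ancillary_products` would list the weblog and the ingestion artifacts tar file. An image manifest would take the calibration locator as input and name the additional metadata file.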

After the manifest is properly generated, the manifest and any additional metadata files are tarred up into the
*ingestion artifacts* tar file and both the manifest and the new artifact tar file are placed in the staging area.
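This final setup step can be sketched as follows. The function name, the two file names, and the choice of JSON for serializing the manifest are assumptions made for illustration, not a description of Ingest Envoy's actual output.

```python
import json
import tarfile
from pathlib import Path


def write_ingestion_artifacts(manifest: dict, staging_dir: Path, extra_files=()) -> Path:
    """Write the manifest to staging_dir and bundle it with any extra metadata files."""
    staging_dir.mkdir(parents=True, exist_ok=True)

    # Hypothetical file names, chosen for illustration only.
    manifest_path = staging_dir / "ingestion_manifest.json"
    artifacts_tar = staging_dir / "ingestion_artifacts.tar"

    # The manifest itself stays in the staging area alongside the tar file.
    manifest_path.write_text(json.dumps(manifest, indent=2))

    with tarfile.open(artifacts_tar, "w") as tar:
        tar.add(manifest_path, arcname=manifest_path.name)
        for extra in extra_files:
            tar.add(extra, arcname=Path(extra).name)

    return artifacts_tar
```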

## Launch
Calibration and image ingestion are initiated in exactly the same way: by providing the staging area directory name to
*ingest*. Because of this, Ingest Envoy contains a single function for launching the *ingest* pex, and all typed
ingestion launchers call that function.

Ingest Envoy has two types of Launcher classes: IngestCalibrationLauncher and IngestImageLauncher. Each typed launcher
handles the type-specific setup, as described above, and then calls the shared ingestion function. Upon *ingest*'s
completion, Ingest Envoy checks the return code, logs either a successful or failed ingestion, and exits.
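The launch pattern described above, with one shared function and thin typed launchers, might look roughly like the sketch below. Only the class name `IngestCalibrationLauncher` comes from this document. The `run_ingest` function, the `prepare` and `launch` methods, and the `ingest_cmd` default are hypothetical, and the real command line for the *ingest* pex may differ.

```python
import logging
import subprocess
from pathlib import Path

logger = logging.getLogger("ingest_envoy")


def run_ingest(staging_dir: Path, ingest_cmd=("ingest",)) -> int:
    """Shared launch step: pass the staging directory to ingest, log the outcome."""
    # Assumption: ingest takes the staging directory as its only argument.
    result = subprocess.run([*ingest_cmd, str(staging_dir)])
    if result.returncode == 0:
        logger.info("Ingestion succeeded for %s", staging_dir)
    else:
        logger.error(
            "Ingestion failed for %s (return code %d)", staging_dir, result.returncode
        )
    return result.returncode


class IngestCalibrationLauncher:
    """Typed launcher: type-specific setup followed by the shared launch function."""

    def __init__(self, staging_dir: Path, ingest_cmd=("ingest",)):
        self.staging_dir = staging_dir
        self.ingest_cmd = ingest_cmd

    def prepare(self) -> None:
        # Placeholder for calibration-specific staging and manifest generation.
        self.staging_dir.mkdir(parents=True, exist_ok=True)

    def launch(self) -> int:
        self.prepare()
        return run_ingest(self.staging_dir, self.ingest_cmd)
```

An image launcher would follow the same shape, differing only in what `prepare` stages.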