Ingest Envoy: The Workspaces NGAS & Metadata Ingestion System

Ingest Envoy is responsible for setup and launch of all types of file ingestion for the Workspaces System. Currently, this includes standard calibration and standard image ingestion.

usage: ingest_envoy [-h] [--calibration CALIBRATION] [--image IMAGE] [--observation OBSERVATION OBSERVATION] [--seci SECI SECI]

Workspaces Ingestion System

options:
  -h, --help            show this help message and exit
  --calibration CALIBRATION
                        run ingestion for a calibration product
  --image IMAGE         run ingestion for an image product
  --observation OBSERVATION OBSERVATION
                        run ingestion for an observation
  --seci SECI SECI      run ingestion for VLASS SECI image products
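
The interface above could be reproduced with Python's argparse roughly as follows. This is a sketch under assumptions, not the project's actual parser: build_parser is a hypothetical name, and the help text does not say what each argument value means, so the example value is only a placeholder.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser matching the usage text shown above (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        prog="ingest_envoy",
        description="Workspaces Ingestion System",
    )
    # One value each for calibration and image ingestion.
    parser.add_argument("--calibration", help="run ingestion for a calibration product")
    parser.add_argument("--image", help="run ingestion for an image product")
    # Two values each, matching "OBSERVATION OBSERVATION" and "SECI SECI" in the usage line.
    parser.add_argument("--observation", nargs=2, help="run ingestion for an observation")
    parser.add_argument("--seci", nargs=2, help="run ingestion for VLASS SECI image products")
    return parser


# Placeholder value; the real argument is whatever ingest_envoy expects.
args = build_parser().parse_args(["--calibration", "example-value"])
```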

Ingest Envoy uses the existing ingest functionality of the AAT-PPI, which takes an ingestion manifest as input. While this interface is the same regardless of ingestion type, the manifest itself and the staging requirements differ between the types of files to be ingested. For this reason, Ingest Envoy's functionality can be broken into two underlying parts: Setup and Launch.

Setup

Setup can be further divided into two essential components: Product Staging and Manifest Generation.

Product Staging

Product Staging is the step that collects all ingestible products and places them in the Workspaces staging area, located at: /lustre/aoc/cluster/pipeline/<capo-profile>/workspaces/staging

This collection is most often performed by a shell script, such as calibration-table-collector.sh for calibration ingestion or image-product-collector.sh for image ingestion.

In the case of calibration ingestion, the collection script creates a tar file containing all the calibration tables and then creates a new weblog tar file to ensure that only the most recent version is ingested with the tables. Both tar files are then placed in the staging area.
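
The calibration collection step can be illustrated with a short Python sketch. The real work is done by calibration-table-collector.sh, whose contents are not shown here, so the function name stage_calibration and the tar file names below are assumptions made only for illustration.

```python
import tarfile
from pathlib import Path


def stage_calibration(tables, weblog_dir, staging_dir):
    """Tar the calibration tables and the weblog, placing both in the staging area."""
    staging_dir = Path(staging_dir)
    staging_dir.mkdir(parents=True, exist_ok=True)

    # Hypothetical file names; the real script chooses its own.
    tables_tar = staging_dir / "calibration_tables.tar"
    weblog_tar = staging_dir / "weblog.tgz"

    # One tar file containing every calibration table.
    with tarfile.open(tables_tar, "w") as tar:
        for table in tables:
            tar.add(table, arcname=Path(table).name)

    # A freshly created weblog tar, so only the most recent version is staged.
    with tarfile.open(weblog_tar, "w:gz") as tar:
        tar.add(weblog_dir, arcname=Path(weblog_dir).name)

    return tables_tar, weblog_tar
```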

In the case of image ingestion, the collection script copies the image FITS files to the staging area along with a new weblog tar file, as with calibration ingestion. It also creates a pipeline artifacts tar file containing other files produced during a CASA imaging run, such as casa_pipescript.py and the CASA-produced PPR file unknown.hifv_contimage.pprequest.xml. Image ingestion also requires an extra metadata file, aux_image_metadata.json, which must be transferred to the staging area.

Once product staging is done, the envoy is ready to produce the ingestion manifest file.

Manifest Generation

The Ingestion Manifest is essentially the master instruction list for an ingestion request. It contains the names, locations, and types of all products to be ingested into NGAS and the NRAO metadata database for retrieval via the new NRAO Archive.

There are three main sections to an ingestion manifest: Parameters, Input Group, and Output Group. The Parameters section sets parameters such as the ingestion path, the telescope, and whether there is an additional metadata file.

The Input Group section defines the input group association for the files to be ingested. This section contains the input science product that was used to create the file to be ingested. For calibrations, this should be an execution block locator, and for images, this should be the calibration locator.

The Output Group section defines all files related to the main science product being ingested. This section contains the type and file name of the main science product and the type and file name of any associated ancillary products. An ancillary product is anything related to the main science product that is worth ingesting, such as weblogs and the pipeline and ingestion artifacts tar files.
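
To make the three sections concrete, here is a minimal sketch of how a calibration manifest might be assembled as a Python dictionary. Every key name, type string, and file name below is an assumption chosen for illustration; the actual manifest schema is not shown in this document.

```python
import json


def build_manifest(ingestion_path, telescope, input_locator,
                   science_product, ancillary_products, additional_metadata=None):
    """Assemble the Parameters, Input Group, and Output Group sections (hypothetical schema)."""
    parameters = {"ingestion_path": ingestion_path, "telescope": telescope}
    if additional_metadata is not None:
        # Only present when an extra metadata file accompanies the products.
        parameters["additional_metadata"] = additional_metadata
    return {
        "parameters": parameters,
        # The science product that was used to create the files being ingested.
        "input_group": {"science_products": [{"locator": input_locator}]},
        # The main science product plus anything related that is worth ingesting.
        "output_group": {
            "science_product": science_product,
            "ancillary_products": ancillary_products,
        },
    }


manifest = build_manifest(
    ingestion_path="/lustre/aoc/cluster/pipeline/<capo-profile>/workspaces/staging/<directory>",
    telescope="<telescope>",
    input_locator="<execution-block-locator>",  # calibrations take an execution block locator
    science_product={"type": "calibration", "filename": "calibration_tables.tar"},
    ancillary_products=[
        {"type": "weblog", "filename": "weblog.tgz"},
        {"type": "ingestion_artifacts", "filename": "ingestion_artifacts.tar"},
    ],
)
print(json.dumps(manifest, indent=2))
```

For an image manifest, the input locator would be the calibration locator instead, and additional_metadata would name aux_image_metadata.json.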

After the manifest is generated, the manifest and any additional metadata files are tarred up into the ingestion artifacts tar file, and both the manifest and the new artifacts tar file are placed in the staging area.
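
The packaging step could look something like the sketch below. The function name write_ingestion_artifacts and both output file names are hypothetical, and the manifest is written as JSON purely for the sake of the example.

```python
import json
import tarfile
from pathlib import Path


def write_ingestion_artifacts(manifest, staging_dir, extra_metadata=()):
    """Write the manifest to staging and bundle it with any extra metadata files."""
    staging_dir = Path(staging_dir)

    # Hypothetical file names for the manifest and the artifacts tar.
    manifest_path = staging_dir / "ingestion_manifest.json"
    artifacts_tar = staging_dir / "ingestion_artifacts.tar"

    manifest_path.write_text(json.dumps(manifest, indent=2))

    # The tar holds a copy of the manifest plus each additional metadata file.
    with tarfile.open(artifacts_tar, "w") as tar:
        tar.add(manifest_path, arcname=manifest_path.name)
        for meta in extra_metadata:
            tar.add(meta, arcname=Path(meta).name)

    # Both the manifest and the tar now sit in the staging area.
    return manifest_path, artifacts_tar
```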

Launch

Calibration and image ingestion are initiated in exactly the same way: by providing the name of the staging area directory to ingest. Because of this, Ingest Envoy contains a single function for launching the ingest pex, which is called by all typed ingestion launchers.

Ingest Envoy has two types of Launcher classes: IngestCalibrationLauncher and IngestImageLauncher. Each typed launcher handles the type-specific setup, as described above, and then calls the shared ingestion function. When ingestion completes, Ingest Envoy checks the return code, logs whether the ingestion succeeded or failed, and exits.
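
The relationship between the typed launchers and the shared launch function can be sketched as follows. Only the two class names come from this document. The base class LauncherBase, the function trigger_ingest, the prepare method, and the default "ingest" command are all assumptions, since the real invocation of the ingest pex is not shown here.

```python
import logging
import subprocess

logger = logging.getLogger("ingest_envoy")


def trigger_ingest(staging_dir, command):
    """Shared launch step: run the ingest command on one staging directory."""
    # `command` is an assumption standing in for the real ingest pex invocation.
    return subprocess.run([*command, str(staging_dir)], check=False).returncode


class LauncherBase:  # hypothetical base class, not named in the original
    def __init__(self, staging_dir, command=("ingest",)):
        self.staging_dir = staging_dir
        self.command = tuple(command)

    def prepare(self):
        """Type-specific setup: product staging and manifest generation."""
        raise NotImplementedError

    def launch(self):
        self.prepare()
        code = trigger_ingest(self.staging_dir, self.command)
        # Log the outcome based on the return code, then hand it back to the caller.
        if code == 0:
            logger.info("ingestion of %s succeeded", self.staging_dir)
        else:
            logger.error("ingestion of %s failed with return code %d", self.staging_dir, code)
        return code


class IngestCalibrationLauncher(LauncherBase):
    def prepare(self):
        logger.info("staging calibration products and generating manifest")


class IngestImageLauncher(LauncherBase):
    def prepare(self):
        logger.info("staging image products and generating manifest")
```

A caller would construct the launcher for the requested ingestion type, call launch(), and pass the returned code to sys.exit. Keeping the subprocess call in one function means a new ingestion type only needs its own prepare step.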