
NRAO Archive and Pipeline Processing Interface

Detailed documentation is available on our Confluence page.

Overview

The NRAO archive system lets users search, download, and reprocess radio astronomical observations made with NRAO-affiliated instruments: the Very Large Array (VLA), the Very Long Baseline Array (VLBA), the Atacama Large Millimeter Array (ALMA), and the Green Bank Telescope (GBT).

Components

The archive system is a constellation of subsystems, each performing a critical task.

  • amygdala is the messaging core of the system: it receives messages and decides how to act on them
  • archiveIface is the web-based user interface for starting downloads and reprocessing
  • the data-fetcher is responsible for retrieving archive files from NGAS
  • deployment is our system for putting this system online
  • logback-utils and logback-servlet-utils provide logging services
  • mail is a templated mailing system
  • messaging provides messaging services the other components rely on
  • Model provides Java models for the entities in the system
  • archive-solr provides Solr indexing services for fast lookups
  • NGRH-ALMA-10_8 is the request handler; it shows users which step their download or reprocessing request has reached
  • opencadc and tap-server provide Virtual Observatory services
  • pipeline-manifest-lib and ppr-schema generate and parse reprocessing requests and their results
  • schema is the database schema used by the archive system
  • pyat is the Python interface to the archive as well as the ingestion system
  • workflow-all provides cluster-based workflows for downloads, imaging and calibration

How are requests processed?

To give a quick view of how the system works, let's walk through a single request.

  1. The user arrives at the archiveIface at archive-new.nrao.edu wanting to fetch some data.

  2. The user searches for a particular observation, such as 13B-014.

    Behind the scenes, the archiveIface makes a request to a Solr index, built by archive-solr, to find observations for 13B-014, which it then presents to the user.

  3. The user selects a data set and chooses download and reprocessing options provided by archiveIface and clicks either Download or Reprocess.

  4. archiveIface sends the request to NGRH-ALMA-10_8 (the request handler).

  5. NGRH-ALMA-10_8 sends a workflow-start message to workflow-all.

  6. workflow-all runs a sequence of workflow steps:

    1. ppr-schema is used to generate a pipeline processing request (PPR) for the user's request
    2. data-fetcher is used in the cluster to obtain the user's data
    3. Other workflow tasks and jobs are used to run CASA and obtain the results
    4. Finally, a message is sent back to NGRH-ALMA-10_8 with the results
  7. NGRH-ALMA-10_8 shows the user that their request is complete and how to obtain the files.
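
The Solr lookup behind step 2 can be sketched as a simple select query. Note that the endpoint, core name, and field name below are assumptions for illustration only; the actual archive-solr schema may use different names.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; the real archive-solr host and core name may differ.
SOLR_SELECT = "http://localhost:8983/solr/archive/select"

def build_search_url(project_code: str, rows: int = 10) -> str:
    """Build a Solr select URL searching for observations by project code.

    The field name 'project_code' is a placeholder, not the real index schema.
    """
    params = {
        "q": f"project_code:{project_code}",
        "wt": "json",   # ask Solr for a JSON response
        "rows": rows,   # cap the number of returned documents
    }
    return f"{SOLR_SELECT}?{urlencode(params)}"

url = build_search_url("13B-014")
```

archiveIface would issue a GET against a URL like this and render the matching observations for the user.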

All of the work is coordinated using AMQP messaging (via messaging) and the database (defined by schema).
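
To make that coordination concrete, here is a minimal sketch of what a workflow-start message (step 5) might look like before it is published over AMQP. The field names are hypothetical, not the actual messaging schema; a real sender would publish this JSON to an exchange via the messaging component.

```python
import json

def make_workflow_start_message(request_id: str, workflow: str, parameters: dict) -> str:
    """Serialize a hypothetical workflow-start message as JSON.

    In the real system a payload like this would be published to an AMQP
    exchange for workflow-all to consume; all names here are illustrative.
    """
    message = {
        "type": "workflow-start",
        "request_id": request_id,   # assigned by the request handler
        "workflow": workflow,       # e.g. "download" or "reprocess"
        "parameters": parameters,   # user-selected options from archiveIface
    }
    return json.dumps(message)

payload = make_workflow_start_message("req-0001", "download", {"project_code": "13B-014"})
```

The consumer on the other end parses the JSON, looks the request up in the database, and starts the matching workflow.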

More details are available in our Confluence documentation.