No Story: Restore CMS single weblog

Daniel Nemergut requested to merge restore_cms_single_weblog into 2.8.5-DEVELOPMENT

We have been told that in a restore_cms workflow, only the most recent weblog is relevant to the restoration. This change modifies delivery for restores so that only a single weblog, the most recent one, is delivered from the products/ and/or working/ directories.

Tested in a script on the example directory we were provided. Listing the files shows:

vlapipe@aatweb-dev$ ls -la /lustre/naasc/web/almapipe/pipeline/naasc-test/workspaces/spool/tmpp8idh61z
...
drwxr-xr-x  4 vlapipe vlapipe    33280 Sep 13 16:14 pipeline-20240913T195707/
...
drwxr-xr-x  4 vlapipe vlapipe    33280 Sep 13 16:34 pipeline-20240913T201705/
...

The script:

import re
from datetime import datetime
from pathlib import Path

test_path = "/lustre/naasc/web/almapipe/pipeline/naasc-test/workspaces/spool/tmpp8idh61z"
WEBLOG_REGEX = "^pipeline-[0-9]+T[0-9]+$"
SUBDIR_FILENAME_REGEXES = {
    "products": [
        "casa_pipescript\\.py",
        "casa_commands\\.log",
        "PPR_calibration\\.xml",
        ".*calapply\\.txt",
        ".*caltables\\.tgz",
        ".*flagtsystemplate\\.txt",
    ],
    "working": [
        "flux\\.csv",
        ".*\\.ms",
        ".*casa_piperestorescript\\.py",
        "PPR\\.xml",
    ],
}

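# For each delivered subdirectory, find the weblog directories and print the newest one.
for subdir in SUBDIR_FILENAME_REGEXES:
    weblogs = [d for d in (Path(test_path) / subdir).iterdir() if d.is_dir() and re.fullmatch(WEBLOG_REGEX, d.name)]
    if weblogs:
        # The timestamp after "pipeline-" (e.g. 20240913T201705) determines which weblog is newest.
        latest_weblog = max(weblogs, key=lambda item: datetime.strptime(item.name.split("-")[1], "%Y%m%dT%H%M%S"))
        print(latest_weblog)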

Prints the path ending in working/pipeline-20240913T201705, the later of the two weblogs listed above.
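
For reviewers who want to try the selection logic without access to the spool directory, here is a minimal sketch of the same approach wrapped in a function. The name `find_latest_weblog` and the temporary directory layout are hypothetical and only for illustration; they are not taken from the delivery code in this merge request. Unlike the script above, the sketch also returns None when the subdirectory does not exist, which is an assumption about the desired behavior.

```python
import re
import tempfile
from datetime import datetime
from pathlib import Path
from typing import Optional

WEBLOG_REGEX = r"pipeline-[0-9]+T[0-9]+"


def find_latest_weblog(subdir: Path) -> Optional[Path]:
    """Return the newest pipeline-<timestamp> directory in subdir, or None.

    Hypothetical helper: mirrors the test script, but tolerates a missing subdir.
    """
    if not subdir.is_dir():
        return None
    weblogs = [
        d for d in subdir.iterdir()
        if d.is_dir() and re.fullmatch(WEBLOG_REGEX, d.name)
    ]
    if not weblogs:
        return None
    # Parse the timestamp that follows "pipeline-" and pick the most recent one.
    return max(
        weblogs,
        key=lambda d: datetime.strptime(d.name.split("-", 1)[1], "%Y%m%dT%H%M%S"),
    )


# Build a throwaway layout resembling the example directory (made-up structure).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "products").mkdir()
    (root / "working" / "pipeline-20240913T195707").mkdir(parents=True)
    (root / "working" / "pipeline-20240913T201705").mkdir()
    for name in ("products", "working"):
        latest = find_latest_weblog(root / name)
        print(name, latest.name if latest else None)
```

With this layout, products has no weblog and working yields pipeline-20240913T201705, matching the result of the script above.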

Edited by Daniel Nemergut
