No Story: Restore CMS single weblog
We have been told that in a restore_cms workflow, only the most recent weblog is relevant to the restoration. This change modifies delivery for restores so that only a single weblog, the most recent one, is delivered from the products/ and/or working/ directories.
Tested in a script on the example directory we were provided. Listing the files shows:
vlapipe@aatweb-dev$ ls -la /lustre/naasc/web/almapipe/pipeline/naasc-test/workspaces/spool/tmpp8idh61z
...
drwxr-xr-x 4 vlapipe vlapipe 33280 Sep 13 16:14 pipeline-20240913T195707/
...
drwxr-xr-x 4 vlapipe vlapipe 33280 Sep 13 16:34 pipeline-20240913T201705/
...
The script:
import re
from datetime import datetime
from pathlib import Path

test_path = "/lustre/naasc/web/almapipe/pipeline/naasc-test/workspaces/spool/tmpp8idh61z"
# Weblog directories are named pipeline-<date>T<time>
WEBLOG_REGEX = "^pipeline-[0-9]+T[0-9]+$"
SUBDIR_FILENAME_REGEXES = {
    "products": [
        "casa_pipescript\\.py",
        "casa_commands\\.log",
        "PPR_calibration\\.xml",
        ".*calapply\\.txt",
        ".*caltables\\.tgz",
        ".*flagtsystemplate\\.txt",
    ],
    "working": [
        "flux\\.csv",
        ".*\\.ms",
        ".*casa_piperestorescript\\.py",
        "PPR\\.xml",
    ],
}

for subdir in SUBDIR_FILENAME_REGEXES.keys():
    # Collect the weblog directories found in this subdirectory
    weblogs = [d for d in (Path(test_path) / subdir).iterdir() if d.is_dir() and re.fullmatch(WEBLOG_REGEX, d.name)]
    if weblogs:
        # Pick the weblog whose name carries the most recent timestamp
        latest_weblog = max(weblogs, key=lambda item: datetime.strptime(item.name.split("-")[1], "%Y%m%dT%H%M%S"))
        print(latest_weblog)
Prints: working/pipeline-20240913T201705
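For reviewers who want to try the selection logic without access to the spool directory, here is a minimal sketch of the same idea as a reusable helper. The names find_latest_weblog and demo are hypothetical and are not taken from the delivery code. The sketch also assumes a fixed-width pipeline-YYYYMMDDTHHMMSS directory name, which is stricter than the WEBLOG_REGEX in the script, and it treats a missing products/ or working/ directory as "no weblog" rather than an error.

```python
import re
import tempfile
from datetime import datetime
from pathlib import Path
from typing import Optional

# Assumption: weblog names are pipeline-YYYYMMDDTHHMMSS (stricter than the test script's regex).
WEBLOG_PATTERN = re.compile(r"pipeline-([0-9]{8}T[0-9]{6})")


def find_latest_weblog(subdir: Path) -> Optional[Path]:
    """Return the newest pipeline-<timestamp> directory in subdir, or None."""
    if not subdir.is_dir():
        return None  # products/ or working/ may not exist for a given restore
    candidates = []
    for entry in subdir.iterdir():
        match = WEBLOG_PATTERN.fullmatch(entry.name)
        if not (entry.is_dir() and match):
            continue
        try:
            stamp = datetime.strptime(match.group(1), "%Y%m%dT%H%M%S")
        except ValueError:
            continue  # digits matched but do not form a valid date/time
        candidates.append((stamp, entry))
    if not candidates:
        return None
    # Compare on the parsed timestamp only and return the matching path
    return max(candidates, key=lambda pair: pair[0])[1]


def demo() -> dict:
    """Build a throwaway layout with two weblogs under working/ and none under products/."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for name in ("pipeline-20240913T195707", "pipeline-20240913T201705"):
            (root / "working" / name).mkdir(parents=True)
        (root / "working" / "flux.csv").touch()  # non-weblog file, should be ignored
        results = {}
        for subdir in ("products", "working"):
            latest = find_latest_weblog(root / subdir)
            results[subdir] = latest.name if latest else None
        return results


if __name__ == "__main__":
    print(demo())
```

Parsing the timestamp mirrors the test script. With fixed-width names a plain string comparison would give the same ordering, but parsing also rejects names that match the pattern without being valid dates.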
Edited by Daniel Nemergut