************
Known Issues
************
Bugs
====
Messaging System
----------------
- Occasionally, the workflow-complete message fails to trigger the delivery message for a request
Capability System
-----------------
- Executions queued beyond the concurrency limit appear to be lost and are never executed, possibly because engines do not look for new executions once they are freed
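The suspected failure mode can be illustrated with a toy model. This is only a sketch under assumptions: ``ExecutionPool`` and its methods are hypothetical names and do not reflect the actual capability system code. It shows the behavior one would expect, where freeing an engine immediately hands it the oldest queued execution.

```python
from collections import deque


class ExecutionPool:
    """Toy model of engines with a concurrency limit (all names hypothetical)."""

    def __init__(self, concurrency_limit: int):
        self.concurrency_limit = concurrency_limit
        self.running = set()
        self.queued = deque()

    def submit(self, execution_id: str) -> None:
        # Start immediately if an engine is free, otherwise queue the execution.
        if len(self.running) < self.concurrency_limit:
            self.running.add(execution_id)
        else:
            self.queued.append(execution_id)

    def complete(self, execution_id: str) -> None:
        # Free the engine, then hand it the oldest queued execution, if any.
        # Skipping this second step would strand everything left in `queued`.
        self.running.discard(execution_id)
        if self.queued:
            self.running.add(self.queued.popleft())


pool = ExecutionPool(concurrency_limit=2)
for name in ("a", "b", "c"):
    pool.submit(name)
pool.complete("a")
```

If the real engines skip the equivalent of the dequeue step in ``complete``, queued executions would never start, which would match the symptom described above.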
Gripes
======
Docker
------
- Add a sidecar for visibility into which containers are running and into their ``docker logs`` output
  (possibly using the Prometheus integration built into GitLab) - `WS-425 <https://open-jira.nrao.edu/browse/WS-425>`__
Condor
------
- Get the data copy plugin from SCG and add it to the repo - `WS-415 <https://open-jira.nrao.edu/browse/WS-415>`__
- Update the wf_monitor to recognize other Condor status codes - `WS-413 <https://open-jira.nrao.edu/browse/WS-413>`__
Docs
----
- Update the setup-for-development page for Docker containers

  - Move that info into the installation page

- Update the README.md files to describe what they're
  attached to
- Integrate the README.md files into the docs, maybe the API docs
  themselves
- `WS-428 <https://open-jira.nrao.edu/browse/WS-428>`__
Testing
-------
- Audit the code for missing and irrelevant tests
- See if we can make coverage combination less finicky
- Optimize ``run-test.sh`` so it does not run redundant tests
- Fix the end-to-end tests that Nathan disabled because they are
  hard-coded for the redirect to the request page
- Add schema migration to CI
- `WS-435 <https://open-jira.nrao.edu/browse/WS-435>`__
Database
--------
- Need to generate an archive "core sample":

  - copy of the archive database schema
  - data from ~10 small projects

- Consider moving from the ``json`` to the ``jsonb`` datatype
- `WS-441 <https://open-jira.nrao.edu/browse/WS-441>`__
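For the ``json`` to ``jsonb`` item, the conversion itself is a single statement per column, assuming the database is PostgreSQL (the only place these two type names normally appear). The helper below is a minimal sketch that just builds that statement; the table and column names are hypothetical and not taken from the actual schema.

```python
def json_to_jsonb_sql(table: str, column: str) -> str:
    """Build the PostgreSQL statement that converts one json column to jsonb."""
    # Identifiers are double-quoted; the USING clause casts the existing values.
    return (
        f'ALTER TABLE "{table}" '
        f'ALTER COLUMN "{column}" TYPE jsonb '
        f'USING "{column}"::jsonb;'
    )


# Hypothetical table and column names, for illustration only.
statement = json_to_jsonb_sql("capability_requests", "parameters")
print(statement)
```

In practice the statement would probably live in a schema migration rather than be run by hand, so that it is applied the same way in every environment.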
Pipeline
--------
- Update the end-to-end test container to see how detailed we can be
Code Tweaks
-----------
- ``wf_monitor``: Recognize more HTCondor event codes and support them throughout the system

  .. code-block:: python

     from enum import Enum

     # Enum example by Daniel
     class HTCondorEvent(Enum):
         def __new__(cls, code: int, meaning: str, terminator: bool):
             # The event code becomes the member value, so HTCondorEvent(code) decodes it
             member = object.__new__(cls)
             member._value_ = code
             member.code, member.meaning, member.terminator = code, meaning, terminator
             return member

         # Decoding then looks like HTCondorEvent(code), and you can ask questions
         # like: if HTCondorEvent(code).terminator: ...
         SUBMITTED = (0, 'submitted', False)
         EXECUTING = (1, 'executing', False)
         ...
         TERMINATED = (5, 'terminated', True)

- The calibration template hardcodes 48 GB of RAM; it needs to use a
  Capo profile
- See if Mustache can access Capo properties without much extra work
- `WS-447 <https://open-jira.nrao.edu/browse/WS-447>`__
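To show how the event-code idea could be used when reading a job event log, here is a minimal sketch. It assumes each log entry starts with a header of the form ``NNN (cluster.proc.subproc) ...``, where ``NNN`` is the three-digit event number. ``decode_header`` and ``EVENTS`` are hypothetical names, not part of ``wf_monitor``, and only the three codes from the enum example are included.

```python
import re

# Only the three codes listed in the enum example; the rest are omitted here.
EVENTS = {0: ("submitted", False), 1: ("executing", False), 5: ("terminated", True)}

# Assumed header shape of a job event log entry: "005 (123.000.000) <date> <time> <text>"
HEADER = re.compile(r"^(\d{3}) \((\d+)\.(\d+)\.(\d+)\)")


def decode_header(line: str):
    """Return (code, meaning, terminator), or None if the line is not an event header."""
    match = HEADER.match(line)
    if match is None:
        return None
    code = int(match.group(1))
    # Unrecognized codes are reported as "unknown" rather than dropped silently.
    meaning, terminator = EVENTS.get(code, ("unknown", False))
    return code, meaning, terminator


sample = "005 (123.000.000) 2020-10-01 12:00:00 Job terminated."
result = decode_header(sample)
```

Mapping unrecognized codes to ``"unknown"`` instead of raising is a design choice for the sketch: a monitor that meets a new event code can log it and keep going.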