************
Known Issues
************

Bugs
====

Messaging System
----------------

- Occasionally, the workflow-complete message fails to trigger the delivery message for a request

Capability System
-----------------

- Executions queued beyond the concurrency limit appear to be lost and are never executed, possibly because engines do not look for new executions once they are freed
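
If the suspected cause is right, the expected behavior is that freeing an engine immediately pulls the next queued execution. The following is a minimal sketch of that behavior only; ``ExecutionPool``, ``submit`` and ``complete`` are hypothetical names for illustration and are not the actual capability engine code.

.. code-block:: python

   from collections import deque


   class ExecutionPool:
       """Hypothetical stand-in for the engine pool; not the real implementation."""

       def __init__(self, concurrency_limit: int):
           self.concurrency_limit = concurrency_limit
           self.running = set()
           self.queued = deque()

       def submit(self, execution_id: str) -> None:
           # Run immediately if an engine is free, otherwise park the execution.
           if len(self.running) < self.concurrency_limit:
               self.running.add(execution_id)
           else:
               self.queued.append(execution_id)

       def complete(self, execution_id: str) -> None:
           # Free the engine, then look for queued work right away.
           # Skipping this step would strand everything left in the queue.
           self.running.discard(execution_id)
           while self.queued and len(self.running) < self.concurrency_limit:
               self.running.add(self.queued.popleft())


   pool = ExecutionPool(concurrency_limit=2)
   for name in ("a", "b", "c"):
       pool.submit(name)   # "c" is queued because the limit is 2
   pool.complete("a")      # freeing an engine should start "c"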

Gripes
======

Docker
------


-  Add a sidecar for visibility into which containers are running and
   their ``docker logs`` output (possibly using the Prometheus
   integration built into GitLab) - `WS-425 <https://open-jira.nrao.edu/browse/WS-425>`__

Condor
------

-  Get the data copy plugin from SCG and add it to the repo - `WS-415 <https://open-jira.nrao.edu/browse/WS-415>`__

-  Update the wf_monitor to recognize other Condor status codes - `WS-413 <https://open-jira.nrao.edu/browse/WS-413>`__

Docs
----


-  Update the "Setup for development" page for Docker containers

-  Move the development setup info into the installation page

-  Update the README.md files to say something about what they're
   attached to

-  Integrate the README.md files into the docs, maybe the API docs
   themselves

- `WS-428 <https://open-jira.nrao.edu/browse/WS-428>`__

Testing
-------


-  Audit code for missing tests, irrelevant tests

-  See if we can make coverage combination less finicky

-  Optimize run-test.sh to not run redundant tests

-  Fix the end-to-end tests that Nathan disabled because they are
   hard-coded for the redirect to the request page

   -  Add schema migration to CI

- `WS-435 <https://open-jira.nrao.edu/browse/WS-435>`__

Database
--------

-  Need to generate an archive "core sample"

   -  copy of the archive database schema

   -  data from ~10 small projects

-  Consider moving from the ``json`` to the ``jsonb`` datatype

- `WS-441 <https://open-jira.nrao.edu/browse/WS-441>`__

Pipeline
--------

-  Update the end-to-end test container to see how detailed we can be

Code Tweaks
-----------

-  ``wf_monitor``: Recognize more HTCondor event codes and support them within the system

   .. code-block:: python

      # Enum example by Daniel, adjusted to valid Python
      from enum import Enum

      class HTCondorEvent(Enum):
          SUBMITTED = (0, 'submitted', False)
          EXECUTING = (1, 'executing', False)
          # ... remaining event codes elided ...
          TERMINATED = (5, 'terminated', True)

          def __new__(cls, code: int, meaning: str, terminator: bool):
              obj = object.__new__(cls)
              obj._value_ = code  # the numeric code is the member value
              obj.code, obj.meaning, obj.terminator = code, meaning, terminator
              return obj

      # Decoding looks like HTCondorEvent(code), and you can ask questions
      # like: if HTCondorEvent(code).terminator: ...

-  The calibration template hardcodes 48 GB of RAM; it should take the
   value from a Capo profile

   -  See if Mustache can access Capo properties without much extra work

- `WS-447 <https://open-jira.nrao.edu/browse/WS-447>`__
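
For the ``wf_monitor`` item above, a rough sketch of how an event code might be pulled out of a job log line and checked against the terminating codes. It assumes an event header of the form ``005 (123.000.000) 01/01 12:00:00 Job terminated.``, and it only knows the three codes listed in the enum example. ``parse_event_code`` and ``is_terminator`` are hypothetical helper names, not existing ``wf_monitor`` functions.

.. code-block:: python

   import re
   from typing import Optional

   # Assumed header shape: three-digit event code, then (cluster.proc.subproc).
   EVENT_HEADER = re.compile(r"^(\d{3}) \((\d+)\.(\d+)\.(\d+)\)")

   # Codes taken from the enum example; the full list is elided there.
   KNOWN_EVENTS = {0: "submitted", 1: "executing", 5: "terminated"}
   TERMINATOR_CODES = {5}


   def parse_event_code(line: str) -> Optional[int]:
       """Return the event code of a header line, or None for any other line."""
       match = EVENT_HEADER.match(line)
       return int(match.group(1)) if match else None


   def is_terminator(line: str) -> bool:
       """True only when the line is a header whose code ends the job."""
       return parse_event_code(line) in TERMINATOR_CODES


   sample = "005 (123.000.000) 01/01 12:00:00 Job terminated."
   code = parse_event_code(sample)
   meaning = KNOWN_EVENTS.get(code, "unrecognized")

Falling back to ``"unrecognized"`` rather than raising keeps the monitor running when a log contains a code it does not know yet, which is the situation this item describes.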