
Commit

Update the documentation
WardLT committed Jan 5, 2024
1 parent a757fbf commit ad38514
Showing 7 changed files with 61 additions and 54 deletions.
2 changes: 1 addition & 1 deletion colmena/task_server/globus.py
@@ -29,7 +29,7 @@ class GlobusComputeTaskServer(FutureBasedTaskServer):
`registers <https://funcx.readthedocs.io/en/latest/sdk.html#registering-functions>`_
the wrapped function with Globus Compute.
You must also provide a Globus Compute :class:`~globus_compute_sdk.client.Client`
-that the task server will use to authenticate with the web service.
+that the task server will use to authenticate with the web service.
The task server works using Globus Compute's :class:`~globus_compute_sdk.executor.Executor`
to communicate to the web service over a web socket.
93 changes: 47 additions & 46 deletions docs/design.rst
@@ -1,9 +1,8 @@
Design
======

-Colmena is a library for applications that steer
+Colmena is a library for building applications that steer
ensembles of simulations running on distributed computing resources.
We describe the concepts behind Colmena here.

Key Concepts
------------
@@ -19,20 +18,21 @@ delegates them to the Doer.
"Thinker": Planning Agent
+++++++++++++++++++++++++

-The "Thinker" is defines the strategy for a computational campaign.
+The "Thinker" defines the strategy for a computational campaign.
The strategy is expressed by a series of "agents" that identify
which computations to run and adapt to their results.
As `demonstrated in our optimization examples <how-to#creating-a-thinker-application>`_,
complex strategies are simple if broken into many agents.


-"Doer": task server
+"Doer": Task Server
+++++++++++++++++++

-The "Doer" server accepts tasks specification, deploys tasks on remote services
-and sends results back to the Thinker agent(s).
-Doers are interfaces to workflow engies, such as `Parsl <https://parsl-project.org>`_
-or `FuncX <https://funcx.org/>`_.
+The "Doer" server accepts task specifications from the Thinker,
+deploys tasks on remote services,
+and sends results back to the Thinker.
+Doers are interfaces to workflow engines, such as `Parsl <https://parsl-project.org>`_
+or `Globus Compute <https://funcx.org/>`_.

Implementation
--------------
@@ -47,7 +47,7 @@ Client

The "Thinker" process is a Python program that runs a separate thread for each agent.

-Agents are functions that define which computations to run by sending *task request*
+Agents are functions that define which computations to run by sending *task requests*
to a task server or reading *results* from a queue.
Results are returned in the order they are completed.
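The thread-per-agent design can be sketched with only the standard library. This is a stand-in for illustration: the ``agent`` decorator and class below mimic the shape of Colmena's ``BaseThinker`` machinery but are not its real API.

```python
# Stand-in sketch of the thread-per-agent pattern: methods tagged with a
# decorator are each launched in their own thread. Colmena's BaseThinker
# provides this machinery; all names here are illustrative assumptions.
from threading import Thread


def agent(func):
    func._is_agent = True  # tag the method so the runner can find it
    return func


class ToyThinker:
    def __init__(self):
        self.log = []

    @agent
    def submitter(self):
        # a real agent would send task requests to the task server here
        self.log.append('submitted task')

    @agent
    def receiver(self):
        # a real agent would block on the result queue here
        self.log.append('received result')

    def run(self):
        # launch every tagged method in its own thread, then wait for all
        threads = [Thread(target=getattr(self, name)) for name in dir(self)
                   if getattr(getattr(self, name), '_is_agent', False)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()


thinker = ToyThinker()
thinker.run()
print(sorted(thinker.log))  # ['received result', 'submitted task']
```

Because each agent runs in its own thread, a slow agent (e.g., one blocked waiting on results) does not stop the others from submitting work.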

@@ -109,8 +109,8 @@ Communication
+++++++++++++

Task requests and results are communicated between Thinker and Doer via queues.
-Thinkers submit a task request to one queue and receive results in a second as soon as possible it completes.
-Users can also denote tasks with a "topics" if there are tasks used by different agents.
+Thinkers submit a task request to one queue and receive results in a second as soon as it completes.
+Users can also denote tasks with a "topic" to separate tasks used by different agents.

The easiest-to-configure queue, :class:`~colmena.queue.python.PipeQueues`, is based on Python's multiprocessing Pipes.
Creating it requires no other services or configuration beyond the topics:
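The request/result pattern these queues implement can be illustrated with a standard-library stand-in. Note this is not the ``PipeQueues`` API itself; the tuple protocol ``(topic, method, args)`` is an assumption made for the sketch.

```python
# Stand-in for the Thinker/Doer queue pattern using stdlib queues.
# Colmena's PipeQueues plays this role in real applications; the tuple
# protocol below (topic, method, args) is an illustrative assumption.
from queue import Queue
from threading import Thread

requests: Queue = Queue()  # "outbound": Thinker -> Doer
results: Queue = Queue()   # "inbound":  Doer -> Thinker


def doer():
    """Pull task requests, run the named method, push results back."""
    while True:
        topic, method, args = requests.get()
        if method is None:  # shutdown sentinel
            break
        results.put((topic, method(*args)))


worker = Thread(target=doer)
worker.start()

# Thinker side: submit a request tagged with a topic, then wait on the result
requests.put(('simulate', pow, (2, 10)))
topic, value = results.get()
requests.put((None, None, None))  # stop the doer
worker.join()
print(topic, value)  # simulate 1024
```

Tagging each request with a topic is what lets several agents share the same pair of queues without stealing each other's results.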
@@ -145,51 +145,52 @@ by illustrating a typical :class:`~colmena.models.Result` object.
     "method": "reduce",
     "value": 2,
     "success": true,
-    "time_created": 1593498015.132477,
-    "time_input_received": 1593498015.13357,
-    "time_compute_started": 1593498018.856764,
-    "time_result_sent": 1593498018.858268,
-    "time_result_received": 1593498018.860002,
-    "time_running": 1.8e-05,
-    "time_serialize_inputs": 4.07e-05,
-    "time_deserialize_inputs": 4.28-05,
-    "time_serialize_results": 3.32e-05,
-    "time_deserialize_results": 3.30e-05,
+    "timestamp": {
+        "created": 1593498015.132477,
+        "input_received": 1593498015.13357,
+        "compute_started": 1593498018.856764,
+        "result_sent": 1593498018.858268,
+        "result_received": 1593498018.860002
+    },
+    "time": {
+        "running": 1.8e-05,
+        "serialize_inputs": 4.07e-05,
+        "deserialize_inputs": 4.28e-05,
+        "serialize_results": 3.32e-05,
+        "deserialize_results": 3.30e-05
+    }
}
-**Launching Tasks**: A client creates a task request at ``time_created`` and adds the the input
+**Launching Tasks**: A client creates a task request at ``timestamp.created`` and adds the input
specification (``method`` and ``inputs``) to an "outbound" Redis queue. The task request is formatted
-in the JSON format defined above with only the ``method``, ``inputs`` and ``time_created`` fields
-populated. The task inputs are then serialized (``time_serialize_inputs``) and send using
-the Redis Queue to the task server.
-The serialization method is communicated along with the inputs.
+in the JSON format defined above with only the ``method``, ``inputs`` and ``timestamp.created`` fields
+populated. The task inputs are then serialized (``time.serialize_inputs`` records the execution time)
+and passed via the queue to the Task Server.
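The serialization cost captured by ``time.serialize_inputs`` can be mimicked directly. This sketch assumes a pickle-based serializer, which may differ from what Colmena actually uses.

```python
# Timing the serialization of a task's inputs, analogous to the
# ``time.serialize_inputs`` field. A pickle-based serializer is assumed
# here purely for illustration.
import pickle
from time import perf_counter

inputs = ([1, 2], 2)  # example positional arguments for a "reduce" task

start = perf_counter()
payload = pickle.dumps(inputs)
serialize_time = perf_counter() - start

assert pickle.loads(payload) == inputs  # the payload round-trips losslessly
print(f"serialized {len(payload)} bytes in {serialize_time:.2e} s")
```

For small inputs this cost is microseconds, but it grows with payload size, which is why the records above track it separately from compute time.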

-**Task Routing**: The task server reads the task request from the outbound queue at ``time_input_received``
+**Task Routing**: The task server reads the task request from the outbound queue at ``timestamp.input_received``
and submits the task to the distributed workflow engine.
The method definitions in the task server denote on which resources they can run,
and Parsl chooses when and to which resource to submit tasks.

-**Computation**: A Parsl worker starts a task at ``time_compute_started``.
-The task inputs are deserialized (``time_deserialize_inputs``),
-the requested work is executed (``time_running``),
-and the results serialized (``time_serialize_results``).
+**Computation**: A Parsl worker starts a task at ``timestamp.compute_started``.
+The task inputs are deserialized (``time.deserialize_inputs``),
+the requested work is executed (``time.running``),
+and the results serialized (``time.serialize_results``).

**Result Communication**: The task server adds the result to the task specification (``value``) and
-sends it back to the client in an "inbound" queue at (``time_result_sent``).
+sends it back to the client in an "inbound" queue at ``timestamp.result_sent``.

**Result Retrieval**: The client retrieves the message from the inbound queue.
The result is deserialized (``time.deserialize_results``) and returned
-back to the client at ``time_result_received``.
-
-The overall efficiency of the task system can be approximated by comparing the ``time_running``, which
-denotes the actual time spent executing the task on the workers, to the difference between the ``time_created``
-and ``time_returned`` (i.e., the round-trip time).
-Comparing round-trip time and ``time_running`` captures both the overhead of the system and any time
-waiting in a queue for other tasks to complete and must be viewed carefully.
-
-The overhead specific to Colmena (i.e., and not Parsl) can be measured by assessing the communication time
-for the Redis queues.
-For example, the inbound queue can be assessed by comparing the ``time_created`` and ``time_input_received``.
-The communication times for Parsl can be measured only when the queue length is negligible
-through the differences between ``time_inputs_received`` and ``time_compute_started``.
-The communication times related to serialization are also stored (e.g., ``time_serialize_result``).
+back to the client at ``timestamp.result_received``.
+
+The overall efficiency of the task system can be approximated by comparing ``time.running``, which
+denotes the actual time spent executing the task on the workers, to the difference between ``timestamp.created``
+and ``timestamp.result_received`` (i.e., the round-trip time).
+
+The overhead specific to Colmena (i.e., not Parsl) can be measured by assessing the communication time for each step.
+For example, the inbound queue can be assessed by comparing ``timestamp.created`` and ``timestamp.input_received``.
+The communication times for Parsl can be measured through the differences between
+``timestamp.input_received`` and ``timestamp.compute_started``,
+provided the task does not wait for a worker to become available.
+The communication times related to serialization are also stored (e.g., ``time.serialize_results``).
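These diagnostics can be computed directly from a result record. The sketch below uses the timestamps from the example record earlier in this section; since the page is inconsistent about the exact key names (``timestamp`` vs. ``timestamps``), treat the dictionary layout as an assumption.

```python
# Computing the efficiency diagnostics described above from the example
# result record's values. The key names are an assumption based on the
# prose in this section.
ts = {
    "created": 1593498015.132477,
    "input_received": 1593498015.13357,
    "compute_started": 1593498018.856764,
    "result_sent": 1593498018.858268,
    "result_received": 1593498018.860002,
}
time_running = 1.8e-05

round_trip = ts["result_received"] - ts["created"]        # total wall time
overhead = round_trip - time_running                      # everything but compute
inbound = ts["input_received"] - ts["created"]            # Colmena queue cost
dispatch = ts["compute_started"] - ts["input_received"]   # engine dispatch time

print(f"round trip {round_trip:.4f} s, of which {overhead:.4f} s is overhead")
```

For this record nearly all of the round-trip time is dispatch, which is why the caveat about tasks waiting for a free worker matters when interpreting these numbers.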
2 changes: 2 additions & 0 deletions docs/examples.rst
Expand Up @@ -41,6 +41,8 @@ Papers where Colmena was used
[`ChemRxiv <https://doi.org/10.26434/chemrxiv-2022-8w9ft>`_]
- Zvyagin et al. "GenSLMs: Genome-scale language models reveal SARS-CoV-2 evolutionary dynamics."
[`bioRxiv <https://doi.org/10.1101/2022.10.10.511571>`_]
+- Dharuman et al. "Protein Generation via Genome-scale Language Models with Bio-physical Scoring"
+  [`SC-W '23 <https://dl.acm.org/doi/abs/10.1145/3624062.3626087>`_]

Open-Source Software
--------------------
4 changes: 2 additions & 2 deletions docs/how-to.rst
@@ -40,7 +40,7 @@ Methods that used Compiled Applications
Many Colmena applications launch tasks that use software written in languages besides Python.
Colmena provides the :class:`~colmena.models.ExecutableTask` class to help integrate these tasks into a Colmena application.

-The definition of an `ExectuableTask` is split into three parts:
+The definition of an ``ExecutableTask`` is split into three parts:

1. ``__init__``: create the shell command needed to launch your code and pass it to the initializer of the base class.
2. ``preprocess``: use method arguments to create the input files, command line arguments, or stdin needed to execute
@@ -71,7 +71,7 @@ then stores the result in stdout.
Some Task Server implementations execute the pre- and post-processing steps on separate resources
from the executable task to make more efficient use of the compute resources.

-See the `MPI with RADICAL Cybertools (RCT) <#>`_ example for a demonstration.
+See the `MPI example <https://github.com/exalearn/colmena/tree/master/demo_apps/mpi-with-rct>`_.

MPI Applications
................
2 changes: 1 addition & 1 deletion docs/index.rst
@@ -26,7 +26,7 @@ Colmena provides a few main components to enable building thinking applications:

#. An extensible base class for building thinking applications with a dataflow-like programming model
#. A "Task Server" that provides a simplified interface to HPC-ready workflow systems
-#. A high-performance queuing system communicating between tasks server(s) from thinking applications
+#. A high-performance queuing system communicating to task servers from thinking applications

The `demo applications <https://github.com/exalearn/colmena/tree/master/demo_apps/optimizer-examples>`_
illustrate how to implement different thinking applications that solve optimization problems.
4 changes: 4 additions & 0 deletions docs/installation.rst
@@ -15,3 +15,7 @@ Installation on Windows

We recommend installing the Windows Subsystem for Linux in order to use Colmena from a Windows system.

+Development Environment
+-----------------------
+
+The Anaconda environment in the ``dev`` folder provides an environment capable of running the examples.
8 changes: 4 additions & 4 deletions docs/quickstart.rst
@@ -26,7 +26,7 @@ Translating our target function, *f(x)*, and search algorithm into Python yields:
return best_to_date + random() - 0.5
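Only the final ``return`` line of the search step survives in the excerpt above. A hedged reconstruction might look like the following; the function names and the objective *f* itself are assumptions, not the quickstart's actual code.

```python
# Sketch reconstructing the quickstart's two pieces: a target function f(x)
# and a random-walk search step. Only the ``return`` line in the excerpt
# comes from the original; every name here is an illustrative assumption.
from random import random


def f(x: float) -> float:
    # a hypothetical objective to maximize
    return -(x - 1) ** 2


def next_guess(best_to_date: float) -> float:
    # propose a new point near the best found so far (line from the excerpt)
    return best_to_date + random() - 0.5


guess = next_guess(0.0)
assert -0.5 <= guess < 0.5  # random() is in [0, 1)
```

The search simply perturbs the best point seen so far by a uniform step in [-0.5, 0.5), so repeated evaluations of *f* at successive guesses drift toward the optimum.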
1. Define Communication
-------------------------------
+-----------------------

.. image:: _static/overview.svg
:height: 100
@@ -85,7 +85,7 @@ it to only use up to 4 processes on a single machine:
config = Config(executors=[HighThroughputExecutor(max_workers=4)])
-The list of methods and resources are used to define the "task server":
+Build a task server by providing a list of methods and resources:

.. code-block:: python
@@ -98,7 +98,7 @@ Colmena provides a "BaseThinker" class to create steering applications.
These applications run multiple operations (called agents) that send tasks and receive results
from the task server.

-Our thinker has two agents that each are class methods marked with the ``@agent`` decorator:
+Our example thinker has two agents, each a class method marked with the ``@agent`` decorator:

.. code-block:: python
@@ -171,7 +171,7 @@ Accordingly, we call their ``.start()`` methods to launch them.

Launch the Colmena application by running it with Python: ``python multi-agent-thinker.py``

-The application will produce a prolific about of log messages, including:
+The application will produce log messages from many components, including:

1. Log items from the thinker that mark the agent which wrote them:

