
Creating and running new modules

williamratcliff edited this page Jun 22, 2011 · 5 revisions

Defining a reduction module

To define a new module, the current convention is to keep modules in a dedicated directory, one module per file. The base class Module is located in /dataflow/core.py and allows for specifying an id, name, version, description, fields, icon, terminals, and an action. The last three parameters are of special interest and need to be formatted correctly.

The icon is a dictionary with the keys 'URI' and 'terminals'. The 'URI' is the location of the icon for that module, while 'terminals' is a dictionary that gives the terminal locations for the keys 'input' and 'output' in the format (x, y, dx, dy).

The terminals parameter for creating a new Module is different from the 'terminals' key just described: it is a list of dictionaries that describe the input and/or output terminals. Each dictionary should give an id, datatype, use, and description. Other keys include 'required', which says whether input on that terminal is required (needed because a terminal may be optional in some modules but mandatory in others), and 'multiple', which says whether multiple nodes can attach to the input.

The last parameter, action, is a method that actually transforms the data. I'm not doing Paul justice by describing his code, but if you use any of the modules in /dataflow/modules as an example, it should be clear.
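As a rough illustration of the two 'terminals' structures described above, here is a minimal sketch using plain dictionaries. Every concrete value in it (the icon file name, the coordinates, the terminal ids, the datatype strings, and the 'in'/'out' values for use) is an assumption made up for the example, and the small checking helper is hypothetical rather than part of /dataflow/core.py. The real modules in /dataflow/modules remain the authoritative reference.

```python
# Illustrative values only: the URI, coordinates, ids, datatype strings and
# the "in"/"out" values for "use" are hypothetical, not copied from
# /dataflow/modules.

# Icon: where the image lives and where terminals sit on it, as (x, y, dx, dy).
icon = {
    "URI": "scale_icon.png",
    "terminals": {
        "input": (0, 10, -1, 0),
        "output": (20, 10, 1, 0),
    },
}

# Terminals parameter: a list with one dictionary per input/output terminal.
terminals = [
    {
        "id": "input",
        "datatype": "example.data",
        "use": "in",
        "description": "datasets to be scaled",
        "required": True,   # this terminal must be connected
        "multiple": True,   # several nodes may attach to it
    },
    {
        "id": "output",
        "datatype": "example.data",
        "use": "out",
        "description": "scaled datasets",
    },
]

# Keys the page says every terminal dictionary should provide.
REQUIRED_TERMINAL_KEYS = {"id", "datatype", "use", "description"}


def missing_terminal_keys(terminal):
    """Return the documented keys that a terminal dictionary lacks."""
    return REQUIRED_TERMINAL_KEYS - set(terminal)


# Quick sanity check before handing the structures to a module definition.
for terminal in terminals:
    assert not missing_terminal_keys(terminal), terminal["id"]
```

A check like this is only a convenience for catching typos in the dictionaries early; whether Module itself validates these keys is not stated here.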

After you define the module with the proper action (which should accept an 'input' list of data and return a dictionary that maps 'output' to the result), you can piece together a template and configuration and run the whole data reduction with run_template(template, config) from /dataflow/core.py.
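The action contract described above can be sketched as follows. The function name scale_action, the scale parameter, and the choice to represent each dataset as a list of numbers are all assumptions for the sake of the example; only the 'input' list going in and the dictionary keyed by 'output' coming out are taken from this page. Building the template and configuration is left out because their format is not described here.

```python
def scale_action(input=None, scale=1.0):
    """Hypothetical action: multiply every value in every dataset by `scale`.

    `input` (which shadows the builtin on purpose, to match the terminal
    name) is a list of datasets; each dataset is assumed to be a list of
    numbers. The result is a dictionary mapping 'output' to the new list.
    """
    datasets = input or []
    scaled = [[value * scale for value in dataset] for dataset in datasets]
    return {"output": scaled}


# Calling the action directly, outside of any template:
result = scale_action(input=[[1.0, 2.0], [3.0]], scale=2.0)
print(result["output"])  # prints [[2.0, 4.0], [6.0]]
```

Testing an action on its own like this is a convenient way to check the transformation before wiring it into a template.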

We should also have an "active" and perhaps a "busy" state for a module. The active flag would denote whether the module should be used as part of the chain, or whether values should just "flow through" it. An example is the SANS pipeline, where users sometimes do not want to put their data on an absolute scale. I picture that this will be a common motif when we provide "standard" templates for users: the template could still be used, but we could just turn a particular module "off". This could fit within the current architecture as a suggested practice, but let's promote it to a required part of a template (the default will be True) so that we can rely on it for calculations, diagramming, etc.
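One possible reading of the proposed active flag, sketched under stated assumptions: the helper run_step and the example action absolute_scale are hypothetical names invented here, and nothing like them is claimed to exist in /dataflow/core.py. The sketch only shows the "flow through" behaviour, where an inactive module hands its input to the output unchanged.

```python
def absolute_scale(input=None, factor=1.0):
    """Hypothetical action that puts datasets on an absolute scale."""
    datasets = input or []
    return {"output": [[value * factor for value in data] for data in datasets]}


def run_step(action, data, active=True, **fields):
    """Run one module, or let the data flow through if it is switched off.

    `active` defaults to True, matching the default proposed for templates.
    """
    if not active:
        # Inactive module: pass the input along untouched.
        return {"output": data}
    return action(input=data, **fields)


data = [[1.0, 2.0]]
print(run_step(absolute_scale, data, active=True, factor=10.0)["output"])
# prints [[10.0, 20.0]]
print(run_step(absolute_scale, data, active=False, factor=10.0)["output"])
# prints [[1.0, 2.0]]
```

Keeping the pass-through logic outside the action means individual modules would not need to know about the flag, which is one reason to make it part of the template rather than a per-module convention.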

*Under construction, more details will come soon
