MRG importer specs drafted, MRGT specs adapted (needs review)
Signed-off-by: Rieks <RieksJ@users.noreply.github.com>
RieksJ committed Jul 14, 2023
1 parent 6fa89ff commit 5dc7f14
Showing 6 changed files with 208 additions and 11 deletions.
2 changes: 1 addition & 1 deletion docs/terminology-design/saf.yaml
@@ -29,7 +29,7 @@ scopes: #
- scopetags: [ tev2 ]
scopedir: https://github.com/tno-terminology-design/tev2-specifications/tree/master/docs/tev2
- scopetags: [ essiflab, essif-lab ]
scopedir: https://github.com/essif-lab/framework/tree/master/docs
scopedir: https://github.com/tno-terminology-design/tev2-specifications/docs/terminology-design/tree/master/docs
- scopetags: [ ctwg ]
scopedir: https://github.com/trustoverip/ctwg
#
2 changes: 1 addition & 1 deletion docs/tev2/miscellaneous/tool-development.md
@@ -47,7 +47,7 @@ For the date of the tooling status, see the "<i>last updated on</i>" text at the
### Under development

- [MRGT](/docs/tev2/spec-tools/mrgt), which is in [this toip repo](https://github.com/trustoverip/ctwg-toolkit-mrg). The tool works, but still has some [bugs/issues](https://github.com/trustoverip/ctwg-toolkit-mrg/issues) that need to be fixed.
- [TRRT](/docs/tev2/spec-tools/trrt), which is currently actively developed by TNO in [this repo](https://github.com/essif-lab/trrt).
- [TRRT](/docs/tev2/spec-tools/trrt), which is currently actively developed by TNO in [this repo](https://github.com/tno-terminology-design/trrt).

### High priority
- ingress tools that convert wiki-files (and perhaps some other formats) into [curated texts](/docs/tev2/spec-files/00-ctext.md);
2 changes: 1 addition & 1 deletion docs/tev2/saf.yaml
@@ -29,7 +29,7 @@ scopes: #
- scopetags: # definition of (scope) tag(s) that are used within this scope to refer to a specific terminology
- essiflab
- essif-lab
scopedir: https://github.com/essif-lab/framework/tree/master/docs # URL of the scope-directory
scopedir: https://github.com/tno-terminology-design/tev2-specifications/docs/tev2/tree/master/docs # URL of the scope-directory
- scopetags: # definition of (scope)tag(s) that are used within this scope to refer to a specific terminology
- ctwg
- toip-ctwg
114 changes: 114 additions & 0 deletions docs/tev2/spec-tools/12-mrg-importer.md
@@ -0,0 +1,114 @@
---
id: mrg-importer
sidebar_label: MRG Importer
displayed_sidebar: tev2SideBar
scopetag: tev2
date: 20230731
---

# MRG Import Tool

import useBaseUrl from '@docusaurus/useBaseUrl'

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

<!-- Use 'Mark' as an HTML tag, e.g. <mark>text to mark</mark> -->
export const mark = ({children}) => (
<span style={{ color:'black', backgroundColor:'yellow', padding:'0.2rem', borderRadius:'2px', }}>
{children}
</span> );

:::caution
The entire section on Terminology Engine v 2 (TEv2) is still under construction.<br/>
As TEv2 is not (yet) available, the texts that specify the tool are still 'raw', i.e. not yet processed.<br/>[Readers](@) will need to see through some (currently unprocessed) notational conventions.
:::

The **[MRG](@) Import Tool ([MRG importer](@))** ensures that the [scope](@) within which it is run obtains a local copy of all [MRGs](@) that are available in the [scopes](@) mentioned in the [scopes section](/docs/tev2/spec-files/saf#scopes) of its [SAF](@). This makes life easy for other tools, e.g., the [MRGT](@) and the [TRRT](@), which can then assume that all [MRGs](@) they may need to consult in order to do their job are readily available.

There will shortly be an implementation of the tool:
- the repo for the code of the tool is [here](https://github.com/tno-terminology-design/mrg-import).
- the documentation is [<mark>tbd</mark>].

## Installing the Tool

The tool can be installed from the command line and made globally available by executing

~~~
npm install tno-terminology-design/mrg-import -g
~~~

## Calling the Tool

The behavior of the [MRG importer](@) can be configured per call e.g. by a configuration file and/or command-line parameters. The command-line syntax is as follows:

~~~
mrg-import [ <paramlist> ]
~~~

where:
- `<paramlist>` (optional) is a list of key-value pairs

<details>
<summary>Legend</summary>

The columns in the following table are defined as follows:
1. **`Key`** is the text to be used as a key.
2. **`Value`** represents the kind of value to be used.
3. **`Req'd`** specifies whether (`Y`) or not (`n`) the field is required to be present when the tool is being called. If required, it MUST either be present in the configuration file, or as a command-line parameter.
4. **`Description`** specifies the meaning of the `Value` field, and other things you may need to know, e.g. why it is needed, a required syntax, etc.

</details>

| Key | Value | Req'd | Description |
| :------------- | :------------ | :---: | :---------- |
| `config` | `<path>` | n | Path (including the filename) of the tool's (YAML) configuration file. This file contains the default key-value pairs to be used. Allowed keys (and the associated values) are documented in this table. Command-line arguments override key-value pairs specified in the configuration file. This parameter MUST NOT appear in the configuration file itself. |
| `scopedir` | `<path>` | Y | Path of the [scope directory](@) from which the tool is called. It MUST contain the [SAF](@) for that [scope](@), which we will refer to as the 'current scope' for the [MRG importer](@). |
| `onNotExist` | `<action>` | n | Specifies the action to take when an [MRG](@) file that was expected to exist does not exist. Default is `'throw'`. |

The `<action>` parameter can take the following values:

| `<action>` | Description |
| :--------- | :---------- |
| `'throw'` | an error is thrown (an exception is raised), and processing will stop. |
| `'warn'` | a message is displayed (and logged) and processing continues. |
| `'log'` | a message is written to a log(file) and processing continues. |
| `'ignore'` | processing continues as if nothing happened. |
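
The four `<action>` values can be read as a small dispatch on what happens when an expected file is missing. The following is a minimal sketch in Python, not the tool's actual implementation (which lives in an npm package). The names `handle_missing_mrg` and `MrgNotFoundError` are hypothetical, and the choice of logging levels is an assumption.

```python
import logging
import sys

logger = logging.getLogger("mrg-import-sketch")  # hypothetical logger name


class MrgNotFoundError(FileNotFoundError):
    """Raised when an expected MRG file is missing and the action is 'throw'."""


def handle_missing_mrg(path, action="throw"):
    """Apply the onNotExist action for a missing MRG file and return the message."""
    message = f"MRG file not found: {path}"
    if action == "throw":
        # an error is thrown and processing stops
        raise MrgNotFoundError(message)
    if action == "warn":
        # a message is displayed (and logged) and processing continues
        print(message, file=sys.stderr)
        logger.warning(message)
    elif action == "log":
        # a message is written to a log and processing continues
        logger.info(message)
    elif action == "ignore":
        # processing continues as if nothing happened
        pass
    else:
        raise ValueError(f"unknown onNotExist action: {action!r}")
    return message
```

Rejecting unknown values with a `ValueError` is a design choice of this sketch; the specification above only lists the four allowed values.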

## Processing, Errors and Warnings

The [MRG importer](@) starts by reading its command-line and configuration file. If the command-line has a key that is also found in the configuration file, the command-line key-value pair takes precedence. The resulting set of key-value pairs is tested for proper syntax and validity. Every improper syntax and every invalidity found will be logged. Improper syntax may be e.g. an invalid [globpattern](https://en.wikipedia.org/wiki/Glob_(programming)#Syntax). Invalidities include non-existing directories or files, lack of write-permissions where needed, etc.
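
The precedence rule described above (command-line values override configuration-file values, which in turn override defaults) can be illustrated as follows. This is a sketch under stated assumptions: both inputs are taken to be already-parsed dictionaries, the function names `merge_parameters` and `validate` are hypothetical, and only the checks that follow directly from the parameter table are shown.

```python
ALLOWED_ACTIONS = {"throw", "warn", "log", "ignore"}


def merge_parameters(config_file, command_line):
    """Merge key-value pairs; command-line entries take precedence."""
    if "config" in config_file:
        # the 'config' key MUST NOT appear in the configuration file itself
        raise ValueError("'config' must not appear in the configuration file")
    merged = {"onNotExist": "throw"}  # documented default
    merged.update(config_file)        # configuration file overrides defaults
    merged.update(command_line)       # command line overrides configuration file
    return merged


def validate(params):
    """Return a list of problems found; an empty list means the set is valid."""
    problems = []
    if "scopedir" not in params:
        problems.append("missing required key 'scopedir'")
    if params.get("onNotExist") not in ALLOWED_ACTIONS:
        problems.append(f"invalid onNotExist action: {params.get('onNotExist')!r}")
    return problems
```

Collecting all problems in a list, rather than stopping at the first one, matches the requirement that every improper syntax and every invalidity found is logged.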

Then, the [MRG importer](@) reads the [SAF](@) of the [scope](@) from which it is called. For every element in the [scopes section](/tev2-specifications/docs/tev2/spec-files/saf#scopes), i.e. for every [scope](@) with which the current [scope](@) has specified a relation, it imports the [MRG](@) files of the versions that are actively maintained by (the [curators](@) of) that [scope](@), by
- getting the [scopetag](@) that is used within that [scope](@) to refer to itself. This [scopetag](@) is found in the `scopetag` field of the [scope section](/tev2-specifications/docs/tev2/spec-files/saf#terminology) of the [SAF](@) of that [scope](@). In the steps below, we will use `<scopetag>` to refer to its value.
- collecting all [versiontags](@) that are defined in the [SAF](@) of that [scope](@). This is the set of all [versiontags](@) that are found in either a `vsntag` or an `altvsntags` field of an element in the [versions section](/tev2-specifications/docs/tev2/spec-files/saf#versions) of the [SAF](@).
- for every such [versiontag](@) `<vsntag>`:
  - checking whether or not the file `mrg.<scopetag>.<vsntag>.yaml` exists in the [glossarydir](@) of that [scope](@). If it does not exist, this results in the behaviour specified by the `<action>` value of the `onNotExist` parameter (default: `'throw'`).
-



the specified input files (in arbitrary order), and for each of them, produces an output file that is the same as the input file except for the fact that all [term refs](@) have been replaced with regular [markdown links](https://www.markdownguide.org/basic-syntax/#links), and (optionally) with additional texts that are to be used by third-party rendering tools for enhanced rendering of such links. An example of this would be text that can be used to enhance a link with a popup that contains the definition, or a description of the [term](@) that is being referenced.

The [MRG importer](@) logs every error and/or warning condition that it comes across while processing its configuration file, command-line parameters, and input files, in a way that helps tool operators and document [authors](@) to identify and fix such conditions.
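
The steps for collecting [versiontags](@) and checking for `mrg.<scopetag>.<vsntag>.yaml` files, as described in this section, can be sketched as follows. This is an illustration only. It assumes the [SAF](@) has already been parsed into a dictionary with a `scope` section (holding `scopetag`) and a `versions` list (whose elements hold `vsntag` and, optionally, `altvsntags`), and that the [glossarydir](@) is available as a local directory. All function names are hypothetical.

```python
from pathlib import Path


def collect_versiontags(saf):
    """Return all versiontags (vsntag and altvsntags) in order, without duplicates."""
    tags = []
    for version in saf.get("versions", []):
        candidates = [version.get("vsntag")] + list(version.get("altvsntags") or [])
        for tag in candidates:
            if tag and tag not in tags:
                tags.append(tag)
    return tags


def expected_mrg_files(saf):
    """Return the MRG file names expected for the scope described by this SAF."""
    scopetag = saf["scope"]["scopetag"]
    return [f"mrg.{scopetag}.{tag}.yaml" for tag in collect_versiontags(saf)]


def missing_mrg_files(saf, glossarydir):
    """Return the expected MRG file names that do not exist in glossarydir."""
    return [
        name
        for name in expected_mrg_files(saf)
        if not (Path(glossarydir) / name).is_file()
    ]
```

Each name returned by `missing_mrg_files` would then be handled according to the `onNotExist` parameter.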

## Deploying the Tool

The [MRG importer](@) comes with documentation that enables developers to ascertain its correct functioning (e.g. by using a test set of files, test scripts that exercise its parameters, etc.), and also enables them to deploy the tool in a git repo and author/modify CI-pipes to use that deployment.

## Discussion Notes

This section contains notes from a discussion that Daniel and Rieks had on these matters some time ago.

- A ToIP [glossary](@) will be put by default at `http://trustoverip.github.io/<terms-community>/glossary`, where `<terms-community>` is the name of the [terms-community](@). This allows every [terms-community](@) to have its own [glossary](@). However, the above specifications allow [terms-communities](terms-community@) to [curate](@) multiple [scopes](scope@).
- Storing [glossaries](glossary@) elsewhere was seen to break the basic workings of the postprocessing tool, but the above specifications would fix that.
- Entries, e.g. 'foo', can be referenced as `http://trustoverip.github.io/<community>/glossary#foo` (in case of a standalone glossary), and `http://trustoverip.github.io/<community>/document-that-includes-glossary-fragment#foo` (in case of a fragmented glossary).
- There will be a new convention for content [authors](@) who want to reference [terms](term@) (let's call it the 'short form'). This topic is fully addressed above, and extended to be a bit more generic.
- Do we expect [glossaries](glossary@) that are generated by a [terms-community](@) to live at a fixed place (how do people find it, refer to its contents)? This topic is addressed
- Once [glossaries](glossary@) are generated, the idea is that all artifacts produced in a [terms-community](@) can use references to the [terms](term@) in the generated [glossaries](glossary@), e.g.:
  - confluence pages: we need to see how such pages can be processed. [Authors](@) can remove links like they do now, or they could use [term refs](@) as they see fit and then run TRRT.
  - github pages (e.g. https://github.com/trustoverip/ctwg-terms). Check (it's a github repo).
  - github wiki pages (e.g. https://github.com/trustoverip/ctwg-terms/wiki). Check (it's a github repo).
  - github wiki home pages (e.g. https://github.com/trustoverip/ctwg-terms/wiki/Home). Check (it's a github repo).
  - github-pages pages (e.g. https://github.com/trustoverip/ctwg-terms/docs).
- We could also see GGT and TRRT being extended, e.g. to work in conjunction with LaTeX or word-processor documents. This needs some looking into, but [pandoc](https://pandoc.org/) may be useful here.
@@ -31,7 +31,7 @@ Term ref resolution is the same process as we use for ingestion, and other conve
The **Term Ref(erence) Resolution Tool ([TRRT](@))** takes files that contain so-called [term refs](@) and outputs a copy of these files in which these [term refs](@) are converted into so-called [renderable refs](@), i.e. texts that can be further processed by tools such as GitHub pages, Docusaurus, etc. The result of this is that the rendered document contains markups that help [readers](@) to quickly find more explanations of the [concept](@) or other [knowledge artifact](@) that is being referenced.

There is currently one implementation of the tool:
- the repo in which the tool is being developed is [here](https://github.com/essif-lab/trrt).
- the repo is [here](https://github.com/tno-terminology-design/trrt).
- the documentation is [<mark>tbd</mark>].

<details>
@@ -90,6 +90,14 @@ By cleanly separating [term ref](@) interpretation from the part where it is ove

In order to convert [term refs](@) into [renderable refs](@), [TRRT](@) expects the [SAF](@) and the [MRG](@) of the [scope](@) from within which it is being called, to be available. The [MRG](@) is used to resolve all links to [terms](@) that are part of the [terminology](@) of this [scope](@). The [SAF](@) is used to locate the [MRGs](@) of any (other) [scope](@) whose [scopetag](@) is used as part of a [term ref](@) that needs to be resolved.

## Installing the Tool

The tool can be installed from the command line and made globally available by executing

~~~
npm install tno-terminology-design/trrt -g
~~~

## Calling the Tool

The behavior of the [TRRT](@) can be configured per call e.g. by a configuration file and/or command-line parameters. The command-line syntax is as follows:
@@ -117,11 +125,11 @@ The columns in the following table are defined as follows:
| :--------- | :------------ | :---: | :---------- |
| `config` | `<path>` | n | Path (including the filename) of the tool's (YAML) configuration file. This file contains the default key-value pairs to be used. Allowed keys (and the associated values) are documented in this table. Command-line arguments override key-value pairs specified in the configuration file. This parameter MUST NOT appear in the configuration file itself. |
| `input` | `<globpattern>` | n | [Globpattern](https://en.wikipedia.org/wiki/Glob_(programming)#Syntax) that specifies the set of (input) files that are to be processed. |
| `output` | `<dir>` | Y | Directory where output files are to be written. This directory is specified as an absolute or relative path. |
| `output` | `<dir>` | Y | (Root) directory where output files are to be written. This directory is specified as an absolute or relative path. |
| `scopedir` | `<path>` | Y | Path of the [scope directory](@) from which the tool is called. It MUST contain the [SAF](@) for that [scope](@), which we will refer to as the 'current scope' for the [TRRT](@). |
| `version` | `<versiontag>` | n | Version of the [terminology](@) that is to be used to resolve [term refs](@) for which neither a `scope` nor a `version` part has been specified (which is the most common case). It MUST match either the `vsntag` field, or an element of the `altvsntags` field of a [terminology](@)-version as specified in the [`versions` section](/docs/tev2/spec-files/saf#versions) of the [SAF](@). When not specified, its value is taken from the `vsntag` field in the [terminology section](/docs/tev2/spec-files/mrg#mrg-terminology) of the default [MRG](@) (which is [identified](@) by the contents of the `mrgfile` field (in the [`scope` section](/docs/tev2/spec-files/saf#terminology) of the [SAF](@)). |
| `interpreter` | `<type>` | n | Allows for the switching between interpreter types. By default the `AltInterpreter` and `StandardInterpreter` are available. When this parameter is omitted, the basic [term ref](@) syntax is used. |
| `converter` | `<type>` | n | The type of converter which creates the [renderable refs](@). When this parameter is omitted, the Markdown converter is used. |
| `version` | `<versiontag>` | n | Version of the [terminology](@) that is to be used to resolve [term refs](@) for which neither a `scope` nor a `version` part has been specified (which is the most common case). It MUST match either the `vsntag` field, or an element of the `altvsntags` field as specified in the [`versions` section](/docs/tev2/spec-files/saf#versions) of the [SAF](@). When not specified, its value is taken from the `defaultvsn` field in the [terminology section](/docs/tev2/spec-files/mrg#mrg-terminology) of the default [MRG](@) (which is [identified](@) by the contents of the `mrgfile` field (in the [`scope` section](/docs/tev2/spec-files/saf#terminology) of the [SAF](@)). |
| `interpreter` | `<type>` | n | Allows for the switching between interpreter types. By default the `AltInterpreter` and `StandardInterpreter` are available. When this parameter is omitted, the basic [term ref](@) syntax is used. |
| `converter` | `<type>` | n | The type of converter which creates the [renderable refs](@). When this parameter is omitted, the Markdown converter is used. |

## Term Ref Resolution
