Commit 55348f9

Merge branch 'master' into master

thpap committed Jul 3, 2023
2 parents aabdf5e + 5d09a5c
Showing 42 changed files with 524 additions and 464 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/docs.yml
@@ -326,7 +326,7 @@ jobs:
echo "Upload OpenQuakeforAdvancedUsers (PDF)"
rsync -e 'ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null' -ax --delete build/latex/OpenQuakeforAdvancedUsers.pdf "docs@docs.openquake.org:${DOCS_BASE}${TARGET}PDF/"
ssh -v -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null docs@docs.openquake.org "bash -cx 'cd \"${DOCS_BASE}${PDFDOCS}\"; ln -sf \"../${TARGET}PDF/OpenQuakeforAdvancedUsers.pdf\" \"OpenQuake for Advanced Users ${PDF_VER}.pdf\" .'"
ssh -v -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null docs@docs.openquake.org "bash -cx 'cd \"${DOCS_BASE}${PDFDOCS}\"; ln -sf \"../${TARGET}PDF/OpenQuakeforAdvancedUsers.pdf\" \"OpenQuake for Advanced Users ${PDF_VER}.pdf\"'"
echo "Make Advanced Manual (HTML)"
cd ../adv-manual && make clean && make html
1 change: 1 addition & 0 deletions CONTRIBUTORS.txt
@@ -76,6 +76,7 @@ Julián Montejo Espitia May 2022
Pablo Iturrieta (@pabloitu) June 2022
Anne Hulsey (@annehulsey) July 2022
Nicole Paul (@nicolepaul) May 2023
Claudio Schill (@claudio525) June 2023

Project management
------------------
12 changes: 12 additions & 0 deletions debian/changelog
@@ -1,3 +1,4 @@

[Athanasios Papadopoulos]
* Adjusted the Swiss-specific implementations of the GMPEs used
in the Swiss national seismic hazard (SUIhaz15) and risk (ERM-CH23) models.
@@ -11,7 +12,18 @@
ch_phisss03, ch_phis2s06 site parameters to the site.py file.
These parameters are required by the apply_swiss_amplification_sa method.

[Marco Pagani, Michele Simionato]
* Added support for the "direct" method in MRD calculations

[Michele Simionato]
* Reimplemented the conditional_spectrum calculator as a post-processor

[Paolo Tormene, Michele Simionato]
* Raise an error when the same parameter is set in different sections
of the job.ini file

[Michele Simionato]
* Fixed tau-phi inversion in LanzanoLuzi2019
* Fixed another bug with conditioned GMFs appearing as the error
`array[m, n, bad] = exp(mean_covs[0, g, m, n], im)
TypeError: list indices must be integers or slices, not tuple`
4 changes: 2 additions & 2 deletions debian/control
@@ -10,13 +10,13 @@ Homepage: http://github.com/gem/oq-engine
Package: python3-oq-engine
Architecture: amd64
X-Python-Version: >= 3.6
Depends: ${shlibs:Depends}, ${misc:Depends}, ${python3:Depends}, ${dist:Depends}, python3-oq-libs (>=3.8.0~)
Depends: ${shlibs:Depends}, ${misc:Depends}, ${python3:Depends}, ${dist:Depends}, python3-oq-libs (>=3.8.1~)
Conflicts: python-noq, python-oq, python-nhlib
Replaces: python-oq-risklib, python-oq-hazardlib, python-oq-engine (<< ${binary:Version})
# See https://www.debian.org/doc/debian-policy/ch-relationships.html#s-breaks
Breaks: python-oq-risklib, python-oq-hazardlib, python-oq-engine (<< ${binary:Version})
Recommends: ${dist:Recommends}
Suggests: python3-oq-libs-extra (>=3.8.0~)
Suggests: python3-oq-libs-extra (>=3.8.1~)
Description: computes seismic hazard and physical risk
based on the hazard and risk libraries developed by the GEM foundation
(http://www.globalquakemodel.org/)
4 changes: 2 additions & 2 deletions debian/control.in
@@ -10,13 +10,13 @@ Homepage: http://github.com/gem/oq-engine
Package: python3-oq-engine
Architecture: amd64
X-Python-Version: >= 3.6
Depends: ${shlibs:Depends}, ${misc:Depends}, ${python3:Depends}, ${dist:Depends}, python3-oq-libs (>=3.8.0~)
Depends: ${shlibs:Depends}, ${misc:Depends}, ${python3:Depends}, ${dist:Depends}, python3-oq-libs (>=3.8.1~)
Conflicts: python-noq, python-oq, python-nhlib
Replaces: python-oq-risklib, python-oq-hazardlib, python-oq-engine (<< ${binary:Version})
# See https://www.debian.org/doc/debian-policy/ch-relationships.html#s-breaks
Breaks: python-oq-risklib, python-oq-hazardlib, python-oq-engine (<< ${binary:Version})
Recommends: ${dist:Recommends}
Suggests: python3-oq-libs-extra (>=3.8.0~)
Suggests: python3-oq-libs-extra (>=3.8.1~)
Description: computes seismic hazard and physical risk
based on the hazard and risk libraries developed by the GEM foundation
(http://www.globalquakemodel.org/)
247 changes: 156 additions & 91 deletions doc/adv-manual/classical_PSHA.rst
@@ -851,84 +851,74 @@ are the same.
disagg_by_src
=======================================

Given a system of various sources affecting a specific site,
one very common question to ask is: what are the more relevant sources,
i.e. which sources contribute the most to the mean hazard curve?
The engine is able to answer such question by setting the ``disagg_by_src``
flag in the job.ini file. When doing that, the engine saves in
the datastore a 5-dimensional array called ``disagg_by_src`` with
dimensions (site ID, realization ID, intensity measure type,
intensity measure level, source ID). For that it is possible to extract
the contribution of each source to the mean hazard curve (interested
people should look at the code in the function ``check_disagg_by_src``).
The array ``disagg_by_src`` can also be read as a pandas DataFrame,
then getting something like the following::

>> dstore.read_df('disagg_by_src', index='src_id')
site_id rlz_id imt lvl value
ASCTRAS407 0 0 PGA 0 9.703749e-02
IF-CFS-GRID03 0 0 PGA 0 3.720510e-02
ASCTRAS407 0 0 PGA 1 6.735009e-02
IF-CFS-GRID03 0 0 PGA 1 2.851081e-02
ASCTRAS407 0 0 PGA 2 4.546237e-02
... ... ... ... ... ...
IF-CFS-GRID03 0 31 PGA 17 6.830692e-05
ASCTRAS407 0 31 PGA 18 1.072884e-06
IF-CFS-GRID03 0 31 PGA 18 1.275539e-05
ASCTRAS407 0 31 PGA 19 1.192093e-07
IF-CFS-GRID03 0 31 PGA 19 5.960464e-07
Given a system of various sources affecting a specific site, one very
common question to ask is: which are the most relevant sources,
i.e. which sources contribute the most to the mean hazard curve? The
engine is able to answer such a question if the ``disagg_by_src``
flag is set in the job.ini file. When doing that, the
engine saves in the datastore a 4-dimensional ArrayWrapper called
``mean_rates_by_src`` with dimensions (site ID, intensity measure
type, intensity measure level, source ID). From that it is possible to
extract the contribution of each source to the mean hazard curve
(interested readers should look at the code of the function
``check_disagg_by_src``). The ArrayWrapper ``mean_rates_by_src`` can also be
converted into a pandas DataFrame, thus getting something like the
following::

>> dstore['mean_rates_by_src'].to_dframe().set_index('src_id')
site_id imt lvl value
ASCTRAS407 0 PGA 0 9.703749e-02
IF-CFS-GRID03 0 PGA 0 3.720510e-02
ASCTRAS407 0 PGA 1 6.735009e-02
IF-CFS-GRID03 0 PGA 1 2.851081e-02
ASCTRAS407 0 PGA 2 4.546237e-02
... ... ... ... ...
IF-CFS-GRID03 0 PGA 17 6.830692e-05
ASCTRAS407 0 PGA 18 1.072884e-06
IF-CFS-GRID03 0 PGA 18 1.275539e-05
ASCTRAS407 0 PGA 19 1.192093e-07
IF-CFS-GRID03 0 PGA 19 5.960464e-07

The ``value`` field here is the probability of exceedance in the hazard
curve. The ``lvl`` field is an integer corresponding to the intensity
measure level in the hazard curve.

There is a consistency check comparing the mean hazard curves
with the values obtained by composing the probabilities in the
``disagg_by_src`` array, for the highest level of each intensity
measure type.
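
For intuition, the composition can be sketched as follows (a minimal
sketch of the idea, not the engine's actual code, using the two PGA
values at level 0 from the DataFrame above):

.. code-block:: python

   import numpy as np

   # probabilities of exceedance contributed by each source
   # (PGA, level 0, taken from the DataFrame shown above)
   poes_by_src = np.array([9.703749e-02, 3.720510e-02])

   # composing the per-source contributions as independent probabilities
   mean_poe = 1 - np.prod(1 - poes_by_src)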

It should be noticed that many hazard models contain thousands of
sources and as a consequence the ``disagg_by_src`` matrix can be
impossible to compute without running out of memory. Even if you have
enough memory, having a very large ``disagg_by_src`` matrix is a bad
idea, so there is a limit on the size of the matrix, hard-coded to 4
GB. The way to circumvent the limit is to reduce the number of
sources: for instance you could convert point sources into multipoint
sources.

In engine 3.15 we also introduced the so-called "colon convention" on source
IDs: if you have many sources that for some reason should be collected
together - for instance because they all account for seismicity in the same
tectonic region, or because they are components of a same source but are split
into separate sources by magnitude - you
can tell the engine to collect them into one source in the ``disagg_by_src``
matrix. The trick is to use IDs with the same prefix, a colon, and then a
numeric index. For instance, if you had 3 sources with IDs ``src_mag_6.65``,
``src_mag_6.75``, ``src_mag_6.85``, fragments of the same source with
different magnitudes, you could change their IDs to something like
``src:0``, ``src:1``, ``src:2`` and that would reduce the size of the
matrix ``disagg_by_src`` by 3 times by collecting together the contributions
of each source. There is no restriction on the numeric indices to start
from 0, so using the names ``src:665``, ``src:675``, ``src:685`` would
work too and would be clearer: the IDs should be unique, however.

If the IDs are not unique and the engine determines that the underlying
sources are different, then an extension "semicolon + incremental index"
is automatically added. This is useful when the hazard modeler wants
to define a model where the more than one version of the same source appears
in one source model, having changed some of the parameters, or when varied
versions of a source appear in each branch of a logic tree. In that case,
the modeler should use always the exact same ID (i.e. without the colon and
numeric index): the engine will automatically distinguish the
sources during the calculation of the hazard curves and consider them the same
when saving the array ``disagg_by_src``: you can see an example in the
test ``qa_tests_data/classical/case_79`` in the engine code base. In that
case the ``source_info`` dataset will list 6 sources
``ASCTRAS407;0``, ``ASCTRAS407;1``, ``ASCTRAS407;2``, ``ASCTRAS407;3``,
``IF-CFS-GRID03;0``, ``IF-CFS-GRID03;1`` but the matrix ``disagg_by_src``
will see only two sources ``ASCTRAS407`` and ``IF-CFS-GRID03`` obtained
by composing together the versions of the underlying sources.

In version 3.15 ``disagg_by_src`` was extended to work with mutually
In engine 3.15 we introduced the so-called "colon convention" on
source IDs: if you have many sources that for some reason should be
collected together - for instance because they all account for
seismicity in the same tectonic region, or because they are components
of the same source but are split into separate sources by magnitude -
you can tell the engine to collect them into one source in the
``mean_rates_by_src`` matrix. The trick is to use IDs with the same
prefix, a colon, and then a numeric index. For instance, if you had 3
sources with IDs ``src_mag_6.65``, ``src_mag_6.75``, ``src_mag_6.85``,
fragments of the same source with different magnitudes, you could
change their IDs to something like ``src:0``, ``src:1``, ``src:2`` and
that would reduce the size of the ``mean_rates_by_src`` matrix by a
factor of 3, by collecting together the contributions of each source.
The numeric indices are not required to start from 0, so using the
names ``src:665``, ``src:675``, ``src:685`` would work too and would
be clearer; the IDs, however, must be unique.

If the IDs are not unique and the engine determines that the
underlying sources are different, then an extension "semicolon +
incremental index" is automatically added. This is useful when the
hazard modeler wants to define a model where more than one version
of the same source appears in the same source model, with some of the
parameters changed, or when varied versions of a source appear in each
branch of a logic tree. In that case, the modeler should always use
the exact same ID (i.e. without the colon and numeric index): the
engine will automatically distinguish the sources during the
calculation of the hazard curves and consider them the same when
saving the array ``mean_rates_by_src``. You can see an example in the
test ``qa_tests_data/classical/case_20/job_bis.ini`` in the engine
code base. In that case the ``source_info`` dataset will list 7
sources ``CHAR1;0 CHAR1;1 CHAR1;2 COMFLT1;0 COMFLT1;1 SFLT1;0
SFLT1;1`` but the matrix ``mean_rates_by_src`` will see only three
sources ``CHAR1 COMFLT1 SFLT1`` obtained by composing together the
versions of the underlying sources.
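
The effect of the two conventions can be illustrated with a small
sketch (this mirrors the grouping logic conceptually, it is not the
engine's actual implementation):

.. code-block:: python

   def base_id(src_id):
       # ';' separates the automatically added version index,
       # ':' separates the user-chosen fragment index
       return src_id.split(';')[0].split(':')[0]

   assert base_id('CHAR1;0') == base_id('CHAR1;1') == 'CHAR1'
   assert base_id('src:665') == base_id('src:675') == 'src'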

In version 3.15 ``mean_rates_by_src`` was extended to work with mutually
exclusive sources, as in the Japan model. You can see an example in
the test ``qa_tests_data/classical/case_27``. However, the case of
mutually exclusive ruptures - an example is the New Madrid cluster
@@ -975,15 +965,102 @@ NB: ``disagg_by_src`` can be set to true only if the
not in the original source model, thus breaking the connection between
the values of the matrix and the original sources.

The conditional spectrum calculator
========================================
The post-processing framework and Vector-valued PSHA calculations
=================================================================

Since version 3.17 the OpenQuake engine has special support for
custom postprocessors. A postprocessor is a Python module located
in the directory ``openquake/calculators/postproc`` and containing
a ``main`` function with signature:

.. code-block:: python

   def main(dstore, [csm], ...):
       ...

Post-processors are called after a classical or
preclassical calculation: the ``dstore`` parameter is a DataStore
instance corresponding to the calculation, while the ``csm``
parameter is a CompositeSourceModel instance (it can be omitted if not
needed).

The ``main`` function is called when the user sets in
the job.ini file the parameters ``postproc_func`` and
``postproc_args``. ``postproc_func`` is the dotted name of the
postprocessing function (in the form ``modulename.funcname``,
where ``funcname`` is normally ``main``) and
``postproc_args`` is a dictionary of literal arguments that gets passed
to the function; if not specified, the empty dictionary is
passed. This happens for instance for the conditional spectrum
post-processor, since it does not require additional arguments with
respect to the ones in ``dstore['oqparam']``.
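
As a toy illustration, a minimal post-processor module could look as
follows (``myplugin``, the ``factor`` argument and the dataset names
are hypothetical, used only to show the mechanics):

.. code-block:: python

   # hypothetical module openquake/calculators/postproc/myplugin.py,
   # activated in the job.ini with
   #   postproc_func = myplugin.main
   #   postproc_args = {'factor': 2.0}
   def main(dstore, factor=1.0):
       # read a result of the underlying classical calculation
       # (assuming the 'hcurves-stats' dataset is present)
       poes = dstore['hcurves-stats'][:]
       # store a derived array in the datastore
       dstore['scaled_poes'] = poes * factor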

The post-processing framework was put in place in order to run
VPSHA calculations. The user can find an example in
``qa_tests_data/postproc/case_mrd``. In the job.ini file there are the lines::

postproc_func = compute_mrd.main
postproc_args = {
'imt1': 'PGA',
'imt2': 'SA(0.05)',
'cross_correlation': 'BakerJayaram2008',
'seed': 42,
'meabins': [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
'sigbins': [0.2, 0.3, 0.4, 0.5, 0.6, 0.7],
'method': 'indirect'}

while the postprocessor module ``openquake.calculators.postproc.compute_mrd``
contains the function

.. code-block:: python

   # inside openquake.calculators.postproc.compute_mrd
   def main(dstore, imt1, imt2, cross_correlation, seed, meabins, sigbins,
            method='indirect'):
       ...

Inside ``main`` there is code to create the dataset ``mrd``, which
contains the Mean Rate Distribution as an array of shape L1 x L1 x N,
where L1 is the number of levels per IMT minus 1 and N is the number of
sites (normally 1).
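
For instance, once the calculation has run, the result could be
inspected as follows (a sketch, assuming ``dstore`` is a DataStore
instance for the calculation, opened as shown later in this section):

.. code-block:: python

   mrd = dstore['mrd'][:]    # array of shape (L1, L1, N)
   mrd_site0 = mrd[:, :, 0]  # Mean Rate Distribution of the first site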

While the postprocessing part of a VPSHA calculation is computationally
intensive, it is much more common to have light postprocessing, i.e.
faster than the classical calculation it depends on. In such
situations the postprocessing framework really shines, since it is
possible to reuse the original calculation via the standard ``--hc``
switch, i.e. you can avoid repeating the same classical calculation
multiple times if you are interested in running the postprocessor with
different parameters. In that situation the ``main`` function will
get a DataStore instance with an attribute ``parent`` corresponding
to the DataStore of the original calculation.
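
For instance, a post-processor meant to be run with ``--hc`` could
read the results of the parent calculation like this (a sketch; the
dataset name is an assumption):

.. code-block:: python

   def main(dstore):
       # with --hc, dstore.parent is the DataStore of the original
       # calculation (assuming it contains a 'hcurves-stats' dataset)
       hcurves = dstore.parent['hcurves-stats'][:]
       ...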

The postprocessing framework also integrates very well with interactive
development (think of Jupyter notebooks). The following lines are all
you need to create a child datastore where the postprocessing function
can store its results after reading the data from the calculation datastore:

.. code-block:: python

   >> from openquake.commonlib.datastore import read, build_dstore_log
   >> from openquake.calculators.postproc import mypostproc
   >> dstore, log = build_dstore_log(parent=read(calc_id))
   >> with log:
   ..     mypostproc.main(dstore)

The conditional spectrum post-processor
---------------------------------------

Since version 3.17 the engine includes an experimental post-processor
which is able to compute the conditional spectrum.

The ``conditional_spectrum`` calculator is an experimental calculator
introduced in version 3.13, which is able to compute the conditional
spectrum in the sense of Baker.
The implementation was adapted from the paper *Conditional Spectrum
Computation Incorporating Multiple Causal Earthquakes and
Ground-Motion Prediction Models* by Ting Lin, Stephen C. Harmsen,
Jack W. Baker, and Nicolas Luco (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.845.163&rep=rep1&type=pdf) and it is rather sophisticated.

In order to perform a conditional spectrum calculation you need to
specify (on top of the usual parameter of a classical calculation):
specify, in addition to the usual parameters of a classical calculation:

1. a reference intensity measure type (e.g. ``imt_ref = SA(0.2)``)
2. a cross correlation model (e.g. ``cross_correlation = BakerJayaram2008``)
@@ -1033,15 +1110,3 @@ standard deviations.

Conditional spectra for individual realizations are also computed and stored
for debugging purposes, but they are not exportable.

The implementation was adapted from the paper *Conditional Spectrum
Computation Incorporating Multiple Causal Earthquakes and
Ground-Motion Prediction Models* by Ting Lin, Stephen C. Harmsen,
Jack W. Baker, and Nicolas Luco (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.845.163&rep=rep1&type=pdf) and it is rather sophisticated.
The core formula is implemented in the method
`openquake.hazardlib.contexts.get_cs_contrib`.

The ``conditional_spectrum`` calculator, like the disaggregation calculator,
is a kind of post-calculator, i.e. you can run a regular classical calculation
and then compute the ``conditional_spectrum`` in post-processing by using
the ``--hc`` option.