Testing drunc with the latest nightly
You first need to set up an OKS nightly/release.
cd <directory_above_where_you_want_the_new_software_area>
source /cvmfs/dunedaq.opensciencegrid.org/setup_dunedaq.sh
setup_dbt latest_v5
dbt-create -n NFD_DEV_YYMMDD_A9 <work_dir> # NFD_DEV_240705_A9 or newer
# Or, to use the v5.1.0 rc1 candidate release
# dbt-create -b candidate fddaq-v5.1.0-rc1-a9 <work_dir>
cd <work_dir>/sourcecode
# List below is indicative
git clone https://github.com/DUNE-DAQ/appmodel.git -b develop
git clone https://github.com/DUNE-DAQ/confmodel.git -b develop # maybe don't clone this one, it takes ages to compile it.
cd ..
source env.sh
dbt-build
dbt-workarea-env
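As a quick, optional sanity check that the work area environment is active (assuming, as in this release, that daq_application is one of the binaries installed by the release):
which daq_application  # should resolve to a path under /cvmfs or your <work_dir>/install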
Clone and install drunc (better to do that in the <work_dir> described in the step above):
cd <work_dir>
git clone git@github.com:DUNE-DAQ/drunc.git
cd drunc
pip install -e .
cd <work_dir>
git clone git@github.com:DUNE-DAQ/druncschema.git
cd druncschema
pip install -e .
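As an optional check that pip picked up your local clones rather than the release versions (not a required step):
which drunc-unified-shell
pip show drunc druncschema | grep -E 'Name|Location'  # Location should point at (or near) your clones for editable installs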
You will need one of the following files:
- drunc/data/process-manager-no-kafka.json.
- drunc/data/process-manager-CERN-kafka.json.
- drunc/data/process-manager-pocket-kafka.json.
Again, two choices:
- Either you have git cloned drunc, and the file is already there in the repository;
- Or you need to download it:
wget https://raw.githubusercontent.com/DUNE-DAQ/drunc/develop/data/process-manager-CERN-kafka.json
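If you went the wget route, a quick way to confirm the download is intact (python is used here only because it is already available in this environment; any JSON-aware tool works):
python -m json.tool process-manager-CERN-kafka.json | head  # pretty-prints the JSON, errors out if the file is malformed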
Let's say you are using the default session configuration: appmodel/test/config/test-session.data.xml. Again, two choices:
- Either you have cloned appmodel, and you can modify the RTE script line to point to your own RTE script (which should be in your <work_dir>/install directory) and compile (dbt-build) again;
- Or you need to copy appmodel/test/config/test-session.data.xml to your current working directory and modify the line there; the copy can be done with wget (a sketch for locating the line follows the command):
wget https://raw.githubusercontent.com/DUNE-DAQ/appmodel/develop/test/config/test-session.data.xml
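A rough sketch for locating the RTE script line in the copied file (the attribute name and script name below are assumptions, so check the file itself):
grep -n -i rte test-session.data.xml  # find the line that references the RTE script
# then edit that line so it points at the RTE script in <work_dir>/install
# (e.g. <work_dir>/install/daq_app_rte.sh -- the exact script name may differ in your area)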
You can now start the unified shell:
drunc-unified-shell <configuration>
Where <configuration> points to the file from the instructions above, which can either be one of the defined types:
- k8s
- ssh-CERN-kafka
- ssh-kafka
- ssh-standalone
These are the packaged configurations found in drunc/src/drunc/data/process-manager. The configuration can also be defined relative to the current working directory (with a file:// prefix), e.g. file://data/process-manager-no-kafka.json if working from the drunc root.
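For example, either of the following should work (the second assumes you are running from the drunc repository root):
drunc-unified-shell ssh-standalone
drunc-unified-shell file://data/process-manager-no-kafka.json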
Once there, you can boot:
drunc-unified-shell > boot test/config/test-session.data.xml test-session
# OR, if you decided to wget test-session.data.xml to PWD:
drunc-unified-shell > boot test-session.data.xml test-session
You can list all the apps with ps:
drunc-unified-shell > ps
Processes running
┏━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━┓
┃ session ┃ user ┃ friendly name ┃ uuid ┃ alive ┃ exit-code ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━┩
│ test-session │ plasorak │ root-controller │ ecf48bea-c4f3-404a-9b07-a762c2f5aaa7 │ True │ 0 │
│ test-session │ plasorak │ ru-controller │ c24f579c-60ca-4e9d-a288-698ba1c2ad42 │ True │ 0 │
│ test-session │ plasorak │ ru-01 │ c085cd31-4fd6-450a-a3f0-93ee762b5efa │ True │ 0 │
│ test-session │ plasorak │ df-controller │ 4dcaf726-7bf3-440d-8c76-ff175b5f4507 │ True │ 0 │
│ test-session │ plasorak │ df-01 │ daaecb18-921a-4a98-a89c-185bce247dc5 │ False │ 134 │
│ test-session │ plasorak │ dfo-01 │ f7859b07-93c3-4390-9e44-e1a8c3eb2398 │ True │ 0 │
│ test-session │ plasorak │ tp-stream-writer │ 1ed50313-96fb-459e-9eb9-36c6309c4d4f │ True │ 0 │
│ test-session │ plasorak │ trg-controller │ bd116248-2d47-4957-bb24-8fb339bab21e │ True │ 0 │
│ test-session │ plasorak │ mlt │ a114ba8b-9152-4350-899b-ffbaa29dfdcf │ True │ 0 │
└──────────────┴──────────┴──────────────────┴──────────────────────────────────────┴───────┴───────────┘
Looking at the logs of an application can be done with the logs command:
drunc-unified-shell > logs --name df-01
───────────────────────────────────── daaecb18-921a-4a98-a89c-185bce247dc5 logs ──────────────────────────────────────
<snippet>
2024-Mar-11 18:35:44,768 LOG [dunedaq::iomanager::QueueSenderModel<Datatype>::QueueSenderModel(const
dunedaq::iomanager::connection::ConnectionId&) [with Datatype =
std::unique_ptr<dunedaq::daqdataformats::TriggerRecord>] at
/cvmfs/dunedaq-development.opensciencegrid.org/nightly/NB_DEV_240306_A9/spack-0.20.0/opt/spack/linux-almalinux9-x86_64
/gcc-12.1.0/iomanager-NB_DEV_240306_A9-2xz2rt44fleigv3eqwu4arneebeq5bhv/include/iomanager/queue/detail/QueueSenderMode
l.hxx:68] QueueSenderModel created with DT! Addr: 0x7f25db252090
2024-Mar-11 18:35:44,768 LOG [dunedaq::iomanager::QueueSenderModel<Datatype>::QueueSenderModel(const
dunedaq::iomanager::connection::ConnectionId&) [with Datatype =
std::unique_ptr<dunedaq::daqdataformats::TriggerRecord>] at
/cvmfs/dunedaq-development.opensciencegrid.org/nightly/NB_DEV_240306_A9/spack-0.20.0/opt/spack/linux-almalinux9-x86_64
/gcc-12.1.0/iomanager-NB_DEV_240306_A9-2xz2rt44fleigv3eqwu4arneebeq5bhv/include/iomanager/queue/detail/QueueSenderMode
l.hxx:70] QueueSenderModel m_queue=0x7f25db223080
2024-Mar-11 18:35:44,768 ERROR [static void ers::ErrorHandler::SignalHandler::action(int, siginfo_t*, void*) at
/tmp/root/spack-stage/spack-stage-ers-NB_DEV_240306_A9-ypb44oo4yxx6glfbk4bna7ogqvbnlauw/spack-src/src/ErrorHandler.cpp
:90] Got signal 11 Segmentation fault (invalid memory reference)
Parameters = 'name=Segmentation fault (invalid memory reference)' 'signum=11'
Qualifiers = 'unknown'
host = np04-srv-019
user = plasorak (122687)
process id = 336403
thread id = 336403
process wd = /nfs/home/plasorak/NFD_DEV_240306_A9/runarea
stack trace of the crashing thread:
#0
/cvmfs/dunedaq-development.opensciencegrid.org/nightly/NB_DEV_240306_A9/spack-0.20.0/opt/spack/linux-almalinux9-x86_64
/gcc-12.1.0/dfmodules-NB_DEV_240306_A9-vbjtsx3ofhhig3gc7ocrub44iy4ldd3o/lib64/libdfmodules_TriggerRecordBuilder_duneDA
QModule.so(dunedaq::dfmodules::TriggerRecordBuilder::setup_data_request_connections(dunedaq::appdal::ReadoutApplicatio
n const*)+0xaa2) [0x7f25d7de6b22]
#1 /lib64/libc.so.6(+0x54df0) [0x7f25de254df0]
#2
/cvmfs/dunedaq-development.opensciencegrid.org/nightly/NB_DEV_240306_A9/spack-0.20.0/opt/spack/linux-almalinux9-x86_64
/gcc-12.1.0/dfmodules-NB_DEV_240306_A9-vbjtsx3ofhhig3gc7ocrub44iy4ldd3o/lib64/libdfmodules_TriggerRecordBuilder_duneDA
QModule.so(dunedaq::dfmodules::TriggerRecordBuilder::setup_data_request_connections(dunedaq::appdal::ReadoutApplicatio
n const*)+0xaa2) [0x7f25d7de6b22]
#3
/cvmfs/dunedaq-development.opensciencegrid.org/nightly/NB_DEV_240306_A9/spack-0.20.0/opt/spack/linux-almalinux9-x86_64
/gcc-12.1.0/dfmodules-NB_DEV_240306_A9-vbjtsx3ofhhig3gc7ocrub44iy4ldd3o/lib64/libdfmodules_TriggerRecordBuilder_duneDA
QModule.so(dunedaq::dfmodules::TriggerRecordBuilder::init(std::shared_ptr<dunedaq::appfwk::ModuleConfiguration>)+0xf3b
) [0x7f25d7de807b]
#4 daq_application() [0x438b82]
#5 daq_application() [0x4396d8]
#6 daq_application() [0x439abf]
#7 daq_application() [0x42e7d1]
#8 /lib64/libc.so.6(+0x3feb0) [0x7f25de23feb0]
#9 /lib64/libc.so.6(__libc_start_main+0x80) [0x7f25de23ff60]
#10 daq_application() [0x4301a5]
bash: line 1: 336403 Aborted (core dumped) daq_application --name df-01 -c rest://localhost:3339 -i
kafka://monkafka.cern.ch:30092/opmon --configurationService oksconfig:test/config/test-session.data.xml
Cleaning up can be done with kill (to kill the applications) and flush (to erase them from memory, i.e. you won't be able to restart them):
drunc-unified-shell > kill --user plasorak
<snip>
drunc-unified-shell > flush
<snip>
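Since flush erases the processes from the process manager, getting a session back after that means booting again:
drunc-unified-shell > boot test/config/test-session.data.xml test-session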
You can then drive the session through the FSM:
drunc-unified-shell > fsm conf
drunc-unified-shell > fsm start run_number 123 # Note the run number here!
drunc-unified-shell > fsm enable_trigger
drunc-unified-shell > status
drunc-unified-shell > fsm disable_trigger
drunc-unified-shell > fsm stop_trigger_sources
drunc-unified-shell > fsm stop
drunc-unified-shell > fsm scrap
drunc-unified-shell > describe --command fsm
- The FSM sequences have not been implemented. This is the occasion for you to revise the DAQ FSM! So you will actually need to send enable_trigger, drain_dataflow, disable_trigger, stop_trigger_sources... individually (see the sketch after this list).
- The thread pinning file that is used is specified in appdal/config/appdal/fsm.data.xml; it's readoutlibs/share/config/cpupins/cpupin-example.json before and after conf, and after start. In this example, it's the same for all the pre and post transitions, but that can be modified (do check out fsm.data.xml, you should be able to figure out how).
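As a minimal sketch of what stopping a run looks like when the transitions are sent one by one (the ordering below is assumed from the first note above; double-check it with describe --command fsm before relying on it):
drunc-unified-shell > fsm drain_dataflow
drunc-unified-shell > fsm disable_trigger
drunc-unified-shell > fsm stop_trigger_sources
drunc-unified-shell > fsm stop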