Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Chore: update repo with latest fork #1

Merged
merged 4 commits into from
Feb 8, 2024
Merged

Conversation

JasperTan97
Copy link

@JasperTan97 JasperTan97 commented Jan 31, 2024

Description

This PR solves the issue by:

  • following correct file structure
  • updating cmakelists
  • adding test script: main.cpp

We need to discuss how to merge this test_stag repo with the stag repo, and what becomes of this (Do I delete, or is there an archive). Are there best practices on how I would merge the 2 repos (basically I want to delete everything in stag and replace with this, save the readme), also how do we preserve these discussions for the future?

Supporting information

The correctness of this repo is defined by here

You can test the build-ability of the repo with a Dockerfile like so:

# syntax=docker/dockerfile:1

# Use an official Ubuntu runtime as the base image
FROM ubuntu:22.04

# Update the system
RUN apt-get update -y && apt-get upgrade -y

# Install OpenCV
RUN DEBIAN_FRONTEND=noninteractive apt-get install -y libopencv-dev cmake g++
# RUN apt-get install -y g++

# Copy the current directory contents into the container at /app
COPY . /app

# Set the working directory to /app
WORKDIR /app

# Create a build directory and navigate into it
RUN mkdir build && cd build

# Run CMake to generate the Makefile
RUN cmake .

# Build the project
RUN make

# Run the compiled binary
CMD ["./stag_main"]

Review guidelines

Estimated Time of Review: 15 minutes
You can review by building the repo.

Checklist before merging:

  • Confirm that the relevant changelog(s) are up-to-date in case of any user-facing changes

@JasperTan97
Copy link
Author

JasperTan97 commented Jan 31, 2024

Thread 39 "state_engine" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fff71ffb640 (LWP 391)]
0x00007ffff7b28ac1 in rclcpp_lifecycle::LifecycleNode::get_logger() const () from /opt/ros/iron/lib/librclcpp_lifecycle.so
(gdb) backtrace
#0  0x00007ffff7b28ac1 in rclcpp_lifecycle::LifecycleNode::get_logger() const ()
   from /opt/ros/iron/lib/librclcpp_lifecycle.so
#1  0x00007fff11661a3f in pose_detection::components::MarkerDetectorComponent::detect_markers (
    this=0x7fff247eba00, image=...)
    at /home/ros2/ws/src/pose_detection/src/components/MarkerDetectorComponent.cpp:43
#2  0x00007fff1166252b in pose_detection::components::MarkerDetectorComponent::on_image_callback (
    this=0x7fff247eba00, image=...)
    at /home/ros2/ws/src/pose_detection/src/components/MarkerDetectorComponent.cpp:82
#3  0x00007fff115b4997 in pose_detection::components::PoseDetectorComponent::image_callback (
    this=0x7fff247eba00, 
    image=std::shared_ptr<const sensor_msgs::msg::Image_<std::allocator<void> >> (use count 5, weak count 0) = {...}) at /home/ros2/ws/src/pose_detection/src/components/PoseDetectorComponent.cpp:125
#4  0x00007fff115b29ca in operator() (__closure=0x7fff1400d9f0, 
    image=std::shared_ptr<const sensor_msgs::msg::Image_<std::allocator<void> >> (use count 5, weak count 0) = {...}) at /home/ros2/ws/src/pose_detection/src/components/PoseDetectorComponent.cpp:65
#5  0x00007fff115b957a in std::__invoke_impl<void, pose_detection::components::PoseDetectorComponent::on_configure_callback()::<lambda(sensor_msgs::msg::Image_<std::allocator<void> >::ConstSharedPtr)>&, std::shared_ptr<const sensor_msgs::msg::Image_<std::allocator<void> > > >(std::__invoke_other, struct {...} &)
    (__f=...) at /usr/include/c++/11/bits/invoke.h:61
#6  0x00007fff115b8f24 in std::__invoke_r<void, pose_detection::components::PoseDetectorComponent::on_configure_callback()::<lambda(sensor_msgs::msg::Image_<std::allocator<void> >::ConstSharedPtr)>&, std::shared_ptr<const sensor_msgs::msg::Image_<std::allocator<void> > > >(struct {...} &) (__fn=...)
    at /usr/include/c++/11/bits/invoke.h:111
#7  0x00007fff115b8655 in std::_Function_handler<void(std::shared_ptr<const sensor_msgs::msg::Image_<std::allocator<void> > >), pose_detection::components::PoseDetectorComponent::on_configure_callback()::<lambda(sensor_msgs::msg::Image_<std::allocator<void> >::ConstSharedPtr)> >::_M_invoke(const std::_Any_data &, std::shared_ptr<sensor_msgs::msg::Image_<std::allocator<void> > const> &&) (__functor=..., __args#0=...)

So this is what I get repeatedly now that I've updated the repo. Looks like there's error from our part of the code, but maybe it means double free is gone???

Although the relevant part of the code is about the logger,:

std::vector<state_representation::CartesianPose> MarkerDetectorComponent::detect_markers(const cv::Mat& image) {
  std::vector<state_representation::CartesianPose> marker_poses;
  auto markers = marker_detector_->detect_markers(image);
  bool select_marker_detected = false;
  bool any_marker_detected = false;
  for (const auto& [marker_id, marker_points]: markers) {
    auto [pose, error] = marker_detector_->get_marker_pose(marker_id, marker_points);
    RCLCPP_DEBUG_STREAM(this->get_logger(),
                        "Marker " << marker_id << ": Reprojection Error " << error << std::endl << pose);
    marker_poses.push_back(pose);
    any_marker_detected = true;
    if (marker_selection_.find(marker_id) != marker_selection_.cend()) {
      select_marker_detected = true;
    }
  }
  this->set_predicate("is_any_marker_detected", any_marker_detected);
  this->set_predicate("is_any_selected_marker_detected", select_marker_detected);
  this->show_image(marker_detector_->get_annotated_image());
  return marker_poses;
}

and I am not too sure why this would happen.

@JasperTan97 JasperTan97 self-assigned this Jan 31, 2024
@JasperTan97
Copy link
Author

JasperTan97 commented Jan 31, 2024

@eeberhard , if you have given this a glance, perhaps it's more obvious to you where the segfault is coming from. I am wondering why that line would have errors?

There are a few things that could have been deleted here, but I think I have narrowed it to this pointer: If the command for deletion comes from another thread, we could have a problem when the debug statement tries to access this then we get this problem. But I don't understand why this issue was never a problem before.

I am also assuming that something is being deleted, even though it is not clear from the gdb console because:

[INFO] [1706703644.082283908] [ComponentManagerInterface.video_publisher]: Destroying interface for component /video_publisher
[INFO] [1706703644.082463845] [ComponentManagerInterface.stag_detector]: Destroying interface for component /stag_detector

Thread 43 "state_engine" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fff6f7fe640 (LWP 259)]
0x00007ffff7b28ac1 in rclcpp_lifecycle::LifecycleNode::get_logger() const () from /opt/ros/iron/lib/librclcpp_lifecycle.so
(gdb) backtrace
#0  0x00007ffff7b28ac1 in rclcpp_lifecycle::LifecycleNode::get_logger() const ()
   from /opt/ros/iron/lib/librclcpp_lifecycle.so
#1  0x00007fff10d20a96 in pose_detection::components::MarkerDetectorComponent::detect_markers (
    this=0x7fff187ed160, image=...)
    at /home/ros2/ws/src/pose_detection/src/components/MarkerDetectorComponent.cpp:46
#2  0x00007fff10d21583 in pose_detection::components::MarkerDetectorComponent::on_image_callback (
    this=0x7fff187ed160, image=...)
    at /home/ros2/ws/src/pose_detection/src/components/MarkerDetectorComponent.cpp:85
#3  0x00007fff10c739d7 in pose_detection::components::PoseDetectorComponent::image_callback (
    this=0x7fff187ed160, 
    image=std::shared_ptr<const sensor_msgs::msg::Image_<std::allocator<void> >> (use count 5, weak count 0) = {...}) at /home/ros2/ws/src/pose_detection/src/components/PoseDetectorComponent.cpp:125
#4  0x00007fff10c71a0a in operator() (__closure=0x7fff3800cca0, 
    image=std::shared_ptr<const sensor_msgs::msg::Image_<std::allocator<void> >> (use count 5, weak count 0) = {...}) at /home/ros2/ws/src/pose_detection/src/components/PoseDetectorComponent.cpp:65
#5  0x00007fff10c785ba in std::__invoke_impl<void, pose_detection::components::PoseDetectorComponent::on_configure_callback()::<lambda(sensor_msgs::msg::Image_<std::allocator<void> >::ConstSharedPtr)>&, std::shared_ptr<const sensor_msgs::msg::Image_<std::allocator<void> > > >(std::__invoke_other, struct {...} &)
    (__f=...) at /usr/include/c++/11/bits/invoke.h:61
#6  0x00007fff10c77f64 in std::__invoke_r<void, pose_detection::components::PoseDetectorComponent::on_configure_callback()::<lambda(sensor_msgs::msg::Image_<std::allocator<void> >::ConstSharedPtr)>&, std::shared_ptr<const sensor_msgs::msg::Image_<std::allocator<void> > > >(struct {...} &) (__fn=...)
    at /usr/include/c++/11/bits/invoke.h:111
#7  0x00007fff10c77695 in std::_Function_handler<void(std::shared_ptr<const sensor_msgs::msg::Image_<std::allocator<void> > >), pose_detection::components::PoseDetectorComponent::on_configure_callback()::<lambda(sensor_msgs::msg::Image_<std::allocator<void> >::ConstSharedPtr)> >::_M_invoke(const std::_Any_data &, std::shared_ptr<sensor_msgs::msg::Image_<std::allocator<void> > const> &&) (__functor=..., __args#0=...)
--Type <RET> for more, q to quit, c to continue without paging--
    at /usr/include/c++/11/bits/std_function.h:290
#8  0x00007fff10d1413b in std::function<void (std::shared_ptr<sensor_msgs::msg::Image_<std::allocator<void> > const>)>::operator()(std::shared_ptr<sensor_msgs::msg::Image_<std::allocator<void> > const>) const
    (this=0x7fff3800cca0, 
    __args#0=std::shared_ptr<const sensor_msgs::msg::Image_<std::allocator<void> >> (empty) = {...})
    at /usr/include/c++/11/bits/std_function.h:590
#9  0x00007fff10d0a94d in rclcpp::AnySubscriptionCallback<sensor_msgs::msg::Image_<std::allocator<void> >, std::allocator<void> >::dispatch(std::shared_ptr<sensor_msgs::msg::Image_<std::allocator<void> > >, rclcpp::MessageInfo const&)::{lambda(auto:1&&)#1}::operator()<std::function<void (std::shared_ptr<sensor_msgs::msg::Image_<std::allocator<void> > const>)>&>(std::function<void (std::shared_ptr<sensor_msgs::msg::Image_<std::allocator<void> > const>)>&) const (__closure=0x7fff6f7fb250, callback=...)
    at /opt/ros/iron/include/rclcpp/rclcpp/any_subscription_callback.hpp:556
#10 0x00007fff10d18153 in std::__invoke_impl<void, rclcpp::AnySubscriptionCallback<sensor_msgs::msg::Image_<std::allocator<void> >, std::allocator<void> >::dispatch(std::shared_ptr<sensor_msgs::msg::Image_<std::allocator<void> > >, rclcpp::MessageInfo const&)::{lambda(auto:1&&)#1}, std::function<void (std::shared_ptr<sensor_msgs::msg::Image_<std::allocator<void> > const>)>&>(std::__invoke_other, rclcpp::AnySubscriptionCallback<sensor_msgs::msg::Image_<std::allocator<void> >, std::allocator<void> >::dispatch(std::shared_ptr<sensor_msgs::msg::Image_<std::allocator<void> > >, rclcpp::MessageInfo const&)::{lambda(auto:1&&)#1}&&, std::function<void (std::shared_ptr<sensor_msgs::msg::Image_<std::allocator<void> > const>)>&) (
    __f=...) at /usr/include/c++/11/bits/invoke.h:61
#11 0x00007fff10d145a8 in std::__invoke<rclcpp::AnySubscriptionCallback<sensor_msgs::msg::Image_<std::allocator<void> >, std::allocator<void> >::dispatch(std::shared_ptr<sensor_msgs::msg::Image_<std::allocator<void> > >, rclcpp::MessageInfo const&)::{lambda(auto:1&&)#1}, std::function<void (std::shared_ptr<sensor_msgs::msg::Image_<std::allocator<void> > const>)>&>(rclcpp::AnySubscriptionCallback<sensor_msgs::msg::Image_<std::allocator<void> >, std::allocator<void> >::dispatch(std::shared_ptr<sensor_msgs::msg::Image_<std::allocator<void> > >, rclcpp::MessageInfo const&)::{lambda(auto:1&&)#1}&&, std::function<void (std::shared_ptr<sensor_msgs::msg::Image_<std::allocator<void> > const>)>&) (__fn=...)

I can confirm by doing:

std::vector<state_representation::CartesianPose> MarkerDetectorComponent::detect_markers(const cv::Mat& image) {
  std::vector<state_representation::CartesianPose> marker_poses;
  auto markers = marker_detector_->detect_markers(image);
  bool select_marker_detected = false;
  bool any_marker_detected = false;
  if (markers.empty()) {
    return marker_poses;
  }
  for (const auto& [marker_id, marker_points]: markers) {
    auto [pose, error] = marker_detector_->get_marker_pose(marker_id, marker_points);
    RCLCPP_DEBUG_STREAM(this->get_logger(),
                        "Marker " << marker_id << ": Reprojection Error " << error << std::endl << pose);
    marker_poses.push_back(pose);
    any_marker_detected = true;
    if (marker_selection_.find(marker_id) != marker_selection_.cend()) {
      select_marker_detected = true;
    }
  }
  this->set_predicate("is_any_marker_detected", any_marker_detected);
  this->set_predicate("is_any_selected_marker_detected", select_marker_detected);
  this->show_image(marker_detector_->get_annotated_image());
  return marker_poses;
}

and still I get a segfault at RCLCPP_DEBUG_STREAM(this->get_logger(), "Marker " << marker_id << ": Reprojection Error " << error << std::endl << pose);

@JasperTan97
Copy link
Author

https://github.com/aica-technology/dynamic-state-engine/pull/230

I am wondering if what you mentioned here could be the issue?

@eeberhard
Copy link
Member

To clarify, does the segfault still happen when stopping the application? If the segfault is happening inside this->get_logger(), then the NodeLoggingInterface pointer node_logging_ is invalid here, and this in turns points to some kind of deletion during access. This could well be because the component starts be unloaded and destroyed while the execution thread is still going. I imagine this race condition is much more apparent in your use-case because the image processing step occupies a lot of time.

The fix in https://github.com/aica-technology/dynamic-state-engine/pull/230 does not necessarily solve this problem because it happens while the component is being destructed. Rather I would see if you can add a mutex around MarkerDetectorComponent::detect_markers to make it cannot be destructed halfway through

Copy link
Member

@eeberhard eeberhard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems generally good except for most minor nitpicks. Once we are sure these changes work with the downstream users, we can update the main aica-technology/stag fork accordingly.

We need to discuss how to merge this test_stag repo with the stag repo. Are there best practices on how I would merge the 2 repos (basically I want to delete everything in stag and replace with this, save the readme), also how do we preserve these discussions for the future?

To merge it will be a little tricky, we will have to see the best approach. For preserving issues and history, archiving one of the repos is a good way to while still clearly deprecating it. Otherwise it is also possible to transfer issues between repos.

src/main.cpp Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not call this main.cpp since it's not really a usable executable of the library. It's a useful usage example so could be named example.cpp, or otherwise just put as a code snippet in the README. Similarly, I wouldn't build this executable in the CMakeList by default, unless BUILD_TESTING or a similar flag is true.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, I was following what was done in the original stag repo. But it did feel out of place definitely

@@ -106,7 +106,7 @@ double fastsqrt (double y) {
tempf = y;
*tfptr = (0xbfcdd90a - *tfptr)>>1; /* estimate of 1/sqrt(y) */
x = tempf;
z = y*0.5; /* hoist out the �/2� */
z = y*0.5; /* hoist out the �/2� */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird characters

Suggested change
z = y*0.5; /* hoist out the �/2� */
z = y*0.5; /* hoist out the "/2" */

@JasperTan97
Copy link
Author

To merge it will be a little tricky, we will have to see the best approach. For preserving issues and history, archiving one of the repos is a good way to while still clearly deprecating it. Otherwise it is also possible to transfer issues between repos.

I think one easy way is to archive the previous stag repo and call this stag in its place. But we can open another issue there. Then when we settle on a good method for this, then we can retroactively make the changelog

@JasperTan97 JasperTan97 merged commit f1452d2 into master Feb 8, 2024
@JasperTan97 JasperTan97 deleted the chore/updateRepo branch February 9, 2024 10:21
eeberhard pushed a commit that referenced this pull request Feb 12, 2024
* remove git modules

* update file structure to group by src and include dirs

* remove large fixture files

* update CMakeLists for better portability

The previous AICA fork has been archived at the following URL:
- https://github.com/aica-technology/stag-archive
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants