-
-
Notifications
You must be signed in to change notification settings - Fork 469
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed crash when raycasting while map is updated. #1793
Fixed crash when raycasting while map is updated. #1793
Conversation
Possibly due to map not being updated on main thread when not explicitly using QueuedConnection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not yet convinced that the described bug is related to a threading issue.
The topic subscriber uses update_nh_
:
rviz/src/rviz/default_plugin/map_display.cpp
Line 441 in 68ce34b
map_sub_ = update_nh_.subscribe(topic_property_->getTopicStd(), 1, &MapDisplay::incomingMap, |
which already should ensure that ROS callbacks of that subscriber will be handled in the main GUI thread:
rviz/src/rviz/visualization_manager.cpp
Line 334 in 68ce34b
ros::spinOnce(); |
Can you confirm (via a backtrace) that the crash doesn't occur in the main thread?
Just to clarify: I trust you stating that this PR reliably fixes your issue, but I don't yet understand how/why. |
Do you use standalone rviz or some tool build upon it? |
I've used a blank default rviz config on my desktop where I don't have the fix. Just added a map display and our waypoint tool which uses the Here's the backtrace:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the backtrace. This proves my assumption that everything is handled within the same (main) thread.
I'm still hesitating to merge the PR as is, because I still don't understand why the crash only occurs with your tool. Obviously, normal rendering of an immediately updated map works w/o issues. So, why does it fail when your tools is called in between?
How does your tool modify the internal data structures to cause the crash?
Could you please provide your tool's source for further analysis with address sanitizer tools?
This PR essentially introduces a one-frame delay between receiving map updates and displaying them. If you have a very slowly progressing ROS time (e.g. from sim) this should become noticeable.
I can upload the whole code but the relevant portion should be this: void WaypointTool::onInitialize()
{
rviz::InteractionTool::onInitialize();
preview_mesh_ = rviz::loadMeshFromResource( WAYPOINT_PREVIEW_RESOURCE );
if ( preview_mesh_.isNull())
{
ROS_ERROR( "Failed to load waypoint tool preview mesh." );
return;
}
arrow_mesh_ = rviz::loadMeshFromResource( WAYPOINT_ARROW_RESOURCE );
if ( arrow_mesh_.isNull())
{
ROS_ERROR( "Failed to load waypoint tool orientation preview mesh." );
return;
}
waypoint_preview_node_ = createWaypointNode();
waypoint_preview_node_->setVisible( false );
}
Ogre::SceneNode *WaypointTool::createWaypointNode()
{
Ogre::SceneNode *node = scene_manager_->getRootSceneNode()->createChildSceneNode();
Ogre::Entity *entity = scene_manager_->createEntity( preview_mesh_ );
node->attachObject( entity );
node->setScale( WAYPOINT_PREVIEW_SCALE, WAYPOINT_PREVIEW_SCALE, WAYPOINT_PREVIEW_SCALE );
Ogre::SceneNode *arrow_node = node->createChildSceneNode();
entity = scene_manager_->createEntity( arrow_mesh_ );
arrow_node->attachObject( entity );
arrow_node->setVisible( false );
arrow_node->setScale( 30 * WAYPOINT_PREVIEW_SCALE, 30 * WAYPOINT_PREVIEW_SCALE, 30 * WAYPOINT_PREVIEW_SCALE );
return node;
}
int WaypointTool::processMouseEvent( rviz::ViewportMouseEvent &event )
{
if ( waypoint_preview_node_ == nullptr ) return rviz::InteractionTool::processMouseEvent( event );
// Perform raycast
waypoint_preview_node_->setVisible( false ); // Disable for raycast
Ogre::Vector3 intersection;
if ( !raycast3d_property_->getBool() || !context_->getSelectionManager()->get3DPoint( event.viewport, event.x, event.y, intersection ))
{
// Fall back to plane intersection
Ogre::Plane ground_plane( Ogre::Vector3::UNIT_Z, 0.f );
if ( !rviz::getPointOnPlaneFromWindowXY( event.viewport, ground_plane, event.x, event.y, intersection ))
{
return Render;
}
}
// Show preview node at current intersection unless we are in the process of placing a waypoint
waypoint_preview_node_->setVisible( active_waypoint_ == nullptr, false );
waypoint_preview_node_->setPosition( intersection );
return rviz::InteractionTool::processMouseEvent( event );
} I've removed the branches that are definitely not being hit from the processMouseEvent for easier readability. |
I modified the |
I just did the same and it crashes just like with our tool. diff --git a/rviz_plugin_tutorials/src/plant_flag_tool.cpp b/rviz_plugin_tutorials/src/plant_flag_tool.cpp
index ea2689b..a8e73d0 100644
--- a/rviz_plugin_tutorials/src/plant_flag_tool.cpp
+++ b/rviz_plugin_tutorials/src/plant_flag_tool.cpp
@@ -38,6 +38,7 @@
#include <rviz/mesh_loader.h>
#include <rviz/geometry.h>
#include <rviz/properties/vector_property.h>
+#include <rviz/selection/selection_manager.h>
#include "plant_flag_tool.h"
@@ -169,9 +170,8 @@ int PlantFlagTool::processMouseEvent( rviz::ViewportMouseEvent& event )
}
Ogre::Vector3 intersection;
Ogre::Plane ground_plane( Ogre::Vector3::UNIT_Z, 0.0f );
- if( rviz::getPointOnPlaneFromWindowXY( event.viewport,
- ground_plane,
- event.x, event.y, intersection ))
+ moving_flag_node_->setVisible(false);
+ if( context_->getSelectionManager()->get3DPoint( event.viewport, event.x, event.y, intersection ))
{
moving_flag_node_->setVisible( true );
moving_flag_node_->setPosition( intersection );
Do you also have an updating map? 1Hz should be sufficient. Same backtrace
|
I used the moving (but planar) map from |
Here's a 30s bag file 2023-05-08-19-21-33.zip. |
Hm. Still can't reproduce your segfault. I guess you use the latest rviz from source? |
Same result with latest Noetic release. I suggest you prepare a self-contained environment (e.g. a docker image) to reliably reproduce the problem. |
The issue contains the details for my laptop.
Nvidia driver versions are 530 on the laptop 3060, and 515 on my desktop.
That seems like a lot of work for a minor change, I can also just continue using our fork if you think this change could have negative consequences. |
We have seen many segfaults due to OpenGL rendering issues on NVIDIA hardware: On your laptop, you probably also have an integrated Intel GPU. Could you please check whether the bug persists on this hardware? To this end use the environment variables:
Logically, the proposed change should not be necessary. If it is for you, I would like to understand why before merging. |
I can reproduce the segfault with NVIDIA hardware. I will try to understand the problem now... |
Any success at finding the cause? |
I didn't find the root cause of the issue. I agree to merge this work-around. Please merge the suggestion I added before: |
Co-authored-by: Robert Haschke <[email protected]>
The queued connection update introduced in ros-visualization#1793 caused the map display to first update the pose of the map and then the map itself (in the next update cycle). This can be perfectly seen when throttling the rviz frame rate. Updates to the visualization should be handled in update() only. Thus, now, if a map update is received, a flag is set to perform the costly map update. If that flag is not set, only the transform is updated.
The queued connection update introduced in ros-visualization#1793 caused the map display to first update the pose of the map and then the map itself (in the next update cycle). This can be perfectly seen when throttling the rviz frame rate. Updates to the visualization should be handled in update() only. Thus, now, if a map update is received, a flag is set to perform the costly map update. If that flag is not set, only the transform is updated.
The queued connection update introduced in #1793 caused the map display to first update the pose of the map and then the map itself (in the next update cycle). The resulting jittering can be perfectly seen when throttling the rviz frame rate. Updates to the visualization should be handled in update() only. Thus, now, if a map update is received, a flag is set to perform the costly map update. If that flag is not set, only the transform is updated.
When using the scene manager raycast in one of our tool rviz crashed when the tool touched the map area while a new map was received.
Apparently, the connection is not queued (anymore?) and therefore the update does not happen on the main thread.
This small change should fix this and is fully ABI compatible.
Fixes #1792