Added custom task (PullCubeTool-v1) with motion planner + Motion Planner for LiftPegUpright and PullCube #641
Merged
57 commits
f150746 completed v1 of pull cube implementation (Viswesh-N)
c2f7732 fix ground truth info, robot spawn (Viswesh-N)
ef8f578 fixed init files (Viswesh-N)
d6d74a9 removed runs directory (Viswesh-N)
ccbd50a fixed handle bugs (Viswesh-N)
59e84b7 Merge branch 'main' of github.com:Viswesh-N/ManiSkill (Viswesh-N)
cff9005 fixed handle bug (Viswesh-N)
4c5fc54 code refactor, requested changes (Viswesh-N)
f79e683 fixed cube and tool spawn (Viswesh-N)
736281c added working env implementation. TODO: check dense reward (Viswesh-N)
32ddccc modified reward function to reflect task better (Viswesh-N)
23dfa7e removed cube and tool pose from default obs (Viswesh-N)
4f3dc3c changed reward and episode length to observe behavior (Viswesh-N)
0f79d8d bug fix (Viswesh-N)
112a3a3 Update pull_cube_tool.py (Viswesh-N)
decbce9 merged w main, fixed a few bugs (Viswesh-N)
186e189 Merge branch 'main' of github.com:Viswesh-N/ManiSkill (Viswesh-N)
5ca6f62 Merge branch 'haosulab:main' into main (Viswesh-N)
ea0eaba added draft motion planning code (Viswesh-N)
9c16e34 Merge branch 'main' of github.com:Viswesh-N/ManiSkill (Viswesh-N)
f492bda merged w main, fixed bugs (Viswesh-N)
683fa26 modified planner code to include new solution (Viswesh-N)
d6cd604 modified cube spawn, bug fixes (Viswesh-N)
832db07 working planner (Viswesh-N)
9416382 working planner (Viswesh-N)
77d6941 modified spawn conditions (Viswesh-N)
7aaf829 Merge branch 'haosulab:main' into main (Viswesh-N)
eb14f2c Update pull_cube_tool.py (Viswesh-N)
ef014e1 changed env (Viswesh-N)
de2bf98 fixed all bugs in planner (Viswesh-N)
0f1c3b8 fixed all bugs in planner (Viswesh-N)
e1fe3a5 cleaner grasp (Viswesh-N)
53be1d9 Merge branch 'haosulab:main' into main (Viswesh-N)
eb8ea40 cleaned up grasp (Viswesh-N)
957ee93 changed render colour (Viswesh-N)
22061d0 Merge branch 'main' of github.com:Viswesh-N/ManiSkill (Viswesh-N)
5e7283b added peg motion planner - buggy (Viswesh-N)
5cdcd0b removed env success bugs (Viswesh-N)
1362ad4 removed env success bugs (Viswesh-N)
73dce20 removed env success bugs (Viswesh-N)
140b0b4 removed env success bugs (Viswesh-N)
9070e25 improved planner success and cube spawn (Viswesh-N)
6f255a7 modified reward for learning (Viswesh-N)
eb126d5 reward modified (Viswesh-N)
a5ae4ca Merge branch 'haosulab:main' into main (Viswesh-N)
c827308 Merge branch 'haosulab:main' into main (Viswesh-N)
9553f8a working LiftPeg and PullCubeTool planner with high accuracy (Viswesh-N)
a729e88 Merge branch 'haosulab:main' into main (Viswesh-N)
0426d6f added better reward learning (Viswesh-N)
a8fecd9 fully working planners and envs (Viswesh-N)
bfd3314 working dense reward for pullcube (Viswesh-N)
c0f7c37 Merge branch 'haosulab:main' into main (Viswesh-N)
9068f03 Merge branch 'haosulab:main' into main (Viswesh-N)
3ec00fb added planner for pullcube env (Viswesh-N)
fa16ffb added working planner for pullcube env (Viswesh-N)
3751c29 Update pull_cube.py (Viswesh-N)
95ea99d Update pull_cube.py (Viswesh-N)
@@ -0,0 +1,268 @@
from typing import Any, Dict, Union
import numpy as np
import torch
import sapien
from mani_skill.agents.robots import Fetch, Panda
from mani_skill.envs.sapien_env import BaseEnv
from mani_skill.envs.utils import randomization
from mani_skill.sensors.camera import CameraConfig
from mani_skill.utils import sapien_utils
from mani_skill.utils.building import actors
from mani_skill.utils.registration import register_env
from mani_skill.utils.scene_builder.table import TableSceneBuilder
from mani_skill.utils.structs import Pose
from mani_skill.utils.structs.types import GPUMemoryConfig, SimConfig


@register_env("PullCubeTool-v1", max_episode_steps=100)
class PullCubeToolEnv(BaseEnv):
    """
    Task Description
    -----------------
    Given an L-shaped tool that is within the reach of the robot, leverage the
    tool to pull a cube that is out of its reach

    Randomizations
    ---------------
    - The cube's position (x,y) is randomized on top of a table in the region "<out of manipulator
    reach, but within reach of tool>". It is placed flat on the table
    - The target goal region is the region on top of the table marked by "<within reach of arm>"

    Success Conditions
    -----------------
    - The cube's xy position is within the goal region of the arm's base (marked by reachability)
    """

    SUPPORTED_ROBOTS = ["panda", "fetch"]
    SUPPORTED_REWARD_MODES = ("normalized_dense", "dense", "sparse", "none")
    agent: Union[Panda, Fetch]

    goal_radius = 0.3
    cube_half_size = 0.02
    handle_length = 0.2
    hook_length = 0.05
    width = 0.05
    height = 0.05
    cube_size = 0.02
    arm_reach = 0.35

    def __init__(self, *args, robot_uids="panda", robot_init_qpos_noise=0.02, **kwargs):
        self.robot_init_qpos_noise = robot_init_qpos_noise
        super().__init__(*args, robot_uids=robot_uids, **kwargs)

    @property
    def _default_sim_config(self):
        return SimConfig(
            gpu_memory_config=GPUMemoryConfig(
                found_lost_pairs_capacity=2**25, max_rigid_patch_count=2**18
            )
        )

    @property
    def _default_sensor_configs(self):
        pose = sapien_utils.look_at(eye=[0.3, 0, 0.5], target=[-0.1, 0, 0.1])
        return [
            CameraConfig(
                "base_camera",
                pose=pose,
                width=128,
                height=128,
                fov=np.pi / 2,
                near=0.01,
                far=100,
            )
        ]

    @property
    def _default_human_render_camera_configs(self):
        pose = sapien_utils.look_at([0.6, 0.7, 0.6], [0.0, 0.0, 0.35])
        return [
            CameraConfig(
                "render_camera",
                pose=pose,
                width=512,
                height=512,
                fov=1,
                near=0.01,
                far=100,
            )
        ]

    def _build_l_shaped_tool(self, handle_length, hook_length, width, height):
        builder = self.scene.create_actor_builder()

        mat = sapien.render.RenderMaterial()
        mat.set_base_color([1, 0, 0, 1])
        mat.metallic = 1.0
        mat.roughness = 0.0
        mat.specular = 1.0

        # Handle: long box extending along +x from the tool origin
        builder.add_box_collision(
            sapien.Pose([handle_length / 2, 0, 0]),
            [handle_length / 2, width / 2, height / 2],
            density=500,
        )
        builder.add_box_visual(
            sapien.Pose([handle_length / 2, 0, 0]),
            [handle_length / 2, width / 2, height / 2],
            material=mat,
        )

        # Hook: short box at the far end of the handle, offset along +y
        builder.add_box_collision(
            sapien.Pose([handle_length - hook_length / 2, width, 0]),
            [hook_length / 2, width, height / 2],
        )
        builder.add_box_visual(
            sapien.Pose([handle_length - hook_length / 2, width, 0]),
            [hook_length / 2, width, height / 2],
            material=mat,
        )

        return builder.build(name="l_shape_tool")

    def _load_scene(self, options: dict):
        self.scene_builder = TableSceneBuilder(
            self, robot_init_qpos_noise=self.robot_init_qpos_noise
        )
        self.scene_builder.build()

        self.cube = actors.build_cube(
            self.scene,
            half_size=self.cube_half_size,
            color=np.array([12, 42, 160, 255]) / 255,
            name="cube",
            body_type="dynamic",
        )

        self.l_shape_tool = self._build_l_shaped_tool(
            handle_length=self.handle_length,
            hook_length=self.hook_length,
            width=self.width,
            height=self.height,
        )

    def _initialize_episode(self, env_idx: torch.Tensor, options: dict):
        with torch.device(self.device):
            b = len(env_idx)
            self.scene_builder.initialize(env_idx)

            tool_xyz = torch.zeros((b, 3), device=self.device)
            tool_xyz[..., :2] = -torch.rand((b, 2), device=self.device) * 0.2 - 0.1
            tool_xyz[..., 2] = self.height / 2
            tool_q = torch.tensor([1, 0, 0, 0], device=self.device).expand(b, 4)

            tool_pose = Pose.create_from_pq(p=tool_xyz, q=tool_q)
            self.l_shape_tool.set_pose(tool_pose)

            cube_xyz = torch.zeros((b, 3), device=self.device)
            cube_xyz[..., 0] = self.arm_reach + torch.rand(b, device=self.device) * (
                self.handle_length
            ) - 0.3
            cube_xyz[..., 1] = torch.rand(b, device=self.device) * 0.3 - 0.25
            cube_xyz[..., 2] = self.cube_size / 2 + 0.015

            cube_q = randomization.random_quaternions(
                b,
                lock_x=True,
                lock_y=True,
                lock_z=False,
                bounds=(-np.pi / 6, np.pi / 6),
                device=self.device,
            )

            cube_pose = Pose.create_from_pq(p=cube_xyz, q=cube_q)
            self.cube.set_pose(cube_pose)

    def _get_obs_extra(self, info: Dict):
        obs = dict(
            tcp_pose=self.agent.tcp.pose.raw_pose,
        )

        # Ground-truth poses are only exposed in state-based observation modes
        if self._obs_mode in ["state", "state_dict"]:
            obs.update(
                cube_pose=self.cube.pose.raw_pose,
                tool_pose=self.l_shape_tool.pose.raw_pose,
            )

        return obs

    def evaluate(self):
        cube_pos = self.cube.pose.p

        robot_base_pos = self.agent.robot.get_links()[0].pose.p

        cube_to_base_dist = torch.linalg.norm(cube_pos[:, :2] - robot_base_pos[:, :2], dim=1)

        # Success condition - cube is pulled close enough
        cube_pulled_close = cube_to_base_dist < 0.6

        workspace_center = robot_base_pos.clone()
        workspace_center[:, 0] += self.arm_reach * 0.1
        cube_to_workspace_dist = torch.linalg.norm(cube_pos - workspace_center, dim=1)
        progress = 1 - torch.tanh(3.0 * cube_to_workspace_dist)

        return {
            "success": cube_pulled_close,
            "success_once": cube_pulled_close,
            "success_at_end": cube_pulled_close,
            "cube_progress": progress.mean(),
            "cube_distance": cube_to_workspace_dist.mean(),
            "reward": self.compute_normalized_dense_reward(None, None, {"success": cube_pulled_close}),
        }

    def compute_dense_reward(self, obs: Any, action: torch.Tensor, info: Dict):

        tcp_pos = self.agent.tcp.pose.p
        cube_pos = self.cube.pose.p
        tool_pos = self.l_shape_tool.pose.p
        robot_base_pos = self.agent.robot.get_links()[0].pose.p

        # Stage 1: Reach and grasp tool
        tool_grasp_pos = tool_pos + torch.tensor([0.02, 0, 0], device=self.device)
        tcp_to_tool_dist = torch.linalg.norm(tcp_pos - tool_grasp_pos, dim=1)
        reaching_reward = 2.0 * (1 - torch.tanh(5.0 * tcp_to_tool_dist))

        # Add specific grasping reward
        is_grasping = self.agent.is_grasping(self.l_shape_tool, max_angle=20)
        grasping_reward = 2.0 * is_grasping
        tool_reached = tcp_to_tool_dist < 0.01

        # Stage 2: Position tool behind cube
        ideal_hook_pos = cube_pos + torch.tensor(
            [-(self.hook_length + self.cube_half_size), -0.067, 0],
            device=self.device
        )
        tool_positioning_dist = torch.linalg.norm(tool_pos - ideal_hook_pos, dim=1)
        positioning_reward = 1.5 * (1 - torch.tanh(3.0 * tool_positioning_dist))
        tool_positioned = tool_positioning_dist < 0.05

        # Stage 3: Pull cube to workspace
        workspace_target = robot_base_pos + torch.tensor([0.05, 0, 0], device=self.device)
        cube_to_workspace_dist = torch.linalg.norm(cube_pos - workspace_target, dim=1)
        initial_dist = torch.linalg.norm(
            torch.tensor([self.arm_reach + 0.1, 0, self.cube_size/2], device=self.device) - workspace_target,
            dim=1
        )
        pulling_progress = (initial_dist - cube_to_workspace_dist) / initial_dist
        pulling_reward = 3.0 * pulling_progress * tool_positioned

        # Combine rewards with staging and grasping dependency
        reward = reaching_reward + grasping_reward
        reward += positioning_reward * is_grasping
        reward += pulling_reward * is_grasping

        # Penalties
        cube_pushed_away = cube_pos[:, 0] > (self.arm_reach + 0.15)
        reward[cube_pushed_away] -= 2.0

        # Success bonus
        if "success" in info:
            reward[info["success"]] += 5.0

        return reward

    def compute_normalized_dense_reward(self, obs: Any, action: torch.Tensor, info: Dict):
        """
        Scales the dense reward by the success bonus (5.0).

        The reaching, grasping, positioning and pulling terms are added on top of
        the bonus, so the scaled value is not bounded by 1.
        """
        max_reward = 5.0  # Success bonus only, not the sum of all reward terms
        dense_reward = self.compute_dense_reward(obs=obs, action=action, info=info)
        return dense_reward / max_reward
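The normalization above divides by the success bonus alone, so it helps to total the upper bounds of the individual reward terms. The sketch below is standalone plain Python, not ManiSkill code: the helper names `shaped` and `dense_upper_bound` are hypothetical, and the weights and tanh scales are copied from `compute_dense_reward` in the diff.

```python
import math

# Weights copied from compute_dense_reward in the diff above.
REACH_W, GRASP_W, POSITION_W, PULL_W, SUCCESS_BONUS = 2.0, 2.0, 1.5, 3.0, 5.0


def shaped(weight: float, scale: float, dist: float) -> float:
    """Distance shaping used by the reaching and positioning terms."""
    return weight * (1.0 - math.tanh(scale * dist))


def dense_upper_bound() -> float:
    """Sum of each term's maximum: zero distances, full pulling progress, success."""
    reach = shaped(REACH_W, 5.0, 0.0)        # 2.0 at zero distance
    position = shaped(POSITION_W, 3.0, 0.0)  # 1.5 at zero distance
    return reach + GRASP_W + position + PULL_W + SUCCESS_BONUS


bound = dense_upper_bound()
print(bound)                  # 13.5
print(bound / SUCCESS_BONUS)  # 2.7
```

Whether every term can peak in the same state is a separate question. The bound only shows that dividing by 5.0 does not cap the result at 1, which may matter for code that assumes the normalized reward stays in [0, 1].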
Following the setup of other envs (e.g. push cube), can you ensure the ground truth info (tool pose, cube pose) is only provided if the observation is state-based?
This is still not resolved: cube pose and tool pose are always included.
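The behaviour requested in this thread can be illustrated with a minimal standalone sketch. It is not the environment's `_get_obs_extra`: the function `build_obs_extra` is a hypothetical stand-in, the poses are placeholder lists, the state mode names `"state"` and `"state_dict"` are taken from the diff, and `"rgbd"` is used here only as an example of a non-state mode.

```python
from typing import Dict, List

# Observation modes that may expose ground-truth poses (names taken from the diff).
STATE_OBS_MODES = ("state", "state_dict")


def build_obs_extra(
    obs_mode: str,
    tcp_pose: List[float],
    cube_pose: List[float],
    tool_pose: List[float],
) -> Dict[str, List[float]]:
    """Hypothetical stand-in for _get_obs_extra: gate ground truth on the obs mode."""
    obs = {"tcp_pose": tcp_pose}  # robot-side information, always included
    if obs_mode in STATE_OBS_MODES:
        # Privileged object poses only for state-based observations
        obs["cube_pose"] = cube_pose
        obs["tool_pose"] = tool_pose
    return obs


pose = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]  # placeholder 7-D pose (position + quaternion)
visual_keys = sorted(build_obs_extra("rgbd", pose, pose, pose))
state_keys = sorted(build_obs_extra("state", pose, pose, pose))
print(visual_keys)  # ['tcp_pose']
print(state_keys)   # ['cube_pose', 'tcp_pose', 'tool_pose']
```

A check along these lines, run once per observation mode, is one way to confirm that the privileged poses do not leak into visual observations.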