Added custom task (PullCubeTool-v1) with motion planner + Motion Planner for LiftPegUpright and PullCube #641

Merged
57 commits merged on Nov 19, 2024

Changes from all commits
Commits (57)
f150746
completed v1 of pull cube implementation
Viswesh-N Oct 21, 2024
c2f7732
fix ground truth info, robot spawn
Viswesh-N Oct 22, 2024
ef8f578
fixed init files
Viswesh-N Oct 22, 2024
d6d74a9
removed runs directory
Viswesh-N Oct 23, 2024
ccbd50a
fixed handle bugs
Viswesh-N Oct 23, 2024
59e84b7
Merge branch 'main' of github.com:Viswesh-N/ManiSkill
Viswesh-N Oct 23, 2024
cff9005
fixed handle bug
Viswesh-N Oct 23, 2024
4c5fc54
code refactor, requested changes
Viswesh-N Oct 23, 2024
f79e683
fixed cube and tool spawn
Viswesh-N Oct 23, 2024
736281c
added working env implementation. TODO: check dense reward
Viswesh-N Oct 24, 2024
32ddccc
modified reward function to reflect task better
Viswesh-N Oct 24, 2024
23dfa7e
removed cube and tool pose from default obs
Viswesh-N Oct 24, 2024
4f3dc3c
changed reward and episode length to observe behavior
Viswesh-N Oct 25, 2024
0f79d8d
bug fix
Viswesh-N Oct 25, 2024
112a3a3
Update pull_cube_tool.py
Viswesh-N Oct 25, 2024
decbce9
merged w main, fixed a few bugs
Viswesh-N Oct 25, 2024
186e189
Merge branch 'main' of github.com:Viswesh-N/ManiSkill
Viswesh-N Oct 25, 2024
5ca6f62
Merge branch 'haosulab:main' into main
Viswesh-N Oct 25, 2024
ea0eaba
added draft motion planning code
Viswesh-N Oct 25, 2024
9c16e34
Merge branch 'main' of github.com:Viswesh-N/ManiSkill
Viswesh-N Oct 25, 2024
f492bda
merged w main, fixed bugs
Viswesh-N Oct 25, 2024
683fa26
modified planner code to include new solution
Viswesh-N Oct 25, 2024
d6cd604
modified cube spawn, bug fixes
Viswesh-N Oct 26, 2024
832db07
working planner
Viswesh-N Oct 26, 2024
9416382
working planner
Viswesh-N Oct 26, 2024
77d6941
modified spawn conditions
Viswesh-N Oct 26, 2024
7aaf829
Merge branch 'haosulab:main' into main
Viswesh-N Oct 26, 2024
eb14f2c
Update pull_cube_tool.py
Viswesh-N Oct 26, 2024
ef014e1
changed env
Viswesh-N Oct 26, 2024
de2bf98
fixed all bugs in planner
Viswesh-N Oct 26, 2024
0f1c3b8
fixed all bugs in planner
Viswesh-N Oct 26, 2024
e1fe3a5
cleaner grasp
Viswesh-N Oct 27, 2024
53be1d9
Merge branch 'haosulab:main' into main
Viswesh-N Nov 1, 2024
eb8ea40
cleaned up grasp
Viswesh-N Nov 2, 2024
957ee93
changed render colour
Viswesh-N Nov 2, 2024
22061d0
Merge branch 'main' of github.com:Viswesh-N/ManiSkill
Viswesh-N Nov 2, 2024
5e7283b
added peg motion planner - buggy
Viswesh-N Nov 2, 2024
5cdcd0b
removed env success bugs
Viswesh-N Nov 2, 2024
1362ad4
removed env success bugs
Viswesh-N Nov 2, 2024
73dce20
removed env success bugs
Viswesh-N Nov 2, 2024
140b0b4
removed env success bugs
Viswesh-N Nov 2, 2024
9070e25
improved planner success and cube spawn
Viswesh-N Nov 2, 2024
6f255a7
modified reward for learning
Viswesh-N Nov 2, 2024
eb126d5
reward modified
Viswesh-N Nov 3, 2024
a5ae4ca
Merge branch 'haosulab:main' into main
Viswesh-N Nov 5, 2024
c827308
Merge branch 'haosulab:main' into main
Viswesh-N Nov 8, 2024
9553f8a
working LiftPeg and PullCubeTool planner with high accuracy
Viswesh-N Nov 8, 2024
a729e88
Merge branch 'haosulab:main' into main
Viswesh-N Nov 9, 2024
0426d6f
added better reward learning
Viswesh-N Nov 9, 2024
a8fecd9
fully working planners and envs
Viswesh-N Nov 10, 2024
bfd3314
working dense reward for pullcube
Viswesh-N Nov 15, 2024
c0f7c37
Merge branch 'haosulab:main' into main
Viswesh-N Nov 15, 2024
9068f03
Merge branch 'haosulab:main' into main
Viswesh-N Nov 16, 2024
3ec00fb
added planner for pullcube env
Viswesh-N Nov 16, 2024
fa16ffb
added working planner for pullcube env
Viswesh-N Nov 16, 2024
3751c29
Update pull_cube.py
Viswesh-N Nov 16, 2024
95ea99d
Update pull_cube.py
Viswesh-N Nov 16, 2024
2 changes: 1 addition & 1 deletion examples/baselines/ppo/ppo.py
@@ -468,4 +468,4 @@ def clip_action(action: torch.Tensor):
print(f"model saved to {model_path}")
logger.close()
envs.close()
eval_envs.close()
eval_envs.close()
3 changes: 2 additions & 1 deletion mani_skill/envs/tasks/tabletop/__init__.py
@@ -14,4 +14,5 @@
from .poke_cube import PokeCubeEnv
from .place_sphere import PlaceSphereEnv
from .roll_ball import RollBallEnv
from .push_t import PushTEnv
from .push_t import PushTEnv
from .pull_cube_tool import PullCubeToolEnv
2 changes: 1 addition & 1 deletion mani_skill/envs/tasks/tabletop/pull_cube.py
@@ -135,4 +135,4 @@ def compute_dense_reward(self, obs: Any, action: Array, info: Dict):

def compute_normalized_dense_reward(self, obs: Any, action: Array, info: Dict):
max_reward = 3.0
return self.compute_dense_reward(obs=obs, action=action, info=info) / max_reward
return self.compute_dense_reward(obs=obs, action=action, info=info) / max_reward
268 changes: 268 additions & 0 deletions mani_skill/envs/tasks/tabletop/pull_cube_tool.py
@@ -0,0 +1,268 @@
from typing import Any, Dict, Union
import numpy as np
import torch
import sapien
from mani_skill.agents.robots import Fetch, Panda
from mani_skill.envs.sapien_env import BaseEnv
from mani_skill.envs.utils import randomization
from mani_skill.sensors.camera import CameraConfig
from mani_skill.utils import sapien_utils
from mani_skill.utils.building import actors
from mani_skill.utils.registration import register_env
from mani_skill.utils.scene_builder.table import TableSceneBuilder
from mani_skill.utils.structs import Pose
from mani_skill.utils.structs.types import GPUMemoryConfig, SimConfig


@register_env("PullCubeTool-v1", max_episode_steps=100)
class PullCubeToolEnv(BaseEnv):
"""
Task Description
-----------------
Given an L-shaped tool that is within the robot's reach, use the tool to pull a
cube that is otherwise out of the robot's reach

Randomizations
---------------
- The cube's position (x,y) is randomized on top of a table in the region "<out of manipulator
reach, but within reach of tool>". It is placed flat on the table
- The target goal region is the region on top of the table marked by "<within reach of arm>"

Success Conditions
-----------------
- The cube's xy position is within the goal region around the arm's base (i.e., the cube has been pulled back within reach)
"""

SUPPORTED_ROBOTS = ["panda", "fetch"]
SUPPORTED_REWARD_MODES = ("normalized_dense", "dense", "sparse", "none")
agent: Union[Panda, Fetch]

goal_radius = 0.3
cube_half_size = 0.02
handle_length = 0.2
hook_length = 0.05
width = 0.05
height = 0.05
cube_size = 0.02
arm_reach = 0.35

def __init__(self, *args, robot_uids="panda", robot_init_qpos_noise=0.02, **kwargs):
self.robot_init_qpos_noise = robot_init_qpos_noise
super().__init__(*args, robot_uids=robot_uids, **kwargs)

@property
def _default_sim_config(self):
return SimConfig(
gpu_memory_config=GPUMemoryConfig(
found_lost_pairs_capacity=2**25, max_rigid_patch_count=2**18
)
)

@property
def _default_sensor_configs(self):
pose = sapien_utils.look_at(eye=[0.3, 0, 0.5], target=[-0.1, 0, 0.1])
return [
CameraConfig(
"base_camera",
pose=pose,
width=128,
height=128,
fov=np.pi / 2,
near=0.01,
far=100,
)
]

@property
def _default_human_render_camera_configs(self):
pose = sapien_utils.look_at([0.6, 0.7, 0.6], [0.0, 0.0, 0.35])
return [
CameraConfig(
"render_camera",
pose=pose,
width=512,
height=512,
fov=1,
near=0.01,
far=100,
)
]

def _build_l_shaped_tool(self, handle_length, hook_length, width, height):
builder = self.scene.create_actor_builder()

mat = sapien.render.RenderMaterial()
mat.set_base_color([1, 0, 0, 1])
mat.metallic = 1.0
mat.roughness = 0.0
mat.specular = 1.0

builder.add_box_collision(
sapien.Pose([handle_length / 2, 0, 0]),
[handle_length / 2, width / 2, height / 2],
density=500,
)
builder.add_box_visual(
sapien.Pose([handle_length / 2, 0, 0]),
[handle_length / 2, width / 2, height / 2],
material=mat,
)

builder.add_box_collision(
sapien.Pose([handle_length - hook_length / 2, width, 0]),
[hook_length / 2, width, height / 2],
)
builder.add_box_visual(
sapien.Pose([handle_length - hook_length / 2, width, 0]),
[hook_length / 2, width, height / 2],
material=mat,
)

return builder.build(name="l_shape_tool")

def _load_scene(self, options: dict):
self.scene_builder = TableSceneBuilder(
self, robot_init_qpos_noise=self.robot_init_qpos_noise
)
self.scene_builder.build()

self.cube = actors.build_cube(
self.scene,
half_size=self.cube_half_size,
color=np.array([12, 42, 160, 255]) / 255,
name="cube",
body_type="dynamic",
)

self.l_shape_tool = self._build_l_shaped_tool(
handle_length=self.handle_length,
hook_length=self.hook_length,
width=self.width,
height=self.height,
)


def _initialize_episode(self, env_idx: torch.Tensor, options: dict):
with torch.device(self.device):
b = len(env_idx)
self.scene_builder.initialize(env_idx)

tool_xyz = torch.zeros((b, 3), device=self.device)
tool_xyz[..., :2] = - torch.rand((b, 2), device=self.device) * 0.2 - 0.1
tool_xyz[..., 2] = self.height / 2
tool_q = torch.tensor([1, 0, 0, 0], device=self.device).expand(b, 4)

tool_pose = Pose.create_from_pq(p=tool_xyz, q=tool_q)
self.l_shape_tool.set_pose(tool_pose)

cube_xyz = torch.zeros((b, 3), device=self.device)
cube_xyz[..., 0] = self.arm_reach + torch.rand(b, device=self.device) * self.handle_length - 0.3
cube_xyz[..., 1] = torch.rand(b, device=self.device) * 0.3 - 0.25
cube_xyz[..., 2] = self.cube_size / 2 + 0.015

cube_q = randomization.random_quaternions(
b,
lock_x=True,
lock_y=True,
lock_z=False,
bounds=(-np.pi / 6, np.pi / 6),
device=self.device,
)

cube_pose = Pose.create_from_pq(p=cube_xyz, q=cube_q)
self.cube.set_pose(cube_pose)

def _get_obs_extra(self, info: Dict):
obs = dict(
tcp_pose=self.agent.tcp.pose.raw_pose,
)
Member (review comment): Following the setup of other envs (e.g. push cube), can you ensure the ground truth info (tool pose, cube pose) are only provided if the observation is state based

Member (review comment): this is still not resolved. cube pose and tool pose are always included

if self._obs_mode in ["state", "state_dict"]:
obs.update(
cube_pose=self.cube.pose.raw_pose,
tool_pose=self.l_shape_tool.pose.raw_pose,
)

return obs

def evaluate(self):
cube_pos = self.cube.pose.p

robot_base_pos = self.agent.robot.get_links()[0].pose.p

cube_to_base_dist = torch.linalg.norm(cube_pos[:, :2] - robot_base_pos[:, :2], dim=1)

# Success condition - cube is pulled close enough
cube_pulled_close = cube_to_base_dist < 0.6

workspace_center = robot_base_pos.clone()
workspace_center[:, 0] += self.arm_reach * 0.1
cube_to_workspace_dist = torch.linalg.norm(cube_pos - workspace_center, dim=1)
progress = 1 - torch.tanh(3.0 * cube_to_workspace_dist)

return {
"success": cube_pulled_close,
"success_once": cube_pulled_close,
"success_at_end": cube_pulled_close,
"cube_progress": progress.mean(),
"cube_distance": cube_to_workspace_dist.mean(),
"reward": self.compute_normalized_dense_reward(None, None, {"success": cube_pulled_close}),
}

def compute_dense_reward(self, obs: Any, action: torch.Tensor, info: Dict):

tcp_pos = self.agent.tcp.pose.p
cube_pos = self.cube.pose.p
tool_pos = self.l_shape_tool.pose.p
robot_base_pos = self.agent.robot.get_links()[0].pose.p

# Stage 1: Reach and grasp tool
tool_grasp_pos = tool_pos + torch.tensor([0.02, 0, 0], device=self.device)
tcp_to_tool_dist = torch.linalg.norm(tcp_pos - tool_grasp_pos, dim=1)
reaching_reward = 2.0 * (1 - torch.tanh(5.0 * tcp_to_tool_dist))

# Add specific grasping reward
is_grasping = self.agent.is_grasping(self.l_shape_tool, max_angle=20)
grasping_reward = 2.0 * is_grasping
tool_reached = tcp_to_tool_dist < 0.01

# Stage 2: Position tool behind cube
ideal_hook_pos = cube_pos + torch.tensor(
[-(self.hook_length + self.cube_half_size), -0.067, 0],
device=self.device
)
tool_positioning_dist = torch.linalg.norm(tool_pos - ideal_hook_pos, dim=1)
positioning_reward = 1.5 * (1 - torch.tanh(3.0 * tool_positioning_dist))
tool_positioned = tool_positioning_dist < 0.05

# Stage 3: Pull cube to workspace
workspace_target = robot_base_pos + torch.tensor([0.05, 0, 0], device=self.device)
cube_to_workspace_dist = torch.linalg.norm(cube_pos - workspace_target, dim=1)
initial_dist = torch.linalg.norm(
torch.tensor([self.arm_reach + 0.1, 0, self.cube_size/2], device=self.device) - workspace_target,
dim=1
)
pulling_progress = (initial_dist - cube_to_workspace_dist) / initial_dist
pulling_reward = 3.0 * pulling_progress * tool_positioned

# Combine rewards with staging and grasping dependency
reward = reaching_reward + grasping_reward
reward += positioning_reward * is_grasping
reward += pulling_reward * is_grasping

# Penalties
cube_pushed_away = cube_pos[:, 0] > (self.arm_reach + 0.15)
reward[cube_pushed_away] -= 2.0

# Success bonus
if "success" in info:
reward[info["success"]] += 5.0

return reward

def compute_normalized_dense_reward(self, obs: Any, action: torch.Tensor, info: Dict):
"""
Normalizes the dense reward by the maximum possible reward (success bonus)
"""
max_reward = 5.0 # Maximum possible reward from success bonus
dense_reward = self.compute_dense_reward(obs=obs, action=action, info=info)
return dense_reward / max_reward
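For reviewers who want to smoke-test the new task, a minimal sketch of creating and stepping the registered environment (illustrative only, not part of this diff; it assumes the usual ManiSkill gym.make keyword arguments such as obs_mode, reward_mode, and robot_uids):

# Illustrative smoke test for PullCubeTool-v1; not part of the PR.
import gymnasium as gym
import mani_skill.envs  # importing the package registers PullCubeTool-v1 via @register_env

env = gym.make(
    "PullCubeTool-v1",
    robot_uids="panda",             # "fetch" is also listed in SUPPORTED_ROBOTS
    obs_mode="state",               # state obs include the cube_pose/tool_pose extras above
    reward_mode="normalized_dense",
)
obs, _ = env.reset(seed=0)
for _ in range(100):                # max_episode_steps for this env
    obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
    if terminated or truncated:
        break
env.close()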
4 changes: 2 additions & 2 deletions mani_skill/examples/motionplanning/panda/motionplanner.py
@@ -170,8 +170,8 @@ def open_gripper(self):
self.base_env.render_human()
return obs, reward, terminated, truncated, info

def close_gripper(self, t=6):
self.gripper_state = CLOSED
def close_gripper(self, t=6, gripper_state=CLOSED):
self.gripper_state = gripper_state
qpos = self.robot.get_qpos()[0, :-2].cpu().numpy()
for i in range(t):
if self.control_mode == "pd_joint_pos":
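The new gripper_state keyword lets a solution command a partial close instead of the fully-closed default, presumably so the new solutions can hook the tool without pinching it hard. A hedged usage sketch (assuming OPEN = 1 and CLOSED = -1 as defined in this module, and that planner is an instance of the panda motion-planning solver):

# Illustrative only: close the gripper to a partial joint target.
# A value between CLOSED (-1) and OPEN (1) gives a partial close.
obs, reward, terminated, truncated, info = planner.close_gripper(t=6, gripper_state=-0.5)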
6 changes: 5 additions & 1 deletion mani_skill/examples/motionplanning/panda/run.py
@@ -9,13 +9,17 @@
import os.path as osp
from mani_skill.utils.wrappers.record import RecordEpisode
from mani_skill.trajectory.merge_trajectory import merge_trajectories
from mani_skill.examples.motionplanning.panda.solutions import solvePushCube, solvePickCube, solveStackCube, solvePegInsertionSide, solvePlugCharger
from mani_skill.examples.motionplanning.panda.solutions import solvePushCube, solvePickCube, solveStackCube, solvePegInsertionSide, solvePlugCharger, solvePullCubeTool, solveLiftPegUpright, solvePullCube
MP_SOLUTIONS = {
"PickCube-v1": solvePickCube,
"StackCube-v1": solveStackCube,
"PegInsertionSide-v1": solvePegInsertionSide,
"PlugCharger-v1": solvePlugCharger,
"PushCube-v1": solvePushCube,
"PullCubeTool-v1": solvePullCubeTool,
"LiftPegUpright-v1": solveLiftPegUpright,
"PullCube-v1": solvePullCube

}
def parse_args(args=None):
parser = argparse.ArgumentParser()
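With the new MP_SOLUTIONS entries, run.py can generate motion-planned demonstrations for the three tasks. A rough sketch of invoking one of the registered solutions directly (the solve signature below mirrors the existing panda solutions and is an assumption; check the actual definitions):

# Illustrative only: run the PullCubeTool-v1 planner solution headlessly.
# Assumes the solve functions share the (env, seed=None, debug=False, vis=False)
# signature used by the other panda solutions in this package.
import gymnasium as gym
import mani_skill.envs
from mani_skill.examples.motionplanning.panda.solutions import solvePullCubeTool

env = gym.make(
    "PullCubeTool-v1",
    obs_mode="none",
    control_mode="pd_joint_pos",  # control mode the panda motion planner drives
)
env.reset(seed=0)
result = solvePullCubeTool(env, seed=0, debug=False, vis=False)
env.close()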
3 changes: 3 additions & 0 deletions mani_skill/examples/motionplanning/panda/solutions/__init__.py
@@ -3,3 +3,6 @@
from .peg_insertion_side import solve as solvePegInsertionSide
from .plug_charger import solve as solvePlugCharger
from .push_cube import solve as solvePushCube
from .pull_cube_tool import solve as solvePullCubeTool
from .lift_peg_upright import solve as solveLiftPegUpright
from .pull_cube import solve as solvePullCube