Update learning_note.md
shaoanlu committed May 2, 2024
1 parent 8d25002 commit f86a9be
Showing 1 changed file with 11 additions and 1 deletion.
## Learning note
### Main insight
- The model should learn the residuals (the gradient of the denoising process) if possible. This greatly stabilizes training.
- Advantages of the diffusion model: 1) it can model multi-modality, 2) training is stable, and 3) its outputs are temporally consistent.
- Iteratively add training data covering failure modes so that extrapolation becomes interpolation.
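The "learn the residuals" point can be made concrete with the velocity parameterization used in some diffusion-model trainings; below is a minimal numpy sketch (the schedule values and function name are illustrative, not from this note). Given a clean action `x0` and noise `eps` at a noise level with coefficients `alpha_t, sigma_t` (with `alpha_t**2 + sigma_t**2 == 1`), the model can regress either the noise itself or the velocity `v = alpha_t * eps - sigma_t * x0`:

```python
import numpy as np

def diffusion_targets(x0, eps, alpha_t, sigma_t):
    """Given a clean sample x0 and Gaussian noise eps at one noise level,
    return the noisy input x_t and two candidate regression targets:
    the added noise (epsilon-prediction) and the velocity (v-prediction).
    Illustrative sketch; assumes alpha_t**2 + sigma_t**2 == 1."""
    x_t = alpha_t * x0 + sigma_t * eps          # forward noising step
    v_t = alpha_t * eps - sigma_t * x0          # velocity target
    return x_t, eps, v_t

rng = np.random.default_rng(0)
x0 = rng.normal(size=4)    # a clean action sample (made-up data)
eps = rng.normal(size=4)   # the injected Gaussian noise
x_t, eps_target, v_t = diffusion_targets(x0, eps, alpha_t=0.8, sigma_t=0.6)
# The clean sample is recoverable from x_t and v_t:
# x0 == alpha_t * x_t - sigma_t * v_t
```

Regressing `v_t` (rather than `x0` directly) is one concrete instance of "learning the residual / gradient of the denoising," which is likely why it stabilizes training.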
### Scribbles
- The trained policy does not reach the goal collision-free 100% of the time (there are no collisions in its training data).
- It is unable to recover from OOD states.
- Even though the loss curve appears saturated, the performance of the controller can still improve as training continues.
- The training loss curves of the diffusion model are extremely smooth btw.
- Conversely, it can be difficult to tell whether the model is overfitting by looking at the trajectory or the denoising process.
- But in general I feel there is little harm in training the diffusion model for as long as possible.
- DDPM and DDIM samplers yield the best results.
- Inference is not real-time. The controller is set to run at 100 Hz.
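For reference, the DDPM and DDIM samplers mentioned above share the same trained denoiser and differ mainly in the reverse-update step. A deterministic DDIM update can be sketched as follows (numpy sketch; `eps_pred` stands in for the trained model's noise prediction, and the schedule coefficients are illustrative):

```python
import numpy as np

def ddim_step(x_t, eps_pred, alpha_t, sigma_t, alpha_prev, sigma_prev):
    """One deterministic DDIM update: estimate the clean sample from the
    predicted noise, then re-noise it to the previous (smaller) noise level.
    Illustrative sketch; assumes alpha**2 + sigma**2 == 1 at each level."""
    x0_pred = (x_t - sigma_t * eps_pred) / alpha_t   # implied clean sample
    return alpha_prev * x0_pred + sigma_prev * eps_pred
```

Because the update is deterministic, DDIM can take far fewer steps than DDPM, which matters for the real-time budget of a 100 Hz controller.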

### Possible reasons for failures on collision avoidance
1. There is no collision data in the training set.
2. A policy learned with imitation learning can exhibit accumulated error during closed-loop control.

When the quadrotor gets too close to the obstacles (due to 2), the input state becomes OOD (due to 1); the diffusion policy is therefore unable to recover from such a situation.

- Possible fix: add training data in which the quadrotor recovers from near-collision states.
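One cheap way to operationalize the OOD diagnosis above is a nearest-neighbor distance check against the training states. This is a hypothetical helper, not something from the note; `k` and `threshold` are made-up tuning knobs:

```python
import numpy as np

def is_ood(state, train_states, k=5, threshold=1.0):
    """Flag a state as out-of-distribution when its mean distance to the
    k nearest training states exceeds a tuned threshold.
    Hypothetical sketch for detecting the near-obstacle OOD states."""
    d = np.linalg.norm(train_states - state, axis=1)  # distance to each training state
    return float(np.sort(d)[:k].mean()) > threshold
```

Such a check could be used to decide when to hand control back to a fallback controller, or to identify which states need the recovery demonstrations suggested above.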

### Things that didn't work
- Tried encoding the distance to each obstacle. Did not observe improvement in collision avoidance.
- Tried using a vision encoder to replace the obstacle encoding. Saw no performance improvement.
