Take pictures using LERF embeddings; visualize grounding results in 3D mesh; parallelization #17

jedyang97 · 2023-06-01T07:45:24Z

Try the newest demo here!

Specifically, we have made the following improvements:

Use LERF embeddings + DBSCAN clustering to determine camera poses
Take a picture for each object instance
Use LLaVA-13B to caption each picture
GPT-4 reads all captions and either reasons internally or asks the user for clarification to ground the object
Display grounding results to the user: object instances are highlighted in a 3D mesh using bounding spheres
Significantly speed up the pipeline by parallelizing rendering and LLaVA inference
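
For reviewers unfamiliar with the clustering step, here is a minimal, illustrative sketch of the idea rather than the code in this PR. It assumes we already have 3D points whose LERF relevance to the query passed some threshold. It groups them into object instances with a small pure-Python DBSCAN, then wraps each instance in a centroid-centered bounding sphere, which could serve both for aiming a camera and for highlighting the instance in the mesh. The names `dbscan`, `bounding_sphere`, `eps`, and `min_samples`, and the sample points, are hypothetical and not taken from the repository.

```python
import math
from collections import deque


def dbscan(points, eps, min_samples):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    n = len(points)
    # Precompute eps-neighborhoods; every point counts as its own neighbor.
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n  # None means "not visited yet"
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1  # noise for now; may later become a border point
            continue
        labels[i] = cluster
        queue = deque(neighbors[i])
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue  # already assigned; do not expand again
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # core point: keep growing
        cluster += 1
    return labels


def bounding_sphere(cluster_points):
    """Centroid-centered sphere enclosing all points (not the minimal sphere)."""
    count = len(cluster_points)
    dim = len(cluster_points[0])
    center = tuple(sum(p[k] for p in cluster_points) / count for k in range(dim))
    radius = max(math.dist(center, p) for p in cluster_points)
    return center, radius


# Hypothetical high-relevance points: two object instances plus one outlier.
points = [
    (0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0),
    (5.0, 5.0, 5.0), (5.1, 5.0, 5.0), (5.0, 5.1, 5.0),
    (20.0, 20.0, 20.0),
]
labels = dbscan(points, eps=0.5, min_samples=2)

# One bounding sphere per instance; noise points (label -1) are skipped.
spheres = {
    cid: bounding_sphere([p for p, l in zip(points, labels) if l == cid])
    for cid in sorted(set(labels) - {-1})
}
```

Here `labels` assigns the first three points to one instance, the next three to another, and marks the last point as noise. In practice `eps` and `min_samples` need tuning per scene, and a library implementation of DBSCAN would be the more likely choice than this hand-rolled one.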

"How many doors are there in this room?"

"find all the chairs"


XuweiyiChen and others added 18 commits May 19, 2023 04:41

update demo

4cfc967

Merge branch 'main' of github.com:sled-group/chat-with-nerf into xuweic_demo

4e57edb

update with minior

5d1b81c

code for demo

29c11e9

Update README.md

f14126d

Update README.md

9be19bd

Update README.md

72c26cc

add picture taker

240f45c

solve merge conflicts

4989bb4

optimize picture taking

28ec174

remove main

4ef42eb

update glb file

dab9479

fix docker & push code

75a8370

fix dockerfile

4d4e35d

add grounding result visualization, optimize inference speed

6e56a17

Merge branch 'xuweic_localizer' of github.com:sled-group/chat-with-nerf into xuweic_localizer

7bd85af

improve system prompt

8169e32

tune clustering hyperparameter

3c9a1c7

jedyang97 requested a review from XuweiyiChen June 1, 2023 07:45

jedyang97 self-assigned this Jun 1, 2023

jedyang97 requested a review from JasonQSY June 1, 2023 07:49

jedyang97 mentioned this pull request Jun 1, 2023

Always return the same photos #15

Closed

jedyang97 merged commit f419c7b into main Jun 1, 2023
