Commit
1 parent 777888d, commit f659771
Showing 3 changed files with 66 additions and 1 deletion.
64 changes: 64 additions & 0 deletions
site/_posts/2018-09-27-HoME - a Household Multimodal Environment.md
@@ -0,0 +1,64 @@
---
layout: post
title: HoME - a Household Multimodal Environment
comments: True
excerpt:
tags: ['2017', 'Multi Modal', 'NIPS 2017', 'NIPS Workshop', 'Virtual Embodiment', AI, NIPS]
---

## Introduction

* HoME is an environment for learning from modalities such as vision, audio, semantics, physics, and interaction with objects and other agents.

* [Link to the paper](https://arxiv.org/abs/1711.11017)

## Motivation

* Humans learn by interacting with their surroundings (environment).

* Similarly, training an agent in an interactive multimodal environment (virtual embodiment) could help it learn.

## Characteristics

* Open source and OpenAI Gym compatible.

* Built on top of 45,000 3D house layouts from the SUNCG dataset.

* Provides both 3D visual and audio recordings.

* Provides semantic image segmentation and language descriptions of objects.

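Gym compatibility means an agent interacts with the environment through the usual `reset`/`step` loop. The sketch below illustrates that loop with a toy stand-in. `ToyHouseholdEnv`, its action names, and the observation keys are all hypothetical and are not HoME's actual API. It follows the classic Gym convention in which `step` returns `(observation, reward, done, info)`; newer Gym and Gymnasium releases return a five-element tuple instead.

```python
class ToyHouseholdEnv:
    """Hypothetical stand-in for a Gym-style household environment (not HoME's API)."""

    ACTIONS = ("forward", "turn_left", "turn_right")

    def __init__(self, goal_distance=3, max_steps=20):
        self.goal_distance = goal_distance
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def _observe(self):
        # One entry per modality; a real environment would return arrays.
        remaining = self.goal_distance - self.position
        return {
            "rgb": [[0, 0, 0]],            # placeholder one-pixel image
            "depth": [float(remaining)],   # placeholder depth reading
            "audio": [0.0],                # placeholder audio sample
            "semantics": "door ahead" if remaining > 0 else "at door",
        }

    def reset(self):
        """Start a new episode and return the first observation."""
        self.position = 0
        self.steps = 0
        return self._observe()

    def step(self, action):
        """Apply one action and return (observation, reward, done, info)."""
        if action not in self.ACTIONS:
            raise ValueError(f"unknown action: {action}")
        self.steps += 1
        if action == "forward":
            self.position += 1
        reached = self.position >= self.goal_distance
        done = reached or self.steps >= self.max_steps
        reward = 1.0 if reached else 0.0
        return self._observe(), reward, done, {"steps": self.steps}


def run_episode(env, policy):
    """Run one episode with `policy` mapping an observation to an action."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, done, info = env.step(policy(obs))
        total += reward
    return total, info["steps"]
```

Calling `run_episode(ToyHouseholdEnv(), lambda obs: "forward")` walks the agent straight to the goal. Any agent written against this interface should, in principle, carry over to other Gym-compatible environments with different observation contents.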
## Components

* Rendering Engine

    * Implemented using the Panda3D game engine.

    * Renders RGB+depth scenes based on textures, multi-source lighting and shadows.

* Acoustic Engine

    * Implemented using EVERT.

    * Supports multiple microphones and sound sources, sound absorption based on material, atmospheric conditions, etc.

* Semantics Engine

    * Provides a short textual description for each object, along with information like color, category, material, size, location, etc.

* Physics Engine

    * Implemented using the Bullet3 engine.

    * Supports physical interaction, external forces such as gravity, and position and velocity information for multiple agents.

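To make the semantics engine's output concrete, here is a minimal sketch of a per-object record paired with a short textual description. The `ObjectSemantics` class, its field names, and the description template are assumptions made for illustration; they are not taken from HoME's code.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ObjectSemantics:
    """Hypothetical per-object record; field names are illustrative only."""
    category: str
    color: str
    material: str
    size: str
    location: Tuple[float, float, float]


def describe(obj: ObjectSemantics) -> str:
    """Build a short textual description from the structured fields."""
    x, y, z = obj.location
    return (
        f"a {obj.size} {obj.color} {obj.material} {obj.category} "
        f"at ({x:.1f}, {y:.1f}, {z:.1f})"
    )


sofa = ObjectSemantics("sofa", "red", "leather", "large", (1.0, 2.5, 0.0))
print(describe(sofa))  # a large red leather sofa at (1.0, 2.5, 0.0)
```

Keeping the structured fields alongside the generated sentence lets an agent consume either form: the fields for grounding and the sentence for language tasks.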
## Potential Applications

* Visual Question Answering

* Conversational Agents

* Training an agent to follow instructions

* Multi-agent communication
Submodule _site updated from d48f9c to 2d236a