Warning: This version of the software does not fully support 2D and it may not be stable.
A system for visualizing word embedding vectors in 2D and 3D space.
First, git clone this repository into a directory on your computer. Then, run server.py.
Running on http://xxx.xxx.xxx.xxx:8080/ (Press CTRL+C to quit)
When you see output like this, open that URL in a browser and you will be taken to the contents of your index.html file.
The MDVectors file contains a large number (25,000) of words and their vectors, similar to the words in the Semantle corpus. No additional action is required if you want to generate charts based on the Semantle words.
Generating 3D or 2D vectors from an nltk corpus
Note: You can skip this entire section if you wish to use the Semantle corpus. The 2D Vector and 3D Vector files are both built from a corpus similar to the Semantle corpus.
A commonly used corpus is the Brown corpus, which this example uses.
Running the following code generates a 3D vector .json file from the Brown corpus.
from nltk.corpus import brown
from model import GenerateVectorsFile, Model
NewModel = Model(corpus=brown, dimensions=3)
Path = "static/corpora/3DVectors-new.json"
NewModel.GenerateVectorsFile(Path)
Word vectors are stored in a .json file such as the provided 3D vectors file. Each entry in the file has a word as its key and an array of that word's coordinates as its value.
{
"in": [
2.0589427947998047, // X
-0.4056415557861328, // Y
-35.34573745727539 // Z (If 3D charts are enabled)
]
}
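To make the file format concrete, here is a minimal Python sketch that loads a vectors file and checks that every word maps to two or three numbers. The function name `load_vectors` is hypothetical and not part of this repository. The sketch also assumes the real file is plain JSON, without the `// X` style comments shown above for illustration.

```python
import json
from numbers import Real


def load_vectors(path):
    """Load a word-vector JSON file and check every entry is a 2D or 3D point.

    Hypothetical helper for illustration; not part of this repository.
    """
    with open(path, encoding="utf-8") as handle:
        vectors = json.load(handle)

    if not isinstance(vectors, dict):
        raise ValueError("expected a JSON object mapping words to coordinates")

    for word, coords in vectors.items():
        is_numeric = isinstance(coords, list) and all(
            isinstance(value, Real) and not isinstance(value, bool)
            for value in coords
        )
        if not is_numeric or len(coords) not in (2, 3):
            raise ValueError(f"{word!r} does not map to 2 or 3 numbers: {coords!r}")
    return vectors


# Example call (the path is the one used elsewhere in this README):
# vectors = load_vectors("static/corpora/3DVectors.json")
```

Validating the file before handing it to the chart can surface a malformed entry early, rather than as a blank or broken plot.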
The program reduces high-dimensional word vectors to 2D or 3D using t-SNE. There is a helpful video on how t-SNE converts these dimensions.
The reduced coordinates are then plotted on plot.ly or chartjs charts.
new VisualizeJs();
string, required: Dimension ("2D" or "3D")
string, required: Path to vector file
string, required: The ID of the element that it attaches to
Creating a 3D Chart.
Visualization = new VisualizeJs("3D", '/static/corpora/3DVectors.json');
Creating a 2D Chart.
Visualization = new VisualizeJs("2D", '/static/corpora/3DVectors.json');
<body>
<div id="chart"></div>
</body>
<footer>
<script>
Visualization = new VisualizeJs("3D", '/static/corpora/3DVectors.json', "chart");
</script>
</footer>
Visualization.RenderAll();
Note: If you are using a large corpus, Plotly struggles to add the word text as a label to each point, and the chart may take a long time to load or hang indefinitely. Use the .RenderSome() function to render a select number of words near a specific point.
Parameters: None
Returns: The plotly chart object.
<body>
<div id="chart"></div>
</body>
<footer>
<script>
Visualization = new VisualizeJs("3D", '/static/corpora/3DVectors.json', "chart");
Visualization.RenderAll();
</script>
</footer>
Visualization.RenderClosestWords();
string, case sensitive, required: The word whose related words you want to display
float, required: The view distance (the size of the radius or square within which words are included)
Returns: The plotly chart object.
<body>
<div id="chart"></div>
</body>
<footer>
<script>
Visualization = new VisualizeJs("3D", '/static/corpora/3DVectors.json', "chart");
Visualization.RenderClosestWords("university"); // Renders words similar to "university"
</script>
</footer>
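The view distance can be pictured as a filter over the vectors file. The Python sketch below is not the library's implementation. It assumes the "radius" reading of the view distance (Euclidean distance from the chosen word), and the function name `closest_words` is hypothetical.

```python
import math


def closest_words(vectors, word, view_distance):
    """Return (word, distance) pairs within view_distance of `word`, nearest first.

    Assumption: the view distance is a Euclidean radius around the word's point.
    """
    if word not in vectors:  # the lookup is case sensitive
        raise KeyError(word)
    center = vectors[word]
    hits = []
    for other, coords in vectors.items():
        distance = math.dist(center, coords)
        if other != word and distance <= view_distance:
            hits.append((other, distance))
    return sorted(hits, key=lambda pair: pair[1])


# Made-up sample data, for illustration only.
sample = {
    "university": [0.0, 0.0, 0.0],
    "college": [1.0, 0.0, 0.0],
    "school": [0.0, 2.0, 0.0],
    "banana": [10.0, 10.0, 10.0],
}
print(closest_words(sample, "university", 3.0))
```

With this sample data, only "college" and "school" fall inside a radius of 3.0, so a chart limited this way stays small even when the corpus is large.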
Visualization.RenderClosestWordsToVector();
Similar to .RenderClosestWords(), but instead of inputting a word, you input specific coordinates.
float, required: X position
float, required: Y position
float, optional: Z position (ignored if the visualization is set to "2D" mode)
float, required: The view distance (the size of the radius or square within which words are included)
Returns: The plotly chart object.
<body>
<div id="chart"></div>
</body>
<footer>
<script>
Visualization = new VisualizeJs("3D", '/static/corpora/3DVectors.json', "chart");
Visualization.RenderClosestWordsToVector(27, 48.5, 39.2, 5);
</script>
</footer>
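Selecting words around an arbitrary point can be sketched the same way. This Python example is an illustration and not the library's code. It assumes the "square" reading of the view distance (each coordinate must lie within the view distance of the point), and the function name `words_in_box` is hypothetical. Passing a two-element point mimics "2D" mode, where the Z position is ignored.

```python
def words_in_box(vectors, point, view_distance):
    """Return words whose compared coordinates all lie within view_distance of `point`.

    Assumption: the view distance is half the side of a square (2D) or cube (3D)
    centred on the point.
    """
    hits = []
    for word, coords in vectors.items():
        # zip stops at the shorter sequence, so a 2D point ignores any Z value.
        if all(abs(c - p) <= view_distance for c, p in zip(coords, point)):
            hits.append(word)
    return hits


# Made-up sample data, for illustration only.
points = {
    "a": [27.0, 48.0, 40.0],
    "b": [30.0, 50.0, 39.0],
    "c": [27.0, 48.5, 60.0],
}
print(words_in_box(points, (27, 48.5, 39.2), 5))  # 3D query
print(words_in_box(points, (27, 48.5), 5))        # 2D query, Z ignored
```

In the 3D query, "c" is excluded because its Z value is too far from the point. In the 2D query it is included, because only X and Y are compared.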
Visualization.RenderWords();
Similar to .RenderClosestWords(), but instead of inputting a word, you input specific coordinates.
float, required: X position
float, required: Y position
float, optional: Z position (ignored if the visualization is set to "2D" mode)
float, required: The view distance (the size of the radius or square within which words are included)
Returns: The plotly chart object.
<body>
<div id="chart"></div>
</body>
<footer>
<script>
Visualization = new VisualizeJs("3D", '/static/corpora/3DVectors.json', "chart");
Visualization.RenderClosestWordsToVector(27, 48.5, 39.2, 5);
</script>
</footer>