
Commit: move all docs/readmes/etc facebookresearch -> flashlight (flashlight#540)
Summary:
Pull Request resolved: flashlight#540

migration from facebookresearch into flashlight

Reviewed By: jacobkahn

Differential Revision: D27802384

fbshipit-source-id: 884a468581fe1cc1b5e26af1aebefdffaf7b8b94
Tatiana Likhomanenko authored and facebook-github-bot committed Apr 15, 2021
1 parent 7e56c1f commit e9628da
Showing 13 changed files with 49 additions and 49 deletions.
4 changes: 2 additions & 2 deletions .docker/README.md
@@ -1,7 +1,7 @@
Flashlight and its dependencies can also be built with the provided Dockerfiles. Both CUDA and CPU backends are supported with Docker. The current Docker images are frozen at **Ubuntu 18.04** and **CUDA 10.0**; we update these periodically.

## Docker images on [Docker Hub](https://hub.docker.com/r/flml/flashlight/tags)

Docker images for the CUDA and CPU backends for each Flashlight commit are [available on Docker Hub](https://hub.docker.com/r/flml/flashlight/tags).

### Running Flashlight with Docker
@@ -27,7 +27,7 @@ cd /root/flashlight/build && make test

Using the Dockerfiles in this directory:
```shell
git clone --recursive https://github.com/facebookresearch/flashlight.git
git clone --recursive https://github.com/flashlight/flashlight.git
cd flashlight
# for CUDA backend
sudo docker build -f .docker/Dockerfile-CUDA -t flashlight .
4 changes: 2 additions & 2 deletions .github/workflows/docker_image_build.yml
@@ -5,7 +5,7 @@ on:
- master
jobs:
cuda_image_build:
if: github.repository_owner == 'facebookresearch'
if: github.repository_owner == 'flashlight'
name: CUDA image build
runs-on: ubuntu-latest
steps:
@@ -26,7 +26,7 @@ jobs:
- name: Docker logout
run: docker logout
cpu_image_build:
if: github.repository_owner == 'facebookresearch'
if: github.repository_owner == 'flashlight'
name: CPU image build
runs-on: ubuntu-latest
steps:
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -17,7 +17,7 @@ All contributors must sign the CLA for their pull requests to be eligible for me
You can find the CLA [here](https://code.facebook.com/cla).

## Issues
We use [GitHub issues](https://github.com/facebookresearch/flashlight/issues) to track public bugs. When filing a bug, please make sure your description is clear and includes sufficient instructions to reproduce the issue (for instance, your OS, compiler version, and selected backend).
We use [GitHub issues](https://github.com/flashlight/flashlight/issues) to track public bugs. When filing a bug, please make sure your description is clear and includes sufficient instructions to reproduce the issue (for instance, your OS, compiler version, and selected backend).

## License
By contributing to flashlight, you agree that your contributions will be licensed
14 changes: 7 additions & 7 deletions README.md
@@ -6,9 +6,9 @@
| [**Installation**](#building-and-installing)
| [**Documentation**](https://fl.readthedocs.io/en/latest/)

[![CircleCI](https://circleci.com/gh/facebookresearch/flashlight.svg?style=shield)](https://circleci.com/gh/facebookresearch/flashlight)
[![CircleCI](https://circleci.com/gh/flashlight/flashlight.svg?style=shield)](https://app.circleci.com/pipelines/github/flashlight/flashlight)
[![Documentation Status](https://img.shields.io/readthedocs/fl.svg)](https://fl.readthedocs.io/en/latest/)
[![Docker Image Build Status](https://img.shields.io/github/workflow/status/facebookresearch/flashlight/Publish%20Docker%20images?label=docker%20image%20build)](https://hub.docker.com/r/flml/flashlight/tags)
[![Docker Image Build Status](https://img.shields.io/github/workflow/status/flashlight/flashlight/Publish%20Docker%20images?label=docker%20image%20build)](https://hub.docker.com/r/flml/flashlight/tags)
[![Join the chat at https://gitter.im/flashlight-ml/community](https://img.shields.io/gitter/room/flashlight-ml/community)](https://gitter.im/flashlight-ml/community?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)

[![Docker Image for CUDA backend](https://img.shields.io/docker/image-size/flml/flashlight/cuda-latest?label=docker%20%28cuda%29&logo=docker)](https://hub.docker.com/r/flml/flashlight/tags?page=1&ordering=last_updated&name=cuda-latest)
@@ -26,8 +26,8 @@ tensor library.
- CUDA and CPU backends for GPU and CPU training.
- An emphasis on efficiency and scale.

Native support in C++ and simple extensibility makes Flashlight a powerful research framework that's *hackable to its core* and enables fast iteration on new experimental setups and algorithms without sacrificing performance. In a single repository, Flashlight provides [apps](https://github.com/facebookresearch/flashlight/tree/master/flashlight/app) for research across multiple domains:
- [Automatic speech recognition](https://github.com/facebookresearch/flashlight/tree/master/flashlight/app/asr) (the [wav2letter](https://github.com/facebookresearch/wav2letter/) project) — [Documentation](flashlight/app/asr) | [Tutorial](flashlight/app/asr/tutorial)
Native support in C++ and simple extensibility makes Flashlight a powerful research framework that's *hackable to its core* and enables fast iteration on new experimental setups and algorithms without sacrificing performance. In a single repository, Flashlight provides [apps](https://github.com/flashlight/flashlight/tree/master/flashlight/app) for research across multiple domains:
- [Automatic speech recognition](https://github.com/flashlight/flashlight/tree/master/flashlight/app/asr) (the [wav2letter](https://github.com/flashlight/wav2letter/) project) — [Documentation](flashlight/app/asr) | [Tutorial](flashlight/app/asr/tutorial)
- [Image classification](flashlight/app/imgclass)
- [Object detection](flashlight/app/objdet)
- [Language modeling](flashlight/app/lm)
@@ -188,7 +188,7 @@ To build the Flashlight CPU backend from source using dependencies installed wit
##### Build Using the `vcpkg` Toolchain File
To build Flashlight from source with these dependencies, clone the repository:
```shell
git clone https://github.com/facebookresearch/flashlight.git && cd flashlight
git clone https://github.com/flashlight/flashlight.git && cd flashlight
mkdir -p build && cd build
```
Then, build from source using `vcpkg`'s [CMake toolchain](https://github.com/microsoft/vcpkg/blob/master/docs/users/integration.md#cmake-toolchain-file-recommended-for-open-source-cmake-projects):
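A minimal invocation might look like the following sketch; the toolchain-file path assumes `vcpkg` was cloned to `~/vcpkg`, so adjust it to your checkout:

```shell
# Configure against vcpkg-installed dependencies (toolchain path is an assumption)
cmake .. -DCMAKE_TOOLCHAIN_FILE=~/vcpkg/scripts/buildsystems/vcpkg.cmake
make -j$(nproc)
```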
@@ -209,7 +209,7 @@ Some dependencies marked below are downloaded and installed automatically if not

**Once all dependencies are installed**, clone the repository:
```shell
git clone https://github.com/facebookresearch/flashlight.git && cd flashlight
git clone https://github.com/flashlight/flashlight.git && cd flashlight
mkdir -p build && cd build
```
Then build all Flashlight components with:
@@ -224,7 +224,7 @@ To build a smaller subset of Flashlight features/apps, see the [build options](#

To install Flashlight in a custom directory, use CMake's [`CMAKE_INSTALL_PREFIX`](https://cmake.org/cmake/help/v3.10/variable/CMAKE_INSTALL_PREFIX.html) argument. Flashlight libraries can be built as shared libraries using CMake's [`BUILD_SHARED_LIBS`](https://cmake.org/cmake/help/v3.10/variable/BUILD_SHARED_LIBS.html) argument.
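As a quick sketch (the install prefix below is purely illustrative):

```shell
# Install Flashlight into a custom prefix, building shared libraries
cmake .. -DCMAKE_INSTALL_PREFIX=$HOME/flashlight-install -DBUILD_SHARED_LIBS=ON
make -j$(nproc) && make install
```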

Flashlight uses modern CMake and `IMPORTED` targets for most dependencies. If a dependency isn't found, passing `-D<package>_DIR` to your `cmake` command or exporting `<package>_DIR` as an environment variable equal to the path to `<package>Config.cmake` can help locate dependencies on your system. See [the documentation](https://cmake.org/cmake/help/v3.10/command/find_package.html) for more details. If CMake is failing to locate a package, check to see if a corresponding [issue](https://github.com/facebookresearch/flashlight/issues) has already been created before creating your own.
Flashlight uses modern CMake and `IMPORTED` targets for most dependencies. If a dependency isn't found, passing `-D<package>_DIR` to your `cmake` command or exporting `<package>_DIR` as an environment variable equal to the path to `<package>Config.cmake` can help locate dependencies on your system. See [the documentation](https://cmake.org/cmake/help/v3.10/command/find_package.html) for more details. If CMake is failing to locate a package, check to see if a corresponding [issue](https://github.com/flashlight/flashlight/issues) has already been created before creating your own.
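For instance, if CMake fails to locate ArrayFire (one of Flashlight's dependencies), a hint like the following may help; the path shown is an assumption for illustration:

```shell
# Point CMake at the directory containing ArrayFireConfig.cmake (illustrative path)
cmake .. -DArrayFire_DIR=/opt/arrayfire/share/ArrayFire/cmake
# or, equivalently, via an environment variable
export ArrayFire_DIR=/opt/arrayfire/share/ArrayFire/cmake
cmake ..
```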

#### Dependencies

4 changes: 2 additions & 2 deletions bindings/python/README.md
@@ -150,7 +150,7 @@ where ``ntokens`` is the number of tokens predicted for each frame (number of cl
### Beam-search decoder
Currently, the lexicon-based and lexicon-free beam-search decoders are supported for CTC/ASG models only (seq2seq models are not supported). Only n-gram (KenLM) language models are supported by the Python bindings;
however, one can define a custom language model in Python and use it for decoding; see the details below.
For a better understanding of how this beam-search decoder works, please see the [Beam-search decoder section](https://github.com/facebookresearch/flashlight/tree/master/flashlight/app/asr#beam-search-decoders).
For a better understanding of how this beam-search decoder works, please see the [Beam-search decoder section](https://github.com/flashlight/flashlight/tree/master/flashlight/app/asr#beam-search-decoders).

To run the decoder, one should first define its options:
```python
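# The snippet below is a sketch of defining decoder options; the class and
# parameter names are assumptions based on the flashlight Python bindings
# and may differ slightly across versions.
from flashlight.lib.text.decoder import CriterionType, LexiconDecoderOptions

options = LexiconDecoderOptions(
    beam_size=100,            # number of hypotheses kept per frame
    beam_size_token=25,       # number of tokens considered per hypothesis
    beam_threshold=100.0,     # pruning threshold on hypothesis scores
    lm_weight=2.0,            # language model weight
    word_score=2.0,           # score added when a word is completed
    unk_score=float("-inf"),  # score for unknown words (disallow them)
    sil_score=0.0,            # score for the silence token
    log_add=False,            # merge hypotheses with max instead of logadd
    criterion_type=CriterionType.CTC,
)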
@@ -182,7 +182,7 @@ To run decoder one first should define its options:

Then we should prepare the tokens dictionary (the tokens for which the acoustic model
returns a probability for each frame) and the lexicon (the mapping between words and their spellings in the tokens set).
For details on the tokens and lexicon file formats, have a look at [Data Preparation](https://github.com/facebookresearch/flashlight/tree/master/flashlight/app/asr#data-preparation).
For details on the tokens and lexicon file formats, have a look at [Data Preparation](https://github.com/flashlight/flashlight/tree/master/flashlight/app/asr#data-preparation).

```python
from flashlight.lib.text.dictionary import Dictionary, load_words, create_word_dict
12 changes: 6 additions & 6 deletions flashlight/app/README.md
@@ -1,10 +1,10 @@
# Flashlight Applications

Flashlight application libraries are domain-specific libraries built on top of the [flashlight core](https://github.com/facebookresearch/flashlight/tree/master/flashlight/fl) and [flashlight lib](https://github.com/facebookresearch/flashlight/tree/master/flashlight/lib). They provide lightweight, unopinionated pipelines and tools that are easily modifiable for training or inference across tasks. Below are the supported applications; new applications are under active development.
Flashlight application libraries are domain-specific libraries built on top of the [flashlight core](https://github.com/flashlight/flashlight/tree/master/flashlight/fl) and [flashlight lib](https://github.com/flashlight/flashlight/tree/master/flashlight/lib). They provide lightweight, unopinionated pipelines and tools that are easily modifiable for training or inference across tasks. Below are the supported applications; new applications are under active development.

### [Automatic Speech Recognition](https://github.com/facebookresearch/flashlight/tree/master/flashlight/app/asr) — `asr` (the [wav2letter](https://github.com/facebookresearch/wav2letter/) Project)
### [Automatic Speech Recognition](https://github.com/flashlight/flashlight/tree/master/flashlight/app/asr) — `asr` (the [wav2letter](https://github.com/flashlight/wav2letter/) Project)

The `asr` application provides tools for audio processing/augmentation, acoustic model training, beam search decoding, and preprocessing/preparing audio data for use. Full documentation for usage and binaries [can be found here](https://github.com/facebookresearch/flashlight/tree/master/flashlight/app/asr).
The `asr` application provides tools for audio processing/augmentation, acoustic model training, beam search decoding, and preprocessing/preparing audio data for use. Full documentation for usage and binaries [can be found here](https://github.com/flashlight/flashlight/tree/master/flashlight/app/asr).

#### Provided Artifacts:
- Binaries:
@@ -15,16 +15,16 @@
- `fl_asr_tutorial_inference_ctc`
- `fl_asr_tutorial_finetune_ctc`

### [Language Modeling](https://github.com/facebookresearch/flashlight/tree/master/flashlight/app/lm) — `lm`
### [Language Modeling](https://github.com/flashlight/flashlight/tree/master/flashlight/app/lm) — `lm`

The `lm` application provides tools for text preprocessing and language model training for both auto-regressive and BERT-style models. Full documentation for usage and binaries [can be found here](https://github.com/facebookresearch/flashlight/tree/master/flashlight/app/lm).
The `lm` application provides tools for text preprocessing and language model training for both auto-regressive and BERT-style models. Full documentation for usage and binaries [can be found here](https://github.com/flashlight/flashlight/tree/master/flashlight/app/lm).

#### Provided Artifacts:
- Binaries:
- `fl_lm_dictionary_builder`
- `fl_lm_train`

### [Image Classification](https://github.com/facebookresearch/flashlight/tree/master/flashlight/app/imgclass) — `imgclass`
### [Image Classification](https://github.com/flashlight/flashlight/tree/master/flashlight/app/imgclass) — `imgclass`

The `imgclass` application is still in early, active development. It currently provides dataset abstractions for ImageNet and an example training pipeline for `Resnet34`, which can be easily extended to more complex setups.

10 changes: 5 additions & 5 deletions flashlight/app/asr/README.md
@@ -1,8 +1,8 @@
# Automatic Speech Recognition (ASR)

Flashlight's ASR application (formerly the [wav2letter](https://github.com/facebookresearch/wav2letter/) project) provides training and inference capabilities for end-to-end speech recognition systems. Outside of original research conducted with Flashlight and wav2letter, the codebase contains up-to-date implementations of recent architectures and developments in the speech domain.
Flashlight's ASR application (formerly the [wav2letter](https://github.com/flashlight/wav2letter/) project) provides training and inference capabilities for end-to-end speech recognition systems. Outside of original research conducted with Flashlight and wav2letter, the codebase contains up-to-date implementations of recent architectures and developments in the speech domain.

**To get started using the ASR library with existing/pre-trained models, see [tutorials](https://github.com/facebookresearch/flashlight/tree/master/flashlight/app/asr/tutorial).**
**To get started using the ASR library with existing/pre-trained models, see [tutorials](https://github.com/flashlight/flashlight/tree/master/flashlight/app/asr/tutorial).**

### Table of Contents

@@ -407,7 +407,7 @@ epoch: 6 | nupdates: 1000 | lr: 0.000469 | lrcriterion: 0.000000
where we report epochs, number of updates, learning rates, timing, loss and WER/LER for train and validation data.
#### Flags
We give a short description of some of the more important flags here. A complete list of the flag definitions and short descriptions of their meaning can be found [here](https://github.com/facebookresearch/flashlight/blob/master/flashlight/app/asr/common/Defines.cpp).
We give a short description of some of the more important flags here. A complete list of the flag definitions and short descriptions of their meaning can be found [here](https://github.com/flashlight/flashlight/blob/master/flashlight/app/asr/common/Defines.cpp).
The `datadir` flag is the base path to where all the `train` and `valid` dataset list files live. Every `train` path will be prefixed by `datadir`. Multiple datasets can be passed to `train` and `valid` as a comma-separated list.
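As an illustrative sketch (the dataset paths and list-file names below are assumptions), a training invocation with multiple comma-separated datasets might look like:

```shell
# Every list path in --train/--valid is prefixed by --datadir (file names are illustrative)
fl_asr_train train \
  --datadir=/data/librispeech \
  --train=lists/train-clean-100.lst,lists/train-other-500.lst \
  --valid=lists/dev-clean.lst
```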
@@ -504,7 +504,7 @@ root → w → o → r → l → d ([world])
root → p → i → n → e ([pine]) → a → p → p → l → e → ([pineapple])
```

**Note** At most 6 words with the same spelling are supported; any further words with that spelling will be ignored during inference. If your lexicon contains more than 6 words with the same spelling, you need to [update this constant](https://github.com/facebookresearch/flashlight/blob/master/flashlight/lib/text/decoder/Trie.h#L17).
**Note** At most 6 words with the same spelling are supported; any further words with that spelling will be ignored during inference. If your lexicon contains more than 6 words with the same spelling, you need to [update this constant](https://github.com/flashlight/flashlight/blob/master/flashlight/lib/text/decoder/Trie.h#L17).

##### 1.2 Lexicon-free beam-search decoder (`uselexicon=false`)
The lexicon-free beam-search decoder considers any possible token as a candidate, and there is no notion of words during decoding. In this case, a word separator should be set via `wordseparator` and included in the token set for AM training. The word separator is treated and predicted like all other tokens. After obtaining the transcription in tokens, the word separator is used to split the sequence into words. Usually, when word-pieces are used as target units, the word separator can be part of a token; to correctly handle this case, set `--usewordpiece=true`.
@@ -529,7 +529,7 @@ Currently we are supporting decoding with the following language models: ZeroLM,

A **KenLM** language model can be trained standalone with the [KenLM library](https://kheafield.com/code/kenlm/). The text data should be prepared consistently with the acoustic model data: for example, for a word-level LM, if your AM token set doesn't contain punctuation, remove all punctuation from the data. For token-level LM training, words should first be split into token sequences, and the LM then trained on this data so that it predicts probabilities for tokens (not words). Both `.arpa` and binarized `.bin` LMs can be used; however, it is recommended to convert arpa files to the [binary format](https://github.com/kpu/kenlm#querying) for faster loading.
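For example, a word-level 4-gram KenLM model can be trained and binarized roughly as follows (the file names are placeholders; `lmplz` and `build_binary` ship with KenLM):

```shell
# Train a 4-gram LM on preprocessed text, then binarize it for faster loading
lmplz -o 4 < lm_train_text.txt > lm.arpa
build_binary lm.arpa lm.bin
```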

**ConvLM** models are convolutional neural networks. They are currently trained with [fairseq](https://github.com/pytorch/fairseq) and then converted into flashlight-serializable models ([example](https://github.com/facebookresearch/wav2letter/blob/master/recipes/lexicon_free/librispeech/convert_convlm.sh) of how we do this) so that they can be loaded. `lm_vocab` should be specified, as it is the dictionary mapping tokens to indices used in ConvLM training. Note that this token set is usually different from the one used in AM training. Each line of this file is a single token (char, word, word-piece, etc.), and a token's index is exactly its line number.
**ConvLM** models are convolutional neural networks. They are currently trained with [fairseq](https://github.com/pytorch/fairseq) and then converted into flashlight-serializable models ([example](https://github.com/flashlight/wav2letter/blob/master/recipes/lexicon_free/librispeech/convert_convlm.sh) of how we do this) so that they can be loaded. `lm_vocab` should be specified, as it is the dictionary mapping tokens to indices used in ConvLM training. Note that this token set is usually different from the one used in AM training. Each line of this file is a single token (char, word, word-piece, etc.), and a token's index is exactly its line number.

To decode efficiently with ConvLM, whose forward pass is expensive, we use a dynamic cache that holds the probabilities over all tokens given the candidates generated from the previous frame. This way, when proposing new candidates, we can simply check the cache for pre-computed LM scores. In other words, we only need to run the ConvLM forward pass in batches at the end of decoding each frame, once all possible new candidates have been gathered. The batching and caching thus greatly reduce the total number of forward passes.
