Make pre trained model available on torrent #2196

Open
perilbrain opened this issue Jun 22, 2019 · 9 comments


@perilbrain

I have been trying to download the pre-trained model for the last month, but I have not succeeded a single time. Such big files are good candidates for torrent sharing for two major reasons:

  1. Allows resuming downloads.
  2. Takes load off the main server through peer-to-peer sharing, which is quite economical for an open source project.

Here are some problems I have been facing while downloading:

  1. wget : (781 KB/s) - Read error at byte 304165325/1916988031 (Connection reset by peer)
  2. axel : Too many redirects.
  3. Firefox: Downloading at the rate of 37KB/s, where usually most other downloads are 4-5MB/s.
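For scale, the figures reported above can be turned into rough numbers. This is a minimal bash sketch; the helper names `percent_done` and `eta_seconds` are made up for illustration, and 1 KB is assumed to mean 1024 bytes:

```shell
#!/bin/bash
# Hypothetical helpers, not part of any tool mentioned in this thread.

# Integer percentage of a download completed so far (truncated).
percent_done() {
    local got=$1 total=$2
    echo $(( got * 100 / total ))
}

# Whole seconds needed to transfer `bytes` at `rate` bytes per second.
eta_seconds() {
    local bytes=$1 rate=$2
    echo $(( bytes / rate ))
}

# Figures taken from the wget error and the Firefox rate reported above.
percent_done 304165325 1916988031          # share received before the reset
eta_seconds 1916988031 $(( 37 * 1024 ))    # full download at 37 KB/s (assuming 1 KB = 1024 bytes)
```

The first call prints 15 (integer division truncates about 15.9%), and the second prints 50596 seconds, roughly 14 hours for the whole file at the Firefox rate.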

If a torrent is shared while allowing seeding from the main server i.e. aws in this case, maybe people will be able to download with less effort. Same goes for sharing data of voice initiative of mozilla but I guess it is different project to talk about.

One more issue was raised earlier regarding this #2151 which was closed blindly.

@lissyx
Collaborator

lissyx commented Jun 24, 2019

Allows resuming downloads.

Technically nothing stops from supporting resuming download, it's been working fine downloading from Github for me.
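Whether a given URL supports resuming can usually be checked from the response headers: a server that honors byte-range requests normally advertises `Accept-Ranges: bytes`. This is a minimal sketch, assuming `curl` is installed; the function name `supports_resume` is made up for illustration, and a missing header does not prove that range requests would fail:

```shell
#!/bin/bash
# Hypothetical helper: reads HTTP response headers on stdin and succeeds
# if any response in the redirect chain advertises byte-range support.
supports_resume() {
    tr -d '\r' | grep -qi '^accept-ranges:[[:space:]]*bytes'
}

# Usage (needs network access):
#   -s silences progress output, -I sends a HEAD request, -L follows redirects.
# curl -sIL "https://github.com/mozilla/DeepSpeech/releases/download/v0.5.1/deepspeech-0.5.1-models.tar.gz" \
#     | supports_resume && echo "resumable" || echo "no Accept-Ranges header seen"
```

The network call is left commented out so the helper can be tried on saved headers first.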

If a torrent is shared while allowing seeding from the main server i.e. aws in this case

We don't have a tracker, and we don't control the AWS hosting; it's GitHub's hosting.

Same goes for sharing data of voice initiative of mozilla but I guess it is different project to talk about.

It's a different project, and the issue has already been raised there; you can check the discussion on Discourse. What prevented it for Common Voice, however, does not apply to us, so it might be possible.

One more issue was raised earlier regarding this #2151 which was closed blindly.

No, that issue was closed because there was no proper discussion / documentation.

@dabinat
Collaborator

dabinat commented Jun 25, 2019

It seems to me that anyone could put up a torrent so it doesn't necessarily need to be done officially by Mozilla.

@kdavis-mozilla
Contributor

@dabinat Yes.

However, the download statistics are used within Mozilla as a measure of this project's success. So, if the statistics are significantly curtailed as a result of using a torrent, management will think the project isn't healthy and make project cuts.

So if we use a torrent, then we'll need to do so in a manner that maintains some notion of download statistics. I think @reuben has some ideas in this regard.

@perilbrain
Author

perilbrain commented Jun 25, 2019

@lissyx

Technically nothing stops from supporting resuming download, it's been working fine downloading from Github for me.

Initially it was not resuming. I don't know whether wget was ignoring the -c flag or something else was wrong, but it used to hang during some of the redirections. Anyway, I was able to finish the download in 9 resumed attempts with a bash script.

@kdavis-mozilla

the download statistics are used within Mozilla as a measure of this project's success

Of course we understand that downloads could be a parameter for evaluation, and that changing that decision is challenging for developers. Yet a very large number of seeders and leechers shows real-time popularity and patrons for the project; you just need to convince them :).

@any-other-victim-of-issue

In case anyone is having a problem downloading the model, I am sharing a download script that might help:

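#!/bin/bash
# Retry the download until wget exits successfully, resuming the partial file each time.
URL="https://github.com/mozilla/DeepSpeech/releases/download/v0.5.1/deepspeech-0.5.1-models.tar.gz"
FILE="${URL##*/}"   # file name is the last path component of the URL
R=1                 # last wget exit status (non-zero means failure)
x=0                 # attempt counter
while [[ $R -ne 0 ]] ; do
    echo "Attempt $x. Last status: $R"
    # Resume only: a partial file from an earlier download must already exist.
    if [[ ! -f "$FILE" ]] ; then
        echo "No earlier file present. Exiting"
        exit 1
    fi
    wget --continue "$URL"
    R=$?
    sleep 0.5
    x=$(( x + 1 ))
done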

Change the URL when a new release appears on the releases page; otherwise you will stay stuck on v0.5.1.
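Since the URL changes with each release, it can also be derived from the version number. This is a sketch that assumes other releases keep the same naming pattern as v0.5.1, which is not guaranteed; `model_url` is a hypothetical helper:

```shell
#!/bin/bash
# Hypothetical helper: build the model archive URL for a given release,
# assuming the v0.5.1 naming pattern also holds for other versions.
model_url() {
    local version=$1
    echo "https://github.com/mozilla/DeepSpeech/releases/download/v${version}/deepspeech-${version}-models.tar.gz"
}

VERSION="${1:-0.5.1}"          # release to fetch; defaults to the one used above
URL="$(model_url "$VERSION")"
echo "Would download: $URL"
# wget --continue "$URL"       # uncomment to actually download
```

Passing a different version as the first argument changes both the release tag and the file name in one place.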

@lissyx
Collaborator

lissyx commented Nov 15, 2019

@kdavis-mozilla @reuben Should we do that for v0.6 when it's ready?

@kdavis-mozilla
Contributor

@lissyx I still worry about the "the download statistics are used within Mozilla as a measure of this project's success" issue.

@reuben
Contributor

reuben commented Nov 17, 2019

Last time we looked into this, I verified that webseed requests are tracked normally by GitHub, as if they were "normal" downloads. It can over-count sometimes if a client is making several concurrent requests. Of course, if we publish the torrent file itself as part of the release, we can also track how many times people download it.
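For reference, a torrent with a web seed (BEP 19) lists an ordinary HTTP URL that clients fetch pieces from alongside any peers, which is why those requests can show up as regular downloads. Below is a sketch of how such a torrent might be built, assuming the `mktorrent` tool and a local copy of the archive. The helper name and output file name are made up, some `mktorrent` versions may also require an announce URL passed with `-a`, and this is not a description of what the project actually does:

```shell
#!/bin/bash
# Hypothetical helper: print the mktorrent command line for a web-seeded torrent.
# -w adds a web seed URL, -o names the output .torrent file.
webseed_torrent_cmd() {
    local file=$1 url=$2
    printf 'mktorrent -w %s -o %s.torrent %s\n' "$url" "$file" "$file"
}

webseed_torrent_cmd "deepspeech-0.5.1-models.tar.gz" \
    "https://github.com/mozilla/DeepSpeech/releases/download/v0.5.1/deepspeech-0.5.1-models.tar.gz"
```

The command is only printed, so it can be reviewed and adjusted before being run by hand.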

@lissyx
Collaborator

lissyx commented Sep 9, 2020

This might be something we want now? cc @reuben

@reuben
Contributor

reuben commented Sep 9, 2020 via email


5 participants