module request: piper #866
I suggest that you also have a look at It is implemented in C++ and has various APIs for different languages, e.g., Python/C/Go/C#/Swift/Kotlin, etc. You can find Android APKs for it at You can also try it in our huggingface space without installing anything. By the way, it supports models from piper as well. |
Also cc @Elleo . You may find sherpa-onnx interesting. |
Just for a little context on what I'm doing, Pied can currently configure speech dispatcher to work with Piper through the sd_generic module, but my long term plan is to create a piper speech dispatcher module that can be kept loaded to further reduce latency and add support for speed/pitch/etc. changes @csukuangfj Thanks, that's interesting, I'll check it out! |
I'm trying to integrate Piper through the sd_generic module. But I get the error: I can't find any information on this error. Any help will be appreciated! I added the module in speechd.conf: And created the piper.conf file in the /etc/speech-dispatcher/modules directory: |
As mentioned in the issue template, add |
(of course, the issue template knows better, that's why we write documentation, so we don't have to rely on our memory: it's LogLevel 5 in the speechd config, and indeed Debug 1 in the module config) |
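For reference, the two debug settings mentioned above could be applied with a small shell helper like the one below. This is only a sketch: `set_conf_option` is a hypothetical name, not part of speech-dispatcher, and the paths in the usage comment assume a per-user config under `$HOME/.config/speech-dispatcher` (a system-wide setup would use `/etc/speech-dispatcher` instead).

```shell
# Hypothetical helper: set "Key value" in a speech-dispatcher style config file.
# An existing uncommented line for the key is replaced; otherwise a new line
# is appended. The file and its directory are created if missing.
set_conf_option() (
    file=$1 key=$2 value=$3
    mkdir -p "$(dirname "$file")"
    touch "$file"
    if grep -Eq "^[[:space:]]*${key}[[:space:]]" "$file"; then
        tmp=$(mktemp)
        sed -E "s|^[[:space:]]*${key}[[:space:]].*|${key} ${value}|" "$file" > "$tmp"
        cat "$tmp" > "$file"
        rm -f "$tmp"
    else
        printf '%s %s\n' "$key" "$value" >> "$file"
    fi
)

# Usage (paths are assumptions, adjust to your setup):
# set_conf_option "$HOME/.config/speech-dispatcher/speechd.conf" LogLevel 5
# set_conf_option "$HOME/.config/speech-dispatcher/modules/piper.conf" Debug 1
```

After changing either file, restart the speech-dispatcher instance you are actually testing so the new log level takes effect.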
I set the LogLevel to 5 and Debug to 1 and attached the files |
I've obviously been trying everything to get this going and in the process I've changed a lot of the config and it might not be optimal. Since it seems to be an issue with the loading of the sound plugin, I changed the AudioOutputMethod back to "pulse". I'm running Ubuntu 22.04 and according to the output of inxi, Pulse is running:
Then I get the following error: I noticed that there is an Ubuntu package speech-dispatcher-audio-plugins, and it is installed, and contains the following: So, the Pulse plugin is installed. |
Since it is using the generic module,
Your Actually, at the end of your |
Yes, there aren't any issues logged in speech-dispatcher.log or piper.log, but it isn't working. There is no sound played. If I run the command directly, i.e. it works. |
Your command is missing the Also, in your speech-dispatcher.log I don't see any speech attempt, how do you actually test it? |
Yes, the full command I run on the command-line is: Through speech-dispatcher, I test it on the command-line with |
You are using
Then please provide the logs that correspond to this test. The logs you uploaded didn't contain anything about that. |
is there a specific reason for |
I've changed the configuration to the simplest case to avoid confusion. I selected alsa for the audio output. I can use the following command and it works: |
There is still a difference: And your speech-dispatcher.log still doesn't show any attempt to speak anything. No client ever connects to it within the 5s daemon timeout:
Again: how exactly do you test? |
Again: spd-say "hello" |
But that does not show up at all in the logs... Are you sure you have only one installation of speech-dispatcher, as in: is spd-say actually connecting to the speech-dispatcher daemon that you are starting? Does it work with other speech syntheses? |
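One quick way to check for duplicate installations is to list every copy of a binary that the shell could pick up. The helper below is a sketch under that assumption; `find_all_in_path` is a hypothetical name, and it only inspects `$PATH`, so it will not tell you which running daemon a client connects to.

```shell
# Hypothetical helper: print every executable named $1 found on $PATH,
# in the order the shell would search them.
find_all_in_path() (
    name=$1
    set -f          # do not glob-expand PATH entries
    IFS=:
    for dir in $PATH; do
        [ -n "$dir" ] || dir=.
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
)

# Usage:
# find_all_in_path spd-say
# find_all_in_path speech-dispatcher
```

More than one line of output for either name suggests that the client and the daemon you configured may not belong to the same installation.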
I just tried to get going from scratch on a different computer and now I have the issue where speech-dispatcher doesn't want to start.
In the log file, it is the same issue:
It seems that spd-say is not using speech-dispatcher as the voice is different from the piper voice. This is a standard Ubuntu install |
As I already mentioned, this is just a harmless warning. What's important is after that. That's why one should always put the whole log in the bug report.
Does that work as root? You are starting speech-dispatcher from systemd, but that assumes that you can emit audio from root-started speech-dispatcher. Nowadays what usually happens is rather that you don't start speech-dispatcher from systemd, but let it get auto-started from the spd-say call.
You can use |
I can run the command with sudo and get audio output: According to spd-say: But it is not using the speech-dispatcher that I've configured! If I run spd-say -O -L, I get: It doesn't seem like there is any logic to how this operates... It seems I have to abandon this, but I'm working on a Qt application, and QtTextToSpeech integrates with speech-dispatcher. |
sudo only applies to the first command of your pipeline. It's just before
Maybe check whether you might have different log files in
There is, it's just that with today's desktops things have become more involved: system-wide daemons are now frowned upon, so daemons are instead started in user sessions. |
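To illustrate the point about sudo only applying to the first command of a pipeline, here is a small stand-in demo. It uses an environment-variable prefix instead of sudo so that it runs without root; `DEMO_ROLE` and `prefix_scope_demo` are made-up names for this sketch. The scoping rule is the same: the prefix binds to the first stage only.

```shell
# Stand-in demo: an environment-variable prefix plays the role of sudo.
# Like sudo, it affects only the first command of the pipeline.
prefix_scope_demo() (
    unset DEMO_ROLE
    DEMO_ROLE=elevated sh -c 'echo "first stage: ${DEMO_ROLE:-plain}"' \
        | sh -c 'cat; echo "second stage: ${DEMO_ROLE:-plain}"'
)

prefix_scope_demo
# prints:
#   first stage: elevated
#   second stage: plain

# To run every stage with elevated rights, wrap the whole pipeline in one shell:
#   sudo sh -c 'first-command | second-command'
```

So in a synth-then-play pipeline, putting sudo at the front elevates only the synthesizer; the player at the end still runs as the calling user.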
Using sudo before aplay results in: |
So that explains why using a system-wide speech-dispatcher won't work. And thus why you want to just let the speechd auto-start trigger in your desktop session (as is the default), and see logs in |
Can you point me to the documentation to do this? All the explanations I've seen show the configuration I've applied. How do I undo the changes I've made? Do I just remove the references to piper in speechd.conf? |
It's already the default. Your
Yes, that's the problem with documentation when people don't take the time to update it. Help is welcome.
You probably don't need to do anything, and just make sure to open the logs that actually correspond to the instance that is auto-started. |
There is no log directory in the /run/user/1000 directory. So where do I configure the piper module if the way I did it is incorrect? |
@andresmessina1701 The first version of Pied is now publicly released, that can automatically set everything up for you: https://pied.mikeasoft.com/ |
it has various options if you compile it yourself (flatpak, appimage), see the repo: |
Currently, yes; I am working on making it available via flatpak and appimage (and probably eventually as a deb too), but there are still some issues that need work with those packages. |
You're welcome! |
For anyone wondering, this is the module for piper that I wrote. It can handle multiple languages and maps
# /etc/speech-dispatcher/modules/piper-generic.conf
Debug "1"
GenericCmdDependency "piper-tts"
GenericCmdDependency "sox"
GenericCmdDependency "jq"
GenericCmdDependency "bc"
GenericExecuteSynth \
"printf %s \'\$DATA\' \
| /opt/piper-tts/piper --model /opt/piper-tts/voices/\$VOICE.onnx --output_raw \
| sox -v 1 -r \$(jq .audio.sample_rate < /opt/piper-tts/voices/\$VOICE.onnx.json) -c 1 -b 16 -e signed-integer -t raw - -t wav - tempo \$(echo \"0.000055*\$RATE*\$RATE+0.0145*\$RATE+1\" | bc) pitch \$PITCH norm \
| \$PLAY_COMMAND"
# not using $VOLUME
AddVoice "en-us" "MALE1" "en_US-ryan-medium" # "en_US-ryan-high"
AddVoice "en-us" "MALE2" "en_US-lessac-medium" # "en_US-lessac-high"
AddVoice "en-gb" "FEMALE1" "en_GB-jenny_dioco-medium"
AddVoice "en-us" "FEMALE2" "en_US-amy-medium"
AddVoice "it" "MALE1" "it_IT-riccardo-x_low"
DefaultVoice "it_IT-riccardo-x_low"
I found that using high quality models takes some time. I have a better experience with medium!
|
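To make the rate mapping in the config above easier to follow, here is a sketch that evaluates the same quadratic, 0.000055*RATE*RATE + 0.0145*RATE + 1, for a few values. `rate_to_tempo` is a hypothetical helper, awk stands in for bc only to keep the sketch portable, and the -100..100 range is an assumption about what speech-dispatcher passes as $RATE with default settings.

```shell
# Hypothetical helper mirroring the formula used in the config above:
#   tempo = 0.000055*RATE^2 + 0.0145*RATE + 1
# awk is used instead of bc purely for portability of this sketch.
rate_to_tempo() {
    awk -v r="$1" 'BEGIN { printf "%.4f\n", 0.000055*r*r + 0.0145*r + 1 }'
}

# Show the resulting sox tempo factor for a few rates.
for rate in -100 0 50 100; do
    printf 'RATE=%s -> tempo %s\n' "$rate" "$(rate_to_tempo "$rate")"
done
```

Under that assumption the curve gives a tempo of 1 at rate 0, about 3 at rate 100 and about 0.1 at rate -100, so speeding up has a wider usable range than slowing down.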
This looks nice :) @carlocastoldi could you try to add
to check that this correctly makes the voice list shown by |
My user module config for speechd works fine, I am sharing it below.
|
Ok, I'm still not sure how the formula works because if you put 0 in GenericRateAdd the output becomes a float and with 1 it becomes an integer and it's not the purpose given in the doc. But, it works with
(I don't use pitch modifications, but piper has 2 noise parameters if someone wants to set them)
|
Could you please describe, for non-technical users like me, what to change in the config to replace: AddVoice "en" "MALE1" DefaultVoiceType "MALE1"? What are these values, and how can I replace them to choose a different voice? How do I find the equivalent of "MALE1"? For example, I typed piper --help and don't see any --list-voices option. I downloaded the voices from Hugging Face and so far have used them from the command line by pointing to the onnx file. For example, in Plasma's Okular there is an option to change the voice (which sometimes also means the language). But with the proposed configuration I don't know how to make other voices available to speech dispatcher. |
Check my 2 previous comments (1, 2), you should be able to use it by modifying only those lines: |
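As a rough illustration of what those lines contain: in the piper-generic.conf posted earlier in this thread, each AddVoice line has three fields, a language tag, a symbolic voice type (MALE1, FEMALE1, ...) and the name that replaces $VOICE in the command, which there is the model file name without `.onnx`. The sketch below builds such a line from a model name. `add_voice_line` is a hypothetical helper, and deriving the language tag from the file name is an assumption that happens to match the `en-us` and `en-gb` entries in that config.

```shell
# Hypothetical helper: build an AddVoice line from a Piper model name such as
# "en_US-amy-medium". The language tag comes from the part before the first "-".
# The symbolic voice type cannot be derived from the file name and has to be
# chosen by hand.
add_voice_line() (
    model=$1 voice_type=$2
    lang=$(printf '%s' "${model%%-*}" | tr 'A-Z_' 'a-z-')
    printf 'AddVoice "%s" "%s" "%s"\n' "$lang" "$voice_type" "$model"
)

add_voice_line en_US-amy-medium FEMALE1
add_voice_line fr_FR-upmc-medium MALE1
```

The printed lines can be pasted into the module config next to the existing AddVoice entries; applications then select a voice by language and voice type rather than by file name.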
FYI, I found this app which does it all for you 😄 |
Unfortunately not all. It changes config files every time the voice is changed, so there is no way to set speed or other values and keep them. And with Pied only one voice is available at a time as an option in programs like Calibre or Okular. Piper still needs good speech dispatcher support like Festival or espeak have. |
Just as a side-note, if you have sox installed then Pied 0.2 now supports speech-dispatcher's dynamic rate and pitch settings at runtime |
Which is really not the way speech-dispatcher works. The piper module should just expose all the voices that are available, just like e.g. |
Can someone share a working config? My config, which I got from https://aur.archlinux.org/cgit/aur.git/tree/piper-generic.conf?h=piper-voices-common, and also the config generated by Pied, both had a long pause between sentences (2-3 seconds). Found this black magic - Edit - I asked in the read-aloud repo, and they said ->
Any idea on how to do prefetch? |
@KAGEYAM4, I tried adding your black magic from ken107/read-aloud#375(comment-1937517761), but it just added more pauses everywhere. Just try my config from a few comments above; even on a 15-year-old machine that I use for tests, I have less than 200-400ms at start and between paragraphs. |
I used your config and it's a lot better. Thanks a lot. By the way, does the following error matter? It seems the Arch repo doesn't provide these files ->
|
I don't have some of those files either; for example, gender-neutral.dic is provided for French, Spanish and German only by the Debian package. It's because those languages have (sadly) new rules to write in a gender-neutral fashion. So I think that those errors should be only INFO or WARN level. |
@tkapias This #866 (comment) is very useful. I'll probably create a gist using your work for instructions. Works on Chromium Version 128.0.6586.0 (Developer Build) (64-bit), does not work on Firefox Nightly 130.0a1, the |
I recently had to reinstall a clean desktop and wrote a new note about piper installation for Debian testing; it works with Firefox 115.12.0esr.
Installation
# Keep the installation files and symlink to the latest version
mkdir -p piper piper/voices && cd piper
wget https://github.com/rhasspy/piper/releases/download/2023.11.14-2/piper_linux_x86_64.tar.gz
tar xvf piper_linux_x86_64.tar.gz
mv piper piper-2023.11.14-2
sudo ln -s /home/tomasz/Forge/Logiciels/piper/piper-2023.11.14-2/piper /usr/local/bin/piper
# Download voices from https://huggingface.co/rhasspy/piper-voices/tree/main
# example with Male/Female for French/English
cd voices/
wget https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/fr/fr_FR/upmc/medium/fr_FR-upmc-medium.onnx
wget https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/fr/fr_FR/upmc/medium/fr_FR-upmc-medium.onnx.json
wget https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/amy/medium/en_US-amy-medium.onnx
wget https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/amy/medium/en_US-amy-medium.onnx.json
wget https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/hfc_male/medium/en_US-hfc_male-medium.onnx.json
wget https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/hfc_male/medium/en_US-hfc_male-medium.onnx
# Copy voices to system
mkdir -p $HOME/.local/share/piper/voices
sudo cp ./* $HOME/.local/share/piper/voices/
Configuration
sudo apt install libspeechd2 python3-speechd speech-dispatcher-audio-plugins speech-dispatcher-espeak-ng speech-dispatcher alsa-utils bc
mkdir -p $HOME/.config/speech-dispatcher/modules
cd $HOME/.config/speech-dispatcher/
touch speechd.conf
touch modules/piper.conf
# Paste their content into speechd.conf and modules/piper.conf from below
# then test output with spd-say
spd-say --language en 'Welcome to the world of speech synthesis!'
speechd.conf
modules/piper.conf
|
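The install steps above download each voice as a pair, the `.onnx` model and its `.onnx.json` config, and it is easy to miss one of the two. The sketch below lists models whose JSON file is missing. `missing_voice_configs` is a hypothetical helper, and the directory in the usage comment is the one used in the note above.

```shell
# Hypothetical helper: list every *.onnx model in a directory that lacks its
# *.onnx.json sidecar. Prints nothing when all pairs are complete.
missing_voice_configs() (
    dir=$1
    for model in "$dir"/*.onnx; do
        [ -e "$model" ] || continue      # no models at all: the glob stayed literal
        [ -f "$model.json" ] || printf '%s\n' "$model"
    done
)

# Usage:
# missing_voice_configs "$HOME/.local/share/piper/voices"
```

Any path it prints is a model that still needs its matching `.onnx.json` downloaded before testing with spd-say.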
@tkapias A gist to link to would be useful. This is what I came up with from your original work https://gist.github.com/guest271314/9f09ab899df11e344c568a7b93f544c3. I'm using the full path to We can also pipe to
Symlinking the I had to restart What is different in your update that results in the code working on Firefox? |
@guest271314, if you're okay to maintain it, I will leave comments on your Gist when I have updates. I will test /dev/audio, thanks. I don't symlink the voices, only the binary, but I didn't know that it would fail. About Firefox, I don't know why it makes any difference, the config files are pretty much the same. |
Sure. |
Firstly, thank you for your guide. I've been trying to set up piper with speechd and foliate for quite some days and your post was a savior. But I noticed that the 2-3s pause between sentences only occurs with *high.onnx files. Is it possible to get it working with these high quality .onnx files instead of medium voices, without the pause? |
The code below CORRECTLY implements both volume control and speech-rate pass-through from speech-dispatcher to Piper. File /etc/speech-dispatcher/modules/piper.conf follows below:
File /etc/speech-dispatcher/speechd.conf follows below:
The Piper binary as well as voice files are all lumped together in the /opt/piper/ directory. |
This https://github.com/guest271314/native-messaging-piper provides complete control over the entire process, without having to fiddle with Speech Dispatcher and the socket connection between the browser. Want to change pitch, speed, etc, just make use of Web Audio API audio nodes. |
piper is 'a fast, local neural text to speech system' (samples here). It would be nice to have speechd support this as well.
cross-post: rhasspy/piper#265
/cc @Elleo who has done some work integrating these thru pied.