Failed to read flac file - workaround #5

m1cha3lya1r opened this issue Nov 20, 2024 · 0 comments

m1cha3lya1r commented Nov 20, 2024

I wrote about it there, if you'd like to read about the troubleshooting.

Here I'll share the workaround, which mainly came from studying your code and adapting it to my needs.
It has been a while since that issue was opened (by someone using your code).
I chose to "make it work" rather than wait for a fix (that repository seems to have been inactive since 2023).

It all started when I tried to input a FLAC audio file of exactly 3 seconds with a sample rate of 48000:

from pathlib import Path
from birdnet import SpeciesPredictions, predict_species_within_audio_file

# as shown in the birdnet documentation
# (model and custom_species are defined elsewhere)
audio_path = Path("my_3_second_segment.flac")
predictions = SpeciesPredictions(predict_species_within_audio_file(
  audio_path,
  custom_model=model,
  species_filter=custom_species
))
"""
Error:
soundfile.LibsndfileError: Internal psf_fseek() failed.
"""

The rest of the investigation and troubleshooting is there.

I solved it by working around the problem so that soundfile never has to read the FLAC file.
I downloaded the protobuf version of the BirdNET model and loaded it along with its labels.
Here is some code to give the picture (error handling is omitted for readability):

# load the directory containing saved_model.pb
from pathlib import Path
import tensorflow as tf
loaded_model = tf.saved_model.load(str(Path("path_to_folder_containing_saved_model_file")))
model = loaded_model.signatures['basic']

# load the labels (path_to_labels_file is the path to the labels text file)
with open(path_to_labels_file, 'r', encoding='utf-8') as f:
    labels = f.read().splitlines()

To get the signal for the loaded model via librosa.load, I piped ffmpeg's output into io.BytesIO as follows:

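import subprocess
import io

# make sure ffmpeg is installed on your system
# path is the input audio file (a pathlib.Path)
sample_rate = 48000  # target sample rate for resampling
# an argument list (no shell) keeps paths with spaces or quotes intact
wav = subprocess.run(
        ["ffmpeg", "-v", "quiet", "-i", path.as_posix(),
         "-ar", str(sample_rate), "-y", "-f", "wav", "-"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )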

# load using librosa to get the signal
import librosa
wav_buff = io.BytesIO(wav.stdout)
signal, sample_rate = librosa.load(wav_buff, sr=48000)

# convert to tensor and reshape as the model expects to get the signal
tensor = tf.reshape(tf.convert_to_tensor(signal, dtype=tf.float32), (1, len(signal)))

From this point I mimic the birdnet package code, modified to serve my use case:

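import numpy as np

# inference
results_x = model(tensor)

# convert the raw scores to confidence values

def logistic(x, sensitivity=1.):
    # standard logistic function; the minus sign makes higher scores give higher confidence
    return 1 / (1 + np.exp(-sensitivity * x))

results = logistic(results_x["scores"][0].numpy())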

# attach results to labels
results = list(zip(labels, results))

# filter species by location and time
# (latitude, longitude and week are defined elsewhere)
import birdnet
filter_species = birdnet.predict_species_at_location_and_time(latitude, longitude, week)
filter_set = set(filter_species.keys())
filtered_results = [r for r in results if r[0] in filter_set]

# best 5, in ascending order of confidence
best = 5
top_5 = sorted(filtered_results, key=lambda r: r[1])[-best:]

# OR best 5, in descending order of confidence
top_5 = sorted(filtered_results, key=lambda r: r[1], reverse=True)[:best]
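
The post-processing steps (raw score to confidence, species filter, top N) can also be tried in isolation, without TensorFlow. Here is a minimal sketch using only the standard library. The labels, raw scores and allowed set are made-up values for illustration, and `top_n_species` is a hypothetical helper name, not something from the birdnet package.

```python
import heapq
import math


def logistic(x, sensitivity=1.0):
    """Standard logistic function: maps a raw score to a confidence in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-sensitivity * x))


def top_n_species(labels, raw_scores, allowed, n=5):
    """Return up to n (label, confidence) pairs, highest confidence first.

    Only labels that appear in `allowed` are kept.
    """
    scored = (
        (label, logistic(score))
        for label, score in zip(labels, raw_scores)
        if label in allowed
    )
    return heapq.nlargest(n, scored, key=lambda pair: pair[1])


# Made-up values for illustration only.
labels = ["species_a", "species_b", "species_c", "species_d"]
raw_scores = [2.0, -1.0, 0.5, 3.0]
allowed = {"species_a", "species_b", "species_c"}

top_2 = top_n_species(labels, raw_scores, allowed, n=2)
```

`heapq.nlargest` avoids sorting the whole list when only a few entries are needed; for a few thousand labels, plain `sorted` works just as well.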

I think the main takeaway here is piping ffmpeg's output into an io.BytesIO buffer, which avoids disk I/O and can be read by librosa.
In practice, this lets you use any audio file ffmpeg can handle: convert and resample it to WAV, then continue as usual from that point.
This example may be "good enough" as a workaround, but you may need something more solid with predictable error handling.
I think there is a Python binding for ffmpeg, but I have never used it, so I can't recommend it.
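
For something a bit more predictable, the ffmpeg call could be wrapped so that failures raise an exception instead of passing silently. This is only a sketch under my own assumptions: `build_ffmpeg_command` and `decode_to_wav_buffer` are hypothetical helper names, and I switched the log level to `-v error` so that ffmpeg's stderr carries a usable message.

```python
import io
import shutil
import subprocess
from pathlib import Path


def build_ffmpeg_command(path, sample_rate=48000):
    """Build an ffmpeg argument list that writes resampled WAV data to stdout."""
    return [
        "ffmpeg",
        "-v", "error",                 # report only real errors on stderr
        "-i", Path(path).as_posix(),   # input file
        "-ar", str(sample_rate),       # resample to the target rate
        "-f", "wav",                   # force WAV output format
        "-",                           # "-" sends the output to stdout
    ]


def decode_to_wav_buffer(path, sample_rate=48000):
    """Decode an ffmpeg-readable audio file into an in-memory WAV buffer."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    proc = subprocess.run(
        build_ffmpeg_command(path, sample_rate),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    if proc.returncode != 0:
        message = proc.stderr.decode(errors="replace").strip()
        raise RuntimeError(f"ffmpeg failed: {message}")
    return io.BytesIO(proc.stdout)
```

The returned buffer can be passed to librosa.load in place of wav_buff in the code above. Passing the command as a list rather than a shell string means file names with spaces need no quoting.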

(feel free to correct me if I missed something, signal processing is kinda new to me)
