I wrote about it there, if you'd like to read the troubleshooting details.
Here I'll describe the workaround, which mainly came from studying your code and adapting it to my needs.
It has been a while since someone opened that issue (while using your code).
I chose to "make it work" rather than wait for a fix (the repository seems inactive since 2023).
It all started when I tried to feed in a FLAC audio file of exactly 3 seconds, with a sample rate of 48000:
from pathlib import Path
from birdnet import SpeciesPredictions, predict_species_within_audio_file

# as shown in the birdnet documentation
audio_path = Path("my_3_second_segment.flac")

predictions = SpeciesPredictions(predict_species_within_audio_file(
    audio_path,
    custom_model=model,
    species_filter=custom_species
))

"""Error: soundfile.LibsndfileError: Internal psf_fseek() failed."""
The rest of the investigation and troubleshooting is there.
I solved it by essentially going around the problem and avoiding soundfile.
I downloaded the protobuf version of the BirdNET model and loaded it along with its labels.
Here is some code to give the picture (error handling omitted for readability):
# load the saved_model.pb directory
import tensorflow as tf

loaded_model = tf.saved_model.load(Path("path_to_folder_containing_saved_model_file"))
model = loaded_model.signatures['basic']

# load the labels
with open(path_to_labels_file, 'r') as f:
    labels = f.read().splitlines()
To get the signal for the loaded model via librosa.load, I piped ffmpeg output into io.BytesIO as follows:
import subprocess
import io

# make sure you have ffmpeg installed on your system
wav = subprocess.run(
    f"ffmpeg -v quiet -i {path.as_posix()} -ar {sample_rate} -y -f wav -",
    shell=True,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
# load using librosa to get the signal
import librosa

wav_buff = io.BytesIO(wav.stdout)
signal, sample_rate = librosa.load(wav_buff, sr=48000)

# convert to tensor and reshape as the model expects to get the signal
tensor = tf.reshape(tf.convert_to_tensor(signal, dtype=tf.float32), (1, len(signal)))
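One caveat, based on my understanding of the model: the BirdNET saved model seems to expect 3-second chunks at 48 kHz, i.e. 144,000 samples per chunk, so if a segment is not exactly 3 seconds it probably needs padding or trimming before the reshape. A rough sketch, under that assumption:

import numpy as np

# assumption: the model expects exactly 3 s * 48000 Hz = 144000 samples per chunk
expected_len = 3 * 48000
if len(signal) < expected_len:
    signal = np.pad(signal, (0, expected_len - len(signal)))  # zero-pad short signals
else:
    signal = signal[:expected_len]  # trim longer ones

tensor = tf.reshape(tf.convert_to_tensor(signal, dtype=tf.float32), (1, expected_len))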
From this point on I mimic the birdnet package code, modified to serve my use case:
import numpy as np
import birdnet

# inference
results_x = model(tensor)

# convert the score values to confidences
def logistic(x, sensitivity=1.):
    # note: the negative exponent is the standard logistic, so higher scores give higher confidence
    return 1 / (1 + np.exp(-sensitivity * x))

results = logistic(results_x["scores"][0])

# attach results to labels
results = list(zip(labels, results))

# filter species by location and time
filter_species = birdnet.predict_species_at_location_and_time(latitude, longitude, week=week)
filter_set = set(filter_species.keys())
filtered_results = [r for r in results if r[0] in filter_set]

# get the best 5, ascending
best = 5
top_5 = sorted(filtered_results, key=lambda r: r[1])[-best:]

# OR descending
top_5 = sorted(filtered_results, key=lambda r: r[1], reverse=True)[:best]
I think the main takeaway here is piping ffmpeg output into an io.BytesIO buffer, which avoids disk I/O and can be read directly by librosa.
In practice it lets you use any audio file ffmpeg knows how to handle, convert and resample it to WAV, and continue as usual from that point...
Maybe this example is only "good enough" as a workaround, and you need something more solid with predictable error handling.
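If it helps, here is a minimal sketch of how the ffmpeg step could be wrapped with basic error handling; the helper name decode_to_wav_bytes is just something I made up for illustration:

import io
import subprocess

def decode_to_wav_bytes(path, sample_rate=48000):
    # hypothetical helper: decode any ffmpeg-readable file into an in-memory WAV buffer
    proc = subprocess.run(
        ["ffmpeg", "-v", "error", "-i", str(path), "-ar", str(sample_rate), "-y", "-f", "wav", "-"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    if proc.returncode != 0:
        # surface ffmpeg's own error message instead of failing silently later
        raise RuntimeError(f"ffmpeg failed for {path}: {proc.stderr.decode(errors='replace')}")
    return io.BytesIO(proc.stdout)

Passing the arguments as a list (instead of shell=True with an f-string) also avoids quoting problems with paths that contain spaces.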
I think there is a Python binding for ffmpeg, but I've never used it, so I can't recommend one.
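For what it's worth, the binding I had in mind is probably the ffmpeg-python package; I haven't used it, so treat this as an untested sketch of how the same pipe might look with it:

import io
import ffmpeg  # from the ffmpeg-python package; still requires the ffmpeg binary
import librosa

# decode straight to stdout as WAV at 48 kHz, then read it from memory
out, _ = (
    ffmpeg
    .input(str(path))
    .output("pipe:", format="wav", ar=48000)
    .run(capture_stdout=True, capture_stderr=True)
)
signal, sample_rate = librosa.load(io.BytesIO(out), sr=48000)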
(feel free to correct me if I missed something; signal processing is fairly new to me)