Why does Whisper return nothing for an audio file?

by Anonymous » Yesterday, 02:54

I'm working with Whisper in a .NET MAUI project. To keep compile times short, I'm targeting Windows while I work on it.

Code:

    private static WhisperFactory? _factory;
    private static WhisperProcessor? _processor;
    private static string DBName = "ggml-large-v1.bin";
    private static string dbPath = string.Empty;

    public static void WhisperHandlerStartup(string modelPath)
    {
        // Check whether the Whisper model file already exists locally
        dbPath = Path.Combine(FileSystem.AppDataDirectory, DBName);
        if (!File.Exists(dbPath))
        {
            Debug.WriteLine($"Model file {dbPath} does not exist. Downloading...");
            DownloadModel(DBName, GgmlType.LargeV1).GetAwaiter().GetResult();
        }
        else
        {
            Debug.WriteLine($"Model file {dbPath} already exists.");
        }
        try
        {
            _factory = WhisperFactory.FromPath(dbPath);
            _processor = _factory.CreateBuilder()
                .WithLanguage("auto")
                .Build();
        }
        catch (Exception ex)
        {
            Debug.WriteLine($"Error initializing WhisperHandler: {ex.Message}");
            throw;
        }
    }

    public static async Task TestMp3ToWhisper()
    {
        string mp3FilePath = @"D:\Test DATA\recording.m4a";

        try
        {
            // Read all bytes from the MP3/M4A file
            byte[] mp3Bytes = File.ReadAllBytes(mp3FilePath);

            // Create a MemoryStream from the byte array
            using (MemoryStream audioStream = new MemoryStream(mp3Bytes))
            {
                Debug.WriteLine($"Audio file loaded into MemoryStream. Length: {audioStream.Length} bytes.");

                // Example: You could reset the position to the beginning if needed for reading
                audioStream.Position = 0;

                var test = audioStream.Length;
                var floatSamples = ConvertPcm16ToFloat(audioStream);  // Whisper can only process PCM, convert here

                var res = _processor.ProcessAsync(floatSamples);

                string returnResult = string.Empty;
                // ProcessAsync returns an IAsyncEnumerable
                await foreach (var result in res)
                {
                    returnResult += result.Text;
                }
                Debug.WriteLine(res.ToString());
                Debug.WriteLine("Processing complete.");
            }
        }
        catch (FileNotFoundException)
        {
            Console.WriteLine($"Error: audio file not found at '{mp3FilePath}'");
        }
        catch (Exception ex)
        {
            Console.WriteLine($"An error occurred: {ex.Message}");
        }
    }

    /// <summary>
    /// Converts a PCM 16-bit mono MemoryStream to a float array normalized between -1.0 and 1.0.
    /// </summary>
    private static float[] ConvertPcm16ToFloat(Stream pcmStream)
    {
        pcmStream.Seek(0, SeekOrigin.Begin);
        using var br = new BinaryReader(pcmStream, System.Text.Encoding.Default, leaveOpen: true);
        var sampleCount = (int)(pcmStream.Length / 2);
        var floatSamples = new float[sampleCount];
        for (int i = 0; i < sampleCount; i++)
        {
            short sample = br.ReadInt16();
            floatSamples[i] = sample / 32768f;
        }
        return floatSamples;
    }
The file loads and converts to PCM fine, but when I pass it to Whisper I get nothing back, and there are no errors.
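
One possible explanation, offered as a sketch and not a confirmed diagnosis: an .m4a file normally holds compressed AAC audio in an MP4 container, so reading its raw bytes as 16-bit PCM yields noise rather than speech, and Whisper may then emit no segments without throwing. Separately, `Debug.WriteLine(res.ToString())` logs the enumerable object and not the accumulated `returnResult`. The sketch below assumes the NAudio NuGet package is installed and that the app runs on Windows (it relies on Media Foundation); `WhisperAudioSketch`, `DecodeTo16kMonoWav` and `TranscribeAsync` are hypothetical names. It decodes the file to a 16 kHz mono 16-bit WAV in memory, hands that stream to `ProcessAsync`, and logs the collected text.

```csharp
using System.Diagnostics;
using System.IO;
using System.Text;
using System.Threading.Tasks;
using NAudio.Wave;                  // assumption: NAudio NuGet package is referenced
using NAudio.Wave.SampleProviders;
using Whisper.net;

public static class WhisperAudioSketch
{
    // Hypothetical helper: decodes a compressed file (e.g. .m4a) via Media Foundation
    // (Windows only) and returns a 16 kHz, mono, 16-bit PCM WAV held in memory.
    public static MemoryStream DecodeTo16kMonoWav(string path)
    {
        using var reader = new MediaFoundationReader(path);

        // Convert to float samples, downmix stereo to mono, then resample to 16 kHz.
        ISampleProvider mono = reader.ToSampleProvider().ToMono();
        var resampled = new WdlResamplingSampleProvider(mono, 16000);

        var wavStream = new MemoryStream();
        WaveFileWriter.WriteWavFileToStream(wavStream, resampled.ToWaveProvider16());
        wavStream.Position = 0; // rewind so Whisper reads from the WAV header
        return wavStream;
    }

    // Hypothetical helper: runs an already-built processor over the decoded audio
    // and returns the concatenated segment text.
    public static async Task<string> TranscribeAsync(WhisperProcessor processor, string path)
    {
        using var wav = DecodeTo16kMonoWav(path);

        var text = new StringBuilder();
        await foreach (var segment in processor.ProcessAsync(wav))
        {
            text.Append(segment.Text);
        }

        string transcript = text.ToString();
        Debug.WriteLine(transcript); // log the text itself, not the enumerable
        return transcript;
    }
}
```

Under these assumptions, calling `await WhisperAudioSketch.TranscribeAsync(_processor, mp3FilePath)` would replace the `File.ReadAllBytes` plus `ConvertPcm16ToFloat` path in the code above. If the recording is already a 16 kHz mono 16-bit WAV, the decoding step is unnecessary and the file stream can be passed to `ProcessAsync` directly.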
