speech_recognition and gTTS don't understand numbers below 11
Posted: 03 Jan 2025, 18:16
- I put together some simple code that asks the user to choose between option 1, oranges, and option 2, pears:
No matter what I put in the tuples below, the speech recognizer does not recognize the numbers "one" and "two". Only the non-numeric options ("number one", "pears", etc.) are recognized correctly.
Code:
options = {
    (1, "1", "one", "number one", "oranges", "orange", "orange's", "oranges'"): 1,
    (2, "2", "two", "number two", "pears", "pear", "pear's", "pears'", "pier"): 2
}
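One way to rule out the matching step as the culprit is to flatten the tuple keys into a plain phrase-to-option table and normalize the recognized text before the lookup. This is only a minimal sketch: `normalize`, `build_lookup` and `match_choice` are hypothetical helper names that are not part of my original code, and it assumes the recognizer might return text with stray whitespace, capitals or trailing punctuation.

```python
import string

OPTIONS = {
    (1, "1", "one", "number one", "oranges", "orange", "orange's", "oranges'"): 1,
    (2, "2", "two", "number two", "pears", "pear", "pear's", "pears'", "pier"): 2,
}


def normalize(text):
    """Lowercase, trim, and drop trailing punctuation (hypothetical helper)."""
    return str(text).strip().lower().rstrip(string.punctuation).strip()


def build_lookup(options):
    """Flatten {tuple_of_phrases: value} into {normalized_phrase: value}."""
    lookup = {}
    for phrases, value in options.items():
        for phrase in phrases:
            lookup[normalize(phrase)] = value
    return lookup


def match_choice(text, lookup):
    """Return the option for the recognized text, or None if nothing matches."""
    return lookup.get(normalize(text))


lookup = build_lookup(OPTIONS)
print(match_choice(" One. ", lookup))
print(match_choice("apples", lookup))
```

If "one" still fails with this lookup, the problem is more likely in what the recognizer returns than in the dictionary itself.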
- The second problem is the strangest. The speech recognizer does not understand any number below 11 unless you say it followed by ".0" ("three point zero"). It understands "10" only if you say "one zero". From 11 upward, it understands numbers as you say them:
Code:
def convert_to_number(text):
    number_words = {
        "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
        "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
        "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
        "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
        "nineteen": 19, "twenty": 20
    }
    # Digit strings such as "3.0" parse directly; anything else falls back to the word table
    try:
        return float(text)
    except ValueError:
        return number_words.get(text, None)
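Since "one zero" and "three point zero" get through but "ten" and "three" do not, a more forgiving converter might help narrow things down. The following is a sketch under assumptions, not a confirmed fix: it assumes the recognized text may arrive as digits, as a single number word, or as digit words spoken one at a time. Unlike the function above, it always returns a `float` (or `None`).

```python
NUMBER_WORDS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19, "twenty": 20,
}


def convert_to_number(text):
    """Convert recognized text to a float, or return None if it is not a number."""
    cleaned = text.strip().lower()

    # Case 1: the text is already numeric, e.g. "3.0" or "12"
    try:
        return float(cleaned)
    except ValueError:
        pass

    # Case 2: a single number word, e.g. "three" or "ten"
    if cleaned in NUMBER_WORDS:
        return float(NUMBER_WORDS[cleaned])

    # Case 3: digits spoken one at a time, e.g. "one zero" -> 10
    digits = []
    for word in cleaned.split():
        value = NUMBER_WORDS.get(word)
        if value is None and word.isdigit():
            value = int(word)
        if value is None or value > 9:
            return None  # not a single digit, so give up
        digits.append(str(value))
    return float("".join(digits)) if digits else None


print(convert_to_number("Ten"))
print(convert_to_number("one zero"))
```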
- Output:
Output window: testing what the speech recognizer understands
- These are the steps I followed to try to fix the errors described above:
a. I first used the pyttsx3 and speech_recognition libraries, then replaced pyttsx3 with gtts and pydub. The faulty behavior was the same with both sets of libraries.
b. I added an en-GB locale; that had no effect either.
c. I asked a native British speaker to say the options; that made no difference either.
d. Everything is configured properly: ffmpeg, microphone, etc.
- This is the full code:
Code:
from gtts import gTTS
import speech_recognition as sr
from pydub import AudioSegment
from pydub.playback import play
import os

# Set the path to the ffmpeg executable
os.environ["PATH"] += os.pathsep + "C:/ffmpeg/bin"

# Initialize STT recognizer
recognizer = sr.Recognizer()


def speak(text):
    tts = gTTS(text=text, lang='en-GB')
    tts.save("temp.mp3")
    sound = AudioSegment.from_mp3("temp.mp3")
    play(sound)
    os.remove("temp.mp3")


def listen():
    with sr.Microphone() as source:
        print("Listening...")
        audio = recognizer.listen(source)
        try:
            text = recognizer.recognize_google(audio, language='en-GB')
            print(f"You said: {text}")
            return text.lower()
        except sr.UnknownValueError:
            print("Sorry, I did not understand that.")
            speak("Sorry, I did not understand that.")
            return None
        except sr.RequestError:
            print("Sorry, my speech service is down.")
            speak("Sorry, my speech service is down.")
            return None


def convert_to_number(text):
    number_words = {
        # BUG: #3 All numbers below 11 are not recognized by the speech recognizer
        # BUG: #4 10 is only recognized if one says "one zero" instead of "ten"
        "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
        "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
        "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
        "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
        "nineteen": 19, "twenty": 20
    }
    try:
        return float(text)
    except ValueError:
        return number_words.get(text, None)


def get_choice(options):
    while True:
        choice = listen()
        if choice is not None:
            for key, value in options.items():
                if choice in key:
                    print(f"Recognized choice: {value}")
                    return value
            print("Invalid input. Please say a valid option.")
            speak("Invalid input. Please say a valid option.")


def get_quantity():
    while True:
        quantity = listen()
        if quantity is not None:
            quantity = convert_to_number(quantity)
            if quantity is not None and quantity > 0:
                print(f"Recognized quantity: {quantity}")
                return quantity
            else:
                print("Please enter a positive number.")
                speak("Please enter a positive number.")


item1 = "Oranges"
item1_price = 0.75
item2 = "Pears"
item2_price = 1.25
vat_tax = 0.20

options = {
    # BUG: #2 No matter what I write here the numbers "one" and "two" are not recognized by the speech recognizer
    # BUG: #1 Only non-numeric options are recognized correctly
    (1, "1", "one", "number one", "oranges", "orange", "orange's", "oranges'"): 1,
    (2, "2", "two", "number two", "pears", "pear", "pear's", "pears'", "pier"): 2
}

while True:
    speak("- What would you like to taste today, guvnor?\n"
          f" 1. Our fresh {item1}, for £{item1_price} each?\n"
          f" 2. Or, our delicious {item2}, for £{item2_price} each?\n")
    print("- What would you like to taste today, guvnor?\n"
          f" 1. Our fresh {item1}, for £{item1_price} each?\n"
          f" 2. Or, our delicious {item2}, for £{item2_price} each?\n")
    buyer_choice = get_choice(options)

    if buyer_choice == 1:
        speak(f"\n- And how many {item1} for the lady?\n")
        print(f"\n- And how many {item1} for the lady?\n")
        buyer_quant = get_quantity()
        sub_total = (item1_price * buyer_quant)
        vat_total = (sub_total * vat_tax)
        total = sub_total + vat_total
        speak(f"\n- That will be {buyer_quant:,.0f} {item1} for only £{sub_total:,.2f}.\n"
              f" Plus £{vat_total:,.2f} of V.A.T., total is £{total:,.2f}.\n"
              " Thanks for your custom!\n")
        print(f"\n- That will be {buyer_quant:,.0f} {item1} for only £{sub_total:,.2f}.\n"
              f" Plus £{vat_total:,.2f} of V.A.T., total is £{total:,.2f}.\n"
              " Thanks for your custom!\n")
        break
    elif buyer_choice == 2:
        speak(f"\n- And how many {item2} for the lady?\n")
        print(f"\n- And how many {item2} for the lady?\n")
        buyer_quant = get_quantity()
        sub_total = (item2_price * buyer_quant)
        vat_total = (sub_total * vat_tax)
        total = sub_total + vat_total
        speak(f"\n- That will be {buyer_quant:,.0f} {item2} for only £{sub_total:,.2f}.\n"
              f" Plus £{vat_total:,.2f} of V.A.T., total is £{total:,.2f}.\n"
              " Thanks for your custom!\n")
        print(f"\n- That will be {buyer_quant:,.0f} {item2} for only £{sub_total:,.2f}.\n"
              f" Plus £{vat_total:,.2f} of V.A.T., total is £{total:,.2f}.\n"
              " Thanks for your custom!\n")
        break
    else:
        speak("\n- We just ran out of that, sorry. Please choose a valid option.\n")
        print("\n- We just ran out of that, sorry. Please choose a valid option.\n")
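To see exactly what comes back when someone says "one" or "ten", it might help to look at every candidate transcript instead of only the top one. `recognize_google` accepts `show_all=True`, which returns the raw response rather than a single string. The sketch below assumes that response is a dict with an `alternative` list of `{"transcript": ...}` entries, and an empty list when nothing was recognized. `transcripts_from_response` and `first_matching` are hypothetical helpers, and the sample response is made up purely for illustration, not real recognizer output.

```python
def transcripts_from_response(response):
    """Return every candidate transcript, lowercased, in the order received.

    Assumes the raw response looks like
    {"alternative": [{"transcript": "...", "confidence": 0.9}, ...], "final": True}
    and is an empty list when nothing was recognized.
    """
    if not isinstance(response, dict):
        return []
    return [
        alt["transcript"].strip().lower()
        for alt in response.get("alternative", [])
        if "transcript" in alt
    ]


def first_matching(candidates, accepted):
    """Return the first candidate that is in the accepted set, else None."""
    for candidate in candidates:
        if candidate in accepted:
            return candidate
    return None


# Made-up sample for illustration only; a real call would be:
# response = recognizer.recognize_google(audio, language='en-GB', show_all=True)
sample = {
    "alternative": [
        {"transcript": "Won", "confidence": 0.8},
        {"transcript": "one"},
    ],
    "final": True,
}

candidates = transcripts_from_response(sample)
print(candidates)
print(first_matching(candidates, {"1", "one", "number one"}))
```

Printing the full candidate list for each failed attempt would show whether the short numbers come back as something unexpected or do not come back at all.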