Audio Transcription

Created by Johannes Eberhard, modified on Mon, 2 Jun 2025 at 3:57 PM by Johannes Eberhard

POST /public-voice/audio/transcriptions
Transcribes audio files to text (speech-to-text).

Parameters

File (required): Audio file (MP3, WAV, etc.)
Prompt: Optional context text to improve accuracy
Response Format: Output format (default: json)
Timestamp Granularities: Timestamps per word or segment (e.g. ["word"])
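
The optional parameters above are sent as multipart form fields. The following is a minimal Python sketch of assembling them. Only `response_format` is confirmed by the example request on this page; the field names `prompt` and `timestamp_granularities[]` are assumptions borrowed from the OpenAI-style convention and may differ on your instance. The helper name `build_form_fields` is hypothetical.

```python
def build_form_fields(prompt=None, response_format="json", timestamp_granularities=None):
    """Return (name, value) pairs for the optional form fields.

    ASSUMPTION: the names `prompt` and `timestamp_granularities[]` are not
    confirmed by this page; check your instance's API schema before use.
    """
    fields = [("response_format", response_format)]
    if prompt:
        fields.append(("prompt", prompt))
    # One form field per requested granularity, e.g. "word" or "segment".
    for granularity in timestamp_granularities or []:
        fields.append(("timestamp_granularities[]", granularity))
    return fields
```

For instance, `build_form_fields(prompt="Localmind", timestamp_granularities=["word"])` yields three pairs that can be passed to any multipart encoder.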

Example Request

curl -X 'POST' \
  'https://IHRE_INSTANZ.localmind.dev/localmind/public-voice/audio/transcriptions' \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -F 'file=@Aufzeichnung.m4a;type=audio/x-m4a' \
  -F 'language=' \
  -F 'response_format=json' \
  -F 'temperature=0'
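
The same request can be sketched in Python using only the standard library. This is an illustration under assumptions, not an official client: the host is the placeholder from the curl example, the function names `encode_multipart` and `transcribe` are hypothetical, and no authentication header is sent because the curl example shows none.

```python
import json
import os
import urllib.request
import uuid

# Placeholder host copied from the curl example; replace it with your instance.
URL = "https://IHRE_INSTANZ.localmind.dev/localmind/public-voice/audio/transcriptions"


def encode_multipart(fields, file_name, file_bytes, file_type):
    """Encode text fields plus one file part as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields:
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f"\r\n\r\n{value}\r\n".encode("utf-8")
        )
    # The file part uses the field name "file", as in the curl example.
    header = (
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
        f'filename="{file_name}"\r\nContent-Type: {file_type}\r\n\r\n'
    ).encode("utf-8")
    parts.append(header + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path, url=URL):
    """POST the audio file and return the decoded JSON response."""
    with open(path, "rb") as audio:
        body, content_type = encode_multipart(
            [("language", ""), ("response_format", "json"), ("temperature", "0")],
            os.path.basename(path),
            audio.read(),
            "audio/x-m4a",
        )
    request = urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Accept": "application/json", "Content-Type": content_type},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `transcribe("Aufzeichnung.m4a")` would send the same fields as the curl command and return the parsed response as a dictionary.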

Example Response

{
  "text": "Das ist eine Testaufnahme.",
  "language": null,
  "task": "transcribe",
  "duration": 2.12,
  "words": null,
  "segments": [
    {
      "id": 0,
      "avg_logprob": null,
      "compression_ratio": null,
      "end": 2.12,
      "no_speech_prob": null,
      "seek": null,
      "start": 0,
      "temperature": null,
      "text": "Das ist eine Testaufnahme.",
      "tokens": null
    }
  ]
}
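
A response of this shape can be read with a few lines of Python. The sketch below assumes only the fields shown in the example (`text`, `segments`, `start`, `end`) and tolerates a `null` or missing `segments` value; the function name `format_segments` is hypothetical.

```python
import json


def format_segments(response):
    """Return one '[start - end] text' line per segment."""
    lines = []
    # `segments` may be null or absent, so fall back to an empty list.
    for segment in response.get("segments") or []:
        lines.append(
            f"[{segment['start']:.2f}s - {segment['end']:.2f}s] {segment['text']}"
        )
    return lines


# Trimmed copy of the example response above.
sample = json.loads("""
{
  "text": "Das ist eine Testaufnahme.",
  "duration": 2.12,
  "words": null,
  "segments": [
    {"id": 0, "start": 0, "end": 2.12, "text": "Das ist eine Testaufnahme."}
  ]
}
""")

for line in format_segments(sample):
    print(line)
```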