ASR (Automatic Speech Recognition)


  • input: Raw audio data containing the speech to transcribe.


  • output: Transcribed text from the input audio.


  • source: The path to a model or the name of a pre-trained model.
  • device: The device to run the model on (e.g. “cpu” or “cuda”).
  • chunk_size: The size of each chunk of audio data to process.
  • left_context_size: The number of preceding chunks to use as context for each chunk.
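
To illustrate how chunk_size and left_context_size relate, here is a minimal sketch in plain Python. It is not the task's implementation: the function name iter_chunks_with_left_context is hypothetical, and it assumes, for simplicity, that chunk_size counts raw samples (the actual model may count in a different unit, such as encoder frames).

```python
from typing import Iterator, List, Sequence, Tuple


def iter_chunks_with_left_context(
    samples: Sequence[float], chunk_size: int, left_context_size: int
) -> Iterator[Tuple[List[float], List[float]]]:
    """Yield (left_context, chunk) pairs over a sequence of samples.

    Hypothetical helper: chunk_size is measured in samples here, which is
    an assumption made only to keep the illustration simple.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if left_context_size < 0:
        raise ValueError("left_context_size must be non-negative")
    for start in range(0, len(samples), chunk_size):
        # Left context: up to left_context_size whole chunks before this one.
        context_start = max(0, start - left_context_size * chunk_size)
        context = list(samples[context_start:start])
        # The final chunk may be shorter than chunk_size.
        chunk = list(samples[start:start + chunk_size])
        yield context, chunk


# Ten samples, chunks of four, one chunk of left context.
pairs = list(
    iter_chunks_with_left_context(list(range(10)), chunk_size=4, left_context_size=1)
)
for context, chunk in pairs:
    print("context:", context, "chunk:", chunk)
```

A larger left_context_size gives the model more history for each chunk at the cost of more computation per step, while a larger chunk_size increases latency because more audio must arrive before a chunk can be processed.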


Receives audio data and transcribes it.


This task transcribes the input audio in real time using a pre-trained ASR model from SpeechBrain. A high-performance device (e.g. a GPU) is recommended to run it efficiently.
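
The description above does not say which SpeechBrain interface the task wraps. As a rough sketch only, assuming it maps onto SpeechBrain's StreamingASR interface and DynChunkTrainConfig, the four parameters might be used as follows. The function name stream_transcript and the default values for chunk_size and left_context_size are illustrative assumptions, not values taken from this task.

```python
from typing import Iterator


def stream_transcript(
    audio_path: str,
    source: str,
    device: str = "cpu",
    chunk_size: int = 24,          # illustrative default, not from this task
    left_context_size: int = 4,    # illustrative default, not from this task
) -> Iterator[str]:
    """Yield transcribed text pieces for an audio file as chunks are decoded.

    Hypothetical wrapper: assumes the task is backed by SpeechBrain's
    StreamingASR interface, which the documentation does not confirm.
    """
    # Imported lazily so this module can be loaded without SpeechBrain installed.
    from speechbrain.inference.ASR import StreamingASR
    from speechbrain.utils.dynamic_chunk_training import DynChunkTrainConfig

    # source: a local path or the name of a pre-trained model.
    # device: passed through run_opts, e.g. "cpu" or "cuda".
    asr = StreamingASR.from_hparams(source=source, run_opts={"device": device})

    # chunk_size and left_context_size configure chunked streaming decoding.
    config = DynChunkTrainConfig(chunk_size, left_context_size)

    # Text is yielded incrementally as each chunk is processed.
    yield from asr.transcribe_file_streaming(audio_path, config)
```

Calling the function only creates a generator; the model is loaded when the first piece of text is requested, for example with `for text in stream_transcript("speech.wav", source="<model name or path>", device="cuda"): print(text, end="")`.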