Audio Studio
  • 19 Feb 2025

Overview

Dataloop offers an audio studio that allows audio annotations per timeframe with transcription capability. It enables annotators to quickly annotate audio files for various purposes. Machine learning tasks such as speech-to-text model training, noise detection model training, translation, and more can be performed using Dataloop Audio Studio.

Prerequisites
  • Supported audio formats: See the Supported audio formats section for the audio formats supported in the Audio Studio.
  • Task Prerequisites: Labeling work is performed in the context of a labeling task. Create your task with the intended data (audio files), the Recipe (labels, attributes, and labeling instructions), and the annotation team members.

Basic operations

The basic operations related to audio files include the following:

  • Play: Click the Play icon to run the audio file.
  • Pause: Click the Pause icon to pause the audio file.
  • Stop: Click the Stop icon to stop the audio file.
  • Rewind 5 seconds: Click the left round arrow to go back 5 seconds of the audio file.
  • Skip 5 seconds: Click the right round arrow to go forward 5 seconds of the audio file.
  • Playback speed: You can adjust the playback speed of the audio file without affecting its pitch. Click the current speed and select a value from the list: 0.5, 0.75, 1, 1.25, 1.5, or 2. The default is x1.00.
  • Classify: The Classify feature lets users make annotations on an audio file by using the position of their mouse pointer on the waveform (the visual representation of the sound). This means they can easily tag specific parts of the audio by pointing and clicking on the waveform.
  • Peak Amplitude (dB) (by default): Indicates the highest sound level within the selected segment (annotation). Select a single annotation to display the dB value. Click on the dropdown to display the Sound Meter option.
  • Sound Meter (dB): Provides a measurement (in dB) of the sound level at the cursor's current position in the audio track.
  • Audio Binding: You can toggle this option from the control bar by using the slider (on/off modes).
    • ON mode (by default): while playing, selecting an annotation from either the audio signal or the annotation list will set the playback position at the beginning of this annotation.
    • OFF mode: Removes this binding. Moving between annotations, either on the audio wave or in the annotation list, does not change the playback position.
  • Loop playback: This button enables the looping of a section of an audio item within the boundaries of the current selection range.
    • Loop back mode:
      • If an annotation is selected, it is played in a loop until it is paused or the loop playback mode is turned off.
      • If no annotation is selected, the audio will loop back when it reaches the end.
  • Duration (01:30:04.768 / 02:30:04.768): The Dataloop Audio Annotation Studio supports millisecond precision for annotation times. Times are displayed as hh:mm:ss.milliseconds; for example, "01:30:04.768" is 1 hour, 30 minutes, 4 seconds, and 768 milliseconds.
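
The hh:mm:ss.milliseconds format described above can be converted to and from a plain millisecond count, which is convenient when preparing or checking annotation times outside the studio. The following is a minimal sketch in plain Python. The function names are hypothetical and are not part of the Dataloop SDK, and the pattern assumes exactly three millisecond digits, as in the example above.

```python
import re

# Matches timestamps such as "01:30:04.768" (hours, minutes, seconds, milliseconds).
_TIMESTAMP = re.compile(r"^(\d+):([0-5]\d):([0-5]\d)\.(\d{3})$")


def timestamp_to_ms(timestamp: str) -> int:
    """Convert an 'hh:mm:ss.mmm' string to a total number of milliseconds."""
    match = _TIMESTAMP.match(timestamp)
    if match is None:
        raise ValueError(f"not an hh:mm:ss.mmm timestamp: {timestamp!r}")
    hours, minutes, seconds, millis = (int(part) for part in match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def ms_to_timestamp(total_ms: int) -> str:
    """Format a non-negative millisecond count as 'hh:mm:ss.mmm'."""
    if total_ms < 0:
        raise ValueError("total_ms must be non-negative")
    seconds, millis = divmod(total_ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"
```

Working with integer milliseconds avoids the rounding issues that can appear when times are stored as floating-point seconds.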

Label picker

The Label Picker is a feature of the annotation studio that allows you to select labels to assign to specific objects or elements in the data being annotated. The available labels are determined in the Recipe. You can perform the following activities in the Label Picker section:

  • Scroll and click a label to activate it.
  • Use the search bar to easily find labels.
  • Resize the label list to better fit your number of labels by clicking and dragging the separator line.
  • Use the Shortcut keys to navigate between labels.

Supported audio formats

The following audio file formats are supported in the Audio Studio:

  • WAV
  • MP3
  • OGG
  • FLAC
  • M4A
  • AAC

Annotation type - subtitle type

Audio Studio annotations are of the Subtitle type, which enables adding a transcription to the classification annotation. Refer to the Audio Item section to classify an audio file using the SDK.


Audio classification tool

Audio classification is the only annotation tool supported in the audio studio, and it allows you to classify the audio segments with the annotation labels available in the recipe.

Classify

The Classify feature lets you create annotations based on the position of your cursor on the audio waveform. Here's how you can use it:

Create Annotations and Transcriptions: Position your cursor on the desired point of the audio file to create annotations.

For example:

  • If you select the audio position at 15 seconds in a 30-second audio file and click Classify, an annotation will be created from the beginning to the 15-second mark.
  • If there is an existing annotation from 5 seconds to 10 seconds, the new annotation will be created from 10 seconds to 15 seconds.
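
The two examples above suggest a simple rule: the new annotation ends at the cursor position and starts at the end of the closest preceding annotation, or at the beginning of the file if there is none. The sketch below expresses that rule in plain Python. The function name and the (start, end) tuple representation are assumptions made for illustration, not the studio's actual implementation, and the case where the cursor falls inside an existing annotation is not described in this article and is not handled here.

```python
def classify_span(cursor_seconds, existing_annotations):
    """Return the (start, end) span that Classify would create, per the examples above.

    existing_annotations is a list of (start, end) tuples in seconds.
    Assumption: only annotations ending at or before the cursor are considered.
    """
    ends_before_cursor = [end for _start, end in existing_annotations if end <= cursor_seconds]
    # Start at the latest preceding annotation end, or at 0 if nothing precedes the cursor.
    start = max(ends_before_cursor, default=0.0)
    return (start, cursor_seconds)


# No existing annotations: the span runs from the beginning to the cursor.
first = classify_span(15.0, [])
# One existing annotation from 5 s to 10 s: the span starts where it ends.
second = classify_span(15.0, [(5.0, 10.0)])
```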

Transcription mode

Clicking on the toggle button for Transcription mode allows you to hide the left and right panels to have a clearer and larger view of the Audio Studio.


Transcription list

In the audio annotation studio, transcription is the process of converting spoken language in an audio file into written text. Audio Studio supports transcription tasks done by human labelers, or pre-annotations uploaded with transcription.

The Transcription table provides a list of annotations along with their transcriptions. Each transcription is linked to a specific annotation and its corresponding time frame. The table includes the following details:

  • Annotation's Label Name and ID: Identifiers for each annotation.
  • Audio Control: Play or Pause option for the audio segment.
  • Segment Start and End Times: Indicates the start and end times of a segment within the audio file. For example, "01:30:04.768" indicates that the segment begins at 1 hour, 30 minutes, and 4.768 seconds into the audio file.
  • Total Duration: The length of the segment within the audio file.
  • Annotation Info and Note Icons: Icons providing additional information and notes about the annotation.
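
The Total Duration value is the difference between the segment's end time and start time. As a rough illustration, the sketch below computes it with Python's standard datetime.timedelta. The helper names are hypothetical, and the end time used in the example is made up for demonstration.

```python
from datetime import timedelta


def parse_segment_time(timestamp: str) -> timedelta:
    """Parse an 'hh:mm:ss.mmm' string into a timedelta using integer fields only."""
    hours, minutes, rest = timestamp.split(":")
    seconds, millis = rest.split(".")
    return timedelta(hours=int(hours), minutes=int(minutes),
                     seconds=int(seconds), milliseconds=int(millis))


def segment_duration(start: str, end: str) -> timedelta:
    """Return the length of a segment; the end must not precede the start."""
    duration = parse_segment_time(end) - parse_segment_time(start)
    if duration < timedelta(0):
        raise ValueError("segment end time is before its start time")
    return duration


# Example: a segment starting at 01:30:04.768 and ending (hypothetically) at 01:30:10.268.
example_duration = segment_duration("01:30:04.768", "01:30:10.268")
```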

Speaker name

The speaker name identifies the person speaking in the audio as accurately as possible. It is linked to the label at the item level and does not carry over to other items.

For example, if the task is transcribing a podcast and the labels are interviewer or guest, the annotator can add the name of the speaker once, and it will be relevant at that item level.

Note: The speaker name is optional. Whether to use it depends on the task instructions.


Create audio annotations

  1. Play the audio file to listen to the recording.
  2. Select the desired label.
  3. Identify the desired time frame (start and end points) to be annotated.
  4. Drag the mouse on top of the waveform from the start point to the end point. The selected timeframe changes color to match the color of the label.

Note: The new annotation is listed in both the annotation list and the transcription list.


Add annotations with transcription

  1. Create an annotation.
  2. Click on the Transcription field.
  3. Type the audio transcription in the Transcription field of the annotation.
  4. Press the Enter or Return key to save it. The transcription is displayed in the transcription column of the annotations list. The transcription saves the characters exactly as typed, allowing high accuracy in translation and speech-to-text tasks.

Note: To edit, click on the transcription at any time.

Using classify

  1. Select the interval to annotate on the time frame.
  2. Click Classify. The Transcription text box is enabled.
  3. Enter the transcription and press the Enter key to save the changes.

Using classify while playing the audio

  1. Start playing the audio file.
  2. Once you reach the selected point to annotate in the time frame, click Classify. The Transcription text box is enabled.
  3. Enter the transcription and press the Enter key to save the changes.

Edit annotations' start/end time

You can edit an annotation's start time and end time in any of the following ways:

  • Use the + or - icons to adjust the end or start time.
  • Drag the annotation onto the waveform.
  • Use the pin icon to set a new start time or end time for the related annotation.

Split an annotation's transcription

Using this transcription feature, you can split annotations.

Place the mouse cursor between the transcription words and press the Enter or Return key. A new annotation is created.


Merge an annotation's transcription

Using this transcription feature, you can merge annotations.

Place your mouse cursor at the beginning of the transcription and press the Backspace or Delete key. The annotation will be merged with the previous annotation.


Find and replace transcriptions

Use the Transcription Finder function to find and replace transcription texts. Follow these steps to use the Transcription Finder function:

  1. Open the Audio Annotation Studio.
  2. Press the F key or click on the 🔍 icon from the top-bar to open the Transcription Finder popup.
  3. Input the desired text in the Find What field to find it within the transcriptions.
  4. To replace the text with another, input the new text in the Replace With field.
  5. (Optional) Choose additional options as necessary:
    1. Match case
    2. Find whole word only
  6. Click Find Text to highlight matching texts in the transcription.
  7. If replacement is needed, click either Replace or Replace All. A confirmation message will be displayed.
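
The same kind of find and replace can be applied to transcription text outside the studio, for example when cleaning up pre-annotations before upload. The following is a minimal sketch using Python's standard re module. The function and its parameters are hypothetical and only mirror the Match case and Find whole word only options described above; this is not how the Transcription Finder is implemented.

```python
import re


def replace_in_transcriptions(transcriptions, find_what, replace_with,
                              match_case=False, whole_word=False):
    """Return (updated_texts, replacement_count) for a list of transcription strings."""
    if not find_what:
        raise ValueError("find_what must not be empty")
    pattern = re.escape(find_what)  # treat the search text literally, not as a regex
    if whole_word:
        # \b marks a word boundary; this assumes find_what starts and ends with word characters.
        pattern = rf"\b{pattern}\b"
    flags = 0 if match_case else re.IGNORECASE
    regex = re.compile(pattern, flags)

    updated, total = [], 0
    for text in transcriptions:
        # A function replacement keeps backslashes in replace_with literal.
        new_text, count = regex.subn(lambda _match: replace_with, text)
        updated.append(new_text)
        total += count
    return updated, total


texts = ["The cat sat.", "Concatenate the CAT files."]
whole_word_result = replace_in_transcriptions(texts, "cat", "dog", whole_word=True)
```

With whole_word enabled, the "cat" inside "Concatenate" is left untouched; with match_case enabled as well, the uppercase "CAT" is also skipped.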

Remove transcriptions

  1. Identify the annotation from the list and click on the transcription. The Transcription text box is enabled.
  2. Delete the text in the transcription box and press the Enter key to save the changes.

Set the speaker's name

  1. Select the annotation from the annotations list.
  2. Click on the Pencil icon. A pop-up window is displayed.
  3. Select the Speaker Name tab.
  4. Enter the name of the speaker.
  5. Click Save Changes.

Create overlapping annotations

  1. Identify the timeframe on the existing annotation where you need to create a new annotation.
  2. Hold the Shift key and drag the mouse to create an annotation on top of another annotation.

Zoom annotations interval

There are two options to zoom an annotation interval.

Zoom in via the mouse

  1. Click the relevant annotated interval on the time frame.
  2. Use the mouse's scroll wheel to zoom in on the interval.

Zoom in at once via a double click

  1. Click the relevant annotated interval on the time frame.
  2. Double-click the mouse to zoom in on the interval.

Zoom out

  1. Click the relevant annotated interval on the time frame.
  2. Use the mouse's scroll wheel to zoom out of the interval, or use the Fit to screen icon to return to the original size.


Modify annotations' ID

  1. Click on the annotation label. The ID field becomes editable.
  2. Modify the value and press the Enter key.

Annotations tab

The Annotations tab on the right panel allows you to control and manage annotations using the annotations list and the attribute controls, particularly when attributes are configured in the Recipe.

Learn more about the annotations and actions available.


Item tab

The Item tab displays information according to the type of the selected item.

Learn more about the item and actions available.


Item info & controls (top-panel)

The available Item Info & Controls depend on the type of annotation studio. For detailed information, refer to the following articles.


Item's view controls (bottom-panel)

The controls displayed on the bottom panel depend on the workflow context and the view controls.

Workflow Context

The workflow context provides assignment controls, including moving between items, displaying the item gallery, and the status buttons (Complete / Discard). These controls are displayed only while working on an annotation or QA task.

  • Browse between the assignment items using the Left and Right arrows.
  • Open the Thumbnails' gallery viewer, and click a thumbnail to open that item.
  • Save button: When the button is enabled, click it to save your changes to the Dataloop platform immediately, without waiting for auto-save.
  • Status buttons: Complete and Discard.

View Controls

From left to right, work controls at the bottom panel.


Keyboard shortcuts

General Shortcuts

Action | Keyboard Shortcuts
Save | S
Delete | Delete
Undo | Ctrl + Z
Redo | Ctrl + Y
Zoom In/Out | Scroll
Change Brightness | Vertical Arrow + M
Change Contrast | Vertical Arrow + R
Pan | Ctrl + Drag
Search Labels | Shift + L
Search a label | Shift + 1-9 (for the sub-labels, use the Tab key)
Navigate in label picker | Up and Down arrows
Select label in label picker | Enter
Tool Selection | 0-9
Move selected annotations | Shift + Arrow Keys
Previous Item | Left Arrow
Next Item | Right Arrow
Add Item Description | T
Mark Item as Done | Shift + F
Mark Item as Discarded | Shift + G
Enable Cross Grid Tool Helper | Alt + G
Hold G to show Cross Grid Measurements | G
Go to annotation list | Shift + ;
Navigate in annotation list | Up and Down arrows
Select/deselect an annotation | Space
Hide/Show Selected Annotation | H
Hide/Show All Annotation | J
Show Unmasked Pixels | Ctrl + M
Hide/Show Annotation Controllers | C
Set Object ID menu | O
Toggle pixel measurement | P
Use tool creation mode | Hold Shift
Copy annotations from previous item | Shift + V

Annotation Tool - Audio

Action | Keyboard Shortcuts
Toggle Play/Pause | Space
Previous Annotation | <
Next Annotation | >
Decrease Play Speed | [
Increase Play Speed | ]
Stop | Backspace
Activate zoom button | -
Fit to screen | =
Toggle Audio Binding | /
Jump 20 seconds backward | Z
Jump 10 seconds backward | X
Jump 5 seconds backward | C
Jump 5 seconds forward | V
Jump 10 seconds forward | B
Jump 20 seconds forward | N
Set current transcription start time | \
Set current transcription end time | "
Edit selected annotation transcription | Ctrl + A
Save transcription | Ctrl + S
Open Search Modal (Transcription Finder) | F



