Audio JSON Format
  • 28 Feb 2024


Dataloop's Audio transcription studio lets you create and edit audio transcriptions. The annotation JSON format is similar to the Video annotation format, in that annotations also span time, but it is simpler because there are no spatial annotation coordinates.

Format Details

{
  "annotations": [
    {
      "id": "632894ed22e1200334d39638",
      "datasetId": "6281eab332cd64b1c004319a",
      "itemId": "6282247746a76b5e12ddb4fe",
      "url": "",
      "item": "",
      "dataset": "",
      "type": "subtitle",
      "label": "Relaxed",
      "attributes": [],
      "metadata": {
        "system": {
          "attributes": {
            "1": 0.7,
            "2": "False"
          },
          "openAnnotationVersion": "1.48.2-rc.67",
          "recipeId": "6281eab46551fa03a3fa9ae8"
        },
        "user": {
          "karaokeData": [
            {
              "confidence": 0.38077378,
              "endTime": 2.88,
              "startTime": 0.02,
              "text": "I"
            },
            {
              "confidence": 0.47150243,
              "endTime": 4.65,
              "startTime": 2.88,
              "text": "Am"
            },
            {
              "confidence": 0.96968557,
              "endTime": 6.997,
              "startTime": 4.86,
              "text": "the"
            },
            {
              "confidence": 0.39305196,
              "endTime": 8.2,
              "startTime": 6.998,
              "text": "annotation"
            },
            {
              "confidence": 0.70361846,
              "endTime": 10.12,
              "startTime": 8.251,
              "text": "Transcription"
            }
          ]
        }
      },
      "creator": "",
      "createdAt": "2022-09-19T16:12:29.879Z",
      "updatedBy": "",
      "updatedAt": "2022-09-19T16:12:29.879Z",
      "hash": "73cfecbe0e9cf8e562ec18b3c058d40ddc400372",
      "source": "ui",
      "coordinates": {
        "text": "I am the annotation Transcription"
      }
    }
  ]
}
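As a rough illustration of how this structure can be read, the sketch below parses a trimmed copy of the sample with Python's standard `json` module and collects the word-level timings. It assumes that `karaokeData` is nested under `metadata.user`; the helper names `subtitle_annotations` and `word_timings` are hypothetical and are not part of the Dataloop SDK.

```python
import json

# Trimmed payload modeled on the sample above. The nesting of "karaokeData"
# under metadata.user is an assumption made for this sketch.
PAYLOAD = """
{
  "annotations": [
    {
      "type": "subtitle",
      "label": "Relaxed",
      "metadata": {
        "user": {
          "karaokeData": [
            {"confidence": 0.38077378, "endTime": 2.88, "startTime": 0.02, "text": "I"},
            {"confidence": 0.47150243, "endTime": 4.65, "startTime": 2.88, "text": "Am"},
            {"confidence": 0.96968557, "endTime": 6.997, "startTime": 4.86, "text": "the"}
          ]
        }
      },
      "coordinates": {"text": "I am the"}
    }
  ]
}
"""


def subtitle_annotations(payload):
    """Return only the annotations whose type is 'subtitle' (hypothetical helper)."""
    return [a for a in payload.get("annotations", []) if a.get("type") == "subtitle"]


def word_timings(annotation):
    """Return (text, startTime, endTime, confidence) tuples for one annotation."""
    words = annotation.get("metadata", {}).get("user", {}).get("karaokeData", [])
    return [(w["text"], w["startTime"], w["endTime"], w["confidence"]) for w in words]


payload = json.loads(PAYLOAD)
subtitles = subtitle_annotations(payload)
timings = word_timings(subtitles[0])

# Rebuild a transcript string from the individual word entries.
transcript = " ".join(text for text, _, _, _ in timings)
```

Using `.get` with empty defaults keeps the helpers tolerant of annotations that carry no word-level timing at all, in which case `word_timings` simply returns an empty list.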

Dictionary Table

| Key Name | Definition | Parent Key |
|---|---|---|
| annotations | List of annotations | N/A |
| id | Annotation ID | annotations |
| datasetId | Dataset ID | annotations |
| type | Annotation type; 'subtitle' for audio transcription | annotations |
| label | The annotation's label/class | annotations |
| metadata | Holds all of the annotation information | annotations |
| system | Holds all of the annotation's system information | metadata |
| isOnlyLocal | A field used in the UI to determine whether the annotation is ready to be saved (False means ready to be saved) | system |
| system | True: the system created this specific annotation. False: the annotation was created in a different way | system |
| openAnnotationVersion | Product version | system |
| recipeId | ID of the recipe used in this task | system |
| user | Metadata that a user can add via the SDK; also used for storing word-level timing information, which is imported into the platform and is therefore considered user information | metadata |
| confidence | Transcription confidence, as generated by the source model | karaokeData |
| endTime | Word-level end time of the transcription, in seconds, within the audio file's length | karaokeData |
| startTime | Word-level start time of the transcription, in seconds, within the audio file's length | karaokeData |
| text | Transcription text included in this timing section | karaokeData |
| creator | Annotation creator | annotations |
| createdAt | Date and time the annotation was created | annotations |
| updatedBy | Name of the user who edited the annotation | annotations |
| updatedAt | Date and time the annotation was edited | annotations |
| hash | Unique hash for this annotation | annotations |
| source | Indicates 'ui' when the annotation was created manually | annotations |
| coordinates | Contains optional word-level timing information for audio transcription | annotations |
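The word-level fields described in the table can be modeled and sanity-checked in a few lines. The following is a minimal sketch, not part of any Dataloop tooling: the `WordTiming` class, the `timing_problems` and `low_confidence` helpers, and the 0.5 confidence threshold are all hypothetical choices made for illustration. The entries are copied from the sample annotation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WordTiming:
    """One word-level entry (hypothetical model of a karaokeData item)."""
    text: str
    start_time: float  # seconds, within the audio file's length
    end_time: float    # seconds, within the audio file's length
    confidence: float  # as generated by the source model

    @classmethod
    def from_dict(cls, entry):
        return cls(entry["text"], entry["startTime"], entry["endTime"], entry["confidence"])


def timing_problems(words):
    """List inverted spans and overlaps with any earlier word."""
    problems = []
    previous_end = 0.0
    for w in words:
        if w.end_time <= w.start_time:
            problems.append(f"{w.text!r}: endTime is not after startTime")
        if w.start_time < previous_end:
            problems.append(f"{w.text!r}: starts before an earlier word ends")
        previous_end = max(previous_end, w.end_time)
    return problems


def low_confidence(words, threshold=0.5):
    """Return the text of words below the (arbitrary) confidence threshold."""
    return [w.text for w in words if w.confidence < threshold]


# Entries copied from the sample annotation.
KARAOKE_DATA = [
    {"confidence": 0.38077378, "endTime": 2.88, "startTime": 0.02, "text": "I"},
    {"confidence": 0.47150243, "endTime": 4.65, "startTime": 2.88, "text": "Am"},
    {"confidence": 0.96968557, "endTime": 6.997, "startTime": 4.86, "text": "the"},
    {"confidence": 0.39305196, "endTime": 8.2, "startTime": 6.998, "text": "annotation"},
    {"confidence": 0.70361846, "endTime": 10.12, "startTime": 8.251, "text": "Transcription"},
]

words = [WordTiming.from_dict(entry) for entry in KARAOKE_DATA]
problems = timing_problems(words)
needs_review = low_confidence(words)
```

A check like this could be run before importing model-generated transcriptions, so that words with inverted or overlapping spans are caught early and low-confidence words are routed to a reviewer.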