Example: Everything and the kitchen sink
To see the full example, scroll down to the end or open the raw JSON file.
This is an example with a bunch of different annotations created by a variety of tools. For the input we have a short, totally made-up video that starts with some bars-and-tone and a simple slate. Those are followed by about a dozen seconds of a talking head, followed by an image of a barking dog.
The timeline includes markers for seconds. In the views below all anchors will be using milliseconds.
We apply the following processing tools:
- Bars-and-tone extraction
- Slate extraction
- Audio segmentation
- Kaldi speech recognition and alignment
- EAST text box recognition
- Tesseract OCR
- Named entity recognition
- Slate parsing
What follows are short explanations of some fragments of the full MMIF file. Some application output was explained in more detail in other examples; refer to those for more details.
Extracting time frames
The first three steps are straightforward and all result in views with time frame annotations (views with id=v1, id=v2 and id=v3). The bars-and-tone and slate extraction applications each find one time frame and the audio segmenter finds two segments with the second one being a speech time frame that starts at about 5500ms from the start.
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v4",
"properties": {
"id": "tf2",
"frameType": "speech",
"start": 5500,
"end": 22000 }
}
This time frame will provide the input to Kaldi.
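As a rough illustration of how a downstream tool might pick its input, the following sketch filters a view for time frames of a given frameType. The view literal mirrors the segmenter view of this example; the helper name frames_of_type is made up for the illustration and is not part of any MMIF library.

```python
TIME_FRAME = "http://mmif.clams.ai/vocabulary/TimeFrame/v4"

# Trimmed copy of the audio segmenter view (v3) from this example.
segmenter_view = {
    "id": "v3",
    "annotations": [
        {"@type": TIME_FRAME,
         "properties": {"id": "tf1", "frameType": "non-speech",
                        "start": 0, "end": 5500}},
        {"@type": TIME_FRAME,
         "properties": {"id": "tf2", "frameType": "speech",
                        "start": 5500, "end": 22000}},
    ],
}


def frames_of_type(view, frame_type):
    """Return the properties of all time frames in the view with this frameType."""
    return [
        anno["properties"]
        for anno in view["annotations"]
        if anno["@type"] == TIME_FRAME
        and anno["properties"].get("frameType") == frame_type
    ]


speech = frames_of_type(segmenter_view, "speech")
for frame in speech:
    print(frame["id"], frame["start"], frame["end"])
```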
Kaldi speech recognition
Kaldi creates one view (with id=v4) which has
- a text document
- an alignment of that document with the speech time frame from the segmenter
- a list of tokens for the document
- a list of time frames corresponding to each token
- a list of alignments between the tokens and the time frames
In the metadata it spells out that the offsets of all tokens are taken to be offsets in “td1”, which is a text document in the same view. We can do this instead of the alternative (using the document property on all tokens) because all tokens are for the same text document.
{
"app": "http://mmif.clams.ai/apps/kaldi/0.2.1",
"contains": {
"http://mmif.clams.ai/vocabulary/TextDocument/v1": {},
"http://vocab.lappsgrid.org/Token": {
"document": "td1" },
"http://mmif.clams.ai/vocabulary/TimeFrame/v4": {
"timeUnit": "milliseconds",
"document": "m1" },
"http://mmif.clams.ai/vocabulary/Alignment/v1": {}
}
}
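A consumer of this view needs to work out which document a token's offsets refer to. The sketch below shows one way to do that, under the assumption that a document property on the annotation itself takes precedence over the default given in the contains metadata. The function name anchor_document is hypothetical.

```python
TOKEN = "http://vocab.lappsgrid.org/Token"

# Trimmed copy of the Kaldi view metadata shown above.
view_metadata = {
    "contains": {
        TOKEN: {"document": "td1"},
    }
}


def anchor_document(annotation, metadata):
    """Return the id of the document that the annotation's offsets refer to.

    Assumption: an annotation-level "document" property wins over the
    view-level default declared under "contains".
    """
    props = annotation["properties"]
    if "document" in props:
        return props["document"]
    return metadata["contains"].get(annotation["@type"], {}).get("document")


token = {"@type": TOKEN,
         "properties": {"id": "t1", "start": 0, "end": 5, "text": "Hello"}}
print(anchor_document(token, view_metadata))
```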
Note that a text document can refer to its text either by using the text property, which contains the text verbatim, or by referring to an external file using the location property. Here we use the second approach:
{
"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
"properties": {
"id": "td1",
"mime": "text/plain",
"location": "file:///var/archive/transcript-002.txt" }
}
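The two ways of carrying text can be handled by one small reader. This is a minimal sketch that assumes the location is either a plain local path or a file:// URI; remote locations are not handled, and the function name document_text is made up for the illustration.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import url2pathname


def document_text(text_document):
    """Return the text of a TextDocument given as a parsed JSON object."""
    props = text_document["properties"]
    if "text" in props:
        # Text stored verbatim in the MMIF file.
        return props["text"]["@value"]
    # Otherwise read the external file named by the location property.
    location = props["location"]
    parsed = urlparse(location)
    path = url2pathname(parsed.path) if parsed.scheme == "file" else location
    return Path(path).read_text(encoding="utf-8")


inline = {"properties": {"id": "td1", "text": {"@value": "DATE"}}}
print(document_text(inline))
```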
For the sake of argument we assume perfect speech recognition, and the content of the external file is as follows.
Hello, this is Jim Lehrer with the NewsHour on PBS. In the nineteen eighties, barking dogs have increasingly become a problem in urban areas.
This text is aligned with the second time frame from the segmenter.
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a1",
"source": "v3:tf2",
"target": "td1" }
}
See the full example below for all the tokens, the time frame for each token and the alignments between the tokens and their time frames.
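To show how the three kinds of annotations fit together, here is a sketch that follows the alignments in a Kaldi-style view to obtain a start and end time for each token. The annotation list is a two-token excerpt of the view, and token_timings is a hypothetical helper name.

```python
TOKEN = "http://vocab.lappsgrid.org/Token"
TIME_FRAME = "http://mmif.clams.ai/vocabulary/TimeFrame/v4"
ALIGNMENT = "http://mmif.clams.ai/vocabulary/Alignment/v1"

# Excerpt of the Kaldi view: the first two tokens with their time frames.
annotations = [
    {"@type": TOKEN, "properties": {"id": "t1", "start": 0, "end": 5, "text": "Hello"}},
    {"@type": TIME_FRAME, "properties": {"id": "tf1", "start": 5500, "end": 6085}},
    {"@type": ALIGNMENT, "properties": {"id": "a2", "source": "tf1", "target": "t1"}},
    {"@type": TOKEN, "properties": {"id": "t2", "start": 5, "end": 6, "text": ","}},
    {"@type": TIME_FRAME, "properties": {"id": "tf2", "start": 6085, "end": 6202}},
    {"@type": ALIGNMENT, "properties": {"id": "a3", "source": "tf2", "target": "t2"}},
]


def token_timings(annotations):
    """Return (token text, start ms, end ms) for every frame-to-token alignment."""
    by_id = {a["properties"]["id"]: a for a in annotations}
    timings = []
    for anno in annotations:
        if anno["@type"] != ALIGNMENT:
            continue
        source = by_id.get(anno["properties"]["source"])
        target = by_id.get(anno["properties"]["target"])
        if source is None or target is None:
            continue  # one end lives in another view, so skip it here
        if source["@type"] == TIME_FRAME and target["@type"] == TOKEN:
            timings.append((target["properties"]["text"],
                            source["properties"]["start"],
                            source["properties"]["end"]))
    return timings


for text, start, end in token_timings(annotations):
    print(text, start, end)
```

Alignments whose source or target is not found in the view, such as the one that links the whole transcript to the segmenter's time frame, are skipped rather than resolved.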
EAST and Tesseract
EAST adds bounding boxes anchored to the video document with id=m1:
{
"app": "http://mmif.clams.ai/apps/east/0.2.1",
"contains": {
"http://mmif.clams.ai/vocabulary/BoundingBox/v3": {
"document": "m1" }
}
}
Let’s assume that EAST runs on frames sampled from the video at 1 second intervals. For our example that means that EAST finds boxes at time offsets 3, 4, 5 and 21 seconds. Let’s assume decent performance where EAST finds all the boxes in the slate and just the caption in the image (but not the barking sounds). Here is one example box annotation:
{
"@type": "http://mmif.clams.ai/vocabulary/BoundingBox/v3",
"properties": {
"id": "bb9",
"timePoint": 4000,
"coordinates": [[180, 110], [460, 110], [180, 170], [460, 170]],
"label": "text" }
}
Due to the nature of the input many of the bounding boxes will have identical or near-identical coordinates. For example, there are two more bounding boxes with the coordinates above, one for the box with time offset 3000 and one for the box with time offset 5000.
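A tool that wants to avoid running OCR on the same region several times could first group the boxes by their coordinates. The sketch below only groups exact matches; near-identical boxes would need some tolerance, which is not attempted here. The box list is a small excerpt of the EAST view and group_by_coordinates is a made-up helper name.

```python
from collections import defaultdict

# Excerpt of the EAST view, reduced to the properties that matter here.
boxes = [
    {"id": "bb1", "timePoint": 3000,
     "coordinates": [[180, 110], [460, 110], [180, 170], [460, 170]]},
    {"id": "bb2", "timePoint": 3000,
     "coordinates": [[660, 110], [1250, 110], [660, 170], [1250, 170]]},
    {"id": "bb9", "timePoint": 4000,
     "coordinates": [[180, 110], [460, 110], [180, 170], [460, 170]]},
    {"id": "bb17", "timePoint": 5000,
     "coordinates": [[180, 110], [460, 110], [180, 170], [460, 170]]},
]


def group_by_coordinates(boxes):
    """Map each distinct coordinate tuple to the ids of the boxes that use it."""
    groups = defaultdict(list)
    for box in boxes:
        # Lists are not hashable, so turn the coordinates into nested tuples.
        key = tuple(tuple(point) for point in box["coordinates"])
        groups[key].append(box["id"])
    return dict(groups)


groups = group_by_coordinates(boxes)
for coordinates, ids in groups.items():
    print(coordinates, ids)
```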
Tesseract now runs on all those boxes and creates a text document for each of them. In doing so, it will add these to a new view:
- text documents from each text box
- alignments of those documents to their originating boxes
Thus, the metadata of the new view would be:
{
"app": "http://mmif.clams.ai/apps/tesseract/0.2.1",
"contains": {
"http://mmif.clams.ai/vocabulary/TextDocument/v1": {},
"http://mmif.clams.ai/vocabulary/Alignment/v1": {
"sourceType": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
"targetType": "http://mmif.clams.ai/vocabulary/BoundingBox/v3"
}
}
}
Unlike the alignment annotations in the Kaldi view, Tesseract specifies the types of both ends of the alignments in the contains metadata. This is only allowed because all alignment annotations in the view have the same source type and the same target type. This information can help machines, for example, to search for certain alignments more quickly.
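One way such metadata could speed up a search is by letting a client skip views without opening their annotation lists. The following sketch, with the hypothetical helper views_aligning, returns the views that declare alignments between two given types, regardless of which type is the source and which is the target.

```python
ALIGNMENT = "http://mmif.clams.ai/vocabulary/Alignment/v1"
TEXT_DOCUMENT = "http://mmif.clams.ai/vocabulary/TextDocument/v1"
BOUNDING_BOX = "http://mmif.clams.ai/vocabulary/BoundingBox/v3"

# View metadata only; the annotation lists are not needed for this check.
views = [
    {"id": "v4", "metadata": {"contains": {ALIGNMENT: {}}}},
    {"id": "v6", "metadata": {"contains": {
        ALIGNMENT: {"sourceType": TEXT_DOCUMENT, "targetType": BOUNDING_BOX}}}},
]


def views_aligning(views, type_a, type_b):
    """Ids of views whose metadata declares alignments between the two types."""
    wanted = {type_a, type_b}
    hits = []
    for view in views:
        spec = view["metadata"]["contains"].get(ALIGNMENT)
        if spec and {spec.get("sourceType"), spec.get("targetType")} == wanted:
            hits.append(view["id"])
    return hits


print(views_aligning(views, BOUNDING_BOX, TEXT_DOCUMENT))
```

A view like the Kaldi one, which declares alignments without types, is not returned; a client would have to inspect its annotations individually.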
The recognition results are recorded as text documents. Here is one:
{
"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
"properties": {
"id": "td1",
"text": { "@value": "DATE" } }
}
And here is the corresponding alignment from the bounding box to the text document:
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a1",
"source": "v5:bb1",
"target": "td1" }
}
The source is in another view, hence the prefix on the identifier.
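Resolving such identifiers could look like the sketch below. It assumes that an identifier with a prefix names a view and a local identifier, and that an identifier without a prefix refers to the current view or, failing that, to a top-level document. The function name resolve and the trimmed MMIF object are for illustration only.

```python
# Heavily trimmed MMIF object with just enough content to resolve ids.
mmif = {
    "documents": [
        {"@type": "http://mmif.clams.ai/vocabulary/VideoDocument/v1",
         "properties": {"id": "m1"}},
    ],
    "views": [
        {"id": "v5", "annotations": [
            {"@type": "http://mmif.clams.ai/vocabulary/BoundingBox/v3",
             "properties": {"id": "bb1", "timePoint": 3000}}]},
        {"id": "v6", "annotations": [
            {"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
             "properties": {"id": "td1", "text": {"@value": "DATE"}}}]},
    ],
}


def resolve(mmif, annotation_id, current_view_id):
    """Look up an annotation or document by a possibly view-prefixed id."""
    if ":" in annotation_id:
        view_id, local_id = annotation_id.split(":", 1)
    else:
        view_id, local_id = current_view_id, annotation_id
    for view in mmif["views"]:
        if view["id"] == view_id:
            for anno in view["annotations"]:
                if anno["properties"]["id"] == local_id:
                    return anno
    if ":" not in annotation_id:
        # Fall back to the top-level documents for bare ids such as "m1".
        for doc in mmif["documents"]:
            if doc["properties"]["id"] == local_id:
                return doc
    return None


print(resolve(mmif, "v5:bb1", "v6")["properties"]["timePoint"])
print(resolve(mmif, "td1", "v6")["properties"]["text"]["@value"])
```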
Named entity recognition
After Kaldi and Tesseract have added text documents, we have all the text extracted from the audiovisual elements and can run NLP tools like named entity recognizers over it. Each entity annotation refers to a text document, either the one in the Kaldi view or one of the documents in the Tesseract view. This example refers to one of the documents in the Tesseract view:
{
"@type": "http://vocab.lappsgrid.org/NamedEntity",
"properties": {
"id": "ne1",
"document": "v6:td2",
"start": 0,
"end": 10,
"category": "Date",
"text": "1982-05-12" }
}
Note that since there were three text boxes with the date and therefore three documents with the actual text, there are also three named entities for this date.
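The start and end offsets of an entity can be checked against the text of the document it refers to, and the repeated entities can be counted. The sketch below does both for the three date entities; the texts dictionary stands in for a lookup of document text by identifier, and covered_text is a hypothetical helper name.

```python
from collections import Counter

# Text of the three Tesseract documents that hold the date, keyed by id.
texts = {"v6:td2": "1982-05-12", "v6:td10": "1982-05-12", "v6:td18": "1982-05-12"}

entities = [
    {"properties": {"id": "ne1", "document": "v6:td2", "start": 0, "end": 10,
                    "category": "Date", "text": "1982-05-12"}},
    {"properties": {"id": "ne4", "document": "v6:td10", "start": 0, "end": 10,
                    "category": "Date", "text": "1982-05-12"}},
    {"properties": {"id": "ne7", "document": "v6:td18", "start": 0, "end": 10,
                    "category": "Date", "text": "1982-05-12"}},
]


def covered_text(entity, texts):
    """Return the slice of the document text that the entity's offsets select."""
    props = entity["properties"]
    return texts[props["document"]][props["start"]:props["end"]]


# Count how often each (category, text) pair occurs across the documents.
counts = Counter(
    (e["properties"]["category"], covered_text(e, texts)) for e in entities
)
print(counts)
```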
Slate parsing
This section is somewhat speculative since we have not yet made any decisions on what the output of a slate parser will look like.
Slate parsing applies to frames in the slate segment found in the slate view (id=v2) and uses several kinds of information obtained from two or three other views:
- The EAST view has text bounding boxes with coordinates for all those boxes.
- The Tesseract view has the text values for all those boxes.
- The NER view has named entity classes for some of those text values, which may in some cases be useful for slate parsing.
A minimal option for the slate parser is to create a particular semantic tag that describes value fields in a slate. For that it may use the category of the named entity that is anchored to the field or the text in an adjacent field. For example, if we have the text “1982-05-12” and we know it was tagged as a Date, then this may indicate that the value is the air time of the video. Similarly, if that value occurs next to a field that has the text “DATE” in it, we may also derive that the value is a Date.
Here is the tag annotation on the same document as the named entity annotation above:
{
"@type": "http://vocab.lappsgrid.org/SemanticTag",
"properties": {
"id": "st1",
"document": "v6:td2",
"start": 0,
"end": 10,
"tagName": "Date",
"text": "1982-05-12" }
}
Note that the tagName property has the same value as the category property on the named entity. This is a coincidence in that there is a named entity category Date as well as a slate category Date.
Similar to what we saw for the named entities, there will be multiple versions of this Date tag due to multiple text boxes with the same text.
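Since the slate parser is still undecided, the following is only a speculative sketch of the two heuristics described in this section: take the tag from the label text in the box to the left on the same row, and otherwise fall back on the named entity category. The LABEL_TO_TAG mapping, the function names and the flattened input dictionaries are all assumptions made for this illustration.

```python
# Hypothetical mapping from slate label text to a slate tag name.
LABEL_TO_TAG = {"DATE": "Date"}

# Excerpt of the EAST boxes for the frame at 3000 ms.
boxes = [
    {"id": "bb1", "timePoint": 3000,
     "coordinates": [[180, 110], [460, 110], [180, 170], [460, 170]]},
    {"id": "bb2", "timePoint": 3000,
     "coordinates": [[660, 110], [1250, 110], [660, 170], [1250, 170]]},
    {"id": "bb3", "timePoint": 3000,
     "coordinates": [[180, 320], [460, 320], [180, 380], [460, 380]]},
]
texts = {"bb1": "DATE", "bb2": "1982-05-12", "bb3": "TITLE"}  # OCR text per box
entity_categories = {"bb2": "Date"}  # named entity category per box, if any


def label_left_of(box, boxes, texts):
    """Text of the nearest box to the left on the same row and frame, or None."""
    left, top = box["coordinates"][0]
    same_row = [
        b for b in boxes
        if b["timePoint"] == box["timePoint"]
        and b["coordinates"][0][1] == top
        and b["coordinates"][0][0] < left
    ]
    if not same_row:
        return None
    nearest = max(same_row, key=lambda b: b["coordinates"][0][0])
    return texts.get(nearest["id"])


def slate_tag(box, boxes, texts, entity_categories):
    """Guess a tag name for a value box, or return None if nothing applies."""
    label = label_left_of(box, boxes, texts)
    if label in LABEL_TO_TAG:
        return LABEL_TO_TAG[label]
    if entity_categories.get(box["id"]) == "Date":
        return "Date"
    return None


print(slate_tag(boxes[1], boxes, texts, entity_categories))
```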
Full MMIF File
{
"metadata": {
"mmif": "http://mmif.clams.ai/1.0.3"
},
"documents": [
{
"@type": "http://mmif.clams.ai/vocabulary/VideoDocument/v1",
"properties": {
"id": "m1",
"mime": "video/mpeg",
"location": "file:///var/archive/video-002.mp4" }
}
],
"views": [
{
"id": "v1",
"metadata": {
"app": "http://apps.clams.ai/bars-and-tones/1.0.5",
"timestamp": "2020-05-27T12:23:45",
"contains": {
"http://mmif.clams.ai/vocabulary/TimeFrame/v4": {
"document": "m1",
"timeUnit": "milliseconds" }
}
},
"annotations": [
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v4",
"properties": {
"id": "s1",
"start": 0,
"end": 2600,
"frameType": "bars-and-tones" }
}
]
},
{
"id": "v2",
"metadata": {
"app": "http://apps.clams.ai/slates/1.0.3",
"timestamp": "2020-05-27T12:23:45",
"contains": {
"http://mmif.clams.ai/vocabulary/TimeFrame/v4": {
"document": "m1",
"timeUnit": "milliseconds" }
}
},
"annotations": [
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v4",
"properties": {
"id": "s1",
"start": 2700,
"end": 5300,
"frameType": "slate" }
}
]
},
{
"id": "v3",
"metadata": {
"app": "http://mmif.clams.ai/apps/audio-segmenter/0.2.1",
"contains": {
"http://mmif.clams.ai/vocabulary/TimeFrame/v4": {
"timeUnit": "milliseconds",
"document": "m1" }
}
},
"annotations": [
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v4",
"properties": {
"frameType": "non-speech",
"id": "tf1",
"start": 0,
"end": 5500 }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v4",
"properties": {
"id": "tf2",
"frameType": "speech",
"start": 5500,
"end": 22000 }
}
]
},
{
"id": "v4",
"metadata": {
"app": "http://mmif.clams.ai/apps/kaldi/0.2.1",
"contains": {
"http://mmif.clams.ai/vocabulary/TextDocument/v1": {},
"http://vocab.lappsgrid.org/Token": {
"document": "td1" },
"http://mmif.clams.ai/vocabulary/TimeFrame/v4": {
"timeUnit": "milliseconds",
"document": "m1" },
"http://mmif.clams.ai/vocabulary/Alignment/v1": {}
}
},
"annotations": [
{
"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
"properties": {
"id": "td1",
"mime": "text/plain",
"location": "file:///var/archive/transcript-002.txt" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a1",
"source": "v3:tf2",
"target": "td1" }
},
{
"@type": "http://vocab.lappsgrid.org/Token",
"properties": {
"id": "t1",
"start": 0,
"end": 5,
"text": "Hello" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v4",
"properties": {
"id": "tf1",
"start": 5500,
"end": 6085 }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a2",
"source": "tf1",
"target": "t1" }
},
{
"@type": "http://vocab.lappsgrid.org/Token",
"properties": {
"id": "t2",
"start": 5,
"end": 6,
"text": "," }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v4",
"properties": {
"id": "tf2",
"start": 6085,
"end": 6202 }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a3",
"source": "tf2",
"target": "t2" }
},
{
"@type": "http://vocab.lappsgrid.org/Token",
"properties": {
"id": "t3",
"start": 7,
"end": 11,
"text": "this" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v4",
"properties": {
"id": "tf3",
"start": 6319,
"end": 6787 }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a4",
"source": "tf3",
"target": "t3" }
},
{
"@type": "http://vocab.lappsgrid.org/Token",
"properties": {
"id": "t4",
"start": 12,
"end": 14,
"text": "is" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v4",
"properties": {
"id": "tf4",
"start": 6904,
"end": 7138 }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a5",
"source": "tf4",
"target": "t4" }
},
{
"@type": "http://vocab.lappsgrid.org/Token",
"properties": {
"id": "t5",
"start": 15,
"end": 18,
"text": "Jim" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v4",
"properties": {
"id": "tf5",
"start": 7255,
"end": 7606 }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a6",
"source": "tf5",
"target": "t5" }
},
{
"@type": "http://vocab.lappsgrid.org/Token",
"properties": {
"id": "t6",
"start": 19,
"end": 25,
"text": "Lehrer" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v4",
"properties": {
"id": "tf6",
"start": 7723,
"end": 8425 }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a7",
"source": "tf6",
"target": "t6" }
},
{
"@type": "http://vocab.lappsgrid.org/Token",
"properties": {
"id": "t7",
"start": 26,
"end": 30,
"text": "with" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v4",
"properties": {
"id": "tf7",
"start": 8542,
"end": 9010 }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a8",
"source": "tf7",
"target": "t7" }
},
{
"@type": "http://vocab.lappsgrid.org/Token",
"properties": {
"id": "t8",
"start": 31,
"end": 34,
"text": "the" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v4",
"properties": {
"id": "tf8",
"start": 9127,
"end": 9478 }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a9",
"source": "tf8",
"target": "t8" }
},
{
"@type": "http://vocab.lappsgrid.org/Token",
"properties": {
"id": "t9",
"start": 35,
"end": 43,
"text": "NewsHour" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v4",
"properties": {
"id": "tf9",
"start": 9595,
"end": 10531 }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a10",
"source": "tf9",
"target": "t9" }
},
{
"@type": "http://vocab.lappsgrid.org/Token",
"properties": {
"id": "t10",
"start": 44,
"end": 46,
"text": "on" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v4",
"properties": {
"id": "tf10",
"start": 10648,
"end": 10882 }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a11",
"source": "tf10",
"target": "t10" }
},
{
"@type": "http://vocab.lappsgrid.org/Token",
"properties": {
"id": "t11",
"start": 47,
"end": 50,
"text": "PBS" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v4",
"properties": {
"id": "tf11",
"start": 10999,
"end": 11350 }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a12",
"source": "tf11",
"target": "t11" }
},
{
"@type": "http://vocab.lappsgrid.org/Token",
"properties": {
"id": "t12",
"start": 50,
"end": 51,
"text": "." }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v4",
"properties": {
"id": "tf12",
"start": 11350,
"end": 11467 }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a13",
"source": "tf12",
"target": "t12" }
},
{
"@type": "http://vocab.lappsgrid.org/Token",
"properties": {
"id": "t13",
"start": 52,
"end": 54,
"text": "In" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v4",
"properties": {
"id": "tf13",
"start": 11584,
"end": 11818 }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a14",
"source": "tf13",
"target": "t13" }
},
{
"@type": "http://vocab.lappsgrid.org/Token",
"properties": {
"id": "t14",
"start": 55,
"end": 58,
"text": "the" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v4",
"properties": {
"id": "tf14",
"start": 11935,
"end": 12286 }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a15",
"source": "tf14",
"target": "t14" }
},
{
"@type": "http://vocab.lappsgrid.org/Token",
"properties": {
"id": "t15",
"start": 59,
"end": 67,
"text": "nineteen" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v4",
"properties": {
"id": "tf15",
"start": 12403,
"end": 13339 }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a16",
"source": "tf15",
"target": "t15" }
},
{
"@type": "http://vocab.lappsgrid.org/Token",
"properties": {
"id": "t16",
"start": 68,
"end": 76,
"text": "eighties" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v4",
"properties": {
"id": "tf16",
"start": 13456,
"end": 14392 }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a17",
"source": "tf16",
"target": "t16" }
},
{
"@type": "http://vocab.lappsgrid.org/Token",
"properties": {
"id": "t17",
"start": 76,
"end": 77,
"text": "," }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v4",
"properties": {
"id": "tf17",
"start": 14392,
"end": 14509 }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a18",
"source": "tf17",
"target": "t17" }
},
{
"@type": "http://vocab.lappsgrid.org/Token",
"properties": {
"id": "t18",
"start": 78,
"end": 85,
"text": "barking" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v4",
"properties": {
"id": "tf18",
"start": 14626,
"end": 15445 }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a19",
"source": "tf18",
"target": "t18" }
},
{
"@type": "http://vocab.lappsgrid.org/Token",
"properties": {
"id": "t19",
"start": 86,
"end": 90,
"text": "dogs" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v4",
"properties": {
"id": "tf19",
"start": 15562,
"end": 16030 }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a20",
"source": "tf19",
"target": "t19" }
},
{
"@type": "http://vocab.lappsgrid.org/Token",
"properties": {
"id": "t20",
"start": 91,
"end": 95,
"text": "have" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v4",
"properties": {
"id": "tf20",
"start": 16147,
"end": 16615 }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a21",
"source": "tf20",
"target": "t20" }
},
{
"@type": "http://vocab.lappsgrid.org/Token",
"properties": {
"id": "t21",
"start": 96,
"end": 108,
"text": "increasingly" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v4",
"properties": {
"id": "tf21",
"start": 16732,
"end": 18136 }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a22",
"source": "tf21",
"target": "t21" }
},
{
"@type": "http://vocab.lappsgrid.org/Token",
"properties": {
"id": "t22",
"start": 109,
"end": 115,
"text": "become" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v4",
"properties": {
"id": "tf22",
"start": 18253,
"end": 18955 }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a23",
"source": "tf22",
"target": "t22" }
},
{
"@type": "http://vocab.lappsgrid.org/Token",
"properties": {
"id": "t23",
"start": 116,
"end": 117,
"text": "a" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v4",
"properties": {
"id": "tf23",
"start": 19072,
"end": 19189 }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a24",
"source": "tf23",
"target": "t23" }
},
{
"@type": "http://vocab.lappsgrid.org/Token",
"properties": {
"id": "t24",
"start": 118,
"end": 125,
"text": "problem" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v4",
"properties": {
"id": "tf24",
"start": 19306,
"end": 20125 }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a25",
"source": "tf24",
"target": "t24" }
},
{
"@type": "http://vocab.lappsgrid.org/Token",
"properties": {
"id": "t25",
"start": 126,
"end": 128,
"text": "in" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v4",
"properties": {
"id": "tf25",
"start": 20242,
"end": 20476 }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a26",
"source": "tf25",
"target": "t25" }
},
{
"@type": "http://vocab.lappsgrid.org/Token",
"properties": {
"id": "t26",
"start": 129,
"end": 134,
"text": "urban" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v4",
"properties": {
"id": "tf26",
"start": 20593,
"end": 21178 }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a27",
"source": "tf26",
"target": "t26" }
},
{
"@type": "http://vocab.lappsgrid.org/Token",
"properties": {
"id": "t27",
"start": 135,
"end": 140,
"text": "areas" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v4",
"properties": {
"id": "tf27",
"start": 21295,
"end": 21880 }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a28",
"source": "tf27",
"target": "t27" }
},
{
"@type": "http://vocab.lappsgrid.org/Token",
"properties": {
"id": "t28",
"start": 140,
"end": 141,
"text": "." }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v4",
"properties": {
"id": "tf28",
"start": 21880,
"end": 21997 }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a29",
"source": "tf28",
"target": "t28" }
}
]
},
{
"id": "v5",
"metadata": {
"app": "http://mmif.clams.ai/apps/east/0.2.1",
"contains": {
"http://mmif.clams.ai/vocabulary/BoundingBox/v3": {
"document": "m1" }
}
},
"annotations": [
{
"@type": "http://mmif.clams.ai/vocabulary/BoundingBox/v3",
"properties": {
"id": "bb1",
"timePoint": 3000,
"coordinates": [[180, 110], [460, 110], [180, 170], [460, 170]],
"label": "text" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/BoundingBox/v3",
"properties": {
"id": "bb2",
"timePoint": 3000,
"coordinates": [[660, 110], [1250, 110], [660, 170], [1250, 170]],
"label": "text" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/BoundingBox/v3",
"properties": {
"id": "bb3",
"timePoint": 3000,
"coordinates": [[180, 320], [460, 320], [180, 380], [460, 380]],
"label": "text" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/BoundingBox/v3",
"properties": {
"id": "bb4",
"timePoint": 3000,
"coordinates": [[660, 320], [1210, 320], [660, 380], [1210, 380]],
"label": "text" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/BoundingBox/v3",
"properties": {
"id": "bb5",
"timePoint": 3000,
"coordinates": [[180, 520], [460, 520], [180, 580], [460, 580]],
"label": "text" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/BoundingBox/v3",
"properties": {
"id": "bb6",
"timePoint": 3000,
"coordinates": [[660, 520], [1200, 520], [660, 580], [1200, 580]],
"label": "text" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/BoundingBox/v3",
"properties": {
"id": "bb7",
"timePoint": 3000,
"coordinates": [[180, 750], [470, 750], [180, 810], [470, 810]],
"label": "text" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/BoundingBox/v3",
"properties": {
"id": "bb8",
"timePoint": 3000,
"coordinates": [[660, 750], [1180, 750], [660, 810], [1180, 810]],
"label": "text" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/BoundingBox/v3",
"properties": {
"id": "bb9",
"timePoint": 4000,
"coordinates": [[180, 110], [460, 110], [180, 170], [460, 170]],
"label": "text" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/BoundingBox/v3",
"properties": {
"id": "bb10",
"timePoint": 4000,
"coordinates": [[660, 110], [1250, 110], [660, 170], [1250, 170]],
"label": "text" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/BoundingBox/v3",
"properties": {
"id": "bb11",
"timePoint": 4000,
"coordinates": [[180, 320], [460, 320], [180, 380], [460, 380]],
"label": "text" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/BoundingBox/v3",
"properties": {
"id": "bb12",
"timePoint": 4000,
"coordinates": [[660, 320], [1210, 320], [660, 380], [1210, 380]],
"label": "text" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/BoundingBox/v3",
"properties": {
"id": "bb13",
"timePoint": 4000,
"coordinates": [[180, 520], [460, 520], [180, 580], [460, 580]],
"label": "text" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/BoundingBox/v3",
"properties": {
"id": "bb14",
"timePoint": 4000,
"coordinates": [[660, 520], [1200, 520], [660, 580], [1200, 580]],
"label": "text" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/BoundingBox/v3",
"properties": {
"id": "bb15",
"timePoint": 4000,
"coordinates": [[180, 750], [470, 750], [180, 810], [470, 810]],
"label": "text" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/BoundingBox/v3",
"properties": {
"id": "bb16",
"timePoint": 4000,
"coordinates": [[660, 750], [1180, 750], [660, 810], [1180, 810]],
"label": "text" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/BoundingBox/v3",
"properties": {
"id": "bb17",
"timePoint": 5000,
"coordinates": [[180, 110], [460, 110], [180, 170], [460, 170]],
"label": "text" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/BoundingBox/v3",
"properties": {
"id": "bb18",
"timePoint": 5000,
"coordinates": [[660, 110], [1250, 110], [660, 170], [1250, 170]],
"label": "text" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/BoundingBox/v3",
"properties": {
"id": "bb19",
"timePoint": 5000,
"coordinates": [[180, 320], [460, 320], [180, 380], [460, 380]],
"label": "text" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/BoundingBox/v3",
"properties": {
"id": "bb20",
"timePoint": 5000,
"coordinates": [[660, 320], [1210, 320], [660, 380], [1210, 380]],
"label": "text" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/BoundingBox/v3",
"properties": {
"id": "bb21",
"timePoint": 5000,
"coordinates": [[180, 520], [460, 520], [180, 580], [460, 580]],
"label": "text" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/BoundingBox/v3",
"properties": {
"id": "bb22",
"timePoint": 5000,
"coordinates": [[660, 520], [1200, 520], [660, 580], [1200, 580]],
"label": "text" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/BoundingBox/v3",
"properties": {
"id": "bb23",
"timePoint": 5000,
"coordinates": [[180, 750], [470, 750], [180, 810], [470, 810]],
"label": "text" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/BoundingBox/v3",
"properties": {
"id": "bb24",
"timePoint": 5000,
"coordinates": [[660, 750], [1180, 750], [660, 810], [1180, 810]],
"label": "text" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/BoundingBox/v3",
"properties": {
"id": "bb25",
"timePoint": 21000,
"coordinates": [[150, 810], [1120, 810], [150, 870], [1120, 870]],
"label": "text" }
}
]
},
{
"id": "v6",
"metadata": {
"app": "http://mmif.clams.ai/apps/tesseract/0.2.1",
"contains": {
"http://mmif.clams.ai/vocabulary/TextDocument/v1": {},
"http://mmif.clams.ai/vocabulary/Alignment/v1": {
"sourceType": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
"targetType": "http://mmif.clams.ai/vocabulary/BoundingBox/v3"
}
}
},
"annotations": [
{
"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
"properties": {
"id": "td1",
"text": { "@value": "DATE" } }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a1",
"source": "v5:bb1",
"target": "td1" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
"properties": {
"id": "td2",
"text": { "@value": "1982-05-12" } }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a2",
"source": "v5:bb2",
"target": "td2" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
"properties": {
"id": "td3",
"text": { "@value": "TITLE" } }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a3",
"source": "v5:bb3",
"target": "td3" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
"properties": {
"id": "td4",
"text": { "@value": "Loud Dogs" } }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a4",
"source": "v5:bb4",
"target": "td4" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
"properties": {
"id": "td5",
"text": { "@value": "HOST" } }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a5",
"source": "v5:bb5",
"target": "td5" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
"properties": {
"id": "td6",
"text": { "@value": "Jim Lehrer" } }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a6",
"source": "v5:bb6",
"target": "td6" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
"properties": {
"id": "td7",
"text": { "@value": "PROD" } }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a7",
"source": "v5:bb7",
"target": "td7" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
"properties": {
"id": "td8",
"text": { "@value": "Sara Just" } }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a8",
"source": "v5:bb8",
"target": "td8" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
"properties": {
"id": "td9",
"text": { "@value": "DATE" } }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a9",
"source": "v5:bb9",
"target": "td9" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
"properties": {
"id": "td10",
"text": { "@value": "1982-05-12" } }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a10",
"source": "v5:bb10",
"target": "td10" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
"properties": {
"id": "td11",
"text": { "@value": "TITLE" } }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a11",
"source": "v5:bb11",
"target": "td11" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
"properties": {
"id": "td12",
"text": { "@value": "Loud Dogs" } }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a12",
"source": "v5:bb12",
"target": "td12" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
"properties": {
"id": "td13",
"text": { "@value": "HOST" } }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a13",
"source": "v5:bb13",
"target": "td13" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
"properties": {
"id": "td14",
"text": { "@value": "Jim Lehrer" } }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a14",
"source": "v5:bb14",
"target": "td14" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
"properties": {
"id": "td15",
"text": { "@value": "PROD" } }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a15",
"source": "v5:bb15",
"target": "td15" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
"properties": {
"id": "td16",
"text": { "@value": "Sara Just" } }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a16",
"source": "v5:bb16",
"target": "td16" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
"properties": {
"id": "td17",
"text": { "@value": "DATE" } }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a17",
"source": "v5:bb17",
"target": "td17" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
"properties": {
"id": "td18",
"text": { "@value": "1982-05-12" } }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a18",
"source": "v5:bb18",
"target": "td18" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
"properties": {
"id": "td19",
"text": { "@value": "TITLE" } }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a19",
"source": "v5:bb19",
"target": "td19" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
"properties": {
"id": "td20",
"text": { "@value": "Loud Dogs" } }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a20",
"source": "v5:bb20",
"target": "td20" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
"properties": {
"id": "td21",
"text": { "@value": "HOST" } }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a21",
"source": "v5:bb21",
"target": "td21" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
"properties": {
"id": "td22",
"text": { "@value": "Jim Lehrer" } }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a22",
"source": "v5:bb22",
"target": "td22" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
"properties": {
"id": "td23",
"text": { "@value": "PROD" } }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a23",
"source": "v5:bb23",
"target": "td23" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
"properties": {
"id": "td24",
"text": { "@value": "Sara Just" } }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a24",
"source": "v5:bb24",
"target": "td24" }
},
{
"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
"properties": {
"id": "td25",
"text": { "@value": "Dog in New York" } }
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"id": "a25",
"source": "v5:bb25",
"target": "td25" }
}
]
},
{
"id": "v7",
"metadata": {
"app": "http://apps.clams.ai/spacy-ner/0.2.1",
"contains": {
"http://vocab.lappsgrid.org/NamedEntity": {}
}
},
"annotations": [
{
"@type": "http://vocab.lappsgrid.org/NamedEntity",
"properties": {
"id": "ne1",
"document": "v6:td2",
"start": 0,
"end": 10,
"category": "Date",
"text": "1982-05-12" }
},
{
"@type": "http://vocab.lappsgrid.org/NamedEntity",
"properties": {
"id": "ne2",
"document": "v6:td6",
"start": 0,
"end": 10,
"category": "Person",
"text": "Jim Lehrer" }
},
{
"@type": "http://vocab.lappsgrid.org/NamedEntity",
"properties": {
"id": "ne3",
"document": "v6:td8",
"start": 0,
"end": 9,
"category": "Person",
"text": "Sara Just" }
},
{
"@type": "http://vocab.lappsgrid.org/NamedEntity",
"properties": {
"id": "ne4",
"document": "v6:td10",
"start": 0,
"end": 10,
"category": "Date",
"text": "1982-05-12" }
},
{
"@type": "http://vocab.lappsgrid.org/NamedEntity",
"properties": {
"id": "ne5",
"document": "v6:td14",
"start": 0,
"end": 10,
"category": "Person",
"text": "Jim Lehrer" }
},
{
"@type": "http://vocab.lappsgrid.org/NamedEntity",
"properties": {
"id": "ne6",
"document": "v6:td16",
"start": 0,
"end": 9,
"category": "Person",
"text": "Sara Just" }
},
{
"@type": "http://vocab.lappsgrid.org/NamedEntity",
"properties": {
"id": "ne7",
"document": "v6:td18",
"start": 0,
"end": 10,
"category": "Date",
"text": "1982-05-12" }
},
{
"@type": "http://vocab.lappsgrid.org/NamedEntity",
"properties": {
"id": "ne8",
"document": "v6:td22",
"start": 0,
"end": 10,
"category": "Person",
"text": "Jim Lehrer" }
},
{
"@type": "http://vocab.lappsgrid.org/NamedEntity",
"properties": {
"id": "ne9",
"document": "v6:td24",
"start": 0,
"end": 9,
"category": "Person",
"text": "Sara Just" }
},
{
"@type": "http://vocab.lappsgrid.org/NamedEntity",
"properties": {
"id": "ne10",
"document": "v6:td25",
"start": 7,
"end": 15,
"category": "Location",
"text": "New York" }
},
{
"@type": "http://vocab.lappsgrid.org/NamedEntity",
"properties": {
"id": "ne11",
"document": "v4:td1",
"start": 15,
"end": 25,
"category": "Person",
"text": "Jim Lehrer" }
},
{
"@type": "http://vocab.lappsgrid.org/NamedEntity",
"properties": {
"id": "ne12",
"document": "v4:td1",
"start": 47,
"end": 50,
"category": "Organization",
"text": "PBS" }
}
]
},
{
"id": "v8",
"metadata": {
"app": "http://apps.clams.ai/slate-parser/1.0.2",
"timestamp": "2020-05-27T12:23:45",
"contains": {
"http://vocab.lappsgrid.org/SemanticTag": {}
}
},
"annotations": [
{
"@type": "http://vocab.lappsgrid.org/SemanticTag",
"properties": {
"id": "st1",
"document": "v6:td2",
"start": 0,
"end": 10,
"tagName": "Date",
"text": "1982-05-12" }
},
{
"@type": "http://vocab.lappsgrid.org/SemanticTag",
"properties": {
"id": "st2",
"document": "v6:td4",
"start": 0,
"end": 9,
"tagName": "Title",
"text": "Loud Dogs" }
},
{
"@type": "http://vocab.lappsgrid.org/SemanticTag",
"properties": {
"id": "st3",
"document": "v6:td6",
"start": 0,
"end": 10,
"tagName": "Host",
"text": "Jim Lehrer" }
},
{
"@type": "http://vocab.lappsgrid.org/SemanticTag",
"properties": {
"id": "st4",
"document": "v6:td8",
"start": 0,
"end": 9,
"tagName": "Producer",
"text": "Sara Just" }
},
{
"@type": "http://vocab.lappsgrid.org/SemanticTag",
"properties": {
"id": "st5",
"document": "v6:td10",
"start": 0,
"end": 10,
"tagName": "Date",
"text": "1982-05-12" }
},
{
"@type": "http://vocab.lappsgrid.org/SemanticTag",
"properties": {
"id": "st6",
"document": "v6:td12",
"start": 0,
"end": 9,
"tagName": "Title",
"text": "Loud Dogs" }
},
{
"@type": "http://vocab.lappsgrid.org/SemanticTag",
"properties": {
"id": "st7",
"document": "v6:td14",
"start": 0,
"end": 10,
"tagName": "Host",
"text": "Jim Lehrer" }
},
{
"@type": "http://vocab.lappsgrid.org/SemanticTag",
"properties": {
"id": "st8",
"document": "v6:td16",
"start": 0,
"end": 9,
"tagName": "Producer",
"text": "Sara Just" }
},
{
"@type": "http://vocab.lappsgrid.org/SemanticTag",
"properties": {
"id": "st9",
"document": "v6:td18",
"start": 0,
"end": 10,
"tagName": "Date",
"text": "1982-05-12" }
},
{
"@type": "http://vocab.lappsgrid.org/SemanticTag",
"properties": {
"id": "st10",
"document": "v6:td20",
"start": 0,
"end": 9,
"tagName": "Title",
"text": "Loud Dogs" }
},
{
"@type": "http://vocab.lappsgrid.org/SemanticTag",
"properties": {
"id": "st11",
"document": "v6:td22",
"start": 0,
"end": 10,
"tagName": "Host",
"text": "Jim Lehrer" }
},
{
"@type": "http://vocab.lappsgrid.org/SemanticTag",
"properties": {
"id": "st12",
"document": "v6:td24",
"start": 0,
"end": 9,
"tagName": "Producer",
"text": "Sara Just" }
}
]
}
]
}
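The start and end offsets in the named entity and semantic tag views are character offsets into the text document named by the document property, written as a view identifier and an annotation identifier separated by a colon. The following is a minimal sketch of how a consumer might resolve those references and check that each span matches its text property. It uses a trimmed copy of two annotations from the example above. The function names index_text_documents and check_spans are hypothetical and not part of any MMIF library, and the sketch assumes the OCR text documents live in the view with id=v6, as the document references suggest.

```python
import json

# Trimmed copy of two views from the example; only the fields used below are kept.
MMIF_FRAGMENT = json.loads("""
{
  "views": [
    { "id": "v6",
      "annotations": [
        { "@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
          "properties": { "id": "td22", "text": { "@value": "Jim Lehrer" } } },
        { "@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
          "properties": { "id": "td25", "text": { "@value": "Dog in New York" } } }
      ] },
    { "id": "v7",
      "annotations": [
        { "@type": "http://vocab.lappsgrid.org/NamedEntity",
          "properties": { "id": "ne8", "document": "v6:td22",
                          "start": 0, "end": 10,
                          "category": "Person", "text": "Jim Lehrer" } },
        { "@type": "http://vocab.lappsgrid.org/NamedEntity",
          "properties": { "id": "ne10", "document": "v6:td25",
                          "start": 7, "end": 15,
                          "category": "Location", "text": "New York" } }
      ] }
  ]
}
""")


def index_text_documents(mmif):
    """Map 'view_id:doc_id' to the text of every TextDocument annotation."""
    texts = {}
    for view in mmif["views"]:
        for ann in view["annotations"]:
            if "/TextDocument/" in ann["@type"]:
                props = ann["properties"]
                texts[f'{view["id"]}:{props["id"]}'] = props["text"]["@value"]
    return texts


def check_spans(mmif):
    """Return (annotation id, covered text, matches) for each annotation with offsets."""
    texts = index_text_documents(mmif)
    results = []
    for view in mmif["views"]:
        for ann in view["annotations"]:
            props = ann["properties"]
            # Skip annotations that do not anchor into a text document.
            if "document" not in props or "start" not in props:
                continue
            covered = texts[props["document"]][props["start"]:props["end"]]
            results.append((props["id"], covered, covered == props["text"]))
    return results


if __name__ == "__main__":
    for ann_id, covered, ok in check_spans(MMIF_FRAGMENT):
        print(ann_id, repr(covered), "ok" if ok else "MISMATCH")
```

The same check applies unchanged to the semantic tags in the slate parser view, since they use the same document, start, end and text properties.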