
Annotation Export

At any point during annotation, you can export your work by clicking the "Export Annotation"
button in the annotation interface. Note that only documents in the validated state
will be exported. Clicking the export button opens the export window.
In the export window, you can filter the documents by entity Labels, Relations or Classes and define a split ratio.
You can export to the following formats:
  1. Amazon Comprehend format
  2. JSON format
  3. spaCy: includes spaCy 2 and spaCy 3 binary formats
  4. Text classification format
  5. Image classification format
  6. Relations format: includes JSON and .spacy formats
  7. OCR format
    • OCR JSON: Contains the raw text, start/end offsets and bounding box of each token
    • OCR Form: Follows the form format used in the FUNSD dataset
    • OCR Form Processed: Follows the FUNSD format but with normalized width and height
  8. Stanford CoreNLP format
  9. IOB format: includes IOB Part Of Speech (POS) and IOB Chatbot
Note: For character-based projects, each character is tokenized separately in IOB exports, so it is recommended to export in JSON instead.
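To make the IOB option more concrete, here is a minimal Python sketch of how IOB tags (B- for the beginning of an entity, I- for inside, O for outside) can be grouped back into entity spans. It assumes the tokens and tags have already been read into two parallel lists; the exact file layout of the exported IOB file is not described on this page, and the function name `iob_to_spans` and the sample labels are hypothetical.

```python
from typing import List, Tuple


def iob_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group IOB-tagged tokens into (label, text) entity spans.

    Assumes tags look like "B-LABEL", "I-LABEL" or "O" (an assumption
    about the export, not something confirmed by this page).
    """
    spans: List[Tuple[str, str]] = []
    current_label = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Store the entity collected so far, if any.
        if current_label is not None:
            spans.append((current_label, " ".join(current_tokens)))

    for token, tag in zip(tokens, tags):
        starts_new = tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != current_label
        )
        if starts_new:
            # A B- tag (or a stray I- tag with a different label) opens a new entity.
            flush()
            current_label = tag[2:]
            current_tokens = [token]
        elif tag.startswith("I-"):
            # Continuation of the current entity.
            current_tokens.append(token)
        else:
            # "O" closes any open entity.
            flush()
            current_label = None
            current_tokens = []

    flush()
    return spans


if __name__ == "__main__":
    tokens = ["John", "Smith", "works", "at", "Acme", "."]
    tags = ["B-PERSON", "I-PERSON", "O", "O", "B-ORG", "O"]
    print(iob_to_spans(tokens, tags))
```

In a character-based project every character would be its own token with its own tag, which is why the JSON export, with start/end offsets, is usually easier to work with in that case.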
A zip file containing the annotations, along with the documents used during annotation, will be downloaded. You will need to unzip the file before using the annotations to train a model.
Note: For macOS users, it is recommended to unzip the file using WinZip in order to preserve file names.
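If you prefer to script the unzip step, the following is a minimal sketch using Python's standard `zipfile` module. The archive name `annotation_export.zip`, the output folder, and the helper `extract_export` are placeholders chosen for illustration; the actual name of the downloaded file may differ.

```python
import zipfile
from pathlib import Path
from typing import List


def extract_export(zip_path: str, out_dir: str) -> List[str]:
    """Extract an exported annotation archive and return the member names."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        # Extract every file in the archive into the target folder.
        archive.extractall(target)
        return sorted(archive.namelist())


if __name__ == "__main__":
    # Placeholder paths: replace with the location of your downloaded export.
    export_zip = "annotation_export.zip"
    if Path(export_zip).exists():
        for name in extract_export(export_zip, "annotation_export"):
            print(name)
```

Printing the member names after extraction is a quick way to check that the annotation file and the source documents are all present before training.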