Model-Assisted Labeling
Last updated
Model Creation:
To accelerate annotation, UBIAI can auto-annotate your documents using spaCy and transformer models. UBIAI supports training Named Entity Recognition, Span Categorizer, Relation Extraction, and Text Classification models. To train a model, click “MODELS” in the navigation bar and select the type of model to train. Then press “New model” to open the new model window, where you enter the model's name, type, and language.
Note: To train a relation extraction model, the training documents must already contain tagged entities so that the model can predict relations between them.
Once you press “Create Model”, the model is created but not yet trained.
Model Training:
Next, press the train button (play arrow icon) in the main Models menu to train the model. You will be prompted to choose the following options:
Select the project from which the training corpus will be taken
Select a pre-trained model: a spaCy model, a BERT model, a LayoutLM model (requires GPU credits), or a Template Form Recognizer (based on the Azure Form Recognizer model)
Select the training/evaluation partition of the annotated data used to train and evaluate the model
Configure the training by specifying the number of iterations, dropout and batch size.
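The options above can be pictured as a small configuration object plus a train/evaluation split. The sketch below is illustrative only: `TrainingConfig`, `split_documents`, and the default values are hypothetical names and numbers, not part of UBIAI's interface, and UBIAI performs the split and training for you.

```python
import random
from dataclasses import dataclass


@dataclass
class TrainingConfig:
    # Hypothetical names and defaults, chosen only for illustration.
    iterations: int = 10
    dropout: float = 0.2
    batch_size: int = 16
    train_fraction: float = 0.8  # the remainder is used for evaluation

    def validate(self) -> None:
        """Reject values that cannot describe a meaningful training run."""
        if self.iterations < 1:
            raise ValueError("iterations must be at least 1")
        if not 0.0 <= self.dropout < 1.0:
            raise ValueError("dropout must be in [0, 1)")
        if self.batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        if not 0.0 < self.train_fraction < 1.0:
            raise ValueError("train_fraction must be in (0, 1)")


def split_documents(docs, train_fraction, seed=0):
    """Shuffle annotated documents and split them into train/eval lists."""
    shuffled = list(docs)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


config = TrainingConfig()
config.validate()
train_docs, eval_docs = split_documents(range(100), config.train_fraction)
print(len(train_docs), len(eval_docs))
```

Fixing the shuffle seed keeps the partition reproducible, so scores from different training runs are computed on the same evaluation documents.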
You can also auto-annotate your documents once the model finishes training by checking the “Annotate your documents after finish train” option. Note: For efficient model training, it is recommended to annotate at least 10% of your total documents.
After training, UBIAI immediately evaluates the model on the train/validation partition and displays the precision, recall, F-score, and model hyperparameters.
To track and analyze model performance across multiple training runs, click the model name to open the analysis page. On this page, you will find all the training runs for that model.
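For readers who want to see how these three scores relate, here is a minimal sketch that computes them from gold and predicted entity spans. It assumes exact-match scoring on `(start, end, label)` tuples, which may differ from how UBIAI scores a model; the function name and sample spans are made up for the example.

```python
def precision_recall_f1(gold, predicted):
    """Score predicted spans against gold spans using exact matches.

    Each span is a (start, end, label) tuple. The exact-match rule is an
    assumption made for this sketch.
    """
    gold_set, pred_set = set(gold), set(predicted)
    true_pos = len(gold_set & pred_set)
    precision = true_pos / len(pred_set) if pred_set else 0.0
    recall = true_pos / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up spans: two of four predictions are correct, one gold span is missed.
gold = [(0, 5, "PERSON"), (10, 16, "ORG"), (20, 26, "DATE")]
pred = [(0, 5, "PERSON"), (10, 16, "PERSON"), (20, 26, "DATE"), (30, 34, "ORG")]
p, r, f = precision_recall_f1(gold, pred)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Precision falls when the model predicts spans that are not in the gold annotations, recall falls when it misses gold spans, and the F-score is the harmonic mean of the two.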
In addition to the overall model score, the page shows a score breakdown for each entity.
Precision, recall, and F-score charts are also shown.
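A per-entity breakdown can be understood as the same scoring applied label by label. The following sketch groups exact-match counts by entity label; `per_label_scores` and the sample data are hypothetical, and the matching rule is an assumption rather than UBIAI's documented method.

```python
from collections import defaultdict


def per_label_scores(gold, predicted):
    """Return {label: (precision, recall, f1)} using exact span matches."""
    gold_set, pred_set = set(gold), set(predicted)
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for span in pred_set:
        # A prediction counts for the label it predicted.
        key = "tp" if span in gold_set else "fp"
        counts[span[2]][key] += 1
    for span in gold_set - pred_set:
        # A missed gold span counts against its gold label.
        counts[span[2]]["fn"] += 1

    scores = {}
    for label, c in counts.items():
        predicted_total = c["tp"] + c["fp"]
        gold_total = c["tp"] + c["fn"]
        precision = c["tp"] / predicted_total if predicted_total else 0.0
        recall = c["tp"] / gold_total if gold_total else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


gold = [(0, 5, "PERSON"), (10, 16, "ORG"), (20, 26, "DATE")]
pred = [(0, 5, "PERSON"), (10, 16, "PERSON"), (20, 26, "DATE"), (30, 34, "ORG")]
for label, (p, r, f) in sorted(per_label_scores(gold, pred).items()):
    print(f"{label:<7} precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

A low score on a single label usually points to where more annotated examples would help most.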
Model Labeling:
To minimize manual labeling, you can train a model on your own annotations and predict labels directly from the annotation interface. First, select the model type (NER or relations) from the models tab and configure hyperparameters such as the number of iterations, batch size, and dropout.
Next, you can run the training manually by clicking on the training button, or select the auto-train option to launch automatic training every 50 annotated documents.
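The auto-train behavior amounts to a counter that fires a training run at every 50th annotated document. This is a conceptual sketch only: `AutoTrainTrigger` and its `train` callback are hypothetical and do not correspond to any UBIAI API; only the interval of 50 comes from the description above.

```python
class AutoTrainTrigger:
    """Call `train` each time another `interval` documents are annotated."""

    def __init__(self, train, interval=50):
        self.train = train          # callback that launches a training run
        self.interval = interval    # 50 matches the interval described above
        self.annotated = 0

    def record_annotation(self):
        """Count one annotated document; return True if training was launched."""
        self.annotated += 1
        if self.annotated % self.interval == 0:
            self.train(self.annotated)
            return True
        return False


# Stand-in callback: record the document count at which each run would start.
runs = []
trigger = AutoTrainTrigger(train=runs.append)
for _ in range(120):
    trigger.record_annotation()
print(runs)
```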
Once the model is trained, you can start predicting labels or relations by clicking the "Predict" button. Then correct the predicted labels and re-train the model iteratively.
Model Export:
You can export a ready-to-use fine-tuned model to integrate with your application. To do so, go to the models page and press the download button (archive icon) to export the model.