Recogito Tagging to spaCy Trained Model
https://colab.research.google.com/drive/1xB0MVhC1vTvXdlM5iNym_BLCUzU_rE7X?usp=sharing
Problem:
Researchers who manually tag text in Recogito may want to train a named entity recognition (NER) model based on those tags. For example, a researcher who collects documents using Gale’s Digital Scholar Lab may want to train their own NER model. In that case, they can tag a sample of their documents in Recogito and then use this Jupyter Notebook to train, test, and apply a machine learning model.
Solution:
This Jupyter Notebook uses the spaCy natural language processing library to train, test, and apply a machine learning model based on Recogito tags.
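As a rough illustration of the first step, the sketch below turns Recogito-style annotation rows into the (text, {"entities": [...]}) shape often used as a starting point for spaCy NER training data. It is not the notebook's own code. The column names ANCHOR, QUOTE_TRANSCRIPTION, and TAGS, the "char-offset:" anchor prefix, and the use of the tag as the entity label are all assumptions about the export; check them against your own file. The function name and sample data are hypothetical.

```python
import csv
import io


def recogito_csv_to_entities(text, csv_text):
    """Convert annotation rows into (text, {"entities": [(start, end, label)]}).

    Assumed (not confirmed) columns: ANCHOR holds "char-offset:<n>",
    QUOTE_TRANSCRIPTION holds the tagged text, TAGS holds the label.
    """
    entities = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        anchor = row.get("ANCHOR") or ""
        quote = row.get("QUOTE_TRANSCRIPTION") or ""
        label = (row.get("TAGS") or "").strip()
        if not anchor.startswith("char-offset:") or not quote or not label:
            continue  # skip rows that lack an offset, a quote, or a label
        try:
            start = int(anchor.split(":", 1)[1])
        except ValueError:
            continue  # skip rows whose offset is not a number
        end = start + len(quote)
        # Keep only annotations whose offsets really match the document text.
        if text[start:end] != quote:
            continue
        entities.append((start, end, label))
    entities.sort()
    return (text, {"entities": entities})


# Hypothetical sample document and annotation export.
sample_text = "Ada Lovelace was born in London."
sample_csv = (
    "QUOTE_TRANSCRIPTION,ANCHOR,TAGS\n"
    "Ada Lovelace,char-offset:0,PERSON\n"
    "London,char-offset:25,PLACE\n"
)
training_example = recogito_csv_to_entities(sample_text, sample_csv)
print(training_example)
```

Checking each quote against the text at its offset guards against annotations made on a different version of the document, which would otherwise produce misaligned entity spans during training.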
Guides Used:
- William Mattingly and the 2021 TAP Institute class on Machine Learning (click the Launch Binder button for class materials)
- Using spaCy 3.0 to build a custom NER model, by Zachary Lim
- How to Train spaCy to Autodetect New Entities (NER) [Complete Guide], by Shrivarsheni