UTC
Prompt-based token classification model built on transformer encoder backbones. Trained on a broad mix of token classification tasks, it demonstrates strong generalization and excels in zero-shot and few-shot settings across diverse information extraction (IE) tasks, making it a versatile tool for a wide range of NLP applications. Supported tasks include the following (a usage sketch follows the list):
Named-entity recognition (NER)
Relation extraction
Summarization
Question answering (Q&A)
Text cleaning
Coreference resolution
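To make the prompt-driven workflow concrete, here is a minimal usage sketch built on the Hugging Face token-classification pipeline. The checkpoint id is a placeholder and the prompt wording is an illustrative assumption; consult the model cards for the exact templates each task expects.

```python
# Minimal usage sketch, assuming a Hub checkpoint and a prompt template.
from transformers import pipeline

utc = pipeline(
    "token-classification",
    model="UTC-small",  # placeholder: substitute the real Hub checkpoint id
    aggregation_strategy="first",  # merge sub-word tokens into word spans
)

# The task is stated in the prompt, followed by the text to analyze.
prompt = "Identify the following entity classes in the text: company. Text:"
text = "Apple was founded in 1976 by Steve Jobs, Steve Wozniak and Ronald Wayne."

for span in utc(prompt + " " + text):
    print(span["word"], span["entity_group"], round(span["score"], 3))
```

Because the output is token-level spans with character offsets (`start`/`end`), the same decoding path serves NER, coreference resolution, text cleaning, and the other tasks listed above; only the prompt changes.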
Model       Parameters   Context length   Language
UTC-small   141M         3K tokens        English
UTC-base    184M         3K tokens        English
UTC-large   434M         3K tokens        English
UTC-large   783M         3K tokens        English
Prompt-based. The model was trained on multiple token classification tasks, making it adaptable to a wide variety of information extraction tasks through user prompts.
Zero-shot and few-shot learning. The model can perform tasks with little or no task-specific training data, making it highly adaptable to new challenges; see the prompt-swapping sketch below.
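As a concrete zero-shot illustration, the sketch below retargets the `utc` pipeline from the earlier sketch to question answering purely by rewriting the prompt. The Q&A template shown is an assumption, not a documented one.

```python
# Zero-shot task switch: same pipeline as above, new prompt.
# The Q&A prompt template here is an illustrative assumption.
qa_prompt = "Answer the question based on the text. Question: Who founded Apple? Text:"
text = "Apple was founded in 1976 by Steve Jobs, Steve Wozniak and Ronald Wayne."

# Tokens the model tags are taken as the answer span(s).
for span in utc(qa_prompt + " " + text):
    print(span["word"], round(span["score"], 3))
```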
3K token capacity. The model can process texts up to 3,000 tokens in length; expanding this capacity is ongoing work. A chunking sketch for longer inputs follows.
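For inputs beyond the 3K-token window, one workable pattern is to split the document into overlapping chunks and run each through the pipeline separately. The sketch below assumes the same placeholder checkpoint id as above and reserves headroom for the prompt tokens.

```python
# Chunking sketch for long documents, assuming the placeholder checkpoint.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("UTC-small")  # placeholder id

def chunk_text(text, max_tokens=2800, stride=200):
    """Yield overlapping text chunks of at most max_tokens tokens each,
    leaving room under the 3K window for the task prompt."""
    ids = tokenizer(text, add_special_tokens=False)["input_ids"]
    step = max_tokens - stride
    for start in range(0, max(len(ids) - stride, 1), step):
        yield tokenizer.decode(ids[start:start + max_tokens])
```

Note that span offsets returned for each chunk are relative to that chunk, so downstream code must map them back to document-level coordinates.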
English only. The model currently supports the English language only.
Fine-tuning. Currently, you can fine-tune the model via the Hugging Face AutoTrain feature.
Limitations. While the model shows promise in summarization, that is currently not its strongest application; enhancing it is a focus of future development.
License. All four models are released as open source.
Potential. The model's prompt-based approach allows for flexible adaptation to various tasks. Its strength in token-level analysis makes it highly effective for detailed text-processing tasks.