Entity Linking

GLiNER-Linker is a family of bi-encoder models for entity disambiguation, developed as the neural component of the GLiNKER framework. These models resolve extracted entity mentions by linking them to the correct entries in a knowledge base, handling ambiguity such as distinguishing "Apple" (the company) from "Apple" (the fruit).

Overview

Architecture: Bi-encoder (separate text encoder and label encoder sharing the same base model).
Task: Entity disambiguation / Entity linking.
Languages Supported: English.
License: Apache 2.0.

Available Models

Linking Models

Linking models perform entity disambiguation by computing similarity between mention contexts and candidate entity descriptions.

Model	Base Encoder	Use Case
gliner-linker-base-v1.0	deberta-base	Balanced performance
gliner-linker-large-v1.0	deberta-large	Maximum accuracy
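
The similarity step described above can be illustrated with a toy example. This is only a sketch: the three-dimensional vectors are invented for illustration, and cosine similarity is an assumed scoring function, since the exact scoring used by GLiNER-Linker is not stated here. The helper names `cosine` and `rank_candidates` are hypothetical and are not part of the GLinker API.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_candidates(mention_vec, candidate_vecs):
    """Return (entity_id, score) pairs sorted by descending similarity."""
    scored = [(eid, cosine(mention_vec, vec)) for eid, vec in candidate_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical embeddings, invented for illustration only.
mention = [0.9, 0.1, 0.2]        # stands in for the context "Apple announced new iPhone"
candidates = {
    "Q312": [0.8, 0.2, 0.1],     # stands in for the Apple Inc. description
    "Q89": [0.1, 0.9, 0.3],      # stands in for the apple (fruit) description
}

ranking = rank_candidates(mention, candidates)
best_id, best_score = ranking[0]
print(best_id, round(best_score, 3))
```

In a real pipeline the text encoder would produce the mention vector and the label encoder would produce the candidate vectors; only the ranking logic is shown here.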

Reranking Model

When the candidate set is large, the reranker splits candidates into chunks and runs inference on each chunk, then merges and deduplicates results for improved accuracy.

Model	Base Encoder	Use Case
gliner-linker-rerank-v1.0	ettin-encoder-68m	Reranking
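
The chunk, score, merge, and deduplicate flow described above could look roughly like the following. This is a minimal sketch under stated assumptions: `rerank`, `chunked`, and `fake_score_chunk` are hypothetical helpers, the scores are made up, and the `threshold` and `max_labels` parameters only echo the names used in the reranker configuration; their exact semantics in GLinker are assumed here.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def rerank(candidates, score_chunk, chunk_size=2, threshold=0.3, max_labels=5):
    """Score candidates chunk by chunk, then merge, deduplicate, and truncate."""
    best = {}  # entity_id -> highest score seen across all chunks
    for chunk in chunked(candidates, chunk_size):
        for entity_id, score in score_chunk(chunk):
            # Keep a candidate only if it passes the threshold and improves
            # on any score already recorded for the same entity.
            if score >= threshold and score > best.get(entity_id, float("-inf")):
                best[entity_id] = score
    merged = sorted(best.items(), key=lambda pair: pair[1], reverse=True)
    return merged[:max_labels]

# Made-up scores standing in for the reranker model's output.
FAKE_SCORES = {"Q312": 0.92, "Q89": 0.15, "Q_OTHER": 0.41}

def fake_score_chunk(chunk):
    """Hypothetical scorer: looks up a fixed score for each candidate."""
    return [(entity_id, FAKE_SCORES[entity_id]) for entity_id in chunk]

# "Q312" appears twice to show that duplicates collapse into one result.
results = rerank(["Q312", "Q89", "Q_OTHER", "Q312"], fake_score_chunk)
print(results)
```

Keeping the highest score per entity is one plausible merge rule; averaging across chunks would be another.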

Usage

Installation

pip install git+https://github.com/Knowledgator/GLinker.git

Entity Input Format

Entities are provided as JSONL files with the following structure:

{"entity_id": "Q312", "label": "Apple Inc.", "description": "American technology company", "entity_type": "organization"}
{"entity_id": "Q89", "label": "Apple", "description": "Edible fruit of apple tree", "entity_type": "food"}

Basic Entity Linking Pipeline

from glinker import ConfigBuilder, DAGExecutor

# Build pipeline
builder = ConfigBuilder(name="entity_linking")

# L1: Extract mentions
builder.l1.gliner(
    model="knowledgator/gliner-bi-base-v2.0",
    labels=["person", "organization", "location"]
)

# L2: Candidate retrieval
builder.l2.add("dict", priority=0)

# L3: Disambiguation with GLiNER-Linker
builder.l3.configure(
    model="knowledgator/gliner-linker-large-v1.0",
    use_precomputed_embeddings=True
)

# Execute
executor = DAGExecutor(builder.get_config())
executor.load_entities("entities.jsonl", target_layers=["dict"])

result = executor.execute({
    "texts": ["Apple announced new iPhone"]
})

# Get linked entities
l0_result = result.get("l0_result")
for entity in l0_result.entities:
    if entity.linked_entity:
        print(f"{entity.mention_text} -> {entity.linked_entity.label}")
        print(f"  Score: {entity.linked_entity.score:.3f}")

Precomputed Embeddings

Precomputing entity embeddings can speed up large-scale linking by 10-100x, because candidate descriptions are encoded once rather than on every query:

builder.l2.embeddings(
    enabled=True,
    model_name="knowledgator/gliner-linker-large-v1.0"
)

executor.load_entities("entities.jsonl", target_layers=["dict"])
executor.precompute_embeddings(target_layers=["dict"], batch_size=8)
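
Why precomputation helps can be shown with a toy encoder that counts how often it runs. This is an illustrative sketch only: `CountingEncoder` and `precompute` are hypothetical, and the "embedding" is a trivial stand-in rather than real model output.

```python
class CountingEncoder:
    """Hypothetical stand-in for a label encoder; counts how often it runs."""

    def __init__(self):
        self.calls = 0

    def encode(self, text):
        self.calls += 1
        # Toy embedding (text length and vowel count), not a real model output.
        vowels = sum(ch in "aeiou" for ch in text.lower())
        return [float(len(text)), float(vowels)]

def precompute(encoder, entities):
    """Encode every entity description once and index the vectors by entity_id."""
    return {e["entity_id"]: encoder.encode(e["description"]) for e in entities}

entities = [
    {"entity_id": "Q312", "description": "American technology company"},
    {"entity_id": "Q89", "description": "Edible fruit of apple tree"},
]

encoder = CountingEncoder()
cache = precompute(encoder, entities)
calls_after_precompute = encoder.calls

# Later queries look vectors up instead of re-encoding the descriptions.
for _ in range(100):
    vectors = [cache[e["entity_id"]] for e in entities]
calls_after_queries = encoder.calls

print(calls_after_precompute, calls_after_queries)
```

The encoder runs once per entity regardless of how many queries follow, so encoding cost stops scaling with query volume.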

Pipeline with Reranker

Add the reranker as an L4 stage for improved disambiguation when the candidate set is large:

builder = ConfigBuilder(name="reranked")
builder.l1.gliner(
    model="knowledgator/gliner-bi-base-v2.0",
    labels=["gene", "disease"]
)
builder.l3.configure(model="knowledgator/gliner-linker-base-v1.0")
builder.l4.configure(
    model="knowledgator/gliner-linker-rerank-v1.0",
    threshold=0.3,
    max_labels=5,
)

Model Selection Guide

Use Case	Linker Model	Reranker
Balanced performance	gliner-linker-base-v1.0	---
Maximum accuracy	gliner-linker-large-v1.0	Optional
Large candidate sets	gliner-linker-large-v1.0	gliner-linker-rerank-v1.0

For detailed pipeline configuration and advanced usage, see the GLiNKER framework documentation.
