Retrieving
RetriCo provides multiple retrieval strategies to query your knowledge graph. Each strategy approaches the graph differently — you can use them individually or combine them with fusion.
Overview
| Strategy | Description | Requires | Best for |
|---|---|---|---|
| Entity Lookup | Find entities by name, expand k-hop neighborhoods | Entity labels | Direct entity questions |
| Path-based | Shortest paths between parsed entities | Entity labels | Connection questions |
| Entity Embeddings | Vector similarity over KG-trained entity embeddings | Pre-built embeddings | Similar entity discovery |
| Chunk Embeddings | Semantic search over source text chunks | Pre-built embeddings | Free-text questions |
| Community Search | Vector search over community summaries | Pre-built communities | Broad topic questions |
| Tool-based | LLM agent with graph query tools | API key | Complex multi-hop questions |
| Keyword Search | BM25 full-text search over chunks | Chunk store | Exact term matching |
| Fusion | Combine multiple strategies | 2+ strategies | Best overall accuracy |
Creating a Query Pipeline
Like build pipelines, query pipelines support three creation methods.
Option 1: One-liner
import retrico
result = retrico.query_graph(
query="Where was Einstein born?",
entity_labels=["person", "location"],
api_key="sk-...",
)
print(result.answer)
Option 2: Builder API
builder = retrico.RetriCoSearch(name="my_query")
builder.query_parser(method="gliner", labels=["person", "location"])
builder.retriever(max_hops=2)
builder.chunk_retriever()
builder.reasoner(api_key="sk-...", model="gpt-4o-mini")
executor = builder.build()
result = executor.run(query="Where was Einstein born?")
Option 3: YAML Config
name: query_pipeline
stores:
graph:
store_type: neo4j
uri: "bolt://localhost:7687"
nodes:
- id: parser
processor: query_parser
inputs:
query: {source: "$input", fields: "query"}
output: {key: "parser_result"}
config:
method: gliner
labels: [person, location]
- id: retriever
processor: retriever
requires: [parser]
inputs:
entities: {source: "parser_result", fields: "entities"}
output: {key: "retriever_result"}
config:
max_hops: 2
- id: chunks
processor: chunk_retriever
requires: [retriever]
inputs:
subgraph: {source: "retriever_result", fields: "subgraph"}
output: {key: "chunk_result"}
- id: reasoner
processor: reasoner
requires: [chunks]
inputs:
query: {source: "$input", fields: "query"}
subgraph: {source: "chunk_result", fields: "subgraph"}
output: {key: "reasoner_result"}
config:
api_key: "sk-..."
model: "gpt-4o-mini"
executor = retrico.ProcessorFactory.create_pipeline("query_pipeline.yaml")
result = executor.run(query="Where was Einstein born?")
Entity Lookup
The default strategy. Parses the query for entities using NER, looks them up in the graph, and expands their neighborhoods by max_hops.
Builder API:
builder = retrico.RetriCoSearch(name="my_query")
builder.query_parser(method="gliner", labels=["person", "location"])
builder.retriever(max_hops=2)
builder.chunk_retriever()
builder.reasoner(api_key="sk-...", model="gpt-4o-mini") # optional
executor = builder.build()
result = executor.run(query="Where was Einstein born?")
One-liner:
result = retrico.query_graph(
query="Where was Einstein born?",
entity_labels=["person", "location"],
retrieval_strategy="entity",
)
YAML:
- id: retriever
processor: retriever
requires: [parser]
inputs:
entities: {source: "parser_result", fields: "entities"}
output: {key: "retriever_result"}
config:
max_hops: 2
How it works:
- `query_parser` extracts entities from the query (e.g. "Einstein" as a person)
- `retriever` finds matching entities in the graph, then expands their k-hop neighborhood
- `chunk_retriever` fetches source text chunks for retrieved entities
- `reasoner` (optional) generates a natural language answer from the subgraph
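The neighborhood expansion in the retriever step amounts to a breadth-first walk bounded by `max_hops`. A minimal illustrative sketch in plain Python over an adjacency dict (not RetriCo internals):

```python
def expand_khop(adjacency, seeds, max_hops=2):
    # adjacency: entity ID -> iterable of neighbor IDs
    # Returns all entities reachable from the seeds within max_hops hops.
    visited = set(seeds)
    frontier = set(seeds)
    for _ in range(max_hops):
        frontier = {n for e in frontier for n in adjacency.get(e, ())} - visited
        if not frontier:
            break
        visited |= frontier
    return visited
```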
Parameters:
| Parameter | Default | Description |
|---|---|---|
| `max_hops` | 2 | How many relationship hops to expand |
| `active_after` | None | Only include relations active on or after this date (ISO 8601) |
| `active_before` | None | Only include relations active on or before this date (ISO 8601) |
Temporal Filtering
Relations in RetriCo can carry start_date and end_date properties (set during data ingest or via the graph store API). Use active_after and active_before to filter relations by time range:
builder = retrico.RetriCoSearch(name="temporal_query")
builder.query_parser(method="gliner", labels=["person", "organization"])
builder.retriever(
max_hops=2,
active_after="2020-01-01",
active_before="2020-12-31",
)
builder.reasoner(api_key="sk-...", model="gpt-4o-mini")
- id: retriever
processor: retriever
config:
max_hops: 2
active_after: "2020-01-01"
active_before: "2020-12-31"
The filtering logic:
- `active_after` — keeps relations where `end_date IS NULL OR end_date >= active_after`
- `active_before` — keeps relations where `start_date IS NULL OR start_date <= active_before`
- Relations without dates are always included (treated as always active)
Temporal filtering also works with path_retriever and entity_embedding_retriever.
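The filter semantics above can be expressed as a small predicate. A sketch (hypothetical helper, not the RetriCo API; ISO 8601 date strings compare correctly as plain strings):

```python
def relation_is_active(start_date, end_date, active_after=None, active_before=None):
    # Dates are ISO 8601 strings ("YYYY-MM-DD") or None.
    # A relation with no dates is treated as always active.
    if active_after and end_date and end_date < active_after:
        return False  # relation ended before the window starts
    if active_before and start_date and start_date > active_before:
        return False  # relation starts after the window ends
    return True
```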
Entity Lookup with Linking
Same as entity lookup, but links parsed entities to a knowledge base first for precise lookup by stable ID.
Builder API:
builder = retrico.RetriCoSearch(name="linked_query")
builder.query_parser(labels=["person", "location"])
builder.linker(executor=glinker_executor) # or neo4j_uri= to load KB from graph
builder.retriever(max_hops=2)
builder.chunk_retriever()
executor = builder.build()
Path-based Retrieval
Finds the shortest paths between entities parsed from the query. Useful when the answer lies in the connections between entities rather than individual neighborhoods.
Builder API:
builder = retrico.RetriCoSearch(name="path_query")
builder.query_parser(method="gliner", labels=["person", "location"])
builder.path_retriever()
builder.chunk_retriever()
executor = builder.build()
One-liner:
result = retrico.query_graph(
query="How is Einstein connected to the University of Paris?",
entity_labels=["person", "organization"],
retrieval_strategy="path",
)
YAML:
- id: retriever
processor: path_retriever
requires: [parser]
inputs:
entities: {source: "parser_result", fields: "entities"}
output: {key: "path_retriever_result"}
Entity Embeddings
Uses vector similarity to find entities whose KG embeddings are closest to the query entities. Requires pre-built entity embeddings (see Modeling).
Builder API:
builder = retrico.RetriCoSearch(name="embedding_query")
builder.query_parser(method="gliner", labels=["person", "location"])
builder.entity_embedding_retriever(
top_k=5,
max_hops=2,
vector_index_name="entity_embeddings",
embedding_method="sentence_transformer",
model_name="all-MiniLM-L6-v2",
)
builder.chunk_retriever()
executor = builder.build()
One-liner:
result = retrico.query_graph(
query="Who works at similar institutions to Einstein?",
entity_labels=["person", "organization"],
retrieval_strategy="entity_embedding",
retriever_kwargs={"top_k": 10},
)
YAML:
- id: retriever
processor: entity_embedding_retriever
requires: [parser]
inputs:
entities: {source: "parser_result", fields: "entities"}
output: {key: "entity_embedding_retriever_result"}
config:
top_k: 5
max_hops: 2
vector_index_name: entity_embeddings
embedding_method: sentence_transformer
model_name: "all-MiniLM-L6-v2"
Parameters:
| Parameter | Default | Description |
|---|---|---|
| `top_k` | 5 | Number of similar entities to retrieve |
| `max_hops` | 2 | Expand neighborhoods around matched entities |
| `vector_index_name` | (required) | Name of the vector index in the store |
| `embedding_method` | "sentence_transformer" | "sentence_transformer" or "openai" |
| `model_name` | "all-MiniLM-L6-v2" | Embedding model name |
Chunk Embeddings
Semantic search over source text chunks. Bypasses the graph structure entirely — finds chunks whose embeddings are most similar to the query. Requires pre-built chunk embeddings.
Builder API:
builder = retrico.RetriCoSearch(name="chunk_query")
builder.chunk_embedding_retriever(
top_k=5,
max_hops=1,
vector_index_name="chunk_embeddings",
)
executor = builder.build()
One-liner:
result = retrico.query_graph(
query="What is the theory of relativity?",
retrieval_strategy="chunk_embedding",
retriever_kwargs={"top_k": 10},
)
YAML:
- id: retriever
processor: chunk_embedding_retriever
inputs:
query: {source: "$input", fields: "query"}
output: {key: "chunk_embedding_retriever_result"}
config:
top_k: 5
max_hops: 1
vector_index_name: chunk_embeddings
Parameters:
| Parameter | Default | Description |
|---|---|---|
| `top_k` | 5 | Number of chunks to retrieve |
| `max_hops` | 1 | Expand entity neighborhoods from matched chunks |
| `vector_index_name` | (required) | Name of the vector index |
Community Search
Vector search over community summaries. Requires pre-built communities with summaries and embeddings (see Modeling - Community Detection).
Builder API:
builder = retrico.RetriCoSearch(name="community_query")
builder.community_retriever(
top_k=3,
max_hops=1,
vector_index_name="community_embeddings",
)
builder.chunk_retriever()
executor = builder.build()
One-liner:
result = retrico.query_graph(
query="What research fields are represented in the graph?",
retrieval_strategy="community",
api_key="sk-...",
)
YAML:
- id: retriever
processor: community_retriever
inputs:
query: {source: "$input", fields: "query"}
output: {key: "community_retriever_result"}
config:
top_k: 3
max_hops: 1
vector_index_name: community_embeddings
Parameters:
| Parameter | Default | Description |
|---|---|---|
| `top_k` | 3 | Number of communities to retrieve |
| `max_hops` | 1 | Expand entity neighborhoods within matched communities |
Tool-based Retrieval
An LLM agent that iteratively queries the graph using tools (entity lookup, relation search, path finding). The agent decides which tools to call based on the query.
Builder API:
builder = retrico.RetriCoSearch(name="tool_query")
builder.tool_retriever(
api_key="sk-...",
model="gpt-4o-mini",
max_tool_rounds=3,
entity_types=["person", "organization"],
relation_types=["WORKS_AT", "BORN_IN"],
)
builder.chunk_retriever()
executor = builder.build()
One-liner:
result = retrico.query_graph(
query="What organizations did Einstein work at?",
retrieval_strategy="tool",
api_key="sk-...",
model="gpt-4o-mini",
)
YAML:
- id: retriever
processor: tool_retriever
inputs:
query: {source: "$input", fields: "query"}
output: {key: "tool_retriever_result"}
config:
api_key: "sk-..."
model: "gpt-4o-mini"
max_tool_rounds: 3
entity_types: [person, organization]
relation_types: [WORKS_AT, BORN_IN]
Parameters:
| Parameter | Default | Description |
|---|---|---|
| `api_key` | (required) | OpenAI-compatible API key |
| `model` | "gpt-4o-mini" | LLM model name |
| `max_tool_rounds` | 3 | Maximum iterations for the agent loop |
| `entity_types` | [] | Hint about available entity types |
| `relation_types` | [] | Hint about available relation types |
| `chunk_source` | "entity" | How to resolve chunks: "entity" or "relation" |
The tool retriever also supports temporal filtering — the LLM agent can pass start_date and end_date arguments to tools like get_entity_relations dynamically based on the query context.
Keyword Search
Full-text search over chunks. Supports two search backends:
- Relational (`search_source="relational"`, default) — searches SQLite FTS5, PostgreSQL tsvector, or Elasticsearch
- Graph (`search_source="graph"`) — uses the graph DB's native full-text index (Neo4j Lucene, FalkorDB FTS, Memgraph Tantivy)

Two entity modes:
- Chunks-only (default for relational) — returns matched chunks directly
- Entity expansion (`expand_entities=True`, default for graph) — additionally looks up entities mentioned in matched chunks
Relational source (chunks-only):
builder = retrico.RetriCoSearch(name="keyword_query")
builder.keyword_retriever(
top_k=10,
relational_store_type="sqlite",
sqlite_path="chunks.db",
)
builder.reasoner(api_key="sk-...", model="gpt-4o-mini")
executor = builder.build()
Relational source (with entity expansion):
builder = retrico.RetriCoSearch(name="keyword_expanded")
builder.keyword_retriever(
top_k=10,
expand_entities=True,
max_hops=1,
relational_store_type="sqlite",
sqlite_path="chunks.db",
)
builder.chunk_retriever()
executor = builder.build()
Graph DB source (native FTS):
builder = retrico.RetriCoSearch(name="graph_keyword_query")
builder.keyword_retriever(
search_source="graph",
top_k=10,
)
builder.chunk_retriever()
executor = builder.build()
One-liner:
result = retrico.query_graph(
query="theory of relativity",
retrieval_strategy="keyword",
)
YAML:
- id: retriever
processor: keyword_retriever
inputs:
query: {source: "$input", fields: "query"}
output: {key: "keyword_retriever_result"}
config:
top_k: 10
search_source: graph # or "relational"
| Graph DB | FTS engine | Index creation |
|---|---|---|
| Neo4j | Lucene | CREATE FULLTEXT INDEX ... (automatic) |
| FalkorDB | Built-in | CALL db.idx.fulltext.createNodeIndex(...) (automatic) |
| Memgraph | Tantivy | CREATE TEXT INDEX ... (automatic) |
KG-Scored Retrieval
Uses an LLM tool-calling parser to decompose the query into structured triple patterns, resolves those patterns against the graph store, and scores the candidates with trained KG embeddings. Because any query can be decomposed into triples, the KG scorer acts as a universal retriever.
Prerequisites: a trained KG embedding model (optional but recommended; without one, matched triples are returned unscored). Use retrico.train_kg_model() to train one.
Builder API:
builder = retrico.RetriCoSearch(name="kg_scored_query")
builder.query_parser(
method="tool",
api_key="sk-...",
model="gpt-4o-mini",
labels=["person", "location"],
relation_labels=["born_in", "works_at"],
)
builder.kg_scorer(
model_path="kg_model",
top_k=10,
predict_tails=True,
score_threshold=0.5,
device="cpu",
)
builder.chunk_retriever(chunk_entity_source="both")
builder.reasoner(api_key="sk-...", model="gpt-4o-mini")
executor = builder.build()
ctx = executor.run({"query": "Where was Einstein born?"})
One-liner:
result = retrico.query_graph(
query="Where was Einstein born?",
api_key="sk-...",
model="gpt-4o-mini",
retrieval_strategy="kg_scored",
entity_labels=["person", "location"],
retriever_kwargs={
"relation_labels": ["born_in", "works_at"],
"model_path": "kg_model",
"top_k": 10,
"predict_tails": True,
},
)
How it works:
- Tool-calling parser decomposes the query into `search_triples(head, relation, tail)` calls
- KG scorer looks up entities in the graph store
- Scores candidate triples with the KGE model (if available)
- Builds a Subgraph from scored results
- Optionally predicts missing links
Parameters:
| Parameter | Default | Description |
|---|---|---|
| `model_path` | (required) | Trained KGE model directory |
| `top_k` | 10 | Top predictions per entity |
| `predict_tails` | True | Predict (entity, relation, ?) |
| `predict_heads` | False | Predict (?, relation, entity) |
| `score_threshold` | None | Minimum score filter |
| `device` | "cpu" | "cpu" or "cuda" |
Strategy Comparison
| Strategy | Needs parser? | Needs embeddings? | Needs LLM? | Best for |
|---|---|---|---|---|
| entity (default) | yes | no | no | Direct entity lookup |
| entity + linking | yes | no | no | Precise lookup with KB IDs |
| community | no | yes (community) | no | Topic/cluster-based queries |
| chunk_embedding | no | yes (chunk) | no | Semantic similarity search |
| entity_embedding | yes | yes (entity) | no | Finding similar entities |
| tool | no | no | yes | Complex multi-hop questions |
| path | yes | no | no | Relationship discovery |
| kg_scored | yes (tool) | optional (KGE) | yes | Structured triple matching + link prediction |
| keyword | no | no | no | Full-text search (relational or graph DB) |
Fusion: Combining Strategies
When a single strategy isn't enough, combine multiple strategies and fuse their results.
Builder API
builder = retrico.RetriCoSearch(name="fused_query")
builder.query_parser(method="gliner", labels=["person", "location"])
builder.retriever(max_hops=2) # entity lookup
builder.path_retriever() # path-based
builder.community_retriever() # community search
builder.fusion(strategy="rrf", top_k=20) # merge results
builder.chunk_retriever()
builder.reasoner(api_key="sk-...", model="gpt-4o-mini")
executor = builder.build()
One-liner
result = retrico.query_graph(
query="Where was Einstein born?",
entity_labels=["person", "location"],
retrieval_strategy=["entity", "community", "path"], # list triggers fusion
fusion_strategy="rrf",
api_key="sk-...",
)
YAML
name: fused_query
nodes:
- id: parser
processor: query_parser
inputs:
query: {source: "$input", fields: "query"}
output: {key: "parser_result"}
config:
method: gliner
labels: [person, location]
- id: retriever_0
processor: retriever
requires: [parser]
inputs:
entities: {source: "parser_result", fields: "entities"}
output: {key: "retriever_0_result"}
config:
max_hops: 2
- id: retriever_1
processor: path_retriever
requires: [parser]
inputs:
entities: {source: "parser_result", fields: "entities"}
output: {key: "retriever_1_result"}
- id: retriever_2
processor: community_retriever
inputs:
query: {source: "$input", fields: "query"}
output: {key: "retriever_2_result"}
config:
top_k: 3
- id: fusion
processor: fusion
requires: [retriever_0, retriever_1, retriever_2]
inputs:
subgraph_0: {source: "retriever_0_result", fields: "subgraph"}
subgraph_1: {source: "retriever_1_result", fields: "subgraph"}
subgraph_2: {source: "retriever_2_result", fields: "subgraph"}
output: {key: "fusion_result"}
config:
strategy: rrf
top_k: 20
- id: chunks
processor: chunk_retriever
requires: [fusion]
inputs:
subgraph: {source: "fusion_result", fields: "subgraph"}
output: {key: "chunk_result"}
- id: reasoner
processor: reasoner
requires: [chunks]
inputs:
query: {source: "$input", fields: "query"}
subgraph: {source: "chunk_result", fields: "subgraph"}
output: {key: "reasoner_result"}
config:
api_key: "sk-..."
model: "gpt-4o-mini"
Fusion Strategies
| Strategy | Behavior |
|---|---|
| `union` | Combine all entities and relations, deduplicate by ID |
| `rrf` | Reciprocal Rank Fusion — ranks entities across retrievers |
| `weighted` | Weight each retriever's entities by a configurable weight |
| `intersection` | Only keep entities found in multiple retrievers |
Parameters:
| Parameter | Default | Description |
|---|---|---|
| `strategy` | "union" | Fusion method |
| `top_k` | 0 | Max entities after fusion (0 = keep all) |
| `weights` | [] | Per-retriever weights (for "weighted") |
| `min_sources` | 2 | Min retrievers an entity must appear in (for "intersection") |
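For intuition, Reciprocal Rank Fusion scores each entity by summing 1/(k + rank) over the retrievers that returned it, where k is a smoothing constant (commonly 60). A minimal sketch, not RetriCo's implementation:

```python
def rrf_fuse(rankings, k=60, top_k=0):
    # rankings: one ranked list of entity IDs per retriever, best first.
    scores = {}
    for ranking in rankings:
        for rank, entity_id in enumerate(ranking, start=1):
            scores[entity_id] = scores.get(entity_id, 0.0) + 1.0 / (k + rank)
    # Entities appearing high in several rankings accumulate the largest scores.
    fused = sorted(scores, key=scores.get, reverse=True)
    return fused[:top_k] if top_k else fused
```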
RetriCoFusedSearch (Recommended for Complex Fusion)
Configure each retrieval strategy as a separate RetriCoSearch, then combine via RetriCoFusedSearch:
from retrico import RetriCoSearch, RetriCoFusedSearch, Neo4jConfig
store = Neo4jConfig(uri="bolt://localhost:7687", password="password")
# Strategy 1: Entity lookup
entity_builder = RetriCoSearch(name="entity")
entity_builder.store(store)
entity_builder.query_parser(labels=["person", "organization", "location"])
entity_builder.retriever(max_hops=3)
# Strategy 2: Shortest paths
path_builder = RetriCoSearch(name="path")
path_builder.store(store)
path_builder.query_parser(labels=["person", "organization", "location"])
path_builder.path_retriever(max_path_length=5, max_pairs=10)
# Strategy 3: Community search
community_builder = RetriCoSearch(name="community")
community_builder.store(store)
community_builder.community_retriever(top_k=3)
# Combine with RRF fusion
fused = RetriCoFusedSearch(
entity_builder, path_builder, community_builder,
strategy="rrf",
top_k=25,
)
fused.chunk_retriever()
fused.reasoner(api_key="sk-...", model="gpt-4o-mini")
executor = fused.build()
ctx = executor.run({"query": "What is the relationship between Einstein and quantum mechanics?"})
The parser is auto-inherited from the first sub-builder that has one. Store config is also inherited.
Simpler example — two strategies:
entity = RetriCoSearch(name="entity")
entity.store(store)
entity.query_parser(labels=["person", "location"])
entity.retriever(max_hops=2)
community = RetriCoSearch(name="community")
community.store(store)
community.community_retriever(top_k=5)
fused = RetriCoFusedSearch(
entity, community,
strategy="weighted",
weights=[2.0, 1.0],
top_k=15,
)
fused.chunk_retriever()
executor = fused.build()
Query Parser
The query parser extracts entities from the natural language query. It supports three methods:
Parameters:
| Parameter | Default | Description |
|---|---|---|
| `method` | "gliner" | Parsing method: "gliner", "llm", or "tool" |
| `labels` | (required for gliner/llm) | Entity types to extract |
| `model` | varies | GLiNER model or LLM model name |
| `api_key` | None | Required for "llm" and "tool" methods |
# GLiNER (local, fast)
builder.query_parser(method="gliner", labels=["person", "location"])
# LLM (API-based)
builder.query_parser(method="llm", labels=["person", "location"], api_key="sk-...")
# Tool-calling (LLM decides what entities to search for)
builder.query_parser(method="tool", api_key="sk-...", model="gpt-4o-mini")
Adding a Reasoner
Any retrieval strategy can be paired with an LLM reasoner that generates a natural language answer from the retrieved subgraph:
Builder API:
builder.reasoner(
api_key="sk-...",
model="gpt-4o-mini",
)
YAML:
- id: reasoner
processor: reasoner
requires: [chunks]
inputs:
query: {source: "$input", fields: "query"}
subgraph: {source: "chunk_result", fields: "subgraph"}
output: {key: "reasoner_result"}
config:
api_key: "sk-..."
model: "gpt-4o-mini"
Parameters:
| Parameter | Default | Description |
|---|---|---|
| `api_key` | (required) | OpenAI-compatible API key |
| `model` | "gpt-4o-mini" | LLM model name |
| `temperature` | 0.1 | Sampling temperature |
| `base_url` | None | Custom API endpoint |
Without a reasoner, you still get the full retrieved subgraph:
result = executor.run(query="Where was Einstein born?")
subgraph = result.get("chunk_result")["subgraph"]
print(subgraph.entities)
print(subgraph.relations)
print(subgraph.chunks)