[ PREPRINT / GRAPH INTELLIGENCE ]

Odin: Multi-Signal Graph Intelligence for Autonomous Discovery in Knowledge Graphs

Muyukani Kizito, Elizabeth Nyambere
Prescott Data

Abstract

We present Odin, the first production-deployed graph intelligence engine for autonomous discovery of meaningful patterns in knowledge graphs without prior specification. Unlike retrieval-based systems that answer predefined queries, Odin guides exploration through the COMPASS (Composite Oriented Multi-signal Path Assessment) score, a novel metric that combines:

(1) Structural importance via Personalized PageRank
(2) Semantic plausibility through Neural Probabilistic Logic Learning (NPLL) used as a discriminative filter
(3) Temporal relevance with configurable decay
(4) Community-aware guidance through GNN-identified bridge entities.

This multi-signal integration addresses the "echo chamber" problem, in which graph exploration becomes trapped in dense local communities. We demonstrate that beam search with multi-signal guidance expands only O(b · h) paths, where b is the beam width and h is the hop depth, while maintaining high recall. To our knowledge, Odin is the first autonomous discovery system deployed in regulated production environments (healthcare and insurance).

1. Introduction

Knowledge graphs (KGs) have emerged as a powerful paradigm for representing structured, interconnected organizational data. Unlike document stores or relational databases, KGs explicitly model entities, typed relationships, and multi-modal evidence, enabling sophisticated reasoning. However, extracting actionable insights from large-scale KGs remains challenging. Traditional approaches rely on query languages (SPARQL, Cypher) that require analysts to specify exact patterns: a fundamental limitation when the goal is discovery rather than retrieval.

Consider a healthcare KG containing millions of patient records. An analyst might query for "patients with sepsis treated with antibiotics," but this only finds patterns they already hypothesize. What about unknown correlations? Emerging cross-domain patterns spanning facility transfers and readmission rates? Such discoveries require autonomous exploration: the ability to identify meaningful patterns without prior specification.

The Core Challenge

Autonomous exploration faces three critical trade-offs: (1) Coverage vs. Efficiency: Exhaustive multi-hop traversal has complexity O(d^h) where d is average degree and h is hop depth; (2) Signal vs. Noise: Not all graph edges are equally informative; data extraction errors create semantically implausible paths; (3) Explainability vs. Performance: Black-box methods struggle in regulated domains requiring audit trails.

Our Approach

We introduce Odin, the first graph intelligence engine designed as a compass for AI agents rather than a retrieval system. Odin does not answer questions; it scores possible exploration directions using a principled multi-signal framework. This architectural separation (graph intelligence from natural language reasoning) provides modularity, explainability, and adaptability across different agent objectives.

2. Problem Formulation

We define a knowledge graph as G = (E, R, T) where E is a set of entities, R is a set of typed relationships, and T ⊆ E × R × E is a set of triples (e_s, r, e_o).
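As a minimal sketch, the definition G = (E, R, T) can be represented in Python as a set of triples with an adjacency index. The class and method names (`Triple`, `KnowledgeGraph`, `neighbors`) and the sample entities are hypothetical, chosen for illustration only. The sketch also assumes E and R can be derived from T, so entities that appear in no triple are not represented.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str   # e_s
    relation: str  # r
    obj: str       # e_o


class KnowledgeGraph:
    """G = (E, R, T), with E and R derived from the triple set T (an assumption)."""

    def __init__(self, triples):
        self.triples = set(triples)                              # T
        subjects = {t.subject for t in self.triples}
        objects = {t.obj for t in self.triples}
        self.entities = subjects | objects                       # E
        self.relations = {t.relation for t in self.triples}      # R
        # Index outgoing triples by subject; sorting keeps iteration deterministic.
        self._out = defaultdict(list)
        for t in sorted(self.triples):
            self._out[t.subject].append(t)

    def neighbors(self, entity):
        """Return the outgoing triples (entity, r, e') in sorted order."""
        return list(self._out.get(entity, []))


# Illustrative data only; not taken from the production graphs.
kg = KnowledgeGraph([
    Triple("patient_1", "diagnosed_with", "sepsis"),
    Triple("patient_1", "treated_at", "facility_A"),
    Triple("facility_A", "transferred_to", "facility_B"),
])
```

Here `kg.neighbors("patient_1")` returns the two triples whose subject is `patient_1`, which is the lookup a hop-by-hop traversal would need.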

Definition 1 (Autonomous Discovery): Given a KG G and seed entities S ⊂ E, the autonomous discovery task is to identify a set of paths P* starting from S that maximizes a discovery utility function U measuring novelty, significance, and evidence quality, subject to a computational budget B.

The key distinction: we do not specify what to find, only where to start and how to evaluate discovered patterns. This requires a scoring mechanism that prioritizes promising paths during exploration.

3. The Odin Framework

3.1 Architecture Overview

Odin operates as a two-phase system: (1) an offline extraction pipeline that constructs the KG and computes structural metadata via GNNs, and (2) an online intelligence library that performs real-time exploration. This separation is critical for understanding system boundaries and deployment requirements.

3.2 The COMPASS Scoring Function

We introduce the COMPASS (Composite Oriented Multi-signal Path Assessment) score—a novel compositional function that unifies structural, semantic, temporal, and community-aware signals. For a path p, we define:

COMPASS(p) = S_edge(p) · S_struct(p) · S_bridge(p) · S_affinity(p) · S_prior(p) · S_temp(p)

The multiplicative composition means that all signals must agree for a path to receive a high overall score. It also gives each signal a strict veto: a semantically implausible path is rejected even if it is structurally important.
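The veto behavior can be illustrated with a short Python sketch. It assumes each of the six signals is already normalized to [0, 1], which the text does not state explicitly. The function name `compass` and the numeric values are hypothetical.

```python
import math

# Signal names follow the factors of the COMPASS product.
SIGNALS = ("edge", "struct", "bridge", "affinity", "prior", "temp")


def compass(signals):
    """Multiply the six per-path signals, each assumed to lie in [0, 1]."""
    for name in SIGNALS:
        value = signals[name]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return math.prod(signals[name] for name in SIGNALS)


# Illustrative values only.
plausible = dict(edge=0.9, struct=0.8, bridge=0.7, affinity=0.9, prior=0.8, temp=0.9)
# Same path, but semantically implausible (edge = 0) and maximally structural.
vetoed = dict(plausible, edge=0.0, struct=1.0)

plausible_score = compass(plausible)
vetoed_score = compass(vetoed)  # 0.0: the zero edge signal vetoes the path
```

Under these assumptions, a single zero factor forces the product to zero regardless of the other signals. An additive combination would not have this property, since a high structural score could offset a low plausibility score.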

3.3 Neural Probabilistic Logic Learning (NPLL)

Unlike traditional KG completion methods, which generate missing edges, we use NPLL to filter existing edges. For each observed triple, NPLL produces the plausibility score S_edge, so evaluation stays grounded in evidence already present in the graph. NPLL combines logical rules extracted via rule mining with neural scoring.
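The difference between filtering and completion can be sketched as follows. This is not NPLL itself: `toy_plausibility` is a hypothetical stand-in that checks a relation's expected entity types, where a trained model would combine mined rules with neural scoring. The type table, relation signatures, and triples are invented for illustration.

```python
# Hypothetical type information and relation signatures (stand-ins for mined rules).
ENTITY_TYPE = {
    "patient_1": "patient",
    "amoxicillin": "drug",
    "facility_A": "facility",
}
RELATION_SIGNATURE = {
    "prescribed": ("patient", "drug"),
    "treated_at": ("patient", "facility"),
}


def toy_plausibility(triple):
    """Return 1.0 if the triple matches its relation signature, 0.0 if not."""
    subject, relation, obj = triple
    expected = RELATION_SIGNATURE.get(relation)
    if expected is None:
        return 0.5  # unknown relation: neutral score (an arbitrary choice)
    actual = (ENTITY_TYPE.get(subject), ENTITY_TYPE.get(obj))
    return 1.0 if actual == expected else 0.0


def score_observed(triples, scorer):
    """Score only triples already in the graph; no new edges are proposed."""
    return {t: scorer(t) for t in triples}


observed = [
    ("patient_1", "prescribed", "amoxicillin"),
    ("patient_1", "prescribed", "facility_A"),  # simulated extraction error
]
edge_scores = score_observed(observed, toy_plausibility)
```

The scorer is applied only to triples in `observed`, so the output can down-weight an extraction error but never introduces an edge that was not extracted from source data.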

3.4 Beam Search with COMPASS Guidance

Exhaustive path enumeration from every entity has complexity O(|E| · d^h), which is intractable for h ≥ 3 on large graphs. We employ deterministic beam search, which maintains only the top-b candidates at each hop, ranked by COMPASS score. Because the search is deterministic, every run is reproducible, which is required for auditability in regulated production environments.
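To make the gap concrete, the sketch below counts paths for a single seed under a simplifying assumption of uniform out-degree d. The values d = 20, h = 3, b = 10 are illustrative and are not taken from the production graphs. The function names are hypothetical.

```python
def exhaustive_paths(d, h):
    """Paths of length exactly h from one seed, assuming uniform out-degree d."""
    return d ** h


def beam_candidates(d, h, b, seeds=1):
    """Candidates scored by beam search when at most b paths survive each hop."""
    total = 0
    frontier = seeds
    for _ in range(h):
        candidates = frontier * d   # every kept path is extended by d edges
        total += candidates
        frontier = min(b, candidates)
    return total


exhaustive = exhaustive_paths(d=20, h=3)   # 20^3 = 8000
beam = beam_candidates(d=20, h=3, b=10)    # 20 + 200 + 200 = 420
```

In this toy setting the beam keeps at most b paths per hop, so the number of expanded paths is bounded by b · h, while the number of scored candidates is bounded by b · d · h. Both grow linearly in h, in contrast to the exponential d^h of exhaustive enumeration.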

Algorithm 1: Beam Search with COMPASS Scoring

Require: Seeds S, hop limit h, beam width b, result size k
Ensure: Top-k scored paths

B₀ ← { (e, ∅, 0) : e ∈ S }                  // Initialize from seeds
for i = 1 to h do
    C ← ∅                                    // Candidates
    for (e, p, s) ∈ Bᵢ₋₁ do
        for (e, r, e') ∈ Neighbors(e) do
            p' ← p ∪ { (e, r, e') }
            s' ← COMPASS(p')
            C ← C ∪ { (e', p', s') }
        end for
    end for
    Bᵢ ← Top-b(C)                            // Keep best b candidates
end for
return Top-k( ∪ Bᵢ )
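A runnable Python sketch of Algorithm 1 follows. It is an illustration under stated assumptions rather than the production implementation: `toy_score` multiplies hypothetical per-edge scores in place of the COMPASS function, ties are broken by path ordering so results are deterministic, and the empty seed paths in B₀ are left out of the final ranking. The graph, the edge scores, and all function names are invented for the example.

```python
import math


def beam_search(neighbors, score, seeds, hop_limit, beam_width, top_k):
    """Return the top_k (score, path) pairs among all candidates kept in any beam."""
    beam = [(seed, ()) for seed in sorted(seeds)]   # (current entity, path so far)
    kept = []
    for _ in range(hop_limit):
        candidates = []
        for entity, path in beam:
            for relation, target in neighbors(entity):
                new_path = path + ((entity, relation, target),)
                candidates.append((score(new_path), new_path))
        # Highest score first; the path itself breaks ties deterministically.
        candidates.sort(key=lambda c: (-c[0], c[1]))
        best = candidates[:beam_width]
        kept.extend(best)
        # The next frontier starts at the last entity of each kept path.
        beam = [(path[-1][2], path) for _, path in best]
    kept.sort(key=lambda c: (-c[0], c[1]))
    return kept[:top_k]


# Toy graph and per-edge scores (illustrative values only).
GRAPH = {
    "A": [("r1", "B"), ("r2", "C")],
    "B": [("r3", "D")],
    "C": [("r4", "D")],
}
EDGE_SCORE = {
    ("A", "r1", "B"): 0.9,
    ("A", "r2", "C"): 0.5,
    ("B", "r3", "D"): 0.8,
    ("C", "r4", "D"): 1.0,
}


def toy_score(path):
    """Stand-in for COMPASS: the product of per-edge scores along the path."""
    return math.prod(EDGE_SCORE[edge] for edge in path)


results = beam_search(
    neighbors=lambda entity: GRAPH.get(entity, []),
    score=toy_score,
    seeds={"A"},
    hop_limit=2,
    beam_width=1,
    top_k=2,
)
```

With a beam width of 1, the lower-scoring edge A → C is pruned at the first hop, so the path through C is never extended. Like Algorithm 1, this sketch does not check for cycles; a deployment would likely need to decide whether revisiting an entity is allowed.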

4. Experimental Validation

We evaluate on two production KGs: a Healthcare KG (2.3M entities, 8.7M triples) and an Insurance KG (1.8M entities, 6.2M triples). Quality ratings (1-5 scale) were provided by domain experts evaluating the real-world significance of discovered patterns.

Method               Coverage@50  Paths Explored  Time (s)  Quality (1-5)
Exhaustive           95%          125,000         47.0      4.3
Random Walk          68%          3,000           8.0       3.5
PPR-only             87%          1,900           3.2       3.1
GNN Embeddings       52%          2,100           4.1       2.8
Odin (Full COMPASS)  90%          1,900           3.8       4.2

Odin achieves coverage comparable to exhaustive search (90% vs. 95%) while exploring about 66x fewer paths (1,900 vs. 125,000). Removing semantic filtering (PPR-only) leaves the number of explored paths unchanged but lowers the expert quality rating from 4.2 to 3.1, because the method returns structurally important but medically impossible correlations.
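The ratios behind this comparison can be recomputed directly from the table. The short Python sketch below uses only the reported figures; the variable names are illustrative.

```python
# (coverage %, paths explored, time in seconds, quality rating) from the results table.
RESULTS = {
    "Exhaustive": (95, 125_000, 47.0, 4.3),
    "PPR-only": (87, 1_900, 3.2, 3.1),
    "Odin (Full COMPASS)": (90, 1_900, 3.8, 4.2),
}

exhaustive = RESULTS["Exhaustive"]
ppr_only = RESULTS["PPR-only"]
odin = RESULTS["Odin (Full COMPASS)"]

path_reduction = exhaustive[1] / odin[1]   # how many times fewer paths Odin explores
speedup = exhaustive[2] / odin[2]          # wall-clock ratio, exhaustive vs. Odin
coverage_gap = exhaustive[0] - odin[0]     # percentage points of coverage given up
quality_gain = odin[3] - ppr_only[3]       # rating points gained over PPR-only
```

These are ratios of the reported aggregates only; they say nothing about variance across seeds or queries, which the table does not report.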

Case Study: Insurance Fraud Ring

In deployment at a mid-sized insurer, Odin successfully flagged a sophisticated fraud ring. The ring consisted of 5 policyholders carrying completely distinct identity profiles (different addresses, distinct policies). Odin bridged the discrete communities by identifying a single service provider linking the disparate hubs. Manual investigation confirmed the scheme, leading to $437,000 in recovered funds. This topology had zero overlap with all 127 known hard-coded fraud alerts.

5. Conclusion

Odin enables autonomous exploration of knowledge graphs within isolated enterprise environments. By combining the structural prioritization of Personalized PageRank with the semantic filtering of NPLL and GNN-backed bridge scoring, it mitigates the echo-chamber problem, in which traversal stays trapped in dense local communities.

In regulated sectors that cannot tolerate LLM hallucinations, Odin's path-level transparency provides the audit trail needed to deploy autonomous analytic agents in operational workflows.
