Theoretical Foundations & Lineage
The Semantic Data Charter is not an invention ex nihilo. It is the synthesis of two decades of research into semantic rigor, formal ontology, and deterministic data modeling.
SDC v4 draws from two complementary theoretical frameworks: a mathematical proof that semantic ambiguity can be structurally eliminated, and an architectural paradigm that separates software concerns from domain knowledge. Together, they define the design space that SDC occupies.
1. The Mathematical Substrate: Deterministic Semantics
Traditional Semantic Web technologies (RDF/OWL) operate under the Open World Assumption, where the absence of a statement does not imply its negation. This is powerful for knowledge discovery across federated graphs, but it introduces inherent interpretive ambiguity at the data-exchange boundary—the same triple can carry different meanings depending on context.
The Model Executable Business System (MEBS) theory, developed by Robert Vane, demonstrates mathematically that semantic ambiguity can be eliminated through disjoint atomic taxonomies organized as partition lattices with unique semantic paths. In information-theoretic terms, this achieves zero Shannon entropy (H = 0): every term has exactly one interpretation within its taxonomy, leaving no probabilistic uncertainty about meaning.
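The zero-entropy claim can be illustrated numerically. The sketch below is plain Python with names of our own choosing, not part of MEBS or SDC: it computes the Shannon entropy of a term's interpretation distribution, showing that a term with competing readings carries positive entropy while a term with exactly one admissible interpretation carries H = 0.

```python
import math

def interpretation_entropy(probabilities):
    """Shannon entropy (in bits) of a term's interpretation distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# A term with two equally likely readings carries 1 bit of ambiguity.
ambiguous = interpretation_entropy([0.5, 0.5])   # 1.0 bit

# A term with exactly one admissible interpretation carries none: H = 0.
deterministic = interpretation_entropy([1.0])    # 0.0 bits
```

A unique semantic path collapses the distribution to a single interpretation with probability 1, which is precisely the H = 0 condition.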
SDC's design is aligned with this zero-entropy principle. By enforcing strict XML Schema (XSD 1.1) constraints, CUID2 identity, and mandatory semantic bindings at every data element, SDC ensures that each data packet carries a deterministic, self-describing interpretation. SDC leverages RDF and OWL for semantic linking and interoperability—the Open World Assumption remains valuable in the graph layer—while enforcing closed-world constraints at the data-packet level where unambiguous interpretation is required.
Key Alignment Points
- Unique semantic paths — Every SDC component is identified by a globally unique CUID2 and bound to explicit ontology predicates, preventing synonymy and homonymy at the structural level.
- Closed-world data packets — Within an SDC schema, every required element must be present and validated against XSD constraints. The data packet is self-contained; its meaning does not depend on external inference.
- Deterministic validation — XSD 1.1 assertions combined with SHACL shape constraints provide machine-verifiable proof that data conforms to its schema, enabling fail-closed behavior in high-assurance environments.
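The closed-world, fail-closed behavior described above can be sketched in miniature. This is not SDC's validator (which uses XSD 1.1 assertions and SHACL shapes); it is an illustrative Python check with hypothetical element names and constraints, showing the rule that missing, invalid, or undeclared elements all cause rejection:

```python
def validate_packet(packet: dict, schema: dict) -> list[str]:
    """Fail-closed check: every required element must be present and
    satisfy its constraint, and nothing outside the schema is allowed."""
    errors = []
    for name, constraint in schema.items():
        if name not in packet:
            errors.append(f"missing required element: {name}")
        elif not constraint(packet[name]):
            errors.append(f"constraint violated: {name}")
    for name in packet:
        if name not in schema:  # closed world: undeclared elements are invalid
            errors.append(f"undeclared element: {name}")
    return errors

# Hypothetical schema for a blood-pressure packet.
schema = {
    "systolic_mmHg": lambda v: isinstance(v, int) and 40 <= v <= 300,
    "units": lambda v: v == "mm[Hg]",
}

assert validate_packet({"systolic_mmHg": 120, "units": "mm[Hg]"}, schema) == []
assert validate_packet({"systolic_mmHg": 999}, schema)  # non-empty: rejected
```

The key property is that validity is decided entirely from the packet and its schema; no external inference can rescue or reinterpret a nonconforming packet.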
Citation:
Vane, R. (2026). Model Executable Business System (MEBS): Truth Systems for Mars Base Alpha (V2.00). Zenodo.
https://doi.org/10.5281/zenodo.18607750
2. The Architectural Paradigm: Two-Level Modeling
Conventional software systems encode domain knowledge directly into application code and database schemas. When business requirements change, the software must change with them—creating a tight coupling between domain evolution and system maintenance that becomes increasingly expensive over time.
The two-level modeling paradigm, pioneered by Thomas Beale and the openEHR community, addresses this by strictly separating two concerns:
- Reference Model (Level 1) — A stable, small set of software-level types that define how data is structured, identified, and transported. This layer changes infrequently and is implemented once in the platform.
- Archetypes / Domain Models (Level 2) — Constraint-based definitions of real-world concepts (clinical observations, business transactions, sensor readings) expressed as restrictions over the Reference Model. Domain experts can create and modify these without changing the underlying software.
SDC adopts this paradigm directly. The SDC4 Reference Model provides 15+ data types (XdString, XdBoolean, XdQuantity, XdTemporal, etc.), structural containers (Cluster, XdAdapter), and infrastructure elements (spatio-temporal metadata, access control tags, provenance). Domain-specific schemas are composed as constraint definitions over these types, enabling the data to describe itself independently of any particular application.
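The two levels can be sketched in a few lines. The type name below is loosely modeled on SDC's published XdQuantity, but the code is an illustrative Python analogy, not the SDC Reference Model itself:

```python
from dataclasses import dataclass

# Level 1: a stable Reference Model type, implemented once in the platform.
@dataclass
class XdQuantity:
    label: str
    value: float
    units: str

# Level 2: a domain "archetype" expressed purely as constraints over the
# reference type. Domain experts edit this; the software above never changes.
SYSTOLIC_BP = {
    "label": "Systolic Blood Pressure",
    "units": {"mm[Hg]"},
    "range": (40.0, 300.0),
}

def conforms(q: XdQuantity, archetype: dict) -> bool:
    lo, hi = archetype["range"]
    return (q.label == archetype["label"]
            and q.units in archetype["units"]
            and lo <= q.value <= hi)

reading = XdQuantity("Systolic Blood Pressure", 120.0, "mm[Hg]")
assert conforms(reading, SYSTOLIC_BP)
```

Evolving the domain means editing the constraint definition, not the reference type: the coupling between domain churn and software maintenance is broken.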
Sovereign Content
SDC extends Beale's architectural insight with mandatory sovereign identity: every data element carries its own CUID2 identifier, its own semantic bindings, and its own spatio-temporal context. Data exists independently of the application that created it. When a system is decommissioned, the data remains fully interpretable because the meaning travels with the content, not with the software.
For a deeper exploration of how compiled constraint models enable concrete algorithmic capabilities (satisfiability analysis, lattice subsumption, schema differencing), see Constraint-First Computing.
Citation:
Beale, T. (2002). Archetypes: Constraint-based Domain Models for Future-proof Information Systems. Presented at OOPSLA 2002 Workshop on Behavioral Semantics. Centre for Health Informatics and Multi-professional Education (CHIME), University College London.
3. The Identity Model: Why CUIDs, Not IRIs
A common question from practitioners familiar with the Semantic Web stack: why does SDC use CUIDs (Collision-resistant Unique Identifiers) rather than IRIs (Internationalized Resource Identifiers) as its primary identity mechanism?
IRIs are the correct choice for linked data on the open web. They enable global namespace coordination, content negotiation, and federated graph traversal. SDC uses IRIs extensively in its RDF and OWL layers for exactly these purposes. But the identity anchor for a data element, the identifier that determines what a "systolic blood pressure reading" is and guarantees that it means the same thing everywhere it appears, must satisfy stricter requirements than the open web can provide.
Air-Gap and Zero-Egress Deployment
An IRI can technically exist on an air-gapped system. You can deploy an OWL ontology with http://snomed.info/id/12345 identifiers on a disconnected server and the reasoner will still function. But in that environment, the IRI has lost the properties that justified choosing it over an arbitrary string: content negotiation, authoritative dereferencing, and DNS-backed namespace authority. Two disconnected organizations can mint the same IRI with different meanings, and there is no arbiter. The identifier works, but its guarantees have degraded to "convention and good faith."
A CUID does not pretend to be dereferenceable. Its collision resistance is mathematical (monotonic timestamp plus cryptographic-quality randomness), not institutional (domain ownership). That guarantee does not degrade in disconnected environments. Embedded within an SDC W3C restriction lattice, a CUID is an immutable, computationally verifiable identity anchor that lives inside the payload. It requires no DNS resolution, no internet connectivity, and no external service availability. It is zero-egress native by design, not by accident.
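A simplified sketch shows the idea. This is not the official CUID2 algorithm (whose hashing, counter, and host-fingerprint steps are omitted here); it only demonstrates that an identifier built from a timestamp plus cryptographic randomness needs no DNS, no network, and no external authority:

```python
import secrets
import string
import time

ALPHABET = string.digits + string.ascii_lowercase  # base36

def to_base36(n: int) -> str:
    s = ""
    while n:
        n, r = divmod(n, 36)
        s = ALPHABET[r] + s
    return s or "0"

def make_id() -> str:
    """Simplified CUID-style identifier: a leading letter, a millisecond
    timestamp, and cryptographic randomness. Fully offline-capable."""
    first = secrets.choice(string.ascii_lowercase)   # keeps ids name-safe
    ts = to_base36(time.time_ns() // 1_000_000)      # time component
    rand = to_base36(secrets.randbits(64))           # collision resistance
    return first + ts + rand

print(make_id())  # e.g. a compact lowercase string, unique per call
```

Two disconnected systems running this independently will not collide in practice, because the guarantee rests on probability and time ordering rather than on a shared naming authority.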
Link Rot vs. Deterministic Permanence
IRIs break. Domains expire. Servers are decommissioned. Entire organizations disappear. If your enterprise data model relies on an IRI resolving to define what a concept means, your data model is probabilistically fragile; it works today, but its correctness tomorrow depends on infrastructure outside your control. A CUID is a deterministic anchor. It guarantees that the concept inside the payload means the exact same thing in 2026 as it will in 2046, regardless of web infrastructure. The meaning is structural, not referential.
Open World vs. Closed World at the Data Boundary
The W3C Semantic Web stack operates under the Open World Assumption (OWA): anyone can say anything about any IRI. This is powerful for knowledge discovery and federated reasoning across public linked data. But it is a liability at the data-exchange boundary in enterprise and defense systems, where you need to know that a data packet means exactly one thing and cannot be reinterpreted by an external assertion.
SDC uses the Closed World Assumption (CWA) via XML Schema 1.1 restriction lattices. The CUID rigidly binds the structural constraint to the semantic concept: every permissible value, every unit, every cardinality rule is compiled into the schema. An AI agent operating within this lattice is structurally constrained to the W3C-validated admissible state space: invalid data is rejected at the schema boundary, not at the inference layer. The agent may still reason incorrectly, but it cannot produce or consume data that violates the compiled constraints. The identity is not a pointer to meaning; it is the meaning, structurally enforced.
Complementary, Not Competing
SDC does not reject IRIs. Every SDC component can carry ontology predicates that link to standard IRIs (SNOMED CT, LOINC, Wikidata, domain-specific vocabularies). The RDF triples extracted from SDC schemas use standard IRIs for interoperability with the broader Semantic Web. The distinction is architectural: IRIs serve the linking layer (open, federated, inference-friendly), while CUIDs serve the identity layer (deterministic, self-contained, air-gap safe). Each does what it is best at.
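The layering can be made concrete with a small sketch. The class and field names below are our own invention, and the bound IRIs are merely illustrative; the point is that identity lives in the payload-internal CUID while IRIs remain outward-facing links:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SemanticComponent:
    """Illustrative only: identity layer and linking layer kept separate."""
    cuid: str                 # identity layer: self-contained, air-gap safe
    label: str
    iri_bindings: tuple = ()  # linking layer: standard IRIs, used when online

bp = SemanticComponent(
    cuid="tz4a98xxat96iws9zmbrgj3a",        # hypothetical CUID2 value
    label="Systolic Blood Pressure",
    iri_bindings=(
        "http://snomed.info/id/271649006",  # SNOMED CT: systolic blood pressure
        "http://loinc.org/8480-6",          # LOINC: systolic blood pressure
    ),
)

# Identity depends only on the payload-internal CUID; the IRI bindings can
# be dereferenced for federation but are never required to interpret the data.
assert bp.cuid == "tz4a98xxat96iws9zmbrgj3a"
```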
4. The SDC Synthesis
Vane's MEBS theory establishes the mathematical conditions for eliminating semantic ambiguity. Beale's two-level modeling provides the software architecture for separating stable infrastructure from evolving domain knowledge. The CUID identity model ensures that deterministic meaning survives air gaps and link rot and can be enforced under closed-world validation. SDC brings all three together as an executable specification:
From MEBS
- Deterministic interpretation at the data level
- Unique identity for every semantic element
- Fail-closed validation by design
- Elimination of ambiguity at the exchange boundary
From Two-Level Modeling
- Stable Reference Model immune to domain churn
- Domain schemas as constraint definitions
- Separation of software and knowledge concerns
- Data permanence beyond application lifecycles
From CUID Identity
- Zero-egress, air-gap safe deployment
- Immunity to link rot and DNS failure
- Closed-world enforcement at the data boundary
- IRIs for linking, CUIDs for identity
The result is a data modeling standard where schemas are self-describing, semantically bound, structurally validated, and independent of any single application—designed for environments where data integrity and long-term interpretability are non-negotiable.
Explore Further
See how these theoretical foundations translate into algorithmic capabilities, concrete standards compliance, technical specifications, and the design philosophy behind SDC.
About Axius SDC
The Semantic Data Charter is developed by Axius SDC, Inc., an international team with more than 40 years of combined experience in semantic data and health informatics across the United States, Canada, and Brazil.
Learn more about our team