Choose your on-ramp
Two paths to validated SDC4 instances. Same end state. Different fit by team and starting posture.
CLI on-ramp
SDC Agents SMB
Run pip install sdc-agents-smb, then pick a CSV, a SQL table, or a Notion database. The audit can start on day one of the pilot.
SDC Agents SMB is the agent infrastructure that turns a datasource into validated SDC4 instances composed from both component layers: domain (NIEM, FHIR, NIH CDE, Default) and ProvGov (provenance, attestation, party and role, decision tables, retention, validation hashes, access control). Eight agents, each with a single job:
- Discover what the datasource has.
- Read the schema and detect anomalies.
- Align fields to domain and ProvGov components from SDCStudio.
- Produce the instance bindings.
- Confirm structural conformance.
- Package for downstream consumption.
- Capture the domain context that emerges from the data.
- Review and approve the package.
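The "one agent, one job" shape above can be pictured as a chain of single-purpose stages. The sketch below is only an illustration of that idea: the stage names, `PipelineState`, and `run_pipeline` are hypothetical and are not taken from the SDC Agents SMB codebase.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PipelineState:
    """Hypothetical carrier object passed from one stage to the next."""
    datasource: str
    log: List[str] = field(default_factory=list)


Stage = Callable[[PipelineState], PipelineState]


def make_stage(job: str) -> Stage:
    """Build a placeholder stage that only records that its job ran."""
    def run(state: PipelineState) -> PipelineState:
        state.log.append(job)
        return state
    return run


# Illustrative names for the eight jobs listed above (assumed, not official).
JOBS = (
    "discover", "introspect", "map", "generate",
    "validate", "package", "capture_context", "review",
)
STAGES = [make_stage(job) for job in JOBS]


def run_pipeline(datasource: str) -> PipelineState:
    """Run every stage in order, each one doing a single job."""
    state = PipelineState(datasource)
    for stage in STAGES:
        state = stage(state)
    return state


result = run_pipeline("orders.csv")
print(result.log)
```

Keeping each stage behind the same small signature is what lets one stage be swapped or re-run without touching the others.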
Local LLMs, no API keys leaving the org
The agents run against local Ollama models. Right-size the model to the infrastructure you have available: the agents work across a range of model sizes, and the team picks a model that matches GPU memory and throughput targets without changing the pipeline. No API keys leaving the organization. No data flowing to a vendor inference endpoint.
This matters for sovereign deployments, healthcare data classifications, and any government context where third-party LLM calls are not on the table.
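A team that wants to enforce the "nothing leaves the organization" rule in code could add a guard like the one below before any inference call. This is a minimal sketch, not part of SDC Agents SMB: `is_local_endpoint` is a hypothetical helper, and the only outside fact it relies on is that Ollama listens on http://localhost:11434 by default.

```python
import ipaddress
from urllib.parse import urlparse


def is_local_endpoint(url: str) -> bool:
    """Return True only for loopback or private-network inference endpoints."""
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name cannot be classified without resolving it, so reject it.
        return False
    return addr.is_loopback or addr.is_private


# Ollama's default local address passes; a public hostname does not.
print(is_local_endpoint("http://localhost:11434"))
print(is_local_endpoint("https://api.example.com/v1"))
```

Rejecting unresolved hostnames is a deliberately conservative choice; a deployment with internal DNS names would need an explicit allowlist instead.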
Datasources supported on day one
SQL (Postgres, MySQL, SQLite), CSV, JSON, MongoDB, Notion, Google Sheets, Airtable. The ToolsetHub plugin system covers SMB-native sources that enterprise tools usually skip.
Day-of-pilot user journey
- pip install sdc-agents-smb
- Edit sdc-agents.yaml with your SDCStudio API key and datasource config
- sdc-agents introspect my_database
- Review the introspection output, anomaly flags, and mapping suggestions
- sdc-agents assembly review
- Approve the package. You now have validated SDC4 instances ready for sdcgovernance to point at.
Repository
github.com/SemanticDataCharter/SDC_AgentsSMB. Apache 2.0. MCP server mode for agent workflows. Audit dashboard, compliance reporting, and schema drift detection are part of the package.
Web on-ramp
SDCforSMB
Same pilot. No terminal. For teams where "pip install" is not the right starting move.
SDCforSMB is a Django web application that wraps SDC Agents SMB in a browser-based onboarding flow. You log in, connect a datasource, and watch the agents do the work. No YAML editing, no CLI, no Ollama configuration to manage by hand.
The wizard sequence
- Connect a datasource. Upload a CSV, point at a SQL database, link a Google Sheet.
- Review introspection. The dashboard shows what the agent found: schema, anomalies, suggested component mappings.
- Approve assembly. The assembly review screen shows the proposed SDC4 instances and bindings.
- Download or deploy. The output is a deployable package ready for the audit.
Deployment profiles
FOSS
Jena/Fuseki, PostgreSQL, Django, Redis. Single-server, SMB-friendly footprint.
Enterprise
Graphwise GraphDB, SirixDB, Keycloak. For teams with existing identity infrastructure and temporal-versioning needs at scale.
State of the project
Phase 1 (Core Console) is in production use. Onboarding wizard, datasource CRUD, introspection viewer, assembly review, and dashboard are shipped. Phase 2 (deployment via Docker and Podman) and Phase 3 (operations: scheduler UI, drift viewer, compliance reports) are in active development.
Repository
github.com/SemanticDataCharter/SDCforSMB. Run locally with pip install sdcforsmb and python manage.py runserver 9000.
What you get at the end of either flow
Both on-ramps produce the same deliverable set. Apache 2.0 on the generated code, downloadable, deployable on your own hardware. No SaaS dependency between the audit and your data.
The data model
Your composed SDC4 model: the domain components from NIEM, FHIR, NIH CDE, or Default; the ProvGov layer; your bindings and constraints. Downloadable as the source of truth.
A template SDC4 XML instance
A working SDC4 instance template the team copies and populates with your data, either as the output of an ETL pipeline or as a batch-import format. The populated instances import into the generated app, where they are validated and added to the knowledge graph.
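As a rough picture of the "copy and populate" step in an ETL job, the sketch below fills one XML template per CSV row using only the Python standard library. The element names (`record`, `name`, `amount`) and the `populate` helper are placeholders invented for illustration; a real SDC4 template has its own structure, and validation still happens when the instances are imported into the generated app.

```python
import copy
import csv
import io
import xml.etree.ElementTree as ET

# Placeholder template: NOT the real SDC4 instance layout.
TEMPLATE = "<record><name/><amount/></record>"


def populate(template_xml, rows):
    """Return one populated XML string per input row."""
    template = ET.fromstring(template_xml)
    instances = []
    for row in rows:
        instance = copy.deepcopy(template)  # leave the template untouched
        for element in instance.iter():
            if element.tag in row:
                element.text = row[element.tag]
        instances.append(ET.tostring(instance, encoding="unicode"))
    return instances


# Stand-in for an ETL extract; a real job would read from the datasource.
source = io.StringIO("name,amount\nAcme,120\nGlobex,75\n")
instances = populate(TEMPLATE, csv.DictReader(source))
for xml_text in instances:
    print(xml_text)
```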
The generated app stack
App scaffolding pre-wired for the stack profile you pick (FOSS, Enterprise, or both, detailed below). Stand it up locally and start recording your data into the substrate immediately.
Apache 2.0 license
All generated code is Apache 2.0. No commercial commitment to Axius SDC for what comes out of the on-ramp. The artifact set is yours to deploy, fork, audit, or hand to another team.
Pick your stack: FOSS, Enterprise, or both
SDCStudio generates the app scaffolding for either of two stack profiles. The user picks one, the other, or both at generation time.
FOSS profile
Apache Jena and Fuseki for the triplestore and SPARQL endpoint. PostgreSQL for instance storage. Fully open source, no commercial dependency.
Right fit when: sovereign deployment without proprietary components, full source-level audit, or smaller-footprint pilots.
Enterprise profile
Graphwise GraphDB for high-scale graph queries (Graphwise was formed in 2024 by the merger of Ontotext and Semantic Web Company, the maker of PoolParty; Axius SDC is a Graphwise tech partner). SirixDB for time-travel storage of instances. Keycloak for identity and access.
Right fit when: scale targets, temporal-query requirements, or existing identity infrastructure to integrate with.
Why this matters across the portfolio
Every generated app after the first imports into the original stack. The full app portfolio runs in one shared substrate runtime instead of replicating infrastructure per pilot. Infrastructure is procured once and every later app builds on it, which is the procurement curve buyers actually want for governance work.