Prepare DFS Data for AI Agent Workflows
Use this workflow when a FactVerse AI Agent task depends on operational data from source systems. The goal is to turn raw source values, work records, documents, and fused datasets into reviewed evidence that an Agent workflow can cite.
The working path is:
Define Agent task -> choose source evidence -> connect with DFS Lite
-> map identity and fields -> sync and check quality
-> create or update DFS Pro dataset -> fuse and review if needed
-> publish Agent handoff record -> run read-only Agent validation
Outcome
At the end of this workflow, the implementation team should have:
- an Agent task with a clear answer contract;
- source connectors or imported datasets prepared in DFS;
- stable asset, equipment, point, scene, or work-record identity mapping;
- quality and freshness notes visible to reviewers;
- a DFS Pro dataset or fusion output approved for the Agent task;
- a handoff record that lists dataset version, source timestamps, steward, allowed output type, and MCP scope plan.
Step 1: Define the Agent task
Start from the Agent workflow, then work backward to data.
| Agent task | Required evidence | Typical DFS output |
|---|---|---|
| Facility status summary | Asset identity, latest status, meter values, alarms, inspection records, open work orders. | Governed facility operations dataset with timestamp and quality notes. |
| Predictive maintenance evidence | Equipment identity, signal history, operating mode, maintenance history, anomaly labels, engineer notes. | Time-series and maintenance dataset, or fusion output that joins signals and work records. |
| Work-order draft context | Asset, alarm or finding, related history, SOP reference, reviewer. | Reviewed dataset that links inspection, alarm, and work-order records. |
| Energy or facility change explanation | Meter values, equipment state, operating context, calculation inputs, source freshness. | Dataset with meter readings, asset mapping, quality state, and calculation fields. |
| Physical AI scenario input | Scene ID, model asset version, process constraint, operating signal, validation note. | Scenario input dataset tied to scene and asset version. |
Write an answer contract before connecting data:
- user role and workflow owner;
- tenant, site, asset group, equipment group, or scene boundary;
- time window and freshness expectation;
- required source systems;
- required fields and accepted units;
- output type such as readiness report, status summary, evidence table, draft action, or scenario input;
- reviewer and approval boundary.
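The answer contract can be kept as a small structured record so that gaps are visible before connector work begins. The sketch below is illustrative only: the `AnswerContract` class, its field names, and the sample values are assumptions that mirror the list above, not a FactVerse or DFS schema.

```python
from dataclasses import dataclass, fields


@dataclass
class AnswerContract:
    """Hypothetical record; field names mirror the checklist, not a DFS schema."""
    user_role: str
    workflow_owner: str
    boundary: str                    # tenant, site, asset group, equipment group, or scene
    time_window: str
    freshness_expectation: str
    source_systems: list[str]
    required_fields: dict[str, str]  # field name -> accepted unit
    output_type: str
    reviewer: str
    approval_boundary: str


def missing_items(contract: AnswerContract) -> list[str]:
    """Return the names of contract items that are still empty."""
    return [f.name for f in fields(contract) if not getattr(contract, f.name)]


# Sample values are invented for illustration.
contract = AnswerContract(
    user_role="facility engineer",
    workflow_owner="operations team",
    boundary="site A / chiller plant",
    time_window="last 24 hours",
    freshness_expectation="meter values no older than 15 minutes",
    source_systems=["meter feed", "work-order feed"],
    required_fields={"chilled_water_supply_temp": "degC"},
    output_type="status summary",
    reviewer="",  # not yet assigned
    approval_boundary="read-only",
)
print(missing_items(contract))  # ['reviewer']
```

A contract with any missing item is a signal to pause before Step 3 rather than to guess the boundary later.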
Step 2: Choose DFS Lite, DFS Pro, or both
Choose the data path based on the expected level of reuse and review.
| Data need | Use DFS Lite | Add DFS Pro |
|---|---|---|
| One source needs to feed a read-only Agent answer | Create connector, preview fields, map, sync, and review quality. | Add a dataset if the same evidence will be reused or audited. |
| Several sources describe the same asset or event | Connect each source and map identities. | Use datasets, fusion tasks, review queue, lineage, and dataset validation. |
| The Agent output may support a work-order or inspection draft | Confirm source freshness and mapping through DFS Lite. | Require steward, dataset version, review notes, and allowed action boundary. |
| Physical AI needs reusable scenario inputs | Map scene, asset, process, and signal fields. | Package scenario input as a versioned dataset with validation notes. |
Use DFS Lite to make sources reachable and understandable. Use DFS Pro to make data reusable, reviewed, versioned, and safe for repeated Agent workflows.
Step 3: Connect source systems with DFS Lite
Open:
Data Integration > Connectors
Create or reuse a connector for each source in the answer contract.
Prepare the connector inputs:
| Source | Prepare |
|---|---|
| OPC UA, MQTT, REST, CSV, or other enabled connector | Endpoint, credentials, selected paths, payload shape, timestamp field, key field, sampling expectation. |
| Work-order or inspection feed | Source owner, asset key, work status fields, close notes, timestamp fields, update cadence. |
| Meter, signal, or equipment status feed | Point names, units, scale factors, operating ranges, quality flags. |
| Scene or model metadata source | Scene ID, model asset ID, version, geometry or process fields, validation reference. |
Run Test Connection before saving or starting the connector. A passing test confirms reachability only; it does not confirm that fields, mappings, or data quality are correct. Continue with browse, preview, mapping, sync, and quality checks before using the data in an Agent task.
Step 4: Preview and map Agent evidence
Open the connector detail page. Use browse and preview to confirm the fields that the Agent task needs.
For every field used by the Agent, record:
| Mapping item | Review question |
|---|---|
| Source path | Which tag, topic field, JSON path, table column, node ID, or file column produced this value? |
| Operational identity | Which FactVerse asset, equipment, point, room, scene, work order, or dataset field does it map to? |
| Timestamp | Which timestamp proves when the value was produced or updated? |
| Unit and scale | What unit, scale factor, enum, or normalization rule applies? |
| Quality rule | What missing value, stale value, outlier, or range check should be visible to reviewers? |
| Agent use | Will the Agent cite, summarize, compare, calculate, draft, or hand off this field? |
When mapping suggestions are available, treat them as review inputs rather than final mappings. Accept a mapping only after checking identity, units, and downstream meaning.
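One way to keep the mapping review honest is to record the six mapping items per field and list whatever is still blank. This is a minimal sketch under stated assumptions: the dictionary keys, the sample node IDs, and the identity strings are hypothetical and do not come from the DFS Lite mapping format.

```python
# Keys mirror the mapping items in the table above (names are assumptions).
REQUIRED_ITEMS = (
    "source_path", "operational_identity", "timestamp_field",
    "unit_and_scale", "quality_rule", "agent_use",
)


def incomplete_mappings(mappings):
    """Map each field name to the mapping items that are still blank."""
    gaps = {}
    for name, record in mappings.items():
        missing = [item for item in REQUIRED_ITEMS if not record.get(item)]
        if missing:
            gaps[name] = missing
    return gaps


# Invented sample records for two Agent evidence fields.
mappings = {
    "chiller_1_supply_temp": {
        "source_path": "ns=2;s=Chiller1.SupplyTemp",
        "operational_identity": "equipment:chiller-1/point:supply-temp",
        "timestamp_field": "source_timestamp",
        "unit_and_scale": "degC, scale 0.1",
        "quality_rule": "stale after 15 min; range 2-20 degC",
        "agent_use": "cite",
    },
    "chiller_1_status": {
        "source_path": "ns=2;s=Chiller1.Status",
        "operational_identity": "equipment:chiller-1/point:status",
        "timestamp_field": "source_timestamp",
        "unit_and_scale": "",   # enum not documented yet
        "quality_rule": "",     # no rule agreed yet
        "agent_use": "summarize",
    },
}
print(incomplete_mappings(mappings))
```

Fields that appear in the result are candidates for exclusion from the Agent answer until the gaps are closed.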
Step 5: Sync and inspect quality
Start or run the connector.
Open:
Data Integration > Sync History
Data Integration > Quality
Check:
- latest sync status;
- rows read, written, failed, or rejected;
- latest source timestamp;
- connector quality score;
- completeness, timeliness, and accuracy indicators when available;
- failed field, failed row, or rejected payload;
- quota or throughput limits that may affect the Agent task.
Agent workflows should receive quality notes in plain language. Example:
Chiller meter values are current to 2026-06-06 09:10 UTC.
Work-order close notes update nightly.
Three point aliases remain unmapped and are excluded from the Agent answer.
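Notes like these can be generated from sync results instead of being typed by hand. The sketch below assumes the latest source timestamp and the unmapped-alias count have already been read from Sync History and Quality; the `quality_note` function and its parameters are hypothetical, not a DFS API.

```python
from datetime import datetime, timedelta, timezone


def quality_note(source, latest_ts, now, max_age, unmapped=0):
    """Build one plain-language note from a source's freshness and mapping state.

    Both timestamps must be timezone-aware; the note is rendered in UTC.
    """
    stamp = latest_ts.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    if now - latest_ts <= max_age:
        note = f"{source} values are current to {stamp}."
    else:
        note = f"{source} values are stale: last update {stamp}."
    if unmapped:
        note += (f" {unmapped} point aliases remain unmapped"
                 " and are excluded from the Agent answer.")
    return note


# Invented sample values matching the example above.
now = datetime(2026, 6, 6, 9, 20, tzinfo=timezone.utc)
latest = datetime(2026, 6, 6, 9, 10, tzinfo=timezone.utc)
print(quality_note("Chiller meter", latest, now, timedelta(minutes=15), unmapped=3))
```

The `max_age` threshold should come from the freshness expectation in the answer contract, so that "current" means the same thing to the Agent and to reviewers.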
Step 6: Create or update a DFS Pro dataset
Open:
Data Integration > Dataset Center
Create a dataset when the Agent task needs repeatable access, versioning, steward review, lineage, or multi-source fusion.
The dataset description should include:
- Agent task name;
- source connector or source system;
- owner and steward;
- refresh cadence;
- identity fields;
- required timestamp field;
- required quality checks;
- known limitations;
- allowed Agent output type.
On the dataset detail page, review:
| Check | Purpose for Agent workflows |
|---|---|
| Preview | Confirm rows and payload shape before an Agent reads the data. |
| Profile | Inspect null ratio, distinct IDs, timestamp range, and outliers. |
| Validate | Mark the dataset accepted for the intended Agent task. |
| Versions | Keep schema and data-shape changes traceable. |
| Lineage | Show source connectors, fusion tasks, outputs, reports, and Agent dependencies. |
| Change impact | Review downstream workflows before replacing or changing the dataset. |
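To make the Profile check concrete, the sketch below computes the same kind of indicators (null ratio, distinct IDs, timestamp range) over a handful of exported rows. It illustrates what a reviewer looks for, not how DFS Pro computes its profile; the row layout and field names are assumptions.

```python
def profile_rows(rows, id_field, ts_field, value_field):
    """Summarize rows the way a reviewer reads a dataset profile."""
    total = len(rows)
    nulls = sum(1 for r in rows if r.get(value_field) is None)
    # ISO 8601 strings in one consistent format sort chronologically.
    timestamps = [r[ts_field] for r in rows if r.get(ts_field)]
    return {
        "rows": total,
        "null_ratio": nulls / total if total else 0.0,
        "distinct_ids": len({r[id_field] for r in rows if r.get(id_field)}),
        "timestamp_range": (min(timestamps), max(timestamps)) if timestamps else None,
    }


# Invented sample rows.
rows = [
    {"asset_id": "chiller-1", "ts": "2026-06-06T09:00:00Z", "value": 6.8},
    {"asset_id": "chiller-1", "ts": "2026-06-06T09:05:00Z", "value": None},
    {"asset_id": "chiller-2", "ts": "2026-06-06T09:00:00Z", "value": 7.1},
    {"asset_id": "chiller-2", "ts": "2026-06-06T09:10:00Z", "value": 7.0},
]
print(profile_rows(rows, "asset_id", "ts", "value"))
```

A null ratio or timestamp range that contradicts the answer contract is worth recording under known limitations before the dataset is validated.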
Step 7: Fuse sources when the Agent needs one evidence view
Open:
Data Integration > Data Fusion
Create a fusion task when the Agent needs combined evidence across datasets.
| Fusion need | Configuration focus | Review focus |
|---|---|---|
| Join signals and work orders | Asset key, time window, event key, source priority. | Confirm matches and unmatched critical events. |
| Reconcile asset names and aliases | Natural key, alias list, semantic match fields. | Resolve low-confidence or conflicting identity matches. |
| Merge inspection findings and alarms | Asset ID, finding ID, alarm time, status fields. | Confirm source disagreement and close-note meaning. |
| Fill gaps with calculated or synthetic values | Real dataset, calculated dataset, key fields, tolerance. | Confirm provenance and mark filled values clearly. |
After the run, review:
- task status and run history;
- input dataset IDs and output dataset;
- total, matched, unmatched, and conflict counts;
- review queue items;
- rejected rows;
- steward decision;
- output dataset profile and validation state.
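The matched and unmatched counts are easier to review with a mental model of the join behind them. The sketch below pairs alarms with work orders on asset key and a time window, which is one plausible reading of the "Join signals and work orders" row; it is not the DFS Pro fusion algorithm, and every record and field name is invented.

```python
from datetime import datetime, timedelta


def match_events(alarms, work_orders, window):
    """Pair each alarm with the first work order for the same asset
    opened within `window` after the alarm time."""
    matched, unmatched = [], []
    for alarm in alarms:
        hit = next(
            (wo for wo in work_orders
             if wo["asset_id"] == alarm["asset_id"]
             and timedelta(0) <= wo["opened_at"] - alarm["time"] <= window),
            None,
        )
        if hit is not None:
            matched.append((alarm["alarm_id"], hit["wo_id"]))
        else:
            unmatched.append(alarm["alarm_id"])
    return matched, unmatched


# Invented sample records.
alarms = [
    {"alarm_id": "A1", "asset_id": "chiller-1", "time": datetime(2026, 6, 6, 8, 0)},
    {"alarm_id": "A2", "asset_id": "pump-3", "time": datetime(2026, 6, 6, 9, 0)},
]
work_orders = [
    {"wo_id": "WO-10", "asset_id": "chiller-1", "opened_at": datetime(2026, 6, 6, 8, 30)},
    {"wo_id": "WO-11", "asset_id": "pump-3", "opened_at": datetime(2026, 6, 7, 9, 0)},
]
matched, unmatched = match_events(alarms, work_orders, timedelta(hours=4))
print(len(matched), "matched;", len(unmatched), "unmatched:", unmatched)
```

Here alarm A2 is unmatched because its work order opened a day later, outside the four-hour window. An unmatched critical event like this belongs in the review queue rather than being dropped silently.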
Step 8: Publish the Agent handoff record
Before the Agent workflow consumes the dataset, create a short handoff record.
Use this template:
Agent workflow:
Dataset:
Dataset version:
Source systems:
Latest source timestamps:
Asset or scene boundary:
Required fields:
Known exclusions:
Quality notes:
Steward:
Reviewer:
Allowed Agent output:
Write-action approval:
MCP endpoint and scopes:
Run-record location:
Attach the handoff record to the workflow run record or project documentation. It lets reviewers see what the Agent was allowed to use and which data was missing or limited.
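The template can be filled programmatically so that blank entries are reported instead of overlooked. This is a minimal sketch: the labels are copied from the template above, while the `render_handoff` function and the sample values are hypothetical.

```python
# Labels copied from the handoff template, in the same order.
HANDOFF_FIELDS = [
    "Agent workflow", "Dataset", "Dataset version", "Source systems",
    "Latest source timestamps", "Asset or scene boundary", "Required fields",
    "Known exclusions", "Quality notes", "Steward", "Reviewer",
    "Allowed Agent output", "Write-action approval", "MCP endpoint and scopes",
    "Run-record location",
]


def render_handoff(values):
    """Return (text, missing): the filled template and the labels left blank."""
    missing = [label for label in HANDOFF_FIELDS if not values.get(label)]
    lines = [f"{label}: {values.get(label) or ''}".rstrip()
             for label in HANDOFF_FIELDS]
    return "\n".join(lines), missing


# Invented sample values; most entries are intentionally left blank.
text, missing = render_handoff({
    "Agent workflow": "Facility status summary",
    "Dataset": "facility-ops",
    "Dataset version": "v3",
})
print(text)
print("Still blank:", missing)
```

A record with blank entries can still be published, provided the blanks are deliberate and visible to the reviewer.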
Step 9: Validate with a read-only Agent run
Run the Agent workflow in read-only mode first.
The validation prompt should ask the Agent to:
- state the workflow boundary;
- list datasets and source timestamps used;
- summarize data quality issues;
- cite source fields or records;
- identify missing evidence;
- produce the requested answer without writing to any operational system.
Accept the data package only when reviewers can trace the answer back to DFS evidence. Add compute or draft write actions only after the answer format, quality notes, and review path are accepted.
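If the validation run returns a structured response, a short pre-check can flag obvious gaps before human review. The sketch below assumes a hypothetical response dictionary with one key per item in the validation prompt plus a `write_actions` list; the actual FactVerse AI Agent output format may differ.

```python
# One key per item the validation prompt asks for (names are assumptions).
REQUIRED_SECTIONS = (
    "workflow_boundary", "datasets_used", "quality_issues",
    "citations", "missing_evidence", "answer",
)


def review_validation_run(response):
    """Return a list of problems; an empty list means ready for human review."""
    # Check key presence, not truthiness: an empty list of issues is valid.
    problems = [f"missing section: {s}" for s in REQUIRED_SECTIONS
                if s not in response]
    if response.get("write_actions"):
        problems.append("run attempted write actions")
    for ds in response.get("datasets_used", []):
        if not ds.get("source_timestamp"):
            problems.append(f"no source timestamp for dataset {ds.get('id')}")
    return problems


# Invented sample response from a read-only run.
response = {
    "workflow_boundary": "site A / chiller plant",
    "datasets_used": [{"id": "facility-ops", "source_timestamp": ""}],
    "quality_issues": [],
    "citations": ["facility-ops.chiller_1_supply_temp"],
    "missing_evidence": [],
    "answer": "Chiller 1 is running within its normal range.",
    "write_actions": [],
}
print(review_validation_run(response))
```

A check like this only confirms that the expected parts are present. Whether the citations actually support the answer remains a reviewer decision.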
Acceptance checklist
- The Agent answer contract is written before connector or dataset work.
- Required sources are connected, imported, or documented as unavailable.
- Field mappings cover source path, identity, timestamp, unit and scale, quality rule, and Agent use.
- Sync history and quality state are reviewed.
- Rejected rows and critical review items are resolved or listed as limitations.
- DFS Pro dataset has steward, profile, validation state, version, and lineage.
- Fusion outputs record method, inputs, output dataset, conflicts, and reviewer decision.
- Agent handoff record lists dataset version, timestamps, quality notes, allowed output, and MCP scope plan.
- First Agent validation run is read-only and includes source references.
Related pages
| Page | Use |
|---|---|
| Build an Operational Data Pipeline with DFS | Use the broader Lite-to-Pro workflow for operational data products. |
| Create an AI Agent-Ready Dataset | Follow a shorter recipe for packaging one dataset. |
| DFS Lite Connectors | Create and operate source connectors. |
| Mapping Source Fields | Map source values to target entities and fields. |
| DFS Pro Datasets | Create, validate, version, and steward datasets. |
| DFS Pro Fusion Tasks | Fuse multiple datasets and produce reviewed outputs. |
| FactVerse AI Agent Data Readiness | Check whether the Agent workflow has enough evidence to run. |