DFS Pro Datasets
A DFS Pro dataset is a governed data asset. It can come from a connector, a file import, a subset extraction, or a fusion output.
Use datasets when data needs to be reusable, reviewed, profiled, versioned, or consumed by AI Agent, predictive maintenance, BI, or operational workflows.
Open Dataset Center
Go to:
Data Integration > Dataset Center
Dataset Center shows available datasets, source type, row count, column count, tags, and update time.
Dataset source types
| Source type | Use |
|---|---|
| File import | Data uploaded or imported from a file. |
| Connector | Data produced by a DFS Lite connector. |
| Subset extraction | A filtered or selected subset from another dataset. |
| Fusion output | Output from a DFS Pro fusion task. |
Create a dataset
- Open Dataset Center.
- Select Create Dataset.
- Enter a dataset name.
- Add a description that explains the data owner, source, and purpose.
- Choose the source type.
- Save the dataset.
- Open the dataset detail page.
Use names that still make sense to someone who was not part of the original project.
Examples:
- Chiller plant sensor history
- CMMS work orders normalized
- Inspection findings by asset
- Compressor vibration features
Inspect a dataset
On Dataset Detail, review:
- name and description;
- source type and source ID;
- table name when present;
- row count;
- column count;
- column schema;
- time column;
- metric column;
- tags;
- quality issues;
- status;
- steward;
- current version.
Use preview to inspect sample rows. Use profile to inspect column-level statistics such as row count, null ratio, and distinct ratio.
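To make the profile statistics concrete, here is a minimal Python sketch of how null ratio and distinct ratio could be computed from sample rows. It is not the DFS Pro implementation. The function `profile_column`, the sample column names, and the choice to divide distinct non-null values by total rows are all assumptions for illustration; the product may define these ratios differently.

```python
from typing import Any


def profile_column(values: list[Any]) -> dict[str, float]:
    """Compute simple profile statistics for one column of sample values."""
    row_count = len(values)
    if row_count == 0:
        return {"row_count": 0, "null_ratio": 0.0, "distinct_ratio": 0.0}
    non_null = [v for v in values if v is not None]
    # Share of rows where the column has no value.
    null_ratio = (row_count - len(non_null)) / row_count
    # Assumption: distinct ratio = distinct non-null values / total rows.
    distinct_ratio = len(set(non_null)) / row_count
    return {
        "row_count": row_count,
        "null_ratio": null_ratio,
        "distinct_ratio": distinct_ratio,
    }


# Hypothetical sample rows, shaped like a dataset preview.
rows = [
    {"asset_id": "CH-01", "supply_temp_c": 6.8},
    {"asset_id": "CH-01", "supply_temp_c": None},
    {"asset_id": "CH-02", "supply_temp_c": 7.1},
    {"asset_id": "CH-02", "supply_temp_c": 7.1},
]
profile = {col: profile_column([r[col] for r in rows]) for col in rows[0]}
```

A high null ratio on a required column, or a distinct ratio near zero on a column expected to vary, is worth raising with the steward before validation.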
Assign a steward
Assign a steward when a dataset will be used beyond a one-time test.
The steward should be able to answer:
- where the data came from;
- which source system owns it;
- how often it should update;
- which columns are required;
- which quality issues block downstream use;
- who approves schema changes.
Validate a dataset
Validate a dataset after you have reviewed its preview and profile and confirmed who owns the source.
Validation checklist:
- dataset name and description are clear;
- source type and source ID are correct;
- time column is correct when the dataset is time-based;
- metric column is correct when the dataset is metric-based;
- required columns are present;
- null and distinct ratios are understood;
- quality issues are reviewed;
- steward is assigned;
- downstream consumers are known.
After validation, the dataset can be used as a more reliable input for fusion tasks, AI Agent workflows, predictive maintenance, or BI reports.
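Part of the validation checklist can be expressed as an automated pre-check. The sketch below is illustrative only: `DatasetRecord` and `validation_gaps` are hypothetical names, the fields mirror the items shown on Dataset Detail rather than any documented DFS Pro API, and judgment items such as "null and distinct ratios are understood" still need a human reviewer.

```python
from dataclasses import dataclass, field
from typing import Optional

# Source types listed in Dataset Center.
SOURCE_TYPES = {"File import", "Connector", "Subset extraction", "Fusion output"}


@dataclass
class DatasetRecord:
    # Hypothetical record; not a DFS Pro data structure.
    name: str
    description: str
    source_type: str
    source_id: str
    columns: list[str]
    required_columns: list[str] = field(default_factory=list)
    time_column: Optional[str] = None
    metric_column: Optional[str] = None
    steward: Optional[str] = None
    open_quality_issues: int = 0
    consumers: list[str] = field(default_factory=list)


def validation_gaps(ds: DatasetRecord) -> list[str]:
    """Return the checklist items that still need attention."""
    gaps = []
    if not ds.name.strip() or not ds.description.strip():
        gaps.append("name or description is empty")
    if ds.source_type not in SOURCE_TYPES or not ds.source_id:
        gaps.append("source type or source ID is missing or unknown")
    # Time and metric columns are only checked when they are set.
    for label, col in (("time", ds.time_column), ("metric", ds.metric_column)):
        if col is not None and col not in ds.columns:
            gaps.append(f"{label} column '{col}' is not in the schema")
    missing = [c for c in ds.required_columns if c not in ds.columns]
    if missing:
        gaps.append("missing required columns: " + ", ".join(missing))
    if ds.open_quality_issues > 0:
        gaps.append(f"{ds.open_quality_issues} quality issue(s) not reviewed")
    if not ds.steward:
        gaps.append("no steward assigned")
    if not ds.consumers:
        gaps.append("no downstream consumers recorded")
    return gaps


# Illustrative values only.
ds = DatasetRecord(
    name="Chiller plant sensor history",
    description="Owner, source, and purpose go here.",
    source_type="Connector",
    source_id="conn-001",
    columns=["ts", "asset_id", "supply_temp_c"],
    required_columns=["ts", "asset_id", "vibration_rms"],
    time_column="ts",
    metric_column="supply_temp_c",
)
gaps = validation_gaps(ds)
```

An empty result means the mechanical checks passed; it does not replace the steward's review.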
Use version history
Dataset version history helps you review schema changes between versions. Use it when:
- columns are added or removed;
- column names change;
- data types change;
- a downstream dashboard, fusion task, or AI workflow depends on the dataset.
Before making a breaking change, check change impact so downstream users can review the effect.
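The kinds of schema change listed above can be detected by comparing two versions of a column schema. The following sketch assumes a schema is a plain mapping of column name to data type; `diff_schema` and `is_breaking` are hypothetical helpers, and the rule that removals and type changes are breaking while additions are not is an assumption, not DFS Pro's documented behavior.

```python
def diff_schema(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Compare two column-name -> data-type mappings."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    # Columns present in both versions whose data type differs.
    type_changed = sorted(c for c in set(old) & set(new) if old[c] != new[c])
    return {"added": added, "removed": removed, "type_changed": type_changed}


def is_breaking(diff: dict[str, list[str]]) -> bool:
    # Assumption: removed or retyped columns can break consumers; added columns do not.
    return bool(diff["removed"] or diff["type_changed"])


# Illustrative schemas for two versions of the same dataset.
v1 = {"ts": "timestamp", "asset_id": "string", "temp": "float"}
v2 = {
    "ts": "timestamp",
    "asset_id": "string",
    "supply_temp_c": "float",
    "flow_lpm": "float",
}
diff = diff_schema(v1, v2)
breaking = is_breaking(diff)
```

Note that a renamed column (`temp` to `supply_temp_c` here) appears as one removal plus one addition, so a name-based diff alone cannot tell a rename from an unrelated drop and add.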
Use lineage
Lineage shows upstream producers and downstream consumers. Use it before changing or deprecating a dataset.
Ask:
- Which fusion tasks use this dataset?
- Which reports use this dataset?
- Which AI Agent workflows depend on it?
- Which output datasets were produced from it?
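These questions amount to walking the lineage graph downstream from one dataset. The sketch below assumes lineage is available as a mapping from each node to its direct consumers; the `edges` structure, the `kind:name` labels, and `downstream_of` are hypothetical and only illustrate the traversal, not how DFS Pro stores or exposes lineage.

```python
from collections import deque


def downstream_of(edges: dict[str, list[str]], start: str) -> list[str]:
    """Return every node reachable downstream of start, in breadth-first order."""
    seen = {start}  # guards against cycles leading back to the start
    order = []
    queue = deque(edges.get(start, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        order.append(node)
        queue.extend(edges.get(node, []))
    return order


# Illustrative lineage: producer -> direct consumers.
edges = {
    "dataset:Chiller plant sensor history": [
        "fusion:chiller_health",
        "report:plant_overview",
    ],
    "fusion:chiller_health": ["dataset:Compressor vibration features"],
    "dataset:Compressor vibration features": ["agent:maintenance_assistant"],
}
impacted = downstream_of(edges, "dataset:Chiller plant sensor history")
```

The result includes indirect consumers, such as an AI Agent workflow that reads a dataset produced by a fusion task, which are easy to miss when only direct consumers are checked.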
Deprecate a dataset
Deprecate a dataset when a better dataset replaces it or when the source is retired.
Before deprecating:
- Check lineage.
- Notify downstream owners.
- Confirm the replacement dataset when one is needed.
- Record why the dataset is being deprecated.
Next step
Use Fusion Tasks when multiple datasets need to be combined or compared.