DFS Pro Datasets

A DFS Pro dataset is a governed data asset. It can come from a connector, a file import, a subset extraction, or a fusion output.

Use datasets when data needs to be reusable, reviewed, profiled, versioned, or consumed by AI Agent, predictive maintenance, BI, or operational workflows.

Open Dataset Center

Go to:

Data Integration > Dataset Center

Dataset Center shows available datasets, source type, row count, column count, tags, and update time.

Dataset source types

Source type          Use
File import          Data uploaded or imported from a file.
Connector            Data produced by a DFS Lite connector.
Subset extraction    A filtered or selected subset from another dataset.
Fusion output        Output from a DFS Pro fusion task.

Create a dataset

  1. Open Dataset Center.
  2. Select Create Dataset.
  3. Enter a dataset name.
  4. Add a description that explains the data owner, source, and purpose.
  5. Choose a source type.
  6. Save the dataset.
  7. Open the dataset detail page.

Use names that stay meaningful to people who were not part of the original project.

Examples:

  • Chiller plant sensor history
  • CMMS work orders normalized
  • Inspection findings by asset
  • Compressor vibration features

Inspect a dataset

On Dataset Detail, review:

  • name and description;
  • source type and source ID;
  • table name when present;
  • row count;
  • column count;
  • column schema;
  • time column;
  • metric column;
  • tags;
  • quality issues;
  • status;
  • steward;
  • current version.

Use preview to inspect sample rows. Use profile to inspect column-level statistics such as row count, null ratio, and distinct ratio.
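
To make the profile statistics concrete, here is a minimal Python sketch of how a null ratio and a distinct ratio could be computed for one column. It is illustrative only: `profile_column` is a hypothetical helper, not a DFS Pro function, and it assumes the distinct ratio is the number of distinct non-null values divided by the row count. DFS Pro may define these ratios differently.

```python
from typing import Any


def profile_column(values: list[Any]) -> dict[str, float]:
    """Compute row count, null ratio, and distinct ratio for one column."""
    row_count = len(values)
    if row_count == 0:
        # Avoid division by zero for an empty column.
        return {"row_count": 0, "null_ratio": 0.0, "distinct_ratio": 0.0}

    non_null = [v for v in values if v is not None]
    null_ratio = (row_count - len(non_null)) / row_count
    # Assumption: distinct ratio = distinct non-null values / total rows.
    distinct_ratio = len(set(non_null)) / row_count
    return {
        "row_count": row_count,
        "null_ratio": null_ratio,
        "distinct_ratio": distinct_ratio,
    }


# Four sample rows: one null, two distinct non-null readings.
sample = [21.5, 21.5, None, 22.0]
stats = profile_column(sample)
print(stats)
```

A high null ratio in a required column, or a distinct ratio near zero in a column expected to vary, is usually worth raising with the steward before validation.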

Assign a steward

Assign a steward when a dataset will be used beyond a one-time test.

The steward should be able to answer:

  • where the data came from;
  • which source system owns it;
  • how often it should update;
  • which columns are required;
  • which quality issues block downstream use;
  • who approves schema changes.

Validate a dataset

Validate a dataset after preview, profile, and source ownership are clear.

Validation checklist:

  • dataset name and description are clear;
  • source type and source ID are correct;
  • time column is correct when the dataset is time-based;
  • metric column is correct when the dataset is metric-based;
  • required columns are present;
  • null and distinct ratios are understood;
  • quality issues are reviewed;
  • steward is assigned;
  • downstream consumers are known.
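
Teams that track dataset metadata outside the UI may find it useful to express the checklist as a small script. The sketch below assumes a hypothetical `DatasetRecord` structure kept by the team; the class, its field names, and `validation_findings` are illustrative and are not part of DFS Pro. The item about understanding null and distinct ratios is a human judgment and is deliberately left out.

```python
from dataclasses import dataclass, field
from typing import Optional

# Source types taken from the table in this guide.
KNOWN_SOURCE_TYPES = {"File import", "Connector", "Subset extraction", "Fusion output"}


@dataclass
class DatasetRecord:
    """Hypothetical local record of dataset metadata (not a DFS Pro type)."""
    name: str
    description: str
    source_type: str
    source_id: str
    columns: list[str]
    required_columns: list[str] = field(default_factory=list)
    time_column: Optional[str] = None
    metric_column: Optional[str] = None
    steward: Optional[str] = None
    unreviewed_quality_issues: int = 0
    downstream_consumers: list[str] = field(default_factory=list)


def validation_findings(ds: DatasetRecord) -> list[str]:
    """Return checklist items that still need attention (empty list = ready)."""
    findings = []
    if not ds.name.strip() or not ds.description.strip():
        findings.append("name or description is missing")
    if ds.source_type not in KNOWN_SOURCE_TYPES or not ds.source_id.strip():
        findings.append("source type or source ID is not set correctly")
    # Time and metric columns are only checked when the dataset declares them.
    if ds.time_column is not None and ds.time_column not in ds.columns:
        findings.append("time column is not in the schema")
    if ds.metric_column is not None and ds.metric_column not in ds.columns:
        findings.append("metric column is not in the schema")
    missing = [c for c in ds.required_columns if c not in ds.columns]
    if missing:
        findings.append("required columns missing: " + ", ".join(missing))
    if ds.unreviewed_quality_issues > 0:
        findings.append("quality issues are not reviewed")
    if not ds.steward:
        findings.append("steward is not assigned")
    if not ds.downstream_consumers:
        findings.append("downstream consumers are not recorded")
    return findings


ds = DatasetRecord(
    name="Chiller plant sensor history",
    description="Owner: facilities; source: plant historian; purpose: trend review",
    source_type="Connector",
    source_id="example-source-id",  # placeholder value for illustration
    columns=["ts", "asset_id", "supply_temp"],
    required_columns=["ts", "asset_id"],
    time_column="ts",
    metric_column="supply_temp",
    downstream_consumers=["BI report"],
)
findings = validation_findings(ds)
print(findings)  # ['steward is not assigned']
```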

After validation, the dataset can be used as a more reliable input for fusion tasks, AI Agent workflows, predictive maintenance, or BI reports.

Use version history

Dataset version history helps review schema changes. Use it when:

  • columns are added or removed;
  • column names change;
  • data types change;
  • a downstream dashboard, fusion task, or AI workflow depends on the dataset.

Before making a breaking change, check change impact so downstream users can review the effect.
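
As a rough illustration of what counts as a breaking change, the following Python sketch compares two schema versions represented as plain `column name -> data type` dictionaries. The representation and the helper names `diff_schema` and `is_breaking` are assumptions made for this example, not a DFS Pro format or API. A renamed column shows up here as one removal plus one addition.

```python
def diff_schema(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """List columns that were added, removed, or changed type between versions."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    retyped = sorted(c for c in set(old) & set(new) if old[c] != new[c])
    return {"added": added, "removed": removed, "retyped": retyped}


def is_breaking(diff: dict[str, list[str]]) -> bool:
    """Assumption: removals and type changes can break consumers; additions do not."""
    return bool(diff["removed"] or diff["retyped"])


v1 = {"ts": "timestamp", "asset_id": "string", "temp": "float"}
# In v2, "temp" was renamed to "supply_temp" and "flow" was added.
v2 = {"ts": "timestamp", "asset_id": "string", "supply_temp": "float", "flow": "float"}

diff = diff_schema(v1, v2)
print(diff)
# {'added': ['flow', 'supply_temp'], 'removed': ['temp'], 'retyped': []}
print(is_breaking(diff))  # True
```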

Use lineage

Lineage shows upstream producers and downstream consumers. Use it before changing or deprecating a dataset.

Ask:

  • Which fusion tasks use this dataset?
  • Which reports use this dataset?
  • Which AI Agent workflows depend on it?
  • Which output datasets were produced from it?
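
These questions amount to walking the lineage graph downstream from the dataset. The sketch below shows the idea with a breadth-first traversal over a plain dictionary of `producer -> direct consumers` edges. The edge format, the node names, and `downstream_of` are hypothetical and only illustrate the reasoning; the lineage view in DFS Pro is the actual source of this information.

```python
from collections import deque


def downstream_of(edges: dict[str, list[str]], start: str) -> list[str]:
    """Return every node reachable downstream of `start`, nearest first."""
    seen = set()
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for consumer in edges.get(node, []):
            # Skip nodes already visited so cycles cannot loop forever.
            if consumer not in seen and consumer != start:
                seen.add(consumer)
                order.append(consumer)
                queue.append(consumer)
    return order


# Hypothetical lineage: dataset -> fusion task -> output dataset -> consumers.
edges = {
    "dataset:cmms_work_orders": ["fusion:asset_health"],
    "fusion:asset_health": ["dataset:asset_health_features"],
    "dataset:asset_health_features": [
        "report:maintenance_kpi",
        "agent:work_order_triage",
    ],
}

affected = downstream_of(edges, "dataset:cmms_work_orders")
print(affected)
```

Everything in the returned list has an owner who should hear about a change or deprecation before it happens.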

Deprecate a dataset

Deprecate a dataset when a better dataset replaces it or when the source is retired.

Before deprecating:

  1. Check lineage.
  2. Notify downstream owners.
  3. Confirm the replacement dataset when needed.
  4. Record why the dataset is being deprecated.

Next step

Use Fusion Tasks when multiple datasets need to be combined or compared.