Capacity Planning

Capacity planning defines the resources required for a customer-controlled FactVerse environment before production rollout. Use this page for private container deployments on Kubernetes, OpenShift, customer VMs, or restricted networks.

The numbers below are initial planning bands for customer-side deployments. Validate final values against the release package, enabled modules, expected user concurrency, asset size, connector schedule, retention policy, and the customer's infrastructure standards.

Prerequisites

Complete the deployment model and container runtime decision first. Confirm the product modules in scope, number of environments, identity method, source systems, expected users, client devices, data retention policy, backup target, monitoring platform, and change approval process.

Capacity workflow

Product deployment units

Image names and chart names are provided in the project delivery package. Use this table to understand the deployment units that usually need independent capacity planning.

Product scope | Typical deployment units | Capacity driver
FactVerse Platform baseline | Web console, API gateway, tenant and identity services, asset metadata service, database, cache or queue, object storage, ingress. | Concurrent users, asset metadata volume, API calls, authentication traffic, object storage growth.
DataMesh Inspector | Inspector API, work-order and inspection services, evidence upload service, notification jobs, optional DFS connector workers, ECM evidence storage. | Field users, inspection records, photo or video evidence, work-order synchronization, mobile upload bursts.
Data Fusion Services | DFS API, connector controller, connector workers, mapping and quality jobs, scheduler, queue, connector logs. | Connector count, sync frequency, batch size, source-system limits, retry rate, quality-rule volume.
FactVerse AI Agent | Agent API, workflow orchestrator, tool execution workers, retrieval or index service, approval queue, audit records. | Workflow concurrency, tool-call volume, document retrieval, scheduled automation, human approval backlog.
Enterprise Content Management | ECM API, document service, search or index service, object storage, approval workflow jobs. | Document count, file size, retention period, approval activity, search frequency.
Designer, asset preparation, and Physical AI | Asset service, model conversion worker, simulation or rendering worker, worker scratch storage, optional GPU worker. | Largest model size, conversion frequency, simulation jobs, SimReady asset preparation, rendering or physics workload.
Client applications | Web access, desktop clients, mobile clients, mixed-reality devices, field caches. | Download size, site bandwidth, device cache behavior, update cadence, offline package requirements.

Initial container sizing

These values are per replica or per worker unless the row states otherwise. Use requests for scheduler planning and limits to protect the node. Raise limits only after measuring the validation workload.

Deployment unit | Initial request | Initial limit | Replicas or workers | I/O and storage notes
Web console or static frontend | 0.1-0.25 vCPU, 256-512 MiB | 0.5 vCPU, 1 GiB | 2 replicas for production. | Low disk I/O. Cache static assets at ingress or customer CDN when allowed.
API gateway and lightweight APIs | 0.5-1 vCPU, 1-2 GiB | 2 vCPU, 4 GiB | 2 replicas for production. | Watch p95 latency, error rate, and connection pool usage.
Product API services | 1 vCPU, 2-4 GiB | 4 vCPU, 8 GiB | 2 replicas for production, more for high concurrency. | Sensitive to database latency and object storage access.
Tenant, identity, and admin services | 0.5 vCPU, 1-2 GiB | 2 vCPU, 4 GiB | 2 replicas for production. | Keep SSO callback and session behavior stable during failover tests.
DFS connector worker | 0.5-2 vCPU, 1-4 GiB | 4 vCPU, 8 GiB | Start with 1 worker per connector group or schedule window. | Batch size and source-system latency usually dominate. Avoid overlapping large sync jobs.
AI Agent workflow worker | 1-2 vCPU, 4-8 GiB | 4 vCPU, 16 GiB | Start with 2 workers when scheduled workflows are enabled. | Queue depth, tool-call latency, retrieval latency, and approval backlog drive scaling.
ECM document and search service | 1-2 vCPU, 2-8 GiB | 4 vCPU, 16 GiB | 2 API replicas; size index service separately. | Search index needs fast persistent storage and memory headroom.
Model conversion or asset processing worker | 2-4 vCPU, 8-16 GiB | 8 vCPU, 24-32 GiB | Start with 1-2 workers; isolate from API nodes for heavy assets. | Use fast local scratch storage. Large models can spike memory and temporary disk usage.
Simulation, rendering, or Physical AI worker | 4-8 vCPU, 16-32 GiB | 16 vCPU, 64 GiB | Size per project workload. Add GPU nodes when required by the delivery package. | Needs dedicated scratch storage and longer validation runs.
Cache or queue | 1-2 vCPU, 2-4 GiB | 4 vCPU, 8 GiB | Production should use a customer-approved HA pattern. | Monitor queue depth, memory eviction, and persistence mode.
Ingress controller | 0.5-1 vCPU, 512 MiB-2 GiB | 2 vCPU, 4 GiB | At least 2 replicas when cluster policy allows. | Size for TLS termination, upload size, and client download bursts.
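
As a rough illustration of how per-replica values roll up into scheduler planning, the sketch below sums requests and limits across replicas and compares the requests with a worker pool. The unit labels, the choice of the upper end of each request range, and the `usable_fraction` allowance are assumptions made for this example; they are not values from the release package.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitSizing:
    """Per-replica sizing for one deployment unit."""
    name: str
    request_vcpu: float
    request_gib: float
    limit_vcpu: float
    limit_gib: float
    replicas: int


def total_requests(units):
    """Sum requests over all replicas: the capacity the scheduler reserves."""
    return (
        sum(u.request_vcpu * u.replicas for u in units),
        sum(u.request_gib * u.replicas for u in units),
    )


def total_limits(units):
    """Sum limits over all replicas: the worst case if every replica bursts."""
    return (
        sum(u.limit_vcpu * u.replicas for u in units),
        sum(u.limit_gib * u.replicas for u in units),
    )


def requests_fit(units, node_count, node_vcpu, node_gib, usable_fraction=0.7):
    """True when total requests fit the usable share of the worker pool.

    usable_fraction is a hypothetical planning allowance for system pods
    and headroom; it is not a product-defined value.
    """
    vcpu, gib = total_requests(units)
    return (
        vcpu <= node_count * node_vcpu * usable_fraction
        and gib <= node_count * node_gib * usable_fraction
    )


# Upper end of each request range, with production replica counts.
# The names are illustrative labels, not image or chart names.
UNITS = [
    UnitSizing("web-console", 0.25, 0.5, 0.5, 1, 2),
    UnitSizing("api-gateway", 1, 2, 2, 4, 2),
    UnitSizing("product-api", 1, 4, 4, 8, 2),
    UnitSizing("identity-admin", 0.5, 2, 2, 4, 2),
]

if __name__ == "__main__":
    print("requests (vCPU, GiB):", total_requests(UNITS))
    print("limits   (vCPU, GiB):", total_limits(UNITS))
    # Example pool: 3 worker nodes, each 8 vCPU and 32 GiB RAM.
    print("fits pool:", requests_fit(UNITS, 3, 8, 32))
```

The gap between the summed requests and the summed limits indicates how much burst the nodes would need to absorb if several replicas reached their limits at the same time.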

Environment sizing bands

Use these bands as starting points for customer-side planning. They are cluster-level or environment-level references, not a replacement for the release-specific values file.

Profile | Typical use | Compute baseline | Data services | Storage and I/O baseline
Single-node validation | Lab validation, training, configuration review, issue reproduction. | 8-12 vCPU, 32-48 GiB RAM on one VM or node. | Local or customer-provided database and cache. | 300-500 GiB SSD. Use for validation only.
Small production | One site, moderate users, limited connectors, standard Inspector or ECM workload. | 3 worker nodes, each 8 vCPU and 32 GiB RAM, plus control-plane nodes per customer standard. | PostgreSQL 4 vCPU and 16 GiB RAM; cache or queue 2 vCPU and 4 GiB RAM. | Database SSD with at least 3,000 IOPS; object storage 1-2 TiB; worker scratch 100-200 GiB.
Standard production | Multiple sites or departments, regular DFS sync, AI Agent workflows, document and evidence retention. | 3-5 worker nodes, each 16 vCPU and 64 GiB RAM. | PostgreSQL 8 vCPU and 32 GiB RAM; cache or queue 4 vCPU and 8 GiB RAM; search service 4 vCPU and 16 GiB RAM when enabled. | Database SSD with 6,000-10,000 IOPS; object storage 2-5 TiB; worker scratch 300-500 GiB.
Asset-heavy or Physical AI | Large models, frequent conversion, simulation, rendering, SimReady asset preparation, robotics training scenarios. | Standard production plus dedicated worker nodes with 16-32 vCPU and 64-128 GiB RAM. Add GPU nodes only when required. | PostgreSQL 8-16 vCPU and 32-64 GiB RAM; separate search or index capacity when retrieval is enabled. | Database SSD with 10,000+ IOPS; object storage 5 TiB or more; scratch storage 500 GiB or more with high sequential throughput.
High-control environment | Restricted network, offline package import, strict retention, separate validation and production paths. | Size production and validation separately. Keep spare capacity for offline upgrade validation. | Customer-managed HA database, cache or queue, internal registry, backup platform. | Add space for image archives, restore samples, logs, and release bundles.

Storage and I/O recommendations

Storage area | Recommended class | Planning guidance | Monitor
Database volume | SSD or high-performance block storage. | Start with the IOPS band in the sizing profile. Keep enough free space for indexes, migrations, backup staging, and restore tests. | IOPS saturation, latency, slow queries, lock wait, connection pressure.
Object storage | Customer object storage or S3-compatible service. | Capacity should cover source files, converted assets, documents, evidence, generated reports, retained versions, and lifecycle buffers. | Growth rate, large-object latency, failed uploads, lifecycle cleanup, restore samples.
Worker scratch | Fast local SSD or high-throughput ephemeral volume. | Model conversion and simulation workers need temporary space separate from object storage. Plan scratch size from the largest expected model plus derived files. | Temporary disk pressure, conversion duration, worker eviction, failed jobs.
Search or index volume | SSD persistent volume. | Allocate memory and disk together. Rebuild time should fit the maintenance window. | Query latency, index size, rebuild time, memory pressure.
Logs and audit records | Customer log platform or retained persistent storage. | Size by retention policy and export volume. High-control projects usually need separate audit retention. | Log growth, dropped logs, retention pressure, query time.
Backup target | Customer backup platform or object storage tier. | Backup throughput must fit the maintenance window. Include database, object storage, configuration, and release package evidence. | Backup duration, failed backup, restore duration, incomplete protected asset list.
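
The worker scratch guidance (largest expected model plus derived files) can be turned into a first estimate with simple arithmetic. The sketch below is one possible way to do that; `derived_ratio`, `concurrent_jobs`, and `headroom` are hypothetical planning parameters that should be measured during the validation workload rather than taken from this example.

```python
def scratch_gib(largest_model_gib, derived_ratio, concurrent_jobs=1, headroom=0.25):
    """Estimate scratch space for conversion or simulation workers.

    largest_model_gib: size of the largest expected source model.
    derived_ratio:     derived files as a multiple of the source size
                       (assumption; measure it on real conversions).
    concurrent_jobs:   jobs that may share the same scratch volume.
    headroom:          extra fraction kept free for spikes (assumption).
    """
    per_job = largest_model_gib * (1 + derived_ratio)
    return per_job * concurrent_jobs * (1 + headroom)


if __name__ == "__main__":
    # A 20 GiB model whose derived files are twice the source size,
    # two conversions at once, 25% headroom.
    print(scratch_gib(20, 2.0, concurrent_jobs=2))  # 150.0
```

In this example the estimate is 150 GiB. If the measured derived-file ratio is higher than assumed, the estimate grows proportionally.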

I/O planning rules

  • Put database volumes on SSD-class storage with predictable latency.
  • Keep object storage optimized for large sequential upload and download traffic.
  • Give model conversion and simulation workers dedicated scratch storage so temporary files do not compete with database I/O.
  • Schedule large DFS sync jobs, model conversions, backups, and search reindexing in separate windows when the environment is small.
  • Track storage growth by data class: models, converted assets, documents, inspection evidence, logs, database, and backups.
  • Keep at least one restore sample in the validation plan before accepting the capacity baseline.
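
The rule about separate windows for heavy jobs can be checked mechanically once the schedule is written down. The following sketch flags overlapping windows; the job names and times are made up for illustration, and it assumes every window starts and ends on the same day (no window crosses midnight).

```python
from dataclasses import dataclass
from datetime import time
from itertools import combinations


@dataclass(frozen=True)
class Window:
    """A scheduled window for one heavy job (same-day start and end)."""
    job: str
    start: time
    end: time


def overlaps(a, b):
    """Two windows overlap when each starts before the other ends."""
    return a.start < b.end and b.start < a.end


def find_conflicts(windows):
    """Return every pair of job names whose windows overlap."""
    return [(a.job, b.job) for a, b in combinations(windows, 2) if overlaps(a, b)]


# Hypothetical schedule for a small environment.
SCHEDULE = [
    Window("dfs-sync", time(1, 0), time(2, 30)),
    Window("backup", time(2, 0), time(4, 0)),
    Window("search-reindex", time(4, 0), time(5, 0)),
    Window("model-conversion", time(22, 0), time(23, 30)),
]

if __name__ == "__main__":
    for first, second in find_conflicts(SCHEDULE):
        print(f"overlap: {first} and {second}")
```

With this schedule only the DFS sync and the backup collide. Back-to-back windows such as the backup and the reindex are not reported because the comparison is strict.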

Inputs

Input | Required detail | Capacity impact
Environments | Production, validation, training, disaster recovery, and lab environments. | Determines total cluster, VM, storage, backup, and monitoring footprint.
User workload | Named users, active users, peak concurrent sessions, user groups, site time zones, client types. | Drives web/API replicas, session load, network throughput, and support windows.
Scene and asset workload | Number of scenes, largest model size, model conversion frequency, media files, downloads, field-device cache behavior. | Drives object storage, model processing workers, cache, and backup volume.
DFS and integration workload | Source systems, connector count, sync cadence, batch size, retry policy, write-back requirements. | Drives connector workers, queue depth, database I/O, network routes, and source-system limits.
AI Agent workload | Workflow concurrency, tool-call volume, document retrieval, scheduled automation, approval queues. | Drives worker concurrency, queue capacity, database load, and optional private inference capacity.
Simulation or Physical AI workload | Simulation jobs, asset preparation, rendering, physics validation, robotics or training scenarios when in scope. | May require dedicated worker nodes, GPU-enabled nodes, larger storage, and longer validation windows.
ECM and evidence workload | Documents, SOPs, images, inspection evidence, audit records, retention period. | Drives object storage, database records, index size, backup window, and restore test scope.
Operations policy | Availability target, maintenance window, log retention, backup frequency, recovery objective. | Drives redundancy, monitoring, log storage, backup infrastructure, and restore process.

Planning steps

  1. Select the sizing band that matches the product scope and expected workload.
  2. Map enabled products to deployment units and identify which units need dedicated workers.
  3. Fill the sizing worksheet for users, scenes, assets, integrations, AI Agent workflows, ECM documents, and retention requirements.
  4. Define CPU requests, memory requests, limits, replicas, storage classes, persistent volumes, and namespace quotas.
  5. Define database IOPS, object storage capacity, worker scratch size, search index size, log retention, and backup target throughput.
  6. Separate steady workloads from burst workloads such as model conversion, scheduled synchronization, batch import, search indexing, and simulation jobs.
  7. Define scaling triggers for replicas, worker count, storage expansion, database tuning, connector scheduling, and backup windows.
  8. Run a validation workload with representative users, source records, scenes, documents, and client devices.
  9. Record the baseline capacity, known assumptions, headroom, review cadence, and owner for each resource domain.
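
Scaling triggers from step 7 are easier to review when they are recorded as data instead of prose. The sketch below shows one possible shape; the metric names, thresholds, and actions are hypothetical placeholders and would need to be agreed with the customer owner and mapped to the monitoring platform in use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trigger:
    """One scaling trigger: when metric >= threshold, take the action."""
    metric: str
    threshold: float
    action: str


def fired_actions(triggers, observed):
    """Return the actions whose metric is at or above its threshold.

    Metrics missing from `observed` are treated as not firing.
    """
    return [
        t.action
        for t in triggers
        if t.metric in observed and observed[t.metric] >= t.threshold
    ]


# Hypothetical triggers; thresholds are placeholders, not product values.
TRIGGERS = [
    Trigger("api_cpu_utilization", 0.75, "add API replica"),
    Trigger("connector_queue_depth", 500, "add connector worker"),
    Trigger("db_volume_used_fraction", 0.80, "expand database volume"),
]

if __name__ == "__main__":
    sample = {
        "api_cpu_utilization": 0.62,
        "connector_queue_depth": 820,
        "db_volume_used_fraction": 0.81,
    }
    print(fired_actions(TRIGGERS, sample))
```

Keeping the triggers in one list also gives step 9 a concrete artifact to store alongside the baseline, with an owner per resource domain.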

Sizing worksheet

Worksheet item | Record
Peak concurrent users | Business peak, site peak, client type, expected growth, validation sample.
Largest operational scene | Scene size, asset count, media count, target devices, download behavior.
Integration schedule | Source system, sync frequency, batch size, allowed window, retry policy.
AI Agent concurrency | Workflow type, scheduled runs, manual runs, tool-call volume, approval queue.
Storage growth | Object storage growth, database growth, log growth, retention period.
Backup and restore | Backup frequency, backup window, restore objective, restore sample set.
High availability | Replica policy, node spread, database availability pattern, maintenance window.
Optional GPU workload | Simulation, rendering, model processing, private inference, validation runtime.
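
For the storage growth row, a simple linear projection per data class is often enough for a first review. The sketch below estimates how many months remain before a provisioned volume fills; the capacities and growth rates are invented sample values, and the linear model is an assumption that ignores retention cleanup and seasonal bursts.

```python
def months_until_full(provisioned_gib, used_gib, monthly_growth_gib):
    """Months before a volume fills, assuming constant linear growth.

    Returns None when growth is zero or negative (no projected fill date)
    and 0.0 when the volume is already at or over capacity.
    """
    if monthly_growth_gib <= 0:
        return None
    return max(0.0, (provisioned_gib - used_gib) / monthly_growth_gib)


# Hypothetical worksheet values: (provisioned GiB, used GiB, growth GiB/month).
DATA_CLASSES = {
    "object-storage": (2048, 600, 80),
    "database": (500, 120, 10),
    "logs": (200, 50, 0),
}

if __name__ == "__main__":
    for name, (provisioned, used, growth) in DATA_CLASSES.items():
        print(name, months_until_full(provisioned, used, growth))
```

A result shorter than the agreed review cadence suggests that the volume needs either more capacity or a tighter retention policy before the next review.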

Validation checklist

  • Representative users can complete target workflows during the expected peak window.
  • Connector jobs finish inside the approved synchronization window.
  • Model conversion, asset loading, and document access meet acceptance expectations.
  • AI Agent workflows and approval queues do not create unbounded backlog.
  • Database, queue, cache, and object storage metrics remain within the agreed operating range.
  • Backup completes inside the approved window and restore sampling succeeds.
  • Alerts exist for CPU, memory, pod restart, queue backlog, database connection pressure, storage growth, and backup failure.
  • The customer owner has approved the baseline and review cadence.
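
The backup item in the checklist comes down to arithmetic on protected volume and sustained throughput. The sketch below estimates backup duration and the throughput needed for a given window. It assumes a full backup at constant throughput, which is a simplification; incremental backups, compression, and deduplication would change the result.

```python
def backup_hours(protected_gib, throughput_mib_s):
    """Estimated duration of a full backup at constant throughput."""
    gib_per_hour = throughput_mib_s * 3600 / 1024
    return protected_gib / gib_per_hour


def required_throughput_mib_s(protected_gib, window_hours):
    """Sustained throughput needed to finish inside the window."""
    return protected_gib * 1024 / (window_hours * 3600)


def fits_window(protected_gib, throughput_mib_s, window_hours):
    """True when the estimated backup duration fits the approved window."""
    return backup_hours(protected_gib, throughput_mib_s) <= window_hours


if __name__ == "__main__":
    # Sample values: 1500 GiB protected, 200 MiB/s target, 4-hour window.
    print(round(backup_hours(1500, 200), 2), "hours")
    print(round(required_throughput_mib_s(1500, 4), 1), "MiB/s needed")
    print("fits:", fits_window(1500, 200, 4))
```

The same calculation applies to restore sampling if the restore throughput is measured separately, since restore is often slower than backup.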

Expected result

The expected output is a capacity baseline that includes product deployment units, workload assumptions, initial resource requests and limits, database and storage I/O assumptions, backup estimates, scaling triggers, validation evidence, and owners for future capacity reviews.

Troubleshooting capacity gaps

Symptom | Check
Users report slow pages during peak hours | Concurrent sessions, ingress capacity, API replicas, database latency, cache hit rate.
Connector jobs miss the sync window | Source-system limits, batch size, worker count, queue depth, retry policy, network route.
Model or asset tasks take too long | Worker resources, asset size, storage throughput, conversion queue, optional GPU worker need.
Storage grows faster than expected | Retention policy, duplicate uploads, log retention, imported file lifecycle, backup copies.
Backup overruns the maintenance window | Protected asset list, object storage volume, database size, backup target throughput, schedule.
Resource requests block deployment | Namespace quota, node capacity, storage class availability, OpenShift project limits, cluster policy.