Capacity Planning
Capacity planning defines the resources required for a customer-controlled FactVerse environment before production rollout. Use this page for private container deployments on Kubernetes, OpenShift, customer VMs, or restricted networks.
The numbers below are initial planning bands for customer-side deployments. Validate final values against the release package, enabled modules, expected user concurrency, asset size, connector schedule, retention policy, and customer infrastructure standards.
Prerequisites
Decide the deployment model and container runtime first. Then confirm the product modules in scope, number of environments, identity method, source systems, expected users, client devices, data retention policy, backup target, monitoring platform, and change approval process.
Capacity workflow
Product deployment units
Image names and chart names are provided in the project delivery package. Use this table to understand the deployment units that usually need independent capacity planning.
| Product scope | Typical deployment units | Capacity driver |
|---|---|---|
| FactVerse Platform baseline | Web console, API gateway, tenant and identity services, asset metadata service, database, cache or queue, object storage, ingress. | Concurrent users, asset metadata volume, API calls, authentication traffic, object storage growth. |
| DataMesh Inspector | Inspector API, work-order and inspection services, evidence upload service, notification jobs, optional DFS connector workers, ECM evidence storage. | Field users, inspection records, photo or video evidence, work-order synchronization, mobile upload bursts. |
| Data Fusion Services | DFS API, connector controller, connector workers, mapping and quality jobs, scheduler, queue, connector logs. | Connector count, sync frequency, batch size, source-system limits, retry rate, quality-rule volume. |
| FactVerse AI Agent | Agent API, workflow orchestrator, tool execution workers, retrieval or index service, approval queue, audit records. | Workflow concurrency, tool-call volume, document retrieval, scheduled automation, human approval backlog. |
| Enterprise Content Management | ECM API, document service, search or index service, object storage, approval workflow jobs. | Document count, file size, retention period, approval activity, search frequency. |
| Designer, asset preparation, and Physical AI | Asset service, model conversion worker, simulation or rendering worker, worker scratch storage, optional GPU worker. | Largest model size, conversion frequency, simulation jobs, SimReady asset preparation, rendering or physics workload. |
| Client applications | Web access, desktop clients, mobile clients, mixed-reality devices, field caches. | Download size, site bandwidth, device cache behavior, update cadence, offline package requirements. |
Initial container sizing
These values are per replica or per worker unless the row states otherwise. Use requests for scheduler planning and limits to protect the node. Raise limits only after measuring the validation workload.
| Deployment unit | Initial request | Initial limit | Replicas or workers | I/O and storage notes |
|---|---|---|---|---|
| Web console or static frontend | 0.1-0.25 vCPU, 256-512 MiB | 0.5 vCPU, 1 GiB | 2 replicas for production. | Low disk I/O. Cache static assets at ingress or customer CDN when allowed. |
| API gateway and lightweight APIs | 0.5-1 vCPU, 1-2 GiB | 2 vCPU, 4 GiB | 2 replicas for production. | Watch p95 latency, error rate, and connection pool usage. |
| Product API services | 1 vCPU, 2-4 GiB | 4 vCPU, 8 GiB | 2 replicas for production, more for high concurrency. | Sensitive to database latency and object storage access. |
| Tenant, identity, and admin services | 0.5 vCPU, 1-2 GiB | 2 vCPU, 4 GiB | 2 replicas for production. | Keep SSO callback and session behavior stable during failover tests. |
| DFS connector worker | 0.5-2 vCPU, 1-4 GiB | 4 vCPU, 8 GiB | Start with 1 worker per connector group or schedule window. | Batch size and source-system latency usually dominate. Avoid overlapping large sync jobs. |
| AI Agent workflow worker | 1-2 vCPU, 4-8 GiB | 4 vCPU, 16 GiB | Start with 2 workers when scheduled workflows are enabled. | Queue depth, tool-call latency, retrieval latency, and approval backlog drive scaling. |
| ECM document and search service | 1-2 vCPU, 2-8 GiB | 4 vCPU, 16 GiB | 2 API replicas; size index service separately. | Search index needs fast persistent storage and memory headroom. |
| Model conversion or asset processing worker | 2-4 vCPU, 8-16 GiB | 8 vCPU, 24-32 GiB | Start with 1-2 workers; isolate from API nodes for heavy assets. | Use fast local scratch storage. Large models can spike memory and temporary disk usage. |
| Simulation, rendering, or Physical AI worker | 4-8 vCPU, 16-32 GiB | 16 vCPU, 64 GiB | Size per project workload. Add GPU nodes when required by the delivery package. | Needs dedicated scratch storage and longer validation runs. |
| Cache or queue | 1-2 vCPU, 2-4 GiB | 4 vCPU, 8 GiB | Production should use a customer-approved HA pattern. | Monitor queue depth, memory eviction, and persistence mode. |
| Ingress controller | 0.5-1 vCPU, 512 MiB-2 GiB | 2 vCPU, 4 GiB | At least 2 replicas when cluster policy allows. | Size for TLS termination, upload size, and client download bursts. |
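As a rough illustration of how the per-replica requests above add up for scheduler planning, the following sketch sums requests across replicas. The unit names and the chosen values (the upper end of each request range in the table) are assumptions for illustration only; take real values from the release-specific values file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitRequest:
    name: str        # hypothetical label, not a real image or chart name
    vcpu: float      # CPU request per replica, in vCPU
    mem_gib: float   # memory request per replica, in GiB
    replicas: int


def total_requests(units):
    """Sum CPU and memory requests across all replicas of all units."""
    cpu = sum(u.vcpu * u.replicas for u in units)
    mem = sum(u.mem_gib * u.replicas for u in units)
    return cpu, mem


# Upper end of the request ranges for a small set of baseline units (assumed scope).
units = [
    UnitRequest("web-console", 0.25, 0.5, 2),
    UnitRequest("api-gateway", 1.0, 2.0, 2),
    UnitRequest("product-api", 1.0, 4.0, 2),
    UnitRequest("identity", 0.5, 2.0, 2),
    UnitRequest("ingress", 1.0, 2.0, 2),
]

cpu, mem = total_requests(units)
print(f"{cpu} vCPU, {mem} GiB")  # 7.5 vCPU, 21.0 GiB
```

The same calculation can be repeated with limits instead of requests to see the worst-case node pressure if every replica reaches its limit at once.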
Environment sizing bands
Use these bands as starting points for customer-side planning. They are cluster or environment-level references, not a replacement for the release-specific values file.
| Profile | Typical use | Compute baseline | Data services | Storage and I/O baseline |
|---|---|---|---|---|
| Single-node validation | Lab validation, training, configuration review, issue reproduction. | 8-12 vCPU, 32-48 GiB RAM on one VM or node. | Local or customer-provided database and cache. | 300-500 GiB SSD. Use for validation only. |
| Small production | One site, moderate users, limited connectors, standard Inspector or ECM workload. | 3 worker nodes, each 8 vCPU and 32 GiB RAM, plus control-plane nodes per customer standard. | PostgreSQL 4 vCPU and 16 GiB RAM; cache or queue 2 vCPU and 4 GiB RAM. | Database SSD with at least 3,000 IOPS; object storage 1-2 TiB; worker scratch 100-200 GiB. |
| Standard production | Multiple sites or departments, regular DFS sync, AI Agent workflows, document and evidence retention. | 3-5 worker nodes, each 16 vCPU and 64 GiB RAM. | PostgreSQL 8 vCPU and 32 GiB RAM; cache or queue 4 vCPU and 8 GiB RAM; search service 4 vCPU and 16 GiB RAM when enabled. | Database SSD with 6,000-10,000 IOPS; object storage 2-5 TiB; worker scratch 300-500 GiB. |
| Asset-heavy or Physical AI | Large models, frequent conversion, simulation, rendering, SimReady asset preparation, robotics training scenarios. | Standard production plus dedicated worker nodes with 16-32 vCPU and 64-128 GiB RAM. Add GPU nodes only when required. | PostgreSQL 8-16 vCPU and 32-64 GiB RAM; separate search or index capacity when retrieval is enabled. | Database SSD with 10,000+ IOPS; object storage 5 TiB or more; scratch storage 500 GiB or more with high sequential throughput. |
| High-control environment | Restricted network, offline package import, strict retention, separate validation and production paths. | Size production and validation separately. Keep spare capacity for offline upgrade validation. | Customer-managed HA database, cache or queue, internal registry, backup platform. | Add space for image archives, restore samples, logs, and release bundles. |
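One way to sanity-check a band is to compare the summed container requests with the worker capacity of that band, less some headroom. The sketch below is a minimal version of that check. The 30% headroom figure and the request totals are assumptions, and raw node size is used in place of node allocatable capacity, which is lower in practice because of system reservations.

```python
def fits_band(req_vcpu, req_mem_gib, nodes, node_vcpu, node_mem_gib, headroom_pct=30):
    """Return True when total requests fit in worker capacity minus headroom.

    headroom_pct is a hypothetical planning margin, not a product default.
    """
    usable_vcpu = nodes * node_vcpu * (100 - headroom_pct) / 100
    usable_mem = nodes * node_mem_gib * (100 - headroom_pct) / 100
    return req_vcpu <= usable_vcpu and req_mem_gib <= usable_mem


# Small production band: 3 worker nodes, each 8 vCPU and 32 GiB RAM.
small_ok = fits_band(7.5, 21.0, nodes=3, node_vcpu=8, node_mem_gib=32)

# Same requests with one worker node unavailable (maintenance or failure).
degraded_ok = fits_band(7.5, 21.0, nodes=2, node_vcpu=8, node_mem_gib=32)

# A heavier, hypothetical request total that no longer fits the small band.
heavy_ok = fits_band(20.0, 60.0, nodes=3, node_vcpu=8, node_mem_gib=32)

print(small_ok, degraded_ok, heavy_ok)  # True True False
```

When the check fails, the next band up is the usual starting point, subject to the validation workload.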
Storage and I/O recommendations
| Storage area | Recommended class | Planning guidance | Monitor |
|---|---|---|---|
| Database volume | SSD or high-performance block storage. | Start with the IOPS band in the sizing profile. Keep enough free space for indexes, migrations, backup staging, and restore tests. | IOPS saturation, latency, slow queries, lock wait, connection pressure. |
| Object storage | Customer object storage or S3-compatible service. | Capacity should cover source files, converted assets, documents, evidence, generated reports, retained versions, and lifecycle buffers. | Growth rate, large-object latency, failed uploads, lifecycle cleanup, restore samples. |
| Worker scratch | Fast local SSD or high-throughput ephemeral volume. | Model conversion and simulation workers need temporary space separate from object storage. Plan scratch size from the largest expected model plus derived files. | Temporary disk pressure, conversion duration, worker eviction, failed jobs. |
| Search or index volume | SSD persistent volume. | Allocate memory and disk together. Rebuild time should fit the maintenance window. | Query latency, index size, rebuild time, memory pressure. |
| Logs and audit records | Customer log platform or retained persistent storage. | Size by retention policy and export volume. High-control projects usually need separate audit retention. | Log growth, dropped logs, retention pressure, query time. |
| Backup target | Customer backup platform or object storage tier. | Backup throughput must fit the maintenance window. Include database, object storage, configuration, and release package evidence. | Backup duration, failed backup, restore duration, incomplete protected asset list. |
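The guidance that backup throughput must fit the maintenance window can be expressed as simple arithmetic. The sketch below assumes a full backup at a sustained rate with no deduplication, compression, or incremental strategy; the data size, window, and throughput values are hypothetical.

```python
def required_throughput_mib_s(data_gib, window_hours):
    """Sustained throughput needed to copy data_gib within window_hours."""
    return data_gib * 1024 / (window_hours * 3600)


def backup_hours(data_gib, throughput_mib_s):
    """Estimated duration of a full backup at a sustained throughput."""
    return data_gib * 1024 / throughput_mib_s / 3600


# Hypothetical case: 2 TiB (2048 GiB) of protected data, 4-hour window.
needed = required_throughput_mib_s(2048, 4)
print(round(needed, 1))  # 145.6 (MiB/s)

# Hypothetical backup target that sustains 200 MiB/s.
duration = backup_hours(2048, 200)
print(duration < 4)  # True
```

Restore duration can be estimated the same way, using the restore throughput of the backup target rather than its write throughput.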
I/O planning rules
- Put database volumes on SSD-class storage with predictable latency.
- Keep object storage optimized for large sequential upload and download traffic.
- Give model conversion and simulation workers dedicated scratch storage so temporary files do not compete with database I/O.
- Schedule large DFS sync jobs, model conversions, backups, and search reindexing in separate windows when the environment is small.
- Track storage growth by data class: models, converted assets, documents, inspection evidence, logs, database, and backups.
- Keep at least one restore sample in the validation plan before accepting the capacity baseline.
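Tracking growth by data class, as the rules above suggest, can start from a simple projection per class. The sketch below assumes linear monthly growth and a 20% lifecycle buffer; the class names, sizes, and growth rates are hypothetical and should be replaced with measured values.

```python
def projected_gib(current_gib, monthly_growth_gib, months, buffer_pct=20):
    """Linear projection plus a percentage buffer for versions and lifecycle overhead."""
    return (current_gib + monthly_growth_gib * months) * (100 + buffer_pct) / 100


# Hypothetical (current GiB, monthly growth GiB) per data class.
data_classes = {
    "models": (200, 20),
    "inspection_evidence": (100, 50),
    "logs": (50, 10),
}

projection = {
    name: projected_gib(current, growth, months=12)
    for name, (current, growth) in data_classes.items()
}
total_gib = sum(projection.values())

for name, size in projection.items():
    print(f"{name}: {size} GiB")
print(f"total: {total_gib} GiB")  # total: 1572.0 GiB
```

Keeping the classes separate makes it easier to see which retention or lifecycle policy to adjust when the total grows faster than planned.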
Inputs
| Input | Required detail | Capacity impact |
|---|---|---|
| Environments | Production, validation, training, disaster recovery, and lab environments. | Determines total cluster, VM, storage, backup, and monitoring footprint. |
| User workload | Named users, active users, peak concurrent sessions, user groups, site time zones, client types. | Drives web/API replicas, session load, network throughput, and support windows. |
| Scene and asset workload | Number of scenes, largest model size, model conversion frequency, media files, downloads, field-device cache behavior. | Drives object storage, model processing workers, cache, and backup volume. |
| DFS and integration workload | Source systems, connector count, sync cadence, batch size, retry policy, write-back requirements. | Drives connector workers, queue depth, database I/O, network routes, and source-system limits. |
| AI Agent workload | Workflow concurrency, tool-call volume, document retrieval, scheduled automation, approval queues. | Drives worker concurrency, queue capacity, database load, and optional private inference capacity. |
| Simulation or Physical AI workload | Simulation jobs, asset preparation, rendering, physics validation, robotics or training scenarios when in scope. | May require dedicated worker nodes, GPU-enabled nodes, larger storage, and longer validation windows. |
| ECM and evidence workload | Documents, SOPs, images, inspection evidence, audit records, retention period. | Drives object storage, database records, index size, backup window, and restore test scope. |
| Operations policy | Availability target, maintenance window, log retention, backup frequency, recovery objective. | Drives redundancy, monitoring, log storage, backup infrastructure, and restore process. |
Planning steps
- Select the sizing band that matches the product scope and expected workload.
- Map enabled products to deployment units and identify which units need dedicated workers.
- Fill the sizing worksheet for users, scenes, assets, integrations, AI Agent workflows, ECM documents, and retention requirements.
- Define CPU requests, memory requests, limits, replicas, storage classes, persistent volumes, and namespace quotas.
- Define database IOPS, object storage capacity, worker scratch size, search index size, log retention, and backup target throughput.
- Separate steady workloads from burst workloads such as model conversion, scheduled synchronization, batch import, search indexing, and simulation jobs.
- Define scaling triggers for replicas, worker count, storage expansion, database tuning, connector scheduling, and backup windows.
- Run a validation workload with representative users, source records, scenes, documents, and client devices.
- Record the baseline capacity, known assumptions, headroom, review cadence, and owner for each resource domain.
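The scaling triggers defined in the steps above can be recorded as data so that they are reviewable alongside the baseline. The sketch below is one possible shape. The metric names, thresholds, and actions are all hypothetical and are not product defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trigger:
    metric: str       # hypothetical metric name from the monitoring platform
    threshold: float  # value at or above which the trigger fires
    action: str       # agreed response recorded in the capacity baseline


def fired_actions(triggers, metrics):
    """Return the actions whose metric is at or above its threshold.

    Metrics that are missing from the sample are treated as 0.0.
    """
    return [t.action for t in triggers if metrics.get(t.metric, 0.0) >= t.threshold]


triggers = [
    Trigger("api_cpu_utilization", 0.70, "add API replica"),
    Trigger("connector_queue_depth", 1000, "add connector worker"),
    Trigger("object_storage_used_fraction", 0.80, "expand object storage"),
]

# Hypothetical measurements from a validation run.
sample = {
    "api_cpu_utilization": 0.75,
    "connector_queue_depth": 200,
    "object_storage_used_fraction": 0.85,
}

print(fired_actions(triggers, sample))
# ['add API replica', 'expand object storage']
```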
Sizing worksheet
| Worksheet item | Record |
|---|---|
| Peak concurrent users | Business peak, site peak, client type, expected growth, validation sample. |
| Largest operational scene | Scene size, asset count, media count, target devices, download behavior. |
| Integration schedule | Source system, sync frequency, batch size, allowed window, retry policy. |
| AI Agent concurrency | Workflow type, scheduled runs, manual runs, tool-call volume, approval queue. |
| Storage growth | Object storage growth, database growth, log growth, retention period. |
| Backup and restore | Backup frequency, backup window, restore objective, restore sample set. |
| High availability | Replica policy, node spread, database availability pattern, maintenance window. |
| Optional GPU workload | Simulation, rendering, model processing, private inference, validation runtime. |
Validation checklist
- Representative users can complete target workflows during the expected peak window.
- Connector jobs finish inside the approved synchronization window.
- Model conversion, asset loading, and document access meet acceptance expectations.
- AI Agent workflows and approval queues do not create unbounded backlog.
- Database, queue, cache, and object storage metrics remain within the agreed operating range.
- Backup completes inside the approved window and restore sampling succeeds.
- Alerts exist for CPU, memory, pod restart, queue backlog, database connection pressure, storage growth, and backup failure.
- The customer owner has approved the baseline and review cadence.
Expected result
The expected output is a capacity baseline that covers product deployment units, workload assumptions, initial resource requests and limits, database and storage I/O assumptions, backup estimates, scaling triggers, validation evidence, and an owner for each future capacity review.
Troubleshooting capacity gaps
| Symptom | Check |
|---|---|
| Users report slow pages during peak hours | Concurrent sessions, ingress capacity, API replicas, database latency, cache hit rate. |
| Connector jobs miss the sync window | Source-system limits, batch size, worker count, queue depth, retry policy, network route. |
| Model or asset tasks take too long | Worker resources, asset size, storage throughput, conversion queue, optional GPU worker need. |
| Storage grows faster than expected | Retention policy, duplicate uploads, log retention, imported file lifecycle, backup copies. |
| Backup overruns the maintenance window | Protected asset list, object storage volume, database size, backup target throughput, schedule. |
| Resource requests block deployment | Namespace quota, node capacity, storage class availability, OpenShift project limits, cluster policy. |
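For the missed sync window symptom, a back-of-the-envelope estimate can show whether batch size or worker count is the likely constraint. The sketch below assumes batches are spread evenly across workers and that each batch takes a constant time, which real source systems rarely guarantee; all numbers are hypothetical.

```python
import math


def sync_duration_minutes(records, batch_size, seconds_per_batch, workers=1):
    """Estimate sync duration, assuming evenly distributed, constant-time batches."""
    batches = math.ceil(records / batch_size)
    return batches * seconds_per_batch / workers / 60


# Hypothetical job: 500,000 records, 1,000 per batch, 6 seconds per batch.
one_worker = sync_duration_minutes(500_000, 1000, 6, workers=1)
two_workers = sync_duration_minutes(500_000, 1000, 6, workers=2)

print(one_worker, two_workers)  # 50.0 25.0
```

If the estimate fits the window but the measured job does not, source-system limits, retries, or the network route are more likely causes than worker count.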
Related pages
- Use Deployment Models to choose the customer-side deployment pattern.
- Use Container Deployment to implement the runtime.
- Use Environment Readiness to prepare owners, network, identity, and support inputs.
- Use Operations and Maintenance to review capacity after go-live.