Scaling Configurations

This page explains Scaling Configurations in the Hostless provider application. A Scaling Configuration defines how your application automatically scales its resources in response to demand. Hostless dynamically adjusts the number of application instances and their resource allocation, so your application can handle fluctuating workloads efficiently.

Why Use Scaling Configurations?

  • Automatic Scaling: Scaling configurations eliminate the need for manual intervention to adjust resources as your application's traffic or processing needs change.
  • Improved Performance: Automatic scaling ensures your application has sufficient resources to handle peak loads, preventing performance degradation due to resource constraints.
  • Cost Optimization: By scaling down resources during low-traffic periods, you can optimize your Hostless resource utilization and potentially reduce costs.
  • High Availability: Scaling configurations play a critical role in achieving High Availability (HA) for your application. By scaling automatically in response to failures or increased demand, they help minimize downtime and keep your application running continuously.

Scaling Configuration Parameters

A Scaling Configuration sets how many replicas (instances) of your application Hostless runs and the minimum and maximum resources allocated to each replica. The parameters are:

  • Minimum Replicas (Instances): Set the minimum number of application instances that Hostless always keeps running, regardless of the workload. This ensures your application has a base level of resources available.
  • Maximum Replicas (Instances): Define the maximum number of application instances Hostless can launch to handle peak workloads. This helps prevent uncontrolled scaling beyond your desired capacity.
  • Max Concurrent Requests per Replica: Specify the maximum number of concurrent requests a single application instance can handle. When this limit is reached, Hostless adds more instances, provided the Maximum Replicas (Instances) limit has not been reached.
  • Minimum Memory per Replica: Define the minimum amount of memory allocated to each application instance (in MB). This ensures each instance has sufficient memory to function properly.
  • Maximum Memory per Replica: Set the maximum memory limit per application instance (in MB). This can help prevent individual instances from consuming excessive resources.
  • Minimum CPU per Replica: Define the minimum number of virtual CPUs (vCPU) allocated to each application instance. vCPUs provide processing power for each instance.
  • Maximum CPU per Replica: Set the maximum vCPU limit per application instance. This can help control overall resource consumption within your Hostless environment.
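
The way these parameters relate to each other can be sketched in Python. This is an illustration only, not a Hostless API: the ScalingConfig class, its field names, and the desired_replicas function are hypothetical, and the rounding rule is an assumption. The documentation above states only that Hostless adds instances when the per-replica concurrency limit is reached, and that the replica count stays between the minimum and maximum.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingConfig:
    """Hypothetical container for the parameters listed above (not a Hostless API)."""

    min_replicas: int
    max_replicas: int
    max_concurrent_requests_per_replica: int
    min_memory_mb: int
    max_memory_mb: int
    min_cpu: float  # vCPUs
    max_cpu: float  # vCPUs

    def __post_init__(self):
        # Each minimum must not exceed its matching maximum.
        if self.min_replicas < 0 or self.max_replicas < self.min_replicas:
            raise ValueError("need 0 <= min_replicas <= max_replicas")
        if self.max_concurrent_requests_per_replica < 1:
            raise ValueError("max_concurrent_requests_per_replica must be >= 1")
        if self.min_memory_mb <= 0 or self.max_memory_mb < self.min_memory_mb:
            raise ValueError("need 0 < min_memory_mb <= max_memory_mb")
        if self.min_cpu <= 0 or self.max_cpu < self.min_cpu:
            raise ValueError("need 0 < min_cpu <= max_cpu")


def desired_replicas(config: ScalingConfig, concurrent_requests: int) -> int:
    """Estimate a replica count for a given number of in-flight requests.

    Assumption: enough replicas are wanted that none exceeds its concurrency
    limit, clamped to the configured minimum and maximum replica counts.
    """
    if concurrent_requests < 0:
        raise ValueError("concurrent_requests must be >= 0")
    needed = math.ceil(concurrent_requests / config.max_concurrent_requests_per_replica)
    return max(config.min_replicas, min(needed, config.max_replicas))


if __name__ == "__main__":
    config = ScalingConfig(
        min_replicas=1,
        max_replicas=5,
        max_concurrent_requests_per_replica=100,
        min_memory_mb=256,
        max_memory_mb=1024,
        min_cpu=0.5,
        max_cpu=2.0,
    )
    for load in (0, 250, 1000):
        print(load, "concurrent requests ->", desired_replicas(config, load), "replicas")
```

With these example values, an idle application stays at the minimum of 1 replica, 250 concurrent requests call for 3 replicas, and 1000 concurrent requests are capped at the maximum of 5 replicas.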