For any type of scaling, consider the architecture of your application and what optimizations may be possible. For example, static content should be served from optimized content delivery networks that offer price and performance benefits. Additional caching for the data storage backend may also improve performance and minimize the requirements for expensive scaling.
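As one illustration of caching in front of the data storage backend, here is a minimal sketch of an in-memory read-through cache with a time-to-live. The names `TTLCache`, `loader`, and `ttl_seconds` are hypothetical and chosen for this example; a production system would more likely use a shared cache service and would need an invalidation strategy.

```python
import time


class TTLCache:
    """Tiny in-memory read-through cache with per-entry expiry (illustrative only)."""

    def __init__(self, loader, ttl_seconds=30.0, clock=time.monotonic):
        self._loader = loader      # hypothetical function that reads from the slow backend
        self._ttl = ttl_seconds    # how long a cached value stays fresh
        self._clock = clock        # injectable so expiry can be tested without sleeping
        self._entries = {}         # key -> (expires_at, value)
        self.hits = 0
        self.misses = 0

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        # Missing or expired: fall through to the backend and remember the result.
        self.misses += 1
        value = self._loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value
```

Tracking `hits` and `misses` gives a rough hit ratio, which helps judge whether the cache actually reduces load on the backend before you pay for more capacity.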
Performance testing and benchmarks
Benchmarking and performance testing can identify potential performance bottlenecks during development, help track trends over time, and support decision making by comparing architectures, technology stacks, cloud providers, or other options.
Load testing uses predetermined, controlled types of load, traffic, or data to measure the performance of your backend against your targets. You can see how various load levels impact your application's performance.
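The idea of applying a controlled load and measuring the result can be sketched with the Python standard library alone. The function names (`run_load_test`, `timed_call`, `percentile`) and the `target` callable are assumptions made for this example; in practice `target` would issue a request to your backend, and a dedicated tool would generate the load.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def timed_call(target):
    """Call target once; return (duration in seconds, whether it succeeded)."""
    start = time.perf_counter()
    try:
        target()
        ok = True
    except Exception:
        ok = False
    return time.perf_counter() - start, ok


def run_load_test(target, total_requests, concurrency):
    """Send a fixed number of calls at a fixed concurrency and summarize them."""
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(lambda _: timed_call(target), range(total_requests)))
    elapsed = time.perf_counter() - started
    latencies = sorted(duration for duration, _ in results)
    return {
        "requests": total_requests,
        "errors": sum(1 for _, ok in results if not ok),
        "p50_s": percentile(latencies, 50),
        "p95_s": percentile(latencies, 95),
        "throughput_rps": total_requests / elapsed,
    }
```

Running the same function at several `concurrency` values and comparing the summaries shows how latency and error counts change as the load level rises.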
Scalability testing focuses on analyzing your application's ability to scale up and scale out. Scalability testing shows how your application responds to increased loads and how the backend (including data storage) adapts.
Define clear targets before you begin benchmarking the application, including performance (for example, latency and throughput), resource utilization (CPU utilization, memory usage, and network traffic between backend components), and cost. Also account for any delays in scaling, for example when the application must scale beyond an allocated set of "reserve" resources during bursts or spikes in traffic.
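Writing the targets down as data makes it possible to check each benchmark run against them automatically. The sketch below assumes a hypothetical `Targets` record and a `measured` dictionary with matching keys; the field names, units, and thresholds are illustrative, not recommendations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Targets:
    max_p95_latency_ms: float    # performance: latency ceiling
    min_throughput_rps: float    # performance: throughput floor
    max_cpu_utilization: float   # resource utilization, as a fraction from 0 to 1
    max_hourly_cost: float       # cost ceiling, in your billing currency


def find_violations(measured, targets):
    """Return the names of measured metrics that miss their target."""
    checks = [
        ("p95_latency_ms", measured["p95_latency_ms"] <= targets.max_p95_latency_ms),
        ("throughput_rps", measured["throughput_rps"] >= targets.min_throughput_rps),
        ("cpu_utilization", measured["cpu_utilization"] <= targets.max_cpu_utilization),
        ("hourly_cost", measured["hourly_cost"] <= targets.max_hourly_cost),
    ]
    return [name for name, ok in checks if not ok]


# Hypothetical numbers for a single benchmark run.
targets = Targets(
    max_p95_latency_ms=200.0,
    min_throughput_rps=400.0,
    max_cpu_utilization=0.80,
    max_hourly_cost=4.00,
)
measured = {
    "p95_latency_ms": 180.0,
    "throughput_rps": 450.0,
    "cpu_utilization": 0.92,
    "hourly_cost": 3.10,
}
violations = find_violations(measured, targets)
```

In this made-up run, latency, throughput, and cost meet their targets while CPU utilization does not, so `violations` contains only `"cpu_utilization"`.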
Many tools are available for benchmarking and performance testing, such as Apache JMeter and Locust. When selecting a testing tool, consider the types of tests that are available, including whether or not you need support for scripting, IDE integrations for debugging, additional plug-ins, or support for the kind of traffic and scale to be tested.
If you use a cloud provider, check whether it has additional requirements or best practices for load testing so that you avoid potential restrictions. For example, consider the best practices for Cloud Run.
Cost and performance considerations
While scaling up is essential to improve performance, scaling down is just as important for minimizing cost. Consider the baseline cost of your backend application when it receives no requests, as well as the cost of scaling the application up.
You may have fixed costs for on-premises or server-based architectures, regardless of utilization. Some cloud environments allow you to "scale to zero" to avoid costs when no requests are made. Cloud providers offer pricing calculators that let you explore different configurations and pricing strategies, such as pre-commitments for resources.
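The trade-off between a fixed baseline cost and a scale-to-zero option can be made concrete with simple arithmetic. The sketch below uses a deliberately simplified model (a flat monthly baseline plus a flat price per million requests); the function names and all prices are hypothetical, and real pricing usually has more dimensions, so use your provider's calculator for actual estimates.

```python
def monthly_cost(baseline, cost_per_million_requests, requests):
    """Monthly cost under a simplified model: flat baseline plus a per-request charge."""
    return baseline + cost_per_million_requests * requests / 1_000_000


def break_even_requests(fixed_baseline, cost_per_million_requests):
    """Monthly request volume at which a scale-to-zero option (no baseline)
    costs the same as a fixed-cost option (no per-request charge)."""
    return fixed_baseline * 1_000_000 / cost_per_million_requests


# Hypothetical prices: a server at 100.00 per month versus 2.00 per million requests.
fixed_option = monthly_cost(baseline=100.0, cost_per_million_requests=0.0, requests=5_000_000)
scale_to_zero_option = monthly_cost(baseline=0.0, cost_per_million_requests=2.0, requests=5_000_000)
crossover = break_even_requests(fixed_baseline=100.0, cost_per_million_requests=2.0)
```

With these made-up prices, 5 million requests per month cost 100.00 on the fixed option and 10.00 on the scale-to-zero option, and the two only cost the same at 50 million requests per month.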