Scaling server hardware for performance, cost and ease of maintenance raises many questions.
Add load balancers, soon-to-come technologies and hardware-dependent software licenses to the mix, and making the right choices quickly becomes complicated.
This article aims to give you some general guidelines and directions for choosing and dimensioning servers and hardware. The scope here is servers and hosting equipment only:
The application stack often plays a bigger role than the servers/hardware in the total performance.
In web applications, you will generally find 80% of the potential performance improvements in the front-end, so make sure you pick the lowest-hanging fruit first. We will deal with that in separate articles about speed optimization and front-end optimization.
We invite you to contact us for further guidance on your specific setup. It is not unusual for us to find substantial improvements in system performance while lowering the cost at the same time.
Choosing server CPUs
For some applications, changing to a newer and more powerful CPU can be the single most important hardware upgrade, in terms of performance benefit. But as always, it depends.
It is not really a question of whether more CPU core performance will be better (it will). The interesting question is whether a higher core clock will be better than more cores with a lower core clock. Take for example the dual-socket Intel Xeon E5-2600 v2 family and compare these 2 CPUs:
Ignoring the cost difference of $1,100, the 2643 has a 30% higher clock speed than the 2697, but only half the cores. When would you choose one over the other?
If you are running virtual servers, the relevant comparison could be this: How will your application perform on six 2-core virtual servers, each with a 3.5 GHz CPU clock, compared to twelve 2-core virtual servers, each with a 2.7 GHz CPU clock?
The answer depends largely on the software, the system and the load pattern. When we talk about performance, it is important to understand the difference between load scaling (more users) and speed for each user. In general:
If you want both load scaling (many users) and high speed for each user, you will need enough servers with enough CPU cores to handle the load scaling, and the CPUs should have the highest clock speed you can get. This assumes, of course, that your application is free from bottlenecks elsewhere.
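As a rough illustration of this trade-off, the sketch below compares the two example CPUs in a dual-socket server. The core counts per server (12 and 24) follow from the virtual server comparison above. Using cores times clock as a stand-in for total capacity is our simplifying assumption: it ignores cache size, turbo modes, memory bandwidth and how well the application parallelizes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpuOption:
    name: str
    cores: int        # total physical cores in the server
    clock_ghz: float  # clock speed per core

    @property
    def aggregate_ghz(self) -> float:
        # Crude capacity proxy (assumption): cores multiplied by clock speed.
        return self.cores * self.clock_ghz


def compare(fast: CpuOption, wide: CpuOption) -> dict:
    """Compare a high-clock option against a many-cores option."""
    return {
        # How much faster a single core is: relevant for speed per user.
        "per_core_speedup": fast.clock_ghz / wide.clock_ghz,
        # How much more total capacity the many-cores option has:
        # relevant for load scaling.
        "aggregate_ratio": wide.aggregate_ghz / fast.aggregate_ghz,
    }


high_clock = CpuOption("2 x E5-2643 v2", cores=12, clock_ghz=3.5)
many_cores = CpuOption("2 x E5-2697 v2", cores=24, clock_ghz=2.7)

result = compare(high_clock, many_cores)
print(f"Per-core speed advantage of {high_clock.name}: "
      f"{(result['per_core_speedup'] - 1) * 100:.0f}%")
print(f"Aggregate capacity advantage of {many_cores.name}: "
      f"{(result['aggregate_ratio'] - 1) * 100:.0f}%")
```

The two numbers pull in opposite directions, which is why the right choice depends on whether you are limited by speed for each user or by the number of concurrent users.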
Many believe that if a system runs with low CPU/core utilization, like 15-20%, there will be no benefit from changing to a higher core clock CPU.
But there typically will. On a web server with an average CPU load in the 10-15% region, we have recorded 12-15% faster page load speed just from increasing CPU core speed by 15%.
Extrapolating those numbers to compare the core speed of the 1.8 - 2.2 GHz CPUs typically used at many (most) hosting providers with the core speed of a 3.5 GHz CPU, we are looking at a potential page load speed increase of 40-60%.
A word on percentages and absolute numbers: Be aware of how you measure and report speed improvements. For example, a 200 ms reduction in page load time might be a 20% improvement for users close to the web servers. But the same 200 ms improvement could be an insignificant 3% improvement for users in other regions, because of higher latency and lower bandwidth.
What you really should do is measure and report page load speed from close to the same location as the majority of your user base.
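The arithmetic behind the 20% and 3% figures can be sketched as follows. The two baseline page load times are hypothetical values that we picked so the numbers match the example; measure your own baselines from where your users are.

```python
def relative_improvement(baseline_ms: float, saved_ms: float) -> float:
    """Return the time saved as a percentage of the baseline page load time."""
    if baseline_ms <= 0:
        raise ValueError("baseline_ms must be positive")
    return 100.0 * saved_ms / baseline_ms


SAVED_MS = 200.0

# Hypothetical baseline page load times (assumptions for illustration).
baselines_ms = {
    "users close to the web servers": 1000.0,
    "users in a distant region": 6700.0,
}

for location, baseline in baselines_ms.items():
    percent = relative_improvement(baseline, SAVED_MS)
    print(f"{location}: {SAVED_MS:.0f} ms saved of {baseline:.0f} ms "
          f"= {percent:.0f}% improvement")
```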
At the time of writing, Intel Haswell is the newest CPU generation available. The E3-1200 is available now on Haswell, called E3-1200 v3. The E5-1600, E5-2600 and E5-4600 are on Ivy Bridge, called v2, and not yet on Haswell. The next generation of Intel CPUs to surface after Haswell is Broadwell.
You can expect modest performance improvements for each new generation, because of improved CPU architecture alone. Additionally, the change to a newer CPU generation will often give you options of more cores and/or higher clock speeds than the former generation.
A 5-20% performance improvement from a newer generation CPU is common. If you are planning an upgrade or new servers, it might be worth checking when the next generation is available. Waiting 1-6 months to do an upgrade might be a good trade-off.
Also check whether there are motherboard/chipset options that will support the next generation of CPUs. That would make a later CPU upgrade much easier and less costly.
Other benefits from newer CPU generations and chipsets can include support for newer and faster memory technology and a faster PCI interface. Keep an eye out for those if your application responds well to faster memory (most do) or faster I/O.
While we at Solido have not yet done performance testing with different memory speeds, it is clear from tests performed elsewhere that most applications will respond nicely to faster memory.
It is probably a fair guess that applications and servers that make heavy use of RAM will benefit more from faster memory. Examples are web servers with a large effective HTTP cache or application code cache, or dedicated caching servers like Varnish Cache.
In short, use the fastest memory you can get your hands on.
Professional servers and rented virtual servers will always use ECC (Error-Correcting Code) RAM. The overhead from the error correction reduces the throughput of ECC RAM somewhat, but because of the risk of application failures, non-ECC RAM is not recommended for any professional use.
At the time of writing, 1866 MHz ECC DDR3 RAM is the fastest you can get for the E5-2600 v2 and E5-4600 v2 (Ivy Bridge) CPUs. For the newer generation E3-1200 v3 (Haswell) CPUs you can get up to 2133 MHz ECC DDR3 RAM.
Around the corner awaits DDR4 technology, with promises of speed increases in the 50% region over DDR3. You can expect DDR4 RAM to be available with the coming Intel Broadwell CPUs in professional servers and from hosting providers in late 2015 or early 2016.
A couple of recommendations for the amount of memory you should make available for your servers:
Some application and web servers have no problem utilizing as much as 32 or 64 GB RAM, while others are more comfortable with 4 or 8 GB.
Being a hosting provider, we will be the first to acknowledge the benefits of virtualization. There are many, and you should consider virtualizing your setup, even if you run your own servers in a colocation setup.
Pros of virtualization:
Cons of virtualization:
For very I/O intensive applications, like databases, some prefer to avoid the virtualization layer because of the higher I/O latency. But don't take for granted that the reduced I/O performance will have any significance in your setup. The benefits of virtualization might make up for it. So test it.
Software licensing considerations
Software licensing models often use the number of CPU cores or CPU sockets to determine the license fee you have to pay. Even the type of CPU used is part of some software license models.
Sometimes this can have a big impact on how you want to dimension application and database servers:
Typically the software vendors have different license models to choose from. You might be able to switch to a license model depending only on the number of registered software users, or the number of employees in your company. Or a mix of several license models.
Software license cost can sometimes be so significant that you need to consider alternatives for your application server stacks, for example reducing the number of application servers and using caching servers to replace some of them.
Figuring out the most cost-effective setup for software licenses is far from trivial, and can be very time consuming.
This is part of the reason why many companies seek to use Open Source software in their application stacks. Apart from the direct cost savings, they have the freedom to dimension their server hardware as they see fit, and they don't have to spend time navigating and optimizing for complicated license models.
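To show how the license model can change which server is cheapest, here is a minimal sketch. The prices and the two license models are entirely hypothetical and not taken from any vendor; only the structure of the calculation is the point.

```python
def license_cost(sockets: int, cores_per_socket: int,
                 per_core_price: float = 0.0,
                 per_socket_price: float = 0.0) -> float:
    """Yearly license cost for one server under a simple per-core
    and/or per-socket model (hypothetical pricing structure)."""
    total_cores = sockets * cores_per_socket
    return total_cores * per_core_price + sockets * per_socket_price


# Two dual-socket servers: fewer fast cores versus more slower cores.
servers = {
    "2 x 6 cores": {"sockets": 2, "cores_per_socket": 6},
    "2 x 12 cores": {"sockets": 2, "cores_per_socket": 12},
}

# Hypothetical prices, chosen only to illustrate the effect.
PER_CORE_PRICE = 1000.0
PER_SOCKET_PRICE = 8000.0

for name, spec in servers.items():
    by_core = license_cost(**spec, per_core_price=PER_CORE_PRICE)
    by_socket = license_cost(**spec, per_socket_price=PER_SOCKET_PRICE)
    print(f"{name}: per-core model {by_core:.0f}, "
          f"per-socket model {by_socket:.0f}")
```

Under the per-core model the server with fewer, faster cores costs half as much to license, while under the per-socket model the two servers cost the same.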
A sensible place to start is to find the existing bottlenecks in your system. Removing those will reveal which hardware components hold your application back from further performance improvements and future scaling for load.
Maybe you already know the bottlenecks in your setup. The high I/O to the database server during peak hours slowing delivery down. Or the CPU on the application servers hitting 70%+ marks for long periods. Or something else.
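A check for the "70%+ for long periods" pattern can be sketched as below. It assumes you already collect CPU utilization samples at a fixed interval from your monitoring system; the sample values, the 70% threshold and the minimum run length are illustrative assumptions.

```python
def longest_run_above(samples, threshold=70.0):
    """Return the length of the longest run of consecutive samples
    at or above the threshold."""
    longest = current = 0
    for value in samples:
        if value >= threshold:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


def is_sustained(samples, threshold=70.0, min_consecutive=5):
    """True if utilization stayed at or above the threshold for at
    least min_consecutive samples in a row."""
    return longest_run_above(samples, threshold) >= min_consecutive


# Hypothetical one-minute CPU utilization samples in percent.
peak_hour = [40, 72, 75, 80, 71, 90, 50, 95]
short_spike = [10, 95, 10]

print(is_sustained(peak_hour))    # five samples in a row at 70% or more
print(is_sustained(short_spike))  # a single spike is not a sustained load
```

Separating sustained load from short spikes matters, because only the former points to a component that is a candidate for an upgrade.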
Be aware that you can have bottlenecks in your setup that are not immediately apparent. Some examples:
Another typical problem is periodic application slowdowns, where the reasons are unknown and there are no obvious server/hardware bottlenecks. Performance problems like these can be anything from web servers running out of available threads, to buggy firmware in the database server's RAID controller. Whatever it is, it is important to locate the source of the problem and fix it before moving on.
Before planning to remove bottlenecks with more powerful hardware or changes in hosting setup, consider if it would be more effective to make changes in the application stack or its coding, to relieve the web servers or database servers from too much work.
Apply common sense when identifying bottlenecks for performance. A few concurrent users maxing out the hardware resources on reasonably dimensioned servers could indicate problems with the application or the system configuration. No server or hardware upgrades are going to fix that in the long term.
Checking your servers and hardware for bottlenecks can be done with tools like JMeter and a variety of online load and stress testing tools. If you use such tools, try to have them mimic typical user behavior as closely as possible. Take care, of course, to run load and stress testing on a non-production environment.
Now that you know the existing bottlenecks and you are confident that server, hardware or hosting setup can help to remove them, you should take a moment to think about future scaling.
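JMeter is driven by its own test plans, so the following is not a JMeter example. It is a minimal Python sketch of what such tools do in principle: run a number of simulated users in parallel and record response times. The `fake_request` function is a hypothetical stand-in that only sleeps; you would replace it with a real request against a non-production environment.

```python
import math
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def simulate_user(request_fn, requests_per_user):
    """Run requests one after another and record each latency in seconds."""
    latencies = []
    for _ in range(requests_per_user):
        start = time.perf_counter()
        request_fn()
        latencies.append(time.perf_counter() - start)
    return latencies


def run_load(request_fn, users, requests_per_user):
    """Run the simulated users concurrently and collect all latencies."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        futures = [
            pool.submit(simulate_user, request_fn, requests_per_user)
            for _ in range(users)
        ]
        return [latency for f in futures for latency in f.result()]


def summarize(latencies):
    """Return request count, median and 95th percentile (nearest rank)."""
    ordered = sorted(latencies)
    rank = math.ceil(0.95 * len(ordered))
    p95_index = min(len(ordered) - 1, max(0, rank - 1))
    return {
        "count": len(ordered),
        "median_s": statistics.median(ordered),
        "p95_s": ordered[p95_index],
    }


def fake_request():
    # Hypothetical stand-in for an HTTP call to a test environment.
    time.sleep(0.001)


stats = summarize(run_load(fake_request, users=4, requests_per_user=5))
print(stats)
```

Reporting the 95th percentile next to the median is a deliberate choice: periodic slowdowns tend to show up in the slowest requests long before they move the median.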
If your company's mission is to reach 300% more users in 2 years and 1000% more in 5 years, you would want a setup that supports this load increase. For example a setup where you can easily add more servers, add caching layers, use load balancing or maybe use multiple hosting locations.
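Note that "300% more users" means four times the current number, and "1000% more" means eleven times. The sketch below turns such growth targets into a server count. The current user count and the users-per-server figure are hypothetical, and the calculation assumes that capacity scales linearly with the number of servers, which real systems rarely do perfectly.

```python
import math


def growth_factor(percent_more: float) -> float:
    """Convert 'X% more users' into a multiplier of today's user count."""
    return 1.0 + percent_more / 100.0


def servers_needed(current_users: int, users_per_server: int,
                   percent_more: float) -> int:
    """Servers required for the target load, assuming linear scaling."""
    target_users = current_users * growth_factor(percent_more)
    return math.ceil(target_users / users_per_server)


# Hypothetical starting point (assumptions for illustration).
CURRENT_USERS = 10_000
USERS_PER_SERVER = 2_500

for label, percent in [("today", 0), ("in 2 years", 300), ("in 5 years", 1000)]:
    count = servers_needed(CURRENT_USERS, USERS_PER_SERVER, percent)
    print(f"{label}: x{growth_factor(percent):.0f} users, {count} servers")
```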
If your database is already running on a 4-socket high-end CPU server with all memory slots filled to the brim, now might be a good time to start thinking about how to scale your database across more servers. The alternative - upgrading your single high-end database server - could prove costly and is not likely to handle 1000% more user load by itself.
Generally speaking, system cost per user should be decreasing when scaling for more users. If it does not, you should have a hard look for problems in the chosen platform, systems, setup and application delivery stack.
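A simple way to keep an eye on this is to compute the cost per user for each planned scaling stage and check that it keeps falling. The user counts and monthly costs below are hypothetical numbers for illustration only.

```python
def cost_per_user(total_cost: float, users: int) -> float:
    """System cost divided by the number of users served."""
    if users <= 0:
        raise ValueError("users must be positive")
    return total_cost / users


def is_cost_per_user_decreasing(stages) -> bool:
    """stages is a list of (users, total_cost) tuples ordered by growth.
    Returns True if every stage is cheaper per user than the one before."""
    values = [cost_per_user(cost, users) for users, cost in stages]
    return all(later < earlier for earlier, later in zip(values, values[1:]))


# Hypothetical scaling plans: (users, total monthly system cost).
healthy_plan = [(10_000, 4_000.0), (40_000, 12_000.0), (110_000, 27_500.0)]
worrying_plan = [(10_000, 4_000.0), (40_000, 20_000.0)]

print(is_cost_per_user_decreasing(healthy_plan))
print(is_cost_per_user_decreasing(worrying_plan))
```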
Dedicated cache servers or more application servers?
Some applications do not scale easily across more application servers. There can be obstacles in the synchronization between clustered/parallel applications or how fast the data synchronization can happen.
Other applications will scale up to a certain point, until the overhead from the synchronization limits further scaling.
Caching servers like Varnish Cache or Nginx have the potential of not only offloading the application servers, but also speeding up content delivery significantly.
There are drawbacks too, which could be relevant to you:
A caching server like Varnish can easily replace many application servers on the same or similar hardware. So in terms of performance scaling and cost, there can be significant benefits.
Performance scaling apart, caching servers are often used to fix performance problems in application servers or - even more common - Content Management Systems.
Keep in mind that you need at least 2 caching servers if you want redundancy and high availability. With 2 or more caching servers, a load balancer is typically used in front of the caching servers to ensure failover and load distribution.
In recent years SSD storage has conquered the storage market big time, and it is not hard to understand why. In performance metrics, SSD drives beat mechanical hard drives by a factor of up to 100 or more.
You can expect sequential data transfers in the region of 2-4 GB/s with a 6-drive SSD RAID10 and around 400k-500k read/write IOPS (4K random file transfer). Compare that to around 5k IOPS from a similar setup with 10k rpm SAS drives.
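Using only the IOPS figures quoted above, the resulting performance factor works out as follows.

```python
# Random 4K IOPS figures quoted above for comparable 6-drive RAID10 setups.
ssd_iops_range = (400_000, 500_000)
sas_10k_iops = 5_000

low_factor, high_factor = (iops / sas_10k_iops for iops in ssd_iops_range)

print(f"SSD RAID10 delivers roughly {low_factor:.0f}x to {high_factor:.0f}x "
      f"the random IOPS of the 10k rpm SAS setup")
```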
People who have upgraded from a mechanical SATA drive in their PC to an SSD will have realized how much of a performance bottleneck the mechanical drive was.
If you have ever performed maintenance on, for example, an operating system running on an SSD RAID, you will appreciate the crisp response and high speed of the storage, as rebooting an OS will typically be done in a few seconds. This is as much about time savings as it is about the convenience of avoiding noticeable downtime.
The only downsides of SSD drives are, or until recently have been:
Even large professional SANs in data centers are now starting to become available with SSD drives and most hosting providers are in the process of expanding their offerings with SSD SAN storage.
We recommend SSD storage for most use cases, except for large storage requirements, where the higher cost of SSD storage might still be significant. Keep in mind though, that the pricing of SSD storage is still decreasing, so the cost argument against SSD storage might not be true for long.
Just like with SAS/SATA drives, SSD drives are available in both consumer and enterprise grade quality. The difference is in durability and performance consistency over time. And of course the price, as enterprise grade drives are more expensive than consumer drives.
The general recommendation is to use enterprise drives for all production applications. An exception might be for setups with high levels of redundancy (hot-swap drives, multiple redundant servers etc.) and low to medium storage write frequency, like typical web servers.
If you have the option of choosing the brand and type of SSD drives, there are a couple of things to look out for: