i) Constant alerts from WHMCS and/or monitoring informing of high load on the server.
ii) Website(s) on the server unreachable / intermittent throughout the same time.
2. Constant high load on web servers is normally caused by one of 2 possibilities:-
i) A high amount of traffic and/or database queries, resulting in backlogged processes and/or insufficient memory. When the server runs out of memory and starts swapping, server load will increase abruptly and performance will generally be severely impacted.
or
ii) Abusive activities targeting a certain web page / website.
3. To determine which site is causing the load, first SSH into the server and issue the 'top' command. Focus on the header at the top of the output first, for example:-
top - 15:19:33 up 6 days, 3:38, 2 users, load average: 1.46, 5.50, 9.94 <----- look at "load average" to see the system load over the last 1, 5, and 15 minutes.
Tasks: 180 total, 1 running, 171 sleeping, 0 stopped, 8 zombie
Cpu(s): 17.5%us, 8.6%sy, 0.9%ni, 69.1%id, 0.9%wa, 0.1%hi, 0.2%si, 2.8%st
Mem: 6855176k total, 2668152k used, 4187024k free, 107444k buffers <----- look at "Mem" and compare 'total' and 'used', if similar, look at "Swap" below
Swap: 3964924k total, 228576k used, 3736348k free, 1482908k cached <----- look at "Swap" 'used'; if "Mem" 'total' and 'used' are similar and "Swap" 'used' is large, memory is running short
4. Generally:-
- If "load average" exceeds 10 for all 3 durations (1, 5, and 15 minutes), load is considered high and performance is impacted.
- If "Mem" 'total' and 'used' are similar, memory use is considered maxed out (although this is not necessarily bad; check together with Load above and Swap below).
- If "Swap" used is large ( > 500MB-1GB or so ) and "Mem" 'total' and 'used' are similar, memory is already insufficient (check with Load above for performance impact).
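The rules of thumb above can be sketched as a small shell check. This is only an illustration, not part of the original procedure: the function name classify_load and the 90% cutoff used for "similar" are assumptions, while the thresholds of 10 for load and roughly 500MB for swap are taken from the list above. All memory values are in kB, as printed in the 'top' header shown earlier.

```shell
#!/bin/sh
# Hypothetical helper: classify a server from the numbers in the 'top' header.
# Arguments: load1 load5 load15 mem_total_kb mem_used_kb swap_used_kb
classify_load() {
    awk -v l1="$1" -v l5="$2" -v l15="$3" \
        -v mt="$4" -v mu="$5" -v su="$6" 'BEGIN {
        high_load = (l1 > 10 && l5 > 10 && l15 > 10)  # all 3 durations exceed 10
        mem_max   = (mu >= 0.9 * mt)                  # assumption: "similar" means 90% or more used
        swap_big  = (su > 500 * 1024)                 # more than about 500MB of swap in use
        if (!high_load)               print "load-ok"
        else if (mem_max && swap_big) print "high-load-memory-exhausted"
        else                          print "high-load-possible-abuse"
    }'
}

# Example with the sample header above:
#   classify_load 1.46 5.50 9.94 6855176 2668152 228576
# prints "load-ok", because not all three load averages exceed 10.
```

The three labels map onto the cases in this procedure: no action, a RAM shortage, or suspected abuse. Treat the result as a first hint only and confirm it against the Munin graphs.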
5. What to check? The 2 most common possibilities are:-
5.1. If load average is high, "Mem" is maxed out, and "Swap" used is large, the current RAM is most likely unable to cope with the current traffic + hosted sites & databases.
- Go to the server's WHM, confirm this with Munin (WHM -> type 'Munin' in search field), and capture the latest memory usage graphs.
- Inform the customer of this and recommend the necessary upgrade.
5.2. If load average is high but "Mem" and/or "Swap" are OK, most likely there is abuse going on.
- Relaunch the 'top' command, this time focusing on the lower part of the output. You should see numerous 'php' processes running simultaneously.
- Determine which user is running those processes. Check in WHM to get the main website URL for this user.
- Now monitor the website's Apache access logs, and determine which IPs and/or access patterns recur in close succession:-
# cd /usr/local/apache/domlogs
# tail -f website.url.com
- The monitoring may take some time as you work out which requests are legitimate visits to assorted URLs and which are repeated hits in close succession, either from 1 IP (DoS) or multiple IPs (DDoS). Use 'grep' commands to verify whether certain IPs are performing abusive activities.
- Once you have identified the source IP address, block it using csf, either with the GUI or the command line (if the server is too slow to respond to WHM requests), as follows:-
i) with GUI: WHM --> search 'Firewall', select "Quick Deny", enter the source IP address, then save and restart csf & lfd.
ii) without GUI: SSH into the server, vi /etc/csf/csf.deny, add the IP on a line of its own, save, and run "csf -ra" to restart both csf & lfd.
- Verify that the block has mitigated the high load, and repeat if necessary.
- Once stable, grab the latest Munin CPU Load graphs and include them in a new ticket to the customer informing them of this abuse case.
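The log review in 5.2 can be sped up by counting requests per IP instead of reading 'tail -f' output by eye. The following is a minimal sketch under stated assumptions: the helper names top_ips and top_urls_for_ip are hypothetical, and the log is assumed to be in Apache's common/combined format, where the client IP is the first field and the request path is the seventh.

```shell
#!/bin/sh
# Hypothetical helper: list the busiest client IPs in an Apache access log.
# Assumes the client IP is the first whitespace-separated field.
# Arguments: log_file [how_many]   (how_many defaults to 10)
top_ips() {
    awk '{ print $1 }' "$1" | sort | uniq -c | sort -rn | head -n "${2:-10}"
}

# Hypothetical helper: list the URLs that one IP requested most often.
# Assumes the request path is the 7th whitespace-separated field.
# Arguments: log_file ip [how_many]   (how_many defaults to 10)
top_urls_for_ip() {
    awk -v ip="$2" '$1 == ip { print $7 }' "$1" | sort | uniq -c | sort -rn | head -n "${3:-10}"
}

# Example (website.url.com is the placeholder log name used in this procedure):
#   top_ips /usr/local/apache/domlogs/website.url.com 5
#   top_urls_for_ip /usr/local/apache/domlogs/website.url.com 203.0.113.5
```

Each output line is a request count followed by the IP or URL. An IP with a count far above the rest, hitting the same URL repeatedly, is a candidate for blocking. csf also has a "-d" option ("csf -d <IP>") that adds the entry to csf.deny without editing the file by hand; check "csf -h" on the server before relying on it.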
Done.