Wednesday, January 7, 2015

Slow Disk Performance On Dell r620

Update:

I installed Dell OpenManage using the instructions on this page and used it to check and update the BIOS settings: http://linux.dell.com/repo/community/deb/OMSA_7.1/

The fix ended up being the BIOS power profile setting. It was set to "Performance Per Watt (DAPC)" and needed to be set to "Performance".
Interestingly, the BIOS power profile setting didn't seem to affect disk read speed on Ubuntu 14.10.

Troubleshooting:

I have three identical servers. All of them are Dell r620s with a PERC H710 Mini RAID controller (21.2.0-0007_A04 firmware) and the exact same hard drives (Seagate ST300MM0006) in RAID 1 configurations. I'm seeing about 50% worse disk read performance on some versions of Ubuntu.

The only differences I can find in the hardware:
  • The BIOS version on the poorly performing machines is 1.4.8; the machine that's performing well is on 1.6.0. The BIOS change log doesn't appear to list any changes between the two versions that would affect disk performance.
  • The chipset version on the RAID controller of the poorly performing machines is rev 01 (ChipRevision: B0). The machine that's performing well has the rev 05 chipset (ChipRevision: D1). The motherboard chipset revision is also different.

Here are the benchmark results for the different Ubuntu versions and RAID controller chipsets:

(MySQL read speed test - using the same version of MySQL across all 3 machines)
sysbench --test=oltp --oltp-table-size=1000000 --mysql-db=test --max-time=60 --oltp-read-only=on --max-requests=0 --num-threads=8 run:
(13.04 - rev 05): read/write requests:                 4996362 (83271.11 per sec.)
(14.04 - rev 01): read/write requests:                 1906520 (31773.58 per sec.)
(14.10 - rev 01): read/write requests:                 5166798 (86111.50 per sec.)
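
To put the gap in perspective, here is a minimal sketch that computes the relative change against the 13.04 - rev 05 machine, using the per-second figures copied from the results above. The function name `relative_change` is just an illustrative choice. By this measure the 14.04 - rev 01 machine handles roughly 62% fewer requests per second, while 14.10 - rev 01 lands slightly above the baseline.

```python
def relative_change(baseline, value):
    """Percent change of value relative to baseline (negative means slower)."""
    return (value - baseline) / baseline * 100.0


# Requests per second, copied from the sysbench results above.
results = {
    "13.04 - rev 05": 83271.11,
    "14.04 - rev 01": 31773.58,
    "14.10 - rev 01": 86111.50,
}

baseline = results["13.04 - rev 05"]
for label, rate in results.items():
    change = relative_change(baseline, rate)
    print(f"{label}: {change:+.1f}% vs. 13.04 - rev 05")
```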

dd if=/dev/zero of=/tmp/output bs=8k count=10k; rm -f /tmp/output:
(13.04 - rev 05): 83886080 bytes (84 MB) copied, 0.0437543 s, 1.9 GB/s
(13.04 - rev 01): 83886080 bytes (84 MB) copied, 0.116811 s, 718 MB/s
(13.10 - rev 01): 83886080 bytes (84 MB) copied, 0.129371 s, 648 MB/s
(14.04 - rev 01): 83886080 bytes (84 MB) copied, 0.157808 s, 532 MB/s
(14.10 - rev 01): 83886080 bytes (84 MB) copied, 0.102355 s, 820 MB/s
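
As a sanity check on these numbers, the sketch below recomputes the throughput from the byte count and elapsed time, assuming dd's decimal units (1 MB = 10^6 bytes). The helper name `throughput_mb_s` is hypothetical. Keep in mind that this dd invocation writes zeros to /tmp without forcing a flush, so the figures may largely reflect page-cache speed rather than the disks themselves.

```python
def throughput_mb_s(num_bytes, seconds):
    """Throughput in decimal megabytes per second (1 MB = 10**6 bytes)."""
    return num_bytes / seconds / 1e6


# bs=8k count=10k means 8192-byte blocks, 10240 of them.
total_bytes = 8 * 1024 * 10 * 1024

# Elapsed times in seconds, copied from the dd results above.
elapsed = {
    "13.04 - rev 05": 0.0437543,
    "13.04 - rev 01": 0.116811,
    "13.10 - rev 01": 0.129371,
    "14.04 - rev 01": 0.157808,
    "14.10 - rev 01": 0.102355,
}

for label, seconds in elapsed.items():
    print(f"{label}: {throughput_mb_s(total_bytes, seconds):.0f} MB/s")
```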

hdparm -tT /dev/sda1:
(13.04 - rev 05): Timing cached reads: 23154 MB in  2.00 seconds = 11589.16 MB/sec
(13.04 - rev 01): Timing cached reads: 17934 MB in  2.00 seconds = 8973.68 MB/sec
(13.10 - rev 01): Timing cached reads: 18102 MB in  2.00 seconds = 9058.08 MB/sec
(14.04 - rev 01): Timing cached reads: 17846 MB in  1.99 seconds = 8956.28 MB/sec
(14.10 - rev 01): Timing cached reads: 21538 MB in  2.00 seconds = 10777.93 MB/sec

At one point, two of the rev 01 servers were running Ubuntu 14.04 and getting similarly bad benchmark results. After upgrading them to 14.10, I immediately saw the disk read speed increase.

The performance on the rev 01 machine with 13.04 was bad, but the rev 05 machine with the same OS version was performing well. This told me it's not a problem with the megaraid_sas driver, because the rev 01 and rev 05 machines with 13.04 had the same driver version.
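
For anyone repeating this comparison, here is one possible way to collect the driver version on each machine and check that they agree. It is only a sketch: it assumes `modinfo` is installed and that the module reports a version field, and the host names in the usage example are hypothetical placeholders, not my actual machines.

```python
import subprocess


def module_version(module="megaraid_sas"):
    """Return the version reported by `modinfo -F version`, or None on failure."""
    try:
        result = subprocess.run(
            ["modinfo", "-F", "version", module],
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # modinfo is missing, or the module is unknown on this machine.
        return None
    return result.stdout.strip() or None


def all_match(versions):
    """True if every host reported the same, known version string."""
    distinct = set(versions.values())
    return len(distinct) == 1 and None not in distinct


if __name__ == "__main__":
    # Run on each machine and compare the printed values by hand, or gather
    # them into a dict keyed by (hypothetical) host name and use all_match().
    print("megaraid_sas version:", module_version())
```

Matching versions only rule out a difference in the driver itself; they say nothing about firmware or BIOS settings.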

Here's the closest I can find to a similar issue (it's different hardware though): http://en.community.dell.com/support-forums/servers/f/906/t/19596533

Update 3/1/2015:
Saw an interesting checklist on Hacker News today that also mentioned disabling BIOS power saving settings: http://odetodata.com/2015/02/installation-and-configuration-checklist-for-microsoft-sql-server/
