Linux and HW optimizations for MySQL
! " # $ "% Table of contents !
History of MySQL performance improvements " #
" $% & # ' $% %(% 、)' *+,
! - ') #
. ! & & / ! - ! %
. & / Per-server performance is important ( 1!
22
0333 1! 4 0333 03333 1! 4 033
% 533 03 6 & 0 6 ! ! ! & ! ! ! - &
('
32bit Linux
Updates
2GB RAM
2GB RAM
2GB RAM HDD RAID HDD RAID HDD RAID (20GB) (20GB) (20GB)
- Many slaves + Many slaves + Many slaves
033*733
- ! & - 8 -
% ! . 2 2 93:;< /
- )
! ! - ! !
; ! % ! - ! =
64bit Linux + large RAM + BBWC
16GB RAM
- Many slaves HDD RAID
(120GB) & >?- !, !
- 0>:; $% !,
& - ! !
- ( ! -
. / )
. ! - ! / @( A
- ! - ,
! & -
; - % . & /
Side effect caused by fast server
- % 0>*97:; $% & ! ! !
# ?*B $% - C 03 &
! > " $% 03& ) !
HDD RAID 733& 033 ) ! 0333*7333
- : & .
/2 > " $% 03 "
HDD RAID !
.03< !
Using SATA SSD on slaves
- )
.0333</ .033</ ! !
! ) E
$ ) %(% 9333<& !
- ! .0C " / -
HDD RAID F! "
! ! - ! - . ) 4 033< *D 9333</
! . ) 4 0333< *D 9333</ #
- ( ! - ! !
SATA SSD
4
- : ! " $% 1!
4 ; ) ! ! Concurrency improvements around MySQL
$ & - ;4 $ & % 4 ( -
" : .: - ! ,/ ' .
- / ! !
; ;! ) & & ;
') .C20 ) ! & C2C/ )
! -
G * . 2 + +'(/ ! !
! ! ) - ! . / ! .
- /
- . ( / !
Avoiding sudden performance drops
Perf Time
! - - ! !
Product B Product A
- ! = ! = ( ! ! % - & ! ; !
- = ! -
- ! = ! -
G /
( ! - .03*033 & -!
! 0333< ( % I ' +'( / Avoiding stalls(2)
- <
- ! 2 = !
.?23 ?20& C232?C& 22/
(
- ! . 'JK ! ,/ ;! . &
! / - - K ./ ./ .
/ Handling Real-World workloads..
4 % ) - ' ! F
G ; G !- 4/
F 8 & ) & )' 7*9 -
7C< ! 029;6 ! 7303 RDBMS or NoSQL ?
E
= L % !
& ! -M
" # ! $ ; . ') & $% & )' *+, / ' , 1! & ! ! & & ,&
! & & ! ! -
$ ; .' = 0333 ! / E
! 03&333< ) %(+
" ,
- % = - <
- & , !
- !
- 8
G + )J @! K < ,,,K A G :;
- !
%
- !
- = ! &
- $( ) %(+ + +(+ &
- " ! & & -
- $( + +'( ( - 8 - ! . , 0(;/
. + +'( /
- $(
; .( ;! 2 ! % /
- ( , & -! # , 8 -
- ! &
- D *- !
G ! ! ! G ) %(+ + +(+ + +'( !
- %
- ! 1! & , - &
;
- ! .@ -
- !
- I (+
- ) & -!
.@ A/ - ,
Insert buffer
) 4 $ !
- $ ! ! -
1! Optimized i/o
- 1!
'
4 %
, ! - ! 2 -
INSERT gets slower
[Figure: Time to insert 1 million records (InnoDB, HDD). X-axis: existing records (millions), 1 to 145; Y-axis: seconds, up to 600. Series: sequential order, random order.]
, 8 , - -!
8 N9 : ! - ! -! .
/ 1! &
2 Index size exceeded buffer pool size
10,000 rows/s 2,000 rows/s
INSERT performance difference
- $( ! ! *
0C333< " #
- , -! &
7333*?333 " & >333*B333 ! *
!
% C333 ! !
G ; ! ! ') ! 1! - ! . 2 2 ! ! & /
8 , 8 +$( Approach to complete INSERT in memory $ -
( ( + (% ) - ! - G ( + (% ) ! !
C20 , 8 - K ,K 8 ! - K K
Single big physical table(index)
Partition 1 Partition 2 Partition 3 Partition 4
- % (+$ (%; + 22 $ ) )%$( (
Approaches to complete INSERT in memory
- .! C23& !
= / )! , !
& $% UPDATE-mostly tables ( !
! = ! G '! ")& , & & ! - & = ! & -
! & G ) %(+ + +'( 1! , ! !
!
) %(+
, - - G -! & ! G
8 & +$( + +(+
- "!
- G ) %(+4 07&333 G " ) %(+4 933 G %(% ) %(+4 0&B33 G )' *+ ) %(+4 ?&333 G O (
G $ ! &
What do you need to consider? (H/W layer) " E
%(% % )' *+, E& " E $%
" # $% & # $% F; E 033 - 0:- ! E 7:; $% < )' *+ >?:; $% < B" E
') & P E What do you need to consider? $ !
$% $; . /
$ * % $
, 9& , & E & $ &
!
# # ! Why SSD? IOPS! ) 4 ! - . /
- %
1!
, - ,
$ ! % " 4 733 . Q / 4 7&333< . / C&333< . / Table of contents
; ) "
$ $ $ #
1! $ 1! #
./ ) !
8 ' '
R
[Figure: Direct Random Read IOPS (Single Drive, 16KB, xfs). X-axis: # of I/O threads, 1 to 200; Y-axis: IOPS, up to 45,000. Series: HDD, Intel SSD, Fusion I/O.]
" 4 05> & ??9 033 4 9C3B & 0?C9B 033 ! 4 03C7> & ?09N5 033
! ! 0>, - " & !
7C, - ’
’ .727,/ ! .?,/ ! - " High Concurrency ! % . 2 2 ?3 , ?:; S 0>3:;/
" %
!
PCI-Express SSD
CPU North Bridge South Bridge
PCI-Express Controller SAS/SATA Controller
2GB/s (PCI-Express x8) 300MB/s
SSD I/O Controller SSD I/O Controller
FlashFlash %
)' *+, ! % %(%
. ! /
- R )' *+
Write performance on SSD
[Figure: Random Write IOPS (16KB blocks). Y-axis: IOPS, up to 20,000. Devices: HDD (4 RAID10, xfs), Intel (xfs), Fusion (xfs). Series: 1 I/O thread, 100 I/O threads.]
H ! ;! 22 ' ! ! O OE
’
22
Understanding how data is written to SSD (1)
Block (empty) Block (empty) Block (empty) Block
Page Page !. Flash memory chips
. 2 2 7:;/ . 2 2 C07J;/ - %
% - . 2 2 ?J;/
- O O <
- # - % -
- O O -
New data
× × × ×
Block (empty) Block Page Page !.
- # #
.*733 / * - !
. 2 2 ; +
- /&
- ! .# * /
[Diagram: overwriting pages (P) in a block with new data. 1. Reading all pages. 2. Erasing the block. 3. Writing all data.]
! - ! & - !
New P
2 2 %
0CB:; 0>3:; ! - -
;
4
- 02 $
- 72 +
- 92 #
- $% + , . /
- % & - !
- $% +
( ! & ! “ ” 8 -
8 2 2 0>3:; & 073:; & ?3:;
! !
8 R * * 5>:
3 P
[Diagram: blocks of pages (P). Labels: Data Space, Reserved Space, New data, Block (empty). Step shown: 2. Writing data.]
Write performance deterioration
% - & ) “ ”
# & ) !
“ ” .- ! +$% + /
- ! ! # )
“ ” . - +$% + - - ! M -/ "
.($ ! &
[Figure: Write IOPS deterioration (16KB random write). Y-axis: IOPS, up to 30,000. Devices: Intel, Fusion (150G), Fusion (120G), Fusion (96G), Fusion (80G). Series: fastest, slowest. Annotations: continuous write-intensive workloads; stopping writing for a while.]
Mitigating write performance deterioration
%
$%
) 2 (
- 8
3 # -! !
2 I ! ! ! ! !
Sequential I/O
[Figure: Sequential Read/Write throughput (1MB consecutive reads/writes). Y-axis: MB/s, up to 600. Series: Seq read, Seq write.]
( - 4 ! . /& M ! . / ! " 1! & -!
" .? $% 03/ ! 1! 8 - 1! ! - & !
- !
! 1! fsync() speed 03&333< ! ') - ! .T /&
[Figure: fsync speed. Y-axis: fsync/sec, up to 20,000. Devices: HDD (xfs), Intel (xfs), Fusion I/O (xfs). Series: 1KB, 8KB, 16KB.]
- ! .T /2
[Diagram: write cache, disk; seek & rotation time.]
Filesystem matters
, & ! K $+'(& -! , O
: ! , & , 9 8 !
[Figure: Random write IOPS (16KB blocks). X-axis: filesystem (Fusion ext3, Fusion xfs, Fusion raw); Y-axis: IOPS, up to 20,000. Series: 1 thread, 16 threads.]
Changing I/O unit size
[Figure: Read IOPS and I/O unit size (4 HDD RAID10). X-axis: concurrency, 1 to 200; Y-axis: IOPS, up to 2,500. Series: 1KB, 4KB, 16KB.]
" & , !
77T !
0J; 0>J; -
- ! U 03
[Figure: reads/s by I/O unit size. X-axis: concurrency, 1 to 200; Y-axis: reads/s, up to 140,000. Series: 4KB, 8KB, 16KB.]
"! & ) & -!
8 ’
- ! “ - ! - 8 ” !
[Figure: reads/s, SLC vs MLC. X-axis: concurrency, 1 to 200; Y-axis: reads/s, up to 40,000. Series: SLC, MLC.]
B*?3T - ! ! '
SLC vs MLC (8KB)
[Figure: Random Read IOPS, FusionIO (8KB). X-axis: concurrency, 1 to 200; Y-axis: reads/s, up to 80,000. Series: SLC, MLC.]
7C*NCT - ! ! ' $ - ! - ! . ' '& ! 8 & 22/
tachIOn vs FusionIO (SLC)
[Figure: Random Read IOPS (16KB). X-axis: concurrency, 1 to 200; Y-axis: reads/s, up to 90,000. Series: FusionIO, tachIOn.]
& !
!
PCI-Express interface and CPU util
# cat /proc/interrupts | grep PCI
 83: … PCI-MSI vgcinit
202: … PCI-MSI-X eth2-0
210: … PCI-MSI-X eth2-1
218: … PCI-MSI-X eth2-2
226: … PCI-MSI-X eth2-3
234: … PCI-MSI-X eth2-4

# mpstat -P ALL 1
CPU  %user  %nice   %sys  %iowait   %irq  %soft  %idle    intr/s
  0   1.00   0.00  12.60    86.40   0.00   0.00   0.00   1000.20
  1   1.00   0.00  13.63    85.37   0.00   0.00   0.00      0.00
  2   0.40   0.00   4.80    26.80   0.00   0.00  68.00      0.00
  3   0.00   0.00   0.00     0.00  79.20   0.00  20.80  39033.20
...
- T 1 ." ! / $
)' *+ ; ! ! ! - $
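The output above shows a single IRQ line for the SSD driver (vgcinit) next to several for the NIC (eth2-0 to eth2-4), with one CPU spending most of its time in %irq. As a rough sketch of how to check this on another host, the helper below lists the IRQ numbers registered for a device name. The function name irqs_for_device and the optional file argument are illustrative assumptions, not part of the original slides.

```shell
#!/bin/sh
# List the IRQ numbers whose /proc/interrupts line ends with a given device name.
# Usage: irqs_for_device NAME [FILE]   (FILE defaults to /proc/interrupts)
# The FILE argument is only there so the helper can be tried on saved output.
irqs_for_device() {
    name=$1
    file=${2:-/proc/interrupts}
    # The first field looks like "83:"; the device name is the last field.
    awk -v name="$name" '$NF == name { sub(/:$/, "", $1); print $1 }' "$file"
}

# Example calls on a live system (device names taken from the output above):
#   irqs_for_device vgcinit
#   irqs_for_device eth2-0
```

For each IRQ number returned, /proc/irq/NUMBER/smp_affinity holds the CPU bitmask that is allowed to service it, which is one way to see why a single core ends up handling all interrupts of a drive.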
# of interfaces (tachIOn SLC)
[Figure: Random Read IOPS (16KB, tachIOn). X-axis: concurrency, 1 to 200; Y-axis: reads/s, up to 350,000. Series: Single Drive, Two Drives (RAID0).]
- ( !
! ! & ! # ! - . ! 1! /& T 1 & !
! ! ( ! -
# of interfaces (FusionIO MLC)
[Figure: Random Read IOPS (16KB). X-axis: concurrency, 1 to 200; Y-axis: reads/s, up to 70,000. Series: FusionIO, FusionIO Duo.]
! ! )' *+,
( $ - ! & ! !
% ! ! ! . & -/2
- ! ! . /
G ! ") 9>3 . )' *+ / - G &
- ! - & , !
[Figure: reads/s by device. X-axis: concurrency, 1 to 200; Y-axis: reads/s, up to 80,000. Series: tachIOn, FusionIO Duo, FusionIO.]
! ! ! & = '
Opteron vs Nehalem(tachIOn)
[Figure: tachIOn Random Read IOPS (16KB). X-axis: # of threads, 1 to 200; Y-axis: reads/s, up to 90,000. Series: Nehalem X5650, Opteron 6174.]
7
) -
- )' *+ ') 1!
Opteron vs Nehalem(tachIOn)
[Figure: tachIOn Random Read IOPS (4KB). X-axis: # of threads, 1 to 200; Y-axis: reads/s, up to 300,000. Series: Opteron 6174, Nehalem X5650.]
7*?
- T 1 033T ! " " ! 1! ')
7'
& (
$ !
V 7%$' G )
V ' -
G # !, J ! ! '
- G # !
2 !
'
!
) . ' / G !, J ! G + ! 033T & ?3T& NCT
! . ' C2C/ G P7C*+ = ) !
- G = M! 2 # , 2 = - &
2
- ( ! 2 7'
" Virtualization? '!
! . /
: 0 93 ! ! -! ! 032? < JH < !
# " & ! ! - ! ! ! !
8 = - ! !
Virtualization benchmarks (HDD)
[Figure: Random Read IOPS (Dell cloud HDD, 16KB unit). X-axis: # of threads, 1 to 200; Y-axis: reads/sec, up to 5,000. Series: Physical, KVM.]
" -
Virtualization benchmarks (SATA SSD)
[Figure: Random Read IOPS (Dell cloud SSD, 16KB unit). X-axis: # of threads, 1 to 200; Y-axis: reads/sec, up to 25,000. Series: Physical, KVM.]
! ! -
9 $ " . %(%/ $ " ! .)' *+/
MySQL Deployment Practices on SSD #
- " ! C2C C20
;(*7 - . / 733 # ! .73:; – 7C:; / !
4 '& 5>:; & >?:;
HDD vs Intel SSD vs Fusion I/O
! )' *+ - )' *+ ! % %(% 0?,
?" CN3523>
0C0772NC ! 007C2?? ;! 0: "
NOTPM: Number of Transactions per minute
Which should we spend money on, RAM or SSD?
$% .-! 8 / !
; !
- !
& .- 1! / ! & 1! - !
! $% < " ! $% <
0C0772NC CN3523> 007C2?? ;! 0:
0B>9205 ;! 7: ?9BC20B ;! C:
9>NB?2N> ;! 93:
.' /
! "
Which should we spend money on, RAM or SSD?
- ! < ! .C:;/ ; ! - 4
! & ! $% < " ! ’ & ! , ! ! & -
$%
0C0772NC CN3523> 007C2?? ;! 0: 7335>299 NC9>2CC
0B>9205 ;! 7:
93B?>29?
07B572C> ?9BC20B ;!
C: CN??02>? * 9>NB?2N> ;! 93:
.' /
! " MySQL file location , " ! 1!
! 1! $
4
.O2 - /
- – - 1! !
& ;! . - /
- – - . & , ! * ! / -
- – -! . & , ; *
! ! , /
1! #
4 !- ;! . - 3/
- – # ! 1! O - 2 "!
; . 1 *- 2PPPPPP/ $ . -K /
;! 93: !
0C0772NC
.! SN3T& S92CT/
C: CN??02>?
;!
.! S95T& S03T/
93B?>29?
;! 7:
.! S93T& S072CT/
;! 0: 7335>299
.! S7CT& S0CT/
! < " <0CT <7BT <7BT <7BT
Moving sequentially written files into HDD
.! S97T& S03T/
0575C25?
.! S9>T& SBT/
7C>7N2?5
.! S?5T& S>T/
95?9C27C
.! SNNT& S0T/
>>3C92>B
- & -K & .<- / " "
T !
# ! - - ! !- "
Does CPU matter?
[Diagram: Older Xeon: CPUs, FSB (10.6GB/s), North Bridge (MCH), Memory, PCI-Express. Nehalem: CPUs with Memory, QPI (25.6GB/s), North Bridge (IOH), PCI-Express.]
02 ') 4
- 72
') ;
72C, & ') U*D
! )' *+,
Harpertown X5470 (older Xeon) vs Nehalem X5570 (HDD)
')
- 0T 007C2?? .! S0T/ ;! 0:
- 9T
') - ! 935392? .! S?3T/
?0N>2C0 .! SNT/ 0577279 .! S7T/ 009C29N .! S0T/ " PC?N3& 9299:"8
0B>9205 .! S7T/ ;! 7: <CT ?9BC20B.! SNT/ ;!
C: <05T 9>NB?2N> .! S?3T/ ;! 93:
.PCCN3& 7259:"8/
" us: userland CPU utilization Harpertown X5470 vs Nehalem X5570 (Fusion) () ! "
- ! .-! 0: 7:/& ') !
8 & -! () !
H ! )' *+
;
- ') )' *+, !
! - ') C7CB72N0 .! SN>T/ 933CB2?B .! SC3T/ 0537>2>? .! S?3T/
09C9?23> .! S9CT/
" PC?N3& 9299:"8
<?9T 0575C25? .! S97T/ ;! 0: <9CT
7C>7N2?5 .! S9NT/ ;! 7:
<90T 95?9C27C .! SC3T/ ;!
C: <7>T >>3C92>B .! SN>T/ ;! 93:
.PCCN3& 7259:"8/
! <" Intel SSDs with a traditional H/W raid controller
- 5T
- 9NT
- ?BT CN3523> ;! 0: NC9>2CC ;! 7:
$ ! - ! " #
- $%
03 " # $ ! -
! . P7C*
H + ’ / F; < <
973 !
00N9527N ?N>92>3
75NC23? ! $% C
07B572C> ;!
C: Enable HyperThreading ; ) ! C20 0>*7? ') "(
B0?07279 ?CNBC207
7B?9B233
73NBC2?7 "( .0>/
<79T <0>T <00T <N2NT 0575C25? ;! 0:
7C>7N2?5 ;! 7: 95?9C27C ;!
C: >>3C92>B ;! 93: "( .B/ ! < "
- ! ') - !
- ! !
; ') T! T
- – T! 4 9>T .C2C27/ ??T .C2C2?/ -! S 7
- – T
4 BT .C2C27/ C2CT .C2C2?/ -! S 7 & -!
73T ' !
! 1! C2C L
- ! ? "
- – #
! ! &
>N7C92?C ?N75>207 9797C2N> 7?305297
<02BT <73 <7>T <7?T 0575C25? ;! 0:
7C>7N2?5 ;! 7: 95?9C27C ;!
C: >>3C92>B ;! 93:
C20 ! < " Where can PCI-Express SSD be used?
E )' *+ ! &
G 03&333 ) & ?3&333< ) 033
03&333 ) 033 - %(% ) ! -
E ! " & %(% ! - !
G )' *+, ! , %(%
" - ! ! ! E
G H !
- )' *+ ! ! ! E
" & 033*733:; - -
- ! ) "
! '4 973:; ! < 0>3:; S ?B3:; ! '4 07B3:; ! < >?3:; S 0573:; . ! /
'4 B33:; , 7 S 0>33:; Running multiple slaves on single box $! !
)' *+
; ! " - ' !
= & ! -
! - ! ! ( ! -
After
M B S1 S2 S3 M B S1 S2 S3 M M
S1, S1 M B S1 S2 S3 M B S1 S2 S3 B B B B B B B B B M B B B B M M B B B B
M M B B B B
- Before
") 9>3:N .0 /& $>03 )' *+
! ' .>?3*07B3:; ! < 973*>?3:; / ' .B33:; , 7/ *D '
') (
- & >* & &qu
- G 7? ') G ! ,
$% >3:; ; ;' CN35& ! ; , 7 ( ) & >*07 .S / ! )
" ?*B % $% 0<3
! & & & . / !- -! - Statistics
' N ! .>?3:; ' ! < 973:; '/
- '( 1! >:; -K-! K K 8
) ) . N / >0>B92N 1!
9N59520 NB>020 ! 003C
0B?9 90?92C - ')
7N29T& T
00T.T ?T/& T ?T '2 2 %(% :T! ?T& T
0T& T
0T ;!
552?T %(% 4 552BT
- .033< / !
T!
7N29T& T
00T.T ?T/& T ?T $ .5>3:;/2 0573:; ' ! -
$ 4 0?2> ; & 4 7B2N ; ' C < - 1! . ') 1! / ! - & -! - !
22:10:57  CPU  %user  %nice  %sys  %iowait  %irq  %soft  %steal  %idle    intr/s
22:11:57  all  27.13   0.00  6.58     4.06  0.14   3.70    0.00  58.40  56589.95
…
22:11:57   23  30.85   0.00  7.43     0.90  1.65  49.78    0.00   9.38  44031.82
Things to consider
% ) ! -
%
- # ) ! , @ S993>A
S@ ! A - 2 *
'
& & ; & - !
- " & ; & $ & $ & & - 3 .
!- -! /& - ! " taskset
# taskset -pc 0,12,2,14 `cat /var/lib/mysql/mysqld1.pid`
# taskset -pc 1,13,3,15 `cat /var/lib/mysql/mysqld2.pid`
# taskset -pc 5,17,7,19 `cat /var/lib/mysql/mysqld3.pid`
# taskset -pc 8,20,10,22 `cat /var/lib/mysql/mysqld4.pid`
# taskset -pc 9,21,11,23 `cat /var/lib/mysql/mysqld5.pid`
# taskset -pc 4,16,6,18 `cat /var/lib/mysql/mysqld6.pid`
# taskset -pc 8,20,10,22 `cat /var/lib/mysql/mysqld7.pid`
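The commands above pin each mysqld instance to a fixed set of CPU threads. A minimal sketch of generating such commands from a list is shown below. It only prints the commands so they can be reviewed before running; the function name print_pin_commands and the PIDFILE=CPULIST argument format are assumptions made for this illustration.

```shell
#!/bin/sh
# Print (do not execute) one taskset command per mysqld instance.
# Each argument has the form PIDFILE=CPULIST,
# e.g. /var/lib/mysql/mysqld1.pid=0,12,2,14
print_pin_commands() {
    for pair in "$@"; do
        pidfile=${pair%%=*}   # text before the first "="
        cpus=${pair#*=}       # text after the first "="
        # The single-quoted format keeps $(cat ...) literal in the output.
        printf 'taskset -pc %s "$(cat %s)"\n' "$cpus" "$pidfile"
    done
}

print_pin_commands \
    /var/lib/mysql/mysqld1.pid=0,12,2,14 \
    /var/lib/mysql/mysqld2.pid=1,13,3,15
```

After applying a command, running taskset -p with the process ID prints the current affinity mask, which is a quick way to confirm the pinning took effect.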
! 7? ')
# ! >< 7? & ! - ! 8 ')
- ! ' ! ! !
. / #
& ! . / Application Design ;
') ! - , ! "
- " .
/ " ! -
- $
"
8 ! , -
- %$ "% "
G , ! , Future improvements from DBA perspective & - ! $
! - !
. ! /
$ B33:; !
! - -
! ! G *- S0& -K ! K K K ,K S0 G $
- .:% C2C/ ' * - .- ! / $ , ! ! ! - ; - &
. * 2 / . C2>E/ G $ ! - Conclusion for choosing H/W )' *+ . 2 2 ! & / % %(%
. 2 2 973/ ' ! $%
2 J ! & ! & ! $% 3 ! " 1!
') ! )' *+, + Conclusion for database deployments )! 1!
"
& -K & - - " ! 1!
- #
- ,
)!
03, *033, "
- %
"
! , "
C20 ; ) ! C2C
- " Q !
22
’ ! M! " ! E
- !
H !
8
! !
1 !
!
M! )
- ' ! - 8 & !- & -
' !
B*0> "
Random Access Memory $% ! "
$% 4 *>3
- – 033&333 1!
- " 4 *C 4 033*C33!
0>*033<:; $% O O ! -
8
8
. % ( H%$'"%$ ; : (& ( + (% ) %(+( +& /
! ,
- & -
Cache hot application data in memory
;(*7 - . / 73*7C:; .733 ! & ! ! / $%
8 2 + +'(& -!
- $( ) %(+ + +(+
- $(4 $
, ) %(+ + +(+4 $
9>NB?2N> ?9BC20B
0B>9205 007C2?? ( !
9>T C2CT
9T
7T T!
93T ;! 0:
7BT ;! 7:
99T ;!
C: BT ;! 93:
.% / T ;(*7 .#733/
Use Direct I/O
[Diagram: Direct I/O: InnoDB buffer pool in RAM, then InnoDB data file. Buffered I/O: InnoDB buffer pool and filesystem cache in RAM, then InnoDB data file.]
! !
8
- innodb_flush_method=O_DIRECT
4 ! ! - C07 - ' ’ ! K $+'( ; & ; &
% & ) & Do not allocate too much memory
user$ top
Mem: 32967008k total, 32808696k used, 158312k free, 10240k buffers
Swap: 35650896k total, 4749460k used, 30901436k free, 819840k cached

 PID USER  PR NI  VIRT RES SHR S %CPU %MEM   TIME+ COMMAND
5231 mysql 25  0 35.0g 30g 324 S  0.0 71.8 7:46.50 mysqld
#
- E
$ ! . /
- )
. ! /
- . /
’ ;
8 8 & 2 ;! 22
H ! # $%
- &
2 J
- ( !
- . 1 /
’ ! 2 ' - ) - $ +$ ;I U) D K + '
1
- – H 8 & ') & ! &
. ! ! / J # ’
- ! !
Do not set swap=zero
top - 01:01:29 up 5:53, 3 users, load average: 0.66, 0.17, 0.06
Tasks: 170 total, 3 running, 167 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 24.9%sy, 0.0%ni, 75.0%id, 0.2%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 32967008k total, 32815800k used, 151208k free, 8448k buffers
Swap: 0k total, 0k used, 0k free, 376880k cached

  PID USER  PR NI VIRT RES  SHR S %CPU %MEM   TIME+ COMMAND
26988 mysql 25  0  30g 30g 1452 R 98.5 97.7 0:42.18 mysqld
- & ') ! 033T !
7?25T . / S 0 ? ! 033T ! ! ( 8 . " ’ - / -
- & -! !
- 0N ! R *0N D U D K M ;! ’ *0N
1 !
& !, * * ’
- # ’
22 *D Swap space management
- 8
;! ’ 1 !
# ! E $ ;
- – ! . -K-! K & K-! &
K-! & /
- – ! . ; &
- & % & /
% .- ! & /
- – !
# 1 $% &
Be careful about backup operations
Mem: 32967008k total, 28947472k used, 4019536k free, 152520k buffers
Swap: 35650896k total, 0k used, 35650896k free, 197824k cached
 PID USER  PR NI  VIRT RES SHR S %CPU %MEM   TIME+ COMMAND
5231 mysql 25  0 27.0g 27g 288 S  0.0 92.6 7:40.88 mysqld

Copying 8GB datafile

Mem: 32967008k total, 32808696k used, 158312k free, 10240k buffers
Swap: 35650896k total, 4749460k used, 30901436k free, 8819840k cached
 PID USER  PR NI  VIRT RES SHR S %CPU %MEM   TIME+ COMMAND
5231 mysql 25  0 27.0g 22g 324 S  0.0 71.8 7:46.50 mysqld
' ! vm.swappiness = 0
Mem: 32967008k total, 28947472k used, 4019536k free, 152520k buffers
Swap: 35650896k total, 0k used, 35650896k free, 197824k cached
 PID USER  PR NI  VIRT RES SHR S %CPU %MEM   TIME+ COMMAND
5231 mysql 25  0 27.0g 27g 288 S  0.0 91.3 7:55.88 mysqld

Copying 8GB of datafile

Mem: 32967008k total, 32783668k used, 183340k free, 3940k buffers
Swap: 35650896k total, 216k used, 35650680k free, 4117432k cached
 PID USER  PR NI  VIRT RES SHR S %CPU %MEM   TIME+ COMMAND
5231 mysql 25  0 27.0g 27g 288 S  0.0 80.6 8:01.44 mysqld
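In the top output above, swap used after the 8GB copy is 216k with vm.swappiness = 0, compared with 4749460k in the earlier run. The sketch below reads the current value and shows, in comments, the usual ways to change it. The function name check_swappiness and its optional file argument are assumptions made for illustration.

```shell
#!/bin/sh
# Report the swappiness value stored in a file in /proc/sys/vm/swappiness format.
# Usage: check_swappiness [FILE]   (FILE defaults to /proc/sys/vm/swappiness)
check_swappiness() {
    file=${1:-/proc/sys/vm/swappiness}
    value=$(cat "$file")
    if [ "$value" -eq 0 ]; then
        echo "vm.swappiness=$value (zero)"
    else
        echo "vm.swappiness=$value (nonzero)"
    fi
}

# To change the setting (requires root):
#   sysctl -w vm.swappiness=0                       # running kernel only
#   echo 'vm.swappiness = 0' >> /etc/sysctl.conf    # persist across reboots
```

A value of zero does not rule out swapping entirely, so swap usage is still worth monitoring after the change.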
2 S3
2 ! >3
# $% ! ! & !, ! .
/ %
- &
’ - ! 2 ’
But swap happens even though swappiness==0
97:; $% - ,& ' C .72>20B207B/
- K-! K K 8 S 7>:; 1 -K-! K K 8 4 , 9:;
J 4 , 0:;
top - 11:54:51 up 7 days, 15:17, 1 user, load average: 0.21, 0.14, 0.10
Tasks: 251 total, 1 running, 250 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.5%us, 0.2%sy, 0.0%ni, 98.9%id, 0.3%wa, 0.0%hi, 0.1%si, 0.0%st
Mem: 32818368k total, 31154696k used, 1663672k free, 125048k buffers
Swap: 4184924k total, 1292756k used, 2892168k free, 2716380k cached

 PID USER  PR NI  VIRT RES  SHR S %CPU %MEM     TIME+ COMMAND
6999 mysql 15  0 28.4g 26g 5444 S  9.3 85.2 210:52.67 mysqld
Swap caused by log files
! !
; 8 4 9:; ’ ; ) ! - ! - !
; . /
[Diagram: filesystem cache, InnoDB buffer pool, InnoDB data file, log files.]
Swap bug on Linux
% ! !, J
, 72>27B4 $"+ > 4 -! 8
2
2 K-! 2 E S0>3399 # ' > - E
; -! ,
4 K S K 7 < < *D W
K D 033& G (
! & ! SS 3
# - E
# ; & - & ! - ! - ! -K K K 8
G -! - , ; ) ! & ! 9*?:; - -
# ! - - $"+ CE
- K-! K K 8 -K K K 8 E
G = ! ; - Tool: unmap_mysql_logs 4 !-2 ! K
1 K
- )
G 44' ')% !
- ; 4 %
; & $
4 G '! . / - 4 % , . / 03T G ; ! ! - ; !
- 4
G
4 InnoDB Buffer Pool
Filesystem Cache InnoDB Data File Log Files
Mem: 32825116k total, 32728924k used, 96192k free, 246360k buffers
Swap: 4186072k total, 118764k used, 4067308k free, 2803088k cached
  PID USER  PR NI  VIRT RES  SHR S %CPU %MEM     TIME+ COMMAND
30814 mysql 15  0 28.8g 27g 5644 S 81.1 89.4 158198:37 mysqld

Mem: 32825116k total, 30061632k used, 2763484k free, 247420k buffers
Swap: 4186072k total, 118764k used, 4067308k free, 143372k cached
  PID USER  PR NI  VIRT RES  SHR S %CPU %MEM     TIME+ COMMAND
30814 mysql 15  0 28.8g 27g 5644 S 76.1 89.4 158200:59 mysqld
Performance effect
! & $ !
- 03 !
!
- !
: ) . ! /
- – R ! -!
- – R
- ,2 W 2 ! - * W W *
,
- K)$+ % S ! K
2 W
1 K Q
; !
- ' ; ) !
- – -K! K K S 0. ! 0/& ; !
- – - - ! K)$+ %
9>T C2CT
9T
7T T!
9>NB?2N> ?9BC20B
0B>9205 007C2??
!
9B?33293
??>32C3
0BB02?N 009023?
K <32C3T ;! 0: <325BT ;! 7: <027T ;!
C:
<?2?T ;! 93:
! Be careful about per-session memory .
- ! /
% 7 ; !
07BJ; !, ./ - ./
8 US C07J;& ./ ! *
- '( O $ ! K K - ( 0W
- ( K-! K 8 S 7C>O037?W .7C>J;/
- – *D 32>B ! 03&333
- ( K-! K 8 S 73?BO037?W .7 ;/
- – *D 0B2B0 ! 03&333
- 2 - ;! - ! - ! , .
4 % < (< ! /
File I/O and synchronous writes
$ ; ./ . & & / ! ! ; ; ! # ' .;;#'/
03&333< ./ & ! ;;#' 733 "
- “ ”
! - - . - /
- # ! ! ;;#' . / , 94 ! * - S3 . ! ! !,/
- , 4 !
- 4 - 2
[Diagram: disk with seek & rotation time; write cache with battery in front of the disk.]
BBWC and “Auto Learn” Issues
- ;
@%! A
- . ! /& -
G ( 0 " ! !
G ! %! !
G ! - G ! ! ! .
! / G
- ! G ! , ! . 2 2 B /& - - -
- ,
. -! /
$% ' ;;#' % ,
53
- G @+ 4 ;; W #; ! #(A
$
FBWC ; # '
- @%! A !
") ;#' ! . 9>3:N/
. , 8 /&
. 8 /4 ; % 4 ;
% < ./ ! < ./
- – %
- Q ! - ./
03&333< & 9&333
- – %
- – ' ! V ! .N&333</ * *
- ; ! ! S0
- – ' !
V
- – ' “ ” 4 # R?57C -
, 1!
- – * -* ! , S 73 . ! B/
- < ./
- – $! 1 . ;& * &
- K ! K K K ,K S0/& 1 0&333
6 1 ** ! S0 ** S0 ** S - X
- ! * * 1 ** ! * * 1 * * S
X
- ! - * *1! S033333
Buffered and asynchronous writes
& ! & % & 1 ! & -K ! K K K ,K S7&
! - ! & , ! B
# E *D
2 K- ! K
2 K
- !
! K- ! K O $% . !
03T& 03T >?:; >2?:;/ !
K O $% . ! ?3T/
- & -! !
- % -!
! & ! !
- , ! W $ !
2 K- ! K 72>297
- – !
2 ! ’ - Filesystem ext3
; ! ;! -
. " / & $ ) - &
- .- 'JK ! ,/ ; ! ! % & ; -K K K - & );P(&
#
8
- 8 - “ * ! ,”&
! 8 . )' *+, / “ K ,” !
! * - S3 -
Tip: fast large file remove
'
1 -0 ! K - 2 - ! K - 2 - 2 $ ) (%; + ! K - W
G ! K - 2 - 2 = ! ! - )
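One common way to make removing a large file fast, and an assumption about what this tip describes since the slide text did not survive extraction, is to hard-link the file first so that dropping the original name does not trigger the expensive block freeing. The sketch below demonstrates the filesystem behavior with throwaway files in a temporary directory. The file names are hypothetical and no MySQL files are touched.

```shell
#!/bin/sh
# Show that removing one name of a hard-linked file does not free its data.
# All paths are throwaway files in a temporary directory (hypothetical names).
workdir=$(mktemp -d)
big="$workdir/huge_table.ibd"
link="$workdir/huge_table.ibd.link"

dd if=/dev/zero of="$big" bs=1024 count=1024 2>/dev/null   # 1 MiB stand-in file
ln "$big" "$link"        # second name for the same inode
rm -f "$big"             # only drops a name; data blocks stay allocated
link_size=$(( $(cat "$link" | wc -c) ))
echo "link still holds $link_size bytes"
rm -f "$link"            # last name gone: the filesystem frees the blocks now
rmdir "$workdir"
```

In a database setting the first removal would be the table drop and the second could be deferred to a quiet period, so the slow physical delete happens outside the server.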
8 &
- #
! !
8 )
! & -!
= ! , = - G ;! = ! Filesystem xfs/ext2 ,
- ' !
! K $+'( ! ! $"+ . ! ! +/
“ ” - - - -
, 7
’ ! ! - , 7 M !
! . 2 2 * $ /&
, 7 !
; .! /
- '
- ! . ! / ! -
[Figure: Random Write IOPS (16KB blocks), 100 I/O threads. Y-axis: IOPS, up to 20,000. Configurations: HDD (ext3), HDD (xfs), Intel (ext3), Intel (xfs), Fusion (ext3), Fusion (xfs).]
" .? % $% 0/ -
02B ! I/O scheduler
4 $ ; . ;/ ! 1! !, ! !, !
1! “ ! ” “ ! ! 8 ”
( ! . ! 72>2034 $"+ C/ 4 1! - - & ’ 4 )
8 . / 1! 1! . / , . “ * * ” - /
1. ! /4 ! 1! 4 $ 72>299 .- ! 2 ’ ! /
! 1& -! -
R D - P 1! ! ! cfq madness
$ ; & ) - ! " < !
I
1. Multi-threaded random disk reads (Simulating RDBMS reads)
$ $ Running two benchmark programs concurrently
033 3 7033 3 7033 < . / !
1 !
1
7?> 7?B
0??B3 707
. < / # M! * 1& ! 3 7>3
1
73B?
3
I 033
073B? 050C
1 3 7>3
2. Single-threaded overwriting + fsync() (Simulating redo log writes)
Changing I/O scheduler (InnoDB)
[Figure: DBT-2 (MySQL 5.1), NOTPM by I/O scheduler. X-axis: noop, cfq, deadline, as; Y-axis: NOTPM, up to 15,000. Series: RAID1+0, RAID5.]
- Sun Fire X4150 (4 HDDs, H/W RAID controller+BBWC)
- RHEL5.3 (2.6.18-128)
- Built-in InnoDB 5.1
Changing I/O scheduler queue size (MyISAM)
! !
8 S ! 1!
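The scheduler comparison above covers noop, cfq, deadline, and as. On Linux the active scheduler for a block device is shown in brackets in its sysfs scheduler file, and the small helper below extracts it. The function name active_scheduler and the device name sda in the comments are assumptions for illustration.

```shell
#!/bin/sh
# Print the active I/O scheduler from a sysfs "scheduler" line such as
# "noop anticipatory deadline [cfq]" (the active one is in brackets).
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Typical use on a live system (device name sda is an assumption):
#   active_scheduler "$(cat /sys/block/sda/queue/scheduler)"
#   echo deadline > /sys/block/sda/queue/scheduler    # switch scheduler (root)
#   cat /sys/block/sda/queue/nr_requests              # current queue size
active_scheduler "noop anticipatory deadline [cfq]"
```

Changes made through sysfs last only until reboot, so any setting that proves useful in a benchmark needs to be reapplied at boot time.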