Linux and HW optimizations for MySQL

Table of contents

  History of MySQL performance improvements " #

  " $% & # ' $% %(% 、)' *+,

  ! - ') #

  . ! & & / ! - ! %

Per-server performance is important

  22

  0333 1! 4 0333 03333 1! 4 033

  % 533 03 6 & 0 6 ! ! ! & ! ! ! - &

  ('

  32bit Linux

  [Diagram: three servers receiving updates, each with 2GB RAM and HDD RAID (20GB), and each with many slaves.]

  033*733

  • ! & - 8 -

  % ! . 2 2 93:;< /

  • )

  ! ! - ! !

  ; ! % ! - ! =

  64bit Linux + large RAM + BBWC

  [Diagram: one server with 16GB RAM and HDD RAID (120GB), with many slaves.]

  • 0>:; $% !,

  & - ! !

  • ( ! -

  . / )

  . ! - ! / @( A

  • ! - ,

  ! & -

  ; - % . & /

Side effect caused by fast server

  • % 0>*97:; $% & ! ! !

  # ?*B $% - C 03 &

  ! > " $% 03& ) !

  HDD RAID 733& 033 ) ! 0333*7333

  • : & .

  /2 > " $% 03 "

  HDD RAID !

  .03< !

Using SATA SSD on slaves

  • )

  .0333</ .033</ ! !

  ! ) E

  $ ) %(% 9333<& !

  • ! .0C " / -

  HDD RAID F! "

  ! ! - ! - . ) 4 033< *D 9333</

  ! . ) 4 0333< *D 9333</ #

  • ( ! - ! !

  SATA SSD

  4

  • : ! " $% 1!

Concurrency improvements around MySQL

  $ & - ;4 $ & % 4 ( -

  " : .: - ! ,/ ' .

  • / ! !

  ; ;! ) & & ;

  ') .C20 ) ! & C2C/ )

  ! -

  G * . 2 + +'(/ ! !

  ! ! ) - ! . / ! .

  • /
  • . ( / !

Avoiding sudden performance drops

  [Figure: performance (Perf) over time for Product A and Product B.]

  • ! = ! = ( ! ! % - & ! ; !
  • = ! -
  • ! = ! -
Avoiding stalls

G /

  ( ! - .03*033 & -!

Avoiding stalls (2)

    <
  • ! 2 = !

  .?23 ?20&amp; C232?C&amp; 22/

  (

  • ! . 'JK ! ,/ ;! . &amp;

  ! / - - K ./ ./ .

Handling Real-World workloads..

  4 % ) - ' ! F

  G ; G !- 4/

  F 8 &amp; ) &amp; )' 7*9 -

RDBMS or NoSQL?

E

  = L % !

  &amp; ! -M

  " # ! $ ; . ') &amp; $% &amp; )' *+, / ' , 1! &amp; ! ! &amp; &amp; ,&amp;

  ! &amp; &amp; ! ! -

  

$ ; .' = 0333 ! / E

  ! 03&amp;333&lt; ) %(+

  " ,

  • % = -
  • <
  • &amp; , !
Social Game workloads

  • !
    • 8

  G + )J @! K &lt; ,,,K A G :;

  • !

  %

  • !
  • = ! &amp;
    • $( ) %(+ + +(+ &amp;

INSERT-mostly tables

  • " ! &amp; &amp; -
    • $( + +'( ( - 8 - ! . , 0(;/

  . + +'( /

  • $(

  ; .( ;! 2 ! % /

  • ( , &amp; -! # , 8 -
  • ! &amp;
    • D *- !

  G ! ! ! G ) %(+ + +(+ + +'( !

  • %
InnoDB Feature: Insert Buffering

  • ! 1! &amp; , - &amp;

  ;

  • ! .@ -
  • !
  • I (+
  • ) &amp; -!

  .@ A/ - ,

  Insert buffer

  ) 4 $ !

  • $ ! ! -

  1! Optimized i/o

  • 1!

  '

  4 %

  , ! - ! 2 -

INSERT gets slower

  [Figure: Time to insert 1 million records (InnoDB, HDD), in seconds, against existing records (millions), for sequential order and random order. Annotation: index size exceeded buffer pool size.]
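The gap between the two insert rates annotated on this chart can be put in seconds. The following is a minimal Python sketch, assuming the annotations mean roughly 10,000 rows/s while the index still fits in the buffer pool and roughly 2,000 rows/s once it no longer does. `seconds_to_insert` is a hypothetical helper for illustration, not part of MySQL.

```python
def seconds_to_insert(rows: int, rows_per_second: float) -> float:
    """Time needed to insert `rows` records at a steady rate."""
    if rows_per_second <= 0:
        raise ValueError("rows_per_second must be positive")
    return rows / rows_per_second


BATCH = 1_000_000  # the chart measures the time to insert 1 million records

in_memory = seconds_to_insert(BATCH, 10_000)   # assumed: index fits in the buffer pool
disk_bound = seconds_to_insert(BATCH, 2_000)   # assumed: index exceeded the buffer pool

print(f"in memory:  {in_memory:.0f} s per million rows")
print(f"disk bound: {disk_bound:.0f} s per million rows "
      f"({disk_bound / in_memory:.0f}x slower)")
```

Under those assumptions the same batch takes 100 seconds in memory and 500 seconds once inserts become disk bound, which is in line with the 100 to 600 second range on the chart's vertical axis.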

  10,000 rows/s vs. 2,000 rows/s

INSERT performance difference

  • $( ! ! *

  0C333&lt; " #

  • , -! &amp;

  7333*?333 " &amp; &gt;333*B333 ! *

  !

  % C333 ! !

  G ; ! ! ') ! 1! - ! . 2 2 ! ! &amp; /

Approach to complete INSERT in memory

  ( ( + (% ) - ! - G ( + (% ) ! !

  C20 , 8 - K ,K 8 ! - K K

  [Diagram: a single big physical table (index) compared with the same table split into Partition 1, Partition 2, Partition 3 and Partition 4.]

  • % (+$ (%; + 22 $ ) )%$( (

Approaches to complete INSERT in memory

  • .! C23&amp; !

  = / )! , !

UPDATE-mostly tables

  ! = ! G '! ")&amp; , &amp; &amp; ! - &amp; = ! &amp; -

  ! &amp; G ) %(+ + +'( 1! , ! !

  !

  ) %(+

  , - - G -! &amp; ! G

  8 &amp; +$( + +(+

  • "!
    • G ) %(+4 07&amp;333 G " ) %(+4 933 G %(% ) %(+4 0&amp;B33 G )' *+ ) %(+4 ?&amp;333 G O (

  G $ ! &amp;

What do you need to consider? (H/W layer)

  %(% % )' *+, E&amp; " E $%

  " # $% &amp; # $% F; E 033 - 0:- ! E 7:; $% &lt; )' *+ &gt;?:; $% &lt; B" E

What do you need to consider?

  $% $; . /

  $ * % $

  , 9&amp; , &amp; E &amp; $ &amp;

  !

Why SSD? IOPS!

  • %

  1!

  , - ,

  

$ ! % " 4 733 . Q / 4 7&amp;333&lt; . / C&amp;333&lt; . / Table of contents

; ) "

  $ $ $ #

  1! $ 1! #

  ./ ) !

  8 ' '

  R

Random Read benchmark

  [Figure: Direct Random Read IOPS (Single Drive, 16KB, xfs) against # of I/O threads (1 to 200), for HDD, Intel SSD and Fusion I/O.]

  HDD: 196 reads/s, rising to 443 reads/s at 100 I/O threads
  Intel: 3508 reads/s, rising to 14538 reads/s at 100 I/O threads
  Fusion I/O: 10526 reads/s, rising to 41379 reads/s at 100 I/O threads

  ! ! 0&gt;, - " &amp; !

  7C, - ’

  ’ .727,/ ! .?,/ ! - " High Concurrency ! % . 2 2 ?3 , ?:; S 0&gt;3:;/

  " %

  !

PCI-Express SSD

  [Diagram: CPU, North Bridge and South Bridge. One SSD I/O controller with its flash sits behind the PCI-Express controller at 2GB/s (PCI-Express x 8); the other sits behind the SAS/SATA controller at 300MB/s.]

  )' *+, ! % %(%

  . ! /

  • R )' *+

Write performance on SSD

  

  [Figure: Random Write IOPS (16KB Blocks) with 1 i/o thread and 100 i/o threads, for HDD (4 RAID10, xfs), Intel (xfs) and Fusion (xfs).]

  H ! ;! 22 ' ! ! O OE

  ’

Understanding how data is written to SSD (1)

  [Diagram: flash memory chips divided into blocks, some of them empty; each block holds pages.]

  . 2 2 7:;/ . 2 2 C07J;/ - %

  % - . 2 2 ?J;/

  • O O
  • <
  • # - % -
    • O O -

Understanding how data is written to SSD (2)

  [Diagram: blocks holding pages, empty blocks, new data being written, and four pages marked ×.]

    • # #

  .*733 / * - !

  . 2 2 ; +

  • /&amp;
  • ! .# * /
Understanding how data is written to SSD (3)

  [Diagram: blocks of pages (P) receiving new data in three steps: 1. Reading all pages, 2. Erasing the block, 3. Writing all data.]

  2 2 %

  0CB:; 0&gt;3:; ! - -

  ;

  4

  • 02 $
  • 72 +
  • 92 #
    • $% + , . /

  • % &amp; - !
    • $% +

Reserved Space

  ( ! &amp; ! “ ” 8 -

  8 2 2 0&gt;3:; &amp; 073:; &amp; ?3:;

  ! !

  8 R * * 5&gt;:

  [Diagram: blocks of pages (P) split into Data Space and Reserved Space; new data is written to an empty block (2. Writing data).]

Write performance deterioration

  % - &amp; ) “ ”

  # &amp; ) !

  “ ” .- ! +$% + /

  • ! ! # )

  “ ” . - +$% + - - ! M -/ "

  .($ ! &amp;

  [Figure: Write IOPS deterioration (16KB random write), fastest and slowest IOPS for Intel, Fusion(150G), Fusion(120G), Fusion(96G) and Fusion(80G), under continuous write-intensive workloads and after stopping writing for a while.]

Mitigating write performance deterioration

  $%

  ) 2 (

  • 8

  3 # -! !

Sequential I/O

  [Figure: Sequential Read/Write throughput (1MB consecutive reads/writes), in MB/s, for Seq read and Seq write.]

  ( - 4 ! . /&amp; M ! . / ! " 1! &amp; -!

  " .? $% 03/ ! 1! 8 - 1! ! - &amp; !

  • !

fsync() speed

  [Figure: fsync speed in fsync/sec for HDD (xfs), Intel (xfs) and Fusion I/O (xfs), with 1KB, 8KB and 16KB writes.]
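A rough way to reproduce this kind of measurement on your own storage is to overwrite a small block and call fsync() after every write. The sketch below is an assumption about the method, not the benchmark program used for the chart; `fsyncs_per_second` and its parameters are hypothetical names.

```python
import os
import tempfile
import time
from typing import Optional


def fsyncs_per_second(write_size: int, duration: float = 0.5,
                      directory: Optional[str] = None) -> float:
    """Overwrite the same block and fsync() after each write.

    Returns the number of write+fsync cycles completed per second.
    `directory` selects the filesystem under test (default: the temp dir).
    """
    payload = b"x" * write_size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        count = 0
        start = time.perf_counter()
        while time.perf_counter() - start < duration:
            os.lseek(fd, 0, os.SEEK_SET)   # overwrite, do not append
            os.write(fd, payload)
            os.fsync(fd)                   # force the data to stable storage
            count += 1
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return count / elapsed


if __name__ == "__main__":
    for size in (1024, 8192, 16384):       # the 1KB, 8KB and 16KB cases
        print(f"{size // 1024}KB: {fsyncs_per_second(size):.0f} fsync/sec")
```

Point `directory` at a path on the device you want to test. Results on a temporary filesystem held in memory say nothing about disks or SSDs.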

  • ! .T /2
HDD is fast for sequential writes / fsync

  [Diagram: writes that go straight to disk pay seek & rotation time; with a write cache in front of the disk they do not.]

Filesystem matters

  : ! , &amp; , 9 8 !

  

  [Figure: Random write iops (16KB Blocks) by filesystem, for Fusion(ext3), Fusion (xfs) and Fusion (raw), with 1 thread and 16 threads.]

Changing I/O unit size

  [Figure: Read IOPS and I/O unit size (4 HDD RAID10), IOPS against concurrency (1 to 200), for 1KB, 4KB and 16KB units.]

  77T !

  0J; 0&gt;J; -

  • ! U 03
Changing I/O unit size on FusionIO

  [Figure: Random Read IOPS (FusionIO, SLC), reads/s against concurrency (1 to 200), for 4KB, 8KB and 16KB units.]

  8 ’

  • ! “ - ! - 8 ” !
SLC vs MLC (16KB)

  [Figure: Random Read IOPS, FusionIO (16KB), reads/s against concurrency (1 to 200), for SLC and MLC.]

  8-40% better throughput on SLC

SLC vs MLC (8KB)

  [Figure: Random Read IOPS, FusionIO (8KB), reads/s against concurrency (1 to 200), for SLC and MLC.]

  25-75% better throughput on SLC

tachIOn vs FusionIO (SLC)

  [Figure: Random Read IOPS (16KB), reads/s against concurrency (1 to 200), for FusionIO and tachIOn.]

  !

PCI-Express interface and CPU util

  # cat /proc/interrupts | grep PCI
   83: … PCI-MSI   vgcinit
  202: … PCI-MSI-X eth2-0
  210: … PCI-MSI-X eth2-1
  218: … PCI-MSI-X eth2-2
  226: … PCI-MSI-X eth2-3
  234: … PCI-MSI-X eth2-4

  # mpstat -P ALL 1
  CPU  %user  %nice   %sys %iowait   %irq  %soft  %idle    intr/s
    0   1.00   0.00  12.60   86.40   0.00   0.00   0.00   1000.20
    1   1.00   0.00  13.63   85.37   0.00   0.00   0.00      0.00
    2   0.40   0.00   4.80   26.80   0.00   0.00  68.00      0.00
    3   0.00   0.00   0.00    0.00  79.20   0.00  20.80  39033.20
  ...

  • T 1 ." ! / $

  )' *+ ; ! ! ! - $

  # of interfaces (tachIOn SLC)

  [Figure: Random Read IOPS (16KB, tachIOn), reads/s against concurrency (1 to 200), for Single Drive and Two Drives (RAID0).]

  • ( !

  ! ! &amp; ! # ! - . ! 1! /&amp; T 1 &amp; !

  ! ! ( ! -

# of interfaces (FusionIO MLC)

  [Figure: Random Read IOPS (16KB), reads/s against concurrency (1 to 200), for FusionIO and FusionIO Duo.]

  ! ! )' *+,

  ( $ - ! &amp; ! !

  % ! ! ! . &amp; -/2

  • ! ! . /

  G ! ") 9&gt;3 . )' *+ / - G &amp;

  • ! - &amp; , !
tachIOn(SLC) vs FusionIO Duo(MLC)

  [Figure: Random Read IOPS (16KB), reads/s against concurrency (1 to 200), for tachIOn, FusionIO Duo and FusionIO.]

Opteron vs Nehalem(tachIOn)

  [Figure: tachIOn Random Read IOPS (16KB), reads/s against # of threads (1 to 200), for Nehalem X5650 and Opteron 6174.]

  7

  ) -

  • )' *+ ') 1!

Opteron vs Nehalem(tachIOn)

  [Figure: tachIOn Random Read IOPS (4KB), reads/s against # of threads (1 to 200), for Opteron 6174 and Nehalem X5650.]

  • T 1 033T ! " " ! 1! ')
How about using SSD as L2 Cache?

  7'

  &amp; (

  $ !

  V 7%$' G )

  V ' -

  G # !, J ! ! '

  • G # !

    2 !

  '

  !

  ) . ' / G !, J ! G + ! 033T &amp; ?3T&amp; NCT

  ! . ' C2C/ G P7C*+ = ) !

  • G = M! 2 # , 2 = - &amp;

  2

  • ( ! 2 7'

Virtualization?

  ! . /

  : 0 93 ! ! -! ! 032? &lt; JH &lt; !

  # " &amp; ! ! - ! ! ! !

Virtualization benchmarks (HDD)

  [Figure: Random Read IOPS (Dell cloud HDD, 16KB unit), reads/sec against # of threads (1 to 200), for Physical and KVM.]

Virtualization benchmarks (SATA SSD)

  [Figure: Random Read IOPS (Dell cloud SSD, 16KB unit), reads/sec against # of threads (1 to 200), for Physical and KVM.]

  9 $ " . %(%/ $ " ! .)' *+/

MySQL Deployment Practices on SSD

  • " ! C2C C20

  ;(*7 - . / 733 # ! .73:; – 7C:; / !

  4 '&amp; 5&gt;:; &amp; &gt;?:;

HDD vs Intel SSD vs Fusion I/O

  [Figure: DBT-2 throughput in NOTPM with a 1G buffer pool: HDD 1125.44, Intel SSD 5709.06, Fusion I/O 15122.75.]

  NOTPM: Number of Transactions per minute

Which should we spend money on, RAM or SSD?

  Buffer pool   HDD         Intel SSD   Fusion I/O
  1G            1125.44     5709.06     15122.75
  2G            1863.19
  5G            4385.18
  30G           36784.76

Which should we spend money on, RAM or SSD?

  Buffer pool   HDD         Intel SSD   Fusion I/O
  1G            1125.44     5709.06     15122.75
  2G            1863.19     7536.55     20096.33
  5G            4385.18     12892.56    30846.34
  30G           36784.76    -           57441.64
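The RAM-or-SSD question comes down to ratios between cells of this table. The sketch below is a small Python calculation over the slide's NOTPM values. The assignment of each value to HDD, Intel SSD or Fusion I/O is an assumption made from the slide titles and the magnitude of the numbers, and `speedup` is a hypothetical helper.

```python
# NOTPM values from the slide. Which storage each value belongs to is an
# assumption based on the slide titles and the size of the numbers.
NOTPM = {
    "HDD":        {"1G": 1125.44, "2G": 1863.19, "5G": 4385.18, "30G": 36784.76},
    "Intel SSD":  {"1G": 5709.06, "2G": 7536.55, "5G": 12892.56},
    "Fusion I/O": {"1G": 15122.75, "2G": 20096.33, "5G": 30846.34, "30G": 57441.64},
}


def speedup(storage: str, pool: str, base_storage: str, base_pool: str) -> float:
    """Throughput of one configuration relative to a baseline configuration."""
    return NOTPM[storage][pool] / NOTPM[base_storage][base_pool]


comparisons = [
    ("Fusion I/O", "1G", "HDD", "1G"),    # faster storage, same RAM
    ("HDD", "30G", "HDD", "1G"),          # more RAM, same storage
    ("HDD", "30G", "Fusion I/O", "1G"),   # RAM against SSD
]
for storage, pool, base_storage, base_pool in comparisons:
    ratio = speedup(storage, pool, base_storage, base_pool)
    print(f"{storage} {pool} vs {base_storage} {base_pool}: {ratio:.1f}x")
```

Read this way, a buffer pool large enough to hold the hot data helps HDD more than swapping in faster storage does with a 1G pool, and the two together give the highest figure in the table.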

MySQL file location

  ! 1! $

  4

  .O2 - /

  • – - 1! !

  &amp; ;! . - /

  • – - . &amp; , ! * ! / -
  • – -! . &amp; , ; *

  ! ! , /

  1! #

  4 !- ;! . - 3/

  • – # ! 1! O - 2 "!

  ; . 1 *- 2PPPPPP/ $ . -K /

Moving sequentially written files into HDD

  Buffer pool   Fusion I/O                     Fusion I/O + HDD               Gain
  1G            15122.75 (us=25%, wa=15%)      19295.94 (us=32%, wa=10%)      +28%
  2G            20096.33 (us=30%, wa=12.5%)    25627.49 (us=36%, wa=8%)       +28%
  5G            30846.34 (us=39%, wa=10%)      39435.25 (us=49%, wa=6%)       +28%
  30G           57441.64 (us=70%, wa=3.5%)     66053.68 (us=77%, wa=1%)       +15%

  • &amp; -K &amp; .&lt;- / " "

  T !

Does CPU matter?

  [Diagram: Older Xeon: CPUs connect over the FSB (10.6GB/s) to the North Bridge (MCH), which connects memory and PCI-Express. Nehalem: memory attaches to the CPUs, which connect over QPI (25.6GB/s) to the North Bridge (IOH) and PCI-Express.]

  02 ') 4

  • 72

  ') ;

  72C, &amp; ') U*D

Harpertown X5470 (older Xeon) vs Nehalem X5570 (HDD)

  Buffer pool   Harpertown X5470, 3.33GHz   Nehalem X5570, 2.93GHz   Gain
  1G            1135.37 (us=1%)             1125.44 (us=1%)          -1%
  2G            1922.23 (us=2%)             1863.19 (us=2%)          -3%
  5G            4176.51 (us=7%)             4385.18 (us=7%)          +5%
  30G           30903.4 (us=40%)            36784.76 (us=40%)        +19%

  us: userland CPU utilization

Harpertown X5470 vs Nehalem X5570 (Fusion)

  • ! .-! 0: 7:/&amp; ') !

  8 &amp; -! () !

  H ! )' *+

  ;

  • ') )' *+, !

  Buffer pool   Harpertown X5470, 3.33GHz   Nehalem X5570, 2.93GHz   Gain
  1G            13534.06 (us=35%)           19295.94 (us=32%)        +43%
  2G            19026.64 (us=40%)           25627.49 (us=37%)        +35%
  5G            30058.48 (us=50%)           39435.25 (us=50%)        +31%
  30G           52582.71 (us=76%)           66053.68 (us=76%)        +26%

Intel SSDs with a traditional H/W RAID controller

  Buffer pool   Intel SSD   Intel SSDs, RAID5   Change
  1G            5709.06     2975.04             -48%
  2G            7536.55     4763.60             -37%
  5G            12892.56    11739.27            -9%

Enable HyperThreading

  Buffer pool   HT (8)      HT (16)     Gain
  1G            19295.94    20785.42    +7.7%
  2G            25627.49    28438.00    +11%
  5G            39435.25    45785.12    +16%
  30G           66053.68    81412.23    +23%

  • ! ') - !
MySQL 5.5 : 20-26%

  • ! !

  ; ') T! T

  • – T! 4 9&gt;T .C2C27/ ??T .C2C2?/ -! S 7
  • – T

  4 BT .C2C27/ C2CT .C2C2?/ -! S 7 &amp; -!

  73T ' !

  ! 1! C2C L

  • ! ? "
    • – #

  Buffer pool   MySQL 5.1   MySQL 5.5   Gain
  1G            19295.94    24019.32    +24%
  2G            25627.49    32325.76    +26%
  5G            39435.25    47296.12    +20%
  30G           66053.68    67253.45    +1.8%

Where can PCI-Express SSD be used?

  E )' *+ ! &amp;

  G 03&amp;333 ) &amp; ?3&amp;333&lt; ) 033

  03&amp;333 ) 033 - %(% ) ! -

  E ! " &amp; %(% ! - !

  G )' *+, ! , %(%

  " - ! ! ! E

  G H !

  • )' *+ ! ! ! E

  " &amp; 033*733:; - -

  • ! ) "

  ! '4 973:; ! &lt; 0&gt;3:; S ?B3:; ! '4 07B3:; ! &lt; &gt;?3:; S 0573:; . ! /

  (capacity: 800GB x 2 = 1600GB)

Running multiple slaves on single box

  )' *+

  ; ! " - ' !

  = &amp; ! -

  ! - ! ! ( ! -

  [Diagram: Before and After layouts of the M, B, S1, S2 and S3 instances across boxes.]
Our environment

  ") 9&gt;3:N .0 /&amp; $&gt;03 )' *+

  ! ' .&gt;?3*07B3:; ! &lt; 973*&gt;?3:; / ' .B33:; , 7/ *D '

  ') (

  • &amp; &gt;* &amp; &qu
  • G 7? ') G ! ,

  $% &gt;3:; ; ;' CN35&amp; ! ; , 7 ( ) &amp; &gt;*07 .S / ! )

  " ?*B % $% 0&lt;3

Statistics

' N ! .&gt;?3:; ' ! &lt; 973:; '/

  • '( 1! &gt;:; -K-! K K 8

  ) ) . N / &gt;0&gt;B92N 1!

  9N59520 NB&gt;020 ! 003C

  0B?9 90?92C - ')

  7N29T&amp; T

  00T.T ?T/&amp; T ?T '2 2 %(% :T! ?T&amp; T

  

0T&amp; T

  0T ;!

  552?T %(% 4 552BT

  • .033&lt; / !
CPU loads

  T!

  7N29T&amp; T

  00T.T ?T/&amp; T ?T $ .5&gt;3:;/2 0573:; ' ! -

  $ 4 0?2&gt; ; &amp; 4 7B2N ; ' C &lt; - 1! . ') 1! / ! - &amp; -! - !

  22:10:57  CPU  %user  %nice   %sys %iowait   %irq  %soft %steal  %idle    intr/s
  22:11:57  all  27.13   0.00   6.58    4.06   0.14   3.70   0.00  58.40  56589.95
  …
  22:11:57   23  30.85   0.00   7.43    0.90   1.65  49.78   0.00   9.38  44031.82

Things to consider

  %

  • # ) ! , @ S993&gt;A

  S@ ! A - 2 *

  '

  &amp; &amp; ; &amp; - !

  • " &amp; ; &amp; $ &amp; $ &amp; &amp; - 3 .

taskset

  # taskset -pc 0,12,2,14 `cat /var/lib/mysql/mysqld1.pid`
  # taskset -pc 1,13,3,15 `cat /var/lib/mysql/mysqld2.pid`
  # taskset -pc 5,17,7,19 `cat /var/lib/mysql/mysqld3.pid`
  # taskset -pc 8,20,10,22 `cat /var/lib/mysql/mysqld4.pid`
  # taskset -pc 9,21,11,23 `cat /var/lib/mysql/mysqld5.pid`
  # taskset -pc 4,16,6,18 `cat /var/lib/mysql/mysqld6.pid`
  # taskset -pc 8,20,10,22 `cat /var/lib/mysql/mysqld7.pid`
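The same pinning can be scripted. The sketch below is a minimal Python equivalent of one `taskset -pc` line, assuming a Linux host and a PID file like the ones above; `parse_cpu_list` and `pin_process` are hypothetical helper names. Like `taskset -p` without `-a`, `os.sched_setaffinity` changes only the task whose ID is given.

```python
import os
from typing import Set


def parse_cpu_list(spec: str) -> Set[int]:
    """Turn a taskset-style list such as "0,12,2,14" or "0-3,8" into CPU numbers."""
    cpus: Set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        elif part:
            cpus.add(int(part))
    return cpus


def pin_process(pid_file: str, spec: str) -> Set[int]:
    """Pin the process whose PID is stored in `pid_file` to the CPUs in `spec`.

    Linux only; needs permission to change the target's affinity.
    Returns the affinity mask reported by the kernel afterwards.
    """
    with open(pid_file) as f:
        pid = int(f.read().strip())
    os.sched_setaffinity(pid, parse_cpu_list(spec))
    return set(os.sched_getaffinity(pid))


if __name__ == "__main__":
    # Parsing only; pin_process() is not called here because it needs a live mysqld.
    print(sorted(parse_cpu_list("0,12,2,14")))
```

For example, `pin_process("/var/lib/mysql/mysqld1.pid", "0,12,2,14")` would mirror the first command in the list.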

  ! 7? ')

  # ! &gt;&lt; 7? &amp; ! - ! 8 ')

  • ! ' ! ! !

  . / #

  &amp; ! . / Application Design ;

  ') ! - , ! "

  • " .

  / " ! -

  • $

  "

Making MySQL better

  8 ! , -

  • %$ "% "

Future improvements from DBA perspective

  ! - !

  

. ! /

  $ B33:; !

  ! - -

  ! ! G *- S0&amp; -K ! K K K ,K S0 G $

  • .:% C2C/ ' * - .- ! / $ , ! ! ! - ; - &amp;

Conclusion for choosing H/W

  . 2 2 973/ ' ! $%

  2 J ! &amp; ! &amp; ! $% 3 ! " 1!

Conclusion for database deployments

  "

  &amp; -K &amp; - - " ! 1!

  • #
  • ,

  )!

  03, *033, "

  • %

  "

  ! , "

  C20 ; ) ! C2C

  • " Q !
What will happen in the real database world?

  22

  ’ ! M! " ! E

  • !

  H !

  8

  ! !

  1 !

  !

  M! )

  • ' ! - 8 &amp; !- &amp; -

  ' !

  B*0&gt; "

Random Access Memory

  $% 4 *&gt;3

  • – 033&amp;333 1!
    • " 4 *C 4 033*C33!

  0&gt;*033&lt;:; $% O O ! -

  8

  8

  . % ( H%$'"%$ ; : (&amp; ( + (% ) %(+( +&amp; /

  ! ,

  • &amp; -

Cache hot application data in memory

  ;(*7 - . / 73*7C:; .733 ! &amp; ! ! / $%

  8 2 + +'(&amp; -!

  • $( ) %(+ + +(+
  • $(4 $

  , ) %(+ + +(+4 $

  DBT-2 (W200)

  Buffer pool   %user   %iowait   Transactions per minute
  1G            2%      30%       1125.44
  2G            3%      28%       1863.19
  5G            5.5%    33%       4385.18
  30G           36%     8%        36784.76

Use Direct I/O

  [Diagram: Direct I/O: InnoDB Buffer Pool in RAM, InnoDB Data File on disk. Buffered I/O: InnoDB Buffer Pool and Filesystem Cache in RAM, InnoDB Data File on disk.]

  ! !

  8

  • innodb_flush_method=O_DIRECT

  4 ! ! - C07 - ' ’ ! K $+'( ; &amp; ; &amp;

Do not allocate too much memory

  user$ top
  Mem:  32967008k total, 32808696k used,   158312k free,    10240k buffers
  Swap: 35650896k total,  4749460k used, 30901436k free,   819840k cached

    PID USER  PR NI  VIRT RES SHR S %CPU %MEM   TIME+ COMMAND
   5231 mysql 25  0 35.0g 30g 324 S  0.0 71.8 7:46.50 mysqld
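The "Swap ... used" figure that `top` prints can also be watched from a script. The sketch below is a minimal Python example, assuming a Linux host where `/proc/meminfo` exposes `SwapTotal` and `SwapFree` in kB; the function names are hypothetical.

```python
def swap_used_kb(meminfo_text: str) -> int:
    """Compute used swap in kB from the text of /proc/meminfo."""
    values = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[name.strip()] = int(fields[0])   # first field is the number
    return values["SwapTotal"] - values["SwapFree"]


def read_swap_used_kb(path: str = "/proc/meminfo") -> int:
    """Read the live value on a Linux host."""
    with open(path) as f:
        return swap_used_kb(f.read())


# The totals from the top output above, written in /proc/meminfo form.
sample = "SwapTotal:       35650896 kB\nSwapFree:        30901436 kB\n"
print(swap_used_kb(sample), "kB of swap in use")
```

Fed the totals from the `top` output above, the calculation gives 4749460 kB, the same figure `top` reports as used swap.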

  #

  • E

  $ ! . /

  • )

  . ! /

  • . /
What if setting swap size to zero?

  ’ ;

  8 8 &amp; 2 ;! 22

  H ! # $%

  • &amp;

  2 J

  • ( !
    • . 1 /

  ’ ! 2 ' - ) - $ +$ ;I U) D K + '

  1

  • – H 8 &amp; ') &amp; ! &amp;

  . ! ! / J # ’

  • ! !

Do not set swap=zero

  top - 01:01:29 up 5:53, 3 users, load average: 0.66, 0.17, 0.06
  Tasks: 170 total, 3 running, 167 sleeping, 0 stopped, 0 zombie
  Cpu(s): 0.0%us, 24.9%sy, 0.0%ni, 75.0%id, 0.2%wa, 0.0%hi, 0.0%si, 0.0%st
  Mem:  32967008k total, 32815800k used, 151208k free, 8448k buffers
  Swap: 0k total, 0k used, 0k free, 376880k cached

    PID USER  PR NI VIRT RES  SHR S %CPU %MEM   TIME+ COMMAND
  26988 mysql 25  0  30g 30g 1452 R 98.5 97.7 0:42.18 mysqld

  • &amp; ') ! 033T !

  7?25T . / S 0 ? ! 033T ! ! ( 8 . " ’ - / -

  • &amp; -! !
What if stopping OOM Killer?

  Setting /proc/<PID>/oom_adj to -17 keeps the OOM Killer from killing that process.

  • 0N ! R *0N D U D K M ;! ’ *0N

  1 !

  &amp; !, * * ’

  • # ’

Swap space management

  • 8

  ;! ’ 1 !

  # ! E $ ;

  • – ! . -K-! K &amp; K-! &amp;

  K-! &amp; /

  • – ! . ; &amp;
    • &amp; % &amp; /

  % .- ! &amp; /

  • – !

Be careful about backup operations

  Mem:  32967008k total, 28947472k used,  4019536k free,   152520k buffers
  Swap: 35650896k total,        0k used, 35650896k free,   197824k cached
    PID USER  PR NI  VIRT RES SHR S %CPU %MEM   TIME+ COMMAND
   5231 mysql 25  0 27.0g 27g 288 S  0.0 92.6 7:40.88 mysqld

  Copying 8GB datafile

  Mem:  32967008k total, 32808696k used,   158312k free,    10240k buffers
  Swap: 35650896k total,  4749460k used, 30901436k free,  8819840k cached
    PID USER  PR NI  VIRT RES SHR S %CPU %MEM   TIME+ COMMAND
   5231 mysql 25  0 27.0g 22g 324 S  0.0 71.8 7:46.50 mysqld

vm.swappiness = 0

  Mem:  32967008k total, 28947472k used,  4019536k free,   152520k buffers
  Swap: 35650896k total,        0k used, 35650896k free,   197824k cached
    PID USER  PR NI  VIRT RES SHR S %CPU %MEM   TIME+ COMMAND
   5231 mysql 25  0 27.0g 27g 288 S  0.0 91.3 7:55.88 mysqld

  Copying 8GB of datafile

  Mem:  32967008k total, 32783668k used,   183340k free,     3940k buffers
  Swap: 35650896k total,      216k used, 35650680k free,  4117432k cached
    PID USER  PR NI  VIRT RES SHR S %CPU %MEM   TIME+ COMMAND
   5231 mysql 25  0 27.0g 27g 288 S  0.0 80.6 8:01.44 mysqld
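Before relying on the behaviour shown above, it is worth confirming what the running kernel is actually set to. The sketch below is a minimal Python check, assuming a Linux host that exposes the setting at `/proc/sys/vm/swappiness`; the helper names are hypothetical.

```python
def parse_swappiness(text: str) -> int:
    """Parse the content of /proc/sys/vm/swappiness (a single integer)."""
    return int(text.strip())


def read_swappiness(path: str = "/proc/sys/vm/swappiness") -> int:
    """Read the current vm.swappiness value on a Linux host."""
    with open(path) as f:
        return parse_swappiness(f.read())


def swappiness_advice(value: int) -> str:
    """Compare a value against the vm.swappiness = 0 setting from the slide."""
    if value == 0:
        return "vm.swappiness is already 0"
    return f"vm.swappiness is {value}; the slide uses vm.swappiness = 0"


print(swappiness_advice(parse_swappiness("60\n")))
```

To make the setting survive a reboot it is commonly placed in `/etc/sysctl.conf` as `vm.swappiness = 0`.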

  2 S3

  2 ! &gt;3

  # $% ! ! &amp; !, ! .

  / %

  • &amp;

But swap happens even though swappiness==0

  32GB RAM box, CentOS5 (2.6.18.128)

  • innodb_buffer_pool_size = 26GB
  • mysqld outside innodb_buffer_pool_size: approx 3GB
  • Kernel space: approx 1GB

  top - 11:54:51 up 7 days, 15:17, 1 user, load average: 0.21, 0.14, 0.10
  Tasks: 251 total, 1 running, 250 sleeping, 0 stopped, 0 zombie
  Cpu(s): 0.5%us, 0.2%sy, 0.0%ni, 98.9%id, 0.3%wa, 0.0%hi, 0.1%si, 0.0%st
  Mem:  32818368k total, 31154696k used, 1663672k free,  125048k buffers
  Swap:  4184924k total,  1292756k used, 2892168k free, 2716380k cached

    PID USER  PR NI  VIRT RES  SHR S %CPU %MEM     TIME+ COMMAND
   6999 mysql 15  0 28.4g 26g 5444 S  9.3 85.2 210:52.67 mysqld

Swap caused by log files

  ! !

  ; 8 4 9:; ’ ; ) ! - ! - !

  ; . /

  [Diagram: Filesystem Cache, InnoDB Buffer Pool, InnoDB Data File, Log Files.]

Swap bug on Linux

  % ! !, J

  , 72&gt;27B4 $"+ &gt; 4 -! 8

  2

  2 K-! 2 E S0&gt;3399 # ' &gt; - E

  ; -! ,

  4 K S K 7 &lt; &lt; *D W

  K D 033&amp; G (

  ! &amp; ! SS 3

  # - E

  # ; &amp; - &amp; ! - ! - ! -K K K 8

  G -! - , ; ) ! &amp; ! 9*?:; - -

  # ! - - $"+ CE

  • K-! K K 8 -K K K 8 E

Tool: unmap_mysql_logs

  1 K

  • )

  G 44' ')% !

  • ; 4 %

  ; &amp; $

  4 G '! . / - 4 % , . / 03T G ; ! ! - ; !

  • 4

  G

  [Diagram: InnoDB Buffer Pool, Filesystem Cache, InnoDB Data File, Log Files.]

Performance effect

  Mem:  32825116k total, 32728924k used,   96192k free,  246360k buffers
  Swap:  4186072k total,   118764k used, 4067308k free, 2803088k cached
    PID USER  PR NI  VIRT RES  SHR S %CPU %MEM     TIME+ COMMAND
  30814 mysql 15  0 28.8g 27g 5644 S 81.1 89.4 158198:37 mysqld

  Mem:  32825116k total, 30061632k used, 2763484k free,  247420k buffers
  Swap:  4186072k total,   118764k used, 4067308k free,  143372k cached
    PID USER  PR NI  VIRT RES  SHR S %CPU %MEM     TIME+ COMMAND
  30814 mysql 15  0 28.8g 27g 5644 S 76.1 89.4 158200:59 mysqld

  • 03 !
Memory allocator

  !

  • !

  : ) . ! /

  • – R ! -!
  • – R
    • ,2 W 2 ! - * W W *

  ,

  • K)$+ % S ! K

  2 W

  1 K Q

  ; !

  • ' ; ) !
    • – -K! K K S 0. ! 0/&amp; ; !
    • – - - ! K)$+ %
Memory allocator would matter for CPU bound workloads

  Buffer pool   %user   default     tcmalloc_minimal   Gain
  1G            2%      1125.44     1131.04            +0.50%
  2G            3%      1863.19     1881.47            +0.98%
  5G            5.5%    4385.18     4460.50            +1.2%
  30G           36%     36784.76    38400.30           +4.4%

Be careful about per-session memory

  • ! /

  % 7 ; !

  07BJ; !, ./ - ./

  8 US C07J;&amp; ./ ! *

  • '( O $ ! K K - ( 0W
  • ( K-! K 8 S 7C&gt;O037?W .7C&gt;J;/
    • – *D 32&gt;B ! 03&amp;333

  • ( K-! K 8 S 73?BO037?W .7 ;/
    • – *D 0B2B0 ! 03&amp;333
      • 2 - ;! - ! - ! , .

  4 % &lt; (&lt; ! /

  File I/O and synchronous writes

  $ ; ./ . &amp; &amp; / ! ! ; ; ! # ' .;;#'/

  03&amp;333&lt; ./ &amp; ! ;;#' 733 "

  • “ ”

  ! - - . - /

  • # ! ! ;;#' . / , 94 ! * - S3 . ! ! !,/
  • , 4 !
    • 4 - 2

  [Diagram: writes that go straight to disk pay seek & rotation time; a write cache with battery in front of the disk absorbs them.]

BBWC and “Auto Learn” Issues

  • ;

  @%! A

  • . ! /&amp; -

  G ( 0 " ! !

  G ! %! !

  G ! - G ! ! ! .

  ! / G

  • ! G ! , ! . 2 2 B /&amp; - - -
    • ,

  . -! /

  $% ' ;;#' % ,

  53

  • G @+ 4 ;; W #; ! #(A

  $

  FBWC (Flash Backed Write Cache)

  • @%! A !

    ") ;#' ! . 9&gt;3:N/

Overwriting or Appending?

. , 8 /&amp;

. 8 /

  4 ; % 4 ;

  % &lt; ./ ! &lt; ./

  • – %
    • Q ! - ./

  03&amp;333&lt; &amp; 9&amp;333

  • – %
  • – ' ! V ! .N&amp;333&lt;/ * *
    • ; ! ! S0

  • – ' !

  V

  • – ' “ ” 4 # R?57C -

  , 1!

  • – * -* ! , S 73 . ! B/
Quick file i/o health check

  • &lt; ./
    • – $! 1 . ;&amp; * &amp;

  • K ! K K K ,K S0/&amp; 1 0&amp;333

  $ mysqlslap --concurrency=1 --iterations=1 --engine=innodb \
      --auto-generate-sql --auto-generate-sql-load-type=write \
      --number-of-queries=100000

Buffered and asynchronous writes

  &amp; ! &amp; % &amp; 1 ! &amp; -K ! K K K ,K S7&amp;

  ! - ! &amp; , ! B

  # E *D

  2 K- ! K

  2 K

  • !

  ! K- ! K O $% . !

  03T&amp; 03T &gt;?:; &gt;2?:;/ !

  K O $% . ! ?3T/

  • &amp; -! !
  • % -!

  ! &amp; ! !

  • , ! W $ !

  2 K- ! K 72&gt;297

  • – !

Filesystem ext3

  ; ! ;! -

  . " / &amp; $ ) - &amp;

  • .- 'JK ! ,/ ; ! ! % &amp; ; -K K K - &amp; );P(&amp;

  #

  8

  • 8 - “ * ! ,”&amp;

  ! 8 . )' *+, / “ K ,” !

Tip: fast large file remove

  1 -0 ! K - 2 - ! K - 2 - 2 $ ) (%; + ! K - W

  G ! K - 2 - 2 = ! ! - )

  8 &amp;

  • #

  ! !

  8 )

  ! &amp; -!

Filesystem xfs/ext2

  • ' !

  ! K $+'( ! ! $"+ . ! ! +/

  “ ” - - - -

  , 7

  ’ ! ! - , 7 M !

  ! . 2 2 * $ /&amp;

  , 7 !

  ; .! /

  • '
    • ! . ! / ! -
Concurrent write matters on fast storage

  [Figure: Random Write IOPS (16KB Blocks) with 1 i/o thread and 100 i/o threads, for HDD(ext3), HDD(xfs), Intel(ext3), Intel(xfs), Fusion(ext3) and Fusion (xfs).]

  " .? % $% 0/ -

I/O scheduler

  4 $ ; . ;/ ! 1! !, ! !, !

  1! “ ! ” “ ! ! 8 ”

  ( ! . ! 72&gt;2034 $"+ C/ 4 1! - - &amp; ’ 4 )

  8 . / 1! 1! . / , . “ * * ” - /

  1. ! /4 ! 1! 4 $ 72&gt;299 .- ! 2 ’ ! /

  ! 1&amp; -! -
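To see which scheduler a block device is currently using, Linux exposes the list in sysfs with the active entry in square brackets. The sketch below is a minimal Python reader, assuming the usual `/sys/block/<device>/queue/scheduler` layout; `parse_scheduler` and `read_scheduler` are hypothetical helper names, and the sample line is illustrative rather than taken from the slides.

```python
from typing import List, Optional, Tuple


def parse_scheduler(line: str) -> Tuple[Optional[str], List[str]]:
    """Split a line such as "noop anticipatory deadline [cfq]".

    Returns (active scheduler, all available schedulers).
    """
    active: Optional[str] = None
    available: List[str] = []
    for name in line.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]          # the bracketed entry is the active one
            active = name
        available.append(name)
    return active, available


def read_scheduler(device: str) -> Tuple[Optional[str], List[str]]:
    """Read the scheduler list for a block device such as "sda" (Linux only)."""
    with open(f"/sys/block/{device}/queue/scheduler") as f:
        return parse_scheduler(f.read())


active, available = parse_scheduler("noop anticipatory deadline [cfq]\n")
print("active:", active, "available:", ", ".join(available))
```

Writing one of the available names back to the same file as root switches the scheduler for that device until the next reboot.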

cfq madness

  $ ; &amp; ) - ! " &lt; !

  I

  Running two benchmark programs concurrently

  1. Multi-threaded random disk reads (Simulating RDBMS reads)
  2. Single-threaded overwriting + fsync() (Simulating redo log writes)

  Random I/O threads   write+fsync() running   Scheduler       reads/sec   writes/sec
  1                    No                      noop/deadline   260         0
  1                    No                      cfq             260         0
  100                  No                      noop/deadline   2100        0
  100                  No                      cfq             2100        0
  1                    Yes                     noop/deadline   212         14480
  1                    Yes                     cfq             248         246
  100                  Yes                     noop/deadline   1915        12084
  100                  Yes                     cfq             2084        0

Changing I/O scheduler (InnoDB)

  [Figure: DBT-2 (MySQL5.1) throughput in NOTPM for RAID1+0 and RAID5 under the noop, cfq, deadline and as schedulers.]

  - Sun Fire X4150 (4 HDDs, H/W RAID controller+BBWC)
  - RHEL5.3 (2.6.18-128)
  - Built-in InnoDB 5.1

Changing I/O scheduler queue size (MyISAM)

  ! !

  8 S ! 1!