Structured Concurrency Control in Object Oriented Databases

     

     

  STRUCTURED  CONCURRENCY  CONTROL      

  IN     OBJECT  ORIENTED  DATABASES  

         

                           

  Francisco  Mariátegui  

   

     


     

A Dissertation Presented to the Graduate Faculty of
The School of Engineering and Applied Science
of
Southern Methodist University
in
Partial Fulfillment of the Requirements
for the Degree of
Doctor of Philosophy
with a
Major in Computer Science

By
Francisco José Mariátegui

B.Sc., Honors, Naval Academy of Peru, 1974
Systems Engineering Specialization Degree, Honors, University of Lima, 1977
M.Sc. Computer Science, U.S.A. Naval Postgraduate School, 1979
M.Sc. Computer Systems Management, U.S.A. Naval Postgraduate School, 1979

  May  13,  1989  

                     

COPYRIGHT © 1989

  Francisco  J.  Mariategui    

  All  Rights  Reserved    

   

Mariategui, Francisco J.
B.Sc. Naval Sciences, Naval Academy of Peru, 1974
System Engineering, University of Lima, 1977
M.Sc. Computer Science, U.S. Naval Postgraduate School, 1979
M.Sc. Computer Systems Management, U.S. Naval Postgraduate School, 1979

STRUCTURED CONCURRENCY CONTROL IN OBJECT ORIENTED DATABASES

Advisor: Dr. Margaret H. Eich
Doctor of Philosophy degree conferred August 12, 1989
Dissertation completed May 13, 1989

    

In the last few years a number of object-oriented database systems have appeared in the literature, most of which address specific areas such as office information systems (OIS), computer aided design (CAD), computer aided manufacturing (CAM), software engineering (SE), and artificial intelligence (AI). Unfortunately, hardly any of them addresses the problem of concurrency control from the general-purpose database point of view. Due to the extreme differences in the types of transactions supported by these environments, the need for combining different concurrency control approaches has been recognized but never thoroughly investigated.

A high level design of a Multi-Group Multi-Layer approach to concurrency control for object-oriented, message-passing based databases is presented. The design follows a formal definition of transaction. The concurrency control takes advantage of the structured nature of transactions to manage an on-line serializer. The serializer is specified as a set of filters. These filters are specifications of algorithms that ensure serializable histories. The concurrency control manages these histories by layers. Each layer, along with its corresponding filters, constitutes a different level of abstraction in concurrency control processing. Mutually exclusive groups of transactions being processed in parallel are assumed. The availability of a processor per group is also assumed. Performance improves in this case of large granularity and limited interaction.

The decomposition of the histories into layers makes the problem more manageable and allows the principles of hierarchical design to be applied and the benefits of hierarchical thought to be utilized.

Summarizing, this research has led to the following results:

1) First cut definition of an Object-Oriented Data Model (OODM) which encompasses data structures, operations, and integrity constraints.

2) Transaction processing model for the OODM environment, which not only facilitates the definition of transactions but also allows investigation of concurrency control.

3) Group Concurrency Control technique built on the OODM and transaction models that allows the use of several different concurrency control techniques in parallel in the same environment.

     

  

TABLE  OF  CONTENTS  

LIST OF FIGURES
ACKNOWLEDGEMENTS

CHAPTER 1 - INTRODUCTION
  1.1 The Problem
  1.2 The Approach
  1.3 Contribution
  1.4 Significance
  1.5 The Concurrency Control Manager
    1.5.1 Purpose
    1.5.2 Concepts and Means
    1.5.3 Benefits
    1.5.4 Interface
  1.6 General Overview of the CCMM
  1.7 Outline of the Dissertation

  2.1 Introduction
  2.2 Object-Oriented Databases: An Overview
    2.2.1 Background
    2.2.2 Definition of Terms
    2.2.3 Definition of Properties of OODBs
  2.3 Data Models
  2.4 An Object-Oriented Data Model
    2.4.1 Data Structure
    2.4.2 Operators
    2.4.3 Integrity Rules
    2.4.4 Summary
  2.5 Modeling Ability of the OODM
  2.6 Summary

CHAPTER 3 - UNIT OF CONSISTENCY
  3.1 Introduction
  3.2 Preliminaries
  3.4 The Pair <GM, DM> as a Modeling Tool
  3.5 Expanding the Model to Include Leaves
  3.6 The Complete Structure of a Transaction
  3.7 Summary

CHAPTER 4 - THEORY OF EXECUTION AND SERIALIZABILITY
  4.1 Introduction
  4.2 Preliminaries
  4.3 Serializability
  4.4 Serializability and the Pair <GM, DM>
  4.5 Informal Compression of Transaction Trees
  4.6 From Transaction Trees to Transaction Histories
  4.7 Summary

CHAPTER 5 - MULTI-LAYER CONCURRENCY: RATIONALE
  5.1 Introduction
  5.2 The Need for Change
  5.3 Representative Concurrency Control Techniques
  5.4 Structured Concurrency Control
  5.5 The Layered Approach
  5.6 Expanding the Theory of Execution: Group History
  5.7 Summary

CHAPTER 6 - MULTI-LAYER CONCURRENCY ARCHITECTURE
  6.1 Introduction
  6.2 Background
  6.3 Active Histories
  6.4 Prefixes
  6.5 Contents of Prefixes of Active Histories
    6.5.1 Notation for Prefixes
    6.5.2 Meaning of the Indexes of Histories
    6.5.3 Contents of Histories
  6.6 Hierarchy of Histories
  6.7 Elevator Functions
  6.8 Notation
  6.9 Filters
    6.9.1 Component Histories Filters
      6.9.1.1 Filter F0rsws
      6.9.1.2 Filter F0ccg
    6.9.2 Transaction Histories Filters
      6.9.2.1 Filter F1rsws
      6.9.2.2 Filter F1ccg
    6.9.3 Forest Histories Filters
      6.9.3.1 Filter F2rsws
      6.9.3.2 Filter F2tcg
    6.9.4 Group History Filters
      6.9.4.1 Filter F3m
      6.9.4.2 Filter F3gcg
      6.9.4.3 Filter F3oeg
      6.9.4.4 Filter F3rsws
  6.10 Deleting Transactions from Histories
    6.10.1 Background
    6.10.2 Deletion Filters
      6.10.2.1 Filter F3del
      6.10.2.2 Filter F2del
      6.10.2.3 Filter F1del
      6.10.2.4 Filter F0del
  6.11 History Hierarchy Cycle
    6.11.1 Cycle Algorithm
  6.12 Summary

CHAPTER 7 - TIME AND SPACE ANALYSIS
  7.1 Introduction
  7.2 Concurrency Control Time Complexity
  7.3 Notation
  7.4 Criteria and Assumptions
  7.5 General Behavior
  7.6 Group Partition
  7.7 Layer Partition
  7.8 Time Analysis Summary
  7.9 Space Analysis
    7.9.1 Potential Problem
    7.9.2 Suitable Solutions
  7.10 Summary

CHAPTER 8 - CONCLUSIONS AND FURTHER RESEARCH
  8.1 Conclusions
    8.1.1 Summary of Accomplishments
    8.1.2 Results by Stages
  8.2 Suggestions for Further Work
    8.2.1 Simulation
    8.2.2 Committed but Not Deleted Transactions
    8.2.3 Libraries of Typed Objects
    8.2.4 Conflict Predicates
    8.2.5 Early Evaluation of Inter-Group Conflicts
    8.2.6 Increase the Number of Processors
    8.2.7 Pipeline the Cycle Algorithm
    8.2.8 Router Issues

REFERENCES

         

  LIST  OF  FIGURES  

1-1   Concurrency Control Manager Module
1-2   Expansion of CCMM of Figure 1-1
2-1   Object Classes
2-2   Complex Objects
2-3   Simple Objects and Composition
2-4   Messages
2-5   Class Hierarchy for Object Graph of Figures 2-1 to 2-4
3-1   Traditional Transaction
3-2   A Possible Instance of the Pair <GM, DM>
3-3   Transaction with No Nested Messages Calls
3-4   The Pair <GM, DM> or a Transaction Tree
3-5   Transaction Tree with Potential Data Base Accesses
3-6   Complete Pair <GM, DM> or a Transaction Tree
3-7   Two Transaction Trees & One Transaction
4-1   Transaction Tree Before Compression
4-2   Transaction Tree Compressed 1 Level (Level n to Level n-1)
4-3   Transaction Tree Compressed 2 Levels (Level n to Level n-2)
4-4   Transaction Tree Compressed 3 Levels (Level n to Level n-3)
4-5   Transaction Tree Fully Compressed
6-1   Multi Group Approach to Concurrency
6-2   Window or Active History
6-3   Relationship Among Histories
6-4   Hierarchy of Histories
6-5   Upward and Downward Motion of Elevator Functions
6-6   Backward Edge Between Components 1 and 2
6-7   Predecessors and Immediate Predecessors of Transactions
6-8   Removing a Completed Transaction
6-9   Tight Predecessors
6-10  Necessary and Sufficient Condition to Remove a Transaction
6-11  Steps of the History Hierarchy Cycle Algorithm
7-1   Work Done with Traditional Approach
7-2   Work Done with Partitioning & Parallel Execution

         

  

ACKNOWLEDGEMENTS  

This dissertation is dedicated to my parents, Carmela and Francisco Mariategui, as a small tribute of my admiration and love.

Special and most sincere thanks to my advisor, Maggie Eich, for things too numerous to list here.

Also, I thank Dennis Frailey, Milan Milenkovic, Marion Sobol, and David Yun for their careful reading of the dissertation and their helpful comments.

I gratefully acknowledge the Fulbright Commission of Peru, the National Science Foundation, and the Texas Advanced Research Program for their generous support.

CHAPTER 1 - INTRODUCTION

1.1 The Problem

In the last few years, a number of self-named object-oriented database systems have appeared in the literature, most of which address specific areas such as office information systems (OIS), computer aided design (CAD), computer aided manufacturing (CAM), software engineering (SE), and artificial intelligence (AI). Unfortunately, hardly any of them addresses the problem of concurrency control from the general-purpose database point of view. These specialized databases are not general database management systems (DBMS); they are specific applications with their own file systems.

One of the reasons for building these specialized databases for specific applications is the difficulty of undertaking a major effort to attack hard problems such as concurrency control in a general framework. The concurrency control must provide multiple users with concurrent access to the database while guaranteeing database consistency at all times (as seen by the users). To preserve consistency, a transaction must see the values of all the objects either before or after other transactions have updated them.

This work is aimed at an encompassing solution to concurrency control for databases in general, and object-oriented databases in particular. An approach to a solution can be accomplished by focusing our efforts on Structured Concurrency Control, which provides flexibility and adaptability. This methodology allows object-oriented databases to achieve an efficient level of concurrency with a tolerable amount of overhead, even in the presence of a variety of transactions, each with its own requirements (short lived, long lived, etc.). Such a methodology will be developed in the framework of object-oriented databases, which, in theory, are capable of handling a variety of environments.

As of today, most commercial database systems, and even most research prototypes, choose a specific concurrency control method in the design phase of the DBMS. It is at this point that this approach differs from the conventional ones. It does not pick a single concurrency control method; it allows more than one from the start, each on the appropriate occasion and at the appropriate time. Although this work is not concerned with specifying when each different technique should be chosen, it does show how to combine them in the framework of an original design.
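The consistency requirement stated in this section (a transaction sees every object either before or after another transaction's updates) is commonly checked with a conflict graph, the notion of serializability this dissertation lists among its means. The following is a minimal textbook sketch in Python, not the filters specified in later chapters. The history encoding as `(transaction, operation, object)` triples and the function names are assumptions made for illustration.

```python
from collections import defaultdict


def conflict_graph(history):
    """Return edges Ti -> Tj for conflicting operations where Ti's comes first.

    history: list of (txn, op, obj) triples, op being "r" or "w" (assumed encoding).
    Two operations conflict when they belong to different transactions,
    touch the same object, and at least one of them is a write.
    """
    edges = defaultdict(set)
    for i, (ti, op_i, obj_i) in enumerate(history):
        for tj, op_j, obj_j in history[i + 1:]:
            if ti != tj and obj_i == obj_j and "w" in (op_i, op_j):
                edges[ti].add(tj)
    return edges


def is_conflict_serializable(history):
    """A history is conflict serializable iff its conflict graph is acyclic."""
    edges = conflict_graph(history)
    WHITE, GREY, BLACK = 0, 1, 2
    color = defaultdict(int)  # every node starts WHITE (unvisited)

    def has_cycle(node):
        # Depth-first search; meeting a GREY node means a back edge, i.e. a cycle.
        color[node] = GREY
        for nxt in edges.get(node, ()):
            if color[nxt] == GREY:
                return True
            if color[nxt] == WHITE and has_cycle(nxt):
                return True
        color[node] = BLACK
        return False

    txns = {t for t, _, _ in history}
    return not any(color[t] == WHITE and has_cycle(t) for t in txns)
```

In a history where T1 reads x, T2 then writes x, and T1 finally writes x, T1 sees x neither entirely before nor entirely after T2, and the graph contains the cycle T1 -> T2 -> T1.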

  1.2  The  Approach  

The concurrency controller is a key module within any DBMS. It encompasses most of the activities of the other modules in a DBMS in the sense that they must "obtain permission to continue" in order to perform their own tasks.

To be able to cope with the demands of the newer applications (OIS, CAD, CAM, SE, AI, etc.), the concurrency controller should no longer be "single-minded" (i.e., limited to one concurrency control technique). The different types of applications impose different demands on the DBMS, and thus affect the concurrency control.

What is needed is the use of concurrency control techniques designed specifically for active transactions and database accesses in an object-oriented database system. This does not necessarily imply the development of new techniques, but rather the selection and control of the correct method for each transaction based upon what the transaction is doing and the database state. This type of flexible and adaptable concurrency controller should be capable of reducing the overhead by, in some cases, eliminating all concurrency control activity, and in others perhaps combining the use of several techniques. Different transactions may use different techniques. It is also possible that different executions of the same transaction may use different techniques. This approach must be able to ensure correctness across all the different techniques being used. To succeed in this endeavor, the new flexible concurrency controller must be able to keep track of the states of the database as indicated by the types of transactions active at any point in time.
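The idea that each execution of a transaction is routed to its own technique can be pictured with a small dispatch sketch in Python. Everything named here is hypothetical: the class names, the `profile` strings, and the selection rule in `choose_by_profile` are illustration only, since this work does not specify when each technique should be chosen. The sketch shows only the routing. It does not provide the correctness across techniques that the approach requires.

```python
class NoControl:
    """Grants every request; stands for 'no concurrency control activity'."""

    def request(self, txn, op, obj):
        return True


class ExclusiveLocking:
    """Toy exclusive lock table: the first requester of an object owns it."""

    def __init__(self):
        self.owner = {}

    def request(self, txn, op, obj):
        return self.owner.setdefault(obj, txn) == txn


class FlexibleController:
    """Routes each transaction execution to the technique assigned at BEGIN."""

    def __init__(self, techniques, choose):
        self.techniques = techniques  # technique name -> technique instance
        self.choose = choose          # hypothetical policy: profile -> technique name
        self.assigned = {}            # transaction execution -> technique name

    def begin(self, txn, profile):
        # The assignment is per execution, so the same transaction program
        # may be given a different technique the next time it runs.
        self.assigned[txn] = self.choose(profile)
        return self.assigned[txn]

    def request(self, txn, op, obj):
        return self.techniques[self.assigned[txn]].request(txn, op, obj)


def choose_by_profile(profile):
    # Purely illustrative rule, not taken from the dissertation.
    return "none" if profile == "read_only" else "locking"
```

A controller built as `FlexibleController({"none": NoControl(), "locking": ExclusiveLocking()}, choose_by_profile)` sends update transactions through the lock table and lets read-only ones pass unchecked.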

  1.3  Contribution  

This research has led to the following results:

1) First cut definition of an Object-Oriented Data Model (OODM) which encompasses data structures, operations, and integrity constraints.

2) Transaction processing model for the OODM environment, which not only facilitates the definition of transactions but also allows investigation of concurrency control.

3) Group Concurrency Control technique built on the OODM and transaction-processing model that allows the use of several different concurrency control techniques in parallel in the same environment.

The first two results are considered as supporting the third. A special section is included in this introductory chapter to introduce the latter.

  1.4  Significance  

Due to the extreme differences in the types of transactions to be executed in an object-oriented database (long lived and short lived), the need for combining different concurrency control approaches has been recognized but never thoroughly investigated.

Not only does this group approach make it possible to combine techniques, it also allows parallelism among various pieces of the concurrency control process. It has the potential to drastically reduce the overhead of concurrency control processing. This is of general interest to general-purpose object-oriented databases and of particular interest to heterogeneous systems, where differently designed systems must be integrated, and to real time systems and main memory databases, where the relative impact of concurrency control overhead can be quite large. The OODM definition provides a framework for future research and the potential for the definition of a universally accepted data model applicable to object-oriented databases.

  1.5  The  Concurrency  Control  Manager  

The goal of this dissertation is to describe and define an effective and flexible mechanism to control concurrency in object-oriented databases. To achieve this objective, the theory has been created, the rationale has been discussed, the architecture has been specified, and the costs involved in a Concurrency Control Manager Module (CCMM) have been analyzed. The models, algorithms, and specifications used to this effect are the result of original research as well as adaptations of state of the art technology. The resulting CCMM is an algorithmic specification of the proposed approach that could be implemented in hardware (the hardware could take the form of a Concurrency Control Board). This dissertation is concerned with the presentation of the underlying technology to make the software CCMM possible.

  1.5.1  Purpose  

  The  purposes  of  the  CCMM  module  are  as  follows:    

  • Reduce the overhead attributable to the concurrency controller.
  • Improve throughput (i.e., number of transactions per unit of time).
  • Provide multiple concurrency control technique capability in parallel.
  • Contribute to the ongoing research in Concurrency Control Management.

  1.5.2  Concepts  and  Means  

  The  concepts  and  means  used  to  specify  the  CCMM  are  as  follows:    

  • Conflict-graph based serializability.
  • Current concurrency control techniques.
  • Model of transactions in OODBs.
  • Multi-layer approach to the treatment of histories.
  • Parallel processing technology.

  1.5.3 Benefits

  Summarizing,  the  potential  benefits  of  the  CCMM  are  as  follows:    

  • Speed: throughput.
  • Flexibility: several concurrency control techniques.
  • Modularity: different environments may use different techniques.
  • Research Tool: for high speed transaction processing.

  1.5.4  Interface  

The CCMM interfaces with transaction managers and data managers as shown in Figure 1-1. Transaction managers send requests to the CCMM, such as BEGIN, END, COMMIT, ABORT, LOCK, and UNLOCK. The CCMM informs the transaction manager about the state of execution of transactions. The CCMM sends requests to the data manager to perform database accesses on its behalf. This document is not concerned with the details of the protocols used to achieve a proper interface among these modules. It is mainly concerned with the internal workings of the CCMM.
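The request flow just described can be pictured with a small Python stub. Only the six request names come from the text. The execution states, the class and method names, the single exclusive lock per object, and the choice to forward a database access when a LOCK is granted are all assumptions made for this sketch, since the actual protocols are outside the scope of this document.

```python
from enum import Enum, auto


class Request(Enum):
    # Request names taken from the text.
    BEGIN = auto()
    END = auto()
    COMMIT = auto()
    ABORT = auto()
    LOCK = auto()
    UNLOCK = auto()


class State(Enum):
    # Hypothetical execution states reported back to the transaction manager.
    ACTIVE = auto()
    BLOCKED = auto()
    ENDED = auto()
    COMMITTED = auto()
    ABORTED = auto()


class RecordingDataManager:
    """Stand-in data manager that only records the accesses it is asked to perform."""

    def __init__(self):
        self.accesses = []

    def access(self, txn, obj):
        self.accesses.append((txn, obj))


class CCMMStub:
    """Accepts requests from a transaction manager and returns the transaction's state."""

    def __init__(self, data_manager):
        self.data_manager = data_manager
        self.state = {}       # txn -> State
        self.lock_owner = {}  # obj -> txn holding the (exclusive) lock

    def _release_all(self, txn):
        for obj in [o for o, t in self.lock_owner.items() if t == txn]:
            del self.lock_owner[obj]

    def handle(self, request, txn, obj=None):
        if request is Request.BEGIN:
            self.state[txn] = State.ACTIVE
        elif request is Request.LOCK:
            owner = self.lock_owner.setdefault(obj, txn)
            if owner == txn:
                self.state[txn] = State.ACTIVE
                # Assumption: a granted lock is followed by the database access.
                self.data_manager.access(txn, obj)
            else:
                self.state[txn] = State.BLOCKED
        elif request is Request.UNLOCK:
            if self.lock_owner.get(obj) == txn:
                del self.lock_owner[obj]
        elif request is Request.END:
            self.state[txn] = State.ENDED
        elif request in (Request.COMMIT, Request.ABORT):
            self._release_all(txn)
            committed = request is Request.COMMIT
            self.state[txn] = State.COMMITTED if committed else State.ABORTED
        return self.state[txn]
```

The value returned by `handle` plays the role of the CCMM informing the transaction manager about the state of execution. A real CCMM would apply its serializability filters at the point where this stub consults the lock table.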

   

  1.6  General  Overview  of  the  CCMM  

In order to provide an initial insight into the approach (to be developed in detail in the chapters to come), the core constituents of the scheme are presented in this section. In keeping with the saying attributed to Confucius, "A picture is worth a thousand words", Figure 1-2 depicts the structure of the CCMM. Figure 1-2 can be interpreted as a more detailed representation of the dotted box in Figure 1-1.