Nb-GCLOCK:
A Non-blocking Buffer Management
based on the Generalized CLOCK
  Makoto YUI (1), Jun MIYAZAKI (2), Shunsuke UEMURA (3)
  and Hayato YAMANA (4)
1. Research fellow, JSPS (Japan Society for the Promotion of Science) /
   Visiting Postdoc at Waseda University, Japan and CWI, Netherlands
2. Nara Institute of Science and Technology
3. Nara Sangyo University
4. Waseda University / National Institute of Informatics
Outline
• Background
• Our approach
  – Non-Blocking Synchronization
  – Nb-GCLOCK
• Experimental Evaluation
• Related Work
• Conclusion


Background – Recent trends in CPU development

The number of CPU cores in a chip is doubling in two-year cycles.

[Figure: timeline starting around 1990 and passing 2000, moving from Single-Core CPU to Multi-Core CPU to Many-Core CPU. Chips shown along the curve: Pentium, Power4, Core2, Nehalem, and, under Many-Core CPU, UltraSparc T2, Azul Vega, and "Larrabee?".]

The many-core era is coming.
- Niagara T2 – 8 cores x 8 SMT = 64 processors
- Azul Vega3 – 54 cores x 16 chips = 864 processors
Background – CPU Scalability of open source DBs

Open source DBs have faced CPU scalability problems.
  Ryan Johnson et al., “Shore-MT: A Scalable Storage Manager for the Multicore Era”,
  In Proc. EDBT, 2009.

[Figure: normalized throughput (y-axis, 0 to 10) versus number of concurrent threads (x-axis: 1, 4, 8, 12, 16, 24, 32) for PostgreSQL, MySQL, and BDB. Microbenchmark on UltraSparc T1 (32 procs).]

Gain after 16 threads is less than 5%.

You might think: what about TPC-C?
CPU scalability of PostgreSQL

TPC-C benchmark result on a high-end Linux machine of Unisys
(Xeon-SMP 32 CPUs, Memory 16GB, EMC RAID10 Storage)
  Doug Tolbert, David Strong, Johney Tsai (Unisys),
  “Scaling PostgreSQL on SMP Architectures”, PGCON 2007.

[Figure: TPS (y-axis) versus number of CPU cores (x-axis) for PostgreSQL versions 8.0, 8.1, and 8.2.]

Gain after 16 CPU cores is less than 5%.

Q. What did the PostgreSQL community do?
   They revised their synchronization mechanisms in the buffer management module.
Synchronization in Buffer Management Module
Several empirical studies have revealed that the largest bottleneck is …
synchronization in buffer management module
  [1] Ryan Johnson, Ippokratis Pandis, Anastassia Ailamaki:
  “Critical Sections: Re-emerging Scalability Concerns for Database Storage Engines”, In Proc. DaMoN, 2008.
  [2] Stavros Harizopoulos, Daniel J. Abadi, Samuel Madden, and Michael Stonebraker:
  OLTP Through the Looking Glass, and What We Found There, In Proc.SIGMOD, 2008.
Synchronization in Buffer Management Module
  Several empirical studies have revealed that the largest bottleneck is …
  synchronization in buffer management module
     [1] Ryan Johnson, Ippokratis Pandis, Anastassia Ailamaki:
     “Critical Sections: Re-emerging Scalability Concerns for Database Storage Engines”, In Proc. DaMoN, 2008.
     [2] Stavros Harizopoulos, Daniel J. Abadi, Samuel Madden, and Michael Stonebraker:
     OLTP Through the Looking Glass, and What We Found There, In Proc.SIGMOD, 2008.


       CPU                       Page requests

reduces disk access
by caching database pages
                                       Buffer
     Memory                           Manager




       HDD                            Database
                                        Files
Synchronization in Buffer Management Module

Several empirical studies have revealed that the largest bottleneck is
synchronization in the buffer management module.
  [1] Ryan Johnson, Ippokratis Pandis, Anastassia Ailamaki:
  “Critical Sections: Re-emerging Scalability Concerns for Database Storage Engines”, In Proc. DaMoN, 2008.
  [2] Stavros Harizopoulos, Daniel J. Abadi, Samuel Madden, and Michael Stonebraker:
  “OLTP Through the Looking Glass, and What We Found There”, In Proc. SIGMOD, 2008.

[Figure, left: the CPU sends page requests to the Buffer Manager, which lives in memory and reduces disk access by caching database pages from the database files on HDD.]

[Figure, right: inside the Buffer Manager, page requests go through (1) looking up the hash table, which ends in hits or misses, and (2) the page replacement algorithm, which sits between the misses and the database files.]
Naive buffer management schemes

[Figure: two buffer managers side by side. In both, page requests look up a hash table, the lookup ends in hits or misses, and the page replacement algorithm (Least Recently Used) sits between the misses and the database files.]

PostgreSQL 8.0
  - A single giant lock guards the hash table lookup. Giant lock sucks!
  - Did not scale at all.

PostgreSQL 8.1
  - Striped the lock into hash buckets.
  - Scales up to 8 processors.

In both versions, the LRU list always needs to be locked when it is accessed.
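The step from a giant lock to a striped lock can be sketched in C as follows. This is only an illustration of the general technique, not PostgreSQL's code: the toy one-entry-per-bucket table and the names `NUM_STRIPES`, `NUM_BUCKETS`, `stripe_for`, `table_put`, and `table_get` are assumptions made for this sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_STRIPES 4   /* hypothetical number of lock stripes */
#define NUM_BUCKETS 64  /* hypothetical hash table size */

/* One mutex per stripe instead of one giant mutex for the whole table. */
static pthread_mutex_t stripe_locks[NUM_STRIPES] = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER,
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER
};

/* Toy table: each bucket remembers one page id and its buffer frame. */
static uint64_t bucket_page[NUM_BUCKETS];
static int bucket_frame[NUM_BUCKETS];   /* -1 means the bucket is empty */

static size_t bucket_for(uint64_t page_id) { return (size_t)(page_id % NUM_BUCKETS); }
static size_t stripe_for(size_t bucket) { return bucket % NUM_STRIPES; }

void table_init(void) {
    for (size_t i = 0; i < NUM_BUCKETS; i++)
        bucket_frame[i] = -1;
}

/* Maps page_id to a frame; only the stripe owning the bucket is locked. */
void table_put(uint64_t page_id, int frame) {
    assert(frame >= 0);
    size_t b = bucket_for(page_id);
    pthread_mutex_t *lock = &stripe_locks[stripe_for(b)];
    pthread_mutex_lock(lock);
    bucket_page[b] = page_id;
    bucket_frame[b] = frame;
    pthread_mutex_unlock(lock);
}

/* Returns the frame caching page_id, or -1 on a miss. */
int table_get(uint64_t page_id) {
    size_t b = bucket_for(page_id);
    pthread_mutex_t *lock = &stripe_locks[stripe_for(b)];
    pthread_mutex_lock(lock);
    int frame = (bucket_frame[b] >= 0 && bucket_page[b] == page_id)
                    ? bucket_frame[b] : -1;
    pthread_mutex_unlock(lock);
    return frame;
}
```

Lookups that fall into different stripes no longer wait for each other, but any shared structure behind the table (such as an LRU list) still needs its own lock.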
Less naive buffer management schemes

[Figure: two buffer managers side by side. In both, page requests look up striped hash buckets, the lookup ends in hits or misses, and the page replacement algorithm sits between the misses and the database files.]

PostgreSQL 8.1
  - Page replacement algorithm: Least Recently Used
  - The LRU list always needs to be locked when it is accessed.
  - Scales up to 8 processors.

PostgreSQL 8.2
  - Page replacement algorithm: CLOCK
  - CLOCK does not require a lock when an entry is touched.
  - Scales up to 16 processors.
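Why CLOCK needs no lock on a hit can be sketched as follows: LRU has to relink a shared list on every access, while CLOCK only sets a per-frame reference bit. This is a simplified illustration with hypothetical names (`NFRAMES`, `clock_touch`, `clock_evict`); it assumes a single sweeping thread and ignores page pinning, so it is not PostgreSQL 8.2's implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NFRAMES 4   /* hypothetical buffer pool size */

static atomic_int ref_bit[NFRAMES];  /* 1 = touched since the last sweep */
static size_t hand;                  /* clock hand; one sweeper assumed */

/* Cache hit: a single atomic store, no lock and no list manipulation. */
void clock_touch(size_t frame) {
    assert(frame < NFRAMES);
    atomic_store(&ref_bit[frame], 1);
}

/* Cache miss: advance the hand until a frame with a cleared bit is found.
 * Frames whose bit was set get a second chance: the bit is cleared and
 * the hand moves on. */
size_t clock_evict(void) {
    for (;;) {
        size_t f = hand;
        hand = (hand + 1) % NFRAMES;
        /* atomic_exchange clears the bit and returns its previous value. */
        if (atomic_exchange(&ref_bit[f], 0) == 0)
            return f;
    }
}
```

The hit path touches only the frame's own bit, so concurrent hits on different frames do not contend on any shared structure.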
Outline
• Background
• Our approach
  – Non-Blocking Synchronization
  – Nb-GCLOCK
• Experimental Evaluation
• Related Work
• Conclusion


Core idea of our approach

[Figure: in both approaches the CPU requests pages from the Buffer Manager in memory, which reads from the database files on HDD.]

Previous approaches
  ○ Reducing disk I/Os
  × Locks are contended
  Even with enough processors, disk bandwidth is not utilized.

Our optimistic approach
  △ # of I/Os slightly increases
  ○ No contention on locks
  Reduced lock granularity to one CPU instruction and removed the bottleneck.
Major Difference to Previous Approaches

Previous approaches
  ○ Reducing disk I/Os
  × Locks are contended
  Their goal is to improve buffer hit-rates for reducing I/Os.
  This has been the unique goal for many decades.
  Is this goal valid for the many-core era? There are also SSDs.

Our optimistic approach
  △ # of I/Os slightly increases
  ○ No contention on locks
  Our goal is to improve throughput by utilizing (many) CPUs.
  Use non-blocking synchronization instead of acquiring locks!
What’s non-blocking and lock-free?
   Formally:
     Stopping one thread will not prevent global progress.
     Individual threads make progress without waiting.
   Less formally:
     No thread 'locks' any resource.
     No 'critical sections', locks, mutexes, spin-locks, etc.
   Lock-free if every successful step makes global progress
   and completes within finite time (ensuring liveness).
   Wait-free if every step makes global progress
   and completes within finite time (ensuring fairness).
Non-blocking synchronization

Synchronization method that does not acquire any lock,
enabling concurrent accesses to shared resources
  Utilize atomic CPU primitives
     CAS (compare-and-swap), cmpxchg on x86
  Utilize memory barriers

  Blocking:
    acquire_lock(lock);
    (*counter)++;
    release_lock(lock);

  Non-blocking:
    int old;
    do {
      old = *counter;
    } while (!CAS(counter, old, old + 1));

  The counter is incremented only if its value was still equal to old;
  otherwise the CAS fails and the loop retries.
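The non-blocking counter can be written against standard C11 atomics roughly as follows. This is a sketch, not the authors' code: `cas_increment` is a hypothetical name, and `atomic_compare_exchange_weak` stands in for the slide's generic `CAS` (on x86 compilers typically lower it to `lock cmpxchg`).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Lock-free increment. Returns the value the counter held before the
 * increment. No thread ever holds a lock, so a stalled thread cannot
 * block the others. */
int cas_increment(atomic_int *counter) {
    assert(counter != NULL);
    int old = atomic_load(counter);
    /* On failure, atomic_compare_exchange_weak writes the current value
     * of *counter into old, so the loop retries with fresh data. */
    while (!atomic_compare_exchange_weak(counter, &old, old + 1)) {
        /* another thread changed the counter first; retry */
    }
    return old;
}
```

A failed CAS means some other thread's CAS succeeded, which is why the loop as a whole is lock-free even though an individual thread may retry.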
Making the buffer manager non-blocking

[Figure: page requests look up the hash buckets, the lookup ends in hits or misses, misses go to the page replacement algorithm (GCLOCK), and pages are read from the database files with "lock; lseek; read; unlock".]

1. Utilized existing lock-free hash table
2. Removing locks on cache misses (in fig. 6)
Making the buffer manager non-blocking

                                            3. Need to keep consistency
      Page requests
                                            between lookup hash table and GCLOCK
                                            (in the right half of fig. 3)

     Hash     Hash     Hash     Hash
    bucket   bucket   bucket   bucket             Reference in buffer lookup table
  misses                       hits              still has a different page identifier
                                                 immediately after changing the
  Page replacement algorithm                     page allocation of a buffer frame
            (GCLOCK)


                lock; lseek; read; unlock

                                        4. Avoided locks on I/Os
        Database
          Files                         by utilizing pread, CAS, and memory barriers
                                        (in fig. 5)
                                                                                         56
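Point 4 can be illustrated with a minimal sketch, assuming a POSIX system where Python's `os.pread` is available. `PAGE_SIZE` and both function names are hypothetical and not taken from the slides; the CAS and memory-barrier parts are not shown here.

```python
import os
import threading

PAGE_SIZE = 4096  # assumed page size, for illustration only

_seek_lock = threading.Lock()


def read_page_locked(fd: int, page_no: int) -> bytes:
    """Blocking variant (lock; lseek; read; unlock).

    lseek and read both use the file offset shared by all threads,
    so the pair has to be protected by a lock.
    """
    with _seek_lock:
        os.lseek(fd, page_no * PAGE_SIZE, os.SEEK_SET)
        return os.read(fd, PAGE_SIZE)


def read_page_pread(fd: int, page_no: int) -> bytes:
    """Lock-free variant.

    pread takes the offset as an argument and does not move the shared
    file offset, so concurrent readers need no lock around the I/O.
    """
    return os.pread(fd, PAGE_SIZE, page_no * PAGE_SIZE)
```

Both functions return the same bytes for the same page; only the second can be called from many threads without serializing them on a lock.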
State Machine-based Reasoning for selecting replacement victim




          Construct the algorithm from many 'steps'
          ─ build a State Machine for ensuring
           global progress




                                                                 57
State Machine-based Reasoning for selecting replacement victim

  [Figure: state machine for selecting a replacement victim.
   "E:" marks the entry action of a state.]

  States (entry action)
    Select a frame                   start finding a replacement victim
    Check whether Evicted
    Check whether Pinned
    Try to decrement the refcount    E: decrement the refcount
                                     (decrement weight count of a buffer page)
    Try to evict                     E: evict
    Fix in pool                      E: CAS value
    continue                         E: try next entry
                                     (advance CLOCK hand, check the next candidate)
    success                          E: move the clock hand
                                     (return a replacement victim)

  Transition labels
    Select a frame:                  null / !null
    Check whether Evicted:           evicted / !evicted
    Check whether Pinned:            pinned / !pinned
    Try to decrement the refcount:   --refcount>0 / --refcount<=0
    Try to evict:                    evicted / !evicted
    Fix in pool:                     swapped / !swapped

  Thread A and Thread B run the state machine concurrently
    At "Try to evict": Oops! Candidate is intercepted.
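The victim-selection state machine can be read as a retry loop. The following is a minimal sketch in Python, not the authors' implementation. Assumptions: the class and field names (`Atomic`, `Frame`, `select_victim`, `pinned`, `max_steps`) are hypothetical, a lock emulates the CAS instruction because Python has no CAS primitive, and the "!evicted" edge out of "Try to evict" and the "!swapped" edge out of "Fix in pool" are assumed to lead to "continue".

```python
import threading


class Atomic:
    """One shared value with compare-and-set (CAS emulated by a lock)."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def get(self):
        with self._lock:
            return self._value

    def compare_and_set(self, expected, new):
        with self._lock:
            if self._value is expected or self._value == expected:
                self._value = new
                return True
            return False

    def add_and_get(self, delta):
        with self._lock:
            self._value += delta
            return self._value


class Frame:
    """Buffer frame; field names are illustrative."""

    def __init__(self, page_id, refcount=1, pinned=0):
        self.page_id = page_id
        self.refcount = Atomic(refcount)  # GCLOCK weight count
        self.pinned = Atomic(pinned)      # number of current users
        self.evicted = Atomic(False)

    def try_evict(self):
        # Fails when another thread evicted the frame first
        # (the "candidate is intercepted" case).
        return self.evicted.compare_and_set(False, True)


def select_victim(pool, hand, new_frame, max_steps=10_000):
    """Install new_frame in place of a victim and return its slot index.

    Returns None if no victim is found within max_steps (a guard added
    for this sketch so that a fully pinned pool cannot loop forever).
    """
    for _ in range(max_steps):
        idx = hand.get() % len(pool)
        slot = pool[idx]
        frame = slot.get()                                # Select a frame
        if frame is not None:
            claimed = frame.evicted.get()                 # Check whether Evicted
            if not claimed and frame.pinned.get() == 0:   # Check whether Pinned
                if frame.refcount.add_and_get(-1) <= 0:   # Try to decrement the refcount
                    claimed = frame.try_evict()           # Try to evict
            # Fix in pool: only one thread can swap the slot content.
            if claimed and slot.compare_and_set(frame, new_frame):
                hand.add_and_get(1)                       # success: move the clock hand
                return idx
        hand.add_and_get(1)                               # continue: try next entry
    return None
```

No step waits for another thread: a failed CAS only sends the caller on to the next candidate, which is the property the state machine is meant to make easy to check.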
Experimental settings

  Workload
    Zipf 80/20 distribution       (a famous power law)
      containing 20% of sequential scans
    Dataset size is 32GB in total
  Machine used: UltraSPARC T2 (64 processors)

  We also performed evaluations on various x86 settings in the paper.
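An 80/20 skew of this kind can be produced in several ways, and the slides do not say which generator was used. As a hedged illustration, the sketch below uses a self-similar generator (a swapped-in technique, not necessarily the authors' one) in which the first 20% of the pages receive about 80% of the accesses. The function name, the page count, and the seed are hypothetical, and the 20% of sequential scans is left out.

```python
import math
import random


def self_similar(n_pages: int, h: float, rng: random.Random) -> int:
    """Return a page number in [0, n_pages).

    The first h * n_pages pages receive a (1 - h) share of the accesses,
    so h = 0.2 gives the 80/20 rule.
    """
    exponent = math.log(h) / math.log(1.0 - h)
    return min(n_pages - 1, int(n_pages * rng.random() ** exponent))


rng = random.Random(42)          # fixed seed so the run is repeatable
n_pages = 10_000                 # hypothetical number of pages
accesses = [self_similar(n_pages, 0.2, rng) for _ in range(100_000)]
hot_share = sum(1 for p in accesses if p < n_pages * 0.2) / len(accesses)
print(f"share of accesses hitting the first 20% of pages: {hot_share:.3f}")
```

The printed share should land close to 0.8, because a uniform draw u satisfies u ** exponent < h exactly when u < 1 - h.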
Performance comparison on moderate I/Os (of fig.9)

  [Figure: throughput (normalized by LRU, axis 0.0 to 6.0) of LRU, GCLOCK,
   and Nb-GCLOCK on 8, 16, 32, and 64 processors]

  CPU utilization
    Previous approach: Low, about 20%
    Nb-GCLOCK: High, more than 95%

  A larger difference in CPU time can be expected as the # of CPUs increases
  ➜ We expect more throughput
Maximum throughput to processors
 Scalability to processors when pages are resident in memory,
 intended to show the scalability limit expected of each algorithm

 Throughput by number of processors (cores); plotted on a log scale

                   8 (1)     16 (2)    32 (4)     64 (8)
       2Q          890992    819975    866009     662782
       GCLOCK      1758605   1912000   1931268    1817748
       Nb-GCLOCK   3409819   7331722   14245524   25834449

 Achieved almost linear scalability, at least up to 64 processors!
 This is the first attempt that removed locks in buffer management.

 Interestingly, GCLOCK hits its CPU-scalability limit at around 16
 processors. Caching solutions using GCLOCK have a scalability limit there.
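The "almost linear" claim can be checked directly from the table. The short sketch below computes how much each algorithm's throughput grows from 8 to 64 processors, where an ideal speedup would be 8x. The dictionary layout and the function name are illustrative; the numbers are the ones in the table.

```python
# Throughput per algorithm, keyed by number of processors (from the table).
throughput = {
    "2Q":        {8: 890992,  16: 819975,  32: 866009,   64: 662782},
    "GCLOCK":    {8: 1758605, 16: 1912000, 32: 1931268,  64: 1817748},
    "Nb-GCLOCK": {8: 3409819, 16: 7331722, 32: 14245524, 64: 25834449},
}


def speedup(algorithm: str, base: int = 8, target: int = 64) -> float:
    """Throughput at `target` processors divided by throughput at `base`."""
    row = throughput[algorithm]
    return row[target] / row[base]


for name in throughput:
    print(f"{name:10s} 8 -> 64 processors: {speedup(name):.2f}x (ideal 8.00x)")
```

This gives about 7.58x for Nb-GCLOCK, 1.03x for GCLOCK, and 0.74x for 2Q, so only Nb-GCLOCK stays close to the ideal 8x.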
Max throughput (operation/sec) evaluation
 Workload is Zipf 80/20, evaluated on UltraSPARC T2 (64 procs)
 Accesses issued from 64 threads in 60 seconds
    Thus, ideally 64 x 60 = 3,840 seconds of CPU time can be used

   About 10-20% of CPU time is used!

   Most of CPU time is used because our Nb-GCLOCK is non-blocking!

 The difference in CPU utilization would grow as the # of processors
 grows, since more processors cause more contention!
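The CPU-time budget above is simple arithmetic. As a small worked example, the sketch below converts the utilization figures quoted in this deck (10-20% on this slide, more than 95% for Nb-GCLOCK on the "Performance comparison" slide) into CPU-seconds. The variable and function names are illustrative.

```python
THREADS = 64
SECONDS = 60
budget = THREADS * SECONDS  # 3,840 CPU-seconds available in total


def cpu_seconds(percent: int) -> float:
    """CPU-seconds actually used at the given utilization percentage."""
    return budget * percent / 100


for percent in (10, 20, 95):
    print(f"{percent:3d}% utilization = {cpu_seconds(percent):7.1f} of {budget} CPU-seconds")
```

At 10-20% utilization only 384 to 768 of the 3,840 CPU-seconds do useful work, while 95% corresponds to 3,648 CPU-seconds.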
TPC-C evaluation using Apache Derby

[Figure: throughput in tpmC (transactions per minute, axis range 800-1400) versus # of terminals (threads: 8, 16, 32, 64, 128), comparing Derby and Nb-GCLOCK]

              The original scheme of Derby (CLOCK) showed decreased throughput.
              Our scheme, on the other hand, showed better results.

Sang Kyun Cha et al. Cache-Conscious Concurrency Control of Main-Memory Indexes on Shared-Memory Multiprocessor Systems. In Proc. VLDB, 2001.
                                                                                                    84
TPC-C evaluation using Apache Derby
                        Throughput to the buffer management module is reduced by a
                        latch on the root page of the B+-tree
                        ➜ We would require a concurrent B+-tree (see OLFIT)

[Figure: same tpmC plot as on the previous slide (Derby vs. Nb-GCLOCK, 8 to 128 terminals)]

Sang Kyun Cha et al. Cache-Conscious Concurrency Control of Main-Memory Indexes on Shared-Memory Multiprocessor Systems. In Proc. VLDB, 2001.
                                                                                                    85
Outline
• Background
• Our approach
  – Non-Blocking Synchronization
  – Nb-GCLOCK
• Experimental Evaluation
• Related Work
• Conclusion


                                   86
Bp-wrapper
Xiaoning Ding, Song Jiang, and Xiaodong Zhang: Bp-Wrapper: A System Framework Making Any Replacement Algorithms (Almost) Lock Contention Free, Proc. ICDE, 2009.

• Eliminates lock contention on buffer hits by using a batching and prefetching technique
• Postpones the physical work (adjusting the buffer replacement list) and immediately returns the logical operation
  – This is called lazy synchronization in the literature

[Diagram labels: Page requests; Hash buckets; hits; misses; Recording access; Page replacement algorithm (any); Database files]

Pros.
 - Works with any page replacement algorithm
Cons.
 - Does not increase the throughput of CLOCK variants, because CLOCK does not require locks on buffer hits
 - Cache misses involve batching; the longer lock holding time causes more contention

                                                                                               89
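The batching idea on this slide can be illustrated with a minimal sketch. It is not the Bp-Wrapper implementation: the class `BatchedAccessRecorder`, its fields, and the batch size are hypothetical, the replacement policy is a plain LRU list chosen for brevity, and the prefetching part of the technique is omitted. Hits are appended to a per-thread queue, and the shared list is adjusted under the lock only once per batch.

```python
import threading


class BatchedAccessRecorder:
    """Hypothetical sketch: queue buffer hits per thread, apply them in batches."""

    def __init__(self, batch_size=4):
        self.batch_size = batch_size
        self.lock = threading.Lock()     # protects the shared replacement list
        self.replacement_list = []       # stand-in LRU list, most recent at the end
        self.local = threading.local()   # holds each thread's pending accesses
        self.lock_acquisitions = 0       # counted only to show the effect of batching

    def record_hit(self, page_id):
        queue = getattr(self.local, "queue", None)
        if queue is None:
            queue = self.local.queue = []
        queue.append(page_id)            # the logical operation returns immediately
        if len(queue) >= self.batch_size:
            self.flush()

    def flush(self):
        queue = getattr(self.local, "queue", [])
        if not queue:
            return
        with self.lock:                  # one lock acquisition per batch
            self.lock_acquisitions += 1
            for page_id in queue:        # the postponed physical work
                if page_id in self.replacement_list:
                    self.replacement_list.remove(page_id)
                self.replacement_list.append(page_id)
        queue.clear()


recorder = BatchedAccessRecorder(batch_size=4)
for page in [1, 2, 3, 1, 2, 4, 5, 1]:
    recorder.record_hit(page)
print(recorder.lock_acquisitions, recorder.replacement_list)
```

Eight hits cost two lock acquisitions here instead of eight. The second "Cons" bullet concerns the same mechanism: the lock is held for the whole batch, so each critical section is longer.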
Conclusions

• Proposed a lock-free variant of the GCLOCK page replacement algorithm, named Nb-GCLOCK
  – Almost linear scalability up to 64 processors, while existing locking-based schemes do not scale beyond 16 processors
  – The first attempt to introduce non-blocking synchronization to database buffer management
  – Optimistic I/Os using pread, CAS, and memory barriers

• Linearizability and lock-freedom are proven in the paper
  – Lock-freedom guarantees a certain throughput: any active thread taking a bounded number of steps ensures global progress

• This work is also useful for any caching solution that requires high throughput (e.g., C10K accesses)
                                                                                 93
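To make the CAS-based style concrete, here is a minimal sketch of a GCLOCK-like reference counter updated with compare-and-set retry loops. It is an illustration under stated assumptions, not the Nb-GCLOCK algorithm from the paper: `AtomicInt`, `on_hit`, and `find_victim` are hypothetical names, pinning and claiming of the victim frame are omitted, and because Python has no user-level CAS, `AtomicInt` emulates one with a small internal lock.

```python
import threading


class AtomicInt:
    """Stand-in for a CAS-capable machine word (hypothetical helper).

    A small internal lock emulates the atomicity that the hardware
    compare-and-swap instruction would provide.
    """

    def __init__(self, value=0):
        self._value = value
        self._guard = threading.Lock()

    def get(self):
        return self._value

    def compare_and_set(self, expected, new):
        with self._guard:
            if self._value == expected:
                self._value = new
                return True
            return False


def on_hit(counter, cap):
    """Buffer hit: raise the reference count up to cap, retrying if the CAS loses a race."""
    while True:
        current = counter.get()
        if current >= cap:
            return current
        if counter.compare_and_set(current, current + 1):
            return current + 1


def find_victim(counters, hand):
    """Sweep from hand, decrementing counts, until a frame with count 0 is found.

    Returns (victim_frame, next_hand).
    """
    n = len(counters)
    frame = hand % n
    while True:
        current = counters[frame].get()
        if current == 0:
            return frame, (frame + 1) % n
        if counters[frame].compare_and_set(current, current - 1):
            frame = (frame + 1) % n  # decrement succeeded, move the hand on
        # otherwise another thread changed the count: re-read the same frame


counters = [AtomicInt(2), AtomicInt(0), AtomicInt(1)]
victim, hand = find_victim(counters, 0)
print(victim, hand)
```

No thread holds a lock across the read-modify-write in either loop. A failed CAS means some other thread's update succeeded, which is the kind of system-wide progress that lock-freedom refers to.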
Thank you for your attention!


                                94

Manual 508 Accessibility Compliance AuditSkynet Technologies
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 

Último (20)

So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance Audit
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 

ICDE2010 Nb-GCLOCK

  • 1. Nb-GCLOCK: A Non-blocking Buffer Management based on the Generalized CLOCK
    Makoto YUI (1), Jun MIYAZAKI (2), Shunsuke UEMURA (3) and Hayato YAMANA (4)
    – (1) Research fellow, JSPS (Japan Society for the Promotion of Science) / Visiting Postdoc at Waseda University, Japan and CWI, Netherlands
    – (2) Nara Institute of Science and Technology
    – (3) Nara Sangyo University
    – (4) Waseda University / National Institute of Informatics
  • 2. Outline
    – Background
    – Our approach: Non-Blocking Synchronization, Nb-GCLOCK
    – Experimental Evaluation
    – Related Work
    – Conclusion
  • 4. Background – Recent trends in CPU development
    – The number of CPU cores in a chip is doubling in two-year cycles.
    – [Figure: timeline from 1990 past 2000, moving from single-core through multi-core to many-core CPUs; chips marked: Pentium, Power4, Core2, Nehalem, UltraSparc T2, Azul Vega, Larrabee?]
    – Many-core era is coming.
    – Niagara T2: 8 cores x 8 SMT = 64 processors
    – Azul Vega3: 54 cores x 16 chips = 864 processors
  • 9. Background – CPU Scalability of open source DBs
    – Open source DBs have faced CPU scalability problems.
    – Source: Ryan Johnson et al., “Shore-MT: A Scalable Storage Manager for the Multicore Era”, In Proc. EDBT, 2009.
    – [Figure: normalized throughput (0 to 10) versus concurrent threads (1, 4, 8, 12, 16, 24, 32) for PostgreSQL, MySQL, and BDB; microbenchmark on UltraSparc T1 (32 procs)]
    – Gain after 16 threads is less than 5%.
    – You might think… What about TPC-C?
  • 14. CPU scalability of PostgreSQL
    – TPC-C benchmark result on a high-end Linux machine of Unisys (Xeon-SMP 32 CPUs, Memory 16GB, EMC RAID10 Storage).
    – Source: Doug Tolbert, David Strong, Johney Tsai (Unisys), “Scaling PostgreSQL on SMP Architectures”, PGCON 2007.
    – [Figure: TPS versus CPU cores for PostgreSQL versions 8.0, 8.1, and 8.2]
    – Gain after 16 CPU cores is less than 5%.
    – Q. What did the PostgreSQL community do? A. They revised the synchronization mechanisms in the buffer management module.
  • 19. Synchronization in Buffer Management Module
    – Several empirical studies [1][2] have revealed that the largest bottleneck is synchronization in the buffer management module.
    – The buffer manager reduces disk access by caching database pages.
    – [Figure: the CPU sends page requests to the buffer manager (memory), which sits in front of the database files (HDD). Inside the buffer manager: (1) looking up the hash table, with hits and misses, and (2) the page replacement algorithm.]
    – [1] Ryan Johnson, Ippokratis Pandis, Anastassia Ailamaki: “Critical Sections: Re-emerging Scalability Concerns for Database Storage Engines”, In Proc. DaMoN, 2008.
    – [2] Stavros Harizopoulos, Daniel J. Abadi, Samuel Madden, and Michael Stonebraker: “OLTP Through the Looking Glass, and What We Found There”, In Proc. SIGMOD, 2008.
  • 24. Naive buffer management schemes
    – Both schemes: page requests go to a hash table lookup; hits and misses lead to the page replacement algorithm (Least Recently Used) in front of the database files.
    – PostgreSQL 8.0: one lock over the whole lookup ("Giant lock sucks!"). Did not scale at all.
    – PostgreSQL 8.1: striped the lock across the hash buckets. Scales up to 8 processors.
    – The LRU list always needs to be locked when it is accessed.
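The difference between a giant lock and a striped lock can be sketched as follows. This is a minimal illustration using POSIX mutexes, not PostgreSQL's code; `NUM_STRIPES`, `stripe_of`, `bucket_lock`, and `bucket_unlock` are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_STRIPES 16  /* assumed stripe count, not taken from the slides */

/* One mutex per stripe instead of one mutex for the whole hash table. */
static pthread_mutex_t stripe_locks[NUM_STRIPES];

/* Map a page id to the stripe that guards its hash bucket. */
static unsigned stripe_of(uint64_t page_id) {
    return (unsigned)(page_id % NUM_STRIPES);
}

void stripes_init(void) {
    for (int i = 0; i < NUM_STRIPES; i++) {
        int rc = pthread_mutex_init(&stripe_locks[i], NULL);
        assert(rc == 0);
        (void)rc;
    }
}

/* Lock only the stripe for this page; lookups that hash to other
 * stripes are not blocked by this thread. */
void bucket_lock(uint64_t page_id) {
    pthread_mutex_lock(&stripe_locks[stripe_of(page_id)]);
}

void bucket_unlock(uint64_t page_id) {
    pthread_mutex_unlock(&stripe_locks[stripe_of(page_id)]);
}
```

Striping only helps the lookup step. A shared LRU list behind the hash table is still a single structure, which is why it remains a point of contention in this scheme.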
  • 26. Less naive buffer management schemes
    – Both schemes: page requests go to striped hash buckets; hits and misses lead to the page replacement algorithm in front of the database files.
    – PostgreSQL 8.1: page replacement by Least Recently Used, which always needs to be locked when it is accessed. Scales up to 8 processors.
    – PostgreSQL 8.2: page replacement by CLOCK, which does not require a lock when an entry is touched. Scales up to 16 processors.
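Why CLOCK needs no lock on a hit can be sketched with C11 atomics. This is an illustrative sketch and not PostgreSQL's implementation; `clock_frame`, `clock_touch`, and `clock_test_and_clear` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef struct {
    atomic_int referenced;  /* CLOCK reference bit: 1 = recently used */
} clock_frame;

/* Buffer hit: record the access with a single atomic store.
 * Nothing is moved in a shared list, so no lock is taken. */
void clock_touch(clock_frame *f) {
    assert(f != NULL);
    atomic_store(&f->referenced, 1);
}

/* Sweep step: clear the bit and report whether it had been set.
 * A frame whose bit was already 0 is a candidate for replacement. */
int clock_test_and_clear(clock_frame *f) {
    assert(f != NULL);
    return atomic_exchange(&f->referenced, 0);
}
```

With LRU, a hit has to unlink the entry and relink it at the head of a shared list, which is a multi-word update and therefore needs mutual exclusion. With CLOCK, a hit is one word-sized write.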
  • 34. Core idea of our approach
    – Previous approaches: ○ reducing disk I/Os; × locks are contended.
    – Our optimistic approach: △ # of I/Os slightly increases; ○ no contention on locks.
    – [Figure: for both approaches, page requests go from the CPU to the buffer manager (memory) and then to the database files (HDD).]
    – Enough processors are available, but disk bandwidth is not utilized.
    – Reduce lock granularity to one CPU instruction and remove the bottleneck.
  • 39. Major Difference to Previous Approaches
    – Previous approaches: ○ reducing disk I/Os; × locks are contended.
    – Our optimistic approach: △ # of I/Os slightly increases; ○ no contention on locks.
    – Their goal is to improve buffer hit-rates for reducing I/Os. This has been the unique goal for many decades. Is this goal valid for the many-core era? There are also SSDs.
    – Our goal is to improve throughput by utilizing (many) CPUs. Use non-blocking synchronization instead of acquiring locks!
  • 45. What’s non-blocking and lock-free?
    – Formally: stopping one thread will not prevent global progress. Individual threads make progress without waiting.
    – Less formally: no thread 'locks' any resource. No 'critical sections', locks, mutexes, spin-locks, etc.
    – Lock-free if every successful step makes Global Progress and completes within finite time (ensuring liveness).
    – Wait-free if every step makes Global Progress and completes within finite time (ensuring fairness).
  • 49. Non-blocking synchronization
    – Synchronization method that does not acquire any lock, enabling concurrent accesses to shared resources.
    – Utilize atomic CPU primitives: CAS (compare-and-swap), cmpxchg on X86.
    – Utilize memory barriers.
    – Blocking:
        acquire_lock(lock);
        counter++;
        release_lock(lock);
    – Non-blocking:
        int old;
        do {
            old = *counter;
        } while (!CAS(counter, old, old+1));
    – In the non-blocking version, counter is incremented only if its value is still equal to old; otherwise the loop retries.
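The non-blocking increment above can be written as compilable C. This sketch assumes C11 `<stdatomic.h>` in place of the slide's pseudocode `CAS()`; `cas_increment` is a name chosen for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Lock-free increment: retry until no other thread has changed the
 * counter between our read and our compare-and-swap.
 * Returns the value this call installed. */
int cas_increment(atomic_int *counter) {
    assert(counter != NULL);
    int old = atomic_load(counter);
    /* On failure, atomic_compare_exchange_weak writes the current
     * value of *counter into old, so the loop body can stay empty. */
    while (!atomic_compare_exchange_weak(counter, &old, old + 1)) {
    }
    return old + 1;
}
```

A thread that is suspended in the middle of `cas_increment` holds nothing that other threads need, so the others keep making progress. A failed CAS only means that some other thread's increment succeeded.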
  • 52. Making the buffer manager non-blocking
    – [Figure: page requests go to the hash buckets; hits and misses lead to the page replacement algorithm (GCLOCK); reads from the database files are done as "lock; lseek; read; unlock".]
    – 1. Utilized existing lock-free hash table.
    – 2. Removing locks on cache misses (in fig. 6).
  • 56. Making the buffer manager non-blocking
    – [Figure: page requests go to the hash buckets; hits and misses lead to the page replacement algorithm (GCLOCK); reads from the database files are done as "lock; lseek; read; unlock".]
    – 3. Need to keep consistency between the lookup hash table and GCLOCK (in the right half of fig. 3). The reference in the buffer lookup table still has a different page identifier immediately after changing the page allocation of a buffer frame.
    – 4. Avoided locks on I/Os by utilizing pread, CAS, and memory barriers (in fig. 5).
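The "lock; lseek; read; unlock" sequence needs a lock because `lseek` and `read` share one file offset per descriptor. POSIX `pread` takes the offset as an argument and leaves the shared offset alone, which is what makes item 4 possible. The sketch below is an illustration under assumptions: `DB_PAGE_SIZE`, `open_page_file`, `read_page`, and `write_page` are hypothetical names, and the 8 KB page size is not taken from the slides.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define DB_PAGE_SIZE 8192  /* assumed page size for the example */

/* Open (or create) the file that stores the pages. */
int open_page_file(const char *path) {
    return open(path, O_RDWR | O_CREAT, 0600);
}

/* Read one page at an explicit offset. pread() does not move the
 * descriptor's file offset, so concurrent readers need no lock. */
ssize_t read_page(int fd, long page_no, void *buf) {
    assert(page_no >= 0);
    return pread(fd, buf, DB_PAGE_SIZE, (off_t)page_no * DB_PAGE_SIZE);
}

/* Positional write, the counterpart of read_page. */
ssize_t write_page(int fd, long page_no, const void *buf) {
    assert(page_no >= 0);
    return pwrite(fd, buf, DB_PAGE_SIZE, (off_t)page_no * DB_PAGE_SIZE);
}
```

Both calls may transfer fewer bytes than requested, so production code would loop or treat a short transfer as an error. That handling is omitted here.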
  • 57. State Machine-based Reasoning for selecting replacement victim
    – Construct the algorithm from many 'steps': build a state machine for ensuring global progress.
  • 66. State Machine-based Reasoning for selecting replacement victim
    – [Figure: state machine for victim selection; "E:" marks an entry action.]
    – States shown: Select a frame; Check whether Pinned; Try to decrement the refcount; Try to evict; Check whether Evicted; Fix in pool; continue.
    – Entry actions shown: move the clock hand; decrement the refcount; evict; CAS value; try next entry.
    – Transition labels shown: null / !null; pinned / !pinned; --refcount>0 / --refcount<=0; evicted / !evicted; swapped / !swapped; success.
    – Annotations: Start finding a replacement victim; Decrement weight count of a buffer page; Advance CLOCK hand (check the next candidate); Return a replacement victim.
    – Two threads (Thread A and Thread B) are shown in the machine at the same time. One is annotated "Oops! Candidate is intercepted."
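The steps of the state machine can be approximated in code. The sketch below is a simplified illustration under stated assumptions, not the authors' algorithm: the `frame` layout, `NFRAMES`, and `select_victim` are hypothetical, and the pinning side (which would also have to check the `evicted` flag) is omitted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NFRAMES 4            /* assumed pool size for the sketch */

typedef struct {
    atomic_int pin_count;    /* > 0 while some thread is using the page */
    atomic_int weight;       /* GCLOCK reference count */
    atomic_int evicted;      /* 0 = holds a page, 1 = claimed by an evictor */
} frame;

static frame pool[NFRAMES];
static atomic_uint clock_hand;

/* Returns the index of a frame this thread has claimed for reuse,
 * or -1 after max_steps steps without a successful claim. */
int select_victim(int max_steps) {
    assert(max_steps > 0);
    for (int step = 0; step < max_steps; step++) {
        /* Select a frame: advance the shared clock hand atomically. */
        unsigned i = atomic_fetch_add(&clock_hand, 1) % NFRAMES;
        frame *f = &pool[i];

        /* Check whether pinned: skip frames that are in use. */
        if (atomic_load(&f->pin_count) > 0)
            continue;

        /* Try to decrement the weight, never going below zero.
         * After the loop, w is the weight seen before our decrement. */
        int w = atomic_load(&f->weight);
        while (w > 0 && !atomic_compare_exchange_weak(&f->weight, &w, w - 1)) {
        }
        if (w - 1 > 0)
            continue;        /* still referenced: try the next entry */

        /* Try to evict: only one thread can flip evicted from 0 to 1. */
        int expected = 0;
        if (atomic_compare_exchange_strong(&f->evicted, &expected, 1))
            return (int)i;
        /* Another thread intercepted this candidate: try the next entry. */
    }
    return -1;
}
```

Every step either claims a frame or moves the hand forward, and no thread ever waits for another. When two threads race for the same frame, the CAS lets exactly one of them win and the loser simply continues with the next candidate.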
  • 68. Outline • Background • Our approach – Non-Blocking Synchronization – Nb-GCLOCK • Experimental Evaluation • Related Work • Conclusion 68
  • 69-70. Experimental settings. Workload: Zipf 80/20 distribution (a well-known power law) containing 20% sequential scans; the dataset is 32 GB in total. Machine used: UltraSPARC T2 with 64 processors. The paper also reports evaluations on various x86 configurations.
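A workload of this shape could be generated along the following lines. This is a sketch under stated assumptions, not the generator used in the experiments. The slide does not say how the 80/20 skew or the scan mix was produced, so the self-similar formula (the first 20% of pages receive about 80% of the skewed accesses), the per-access interpretation of "20% sequential scans", and the names `skewed_page` and `workload` are all assumptions.

```python
import math
import random


def skewed_page(rng, n_pages, h=0.2):
    """Pick a page so that the first h*n_pages pages get about (1-h) of the picks."""
    exponent = math.log(h) / math.log(1.0 - h)
    # min() guards against floating-point rounding at the upper edge.
    return min(n_pages - 1, int(n_pages * rng.random() ** exponent))


def workload(rng, n_pages, n_accesses, scan_ratio=0.2):
    """Yield page ids: scan_ratio of accesses advance a sequential cursor,
    the rest follow the 80/20 skew (assumed interpretation of the slide)."""
    cursor = 0
    for _ in range(n_accesses):
        if rng.random() < scan_ratio:
            yield cursor
            cursor = (cursor + 1) % n_pages
        else:
            yield skewed_page(rng, n_pages)


if __name__ == "__main__":
    rng = random.Random(1)
    trace = list(workload(rng, n_pages=1000, n_accesses=10))
    print(trace)
```

Mixing scans into a skewed stream matters for buffer replacement because scan pages are touched once and can push hot pages out of the pool.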
  • 71-73. Performance comparison on moderate I/Os (Fig. 9 in the paper). [Chart: throughput normalized by LRU (y-axis, 0.0 to 6.0) versus number of processors (8, 16, 32, 64) for LRU, GCLOCK, and Nb-GCLOCK.] CPU utilization: the previous approaches are low, about 20%; Nb-GCLOCK is high, more than 95%. A larger difference in CPU time can be expected as the number of CPUs increases, so we expect even more throughput.
  • 74-77. Maximum throughput to processors. Scalability to processors when pages are resident in memory, intended to show the scalability limit of each algorithm. [Chart: throughput (log scale) versus number of processors.]
      Processors (cores)   8 (1)      16 (2)     32 (4)      64 (8)
      2Q                   890992     819975     866009      662782
      GCLOCK               1758605    1912000    1931268     1817748
      Nb-GCLOCK            3409819    7331722    14245524    25834449
    Nb-GCLOCK achieved almost linear scalability, at least up to 64 processors. This is the first attempt to remove locks from buffer management. Interestingly, GCLOCK hits a CPU-scalability limit at around 16 processors, so caching solutions that use GCLOCK share that limit.
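The "almost linear" claim can be checked directly against the table. The short script below uses only the figures from the slide; the helper names `speedup` and `efficiency` are introduced here for illustration. Going from 8 to 64 processors is an 8x increase, and Nb-GCLOCK's throughput grows by roughly 7.6x over that range, while GCLOCK stays nearly flat and 2Q declines.

```python
# Throughput figures copied from the slide's table.
processors = [8, 16, 32, 64]
throughput = {
    "2Q":        [890992, 819975, 866009, 662782],
    "GCLOCK":    [1758605, 1912000, 1931268, 1817748],
    "Nb-GCLOCK": [3409819, 7331722, 14245524, 25834449],
}


def speedup(name):
    """Throughput at 64 processors relative to throughput at 8."""
    series = throughput[name]
    return series[-1] / series[0]


def efficiency(name):
    """Speedup divided by the ideal speedup (64 / 8 = 8)."""
    return speedup(name) / (processors[-1] / processors[0])


if __name__ == "__main__":
    for name in throughput:
        print(f"{name:10s} speedup x{speedup(name):.2f}, "
              f"efficiency {efficiency(name):.0%}")
```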
  • 78-82. Max throughput (operations/sec) evaluation. The workload is Zipf 80/20, evaluated on an UltraSPARC T2 (64 processors). Accesses are issued from 64 threads for 60 seconds, so ideally 64 x 60 = 3,840 seconds of CPU time can be used. Chart callouts: "About 10-20% of CPU time is used!" and "Most of the CPU time is used because our Nb-GCLOCK is non-blocking!" The gap in CPU utilization would widen as the number of processors grows, because more processors cause more contention.
  • 83-85. TPC-C evaluation using Apache Derby. [Chart: transactions per minute (tpmC, y-axis, 800 to 1400) versus number of terminals (threads: 8, 16, 32, 64, 128) for Derby and Nb-GCLOCK.] The original scheme of Derby (CLOCK) decreased throughput, whereas our scheme showed better results. The throughput gain in the buffer management module is limited by a latch on the root page of the B+-tree, so a concurrent B+-tree would be required (see OLFIT: Sang Kyun Cha et al. Cache-Conscious Concurrency Control of Main-Memory Indexes on Shared-Memory Multiprocessor Systems. In Proc. VLDB, 2001).
  • 87-89. Related work: Bp-wrapper. Xiaoning Ding, Song Jiang, and Xiaodong Zhang: Bp-Wrapper: A System Framework Making Any Replacement Algorithms (Almost) Lock Contention Free, Proc. ICDE, 2009. Bp-wrapper eliminates lock contention on buffer hits by using a batching and prefetching technique. It postpones the physical work (adjusting the buffer replacement list) and immediately returns the logical operation, which is called lazy synchronization in the literature. [Diagram: page requests go to hash buckets; hits and misses pass through access recording to the page replacement algorithm (any), which sits above the database files.] Pros: works with any page replacement algorithm. Cons: does not increase the throughput of CLOCK variants, because CLOCK does not require locks on buffer hits; cache misses involve batching, and the longer lock holding time causes more contention.
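The batching idea described on this slide can be illustrated with a small sketch: buffer hits are recorded in a per-thread queue, and the shared replacement list is adjusted under the lock only once per batch. This is a simplified illustration, not Bp-wrapper's actual code. The class `BatchedLRU`, its `batch_size` parameter, the choice of LRU as the underlying policy, and the `lock_acquisitions` counter are assumptions, and the prefetching part of the technique is omitted.

```python
import threading
from collections import OrderedDict


class BatchedLRU:
    """LRU list whose hit bookkeeping is deferred and applied in batches."""

    def __init__(self, batch_size=4):
        self.batch_size = batch_size
        self.lru = OrderedDict()        # shared replacement list, oldest first
        self.lock = threading.Lock()    # protects self.lru
        self.local = threading.local()  # holds each thread's private access queue
        self.lock_acquisitions = 0      # instrumentation for the example only

    def _queue(self):
        if not hasattr(self.local, "queue"):
            self.local.queue = []
        return self.local.queue

    def record_hit(self, page_id):
        """Logical operation: returns immediately, no lock on the common path."""
        queue = self._queue()
        queue.append(page_id)
        if len(queue) >= self.batch_size:
            self.flush()

    def flush(self):
        """Physical work: replay the queued hits under a single lock hold."""
        queue = self._queue()
        with self.lock:
            self.lock_acquisitions += 1
            for page_id in queue:
                if page_id in self.lru:
                    self.lru.move_to_end(page_id)
        queue.clear()
```

With `batch_size=3`, three hits cost one lock acquisition instead of three. The trade-off the slide points out is visible here as well: each lock hold now replays a whole batch, so the lock is held longer.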
  • 90-93. Conclusions. Proposed a lock-free variant of the GCLOCK page replacement algorithm, named Nb-GCLOCK. It shows almost linear scalability up to 64 processors, while existing lock-based schemes do not scale beyond 16 processors. It is the first attempt to introduce non-blocking synchronization to database buffer management, and it performs optimistic I/Os using pread, CAS, and memory barriers. Linearizability and lock-freedom are proven in the paper. Lock-freedom guarantees a certain throughput: any active thread taking a bounded number of steps ensures global progress. This work is also useful for any caching solution that requires high throughput (e.g., C10K accesses).
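One reason pread suits lock-free I/O is that it reads at an explicit offset and does not move the shared file position, so several threads can read different pages through one file descriptor without a lock around a seek-then-read pair. The sketch below shows only that property. It is not the paper's optimistic I/O protocol, `PAGE_SIZE` and the function names are hypothetical, and `os.pread` is available on Unix-like systems only.

```python
import os
import tempfile
import threading

PAGE_SIZE = 4096  # hypothetical page size for this example


def read_page(fd, page_no):
    """Read one page at an explicit offset; the shared file position is untouched."""
    return os.pread(fd, PAGE_SIZE, page_no * PAGE_SIZE)


def read_all_concurrently(path, n_pages):
    """Read every page from its own thread through a single shared descriptor."""
    results = [None] * n_pages
    fd = os.open(path, os.O_RDONLY)
    try:
        def worker(i):
            results[i] = read_page(fd, i)

        threads = [threading.Thread(target=worker, args=(i,))
                   for i in range(n_pages)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    finally:
        os.close(fd)
    return results


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as f:
        for i in range(4):
            f.write(bytes([i]) * PAGE_SIZE)  # page i is filled with byte value i
        path = f.name
    pages = read_all_concurrently(path, 4)
    os.remove(path)
    print([page[0] for page in pages])
```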
  • 94. Thank you for your attention! 94