Storage infrastructure using
            HBase behind LINE messages	
                     NHN Japan Corp.
                  LINE Server Task Force
                 Shunsuke Nakamura
                      @sunsuk7tp	

13.1.21 Hadoop Conference Japan 2013 Winter
To support LINE’s users, we have built
                  message storage that is

             Large scale (tens of billions of rows/day)
                 Responsive (under 10 ms)
                Highly available (dual clusters)




Outline	
•    About LINE
•    LINE & Storage requirements
•    What we achieved
•    Today’s topics
       –  IDC online migration
       –  NN failover
       –  Stabilizing LINE message cluster
•  Conclusion
LINE
     - A global messenger powered by NHN Japan - 	

                                   Devices
                                            5 different mobile platforms
                                            + Desktop support




New year 2013 in Japan	
Number of requests in an HBase cluster (plotted by 1 min): usual peak hours vs. New Year 2013

x3

あけおめ! (Japanese: "Happy New Year!")
新年好! (Chinese: "Happy New Year!")

3 times traffic explosion
LINE Storage had no problems :)
  
LINE on Hadoop	
            Storages for service, backup and log	

            For HBase, M/R and log archive	

            Bulk migration and ad-hoc analysis	

            For HBase and Sharded-Redis	

            Collecting Apache and Tomcat logs	

            KPI, Log analysis	
LINE service requirements	
LINE is a…
  Messaging Service - Should be fast
  Global Service - Downtime not allowed

But, not a Simple Messaging Service.
  Message synchronization between phones & PCs
       –  Messages should be kept for a while.	


LINE’s storage requirements	
•  No data loss
•  Eventual consistency
•  Low latency
•  HA
•  Flexible schema management
•  Easy scale-out


Our selection is HBase	
•  Low latency for large amounts of data
•  Linearly scalable
•  Relatively lower operating cost
       –  Replication by nature
       –  Automatic failover
•  Data model fits our requirements
       –  Semi-structured
       –  Timestamp

Stored rows per day in a cluster
[Chart: stored rows per day, y-axis in billions/day, axis ticks from 2 to 10]


What we achieved with HBase	
•  No data loss
     –  Persistent
     –  Data replication
            •  Automatic recovery from server failure


•  Reasonable performance for large data sets
     –  Hundreds of billions of rows
     –  Write: ~ 1 ms
     –  Read: 1 ~ 10 ms

Many issues we had	
•    Heterogeneous storages coordination
•    IDC online migration
•    Flush & Compaction Storms by “too many HLogs”
•    Row & Column distribution
•    Secondary Index
•    Region Management
       –    load, size balancing
       –    RS Allocation
       –    META region
       –    M/R
•    Monitoring for diagnostics
•    Traffic burst by decommission
•    NN problems
•    Performance degradation
       –    hotspot problem
       –    timeout burst
       –    GC problem
•    Client bugs
       –    Thread Blocking on server failure (HBASE-6364)




Today’s topics	


                  IDC online migration

                       NN failover

            Stabilizing LINE message cluster


IDC online migration
Why?	


•  Move whole HBase clusters and data

•  For better network infrastructure

•  Without downtime


IDC online migration	
Before migration	



[Diagram: App Server writes to src-HBase; dst-HBase is not yet written to]


IDC online migration	
•  Write to both (client-level replication)



[Diagram: App Server writes to both src-HBase and dst-HBase]
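
Client-level replication can be pictured as a thin wrapper around two cluster clients. The sketch below is only an illustration under assumptions: `FakeCluster` and `DualWriter` are hypothetical names, the clusters are in-memory stand-ins rather than real HBase clients, and queueing failed mirror writes for later repair is an assumed policy that the slides do not describe.

```python
from typing import Dict, List, Tuple


class FakeCluster:
    """In-memory stand-in for one HBase cluster (hypothetical, for illustration)."""

    def __init__(self, name: str, healthy: bool = True) -> None:
        self.name = name
        self.healthy = healthy
        self.rows: Dict[bytes, Tuple[bytes, int]] = {}

    def put(self, row: bytes, value: bytes, ts: int) -> None:
        if not self.healthy:
            raise ConnectionError(f"{self.name} is unreachable")
        self.rows[row] = (value, ts)


class DualWriter:
    """Sends each write to src first, then mirrors it to dst."""

    def __init__(self, src: FakeCluster, dst: FakeCluster) -> None:
        self.src = src
        self.dst = dst
        self.pending_repair: List[bytes] = []

    def put(self, row: bytes, value: bytes, ts: int) -> None:
        # src is still the cluster serving the application, so its errors propagate.
        self.src.put(row, value, ts)
        try:
            # The same row, value and timestamp go to the new cluster.
            self.dst.put(row, value, ts)
        except ConnectionError:
            # Assumption: a failed mirror write is queued for later repair.
            self.pending_repair.append(row)
```

Keeping src authoritative means a problem in the new data center cannot break user-facing writes while the migration is in progress.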


IDC online migration	
•  New data: Incremental replication
•  Old data: Bulk migration
•  Timestamps in dst equal those in src
[Diagram: App Server writes to both src-HBase and dst-HBase]
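
Carrying the source timestamp on every copied cell is what lets new and old data arrive in any order. The following is a simplified model, not HBase itself: it keeps a single version per row and lets the newest timestamp win, and `apply_put` is a hypothetical helper name.

```python
from typing import Dict, Tuple

Table = Dict[bytes, Tuple[int, bytes]]  # rowkey -> (timestamp, value)


def apply_put(table: Table, row: bytes, ts: int, value: bytes) -> None:
    """Apply a write that carries an explicit timestamp; the newest one wins."""
    current = table.get(row)
    if current is None or ts >= current[0]:
        table[row] = (ts, value)


dst: Table = {}
apply_put(dst, b"user1", 200, b"new message")  # incremental replication arrives first
apply_put(dst, b"user1", 100, b"old message")  # bulk migration copies older data later
```

Because the older cell keeps its original timestamp, the late bulk copy does not overwrite the newer value in dst.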


LINE HBase Replicator & BulkMigrator	



                                    Replicator is for incremental replication
                                    BulkMigrator is for bulk migration 	




LINE HBase Replicator	
•  Our own implementation
•  Prefer pull to push
       •  Throughput throttling
       •  Workload isolation of replicator and RS
•  Rowkey conversion and filtering
[Diagram, two variants side by side]
            HBase Replicator:       src-HBase, push, dst-HBase
            LINE HBase Replicator:  src-HBase, pull, dst-HBase
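
Throughput throttling in a pull-based replicator could be done with a token bucket, sketched below. This is an assumed technique, not a description of LINE's implementation; the class name `Throttle` and its parameters are hypothetical, and the clock is injectable so the behaviour can be checked without sleeping.

```python
import time
from typing import Callable


class Throttle:
    """Token bucket that caps how many entries per second may be shipped."""

    def __init__(self, rate_per_sec: float, burst: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = burst
        self.clock = clock
        self.last = clock()

    def try_acquire(self, n: int = 1) -> bool:
        """Return True and consume n tokens if the budget allows it."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above the burst capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False
```

A replicator loop would call `try_acquire(len(batch))` before each batch and wait briefly when it returns False, so replication traffic stays bounded.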
LINE HBase Replicator
 - A simple daemon to replicate local regions -	


                             1.  HLogTracker reads a checkpoint
                                 and selects the next HLog.
                             2.  For each entry in the HLog:
                                     1.        Filter & convert the HLog.Entry
                                     2.        Create Puts and batch them to dst HBase


                             •         Periodic checkpointing
                             •         Generally, entries are replicated
                                       within seconds
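
The steps above can be sketched as a small loop. Everything here is illustrative: logs are modelled as a dict of numbered entry lists, `convert` and `send_batch` are hypothetical callbacks, and the checkpoint is advanced after each fully shipped log, which is a simplification of the periodic checkpointing mentioned on the slide.

```python
from typing import Callable, Dict, List, Optional, Tuple

Entry = Tuple[bytes, bytes, int]  # (rowkey, value, timestamp)


def replicate(logs: Dict[int, List[Entry]],
              checkpoint: int,
              convert: Callable[[Entry], Optional[Entry]],
              send_batch: Callable[[List[Entry]], None],
              batch_size: int = 2) -> int:
    """Ship every log newer than `checkpoint`; return the new checkpoint."""
    for log_id in sorted(k for k in logs if k > checkpoint):
        batch: List[Entry] = []
        for entry in logs[log_id]:
            converted = convert(entry)  # filter & rowkey conversion; None drops it
            if converted is None:
                continue
            batch.append(converted)
            if len(batch) >= batch_size:
                send_batch(batch)
                batch = []
        if batch:
            send_batch(batch)
        checkpoint = log_id  # advance only after the whole log was shipped
    return checkpoint
```

If the daemon restarts, it resumes from the last checkpoint and may resend some entries; since each Put carries the original timestamp, resending the same cell is harmless in this model.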


Bulk migration	
1.  MapReduce between any storages
     –  Map task only
     –  Read source, write destination
     –  Task scheduling problem depends on region allocation

2.  Non MapReduce version (BulkMigrator)
     –  Our own implementation
     –  HBase → HBase
     –  On each RS, scan & batch region by region
     –  Throughput throttling
     –  Slow, but easy to implement and debug
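
A rough sketch of "scan & batch" per region follows. It assumes regions are described by their sorted start keys, with an empty first start key, and that rows come back in key order as a scan would return them. `region_of` and `batches_by_region` are hypothetical helpers, not part of BulkMigrator.

```python
from bisect import bisect_right
from typing import Iterator, List, Optional, Tuple

Row = Tuple[bytes, bytes, int]  # (rowkey, value, timestamp)


def region_of(rowkey: bytes, start_keys: List[bytes]) -> int:
    """Index of the region whose [start, next_start) range holds rowkey."""
    return bisect_right(start_keys, rowkey) - 1


def batches_by_region(rows: List[Row], start_keys: List[bytes],
                      batch_size: int) -> Iterator[Tuple[int, List[Row]]]:
    """Yield (region index, batch) pairs; a batch never spans two regions."""
    current_region: Optional[int] = None
    batch: List[Row] = []
    for row in sorted(rows):  # a scan returns rows in key order
        region = region_of(row[0], start_keys)
        if batch and (region != current_region or len(batch) >= batch_size):
            yield current_region, batch
            batch = []
        current_region = region
        batch.append(row)
    if batch:
        yield current_region, batch
```

Each yielded batch would then be written to the destination with its original timestamps, with throttling applied between batches.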
NN failover
Background	
•  Our HBase has a SPOF: NameNode
•  “Apache Hadoop HA Configuration”
       http://blog.cloudera.com/blog/2009/07/hadoop-ha-configuration/

•  Furthermore, added Pacemaker
       –  Heartbeat can’t detect whether NN is running




Previous: HA-NN
            DRBD + VIP + Pacemaker 	




NameNode failure
              in 2012.10	




HA-NN failover failed 	

•  Not a failure of the NameNode process itself
•  Incorrect leader election under network partitioning
•  Complicated configuration
       –  Easy to get wrong, difficult to control
       –  Pacemaker scripting was not straightforward
       –  VIP is risky to HDFS
•  DRBD split-brain problem
       –  Protocol C
       –  Unable to re-sync while service is online
Now: In-house NN failure handling	

•  Bye-bye old HA-NN
      –  Had to restart whole HBase clusters after NN failover
•  Alternative ideas
      –  Quorum-based leader election (Using ZK)
      –  Using L4 switch
      –  Implement our own AvatarNode
•  Chose a safer solution at the cost of a little downtime



In-house NN failure handling (1)	


            	
            rsync with --link-dest periodically
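
The `--link-dest` option makes rsync hard-link files that are unchanged relative to an earlier copy, so each periodic run produces a full-looking snapshot while storing only the changed files. The sketch below only builds the command line and does not run it. The paths, the snapshot naming scheme and the function name `snapshot_command` are assumptions, and the slides do not say which directories are copied.

```python
from datetime import datetime
from pathlib import PurePosixPath
from typing import List, Optional


def snapshot_command(src: str, backup_root: str, now: datetime,
                     previous: Optional[str]) -> List[str]:
    """Build an rsync call that writes a new timestamped snapshot directory."""
    root = PurePosixPath(backup_root)
    dest = root / now.strftime("%Y%m%d-%H%M%S")
    cmd = ["rsync", "-a"]
    if previous is not None:
        # Unchanged files become hard links into the previous snapshot.
        cmd.append(f"--link-dest={root / previous}")
    # Trailing slashes: copy the contents of src into the snapshot directory.
    cmd += [src.rstrip("/") + "/", str(dest) + "/"]
    return cmd
```

A scheduler such as cron could pass the returned list to `subprocess.run` at a fixed interval, giving it the name of the most recent snapshot as `previous`.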
  




In-house NN failure handling (2)	
            Bomb	




In-house NN failure handling (3)	




Stabilizing LINE message cluster
[Diagram: performance problems and the cases covered]
•  Case 1: “Too many HLogs”
•  Case 2: Hotspot problems
•  Case 3: META region workload isolation
•  Case 4: Region mappings to RS
•  Also shown: H/W Failure Handling, RS GC Storm, Performance


Case1: “Too many HLogs”	
•  Effect
      –  MemStore flush storm
      –  Compaction storm
•  Cause
      –  Regions grow at different rates
      –  Heterogeneous tables in an RS
•  Solution
      –  Region balancing
      –  External flush scheduler
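
An HLog file cannot be discarded while any region still has unflushed edits in it, so a few slow-growing regions can pin many old logs until the RegionServer forces flushes. One possible policy for an external flush scheduler is sketched below. It is an assumption, not the scheduler described in the talk, and the numeric log ids, the region names and the function `regions_to_flush` are hypothetical.

```python
from typing import Dict, List


def regions_to_flush(oldest_unflushed_log: Dict[str, int],
                     newest_log: int, max_logs: int) -> List[str]:
    """Regions whose unflushed edits pin logs older than the newest `max_logs` logs."""
    cutoff = newest_log - max_logs + 1  # logs with id < cutoff should become removable
    pinned = [region for region, log_id in oldest_unflushed_log.items()
              if log_id < cutoff]
    # Flush the regions holding the oldest edits first.
    return sorted(pinned, key=lambda region: oldest_unflushed_log[region])
```

Running such a selection on a schedule, ideally off-peak, would flush a few regions at a time instead of letting the log limit trigger many forced flushes at once.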

Case1: Number of HLogs
[Chart: number of HLogs from peak to off-peak hours. Labels: no flush; periodic flush (better case); forced flush leading to a flush storm (worse case)]


Case2: Hotspot problems	
•  Effect
      –  Excessive GC
      –  RS performance degradation (High CPU usage)
•  Cause
       –  Get/Scan:
            •  Row or column, updated too frequently
            •  Row which has too many columns (+ tombstones)
•  Solution
      –  Schema and row/column distribution are important
      –  Hotspot region isolation
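
One common way to spread a row with too many columns is to split it into several bucketed rows, choosing the bucket from a hash of the column qualifier. The sketch below shows that general technique; it is not taken from LINE's schema, and `bucketed_rowkey`, the `#` separator and the bucket count are assumptions.

```python
import hashlib


def bucketed_rowkey(base_key: bytes, qualifier: bytes, buckets: int) -> bytes:
    """Derive a row key that places each column in one of `buckets` rows."""
    digest = hashlib.md5(qualifier).digest()
    bucket = int.from_bytes(digest[:4], "big") % buckets
    # Fixed-width suffix keeps the bucketed rows adjacent and sortable.
    return base_key + b"#" + f"{bucket:02d}".encode()
```

Reads of the whole logical row then become a short scan over the `base_key#` prefix, while a single hot column only touches one of the smaller rows.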

Case3: META region workload isolation
•  Effect
      1.  RS high CPU
      2.  Excessive timeout
      3.  META lookup timeout
•  Cause
      –  Inefficient exception handling of HBase client
      –  Hotspot region and META in same RS
•  Solution
      –  META only RS

Case4: Region mappings to RS	
•  Effect
      –  Region mapping is not restored on RS restart
      –  Some region mappings aren’t restored properly
         after graceful restart
            •    graceful_stop.sh --restart --reload
•  Cause
      –  HBase does not support it well
•  Solution
      –  Periodically dump the mapping and restore it
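
Dump-and-restore can be reduced to two small steps: serialize the region-to-RS assignment, then later compute which regions have to move back. The sketch below assumes the assignment is already available as a plain dict. How it is read from the cluster is not shown, and `dump_mapping` and `plan_restore` are hypothetical names.

```python
import json
from typing import Dict, List, Tuple


def dump_mapping(assignment: Dict[str, str]) -> str:
    """Serialize {region name: RegionServer name} for later restore."""
    return json.dumps(assignment, sort_keys=True)


def plan_restore(saved_json: str, current: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (region, target RS) moves needed to get back to the saved mapping."""
    saved = json.loads(saved_json)
    return sorted(
        (region, server)
        for region, server in saved.items()
        # Regions that no longer exist (for example after a split) are skipped.
        if region in current and current[region] != server
    )
```

Each planned move could then be applied with an administrative operation such as the HBase shell's `move` command.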
Summary	
•  IDC online migration
       –  Without downtime
       –  LINE HBase Replicator & BulkMigrator
•  NN failover
       –  A solution simple enough for someone who asks
          “What’s Hadoop?”
•  Stabilizing LINE message cluster
       –  Improved response time of RS

Conclusion	
        We reached 100M users with HBase




     LINE Storage is a successful example
       of a messaging service using HBase

BDM37: Hadoop in production – the war stories by Nikolaï Grigoriev, Principal...
 
Hbase status quo apache-con europe - nov 2012
Hbase status quo   apache-con europe - nov 2012Hbase status quo   apache-con europe - nov 2012
Hbase status quo apache-con europe - nov 2012
 

Último

Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integrationmarketing932765
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesManik S Magar
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesBernd Ruecker
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observabilityitnewsafrica
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Kaya Weers
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...itnewsafrica
 

Último (20)

Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architectures
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 

Storage infrastructure using HBase behind LINE messages

  • 1.
  • 2. Storage infrastructure using HBase behind LINE messages
      NHN Japan Corp., LINE Server Task Force
      Shunsuke Nakamura (@sunsuk7tp)
      13.1.21, Hadoop Conference Japan 2013 Winter
  • 3. To support LINE’s users, we have built message storage that is
      • Large scale (tens of billions of rows/day)
      • Responsive (under 10 ms)
      • Highly available (dual clusters)
  • 4. Outline
      • About LINE
      • LINE & Storage requirements
      • What we achieved
      • Today’s topics
          – IDC online migration
          – NN failover
          – Stabilizing LINE message cluster
      • Conclusion
  • 5. LINE - A global messenger powered by NHN Japan -
      Devices: 5 different mobile platforms + Desktop support
  • 6.
  • 7.
  • 8.
  • 9. New Year 2013 in Japan
      [Chart: number of requests in an HBase cluster, plotted per minute; usual peak hours vs. New Year 2013 (×3), with peaks labeled あけおめ! and 新年好! ("Happy New Year!" in Japanese and Chinese)]
      3 times traffic explosion. LINE Storage had no problems :)
  • 10. LINE on Hadoop
      • Storages for service, backup and log
          – For HBase, M/R and log archive
      • Bulk migration and ad-hoc analysis
          – For HBase and Sharded-Redis
      • Collecting Apache and Tomcat logs
      • KPI, log analysis
  • 11. LINE on Hadoop
      • Storages for service, backup and log
          – For HBase, M/R and log archive
      • Bulk migration and ad-hoc analysis
          – For HBase and Sharded-Redis
      • Collecting Apache and Tomcat logs
      • KPI, log analysis
  • 12. LINE service requirements
      LINE is a…
      • Messaging Service - Should be fast
      • Global Service - Downtime not allowed
      But, not a Simple Messaging Service.
      • Message synchronization b/w phone & PCs
          – Messages should be kept for a while.
  • 13. LINE’s storage requirements
      • No data loss
      • Eventual consistency
      • Low latency
      • HA
      • Flexible schema management
      • Easy scale-out
  • 14. Our selection is HBase
      • Low latency for large amounts of data
      • Linearly scalable
      • Relatively lower operating cost
          – Replication by nature
          – Automatic failover
      • Data model fits our requirements
          – Semi-structured
          – Timestamp
  • 15. [Chart: stored rows per day in a cluster (billions/day)]
  • 16. What we achieved with HBase
      • No data loss
          – Persistent
          – Data replication
      • Automatic recovery from server failure
      • Reasonable performance for large data sets
          – Hundreds of billions of rows
          – Write: ~ 1 ms
          – Read: 1 ~ 10 ms
  • 17. Many issues we had
      • Heterogeneous storages coordination
      • IDC online migration
      • Flush & Compaction Storms by “too many HLogs”
      • Row & Column distribution
      • Secondary Index
      • Region Management
          – load, size balancing
          – RS Allocation
          – META region
          – M/R
      • Monitoring for diagnostics
      • Traffic burst by decommission
      • NN problems
      • Performance degradation
          – hotspot problem
          – timeout burst
          – GC problem
      • Client bugs
          – Thread Blocking on server failure (HBASE-6364)
  • 18. Today’s topics
      • IDC online migration
      • NN failover
      • Stabilizing LINE message cluster
  • 19. Section: IDC online migration (of: IDC online migration, NN failover, Stabilizing LINE message cluster)
  • 20. Why?
      • Move whole HBase clusters and data
      • For better network infrastructure
      • Without downtime
  • 21. IDC online migration
      Before migration
      [Diagram: App Server, src-HBase and dst-HBase, with a single write path]
  • 22. IDC online migration
      • Write to both (client-level replication)
      [Diagram: App Server writes to both src-HBase and dst-HBase]
  • 23. IDC online migration
      • New data: Incremental replication
      • Old data: Bulk migration
      • dst’s timestamp equals src’s
      [Diagram: App Server writes to both src-HBase and dst-HBase]
  • 24. LINE HBase Replicator & BulkMigrator
      • Replicator is for incremental replication
      • BulkMigrator is for bulk migration
  • 25. LINE HBase Replicator
      • Our own implementation
      • Prefer pull to push
      • Throughput throttling
      • Workload isolation of replicator and RS
      • Rowkey conversion and filtering
      [Diagram: HBase Replicator (push, src-HBase to dst-HBase) vs. LINE HBase Replicator (pull, src-HBase to dst-HBase)]
  • 26. LINE HBase Replicator - A simple daemon to replicate local regions -
      1. HLogTracker reads a checkpoint and selects the next HLog.
      2. For each entry in the HLog:
          1. Filter & convert an HLog.Entry
          2. Create Puts and batch them to dst HBase
      • Periodic checkpointing
      • Generally, entries are replicated in seconds
  • 27. Bulk migration
      1. MapReduce between any storages
          – Map task only
          – Read source, write destination
          – Task scheduling problem depends on region allocation
      2. Non MapReduce version (BulkMigrator)
          – Our own implementation
          – HBase → HBase
          – On each RS, scan & batch by a region
          – Throughput throttling
          – Slow, but easy to implement and debug
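The replication loop on slide 26 (read a checkpoint, walk the HLogs, filter and convert entries, batch Puts to the destination, checkpoint, throttle) can be sketched in a few lines. This is only an illustration under stated assumptions, not the LINE HBase Replicator itself: `Entry`, `Throttle`, `replicate`, the list-of-lists HLog model and the dict standing in for dst-HBase are all hypothetical, and no real HBase API is used. Cells are keyed by the source timestamp, in line with the slide-23 note that dst's timestamp equals src's, which makes replaying from an older checkpoint harmless.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Entry:
    """Hypothetical stand-in for one HLog entry (not an HBase class)."""
    rowkey: str
    column: str
    value: bytes
    timestamp: int


class Throttle:
    """Crude throttle: after sending n entries, sleep n / max_per_sec seconds."""

    def __init__(self, max_per_sec: float,
                 sleep: Callable[[float], None] = time.sleep):
        self.max_per_sec = max_per_sec
        self.sleep = sleep

    def pause_for(self, n_entries: int) -> None:
        self.sleep(n_entries / self.max_per_sec)


Checkpoint = Tuple[int, int]  # (index of the HLog, offset of the next entry)
Cell = Tuple[str, str, int]   # (rowkey, column, timestamp)


def replicate(hlogs: List[List[Entry]],
              checkpoint: Checkpoint,
              dst: Dict[Cell, bytes],
              convert: Callable[[Entry], Optional[Entry]],
              throttle: Throttle,
              batch_size: int = 100) -> Checkpoint:
    """Copy every entry after `checkpoint` into `dst`; return the new checkpoint.

    `convert` returns a rewritten entry, or None to filter the entry out.
    """
    log_idx, offset = checkpoint
    while log_idx < len(hlogs):
        entries = hlogs[log_idx]
        while offset < len(entries):
            chunk = entries[offset:offset + batch_size]
            puts = [p for p in map(convert, chunk) if p is not None]
            for p in puts:
                # Keep the source timestamp so a replayed entry overwrites
                # the same cell instead of creating a newer version.
                dst[(p.rowkey, p.column, p.timestamp)] = p.value
            throttle.pause_for(len(puts))
            offset += len(chunk)
            # Advance the checkpoint only after the batch has been written.
            checkpoint = (log_idx, offset)
        log_idx, offset = log_idx + 1, 0
    return checkpoint
```

A daemon built on this sketch would call `replicate` in a loop, persist the returned checkpoint periodically, and pass a `convert` function that drops unwanted rows and rewrites rowkeys.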
  • 28. Section: NN failover (of: IDC online migration, NN failover, Stabilizing LINE message cluster)
  • 29. Background
      • Our HBase has a SPOF: NameNode
      • “Apache Hadoop HA Configuration”: http://blog.cloudera.com/blog/2009/07/hadoop-ha-configuration/
      • Furthermore, added Pacemaker
          – Heartbeat can’t detect whether NN is running
  • 30. Previous: HA-NN
      DRBD + VIP + Pacemaker
  • 31. NameNode failure in 2012.10
  • 32. HA-NN failover failed
      • Not caused by the NameNode process itself
      • Incorrect leader election at network partitioning
      • Complicated configuration
          – Easy to get wrong, difficult to control
          – Pacemaker scripting was not straightforward
          – VIP is risky to HDFS
      • DRBD split-brain problem
          – Protocol C
          – Unable to re-sync while service is online
  • 33. Now: In-house NN failure handling
      • Bye-bye old HA-NN
          – Had to restart whole HBase clusters after NN failover
      • Alternative ideas
          – Quorum-based leader election (using ZK)
          – Using L4 switch
          – Implement our own AvatarNode
      • A safer solution at the cost of a little downtime
  • 34. In-house NN failure handling (1)
      rsync with --link-dest periodically
  • 35. In-house NN failure handling (2)
      Bomb
  • 36. In-house NN failure handling (3)
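Slide 34 mentions running rsync with --link-dest periodically. The slides do not show what is copied or where, so the sketch below assumes, purely for illustration, that a NameNode metadata directory is snapshotted into timestamped directories on a backup host. `snapshot_command` and all paths are hypothetical. The only rsync behavior relied on is the documented one: with `--link-dest=DIR`, files unchanged relative to `DIR` are hard-linked rather than copied, so each snapshot looks complete while costing little extra space.

```python
import datetime
from pathlib import PurePosixPath
from typing import List, Optional


def snapshot_command(src_dir: str,
                     backup_root: str,
                     now: datetime.datetime,
                     previous: Optional[str] = None) -> List[str]:
    """Build the rsync argv for one timestamped snapshot.

    `previous` is the directory name of the last snapshot under
    `backup_root`; unchanged files are hard-linked against it.
    """
    root = PurePosixPath(backup_root)
    dest = root / now.strftime("%Y%m%d-%H%M%S")
    cmd = ["rsync", "-a"]
    if previous is not None:
        # Use an absolute path: a relative --link-dest is resolved
        # against the destination directory.
        cmd.append(f"--link-dest={root / previous}")
    # Trailing slash on the source copies its contents, not the directory.
    cmd += [src_dir.rstrip("/") + "/", f"{dest}/"]
    return cmd
```

A cron job or small daemon could pass the returned list to `subprocess.run(cmd, check=True)` and remember the new snapshot name for the next run.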
  • 37. Section: Stabilizing LINE message cluster (of: IDC online migration, NN failover, Stabilizing LINE message cluster)
  • 38. Stabilizing LINE message cluster
      • Case 1: “Too many HLogs”
      • Case 2: Hotspot problems
      • Case 3: META region workload isolation
      • Case 4: Region mappings to RS
      [Diagram labels: RS GC Storm, H/W Failure Handling, Performance]
  • 39. Case1: “Too many HLogs”
      • Effect
          – MemStore flush storm
          – Compaction storm
      • Cause
          – Different regions growth
          – Heterogeneous tables in a RS
      • Solution
          – Region balancing
          – External flush scheduler
  • 40. Case1: Number of HLogs
      [Chart: number of HLogs across peak and off-peak hours, comparing a better case and a worse case; labels: periodic flushed, no flushed, forced flushed, flush storm]
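The "external flush scheduler" from Case 1 is not described in detail on the slides. One way such a scheduler could work, offered only as a sketch under assumptions, is to flush a few regions per tick, oldest unflushed edits first, so that slow-growing regions stop pinning old HLogs and the region server is not pushed into a burst of forced flushes. `RegionState`, `pick_regions_to_flush`, `run_tick` and the `flush` callback are hypothetical names; how region ages are collected and how a flush is actually issued are left out.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RegionState:
    """Hypothetical per-region metrics gathered by the scheduler."""
    name: str
    oldest_unflushed_age_s: float  # age of the oldest edit not yet flushed
    memstore_bytes: int


def pick_regions_to_flush(regions: List[RegionState],
                          max_age_s: float,
                          max_flushes: int) -> List[str]:
    """Return up to max_flushes region names, oldest unflushed edits first."""
    stale = [r for r in regions
             if r.memstore_bytes > 0 and r.oldest_unflushed_age_s >= max_age_s]
    stale.sort(key=lambda r: r.oldest_unflushed_age_s, reverse=True)
    return [r.name for r in stale[:max_flushes]]


def run_tick(regions: List[RegionState],
             flush: Callable[[str], None],
             max_age_s: float = 3600.0,
             max_flushes: int = 2) -> List[str]:
    """One scheduler tick: flush a small, bounded number of stale regions."""
    chosen = pick_regions_to_flush(regions, max_age_s, max_flushes)
    for name in chosen:
        flush(name)
    return chosen
```

Bounding the number of flushes per tick spreads the work out, which is the opposite of the flush storm shown in the worse case on slide 40.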
  • 41. Case2: Hotspot problems
      • Effect
          – Excessive GC
          – RS performance degradation (high CPU usage)
      • Cause
          – Get/Scan:
              • Row or column updated too frequently
              • Row which has too many columns (+ tombstones)
      • Solution
          – Schema and row/column distribution are important
          – Hotspot region isolation
  • 42. Case3: META region workload isolation
      • Effect
          1. RS high CPU
          2. Excessive timeout
          3. META lookup timeout
      • Cause
          – Inefficient exception handling of HBase client
          – Hotspot region and META in same RS
      • Solution
          – META only RS
  • 43. Case4: Region mappings to RS
      • Effect
          – Region mapping is not restored on RS restart
          – Some region mappings aren’t restored properly after graceful restart
              • graceful_stop.sh --restart --reload
      • Cause
          – HBase does not support it well
      • Solution
          – Periodically dump the mappings and restore them
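The Case 4 solution, dumping region mappings periodically and restoring them, can be illustrated with a small planning function. This is a minimal sketch, assuming the mapping is available as a plain region-to-server dictionary; `dump_mapping` and `plan_restore` are hypothetical names, and the step that actually moves a region (for example through the HBase shell's `move` command) is deliberately left to the caller.

```python
import json
from typing import Dict, List, Set, Tuple


def dump_mapping(current: Dict[str, str]) -> str:
    """Serialize the region -> region server mapping for later restore."""
    return json.dumps(current, sort_keys=True)


def plan_restore(dumped_json: str,
                 current: Dict[str, str],
                 live_servers: Set[str]) -> List[Tuple[str, str]]:
    """Return the (region, target_server) moves needed to restore the dump.

    Regions that no longer exist and targets that are not alive are skipped.
    """
    wanted = json.loads(dumped_json)
    moves = []
    for region, server in sorted(wanted.items()):
        if region not in current:
            continue  # region was split, merged or dropped since the dump
        if server not in live_servers:
            continue  # the original server has not come back
        if current[region] != server:
            moves.append((region, server))
    return moves
```

Running `dump_mapping` on a schedule and `plan_restore` after a restart yields only the moves that are still meaningful, so a stale dump cannot send regions to a dead server.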
  • 44. Summary
      • IDC online migration
          – Without downtime
          – LINE HBase Replicator & BulkMigrator
      • NN failover
          – Simple solution for a person saying “What’s Hadoop?”
      • Stabilizing LINE message cluster
          – Improved response time of RS
  • 45. Conclusion
      • We won 100M users by adopting HBase
      • LINE Storage is a successful example of a messaging service using HBase