October 2017 Alejandro - @ae_bm
Important information
● There are sweets
● You can always do better on a greenfield project
● Let's skip the question of why I didn't go to the
Prometheus School of Running Away From
Things *
● I tend to use metaphors that are not suitable for all
audiences ^_^
* https://www.youtube.com/watch?v=-BWnTW4rL0U (spoiler alert)
Context
● A system for querying graphs of values for a
multitude of devices; it sounds like IoT, but
don't get carried away by the hype
● RabbitMQ is a message broker. One of its uses
is decoupling producers and
consumers *
● Graphite is a system for storing and
displaying graphs, split into multiple
components
* http://www.eferro.net/2017/09/pub-sub-swiss-army-knife-tech-pill.html
Context
SSH
=
vs I
Switch the I/O scheduler
to CFQ
The main aim of CFQ scheduler is to provide a fair allocation of the disk
I/O bandwidth for all the processes which requests an I/O operation.
CFQ maintains the per process queue for the processes which request I/O
operation(synchronous requests). In case of asynchronous requests, all the
requests from all the processes are batched together according to their
process's I/O priority.
https://www.kernel.org/doc/Documentation/block/cfq-iosched.txt
https://www.kernel.org/doc/Documentation/block/ioprio.txt - bonus
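To check which scheduler a disk is actually using before and after the change, something like the following could work. It is a minimal sketch, assuming the sysfs layout where /sys/block/<device>/queue/scheduler lists the available schedulers and marks the active one in square brackets. The function names and the device name "sda" are illustrative and do not come from the talk.

```python
from pathlib import Path


def parse_active_scheduler(contents: str) -> str:
    """Return the scheduler marked with [brackets] in the sysfs file contents."""
    for token in contents.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active scheduler marked in {contents!r}")


def read_active_scheduler(device: str) -> str:
    """Read the active I/O scheduler of a block device such as 'sda' (assumed name)."""
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return parse_active_scheduler(path.read_text())


if __name__ == "__main__":
    # Example file contents on a machine that offers three schedulers.
    print(parse_active_scheduler("noop deadline [cfq]\n"))
```

Switching the scheduler is then a matter of writing its name to the same file as root; which names are accepted depends on the kernel in use.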
vs I
vs II
vs II
https://medium.com/netflix-techblog/lessons-netflix-learned-from-the-aws-outage-deefe5fd0c04
https://landing.google.com/sre/book/chapters/addressing-cascading-failures.html
Graceful
degradation
Fail fast
Aggressive
Timeouts
Avoid long queue lengths
reject requests, rather than
overloading servers
proxy_send_timeout 5s;
proxy_read_timeout 5s;
proxy_connect_timeout 1s;
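The "avoid long queue lengths" and "reject requests" advice can be illustrated outside NGINX too. The following is a minimal sketch of load shedding with a bounded queue; the class name LoadShedder and the limit of 2 pending requests are hypothetical and only serve the example.

```python
import queue


class LoadShedder:
    """Accept work only while the backlog is short; otherwise fail fast."""

    def __init__(self, max_pending: int):
        self._pending = queue.Queue(maxsize=max_pending)

    def submit(self, request) -> bool:
        """Return True if the request was queued, False if it was rejected at once."""
        try:
            self._pending.put_nowait(request)
            return True
        except queue.Full:
            return False

    def next_request(self):
        """Hand the oldest queued request to a worker (raises queue.Empty if none)."""
        return self._pending.get_nowait()


if __name__ == "__main__":
    shedder = LoadShedder(max_pending=2)
    for name in ("a", "b", "c"):
        print(name, "accepted" if shedder.submit(name) else "rejected")
```

Rejecting at the door keeps the waiting time of accepted requests bounded, which is the same goal the short proxy timeouts pursue.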
vs III
vs III
New context
vs IV
Writes
Reads
vs IV
# Limits the number of whisper update_many() calls per second, which effectively
# means the number of write requests sent to the disk. This is intended to
# prevent over-utilizing the disk and thus starving the rest of the system.
# When the rate of required updates exceeds this, then carbon's caching will
# take effect and increase the overall throughput accordingly.
# MAX_UPDATES_PER_SECOND = 500
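The speaker notes describe measuring the average updates per second and then setting the limit to 70% of it to leave room for reads. A minimal sketch of that arithmetic follows; the sample values and the function name are made up for illustration, and real figures would have to come from carbon's own metrics.

```python
from statistics import mean


def throttled_updates_per_second(samples, percent=70):
    """Return `percent` of the average observed update rate, rounded down."""
    if not samples:
        raise ValueError("need at least one sample")
    return int(mean(samples) * percent / 100)


if __name__ == "__main__":
    observed = [400, 500, 600]  # hypothetical updates/s measurements
    print("MAX_UPDATES_PER_SECOND =", throttled_updates_per_second(observed))
```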
Context reminder
vs V (part I)
OOM Killed broker kvm process
OOM Killed graphite kvm process
https://github.com/dastergon/awesome-chaos-engineering
vs V (part I)
vs V (part II)
Actual deliver rate < 300/s
Expected deliver rate > 6000/s
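To get a feel for what that gap means, here is a back-of-the-envelope sketch of how long a backlog takes to drain. The backlog size is a hypothetical figure chosen for the example, not a number from the talk; only the two deliver rates come from the slide.

```python
def drain_seconds(backlog, deliver_rate, publish_rate=0.0):
    """Seconds needed to empty a queue, or infinity if it never catches up."""
    net_rate = deliver_rate - publish_rate
    if net_rate <= 0:
        return float("inf")
    return backlog / net_rate


if __name__ == "__main__":
    backlog = 6_000_000  # hypothetical number of queued messages
    for rate in (300, 6000):
        hours = drain_seconds(backlog, rate) / 3600
        print(f"{rate}/s -> {hours:.1f} h")
```

With publishers still sending at or above the deliver rate, the function returns infinity, which matches the situation described in the notes: the queue does not shrink while the avalanche of incoming messages continues.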
vs V (part II)
Drop
Traffic
Slow Startup
# --dport is only valid together with a protocol match, hence -p tcp
iptables -t filter -I INPUT 1 -p tcp --dport 5672 -j DROP
# wait & remove
iptables -t filter -I INPUT 1 -p tcp --dport 5672 \
  -m statistic --mode random --probability 0.9 -j DROP
# wait & remove
iptables -t filter -I INPUT 1 -p tcp --dport 5672 \
  -m statistic --mode random --probability 0.8 -j DROP
...
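The stepped sequence above can be generated rather than typed by hand. This is a minimal sketch that only builds the command strings and does not execute anything; the function name and the idea of pairing each insert with a matching delete are assumptions for illustration, not the author's script.

```python
def ramp_up_commands(port, drop_probabilities):
    """Build (insert, delete) iptables command pairs, one pair per step."""
    steps = []
    for probability in drop_probabilities:
        if not 0.0 < probability <= 1.0:
            raise ValueError(f"probability out of range: {probability}")
        # A probability of 1.0 means "drop everything": no statistic match needed.
        match = "" if probability == 1.0 else (
            f" -m statistic --mode random --probability {probability:g}"
        )
        spec = f"-p tcp --dport {port}{match} -j DROP"
        steps.append((
            f"iptables -t filter -I INPUT 1 {spec}",
            f"iptables -t filter -D INPUT {spec}",
        ))
    return steps


if __name__ == "__main__":
    for insert, delete in ramp_up_commands(5672, [1.0, 0.9, 0.8]):
        print(insert)
        print("# wait, then:", delete)
```

Each step would be applied, left in place for a while, and removed before the next, less aggressive rule goes in.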
FTW https://hoytech.com/vmtouch/
THE END?
Credits
http://www.publicdomainpictures.net/pictures/30000/velka/halloween-illustration.jpg
https://pixabay.com/p-160313/?no_redirect
https://pixabay.com/p-2202209/?no_redirect
http://www.publicdomainpictures.net/pictures/170000/velka/a-bit-too-much.jpg
http://free-icon-rainbow.com/i/icon_05061/icon_050610.svg
http://blog.clarity.fm/wp-content/uploads/2014/01/shutterstock_124904114-603x483.jpg
https://pixabay.com/p-297703/?no_redirect
https://gph.is/13WDoyA
https://gph.is/2cJ4LvC
https://pixabay.com/p-485502/?no_redirect
https://gph.is/294uXwP
https://gph.is/XJdqRS

AMQP vs GRAPHITE

Editor's notes

  1. Describe the system of processes that collect metrics from devices and publish them to RabbitMQ to be consumed by Graphite. If Kafka comes up, mention that Kafka requires consumers to keep track of their own position. https://content.pivotal.io/blog/understanding-when-to-use-rabbitmq-or-apache-kafka
  2. Diagram of the original setup
  3. SSH into the metrics server to enjoy response times that make you cry. These are the problems of having everything on the same physical disks: when there is a lot of I/O it is best to have dedicated disks, otherwise you get stuck in the traffic jam.
  4. My reasoning was that the problem is the wild read/write cycle Graphite is running, which is so frequent that it starves everything else. So I decided to try changing the scheduler so that other processes would have I/O slots available. Also, if things got ugly, ionice could be used. It looks like there are other, newer schedulers in the kernel that will be worth trying. https://lwn.net/Articles/720675/
  5. Graphite was migrated to another machine, and while we were at it we put it in Docker to be hipster enough (in reality, it was so that future Graphite installations would be reproducible). The curious thing is that Riemann started showing high response times and errors in NGINX that knocked it out.
  6. Looking at the logs, remembering what I had read in a Netflix post and in the SRE book's chapter on cascading failures, and after running load tests with ab, I decided, using the Force, that if a graph took more than 5 seconds NGINX should cancel the operation. As Dan Luu rightly puts it, you have to set deadlines to avoid zombie requests https://danluu.com/google-sre-book/
  7. Before migrating the servers (again), I decided to run load tests in AWS. This is the magic moment when you discover that Graphite reading metrics from AMQP is rubbish: it maxes out the CPU before saturating the disks.
  8. In the end, to keep testing I had to start using carbon-relay instances and edit their code so that they would all use the same named queue.
  9. Diagram of the resulting setup
  10. Reviewing the NGINX logs again, I saw that many graph requests were not loading because they took more than 5 seconds.
  11. In the end, after a bit of syscall tracing, I saw that with the disk so swamped by writes (yes, on a dedicated disk) reads could not be served. Besides the monitoring, we have people with little windows open watching the graphs as if it were a NOC. So I looked at the updates per second, gathered statistics on the average, and then lowered this value in the Graphite configuration to 70% to leave room for reads. Perhaps queue depth and that sort of thing needed tuning as well =)
  12. Diagram of the resulting setup
  13. I had recently watched @adrianco's talk on chaos engineering and, since we have Graphite and the broker duplicated, I felt like checking whether our configuration would really survive an event as silly as an upgrade (yes, an upgrade had to be done). One of the data points I had to validate was that carbon-cache would take approximately 5 hours and 30 minutes to shut down cleanly. So we stopped the relays and the cache. This made the broker accumulate messages until, boom, the connection to both machines was lost. Connecting to the hypervisor, I saw that the OOM killer had killed the KVM processes.
  14. As simple as reconfiguring the memory usage of the 2 virtual machines so that they do not blow up if they use all of their assigned memory.
  15. Now the other instance had to be upgraded, so it was a good opportunity to test the shutdown and restore process again. This time the shutdown gave no problems, beyond the fact that RabbitMQ, having used all its memory, stopped receiving messages from the shovels. The fun part was restoring the service. The queue was full of messages and they were being read very slowly. I thought it was the relays, so I ran even more relays and it still did not improve. In the end I realized that RabbitMQ was maxing out the disk. It seems the problem was that the messages to be delivered were not in cache and had to be read from disk while an avalanche of messages kept arriving.
  16. In the end, to improve the speed I used a two-pronged strategy: on one hand vmtouch https://hoytech.com/vmtouch/ to keep the Mnesia files in cache, and on the other hand blocking incoming traffic to the broker and accepting it little by little with iptables and the statistic module, taking advantage of the fact that TCP slows down when many packets are lost =)
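Note 8 mentions making several carbon-relay instances read from the same named queue. The effect of that choice (each message is handled by exactly one of the consumers, so adding consumers spreads the load) can be sketched with the standard library alone. This is an in-process simulation under that assumption, not RabbitMQ or carbon code, and all names in it are hypothetical.

```python
import queue
import threading


def consume_shared_queue(messages, worker_count):
    """Let several workers drain one shared queue; return what each one handled."""
    shared = queue.Queue()
    for message in messages:
        shared.put(message)
    handled = {worker_id: [] for worker_id in range(worker_count)}

    def worker(worker_id):
        while True:
            try:
                message = shared.get_nowait()
            except queue.Empty:
                return  # nothing left to consume
            handled[worker_id].append(message)

    threads = [
        threading.Thread(target=worker, args=(worker_id,))
        for worker_id in range(worker_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return handled


if __name__ == "__main__":
    result = consume_shared_queue(list(range(100)), worker_count=4)
    print({worker_id: len(items) for worker_id, items in result.items()})
```

How the messages split between workers varies from run to run, but every message ends up with exactly one worker.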