1
High Availability Mantra:
How DCIM Can Help
2
Today’s Topics
• High Availability Mantra Revisited
• Anatomy of a DCIM Software: GFS Crane
• How GFS Crane DCIM Delivers Higher Availability
• How GFS Crane DCIM Helps to Reduce Costs
• GFS Crane DCIM Case Studies
3
The High Availability Mantra Revisited
Amazon Data Centers (built to Tier 4 standards and with an expected availability of 99.995%) had two
outages in 2012 – each over 3 hours!
• Tier 3/Tier 4 just defined by hardware redundancies
• Glaring gaps in operating procedures to prevent fatal human errors
• Lack of purpose-built BCP software to predict failures
• Lack of chain of custody to detect root cause
Availability %              Downtime per year   Downtime per month*   Downtime per week
99% ("two nines")           3.65 days           7.20 hours            1.68 hours
99.5%                       1.83 days           3.60 hours            50.4 minutes
99.8%                       17.52 hours         86.23 minutes         20.16 minutes
99.9% ("three nines")       8.76 hours          43.8 minutes          10.1 minutes
99.95%                      4.38 hours          21.56 minutes         5.04 minutes
99.99% ("four nines")       52.56 minutes       4.32 minutes          1.01 minutes
99.999% ("five nines")      5.26 minutes        25.9 seconds          6.05 seconds
99.9999% ("six nines")      31.5 seconds        2.59 seconds          0.605 seconds
99.99999% ("seven nines")   3.15 seconds        0.259 seconds         0.0605 seconds
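The downtime figures in the table follow directly from the availability percentage. A minimal Python sketch of that arithmetic, assuming a 365-day year, a 30-day month and a 7-day week (the period lengths and the function names `downtime_hours` and `availability_pct` are assumptions for illustration, not taken from the slides):

```python
# Period lengths in hours. These are assumptions: the slide does not say
# which year or month length its table is based on.
PERIOD_HOURS = {"year": 365 * 24, "month": 30 * 24, "week": 7 * 24}


def downtime_hours(availability_pct_value: float, period: str = "year") -> float:
    """Allowed downtime, in hours, for a given availability percentage."""
    unavailable_fraction = 1.0 - availability_pct_value / 100.0
    return unavailable_fraction * PERIOD_HOURS[period]


def availability_pct(downtime_hrs: float, period: str = "year") -> float:
    """Availability percentage implied by a given amount of downtime."""
    return 100.0 * (1.0 - downtime_hrs / PERIOD_HOURS[period])


if __name__ == "__main__":
    for pct in (99.0, 99.9, 99.99, 99.999):
        print(f"{pct}% -> {downtime_hours(pct):.2f} hours of downtime per year")
```

Under these assumptions 99.9% works out to 8.76 hours per year, as in the table. A few of the monthly entries in the table differ slightly from this calculation, presumably because of rounding or a different month length.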
4
Did You Know?
90% of DC Failures Are From Common Preventable Causes
5
Did You Know?
Average Downtime of an Online System: 36 hours per annum.
That’s only 99.6% Uptime
6
Did You Know?
75% of Businesses Without a BC Plan Fail Within 3 Years after a Major
Disruption in their IT Systems
7
Anatomy of a DCIM Software: GFS Crane
8
Improves Availability: Predictability, Visibility & Change Tracking
• Advanced alarm management and analytics help with failure predictability, faster turnaround time, improved availability and SLA compliance
• Consolidation of alarms from different facilities enables centralized monitoring
• Improved visibility of the power chain and of the relationships among critical infrastructure components supports better impact analysis of device malfunctions or failures, and root cause analysis (RCA)
• Change tracking in the data center environment supports impact analysis of any change and root cause analysis of any outage caused by a change
Predictive Analytics
Visibility from Power Chain
Change Tracking
9
Improves Availability: Predictability from Proactive Alarms
Proactive Real-time Alarms
• Alarms on power, PUE and environmental conditions such as temperature, humidity, smoke, fire, water leak detection (WLD), door-open and motion
• Alarms can be sent by e-mail and SMS
Alarm Dashboard
• Alarms from multiple data centers are consolidated on a dashboard
• Alarms can be analyzed by severity, type, source, duration, etc.
Advanced alarm management helps with failure predictability, faster turnaround time, improved availability and SLA compliance
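The alarm analysis described above can be pictured as a simple grouping over consolidated alarm records. The following is a rough Python sketch only: the `Alarm` fields, the `summarize` helper and the sample records are all hypothetical and do not reflect GFS Crane's actual data model.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Alarm:
    site: str        # data center that raised the alarm
    source: str      # device or sensor, e.g. "ups-1"
    kind: str        # e.g. "temperature", "power", "door-open"
    severity: str    # e.g. "critical", "major", "minor"
    duration_s: int  # how long the alarm stayed active, in seconds


def summarize(alarms, key):
    """Count alarms grouped by one attribute, most frequent first."""
    return Counter(getattr(a, key) for a in alarms).most_common()


# Made-up sample records from two sites, consolidated into one list.
alarms = [
    Alarm("dc-east", "crac-2", "temperature", "critical", 900),
    Alarm("dc-east", "ups-1", "power", "major", 120),
    Alarm("dc-west", "crac-2", "temperature", "minor", 60),
    Alarm("dc-west", "rack-14-door", "door-open", "minor", 30),
]

print(summarize(alarms, "severity"))
print(summarize(alarms, "kind"))
```

Grouping by `source` or `site` in the same way would show which devices or facilities raise alarms most often.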
10
Improves Availability: Visibility from Power Chain
Maps relationships among critical components of electrical infrastructure
• Creates a power chain for the electrical infrastructure
• Maps asset relationships and redundancies from the power source through to customers and applications
Asset Relationship Mapping
Improved visibility of the power chain and of the relationships among critical infrastructure components supports better impact analysis of device malfunctions or failures, and root cause analysis
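One way to picture impact analysis over a power chain is as a graph in which each device lists the upstream devices that feed it. The Python sketch below assumes that model; the `feeds` mapping, the device names and the `impacted` function are hypothetical, and the chain is assumed to contain no loops.

```python
def impacted(feeds, failed):
    """Return the devices that lose power when the `failed` devices go down.

    `feeds` maps each device to the list of upstream devices that supply it.
    A device with no upstream entries is treated as a power source. A device
    stays live as long as at least one of its feeds is live, which is how
    redundancy is modeled here.
    """
    cache = {}

    def is_live(device):
        if device not in cache:
            if device in failed:
                cache[device] = False
            else:
                upstream = feeds.get(device, [])
                cache[device] = not upstream or any(is_live(u) for u in upstream)
        return cache[device]

    return sorted(d for d in feeds if d not in failed and not is_live(d))


# Made-up chain: rack_1 is dual-fed, rack_2 has a single feed.
feeds = {
    "utility": [],
    "generator": [],
    "ups_a": ["utility", "generator"],
    "ups_b": ["utility", "generator"],
    "pdu_a": ["ups_a"],
    "pdu_b": ["ups_b"],
    "rack_1": ["pdu_a", "pdu_b"],
    "rack_2": ["pdu_a"],
}

print(impacted(feeds, {"ups_a"}))
print(impacted(feeds, {"utility"}))
```

In this example a failure of `ups_a` takes down `pdu_a` and the single-fed `rack_2`, while the dual-fed `rack_1` stays up. A utility failure alone affects nothing, because the generator still feeds both UPS units.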
11
Improves Availability: Change Tracking
• Maintains an audit trail of all Installation/Move/Add/Change activity in the data center
• Integration with an existing ITSM tool lets tracked changes run through a workflow system for change approvals
Audit Trail of DC Configuration Changes
Tracking changes in the data center environment supports impact analysis of any change and root cause analysis of any outage caused by a change
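To illustrate how an audit trail can support root cause analysis, the Python sketch below lists the changes logged shortly before an outage. The `ChangeRecord` fields, the `changes_before` helper, the 24-hour default window and the sample entries are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ChangeRecord:
    when: datetime
    action: str   # "install", "move", "add" or "change"
    asset: str
    actor: str
    detail: str


def changes_before(trail, outage_time, window=timedelta(hours=24), assets=None):
    """Changes logged in the window before an outage, newest first.

    If `assets` is given, only changes to those assets are returned.
    """
    start = outage_time - window
    hits = [
        rec for rec in trail
        if start <= rec.when <= outage_time
        and (assets is None or rec.asset in assets)
    ]
    return sorted(hits, key=lambda rec: rec.when, reverse=True)


# Made-up audit trail entries.
trail = [
    ChangeRecord(datetime(2024, 1, 10, 9, 0), "move", "srv-101", "alice", "rack 4 -> rack 7"),
    ChangeRecord(datetime(2024, 1, 11, 14, 30), "change", "pdu-a", "bob", "breaker rating updated"),
    ChangeRecord(datetime(2024, 1, 11, 22, 15), "add", "srv-220", "carol", "new server on pdu-a"),
]

outage = datetime(2024, 1, 12, 1, 0)
for rec in changes_before(trail, outage):
    print(rec.when, rec.action, rec.asset, rec.detail)
```

Passing `assets` narrows the list to the devices involved in the outage, for example those identified through the power chain.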
12
Reduces Cost: Capex & Opex
Reduces Capex
• Better visibility helps discover under-utilized computing capacity, which defers capex purchases
• Better visibility helps avoid stranded rack space and power capacity, which maximizes utilization of available capacity
Reduces Opex
• Better monitoring and analytics reduce operating costs on power
• Automation of processes such as asset tracking, provisioning and monitoring improves productivity
• Rationalizing the asset base lowers maintenance costs such as equipment AMC
13
Reduces CapEx: Monitoring IT Utilization
Visibility of hidden compute capacity
• Calculates the average utilization of all computing devices in the data center
• Identifies unused compute capacity
Under-utilized servers can be repurposed
• ‘Repurpose Candidates’ are identified from power consumption and utilization patterns, hardware specs and age, which helps defer new server hardware purchases
Hidden Computing Capacity
Repurpose Hardware
Discovery of hidden compute capacity defers capital investment in new server hardware and software licenses
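A simplified view of how repurpose candidates might be shortlisted is sketched below in Python. It uses only average CPU utilization and age; the power consumption patterns and hardware specs mentioned above are left out for brevity. The `Server` fields, both thresholds and the sample fleet are illustrative assumptions, not values from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    name: str
    avg_cpu_pct: float  # average CPU utilization over the observation period
    age_years: float


def average_utilization(servers):
    """Mean CPU utilization across all servers, in percent."""
    return sum(s.avg_cpu_pct for s in servers) / len(servers)


def repurpose_candidates(servers, max_cpu_pct=10.0, max_age_years=5.0):
    """Lightly loaded servers that are still young enough to be worth reusing.

    Both thresholds are illustrative assumptions.
    """
    return [
        s for s in servers
        if s.avg_cpu_pct <= max_cpu_pct and s.age_years <= max_age_years
    ]


# Made-up fleet.
servers = [
    Server("app-01", 62.0, 2.0),
    Server("app-02", 4.0, 3.0),
    Server("db-legacy", 6.0, 8.0),
]

print(f"Average utilization: {average_utilization(servers):.1f}%")
print([s.name for s in repurpose_candidates(servers)])
```

Here `app-02` is the only candidate: `db-legacy` is equally idle but exceeds the assumed age limit, which would make it a replacement candidate instead.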
14
Reduces Capex: Minimizing Stranded Capacities
Visibility of consumed power against maximum capacity in a rack
• Provides real-time information on the actual IT load in a rack
• Provides maximum power capacity
• Provides available power capacity
Visibility of occupied rack space against maximum available space
• Provides real-time information on occupied space in the rack, in rack units (RU)
• Provides maximum space capacity
• Provides available space capacity
Hidden Power Capacity
Hidden Space Capacity
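The available figures above are simply maximum minus consumed, and capacity becomes stranded when one resource runs out while the other is still free. A minimal Python sketch under that reading follows; the `Rack` fields, the `stranded` helper, its thresholds and the sample racks are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rack:
    name: str
    max_power_kw: float
    used_power_kw: float
    max_space_ru: int
    used_space_ru: int

    @property
    def available_power_kw(self) -> float:
        return self.max_power_kw - self.used_power_kw

    @property
    def available_space_ru(self) -> int:
        return self.max_space_ru - self.used_space_ru


def stranded(rack, min_power_kw=0.5, min_space_ru=1):
    """Describe capacity that is unusable because the other resource ran out.

    Returns None when the rack has both power and space left, or neither.
    The minimum usable amounts are illustrative assumptions.
    """
    has_power = rack.available_power_kw >= min_power_kw
    has_space = rack.available_space_ru >= min_space_ru
    if has_power and not has_space:
        return f"{rack.available_power_kw:.1f} kW stranded (no free space)"
    if has_space and not has_power:
        return f"{rack.available_space_ru} RU stranded (no spare power)"
    return None


# Made-up racks.
racks = [
    Rack("r1", 6.0, 5.8, 42, 20),
    Rack("r2", 6.0, 2.0, 42, 42),
    Rack("r3", 6.0, 3.0, 42, 20),
]

for rack in racks:
    print(rack.name, stranded(rack))
```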
15
Reduces OpEx: Power Costs
Multi-level PUE Comparison
• Compares PUE calculated at multiple levels and identifies power distribution losses that can be rectified to improve efficiency and reduce opex on power
Detect Power Distribution Loss
L1 PUE: UPS output
L2 PUE: PDU output
L3 PUE: Device-level reading
Detecting power distribution losses in the electrical infrastructure helps improve the data center's energy efficiency and reduce operating costs on power
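PUE is total facility power divided by IT load, so measuring the IT load at each of the three points above gives three PUE values, and the differences between the readings expose the distribution losses. The Python sketch below works through that arithmetic; the function names and the sample readings are assumptions for illustration.

```python
def pue(total_facility_kw: float, it_load_kw: float) -> float:
    """Power Usage Effectiveness: total facility power divided by IT load."""
    if it_load_kw <= 0:
        raise ValueError("IT load must be positive")
    return total_facility_kw / it_load_kw


def multi_level_pue(total_facility_kw, ups_output_kw, pdu_output_kw, device_kw):
    """PUE at three measurement points, plus the loss between adjacent points."""
    return {
        "L1 PUE (UPS output)": pue(total_facility_kw, ups_output_kw),
        "L2 PUE (PDU output)": pue(total_facility_kw, pdu_output_kw),
        "L3 PUE (device level)": pue(total_facility_kw, device_kw),
        "UPS-to-PDU loss (kW)": ups_output_kw - pdu_output_kw,
        "PDU-to-device loss (kW)": pdu_output_kw - device_kw,
    }


# Made-up readings in kW.
result = multi_level_pue(
    total_facility_kw=1000.0,
    ups_output_kw=500.0,
    pdu_output_kw=480.0,
    device_kw=470.0,
)
for label, value in result.items():
    print(f"{label}: {value:.2f}")
```

With these sample readings the PUE rises from 2.00 at the UPS output to about 2.13 at device level, and the 20 kW lost between UPS and PDU is the larger of the two losses to investigate.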
16
Reduces Opex: Process Automation & Improved Productivity
Automated discovery and inventory of both IT and infrastructure assets
• Intelligent assets are discovered automatically using SNMP/IPMI
• A manufacturer repository holds the static attributes of assets
• Asset data can be imported from spreadsheets or an asset management tool
• A single management console manages IT and non-IT assets
• Asset maintenance is managed through plug-ins that send scheduler-based proactive alerts
• Workflow-based auto-provisioning improves speed and reduces errors
Advanced Asset Management
17
Reduces Opex: Asset Rationalization
Asset Rationalization
• The Asset Management module tracks and maintains an inventory of all assets (IT and non-IT) in the data center
• Helps identify legacy servers and replacement candidates
• Reduces AMC and space rental costs
Diagram labels: Asset Rationalization; Server Virtualization; Capacity Planning; Data Center Consolidation; GFS Crane DC DCIM; Legacy Data Center; Server & Rack Consolidation; Multiple Data Centers
18
How GFS Crane DCIM Helps
• Helps the data center manager avoid unnecessary over-provisioning
• Helps plan investments and new capacity
• Helps reduce capital costs
• Helps reduce power use and other operating costs
• Helps reduce the risk of failures through critical alerts
• Helps adapt to technical and business change more easily
• Supports improvement plans through real-time metrics and dashboards
19
GFS Crane DCIM Case Study 1: Financial Services
Industry: Project Financing & Mutual Funds
Data Center Location: India
Data Center Details: Tier III certified by 451 Research; energy-efficient ‘green’ data center certified by TÜV Rheinland
DCIM Implementation Date: January 2012
Business requirements driving DCIM implementation:
• Improve energy efficiency through better energy management
• Comply with Green Grid recommendations and adopt best practices in data center operations
• Improve data center availability and meet business SLA through better monitoring, failure prediction and faster turnaround time
Integration Touch Points:
• Power systems: LT transformer panels, UPS, PDUs and distribution panels, busbar panels, multifunction energy meters
• Environmental systems: PAC units, temperature and humidity probes
• Servers, network devices, storage devices
• Siemens Building Management System
20
GFS Crane DCIM Case Study 2: Telecom
Industry: Mobile Operator
Data Center Location: South Asia
Data Center Details: Multiple data centers spread across 4 locations, covering 8,500 sq. ft. of whitespace and housing 320 racks
DCIM Implementation Date: Ongoing
Business requirements driving DCIM implementation:
• Improve data center efficiency through better energy management
• Improve operational efficiency through better asset management, capacity planning and converged infrastructure monitoring capability
• Improve data center availability and meet business SLA through better monitoring, failure prediction and faster turnaround time
Integration Touch Points:
• Power systems: LT transformer panels, UPS, A/C & D/C PDUs and distribution panels, busbar panels, multifunction energy meters
• Environmental systems: PAC units, temperature and humidity probes
• Diesel generator, flow and level sensors
• IBM Netcool (ITSM), VESDA, ACS and IP surveillance
21
Thank You
http://www.greenfieldsoft.com
Email: sales@greenfieldsoft.com
See the other two in this series:
- The Modern Data Center Topology: The High Availability Mantra
- Data Center Infrastructure Management: ERP for the Data Center Manager

How DCIM Can Help Improve Availability & Reduce Costs

  • 2. 2 Today’s Topics • High Availability Mantra Revisited • Anatomy of a DCIM Software: GFS Crane • How GFS Crane DCIM Delivers Higher Availability • How GFS Crane DCIM Helps to Reduce Costs • GFS Crane DCIM Case Studies
  • 3. 3 The High Availability Mantra RevisitedThe High Availability Mantra Revisited Amazon Data Centers (built to Tier 4 standards and with an expected availability of 99.995%) had two outages in 2012 – each over 3 hours! • Tier 3/Tier 4 just defined by hardware redundancies • Glaring gaps in operating procedures to prevent fatal human errors • Lack of purpose-built BCP software to predict failures • Lack of chain of custody to detect root cause Amazon Data Centers (built to Tier 4 standards and with an expected availability of 99.995%) had two outages in 2012 – each over 3 hours! • Tier 3/Tier 4 just defined by hardware redundancies • Glaring gaps in operating procedures to prevent fatal human errors • Lack of purpose-built BCP software to predict failures • Lack of chain of custody to detect root cause Availability % Downtime per year Downtime per month* Downtime per week 99% ("two nines") 3.65 days 7.20 hours 1.68 hours 99.5% 1.83 days 3.60 hours 50.4 minutes 99.8% 17.52 hours 86.23 minutes 20.16 minutes 99.9% ("three nines") 8.76 hours 43.8 minutes 10.1 minutes 99.95% 4.38 hours 21.56 minutes 5.04 minutes 99.99% ("four nines") 52.56 minutes 4.32 minutes 1.01 minutes 99.999% ("five nines") 5.26 minutes 25.9 seconds 6.05 seconds 99.9999% ("six nines") 31.5 seconds 2.59 seconds 0.605 seconds 99.99999% ("seven nines") 3.15 seconds 0.259 seconds 0.0605 seconds
  • 4. 4 Did You Know? 90% of DC Failures Are From Common Preventable Causes90% of DC Failures Are From Common Preventable Causes
  • 5. 5 Did You Know? Average Failure of an Online System: 36 hours per annum. That’s only 99.6% Uptime Average Failure of an Online System: 36 hours per annum. That’s only 99.6% Uptime
  • 6. 6 Did You Know? 75% of Businesses Without a BC Plan Fail Within 3 Years after a Major Disruption in their IT Systems 75% of Businesses Without a BC Plan Fail Within 3 Years after a Major Disruption in their IT Systems
  • 7. 7 Anatomy of a DCIM Software: GFS Crane
  • 8. 8 Improves Availability: Predictability, Visibility & Change Tracking  Advanced Alarm Management and analytics helps in failure predictability, faster turn-around-time, improved availability and SLA  Consolidation of alarms from different facilities helps in centralized monitoring Improved visibility of the power chain and the relationships among critical components of the infrastructure helps in better impact analysis of device malfunction or failure and doing RCA  Change Tracking in the data center environment helps in doing impact analysis of any change and root cause analysis of any outage occurring due to a change Predictive Analytics Predictive Analytics Visibility from Power Chain Visibility from Power Chain Change TrackingChange Tracking
  • 9. 9 Improves Availability: Predictability from Proactive Alarms Proactive Real-time alarms  Alarms on power, PUE and environmental conditions like temperature, humidity, smoke, fire, WLD, door-open and motion  Alarms can be sent on e-mail & SMS Alarm Dashboard  Alarms from multiple data centers are consolidated on a dashboard  Analysis on alarms based on severity, type, source, duration etc. Advanced Alarm Management helps in failure predictability, faster turn-around-time, improved availability & SLA compliance
  • 10. 10 Improves Availability: Visibility from Power Chain Maps relationships among critical components of electrical infrastructure  Create power chain for electrical infrastructure  Map asset relationships and redundancies starting from power source to customers and applications Asset Relationship Mapping Improved visibility of the power chain and relationships among critical components of the infrastructure help in better impact analysis of device malfunction or failure and doing root cause analysis
  • 11. 11 Improves Availability: Change Tracking  Maintains an audit trail for all Installation/Move/Add/Change activity in the data center  Integration with existing ITSM tool enables running the tracked changes through a workflow system for change approvals Audit Trail of DC Configuration Changes Tracking changes in the data center environment helps in doing impact analysis of any change and root cause analysis of any outage occurring due to a change
  • 12. 12 Reduces Cost: Capex & Opex Better visibility helps discovering under-utilized computing capacities -> defers capex purchases Better visibility helps avoiding stranded capacities on rack space & power use: maximizes utilization of available capacities  Better monitoring & analytics reduces operating cost on power  Automation of processes like Asset Tracking, Provisioning & Monitoring improves productivity  Rationalizing asset base helps in lower maintenance costs like equipment AMC Reduces CapexReduces Capex Reduces OpexReduces Opex
  • 13. 13 Reduces CapEx: Monitoring IT Utilization Visibility of hidden compute capacity  Calculates the average utilization of all computing devices in the data center  Identifies the unused compute capacity Under-utilized servers can be repurposed  Based on power consumption & utilization patterns, hardware specs and age, ‘Repurpose Candidates’ are identified that helps in deferring new server hardware purchase Hidden Computing Capacity Repurpose Hardware Discovery of hidden compute capacity defers capital investment on new server hardware and software licenses
  • 14. Reduces Capex: Minimizing Stranded Capacities. Hidden power capacity: visibility of consumed power against maximum capacity in a rack; provides real-time information on the actual IT load in a rack, the maximum power capacity and the available power capacity. Hidden space capacity: visibility of occupied rack space against maximum available space; provides real-time information on occupied space in the rack in RU, the maximum space capacity and the available space capacity.
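A minimal sketch of the arithmetic behind this slide: available capacity is maximum capacity minus what is in use, for both power and rack units. The `Rack` class, the `stranded` helper and its thresholds are hypothetical illustrations, not part of the product. Here "stranded" is assumed to mean that one resource is still free while the other is exhausted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rack:
    name: str
    max_power_kw: float   # maximum power capacity of the rack
    it_load_kw: float     # actual IT load currently drawn
    max_space_ru: int     # maximum space in rack units (RU)
    occupied_ru: int      # rack units currently occupied

    @property
    def available_power_kw(self) -> float:
        return self.max_power_kw - self.it_load_kw

    @property
    def available_space_ru(self) -> int:
        return self.max_space_ru - self.occupied_ru


def stranded(rack: Rack, min_power_kw: float = 0.5, min_space_ru: int = 1) -> Optional[str]:
    """Report which resource is stranded, if any (thresholds are assumptions)."""
    has_power = rack.available_power_kw >= min_power_kw
    has_space = rack.available_space_ru >= min_space_ru
    if has_space and not has_power:
        return "space"   # free RU that cannot be powered
    if has_power and not has_space:
        return "power"   # power headroom with nowhere to mount equipment
    return None


r1 = Rack("R1", max_power_kw=5.0, it_load_kw=4.8, max_space_ru=42, occupied_ru=20)
r2 = Rack("R2", max_power_kw=5.0, it_load_kw=2.0, max_space_ru=42, occupied_ru=42)
r3 = Rack("R3", max_power_kw=5.0, it_load_kw=2.0, max_space_ru=42, occupied_ru=20)
```

In this sample, R1 has 22 RU free but almost no power headroom (stranded space), R2 has 3 kW of headroom but no free RU (stranded power), and R3 can still take new equipment.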
  • 15. Reduces Opex: Power Costs. Multi-level PUE comparison: compares PUE calculated at multiple levels and identifies power distribution losses that can be rectified to improve efficiency and reduce Opex on power. Detect power distribution loss: L1 PUE: UPS output; L2 PUE: PDU output; L3 PUE: device-level reading. Detecting power distribution losses in the electrical infrastructure helps improve the energy efficiency of the data center and reduce operating cost on power.
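The multi-level comparison can be sketched as follows, using the standard definition PUE = total facility power / IT load, with the IT load measured at the three points named on the slide. The function names and the sample readings are hypothetical, and the sketch assumes all four readings are taken over the same interval.

```python
def pue(total_facility_kw: float, it_load_kw: float) -> float:
    """PUE = total facility power / IT load."""
    if it_load_kw <= 0:
        raise ValueError("IT load must be positive")
    return total_facility_kw / it_load_kw


def multi_level_pue(total_facility_kw, ups_output_kw, pdu_output_kw, device_kw):
    """PUE with the IT load measured at UPS output (L1), PDU output (L2), device level (L3)."""
    return {
        "L1": pue(total_facility_kw, ups_output_kw),
        "L2": pue(total_facility_kw, pdu_output_kw),
        "L3": pue(total_facility_kw, device_kw),
    }


def distribution_losses(ups_output_kw, pdu_output_kw, device_kw):
    """Power lost between consecutive measurement points."""
    return {
        "ups_to_pdu_kw": ups_output_kw - pdu_output_kw,
        "pdu_to_device_kw": pdu_output_kw - device_kw,
    }


# Hypothetical readings in kW.
levels = multi_level_pue(1000.0, 500.0, 480.0, 400.0)
losses = distribution_losses(500.0, 480.0, 400.0)
```

Because the measured IT load shrinks as the measurement point moves closer to the devices, PUE rises from L1 to L3; the gap between adjacent levels points to the segment of the distribution path where the loss occurs.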
  • 16. Reduces Opex: Process Automation & Improved Productivity. Advanced asset management: automated discovery and inventory of both IT and infrastructure assets; intelligent assets are discovered automatically using SNMP/IPMI; the Manufacturer Repository holds the static attributes of assets; asset data can be imported from spreadsheets or an asset management tool; a single management console manages IT and non-IT assets; maintenance management for assets uses plug-ins that send scheduler-based proactive alerts; workflow-based auto-provisioning improves speed and reduces errors.
  • 17. Reduces Opex: Asset Rationalization. The Asset Management module tracks and maintains an inventory of all assets (IT and non-IT) in the data center; helps identify legacy servers and replacement candidates; reduces AMC and space rentals. Diagram labels: Asset Rationalization, Server Virtualization, Capacity Planning, Data Center Consolidation, GFS Crane DC DCIM, Legacy Data Center, Server & Rack Consolidation, Multiple Data Centers.
  • 18. How GFS Crane DCIM Helps: helps the data center manager avoid unnecessary over-provisioning; helps plan investments and new capacity; helps reduce capital costs; helps reduce power use and other operating costs; helps reduce the risk of failures through critical alerts; helps adapt to technical and business change more easily; supports improvement plans through real-time metrics and dashboards.
  • 19. GFS Crane DCIM Case Study 1: Financial Services. Industry: Project Financing & Mutual Funds. Data center location: India. Data center details: Tier III certified by 451 Research; energy-efficient 'green' data center certified by TÜV Rheinland. DCIM implementation date: January 2012. Business requirements driving DCIM implementation: improve energy efficiency through better energy management; comply with Green Grid recommendations and adopt best practices in data center operations; improve data center availability and meet business SLAs through better monitoring, failure prediction and faster turn-around time. Integration touch points: power systems (LT transformer panels, UPS, PDUs and distribution panels, busbar panels, multifunction energy meters); environmental systems (PAC units, temperature and humidity probes); servers, network devices and storage devices; Siemens Building Management System.
  • 20. GFS Crane DCIM Case Study 2: Telecom. Industry: Mobile Operator. Data center location: South Asia. Data center details: multiple data centers spread across 4 locations, covering 8,500 sq. ft. of whitespace and housing 320 racks. DCIM implementation date: ongoing. Business requirements driving DCIM implementation: improve data center efficiency through better energy management; improve operational efficiency through better asset management, capacity planning and converged infrastructure monitoring; improve data center availability and meet business SLAs through better monitoring, failure prediction and faster turn-around time. Integration touch points: power systems (LT transformer panels, UPS, AC and DC PDUs and distribution panels, busbar panels, multifunction energy meters); environmental systems (PAC units, temperature and humidity probes); diesel generator, flow and level sensors; IBM Netcool (ITSM), VESDA, ACS and IP surveillance.
  • 21. Thank You. http://www.greenfieldsoft.com Email: sales@greenfieldsoft.com See the other two presentations in this series: The Modern Data Center Topology: The High Availability Mantra; Data Center Infrastructure Management: ERP for the Data Center Manager.