Running Operations is not an easy job, especially these days. Ops teams have to ensure excellent user experiences, resolve incidents quickly and help developers stay productive. Yet at the same time, there is also the need to maintain systems security and keep downtime to a minimum - goals which many struggle with at scale.
While advances in cloud computing have helped address some of these challenges, many organizations find it difficult to leverage the cloud at scale because of bottlenecks that form around repetitive tasks, such as developers having to wait for infrastructure to be provisioned. Despite access to abundant cloud resources, these speedbumps often make it difficult or impossible to achieve team objectives.
Join this talk to learn:
- How to safely delegate the management of your Azure deployment (to developers and other colleagues) with self-service operations.
- How to create powerful runbooks with guardrails that leverage existing scripting languages (including PowerShell), infrastructure, and tools to remove the human from the bottleneck that forms around repetitive tasks.
- Strategies for getting started.
- And how to create an Easy Button to handle the repetitive tasks that are interrupting your flow of work.
As presented by Jesse Houldsworth at PowerShell + DevOps Global Summit 2021
Automate Yourself Out of a Job: Safely Delegate the Management of your Azure Deployment
1. Automate Yourself Out of a Job:
Safely Delegate the Management of your
Azure Deployment
April 27, 2021
2. Speaker: Jesse Houldsworth
Jesse is a Senior Solutions Consultant at PagerDuty
where he helps customers save money by automating
runbooks. Jesse has over 10 years of experience working
with both large and small enterprises on software
development and information security initiatives.
Twitter: @jhoulds
3. Agenda
1 Status Quo for Operations
2 Case for Self-service Operations
3 How to create runbooks that leverage existing scripting languages (including
PowerShell)
4 Strategies for getting started
5 Demo
5. 2021 Prediction for the Cloud:
“Worldwide end-user spending on
public cloud services is forecast to
grow 18.4%”
- Gartner Predictions for 2021
6. Complexity is the New Normal
Visual representation of
mid-size public SaaS
services
7. Managing an Azure Deployment at Scale
Status Quo:
● Business users can’t access the Azure
management console
8. Managing an Azure Deployment at Scale
Status Quo:
● When a business user needs an Azure
resource spun up, they fill out a ticket
and assign it to the Ops / Cloud team
[Diagram labels: Biz User, Ops / Cloud Team]
9. Managing an Azure Deployment at Scale
Status Quo:
● Operations and Cloud teams are
inundated with manual requests from
other teams (provisioning Azure and cloud
resources, etc.)
● Interruptions prevent focus on high-value
work
10. Case for Self-Service
How can we reduce the burden on
Operations and Cloud teams and empower
developers?
11. Case for Self-Service
How can we reduce the burden on
Operations and Cloud teams and empower
developers?
Make this information available
at the click of a button!
12. Case for Self-Service
Self-Service Operations:
Give Developers and other teams the ability to
provision cloud resources and allow the
Operations / Cloud team to maintain a set of
standards and practices for accessing secure
internal operations.
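One way to picture the guardrail idea is a request check that the Ops / Cloud team defines once and that every self-service request passes through. The sketch below is illustrative only: the role names, the approved VM sizes and regions, and the `validate_request` helper are all hypothetical, and nothing here calls a real Azure API.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical guardrails maintained by the Ops / Cloud team (assumed values).
ALLOWED_ROLES = {"developer", "qa"}
ALLOWED_VM_SIZES = {"Standard_B1s", "Standard_B2s"}
ALLOWED_REGIONS = {"eastus", "westeurope"}


@dataclass(frozen=True)
class ProvisionRequest:
    requester_role: str
    vm_size: str
    region: str


def validate_request(req: ProvisionRequest) -> List[str]:
    """Return a list of guardrail violations; an empty list means approved."""
    problems = []
    if req.requester_role not in ALLOWED_ROLES:
        problems.append(f"role '{req.requester_role}' may not self-serve")
    if req.vm_size not in ALLOWED_VM_SIZES:
        problems.append(f"VM size '{req.vm_size}' is outside the approved list")
    if req.region not in ALLOWED_REGIONS:
        problems.append(f"region '{req.region}' is outside the approved list")
    return problems
```

A request that passes every check can be handed to whatever provisioning script the team already uses, so the standards stay with Ops while the button moves to the requester.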
13. Benefits of Self-service
Reducing TOIL activities
Save time and money
Reducing organizational silos
Leverage current tooling for automation
14. Getting Started
There’s no “one-size-fits-all” toolset for automation. Consider the following
challenges:
● Manual and repetitive activities
● Support across organizations
● Security-first approach
● Works with legacy tooling
15. Example Automation Tasks
[Chart: example tasks plotted by impact and sophistication]
Impact axis: No-Impact (non-change action with no performance impact) to High-Impact (change action that could break things or impact performance)
Sophistication axis: Simple (single-step with no options) to Sophisticated (multi-step, multi-node workflow with input options, dependencies, and conditionals)
Tasks shown: Healthchecks, Incident Enrichment, Diagnostics, Diagnostics (resource intensive), Simple Restart, Multi-Service Rolling Restart, Rollback and Redeploy, Failover, Fetch Logs, Performance Check, Emergency Firewall Change, Config Change, Emergency Database Change, Add/Remove Capacity, Multi-Step Restart
16. Example Automation Tasks
[Chart: example tasks plotted by impact and sophistication, with Crawl, Walk, and Run stages overlaid]
Impact axis: No-Impact (non-change action with no performance impact) to High-Impact (change action that could break things or impact performance)
Sophistication axis: Simple (single-step with no options) to Sophisticated (multi-step, multi-node workflow with input options, dependencies, and conditionals)
Stages: Crawl, Walk, Run
Tasks shown: Healthchecks, Incident Enrichment, Diagnostics, Diagnostics (resource intensive), Simple Restart, Multi-Step Restart, Multi-Service Rolling Restart, Rollback and Redeploy, Failover, Fetch Logs, Performance Check, Emergency Firewall Change, Config Change, Emergency Database Change, Add/Remove Capacity
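The two axes and the Crawl / Walk / Run labels suggest a simple way to order an automation backlog. The sketch below assumes a staging rule that the slides do not spell out: non-change, single-step tasks are "crawl", change actions that are also multi-step are "run", and everything else is "walk". The per-task flags are illustrative guesses, not positions read off the chart, and the `Task` and `stage` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    makes_change: bool  # True for a "High-Impact" change action
    multi_step: bool    # True for a "Sophisticated" multi-step workflow


def stage(task: Task) -> str:
    """Assumed staging rule (not stated in the slides)."""
    if not task.makes_change and not task.multi_step:
        return "crawl"
    if task.makes_change and task.multi_step:
        return "run"
    return "walk"


# Illustrative flags only; the chart positions are not recoverable here.
backlog = [
    Task("Fetch Logs", makes_change=False, multi_step=False),
    Task("Simple Restart", makes_change=True, multi_step=False),
    Task("Rollback and Redeploy", makes_change=True, multi_step=True),
]

for t in backlog:
    print(f"{t.name}: {stage(t)}")
```

Starting with the "crawl" group keeps early automation limited to read-only actions, which is one plausible reading of the progression the chart implies.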
18. Self-Service Runbook Automation
● Enable anyone to have
self-service access to
operations tasks that were
previously only available to
subject matter experts.
● Make existing automation
more secure, auditable, and
easier to run.
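To make the "auditable" point concrete, an existing runbook step can be wrapped so that every invocation records who ran it, with what arguments, and how it ended. This is a minimal sketch under assumptions: the `audited` decorator, the in-memory `AUDIT_LOG`, and the `restart_service` step are hypothetical, and a real setup would write to durable, access-controlled storage rather than a Python list.

```python
import functools
from datetime import datetime, timezone

AUDIT_LOG = []  # hypothetical in-memory audit trail


def audited(step):
    """Record each call to a runbook step, including failures."""
    @functools.wraps(step)
    def wrapper(user, *args, **kwargs):
        entry = {
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "step": step.__name__,
            "args": args,
            "kwargs": kwargs,
            "outcome": "started",
        }
        AUDIT_LOG.append(entry)
        try:
            result = step(user, *args, **kwargs)
        except Exception as exc:
            # Keep the failure in the trail, then let the caller see it.
            entry["outcome"] = f"failed: {exc}"
            raise
        entry["outcome"] = "succeeded"
        return result
    return wrapper


@audited
def restart_service(user, service_name):
    # Placeholder for the real restart logic (assumed).
    return f"{service_name} restarted"
```

Because the wrapper sits around the step rather than inside it, the underlying script can stay as it is while gaining a record of each run.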