Making MySQL highly available using Oracle Grid Infrastructure
The document discusses using Oracle Grid Infrastructure (GI) to make MySQL highly available. Key points:
- GI provides infrastructure like virtual IPs, storage, and monitoring to enable high availability of databases and applications.
- Custom scripts are used to integrate MySQL instances as GI resources and control their startup, shutdown, and monitoring.
- ACFS file systems provide shared storage for MySQL data directories across nodes.
- Resources like virtual IPs and ACFS file systems have dependencies defined to control startup order.
- Monitoring and control of MySQL instances is done through crsctl commands and the action scripts.
2. Who Am I
LAMP developer/DBA since 2000
- MySQL 3.22 !
Delphi/Oracle developer since 2003
- The first medical patient record system in production in Estonia
Oracle DBA since 2005
- Initially by accident, but started to like it
Working at Affecto since 2006
- The only DBA, sysadmin, etc. in Affecto Estonia
- Since 2007 working full-time as a DBA consultant for a large European online gambling provider
- Certifications: OCP-DBA; OCE-Grid,SQL; OCA-MySQL,AppSrv,PLSQL
3. What is Oracle Grid Infrastructure
Enables communication between a collection of servers so that they can act as a single unit
- Such a combination of servers is known as a cluster
- Also known as Oracle Clusterware
Provides necessary infrastructure for Oracle RAC
Manages resources:
- Virtual IP
- Disk groups, volumes, clustered file systems
- Databases
- Listeners
- etc
6. What are resources
Any database, application, process or VIP managed by Oracle Clusterware
- Cluster resource
- Local resource
Can have start and stop dependencies on other resources
Start dependencies
- Hard, weak, pullup, attraction, dispersion
Stop dependencies
- Hard
7. Resource attributes
Selection of attributes one can set for a resource
- ACTION_SCRIPT
- ACTIVE_PLACEMENT
- AUTO_START
- CHECK_INTERVAL
- ENABLED
- NAME
- PLACEMENT
- RESTART_ATTEMPTS
- SCRIPT_TIMEOUT
- SERVER_POOLS / HOSTING_MEMBERS
- TYPE
- START_DEPENDENCIES / STOP_DEPENDENCIES
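These attributes can be inspected and changed after a resource is created; a brief example with crsctl (the resource name is borrowed from the monitoring slide below):

# Show the full attribute list of an existing resource
crsctl status resource cms_mdb1_dev -f
# Change an attribute, for example the check interval
crsctl modify resource cms_mdb1_dev -attr "CHECK_INTERVAL=30"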
8. Action Script
User-supplied script to control the application
- Takes one command line argument: the action
Supported actions
- start
- stop
- check
- clean
- abort (not mandatory)
9. Action script (check exit codes)
Application status controlled by script exit codes
- 0 = OK
- 1 = Not OK
CHECK action has more exit codes
- 0 = ONLINE
- 1 = UNPLANNED OFFLINE
- 2 = PLANNED OFFLINE
- 3 = UNKNOWN
- 4 = PARTIAL
- 5 = FAILED
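To make the contract concrete, here is a minimal generic action-script skeleton; a sketch only, where start_my_app, stop_my_app, app_is_running and kill_my_app are hypothetical placeholders:

#!/bin/bash
# Minimal action script skeleton - clusterware passes the action as $1
case "$1" in
  start) start_my_app; exit $? ;;   # 0 = OK, 1 = not OK
  stop)  stop_my_app;  exit $? ;;
  check) if app_is_running; then
           exit 0                   # ONLINE
         else
           exit 1                   # UNPLANNED OFFLINE
         fi ;;
  clean) kill_my_app; exit 0 ;;
  abort) kill_my_app; exit 0 ;;     # optional action
esac
exit 3                              # UNKNOWN for unrecognized actions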
10. Who executes the scripts?
Action scripts are executed by scriptagent process
All executions are logged:
- $GRID_HOME/logs/<host>/agent/crsd/scriptagent_<user>/scriptagent_<user>.log
2014-02-17 09:43:47.658: [support_mdb1_qa][1087916352]{1:41897:56226} [check] Executing action script: /u02/app/mysql/support_mdb1_qa.scr[check]
2014-02-17 09:43:47.762: [support_mdb1_qa][1087916352]{1:41897:56226} [check] Mon Feb 17 09:43:47 CET 2014 Action script '/u02/app/mysql/support_mdb1_qa.scr' for resource[support_mdb1_qa] called for action check - MYSQL_SID=support_mdb1_qa
2014-02-17 09:43:47.762: [support_mdb1_qa][1087916352]{1:41897:56226} [check] Exit value: 0
11. Add resource
Optional: Add a resource type to specify some common attributes
crsctl add type mysql_instance -basetype cluster_resource -attr "ATTRIBUTE=CHECK_INTERVAL,TYPE=integer,DEFAULT_VALUE=10"
Create VIP resource
appvipcfg create -network=1 -ip=10.0.1.10 -vipname=mysql1_vip -user=root
Add application resource
crsctl add resource mysql1 -type mysql_instance -attr "ACTION_SCRIPT=/u02/app/mysql/mysql1.scr, PLACEMENT='restricted', SERVER_POOLS='pte_pool', START_DEPENDENCIES='hard(mysql1_vip,ora.swe1mc1acfs.acfstest1.acfs) pullup(mysql1_vip)', STOP_DEPENDENCIES='hard(mysql1_vip)'"
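Once added, the resource can be started and moved between nodes with standard crsctl commands, for example:

crsctl start resource mysql1
# -f also relocates resources with a hard dependency, such as the VIP
crsctl relocate resource mysql1 -f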
12. Monitor resource
crsctl status resource cms_mdb1_dev
NAME=cms_mdb1_dev
TYPE=mysql_instance
TARGET=ONLINE
STATE=ONLINE on jfadbmc1n03
14. Automatic Storage Management (ASM)
Oracle ASM is a cluster-aware volume manager and file system for Oracle Database
Uses disk groups to store files
A disk group is a collection of raw disk devices
- Disk group contents are striped over all disks within disk group
- Adding/removing disk space is done just by adding/removing disks online
- Redundancy configurable: external (none), normal (2x), high (3x)
Oracle DB uses disk groups directly
- 3rd party apps can use ASM through ADVM volumes and ACFS file system
15. ADVM & ACFS Architecture
[Diagram: an ASM disk group provides ADVM volumes; a volume hosts either ACFS or a 3rd-party file system; Oracle DB uses the disk groups directly]
16. Need kernel drivers!
Need to load kernel drivers to make ADVM volumes and ACFS file systems visible
Kernel drivers are dependent on the kernel version
- Distributed with GI; PSUs also update the drivers
If not started automatically, then
- $GRID_HOME/bin/acfsload start
lsmod | grep oracle
oracleacfs 1933804 45
oracleadvm 243697 51
oracleoks 420760 2 oracleacfs,oracleadvm
oracleasm 53663 1
17. ADVM – ASM Dynamic Volume Manager
Provides volume management services
Can be resized online
Visible as block devices on Linux
- ls -l /dev/asm/
brwxrwx--- 1 root dba 251, 232961 Oct 9 10:26 acfs1-455
brwxrwx--- 1 root dba 251, 112171 Feb 17 11:59 acfstest1-219
brwxrwx--- 1 root dba 251, 112139 Oct 9 10:26 swe1account-219
brwxrwx--- 1 root dba 251, 112151 Oct 9 10:25 swe1bank-219
brwxrwx--- 1 root dba 251, 112152 Oct 9 10:25 swe1cms-219
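ADVM volumes can be grown online from asmcmd; a hedged example, reusing the disk group and volume names from the demo later in this deck:

# Grow the volume to 20 GB online; for a volume that carries ACFS, resize
# through the file system instead (see the acfsutil size example below)
asmcmd volresize -G SWE1MC1ACFS -s 20G acfstest1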
18. ACFS – ASM Cluster File System
POSIX-compliant cluster file system
Can be resized online
BASIC use
- Free – see licensing slide
ADVANCED use – needs license
- Snapshots
- Tagging
- Replication
- Security & Encryption
- Auditing
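Online resizing, for example, is a one-liner; a sketch using the mount point from the demo later:

# Grow a mounted ACFS by 5 GB while it stays online;
# the underlying ADVM volume is resized along with it
acfsutil size +5G /instance/acfstest1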
19. Creating ADVM volume and ACFS
• asmcmd
ASMCMD [+] > ls
ACFS/
OCRDATA/
SWE1MC1ACFS/
• ASMCMD [+] > volcreate -G SWE1MC1ACFS -s 10G acfstest1
• ls -l /dev/asm/acfstest1*
brwxrwx--- 1 root dba 251, 112171 Feb 17 11:18 /dev/asm/acfstest1-219
• mkfs.acfs /dev/asm/acfstest1-219
mkfs.acfs: version = 11.2.0.4.0
mkfs.acfs: on-disk version = 39.0
mkfs.acfs: volume = /dev/asm/acfstest1-219
mkfs.acfs: volume size = 10737418240
mkfs.acfs: Format complete.
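At this point the file system could also be mounted by hand on one node for a quick test (a sketch; the next slide registers it with GI instead, which mounts it cluster-wide):

mkdir -p /instance/acfstest1
mount -t acfs /dev/asm/acfstest1-219 /instance/acfstest1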
20. Registering ACFS as a GI resource
srvctl add filesystem -d /dev/asm/acfstest1-219 -v acfstest1 -g SWE1MC1ACFS -m /instance/acfstest1
- Created resource: ora.swe1mc1acfs.acfstest1.acfs
- Can use this resource to create dependencies to other resources
srvctl start filesystem -d /dev/asm/acfstest1-219
- Mounts file system on all nodes
Registration via acfsutil is also possible, but then resource dependencies cannot be created
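The mount state can also be queried per volume:

srvctl status filesystem -d /dev/asm/acfstest1-219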
21. Monitoring ACFS
• crsctl status resource ora.swe1mc1acfs.acfstest1.acfs -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER        STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.swe1mc1acfs.acfstest1.acfs
               ONLINE  ONLINE       jfadbmc1n01   mounted on /instance/acfstest1
               ONLINE  ONLINE       jfadbmc1n02   mounted on /instance/acfstest1
               ONLINE  ONLINE       jfadbmc1n03   mounted on /instance/acfstest1
               ONLINE  ONLINE       jfadbmc1n04   mounted on /instance/acfstest1
22. Licensing
Oracle Clusterware for High Availability is included in Oracle Linux Basic and
Premier Support
Quote from “Oracle® Database Licensing Information 11g Release 2 (11.2)”
- Use of Oracle ADVM and Oracle ACFS's base functionality (i.e., excluding
the below-listed advanced functionality) is free for all data types, including
non-Oracle files.
- Use of Oracle ACFS's advanced functionality requires the Cloud File System
license. The advanced functionality consists of snapshots, replication,
tagging, realm-based security, encryption, and auditing.
- Oracle will provide support for ACFS/ADVM only if the server is running an
Oracle product, which may include Oracle Linux or Oracle Solaris, that is
also under Oracle support.
23. ACFS vs OCFS2
OCFS2 advantages
- OCFS2 is part of Linux kernel and open source
- OCFS2 is very easy to set up
- Does not require Oracle Grid Infrastructure
OCFS2 disadvantages
- When running with Oracle GI it still requires separate cluster setup
- In my experience unreliable – data corrupted multiple times during online resizing, plus crashes and global locks
25. Our MySQL environment
20+ different applications
- Maintained by different application teams
- Different requirements for the database – for example settings and versions
Each application only uses its own databases
Databases
- 80 in production, ~400 in test
26. Our old setup
3 master-master MySQL replication configurations
- Using MySQL built-in replication
One instance per server
27. Using Oracle GI for MySQL
Divided databases into business-logical instance “groupings”
- 13 in production, 38 in test
- Instances can run different RDBMS versions
- Each instance has dedicated VIP, clients use this VIP to connect
One ACFS file system per MySQL instance
- Mounted read-write on all cluster nodes
3 physical servers in production, 3 in test
28. Oracle Grid Infrastructure Standalone Agents - XAG
Oracle-provided and supported management scripts for
- GoldenGate
- Siebel
- Apache HTTP Server
- Apache Tomcat
- PeopleSoft
- MySQL
Using it for MySQL requires a MySQL Enterprise Edition subscription
Supported for MySQL 5.6+
Usage through agctl
29. XAG script for MySQL
Written in Perl
- agmysqlas.pm
Start
- mysqld_safe --basedir=$mysql_home --datadir=$datadir &
Stop
- mysqladmin shutdown
Check
- mysqladmin status
30. Custom scripts - My script suite
I wrote a suite of bash scripts to help manage MySQL under Oracle Clusterware
- https://github.com/ilmarkerm/mysql-with-oracle-clusterware-scripts
Tested with MySQL 5.1, 5.5 and 5.6, both Community and Enterprise
- Under RHEL/OEL 5 and 6
Important scripts in this suite:
- mysql_handler.sh
- Action script; must be symlinked to <instance name>.scr
- instances.sh
- Configuration, lists all configured instances
- init_mysql_db.sh and init_grid.sh
- Create new MySQL instance and add resources to GI
- mysql_logrotate.sh
- Rotates general/error/slow logs for all instances using logrotate
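One way to install the suite, assuming /u02/app/mysql as the target to match the directory layout on the next slide:

# Clone the suite (target directory must be empty or nonexistent)
git clone https://github.com/ilmarkerm/mysql-with-oracle-clusterware-scripts.git /u02/app/mysql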
32. Recommended directory structure
/instance/<instance name>/
- MySQL instance home directory containing:
- data/ - mysql data directory
- config/my.cnf – mysql instance configuration file
- logs/ - directory where mysql stores its log files
/u02/app/mysql/
- Containing the script suite
/u02/app/mysql/product/<version>/
- MySQL RDBMS binaries – use the TAR version
/root/.my.cnf
- Default MySQL root password
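A sketch of preparing this layout by hand (the tarball name is illustrative; init_mysql_db.sh on the next slide automates the instance initialization itself):

# Unpack the MySQL tar distribution into the products directory
mkdir -p /u02/app/mysql/product/5.6.16
tar xzf mysql-5.6.16-linux-glibc2.5-x86_64.tar.gz \
    -C /u02/app/mysql/product/5.6.16 --strip-components=1
# Instance home on its ACFS mount
mkdir -p /instance/acfstest1/{data,config,logs}
chown -R mysql:mysql /instance/acfstest1
# Keep the MySQL root password out of the command line
cat > /root/.my.cnf <<'EOF'
[client]
user=root
password=change_me
EOF
chmod 600 /root/.my.cnf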
33. init_mysql_db.sh
This script initializes the MySQL data directory and all cluster scripts.
The script must be run as user root.
Usage:
-i name
Instance name. The name should be 20 characters or less and only the characters a-z, A-Z, 0-9 and _ are allowed (the \w character class)
MySQL and instance settings:
-s directory
MySQL software location (where tar-edition is unpacked)
-b ipaddr
IP address to bind the instance
-a basedirectory
Instance base directory; this replaces options -d, -c and -l by setting the following values:
-d basedirectory/data, -c basedirectory/config/my.cnf, -l basedirectory/logs
34. init_mysql_db.sh (continued)
• ./init_mysql_db.sh -i acfstest1 -b 10.69.136.250 -s
/u02/app/mysql/product/5.6.16 -a /instance/acfstest1
Installing MySQL system tables...2014-02-17 14:05:21 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2014-02-17 14:05:21 20845 [Note] InnoDB: Using atomics to ref count buffer
pool pages
2014-02-17 14:05:21 20845 [Note] InnoDB: The InnoDB memory heap is disabled
…
Starting MySQL...
Creating 'clusterware'@'localhost' user...
Changing root password...
Stopping MySQL...
Updating cluster configuration /u02/app/mysql/instances.sh...
Symlinking actionscript acfstest1.scr...
35. init_grid.sh
• ./init_grid.sh -i acfstest1 -t mysql_instance -p Generic
Instance name=acfstest1
GRID_USER=grid
GRID_HOME=/u01/app/grid/product/11.2.0.4/grid
Virtual IP=10.69.136.250
Production Copyright 2007, 2008, Oracle. All rights reserved
…
Configuration done!
Use the following command to start the MySQL instance:
/u01/app/grid/product/11.2.0.4/grid/bin/crsctl start resource acfstest1
36. instances.sh
# This script is just a configuration file; it contains some general parameters
# and the configuration for all MySQL instances
# MySQL OS process owner
MYSQLUSER=mysql
# MYSQL_DB_USER / MYSQL_DB_PASSWORD - credentials used to log in to MySQL (for the check action)
MYSQL_DB_USER="clusterware"
MYSQL_DB_PASSWORD="cluster123"
LOCKDIR=/u02/app/mysql/locks
acfstest1=(
DATADIR="/instance/acfstest1/data"
CONFFILE="/instance/acfstest1/config/my.cnf"
BINDADDR="10.69.136.250"
SOFTWARE="/u02/app/mysql/product/5.6.16"
LOGDIR="/instance/acfstest1/logs"
)
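Each instance is described by a bash array of KEY=value strings named after the instance. A sketch of how a script could consume it; this is an assumption about the mechanism, not necessarily what mysql_handler.sh actually does:

source /u02/app/mysql/instances.sh
SID=acfstest1
# Indirectly expand the array named after the instance
eval "params=(\"\${${SID}[@]}\")"
for p in "${params[@]}"; do declare "$p"; done
echo "$DATADIR $BINDADDR"   # /instance/acfstest1/data 10.69.136.250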
37. mysql_handler.sh
Clusterware action script implementation
Must be symlinked to <instance name>.scr for each instance
- Only execute the symlinked scr file
Implemented actions:
- start – start mysqld (without mysqld_safe)
- stop – send TERM to mysqld
- check – executes “mysqladmin ping” through socket
- clean – if mysqld still alive send KILL
When executed without parameters, it opens the MySQL CLI through the socket (see the sketch below)
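A condensed sketch of that action logic, based only on the descriptions above; the socket path and binary location are assumptions, and the real implementation is in the GitHub repository:

#!/bin/bash
# Sketch of the handler behaviour described above (not the actual script)
SID=$(basename "$0" .scr)                 # instance name from the symlink
SOCKET="/instance/$SID/data/mysql.sock"   # assumed socket location
case "$1" in
  start) /u02/app/mysql/product/5.6.16/bin/mysqld \
           --defaults-file="/instance/$SID/config/my.cnf" --user=mysql & ;;
  stop)  pkill -TERM -f "defaults-file=/instance/$SID/config/my.cnf" ;;
  check) mysqladmin --socket="$SOCKET" ping >/dev/null 2>&1; exit $? ;;
  clean) pkill -KILL -f "defaults-file=/instance/$SID/config/my.cnf" ;;
  "")    mysql --socket="$SOCKET" ;;      # no argument: open the MySQL CLI
esac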
38. mysql_handler.sh (continued)
• ./acfstest1.scr check
Exit value: 0
• ./acfstest1.scr
Usage: ./mysql_handler.sh {start|stop|restart|check|clean}. All other
arguments execute MySQL Console.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 42
…
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql>
39. Additional HA developments
In order to address planned maintenance (some DDL in MySQL is still blocking)
- Build a traditional MySQL slave instance and start using MySQL Fabric
- While the primary database is under maintenance, the slave serves the client connections
The MySQL Fabric node (with its backing store) can also be set up as a clusterware resource
- Handles configuration and communication with the drivers
The MySQL driver is Fabric-aware
40. Contact info and links
https://github.com/ilmarkerm/mysql-with-oracle-clusterware-scripts
Email: ilmar.kerm@gmail.com
Twitter: https://twitter.com/ilmarkerm
LinkedIn: https://ee.linkedin.com/pub/ilmar-kerm/41/570/428/
Blog: https://ilmarkerm.eu
Editor's notes
What a typical Oracle GI setup looks like, with Oracle RAC
Public network – the network where clients access the resources over a VIP (virtual IP)
Interconnect – Private network dedicated for clusterware internal communication
Shared storage – Storage (SAN/NAS) that is accessible by all cluster nodes
Protection against hardware/OS failures
Each application has its own dedicated virtual IP (VIP) resource and action script for controlling that application
Application must bind itself only to its own VIP, not to all network interfaces
Oracle provides bundled agents for GoldenGate, Siebel, Apache HTTP server and Tomcat.
Cluster resource – fails over to other servers
Local resource – runs in each server in cluster, pinned to server, no failover
START dependencies:
Hard – another resource must be running before this one starts (on the same node by default)
Weak – if A has a weak dependency on B, then when starting A clusterware also attempts to start B, but failure to start B has no consequence for starting A
Attraction – clusterware prefers to place the resources on the same node
Pullup – if A has a pullup dependency on B, then A is automatically started whenever B starts (typically combined with a hard dependency)
Dispersion – clusterware prefers not to run the resources on the same node
STOP dependencies:
Hard – if resource A has a hard dependency on B, then A must be stopped when B is stopped
LOAD is only used when PLACEMENT=balanced
PLACEMENT=balanced, favored or restricted
START, STOP, CHECK, CLEAN are mandatory actions
START – bring resource online
STOP – gracefully bring down resource
CHECK – health of the resource
CLEAN – clean up resource after non-graceful operation
ABORT – called if any other actions hang
Clusterware acts differently for different CHECK exit codes – for example, when the application is PLANNED OFFLINE it will not be automatically restarted
Bold red – script output
There can be only one filesystem per ADVM volume, do not partition the ADVM volume
ACFS is supported on Linux (OEL, RHEL, SLES), Windows, Solaris and AIX.
ASMCA GUI also possible
ASMCMD commands
ls – displays diskgroups
volcreate – creates new ADVM volume, visible as block device under /dev/asm/
mkfs.acfs – formats ADVM volume as ACFS, must be run as root
Cannot add the file system to /etc/fstab, because GI must be started first; only then can the ACFS be mounted
Separate cluster – has its own rules for cluster membership, can reboot nodes
Advantages – Easy to switch servers
Problems – sometimes replication broke, and then it broke for everyone within that “cluster”; sometimes an app/developer modified the data on the wrong node; hard to upgrade the RDBMS version
Why?
High availability without resorting to data replication! Data replication can miss data – scary to Oracle DBAs
/u02 is clustered shared filesystem mounted on all nodes
Instance configuration file config/my.cnf cannot contain the following parameters (they are handled by the script suite):
datadir, bind_address, socket, pid-file, log-error, slow-query-log-file, general-log-file
Executes mysql_install_db to create the MySQL instance, adds the created instance's information to instances.sh and then executes init_grid.sh to create the GI resources