* RDS snapshots are EBS snapshots stored in S3
* When you spin up a new replica, its EBS volume loads lazily in the background
* This results in first-touch penalty (when you query any data, it takes longer to retrieve it for the first time)
* Suggestions:
* If DB is small, run “SELECT * FROM
” query on each table on the replica
* Initiate a full table scan with VACUUM ANALYZE (in PostgreSQL)
* Another reason could be an empty buffer pool (cache for table and index data)
* Vertical Scaling (Scaling up)
* Single-AZ instance will be unavailable during scaling op
* Multi-AZ setup offers minimal downtime during scaling op – standby DB gets upgraded first and then primary will failover to the upgraded instance
* Horizontal Scaling (Scaling out)
* Useful for read-heavy workloads
* Use read-replicas
* Replicas also act as a DR target
* Sharding = horizontal partitioning
* Split and distribute data across
multiple DBs (called shards)
* Mapping / routing logic
maintained at application tier
* Offers additional fault tolerance
(since no single point of failure)
* If any shard goes through failover,
other shards are not impacted
RDS Monitoring - common metrics, native logs, manual monitoring tools
* Common metrics
* CPU, RAM, disk space consumption /
Network traffic / DB connections / IOPS
metrics
* Native logs / extensions
* e.g. pgaudit extension in PostgreSQL for
auditing (DML / DCL / DDL etc)
* Manual Monitoring Tools
* RDS console (DB connections, R/W ops,
storage consumption, memory
utilization, N/W traffic)
* AWS Trusted Advisor (cost optimization,
security, fault tolerance, performance
improvement checks)
* CloudWatch (service health status etc.)
RDS Monitoring - Automated Monitoring Tools
* RDS event notifications
* Database logs (can be exported to
CloudWatch Logs)
* CloudWatch (Metrics / Alarms / Logs)
* Enhanced Monitoring (real-time)
* Performance Insights
* RDS Recommendations
* CloudTrail (captures all RDS API calls, can
be viewed in CloudTrail console or
delivered to an S3 bucket)
* Up to 90 days of your account activity can
be viewed in CloudTrail console (can create
a trail to deliver the audit logs to S3)
RDS Notifications / Event subscriptions
* Available within the RDS console
* Allows you to create CloudWatch alarms to notify you
whenever certain metric data crosses a threshold
* You can send alarm notifications to an SNS topic (email
/ SMS)
* You can also subscribe to RDS events
* Event sources can be snapshots, instances, security
groups, parameter groups, clusters, cluster snapshots,
etc.
* Events like DB instance creation, deletion, availability
(shutdown / restart), backup, recovery, failover, failure,
backtrack, config change etc.
* Periodic automated suggestions for DB instances, read replicas, and DB parameter
groups
* View / watch / download DB logs from the RDS console
* Can export logs to CloudWatch Logs (log types vary by DB engine)
* CloudWatch Logs never expire. To expire them, set log group retention policy (1 day - 10 yrs)
* Logs are accessible from RDS console even if you disable log export to CloudWatch Logs
RDS Logs - Log types that can be exported to
CloudWatch Logs - topics
* Alert log – Oracle
* Audit log – Oracle, MariaDB, MySQL (must use option group with MARIADB_AUDIT_PLUGIN option for MariaDB and MySQL to audit database activity)
* Listener log – Oracle
* Trace log – Oracle
* Error log – SQL Server, MariaDB, MySQL
* Postgresql log – PostgreSQL (contains audit
logs)
* Upgrade log – PostgreSQL
* General log – MariaDB, MySQL
* Slow query log – MariaDB, MySQL
Exporting AWS RDS logs to S3
* RDS database log files can be accessed
via RDS console, CLI or API
* Transaction logs cannot be accessed
* You can export log data from
CloudWatch Logs to S3 by creating an
export task in CloudWatch (createexport-task CLI command)
* Log files can also be downloaded using
the RDS API and uploaded to S3 (using
Lambda or AWS SDK)
* To analyze real-time OS level metrics
(CPU / memory usage etc.)
* To monitor different processes or
threads that are using the CPU
* Helps identify performance issues
* Increased granularity of 1 to 60
seconds * 1, 5, 10, 15, 30, or 60 seconds
* Requires an agent to be installed on the
DB server to collect metrics
* Offers visual dashboard for performance
tuning, analysis and monitoring
* Monitors DB load for the instance (if the
instance has multiple DBs, you’ll see
aggregated metrics)
* DB load – average number of active
sessions (AAS – average active sessions)
* Performance problems will appear as
spikes in the DB load graph
* Helps identify performance bottlenecks,
expensive SQL statements, etc.
RDS Performance Insights - DB load
* You can visualize the DB load,
filter it by waits / SQL / hosts /
users
* Waits - wait state for CPU, IO, Lock
etc.
* SQL – SQL statements
* Hosts
* Users
* Identify slow queries (top SQL),
locks
CloudWatch Application Insights
* For .NET and SQL Server
* Also supports DynamoDB tables
* Identifies and sets up key metrics, logs, and alarms for SQL Server workloads
* Uses CloudWatch events and alarms
* Useful for problem detection, notification and troubleshooting
* Lets you deploy RDS DBs in on-premises VMware environments (VMware vSphere)
* Same user interface as in AWS
* Supports MySQL, PostgreSQL, and SQL Server
* Fully managed DBs
* Uses health monitoring to detect unhealthy
database instances and automatically recovers them
* Support manual and automatic backups with PITR
* Can use CloudWatch for monitoring
RDS – Good things to know
* Read replica can be made writable (for MySQL/MariaDB)
* For other engines
* read replica cannot be made writable, but you can promote it to make it writable
* For Oracle and SQL Server
* Automatic backups or manual snapshots are not supported on the replica
* For Oracle
* Does not yet support Oracle RAC (a cluster DB with a shared cache architecture)
* For SQL Server
* Supports both Multi-AZ options – Database Mirroring and Always On
* For PostgreSQL
* Only manual snapshots are supported on the replica (no automatic backups)
* Set log retention with parameter rds.log_retention_period
* For MySQL/MariaDB
* Set log retention with stored procedures * e.g. call mysql.rds_set_configuration('binlog reten