Amazon Aurora MySQL - tips & tricks - configuration best practices
Aurora is a fully managed, MySQL-compatible database engine; there is also Amazon Aurora PostgreSQL, which is compatible with PostgreSQL. Both editions combine the speed and reliability of high-end commercial databases with the simplicity and cost-effectiveness of open-source databases. Under some workloads, Aurora can deliver up to five times the throughput of standard MySQL and up to three times that of standard PostgreSQL, without requiring changes to most existing applications.
Aurora uses distributed, fault-tolerant, self-healing storage. The underlying volume grows automatically as needed – up to 64 terabytes. Aurora also automates and standardizes cluster creation and database replication, which are usually among the most difficult aspects of database configuration and administration.
The biggest advantages of Aurora are:
- high performance – up to five times the throughput of reference MySQL. Aurora is engineered so that the database engine can take full advantage of the available compute, memory and network resources.
- automatic storage scaling – the service automatically grows the database volume as the amount of stored data increases.
- Amazon Aurora Serverless – a unique on-demand auto-scaling configuration of the Amazon Aurora service. With it, the database automatically starts up, shuts down, and scales capacity up or down according to the application's needs.
- fault-tolerant and self-healing – Aurora keeps six copies of your data across three Availability Zones. It transparently handles the loss of up to two copies without affecting write availability, and up to three copies without affecting read availability.
- highly secure – Aurora runs inside Amazon VPC, which lets you isolate the database in your own virtual network and connect it to on-premises IT infrastructure over standard, encrypted VPN connections. In addition, integration with AWS IAM makes it possible to control the actions of users and groups, data can be encrypted using AWS KMS, and activity can be monitored by sending audit logs to Amazon CloudWatch.
- great migration support – Amazon Aurora combines enterprise-class security, performance, high availability and durability with the low cost and simplicity of MySQL. This makes it a good migration target when moving databases from expensive commercial engines to AWS.
- built-in encryption – databases can be encrypted with keys that you create and control through the AWS Key Management Service (KMS). Data stored at rest, as well as snapshots, backups and replicas in the same cluster, remains encrypted, and Amazon Aurora uses SSL (AES-256) to secure data in transit. A minimal sketch of creating an encrypted cluster follows this list.
- cost-effective – costs are calculated at an hourly rate for each running instance, and an instance can simply be deleted when it is no longer needed. Details can be found in the service's pricing and regional availability pages.
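To make the encryption point concrete, here is a minimal sketch of creating an Aurora MySQL cluster with storage encryption enabled, using the boto3 RDS API. All identifiers, the KMS key ARN and the subnet group name are hypothetical placeholders, not values from this article.

```python
import boto3

rds = boto3.client("rds")

# Create an Aurora MySQL cluster with encryption at rest enabled.
# Every identifier and the key ARN below are placeholders.
rds.create_db_cluster(
    DBClusterIdentifier="aurora-demo-cluster",
    Engine="aurora-mysql",
    MasterUsername="admin",
    MasterUserPassword="change-me-immediately",
    StorageEncrypted=True,  # encrypt data at rest
    KmsKeyId="arn:aws:kms:eu-west-1:123456789012:key/EXAMPLE-KEY-ID",
    VpcSecurityGroupIds=["sg-0123456789abcdef0"],
    DBSubnetGroupName="my-db-subnet-group",
)
```

Snapshots, backups and replicas created from an encrypted cluster inherit the same encryption key.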
Now that Amazon Aurora is no longer a secret, it is worth digging into the best practices associated with it and its configuration.
Firstly, it’s worth remembering the Aurora parameter groups. There are two types of Aurora MySQL parameter groups: DB parameter groups and DB cluster parameter groups. Some parameters affect the configuration of the entire DB cluster, e.g. the binary log format, the time zone and the default character set; others limit their reach to a single DB instance. An important aspect is which parameters affect the behaviour, stability and functionality of the Aurora cluster, and which affect performance once modified. Remember that the default parameter groups are created automatically and cannot be edited: to change a modifiable parameter, create a custom parameter group and attach it to the cluster or instance, as in the sketch below. To explore the subject, see the following documents: Working with DB Parameter Groups and DB Cluster Parameter Groups and Aurora MySQL Parameters.
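A minimal boto3 sketch of this workflow: create a custom DB cluster parameter group and change a cluster-wide setting. The group name and the chosen time zone are placeholders.

```python
import boto3

rds = boto3.client("rds")

# Default parameter groups are read-only, so first create a custom
# DB cluster parameter group for the aurora-mysql5.7 family.
rds.create_db_cluster_parameter_group(
    DBClusterParameterGroupName="custom-aurora-cluster-params",
    DBParameterGroupFamily="aurora-mysql5.7",
    Description="Custom cluster-level settings",
)

# Change a cluster-wide parameter; time_zone applies to every
# instance in the cluster, not to a single DB instance.
rds.modify_db_cluster_parameter_group(
    DBClusterParameterGroupName="custom-aurora-cluster-params",
    Parameters=[
        {
            "ParameterName": "time_zone",
            "ParameterValue": "Europe/Warsaw",
            # pending-reboot is safe for both static and dynamic parameters
            "ApplyMethod": "pending-reboot",
        }
    ],
)
```

For the change to take effect, the group still has to be attached to the cluster (the DBClusterParameterGroupName argument of modify_db_cluster).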
Secondly, before we roll changes out to production, we obviously need to verify them in a test environment. Create a copy of the production environment, for example by restoring a snapshot of a production instance, so that the configuration is as close to production as possible; a sketch of this follows below. Remember to generate a load on the test instance that reproduces the production load. Then check the system against key performance indicators such as CPU utilization, the number of database connections, memory usage, cache hit rates, query speed and latency. Change only one parameter at a time so that its impact on the whole solution is unambiguous, and document which parameters had a negative effect and which improved the key performance indicators.
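One way to clone production is to snapshot the production cluster and restore the snapshot as a separate test cluster. A minimal sketch, in which all cluster, snapshot and instance identifiers are placeholders:

```python
import boto3

rds = boto3.client("rds")

# Snapshot the production cluster and wait until it is usable.
rds.create_db_cluster_snapshot(
    DBClusterSnapshotIdentifier="prod-baseline-snapshot",
    DBClusterIdentifier="aurora-prod-cluster",
)
rds.get_waiter("db_cluster_snapshot_available").wait(
    DBClusterSnapshotIdentifier="prod-baseline-snapshot"
)

# Restore the snapshot as an independent test cluster.
rds.restore_db_cluster_from_snapshot(
    DBClusterIdentifier="aurora-test-cluster",
    SnapshotIdentifier="prod-baseline-snapshot",
    Engine="aurora-mysql",
)

# A restored cluster has no instances yet; add one to run the load test against.
rds.create_db_instance(
    DBInstanceIdentifier="aurora-test-instance-1",
    DBInstanceClass="db.r5.large",
    Engine="aurora-mysql",
    DBClusterIdentifier="aurora-test-cluster",
)
```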
Thirdly, let’s remember the default parameter values and their meaning. Some DB instance parameters have default values expressed as variables or formulas rather than fixed constants: examples are values derived from the instance class and its memory, the network port of the instance, and the amount of memory allocated to internal buffers. It’s best to leave these unchanged, because they adjust automatically when the instance is scaled up or down.
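You can see which defaults are formula-based by listing a parameter group and looking for values written as expressions over instance variables, such as {DBInstanceClassMemory*3/4} for innodb_buffer_pool_size. A sketch against the default instance-level group:

```python
import boto3

rds = boto3.client("rds")

# Walk all parameters of the default instance-level parameter group
# and print those whose default is a formula rather than a constant.
paginator = rds.get_paginator("describe_db_parameters")
for page in paginator.paginate(DBParameterGroupName="default.aurora-mysql5.7"):
    for param in page["Parameters"]:
        value = param.get("ParameterValue", "")
        if "{" in value:  # formula-based default, e.g. {DBInstanceClassMemory*3/4}
            print(f"{param['ParameterName']} = {value}")
```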
Fourthly, we pay attention to the symptoms and diagnosis of incorrectly set parameter values. Incorrectly configured parameters can produce out-of-memory symptoms, which are recorded in the MySQL error log. In that case the instance restarts and generates event logs, and we may see the following message:
2018-12-29 19:05:16 UTC [-]MySQL has been crashing due to incompatible parameters. Please
check your memory parameters, in particular the max_connections, innodb_buffer_pool_size,
key_buffer_size, query_cache_size, tmp_table_size, innodb_additional_mem_pool_size and
innodb_log_buffer_size. Modify the memory parameter and reboot the instance.
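When you suspect this condition, you can pull the recent instance events and the tail of the MySQL error log with boto3. The instance identifier is a placeholder, and the log file name is an assumption based on the usual Aurora MySQL layout:

```python
import boto3

rds = boto3.client("rds")

# Recent instance events; restarts caused by memory pressure show up here.
events = rds.describe_events(
    SourceIdentifier="aurora-prod-instance-1",
    SourceType="db-instance",
    Duration=1440,  # last 24 hours, in minutes
)
for event in events["Events"]:
    print(event["Date"], event["Message"])

# Tail of the MySQL error log for the same instance.
log = rds.download_db_log_file_portion(
    DBInstanceIdentifier="aurora-prod-instance-1",
    LogFileName="error/mysql-error.log",
)
print(log["LogFileData"])
```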
Lastly – the classification of parameters matters. They can be divided into two groups:
- Parameters that control the behaviour and functionality of the database, but do not affect resource utilization or instance stability.
- Parameters that can affect performance by managing the allocation of instance resources, such as caches and internal memory buffers.
We recommend the post by Fabio Higa (Database Specialist Technical Account Manager at AWS), who lists examples of both classes of parameters and describes their impact on the database in a very accessible way.
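As a hypothetical illustration of the two classes, the sketch below changes one behavioural parameter (enabling the slow query log) and one resource-allocation parameter (max_connections, whose default is otherwise derived from the instance class memory) in a custom instance-level parameter group; the group name and values are placeholders.

```python
import boto3

rds = boto3.client("rds")

rds.modify_db_parameter_group(
    DBParameterGroupName="custom-aurora-instance-params",
    Parameters=[
        {
            # Behavioural: changes functionality, negligible resource cost.
            "ParameterName": "slow_query_log",
            "ParameterValue": "1",
            "ApplyMethod": "immediate",
        },
        {
            # Resource-related: directly affects memory use and stability.
            "ParameterName": "max_connections",
            "ParameterValue": "2000",
            "ApplyMethod": "pending-reboot",
        },
    ],
)
```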
Implementing solutions based on Amazon Aurora pays off in many respects, above all in cost optimization, performance and scalability. As for configuration best practices, choosing the right parameters is essential. To arrive at a configuration that translates into a measurable performance increase, it is good practice to experiment: establish a baseline and compare the results after each change. It is worth following this scenario and testing every change before rolling it out to production.