Master-replica size difference
You can view the database size difference by logging into MySQL Workbench, connecting to Master DB instance ‘primary-db’ and replica instance ‘replica-db’ and selecting the "info" option next to the database name for both DB instances.
Case replication on the local environment and observations below:
First, I created a MySQL DB ‘Master’ instance with engine version 5.5.53 and added dummy data into the database ‘test’ for testing purposes. Upon connecting to the ‘Master’ DB instance from MySQL workbench and selecting ‘Schema Details’, I received a Database size of 8.1 GB.
Then I created a Read Replica ‘Replica’ for my ‘Master’ DB instance, and upon connecting to the ‘Replica’ DB instance from MySQL workbench and selecting ‘Schema Details,’ I received a Database size of 8.1 GB.
Finally, I upgraded my ‘Replica’ instance to engine version 5.7.28. Upon connecting to the ‘Replica’ DB instance from MySQL workbench and selecting ‘Schema Details,’ I still received a Database size of 8.1 GB.
In summary, I did not receive a database size difference between my ‘Master’ DB instance running on MySQL version 5.5.53 and the ‘Replica’ DB instance running on MySQL version 5.7.28.
Having said that, in our case, there could be a difference in a specific table size between your Master and Replica instance that is causing an overall difference in database size. A discrepancy in table size between the master and replica could be due to several possible reasons, such as:
- Tables not optimized.
- Differences in innodb_file_per_table. - The replica is writable.
- Page splits due to INSERTS/UPDATES.
If you cannot verify that the table statistics are current on the master and replica, than run the below queries on both the master and replica instances and troubleshoot further:
=>SELECT table_schema AS "Database",
ROUND(SUM(data_length + index_length) / 1024 / 1024, 2) AS "Size (MB)" FROM information_schema.TABLES
GROUP BY table_schema;
=>SELECT table_name AS "Table",
ROUND(((data_length + index_length) / 1024 / 1024), 2) AS "Size (MB)" FROM information_schema.TABLES
WHERE table_schema = “your_database_name”
ORDER BY (data_length + index_length) DESC;
If you do find tables with different sizes on master and replica, run and check the output for the following query as well:
=>SHOW TABLE STATUS FROM schema_name WHERE NAME=‘table_name’;
https://dev.mysql.com/doc/refman/5.5/en/show-table-status.html
You may also refer to the below third party blog for more information on similar scenarios:
https://dba.stackexchange.com/questions/118368/mysql-replication-database-size-is-different
Performance and optimizations
1. SQL Azure keeps track of performance statistics for every query plan in the cache. You can view these statistics for your database queries using the dynamic managed view (DMV) sys.dm_exec_query_stats like so:
SELECT * FROM sys.dm_exec_query_stats
For a more details example, see this blog post, I showed how to find queries with poor I/O performance by querying sys.dm_exec_query_stats.
sys.dm_exec_query_stats only reports statistics for the primary replica for your SQL Azure database. If a secondary is promoted to the primary, the results
from sys.dm_exec_query_stats might be much different, just seconds later. Queries in the cache might suddenly not be there, or execution counts could be smaller, typically you would see them grow over time.
https://azure.microsoft.com/es-es/blog/understanding-the-procedure-cache-on-sql-azure/
2. As tables go through many INSERT/UPDATE/DELETE cycles, MySQL doesn't reclaim the space from deleted rows; instead, it internally flags the row as 'deleted' and leaves it in the tablespace. A 1G size difference is not uncommon and shouldn't cause alarm.
You can try and optimize the table command on master and verify the table size gets reduced. It is a good practice to optimize table that does a lot of DELETE operations routinely.
EC2-DB replica queries
SELECT table_schema, ROUND(SUM(data_length+index_length)/1024/1024/1024,2) "size in GB" FROM information_schema.tables GROUP BY 1 ORDER BY 2 DESC;
SELECT table_schema AS "DB_NAME", SUM(size) "DB_SIZE", SUM(fragmented_space) APPROXIMATED_FRAGMENTED_SPACE_GB FROM (SELECT table_schema, table_name, ROUND((data_length+index_length+data_free)/1024/1024/1024,2) AS size, ROUND((data_length - (AVG_ROW_LENGTH*TABLE_ROWS))/1024/1024/1024,2) AS fragmented_space FROM information_schema.tables WHERE table_type='BASE TABLE' AND table_schema NOT IN ('performance_schema', 'mysql', 'information_schema') ) AS TEMP GROUP BY DB_NAME ORDER BY APPROXIMATED_FRAGMENTED_SPACE_GB DESC;
SELECT file_name, tablespace_name, table_name, engine, index_length, total_extents, extent_size from information_schema.files WHERE file_name LIKE '%ibtmp%';
SELECT * FROM information_schema.innodb_sys_tables WHERE name LIKE '%#%';
SELECT file_name, tablespace_name, table_name, engine, index_length, total_extents, extent_size from information_schema.files WHERE file_name LIKE '%ibdata%';
https://aws.amazon.com/premiumsupport/knowledge-center/view-storage-rds-mysql-mariadb/