Thursday 17 October 2019

All those Logs in MySQL

If you are a MySQL person, this might be a silly topic for you. But there are a lot of people out there who are still confused about binlogs, relay logs, transaction logs and so on. So, this is intended for them!
Below are the logs created by MySQL, with a brief explanation (a sample my.cnf sketch showing how most of them are enabled follows the list):
  • Error Log : This is one of the most important logs. When enabled using the startup option --log-error or the variable log_error in the config file, MySQL writes whatever important is happening (startup/shutdown messages, errors, warnings) to this log file. If these options are not given, the log entries are written to the console and not to a file.

  • Transaction Logs : I’ve seen people familiar with other RDBMSs asking if there is a transaction log in MySQL. Also, some people confuse the binary log with the transaction log. Well, that’s a half truth! Of course MySQL does have transaction logs, and they are known as the InnoDB log files (ib_logfiles). By default 2 files are created, ib_logfile0 and ib_logfile1, and they are written/overwritten in a circular fashion. These transaction logs are also known as redo logs and they play a crucial role in MySQL’s operation. More details on this can be found here: https://dev.mysql.com/doc/refman/5.7/en/innodb-init-startup-configuration.html#innodb-startup-log-file-configuration

  • Binary Logs : These logs, when enabled with the log_bin option, contain all the changes committed to the database. They play a crucial role in point-in-time recovery and replication.

  • Relay Logs : These logs are specific to a slave server. When an instance is configured as a slave to a master server, the I/O thread reads the master’s binlog and creates the slave’s local version of it in the form of relay logs. They are typically a replica of the master’s binlog; however, transactions might appear in a different order. Please note that relay logs are flushed whenever the I/O position is manually reset or the slave is re-configured.

  • General Log : This log is disabled by default and, when enabled, contains almost everything that MySQL is doing. It logs all the connections and their activities - everything. Though it’s useful, its size might become a serious issue. Generally, it’s enabled for a particular time frame for audit purposes; otherwise, it’s better left disabled.

  • Slow Query Log : As the name indicates, when configured, it logs all queries running beyond the specified time (long_query_time). It is helpful for keeping track of long-running queries.

  • Audit Log : This is not a common log. It can be generated only if the Enterprise Audit Plugin is enabled. When enabled, it logs all the connections and queries, which can be filtered based on users, databases, source, etc. This is one of the most useful logs for auditing.
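
To make this concrete, here is a rough my.cnf sketch showing how most of these logs are typically enabled. The paths and thresholds are just example values, so adjust them for your own server:

    # my.cnf - example log settings (paths/values are placeholders)
    [mysqld]
    log_error           = /var/log/mysql/error.log   # error log
    log_bin             = /var/lib/mysql/mysql-bin   # binary logs
    server_id           = 1                          # required when the binlog is enabled
    general_log         = 0                          # general log: keep it off normally
    general_log_file    = /var/log/mysql/general.log
    slow_query_log      = 1                          # slow query log
    slow_query_log_file = /var/log/mysql/slow.log
    long_query_time     = 2                          # seconds; queries slower than this get logged

The relay logs and the InnoDB redo logs don't need to be enabled this way: relay logs appear automatically on a slave, and the redo logs always exist.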

Slave with GTID – How to use slave_skip_errors ?

Okay, here’s a challenge. Assume that you have a Master-Slave setup. Both are GTID enabled and the slave has auto_position set to 1, which means it’s using GTID positions to stay in sync with the master.

Now, suppose someone executes some insert queries directly on the slave server, into one of the tables with an auto_increment key column. And so, within the next few seconds, the slave breaks down with the Duplicate Entry (1062) error. How do you fix it?
Since you’re smart, you inject some empty transactions to skip those GTID positions. Fair.
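
For a single offending transaction, injecting an empty transaction looks roughly like this - the UUID and sequence number below are just placeholders for the GTID reported in the slave's error:

    -- on the slave: "use up" the problematic GTID with an empty transaction
    STOP SLAVE;
    SET GTID_NEXT = 'aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee:1234';  -- placeholder GTID
    BEGIN;
    COMMIT;                        -- the empty transaction consumes that GTID
    SET GTID_NEXT = 'AUTOMATIC';
    START SLAVE;
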
But what if there are hundreds of thousands of inserts done directly and you can’t keep skipping them manually? And, unfortunately, there’s no alternative to ‘sql_slave_skip_counter=1000..’ in GTID mode. Also, you can’t use slave-skip-errors=1062 in the config file in GTID mode. So, what’s the solution?

It’s pretty simple. Add ‘slave-skip-errors=1062’ to the config file and restart MySQL. Wait.. but will that work in GTID mode? Oh yes, it works - but not just yet.
Once MySQL is up, do RESET SLAVE ALL. But before that, take a note of everything inside the master.info file (or the corresponding table).

And now is the key point: execute the CHANGE MASTER command, but without the auto position option. Use the master_log_file and master_log_pos options instead. Start the slave. This should work without any errors. Once it’s in sync, you can enable the auto_position option again.
You can do the same for any other replication errors. But be cautious - do this only if you’re aware of the reasons for the inconsistency.
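
Putting the whole workaround together, here is a rough sketch. The host, user, log file name and position are placeholders - take the real values from the slave's status before you run RESET SLAVE ALL:

    # my.cnf on the slave (needs a restart)
    [mysqld]
    slave-skip-errors = 1062

    -- after the restart, on the slave
    RESET SLAVE ALL;
    CHANGE MASTER TO
        MASTER_HOST     = 'master.example.com',   -- placeholder
        MASTER_USER     = 'repl',                 -- placeholder
        MASTER_PASSWORD = '***',
        MASTER_LOG_FILE = 'mysql-bin.000123',     -- placeholder
        MASTER_LOG_POS  = 456789;                 -- placeholder
    START SLAVE;

    -- once the slave has caught up, switch back to GTID auto-positioning
    STOP SLAVE;
    CHANGE MASTER TO MASTER_AUTO_POSITION = 1;
    START SLAVE;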

Monday 2 January 2017

Announcement: New Blog !!

Okay, here comes an announcement for the first time!!

I've got a new blog and am moving all this stuff over there.
So from now on, refer to the link below:

https://databasically.wordpress.com/

Tuesday 20 September 2016

MySQL: Architecture..!

Ah..! That's a very basic topic and I'm writing about it only now..!! That's because it never lasts in my memory for a long time. It always rolls back from my mind, maybe because I never felt its importance - apart from interviews!
However, after a brief Googling, here is an overview of MySQL's architecture in my words.

Unlike MS SQL Server or other RDBMSs, MySQL doesn't have a standard architecture diagram that we can refer to. Or maybe I just couldn't find one on the official MySQL site.

Anyhow, I was able to find one of the known logical architecture diagrams which describes MySQL well and is easy to understand.

It describes a 3-layer logical architecture.

The topmost layer, which faces the incoming connections/clients, includes connection handling, authentication and other security-related things.

The second layer is the most important one, which includes Parser, Query Cache and Optimizer.
The parser creates a parse tree for every query executed by a client. It uses the Lex and Yacc compiler tools (What are they? That's another big story.. maybe in a later post..!?!)
Then the optimizer tries to optimize this parse tree by applying different execution plans or by rewriting the query itself.
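
A quick way to peek at what the optimizer decided for a given query is EXPLAIN. The table and column names below are made up, just for illustration:

    -- shows the plan the optimizer chose: access type, chosen index, estimated rows, etc.
    EXPLAIN SELECT *
    FROM   employees
    WHERE  emp_no = 10001;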

This optimized parse tree is then passed on to the third layer, which comprises the storage engines. A storage engine just returns the requested data and doesn't do any optimization of the query.

Meanwhile, the result sent back can be stored in the query cache to serve repeated queries.
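
On a 5.x server you can check whether the query cache is enabled and how it's being used with something like this (note that the query cache was removed altogether in MySQL 8.0):

    -- query cache configuration (query_cache_type, query_cache_size, ...)
    SHOW VARIABLES LIKE 'query_cache%';

    -- runtime counters: hits, inserts, memory in use, ...
    SHOW GLOBAL STATUS LIKE 'Qcache%';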

That's the briefest MySQL architecture overview, as I understood it! Hope that gives an idea of how MySQL works.





Wednesday 7 September 2016

MySQL: High-Availability - Replication


MySQL, being one of the best open-source lightweight databases, comes with a built-in data replication feature. In fact, this feature has made MySQL more and more popular in the data world.

Replication in MySQL means the flow of data from one server to another remote server over the network, so that both of them stay identical to each other. It’s basically having an exact copy of the data at a remote place, with that copy being updated dynamically with respect to the source.
MySQL replication involves the following terminology:
  •         Master – The source server from where the data is replicated. Applications are usually pointed to this server for write operations.

  •        Slave – This is the destination for replication. The slave is the actual remote mirror copy of the master server.

o   I/O Thread – This is one of the basic components of replication. This thread is responsible for reading the master’s binary log content and writing it into the slave’s relay log.
The performance of this thread is a key factor in replication speed, and it depends largely on the network bandwidth.
o   SQL Thread – This is the other important replication component, which is responsible for reading the slave’s relay logs and executing them against the slave’s databases. This is usually slower than the I/O thread, and it largely decides the delay in replication (the replication lag). Both threads can be checked with SHOW SLAVE STATUS, as shown in the sketch after this list.


  •        Binary Logs – These binary files log every committed change on the master. Every modification on the master server is written to these files, which are used by the slaves for replication. They are also used for recovery purposes.

  •         Relay Logs – These are the slave’s version of the binary logs. They are generated only on slave servers.
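
Both threads and the replication lag can be checked on the slave with SHOW SLAVE STATUS; a rough sketch of the fields usually watched:

    -- run on the slave
    SHOW SLAVE STATUS\G
    -- fields of interest:
    --   Slave_IO_Running:      state of the I/O thread (reads the master's binlog)
    --   Slave_SQL_Running:     state of the SQL thread (applies the relay log)
    --   Seconds_Behind_Master: a rough measure of the replication lag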



Below is the pictorial representation of MySQL Replication with Master-Slave Hierarchy :

Please note that there can be multiple slaves connected to a single master. However, that normally doesn’t affect the master’s performance much.
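
For reference, a minimal sketch of setting up classic (binlog file/position based) replication - the host, user, password, file name and position are placeholders:

    # my.cnf on the master: unique server_id and binary logging enabled
    [mysqld]
    server_id = 1
    log_bin   = mysql-bin

    # my.cnf on the slave: just needs its own unique server_id
    [mysqld]
    server_id = 2

    -- on the master: a user the slave will connect as (name/password are placeholders)
    CREATE USER 'repl'@'%' IDENTIFIED BY 'secret';
    GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%';

    -- on the slave: point it at the master and start replicating
    CHANGE MASTER TO
        MASTER_HOST     = 'master.example.com',   -- placeholder
        MASTER_USER     = 'repl',
        MASTER_PASSWORD = 'secret',
        MASTER_LOG_FILE = 'mysql-bin.000001',     -- placeholder
        MASTER_LOG_POS  = 154;                    -- placeholder
    START SLAVE;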



MySQL supports 3 types of replication:
  •         Asynchronous Replication

This is the MySQL default; any flavor of MySQL supports this type of replication. Asynchronous replication is the opposite of synchronous replication.
Here, the master writes the committed transactions to its binary logs and does NOT wait for the slave to read this data.
So, while using this type, we can’t be 100% sure that all the data from the master has been replicated to the slave!


  •         Semi-synchronous Replication

MySQL 5.5 and above support this type of replication. In this type, the master writes the committed transactions to the binary logs and waits for an acknowledgement from at least one of the slaves that the replicated data has been received.
If no slave returns an acknowledgement within the timeout, the master switches back to asynchronous mode until at least one of the slaves is synchronized again.
Using this type of replication keeps replication up-to-date, but it slows down commit performance (a sketch of enabling it is shown after this list).


  •         Synchronous Replication


This type of replication is not available in the standard version of MySQL; it is a part of MySQL Cluster.
In this type, the master waits until all the slaves receive and commit the data. This is much costlier w.r.t. performance.
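
Coming back to semi-synchronous replication: it ships as a pair of plugins, and enabling it looks roughly like this (the timeout value is just an example, in milliseconds):

    -- on the master
    INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
    SET GLOBAL rpl_semi_sync_master_enabled = 1;
    SET GLOBAL rpl_semi_sync_master_timeout = 10000;   -- ms to wait before falling back to async

    -- on each slave
    INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
    SET GLOBAL rpl_semi_sync_slave_enabled = 1;
    -- restart the I/O thread so the slave registers as semi-sync capable
    STOP SLAVE IO_THREAD;
    START SLAVE IO_THREAD;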

Please note that MySQL replication is not a complete high-availability solution, because if the master goes down, the slave has to be promoted to the new master manually.
This means downtime until that manual process completes!

Friday 8 January 2016

MySQL: Innodb Log Buffer and REDO Logs


Actually, the definition of the system variable innodb_log_buffer_size was confusing me a lot.
What is this buffer actually meant for?
Is it a part of the InnoDB buffer pool?

And after extensive googling(!) I got a clear picture of it. 
Here's what I got:

No doubt, REDO logs are the unsung heroes. But how exactly are they generated?

When a DML request comes in, the storage engine reads the corresponding page from disk into memory (the InnoDB buffer pool) and modifies it there.
So, there are now 2 copies of this page: one on disk and the other in memory. The page in memory, which is not yet flushed to disk, is called a dirty page.

For this dirty page, a redo log record will be generated locally, in the mini-transaction buffer - which is a very internal thing and not something we need to worry about!
This record is then transferred to the global redo log buffer, and the size of this redo log buffer is decided by the variable innodb_log_buffer_size! This is a separate memory area taken from the host, not part of the buffer pool.

So, the redo log records in this buffer are written/flushed to the redo log files on disk (the ib_logfiles) as soon as the buffer gets full or a commit happens.

If we're running long transactions, it's recommended to increase the size of this buffer to avoid unnecessary I/O.
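
A rough way to check whether this buffer is too small is the Innodb_log_waits status counter - it counts how many times a log write had to wait because the buffer was full. The size itself is set in my.cnf (the 64M below is just an example value):

    -- if this counter keeps growing, the log buffer is probably too small
    SHOW GLOBAL STATUS LIKE 'Innodb_log_waits';

    # my.cnf (needs a restart on 5.x; the value is just an example)
    [mysqld]
    innodb_log_buffer_size = 64M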

The redo logs will be flushed to the disk before the corresponding dirty pages are flushed.

For a clearer insight into how InnoDB redo logging works, go through the link below:
https://blogs.oracle.com/mysqlinnodb/entry/redo_logging_in_innodb

Wednesday 6 January 2016

MySQL: REDO Logs and UNDO Logs


Most DBAs, like me, haven't cared much about them..! But they are the unsung heroes of MySQL..!!

Yes, UNDO and REDO logs are 2 different kinds of logs, each with its own purpose, and you shouldn't get them confused..!!

As the name indicates, REDO logs are for re-doing things..! 
They hold the changes made by transactions that were already executed but whose data pages are not yet written to the disk..! So, the REDO log records every change, holds it until the data is safely on disk, and if needed, it will be used for crash recovery.

By default, there will be 2 REDO logs, named ib_logfile0 and ib_logfile1. 
As of version 5.6 or later, we can have up to 100 REDO logs using the system variable innodb_log_files_in_group.
Since these logs are overwritten once all the files are full, it is advised to have more files or files with a larger size. By default, each log file will be 5 MB, and the size can be set using the system variable innodb_log_file_size.

The maximum allowed combined size of all the files is 512 GB (innodb_log_file_size * innodb_log_files_in_group <= 512 GB).

The REDO log files' location can be set using the variable innodb_log_group_home_dir; if it is not specified, the data directory is used.
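
Putting those redo log variables together, a my.cnf sketch (the sizes and the path are just example values):

    # my.cnf - redo log configuration (example values only)
    [mysqld]
    innodb_log_files_in_group = 2                # number of ib_logfiles (2 to 100)
    innodb_log_file_size      = 512M             # size of each file
    innodb_log_group_home_dir = /var/lib/mysql   # where the ib_logfiles live
    # keep innodb_log_file_size * innodb_log_files_in_group <= 512 GB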


Let's talk about UNDO logs..:

This log stores a copy of the data that is being modified by a currently running transaction, so that if any other transaction queries for the original data at the same time, this log will serve it..!! 
These logs are also called rollback segments.

In earlier versions, these logs were invisible, as UNDO logs were part of the system tablespace, i.e., ibdata1 and ibdata2. But as of 5.6 and later, you can create multiple, separate UNDO log files (tablespace files).

Use the variable innodb_undo_tablespaces to set the number of tablespace files. The maximum value is 126 and the minimum is 0. 
The undo logs, set by the variable innodb_undo_logs, will be divided among the configured tablespaces. You can create a maximum of 128 undo logs, and that is also the default value.

The path of the UNDO log table space can be set by the variable innodb_undo_directory.
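
And a corresponding my.cnf sketch for the undo log settings (the values and the path are just examples; note that innodb_undo_tablespaces is normally set when initializing a fresh data directory):

    # my.cnf - undo log configuration (example values only)
    [mysqld]
    innodb_undo_tablespaces = 4                 # number of separate undo tablespace files (0 to 126)
    innodb_undo_logs        = 128               # rollback segments, divided across the tablespaces
    innodb_undo_directory   = /var/lib/mysql    # where the undo tablespaces are created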