
Read Free Replication

Reid Horuff edited this page Mar 16, 2017 · 9 revisions

MyRocks has a feature called "Read Free Replication", which significantly speeds up replication. This feature was inspired by TokuDB. In addition, there is a related feature named "Skip Unique Checking".

Read Free Replication means that instead of reading the old data from the database during an UPDATE or DELETE, the slave uses the information in the binlog and skips this step. Skip Unique Checking means that the checks that make sure a PRIMARY or UNIQUE key doesn't already exist during an INSERT or UPDATE can be disabled. Both features can improve performance, but they come with some potential problems.

The settings to control these are the following:

  • rocksdb-read-free-rpl-tables=<list_of_tables> - Turns on Read Free Replication for the specified list of tables. This feature needs Row Based Binary Logging, and the binlog_row_image option must be FULL on the master.
  • rocksdb_skip_unique_check_tables=<list_of_tables> - Allows the unique key check to be skipped when replication is lagging. It is used in conjunction with unique_check_lag_threshold and unique_check_lag_reset_threshold. When these values are set and the lag on a slave gets high enough, the system stops doing unique key checks on the specified tables.

The <list_of_tables> is a comma-separated list that can include regular expressions. For instance, to indicate all tables you can use .*, or if you want all tables starting with workdb_ you could use workdb_.*.
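To illustrate how such a list selects tables, here is a minimal Python sketch. It is not MyRocks code: the helper name table_in_list is hypothetical, and it assumes that each pattern has to match the whole table name, which may differ from the server's actual matching rules.

```python
import re


def table_in_list(table_name: str, list_of_tables: str) -> bool:
    """Return True if table_name matches any pattern in a comma-separated list.

    Assumption (not confirmed by the wiki): a pattern must match the
    entire table name, so we use re.fullmatch rather than re.search.
    """
    patterns = [p.strip() for p in list_of_tables.split(",") if p.strip()]
    return any(re.fullmatch(p, table_name) is not None for p in patterns)


print(table_in_list("workdb_orders", "workdb_.*"))       # True
print(table_in_list("otherdb_orders", "workdb_.*"))      # False
print(table_in_list("anything", ".*"))                   # True
print(table_in_list("t2", "t1, t2, workdb_.*"))          # True
```

Under this assumption, .* selects every table and workdb_.* selects only tables whose names start with workdb_.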

Read Free Replication has some limitations and has to be used carefully; otherwise some indexes might get corrupted. As a general rule of thumb, you should not directly insert, update, or delete rows on slaves outside of replication. Here are two examples (M: is the master, S: is the slave) in which secondary indexes get corrupted because the slave is modified directly.

  1. Secondary keys lose some rows
M:
create table t (id int primary key, i1 int, i2 int, value int, index (i1), index (i2)) engine=rocksdb;
insert into t values (1,1,1,1),(2,2,2,2),(3,3,3,3);

S:
delete from t where id <= 2;

M:
update t set i2=100, value=100 where id=1;

S:
mysql> select count(*) from t force index(primary);
+----------+
| count(*) |
+----------+
|        2 |
+----------+
1 row in set (0.00 sec)

mysql> select count(*) from t force index(i1);
+----------+
| count(*) |
+----------+
|        1 |
+----------+
1 row in set (0.00 sec)

mysql> select count(*) from t force index(i2);
+----------+
| count(*) |
+----------+
|        2 |
+----------+
1 row in set (0.00 sec)

mysql> select * from t where id=1;
+----+------+------+-------+
| id | i1   | i2   | value |
+----+------+------+-------+
|  1 |    1 |  100 |   100 |
+----+------+------+-------+
1 row in set (0.00 sec)

mysql> select i1 from t where i1=1;
Empty set (0.00 sec)

mysql> select i2 from t where i2=100;
+------+
| i2   |
+------+
|  100 |
+------+
1 row in set (0.00 sec)

  2. Secondary keys have extra rows
M:
create table t (id int primary key, i1 int, i2 int, value int, index (i1), index (i2)) engine=rocksdb;
insert into t values (1,1,1,1),(2,2,2,2),(3,3,3,3);

S:
update t set i1=100 where id=1;

M:
delete from t where id=1;

S:
mysql> select count(*) from t force index(primary);
+----------+
| count(*) |
+----------+
|        2 |
+----------+
1 row in set (0.00 sec)

mysql> select count(*) from t force index(i1);
+----------+
| count(*) |
+----------+
|        3 |
+----------+
1 row in set (0.00 sec)

mysql> select count(*) from t force index(i2);
+----------+
| count(*) |
+----------+
|        2 |
+----------+
1 row in set (0.00 sec)

mysql> select i1 from t where i1=100;
+------+
| i1   |
+------+
|  100 |
+------+
1 row in set (0.00 sec)
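The two examples above can be reproduced with a toy model. The following Python sketch is only an illustration under simplifying assumptions: ToyTable and its methods are hypothetical names, the "table" is a dictionary plus one set per secondary index, and the read-free paths simply trust the binlog before and after images without looking at what is stored. It is not how MyRocks is implemented.

```python
class ToyTable:
    """Toy table: a primary-key store plus one set per secondary index."""

    def __init__(self, secondary_cols):
        self.pk = {}                                      # id -> row dict
        self.sk = {col: set() for col in secondary_cols}  # col -> {(value, id)}

    def insert(self, row):
        self.pk[row["id"]] = dict(row)
        for col, idx in self.sk.items():
            idx.add((row[col], row["id"]))

    def local_delete(self, row_id):
        """Normal DELETE: read the stored row, then remove its index entries."""
        row = self.pk.pop(row_id)
        for col, idx in self.sk.items():
            idx.discard((row[col], row_id))

    def local_update(self, row_id, changes):
        """Normal UPDATE: read the stored row and fix up changed index entries."""
        old = self.pk[row_id]
        new = {**old, **changes}
        for col, idx in self.sk.items():
            if old[col] != new[col]:
                idx.discard((old[col], row_id))
                idx.add((new[col], row_id))
        self.pk[row_id] = new

    def read_free_update(self, before, after):
        """Replicated UPDATE: blind write based only on the binlog images."""
        self.pk[after["id"]] = dict(after)
        for col, idx in self.sk.items():
            if before[col] != after[col]:
                idx.discard((before[col], before["id"]))
                idx.add((after[col], after["id"]))

    def read_free_delete(self, before):
        """Replicated DELETE: remove what the before image says should exist."""
        self.pk.pop(before["id"], None)
        for col, idx in self.sk.items():
            idx.discard((before[col], before["id"]))


def counts(table):
    """Row counts as seen through the primary key, index i1, and index i2."""
    return (len(table.pk), len(table.sk["i1"]), len(table.sk["i2"]))


def fresh_slave():
    t = ToyTable(["i1", "i2"])
    for n in (1, 2, 3):
        t.insert({"id": n, "i1": n, "i2": n, "value": n})
    return t


original_row = {"id": 1, "i1": 1, "i2": 1, "value": 1}

# Example 1: the slave deletes rows locally, then replays the master's UPDATE.
s1 = fresh_slave()
s1.local_delete(1)
s1.local_delete(2)
s1.read_free_update(original_row, {"id": 1, "i1": 1, "i2": 100, "value": 100})
print(counts(s1))  # (2, 1, 2): index i1 is missing the row with id=1

# Example 2: the slave updates i1 locally, then replays the master's DELETE.
s2 = fresh_slave()
s2.local_update(1, {"i1": 100})
s2.read_free_delete(original_row)
print(counts(s2))  # (2, 3, 2): index i1 keeps a stale entry for id=1
```

In both cases the damage comes from the same place: the replicated change is applied from the binlog images alone, so any local change that made the stored row differ from the before image goes unnoticed.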

There is a similar MySQL system variable, 'unique_checks', supported by MyRocks, which some users may find useful. Disabling this session variable (set unique_checks=OFF) disables unique checks for the given session. This system variable is also propagated through the replication stream, meaning the slave will also skip unique checks, which can reduce replication lag on large transactions such as bulk loads.

As with Read Free Replication, Skip Unique Checking should be safe on a slave if no other modifications are allowed on the slave. If changes other than replication are allowed on the slave, it is possible to end up with incorrect data there. Turning on Skip Unique Checking on a master is not recommended unless you know with 100% certainty that the data being inserted does not already exist.
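The lag-based switching used with rocksdb_skip_unique_check_tables can be pictured as a simple hysteresis. The Python sketch below is an assumption-laden illustration, not the server's logic: the function name is hypothetical, and it assumes that checks are skipped once lag rises above unique_check_lag_threshold and resume once lag drops below unique_check_lag_reset_threshold, with lag measured in seconds and strict comparisons.

```python
def should_skip_unique_checks(lag, currently_skipping,
                              lag_threshold, lag_reset_threshold):
    """Decide whether unique checks are skipped for the listed tables.

    Assumed semantics (not confirmed by the wiki):
      - lag above lag_threshold        -> start skipping
      - lag below lag_reset_threshold  -> stop skipping
      - anything in between            -> keep the current state
    """
    if lag > lag_threshold:
        return True
    if lag < lag_reset_threshold:
        return False
    return currently_skipping


# Hypothetical values: start skipping above 10s of lag, stop below 5s.
skipping = False
history = []
for lag in [0, 5, 12, 8, 6, 2, 7]:
    skipping = should_skip_unique_checks(lag, skipping, 10, 5)
    history.append(skipping)

print(history)  # [False, False, True, True, True, False, False]
```

Using two thresholds rather than one keeps the slave from flapping between the two modes when lag hovers around a single cut-off value.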
