Friday, May 31, 2013

Database library to handle multiple masters and multiple slaves

In a large scale mysql deployment there can be multiple masters and multiple slaves. The masters are generally set up in circular replication and are used to run all inserts, updates and deletes. The slaves are used to run selects.

When you are dealing with multiple mysql instances in a large scale environment, it is important to take care of the replication lag between masters and slaves. To handle such scenarios, the code should be able to route each query to a server dynamically, which means that for every query I, as a developer, should have the flexibility to decide which server the query goes to.

A list of common scenarios:

1. All registrations / username generation should happen on a single master. If you generate usernames on both masters, there may be scenarios where, due to the lag between the mysql masters, a freshly created user is not yet reflected on the other master. In such a case the user may register again and land on the other master, creating the same username a second time and breaking the circular replication. So all registrations and all "username exists" checks should happen on a single master.

2. For all other insert, update and delete operations, the user should stick to a single master. Why? Assume there is a lag of around 30 minutes between the masters and slaves. The user inserts a record and immediately wants to see what has been inserted. If we fetch the record from another master or from a slave, it will not be available, because it has not been replicated yet. To take care of this scenario, whenever a record is inserted, the selects that immediately follow have to go to the same server.

3. All other selects can be fired on any of the slaves. For example, the user logs into the site and sees his own profile. We show him his profile using one of the slave servers. This can be cached as well. The point here is that for data which has not been updated recently, the query can be fired on any of the slaves.
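To make these rules concrete, this is roughly how they translate into calls against the factory in the library that follows. The type strings 'unique', 'master' and 'slave' and the getDbConnection() call are taken from that code; $userId is just a placeholder.

$factory = new DatabaseFactory();

$factory->getDbConnection('unique');           // 1. registrations / "username exists" checks - single fixed master
$factory->getDbConnection('master', $userId);  // 2. inserts / updates / deletes - sticky master per user
$factory->getDbConnection('slave');            // 3. everything else - any random slave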

The following piece of code/library handles most of these scenarios. Please feel free to suggest modifications or improvements.


<?php
/**
 * Created by : Jayant Kumar
 * Description : php database library to handle multiple masters & multiple slaves
 **/
class DatabaseList // jk : Base class holding a list of server configs
{
    public $db = array();

    public function setDb($db)
    {
        $this->db = $db;
    }

    public function getDb()
    {
        return $this->db;
    }
}

class SDatabaseList extends DatabaseList // jk : Slave mysql servers
{
    function __construct()
    {
        $this->db[0] = array('ip'=>'10.20.1.11', 'u'=>'user11', 'p'=>'pass11', 'db'=>'database1');
        $this->db[1] = array('ip'=>'10.20.1.12', 'u'=>'user12', 'p'=>'pass12', 'db'=>'database1');
        $this->db[2] = array('ip'=>'10.20.1.13', 'u'=>'user13', 'p'=>'pass13', 'db'=>'database1');
    }
}

class MDatabaseList extends DatabaseList // jk : Master mysql servers
{
    function __construct()
    {
        $this->db[0] = array('ip'=>'10.20.1.1', 'u'=>'user1', 'p'=>'pass1', 'db'=>'database1');
        $this->db[1] = array('ip'=>'10.20.1.2', 'u'=>'user2', 'p'=>'pass2', 'db'=>'database2');
    }
}

class MemcacheList extends DatabaseList // jk : memcache servers
{
    function __construct()
    {
        $this->db[0] = array('ip'=>'localhost', 'port'=>11211);
    }
}


interface DatabaseSelectionStrategy // jk : db selection strategy interface
{
    public function getCurrentDb();
}

class StickyDbSelectionStrategy implements DatabaseSelectionStrategy // jk : sticky db. For update / delete / insert
{
    private $dblist;
    private $uid;
    private $sessionDb;
    private $sessionTimeout = 3600;

    function __construct(DatabaseList $dblist)
    {
        $this->dblist = $dblist;
    }

    public function setUserId($uid)
    {
        $this->uid = $uid;
    }

    public function setSessionDb($sessionDb)
    {
        $this->sessionDb = $sessionDb->db;
    }

    private function getDbForUser() // jk : get db for this user. If not found - assign him a random master db.
    {
        $memc = new Memcache;
        foreach ($this->sessionDb as $key => $value) {
            $memc->addServer($value['ip'], $value['port']);
        }
        $dbIp = $memc->get($this->uid);
        if ($dbIp == null) {
            // jk : no mapping in memcache yet - pick a random master and remember it for the session
            $masterlist = new MDatabaseList();
            $randomdb = new RandomDbSelectionStrategy($masterlist);
            $mdb = $randomdb->getCurrentDb();
            $dbIp = $mdb['ip'];
            $memc->set($this->uid, $dbIp, false, $this->sessionTimeout);
        }
        return $dbIp;
    }

    public function getCurrentDb()
    {
        $dbIp = $this->getDbForUser();
        foreach ($this->dblist->db as $key => $value) {
            if ($value['ip'] == $dbIp) {
                return $value;
            }
        }
        return null; // jk : should not happen - the sticky ip is always one of the masters
    }
}

class RandomDbSelectionStrategy implements DatabaseSelectionStrategy // jk : select random db from list
{
    private $dblist;

    function __construct(DatabaseList $dblist)
    {
        $this->dblist = $dblist;
    }

    public function getCurrentDb()
    {
        $cnt = sizeof($this->dblist->db);
        $rnd = rand(0, $cnt - 1);
        return $this->dblist->db[$rnd];
    }
}

class SingleDbSelectionStrategy implements DatabaseSelectionStrategy // jk : select one master db - to generate unique keys
{
    private $dblist;

    function __construct(DatabaseList $dblist)
    {
        $this->dblist = $dblist;
    }

    public function getCurrentDb()
    {
        return $this->dblist->db[0];
    }
}

interface Database
{
    public function getIp();
    public function getDbConnection();
}


class DatabaseFactory implements Database // cmt : database factory
{
    private $db;

    public function getIp()
    {
        return $this->db['ip'];
    }

    public function getDbConnection($type = 'slave', $uid = 0)
    {
        $dbStrategy = null;

        switch ($type)
        {
            case 'slave':
                $dblist = new SDatabaseList();
                $dbStrategy = new RandomDbSelectionStrategy($dblist);
                break;

            case 'master':
                $dblist = new MDatabaseList();
                $dbStrategy = new StickyDbSelectionStrategy($dblist);
                $dbStrategy->setSessionDb(new MemcacheList());
                $dbStrategy->setUserId($uid);
                break;

            case 'unique':
                $dblist = new MDatabaseList();
                $dbStrategy = new SingleDbSelectionStrategy($dblist);
                break;
        }

        $this->db = $dbStrategy->getCurrentDb();

        print_r($this->db);
        // jk : mysql_connect does not accept a database name as its 4th argument, so use mysqli here
        // return mysqli_connect($this->db['ip'], $this->db['u'], $this->db['p'], $this->db['db']);
    }
}

// tst :  test this out...

$factory = new DatabaseFactory();
echo 'Slave : '; $factory->getDbConnection('slave');
echo 'Slave2 : '; $factory->getDbConnection('slave');
echo 'Unique : '; $factory->getDbConnection('unique');
echo 'New Master 100: '; $factory->getDbConnection('master',100);
echo 'New Master 101: '; $factory->getDbConnection('master',101);
echo 'New Master 102: '; $factory->getDbConnection('master',102);
echo 'old Master 100: '; $factory->getDbConnection('master',100);
echo 'old Master 102: '; $factory->getDbConnection('master',102);

?>
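The factory above only prints the server it picked; the actual connect call is left commented out. As a rough sketch of how the pieces could be wired together, assuming the commented return in getDbConnection() is switched to mysqli_connect and uncommented - the table and column names below are made up for illustration:

$factory = new DatabaseFactory();

// scenario 2 : the write and the read that immediately follows it both go to user 100's sticky master
$master = $factory->getDbConnection('master', 100);
mysqli_query($master, "INSERT INTO orders (uid, item) VALUES (100, 'book')");
$fresh = mysqli_query($master, "SELECT * FROM orders WHERE uid = 100");

// scenario 3 : a read on data that was not freshly written can go to any random slave
$slave = $factory->getDbConnection('slave');
$profile = mysqli_query($slave, "SELECT * FROM profile WHERE uid = 100");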

Wednesday, May 01, 2013

how to create a 3 node riak cluster?

A very brief intro about riak - http://basho.com/riak/. Riak is a distributed database written in erlang. Each node in a riak cluster runs a complete, independent copy of the riak package. A riak cluster does not have any "master". Data is distributed across nodes using consistent hashing, which ensures that the data is evenly distributed and that a new node can be added with minimum reshuffling. Each object in a riak cluster has multiple copies distributed across multiple nodes, so the failure of a node does not necessarily result in data loss.

To set up a 3 node riak cluster, we first set up 3 machines with riak installed. To install riak on ubuntu machines, all that needs to be done is to download the "deb" package and do a dpkg -i "riak_x.x.x_amd64.deb". The version I used here was 1.3.1. Three machines with ips 10.20.220.2, 10.20.220.3 & 10.20.220.4 were set up.
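For reference, the install step on each of the three machines looks roughly like this; the exact deb filename depends on the build downloaded from basho's download page, so treat it as illustrative.

root@riak1# dpkg -i riak_1.3.1-1_amd64.deb      # filename is illustrative - use the deb you downloaded
root@riak1# riak version                        # should report 1.3.1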

To set up riak on the 1st node, there are 3 config changes that need to be made (the resulting entries are shown after the list):

1. replace http ip: in /etc/riak/app.config replace ip in {http, [ {"127.0.0.1", 8098 } ]} with 10.20.220.2
2. replace pb_ip: in /etc/riak/app.config replace ip in {pb_ip,   "127.0.0.1" } with 10.20.220.2
3. change the name of the riak machine to match your ip: in /etc/riak/vm.args change the name to riak@10.20.220.2
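After the edits, the changed entries on the first node should look like this:

/etc/riak/app.config
    {http, [ {"10.20.220.2", 8098 } ]},
    {pb_ip,   "10.20.220.2" },

/etc/riak/vm.args
    -name riak@10.20.220.2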



If you had started riak earlier, before making the ip related changes, you will need to clear the ring and the backend db. Do the following.

rm -rf /var/lib/riak/bitcask/
rm -rf /var/lib/riak/ring/



To start the first node, run riak start.
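Before adding more nodes, it is worth checking that the node actually came up; riak ping should answer with pong (the riak1 prompt is just for illustration).

root@riak1# riak start
root@riak1# riak ping
pong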

To prepare the second node, replace the ips with 10.20.220.3 in the same way. Once done, do a "riak start". To join this node to the cluster, do the following

root@riak2# riak-admin cluster join riak@10.20.220.2
Attempting to restart script through sudo -H -u riak
Success: staged join request for 'riak@10.20.220.3' to 'riak@10.20.220.2'

check out the cluster plan

root@riak2# riak-admin cluster plan
Attempting to restart script through sudo -H -u riak
===============================Staged Changes================================
Action         Nodes(s)
-------------------------------------------------------------------------------
join           'riak@10.20.220.3'
-------------------------------------------------------------------------------

NOTE: Applying these changes will result in 1 cluster transition

###############################################################################
                         After cluster transition 1/1
###############################################################################

=================================Membership==================================
Status     Ring    Pending    Node
-------------------------------------------------------------------------------
valid     100.0%     50.0%    'riak@10.20.220.2'
valid       0.0%     50.0%    'riak@10.20.220.3'
-------------------------------------------------------------------------------
Valid:2 / Leaving:0 / Exiting:0 / Joining:0 / Down:0

WARNING: Not all replicas will be on distinct nodes

Transfers resulting from cluster changes: 32
  32 transfers from 'riak@10.20.220.2' to 'riak@10.20.220.3'


Commit the cluster changes

root@riak2# riak-admin cluster commit
Attempting to restart script through sudo -H -u riak
Cluster changes committed

Add 1 more node

Prepare the 3rd node by replacing the ips with 10.20.220.4, and add this node to the riak cluster.

root@riak3# riak-admin cluster join riak@10.20.220.2
Attempting to restart script through sudo -H -u riak
Success: staged join request for 'riak@10.20.220.4' to 'riak@10.20.220.2'

check and commit the new node to the cluster.

root@riak3# riak-admin cluster plan
Attempting to restart script through sudo -H -u riak
=============================== Staged Changes ================================
Action         Nodes(s)
-------------------------------------------------------------------------------
join           'riak@10.20.220.4'
-------------------------------------------------------------------------------

NOTE: Applying these changes will result in 1 cluster transition

###############################################################################
                         After cluster transition 1/1
###############################################################################

================================= Membership ==================================
Status     Ring    Pending    Node
-------------------------------------------------------------------------------
valid      50.0%     34.4%    'riak@10.20.220.2'
valid      50.0%     32.8%    'riak@10.20.220.3'
valid       0.0%     32.8%    'riak@10.20.220.4'
-------------------------------------------------------------------------------
Valid:3 / Leaving:0 / Exiting:0 / Joining:0 / Down:0

WARNING: Not all replicas will be on distinct nodes

Transfers resulting from cluster changes: 21
  10 transfers from 'riak@10.20.220.2' to 'riak@10.20.220.4'
  11 transfers from 'riak@10.20.220.3' to 'riak@10.20.220.4'

root@riak3# riak-admin cluster commit
Attempting to restart script through sudo -H -u riak
Cluster changes committed

check status

root@riak3# riak-admin status | grep ring
Attempting to restart script through sudo -H -u riak
ring_members : ['riak@10.20.220.2','riak@10.20.220.3','riak@10.20.220.4']
ring_num_partitions : 64
ring_ownership : <<"[{'riak@10.20.220.2',22},{'riak@10.20.220.3',21},{'riak@10.20.220.4',21}]">>
ring_creation_size : 64


For advanced configuration, refer to:

http://docs.basho.com/riak/latest/cookbooks/Adding-and-Removing-Nodes/