Tuesday, April 01, 2014

Auto failover of mysql master in mysql multi-master multi-slave cluster

This post is an extension to my earlier posts about multi-master replication clusters: multi master replication in mysql and mysql multi master replication act II. The problem I had encountered and discussed there was automatic failover. What happens when the master goes down? How can either a slave or another master be promoted to become the new master? Once the settings are done on all the mysql dbs in the cluster to identify the new master, we will also have to change our app to point to the new master.

Let us look at the auto failover part from the mysql cluster point of view. The tool that we are going to use is mysql master ha (MHA). It was written by Yoshinori Matsunobu (http://yoshinorimatsunobu.blogspot.com/) and has been around for quite some time now. The tool supports failover in a master-master scenario, but the master-master setup has to be active-passive: one of the master mysql servers has to be in read-only mode. It does not support an active-active mysql master-master setup where both masters are handling writes. In order to make a mysql server read-only, simply add "read_only" to its my.cnf file.

The mysql version I am using here is mysql 5.6.17. I have created a mysql cluster with 4 nodes.

M1(RW)--->S1
 |
 |
M2(RO)--->S2

M1 and M2 are 2 masters running on ports 3311 & 3312. S1 is a slave running on port 3313 and replicates from master M1. S2 is another slave running on port 3314 and replicates from master M2. Remember, in mysql 5.6, an additional variable has been introduced known as the server_uuid. This has to be different for all mysql servers in the mysql cluster. So if you are creating replicas by copying the data directory, simply remove the auto.cnf file, which contains the server_uuid, and mysql will create a new uuid when it starts. In order to make M2 a read-only (passive) master, I have to put the following parameter in its my.cnf file.

read_only
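Whether the flag has taken effect - and that each server has a distinct server uuid - can be verified from the mysql client. A minimal sketch, assuming M2 listens on port 3312 with passwordless root access as in this setup:

mysql -h 127.0.0.1 -P 3312 -u root -e "SHOW VARIABLES LIKE 'read_only'"    # should be ON on M2, OFF on M1
mysql -h 127.0.0.1 -P 3312 -u root -e "SHOW VARIABLES LIKE 'server_uuid'"  # must be different on every server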

Download the mysql-master-ha tool from https://code.google.com/p/mysql-master-ha/downloads/list. I am using the following versions.

mha4mysql-manager-0.55.tar.gz
mha4mysql-node-0.54.tar.gz


Untar the mha4mysql-node archive first and install it.

mha4mysql-node-0.54$ perl Makefile.PL
*** Module::AutoInstall version 1.03
*** Checking for Perl dependencies...
[Core Features]
- DBI        ...loaded. (1.63)
- DBD::mysql ...loaded. (4.025)
*** Module::AutoInstall configuration finished.
Writing Makefile for mha4mysql::node
Writing MYMETA.yml and MYMETA.json

mha4mysql-node-0.54$ make
mha4mysql-node-0.54$ sudo make install


Next install the mha4mysql-manager.

mha4mysql-manager-0.55$ perl Makefile.PL
*** Module::AutoInstall version 1.03
*** Checking for Perl dependencies...
[Core Features]
- DBI                   ...loaded. (1.63)
- DBD::mysql            ...loaded. (4.025)
- Time::HiRes           ...loaded. (1.9725)
- Config::Tiny          ...loaded. (2.20)
- Log::Dispatch         ...loaded. (2.41)
- Parallel::ForkManager ...loaded. (1.06)
- MHA::NodeConst        ...loaded. (0.54)
*** Module::AutoInstall configuration finished.
Writing Makefile for mha4mysql::manager
Writing MYMETA.yml and MYMETA.json

mha4mysql-manager-0.55$ make
mha4mysql-manager-0.55$ sudo make install


Watch out for errors. Once the tool is installed, we need to create a configuration file for the cluster. Here is the configuration file - app1.cnf.

[server default]
multi_tier_slave=1
manager_workdir=/home/jayantk/log/masterha/app1
manager_log=/home/jayantk/log/masterha/app1/manager.log

[server1]
hostname=127.0.0.1
candidate_master=1
port=3311
user=root
ssh_user=jayantk
master_binlog_dir=/home/jayantk/mysql-5.6.17/data1

[server2]
hostname=127.0.0.1
candidate_master=1
port=3312
user=root
ssh_user=jayantk
master_binlog_dir=/home/jayantk/mysql-5.6.17/data2

[server3]
hostname=127.0.0.1
port=3313
user=root
ssh_user=jayantk
master_binlog_dir=/home/jayantk/mysql-5.6.17/data3

[server4]
hostname=127.0.0.1
port=3314
user=root
ssh_user=jayantk
master_binlog_dir=/home/jayantk/mysql-5.6.17/data4



Let us go through it and understand what is happening. "multi_tier_slave" is a parameter used for specifying that the mysql cluster is a multi-master and multi-slave cluster where each master can have its own slave. If the parameter is not specified, mha will give an error. Next we specify the working directory and log file for the mha manager. Then we specify each of our servers with its host name and port. We have to specify the root username and password for each of our mysql servers using the "user" and "password" parameters. I have not set a password on my mysql servers and so do not have that parameter in my configuration file. The parameter "candidate_master" is used to prioritize a certain mysql server as a master candidate during failover. In our case, we are prioritizing server2 (M2) as the new master.

Finally, mha needs ssh access to the machines running our mysql servers. I have specified the ssh username to be used as "ssh_user", and have enabled ssh public key authentication without a pass phrase for the machines (127.0.0.1 in my case). The parameter "master_binlog_dir" is used if the dead master is reachable via ssh, to copy any pending binary log events - this is required because the slaves have no information about where the master's binary log files are located.
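Setting up passphrase-less public key authentication is a one-time step. A rough sketch, using the user and host from this setup (adjust for your environment):

ssh-keygen -t rsa                 # accept an empty passphrase
ssh-copy-id jayantk@127.0.0.1     # repeat for every machine running mysql
ssh jayantk@127.0.0.1 exit        # should log in without prompting for a password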

Another thing to remember is to grant the "replication slave" privilege to all other mysql servers on the network from each mysql server in the cluster - irrespective of whether it is a master or a slave. In my case, I had to run the following grant statement on all my mysql servers - M1, M2, S1 and S2.

grant replication slave on *.* to  slave_user@localhost identified by 'slave_pwd';
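In a deployment where the mysql servers run on different machines, the grant would reference the actual hosts (or a subnet) instead of localhost. A hedged example - the IP range below is only illustrative:

grant replication slave on *.* to slave_user@'192.168.1.%' identified by 'slave_pwd';
flush privileges;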

Once all the settings are in place, run the command

masterha_check_repl --conf=app1.cnf

It will output a lot of messages ending with

"MySQL Replication Health is OK."


Which means that the configuration is a success. In case of errors, check the wiki at https://code.google.com/p/mysql-master-ha/wiki/TableOfContents for solutions and missing parameters.
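If the reported failures are SSH related, the SSH connectivity between all the hosts in app1.cnf can be verified separately with the check script that ships with the manager:

masterha_check_ssh --conf=app1.cnf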

To start the masterha manager, run the following command

masterha_manager --conf=app1.cnf
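Note that masterha_manager runs in the foreground and exits once a failover completes. A common way to keep it running detached is something along these lines (the log path is illustrative):

nohup masterha_manager --conf=app1.cnf < /dev/null > /home/jayantk/log/masterha/app1/manager_start.log 2>&1 &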

In order to check the status of the mha manager script, go to the log directory - in our case - /home/jayantk/log/masterha/app1. It has a file specifying the health status. You can cat the file to see the health status.

~/log/masterha/app1$ cat app1.master_status.health
9100    0:PING_OK    master:127.0.0.1


Another file is the log file - manager.log - which will give the following output as tail.

Tue Apr  1 12:20:50 2014 - [warning] master_ip_failover_script is not defined.
Tue Apr  1 12:20:50 2014 - [warning] shutdown_script is not defined.
Tue Apr  1 12:20:50 2014 - [info] Set master ping interval 3 seconds.
Tue Apr  1 12:20:50 2014 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.
Tue Apr  1 12:20:50 2014 - [info] Starting ping health check on 127.0.0.1(127.0.0.1:3311)..
Tue Apr  1 12:20:50 2014 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..


The mha manager will keep on pinging the master every 3 seconds to see if it is alive or not. The ping interval time can be changed by specifying the parameter "ping_interval" in the application configuration. Failover is triggered after missing 3 pings to the master. So if ping_interval is 3 seconds, the maximum time to discover that the master mysql is down is 12 seconds.
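For example, to make the interval explicit (shown here with the default value of 3 seconds), add it to the [server default] section of app1.cnf:

[server default]
ping_interval=3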

Now, let's kill the master and see what happens. Find out the pid of the mysqld process and kill it. It is automatically restarted by the parent process "mysqld_safe". Here is what we get in our manager.log file.

140401 12:34:12 mysqld_safe Number of processes running now: 0
140401 12:34:12 mysqld_safe mysqld restarted
Tue Apr  1 12:34:14 2014 - [warning] Got error on MySQL select ping: 2006 (MySQL server has gone away)
Tue Apr  1 12:34:14 2014 - [info] Executing SSH check script: save_binary_logs --command=test --start_pos=4 --binlog_dir=/home/jayantk/mysql-5.6.17/data1 --output_file=/var/tmp/save_binary_logs_test --manager_version=0.55 --binlog_prefix=zdev-bin
Tue Apr  1 12:34:15 2014 - [info] HealthCheck: SSH to 127.0.0.1 is reachable.
Tue Apr  1 12:34:17 2014 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..


It tries to do an ssh and check if the machine is reachable. But in the meantime the machine is back up and the ping succeeds, so nothing happens. As a next step, let's kill both the parent "mysqld_safe" and the "mysqld" processes associated with the master mysql server, in that sequence. Let us see what happens in the manager.log file. There are a lot of log entries here.


Tue Apr  1 12:44:23 2014 - [warning] Got error on MySQL select ping: 2006 (MySQL server has gone away)
...
Tue Apr  1 12:44:26 2014 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '127.0.0.1' (111))
Tue Apr  1 12:44:26 2014 - [warning] Connection failed 1 time(s)..
Tue Apr  1 12:44:29 2014 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '127.0.0.1' (111))
Tue Apr  1 12:44:29 2014 - [warning] Connection failed 2 time(s)..
Tue Apr  1 12:44:32 2014 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '127.0.0.1' (111))
Tue Apr  1 12:44:32 2014 - [warning] Connection failed 3 time(s)..
Tue Apr  1 12:44:32 2014 - [warning] Master is not reachable from health checker!
...
Tue Apr  1 12:44:32 2014 - [info] Multi-master configuration is detected. Current primary(writable) master is 127.0.0.1(127.0.0.1:3311)
Tue Apr  1 12:44:32 2014 - [info] Master configurations are as below:
Master 127.0.0.1(127.0.0.1:3312), replicating from localhost(127.0.0.1:3311), read-only
Master 127.0.0.1(127.0.0.1:3311), dead
...
...
Tue Apr  1 12:44:32 2014 - [info] * Phase 1: Configuration Check Phase..
Tue Apr  1 12:44:32 2014 - [info]
Tue Apr  1 12:44:33 2014 - [info] Multi-master configuration is detected. Current primary(writable) master is 127.0.0.1(127.0.0.1:3311)
Tue Apr  1 12:44:33 2014 - [info] Master configurations are as below:
Master 127.0.0.1(127.0.0.1:3311), dead
Master 127.0.0.1(127.0.0.1:3312), replicating from localhost(127.0.0.1:3311), read-only

Tue Apr  1 12:44:33 2014 - [info] Dead Servers:
Tue Apr  1 12:44:33 2014 - [info]   127.0.0.1(127.0.0.1:3311)
Tue Apr  1 12:44:33 2014 - [info] Checking master reachability via mysql(double check)..
Tue Apr  1 12:44:33 2014 - [info]  ok.
Tue Apr  1 12:44:33 2014 - [info] Alive Servers:
Tue Apr  1 12:44:33 2014 - [info]   127.0.0.1(127.0.0.1:3312)
Tue Apr  1 12:44:33 2014 - [info]   127.0.0.1(127.0.0.1:3313)
Tue Apr  1 12:44:33 2014 - [info] Alive Slaves:
Tue Apr  1 12:44:33 2014 - [info]   127.0.0.1(127.0.0.1:3312)  Version=5.6.17-log (oldest major version between slaves) log-bin:enabled
Tue Apr  1 12:44:33 2014 - [info]     Replicating from localhost(127.0.0.1:3311)
Tue Apr  1 12:44:33 2014 - [info]     Primary candidate for the new Master (candidate_master is set)
Tue Apr  1 12:44:33 2014 - [info]   127.0.0.1(127.0.0.1:3313)  Version=5.6.17-log (oldest major version between slaves) log-bin:enabled
Tue Apr  1 12:44:33 2014 - [info]     Replicating from localhost(127.0.0.1:3311)
Tue Apr  1 12:44:33 2014 - [info] Unmanaged Servers:
Tue Apr  1 12:44:33 2014 - [info]   127.0.0.1(127.0.0.1:3314)  Version=5.6.17-log (oldest major version between slaves) log-bin:enabled
Tue Apr  1 12:44:33 2014 - [info]     Replicating from localhost(127.0.0.1:3312)
Tue Apr  1 12:44:33 2014 - [info] ** Phase 1: Configuration Check Phase completed.
Tue Apr  1 12:44:33 2014 - [info]
Tue Apr  1 12:44:33 2014 - [info] * Phase 2: Dead Master Shutdown Phase..
Tue Apr  1 12:44:33 2014 - [info]
Tue Apr  1 12:44:33 2014 - [info] Forcing shutdown so that applications never connect to the current master..
...
Tue Apr  1 12:44:33 2014 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Tue Apr  1 12:44:33 2014 - [info]
Tue Apr  1 12:44:33 2014 - [info] * Phase 3: Master Recovery Phase..
Tue Apr  1 12:44:33 2014 - [info]
Tue Apr  1 12:44:33 2014 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Tue Apr  1 12:44:33 2014 - [info]
Tue Apr  1 12:44:33 2014 - [info] The latest binary log file/position on all slaves is zdev-bin.000009:120
Tue Apr  1 12:44:33 2014 - [info] Latest slaves (Slaves that received relay log files to the latest):
Tue Apr  1 12:44:33 2014 - [info]   127.0.0.1(127.0.0.1:3312)  Version=5.6.17-log (oldest major version between slaves) log-bin:enabled
Tue Apr  1 12:44:33 2014 - [info]     Replicating from localhost(127.0.0.1:3311)
Tue Apr  1 12:44:33 2014 - [info]     Primary candidate for the new Master (candidate_master is set)
Tue Apr  1 12:44:33 2014 - [info]   127.0.0.1(127.0.0.1:3313)  Version=5.6.17-log (oldest major version between slaves) log-bin:enabled
Tue Apr  1 12:44:33 2014 - [info]     Replicating from localhost(127.0.0.1:3311)
Tue Apr  1 12:44:33 2014 - [info] The oldest binary log file/position on all slaves is zdev-bin.000009:120
Tue Apr  1 12:44:33 2014 - [info] Oldest slaves:
Tue Apr  1 12:44:33 2014 - [info]   127.0.0.1(127.0.0.1:3312)  Version=5.6.17-log (oldest major version between slaves) log-bin:enabled
Tue Apr  1 12:44:33 2014 - [info]     Replicating from localhost(127.0.0.1:3311)
Tue Apr  1 12:44:33 2014 - [info]     Primary candidate for the new Master (candidate_master is set)
Tue Apr  1 12:44:33 2014 - [info]   127.0.0.1(127.0.0.1:3313)  Version=5.6.17-log (oldest major version between slaves) log-bin:enabled
Tue Apr  1 12:44:33 2014 - [info]     Replicating from localhost(127.0.0.1:3311)
Tue Apr  1 12:44:33 2014 - [info]
Tue Apr  1 12:44:33 2014 - [info] * Phase 3.2: Saving Dead Master's Binlog Phase..
Tue Apr  1 12:44:33 2014 - [info]
Tue Apr  1 12:44:33 2014 - [info] Fetching dead master's binary logs..
...
Tue Apr  1 12:44:33 2014 - [info] Additional events were not found from the orig master. No need to save.
Tue Apr  1 12:44:33 2014 - [info]
Tue Apr  1 12:44:33 2014 - [info] * Phase 3.3: Determining New Master Phase..
Tue Apr  1 12:44:33 2014 - [info]
Tue Apr  1 12:44:33 2014 - [info] Finding the latest slave that has all relay logs for recovering other slaves..
Tue Apr  1 12:44:33 2014 - [info] All slaves received relay logs to the same position. No need to resync each other.
Tue Apr  1 12:44:33 2014 - [info] Searching new master from slaves..
Tue Apr  1 12:44:33 2014 - [info]  Candidate masters from the configuration file:
Tue Apr  1 12:44:33 2014 - [info]   127.0.0.1(127.0.0.1:3312)  Version=5.6.17-log (oldest major version between slaves) log-bin:enabled
Tue Apr  1 12:44:33 2014 - [info]     Replicating from localhost(127.0.0.1:3311)
Tue Apr  1 12:44:33 2014 - [info]     Primary candidate for the new Master (candidate_master is set)
...
Tue Apr  1 12:44:33 2014 - [info] New master is 127.0.0.1(127.0.0.1:3312)
Tue Apr  1 12:44:33 2014 - [info] Starting master failover..
Tue Apr  1 12:44:33 2014 - [info]
From:
127.0.0.1 (current master)
 +--127.0.0.1
 +--127.0.0.1

To:
127.0.0.1 (new master)
 +--127.0.0.1
Tue Apr  1 12:44:33 2014 - [info]
Tue Apr  1 12:44:33 2014 - [info] * Phase 3.3: New Master Diff Log Generation Phase..
Tue Apr  1 12:44:33 2014 - [info]
Tue Apr  1 12:44:33 2014 - [info]  This server has all relay logs. No need to generate diff files from the latest slave.
Tue Apr  1 12:44:33 2014 - [info]
Tue Apr  1 12:44:33 2014 - [info] * Phase 3.4: Master Log Apply Phase..
Tue Apr  1 12:44:33 2014 - [info]
Tue Apr  1 12:44:33 2014 - [info] *NOTICE: If any error happens from this phase, manual recovery is needed.
Tue Apr  1 12:44:33 2014 - [info] Starting recovery on 127.0.0.1(127.0.0.1:3312)..
Tue Apr  1 12:44:33 2014 - [info]  This server has all relay logs. Waiting all logs to be applied..
Tue Apr  1 12:44:33 2014 - [info]   done.
Tue Apr  1 12:44:33 2014 - [info]  All relay logs were successfully applied.
Tue Apr  1 12:44:33 2014 - [info] Getting new master's binlog name and position..
Tue Apr  1 12:44:33 2014 - [info]  zdev-bin.000009:797
Tue Apr  1 12:44:33 2014 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='127.0.0.1', MASTER_PORT=3312, MASTER_LOG_FILE='zdev-bin.000009', MASTER_LOG_POS=797, MASTER_USER='slave_user', MASTER_PASSWORD='xxx';
Tue Apr  1 12:44:33 2014 - [warning] master_ip_failover_script is not set. Skipping taking over new master ip address.
Tue Apr  1 12:44:33 2014 - [info] Setting read_only=0 on 127.0.0.1(127.0.0.1:3312)..
Tue Apr  1 12:44:33 2014 - [info]  ok.
Tue Apr  1 12:44:33 2014 - [info] ** Finished master recovery successfully.
Tue Apr  1 12:44:33 2014 - [info] * Phase 3: Master Recovery Phase completed.
Tue Apr  1 12:44:33 2014 - [info]
Tue Apr  1 12:44:33 2014 - [info] * Phase 4: Slaves Recovery Phase..
Tue Apr  1 12:44:33 2014 - [info]
Tue Apr  1 12:44:33 2014 - [info] * Phase 4.1: Starting Parallel Slave Diff Log Generation Phase..
Tue Apr  1 12:44:33 2014 - [info]
Tue Apr  1 12:44:33 2014 - [info] -- Slave diff file generation on host 127.0.0.1(127.0.0.1:3313) started, pid: 10586. Check tmp log /home/jayantk/log/masterha/app1/127.0.0.1_3313_20140401124432.log if it takes time..
Tue Apr  1 12:44:33 2014 - [info]
Tue Apr  1 12:44:33 2014 - [info] Log messages from 127.0.0.1 ...
Tue Apr  1 12:44:33 2014 - [info]
Tue Apr  1 12:44:33 2014 - [info]  This server has all relay logs. No need to generate diff files from the latest slave.
Tue Apr  1 12:44:33 2014 - [info] End of log messages from 127.0.0.1.
Tue Apr  1 12:44:33 2014 - [info] -- 127.0.0.1(127.0.0.1:3313) has the latest relay log events.
Tue Apr  1 12:44:33 2014 - [info] Generating relay diff files from the latest slave succeeded.
Tue Apr  1 12:44:33 2014 - [info]
Tue Apr  1 12:44:33 2014 - [info] * Phase 4.2: Starting Parallel Slave Log Apply Phase..
Tue Apr  1 12:44:33 2014 - [info]
Tue Apr  1 12:44:33 2014 - [info] -- Slave recovery on host 127.0.0.1(127.0.0.1:3313) started, pid: 10588. Check tmp log /home/jayantk/log/masterha/app1/127.0.0.1_3313_20140401124432.log if it takes time..
Tue Apr  1 12:44:33 2014 - [info]
Tue Apr  1 12:44:33 2014 - [info] Log messages from 127.0.0.1 ...
Tue Apr  1 12:44:33 2014 - [info]
Tue Apr  1 12:44:33 2014 - [info] Starting recovery on 127.0.0.1(127.0.0.1:3313)..
Tue Apr  1 12:44:33 2014 - [info]  This server has all relay logs. Waiting all logs to be applied..
Tue Apr  1 12:44:33 2014 - [info]   done.
Tue Apr  1 12:44:33 2014 - [info]  All relay logs were successfully applied.
Tue Apr  1 12:44:33 2014 - [info]  Resetting slave 127.0.0.1(127.0.0.1:3313) and starting replication from the new master 127.0.0.1(127.0.0.1:3312)..
Tue Apr  1 12:44:33 2014 - [info]  Executed CHANGE MASTER.
Tue Apr  1 12:44:33 2014 - [info]  Slave started.
Tue Apr  1 12:44:33 2014 - [info] End of log messages from 127.0.0.1.
Tue Apr  1 12:44:33 2014 - [info] -- Slave recovery on host 127.0.0.1(127.0.0.1:3313) succeeded.
Tue Apr  1 12:44:33 2014 - [info] All new slave servers recovered successfully.
Tue Apr  1 12:44:33 2014 - [info]
Tue Apr  1 12:44:33 2014 - [info] * Phase 5: New master cleanup phase..
Tue Apr  1 12:44:33 2014 - [info]
Tue Apr  1 12:44:33 2014 - [info] Resetting slave info on the new master..
Tue Apr  1 12:44:33 2014 - [info]  127.0.0.1: Resetting slave info succeeded.
Tue Apr  1 12:44:33 2014 - [info] Master failover to 127.0.0.1(127.0.0.1:3312) completed successfully.
Tue Apr  1 12:44:33 2014 - [info]

----- Failover Report -----

app1: MySQL Master failover 127.0.0.1 to 127.0.0.1 succeeded

Master 127.0.0.1 is down!

Check MHA Manager logs at zdev.net:/home/jayantk/log/masterha/app1/manager.log for details.

Started automated(non-interactive) failover.
The latest slave 127.0.0.1(127.0.0.1:3312) has all relay logs for recovery.
Selected 127.0.0.1 as a new master.
127.0.0.1: OK: Applying all logs succeeded.
127.0.0.1: This host has the latest relay log events.
Generating relay diff files from the latest slave succeeded.
127.0.0.1: OK: Applying all logs succeeded. Slave started, replicating from 127.0.0.1.
127.0.0.1: Resetting slave info succeeded.
Master failover to 127.0.0.1(127.0.0.1:3312) completed successfully.



So eventually what has happened is that the following mysql cluster is now in place

M2(RW) ---> S2
   |
   |
  V
  S1

We can log into each of the mysql servers and verify the same. So, the failover happened. There are a few problems with this solution.

  1. Now that M2 is the master, we will have to change our application to use M2 as the master instead of M1, so that all inserts start happening on M2. An easier way to do this is to use an HA solution known as Pacemaker which will take over the virtual ip of the master M1 and assign it to the new master M2. Another way is to change the script on all app servers and swap the ip of M1 with that of M2. This is handled by the "master_ip_failover_script" which can be configured to handle either scenario. More details on configuring the same are available here - https://code.google.com/p/mysql-master-ha/wiki/Parameters#master_ip_failover_script 
  2. The purpose of master-master mysql replication is lost when failover happens. Master-master replication allows easy switching of the app between two masters; it is even possible to write to certain databases on M1 and other databases on M2. But with this solution, firstly, M2 has to be read-only and is effectively acting only as a slave. Secondly, after the failover, M2 loses all the information which made it a slave of M1. So, after M1 comes back up, it cannot be put back into circular replication as before.
  3. The script does not handle slave failover. So it is expected that each master will have multiple slaves and the failover of slaves should be handled separately. If a mysql slave server goes down, the application should be able to identify the same and not use that particular slave server for queries. Also, if mha is monitoring the mysql servers and one of the slaves goes down (undetected), and then the master mysql fails, it may not be able to perform the failover.
  4. And finally, after the failover happens the script simply exits. It does not update the mha configuration file. So, multiple failovers cannot be handled. In order to start it again, we will have to manually change the configuration file and start the mha manager script again.
But, in spite of the limitations of the script, I believe it can be used in the majority of cases and provides a good solution for mysql master failover.
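As a quick sanity check after such a failover, the new topology can be confirmed from the mysql clients themselves - a sketch using the ports from this setup:

mysql -h 127.0.0.1 -P 3312 -u root -e "SHOW VARIABLES LIKE 'read_only'"         # M2 should now be writable (OFF)
mysql -h 127.0.0.1 -P 3313 -u root -e "SHOW SLAVE STATUS\G" | grep Master_Port  # S1 should now replicate from port 3312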

Wednesday, March 19, 2014

An interesting suggester in Solr

Auto suggestion has evolved over time. Lucene has a number of implementations for type-ahead suggestions. Existing suggesters generally find suggestions whose whole prefix matches the current user input. Recently a new AnalyzingInfixSuggester has been developed which finds matches of tokens anywhere in the user input and in the suggestion.




Let us see how to implement this in Solr. Firstly, we will need solr 4.7 which has the capability to utilize the Lucene suggester module. For more details check out SOLR-5378 and SOLR-5528. To implement this, look at the searchComponent named "suggest" in solrconfig.xml and make the following changes.

<searchComponent name="suggest" class="solr.SuggestComponent">
      <lst name="suggester">
      <str name="name">mySuggester</str>
      <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
      <str name="suggestAnalyzerFieldType">text_ws</str>
      <str name="dictionaryImpl">DocumentDictionaryFactory</str>     <!-- org.apache.solr.spelling.suggest.HighFrequencyDictionaryFactory -->
      <str name="field">cat</str>
      <str name="weightField">price</str>
      <str name="buildOnCommit">true</str>
    </lst>
</searchComponent>

Here we have changed the lookup type - lookupImpl - to AnalyzingInfixLookupFactory, and defined the analyzer to use for building the dictionary as text_ws, which is a simple whitespace tokenizer based field type. The field to be used for providing suggestions is "cat" and we use the "price" field as the weight for sorting the suggestions.

Also change the default dictionary for the suggester to "mySuggester" by adding the following line to the requestHandler named "/suggest".

<str name="suggest.dictionary">mySuggester</str>

Once these configurations are in place, simply restart the solr server. In order to check the suggester, index all the documents in the exampleDocs folder. The suggester index is created when the documents are committed to the index. In order to check the implementation simply use the following URL.

http://localhost:8983/solr/suggest/?suggest=true&suggest.q=and&suggest.count=2

The output will be somewhat similar to this...


<response>
  <lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">3</int>
  </lst>
  <lst name="suggest">
    <lst name="mySuggester">
      <lst name="and">
        <int name="numFound">2</int>
        <arr name="suggestions">
        <lst>
          <str name="term">electronics <b>and</b> computer1</str>
          <long name="weight">2199</long>
          <str name="payload"/>
        </lst>
        <lst>
          <str name="term">electronics <b>and</b> stuff2</str>
          <long name="weight">279</long>
          <str name="payload"/>
        </lst>
        </arr>
      </lst>
    </lst>
  </lst>
</response>

We are already getting the suggestions as highlighted. Try getting suggestions for some other partial words like "elec" and "comp" and see the output.
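For instance, using the same request handler with another partial word:

curl "http://localhost:8983/solr/suggest/?suggest=true&suggest.q=elec&suggest.count=2"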

Let us note down some limitations that I came across while implementing this. Check out the type of the field "cat" which is being used for providing suggestions in schema.xml. It is of the type "string" and is both indexed and stored. We can change the field name in our solrconfig.xml to provide suggestions based on some other field, but the field has to be both indexed and stored. We would not want to tokenize the field as it may mess up the suggestions - so it is recommended to use "string" fields for providing suggestions.

Another flaw that I came across is that the suggestions do not work on multiValued fields. "cat", for example, is multiValued. Do a search for "electronics" and fetch the "cat" field.

http://localhost:8983/solr/collection1/select/?q=electronics&fl=cat

We can see that in addition to "electronics", the cat field also contains "connector", "hard drive" and "memory". But a search on those strings does not give any suggestions.

http://localhost:8983/solr/suggest/?suggest=true&suggest.q=hard&suggest.count=2

So, it is recommended that the field be of type "string" and not multivalued. If there are multiple fields on which suggestions are to be provided, it is recommended to merge them into a single "string" field in our index.

Monday, February 24, 2014

Win Free e-copies of Apache Solr PHP Integration

Readers would be pleased to know that I have teamed up with Packt Publishing to organize a giveaway of Apache Solr PHP Integration

And 3 lucky winners stand a chance to win e-copies of their new book. Keep reading to find out how you can be one of the Lucky Winners.



Overview

• Understand the tools that can be used to communicate between PHP and Solr, and how they work internally
• Explore the essential search functions of Solr such as sorting, boosting, faceting, and highlighting using your PHP code
• Take a look at some advanced features of Solr such as spell checking, grouping, and auto complete with implementations using PHP code

How to enter ?

All you need to do is head on over to the book page and look through the product description of the book and drop a line via the comments below this post to let us know what interests you the most about this book. It’s that simple.

Deadline

The contest will close on 5th March 2014. Winners will be contacted by email, so be sure to use your real email address when you comment!

Wednesday, January 15, 2014

Customizing similarity in Solr

By default Solr uses the DefaultSimilarity class for calculating the score of each document with respect to the query. Details of how the score is calculated can be obtained from the following lucene api documentation.

http://lucene.apache.org/core/4_6_0/core/org/apache/lucene/search/similarities/DefaultSimilarity.html

Or from my earlier posts

http://jayant7k.blogspot.in/2007/07/document-scoring-in-lucene-part-2.html
http://jayant7k.blogspot.in/2006/07/document-scoringcalculating-relevance_08.html

In order to customize the scoring algorithm, you need to extend the default similarity class and override the implementation for different functions.

Suppose you want to disable length normalization for the text field, so that the score is not affected by the number of tokens in the text field of any document. Here is the code for implementing the same:

package com.myscorer;

import org.apache.lucene.index.*;
import org.apache.lucene.search.similarities.*;

public class NoLengthNormSimilarity extends DefaultSimilarity {

    // Return a constant length norm for the "text" field so that the number of
    // tokens in it does not affect the score; all other fields use the default.
    @Override
    public float lengthNorm(FieldInvertState fld) {
        if (fld.getName().equals("text"))
            return 1.0f;
        else
            return super.lengthNorm(fld);
    }

}


To compile the file, you will need the lucene-core-x.y.z.jar file in your classpath. This file is generally found in the following folder inside your solr installation.

<solr folder>/example/solr-webapp/webapp/WEB-INF/lib/

To compile the file run

javac -cp <solr folder>/example/solr-webapp/webapp/WEB-INF/lib/lucene-core-4.6.0.jar NoLengthNormSimilarity.java

This will create the class NoLengthNormSimilarity.class

You will need to put this file in a folder com/myscorer and create a jar

jar -cvf myscorer.jar com/myscorer/NoLengthNormSimilarity.class

And finally copy the jar to <solr folder>/example/solr-webapp/webapp/WEB-INF/lib/ folder
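The copy itself is just (keeping the same placeholder for the solr folder):

cp myscorer.jar <solr folder>/example/solr-webapp/webapp/WEB-INF/lib/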

To implement this similarity class in your solr, simply add the following line at the end of your schema.xml for that particular core.

<similarity class="com.myscorer.NoLengthNormSimilarity"></similarity>

This will implement the NoLengthNorm similarity at a global level for all fields and field types in the core in whose schema this configuration has been defined. It is also possible to implement different similarities for different fieldTypes by creating different Similarity classes and adding them to the fieldType in our Solr schema.xml. A sample implementation is as shown below.

<fieldType name="text_nolenNorm" class="solr.TextField">
    <analyzer class="org.apache.lucene.analysis.standard.StandardAnalyzer"/>
    <similarity class="com.myscorer.NoLengthNormSimilarity"/>
</fieldType>


Here we have implemented the NoLengthNorm similarity for the fieldType named text_nolenNorm. Below, we have created a separate similarity class called NoIDFSimilarity which ignores IDF (returns 1). In addition to adding it to the fieldType named text_noIdf, we have passed 2 separate parameters which can be accessed inside the class for additional customization.

<fieldType name="text_noIdf" class="solr.TextField">
    <analyzer class="org.apache.lucene.analysis.standard.StandardAnalyzer"/>
    <similarity class="com.myscorer.NoIDFSimilarity">
        <str name="param1">param_val1</str>
        <str name="param2">param_val2</str>
    </similarity>
</fieldType>

Monday, December 23, 2013

A book every php developer should read

Once upon a time, a long long time ago, when there was no Solr and lucene used to be a search engine api available for php developers to use, they used to struggle with using lucene. Most people reverted to MySQL full text search. And some ventured into using sphinx - another free full text search engine. Then came Solr, and php developers were thrilled with the ease of use of the http interface for both indexing and searching text using lucene abstracted by Solr.

Even in those days, it was difficult to fully explore and use the features provided by lucene and Solr through php. There is an extension in php to communicate with Solr, but the extension has not been in active development. As Solr came out with more and more features, the extension remained very basic. Most of the advanced features provided by Solr were not available in the php Solr extension. Then came Solarium, an open source library which is being very actively developed and has support for the latest features of Solr.

But as the features of Solarium and Solr kept on increasing, php developers found it difficult to keep up to date with them. The book Apache Solr PHP Integration provides an up to date and in depth view of the latest features provided by Solr and how they can be explored in php via Solarium. Php developers are generally very comfortable in writing code and in setting up systems. But in case you are a developer who is not very familiar with how to set up Solr or how to connect to Solr using php, the book hand holds you with configurations, examples and screen shots.

In addition to discussing simple topics like indexing and search which are very basic to Solr, the book also goes in depth on advanced queries in Solr like filter queries and faceting. The book also guides a developer on setting up Solr for highlighting hits in the results and goes into the implementation with sample codes. Other advanced functionalities discussed in the book are development and implementation of spell check in Solr and php, grouping of results, implementing the more like this feature in Php and Solr. The book also discusses distributed search - a feature used for Scaling Solr horizontally. Setting up of Master-Slave on Solr is discussed with sample configuration files. Load balancing of queries using php and Solarium is also discussed with sample code.

As a php developer, you may have some questions like

Q: Why should i read this book?
A: The book would make you an expert in search using Solr. That would be an additional skill that you can show off.

Q: I know Solr. What additional does the book provide ?
A: Are you up to date with the latest features provided by Solr ?  Have you implemented features like spell check, suggestions, result grouping, more like this ?

Q: I am an expert in all the above features of Solr. What else does the book have ?
A: Are you comfortable implementing Solr on large scale sites with index which has millions of documents ? Do you know how Solr calculates relevance and how it can be tweaked ? Can you provide index statistics of Solr using php ?

If you are still undecided, the following article and the table of contents of the book will help you make up your mind.

http://www.packtpub.com/article/apache-solr-php-integration
http://www.packtpub.com/apache-solr-php-integration/book

Saturday, August 31, 2013

Handling mongodb connections

Mongodb is fast, and for medium to large setups it just works out of the box. But for setups which are bigger than large, you may run into a situation where the number of connections maxes out. So for extra large setups, let us look at how to increase the number of connections in a mongodb server.

Mongodb uses file descriptors to manage connections. In most unix-like operating systems, the default number of file descriptors available is set to 1024. This can be verified by using the command ulimit -n, which should give the output as 1024. Ulimit is the per-user limitation for various resources. It can be temporarily changed by issuing the command ulimit -n <new limit>.

To change file descriptors system wide, change or add the following line to your /etc/sysctl.conf

fs.file-max = 64000

Now every user in your server would be able to use 64000 file descriptors instead of the earlier 1024. For a per user configuration, you will have to tweak the hard and soft limits in /etc/security/limits.conf.
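The sysctl change can be applied without a reboot, and the per-user limits follow the standard limits.conf format. A hedged example for a user named mongodb (the user name and values are illustrative):

sysctl -p    # reload /etc/sysctl.conf so fs.file-max takes effect

# /etc/security/limits.conf
mongodb  soft  nofile  64000
mongodb  hard  nofile  64000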

By default the mongodb configuration file (usually mongodb.conf) does not specify the maximum number of connections. It depends directly on the number of available file descriptors. But you can control it using the maxConns variable. Suppose you want to set the maximum number of connections to 8000; you will have to put the following configuration line in your mongodb.conf file.

maxConns = 8000

Remember : mongodb cannot use more than 20,000 connections on a server.
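The connection usage of a running server can be checked from the mongo shell; db.serverStatus().connections reports the current and available counts:

mongo --eval "printjson(db.serverStatus().connections)"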

I recently came across a scenario where my mongodb which was using 7000 connections maxed out. I have a replica set configured, where there is a single master and multiple replicas. All reads and writes were happening on the master. With replica sets, the problem is that if the master mongo is restarted, any one of the replica mongo servers may become the master or primary. The TMC problem was caused by a missing index which caused multiple update queries to get queued up. To solve the problem, firstly the missing index was applied. Next all queries which were locking the collection had to be killed.

Here is a quick function that we were able to put together to first list all pending write operations and then kill them.

db.currentOp().inprog.forEach(
   function(d){
     // only kill queued write operations that are waiting for a lock
     if(d.waitingForLock && d.lockType != "read"){
        printjson(d.opid);
        db.killOp(d.opid);
     }
   })

Chevrolet Aveo ownership report

In 2006, when I was looking for a sedan to purchase, I had multiple options. Ford fiesta and hyundai verna had recently been released. Honda city was selling like hot cakes. I was opting for either ford or hyundai. But then, while discussing my options around, I came to know that the chevrolet aveo was also there in the same segment and had some features which really impressed me - like the ground clearance and the engine performance. Surprisingly, as per the maintenance figures shared by all three of them, the chevy was coming out to be cheaper to maintain. So I went ahead and got it.

For 4 years, I was extremely happy with the car's performance. I used to get a good average of around 12-13 kmpl. Maintenance was every 5000 kms and it cost me only around 3000/- per service. The car was peppy and light to drive. My daily driving was around 30-40 kms, so it was much cheaper to maintain.

Then my first accident happened - I was standing still while a truck chewed through the driver side of my car. The driver side door and fender were totally damaged. When I went to the GM workshop, I could see the "smile" on their faces - which I was able to decrypt later. I got a quote of 60K from the GM workshop. I went to many other workshops, but they denied having parts for chevrolet cars. Finally I was able to find a workshop which did the repair in around 20K - of which I had to pay around 10K.

This was the point where I should have realized that life would not be easy anymore with my aveo. The 25K service was very expensive - it cost me around 15K. My shockers had leaked. I could not feel any difference in the ride quality, but as per the recommendation of GM, I got them changed. From then onwards, every service started costing 8-10K.

My driving went up from 30 km per day to 100 km per day, and I would finish off 5000 kms in less than 3 months. Another service would cost me 8-10K. So in addition to the rising petrol prices, I had to pay around 4K per month on the servicing of the car. The previous shockers had a warranty of around 1 year, and immediately after finishing 1 year the shockers gave way. Got them changed again - paying from my own pocket.

At 34K kms, the default goodyear tyres that came with the car gave way. Once I had 3 punctures on the way to office on the same day. The next day, I took an off from office and got new tyres - which again cost me 18K. Another hit was immediately after the 35K service. At around 37K, my batteries gave way. Interestingly there is a cut in the battery fitted in the aveo, so only batteries from authorized workshops can be fitted. Also the complete computer resets while fitting the battery and needs the authorization code to enable the car to be driven further. It is advanced technology, but a pain. The battery which normally costs around 6-7K in the market cost me 9K when I got it from the authorized workshop.

When my 40K service again cost me 10K for all the injector cleaning, oil change and other stuff, I decided to call it quits. I dumped the car at home and got another one for office commuting. Decided to go for a tata safari, which is famous for some issues cropping up now and then. I thought that a chevy sitting at home which gets driven for around 200-300 kms in a month would be cheaper to maintain. But I was wrong.

After 41K kms, one fine day, I took the car to the GM workshop with the problem that the car keeps on racing - the engine RPM stays above 2000. They did a thorough cleanup of the car engine, charged me 7K for it and told me that the car's ECM had gone bad. For those who are not aware of the ECM, it is the computer sitting inside the car and it controls almost everything in the car including fuel, acceleration and other stuff. It is supposed to be very rugged and able to withstand the extreme dust and temperatures which are always present inside the car. It is said that if the ECM goes bad the car should not start at all. Surprisingly my car was starting and rolling, but was rolling very fast.

They told me that it would cost another 30K to replace my ECM - as electronics are not covered in car insurance. I raised the problem with GM india on facebook and they told me to take the car back. This time they did not charge me, but did some thorough examination of the car and again told me that the ECM had to be replaced. They offered me a discount of 1000 on labour charges. I was furious - a car which had been driven only for 41,000 kms had developed a bad ECM. Fortunately one of my friends knew a 3rd party agent who dealt in such stuff. I was able to get the car working for 18K with the agent's help. And sold off the car as soon as it was healthy - before anything else went wrong.

Today, in less than 2 years, my tata safari has done 40K kms. With a service interval of 15K kms and a service cost of around 5-6K, it is a much cheaper car to maintain, with fewer parts going bad.

The problem with GM cars is that they give you a happy feeling for the first 3 years with around 30-35,000 kms. After that the maintenance cost shoots up. After the free service from GM expires, and since there is no availability of parts in the open market, you are solely dependent on the authorized service centers to cheat you or charge you whatever they like. And since the cars are not very reliable, the resale price is very low. After 5 years of driving a GM, you may realize that the money you spent in getting and maintaining the car was too much.

Ford also follows a similar structure where parts are not readily available outside the authorized service centers. But still the availability is much better than that of a GM. Also ford cars are more reliable. Hyundai, maruti, tata and mahindra can be easily maintained outside the authorized service centers. And they are much cheaper to maintain as parts and service centers are easily available.

But I would clearly stay away from GM and would advise the same.

Saturday, June 15, 2013

Step by Step setting up solr cloud

Here is a quick step by step guide for setting up solr cloud on a few machines. We will be setting up solr cloud to match a production environment - that is - there will be a separate setup for zookeeper, and solr will be sitting inside tomcat instead of the default jetty.

Apache zookeeper is a centralized service for maintaining configuration information. In the case of solr cloud, the solr configuration information is maintained in zookeeper. Let's set up the zookeeper first.

We will be setting up solr cloud on 3 machines. Please make the following entries in the /etc/hosts file on all machines.

172.22.67.101 solr1
172.22.67.102 solr2
172.22.67.103 solr3

Let's set up the zookeeper cluster on 2 machines first.

Download and untar zookeeper into the /opt/zookeeper directory on both servers solr1 & solr2. On both servers do the following.

root@solr1$ mkdir /opt/zookeeper/data
root@solr1$ cp /opt/zookeeper/conf/zoo_sample.cfg /opt/zookeeper/conf/zoo.cfg
root@solr1$ vim /opt/zookeeper/conf/zoo.cfg

Make the following changes in the zoo.cfg file

dataDir=/opt/zookeeper/data
server.1=solr1:2888:3888
server.2=solr2:2888:3888

Save the zoo.cfg file.

server.x=[hostname]:nnnnn[:nnnnn] : here x should match the server id, which is stored in the myid file in the zookeeper data directory.

Assign different ids to the zookeeper servers

on solr1

root@solr1$ echo 1 > /opt/zookeeper/data/myid

on solr2

root@solr2$ echo 2 > /opt/zookeeper/data/myid

Start zookeeper on both the servers

root@solr1$ cd /opt/zookeeper
root@solr1$ ./bin/zkServer.sh start
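Once both are up, the ensemble state can be verified on each server - one node should report itself as leader and the other as follower:

root@solr1$ ./bin/zkServer.sh status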

Note : in future when you need to reset the cluster/shards information do the following

root@solr1$ ./bin/zkCli.sh -server solr1:2181
[zk: solr1:2181(CONNECTED) 0] rmr /clusterstate.json

Now let's set up solr and start it with the external zookeeper.

Install solr on all 3 machines. I installed it in the /opt/solr folder.
Start the first solr instance and upload the solr configuration into the zookeeper cluster.

root@solr1$ cd /opt/solr/example
root@solr1$ java -DzkHost=solr1:2181,solr2:2181 -Dbootstrap_confdir=solr/collection1/conf/ -DnumShards=2 -jar start.jar

Here the number of shards is specified as 2, so our cluster will have 2 shards with multiple replicas per shard.

Now start solr on remaining servers.

root@solr2$ cd /opt/solr/example
root@solr2$ java -DzkHost=solr1:2181,solr2:2181 -DnumShards=2 -jar start.jar

root@solr3$ cd /opt/solr/example
root@solr3$ java -DzkHost=solr1:2181,solr2:2181 -DnumShards=2 -jar start.jar

Note : It is important to put the "numShards" parameter, else numShards gets reset to 1.

Point your browser to http://172.22.67.101:8983/solr/
Click on "cloud"->"graph" on your left section and you can see that there are 2 nodes in shard 1 and 1 node in shard 2.

Let's feed some data and see how it is distributed across the shards.

root@solr3$ cd /opt/solr/example/exampledocs
root@solr3$ java -jar post.jar *.xml

Let's check how the data was distributed between the two shards. Head back to the Solr admin cloud graph page at http://172.22.67.101:8983/solr/.

Click on the first shard collection1 - you may have 14 documents in this shard.
Click on the second shard collection1 - you may have 18 documents in this shard.
Check the replicas for each shard - they should have the same counts.

At this point, we can start issuing some queries against the collection:

Get all documents in the collection:
http://172.22.67.101:8983/solr/collection1/select?q=*:*

Get all documents in the collection belonging to shard1:
http://172.22.67.101:8983/solr/collection1/select?q=*:*&shards=shard1

Get all documents in the collection belonging to shard2:
http://172.22.67.101:8983/solr/collection1/select?q=*:*&shards=shard2

Let's check what zookeeper has in its cluster state.

root@solr3$ cd /opt/zookeeper/
root@solr3$ ./bin/zkCli.sh -server solr1:2181
[zk: 172.22.67.101:2181(CONNECTED) 1] get /clusterstate.json
{"collection1":{
    "shards":{
      "shard1":{
        "range":"80000000-ffffffff",
        "state":"active",
        "replicas":{
          "172.22.67.101:8983_solr_collection1":{
            "shard":"shard1",
            "state":"active",
            "core":"collection1",
            "collection":"collection1",
            "node_name":"172.22.67.101:8983_solr",
            "base_url":"http://172.22.67.101:8983/solr",
            "leader":"true"},
          "172.22.67.102:8983_solr_collection1":{
            "shard":"shard1",
            "state":"active",
            "core":"collection1",
            "collection":"collection1",
            "node_name":"172.22.67.102:8983_solr",
            "base_url":"http://172.22.67.102:8983/solr"}}},
      "shard2":{
        "range":"0-7fffffff",
        "state":"active",
        "replicas":{"172.22.67.103:8983_solr_collection1":{
            "shard":"shard2",
            "state":"active",
            "core":"collection1",
            "collection":"collection1",
            "node_name":"172.22.67.103:8983_solr",
            "base_url":"http://172.22.67.103:8983/solr",
            "leader":"true"}}}},
    "router":"compositeId"}}

It can be seen that there are 2 shards. Shard 1 has 2 replicas and shard 2 has only 1 replica. As you keep on adding more nodes, the number of replicas per shard will keep on increasing.

Now let's configure solr to use tomcat, and add the zookeeper related and numShards configuration into tomcat for solr.

Install tomcat on all machines in the /opt folder, and create the following files on all machines.

cat /opt/apache-tomcat-7.0.40/conf/Catalina/localhost/solr.xml
<?xml version="1.0" encoding="UTF-8"?>
<Context path="/solr/home"
     docBase="/opt/solr/example/webapps/solr.war"
     allowLinking="true"
     crosscontext="true"
     debug="0"
     antiResourceLocking="false"
     privileged="true">

     <Environment name="solr/home" override="true" type="java.lang.String" value="/opt/solr/example/solr" />
</Context>

on solr1 & solr2 create the solr.xml file for shard1

root@solr1$ cat /opt/solr/example/solr/solr.xml

<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="true" zkHost="solr1:2181,solr2:2181"> 
  <cores defaultCoreName="collection1" adminPath="/admin/cores" zkClientTimeout="${zkClientTimeout:15000}" hostPort="8080" hostContext="solr">
    <core loadOnStartup="true" shard="shard1" instanceDir="collection1/" transient="false" name="collection1"/>
  </cores>
</solr>
 

on solr3 create the solr.xml file for shard2

root@solr3$ cat /opt/solr/example/solr/solr.xml

<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="true" zkHost="solr1:2181,solr2:2181"> 
  <cores defaultCoreName="collection1" adminPath="/admin/cores" zkClientTimeout="${zkClientTimeout:15000}" hostPort="8080" hostContext="solr">
    <core loadOnStartup="true" shard="shard2" instanceDir="collection1/" transient="false" name="collection1"/>
  </cores>
</solr>
 
 

Set the numShards variable as part of the solr startup environment variables on all machines.

root@solr1$ cat /opt/apache-tomcat-7.0.40/bin/setenv.sh
export JAVA_OPTS=' -Xms4096M -Xmx8192M -DnumShards=2 '

To be on the safer side, clean up the clusterstate.json in zookeeper. Now start tomcat on all machines and check the catalina.out file for errors, if any. Once all nodes are up, you should be able to point your browser to http://172.22.67.101:8080/solr -> cloud -> graph and see the 3 nodes which form the cloud.
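A quick way to confirm that the collection is reachable through tomcat is to fire a match-all query against any of the nodes, using the host and collection from this walkthrough:

curl "http://172.22.67.101:8080/solr/collection1/select?q=*:*&rows=0"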

Let's add a new node to shard 2. It will be added as a replica of the current node on shard 2.

http://172.22.67.101:8983/solr/admin/cores?action=CREATE&name=collection1_shard2_replica2&collection=collection1&shard=shard2
And let's check the clusterstate.json file now.

root@solr3$ cd /opt/zookeeper/
root@solr3$ ./bin/zkCli.sh -server solr1:2181
[zk: 172.22.67.101:2181(CONNECTED) 2] get /clusterstate.json
{"collection1":{
    "shards":{
      "shard1":{
        "range":"80000000-ffffffff",
        "state":"active",
        "replicas":{
          "172.22.67.101:8080_solr_collection1":{
            "shard":"shard1",
            "state":"active",
            "core":"collection1",
            "collection":"collection1",
            "node_name":"172.22.67.101:8080_solr",
            "base_url":"http://172.22.67.101:8080/solr",
            "leader":"true"},
          "172.22.67.102:8080_solr_collection1":{
            "shard":"shard1",
            "state":"active",
            "core":"collection1",
            "collection":"collection1",
            "node_name":"172.22.67.102:8080_solr",
            "base_url":"http://172.22.67.102:8080/solr"}}},
      "shard2":{
        "range":"0-7fffffff",
        "state":"active",
        "replicas":{
          "172.22.67.103:8080_solr_collection1":{
            "shard":"shard2",
            "state":"active",
            "core":"collection1",
            "collection":"collection1",
            "node_name":"172.22.67.103:8080_solr",
            "base_url":"http://172.22.67.103:8080/solr",
            "leader":"true"},
          "172.22.67.103:8080_solr_collection1_shard2_replica2":{
            "shard":"shard2",
            "state":"active",
            "core":"collection1_shard2_replica2",
            "collection":"collection1",
            "node_name":"172.22.67.103:8080_solr",
            "base_url":"http://172.22.67.103:8080/solr"}}}},
    "router":"compositeId"}}

Similar to adding more nodes, you can unload and delete a node in solr.

http://172.22.67.101:8080/solr/admin/cores?action=UNLOAD&core=collection1_shard2_replica2&deleteIndex=true
More details can be obtained from

http://wiki.apache.org/solr/SolrCloud

Wednesday, June 12, 2013

How bigpipe works

Bigpipe is a concept invented by facebook to help speed up page load times. It parallelizes browser rendering and server processing to achieve maximum efficiency. To understand bigpipe, let's see how a user request-response cycle is executed in the current scenario.

  • Browser sends an HTTP request to web server.
  • Web server parses the request, pulls data from storage tier then formulates an HTML document and sends it to the client in an HTTP response.
  • HTTP response is transferred over the Internet to browser.
  • Browser parses the response from web server, constructs a DOM tree representation of the HTML document, and downloads CSS and JavaScript resources referenced by the document.
  • After downloading CSS resources, browser parses them and applies them to the DOM tree.
  • After downloading JavaScript resources, browser parses and executes them.

In this scenario, while the web server is processing and creating the HTML document, the browser is idle and when the browser is rendering the html page, the web server remains idle.

The bigpipe concept breaks the page into smaller chunks known as pagelets, and makes page rendering on the browser and processing on the server side run as parallel processes, speeding up the page load time.

The request response cycle in the bigpipe scenario is as follows.

  • The browser sends an HTTP request to web server.
  • Server quickly renders a page skeleton containing the head section and a body with empty div elements which act as containers for the pagelets. The HTTP connection to the browser stays open as the page is not yet finished.
  • Browser will start downloading the bigpipe javascript library and after that it'll start rendering the page
  • The PHP server process is still executing and is building the pagelets. Once a pagelet has been completed, its results are sent to the browser inside a BigPipe.onArrive(...) javascript tag.
  • Browser injects the html code for the pagelet received into the correct place. If the pagelet needs any CSS resources those are also downloaded.
  • After all pagelets have been received the browser starts to load all external javascript files needed by those pagelets asynchronously.
  • After javascripts are downloaded browser executes all inline javascripts.

This results in a parallel system where, as the pagelets are being generated, the browser is rendering them. From the user's perspective the page is rendered progressively. The initial page content becomes visible much earlier, which dramatically improves the user perceived latency of the page.
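A highly simplified sketch of what such a response could look like - the pagelet ids and the BigPipe.onArrive payload format here are purely illustrative and not the exact Facebook API:

<html>
<head><script src="bigpipe.js"></script></head>
<body>
  <div id="pagelet_navbar"></div>  <!-- empty containers rendered immediately -->
  <div id="pagelet_feed"></div>
  <!-- the connection stays open; each pagelet is flushed as soon as the server finishes it -->
  <script>BigPipe.onArrive({"id": "pagelet_navbar", "content": "..."});</script>
  <script>BigPipe.onArrive({"id": "pagelet_feed", "content": "..."});</script>
</body>
</html>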

Source : https://www.facebook.com/note.php?note_id=389414033919
open bigpipe implementation : https://github.com/garo/bigpipe

Monday, June 10, 2013

Getting started with replication from MySQL to Mongodb

Use tungsten replicator to replicate between mysql and mongodb.

Mysql tables are equivalent to collections in mongodb. The replication works by replicating inserts and updates. But all DDL statements on mysql are ignored...

Replication in detail