Thursday, February 08, 2007

memcache

Something new, something exciting, something which makes your brain churn... That's what this blog is all about...

Hmm, so what are we to explore today...

Every machine has some amount of RAM where the OS and programs store frequently accessed data. Have you ever tried storing arrays in memory so that you can access them very frequently? You might have. Even I have done something similar: storing a binary tree in memory as a doubly linked list, so that traversing the tree becomes fast.

Here is something to ponder over, known as memcache. It is defined as "A high performance, distributed memory object caching system". You can get it here: http://www.danga.com/memcached/

How does this work? Well, first just download the tar.gz source file, untar it and compile it. A simple ./configure, make and make install would do.

So you will have the "memcached" binary ready. Then all you have to do is run the memcached binary in daemon mode and assign it some amount of memory in which it can store data. Just run memcached -h and it will list all the available options:

[jayant@jayant memcached-1.2.1]$ ./memcached -h
memcached 1.2.1
-p TCP port number to listen on (default: 11211)
-U UDP port number to listen on (default: 0, off)
-s unix socket path to listen on (disables network support)
-l interface to listen on, default is INDRR_ANY
-d run as a daemon
-r maximize core file limit
-u <username> assume identity of <username> (only when run as root)
-m max memory to use for items in megabytes, default is 64 MB
-M return error on memory exhausted (rather than removing items)
-c max simultaneous connections, default is 1024
-k lock down all paged memory
-v verbose (print errors/warnings while in event loop)
-vv very verbose (also print client commands/reponses)
-h print this help and exit
-i print memcached and libevent license
-b run a managed instanced (mnemonic: buckets)
-P <file> save PID in <file>, only used with -d option
-f chunk size growth factor, default 1.25
-n minimum space allocated for key+value+flags, default 48


To start memcached in daemon mode with 128 MB of RAM, listening on localhost port 11211, run the following command:

memcached -d -m 128 -l 127.0.0.1 -p 11211

Using the options listed above, you can configure memcached as per your needs.

So now the server is up and running, and you need a client to connect to it and store data there. For that, there are APIs available for different languages which allow you to connect to the memcached daemon and store/retrieve variables, arrays and objects. APIs for Perl, Python, Ruby, Java, C# and C are available on the website.

Since I generally work in PHP, I wanted an API for PHP. The simplest way to install a memcache API for PHP is to run the following command as root:

pecl install memcache

It will automatically download, compile and install the memcache API for php.

Cool, so now we are ready. We have the server running and the client API ready. All we need to do now is build a program which puts information into memcache and gets it back. I will stick to PHP for this. http://in.php.net/manual/en/ref.memcache.php lists the functions available in the memcache extension. I won't be giving you a detailed program on how to use the memcached API, but just to give you an idea...

You will have to connect to the memcached daemon using the memcache_connect function and then use memcache_add, memcache_get, memcache_set and memcache_delete to add, retrieve, update and delete objects in the memcached daemon.
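To get a feel for what those calls boil down to, here is a rough sketch of the requests a client sends over the wire, written in Python using only the standard library. It is not the PHP extension's code; the function names (build_store_command, send_command and so on) are made up for illustration, and send_command assumes the reply is small enough to arrive in a single read.

```python
import socket


def build_store_command(verb, key, value, flags=0, exptime=0):
    """Build a "set" or "add" request in memcached's text protocol."""
    data = value.encode("utf-8")
    # Header line: <verb> <key> <flags> <exptime> <byte count>, then the data block.
    header = f"{verb} {key} {flags} {exptime} {len(data)}\r\n".encode("ascii")
    return header + data + b"\r\n"


def build_get_command(key):
    """Build a "get" request for a single key."""
    return f"get {key}\r\n".encode("ascii")


def build_delete_command(key):
    """Build a "delete" request for a single key."""
    return f"delete {key}\r\n".encode("ascii")


def send_command(command, host="127.0.0.1", port=11211):
    """Send one request to the daemon and return the raw reply bytes.

    Assumption: the reply is short and fits into one recv() call.
    """
    with socket.create_connection((host, port), timeout=2) as sock:
        sock.sendall(command)
        return sock.recv(4096)
```

With the daemon from earlier listening on 127.0.0.1:11211, send_command(build_store_command("set", "greeting", "hello")) should come back with a STORED reply, and send_command(build_get_command("greeting")) should return the value framed between a VALUE line and an END line.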

Points to ponder upon:

1. Memcache is very fast. It uses libevent to scale to a large number of open connections.
2. You can start any number of memcached servers on different machines. Different instances of the server do not replicate data to each other. Instead, the client API treats the servers as a pool and distributes the stored data among them.
3. LiveJournal.com uses memcache heavily for serving dynamic pages. It helped them reduce page load times and greatly reduce database load.
4. Memcache does not store objects in a language-independent form. So if you use the PHP API to store an object, you cannot use the Java API to extract it. Well, even if you do extract it, you won't be able to make sense of the object; it would be incomprehensible.
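Point 2 is easier to picture with a small sketch. The idea is that the client, not the server, decides which machine holds a given key. The Python below is only an illustration under assumptions: the server addresses are placeholders, pick_server is a made-up name, and real client libraries use their own hashing schemes, which may differ from the simple CRC32-modulo approach shown here.

```python
import zlib

# Hypothetical pool; these addresses are placeholders, not real servers.
SERVERS = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"]


def pick_server(key, servers=SERVERS):
    """Map a key to one server in the pool by hashing the key.

    CRC32 is deterministic, so every client using the same server list
    sends the same key to the same server.
    """
    index = zlib.crc32(key.encode("utf-8")) % len(servers)
    return servers[index]
```

Since the servers never talk to each other, a key lives on exactly one machine. With this simple modulo scheme, adding or removing a server changes where most keys map, so much of the cache would have to be refilled.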

If you can figure out a way to serialize objects from different languages in one well-defined, shared format, then the same object could be stored in memcached by one language and read back by another. This should make things much simpler.
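One possible shared format is JSON, since most languages can read and write it. The sketch below is just one way to do it, not something memcached or its client APIs provide; to_cache_value and from_cache_value are hypothetical helper names, and it only handles plain data (dicts, lists, strings, numbers), not arbitrary class instances.

```python
import json


def to_cache_value(obj):
    """Turn plain data (dict, list, str, number) into a JSON string.

    The string is what would be stored in memcached; sorting the keys
    keeps the output stable for the same input.
    """
    return json.dumps(obj, sort_keys=True)


def from_cache_value(text):
    """Turn a JSON string read back from the cache into plain data."""
    return json.loads(text)
```

A PHP or Java client that reads the stored string can decode it with its own JSON library, so each side only needs to agree on the field names.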
