Sunday, January 08, 2006

Understanding memcached memory management

memcached is a very fast program for caching data that lives in a database, so that it can be read quickly and the load on your database servers drops dramatically. It's one of those "must haves" if you develop a highly trafficked website with a database backend. The internals of memcached can be a bit confusing at first: the details aren't documented well, and some of the tools don't make it obvious what they do. I'll try to explain here by example.

Let's say you have a memcached setup and have allocated 256MB of RAM to the program. memcached works hard to avoid memory fragmentation. It does this by dividing the memory into classes, numbered 6 through 17 (don't ask). When you insert an item, memcached looks at its size and stores it in the smallest class that is big enough to hold it. See the chart below. The classes in use in the memcached below range in size from 128B to 16KB. Ignore the Max_age column for now. Each class that has data is allocated at least one 1MB page. So class 7 has four 1MB pages, each item in that class uses 128B, and the class is full, so it holds roughly 32,768 items (4 x 1MB / 128B). (If you insert an item of, say, 100B, it still takes up a whole 128B slot; that is the price of avoiding fragmentation.)
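
To make the class layout concrete, here is a minimal Python sketch of how an item size maps to a class. It assumes the simple model in this post (class N holds 2**N-byte slots, pages are exactly 1MB) and ignores any per-item overhead, so treat it as an illustration rather than memcached's actual code. The names class_for_size and items_per_page are made up for this example.

```python
PAGE_SIZE = 1024 * 1024  # 1MB page, as described in the post
MIN_CLASS = 6            # smallest class: 2**6 = 64 bytes
MAX_CLASS = 17           # largest class: 2**17 = 128KB


def class_for_size(nbytes):
    """Return (class_number, slot_size) for an item of nbytes bytes.

    Assumption: class N holds slots of exactly 2**N bytes.
    """
    if nbytes < 1:
        raise ValueError("item size must be positive")
    for cls in range(MIN_CLASS, MAX_CLASS + 1):
        slot = 2 ** cls
        if nbytes <= slot:
            return cls, slot
    raise ValueError("item too large for any class")


def items_per_page(slot_size):
    """Number of slots of slot_size bytes that fit in one 1MB page."""
    return PAGE_SIZE // slot_size


if __name__ == "__main__":
    print(class_for_size(100))      # a 100B item lands in the 128B class
    print(4 * items_per_page(128))  # slots available in four pages of class 7
```

Under this model a 100B item wastes 28B of its slot, which is the trade memcached makes to keep every slot in a class the same size.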

Now if we try adding to this cache, we won't necessarily get an "out of memory" error, because memcached uses an LRU (least recently used) algorithm: when a class is full, it kicks out the item in that class that was accessed least recently. But if we attempt to insert an item of, say, 15B, we will get an "out of memory" error, because class 6 (the 64B class) has no items and thus no pages assigned to it. All pages are already in use, so memcached cannot allocate a 1MB page to class 6. To make this work we have to move a page from another class to class 6.
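
The behaviour described above can be mimicked with a toy model. This is only a sketch under the assumptions of this post (power-of-two classes, 1MB pages, one LRU list per class); it is not how memcached is implemented, and ToySlabCache and its methods are hypothetical names.

```python
from collections import OrderedDict

PAGE_SIZE = 1024 * 1024  # 1MB pages, as in the post


class ToySlabCache:
    """Toy model: pages are handed to classes, each class has its own LRU."""

    def __init__(self, total_pages):
        self.free_pages = total_pages
        self.pages = {}  # class number -> pages assigned to it
        self.items = {}  # class number -> OrderedDict, oldest first

    @staticmethod
    def class_for(nbytes):
        cls = 6  # smallest class is 64B
        while 2 ** cls < nbytes:
            cls += 1
        return cls

    def capacity(self, cls):
        return self.pages.get(cls, 0) * (PAGE_SIZE // 2 ** cls)

    def set(self, key, value):
        cls = self.class_for(len(value))
        lru = self.items.setdefault(cls, OrderedDict())
        if key in lru:
            lru.pop(key)  # overwrite: reuse the existing slot
        elif len(lru) >= self.capacity(cls):
            if self.free_pages > 0:
                self.free_pages -= 1  # grab an unassigned page
                self.pages[cls] = self.pages.get(cls, 0) + 1
            elif self.capacity(cls) == 0:
                # No page in this class and none left to hand out.
                raise MemoryError("out of memory")
            else:
                lru.popitem(last=False)  # evict the least recently used item
        lru[key] = value

    def get(self, key):
        for lru in self.items.values():
            if key in lru:
                lru.move_to_end(key)  # mark as most recently used
                return lru[key]
        return None

    def move_page(self, src, dst):
        if self.pages.get(src, 0) == 0:
            raise ValueError("source class has no pages")
        self.pages[src] -= 1
        self.pages[dst] = self.pages.get(dst, 0) + 1
        lru = self.items.get(src, OrderedDict())
        while len(lru) > self.capacity(src):
            lru.popitem(last=False)  # items that no longer fit are dropped
```

With a single page that is already owned by class 14, a 2-byte set fails with MemoryError until move_page(14, 6) hands the page over, which is the same sequence walked through in the rest of this post.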

Let's examine how this is done:

First, let's get a snapshot of our current allocations:

[root@sql-slave1 bin]# ./memcached-tool localhost
#   Item_Size     Max_age  1MB_pages  Full?
6           B         0 s          0    yes
7       128 B    742942 s          4    yes
8       256 B   1175448 s          2    yes
9       512 B   1279471 s          1    yes
10       1 kB   6493931 s          1    yes
11       2 kB    434740 s         71     no
12       4 kB    917212 s        140    yes
13       8 kB    794005 s         30     no
14      16 kB   1373021 s          7    yes
15          B         0 s          0    yes
16          B         0 s          0    yes
17          B         0 s          0    yes
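
As a quick check that every page really is spoken for, the page counts from the snapshot above can be added up. The numbers below are copied from the 1MB_pages column; the variable names are just for this sketch.

```python
# 1MB_pages per class, copied from the memcached-tool snapshot above.
# Classes 6, 15, 16 and 17 have no pages and are left out.
pages_by_class = {7: 4, 8: 2, 9: 1, 10: 1, 11: 71, 12: 140, 13: 30, 14: 7}

allocated_mb = 256  # RAM given to memcached in this example
total_pages = sum(pages_by_class.values())
free_pages = allocated_mb - total_pages

print("pages in use:", total_pages)
print("pages free:  ", free_pages)
```

The sum comes to 256, so there is no unassigned page left to give to class 6.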


Now, let's go ahead and try to insert 2 bytes of data:

[root@sql-slave1 bin]# telnet localhost 11211
Trying 127.0.0.1...
Connected to localhost (127.0.0.1).
Escape character is '^]'.
set key1 0 0 2
SERVER_ERROR out of memory


OK, just like I thought, memcached can't allocate a 1MB page, so let's help it out:

[root@sql-slave1 bin]# ./memcached-tool localhost move 14 6
Success.


Now let's look at our pages again. As you can see, one page has been moved from class 14 to class 6. The LRU items in class 14 go bye-bye.

[root@sql-slave1 bin]# ./memcached-tool localhost
#   Item_Size     Max_age  1MB_pages  Full?
6        64 B         0 s          1     no
7       128 B    743011 s          4    yes
8       256 B   1174812 s          2    yes
9       512 B   1279618 s          1    yes
10       1 kB   6494078 s          1    yes
11       2 kB    434887 s         71     no
12       4 kB    917208 s        140    yes
13       8 kB    794152 s         30     no
14      16 kB   1373168 s          6    yes
15          B         0 s          0    yes
16          B         0 s          0    yes
17          B         0 s          0    yes


Let's insert our data again:

[root@sql-slave1 bin]# telnet localhost 11211
Trying 127.0.0.1...
Connected to localhost (127.0.0.1).
Escape character is '^]'.
set key1 0 0 2
aa
STORED
quit
Connection closed by foreign host.
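
The same set can be sent from a script instead of telnet. Below is a minimal Python sketch of the text command used above ("set key1 0 0 2" followed by the data); the host, port and timeout defaults are assumptions, and build_set_command and memcached_set are names invented for this example.

```python
import socket


def build_set_command(key, value, flags=0, exptime=0):
    """Build the bytes for: set <key> <flags> <exptime> <bytes>, then the data."""
    header = "set {} {} {} {}\r\n".format(key, flags, exptime, len(value))
    return header.encode("ascii") + value + b"\r\n"


def memcached_set(key, value, host="localhost", port=11211, timeout=2.0):
    """Send one set command and return the server's reply line.

    The reply is expected to be STORED, or an error such as
    SERVER_ERROR out of memory when the class has no page.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_set_command(key, value))
        reply = sock.recv(1024)
    return reply.decode("ascii", "replace").strip()


if __name__ == "__main__":
    # Requires a memcached listening on localhost:11211.
    print(memcached_set("key1", b"aa"))
```

A script like this makes it easy to probe each class with a test item and see which ones answer with an out-of-memory error.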


Cool, it worked. Looking at our stats shows the item is aging:

[root@sql-slave1 bin]# ./memcached-tool localhost
#   Item_Size     Max_age  1MB_pages  Full?
6        64 B        14 s          1     no
7       128 B    742970 s          4    yes
8       256 B   1174852 s          2    yes
9       512 B   1279658 s          1    yes
10       1 kB   6494118 s          1    yes
11       2 kB    434918 s         71    yes
12       4 kB    917228 s        140    yes
13       8 kB    794192 s         30     no
14      16 kB   1373208 s          6    yes
15          B         0 s          0    yes
16          B         0 s          0    yes
17          B         0 s          0    yes

Conclusion: you'd better keep an eye on the data in your memcached, and knowing what type of data you are storing will help you maintain it. Writing scripts to automate moving pages around based on that data should be fairly easy.
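
Here is one way such a script might decide which class should give up a page. The heuristic is my own assumption, not anything memcached prescribes: pick the class with more than one page whose oldest item (Max_age) is the oldest, on the theory that it is under the least eviction pressure. The function names are hypothetical; the numbers come from the first snapshot in this post.

```python
def choose_donor(stats, min_pages=2):
    """Pick the class to take a page from, or None if no class qualifies.

    stats maps class number -> {"max_age": seconds, "pages": count}.
    Assumed heuristic: highest Max_age among classes with spare pages.
    """
    candidates = {cls: s for cls, s in stats.items() if s["pages"] >= min_pages}
    if not candidates:
        return None
    return max(candidates, key=lambda cls: candidates[cls]["max_age"])


def move_command(src, dst, host="localhost"):
    """Format the memcached-tool invocation used earlier in this post."""
    return "./memcached-tool {} move {} {}".format(host, src, dst)


# Values copied from the first memcached-tool snapshot.
snapshot = {
    7: {"max_age": 742942, "pages": 4},
    8: {"max_age": 1175448, "pages": 2},
    9: {"max_age": 1279471, "pages": 1},
    10: {"max_age": 6493931, "pages": 1},
    11: {"max_age": 434740, "pages": 71},
    12: {"max_age": 917212, "pages": 140},
    13: {"max_age": 794005, "pages": 30},
    14: {"max_age": 1373021, "pages": 7},
}

if __name__ == "__main__":
    donor = choose_donor(snapshot)
    print(move_command(donor, 6))
```

A real script would want more care than this, for example rate limiting the moves and checking that the receiving class actually needs the page.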

Also, there are some little-known commands that I use a lot:
stats
stats items
stats slabs
stats sizes
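
If you want to read these from a script, the replies come back as lines of the form "STAT name value" terminated by "END". The sketch below parses such a reply into a dictionary; parse_stats is a name made up here, and the sample values are invented purely to exercise the parser.

```python
def parse_stats(reply):
    """Parse a stats reply ("STAT <name> <value>" lines, then "END")."""
    stats = {}
    for line in reply.splitlines():
        if line == "END":
            break
        parts = line.split(" ", 2)
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


if __name__ == "__main__":
    # Made-up sample reply, not output captured from a real server.
    sample = "STAT curr_items 42\r\nSTAT bytes 1024\r\nEND\r\n"
    print(parse_stats(sample))
```

Values are kept as strings on purpose, since different stats commands return different kinds of values.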

memcached is a great piece of software, many thanks to the Danga team.

5 comments:

Anonymous said...

This is gold. This is just pure gold. Thanks!

d43m0n said...

numbered 6 through 17 (don't ask)
Why? =:p
Because the sizes are powers of 2, so the class number also tells you the size of the class.
For example, numbering begins with 6, and the size of the smallest class is 64 bytes, which is 2 to the 6th power.

Anonymous said...

That was quite helpful material, thanks.
I have a question: can we manipulate memcached's cleanup (the LRU mechanism) programmatically or with the help of some conf file?

Kevin Minnick said...

Not that I'm aware of...

Chan Kun Juan said...

Old post New Comment:
Hi,

Sorry, I have a question. Can we hint or influence the LRU? We want to use memcached to make sessions persist in a round-robin setup. I need to give sessions the highest priority instead of having them discarded by the LRU algorithm. I want to configure it in such a way that the only thing that can discard a session is the max_age and not lack of memory.