glusterfs & synchronous data storage

Lab: installation & configuration of GlusterFS as a synchronous data storage solution.
By: Pascal Charest, free software consultant
Date: September, 2008.

Synchronization of files in a cloud environment is a challenge on the path to high-{availability, performance}. From simple load-balanced web sites to full-blown applications, some files always need to be in sync. Some people, for simplicity, rely on asynchronous transfer (e.g. rsync); others deploy bigger solutions (e.g. block device replication through DRBD, or shared storage through the AoE protocol with concurrency management by OCFS2) or even go for the “lazy”, “no-shared-storage” solution through NFS.

To address this problem in the PraizedMedia software stack, I decided to give the FUSE-based GlusterFS a try. Awesome, really! The technical knowledge needed to deploy a basic solution is very low. The modularity of the program also helps to have “something working right now”. This isn’t meant as a direct alternative to DRBD or a good SAN deployment, but in my use case it fits perfectly.

In this lab, I will guide you through the installation of GlusterFS on 2 networked systems. Both will be used as “server” & “client” for the GlusterFS filesystem. They will share a directory (on both systems: /mnt/production/brick), re-mounted as /mnt/production/static through GlusterFS. Any write I/O on this directory (from either system) will be synchronized to the pool. This last feature is called “AFR” (for automatic file replication) and is a module (called a translator) of the GlusterFS file system.

The specificity of my environment is around file-locking management: I don’t need any. By design, the application will never try to write the same file twice on any of the servers.
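To make the “never write the same file twice” rule concrete, here is a minimal sketch of a write-once helper. It is not the PraizedMedia code: the function name write_once is hypothetical, and it only guards against overwrites on the node where it runs. It is not a substitute for cluster-wide locking.

```shell
# write_once TARGET: copy stdin to TARGET, but refuse to overwrite an
# existing file. (Hypothetical helper, for illustration only.)
write_once() {
    target=$1
    # "set -C" (noclobber) makes the ">" redirection fail if TARGET
    # already exists; the subshell keeps the option from leaking out.
    ( set -C; cat > "$target" ) 2>/dev/null
}

# Example (path from this lab):
#   echo "hello" | write_once /mnt/production/static/hello.txt
```

The second attempt to write the same path returns a non-zero status and leaves the first version untouched.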

#Installation of requirements (standard tools)
apt-get install flex bison libfuse-dev linux-headers-`uname -r` curl

#download of the sources
cd /usr/local/src/
curl -O http://ftp.zresearch.com/pub/gluster/glusterfs/1.3/glusterfs-CURRENT.tar.gz
tar zxf glusterfs-CURRENT.tar.gz


# configure
cd glusterfs-1.3.11
./configure --prefix=/usr/local/glusterfs-1.3.11
make && make install
ln -s /usr/local/glusterfs-1.3.11 /usr/local/glusterfs


So we now have a basic GlusterFS installation on our 2 servers. Let’s be honest, that wasn’t really hard! We are still missing the configuration files, though.

#Editing /usr/local/glusterfs/etc/glusterfs/glusterfs-server.vol
#
# glusterfs-server definition
# volume definitions are on the first level, everything else is on the second level (indented)
volume brick
  type storage/posix
  option directory /mnt/production/brick
end-volume

volume server
  type protocol/server
  option transport-type tcp/server
  option auth.ip.brick.allow *
  subvolumes brick
end-volume


#Editing the /usr/local/glusterfs/etc/glusterfs/glusterfs-client.vol
#
# glusterfs-client.vol
# volume definitions are on the first level, everything else is on the second level (indented)
#
volume remote1
  type protocol/client
  option transport-type tcp/client
  option remote-host 002.praized.com
  option remote-subvolume brick
end-volume

volume remote2
  type protocol/client
  option transport-type tcp/client
  option remote-host 001.praized.com
  option remote-subvolume brick
end-volume

volume mirror0
  type cluster/afr
  subvolumes remote1 remote2
end-volume


#Launching services (servers and clients)
mkdir -p /mnt/production/brick
/usr/local/glusterfs-1.3.11/sbin/glusterfsd -f /usr/local/glusterfs-1.3.11/etc/glusterfs/glusterfs-server.vol

mkdir -p /mnt/production/static
/usr/local/glusterfs-1.3.11/sbin/glusterfs -f /usr/local/glusterfs-1.3.11/etc/glusterfs/glusterfs-client.vol /mnt/production/static/
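A quick way to convince yourself that AFR is doing its job: write a file through the mount point and look for it in the brick. The sketch below assumes the paths used in this lab; the function name check_replica and the marker file name are hypothetical.

```shell
# check_replica MOUNT_DIR BRICK_DIR
# Writes a marker file through MOUNT_DIR and checks that it shows up
# in BRICK_DIR. Returns 0 if found, 1 if missing, 2 if the write failed.
# (Hypothetical helper, for illustration only.)
check_replica() {
    mount_dir=$1
    brick_dir=$2
    marker="afr-check-$(uname -n)-$$"

    echo "replication test" > "$mount_dir/$marker" || return 2

    if [ -f "$brick_dir/$marker" ]; then
        status=0
    else
        status=1
    fi

    # Clean up through the mount so every replica drops the marker.
    rm -f "$mount_dir/$marker"
    return $status
}

# Example (paths from this lab):
#   check_replica /mnt/production/static /mnt/production/brick \
#       && echo "replication OK" || echo "marker not found in brick"
```

To check the remote brick as well, create a file in /mnt/production/static on one server and list /mnt/production/brick on the other.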


You now have a synchronized directory between your two systems. Please note that GlusterFS requires TCP port 6996 to be open. This setup can also be improved by adding a locking mechanism & I/O threads; I don’t currently need them, but you might.
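If you do need locking and I/O threads, the server volfile could look something like the sketch below. I have not run this variant in my setup: the translator names (features/posix-locks, performance/io-threads) and the thread-count option are as I remember them from the 1.3 documentation, and the volume names posix and locks and the value 4 are only examples. Check them against the documentation for your version.

```text
# glusterfs-server.vol, with locks and I/O threads stacked on the brick
volume posix
  type storage/posix
  option directory /mnt/production/brick
end-volume

volume locks
  type features/posix-locks
  subvolumes posix
end-volume

volume brick
  type performance/io-threads
  option thread-count 4
  subvolumes locks
end-volume

volume server
  type protocol/server
  option transport-type tcp/server
  option auth.ip.brick.allow *
  subvolumes brick
end-volume
```

The top of the stack keeps the name brick, so the client volfile and the auth.ip.brick.allow line do not have to change.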
Enjoy!

Debugging notes: after starting the server process you should have a kernel process called glusterfs. All log files are in /usr/local/glusterfs/var/log/glusterfs*. After starting the client, “df -h” should show you your new mount point. Be careful with UID/GID (& permissions): there is no such thing as root_squash_fs in GlusterFS yet.
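The “df” check can be scripted. A minimal sketch, assuming a POSIX df and awk; the function name is_mountpoint is hypothetical, and mount points containing spaces are not handled.

```shell
# is_mountpoint DIR: succeeds only if DIR is itself a mount point.
# (Hypothetical helper, for illustration only.)
is_mountpoint() {
    dir=${1%/}                 # drop a trailing slash, if any
    [ -n "$dir" ] || dir=/     # "/" became empty above; restore it
    [ -d "$dir" ] || return 1
    # df -P prints one POSIX-format line per file system; the last
    # column is the mount point that holds DIR.
    mounted_on=$(df -P "$dir" | awk 'NR == 2 { print $NF }')
    [ "$mounted_on" = "$dir" ]
}

# Example (paths from this lab):
#   is_mountpoint /mnt/production/static || echo "client is not mounted"
#   ls -l /usr/local/glusterfs/var/log/glusterfs*
```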


Other notes: using Amazon EBS would have been the perfect solution if they allowed mounting a volume on multiple servers and let us deal with concurrency / lock problems. But they don’t.

@ Linux Symposium - SynergyFS

SynergyFS,
by Keun Soo Yim from Samsung @ Ottawa Linux Symposium

Ok, that was a flop:
- We get it: solid state drives (SSD) are faster and have a smaller energy footprint than hard disk drives (HDD). With a general engineering background, most of the group already knew that there are no moving parts… No need to show us a 10-minute video of Windows Vista booting, of people sitting in a plane, of someone trying to break a laptop… This time would have been better spent giving out actual technical details about the file system.

- Question: “Can we see what SynergyFS looks like, how it fares in benchmarks?”, answer: “not without signing an NDA”.
- Question: “Is the source available for SynergyFS?”, answer: “no, GPL really is a bad idea for a business, you could distribute our code afterward”.

… yeah, well… that was a waste of my time.

@ Linux Symposium - Green computing in clusters

Applying green computing to clusters,
by Steven Alan DuChene from SGI @ Ottawa Linux Symposium

The presentation is a high-level review of the metrics that are wanted and/or required to have controlled cluster computing, and of the road map to the creation of an “environmentally aware” job scheduler / resource manager.

It feels funny to hear, about a data center installation, the same talk that has been around for home automation. It especially seems like a cheat to apply it to high performance computing, since semi-random loads are still a big part of the load of a data center… and in an HPC setup, the race to idle is normally the easiest and lowest power consumption path.

Also weird: there is no mention of the advantages that cloud computing & fast provisioning through virtualization can bring to a green data center.

@ Linux Symposium - Kernel documentation

Where Linux Kernel Documentation Hides,
By Rob Landley @ Ottawa Linux Symposium

Nice presentation about the difficulties of dynamic documentation and its interaction with {users, developers, Google, indexers}. The main problems are that documentation efforts don’t scale and that kernel developers are not really editors. Some past decisions make it possible to push patches that break the “make html” process, and thus documentation is one of the less stable parts of the “current kernel”.

Very interesting talk. It kind of pitches the incessant fight between {developer, user} blogs, wikis, static web pages, dynamic web pages, the kernel mailing list, git commit messages, man pages, info pages… and the effort to normalize everything.

@ Linux Symposium - Cloud Computing

Cloud Computing: Coming out of the Fog
By Gerrit Huizenga, from IBM @ Ottawa Linux Symposium

It’s very strange to hear a talk about cloud computing from an IBM employee, since they have not yet shown any serious stats about their Blue Cloud system. Still, it’s a good review of the cloud computing field.

Despite the fact that the presentation is built upon the statement that cloud computing isn’t about server provisioning, it clearly revolves around the following two points of view:

From outside of the cloud :

You want your applications (complex systems) to be deployed fast, with next to no configuration to be done. The 3Tera system is shown as a “good” way of doing that; personally, having built something similar for a client, I think Amazon EC2 is also a good contender for the title. From my POV, this is really about provisioning and the capacity to build virtual appliances.

From inside of the cloud :

You want to have a fully (automatically, dynamically) managed data center. The technology is already there. This is ALL about server provisioning.

The presentation moved from this “review of definitions” to “why it is presently not everywhere” and “how to build a general interface for cloud systems”. I guess this speaker is reading the cloud computing mailing list at Google Groups, since those are hot subjects right now.

As a closing note (this wasn’t mentioned in the presentation): here is a quick stock index of related corporations.