Archive - sysadmin

Microsoft & Open Source

6. Use the Open Source certification mark to keep things pure

One of the threats we faced was the possibility that the term ‘open source’ would be "embraced and extended" by Microsoft or other large vendors, corrupting it and losing our message.
[...]   
It eventually developed that the U.S. Patent and Trademark office would not issue a trademark for such a descriptive phrase [Open Source]
[...]
The sorts of serious abuse we feared have not (at least, not yet as of November 2000) actually materialized.

The Cathedral & the Bazaar :: Revenge of the Hackers, Eric S. Raymond, page 178

A couple of years ago, I almost skipped this part of the Revenge of the Hackers manifesto, since it seemed evident that Microsoft would not try something that obvious. I was so wrong…

It did take them a couple of years to figure it out, but now they are pushing events like "{Open Source} Heroes happen here", where you can [...]

Order your own Hero Hack Pack and get started with Open Source. Each Hack Pack contains a trial copy of Windows Server 2008 and Visual Studio 2008, plus a chance to win a free pass to OSCON 2008!

I am well aware that free software doesn’t need to be free "as in beer", yet it kind of makes me sad that people can pull that kind of scheme and still sleep at night. They are clearly piggybacking on a wave they don’t control and have nothing to do with. You receive a trial version of a closed-source, non-free operating system, to run a closed-source, also non-free, development tool, in order to program open source. Hey, you can’t say no to that offer!

Why don’t they start their own little revolution and try to gain leverage… because, let’s be honest here, Microsoft doesn’t even own any open source application. They have the money and the technical talent to, once again, shake the whole computing world with innovation. Their loud "we support open source" is based on money they gave to their own competitors while keeping their own standards closed – they really should not play on this level.

Port 25, Microsoft’s blogging platform for their open source community, is more of a joke than anything else. They have some freak labs, access to some of the best programmers in the world and… their messages all look, at least to me, like:

Microsoft is involved in open source since the Apache web server runs on their servers. That’s without saying anything about the price/availability of IIS… And this is no joke, the last few posts on Port 25 are installation procedures for Apache.

Microsoft is involved in open source since they have a partnership with Xen (this is more visible on their website than on the Port 25 blog). They still push forward their own virtualisation system, but with VMware currently stealing all the "high value targets" and Xen taking everything else… there is not much left for Virtual PC.

But there is one thing that does intrigue me: Shared Source at Microsoft. Well designed, this program could have some leverage, but I guess that everything is released under a reference-only licence, that a very big entrance fee is required and that your soul must be sold to the devil, or something like that.

And when they really want to go open source, the underlying "help us get more money" is so evident that there isn’t even any fun in reporting it. The project looks cool though: Singularity.

Update: I’ve been pointed toward this image; sad that it hasn’t been kept up to date.

A walk in the cloud

Executive summary: give me $10k, a month, 3 PowerEdge servers and a gigabit-capable switch, and I’ll build you a scalable cloud infrastructure ;-).

And, the post:

Last year’s dominant meme was "Virtualization". Since you can’t have the same focus for two consecutive years (there must be a law about that written somewhere), they (for various definitions of "they") had to enhance it. Here comes "Cloud Computing".

Cloud computing, as defined here, here, here, here, here and… is still in its condensation phase. Ideas appear and usability should emerge… soon.

While this is concentrated fun for theorists, I would prefer a more technical discussion. I am aware of Montreal-based corporations currently studying cloud/grid systems. One of the next big players in Montreal/North-Eastern USA might be iWeb Technologies – they already have hardware, a customer base and so much to gain from the scalability aspect of cloud computing. Think about dynamically shutting down unused shared hosting systems and relocating instances according to their impact on server resources. A lot of other corporations are also present in the field.

But I don’t have access to the same quantity of hardware as they have, so let’s see what is available / can be built in my small lab.

SunGrid Engine, as an online service with no hardware needed, has more of a grid heritage than a cloud computing future. Applications are launched, run, and a specific output is gathered and sent back. The list of applications, while impressive, doesn’t include "Apache" – this is a system meant for raw processing power, not for offering services.

IBM’s BlueCloud is still more of a vapor cloud around a press release than anything that has to do with computing. Though I’m sure it looks awesome in their lab. But, again, I’m sure their whole lab looks nice.

3Tera’s AppLogic does look neat, yet there is no public price tag. This also looks like the kind of system that is built around templates "which should not be modified". I have no idea how reliable the system stays once customizations are made. And I won’t know… no price tag is a straight no-go for me. If you are ashamed of your pricing model, there is a problem. If that’s not the case, there is no reason not to show "figures".

Another online service, Amazon AWS (EC2 & S3), is one of the current market leaders. It is based on Xen, and you can have a remote instance for a couple of cents an hour. The main concern with EC2 is the volatile nature of the instance storage, which kind of defeats the real purpose of most services: dealing with information.

So?

While I don’t have much hardware, I still have a lab of 4 dev + 2 prod systems. Let’s see what can be done. Let’s design a home-brewed cloud infrastructure.

Nodes types
ConfigNode:
    role: CNode is a standard Debian system. It is the DHCP + PXE + TFTP server. It holds the HardwareNode kernel. All cloud configuration happens on these systems.
    min: 1 sys.
    normal: 2 sys. {Primary/Slave}, with software RAID + DRBD + heartbeat.
    Scalable: no need. 2 systems are more than enough, there isn’t really any CPU/network load.

StorageNode:
    role: SNode is a network-booted GNU/Linux system. It serves AoE devices on the network. All nodes (except ConfigNode) use an SNode as their root filesystem.
    min: 1 sys.
    preferred: 2 sys. {Primary/Primary}, with software RAID + DRBD. MD-device multipathing is required on the clients to preserve the P/P coherence and the resilience to network failure.
    Scalable: this is a building block. The limit of SNode is defined by the network fabric speed.

HardwareNode:
    role: HNode is a network-booted GNU/Linux Xen dom0 system. It uses an SNode array as its root filesystem. This is where Instances will be launched. This node is diskless.
    min: 1 sys.
    preferred: no limit.
    Scalable: this is a building block of the infrastructure. The limit of HNode is defined by the acceptable speed of the root filesystem located on an SNode.

Instance:
    role: an Instance is a network-booted GNU/Linux Xen domU system. In the presence of VT technologies, it can also be an unmodified guest operating system (read: full-fledged GNU/Linux or Microsoft Windows). It is started on a specific HNode using SNode resources.
    min: 1 sys.
    preferred: no limit.
    Scalable: currently limited to the underlying HNode resources.

Summary: using a specific configuration node, we start a StorageNode and a HardwareNode. Then, once the infrastructure is "running", Instances can be dynamically started on a HardwareNode.

Since Instances are Xen domU based, running on shared storage, they can be migrated live, without downtime, between HardwareNodes. A ping to the virtual instance would not fail, even in the middle of the live migration.

Since HardwareNodes are network booted, adding a new server is as simple as adding its MAC address to the DHCP configuration and tagging it as an HNode. As long as the systems are able to PXE boot, it is really a matter of minutes to add new nodes.
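To make the "add the MAC address" step concrete, here is a minimal sketch that prints a DHCP host entry for a new HNode. It assumes the ConfigNode runs ISC dhcpd with pxelinux as the boot loader, which is only a guess at a typical Debian setup; the function name, host name, MAC address and IP addresses are all made up for illustration.

```shell
#!/bin/sh
# Sketch only: print an ISC dhcpd "host" stanza for a new HardwareNode.
# All names and addresses below are hypothetical placeholders.

CNODE_IP="10.0.0.10"   # assumed address of the ConfigNode (TFTP server)

hnode_dhcp_entry() {
    name=$1   # label for the node, e.g. hnode03
    mac=$2    # MAC address of the PXE-booting interface
    ip=$3     # fixed address to hand out to that node

    cat <<EOF
host $name {
    hardware ethernet $mac;
    fixed-address $ip;
    next-server $CNODE_IP;
    filename "pxelinux.0";
}
EOF
}

# Example: print the stanza for review before adding it to dhcpd.conf.
hnode_dhcp_entry hnode03 00:16:3e:aa:bb:cc 10.0.0.33
```

The stanza is only printed, so it can be reviewed and appended to the DHCP configuration by hand; dhcpd then needs a restart to pick it up.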

Since HardwareNodes are network booted with a remote root filesystem, they do not need a hard drive. This removes one of the main failing pieces of current infrastructure. There isn’t much to fail in a server with only a CPU, memory and network interfaces.

The storage aspect is taken care of by the StorageNodes, where good RAID + redundancy + hard-drive snapshots can be used to control the environment. The only limit on the number of storage nodes is the network… but then, link aggregation is your friend.

Since multipathing is used, with DRBD and AoE, a storage node can be shut down without impacting running instances.

Creating a new Instance is easy: either copy an existing instance or debootstrap a new system. Doing something similar to 3Tera would be fairly easy at this point: create templates and prepare configuration interfaces/scripts.
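The two ways of creating an Instance could look roughly like this. It is a sketch under assumptions: instance root filesystems are taken to be plain directories under a made-up path (as they would be with the NFS-backed StorageNode), and the suite and mirror are placeholders. The commands are printed rather than executed unless RUN=1 is set, since both need root.

```shell
#!/bin/sh
# Sketch only: two ways to create an Instance root filesystem.
# Paths, suite and mirror are hypothetical; commands are printed, not run,
# unless RUN=1 is set in the environment.

INSTANCE_DIR="/srv/instances"            # assumed location on the StorageNode
MIRROR="http://ftp.debian.org/debian"    # assumed Debian mirror

run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

# Option 1: clone an existing instance, preserving owners, modes and links.
instance_from_copy() {
    run cp -a "$INSTANCE_DIR/$1" "$INSTANCE_DIR/$2"
}

# Option 2: build a fresh minimal Debian system with debootstrap.
instance_from_debootstrap() {
    run debootstrap stable "$INSTANCE_DIR/$1" "$MIRROR"
}

instance_from_copy template web01
instance_from_debootstrap web02
```

A "template" directory that is never booted and only ever copied is the simplest version of the 3Tera-style templates mentioned above.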

What now ?

Took me one weekend day. I have a running ConfigNode, StorageNode (using NFS, but AoE/multipathing is next), HardwareNode and an Instance. Much of the time was spent waiting for kernel compilation and deploying distcc on my LAN. Had a few small problems PXE booting a dom0, but found a fix.

I wonder what someone working full time could accomplish in a month… Does someone want to pay me to see? ;-) Ah, and it would cost you (in addition to my salary for a month) a copy of Nicholas Carr’s The Big Switch (which I haven’t read yet, but plan to, as soon as I can get my hands on a copy). I can even do a little presentation first for some kind of financial compensation (yeah, money drives me ;-)).

Seriously, such a setup would be fully scalable and so easy to configure dynamically through scripts or a GUI. One of the limiting factors is the CPU/memory ceiling on instances, since each one is tied to a single hardware node, but if Xen (as a commercial solution) is able to create a resource pool, I’m sure there is a way around that limitation.

Jeez, using VT-enabled hardware nodes, you could even start Microsoft Windows instances in your cloud…

Btw, I know that everything I’ve spoken about can be done with VMware Infrastructure and VMotion (and maybe 3Tera’s AppLogic), but… then think about the fact that a 2-CPU licence for VMware Infrastructure is a little over $6,900 USD…

I just don’t understand why there aren’t more clouds out there. This isn’t all that hard to deploy… not even time consuming…

.cloud computing

Wishlist, part 2

Last week, I drafted a wishlist and posted it here. Let’s do an update:

– Minolta Maxxum 50mm F/1.7 lens
    In the mail, slowly getting here.

– Open source, Microsoft Windows based, AoE initiator
    Tracy Reed (of XenAoE.org) pointed me toward http://winaoe.org which seems to be a valid Microsoft Windows based initiator licensed under the GPLv3. I will be testing it in my lab this weekend.

– Another SAN deployment project
    I never discuss my contractual engagements, but… ;-)

load-balancing

Note: The setup I’m about to describe is not only UNSUPPORTED by MySQL but also clearly "not a good idea", according to posts all over the MySQL sites. I’m not kidding: here (second paragraph), here (big orange warning in the center) and so on. Read the conclusion of this post, it also brings up one important point.

Yesterday, while helping a friend with a server, I decided to improve my lab setup. The configuration in question involves two LAMP systems accessing a shared-storage system to deliver web pages. The load-balancing part will be handled by the PEN software on my router. I’ve integrated a small PHP function to show which server is answering the request; this information is available under my LinkedIn tag in the last sidebar.

So I have 2 "server" systems, each running MD devices + DRBD 8.2.5 + OCFS2 + Apache2 + PHP5 + MySQL 5.0.51, with dual network interfaces {one toward the LAN, one as a cross-over}.

Over the cross-over, DRBD sustains a primary/primary array, with OCFS2 on top, creating a shared-disk device. The procedure to create such a setup is available on mass-storage.org. We still need to create the rest of the infrastructure; this will be a 3-step job.

Load-Balancing

The current setup is a load-balanced environment; there isn’t much high-availability value, since our load balancer is going to be the single point of failure. We will be using the PEN load balancer on the router. Why? Because it is very easy to configure and was the fastest to install. In time, I’ll post a true walkthrough to get an HA infrastructure, but not today.

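# install pen on the router
apt-get install pen
# listen on 10.0.0.25:80, balance over .26 and .27, remember clients for 10 s (-T 10)
pen -T 10 10.0.0.25:80 10.0.0.26:80 10.0.0.27:80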

And I NAT all my inbound web traffic toward 10.0.0.25, which, in turn, load-balances over 10.0.0.26 and 10.0.0.27. "Connection tracking" is kept for 10 seconds, giving the same server for most of a session. This isn’t really needed, but it gives me an excuse for the "DO NOT hammer my server to play with the load-balancing ;-)" part of my speech to my friends.
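The NAT part isn’t shown in this post. Assuming the router uses iptables (a guess, the post doesn’t say), the rule would look something like the sketch below. The function name and the eth0 interface are hypothetical, and the rule is only printed so it can be reviewed before being run as root.

```shell
#!/bin/sh
# Sketch only: print (not apply) a DNAT rule sending inbound web traffic to pen.
# Assumes iptables on the router; the interface name is a hypothetical default.

web_dnat_rule() {
    wan_if=${1:-eth0}            # assumed Internet-facing interface
    pen_addr=${2:-10.0.0.25:80}  # address pen listens on, from the command above

    echo "iptables -t nat -A PREROUTING -i $wan_if -p tcp --dport 80 -j DNAT --to-destination $pen_addr"
}

web_dnat_rule
```

Passing another interface name as the first argument, e.g. `web_dnat_rule ppp0`, prints the rule for a different uplink.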

Apache2 + Php5

This really is the easy part: we point both Apache services at the same files, and it works. If you intend to make heavy (or even light) use of sessions, you had better use a module to relocate them onto your shared storage.

MySQL 5.0.51

Here comes the tricky part. MySQL doesn’t like shared storage, so you had better really know what you are doing. The first thing we do is change some of the default configuration:

In /etc/mysql/my.cnf, we need to enable the external-locking mode:

skip-external-locking  -> external-locking

To remove the caching:

delay-key-write = OFF
query-cache-size = 0

And we need to be sure we are using MyISAM tables. InnoDB is really a big NO-GO here. Kinda cool to know that MySQL’s default tables already use the MyISAM storage engine.
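Since both servers need exactly the same three settings, a quick check script helps avoid starting MySQL on a half-configured node. This is a rough sketch: the function name is made up, and it only greps the file line by line, so it ignores which [section] a setting sits in and does not follow included files.

```shell
#!/bin/sh
# Sketch only: check a my.cnf for the three shared-storage settings above.
# MySQL accepts '-' or '_' in option names, so both spellings are matched.

check_shared_storage_cnf() {
    cnf=$1
    status=0

    # External locking must be enabled, not skipped.
    if grep -Eq '^[[:space:]]*skip[-_]external[-_]locking' "$cnf"; then
        echo "FAIL: skip-external-locking is still set"
        status=1
    elif ! grep -Eq '^[[:space:]]*external[-_]locking' "$cnf"; then
        echo "FAIL: external-locking is missing"
        status=1
    fi

    # Key writes must not be delayed.
    if ! grep -Eiq '^[[:space:]]*delay[-_]key[-_]write[[:space:]]*=[[:space:]]*off[[:space:]]*$' "$cnf"; then
        echo "FAIL: delay-key-write is not OFF"
        status=1
    fi

    # The query cache must be disabled.
    if ! grep -Eq '^[[:space:]]*query[-_]cache[-_]size[[:space:]]*=[[:space:]]*0[[:space:]]*$' "$cnf"; then
        echo "FAIL: query-cache-size is not 0"
        status=1
    fi

    if [ "$status" -eq 0 ]; then
        echo "OK: $cnf has the shared-storage settings"
    fi
    return "$status"
}
```

Run it on each node as `check_shared_storage_cnf /etc/mysql/my.cnf` before starting mysqld; a non-zero exit status means at least one setting is off.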

Conclusion

If you have any service which requires you to think about load-balancing, but are ignorant enough to base decisions on the lab notes I post on this blog, I would advise you to get in touch with a storage/network consultant. It won’t cost you that much and you will have a written confirmation that your infrastructure is going to work for your given application.

Job @ Akoha, Unix Pipefitter

OK, I’ll be so freaking jealous of whoever gets this "Unix Pipefitter" job @ Akoha. My current obligations and future plans prevent me from doing a 9-17 "in-office" job; the most I could give is remote work, and this isn’t as accepted by bosses as people might think. Still, I’ll be jealous.

One cool fact this job offer shows us is the integration of web services in the Akoha network. Using Amazon for network scalability is so the way to go, but I seem unable to present this correctly to my clients; they (almost) always say no… This is really a good way to grow your computing power / storage without having to go through all those "This SAN is $40K", "This load balancer should be operated by a Foundry programmer", etc… I think they are afraid of the variable cost…

The Amazon S3 + EC2 infrastructure will change the way you think about mass computing, and this isn’t a sales pitch. If you are able to plan your network usage ahead, go for your own hardware + datacenter access, but if you need scalability, this is really the way to go… I’ve been using the system for the last couple of months as a "turn-on backup system". Every Friday night, I start an instance of an AMI (Amazon Machine Image) which backs up a couple of my hosts and shuts down. Really, try it!
