Tag Archive - mysql

There is such a thing as too much cache

For the xth time, I’ve hit the MySQL bug #51325 - It state that with relatively big (>5 gigs) innodb buffer_pool (my.cnf::innodb_buffer_pool_size), the LRU invalidation on ‘DROP TABLE’ (or ‘TRUNCATE TABLE’) will lock the full table (even if DROP or TRUNCATE is on a partition). Bug is only present when using the ‘innodb_file_per_table‘, which is kind of sad because I love that configuration setting.

Solutions include disabling the innodb_file_per_table setting & reducing the buffer_pool size…

RT @glynmoody: Oracle kil…

@pacharest: RT @glynmoody: Oracle kills low-priced MySQL support – http://bit.ly/9be412 well, at least Larry’s consistent #mysql #oracle

MySQL’s innodb_file_per_table

I’m still alive – preparing my presentation for ConFoo conference (about massive scalability), writing an article on load balancer in the free software domain, doing a technical review of a book on Nginx (it should be pretty good), and driving my company at crazy speed (some new contracts and operations for existing ones). Ho, and I’m preparing a wedding (mine) – yeah, I’m alive and living my life at the fullest.

In the midst of all that, I’ve got a mandate involving MySQL databases and the import of a big chunk of data (about 8gb raw). Yeah, I know, some peoples reading my blog will know whom I speaking of. The data come with the schema, the load data queries and the CSV files. Should be pretty easy, long, but easy.

Got that one wrong. I’ve been bitten by all the bugs that were on the road:
– MySQL LoadData (from CSV) use the TMPDIR variable (which was on root filesystem) and LoadData generate a temporary queries in that directory. In the present case, the 8GB raw csv represented over 40GB of required tmpdir space.
– MySQL Replication doesn’t live really well with importing large amount of data. The first (failed) import broke all the replication (ring of masters + slaves).
– and…

MySQL’s innodb log file cannot shrink its size. That’s pretty simple as a rule, but it creates lots of problems when you have a failed import due to free space issue. Whatever you do, MySQL will NOT give back this hard drive space unless you dump all and re-import. The current case being a database of over 20GB, it’s really a no-go (and since you are lacking free space, dump is a no-go).

The important things to take away from that encounter is: always enable innodb_file_per_table on MySQL. The performance overhead will be minimal (very) – you might even get better performance depending on your file system tolerance for fragmentation – AND – you will be able to drop table and regain the disk space… even if its only for a couple hours. The drop of a 10GB database should always give you some free space ;-). Question is: why haven’t MySQL put this setting as default ? Its been around for long enough… is stable and does help the general user (a sysadmin) experience.

Couples of stats/facts.

As I look over 6 very interesting projects overview on my desk, I’m forced to do a bit of thinking about how the last year went by. A year is a lot of time, and so much plans finally came to fruition that I can’t think of listing them all here today. Which is kinda a good sign for me and my enterprise ;-)

Most of my readers doesn’t really know who I am, even when you take into account that I blog under my real name. Most don’t know that I bought a condo in Hull (now part of Gatineau, near Ottawa – the capital of Canada), that I still have a rent in Montreal, that I proposed to my girlfriend (she said “Yes!”), that I own a dog (greatest experience of forcing a regular schedule I ever had), that my greatest motivation in life is to be able to go where I want, whenever I want. My dream is going back to Yosemite, California… and bring hiking gear.

Another big aspect of my life is my business, Les Laboratoires Phoenix. I’ve been working full time at it for the last 9 months and its been a great experience. Over those months : I’ve worked with clients from 7 countries, contributed to 3 major open source projects, went to the “Free Software Foundation” Libre Planet confrence in Boston, went to the DefCon in Las Vegas, I’ve been named SME for {Zabbix, Zimbra, Asterisk, OpenLDAP, extended LAMP Stack, Mailman, GlusterFS, Lustre, MySQL, Cloud Computing, …}, 3 of my articles have been published (>40K prints), and I’m involved in a book project (from a major publisher)…

And, even thinking about all those achievements, I still look for the future of Les Laboratoires Phoenix. I guess that working with startups influenced me a lot : those 6 projects are all different from each others, they represent good revenue potential (clear business plan) and require low capital input to be started. So, I guess I’ll stop speaking about them and work ;-). Btw, two of those projects would be online services (SAAS) for well known parts of Internet infrastructure (not webserver). Another is a cloud computing infrastructure services based in Montreal (this one if almost finished! & I got an hardware provider)… A lot of fun to be had.

More news to come.

processor/CPU affinity on GNU/linux

With the new computer trend of adding more processing core rather than speeding each of them, CPU affinity is becoming quite important. It allow intelligent GNU/Linux system administrators to bind specific process or hardware interrupt to a list of CPU or core. This binding capacity is very useful in database server and high throughput environment by removing a lot of context-switching request on high load cpu. It can even be used to offload workstation CPU of the dreaded ‘firefox’/'flash’ combo.

 

Here is a quite interesting article from Intel, Improved Linux* SMP Scaling: User-directed Processor Affinity, on the subject. It shows how to distribute hardware-based interrupts to specifics CPU to lower context switching. It is way more technical that this very article, but I don’t see the need to copy/paste everything. Read it, it worth it.

 

So, checking out a process cpu affinity is done with following syntax:

# taskset -pc ‘pid’

The processor IDs start at 0. The CPU-ID list is available with ‘cat /proc/cpuinfo‘.

 

Binding a process ID (pID) to a specific processor is done with the following command. The only difference between the list/bind command is the processor id list added before the pID. It can be inserted as a single digit (1) or a list (1,2,3) or a suite (1-3).

# taskset -pc 1 ‘pid’

There you go: your now able to bind this CPU intensive oracle process on specific core on this 16-ways system.

 

Links : old ‘for coder’ IBM article – another one for VMWare.

Page 1 of 212»