Monday, October 8, 2007

Flipping a coin in Bash

I took the day off to look for a new laptop. So I devised this clever little script to manage my time today.

#! /bin/bash

# Flip a coin to see if I should have another beer.
# Experimentally biased. :-)

HEAD=0
TAIL=0

for x in 1 2 3 4 5 6 7 8 9
do

FLIP=$(($(($RANDOM%10))%2))

if [ $FLIP -eq 1 ]
then
echo "TAIL"
TAIL=$(($TAIL+1))
else
echo "HEAD"
HEAD=$(($HEAD+1))
fi
done

if [ $HEAD -gt $TAIL ]
then
echo "BEER TIME.!"
else
echo "You're not entitled to a beer. Run $0 again."
fi

I am still looking for a new laptop.

Thursday, September 20, 2007

Green cursor blinking.

Passing info to EC2 instance on startup.

Amazon's EC2 infrastructure provides basic but good methods to pass data to images at startup.

I have created EC2 bundled images that run a set of scripts on startup and configure the instance according to data received at launch.

$EC2BIN/ec2-run-instances ami-NNNNNN -d "whenibootupmakemea=webserver" -k My_Key_Pair_Name

What? The command will launch image ami-NNNNNN with my ssh keys My_Key_Pair_Name and will make "whenibootupmakemea=webserver" available to the instance. See the EC2 docs on how to create the key pairs.

How to get the message?
curl http://169.254.169.254/latest/user-data
This contacts the EC2 control server and requests the user-data. This service is provided by Amazon.

I use the following command to pass the parameter to my startup scripts.
STARTUP_REQ=$(curl --silent http://169.254.169.254/latest/user-data | perl -ne 'if (@a = m/whenibootupmakemea=(.*)/) { print $a[0]; }')
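
A minimal sketch of how a startup script might act on that value. The function names (extract_role, configure_for) and the "database" role are hypothetical and only here for illustration; the parsing uses sed rather than perl, and the demo call feeds in a literal string so it can be tried off EC2.

```shell
#!/bin/bash

# Print the value of whenibootupmakemea=... found in a user-data string.
# Prints nothing if the key is absent.
extract_role() {
    printf '%s\n' "$1" | sed -n 's/.*whenibootupmakemea=\(.*\)/\1/p' | head -n 1
}

# Dispatch on the role. The echo lines stand in for real setup scripts.
configure_for() {
    case "$1" in
        webserver) echo "starting web server setup" ;;
        database)  echo "starting database setup" ;;   # hypothetical role
        *)         echo "unknown role: '$1'" >&2; return 1 ;;
    esac
}

# On a real instance the string would come from:
#   curl --silent http://169.254.169.254/latest/user-data
role=$(extract_role "whenibootupmakemea=webserver")
configure_for "$role"
```

An unknown or missing role returns a non-zero status, so the calling script can decide whether to stop or fall back to a default.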


This is so nice and clean. Thanks, Amazon.

Wednesday, September 19, 2007

It is not blackmagic, it is bashmagic

Generating a simple* random number in bash is really easy.

Let's say you need a random number between 10 and 30, both included.
Set the variables $MIN and $MAX.
MAX=30
MIN=10

Generate the random number in variable $T. The +1 is what makes $MAX itself a possible result.
T=$((MIN + RANDOM % (MAX - MIN + 1)))

Notice the $RANDOM that is built into bash.

* Simple as in don't use this for security applications.
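
The same idea wrapped in a function, as a rough sketch. The name rand_between is my own for this example, and it assumes the range is smaller than 32768, since $RANDOM only returns values from 0 to 32767. Both endpoints are included.

```shell
#!/bin/bash

# rand_between MIN MAX: print a pseudo-random integer n with MIN <= n <= MAX.
# Assumes MAX - MIN is below 32768, the size of $RANDOM's range.
rand_between() {
    local min=$1 max=$2
    echo $(( min + RANDOM % (max - min + 1) ))
}

# Draw a few values to eyeball the range.
for i in 1 2 3 4 5; do
    rand_between 10 30
done
```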

Saturday, September 8, 2007

Installing the glusterFS modified version of the FUSE Linux Kernel module onto an EC2 system

Building the fuse module as supplied by Gluster:
The 27 Step Process: (you're welcome to skip some or add your own variety)

Launch the Amazon developer image (ami-26b6534f) and ssh in.

[root@domU-12-31-35-00-29-74 fuse-2.7.0-glfs1]# history
1 gcc -v # get gcc version
2 modinfo dm_mod # see what gcc was used to compile a kernel module
3 rpm -qa fuse
4 rpm -e `rpm -qa fuse`
5 rpm -e fuse-sshfs-1.7-1.fc4.i386
6 rpm -e fuse-encfs-1.3.1-2.fc4.i386
7 rpm -e fuse-2.6.0-2.fc4 #Remove the old Fuse Utilities
8 cd /tmp
9 mkdir gluster
10 cd gluster
11 wget -c http://ftp.zresearch.com/pub/gluster/glusterfs/fuse/fuse-2.7.0-glfs1.tar.gz
12 tar xzvf fuse-2.7.0-glfs1.tar.gz
13 ls -l /usr/src/linux-`uname -r`
14 cd fuse-2.7.0-glfs1
15 ./configure --enable-kernel-module --with-kernel=/usr/src/linux-`uname -r`
16 make
17 make install
18 ls /lib/modules/`uname -r`
19 ls /lib/modules/`uname -r`/ker/ # I need coffee
20 ls /lib/modules/`uname -r`/kernel # aaar I am brain
21 ls /lib/modules/`uname -r`/kernel/fs # dead, where
22 ls /lib/modules/`uname -r`/kernel/fs/fuse # did they hide
23 ls -l /lib/modules/`uname -r`/kernel/fs/fuse # the fuse module
24 ls -l /lib/modules/`uname -r`/kernel/fs/fuse/fuse.ko # gotcha
25 tar cvjf /tmp/kernel_modules.tar.bz2 /lib/modules/`uname -r`/kernel/
26 ifconfig # get the ip so i could scp the tar file to the image I am bundling.
27 history # show you guys what i did, this output
[root@domU-12-31-35-00-29-74 fuse-2.7.0-glfs1]#


Next I scp'd the file to the image I was testing on.
The installation is pretty simple. (This one's from memory, so let's hope.)
umount /data/clusterfs # or wherever you mounted your cluster.
rmmod fuse # remove the module
lsmod
cp fuse.ko /lib/modules/`uname -r`/kernel/fs/fuse/fuse.ko
depmod # sort out all module dependencies
modprobe fuse # load fuse module


Now you're back online and you can mount your gluster filesystem with much better stability. I have been trying to break fuse-2.7.0 but so far it has stood up to the tests.

The biggest problem that people have with compiling kernel modules is the version of gcc. For your own sanity, always make sure (see steps 1 and 2) that you compile your kernel module with the same gcc version that was used to compile the kernel.
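
A rough sketch of automating that check. It assumes /proc/version contains the text "gcc version X.Y.Z", which is how older kernels report it; newer kernels format the line differently. The function names and the sample string are made up for illustration.

```shell
#!/bin/bash

# kernel_gcc_version STRING: print the X.Y.Z that follows "gcc version"
# in a /proc/version style string. Prints nothing if it is not found.
kernel_gcc_version() {
    printf '%s\n' "$1" | sed -n 's/.*gcc version \([0-9][0-9.]*\).*/\1/p'
}

# same_gcc STRING VERSION: succeed if the kernel's compiler matches VERSION.
same_gcc() {
    [ "$(kernel_gcc_version "$1")" = "$2" ]
}

# Sample string, invented for this example.
sample='Linux version 2.6.16-xenU (builder@example) (gcc version 4.0.1 20050727 (Red Hat 4.0.1-5)) #1 SMP'
kernel_gcc_version "$sample"

# On a live box, something like:
#   same_gcc "$(cat /proc/version)" "$(gcc -dumpversion)" && echo "gcc versions match"
```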

I had to recompile the fuse module for two reasons. The first was stability: I had several core dumps under high load, and version 2.7 of the Fuse module solved it. I am again happy with Linux, great job guys. The second was to tune fuse for better performance with glusterFS. More detail: http://www.gluster.org/docs/index.php/Guide_to_Optimizing_GlusterFS

Friday, September 7, 2007

A little bit of load.

root@web0:/# uptime
02:41:39 up 9:57, 1 user, load average: 104.38, 91.41, 71.59
root@web0:/# uptime
02:41:42 up 9:57, 1 user, load average: 104.38, 91.41, 71.59
root@web0:/# uptime
02:41:43 up 9:57, 1 user, load average: 102.99, 91.34, 71.67

Load testing Amazon's EC2 servers.
I am quite happy.

Tuesday, September 4, 2007

15seconds to graphing your server load

I removed this; use rrdtool instead.

GeekCode


-----BEGIN GEEKCODE BLOCK-----
Version: 3.1
GE D? S+:- A- C+++$ ULSB++++$ UAHC++ P+++ L+++ E---?
W+++ N !W O- M+ V- PS+++ Y+ PGP+ T+ 5+ X+ R !TV B+ DI+++
D+ G+++ E* H++ R+ Z++
------END GEEKCODE BLOCK------


I do not really relate to being a geek. Geeks can still stop at any time. Junkies not.

Geeks are so kernel 2.2

What the hell is Unix Load Average.

This article is mostly based on Linux, since I have it installed on this PC, and I was busy reading my daily kernel source on http://lxr.linux.no

Load Average (LA) on a server is a touchy subject. It is very much dependent on the type of tasks running on the server. I have had servers sitting at 4 < LA < 6 and performing better than another server at 0 < LA < 2. I have seen 40+ a few times and could probably push it to 100+. #DEFINE (Please send a multiprocessor server).

Usually it's best if your load average is under the CPU-core count.

Slow I/O (disk, network) will increase the load average, since there are more processes waiting for resources.

Think of load average as queue length: it is the number of processes that are running or waiting to be processed.

1) What am I talking about? How do I find the load average?
The following commands, to name a few, give the Load Average.
top
w
uptime
On Linux it's also available in /proc/loadavg
vmstat is another way of getting similar data.
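
A small sketch of reading /proc/loadavg directly from bash. The function name print_load is made up; the optional file argument exists only so the function can be tried against a saved copy of the file.

```shell
#!/bin/bash

# print_load [FILE]: print the 1, 5 and 15 minute load averages.
# FILE defaults to /proc/loadavg. The first three fields are the averages;
# the remaining fields are ignored here.
print_load() {
    local file="${1:-/proc/loadavg}" one five fifteen rest
    read -r one five fifteen rest < "$file"
    [ -n "$fifteen" ] || return 1
    echo "1min=$one 5min=$five 15min=$fifteen"
}

# Only try the live file where it exists (Linux).
if [ -r /proc/loadavg ]; then
    print_load
fi
```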

2) Where do the magic numbers come from?
From timer.c in the Linux kernel. The kernel regularly samples the number of waiting/running processes and feeds the count through a statistical formula (an exponentially damped moving average, which deserves its own blog post and examples) with time constants of 1, 5 and 15 minutes. The results are then made available to the system.
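
To give a feel for that formula, here is a rough simulation of the 1 minute figure. It assumes the kernel samples every 5 seconds and decays the old value by e^(-5/60) at each sample, which is my reading of the source; it uses awk floating point where the kernel uses fixed-point integers, and the function name simulate_load is made up.

```shell
#!/bin/bash

# simulate_load N SAMPLES: 1 minute load average after SAMPLES ticks of
# 5 seconds each, with a constant N runnable processes, starting from 0.
simulate_load() {
    awk -v n="$1" -v samples="$2" 'BEGIN {
        decay = exp(-5 / 60)          # weight kept from the previous value
        load = 0
        for (i = 0; i < samples; i++)
            load = load * decay + n * (1 - decay)
        printf "%.2f\n", load
    }'
}

# Two busy processes for one minute (12 samples) on an idle box.
simulate_load 2 12    # prints 1.26
```

So after a full minute with 2 processes runnable the whole time, the "1 minute" load average has only climbed to about 1.26. It approaches 2 but takes several minutes to get there.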

3) Interpretation:
Let's say your numbers were 5.99, 5.52, 5.13 and this is a single CPU server.
This tells me that over the last minute the run queue averaged 5.99 processes. Obviously the one CPU could serve 1 of them (100%), so the box was overloaded by 499%.
More important is that over 15 minutes it was still overloaded by 413%.

Thus a 6 CPU server with the same numbers would have had a per-CPU load below 1. If you could find such a thing.
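
The arithmetic above as a quick sketch. The function names are made up, and awk does the work because bash has no floating point.

```shell
#!/bin/bash

# overload_pct LOAD CPUS: how far the load exceeds what CPUS can serve,
# as a percentage of capacity. Prints 0 when the box is not overloaded.
overload_pct() {
    awk -v load="$1" -v cpus="$2" 'BEGIN {
        over = (load - cpus) / cpus * 100
        if (over < 0) over = 0
        printf "%.0f\n", over
    }'
}

# per_cpu_load LOAD CPUS: load average divided by the number of CPUs.
per_cpu_load() {
    awk -v load="$1" -v cpus="$2" 'BEGIN { printf "%.3f\n", load / cpus }'
}

overload_pct 5.99 1    # prints 499
overload_pct 5.13 1    # prints 413
per_cpu_load 5.99 6    # prints 0.998
```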

4) What to do now?
Do not press the RED button!!!! Whatever you do, do not press the RED button!
Find out if it's CPU or I/O.
if (I/O) Disk or network I/O?
if (Disk) Filesystem or Swap?

Use "vmstat 2" and stare at the screen; it will give you a good idea quickly. top is handy for finding the application that is hogging, as is "ps aux", though I am getting suspicious of ps's results.
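
One more quick check for the CPU-or-I/O question, as a rough sketch: count how many processes are runnable (state R) versus stuck in uninterruptible sleep (state D, usually waiting on I/O). The function name count_states is made up, and it reads the state codes from stdin so it can be tried on canned input.

```shell
#!/bin/bash

# count_states: read one process state code per line on stdin and report
# how many are runnable (R) and how many are in uninterruptible sleep (D).
count_states() {
    awk '
        $1 ~ /^R/ { r++ }
        $1 ~ /^D/ { d++ }
        END { printf "runnable=%d uninterruptible=%d\n", r, d }
    '
}

# Live usage on Linux:
#   ps -eo state= | count_states
printf 'R\nS\nD\nD\nS\n' | count_states    # prints runnable=1 uninterruptible=2
```

Lots of R points at CPU; lots of D points at disk or network I/O.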

5) The fluffy stuff.
Network services like http, smtp and ftp can cause a high LA. This is usually I/O related, so look at your networking.
Databases can be interesting; here you'll have to look at network, disk, swap and CPU.

6) So what about threads? You had to ask... mmm. The exact handling of thread count in relation to LA differs between the UNIXes.

May the source guide you: sched.c. The answer is revealed in a comment.

/*
* nr_running, nr_uninterruptible and nr_context_switches:
*
* externally visible scheduler statistics: current number of runnable
* threads, current number of uninterruptible-sleeping threads, total
* number of context switches performed since bootup.
*/

7) Should I care?
If the box is doing OK with LA>1 then don't stress. Know that it is a bit busy. If the box is not performing well and you have LA>1, it is really useful to graph this value, so you know when the load is occurring.

Need to know more.

If you wanted to you could have read the man pages:
man uptime

<snip>system load averages is the average number of processes that are either in a runnable or uninterruptable state. A process in a runnable state is either using the CPU or waiting to use the CPU. A process in uninterruptable state is waiting for some I/O access, eg waiting for disk. The averages are taken over the three time intervals. Load averages are not normalized for the number of CPUs in a system, so a load average of 1 means a single CPU system is loaded all the time while on a 4 CPU system it means it was idle 75% of the time.
</snip>



E&O.E.

Tuesday, August 14, 2007

is it that obvious


My computer runs a unix derivative, I get laid as often as I have to reboot.

beer

#touch beer
#more beer
#make beer
make: Nothing to be done for `beer'.
#more beer
#more beer
#more beer
#more beer