Archive for the ‘Programming’ Category

IOI - The contest begins!!

Tuesday, February 19th, 2008

If you have ever wanted to see the mummies in Egypt, here is your chance. The Sri Lankan Olympiad in Informatics is calling for contestants. If you are a school student under 20, go ahead and register at www.ioi.lk.

If you want more info on IOI, read the wiki.

P.S - The LK DNS is very promising.. but just in case, if you can’t access ioi.lk, try http://ioi.ucsc.cmb.ac.lk/

Just for the record, the photo was not my idea.

Google Charts

Thursday, January 31st, 2008

There are a handful of workable Flash-based charting solutions, and they work really well, interaction included. Unfortunately, most of them are neither open source nor free. A few months ago (maybe it has been around longer?), Google released a chart API that generates a PNG image from your data.

The Charts API is very easy to use, and everything is encoded in the URL. So you can either download the image and host it on your own servers, or simply use the URL in the web page itself without any trouble.
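To make "everything is encoded in the URL" concrete, here is a minimal Python sketch that assembles a bar chart URL from a handful of counts. The function name `bar_chart_url`, its `headroom` parameter and the sample numbers are made up for illustration; the endpoint and the `cht`, `chs`, `chd`, `chxt` and `chxl` parameters are the ones used in the PHP code in this post. Whether the endpoint still serves images today is not something this sketch checks.

```python
from urllib.parse import urlencode

CHART_BASE = "http://chart.apis.google.com/chart"  # endpoint quoted in this post


def bar_chart_url(counts, size="500x300", headroom=0):
    """Build a grouped-bar chart URL from a {label: count} mapping (hypothetical helper)."""
    top = max(counts.values()) + headroom
    # Text encoding ("t:") expects every value scaled into the 0-100 range.
    scaled = [round(value / top * 100, 1) for value in counts.values()]
    params = {
        "cht": "bvg",                                    # vertical grouped bars
        "chs": size,                                     # image size in pixels
        "chd": "t:" + ",".join(str(v) for v in scaled),  # the data series
        "chxt": "x",                                     # draw the x axis
        "chxl": "0:|" + "|".join(counts),                # x-axis labels
    }
    return CHART_BASE + "?" + urlencode(params)


if __name__ == "__main__":
    # Made-up monthly hit counts, only to show the shape of the URL.
    print(bar_chart_url({"Dec/07": 1200, "Jan/08": 2400}))
```

The resulting string can be dropped straight into the `src` attribute of an `img` tag.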

Yesterday, I gave it a try. I got some of my apache logs, parsed them, and used google APIs to draw the charts. The results look good.

Bar Chart

The log files are too big to parse quickly and contain a lot of junk data. So I wrote the code so that it only counts “200 OK” responses and ignores requests for “js, css, png, jpg, gif” files. Here is the PHP code.

	$ignored = array('css','js','png','jpg','gif');

	$vals = array();

	$f = fopen("/var/log/apache2/access.log","r");
	while (!feof($f)){
		$line = fgets($f);
		preg_match("/(.*) - - \[(.*)\] \"GET (.*) HTTP\/1.1\" 200/",$line,$matches);
		// If it’s not 200 OK - just ignore it
		if (empty($matches))
			continue;
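		$path = pathinfo($matches[3]);
		// Requests such as "/" have no extension at all
		if (isset($path["extension"]) && in_array($path["extension"],$ignored))
			continue;
		$time = strtotime($matches[2]);
		$month = date("M/y",$time);
		// Start the counter at zero the first time a month is seen
		if (!isset($vals[$month]))
			$vals[$month] = 0;
		$vals[$month]++;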
	}

	function text_encode($vals,$max) {
		$tvals = array();
		foreach($vals as $val) {
			$tvals[] = round($val/$max*100,1);
		}
		return implode(",",$tvals);
	}

	function get_y_axis($vals) {
		$max = max($vals)+3000;
		$step = $max/5.0;
		$temp = array();
		for($i=0;$i<5;$i++)
			$temp[] = round($i*$step);
		$temp[] = $max;
		return implode("|",$temp);
	}

	echo "http://chart.apis.google.com/chart?chs=500x300&chd=t:"
			.urlencode(text_encode($vals,max($vals)+3000)).
			"&cht=bvg&chbh=50&chxt=x,y&chxl=0:|".
			urlencode(implode('|',array_keys($vals))).
			"|1:|".urlencode(get_y_axis($vals))."\n";

Even though this contains code for omitting unrelated results, it takes too much time. So you can use piping to create a smaller log file first, which is considerably faster than running the script over the whole log file. For example, I use this command to create a smaller version of the log file.

grep 'HTTP/1.1" 200' sandaru1_log |
	grep -Ev '\.(css|png|js|jpg|gif) HTTP' > log

Google Charts also provides the ability to generate pie charts. A pie chart is a better fit for browser percentages.

Pie Chart

Here is the code :

	$vals = array();

	function parseUserAgent($ua)
  	{

    	$userAgent = array();
 		$agent = $ua;
    	$products = array();

		$pattern  = "([^/[:space:]]*)" . "(/([^[:space:]]*))?"
		."([[:space:]]*\[[a-zA-Z][a-zA-Z]\])?" . "[[:space:]]*"
		."(\\((([^()]|(\\([^()]*\\)))*)\\))?" . "[[:space:]]*";

		while( strlen($agent) > 0 )
		{
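			// ereg() was removed in PHP 7; preg_match() anchored at the start is used instead
			if (preg_match('~^'.$pattern.'~', $agent, $a))
			{
				// product, version, comment (unmatched optional groups may be missing)
				array_push($products, array($a[1],    // Product
                                        isset($a[3]) ? $a[3] : "",    // Version
                                        isset($a[6]) ? $a[6] : ""));  // Comment
				// Always advance by at least one character, as ereg() did
				$agent = substr($agent, max(1, strlen($a[0])));
			}
			else
			{
				$agent = "";
			}
		}

		// Directly catch these
		foreach($products as $product)
		{
			switch($product[0])
			{
				case 'Firefox':
				case 'Netscape':
				case 'Safari':
				case 'Camino':
				case 'Mosaic':
				case 'Galeon':
				case 'Opera':
					$userAgent[0] = $product[0];
					$userAgent[1] = $product[1];
					break;
			}
		}

		if (count($userAgent) == 0)
		{
			// Mozilla compatible (MSIE, konqueror, etc)
			if ($products[0][0] == 'Mozilla' &&
            	!strncmp($products[0][2], 'compatible;', 11))
			{
				$userAgent = array();
				// preg_match() replaces ereg(), which was removed in PHP 7
				if (preg_match("~compatible; ([^ ]*)[ /]([^;]*).*~",
                           $products[0][2], $ca))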
			}
			else
			{
				$agent = “”;
			}
		}

		// Directly catch these
		foreach($products as $product)
		{
			switch($product[0])
			{
				case ‘Firefox’:
				case ‘Netscape’:
				case ‘Safari’:
				case ‘Camino’:
				case ‘Mosaic’:
				case ‘Galeon’:
				case ‘Opera’:
					$userAgent[0] = $product[0];
					$userAgent[1] = $product[1];
					break;
			}
		}

		if (count($userAgent) == 0)
		{
			// Mozilla compatible (MSIE, konqueror, etc)
			if ($products[0][0] == ‘Mozilla’ &&
            	!strncmp($products[0][2], ‘compatible;’, 11))
			{
				$userAgent = array();
				if ($cl = ereg(”compatible; ([^ ]*)[ /]([^;]*).*”,
                           $products[0][2], $ca))
				{
					$userAgent[0] = $ca[1];
					$userAgent[1] = $ca[2];
				}
				else
				{
					$userAgent[0] = $products[0][0];
					$userAgent[1] = $products[0][1];
				}
			}
			else
			{
				$userAgent = array();
				$userAgent[0] = $products[0][0];
				$userAgent[1] = $products[0][1];
			}
		}

		if (strstr($userAgent[1],"http:/"))
			$userAgent[1] = "";

		return $userAgent[0]
			.($userAgent[0]==""||$userAgent[1]==""?"":" ")
			.$userAgent[1];
	}

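	$f = fopen("log","r");
	while (!feof($f)){
		$line = fgets($f);
		// Skip lines that do not look like a combined-format log entry
		if (!preg_match("/([\d.]+).* [^ ] [^ ] \[(.*?)\] (.*?) (.*) (.*)"
				." (\d+) ([^ ]+) (.*?) \"(.*?)\"/",
					$line,$matches))
			continue;
		$bot = parseUserAgent($matches[9]);
		$bot = preg_replace("/Firefox 2.*/","Firefox 2",$bot);
		$bot = preg_replace("/Firefox 1.*/","Firefox 1",$bot);
		// Start the counter at zero the first time a browser is seen
		if (!isset($vals[$bot]))
			$vals[$bot] = 0;
		$vals[$bot]++;
	}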

	$others = 0;

	foreach($vals as $key => $val)
		if ($val<500) {
			$others += $val;
			unset($vals[$key]);
		}

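	$vals['Others'] = $others;
	// Hits without a user agent are logged as "-"; relabel them if any survived
	if (isset($vals['-'])) {
		$vals['Unknown'] = $vals['-'];
		unset($vals['-']);
	}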

	$lables = array();
	function text_encode($vals,$sum) {
		global $lables;
		$tvals = array();
		foreach($vals as $key => $val) {
			$tvals[] = round($val/$sum*100,1);
			$lables[] = $key." (".round($val/$sum*100,2)." %)";
		}
		return implode(',',$tvals);
	}

	echo "http://chart.apis.google.com/chart?cht=p&chd=t:".
			urlencode(text_encode($vals,array_sum($vals))).
			"&chs=700x400&chl=".
			urlencode(implode("|",$lables))."\n";

This code will get the user agent, parse it and generate a pie chart. The user agent parse function is by dotvoid.com.

Say Cheese

Tuesday, January 1st, 2008

It’s new year, 2008. Time to smile :)

A few days ago, I was playing around with a pretty old USB webcam. I just plugged it in, and the v4l (Video for Linux) drivers detected it fine (“dmesg | tail” does the trick for checking what has been going on in the system).

Since I don’t have any dedicated webcam capture apps, I just used VLC for a little testing. When I ran “vlc v4l:/dev/video0”, the green light on top of the webcam turned on (turning on a small smile on my face :) ) and the video showed up. But it was only about 60×60 in size (and the smile faded away). So I experimented with several sizes, and the one that worked best seemed to be “320×240”.

Now, assuming that this is enough for my needs (I don’t really need this anyway), I decided to install a webcam app. “Cheese” from the GNOME Foundation looked promising, so I downloaded the code, compiled it and installed it. Bham, there was nothing, just the GTK widget background with a GNOME hand imprinted.

I guessed that both the problem and the solution would lie in the gstreamer settings. So I downloaded and installed the newest stable versions of gstreamer (0.10.15), plugins-base (0.10.15), plugins-good (0.10.6) and plugins-ugly (0.10.6). As it turned out, this had no effect on “Cheese”, other than breaking the mp3 playback of totem. Later on I found out that some plugins had not been compiled properly, so recompiling them fixed the mp3 playback, but not the webcam app.

However, after going through the Cheese FAQ, I found out that I had to set some properties in the gstreamer-properties dialog. Blaming myself for not checking the FAQ before going through all the hassle of compiling gstreamer, I changed the settings as it said (ximagesink something, I can’t remember now) and fired up Cheese.

Whoa, the webcam light blinked and suddenly the video appeared. But there seemed to be a small problem with the video: three quarters of the image was blank, and only the top quarter was displayed properly. At first I thought it was a temporary problem, so I did an “rmmod and modprobe” to reset the device, but it didn’t help. So I fired up Cheese from the command line to see what was going on. First it detected the webcam, then the sizes it supports (in other words, the available “modes”). Then it selected the so-called best mode, which is the largest.

For my webcam that was a weird size, something like 288 x XXX (sorry, I can’t remember that either). So I tried that size with VLC, and got the same result. As you can guess, the problem was not with gstreamer but with the V4L drivers. I looked for a Cheese configuration file to set the preferred “mode” of the webcam, but unfortunately the product is still very young.

I didn’t want to mess with the V4L drivers, so I had a look at the Cheese source code. There is a universal truth about everything: it’s always easier to break things than to fix them. So I just broke the webcam mode detection code so that it falls through to my hardcoded preferred webcam settings. The changes are in cheese-webcam.c, and here is the diff file, just in case.

Cheese Screenshot

Now, it’s time to say Cheese, and Happy New Year!

Apache Solr

Saturday, December 29th, 2007

I have been quiet for some time, rather busy doing some interesting work at Ulteo. It’ll be coming out soon, so keep looking :). (Ulteo recently released an online version of OpenOffice to many positive reviews; we had a huge rush just after the release, showing the potential of such a product.)

However, after quite a long time, I did some web programming with Paradox. Today, I played around with Apache Solr as a database backend to improve the search capabilities. The results are amazing: it gives you a very powerful storage and indexing schema and a nice query language. PHP can communicate with the server using REST. Solr can generate output in many formats, including PHP serialized objects. However, the easiest one I found is the JSON output (you’ll have to install the php-json extension).
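As a rough illustration of the REST round trip, here is a small Python sketch (the same idea carries over to PHP with curl and json_decode). It assumes a Solr instance at `http://localhost:8983/solr/select`, which is only a guess at a typical local setup, and the field names `id` and `title` in the sample response are made up. The `q`, `rows` and `wt=json` parameters and the `response` / `numFound` / `docs` layout follow Solr's JSON response format.

```python
import json
from urllib.parse import urlencode

SOLR_SELECT = "http://localhost:8983/solr/select"  # assumed local Solr instance


def build_query_url(text, rows=10):
    """Return a Solr select URL that asks for a JSON response."""
    return SOLR_SELECT + "?" + urlencode({"q": text, "rows": rows, "wt": "json"})


def extract_docs(raw_json):
    """Pull the hit count and the matching documents out of a Solr JSON response."""
    body = json.loads(raw_json)["response"]
    return body["numFound"], body["docs"]


if __name__ == "__main__":
    # Hand-written sample in the shape Solr returns; not real server output.
    sample = ('{"response": {"numFound": 1, "start": 0, '
              '"docs": [{"id": "42", "title": "Paradox"}]}}')
    print(build_query_url("title:paradox"))
    print(extract_docs(sample))
```

Fetching the URL and passing the body to `extract_docs` is all that is needed on the client side; the indexing schema lives entirely on the Solr server.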

Solr explanation from the site :
Solr is an open source enterprise search server based on the Lucene Java search library, with XML/HTTP and JSON APIs, hit highlighting, faceted search, caching, replication, and a web administration interface. It runs in a Java servlet container such as Tomcat.

Version Controlling Systems - SVN and BZR

Thursday, November 1st, 2007

I have been using both “svn” and “bzr” lately. “svn” is quite a popular system and has been around for many years. “bzr” is developed by the Ubuntu team for use as their version control system. Even though bzr is quite new to the field, it’s pretty much transparent to the user. For example, if you type “bzr add” in the root directory of a bzr branch, all new files are added to the repository recursively, whereas svn doesn’t do that recursively. “bzr” is also capable of detecting removed files, but “svn” is not; you’ll have to delete the files manually. That’s really annoying.

So, I wrote a small shell script to handle those.

#!/bin/bash

# The first field of each status line is the flag; the rest is the file name
svn status | while read -r command file; do
	if [ "$command" = "?" ]; then
		echo "Add $file"
		svn add "$file"
	fi
	if [ "$command" = "!" ]; then
		echo "Remove $file"
		svn delete "$file"
	fi
done

Create an automated bot to crawl gmail inbox

Wednesday, September 19th, 2007

There are certain scenarios in which you might want to write a script that accesses a gmail inbox automatically. One possible way to do that is to use this great project, gmail-lite.

But if you simply want to read the inbox, did you know there is a feed (in Atom format) which can do this? This method is simpler than using an automated bot. You can authenticate using HTTP authentication and grab the contents of the inbox. The feed url is https://mail.google.com/mail/feed/atom.

If you are using a scripting language like PHP, you can use the curl extension. curl has bindings for several other languages too. Another possible way is to execute an external application such as wget (in the background, of course).
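For illustration, here is a minimal Python sketch of the same idea using only the standard library. It assumes the feed accepts plain HTTP basic authentication as described above, which may not hold for every account. The helper names `build_request` and `entry_titles` are made up, and the sample XML is hand-written rather than real feed output.

```python
import base64
import urllib.request
import xml.etree.ElementTree as ET

FEED_URL = "https://mail.google.com/mail/feed/atom"  # URL given in this post


def build_request(username, password):
    """Prepare a request carrying an HTTP basic authentication header."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(FEED_URL, headers={"Authorization": "Basic " + token})


def entry_titles(atom_xml):
    """Return the title of every <entry>, whatever XML namespace the feed uses."""
    titles = []
    for element in ET.fromstring(atom_xml).iter():
        # Tags look like "{namespace}entry"; compare only the local name.
        if element.tag.rsplit("}", 1)[-1] != "entry":
            continue
        for child in element:
            if child.tag.rsplit("}", 1)[-1] == "title":
                titles.append(child.text or "")
    return titles


if __name__ == "__main__":
    # Hand-written sample; a real feed carries more elements per entry.
    sample = (
        '<feed xmlns="http://purl.org/atom/ns#">'
        "<title>Inbox</title>"
        "<entry><title>Hello</title></entry>"
        "<entry><title>Meeting at 5</title></entry>"
        "</feed>"
    )
    print(entry_titles(sample))
    # To fetch for real: urllib.request.urlopen(build_request(user, password)).read()
```

Keeping the parsing separate from the fetching makes it easy to try the parser on a saved copy of the feed first.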

The Python Challenge

Sunday, September 16th, 2007

You might have already read my blog post about the deathball riddle. A few days ago, I started another internet riddle, but unlike deathball, this one is a programming riddle. It’s the Python Challenge.

You might not be familiar with python, but this is definitely a really good way to learn the language. When I started, I only knew a little python syntax. But after finishing several levels, I had learned some really good ways of using python, or, I could say, the art of python.

If you don’t know python: it’s neither a dead language nor a useless one. It’s quite a powerful scripting language, which also has GTK bindings. Many of the newer linux apps are written in python. Gentoo’s emerge is a very good example.

Picasa Widget updated

Wednesday, July 11th, 2007

Edit : Latest update http://www.sandaru1.com/2008/04/04/wordpress-picasa-plugin/

I just updated the picasa widget. Thanks for the comments on the earlier version. Now you can select the image size. In exchange, the picasa username field got a bit more complicated.

If you want to use your whole picasa collection, you can just type your username. If you want only one album, put the album name in brackets (without spaces). Here is an example : username(album).

If you want to get photos from more than one user, type all the usernames separated by spaces. You can use brackets with any of the usernames.
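To make the field format concrete, here is a small Python sketch of how such a value could be split into users and albums. It is not the widget's actual code (the widget is a WordPress plugin); the function name `parse_sources` and the sample usernames are made up for illustration.

```python
import re

# One token is "username" or "username(album)", with no spaces inside.
TOKEN = re.compile(r"^([^()\s]+)(?:\(([^()\s]+)\))?$")


def parse_sources(field):
    """Split the username field into (username, album_or_None) pairs."""
    sources = []
    for token in field.split():
        match = TOKEN.match(token)
        if match is None:
            raise ValueError("cannot parse: " + token)
        sources.append((match.group(1), match.group(2)))
    return sources


if __name__ == "__main__":
    # "alice" contributes all her albums, "bob" only the one named "holiday".
    print(parse_sources("alice bob(holiday)"))
```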

Click here to download

How to find free mp3 using google

Wednesday, July 4th, 2007

Most probably, you are using file sharing apps (Limewire, Bearshare, even Bittorrent) to download music. But did you know there are thousands of free mp3s hosted on the internet, directly accessible from a normal browser?

Sometimes, people upload mp3 files to their web servers thinking that no one will find them. But when someone enters the url of the folder that contains the music, the web server uses directory listing to show the files in that folder (directory listing can be turned off).

Basically, if you can find web servers with mp3 files, you can download those files. The problem is finding them. You can use google’s advanced queries for that. The title of an apache directory listing starts with the phrase “Index of”. So you can use google to search for pages with “Index of” in the title. Then, you need “mp3” files, so just append mp3 to the query. Let’s say you want the Beatles; then append Beatles. Here is an example : intitle:"Index of" mp3 Beatles

Some of those urls might not work. But keep on searching; there are lots of working urls.

When there are a lot of subdirectories and files, you might want to get a list of urls. So I wrote a simple PHP script, which you can run from the command line. I have put in some sample URLs. The sample pages given there will generate more than 1000 direct links to mp3s.

<?php
	$stderr = fopen('php://stderr', 'w');
	set_time_limit(0);
	error_reporting(0);

	$urls[] = "http://www.mcgees.org/mp3/pearl_jam/";
	$urls[] = "http://www.xieish.net/Collective%20Soul/";
	$urls[] = "http://www.asilentflute.com/mp3/";
	$urls[] = "http://www.semret.org/music/";
	$urls[] = "http://www.koreangirlssuck.com/emotion/mp3/";
	$urls[] = "http://www.vrees.net/mp3/";
	$urls[] = "http://www.webpiri.net/Mp3/";
	$urls[] = "http://pierre33200.free.fr/Music/";

	$done = array();

	for($i=0;$i<count($urls);$i++) {
		$url = $urls[$i];
		$done[strtolower($url)] = true;
		$temp = parse_url($url);
		$path = pathinfo($temp['path']);
		$domain = $temp['host'];

		if (!empty($path['extension'])) {
			if (strtolower($path['extension'])=="mp3"
				|| strtolower($path['extension'])=="wma") {
				echo $url."\n";
				continue;
			} else {
				fwrite($stderr,"Skipping $url : Not mp3\n");
				continue;
			}
		}

		fwrite($stderr,"Processing $url\n");

		$html = file_get_contents($url);

		$direct = preg_match("/index of/i",$html);
		if ($direct==false) {
			fwrite($stderr,"Error : Not a directory index\n");
			continue;
		}

		$count = preg_match_all("/<a href=\"(.*?)\">.*?<\/a>/i",
				$html,$matches);
		foreach ($matches[1] as $match) {
			// Ignore the pages link to same url
			if ($match[0] == "?")
				continue;
			if (substr($match,0,7)=="http://")
				$cur = $match;
			else if ($match[0] == "/")
				$cur = "http://".$domain.$match;
			else
				$cur = $url.$match;
			if (!isset($done[strtolower($cur)]))
				$urls[]=$cur;
		}
	}
	fclose($stderr);
?>

Save the above script (“mp3.php”), then execute it (“php mp3.php > urls.txt”). It will show you what it’s doing, and all the urls will be written to “urls.txt”. If you are on linux, you can use “wget -i urls.txt” to download the songs. If you are on windows, download Free Download Manager and use File -> Import List of Downloads.
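The link-handling step of the PHP script (skip the sorting links, turn relative links into absolute ones, separate songs from subdirectories) can also be sketched in Python with the standard library. This is an alternative illustration rather than a port: unlike the PHP script it refuses to climb to parent directories, and the names `LinkCollector` and `classify_links` as well as the example.com URLs are made up.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

AUDIO_EXTENSIONS = (".mp3", ".wma")


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def classify_links(page_url, html):
    """Return (audio_urls, subdirectory_urls) found in a directory listing."""
    collector = LinkCollector()
    collector.feed(html)
    audio, subdirs = [], []
    for href in collector.hrefs:
        if href.startswith("?"):  # column-sorting links point back to the same page
            continue
        absolute = urljoin(page_url, href)  # resolves relative and root-relative links
        path = urlparse(absolute).path
        if path.lower().endswith(AUDIO_EXTENSIONS):
            audio.append(absolute)
        elif path.endswith("/") and absolute.startswith(page_url) and absolute != page_url:
            subdirs.append(absolute)  # only descend, never climb to the parent
    return audio, subdirs


if __name__ == "__main__":
    listing = ('<a href="?C=N;O=D">Name</a><a href="/">Parent</a>'
               '<a href="a%20song.mp3">a song</a><a href="live/">live/</a>')
    print(classify_links("http://example.com/mp3/", listing))
```

A crawler would call `classify_links` on each fetched page, print the audio URLs and queue the subdirectories that it has not visited yet.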

You can also download the list of generated links.

Memories of childhood

Thursday, June 21st, 2007

Today, I got a scanner, and my brother wanted to scan some of our childhood photographs. So we put four photos on the scanner, one in each corner, and scanned them.

Afterwards, we had huge images, each of which had to be split into four separate images. That was a pretty straightforward, boring process. So, finally, I thought of writing a small piece of code to split the images, and it’s written in JAVA :O. The code was not written for posting, so it looks poorly written, but anyway it does the job.

import java.awt.image.*;
import javax.imageio.*;
import java.io.*;

public class ImageSpliter {
	public static void main(String args[]) {
		try {
			File file = new File(args[0]);
			int x = Integer.parseInt(args[1]);
			BufferedImage input = ImageIO.read(file);
			BufferedImage output = input.getSubimage(0,
							0,700,1000);
			File fo = new File("C:/photos/"+x+".jpg");
			ImageIO.write(output,"JPG",fo);

			output = input.getSubimage(0,
					1400,700,input.getHeight()-1400);
			fo = new File("C:/photos/"+(x+1)+".jpg");
			ImageIO.write(output,"JPG",fo);

			output = input.getSubimage(1100,
					0,input.getWidth()-1100,1000);
			fo = new File("C:/photos/"+(x+2)+".jpg");
			ImageIO.write(output,"JPG",fo);

			output = input.getSubimage(1100,
				1400,input.getWidth()-1100,input.getHeight()-1400);
			fo = new File("C:/photos/"+(x+3)+".jpg");
			ImageIO.write(output,"JPG",fo);
		} catch(Exception e) {
			System.out.println(e.toString());
		}
	}
}
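
The four crop rectangles hard-coded above follow a simple pattern, which the following Python sketch spells out. It only computes the rectangles and does not touch any image. The function name `corner_boxes` and its parameter names are made up; the default offsets (700, 1000, 1100, 1400) are the pixel values from the Java code.

```python
def corner_boxes(width, height, left_w=700, top_h=1000, right_x=1100, bottom_y=1400):
    """Return four (x, y, w, h) crop rectangles, one per corner of the scan."""
    if width <= right_x or height <= bottom_y:
        raise ValueError("scan is smaller than the assumed photo layout")
    return [
        (0, 0, left_w, top_h),                                    # top-left photo
        (0, bottom_y, left_w, height - bottom_y),                 # bottom-left photo
        (right_x, 0, width - right_x, top_h),                     # top-right photo
        (right_x, bottom_y, width - right_x, height - bottom_y),  # bottom-right photo
    ]


if __name__ == "__main__":
    # A made-up scan size, only to show the four rectangles.
    for box in corner_boxes(2000, 2800):
        print(box)
```

Each tuple has the same argument order as `getSubimage(x, y, w, h)`, so the Java code could loop over such a list instead of repeating the cropping block four times.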