Google Charts

Thursday, January 31st, 2008

There are a handful of feasible Flash-based charting solutions, and they work really well, with interaction. Unfortunately, most of them are neither open source nor free. A few months ago (maybe it has been there for longer?), Google released a Chart API which can generate a PNG image based on your data.

The Chart API is very easy to use, and everything is encoded in the URL. So, you can either download the image and host it on your own servers, or use the URL directly in the web page without any trouble.
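As a rough sketch of the second option, the snippet below builds a chart URL from a few values and wraps it in an img tag. The helper names chart_url() and chart_img_tag() and the sample numbers are made up for illustration; only the chs, cht and chd=t: parameters are taken from the code further down in this post.

```php
<?php
// Hypothetical helper: builds a Chart API URL from values already scaled to 0-100.
function chart_url(array $values, string $type = "bvg", string $size = "500x300"): string {
    return "http://chart.apis.google.com/chart?"
        . "chs=" . urlencode($size)
        . "&cht=" . urlencode($type)
        . "&chd=t:" . urlencode(implode(",", $values));
}

// Hypothetical helper: wraps the URL in an <img> tag, escaping "&" for HTML.
function chart_img_tag(string $url, string $alt = "chart"): string {
    return '<img src="' . htmlspecialchars($url) . '" alt="' . htmlspecialchars($alt) . '" />';
}

// Sample data, invented for this example
echo chart_img_tag(chart_url(array(20, 55.5, 100))), "\n";
```

The escaping step matters because the URL contains "&", which has to be written as "&amp;" inside an HTML attribute.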

Yesterday, I gave it a try. I took some of my Apache logs, parsed them, and used the Google API to draw the charts. The results look good.

Bar Chart

The log files are too big to parse quickly and contain a lot of junk data. So, I wrote the code so that it only counts “200 OK” responses and ignores requests for “js, css, png, jpg, gif” files. Here is the PHP code.

	$ignored = array('css','js','png','jpg','gif');

	$vals = array();

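	$f = fopen("/var/log/apache2/access.log","r");
	// fgets() returns false at end of file, which ends the loop
	while (($line = fgets($f)) !== false) {
		// If it's not a 200 OK GET request - just ignore it
		if (!preg_match("/(.*) - - \[(.*)\] \"GET (.*) HTTP\/1.1\" 200/",$line,$matches))
			continue;
		$path = pathinfo($matches[3]);
		// Requests without an extension (e.g. "/") are always counted
		if (isset($path["extension"]) && in_array($path["extension"],$ignored))
			continue;
		$time = strtotime($matches[2]);
		$month = date("M/y",$time);
		// Start the counter at zero the first time a month is seen
		if (!isset($vals[$month]))
			$vals[$month] = 0;
		$vals[$month]++;
	}
	fclose($f);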

	function text_encode($vals,$max) {
		$tvals = array();
		foreach($vals as $val) {
			$tvals[] = round($val/$max*100,1);
		}
		return implode(",",$tvals);
	}

	function get_y_axis($vals) {
		$max = max($vals)+3000;
		$step = $max/5.0;
		$temp = array();
		for($i=0;$i<5;$i++)
			$temp[] = round($i*$step);
		$temp[] = $max;
		return implode("|",$temp);
	}

	echo "http://chart.apis.google.com/chart?chs=500x300&chd=t:"
			.urlencode(text_encode($vals,max($vals)+3000)).
			"&cht=bvg&chbh=50&chxt=x,y&chxl=0:|".
			urlencode(implode('|',array_keys($vals))).
			"|1:|".urlencode(get_y_axis($vals))."\n";

Even though this contains code for omitting unrelated results, it takes too much time. So, you can use piping to create a smaller log file first, which is considerably faster than running the script over the whole log file. For example, I use this command to create a smaller version of the log file.

cat sandaru1_log | grep 'HTTP/1.1" 200' | grep -v '\.css HTTP' |
	grep -v '\.png HTTP' | grep -v '\.js HTTP' |
	grep -v '\.jpg HTTP' | grep -v '\.gif HTTP' > log

Google Charts also provides the ability to generate pie charts. A pie chart is a better fit for browser percentages.

Pie Chart

Here is the code:

	$vals = array();

	function parseUserAgent($ua)
  	{

    	$userAgent = array();
 		$agent = $ua;
    	$products = array();

		$pattern  = "([^/[:space:]]*)" . "(/([^[:space:]]*))?"
		."([[:space:]]*\[[a-zA-Z][a-zA-Z]\])?" . "[[:space:]]*"
		."(\\((([^()]|(\\([^()]*\\)))*)\\))?" . "[[:space:]]*";

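		while( strlen($agent) > 0 )
		{
			// ereg() was removed in PHP 7, so use preg_match() anchored
			// at the start of the remaining string
			if (preg_match("~^" . $pattern . "~", $agent, $a))
			{
				// product, version, comment
				array_push($products, array($a[1],          // Product
                                        $a[3] ?? "",    // Version
                                        $a[6] ?? ""));  // Comment
				// Always advance by at least one character
				$agent = substr($agent, max(strlen($a[0]), 1));
			}
			else
			{
				$agent = "";
			}
		}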

		// Directly catch these
		foreach($products as $product)
		{
			switch($product[0])
			{
				case 'Firefox':
				case 'Netscape':
				case 'Safari':
				case 'Camino':
				case 'Mosaic':
				case 'Galeon':
				case 'Opera':
					$userAgent[0] = $product[0];
					$userAgent[1] = $product[1];
					break;
			}
		}

		if (count($userAgent) == 0)
		{
			// Mozilla compatible (MSIE, konqueror, etc)
			if ($products[0][0] == 'Mozilla' &&
            	!strncmp($products[0][2], 'compatible;', 11))
			{
				$userAgent = array();
				if (preg_match("~compatible; ([^ ]*)[ /]([^;]*).*~",
                           $products[0][2], $ca))
				{
					$userAgent[0] = $ca[1];
					$userAgent[1] = $ca[2];
				}
				else
				{
					$userAgent[0] = $products[0][0];
					$userAgent[1] = $products[0][1];
				}
			}
			else
			{
				$userAgent = array();
				$userAgent[0] = $products[0][0];
				$userAgent[1] = $products[0][1];
			}
		}

		if (strstr($userAgent[1],"http:/"))
			$userAgent[1] = "";

		return $userAgent[0]
			.($userAgent[0]==""||$userAgent[1]==""?"":" ")
			.$userAgent[1];
	}

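	$f = fopen("log","r");
	// fgets() returns false at end of file, which ends the loop
	while (($line = fgets($f)) !== false) {
		// Skip lines that don't match the expected log format
		if (!preg_match("/([\d.]+).* [^ ] [^ ] \[(.*?)\] (.*?) (.*) (.*)"
				." (\d+) ([^ ]+) (.*?) \"(.*?)\"/",
					$line,$matches))
			continue;
		// $matches[9] is the quoted user agent at the end of the line
		$bot = parseUserAgent($matches[9]);
		$bot = preg_replace("/Firefox 2.*/","Firefox 2",$bot);
		$bot = preg_replace("/Firefox 1.*/","Firefox 1",$bot);
		// Start the counter at zero the first time a browser is seen
		if (!isset($vals[$bot]))
			$vals[$bot] = 0;
		$vals[$bot]++;
	}
	fclose($f);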

	$others = 0;

	foreach($vals as $key => $val)
		if ($val<500) {
			$others += $val;
			unset($vals[$key]);
		}

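	$vals['Others'] = $others;
	// Requests without a user agent are logged as "-"; that entry may
	// already have been folded into Others above
	if (isset($vals['-'])) {
		$vals['Unknown'] = $vals['-'];
		unset($vals['-']);
	}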

	$labels = array();
	// Returns the comma-separated percentages and fills $labels
	// with one "name (percent %)" entry per slice
	function text_encode($vals,$sum) {
		global $labels;
		$tvals = array();
		foreach($vals as $key => $val) {
			$tvals[] = round($val/$sum*100,1);
			$labels[] = $key." (".round($val/$sum*100,2)." %)";
		}
		return implode(',',$tvals);
	}

	echo "http://chart.apis.google.com/chart?cht=p&chd=t:".
			urlencode(text_encode($vals,array_sum($vals))).
			"&chs=700x400&chl=".
			urlencode(implode("|",$labels))."\n";

This code gets the user agent, parses it and generates a pie chart. The user agent parsing function is from dotvoid.com.

Google Apps – Email

Tuesday, July 24th, 2007

I have been using the paradox server to handle my blog (www.sandaru1.com) and email (AT gunathilake.com) for more than a year now. However, for several reasons our server went offline from time to time. The problem is fixed now, but the xmail configuration is not perfect either.

So, I decided to go for Google Apps. I have used it for about a week now, and so far the only problem I have had is that, since the mail account is new, some important mails were marked as spam. However, Google really learns fast (or it has some hidden filters for each user), and clicking the “spam” and “not spam” buttons fixed the problem.

So, overall the system seems pretty good and runs without any errors. Another big advantage is that it has the superb Gmail interface.

How to find free mp3 using google

Wednesday, July 4th, 2007

Most probably, you are using file sharing apps (Limewire, Bearshare, even Bittorrent) to download music. But did you know that there are thousands of free mp3s hosted on the internet, directly accessible with a normal browser?

Sometimes, people upload mp3 files to their web servers thinking that no one will find them. But when someone enters the URL of the folder which contains the music, the web server uses directory listing to show the files in that folder (directory listing can be turned off).

Basically, if you can find some web servers with mp3 files, you can download them. The problem is finding them. You can use Google advanced queries for that. The title of an Apache directory listing starts with the phrase “Index of”. So, you can use Google to search for pages with “Index of” in the title. Then, you need “mp3” files, so just append mp3 to the query. Let's say you want Beatles. Then, append Beatles. Here is an example: intitle:"Index of" mp3 Beatles
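If you want to build such a query from a script, a minimal sketch could look like this. The function name index_of_query_url() is made up for this example; it only URL-encodes the query shown above and appends it to the usual search address.

```php
<?php
// Hypothetical helper: builds a search URL for an "Index of" mp3 query.
function index_of_query_url(string $keywords): string {
    // Same query as in the text: intitle:"Index of" mp3 <keywords>
    $query = 'intitle:"Index of" mp3 ' . $keywords;
    return "http://www.google.com/search?q=" . urlencode($query);
}

echo index_of_query_url("Beatles"), "\n";
```

urlencode() takes care of the quotes, the colon and the spaces, so the keywords can be passed in as plain text.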

Some of those URLs might not work. But keep on searching; there are lots of working URLs.

When there are a lot of subdirectories and files, you might want to get a list of URLs. So, I wrote a simple PHP script. You can execute this script from the command line. I have put in some sample URLs. The sample pages given there will generate more than 1000 direct links to mp3s.

<?php
	$stderr = fopen('php://stderr', 'w');
	set_time_limit(0);
	error_reporting(0);

	$urls[] = "http://www.mcgees.org/mp3/pearl_jam/";
	$urls[] = "http://www.xieish.net/Collective%20Soul/";
	$urls[] = "http://www.asilentflute.com/mp3/";
	$urls[] = "http://www.semret.org/music/";
	$urls[] = "http://www.koreangirlssuck.com/emotion/mp3/";
	$urls[] = "http://www.vrees.net/mp3/";
	$urls[] = "http://www.webpiri.net/Mp3/";
	$urls[] = "http://pierre33200.free.fr/Music/";

	$done = array();

	for($i=0;$i<count($urls);$i++) {
		$url = $urls[$i];
		$done[strtolower($url)] = true;
		$temp = parse_url($url);
		$path = pathinfo($temp['path']);
		$domain = $temp['host'];

		// URLs with a file extension are files, not directories
		if (!empty($path['extension'])) {
			if (strtolower($path['extension'])=="mp3"
				|| strtolower($path['extension'])=="wma") {
				echo $url."\n";
				continue;
			} else {
				fwrite($stderr,"Skipping $url : Not mp3\n");
				continue;
			}
		}

		fwrite($stderr,"Processing $url\n");

		$html = file_get_contents($url);
		if ($html === false) {
			fwrite($stderr,"Error : Could not fetch the page\n");
			continue;
		}

		$direct = preg_match("/index of/i",$html);
		if ($direct==false) {
			fwrite($stderr,"Error : Not a directory index\n");
			continue;
		}

		$count = preg_match_all("/<a href=\"(.*?)\">.*?<\/a>/i",
				$html,$matches);
		foreach ($matches[1] as $match) {
			// Ignore empty links and links back to the same page (sort links)
			if ($match == "" || $match[0] == "?")
				continue;
			if (substr($match,0,7)=="http://")
				$cur = $match;
			else if ($match[0] == "/")
				$cur = "http://".$domain.$match;
			else
				$cur = $url.$match;
			// Queue each URL only once
			if (!isset($done[strtolower($cur)])) {
				$done[strtolower($cur)] = true;
				$urls[]=$cur;
			}
		}
	}
	fclose($stderr);
?>

Save the above script (“mp3.php”), then execute it (“php mp3.php > urls.txt”). It will show you what it's doing, and all the URLs will be written to “urls.txt”. If you are on Linux, you can use “wget -i urls.txt” to download the songs. If you are on Windows, download Free Download Manager and use File -> Import List of Downloads.

You can also download the list of generated links.

Google prime number problem

Tuesday, December 19th, 2006

A few years ago Google placed a billboard in Silicon Valley (well, I haven't seen it, but I read the googleblog post). The board posed a small programming problem.

Somehow, that blog post recently became one of the top stories on digg; after seeing it, I decided to give it a try. I just used the straightforward brute-force method and it was surprisingly easy. It was just like the old days of writing IOI code, except that runtime doesn't matter, memory limits don't matter, and coding time doesn't matter. What makes it IOI code is that code quality doesn't matter either.

I just coded like I always did… just coded… coded… coded… never bothered about coding standards… and finally ended up with two small programs (e.zip). The first one generated factorials up to 100, divided 1 by each of them and saved the results (up to 500 decimal places) in a file. The second one added all of those together and checked for a prime. Pretty simple. The answer was 7427466391.
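The programs in e.zip are not reproduced here, but the same idea can be sketched in one small PHP script. This is an illustration, not the original code: the function names are made up, it works on arrays of decimal digits instead of a file, it assumes 64-bit integers, and it only computes 120 digits since that turns out to be enough.

```php
<?php
// Returns the first $digits decimal digits of e (after the "2.") by summing
// 1/n! with schoolbook arithmetic on arrays of decimal digits.
function e_digits(int $digits): string {
    $len = $digits + 10;                  // extra guard digits absorb truncation error
    $term = array_fill(0, $len + 1, 0);   // index 0 is the integer part
    $term[0] = 1;                         // 1/0! = 1
    $sum = $term;
    for ($n = 1; ; $n++) {
        // term = term / n, by long division from the most significant digit
        $rem = 0;
        $nonzero = false;
        for ($i = 0; $i <= $len; $i++) {
            $cur = $rem * 10 + $term[$i];
            $term[$i] = intdiv($cur, $n);
            $rem = $cur % $n;
            $nonzero = $nonzero || $term[$i] != 0;
        }
        if (!$nonzero) break;             // 1/n! is below our precision
        // sum = sum + term, adding from the least significant digit
        $carry = 0;
        for ($i = $len; $i >= 0; $i--) {
            $s = $sum[$i] + $term[$i] + $carry;
            $sum[$i] = $s % 10;
            $carry = intdiv($s, 10);
        }
    }
    return implode("", array_slice($sum, 1, $digits));
}

// Trial division; fast enough for 10-digit numbers.
function is_prime(int $n): bool {
    if ($n < 2) return false;
    if ($n % 2 == 0) return $n == 2;
    for ($d = 3; $d * $d <= $n; $d += 2)
        if ($n % $d == 0) return false;
    return true;
}

// Returns the first window of $width consecutive digits that is prime,
// skipping windows with a leading zero, or null if there is none.
function first_prime_window(string $digits, int $width): ?string {
    for ($i = 0; $i + $width <= strlen($digits); $i++) {
        $window = substr($digits, $i, $width);
        if ($window[0] !== "0" && is_prime((int)$window))
            return $window;
    }
    return null;
}

// Should print 7427466391, the answer given above
echo first_prime_window("2" . e_digits(120), 10), "\n";
```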

Number Puzzle Changed – It’s 15-Puzzle now!

Tuesday, October 31st, 2006

I changed the Number Puzzle plugin a bit and renamed it to 15-Puzzle. The new download link is http://www.sandaru1.com/15-Puzzle.gg. The older link also contains the same files now.

Google.lk Sinhala Fonts

Saturday, September 16th, 2006

Yesterday, I was looking at my access logs (using webalizer) and found that there are many hits from people searching for a Sinhala font for google.lk. Google Sri Lanka is based on Sinhala Unicode fonts. Windows users can download the fonts from www.fonts.lk. Last year, it wasn’t fully supported: ‘rakaransaya’ and ‘yansaya’ didn’t render properly in Firefox. I wonder whether they have made any progress since then. Linux users can get the fonts from www.linux.lk. The Sinhala support has been debianized, so if you are in a Debian environment, it is just a matter of adding the repositories and using apt-get or synaptic. It works really nicely.

Clipboard Manager – My Google Desktop Search plugin.

Saturday, April 22nd, 2006

Finally, after a lot of failures, I managed to complete my GDS plugin. Basically, it keeps track of previous clipboard data and shows it in the desktop sidebar. Users can then click on any item, and it’ll be put into the clipboard again.

Click here to download.

Of course, it’s still beta and may (rather should ;-) ) contain a few bugs here and there.

Google Suggest

Thursday, April 6th, 2006

AJAX: it has been around on the internet for a few years now, and big companies around the world are moving to it. AJAX became so popular after Google released their AJAX-based email service, Gmail. People have since done a lot of projects with AJAX, and a lot of cool services are on their way.

While I was hunting for new AJAX tools, I was redirected to the Google Suggest site. It seems really nice. As someone types, it suggests words, and the nice thing is that next to each word it displays the number of results.

http://www.google.com/webhp?complete=1&hl=en

Live.com, Google and Sinhala

Monday, March 20th, 2006

Since the beginning of this year, the most prestigious IT companies in the world have focused their attention on one large, profitable business: SEARCHING. A lot of companies were trying to be the best, but there was no doubt that Google was the best. A few years ago, Microsoft tried to overtake Google with their MSN search but wasn’t able to make it a success.

Live.com

So, now they are trying to start over using the popular AJAX technology. Around two weeks ago Microsoft launched their AJAX-based search engine, live.com, which is a bit of a threat to Google. Live.com doesn’t only do searching; it is a web portal like MSN, and the most important thing is that you can customize your interface. Everything is AJAX-based, which means it’s really fast. Even though it’s a portal, the interface is not overloaded with too much information as in MSN. With the power of AJAX they have made it really decent and professional. The most amazing thing is that it is not only for IE; it works fine in Firefox under Linux :-)

But when we look at the search content and the quality of the search, Google is still ahead. As most people think, the search content might be a matter of time, but we have to remember that Microsoft is a large IT company which mainly focuses on its software, not on search, whereas the heart of Google is searching, and they are the first guys who launched a popular AJAX-based service, Gmail. So, their experience is, of course, better than Microsoft’s. Even though most people don’t use it, Google itself supports customizable interfaces (Google News, etc).

Google Sinhala

As Sri Lankans, we saw a big step from Google: they launched google.lk with its interface in Unicode. Earlier it was using English letters. But Google still doesn’t support searching inside the Sinhala Unicode range. When we look at live.com, there is no Sinhala interface, but the important thing is that you can search using Sinhala Unicode characters.

I wonder why Google doesn’t support Sinhala Unicode characters when it is not hard to implement. They support almost every other Unicode character set.
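For reference, the Sinhala Unicode block covers the code points U+0D80 to U+0DFF. The snippet below is only a small sketch of how a script could check whether a query contains such characters; the function name contains_sinhala() is made up, and it says nothing about how either search engine works internally.

```php
<?php
// Returns true if the UTF-8 string contains at least one character
// from the Sinhala Unicode block (U+0D80 to U+0DFF).
function contains_sinhala(string $text): bool {
    return preg_match('/[\x{0D80}-\x{0DFF}]/u', $text) === 1;
}

// "\u{0DC3}\u{0DD2}" is a Sinhala letter followed by a vowel sign
var_dump(contains_sinhala("\u{0DC3}\u{0DD2}"));  // bool(true)
var_dump(contains_sinhala("sinhala"));            // bool(false)
```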

But still, neither of those giants supports ASCII Sinhala searching using Unicode, so sinhalasearch.com still rocks!!!