Archive for the ‘PHP’ Category

Google Charts

Thursday, January 31st, 2008

There are a handful of feasible Flash-based charting solutions, and they work really well, with interaction. Unfortunately, most of them are neither open source nor free. A few months ago (maybe it has been there for longer?), Google released a chart API which can generate a PNG image based on your data.

The Charts API is very easy to use, and everything is encoded in the URL. So, you can either download the image and host it on your own servers, or use the URL directly in the web page without any trouble at all.
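To make "everything is encoded in the URL" a bit more concrete, here is a minimal sketch that turns a few numbers into a chart URL. The helper name build_chart_url and the sample values are made up for illustration; the parameters (cht, chs, chd) are the same ones used in the full script in this post.

```php
<?php
// Hypothetical helper: builds a bar chart URL from values already scaled to 0-100.
function build_chart_url(array $values, $size = '500x300') {
    $params = array(
        'cht' => 'bvg',                        // vertical bar chart
        'chs' => $size,                        // image size in pixels
        'chd' => 't:' . implode(',', $values), // text-encoded data
    );
    // http_build_query() URL-encodes each value
    return 'http://chart.apis.google.com/chart?' . http_build_query($params, '', '&');
}

echo build_chart_url(array(10, 20, 30)) . "\n";
```

The resulting string can go straight into the src attribute of an img tag.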

Yesterday, I gave it a try. I took some of my Apache logs, parsed them, and used the Google API to draw the charts. The results look good.

Bar Chart

The log files are too big to parse quickly and contain a lot of junk data. So, I wrote the code so that it only takes the “200 OK” responses and ignores requests for “js, css, png, jpg, gif” files. Here is the PHP code.

	$ignored = array('css','js','png','jpg','gif');

	$vals = array();

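	$f = fopen("/var/log/apache2/access.log","r");
	while (($line = fgets($f)) !== false) {
		// If it's not a 200 OK GET request - just ignore it
		if (!preg_match("/(.*) - - \[(.*)\] \"GET (.*) HTTP\/1.1\" 200/",$line,$matches))
			continue;
		$path = pathinfo($matches[3]);
		// Skip static assets; URLs without an extension have no "extension" key
		if (isset($path["extension"]) && in_array($path["extension"],$ignored))
			continue;
		$time = strtotime($matches[2]);
		$month = date("M/y",$time);
		// Initialise the counter the first time a month is seen
		if (!isset($vals[$month]))
			$vals[$month] = 0;
		$vals[$month]++;
	}
	fclose($f);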

	function text_encode($vals,$max) {
		$tvals = array();
		foreach($vals as $val) {
			$tvals[] = round($val/$max*100,1);
		}
		return implode(",",$tvals);
	}

	function get_y_axis($vals) {
		$max = max($vals)+3000;
		$step = $max/5.0;
		$temp = array();
		for($i=0;$i<5;$i++)
			$temp[] = round($i*$step);
		$temp[] = $max;
		return implode("|",$temp);
	}

	echo "http://chart.apis.google.com/chart?chs=500x300&chd=t:"
			.urlencode(text_encode($vals,max($vals)+3000)).
			"&cht=bvg&chbh=50&chxt=x,y&chxl=0:|".
			urlencode(implode('|',array_keys($vals))).
			"|1:|".urlencode(get_y_axis($vals))."\n";

Even though this contains code for omitting unrelated results, it takes too much time. So, you can use piping to create a smaller log file first, which is reasonably faster than running the script over the whole log file. For example, I use this command to create a smaller version of the log file.

cat sandaru1_log | grep 'HTTP/1.1" 200' | grep -v '.css HTTP' \
	| grep -v '.png HTTP' | grep -v '.js HTTP' \
	| grep -v '.jpg HTTP' | grep -v '.gif HTTP' > log

Google Charts also provides the ability to generate pie charts. A pie chart is a better fit for browser percentages.

Pie Chart

Here is the code:

	$vals = array();

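	function parseUserAgent($ua)
	{
		$userAgent = array();
		$agent = $ua;
		$products = array();

		// product[/version] [ [xx] ] [ (comment) ]
		$pattern  = "~([^/[:space:]]*)" . "(/([^[:space:]]*))?"
		."([[:space:]]*\[[a-zA-Z][a-zA-Z]\])?" . "[[:space:]]*"
		."(\\((([^()]|(\\([^()]*\\)))*)\\))?" . "[[:space:]]*~";

		while( strlen($agent) > 0 )
		{
			// ereg() was removed in PHP 7; preg_match() is used instead
			if (preg_match($pattern, $agent, $a))
			{
				// Unmatched trailing groups are missing from $a, so pad it
				$a += array_fill(0, 7, "");
				// product, version, comment
				array_push($products, array($a[1],    // Product
							$a[3],    // Version
							$a[6]));  // Comment
				// Always advance at least one character, as ereg() did
				$agent = substr($agent, max(1, strlen($a[0])));
			}
			else
			{
				$agent = "";
			}
		}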

		// Directly catch these
		foreach($products as $product)
		{
			switch($product[0])
			{
				case 'Firefox':
				case 'Netscape':
				case 'Safari':
				case 'Camino':
				case 'Mosaic':
				case 'Galeon':
				case 'Opera':
					$userAgent[0] = $product[0];
					$userAgent[1] = $product[1];
					break;
			}
		}

		if (count($userAgent) == 0)
		{
			// Mozilla compatible (MSIE, konqueror, etc)
			if ($products[0][0] == 'Mozilla' &&
				!strncmp($products[0][2], 'compatible;', 11))
			{
				$userAgent = array();
				// preg_match() replaces ereg(), which was removed in PHP 7
				if ($cl = preg_match("~compatible; ([^ ]*)[ /]([^;]*).*~",
						$products[0][2], $ca))
				{
					$userAgent[0] = $ca[1];
					$userAgent[1] = $ca[2];
				}
				else
				{
					$userAgent[0] = $products[0][0];
					$userAgent[1] = $products[0][1];
				}
			}
			else
			{
				$userAgent = array();
				$userAgent[0] = $products[0][0];
				$userAgent[1] = $products[0][1];
			}
		}

		if (strstr($userAgent[1],"http:/"))
			$userAgent[1] = "";

		return $userAgent[0]
			.($userAgent[0]==""||$userAgent[1]==""?"":" ")
			.$userAgent[1];
	}

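	$f = fopen("log","r");
	while (($line = fgets($f)) !== false) {
		// Skip lines that do not match the expected log format
		if (!preg_match("/([\d.]+).* [^ ] [^ ] \[(.*?)\] (.*?) (.*) (.*)"
				." (\d+) ([^ ]+) (.*?) \"(.*?)\"/",
					$line,$matches))
			continue;
		$bot = parseUserAgent($matches[9]);
		// Group all Firefox 2.x and 1.x versions together
		$bot = preg_replace("/Firefox 2.*/","Firefox 2",$bot);
		$bot = preg_replace("/Firefox 1.*/","Firefox 1",$bot);
		if (!isset($vals[$bot]))
			$vals[$bot] = 0;
		$vals[$bot]++;
	}
	fclose($f);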

	$others = 0;

	foreach($vals as $key => $val)
		if ($val<500) {
			$others += $val;
			unset($vals[$key]);
		}

	$vals['Others'] = $others;
	// "-" is what Apache logs when no user agent was sent
	if (isset($vals['-'])) {
		$vals['Unknown'] = $vals['-'];
		unset($vals['-']);
	}

	$labels = array();
	function text_encode($vals,$sum) {
		global $labels;
		$tvals = array();
		foreach($vals as $key => $val) {
			$tvals[] = round($val/$sum*100,1);
			$labels[] = $key." (".round($val/$sum*100,2)." %)";
		}
		return implode(',',$tvals);
	}

	echo "http://chart.apis.google.com/chart?cht=p&chd=t:".
			urlencode(text_encode($vals,array_sum($vals))).
			"&chs=700x400&chl=".
			urlencode(implode("|",$labels))."\n";

This code gets the user agent from each log line, parses it, and generates a pie chart. The user agent parsing function is by dotvoid.com.

Apache Solr

Saturday, December 29th, 2007

I have been quiet for some time, rather busy doing some interesting work at Ulteo. It’ll be coming out soon, so keep looking :). (Ulteo recently released an online version of OpenOffice to many positive reviews; we had a huge rush just after the release, which shows the potential of such a product.)

However, after quite a long time, I did some web programming with Paradox. Today, I played around with Apache Solr as a database backend to improve the search capabilities. The results are amazing: it gives you a very powerful storing and indexing schema and a nice query language. PHP can communicate with the server using REST. Solr can generate output in many formats, including PHP serialized objects. However, the easiest one I found is the JSON output (you’ll have to install the php-json extension).
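As a rough illustration of the JSON route, here is a minimal sketch. The base URL, the helper names, and the id and title fields are assumptions made up for this example, so adjust them to your own Solr setup and schema. A canned response is used so the sketch runs without a server.

```php
<?php
// Build a Solr select URL asking for JSON output (wt=json).
// The base URL is an assumption; adjust it to your own Solr install.
function solr_query_url($base, $query, $rows = 10) {
    return $base . '/select?' . http_build_query(
        array('q' => $query, 'wt' => 'json', 'rows' => $rows), '', '&');
}

// Pull the matching documents out of a Solr JSON response body.
function solr_docs($json) {
    $data = json_decode($json, true);
    if (!is_array($data) || !isset($data['response']['docs'])) {
        return array();
    }
    return $data['response']['docs'];
}

// Canned response in the shape Solr returns, so the sketch runs offline.
// Against a live server you would use file_get_contents(solr_query_url(...)).
$sample = '{"responseHeader":{"status":0},'
        . '"response":{"numFound":1,"start":0,"docs":[{"id":"42","title":"Hello"}]}}';

foreach (solr_docs($sample) as $doc) {
    echo $doc['id'] . ': ' . $doc['title'] . "\n";
}
```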

Solr explanation from the site :
Solr is an open source enterprise search server based on the Lucene Java search library, with XML/HTTP and JSON APIs, hit highlighting, faceted search, caching, replication, and a web administration interface. It runs in a Java servlet container such as Tomcat.

Picasa Widget updated

Wednesday, July 11th, 2007

Edit : Latest update http://www.sandaru1.com/2008/04/04/wordpress-picasa-plugin/

I just updated the Picasa widget. Thanks for the comments on the earlier version. Now you can select the image size. In exchange, the Picasa username field got a bit more complicated.

If you want to use all of your Picasa albums, you can just type your username. If you want only one album, put the album name in brackets (without spaces). Here is an example: username(album).

If you want to get photos from more than one user, type all the usernames separated by spaces. You can even use brackets with the usernames.
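To show how such a field can be read, here is a minimal sketch. It is not the widget’s actual code; the function name parse_picasa_field and the sample usernames are made up for illustration.

```php
<?php
// Hypothetical parser for the username field, e.g. "alice bob(holiday)".
// Returns a list of array('user' => ..., 'album' => ... or null).
function parse_picasa_field($field) {
    $sources = array();
    foreach (preg_split('/\s+/', trim($field), -1, PREG_SPLIT_NO_EMPTY) as $entry) {
        if (preg_match('/^([^()]+)\(([^()]+)\)$/', $entry, $m)) {
            // username(album): a single album of that user
            $sources[] = array('user' => $m[1], 'album' => $m[2]);
        } else {
            // plain username: all albums of that user
            $sources[] = array('user' => $entry, 'album' => null);
        }
    }
    return $sources;
}

print_r(parse_picasa_field('alice bob(holiday)'));
```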

Click here to download

How to find free mp3 using google

Wednesday, July 4th, 2007

Most probably, you are using file sharing apps (Limewire, Bearshare, even BitTorrent) to download music. But did you know there are thousands of free mp3s hosted on the internet, directly accessible with a normal browser?

Sometimes, people upload mp3 files to their web servers thinking that no one will find them. But when someone enters the URL of the folder which contains the music, the web server uses a directory listing to show the files in that folder (directory listing can be turned off).

Basically, if you can find web servers with mp3 files, you can download them. The problem is finding them. You can use Google’s advanced query operators for that. The title of an Apache directory listing starts with the phrase “Index of”. So, you can use Google to search for pages with “Index of” in the title. Then, you need “mp3” files, so just append mp3 to the query. Let’s say you want the Beatles. Then, append Beatles. Here is an example: intitle:"Index of" mp3 Beatles
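If you want to generate such queries from a script, a minimal sketch could look like this. The helper name index_of_query is made up for illustration.

```php
<?php
// Hypothetical helper: builds a Google search URL for open directory listings.
function index_of_query($keywords, $type = 'mp3') {
    $query = 'intitle:"Index of" ' . $type . ' ' . $keywords;
    // urlencode() takes care of the quotes, colon and spaces
    return 'http://www.google.com/search?q=' . urlencode($query);
}

echo index_of_query('Beatles') . "\n";
```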

Some of those URLs might not work. But keep on searching; there are lots of working ones.

When there are a lot of subdirectories and files, you might want to get a list of URLs. So, I wrote a simple PHP script. You can run it from the command line. I have put in some sample URLs. The sample pages given there will generate more than 1000 direct links to mp3s.

<?php
	$stderr = fopen('php://stderr', 'w');
	set_time_limit(0);
	error_reporting(0);

	$urls[] = "http://www.mcgees.org/mp3/pearl_jam/";
	$urls[] = "http://www.xieish.net/Collective%20Soul/";
	$urls[] = "http://www.asilentflute.com/mp3/";
	$urls[] = "http://www.semret.org/music/";
	$urls[] = "http://www.koreangirlssuck.com/emotion/mp3/";
	$urls[] = "http://www.vrees.net/mp3/";
	$urls[] = "http://www.webpiri.net/Mp3/";
	$urls[] = "http://pierre33200.free.fr/Music/";

	$done = array();

	for($i=0;$i<count($urls);$i++) {
		$url = $urls[$i];
		$done[strtolower($url)] = true;
		$temp = parse_url($url);
		$path = pathinfo($temp['path']);
		$domain = $temp['host'];

		// Directory URLs have no "extension" key at all
		if (isset($path['extension']) && $path['extension']!="") {
			if (strtolower($path['extension'])=="mp3"
				|| strtolower($path['extension'])=="wma") {
				echo $url."\n";
				continue;
			} else {
				fwrite($stderr,"Skipping $url : Not mp3\n");
				continue;
			}
		}

		fwrite($stderr,"Processing $url\n");

		$html = file_get_contents($url);

		$direct = preg_match("/index of/i",$html);
		if ($direct==false) {
			fwrite($stderr,"Error : Not a directory index\n");
			continue;
		}

		$count = preg_match_all("/<a href=\"(.*?)\">.*?<\/a>/i",
				$html,$matches);
		foreach ($matches[1] as $match) {
			// Ignore the pages link to same url
			if ($match[0] == "?")
				continue;
			if (substr($match,0,7)=="http://")
				$cur = $match;
			else if ($match[0] == "/")
				$cur = "http://".$domain.$match;
			else
				$cur = $url.$match;
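			if (!isset($done[strtolower($cur)])) {
				// Mark as seen when queueing so the same link is not added twice
				$done[strtolower($cur)] = true;
				$urls[]=$cur;
			}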
		}
	}
	fclose($stderr);
?>

Save the above script as “mp3.php”, then execute it with “php mp3.php > urls.txt”. It will show you what it’s doing, and all the URLs will be written to “urls.txt”. If you are on Linux, you can use “wget -i urls.txt” to download the songs. If you are on Windows, download Free Download Manager and use File -> Import List of Downloads.

You can also download the list of generated links.

Picasa wordpress sidebar widget

Saturday, June 2nd, 2007

When I saw the cool random photo display gadget at wela’s blog, I wanted something like that too. Earlier I had Gallery2 installed, but when Google released Picasa, I found it much easier. It actually optimizes the photos before uploading them. So, I switched to Picasa. Fortunately, Picasa has RSS feeds for each album :).

At first I thought of writing a normal WordPress plugin, but since it lives in the sidebar, I decided to write a widget. My blog was running WordPress 2.1, so I either had to install the widget plugin or upgrade to the new WordPress 2.2 (WordPress 2.2 comes with built-in support for widgets). Since there are some bug fixes in version 2.2, I upgraded to that.

Then, after copying about three lines of code into my theme, I widgetized it. Finally, it was time to write the widget.

Actually, the widget was fairly easy to write because WordPress itself has an RSS parser. Even though it doesn’t completely parse the RSS given by Picasa, I managed to use some regular expressions to get the necessary values.
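Pulling values out of a feed with a regular expression can be sketched like this. The feed fragment below is made up, and the real Picasa feed may use different elements, so treat the element name as an assumption. The function name extract_thumbnails is also hypothetical and is not the widget’s actual code.

```php
<?php
// Made-up feed fragment; the real Picasa feed may use different elements.
$rss = <<<XML
<item>
  <title>Beach</title>
  <media:thumbnail url="http://example.com/s144/beach.jpg" width="144" height="108"/>
</item>
<item>
  <title>Hills</title>
  <media:thumbnail url="http://example.com/s144/hills.jpg" width="144" height="108"/>
</item>
XML;

// Collect the url attribute of every <media:thumbnail> element.
function extract_thumbnails($rss) {
    preg_match_all('/<media:thumbnail[^>]*\burl=["\']([^"\']+)["\']/i', $rss, $m);
    return $m[1];
}

print_r(extract_thumbnails($rss));
```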

When I finished writing the widget, I realized that there is no point in downloading the RSS every time someone requests the site. So, I added a download delay time and stored the parsed RSS results as a WordPress option.
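The caching idea can be sketched roughly like this. get_option() and update_option() are the real WordPress functions; the option name, the helper name cached_photos, and the stand-in functions at the top (there only so the sketch runs outside WordPress) are made up for illustration and are not the widget’s actual code.

```php
<?php
// Stand-ins so the sketch runs outside WordPress; inside WordPress the real
// get_option()/update_option() are used instead.
if (!function_exists('get_option')) {
    $GLOBALS['demo_options'] = array();
    function get_option($name, $default = false) {
        return array_key_exists($name, $GLOBALS['demo_options'])
            ? $GLOBALS['demo_options'][$name] : $default;
    }
    function update_option($name, $value) {
        $GLOBALS['demo_options'][$name] = $value;
        return true;
    }
}

// Return the cached photos if they are younger than $ttl seconds,
// otherwise call $fetch (downloads and parses the feed) and store the result.
function cached_photos($fetch, $ttl = 3600, $now = null) {
    $now = ($now === null) ? time() : $now;
    $cache = get_option('picasa_widget_cache'); // hypothetical option name
    if (is_array($cache) && $now - $cache['time'] < $ttl) {
        return $cache['photos'];
    }
    $photos = call_user_func($fetch);
    update_option('picasa_widget_cache', array('time' => $now, 'photos' => $photos));
    return $photos;
}
```

Passing the fetch step in as a callable keeps the timing logic separate from the download, so the delay can be changed without touching the parser.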

If you want to try this, click here to download. Copy the file into the wp-content/plugins/widgets directory. Then, activate the plugin and add it to the sidebar from Presentation -> Widgets. (You should have the widgets plugin installed.)

P2P Minds

Monday, April 30th, 2007

We have just finished a small project: p2p mind. We’ll probably release the code under the GPL, even though it might not be useful for many of you ;-)

Check this out: http://www.sandaru1.com/p2pmind/

Upgrading Gallery2

Sunday, March 18th, 2007

Today, I wanted to install the Ajaxian theme for my photo gallery. So, I just downloaded it to my Gallery2 themes folder, extracted the contents, then tried to install it. But my Gallery 2 version was way too old for the theme, so I had to upgrade Gallery 2 itself.

The first step was to overwrite the old files with the new ones. Then, I started the installation process. It went smoothly, and then it asked me to dump the MySQL database and make a backup. But, as usual, I continued without doing anything. OMG.. there was an error. I was shocked. Fortunately, they have nice debugging output. The error was about a MySQL table. So, I opened the database manually, and the table was there. Oh!! The table names are case sensitive (only on Linux). I had set MySQL to use only lowercase names, so it saves everything in lowercase, but when Gallery 2 tries to access the tables, it also uses upper case. So, the answer was simple. I went through the code, which uses adodb, found the Execute function, and used strtolower() to convert all the SQL commands to lowercase. It worked perfectly, and all my albums were still there.
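One catch with a blanket strtolower() is that it also lowercases any text values inside the query. Here is a minimal sketch of a slightly more careful variation that leaves single-quoted string literals alone. This is not the code I patched into adodb; the function name lowercase_sql and the table name in the example are made up for illustration.

```php
<?php
// Lowercase an SQL statement but keep single-quoted string literals as they are.
function lowercase_sql($sql) {
    // Split into alternating pieces: outside, 'literal', outside, ...
    $parts = preg_split("/('(?:[^'\\\\]|\\\\.)*')/", $sql, -1, PREG_SPLIT_DELIM_CAPTURE);
    foreach ($parts as $i => $part) {
        // Even positions are outside of string literals
        if ($i % 2 == 0) {
            $parts[$i] = strtolower($part);
        }
    }
    return implode('', $parts);
}

echo lowercase_sql("SELECT * FROM G2_Item WHERE title = 'My Album'") . "\n";
```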

After upgrading, I changed the theme and uploaded some more photos. Then, I wanted to delete an unwanted album, so I deleted it. After the next refresh I realized I had mistakenly deleted all the albums :P