<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Pravin Paratey</title>
	<atom:link href="http://pravin.insanitybegins.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://pravin.insanitybegins.com</link>
	<description>Natural Language Processing, Information Extraction &#38; Search</description>
	<lastBuildDate>Mon, 26 Jul 2010 14:25:46 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1-alpha</generator>
		<item>
		<title>Palindromic sub-sequences in python</title>
		<link>http://pravin.insanitybegins.com/2010/04/07/palindromic-sub-sequences-in-python/</link>
		<comments>http://pravin.insanitybegins.com/2010/04/07/palindromic-sub-sequences-in-python/#comments</comments>
		<pubDate>Wed, 07 Apr 2010 15:49:29 +0000</pubDate>
		<dc:creator>Pravin Paratey</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[Code]]></category>
		<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://pravin.insanitybegins.com/?p=415</guid>
		<description><![CDATA[This bit of Python code returns all palindromic substrings of length 3 or more in the input string.]]></description>
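			<content:encoded><![CDATA[<p>This bit of code returns every palindromic substring of length 3 or more in the input string. The slice comparison on the line marked <code># In-efficient</code> costs <code>O(n)</code> for each of the <code>O(n<sup>2</sup>)</code> candidate substrings, so the code as written is <code>O(n<sup>3</sup>)</code>. If that line is better implemented (I am lazy), the running time is <code>O(n<sup>2</sup>)</code>. Can you find a better solution?</p>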

<div class="wp_syntax"><div class="wp_syn_hdr">palindrome.py</div><table><tr><td class="code"><pre class="python" style="font-family:monospace;"><span style="color: #808080; font-style: italic;">#!/usr/bin/env python</span>
&nbsp;
<span style="color: #ff7700;font-weight:bold;">def</span> get_palindromes<span style="color: black;">&#40;</span><span style="color: #008000;">str</span><span style="color: black;">&#41;</span>:
	<span style="color: #483d8b;">&quot;&quot;&quot; Return all palindromes in str of minimum size 3 &quot;&quot;&quot;</span>
	length = <span style="color: #008000;">len</span><span style="color: black;">&#40;</span><span style="color: #008000;">str</span><span style="color: black;">&#41;</span> + <span style="color: #ff4500;">1</span>
&nbsp;
	found = <span style="color: black;">&#91;</span><span style="color: black;">&#93;</span>
	<span style="color: #ff7700;font-weight:bold;">for</span> i <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #008000;">xrange</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">0</span>, length<span style="color: black;">&#41;</span>:
		<span style="color: #ff7700;font-weight:bold;">for</span> j <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #008000;">xrange</span><span style="color: black;">&#40;</span>i+<span style="color: #ff4500;">3</span>, length<span style="color: black;">&#41;</span>:
				<span style="color: #808080; font-style: italic;"># Compare the first half with the reversed second half; the middle</span>
				<span style="color: #808080; font-style: italic;"># character of an odd-length slice is skipped</span>
				half = <span style="color: black;">&#40;</span>j - i<span style="color: black;">&#41;</span> // <span style="color: #ff4500;">2</span>
				<span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #008000;">str</span><span style="color: black;">&#91;</span>i:i+half<span style="color: black;">&#93;</span> == <span style="color: #008000;">str</span><span style="color: black;">&#91;</span>j-half:j<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span>::-<span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span>: <span style="color: #808080; font-style: italic;"># In-efficient</span>
					found.<span style="color: black;">append</span><span style="color: black;">&#40;</span><span style="color: #008000;">str</span><span style="color: black;">&#91;</span>i:j<span style="color: black;">&#93;</span><span style="color: black;">&#41;</span>
	<span style="color: #ff7700;font-weight:bold;">return</span> found	
&nbsp;
<span style="color: #ff7700;font-weight:bold;">if</span> __name__ == <span style="color: #483d8b;">'__main__'</span>:
	<span style="color: #ff7700;font-weight:bold;">print</span> get_palindromes<span style="color: black;">&#40;</span><span style="color: #483d8b;">'efeababaf'</span><span style="color: black;">&#41;</span></pre></td></tr></table></div>
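
<p>One possible answer to the question above is to grow palindromes outward from every centre instead of testing every slice. This is only a sketch under stated assumptions, not the original code: the name <code>palindromes_from_centres</code> and the <code>min_size</code> parameter are made up for illustration, and it targets Python 3. Leaving aside the cost of copying each result, every outward step is a single character comparison, so the search visits at most <code>O(n<sup>2</sup>)</code> centre and radius pairs.</p>

```python
def palindromes_from_centres(text, min_size=3):
    """Return every palindromic substring of text with at least min_size characters."""
    found = []
    n = len(text)
    # 2n - 1 centres: each character (odd lengths) and each gap
    # between two neighbouring characters (even lengths).
    for centre in range(2 * n - 1):
        left = centre // 2
        right = left + centre % 2
        # Expand while the characters at both ends match.
        while left >= 0 and right < n and text[left] == text[right]:
            if right - left + 1 >= min_size:
                found.append(text[left:right + 1])
            left -= 1
            right += 1
    return found


if __name__ == '__main__':
    print(palindromes_from_centres('efeababaf'))
```

<p>The results come out grouped by centre rather than by start position, so sort them if the order matters.</p>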

]]></content:encoded>
			<wfw:commentRss>http://pravin.insanitybegins.com/2010/04/07/palindromic-sub-sequences-in-python/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Join a list of integers in Python</title>
		<link>http://pravin.insanitybegins.com/2010/02/19/join-a-list-of-integers-in-python/</link>
		<comments>http://pravin.insanitybegins.com/2010/02/19/join-a-list-of-integers-in-python/#comments</comments>
		<pubDate>Fri, 19 Feb 2010 14:40:37 +0000</pubDate>
		<dc:creator>Pravin Paratey</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://pravin.insanitybegins.com/?p=411</guid>
		<description><![CDATA[How do you run a string join on a list of integers in Python? After googling for about 10 mins, I gave up and did this. I am sure there is a better way of doing it!]]></description>
			<content:encoded><![CDATA[<p>Today, I had to pretty print a list of integers for debugging. This does not work:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> t = <span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span>, <span style="color: #ff4500;">2</span>, <span style="color: #ff4500;">3</span>, <span style="color: #ff4500;">4</span><span style="color: black;">&#93;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #483d8b;">' '</span>.<span style="color: black;">join</span><span style="color: black;">&#40;</span>t<span style="color: black;">&#41;</span>
Traceback <span style="color: black;">&#40;</span>most recent call last<span style="color: black;">&#41;</span>:
  File <span style="color: #483d8b;">&quot;&lt;stdin&gt;&quot;</span>, line <span style="color: #ff4500;">1</span>, <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #66cc66;">?</span>
<span style="color: #008000;">TypeError</span>: sequence item <span style="color: #ff4500;">0</span>: expected <span style="color: #dc143c;">string</span>, <span style="color: #008000;">int</span> found</pre></div></div>

<p>So I came up with this:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #ff7700;font-weight:bold;">def</span> concat<span style="color: black;">&#40;</span>x, y<span style="color: black;">&#41;</span>: <span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>x<span style="color: black;">&#41;</span> + <span style="color: #483d8b;">' '</span> + <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>y<span style="color: black;">&#41;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">reduce</span><span style="color: black;">&#40;</span>concat, t<span style="color: black;">&#41;</span>
<span style="color: #483d8b;">'1 2 3 4'</span></pre></div></div>

<p>I am sure there is a better way of doing this!</p>
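
<p>A shorter route, for what it is worth: convert each integer to a string first and let <code>join</code> do the rest. The helper name <code>join_ints</code> and its <code>sep</code> parameter are only there for illustration.</p>

```python
def join_ints(numbers, sep=' '):
    """Join a list of integers into one string, converting each item with str()."""
    return sep.join(str(n) for n in numbers)


t = [1, 2, 3, 4]
print(join_ints(t))
```

<p>Writing <code>' '.join(map(str, t))</code> inline does the same thing without a helper.</p>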
]]></content:encoded>
			<wfw:commentRss>http://pravin.insanitybegins.com/2010/02/19/join-a-list-of-integers-in-python/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Writing a spider in 10 mins using Scrapy</title>
		<link>http://pravin.insanitybegins.com/2010/01/21/writing-a-spider-in-10-mins-using-scrapy/</link>
		<comments>http://pravin.insanitybegins.com/2010/01/21/writing-a-spider-in-10-mins-using-scrapy/#comments</comments>
		<pubDate>Thu, 21 Jan 2010 19:21:48 +0000</pubDate>
		<dc:creator>Pravin Paratey</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[Code]]></category>
		<category><![CDATA[Information Extraction]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Crawler]]></category>
		<category><![CDATA[Scraper]]></category>

		<guid isPermaLink="false">http://pravin.insanitybegins.com/?p=402</guid>
		<description><![CDATA[I came across <a href="http://scrapy.org">Scrapy</a> a few days back and have grown to really love it. This tutorial will illustrate how you can write a simple spider using Scrapy to scrape data off <a href="http://www.paulsmith.co.uk/paul-smith-london-308/category.html">Paul Smith</a>. All this in 10 minutes.]]></description>
			<content:encoded><![CDATA[<p>I came across <a href="http://scrapy.org">Scrapy</a> a few days back and have grown to really love it. This tutorial will illustrate how you can write a simple spider using Scrapy to scrape data off <a href="http://www.paulsmith.co.uk/paul-smith-london-308/category.html">Paul Smith</a>. All this in 10 minutes.</p>
<h2>Let&#8217;s begin</h2>
<ol>
<li><a href="http://scrapy.org/download/">Download</a> and install Scrapy and its dependencies.</li>
<li>This done, open up your terminal and type <code>python scrapy-ctl.py startproject paul_smith</code>. A Scrapy project will be created.</li>
<li>
<p>Navigate to <code>~/paul_smith/paul_smith/spiders</code> and create the file <code class="fname">paul_smith.py</code> with the following contents:</p>

<div class="wp_syntax"><div class="wp_syn_hdr">paul_smith.py</div><table><tr><td class="code"><pre class="python" style="font-family:monospace;"><span style="color: #ff7700;font-weight:bold;">from</span> scrapy.<span style="color: black;">spider</span> <span style="color: #ff7700;font-weight:bold;">import</span> BaseSpider
&nbsp;
<span style="color: #ff7700;font-weight:bold;">class</span> PaulSmithSpider<span style="color: black;">&#40;</span>BaseSpider<span style="color: black;">&#41;</span>:
  domain_name = <span style="color: #483d8b;">&quot;paulsmith.co.uk&quot;</span>
  start_urls = <span style="color: black;">&#91;</span><span style="color: #483d8b;">&quot;http://www.paulsmith.co.uk/paul-smith-jeans-253/category.html&quot;</span><span style="color: black;">&#93;</span>
&nbsp;
  <span style="color: #ff7700;font-weight:bold;">def</span> parse<span style="color: black;">&#40;</span><span style="color: #008000;">self</span>, response<span style="color: black;">&#41;</span>:
    <span style="color: #008000;">open</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'paulsmith.html'</span>, <span style="color: #483d8b;">'wb'</span><span style="color: black;">&#41;</span>.<span style="color: black;">write</span><span style="color: black;">&#40;</span>response.<span style="color: black;">body</span><span style="color: black;">&#41;</span>
&nbsp;
SPIDER = PaulSmithSpider<span style="color: black;">&#40;</span><span style="color: black;">&#41;</span></pre></td></tr></table></div>

</li>
<li>To run the spider, go to <code>~/paul_smith</code> and type <code>python scrapy-ctl.py crawl paulsmith.co.uk</code> on the command line. This will fetch the page and save it to <code>paulsmith.html</code>.</li>
<li>
<p>The next step is to parse the contents of the page. Open the page in your favourite editor and try to understand the pattern of the items we want to capture. You can see that each <code>&lt;div class="yui-u"&gt;</code> contains the required information. We are going to modify our code like so:</p>

<div class="wp_syntax"><div class="wp_syn_hdr">paul_smith.py</div><table><tr><td class="code"><pre class="python" style="font-family:monospace;"><span style="color: #ff7700;font-weight:bold;">from</span> scrapy.<span style="color: black;">spider</span> <span style="color: #ff7700;font-weight:bold;">import</span> BaseSpider
<span style="color: #ff7700;font-weight:bold;">from</span> scrapy.<span style="color: black;">selector</span> <span style="color: #ff7700;font-weight:bold;">import</span> HtmlXPathSelector
&nbsp;
<span style="color: #ff7700;font-weight:bold;">class</span> PaulSmithSpider<span style="color: black;">&#40;</span>BaseSpider<span style="color: black;">&#41;</span>:
  domain_name = <span style="color: #483d8b;">&quot;paulsmith.co.uk&quot;</span>
  start_urls = <span style="color: black;">&#91;</span><span style="color: #483d8b;">&quot;http://www.paulsmith.co.uk/paul-smith-jeans-253/category.html&quot;</span><span style="color: black;">&#93;</span>
&nbsp;
  <span style="color: #ff7700;font-weight:bold;">def</span> parse<span style="color: black;">&#40;</span><span style="color: #008000;">self</span>, response<span style="color: black;">&#41;</span>:
    hxs = HtmlXPathSelector<span style="color: black;">&#40;</span>response<span style="color: black;">&#41;</span>
    sites = hxs.<span style="color: #dc143c;">select</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'//div[@class=&quot;yui-u&quot;]'</span><span style="color: black;">&#41;</span>
    <span style="color: #ff7700;font-weight:bold;">for</span> <span style="color: #dc143c;">site</span> <span style="color: #ff7700;font-weight:bold;">in</span> sites:
      <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #dc143c;">site</span>.<span style="color: black;">extract</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
&nbsp;
SPIDER = PaulSmithSpider<span style="color: black;">&#40;</span><span style="color: black;">&#41;</span></pre></td></tr></table></div>

<p>You can read more on XPath Selectors <a href="http://doc.scrapy.org/topics/selectors.html">here</a>.</p>
</li>
<li>
<p>Finally, looking at the HTML again, we can extract <b>title, link, img-src &amp; sale-price</b> like so:</p>

<div class="wp_syntax"><div class="wp_syn_hdr">paul_smith.py</div><table><tr><td class="code"><pre class="python" style="font-family:monospace;"><span style="color: #ff7700;font-weight:bold;">from</span> scrapy.<span style="color: black;">spider</span> <span style="color: #ff7700;font-weight:bold;">import</span> BaseSpider
<span style="color: #ff7700;font-weight:bold;">from</span> scrapy.<span style="color: black;">selector</span> <span style="color: #ff7700;font-weight:bold;">import</span> HtmlXPathSelector
<span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">random</span>
&nbsp;
<span style="color: #ff7700;font-weight:bold;">class</span> PaulSmithSpider<span style="color: black;">&#40;</span>BaseSpider<span style="color: black;">&#41;</span>:
  domain_name = <span style="color: #483d8b;">&quot;paulsmith.co.uk&quot;</span>
  start_urls = <span style="color: black;">&#91;</span><span style="color: #483d8b;">&quot;http://www.paulsmith.co.uk/paul-smith-jeans-253/category.html&quot;</span><span style="color: black;">&#93;</span>
&nbsp;
  <span style="color: #ff7700;font-weight:bold;">def</span> parse<span style="color: black;">&#40;</span><span style="color: #008000;">self</span>, response<span style="color: black;">&#41;</span>:
    hxs = HtmlXPathSelector<span style="color: black;">&#40;</span>response<span style="color: black;">&#41;</span>
    sites = hxs.<span style="color: #dc143c;">select</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'//div[@class=&quot;yui-u&quot;]'</span><span style="color: black;">&#41;</span>
    <span style="color: #dc143c;">random</span>.<span style="color: black;">shuffle</span><span style="color: black;">&#40;</span>sites<span style="color: black;">&#41;</span>
    <span style="color: #ff7700;font-weight:bold;">for</span> <span style="color: #dc143c;">site</span> <span style="color: #ff7700;font-weight:bold;">in</span> sites:
      title = <span style="color: #dc143c;">site</span>.<span style="color: #dc143c;">select</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'a/strong[@class=&quot;thumbnail-text&quot;]/text()'</span><span style="color: black;">&#41;</span>.<span style="color: black;">extract</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
      hlink = <span style="color: #dc143c;">site</span>.<span style="color: #dc143c;">select</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'a/@href'</span><span style="color: black;">&#41;</span>.<span style="color: black;">extract</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
      price = <span style="color: #dc143c;">site</span>.<span style="color: #dc143c;">select</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'a/strong[@class=&quot;sale&quot;]/text()'</span><span style="color: black;">&#41;</span>.<span style="color: black;">extract</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
      image = <span style="color: #dc143c;">site</span>.<span style="color: #dc143c;">select</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'a/img/@src'</span><span style="color: black;">&#41;</span>.<span style="color: black;">extract</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
&nbsp;
      <span style="color: #ff7700;font-weight:bold;">print</span> title, hlink, image, price
&nbsp;
SPIDER = PaulSmithSpider<span style="color: black;">&#40;</span><span style="color: black;">&#41;</span></pre></td></tr></table></div>

<p>You can save this data to your datastore in whatever way you wish.</p>
</li>
<li>The output of 3 random items scraped using the above code can be seen below.</li>
</ol>
<h3>Output</h3>
<div>
<div style="width:150px;float:left;text-align:center">
    <img src="http://static.paulsmith.co.uk/images/width180/jacj-416j-773-y-34562.jpg" alt="" width="90" height="90" />
<p><a href="http://www.paulsmith.co.uk/paul-smith-jeans-253/paul-smith-jumper-shawl-collar-block-stripe-jumper-jacj-416j-773-y/product.html">Shawl Collar Block Stripe Jumper</a><br />Sale: &pound;74.00</p>
</div>
<div style="width:150px;float:left;text-align:center">
    <img src="http://static.paulsmith.co.uk/images/width180/jafj-592j-849-f-33362.jpg" alt="" width="90" height="90" />
<p><a href="http://www.paulsmith.co.uk/paul-smith-jeans-253/paul-smith-jumper-crew-neck-placement-stripe-jumper-jafj-592j-849-f/product.html">Crew Neck Placement Stripe Jumper</a><br />Sale: &pound;67.00</p>
</div>
<div style="width:150px;float:left;text-align:center">
    <img src="http://static.paulsmith.co.uk/images/width180/jacj-830h-735-x-28513.jpg" alt="" width="90" height="90" />
<p><a href="http://www.paulsmith.co.uk/paul-smith-jeans-253/paul-smith-shirt-tailored-fit-organic-cotton-cravat-print-shirt-jacj-830h-735-x/product.html">Tailored Fit, Organic Cotton Cravat Print Shirt</a><br />Sale: &pound;74.00</p>
</div>
</div>
<div style="clear:both"></div>
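
<p>To see what the relative XPath queries in the final spider pick out without running a crawl, here is a small stand-alone sketch for Python 3 that uses only the standard library. The HTML fragment is invented for illustration (it mimics the <code>yui-u</code> structure described above and is not copied from the Paul Smith site), the function name <code>extract_items</code> is hypothetical, and <code>xml.etree.ElementTree</code> merely stands in for Scrapy's selector, so it copes only with well-formed markup and a subset of XPath.</p>

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample of the listing markup; not the real page.
SAMPLE = """
<html><body>
  <div class="yui-u">
    <a href="/jumper/product.html">
      <img src="/images/jumper.jpg" />
      <strong class="thumbnail-text">Example Jumper</strong>
      <strong class="sale">Sale: 74.00</strong>
    </a>
  </div>
  <div class="yui-u">
    <a href="/shirt/product.html">
      <img src="/images/shirt.jpg" />
      <strong class="thumbnail-text">Example Shirt</strong>
      <strong class="sale">Sale: 67.00</strong>
    </a>
  </div>
</body></html>
"""


def extract_items(html):
    """Return one dict per <div class="yui-u"> with title, link, image and price."""
    root = ET.fromstring(html)
    items = []
    # Outer query: every product container on the page.
    for site in root.findall(".//div[@class='yui-u']"):
        # Inner queries are relative to the container, as in the spider.
        link = site.find("a")
        items.append({
            "title": link.findtext("strong[@class='thumbnail-text']"),
            "link": link.get("href"),
            "image": link.find("img").get("src"),
            "price": link.findtext("strong[@class='sale']"),
        })
    return items


for item in extract_items(SAMPLE):
    print(item["title"], item["link"], item["image"], item["price"])
```

<p>Collecting the fields into one dictionary per item, rather than printing them, also makes it easy to hand the rows to whatever datastore you pick.</p>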
]]></content:encoded>
			<wfw:commentRss>http://pravin.insanitybegins.com/2010/01/21/writing-a-spider-in-10-mins-using-scrapy/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Using PHP and ImageMagick to resize images</title>
		<link>http://pravin.insanitybegins.com/2009/09/09/using-php-and-imagemagick-to-resize-images/</link>
		<comments>http://pravin.insanitybegins.com/2009/09/09/using-php-and-imagemagick-to-resize-images/#comments</comments>
		<pubDate>Wed, 09 Sep 2009 10:48:23 +0000</pubDate>
		<dc:creator>Pravin Paratey</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Image]]></category>
		<category><![CDATA[ImageMagick]]></category>
		<category><![CDATA[Resize]]></category>

		<guid isPermaLink="false">http://pravin.insanitybegins.com/?p=298</guid>
		<description><![CDATA[<p>Today, I had to write some code to generate thumbnails in PHP. The <code>php-gd</code> library wasn't installed and I had to work with ImageMagick. Not the most elegant of solutions, but it works *shrug*</p>]]></description>
			<content:encoded><![CDATA[<p>Today, I had to write some code to generate thumbnails in PHP. The <code>php-gd</code> library wasn&#8217;t installed and I had to work with ImageMagick. Not the most elegant of solutions, but it works:</p>

<div class="wp_syntax"><div class="wp_syn_hdr">functions.php</div><table><tr><td class="code"><pre class="php" style="font-family:monospace;"><span style="color: #990000;">define</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'PRAVIN_THUMBNAIL_DIR'</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'/home/firedev/public_html/wp-content/cache/thumbnails/'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">function</span> pravin_resize<span style="color: #009900;">&#40;</span><span style="color: #000088;">$img_path</span><span style="color: #339933;">,</span> <span style="color: #000088;">$width</span><span style="color: #339933;">,</span> <span style="color: #000088;">$height</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #666666; font-style: italic;">// No literal quotes here: the shell would strip them from the file name,</span>
    <span style="color: #666666; font-style: italic;">// so file_exists() below would never find the cached thumbnail</span>
    <span style="color: #000088;">$resolution</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$width</span> <span style="color: #339933;">.</span> <span style="color: #0000ff;">'x'</span> <span style="color: #339933;">.</span> <span style="color: #000088;">$height</span><span style="color: #339933;">;</span>
    <span style="color: #000088;">$output_path</span> <span style="color: #339933;">=</span> PRAVIN_THUMBNAIL_DIR <span style="color: #339933;">.</span> <span style="color: #990000;">md5</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$img_path</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">.</span> <span style="color: #0000ff;">&quot;-<span style="color: #006699; font-weight: bold;">$resolution</span>.jpg&quot;</span><span style="color: #339933;">;</span>
    <span style="color: #666666; font-style: italic;">// If the file does not exist OR the thumbnail was generated more than </span>
    <span style="color: #666666; font-style: italic;">// 5 mins (5 x 60 sec) ago, then re-create the thumbnail</span>
    <span style="color: #b1b100;">if</span><span style="color: #009900;">&#40;</span><span style="color: #339933;">!</span><span style="color: #990000;">file_exists</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$output_path</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">||</span> <span style="color: #009900;">&#40;</span><span style="color: #990000;">time</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">-</span> <span style="color: #990000;">filemtime</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$output_path</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">&gt;</span> <span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">5</span> <span style="color: #339933;">*</span> <span style="color: #cc66cc;">60</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #666666; font-style: italic;">// escapeshellarg() quotes each argument safely for the shell</span>
        <span style="color: #990000;">system</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'/usr/bin/convert -resize '</span> <span style="color: #339933;">.</span> <span style="color: #990000;">escapeshellarg</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$resolution</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">.</span> <span style="color: #0000ff;">' '</span> <span style="color: #339933;">.</span> <span style="color: #990000;">escapeshellarg</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$img_path</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">.</span> <span style="color: #0000ff;">' '</span> <span style="color: #339933;">.</span> <span style="color: #990000;">escapeshellarg</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$output_path</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
    <span style="color: #b1b100;">return</span> <span style="color: #000088;">$output_path</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span></pre></td></tr></table></div>
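
<p>For comparison, the same cache-then-regenerate idea can be sketched in Python 3. Everything here is an assumption made for illustration: the function names, the <code>cache_dir</code> argument and the choice of <code>subprocess.run</code> with an argument list, which sidesteps shell quoting altogether. Like the PHP version, it expects ImageMagick's <code>convert</code> at <code>/usr/bin/convert</code>.</p>

```python
import hashlib
import os
import subprocess
import time

MAX_AGE_SECONDS = 5 * 60  # same 5-minute window as the PHP version


def thumbnail_path(cache_dir, img_path, width, height):
    """Build the cache file name from the MD5 of the source path and the size."""
    digest = hashlib.md5(img_path.encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, "%s-%dx%d.jpg" % (digest, width, height))


def is_stale(path, max_age=MAX_AGE_SECONDS, now=None):
    """True if the file is missing or was modified more than max_age seconds ago."""
    if not os.path.exists(path):
        return True
    now = time.time() if now is None else now
    return (now - os.path.getmtime(path)) > max_age


def resize(cache_dir, img_path, width, height):
    """Return the thumbnail path, calling ImageMagick only when the cache is stale."""
    output_path = thumbnail_path(cache_dir, img_path, width, height)
    if is_stale(output_path):
        # Arguments are passed as a list, so no shell is involved.
        subprocess.run(
            ["/usr/bin/convert", "-resize", "%dx%d" % (width, height),
             img_path, output_path],
            check=True,
        )
    return output_path
```

<p>Keeping the staleness check in its own function makes the five-minute rule easy to test without ImageMagick installed.</p>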

]]></content:encoded>
			<wfw:commentRss>http://pravin.insanitybegins.com/2009/09/09/using-php-and-imagemagick-to-resize-images/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Script to generate URS from Wikipedia</title>
		<link>http://pravin.insanitybegins.com/2009/07/23/script-to-generate-urs-from-wikipedia/</link>
		<comments>http://pravin.insanitybegins.com/2009/07/23/script-to-generate-urs-from-wikipedia/#comments</comments>
		<pubDate>Thu, 23 Jul 2009 11:24:55 +0000</pubDate>
		<dc:creator>Pravin Paratey</dc:creator>
				<category><![CDATA[Information Extraction]]></category>
		<category><![CDATA[NLP]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[URS]]></category>
		<category><![CDATA[Wikipedia]]></category>

		<guid isPermaLink="false">http://pravin.insanitybegins.com/?p=286</guid>
		<description><![CDATA[<p>A person’s URS is a phrase that could be used instead of his/her usual name in all circumstances, which makes it absolutely clear who he/she is. <b>ex.</b> <abbr title="Bhagat Singh (September 28, 1907 - March 23, 1931), the Indian freedom fighter, considered to be one of the most influential revolutionaries of the Indian independence movement">Bhagat Singh</abbr> was executed by the British in 1931.</p>]]></description>
			<content:encoded><![CDATA[<p>A person’s URS is a phrase that could be used instead of his/her usual name in all circumstances, which makes it absolutely clear who he/she is. A good URS for a person should meet the following criteria:</p>
<ul>
<li>Everyone familiar with the person will confidently recognise him/her from the URS.</li>
<li>There is no possibility that the URS could also describe anyone other than the person.</li>
<li>Even someone who isn&#8217;t familiar with the person will have some understanding of who he/she is from the URS.</li>
</ul>

<div class="wp_syntax"><div class="wp_syn_hdr">analyzer.py</div><table><tr><td class="code"><pre class="python" style="font-family:monospace;"><span style="color: #808080; font-style: italic;">#!/usr/bin/python</span>
<span style="color: #483d8b;">&quot;&quot;&quot; 
Script to generate URS from the starting paragraph of Wikipedia 
articles about persons.
&nbsp;
by Pravin Paratey (pravinp -at- gmail.com)
&nbsp;
Current Implementation:
----------------------
1. Extract first sentence
2. Clean wiki markup
3. The given data and Wikipedia itself show that entries about persons
   follow a common pattern. Replacing (was/is)(an/a/the/) with (/the)
   does the trick
4. Output sentence formed
&nbsp;
Ideally:
--------
Ideally, the piece of code should identify the following concepts:
1. Name of person
2. Time period
3. Son/Daughter/Father/Mother of (in case of famous personality)
4. Renowned for
&nbsp;
How do we go about it?
1 and 2 - straightforward. Wikipedia gives cues through its markup
3 - straightforward. String matching using &quot;son of&quot;, &quot;daughter of&quot;, etc
4 - will need to match against a database.
&nbsp;
For 3, we only keep the &quot;son of&quot;, &quot;daughter of&quot;, &quot;X of Y&quot; if Y is a prominent
person. An easy way of doing this is using incoming links on wikipedia OR
to search for X and Y individually on google and noting the number of results.
&quot;&quot;&quot;</span>
&nbsp;
<span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">re</span>, <span style="color: #dc143c;">sys</span>, <span style="color: #dc143c;">codecs</span>
&nbsp;
<span style="color: #ff7700;font-weight:bold;">def</span> cleanUri<span style="color: black;">&#40;</span>m<span style="color: black;">&#41;</span>:
    <span style="color: #483d8b;">&quot;&quot;&quot; Cleans Uri wiki markup &quot;&quot;&quot;</span>
    word = m.<span style="color: black;">group</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#41;</span>
    <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #483d8b;">'|'</span> <span style="color: #ff7700;font-weight:bold;">in</span> word: word = word.<span style="color: black;">split</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'|'</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span>
    <span style="color: #ff7700;font-weight:bold;">return</span> word.<span style="color: black;">strip</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
&nbsp;
<span style="color: #ff7700;font-weight:bold;">def</span> dotRemove<span style="color: black;">&#40;</span>m<span style="color: black;">&#41;</span>:
    <span style="color: #483d8b;">&quot;&quot;&quot; Replaces . by # inside tags &quot;&quot;&quot;</span>
    <span style="color: #ff7700;font-weight:bold;">return</span> m.<span style="color: black;">group</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#41;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'.'</span>, <span style="color: #483d8b;">'#'</span><span style="color: black;">&#41;</span>
&nbsp;
<span style="color: #ff7700;font-weight:bold;">def</span> cleanMarkup<span style="color: black;">&#40;</span>text<span style="color: black;">&#41;</span>:
    <span style="color: #483d8b;">&quot;&quot;&quot; Cleans text by
    1. removing wiki markup
    2. sanitizing html entities 
    3. removing comments &quot;&quot;&quot;</span>
    <span style="color: #808080; font-style: italic;">#text = re.sub(r&quot;\[\[[\w\s\-,]+\|(\w+)\]\]&quot;, r&quot;\1&quot;, text)</span>
    text = <span style="color: #dc143c;">re</span>.<span style="color: black;">sub</span><span style="color: black;">&#40;</span>r<span style="color: #483d8b;">&quot;<span style="color: #000099; font-weight: bold;">\[</span><span style="color: #000099; font-weight: bold;">\[</span>(.*?)<span style="color: #000099; font-weight: bold;">\]</span><span style="color: #000099; font-weight: bold;">\]</span>&quot;</span>, cleanUri, text<span style="color: black;">&#41;</span>
    text = <span style="color: #dc143c;">re</span>.<span style="color: black;">sub</span><span style="color: black;">&#40;</span>r<span style="color: #483d8b;">&quot;<span style="color: #000099; font-weight: bold;">\{</span><span style="color: #000099; font-weight: bold;">\{</span>.*?<span style="color: #000099; font-weight: bold;">\}</span><span style="color: #000099; font-weight: bold;">\}</span>&quot;</span>, r<span style="color: #483d8b;">&quot;&quot;</span>, text<span style="color: black;">&#41;</span>
    text = <span style="color: #dc143c;">re</span>.<span style="color: black;">sub</span><span style="color: black;">&#40;</span>r<span style="color: #483d8b;">&quot;&lt;ref&gt;.*?&lt;<span style="color: #000099; font-weight: bold;">\/</span>ref&gt;&quot;</span>, r<span style="color: #483d8b;">&quot;&quot;</span>, text<span style="color: black;">&#41;</span>
    text = <span style="color: #dc143c;">re</span>.<span style="color: black;">sub</span><span style="color: black;">&#40;</span>r<span style="color: #483d8b;">&quot;&lt;!--.*?--&gt;&quot;</span>, r<span style="color: #483d8b;">&quot;&quot;</span>, text<span style="color: black;">&#41;</span>
    text = <span style="color: #dc143c;">re</span>.<span style="color: black;">sub</span><span style="color: black;">&#40;</span>r<span style="color: #483d8b;">&quot;<span style="color: #000099; font-weight: bold;">\[</span>.*?<span style="color: #000099; font-weight: bold;">\]</span>&quot;</span>, r<span style="color: #483d8b;">&quot;&quot;</span>, text<span style="color: black;">&#41;</span>
    text = text.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;'''&quot;</span>, <span style="color: #483d8b;">&quot;&quot;</span><span style="color: black;">&#41;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;''&quot;</span>, <span style="color: #483d8b;">&quot;'&quot;</span><span style="color: black;">&#41;</span>
    text = text.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;[[&quot;</span>, <span style="color: #483d8b;">&quot;&quot;</span><span style="color: black;">&#41;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;]]&quot;</span>, <span style="color: #483d8b;">&quot;&quot;</span><span style="color: black;">&#41;</span>
    text = text.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;&amp;ndash;&quot;</span>, <span style="color: #483d8b;">&quot;-&quot;</span><span style="color: black;">&#41;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;&amp;amp;&quot;</span>, <span style="color: #483d8b;">&quot;&amp;&quot;</span><span style="color: black;">&#41;</span>
    <span style="color: #ff7700;font-weight:bold;">return</span> text
&nbsp;
<span style="color: #ff7700;font-weight:bold;">def</span> getFirstSentence<span style="color: black;">&#40;</span>text<span style="color: black;">&#41;</span>:
    <span style="color: #483d8b;">&quot;&quot;&quot; Returns the text until first instance of '.'
    It also makes sure that the '.' isn't part of a wiki link
    or name&quot;&quot;&quot;</span>
    tmp = <span style="color: #dc143c;">re</span>.<span style="color: black;">sub</span><span style="color: black;">&#40;</span>r<span style="color: #483d8b;">&quot;<span style="color: #000099; font-weight: bold;">\[</span><span style="color: #000099; font-weight: bold;">\[</span>.*?<span style="color: #000099; font-weight: bold;">\]</span><span style="color: #000099; font-weight: bold;">\]</span>&quot;</span>, dotRemove, text<span style="color: black;">&#41;</span>
    tmp = <span style="color: #dc143c;">re</span>.<span style="color: black;">sub</span><span style="color: black;">&#40;</span>r<span style="color: #483d8b;">&quot;<span style="color: #000099; font-weight: bold;">\[</span>.*?<span style="color: #000099; font-weight: bold;">\]</span>&quot;</span>, dotRemove, tmp<span style="color: black;">&#41;</span>
    tmp = <span style="color: #dc143c;">re</span>.<span style="color: black;">sub</span><span style="color: black;">&#40;</span>r<span style="color: #483d8b;">&quot;&lt;ref&gt;.*?&lt;<span style="color: #000099; font-weight: bold;">\/</span>ref&gt;&quot;</span>, dotRemove, tmp<span style="color: black;">&#41;</span>
    tmp = <span style="color: #dc143c;">re</span>.<span style="color: black;">sub</span><span style="color: black;">&#40;</span>r<span style="color: #483d8b;">&quot;&lt;!--.*?--&gt;&quot;</span>, dotRemove, tmp<span style="color: black;">&#41;</span>
    tmp = <span style="color: #dc143c;">re</span>.<span style="color: black;">sub</span><span style="color: black;">&#40;</span>r<span style="color: #483d8b;">&quot;'''.*?'''&quot;</span>, dotRemove, tmp<span style="color: black;">&#41;</span>
    tmp = <span style="color: #dc143c;">re</span>.<span style="color: black;">sub</span><span style="color: black;">&#40;</span>r<span style="color: #483d8b;">&quot;''.*?''&quot;</span>, dotRemove, tmp<span style="color: black;">&#41;</span>
    index = tmp.<span style="color: black;">find</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'.'</span><span style="color: black;">&#41;</span>
&nbsp;
    <span style="color: #ff7700;font-weight:bold;">if</span> index == -<span style="color: #ff4500;">1</span>: 
        <span style="color: #ff7700;font-weight:bold;">return</span> text
    <span style="color: #ff7700;font-weight:bold;">else</span>:
        <span style="color: #ff7700;font-weight:bold;">return</span> text<span style="color: black;">&#91;</span>:index<span style="color: black;">&#93;</span>
&nbsp;
<span style="color: #ff7700;font-weight:bold;">def</span> makeArticle<span style="color: black;">&#40;</span>m<span style="color: black;">&#41;</span>:
    <span style="color: #483d8b;">&quot;&quot;&quot; Returns ', the' when an article (a, an, the) follows was/is, else a single space &quot;&quot;&quot;</span>
    retval = <span style="color: #483d8b;">', the'</span>
    <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #008000;">len</span><span style="color: black;">&#40;</span>m.<span style="color: black;">group</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">2</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span> == <span style="color: #ff4500;">0</span>:
        retval = <span style="color: #483d8b;">' '</span>
    <span style="color: #ff7700;font-weight:bold;">return</span> retval
&nbsp;
<span style="color: #ff7700;font-weight:bold;">def</span> extractURS<span style="color: black;">&#40;</span>text<span style="color: black;">&#41;</span>:
    <span style="color: #483d8b;">&quot;&quot;&quot; The function to call. Returns the URS &quot;&quot;&quot;</span>
    text = getFirstSentence<span style="color: black;">&#40;</span>text<span style="color: black;">&#41;</span>
    text = cleanMarkup<span style="color: black;">&#40;</span>text<span style="color: black;">&#41;</span>
    text = <span style="color: #dc143c;">re</span>.<span style="color: black;">sub</span><span style="color: black;">&#40;</span>r<span style="color: #483d8b;">&quot;,?<span style="color: #000099; font-weight: bold;">\s</span>+(was|is)<span style="color: #000099; font-weight: bold;">\s</span>+(an|the|a|)&quot;</span>, makeArticle, text<span style="color: black;">&#41;</span>
    <span style="color: #ff7700;font-weight:bold;">return</span> text
&nbsp;
<span style="color: #ff7700;font-weight:bold;">if</span> __name__ == <span style="color: #483d8b;">'__main__'</span>:
    <span style="color: #808080; font-style: italic;">#fp = open(sys.argv[1])</span>
    fp = <span style="color: #dc143c;">codecs</span>.<span style="color: #008000;">open</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;input.txt&quot;</span>, <span style="color: #483d8b;">&quot;r&quot;</span>, <span style="color: #483d8b;">&quot;utf-8&quot;</span><span style="color: black;">&#41;</span>
    fp2 = <span style="color: #dc143c;">codecs</span>.<span style="color: #008000;">open</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;output.txt&quot;</span>, <span style="color: #483d8b;">&quot;w&quot;</span>, <span style="color: #483d8b;">&quot;utf-8&quot;</span><span style="color: black;">&#41;</span>
    fp2.<span style="color: black;">write</span><span style="color: black;">&#40;</span><span style="color: #dc143c;">codecs</span>.<span style="color: black;">BOM_UTF8</span>.<span style="color: black;">decode</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;utf-8&quot;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># Add BOM for UTF-8</span>
    <span style="color: #ff7700;font-weight:bold;">for</span> line <span style="color: #ff7700;font-weight:bold;">in</span> fp:
        line = line.<span style="color: black;">rstrip</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
        <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #008000;">len</span><span style="color: black;">&#40;</span>line<span style="color: black;">&#41;</span> == <span style="color: #ff4500;">0</span> <span style="color: #ff7700;font-weight:bold;">or</span> line.<span style="color: black;">startswith</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;#&quot;</span><span style="color: black;">&#41;</span>: <span style="color: #808080; font-style: italic;"># For debugging</span>
            <span style="color: #ff7700;font-weight:bold;">continue</span>
        urs = extractURS<span style="color: black;">&#40;</span>line<span style="color: black;">&#41;</span>
        fp2.<span style="color: black;">write</span><span style="color: black;">&#40;</span>urs + <span style="color: #483d8b;">'<span style="color: #000099; font-weight: bold;">\r</span><span style="color: #000099; font-weight: bold;">\n</span>'</span><span style="color: black;">&#41;</span>
    fp.<span style="color: black;">close</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
    fp2.<span style="color: black;">close</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span></pre></td></tr></table></div>

<h3>Example Inputs and Outputs</h3>
<p>The inputs are taken from the wiki source of Wikipedia articles (open an article and click Edit), e.g. <a href="http://en.wikipedia.org/w/index.php?title=Lala_Lajpat_Rai&#038;action=edit">Lala Lajpat Rai</a>. The script above outputs the URS for each input line.</p>
<p><b>Example Input:</b> <em>&#39;&#39;&#39;B. S. Johnson &#39;&#39;&#39; (Bryan Stanley Johnson) ([[5 February]],[[1933]] &#8211; [[13 November]],[[1973]]) was an English experimental novelist, poet, literary critic and film-maker.</em></p>
<p><b>Script Output:</b> <em>B. S. Johnson  (Bryan Stanley Johnson) (5 February,1933 &#8211; 13 November,1973), the English experimental novelist, poet, literary critic and film-maker</em></p>
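<p>To make the two key steps concrete, here is a minimal Python 3 sketch (the script above targets Python 2). It covers only the masking of dots inside <code>'''...'''</code> and the article substitution, not the full clean-up. The names <code>mask_dots</code>, <code>first_sentence</code> and <code>to_urs</code> are illustrative and not part of the original script, and the sample sentence is a shortened version of the example input.</p>

```python
import re


def mask_dots(match):
    # Swap '.' for '#' inside protected markup. The length is unchanged,
    # so an index found in the masked copy is valid in the original text.
    return match.group(0).replace('.', '#')


def first_sentence(text):
    # Only '''...''' is masked here; the original script also masks
    # [[...]], [...], <ref>...</ref>, comments and ''...''.
    masked = re.sub(r"'''.*?'''", mask_dots, text)
    index = masked.find('.')
    return text if index == -1 else text[:index]


def make_article(match):
    # ', the' when an article followed was/is, otherwise a single space.
    return ', the' if match.group(2) else ' '


def to_urs(text):
    sentence = first_sentence(text).replace("'''", "")
    return re.sub(r",?\s+(was|is)\s+(an|the|a|)", make_article, sentence)


sample = ("'''B. S. Johnson''' (Bryan Stanley Johnson) was an English "
          "experimental novelist. He died in 1973.")
print(to_urs(sample))
```

<p>The masking is what keeps the dots in "B. S. Johnson" from being mistaken for the end of the sentence.</p>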
<h3>How are URS used?</h3>
<p>A URS can be substituted directly into a sentence containing that person&#8217;s name. (Hover over Bhagat Singh below to see his URS.)</p>
<p>Ex. <abbr title="Bhagat Singh (September 28, 1907 - March 23, 1931), the Indian freedom fighter, considered to be one of the most influential revolutionaries of the Indian independence movement">Bhagat Singh</abbr> was executed by the British in 1931.</p>
<p>This way, a reader who had no idea who Bhagat Singh was now has more context about him.</p>
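<p>The hover text in the example is just an <code>abbr</code> element whose <code>title</code> holds the URS. A minimal Python 3 sketch of generating it follows; the function name <code>annotate</code> is hypothetical, and it assumes the sentence itself is already safe HTML.</p>

```python
import html


def annotate(sentence, name, urs):
    """Wrap the first occurrence of name in an <abbr> whose title is the URS."""
    tag = '<abbr title="{}">{}</abbr>'.format(
        html.escape(urs, quote=True),  # quotes must be escaped inside an attribute
        html.escape(name),
    )
    # Replace only the first occurrence; later mentions need no tooltip.
    return sentence.replace(name, tag, 1)


print(annotate("Bhagat Singh was executed by the British in 1931.",
               "Bhagat Singh",
               "Bhagat Singh, the Indian freedom fighter"))
```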
]]></content:encoded>
			<wfw:commentRss>http://pravin.insanitybegins.com/2009/07/23/script-to-generate-urs-from-wikipedia/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Snippet to generate a random word in python</title>
		<link>http://pravin.insanitybegins.com/2009/04/26/snippet-to-generate-a-random-word-in-python/</link>
		<comments>http://pravin.insanitybegins.com/2009/04/26/snippet-to-generate-a-random-word-in-python/#comments</comments>
		<pubDate>Mon, 27 Apr 2009 06:50:57 +0000</pubDate>
		<dc:creator>Pravin Paratey</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Generator]]></category>
		<category><![CDATA[Random]]></category>
		<category><![CDATA[Word]]></category>

		<guid isPermaLink="false">http://pravin.insanitybegins.com/?p=276</guid>
		<description><![CDATA[Since I haven’t posted here in a while, I figured I’d whip up this example real quick. It illustrates the usage of the random module.]]></description>
			<content:encoded><![CDATA[<p>Since I haven&#8217;t posted here in a while, I figured I&#8217;d whip up this example real quick. It illustrates the usage of the <code>random</code> module.</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="python" style="font-family:monospace;"><span style="color: #808080; font-style: italic;">#!/usr/bin/python</span>
<span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">random</span>
<span style="color: #ff7700;font-weight:bold;">from</span> <span style="color: #dc143c;">time</span> <span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">time</span>
&nbsp;
<span style="color: #ff7700;font-weight:bold;">def</span> generateWord<span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>:
	char_array = <span style="color: #483d8b;">'abcdefghijklmnopqrstuvwxyz'</span>
	<span style="color: #dc143c;">random</span>.<span style="color: black;">seed</span><span style="color: black;">&#40;</span><span style="color: #dc143c;">time</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>
	word = <span style="color: #483d8b;">''</span>
	<span style="color: #ff7700;font-weight:bold;">for</span> i <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #008000;">range</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">0</span>, <span style="color: #ff4500;">8</span><span style="color: black;">&#41;</span>: <span style="color: #808080; font-style: italic;"># 8 letter word</span>
		word += char_array<span style="color: black;">&#91;</span><span style="color: #dc143c;">random</span>.<span style="color: black;">randint</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">0</span>, <span style="color: #ff4500;">25</span><span style="color: black;">&#41;</span><span style="color: black;">&#93;</span>
	<span style="color: #ff7700;font-weight:bold;">return</span> word
&nbsp;
<span style="color: #ff7700;font-weight:bold;">if</span> __name__ == <span style="color: #483d8b;">'__main__'</span>:
	<span style="color: #ff7700;font-weight:bold;">print</span> generateWord<span style="color: black;">&#40;</span><span style="color: black;">&#41;</span></pre></td></tr></table></div>
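<p>For comparison, a shorter Python 3 variant is sketched below. It leans on <code>random.choice</code> and <code>string.ascii_lowercase</code>, and skips the explicit seeding because the <code>random</code> module seeds itself when it is first imported. The name <code>generate_word</code> and its <code>length</code> and <code>rng</code> parameters are my own additions, not part of the snippet above.</p>

```python
import random
import string


def generate_word(length=8, rng=random):
    # rng can be the random module itself or a random.Random instance,
    # which makes the output reproducible when a fixed seed is needed.
    return ''.join(rng.choice(string.ascii_lowercase) for _ in range(length))


if __name__ == '__main__':
    print(generate_word())
```

<p>Passing <code>random.Random(42)</code> as <code>rng</code> gives a repeatable word, which is handy in tests.</p>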

]]></content:encoded>
			<wfw:commentRss>http://pravin.insanitybegins.com/2009/04/26/snippet-to-generate-a-random-word-in-python/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Java Regex Matching</title>
		<link>http://pravin.insanitybegins.com/2009/04/16/java-regex-matching/</link>
		<comments>http://pravin.insanitybegins.com/2009/04/16/java-regex-matching/#comments</comments>
		<pubDate>Thu, 16 Apr 2009 12:42:00 +0000</pubDate>
		<dc:creator>Pravin Paratey</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[find]]></category>
		<category><![CDATA[java]]></category>
		<category><![CDATA[match]]></category>
		<category><![CDATA[regex]]></category>

		<guid isPermaLink="false">http://pravin.insanitybegins.com/?p=270</guid>
		<description><![CDATA[This snippet illustrates Regular Expressions in Java. Um, I didn't realize <code>matcher.find()</code> had to be called. Wasted half a day!]]></description>
			<content:encoded><![CDATA[<p>It took me half a day to figure out that <code>matcher.find()</code> has to be called before <code>matcher.group()</code>. Gaah!</p>

<div class="wp_syntax"><div class="wp_syn_hdr">RegexTest.java</div><table><tr><td class="code"><pre class="java" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">import</span> <span style="color: #006699;">java.util.regex.Matcher</span><span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> <span style="color: #006699;">java.util.regex.Pattern</span><span style="color: #339933;">;</span>
&nbsp;
&nbsp;
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> RegexTest <span style="color: #009900;">&#123;</span>
    <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #000066; font-weight: bold;">void</span> main<span style="color: #009900;">&#40;</span><span style="color: #003399;">String</span> args<span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span> 
    <span style="color: #009900;">&#123;</span>
        Pattern pattern <span style="color: #339933;">=</span> Pattern.<span style="color: #006633;">compile</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;name=<span style="color: #000099; font-weight: bold;">\&quot;</span>QTime<span style="color: #000099; font-weight: bold;">\&quot;</span>&gt;(<span style="color: #000099; font-weight: bold;">\\</span>d+)&lt;/int&gt;&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        Matcher matcher <span style="color: #339933;">=</span> pattern.<span style="color: #006633;">matcher</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;&lt;response&gt;&lt;lst name=<span style="color: #000099; font-weight: bold;">\&quot;</span>responseHeader<span style="color: #000099; font-weight: bold;">\&quot;</span>&gt;&quot;</span> <span style="color: #339933;">+</span> 
            <span style="color: #0000ff;">&quot;&lt;int name=<span style="color: #000099; font-weight: bold;">\&quot;</span>status<span style="color: #000099; font-weight: bold;">\&quot;</span>&gt;0&lt;/int&gt;&lt;int name=<span style="color: #000099; font-weight: bold;">\&quot;</span>QTime<span style="color: #000099; font-weight: bold;">\&quot;</span>&gt;2&lt;/int&gt;&lt;/lst&gt;&lt;/response&gt;&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
        <span style="color: #666666; font-style: italic;">// find() must run before group(); it returns false when nothing matches</span>
        <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>matcher.<span style="color: #006633;">find</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            <span style="color: #666666; font-style: italic;">// group(1) holds the captured digits</span>
            <span style="color: #003399;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span>matcher.<span style="color: #006633;">group</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span>
        <span style="color: #000000; font-weight: bold;">else</span> <span style="color: #009900;">&#123;</span>
            <span style="color: #003399;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;Couldn't find QTime&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></td></tr></table></div>

<p>P.S. I love Python</p>
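<p>In that spirit, here is roughly the same lookup in Python 3, as a sketch rather than a drop-in replacement. <code>re.search</code> plays the role of <code>matcher.find()</code>: it returns <code>None</code> when nothing matches, so the check cannot be skipped by accident. The names <code>RESPONSE</code> and <code>extract_qtime</code> are made up for this example.</p>

```python
import re

RESPONSE = ('<response><lst name="responseHeader">'
            '<int name="status">0</int><int name="QTime">2</int></lst></response>')


def extract_qtime(xml_text):
    # search() scans the whole string, like Matcher.find() in Java.
    match = re.search(r'name="QTime">(\d+)</int>', xml_text)
    # group(1) is the first capture group; None signals "not found".
    return match.group(1) if match else None


if __name__ == '__main__':
    print(extract_qtime(RESPONSE))
```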
]]></content:encoded>
			<wfw:commentRss>http://pravin.insanitybegins.com/2009/04/16/java-regex-matching/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Simple Webserver in Python</title>
		<link>http://pravin.insanitybegins.com/2009/04/15/simple-webserver-in-python/</link>
		<comments>http://pravin.insanitybegins.com/2009/04/15/simple-webserver-in-python/#comments</comments>
		<pubDate>Wed, 15 Apr 2009 10:10:45 +0000</pubDate>
		<dc:creator>Pravin Paratey</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[tutorial]]></category>
		<category><![CDATA[Webserver]]></category>

		<guid isPermaLink="false">http://pravin.insanitybegins.com/?p=262</guid>
		<description><![CDATA[This snippet illustrates how one can easily build an HTTP web server in Python. Note that the listing does not populate self.args; query parameters have to be parsed out of self.path. [...]]]></description>
			<content:encoded><![CDATA[<p>This snippet illustrates how one can easily build an HTTP web server in Python. Note that the listing does not populate <code>self.args</code>; query parameters have to be parsed out of <code>self.path</code>.</p>

<div class="wp_syntax"><div class="wp_syn_hdr">WebServer.py</div><table><tr><td class="code"><pre class="python" style="font-family:monospace;"><span style="color: #808080; font-style: italic;">#!/usr/bin/env python</span>
<span style="color: #808080; font-style: italic;"># -*- coding: utf-8 -*-</span>
<span style="color: #808080; font-style: italic;"># Simple WebServer Illustration</span>
<span style="color: #808080; font-style: italic;"># Pravin Paratey (April 15, 2009) [pravinp at gmail dot com]</span>
&nbsp;
<span style="color: #ff7700;font-weight:bold;">from</span> <span style="color: #dc143c;">BaseHTTPServer</span> <span style="color: #ff7700;font-weight:bold;">import</span> BaseHTTPRequestHandler, HTTPServer
&nbsp;
<span style="color: #ff7700;font-weight:bold;">class</span> MyHandler<span style="color: black;">&#40;</span>BaseHTTPRequestHandler<span style="color: black;">&#41;</span>:
    binaryExtensions = <span style="color: black;">&#91;</span><span style="color: #483d8b;">'.gif'</span>, <span style="color: #483d8b;">'.png'</span>, <span style="color: #483d8b;">'.jpg'</span><span style="color: black;">&#93;</span>
    contentTypes = <span style="color: black;">&#123;</span>
        <span style="color: #483d8b;">'.css'</span>: <span style="color: #483d8b;">'text/css'</span>,
        <span style="color: #483d8b;">'.gif'</span>: <span style="color: #483d8b;">'image/gif'</span>,
        <span style="color: #483d8b;">'.jpg'</span>: <span style="color: #483d8b;">'image/jpeg'</span>,
        <span style="color: #483d8b;">'.png'</span>: <span style="color: #483d8b;">'image/png'</span>,
        <span style="color: #483d8b;">'html'</span>: <span style="color: #483d8b;">'text/html'</span>,
    <span style="color: black;">&#125;</span>
&nbsp;
    <span style="color: #ff7700;font-weight:bold;">def</span> do_GET<span style="color: black;">&#40;</span><span style="color: #008000;">self</span><span style="color: black;">&#41;</span>:
        <span style="color: #483d8b;">&quot;&quot;&quot; Implementing the GET method &quot;&quot;&quot;</span>
        <span style="color: #ff7700;font-weight:bold;">try</span>:
            <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #008000;">self</span>.<span style="color: black;">path</span> == <span style="color: #483d8b;">'/'</span>: <span style="color: #008000;">self</span>.<span style="color: black;">path</span> = <span style="color: #483d8b;">'/index.html'</span>
            mode = <span style="color: #483d8b;">'r'</span>
            <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #008000;">self</span>.<span style="color: black;">path</span><span style="color: black;">&#91;</span>-<span style="color: #ff4500;">4</span>:<span style="color: black;">&#93;</span> <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #008000;">self</span>.<span style="color: black;">binaryExtensions</span>: mode = <span style="color: #483d8b;">'rb'</span>
            fp = <span style="color: #008000;">open</span><span style="color: black;">&#40;</span><span style="color: #008000;">self</span>.<span style="color: black;">path</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span>:<span style="color: black;">&#93;</span>, mode<span style="color: black;">&#41;</span>
            data = fp.<span style="color: black;">read</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
            fp.<span style="color: black;">close</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
            <span style="color: #808080; font-style: italic;"># Send response</span>
            <span style="color: #008000;">self</span>.<span style="color: black;">send_response</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">200</span><span style="color: black;">&#41;</span>
            <span style="color: #008000;">self</span>.<span style="color: black;">send_header</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'Content-Type'</span>, <span style="color: #008000;">self</span>.__getContentType<span style="color: black;">&#40;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>
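            <span style="color: #008000;">self</span>.<span style="color: black;">send_header</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'Content-Length'</span>, <span style="color: #008000;">str</span><span style="color: black;">&#40;</span><span style="color: #008000;">len</span><span style="color: black;">&#40;</span>data<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>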
            <span style="color: #008000;">self</span>.<span style="color: black;">end_headers</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
            <span style="color: #008000;">self</span>.<span style="color: black;">wfile</span>.<span style="color: black;">write</span><span style="color: black;">&#40;</span>data<span style="color: black;">&#41;</span>
        <span style="color: #ff7700;font-weight:bold;">except</span> <span style="color: #008000;">IOError</span>:
            <span style="color: #008000;">self</span>.<span style="color: black;">send_error</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">404</span>, <span style="color: #483d8b;">&quot;File not found: %s&quot;</span> <span style="color: #66cc66;">%</span> <span style="color: #008000;">self</span>.<span style="color: black;">path</span><span style="color: black;">&#41;</span>
&nbsp;
    <span style="color: #ff7700;font-weight:bold;">def</span> __getContentType<span style="color: black;">&#40;</span><span style="color: #008000;">self</span><span style="color: black;">&#41;</span>:
        <span style="color: #483d8b;">&quot;&quot;&quot; Function to figure out content types &quot;&quot;&quot;</span>
        content_type = <span style="color: #483d8b;">'text/plain'</span>
        extension = <span style="color: #008000;">self</span>.<span style="color: black;">path</span><span style="color: black;">&#91;</span>-<span style="color: #ff4500;">4</span>:<span style="color: black;">&#93;</span>
        <span style="color: #ff7700;font-weight:bold;">if</span> extension <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #008000;">self</span>.<span style="color: black;">contentTypes</span>:
            content_type = <span style="color: #008000;">self</span>.<span style="color: black;">contentTypes</span><span style="color: black;">&#91;</span>extension<span style="color: black;">&#93;</span>            
        <span style="color: #ff7700;font-weight:bold;">return</span> content_type
&nbsp;
<span style="color: #ff7700;font-weight:bold;">if</span> __name__ == <span style="color: #483d8b;">'__main__'</span>:
    server = HTTPServer<span style="color: black;">&#40;</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">''</span>, <span style="color: #ff4500;">8000</span><span style="color: black;">&#41;</span>, MyHandler<span style="color: black;">&#41;</span>
    server.<span style="color: black;">serve_forever</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span></pre></td></tr></table></div>
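<p>The handler above works on the raw <code>self.path</code>, so a request such as <code>/index.html?name=pravin</code> would be looked up as a file name including the query string. Below is a minimal Python 3 sketch of separating the two (the listing itself is Python 2, where the module is <code>BaseHTTPServer</code> rather than <code>http.server</code>). The helper names <code>split_request_path</code> and <code>content_type_for</code> are hypothetical, and the content-type table simply mirrors the one in the listing.</p>

```python
import posixpath
from urllib.parse import parse_qs, urlsplit

CONTENT_TYPES = {
    '.css': 'text/css',
    '.gif': 'image/gif',
    '.jpg': 'image/jpeg',
    '.png': 'image/png',
    '.html': 'text/html',
}


def split_request_path(raw_path):
    # Separate the file part of the request from its query string.
    parts = urlsplit(raw_path)
    # parse_qs maps each parameter name to a list of its values.
    return parts.path, parse_qs(parts.query)


def content_type_for(path):
    # splitext copes with extensions of any length, unlike slicing path[-4:].
    extension = posixpath.splitext(path)[1].lower()
    return CONTENT_TYPES.get(extension, 'text/plain')


if __name__ == '__main__':
    path, args = split_request_path('/index.html?name=pravin&tag=a&tag=b')
    print(path, args, content_type_for(path))
```

<p>Inside <code>do_GET</code>, the returned dictionary could be stored as <code>self.args</code> before the file is opened.</p>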

]]></content:encoded>
			<wfw:commentRss>http://pravin.insanitybegins.com/2009/04/15/simple-webserver-in-python/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>XML DOM in Java</title>
		<link>http://pravin.insanitybegins.com/2009/03/29/xml-dom-in-java/</link>
		<comments>http://pravin.insanitybegins.com/2009/03/29/xml-dom-in-java/#comments</comments>
		<pubDate>Sun, 29 Mar 2009 10:00:19 +0000</pubDate>
		<dc:creator>Pravin Paratey</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[dom]]></category>
		<category><![CDATA[xml]]></category>

		<guid isPermaLink="false">http://pravin.insanitybegins.com/?p=260</guid>
		<description><![CDATA[Of late, I have been working with Java, and one of the issues I faced was XML parsing. With so many libraries available, I decided to stick to JAXP. What follows is sample code to walk the tree of nodes: [...]]]></description>
			<content:encoded><![CDATA[<p>Of late, I have been working with Java, and one of the issues I faced was XML parsing. With so many libraries available, I decided to stick to JAXP. What follows is sample code to walk the tree of nodes:</p>

<div class="wp_syntax"><div class="wp_syn_hdr">TreeWalk.java</div><table><tr><td class="code"><pre class="java" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">import</span> <span style="color: #006699;">java.io.File</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">import</span> <span style="color: #006699;">javax.xml.parsers.DocumentBuilder</span><span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> <span style="color: #006699;">javax.xml.parsers.DocumentBuilderFactory</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">import</span> <span style="color: #006699;">org.w3c.dom.Node</span><span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> <span style="color: #006699;">org.w3c.dom.NodeList</span><span style="color: #339933;">;</span>
&nbsp;
&nbsp;
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> TreeWalk <span style="color: #009900;">&#123;</span>
    <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #000066; font-weight: bold;">void</span> main<span style="color: #009900;">&#40;</span><span style="color: #003399;">String</span> args<span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span> 
    <span style="color: #009900;">&#123;</span>
        DocumentBuilderFactory factory <span style="color: #339933;">=</span> DocumentBuilderFactory.<span style="color: #006633;">newInstance</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        factory.<span style="color: #006633;">setNamespaceAware</span><span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">false</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
        <span style="color: #000000; font-weight: bold;">try</span> <span style="color: #009900;">&#123;</span>
            DocumentBuilder builder <span style="color: #339933;">=</span> factory.<span style="color: #006633;">newDocumentBuilder</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            org.<span style="color: #006633;">w3c</span>.<span style="color: #006633;">dom</span>.<span style="color: #003399;">Document</span> doc <span style="color: #339933;">=</span> builder.<span style="color: #006633;">parse</span><span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399;">File</span><span style="color: #009900;">&#40;</span>args<span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">0</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            NodeList nodes1 <span style="color: #339933;">=</span> doc.<span style="color: #006633;">getChildNodes</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            <span style="color: #000000; font-weight: bold;">for</span><span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">int</span> i<span style="color: #339933;">=</span><span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span> i<span style="color: #339933;">&lt;</span>nodes1.<span style="color: #006633;">getLength</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span> i<span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
                TreeWalk<span style="color: #009900;">&#40;</span>nodes1.<span style="color: #006633;">item</span><span style="color: #009900;">&#40;</span>i<span style="color: #009900;">&#41;</span>, <span style="color: #cc66cc;">0</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            <span style="color: #009900;">&#125;</span>
        <span style="color: #009900;">&#125;</span>
        <span style="color: #000000; font-weight: bold;">catch</span><span style="color: #009900;">&#40;</span><span style="color: #003399;">Exception</span> e<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            e.<span style="color: #006633;">printStackTrace</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span>
&nbsp;
    <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #000066; font-weight: bold;">void</span> TreeWalk<span style="color: #009900;">&#40;</span>Node n, <span style="color: #000066; font-weight: bold;">int</span> level<span style="color: #009900;">&#41;</span> 
    <span style="color: #009900;">&#123;</span>
        <span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">&#40;</span>n.<span style="color: #006633;">getNodeType</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">!=</span> Node.<span style="color: #006633;">TEXT_NODE</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            <span style="color: #000000; font-weight: bold;">for</span><span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">int</span> i<span style="color: #339933;">=</span><span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span> i<span style="color: #339933;">&lt;</span>level<span style="color: #339933;">;</span> i<span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span>
                <span style="color: #003399;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">print</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;  &quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            <span style="color: #003399;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">print</span><span style="color: #009900;">&#40;</span>n.<span style="color: #006633;">getNodeName</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">+</span> <span style="color: #0000ff;">&quot;:&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span>
        <span style="color: #000000; font-weight: bold;">else</span> <span style="color: #009900;">&#123;</span>
            <span style="color: #003399;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span>n.<span style="color: #006633;">getNodeValue</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">trim</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span>
        NodeList list <span style="color: #339933;">=</span> n.<span style="color: #006633;">getChildNodes</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #000000; font-weight: bold;">for</span><span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">int</span> i<span style="color: #339933;">=</span><span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span> i<span style="color: #339933;">&lt;</span>list.<span style="color: #006633;">getLength</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span> i<span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            TreeWalk<span style="color: #009900;">&#40;</span>list.<span style="color: #006633;">item</span><span style="color: #009900;">&#40;</span>i<span style="color: #009900;">&#41;</span>, level<span style="color: #339933;">+</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></td></tr></table></div>

]]></content:encoded>
			<wfw:commentRss>http://pravin.insanitybegins.com/2009/03/29/xml-dom-in-java/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Hacking wp-syntax plugin to show header</title>
		<link>http://pravin.insanitybegins.com/2009/03/28/hacking-wp-syntax-plugin-to-show-header/</link>
		<comments>http://pravin.insanitybegins.com/2009/03/28/hacking-wp-syntax-plugin-to-show-header/#comments</comments>
		<pubDate>Sat, 28 Mar 2009 07:58:11 +0000</pubDate>
		<dc:creator>Pravin Paratey</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Plugin]]></category>
		<category><![CDATA[Syntax Highlighting]]></category>
		<category><![CDATA[Wordpress]]></category>

		<guid isPermaLink="false">http://pravin.insanitybegins.com/?p=252</guid>
		<description><![CDATA[I was recently asked how I got the wp-syntax plugin to show a header, such as a file name, above a code block. To do this, I modified the wp-syntax.php file (present in /wp-content/plugins/wp-syntax/): I changed the regular expression in the wp_syntax_before_filter function and the start of the wp_syntax_highlight function. [...]]]></description>
			<content:encoded><![CDATA[<p>I was recently asked how I got the wp-syntax plugin to show a header like so:</p>

<div class="wp_syntax"><div class="wp_syn_hdr">test.cpp</div><table><tr><td class="line_numbers"><pre>1
2
3
</pre></td><td class="code"><pre class="cpp" style="font-family:monospace;"><span style="color: #0000ff;">int</span> main<span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span> <span style="color: #008000;">&#123;</span>
	<span style="color: #0000ff;">return</span> <span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>
<span style="color: #008000;">&#125;</span></pre></td></tr></table></div>

<p>To show the test.cpp file name, I modified the <code>wp-syntax.php</code> file (present in <code>/wp-content/plugins/wp-syntax/</code>) in two places.</p>
<p>I changed the regular expression in the <code>wp_syntax_before_filter</code> function from:</p>

<div class="wp_syntax"><div class="wp_syn_hdr">wp-syntax.php</div><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">function</span> wp_syntax_before_filter<span style="color: #009900;">&#40;</span><span style="color: #000088;">$content</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
    <span style="color: #b1b100;">return</span> <span style="color: #990000;">preg_replace_callback</span><span style="color: #009900;">&#40;</span>
        <span style="color: #0000ff;">&quot;/\s*&lt;pre(?:lang=[<span style="color: #000099; font-weight: bold;">\&quot;</span>']([\w-]*)[<span style="color: #000099; font-weight: bold;">\&quot;</span>']|line=[<span style="color: #000099; font-weight: bold;">\&quot;</span>'](\d*)[<span style="color: #000099; font-weight: bold;">\&quot;</span>']|escaped=[<span style="color: #000099; font-weight: bold;">\&quot;</span>'](true|false)?[<span style="color: #000099; font-weight: bold;">\&quot;</span>']|\s)+&gt;(.*)&lt;\/pre&gt;\s*/siU&quot;</span><span style="color: #339933;">,</span>
        <span style="color: #0000ff;">&quot;wp_syntax_substitute&quot;</span><span style="color: #339933;">,</span>
        <span style="color: #000088;">$content</span>
    <span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>to</p>

<div class="wp_syntax"><div class="wp_syn_hdr">wp-syntax.php</div><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">function</span> wp_syntax_before_filter<span style="color: #009900;">&#40;</span><span style="color: #000088;">$content</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
    <span style="color: #b1b100;">return</span> <span style="color: #990000;">preg_replace_callback</span><span style="color: #009900;">&#40;</span>
        <span style="color: #0000ff;">&quot;/\s*&lt;pre(?:lang=[<span style="color: #000099; font-weight: bold;">\&quot;</span>']([\w-]*)[<span style="color: #000099; font-weight: bold;">\&quot;</span>']|line=[<span style="color: #000099; font-weight: bold;">\&quot;</span>'](\d*)[<span style="color: #000099; font-weight: bold;">\&quot;</span>']|escaped=[<span style="color: #000099; font-weight: bold;">\&quot;</span>'](true|false)?[<span style="color: #000099; font-weight: bold;">\&quot;</span>']|header=[<span style="color: #000099; font-weight: bold;">\&quot;</span>']([\w-\. ]*)[<span style="color: #000099; font-weight: bold;">\&quot;</span>']|\s)+&gt;(.*)&lt;\/pre&gt;\s*/siU&quot;</span><span style="color: #339933;">,</span>
        <span style="color: #0000ff;">&quot;wp_syntax_substitute&quot;</span><span style="color: #339933;">,</span>
        <span style="color: #000088;">$content</span>
    <span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>I then changed the start of the <code>wp_syntax_highlight</code> function to:</p>

<div class="wp_syntax"><table><tr><td class="line_numbers"><pre>94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
</pre></td><td class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">function</span> wp_syntax_highlight<span style="color: #009900;">&#40;</span><span style="color: #000088;">$match</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
    <span style="color: #000000; font-weight: bold;">global</span> <span style="color: #000088;">$wp_syntax_matches</span><span style="color: #339933;">;</span>
&nbsp;
    <span style="color: #000088;">$i</span> <span style="color: #339933;">=</span> <span style="color: #990000;">intval</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$match</span><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #000088;">$match</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$wp_syntax_matches</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
&nbsp;
    <span style="color: #000088;">$language</span> <span style="color: #339933;">=</span> <span style="color: #990000;">strtolower</span><span style="color: #009900;">&#40;</span><span style="color: #990000;">trim</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$match</span><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #000088;">$line</span> <span style="color: #339933;">=</span> <span style="color: #990000;">trim</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$match</span><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #000088;">$escaped</span> <span style="color: #339933;">=</span> <span style="color: #990000;">trim</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$match</span><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #000088;">$header</span> <span style="color: #339933;">=</span> <span style="color: #990000;">trim</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$match</span><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">4</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #000088;">$code</span> <span style="color: #339933;">=</span> wp_syntax_code_trim<span style="color: #009900;">&#40;</span><span style="color: #000088;">$match</span><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">5</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$escaped</span> <span style="color: #339933;">==</span> <span style="color: #0000ff;">&quot;true&quot;</span><span style="color: #009900;">&#41;</span> <span style="color: #000088;">$code</span> <span style="color: #339933;">=</span> <span style="color: #990000;">htmlspecialchars_decode</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$code</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
    <span style="color: #000088;">$geshi</span> <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> GeSHi<span style="color: #009900;">&#40;</span><span style="color: #000088;">$code</span><span style="color: #339933;">,</span> <span style="color: #000088;">$language</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #000088;">$geshi</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">enable_keyword_links</span><span style="color: #009900;">&#40;</span><span style="color: #009900; font-weight: bold;">false</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    do_action_ref_array<span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'wp_syntax_init_geshi'</span><span style="color: #339933;">,</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #339933;">&amp;</span><span style="color: #000088;">$geshi</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
    <span style="color: #000088;">$output</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&lt;div class=<span style="color: #000099; font-weight: bold;">\&quot;</span>wp_syntax<span style="color: #000099; font-weight: bold;">\&quot;</span>&gt;&quot;</span><span style="color: #339933;">;</span>
&nbsp;
    <span style="color: #b1b100;">if</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$header</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #000088;">$output</span> <span style="color: #339933;">.=</span> <span style="color: #0000ff;">&quot;&lt;div class=<span style="color: #000099; font-weight: bold;">\&quot;</span>wp_syn_hdr<span style="color: #000099; font-weight: bold;">\&quot;</span>&gt;&quot;</span> <span style="color: #339933;">.</span> <span style="color: #000088;">$header</span> <span style="color: #339933;">.</span> <span style="color: #0000ff;">&quot;&lt;/div&gt;&quot;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span></pre></td></tr></table></div>

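<p>Note the addition of lines 104 and 114-116. Because the new <code>header</code> group is the fourth capture in the regular expression, the code on line 105 is now read from <code>$match[5]</code>.</p>
<p>All you have to do now is add a <code>header="header-text"</code> attribute to your pre tag, e.g. <code>&lt;pre lang="php" line="1" header="wp-syntax.php"&gt;</code>.</p>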
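
<p>To see how the extra alternative shifts the capture groups, here is a rough sketch outside WordPress. It is a Java translation for illustration only, not the plugin's code: the class <code>HeaderAttrSketch</code> and its <code>parse</code> method are hypothetical, PCRE's <code>U</code> flag is approximated by making only the code group lazy, and the hyphen is placed last in the header character class.</p>

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration in Java; the plugin itself uses PHP's preg_replace_callback.
class HeaderAttrSketch {

    // Java translation of the modified pattern, with header as the fourth group.
    private static final Pattern PRE = Pattern.compile(
            "\\s*<pre(?:lang=[\"']([\\w-]*)[\"']"
            + "|line=[\"'](\\d*)[\"']"
            + "|escaped=[\"'](true|false)?[\"']"
            + "|header=[\"']([\\w. -]*)[\"']"
            + "|\\s)+>(.*?)</pre>\\s*",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns {lang, line, escaped, header, code}; absent attributes are null.
    // Returns null when the input contains no matching <pre> tag.
    static String[] parse(String html) {
        Matcher m = PRE.matcher(html);
        if (!m.find()) {
            return null;
        }
        String[] groups = new String[5];
        for (int i = 0; i < 5; i++) {
            groups[i] = m.group(i + 1);
        }
        return groups;
    }

    public static void main(String[] args) {
        String[] g = parse("<pre lang=\"php\" line=\"1\" header=\"wp-syntax.php\">echo 1;</pre>");
        System.out.println("header = " + g[3] + ", code = " + g[4]);
    }
}
```

<p>A tag without a <code>header</code> attribute still matches. The header slot is then <code>null</code>, which corresponds to the <code>if($header)</code> check skipping the extra <code>div</code> in the PHP version.</p>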
]]></content:encoded>
			<wfw:commentRss>http://pravin.insanitybegins.com/2009/03/28/hacking-wp-syntax-plugin-to-show-header/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>
