Feb
19

Join a list of integers in Python

Today I had to pretty-print a list of integers for debugging. Joining them directly does not work, because str.join accepts only strings:

>>> t = [1, 2, 3, 4]
>>> ' '.join(t)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: sequence item 0: expected string, int found

So I came up with this:

>>> def concat(x, y): return str(x) + ' ' + str(y)
>>> reduce(concat, t)
'1 2 3 4'

I am sure there is a better way of doing this!
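
One common alternative, offered as a sketch rather than the definitive answer: convert every integer with str() first, then join. The helper name join_ints is made up for this example, and the code targets Python 3.

```python
def join_ints(numbers, sep=' '):
    """Return the items of `numbers` converted with str() and joined by `sep`."""
    # str.join only accepts strings, so convert every item first.
    return sep.join(map(str, numbers))


t = [1, 2, 3, 4]
print(join_ints(t))
```

The same idea fits on one line at the prompt: ' '.join(map(str, t)).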

Jan
21

Writing a spider in 10 mins using Scrapy

I came across Scrapy a few days back and have grown to really love it. This tutorial illustrates how you can write a simple spider using Scrapy to scrape product data from the Paul Smith website. All this in 10 minutes.

Let's begin

  1. Download and install scrapy and its dependencies.
  2. This done, open up your terminal and type python scrapy-ctl.py startproject paul_smith. A scrapy project will be created.
  3. Navigate to ~/paul_smith/paul_smith/spiders and create the file paul_smith.py with the following contents:

    paul_smith.py
    
    from scrapy.spider import BaseSpider
     
    class PaulSmithSpider(BaseSpider):
      domain_name = "paulsmith.co.uk"
      start_urls = ["http://www.paulsmith.co.uk/paul-smith-jeans-253/category.html"]
     
      def parse(self, response):
        open('paulsmith.html', 'wb').write(response.body)
     
    SPIDER = PaulSmithSpider()
  4. To run the spider, go to ~/paul_smith and type python scrapy-ctl.py crawl paulsmith.co.uk on the command line. This will fetch the page and save it to paulsmith.html.
  5. The next step is to parse the contents of the page. Open the page in your favourite editor and look for the pattern in the items we want to capture. You can see that each <div class="yui-u"> contains the required information. We are going to modify our code like so:

    paul_smith.py
    
    from scrapy.spider import BaseSpider
    from scrapy.selector import HtmlXPathSelector
     
    class PaulSmithSpider(BaseSpider):
      domain_name = "paulsmith.co.uk"
      start_urls = ["http://www.paulsmith.co.uk/paul-smith-jeans-253/category.html"]
     
      def parse(self, response):
        hxs = HtmlXPathSelector(response)
        sites = hxs.select('//div[@class="yui-u"]')
        for site in sites:
          print site.extract()
     
    SPIDER = PaulSmithSpider()

    You can read more on XPath Selectors here.

  6. Finally, looking at the HTML again, we can extract title, link, img-src & sale-price like so:

    paul_smith.py
    
    from scrapy.spider import BaseSpider
    from scrapy.selector import HtmlXPathSelector
    import random
     
    class PaulSmithSpider(BaseSpider):
      domain_name = "paulsmith.co.uk"
      start_urls = ["http://www.paulsmith.co.uk/paul-smith-jeans-253/category.html"]
     
      def parse(self, response):
        hxs = HtmlXPathSelector(response)
        sites = hxs.select('//div[@class="yui-u"]')
        random.shuffle(sites)
        for site in sites:
          title = site.select('a/strong[@class="thumbnail-text"]/text()').extract()
          hlink = site.select('a/@href').extract()
          price = site.select('a/strong[@class="sale"]/text()').extract()
          image = site.select('a/img/@src').extract()
     
          print title, hlink, image, price
     
    SPIDER = PaulSmithSpider()

    You can save this data to your datastore in whatever way you wish.

  7. The output of 3 random items scraped using the above code can be seen below.

Output

Shawl Collar Block Stripe Jumper
Sale: £ 74.00

Crew Neck Placement Stripe Jumper
Sale: £ 67.00

Tailored Fit, Organic Cotton Cravat Print Shirt
Sale: £ 74.00
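
The extraction in step 6 can also be tried offline, without Scrapy. The sketch below uses only Python 3's standard library (xml.etree.ElementTree, which understands a small subset of XPath) on a made-up, well-formed snippet. The sample markup, the URLs in it and the helper names are assumptions for illustration; the real page may be structured differently and would need a forgiving HTML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the category page.
SAMPLE_PAGE = """
<html><body>
  <div class="yui-u">
    <a href="/shawl-collar-jumper.html">
      <img src="/images/shawl-collar-jumper.jpg" />
      <strong class="thumbnail-text">Shawl Collar Block Stripe Jumper</strong>
      <strong class="sale">Sale: £ 74.00</strong>
    </a>
  </div>
  <div class="yui-u">
    <a href="/crew-neck-jumper.html">
      <img src="/images/crew-neck-jumper.jpg" />
      <strong class="thumbnail-text">Crew Neck Placement Stripe Jumper</strong>
      <strong class="sale">Sale: £ 67.00</strong>
    </a>
  </div>
</body></html>
"""


def text_of(element):
    """Return the stripped text of an element, or None if it is missing or empty."""
    if element is None or not element.text:
        return None
    return element.text.strip()


def extract_items(markup):
    """Collect title, link, image and price from every div with class yui-u."""
    root = ET.fromstring(markup.strip())
    items = []
    for block in root.findall(".//div[@class='yui-u']"):
        link = block.find("a")
        if link is None:
            continue
        image = link.find("img")
        items.append({
            "title": text_of(link.find("strong[@class='thumbnail-text']")),
            "link": link.get("href"),
            "image": image.get("src") if image is not None else None,
            "price": text_of(link.find("strong[@class='sale']")),
        })
    return items


for item in extract_items(SAMPLE_PAGE):
    print(item["title"], item["link"], item["image"], item["price"])
```

The paths mirror the ones used in the spider: one query finds the item blocks, then relative paths pull the fields out of each block.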

Jun
07

GPRS over Linux

This is how I configured my laptop to connect to the internet over GPRS. I own a Nokia E51 and I used the supplied USB cable to connect it to my laptop.

  1. On connecting the phone to the laptop via the USB cable, the phone prompts for a mode. Pick PC Suite.
  2. Run dmesg | tail to see if your phone has been detected.
    pravin@pravin-pc:~$ dmesg |tail
    [ 7295.313526] usb 1-1: new full speed USB device using uhci_hcd and address 3
    [ 7295.539647] usb 1-1: configuration #1 chosen from 1 choice
    [ 7295.552653] cdc_acm 1-1:1.10: ttyACM0: USB ACM device
  3. Run wvdialconf as root.
  4. Modify /etc/wvdial.conf. For the MTNL Mumbai GPRS service, my wvdial.conf looks like this:
    wvdial.conf
    [Dialer Defaults]
    Init1 = ATZ
    Init2 = ATQ0 V1 E1 S0=0 &C1 &D2 +FCLASS=0
    Modem Type = USB Modem
    ISDN = 0
    Modem = /dev/ttyACM0
    Baud = 460800
    Phone = *99#
    Username = mtnl
    Password = mtnl123
  5. To connect to GPRS via the phone, run wvdial as root:
    pravin@pravin-pc:~$ sudo wvdial
    --> WvDial: Internet dialer version 1.60
    --> Cannot get information for serial port.
    --> Initializing modem.
    --> Sending: ATZ
    ATZ
    OK
    --> Sending: ATQ0 V1 E1 S0=0 &C1 &D2 +FCLASS=0
    ATQ0 V1 E1 S0=0 &C1 &D2 +FCLASS=0
    OK
    --> Modem initialized.
    --> Sending: ATDT*99#
    --> Waiting for carrier.
    ATDT*99#
    CONNECT
    ~[7f]}#@!}!} } }2}#}$@#}!}$}%\}"}&} }*} } g}%~
    --> Carrier detected.  Waiting for prompt.
    ~[7f]}#@!}!} } }2}#}$@#}!}$}%\}"}&} }*} } g}%~
    --> PPP negotiation detected.
    --> Starting pppd at Sat Jun  7 11:03:33 2008
    --> Pid of pppd: 3000
    --> Using interface ppp0
    --> pppd: `?[06][08]0?[06][08]
    --> pppd: `?[06][08]0?[06][08]
    --> pppd: `?[06][08]0?[06][08]
    --> pppd: `?[06][08]0?[06][08]
    --> pppd: `?[06][08]0?[06][08]
    --> local  IP address 202.159.246.204
    --> pppd: `?[06][08]0?[06][08]
    --> remote IP address 10.6.6.6
    --> pppd: `?[06][08]0?[06][08]
    --> primary   DNS address 203.94.227.70
    --> pppd: `?[06][08]0?[06][08]
    --> secondary DNS address 203.94.243.70
    --> pppd: `?[06][08]0?[06][08]
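
The addresses reported at the end of that log are the useful part when checking a connection. As a small sketch (Python 3, standard library only), the lines can be picked out with a regular expression. The function name and the dictionary keys are made up for this example, and the pattern assumes wvdial prints the address lines exactly as in the session above.

```python
import re

# Matches lines such as "--> local  IP address 202.159.246.204"
# and "--> primary   DNS address 203.94.227.70".
ADDRESS_LINE = re.compile(
    r"-->\s+(local|remote|primary|secondary)\s+(IP|DNS) address\s+"
    r"(\d{1,3}(?:\.\d{1,3}){3})"
)


def parse_addresses(log_text):
    """Return a dict such as {'local IP': '...', 'primary DNS': '...'}."""
    addresses = {}
    for line in log_text.splitlines():
        match = ADDRESS_LINE.search(line)
        if match:
            role, kind, address = match.groups()
            addresses[role + " " + kind] = address
    return addresses


# A few lines copied from the session above.
SAMPLE_LOG = """\
--> Using interface ppp0
--> local  IP address 202.159.246.204
--> pppd: `?[06][08]0?[06][08]
--> remote IP address 10.6.6.6
--> primary   DNS address 203.94.227.70
--> secondary DNS address 203.94.243.70
"""

print(parse_addresses(SAMPLE_LOG))
```
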
Dec
19

Using Devanagari in LaTeX

Ever wanted to write LaTeX documents in Hindi, Marathi or Sanskrit? This article shows you how to do just that. By the end of this article, you’ll be able to go from here:

Latex Source Screenshot

to here:

Latex Devanagari Output Screenshot

Setting up your system

On Windows

  1. Download and install MiKTeX.
  2. Start the Package Manager.

    Starting MikTeX Package Manager

    Next, locate and install the devanagari package.

    MikTeX Package Manager

  3. Get the Devnag Binary and copy it to C:\Program Files\MiKTeX 2.7\miktex\bin\.

That’s it. We’re all set to write LaTeX documents in Devanagari!

On Linux

If you’re running a Debian-based system, install the following packages: texlive-latex-base, texlive-lang-indic and texlive-fonts-recommended.

Writing in devanagari

Now fire up your editor and add the following code:

 
% Using devanagari in latex
% http://pravin.insanitybegins.com/articles/
% Pravin Paratey [pravinp -at- gmail -dot- com]
 
\documentclass[12pt]{article}
\usepackage{devanagari}
\begin{document}
Lets write in devanagari
 
{\dn calo devanagarI me likhate hai}
\end{document}
  1. Save this file as sample.dn.
  2. Next, go to the command line and type devnag sample.dn to generate the corresponding sample.tex file.
  3. Run latex sample.tex to generate sample.dvi.
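
The two commands above can be scripted. This is only a sketch in Python 3: the function names are made up, and build() assumes that devnag and latex are both on your PATH. The example at the bottom just prints the commands, so it is safe to run anywhere.

```python
import subprocess
from pathlib import Path


def build_commands(source):
    """Return the two commands that turn a .dn file into a .dvi file."""
    source = Path(source)
    if source.suffix != ".dn":
        raise ValueError("expected a .dn file, got %s" % source)
    tex_file = source.with_suffix(".tex")
    return [
        ["devnag", str(source)],   # preprocess: sample.dn  -> sample.tex
        ["latex", str(tex_file)],  # typeset:    sample.tex -> sample.dvi
    ]


def build(source):
    """Run both steps in order, stopping at the first one that fails."""
    for command in build_commands(source):
        subprocess.run(command, check=True)


# Dry run: show what would be executed for sample.dn.
for command in build_commands("sample.dn"):
    print(" ".join(command))
```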

Figuring out what to type

It isn’t always obvious which string of Latin characters produces a given Devanagari word. This document should be somewhat useful.

Feb
19

Writing a shell replacement

Download the source files.

This document will teach you to make your own Windows shell replacement. If you don’t know what a shell replacement is, take a look at Shellfront. Before you begin, you’ll need:

  1. A working knowledge of C#.
  2. A copy of Visual Studio C# Express. [Download]
  3. A machine running Windows 2000 or later.

Creating the project

Fire up Visual C# and create a new Windows Application project. I have named it DeeShell. The first thing to do is to get rid of all the default code. Delete Form1.cs by right clicking it in Solution Explorer and selecting Delete.

You’ll be left with Program.cs. Open Program.cs and make the static void Main() method look like so:

static void Main()
{
	//Application.Run();
}

Save the project.

Setting up the Screen

Next, we’re going to set up the desktop area. Our steps will be:

  1. Hide the taskbar.
  2. Reset the desktop area.
  3. – Insert Breakpoint –
  4. Restore the desktop area.
  5. Restore the taskbar.

Add a class called WinAPI (in WinAPI.cs) to the project. This file will contain all the Win32 API functions that we need to P/Invoke. The best place to learn about the available functions is pinvoke.net.

Edit WinAPI.cs to look like so:

/* DeeShell - A shell replacement for Windows
 * Pravin Paratey (February 19, 2007)
 *
 * Article: http://pravin.insanitybegins.com/articles/deeshell
 *
 * Released under Creative Commons Attribution 2.5 Licence
 * http://creativecommons.org/licenses/by/2.5/
 */
using System;
using System.Text;
using System.Runtime.InteropServices;
 
namespace DeeShell
{
	public class WinAPI
	{
		public struct RECT
		{
			public int left;
			public int top;
			public int right;
			public int bottom;
		}
 
		/// <summary>For ShowWindow</summary>
		public enum WindowShowStyle : int
		{
			Hide = 0,
			ShowNormal = 1,
			ShowMinimized = 2,
			ShowMaximized = 3,
			Maximize = 3,
			ShowNormalNoActivate = 4,
			Show = 5,
			Minimize = 6,
			ShowMinNoActivate = 7,
			ShowNoActivate = 8,
			Restore = 9,
			ShowDefault = 10,
			ForceMinimized = 11
		}
 
		/// <summary>For SystemParametersInfo</summary>
		public enum SPI : int
		{
			SPI_SETWORKAREA = 0x002F,
			SPI_GETWORKAREA = 0x0030
		}
 
		[DllImport("user32.dll", SetLastError = true)]
		public static extern bool SystemParametersInfo(uint uiAction, uint uiParam,
			ref RECT pvParam, uint fWinIni);
 
		[DllImport("user32.dll", SetLastError = true)]
		public static extern IntPtr FindWindow(string lpClassName, string lpWindowName);
 
		[DllImport("user32.dll")]
		public static extern bool ShowWindow(IntPtr hWnd, int nCmdShow);
 
		[DllImport("user32.dll")]
		public static extern bool SetWindowPos(IntPtr hWnd, IntPtr hWndInsertAfter, int X,
			int Y, int cx, int cy, uint uFlags);
 
		[DllImport("user32.dll")]
		[return: MarshalAs(UnmanagedType.Bool)]
		public static extern bool SetForegroundWindow(IntPtr hWnd);
	}
}

Now add another file, Functions.cs, with the following code:

/* DeeShell - A shell replacement for Windows
 * Pravin Paratey (February 19, 2007)
 *
 * Article: http://pravin.insanitybegins.com/articles/deeshell
 *
 * Released under Creative Commons Attribution 2.5 Licence
 * http://creativecommons.org/licenses/by/2.5/
 */
using System;
using System.Windows.Forms;
 
namespace DeeShell
{
	class Functions
	{
		#region Private variables
		private static WinAPI.RECT m_rcOldDesktopRect;
		private static IntPtr m_hTaskBar;
		#endregion
 
		/// <summary>
		/// Resizes the Desktop area to our shell's requirements
		/// </summary>
		public static void MakeNewDesktopArea()
		{
			// Save current Working Area size
			m_rcOldDesktopRect.left = SystemInformation.WorkingArea.Left;
			m_rcOldDesktopRect.top = SystemInformation.WorkingArea.Top;
			m_rcOldDesktopRect.right = SystemInformation.WorkingArea.Right;
			m_rcOldDesktopRect.bottom = SystemInformation.WorkingArea.Bottom;
 
			// Make a new Workspace
			WinAPI.RECT rc;
			rc.left = SystemInformation.VirtualScreen.Left;
			// We reserve the 24 pixels on top for our taskbar
			rc.top = SystemInformation.VirtualScreen.Top + 24;
			rc.right = SystemInformation.VirtualScreen.Right;
			rc.bottom = SystemInformation.VirtualScreen.Bottom;
			WinAPI.SystemParametersInfo((uint)WinAPI.SPI.SPI_SETWORKAREA, 0, ref rc, 0);
		}
 
		/// <summary>
		/// Restores the Desktop area
		/// </summary>
		public static void RestoreDesktopArea()
		{
			WinAPI.SystemParametersInfo((uint)WinAPI.SPI.SPI_SETWORKAREA, 0, ref m_rcOldDesktopRect, 0);
		}
 
		/// <summary>
		/// Hides the Windows Taskbar
		/// </summary>
		public static void HideTaskBar()
		{
			// Get the Handle to the Windows Taskbar
			m_hTaskBar = WinAPI.FindWindow("Shell_TrayWnd", null);
			// Hide the Taskbar
			if (m_hTaskBar != IntPtr.Zero)
			{
				WinAPI.ShowWindow(m_hTaskBar, (int)WinAPI.WindowShowStyle.Hide);
			}
		}
 
		/// <summary>
		/// Show the Windows Taskbar
		/// </summary>
		public static void ShowTaskBar()
		{
			if (m_hTaskBar != IntPtr.Zero)
			{
				WinAPI.ShowWindow(m_hTaskBar, (int)WinAPI.WindowShowStyle.Show);
			}
		}
	}
}

Edit Program.cs to look like this:

static void Main()
{
	// Make new Working Area
	Functions.HideTaskBar();
	Functions.MakeNewDesktopArea();
 
	// Restore Working Area Size
	Functions.RestoreDesktopArea();
	Functions.ShowTaskBar();
}

Put a breakpoint at RestoreDesktopArea() and press F5 to compile and run. When you hit the breakpoint, observe that:

  1. The Windows Taskbar has disappeared.
  2. If you maximize any open window, it fills the area below and leaves an empty strip of 24 pixels at the top.
  3. If you refresh your desktop icons, they will obey these rules too. Doesn’t it make you feel powerful – making the desktop icons do that? Those who answered Yes – Boy! You guys sure need some serious counselling.

Press F5 again to continue and end the program. I know what you’re thinking, “This is not a shell replacement! You’re just hiding the taskbar. Explorer still runs in the background.”

*raises eyebrow* Are you being sassy? Let me tell you a story about what happened to the little boys and girls who were sassy. Santa didn’t leave them any presents that year. So there!

Coming back to the question, take a look at HideTaskBar(). DeeShell can run independently as well as on top of explorer.

Adding a Taskbar

Adding a taskbar will involve,

  1. Creating a new form.
  2. Setting its properties.
  3. Displaying it on startup.

Let's add a new form. Right-click the project in the Solution Explorer, choose Add -> Windows Form and name it Taskbar. Next, set the following properties:

  • Size: 300, 24
  • Start Position: Manual
  • Control Box: False
  • ShowInTaskbar: False

That completes your basic taskbar. Let's spruce it up a little. First, we’ll add a background image. Double-click Resources.resx in the Solution Explorer. Then click on Add Image Resource and add an image. Set this as the BackgroundImage for your Taskbar window.

Adding Image Resource

Next, we’ll add a TableLayoutPanel. This control will help us organize our taskbar buttons quite nicely. Set the number of rows to 1 and leave the number of columns as two. Set the size of the first column to 60px. Then, set the following properties:

  • Backcolor: Transparent
  • (Name): tableLayoutPanel
  • ColumnCount: 2
  • Dock: Fill
  • Location: 0, 0
  • Margin: 0, 0, 0, 0
  • RowCount: 1

Now, we’ll add a Button called “Exit” which will let us exit DeeShell gracefully. Add a button to column 1 of the TableLayoutPanel. Set its Dock property to Fill and set Margin to 0. Add the following code to its Click event (I’ve named the function OnExitClick):

private void OnExitClick(object sender, EventArgs e)
{
	Application.Exit();
}

Change Program.cs to:

static void Main()
{
	Application.EnableVisualStyles();
 
	// Make new Working Area
	Functions.HideTaskBar();
	Functions.MakeNewDesktopArea();
 
	Application.Run(new Taskbar());
 
	// Restore Working Area Size
	Functions.RestoreDesktopArea();
	Functions.ShowTaskBar();
}

If you run the application now, you’ll see that the height property of Taskbar is not respected. To fix this, add a SetWindowPos call to the constructor in Taskbar.cs:

public Taskbar()
{
	InitializeComponent();
	// Place the bar at the top-left corner, full screen width, 24 pixels high.
	// 0x0040 is SWP_SHOWWINDOW.
	WinAPI.SetWindowPos(this.Handle, IntPtr.Zero, 0, 0, SystemInformation.VirtualScreen.Width, 24, 0x0040);
}

Listing Running Tasks

Add the following to Functions.cs:

/// <summary>
/// Gets a list of Active Tasks
/// </summary>
public static ArrayList GetActiveTasks()
{
	ArrayList ar = new ArrayList();
	IntPtr child = IntPtr.Zero;
 
	Process[] process = Process.GetProcesses();
	foreach (Process p in process)
	{
		WindowData w;
		if (p.MainWindowHandle != IntPtr.Zero && p.MainWindowTitle.Length > 0)
		{
			w.hwnd = p.MainWindowHandle;
			w.title = p.MainWindowTitle;
			ar.Add(w);
		}
	}
	return ar;
}

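You will also have to add these declarations at the beginning of Functions.cs:

using System.Diagnostics;
using System.Collections;

GetActiveTasks() also relies on a WindowData type with hwnd and title fields. Its definition is not listed in this article, so take it from the downloadable source files or declare an equivalent struct yourself.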

Now add a ComboBox named cboTaskList to our Taskbar form to show the list of running processes. I’ve populated the ComboBox at the start with:

public Taskbar()
{
	InitializeComponent();
	WinAPI.SetWindowPos(this.Handle, IntPtr.Zero, 0, 0, SystemInformation.VirtualScreen.Width, 24, 0x0040);
	ArrayList ar = Functions.GetActiveTasks();
	for (int i = 0; i < ar.Count; i++)
	{
		WindowData w = (WindowData)ar[i];
		cboTaskList.Items.Add(w.title);
	}
}

We’re going to bring the window selected in the ComboBox to the Foreground. To the SelectedIndexChanged event, attach the code:

private void cboTaskList_SelectedIndexChanged(object sender, EventArgs e)
{
	string windowName = cboTaskList.Text;
	IntPtr handle = WinAPI.FindWindow(null, windowName);
	WinAPI.SetForegroundWindow(handle);
}

There you go! Your very own partially skinned shell replacement. It doesn’t do much. It doesn’t mow your lawn or fix your faucet. But it makes one hell of a story – for those times when your kids just refuse to sleep.

I can’t believe the tutorial is done! 5 hours. Whoa!

F.A.Q.

1. How do I prevent the user from closing DeeShell through Alt+F4?

Add this code to the Taskbar’s FormClosing event:

private void Taskbar_FormClosing(object sender, FormClosingEventArgs e)
{
	e.Cancel = true;
}

You’ll also need to add a check to see if the exit was valid (say clicking the exit button) and allow that one to pass.
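
The logic of that check does not depend on the language. Here is a minimal model of it in Python, with made-up names: CloseEvent stands in for FormClosingEventArgs and TaskbarModel for the Taskbar form. In the real form, the flag would be a field of Taskbar that OnExitClick sets before calling Application.Exit().

```python
class CloseEvent:
    """Stand-in for FormClosingEventArgs: only the cancel flag is modelled."""

    def __init__(self):
        self.cancel = False


class TaskbarModel:
    """Models the close guard: closing is refused unless Exit was clicked."""

    def __init__(self):
        self.exit_requested = False

    def on_exit_click(self):
        # The Exit button is the only sanctioned way out.
        self.exit_requested = True

    def on_form_closing(self, event):
        # Cancel every close attempt (Alt+F4, for example) unless Exit was clicked.
        event.cancel = not self.exit_requested
        return event


taskbar = TaskbarModel()
print(taskbar.on_form_closing(CloseEvent()).cancel)  # Alt+F4 before Exit
taskbar.on_exit_click()
print(taskbar.on_form_closing(CloseEvent()).cancel)  # close after Exit
```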

2. How do I ensure that only one instance of DeeShell runs?

Create a named mutex and lock it. Let the first line in Main() check for the presence of this mutex. If it is present, DeeShell is already running. Another way would be to use the FindWindow() API.

Oct
23

Running bbPress on SourceForge

Changelog

[June 14, 2007]
Updated for bbPress 0.8.1
[October 23, 2006]
First version of this document

Background

SourceForge does not allow PHP mailers[1]. Software like bbPress, which uses the mailer to send newly registered users their password, therefore fails. This article describes how you can solve this problem.

Preparing bbPress

The following steps are for bbPress v0.8.1. Adapt these for other (future) versions.

  1. Open bb-includes/registration-functions.php in your favourite editor.
  2. Add include_once('sf-functions.php'); immediately after <?php.
  3. Next, use Find and Replace to replace all occurrences of mail( with sfmail( in this file.

Now create another file called sf-functions.php in bb-includes directory with the following code:

<?php
/* Additional functions for enabling mail on sourceforge
 * by Pravin Paratey
 * http://pravin.insanitybegins.com/articles/running-bbpress-on-sourceforge/
 *
 *
 * Note: I use the same database as bbpress. I create an additional table
 * called sfMailTable
 *
 */
 
 
// This function pushes data into the table
function sfmail ($email, $subject, $message) {
	global $bbdb;
	$table = 'sfMailTable';
 
	// Create table if it does not exist
	if($bbdb->get_var("SHOW TABLES LIKE '$table'") != $table) {
		$sql = "CREATE TABLE $table (
			id	bigint	not null auto_increment,
			email	text	not null,
			subject	text	not null,
			message	text	not null,
			unique key id(id)
			);";
		$results = $bbdb->query($sql);
	}
 
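	// Push email data into the table.
	// NOTE: the values are interpolated into the SQL as-is; escape them first
	// if they can contain quotes.
	$results = $bbdb->query("INSERT INTO `$table` (email, subject, message) " .
		"VALUES ('$email', '$subject', '$message');");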
}
?>

This function queues outgoing email messages in the database instead of sending them.

Email Script

Next, we will write a perl script which will pull out values from the database and email them. This script will be called by the cron daemon every hour.

For the sake of this demonstration, the project name is assumed to be pravin. Replace instances of this with your own project name and path.

Create a file called sendmail.pl at /home/groups/p/pr/pravin/bin/ (you’ll have to create the bin directory) with the following code:

#!/usr/bin/perl
# This script is responsible for pulling values out of the database
# and emailing.
# by Pravin Paratey
# http://pravin.insanitybegins.com/articles/running-bbpress-on-sourceforge/
 
use DBI;
 
#---------------------------------------------
#Edit this and replace with your sf db details
my $dsn = 'DBI:mysql:p133996_pravin:mysql4-p'; # Change to database-name:hostname
my $db_username = 'p133996admin'; # Change this to your db username
my $db_password = 'password'; # Change this to your password
 
#--------------------------------------------------------------
#Do not Edit below this line unless you know what you are doing
 
my $sfMailTable = 'sfMailTable';
 
my $db = DBI->connect($dsn, $db_username, $db_password) or
	die "Cannot connect to database";
 
my $query = $db->prepare("select * from $sfMailTable");
 
$query->execute();
 
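while(my ($id, $email, $subject, $message) = $query->fetchrow_array()) {
	# List-form pipe open: the subject and address never pass through a shell,
	# so quotes in them cannot break the command.
	open(my $mail, '|-', '/bin/mail', '-s', $subject, $email) or
		die "Cannot open mail: $!";
	print $mail $message;
	close($mail);
 
	# Delete the row once the mail has been handed off
	my $delete = $db->prepare("delete from `$sfMailTable` where id = ?");
	$delete->execute($id);
}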
 
$db->disconnect();

Mark this file as executable:

$ chmod +x sendmail.pl

Test that everything works by:

  1. Registering a user in bbPress.
  2. Running sendmail.pl.
  3. Checking that you receive the registration mail.

Setting up cron

Now that the script works, we need to get cron to call it every hour. Type crontab -e and add the lines:

# Send email every hour
6 * * * * /home/groups/p/pr/pravin/bin/sendmail.pl

That’s it. You’re done!
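
The whole arrangement is a small mail queue: the web side only inserts rows, and a periodic job sends and deletes them. The sketch below shows that pattern in Python 3 under stated assumptions: it swaps MySQL for an in-memory SQLite database so that it runs anywhere, the function names are made up, and send is any callable you supply (here it just records the mail instead of sending it).

```python
import sqlite3


def create_queue(connection):
    """Create the queue table if it does not exist yet."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS sfMailTable ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "email TEXT NOT NULL, subject TEXT NOT NULL, message TEXT NOT NULL)"
    )


def queue_mail(connection, email, subject, message):
    """Web side: store the mail instead of sending it."""
    # Placeholders keep quotes in the subject or body from breaking the SQL.
    connection.execute(
        "INSERT INTO sfMailTable (email, subject, message) VALUES (?, ?, ?)",
        (email, subject, message),
    )
    connection.commit()


def drain_queue(connection, send):
    """Cron side: send every queued mail, then delete it. Returns the count."""
    rows = connection.execute(
        "SELECT id, email, subject, message FROM sfMailTable ORDER BY id"
    ).fetchall()
    for row_id, email, subject, message in rows:
        send(email, subject, message)
        connection.execute("DELETE FROM sfMailTable WHERE id = ?", (row_id,))
        connection.commit()
    return len(rows)


connection = sqlite3.connect(":memory:")
create_queue(connection)
queue_mail(connection, "user@example.com", "Your password", "It's: secret")

sent = []
drain_queue(connection, lambda *mail: sent.append(mail))
print(sent)
```

Deleting each row right after it is sent means a crash halfway through re-sends at most the mail that was in flight, not the whole queue.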

Troubleshooting

The Perl script gives a “Table not found” error.
The table is created (if it doesn’t exist) by sf-functions.php, which means it is only created when a user registers. Register an arbitrary user to make this error go away.

I want the script to execute every x minutes.
man 5 crontab is your best guide. With Vixie-style cron (the usual cron on Linux), a step value in the minute field does this: */15 * * * * runs the job every 15 minutes.
Sep
13

Dee’s guide to rediffblogs

"I don’t see much sense in that," said Rabbit.
"No," said Pooh humbly, "there isn’t.
But there was going to be when I began it.
It’s just that something happened to it along the way."
Note: This document is no longer being maintained. It is here in the hope that it might be useful to someone. For all other purposes, this document stands deprecated.

Revision 1:

Last Updated: Tuesday, March 30, 2004 10:39 AM

Change Log

[March 30, 2004]

  1. Added a section called Rediffblog Issues.
  2. Added a tip on how to temporarily disable your blog.
  3. Updated section on tagboards.
  4. Updated section on image hosting websites.
  5. Added a link to my sample templates.

1. Introduction

Welcome to my guide to rediffblogs. If you’ve come here, you are either a co-blogger or a wannabe. This page contains everything you need to make your blog the way you want it to be.

1.1. Audience

I understand that most bloggers aren’t computer people and so this guide starts off with the very basics and then moves on to the more advanced stuff.

1.2. Feedback

A feedback mechanism forms the most important part of any work. Your feedback will help improve this guide and you might find yourself mentioned in the credits part of this document. Comments/suggestions/flames are always welcome at pravin[at]gmail[dot]com. Alternately, you can use the comments section at the bottom of this page.

1.3. Official Homepage

The official version of this document can be found at http://pravin.insanitybegins.com/tutorials/rediffblog-guide/. Check here for the latest changes and revisions to this document.

1.4. Mirroring, Copyright and Distribution

This document titled "Dee’s Guide to Rediffblogs" is copyright ©, 2003-2006.

You are free to mirror and distribute this document in any form so long as its contents are not altered and it is presented in its entirety with this copyright notice intact.

2. Rediffblog basics

If you are unfamiliar with the concept of blogging, I suggest you check my blog.

Let's take a look at some of the page elements.

  1. Posts - Clicking on this shows you your posts.
  2. Settings - This takes you to a page where you can modify stuff like your Page Title, Description, Favorite sites and your Time Zone.
  3. Template - This is where you can modify your template.
  4. Category - This is where you can specify the category of your post. A good trick is to use this only for special posts like About Me or My Favorite Things and leave it blank for all your other posts.
    You can do this by writing a post with stuff about yourself and then specifying its category as "About Me". This will show up on the links under category. An easy way to add sections to your site.

TIP: Every time you change your settings or Template, be sure to click on Publish Blog to execute those changes.

3. An HTML Primer

3.1 Changing text attributes

Desired Effect  | Code                                    | Result
Bold Text       | <b>James Bold</b>                       | James Bold
Italicize Text  | <i>Sizzlers!</i>                        | Sizzlers!
Underline Text  | <u>Troll Bridge</u>                     | Troll Bridge
Combine Effects | <b><i>Bold and Italics</i></b>          | Bold and Italics
Adding Color    | <font color="#FF0000">Red Robin</font>  | Red Robin

Note: #FF0000 specifies the color as a hexadecimal triplet of (red, green, blue) values, two digits each. You can also use color names like blue, cyan, lime, fuchsia, yellow, red, maroon, white, silver, gray and black, e.g. <font color="blue">Rivers and valleys</font> = Rivers and valleys
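
For the curious, here is how such a triplet breaks down into numbers. This is a small Python 3 sketch; the function name hex_to_rgb is made up for the example.

```python
def hex_to_rgb(color):
    """Convert '#RRGGBB' to a (red, green, blue) tuple of integers from 0 to 255."""
    digits = color.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected six hexadecimal digits, got %r" % color)
    # Each pair of hex digits encodes one channel.
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


print(hex_to_rgb("#FF0000"))  # (255, 0, 0): full red, no green, no blue
print(hex_to_rgb("#333333"))  # (51, 51, 51): a dark gray
```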

3.2 Centering, right aligning and justifying text

Desired Effect | Code                                           | Result
Left Align     | <p align="left">Mary</p>                       | Mary
Center Align   | <p align="center">had a little</p>             | had a little
Right Align    | <p align="right">lamb</p>                      | lamb
Justify Text   | <p align="justify">Mary had a little lamb</p>  | Mary had a little lamb

3.3 Linking friends

3.3.1 Basics of linking – the anchor <a> tag

Linking to a friend’s blog or another website is done by using the anchor tag. The anchor tag looks like:

<a href="http://dee.rediffblogs.com">Visit Dee</a>

Replace http://dee.rediffblogs.com with the address of the website and Visit Dee with the text you’d like to display.

3.3.2 Using an automated tool like Blogrolling

4. Adding images

Adding images is done by the <IMG> tag which looks like:

<IMG SRC="marsh.jpg">

The above code will cause the browser to display the image located at marsh.jpg.

4.1 Hosting your images

You cannot host your images at rediff. What this means is that if you’ve got a picture on your hard disk that you’d like to put up on your blog, you’ll have to first host that image on another website.

Here are some sites which let you do just that:

  1. http://www.ripway.com
  2. http://www.photobucket.com
  3. http://www.villagephotos.com

After signing up, look for a link that says upload and follow the instructions there. Once you’ve uploaded your picture, you get a link. Use this link instead of marsh.jpg in the image tag above.

4.2 Here’s a trick – for users with a knowledge of HTML only

Most free web space providers like geocities and tripod do not permit direct links to images. So what you can do is use the <iframe> tag and embed a page.

In other words, you will enter something like

<iframe src="http://in.geocities.com/oh_its_dee/page.html" width="100" height="100" scrolling="no"></iframe>

where you want the image to appear. And page.html will contain:

<html>
<body style="margin: 0">
<img src="image.jpg">
</body>
</html>

Notice the style="margin: 0" attribute on the body tag, which removes the default margin around the image. And voilà! Does that box look ugly to you? You can make it look prettier by setting the iframe's frameborder attribute to 0.

5. Adding a tag board

5.1 What is a tag board?

A tag board is a great way of adding interactivity to your blog. A tag board makes it easier for people to leave comments on your blog. A tag board looks something like this:

5.2 That’s cool! How do I get one?

Getting your own taggy involves two steps:

  1. Signing up with a tag board provider like
    1. Tagboard
    2. Shoutbox
    3. Doodleboard
    4. Who-Is-Online Taggy
    Once this is done, you receive a piece of html code.
  2. Next, you have to insert that code into your template.

6. A look at templates

Rediff provides you with six templates to choose from. A template defines how your blog will look. You can customize the templates that come with rediffblogs. All it involves is customizing the stylesheet located at the top of the page. If you are lost, just follow the step by step tutorial.

6.1 Customizing rediffblogs templates

I’ll show you how you can customize the first template.

Customizing others is very similar

  1. Log into rediffblogs and click on Template.
  2. In the text area that appears, scroll down till you see <style>. You can search for this using the find command (Ctrl+F).
  3. Look for the line .righttext {FONT-SIZE: 8pt; COLOR: #000000; FONT-FAMILY: arial; TEXT-DECORATION: none}
  4. Changing the values of FONT-SIZE, COLOR and FONT-FAMILY will change the look of the text in the yellow area. Let's change the FONT-FAMILY to verdana:
    .righttext {FONT-SIZE: 8pt; COLOR: #000000; FONT-FAMILY:verdana; TEXT-DECORATION: none}
  5. If you save the template and view your blog, you’ll see that the text in the yellow area appears in the verdana font.
  6. A well-designed template has its customizable elements well separated from its body. In this case, however, some of the elements are located deep within the body. If you move down to the opening <body> tag, you’ll find that it contains, among others, the following attributes:
    • bgcolor - This specifies the background color of the page. To change it to, say yellow, change this to bgcolor="yellow".
    • text - This specifies the color of the text in the page. To change this to a shade of black, change this to text="#333333".
    • link - This specifies the color of the links. To change the color of all links to red, change this to link="red".
    • alink - This specifies the color of the active link (i.e. the color the link changes to, when you click it).
    • vlink - This specifies the color of the links you’ve already visited.

7. Creating your own template

This is a bit advanced. I am going to assume the reader to be familiar with HTML. Creating your own template involves:

  1. Creating the page layout.
  2. Adding rediff-specific tags so that the rediff-cgi can replace them with content.

7.2 Rediff Specific Tags

Let's take a look at all the tags you can use:

<$BlogTitle$> All occurrences of this tag are replaced by the title of your blog. A good place for this tag is between the <title></title> tags in the header.
<$BlogName$> This tag is replaced by the name of the blog.
<$BlogDescription$> This tag is replaced by the description of your blog. You can change the description of your blog in the Settings area.
<rediffBlog> </rediffBlog> This tag denotes the beginning of your post.
NOTE: Because of a peculiarity of the rediff-cgi, this tag must be present in all templates containing the <body> tag.
<$BlogItemAuthor$> This tag is replaced by the Firstname-Lastname values you entered when you signed up for rediffblogs.
<$BlogItemBody$> This is replaced by your actual post
<$BlogItemNumber$>

Every post that you make is tagged with an Item Number. This number is generated by taking into account the date and the time of your posting and is unique for every post you make.

This number can be used with the <$BlogId$> tag to create a link to this post. Another good application for this is in providing a unique name for commenting systems like Haloscan.

<$BlogItemArchiveFileName$> This tag is replaced by the filename where this post is stored. So you can refer to this group of posts using this value. To refer to a single post, use <$BlogItemArchiveFileName$>#<$BlogItemNumber$>
<$BlogItemDateTime$> This is replaced by the short Time/Date
<$BlogId$> This tag identifies your blog. Together with <$BlogItemNumber$>, this identifies a unique post
<BlogDateHeader> </BlogDateHeader> This tag encloses <$BlogDateHeaderDate$>
<$BlogDateHeaderDate$> This tag is replaced by the long date. You can set the date format in the Settings area
<BlogArchiveFormat> </BlogArchiveFormat> This encloses the Archives Structure
<$BlogArchiveLink$> This tag is replaced by the URL of the Archive
<$BlogArchiveText$> This tag is replaced by a text describing the Archive
<$BlogArchiveFileName$> I don’t see the purpose of this. This loads a JavaScript source for your blog
<$BlogCategoryFileName$> Ditto for this too
<BlogCategoryFormat> </BlogCategoryFormat> This tag encloses the categories.
<$BlogCategoryLink$> This tag is replaced by the URL of the archives which fall under this category
<$BlogCategoryText$> This tag is replaced by the category name.
<BlogFavoriteLink> </BlogFavoriteLink> This tag encloses the favorite links.
<$BlogUrlLink$> This tag is replaced by the URL of the favorite link.
<$BlogUrlText$> This tag is replaced by the text describing that favorite link.
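
To get a feel for what "replaced by" means for the <$...$> tags, here is a toy renderer in Python. It is not how rediff-cgi is actually implemented (I have not seen its source); the render function and the sample values are invented for the example, and container tags such as <rediffBlog>, which repeat once per post, are not handled.

```python
import re

# Matches placeholders of the form <$Name$>.
PLACEHOLDER = re.compile(r"<\$(\w+)\$>")

def render(template: str, values: dict) -> str:
    """Replace every <$Name$> with values[Name]; leave unknown tags untouched."""
    def substitute(match):
        name = match.group(1)
        return values.get(name, match.group(0))
    return PLACEHOLDER.sub(substitute, template)

# Hypothetical values; a real blog would get these from the server.
values = {
    "BlogTitle": "My Blog",
    "BlogItemArchiveFileName": "2004_07.html",
    "BlogItemNumber": "108887",
}
template = (
    "<title><$BlogTitle$></title>\n"
    '<a href="<$BlogItemArchiveFileName$>#<$BlogItemNumber$>">permalink</a>\n'
    "<$BlogItemBody$>"
)
print(render(template, values))
```

The second template line shows the <$BlogItemArchiveFileName$>#<$BlogItemNumber$> combination used to link to a single post.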

7.3 Sample Templates

Guess what I’ve got for you fine folks? A bunch of sample templates that you can use for your blog! Visit http://themes.rediffblogs.com.

8. Fancy Tricks

8.1 Getting rid of Rediff Ads

Before continuing, I’d like you to ponder for a while. Rediff provides blogging as a free service and the ads you see on your page contribute to the hosting/bandwidth/storage costs of your blog. If you’d still like to remove those ads, read ahead.

Getting rid of ads means removing the following three blocks from your template:

  1. <SCRIPT LANGUAGE="JavaScript" TYPE="text/javascript"><!--
    var a='http://ads.rediff.com/RealMedia/ads/';
    var RN = new String (Math.random());
    var RNS = RN.substring (2, 11);
    function da(width, height, posn) {
    var p='www.rediff.com/blogs-general.htm/1' + RNS + '@' + posn;
    if(v < 11) {
    document.write('<A HREF=' + a + 'click_nx.ads/' + p + '>
    
    	<IMG SRC=' + a + 'adstream_nx.ads/' + p + ' BORDER=0
    	WIDTH=' + width + ' HEIGHT= ' + height + ' VSPACE=0 HSPACE=0><\/A>');
    } else {
    document.write('<SCRIPT LANGUAGE=JavaScript1.1 SRC=' + a +
    	'adstream_jx.ads/' + p + '><\/SCRIPT>');
    }
    }
    //-->
    </SCRIPT>
  2. <SCRIPT LANGUAGE="JavaScript" TYPE="text/javascript">
    
    <!--
    if (navigator.appName.indexOf("Microsoft") != -1)
    {
    document.write("<IFRAME SRC=\"http://blogs.rediff.com/blogs-all-Top.htm\"
    	NAME=Top WIDTH=\"468\" HEIGHT=\"60\" MARGINWIDTH=\"0\"
    	MARGINHEIGHT=\"0\" FRAMEBORDER=\"0\" SCROLLING=\"no\"><\/IFRAME><BR>");
    }
    else
    {
    da(180, 150, 'Top');
    document.write("<BR>");
    }
    //-->
    
    </SCRIPT>
  3. <noscript>
    <BR>
    <a href="http://ads.rediff.com/RealMedia/ads/click_nx.ads/
    	www.rediff.com/blogs-general.htm@Top">
    	<img src="http://ads.rediff.com/RealMedia/ads/
    	adstream_nx.ads/www.rediff.com/blogs-general.htm@Top"
    	border="0"></a>
    
    </noscript>

Now you have an ad-free page!

However, you can still see the rectangle with the rediff.com logo. To remove it, delete the <table></table> element (including the <table> and </table> tags themselves) that immediately encloses the word redifflogo.gif. In template 1, this is the first <table> </table> pair in the file.
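
If you keep a local copy of your template, the three ad blocks can also be stripped with a short Python script. This is a rough sketch under a few assumptions: the marker strings are simply substrings taken from the blocks quoted above, the function name strip_ad_blocks is made up, and the logo table still has to be removed by hand. Back up the template before trying it.

```python
import re

# Substrings that appear in the three ad snippets quoted above.
AD_MARKERS = ("ads.rediff.com", "blogs-all-Top.htm")

# A <script> or <noscript> element, matched case-insensitively across lines.
# The escaped closing tags inside document.write (<\/SCRIPT>) do not match
# "</script>", so each match ends at the real closing tag.
BLOCK = re.compile(
    r"<(script|noscript)\b[^>]*>.*?</\1\s*>",
    re.IGNORECASE | re.DOTALL,
)

def strip_ad_blocks(template: str) -> str:
    """Remove script/noscript elements that mention one of the ad markers."""
    def drop_if_ad(match):
        block = match.group(0)
        return "" if any(marker in block for marker in AD_MARKERS) else block
    return BLOCK.sub(drop_if_ad, template)

# Small made-up template to show the effect.
sample = (
    "<head><SCRIPT LANGUAGE=\"JavaScript\">"
    "var a='http://ads.rediff.com/RealMedia/ads/';</SCRIPT>"
    "<script>var keep = 1;</script></head>"
    "<body><noscript><a href=\"http://ads.rediff.com/x\">ad</a></noscript>"
    "Hello</body>"
)
print(strip_ad_blocks(sample))
```

Scripts that do not mention the markers are left alone, so your own JavaScript survives.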

8.2 Temporarily disabling your blog to visitors

If you would like to temporarily disable your blog and prevent visitors from viewing your posts,

  1. Look for the <rediffBlog> </rediffBlog> tags
  2. Add <!-- after the <rediffBlog> tag and --> before the </rediffBlog> tag. Your template code should look something like this:
...
<rediffBlog>
<!--
    ...
-->
</rediffBlog>
...
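
The same edit can be scripted. The sketch below is an illustration only: disable_posts is a made-up name, and it assumes the template contains exactly one <rediffBlog> </rediffBlog> pair. It refuses to run if the enclosed markup already contains an HTML comment, because HTML comments do not nest and the first --> would end the new comment early.

```python
import re

REDIFF_BLOG = re.compile(r"(<rediffBlog>)(.*?)(</rediffBlog>)", re.DOTALL)

def disable_posts(template: str) -> str:
    """Wrap everything between <rediffBlog> and </rediffBlog> in an HTML comment."""
    match = REDIFF_BLOG.search(template)
    if match is None:
        raise ValueError("no <rediffBlog> </rediffBlog> pair found")
    inner = match.group(2)
    if "<!--" in inner or "-->" in inner:
        # Comments cannot be nested; bail out rather than break the page.
        raise ValueError("post markup already contains an HTML comment")
    wrapped = match.group(1) + "\n<!--" + inner + "-->\n" + match.group(3)
    return template[:match.start()] + wrapped + template[match.end():]

print(disable_posts("<body><rediffBlog><$BlogItemBody$></rediffBlog></body>"))
```

To re-enable the blog, remove the two comment markers again.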

9. Rediff Issues

9.1 Posting to your blog

When you publish a new post, rediff-cgi removes any spaces between two tags that directly follow each other.

For example, suppose you post the following text:
<span class="bingo">Hello</span> <span class="doodle">World</span>

Note that there is a space between </span> and <span>. Rediff-cgi will remove that space and store the post as
<span class="bingo">Hello</span><span class="doodle">World</span>

This will change your desired output from Hello World to HelloWorld (without spaces).
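
The effect is easy to reproduce in Python. The regular expression below only mimics the behaviour described above; how rediff-cgi really does it is unknown to me. The last part shows one possible workaround, moving the space inside the first span so that the tags are directly adjacent anyway. It survives this simulation, but I have not verified it against Rediff itself.

```python
import re

def strip_space_between_tags(post: str) -> str:
    """Mimic the observed behaviour: whitespace between '>' and '<' disappears."""
    return re.sub(r">\s+<", "><", post)

original = '<span class="bingo">Hello</span> <span class="doodle">World</span>'
stored = strip_space_between_tags(original)
print(stored)

# Possible workaround (an assumption, untested on Rediff): keep the space
# inside the first span instead of between the two tags.
workaround = '<span class="bingo">Hello </span><span class="doodle">World</span>'
print(strip_space_between_tags(workaround) == workaround)
```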

Jul
03

Compiling XFCE from source

xfce screenshot

Xfce is a really cool desktop environment for Linux. It’s really light on resources compared to GNOME and KDE, and just as pretty.

Installation

I first tried doing an rpm install. I guess a tool like apt would’ve made it a lot easier. After downloading 23 rpms, I realized that I was still missing a few files, which turned out to be headers with the .hdr extension. I found these files in the header directory, but had *no* idea what to do with them. Meanwhile, I had started a parallel download of the source files, so instead of going through help forums, I opted to compile from source. This is the order in which I compiled the packages:
  1. libxfce4util
  2. libxfcegui4
  3. libxfce4mcs
  4. xfce-mcs-manager
  5. dbh
  6. xfce4-panel
  7. gtk-xfce-engine
  8. xfce4-systray
  9. xfdesktop
  10. xfce-mcs-plugins
  11. xfce4-themes
  12. xfce4-iconbox
  13. xfce4-mixer
  14. xfce4-toys
  15. xfce4-trigger-launcher
  16. xffm
  17. xffm-icons
  18. xfprint
  19. xfwm4
  20. xfwm4-themes
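
Building twenty packages by hand gets tedious, so the order above can be driven from a small Python script. Treat this as a sketch with several assumptions that are not from my original notes: each tarball is unpacked into a directory named exactly after the package (real tarballs carry a version suffix), every package uses the usual ./configure, make, make install sequence, and the install prefix is /usr/local. The function names are made up. By default it only lists the commands; pass dry_run=False to run them (make install will typically need root).

```python
import subprocess
from pathlib import Path

# Compile order from the list above.
BUILD_ORDER = [
    "libxfce4util", "libxfcegui4", "libxfce4mcs", "xfce-mcs-manager", "dbh",
    "xfce4-panel", "gtk-xfce-engine", "xfce4-systray", "xfdesktop",
    "xfce-mcs-plugins", "xfce4-themes", "xfce4-iconbox", "xfce4-mixer",
    "xfce4-toys", "xfce4-trigger-launcher", "xffm", "xffm-icons", "xfprint",
    "xfwm4", "xfwm4-themes",
]

def build_commands(prefix="/usr/local"):
    """The three commands assumed to build and install one package."""
    return [["./configure", f"--prefix={prefix}"], ["make"], ["make", "install"]]

def build_all(source_root, dry_run=True):
    """Walk the packages in order; return the (package, command) pairs visited."""
    log = []
    for package in BUILD_ORDER:
        package_dir = Path(source_root) / package  # assumed directory layout
        for command in build_commands():
            log.append((package, command))
            if not dry_run:
                # check=True stops at the first failing step.
                subprocess.run(command, cwd=package_dir, check=True)
    return log

if __name__ == "__main__":
    for package, command in build_all("xfce-src"):
        print(package, " ".join(command))
```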

Note

ldconfig: To add the path /usr/local/lib to the linker’s search path, append it to /etc/ld.so.conf and then run ldconfig. (July 03, 2004)

Jun
29

Ooh what a pwiddy resume

I finally got around to updating my resume. I also converted it to LaTeX. For all you LaTeX people out there, here are a few good links:

  1. MiKTeX – a Windows implementation of LaTeX
  2. WinShell – A really nice TeX editor for Windows
  3. How to make a compact beautiful PostScript or PDF file from a TeX file
Jul
09

.plan

My .plan file on everest:

   _       _
  ( ).---.( )            ~!@#$%^&*()~!@#$%^&*()~!@#$%^&*()
  ./.="'"=.\.
  |=.     .=|                     pravinparatey
  |_  0 0  _|                  tHe InSaNiTy BeGiNs
 .`  .---.  '.
 :   `---'   :           ~!@#$%^&*()~!@#$%^&*()~!@#$%^&*()
 `._ '---' _.'
   _:-----:._
  /={     }=_\
 /_.{     }-_=\
 |=|{     }=|-|
 |=|{     }-|=|
  \|{     }_|/
   |{     } |
   |{     }=|
   |{     } =\
   |`.   .'=|`\
   |_=`|'`=_|`\`\    .'`.
 __|_=_|=___|  `\`\_/./`'
(((__(((_____)   `.__/



I will probably be at h2
----------------------------------------

Featured Projects

Document Tagger

DocTagger lets you automatically classify text documents. Use this as a starting point to write apps that can sort through volumes of unorganized data.

Indic to English Transliterator

Transliteration is the process of converting a word from one language to another while retaining its phonetic characteristics. This application lets you convert a word from any major Indian language (currently supports Hindi, Marathi, Sanskrit and Bengali) to English.

This page and its contents are copyright © 2010, Pravin Paratey.