Leon’s Weblog

July 22, 2008

File Synchronization with Unison

Filed under: Software Dev — leon @ 8:51 am

Unison is a universal tool for synchronizing files. Although the program is no longer actively developed, it has enough useful features to make it my tool of choice for many tasks and projects. Here are a few scenarios for which I find Unison to be particularly useful:

Application Deployment
While Unison is in no way a replacement for version control, it can be used to release (web/intranet) applications from staging to production environments. This approach has several advantages. First, it is faster (and safer) than doing a full copy of a large site. Before the changes are committed, the program displays a summary of changed files and allows you to use diff to view/confirm the changes that were made. Since platforms like ASP.NET can compile pages on the fly (in memory), synchronizing only the changed files saves the server processing time and improves the users’ experience. Also, synchronization is bidirectional (unlike rsync), so changes made directly on the production copy can be detected (just don’t ask who made them). Of course, all of this can be achieved by writing custom deployment scripts, but running Unison is far easier (especially if you have a frequent release schedule).
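
As a rough sketch of the staging-to-production setup described above, a Unison profile could look something like this. The profile name, user, host name, and paths are placeholders made up for illustration; only the root and ignore preferences are standard Unison settings.

```text
# deploy.prf (hypothetical profile name, stored in the .unison directory)

# The two replicas to keep in sync: local staging copy and remote production copy
root = /var/www/staging
root = ssh://deploy@production.example.com//var/www/site

# Skip files that should never be released
ignore = Name *.log
ignore = Name .svn
```

With a profile like this in place, running "unison deploy" should list the changed files on both sides and ask for confirmation before propagating anything.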

Synchronizing Documents with Mobile Gadgets
I run a central file server that hosts all of my documents. Although I can access the documents remotely, I often create replicas for my laptop, PDA, and flash drive (as needed) for times when I am not connected or the internet connection is too slow. Unison is particularly useful here because it is available for Windows, Linux, and Mac and can synchronize local files (for the flash drive), network shares, and remote hosts over SSH. This was the only tool I found that can safely and securely synchronize files from my Linux server to my Windows laptop without compromising any functionality on either platform. Furthermore, if you have more than two replicas of the same files, you can safely synchronize the replicas two at a time to propagate changes.

Backup
There are many backup and disaster recovery solutions; however, on Unix/Linux, everything is just a file. It’s often easier and more useful to just make a copy of everything to an external disk and maintain it by synchronizing. To recover, just copy the files back. I wouldn’t recommend this approach on a critical corporate server, but for a personal server I find it good enough.

Unison is free. Give it a shot.

July 3, 2008

Backup Fully, Backup Often

Filed under: Personal — leon @ 9:59 am

Recently, the power supply on my server failed, damaging the motherboard and all attached hard drives. I had taken many precautionary measures to protect the data on the server, but they were not enough to avoid going through data recovery. The data was on a journaling file system (ReiserFS v.3), but that doesn’t help when the disks are fried and unreadable. The data was also mirrored across two 250GB drives which, as luck would have it, were both unusable. Sure, there were several server backups as well, but none were recent or complete enough to be usable.

My data recovery quest started with some anecdotal attempts to get the drives to work. The USB SATA adaptors did not work, nor did the trick of putting the disks in the freezer (as silly as that sounds, some have had luck with this approach so I figured it was worth a shot). It was time to enlist professional help, so I contacted CBL Data Recovery, which has a long history of recovering data from various disasters.

Pros:

  • CBL performs an assessment of the damage and only charges you if they are able to recover the data.
  • The prices are reasonable compared to other services that I have seen that charge 10K and above.
  • Friendly service

Cons:

  • The recovery process took over a week. Apparently, the disk platters got damaged as well as the disk circuit board.
  • The customer service representatives were not very helpful and did not appear technically inclined. The CBL engineers that I talked to were much more aware of the situation.
  • Many of the recovered text files had some binary data after the EOF flag, which caused some Linux programs to crash when opening the files. This was fixable but time consuming.

Ultimately, CBL was able to recover all the data from the drives. Time to rebuild my server and think of a better backup strategy.

December 10, 2007

WordPress Auto-Login

Filed under: Software Dev — leon @ 9:31 pm

WordPress is a great blogging engine. It’s flexible, scalable, and easy to tweak/configure to integrate into an existing PHP site. However, if you have an existing site with available user authentication and management capabilities, getting WordPress to accept those credentials (in a single sign-on fashion) can be a bit of a challenge.

Before we proceed, I should note that there are a number of available plugins that enable WordPress to integrate with some of the popular content management systems out there. Our requirement is a bit different, however. We want to bypass WordPress’ authentication mechanism altogether and have users log in through the main portion of the site. In fact, in a well integrated site, the interface should make navigating between WordPress pages and the rest of the site seamless to the user. Our goal is to write a WordPress plugin that will automatically authenticate a user who is already logged into the parent site (and, consequently, grant the user access to edit the blog’s content). All other users will have the rights of an unregistered visitor.

In my setup, the main site has role-based permissions and the WordPress setup only has one account for each role (i.e. admin, editor, user etc…). The plugin first checks the role of the user logged in to the main site and then simulates a WordPress login anytime the user navigates to the blog. You should be able to customize this method for your own needs.

function auto_login() {
    if (!is_user_logged_in()) {
        //determine WordPress user account to impersonate
        $user_login = 'guest'; 

        //get users password
        $user = new WP_User(0, $user_login);
        $user_pass = md5($user->user_pass); 

        //login, set cookies, and set current user
        wp_login($user_login, $user_pass, true);
        wp_setcookie($user_login, $user_pass, true);
        wp_set_current_user($user->ID, $user_login);
    }
}
add_action('init', 'auto_login');

Additional notes and caveats for the attentive reader

  • There is a wp-includes/pluggable.php file that defines all the functions that you can override and hook into. The WordPress API documentation is not very thorough so you may need to review the actual code.
  • WordPress uses a double MD5 hash of the password to authenticate the user. In the database, the password is stored as a single hash. We need to hash that password again before passing it into the wp_login() function (and set the third parameter to indicate that the password is already hashed). Obviously hard coding the actual password would be a big no-no.
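
To make the double hash a little more concrete, here is a minimal sketch of the relationship described in the second note. It is written in Python purely for illustration (the plugin itself is PHP), and the password and helper name are made up; it only mirrors the scheme as described above, not WordPress’ actual code.

```python
import hashlib


def md5_hex(value: str) -> str:
    """Return the hex MD5 digest of a string (hypothetical helper)."""
    return hashlib.md5(value.encode("utf-8")).hexdigest()


plain_password = "example-password"    # made-up password, never hard code a real one

stored_hash = md5_hex(plain_password)  # single hash, as kept in the database
double_hash = md5_hex(stored_hash)     # hashed again, as passed to wp_login()

print(stored_hash)
print(double_hash)
```

The point is simply that the plugin never needs the plain-text password: it starts from the hash that is already stored and hashes it one more time.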

We did all this work to log in, but what about logging out? We have several options. First, we can call WordPress’ logout method, wp_clearcookie(), from the main site. The drawback to this approach is that we need to include all the WordPress libraries in our main site for this to work (too much unnecessary overhead IMHO). The other approach is to not use cookies at all, thus eliminating the need to log out. To do this we simply remove the call to wp_setcookie() in our plugin and override the auth_redirect() function to do nothing. This works because we impersonate the user on every page load and the only WordPress code that checks the cookie was in auth_redirect() until we got rid of it. Another side effect of this is that unauthenticated WordPress users will no longer be taken to the WordPress login page (but we didn’t want that anyway).

Update 6/4/08: There were a few changes to the WordPress API as of version 2.5 and some of the functions I used above became deprecated. The API documentation has also improved. A better way to implement the auto_login() function above is as follows.

function auto_login() {
    if (!is_user_logged_in()) {
        //determine WordPress user account to impersonate
        $user_login = 'guest';

        //get user's ID
        $user = get_userdatabylogin($user_login);
        $user_id = $user->ID;
  
        //login
        wp_set_current_user($user_id, $user_login);
        wp_set_auth_cookie($user_id);
        do_action('wp_login', $user_login);
    }
} 
add_action('init', 'auto_login');

November 30, 2007

Configuring Website on a 1and1 Shared Host

Filed under: Software Dev — leon @ 2:39 pm

Recently, I was working on a project to set up a new website on a 1 & 1 shared host. Shared hosts are a cheap alternative to VPS and managed servers but they come with a mixed bag of restrictions that limit your ability to configure the server. I was looking for a host for under $10/month that offered SSH access and had a typical LAMP setup. This is how I configured the rest. (more…)

June 3, 2007

Back from New Zealand

Filed under: Personal — leon @ 8:34 am


I just returned from a two week trip to New Zealand with my friend, Eugene (5/19/07-6/1/07). We toured the country from Christchurch to Auckland by car. The pictures from the trip are available in my photo gallery.

(more…)

February 27, 2007

ASP.NET Impersonation and Screen-Scraping

Filed under: Software Dev — leon @ 10:13 am

If you have ever tried to download the content of a secured web page from another page then you know just how easy it is to have something go wrong. But, as I’ve found out recently, if you see it through, you will learn a great deal about IIS and ASP.NET security (while developing a newfound appreciation for Apache at the same time). Here are some highlights.

Background:
You want a user to press a button on an Intranet page and download the content of several other remote sites that use NTLM. The connection to the remote resource should be made using the credentials of the user not a default account. (in my case the remote resources were actually web services and network files but it doesn’t really matter)

The Basics:
First configure your site to use Integrated Windows Security (a.k.a. NTLM) and disable anonymous login. This can all be done under the Security tab of the IIS settings. At this point, it doesn’t really matter what authentication method is configured for ASP.NET in the web.config file because IIS will still pass the identity of the network user on the domain to the page. However, to be a bit more explicit, we can disable guest login in the web.config file as well.

The easiest way to download a file in ASP.NET is to use the System.Net.WebClient class. Just create an instance of the class and use the DownloadFile method. What if you are downloading from a secured site that doesn’t allow anonymous users?

First Problem:
One would think that passing the credentials of the current logged in user to a remote resource should be easy. Especially when the WebClient class has a Credentials property which we can set to CredentialCache.DefaultCredentials. Easy enough. Too bad that doesn’t work. Maybe we should set the UseDefaultCredentials property of the WebClient class to True as well. Still doesn’t work.

Checking the logs on the remote web server will show that no credentials were passed at all and the server returned the typical 401-unauthorized error. There are plenty of other properties that you can set, or you can try to work with the lower level HTTP Client Request and Response classes, but they will all lead you to the same place… i.e. nowhere. You may see a small beacon of hope if you place both websites on the same host because this configuration works (although it is not the way you intended).

Impersonation:
By default, IIS runs and processes each user request in threads owned by ASP.NET’s local machine account (which is typically ASPNET). As a consequence, even if we enable NTLM security on the page, those credentials will only last one hop. Meaning that your web server authenticates the client over NTLM but cannot spawn new web requests on the user’s behalf to remote network resources. No wonder we couldn’t login before. ASPNET is not a network account. Also, this explains why you can connect to other websites on the same host… IIS was using the same ASP.NET account on each site.

Now we can configure IIS to enable impersonation (which is disabled by default) by setting the appropriate variable in the web.config file. This will tell IIS to spawn each new request thread under the credentials of the client, thus giving the thread access to network resources that are available to the user. Are we home free now? Well, you can test that this scenario does work and that we have reached our goal, but at what cost?
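
For reference, the web.config settings discussed so far would look roughly like the fragment below. This is a sketch of the standard ASP.NET elements for Windows authentication, denying anonymous users, and site-wide impersonation; your own file will contain other settings around them.

```xml
<configuration>
  <system.web>
    <!-- Windows authentication, so User.Identity carries the domain user -->
    <authentication mode="Windows" />

    <!-- Deny anonymous (guest) users -->
    <authorization>
      <deny users="?" />
    </authorization>

    <!-- Site-wide impersonation (off by default) -->
    <identity impersonate="true" />
  </system.web>
</configuration>
```

Setting impersonate back to "false" (or removing the identity element) returns the site to the default behavior.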

Next Problem:
According to Microsoft (and a few simple performance tests), enabling impersonation reduces the scalability of your site because each request incurs a small performance hit. Do you want to impose this overhead on the entire site if you only need to use this functionality in a few places? There are also a number of security problems with this solution. The one that I particularly dislike is that, if you have pages that write files on the web server, every network user will now need write access to the output folders (whereas, without impersonation, only the local ASP.NET account needs write access to the output folders).

Impersonation at Runtime:
What if you could disable impersonation for the site as a whole but enable it dynamically on the pages that need it? Well, it turns out that you can. I would have preferred setting a page directive to enable this option as needed but, after going through this ordeal, writing a few extra lines of code doesn’t seem so bad. The last caveat is that the web.config settings now matter. If the ASP.NET authentication method is not set to “Windows” you will not be able to impersonate the user at runtime because there will not be enough information in the User.Identity object.

The final code:

' Requires: Imports System.Net (for WebClient and CredentialCache)

' Impersonate the current network user
Dim currentIdentity As System.Security.Principal.WindowsIdentity = CType(User.Identity, _
System.Security.Principal.WindowsIdentity)
Dim impersonationContext As System.Security.Principal.WindowsImpersonationContext = _
currentIdentity.Impersonate()

Try
    ' Download file from a network resource
    Dim myWebClient As New WebClient
    myWebClient.Credentials = CredentialCache.DefaultCredentials
    myWebClient.DownloadFile("http://network_resource", Server.MapPath("~\inbox"))
Finally
    ' Revert back to original context of the ASP.NET process, even if the download fails
    impersonationContext.Undo()
End Try

February 19, 2007

.NET Regular Expressions

Filed under: Software Dev — leon @ 7:11 pm

Regular Expressions are great for parsing text files. If only all languages used the same conventions… Oh well. Here are some tips and caveats that I’ve picked up from working with Regular Expressions in .NET.

  • Comments: If your RegExp is going to be longer than just a few symbols, make sure to include comments. The comments are placed in tags like this: (?# Your Comment Here ). When doing this make sure to set the option to ignore pattern white space. While you are at it, enable multiline support since the file that you are parsing is probably on multiple lines and ignore case to make the pattern simpler. Combine the options using the or operator like so: RegexOptions.Multiline Or RegexOptions.IgnoreCase Or RegexOptions.IgnorePatternWhitespace
  • Named Captures: If you are parsing the file with Regular Expressions then you are looking for tokens to extract. These tokens are saved in the Groups collection of the Match object. By default, you can access the captured values using an index; however, you can greatly improve readability by assigning a name to any matched token like this: (?<Name>(Pattern)).
  • Non-Greedy Matching: Ever wonder why .* sometimes gobbles up your entire file? By default, regular expression quantifiers are greedy: they match as much text as possible, so the match runs to the last possible token. To stop at the first token that matches, use the question mark operator like so: .*?. As you would expect, this works on other patterns such as .+? and .{4,8}?. At least this behavior is consistent on most platforms that claim to implement regular expressions.
  • Caveat: Did you expect the period operator to match absolutely any character? Well, not in .NET. Here, a new line character (which, might I add, is pretty common in text files) will not be matched even if the Multiline option is set. (Multiline only changes how ^ and $ match; the Singleline option is the one that makes the period match new lines.) If you want to match absolutely anything regardless of the options, use a character class like so: [\s\S]. This basically means match anything that is white space or is not white space… i.e. everything.
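
The last two points are easy to see in a few lines. The sketch below uses Python’s re module rather than .NET, simply because it is quick to run; the sample text is made up, and note that Python spells named captures (?P<name>...) instead of .NET’s (?<name>...). Greedy versus non-greedy matching and the [\s\S] trick behave the same way in both.

```python
import re

text = "<b>first</b>\n<b>second</b>"

# Greedy: [\s\S]* runs to the LAST closing tag it can find
greedy = re.search(r"<b>[\s\S]*</b>", text).group(0)

# Non-greedy: [\s\S]*? stops at the FIRST closing tag
lazy = re.search(r"<b>[\s\S]*?</b>", text).group(0)

# The period does not match a new line by default...
dot_match = re.search(r"first</b>.<b>second", text)

# ...but the [\s\S] character class matches anything, new lines included
any_char_match = re.search(r"first</b>[\s\S]<b>second", text)

print(repr(greedy))
print(repr(lazy))
print(dot_match, any_char_match is not None)
```

Here the greedy pattern swallows both tags, the non-greedy one stops after the first, and only the [\s\S] version gets across the line break.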

In my last project, I had to parse an HTML document to extract the path to a particular image and the associated image map (good ol fashioned screen-scraping). Maybe this example will help someone else…

' Note: VB string literals do not treat the backslash as an escape character,
' so each regex escape below is written with a single backslash.
Dim re As New Regex( _
   "<img\s+                     (?# First find an image ) " & _
   "id=""Chart_Image""\s+       (?# With this ID ) " & _
   "usemap=                     (?# Then find the image map name ) " & _
   """\#([\s\S]+?)""\s+         (?# Save the image map name. Use [\s\S] because . does not match \n ) " & _
   "src=""(?<file>([\s\S]+?))"" (?# Then get the file path ) " & _
   "[\s\S]*?                    (?# Now wait for the image map. Non-Greedy capture ) " & _
   "<map \s+ name=""\1"" \s*>   (?# Capture the image map content with specified name ) " & _
   "(?<imagemap>([\s\S]+))      (?# This is the image map content) " & _
   "<\/map>                     (?# End of Image Map) ", _
   RegexOptions.Multiline Or RegexOptions.IgnoreCase Or RegexOptions.IgnorePatternWhitespace)
Dim myMatch As Match = re.Match(file)

For more .NET specific help on Regular Expressions, check out this Article.

January 21, 2007

Maps for GPS Tuner

Filed under: Gadgets, Software Dev — leon @ 10:34 pm

I had been looking for an off-road navigation solution when I stumbled across GPS Tuner. While TomTom is great for car navigation, it lacks many features such as track recording and support for custom maps. We now also have the option of using the newly released mobile Google Maps and mobile Virtual Earth, but these programs require a constant Internet connection and can be slow to use (especially when hiking in remote locations with poor cell phone reception). It’s often much more convenient to have the needed maps pre-loaded and configured on the hand-held.

When I gave GPS Tuner a try I quickly realized that I would have to spend a lot of time making my own maps. Luckily, there are a number of free online mapping systems (tile servers) such as Google Earth and Virtual Earth that can provide the base images for maps. The problem now is downloading the maps (in fine resolution) and piecing them together. Since I’m too lazy to do this manually, it was time for a little scripting to automate the process.

Google Maps and Microsoft Virtual Earth work by asynchronously downloading tiles of the map depending on the user’s desired map zoom level. With a little hacking, I figured that I could put together a script that will download any section of the map in any available zoom level and automatically put all the tiles together into one large image. The biggest challenge there is finding out the indexing scheme used for the tiles (i.e. given a lat/long coordinate and a zoom level, compute the corresponding tile on the map and the URL to fetch that tile). The following articles on Via Virtual Earth gave me a great head start and even some sample code. All that was left to do was to write a loop to download all the tiles between two lat/long coordinates and save them into one continuous image that can be loaded into GPS Tuner.
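
To give a feel for the indexing part, here is a minimal Python sketch of the publicly documented Virtual Earth tile scheme (Mercator projection, tiles numbered from the top-left corner, and a "quadkey" string naming each tile). This is not the script linked from this post; the function names are made up, and the download and stitching steps are left out because the tile URL format is not covered here.

```python
import math

MAX_LATITUDE = 85.05112878  # Mercator limit used by the tile system


def lat_long_to_tile(lat, lon, zoom):
    """Return (tile_x, tile_y) of the tile containing a coordinate at a zoom level."""
    lat = max(-MAX_LATITUDE, min(MAX_LATITUDE, lat))
    lon = max(-180.0, min(180.0, lon))
    # Normalize the position to 0..1 across the whole map
    x = (lon + 180.0) / 360.0
    sin_lat = math.sin(math.radians(lat))
    y = 0.5 - math.log((1 + sin_lat) / (1 - sin_lat)) / (4 * math.pi)
    tiles_per_axis = 2 ** zoom
    tile_x = max(0, min(tiles_per_axis - 1, int(x * tiles_per_axis)))
    tile_y = max(0, min(tiles_per_axis - 1, int(y * tiles_per_axis)))
    return tile_x, tile_y


def tile_to_quadkey(tile_x, tile_y, zoom):
    """Interleave the bits of the tile indices into a base-4 quadkey string."""
    digits = []
    for level in range(zoom, 0, -1):
        mask = 1 << (level - 1)
        digit = 0
        if tile_x & mask:
            digit += 1
        if tile_y & mask:
            digit += 2
        digits.append(str(digit))
    return "".join(digits)


def tiles_between(lat1, lon1, lat2, lon2, zoom):
    """List (tile_x, tile_y, quadkey) for every tile in the bounding box, row by row."""
    x1, y1 = lat_long_to_tile(lat1, lon1, zoom)
    x2, y2 = lat_long_to_tile(lat2, lon2, zoom)
    tiles = []
    for ty in range(min(y1, y2), max(y1, y2) + 1):
        for tx in range(min(x1, x2), max(x1, x2) + 1):
            tiles.append((tx, ty, tile_to_quadkey(tx, ty, zoom)))
    return tiles
```

A download loop would then walk the list from tiles_between(), fetch each tile, and paste it into the large image at an offset of 256 pixels per tile step.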

Here is the code and a sample image of Manhattan made from about 100 tiles.

Happy Navigating.

January 14, 2007

New Toy: Eten M700 (glofiish)

Filed under: Gadgets — leon @ 1:20 am

Just got a new M700 to replace my old phone (Motorola MPx220) and PDA (iPaq 3900). On first impression, this is a great convergence device but it takes some time getting used to. The Eten Users Forum was a big help in working out the kinks and getting everything set up.

Caveats

  • Static Navigation: The GPS receiver has static navigation turned on by default. This limits position changes to about every 50 meters and makes the device unusable for navigating on foot. Use SirfTech to disable this option (be careful). Refer to the following thread for details.
  • Battery Indicator: The battery seems to continue charging indefinitely (even when the indicator shows 100%). The charging light does turn off an hour later… this is “normal.”
  • ETEN software: For lack of a better word, the ETEN software sucks. Replace it with another vendor’s version if you need that functionality. The device runs faster and with fewer “hiccups” once the ETEN software is removed.

Key Applications

  • TomTom Navigator - This is the best navigation software that I found. Just copy the maps to your SD card and you are ready to go. (make sure to get at least a 1GB card)
  • GPS Tuner - Great for off road navigation and making custom maps.
  • Tube II - Transit maps and city reference.
  • Resco Explorer - Powerful file explorer, registry editor, network browser.
  • Resco Photo Viewer
  • SPB Pocket Plus, Diary, and Weather - Convenient today screen plug-in and task manager. Great for one handed navigation of PIM data. Note that the current version has a problem with the Glofiish which causes stray lines to be drawn on the today screen tabs.
  • Pocket Informant - Comprehensive PIM manager.
  • HiCalc - Comprehensive calculator
  • Lexisgoo - Great dictionary.
  • Windows Live Search (still in Beta)
  • SK Tools - I didn’t believe the hype at first but it actually does make the device run faster.

December 31, 2006

My contribution to New Year’s dinner…

Filed under: Cooking, Personal — leon @ 5:21 pm

Olivie (Salad)

5 Potatoes - Boil, peel, and finely chop
4 Carrots
5 Eggs
2 Cans of Peas
Turkey, Bologna, or Ham - Finely chop
Pickles
Dill weed
Mayonnaise
Salt and Pepper - To taste

Mix everything together.

Bangladesh (Salad)

Grate each ingredient and layer in this order:

6 Egg whites
Mozzarella cheese
Mayonnaise
Can of Mackerel
1 Onion
Mayonnaise
2 Egg yolks
1 Stick of frozen butter
Mayonnaise
4 Egg yolks

Chicken Liver Paste

1 Pound of chicken liver - Boil till soft, then puree
1 Onion - Finely chop and blanch in hot water
5 Egg yolks - Puree
5 Egg whites
Salt and Pepper - To taste

Blend together, then make little domes.

Bon Appétit!
