Leon's Weblog

February 27, 2007

ASP.NET Impersonation and Screen-Scraping

Filed under: Software Dev — Leon @ 10:13 am

If you have ever tried to download the content of a secured web page from another page, then you know just how easy it is to have something go wrong. But, as I’ve found out recently, if you see it through, you will learn a great deal about IIS and ASP.NET security (while developing a newfound appreciation for Apache at the same time). Here are some highlights.

Background:
You want a user to press a button on an Intranet page and download the content of several other remote sites that use NTLM. The connection to the remote resource should be made using the credentials of the user, not a default account. (In my case the remote resources were actually web services and network files, but it doesn’t really matter.)

The Basics:
First, configure your site to use Integrated Windows Authentication (a.k.a. NTLM) and disable anonymous access. This can all be done under the Directory Security tab of the IIS settings. At this point, it doesn’t really matter what authentication method is configured for ASP.NET in the web.config file because IIS will still pass the identity of the network user on the domain to the page. However, to be a bit more explicit, we can deny anonymous (guest) users in the web.config file as well.
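
As a rough illustration of denying guest access on the ASP.NET side, a web.config fragment along these lines is the usual shape. This is a sketch assuming the standard system.web schema, not a copy of the settings from the actual site.

```xml
<configuration>
  <system.web>
    <!-- "?" stands for anonymous (unauthenticated) users -->
    <authorization>
      <deny users="?" />
    </authorization>
  </system.web>
</configuration>
```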

The easiest way to download a file in ASP.NET is to use the System.Net.WebClient class. Just create an instance of the class and use the DownloadFile method. What if you are downloading from a secured site that doesn’t allow anonymous users?

First Problem:
One would think that passing the credentials of the currently logged-in user to a remote resource should be easy, especially when the WebClient class has a Credentials property which we can set to CredentialCache.DefaultCredentials. Easy enough. Too bad that doesn’t work. Maybe we should set the UseDefaultCredentials property of the WebClient class to True as well. Still doesn’t work.

Checking the logs on the remote web server will show that no credentials were passed at all and the server returned the typical 401 Unauthorized error. There are plenty of other properties that you can set, or you can try to work with the lower-level HTTP client request and response classes, but they will all lead you to the same place… i.e. nowhere. You may see a small beacon of hope if you place both websites on the same host, because this configuration works (although not for the reason you intended).

Impersonation:
By default, IIS runs and processes each user request in threads owned by ASP.NET’s local machine account (which is typically ASPNET). As a consequence, even if we enable NTLM security on the page, those credentials will only last one hop, meaning that your web server authenticates the client over NTLM but cannot spawn new web requests on the user’s behalf to remote network resources. No wonder we couldn’t log in before. ASPNET is not a network account. Also, this explains why you can connect to other websites on the same host… IIS was using the same ASP.NET account on each site.

Now we can enable impersonation (which is disabled by default) by setting the impersonate attribute of the identity element in the web.config file. This tells ASP.NET to run each request thread under the credentials of the client, thus giving the thread access to the network resources that are available to the user. Are we home free now? Well, you can test that this scenario does work and that we have reached our goal, but at what cost?
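
For reference, site-wide impersonation is normally switched on with a fragment like the one below. It is a minimal sketch that assumes the standard system.web schema and leaves every other setting at its default.

```xml
<configuration>
  <system.web>
    <!-- Run each request under the authenticated client's identity -->
    <identity impersonate="true" />
  </system.web>
</configuration>
```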

Next Problem:
According to Microsoft (and a few simple performance tests), enabling impersonation reduces the scalability of your site because each request incurs a small performance hit. Do you want to impose this overhead on the entire site if you only need to use this functionality in a few places? There are also a number of security problems with this solution. The one that I particularly dislike is that, if you have pages that write files on the web server, every network user will now need write access to the output folders (whereas, without impersonation, only the local ASP.NET account needs write access to the output folders).

Impersonation at Runtime:
What if you could disable impersonation for the site as a whole but enable it dynamically on the pages that need it? Well, it turns out that you can. I would have preferred setting a page directive to enable this option as needed but, after going through this ordeal, writing a few extra lines of code doesn’t seem so bad. The last caveat is that the web.config settings now matter. If the ASP.NET authentication method is not set to “Windows”, you will not be able to impersonate the user at runtime because there will not be enough information in the User.Identity object.
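
The web.config side of this setup would then look roughly like the fragment below. It is a sketch under the assumption that nothing else in the file overrides these elements; the impersonate attribute is spelled out only for clarity, since false is already the default.

```xml
<configuration>
  <system.web>
    <!-- Needed so that User.Identity is a WindowsIdentity at runtime -->
    <authentication mode="Windows" />
    <!-- Site-wide impersonation stays off; pages opt in from code -->
    <identity impersonate="false" />
  </system.web>
</configuration>
```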

The final code:

' Assumes "Imports System.Net" at the top of the code-behind file.

' Impersonate the current network user
Dim currentIdentity As System.Security.Principal.WindowsIdentity = _
    CType(User.Identity, System.Security.Principal.WindowsIdentity)
Dim impersonationContext As System.Security.Principal.WindowsImpersonationContext = _
    currentIdentity.Impersonate()

Try
    ' Download file from a network resource.
    ' DownloadFile expects a file path, not a folder; "download.dat" is a placeholder name.
    Dim myWebClient As New WebClient()
    myWebClient.Credentials = CredentialCache.DefaultCredentials
    myWebClient.DownloadFile("http://network_resource", _
        Server.MapPath("~/inbox/download.dat"))
Finally
    ' Revert back to original context of the ASP.NET process, even if the download fails
    impersonationContext.Undo()
End Try

3 Comments »

  1. This is exactly what I have been looking for. Thanks for the explanation and code!

    Comment by Lance — May 19, 2009 @ 3:16 pm
  2. Hi,

    I happened to see your post and found it quite informative. I would like to share a link where a software engineer has shared a tip on “Screen Scraping in ASP.Net”. I am sharing it just for information.

    Here is the link:
    http://www.mindfiresolutions.com/Screen-Scraping-in-ASPNET-800.php

    Hope you find it useful and of assistance.

    Thanks,
    Bijayani

    Comment by Bijayani — January 29, 2010 @ 7:49 am
  3. Hi,
    Thank you for your post.

    I’m in the same situation. I have to create a website A, and inside it I have a button to download a file from website B (which requires Windows authentication).

    I’m following your approach with impersonation at runtime. I don’t get a 401 error, so it looks like it is working. But the problem is that when WebClient downloads the file, its size is 1 KB, whereas in reality it should be 1.8 MB.

    here is my code
    [
    System.Security.Principal.WindowsImpersonationContext impersonationContext;
    System.Security.Principal.WindowsIdentity currentIdentity = (System.Security.Principal.WindowsIdentity)User.Identity;
    impersonationContext = currentIdentity.Impersonate();
    string filePathTemp = System.IO.Path.Combine("D:/temp", Guid.NewGuid() + "_" + ".pdf");
    WebClient oWebClient = new System.Net.WebClient();
    oWebClient.Credentials = CredentialCache.DefaultCredentials;
    oWebClient.DownloadFile(@"http://xxx/TESTPDF/PREPARATION_OF_STABLE_POLLUTION_GAS.pdf",
    filePathTemp);
    impersonationContext.Undo();
    ]

    thank you for your help.

    Huy

    Comment by Huy — October 2, 2012 @ 4:46 am
