Leon's Weblog

October 13, 2005

PHP Application Framework Design: 4 – Forms and Events

Filed under: Software Dev — Leon @ 9:41 am

This is part 4 of a multi-part series on the design of a complete application framework written in PHP. In part 1, we covered the basic class structure of the framework and laid out the scope of the project. The second part described methods for managing users and session data. The third part described a practical implementation of page templates. In this fourth and final section, we will expand on our implementation of page templates to give the web application persistent form data. We will also apply the framework to build an error page and a page dispatcher.

Forms

If you’ve read Part 3 and felt a little short-changed, you are right. The fact of the matter is that our use of templates as described in that article is not quite practical. One template for the entire application. Really? Does that mean each page has to have exactly the same layout? Of course not. What was left unstated is that the page templates only illustrated the general principle of separating the presentation layer from the application logic. Now that this is established, we can look into applying these templates more practically. For starters, it is reasonable to have a global template as long as there is some provision for customizing each page. Ergo forms.

Forms are actually nothing more than a template within a template. In fact, the class_form.php class inherits from the class_template.php class. In the global template we put a place-holder for the form, <?=$form;?>, and then in the code we set the $form variable of the page template to the result of fetching another template for the form. The form template itself does not necessarily have to contain an HTML form element; but, since web applications frequently deal with dynamic content, you will usually want to place the form tag into the template anyway. Now that we have solved that problem, let's discuss the real power of forms: data.
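To make the "template within a template" idea concrete, here is a minimal sketch. It is not the framework's class_template.php or class_form.php: SketchTemplate and SketchForm are hypothetical stand-ins, and they substitute {name} place-holders in a string rather than including a PHP template file, purely to keep the example self-contained.

```php
<?php
// Hypothetical stand-in for the template class (an assumption, not the author's code).
class SketchTemplate {
    protected $vars = array();
    protected $source;

    function __construct($source) {
        $this->source = $source;
    }

    // Remember a value to be embedded in the template.
    function set($name, $value) {
        $this->vars[$name] = $value;
    }

    // Replace every {name} place-holder with its stored value.
    function fetch() {
        $pairs = array();
        foreach ($this->vars as $name => $value) {
            $pairs['{' . $name . '}'] = $value;
        }
        return strtr($this->source, $pairs);
    }
}

// A form is just another template, so it inherits everything.
class SketchForm extends SketchTemplate {
}

// Fetch the inner (form) template first...
$form = new SketchForm('<form method="POST"><p>{greeting}</p></form>');
$form->set('greeting', 'Hello');

// ...then hand its output to the outer (page) template as an ordinary variable.
$page = new SketchTemplate('<html><body><h1>{title}</h1>{form}</body></html>');
$page->set('title', 'Demo');
$page->set('form', $form->fetch());

$html = $page->fetch();
echo $html, "\n";
```

The point of the design is that the page template never needs to know what the form contains; it only sees one more variable.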

Persistent Data

One of the difficulties in writing web applications is maintaining persistent data across multiple submits of the same page. This is not a problem for thick client applications because all of the data is stored in the client’s memory. Since HTTP is a stateless protocol, however, we have to collect all the variables that were submitted using the POST or GET methods and send them back to the client with every single post back. Traditional web sites don’t have to do this because every page is designed to do only one task, so it gets submitted only once. The difference is that we are writing a web application where each page is a modular entity. If we were to write a separate page for every button that’s contained on the page, then we would have an order of magnitude more pages to maintain. Furthermore, the very concept of a button (in a traditional thick client sense) doesn’t exist in HTML. Yes, you have something that looks like a button control, but all it does is submit a page. It’s up to us to connect the server side logic and have it execute when that button is clicked.

Another point about data is that not all of it gets submitted during an HTTP connection. What about the data that is generated at the server and needs to be maintained inside each page? For example, let's say you are displaying a table of data and want to remember which column was used to sort the data. You can store this in a hidden input control inside the form; but, if your page has many of these kinds of properties, then you may not want the overhead and burden of maintaining all of them this way. What other choice do you have?

If you look at the template class carefully, you will notice that everything that we are considering data is just a value that needs to be embedded somewhere within our page template. If we store this data across post backs, then we don’t have to worry about maintaining the other data that gets submitted with each page. This is precisely what the form class does. We save all the variables just before we output the form to the client and load them back as soon as the form class gets instantiated. All the form variables are stored in a hidden input control named __FORMSTATE. If you are familiar with the ASP.NET way of doing things, then you will realize that this is a very similar approach. Note that when using the framework you do not have to worry about how this happens because the form class updates the information accordingly and even updates the values that were submitted during the last post-back. The one cumbersome issue with this approach is that any form property that corresponds to an actual control on the form (i.e. an item that sends a name/value pair with the page request) needs to be indicated to the form class so that the form properties can be automatically updated with the latest value sent back. This is why we need to pass a $form_controls array when instantiating the form class. The code needed to maintain persistent form data is reproduced below.

    function _saveState() {
        $this->set("__IsPostback",true);
        $state = chunk_split(encrypt(serialize($this->state_vars),session_id()));
        $_POST["__FORMSTATE"] = $state;
    }

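    function _restoreState() {
        if(isset($_POST['__FORMSTATE'])) {
            $state = decrypt($_POST['__FORMSTATE'],session_id());
            //decrypt() returns FALSE if the state was truncated or tampered with
            $vars = ($state === FALSE) ? FALSE : unserialize($state);
            if (is_array($vars)) {
                $this->vars = $vars;
                $this->state_vars = $this->vars;
            }
        }
    }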

    function _updateState() {
        if (is_array($this->form_controls)) {
            foreach ($this->form_controls as $state_var => $control_name) {
                $this->set($state_var, $_POST[$control_name]);
            }
        }
    }
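The round trip performed by these three methods can be tried in isolation. The sketch below is an illustration only: the sketch_* functions are hypothetical, encryption is deliberately left out (so the state is neither secret nor tamper-proof here), and the allowed_classes option assumes PHP 7 or later.

```php
<?php
// Pack the state variables into a string suitable for a hidden input.
function sketch_save_state(array $vars) {
    return base64_encode(serialize($vars));
}

// Unpack the hidden input; fall back to an empty state on bad input.
function sketch_restore_state($field) {
    $raw = base64_decode($field, true);
    if ($raw === false) {
        return array();
    }
    // Refuse to instantiate objects from data that came from the client.
    $vars = @unserialize($raw, array('allowed_classes' => false));
    return is_array($vars) ? $vars : array();
}

// Overwrite state variables with the values of the controls that were posted.
function sketch_update_state(array $vars, array $form_controls, array $post) {
    foreach ($form_controls as $state_var => $control_name) {
        if (isset($post[$control_name])) {
            $vars[$state_var] = $post[$control_name];
        }
    }
    return $vars;
}

// First request: the server decides on a sort column and a default query.
$field = sketch_save_state(array('sortColumn' => 'name', 'query' => 'old'));

// Post back: the hidden field comes back along with a text box named txtQuery.
$restored = sketch_restore_state($field);
$updated  = sketch_update_state(
    $restored,
    array('query' => 'txtQuery'),   // state variable => control name
    array('txtQuery' => 'new')      // stand-in for $_POST
);
print_r($updated);
```

Note how sortColumn survives the post back even though no control on the form carries it; only query is refreshed from the submitted value.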

Encryption

HTTP is inherently a very insecure protocol. It was designed, as were many Internet protocols in the 1980s, with a security model based on user trust. Decades later, this model fell apart as the popularity of the Internet grew (and the users became less trustworthy). Although we can still send data in clear text when implementing public forums, banking and e-commerce applications require a great deal more security. Why do I bring this up now? For starters, because SSL does not solve all of our problems. Yes, we can encrypt the communication channel between the server and client with confidence, but what happens to the data once it reaches the client? Once a client opens a web site, the HTML code used to generate the page is always available in clear text even if the communication channel was encrypted. That means that any data that we send to the client can remain in their browser cache or in their cookies indefinitely (and it’s all in clear text) [1]. All of this makes it easier for malware and spyware programs to gather data about their victims. Even worse is the situation where you don’t want the user to see what properties are stored in the form.

Now that I have hopefully convinced you that encryption of our form data is necessary, let's look into how we can implement it. You will notice that the _saveState() and _restoreState() methods reproduced above call the encrypt() and decrypt() functions respectively. These functions are implemented in the functions.php file and require the mcrypt library for PHP to be installed on the server. Let's look at how these functions work.

//encrypt plain text using a key and an mcrypt algorithm
function encrypt($pt, $key, $cipher='MCRYPT_BLOWFISH') {
    $td  = mcrypt_module_open(constant($cipher), "", MCRYPT_MODE_OFB, "");
    $key = substr($key, 0, mcrypt_enc_get_key_size($td));       //truncate key to length
    $iv  = mcrypt_create_iv(mcrypt_enc_get_iv_size($td), MCRYPT_RAND); //create iv

    mcrypt_generic_init($td, $key, $iv);                        //init encryption
    $blob = mcrypt_generic($td, md5($pt).$pt);                  //encrypt
    mcrypt_generic_end($td);                                    //end encryption

    return base64_encode($iv.$blob);                            //return as text
}

//decrypt ciphered text using a key and an mcrypt algorithm
function decrypt($blob, $key, $cipher='MCRYPT_BLOWFISH') {
    $blob= base64_decode($blob);                                //convert to binary
    $td  = mcrypt_module_open(constant($cipher), "", MCRYPT_MODE_OFB, "");
    $key = substr($key, 0, mcrypt_enc_get_key_size($td));       //truncate key to size
    $iv  = substr($blob, 0, mcrypt_enc_get_iv_size($td));       //extract the IV
    $ct  = substr($blob, mcrypt_enc_get_iv_size($td));          //extract cipher text
    if (strlen($iv) < mcrypt_enc_get_iv_size($td))              //test for error
        return FALSE;
    mcrypt_generic_init($td, $key, $iv);                        //init decryption
    $pt  = mdecrypt_generic($td, $ct);                          //decrypt
    mcrypt_generic_end($td);                                    //end decryption

    $check=substr($pt,0,32);                                    //extract md5 hash
    $pt   =substr($pt,32);                                      //extract text

    if ($check !== md5($pt))                                    //verify against hash (strict, so "0e..." hashes are not compared as numbers)
      return FALSE;
    else
      return $pt;
}

By default, we will encrypt the form data using the Blowfish algorithm with the client's unique (pseudorandom/dynamically generated) session ID as the key. The Blowfish algorithm was used because benchmarking showed that it ran in a reasonable time for encrypting both small and large amounts of data. Also, the cypher text produced is proportional in size to the plain text. These are all good qualities to have when we are planning to encrypt arbitrary amounts of data.

I will not get into the details of using the mcrypt library in this article (see the PHP manual for details) but will describe the general algorithm used for encryption. After the library is initialized using our key, we take the plain text and prepend the MD5 hash of the text prior to encryption. When we are decrypting the cypher text, we take the decrypted text, separate out the MD5 hash that was prepended (in hexadecimal form this is always 32 characters in length), and compare that MD5 hash to the MD5 hash of the text just decrypted. This way, even if someone manages to modify the cypher text in a meaningful way, we will be able to see that it was changed. Also, since the MD5 hash puts 32 characters of hard-to-predict data in front of the plain text, cracking the code without the key becomes more difficult. This approach will work well as long as the hacker does not have access to the session ID of the client (which is not always a good assumption... especially when the session ID is stored in a session cookie on the client).
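Readers on current PHP versions will not find mcrypt, which was removed from the core distribution in PHP 7.2. As a rough sketch of the same idea (encrypt the state and detect modification), the following swaps in a different technique from the one described above: AES-256 in CTR mode from the openssl extension, with an HMAC-SHA256 computed over the cypher text instead of an MD5 hash placed inside it. The sketch_seal() and sketch_open() names are hypothetical, and the example assumes PHP 7 or later with openssl enabled.

```php
<?php
// Encrypt $plaintext under $secret and attach an HMAC so changes are detected.
function sketch_seal($plaintext, $secret) {
    // Derive separate encryption and authentication keys from the one secret.
    $encKey = hash_hmac('sha256', 'enc', $secret, true);
    $macKey = hash_hmac('sha256', 'mac', $secret, true);

    $iv  = random_bytes(openssl_cipher_iv_length('aes-256-ctr'));
    $ct  = openssl_encrypt($plaintext, 'aes-256-ctr', $encKey, OPENSSL_RAW_DATA, $iv);
    $mac = hash_hmac('sha256', $iv . $ct, $macKey, true);   // 32 raw bytes

    return base64_encode($mac . $iv . $ct);                 // return as text
}

// Verify and decrypt a blob made by sketch_seal(); returns false on any failure.
function sketch_open($blob, $secret) {
    $raw = base64_decode($blob, true);
    if ($raw === false || strlen($raw) < 48) {              // 32-byte MAC + 16-byte IV
        return false;
    }
    $encKey = hash_hmac('sha256', 'enc', $secret, true);
    $macKey = hash_hmac('sha256', 'mac', $secret, true);

    $mac = substr($raw, 0, 32);
    $iv  = substr($raw, 32, 16);
    $ct  = (string) substr($raw, 48);

    // Check the MAC before decrypting, using a constant-time comparison.
    if (!hash_equals(hash_hmac('sha256', $iv . $ct, $macKey, true), $mac)) {
        return false;
    }
    return openssl_decrypt($ct, 'aes-256-ctr', $encKey, OPENSSL_RAW_DATA, $iv);
}

$sealed = sketch_seal('a:1:{s:3:"err";s:3:"404";}', 'example-session-id');
$opened = sketch_open($sealed, 'example-session-id');
var_dump($opened === 'a:1:{s:3:"err";s:3:"404";}');
```

The caveat from the text still applies: if the key is the session ID and the session ID lives in a cookie on the client, the client holds the key. A secret that never leaves the server would be a safer choice for the second argument.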

While we are on this subject, why not compress the data before it is encrypted and reduce the amount of data that is transmitted between the server and client? Well, to be honest, I tried. It turned out that the time it took to compress/decompress this data was longer than any amount of time that we would have saved by reducing the transmission delay. In fact, the format of the serialized or encrypted data that we would be compressing is such that compression is not very effective in reducing the size (i.e. there are not many repeated elements in the text).

Form Events

So we have our new page, it looks pretty, and is somewhat secure. What do we do when the user actually clicks on a button? If you look back to Part 1 of the series, you will remember that the system base class has an abstract function handleFormEvents() which gets called as the page is processed. You can override this function in each page and handle the events accordingly. For example, let's say you have a button on your page called cmdSearch and you want to see if it was clicked. All you have to do is test for isset($_POST["cmdSearch"]) inside the handleFormEvents() function.

This approach looks simple enough, but it only solves half of our problem. Any HTML element can trigger an onClick JavaScript event on the client. What if we want that event to be handled on the server side? Since the object clicked does not necessarily have to be a button, you wouldn't always be able to find out if it was clicked using the approach described above. To solve this problem, our form template will need two more hidden input objects: __FORMACTION and __FORMPARAM. We can test for these values when the page is submitted and simply set the values in JavaScript whenever any event is raised that needs to be handled by the server. To set these objects in JavaScript and then submit the form we would do something like:

function mySubmit(action, param) {
    document.forms[0].__FORMACTION.value = action;
    document.forms[0].__FORMPARAM.value = param;
    document.forms[0].submit();
}

Note that since our model supports only one form element per page, we can simply refer to the first form in the forms array when accessing the objects. In practice, you wouldn't want to have multiple forms on a single page anyway.
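On the server, the two hidden fields set by mySubmit() still have to be turned into a call to the right piece of code. One possible way to do that is sketched below; sketch_dispatch_event() and the handler table are hypothetical and not part of the framework, where this logic would sit inside handleFormEvents().

```php
<?php
// Look up the handler named by __FORMACTION and call it with __FORMPARAM.
// Returns null when no action was posted or the action is not registered.
function sketch_dispatch_event(array $post, array $handlers) {
    $action = isset($post['__FORMACTION']) ? (string) $post['__FORMACTION'] : '';
    $param  = isset($post['__FORMPARAM'])  ? (string) $post['__FORMPARAM']  : '';

    // Only actions listed in the table can run; anything else is ignored.
    if ($action === '' || !isset($handlers[$action])) {
        return null;
    }
    return call_user_func($handlers[$action], $param);
}

// Example handler table (action name => callback).
$handlers = array(
    'sort'   => function ($column) { return "sorting by $column"; },
    'delete' => function ($id)     { return 'deleting row ' . (int) $id; },
);

// Stand-in for $_POST after mySubmit('sort', 'price') ran in the browser.
$post   = array('__FORMACTION' => 'sort', '__FORMPARAM' => 'price');
$result = sketch_dispatch_event($post, $handlers);
echo $result, "\n";
```

Keeping an explicit table of allowed actions matters because both fields arrive from the client and can contain anything.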

Error Handling

Customized error pages are necessary for any web application in order to improve the user's experience whenever the inevitable errors occur. So, since we have to build the pages anyway, let's demonstrate how to apply this framework to writing a custom error page which lets the user save the error that occurred into our database (whether you actually address the problem is up to you). Let's start with a file for the form template which will contain our user interface.

<form action="<?=$_SERVER['PHP_SELF']?>" method="POST" name="<?=$formname?>">
    <input type="hidden" name="__FORMSTATE" value="<?=$_POST['__FORMSTATE']?>">

    <div align="center" style="width: 100%">
        <div style="height: 280px;">
            <img src='<?=IMAGE_PATH."error/error".$err.".gif"?>' style="padding-top: 20px">
        </div>
        <div style="height: 120px">
            <?=$message?><br><br>
            <? if (isset($_POST["log"])) : ?>
                Click <a href="#" onclick="history.go(-2)">here</a> to go back.
            <? else : ?>
                If you think there is something wrong with the site, <br>
                please submit the error for review so that it can be fixed. <br><br>
                <input type="submit" name="log" value="Submit">
            <? endif; ?>
        </div>
    </div>
</form>

As you can see, there is a little logic embedded in the template file, but so be it. I didn't want to overcomplicate things by doing everything by the book (this is, after all, an example of how to mold the application framework to various needs). Now let's look at the main PHP file. The first thing that you will notice is that the NO_DB constant is defined because, in the usual case, we will not need a database connection. If the user wants to log the error, then we can create the database connection ourselves [2]. Also notice that this page does not use the same generic template as the rest of the web site (the "error.inc.php" template is used to give the error page a special look). The rest of the page is fairly straightforward. We have to include the system base class in every page and then we derive our current page from the base class. In the init() method we indicate that we want to use a form and specify where the form template is located. Then, we check if this is the first time that this page has loaded (i.e. the user has not yet clicked any buttons on this page which would cause it to post back to itself). If so, we set the appropriate form properties. When the user clicks on the submit button, the handleFormEvents() function records the error in our database. Note that you will probably want to log a lot more information in your error pages [3].

define("NO_DB", 2);
include "include/class_system.php";

class Page extends SystemBase {
    function init() {
        $this->form = new FormTemplate("error.frm.php", "frmError", true);

        if ( !$this->form->IsPostback() ) {
            switch($_GET['err']) {
                //specify all the errors that you want to handle here
                case 404:
                    $title = _("File not Found");
                    $message = _("The URL that you requested, '") . $_SERVER['REDIRECT_URL'] .
                        _("', could not be found.");
                    $this->form->set("err","404");
                    break;
                case 500:
                    $title = _("Server Error");
                    $message = _("The request for ") . $_SERVER['REDIRECT_URL'] .
                        _(" caused a server configuration error.");
                    $this->form->set("err","500");
                    break;
                default:
                    $title = _("Unknown error");
                    $message = _("An unknown error occurred at ") . $_SERVER['REDIRECT_URL'];
                    break;
            }
            //array keys are quoted strings; bare words would be treated as undefined constants
            $this->form->set("notes",$_SERVER['REDIRECT_ERROR_NOTES']);
            $this->form->set("redirectURL",$_SERVER['REDIRECT_URL']);
            $this->form->set("redirectQueryString",$_SERVER['REDIRECT_QUERY_STRING']);
            $this->form->set("remoteIP",$_SERVER['REMOTE_ADDR']);
            $this->form->set("userAgent",$_SERVER['HTTP_USER_AGENT']);
            $this->form->set("referer",$_SERVER['HTTP_REFERER']);
            $this->form->set("lang","en");
        } else {
            $title = _("Error Page");
            $message = _("Thank you for reporting the error.");
        }
        $this->page->set("title",$title);
        $this->form->set("message",$message);
    }

    function handleFormEvents() {
        if (isset($_POST["log"])) {
            $db = db_connect();

            $Number = $db->quote($this->form->get("err"));
            $Notes  = $db->quote($this->form->get("notes"));
            $RequestedURL = $db->quote($this->form->get("redirectURL"));
            $Referer  = $db->quote($this->form->get("referer"));

            $sql = "INSERT INTO tblErrorLog(Number, Notes, RequestedURL, Referer)
                    values($Number, $Notes, $RequestedURL, $Referer)";
            $result = $db->query($sql);
            if (DB::isError($result))
                die($result->getMessage());

            $db->disconnect();
        }
    }
}

$p = new Page("error.inc.php");
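One detail worth keeping in mind with a page like this is that the requested URL comes from the visitor and ends up inside the HTML of the error message. The sketch below shows one way the switch statement could be factored into a helper that escapes the URL first. The sketch_error_info() function is hypothetical, and the _() gettext calls are omitted so that the example runs without the gettext extension.

```php
<?php
// Map an error code and requested URL to the values the error form needs.
function sketch_error_info($code, $url) {
    // The URL is supplied by the client, so escape it before it reaches HTML.
    $safeUrl = htmlspecialchars((string) $url, ENT_QUOTES, 'UTF-8');

    switch ((int) $code) {
        case 404:
            return array(
                'err'     => '404',
                'title'   => 'File not Found',
                'message' => "The URL that you requested, '$safeUrl', could not be found.",
            );
        case 500:
            return array(
                'err'     => '500',
                'title'   => 'Server Error',
                'message' => "The request for $safeUrl caused a server configuration error.",
            );
        default:
            return array(
                'err'     => '',
                'title'   => 'Unknown error',
                'message' => "An unknown error occurred at $safeUrl",
            );
    }
}

$info = sketch_error_info(404, '/a<b>.html');
echo $info['title'], ': ', $info['message'], "\n";
```

Inside init(), the returned values would then be passed to $this->page->set() and $this->form->set() as in the listing above.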

The Page Dispatcher

Here is another example of how to apply this framework to a page that doesn't quite fit the mold. Look at the URL of the current page. Notice anything interesting? The entire site is written in PHP and yet you are looking at a .html file. I'll give you a hint: the HTML page is not actually there. In fact, the page that you are looking at is called articles. The web server was configured to interpret the file as a PHP file even though it does not have a .php extension. Also, using the AcceptPathInfo On directive, you can configure Apache to backtrack along the requested URL until it finds the desired file. So in this case, the HTML file is not found and the file articles is found, so everything after the valid file name (i.e. /article_name.html) is passed into the articles PHP file as path information (available in $_SERVER['PATH_INFO']). Why would we want such a convoluted system? Well, for starters, this hides the server's internal folder structure from the users. More importantly, however, it makes the URL more human readable than something like "articles.php?ID=12345" and increases the chances that search engines will cache the page. Many search engines don't cache dynamic pages, and having a query string like "?ID=12345" is a dead giveaway.

How does one implement articles using this application framework? Because of the template engine, it's actually quite simple. Take a look.

class Page extends SystemBase {
    function init() {
        //determine page from query string
        $page = substr($_SERVER['PATH_INFO'],1,strpos($_SERVER['PATH_INFO'], '.')-1);

        if (file_exists("templates/_$page.frm.php")) {
            $this->form = new FormTemplate("_$page.frm.php", "frmArticle");
        } else { //article not found
            $this->redirect(BASE_PATH."error.php?err=404");
        }
    }
}
$p = new Page("view.inc.php");
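Since the page name is taken straight from the URL and then used to build a file name, it may be worth validating it more strictly than the substr() call does. The following is a sketch of one way to do that with a whitelist pattern; sketch_page_from_path_info() is a hypothetical helper, and the set of allowed characters is an assumption about how the article templates are named.

```php
<?php
// Extract "article_name" from "/article_name.html".
// Returns null when the path does not look like a single article file name.
function sketch_page_from_path_info($pathInfo) {
    // Letters, digits, underscore, and hyphen only: no slashes, no dots.
    if (preg_match('#^/([A-Za-z0-9_-]+)\.html\z#', (string) $pathInfo, $m)) {
        return $m[1];
    }
    return null;
}

// Build the template file name the same way the dispatcher does.
function sketch_template_for($pathInfo) {
    $page = sketch_page_from_path_info($pathInfo);
    return ($page === null) ? null : "templates/_{$page}.frm.php";
}

$template = sketch_template_for('/article_name.html');
echo $template, "\n";
```

A null result would be treated exactly like a missing template file in the listing above, that is, with a redirect to the 404 error page.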

Wrapping up

By now, we have covered all the necessary aspects of designing web applications in PHP. Although the framework described here may not be exactly what you need, hopefully this series of articles provided a basic idea of the issues that need to be considered when writing your own web applications in PHP. Feel free to contact me if you have any questions or comments.

  1. Don't get me wrong here, because we aren't building a state of the art security site. All I am trying to accomplish is encrypting the data in the form that is sent back and forth between the server and client. If security was truly a concern, we wouldn't be sending such sensitive information back and forth but storing it securely on the server side. This implementation also addresses one of the criticisms that people had with the initial version of ASP.NET, which did not encrypt its VIEWSTATE for performance reasons. As for trusting the users... here is a perfect example of what too much trust can do. In the mid nineties, many startup e-commerce sites stored the prices of products inside their forms in clear text. The prices would then be submitted via GET (which is ridiculously easy to "hack") or POST to the script that confirmed the purchase of a product. So all a user would have to do is submit the order for any product with any value that they wanted as the price. You don't think anyone got away with doing this? Well, don't be so sure.
  2. Or what if the database is down and that is the reason why the user ended up on the error page in the first place? Then the error page will cause an error as well, and the user will never see all the effort we went through to enhance their experience. This is why we want to make the page as simple as possible (and if we break a few design rules while we are at it, then so be it).
  3. If you are wondering why all of the strings are written _("like this"), it is because we may want to localize the page depending on who is viewing it. For more details, look up the gettext library in the PHP manual.

Navigate: Part 1: Getting Started, Part 2: Managing Users, Part 3: Page Templates, Part 4: Forms and Events

1 Comment »

  1. Encryption:

    If you’re using MCRYPT_RAND to initialize IV, you should probably be calling SRAND first (e.g. srand((double) microtime() * 1000000))

    Thanks for the nice artlicle!

    Alex

    Comment by Alex — July 7, 2007 @ 1:36 pm
