Models, and Views, and Controllers, Oh My!
September 25th, 2010
If you've done any sort of web programming recently, you have probably heard about MVC.
MVC, which stands for "Model-View-Controller," is a particular design pattern which attempts to separate the logic of an application from its presentation. It is particularly popular with web programmers, although it can be applied to other areas of programming as well.
Until recently, I was pretty clueless about MVC. My only knowledge of the subject came from passing references to it found in various blog posts and podcasts. When I started planning out my new web project, I accidentally reinvented MVC in my attempts to keep my design relatively clean. When I was told how close I was to MVC, I decided to do a little more research into the subject, and this blog post is the result of that research.
What is MVC?
There are four main components to MVC: the eponymous models, views, and controllers, and something called a router, which is essentially another controller, but which I consider enough of a special case to warrant special mention.
Routers and Front Controllers
Front Controllers are the first entry point into your application. As explained in my previous post, Apache web servers can be configured to redirect all requests into a single file, usually called index.php. This index.php file would be the front controller for the application.
It is from this file that your application is bootstrapped and its configuration files are loaded. These configuration files can and should include definitions which map your directory tree to constant values in your application (using define() statements in PHP), so you can quickly and easily find the files you need.
Once the application is prepped and ready to go, the router takes over and directs (routes) processing to a specific controller which is better suited to handle that particular form of request. In practice, much of the time this is simply a matter of extracting variables out of a URL to find a controller name (remember from last time that, by design, URLs often begin with a controller name), verifying that it exists, then passing control to that controller.
You may have noticed by now that front controllers and routers have some overlap in responsibility. I consider them to be the same thing, but some purists may want to separate routing capabilities into their own class, so I have tried to separate them a bit in this description.
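The routing step described above can be sketched in a few lines. This is a minimal illustration in Python rather than PHP, and the controller classes, the CONTROLLERS registry, and the route() function are all hypothetical names invented for the example, not part of any particular framework.

```python
# Hypothetical controllers; the names and return values are illustrative only.
class QuestionsController:
    def handle(self, segments):
        return f"questions controller received {segments}"


class UsersController:
    def handle(self, segments):
        return f"users controller received {segments}"


# Registry of known controllers, keyed by the first URL segment.
CONTROLLERS = {
    "questions": QuestionsController,
    "users": UsersController,
}


def route(path):
    """Pick a controller from the first path segment and hand off the rest."""
    segments = [s for s in path.split("/") if s]
    # Verify the controller exists before passing control to it.
    if not segments or segments[0] not in CONTROLLERS:
        return "404 Not Found"
    controller = CONTROLLERS[segments[0]]()
    return controller.handle(segments[1:])
```

Calling route("/questions/display/194812") hands the remaining segments ("display" and "194812") to the questions controller, while an unknown first segment falls through to a 404.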
Controllers
Controllers are essentially gatherers. They are the middle managers of your application. Their entire role is to know what is needed to fulfill a task, and who to ask for it. They aren't there to do any real work (a fact many people seem to miss, leading to an epidemic of "fat controllers"); they are there to facilitate the work of others.
When a request is passed to a controller from the router, the constructor for that controller will once again extract data out of the URL to determine which action is required. In the "questions" controller example used in the last post, the questions controller would need to decide between the "ask" and "display" actions available to it based on the data in the URL.
Once a suitable action is identified, a method is called to execute that action. This method will have a grocery list of data it needs to collect, and it will use any number of specific models to get that data. When data is returned from a model, it gets stored in the controller (or potentially a special data repository object) for later use.
Once the controller has collected all the data it needs, it will determine how the data is meant to be displayed. For example, the same data could be output as an HTML page, an RSS feed, or an email. When the controller knows what sort of display is needed, it passes all the data to an object called a view which is responsible for rendering the data in the appropriate format.
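To make the "middle manager" idea concrete, here is a rough sketch of a thin controller, again in Python for illustration. Everything in it is an assumption made up for the example: the QuestionModel stand-in, the two view classes, and the convention of naming action methods action_<name>. The sketch also picks the action in a dispatch() method rather than in the constructor, purely to keep it short.

```python
class QuestionModel:
    """Stand-in model; a real one would query a database."""

    def find(self, question_id):
        return {"id": question_id, "title": "Example question"}


class HtmlView:
    def render(self, data):
        return f"<h1>{data['title']}</h1>"


class RssView:
    def render(self, data):
        return f"<item><title>{data['title']}</title></item>"


class QuestionsController:
    def __init__(self, model, views):
        self.model = model
        self.views = views  # maps an output format name to a view object

    def dispatch(self, action, question_id, output_format="html"):
        # Look up a method named after the requested action, e.g. "display".
        method = getattr(self, f"action_{action}", None)
        if method is None:
            return "404 Not Found"
        return method(question_id, output_format)

    def action_display(self, question_id, output_format):
        data = self.model.find(question_id)   # gather data from a model
        view = self.views[output_format]      # decide how it will be shown
        return view.render(data)              # let the view do the rendering
```

Note that the controller never builds HTML or touches storage itself; it only decides who to ask and where to send the result.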
Models
You may have taken a class in school which explained the fundamentals of object-oriented programming (OOP). In this class, your professor probably made a big deal about how this paradigm can be used to create software models of real world objects for your application to interact with. The models in MVC refer to exactly the sort of models your professor was talking about.
Models can take many forms. The most common examples given for models online tend to take the form of database abstraction, but that is just the tip of the iceberg. If you were creating a banking application, you could have a Mortgage class to give approval, determine rates, and calculate payments. This mortgage class would be a Model.
Essentially, all of your business logic, all of your database logic, and practically anything that doesn't deal directly with displaying information to the user would be considered a model.
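As a small illustration of a model that has nothing to do with databases, here is a sketch of the Mortgage idea in Python. It assumes the standard fixed-rate amortization formula with monthly compounding, which is a simplification (real lenders may compound differently), and the class layout and method name are invented for the example.

```python
class Mortgage:
    """A business-logic model: no HTML, no URL handling, just the rules."""

    def __init__(self, principal, annual_rate, years):
        self.principal = principal
        self.annual_rate = annual_rate  # e.g. 0.06 for 6%
        self.years = years

    def monthly_payment(self):
        n = self.years * 12        # total number of monthly payments
        r = self.annual_rate / 12  # monthly interest rate (assumed monthly compounding)
        if r == 0:
            return self.principal / n
        factor = (1 + r) ** n
        return self.principal * r * factor / (factor - 1)
```

Under those assumptions, Mortgage(100000, 0.06, 30).monthly_payment() comes out to roughly 599.55. A controller would simply ask the model for that number and pass it along to a view.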
Views
I've covered views a bit already as part of my explanations of models and controllers, but let's go over them again anyway.
Views are the interface between your application and the end user. They are the HTML displayed by a user's web browser, the message received by the user's email client, and the feed in your user's RSS reader.
By the time a controller passes control of the application to a view, all of the heavy lifting has already been done. All of the data to be displayed to a user is wrapped up in a shiny package and presented to the view with a bow on top. In fact, some people prefer that views aren't objects at all, but rather simple scripts with placeholders into which precomputed values are dumped. I happen to take a slightly different approach, and think that views can and should contain methods of their own. These methods, quite obviously, should only contain presentation logic, such as choosing between displaying a logged-in user's information and a generic "log in or sign up" type of message. By separating this sort of logic out into methods, the main template of a view becomes that much cleaner.
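Here is a minimal sketch of a view that owns a little presentation logic, written in Python for illustration. The PageView class, the user_box() method, the shape of the data dictionary, and the /login link are all assumptions invented for the example.

```python
from html import escape


class PageView:
    def __init__(self, data):
        self.data = data  # precomputed values handed over by the controller

    def user_box(self):
        # Presentation logic only: choose between two ways of showing the header.
        user = self.data.get("user")
        if user is None:
            return '<a href="/login">Log in or sign up</a>'
        return f"Welcome back, {escape(user['name'])}"

    def render(self):
        # The main template stays clean because the branching lives in user_box().
        return (
            f"<header>{self.user_box()}</header>"
            f"<h1>{escape(self.data['title'])}</h1>"
        )
```

The view makes no decisions about what the data is, only about how it is shown.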
Conclusion
Before I looked into MVC in any depth, I was convinced that it was a needlessly complicated framework that I would spend weeks trying to wrap my mind around. In the end, it only took me an hour or two one morning to get everything sorted out.
I encourage you, if you're starting your first MVC project, to take the time to build your own mini framework for a simple project you want to build. While premade frameworks like CodeIgniter are great, I think it's always a good idea to try to roll your own at least once. By immersing yourself in the details, you will have a better understanding of what's going on should you decide to use an open source framework later on.
Happy coding!
Structuring Your URLs (or, URL-Driven Design)
September 23rd, 2010
For the sake of argument, I'm going to assume you're reading this blog directly on appsCanadian.ca and not in an RSS reader or some other fancy software. Now, I'd like to direct your attention to the address bar in your web browser, and take note of the URL:
http://www.appscanadian.ca/archives/structuring-your-urls-or-url-driven-design
In the old days of the internet, this URL would indicate to the reader that I had created a real directory on my web server called "archives" and in that directory I had placed a file called "structuring-your-urls-or-url-driven-design" which contained the HTML you're viewing right now. These are not the old days of the internet, however, and I can assure you that there is no "archives" directory on my server, nor is there a file called structuring-your-urls-or-url-driven-design.
The way this sort of thing often happens today is that there is a file on the server (most often called .htaccess for Apache servers) which examines incoming requests and quietly rewrites them to point at a file elsewhere on the server. More often than not, all requests for web pages (i.e. content that isn't something static like CSS or images) get routed to the same file, usually called index.php. In fact, this page you're currently reading was processed by a file called index.php on my server, and that script knows to serve this particular page because of a GET variable called 'p' which has a value of 139. This means that, for all intents and purposes, asking appsCanadian.ca to serve /archives/structuring-your-urls-or-url-driven-design is exactly the same as asking it for /index.php?p=139.
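The mapping itself is simple enough to sketch. The following is a Python illustration of the idea, not how Apache or this blog actually does it: the POST_IDS table and the rewrite() function are hypothetical, and only the single entry for post 139 comes from the paragraph above.

```python
# Hypothetical lookup table from the decorative URL text to a post ID.
# Only this one entry (ID 139) comes from the example in the text.
POST_IDS = {"structuring-your-urls-or-url-driven-design": 139}


def rewrite(path):
    """Translate a pretty /archives/<name> path into the internal index.php form."""
    prefix = "/archives/"
    if not path.startswith(prefix):
        return None  # not a blog page; leave static files and the like alone
    name = path[len(prefix):].strip("/")
    post_id = POST_IDS.get(name)
    if post_id is None:
        return None
    return f"/index.php?p={post_id}"
```

With that table, rewrite("/archives/structuring-your-urls-or-url-driven-design") yields "/index.php?p=139", while unknown names and paths outside /archives/ yield None.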
Obviously, the server needs to understand how one URL maps to another, so it makes a good deal of sense to spend some time planning your URL design before you actually begin making your website. By planning your various URLs, you will force yourself to spend a good deal of time thinking about how your site is actually going to work, breaking your features into their component parts.
An Example
As I write this, the highest voted question on the popular programming Q&A site Stack Overflow is located at the following URL:
http://stackoverflow.com/questions/194812/list-of-freely-available-programming-books
Let's break down this URL into its component parts to see what makes it tick.
First is the typical site identification stuff, the "http://stackoverflow.com" part of the URL. We don't much care about that since it's pretty standard. I will say that you should give a bit of thought to how you'll deal with subdomains. Google treats subdomains as separate sites, so it's probably best to keep everything under one roof - either have everything under a 'www' subdomain, or no subdomain at all.
The real fun begins when we examine the path. The first part of the path points to a "directory" called "/questions" which (as with my blog example above) doesn't actually exist. The first "directory" in a dynamic URL like this typically points to a specific "controller" in the application. In MVC-based applications (which I'll talk about in more detail in a future post), controllers can sort of be described as "sub-applications." A website is typically made up of several different controllers which can perform a variety of actions. They denote the boundaries of a specific portion of the website. In this case, the "questions" controller is in charge of, at a minimum, submitting and displaying questions on the site.
Next, there is another component which looks like a directory called '194812'. The precise meaning of this is something which is left to the specific site to determine, but in a lot of cases, the "second directory" will be a specific action that the first controller is meant to execute. This could take the form of a URL like "/questions/display" - the meaning of which should be obvious: the site is to 'display' a 'question' which is specified later in the URL. In the case of Stack Overflow, the website designers have decided to forgo the 'action' directive, and instead immediately give a target for which a default action will be applied. Specifically, the number 194812 is a question identifier, and the system knows that the default action for such an identifier is to display the question and its answers.
Finally, we have what looks like a filename. As with the blog example above, I can assure you that there is no file on Stack Overflow's web servers which has the filename list-of-freely-available-programming-books (with the possible exception of some sort of file-based cache). Instead, the filename-like identifier "list-of-freely-available-programming-books" is what is referred to as a "slug." A slug is a bit of information which is contained in a URL entirely for decorative purposes. A site designer may choose to include additional information in a URL to make it more descriptive and alluring to end users, or to search engines. In the example, the words "list-of-freely-available-programming-books" are actually the title of the question being displayed. In most cases, slugs can be changed or removed entirely without altering how the site displays the page.
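The breakdown above can be expressed as a short parsing function. This is a Python sketch using the standard urllib.parse module; the function name and the dictionary keys are invented for the example, and it says nothing about how Stack Overflow actually parses its URLs.

```python
from urllib.parse import urlsplit


def parse_question_url(url):
    """Split a URL into the parts discussed above."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    has_id = len(segments) > 1 and segments[1].isdecimal()
    return {
        "host": parts.netloc,
        "controller": segments[0] if segments else None,
        "question_id": int(segments[1]) if has_id else None,
        # The trailing text is decoration only; it may be missing entirely.
        "decoration": segments[2] if len(segments) > 2 else None,
    }
```

Because the decorative segment is optional in this sketch, dropping it from the URL still produces the same controller and question ID.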
Designing Your Structure
Figuring out how your URLs will look really isn't the hardest part of designing a website, but I do think it is an important part. By working out your URL structures, you get a feel for how your site actually comes together. One way of doing this is to write out a brief explanation of what your site is and how you expect it to work. Here's an example I wrote up for Stack Overflow:
Stack Overflow is a Q&A site for programmers. Users can ask and answer questions and are awarded reputation points when the community decides they have a) asked a clear and useful question, or b) provided a correct answer to a question. These points are awarded based on votes provided by the community of users. Questions can be assigned up to 5 tags relating to the question, such as programming languages or platforms identified in the question. These tags can be edited by other users with a sufficient amount of reputation, and the question itself can be edited by users with even higher levels of reputation. Participation in the site is also rewarded by a set of "badges" which are awards for performing tasks within the system. Users can find questions by browsing based on one or more tags, or by the age or vote total of a question. When viewing a question, answers can also be sorted by age and votes. Similarly, the site will have a search function for finding questions containing certain language.
This is a pretty basic explanation of the site, but it works well enough. If you read over that description, you can probably identify some distinct sections of the site, and some actions which can be performed within those sections. Rather than review Stack Overflow's actual URL structure (of which I have an incomplete knowledge), let's try to create our own URL structure.
The key components of the site seem to be asking questions, providing answers, and voting. Users also specify tags and can be awarded badges for good behaviour. Users can search for questions based on terms and tags, or they can browse questions freely.
Let's use these key terms to create some "controllers":
- /questions
- /answers
- /votes
- /users
- /tags
- /badges
- /search
- /browse
These are the preliminary candidates, and not all of them may be suitable. Use your own judgment regarding which potential controllers will actually work for your own website.
Next, we'll need to identify actions for these controllers to take. For brevity, let's just review the questions controller:
- The whole point of the site is for users to ask questions. So 'ask' is a good action for a question.
- Once asked, the question will need to be available for other users to answer, which means we'll need to display it. Let's add a 'display' action.
- We need to be able to uniquely identify the question, so we'll use an ID number to tell the 'display' action what to show us.
- For SEO purposes, let's also add a slug based on the question.
- We have to display answers with our questions, and answers can be sorted in a number of ways:
  - Newest
  - Oldest
  - Most votes
- A user may want to edit a question he or she asked.
- A user may want to delete his or her question completely.
This gives us the following URL options for questions:
/questions
  /ask
  /display
    /id
      /slug
        /new
        /old
        /votes
This seems pretty good to me, but there are two issues you might want to consider:
- The first potential issue is that a slug should be regarded as decoration, and the absence or misspelling of it shouldn't be fatal to your application. You can easily store the "preferred slug" for an item in your database, and redirect to the proper form of your URL when an incorrect version is received.
- The second potential issue is that the /new, /old, and /votes sort items probably shouldn't form their own URLs. To a search engine, these will all appear to be separate links, and your potential page rank will be split across them. These options should probably be set in GET variables, but if you're really dead set on having perfectly pretty URLs, you can get around this by using browser redirects to invisibly rewrite a pretty URL into a GET variable.
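Both suggestions can be sketched together. The following Python illustration assumes a hypothetical PREFERRED_SLUGS table standing in for the database, a "sort" GET variable, and a /questions/<id>/<slug> URL shape; none of these names come from a real site.

```python
from urllib.parse import urlencode

# Hypothetical store of preferred slugs keyed by question ID (a database in practice).
PREFERRED_SLUGS = {194812: "list-of-freely-available-programming-books"}
SORT_OPTIONS = {"new", "old", "votes"}


def canonical_url(question_id, sort=None):
    """Build the one true URL for a question; sorting goes in a GET variable."""
    url = f"/questions/{question_id}/{PREFERRED_SLUGS[question_id]}"
    if sort in SORT_OPTIONS:
        url += "?" + urlencode({"sort": sort})
    return url


def resolve(question_id, slug=None, sort=None):
    """Return (status, location): 200 to serve, 301 to redirect, 404 if unknown."""
    if question_id not in PREFERRED_SLUGS:
        return (404, None)
    target = canonical_url(question_id, sort)
    # A missing or misspelled slug is not fatal; just redirect to the proper form.
    if slug != PREFERRED_SLUGS[question_id]:
        return (301, target)
    return (200, target)
```

A request with a misspelled slug gets a 301 to the canonical address, and the sort order never changes the path, so every variation of the page shares one URL.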
If you really think about your site, and what you want to do with it, it should be pretty straightforward to design a URL strategy. When you're done, you should have a better overall grasp on how your site will work, and the rest of your design will start to fall into place.