Drupal 7 Line by Line Part 4 - DRUPAL_BOOTSTRAP_PAGE_CACHE

Welcome to the fourth part of the Drupal 7 Line by Line series of articles.

In this series I've been walking through the Drupal page load process by reading the code line by line. So far I've covered index.php, the logic of the drupal_bootstrap function and the first bootstrap phase (see the links at the bottom of this article if you want to catch up).

Today I'm going to cover phase two of the bootstrap process. This is where things start to get a bit more interesting as we see how Drupal handles page caching and how it supports pluggable caching backends.

Phase two starts with a call to _drupal_bootstrap_page_cache() from the drupal_bootstrap function.

_drupal_bootstrap_page_cache()

<?php
/**
 * Bootstrap page cache: Try to serve a page from cache.
 */
function _drupal_bootstrap_page_cache() {
?>

The first thing that Drupal does is declare the global $user variable, which is used throughout the page load to represent the user who is requesting the page. At this point it is completely empty. In fact, at this stage Drupal has no idea who or what is requesting the page.

Drupal then includes cache.inc. As you might guess, this file contains Drupal's caching functions. But it also contains the first class we come across during the page load. Yes, real honest-to-goodness object-oriented code. The file defines the DrupalCacheInterface interface and a class that implements it: DrupalDatabaseCache.

<?php
/**
 * Default cache implementation.
 *
 * This is Drupal's default cache implementation. It uses the database to store
 * cached data. Each cache bin corresponds to a database table by the same name.
 */
class DrupalDatabaseCache implements DrupalCacheInterface {
?>

No object of this class is created at this point in the page load. I just wanted to point out that the class and interface exist.
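
To make the idea of an interface with swappable implementations a bit more concrete, here is a minimal sketch. The names ExampleCacheInterface and ExampleArrayCache are hypothetical and the method list is deliberately trimmed down. It is not a copy of what cache.inc declares, just an illustration of the pattern.

<?php
// Hypothetical, simplified interface: any backend must provide these methods.
interface ExampleCacheInterface {
  public function get($cid);
  public function set($cid, $data);
  public function clear($cid);
}

// Hypothetical backend that keeps cached items in a plain PHP array.
class ExampleArrayCache implements ExampleCacheInterface {
  protected $bin;
  protected $items = array();

  public function __construct($bin) {
    $this->bin = $bin;
  }

  public function get($cid) {
    // Return FALSE on a cache miss.
    return isset($this->items[$cid]) ? $this->items[$cid] : FALSE;
  }

  public function set($cid, $data) {
    $this->items[$cid] = $data;
  }

  public function clear($cid) {
    unset($this->items[$cid]);
  }
}

$cache = new ExampleArrayCache('cache_page');
$cache->set('http://example.com/node/1', '<html>cached</html>');
$hit = $cache->get('http://example.com/node/1');
$cache->clear('http://example.com/node/1');
$miss = $cache->get('http://example.com/node/1');
?>

Code that only talks to the interface doesn't care whether the items live in an array, a database table or somewhere else, which is what makes the backend pluggable.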

Cache Backends

Back to _drupal_bootstrap_page_cache:

<?php
  foreach (variable_get('cache_backends', array()) as $include) {
    require_once DRUPAL_ROOT . '/' . $include;
  }
?>

In this snippet, Drupal loops through the configured cache backends and includes the file that defines the class that implements DrupalCacheInterface. This code requires a bit of explaining.

variable_get() is a wrapper function for reading the global $conf variable. As you'll recall $conf was defined in drupal_settings_initialize() as an array. If you read the settings.php file and paid close attention to the code comments you'll also recall that you may assign certain configuration values to the $conf array in your settings.php file.
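
Here is a minimal sketch of that lookup, assuming it does nothing more than read the global array and fall back to a default. example_variable_get() is a hypothetical stand-in so that it doesn't clash with the real function, and the 'cache' value is made up for the example.

<?php
// Stand-in for the array that settings.php (and later the database) fills in.
$GLOBALS['conf'] = array('cache' => 1);

// Hypothetical equivalent of the lookup described above.
function example_variable_get($name, $default = NULL) {
  global $conf;
  // Return the configured value if there is one, otherwise the default.
  return isset($conf[$name]) ? $conf[$name] : $default;
}

$cache = example_variable_get('cache');
$backends = example_variable_get('cache_backends', array());
?>

Because 'cache_backends' isn't set in this example, the second call returns the empty array passed as the default, which is why the foreach loop above simply does nothing on a site with no alternative backend.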

One such configuration value you might set in settings.php is $conf['cache_backends'][] = 'path/to/caching_include_file.inc';

So, if you've installed and configured an alternative caching system on your Drupal site, its caching code gets included here.
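
As a rough sketch, the relevant lines in settings.php might look like this. The path is the placeholder from above and ExampleCacheBackend is a hypothetical class name. As far as I know, a backend is normally paired with a setting such as cache_default_class that tells Drupal which class to instantiate, but check the documentation of the backend you are installing.

<?php
// In settings.php: include the file that defines the backend class.
$conf['cache_backends'][] = 'path/to/caching_include_file.inc';
// Assumption: name the class (hypothetical here) to use for cache bins.
$conf['cache_default_class'] = 'ExampleCacheBackend';
?>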

<?php
  // Check for a cache mode force from settings.php.
  if (variable_get('page_cache_without_database')) {
    $cache_enabled = TRUE;
  }
  else {
    drupal_bootstrap(DRUPAL_BOOTSTRAP_VARIABLES, FALSE);
    $cache_enabled = variable_get('cache');
  }
  drupal_block_denied(ip_address());
  // If there is no session cookie and cache is enabled (or forced), try
  // to serve a cached page.
?>

Now the code checks to see if $conf['page_cache_without_database'] was set in settings.php. If that setting is TRUE, page caching is forced on and the code carries on without touching the database. If, on the other hand, that setting doesn't exist, Drupal has to find out whether a site administrator has turned on caching. In order to find that out, Drupal has to bootstrap more of itself. This is done with a call to drupal_bootstrap(DRUPAL_BOOTSTRAP_VARIABLES, FALSE). Note that this is a recursive call! We got to this stage in the code execution via a call from within drupal_bootstrap(), and Drupal is calling drupal_bootstrap() again.

This is where things are a little tricky to follow. You'll remember that drupal_bootstrap() will go through all the bootstrap phases up to and including the one you are asking for (here: DRUPAL_BOOTSTRAP_VARIABLES). That means that this one call to drupal_bootstrap() is actually going to complete two bootstrap phases right here: DRUPAL_BOOTSTRAP_DATABASE and DRUPAL_BOOTSTRAP_VARIABLES.

I'm going to cover both bootstrap phase 3 (database) and phase 4 (variables) in separate articles. Just know that at this stage both need to run before variable_get('cache') can return the value an administrator saved.
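
If the recursion is hard to picture, this toy model might help. It is only a sketch: example_bootstrap() and the EXAMPLE_PHASE_* constants are hypothetical, the phases do no real work, and the only thing it tries to show is how a nested call can finish later phases while an earlier one is still running.

<?php
// Hypothetical phase constants for this sketch only.
define('EXAMPLE_PHASE_CONFIGURATION', 0);
define('EXAMPLE_PHASE_PAGE_CACHE', 1);
define('EXAMPLE_PHASE_DATABASE', 2);
define('EXAMPLE_PHASE_VARIABLES', 3);
define('EXAMPLE_PHASE_SESSION', 4);

// Records the order in which phases start and end.
$GLOBALS['example_log'] = array();

function example_bootstrap($phase) {
  // Static state is shared between the outer call and any nested call.
  static $remaining = array(
    EXAMPLE_PHASE_CONFIGURATION,
    EXAMPLE_PHASE_PAGE_CACHE,
    EXAMPLE_PHASE_DATABASE,
    EXAMPLE_PHASE_VARIABLES,
    EXAMPLE_PHASE_SESSION,
  );
  static $completed = -1;

  // Run every phase up to and including the one asked for.
  while ($remaining && $phase > $completed) {
    $current = array_shift($remaining);
    $completed = $current;
    $GLOBALS['example_log'][] = 'start ' . $current;
    if ($current == EXAMPLE_PHASE_PAGE_CACHE) {
      // The page cache phase needs variables, so it asks for more bootstrap.
      example_bootstrap(EXAMPLE_PHASE_VARIABLES);
    }
    $GLOBALS['example_log'][] = 'end ' . $current;
  }
}

example_bootstrap(EXAMPLE_PHASE_SESSION);
print implode("\n", $GLOBALS['example_log']) . "\n";
?>

In the printed log, phases 2 and 3 start and end between "start 1" and "end 1". When the outer loop resumes, it skips straight to phase 4 because the nested call already took the earlier phases off the shared list.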

Blocking IP Addresses

drupal_block_denied(ip_address());
Before Drupal bothers to do anything else it checks to see if the IP address requesting the page is blocked.

The ip_address() function either returns $_SERVER['REMOTE_ADDR'] or, if the site administrator has configured a reverse proxy like Squid or Varnish, checks the configuration to determine which HTTP header contains the real requesting IP address and returns that instead.
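
A simplified sketch of that decision is below. example_ip_address() is hypothetical and takes the server and configuration arrays as arguments so it can be run on its own. The setting names (reverse_proxy, reverse_proxy_header, reverse_proxy_addresses) are modelled on the reverse proxy options in settings.php, so treat them as assumptions and check your own settings file.

<?php
// Hypothetical, simplified version of the reverse proxy logic.
function example_ip_address(array $server, array $conf) {
  $ip_address = $server['REMOTE_ADDR'];

  if (!empty($conf['reverse_proxy'])) {
    // Which header carries the forwarded address(es)?
    $header = isset($conf['reverse_proxy_header']) ? $conf['reverse_proxy_header'] : 'HTTP_X_FORWARDED_FOR';
    if (!empty($server[$header])) {
      $trusted = isset($conf['reverse_proxy_addresses']) ? $conf['reverse_proxy_addresses'] : array();
      // The header may hold a comma separated chain of addresses.
      $forwarded = array_map('trim', explode(',', $server[$header]));
      $forwarded[] = $ip_address;
      // Drop the proxies we trust; the last address left is the client.
      $untrusted = array_diff($forwarded, $trusted);
      if ($untrusted) {
        $ip_address = array_pop($untrusted);
      }
    }
  }
  return $ip_address;
}

$direct = example_ip_address(array('REMOTE_ADDR' => '198.51.100.4'), array());
$proxied = example_ip_address(
  array('REMOTE_ADDR' => '10.0.0.1', 'HTTP_X_FORWARDED_FOR' => '203.0.113.7, 10.0.0.2'),
  array('reverse_proxy' => TRUE, 'reverse_proxy_addresses' => array('10.0.0.1', '10.0.0.2'))
);
?>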

drupal_block_denied() actually calls drupal_is_denied() to determine if the IP is blocked. This normally means querying the database, but see this comment in the drupal_is_denied() function:

<?php
 
// Only check if database.inc is loaded already. If
  // $conf['page_cache_without_database'] = TRUE; is set in settings.php,
  // then the database won't be loaded here so the IPs in the database
  // won't be denied. However the user asked explicitly not to use the
  // database and also in this case it's quite likely that the user relies
  // on higher performance solutions like a firewall.
?>

This should make sense if you could follow what I have written in this article to this point.
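
In case it doesn't, here is the same idea as a tiny sketch. example_is_denied() is hypothetical: the $database_loaded flag stands in for "database.inc has been loaded" and the $blocked_in_db array stands in for the table of blocked IPs that the real function would query.

<?php
// Hypothetical sketch of the rule described in the comment above.
function example_is_denied($ip, $database_loaded, array $blocked_in_db) {
  // Without the database there is nothing to check against.
  if (!$database_loaded) {
    return FALSE;
  }
  return in_array($ip, $blocked_in_db);
}

$blocked = array('192.0.2.1');
$with_db = example_is_denied('192.0.2.1', TRUE, $blocked);
$without_db = example_is_denied('192.0.2.1', FALSE, $blocked);
?>

The second call returns FALSE even though the address is in the list, which is exactly the trade-off the comment describes for sites that set page_cache_without_database.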

If Drupal discovers that an IP address is blocked:

<?php
    header($_SERVER['SERVER_PROTOCOL'] . ' 403 Forbidden');
    print 'Sorry, ' . check_plain(ip_address()) . ' has been banned.';
    exit();
?>

A 403 header is sent to the requester and a non-translatable, non-themeable page is sent back. Finally exit() is called, which means we are done. No more of Drupal is run. The page load is complete with very little code executed.

Serving a Cached Page

Most of the time the requester isn't going to be blocked, so the next thing that Drupal tries to do at this stage is to load a cached page. If a user is logged in or if caching is disabled, this bootstrap phase is complete and no cached page will be displayed.

The line if (!isset($_COOKIE[session_name()]) && $cache_enabled) { translates to: if the request carries no session cookie (which in practice means this user isn't logged in) and page caching is enabled, continue on and try to load a cached page.
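
Pulled out into a small function, the condition looks like the sketch below. example_should_try_page_cache() is hypothetical, and the cookie array and session name ('SESSabc') are passed in as made-up values so the three cases can be run without a real request.

<?php
// Hypothetical helper mirroring the condition above.
function example_should_try_page_cache(array $cookies, $session_name, $cache_enabled) {
  // No session cookie AND caching enabled (or forced).
  return !isset($cookies[$session_name]) && $cache_enabled;
}

// Anonymous visitor, caching on: try the cache.
$anonymous = example_should_try_page_cache(array(), 'SESSabc', TRUE);
// Visitor with a session cookie: skip the cache.
$logged_in = example_should_try_page_cache(array('SESSabc' => 'xyz'), 'SESSabc', TRUE);
// Caching turned off: skip the cache.
$disabled = example_should_try_page_cache(array(), 'SESSabc', FALSE);
?>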

I am not going to go into the code that actually gets the cached page because that is worthy of its own article. Just know that if Drupal can find a cached page it returns it and the page load is complete.

Summary

The whole point of the page cache bootstrap phase is to keep Drupal from running any more code than it needs to. If a requesting IP address is banned, very little code is run in order for Drupal to let the requester know they've been banned. If caching is enabled and the site is configured to load a cached page and NOT load the Drupal database system, two whole bootstrap phases are avoided. If caching does require the database, two more bootstrap phases are run, but a cached page is displayed if possible and code execution stops.

Of course, even with caching enabled it is possible that a cached version of a page is not found or that the cached version has expired. In those cases Drupal must carry on through the bootstrap process, which I will cover in upcoming articles.

January 6th 2011 4AM
By: andre

 

Comments

I really like your Drupal 7 line by line. I learned, and it helps.

Great work!