
Elegant Code Patterns—Set and Return a Variable in One Line

I’ve been working on a Project that Never Ends™ for well… months now. Many months. The project has moved into phase I’ve-lost-count, and the current milestone work is speed improvement. The project is a wonderful example of what goes wrong when you don’t have a detailed plan to start with, and it has evolved into a terrible mess of convoluted code. The first step in any speed improvement is to remove as much duplicated work as possible. One way of doing that is by caching calculations. And you can do that with a very small amount of code, using an elegant code pattern: setting and returning a variable in a single statement.

The premise

When a complicated calculation is going to be used multiple times on a page, efficiency dictates saving the result (so that the calculation is only done once). So, instead of calculating the result every time you need it, why not cache the value? You can do this using a global variable and a single function. That way, any time you need the calculation, you can call the same function—including the first time—but only calculate the value once.

A case study

A function exists to get the total sales for a specific customer during a given year. It does so by querying a database table that holds thousands of entries, one for each sales transaction. This time-expensive function is called multiple times on the page, and it causes the page to take far too long to load.

It’s really a rather simple function:

function getSales($customer,$year) {
     $db_record = query_database("SELECT sum(sales) as total »
                FROM sales_records WHERE customer='$customer' »
                AND year='$year'");
     return $db_record->total;
}

Line breaks marked by »

The function above glosses over the actual database work (it’s all pseudocode; the functions don’t really exist), but you get the idea. It’s simple, but expensive when you’re querying many thousands of records each time.

By caching the result, we can reduce the page load time by quite a bit.

$CACHE = array();
function getSales($customer,$year) {
     global $CACHE;
     if (isset($CACHE[$customer][$year])) return $CACHE[$customer][$year];
     $db_record = query_database("SELECT sum(sales) as total »
                FROM sales_records WHERE customer='$customer' »
                AND year='$year'");
     return $CACHE[$customer][$year] = $db_record->total;
}
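
To watch the cache at work, here is a minimal runnable sketch of the same function. Since query_database is pseudocode, the version below is a made-up stand-in that returns a fixed total and counts how often it is called; the $QUERY_COUNT variable, the 'acme' customer and the figure 1234.50 are all assumptions for illustration, not part of the real project.

```php
<?php
// Hypothetical stand-in for the article's query_database(): it ignores the SQL,
// returns a fixed total, and counts how many times the "database" is hit.
$QUERY_COUNT = 0;
function query_database($sql) {
    global $QUERY_COUNT;
    $QUERY_COUNT++;
    $record = new stdClass();
    $record->total = 1234.50;   // made-up figure for illustration
    return $record;
}

$CACHE = array();
function getSales($customer, $year) {
    global $CACHE;
    // Cache hit: return early and skip the expensive query.
    if (isset($CACHE[$customer][$year])) return $CACHE[$customer][$year];
    $db_record = query_database("SELECT sum(sales) as total
                FROM sales_records WHERE customer='$customer'
                AND year='$year'");
    // Cache miss: store the total and return it in one statement.
    return $CACHE[$customer][$year] = $db_record->total;
}

$first  = getSales('acme', 2023);   // runs the query
$second = getSales('acme', 2023);   // served from $CACHE
$other  = getSales('acme', 2024);   // different year, so a second query
echo $QUERY_COUNT, "\n";            // prints 2
```

Three calls, two queries: the repeated customer/year pair never reaches the database a second time.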

Explaining the new code, line by line

$CACHE = array();
This line of code simply creates a global variable (a variable outside the scope of our function) to use as a cache. In this case, we’re using an array.
global $CACHE;
Now that we’re in the function, we need to get access to our global variable. This line of code does just that. It declares that any time the variable $CACHE is used, it is referring to the global version.
if (isset($CACHE[$customer][$year])) return $CACHE[$customer][$year];
This is where the cache magic happens. If this is the second or later time the function has been called with the same customer and year, the value has already been cached. We check to see if that cache entry exists, and if it does, we return it. That exits the function immediately, meaning we return the correct value, but don’t do the expensive calculation.
$db_record = query_database…
This line is the calculation from the unoptimized function. If the result hasn’t been cached (that is, this is the first time the function has been called for this customer and year), we have to do the calculation. There’s no need to wrap this in an else clause, though, because when the previous if statement is true, this code is never reached.
return $CACHE[$customer][$year] = $db_record->total;

This is the magic from the title of the blog post. It sets and returns a value (the value to the right of the assignment operator [=]) at the same time. You’ve probably written something like this before:

$variable = $value;
return $variable;

In PHP, however, you can do that same thing with a single line of code:

return $variable = $value;

It’s elegant and simple. That’s what we’re doing in this line.
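
As a small sketch of why this works: in PHP an assignment is itself an expression, and its value is whatever was assigned, so return can hand that value straight back. The $STORE array and the remember function below are hypothetical names invented for this example.

```php
<?php
$STORE = array();

// Hypothetical helper: stores $value under $key and returns it in one statement.
function remember($key, $value) {
    global $STORE;
    // The assignment runs first; its result (the assigned value) is returned.
    return $STORE[$key] = $value;
}

$returned = remember('answer', 42);

var_dump($returned);          // int(42)
var_dump($STORE['answer']);   // int(42)
```

Both the caller and the store end up with the same value, from a single line inside the function.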

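Caveat: what gets returned is the value that was assigned, not a reference to the variable it was assigned to. In PHP that works for any value, whether it is a single array item, as in our function, or a whole array. The exception is a function declared to return by reference (function &name()): a reference has to point at a variable, not at the result of an expression, so there you still need the two lines of code.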
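
One situation where the two-line form is needed is a function that returns by reference, because a reference has to point at an actual variable. The sketch below uses a hypothetical getCustomerCache function (not part of the original project) that sets the variable first and then returns it on a separate line.

```php
<?php
$CACHE = array();

// Hypothetical helper: returns a reference to one customer's slice of the cache.
function &getCustomerCache($customer) {
    global $CACHE;
    if (!isset($CACHE[$customer])) {
        $CACHE[$customer] = array();   // set the variable first...
    }
    return $CACHE[$customer];          // ...then return the variable itself
}

$slice = &getCustomerCache('acme');    // bind to the returned reference
$slice[2023] = 1500;                   // writes through to $CACHE
echo $CACHE['acme'][2023], "\n";       // prints 1500
```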

The result

I’m not going to post precise execution times here, because your results will vary widely depending on your project and data. But, to give an idea, the project this case study is based on saw its average page load time drop from 20 seconds to 4 seconds. Nothing else was done to the code beyond caching the results of the calculation. Your savings may be smaller or larger. In my case, we were dealing with a database containing more than 15,000 individual records that were being looked at each time the calculation query was made.

Depending on your exact project, there can be more improvements to this code. For instance, lookups in a multi-dimensional array are more expensive than ones in a one-dimensional array or an integer variable. But the concept transcends projects: cache and return the value at once, and you can save precious time by not recalculating each time the function is called.
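
As one possible take on the one-dimensional idea, the sketch below joins the customer and year into a single string key. It is only an illustration: query_database is again a made-up stand-in, the '|' separator assumes customer names never contain that character, and whether the flat key is measurably faster is something to benchmark on your own data.

```php
<?php
// Hypothetical stand-in for query_database(), counting calls as before.
$QUERY_COUNT = 0;
function query_database($sql) {
    global $QUERY_COUNT;
    $QUERY_COUNT++;
    $record = new stdClass();
    $record->total = 1234.50;   // made-up figure for illustration
    return $record;
}

$CACHE = array();
function getSales($customer, $year) {
    global $CACHE;
    // One flat key instead of a nested [$customer][$year] lookup.
    // Assumes '|' never appears in a customer name.
    $key = $customer . '|' . $year;
    if (isset($CACHE[$key])) return $CACHE[$key];
    $db_record = query_database("SELECT sum(sales) as total
                FROM sales_records WHERE customer='$customer'
                AND year='$year'");
    return $CACHE[$key] = $db_record->total;
}

getSales('acme', 2023);
getSales('acme', 2023);
echo $QUERY_COUNT, "\n";   // prints 1
```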

Have you seen or used this code pattern before? Do you love it or hate it?