Elegant Code Patterns—Set and Return a Variable in One Line

I’ve been working on a Project that Never Ends™ for, well, months now. Many months. The project has moved into phase I’ve-lost-count, and the current milestone is speed improvement. The project is a wonderful example of what goes wrong when you don’t start with a detailed plan, and it has evolved into a terrible mess of convoluted code. A good first step in any speed improvement is to cut out as much duplicated work as possible. One way of doing that is by caching calculations, and you can do it with very little code, using an elegant pattern: setting and returning a variable in a single statement.

The premise

When a complicated calculation is going to be used multiple times on a page, efficiency dictates saving the result (so that the calculation is only done once). So, instead of calculating the result every time you need it, why not cache the value? You can do this using a global variable and a single function. That way, any time you need the calculation, you can call the same function—including the first time—but only calculate the value once.

A case study

A function exists to get the total sales for a specific customer during a given year. It does so by querying a database table that holds many thousands of rows, one for each sales transaction. This expensive function is called multiple times on the page and is causing the page to take far too long to load.

It’s really a rather simple function:

function getSales($customer,$year) {
     $db_record = query_database("SELECT sum(sales) as total »
                FROM sales_records WHERE customer='$customer' »
                AND year='$year'");
     return $db_record->total;
}

Line breaks marked by »

The above function glosses over the actual database work (query_database() is pseudo code; the function doesn’t really exist, and real code should bind $customer and $year as query parameters rather than interpolating them into the SQL), but you get the idea. It’s simple, but expensive when you’re querying many thousands of records each time.

By caching the result, we can reduce the page load time by quite a bit.

$CACHE = array();
function getSales($customer,$year) {
     global $CACHE;
     if (isset($CACHE[$customer][$year])) return $CACHE[$customer][$year];
     $db_record = query_database("SELECT sum(sales) as total »
                FROM sales_records WHERE customer='$customer' »
                AND year='$year'");
     return $CACHE[$customer][$year] = $db_record->total;
}

Explaining the new code, line by line

$CACHE = array();
This line of code simply creates a global variable (a variable outside the scope of our function) to use as a cache. In this case, we’re using an array.
global $CACHE;
Now that we’re in the function, we need to get access to our global variable. This line of code does just that. It declares that any time the variable $CACHE is used, it is referring to the global version.
if (isset($CACHE[$customer][$year])) return $CACHE[$customer][$year];
This is where the cache magic happens. If the function has already been called with this customer and year, the value has already been cached. We check to see if the matching cache value exists, and if it does, we return it. That exits the function immediately, meaning we return the correct value, but don’t do the expensive calculation.
$db_record = query_database…
This line is the calculation from the unoptimized function. If the result hasn’t been cached (that is, this is the first time the function has been called with this customer and year), we have to do the calculation. There’s no need to wrap this in an else clause, though, because when the previous if statement is true, this code is never reached.
return $CACHE[$customer][$year] = $db_record->total;

This is the magic from the title of the blog post. It sets and returns a value (the value to the right of the assignment operator [=]) at the same time. You’ve probably written something like this before:

$variable = $value;
return $variable;

In PHP, however, an assignment is itself an expression that evaluates to the assigned value, so you can do the same thing with a single line of code:

return $variable = $value;

It’s elegant and simple. That’s what we’re doing in this line.
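
To see the pattern run end to end, here is a minimal sketch. Since query_database() is pseudo code, it swaps in a hypothetical expensive_total() function with made-up figures, plus a call counter ($QUERY_COUNT, also an assumption of this sketch) so you can check that the expensive part only runs once per customer and year.

```php
<?php
// Hypothetical stand-in for the expensive database query in the case study.
// It counts its calls so we can see how often the "query" really runs.
$QUERY_COUNT = 0;
function expensive_total($customer, $year) {
    global $QUERY_COUNT;
    $QUERY_COUNT++;
    // Made-up figures, for illustration only.
    $totals = array('acme' => array(2010 => 1500, 2011 => 1750));
    return isset($totals[$customer][$year]) ? $totals[$customer][$year] : 0;
}

$CACHE = array();
function getSales($customer, $year) {
    global $CACHE;
    // Cache hit: return early and skip the expensive call.
    if (isset($CACHE[$customer][$year])) return $CACHE[$customer][$year];
    // Cache miss: store the result and return it in one statement.
    return $CACHE[$customer][$year] = expensive_total($customer, $year);
}

getSales('acme', 2010); // miss, runs the stand-in
getSales('acme', 2010); // hit, served from $CACHE
getSales('acme', 2011); // miss, different year
echo $QUERY_COUNT, "\n"; // prints 2
```

One thing to keep in mind: isset() reports false for a null value, so a total that comes back as null would be recalculated on every call.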

Caveat: the pattern is not limited to simple values; you can return one item of an array, as we do in the function, or assign and return a whole array. The one case where it does not fit is a function declared to return by reference, because PHP expects a variable there rather than an expression. That must be done with the two lines of code.

The result

I’m not going to put precise empirical evidence (execution times) on here, because your results will vary widely depending on your project and data. But, to give an idea, the project the case study is based on saw its average page load time drop from 20s to 4s. Not a single other thing was done to the code beyond caching the values of the calculation. You may see a smaller or larger saving. In my case, we were dealing with a database containing more than 15,000 individual records that were being looked at each time the calculation query was made.

Depending on your exact project, there can be more improvements to this code. For instance, lookups in a multi-dimensional array are more expensive than ones in a one-dimensional array or an integer variable. But the concept transcends projects: cache and return the value at once, and you can save precious time by not recalculating each time the function is called.
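
As one possible take on the one-dimensional idea, the sketch below builds a single string key from the customer and year. The function names getSalesFlat() and slow_total(), the made-up total, and the "|" separator are all assumptions for illustration; whether this is measurably faster than the nested array depends on your data, so time it before committing to it.

```php
<?php
// Hypothetical stand-in for the expensive query; the figure is made up.
function slow_total($customer, $year) {
    return 1500;
}

$CACHE = array();
function getSalesFlat($customer, $year) {
    global $CACHE;
    // One string key instead of two nested array levels.
    // Assumption: the "|" separator never appears in a customer name.
    $key = $customer . '|' . $year;
    if (isset($CACHE[$key])) return $CACHE[$key];
    // Same set-and-return pattern, against a one-dimensional cache.
    return $CACHE[$key] = slow_total($customer, $year);
}

echo getSalesFlat('acme', 2010), "\n"; // prints 1500
```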

Have you seen or used this code pattern before? Do you love it or hate it?

Elegant Code Patterns—Drop the Else When Code is Repeated

I’m working on launching a new project that has been eating up a lot of my time. That is partly because any new project has a habit of eating up a lot of time, but also because I intend to use this project as a solid base for a few others. I want my code to be elegant, my ideas well executed, and my implementation good enough that I won’t want to scrap it and start over in the future, as I do far too often. Part of that is employing elegant code patterns. One of my most-used patterns involves dropping the else statement when I have code repeated in different logic clauses.

The premise

One of the countless ways of using the if…else construct is to do one thing if a certain variable is one value, or do something else if it is not. A lot of times this results in duplicating code. Duplicate code is inelegant, and introduces a greater opportunity for error when, for instance, one line is changed, but its clone is not.

The solution: structure your code to reduce duplicate code.

A case study

A project is using a CMS that provides a tag to retrieve the path to a featured image for a page. If no such image exists, it returns the path for an empty filler image. There’s no way within the CMS to change what that filler image is, but this project calls for using something else as a filler.

The goal: display the proper featured image or correct placeholder using an img tag.

The process: get the image path, and check if it is the default filler image. If it is, change the path to the new filler. Display the correct image.

The inherited code looked something like this (PHP):

$image_path = get_image_path($page);
if ($image_path == "/path/to/default_filler.png") {
    echo '<img src="/path/to/desired/filler.png" alt=" " />';
} else {
    echo "<img src=\"$image_path\" alt=\" \" />";
}

In this case, the logical clauses are doing the exact same thing: printing out an image tag. The only difference is the path being printed. What if you later decide to add in an alt value, or reuse this code on an HTML page instead of XHTML? You might forget to change both lines. There’s a more elegant way.

The refined code

Refining the code, we can completely drop the else statement. Instead of printing the tag in each clause, we simply change the value of $image_path to be equal to the new filler image path if it is set to the wrong one. Then, we print the image tag using the value of $image_path.

$image_path = get_image_path($page);
if ($image_path == "/path/to/default_filler.png") {
    $image_path = "/path/to/desired/filler.png";
}
echo "<img src=\"$image_path\" alt=\" \" />";

Depending on your school of thought on using braces with logical constructs, you can even reduce this code to three lines. But, regardless of brace use, the refined code is a lot more elegant and maintainable than the old code.

Going further: in this case the code was only being used once, but if you needed it in several places, you would turn it into a function to be called each time rather than copying the code.
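
A minimal sketch of that function might look like the following. The name featured_image_tag() is hypothetical, get_image_path() is stubbed out here as a stand-in for the CMS tag, and the htmlspecialchars() call is an addition of this sketch rather than part of the inherited code.

```php
<?php
// Hypothetical stand-in for the CMS tag: returns a page's featured image
// path, or the CMS's default filler when the page has no image.
function get_image_path($page) {
    $paths = array('about' => '/images/about.png');
    return isset($paths[$page]) ? $paths[$page] : '/path/to/default_filler.png';
}

// Builds the img tag in exactly one place, so a markup change
// (an alt value, HTML instead of XHTML) only has to be made once.
function featured_image_tag($page) {
    $image_path = get_image_path($page);
    if ($image_path == "/path/to/default_filler.png") {
        $image_path = "/path/to/desired/filler.png";
    }
    // Escaping the path is an extra precaution added in this sketch.
    return '<img src="' . htmlspecialchars($image_path) . '" alt=" " />';
}

echo featured_image_tag('about'), "\n";
echo featured_image_tag('missing'), "\n";
```

Returning the tag instead of echoing it inside the function is a design choice: the caller decides where the markup ends up.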

Any time you find yourself repeating code, especially inside of logical constructs, see if you can simplify and refine your code. Doing so makes the code easier to read and easier to update, and it is often more efficient as well.

Converting From Named to Numbered Entities

The Web is ever changing, and this article is relatively ancient having been published 9 years ago. It is likely out of date or even blatantly incorrect in relation to modern better practices, so proceed at your own risk.

I’ve been having some feed issues lately, thanks to my propensity for using proper(ish) typography, such as real quotes (“ = &ldquo;, ” = &rdquo;, ‘ = &lsquo;, ’ = &rsquo;) in my content and headlines. The problem is that XML predefines only five named entities (&amp;, &lt;, &gt;, &quot;, and &apos;), so the named HTML entities are undefined in a feed unless a DTD declares them. My feed-generation code had some conversion set up using html_entity_decode() with a declared charset of UTF-8 for both the document and the decoded entities, which can represent them all, but for whatever reason I had no luck; it was still generating invalid RSS feeds. Matt Robinson’s code for conversion to numbered entities fixed it all up nice and clean-like. Thanks, man.
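
For a rough idea of what such a conversion involves, here is a minimal sketch. It is not Matt Robinson’s code, which isn’t reproduced here; the function name named_to_numeric_entities() is hypothetical, and the sketch assumes PHP’s mbstring extension is available.

```php
<?php
// Convert named HTML entities (&ldquo;, &nbsp;, ...) to numeric ones
// (&#8220;, &#160;, ...) so the text is safe to place in an XML feed.
function named_to_numeric_entities(string $text): string {
    // The five named entities XML itself predefines are left untouched.
    $xml_safe = array('&amp;', '&lt;', '&gt;', '&quot;', '&apos;');

    return preg_replace_callback(
        '/&[A-Za-z][A-Za-z0-9]*;/',
        function ($match) use ($xml_safe) {
            $entity = $match[0];
            if (in_array($entity, $xml_safe, true)) {
                return $entity;
            }
            // Decode the named entity to its UTF-8 character(s).
            $decoded = html_entity_decode($entity, ENT_QUOTES | ENT_HTML5, 'UTF-8');
            if ($decoded === $entity) {
                return $entity; // not a known entity, leave it alone
            }
            // Re-encode every decoded character as a decimal numeric entity.
            return mb_encode_numericentity(
                $decoded,
                array(0x0, 0x10FFFF, 0, 0x1FFFFF),
                'UTF-8'
            );
        },
        $text
    );
}

echo named_to_numeric_entities('&ldquo;Hi&rdquo; &amp; bye'), "\n";
// prints &#8220;Hi&#8221; &amp; bye
```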