Elegant Code Patterns—Set and Return a Variable in One Line

I’ve been working on a Project that Never Ends™ for well… months now. Many months. The project has moved into phase I’ve-lost-count, and the current milestone work is speed improvement. The project is a wonderful example of what goes wrong when you don’t have a detailed plan to start with, and has evolved into a terrible mess of convoluted code. The first step in any speed improvement is to reduce as much duplication as possible. One way of doing that is by caching calculations. And you can do that with a very small amount of code, using an elegant code pattern of setting and returning a variable in a single call.

The premise

When a complicated calculation is going to be used multiple times on a page, efficiency dictates saving the result (so that the calculation is only done once). So, instead of calculating the result every time you need it, why not cache the value? You can do this using a global variable and a single function. That way, any time you need the calculation, you can call the same function—including the first time—but only calculate the value once.

A case study

A function exists to get the total sales for a specific customer during a given year. It does so by querying a database that has thousands of entries, one for each sales transaction. This time-expensive function is used multiple times on the page and is causing the page to take far too long to load.

It’s really a rather simple function:

function getSales($customer,$year) {
     $db_record = query_database("SELECT sum(sales) as total »
                FROM sales_records WHERE customer='$customer' »
                AND year='$year'");
     return $db_record->total;
}

Line breaks marked by »

While the above function glosses over the actual database calculation (that’s all pseudo code, the functions don’t really exist), you get the idea. It’s simple, but expensive when you’re querying many thousands of records each time.

By caching the result, we can reduce the page load time by quite a bit.

$CACHE = array();
function getSales($customer,$year) {
     global $CACHE;
     if (isset($CACHE[$customer][$year])) return $CACHE[$customer][$year];
     $db_record = query_database("SELECT sum(sales) as total »
                FROM sales_records WHERE customer='$customer' »
                AND year='$year'");
     return $CACHE[$customer][$year] = $db_record->total;
}

Explaining the new code, line by line

$CACHE = array();
This line of code simply creates a global variable (a variable outside the scope of our function) to use as a cache. In this case, we’re using an array.
global $CACHE;
Now that we’re in the function, we need to get access to our global variable. This line of code does just that. It declares that any time the variable $CACHE is used, it is referring to the global version.
if (isset($CACHE[$customer][$year])) return $CACHE[$customer][$year];
This is where the cache magic happens. If this is the second or later time the function has been called, the value has already been cached. We check to see if the correct cache value exists, and if it does, we return it. That exits the function immediately, meaning we return the correct value, but don’t do the expensive calculation.
$db_record = query_database…
This line is the calculation from the unoptimized function. If the result hasn’t been cached (that is, this is the first time the function has been called), we have to do the calculation. There’s no need to wrap this in an else clause, though, because when the previous if statement is true, this code is never reached.
return $CACHE[$customer][$year] = $db_record->total;

This is the magic from the title of the blog post. It sets and returns a value (the value to the right of the assignment operator [=]) at the same time. You’ve probably written something like this before:

$variable = $value;
return $variable;

In PHP, however, you can do that same thing with a single line of code:

return $variable = $value;

It’s elegant and simple. That’s what we’re doing in this line.

Caveat: an assignment expression evaluates to the value that was assigned, so this pattern hands back a copy of the value, not a reference to the cache entry. That is exactly what we want for a simple value like the one in our function, and it holds for whole arrays as well. If you need to return by reference instead, that must be done with the two lines of code.
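To see the pattern run end to end, here is a minimal sketch. Since query_database() is pseudo code, the slowTotal() function below is a hypothetical stand-in that simply counts how often the "expensive" work happens; the name getSalesCached() and the placeholder arithmetic are also assumptions made for illustration.

```php
<?php
// Hypothetical stand-in for the expensive query; counts how often it runs.
$QUERY_COUNT = 0;
function slowTotal($customer, $year) {
    global $QUERY_COUNT;
    $QUERY_COUNT++;
    return strlen($customer) * $year; // placeholder arithmetic, not real sales data
}

$CACHE = array();
function getSalesCached($customer, $year) {
    global $CACHE;
    // Cache hit: return early and skip the expensive work.
    if (isset($CACHE[$customer][$year])) return $CACHE[$customer][$year];
    // Cache miss: store the result and return it in a single statement.
    return $CACHE[$customer][$year] = slowTotal($customer, $year);
}

$first  = getSalesCached('acme', 2009); // runs slowTotal()
$second = getSalesCached('acme', 2009); // served from $CACHE
echo "$first $second $QUERY_COUNT\n";   // prints "8036 8036 1"
```

Both calls return the same value, but the counter shows the slow function only ran once.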

The result

I’m not going to put precise empirical evidence (execution times) on here, because your results will vary widely depending on your project and data. But, to give an idea, the project the case study is based on saw an average page load time reduction from 20s to 4s. Not a single other thing was done to the code beyond caching the values of the calculation. Your results may see less of a savings or more. In my case, we were dealing with a database containing more than 15,000 individual records that were being looked at each time the calculation query was made.

Depending on your exact project, there can be more improvements to this code. For instance, lookups in a multi-dimensional array are more expensive than ones in a one-dimensional array or an integer variable. But the concept transcends projects: cache and return the value at once, and you can save precious time by not recalculating each time the function is called.
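One possible refinement, sketched under a few assumptions: the cache lives in a static variable inside the function instead of a global, and the customer and year are joined into a single flat key. The helper name cachedTotal(), the callable parameter and the '|' separator are all hypothetical. The sketch also swaps isset() for array_key_exists(), because isset() reports false for a cached null (which is what SQL's SUM() gives back when no rows match), and that would trigger a recalculation every time.

```php
<?php
// Hypothetical variant: function-local static cache with a one-dimensional key.
function cachedTotal($customer, $year, callable $compute) {
    static $cache = array();
    $key = $customer . '|' . $year; // assumes '|' never appears in a customer name
    // array_key_exists() also counts entries whose cached value is null.
    if (array_key_exists($key, $cache)) return $cache[$key];
    return $cache[$key] = $compute($customer, $year);
}

// Usage: the closure stands in for the expensive calculation.
$calls = 0;
$compute = function ($customer, $year) use (&$calls) {
    $calls++;
    return null; // pretend the query matched no rows
};
cachedTotal('acme', 2009, $compute);
cachedTotal('acme', 2009, $compute);
echo $calls, "\n"; // prints 1
```

Whether the flat key is measurably faster than the nested array depends on your data, so it is worth timing both before committing to either.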

Have you seen or used this code pattern before? Do you love it or hate it?

Why is date() Returning 12/31/1969?

In PHP, a good sign that something is wrong with your date data or logic is when you start seeing dates displayed as “12/31/1969” (or however you specified the format). Unfortunately, there’s no one-size-fits-all solution, since it could be caused by any number of things, but ultimately, the root of the matter is that you’re passing an invalid timestamp into date(). So while I can’t answer what’s wrong in your specific code, I can tell you why it is happening.

All about date()

In PHP, the date function takes in a Unix timestamp, then formats it according to the format mask you provide. To display today’s date, you only need to provide a mask, no timestamp, as the timestamp argument will default to the value of time() (as in, right that moment).

In code:

<?php echo date("m/d/Y"); ?>

Which evaluates to the current date in mm/dd/yyyy form each time the page loads.

But when you want to display a date in the past or future, you have to provide a timestamp. A Unix timestamp to be exact.

The Unix timestamp

The Unix timestamp is defined in seconds since the Unix Epoch, otherwise known as January 1, 1970 at 0:00:00 UTC.

In PHP, you can retrieve the timestamp using the time function. In code:

<?php echo time() ?>

Which evaluates to the current number of seconds since the Epoch each time the page loads.

In order to get a timestamp value from a time string, such as "2009-09-09", you can use strtotime(), which will try to parse many different types of date strings. Learn more about the strtotime function in the PHP manual. So if I wanted to use PHP to format that date as something different, say to insert into a MySQL database, I would use code that looks something like:

<?php $date = date("Y-m-d H:i:s",strtotime("09/09/2009")); ?>

Which sets $date equal to "2009-09-09 00:00:00".

But when you pass a string to strtotime() that the function can’t parse, or try passing a date string directly to date() instead of a timestamp, date() can’t do anything with the invalid value.
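A small sketch of how you might catch the problem before it reaches date(): strtotime() returns false when it can’t parse its input, so you can test for that explicitly. The helper name formatDate() and its fallback text are hypothetical, chosen only for this example.

```php
<?php
// strtotime() signals a parse failure by returning false.
$timestamp = strtotime('banana');
var_dump($timestamp); // bool(false)

// Hypothetical helper: format a date string, or hand back a fallback
// instead of silently formatting an invalid timestamp.
function formatDate($dateString, $format = 'm/d/Y', $fallback = 'unknown') {
    $timestamp = strtotime($dateString);
    if ($timestamp === false) return $fallback;
    return date($format, $timestamp);
}

echo formatDate('2009-09-09'), "\n"; // 09/09/2009
echo formatDate('banana'), "\n";     // unknown
```

The strict comparison (=== false) matters here, because 0 is itself a perfectly valid timestamp.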

Demystifying 12/31/1969 (or 1/1/1970 for Eastern Hemisphere folk)

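Since the Unix timestamp is based off the Unix Epoch, an invalid timestamp defaults to the Epoch (Thu, 01 Jan 1970 00:00:00 +0000). When strtotime() can’t parse a string it returns false, and false used as a timestamp is treated as 0, which is zero seconds after the Epoch.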

But, date() displays the formatted time taking into account the timezone of the server or a timezone set with date_default_timezone_set(), so if your timezone is set to something like America/New_York (-0500), the date will be adjusted, resulting in a time that falls during 31 Dec 1969.

So if your server or script timezone is set to a timezone in the Western Hemisphere, any invalid timestamps will end up displayed as some incarnation of 31 Dec 1969. Likewise, in the Eastern Hemisphere, the date falls on or after the Epoch, resulting in a returned value of 1 Jan 1970.
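You can reproduce both dates yourself with a few lines. This is only an illustration: the unparseable string 'banana' and the two example timezones are arbitrary choices, and the explicit (int) cast makes the false-to-zero step visible.

```php
<?php
// A failed parse gives false, and false cast to an integer is 0 (the Epoch).
$invalid = (int) strtotime('banana');

date_default_timezone_set('America/New_York');
$newYork = date('m/d/Y H:i', $invalid); // five hours behind UTC

date_default_timezone_set('Asia/Tokyo');
$tokyo = date('m/d/Y H:i', $invalid);   // nine hours ahead of UTC

date_default_timezone_set('UTC');
$utc = date('m/d/Y H:i', $invalid);

echo $newYork, "\n"; // 12/31/1969 19:00
echo $tokyo, "\n";   // 01/01/1970 09:00
echo $utc, "\n";     // 01/01/1970 00:00
```

Same timestamp, three different displayed dates, which is why the symptom looks different depending on where your server thinks it lives.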

Like Y2K except worse…

The concept of the Unix Epoch as the basis for time is causing some issues as we get deeper into the new millennium. For 32-bit systems, such as this server and hundreds of thousands (likely millions) of other computerized devices out there, time is finite. The systems will not be able to handle the large integer required to store the date based on the Epoch. When Tue, 19 Jan 2038 03:14:07 UTC rolls around, timestamps will roll over to a value that equates to Fri, 13 Dec 1901 20:45:52 UTC.

As much as I’d like to travel back in time and see the first year of King Edward VII’s reign (oh the fashion!), that rollover will likely bring software systems crashing down. In fact, some systems have already started showing issues if they deal with dates farther than 27 years in the future. Luckily, electronics have been on a steady move toward 64-bit systems that can handle dates more than 290 billion years in the future, but it’s not unlikely that some 32-bit systems will still be in use when 2038 rolls around even if their manufacture has slowed (or likely stopped).

If you’re trying to pass a perfectly valid date that falls before 13 Dec 1901 or later than 19 Jan 2038 to your PHP script, chances are you’ll see this Epoch error, because the server can’t handle that timestamp.
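If you want to see exactly where the 32-bit boundaries fall, the sketch below formats the largest and smallest signed 32-bit integers as UTC dates and reports the integer size of the PHP build it runs on. The variable names are just illustrative.

```php
<?php
// The limits of a signed 32-bit integer, written so they stay integers
// even on a 32-bit build of PHP.
$max32 = 2147483647;      // 2^31 - 1
$min32 = -2147483647 - 1; // -2^31

$latest   = gmdate('D, d M Y H:i:s', $max32);
$earliest = gmdate('D, d M Y H:i:s', $min32);

echo $latest, "\n";   // Tue, 19 Jan 2038 03:14:07
echo $earliest, "\n"; // Fri, 13 Dec 1901 20:45:52

// PHP_INT_SIZE is 8 on 64-bit builds and 4 on 32-bit builds.
echo (PHP_INT_SIZE === 8 ? '64-bit' : '32-bit'), " integers\n";
```

On a build that reports 64-bit integers, timestamps outside this window should format normally; on a 32-bit build they are the ones to watch.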

What’s your favorite 12/31/1969 story? Are you worried about the year 2038?

Why printf Isn’t Working: PHP String Handling

At one point in time, I spent hours troubleshooting why my argument swapping/repeated placeholders in sprintf/printf statements weren’t working properly. I referred to the PHP docs, the internet, cursed the fact that I was doing back-end coding again, etc., and finally gave up and just repeated my variables using normal, un-numbered conversion specifications. Not once did I see anything that answered my question: “Why am I getting a ‘Too few arguments’ warning when using numbered placeholders?”

So, why am I getting a ‘Too few arguments’ warning when using numbered placeholders?

I (and probably you) was/were/are using a double-quoted string.

The issue here is with how PHP handles the String type. PHP is pretty forgiving about a lot of things, and in many cases single-quoted strings and double-quoted strings work just as well as the other. But there are some very important differences, especially when it comes to special characters, conversion specifications and variables.

If you’re having this same problem, change to using single quotes. Voilà, your problem is probably fixed, as long as you didn’t forget to escape any single quotes within your string.

A little more information on strings

The number one thing to remember when working with single-quoted strings in PHP: the only parsed escape sequences are the escaped single quote [\'] and the escaped backslash [\\].

Double-quoted strings will parse many escape sequences, and most importantly will expand variables.

If you’re a visual learner, refer to this example:

print '\nHello $username';
print "\nHello $username";

The above code will produce the following (assuming $username has been set to “World”):

\nHello $username
Hello World

So why is this an issue with printf?

I haven’t actually found a source to explain this or verify my postulate. But, I’d bet money on it, and I’m not a betting woman.

Double-quoted strings expand the conversion specification as a variable before printf even gets a chance to evaluate the code. Visually speaking:

$s = "shoes";
$numchucks = "a";
$noun = "wood";
$verb = "chuck";
printf("How much %1$s would %2$s %1$s-%3$s %3$s...",$noun,$numchucks,$verb);

The above code results in a “Too few arguments” warning, and nothing prints, because printf is actually getting a string that looks like: “How much %1shoes would %2shoes %1shoes-%3shoes %3shoes...”. In other words, you’re passing it five valid conversion specifications but only three arguments to fill them. If you added in the proper number of arguments, like this:

printf("How much %1$s would %2$s %1$s-%3$s %3$s...",$noun,$numchucks,$verb,$noun,$verb);

The resulting output would be:

How much woodhoes would    ahoes chuckhoes-woodhoes chuckhoes…

For those readers who are familiar with only the most basic of conversion specs, %2s is perfectly valid. It reads like: “format as a string [s] at least two characters long [2], using the default padding character [a space] and default alignment [right].” So, as you see with ‘a’, it pads the left of the value with a space (exaggerated in the above example with three spaces).
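You can check both halves of this yourself without triggering any warnings. The sketch below first echoes a double-quoted format string to show what printf would actually receive, then uses sprintf with single-quoted formats to show how a width of 2 pads a short value. The square brackets are only there to make the spaces visible.

```php
<?php
$s = "shoes";

// Double quotes: PHP replaces $s before printf ever sees the string.
$format = "How much %1$s would %2$s";
echo $format, "\n"; // How much %1shoes would %2shoes

// Single quotes: the format reaches sprintf untouched.
$padded    = sprintf('[%2s]', 'a');    // [ a]   padded on the left
$leftAlign = sprintf('[%-2s]', 'a');   // [a ]   the "-" flag pads on the right
$tooLong   = sprintf('[%2s]', 'wood'); // [wood] longer values are not cut off

echo $padded, "\n", $leftAlign, "\n", $tooLong, "\n";
```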

In the above examples, we end up with a (somewhat) sensibly formatted string because our interpreted variable $s started with the letter ‘s’, a valid conversion specification character. You will end up with even weirder results if $s begins with an invalid conversion spec character. Take $s="random"; for example. printf tries to use ‘r’ but as it’s not a valid spec character it just leaves out the conversion specifications:

How much andom would andom andom-andom andom…

Even more descriptive is if $s doesn’t exist:

How much would…

So, for future reference, if you want to use numbered conversion specifications, here’s the proper way:

printf('How much %1$s would %2$s %1$s-%3$s %3$s...',$noun,$numchucks,$verb);
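If you really need a double-quoted format string (say, because it also contains a \n), another option is to escape each dollar sign with a backslash so PHP doesn’t treat it as the start of a variable. A quick sketch comparing the two, using sprintf so the results can be compared directly:

```php
<?php
$noun = "wood";
$numchucks = "a";
$verb = "chuck";

// Single quotes: nothing is interpolated, so the placeholders survive.
$single = sprintf('How much %1$s would %2$s %1$s-%3$s %3$s...', $noun, $numchucks, $verb);

// Double quotes: \$ keeps each dollar sign literal.
$escaped = sprintf("How much %1\$s would %2\$s %1\$s-%3\$s %3\$s...", $noun, $numchucks, $verb);

echo $single, "\n";  // How much wood would a wood-chuck chuck...
echo $escaped, "\n"; // How much wood would a wood-chuck chuck...
```

Both produce the same output; the single-quoted version is simply harder to get wrong.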

And you know, I’ve really always wondered—

How much wood would a wood-chuck chuck…

…if a wood-chuck could chuck wood.