Readable code tips: Don’t inline everything!

Previous Tip: Extract complex conditionals
Next Tip: Use “is” or “has” for Boolean variables/functions

There is a balance to be found between “inlining” multiple expressions, and using temporary variables.

This example uses lots of nicely named variables. But it’s like a wall of code that’s actually hard to read and surprisingly hard to reason about.

function isExistingTextFile($filename) {
    $isEmptyFilename = ! is_empty($filename);
    $isValidFilename =  strlen($filename) > 5;
    $isNonEmptyValidFilename = $isEmptyFilename && $isValidFilename;
    $filenameHasTextExtension = pathinfo($filename, PATHINFO_EXTENSION) === 'txt';
    $fileExists = file_exists($filename);
    $isValidTextFilename = $isNonEmptyValidFilename && $fileNameHasTextExtension;
    $isTextFile = $isValidTextFilename && $fileExists;
    return $isTextFile;
}

One possible way to fix this is to say “Well, this is really just one condition. A single expression. Let’s just inline everything and return it:

function isExistingTextFile($filename) {
    return ! is_empty($filename) && strlen($filename) > 5 && pathinfo($filename, PATHINFO_EXTENSION) === 'txt' && file_exists($filename);
}

And on the face of it this is much nicer. But for me, this is the other extreme. It’s still actually kinda hard to read. I think that a good, well-named intermediate variable can actually help break down a multi-part expression like this.

function isExistingTextFile($filename) {
	$isValidFilename = ! is_empty($filename) && strlen($filename) > 5;
	$isTxtFile = $isValidFilename && pathinfo($filename, PATHINFO_EXTENSION) === 'txt';
	return $isTxtFile && file_exists($filename);
}

I think this is a nice middle ground with the final return being a readable statement of what we want to check for. And working our way back up the expressions we can see how each is also readable. In my opinion at least.

Bonus: A quick primer on short-circuit evaluation!

Now… to that “hard to reason about” piece. Why is this code hard to reason about? It seems pretty simple?

Well if you’ve not heard of short-circuit evaluation, it’s worth finding out about.

Boolean expressions can be made more efficient using short-circuit evaluation. When this happens, the computer runs the minimum number of checks required to complete the evaluation of the expression.

There’s some words in there. Maybe that means nothing to you. So let’s look at an example.

$result = complexFunction1() && complexFunction2();

Because we’re using &&, we don’t always have to call both functions.

If the result of the first function is false then the whole thing is false. So complexFunction2() doesn’t need to be called to get the result. And in many programming languages, including PHP and JavaScript, it is not called.

Wait? What?! You might be thinking.

Yeah, this can be pretty powerful and we need to be careful with it.

Imagine that these two complex function calls involve slow operations, say, database queries or API requests. It could be much quicker to only call the first function.

Consider this case:

$result = createPost() && publishPost();

In this case short-circuit evaluation helps us because if createPost() fails then publishPost() will also fail. But short-circuit evaluation means we don’t call publishPost(), which could save us time.

Now what if we extracted variables here?

$isPostCreated = createPost();
$isPostPublished = publishPost();
$result = $isPostCreated && $isPostPublished;

This is slightly contrived, and probably not what you would do. But if we did do this then publishPost() is ALWAYS called on the second line, even if createPost()fails and returns false.

Note that short-circuit evaluation also works for || conditions, but backwards:

$result = complexFunction1() || complexFunction2();

Here if complexFunction1() is true then the whole expression will always be true and we don’t need to call complexFunction2() to determine that, and so it’s not called.

So there might be a benefit to our fully-inlined version:

function isExistingTextFile($filename) {
    return ! is_empty($filename) && strlen($filename) > 5 && pathinfo($filename, PATHINFO_EXTENSION) === 'txt' && file_exists($filename);
}

because if we fail any step, the subsequent steps are not executed at all.

I shall leave it up to the reader to look at the example code at the top of this post and try to figure out what is called and when. Did the code work differently as we inlined and then un-inlined the parts of the expression?