Avoid Spaghetti Code with Scope Minimization

Let’s see some recommendations on how to prevent the phenomenon of spaghetti code just by minimizing the visibility of variables.

Our aim is to reduce, as much as possible, the portion of source code where each variable is visible, that is, to reduce the scope of variables.

Scope minimization is the process of structuring code in such a way that it’s easy to:

  • declare variables that have minimal scope, and
  • assign variables with data that have minimal scope.

As a matter of fact, it’s the code structure that defines the visibility of variables.

Background Notions

A program is made of a combination of statements that are either simple (e.g. assignments) or compound (e.g. conditionals, loops).

The latter kind of statements can be nested, which means they can be made of code blocks that can contain other statements.

In particular, let’s consider two blocks A and B:

  • we say A is the outer block of B if A contains B, while
  • we say B is the inner block of A if B is contained by A.

The indentation level of a block is its nesting depth, which is one level higher than that of its outer block.

Let’s define the global scope as a special block that has no outer block and has indentation level 0.

Finally, global variables are those defined in the global scope.

Variable Visibility Rules

A variable is visible in the portion of code that:

  1. starts from the variable’s declaration statement,
  2. ends at the end of the variable’s declaration block, and
  3. includes all the nested blocks within 1. and 2.

Conversely, a variable is not visible in the portion of code that lies:

  1. before the variable’s declaration, and
  2. after the end of the variable’s declaration block.
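To make the rules concrete, here is a minimal sketch in TypeScript, chosen here only because its `const` and `let` declarations are block-scoped. The function `visibilityDemo` and the array `seen` are hypothetical names introduced for illustration.

```typescript
// Hypothetical function, used only to illustrate the visibility rules.
function visibilityDemo(): number[] {
  const seen: number[] = [];
  // `v` is not visible here: its declaration comes later, in an inner block.
  {
    const v: number = 1; // visibility starts at the declaration statement
    seen.push(v);
    if (v > 0) {
      seen.push(v + 1); // nested blocks after the declaration can see `v`
    }
  } // visibility ends at the end of the declaration block
  // Referring to `v` here would be rejected by the compiler.
  return seen;
}
```

Calling `visibilityDemo()` collects the two values read while `v` was visible.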

Recommendations

  • R1. Never use global variables
  • R2. Declare single-purpose variables
  • R3. Declare variables close to their use
  • R4. Keep code blocks small
  • R5. Use variables close to their declaration
  • R6. Use no more than two nesting levels

R1. Never use global variables

Never declare or use global variables as they make code a lot more difficult to read, maintain and test. See “global variables are bad”.

Their use increases the occurrences of problematic side effects, which often lead to programming errors that are not easy to identify and fix.

The fewer the statements over the program that could erroneously assign variables, the better.

In conclusion, the use of global variables often represents technical debt, which should be repaid as soon as possible with a code rewrite!
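As a rough illustration, the TypeScript sketch below contrasts a global accumulator with a version that keeps the state local and passes it around explicitly. All names (`runningTotal`, `addToGlobalTotal`, `addToTotal`, `computeTotal`) are hypothetical and exist only for this example.

```typescript
// Discouraged: a global variable that any statement in the program can reassign.
let runningTotal = 0;

function addToGlobalTotal(amount: number): void {
  runningTotal += amount; // hidden side effect on global state
}

// Preferred: the state is passed in and returned, so no global is needed.
function addToTotal(total: number, amount: number): number {
  return total + amount;
}

function computeTotal(amounts: readonly number[]): number {
  let total = 0; // visible only inside this function
  for (const amount of amounts) {
    total = addToTotal(total, amount);
  }
  return total;
}
```

With the second version, a test can call `computeTotal` with any input without first resetting shared state.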

R2. Declare single-purpose variables

Declare and use variables for a single specific purpose so as to restrict their scope to the minimum.

The more purposes a declared variable has, the more statements it will be visible to.

The more statements a variable is visible to, the more statements could erroneously assign it.

The more statements could erroneously assign a variable, the more difficult it is to find and fix potential errors.

Summing up: only declare and use single-purpose variables.
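Here is a small sketch of the difference, written in TypeScript with hypothetical function names. The first version reuses one variable for two unrelated purposes; the second gives each purpose its own read-only variable.

```typescript
// Discouraged: `result` holds the perimeter first and the area later.
function describeRectangleReused(width: number, height: number): string {
  let result = 2 * (width + height); // purpose 1: perimeter
  const perimeterText = `perimeter ${result}`;
  result = width * height; // purpose 2: area, stored in the same variable
  return `${perimeterText}, area ${result}`;
}

// Preferred: one variable per purpose, each declared with `const`.
function describeRectangle(width: number, height: number): string {
  const perimeter = 2 * (width + height);
  const area = width * height;
  return `perimeter ${perimeter}, area ${area}`;
}
```

Both functions return the same text, but in the second one no statement can accidentally overwrite the perimeter with the area.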

R3. Declare variables close to their use

Declare variables as close as possible to the statements and code blocks that will use them.

Closely related to recommendation R2, this is another way of reducing the number of statements that can use the declared variables.

Example: Three Subsequent Blocks

First, let’s consider three subsequent code blocks A, B, and C, namely:

  • A, B, and C are in the same outer block, and
  • they have the same indentation level.

Second, let’s declare a variable v immediately before A, at the same indentation level as A.

According to the visibility rules, v will be visible to all the three subsequent blocks A, B, and C.

Now let’s assume we need v to be visible only to C.

Clearly, declaring v immediately before C reduces its visibility: v is no longer visible to the statements within A and B, only to those within C (and to any statements that follow C in the same outer block). So far so good!
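The situation can be sketched in TypeScript as follows. The function `threeBlocks`, its parameter `items`, and the array `log` are hypothetical; the loops and the conditional simply stand in for the blocks A, B, and C.

```typescript
function threeBlocks(items: readonly number[]): string[] {
  const log: string[] = [];

  // Block A: does not need `v`, and cannot see it.
  for (const item of items) {
    log.push(`A saw ${item}`);
  }

  // Block B: does not need `v` either.
  if (items.length === 0) {
    log.push("B saw an empty list");
  }

  // `v` is declared immediately before C, after A and B have ended.
  let v = 0;
  // Block C: the block that actually uses `v`.
  for (const item of items) {
    v += item;
    log.push(`C: running sum ${v}`);
  }
  return log;
}
```

Moving the declaration of `v` above block A would compile just as well, but it would let A and B assign `v` by mistake.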

However, this recommendation is only partially useful unless the next one is followed as well.

R4. Keep code blocks small

Similarly to recommendation R2 on single-purpose variables, keep code blocks as small as possible by making them focused on a single specific task.

Otherwise, some variables will likely become unnecessarily visible to the parts of the code block dedicated to different tasks.

Let’s take again the three subsequent blocks A, B and C introduced before, which have the same indentation level.

Let’s suppose we need to declare a variable w to be used and modified by A while only being read by B and C.

Declaring w immediately before A is inevitable, thus its scope will unnecessarily include B and C.

This makes B and C able to also assign w, while we want them to only be able to read w’s value.

How do we avoid this? Just separate B and C into two distinct functions that take w as one of their arguments.

Summing up: make code blocks single-task, and move sub-tasks into separate functions.
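A possible shape for this refactoring is sketched below in TypeScript. The names `processReadings`, `reportCount`, and `reportLargest` are hypothetical, and declaring the parameters as `readonly number[]` is an extra assumption of this sketch, so that the compiler also rejects in-place modification of w.

```typescript
// Hypothetical stand-ins for blocks B and C, now separate functions.
// Each receives `w` as a read-only parameter: it can read the data
// but cannot reassign or modify the caller's variable.
function reportCount(w: readonly number[]): string {
  return `count: ${w.length}`;
}

function reportLargest(w: readonly number[]): string {
  return `largest: ${Math.max(...w)}`;
}

function processReadings(input: readonly number[]): string[] {
  const w: number[] = [];
  // Block A: the only place where `w` is built and modified.
  for (const reading of input) {
    if (reading >= 0) {
      w.push(reading);
    }
  }
  // B and C only read `w` through their parameters.
  return [reportCount(w), reportLargest(w)];
}
```

Inside `reportCount` and `reportLargest`, a call such as `w.push(0)` would not type-check, which is the restriction the text asks for.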

R5. Use variables close to their declaration

Remember to use variables as close to their declaration as possible.

Let’s say you need to use (i.e. either read or assign) a variable v declared in an outer block.

Remember to minimize the indentation levels between the statement that uses v and the declaration block of v.

A general rule: Two should be the maximum number of nesting levels between the variable declaration and its use.

For completeness, let’s see an example where it may be reasonable to use variables at a distance of three nesting levels (i.e. one more than recommended).

Example: Matrix Multiplication Algorithm

Let’s consider the iterative matrix multiplication algorithm, which uses three nested loops.

The deepest of such loops contains this assignment statement:

$\mathit{sum} \leftarrow \mathit{sum} + A_{ik} \times B_{kj}$.

A and B used by the statement are declared in an outer block at three nesting levels of distance, which is one level more than the recommended maximum.

In cases like this, it’s very much forgivable to make an exception to the rule.
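For reference, a minimal TypeScript sketch of the iterative algorithm could look like this. The type alias `NumericMatrix` and the function name `multiplyMatrices` are hypothetical, the parameters `a` and `b` play the role of A and B, and the sketch assumes well-formed inputs where `a` is n×m and `b` is m×p.

```typescript
type NumericMatrix = number[][];

// Assumes `a` has shape n×m and `b` has shape m×p; shapes are not validated.
function multiplyMatrices(a: NumericMatrix, b: NumericMatrix): NumericMatrix {
  const n = a.length;
  const m = b.length;
  const p = m > 0 ? b[0].length : 0;
  const c: NumericMatrix = [];
  for (let i = 0; i < n; i++) {
    const row: number[] = [];
    for (let j = 0; j < p; j++) {
      let sum = 0; // declared right next to the loop that uses it
      for (let k = 0; k < m; k++) {
        // `a` and `b` are declared three nesting levels away from here.
        sum = sum + a[i][k] * b[k][j];
      }
      row.push(sum);
    }
    c.push(row);
  }
  return c;
}
```

Note that `sum` and `row` still follow the two-level rule; only the read-only use of `a` and `b` goes one level beyond it.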

Corner cases aside, consider three levels of indentation too many when using variables declared in outer blocks.

Although it is met in practice by following recommendation R6, keeping this recommendation in mind is always a good idea when organizing code.

R6. Use no more than two nesting levels

In general, remember to reduce the depth of nested blocks as much as possible.

The nesting depth in each function should be two at most.

When a function has a nesting depth higher than two, restructure it in the following way:

  1. move some sub-blocks into separate functions, and
  2. pass the variables used by the moved sub-blocks as function arguments.

As seen in recommendation R5, it’s sometimes reasonable to have three as the maximum depth.

Again, corner cases aside, always consider two as the maximum nesting depth.
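The restructuring steps listed above can be sketched in TypeScript as follows. The word-counting task and all function names are hypothetical, picked only to have a function with three nesting levels to start from.

```typescript
// Before: loop, loop, conditional, i.e. a nesting depth of three.
function countLongWordsNested(lines: readonly string[], minLength: number): number {
  let count = 0;
  for (const line of lines) {
    for (const word of line.split(" ")) {
      if (word.length >= minLength) {
        count++;
      }
    }
  }
  return count;
}

// After, step 1: the inner sub-block moves into its own function.
// Step 2: the variables it needs (`line`, `minLength`) become arguments.
function countLongWordsInLine(line: string, minLength: number): number {
  let count = 0;
  for (const word of line.split(" ")) {
    if (word.length >= minLength) {
      count++;
    }
  }
  return count;
}

function countLongWords(lines: readonly string[], minLength: number): number {
  let total = 0;
  for (const line of lines) {
    total += countLongWordsInLine(line, minLength);
  }
  return total;
}
```

After the change, no function nests deeper than two levels, and the per-line counter is no longer visible to the code that iterates over lines.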


© 2024 Massimo Nazaria

Licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.