Are global variables bad?
We hear it often: "global variables are bad, avoid using them!" But is this actually good advice? Simplified blanket statements are already a bit suspicious, and once we get into the details, this one seems to fall apart. In this article, I look at the lifetime and visibility of variables, which we need to understand to make sense of the claim. It may turn out to be a meaningless statement.
What is a global variable?
The term "global" does not have a consistent definition across languages and architectures. This inconsistency requires us to be pedantic; if I need to issue a judgement about global variables, I have to understand what precisely I'm talking about.
Variables have several properties that define them. Two of the key properties for our discussion are lifetime and visibility. The type of the variable also plays a role, but it's a bit more subtle, so I won't cover it here.
It's important to understand the difference between names and values. Some language constructs, such as pointers, can muddy the clarity of lifetime and visibility.
The lifetime of a variable is how long the value exists (the value is referred to as the "object" in some languages). I don't want to get too deep into this concept, but we need a bit of an overview.
Some of the common lifetimes are:
- temporary: These values are created by expressions, such as sin( a + 5 ). The resulting value can be passed to another function, or copied into a variable. The intermediate results disappear when no longer needed, or at the end of the statement in some languages.
- block: Variables declared inside a block scope (sections with begin/end tags of some kind) tend to have values that exist only within that block.
- instance: The values defined within a class exist on an instance of the class. This is a kind of child lifetime: when the parent dies so do the children.
- application: These values share a lifetime with the application. They are created when it starts, or possibly later, and exist until the program terminates.
- manual: The value's lifetime is controlled explicitly via
Those are the classic options if you're considering a single executable written in one language. It would seem natural to label "global variables" as those with application lifetime. It gets confusing once we consider software systems comprising more than one program. We have values that persist beyond execution: database and configuration values. Some values exist so long as the host machine is running. In a web app, we have session lifetime: the values disappear when the browser is closed. It's not clear what global lifetime means.
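To make the single-program lifetimes listed above a bit more concrete, here is a minimal C++ sketch. The names events, Tracked, Owner and demo_lifetimes are hypothetical helpers I made up for illustration, and using new/delete for the manual case is my own choice of mechanism.

```cpp
#include <cassert>
#include <cmath>
#include <string>
#include <utility>
#include <vector>

// Application lifetime: this log exists from program start until termination.
// `events` and `Tracked` are hypothetical helpers for this sketch.
std::vector<std::string> events;

struct Tracked {
    std::string name;
    explicit Tracked(std::string n) : name(std::move(n)) {
        events.push_back("+" + name);  // record creation
    }
    ~Tracked() {
        events.push_back("-" + name);  // record destruction
    }
};

struct Owner {
    Tracked member{"instance"};  // instance lifetime: dies with its Owner
};

double demo_lifetimes() {
    double a = 1.0;
    // Temporary: the value of `a + 5` exists only while this expression is evaluated.
    double result = std::sin(a + 5);
    {
        Tracked block{"block"};  // block lifetime
    }                            // `block` is destroyed here
    {
        Owner owner;             // creates the member along with the parent
    }                            // parent and member are destroyed together
    // Manual lifetime (assumed mechanism: new/delete).
    Tracked* manual = new Tracked{"manual"};
    delete manual;
    return result;
}
```

After one call to demo_lifetimes(), the log holds a creation and a destruction entry for each of the block, instance and manual values, in that order, while the log itself lives on until the program exits.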
Visibility is a statement about how we get access to a value. The typical approach is via a variable name. In this limited interpretation there are a few ways to create names:
- block scope: Names within a block of code are only available in that block of code. Some languages have "function scope" instead of arbitrary block scope.
- private member scope: Names within a class that are only accessible by member functions.
- public member scope: Names within a class that are accessible so long as you have an instance of the class.
- module scope: All the source code within a module sees these names. These may be either public or private, like member variables.
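The four scopes above might look like this in C++. This is only a sketch: every name in it is hypothetical, and I'm treating a single source file as the "module", with an unnamed namespace standing in for module-private names.

```cpp
#include <cassert>

namespace {
int module_private = 1;  // module scope, visible only inside this source file
}
int module_public = 2;   // module scope, other files could reach it via `extern`

class Counter {
    int hidden_ = 0;     // private member scope
public:
    int visible = 0;     // public member scope
    void bump() { ++hidden_; ++visible; }
    int hidden() const { return hidden_; }
};

int sum_visible_names() {
    int total = module_private + module_public;  // 1 + 2
    {
        int inner = 10;  // block scope
        total += inner;
    }
    // total += inner;     // would not compile: `inner` is out of scope
    Counter c;
    c.bump();
    total += c.visible;    // public member: accessible through the instance
    // total += c.hidden_; // would not compile: private member
    return total + c.hidden();  // the private value is reachable only via a member function
}
```

The commented-out lines are the interesting part: the values may well still exist at those points, but there is no name through which to reach them.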
Some values are "visible" via accessor functions: setters and getters. At a quick glance, having accessor functions such as set_a is roughly equivalent to having a publicly visible variable a. Indeed, many languages allow you to write accessor functions directly for a public variable. There's something a bit unsettling here; I'll get back to this.
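As a rough C++ illustration of that equivalence, compare a public field with a pair of accessors. The getter name get_a is my assumed counterpart to set_a, and the struct and function names are hypothetical.

```cpp
#include <cassert>

struct WithField {
    int a = 0;  // publicly visible variable
};

class WithAccessors {
    int a_ = 0;  // backing value, hidden from callers
public:
    int get_a() const { return a_; }       // assumed getter paired with set_a
    void set_a(int value) { a_ = value; }
};

// Write a value and read it back through each interface.
int roundtrip_field(int v) {
    WithField f;
    f.a = v;
    return f.a;
}

int roundtrip_accessors(int v) {
    WithAccessors w;
    w.set_a(v);
    return w.get_a();
}
```

From the caller's point of view both round trips behave the same, which is why it is tempting to treat the two forms as interchangeable.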
A variable could also be marked as read-only with a final tag. Despite the name, this doesn't necessarily mean the backing object is constant, only that the symbol always points to the same object.
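C++ has no final keyword for variables, so as a stand-in I'm using a const pointer, which I think is the closest analogue: the name can never be re-pointed, yet the object behind it can still change. The function name below is hypothetical.

```cpp
#include <cassert>
#include <vector>

int mutate_through_fixed_name() {
    std::vector<int> storage{1, 2, 3};
    // The pointer itself is read-only; the vector it points to is not.
    std::vector<int>* const items = &storage;
    items->push_back(4);  // allowed: the backing object is modified
    // items = nullptr;   // would not compile: the name cannot be rebound
    return static_cast<int>(items->size());
}
```

The read-only marker here constrains the name, not the value, which is the distinction between names and values again.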
If we were to speak of "global" visibility, perhaps module scope comes the closest. These variables aren't visible everywhere though: usually they are visible only within a module, or are sometimes exported for public use. The visibility is also only upwards: a user of a module could see names inside it, but the module can't see names from the higher-level code.
Correlation and specialness
Some of the lifetimes and visibilities have familiar relationships:
- a block scope name tends to have a value with block lifetime
- a module scope name tends to have a value with application lifetime
Though this may be the default, it's not the only option. By using a static, or similar, keyword we can give block scope variables a value with application lifetime. The same keyword can also make member properties that have application lifetime: these are called "class variables".
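In C++ the keyword really is static, so both cases can be sketched directly. The names next_id and Widget are hypothetical.

```cpp
#include <cassert>

int next_id() {
    // Block scope name, application lifetime: initialised once,
    // then the value survives between calls.
    static int counter = 0;
    return ++counter;
}

struct Widget {
    static int created;  // class variable: one value shared by all instances
    Widget() { ++created; }
};

int Widget::created = 0;  // the single definition, with application lifetime
```

Nothing outside next_id() can name counter, yet its value outlives every call. Likewise Widget::created keeps counting even after every Widget instance has been destroyed.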
If we're doing multi-threaded programming, we can also use a "thread local" lifetime. These variables may be visible to an entire module, but each running thread has a distinct value associated with it.
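A small sketch of thread-local values, using the C++11 thread_local keyword and std::thread. The names hits and hits_seen_by_new_thread are hypothetical.

```cpp
#include <cassert>
#include <thread>

// One name visible to the whole file, but a distinct value per thread.
thread_local int hits = 0;

int hits_seen_by_new_thread() {
    int seen = -1;
    std::thread worker([&seen] {
        ++hits;       // modifies the worker thread's own copy, which started at 0
        seen = hits;
    });
    worker.join();
    return seen;
}
```

If the calling thread sets hits to 5 first, the worker still starts from its own fresh 0, and the caller's 5 is untouched afterwards.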
Back to those accessor functions I mentioned. If access to a value is hidden behind functions, there's no real way the caller can know the lifetime of the backing value. These accessors must play a role in any advice we give on "global" values.
Uhm, so what's good or bad here?
All of this leads back to the original question: are global variables bad? We couldn't answer before because we didn't have a clear definition. Now, armed with our knowledge of lifetime and visibility, I'm not sure we want to come up with a definition.
Should a global variable be defined as one with module scope and application lifetime? Does it matter if it's private to a module? If it's only modified through accessor functions is it still global? Consider that a function like get_time() is the same as a read-only variable accessing the current time, and I find it hard to believe we'd want to say this function is "bad".
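For illustration, a get_time() along these lines could be written in C++ as below. This is just one assumed implementation: the choice of std::chrono::steady_clock and of returning seconds as a double are mine.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical get_time(): seconds elapsed on a monotonic clock.
// Callers can read the value from anywhere, but cannot assign to it.
double get_time() {
    using clock = std::chrono::steady_clock;
    return std::chrono::duration<double>(clock::now().time_since_epoch()).count();
}
```

The function is callable from everywhere and its backing value lives as long as the program, so it acts like a globally visible, read-only variable with application lifetime.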
What if my program is a micro-service architecture? I have several little programs that start and stop at frequent intervals. Though technically I have many values with an application lifetime, it feels more limited because of how I'm using them. They certainly aren't "global" in my system.
I don't think I can give a satisfactory definition of "global variable" that has any universal usefulness. That would mean it's somewhat meaningless to ask, "are global variables bad?" The question has to be more nuanced than that. It must refer to the applicability of all lifetimes and visibilities.
Answering the question "What is the applicability of the various combinations of name visibility and value lifetime?" would be a long and complicated discussion. In short, I assure you that all combinations have both good and bad uses.