We know that in Java there are four kinds of variables: local variables, formal parameters, instance variables and class variables. We know that class variables are often used to define named constants. We know that each variable must be declared and initialised before it is used, that it has a certain lifetime and a certain visibility, and that its identifier (name) has a certain scope. This page offers advice on good coding style for these issues. We begin with some general guidelines, which are expanded on in subsequent sections of this web page.
Declaring variables and then not using them is a sign of slapdash programming. It brings clutter, which lessens readability, understandability and maintainability. It can also lower efficiency: for example, an unused instance variable still takes up space in every object.
The argument that declaring but not using variables (especially formal parameters) gives room for expansion is rarely sustainable. ([Meyer, 1997], pp.764-770 gives a good discussion of this and related points. He shows how you can avoid having lots of parameters that are included simply so that a method can offer lots of options on the way it is run.)
In the case of local variables, Java doesn't supply a default value and the compiler insists that you explicitly initialise each local variable before you use it. It therefore makes good sense to initialise the variable when you declare it:
int screenWidth = MIN_SCREEN_WIDTH;
When initialising a variable, use a sensible value where possible rather than some 'dummy' value. For example, initialise your variables from formal parameters or from named constants.
If, on the other hand, you do simply require that the variable start off with some arbitrary default value, then use the same values that Java uses when it initialises instance variables and class variables, i.e. zero for numeric variables, false for booleans and null for reference types.
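The following is a minimal sketch of this advice about initialising local variables. The class name, the method and the two constant values are hypothetical, chosen only for illustration.

```java
class ScreenSettings {
    // Hypothetical named constants, used only for this illustration.
    private static final int MIN_SCREEN_WIDTH = 320;
    private static final int MAX_SCREEN_WIDTH = 1920;

    // Returns the requested width, limited to the supported range.
    static int chooseWidth(int requestedWidth) {
        // Initialised where it is declared, from a named constant
        // rather than from some dummy value.
        int screenWidth = MIN_SCREEN_WIDTH;
        if (requestedWidth > MIN_SCREEN_WIDTH) {
            screenWidth = Math.min(requestedWidth, MAX_SCREEN_WIDTH);
        }

        // int uninitialised;
        // return uninitialised;   // would not compile: local variables get no default value

        return screenWidth;
    }
}
```

Because screenWidth always holds a legal width from the moment it is declared, there is no point in the method at which it contains a meaningless value.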
Instance variables are given initial values by Java. But they nevertheless deserve some further treatment here.
The values of an object's instance variables define the state of that object. You must ensure that your objects are always in a legal state. Each instance variable must contain a legal value at all times, and you should also never allow the variables to contain illegal combinations of values.
For example, a Person object might have among its instance variables an age variable (an int) and an isOAP variable (a boolean). Not only must age always contain a positive integer less than 120 (say) and isOAP always contain true or false, but the two must also be consistent: age can't contain 99 while isOAP contains false.
A part of achieving this aim is to make the variables private and allow access to them only through a public interface that provides suitable getters and setters. The setters can then ensure that another class cannot mess up the values: they can ensure, e.g., that age and isOAP are always consistent.
But proper initialisation is also important. You need to make sure that no matter how an object is created it starts its life with a proper set of initial values in its instance variables.
Do Java's default values (zero for numeric types, false for booleans and null for reference types) give a legal starting state to the object? If so, you might not need a constructor. If not, write one or more constructors to replace the default values. (If your constructor takes in parameters and assigns them into the instance variables, you ought to check that they give the object a legal starting state. If not, you must take appropriate remedial action.)
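One way the Person example might be written is sketched below. The pension age of 65 and the choice of throwing an IllegalArgumentException as the remedial action are assumptions made for illustration; the age limits follow the "positive integer less than 120" suggested above.

```java
final class Person {
    // Age limits as suggested in the text; PENSION_AGE is a hypothetical value.
    private static final int MIN_AGE = 1;
    private static final int MAX_AGE = 119;
    private static final int PENSION_AGE = 65;

    private int age;
    private boolean isOAP;

    // Java's default of 0 for age would be illegal here, so a constructor
    // is needed to give every new object a legal starting state.
    public Person(int initialAge) {
        setAge(initialAge);
    }

    public int getAge() {
        return age;
    }

    public boolean isOAP() {
        return isOAP;
    }

    // The only way to change age. It rejects illegal values and updates
    // isOAP at the same time, so the two variables cannot disagree.
    public void setAge(int newAge) {
        if (newAge < MIN_AGE || newAge > MAX_AGE) {
            throw new IllegalArgumentException("Illegal age: " + newAge);
        }
        age = newAge;
        isOAP = (newAge >= PENSION_AGE);
    }
}
```

Note that there is deliberately no setter for isOAP: offering one would let another class create an inconsistent combination of values.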
Variables that get used for more than one purpose are confusing and typically have convoluted, inaccurate or vague identifiers.
You'll see one example of a variable being used for more than one purpose in the section on minimising scope below.
One common way of using a variable for more than one purpose arises when programmers assign into formal parameters. Don't do this! Consider this abstract example:
public void exampleMethod(int inputVal) {
    <statements1>
    inputVal = <some calculation probably using inputVal>
    <statements2>
}
Within a few statements, inputVal no longer contains the input value; it contains some result of a computation. Its identifier is now inaccurate and misleading.
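A possible remedy is sketched below: give the computed result its own local variable with an accurate name. The class, method and constant are hypothetical. Declaring the parameter final is optional, but it makes the compiler reject any assignment to it.

```java
class Durations {
    private static final int SECS_PER_MIN = 60;

    // The final modifier means that minutes keeps its input value
    // for the whole of the method body.
    static int toSeconds(final int minutes) {
        // minutes = minutes * SECS_PER_MIN;   // would not compile: minutes is final

        // The result of the calculation gets its own, accurate identifier.
        int seconds = minutes * SECS_PER_MIN;
        return seconds;
    }
}
```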
Another kind of example of a variable with multiple uses is where a variable is used to hold special values in exceptional circumstances. For example, a variable index might be used to hold the position of an item in an array. But, when the item being searched for is not in the array, the variable might be set to -1. This is a common programming idiom, and it isn't a heinous crime. However, it does mean that your variable identifier is not wholly accurate (it describes only one of the two uses of the variable). Since this can be confusing on occasion, it should be done with care.
It also requires that you are very certain that the special value can always be reserved for its special use. Clearly this is OK in the case of array indexes: these can never be negative, so -1 can be given a special meaning. By contrast, suppose that, at the moment, you need an array of 100 elements, so you use 999 as a special value; if, in the future, your arrays are allowed to be 1000 elements, you will have to change your special value too. (Obviously, if you use a named constant, then this will be a less painful change.)
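The idiom might look like the following sketch, in which the special value is given a name. The class name and the identifier NOT_FOUND are assumptions made for illustration.

```java
class ArraySearch {
    // Array indexes are never negative, so -1 can safely be reserved.
    static final int NOT_FOUND = -1;

    // Returns the position of the first occurrence of target in items,
    // or NOT_FOUND if target does not occur.
    static int indexOf(int[] items, int target) {
        for (int position = 0; position < items.length; position++) {
            if (items[position] == target) {
                return position;
            }
        }
        return NOT_FOUND;
    }
}
```

A caller can then write a test such as indexOf(items, target) == NOT_FOUND, which makes the second use of the value explicit.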
Finally, if you know what scope holes are, then avoid using them. If you don't know what they are, avoid using them anyway :)
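For readers who have not met the term, here is a small sketch, assuming that "scope hole" is meant in its usual sense: a region of the program where an identifier is hidden by an inner declaration of the same name. The Counter class is hypothetical.

```java
class Counter {
    private int count = 0;

    void increment() {
        count++;
    }

    int getCount() {
        return count;
    }

    // Intended to reset the instance variable, but the declaration below
    // creates a NEW local variable called count. Within this method the
    // instance variable is hidden (a scope hole) and is left unchanged.
    void brokenReset() {
        int count = 0;
    }

    // Correct version: no local declaration, so count is the instance variable.
    void reset() {
        count = 0;
    }
}
```

The compiler accepts brokenReset without complaint, which is why bugs of this kind can be hard to spot.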
In practice, the advice that you should minimise identifier scope applies primarily to local variables.
It is considered good practice to declare and initialise local variables as close to their use as possible. The worst approach would be:
int x = 3; // A comment that describes what x is used for
<lots of statements not using x>
<a statement that does use x>
The program would be easier to read if the information about x (its type, its initial value and the information in the comment) appeared closer to the use of x.
One temptation to be resisted is giving local variable identifiers long scopes and re-using them for several distinct purposes within that scope, e.g.
int i = 0;
for ( ; i < s.length(); i++) {
    <statements affecting String s>
}
<statements>
for ( ; i < a.length; i++) {
    <statements affecting array a>
}
Here, i is used in one part of the program to index a string, and in a later part of the program the same variable, i, is used to index an array. The problem with this is that you might forget to re-initialise i prior to the second loop, and you may cause confusion by using one identifier (i) with two purposes. Instead, declare i local to the first loop and then declare another i local to the second loop:
for (int i = 0; i < s.length(); i++) {
    <statements affecting String s>
}
<statements>
for (int i = 0; i < a.length; i++) {
    <statements affecting array a>
}
Even better, use two distinct and meaningful identifiers in place of i!
You can see that, in this example, three of our guidelines (using a variable for only one purpose, minimising identifier scope and minimising lifetimes) all coincide.
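A concrete version of the two-loop example, with a distinct identifier for each loop, might look like this sketch. The task (counting spaces in a string and summing an array) and all of the names are invented for illustration.

```java
class LoopScopes {
    static String summarise(String s, int[] a) {
        int spaceCount = 0;
        // charPos exists only inside this loop.
        for (int charPos = 0; charPos < s.length(); charPos++) {
            if (s.charAt(charPos) == ' ') {
                spaceCount++;
            }
        }
        // charPos is out of scope here, so it cannot be re-used by accident.

        // total is declared just before the loop that uses it.
        int total = 0;
        for (int elemPos = 0; elemPos < a.length; elemPos++) {
            total += a[elemPos];
        }
        return "spaces=" + spaceCount + ", total=" + total;
    }
}
```

Each loop counter is initialised in its own loop header, so there is no way to forget the re-initialisation.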
Visibility applies to instance and class variables and is determined by the access modifier used in the declaration (public, private, protected or blank). We won't say much in this web page on this subject since it is covered in detail in several lectures.
Here's a quick summary of the issues.
High visibility means low information hiding with the risk of greater coupling between different parts of the program. High coupling makes a program harder to understand (you can't understand one part without understanding the other parts to which it is coupled) and harder to maintain (because modifications will have knock-on effects).
You minimise visibility by adhering to the following advice.
Whenever you find that a method needs to store a piece of data, assume to start with that a local variable will suffice. Make it an instance or class variable only if necessary. Even then, make it private. Don't automatically supply public getters and setters: supply them only if necessary.
Related advice is to remember that the more parameters a method has, the higher the coupling between that method and any other part of the program that invokes it: more parameters increase the inter-dependency between the two, so changing the method may require changing the other part of the program. (This is not an invitation to use lots of instance variables and class variables, since the other main way that parts of a program can be coupled is if they 'communicate' through shared instance or class variables.)
Short lifetimes ensure that we make good use of memory resources.
The lifetime of a formal parameter is the execution of the method or constructor body. Good procedural abstraction can minimise formal parameter lifetimes.
The lifetime of a local variable is from the point where its declaration is encountered during program execution to the end of the block in which it was declared. Declaring variables close to their use can minimise local variable lifetimes.
Instance variables are encapsulated within objects. The object is created using new, and its memory cannot be reclaimed by the garbage collector until the object can no longer be reached through any reference or the program terminates. So, in programs that make heavy use of memory, make sure you nullify all references to objects as soon as the objects are no longer needed.
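As a rough sketch of nullifying a reference, suppose a long-lived object holds a large array that it needs only for part of its life. The ImageEditor class, its undo buffer and the buffer size are all hypothetical.

```java
class ImageEditor {
    // Hypothetical size, chosen only for illustration.
    private static final int BUFFER_SIZE = 1_000_000;

    // A large array that is needed only while undo is available.
    private int[] undoBuffer = new int[BUFFER_SIZE];

    boolean hasUndoBuffer() {
        return undoBuffer != null;
    }

    // Drops this object's reference to the array. If no other reference
    // to it exists, the array becomes eligible for garbage collection
    // even though the ImageEditor itself lives on.
    void discardUndoBuffer() {
        undoBuffer = null;
    }
}
```

Nullifying the reference only makes the array eligible for collection; when the garbage collector actually reclaims the memory is up to the Java virtual machine.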
Named constants in Java are variables that are declared static and final. By convention, their identifiers are written in uppercase, with underscores within them if necessary.
In your programs, you must never use magic numbers. If your code contains numeric constants, such as 60, 400, 52, 10000, etc., then these are magic numbers. Instead, include declarations such as the following:
private static final int MINS_PER_HOUR = 60;
private static final int SCREEN_WIDTH = 400;
private static final int DECK_SIZE = 52;
private static final int CATALOGUE_MAX_CAPACITY = 10000;
And use the identifiers MINS_PER_HOUR, SCREEN_WIDTH, DECK_SIZE and CATALOGUE_MAX_CAPACITY instead. (They can be public if you need to use them in other class definitions.)
Just about the only numbers that do not need to be replaced by named constants are 0, 1 and 2 when used in initialisation of for loop counters and the like.
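To illustrate, here is a short sketch that uses DECK_SIZE in place of the magic number 52. The Deck class, its methods and the SUIT_COUNT constant are assumptions made for this example.

```java
class Deck {
    private static final int DECK_SIZE = 52;
    private static final int SUIT_COUNT = 4;   // assumed for illustration

    // Creates a deck in which each card is represented by its position.
    static int[] newDeck() {
        int[] cards = new int[DECK_SIZE];
        // The literal 0 is acceptable here as a loop counter's initial value.
        for (int position = 0; position < DECK_SIZE; position++) {
            cards[position] = position;
        }
        return cards;
    }

    static int cardsPerSuit() {
        return DECK_SIZE / SUIT_COUNT;
    }
}
```

If the program were later adapted to a game with a different deck size, only the declaration of DECK_SIZE would need to change.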
We covered in lectures the advantages of using named constants: readability and modifiability.
Everything that has been said above about magic numbers and the use of named constants applies equally well to other values, especially strings. On the whole, you should avoid magic strings hard-coded into your programs. Instead, use a named constant. A good example of this was in the Time class, which used the following named constants:
private static final String MORNING_TEXT = "a.m.";
private static final String AFTERNOON_TEXT = "p.m.";
The program can print times such as "10:30 a.m.". But it is easily changed so that it can print "10:30 am" or "10:30 in the morning": we simply alter the value of the MORNING_TEXT variable. Converting software to a market whose customers speak a language other than English may be easier too.
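The sketch below shows how such constants might be used when formatting a time. It is not the Time class from the lectures: the TimeFormatter class, its format method and the HOURS_PER_HALF_DAY constant are assumptions made for illustration.

```java
class TimeFormatter {
    private static final String MORNING_TEXT = "a.m.";
    private static final String AFTERNOON_TEXT = "p.m.";
    private static final int HOURS_PER_HALF_DAY = 12;

    // Formats a time given on the 24-hour clock in 12-hour style.
    static String format(int hour24, int minute) {
        String suffix = (hour24 < HOURS_PER_HALF_DAY) ? MORNING_TEXT : AFTERNOON_TEXT;
        int hour12 = hour24 % HOURS_PER_HALF_DAY;
        if (hour12 == 0) {
            hour12 = HOURS_PER_HALF_DAY;   // midnight and noon are shown as 12
        }
        // Locale.ROOT keeps the digits the same whatever the default locale.
        return String.format(java.util.Locale.ROOT, "%d:%02d %s", hour12, minute, suffix);
    }
}
```

With these declarations, format(10, 30) gives "10:30 a.m."; changing the value of MORNING_TEXT to "am" would change every morning time printed, with no other edits to the code.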
However, it has to be said that most programmers wouldn't take this idea too far. Most print statements in a program would still be hard-coded strings rather than named constants.
(It might be worth noting, by way of a postscript to this section, that using named constants in Java does not worsen run-time efficiency. When a static final variable of a primitive type or of type String is initialised with a constant expression, the compiler replaces your constant identifier with the actual value at compile-time.)