Java Intermediate: Pervasive Shallowness in Java
Java Intermediate
This doc gathers miscellaneous intermediate Java topics -- things we need for CS108 but which you may
not have seen before in detail. The first part deals with many little topics, and the second part concentrates
on Java Collections: List, Map, and Set. The 2nd part has different typography, as I'm preparing it in
HTML to be published on the web.
• It is typical in Java to have a few objects, and copy pointers to those few objects all over the place. We
can afford to have pointers spread all over, since the GC figures out the deallocation for us.
• Can be a problem if we want to change an object, since many parts of the program may be pointing to it.
• This is not a problem if the object is "immutable" -- having no mutating methods, so it never changes
once built.
• Another solution is to make a copy, which is easy if the class provides a "copy constructor" (C++
terminology) -- a constructor which takes an existing object as an argument, and creates a new object
that is a copy. In general, making copies of things is something more often done in C++ than in Java.
I think this is because the GC allows us to avoid making copies in many situations.
Foo a = new Foo(1);
Foo copy = new Foo(a); // make a copy of a
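The difference between sharing a pointer and making a copy can be sketched as follows. The Foo class here is hypothetical (the handout does not show its definition), and its single int field and accessor names are assumptions made only for illustration.

```java
// Hypothetical Foo class; the field and method names are assumptions.
class Foo {
    private int value;

    Foo(int value) {
        this.value = value;
    }

    // Copy constructor: builds a new, independent object from an existing one.
    Foo(Foo other) {
        this.value = other.value;
    }

    int getValue() {
        return value;
    }

    void setValue(int value) {
        this.value = value;
    }
}

class CopyDemo {
    public static void main(String[] args) {
        Foo a = new Foo(1);
        Foo alias = a;           // second pointer to the same object
        Foo copy = new Foo(a);   // separate object with the same state

        a.setValue(2);
        System.out.println(alias.getValue()); // 2 -- alias sees the change
        System.out.println(copy.getValue());  // 1 -- copy is unaffected
    }
}
```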
Array Review
• Ok, first remember how arrays work in Java:
int[] a = new int[100]; // Allocate array in the heap, a points to the array
a[0] = 13;
a.length // access .length -- 100
a[100] = 13; // out of bounds exception
Arrays.equals(), deepEquals()
• The default "a.equals(b)" does not do a deep comparison for arrays, it just compares the two pointers.
This violates the design principle of least surprise, but we're stuck with it for now.
• Use the static equals() in the Arrays class -- Arrays.equals(a, b) -- this checks that 1-d arrays contain the
same elements, calling equals() on each pair of elements. For multi-dimensional arrays, use
Arrays.deepEquals() which recurs to check each dimension.
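A short sketch of the three comparisons described above, using small arrays made up for the example:

```java
import java.util.Arrays;

class ArrayEqualsDemo {
    public static void main(String[] args) {
        int[] a = {1, 2, 3};
        int[] b = {1, 2, 3};

        System.out.println(a.equals(b));         // false -- compares the two pointers
        System.out.println(Arrays.equals(a, b)); // true  -- compares the elements

        int[][] g = {{1, 2}, {3, 4}};
        int[][] h = {{1, 2}, {3, 4}};

        // Arrays.equals() calls equals() on each inner array, which is again
        // a pointer comparison, so separate 2-d arrays compare as different.
        System.out.println(Arrays.equals(g, h));     // false
        System.out.println(Arrays.deepEquals(g, h)); // true
    }
}
```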
Multidimensional Arrays
• An array with two or more dimensions is allocated like this...
- int[][] grid = new int[100][100]; // allocate a 100x100 array
• Specify two indexes to refer to each element -- the operation of the 2-d array is simple when both indexes
are specified.
- grid[0][1] = 10; // refer to (0,1) element
• Unlike C and C++, a 2-d java array is not allocated as a single block of memory. Instead, it is
implemented as a 1-d array of pointers to 1-d arrays. So a 10x20 grid has an "outer" 1-d array length
10, containing 10 pointers to 1-d arrays length 20. This detail is only evident if we omit the second
index -- mostly we don't need to do that.
int temp;
int[][] grid = new int[10][20]; // 10x20 2-d array
grid[0][0] = 1;
grid[9][19] = 2;
temp = grid.length; // 10
temp = grid[0].length; // 20
grid[0][9] = 13;
int[] array = grid[0]; // really it's just a 1-d array
temp = array.length; // 20
temp = array[9]; // 13
• Note that System.arraycopy() does not copy all of a 2-d array -- it just copies the pointers in the outer 1-d
array.
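The shallow behavior of System.arraycopy() on a 2-d array can be seen in a small experiment. The deepCopy() helper below is a hypothetical name introduced for this sketch; it shows one possible way to copy every row.

```java
class GridCopyDemo {
    // Hypothetical helper: copies a 2-d int array row by row,
    // so the result shares no rows with the source.
    static int[][] deepCopy(int[][] src) {
        int[][] dst = new int[src.length][];
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i].clone();   // clone() of a 1-d int array copies its elements
        }
        return dst;
    }

    public static void main(String[] args) {
        int[][] grid = new int[10][20];

        // Shallow: only the 10 row pointers of the outer array are copied.
        int[][] shallow = new int[10][];
        System.arraycopy(grid, 0, shallow, 0, grid.length);

        int[][] deep = deepCopy(grid);

        grid[0][9] = 13;
        System.out.println(shallow[0][9]); // 13 -- shares row 0 with grid
        System.out.println(deep[0][9]);    // 0  -- has its own row 0
    }
}
```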
Packages / Import
Java Packages
• Java classes are organized into "packages"
• In this way, a class "Account" in your ecommerce package will not conflict with a class also named
"Account" in the sales-tax computation package that you are using.
• Every java class has a "long" or "fully qualified" name that combines the class name and its package
• e.g. "String" full name is java.lang.String (that is, String is in the package "java.lang", the package that
contains the most central classes of the language).
• e.g. "ArrayList" full name is java.util.ArrayList
• If you need to find the package of a class, you can look at its javadoc page -- the fully qualified name is
at the top.
• In the compiled .class bytecode form of a class, the fully qualified name, e.g. java.lang.String, is used for
everything. The idea of a "short" human readable name is a convention that is only used in the .java
source files. All the later stages in the JVM use the full name.
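One way to see the fully qualified names from inside a program is to ask a class for its name, as in this small sketch:

```java
import java.util.ArrayList;

class FullNameDemo {
    public static void main(String[] args) {
        // getName() reports the fully qualified name of a class.
        System.out.println(String.class.getName());    // java.lang.String
        System.out.println(ArrayList.class.getName()); // java.util.ArrayList

        // The long name can also be written out directly in source code,
        // in which case no import is needed for it.
        java.util.List<String> names = new java.util.ArrayList<String>();
        names.add("hello");
        System.out.println(names.size()); // 1
    }
}
```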
Package Declaration
• A "package" declaration at the top of the .java source file indicates what package that class goes in
• package stanford.cslib; // statement at start of file
• If a package is not declared, the class goes into the one, catch-all "default" package. For simplicity, we
will put our own classes in the default package, but you still need to know what a package is.
"List" Example
• In the Java libraries, there are two classes with the name "List" -- java.util.List is a list data structure and
java.awt.List is a graphical list that shows a series of elements on screen.
• Could "import java.util.*", in which case "List" is the util one. Or could import "java.awt.*", in which
case "List" is the awt one. If both imports are used, then the word "List" is ambiguous, and we must
spell it out in the full "java.util.List" form.
• In any case, the generated .class files always use the long java.util.List form in the bytecode.
• Compiling and referring to java.util.List does not link the java.util.List code into our bytecode. The
java.util.List bytecode is brought in by the JVM at runtime. This is why your compiled Java code,
which uses many standard classes, is still tiny -- your code just has references to the classes, not
copies of them.
Static
• Instance variables (ivars) or methods in a class may be declared "static".
• Regular ivars and methods are associated with objects of the class.
• Static variables and methods are not associated with an object of the class. Instead, they are associated
with the class itself.
Static variable
• A static variable is like a global variable, except it exists inside of a class.
• There is a single copy of the static variable inside the class. In contrast, each regular instance variable
exists many times -- one copy inside each object of the class.
• Static variables are rare compared to ordinary instance variables.
• The full name of a static variable includes the name of its class.
- So a static variable named "count" in the Student class would be referred to as "Student.count".
Within the class, the static variable can be referred to by its short name, such as "count", but I
prefer to write it the long way, "Student.count", to emphasize to the reader that the variable is
static.
• e.g. "System.out" is a static variable in the System class that represents standard output.
• Monster Example -- Suppose you are implementing the game Doom. You have a Monster class that
represents the monsters that run around in the game. Each monster object needs access to a "roar"
variable that holds the sound "roar.mp3" so the monster can play that sound at the right moment. With
a regular instance variable, each monster would have their own "roar" variable. Instead, the Monster
class contains a static Monster.roar variable, and all the monster objects share that one variable.
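A minimal sketch of the Monster idea. The "name" field, the "count" variable, and the describe() method are assumptions added for illustration; a real game would hold a loaded sound object rather than a file name string.

```java
class Monster {
    // Static: a single copy shared by the whole class.
    static String roar = "roar.mp3";
    // Also static: counts how many monsters have been created.
    static int count = 0;

    // Regular instance variable: one copy inside each monster object.
    private String name;

    Monster(String name) {
        this.name = name;
        Monster.count++;   // long form emphasizes that count is static
    }

    String describe() {
        return name + " plays " + Monster.roar;
    }
}

class MonsterDemo {
    public static void main(String[] args) {
        Monster a = new Monster("imp");
        Monster b = new Monster("demon");
        System.out.println(a.describe());   // imp plays roar.mp3
        System.out.println(b.describe());   // demon plays roar.mp3
        System.out.println(Monster.count);  // 2
    }
}
```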
Static method
• A static method is like a regular C function that is defined inside a class.
• A static method does not execute against a receiver object. Instead, it is like a plain C function -- it
can have parameters, but there is no receiver object.
• Just like static variables, the full name of a static method includes the name of its class, so a static foo()
method in the Student class is called Student.foo().
• The Math class contains the common math functions, such as max(), sin(), cos(), etc. These are defined
as static methods in the Math class. Their full names are Math.max(), Math.sin(), and so on.
Math.max() takes two ints and returns the larger, called like this: Math.max(i, j)
• A "static int getCount() {…" method in the Student class is invoked as Student.getCount();
• In contrast, a regular method in the Student class would be invoked with a message send (aka a method
call) on a Student object receiver like s.getStress(); where s points to a Student object.
• The method "static void main(String[] args)" is special. To run a java program, you specify the name of a
class. The Java virtual machine (JVM) then starts the program by running the static main() function in
that class, and the String[] array represents the command-line arguments.
• It is better to call a static method like this: Student.foo(), NOT s.foo(); where s points to a Student object,
although both syntaxes work.
- s.foo() compiles fine, but it discards "s" as a receiver, using its compile time type to determine
which class to use and translating the call to the Student.foo() form. The s.foo() syntax is
misleading, since it makes it look like a regular method call.
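The two calling styles can be put side by side in a short sketch. The Student class below is a guess at what such a class might look like; only the names getCount() and getStress() come from the text.

```java
class Student {
    private static int count = 0;   // shared by all students
    private int stress;             // one per student object

    Student(int stress) {
        this.stress = stress;
        Student.count++;
    }

    // Static method: no receiver, so it works only with static state and parameters.
    static int getCount() {
        return Student.count;
    }

    // Regular method: runs against a receiver object.
    int getStress() {
        return stress;
    }
}

class StudentDemo {
    public static void main(String[] args) {
        Student s = new Student(10);
        System.out.println(Student.getCount()); // called on the class; 1 here
        System.out.println(s.getStress());      // called on an object; 10
        System.out.println(Math.max(3, 7));     // 7
    }
}
```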
Files
File Reading
• Java uses input and output "stream" classes for file reading and writing -- the stream objects respond to
read() and write(), and communicate back to the file system. InputStream and OutputStream are the
fundamental superclasses.
• The stream objects can be layered together to get the overall effect -- e.g. wrapping a FileInputStream
inside a BufferedInputStream to read from a file with buffering. This scheme is flexible but a bit
cumbersome.
• The classes with "reader" or "writer" in the name deal with text files
- FileReader -- knows how to read text chars from a file
- BufferedReader -- buffers the text and makes it available line-by-line
• For non-text data files (such as jpeg, png, mp3) use FileInputStream, FileOutputStream,
BufferedInputStream, BufferedOutputStream -- these treat the file as a block of raw bytes.
• You can specify a unicode encoding to be used by the readers and writers -- defines the translation
between the bytes of the file and the 2-byte unicode encoding of Java chars.
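As an illustration of layering streams and naming an encoding, the sketch below wraps a byte stream in an InputStreamReader with an explicit UTF-8 charset, then in a BufferedReader. The readLines() helper and the "file.txt" name are assumptions made for this example, not part of the handout.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class EncodingDemo {
    // Hypothetical helper: reads every line from a byte stream, decoding as UTF-8.
    static List<String> readLines(InputStream bytes) {
        List<String> lines = new ArrayList<String>();
        try {
            // Layering: raw bytes -> chars (explicit encoding) -> buffered lines.
            BufferedReader in = new BufferedReader(
                new InputStreamReader(bytes, StandardCharsets.UTF_8));
            while (true) {
                String line = in.readLine();   // null signals end of input
                if (line == null) {
                    break;
                }
                lines.add(line);
            }
            in.close();   // also closes the wrapped streams
        }
        catch (IOException except) {
            except.printStackTrace();
        }
        return lines;
    }

    public static void main(String[] args) {
        try {
            // "file.txt" is a placeholder file name.
            List<String> lines = readLines(new FileInputStream("file.txt"));
            for (int i = 0; i < lines.size(); i++) {
                System.out.println(lines.get(i));
            }
        }
        catch (FileNotFoundException except) {
            except.printStackTrace();
        }
    }
}
```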
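try {
    BufferedReader in = new BufferedReader(new FileReader("file.txt"));
    while (true) {
        String line = in.readLine();
        if (line == null) {
            break;
        }
        System.out.println(line);   // process the line (here: just print it)
    }
    in.close();
}
catch (IOException except) {
    // The code above jumps to here on an IOException,
    // otherwise this code does not run.
    except.printStackTrace();
}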
Exceptions
This is a basic introduction to exceptions. An exception occurs at runtime when a line of code tries to do
something impossible such as accessing an array using an index number that is out of bounds of the array
or dereferencing a pointer that is null.
An exception halts the normal progress of the code and searches for some error handling code that matches
the exception. Most often, the error handling code will print some sort of warning message and then
possibly exit the program, although it could take some more sophisticated corrective action.
Java uses a "try/catch" structure to position error-handling code to be used in the event of an exception. The
main code to run goes in a "try" section, and it runs normally. If any line in the try section hits an exception
at runtime, the program looks for a "catch-block" section for that type of exception. The normal flow of
execution jumps from the point of the exception to the code in the catch-block. The lines immediately
following the point of the exception are never executed.
try {
    stmt();    // normal code progress
    stmt();
    stmt();    // on exception, execution jumps to the catch block
    stmt();
}
catch (Exception ex) {
    ex.printStackTrace();
    System.exit(1);
}
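The jump from the point of the exception to the catch-block can be made visible with a small experiment. This sketch records which steps ran; the class and method names are made up for the example.

```java
class TryCatchDemo {
    // Returns a log of which steps ran, to make the jump visible.
    static String run() {
        StringBuilder log = new StringBuilder();
        int[] a = new int[3];
        try {
            log.append("before ");
            a[3] = 13;              // index 3 is out of bounds: throws here
            log.append("after ");   // never runs
        }
        catch (ArrayIndexOutOfBoundsException ex) {
            log.append("caught ");
        }
        log.append("done");         // execution continues after the try/catch
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(run()); // before caught done
    }
}
```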
For the file-reading code, some of the file operations such as creating the FileReader, or calling the
readLine() method can fail at runtime with an IOException. For example, creating the FileReader could fail
if there is no file named "file.txt" in the program directory. The readLine() could fail if, say, the file is on a
CD ROM, our code is halfway through reading the file, and at that moment our pet parrot hits the eject
button and flies off with the CD. The readLine() will soon throw an IOException since the file has
disappeared midway through reading the file.
The above file-reading code uses a simple try/catch pattern for exception handling. All the file-reading
code goes inside the "try" section. It is followed by a single catch-block for the possible IOException. The
catch prints an error message using the built-in method printStackTrace(). The "stack trace" will list the
exception at the top, followed by the method-file-line where it occurred, followed by the stack of earlier
methods that called the method that failed.
It is possible for an exception to propagate out of the original method to be caught in a try/catch in one of
its caller methods; however, we will always position the try/catch in the same method where the exception
first appears.
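For completeness, here is a small sketch of the propagating case: the exception is thrown in one method and caught in its caller. The method names are hypothetical, and NumberFormatException (thrown by Integer.parseInt() on bad input) is used only because it is easy to trigger.

```java
class PropagateDemo {
    // No try/catch here, so an exception leaves this method
    // and travels up to whoever called it.
    static int parse(String text) {
        return Integer.parseInt(text);   // throws NumberFormatException on bad input
    }

    static int parseOrZero(String text) {
        try {
            return parse(text);
        }
        catch (NumberFormatException ex) {
            // Caught in the caller, one level up from where it was thrown.
            return 0;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrZero("42"));  // 42
        System.out.println(parseOrZero("12x")); // 0
    }
}
```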
When your program crashes with an exception, if you are lucky you will see the exception stack trace
output. The stack trace is a little cryptic, but it has very useful information in it for debugging. In the
example stack trace below, the method hide() in the Foo class has failed with a NullPointerException. The
offending line was line 83 in the file Foo.java. The hide() method was called by main() in FooClient on line
23.
java.lang.NullPointerException
at Foo.hide(Foo.java:83)
at FooClient.main(FooClient.java:23)
In production code, the catch will often exit the whole program, using a non-zero int exit code to indicate a
program fault (e.g. call System.exit(1)). Alternately, the program could try to take corrective action in the
catch-block to address the situation. Avoid leaving the catch empty -- that can make debugging very hard
since when the error happens at runtime, an empty catch consumes the exception but does not give any
indication that an exception happened. As a simple default strategy, put a printStackTrace() in the catch so
you get an indication of what happened. If no exception occurs during the run, the catch-block is ignored.
In Java code, if there is a method call, such as in.readLine() above, that can throw an exception, then the
compiler will insist that the code deal with the exception, typically with a try/catch block. This can be
annoying, since the compiler forces you to put in a try/catch when you don't want to think about that case.
However, this strict structure is one of the things that makes Java code so reliable in production. Aside:
some exceptions such as NullPointerException or ArrayIndexOutOfBoundsException are so common that almost any line
of code can trigger them. These common exceptions are called "unchecked" exceptions, and code is not
Three Uses
• Here are the three use patterns of generics that I think work for most situations (examples below)…
• 1. Using a template class, like using ArrayList<String>
• 2. Writing template code with a simple <T> or <?> type parameter
• 3. Writing template code with a <T extends Foo> type parameter
• The advantage is that the compiler knows that the add() and get() methods take and return Strings, and so
it can do the right type checking at compile time. We do not have to put in the (String) cast. The code
that uses the list is now more readable. As a benefit of compile time typing, Eclipse code-assist now
knows that strings.get(0) is a String and so can do code completion.
• The type of the iterator -- e.g. Iterator<String> -- must match the type of the collection.
• The plain type "List" without any generic type is known as the "raw" version. Raw versions still work in
Java, and they essentially just contain pointers of type Object. You can assign back and forth between
generic and raw versions, and it works, although it may give a warning.
- e.g. Can store a List<String> pointer in a variable just of type List. This works, but gives a
warning that it's better to keep it in List<String> the whole time.
• At runtime, all the casts are checked in any case, so if the wrong sort of object gets into a List<String> it
will be noticed at runtime. This is why Java lets us mix between List<String> and raw List with just a
warning-- it's all being checked at runtime anyway as a last resort.
• Eclipse tip: with the cursor where the <String> would go, hit ctrl-space. If Eclipse can deduce from
context that a <String> or whatever is required, it puts it in for you, yay!
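The points above can be illustrated with a short sketch of use 1, a plain ArrayList<String>. The variable name "strings" follows the text; everything else is made up for the example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class GenericListDemo {
    // Joins the strings using an explicit, typed iterator.
    static String join(List<String> strings) {
        StringBuilder result = new StringBuilder();
        Iterator<String> it = strings.iterator();  // iterator type matches the list type
        while (it.hasNext()) {
            String s = it.next();                  // no (String) cast needed
            result.append(s);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<String>();
        strings.add("hi");
        strings.add("there");

        String first = strings.get(0);             // compiler knows this is a String
        System.out.println(first.length());        // 2
        System.out.println(join(strings));         // hithere

        // strings.add(3);   // would be a compile-time error: 3 is not a String
    }
}
```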
Boxing/Unboxing
• Normally, you cannot store an int or a boolean in an ArrayList, since it can only store pointers to objects.
You cannot create an ArrayList<int>, but you can create an ArrayList<Integer>.
• With Java 5 "auto boxing", when the code needs an Integer but has an int, it automatically creates the
Integer on the fly, without requiring anything in the source code. Going the other direction, auto
unboxing, if the code has an Integer but needs an int, it automatically calls intValue() on the Integer to
get the int value.
• This works for all the primitives -- int, char, double, boolean, …
• This works especially well with generic collections, taking advantage of the fact that the collection type
shows that it needs Integer or Boolean or whatever.
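A small sketch of boxing and unboxing with a generic collection; the sum() helper is a made-up name for the example.

```java
import java.util.ArrayList;
import java.util.List;

class BoxingDemo {
    static int sum(List<Integer> nums) {
        int total = 0;
        for (int i = 0; i < nums.size(); i++) {
            total += nums.get(i);     // auto unboxing: Integer -> int
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> nums = new ArrayList<Integer>();
        nums.add(1);                  // auto boxing: int -> Integer
        nums.add(2);
        nums.add(Integer.valueOf(3)); // the same thing, written out explicitly

        int first = nums.get(0);      // auto unboxing on assignment
        System.out.println(first);     // 1
        System.out.println(sum(nums)); // 6
    }
}
```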
• This is a shorthand for looping over the elements with an iterator that calls hasNext()/next() in the usual
way. The fancy iterator features -- such as remove() -- are not available.
• Nonetheless, this syntax is very handy for the very common case of iterating over any sort of collection.
• Design Lesson: if the clients of a system perform a particular operation very commonly (in this case,
iterating over all the elements in a collection) -- it's a good design idea to have a simple facility that
makes that common case very easy. It's ok if the facility does not handle more advanced uses. It can be
simple and focus just on the common case. Putting in such a special-purpose facility is, in a way,
inelegant -- it creates more than one way to do things. However, experience shows that making
common cases very easy is a good design idea.
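Assuming the shorthand discussed above is the Java 5 "for-each" loop, the sketch below shows it next to the equivalent iterator loop, plus a case that still needs the explicit iterator because it calls remove(). The method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class ForEachDemo {
    // For-each form: simple, covers the common case.
    static int totalLength(List<String> strings) {
        int total = 0;
        for (String s : strings) {
            total += s.length();
        }
        return total;
    }

    // Equivalent explicit form using hasNext()/next().
    static int totalLengthWithIterator(List<String> strings) {
        int total = 0;
        Iterator<String> it = strings.iterator();
        while (it.hasNext()) {
            total += it.next().length();
        }
        return total;
    }

    // Removing during iteration needs the explicit iterator.
    static void removeEmpty(List<String> strings) {
        Iterator<String> it = strings.iterator();
        while (it.hasNext()) {
            if (it.next().isEmpty()) {
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<String>();
        strings.add("a");
        strings.add("");
        strings.add("bcd");
        System.out.println(totalLength(strings));             // 4
        System.out.println(totalLengthWithIterator(strings)); // 4
        removeEmpty(strings);
        System.out.println(strings.size());                   // 2
    }
}
```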
// Can go back and forth between typed Collections and untyped "raw"
// forms -- may get a warning.
List<String> genList = new ArrayList(); // warning
List rawList = new ArrayList<String>(); // no warning
rawList.add("hello"); // warning
genList = rawList; // warning
rawList = genList; // no warning
public class Pair<T> {
    private T a;
    private T b;
    public Pair(T a, T b) {
        this.a = a;
        this.b = b;
    }
    public T getA() {
        return a;
    }
    public T getB() {
        return b;
    }
}
• We can use the list inside bar(), but we cannot assume anything about what type it contains -- we don't
even have a name for the type it contains. Note that there is no <?> at the start of the method. The <?>
only appears in the parameter list.
• We can make local generic variables that mention the <?>, such as:
List<?> temp = list;
Iterator<?> it = list.iterator();
• I like the <?> variant to encode true "I don't care" cases. It has the simplest syntax and the fewest
features.
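A method in the spirit of the bar() discussed above might look like the following sketch. The name countNonNull() is hypothetical; the point is that the <?> appears only in the parameter list and the elements can be treated only as Object.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class WildcardDemo {
    // Accepts a list of any element type. Inside the method the
    // elements are usable only as plain Object.
    static int countNonNull(List<?> list) {
        int count = 0;
        Iterator<?> it = list.iterator();
        while (it.hasNext()) {
            Object elem = it.next();
            if (elem != null) {
                count++;
            }
        }
        // list.add("x");   // would not compile: the element type is unknown
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countNonNull(Arrays.asList("a", null, "b"))); // 2
        System.out.println(countNonNull(Arrays.asList(1, 2, 3)));        // 3
    }
}
```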
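public class PairNumber<T extends Number> { // bound assumed from the class name
    private T a;
    private T b;
    public PairNumber(T a, T b) {
        this.a = a;
        this.b = b;
    }
}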
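The third use pattern, a <T extends Foo> type parameter, can also be applied to a single method. The sketch below assumes Number as the bound, which lets the code call Number methods such as doubleValue() on every element; the sum() method is a made-up example, not code from the handout.

```java
import java.util.Arrays;
import java.util.List;

class BoundedDemo {
    // T may be Integer, Double, or any other subclass of Number,
    // so doubleValue() is available on each element.
    static <T extends Number> double sum(List<T> nums) {
        double total = 0;
        for (T n : nums) {
            total += n.doubleValue();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1, 2, 3)));  // 6.0
        System.out.println(sum(Arrays.asList(1.5, 2.5))); // 4.0
        // sum(Arrays.asList("a", "b"));   // would not compile: String is not a Number
    }
}
```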
Container Anomaly
• Why must we use List<T> for a list of any type, rather than just List<Object>?
• There is a basic problem between the type of a container and the type of thing it contains.
• A pointer to a String (the subclass) can be stored in a variable of type Object (the superclass). We do that
all the time.
• However, a pointer to a List<String> cannot be stored in a variable type List<Object> -- this is
unintuitive, but it is a real constraint. The reason for this constraint is demonstrated below.
// The point: you can assign a pointer of type sub to a pointer of type
// super. However, you cannot assign a pointer type container(sub) to a
// pointer of type container(super).
// Therefore Collection<Object> will not work as a sort of catch-all
// type for any sort of collection.
}
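The container anomaly can be sketched as follows. The lines that the compiler rejects are left as comments so the example still compiles; the firstOf() helper is a hypothetical name showing that List<?> is the form that accepts any list.

```java
import java.util.ArrayList;
import java.util.List;

class AnomalyDemo {
    // List<?> accepts a list of any element type.
    static Object firstOf(List<?> list) {
        return list.get(0);
    }

    public static void main(String[] args) {
        String s = "hello";
        Object o = s;                        // fine: sub stored in a super variable

        List<String> strings = new ArrayList<String>();
        strings.add(s);

        // List<Object> objects = strings;   // compile error, and for good reason:
        // objects.add(new Object());        // this would put a non-String...
        // String t = strings.get(1);        // ...into a List<String>

        System.out.println(firstOf(strings)); // hello
        System.out.println(o == s);           // true -- same object, two variable types
    }
}
```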