Copy Link
Add to Bookmark
Report

Java Newsletter by Glen McCluskey - Issue 4

eZine's profile picture
Published in 
Java Newsletter
 · 2 years ago

Issue #004
April, 1996

Contents

  • Random Numbers in Java
  • Comparing C/C++ with Java Part 4 - Templates
  • Book Review
  • Java Program Packaging Part 2 - Public/Private
  • Performance - String Operations

RANDOM NUMBERS IN JAVA

Random numbers are useful in many contexts in programming. One common use is in games, and another is for testing program algorithms such as those used in searching or sorting.

Java has random number support in the library class java.util.Random.
Basic use is:

        import java.util.*; 

Random rn = new Random();

...

int r = rn.nextInt(); // 32-bit random number

double d = rn.nextDouble(); // random value in range 0.0 - 1.0


The random number generation algorithm is based on the linear congruential method (see for example Chapter 3 in Volume 2 of Knuth's "Art of Computer Programming"). This method has a starting value (a "seed"), and each value of the sequence is generated from the last. Whether such a stream of random numbers are in fact truly random is an interesting question beyond the scope of this discussion. If you're interested in that question you might consult Knuth's book.

If the default Random() constructor is used, the random number generator is seeded from the current time of day. You can also specify your own seed:

        Random rn = new Random(1234);


to generate a repeatable sequence of random numbers.

To show how random numbers might be used, here is a class Test that will generate random numbers in a range using the rand() method, or generate random strings composed of characters 'a' ... 'z'.

        import java.util.*; 

public class Test {

private static Random rn = new Random();

private Test()
{
}

public static int rand(int lo, int hi)
{
int n = hi - lo + 1;
int i = rn.nextInt() % n;
if (i < 0)
i = -i;
return lo + i;
}

public static String randomstring(int lo, int hi)
{
int n = rand(lo, hi);
byte b[] = new byte[n];
for (int i = 0; i < n; i++)
b[i] = (byte)rand('a', 'z');
return new String(b, 0);
}

public static String randomstring()
{
return randomstring(5, 25);
}
}


Actual random numbers are obtained using nextInt(), and then knocked down to the relevant range using the modulo ("%") operator.

Note that Test is simply a packaging vehicle with all of the methods as class methods ("static"). To obtain a random String of between 10 and 40 characters, you would say:

        String s = Test.randomstring(10, 40);

COMPARING C/C++ WITH JAVA PART 4 - TEMPLATES

Suppose that you have a vector of numbers or strings or other objects that you'd like to sort. How would you do this? One way is to devise a specialized sorting function unique to your program, that is, a function that simply sorts whatever vector is at hand.

This is fine but it's wasteful to have to repeatedly implement the same sorting algorithm for slightly differing situations across a variety of programs. To help solve this problem, C has a standard library function that implements the Quicksort algorithm:

        void qsort(void* base, size_t n, size_t size, 
int (*cmp)(const void*, const void*));


The idea is that you pass in a pointer to the base of a data structure, along with the number of elements in the structure and the size of each. You also pass in a function pointer. The function is called with pairs of elements and it returns < 0, 0, or > 0 to specify ordering of the elements. This approach is fairly low-level but works pretty well. C++ also has access to qsort().

In C++ there is also the notion of function templates, a function that can be parameterized with a type parameter so that the same function skeleton can operate on different types. For example, a simple sort might look like:

        template <class T> void sort(T vec[], int n) 
{
for (int i = 0; i < n - 1; i++) {
for (int j = i + 1; j < n; j++) {
if (vec[i] > vec[j]) {
T t = vec[i];
vec[i] = vec[j];
vec[j] = t;
}
}
}
}


This will work for all numeric types and for any class types for which relational operators, initialization, and assignment are defined. Note that the sorting algorithm in this example has running time proportional to N**2 and we would want to use a more efficient one in production code.

Java does not have templates or user-visible pointers or sizeof(), so the above approaches will not work. One possible alternative would be to exploit the idea of all Java class types being derived from a single Object base. This means that one can have a vector of Object references containing actual Objects, Strings, numerics (using wrappers), and so on. A reference to any object of a derived class type can be assigned to an Object reference. For example, I can say:

        Object vec[] = new Object[10]; 

for (int i = 0; i < 10; i++)
vec[i] = new Integer(i);


to represent the numbers 0-9 in a vector. The Vector class in the Java library supports this type of usage.

But representing numbers or Strings or other entities in this way doesn't solve the sorting problem. Object is the base of all class types, and it contains an equals() method for comparing two objects for equality, but there is no method for determining the ordering of two objects. For example, given two slots of a Vector, containing references to the objects "Integer(37)" and "Integer(47)", there is no direct way of ordering these.

What can be done is to interrogate the actual type of the object and obtain its value indirectly:

        if (vec[5] instanceof Integer) 
value = ((Integer)vec[5]).intValue();


"instanceof" is an operator that returns true if the object on its left side is an instance of the class found on the right side.

Using a technique like this, it would be possible to write a method that accepted a vector of Object references, and sort that vector by querying and extracting the numbers in it and exchanging their references in the slots of the vector. The Object wrapper approach is quite flexible and general but at the expense of speed and space.

Another approach is to special-case common types like int and String, and add an interface with a name like Orderable for ordering arbitrary class type objects. A class that implements Orderable is guaranteed to define a method compareTo() that will order object instances. The types int, String, Orderable, and Object can be supported within a single List class using overloaded methods for adding new elements to the list. Actual elements such as ints can be directly represented using an int[] vector that grows as new elements are added, and operations like binary searching, sorting, and binary trees can be implemented efficiently. This approach is ugly on the inside but presents a fairly clean interface to the user.

It is possible to simulate some of the features of C++ templates by using macros such as #define in C/C++, or macro processing tools like the tool "m4" available in UNIX. Java has no macro processing, though one could add auxiliary tools if desired.

Are templates very important? This is a hard question to answer. In C++, they are used heavily, for example in STL, the Standard Template Library. In certain situations they solve a messy problem in an elegant way. But C++ templates are also very complex and hard to implement and master, and so it's possible to debate the issue without reaching any definite conclusions. There's something to be said for simpler language features, even if the result is more verbose code.


BOOK REVIEW

An excellent all-in-one volume on Java is the book "Java in a Nutshell" by David Flanagan (O'Reilly & Associates, $15). It's about 400 pages and covers both the language and libraries, with many examples and a comprehensive index.

The book has a description of each library class, such as String or Vector, and presents the interface of each in a handy condensed form. It also covers I/O, graphics, networking, and development tools.

The book is handy to keep around for quick reference. Thanks to Herb Weiner for the suggestion.


JAVA PROGRAM PACKAGING PART 2 - PUBLIC/PRIVATE

In the last issue we talked about CLASSPATH and Java packages. In this and subsequent issues, we'll be discussing visibility specifiers for instance (per object) and class (shared across all object instances) variables and methods.

The first two of these specifiers are "public" and "private". Specifying public means that the variable or method is accessible everywhere, and is inherited by any subclasses that extend from the class. "Everywhere" means subclasses of the class in question and other classes in the same or other packages. For example:

        // file A.java 

public class A {
public void f() {}
public static int x = 37;
}

// file B.java

public class B {
public static void main(String args[])
{
A a = new A();
a.f(); // calls public method
A.x = -19; // sets static variable in A
}
}


By contrast, "private" means that no other class anywhere can access a method or variable. For example:

        // file A.java 

public class A {
private void f() {}
private static int x = 37;
}

// file B.java

public class B {
public static void main(String args[])
{
A a = new A();
a.f(); // illegal
A.x = -19; // illegal
}
}


Private instance variables are not inherited. This means something slightly different in Java than in C++. In both languages private data members are in fact part of any derived class, but in Java the term "not inherited" in reference to private variables does double duty to mean "not accessible".

One crude way of figuring out just how much space an object instance requires is to use a technique like this one, where the amount of free memory is saved, many object instances are allocated, and then a calculation is done to determine the number of bytes per object instance:

        class x1 { 
private double d1;
private double d2;
private double d3;
private double d4;
private double d5;
}

public class x2 extends x1 {
public static void main(String args[])
{
int N = 10000;
x2 vec[] = new x2[N];

long start_mem = Runtime.getRuntime().freeMemory();

for (int i = 0; i < N; i++)
vec[i] = new x2();

long curr_mem = Runtime.getRuntime().freeMemory();

long m = (start_mem - curr_mem) / N;

System.out.println("memory used per object = " + m);
}
}


This technique is not without its pitfalls (notably issues related to garbage collection), but sometimes can provide useful information about object sizes.

In future issues we will be talking about other kinds of visibility, such as the default visibility level and "protected".


PERFORMANCE - STRING OPERATIONS

Java is a young language and it's hard to say for sure just how some of the performance aspects of the language will shake out. But we will offer from time to time some performance tips that may be useful. Many performance techniques apply in any programming language.

Suppose that you want to create and print a list of numbers, like so:

        public class test1 { 
public static void main(String args[])
{
String s = "[";

for (int i = 1; i <= 2000; i++) {
if (i > 1)
s += ",";
s += Integer.toString(i, 10);
}

s += "]";

System.out.println(s);
}
}


This works OK but seems kind of slow. It takes about 13 seconds using JDK 1.0 on a Pentium. When we profile the program, we find that it seems to be doing lots of data shuffling and so forth, with the garbage collector called 40 times.

It turns out that using "+=" on Strings is quite expensive, and the reason is that Strings themselves are immutable, that is, are not changed after creation. To append to a String you must copy out the String to a StringBuffer and append to it and then convert it back. StringBuffers are used for doing operations on Strings, like "+" and "+=".

This idea can be illustrated by the following example:

        public class test2 { 
public static void main(String args[])
{
String s = "aaa"; // sequence #1
s += "bbb";

System.out.println(s);

String ss = "ccc"; // sequence #2
StringBuffer sb = new StringBuffer();
sb.append(ss);
sb.append("ddd");
String ss_save = ss;
ss = sb.toString();

System.out.println(ss_save);
System.out.println(ss);
}
}


These two sequences are equivalent; using "+=" causes the processing shown in sequence #2 (except for the "ss_save" line).

Note that we captured the old value of ss before ss was changed to point to a new String. The old String didn't change when we reassigned ss, we just changed a reference that pointed at it to point at a new String.

Going back to the original example, we can rewrite it as:

        public class test3 { 
public static void main(String args[])
{
StringBuffer sb = new StringBuffer();

sb.append("[");

for (int i = 1; i <= 2000; i++) {
if (i > 1)
sb.append(",");
sb.append(Integer.toString(i, 10));
}

sb.append("]");

String s = sb.toString();

System.out.println(s);
}
}


resulting in about a 6X speedup.

A useful tool for performance tuning is the Java profiler, which can help you find bottlenecks in your code. Using JDK 1.0, one can say:

        $ javac test.java 

$ java -prof test


resulting in a file "java.prof" being written. Lines of this file contain a count of the number of calls, the name of a called method, the name of the method doing the calling, and a time in milliseconds.

Using an Awk script such as that shown below, you can summarize this file into a form similar to prof(1) output for UNIX:

        100.0% time = 13.07 seconds 
36.0 14895 java/lang/StringBuffer.ensureCapacity(I)V
30.9 18002 java/lang/System.arraycopy(Ljava/lang/Object; ...
15.4 40 java/lang/System.gc()V
3.0 2001 java/lang/Integer.toString(II)Ljava/lang/String;
2.8 8002 java/lang/StringBuffer.append(Ljava/lang/String ...
2.0 1 t2.main([Ljava/lang/String;)V
1.5 4001 java/lang/String.<init>(Ljava/lang/StringBuffer;)V
1.3 6893 java/lang/StringBuffer.append(C)Ljava/lang ...
1.3 6002 java/lang/StringBuffer.<init>(I)V


I was told by someone at Sun that the format of the java.prof file will change in a future JDK release, so be careful when using this script with such releases.

        #!/bin/sh 

awk '
$1 == "#" && $2 == "count" {
flag = 1;
next;
}
$1 == "#" && $2 != "count" {
flag = 0;
next;
}
{
if (flag) {
n++;
data[n,1] = $1;
data[n,2] = $2;
data[n,3] = $3;
data[n,4] = $4;
}
}
END {
for (i = 1; i <= n; i++) {
cnt[data[i,2]] += data[i,1];
tim[data[i,2]] += data[i,4];
}
for (i in tim) {
for (j = 1; j <= n; j++) {
if (i == data[j,3])
tim[i] -= data[j,4];
}
total += tim[i];
}
printf "%5.1f%% time = %.2f seconds\n",
100.0, total / 1000.0;
for (i in cnt) {
printf "%5.1f %7ld %s\n",
tim[i] * 100.0 / total, cnt[i], i;
}
}
' <java.prof | sort -rn

exit 0

ACKNOWLEDGEMENTS

Thanks to Thierry Ciot, Irv Kanode, Mike Paluka, and Alan Saldanha for help with proofreading.


SUBSCRIPTION INFORMATION / BACK ISSUES

To subscribe to the newsletter, send mail to majordomo@world.std.com with this line as its message body:

subscribe java_letter

Back issues are available via FTP from:

rmii.com /pub2/glenm/javalett

or on the Web at:

http://www.rmii.com/~glenm

There is also a C++ newsletter. To subscribe to it, say:

subscribe c_plus_plus

using the same majordomo@world.std.com address.

-------------------------

Copyright (c) 1996 Glen McCluskey. All Rights Reserved.

This newsletter may be further distributed provided that it is copied in its entirety, including the newsletter number at the top and the copyright and contact information at the bottom.

Glen McCluskey & Associates
Professional Computer Consulting
Internet: glenm@glenmccl.com
Phone: (800) 722-1613 or (970) 490-2462
Fax: (970) 490-2463
FTP: rmii.com /pub2/glenm/javalett (for back issues)
Web: http://www.rmii.com/~glenm

← previous
next →
loading
sending ...
New to Neperos ? Sign Up for free
download Neperos App from Google Play
install Neperos as PWA

Let's discover also

Recent Articles

Recent Comments

Neperos cookies
This website uses cookies to store your preferences and improve the service. Cookies authorization will allow me and / or my partners to process personal data such as browsing behaviour.

By pressing OK you agree to the Terms of Service and acknowledge the Privacy Policy

By pressing REJECT you will be able to continue to use Neperos (like read articles or write comments) but some important cookies will not be set. This may affect certain features and functions of the platform.
OK
REJECT