Breaking News

Thursday, 10 May 2012

Security

Concepts and Practices
Before analysing specific attacks and how to protect against them, it is necessary to have a foundation in some basic principles of Web application security. These principles are not difficult to grasp, but they require a particular mindset about data; simply put, a security-conscious mindset assumes that all data received as input is tainted, and that this data must be filtered before use and escaped when leaving the application. Understanding and practising these concepts is essential to ensure the security of your applications.


All Input Is Tainted
Perhaps the most important concept in any transaction is that of trust. Do you trust the data being processed? Can you? This answer is easy if you know the origin of the data. In short, if the data originates from a foreign source such as user form input, the query string, or even an RSS feed, it cannot be trusted. It is tainted data.
Data from these sources, and many others, is tainted because it is not certain whether it contains characters that might be executed in the wrong context. For example, a query string value might contain data that was manipulated by a user to contain JavaScript that, when echoed to a Web browser, could have harmful consequences.
As a general rule of thumb, the data in all of PHP’s superglobal arrays should be considered tainted. This is because either all or some of the data provided in the superglobal arrays comes from an external source. Even the $_SERVER array is not fully safe, because it contains some data provided by the client. The one exception to this rule is the $_SESSION superglobal array, which is persisted on the server and never sent over the Internet.
Before processing tainted data, it is important to filter it. Once the data is filtered, then it is considered safe to use. There are two approaches to filtering data: the whitelist approach and the blacklist approach.

Whitelist vs. Blacklist Filtering
Two common approaches to filtering input are whitelist and blacklist filtering. The blacklist approach is the less restrictive form of filtering that assumes the programmer knows everything that should not be allowed to pass through. For example, some forums filter profanity using a blacklist approach. That is, there is a specific set of words that are considered inappropriate for that forum; these words are filtered out. However, any word that is not in that list is allowed. Thus, it is necessary to add new words to the list from time to time, as moderators see fit. This example may not directly correlate to specific problems faced by programmers attempting to mitigate attacks, but there is an inherent problem in blacklist filtering that is evident here: blacklists must be modified continually, and expanded as new attack vectors become apparent.
On the other hand, whitelist filtering is much more restrictive, yet it affords the programmer the ability to accept only the input he expects to receive. Instead of identifying data that is unacceptable, a whitelist identifies only the data that is acceptable. This is information you already have when developing an application; it may change in the future, but you maintain control over the parameters that change and are not left to the whims of would-be attackers. Since you control the data that you accept, attackers are unable to pass any data other than what your whitelist allows. For this reason, whitelists afford stronger protection against attacks than blacklists.
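The difference between the two approaches can be sketched in a few lines. The function names, the word list, and the sample values below are hypothetical and chosen only for illustration:

```php
<?php
// Hypothetical blacklist check: rejects only the values listed in advance.
function passesBlacklist($value, array $banned)
{
    return !in_array(strtolower($value), $banned, true);
}

// Hypothetical whitelist check: accepts only the values explicitly allowed.
function passesWhitelist($value, array $allowed)
{
    return in_array($value, $allowed, true);
}

$banned  = array('darn', 'heck');
$allowed = array('Red', 'Blue', 'Yellow', 'Green');

// A value nobody anticipated slips through the blacklist...
var_dump(passesBlacklist('<script>', $banned));  // bool(true)
// ...but is rejected by the whitelist.
var_dump(passesWhitelist('<script>', $allowed)); // bool(false)
var_dump(passesWhitelist('Blue', $allowed));     // bool(true)
```

The blacklist has to be extended for every new unwanted value, whereas the whitelist only changes when the application itself changes.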

Filter Input
Since all input is tainted and cannot be trusted, it is necessary to filter your input to ensure that input received is input expected. To do this, use a whitelist approach, as described earlier. As an example, consider the following HTML form:
<form method="POST">
Username: <input type="text" name="username" /><br />
Password: <input type="text" name="password" /><br />
Favourite colour:
<select name="colour">
<option>Red</option>
<option>Blue</option>
<option>Yellow</option>
<option>Green</option>
</select><br />
<input type="submit" />
</form>
This form contains three input elements: username, password, and colour. For this example, username should contain only alphabetic characters, password should contain only alphanumeric characters, and colour should contain any of “Red,” “Blue,” “Yellow,” or “Green.” It is possible to implement some client-side validation code using JavaScript to enforce these rules, but, as described later in the section on spoofed forms, it is not always possible to force users to use only your form and, thus, your client-side rules. Therefore, server-side filtering is important for security, while client-side validation is important for usability.
To filter the input received with this form, start by initializing a blank array. It is important to use a name that sets this array apart as containing only filtered data; this example uses the name $clean. Later in your code, when encountering the variable $clean['username'], you can be certain that this value has been filtered. If, however, you see $_POST['username'] used, you cannot be certain that the data is trustworthy. Thus, discard the variable and use the one from the $clean array instead. The following code example shows one way to filter the input for this form:
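$clean = array();
// Accept the username only if it was sent and is purely alphabetic
if (isset($_POST['username']) && ctype_alpha($_POST['username']))
{
    $clean['username'] = $_POST['username'];
}
// Accept the password only if it was sent and is purely alphanumeric
if (isset($_POST['password']) && ctype_alnum($_POST['password']))
{
    $clean['password'] = $_POST['password'];
}
// Accept the colour only if it is one of the four known options
$colours = array('Red', 'Blue', 'Yellow', 'Green');
if (isset($_POST['colour']) && in_array($_POST['colour'], $colours, true))
{
    $clean['colour'] = $_POST['colour'];
}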
Filtering with a whitelist approach places the control firmly in your hands and ensures that your application will not receive bad data. If, for example, someone tries to pass a username or colour that is not allowed to the processing script, the worst that can happen is that the $clean array will not contain a value for username or colour. If username is required, then simply display an error message to the user and ask them to provide correct data. You should force the user to provide correct information rather than trying to clean and sanitize it on your own. If you attempt to sanitize the data, you may end up with bad data, and you’ll run into the same problems that result with the use of blacklists.
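One way to act on a missing value is sketched below, assuming that username is the only required field. The helper name filterUsername(), the literal input array, and the error text are illustrative only and are not part of any library:

```php
<?php
// Illustrative helper: returns the filtered username, or null when the
// value is missing or contains anything other than letters.
function filterUsername(array $input)
{
    if (isset($input['username'])
        && is_string($input['username'])
        && ctype_alpha($input['username'])) {
        return $input['username'];
    }
    return null;
}

$clean  = array();
$errors = array();

// In a real script $input would be $_POST; a literal array keeps the
// sketch self-contained.
$input = array('username' => 'bob<script>');

$username = filterUsername($input);
if ($username === null) {
    // Ask for correct data instead of trying to repair what was sent.
    $errors[] = 'Please enter a username containing only letters.';
} else {
    $clean['username'] = $username;
}

var_dump(isset($clean['username'])); // bool(false)
var_dump(count($errors));            // int(1)
```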

Escape Output
Output is anything that leaves your application, bound for a client. The client, in this case, is anything from a Web browser to a database server, and just as you should filter all incoming data, you should escape all outbound data. Whereas filtering input protects your application from bad or harmful data, escaping output protects the client and user from potentially damaging commands.
Escaping output should not be regarded as part of the filtering process, however. These two steps, while equally important, serve distinct and different purposes. Filtering ensures the validity of data coming into the application; escaping protects you and your users from potentially harmful attacks. Output must be escaped because clients (Web browsers, database servers, and so on) often take action when encountering special characters. For Web browsers, these special characters form HTML tags; for database servers, they may include quotation marks and SQL keywords. Therefore, it is necessary to know the intended destination of output and to escape accordingly. Escaping output intended for a database will not suffice when sending that same output to a Web browser; data must be escaped according to its destination. Since most PHP applications deal primarily with the Web and databases, this section will focus on escaping output for these mediums, but you should always be aware of the destination of your output and any special characters or commands that destination may accept and act upon, and be ready to escape those characters or commands accordingly.
To escape output intended for a Web browser, PHP provides htmlspecialchars() and htmlentities(), the latter being the most exhaustive and, therefore, recommended function for escaping. The following code example illustrates the use of htmlentities() to prepare output before sending it to the browser. Another concept illustrated is the use of an array specifically designed to store output. If you prepare output by escaping it and storing it to a specific array, you can then use the latter’s contents without having to worry about whether the output has been escaped. If you encounter a variable in your script that is being outputted and is not part of this array, then it should be regarded suspiciously. This practice will help make your code easier to read and maintain. For this example, assume that the value for $user_message comes from a database result set.
$html = array();
$html['message'] = htmlentities($user_message, ENT_QUOTES, 'UTF-8');
echo $html['message'];
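To see what escaping does to a hostile value, the following sketch assigns a hypothetical script-bearing string to $user_message (in the text it would come from a database result set) and prints the escaped form:

```php
<?php
// Hypothetical tainted value, e.g. a message previously stored by a user.
$user_message = '<script>alert("x")</script>';

$html = array();
$html['message'] = htmlentities($user_message, ENT_QUOTES, 'UTF-8');

echo $html['message'], "\n";
// Prints: &lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt;
```

The browser displays the escaped string as literal text instead of interpreting it as a script element.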
Escape output intended for a database server, such as in an SQL statement, with the database-driver-specific *_escape_string() function; when possible, use prepared statements. Since PHP 5.1 includes PHP Data Objects (PDO), you may use prepared statements for all database engines for which there is a PDO driver. If the database engine does not natively support prepared statements, then PDO emulates this feature transparently for you.
The use of prepared statements allows you to specify placeholders in an SQL statement. This statement can then be used multiple times throughout an application, substituting new values for the placeholders each time. The database engine (or PDO, if emulating prepared statements) performs the hard work of actually escaping the values for use in the statement. The Database Programming chapter contains more information on prepared statements, but the following code provides a simple example for binding parameters to a prepared statement.
// First, filter the input
$clean = array();
if (isset($_POST['username']) && ctype_alpha($_POST['username']))
{
    $clean['username'] = $_POST['username'];
}
// Set a named placeholder in the SQL statement for username
$sql = 'SELECT * FROM users WHERE username = :username';
// Assume the database handler exists; prepare the statement
$stmt = $dbh->prepare($sql);
// Bind a value to the parameter
$stmt->bindParam(':username', $clean['username']);
// Execute and fetch results
$stmt->execute();
$results = $stmt->fetchAll();
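The reuse of one prepared statement with different values can be tried without a database server. The sketch below assumes the pdo_sqlite driver is installed and uses an in-memory database; the table layout and sample rows are invented for the demonstration. The filtering step is deliberately left out here so that the effect of binding alone is visible; in real code, filter first as shown above.

```php
<?php
// Assumption: the pdo_sqlite driver is available. The in-memory database
// stands in for the $dbh handle used in the text.
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec('CREATE TABLE users (username TEXT, email TEXT)');
$dbh->exec("INSERT INTO users VALUES ('alice', 'alice@example.com')");

$stmt = $dbh->prepare('SELECT email FROM users WHERE username = :username');

// The same statement is executed twice with different values. The second
// value is a classic injection attempt; because it is bound as data rather
// than spliced into the SQL text, it simply matches no row.
$found = array();
foreach (array('alice', "x' OR '1'='1") as $name) {
    $stmt->execute(array(':username' => $name));
    $found[] = $stmt->fetchAll(PDO::FETCH_COLUMN);
}

var_dump(count($found[0])); // int(1)
var_dump(count($found[1])); // int(0)
```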

Register Globals
When set to On, the register_globals configuration directive automatically injects variables into scripts. That is, all variables from the query string, posted forms, session store, cookies, and so on are available in what appear to be locally-named variables. Thus, if variables are not initialized before use, it is possible for a malicious user to set script variables and compromise an application.
Consider the following code used in an environment where register_globals is set to On. The $loggedin variable is not initialized, so a user for whom checkLogin() would fail can easily set $loggedin by passing loggedin=1 through the query string. In this way, anyone can gain access to a restricted portion of the site. To mitigate this risk, simply set $loggedin = FALSE at the top of the script or turn off register_globals. While setting register_globals to Off is the preferred approach, it is a best practice to always initialize variables.

if (checkLogin())
{
$loggedin = TRUE;
}
if ($loggedin)
{
// do stuff only for logged in users
}
Note that a by-product of having register_globals turned on is that it is impossible to determine the origin of input. In the previous example, a user could set $loggedin from the query string, a posted form, or a cookie. Nothing restricts the scope in which the user can set it, and nothing identifies the scope from which it comes. A best practice for maintainable and manageable code is to use the appropriate superglobal array for the location from which you expect the data to originate—$_GET, $_POST, or $_COOKIE. This accomplishes two things: first of all, you will know the origin of the data; in addition, users are forced to play by your rules when sending data to your application.
Before PHP 4.2.0, the register_globals configuration directive was set to On by default. Since then, this directive has been set to Off by default; it was deprecated in PHP 5.3.0 and removed entirely in PHP 5.4.0.
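The two defensive habits described in this section, initializing variables and reading input only from a known source, can be sketched as follows. The checkLogin() body, the hard-coded password, and the simulated request arrays are placeholders for illustration, not a real authentication scheme:

```php
<?php
// Hypothetical stand-in for the checkLogin() function used in the text.
function checkLogin(array $post)
{
    return isset($post['password']) && $post['password'] === 'secret';
}

// Simulated request data; in a real script these would be $_GET and $_POST.
$get  = array('loggedin' => '1');     // attacker-supplied query string
$post = array('password' => 'wrong'); // failed login attempt

// Initialize first, so nothing supplied from outside can pre-set the flag.
$loggedin = false;
if (checkLogin($post)) {
    $loggedin = true;
}

var_dump($loggedin); // bool(false)
```

Because $loggedin is assigned explicitly and the query string is never copied into local variables, the loggedin=1 value in $get has no effect.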

Array II

Array Operations
As we mentioned in the PHP Basics chapter, a number of operators behave differently if their operands are arrays. For example, the addition operator + can be used to create the union of its two operands:
$a = array (1, 2, 3);
$b = array ('a' => 1, 'b' => 2, 'c' => 3);
var_dump ($a + $b);
This outputs the following:

array(6) {
[0]=>
int(1)
[1]=>
int(2)
[2]=>
int(3)
["a"]=>
int(1)
["b"]=>
int(2)
["c"]=>
int(3)
}
Note how the resulting array includes all of the elements of the two original arrays, even though they have the same values; this is because their keys are different. If the two arrays had elements with the same key, whether a string key or a numeric key, and regardless of whether the values differ, only the element from the left-hand operand would appear in the end result:
$a = array (1, 2, 3);
$b = array ('a' => 1, 2, 3);
var_dump ($a + $b);
This results in:
array(4) {
[0]=>
int(1)
[1]=>
int(2)
[2]=>
int(3)
["a"]=>
int(1)
}
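The rule that the left-hand operand wins is easiest to see next to array_merge(), which behaves differently. The arrays below are made up for the comparison:

```php
<?php
$a = array('x' => 'left', 0 => 'first');
$b = array('x' => 'right', 0 => 'other', 1 => 'extra');

// Union: for keys present in both arrays, the left operand's value wins.
$union = $a + $b;
// array('x' => 'left', 0 => 'first', 1 => 'extra')

// array_merge(): later string keys overwrite earlier ones, and numeric
// keys are renumbered so that no numerically indexed element is dropped.
$merged = array_merge($a, $b);
// array('x' => 'right', 0 => 'first', 1 => 'other', 2 => 'extra')

var_dump($union, $merged);
```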

Comparing Arrays
Array-to-array comparison is a relatively rare occurrence, but it can be performed using another set of operators. As with other types, the equivalence and identity operators can be used for this purpose:
$a = array (1, 2, 3);
$b = array (1 => 2, 2 => 3, 0 => 1);
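$c = array ('a' => 1, 'b' => 2, 'c' => 3);
var_dump ($a == $b); // True: same key/value pairs
var_dump ($a === $b); // False: same pairs, but in a different order
var_dump ($a == $c); // False: the keys differ
var_dump ($a === $c); // False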
As you can see, the equivalence operator == returns true if both arrays have the same number of elements with the same values and keys, regardless of their order. The identity operator ===, on the other hand, returns true only if the array contains the same key/value pairs in the same order. Similarly, the inequality and non-identity operators can determine whether two arrays are different:
$a = array (1, 2, 3);
$b = array (1 => 2, 2 => 3, 0 => 1);
var_dump ($a != $b); // False
var_dump ($a !== $b); // True
Once again, the inequality operator only checks whether the two arrays contain the same elements with the same keys, whereas the non-identity operator also takes their order into account.

Counting, Searching and Deleting Elements
The size of an array can be retrieved by calling the count() function:
$a = array (1, 2, 4);
$b = array();
$c = 10;
echo count ($a); // Outputs 3
echo count ($b); // Outputs 0
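echo count ($c); // Outputs 1 in PHP 5 and 7; throws a TypeError as of PHP 8.0
As you can see, count() cannot be used to determine whether a variable contains an array: in older versions of PHP, running it on a scalar value returns one (with a warning as of PHP 7.2), and as of PHP 8.0 it throws a TypeError. The right way to tell whether a variable contains an array is to use is_array() instead.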
A similar problem exists with determining whether an element with the given key exists. This is often done by calling isset():
$a = array ('a' => 1, 'b' => 2);
var_dump (isset ($a['a'])); // bool(true)
var_dump (isset ($a['c'])); // bool(false)
However, isset() has the major drawback of considering an element whose value is NULL (which is perfectly valid) as nonexistent:
$a = array ('a' => NULL, 'b' => 2);
var_dump (isset ($a['a'])); // bool(false)
The correct way to determine whether an array element exists is to use array_key_exists() instead:
$a = array ('a' => NULL, 'b' => 2);
var_dump (array_key_exists ('a', $a)); // bool(true)
Obviously, neither of these functions will allow you to determine whether an element with a given value exists in an array; this is, instead, performed by the in_array() function, which takes the value to look for as its first argument:
$a = array ('a' => NULL, 'b' => 2);
var_dump (in_array (2, $a)); // bool(true)
Finally, an element can be deleted from an array by unsetting it:
$a = array ('a' => NULL, 'b' => 2);
unset ($a['b']);
var_dump (in_array (2, $a)); // bool(false)
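The following sketch puts the key-based and value-based lookups side by side. The array contents are arbitrary; array_search() and the strict third argument of in_array() are standard PHP features that the text above does not cover:

```php
<?php
$a = array('a' => null, 'b' => 2, 'c' => '2');

var_dump(isset($a['a']));            // bool(false): the value is NULL
var_dump(array_key_exists('a', $a)); // bool(true): the key is there

// in_array() compares loosely by default; pass true as the third
// argument to require matching types as well.
var_dump(in_array('2', $a));         // bool(true)
var_dump(in_array(2.0, $a, true));   // bool(false): no float in $a

// array_search() returns the key of the first matching value.
var_dump(array_search(2, $a, true)); // string(1) "b"

unset($a['b']);
var_dump(array_key_exists('b', $a)); // bool(false)
```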

Flipping and Reversing
There are two functions that have rather confusing names and that are sometimes misused: array_flip() and array_reverse(). The first of these two functions inverts the value of each element of an array with its key:
$a = array ('a', 'b', 'c');
var_dump (array_flip ($a));
This outputs:
array(3) {
["a"]=>
int(0)
["b"]=>
int(1)
["c"]=>
int(2)
}
On the other hand, array_reverse() actually inverts the order of the array’s elements, so that the last one appears first:
$a = array ('x' => 'a', 10 => 'b', 'c');
var_dump (array_reverse ($a));
Note how key association is lost only for those elements whose keys are numeric:
array(3) {
[0]=>
string(1) "c"
[1]=>
string(1) "b"
["x"]=>
string(1) "a"
}
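If the numeric keys need to survive the reversal, array_reverse() accepts a second argument, preserve_keys. A brief sketch using the same array as above:

```php
<?php
$a = array('x' => 'a', 10 => 'b', 'c'); // 'c' receives the key 11

// Default behaviour: numeric keys are renumbered from zero.
$renumbered = array_reverse($a);
// array(0 => 'c', 1 => 'b', 'x' => 'a')

// With preserve_keys set to true, numeric keys are kept as they were.
$preserved = array_reverse($a, true);
// array(11 => 'c', 10 => 'b', 'x' => 'a')

var_dump($renumbered, $preserved);
```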

Array I

Arrays are the undisputed kings of advanced data structures in PHP. PHP arrays are extremely flexible: they allow numeric, auto-incremented keys, alphanumeric keys or a mix of both, and are capable of storing practically any value, including other arrays. With over seventy functions for manipulating them, arrays can do practically anything you can possibly imagine, and then some.

Array Basics
All arrays are ordered collections of items, called elements. Each element has a value, and is identified by a key that is unique to the array it belongs to. As we mentioned in the previous paragraph, keys can be either integer numbers or strings of arbitrary length.
Arrays are created one of two ways. The first is by explicitly calling the array() construct, which can be passed a series of values and, optionally, keys:
$a = array (10, 20, 30);
$a = array ('a' => 10, 'b' => 20, 'cee' => 30);
$a = array (5 => 1, 3 => 2, 1 => 3,);
$a = array();
The first line of code above creates an array by only specifying the values of its three elements. Since every element of an array must also have a key, PHP automatically assigns a numeric key to each element, starting from zero. In the second example, the array keys are specified in the call to array(), in this case three alphabetical keys (note that the length of the keys is arbitrary). In the third example, keys are assigned “out of order,” so that the first element of the array has, in fact, the key 5; note here the use of a “dangling comma” after the last element, which is perfectly legal from a syntactical perspective and has no effect on the final array. Finally, in the fourth example we create an empty array.
A second method of creating and accessing arrays is by means of the array operator ([]):
$x[] = 10;
$x['aa'] = 11;
echo $x[0]; // Outputs 10
As you can see, this operator provides a much higher degree of control than array(): in the first example, we add a new value to the array stored in the $x variable. Because we don’t specify the key, PHP will automatically choose the next highest numeric key available for us. In the second example, on the other hand, we specify the key 'aa' ourselves. Note that, in either case, we don’t explicitly initialize $x to be an array, which means that PHP will automatically convert it to one for us if it isn’t; if $x is empty, it will simply be initialized to an empty array.

Printing Arrays
In the PHP Basics chapter, we illustrated how the echo statement can be used to output the value of an expression, including that of a single variable. While echo is extremely useful, it exhibits some limitations that curb its helpfulness in certain situations. For example, while debugging a script, one often needs to see not just the value of an expression, but also its type. Another problem with echo is in the fact that it is unable to deal with composite data types like arrays and objects.
To obviate this problem, PHP provides two functions that can be used to output a variable’s value recursively: print_r() and var_dump(). They differ in a few key points:
• While both functions recursively print out the contents of a composite value, only var_dump() outputs the data type of each value
• Only var_dump() is capable of outputting the value of more than one variable at the same time
• Only print_r() can return its output as a string, as opposed to writing it to the script’s standard output
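The last two points in the list above can be tried with a short sketch; the sample array and values are arbitrary:

```php
<?php
$a = array('a' => 1, 'b' => array(2, 3));

// print_r() returns its output instead of printing it when the second
// argument is true, so the text can be logged or stored.
$text = print_r($a, true);
echo strlen($text) > 0 ? "captured\n" : "empty\n";

// var_dump() accepts any number of arguments and reports each value's type.
var_dump(1, '1', 1.0);
```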
Whether echo, var_dump() or print_r() should be used in any one given scenario is, clearly, dependent on what you are trying to achieve. Generally speaking, echo will cover most of your bases, while var_dump() and print_r() offer a more specialized set of functionality that works well as an aid in debugging.

Enumerative vs. Associative
Arrays can be roughly divided in two categories: enumerative and associative. Enumerative arrays are indexed using only numerical indexes, while associative arrays (sometimes referred to as dictionaries) allow the association of an arbitrary key to every element. In PHP, this distinction is significantly blurred, as you can create an enumerative array and then add associative elements to it (while still maintaining elements of an enumeration). What’s more, arrays behave more like ordered maps and can actually be used to simulate a number of different structures, including queues and stacks.
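As a minimal sketch of the last point, a plain array can act as a stack or as a queue using only built-in functions; the element values are placeholders:

```php
<?php
// Stack (last in, first out): push and pop at the end of the array.
$stack = array();
array_push($stack, 'first');
array_push($stack, 'second');
$top = array_pop($stack);    // 'second'

// Queue (first in, first out): append at the end, remove from the front.
$queue = array();
$queue[] = 'first';
$queue[] = 'second';
$head = array_shift($queue); // 'first'

var_dump($top, $head);
```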
PHP provides a great amount of flexibility in how numeric keys can be assigned to arrays: they can be any integer number (both negative and positive), and they don’t need to be sequential, so that a large gap can exist between the indices of two consecutive values without the need to create intermediate values to cover every possible key in between. Moreover, the keys of an array do not determine the order of its elements, as we saw earlier when we created an enumerative array with keys that were out of natural order.
When an element is added to an array without specifying a key, PHP automatically assigns a numeric one that is equal to the greatest numeric key already in existence in the array, plus one:
$a = array (2 => 5);
$a[] = ’a’; // This will have a key of 3
Note that this is true even if the array contains a mix of numerical and string keys:
$a = array ('4' => 5, 'a' => 'b');
$a[] = 44; // This will have a key of 5

Multi-dimensional Arrays
Since every element of an array can contain any type of data, the creation of multidimensional arrays is very simple: to create multi-dimensional arrays, we simply assign an array as the value for an array element. With PHP, we can do this for one or more elements within any array—thus allowing for infinite levels of nesting.
$array = array();
$array[] = array(
    'foo',
    'bar'
);
$array[] = array(
    'baz',
    'bat'
);
echo $array[0][1] . $array[1][0];
Our output from this example is barbaz. As you can see, to access multi-dimensional array elements, we simply “stack” the array operators, giving the key for the specific element we wish to access in each level.

Unravelling Arrays
It is sometimes simpler to work with the values of an array by assigning them to individual variables. While this can be accomplished by extracting individual elements and assigning each of them to a different variable, PHP provides a quick shortcut, the list() construct:
$sql = "SELECT user_first, user_last, lst_log FROM users";
$result = mysql_query($sql);
while (list($first, $last, $last_login) = mysql_fetch_row($result)) {
echo "$last, $first - Last Login: $last_login";
}
By using the list construct, and passing in three variables, we are causing the first three elements of the array to be assigned to those variables in order, allowing us to then simply use those elements within our while loop.
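The same construct can be tried without a database connection. In the sketch below, the $rows array is a hypothetical stand-in for the numerically indexed rows that a fetch function would return:

```php
<?php
// Hypothetical rows, shaped like numerically indexed database results.
$rows = array(
    array('Ada', 'Lovelace', '2012-05-01'),
    array('Alan', 'Turing', '2012-05-09'),
);

$lines = array();
foreach ($rows as $row) {
    // list() assigns elements 0, 1 and 2 to the three variables in order.
    list($first, $last, $last_login) = $row;
    $lines[] = "$last, $first - Last Login: $last_login";
}

echo implode("\n", $lines), "\n";
// Lovelace, Ada - Last Login: 2012-05-01
// Turing, Alan - Last Login: 2012-05-09
```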

Java and the Internet


Introduction

The past four years have seen a phenomenal rise in interest in the Internet. Tens of millions of users regularly access this network to carry out operations such as browsing through electronic newspapers, downloading bibliographies, participating in news groups and emailing friends and colleagues. The number of applications that are hosted within the Internet has also grown; however, there are major problems in developing such applications:

• The first problem is security. There are still many problems concerned with ensuring that unauthorized access is prevented. This is one of the main reasons why commercial applications, particularly those involving the direct transfer of funds across communication lines, have been relatively slow to develop compared with academic applications.
• The second problem is the lack of a specific programming language for Internet applications. Currently applications are written in a wide variety of languages including C, Pascal and Tcl/Tk which have to access fairly low-level facilities such as protocol handlers.
• It is very difficult to build interaction into an Internet application. Most of the applications that have been developed tend to give the impression of being interactive. However, what they usually involve is just the user moving through a series of text and visual images following pointers to other sections of text and visual images. The most one often gets with the vast majority of Internet applications is some small amount of interactivity, for example an application asking the user for an identity and a password and checking what has been typed against some stored data which describes the user.
• The majority of interactive applications are non-portable: they tend to be firmly anchored within one computer architecture and operating system by virtue of the fact, for example, that they tend to use run-time facilities provided by one specific operating system.
The Language
The Java programming language originated at Sun Microsystems. It was developed initially as a programming language for consumer-electronics products; however, its later versions address the problems that have been outlined in the previous section. The designers of the language had a number of design goals:
• The language should be familiar. It should have no strange syntax and, as much as possible, it should look like an existing language. However, this principle was not taken to the point where problems with other languages would be carried through to Java. The control structures and data types in Java look like some of those provided in the C programming language, while those facilities which make it object-oriented resemble those in the programming language C++. The developers of the Java language felt that on both commercial and technical grounds Java would have the greatest success if the learning curve was not too steep. Its similarity to the C family of programming languages means that a wide variety of users are able to program in it: ranging from professionals at the cutting edge of Internet technology to the home computer user.

Classes

The previous chapter described the fact that computing systems can be viewed as consisting of objects which cooperate with each other in order to carry out a task by means of the mechanism of message sending. Events in the outside world such as a user requesting some information from an applet or application will give rise to a series of messages interchanged between objects within that applet or application until one or more objects send some data to the outside world.

The previous chapter was, of necessity, brief and introductory. The aim of this chapter is to look at objects in more detail and, in particular, show how they are implemented in the Java programming language. In order to understand this it is worth examining what happens when a message is sent to an object. For example, let us assume that an object obj has been sent a message mess:

obj.mess();

What happens is that the object receives the message and decodes it. The decoding consists of looking at a list of all the messages that it can receive and then executing the code corresponding to the message mess. The question that this poses is how does the object access this information? The answer is that objects in Java are defined by means of a mechanism known as a class. A class is very much like a template which defines the stored data associated with an object and the program code which is executed when particular messages are received.

The outline architecture of a class is as shown below:
class Name {
State
Method1
Method2
Method3
...
Methodn
}

The first line contains the Java language statements which define the name of the class. The state contains statements which define the individual data items that make up an object and the lines labeled Method1, Method2 and so on contain the code corresponding to each particular message that is sent. We use the term method to describe the chunk of code that is executed when a message is sent. For example, if an object can receive a message x then there will normally be code identified by x within the class definition; this code is executed when the message is received by the object.

The state part of a class will contain data items which are important to the object defined by that class. For example, assume that we have written an applet which allows password access to some stored data. In such a system there will be an object defined by a class User. In order for users to be processed correctly by this applet there will need to be some means whereby such users can be identified and their passwords stored. The state for each user is hence defined by two items of data: a password and some unique identification such as that given to users when they are registered at their home computer. In this example where there are two items of data associated with each object, such items will be held in variables known as instance variables. Each user object within the applet will contain these two items of data.
It is worth lingering a little longer with this example. In the applet user objects will be sent messages such as:

user1.getPassword();
daveUser.newPassword("klxxx");
daveUser.newId("Dave33");
rolandUser.changeId("Roland66");
rolandUser.changePassword("zlxxghj");
robUser.checkPassword("xxkoil99");

The first expression involves the object identified by user1 being sent the message getPassword(). The result of this communication is that the password for the user is returned and could then be used within the applet. The second expression involves the object identified by daveUser being sent the message newPassword("klxxx"). This communication involves the instance variable corresponding to the password being given a value ("klxxx"). The third expression involves the object identified by daveUser being sent the message newId("Dave33"). This results in the instance variable corresponding to the identity of the user being updated to the string value "Dave33". The fourth expression involves the sending of the message changeId("Roland66") to rolandUser. This would result in the instance variable which holds the user’s identity being changed. Similarly the fifth expression would involve the instance variable representing the password of the user rolandUser being changed to a new value ("zlxxghj"). The final Java expression involves the message checkPassword("xxkoil99") being sent to the user identified by robUser. This results in a check being made that robUser has the password value "xxkoil99" in the instance variable which holds the current password of the user.

An important point to make about the messages above is that two pairs of messages look very similar; for example, newPassword and changePassword seem to do the same things. What we have assumed here is that they correspond to slightly different processing. For example, the message corresponding to newPassword might access another instance variable which contains data that describes the date on which the user was first allowed access. This instance variable would not be accessed by changePassword. If the processing required by these two methods and the methods changeId and newId were the same, then they could, of course, be replaced by just one message.
A class template for users that also have associated with them the date of first access would look something like this:

class User {
// Declarations of variables for the password of a user,
// the identity and the date of first access
// Code defining the method getPassword
// Code defining the method newPassword
// Code defining the method newId
// Code defining the method changeId
// Code defining the method changePassword
// Code defining the method checkPassword
...
// Code for other methods
...
}

The symbols // introduce a comment on a single line. Once you have defined such a class within a Java applet or application, new users can be defined and space allocated for them. The code:

User rolandUser, daveUser, janeUser;

defines three user variables which will identify User objects. Each of these users will eventually contain the variables which hold their password, their user identity and the date they first accessed the applet.

The statement above just informs the Java system that the variables rolandUser, daveUser and janeUser will be User variables. To allocate space requires a facility known as the new facility. For example, to allocate memory space for the object identified by rolandUser you will need to write:

rolandUser = new User();
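
As a rough sketch of how the User template might be filled in, the class below gives each of the listed methods a minimal body. The field names, the use of java.time.LocalDate for the first-access date, the two accessors getId and getFirstAccess, and the choice to have newPassword record the date are all assumptions made for illustration; the text only says that newPassword "might" touch the date.

```java
import java.time.LocalDate;

// Sketch only: a real system would store a salted hash, never the plain password.
class User {
    private String password;
    private String id;
    private LocalDate firstAccess;   // date on which access was first granted

    public String getPassword() { return password; }

    // Assumed difference from changePassword: also records the first-access date.
    public void newPassword(String pw) {
        password = pw;
        firstAccess = LocalDate.now();
    }

    public void changePassword(String pw) { password = pw; }

    public void newId(String identity) { id = identity; }

    public void changeId(String identity) { id = identity; }

    public boolean checkPassword(String candidate) {
        return password != null && password.equals(candidate);
    }

    // Hypothetical accessors, added so that the demo can inspect the state.
    public String getId() { return id; }
    public LocalDate getFirstAccess() { return firstAccess; }
}

class UserDemo {
    public static void main(String[] args) {
        User daveUser;                 // declares the variable only
        daveUser = new User();         // allocates the object
        daveUser.newId("Dave33");
        daveUser.newPassword("klxxx");
        System.out.println(daveUser.checkPassword("klxxx"));    // true
        daveUser.changePassword("zlxxghj");
        System.out.println(daveUser.checkPassword("klxxx"));    // false
    }
}
```

Because the instance variables are private, code outside User can reach the password and identity only through the methods, which is the style the rest of this section argues for.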

In order to reinforce the ideas just presented let us take another example from a Java application. Let us assume that we have written an applet which allows users to interrogate the prices of shares on a stock exchange. We shall assume that the user of such an applet carries out functions such as finding the current price of a share and also examining the price of the shares as far back in the past as 365 days. This means that we will need an object which we shall call ShareHistory. This contains the last 365 days of prices for each share taken at the end of the dealing day.
The template for this object will look like:

class ShareHistory {
// Instance variable holding the last 365 prices for
// each of the shares listed on the stock exchange
// Method code for findCurrentPrice
// Method code for findAnyPrice
// Method code for updatePrice
}

The first method, findCurrentPrice, will find the price of a particular share which is the last one posted – normally the price for the previous day’s close of business. The second method, findAnyPrice, given a day within 365 days of the current day, will deliver the price of a particular share on that day. The third method will update the price of a share at the end of a day’s trading.

Such an applet can deal with a number of stock exchanges which in Java can be declared as:
ShareHistory tokyo, london, newYork;
with typical Java expressions being:
tokyo.findCurrentPrice("Daiwa");
london.findCurrentPrice("UNISYS");
newYork.findAnyPrice("IBM","22/09/95");
newYork.updatePrice("GeneralMotors", 333);

The first message to the receiver object tokyo results in the current stock price of the Daiwa company being returned, the second message sent to the object london results in the current price of the computer company UNISYS being returned, the third message sent to the New York exchange results in the price of the stock for IBM on 22 September being returned. Finally, the fourth line updates the price of General Motors on the New York stock exchange.

The two examples above are structurally similar. As you proceed through the book you will find that all classes will follow this pattern of class definition, definition of variables and definition of methods.

What are the reasons for defining data in such a way? Later in this chapter you will see how some of the more advanced facilities related to objects lead to a high degree of reuse. Apart from reuse there is also the advantage of maintainability.

Software systems are subject to major changes in their lifetime. Recent surveys have suggested that as much as 80% of the development effort expended by a software company is devoted to modifying existing systems. When you define objects using classes one of the things that you can do is to ensure that no user can use the instance variables of an object: that all access to the information stored in an object is via methods. This principle is known as information hiding: the user of an object does not need to know the details of how that object is implemented. This means that when a developer wants to change the implementation of an object, for example to speed up access to the object, then the instance variables and the code of the methods may change but the interface to the object – the method names and arguments themselves – does not change. For example, after using the stock exchange system detailed above for a number of months the developer may discover that there is a particular pattern of access to objects, for example he or she may discover that most of the access is to recent data, and that a new way of storing the stock prices which takes advantage of this access leads to an enhanced run-time performance. This new way of storing the price data would inevitably lead to large changes in the code of the methods that the stock exchange object recognized. However, it would not lead to any changes in the format of the messages. For example, users could still send messages such as:

tokyo.findCurrentPrice("Daiwa");
london.findCurrentPrice("UNISYS");
newYork.findAnyPrice("IBM","22/09/95");
newYork.updatePrice("GeneralMotors", 333);

as before, without any changes being made to the applet. This means that any applet or Java application which uses the stock exchange class does not need to be changed.
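
The information hiding argument can be sketched in code. The class below is one possible ShareHistory, written under several assumptions that are not in the text: prices are whole numbers (as in the 333 above), the history is kept in a private map of lists, and, to keep the sketch short, findAnyPrice identifies a day by how many trading days back it lies rather than by a date string such as "22/09/95".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ShareHistory {
    private static final int MAX_DAYS = 365;

    // Hidden representation: closing prices per share, oldest first.
    // Callers never see this field, so it can be replaced later.
    private final Map<String, List<Integer>> closes = new HashMap<>();

    // Records the closing price for the day that has just ended.
    public void updatePrice(String share, int price) {
        List<Integer> history = closes.computeIfAbsent(share, k -> new ArrayList<>());
        history.add(price);
        if (history.size() > MAX_DAYS) {
            history.remove(0);          // drop the oldest close (index 0)
        }
    }

    // The most recently posted price.
    public int findCurrentPrice(String share) {
        return findAnyPrice(share, 0);
    }

    // Assumed variant: daysBack = 0 is the latest close, 1 the one before, and so on.
    public int findAnyPrice(String share, int daysBack) {
        List<Integer> history = closes.get(share);
        if (history == null || daysBack < 0 || daysBack >= history.size()) {
            throw new IllegalArgumentException(
                "no price for " + share + " " + daysBack + " day(s) back");
        }
        return history.get(history.size() - 1 - daysBack);
    }
}

class ShareHistoryDemo {
    public static void main(String[] args) {
        ShareHistory newYork = new ShareHistory();
        newYork.updatePrice("GeneralMotors", 330);
        newYork.updatePrice("GeneralMotors", 333);
        System.out.println(newYork.findCurrentPrice("GeneralMotors"));  // 333
        System.out.println(newYork.findAnyPrice("GeneralMotors", 1));   // 330
    }
}
```

Replacing the list with, say, a fixed-size circular buffer tuned for recent data would change only the private field and the method bodies; the calls in ShareHistoryDemo would compile and behave as before.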

Some concepts

Almost certainly the next question that you are asking yourself is: what is the detailed syntax of a class in Java? This question will be answered fully in Chapter 5. At this stage in the book we would like to give you an idea of how you write such classes. An example of a simple Java class is shown below. It is simple, and hence unrealistic; however, we have included it for teaching reasons: we do not want the detail of classes to get in the way of understanding what many regard as a difficult concept.

A robot

The class below is taken from a screen robot applet. Here the user can direct a robot to move on a two-dimensional board, where the coordinates of the board are expressed as x and y coordinates. When the robot is moved the screen updates itself by showing the next position. The lowest x and y positions are 1 and the highest x and y positions are 8. An example of this board and the current position of a robot is shown in Figure 3.1.

Let us assume that we need a number of messages to be sent to the robot which can send the robot vertically upwards (to the north), vertically downwards (to the south), horizontally leftwards (to the west) and horizontally rightwards (to the east). We also want to be able to send messages to the robot to send it upwards and to the right, upwards and to the left, downwards and to the left and downwards and to the right.

Examples of these messages are shown below:
newRobot.up();
oldRobot.down();
newRobot.left();
oldRobot.right();
oddRobot.upLeft();
oddRobot.upRight();
smallRobot.downLeft();
smallRobot.downRight();

where the robots are assumed to have been declared as:
Robot newRobot, oldRobot, oddRobot, smallRobot;
In the discussion that follows we shall assume that there is no possibility that the robot will travel beyond the confines of the square grid. This would normally be achieved by means of messages which would be sent to a robot to check that it is not on the edge of the grid.
The first part of the Java class definition for a robot is shown below:

class Robot {
int x, y;

This states that the class is known as Robot and that there will be two instance variables x and y which define where the robot is on the grid; these variables will hold integer values. The definition of the up, down, right and left messages is shown below:

public void up() {
y++;
}
public void down() {
y--;
}
public void right() {
x++;
}
public void left() {
x--;
}
Each of the method definitions starts with the keyword public. This specifies that the method can be used by other objects and by code written outside the class. The keyword void states that no value is to be returned by the method; this keyword is followed by the name of the method and a list of its arguments; in the case of the four methods above there are no arguments. The first method up moves the robot one grid square upwards and the code following the first curly bracket does this. Those of you who have programmed in C or C++ will recognize the statement y++ as the code which increments the value of the instance variable y by one. This is equivalent to a statement such as y := y + 1 in a programming language such as Pascal.

The code for the down message is similar except that it decrements the instance variable y by one, hence moving the robot downwards. The code for right and left is similar apart from the fact that it accesses the x instance variable.
The code for the other methods is shown below:

public void upRight() {
y++;
x++;
}
public void downRight() {
y--;
x++;
}
public void upLeft() {
y++;
x--;
}
public void downLeft() {
x--;
y--;
}

The code for these methods is self-explanatory.
Let us assume that as well as methods which move the robot we require methods which find out the position of the robot, for example to discover whether the robot is in a position where it is possible to move. Assume that the two methods we need will be called findxPos and findyPos. When a message corresponding to findxPos is sent to a receiver object what is returned is the x position of the object and when a findyPos message is sent to a receiver object what is returned is the y position of the object. The code for these two methods is shown below:

public int findxPos() {
return (x);
}
public int findyPos() {
return (y);
}

There are a number of things to notice about this code. First, the two methods are declared as public; this means that they can be accessed by other methods and by code outside the class which defines robots. The second thing to notice is that the keyword int is used instead of the previously used void. This means that the method will return an integer value. The final thing to notice about the code is that in the body of the method, between the curly brackets, the keyword return is used. This means that the value specified by the return is the one that is returned by the method. In the case of the findxPos method this is the value of the x instance variable.
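
Putting the fragments so far together gives a small class that can be compiled and run. The sketch below assumes a starting square of (1,1), set with field initializers because constructors have not yet been introduced; the demo class RobotMovesDemo is likewise an illustrative addition.

```java
class Robot {
    int x = 1, y = 1;   // assumed starting square: bottom left of the grid

    public void up()    { y++; }
    public void down()  { y--; }
    public void right() { x++; }
    public void left()  { x--; }

    public void upRight()   { y++; x++; }
    public void downRight() { y--; x++; }
    public void upLeft()    { y++; x--; }
    public void downLeft()  { x--; y--; }

    public int findxPos() { return x; }
    public int findyPos() { return y; }
}

class RobotMovesDemo {
    public static void main(String[] args) {
        Robot newRobot = new Robot();
        newRobot.up();        // (1,1) -> (1,2)
        newRobot.upRight();   // (1,2) -> (2,3)
        newRobot.left();      // (2,3) -> (1,3)
        System.out.println(newRobot.findxPos() + "," + newRobot.findyPos());  // prints 1,3
    }
}
```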

So far we have described methods without arguments. The next collection of methods shows how arguments can be used. The first method checkAtPoint checks whether a robot is at a specific point on the grid. The second method moveToPoint moves a robot from its current point to a new point specified by two arguments. The code for these two methods is shown below:

public boolean checkAtPoint(int xPos, int yPos) {
return (x==xPos && y==yPos);
}
public void moveToPoint(int xPos, int yPos) {
x = xPos;
y = yPos;
}

The first method is headed with the boolean keyword. This means that it will return a value which is either true or false. The two arguments to the method are xPos and yPos which represent a possible position of the robot. The body of the method returns true if the x instance variable contains the same integer as the xPos argument and the y instance variable contains the same value as the yPos argument. Again those of you who have programmed in C and C++ will realize that Java has similar facilities to those found in these languages. The == symbol stands for arithmetic equality while the operator && stands for Boolean and.
Thus, when the Java interpreter executes the code shown below:

strangeRobot.checkAtPoint(3,4);

it checks whether the receiver object strangeRobot is at the point (3, 4). What happens is that the interpreter first recognizes strangeRobot as of class Robot. This will be because it will have been defined previously in the Java program which uses the code as, say:

Robot strangeRobot;

The interpreter will look for a method corresponding to the name checkAtPoint. It will find the method and execute the code. However, before executing the code it will copy the values 3 and 4 to the arguments xPos and yPos. It will thus evaluate the expression:

x==3 && y==4

and will return either true or false depending on whether the receiver object is at the position which is defined by the contents of its instance variables.
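
A minimal runnable sketch of the two methods with arguments follows. Note that Java requires a type in front of every parameter, so the header reads (int xPos, int yPos). The starting position (1,1) and the demo class ArgumentDemo are assumptions made for illustration.

```java
class Robot {
    int x = 1, y = 1;   // assumed starting square

    // True only when both coordinates match the arguments.
    public boolean checkAtPoint(int xPos, int yPos) {
        return x == xPos && y == yPos;
    }

    // Overwrites the robot's position with the arguments.
    public void moveToPoint(int xPos, int yPos) {
        x = xPos;
        y = yPos;
    }
}

class ArgumentDemo {
    public static void main(String[] args) {
        Robot strangeRobot = new Robot();
        System.out.println(strangeRobot.checkAtPoint(3, 4));  // false: robot is at (1,1)
        strangeRobot.moveToPoint(3, 4);
        System.out.println(strangeRobot.checkAtPoint(3, 4));  // true
        System.out.println(strangeRobot.checkAtPoint(3, 5));  // false: y differs
    }
}
```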

The code for the method moveToPoint returns no value since it is headed by the keyword void. It has two arguments xPos and yPos which give the new point to which the robot is to be sent. The code within the body of the method updates the two instance variables with the values given by the arguments.

So far in this section we have described methods which access the state (instance variables) of an object and which update those variables. There is, however, one class of method which we have omitted: methods which create an object. There are a number of ways of creating an object. We have said very little about how objects are created apart from the fact that the new facility is used, for example the declaration:

Robot fredRobot = new Robot();

declares a variable fredRobot, allocates space for a Robot object and identifies this space as fredRobot.

The one problem with this form of declaration and allocation of space is that the instance variables of the objects declared in such a way are not given meaningful initial values: Java simply sets integer instance variables such as x and y to 0, which is not a square on our grid. Java contains a facility whereby an object can be given a value when it is created as above. In order to do this all that is required is to declare a special method within the class template which has the same name as the class. So, for example, in our Robot class we would need to define Robot as:

Robot() {
x = 1;
y = 1;
}
The effect of this is that whenever you create a Robot object, for example in:
Robot slowRobot = new Robot();
the Java interpreter will first examine the methods defined in Robot looking for a method which has the same name as the class name. If it does not find such a method, then it will just create an object whose instance variables hold their default values. However, if it discovers a method with the same name, then the code for the method is executed. In the case of the code shown above this will create a robot which is initially positioned on the bottom left square of the grid. The method, known as Robot, is called a constructor. Constructors can have specified defaults such as Robot shown above where the default is that a newly created robot object is placed on the square (1,1). However, constructors can also be associated with arguments which represent a user’s specified initial value. An example of a Robot constructor which sets the x position and y position of a robot from values supplied by the programmer is shown below:

Robot(int xPos, int yPos) {
x = xPos;
y = yPos;
}
An example of its use with the new facility is shown below:
Robot denseRobot = new Robot(6, 5);

This line of code declares an object denseRobot which is created by applying new to the Robot constructor. This constructor requires two arguments (in the example above these are 6 and 5) which are used to set the x and y instance variables within denseRobot.
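
The two constructors can live side by side in the same class; Java picks one by matching the arguments given to new. The sketch below also includes a hypothetical BareRobot class with no constructor at all, to show the default values Java gives to integer instance variables. BareRobot and ConstructorDemo are illustrative names, not part of the applet.

```java
class Robot {
    int x, y;

    // Default constructor: bottom left square of the grid.
    Robot() {
        x = 1;
        y = 1;
    }

    // Constructor with a starting position chosen by the programmer.
    Robot(int xPos, int yPos) {
        x = xPos;
        y = yPos;
    }

    public int findxPos() { return x; }
    public int findyPos() { return y; }
}

// Hypothetical class without any constructor, for comparison.
class BareRobot {
    int x, y;
}

class ConstructorDemo {
    public static void main(String[] args) {
        Robot slowRobot = new Robot();
        Robot denseRobot = new Robot(6, 5);
        BareRobot bare = new BareRobot();
        System.out.println(slowRobot.findxPos() + "," + slowRobot.findyPos());    // 1,1
        System.out.println(denseRobot.findxPos() + "," + denseRobot.findyPos());  // 6,5
        System.out.println(bare.x + "," + bare.y);                                // 0,0
    }
}
```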

Messages and classes

It is worth recapping the mechanisms used within Java for sending messages and what happens to these messages. When the Java interpreter encounters an expression such as:

receiverObject.message(arguments);

it will first determine which class the destination object is defined by. To do this it will scan all the declarations within the code of the applet or Java application. If it finds the name of the object within a declaration then it recovers the name of the class which describes it. It will then scan the code of the methods associated with this class. If it finds a method with the same name as the message string then the code is executed; if not an error is flagged. If the message is associated with any arguments these are copied through as the arguments of the method and the code associated with the method is executed. This is a slight simplification as you will see later in this chapter. However, in essence, it represents the processing cycle that occurs.

Inheritance

So far we have outlined a number of powerful facilities contained in the Java programming language which are used to construct objects and define the methods which correspond to the messages that Java objects receive. This section introduces what is certainly the most powerful facility within Java: that of inheritance. In order to introduce the idea we will first describe three increasingly complex examples of where inheritance is useful. The first example will introduce the idea of inheritance in an abstract way. Later chapters of the book show how inheritance is used within more realistic applets and applications.

The augmented set

The first example concerns a set where we need methods which find out the size of the set and add integers to the set, together with an operation which finds the sum of the elements in the set. One way of implementing such a set so that summation is cheap is to have a class which has two instance variables: intSet, the set of integers, and an integer sum which contains the current sum of the integers. This is an efficient implementation because any method which needs to find the sum only needs to look it up in the variable sum rather than iterating through intSet.

An example of an object described by this class is shown in Figure 3.2. Here the instance variable sum contains the current sum of the integers in the instance variable intSet. Let us assume that we have already implemented a set of integers described by the class IntSet shown below, where the code for each of the methods is not shown. The class does not require a summation method and hence does not require an instance variable to contain a sum.

class IntSet {
int intSetvar[];
public boolean includes(int no) {
...
}
public int size() {
...
}
public void add (int no) {
...
}
public void remove(int no) {
...
}
}

Let us assume that this set has been implemented for another project and is associated with the four methods shown above which check that a particular number is in the set, find the size of the set, add an element to the set and remove an element from the set.

Now let us assume that we need a set to which we wish to send messages that calculate its sum. We are faced with one choice immediately: we can program a new class from scratch. However, another choice is to use a facility known as inheritance which takes advantage of the set class we have just written. Before seeing it in action it is worth providing some definitions.

Inheritance is a relationship between two classes: if a class A inherits from a class B, then class A, as well as being able to use all of its own instance variables and methods, can use all the methods that B can use and can also use any instance variables that B can use. As an example of this consider the two classes X and Y below without the code for their methods.

class X {
// Declarations for u, v and w
// Code for method A
// Code for method B
// Code for method C
}
class Y {
// Declarations for l and m
// Code for method R
// Code for method S
// Code for method T
}

If class Y inherits from class X, then, firstly, all the methods in class Y can refer to not only the instance variables in class Y but also the instance variables in class X; and secondly an object described by class Y can have messages corresponding not only to the methods R, S and T but also the methods A, B and C.
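
In Java the relationship is written with the keyword extends. The sketch below fills the X and Y templates with placeholder bodies: the initial values and the strings returned are arbitrary choices made for illustration, and the method names keep the capital letters used in the text rather than following the usual Java lower-case convention.

```java
class X {
    int u = 1, v = 2, w = 3;                 // arbitrary illustrative values

    public String A() { return "A from X"; }
    public String B() { return "B from X"; }
    public String C() { return "C from X"; }
}

class Y extends X {                          // Y inherits from X
    int l = 10, m = 20;                      // arbitrary illustrative values

    // A method of Y may refer to u (inherited from X) as well as l (its own).
    public int R() { return u + l; }
    public String S() { return "S from Y"; }
    public String T() { return "T from Y"; }
}

class InheritanceDemo {
    public static void main(String[] args) {
        Y y = new Y();
        System.out.println(y.R());   // 11
        System.out.println(y.T());   // T from Y
        System.out.println(y.A());   // A from X: the message is answered by the inherited method
    }
}
```

The reverse does not hold: an object created with new X() cannot be sent R, S or T, and the compiler rejects such a call.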

Classes can inherit from classes which inherit from other classes. As an example consider the three classes shown below:

class X {
// Declarations for u, v and w
// Code for method A
// Code for method B
// Code for method C
}
class Y {
// Declarations of l and m
// Code for method R
// Code for method S
// Code for method T
}
class Z {
// Declarations of n and o
// Code for method G
// Code for method H
// Code for method I
}

If class Z inherits from class Y which, in turn, inherits from class X, then an object described by class Z can be sent messages corresponding not only to the methods G, H and I but also to the methods R, S and T which it inherits from Y and the methods A, B and C which Y inherits from X. The same holds for the instance variables: methods in class Z can refer to not only the instance variables n and o defined in Z, but also the instance variables l and m which it inherits from Y and also the instance variables u, v and w which Y inherits. We can represent this relationship graphically as shown in Figure 3.3.

This diagram, which shows that class Z inherits from class Y which, in turn, inherits from class X, is known as a class hierarchy diagram. Class hierarchies are a very powerful way of describing the relationship between classes. This relationship lies at the heart of the use of an object-oriented programming language as a medium for software reuse. Consider the classes described below where B inherits from A:

class A {
// Declarations of a, b and c
// Code for method J
// Code for method K
// Code for method L
}
class B {
// Declarations of e, f, g and h
// Code for method X
// Code for method Y
// Code for method Z
}

Also consider the statements:
• An object of class B can be sent a message associated with method K.
• An object of class A can be sent a message associated with method B.
• Method Z in class B can refer to the instance variable a.
• Method Y in class B can refer to the instance variable e.
• Method K in class A can refer to the instance variable g.
Which of these statements are true?
The first is true since A contains a method K which B inherits. The second is false since there is no method called B, only a class called B. The third is true since there is an instance variable a in A which is inherited by B. The fourth is true since the instance variable e is defined in the class B. The final statement is false; class A does not inherit from class B, but vice versa.
There is only one further rule that needs to be explained about inheritance before we return to the example that started this section. This concerns classes which contain methods that have the same name as methods in the class from which they inherit. Consider the classes shown below, where class B inherits from class A.

class A {
// Declarations of a, b and c
// Code for method J
// Code for method K
// Code for method L
}
class B {
// Declarations of e, f, g and h
// Code for method X
// Code for method K
// Code for method Z
}

If a message with the name K was sent to an object of class B which method would be executed?

The answer is that it would be the method in class B. The general rule is that whenever a class inherits from another class and there is duplication of method names between the class and the one it inherits from, then the method from the class that inherits is the one invoked. Consider the classes shown below, where class E inherits from class D:

class D {
// Declarations of a, b and c
// Code for method J
// Code for method K
// Code for method L
}
class E {
// Declarations of e, f, g, h, i and j
// Code for method J
// Code for method K
// Code for method Z
}
The following statements are true:
• When a message is sent to an object described by class E and the message uses the selector J, then the method J defined in E is invoked.
• When a message is sent to an object described by class E and the message uses the selector Z, then the method Z defined in E is invoked.
• When a message is sent to an object defined by class E and the message uses the selector L, then the method L defined in class D is invoked.
However, the following statement is not true:
• When a message is sent to an object defined by class D and the message uses the selector K, then the method K defined in E is invoked.

Because D does not inherit from E, the method corresponding to K defined in D is invoked. This rule about similar names for methods holds whatever the level of the inheritance hierarchy. For example, consider the three classes shown below:

class A {
// Declarations of r and s
// Code for method J
// Code for method K
// Code for method L
}
class B {
// Declarations of e, f and g
// Code for method J
// Code for method U
// Code for method V
}
class C {
// Declarations of l, m, n and o
// Code for method U
// Code for method R
// Code for method S
}
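
To make the lookup rule concrete, here is a runnable sketch of this hierarchy. It assumes that C extends B and that B extends A, and it gives every method a body that simply returns the name of the class which defines it; those bodies, the @Override annotations and the demo class LookupDemo are illustrative additions.

```java
class A {
    int r, s;
    public String J() { return "A.J"; }
    public String K() { return "A.K"; }
    public String L() { return "A.L"; }
}

class B extends A {
    int e, f, g;
    @Override
    public String J() { return "B.J"; }   // replaces A's J for B and its subclasses
    public String U() { return "B.U"; }
    public String V() { return "B.V"; }
}

class C extends B {
    int l, m, n, o;
    @Override
    public String U() { return "C.U"; }   // replaces B's U for C objects
    public String R() { return "C.R"; }
    public String S() { return "C.S"; }
}

class LookupDemo {
    public static void main(String[] args) {
        C c = new C();
        System.out.println(c.U());   // C.U: found in C itself
        System.out.println(c.J());   // B.J: not in C, found one level up
        System.out.println(c.K());   // A.K: not in C or B, found two levels up
    }
}
```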

If class C inherits from class B which, in turn, inherits from class A, then when a message involving the selector U is sent to an object defined by C, the method U defined in C is invoked; when a message using the selector J is sent to the same type of object, the method J defined in class B is invoked.

Before returning to the summable set example it is worth defining a concept that you will meet a number of times later in the book. The protocol of a class is the list of messages that can be sent to an object defined by that class. The protocol will contain those method names defined within the class, together with the methods from those classes it inherits from. For example, the protocol of the class C defined above contains the messages corresponding to the methods U, R, S, J, V, K, L.

It is now worth continuing with the summable set example. You will remember that we had already defined a set of integers as:

class IntSet {
int intSetvar[];
public boolean includes(int no) {
...
}
public int size() {
...
}
public void add(int no) {
...
}
public void remove(int no) {
...
}
}

and that we wanted to define a new class which provided all the facilities of IntSet but also provided a method which summed the integers within the set.

You will remember that the way to do this is to somehow define a new class which has an instance variable which holds the current sum of the values and an inherited instance variable which holds the members of the set. We can easily define this new set by writing down its name, specifying the names of the instance variables and then writing code for the five methods. However, inheritance allows us to save some time, since by using inheritance we can reuse some of the elements of IntSet. We shall assume that the new class which describes summable sets is called SummableSet. The code skeleton describing its structure is shown below:

class SummableSet {
int sum;
public int totalSum() {
...
}
}

Here a new method totalSum is defined which returns the sum of the set; also defined is an instance variable sum which holds the current sum of the set.

In Java if we wish to specify that a class inherits from another class then we write this using the keyword extends. The full definition of SummableSet is then:

class SummableSet extends IntSet {
int sum;
public int totalSum() {
...
}
}

This states that we have defined a new class called SummableSet which inherits the instance variables and the methods from the existing class IntSet. If this class inherits from IntSet, then an object described by SummableSet can be sent messages corresponding to the methods includes, size, add and remove. There seems to be no reason why we cannot now write code such as:

SummableSet sms;
sms = new SummableSet();
sms.add(23);

where the semicolon is used, as in most C-like programming languages, to terminate individual Java statements.

Unfortunately, there is a problem: the code for the methods add and remove is incorrect. The code for add and remove would correctly alter the set of integers by adding and removing items from it; however, they do not affect the current sum. For example, there may be three integers in the summable set which are 45, 3, 10 with the instance variable sum holding 58 which is the sum. If add is invoked with the argument 20, then 20 would be deposited in the set but the sum would remain at 58. What we need are new versions of add and remove.
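
The stale sum is easy to reproduce. In the sketch below IntSet is backed by a java.util.HashSet rather than the int array of the text, purely to keep the example short; that choice, the body given to totalSum and the demo class StaleSumDemo are assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Stand-in for the existing IntSet class (storage choice is an assumption).
class IntSet {
    private final Set<Integer> members = new HashSet<>();

    public boolean includes(int no) { return members.contains(no); }
    public int size() { return members.size(); }
    public void add(int no) { members.add(no); }
    public void remove(int no) { members.remove(Integer.valueOf(no)); }
}

// SummableSet as defined so far: it inherits add and remove unchanged.
class SummableSet extends IntSet {
    int sum;

    public int totalSum() { return sum; }
}

class StaleSumDemo {
    public static void main(String[] args) {
        SummableSet sms = new SummableSet();
        sms.add(45);
        sms.add(3);
        sms.add(10);
        System.out.println(sms.size());      // 3: the inherited add updates the set
        System.out.println(sms.totalSum());  // 0: nothing ever updates sum
    }
}
```

The inherited methods know nothing about sum, so versions of add and remove that do are needed.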

These new versions need to be embedded in the class SummableSet, making its code skeleton look like:

class SummableSet extends IntSet {
int sum;
public int totalSum() {
...
}
public void add(int no) {
...
}
public void remove (int no) {
...
}
}

The code for the method add would add no to the set and also add the value of no to the instance variable sum. The code for the method remove would remove no from the set and subtract its value from the instance variable sum. Perhaps we could then expect the code for add to look something like:
public void add(int no) {
add(no);
sum = sum + no;
...
}
where the first line introduces the method, the second line uses the method add declared in the class IntSet to carry out the insertion of no into SummableSet and the third line adjusts the instance variable sum so that it is up to date and contains the current sum. We might also expect the code for remove within SummableSet to look like:
public void remove(int no) {
remove(no);
sum = sum - no;
...
}

There is one problem, however. Which version of add does the first extract refer to: the one defined in IntSet or the one defined in SummableSet? Similarly which version of remove does the second extract refer to: the one defined in IntSet or the one defined in SummableSet? Well, the rule that we have given previously states that when a method M is invoked associated with a class C, then method M is searched for in C, and if it is there then it is invoked; if it isn’t there then the search continues in the next class that it inherits from, namely the class in the next level in the inheritance hierarchy. This means that, for example, inside the method add defined in SummableSet there is a call to itself. When this is invoked there would be another call to itself, and so on. This means that the method would keep calling itself and never exit normally (in practice, until the Java system runs out of stack space). The same would, of course, be true of remove.
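
The self-call can be observed directly. The sketch below uses the naive add from the first extract and catches the StackOverflowError that Java raises when the chain of calls exhausts the stack. As before, the HashSet-backed IntSet and the demo class RecursionDemo are assumptions made to keep the example short and runnable.

```java
import java.util.HashSet;
import java.util.Set;

// Stand-in for the existing IntSet class (storage choice is an assumption).
class IntSet {
    private final Set<Integer> members = new HashSet<>();

    public boolean includes(int no) { return members.contains(no); }
    public int size() { return members.size(); }
    public void add(int no) { members.add(no); }
    public void remove(int no) { members.remove(Integer.valueOf(no)); }
}

class SummableSet extends IntSet {
    int sum;

    public int totalSum() { return sum; }

    @Override
    public void add(int no) {
        add(no);            // resolves to SummableSet.add, that is, this very method
        sum = sum + no;     // never reached
    }
}

class RecursionDemo {
    public static void main(String[] args) {
        SummableSet sms = new SummableSet();
        try {
            sms.add(23);
        } catch (StackOverflowError e) {
            System.out.println("add never returned normally");
        }
        System.out.println(sms.size());   // 0: IntSet's add was never reached
    }
}
```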

The way to get over this is to use a device which slightly overrides the way in which methods are searched for when they are invoked. This involves the use of a dot notation. In order to refer to the method in the class that SummableSet inherits we refer to it as super.add.

In order to see how this works, examine the code shown below for the correct version of add within SummableSet:

public void add(int no) {
super.add(no);
sum = sum + no;
...
}

This instructs the Java interpreter to start its search for a method to execute not within the class which describes the object SummableSet, but within the class IntSet (the keyword super instructs the Java system to look at the class above). The code for remove would be similar:

public void remove(int no) {
super.remove(no);
sum = sum - no;
...
}
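
A complete, runnable sketch of the corrected class is given below. It goes one step beyond the extracts above: because a set holds each value only once, add changes sum only when the value is not yet present, and remove only when it is. That guard, the HashSet-backed IntSet and the demo class SummableSetDemo are assumptions added for illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Stand-in for the existing IntSet class (storage choice is an assumption).
class IntSet {
    private final Set<Integer> members = new HashSet<>();

    public boolean includes(int no) { return members.contains(no); }
    public int size() { return members.size(); }
    public void add(int no) { members.add(no); }
    public void remove(int no) { members.remove(Integer.valueOf(no)); }
}

class SummableSet extends IntSet {
    int sum;

    public int totalSum() { return sum; }

    @Override
    public void add(int no) {
        if (!includes(no)) {        // assumed guard: count each member once
            super.add(no);          // start the method search in IntSet
            sum = sum + no;
        }
    }

    @Override
    public void remove(int no) {
        if (includes(no)) {         // assumed guard: only subtract real members
            super.remove(no);
            sum = sum - no;
        }
    }
}

class SummableSetDemo {
    public static void main(String[] args) {
        SummableSet sms = new SummableSet();
        sms.add(45);
        sms.add(3);
        sms.add(10);
        System.out.println(sms.totalSum());  // 58
        sms.add(20);
        System.out.println(sms.totalSum());  // 78
        sms.remove(3);
        System.out.println(sms.totalSum());  // 75
    }
}
```

Only add and remove had to be rewritten; includes and size are reused from IntSet exactly as they stand.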

It is worth recapping the points that the example and supporting text have illustrated. First, they have shown how inheritance works. Second, they have illustrated reuse to a certain extent: the reuse of IntSet was not massive since we had to redefine the methods add and remove within SummableSet, but nevertheless some reuse was employed. Finally, the text described the way in which the Java system looks for methods to execute: by traversing the inheritance hierarchy, normally starting with the class of the object which is being sent a message, but with the use of the super dot notation overriding this starting point.