Performance Tuning in Java
Almost every application we develop has performance issues of one kind or another, whether it is an intranet or an internet product. As development progresses, we concentrate on meeting client requirements and resolving technical complexities, and performance issues are overlooked. It is therefore essential to have a phase in the development cycle that deals with tuning the product's performance before it goes into production.
Performance tuning requires a lot of experience and innovative thinking. It is very difficult to propose definite, generic solutions for performance issues, but a few suggestions can certainly be made. Every application is unique, and so are its performance issues, though most applications share common performance requirements.
The following are a few techniques that can be used for performance tuning:
Static Content Serving
Most of the content in the pages of a website is static, both by volume and by count; www.amazon.com and www.bbc.co.uk are examples. Building a static content accelerator server can make the site faster. The static content should be deployed on a separate web server, and requests for static content should be routed transparently to that server. A reverse proxy is a good example of this technique.
Caching
Caching is an important technique employed by many programmers. Java programmers often choose a custom cache implementation because it gives them more comfort when coding and debugging. Open source caching components such as EhCache and OSCache are a good basis for such an implementation; they reduce development time as well as bugs in the code. A few important points to keep in mind while implementing caching are:
1. It should be known in advance when and how cached entries are invalidated.
2. Too much data should not be cached, as it clutters heap memory.
3. A suitable algorithm should be used for evicting entries from the cache, e.g. LRU (Least Recently Used) or LFU (Least Frequently Used).
4. There should be a way to measure hit and miss counts so that the caching can be fine-tuned.
5. The caching components should work in a clustered environment.
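The points about eviction and hit/miss measurement can be illustrated with a small sketch. It is not how EhCache or OSCache work internally; it is a hypothetical, single-threaded LRU cache built on the JDK's LinkedHashMap, and the class name LruCache and its methods are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical LRU cache with hit/miss counters. Not thread-safe. */
final class LruCache<K, V> {
    private final int maxEntries;
    private final Map<K, V> entries;
    private long hits;
    private long misses;

    LruCache(int maxEntries) {
        this.maxEntries = maxEntries;
        // accessOrder = true keeps the least recently used entry first.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Evict the eldest entry once the configured limit is exceeded.
                return size() > LruCache.this.maxEntries;
            }
        };
    }

    V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            misses++;
        } else {
            hits++;
        }
        return value;
    }

    void put(K key, V value) { entries.put(key, value); }

    /** Explicit invalidation, for when the underlying data changes. */
    void invalidate(K key) { entries.remove(key); }

    int size() { return entries.size(); }
    long hits() { return hits; }
    long misses() { return misses; }
}
```

Comparing hits() with misses() under a realistic load gives a first indication of whether the cache size is appropriate.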
Connection/Thread Pooling
Pooling is an important and widely used strategy. A few points to keep in mind while configuring connection and thread pools are as follows:
1. In a database-driven application, the maximum number of threads and connections should be predefined based on database capacity.
2. The optimal maximum sizes of the thread and connection pools should be determined from performance tests.
3. The initial capacity should be set equal to the maximum capacity.
4. The maximum and minimum thread/connection pool sizes have to be derived from hardware capability and load testing.
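As a rough sketch of point 3, the JDK's ThreadPoolExecutor can be created with its core size equal to its maximum size and with all threads started up front. The class name PoolSketch and the value of MAX_WORKERS are assumptions; a real value would come from the load tests mentioned above.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class PoolSketch {
    // Hypothetical limit; derive the real one from load testing and database capacity.
    static final int MAX_WORKERS = 4;

    /** Creates a pool whose initial size equals its maximum size. */
    static ThreadPoolExecutor newPrestartedPool(int size) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                size, size, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
        // Start all core threads now instead of lazily on first use.
        pool.prestartAllCoreThreads();
        return pool;
    }

    /** Runs a small task on the pool and waits for its result. */
    static int squareOnPool(ThreadPoolExecutor pool, int value) throws Exception {
        Callable<Integer> task = () -> value * value;
        Future<Integer> result = pool.submit(task);
        return result.get(5, TimeUnit.SECONDS);
    }
}
```

The same idea applies to connection pools, where most pooling libraries expose separate settings for the initial and maximum number of connections.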
Remote Calls
Distributed computing is widely used for scalable applications. A few important considerations are:
1. Remote calls should be chosen carefully and avoided wherever possible.
2. The response time of each remote call should be measured, and frequently accessed data should be cached and tuned accordingly.
3. All synchronous remote calls, such as file system, email, web service, database and EJB calls, should be measured.
4. For large-scale, distributed integration solutions, asynchronous messaging should be the preferred communication model. Messaging provides more scalability because the sender and the receiver are decoupled: the sender is no longer required to wait for the receiver.
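The decoupling described in point 4 can be sketched inside a single JVM. The example below uses a BlockingQueue as a stand-in for a real message broker; the class name, the STOP sentinel and the upper-casing "work" are all assumptions chosen to keep the sketch small.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class AsyncMessagingSketch {
    // Hypothetical sentinel telling the receiver to stop.
    private static final String STOP = "__stop__";

    static List<String> sendAndProcess(List<String> messages) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        List<String> processed = new ArrayList<>();

        // The receiver consumes messages at its own pace on a separate thread.
        Thread receiver = new Thread(() -> {
            try {
                for (String m = queue.take(); !m.equals(STOP); m = queue.take()) {
                    processed.add(m.toUpperCase());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        receiver.start();

        // The sender only enqueues; it does not wait for each message to be handled.
        for (String m : messages) {
            queue.put(m);
        }
        queue.put(STOP);

        // Joining here is only so the sketch can return the results.
        receiver.join();
        return processed;
    }
}
```

In a distributed system the queue would live in a messaging product rather than in the sender's memory, but the shape of the interaction is the same.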
High Availability
The application server and the database server should stay up for long periods to ensure high availability. This needs suitable network and hardware settings as well as good software design, and the software itself must be able to run for long durations. On the application side, memory leaks cause most of the problems. They can be identified and contained using the following techniques:
1. The application should be profiled to find unused objects and the time taken to execute each method. If possible, JDK 1.6 should be used; it has a built-in profiler.
2. The application should be run for an extended period in the test environment to check its performance and to find and correct code that fails to release unused objects.
3. The JVM heap size should be adjusted to suit the server's RAM.
4. The JVM garbage collection settings should be fine-tuned.
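Before adjusting the heap, it helps to see what the JVM is currently using. The sketch below reads the figures from java.lang.Runtime; the class name HeapReport is an assumption. The maximum it reports is typically controlled by the -Xmx option, with -Xms setting the initial size.

```java
final class HeapReport {
    /** Bytes currently occupied by objects (live or not yet collected). */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Upper limit the JVM will try to use for the heap. */
    static long maxBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    static String summary() {
        // Shift by 20 bits to convert bytes to megabytes.
        return String.format("heap used=%d MB, max=%d MB", usedBytes() >> 20, maxBytes() >> 20);
    }
}
```

Logging summary() at intervals during a long test run shows whether used memory keeps climbing, which is one sign of a leak.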
SQL Tuning
SQL tuning can be carried out in the following manner:
1. The 10 most frequently used queries and the 10 most time-consuming queries should be optimized.
2. Optimal indexes should be created.
3. The database server and the application server should be connected through a high-bandwidth Ethernet connection.
4. Query performance should be tracked continuously using tools such as IronTrackSQL.
JDBC Optimization
In many J2EE applications, database interaction issues crop up and degrade performance. The programmer should be careful when using JDBC drivers.
The following precautions help JDBC access work smoothly:
1. The right JDBC driver should be chosen.
2. Batch updates and batch retrievals should be used.
3. Connection pools should be used.
4. Prepared statements should be used.
5. Excessive rollbacks should be avoided.
6. Frequent BLOB/CLOB updates should be avoided.
7. The optimal transaction isolation level should be chosen.
8. The fetch size should be tuned to reduce network round trips between the application and the database server.
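Points 2 and 4 combine naturally: one prepared statement is reused for every row, and the rows are sent to the database as a batch. The sketch below uses only standard java.sql calls; the class name and the customer table with its name column are hypothetical, and the Connection is assumed to come from a pool.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

final class JdbcBatchSketch {
    // Hypothetical table and column names, for illustration only.
    private static final String INSERT_SQL = "INSERT INTO customer (name) VALUES (?)";

    /** Inserts all names in one batch and returns how many statements were executed. */
    static int insertAll(Connection conn, List<String> names) throws SQLException {
        // The statement is prepared once and closed automatically.
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            for (String name : names) {
                ps.setString(1, name);
                ps.addBatch(); // queue the row instead of executing it immediately
            }
            int[] counts = ps.executeBatch(); // send the queued rows together
            return counts.length;
        }
    }
}
```

For large result sets, Statement.setFetchSize(int) is the corresponding hint on the read side; how a driver honours it varies, so the value should be confirmed by measurement.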
Refactoring the design
To keep the code maintainable and for regular tuning of the application performance, the programmer should not shy away from making any design changes. The application design should be changed in order to improve performance. If refactoring is not permissible then the following solutions can be applied:
1. Authentication time should be minimal, in fact negligible. It should ideally be less than a millisecond.
2. Frequently accessed static resources such as images and JavaScript should be cached or moved to a separate web server.
3. The session persistence time should be measured and the code fine-tuned accordingly.
Logging
Logging is an essential part of server-side applications, but excessive logging causes performance issues. To measure the application's performance without this overhead, all logging can be switched off and the performance then analysed.
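Much of the cost of logging comes from building messages that are then discarded. As a sketch using java.util.logging (the logger name and the expensiveDump helper are assumptions), a message can be built only when its level is enabled, either through a Supplier or through an explicit isLoggable check.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

final class LoggingSketch {
    // Hypothetical logger name.
    private static final Logger LOG = Logger.getLogger("perf.sketch");

    // Counts how often the expensive message is actually built.
    static final AtomicInteger messagesBuilt = new AtomicInteger();

    /** Stands in for a costly message, such as a dump of request state. */
    static String expensiveDump() {
        messagesBuilt.incrementAndGet();
        return "state=" + System.nanoTime();
    }

    static void handleRequest() {
        // The supplier is invoked only if FINE is enabled for this logger.
        LOG.fine(() -> expensiveDump());

        // Equivalent explicit guard.
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine(expensiveDump());
        }
    }

    static void setLevel(Level level) {
        LOG.setLevel(level);
    }
}
```

With the level set to OFF, handleRequest() never calls expensiveDump(), so switching logging off removes the message-building cost as well as the output cost.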
Data Transfer
The following points should be kept in mind:
1. The amount of data transferred between the browser and the web server should be kept to a minimum. Too many variables should not be passed back and forth.
2. AJAX calls can be used to fetch only the required data from the web server.
Code Optimization
Last but not least, we can optimize the code itself. Java compiles to compact bytecode that the JVM interprets and compiles at run time, and speed, or the lack of it, is what most often comes up as a problem. Here are a few simple tricks for optimizing code:
1. Don't repeat the same method call in a loop condition. For example, instead of writing
for (int a = 0; a < str.length(); a++) {
    char ch = str.charAt(a);
}
write it as
int b = str.length();
for (int x = 0; x < b; x++) {
    char ch1 = str.charAt(x);
}
This removes the overhead of calling the method each time the loop condition is evaluated.
2. Don't create objects unnecessarily. For example do not declare
Date currentDate =new Date();
if (requiredCondition) {
// use currentDate
}
Instead, create the Date object inside the if block, so that it is not created when requiredCondition is false:
if (requiredCondition){
Date currentDate =new Date();
// use currentDate
}
3. In large-scale enterprise applications, new features are added regularly and old features become obsolete. If the code for an old feature is never going to be used again, remove it. Don't wrap old code in a condition such as
if (oldFeatureRequired){
// Old Code.
}
This makes the code larger and increases maintenance time.
4. Avoid iterating with Enumeration. Enumeration is typically used like this:
for (Enumeration enumValue = vect.elements(); enumValue.hasMoreElements(); ) {
    str = (String) enumValue.nextElement();
    // Perform operations
}
In this code, enumValue.hasMoreElements() is called on every iteration and takes processing time. enumValue.nextElement() also does internal processing on each call, such as incrementing its internal counter.
To retrieve elements from a Vector, the preferred way is:
int size = vect.size();
for (int k = 0; k< size; k++) {
str = (String)vect.elementAt(k);
// Perform Operations
}
5. java.util.Date has some performance problems, particularly with internationalization. If you frequently print the current time as something other than the long milliseconds-since-epoch value it is usually represented as, you may be able to cache your representation of the current time and create a separate thread that updates it every N seconds (N depends on how accurately you need to represent the current time). You could also delay converting the time until a client needs it and the cached representation is known to be stale.
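A minimal sketch of the cached-time idea follows. The class name CachedClock, the date pattern and the use of a scheduled executor as the "separate thread" are assumptions; the formatter is touched only by the constructor and then by the single updater thread, because SimpleDateFormat is not thread-safe.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical clock that formats the time every N seconds instead of on every read. */
final class CachedClock implements AutoCloseable {
    private final SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
    private final ScheduledExecutorService updater;
    // volatile so that readers always see the latest formatted value.
    private volatile String formattedNow;

    CachedClock(long refreshSeconds) {
        formattedNow = format.format(new Date());
        updater = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "cached-clock");
            t.setDaemon(true); // do not keep the JVM alive just for the clock
            return t;
        });
        // Refresh the cached text every refreshSeconds.
        updater.scheduleAtFixedRate(
                () -> formattedNow = format.format(new Date()),
                refreshSeconds, refreshSeconds, TimeUnit.SECONDS);
    }

    /** Returns the cached text; may be up to refreshSeconds old. */
    String now() {
        return formattedNow;
    }

    @Override
    public void close() {
        updater.shutdownNow();
    }
}
```

Callers read now() as often as they like at the cost of a single volatile read, and accept that the value may lag by up to N seconds.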
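6. When joining several Strings, use StringBuffer instead of String concatenation. The compiler folds a concatenation of literals like the one below into a single constant, so the gain shows up when the parts are variables or the joining happens in a loop. Instead of writing
String str1 = "How" + "are" + "you" + "today";
write it as
StringBuffer strBuf = new StringBuffer(50);
strBuf.append("How");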
strBuf.append("are");
strBuf.append("you");
strBuf.append("today");
If the maximum length of the StringBuffer is known in advance, giving it as the initial capacity saves the overhead of growing the buffer later. If the maximum length is not known, each increase in capacity means that a new character array has to be allocated, all the contents of the old array copied into the new one, and the old array discarded for garbage collection.
Therefore an appropriate size for the StringBuffer should be declared up front, but over-allocation should be avoided.
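The effect of the initial size can be observed through StringBuffer.capacity(). In this small sketch (the class and method names are assumptions), a buffer sized generously keeps its capacity, while one sized too small has to grow as text is appended.

```java
final class BufferCapacitySketch {
    /** Appends all parts to a buffer created with the given capacity and reports the final capacity. */
    static int capacityAfterAppends(int initialCapacity, String... parts) {
        StringBuffer buffer = new StringBuffer(initialCapacity);
        for (String part : parts) {
            buffer.append(part);
        }
        // If this differs from initialCapacity, the internal array was reallocated.
        return buffer.capacity();
    }
}
```

With the four words from the example above (14 characters in total), an initial capacity of 50 is left unchanged, whereas an initial capacity of 4 forces at least one reallocation.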
7. When working with the String class, prefer the charAt() method to startsWith() for testing a single leading character, because startsWith() makes relatively more comparisons while preparing to compare its prefix with another string.
So instead of writing,
if (str.startsWith("r")) {
// Perform operations
}
write it as,
if (str.length() > 0 && str.charAt(0) == 'r') {
// Perform operations
}
8. Do not initialize objects twice. For example, in the code below,
public class r {
private Vector vect = new Vector();
public r() {
vect = new Vector();
}
}
The compiler generates the following code for the Constructor.
public r() {
vect = new Vector();
vect = new Vector();
}
The initialization code for instance variables is moved into the constructor by the compiler. So if an instance variable has been initialized where it is declared, there is no need to initialize it again inside the constructor. Initialize it in one place only. For example,
public class r {
private Vector vect;
public r() {
vect = new Vector();
}
}
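The double initialization can be made visible by counting constructor calls. This is only a sketch: InitCount, Resource, Twice and Once are hypothetical names, and Resource stands in for the Vector of the example.

```java
final class InitCount {
    // Number of Resource objects created so far.
    static int created;

    static final class Resource {
        Resource() {
            created++;
        }
    }

    /** Initializes the field at its declaration and again in the constructor. */
    static final class Twice {
        private Resource resource = new Resource();

        Twice() {
            resource = new Resource(); // the first Resource becomes garbage immediately
        }
    }

    /** Initializes the field in one place only. */
    static final class Once {
        private Resource resource;

        Once() {
            resource = new Resource();
        }
    }
}
```

Constructing a Twice creates two Resource objects, one of which is discarded at once; constructing a Once creates exactly one.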
9. StringBuilder vs. StringBuffer
As regards the use of StringBuilder and StringBuffer:
(i) Either StringBuffer or StringBuilder can be used as long as the instance is used by a single thread only. StringBuilder is the better option there because its implementation is faster, and it offers the same overloaded append and insert methods.
(ii) StringBuffer is the more viable choice when access to the instance must be synchronised. StringBuffer is implicitly synchronised, i.e. when multiple threads try to access the instance, only one thread at a time is able to do so.
(iii) StringBuffer provides thread safety, but only for its individual methods, and this takes a toll on performance. For a small application it does not make much of a difference, but when lots of objects are involved the impact is considerable. So StringBuilder is the better option unless thread safety is required; the performance improvement is reported to be about 34%.
10. Use of the static keyword
The ‘static’ keyword makes a variable belong to the class rather than to each object. It is often used when declaring constants. A static variable is a single copy shared by all objects, so it has a direct advantage for memory and performance. It is common practice among seasoned programmers to use the private + static + final combination for a constant.
Example:
private final static double C = 3.5;
private final static int V = 332;
The private, static, final combination gives the most room for program performance. Choosing the right modifiers makes compile-time optimization possible and improves performance at run time.
11. Inlining methods
Method inlining is one of the most effective ways to cope with method call overhead. Inlining can be done automatically by the compiler or manually by the programmer. It is implemented by expanding the inlined method's code in the code that calls the method. It is seen that calling the method takes about 6500 units of time, whereas the inlined version takes only about 3300 units, so inlining makes the code almost twice as fast. When the compiler inlines automatically, it expands the called method inline in the caller, improving speed at the expense of code space. Inlining can also be achieved dynamically in a running program. Inlining works best when methods are simple and short. For instance, a method that just returns the value of a private field is a suitable candidate for inlining.
Conclusion
Performance tuning is more of an art and is always specific to the application. A few general guidelines, if kept in mind, can certainly help an application overcome many performance-related issues. Load testing should be carried out before starting the tuning process. The bottleneck should be identified and the tuning strategy devised accordingly. Performance tuning does not end after deployment; rather, it is an iterative and continuous process.