Java High Performance Computing

Java is well known as a high-level, robust, secure, object-oriented, and platform-independent language and environment. It is equally well known that these desirable attributes exact a significant performance penalty, which has so far limited Java's utility in numerically intensive, high performance computing.

To ensure platform independence, Java programs are compiled into an intermediate representation consisting of bytecode instructions for a virtual, stack-based machine. These bytecode instructions are then either interpreted directly by a Java Virtual Machine (JVM) or compiled to native machine code prior to execution. If the underlying execution engine relies entirely on interpretation, Java applications perform poorly. It has been found that over 60% of Java execution time is spent interpreting bytecode instructions, which makes bytecode interpretation a major Java performance bottleneck.
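To make the per-instruction cost of interpretation concrete, here is a minimal sketch of an interpreter loop for a toy stack machine. The four opcodes (PUSH, ADD, MUL, HALT) and the class name StackInterpreterSketch are hypothetical and chosen for illustration only. They are not real JVM bytecode, and a production JVM interpreter is far more elaborate.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy interpreter for a hypothetical four-opcode stack machine (not real JVM bytecode). */
final class StackInterpreterSketch {
    // Hypothetical opcodes, assumed for this illustration only.
    static final int PUSH = 0; // push the literal that follows in the code array
    static final int ADD  = 1; // pop two values, push their sum
    static final int MUL  = 2; // pop two values, push their product
    static final int HALT = 3; // stop and return the top of the stack

    static int run(int[] code) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0;
        while (true) {
            int opcode = code[pc++];          // fetch the next instruction
            switch (opcode) {                 // decode it and dispatch to a handler
                case PUSH:
                    stack.push(code[pc++]);
                    break;
                case ADD:
                    stack.push(stack.pop() + stack.pop());
                    break;
                case MUL:
                    stack.push(stack.pop() * stack.pop());
                    break;
                case HALT:
                    return stack.pop();
                default:
                    throw new IllegalStateException("unknown opcode " + opcode);
            }
        }
    }

    public static void main(String[] args) {
        // Encodes (2 + 3) * 4 in postfix order.
        int[] program = {PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT};
        System.out.println(run(program)); // prints 20
    }
}
```

Every arithmetic operation here is wrapped in a fetch, a switch dispatch, and stack traffic. That bookkeeping is the kind of overhead the paragraph above attributes to pure interpretation.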

One way to eliminate this bottleneck is to identify frequently executed or performance-critical fragments of bytecode, translate these fragments into native machine code at runtime, and then execute the machine code directly. This technique, known as Just-In-Time (JIT) compilation, greatly accelerates the execution of most large-scale Java scientific applications. However, the generated machine code is geared toward a general purpose CPU architecture, and thus suffers from the overhead inherent in the traditional fetch-decode-execute software execution paradigm.
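The hot-spot idea behind JIT compilation can be sketched as a counter-based dispatcher: a fragment runs on the slow path until it has been invoked often enough, after which a faster version is installed. This is only an illustration under stated assumptions. The class HotSpotDispatchSketch, the threshold value, and the use of a pre-supplied "optimized" function in place of real native code generation are all hypothetical and do not describe how any particular JVM is implemented.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntUnaryOperator;

/** Counter-based hot-spot detection; "compilation" is simulated by swapping in a second function. */
final class HotSpotDispatchSketch {
    static final int HOT_THRESHOLD = 3; // hypothetical value, chosen for illustration

    private final Map<String, Integer> callCounts = new HashMap<>();
    private final Map<String, IntUnaryOperator> compiled = new HashMap<>();

    /** Runs the fragment called name, promoting it to the fast path once it becomes hot. */
    int invoke(String name, IntUnaryOperator interpreted, IntUnaryOperator optimized, int arg) {
        IntUnaryOperator fast = compiled.get(name);
        if (fast != null) {
            return fast.applyAsInt(arg);             // already promoted: take the fast path
        }
        int count = callCounts.merge(name, 1, Integer::sum);
        if (count >= HOT_THRESHOLD) {
            compiled.put(name, optimized);           // stands in for translation to native code
        }
        return interpreted.applyAsInt(arg);          // this call still runs on the slow path
    }

    boolean isCompiled(String name) {
        return compiled.containsKey(name);
    }

    public static void main(String[] args) {
        HotSpotDispatchSketch runtime = new HotSpotDispatchSketch();
        IntUnaryOperator slow = x -> x * x;          // placeholder for interpreted execution
        IntUnaryOperator quick = x -> x * x;         // placeholder for compiled execution
        for (int i = 1; i <= 4; i++) {
            int result = runtime.invoke("square", slow, quick, i);
            System.out.println(result + " compiled=" + runtime.isCompiled("square"));
        }
    }
}
```

With the assumed threshold of 3, the first two calls leave the fragment on the slow path, the third call triggers promotion, and the fourth call is served by the promoted version.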

We propose a new computing paradigm called Just-in-Time Reconfigurable Computing (JITRC) that extends the JIT technique one step further by translating Java bytecode to hardware on the fly instead of compiling it to machine-specific object code. By using reconfigurable hardware devices such as Field Programmable Gate Arrays (FPGAs), computing hardware can be dynamically customized to suit the needs of a particular application. Because execution directly in hardware is faster than software execution, this reconfigurable computing paradigm should significantly enhance the performance of large-scale Java scientific applications.

The JITRC system will be designed to dynamically translate key portions of a Java application to a hardware description representation, such as the Very High Speed Integrated Circuit Hardware Description Language (VHDL) or Verilog Hardware Description Language (Verilog HDL). This hardware description program will then be transformed to an intermediate netlist format, synthesized to a hardware configuration bit-stream, and ultimately mapped to a particular reconfigurable hardware device at runtime. We plan to perform an extensive performance evaluation on the JITRC system to measure the advantages of this approach for high performance computing.
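As a rough illustration of the first stage of such a pipeline, the sketch below symbolically executes a tiny postfix (stack-style) instruction sequence and emits one VHDL-style signal assignment per operation. It is not the JITRC translator. The class StackToDataflowSketch, the "add" and "mul" tokens, and the generated signal names t0, t1, ... are assumptions made for this example. Signal declarations, bit widths, control flow, and the later netlist and bitstream stages are all omitted.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Turns postfix stack code into VHDL-style signal assignments via symbolic execution. */
final class StackToDataflowSketch {
    /** Each "add" or "mul" token becomes one assignment; any other token is treated as an operand name. */
    static List<String> translate(String[] ops) {
        Deque<String> stack = new ArrayDeque<>();   // holds signal names, not values
        List<String> assignments = new ArrayList<>();
        int next = 0;                                // index of the next temporary signal
        for (String op : ops) {
            if (op.equals("add") || op.equals("mul")) {
                String right = stack.pop();
                String left = stack.pop();
                String signal = "t" + next++;
                String operator = op.equals("add") ? "+" : "*";
                assignments.add(signal + " <= " + left + " " + operator + " " + right + ";");
                stack.push(signal);                  // the result feeds later operations
            } else {
                stack.push(op);                      // operand: push its name symbolically
            }
        }
        return assignments;
    }

    public static void main(String[] args) {
        // Postfix form of (a + b) * c.
        String[] code = {"a", "b", "add", "c", "mul"};
        for (String line : translate(code)) {
            System.out.println(line);
        }
        // prints:
        // t0 <= a + b;
        // t1 <= t0 * c;
    }
}
```

Because the stack holds signal names rather than values, the implicit operand stack disappears from the output and only the dataflow between operations remains, which is the form a hardware description needs.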

Conference Publications:

  1. Yi-Gang Tai, Chia-Tien Dan Lo, and Kleanthis Psarris, "Applying Out-of-Core QR Decomposition Algorithms on FPGA-Based Systems," in the 17th International Conference on Field Programmable Logic and Applications (FPL 2007), Amsterdam, Netherlands, August 27-29, 2007. (Acceptance Rate: 21%)
  2. Yi-Gang Tai, Chia-Tien Dan Lo, and Kleanthis Psarris, "An FPGA-Based Computation Model for Blocked Algorithms," in the 6th WSEAS International Conference on Applied Informatics and Communications (AIC'06), Elounda, Agios Nikolaos, Crete Island, Greece, August 18-20, 2006, pp. 286-291.

Journal Publications:

  1. Chia-Tien Dan Lo, Yi-Gang Tai, and Kleanthis Psarris, "FPGA-Based Hardware Acceleration on I/O-Bound Scientific Applications," in the WSEAS Transactions on Computers, Vol. 5, No. 12, pp. 2977-2983, December 2006.