Memory Layout of Java Virtual Machine

The Java Virtual Machine (JVM) is an abstract computing machine. It is responsible for most of the features that made Java a great technology. A few of Java's major selling points were:

  • Platform Independence
    • Hardware Independence
    • Operating System Independence
  • Security from malicious programs

It is called an abstract computing machine because it is defined by a specification; any implementation that follows this specification properly can act as a machine that executes instructions.

Sun Microsystems (since acquired by Oracle) offers an implementation of the JVM. The JVM understands a particular binary format called the class file format, and it can execute the instructions in any valid class file. Hence any language (not necessarily Java) that can be compiled to a valid class file can run on the JVM.
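To make the class file format a little more concrete, here is a minimal sketch that reads the first three fields of a compiled class: the magic number (0xCAFEBABE in every valid class file), the minor version and the major version. The class name ClassFileHeader and the method readHeader are made up for this illustration, and the sketch assumes the class file of a JDK class such as String can be opened as a resource.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ClassFileHeader {
    // Returns { magic, minor_version, major_version } read from the
    // class file of the given top-level class (hypothetical helper).
    static int[] readHeader(Class<?> type) {
        String resource = type.getSimpleName() + ".class";
        try (InputStream in = type.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalStateException("class file not found: " + resource);
            }
            DataInputStream data = new DataInputStream(in);
            int magic = data.readInt();            // u4 magic
            int minor = data.readUnsignedShort();  // u2 minor_version
            int major = data.readUnsignedShort();  // u2 major_version
            return new int[] { magic, minor, major };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] header = readHeader(String.class);
        System.out.printf("magic=0x%08X minor=%d major=%d%n",
                header[0], header[1], header[2]);
    }
}
```

The major version printed depends on the JDK that produced the class file, so it will differ from one installation to another.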

The Memory Layout of Java Virtual Machine

It is perfectly fine if you are not familiar with the memory layout of the underlying JVM. Most of the time we do not need it, but sometimes this knowledge can save you many hours of setup and debugging. This is mostly the case in memory-intensive applications where the volume of data to be processed runs into gigabytes, such as the caching solutions Ehcache and Terracotta.

Knowledge of the JVM memory layout is certainly beneficial if you are working on such products.

We must understand that there can be multiple threads of execution in a running JVM, so it needs memory both to serve these threads and for its own internal working.

Memory Model of Java Virtual Machine

At run time the JVM manages certain areas of the memory allocated to it. These are called run-time data areas, and I divide them into two major categories:

  1. Shared areas: created at JVM startup and destroyed when the JVM exits.
  2. Per-thread areas: created when a thread is created and destroyed when the thread exits.

Here are the run time data areas

Program Counter Register

This area is managed per thread, which means that each thread of execution has its own program counter. At any point in time, it contains the address of the virtual machine instruction currently being executed.

At any point in time, a thread is executing an instruction that belongs to a method. If that method is a native method, the value of the program counter register is undefined. In all other cases it contains the address of the instruction.

The Java Virtual Machine Stack

This area is managed per thread, hence each thread will have one private stack for itself. This stack is used to store frames.

A frame is used to store data and partial results, perform dynamic linking and dispatch exceptions. A frame is specific to a single method invocation: each invocation of a method creates a new frame, which is pushed onto the stack of the thread executing the method. When the method invocation completes (normally or abruptly), the frame is destroyed.
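The one-frame-per-invocation rule can be observed from inside a program. The sketch below is only an illustration (the class FrameDepth and the method framesAfter are invented names): it recurses a given number of times and then counts the frames reported by the current thread's stack trace.

```java
class FrameDepth {
    // Recurses `calls` more times, then reports how many frames the
    // current thread's stack trace contains at the deepest point.
    static int framesAfter(int calls) {
        if (calls == 0) {
            return Thread.currentThread().getStackTrace().length;
        }
        return framesAfter(calls - 1);
    }

    public static void main(String[] args) {
        int base = framesAfter(0);
        int deeper = framesAfter(5);
        // Each extra invocation should account for one extra frame.
        System.out.println("extra frames for 5 extra calls: " + (deeper - base));
    }
}
```

On a typical JVM the difference should equal the number of extra calls, although the getStackTrace documentation allows a virtual machine to omit frames in some circumstances.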

An implementation of the abstract machine (JVM) can choose to have either fixed-size stacks or dynamically sized stacks. It can also give the user options to control the stack size: a single size for fixed-size stacks, or a minimum and maximum allowable size for dynamically sized stacks.

StackOverflowError: Now that we are discussing the JVM stack, we should also talk about StackOverflowError. As noted above, the stack can be of fixed or dynamic size.

Assume that the stack is of fixed size and one of our computations demands more stack than is permitted (for example, a recursive method that keeps pushing frames onto the stack). In such a scenario it is impossible to allocate more space, and the JVM throws a StackOverflowError.

OutOfMemoryError: If the stack is of dynamic size and some computation needs more space, the JVM attempts to grow the stack within the permissible maximum size. If there is not enough free system memory and the stack cannot be grown, the JVM throws an OutOfMemoryError.

Also, when a new thread is started, the JVM tries to allocate memory for its stack. If not enough system memory is available to create this stack, the JVM throws an OutOfMemoryError.
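From within Java code, one way to influence the stack size is the Thread constructor that takes a stackSize argument. The Java API documentation describes this value as a suggestion that some platforms may ignore, so the sketch below only requests a size and does not rely on it. The names StackSizeHint, sumTo and sumOnThread are made up for this example.

```java
class StackSizeHint {
    // Plain recursion: every call pushes one more frame onto the stack.
    static long sumTo(int n) {
        return n == 0 ? 0 : n + sumTo(n - 1);
    }

    // Runs sumTo(n) on a new thread created with a requested stack size
    // in bytes. The JVM is free to round or ignore the request.
    static long sumOnThread(int n, long requestedStackBytes) {
        long[] result = new long[1];
        Thread worker = new Thread(null, () -> result[0] = sumTo(n),
                "deep-recursion", requestedStackBytes);
        worker.start();
        try {
            worker.join(); // join makes the worker's write to result visible
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        // Request a 16 MiB stack for a recursion 2000 frames deep.
        System.out.println(sumOnThread(2000, 16L * 1024 * 1024));
    }
}
```

Many JVMs also accept a command-line option for the default thread stack size (on HotSpot this is -Xss), but the available options depend on the implementation.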
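A minimal sketch of the recursive scenario described above: a method that calls itself without a stopping condition until the JVM refuses to push another frame. The class and method names are invented for the illustration, and the depth reached depends on the JVM, its settings and the platform, so no particular number should be expected.

```java
class StackOverflowDemo {
    static int depth;

    // Unbounded recursion: each call pushes a new frame.
    static void recurse() {
        depth++;
        recurse();
    }

    // Returns how many calls were made before the stack ran out.
    static int measure() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The stack has unwound to this frame by the time we get
            // here, so it is safe to continue.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("frames pushed before StackOverflowError: " + measure());
    }
}
```

Catching StackOverflowError is done here only to make the failure observable; in real code it usually signals a bug such as a missing base case.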

Native Method Stacks

This data area is similar to the JVM stack and is also managed per thread. It is needed for the execution of native methods. The same exceptions are associated with these stacks as well. In short, native method stacks behave much like the JVM stacks described above; they are sometimes also called C stacks.

Above were the data areas managed per thread. Now let us discuss the shared ones.

The Heap

This area is created at the time of JVM startup and is destroyed when the JVM stops. The heap is shared among all the threads of execution.

What is the heap used for?

The heap serves several purposes and logically contains other data areas, which are explained below.

  1. The memory for class instances and arrays is allocated from the heap. This means that the moment a thread executes a statement like int[] X = new int[10], a memory block to store this array is allocated from the heap. A JVM implementation may have an automated storage management system (we know it as the garbage collector). The block of memory allocated for objects and arrays can be de-allocated by the storage management system when appropriate (here "appropriate" is a complex term and needs a separate discussion).
  2. The method area is logically a part of the heap. Hence, the heap must provide space for the method area. The method area is shared among all threads of execution. It stores the per-class structures, for example field and method data, the run-time constant pool, the code for methods and constructors, and special initialization methods like <clinit> or <init>.
    1. The run-time constant pool is logically a part of the method area. It is a per-class or per-interface representation of the constant_pool table in a class file, and it is created when the class or interface is created. It contains constants like numeric literals and string literals known at compile time, for example String s = "" or int k = 10. It also contains field and method references, which must be resolved at run time.
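The difference between a string literal taken from a class's constant pool and an object allocated with new can be seen with reference comparisons. The Java Language Specification guarantees that equal string literals refer to the same String instance, while new always creates a distinct object. The class ConstantPoolStrings and its method names below are made up for this sketch.

```java
class ConstantPoolStrings {
    // Two occurrences of the same literal resolve to one shared instance.
    static boolean sameLiteral() {
        String a = "jvm";
        String b = "jvm";
        return a == b;
    }

    // `new` allocates a fresh object, distinct from the literal's instance.
    static boolean freshObjectIsSame() {
        String a = "jvm";
        String c = new String("jvm");
        return a == c;
    }

    // intern() maps the fresh object back to the shared instance.
    static boolean internedIsSame() {
        return new String("jvm").intern() == "jvm";
    }

    public static void main(String[] args) {
        System.out.println(sameLiteral());       // true
        System.out.println(freshObjectIsSame()); // false
        System.out.println(internedIsSame());    // true
    }
}
```

Reference comparison with == is used here deliberately to expose identity; ordinary code should compare strings with equals.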

Error condition associated with the Heap

As with the stack, the heap can be of fixed or dynamic size. If a computation requires more heap than the automatic storage management system can make available, the JVM throws an OutOfMemoryError.

The same error can occur if the JVM is not able to create the method area or the run-time constant pool.
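As a rough illustration of the heap error, the sketch below asks for an array that is too large and catches the resulting error. It assumes a HotSpot-style JVM, where an array of Integer.MAX_VALUE long values is expected to exceed what the VM will allocate; other implementations may behave differently. The class HeapLimitDemo and its methods are invented for this example.

```java
class HeapLimitDemo {
    // Tries to allocate a very large array and reports whether the JVM
    // refused with an OutOfMemoryError (assumption: HotSpot-like limits).
    static boolean oversizedAllocationFails() {
        try {
            long[] huge = new long[Integer.MAX_VALUE];
            return huge.length < 0; // never true; only keeps the array in use
        } catch (OutOfMemoryError e) {
            return true;
        }
    }

    // Returns { free, total, max } heap figures in bytes as seen by Runtime.
    static long[] heapNumbers() {
        Runtime rt = Runtime.getRuntime();
        return new long[] { rt.freeMemory(), rt.totalMemory(), rt.maxMemory() };
    }

    public static void main(String[] args) {
        long[] heap = heapNumbers();
        System.out.println("free=" + heap[0] + " total=" + heap[1] + " max=" + heap[2]);
        System.out.println("oversized allocation failed: " + oversizedAllocationFails());
    }
}
```

The Runtime figures give a quick view of how much heap the JVM currently holds and how far it may grow, which is often the first thing to check when an OutOfMemoryError appears.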
