How does Java Application run? What is Bytecode? Write a simple VM simulation

Yunus Kılıç
Analytics Vidhya
Published in
12 min readJul 6, 2020

--

While researching about Java and JVM architecture, I found a meetup talk[1] and other resources. Then I recognized that other developers might be curious about Java’s behind the scene. So I decided to write an article about how Java Application runs? How Java VM run by explaining step by step? In order to discuss this, I collected some background information. Then finally illustrate VM running steps. The sample application is based on that meetup talk Terence Parr[1].

Background Information

“Java is a general-purpose programming language that is class-based, object-oriented, and designed to have as few implementation dependencies as possible. It is intended to let application developers write once, run anywhere (WORA), meaning that compiled Java code can run on all platforms that support Java without the need for recompilation. Java applications are typically compiled to bytecode that can run on any Java virtual machine (JVM) regardless of the underlying computer architecture. The syntax of Java is similar to C and C++, but it has fewer low-level facilities than either of them. As of 2019, Java was one of the most popular programming languages in use according to GitHub, particularly for client-server web applications, with a reported 9 million developers.”[2]

As stated at the Java definition, once you write your program with Java, it runs anywhere. In order to achieve this goal, Java uses bytecode and JVM. Before starting to discuss Java architecture, let’s see other languages like C.

After the compiling phase, all objects are being merged and made an executable. This called linking. Output exe is ready to run.

On the other hand, Java compiler converts to .class files without linking these files. These class files contain another IL called bytecode. But the computer could not read and run bytecode directly. So a virtual machine needs to convert these to computer-executable code. The advantage of this approach is when you create bytecode files, each architecture uses its own JVM and can run Java. That’s how Java runs at every platform.

On the other hand, JIT Compiler(Just-In-Time) interprets this bytecode at runtime. So Java is comparable slower than some languages. Also, this architecture makes Java is both compiled and interpreted language. Because Java compiler compiles java files to generate class files. Then JIT interprets class files to Native computer code.

Using this background information, we will write our own virtual machine. This virtual machine will be very primitive. But understanding the main concept is more clear. Understanding how bytecode executed is more important than writing simple code.

Converting Java code to bytecode is Java Compiler’s responsibility. But let’s look without detail. The below code contains the most primitive factorial code.

/**
*
@author Yunus
*/
public class Main {

static int factorial(int num) {
if (num < 2) {
return 1;
}
return num * factorial(num - 1);
}

public static void main(String[] args) {
factorial(2);
}
}

With below command bytecode will be generated.

javac Main.java

Let’s look at Main.class file with the below command.

$javap -p -c -v Main.classLast modified Jul 3, 2020; size 378 bytes
MD5 checksum 1928bdcf95625d8714f1aa34b13ec2f2
Compiled from "Main.java"
public class Main
minor version: 0
major version: 52
flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
#1 = Methodref #4.#16 // java/lang/Object."<init>":()V
#2 = Methodref #3.#17 // Main.factorial:(I)I
#3 = Class #18 // Main
#4 = Class #19 // java/lang/Object
#5 = Utf8 <init>
#6 = Utf8 ()V
#7 = Utf8 Code
#8 = Utf8 LineNumberTable
#9 = Utf8 factorial
#10 = Utf8 (I)I
#11 = Utf8 StackMapTable
#12 = Utf8 main
#13 = Utf8 ([Ljava/lang/String;)V
#14 = Utf8 SourceFile
#15 = Utf8 Main.java
#16 = NameAndType #5:#6 // "<init>":()V
#17 = NameAndType #9:#10 // factorial:(I)I
#18 = Utf8 Main
#19 = Utf8 java/lang/Object
{
public Main();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
LineNumberTable:
line 4: 0
static int factorial(int);
descriptor: (I)I
flags: ACC_STATIC
Code:
stack=3, locals=1, args_size=1
0: iload_0
1: iconst_2
2: if_icmpge 7
5: iconst_1
6: ireturn
7: iload_0
8: iload_0
9: iconst_1
10: isub
11: invokestatic #2 // Method factorial:(I)I
14: imul
15: ireturn
LineNumberTable:
line 7: 0
line 8: 5
line 10: 7
StackMapTable: number_of_entries = 1
frame_type = 7 /* same */
public static void main(java.lang.String[]);
descriptor: ([Ljava/lang/String;)V
flags: ACC_PUBLIC, ACC_STATIC
Code:
stack=1, locals=1, args_size=1
0: iconst_2
1: invokestatic #2 // Method factorial:(I)I
4: pop
5: return
LineNumberTable:
line 14: 0
line 15: 5
}
SourceFile: "Main.java"

Our main function has an instruction like,

1: invokestatic #2 // Method factorial:(I)I

Basically says that invoke factorial method. The meaning of the #2 can be found inside the constant pool, states our factorial functions and its information. The correct function can be found from the method table.

#2 = Methodref          #3.#17#3 = Class              #18#17 = NameAndType        #9:#10         // factorial:(I)I
#18 = Utf8 Main
#9 = Utf8 factorial
#10 = Utf8 (I)I

Let’s focus factorial functions’ byte code below:

0: iload_0
1: iconst_2
2: if_icmpge 7
5: iconst_1
6: ireturn
7: iload_0
8: iload_0
9: iconst_1
10: isub
11: invokestatic #2 // Method factorial:(I)I
14: imul
15: ireturn

So to make a factorial code run, we need to convert this into computer operations. There is a huge list of instructions[5]. But I will only implement only used instructions inside factorial code. With this minimalist approach, it will be easy to understand.

In the beginning, we have the only stack to memory because we do not have an object, etc.

Our minimalist VM basically takes instruction and executes it. So in order to keep important information, we need stack pointer(sp), instruction pointer(ip), and frame pointer(fp).

Stack pointer will be used what is the current position at stack.

The instruction pointer keeps which instruction executed.

Frame pointer…

For example: iload_0, load an int value from local variable 0

Also, there is an instructor called iload, load an int value from a local variable #index

iload_0 and iload 0 both of them is true. In order to more generic, I will use iload 0 to load any index. Same for other instructions.

For simplicity reasons, we will just call and number of instruction instead of invokestatic #2

if_icmpge => if value1 is greater than or equal to value2, branch to instruction at branchoffset. So I divided these instructions like below(ilt and brf).

We move forward as if following bytecode is generated. If an instruction has operand then the next instruction number changes.

Factorial method:0: iload 0
2: iconst 2
4: ilt
5: brf 10
7: iconst 1
9: return
10: iload 0
12: iload 0
14: iconst 1
16: isub
17: call 0,1
20: imul
21: return
Main method:22: iconst 2
24: call 0,1
27: print
28: halt

First of all, let's define instructions. Code snippets are taken from conference talk as I said at the beginning, I made some refactoring[1]. I want to demonstrate each step to demonstrate how it is working? Actually codes are very simple. Understanding the concept is more important.

First of all, I create the Instruction class which has only contains instructions that are required for our factorial sample.

import java.util.Arrays;
import java.util.Optional;

public enum Instruction {

ILOAD(1, 1),
ICONST(2, 1),
ILT(3, 0),
RET(4, 0),
ISUB(5, 0),
CALL(6, 2),
IMUL(7, 1),
PRINT(8, 0),
HALT(9, 0),
BRF(10, 1);

private final int code;
private final int numberOfOperand;

Instruction(int code, int numberOfOperands) {
this.code = code;
this.numberOfOperand = numberOfOperands;
}

public static Optional<Instruction> findByCode(int code) {
return Arrays.asList(Instruction.values()).stream().filter(x -> x.getCode() == code).findFirst();
}

public int getCode() {
return code;
}

public int getNumberOfOperand() {
return numberOfOperand;
}
}

Define Test class, bytecodes as input. And state starts with the main method’s first instruction.

public class Test {
static int[] factorial = {
Instruction.ILOAD.getCode(), 0,
Instruction.ICONST.getCode(), 2,
Instruction.ILT.getCode(),
Instruction.BRF.getCode(), 10,
Instruction.ICONST.getCode(), 1,
Instruction.RET.getCode(),
Instruction.ILOAD.getCode(), 0,
Instruction.ILOAD.getCode(), 0,
Instruction.ICONST.getCode(), 1,
Instruction.ISUB.getCode(),
Instruction.CALL.getCode(), 0, 1,
Instruction.IMUL.getCode(),
Instruction.RET.getCode(),

Instruction.ICONST.getCode(), 2,
Instruction.CALL.getCode(), 0, 1,
Instruction.PRINT.getCode(),
Instruction.HALT.getCode()
};

public static void main(String[] args) {
Vm vm = new Vm(factorial, 22);
vm.execute();
}
}

Vm class:

public class Vm {
public static final int DEFAULT_STACK_SIZE = 1000;
public static final int NUMBER_OF_GLOBAL = 0;
public static final int FALSE = 0;
public static final int TRUE = 1;
int ip = -1, sp = -1, fp = -1, defaultOffset = -3;
int[] code, globals, stack;

public Vm(int[] code, int startip) {
this.code = code;
ip = startip;
globals = new int[NUMBER_OF_GLOBAL];
stack = new int[DEFAULT_STACK_SIZE];
}

public void execute() {
int opcode = code[ip];
Instruction instruction = getInstruction(opcode);
Optional<Instruction> optionalInstruction;
if (instruction == null) return;
int a, b, addr, offset;

while (opcode != Instruction.HALT.getCode() && ip < code.length) {
instruction = getInstruction(opcode);
ip++;
switch (instruction) {
case ILOAD:
offset = defaultOffset + code[ip++];
stack[++sp] = stack[fp + offset];
break;
case ICONST:
stack[++sp] = code[ip++];
break;
case ILT:
b = stack[sp--];
a = stack[sp--];
stack[++sp] = (a < b) ? TRUE : FALSE;
break;
case ISUB:
b = stack[sp--];
a = stack[sp--];
stack[++sp] = a - b;
break;
case CALL:
addr = code[ip++];
int nargs = code[ip++];
stack[++sp] = nargs;
stack[++sp] = fp;
stack[++sp] = ip;
fp = sp;
ip = addr;
break;
case RET:
int rvalue = stack[sp--];
sp = fp;
ip = stack[sp--];
fp = stack[sp--];
nargs = stack[sp--];
sp -= nargs;
stack[++sp] = rvalue;
break;
case IMUL:
b = stack[sp--];
a = stack[sp--];
stack[++sp] = a * b;
break;
case PRINT:
System.out.println(stack[sp--]);
break;
case BRF:
addr = code[ip++];
if (stack[sp--] == FALSE) ip = addr;
break;
}
opcode = code[ip];
}
}

Now the most important part is coming. I want to discuss the state of the stack after each instruction executed to diving into logic.

Let’s illustrate how to calculate factorial 2.

Before execution of any bytecode, our stack looks like:

22: iconst 2 // bytecode
Instruction.ICONST.getCode(), 2, // input equilavent

This bytecode basically defines an integer constant inside stack.

As you see 2 located at stack. And ip become 24 which means next instruction.

24: call 0,1
Instruction.CALL.getCode(), 0, 1

Current instruction says that call instruction 0 with 1 argument. Calling and returning are the most complicated operations. So I try to give you more details. While calling another function, we need to store important information. This information will be used after returning the call. In our sample application, we store 3 pieces of information.

1-) Number of arguments (1)

2-) Current frame pointer (-1)

3-) Next instruction(27)

Our ip becomes 0 because the next instruction is 0. Sp is 3. Fp becomes 3 because just before going another function our sp is 3. So Fp of factorial function is 3. Finally, we called our factorial function from the main method. Now let’s iterate with factorial function’s first instruction.

0: iload 0
Instruction.ILOAD.getCode(), 0,

As you noticed at code while loading data, there is -3 offset. Why this exists? Because while calling function, we put 3 extra information about the previous situation. While loading data from our stack, we need to get values with this offset to find the correct value. 3 extra information demonstrated as yellow. Arguments are blue and our local values are green. iload 0 loads 2 which was pushed as arguments inside the main method. Then pushes stack again to use as a local value. Now ip becomes 2. Let’s move:

2: iconst 2
Instruction.ICONST.getCode(), 2,

Now we come to n<2 part of our functions. In order to that, we need to put value 2 to our stack.

4: ilt
Instruction.ILT.getCode(),

ILT, just pop the last 2 values and execute less than operations. If true then puts 1 to stack else puts 0. 2<2 is false so puts 0.

5: brf 10
Instruction.BRF.getCode(), 10,

We divide if_icmpge operation to lt and brf. Brf means if false then branch specified index. Brf pops and looks stack whose last element 0. 0 means false, so we need to branch instruction number 10.

10: iload 0
Instruction.ILOAD.getCode(), 0,
12: iload 0
Instruction.ILOAD.getCode(), 0,
14: iconst 1
Instruction.ICONST.getCode(), 1,
16: isub
Instruction.ISUB.getCode(),

Now we are trying to return 2* factorial(2–1) make a subs operations. Our program pops the last two values and gets their difference to put stack. So we popped 2 and 1. Then push 1 to stack.

17: call  0,1
Instruction.CALL.getCode(), 0, 1

Now, we are making recursive call(2* factorial(1).

We put 1 because the number of arguments is 1. Our last fp was 3 so, we push 3. Finally, after 17th instruction, we need to continue with 20th, so push 20. Now we are again beginning of the factorial functions. We will repeat the same operations a while. I will only put images of the current states.

0: iload 0
Instruction.ILOAD.getCode(), 0,
2: iconst 2
Instruction.ICONST.getCode(), 2,
4: ilt
Instruction.ILT.getCode(),

This time 1<2 is true, so push 1 to stack.

5: brf 10
Instruction.BRF.getCode(), 10,

This time the last element is stack is not false, so we are not making branching. We keep continuing with instruction 7.

7: iconst 1
Instruction.ICONST.getCode(), 1,
9: return
Instruction.RET.getCode(),

Another very important instruction is returning. Following steps executed:

  • Get return value from the stack. In our case, it is 1.
  • Get instruction pointer value from the stack. We need this value to find where to continue. In our case, it is 20. After factorial returns recursive call, we need to apply multiply(which inst#20).
  • Get the frame pointer. It is 3.
  • After popping this values from the stack. We decreased sp as number of arguments.
  • We need to push return value to the stack.
20: imul
Instruction.IMUL.getCode(),

Multiplies last two elements inside the stack.

21: return
Instruction.RET.getCode(),

Another return instruction will be executed.

As you see we returned, instruction 27. We are inside the main function again. With local 2 which is return value of the factorial function.

27: print
Instruction.PRINT.getCode(),

As you, there is not any print instruction. But to demonstrate our work, this instruction was added.

28: halt
Instruction.HALT.getCode()

After halt instruction, our program will be exited.

This time I researched Bytecode, JVM, and JIT resources and write outputs of these resources. Next time, the ahead-of-time compilation (AOT compilation) in Java, GraalVM and advantages/disadvantages will be shared. Be connected.

Cites:

1-)https://www.youtube.com/watch?v=OjaAToVkoTw

2-)https://en.wikipedia.org/wiki/Java_(programming_language)

3-)https://www.guru99.com/java-virtual-machine-jvm.html

4-)https://www.lohika.com/methods-invocation-inside-of-a-java-virtual-machine#:~:text=JVM%20methods%20invocations,dedicated%20part%20of%20the%20article.

5-)https://en.wikipedia.org/wiki/Java_bytecode_instruction_listings

--

--

Yunus Kılıç
Analytics Vidhya

I have 10 years of experience in high-quality software application development, implementation, and integration.