Thursday, 11 June 2015

Can java optimize empty array allocation?

Yesterday came across a simple optimization case, here is the original method

public String[] getSomeArray() {
    if (nothing) {
        return new String[0];
    }
    // normal processing ignored for brevity 
}

at the first sight the allocation looks quite wasteful, and I am tempted to carry out some micro optimization like

private static final String[] EMPTY = new String[0];
public String[] getSomeArray() {
    if (nothing) {
        return EMPTY;
    }
    // normal processing ignored for brevity 
}

However, another developer Pedro ringed the bell, maybe java can JIT away the allocation all together, and this does looks a very reasonable JIT target.

Let's find out!

jmh to rescue

@Benchmark
public void test1() {
    for (int i = 0; i < 10000; i++) {
        get1();
    }
}
public String[] get1() {
    return new String[0];
}

private static final String[] CONST = {};
@Benchmark
public void test2() {
    for (int i = 0; i < 10000; i++) {
        get2();
    }
}
public String[] get2() {
    return CONST;
}

A benchmark run gave following result which showed that the two methods ran pretty much at the same speed, therefore the actual allocation could be indeed optimized away

test$ java -jar target/benchmarks.jar -f 1
# JMH 1.9.3 (released 28 days ago)
# VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_40.jdk/Contents/Home/jre/bin/java
# VM options: <none>
# Warmup: 20 iterations, 1 s each
# Measurement: 20 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: org.sample.MyBenchmark.test1

# Run progress: 0.00% complete, ETA 00:01:20
# Fork: 1 of 1
# Warmup Iteration   1: 3177146862.839 ops/s
# Warmup Iteration   2: 2969126090.532 ops/s
...
# Warmup Iteration  19: 3904120378.974 ops/s
# Warmup Iteration  20: 3368973982.889 ops/s
Iteration   1: 3273016452.646 ops/s
Iteration   2: 3720653112.375 ops/s
...
Iteration  19: 2940755393.888 ops/s
Iteration  20: 3490675218.425 ops/s


Result "test1":
  3150112425.866 ±(99.9%) 346620443.427 ops/s [Average]
  (min, avg, max) = (2526859466.365, 3150112425.866, 3790445537.196), stdev = 399168618.122
  CI (99.9%): [2803491982.439, 3496732869.293] (assumes normal distribution)


# JMH 1.9.3 (released 28 days ago)
# VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_40.jdk/Contents/Home/jre/bin/java
# VM options: <none>
# Warmup: 20 iterations, 1 s each
# Measurement: 20 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: org.sample.MyBenchmark.test2

# Run progress: 50.00% complete, ETA 00:00:40
# Fork: 1 of 1
# Warmup Iteration   1: 2646209214.510 ops/s
# Warmup Iteration   2: 3014719359.164 ops/s
...
# Warmup Iteration  19: 3639571958.173 ops/s
# Warmup Iteration  20: 3127621392.815 ops/s
Iteration   1: 3464961418.737 ops/s
Iteration   2: 2827541432.787 ops/s
....
Iteration  19: 2888880315.543 ops/s
Iteration  20: 3109114933.979 ops/s


Result "test2":
  3048325924.714 ±(99.9%) 269904767.209 ops/s [Average]
  (min, avg, max) = (2523324876.886, 3048325924.714, 3573386254.596), stdev = 310822731.303
  CI (99.9%): [2778421157.505, 3318230691.923] (assumes normal distribution)


# Run complete. Total time: 00:01:20

Benchmark           Mode  Cnt           Score           Error  Units
MyBenchmark.test1  thrpt   20  3150112425.866 ± 346620443.427  ops/s
MyBenchmark.test2  thrpt   20  3048325924.714 ± 269904767.209  ops/s
test$ 

to be super conservative, let's check the assembly

  0x00000001051ffa99: movabs $0x11e65a2c8,%rbx  ;   {metadata({method} {0x000000011e65a2c8} 'get1' '()[Ljava/lang/String;' in 'org/sample/MyBenchmark')}

  0x00000001051ffaa3: and    $0x7ffff8,%edx
  0x00000001051ffaa9: cmp    $0x0,%edx

  0x0000000109ae1f51: movabs $0x122f3d440,%rbx  ;   {metadata({method} {0x0000000122f3d440} 'get2' '()[Ljava/lang/String;' in 'org/sample/MyBenchmark')}
  0x0000000109ae1f5b: and    $0x7ffff8,%eax
  0x0000000109ae1f61: cmp    $0x0,%eax

Now we see the exact same native codes were generated.

Case closed.

Java does optimize empty array allocation.


Happy Coding!

by Dapeng