String Comparison in a Little Detail
Why we avoid the '==' operator to compare two strings
We all know that we should use the String class's equals method to compare two strings. The '==' operator compares references, not contents: applied to two distinct String objects with the same content it gives 'false', while applied to two identical String literals it gives 'true'. The concern is that you usually don't know whether a String reference points to a literal or to a separately created String object. That's why you are always encouraged to use the equals method for string comparison. Let's check a few examples with String literals, which are the focus of this article:
    /**
     * Created by eananthaneshan on 6/8/16.
     */
    public class Test {
        public static void main(String[] args) {
            String a = "a";
            String b = "b";
            String ab = "ab";
            System.out.println(ab == ab);
            System.out.println(ab == (a + b));
            System.out.println(ab == ("a" + "b"));
        }
    }
Let's see the decompiled code of the above Test.java file:
    //
    // Source code recreated from a .class file by IntelliJ IDEA
    // (powered by Fernflower decompiler)
    //

    public class Test {
        public Test() {
        }

        public static void main(String[] args) {
            String a = "a";
            String b = "b";
            String ab = "ab";
            System.out.println(ab == ab);
            System.out.println(ab == a + b);
            System.out.println(ab == "ab");
        }
    }
Now we understand why the third sysout gives us true: the compiler concatenates the two string constants at compile time, so the expression becomes the String literal "ab", which is the same reference as the variable 'ab'. We still have no clue why the second sysout gives us false.
Let's look at the bytecode of the Test.class file.
    Es-MacBook-Pro:TestString eananthaneshan$ javap -c -verbose Test
    Classfile /Users/eananthaneshan/WSO2/samples/TestString/out/production/TestString/Test.class
      Last modified Aug 6, 2016; size 937 bytes
      MD5 checksum 6571cdd8b9c826e9e6baacb1660db7bf
      Compiled from "Test.java"
    public class Test
      SourceFile: "Test.java"
      minor version: 0
      major version: 52
      flags: ACC_PUBLIC, ACC_SUPER
    Constant pool:
       #1 = Methodref          #12.#34        // java/lang/Object."<init>":()V
       #2 = String             #24            // a
       #3 = String             #26            // b
       #4 = String             #27            // ab
       #5 = Fieldref           #35.#36        // java/lang/System.out:Ljava/io/PrintStream;
       #6 = Methodref          #37.#38        // java/io/PrintStream.println:(Z)V
       #7 = Class              #39            // java/lang/StringBuilder
       #8 = Methodref          #7.#34         // java/lang/StringBuilder."<init>":()V
       #9 = Methodref          #7.#40         // java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      #10 = Methodref          #7.#41         // java/lang/StringBuilder.toString:()Ljava/lang/String;
      #11 = Class              #42            // Test
      #12 = Class              #43            // java/lang/Object
      #13 = Utf8               <init>
      #14 = Utf8               ()V
      #15 = Utf8               Code
      #16 = Utf8               LineNumberTable
      #17 = Utf8               LocalVariableTable
      #18 = Utf8               this
      #19 = Utf8               LTest;
      #20 = Utf8               main
      #21 = Utf8               ([Ljava/lang/String;)V
      #22 = Utf8               args
      #23 = Utf8               [Ljava/lang/String;
      #24 = Utf8               a
      #25 = Utf8               Ljava/lang/String;
      #26 = Utf8               b
      #27 = Utf8               ab
      #28 = Utf8               StackMapTable
      #29 = Class              #23            // "[Ljava/lang/String;"
      #30 = Class              #44            // java/lang/String
      #31 = Class              #45            // java/io/PrintStream
      #32 = Utf8               SourceFile
      #33 = Utf8               Test.java
      #34 = NameAndType        #13:#14        // "<init>":()V
      #35 = Class              #46            // java/lang/System
      #36 = NameAndType        #47:#48        // out:Ljava/io/PrintStream;
      #37 = Class              #45            // java/io/PrintStream
      #38 = NameAndType        #49:#50        // println:(Z)V
      #39 = Utf8               java/lang/StringBuilder
      #40 = NameAndType        #51:#52        // append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      #41 = NameAndType        #53:#54        // toString:()Ljava/lang/String;
      #42 = Utf8               Test
      #43 = Utf8               java/lang/Object
      #44 = Utf8               java/lang/String
      #45 = Utf8               java/io/PrintStream
      #46 = Utf8               java/lang/System
      #47 = Utf8               out
      #48 = Utf8               Ljava/io/PrintStream;
      #49 = Utf8               println
      #50 = Utf8               (Z)V
      #51 = Utf8               append
      #52 = Utf8               (Ljava/lang/String;)Ljava/lang/StringBuilder;
      #53 = Utf8               toString
      #54 = Utf8               ()Ljava/lang/String;
    {
      public Test();
        flags: ACC_PUBLIC
        Code:
          stack=1, locals=1, args_size=1
             0: aload_0
             1: invokespecial #1                  // Method java/lang/Object."<init>":()V
             4: return
          LineNumberTable:
            line 4: 0
          LocalVariableTable:
            Start  Length  Slot  Name   Signature
                0       5     0  this   LTest;

      public static void main(java.lang.String[]);
        flags: ACC_PUBLIC, ACC_STATIC
        Code:
          stack=4, locals=4, args_size=1
             0: ldc           #2                  // String a
             2: astore_1
             3: ldc           #3                  // String b
             5: astore_2
             6: ldc           #4                  // String ab
             8: astore_3
             9: getstatic     #5                  // Field java/lang/System.out:Ljava/io/PrintStream;
            12: aload_3
            13: aload_3
            14: if_acmpne     21
            17: iconst_1
            18: goto          22
            21: iconst_0
            22: invokevirtual #6                  // Method java/io/PrintStream.println:(Z)V
            25: getstatic     #5                  // Field java/lang/System.out:Ljava/io/PrintStream;
            28: aload_3
            29: new           #7                  // class java/lang/StringBuilder
            32: dup
            33: invokespecial #8                  // Method java/lang/StringBuilder."<init>":()V
            36: aload_1
            37: invokevirtual #9                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
            40: aload_2
            41: invokevirtual #9                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
            44: invokevirtual #10                 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
            47: if_acmpne     54
            50: iconst_1
            51: goto          55
            54: iconst_0
            55: invokevirtual #6                  // Method java/io/PrintStream.println:(Z)V
            58: getstatic     #5                  // Field java/lang/System.out:Ljava/io/PrintStream;
            61: aload_3
            62: ldc           #4                  // String ab
            64: if_acmpne     71
            67: iconst_1
            68: goto          72
            71: iconst_0
            72: invokevirtual #6                  // Method java/io/PrintStream.println:(Z)V
            75: return
          LineNumberTable:
            line 6: 0
            line 7: 3
            line 8: 6
            line 9: 9
            line 10: 25
            line 11: 58
            line 12: 75
          LocalVariableTable:
            Start  Length  Slot  Name   Signature
                0      76     0  args   [Ljava/lang/String;
                3      73     1     a   Ljava/lang/String;
                6      70     2     b   Ljava/lang/String;
                9      67     3    ab   Ljava/lang/String;
          StackMapTable: number_of_entries = 6
            frame_type = 255 /* full_frame */
              offset_delta = 21
              locals = [ class "[Ljava/lang/String;", class java/lang/String, class java/lang/String, class java/lang/String ]
              stack = [ class java/io/PrintStream ]
            frame_type = 255 /* full_frame */
              offset_delta = 0
              locals = [ class "[Ljava/lang/String;", class java/lang/String, class java/lang/String, class java/lang/String ]
              stack = [ class java/io/PrintStream, int ]
            frame_type = 95 /* same_locals_1_stack_item */
              stack = [ class java/io/PrintStream ]
            frame_type = 255 /* full_frame */
              offset_delta = 0
              locals = [ class "[Ljava/lang/String;", class java/lang/String, class java/lang/String, class java/lang/String ]
              stack = [ class java/io/PrintStream, int ]
            frame_type = 79 /* same_locals_1_stack_item */
              stack = [ class java/io/PrintStream ]
            frame_type = 255 /* full_frame */
              offset_delta = 0
              locals = [ class "[Ljava/lang/String;", class java/lang/String, class java/lang/String, class java/lang/String ]
              stack = [ class java/io/PrintStream, int ]
    }
Go to the main method and look at the first instruction: the JVM loads a value from the constant pool with ldc #2. Go to the constant pool and see what is in #2: it is a String entry that points to #24, and #24 is a Utf8 entry with the value 'a'. What does Utf8 mean? It is a stream of bytes representing a UTF-8 encoded sequence of characters[1]. So this is how string literals are loaded.
What happens when System.out.println(ab == a + b) gets executed? See offset 29 of the main method in the bytecode. A new StringBuilder is created, the append() method is called on it twice (once for a, once for b), and then the toString method returns a String object's reference.
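To make the effect of loading literals from the constant pool concrete, here is a minimal sketch (the class and method names are made up for illustration). Since identical literals resolve to the same interned String, two variables initialised with the same literal should hold the same reference:

```java
public class LiteralDemo {
    // Both variables are initialised from the same literal, which is loaded
    // from the constant pool and resolves to one interned String instance.
    static boolean sameLiteralSameReference() {
        String first = "a";
        String second = "a";
        return first == second;
    }

    public static void main(String[] args) {
        System.out.println(sameLiteralSameReference()); // prints true
    }
}
```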
Let's see the StringBuilder.toString() method[2]
    public String toString() {
        // Create a copy, don't share the array
        return new String(value, 0, count);
    }
The StringBuilder.toString() method creates a new String object. It is not a literal; the returned reference points to that new object's memory location, not to the "ab" entry in the string pool, which is why the second sysout prints false.
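The following sketch contrasts run-time concatenation with compile-time concatenation. The class and method names are hypothetical. The third method relies on the Java language rule that final local variables initialised with literals count as compile-time constants, so the compiler can fold their concatenation into a single literal:

```java
public class ConcatDemo {
    // The operands are ordinary local variables, so a + b is evaluated at
    // run time and yields a newly created String object.
    static boolean runtimeConcatIsSameReference() {
        String a = "a";
        String b = "b";
        String ab = "ab";
        return ab == (a + b);
    }

    // Same contents, compared by value instead of by reference.
    static boolean runtimeConcatEquals() {
        String a = "a";
        String b = "b";
        String ab = "ab";
        return ab.equals(a + b);
    }

    // final locals initialised with literals are compile-time constants,
    // so a + b is folded into the literal "ab" by the compiler.
    static boolean constantConcatIsSameReference() {
        final String a = "a";
        final String b = "b";
        String ab = "ab";
        return ab == (a + b);
    }

    public static void main(String[] args) {
        System.out.println(runtimeConcatIsSameReference());  // prints false
        System.out.println(runtimeConcatEquals());           // prints true
        System.out.println(constantConcatIsSameReference()); // prints true
    }
}
```

The only difference between the first and third method is the final modifier, which moves the concatenation from run time to compile time.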
We should also see what happens when we create a new String object.
First we look at the Java code that creates a new String.
    /**
     * Created by eananthaneshan on 6/8/16.
     */
    public class Test {
        public static void main(String[] args) {
            String a = new String("a");
        }
    }
Here is the decompiled code of the above Java file.
    //
    // Source code recreated from a .class file by IntelliJ IDEA
    // (powered by Fernflower decompiler)
    //

    public class Test {
        public Test() {
        }

        public static void main(String[] args) {
            new String("a");
        }
    }
Not much difference. We should look at String's constructor too[3]:
    /**
     * Initializes a newly created {@code String} object so that it represents
     * the same sequence of characters as the argument; in other words, the
     * newly created string is a copy of the argument string. Unless an
     * explicit copy of {@code original} is needed, use of this constructor is
     * unnecessary since Strings are immutable.
     *
     * @param  original
     *         A {@code String}
     */
    public String(String original) {
        int size = original.count;
        char[] originalValue = original.value;
        char[] v;
        if (originalValue.length > size) {
            // The array representing the String is bigger than the new
            // String itself.  Perhaps this constructor is being called
            // in order to trim the baggage, so make a copy of the array.
            int off = original.offset;
            v = Arrays.copyOfRange(originalValue, off, off+size);
        } else {
            // The array representing the String is the same
            // size as the String, so no point in making a copy.
            v = originalValue;
        }
        this.offset = 0;
        this.count = size;
        this.value = v;
    }
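Interestingly, the constructor takes a String as its argument and stores that argument's char array (or a trimmed copy of it). The new String is a separate object with its own reference, so '==' treats it as equal to neither the argument nor the literal, even though the contents match. This listing comes from an older JDK in which String still had offset and count fields; newer versions of the constructor are shorter, but they still produce a distinct object.
Finally, the bytecode: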
    Es-MacBook-Pro:TestString eananthaneshan$ javap -c -verbose Test
    Classfile /Users/eananthaneshan/WSO2/samples/TestString/out/production/TestString/Test.class
      Last modified Aug 6, 2016; size 475 bytes
      MD5 checksum 0d23a6e244f295b3d7bce7241b7022d7
      Compiled from "Test.java"
    public class Test
      SourceFile: "Test.java"
      minor version: 0
      major version: 52
      flags: ACC_PUBLIC, ACC_SUPER
    Constant pool:
       #1 = Methodref          #6.#23         // java/lang/Object."<init>":()V
       #2 = Class              #24            // java/lang/String
       #3 = String             #20            // a
       #4 = Methodref          #2.#25         // java/lang/String."<init>":(Ljava/lang/String;)V
       #5 = Class              #26            // Test
       #6 = Class              #27            // java/lang/Object
       #7 = Utf8               b
       #8 = Utf8               Ljava/lang/String;
       #9 = Utf8               <init>
      #10 = Utf8               ()V
      #11 = Utf8               Code
      #12 = Utf8               LineNumberTable
      #13 = Utf8               LocalVariableTable
      #14 = Utf8               this
      #15 = Utf8               LTest;
      #16 = Utf8               main
      #17 = Utf8               ([Ljava/lang/String;)V
      #18 = Utf8               args
      #19 = Utf8               [Ljava/lang/String;
      #20 = Utf8               a
      #21 = Utf8               SourceFile
      #22 = Utf8               Test.java
      #23 = NameAndType        #9:#10         // "<init>":()V
      #24 = Utf8               java/lang/String
      #25 = NameAndType        #9:#28         // "<init>":(Ljava/lang/String;)V
      #26 = Utf8               Test
      #27 = Utf8               java/lang/Object
      #28 = Utf8               (Ljava/lang/String;)V
    {
      public Test();
        flags: ACC_PUBLIC
        Code:
          stack=1, locals=1, args_size=1
             0: aload_0
             1: invokespecial #1                  // Method java/lang/Object."<init>":()V
             4: return
          LineNumberTable:
            line 4: 0
          LocalVariableTable:
            Start  Length  Slot  Name   Signature
                0       5     0  this   LTest;

      public static void main(java.lang.String[]);
        flags: ACC_PUBLIC, ACC_STATIC
        Code:
          stack=3, locals=2, args_size=1
             0: new           #2                  // class java/lang/String
             3: dup
             4: ldc           #3                  // String a
             6: invokespecial #4                  // Method java/lang/String."<init>":(Ljava/lang/String;)V
             9: astore_1
            10: return
          LineNumberTable:
            line 7: 0
            line 8: 10
          LocalVariableTable:
            Start  Length  Slot  Name   Signature
                0      11     0  args   [Ljava/lang/String;
               10       1     1     a   Ljava/lang/String;
    }
Look at the main method: first a new String object is allocated (new #2), then 'a' is loaded from the constant pool (ldc #3), then the constructor is called with the loaded 'a' as its parameter (invokespecial #4). The result is a newly created String object, and only the reference to this new object is stored in the variable (astore_1).
How to take advantage of String literals
Comparing String literals with '==' is faster[4] than comparing String objects with equals, because only the references are compared. So how can we use that? We can take advantage of String.intern()[5]. The intern method returns the string pool reference equivalent to the given String object, which is the same reference a literal with that content has. So when comparing, call intern on the String objects first and then use the '==' operator.
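A small sketch of the consequence (the class and method names are hypothetical): the object built by new String("a") has the same contents as the literal "a" but a different reference, so '==' and equals disagree.

```java
public class NewStringDemo {
    // new String(...) always allocates a fresh object, so its reference
    // differs from the pooled literal's reference.
    static boolean newObjectIsSameReferenceAsLiteral() {
        String literal = "a";
        String copy = new String("a");
        return literal == copy;
    }

    // equals compares the character contents, which are identical.
    static boolean newObjectEqualsLiteral() {
        String literal = "a";
        String copy = new String("a");
        return literal.equals(copy);
    }

    public static void main(String[] args) {
        System.out.println(newObjectIsSameReferenceAsLiteral()); // prints false
        System.out.println(newObjectEqualsLiteral());            // prints true
    }
}
```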
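As an illustration of intern(), here is a minimal sketch with made-up class and method names. The string is built at run time, so it starts out as a separate object; after intern() it can be compared to the literal with '==':

```java
public class InternDemo {
    // Builds "ab" at run time, so the result is not the pooled literal.
    static String buildAtRuntime() {
        return new StringBuilder().append("a").append("b").toString();
    }

    // Without intern(), the references differ.
    static boolean plainReferenceMatchesLiteral() {
        return buildAtRuntime() == "ab";
    }

    // intern() returns the pooled instance, which is the literal's reference.
    static boolean internedReferenceMatchesLiteral() {
        return buildAtRuntime().intern() == "ab";
    }

    public static void main(String[] args) {
        System.out.println(plainReferenceMatchesLiteral());    // prints false
        System.out.println(internedReferenceMatchesLiteral()); // prints true
    }
}
```

Note that '==' is only meaningful here when both sides are pooled references: literals already are, and anything else has to go through intern() first.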
You may read these articles for more related information:
[1] http://blog.jamesdbloom.com/JVMInternals.html
[4] http://cs-fundamentals.com/tech-interview/java/use-of-string-intern-method.php