A Simple Look at the String.substring() Method

程序员文章站 (Programmer Article Station) 2022-07-13 16:09:04

...

I had some time today to read the substring() method in the String class. A brief analysis follows:

/**
 * Returns a new string that is a substring of this string. The substring begins
 * with the character at the specified index and extends to the end of this string.
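 * In other words, it returns a new string that is a part of the original string,
 * starting at the character with the given index and running to the end of the string.
 * For example: "unhappy".substring(2) starts at index 2 and runs to the end -> "happy"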
 *
 **/ 
 public String substring(int beginIndex) {
	   return substring(beginIndex, count);
 }
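 Note: count is the number of characters in the String.
 So when only the start index is given, every character from that index onward (the character at that index included) becomes part of the new string.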
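The single-argument form can be checked with a small sketch. The class name `SubstringFromIndex` and the helper `tail` are illustrative only and do not come from the JDK source; the expected values follow from the Javadoc quoted above.

```java
public class SubstringFromIndex {
    // Hypothetical helper: returns everything from beginIndex to the end of s.
    public static String tail(String s, int beginIndex) {
        return s.substring(beginIndex);
    }

    public static void main(String[] args) {
        String s = "unhappy";
        System.out.println(tail(s, 2));                     // happy
        // Equivalent two-argument call: the end index is the string length.
        System.out.println(s.substring(2, s.length()));     // happy
        // beginIndex == length() is allowed and yields the empty string.
        System.out.println(tail(s, s.length()).isEmpty());  // true
    }
}
```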
Next, look at the String.substring(int beginIndex, int endIndex) method:
 /**
  * Returns a new string that is a substring of this string. The
  * substring begins at the specified <code>beginIndex</code> and
  * extends to the character at index <code>endIndex - 1</code>.
  * Thus the length of the substring is <code>endIndex-beginIndex</code>.
  * The description shows that the range is half-open, i.e. [a, b): the element at index a is included, the element at index b is not.
  * For example: "hamburger".substring(4, 8) returns "urge"
  **/
  public String substring(int beginIndex, int endIndex) {
	    if (beginIndex < 0) {
	        throw new StringIndexOutOfBoundsException(beginIndex);
	    }
	    if (endIndex > count) {
	        throw new StringIndexOutOfBoundsException(endIndex);
	    }
	    if (beginIndex > endIndex) {
	        throw new StringIndexOutOfBoundsException(endIndex - beginIndex);
	    }
	    return ((beginIndex == 0) && (endIndex == count)) ? this :
	    new String(offset + beginIndex, endIndex - beginIndex, value);
 }
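To make the half-open range and the three bounds checks concrete, here is a minimal sketch. The class `SubstringRange` and the helper `rejects` are made up for illustration; only `String.substring` itself is JDK API.

```java
public class SubstringRange {
    // Hypothetical helper: true if substring(begin, end) rejects the indices.
    public static boolean rejects(String s, int begin, int end) {
        try {
            s.substring(begin, end);
            return false;
        } catch (StringIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String s = "hamburger";                  // length 9
        String sub = s.substring(4, 8);          // covers indices 4, 5, 6, 7
        System.out.println(sub);                 // urge
        System.out.println(sub.length());        // 4, i.e. endIndex - beginIndex
        System.out.println(rejects(s, -1, 3));   // true: beginIndex < 0
        System.out.println(rejects(s, 0, 10));   // true: endIndex > length
        System.out.println(rejects(s, 5, 4));    // true: beginIndex > endIndex
        System.out.println(rejects(s, 9, 9));    // false: empty range at the end is valid
    }
}
```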
 The String class declares these fields near the top:
    /** The value is used for character storage. */
    private final char value[];
    /** The offset is the first index of the storage that is used. */
    private final int offset;
    /** The count is the number of characters in the String. */
    private final int count;
    /** Cache the hash code for the string */
    private int hash; // Default to 0
    
 // Package private constructor which shares value array for speed.
    String(int offset, int count, char value[]) {
	      this.value = value; // the stored characters (the array is shared, not copied)
	      this.offset = offset; // start position
	      this.count = count; // number of characters
    }
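 The code above shows that substring(int beginIndex, int endIndex) hands the new String the same value array that the original string uses. value holds the characters of the original string (a String is backed by a char array), and the substring is obtained simply by adjusting offset and count. No elements are moved and no new character array is allocated; only a new String object is created.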
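The sharing scheme can be imitated outside the JDK. The sketch below is not the JDK code: `SharedSlice` is a hypothetical class that only mirrors the value/offset/count layout described above, so the sharing can be observed on any Java version.

```java
// Hypothetical stand-in for the JDK 1.6 String layout; not a JDK class.
public class SharedSlice {
    private final char[] value; // shared storage, never copied
    private final int offset;   // first index of value that belongs to this slice
    private final int count;    // number of characters in this slice

    SharedSlice(char[] value, int offset, int count) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    // Same bounds checks as the JDK 1.6 method, then reuse the same array.
    SharedSlice substring(int beginIndex, int endIndex) {
        if (beginIndex < 0 || endIndex > count || beginIndex > endIndex) {
            throw new StringIndexOutOfBoundsException();
        }
        return new SharedSlice(value, offset + beginIndex, endIndex - beginIndex);
    }

    // True when both slices point at the very same array object.
    boolean sharesStorageWith(SharedSlice other) {
        return this.value == other.value;
    }

    @Override
    public String toString() {
        return new String(value, offset, count);
    }

    public static void main(String[] args) {
        SharedSlice whole = new SharedSlice("Happy".toCharArray(), 0, 5);
        SharedSlice part = whole.substring(1, 3);
        System.out.println(part);                           // ap
        System.out.println(part.sharesStorageWith(whole));  // true
    }
}
```

Because the array is shared, creating a slice costs the same no matter how long it is; the price is that a short slice keeps the whole original array reachable.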
 Debugger screenshots (not reproduced here): the original string, then the extracted substring.

This confirms what was said above: both strings use the same value array, and only offset and count differ.

 
 Note: the code used for the test:
 public class NewString {
    public static void main(String[] args) {
        String str = "Happy";
        String substr = str.substring(1,3);
        System.out.println(substr);
    }
 }

The point about the shared value array draws on the blog post below, which is worth a look if you are interested:

Blog: http://www.cnblogs.com/tianchi/archive/2012/11/14/2768851.html

Compiled from "NewString.java"
public class NewString extends java.lang.Object{
public NewString();
  Code:
   0:	aload_0
   1:	invokespecial	#1; //Method java/lang/Object."<init>":()V
   4:	return

public static void main(java.lang.String[]);
  Code:
   0:	ldc	#2; //String Happy
   2:	astore_1
   3:	aload_1
   4:	iconst_1
   5:	iconst_3
   6:	invokevirtual	#3; //Method java/lang/String.substring:(II)Ljava/lang/String;
   9:	astore_2
   10:	getstatic	#4; //Field java/lang/System.out:Ljava/io/PrintStream;
   13:	aload_2
   14:	invokevirtual	#5; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   17:	return
}

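The listing above is the result of disassembling the earlier code (for example with javap -c). The instructions are described below, for reference only:

ldc: pushes an int, float, or String constant from the constant pool onto the top of the stack.

astore index: pops the value on top of the stack (an objectref) and stores it in the local variable at position index of the current frame's local variable array.

aload index: pushes the reference-type local variable at position index of the current frame's local variable array onto the stack.

iconst_1: pushes the int constant 1 onto the stack.

iconst_3: pushes the int constant 3 onto the stack.

invokevirtual: invokes an instance method, here substring().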

...

###########################################################################

Everything above was compiled and run on JDK 1.6. The JDK 1.7 source differs, however; its implementation follows:

public String substring(int beginIndex) {
        if (beginIndex < 0) {
            throw new StringIndexOutOfBoundsException(beginIndex);
        }
        // the length is the source array length minus the begin index
        int subLen = value.length - beginIndex;
        if (subLen < 0) {
            throw new StringIndexOutOfBoundsException(subLen);
        }
        return (beginIndex == 0) ? this : new String(value, beginIndex, subLen);
    }
// It no longer delegates to substring(int, int); it calls new String(...) directly

public String substring(int beginIndex, int endIndex) {
        if (beginIndex < 0) {
            throw new StringIndexOutOfBoundsException(beginIndex);
        }
        if (endIndex > value.length) {
            throw new StringIndexOutOfBoundsException(endIndex);
        }
        // the length is the (exclusive) end index minus the begin index
        int subLen = endIndex - beginIndex;
        if (subLen < 0) {
            throw new StringIndexOutOfBoundsException(subLen);
        }
        return ((beginIndex == 0) && (endIndex == value.length)) ? this
                : new String(value, beginIndex, subLen);
    }

The checks are essentially unchanged; the key difference is the final call to new String(value, beginIndex, subLen):

    /**
     * Allocates a new {@code String} that contains characters from a subarray
     * of the character array argument. The {@code offset} argument is the
     * index of the first character of the subarray and the {@code count}
     * argument specifies the length of the subarray. The contents of the
     * subarray are copied; subsequent modification of the character array does
     * not affect the newly created string.
     *
     * @param  value Array that is the source of characters
     * @param  offset The initial offset
     * @param  count The length
     * @throws  IndexOutOfBoundsException
     *          If the {@code offset} and {@code count} arguments index
     *          characters outside the bounds of the {@code value} array
     */
    public String(char value[], int offset, int count) {
        if (offset < 0) {
            throw new StringIndexOutOfBoundsException(offset);
        }
        if (count < 0) {
            throw new StringIndexOutOfBoundsException(count);
        }
        // Note: offset or count might be near -1>>>1.
        if (offset > value.length - count) {
            throw new StringIndexOutOfBoundsException(offset + count);
        }
        this.value = Arrays.copyOfRange(value, offset, offset+count);
    }
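The Javadoc's statement that "the contents of the subarray are copied" can be observed through the public constructor. This is a small sketch; the class name `CopyingConstructorDemo` and the method `copyThenMutate` are illustrative only.

```java
public class CopyingConstructorDemo {
    // Builds a String from part of an array, then changes the array afterwards.
    public static String copyThenMutate() {
        char[] source = {'H', 'a', 'p', 'p', 'y'};
        String s = new String(source, 1, 2); // copies the two chars at indices 1 and 2
        source[1] = 'X';                     // later change to the source array
        return s;                            // unaffected, because the chars were copied
    }

    public static void main(String[] args) {
        System.out.println(copyThenMutate()); // ap
    }
}
```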

Next, look at the Arrays.copyOfRange() method:

public static char[] copyOfRange(char[] original, int from, int to) {
    int newLength = to - from;
    if (newLength < 0)
        throw new IllegalArgumentException(from + " > " + to);
    char[] copy = new char[newLength];
    System.arraycopy(original, from, copy, 0, Math.min(original.length - from, newLength));
    return copy;
}
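A short sketch of what this method does in practice follows. The class `CopyOfRangeDemo` and the helper `slice` are hypothetical names; `Arrays.copyOfRange` is the real JDK call. It also shows why the source uses Math.min: when `to` runs past the end of the original, only the available characters are copied and the rest of the new array keeps its default '\u0000' values.

```java
import java.util.Arrays;

public class CopyOfRangeDemo {
    // Hypothetical helper: thin wrapper so the copy can be checked in isolation.
    public static char[] slice(char[] original, int from, int to) {
        return Arrays.copyOfRange(original, from, to);
    }

    public static void main(String[] args) {
        char[] original = "Happy".toCharArray();
        char[] copy = slice(original, 1, 3);
        System.out.println(new String(copy));     // ap
        System.out.println(copy == original);     // false: a separate array
        original[1] = 'X';
        System.out.println(new String(copy));     // ap, the copy is unaffected

        // 'to' may exceed original.length; the tail is left as '\u0000'.
        char[] padded = slice(original, 3, 7);
        System.out.println(padded.length);        // 4
        System.out.println(padded[2] == '\u0000'); // true
    }
}
```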

Inspecting the two value arrays in the debugger shows that they are no longer the same array:
