分析Java中ArrayList与LinkedList列表结构的源码

程序员文章站 2024-03-11 12:42:07

一、arraylist源码分析（jdk7） arraylist内部维护了一个动态的object数组，arraylist的动态增删就是对这个对组的动态的增加和删除。 1、...

一、arraylist源码分析（jdk7）

arraylist内部维护了一个动态的object数组，arraylist的动态增删就是对这个对组的动态的增加和删除。

1、arraylist构造以及初始化

arraylist实例变量
//arraylist默认容量
private static final int default_capacity = 10;
//默认空的object数组， 用于定义空的arraylist
private static final object[] empty_elementdata = {};
//arraylist存放存放元素的object数组
private transient object[] elementdata;
//arraylist中元素的数量
private int size;

arraylist构造函数：

无参构造函数: 即构造一个空的object[]

public arraylist() {
  super();
  this.elementdata = empty_elementdata;
}

指定容量大小构造：

public arraylist(int initialcapacity) {
  super();
  if (initialcapacity < 0)
    throw new illegalargumentexception("illegal capacity: "+
                      initialcapacity);
  this.elementdata = new object[initialcapacity];
}

指定某一实现collection接口的集合构造：

public arraylist(collection<? extends e> c) {
  elementdata = c.toarray();
  size = elementdata.length;
  // c.toarray might (incorrectly) not return object[] (see 6260652)
  if (elementdata.getclass() != object[].class)
    elementdata = arrays.copyof(elementdata, size, object[].class);
}

这里也说明了collection的作用， java-collection-framwork设计collection接口而不是直接使用list，set等接口的原因。

2、arraylist的容量分配机制

arraylist的容量上限： arraylist容量是有上限的，理论允许分配integer.max_value - 8大小的容量。但是能分配多少还跟堆栈设置有关，需要设置vm参数

private static final int max_array_size = integer.max_value - 8;

调用add方法时扩容规则

public boolean add(e e) {
    ensurecapacityinternal(size + 1); // increments modcount!!
    elementdata[size++] = e;
    return true;
  }

ensurecapacityinternal(int)方法实际上确定一个最小扩容大小。

private void ensurecapacityinternal(int mincapacity) {
    if (elementdata == empty_elementdata) {
      mincapacity = math.max(default_capacity, mincapacity);
    }
    ensureexplicitcapacity(mincapacity);
  }
private void ensureexplicitcapacity(int mincapacity) {
    modcount++;
    // overflow-conscious code
    if (mincapacity - elementdata.length > 0)
      grow(mincapacity);
  }

关于modcount： modcount定义在抽象类abstratlist中，源码的注释基本说明了它的用处:在使用迭代器遍历的时候，用来检查列表中的元素是否发生结构性变化（列表元素数量发生改变的一个计数）了，主要在多线程环境下需要使用，防止一个线程正在迭代遍历，另一个线程修改了这个列表的结构。
grow方法为真正的扩容方法

private void grow(int mincapacity) {
    // overflow-conscious code
    int oldcapacity = elementdata.length;
    int newcapacity = oldcapacity + (oldcapacity >> 1);
    if (newcapacity - mincapacity < 0)
      newcapacity = mincapacity;
    if (newcapacity - max_array_size > 0)
      newcapacity = hugecapacity(mincapacity);
    // mincapacity is usually close to size, so this is a win:
    elementdata = arrays.copyof(elementdata, newcapacity);
  }

其中对大容量扩容多少还有个hugecapacity方法

private static int hugecapacity(int mincapacity) {
    if (mincapacity < 0) // overflow
      throw new outofmemoryerror();
    return (mincapacity > max_array_size) ?
      integer.max_value :
      max_array_size;
  }

总结：
每次扩容都会伴随着数组的复制操作，因此一次给定恰当的容量会提高一点性能。
下图是我归纳的整个扩容流程：

分析Java中ArrayList与LinkedList列表结构的源码

3.arraylist迭代器

arraylist的迭代器主要有两种itr和listitr, 但是在jdk1.8中还添加了一个arraylistspliterator, 下面分别学一下itr和listitr的源码分析。

（1）itr:只能向后遍历

private class itr implements iterator<e> {
    int cursor;    // index of next element to return
    int lastret = -1; // index of last element returned; -1 if no such
    //expectedmodcount 是modcount的一个副本
    int expectedmodcount = modcount;
    public boolean hasnext() {
      return cursor != size;
    }
    @suppresswarnings("unchecked")
    public e next() {
      checkforcomodification();
      //记录当前位置
      int i = cursor;
      if (i >= size)
        throw new nosuchelementexception();
      object[] elementdata = arraylist.this.elementdata;
      if (i >= elementdata.length)
        throw new concurrentmodificationexception();
      //下一个元素的位置
      cursor = i + 1;
      return (e) elementdata[lastret = i];
    }
    //使用迭代器的remove方法
    public void remove() {
      if (lastret < 0)
        throw new illegalstateexception();
      checkforcomodification();
      try {
        //注意内部类调用外部类的方式
        arraylist.this.remove(lastret);
        //remove之后需要重新调整各个指针的位置
        cursor = lastret;
        lastret = -1;
        expectedmodcount = modcount;
      } catch (indexoutofboundsexception ex) {
        throw new concurrentmodificationexception();
      }
    }
    final void checkforcomodification() {
      if (modcount != expectedmodcount)
        throw new concurrentmodificationexception();
    }
  }

从源码中可以看出itr迭代器是向前迭代器，它提供了一个next方法用于获取arraylist中的元素。
checkforcomodification是java-collection-framwork中的一种fail-fast的错误检测机制。在多线程环境下对同一个集合操作，就可能触发fail-fast机制，抛出concurrentmodificationexception异常。

itr迭代器定义了一个expectedmodcount记录modcount副本。在arraylist执行改变结构的操作的时候例如add， remove， clear方法时modcount的值会改变。

通过itr源码可以看出调用next和remove方法会触发fail-fast检查。此时如果在遍历该集合时，存在其他线程正在执行改变该集合结构的操作时就会发生异常。

（2）listitr：支持向前和向后遍历，下面看看listitr的源码：

private class listitr extends itr implements listiterator<e> {
    listitr(int index) {
      super();
      cursor = index;
    }
    public boolean hasprevious() {
      return cursor != 0;
    }
    public int nextindex() {
      return cursor;
    }
    public int previousindex() {
      return cursor - 1;
    }
    @suppresswarnings("unchecked")
    public e previous() {
      checkforcomodification();
      //arraylist前一个元素的位置
      int i = cursor - 1;
      if (i < 0)
        throw new nosuchelementexception();
      object[] elementdata = arraylist.this.elementdata;
      if (i >= elementdata.length)
        throw new concurrentmodificationexception();
      cursor = i;
      return (e) elementdata[lastret = i];
    }
    //该迭代器中添加了set方法
    public void set(e e) {
      if (lastret < 0)
        throw new illegalstateexception();
      checkforcomodification();
      try {
        arraylist.this.set(lastret, e);
      } catch (indexoutofboundsexception ex) {
        throw new concurrentmodificationexception();
      }
    }
    //该迭代器添加了add方法
    public void add(e e) {
      checkforcomodification();
      try {
        int i = cursor;
        arraylist.this.add(i, e);
        //重新标记指针位置
        cursor = i + 1;
        lastret = -1;
        expectedmodcount = modcount;
      } catch (indexoutofboundsexception ex) {
        throw new concurrentmodificationexception();
      }
    }
  }

listitr的实现基本与itr一致，添加了可以先前遍历的方法以及add与set方法。

（3）使用java.util.concurrent中的copyonwritearraylist解决fast-fail问题

copyonwritearraylist是线程安全的，具体看一下它的add方法源码：

public boolean add(e e) {
    final reentrantlock lock = this.lock;
    lock.lock();
    try {
      object[] elements = getarray();
      int len = elements.length;
      object[] newelements = arrays.copyof(elements, len + 1);
      newelements[len] = e;
      setarray(newelements);
      return true;
    } finally {
      lock.unlock();
    }
  }

copyonwritearraylist就是写时复制的arraylist。当开始写数据的操作时候， arrays.copyof一个新的数组，这样不会影响读操作。
这样的代价就是会损耗内存，带来性能的问题。copyonwritearraylist写的时候在内存中生成一个副本对象，同时原来的对象仍然存在。
copyonwritearraylist无法保证数据实时的一致，只能保证结果的一致。适用于并发下读多写少得场景，例如缓存。

（4）arraylist的其他方法源码：

一个私有方法batchremove(collection<?>c, boolean complement)，即批量移除操作

private boolean batchremove(collection<?> c, boolean complement) {
    //下面会提到使用final的原因
    final object[] elementdata = this.elementdata;
    int r = 0, w = 0;
    boolean modified = false;
    try {
      //遍历list中的元素，进行验证
      for (; r < size; r++)
        if (c.contains(elementdata[r]) == complement)
          elementdata[w++] = elementdata[r];
    } finally {
      //try中如果出现异常，则保证数据一致性执行下面的copy操作
      if (r != size) {
        system.arraycopy(elementdata, r,
                 elementdata, w,
                 size - r);
        w += size - r;
      }
      //清理无用的元素， 通知gc回收
      if (w != size) {
        // clear to let gc do its work
        for (int i = w; i < size; i++)
          elementdata[i] = null;
        modcount += size - w;
        size = w;
        modified = true;
      }
    }
    return modified;
  }

final修饰的变量指的是同一个引用，为了后面保持数据的一致性。
此方法，想保留collection c中的元素时， complement值为true；想移除c中的元素时， complement值为false。这样就分别变成了retainall和removeall方法。

swap：交换arraylist中的某两个位置的

二、linkedlist源码分析（jdk7）

linkedlist即链表，相对于顺序表，链表存储数据不需要使用地址连续的内存单元。减少了修改容器结构而带来的移动元素的问题，顺序访问相对高效。

1、结点（node）的定义

jdk中的linkedlist是双向链表，每个结点分别存有上一个结点和下一个结点的信息。它的定义如下：

private static class node<e> {
  e item;
  node<e> next;
  node<e> prev;
  node<e> (node<e> prev, e element, node<e> next) {
    this.item = element;
    this.next = next;
    this.prev = prev;
  }
}

2、linkedlist构造以及初始化

成员： linkedlist中维护了3个成员变量，用以记录链表中结点的个数，结点的前驱以及后继

transient int size = 0;
transient node<e> first;
transient node<e> last;

构造函数：默认构造函数即构造一个空的linkedlist

public linkedlist() {}

或者根据其他容器进行构造，后面我们会自己写一个构造一个有序的链表

public linkedlist(collection<? extends e> c) {
    this();
    addall(c);
}

这里给出一点补充，关于泛型修饰符? super t 与 ? extends t的区别，参见这篇文章泛型中? super t和? extends t的区别

3、linkedlist的结构操作

头插法：即在链表头插入一个元素

private void linkfirst(e e) {
  final node<e> f = first;
  final node<e> newnode = new node<>(null, e, f);
  first = newnode;
  //判断是否是空链表
  if (f == null)
    last = newnode;
  else
    f.prev = newnode;
  size++;
  modcount++;
  }

尾插法：即在链表尾部插入一个元素

  void linklast(e e) {
    final node<e> l = last;
    final node<e> newnode = new node<>(l, e, null);
    last = newnode;
    if (l == null)
      first = newnode;
    else
      l.next = newnode;
    size++;
    modcount++;
  }

插入到当前结点之前：找当前结点的前驱

void linkbefore(e e, node<e> succ) {
    //确定当然结点非空
    final node<e> pred = succ.prev;
    final node<e> newnode = new node<>(pred, e, succ);
    succ.prev = newnode;
    //判断当前结点是否是第一个结点
    if (pred == null)
      first = newnode;
    else
      pred.next = newnode;
    size++;
    modcount++;
  }

头删法：删除链表的第一个结点

  private e unlinkfirst(node<e> f) {
    // assert f == first && f != null;
    final e element = f.item;
    final node<e> next = f.next;
    f.item = null;
    f.next = null; // help gc
    first = next;
    if (next == null)
      last = null;
    else
      next.prev = null;
    size--;
    modcount++;
    return element;
  }

尾删法：删除链表的最后一个结点

  private e unlinklast(node<e> l) {
    //保证l==last 并且l != null
    final e element = l.item;
    final node<e> prev = l.prev;
    l.item = null;
    l.prev = null; // help gc
    last = prev;
    if (prev == null)
      first = null;
    else
      prev.next = null;
    size--;
    modcount++;
    return element;
  }

4、保持list接口与deque的一致性

list接口允许使用下标来实现对容器的随机访问，对于数组这种实现随机访问是很容易的。对于链表，jdk也从逻辑上利用链表中结点的计数给出了随机访问的实现

node<e> node(int index) {
    //确保index的正确性
    if (index < (size >> 1)) {
      node<e> x = first;
      for (int i = 0; i < index; i++)
        x = x.next;
      return x;
    } else {
      node<e> x = last;
      for (int i = size - 1; i > index; i--)
        x = x.prev;
      return x;
    }
  }

index 属于前半部分的计数，从头遍历查找。index属于后半部分的计数，从末尾遍历查找。充分利用双向链表的特点。
因此，add(int index, t t), get(int), set(int)等方法就可以很容易的实现。

linkedlist实现了deque接口，也就是linkedlist实现了双端队列容器的方法，下面给出一些api的总结。

5、linkedlist的遍历

既然linkedlist是双向链表，自然就可以前后遍历。与arraylist同样，涉及到多线程操作容器的时候linkedlist也会出现fail-fast问题。
对于fail-fast问题上篇已经讲解过，这里就不说了。

关于迭代器，linkedlist有listiterator双向迭代器，和descendingiterator逆序迭代器。都很简单。源码不在分析

如果遍历元素的话，随机访问的代价是比较大得。

三、linkedlist，arraylist， vector总结

1、linkedlist与arraylist

arraylist是实现了基于动态数组的数据结构，linkedlist基于链表的数据结构。

对于随机访问get和set，arraylist觉得优于linkedlist，因为linkedlist要移动指针。

对于新增和删除操作add和remove，linedlist比较占优势，因为arraylist要移动数据。这一点要看实际情况的。若只对单条数据插入或删除，arraylist的速度反而优于linkedlist。但若是批量随机的插入删除数据，linkedlist的速度大大优于arraylist. 因为arraylist每插入一条数据，要移动插入点及之后的所有数据。

2、arraylist与vector

vector是线程同步的，所以它也是线程安全的，而arraylist是线程异步的，是不安全的。如果不考虑到线程的安全因素，一般用arraylist效率比较高。

如果集合中的元素的数目大于目前集合数组的长度时，vector增长率为目前数组长度的100%,而arraylist增长率为目前数组长度的50%.如过在集合中使用数据量比较大的数据，用vector有一定的优势。

如果查找一个指定位置的数据，vector和arraylist使用的时间是相同的，都是0(1),这个时候使用vector和arraylist都可以。而如果移动一个指定位置的数据花费的时间为0(n-i)n为总长度，这个时候就应该考虑到使用linklist,因为它移动一个指定位置的数据所花费的时间为0(1),而查询一个指定位置的数据时花费的时间为0(i)。

arraylist 和vector是采用数组方式存储数据，此数组元素数大于实际存储的数据以便增加和插入元素，都允许直接序号索引元素，但是插入数据要设计到数组元素移动等内存操作，所以索引数据快插入数据慢，vector由于使用了synchronized方法（线程安全）所以性能上比arraylist要差，linkedlist使用双向链表实现存储，按序号索引数据需要进行向前或向后遍历，但是插入数据时只需要记录本项的前后项即可，所以插入数度较快！

上一篇： Django 中 cookie的使用

下一篇：简单谈谈JVM、JRE和JDK的区别与联系

分析Java中ArrayList与LinkedList列表结构的源码

分析Java中ArrayList与LinkedList列表结构的源码

Java中ArrayList类的用法与源码完全解析

Java中ArrayList和LinkedList的遍历与性能分析

源码阅读(3)：Java中主要的List结构——ArrayList集合（上）

源码阅读(6)：Java中主要的List、Deque结构——LinkedList集合（上）

Java中关于枚举单例模式与源码分析的示例

java中对比ArrayList与LinkedList的图文详情

Java中关于枚举单例模式与源码分析的示例

java中对比ArrayList与LinkedList的图文详情

Java数据结构中栈与队列的分析