欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页

浅谈从fail-fast机制到CopyOnWriteArrayList使用

程序员文章站 2022-06-11 15:42:23
...

【1】Fail-Fast机制

在ArrayList、HashMap和HashSet等集合的Javadoc中,你会看到类似如**释:

  • 图片来源于ArrayList的javadoc
    浅谈从fail-fast机制到CopyOnWriteArrayList使用

第一段翻译如下:

本(ArrayList)类的iterator()或者listIterator(int)方法返回的迭代器是“快速失败”的。即在迭代器被创建后,除了使用迭代器自身的方法(如ListIterator#remove()或ListIterator#add(Object))之外,其他任何导致list结构被修改的方法执行,迭代器都会抛出ConcurrentModificationException异常。因此,在面临并发修改时,迭代器会快速而干净地失败,而不会在未来不确定的时间冒任意的、非确定性行为的风险。

第二段翻译如下:

注意,迭代器的快速失败行为不能得到保证,一般来说,存在非同步的并发修改时,不可能作出任何坚决的保证。快速失败迭代器尽最大努力抛出 ConcurrentModificationException。因此,编写依赖于此异常的程序的做法是错误的,正确做法是:迭代器的快速失败行为应该仅用于检测程序错误。


第一段简要说明了何为“快速失败”机制,总结如下:

它是Java集合的一种错误检测机制,当多个线程对集合进行结构上的改变的操作时,有可能会产生fail-fast机制。记住是有可能,而不是一定。例如:假设存在两个线程(线程1、线程2),线程1通过Iterator在遍历集合A中的元素,在某个时候线程2修改了集合A的结构(是结构上面的修改,而不是简单的修改集合元素的内容),那么这个时候程序就会抛出 ConcurrentModificationException 异常,从而产生fail-fast机制。

示例如下:

public class TestFailFast {
    public static void main(String[] args){

        FailFast failFast = new FailFast();
        for (int i = 0; i < 10; i++) {
            new Thread(failFast).start();
        }

    }
}

class FailFast implements  Runnable{

    private List list=new ArrayList();

    {
        for (int i=0;i<100;i++){
            list.add(i);
        }
    }

    @Override
    public void run() {
        ListIterator iterator = list.listIterator();
        while (iterator.hasNext()){
            System.out.println(iterator.next());
        }
        list.add("AA");
    }
}

测试结果如下:

浅谈从fail-fast机制到CopyOnWriteArrayList使用


【2】为什么会抛出ConcurrentModificationException

有三个层面,第一是该异常是什么,第二是什么原因导致了该异常,第三是如何避免该异常。

① 该异常是什么

从字面意思理解,即为并发修改异常,直接看源码。

/**
//如果方法发现正在并发修改某个对象,而这些修改不被允许,方法就会抛出该异常。
 * This exception may be thrown by methods that have detected concurrent
 * modification of an object when such modification is not permissible.
 * <p>
 * //例如,一般不允许一个线程修改集合的同时另一个线程遍历。
 * For example, it is not generally permissible for one thread to modify a Collection
 * while another thread is iterating over it.  
 * 
 * In general, the results of the iteration are undefined under these circumstances.  
 * //一般情况下,迭代的结果在这些情况下是不可预知的。
 * 
 * Some Iteratorimplementations (including those of all the general purpose collection implementations provided by the JRE) may choose to throw this exception if this behavior isdetected.  
 * 如果这种行为被发现,一些迭代器实现产品(包括JRE提供的所有通用集合实现的那些迭代器)可能会选择抛出这个异常.
 * 
 * Iterators that do this are known as <i>fail-fast</i> iterators,
 * as they fail quickly and cleanly, rather that risking arbitrary,
 * non-deterministic behavior at an undetermined time in the future.
 * <p>
 * 
 * Note that this exception does not always indicate that an object has
 * been concurrently modified by a <i>different</i> thread. 
 * 注意,该线程并不永远表明是一个对象在不同线程被并发修改才会出现。
 * 
 *  If a single thread issues a sequence of method invocations that violates the contract of an object, the object may throw this exception.  
 * // 如果单线程进行了一系列违法对象协议的方法调用,这个对象也可能抛出该异常。
 * For example, if a thread modifies a collection directly while it is iterating over the collection with a fail-fast iterator, the iterator will throw this exception.
 * // 例如,一个线程它使用了一个快速失败迭代器对集合进行迭代同时修改集合,
 * //这个迭代器将会抛出该异常。
 *
 //参考博文【1】中第二段的翻译
 * <p>Note that fail-fast behavior cannot be guaranteed as it is, generally
 * speaking, impossible to make any hard guarantees in the presence of
 * unsynchronized concurrent modification.  Fail-fast operations
 * throw {@code ConcurrentModificationException} on a best-effort basis.
 * Therefore, it would be wrong to write a program that depended on this
 * exception for its correctness: <i>{@code ConcurrentModificationException}
 * should be used only to detect bugs.</i>
 *
 // 下面这些集合都可能抛出该异常
 * @author  Josh Bloch
 * @see     Collection
 * @see     Iterator
 * @see     Spliterator
 * @see     ListIterator
 * @see     Vector
 * @see     LinkedList
 * @see     HashSet
 * @see     Hashtable
 * @see     TreeMap
 * @see     AbstractList
 * @since   1.2
 */
public class ConcurrentModificationException extends RuntimeException {
    private static final long serialVersionUID = -3666751008965953603L;

    /**
     * Constructs a ConcurrentModificationException with no
     * detail message.
     */
    public ConcurrentModificationException() {
    }

② 什么原因导致了该异常

查看异常信息,跟踪源码:

 final void checkForComodification() {
    if (modCount != expectedModCount)
         throw new ConcurrentModificationException();
 }

如果modCount !=expectedModCount就会抛出该异常。

那么什么时候执行checkForComodification方法,modCount 和expectedModCount又分别是什么?

首先看ListItr源码:

private class ListItr extends Itr implements ListIterator<E> {
        ListItr(int index) {
            super();
            cursor = index;
        }

        public boolean hasPrevious() {
            return cursor != 0;
        }

        public int nextIndex() {
            return cursor;
        }

        public int previousIndex() {
            return cursor - 1;
        }

        @SuppressWarnings("unchecked")
        public E previous() {
            checkForComodification();
            int i = cursor - 1;
            if (i < 0)
                throw new NoSuchElementException();
            Object[] elementData = ArrayList.this.elementData;
            if (i >= elementData.length)
                throw new ConcurrentModificationException();
            cursor = i;
            return (E) elementData[lastRet = i];
        }

        public void set(E e) {
            if (lastRet < 0)
                throw new IllegalStateException();
            checkForComodification();

            try {
                ArrayList.this.set(lastRet, e);
            } catch (IndexOutOfBoundsException ex) {
                throw new ConcurrentModificationException();
            }
        }

        public void add(E e) {
            checkForComodification();

            try {
                int i = cursor;
                ArrayList.this.add(i, e);
                cursor = i + 1;
                lastRet = -1;
                expectedModCount = modCount;
            } catch (IndexOutOfBoundsException ex) {
                throw new ConcurrentModificationException();
            }
        }
    }

其继承于Itr,查看Itr源码:

 private class Itr implements Iterator<E> {
        int cursor;       // index of next element to return
        int lastRet = -1; // index of last element returned; -1 if no such
        int expectedModCount = modCount;

        public boolean hasNext() {
            return cursor != size;
        }

        @SuppressWarnings("unchecked")
        public E next() {
            checkForComodification();
            int i = cursor;
            if (i >= size)
                throw new NoSuchElementException();
            Object[] elementData = ArrayList.this.elementData;
            if (i >= elementData.length)
                throw new ConcurrentModificationException();
            cursor = i + 1;
            return (E) elementData[lastRet = i];
        }

        public void remove() {
            if (lastRet < 0)
                throw new IllegalStateException();
            checkForComodification();

            try {
                ArrayList.this.remove(lastRet);
                cursor = lastRet;
                lastRet = -1;
                expectedModCount = modCount;
            } catch (IndexOutOfBoundsException ex) {
                throw new ConcurrentModificationException();
            }
        }

        @Override
        @SuppressWarnings("unchecked")
        public void forEachRemaining(Consumer<? super E> consumer) {
            Objects.requireNonNull(consumer);
            final int size = ArrayList.this.size;
            int i = cursor;
            if (i >= size) {
                return;
            }
            final Object[] elementData = ArrayList.this.elementData;
            if (i >= elementData.length) {
                throw new ConcurrentModificationException();
            }
            while (i != size && modCount == expectedModCount) {
                consumer.accept((E) elementData[i++]);
            }
            // update once at end of iteration to reduce heap write traffic
            cursor = i;
            lastRet = i - 1;
            checkForComodification();
        }

        final void checkForComodification() {
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
        }
    }

可以看到,迭代器在执行next(),remove()方法时,都会执行checkForComodification()方法进行检测。

那么modCount 和expectedModCount 在哪里定义,又在哪里被修改?

如下所示expectedModCount 在Itr中被定义,初始值为modCount:

 private class Itr implements Iterator<E> {
        int cursor;       // index of next element to return
        int lastRet = -1; // index of last element returned; -1 if no such
        int expectedModCount = modCount;

modCount在AbstractList类中被定义,用来记录集合结构修改的次数:

protected transient int modCount = 0;

那么modCount 值什么时候被修改?

查看ArrayList源码:

// 首先看add
public boolean add(E e) {
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    elementData[size++] = e;
    return true;
}

private void ensureCapacityInternal(int minCapacity) {
    if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
        minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
    }

    ensureExplicitCapacity(minCapacity);
}

private void ensureExplicitCapacity(int minCapacity) {
    // 这里将modCount++!!!
   modCount++;

   // overflow-conscious code
   if (minCapacity - elementData.length > 0)
       grow(minCapacity);
}

//其次看remove
public boolean remove(Object o) {
  if (o == null) {
        for (int index = 0; index < size; index++)
            if (elementData[index] == null) {
                fastRemove(index);
                return true;
            }
    } else {
        for (int index = 0; index < size; index++)
            if (o.equals(elementData[index])) {
                fastRemove(index);
                return true;
            }
    }
    return false;
}
private void fastRemove(int index) {
 // 这里modCount++!!!
  modCount++;
   int numMoved = size - index - 1;
   if (numMoved > 0)
       System.arraycopy(elementData, index+1, elementData, index,
                        numMoved);
   elementData[--size] = null; // clear to let GC do its work
}

// 最后看下clear
public void clear() {
    //不再重复。。
   modCount++;

    // clear to let GC do its work
    for (int i = 0; i < size; i++)
        elementData[i] = null;

    size = 0;
}

从上面的源代码我们可以看出,ArrayList中无论add、remove、clear方法只要是涉及了改变ArrayList元素的个数的方法都会导致modCount的改变。但是所以我们这里可以初步判断由于expectedModCount 的值与modCount的改变不同步,导致两者之间不等从而产生fail-fast机制。如果expectedModCount 和modCount修改为原子操作,同步修改,将不会抛出异常。

知道产生fail-fast产生的根本原因了,我们可以有如下场景:

有两个线程(线程A,线程B),其中线程A负责遍历list、线程B修改list。线程A在遍历list过程的某个时候(此时expectedModCount = modCount=N),线程启动,同时线程B增加一个元素,这是modCount的值发生改变(modCount + 1 = N + 1)。线程A继续遍历执行next方法时,调用checkForComodification方法发现expectedModCount = N ,而modCount = N + 1,两者不等,这时就抛出ConcurrentModificationException 异常,从而产生fail-fast机制。


③ 如何避免该异常

第一种,使用局部变量(例子代码使用的是成员变量),同时在单个线程操作集合的时候不要在遍历的同时做可能导致modCount 修改的操作。

第二种,屡试不爽的锁,如修改代码如下:

 @Override
 public void run() {
      synchronized (list){
          ListIterator iterator = list.listIterator();
          System.out.println(iterator.getClass());
          while (iterator.hasNext()){
              System.out.println(iterator.next());
          }
          list.add("AA");
      }
  }

很显然,锁会阻塞其他线程,会导致性能降低。


第三种,使用CopyOnWriteArrayList替代ArrayList。


【3】CopyOnWriteArrayList

Java 5.0 在java.util.concurrent 包中提供了多种并发容器类来改进同步容器的性能。

如ConcurrentHashMap 同步容器类是Java 5 增加的一个线程安全的哈希表。对与多线程的操作,介于HashMap 与Hashtable 之间。内部采用“锁分段”机制替Hashtable 的独占锁。进而提高性能。

  • ConcurrentHashMap
  • ConcurrentSkipListMap
  • ConcurrentSkipListSet
  • CopyOnWriteArrayList
  • CopyOnWriteArraySet。

当期望许多线程访问一个给定collection 时,ConcurrentHashMap 通常优于同步的HashMap,ConcurrentSkipListMap 通常优于同步的TreeMap。当期望的读数和遍历远远大于列表的更新数时,CopyOnWriteArrayList 优于同步的ArrayList。

修改代码如下:

public class TestFailFast {
    public static void main(String[] args){

        FailFast failFast = new FailFast();
        for (int i = 0; i < 100; i++) {
            new Thread(failFast).start();
        }

    }
}

class FailFast implements  Runnable{

//    private List list=new ArrayList();
    private CopyOnWriteArrayList list=new CopyOnWriteArrayList ();


    {
        for (int i=0;i<100;i++){
            list.add(i);
        }
    }

    @Override
    public void run() {
        ListIterator iterator = list.listIterator();
        System.out.println(iterator.getClass());
        while (iterator.hasNext()){
            System.out.println(iterator.next());
        }
        list.add("AA");
    }
}

此时再多次进行测试,不会出现ConcurrentModificationException异常。

那么为什么使用CopyOnWriteArrayList 就不会出现该异常?

看一下CopyOnWriteArrayList的javadoc:

 A thread-safe variant of {@link java.util.ArrayList} in which all mutative
 * operations ({@code add}, {@code set}, and so on) are implemented by
 * making a fresh copy of the underlying array.
 *//ArrayList的线程安全变体,其所有变化的操作如add/set等等
 //都是通过在底层数组的最新副本上面实现的。

 * <p>This is ordinarily too costly, but may be <em>more</em> efficient
 * than alternatives when traversal operations vastly outnumber
 * mutations, and is useful when you cannot or don't want to
 * synchronize traversals, yet need to preclude interference among
 * concurrent threads.  
 * //这通常是太贵(耗费更多资源),但是它在遍历集合大于集合增删数据时会更有效;
 * //同时当你不能或者不想同步遍历,但又需要排除并发线程的处突是,这种方式也是有效的。
 * 
 * The "snapshot" style iterator method uses a
 * reference to the state of the array at the point that the iterator was created. 
 * //“快照”方式迭代器方法在创建迭代器时使用了一个对数组状态的引用。
 * 
 * This array never changes during the lifetime of the
 * iterator, so interference is impossible and the iterator is
 * guaranteed not to throw {@code ConcurrentModificationException}.
 * //此数组在迭代器的生存期内永远不会更改,因此其他线程冲突是不可能的,
 * //并且迭代器保证不会抛出{@代码ConcurrentModificationException}。
 * 
 * The iterator will not reflect additions, removals, or changes to
 * the list since the iterator was created. 
 * //迭代器自创建迭代器以来不会反映对列表的添加、移除或更改。
 * //也就是说,如果在迭代器遍历期间,集合数据改变了,
 * //迭代器仍旧遍历原先拿到的集合--迭代器的集合数据不能时刻保证最新状态!!!

 *  Element-changing operations on iterators themselves ({@code remove}, {@code set}, and
 * {@code add}) are not supported. These methods throw
 * {@code UnsupportedOperationException}.

通过以上可知,CopyOnWriterArrayList的无论是从数据结构、定义都和ArrayList一样。它和ArrayList一样,同样是实现List接口,底层使用数组实现。在方法上也包含add、remove、clear、iterator等方法。

其所使用的COWIterator根本不会抛出ConcurrentModificationException,但是需要注意,如果在迭代器遍历期间,集合数据被修改,迭代器是不知道的!!

COWIterator其源码如下:

 static final class COWIterator<E> implements ListIterator<E> {
        /** Snapshot of the array */
        private final Object[] snapshot;
        /** Index of element to be returned by subsequent call to next.  */
        private int cursor;

        private COWIterator(Object[] elements, int initialCursor) {
            cursor = initialCursor;
            snapshot = elements;
        }

        public boolean hasNext() {
            return cursor < snapshot.length;
        }

        public boolean hasPrevious() {
            return cursor > 0;
        }

        @SuppressWarnings("unchecked")
        public E next() {
            if (! hasNext())
                throw new NoSuchElementException();
            return (E) snapshot[cursor++];
        }
//...
}

CopyOnWriterArrayList的方法根本就没有像ArrayList中使用checkForComodification方法来判断expectedModCount 与 modCount 是否相等。

它为什么会这么做,凭什么可以这么做呢?我们以add方法为例:

public class CopyOnWriteArrayList<E>
    implements List<E>, RandomAccess, Cloneable, java.io.Serializable {
    private static final long serialVersionUID = 8673264195747942595L;

    /** The lock protecting all mutators */
    final transient ReentrantLock lock = new ReentrantLock();

    /** The array, accessed only via getArray/setArray. */
    private transient volatile Object[] array;

    /**
     * Gets the array.  Non-private so as to also be accessible
     * from CopyOnWriteArraySet class.
     */
    final Object[] getArray() {
        return array;
    }

    /**
     * Sets the array.
     */
    final void setArray(Object[] a) {
        array = a;
    }

    //添加数据
    public boolean add(E e) {
       final ReentrantLock lock = this.lock;
           //进来就看见了一把锁
        lock.lock();
        try {
            Object[] elements = getArray();
            int len = elements.length;
            //复制数组
            Object[] newElements = Arrays.copyOf(elements, len + 1);
            //在新数组上操作
            newElements[len] = e;
            //把新数组赋值给原先的数组--引用重指向
            setArray(newElements);
            return true;
        } finally {
            lock.unlock();
        }
    }
//...
}