Designing an Efficient, Thread-Safe Cache -- JCIP 5.6 Reading Notes

程序员文章站 2022-03-02 10:33:00

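[This post is my summary of section 5.6 of Java Concurrency In Practice. Please credit the author and source when reposting. If you spot any mistakes, corrections in the comments are welcome.]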

Almost every application uses a cache, but designing one that is both efficient and thread-safe is not simple. For example:

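// Imports needed by the listings in this post
import java.math.BigInteger;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.*;

public interface Computable<A, V> { 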
    V compute(A arg) throws InterruptedException; 
} 

public class ExpensiveFunction 
        implements Computable<String, BigInteger> { 
    // Simulates an expensive computation
    public BigInteger compute(String arg) { 
        // ...
        return new BigInteger(arg); 
    } 
} 

public class Memorizer1<A, V> implements Computable<A, V> { 
    private final Map<A, V> cache = new HashMap<A, V>(); 
    private final Computable<A, V> c; 

    public Memorizer1(Computable<A, V> c) { 
        this.c = c; 
    } 
    // Thread safety comes from synchronizing the whole method
    public synchronized V compute(A arg) throws InterruptedException { 
        V result = cache.get(arg); 
        if (result == null) { 
            result = c.compute(arg); 
            cache.put(arg, result); 
        } 
        return result; 
    } 
}

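Memorizer1 caches results in a HashMap. If the result for an argument is already in the cache, it is returned directly, which avoids repeating an expensive computation. Because HashMap is not thread-safe, Memorizer1 synchronizes the entire compute method. That prevents duplicate computation, but it also gives up any chance of running compute concurrently: only one thread can be inside it at a time, even for unrelated arguments. This design may even perform worse than having no cache at all.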
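
To make the cost of the method-wide lock concrete, here is a minimal sketch that counts how many computations are ever in progress at once. SerializedCacheDemo, maxConcurrentComputations and the 20 ms sleep are hypothetical names and values chosen for illustration (they are not from the book), and the lambdas assume Java 8 or later.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; only Memorizer1 mirrors the listing above.
class SerializedCacheDemo {
    interface Computable<A, V> {
        V compute(A arg) throws InterruptedException;
    }

    // Same shape as Memorizer1: one lock guards both lookup and computation.
    static class Memorizer1<A, V> implements Computable<A, V> {
        private final Map<A, V> cache = new HashMap<A, V>();
        private final Computable<A, V> c;
        Memorizer1(Computable<A, V> c) { this.c = c; }

        public synchronized V compute(A arg) throws InterruptedException {
            V result = cache.get(arg);
            if (result == null) {
                result = c.compute(arg);
                cache.put(arg, result);
            }
            return result;
        }
    }

    // Starts `threads` callers with distinct arguments and returns the largest
    // number of computations that were in progress at the same moment.
    static int maxConcurrentComputations(int threads) {
        final AtomicInteger active = new AtomicInteger();
        final AtomicInteger maxActive = new AtomicInteger();
        Computable<Integer, Integer> slow = arg -> {
            int now = active.incrementAndGet();
            maxActive.accumulateAndGet(now, Math::max);
            Thread.sleep(20); // stand-in for expensive work
            active.decrementAndGet();
            return arg * arg;
        };
        final Computable<Integer, Integer> cache = new Memorizer1<>(slow);

        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int arg = i + 1;
            workers[i] = new Thread(() -> {
                try {
                    cache.compute(arg);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return maxActive.get();
    }

    public static void main(String[] args) {
        System.out.println("max concurrent computations: " + maxConcurrentComputations(4));
    }
}
```

Because every call holds the same lock for the whole computation, the counter never rises above 1, even though the four callers ask for four different keys.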

Replacing HashMap with ConcurrentHashMap and dropping the synchronization on compute can improve performance considerably:

public class Memorizer2<A, V> implements Computable<A, V> { 
    private final Map<A, V> cache = new ConcurrentHashMap<A, V>(); 
    private final Computable<A, V> c; 

    public Memorizer2(Computable<A, V> c) { this.c = c; } 

    public V compute(A arg) throws InterruptedException { 
        V result = cache.get(arg); 
        if (result == null) { 
            result = c.compute(arg); 
            cache.put(arg, result); 
        } 
        return result; 
    } 
}
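
Dropping the lock has a side effect that a small experiment can show: two threads that miss the cache at the same moment both run the computation. The sketch below forces that overlap with a latch. DuplicateComputationDemo, computationsForSameKey, the key "42" and the 2 second timeout are hypothetical choices for illustration, and the lambdas assume Java 8 or later.

```java
import java.math.BigInteger;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; only Memorizer2 mirrors the listing above.
class DuplicateComputationDemo {
    interface Computable<A, V> {
        V compute(A arg) throws InterruptedException;
    }

    static class Memorizer2<A, V> implements Computable<A, V> {
        private final Map<A, V> cache = new ConcurrentHashMap<A, V>();
        private final Computable<A, V> c;
        Memorizer2(Computable<A, V> c) { this.c = c; }

        public V compute(A arg) throws InterruptedException {
            V result = cache.get(arg);
            if (result == null) {
                result = c.compute(arg);
                cache.put(arg, result);
            }
            return result;
        }
    }

    // Two threads ask for the same key; returns how often the function ran.
    static int computationsForSameKey() {
        final AtomicInteger calls = new AtomicInteger();
        // Opens once two computations are in flight. The timeout only stops
        // a caller from waiting forever if that never happens.
        final CountDownLatch bothInside = new CountDownLatch(2);
        Computable<String, BigInteger> slow = arg -> {
            calls.incrementAndGet();
            bothInside.countDown();
            bothInside.await(2, TimeUnit.SECONDS);
            return new BigInteger(arg);
        };
        final Computable<String, BigInteger> cache = new Memorizer2<>(slow);

        Runnable caller = () -> {
            try {
                cache.compute("42");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        Thread t1 = new Thread(caller);
        Thread t2 = new Thread(caller);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println("computations: " + computationsForSameKey());
    }
}
```

The first caller stays inside the function until the second one arrives, and since nothing has been stored in the cache yet, the second caller starts the same work again. Under these assumptions the function runs twice for a single key.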

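ConcurrentHashMap is thread-safe and has excellent concurrent performance. But this design still has a problem: it cannot prevent all duplicate computation. Sometimes that is acceptable, but in a demanding system duplicate computation can cause serious problems. The issue with Memorizer2 is that while one thread is inside compute, another thread calling compute with the same argument has no way to learn from the cache that a computation for that argument is already in progress, so it starts the same computation again. The cache design can be improved to address this: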

public class Memorizer3<A, V> implements Computable<A, V> { 
    // The cache now stores Futures instead of values
    private final Map<A, Future<V>> cache 
            = new ConcurrentHashMap<A, Future<V>>(); 
    private final Computable<A, V> c; 

    public Memorizer3(Computable<A, V> c) { this.c = c; } 

    public V compute(final A arg) throws InterruptedException { 
        Future<V> f = cache.get(arg); 
        if (f == null) { 
            Callable<V> eval = new Callable<V>() { 
                public V call() throws InterruptedException { 
                    return c.compute(arg); 
                } 
            }; 
            FutureTask<V> ft = new FutureTask<V>(eval); 
            f = ft; 
            // Store the Future in the cache before the computation starts.
            cache.put(arg, ft); 
            ft.run(); // call to c.compute happens here 
        } 
        try { 
            // If the cache already held a Future for arg, call get on that Future.
            // If the computation is still running, get blocks until it completes.
            return f.get(); 
        } catch (ExecutionException e) { 
            throw launderThrowable(e.getCause()); 
        } 
    } 
}
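
The listing above calls launderThrowable without defining it. In JCIP it is a small static helper introduced together with FutureTask in section 5.5; the version below is a sketch of that helper written from memory, so treat the exact message text and the holder class name LaunderThrowableSketch as assumptions.

```java
// Hypothetical holder class; the book presents the method as a static helper.
class LaunderThrowableSketch {
    // Returns t if it is a RuntimeException, rethrows it if it is an Error,
    // and otherwise reports a programming error, because the caller expected
    // only unchecked causes at this point.
    static RuntimeException launderThrowable(Throwable t) {
        if (t instanceof RuntimeException) {
            return (RuntimeException) t;
        } else if (t instanceof Error) {
            throw (Error) t;
        } else {
            throw new IllegalStateException("Not unchecked", t);
        }
    }
}
```

The helper returns the exception instead of throwing it so that the call site can write `throw launderThrowable(...)`, which lets the compiler see that the catch block never completes normally.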

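Memorizer3's cache looks nearly perfect: it has excellent concurrency and no duplicate computation. Or does it? Unfortunately Memorizer3 can still compute the same value twice; the probability is just lower than with Memorizer2. The if block is a check-then-act sequence that is not atomic: cache.get(arg) returning null does not guarantee that the cache still holds no Future for arg by the time cache.put(arg, ft) runs, so two threads can both see null and both start the computation. Calling cache.put(arg, ft) unconditionally is therefore wrong: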

public class Memorizer<A, V> implements Computable<A, V> { 
    private final ConcurrentMap<A, Future<V>> cache 
        = new ConcurrentHashMap<A, Future<V>>(); 
    private final Computable<A, V> c; 

    public Memorizer(Computable<A, V> c) { this.c = c; } 

    public V compute(final A arg) throws InterruptedException { 
        while (true) { 
            Future<V> f = cache.get(arg); 
            if (f == null) { 
                Callable<V> eval = new Callable<V>() { 
                    public V call() throws InterruptedException { 
                        return c.compute(arg); 
                    } 
                }; 
                FutureTask<V> ft = new FutureTask<V>(eval); 
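                // putIfAbsent tells us whether ft was actually stored. A non-null return
                // means a Future for arg was already cached, so we wait on that one;
                // only when it returns null do we run the computation ourselves.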
                f = cache.putIfAbsent(arg, ft); 
                if (f == null) { f = ft; ft.run(); } 
            } 
            try { 
                return f.get(); 
            } catch (CancellationException e) { 
                // If the computation was cancelled, remove the arg-f entry from the cache and retry
                cache.remove(arg, f); 
            } catch (ExecutionException e) { 
                throw launderThrowable(e.getCause()); 
            } 
        } 
    } 
}
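
The two ConcurrentMap calls this version relies on are putIfAbsent(key, value) and the two-argument remove(key, value). A minimal single-threaded sketch of their return values follows; PutIfAbsentDemo and the keys and values are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical demo class showing the return values the Memorizer relies on.
class PutIfAbsentDemo {
    static String describe() {
        ConcurrentMap<String, String> cache = new ConcurrentHashMap<String, String>();

        // No mapping yet: "A" is stored and null is returned.
        String first = cache.putIfAbsent("k", "A");
        // A mapping exists: nothing is stored and the existing value is returned.
        String second = cache.putIfAbsent("k", "B");

        // remove(key, value) only removes the entry if it still maps to value.
        boolean removedWrong = cache.remove("k", "B");
        boolean removedRight = cache.remove("k", "A");

        return first + "," + second + "," + removedWrong + "," + removedRight;
    }

    public static void main(String[] args) {
        System.out.println(describe()); // prints "null,A,false,true"
    }
}
```

In the Memorizer, a null return from putIfAbsent is what elects the one thread that runs the task, and remove(arg, f) ensures a cancelled Future is only evicted if no other thread has already replaced it.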

Only with this version do we have a cache that is both efficient and thread-safe.
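
As a quick check of that claim, the sketch below has several threads request the same key and counts how often the underlying function runs. MemorizerOnceDemo, computationsForSameKey and the 50 ms sleep are hypothetical; the nested Memorizer follows the final listing, except that it wraps the failure cause in an IllegalStateException instead of calling launderThrowable, to keep the example self-contained. Java 8 or later is assumed for the lambdas.

```java
import java.math.BigInteger;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; the nested Memorizer follows the final listing.
class MemorizerOnceDemo {
    interface Computable<A, V> {
        V compute(A arg) throws InterruptedException;
    }

    static class Memorizer<A, V> implements Computable<A, V> {
        private final ConcurrentMap<A, Future<V>> cache =
                new ConcurrentHashMap<A, Future<V>>();
        private final Computable<A, V> c;
        Memorizer(Computable<A, V> c) { this.c = c; }

        public V compute(final A arg) throws InterruptedException {
            while (true) {
                Future<V> f = cache.get(arg);
                if (f == null) {
                    FutureTask<V> ft = new FutureTask<V>(() -> c.compute(arg));
                    f = cache.putIfAbsent(arg, ft);
                    if (f == null) { f = ft; ft.run(); } // this thread won the race
                }
                try {
                    return f.get();
                } catch (CancellationException e) {
                    cache.remove(arg, f);
                } catch (ExecutionException e) {
                    // Simplification for this sketch; the listing uses launderThrowable.
                    throw new IllegalStateException(e.getCause());
                }
            }
        }
    }

    // `threads` callers ask for the same key; returns how often the function ran.
    static int computationsForSameKey(int threads) {
        final AtomicInteger calls = new AtomicInteger();
        Computable<String, BigInteger> slow = arg -> {
            calls.incrementAndGet();
            Thread.sleep(50); // stand-in for expensive work
            return new BigInteger(arg);
        };
        final Computable<String, BigInteger> cache = new Memorizer<>(slow);

        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    cache.compute("42");
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println("computations: " + computationsForSameKey(8));
    }
}
```

Only the thread whose putIfAbsent call returns null runs the task; every other caller blocks in f.get() on the same Future, so the function runs once no matter how many threads arrive.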

PS: I have finally finished chapter 5 of JCIP. This chapter really is long and tedious...
