Thoughts Prompted by Disabling System.gc in Dubbo Services
程序员文章站
2022-04-08 20:30:49
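One question has puzzled me for a long time. The production JVM settings for Fengchao's (丰巢) business services disable System.gc(), that is, they enable -XX:+DisableExplicitGC, yet we have never seen an off-heap (direct) memory overflow in production. For context, Fengchao uses Alibaba's open-source Dubbo, and Dubbo's underlying transport uses Netty 3.2.5.Final by default. Our usual understanding is that Netty always uses off-heap memory, and that with System.gc() calls disabled, off-heap memory will inevitably leak unless the service actively frees what it allocates. With this question in mind, I happened to have some time the night before last and read through the Netty 3.2.5 source. Once again it was while waiting for Mantou's mom at Kexing Science Park that I found where the secret lies. All I can say is that Kexing Science Park really is my lucky spot.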
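For readers who want to check the same premise on their own service, here is a small sketch (not from the original investigation) that looks for -XX:+DisableExplicitGC among the arguments of the running JVM and shows the difference between a heap ByteBuffer and a direct one. The class name GcFlagCheck and its helper method are hypothetical; only standard JDK APIs are used, and the check is a plain string match on the startup arguments.

```java
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;
import java.util.List;

// Hypothetical helper for illustration; not part of Dubbo or Netty.
final class GcFlagCheck {

    // Returns true if the given JVM arguments contain -XX:+DisableExplicitGC.
    static boolean explicitGcDisabled(List<String> jvmArgs) {
        return jvmArgs.contains("-XX:+DisableExplicitGC");
    }

    public static void main(String[] args) {
        // Arguments the current JVM was started with (excluding main-class arguments).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("DisableExplicitGC flag present: " + explicitGcDisabled(jvmArgs));

        ByteBuffer heap = ByteBuffer.allocate(1024);         // backed by a byte[] on the Java heap
        ByteBuffer direct = ByteBuffer.allocateDirect(1024); // backed by native (off-heap) memory
        System.out.println("heap.isDirect()   = " + heap.isDirect());
        System.out.println("direct.isDirect() = " + direct.isDirect());
    }
}
```

The off-heap memory behind a direct buffer is only freed after the small ByteBuffer object that owns it has been garbage collected, which is why disabling explicit GC is usually seen as risky for code that allocates many direct buffers.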
Netty classes involved: NioWorker, HeapChannelBufferFactory, BigEndianHeapChannelBuffer, SocketReceiveBufferPool
The core of the secret is in SocketReceiveBufferPool:
import java.lang.ref.SoftReference;
import java.nio.ByteBuffer;

final class SocketReceiveBufferPool {

    private static final int POOL_SIZE = 8;

    @SuppressWarnings("unchecked")
    private final SoftReference<ByteBuffer>[] pool = new SoftReference[POOL_SIZE];

    SocketReceiveBufferPool() {
        super();
    }

    // Returns a pooled direct buffer that is large enough, or allocates a new one.
    final ByteBuffer acquire(int size) {
        final SoftReference<ByteBuffer>[] pool = this.pool;
        for (int i = 0; i < POOL_SIZE; i ++) {
            SoftReference<ByteBuffer> ref = pool[i];
            if (ref == null) {
                continue;
            }

            ByteBuffer buf = ref.get();
            if (buf == null) {
                // The soft reference was cleared by the GC; free the slot.
                pool[i] = null;
                continue;
            }

            if (buf.capacity() < size) {
                continue;
            }

            pool[i] = null;

            buf.clear();
            return buf;
        }

        ByteBuffer buf = ByteBuffer.allocateDirect(normalizeCapacity(size));
        buf.clear();
        return buf;
    }

    // Puts a buffer back into the pool so that a later read can reuse it.
    final void release(ByteBuffer buffer) {
        final SoftReference<ByteBuffer>[] pool = this.pool;
        for (int i = 0; i < POOL_SIZE; i ++) {
            SoftReference<ByteBuffer> ref = pool[i];
            if (ref == null || ref.get() == null) {
                pool[i] = new SoftReference<ByteBuffer>(buffer);
                return;
            }
        }

        // pool is full - replace one
        final int capacity = buffer.capacity();
        for (int i = 0; i < POOL_SIZE; i ++) {
            SoftReference<ByteBuffer> ref = pool[i];
            ByteBuffer pooled = ref.get();
            if (pooled == null) {
                pool[i] = null;
                continue;
            }

            if (pooled.capacity() < capacity) {
                pool[i] = new SoftReference<ByteBuffer>(buffer);
                return;
            }
        }
    }

    private static final int normalizeCapacity(int capacity) {
        // normalize to multiple of 1024
        int q = capacity >>> 10;
        int r = capacity & 1023;
        if (r != 0) {
            q ++;
        }
        return q << 10;
    }
}
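To make the pool's behavior concrete, here is a simplified, self-contained sketch of the same idea. TinyDirectBufferPool is a hypothetical name and a cut-down re-implementation for illustration, not Netty's class: it keeps the soft-reference slots and the rounding to a multiple of 1024, but simply drops a released buffer when the pool is full instead of replacing a smaller one.

```java
import java.lang.ref.SoftReference;
import java.nio.ByteBuffer;

// Simplified, hypothetical re-implementation of the pooling idea; not Netty's class.
final class TinyDirectBufferPool {

    private static final int POOL_SIZE = 8;

    @SuppressWarnings("unchecked")
    private final SoftReference<ByteBuffer>[] pool = new SoftReference[POOL_SIZE];

    // Rounds up to the next multiple of 1024, like normalizeCapacity in the listing above.
    static int normalizeCapacity(int capacity) {
        int q = capacity >>> 10;
        if ((capacity & 1023) != 0) {
            q++;
        }
        return q << 10;
    }

    ByteBuffer acquire(int size) {
        for (int i = 0; i < POOL_SIZE; i++) {
            SoftReference<ByteBuffer> ref = pool[i];
            ByteBuffer buf = (ref == null) ? null : ref.get();
            if (buf != null && buf.capacity() >= size) {
                pool[i] = null; // the buffer is handed out, so take it out of the pool
                buf.clear();
                return buf;
            }
        }
        // Nothing suitable in the pool: allocate a new direct buffer.
        return ByteBuffer.allocateDirect(normalizeCapacity(size));
    }

    void release(ByteBuffer buffer) {
        for (int i = 0; i < POOL_SIZE; i++) {
            if (pool[i] == null || pool[i].get() == null) {
                pool[i] = new SoftReference<ByteBuffer>(buffer);
                return;
            }
        }
        // Pool is full: this sketch drops the buffer (Netty replaces a smaller one instead).
    }

    public static void main(String[] args) {
        TinyDirectBufferPool p = new TinyDirectBufferPool();
        ByteBuffer first = p.acquire(1500);
        System.out.println(first.capacity());  // prints 2048 (1500 rounded up to a multiple of 1024)
        p.release(first);
        ByteBuffer second = p.acquire(2000);
        System.out.println(first == second);   // prints true: the same direct buffer is reused
    }
}
```

Because the released buffer goes back into a slot, the second acquire call does not allocate any new off-heap memory. A handful of direct buffers per pool can therefore serve an unbounded number of reads.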
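SocketReceiveBufferPool maintains an array of SoftReference<ByteBuffer>; readers unfamiliar with Java's SoftReference can look it up. In effect, the class keeps a pool of direct buffers whose memory can be reused. That raises a question: if Netty handed the direct buffer it uses for receiving network data straight to Dubbo's business code, what would the pool be for, and how would the memory ever be released back to it? With that question in mind, let's continue with NioWorker, the class that calls SocketReceiveBufferPool.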
// Excerpt: a method of NioWorker (Netty 3.2.5), not a standalone class.
private boolean read(SelectionKey k) {
    final SocketChannel ch = (SocketChannel) k.channel();
    final NioSocketChannel channel = (NioSocketChannel) k.attachment();

    final ReceiveBufferSizePredictor predictor =
        channel.getConfig().getReceiveBufferSizePredictor();
    final int predictedRecvBufSize = predictor.nextReceiveBufferSize();

    int ret = 0;
    int readBytes = 0;
    boolean failure = true;

    // Borrow a direct buffer from the pool for the socket read.
    ByteBuffer bb = recvBufferPool.acquire(predictedRecvBufSize);

    try {
        while ((ret = ch.read(bb)) > 0) {
            readBytes += ret;
            if (!bb.hasRemaining()) {
                break;
            }
        }
        failure = false;
    } catch (ClosedChannelException e) {
        // Can happen, and does not need a user attention.
    } catch (Throwable t) {
        fireExceptionCaught(channel, t);
    }

    if (readBytes > 0) {
        bb.flip();

        // Copy the received bytes into a new ChannelBuffer created by the configured factory.
        final ChannelBufferFactory bufferFactory =
            channel.getConfig().getBufferFactory();
        final ChannelBuffer buffer = bufferFactory.getBuffer(readBytes);
        buffer.setBytes(0, bb);
        buffer.writerIndex(readBytes);
        //if(buffer instanceof BigEndianHeapChannelBuffer){
        //    logger2.info("buffer instanceof BigEndianHeapChannelBuffer.");
        //}
        // The direct buffer goes straight back to the pool.
        recvBufferPool.release(bb);

        // Update the predictor.
        predictor.previousReceiveBufferSize(readBytes);

        // Fire the event.
        fireMessageReceived(channel, buffer);
    } else {
        recvBufferPool.release(bb);
    }

    if (ret < 0 || failure) {
        k.cancel(); // Some JDK implementations run into an infinite loop without this.
        close(channel, succeededFuture(channel));
        return false;
    }

    return true;
}
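The code shows that Netty creates a new ChannelBuffer, copies the contents of the direct buffer into it, and then releases the direct buffer back to the pool. That ChannelBuffer is actually heap memory, and it is this heap buffer that Netty decodes and passes up to the calling service. In other words, the direct buffer is never handed to the Dubbo service itself. This explains why, in JVMs that provide Dubbo services with System.gc() disabled, we have never seen an off-heap memory leak. When I find time, I will analyze in detail how Netty 4 and Kafka use direct buffers.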
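The copy step described above can be sketched with the JDK alone. This is an illustration under stated assumptions, not Netty's code: the class DirectToHeapCopy and its method copyToHeap are hypothetical, and a plain byte[] stands in for the heap ChannelBuffer.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration using only the JDK; byte[] stands in for Netty's heap ChannelBuffer.
final class DirectToHeapCopy {

    // Copies the readable bytes of a (direct) buffer into a new array on the Java heap.
    static byte[] copyToHeap(ByteBuffer bb) {
        byte[] heap = new byte[bb.remaining()];
        bb.get(heap);
        return heap;
    }

    public static void main(String[] args) {
        ByteBuffer direct = ByteBuffer.allocateDirect(1024); // plays the pooled receive buffer
        direct.put("hello".getBytes(StandardCharsets.US_ASCII));
        direct.flip(); // switch from writing to reading, as read() does before copying

        byte[] message = copyToHeap(direct);

        // The direct buffer can now be reused for the next read without affecting the copy.
        direct.clear();
        direct.put("WORLD".getBytes(StandardCharsets.US_ASCII));

        System.out.println(new String(message, StandardCharsets.US_ASCII)); // prints hello
    }
}
```

The upper layers only ever see the heap copy, which the ordinary garbage collector reclaims. The direct buffer stays inside the I/O layer and is recycled, so its lifetime does not depend on an explicit System.gc() call.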