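Monitoring WLS with JFR
A while ago the builds on Jenkins fluctuated a lot: some UnitPerfTest runs took very long, and at one point the WLS cluster performed badly. So I wanted to add a JRockit Flight Recorder (JFR) module to the UnitPerfTest framework. Specifically: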
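1) Before a task starts running on a node, start a JFR recording on that node;
2) When the task finishes, stop the recording;
3) Send the JFR report produced on each node back to the JVM that launched the task, that is, our Jenkins server.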
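The first two steps amount to a start/stop pair wrapped around the task. A minimal sketch of that shape, assuming a hypothetical `NodeRecording` interface and `RecordedTask` helper (neither is part of JFR or of the UnitPerfTest framework):

```java
import java.util.concurrent.Callable;

/** Hypothetical stand-in for a per-node recording; not a JFR API. */
interface NodeRecording {
    void start() throws Exception;
    void stop() throws Exception;
}

final class RecordedTask {
    private RecordedTask() {}

    /**
     * Starts the recording, runs the task, and always stops the recording,
     * even when the task throws. Shipping the report back (step 3) is left
     * to the caller.
     */
    static <T> T runWithRecording(NodeRecording recording, Callable<T> task) throws Exception {
        recording.start();          // step 1: start before the task runs
        try {
            return task.call();     // the task itself
        } finally {
            recording.stop();       // step 2: stop once the task is done
        }
    }
}
```

Putting the stop call in `finally` means a failing test still leaves a closed recording behind, which is usually the run one most wants to inspect.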
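JFR is part of JRockit Mission Control (newer HotSpot JDKs support it as well). A look at the wiki shows it is fairly easy to use. JFR is event-driven: all data collection is triggered by events, so a client that wants a customized JFR report can register the events it cares about and disable the ones it does not. The first step is therefore to find out which events exist. A short piece of code prints the events and their attributes (event file attached):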
    try {
        FlightRecorderClient fr = new FlightRecorderClient();
        FlightRecordingClient rec = fr.createRecordingObject("tmp");
        for (CompositeData pd : fr.getProducers()) {
            if (!PRODUCER_URI.equals(pd.get("uri")))
                continue;
            CompositeData events[] = (CompositeData[]) pd.get("events");
            // Go through all registered events and enable them
            for (CompositeData d : events) {
                int id = (Integer) d.get("id");
                // ......
            }
        }
    } catch (Exception e) {
        // Add proper exception handling.
        e.printStackTrace();
    }
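After that the rest is straightforward. Define a few commonly used event templates, such as nomal_event_setting, mem_event_setting and lock_event_setting, for general tests, tests that focus on memory problems, and tests that focus on CPU/lock problems respectively. Before the node starts its run, create the JFR recording, set the events of interest, and start it: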
    public static FlightRecordingClient createFlightRecordingClient(final String recordingName)
            throws NamingException, JMException, IOException, NoSuchRecordingException, NoSuchEventException {
        MBeanServerConnection server = ServerConfiguration.getInstance().getServerRuntimeServer();
        FlightRecorderClient frc = new FlightRecorderClient(server);
        FlightRecordingClient flightRecording = frc.createRecordingObject(recordingName);
        // Disable all the events
        for (CompositeData pd : frc.getProducers()) {
            CompositeData[] events = (CompositeData[]) pd.get("events");
            for (CompositeData event : events) {
                Integer id = (Integer) event.get("id");
                flightRecording.setEventEnabled(id, false);
                flightRecording.setStackTraceEnabled(id, false);
            }
        }
        return flightRecording;
    }

    public static void startFlightRecording(FlightRecordingClient frc, List<JFREventSetting> events)
            throws NoSuchEventException {
        if (frc.isStarted()) {
            logger.warn("The FlightRecording has already started in current thread.");
        } else {
            // Set all the event values
            for (JFREventSetting event : events) {
                frc.setEventEnabled(event.getId(), event.isEnabled());
                frc.setStackTraceEnabled(event.getId(), event.isStackTraceEnabled());
                frc.setPeriod(event.getId(), event.getPeriod());
                frc.setThreshold(event.getId(), event.getThreshold());
            }
            frc.start();
        }
    }
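The post does not show `JFREventSetting` or the event templates themselves. Judging only from the getters that `startFlightRecording` calls, a setting could be a small immutable value object along these lines. This is a sketch under assumptions: the field types, the `JFREventTemplates` holder, and the event id and period in the template are placeholders, not values from the original framework. Real ids would come from the `getProducers()` listing.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical value object exposing the getters used by startFlightRecording. */
final class JFREventSetting {
    private final int id;
    private final boolean enabled;
    private final boolean stackTraceEnabled;
    private final long period;
    private final long threshold;

    JFREventSetting(int id, boolean enabled, boolean stackTraceEnabled, long period, long threshold) {
        this.id = id;
        this.enabled = enabled;
        this.stackTraceEnabled = stackTraceEnabled;
        this.period = period;
        this.threshold = threshold;
    }

    int getId() { return id; }
    boolean isEnabled() { return enabled; }
    boolean isStackTraceEnabled() { return stackTraceEnabled; }
    long getPeriod() { return period; }
    long getThreshold() { return threshold; }
}

/** Hypothetical holder for reusable templates; the id and numbers are placeholders. */
final class JFREventTemplates {
    private JFREventTemplates() {}

    // A memory-focused template would enable the events of interest with stack traces.
    static final List<JFREventSetting> MEM_EVENT_SETTING = Collections.unmodifiableList(
            Arrays.asList(new JFREventSetting(1, true, true, 1000L, 0L)));
}
```

Keeping the templates as unmodifiable lists lets several tests share one template without one test changing another's configuration.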
When the test finishes, stop the JFR recording:
    public static void stopFlightRecording(final FlightRecordingClient frc) throws IOException {
        if (frc == null) {
            logger.warn("Stop Flight Recording Warning: There is no FlightRecording in current thread context.");
            return;
        }
        if (frc.isStopped()) {
            logger.warn("The FlightRecording has already stopped in current thread.");
        } else {
            frc.stop();
        }
    }
Finally, the JFR report has to be sent from the WLS node back to the Jenkins server, so we implement a RemoteStream for that:
    public interface RemoteOutputStream extends Remote {
        public void close() throws IOException, RemoteException;
        public void flush() throws IOException, RemoteException;
        public void write(byte[] b) throws IOException, RemoteException;
        public void write(byte[] b, int off, int len) throws IOException, RemoteException;
        public void write(int b) throws IOException, RemoteException;
    }
    public class RemoteOutputStreamImpl implements RemoteOutputStream {
        // The local stream that remote calls are forwarded to.
        private final OutputStream out;

        public RemoteOutputStreamImpl(OutputStream out)
        {
            if (out == null) {
                throw new IllegalArgumentException("OutputStream cannot be null.");
            }
            this.out = out;
        }

        @Override
        public void close() throws IOException
        {
            out.close();
        }

        /* (non-Javadoc)
         * @see oracle.oats.common.utilities.wls.rmi.RemoteOutputStream#flush()
         */
        @Override
        public void flush() throws IOException
        {
            out.flush();
        }

        /* (non-Javadoc)
         * @see oracle.oats.common.utils.rmi.RemoteOutputStream#write(byte[])
         */
        @Override
        public void write(byte[] b) throws IOException
        {
            out.write(b);
        }

        /* (non-Javadoc)
         * @see oracle.oats.common.utils.rmi.RemoteOutputStream#write(byte[], int, int)
         */
        @Override
        public void write(byte[] b, int off, int len) throws IOException
        {
            out.write(b, off, len);
        }

        /* (non-Javadoc)
         * @see oracle.oats.common.utils.rmi.RemoteOutputStream#write(int)
         */
        @Override
        public void write(int b) throws IOException
        {
            out.write(b);
        }
    }
    public class RemoteOutputStreamWrapper extends OutputStream {
        // Wraps a RemoteOutputStream; omitted here
    }
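The wrapper body is omitted in the post. One plausible shape, offered as a sketch and not as the author's implementation, is a local `OutputStream` that forwards every call to the remote stub. The interface is repeated in compact form only so that the block compiles on its own. Copying the requested range before the remote call is an assumption about what is worth optimizing: over RMI the whole array argument is serialized, so sending only the bytes that matter avoids shipping unused buffer space.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.rmi.Remote;
import java.util.Arrays;

// Compact copy of the remote interface so this sketch is self-contained.
interface RemoteOutputStream extends Remote {
    void close() throws IOException;
    void flush() throws IOException;
    void write(byte[] b) throws IOException;
    void write(byte[] b, int off, int len) throws IOException;
    void write(int b) throws IOException;
}

/** Sketch of the omitted wrapper: a local OutputStream backed by a remote stub. */
class RemoteOutputStreamWrapper extends OutputStream {
    private final RemoteOutputStream remote;

    RemoteOutputStreamWrapper(RemoteOutputStream remote) {
        if (remote == null) {
            throw new IllegalArgumentException("RemoteOutputStream cannot be null.");
        }
        this.remote = remote;
    }

    @Override
    public void write(int b) throws IOException {
        remote.write(b);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        // Send only the requested range, in a single remote call.
        remote.write(Arrays.copyOfRange(b, off, off + len));
    }

    @Override
    public void flush() throws IOException {
        remote.flush();
    }

    @Override
    public void close() throws IOException {
        remote.close();
    }
}
```

Because the wrapper is an ordinary `OutputStream`, code on the node can write the report into it without knowing that the bytes end up on another machine.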
With that in place, sending the report back remotely is just a simple stream copy:
    public static void copyStream(InputStream in, OutputStream out) throws IOException {
        if (!(in instanceof BufferedInputStream || in instanceof ByteArrayInputStream)) {
            in = new BufferedInputStream(in);
        }
        if (!(out instanceof BufferedOutputStream || out instanceof ByteArrayOutputStream)) {
            out = new BufferedOutputStream(out);
        }
        byte[] buf = new byte[32768];
        int len = 0;
        while ((len = in.read(buf)) != -1) {
            out.write(buf, 0, len);
        }
        out.flush();
    }