From clock_gettime and gettimeofday: measuring function execution time on Linux
Introduction
OpenCV uses the following two functions to read a tick counter and the frequency of that counter.
static long long getTickCount(void)
{
#if defined _WIN32 || defined WINCE
    LARGE_INTEGER counter;
    QueryPerformanceCounter( &counter );
    return (long long)counter.QuadPart;
#elif defined __linux || defined __linux__
    struct timespec tp;
    clock_gettime(CLOCK_MONOTONIC, &tp);
    return (long long)tp.tv_sec*1000000000 + tp.tv_nsec;
#elif defined __MACH__ && defined __APPLE__
    return (long long)mach_absolute_time();
#else
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (long long)tv.tv_sec*1000000 + tv.tv_usec;
#endif
}
CV_EXPORTS_W double getTickFrequency(void)
{
#if defined _WIN32 || defined WINCE
    LARGE_INTEGER freq;
    QueryPerformanceFrequency(&freq);
    return (double)freq.QuadPart;
#elif defined __linux || defined __linux__
    return 1e9;
#elif defined __MACH__ && defined __APPLE__
    static double freq = 0;
    if( freq == 0 )
    {
        mach_timebase_info_data_t sTimebaseInfo;
        mach_timebase_info(&sTimebaseInfo);
        freq = sTimebaseInfo.denom*1e9/sTimebaseInfo.numer;
    }
    return freq;
#else
    return 1e6;
#endif
}
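To find out how long an operation takes, call getTickCount once before it and once after it, then divide the difference by the tick frequency.
long long t_start = getTickCount();
...
long long t_end = getTickCount();
double time_elapsed = (double)(t_end - t_start) / getTickFrequency();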
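The same pattern can be written directly against clock_gettime on Linux. The following is a minimal sketch, not OpenCV code: the names monotonic_now_ns, time_call_seconds and sum_to_n are hypothetical helpers introduced only for illustration.

```c
#define _POSIX_C_SOURCE 200809L  /* expose clock_gettime under strict -std= modes */
#include <stdint.h>
#include <time.h>

/* Current CLOCK_MONOTONIC reading in nanoseconds, or -1 if the call fails. */
int64_t monotonic_now_ns(void)
{
    struct timespec tp;
    if (clock_gettime(CLOCK_MONOTONIC, &tp) != 0)
        return -1;
    return (int64_t)tp.tv_sec * 1000000000LL + tp.tv_nsec;
}

/* Hypothetical helper: run fn(arg) once and return the elapsed seconds. */
double time_call_seconds(void (*fn)(void *), void *arg)
{
    int64_t start = monotonic_now_ns();
    fn(arg);
    int64_t end = monotonic_now_ns();
    return (double)(end - start) / 1e9;
}

/* Example workload: sum the integers below *n (arg points to a long). */
void sum_to_n(void *arg)
{
    long n = *(long *)arg;
    volatile long total = 0;  /* volatile keeps the loop from being optimized away */
    for (long i = 0; i < n; i++)
        total += i;
}
```

A caller would write something like `long n = 1000000; double s = time_call_seconds(sum_to_n, &n);`. On glibc older than 2.17 the program also has to be linked with -lrt.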
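As the functions above show, outside Windows and macOS the elapsed time comes from either clock_gettime or gettimeofday, and clock_gettime is preferred: gettimeofday is only the fallback for platforms that are not Linux. Why is that?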
clock_gettime & gettimeofday
Comparing resolution
gettimeofday returns the number of seconds and microseconds elapsed since the Epoch.
The functions gettimeofday() and settimeofday() can get and set the time as well as a timezone. The tv argument is a struct timeval (as specified in <sys/time.h>):
struct timeval {
time_t tv_sec; /* seconds */
suseconds_t tv_usec; /* microseconds */
};
and gives the number of seconds and microseconds since the Epoch (see time(2)).
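clock_gettime returns seconds and nanoseconds, so its resolution can be as fine as 1 ns. The resolution actually delivered depends on the clock being read.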
The functions clock_gettime() and clock_settime() retrieve and set the time of the specified clock clk_id.
The res and tp arguments are timespec structures, as specified in <time.h>:
struct timespec {
time_t tv_sec; /* seconds */
long tv_nsec; /* nanoseconds */
};
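The res argument mentioned in the excerpt belongs to clock_getres, which reports the resolution of a given clock. A minimal sketch of querying it follows; clock_resolution_ns is a hypothetical wrapper name, not part of any library.

```c
#define _POSIX_C_SOURCE 200809L  /* expose clock_getres under strict -std= modes */
#include <time.h>

/* Resolution the system reports for clock `id`, in nanoseconds; -1 on error. */
long long clock_resolution_ns(clockid_t id)
{
    struct timespec res;
    if (clock_getres(id, &res) != 0)
        return -1;
    return (long long)res.tv_sec * 1000000000LL + res.tv_nsec;
}
```

Calling clock_resolution_ns(CLOCK_MONOTONIC) on a kernel with high-resolution timers typically yields 1, but the value is system dependent and should be checked rather than assumed.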
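Comparing monotonicity
clock_gettime can return monotonic, continuous time when called with the CLOCK_MONOTONIC clock ID. gettimeofday cannot, because it reports wall-clock time, which jumps whenever the system time is changed.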
This clock is not affected by discontinuous jumps in the system time (e.g., if the system administrator manually
changes the clock), but is affected by the incremental adjustments performed by adjtime(3) and NTP
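One way to observe this property is to read a clock repeatedly and count how often a reading is earlier than the previous one. The sketch below assumes nothing beyond POSIX clock_gettime; count_backward_steps is a hypothetical name.

```c
#define _POSIX_C_SOURCE 200809L  /* expose clock_gettime under strict -std= modes */
#include <time.h>

/* Read clock `id` `samples` times in a row and count the readings that are
   earlier than the one before. Returns -1 if the clock cannot be read. */
long count_backward_steps(clockid_t id, long samples)
{
    struct timespec prev, cur;
    long backward = 0;

    if (clock_gettime(id, &prev) != 0)
        return -1;
    for (long i = 0; i < samples; i++) {
        if (clock_gettime(id, &cur) != 0)
            return -1;
        if (cur.tv_sec < prev.tv_sec ||
            (cur.tv_sec == prev.tv_sec && cur.tv_nsec < prev.tv_nsec))
            backward++;
        prev = cur;
    }
    return backward;
}
```

For CLOCK_MONOTONIC the count should be 0. For CLOCK_REALTIME it is usually 0 as well, but it can become non-zero if the system time is set backwards while the loop is running.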
How the choice of clock ID affects clock_gettime.
Clock ID | Description |
---|---|
CLOCK_REALTIME | Represents wall-clock time. Can be both stepped and slewed by time adjustment code (e.g., NTP, PTP). |
CLOCK_REALTIME_COARSE | A lower-resolution version of CLOCK_REALTIME. |
CLOCK_REALTIME_HR | A higher-resolution version of CLOCK_REALTIME. Only available with the real-time kernel. |
CLOCK_MONOTONIC | Represents the interval from an arbitrary time. Can be slewed but not stepped by time adjustment code. As such, it can only move forward, not backward. |
CLOCK_MONOTONIC_COARSE | A lower-resolution version of CLOCK_MONOTONIC. |
CLOCK_MONOTONIC_RAW | A version of CLOCK_MONOTONIC that can neither be slewed nor stepped by time adjustment code. |
CLOCK_BOOTTIME | A version of CLOCK_MONOTONIC that additionally reflects time spent in suspend mode. Only available in newer (2.6.39+) kernels. |
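To see which of these clock IDs a particular system accepts, and with what resolution, each one can be probed with clock_getres and clock_gettime. This is a sketch under assumptions: print_clock and print_known_clocks are hypothetical names, the Linux-specific IDs are guarded with #ifdef because not every libc defines them, and CLOCK_REALTIME_HR is left out because mainline headers do not define it.

```c
#define _POSIX_C_SOURCE 200809L  /* expose clock_gettime under strict -std= modes */
#include <stdio.h>
#include <time.h>

/* Print resolution and current reading of one clock ID.
   Returns 0 on success, -1 if the system rejects the ID. */
int print_clock(const char *name, clockid_t id)
{
    struct timespec res, now;
    if (clock_getres(id, &res) != 0 || clock_gettime(id, &now) != 0) {
        printf("%-24s unsupported\n", name);
        return -1;
    }
    long long res_ns = (long long)res.tv_sec * 1000000000LL + res.tv_nsec;
    printf("%-24s res=%lld ns  now=%lld.%09ld s\n",
           name, res_ns, (long long)now.tv_sec, (long)now.tv_nsec);
    return 0;
}

/* Probe the clock IDs known at compile time; returns how many were readable. */
int print_known_clocks(void)
{
    int ok = 0;
    ok += print_clock("CLOCK_REALTIME", CLOCK_REALTIME) == 0;
    ok += print_clock("CLOCK_MONOTONIC", CLOCK_MONOTONIC) == 0;
#ifdef CLOCK_REALTIME_COARSE
    ok += print_clock("CLOCK_REALTIME_COARSE", CLOCK_REALTIME_COARSE) == 0;
#endif
#ifdef CLOCK_MONOTONIC_COARSE
    ok += print_clock("CLOCK_MONOTONIC_COARSE", CLOCK_MONOTONIC_COARSE) == 0;
#endif
#ifdef CLOCK_MONOTONIC_RAW
    ok += print_clock("CLOCK_MONOTONIC_RAW", CLOCK_MONOTONIC_RAW) == 0;
#endif
#ifdef CLOCK_BOOTTIME
    ok += print_clock("CLOCK_BOOTTIME", CLOCK_BOOTTIME) == 0;
#endif
    return ok;
}
```

The printed values differ from machine to machine. The COARSE variants normally report a resolution equal to the scheduler tick, which makes the trade-off in the table visible.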
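If all we need is to read the system time, what we have covered so far is enough. But how does the system keep its tick count, and how precise is that count?
How Linux keeps the tick count, and how precise it is
The tick count is produced by a clock source, and its precision is determined by the frequency of that clock source. So which clock sources are available?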
- The TSC is a register counter that is also driven from a crystal oscillator – the same oscillator that is used to generate the clock pulses that drive the CPU(s). As such it runs at the frequency of the CPU, so for instance a 2GHz clock will tick twice per nanosecond.
- The HPET (High Precision Event Timer) was introduced by Microsoft and Intel around 2005. Its precision is approximately 100 ns, so it is less accurate than the TSC, which can provide sub-nanosecond accuracy. It is also much more expensive to query the HPET than the TSC.
- The acpi_pm clock source has the advantage that its frequency doesn’t change based on power-management code, but since it runs at 3.58MHz (one tick every 279 ns), it is not nearly as accurate as the preceding timers.
- jiffies signifies that the clock source is actually the same timer used for scheduling, and as such its resolution is typically quite poor. (The default scheduling interval in most Linux variants is either 1 ms or 10 ms).
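The tick lengths quoted in this list follow directly from the frequencies: one tick lasts 1e9 / f nanoseconds for a counter running at f Hz. A small sketch of that arithmetic, with tick_period_ns as a hypothetical helper name:

```c
/* Length of one tick in nanoseconds for a counter running at freq_hz. */
double tick_period_ns(double freq_hz)
{
    return 1e9 / freq_hz;
}
```

With the figures from the list, tick_period_ns(2e9) gives 0.5 (the TSC at 2GHz, two ticks per nanosecond), tick_period_ns(3.58e6) gives roughly 279 (acpi_pm), and tick_period_ns(1000.0) gives 1e6, i.e. the 1 ms scheduling interval behind jiffies.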
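In general we use the TSC (Time Stamp Counter) clock source, because it has high precision and low overhead. Modern CPUs have also improved the TSC and fixed many of the problems it used to have.
How to check which TSC features the CPU supports.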
$ cat /proc/cpuinfo | grep -i tsc
flags : ... tsc rdtscp constant_tsc nonstop_tsc ...
The flags have the following meanings:
Flag | Meaning |
---|---|
tsc | The system has a TSC clock. |
rdtscp | The RDTSCP instruction is available. |
constant_tsc | The TSC ticks at a constant rate, regardless of CPU frequency changes. |
nonstop_tsc | The TSC does not stop in deep C-states, so it is not affected by power management code. |
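Note that grep -i tsc matches every flag that merely contains "tsc". When a program needs to test for one specific flag, the flags line has to be matched word by word. The following sketch shows one way to do that on a line already read from /proc/cpuinfo; has_cpu_flag is a hypothetical helper, and it assumes flags are separated by single spaces as in the output above.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if `flag` appears as a whole space-separated word in `line`. */
bool has_cpu_flag(const char *line, const char *flag)
{
    size_t len = strlen(flag);
    const char *p = line;

    if (len == 0)
        return false;
    while ((p = strstr(p, flag)) != NULL) {
        /* A match only counts if it is delimited on both sides. */
        bool starts_word = (p == line) || p[-1] == ' ' || p[-1] == '\t';
        bool ends_word = p[len] == '\0' || p[len] == ' ' || p[len] == '\n';
        if (starts_word && ends_word)
            return true;
        p += 1;  /* keep searching after a partial match such as "rdtscp" */
    }
    return false;
}
```

For the line "flags : fpu rdtscp constant_tsc nonstop_tsc", has_cpu_flag(line, "constant_tsc") is true while has_cpu_flag(line, "tsc") is false, because "tsc" only occurs there as part of longer flag names.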
How to check the clock sources the system supports
List the available clock sources:
$ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
tsc hpet acpi_pm
Show the clock source currently in use:
$ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
tsc
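The same sysfs file can be read from a program. The sketch below reuses the path from the command above; read_first_line and the CURRENT_CLOCKSOURCE macro are hypothetical names, and the function simply reports failure on systems where the file does not exist.

```c
#include <stdio.h>
#include <string.h>

/* Path taken from the shell command above. */
#define CURRENT_CLOCKSOURCE \
    "/sys/devices/system/clocksource/clocksource0/current_clocksource"

/* Read the first line of `path` into buf, without the trailing newline.
   Returns 0 on success, -1 if the file cannot be opened or is empty. */
int read_first_line(const char *path, char *buf, size_t size)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    if (fgets(buf, (int)size, fp) == NULL) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    buf[strcspn(buf, "\n")] = '\0';  /* strip the newline if there is one */
    return 0;
}
```

A caller would declare `char name[64];` and, if read_first_line(CURRENT_CLOCKSOURCE, name, sizeof name) returns 0, find a string such as "tsc" in name.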
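The cost of reading the clock
We use clock_gettime to measure elapsed time. Besides the clock source's tick resolution, the accuracy of the measurement is also limited by the time the clock_gettime call itself takes.
On my virtual machine (i5, 3.2 GHz, kvm_clock), one clock_gettime call costs roughly 400 ns, which is about the same as one system call.
References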
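A figure like this can be estimated by issuing many back-to-back calls and dividing the total time by the number of calls. The sketch below is one way to do it, not necessarily how the number above was obtained; clock_gettime_cost_ns is a hypothetical name, and the result includes a small amount of loop overhead.

```c
#define _POSIX_C_SOURCE 200809L  /* expose clock_gettime under strict -std= modes */
#include <time.h>

/* Average cost of one clock_gettime(CLOCK_MONOTONIC) call in nanoseconds,
   measured over `iterations` back-to-back calls; -1.0 on error. */
double clock_gettime_cost_ns(long iterations)
{
    struct timespec start, end, scratch;

    if (iterations <= 0 || clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1.0;
    for (long i = 0; i < iterations; i++)
        clock_gettime(CLOCK_MONOTONIC, &scratch);
    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1.0;

    double total_ns = (double)(end.tv_sec - start.tv_sec) * 1e9
                    + (double)(end.tv_nsec - start.tv_nsec);
    return total_ns / (double)iterations;
}
```

Calling clock_gettime_cost_ns(1000000) gives an average per call. The value depends heavily on the clock source: when the kernel can serve the call from user space the cost is typically far lower than when every call has to enter the kernel.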
http://btorpey.github.io/blog/2014/02/18/clock-sources-in-linux/