'restrict' keyword
restrict
is a keyword that can be used in pointer declarations. By adding this type qualifier, a programmer hints to the compiler that for the lifetime of the pointer, only the pointer itself or a value directly derived from it (such as pointer + 1
) will be used to access the object to which it points.
restrict
limits the effects of pointer aliasing, aiding optimizations.
wiki and * (copied below also) has very illuminating and concrete examples.
-------------
The Wikipedia example is very illuminating.
It clearly shows how it allows to save one assembly instruction.
Without restrict:
void f(int *a, int *b, int *x) {
*a += *x;
*b += *x;
}
Pseudo assembly:
load R1 ← *x ; Load the value of x pointer
load R2 ← *a ; Load the value of a pointer
add R2 += R1 ; Perform Addition
set R2 → *a ; Update the value of a pointer
; Similarly for b, note that x is loaded twice,
; because a may be equal to x.
load R1 ← *x
load R2 ← *b
add R2 += R1
set R2 → *b
With restrict:
void fr(int *restrict a, int *restrict b, int *restrict x);
Pseudo assembly:
load R1 ← *x
load R2 ← *a
add R2 += R1
set R2 → *a
; Note that x is not reloaded,
; because the compiler knows it is unchanged
; load R1 ← *x
load R2 ← *b
add R2 += R1
set R2 → *b
Does GCC really do it?
GCC 4.8 Linux x86-64:
gcc -g -std=c99 -O0 -c main.c
objdump -S main.o
With -O0
, they are the same.
With -O3
:
void f(int *a, int *b, int *x) {
*a += *x;
0: 8b 02 mov (%rdx),%eax
2: 01 07 add %eax,(%rdi)
*b += *x;
4: 8b 02 mov (%rdx),%eax
6: 01 06 add %eax,(%rsi)
void fr(int *restrict a, int *restrict b, int *restrict x) {
*a += *x;
10: 8b 02 mov (%rdx),%eax
12: 01 07 add %eax,(%rdi)
*b += *x;
14: 01 06 add %eax,(%rsi)
For the uninitiated, the calling convention is:
-
rdi
= first parameter -
rsi
= second parameter -
rdx
= third parameter
GCC output was even clearer than the wiki article: 4 instructions vs 3 instructions.
Arrays
So far we have single instruction savings, but if pointer represent arrays to be looped over, a common use case, then a bunch of instructions could be saved, as mentioned by supercat.
Consider for example:
void f(char *restrict p1, char *restrict p2) {
for (int i = 0; i < 50; i++) {
p1[i] = 4;
p2[i] = 9;
}
}
Because of restrict
, a smart compiler (or human), could optimize that to:
memset(p1, 4, 50);
memset(p2, 9, 50);
which is potentially much more efficient as it may be assembly optimized on a decent libc implementation (like glibc): Is it better to use std::memcpy() or std::copy() in terms to performance?
Does GCC really do it?
GCC 5.2.1.Linux x86-64 Ubuntu 15.10:
gcc -g -std=c99 -O0 -c main.c
objdump -dr main.o
With -O0
, both are the same.
With -O3
:
-
with restrict:
3f0: 48 85 d2 test %rdx,%rdx 3f3: 74 33 je 428 <fr+0x38> 3f5: 55 push %rbp 3f6: 53 push %rbx 3f7: 48 89 f5 mov %rsi,%rbp 3fa: be 04 00 00 00 mov $0x4,%esi 3ff: 48 89 d3 mov %rdx,%rbx 402: 48 83 ec 08 sub $0x8,%rsp 406: e8 00 00 00 00 callq 40b <fr+0x1b> 407: R_X86_64_PC32 memset-0x4 40b: 48 83 c4 08 add $0x8,%rsp 40f: 48 89 da mov %rbx,%rdx 412: 48 89 ef mov %rbp,%rdi 415: 5b pop %rbx 416: 5d pop %rbp 417: be 09 00 00 00 mov $0x9,%esi 41c: e9 00 00 00 00 jmpq 421 <fr+0x31> 41d: R_X86_64_PC32 memset-0x4 421: 0f 1f 80 00 00 00 00 nopl 0x0(%rax) 428: f3 c3 repz retq
Two
memset
calls as expected. -
without restrict: no stdlib calls, just a 16 iteration wide loop unrolling which I do not intend to reproduce here :-)
I haven't had the patience to benchmark them, but I believe that the restrict version will be faster.
C99
Let's look at the standard for completeness sake.
restrict
says that two pointers cannot point to overlapping memory regions. The most common usage is for function arguments.
This restricts how the function can be called, but allows for more compile-time optimizations.
If the caller does not follow the restrict
contract, undefined behavior.
The C99 N1256 draft 6.7.3/7 "Type qualifiers" says:
The intended use of the restrict qualifier (like the register storage class) is to promote optimization, and deleting all instances of the qualifier from all preprocessing translation units composing a conforming program does not change its meaning (i.e., observable behavior).
and 6.7.3.1 "Formal definition of restrict" gives the gory details.
Strict aliasing rule
The restrict
keyword only affects pointers of compatible types (e.g. two int*
) because the strict aliasing rules says that aliasing incompatible types is undefined behavior by default, and so compilers can assume it does not happen and optimize away.
See: What is the strict aliasing rule?
See also
- C++14 does not yet have an analogue for
restrict
, but GCC has__restrict__
as an extension: What does the restrict keyword mean in C++? - Many questions that ask: according to the gory details, does this code UB or not?
- A "when to use" question: When to use restrict and when not to
- The related GCC
__attribute__((malloc))
, which says that the return value of a function is not aliased to anything: GCC: __attribute__((malloc))
推荐阅读
-
keyword学习
-
'restrict' keyword
-
处理Vue is a constructor and should be called with the `new` keyword
-
c volatile keyword and memory cache
-
python 出现SyntaxError: non-keyword arg after keyword arg错误解决办法
-
vue报错:SyntaxError: Unexpected keyword 'const'. Const declarations are not supported in strict mode
-
python 出现SyntaxError: non-keyword arg after keyword arg错误解决办法
-
在DROP TABLE时,RESTRICT与CASCADE的区别?
-
正则获取网页源码keyword跟description ,蛋有点疼
-
curl真的没有对采集内容进行字节限制的设置么?例如小弟我只想要网页的header部分的keyword,全部都采集过来的话太浪费了