Java实现高效随机数算法的示例代码
程序员文章站
2024-02-28 22:06:10
前言
事情起源于一位网友分享了一个有趣的面试题:
生成由六位数字组成的id,要求随机数字,不排重,不可自增,且数字不重复。id总数为几十万。
初次解答...
前言
事情起源于一位网友分享了一个有趣的面试题:
生成由六位数字组成的id,要求随机数字,不排重,不可自增,且数字不重复。id总数为几十万。
初次解答
我一开始想到的办法是
- 生成一个足够大的id池(其实就是需要多少就生成多少)
- 对id池中的数字进行随机排序
- 依次消费id池中的数字
可惜这个方法十分浪费空间,且性能很差。
初遇梅森旋转算法
后面咨询了网友后得知了一个高效的随机数算法:梅森旋转(mersenne twister/mt)。通过搜索资料得知:
梅森旋转算法(mersenne twister)是一个伪随机数发生算法。由松本真和西村拓士在1997年开发,基于有限二进制字段上的矩阵线性递归。可以快速产生高质量的伪随机数,修正了古典随机数发生算法的很多缺陷。
最为广泛使用mersenne twister的一种变体是mt19937,可以产生32位整数序列。
ps:此算法依然无法完美解决面试题,但是也算学到了新知识
mt19937算法实现
后面通过google,找到了一个高效的mt19937的java版本代码。原代码链接为http://www.math.sci.hiroshima-u.ac.jp/~m-mat/mt/versions/java/mtrandom.java
import java.util.random; /** * mt19937的java实现 */ public class mtrandom extends random { // constants used in the original c implementation private final static int upper_mask = 0x80000000; private final static int lower_mask = 0x7fffffff; private final static int n = 624; private final static int m = 397; private final static int magic[] = { 0x0, 0x9908b0df }; private final static int magic_factor1 = 1812433253; private final static int magic_factor2 = 1664525; private final static int magic_factor3 = 1566083941; private final static int magic_mask1 = 0x9d2c5680; private final static int magic_mask2 = 0xefc60000; private final static int magic_seed = 19650218; private final static long default_seed = 5489l; // internal state private transient int[] mt; private transient int mti; private transient boolean compat = false; // temporary buffer used during setseed(long) private transient int[] ibuf; /** * the default constructor for an instance of mtrandom. this invokes * the no-argument constructor for java.util.random which will result * in the class being initialised with a seed value obtained by calling * system.currenttimemillis(). */ public mtrandom() { } /** * this version of the constructor can be used to implement identical * behaviour to the original c code version of this algorithm including * exactly replicating the case where the seed value had not been set * prior to calling genrand_int32. * <p> * if the compatibility flag is set to true, then the algorithm will be * seeded with the same default value as was used in the original c * code. furthermore the setseed() method, which must take a 64 bit * long value, will be limited to using only the lower 32 bits of the * seed to facilitate seamless migration of existing c code into java * where identical behaviour is required. * <p> * whilst useful for ensuring backwards compatibility, it is advised * that this feature not be used unless specifically required, due to * the reduction in strength of the seed value. * * @param compatible compatibility flag for replicating original * behaviour. */ public mtrandom(boolean compatible) { super(0l); compat = compatible; setseed(compat?default_seed:system.currenttimemillis()); } /** * this version of the constructor simply initialises the class with * the given 64 bit seed value. for a better random number sequence * this seed value should contain as much entropy as possible. * * @param seed the seed value with which to initialise this class. */ public mtrandom(long seed) { super(seed); } /** * this version of the constructor initialises the class with the * given byte array. all the data will be used to initialise this * instance. * * @param buf the non-empty byte array of seed information. * @throws nullpointerexception if the buffer is null. * @throws illegalargumentexception if the buffer has zero length. */ public mtrandom(byte[] buf) { super(0l); setseed(buf); } /** * this version of the constructor initialises the class with the * given integer array. all the data will be used to initialise * this instance. * * @param buf the non-empty integer array of seed information. * @throws nullpointerexception if the buffer is null. * @throws illegalargumentexception if the buffer has zero length. */ public mtrandom(int[] buf) { super(0l); setseed(buf); } // initializes mt[n] with a simple integer seed. this method is // required as part of the mersenne twister algorithm but need // not be made public. private final void setseed(int seed) { // annoying runtime check for initialisation of internal data // caused by java.util.random invoking setseed() during init. // this is unavoidable because no fields in our instance will // have been initialised at this point, not even if the code // were placed at the declaration of the member variable. if (mt == null) mt = new int[n]; // ---- begin mersenne twister algorithm ---- mt[0] = seed; for (mti = 1; mti < n; mti++) { mt[mti] = (magic_factor1 * (mt[mti-1] ^ (mt[mti-1] >>> 30)) + mti); } // ---- end mersenne twister algorithm ---- } /** * this method resets the state of this instance using the 64 * bits of seed data provided. note that if the same seed data * is passed to two different instances of mtrandom (both of * which share the same compatibility state) then the sequence * of numbers generated by both instances will be identical. * <p> * if this instance was initialised in 'compatibility' mode then * this method will only use the lower 32 bits of any seed value * passed in and will match the behaviour of the original c code * exactly with respect to state initialisation. * * @param seed the 64 bit value used to initialise the random * number generator state. */ public final synchronized void setseed(long seed) { if (compat) { setseed((int)seed); } else { // annoying runtime check for initialisation of internal data // caused by java.util.random invoking setseed() during init. // this is unavoidable because no fields in our instance will // have been initialised at this point, not even if the code // were placed at the declaration of the member variable. if (ibuf == null) ibuf = new int[2]; ibuf[0] = (int)seed; ibuf[1] = (int)(seed >>> 32); setseed(ibuf); } } /** * this method resets the state of this instance using the byte * array of seed data provided. note that calling this method * is equivalent to calling "setseed(pack(buf))" and in particular * will result in a new integer array being generated during the * call. if you wish to retain this seed data to allow the pseudo * random sequence to be restarted then it would be more efficient * to use the "pack()" method to convert it into an integer array * first and then use that to re-seed the instance. the behaviour * of the class will be the same in both cases but it will be more * efficient. * * @param buf the non-empty byte array of seed information. * @throws nullpointerexception if the buffer is null. * @throws illegalargumentexception if the buffer has zero length. */ public final void setseed(byte[] buf) { setseed(pack(buf)); } /** * this method resets the state of this instance using the integer * array of seed data provided. this is the canonical way of * resetting the pseudo random number sequence. * * @param buf the non-empty integer array of seed information. * @throws nullpointerexception if the buffer is null. * @throws illegalargumentexception if the buffer has zero length. */ public final synchronized void setseed(int[] buf) { int length = buf.length; if (length == 0) throw new illegalargumentexception("seed buffer may not be empty"); // ---- begin mersenne twister algorithm ---- int i = 1, j = 0, k = (n > length ? n : length); setseed(magic_seed); for (; k > 0; k--) { mt[i] = (mt[i] ^ ((mt[i-1] ^ (mt[i-1] >>> 30)) * magic_factor2)) + buf[j] + j; i++; j++; if (i >= n) { mt[0] = mt[n-1]; i = 1; } if (j >= length) j = 0; } for (k = n-1; k > 0; k--) { mt[i] = (mt[i] ^ ((mt[i-1] ^ (mt[i-1] >>> 30)) * magic_factor3)) - i; i++; if (i >= n) { mt[0] = mt[n-1]; i = 1; } } mt[0] = upper_mask; // msb is 1; assuring non-zero initial array // ---- end mersenne twister algorithm ---- } /** * this method forms the basis for generating a pseudo random number * sequence from this class. if given a value of 32, this method * behaves identically to the genrand_int32 function in the original * c code and ensures that using the standard nextint() function * (inherited from random) we are able to replicate behaviour exactly. * <p> * note that where the number of bits requested is not equal to 32 * then bits will simply be masked out from the top of the returned * integer value. that is to say that: * <pre> * mt.setseed(12345); * int foo = mt.nextint(16) + (mt.nextint(16) << 16);</pre> * will not give the same result as * <pre> * mt.setseed(12345); * int foo = mt.nextint(32);</pre> * * @param bits the number of significant bits desired in the output. * @return the next value in the pseudo random sequence with the * specified number of bits in the lower part of the integer. */ protected final synchronized int next(int bits) { // ---- begin mersenne twister algorithm ---- int y, kk; if (mti >= n) { // generate n words at one time // in the original c implementation, mti is checked here // to determine if initialisation has occurred; if not // it initialises this instance with default_seed (5489). // this is no longer necessary as initialisation of the // java instance must result in initialisation occurring // use the constructor mtrandom(true) to enable backwards // compatible behaviour. for (kk = 0; kk < n-m; kk++) { y = (mt[kk] & upper_mask) | (mt[kk+1] & lower_mask); mt[kk] = mt[kk+m] ^ (y >>> 1) ^ magic[y & 0x1]; } for (;kk < n-1; kk++) { y = (mt[kk] & upper_mask) | (mt[kk+1] & lower_mask); mt[kk] = mt[kk+(m-n)] ^ (y >>> 1) ^ magic[y & 0x1]; } y = (mt[n-1] & upper_mask) | (mt[0] & lower_mask); mt[n-1] = mt[m-1] ^ (y >>> 1) ^ magic[y & 0x1]; mti = 0; } y = mt[mti++]; // tempering y ^= (y >>> 11); y ^= (y << 7) & magic_mask1; y ^= (y << 15) & magic_mask2; y ^= (y >>> 18); // ---- end mersenne twister algorithm ---- return (y >>> (32-bits)); } // this is a fairly obscure little code section to pack a // byte[] into an int[] in little endian ordering. /** * this simply utility method can be used in cases where a byte * array of seed data is to be used to repeatedly re-seed the * random number sequence. by packing the byte array into an * integer array first, using this method, and then invoking * setseed() with that; it removes the need to re-pack the byte * array each time setseed() is called. * <p> * if the length of the byte array is not a multiple of 4 then * it is implicitly padded with zeros as necessary. for example: * <pre> byte[] { 0x01, 0x02, 0x03, 0x04, 0x05, 0x06 }</pre> * becomes * <pre> int[] { 0x04030201, 0x00000605 }</pre> * <p> * note that this method will not complain if the given byte array * is empty and will produce an empty integer array, but the * setseed() method will throw an exception if the empty integer * array is passed to it. * * @param buf the non-null byte array to be packed. * @return a non-null integer array of the packed bytes. * @throws nullpointerexception if the given byte array is null. */ public static int[] pack(byte[] buf) { int k, blen = buf.length, ilen = ((buf.length+3) >>> 2); int[] ibuf = new int[ilen]; for (int n = 0; n < ilen; n++) { int m = (n+1) << 2; if (m > blen) m = blen; for (k = buf[--m]&0xff; (m & 0x3) != 0; k = (k << 8) | buf[--m]&0xff); ibuf[n] = k; } return ibuf; } }
测试
测试代码
// mt19937的java实现 mtrandom mtrandom=new mtrandom(); map<integer,integer> map=new hashmap<>(); //循环次数 int times=1000000; long starttime=system.currenttimemillis(); for(int i=0;i<times;i++){ //使用map去重 map.put(mtrandom.next(32),0); } //打印循环次数 system.out.println("times:"+times); //打印map的个数 system.out.println("num:"+map.size()); //打印非重复比率 system.out.println("proportion:"+map.size()/(double)times); //花费的时间(单位为毫秒) system.out.println("time:"+(system.currenttimemillis()-starttime));
测试结果
times:1000000
num:999886
proportion:0.999886
time:374
以上就是本文的全部内容,希望对大家的学习有所帮助,也希望大家多多支持。