Implementing a 64 bit PRNG library in C backed by xoroshiro128+
Clash Royale CLAN TAG#URR8PPP
.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;
up vote
3
down vote
favorite
Introduction and Credits
The inbuilt rand()
function in C only offers a very limited range of possible outcomes, being limited to only 31 bits. I wanted something that was fast (enough) and could provide 64 random bits at a time. I fond an impressive, openly available PRNG for this purpose: xoroshiro128+. It seems to pass most well-known tests for randomness. I decided to modify it slightly and make a library.
Credit: David Blackman and Sebastiano Vigna for coming up with this implementation in 2016.
Their website: Link.
Here are my modifications to the code the original authors wrote.
xoroshiro128plus.c
// Original and permanent link: http://xoroshiro.di.unimi.it/xoroshiro128plus.c
/* Written in 2016 by David Blackman and Sebastiano Vigna (vigna@acm.org)
To the extent possible under law, the author has dedicated all copyright
and related and neighboring rights to this software to the public domain
worldwide. This software is distributed without any warranty.
See <http://creativecommons.org/publicdomain/zero/1.0/>. */
#include <stdint.h>
#include <sys/time.h>
/* This is the successor to xorshift128+. It is the fastest full-period
generator passing BigCrush without systematic failures, but due to the
relatively short period it is acceptable only for applications with a
mild amount of parallelism; otherwise, use a xorshift1024* generator.
Beside passing BigCrush, this generator passes the PractRand test suite
up to (and included) 16TB, with the exception of binary rank tests, as
the lowest bit of this generator is an LFSR of degree 128. The next bit
can be described by an LFSR of degree 8256, but in the long run it will
fail linearity tests, too. The other bits needs a much higher degree to
be represented as LFSRs.
We suggest to use a sign test to extract a random Boolean value, and
right shifts to extract subsets of bits.
Note that the generator uses a simulated rotate operation, which most C
compilers will turn into a single instruction. In Java, you can use
Long.rotateLeft(). In languages that do not make low-level rotation
instructions accessible xorshift128+ could be faster.
The state must be seeded so that it is not everywhere zero. If you have
a 64-bit seed, we suggest to seed a splitmix64 generator and use its
output to fill s. */
uint64_t s[2];
////////////////////////////////////////////////////////////////////////
/* Modifications by Subhomoy Haldar (github.com/Subh0m0y) start here. */
// As per the recommendation by the original authors - David Blackman and
// Sebastiano Vigna, we shall use two iterations of a seeded splitmix64
// generator (written by Sebastiano Vigna) and use its output to seed this
// program's seed vector.
// Original and permanent link: http://xoroshiro.di.unimi.it/splitmix64.c
static uint64_t splitmix64next(const uint64_t x)
uint64_t z = (x + 0x9e3779b97f4a7c15);
z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9;
z = (z ^ (z >> 27)) * 0x94d049bb133111eb;
return z ^ (z >> 31);
const uint64_t SEED_SCRAMBLER = 0x37bc7dd1f3339a5fULL;
static uint64_t time_based_x(void)
// Obtain the (relative, partial) time information
// in microseconds
struct timeval currentTime;
gettimeofday(¤tTime, NULL);
uint64_t value = currentTime.tv_usec;
// Combine and generate the seed.
uint64_t x = (value << 32)
void _seed_with(const uint64_t x)
s[0] = splitmix64next(x);
s[1] = splitmix64next(s[0]);
void _seed_auto(void)
_seed_with(time_based_x());
/* Modifications by Subhomoy Haldar end here. */
////////////////////////////////////////////////
static inline uint64_t rotl(const uint64_t x, int k)
return (x << k)
uint64_t next(void)
const uint64_t s0 = s[0];
uint64_t s1 = s[1];
const uint64_t result = s0 + s1;
s1 ^= s0;
s[0] = rotl(s0, 55) ^ s1 ^ (s1 << 14); // a, b
s[1] = rotl(s1, 36); // c
return result;
/* This is the jump function for the generator. It is equivalent
to 2^64 calls to next(); it can be used to generate 2^64
non-overlapping subsequences for parallel computations. */
void jump(void)
static const uint64_t JUMP = 0xbeac0467eba5facb, 0xd86b048b86aa9922 ;
uint64_t s0 = 0;
uint64_t s1 = 0;
for(int i = 0; i < sizeof JUMP / sizeof *JUMP; i++)
for(int b = 0; b < 64; b++)
if (JUMP[i] & UINT64_C(1) << b)
s0 ^= s[0];
s1 ^= s[1];
next();
s[0] = s0;
s[1] = s1;
xoroshiro128plus.h
#ifndef XORO_SHIRO_128_PLUS
#define XORO_SHIRO_128_PLUS
#include <inttypes.h>
/* Only expose the method that is relevant */
/*
* Automatically initializes the seed vector for the xoroshiro128+
* PRNG, using a part of the current time (in microseconds) and
* a seed scrambler.
*/
void _seed_auto(void);
/*
* Initializes the seed vector with a starting value. This is useful
* for debugging when reproducible scenarios are desirable.
*/
void _seed_with(const uint64_t x);
/*
* Returns 64 randomly generated bits.
*/
uint64_t next(void);
#endif
randomlib.c
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>
#include <math.h>
#include "xoroshiro128plus.h"
#define UINT32_MASK 0xFFFFFFFFULL
// The seeding functions.
/*
* Initializes the PRNG. Uses the current time to seed
* it. It's expected resolution is in microseconds.
*/
void init_rand(void)
_seed_auto();
/*
* Initializes the PRNG with a given seed.
*/
void seed(uint64_t x)
_seed_with(x);
/*
* Returns a random integer in the range 0 (inclusive)
* and limit (exclusive). The integers generated are uniformly
* distributed.
*/
uint64_t next_uint64(const uint64_t limit)
return next() % limit;
/*
* Returns a 32-bit unsigned integer.
*/
uint32_t next_uint32()
// Generate 2 ints at a time out of one call to next()
// This makes use of both halves of the 64 bits generated
static uint32_t next_uint32_temp;
static bool has_next_uint32 = false;
uint32_t val;
if (has_next_uint32)
val = next_uint32_temp;
else
uint64_t full = next();
val = full >> 32; // The upper half
next_uint32_temp = (uint32_t) (full & UINT32_MASK); // The lower half
// quick flip
has_next_uint32 ^= true;
return val;
/*
* Returns a boolean value. The expected probability of
* both values is 50%.
*/
bool next_bool()
// Sign test as per the recommendation
// We check if the highest bit is on
return (next() >> 63) & 1;
/*
* Returns a uniformly distributed double between 0
* (inclusive) and 1 (exclusive).
*/
double next_double()
// return ((double) next()) / ((double) UINT64_MAX);
// return (next() >> 11) * (1. / (UINT64_C(1) << 53));
return (next() >> 11) * 0x1.0p-53;
/*
* Returns a normally distributed double between -1.0
* and +1.0
*/
double next_gaussian()
static double next_gaussian;
static bool has_next_gaussian = false;
double val;
if (has_next_gaussian)
val = next_gaussian;
else
double u, v, s;
do
// Limit u and v to the range [-1, 1]
u = next_double() * 2 - 1;
v = next_double() * 2 - 1;
s = u * u + v * v;
while(s > 1);
double scale = sqrt(-2.0 * log(s) / s);
next_gaussian = v * scale;
val = u * scale;
// Quick flip
has_next_gaussian ^= true;
return val;
My questions
I tested the output of all the functions I implemented by calling them several times and dumping the output to a CSV file. I then used Excel to plot Histograms and the plots seem reasonable. i.e. the functions are uniform and the Gaussian function is normal, as expected. You may perform yo0ur own tests or suggest a better, more efficient or automated approach.
The implementations for generating double
s have a characteristic dip in the frequency of the very last bucket (to about 50% of the rest), for all the methods. I do not think it would matter much in practice though. This is also something you may wish to comment upon.
The backing xoroshiro algorithm is pretty fast. I hope my implementation does not do injustice to it and introduce unnecessary bottle-necks.
As usual, you are free to comment and critique on every part of (my) code. You may suggest improvements to the code written by the original authors since it is in the Public Domain.
performance c random library
add a comment |Â
up vote
3
down vote
favorite
Introduction and Credits
The inbuilt rand()
function in C only offers a very limited range of possible outcomes, being limited to only 31 bits. I wanted something that was fast (enough) and could provide 64 random bits at a time. I fond an impressive, openly available PRNG for this purpose: xoroshiro128+. It seems to pass most well-known tests for randomness. I decided to modify it slightly and make a library.
Credit: David Blackman and Sebastiano Vigna for coming up with this implementation in 2016.
Their website: Link.
Here are my modifications to the code the original authors wrote.
xoroshiro128plus.c
// Original and permanent link: http://xoroshiro.di.unimi.it/xoroshiro128plus.c
/* Written in 2016 by David Blackman and Sebastiano Vigna (vigna@acm.org)
To the extent possible under law, the author has dedicated all copyright
and related and neighboring rights to this software to the public domain
worldwide. This software is distributed without any warranty.
See <http://creativecommons.org/publicdomain/zero/1.0/>. */
#include <stdint.h>
#include <sys/time.h>
/* This is the successor to xorshift128+. It is the fastest full-period
generator passing BigCrush without systematic failures, but due to the
relatively short period it is acceptable only for applications with a
mild amount of parallelism; otherwise, use a xorshift1024* generator.
Beside passing BigCrush, this generator passes the PractRand test suite
up to (and included) 16TB, with the exception of binary rank tests, as
the lowest bit of this generator is an LFSR of degree 128. The next bit
can be described by an LFSR of degree 8256, but in the long run it will
fail linearity tests, too. The other bits needs a much higher degree to
be represented as LFSRs.
We suggest to use a sign test to extract a random Boolean value, and
right shifts to extract subsets of bits.
Note that the generator uses a simulated rotate operation, which most C
compilers will turn into a single instruction. In Java, you can use
Long.rotateLeft(). In languages that do not make low-level rotation
instructions accessible xorshift128+ could be faster.
The state must be seeded so that it is not everywhere zero. If you have
a 64-bit seed, we suggest to seed a splitmix64 generator and use its
output to fill s. */
uint64_t s[2];
////////////////////////////////////////////////////////////////////////
/* Modifications by Subhomoy Haldar (github.com/Subh0m0y) start here. */
// As per the recommendation by the original authors - David Blackman and
// Sebastiano Vigna, we shall use two iterations of a seeded splitmix64
// generator (written by Sebastiano Vigna) and use its output to seed this
// program's seed vector.
// Original and permanent link: http://xoroshiro.di.unimi.it/splitmix64.c
static uint64_t splitmix64next(const uint64_t x)
uint64_t z = (x + 0x9e3779b97f4a7c15);
z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9;
z = (z ^ (z >> 27)) * 0x94d049bb133111eb;
return z ^ (z >> 31);
const uint64_t SEED_SCRAMBLER = 0x37bc7dd1f3339a5fULL;
static uint64_t time_based_x(void)
// Obtain the (relative, partial) time information
// in microseconds
struct timeval currentTime;
gettimeofday(¤tTime, NULL);
uint64_t value = currentTime.tv_usec;
// Combine and generate the seed.
uint64_t x = (value << 32)
void _seed_with(const uint64_t x)
s[0] = splitmix64next(x);
s[1] = splitmix64next(s[0]);
void _seed_auto(void)
_seed_with(time_based_x());
/* Modifications by Subhomoy Haldar end here. */
////////////////////////////////////////////////
static inline uint64_t rotl(const uint64_t x, int k)
return (x << k)
uint64_t next(void)
const uint64_t s0 = s[0];
uint64_t s1 = s[1];
const uint64_t result = s0 + s1;
s1 ^= s0;
s[0] = rotl(s0, 55) ^ s1 ^ (s1 << 14); // a, b
s[1] = rotl(s1, 36); // c
return result;
/* This is the jump function for the generator. It is equivalent
to 2^64 calls to next(); it can be used to generate 2^64
non-overlapping subsequences for parallel computations. */
void jump(void)
static const uint64_t JUMP = 0xbeac0467eba5facb, 0xd86b048b86aa9922 ;
uint64_t s0 = 0;
uint64_t s1 = 0;
for(int i = 0; i < sizeof JUMP / sizeof *JUMP; i++)
for(int b = 0; b < 64; b++)
if (JUMP[i] & UINT64_C(1) << b)
s0 ^= s[0];
s1 ^= s[1];
next();
s[0] = s0;
s[1] = s1;
xoroshiro128plus.h
#ifndef XORO_SHIRO_128_PLUS
#define XORO_SHIRO_128_PLUS
#include <inttypes.h>
/* Only expose the method that is relevant */
/*
* Automatically initializes the seed vector for the xoroshiro128+
* PRNG, using a part of the current time (in microseconds) and
* a seed scrambler.
*/
void _seed_auto(void);
/*
* Initializes the seed vector with a starting value. This is useful
* for debugging when reproducible scenarios are desirable.
*/
void _seed_with(const uint64_t x);
/*
* Returns 64 randomly generated bits.
*/
uint64_t next(void);
#endif
randomlib.c
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>
#include <math.h>
#include "xoroshiro128plus.h"
#define UINT32_MASK 0xFFFFFFFFULL
// The seeding functions.
/*
* Initializes the PRNG. Uses the current time to seed
* it. It's expected resolution is in microseconds.
*/
void init_rand(void)
_seed_auto();
/*
* Initializes the PRNG with a given seed.
*/
void seed(uint64_t x)
_seed_with(x);
/*
* Returns a random integer in the range 0 (inclusive)
* and limit (exclusive). The integers generated are uniformly
* distributed.
*/
uint64_t next_uint64(const uint64_t limit)
return next() % limit;
/*
* Returns a 32-bit unsigned integer.
*/
uint32_t next_uint32()
// Generate 2 ints at a time out of one call to next()
// This makes use of both halves of the 64 bits generated
static uint32_t next_uint32_temp;
static bool has_next_uint32 = false;
uint32_t val;
if (has_next_uint32)
val = next_uint32_temp;
else
uint64_t full = next();
val = full >> 32; // The upper half
next_uint32_temp = (uint32_t) (full & UINT32_MASK); // The lower half
// quick flip
has_next_uint32 ^= true;
return val;
/*
* Returns a boolean value. The expected probability of
* both values is 50%.
*/
bool next_bool()
// Sign test as per the recommendation
// We check if the highest bit is on
return (next() >> 63) & 1;
/*
* Returns a uniformly distributed double between 0
* (inclusive) and 1 (exclusive).
*/
double next_double()
// return ((double) next()) / ((double) UINT64_MAX);
// return (next() >> 11) * (1. / (UINT64_C(1) << 53));
return (next() >> 11) * 0x1.0p-53;
/*
* Returns a normally distributed double between -1.0
* and +1.0
*/
double next_gaussian()
static double next_gaussian;
static bool has_next_gaussian = false;
double val;
if (has_next_gaussian)
val = next_gaussian;
else
double u, v, s;
do
// Limit u and v to the range [-1, 1]
u = next_double() * 2 - 1;
v = next_double() * 2 - 1;
s = u * u + v * v;
while(s > 1);
double scale = sqrt(-2.0 * log(s) / s);
next_gaussian = v * scale;
val = u * scale;
// Quick flip
has_next_gaussian ^= true;
return val;
My questions
I tested the output of all the functions I implemented by calling them several times and dumping the output to a CSV file. I then used Excel to plot Histograms and the plots seem reasonable. i.e. the functions are uniform and the Gaussian function is normal, as expected. You may perform yo0ur own tests or suggest a better, more efficient or automated approach.
The implementations for generating double
s have a characteristic dip in the frequency of the very last bucket (to about 50% of the rest), for all the methods. I do not think it would matter much in practice though. This is also something you may wish to comment upon.
The backing xoroshiro algorithm is pretty fast. I hope my implementation does not do injustice to it and introduce unnecessary bottle-necks.
As usual, you are free to comment and critique on every part of (my) code. You may suggest improvements to the code written by the original authors since it is in the Public Domain.
performance c random library
Did you build a wrapper or did you modify the original code? The whole point of a wrapper is to leave the original intact, or at least be independent of it's implementation.
â Mast
Jan 6 at 9:10
@Mast That's a valid point. I slightly modified the code as per the recommendation of the authors.
â Astrobleme
Jan 6 at 9:13
Ok, and what's with all the magic numbers?
â Mast
Jan 6 at 9:14
The magic numbers come from the original source: xoroshiro.di.unimi.it/splitmix64.c
â Astrobleme
Jan 6 at 9:15
I figured as much. But why are those numbers exactly that value? Are they dependent on each other?
â Mast
Jan 6 at 9:34
add a comment |Â
up vote
3
down vote
favorite
up vote
3
down vote
favorite
Introduction and Credits
The inbuilt rand()
function in C only offers a very limited range of possible outcomes, being limited to only 31 bits. I wanted something that was fast (enough) and could provide 64 random bits at a time. I fond an impressive, openly available PRNG for this purpose: xoroshiro128+. It seems to pass most well-known tests for randomness. I decided to modify it slightly and make a library.
Credit: David Blackman and Sebastiano Vigna for coming up with this implementation in 2016.
Their website: Link.
Here are my modifications to the code the original authors wrote.
xoroshiro128plus.c
// Original and permanent link: http://xoroshiro.di.unimi.it/xoroshiro128plus.c
/* Written in 2016 by David Blackman and Sebastiano Vigna (vigna@acm.org)
To the extent possible under law, the author has dedicated all copyright
and related and neighboring rights to this software to the public domain
worldwide. This software is distributed without any warranty.
See <http://creativecommons.org/publicdomain/zero/1.0/>. */
#include <stdint.h>
#include <sys/time.h>
/* This is the successor to xorshift128+. It is the fastest full-period
generator passing BigCrush without systematic failures, but due to the
relatively short period it is acceptable only for applications with a
mild amount of parallelism; otherwise, use a xorshift1024* generator.
Beside passing BigCrush, this generator passes the PractRand test suite
up to (and included) 16TB, with the exception of binary rank tests, as
the lowest bit of this generator is an LFSR of degree 128. The next bit
can be described by an LFSR of degree 8256, but in the long run it will
fail linearity tests, too. The other bits needs a much higher degree to
be represented as LFSRs.
We suggest to use a sign test to extract a random Boolean value, and
right shifts to extract subsets of bits.
Note that the generator uses a simulated rotate operation, which most C
compilers will turn into a single instruction. In Java, you can use
Long.rotateLeft(). In languages that do not make low-level rotation
instructions accessible xorshift128+ could be faster.
The state must be seeded so that it is not everywhere zero. If you have
a 64-bit seed, we suggest to seed a splitmix64 generator and use its
output to fill s. */
uint64_t s[2];
////////////////////////////////////////////////////////////////////////
/* Modifications by Subhomoy Haldar (github.com/Subh0m0y) start here. */
// As per the recommendation by the original authors - David Blackman and
// Sebastiano Vigna, we shall use two iterations of a seeded splitmix64
// generator (written by Sebastiano Vigna) and use its output to seed this
// program's seed vector.
// Original and permanent link: http://xoroshiro.di.unimi.it/splitmix64.c
static uint64_t splitmix64next(const uint64_t x)
uint64_t z = (x + 0x9e3779b97f4a7c15);
z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9;
z = (z ^ (z >> 27)) * 0x94d049bb133111eb;
return z ^ (z >> 31);
const uint64_t SEED_SCRAMBLER = 0x37bc7dd1f3339a5fULL;
static uint64_t time_based_x(void)
// Obtain the (relative, partial) time information
// in microseconds
struct timeval currentTime;
gettimeofday(¤tTime, NULL);
uint64_t value = currentTime.tv_usec;
// Combine and generate the seed.
uint64_t x = (value << 32)
void _seed_with(const uint64_t x)
s[0] = splitmix64next(x);
s[1] = splitmix64next(s[0]);
void _seed_auto(void)
_seed_with(time_based_x());
/* Modifications by Subhomoy Haldar end here. */
////////////////////////////////////////////////
static inline uint64_t rotl(const uint64_t x, int k)
return (x << k)
uint64_t next(void)
const uint64_t s0 = s[0];
uint64_t s1 = s[1];
const uint64_t result = s0 + s1;
s1 ^= s0;
s[0] = rotl(s0, 55) ^ s1 ^ (s1 << 14); // a, b
s[1] = rotl(s1, 36); // c
return result;
/* This is the jump function for the generator. It is equivalent
to 2^64 calls to next(); it can be used to generate 2^64
non-overlapping subsequences for parallel computations. */
void jump(void)
static const uint64_t JUMP = 0xbeac0467eba5facb, 0xd86b048b86aa9922 ;
uint64_t s0 = 0;
uint64_t s1 = 0;
for(int i = 0; i < sizeof JUMP / sizeof *JUMP; i++)
for(int b = 0; b < 64; b++)
if (JUMP[i] & UINT64_C(1) << b)
s0 ^= s[0];
s1 ^= s[1];
next();
s[0] = s0;
s[1] = s1;
xoroshiro128plus.h
#ifndef XORO_SHIRO_128_PLUS
#define XORO_SHIRO_128_PLUS
#include <inttypes.h>
/* Only expose the method that is relevant */
/*
* Automatically initializes the seed vector for the xoroshiro128+
* PRNG, using a part of the current time (in microseconds) and
* a seed scrambler.
*/
void _seed_auto(void);
/*
* Initializes the seed vector with a starting value. This is useful
* for debugging when reproducible scenarios are desirable.
*/
void _seed_with(const uint64_t x);
/*
* Returns 64 randomly generated bits.
*/
uint64_t next(void);
#endif
randomlib.c
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>
#include <math.h>
#include "xoroshiro128plus.h"
#define UINT32_MASK 0xFFFFFFFFULL
// The seeding functions.
/*
* Initializes the PRNG. Uses the current time to seed
* it. It's expected resolution is in microseconds.
*/
void init_rand(void)
_seed_auto();
/*
* Initializes the PRNG with a given seed.
*/
void seed(uint64_t x)
_seed_with(x);
/*
* Returns a random integer in the range 0 (inclusive)
* and limit (exclusive). The integers generated are uniformly
* distributed.
*/
uint64_t next_uint64(const uint64_t limit)
return next() % limit;
/*
* Returns a 32-bit unsigned integer.
*/
uint32_t next_uint32()
// Generate 2 ints at a time out of one call to next()
// This makes use of both halves of the 64 bits generated
static uint32_t next_uint32_temp;
static bool has_next_uint32 = false;
uint32_t val;
if (has_next_uint32)
val = next_uint32_temp;
else
uint64_t full = next();
val = full >> 32; // The upper half
next_uint32_temp = (uint32_t) (full & UINT32_MASK); // The lower half
// quick flip
has_next_uint32 ^= true;
return val;
/*
* Returns a boolean value. The expected probability of
* both values is 50%.
*/
bool next_bool()
// Sign test as per the recommendation
// We check if the highest bit is on
return (next() >> 63) & 1;
/*
* Returns a uniformly distributed double between 0
* (inclusive) and 1 (exclusive).
*/
double next_double()
// return ((double) next()) / ((double) UINT64_MAX);
// return (next() >> 11) * (1. / (UINT64_C(1) << 53));
return (next() >> 11) * 0x1.0p-53;
/*
* Returns a normally distributed double between -1.0
* and +1.0
*/
double next_gaussian()
static double next_gaussian;
static bool has_next_gaussian = false;
double val;
if (has_next_gaussian)
val = next_gaussian;
else
double u, v, s;
do
// Limit u and v to the range [-1, 1]
u = next_double() * 2 - 1;
v = next_double() * 2 - 1;
s = u * u + v * v;
while(s > 1);
double scale = sqrt(-2.0 * log(s) / s);
next_gaussian = v * scale;
val = u * scale;
// Quick flip
has_next_gaussian ^= true;
return val;
My questions
I tested the output of all the functions I implemented by calling them several times and dumping the output to a CSV file. I then used Excel to plot Histograms and the plots seem reasonable. i.e. the functions are uniform and the Gaussian function is normal, as expected. You may perform yo0ur own tests or suggest a better, more efficient or automated approach.
The implementations for generating double
s have a characteristic dip in the frequency of the very last bucket (to about 50% of the rest), for all the methods. I do not think it would matter much in practice though. This is also something you may wish to comment upon.
The backing xoroshiro algorithm is pretty fast. I hope my implementation does not do injustice to it and introduce unnecessary bottle-necks.
As usual, you are free to comment and critique on every part of (my) code. You may suggest improvements to the code written by the original authors since it is in the Public Domain.
performance c random library
Introduction and Credits
The inbuilt rand()
function in C only offers a very limited range of possible outcomes, being limited to only 31 bits. I wanted something that was fast (enough) and could provide 64 random bits at a time. I fond an impressive, openly available PRNG for this purpose: xoroshiro128+. It seems to pass most well-known tests for randomness. I decided to modify it slightly and make a library.
Credit: David Blackman and Sebastiano Vigna for coming up with this implementation in 2016.
Their website: Link.
Here are my modifications to the code the original authors wrote.
xoroshiro128plus.c
// Original and permanent link: http://xoroshiro.di.unimi.it/xoroshiro128plus.c
/* Written in 2016 by David Blackman and Sebastiano Vigna (vigna@acm.org)
To the extent possible under law, the author has dedicated all copyright
and related and neighboring rights to this software to the public domain
worldwide. This software is distributed without any warranty.
See <http://creativecommons.org/publicdomain/zero/1.0/>. */
#include <stdint.h>
#include <sys/time.h>
/* This is the successor to xorshift128+. It is the fastest full-period
generator passing BigCrush without systematic failures, but due to the
relatively short period it is acceptable only for applications with a
mild amount of parallelism; otherwise, use a xorshift1024* generator.
Beside passing BigCrush, this generator passes the PractRand test suite
up to (and included) 16TB, with the exception of binary rank tests, as
the lowest bit of this generator is an LFSR of degree 128. The next bit
can be described by an LFSR of degree 8256, but in the long run it will
fail linearity tests, too. The other bits needs a much higher degree to
be represented as LFSRs.
We suggest to use a sign test to extract a random Boolean value, and
right shifts to extract subsets of bits.
Note that the generator uses a simulated rotate operation, which most C
compilers will turn into a single instruction. In Java, you can use
Long.rotateLeft(). In languages that do not make low-level rotation
instructions accessible xorshift128+ could be faster.
The state must be seeded so that it is not everywhere zero. If you have
a 64-bit seed, we suggest to seed a splitmix64 generator and use its
output to fill s. */
uint64_t s[2];
////////////////////////////////////////////////////////////////////////
/* Modifications by Subhomoy Haldar (github.com/Subh0m0y) start here. */
// As per the recommendation by the original authors - David Blackman and
// Sebastiano Vigna, we shall use two iterations of a seeded splitmix64
// generator (written by Sebastiano Vigna) and use its output to seed this
// program's seed vector.
// Original and permanent link: http://xoroshiro.di.unimi.it/splitmix64.c
static uint64_t splitmix64next(const uint64_t x)
uint64_t z = (x + 0x9e3779b97f4a7c15);
z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9;
z = (z ^ (z >> 27)) * 0x94d049bb133111eb;
return z ^ (z >> 31);
const uint64_t SEED_SCRAMBLER = 0x37bc7dd1f3339a5fULL;
static uint64_t time_based_x(void)
// Obtain the (relative, partial) time information
// in microseconds
struct timeval currentTime;
gettimeofday(¤tTime, NULL);
uint64_t value = currentTime.tv_usec;
// Combine and generate the seed.
uint64_t x = (value << 32)
void _seed_with(const uint64_t x)
s[0] = splitmix64next(x);
s[1] = splitmix64next(s[0]);
void _seed_auto(void)
_seed_with(time_based_x());
/* Modifications by Subhomoy Haldar end here. */
////////////////////////////////////////////////
static inline uint64_t rotl(const uint64_t x, int k)
return (x << k)
uint64_t next(void)
const uint64_t s0 = s[0];
uint64_t s1 = s[1];
const uint64_t result = s0 + s1;
s1 ^= s0;
s[0] = rotl(s0, 55) ^ s1 ^ (s1 << 14); // a, b
s[1] = rotl(s1, 36); // c
return result;
/* This is the jump function for the generator. It is equivalent
to 2^64 calls to next(); it can be used to generate 2^64
non-overlapping subsequences for parallel computations. */
void jump(void)
static const uint64_t JUMP = 0xbeac0467eba5facb, 0xd86b048b86aa9922 ;
uint64_t s0 = 0;
uint64_t s1 = 0;
for(int i = 0; i < sizeof JUMP / sizeof *JUMP; i++)
for(int b = 0; b < 64; b++)
if (JUMP[i] & UINT64_C(1) << b)
s0 ^= s[0];
s1 ^= s[1];
next();
s[0] = s0;
s[1] = s1;
xoroshiro128plus.h
#ifndef XORO_SHIRO_128_PLUS
#define XORO_SHIRO_128_PLUS
#include <inttypes.h>
/* Only expose the method that is relevant */
/*
* Automatically initializes the seed vector for the xoroshiro128+
* PRNG, using a part of the current time (in microseconds) and
* a seed scrambler.
*/
void _seed_auto(void);
/*
* Initializes the seed vector with a starting value. This is useful
* for debugging when reproducible scenarios are desirable.
*/
void _seed_with(const uint64_t x);
/*
* Returns 64 randomly generated bits.
*/
uint64_t next(void);
#endif
randomlib.c
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>
#include <math.h>
#include "xoroshiro128plus.h"
#define UINT32_MASK 0xFFFFFFFFULL
// The seeding functions.
/*
* Initializes the PRNG. Uses the current time to seed
* it. It's expected resolution is in microseconds.
*/
void init_rand(void)
_seed_auto();
/*
* Initializes the PRNG with a given seed.
*/
void seed(uint64_t x)
_seed_with(x);
/*
* Returns a random integer in the range 0 (inclusive)
* and limit (exclusive). The integers generated are uniformly
* distributed.
*/
uint64_t next_uint64(const uint64_t limit)
return next() % limit;
/*
* Returns a 32-bit unsigned integer.
*/
uint32_t next_uint32()
// Generate 2 ints at a time out of one call to next()
// This makes use of both halves of the 64 bits generated
static uint32_t next_uint32_temp;
static bool has_next_uint32 = false;
uint32_t val;
if (has_next_uint32)
val = next_uint32_temp;
else
uint64_t full = next();
val = full >> 32; // The upper half
next_uint32_temp = (uint32_t) (full & UINT32_MASK); // The lower half
// quick flip
has_next_uint32 ^= true;
return val;
/*
* Returns a boolean value. The expected probability of
* both values is 50%.
*/
bool next_bool()
// Sign test as per the recommendation
// We check if the highest bit is on
return (next() >> 63) & 1;
/*
* Returns a uniformly distributed double between 0
* (inclusive) and 1 (exclusive).
*/
double next_double()
// return ((double) next()) / ((double) UINT64_MAX);
// return (next() >> 11) * (1. / (UINT64_C(1) << 53));
return (next() >> 11) * 0x1.0p-53;
/*
* Returns a normally distributed double between -1.0
* and +1.0
*/
double next_gaussian()
static double next_gaussian;
static bool has_next_gaussian = false;
double val;
if (has_next_gaussian)
val = next_gaussian;
else
double u, v, s;
do
// Limit u and v to the range [-1, 1]
u = next_double() * 2 - 1;
v = next_double() * 2 - 1;
s = u * u + v * v;
while(s > 1);
double scale = sqrt(-2.0 * log(s) / s);
next_gaussian = v * scale;
val = u * scale;
// Quick flip
has_next_gaussian ^= true;
return val;
My questions
I tested the output of all the functions I implemented by calling them several times and dumping the output to a CSV file. I then used Excel to plot Histograms and the plots seem reasonable. i.e. the functions are uniform and the Gaussian function is normal, as expected. You may perform yo0ur own tests or suggest a better, more efficient or automated approach.
The implementations for generating double
s have a characteristic dip in the frequency of the very last bucket (to about 50% of the rest), for all the methods. I do not think it would matter much in practice though. This is also something you may wish to comment upon.
The backing xoroshiro algorithm is pretty fast. I hope my implementation does not do injustice to it and introduce unnecessary bottle-necks.
As usual, you are free to comment and critique on every part of (my) code. You may suggest improvements to the code written by the original authors since it is in the Public Domain.
performance c random library
edited Jan 6 at 9:14
asked Jan 6 at 7:19
Astrobleme
1,5241238
1,5241238
Did you build a wrapper or did you modify the original code? The whole point of a wrapper is to leave the original intact, or at least be independent of it's implementation.
â Mast
Jan 6 at 9:10
@Mast That's a valid point. I slightly modified the code as per the recommendation of the authors.
â Astrobleme
Jan 6 at 9:13
Ok, and what's with all the magic numbers?
â Mast
Jan 6 at 9:14
The magic numbers come from the original source: xoroshiro.di.unimi.it/splitmix64.c
â Astrobleme
Jan 6 at 9:15
I figured as much. But why are those numbers exactly that value? Are they dependent on each other?
â Mast
Jan 6 at 9:34
add a comment |Â
Did you build a wrapper or did you modify the original code? The whole point of a wrapper is to leave the original intact, or at least be independent of it's implementation.
â Mast
Jan 6 at 9:10
@Mast That's a valid point. I slightly modified the code as per the recommendation of the authors.
â Astrobleme
Jan 6 at 9:13
Ok, and what's with all the magic numbers?
â Mast
Jan 6 at 9:14
The magic numbers come from the original source: xoroshiro.di.unimi.it/splitmix64.c
â Astrobleme
Jan 6 at 9:15
I figured as much. But why are those numbers exactly that value? Are they dependent on each other?
â Mast
Jan 6 at 9:34
Did you build a wrapper or did you modify the original code? The whole point of a wrapper is to leave the original intact, or at least be independent of it's implementation.
â Mast
Jan 6 at 9:10
Did you build a wrapper or did you modify the original code? The whole point of a wrapper is to leave the original intact, or at least be independent of it's implementation.
â Mast
Jan 6 at 9:10
@Mast That's a valid point. I slightly modified the code as per the recommendation of the authors.
â Astrobleme
Jan 6 at 9:13
@Mast That's a valid point. I slightly modified the code as per the recommendation of the authors.
â Astrobleme
Jan 6 at 9:13
Ok, and what's with all the magic numbers?
â Mast
Jan 6 at 9:14
Ok, and what's with all the magic numbers?
â Mast
Jan 6 at 9:14
The magic numbers come from the original source: xoroshiro.di.unimi.it/splitmix64.c
â Astrobleme
Jan 6 at 9:15
The magic numbers come from the original source: xoroshiro.di.unimi.it/splitmix64.c
â Astrobleme
Jan 6 at 9:15
I figured as much. But why are those numbers exactly that value? Are they dependent on each other?
â Mast
Jan 6 at 9:34
I figured as much. But why are those numbers exactly that value? Are they dependent on each other?
â Mast
Jan 6 at 9:34
add a comment |Â
2 Answers
2
active
oldest
votes
up vote
3
down vote
accepted
Overall: good design, presentation and nicely formatted.
xoroshiro128plus.c
uint64_t s[2];
looks very bad to have at global scope. Certainly should have limited file scope withstatic uint64_t s[2];
Same foruint64_t SEED_SCRAMBLER
The state space is 128 bit, yet users with
_seed_with(const uint64_t x)
are limited to 264 combinations. I'd expect a_seed_with()
that allows 2128 or an auxiliary_seed_full_with()
. 64-bit may seem sufficient today. Similar thinking years ago thoughtsrand(unsigned int seed)
was good enough too.Code uses
gettimeofday()
which is not part of the Standard C library.|
favors 1 bits. Use^
. There is no difference withuint64_t value
as a count of microseconds, yet it would be a weakness shouldvalue
become derived otherwise.// uint64_t x = (value << 32) | value;
uint64_t x = (value << 32) ^ value;The net effect of
static uint64_t time_based_x(void)
is 1 of 1,000,000 different results. (Seems weak and is reminiscent of an online poker hack that use the second of the day as a game seed: Only 24*60*60 different decks). Recommended to use more of "now" to achieve more than a paltry 1,000,000 different results. Example:uint64_t x = currentTime.tv_sec;
x *= 1000000; // us per s
x += currentTime.tv_usec;
x ^= (x << 32) ^ SEED_SCRAMBLER;
return x;
xoroshiro128plus.h
const
invoid _seed_with(const uint64_t x);
serves no purpose for the users. Recommend simplificationvoid _seed_with(uint64_t x);
for their sakeFunction name
next()
is far too common and uninformative. Collisions are certain.Although ponderous, consider function names like
xoroshiro128plus_seed_auto(), xoroshiro128plus_next()
.
randomlib.h
The file is missing. I'd expect this code to include the companion to
randomlib.c
.In the .h file is a great place to put the function description comments. There are not so much needed in the .c file
randomlib.c
Like #2 above, a function to seed more than 64-bits would be useful.
Bug. Re: "The integers generated are uniformly * distributed." with
next_uint64(const uint64_t limit)
. Function only generates uniformly whenlimit
is a power of 2. Function with that limitation is not that useful.Inconsistent style:
void init_rand(void)
vsuint32_t next_uint32()
. Recommend(void)
for both.Mask or cast is superfluous. Simply use one.
// next_uint32_temp = (uint32_t) (full & UINT32_MASK);
next_uint32_temp = (uint32_t) full;has_next_uint32 ^= true;
is more logical ashas_next_uint32 = !has_next_uint32;
. Either way,has_next_uint32 = true;
andhas_next_uint32 = false
in theif() ... else
code would convey more meaning.Bug.
seed()
does not resethas_next_uint32
innext_uint32()
. Soseed()
does not "seed" the complete PRNG state. Same forhas_next_gaussian
innext_gaussian()
. Consider makinghas_next_uint32
, etc.static
to the file, soseed()
may reset them.next_double()
relies onDBL_EPSILON/2 == 0x1.0p-53
. Either make code portable or add anassert/_Static_assertion
to confirm assumptions.usually means to include end points. So comment
// Limit u and v to the range [-1, 1]
would be better as// Limit u and v to the range (-1, 1)
.Bug. No protection against
s==0
insqrt(-2.0 * log(s) / s)
.Unclear comment in
next_bool()
with "Sign test as per the recommendation" as there is no sign in the code.return next() & 1;
should be sufficient. Any bit should be as random as the others. Go for simple code.LL
unnecessary inUINT32_MASK 0xFFFFFFFFULL
. TheLL
could only make the constant wider than it needs to be. I find using digits in one case andU
in another adds clarity to the constantUINT32_MASK 0xFFFFFFFFu
. Given #34, it is not needed anyways.The function names lack distinctiveness and meaning (e.g. How is
next_uint32
something to do with "random"?). Recommend similar change here as in #13. Examplerandomlib_gaussian()
.
Candidate alternative next_double()
. The commented out ((double) next()) / ((double) UINT64_MAX);
is on the right track, yet it should divide by UINT64_MAX + 1
. Code should also be careful how to convert _MAX
to floating point. With the below code, small values retain more precision than OP's code.
double next_double(void)
#define DBL_UINT64_MAX_P1 ((UINT64_MAX/2 + 1)*2.0)
int save_round = fegetround();
fesetround(FE_DOWNWARD);
double d = next() / DBL_UINT64_MAX_P1;
fesetround(save_round);
return d;
[Edit] On review, I see the above needs the complication of adjusting the rounding mode - sigh - not as clean an alternative as hoped.
1
See this good answer for #32 alternative: generate a uniform mapping It is that answer's step 3 is so important. Also this
â chux
Jan 6 at 19:02
A very thorough answer as usual. Well done. I'd like to ask for a suggestion for an alternative forgettimeofday()
which you mentioned is not part of the Standard C Library in #3.
â Astrobleme
Jan 6 at 21:27
1
@Astrobleme I see no great alternative for #3. One idea is to use the result oftime()
andclock()
likeuint64_t c = clock(), t = time(); return (c << 32) ^ t ^ SEED_SCRAMBLER;
This would need more study. In the end, this part of code should likely go of to platform dependent code.
â chux
Jan 6 at 21:41
I have a question aboutclock()
and the correspondingCLOCKS_PER_SEC
macro. How does this stay constant for most systems which implement turbo boost and can reach high(er) clock speeds (say 3.0 to 3.7 GHz)? Same when the system enters power saving mode and the clock speed is throttled to about half or even less. Should I avoid usingCLOCKS_PER_SEC
? Or is there a proper way? Or did I interpret it wrong?
â Astrobleme
Jan 7 at 7:43
@Astrobleme "How does this stay constant for most system" - it does not. Originalgettimeofday()
does not have consistency as the accuracy of itscurrentTime.tv_usec
varies from system to system, (it does uniformly report us though - but that may move in 1000 us steps for example) , so usingclock()
would be a portable substitute. The goals oftime_based_x()
is to offer some "true" random seed for_seed_auto()
and using "time" can only provide so many "random" bits. If code uses non-Standard C code, research/dev/random
.
â chux
Jan 7 at 12:34
 |Â
show 1 more comment
up vote
3
down vote
your _seed_with() does not fill state using splitmix as PRNG
Instead, it fill state using nested hash functions
s[1] = splitmix64next(s[0] = splitmix64next(x));
The equidistribution and nice 2^64 period is now lost
Using nested filling method for bigger state, this might happen (period = 2):
s = y, x, y, x, y, x, ...; // if splitmix(splitmix(x)) = x
Does s look like initialized randomly ?
Using splitmix PRNG to fill state, all values are guaranteed different.
Also, SEED_SCRAMBLER is unnecessary, that should be the job of splitmix
add a comment |Â
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
3
down vote
accepted
Overall: good design, presentation and nicely formatted.
xoroshiro128plus.c
uint64_t s[2];
looks very bad to have at global scope. Certainly should have limited file scope withstatic uint64_t s[2];
Same foruint64_t SEED_SCRAMBLER
The state space is 128 bit, yet users with
_seed_with(const uint64_t x)
are limited to 264 combinations. I'd expect a_seed_with()
that allows 2128 or an auxiliary_seed_full_with()
. 64-bit may seem sufficient today. Similar thinking years ago thoughtsrand(unsigned int seed)
was good enough too.Code uses
gettimeofday()
which is not part of the Standard C library.|
favors 1 bits. Use^
. There is no difference withuint64_t value
as a count of microseconds, yet it would be a weakness shouldvalue
become derived otherwise.// uint64_t x = (value << 32) | value;
uint64_t x = (value << 32) ^ value;The net effect of
static uint64_t time_based_x(void)
is 1 of 1,000,000 different results. (Seems weak and is reminiscent of an online poker hack that use the second of the day as a game seed: Only 24*60*60 different decks). Recommended to use more of "now" to achieve more than a paltry 1,000,000 different results. Example:uint64_t x = currentTime.tv_sec;
x *= 1000000; // us per s
x += currentTime.tv_usec;
x ^= (x << 32) ^ SEED_SCRAMBLER;
return x;
xoroshiro128plus.h
const
invoid _seed_with(const uint64_t x);
serves no purpose for the users. Recommend simplificationvoid _seed_with(uint64_t x);
for their sakeFunction name
next()
is far too common and uninformative. Collisions are certain.Although ponderous, consider function names like
xoroshiro128plus_seed_auto(), xoroshiro128plus_next()
.
randomlib.h
The file is missing. I'd expect this code to include the companion to
randomlib.c
.In the .h file is a great place to put the function description comments. There are not so much needed in the .c file
randomlib.c
Like #2 above, a function to seed more than 64-bits would be useful.
Bug. Re: "The integers generated are uniformly * distributed." with
next_uint64(const uint64_t limit)
. Function only generates uniformly whenlimit
is a power of 2. Function with that limitation is not that useful.Inconsistent style:
void init_rand(void)
vsuint32_t next_uint32()
. Recommend(void)
for both.Mask or cast is superfluous. Simply use one.
// next_uint32_temp = (uint32_t) (full & UINT32_MASK);
next_uint32_temp = (uint32_t) full;has_next_uint32 ^= true;
is more logical ashas_next_uint32 = !has_next_uint32;
. Either way,has_next_uint32 = true;
andhas_next_uint32 = false
in theif() ... else
code would convey more meaning.Bug.
seed()
does not resethas_next_uint32
innext_uint32()
. Soseed()
does not "seed" the complete PRNG state. Same forhas_next_gaussian
innext_gaussian()
. Consider makinghas_next_uint32
, etc.static
to the file, soseed()
may reset them.next_double()
relies onDBL_EPSILON/2 == 0x1.0p-53
. Either make code portable or add anassert/_Static_assertion
to confirm assumptions.usually means to include end points. So comment
// Limit u and v to the range [-1, 1]
would be better as// Limit u and v to the range (-1, 1)
.Bug. No protection against
s==0
insqrt(-2.0 * log(s) / s)
.Unclear comment in
next_bool()
with "Sign test as per the recommendation" as there is no sign in the code.return next() & 1;
should be sufficient. Any bit should be as random as the others. Go for simple code.LL
unnecessary inUINT32_MASK 0xFFFFFFFFULL
. TheLL
could only make the constant wider than it needs to be. I find using digits in one case andU
in another adds clarity to the constantUINT32_MASK 0xFFFFFFFFu
. Given #34, it is not needed anyways.The function names lack distinctiveness and meaning (e.g. How is
next_uint32
something to do with "random"?). Recommend similar change here as in #13. Examplerandomlib_gaussian()
.
Candidate alternative next_double()
. The commented out ((double) next()) / ((double) UINT64_MAX);
is on the right track, yet it should divide by UINT64_MAX + 1
. Code should also be careful how to convert _MAX
to floating point. With the below code, small values retain more precision than OP's code.
double next_double(void)
#define DBL_UINT64_MAX_P1 ((UINT64_MAX/2 + 1)*2.0)
int save_round = fegetround();
fesetround(FE_DOWNWARD);
double d = next() / DBL_UINT64_MAX_P1;
fesetround(save_round);
return d;
[Edit] On review, I see the above needs the complication of adjusting the rounding mode - sigh - not as clean an alternative as hoped.
1
See this good answer for #32 alternative: generate a uniform mapping It is that answer's step 3 is so important. Also this
â chux
Jan 6 at 19:02
A very thorough answer as usual. Well done. I'd like to ask for a suggestion for an alternative forgettimeofday()
which you mentioned is not part of the Standard C Library in #3.
â Astrobleme
Jan 6 at 21:27
1
@Astrobleme I see no great alternative for #3. One idea is to use the result oftime()
andclock()
likeuint64_t c = clock(), t = time(); return (c << 32) ^ t ^ SEED_SCRAMBLER;
This would need more study. In the end, this part of code should likely go of to platform dependent code.
â chux
Jan 6 at 21:41
I have a question aboutclock()
and the correspondingCLOCKS_PER_SEC
macro. How does this stay constant for most systems which implement turbo boost and can reach high(er) clock speeds (say 3.0 to 3.7 GHz)? Same when the system enters power saving mode and the clock speed is throttled to about half or even less. Should I avoid usingCLOCKS_PER_SEC
? Or is there a proper way? Or did I interpret it wrong?
â Astrobleme
Jan 7 at 7:43
@Astrobleme "How does this stay constant for most system" - it does not. Originalgettimeofday()
does not have consistency as the accuracy of itscurrentTime.tv_usec
varies from system to system, (it does uniformly report us though - but that may move in 1000 us steps for example) , so usingclock()
would be a portable substitute. The goals oftime_based_x()
is to offer some "true" random seed for_seed_auto()
and using "time" can only provide so many "random" bits. If code uses non-Standard C code, research/dev/random
.
â chux
Jan 7 at 12:34
 |Â
show 1 more comment
up vote
3
down vote
accepted
Overall: good design, presentation and nicely formatted.
xoroshiro128plus.c
uint64_t s[2];
looks very bad to have at global scope. Certainly should have limited file scope withstatic uint64_t s[2];
Same foruint64_t SEED_SCRAMBLER
The state space is 128 bit, yet users with
_seed_with(const uint64_t x)
are limited to 264 combinations. I'd expect a_seed_with()
that allows 2128 or an auxiliary_seed_full_with()
. 64-bit may seem sufficient today. Similar thinking years ago thoughtsrand(unsigned int seed)
was good enough too.Code uses
gettimeofday()
which is not part of the Standard C library.|
favors 1 bits. Use^
. There is no difference withuint64_t value
as a count of microseconds, yet it would be a weakness shouldvalue
become derived otherwise.// uint64_t x = (value << 32) | value;
uint64_t x = (value << 32) ^ value;The net effect of
static uint64_t time_based_x(void)
is 1 of 1,000,000 different results. (Seems weak and is reminiscent of an online poker hack that use the second of the day as a game seed: Only 24*60*60 different decks). Recommended to use more of "now" to achieve more than a paltry 1,000,000 different results. Example:uint64_t x = currentTime.tv_sec;
x *= 1000000; // us per s
x += currentTime.tv_usec;
x ^= (x << 32) ^ SEED_SCRAMBLER;
return x;
xoroshiro128plus.h
const
invoid _seed_with(const uint64_t x);
serves no purpose for the users. Recommend simplificationvoid _seed_with(uint64_t x);
for their sakeFunction name
next()
is far too common and uninformative. Collisions are certain.Although ponderous, consider function names like
xoroshiro128plus_seed_auto(), xoroshiro128plus_next()
.
randomlib.h
The file is missing. I'd expect this code to include the companion to
randomlib.c
.In the .h file is a great place to put the function description comments. There are not so much needed in the .c file
randomlib.c
Like #2 above, a function to seed more than 64-bits would be useful.
Bug. Re: "The integers generated are uniformly * distributed." with
next_uint64(const uint64_t limit)
. Function only generates uniformly whenlimit
is a power of 2. Function with that limitation is not that useful.Inconsistent style:
void init_rand(void)
vsuint32_t next_uint32()
. Recommend(void)
for both.Mask or cast is superfluous. Simply use one.
// next_uint32_temp = (uint32_t) (full & UINT32_MASK);
next_uint32_temp = (uint32_t) full;has_next_uint32 ^= true;
is more logical ashas_next_uint32 = !has_next_uint32;
. Either way,has_next_uint32 = true;
andhas_next_uint32 = false
in theif() ... else
code would convey more meaning.Bug.
seed()
does not resethas_next_uint32
innext_uint32()
. Soseed()
does not "seed" the complete PRNG state. Same forhas_next_gaussian
innext_gaussian()
. Consider makinghas_next_uint32
, etc.static
to the file, soseed()
may reset them.next_double()
relies onDBL_EPSILON/2 == 0x1.0p-53
. Either make code portable or add anassert/_Static_assertion
to confirm assumptions.usually means to include end points. So comment
// Limit u and v to the range [-1, 1]
would be better as// Limit u and v to the range (-1, 1)
.Bug. No protection against
s==0
insqrt(-2.0 * log(s) / s)
.Unclear comment in
next_bool()
with "Sign test as per the recommendation" as there is no sign in the code.return next() & 1;
should be sufficient. Any bit should be as random as the others. Go for simple code.LL
unnecessary inUINT32_MASK 0xFFFFFFFFULL
. TheLL
could only make the constant wider than it needs to be. I find using digits in one case andU
in another adds clarity to the constantUINT32_MASK 0xFFFFFFFFu
. Given #34, it is not needed anyways.The function names lack distinctiveness and meaning (e.g. How is
next_uint32
something to do with "random"?). Recommend similar change here as in #13. Examplerandomlib_gaussian()
.
Candidate alternative next_double()
. The commented out ((double) next()) / ((double) UINT64_MAX);
is on the right track, yet it should divide by UINT64_MAX + 1
. Code should also be careful how to convert _MAX
to floating point. With the below code, small values retain more precision than OP's code.
double next_double(void)
#define DBL_UINT64_MAX_P1 ((UINT64_MAX/2 + 1)*2.0)
int save_round = fegetround();
fesetround(FE_DOWNWARD);
double d = next() / DBL_UINT64_MAX_P1;
fesetround(save_round);
return d;
[Edit] On review, I see the above needs the complication of adjusting the rounding mode - sigh - not as clean an alternative as hoped.
1
See this good answer for #32 alternative: generate a uniform mapping It is that answer's step 3 is so important. Also this
â chux
Jan 6 at 19:02
A very thorough answer as usual. Well done. I'd like to ask for a suggestion for an alternative forgettimeofday()
which you mentioned is not part of the Standard C Library in #3.
â Astrobleme
Jan 6 at 21:27
1
@Astrobleme I see no great alternative for #3. One idea is to use the result oftime()
andclock()
likeuint64_t c = clock(), t = time(); return (c << 32) ^ t ^ SEED_SCRAMBLER;
This would need more study. In the end, this part of code should likely go of to platform dependent code.
â chux
Jan 6 at 21:41
I have a question aboutclock()
and the correspondingCLOCKS_PER_SEC
macro. How does this stay constant for most systems which implement turbo boost and can reach high(er) clock speeds (say 3.0 to 3.7 GHz)? Same when the system enters power saving mode and the clock speed is throttled to about half or even less. Should I avoid usingCLOCKS_PER_SEC
? Or is there a proper way? Or did I interpret it wrong?
â Astrobleme
Jan 7 at 7:43
@Astrobleme "How does this stay constant for most system" - it does not. Originalgettimeofday()
does not have consistency as the accuracy of itscurrentTime.tv_usec
varies from system to system, (it does uniformly report us though - but that may move in 1000 us steps for example) , so usingclock()
would be a portable substitute. The goals oftime_based_x()
is to offer some "true" random seed for_seed_auto()
and using "time" can only provide so many "random" bits. If code uses non-Standard C code, research/dev/random
.
â chux
Jan 7 at 12:34
 |Â
show 1 more comment
up vote
3
down vote
accepted
up vote
3
down vote
accepted
Overall: good design, presentation and nicely formatted.
xoroshiro128plus.c
uint64_t s[2];
looks very bad to have at global scope. Certainly should have limited file scope withstatic uint64_t s[2];
Same foruint64_t SEED_SCRAMBLER
The state space is 128 bit, yet users with
_seed_with(const uint64_t x)
are limited to 264 combinations. I'd expect a_seed_with()
that allows 2128 or an auxiliary_seed_full_with()
. 64-bit may seem sufficient today. Similar thinking years ago thoughtsrand(unsigned int seed)
was good enough too.Code uses
gettimeofday()
which is not part of the Standard C library.|
favors 1 bits. Use^
. There is no difference withuint64_t value
as a count of microseconds, yet it would be a weakness shouldvalue
become derived otherwise.// uint64_t x = (value << 32) | value;
uint64_t x = (value << 32) ^ value;The net effect of
static uint64_t time_based_x(void)
is 1 of 1,000,000 different results. (Seems weak and is reminiscent of an online poker hack that use the second of the day as a game seed: Only 24*60*60 different decks). Recommended to use more of "now" to achieve more than a paltry 1,000,000 different results. Example:uint64_t x = currentTime.tv_sec;
x *= 1000000; // us per s
x += currentTime.tv_usec;
x ^= (x << 32) ^ SEED_SCRAMBLER;
return x;
xoroshiro128plus.h
const
invoid _seed_with(const uint64_t x);
serves no purpose for the users. Recommend simplificationvoid _seed_with(uint64_t x);
for their sakeFunction name
next()
is far too common and uninformative. Collisions are certain.Although ponderous, consider function names like
xoroshiro128plus_seed_auto(), xoroshiro128plus_next()
.
randomlib.h
The file is missing. I'd expect this code to include the companion to
randomlib.c
.In the .h file is a great place to put the function description comments. There are not so much needed in the .c file
randomlib.c
Like #2 above, a function to seed more than 64-bits would be useful.
Bug. Re: "The integers generated are uniformly * distributed." with
next_uint64(const uint64_t limit)
. Function only generates uniformly whenlimit
is a power of 2. Function with that limitation is not that useful.Inconsistent style:
void init_rand(void)
vsuint32_t next_uint32()
. Recommend(void)
for both.Mask or cast is superfluous. Simply use one.
// next_uint32_temp = (uint32_t) (full & UINT32_MASK);
next_uint32_temp = (uint32_t) full;has_next_uint32 ^= true;
is more logical ashas_next_uint32 = !has_next_uint32;
. Either way,has_next_uint32 = true;
andhas_next_uint32 = false
in theif() ... else
code would convey more meaning.Bug.
seed()
does not resethas_next_uint32
innext_uint32()
. Soseed()
does not "seed" the complete PRNG state. Same forhas_next_gaussian
innext_gaussian()
. Consider makinghas_next_uint32
, etc.static
to the file, soseed()
may reset them.next_double()
relies onDBL_EPSILON/2 == 0x1.0p-53
. Either make code portable or add anassert/_Static_assertion
to confirm assumptions.usually means to include end points. So comment
// Limit u and v to the range [-1, 1]
would be better as// Limit u and v to the range (-1, 1)
.Bug. No protection against
s==0
insqrt(-2.0 * log(s) / s)
.Unclear comment in
next_bool()
with "Sign test as per the recommendation" as there is no sign in the code.return next() & 1;
should be sufficient. Any bit should be as random as the others. Go for simple code.LL
unnecessary inUINT32_MASK 0xFFFFFFFFULL
. TheLL
could only make the constant wider than it needs to be. I find using digits in one case andU
in another adds clarity to the constantUINT32_MASK 0xFFFFFFFFu
. Given #34, it is not needed anyways.The function names lack distinctiveness and meaning (e.g. How is
next_uint32
something to do with "random"?). Recommend similar change here as in #13. Examplerandomlib_gaussian()
.
Candidate alternative next_double()
. The commented out ((double) next()) / ((double) UINT64_MAX);
is on the right track, yet it should divide by UINT64_MAX + 1
. Code should also be careful how to convert _MAX
to floating point. With the below code, small values retain more precision than OP's code.
double next_double(void)
#define DBL_UINT64_MAX_P1 ((UINT64_MAX/2 + 1)*2.0)
int save_round = fegetround();
fesetround(FE_DOWNWARD);
double d = next() / DBL_UINT64_MAX_P1;
fesetround(save_round);
return d;
[Edit] On review, I see the above needs the complication of adjusting the rounding mode - sigh - not as clean an alternative as hoped.
Overall: good design, presentation and nicely formatted.
xoroshiro128plus.c
uint64_t s[2];
looks very bad to have at global scope. Certainly should have limited file scope withstatic uint64_t s[2];
Same foruint64_t SEED_SCRAMBLER
The state space is 128 bit, yet users with
_seed_with(const uint64_t x)
are limited to 264 combinations. I'd expect a_seed_with()
that allows 2128 or an auxiliary_seed_full_with()
. 64-bit may seem sufficient today. Similar thinking years ago thoughtsrand(unsigned int seed)
was good enough too.Code uses
gettimeofday()
which is not part of the Standard C library.|
favors 1 bits. Use^
. There is no difference withuint64_t value
as a count of microseconds, yet it would be a weakness shouldvalue
become derived otherwise.// uint64_t x = (value << 32) | value;
uint64_t x = (value << 32) ^ value;The net effect of
static uint64_t time_based_x(void)
is 1 of 1,000,000 different results. (Seems weak and is reminiscent of an online poker hack that use the second of the day as a game seed: Only 24*60*60 different decks). Recommended to use more of "now" to achieve more than a paltry 1,000,000 different results. Example:uint64_t x = currentTime.tv_sec;
x *= 1000000; // us per s
x += currentTime.tv_usec;
x ^= (x << 32) ^ SEED_SCRAMBLER;
return x;
xoroshiro128plus.h
const
invoid _seed_with(const uint64_t x);
serves no purpose for the users. Recommend simplificationvoid _seed_with(uint64_t x);
for their sakeFunction name
next()
is far too common and uninformative. Collisions are certain.Although ponderous, consider function names like
xoroshiro128plus_seed_auto(), xoroshiro128plus_next()
.
randomlib.h
The file is missing. I'd expect this code to include the companion to
randomlib.c
.In the .h file is a great place to put the function description comments. There are not so much needed in the .c file
randomlib.c
Like #2 above, a function to seed more than 64-bits would be useful.
Bug. Re: "The integers generated are uniformly * distributed." with
next_uint64(const uint64_t limit)
. Function only generates uniformly whenlimit
is a power of 2. Function with that limitation is not that useful.Inconsistent style:
void init_rand(void)
vsuint32_t next_uint32()
. Recommend(void)
for both.Mask or cast is superfluous. Simply use one.
// next_uint32_temp = (uint32_t) (full & UINT32_MASK);
next_uint32_temp = (uint32_t) full;has_next_uint32 ^= true;
is more logical ashas_next_uint32 = !has_next_uint32;
. Either way,has_next_uint32 = true;
andhas_next_uint32 = false
in theif() ... else
code would convey more meaning.Bug.
seed()
does not resethas_next_uint32
innext_uint32()
. Soseed()
does not "seed" the complete PRNG state. Same forhas_next_gaussian
innext_gaussian()
. Consider makinghas_next_uint32
, etc.static
to the file, soseed()
may reset them.next_double()
relies onDBL_EPSILON/2 == 0x1.0p-53
. Either make code portable or add anassert/_Static_assertion
to confirm assumptions.usually means to include end points. So comment
// Limit u and v to the range [-1, 1]
would be better as// Limit u and v to the range (-1, 1)
.Bug. No protection against
s==0
insqrt(-2.0 * log(s) / s)
.Unclear comment in
next_bool()
with "Sign test as per the recommendation" as there is no sign in the code.return next() & 1;
should be sufficient. Any bit should be as random as the others. Go for simple code.LL
unnecessary inUINT32_MASK 0xFFFFFFFFULL
. TheLL
could only make the constant wider than it needs to be. I find using digits in one case andU
in another adds clarity to the constantUINT32_MASK 0xFFFFFFFFu
. Given #34, it is not needed anyways.The function names lack distinctiveness and meaning (e.g. How is
next_uint32
something to do with "random"?). Recommend similar change here as in #13. Examplerandomlib_gaussian()
.
Candidate alternative next_double()
. The commented out ((double) next()) / ((double) UINT64_MAX);
is on the right track, yet it should divide by UINT64_MAX + 1
. Code should also be careful how to convert _MAX
to floating point. With the below code, small values retain more precision than OP's code.
double next_double(void)
#define DBL_UINT64_MAX_P1 ((UINT64_MAX/2 + 1)*2.0)
int save_round = fegetround();
fesetround(FE_DOWNWARD);
double d = next() / DBL_UINT64_MAX_P1;
fesetround(save_round);
return d;
[Edit] On review, I see the above needs the complication of adjusting the rounding mode - sigh - not as clean an alternative as hoped.
edited Jan 6 at 18:56
answered Jan 6 at 14:40
chux
11.4k11238
11.4k11238
1
See this good answer for #32 alternative: generate a uniform mapping It is that answer's step 3 is so important. Also this
â chux
Jan 6 at 19:02
A very thorough answer as usual. Well done. I'd like to ask for a suggestion for an alternative forgettimeofday()
which you mentioned is not part of the Standard C Library in #3.
â Astrobleme
Jan 6 at 21:27
1
@Astrobleme I see no great alternative for #3. One idea is to use the result oftime()
andclock()
likeuint64_t c = clock(), t = time(); return (c << 32) ^ t ^ SEED_SCRAMBLER;
This would need more study. In the end, this part of code should likely go of to platform dependent code.
â chux
Jan 6 at 21:41
I have a question aboutclock()
and the correspondingCLOCKS_PER_SEC
macro. How does this stay constant for most systems which implement turbo boost and can reach high(er) clock speeds (say 3.0 to 3.7 GHz)? Same when the system enters power saving mode and the clock speed is throttled to about half or even less. Should I avoid usingCLOCKS_PER_SEC
? Or is there a proper way? Or did I interpret it wrong?
â Astrobleme
Jan 7 at 7:43
@Astrobleme "How does this stay constant for most system" - it does not. Originalgettimeofday()
does not have consistency as the accuracy of itscurrentTime.tv_usec
varies from system to system, (it does uniformly report us though - but that may move in 1000 us steps for example) , so usingclock()
would be a portable substitute. The goals oftime_based_x()
is to offer some "true" random seed for_seed_auto()
and using "time" can only provide so many "random" bits. If code uses non-Standard C code, research/dev/random
.
â chux
Jan 7 at 12:34
 |Â
show 1 more comment
1
See this good answer for #32 alternative: generate a uniform mapping It is that answer's step 3 is so important. Also this
â chux
Jan 6 at 19:02
A very thorough answer as usual. Well done. I'd like to ask for a suggestion for an alternative forgettimeofday()
which you mentioned is not part of the Standard C Library in #3.
â Astrobleme
Jan 6 at 21:27
1
@Astrobleme I see no great alternative for #3. One idea is to use the result oftime()
andclock()
likeuint64_t c = clock(), t = time(); return (c << 32) ^ t ^ SEED_SCRAMBLER;
This would need more study. In the end, this part of code should likely go of to platform dependent code.
â chux
Jan 6 at 21:41
I have a question aboutclock()
and the correspondingCLOCKS_PER_SEC
macro. How does this stay constant for most systems which implement turbo boost and can reach high(er) clock speeds (say 3.0 to 3.7 GHz)? Same when the system enters power saving mode and the clock speed is throttled to about half or even less. Should I avoid usingCLOCKS_PER_SEC
? Or is there a proper way? Or did I interpret it wrong?
â Astrobleme
Jan 7 at 7:43
@Astrobleme "How does this stay constant for most system" - it does not. Originalgettimeofday()
does not have consistency as the accuracy of itscurrentTime.tv_usec
varies from system to system, (it does uniformly report us though - but that may move in 1000 us steps for example) , so usingclock()
would be a portable substitute. The goals oftime_based_x()
is to offer some "true" random seed for_seed_auto()
and using "time" can only provide so many "random" bits. If code uses non-Standard C code, research/dev/random
.
â chux
Jan 7 at 12:34
1
1
See this good answer for #32 alternative: generate a uniform mapping It is that answer's step 3 is so important. Also this
â chux
Jan 6 at 19:02
See this good answer for #32 alternative: generate a uniform mapping It is that answer's step 3 is so important. Also this
â chux
Jan 6 at 19:02
A very thorough answer as usual. Well done. I'd like to ask for a suggestion for an alternative for
gettimeofday()
which you mentioned is not part of the Standard C Library in #3.â Astrobleme
Jan 6 at 21:27
A very thorough answer as usual. Well done. I'd like to ask for a suggestion for an alternative for
gettimeofday()
which you mentioned is not part of the Standard C Library in #3.â Astrobleme
Jan 6 at 21:27
1
1
@Astrobleme I see no great alternative for #3. One idea is to use the result of
time()
and clock()
like uint64_t c = clock(), t = time(); return (c << 32) ^ t ^ SEED_SCRAMBLER;
This would need more study. In the end, this part of code should likely go of to platform dependent code.â chux
Jan 6 at 21:41
@Astrobleme I see no great alternative for #3. One idea is to use the result of
time()
and clock()
like uint64_t c = clock(), t = time(); return (c << 32) ^ t ^ SEED_SCRAMBLER;
This would need more study. In the end, this part of code should likely go of to platform dependent code.â chux
Jan 6 at 21:41
I have a question about
clock()
and the corresponding CLOCKS_PER_SEC
macro. How does this stay constant for most systems which implement turbo boost and can reach high(er) clock speeds (say 3.0 to 3.7 GHz)? Same when the system enters power saving mode and the clock speed is throttled to about half or even less. Should I avoid using CLOCKS_PER_SEC
? Or is there a proper way? Or did I interpret it wrong?â Astrobleme
Jan 7 at 7:43
I have a question about
clock()
and the corresponding CLOCKS_PER_SEC
macro. How does this stay constant for most systems which implement turbo boost and can reach high(er) clock speeds (say 3.0 to 3.7 GHz)? Same when the system enters power saving mode and the clock speed is throttled to about half or even less. Should I avoid using CLOCKS_PER_SEC
? Or is there a proper way? Or did I interpret it wrong?â Astrobleme
Jan 7 at 7:43
@Astrobleme "How does this stay constant for most system" - it does not. Original
gettimeofday()
does not have consistency as the accuracy of its currentTime.tv_usec
varies from system to system, (it does uniformly report us though - but that may move in 1000 us steps for example) , so using clock()
would be a portable substitute. The goals of time_based_x()
is to offer some "true" random seed for _seed_auto()
and using "time" can only provide so many "random" bits. If code uses non-Standard C code, research /dev/random
.â chux
Jan 7 at 12:34
@Astrobleme "How does this stay constant for most system" - it does not. Original
gettimeofday()
does not have consistency as the accuracy of its currentTime.tv_usec
varies from system to system, (it does uniformly report us though - but that may move in 1000 us steps for example) , so using clock()
would be a portable substitute. The goals of time_based_x()
is to offer some "true" random seed for _seed_auto()
and using "time" can only provide so many "random" bits. If code uses non-Standard C code, research /dev/random
.â chux
Jan 7 at 12:34
 |Â
show 1 more comment
up vote
3
down vote
your _seed_with() does not fill state using splitmix as PRNG
Instead, it fill state using nested hash functions
s[1] = splitmix64next(s[0] = splitmix64next(x));
The equidistribution and nice 2^64 period is now lost
Using nested filling method for bigger state, this might happen (period = 2):
s = y, x, y, x, y, x, ...; // if splitmix(splitmix(x)) = x
Does s look like initialized randomly ?
Using splitmix PRNG to fill state, all values are guaranteed different.
Also, SEED_SCRAMBLER is unnecessary, that should be the job of splitmix
add a comment |Â
up vote
3
down vote
your _seed_with() does not fill state using splitmix as PRNG
Instead, it fill state using nested hash functions
s[1] = splitmix64next(s[0] = splitmix64next(x));
The equidistribution and nice 2^64 period is now lost
Using nested filling method for bigger state, this might happen (period = 2):
s = y, x, y, x, y, x, ...; // if splitmix(splitmix(x)) = x
Does s look like initialized randomly ?
Using splitmix PRNG to fill state, all values are guaranteed different.
Also, SEED_SCRAMBLER is unnecessary, that should be the job of splitmix
add a comment |Â
up vote
3
down vote
up vote
3
down vote
your _seed_with() does not fill state using splitmix as PRNG
Instead, it fill state using nested hash functions
s[1] = splitmix64next(s[0] = splitmix64next(x));
The equidistribution and nice 2^64 period is now lost
Using nested filling method for bigger state, this might happen (period = 2):
s = y, x, y, x, y, x, ...; // if splitmix(splitmix(x)) = x
Does s look like initialized randomly ?
Using splitmix PRNG to fill state, all values are guaranteed different.
Also, SEED_SCRAMBLER is unnecessary, that should be the job of splitmix
your _seed_with() does not fill state using splitmix as PRNG
Instead, it fill state using nested hash functions
s[1] = splitmix64next(s[0] = splitmix64next(x));
The equidistribution and nice 2^64 period is now lost
Using nested filling method for bigger state, this might happen (period = 2):
s = y, x, y, x, y, x, ...; // if splitmix(splitmix(x)) = x
Does s look like initialized randomly ?
Using splitmix PRNG to fill state, all values are guaranteed different.
Also, SEED_SCRAMBLER is unnecessary, that should be the job of splitmix
edited May 20 at 17:13
answered May 20 at 16:47
Albert Chan
312
312
add a comment |Â
add a comment |Â
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f184425%2fimplementing-a-64-bit-prng-library-in-c-backed-by-xoroshiro128%23new-answer', 'question_page');
);
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Did you build a wrapper or did you modify the original code? The whole point of a wrapper is to leave the original intact, or at least be independent of it's implementation.
â Mast
Jan 6 at 9:10
@Mast That's a valid point. I slightly modified the code as per the recommendation of the authors.
â Astrobleme
Jan 6 at 9:13
Ok, and what's with all the magic numbers?
â Mast
Jan 6 at 9:14
The magic numbers come from the original source: xoroshiro.di.unimi.it/splitmix64.c
â Astrobleme
Jan 6 at 9:15
I figured as much. But why are those numbers exactly that value? Are they dependent on each other?
â Mast
Jan 6 at 9:34