
What are the random bytes mentioned in Linux `shuf --random-source`?

I’m trying to prepare a few shuffled files with the Linux shuf command. However, shuf --help does not list any “seed” option. Is there a workaround? Thanks!

Updates:

Thanks to the reminder in the comments, I realize that (1) shuf FILE produces a different result each time, and (2) when --random-source is the same, the results are the same. So I guess it can be used to make the shuffling procedure reproducible?

Then, just out of curiosity: I have long thought that randomness is controlled by a random seed, but as the man page says, --random-source specifies where to read the “random bytes” from. How are these random bytes related to the random seed?


Answer

I wonder if there is any workaround?

There is the --random-source option.

I guess it can be used to make the shuffling procedure reproducible?

This is what --random-source is for.
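As a minimal sketch of that, the commands below save a block of random bytes to a file once and then reuse it for two runs. The file names (random.bytes, input.txt, first.txt, second.txt) and the 4096-byte size are assumptions chosen for illustration only.

```shell
# Save a fixed block of random bytes once; this file plays the role of the "seed".
head -c 4096 /dev/urandom > random.bytes

# Sample input: the numbers 1 to 10, one per line.
seq 1 10 > input.txt

# Shuffle twice, reading the random bytes from the same file each time.
shuf --random-source=random.bytes input.txt > first.txt
shuf --random-source=random.bytes input.txt > second.txt

# Both runs consumed the same bytes, so the two orders should match.
cmp first.txt second.txt && echo "identical order"
```

Keep random.bytes around to reproduce the same shuffle later. Larger inputs consume more bytes, and shuf stops with an error if the file runs out before it has read enough.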

How are these random bytes related to the random seed?

It is the source of random bytes.

GNU shuf uses a heavily optimized algorithm, and which algorithm it uses depends strongly on the other options. As a crude oversimplification: you choose a random number from 1 to the number of lines in the file and print the line with that number. Then you repeat, choosing another number that has not been picked yet.
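To make the oversimplified picture concrete, here is a rough sketch of how a stream of random bytes can drive such a shuffle. It is not GNU shuf's actual implementation: naive_shuf is a hypothetical helper, it assumes GNU od (for -w2) and an input of fewer than 65536 lines, and it reads the bytes two at a time as 16-bit numbers.

```shell
# naive_shuf RANDOM_BYTES_FILE INPUT_FILE
# Hypothetical illustration only; NOT how GNU shuf is implemented.
naive_shuf() {
  # od turns the raw bytes into unsigned 16-bit integers, one per line.
  od -An -v -tu2 -w2 "$1" | awk -v file="$2" '
    BEGIN {
      # Load every input line into an array.
      n = 0
      while ((getline line < file) > 0) lines[++n] = line

      while (n > 0) {
        # Reject values at or above the largest multiple of n, so that
        # r % n is uniform over 0..n-1 (assumes n < 65536).
        limit = 65536 - (65536 % n)
        do {
          if ((getline r) <= 0) {
            print "naive_shuf: ran out of random bytes" > "/dev/stderr"
            exit 1
          }
          r += 0
        } while (r >= limit)

        # Print the chosen line, then move the last remaining line into
        # its slot so that no line can be chosen twice.
        i = (r % n) + 1
        print lines[i]
        lines[i] = lines[n]
        delete lines[n]
        n--
      }
    }'
}
```

Calling naive_shuf random.bytes input.txt twice with the same bytes file prints the same order both times, because the only "randomness" the function has is the bytes it reads. Passing /dev/urandom as the first argument gives a different order on each run.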

Browsing the GNU shuf.c sources, I see that the number generation is done in randint_genmax() in shuf/randint.c.

Basically, I believe you are confusing two things. Pseudorandom number generators do indeed use a seed. However, shuf is intended to work with /dev/urandom, a special file for which the kernel itself keeps a global seed, and it is also intended to work with any other method of random number generation. So instead of requiring a seed for one specific random number generator, shuf requires a stream that provides random bytes when read.
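If you do want something that behaves like a seed, one option is to expand a short seed string into a deterministic byte stream and hand that to --random-source. The sketch below is adapted from the recipe in the GNU coreutils manual, which encrypts zeros with AES in CTR mode. It assumes the openssl command-line tool is installed; seeded_bytes and the file name seed42.bytes are hypothetical names used only for illustration.

```shell
# seeded_bytes SEED NBYTES
# Hypothetical helper: prints NBYTES deterministic pseudorandom bytes
# derived from SEED by encrypting zeros with AES-256 in CTR mode.
# Requires the openssl CLI; -nosalt keeps the output repeatable.
seeded_bytes() {
  head -c "$2" /dev/zero |
    openssl enc -aes-256-ctr -pass pass:"$1" -nosalt 2>/dev/null
}

# The same seed yields the same bytes, and therefore the same shuffle.
seeded_bytes 42 65536 > seed42.bytes
seq 1 20 | shuf --random-source=seed42.bytes
```

The byte stream for a given seed may differ between OpenSSL versions, so the result is repeatable on one machine rather than guaranteed to match everywhere. In bash, the manual's version streams the bytes directly through process substitution instead of a temporary file.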

User contributions licensed under: CC BY-SA