Friday, May 4, 2018

Generate random byte strings that compress to a known percentage

I'd like to write a test case for a complex system that stores certain data as key/value pairs. I know how well the files backing those k/v databases compress, and I can probably find out how much the database library itself compresses the values (Kyoto Cabinet with LZO compression).

In my test case I'd like to generate synthetic values, probably with a random generator, that compress to a comparable extent. Is that possible, and if so, how?

A naive approach I can think of would take the compression ratio (say 2x) and construct an N-element byte array containing two copies of the same N/2-element array of random bytes. Or maybe N/2 random bytes, each repeated twice; that should work for RLE.
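To make the two naive ideas concrete, here is a minimal Python sketch. The function names are made up for illustration, and `zlib` from the standard library stands in for LZO only because it is readily available; the real numbers would have to be measured against the compressor the database actually uses.

```python
import os
import zlib


def duplicated_block(n: int) -> bytes:
    """Return n bytes: a random n/2-byte block followed by a copy of itself."""
    half = os.urandom(n // 2)
    return half + half


def doubled_bytes(n: int) -> bytes:
    """Return n bytes: n/2 random bytes, each one repeated twice in a row."""
    half = os.urandom(n // 2)
    return bytes(b for b in half for _ in range(2))


def compression_ratio(data: bytes) -> float:
    """Original size divided by compressed size (zlib as a stand-in for LZO)."""
    return len(data) / len(zlib.compress(data))


if __name__ == "__main__":
    n = 8192  # assumed value size, small enough to fit zlib's 32 KiB window
    print("duplicated block:", round(compression_ratio(duplicated_block(n)), 2))
    print("doubled bytes:   ", round(compression_ratio(doubled_bytes(n)), 2))
```

With an LZ77-style compressor the duplicated block should land near 2x only as long as the first copy is still inside the compressor's match window. The doubled-byte variant is a different story: deflate, for one, only encodes matches of three bytes or more, so pairs of identical bytes are not something it can exploit directly, and that variant is really aimed at RLE.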



