I want to generate huge data with 16 variables.
The data should be generated randomly with the ranges:
(The range may change)
v01: 20 ~ 30
v02: 15 ~ 25
v03: 3 ~ 9
v04: 20 ~ 30
v05: 3 ~ 9
v06: 30 ~ 40
v07: 600 ~ 700
v08: 50 ~ 60
v09: 5 ~ 55
v10: 4 ~ 10
v11: 25 ~ 35
v12: 10 ~ 16
v13: 9 ~ 15
v14: 2 ~ 8
v15: 450 ~ 550
v16: 5 ~ 35
For example:
+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+
| v01 | v02 | v03 | v04 | v05 | v06 | v07 | v08 | v09 | v10 | v11 | v12 | v13 | v14 | v15 | v16 |
+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+
| 24 | 21 | 3 | 29 | 3 | 30 | 635 | 58 | 10 | 8 | 32 | 10 | 12 | 4 | 540 | 32 |
| 20 | 18 | 7 | 22 | 7 | 34 | 675 | 50 | 45 | 4 | 29 | 12 | 9 | 3 | 475 | 34 |
| 28 | 24 | 7 | 28 | 3 | 39 | 685 | 53 | 5 | 6 | 34 | 12 | 10 | 2 | 495 | 31 |
| 23 | 24 | 3 | 28 | 5 | 32 | 645 | 53 | 5 | 4 | 32 | 12 | 14 | 5 | 545 | 25 |
| 21 | 20 | 7 | 28 | 7 | 37 | 635 | 57 | 50 | 4 | 32 | 12 | 12 | 3 | 455 | 34 |
+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+
Every row should be unique, so I created a MySQL table:
CREATE TABLE `test_table` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`v01` smallint(6) DEFAULT NULL,
`v02` smallint(6) DEFAULT NULL,
`v03` smallint(6) DEFAULT NULL,
`v04` smallint(6) DEFAULT NULL,
`v05` smallint(6) DEFAULT NULL,
`v06` smallint(6) DEFAULT NULL,
`v07` smallint(6) DEFAULT NULL,
`v08` smallint(6) DEFAULT NULL,
`v09` smallint(6) DEFAULT NULL,
`v10` smallint(6) DEFAULT NULL,
`v11` smallint(6) DEFAULT NULL,
`v12` smallint(6) DEFAULT NULL,
`v13` smallint(6) DEFAULT NULL,
`v14` smallint(6) DEFAULT NULL,
`v15` smallint(6) DEFAULT NULL,
`v16` smallint(6) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `unique_index` (`v01`,`v02`,`v03`,`v04`,`v05`,`v06`,`v07`,`v08`,`v09`,`v10`,`v11`,`v12`,`v13`,`v14`,`v15`,`v16`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
And I wrote a (python) script to make random values and insert into it.
In the beginning, it could generate a million rows within 10 minutes,
but now, it costs about a whole day to generate a million rows when there are a billion rows existed.
It becomes slower and slower.
I want 10 billion rows, so I'm considering to buy a faster computer.
Before I buy a new computer, I just want to ask a question,
is there any more efficient way to do that?
Thank you
Aucun commentaire:
Enregistrer un commentaire