Tuesday, March 27, 2018

Should I generate massive amounts of SQL data on the client or in SQL Server?

I am writing a program to generate a massive amount of data and populate tables in SQL Server. The data spans multiple tables, potentially with multiple foreign key constraints, and includes several enum-like tables that are referenced often from other tables and whose values need to appear randomly distributed. This leads to a lot of ORDER BY NEWID() type code, which seems slow to me.

My question is: which strategy would be more performant:

1) Generate and insert the data in SQL Server, using set-based operations and a lot of ORDER BY NEWID() to get randomness

2) Generate all the data on a client (which should make operations like choosing a random value from an enum table much faster), then import the data into SQL Server
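To make the second strategy concrete, here is a minimal Python sketch of client-side generation. It is only an illustration under assumptions: the table and column names (order_id, customer_id, status_id), the STATUS_IDS list standing in for an enum-like table, and both helper functions are hypothetical and not part of my actual schema. The idea is that the enum keys are read from the server once, after which picking a random one is an in-memory list lookup rather than an ORDER BY NEWID() sort.

```python
import csv
import io
import random

# Hypothetical enum-like lookup table: its primary keys, loaded once from the server.
STATUS_IDS = [1, 2, 3, 4]


def generate_orders(n, customer_ids, status_ids, seed=None):
    """Yield n (order_id, customer_id, status_id) rows with random foreign keys."""
    rng = random.Random(seed)  # a seed makes the generated data reproducible
    for order_id in range(1, n + 1):
        # rng.choice picks a uniformly random existing key, so every
        # generated row satisfies the foreign key constraints.
        yield order_id, rng.choice(customer_ids), rng.choice(status_ids)


def write_csv(rows, out):
    """Write a header and the rows to a file-like object; return the row count."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["order_id", "customer_id", "status_id"])
    count = 0
    for row in rows:
        writer.writerow(row)
        count += 1
    return count


if __name__ == "__main__":
    customer_ids = list(range(1, 1001))  # hypothetical parent-table keys
    buf = io.StringIO()
    written = write_csv(generate_orders(10, customer_ids, STATUS_IDS, seed=42), buf)
    print(f"{written} rows generated")
    print(buf.getvalue())
```

Because generate_orders is a generator, rows can be streamed to a file without holding the whole data set in memory. A file produced this way could then be loaded with one of SQL Server's bulk-load paths such as bcp or BULK INSERT, which is where the transfer cost of this strategy would show up.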

I can see pros and cons to both strategies. Generating the random data would obviously be easier, and probably faster, on the client. However, getting that data to the server would be slow. Apart from that, importing the data and inserting it with a set-based operation should be similar in scale.

Has anyone done something similar?



