I've got two tables which both have hundreds of millions of rows.
One is PAPER. Each row is unique with a column called "paper_id" as its key. The other is PFOS. Each row has two columns, "paper_id" and "field_id".
One paper may belong to several fields.
I need to select N rows from each group in PFOS (grouped by field_id), then fetch the matching papers from PAPER by the selected paper_id.
This is my SQL: SELECT paper_id FROM PFOS WHERE field_id = xxx (or field_id IN (...)) ORDER BY RANDOM() LIMIT N. If possible, I could also process the data in Python, since I'm using the sqlite3 package.
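For reference, here is a minimal sketch of how that query might be run from Python for a single field, with the lookup in PAPER folded into the same statement. The `title` column comes from the schema listed in this post; the function name `sample_papers`, the index name `idx_pfos_field`, and the tiny in-memory data are only assumptions for illustration. Note that LIMIT is an upper bound, so a field with fewer than N rows returns fewer than N papers.

```python
import sqlite3


def sample_papers(conn, field_id, n):
    """Return up to n randomly chosen PAPER rows for one field_id."""
    sql = """
        SELECT p.paper_id, p.title
        FROM PAPER AS p
        WHERE p.paper_id IN (
            SELECT paper_id
            FROM PFOS
            WHERE field_id = ?
            ORDER BY RANDOM()
            LIMIT ?
        )
    """
    return conn.execute(sql, (field_id, n)).fetchall()


# Tiny in-memory stand-in for the real tables (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PAPER (paper_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE PFOS (paper_id INTEGER, field_id INTEGER);
    -- Assumed index so the field_id filter does not scan the whole table.
    CREATE INDEX idx_pfos_field ON PFOS (field_id, paper_id);
""")
conn.executemany(
    "INSERT INTO PAPER VALUES (?, ?)",
    [(i, f"paper {i}") for i in range(1, 11)],
)
# Odd paper_ids go to field 1, even paper_ids to field 0.
conn.executemany(
    "INSERT INTO PFOS VALUES (?, ?)",
    [(i, i % 2) for i in range(1, 11)],
)

rows = sample_papers(conn, 1, 3)        # 3 random papers from field 1
all_rows = sample_papers(conn, 1, 100)  # field 1 only has 5 papers
```

ORDER BY RANDOM() still has to sort every row of the chosen field, so the index helps the filter but not the shuffle itself.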
Questions
- How could I make it faster?
- When I use LIMIT, I get fewer than N rows. Did I make a mistake in the SQL?
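One thing that may explain the second question: with field_id IN (...), LIMIT N caps the whole result rather than each field. A possible way to get up to N rows per field in one statement is a window function, sketched below under the assumption that the SQLite library behind the sqlite3 module is version 3.25 or newer. The function name `sample_per_field` and the sample data are hypothetical, and this has not been measured on tables of this size; it still sorts every row of PFOS.

```python
import sqlite3


def sample_per_field(conn, n):
    """Return {field_id: [paper_id, ...]} with up to n random papers per field."""
    sql = """
        SELECT field_id, paper_id
        FROM (
            SELECT field_id,
                   paper_id,
                   ROW_NUMBER() OVER (
                       PARTITION BY field_id
                       ORDER BY RANDOM()
                   ) AS rn
            FROM PFOS
        )
        WHERE rn <= ?
    """
    picked = {}
    for field_id, paper_id in conn.execute(sql, (n,)):
        picked.setdefault(field_id, []).append(paper_id)
    return picked


# Tiny in-memory stand-in for PFOS (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PFOS (paper_id INTEGER, field_id INTEGER)")
conn.executemany(
    "INSERT INTO PFOS VALUES (?, ?)",
    [(1, 10), (2, 10), (3, 10), (4, 10), (5, 20)],
)

picked = sample_per_field(conn, 2)
# Field 10 has four papers, so two are sampled; field 20 has only one,
# so it yields a single paper even though n is 2.
```

The resulting paper_id values can then be looked up in PAPER by its key.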
PAPER
paper_id*,title...
PFOS
paper_id,field_id
I would appreciate any suggestions.