I'd like to sample 2000 random documents from approximately 60 ES indexes holding about 50 million documents each, for a total of about 3 billion documents overall. I've tried doing the following on the Kibana Dev Tools page:
GET some_index_abc_*/_search
{
  "size": 2000,
  "query": {
    "function_score": {
      "query": {
        "match_phrase": {
          "field_a": "some phrase"
        }
      },
      "random_score": {}
    }
  }
}
But this query never returns. When I refresh the Dev Tools page, I get a page telling me that the ES cluster status is red (this doesn't seem to be a coincidence; I've tried several times). Other queries without the random function (counts, simple match_all queries) work fine. I've read that function_score queries tend to be slow, but a random function score is the only method I've been able to find for getting random documents from ES. Is there any other, faster way to sample random documents from multiple large ES indexes?
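One variation that might at least keep the request from hanging is to bound it. The sketch below is untested against a real cluster; it only builds the request body in Python and adds the standard `timeout` and `terminate_after` search parameters to the query above. The helper name `build_random_sample_body` and the default values `"30s"` and `100_000` are hypothetical, chosen purely for illustration. Note that `terminate_after` stops collection early on each shard, so the resulting sample would likely be biased toward whichever documents each shard visits first rather than being uniformly random.

```python
import json


def build_random_sample_body(phrase, size=2000, timeout="30s", terminate_after=100_000):
    """Build a bounded version of the random-sampling search body.

    The helper name and the default limits are illustrative assumptions,
    not values taken from the cluster described above.
    """
    return {
        "size": size,
        # Best-effort time limit; shards may return partial results when it expires.
        "timeout": timeout,
        # Upper bound on the number of documents collected per shard.
        "terminate_after": terminate_after,
        "query": {
            "function_score": {
                "query": {"match_phrase": {"field_a": phrase}},
                # Same unseeded random scoring as in the original request.
                "random_score": {},
            }
        },
    }


if __name__ == "__main__":
    # Print the request in a form that can be pasted into Kibana Dev Tools.
    print("GET some_index_abc_*/_search")
    print(json.dumps(build_random_sample_body("some phrase"), indent=2))
```

Running the script prints a request that can be pasted into Dev Tools. Whether these limits are enough to stop the cluster from going red is an open question.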