Monday, January 29, 2018

How to randomly sample web pages from a domain?

This may be the wrong place for this question and if so, please let me know and I'll delete it.

I'm trying to randomly sample x pages from a given domain. Take PubMed as an example: given just the URL https://www.ncbi.nlm.nih.gov/pubmed/, is there any way to select subpages at random?

I typically use Python for this sort of thing, but I'm up for any tools in any language to look into this further!
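
To make the question concrete, here is a rough sketch of one approach I could imagine, not a known solution. It assumes the domain publishes a sitemap in the standard sitemaps.org XML format (I have not checked whether PubMed does) and that the sitemap text has already been downloaded. The example.org URLs and the function names `urls_from_sitemap` and `sample_pages` are made up for illustration.

```python
import random
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org protocol for <urlset>/<url>/<loc>.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap content; a real one would be fetched from the site.
EXAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.org/page/1</loc></url>
  <url><loc>https://example.org/page/2</loc></url>
  <url><loc>https://example.org/page/3</loc></url>
  <url><loc>https://example.org/page/4</loc></url>
  <url><loc>https://example.org/page/5</loc></url>
</urlset>"""


def urls_from_sitemap(xml_text):
    """Return every non-empty <loc> URL found in a sitemap XML string."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.iterfind(".//sm:loc", SITEMAP_NS)
        if loc.text and loc.text.strip()
    ]


def sample_pages(urls, x, seed=None):
    """Pick x distinct URLs uniformly at random, or all of them if fewer exist."""
    rng = random.Random(seed)
    unique = sorted(set(urls))  # de-duplicate and fix the order for reproducibility
    return rng.sample(unique, min(x, len(unique)))


if __name__ == "__main__":
    pages = urls_from_sitemap(EXAMPLE_SITEMAP)
    for url in sample_pages(pages, 3, seed=0):
        print(url)
```

This only samples uniformly over whatever the sitemap lists, so pages missing from the sitemap can never be drawn. If there is no sitemap at all, I assume I would need to crawl the site first or sample from some other index of its pages.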



