Sunday, October 4, 2015

How do I send an HTTP POST when the form field names are randomized?

I am trying to write a web scraper to look through the classes at my university. This is the website: http://ift.tt/1ElPhk9

It requires you to select a campus and a semester before it shows the available classes. The URL does not change, no matter what you search for. When I inspect the page and look at the Form Data under the Network tab, the parameter names appear to be randomized: they are different in every session in which I run a search. I gave it a shot, and this is the code that I have:

import requests
from bs4 import BeautifulSoup

url = "http://ift.tt/1ElPhk9"

# Field names copied from the browser's Network tab during one search.
# They change from session to session, so these are already stale.
params = {
    'P_ba54caa74ca72e0fdac3c182cf2368f0': 201508,
    'P_17eb17673be0b52144c05c059540ca77': 1,
    'P_85bcf9333d09c50557f2c1de50710370': 'COP',
    'search': 'Search',
    'P_19995eb50474990371a9eed255dbfaf3': 1,
    'P_f0b0c62cf3b3fa84659194dc9720e9bd': 20,
    'P_b452f0da22c69c5f259ad52ee228f252': 1
    }

# Submit the search form and print the visible text of the response.
response = requests.post(url, data=params)
soup = BeautifulSoup(response.content, "lxml")
data = soup.get_text()

print(data)

But this just returns the standard entry page. I want to be able to set specific search criteria (e.g. Fall semester, class number) and then scrape the results page that comes up.
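
For reference, one direction that might work is to read the current field names from the form itself on every run instead of hard-coding them. The sketch below is untested against the real site and rests on several assumptions: that the randomized names appear in the HTML returned by a plain GET, that they are tied to a session cookie, that the page contains a single form, and that the fields keep the same order between sessions. The names FormFieldCollector, discover_fields and search are hypothetical helpers, not part of requests or BeautifulSoup.

```python
from html.parser import HTMLParser


class FormFieldCollector(HTMLParser):
    """Collect [name, default value] for named <input>/<select> tags, in page order."""

    def __init__(self):
        super().__init__()
        self.fields = []         # [name, value] pairs in document order
        self._in_select = False  # True while inside a named <select>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("name")
        if tag == "input" and name:
            kind = (attrs.get("type") or "text").lower()
            # Browsers do not submit unchecked checkboxes or radio buttons.
            if kind in ("checkbox", "radio") and "checked" not in attrs:
                return
            self.fields.append([name, attrs.get("value") or ""])
        elif tag == "select" and name:
            self.fields.append([name, None])
            self._in_select = True
        elif tag == "option" and self._in_select:
            # Keep the selected option; fall back to the first option seen.
            if "selected" in attrs or self.fields[-1][1] is None:
                self.fields[-1][1] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "select":
            self._in_select = False


def discover_fields(html):
    """Return (name, default value) tuples for the form fields found in html."""
    parser = FormFieldCollector()
    parser.feed(html)
    return [(name, value if value is not None else "") for name, value in parser.fields]


def search(url, overrides):
    """GET the form, then POST it back within the same session.

    overrides maps a field's position in the form (assumed stable) to the
    value to send in place of the page's default.
    """
    import requests  # third-party; imported here so the parser has no dependencies

    with requests.Session() as session:  # keeps cookies between the two requests
        page = session.get(url)
        fields = discover_fields(page.text)
        payload = [(name, overrides.get(i, value))
                   for i, (name, value) in enumerate(fields)]
        return session.post(url, data=payload)
```

With this, something like search(url, {0: 201508, 2: 'COP'}) would override the first and third fields and leave the rest at the defaults the page supplied. Which position corresponds to which criterion would have to be worked out by inspecting the form.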



