
Easier Way To Scrape This Difficult Site?  #angularjs #reactjs

  • Getting the fp ratings is a little bit tricky: they are separated from the actual “player” table, but we can “connect” the two by using the unique data-player-id attribute.
  • First, we parse the fp ratings and construct a “player_id” -> “fp value” dictionary.
  • Then we go through the “player” table, scrape the rest of the information and look up the dictionary containing the fp values.

I’m trying to grab the Player Name and the FP column from here. Usually, when I need table info, I can load it into a DataFrame using pandas, or at least run a find_all() method with bs4. I found one page that recommended something like this:


First of all, before diving into the solution, make sure to study the “Terms of Service” and understand whether you are allowed to scrape the resource this way. Be a good web-scraping citizen.

The problem is that the site checks whether you are authenticated and, if not, it would set the NF_DATA … But if you open the page in the browser while not authenticated, or study the page source, you will see that the desired data is actually there in the HTML. You can scrape it directly, with no need to go through parsing the script.

Getting the fp ratings is a little bit tricky: they are separated from the actual “player” table, but we can “connect” the two by using the unique data-player-id attribute. First, we parse the fp ratings and construct a “player_id” -> “fp value” dictionary. Then we go through the “player” table, scrape the rest of the information and look up the dictionary containing the fp values.
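As a rough illustration of the two-step approach, here is a minimal sketch. The markup in SAMPLE_HTML, the "name" and "fp" class names, and the helper names are all made up for the example; only the data-player-id attribute comes from the actual page, so the selectors would need adjusting against the real HTML. It uses the standard library's html.parser rather than bs4 so that it runs without extra installs; the same idea carries over to find_all() with bs4.

```python
from html.parser import HTMLParser

# Hypothetical markup: the real page's structure and class names will differ.
SAMPLE_HTML = """
<table class="players">
  <tr data-player-id="101"><td class="name">Player A</td></tr>
  <tr data-player-id="102"><td class="name">Player B</td></tr>
</table>
<table class="ratings">
  <tr data-player-id="102"><td class="fp">18.4</td></tr>
  <tr data-player-id="101"><td class="fp">22.1</td></tr>
</table>
"""


class CellCollector(HTMLParser):
    """Collect the text of each <td>, keyed by (data-player-id, td class)."""

    def __init__(self):
        super().__init__()
        self.cells = {}
        self._player_id = None
        self._cell_class = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "tr":
            self._player_id = attrs.get("data-player-id")
        elif tag == "td":
            self._cell_class = attrs.get("class")

    def handle_endtag(self, tag):
        if tag == "td":
            self._cell_class = None
        elif tag == "tr":
            self._player_id = None

    def handle_data(self, data):
        text = data.strip()
        if self._player_id and self._cell_class and text:
            self.cells[(self._player_id, self._cell_class)] = text


def scrape_players(html):
    collector = CellCollector()
    collector.feed(html)

    # Step 1: build the "player_id" -> "fp value" dictionary.
    fp_by_id = {
        pid: float(text)
        for (pid, cls), text in collector.cells.items()
        if cls == "fp"
    }

    # Step 2: walk the player rows and look up each player's fp value.
    return [
        (text, fp_by_id.get(pid))
        for (pid, cls), text in collector.cells.items()
        if cls == "name"
    ]


if __name__ == "__main__":
    for name, fp in scrape_players(SAMPLE_HTML):
        print(name, fp)
```

Because the lookup goes through data-player-id, the ratings rows do not need to appear in the same order as the player rows. A player with no rating simply gets None.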
