Using Selenium to grab Google search results

  • Open multiple browsers for scanning.
  • Continue an interrupted scan.
  • Avoid the 503 anti-crawler response.

The source code is on GitHub: https://github.com/huboqiang/seleniumSearchGoogleTest

1. Preparation

Install packages:

pip install selenium
pip install bs4

The json module is part of the Python standard library, so it needs no separate install.

ChromeDriver was used because I normally search Google with Chrome (fxxk GFW). Firefox or another browser is fine too, as long as you can reach Google with it. Remember to put chromedriver on your $PATH before starting Selenium.
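As a rough illustration of the setup step, the sketch below checks that chromedriver can be found on $PATH and then loads one Google result page through Selenium. The helper names (`driver_on_path`, `search_url`, `fetch_result_page`) are hypothetical and are not taken from main.py. The URL pattern is the plain public search URL, assumed here for illustration.

```python
import shutil
from urllib.parse import quote_plus


def driver_on_path(name="chromedriver"):
    """Return the full path of the driver executable, or None if it is not on $PATH."""
    return shutil.which(name)


def search_url(query):
    """Build a Google search URL for one query string (assumed URL pattern)."""
    return "https://www.google.com/search?q=" + quote_plus(query)


def fetch_result_page(query):
    """Open Chrome through Selenium and return the HTML of the result page."""
    # Imported here so the two helpers above work even without selenium installed.
    from selenium import webdriver

    # Recent Selenium releases may locate a driver on their own,
    # so treat this check as optional.
    if driver_on_path() is None:
        raise RuntimeError("chromedriver was not found on $PATH")

    driver = webdriver.Chrome()
    try:
        driver.get(search_url(query))
        return driver.page_source
    finally:
        # Always close the browser, even if loading the page fails.
        driver.quit()
```

The returned HTML could then be handed to bs4 for parsing. The selectors needed to pull an address out of the page depend on Google's current markup and are deliberately left out.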

Optional (removing results):

Removing the JSON file makes the script search every query again. If the file is kept, only queries whose address is “NA” or “None” are scanned.
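The resume rule can be sketched as follows. This assumes the JSON file maps each query string to an address string, which is a guess at the layout rather than something confirmed by the repository. The function names are hypothetical.

```python
import json
import os


def load_results(path="test.json"):
    """Return saved results as a dict, or an empty dict if the file is missing."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def pending_queries(queries, results):
    """Keep queries with no saved address, or whose address is "NA" or "None"."""
    # Assumed layout: results maps query -> address string.
    unresolved = (None, "NA", "None")
    return [q for q in queries if results.get(q) in unresolved]
```

With no test.json on disk, `load_results` returns an empty dict, so every query counts as pending. That matches the effect of deleting the file.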

rm test.json

2. Run script:

python main.py

The script may have to be re-run several times to continue an interrupted scan.

main.py opens two browsers. For the test data:

browser1:

browser2:
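One way to share the work between two browsers is to split the query list and scan each part in its own thread. This is a minimal sketch and may differ from how main.py does it. `scan_batch` is a hypothetical callable that would open one browser, scan its batch, and return a dict of query to address.

```python
from concurrent.futures import ThreadPoolExecutor


def split_queries(queries, n_browsers=2):
    """Deal queries round-robin into n_browsers lists."""
    return [queries[i::n_browsers] for i in range(n_browsers)]


def scan_in_parallel(queries, scan_batch, n_browsers=2):
    """Run scan_batch(batch) once per browser, each in its own thread, and merge the dicts."""
    batches = split_queries(queries, n_browsers)
    merged = {}
    with ThreadPoolExecutor(max_workers=n_browsers) as pool:
        # pool.map yields one result dict per batch, in batch order.
        for partial in pool.map(scan_batch, batches):
            merged.update(partial)
    return merged
```

Round-robin splitting keeps the two batches within one query of each other in size, so neither browser sits idle for long while the other finishes.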

The browser1 and browser2 GIFs were generated using makeGif.py. This script uses images2gif, but a bug in it has to be fixed first: http://stackoverflow.com/questions/19149643/error-in-images2gif-py-with-globalpalette


Author: Boqiang Hu. Comments and discussion are welcome.
If you repost this article, please credit the source: Using selenium to grab the google searching result