How to prevent Selenium WebDriver from being detected as a bot or web spider


Before we start using php-webdriver and Selenium for web scraping and social media auto-posting, we need to adjust some settings in code and modify some files so that our script is not detected as a web bot or spider. I have listed some ways to hide Selenium automation below. The methods can be used with any programming language. Please note that this is not a complete list, and from time to time web server companies may find new methods to detect and block Selenium automation. Still, factoring all known methods into our scripts reduces the chance of detection.

1. Remove browser control flag

2. Remove signature in JavaScript

3. Set User-Agent

4. Avoid using a headless browser

5. Use maximum resolution

6. Follow page flow

7. Use proxy or VPN

8. Insert random delay

9. Use cookies to login
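A few of the items above can be sketched in code. The example below is only a minimal illustration, written in Python because the methods are not tied to one language; it is not the article's own implementation. It builds a list of Chrome command-line switches covering items 3, 4, 5 and 7, plus a helper for item 8. The function names, the user-agent string and the proxy address are hypothetical placeholders, and mapping `--disable-blink-features=AutomationControlled` to the automation signature is an assumption about one common approach. Items 2, 6 and 9 are not covered here.

```python
import random
import time


def build_chrome_arguments(user_agent, proxy=None):
    """Return Chrome command-line switches for a visible, maximized session.

    No "--headless" switch is added, so the browser runs with a real window.
    """
    arguments = [
        # Assumption: turning off this Blink feature is one common way to
        # stop navigator.webdriver from reporting an automated session.
        "--disable-blink-features=AutomationControlled",
        # Present a regular desktop browser User-Agent (placeholder value).
        "--user-agent=" + user_agent,
        # Open the window maximized instead of using a small default size.
        "--start-maximized",
    ]
    if proxy:
        # Route traffic through a proxy given as "host:port".
        arguments.append("--proxy-server=" + proxy)
    return arguments


def random_delay(min_seconds=2.0, max_seconds=6.0, sleep=time.sleep):
    """Pause for a random interval and return the number of seconds waited."""
    seconds = random.uniform(min_seconds, max_seconds)
    sleep(seconds)
    return seconds


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    for switch in build_chrome_arguments("Mozilla/5.0 (example)", "127.0.0.1:8080"):
        print(switch)
    print("waited %.2f seconds" % random_delay(0.1, 0.3))
```

The resulting list would be handed to the browser through the Chrome options of whichever Selenium client is in use, and `random_delay()` would be called between page actions so that requests do not arrive at a fixed rhythm.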


How to install php-webdriver + Selenium for screen scraping and auto-posting


Today I looked at the content of php8legs.com and realized that I have not written on this website for more than four years. It has been a busy four years. As Malaysia is implementing the MCO (Movement Control Order) due to the wide spread of the Covid-19 virus, I have the chance to take some time to discuss the topics that interest me: web scraping and auto-posting.

In these articles, I want to discuss more advanced scraping techniques, such as scraping websites with infinite scroll, as well as using WebDriver to log in to social media websites automatically and post to them. All of this can be done using Selenium. There are already many articles on Selenium and WebDriver in Python, Java, Ruby and so on, so I want to cover this topic using PHP. To run Selenium with PHP under Windows 10, assuming you already have XAMPP installed (with PHP 7 or above), the following software packages are required:

1. Java - installation

2. Composer - installation

3. php-webdriver from github.com - installation

4. Selenium Standalone Server - download

5. Chromedriver - download

If you already have Java and Composer installed, just perform the installation in step 3 and download the packages in steps 4 and 5.
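To see which of the steps can be skipped, a quick check of what is already installed may help. The following is a minimal sketch in Python, used here purely as a standalone helper and not as part of the PHP setup itself. It assumes the tools are invoked as `java`, `composer` and `chromedriver` and that they are on the PATH; those command names and the `find_missing_tools` helper are assumptions for illustration. Selenium Standalone Server is distributed as a `.jar` file, so it is not checked here.

```python
import shutil


def find_missing_tools(tools, which=shutil.which):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    # Assumed command names; adjust them to match your installation.
    missing = find_missing_tools(["java", "composer", "chromedriver"])
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```

A tool reported as missing may still be installed in a folder that is not on the PATH, so the result is a hint rather than proof.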
