In this article, I will discuss how to download and save image files with a PHP/cURL web scraper, using the email extractor script created earlier as an example. With some modification, the same script can be used to extract product information and images from online shopping websites such as ebay.com or amazon.com into a database of your choice. It can also extract business information, both text and images, from directory websites for use on your own website.
There are a few things to consider before we scrape image files from websites.
1) A website can use various file formats (JPEG, PNG, GIF, etc.). Even a single web page can mix several formats.
If we want to build a common database for all collected images (from various websites), our PHP web scraper script needs to be able to convert them to the file format we prefer.
2) Images come in different sizes.
Some images are very large and some very small. Our PHP web scraping script needs to be able to resize large images to a smaller size. Shrinking a large image is not a problem, but enlarging a small one produces a poor-quality image.
3) We need a naming convention for image files.
Different websites name their image files differently; some use long names, some short. Before storing image files in our folder, we need to rename them according to our naming convention.
4) We need to add a column to the MySQL database table to link each image to its related information.
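To make points 1) to 3) more concrete, here is a minimal sketch of how a single downloaded image could be normalized. It is not the script from the earlier article: the function name normalizeImage(), the choice of JPEG as the preferred format, the 800-pixel limit and the md5-of-URL naming convention are all assumptions made for illustration. It also assumes PHP's GD extension is installed, and that the raw image bytes have already been fetched (for example with curl_exec() and CURLOPT_RETURNTRANSFER enabled).

```php
<?php
/**
 * Hypothetical helper (not part of the original scraper).
 * Converts any image GD can decode to JPEG, shrinks it if it is too large,
 * and saves it under a uniform file name.
 * Returns the new file name on success, or null on failure.
 */
function normalizeImage(string $rawBytes, string $sourceUrl, string $destDir, int $maxSide = 800): ?string
{
    // imagecreatefromstring() detects the format (JPEG, PNG, GIF, ...) by itself.
    $src = @imagecreatefromstring($rawBytes);
    if ($src === false) {
        return null; // the data could not be decoded as an image
    }

    $w = imagesx($src);
    $h = imagesy($src);

    // Only ever scale down: enlarging a small image gives poor quality.
    $scale = min(1.0, $maxSide / max($w, $h));
    $newW  = max(1, (int) round($w * $scale));
    $newH  = max(1, (int) round($h * $scale));

    // Start from a white canvas, because JPEG has no transparency.
    $dst   = imagecreatetruecolor($newW, $newH);
    $white = imagecolorallocate($dst, 255, 255, 255);
    imagefilledrectangle($dst, 0, 0, $newW - 1, $newH - 1, $white);
    imagecopyresampled($dst, $src, 0, 0, 0, 0, $newW, $newH, $w, $h);

    // Assumed naming convention: md5 hash of the source URL plus ".jpg".
    $fileName = md5($sourceUrl) . '.jpg';
    $path     = rtrim($destDir, '/') . '/' . $fileName;

    // Quality 85 is an arbitrary middle ground between size and sharpness.
    return imagejpeg($dst, $path, 85) ? $fileName : null;
}
```

The returned file name is the value that would go into the extra MySQL column mentioned in point 4), so that each database row can point to its image on disk. Hashing the URL keeps names short and uniform, at the cost of no longer being human-readable; a scheme based on the record ID would work just as well.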
So here we go...