My last article was written in February this year. I have been extremely busy with my internet retail business over the last eight months, and have also put a lot of effort into releasing a price comparison engine for Malaysia's ecommerce market, BijakMall.com. Bijak Mall collects product information from various internet malls in Malaysia and indexes it into a database. Potential buyers can then search and compare prices through Bijak Mall's search engine. I started this project in a XAMPP environment so that I could test the scripts on my laptop. I wrote web spiders in PHP to collect product data and store it in a MySQL database. Sphinx Search Server (Windows version) is used to index the database and respond to user queries. Sphinx is a free, open source full-text search engine designed to provide full-text search functionality to client applications.
I started the project in Aug 2013 and wrote the scripts on and off during my free time. I had to rewrite the scripts many times so that they could extract data into a common database. Initially I looked at several open source search engines such as Xapian, Xunsearch and Apache Lucene. Later I settled on Sphinx Search Server after some research. To learn Sphinx, I purchased the ebook "Sphinx Search Beginner's Guide" (2011) by Abbas Ali. The book walks through simple examples of Sphinx applications, and I think it is a good book for beginners.
In fact, there is another excellent book on Sphinx, "Introduction to Search with Sphinx: From installation to relevance tuning" (2011) by the creator of Sphinx, Andrew Aksyonoff.
Sphinx was easy to use (from installation and setup to writing the PHP interface code) and responds very quickly to users' search queries. A PHP program can interface with Sphinx via SphinxAPI (for PHP) or SphinxQL. To shorten the learning curve, I also referred to Barry Hunter's sample code at nearby.org.uk.
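To give a rough idea of what the SphinxQL route looks like, here is a minimal sketch of building a cheapest-first search query. It is written in Python only so that it can be run without a server; the same statement could be sent from PHP through any MySQL client library, since searchd speaks the MySQL wire protocol when a SphinxQL listener is configured (commonly `listen = 9306:mysql41`). The index name `products`, the attribute `price`, the helper names, and the exact set of escaped characters are all assumptions for illustration, not Bijak Mall's actual schema or code.

```python
# Characters treated as operators in Sphinx's extended query syntax.
# Assumption: this set is illustrative; check the Sphinx docs for your version.
_SPECIAL = set('\\()|-!@~"&/^$=<')


def escape_match(text):
    """Backslash-escape query operators so user input is matched literally."""
    return "".join("\\" + ch if ch in _SPECIAL else ch for ch in text)


def build_price_query(keywords, limit=20):
    """Return (sql, params) for a cheapest-first full-text search.

    'products' and 'price' are hypothetical index and attribute names.
    The %s placeholder is meant to be filled in by a MySQL client library,
    which takes care of quoting the string literal.
    """
    sql = (
        "SELECT id, price FROM products "
        "WHERE MATCH(%s) ORDER BY price ASC "
        "LIMIT {:d}".format(int(limit))
    )
    return sql, (escape_match(keywords),)


if __name__ == "__main__":
    sql, params = build_price_query("usb drive (16gb)", limit=5)
    print(sql)
    print(params)
```

The pair returned by `build_price_query` is intended to be passed to a client's execute call (for example `cursor.execute(sql, params)` in a MySQL driver), so the search keywords never get concatenated into the SQL text by hand.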
Finally, I subscribed to VPS hosting from Servint to run Bijak Mall. They provided outstanding service, from the initial contact with sales and marketing through to Sphinx installation and answering technical questions. I had to install Sphinx Search (Linux version) on the VPS and retest the code. However, the process was easy and I ran into no issues.
Bijak Mall currently indexes slightly over 200,000 products, as of Oct 18th 2014, from 8 major internet shopping malls in Malaysia. This is still a very small number compared to what Sphinx can handle. I am continuing to write web spiders to extract more products, improve the user interface, and beef up the site content of Bijak Mall.
In the next article, I will discuss how to install Sphinx Search and use it in an application.