- using Scrapy (https://scrapy.org/)
- supports parsing and asynchronous requests
- parsing HTML/JS
- deduplicating links
- using MongoDB as the store, with JSON documents; a NoSQL approach suitable for web data collection
- for each brand, start from its entry point
- collect coordinates or text addresses; when only an address is available, resolve it through a map API (any of several will do), then store the coordinates of the store location
- parse each address again to get the detailed city / province metadata
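
The link deduplication and address handling steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `StoreRecord`, `dedupe_links`, and `resolve_location` are hypothetical names, and the `geocode` / `reverse_geocode` callables stand in for whichever map API is used. It also assumes the city / province lookup is done from the stored coordinates. Scrapy already filters duplicate requests by default, so the dedup helper only shows the idea.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple
from urllib.parse import urldefrag

Coords = Tuple[float, float]  # (latitude, longitude)


@dataclass
class StoreRecord:
    """Hypothetical shape of one scraped store location."""
    brand: str
    address: Optional[str] = None
    coords: Optional[Coords] = None
    city: Optional[str] = None
    province: Optional[str] = None


def dedupe_links(links: Iterable[str]) -> List[str]:
    """Drop repeated links (ignoring #fragments), keeping first-seen order."""
    seen = set()
    unique = []
    for link in links:
        key = urldefrag(link).url  # strip the fragment before comparing
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


def resolve_location(
    record: StoreRecord,
    geocode: Callable[[str], Optional[Coords]],
    reverse_geocode: Callable[[Coords], Tuple[Optional[str], Optional[str]]],
) -> StoreRecord:
    """Fill in coordinates, then city / province, for one record.

    `geocode` and `reverse_geocode` are placeholders for a real map API
    client; their signatures are assumptions made for this sketch.
    """
    # Address-only case: ask the map API for coordinates.
    if record.coords is None and record.address:
        record.coords = geocode(record.address)
    # Second pass: derive city / province metadata from the coordinates.
    if record.coords is not None:
        record.city, record.province = reverse_geocode(record.coords)
    return record
```

Passing the map API calls in as plain callables keeps the sketch independent of any one provider. A resolved record can be turned into a dict with `dataclasses.asdict` before it is written to MongoDB.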