Taobao review crawler, plus analysis of the crawled reviews

parkerjohn6/Taobao_WebCrawler

This is a small web crawler that automatically collects product reviews from Taobao.
To use it, you need a valid Taobao account and the "requests" package installed in your Python environment.

Key points to watch out for:
1. Replace the three entries in "header" ("cookie", "referer", "user-agent") with your own values. The crawler won't work if
   you simply reuse mine.
2. How do I find these three values?
   i. On the official Taobao website, find the item whose reviews you want to crawl.
   ii. Make sure you are logged in with your account.
   iii. On the product page, press "Fn" and "F12" together (if you use Chrome) to open the developer
        tools.
   iv. Select "Network" in the tab bar, then click "reviews" on the product page.
   v. The Network panel now lists many new requests. Under "Name", find the entry that starts with
      "list_detail_rate.htm". This is the request that carries the review data.
   vi. Click "Headers" in the right panel, scroll down to the "Request Headers" section, and copy "cookie", "referer", and
       "user-agent" from there into your code.
3. The five entries in "params" ("itemId", "sellerId", "currentPage", "order", "callback") should be changed to your own as well.
        You can find the values of all of them except "order" by scrolling down to the "Query String Parameters" section. "order"
        sets the order in which the reviews are returned; you can keep the same value as mine.
4. Next, change "searchNumber". This is the number of review pages you want to crawl.
5. Don't forget to change the itemId in "TaoBao("itemId")" to the same itemId you used above.
6. Run it. Once it finishes, you will have all the reviews from the pages you requested.
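
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code in this repository: the helper names (build_headers, build_params, strip_jsonp, crawl) are made up for the example, and the host in RATE_URL is an assumption, since only the "list_detail_rate.htm" part is named above. Copy the real request URL from the developer tools. The sketch also assumes the response is wrapped in the "callback" function name (JSONP), which is why it is unwrapped before parsing.

```python
import json
import re
import time

# ASSUMPTION: only "list_detail_rate.htm" comes from the steps above; the host
# is a guess. Use the full request URL shown in the developer tools instead.
RATE_URL = "https://rate.tmall.com/list_detail_rate.htm"


def build_headers(cookie, referer, user_agent):
    """Collect the three request headers copied from "Request Headers"."""
    return {"cookie": cookie, "referer": referer, "user-agent": user_agent}


def build_params(item_id, seller_id, current_page, order, callback):
    """Collect the five query parameters from "Query String Parameters"."""
    return {
        "itemId": item_id,
        "sellerId": seller_id,
        "currentPage": current_page,
        "order": order,
        "callback": callback,
    }


def strip_jsonp(text):
    """Turn a JSONP body such as 'name({...})' into a Python object."""
    # Greedy match: from the first "(" to the last ")" in the body.
    match = re.search(r"\((.*)\)", text, re.S)
    if match is None:
        raise ValueError("response is not wrapped in a callback")
    return json.loads(match.group(1))


def crawl(headers, base_params, search_number, pause=1.0):
    """Fetch pages 1..search_number and return one parsed object per page."""
    # Imported here so the helpers above can be used without the package.
    import requests

    pages = []
    for page in range(1, search_number + 1):
        params = dict(base_params, currentPage=page)
        response = requests.get(RATE_URL, headers=headers, params=params, timeout=10)
        response.raise_for_status()
        pages.append(strip_jsonp(response.text))
        time.sleep(pause)  # wait between requests to avoid hammering the server
    return pages
```

To try it, pass your own values, for example crawl(build_headers(cookie, referer, user_agent), build_params(item_id, seller_id, 1, order, callback), search_number=5), where every argument is a value you copied in steps 2 and 3. The layout of the parsed data is not described here, so inspect one page before writing any analysis code against it.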

NOTICE: This project was finished on 2020/2/18. The collected reviews only run up to that date.


Languages

  • Jupyter Notebook 93.6%
  • Python 6.4%