1 comment

  • 8organicbits 9 hours ago

    Let's break it down:

    Authentic user sentiment: this is an impossible problem to solve exactly. At scale, the best you can do is ask an LLM to rate sentiment and authenticity. If you can tolerate inaccuracies, that may be viable.

    Rate limits: web crawling frameworks do this out of the box via robots.txt, various headers, etc. Non-trivial to set up, but not novel.
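
    To make that concrete, here's a rough sketch of the kind of check a crawler does before each request, using only Python's standard library. The robots.txt content, the "example-bot" user agent, the shop.example URLs, and the helper names are all made up for illustration; a real framework wires this up for you and handles more cases (for example, Retry-After can also be an HTTP date, which this ignores).

```python
from typing import Optional
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would fetch this from the site.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 5
Disallow: /checkout/
"""


def load_policy(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a policy object we can query."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def delay_for(parser: RobotFileParser, user_agent: str, default: float = 1.0) -> float:
    """Seconds to wait between requests; falls back to an assumed default."""
    delay = parser.crawl_delay(user_agent)
    return float(delay) if delay is not None else default


def retry_after_seconds(header_value: Optional[str]) -> Optional[float]:
    """Read a Retry-After header given in whole seconds; None if absent or not numeric."""
    if header_value is None:
        return None
    value = header_value.strip()
    return float(value) if value.isdigit() else None


policy = load_policy(SAMPLE_ROBOTS_TXT)
allowed = policy.can_fetch("example-bot", "https://shop.example/products/123/reviews")
wait = delay_for(policy, "example-bot")
```

    Before each request you'd check can_fetch, sleep for at least the crawl delay, and back off for longer when a response carries a Retry-After header.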

    ToS: You'll need a lawyer to advise here. Possibly by reading each ToS document individually, including every ToS update.

    Legal/ethical: we'd need to know more about what you're doing to comment.

    Generally: Retail websites want users to use their website to purchase products. If your scraping doesn't drive traffic to their sites, then they won't want your scraping. If users who previously went directly to the retail website to view the reviews now use your site instead, the retailers will fear a loss of revenue. One way or another, they'll try to prevent this from happening.