Community note: AI LLM and BOT forum data scraping - Total Motorcycle Community Forums
Total Motorcycle Community Forums

26 Years. 430 Million Readers. 54 years of Motorcycle Guides ∙ Reviews ∙ The friendliest motorcycle community on the internet!

Community note: AI LLM and BOT forum data scraping

3 posts • Page 1 of 1
totalmotorcycle
Administrator
Posts: 30005
Joined: Sat Nov 22, 2003 1:00 pm
Real Name: Mike
Sex: Male
Years Riding: 34
My Motorcycle: 2013 Moto Guzzi V7 Stone
Location: Winnipeg, Manitoba

#1 Post by totalmotorcycle » Tue Jul 15, 2025 1:34 am

Good day TMW community!

Over the last year I've noticed that AI chatbots, Co-Pilot, Gemini, Grok, LLMs (Large Language Models), etc. have been aggressively harvesting any data they can scrape (take) from the internet. As our TMW forums have been around for 23 years (since 2002), that's a lot of data they can source.

Because of this, bandwidth use across the forums has been overwhelming. Over 50% of TMW's bandwidth has gone to feeding these AI LLMs, which is not only costing a lot of money but also slowing the entire site down. I get no revenue from AI models visiting TMW. In fact, by scraping (stealing) the information, AI diverts visitors AWAY from our forums and website.

Thus, I have to make some hard decisions soon on what to do with the forums as things are getting out of hand with bandwidth and the AI data theft.

If you have any suggestions, please feel free to raise your hand. Right now I'm looking at all options including:

1. Limiting access to forum posts to logged-in, registered members only.
2. Banning all bots and AI models from the site.
3. Shutting down and removing the forums completely.
4. Honeypots (AI traps) to stop the steal.
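
For anyone wondering what option 4 could look like in practice, here is a rough sketch of the honeypot idea in Python. It assumes a made-up trap URL (`TRAP_PATH`) that is hidden from human visitors and disallowed for crawlers, so only a scraper that ignores the rules would ever request it. The class and names are hypothetical and not a phpBB feature.

```python
from dataclasses import dataclass, field

# Hypothetical trap URL: linked invisibly in pages and disallowed for crawlers,
# so a human or a well-behaved bot should never request it.
TRAP_PATH = "/trap/do-not-follow"


@dataclass
class Honeypot:
    """Remembers every client address that has requested the trap URL."""
    flagged: set = field(default_factory=set)

    def allow(self, ip: str, path: str) -> bool:
        """Return True if the request should be served, False if blocked."""
        if path == TRAP_PATH:
            self.flagged.add(ip)  # the client took the bait
        return ip not in self.flagged


pot = Honeypot()
pot.allow("203.0.113.7", "/viewtopic.php")  # ordinary request, served
pot.allow("203.0.113.7", TRAP_PATH)         # follows the hidden link, gets flagged
pot.allow("203.0.113.7", "/viewtopic.php")  # now blocked
```

Blocking by IP address is the simplest choice for a sketch; scrapers that rotate addresses would need something more than this.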

Thank you for your attention in this matter. (I can't believe I just typed that!).

Mike
NEW 2026 Motorcycle Model Guides
2025 Motorcycle Model Guides

Total Motorcycle is official Media/Press for Aprilia, Benelli, Beta, Bimota, BMW, Brammo, Buell, Can-Am, CCW, Ducati, EBR, Harley-Davidson, Honda, Husqvarna, Husaberg, Hyosung, Indian, Kawasaki, KTM, KYMCO, LiveWire, Moto Guzzi, Moto Morini, MV Agusta, Norton, Phantom, Piaggio, Polaris, Ridley, Roehr, Royal Enfield, Suzuki, Triumph, Ural, Vespa, Victory, Yamaha and Zero.
pchast
Site Supporter - Silver
Posts: 633
Joined: Tue Sep 02, 2008 1:04 pm
Real Name: Pete
Sex: Male
Years Riding: 10
My Motorcycle: 1980 Suzuki GS550L, 2019 Zero DSR
Location: Athens, NY

#2 Post by pchast » Tue Jul 15, 2025 3:58 pm

That's a difficult decision.
One of the expressed reasons for keeping the historical data is to help newbies.

Perhaps a two-tiered approach?
2019 Zero DSR, 1980 Suzuki GS550L
totalmotorcycle

#3 Post by totalmotorcycle » Wed Jul 16, 2025 9:06 am

pchast wrote (Tue Jul 15, 2025 3:58 pm):
"That's a difficult decision.
One of the expressed reasons for the historical data is to help newbies.

Perhaps a 2 tiered approach?"

I'm 100% with your thinking there. I really, really don't want to hurt the forums, and I feel the information they contain is valuable to the riding community. Plus, we have a great community, IMO.

What I've done right now is:

1. Restricted guest access to the forum. The majority of the record number of guests (Most users ever online was 289791 on June 27th, 2025, 2:02 am) were bots and AI LLM scrapers. So right now, guests get a "teaser" of 2 forums and won't see the other forums until they log in. Bots and AI LLMs don't log in, as they don't have accounts.
2. Banned all bots from seeing ANY messages in the forums. What they can't see, they can't scrape. But only the GOOD bots will respect that. These are currently the most common bots visiting the forums:

Bing [Bot]
Amazon [Bot]
Semrush [Bot]
Google Adsense [Bot]
Google [Bot]
Ahrefs [Bot]
Majestic-12 [Bot]
Google Feedfetcher
YaCy [Bot]
AdsBot [Google]
Yahoo [Bot]
DuckDuckGo [Bot]
Baidu [Spider]
MSNbot Media
Ask Jeeves [Bot]
Alexa [Bot]
Exabot [Bot]
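
To illustrate why only the "good" bots are affected by a ban like this: a polite crawler checks the site's published rules before fetching a page, and a rude one simply doesn't. Below is a minimal sketch using Python's standard `urllib.robotparser`. It assumes the ban is expressed as robots.txt rules, which may not be how the forum software actually does it; the rules, the `ExampleAIBot` name, and the `may_fetch` helper are all made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules: shut out one (made-up) AI crawler entirely,
# and keep every other crawler away from the topic pages.
RULES = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /viewtopic.php
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def may_fetch(user_agent: str, path: str) -> bool:
    """What a well-behaved crawler would ask itself before requesting a page."""
    return parser.can_fetch(user_agent, path)


print(may_fetch("ExampleAIBot", "/index.php"))    # blocked everywhere
print(may_fetch("Bingbot", "/viewtopic.php?t=1")) # blocked from topics
print(may_fetch("Bingbot", "/index.php"))         # allowed
```

Nothing forces a scraper to run a check like this, which is why rules of this kind only stop the bots that choose to respect them.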

Now I'm actively watching the guest numbers, the bandwidth and new spam registrations to see the effect of those changes.

Overall, this whole situation makes me unhappy, and I'm not amused with AI LLMs right now.
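
As a rough idea of how the bandwidth side could be watched, here is a small Python sketch that tallies what share of bytes served went to bots. It assumes the web server writes access logs in the common "combined" format and that bots can be spotted by words like "bot" or "spider" in the user-agent string; both are assumptions, and the sample log lines and names are invented.

```python
import re

# Matches the tail of a combined-format log line:
# "request" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'"[^"]*" (?P<status>\d{3}) (?P<size>\d+|-) "[^"]*" "(?P<agent>[^"]*)"'
)
BOT_MARKERS = ("bot", "spider", "crawler")  # crude, assumed markers


def bot_bandwidth_share(lines):
    """Return the fraction of response bytes sent to bot-like user agents."""
    bot_bytes = total_bytes = 0
    for line in lines:
        m = LOG_LINE.search(line)
        if m is None:
            continue  # skip lines that are not in the expected format
        size = 0 if m["size"] == "-" else int(m["size"])
        total_bytes += size
        if any(marker in m["agent"].lower() for marker in BOT_MARKERS):
            bot_bytes += size
    return bot_bytes / total_bytes if total_bytes else 0.0


# Invented sample lines, for illustration only.
SAMPLE = [
    '203.0.113.7 - - [15/Jul/2025:01:34:00 +0000] "GET /viewtopic.php?t=1 HTTP/1.1" 200 3000 "-" "ExampleAIBot/1.0"',
    '198.51.100.4 - - [15/Jul/2025:01:35:00 +0000] "GET /index.php HTTP/1.1" 200 1000 "-" "Mozilla/5.0 (Windows NT 10.0)"',
]
print(bot_bandwidth_share(SAMPLE))  # 0.75
```

Scrapers that pretend to be ordinary browsers would slip past the marker check, so a number like this is a lower bound at best.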

Mike
  • All times are UTC-11:00