MyBB Community Forums

Full Version: SEO causing resource faults
Over the last few months our website has had a resource usage fault every day at 12:10am and again around 3:30am. This of course caused the site to be restricted for 24 hours. Our attention has been focused on SemrushBot, which at one point seemed to be the culprit.

We took many actions to block this bot, but even after asking nicely it will not relent and continues to pound away at the site to gain something from our meager 33 registered members.

But in spite of its unwelcome persistence, I found the following txt files in the MyBB root directory:

htaccess.txt  and  htaccess-nginx.txt

After I renamed them, the resource faults ceased.

I get that most people want SEO optimization, but our forum is for local users and is of no interest to the world at large.

Having said that, it would have been nice to be able to disable this feature when it is not needed.

regards

https://www.gbarc.ca/ForumBB/index.php
MyBB doesn't install these as active files by default. It ships them as txt files that whoever installs the forum can deal with further: htaccess.txt for Apache, htaccess-nginx.txt for Nginx. I'm no expert, but I don't think you want them both active.

Installation docs have more information:

https://docs.mybb.com/1.8/install/
Those text files exist for your reference. There is no need to upload them, although doing so anyway should not cause any harm, as text files have no function whatsoever.

For Apache the file has to be named .htaccess (starting with a dot and with no .txt extension). nginx (if you are using nginx at all, instead of Apache) doesn't use htaccess, so you have to put the rewrite rules into your nginx configuration file, which is usually somewhere in /etc/nginx/... - it could be nginx.conf, or sites-available/yoursite.conf, or something else. There is no standard name for this file as it depends on your distribution and site configuration.

If your site can be taken down for 24 hours just by hammering it with a bot, you might have to take a closer look at your access logs (what requests are those bots sending exactly) and/or maybe switch to a more reliable hosting solution. It makes no sense for plain text files to cause faults. It's usually PHP and database queries that cause too much load for cheap shared hosts.
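To make the access log suggestion concrete, here is a minimal Python sketch that counts requests per User-Agent. It assumes the log uses the common "combined" format, which your server may or may not be configured for. The sample lines, IP addresses and the names count_by_agent and SAMPLE_LINES are made up for illustration. In practice you would feed it the lines of your real access log instead.

```python
import re
from collections import Counter

# Assumed "combined" log format: the last quoted field is the User-Agent.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def count_by_agent(lines):
    """Return a Counter mapping each User-Agent string to its request count."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:  # silently skip lines that are not in the assumed format
            counts[match.group("agent")] += 1
    return counts

# Invented sample data, only here so the sketch runs on its own.
BOT_UA = "Mozilla/5.0 (compatible; SemrushBot/7~bl; +http://www.semrush.com/bot.html)"
SAMPLE_LINES = [
    f'203.0.113.5 - - [20/Aug/2021:00:10:01 +0000] "GET /ForumBB/index.php HTTP/1.1" 200 5120 "-" "{BOT_UA}"',
    f'203.0.113.5 - - [20/Aug/2021:00:10:02 +0000] "GET /ForumBB/showthread.php?tid=1 HTTP/1.1" 200 7300 "-" "{BOT_UA}"',
    f'203.0.113.5 - - [20/Aug/2021:00:10:02 +0000] "GET /ForumBB/showthread.php?tid=2 HTTP/1.1" 200 6900 "-" "{BOT_UA}"',
    '198.51.100.7 - - [20/Aug/2021:14:02:10 +0000] "GET /ForumBB/index.php HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (X11; Linux x86_64) Firefox/91.0"',
]

if __name__ == "__main__":
    for agent, hits in count_by_agent(SAMPLE_LINES).most_common(5):
        print(hits, agent)
```

If one agent dominates the top of that list around the time of the faults, that is a hint worth taking to your host. If nothing stands out, the load is probably coming from somewhere other than web traffic.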
I returned the file names to their defaults last evening and at 12:10am, on schedule, a resource usage fault was logged. These files are doing something, and I've heard the "need a better host" before.

The facts are the facts:

1. The issue is with MyBB.
You're using Nginx, so neither of these files will even be used. As frostschutz said, the htaccess.txt/.htaccess file is for Apache, which you're not using, and htaccess-nginx.txt is just there for reference and doesn't do anything no matter what you rename it to. None of this is MyBB-specific; it would apply to any software running on Apache or Nginx. With the default names with .txt at the end, they're just arbitrary text files that literally don't do anything. What did you rename them to when it supposedly fixed the issue?

What are the "resource usage faults" you're actually seeing? I can see you host with Vultr, so what size server do you have? How many page loads is Semrush actually causing? MyBB is used on sites with 8-figure post counts, so if traffic from a single bot is taking the server down, it suggests a server issue. Are there any backups or anything else that run? The fact that it happens so incredibly reliably at the exact same time of day again suggests something server related.
The only app we have here is MyBB; the rest of the site is just a simple html website. Every day at about 12:15am and 3:30am a resource usage fault is logged on the apache server. www.gbarc.ca
I had trouble with SemrushBot, which doesn't follow the rules and browses too many pages too quickly.
I finally added a jail in fail2ban to ban it from my servers.
(2021-08-20, 11:53 PM)tomsttsa Wrote: The only app we have here is MyBB; the rest of the site is just a simple html website. Every day at about 12:15am and 3:30am a resource usage fault is logged on the apache server. www.gbarc.ca

The only web app you have on the server may be MyBB, but the server itself will be doing hundreds of things, the same way a computer is still doing loads of stuff behind the scenes even if you only have one program open. For something to be happening at such a specific time, there must be something specific happening. A PHP application won't suddenly use more resources at an incredibly specific time of day by itself; something will be running or excessive load will be applied. You'd need to work with your host to track down whether it's increased traffic from a particular source or some other process that's running on the server. If it is Semrush causing the issue then you'll need to block Semrush. It's not something MyBB can do anything about, as it's happening at server level.
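One way to check whether web traffic really spikes at those times is to bucket the access log by minute. The Python sketch below is only an illustration: it assumes the timestamp sits in square brackets in the usual Apache/Nginx style (for example [20/Aug/2021:00:10:01 +0000]), and the sample lines and the name requests_per_minute are invented.

```python
from collections import Counter
from datetime import datetime

def requests_per_minute(lines):
    """Count requests per HH:MM bucket, ignoring the date."""
    counts = Counter()
    for line in lines:
        start = line.find("[")
        end = line.find("]", start)
        if start == -1 or end == -1:
            continue  # no bracketed timestamp on this line
        try:
            stamp = datetime.strptime(line[start + 1:end], "%d/%b/%Y:%H:%M:%S %z")
        except ValueError:
            continue  # timestamp is not in the assumed format
        counts[stamp.strftime("%H:%M")] += 1
    return counts

# Invented sample data, only here so the sketch runs on its own.
SAMPLE_LINES = [
    '203.0.113.5 - - [20/Aug/2021:00:10:01 +0000] "GET /ForumBB/index.php HTTP/1.1" 200 5120',
    '203.0.113.5 - - [20/Aug/2021:00:10:17 +0000] "GET /ForumBB/index.php HTTP/1.1" 200 5120',
    '203.0.113.5 - - [21/Aug/2021:00:10:40 +0000] "GET /ForumBB/index.php HTTP/1.1" 200 5120',
    '198.51.100.7 - - [20/Aug/2021:14:02:10 +0000] "GET /ForumBB/index.php HTTP/1.1" 200 5120',
    'a malformed line without a timestamp',
]

if __name__ == "__main__":
    for minute, hits in requests_per_minute(SAMPLE_LINES).most_common(3):
        print(minute, hits)
```

If the busiest minutes do not line up with the fault times, the cause is more likely a scheduled server job (backups, log rotation and so on) than a crawler.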
Semrush doesn't follow the rules in robots.txt?
https://moz.com/learn/seo/robotstxt
No. It doesn't respect the crawl-delay and often loads a lot of pages at the same time.
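For anyone who wants to check what their own robots.txt actually tells a crawler, here is a small Python sketch using the standard library's urllib.robotparser. The robots.txt content below is a made-up example, not the one from any site in this thread. Keep in mind that robots.txt is only advisory: this shows what a well-behaved crawler would read, not what a given bot will really do.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: SemrushBot
Crawl-delay: 10
Disallow: /ForumBB/

User-agent: *
Disallow:
"""

def load_rules(text):
    """Parse robots.txt text and return the parser object."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser

rules = load_rules(ROBOTS_TXT)

if __name__ == "__main__":
    # Delay (in seconds) requested from SemrushBot, or None if not set.
    print(rules.crawl_delay("SemrushBot"))
    # Whether each agent is allowed to fetch the forum index.
    print(rules.can_fetch("SemrushBot", "/ForumBB/index.php"))
    print(rules.can_fetch("SomeOtherBot", "/ForumBB/index.php"))
```

If the rules parse the way you intended and the bot still ignores them, blocking has to happen at server level, for example with something like the fail2ban jail mentioned above.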