Not sure how to use your robots.txt file? Believe it or not, it’s one of the most important files in terms of SEO. You use it to tell search engine crawlers which sections of your site they may access and which they should stay out of. For example, you don’t need the wp-admin directory to be crawled by search engines because it’s intended for internal use only. Robots.txt is a plain text (.txt) file that should be placed in the root directory of your site on the server, so that it’s reachable at /robots.txt right after your domain name. The file has to be named exactly robots.txt. Otherwise it won’t work.
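To make the “root directory” point concrete, here’s a minimal Python sketch that works out where crawlers will look for the file, given any page on a site. The function name robots_url and the example URL are just assumptions for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, swap in /robots.txt, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("http://example.com/blog/some-post/?p=1"))
```

Whatever page you start from, the result always points to the top level of the domain, which is why a robots.txt placed in a subfolder is ignored.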
Virtual Robots.txt File on WordPress
WordPress uses a virtual robots.txt file. That means you won’t find it on your server if you go looking for it to edit it, because WordPress generates it dynamically each time /robots.txt is requested. Though it’s visible if you add /robots.txt to your site URL, you won’t see it in an FTP client such as CuteFTP, FileZilla or Cyberduck.
How to Edit Robots.txt with WordPress
If you want to edit your robots.txt file manually, you can install the WP Robots.txt plugin. It lets you edit the file right in your WordPress dashboard. So, let’s install the plugin and see how it works.
How to Install the WP Robots.txt Plugin
- While in your WordPress dashboard, go to Plugins and select Add New.
- Type in WP Robots.txt in the Search text box and hit the Search Plugins button.
- Having found the plugin, just click the Install Now link. You should get a pop-up window asking whether you really want to install the plugin. Just click OK.
- Now click the Activate Plugin option.
- At this point, you can just expand the Settings drop-down menu and select Reading from there.
- Now just find the Robots.txt Content text field. The field contains the content of your robots.txt file.
Content of Your WordPress Robots.txt File
You should see something like this by default:
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
So, the code above just prohibits all crawlers from seeing the /wp-admin/ and /wp-includes/ directories on your server.
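If you’d like to check how a crawler would read these rules, here’s a small sketch using urllib.robotparser from Python’s standard library. The example.com URLs and the build_parser helper are assumptions for the demo, and real crawlers may interpret some rules differently from this parser.

```python
from urllib.robotparser import RobotFileParser

DEFAULT_RULES = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
"""


def build_parser(rules: str) -> RobotFileParser:
    """Parse robots.txt text held in a string (no network access needed)."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser


parser = build_parser(DEFAULT_RULES)
# Anything under /wp-admin/ is off limits for every crawler.
print(parser.can_fetch("Googlebot", "http://example.com/wp-admin/options.php"))
# A regular post is not covered by any rule, so it can be crawled.
print(parser.can_fetch("Googlebot", "http://example.com/hello-world/"))
```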
Improved Content for Your Robots.txt
Though the default settings are workable as well, it’s a common WordPress SEO practice to modify them a bit so that your robots.txt file looks as follows:
User-agent: *
Disallow: /feed/
Disallow: /trackback/
Disallow: /wp-admin/
Disallow: /wp-content/
Disallow: /wp-includes/
Disallow: /xmlrpc.php
Disallow: /wp-
Allow: /wp-content/uploads/
Sitemap: http://example.com/sitemap.xml
The first line shows exactly which crawling robot or robots you want to target. An * means that you target all robots. In other words, you’re saying, “Hey, all of you search robots, act as follows”.
Alternatively, you can point to specific crawlers such as Googlebot, Rogerbot, etc. You’ll want to do that if the * rules don’t seem to be picked up by a particular crawler for some reason or another.
I personally had such an issue with Rogerbot. You may also need to target it explicitly. So, instead of:
User-agent: *
You could target the Moz Rogerbot specifically:
User-agent: Rogerbot
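Here’s a rough Python sketch of how per-crawler groups behave, again with the standard urllib.robotparser module. The rules below are made up for the demo (the extra /trackback/ line for Rogerbot is purely hypothetical), and the point to notice is that a crawler with its own group follows only that group, not the * group as well.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file: one group for everybody, one just for Rogerbot.
RULES = """\
User-agent: *
Disallow: /wp-admin/

User-agent: Rogerbot
Disallow: /wp-admin/
Disallow: /trackback/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# Rogerbot matches its own group, so /trackback/ is blocked for it.
print(parser.can_fetch("Rogerbot", "http://example.com/trackback/"))
# Other crawlers fall back to the * group, which does not mention /trackback/.
print(parser.can_fetch("Googlebot", "http://example.com/trackback/"))
```

That’s why the Rogerbot group repeats the /wp-admin/ line: rules from the * group are not inherited.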
Most of the lines in the improved file just disallow access to the specified paths (/trackback/, /wp-admin/, etc.) because the content behind them is of no interest to either your site visitors or search engines.
Disallow: /feed/
Disallow: /trackback/
Disallow: /wp-admin/
Disallow: /wp-content/
Disallow: /wp-includes/
Disallow: /xmlrpc.php
Disallow: /wp-
Since you want to have the option to rank in search engines with the content that is in the uploads directory (such as images), the second-to-last line allows access to the /wp-content/uploads/ directory.
Allow: /wp-content/uploads/
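You may wonder how the Allow line can win when /wp-content/ is disallowed a few lines earlier. Google documents that the most specific (longest) matching rule takes precedence, with Allow winning a tie. Here’s a minimal sketch of that logic in Python. The is_allowed helper is my own illustration, not any crawler’s actual code, and it ignores wildcards. Keep in mind that simpler parsers (Python’s own urllib.robotparser, for one) stop at the first matching line in file order, so they would read the same file differently.

```python
# The Disallow/Allow lines of the improved file, in file order.
RULES = [
    ("Disallow", "/feed/"),
    ("Disallow", "/trackback/"),
    ("Disallow", "/wp-admin/"),
    ("Disallow", "/wp-content/"),
    ("Disallow", "/wp-includes/"),
    ("Disallow", "/xmlrpc.php"),
    ("Disallow", "/wp-"),
    ("Allow", "/wp-content/uploads/"),
]


def is_allowed(path: str, rules: list[tuple[str, str]]) -> bool:
    """Longest-match evaluation: the longest matching prefix decides."""
    best_len = -1
    allowed = True  # no matching rule means the path may be crawled
    for directive, prefix in rules:
        if prefix and path.startswith(prefix):
            is_allow = directive.lower() == "allow"
            # A longer prefix always wins; on equal length, Allow wins.
            if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
                best_len = len(prefix)
                allowed = is_allow
    return allowed


print(is_allowed("/wp-content/uploads/2014/photo.png", RULES))
print(is_allowed("/wp-content/plugins/some-plugin.php", RULES))
```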
And the last line of the file just points to the location of your sitemap.xml file, which Google and other search engines use to crawl your site properly.
Sitemap: http://example.com/sitemap.xml
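If you want to double-check that the Sitemap line is picked up, a quick sketch with Python’s urllib.robotparser can list it. This assumes Python 3.8 or newer, where the site_maps() method is available, and the short rule set here is trimmed down for the demo.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: *
Disallow: /wp-admin/
Sitemap: http://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# site_maps() returns a list of sitemap URLs, or None if the file has none.
print(parser.site_maps())
```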
Dangerous File
Improper configuration of your robots.txt file can make your site totally invisible to search engines. The worst configuration would be as follows:
Disallow: /
The code above disallows access to your entire site. So, search engines will crawl NOTHING. Just keep in mind that you don’t want that rule in your robots.txt file.
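To see how much damage that single slash does, here’s a short Python sketch comparing it with an empty Disallow line, which blocks nothing. It uses the standard urllib.robotparser module, and the can_crawl helper and example URLs are assumptions for the demo.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(rules: str, url: str) -> bool:
    """Check a URL against robots.txt text for a generic crawler."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch("*", url)


BLOCK_EVERYTHING = "User-agent: *\nDisallow: /"
BLOCK_NOTHING = "User-agent: *\nDisallow:"

# With "Disallow: /" even the homepage is off limits.
print(can_crawl(BLOCK_EVERYTHING, "http://example.com/"))
print(can_crawl(BLOCK_EVERYTHING, "http://example.com/my-best-post/"))
# An empty Disallow value means nothing is blocked.
print(can_crawl(BLOCK_NOTHING, "http://example.com/my-best-post/"))
```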
Joke for SEO Nerds
Take a look at this SEO-nerdy joke. Got it? That’s really neat! 🙂
This chick disallows everything to the guy. This joke can actually help you better understand how Disallow: / works. OK, let’s move on.
How to Fine-Tune Robots.txt Syntax
In case you want to fine-tune your robots.txt settings, you may want to know the following.
To target a specific directory, just enclose its name in slashes, for example /wp-content/.
Disallow: /wp-content/
In order to target a specific file, you just need to define a path to that file along with its name:
Disallow: /wp-content/your-file.php
You can point to all kinds of files this way:
Disallow: /wp-content/your-file.html
Disallow: /wp-content/your-file.png
Disallow: /wp-content/your-file.jpeg
Disallow: /wp-content/your-file.css
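Under the hood, a Disallow path is matched as a prefix of the URL path, which is why the slashes matter. Here’s a small Python sketch with the standard urllib.robotparser module. The is_blocked helper and the /wp-content-backup/ folder are hypothetical, just there to show the difference.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(rule_path: str, url_path: str) -> bool:
    """Return True if a single Disallow rule blocks the given URL path."""
    parser = RobotFileParser()
    parser.parse(["User-agent: *", f"Disallow: {rule_path}"])
    return not parser.can_fetch("*", "http://example.com" + url_path)


# A directory rule covers everything inside that directory.
print(is_blocked("/wp-content/", "/wp-content/themes/style.css"))
# A file rule covers that file but not its neighbors.
print(is_blocked("/wp-content/your-file.php", "/wp-content/your-file.php"))
print(is_blocked("/wp-content/your-file.php", "/wp-content/other-file.php"))
# Without the trailing slash, the rule also catches similarly named folders.
print(is_blocked("/wp-content", "/wp-content-backup/index.html"))
print(is_blocked("/wp-content/", "/wp-content-backup/index.html"))
```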
Disable Dynamic URL Indexing
Most likely, you’ll stumble upon this really widespread issue: you may need to disable dynamic URL indexing. A dynamic URL is one that contains a ? (question mark). Such URLs can cause all sorts of SEO issues (duplicate content, duplicate page titles, etc.), so you want to keep search engines away from pages with such URLs. You can easily do it with the help of robots.txt. Just add the following line:
Disallow: /*?
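In that rule, the * stands for any run of characters, so /*? matches any path that contains a question mark. Here’s a minimal hand-rolled matcher in Python to show the idea. The rule_to_regex and matches helpers are my own sketch of wildcard matching (with $ as an end-of-URL anchor), not code from any search engine.

```python
import re


def rule_to_regex(rule_path: str) -> re.Pattern:
    """Translate a robots.txt path using * (and a trailing $) into a regex."""
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    # Escape each literal piece, then join the pieces with "match anything".
    body = ".*".join(re.escape(piece) for piece in rule_path.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def matches(rule_path: str, url_path: str) -> bool:
    """Return True if the rule applies to the URL path (query included)."""
    return rule_to_regex(rule_path).match(url_path) is not None


# Dynamic URLs are caught by the rule...
print(matches("/*?", "/?p=123"))
print(matches("/*?", "/blog/some-post/?replytocom=5"))
# ...while clean permalinks are left alone.
print(matches("/*?", "/blog/some-post/"))
```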
Screencast about Robots.txt for WordPress Users
This screencast is a sample of the SEO course that I’m currently working on. The course is entitled SEO Crash Course for WordPress Users. If you want to be in the know when it’s launched, be sure to subscribe to my newsletter at the end of the post.
Useful Links
Robots.txt: the Ultimate Guide
Conclusion
You just can’t call yourself an SEO or an online marketer for that matter if you’re not comfortable with the robots.txt file because it defines how search engines see your site. Be sure to edit your robots.txt file only if you know what you’re doing. Otherwise your site may just disappear from search results and you won’t even know why.
There are lots of WordPress plugins for handling your robots.txt file. Do you know any that work better than the WP Robots.txt plugin that I covered in this post?