Please Educate Me About Robots.txt

  • Please Educate Me About Robots.txt

    Why should I use robots.txt, especially when I'm selling from my site? Can someone also please tell me why I would ever want to block search engine bots from indexing my site when I really want people to find me?

  • #2
    Well... some bots are just plain site-unfriendly, iads, and will spider your site mercilessly and eat up your account's resources. Some are just spam bots looking for emails to harvest, and some bots just don't return any traffic, so some webmasters prefer to block them. You can also use robots.txt to keep compliant bots from spidering sensitive areas of your website, such as member areas. This is an example of the robots.txt I use to allow all spiders but discourage indexing certain sections of my WordPress site.

    User-agent: *
    Disallow: /wp-admin/
    Disallow: /wp-includes/
    Disallow: /wp-content/plugins/
    Disallow: /wp-content/cache/
    Disallow: /wp-content/themes/
    Disallow: /wp-login.php
    Disallow: /wp-register.php
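
To see what those rules mean for a well-behaved crawler, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules are copied from the post above; the bot name `ExampleBot`, the helper `is_allowed`, and the sample paths are assumptions made up for illustration. It only reports what the file asks for, it cannot force any bot to comply.

```python
from urllib.robotparser import RobotFileParser

# The rules from the post above, copied verbatim.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /wp-content/cache/
Disallow: /wp-content/themes/
Disallow: /wp-login.php
Disallow: /wp-register.php
"""


def is_allowed(path, user_agent="ExampleBot"):
    """Return True if a compliant bot may fetch `path` under ROBOTS_TXT.

    "ExampleBot" is a hypothetical crawler name; the "*" group applies to it.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Hypothetical paths: a blog post, an upload, and two blocked areas.
    for path in ("/blog/hello-world/", "/wp-content/uploads/logo.png",
                 "/wp-admin/options.php", "/wp-login.php"):
        print(path, "allowed" if is_allowed(path) else "disallowed")
```

Disallow rules match by path prefix, so everything under `/wp-admin/` is covered while `/wp-content/uploads/` stays crawlable because it is not listed.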


    • #3
      Robots.txt isn't all about blocking bots; it's also about controlling them.

      If your goal is to sell from your site, then you should use robots.txt to point bots to the content you want indexed and searchable on the web. You should also block bots from the directories and pages you don't want indexed, like the admin back end, short-lived pages, etc.

      Robots.txt can also give instructions on how to crawl your site. For example, say you have a site where all the pages are dynamically generated, and a bot wants to index all of them. These bots are on very fast connections and can be threaded (requesting numerous pages at the same time), so they could easily bring your web server to its knees! You can control their crawling behaviour in robots.txt by asking them to slow down, using the crawl-delay directive.
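
As a rough sketch of the crawl-delay idea, the snippet below shows how a polite crawler might read the directive with Python's standard `urllib.robotparser`. The bot names `FastBot` and `OtherBot`, the delay values, and the helper `polite_delay` are all assumptions for illustration. Crawl-delay is a non-standard extension, so not every crawler honours it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: a stricter delay for one named bot, a default for the rest.
ROBOTS_TXT = """\
User-agent: FastBot
Crawl-delay: 30

User-agent: *
Crawl-delay: 10
"""


def polite_delay(robots_txt, user_agent, default=1.0):
    """Return the seconds a compliant bot should wait between requests.

    Falls back to `default` (an assumed value) when the file sets no
    Crawl-delay for this user agent.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    delay = parser.crawl_delay(user_agent)
    return float(delay) if delay is not None else default


if __name__ == "__main__":
    for bot in ("FastBot", "OtherBot"):
        print(bot, "should wait", polite_delay(ROBOTS_TXT, bot), "seconds")
```

A crawler that respects the directive would sleep for that many seconds between requests; one that ignores it will simply never look at the value.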

      It's worth noting that Googlebot does not fully obey the robots.txt file. To control its crawl rate, Google requires you to sign up for their "Webmaster Tools" service, which allows you to configure how Googlebot crawls your site.


      • clivejo
        clivejo commented
        Unfortunately, that seems to be Google’s motto of late: our way or no way!!

      • doneritehosting
        doneritehosting commented
        With my robots.txt set up as it is, how would I go about keeping Google from indexing those parts of my site then, clivejo, other than Webmaster Tools? I mean, there is no reason for them to be indexed. Can it be done from Hepsia?

      • clivejo
        clivejo commented
        Googlebot does in most cases obey the Disallow rules, but it won’t obey directives for delays or for controlling how it crawls your site. I had problems before with Googlebot indexing a wiki-based website and overloading it; I added crawl-delay directives, but they didn’t work. The only way is to sign up for their Webmaster Tools or disallow the bot entirely.
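
For the "disallow the bot entirely" option, a minimal sketch might look like the following, again checked with Python's standard `urllib.robotparser`. The rule set and the helper `can_crawl` are assumptions for illustration, not anyone's actual configuration, and `OtherBot` is a made-up crawler name. The parser only reports what the file requests; whether a given crawler complies is up to that crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: shut Googlebot out completely, keep the admin area
# off-limits for every other bot.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow: /wp-admin/
"""


def can_crawl(user_agent, path):
    """Return True if ROBOTS_TXT lets `user_agent` fetch `path`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # "/blog/" is a sample path chosen for illustration.
    for bot in ("Googlebot", "OtherBot"):
        print(bot, "may crawl /blog/:", can_crawl(bot, "/blog/"))
```

A named group takes precedence over the `*` group for that bot, so blocking one crawler this way leaves the rules for everyone else untouched.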

    • #4
      Great info, thanks guys...


    • #5
      As I re-read this it came to me that I must have never had enough traffic to ever worry about this... How sad...


      • doneritehosting
        doneritehosting commented
        Bot traffic is easy to get, lmao!!!!! For real though, iads, one inconsiderate bot (spider) can drag a server to its electrical knees...

    • #6
      With all of Google's tools you can get a good idea of how people interact with your site. Install Google Analytics to track where your visitors are landing and the sequence of pages they visit. You might find that they are getting bored or are not able to find the information they are looking for. Landing pages and exit pages can give you a lot of feedback.


      • iads
        iads commented
        I tried Google Analytics, but I didn't like it. I use a simple, easy analytics site called StatCounter. Google Analytics works well if you advertise a lot with Google, but other than SEO I don't deal with Google anymore. They cost way too much for the best hosting-related keywords. I'm not going to pay $50 for one sale; the markup is not that good.

      • doneritehosting
        doneritehosting commented
        Right on there, iads. The cost to run the keyword "reseller hosting" effectively on Google AdWords for one month would be staggering. Marketing on Google is expensive for good keywords in any niche. I use StatCounter too, with another reseller-hosting gig.

      • iads
        iads commented
        I get your point dorite, but I never identify myself as a reseller to my clients... just saying... :-)

    • #7
      Please also keep in mind that not all search engines will obey the rules in robots.txt. Some of them ignore it entirely.
