Add file "robots.txt"
"robots.txt" is a file which allows the website owner to ask bots, crawlers, and scrapers not to access some or all of their website. Compliance is voluntary, so it only deters crawlers that choose to honour it.
parent df8c88fad4
commit 94de9053e6
robots.txt (Normal file, 43 lines added)
@@ -0,0 +1,43 @@
# Inferencium - Website - robots.txt
# Version: 1.0.0-beta.1

# Copyright 2024 Jake Winters
# SPDX-License-Identifier: BSD-3-Clause


# ChatGPT
User-agent: ChatGPT-User
Disallow: /

User-agent: GPTbot
Disallow: /


# Google Bard
User-agent: Google-Extended
Disallow: /


# iThenticate (http://www.slysearch.com/)
## A tool which crawls the internet in search of copyright and intellectual property violations
## which may be of interest to clients. These tools have no right to scan my website for such
## purposes.
User-agent: SlySearch
Disallow: /


# NameProtect (http://www.nameprotect.com/botinfo.html)
## A tool which crawls the internet in search of brand and intellectual property violations which
## may be of interest to clients. These tools have no right to scan my website for such purposes.
User-agent: NPBot
Disallow: /


# Turnitinbot (http://www.turnitin.com/robot/crawlerinfo.html)
## A tool to scan the internet to allow educational institutions to compare content against
## students' work in order to prevent plagiarism. These tools set a bad precedent for
## open-source content, as it may be marked as copyrighted/plagiarised when it's actually legally
## available for use under the copyright holder's license. I allow complete usage of my content for
## educational purposes, without exception.
User-agent: Turnitinbot
Disallow: /
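
One way to sanity-check the rules in this file is to feed them to a robots.txt parser and ask what a compliant crawler would conclude. The sketch below uses Python's standard-library `urllib.robotparser`. The abbreviated rule text, the `is_blocked` helper, the `https://example.com/` URL, and the `ExampleCrawler` agent name are illustrative assumptions and not part of the commit. It shows only how a well-behaved crawler would interpret the file, not anything the server enforces.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated copy of the rules from this commit (comment lines omitted).
ROBOTS_TXT = """\
User-agent: ChatGPT-User
Disallow: /

User-agent: GPTbot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: SlySearch
Disallow: /

User-agent: NPBot
Disallow: /

User-agent: Turnitinbot
Disallow: /
"""


def is_blocked(user_agent: str, url: str = "https://example.com/") -> bool:
    """Hypothetical helper: True if a compliant crawler would skip `url`."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network access is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    agents = (
        "ChatGPT-User",
        "GPTBot",
        "Google-Extended",
        "SlySearch",
        "NPBot",
        "TurnitinBot",
        "ExampleCrawler",  # assumed name for a crawler the file does not list
    )
    for agent in agents:
        print(agent, "blocked" if is_blocked(agent) else "allowed")
```

This parser compares user-agent names case-insensitively, so the `GPTbot` and `Turnitinbot` spellings in the file still match crawlers that identify themselves as `GPTBot` or `TurnitinBot`. Agents the file does not list are allowed, because it defines no `User-agent: *` group.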