Finding a blog post written by someone on the web is hard, especially if the author is new to publishing on the web and does not have much of an audience.
Large search engines like Google and Bing have democratized access to information, but they were not designed to help you find smaller, niche sites whose owners may not think about SEO.
This search engine lets you surf the IndieWeb, a community of people who own and make their own websites. There are other IndieWeb-adjacent sites indexed too.
We do not expect that the first result will immediately answer a query you have. Rather, we see IndieWeb Search as an exploration tool so you can find websites and posts from a variety of sites that might interest you.
IndieWeb Search provides special functionality for our wiki and the Microformats wiki. This lets us serve direct answers to queries like "what is a h card" and "what is a reply".
The IndieWeb search engine is a work in progress. The source code is available on GitHub for those who want to contribute. We are particularly interested in improving our crawl efficiency and search results, but if you see any other opportunities for improvement, we want to know about them.
The crawler behind this search engine only crawls a portion of each website in the index. This ensures that we can index content from a vast range of sources.
We hope you enjoy the IndieWeb search engine!
Pages in this search engine are indexed using a crawler with the user agent "indieweb-search". The crawler obeys robots.txt directives so you can block our crawler from looking at certain pages on your site (or your whole site) if you would like.
Yes. IndieWeb Search obeys robots.txt files. If a robots.txt file is available, it is fetched before the sitemap or any page on the site is retrieved.
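To check how a robots.txt file would apply to the "indieweb-search" user agent, a minimal sketch using Python's standard urllib.robotparser module may help. The robots.txt content and the example.com URLs are made up for illustration, and this is not necessarily how the IndieWeb Search crawler itself evaluates rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks one directory for our crawler only.
ROBOTS_TXT = """\
User-agent: indieweb-search
Disallow: /drafts/

User-agent: *
Allow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
allowed_post = parser.can_fetch("indieweb-search", "https://example.com/posts/hello")
allowed_draft = parser.can_fetch("indieweb-search", "https://example.com/drafts/wip")
print(allowed_post, allowed_draft)
```

To block the crawler from your whole site, the Disallow line for "indieweb-search" would be "/" instead of a single directory.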
We also do not index content with a valid X-Robots-Tag header.
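As a rough illustration of an X-Robots-Tag check, the sketch below treats a response as non-indexable when the header carries a "noindex" or "none" directive. Which directives IndieWeb Search actually honours is not stated here, so that choice is an assumption, as is the hypothetical function name blocks_indexing.

```python
from typing import Mapping

# Assumption: these are the directives that opt a page out of indexing.
BLOCKING_DIRECTIVES = {"noindex", "none"}


def blocks_indexing(headers: Mapping[str, str]) -> bool:
    """Return True if an X-Robots-Tag header asks crawlers not to index."""
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() != "x-robots-tag":
            continue
        directives = {part.strip().lower() for part in value.split(",")}
        if directives & BLOCKING_DIRECTIVES:
            return True
    return False


print(blocks_indexing({"X-Robots-Tag": "noindex, nofollow"}))
```

This sketch ignores directives scoped to a named crawler (for example "somebot: noindex"), which a fuller implementation would need to handle.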
We are not currently accepting individual requests to index sites outwith the IndieWeb community.
If you are a member of the community, add your domain name as an issue in the project GitHub repository and we may add you to our crawl list.
We recrawl sites and pages in our index to ensure our index is up to date.
Each day, content in certain feeds discovered by IndieWeb Search will be reindexed (once this logic is fully implemented). This helps us catch new content soon after it is published so we can serve it from our search engine.
IndieWeb Search does not support manual indexing requests. You will need to wait for us to crawl your site again before a new page is added.
You can, however, increase the chance your content is indexed or re-indexed quickly by providing a feed on your site. A feed will help us find updated content on your site.
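One common way to advertise a feed is a link element with rel="alternate" in the head of your pages. The sketch below shows how such links could be discovered with Python's standard html.parser module. Whether IndieWeb Search discovers feeds this way is an assumption, and the class name FeedLinkFinder, the list of feed types, and the example URLs are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumption: MIME types treated as feeds for this illustration.
FEED_TYPES = {"application/rss+xml", "application/atom+xml", "application/feed+json"}


class FeedLinkFinder(HTMLParser):
    """Collect feed URLs advertised through <link rel="alternate"> tags."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        mime = (attr.get("type") or "").lower()
        href = attr.get("href")
        if "alternate" in rels and mime in FEED_TYPES and href:
            # Resolve relative hrefs against the page URL.
            self.feeds.append(urljoin(self.base_url, href))


def find_feeds(html: str, base_url: str) -> list:
    finder = FeedLinkFinder(base_url)
    finder.feed(html)
    return finder.feeds


page = '<html><head><link rel="alternate" type="application/atom+xml" href="/feed.xml"></head></html>'
print(find_feeds(page, "https://example.com/"))
```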
You can also increase the chance your content is indexed or re-indexed by adding a URL that you want to be crawled into your sitemap.
Sitemaps are parsed at the beginning of a crawl. As a result, URLs placed in a sitemap are more likely to be indexed faster.
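To make the sitemap step concrete, here is a minimal sketch that extracts page URLs from a sitemap following the sitemaps.org 0.9 schema, using Python's standard xml.etree.ElementTree module. The sample sitemap and the function name sitemap_urls are illustrative, and the real crawler may parse sitemaps differently.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org 0.9 schema.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap listing two pages.
SAMPLE_SITEMAP = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/posts/new-post</loc></url>
</urlset>
"""


def sitemap_urls(xml_text: str) -> list:
    """Return every <loc> URL listed in a <urlset> sitemap."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


print(sitemap_urls(SAMPLE_SITEMAP))
```

Adding a new page to your sitemap means adding one more url element with the page address in its loc child.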
If you would like to remove your site or a page on your site from the IndieWeb Search Index, please create an issue on the project GitHub page.
Our direct answers, which resemble what you may know from other search engines as "featured snippets", rely heavily on microformats markup. If you want to improve your chance of receiving a featured snippet, we recommend investigating microformats2 and adding markup that you believe is relevant to your site.
We currently parse rel=me, h-event, h-review, h-recipe, and h-card in our direct answers.
Microformats markup is not required to receive a featured snippet but aids us in finding content that may be relevant to a search query.
The IndieWeb Search crawler currently does not crawl more than 15,000 URLs per domain.
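A per-domain limit like this can be pictured as a simple counter keyed by domain. The sketch below is only an illustration of the idea: the class name CrawlBudget and its try_claim method are hypothetical, and the actual crawler may enforce the limit in another way.

```python
from collections import Counter
from urllib.parse import urlsplit

# Limit stated for the crawler: at most 15,000 URLs per domain.
MAX_URLS_PER_DOMAIN = 15_000


class CrawlBudget:
    """Track how many URLs have been crawled for each domain."""

    def __init__(self, limit: int = MAX_URLS_PER_DOMAIN):
        self.limit = limit
        self.crawled = Counter()

    def try_claim(self, url: str) -> bool:
        """Return True and count the URL if its domain is under the limit."""
        domain = urlsplit(url).netloc.lower()
        if self.crawled[domain] >= self.limit:
            return False
        self.crawled[domain] += 1
        return True


# Use a tiny limit so the behaviour is easy to see.
budget = CrawlBudget(limit=2)
results = [budget.try_claim(f"https://example.com/page-{i}") for i in range(3)]
print(results)
```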