Self-hosted web searches with LibreY and YaCy
How many web searches do you perform every day? 10, 50, 100? Which search engine are you using? Are you logged in, for example, to your Google account while you search? Does your browser protect you against identification by fingerprinting, and are you using a VPN to hide your location?
You should know that commercial web search providers probably get more information from your search input than you get back as results. Additionally, if you search as an identified user, the results might already be adjusted to your shopping interests, political views, etc. At the end of the day, these providers make their living by selling your data (Surprise!). What can you do?
Of course, you should use a privacy-oriented browser and a VPN. I mainly use Firefox and Brave, for different purposes, together with the Surfshark VPN.
However, the weak point is often the search engine. If you're still using Google Search, or even worse, using it while logged in to your profile, nobody can help you. At the very least, you should switch your default search engine to, for example, DuckDuckGo, which helps hide your identity and prevent tracking. Furthermore, you can scope a search with a so-called bang: for example, “!gsc famous professor” forwards the query “famous professor” to Google Scholar.
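To make the scoped search above a bit more concrete, here is a minimal Python sketch that builds the DuckDuckGo URL for such a query. The helper name `bang_url` is made up for this illustration; the only things taken from the text are the `!gsc` prefix and the example query, and the sketch assumes the usual `q` query parameter of the DuckDuckGo search page.

```python
from urllib.parse import urlencode

DDG_BASE = "https://duckduckgo.com/"  # public DuckDuckGo search page


def bang_url(bang: str, terms: str) -> str:
    """Build a DuckDuckGo search URL whose query starts with a bang.

    `bang_url` is a hypothetical helper for illustration only.
    The bang may be passed with or without the leading '!'.
    """
    query = f"!{bang.lstrip('!')} {terms}"
    # urlencode percent-encodes the '!' and turns spaces into '+'
    return DDG_BASE + "?" + urlencode({"q": query})


if __name__ == "__main__":
    print(bang_url("gsc", "famous professor"))
```

Pasting the printed URL into a browser should behave like typing “!gsc famous professor” into the DuckDuckGo search box.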
If you're even more paranoid, you can use a self-hosted web search engine. For my own and community use, I'm hosting two web search engines:
- The first, https://libre-find.online/, is a LibreY instance, which gives you results from Brave Search, DuckDuckGo, Ecosia, Google, Mojeek, and Yandex Search. Additionally, you can search for images, videos, torrents, and maps. You can also connect to Libre-Find.Online with a Tor-compatible browser such as the Tor Browser or Brave, using the address http://ug3lz3wdjxljv5fxyoicgcugmcsuym5e4zgxggotjdscdrdwid7s2mqd.onion/.
- The second search engine I'm self-hosting is a YaCy instance at https://yacy.nube-gran.de. YaCy is not only a peer-to-peer (P2P) search engine but also a web crawler, i.e., the instance can index pages that might be less interesting to commercial crawlers and search engines. Since YaCy instances connect to each other, a distributed search index is available for P2P searches. You can see how my instance is integrated into the YaCy network at https://yacy.nube-gran.de/Network.html. The name of my node is 'PRESIDENTA' in honor of the first female president of Mexico, Dr. Claudia Sheinbaum Pardo.
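If you prefer scripts over the browser, both instances can in principle be queried programmatically. The following Python sketch rests on several assumptions: that the YaCy instance exposes the `yacysearch.json` endpoint with the `query`, `maximumRecords`, and `resource` parameters, that the LibreY instance has its `api.php` endpoint (parameters `q`, `p`, `t`) enabled, and that YaCy's JSON answer is organized as `channels` containing `items`. None of this is confirmed for these two particular instances, and all function names are made up for the illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

YACY_BASE = "https://yacy.nube-gran.de"    # YaCy instance from the text
LIBREY_BASE = "https://libre-find.online"  # LibreY instance from the text


def yacy_search_url(query, max_records=10, resource="global"):
    """Build a YaCy JSON search URL.

    Assumption: the instance serves /yacysearch.json; resource="global"
    asks the P2P network, resource="local" only the node's own index.
    """
    params = {"query": query, "maximumRecords": max_records, "resource": resource}
    return f"{YACY_BASE}/yacysearch.json?{urlencode(params)}"


def librey_search_url(query, page=0, result_type=0):
    """Build a LibreY API URL.

    Assumption: api.php is enabled on the instance; q is the query,
    p the page number, and t the result type (0 is assumed to be text).
    """
    params = {"q": query, "p": page, "t": result_type}
    return f"{LIBREY_BASE}/api.php?{urlencode(params)}"


def extract_links(payload):
    """Collect (title, link) pairs from a YaCy-style JSON answer.

    Assumption: the answer looks like {"channels": [{"items": [...]}]};
    missing keys are tolerated and simply yield fewer results.
    """
    links = []
    for channel in payload.get("channels", []):
        for item in channel.get("items", []):
            links.append((item.get("title", ""), item.get("link", "")))
    return links


def fetch_json(url, timeout=10):
    """Download a URL and parse the body as JSON (performs a network request)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

A call such as `extract_links(fetch_json(yacy_search_url("self-hosted search", max_records=5)))` would then return a list of title and link pairs, provided the endpoint is reachable and answers in the assumed format. Please keep the request rate low; these are small community instances.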
Good luck with your private web searches!