A modest proposal to fix web search

The problem with search these days is that Google has begun to produce absolute garbage results.

To try it, just search for something simple like “css center element”. On a fresh session of Tor Browser, here are my top 10 items on Google:

  1. (https://www.w3schools.com) CSS Layout - Horizontal & Vertical Align - W3Schools
  2. (https://www.freecodecamp.org) How to Center Anything with CSS - Align a Div, Text, and More
  3. (https://www.w3.org) CSS: centering things - World Wide Web Consortium (W3C)
  4. People also ask
  5. (https://developer.mozilla.org) Center an element - CSS: Cascading Style Sheets - MDN Web …
  6. (https://stackoverflow.com) How to horizontally center an element - Stack Overflow
  7. (https://blog.hubspot.com) 11 Ways to Center Div or Text in Div in CSS - HubSpot Blog
  8. (https://css-tricks.com) Centering in CSS: A Complete Guide | CSS-Tricks
  9. (https://dev.to) 3 Ways to Center an Element in CSS - DEV Community
  10. (https://blog.logrocket.com) 13 ways to vertically center HTML elements with CSS

It gets even worse on harder queries. Bing/DDG is slightly better (MDN is the first result and W3 the second, but the instant answer is unfortunately w3schools), but their image search is still crap. Yandex has great image search for some reason, but their (English?) text search is unimpressive. To see the real horror, try some really high CPC keywords on your phone, and see how the entire first page is just ads.

I presume the reason for this is that they realized they would make more money absolutely whoring out their search engine than trying to give people good results. At a certain point, you reach the end of the growth stage, and all that remains is extracting as much profit as you can from the decaying carcass of your once-great enterprise.

But the root cause of it all is these heavily SEO’d sites, right? It goes something like:

  1. Make a shitty site with information (tutorialspoint, cyberciti, etc)
  2. SEO it to hell
  3. Site places above tldr pages, Arch Wiki, etc
  4. Earn ad revenue

So if the root cause is the profit motive, why can’t you just filter out the sites with big profit?

If you load a site in headless Firefox, you can obviously measure how many ads it has. For example, load it once with and once without uBlock, and then compare e.g. loading times. There are also other things you could measure.

Then, you could get a score of how plagued the site is. Combine this with the ordinary search rankings from e.g. DDG, Bing, YaCy, or whatever, and Bob’s your uncle.
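
To make that a bit more concrete, here is a rough Python sketch of the scoring and re-ranking step. It assumes you already have the two load times per site (the headless Firefox part is left out), and everything in it is made up for illustration: the names Measurement, plague_score and rerank, the "fraction of load time saved by uBlock" formula, and the penalty weight. Treat it as one possible shape, not a tuned ranking function.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    """Load times for one site, in milliseconds (hypothetical structure)."""
    load_ms_with_ublock: float
    load_ms_without_ublock: float


def plague_score(m: Measurement) -> float:
    """Fraction of the load time that disappears with uBlock on, in [0, 1].

    This formula is an assumption; any other ad signal could go here.
    """
    if m.load_ms_without_ublock <= 0:
        return 0.0
    saved = m.load_ms_without_ublock - m.load_ms_with_ublock
    return max(0.0, min(1.0, saved / m.load_ms_without_ublock))


def rerank(results, measurements, penalty=10.0):
    """Re-order upstream results (best first) by position + penalty * score.

    Sites without a measurement get a score of 0 and keep their position.
    """
    scored = []
    for position, site in enumerate(results):
        m = measurements.get(site)
        score = plague_score(m) if m is not None else 0.0
        scored.append((position + penalty * score, position, site))
    scored.sort()
    return [site for _, _, site in scored]


if __name__ == "__main__":
    upstream = ["a.example", "b.example", "c.example"]
    timings = {
        "a.example": Measurement(1000, 4000),  # much faster with uBlock
        "b.example": Measurement(800, 1000),   # slightly faster with uBlock
    }
    print(rerank(upstream, timings))
```

With a penalty of 10, a site that loses three quarters of its load time under uBlock drops about seven or eight places, which is enough to push it off the top of a typical first page. The weight is arbitrary and would need tuning against real data.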

Heck, you don’t even have to make a new search engine. Just make a block list for uBlock that hides links to plagued sites.
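
As a sketch of what generating such a list might look like, here is a small Python function that turns a set of plagued domains into uBlock cosmetic filters of the form host##selector. The function name make_filter_list, the choice of search hosts, and the a[href*="..."] selector are my assumptions; the selector only hides the link element itself, so hiding the whole result box would need a selector tailored to each search engine's markup.

```python
def make_filter_list(plagued_domains, search_hosts=("duckduckgo.com", "www.google.com")):
    """Build a uBlock-style filter list that hides links to the given domains.

    Each rule is a cosmetic filter scoped to one search host. The hosts and
    the selector are illustrative assumptions, not a tested block list.
    """
    lines = ["! Title: Plagued sites (example list)"]  # '!' starts a comment line
    for host in search_hosts:
        for domain in sorted(set(plagued_domains)):
            lines.append(f'{host}##a[href*="{domain}"]')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(make_filter_list(["spam.example", "ads.example"]), end="")
```

The output can be saved to a text file and added to uBlock as a custom filter list.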

I don’t see how this could fall prey to Goodhart’s law. It’s pretty hard to pretend like your site doesn’t run ads, right?