
Here is an example:

http://www.gnoosic.com/discussion/metallica__5.html

No matter how you search for the content on Google, nothing comes up:

https://www.google.com/search?q="Metallica+only+played+2+son...

DuckDuckGo has it:

https://duckduckgo.com/?q="Metallica+only+played+2+songs+fro...

I checked the Wayback Machine and the content has been on that URL continuously for over 10 years.

This is the first example of an old forum page I tried after reading the article. So I tend to think it's true. Google is discarding the "classic" web.



Anecdotally and perhaps unrelated - has anyone else noticed a decrease in the accuracy and general quality of Google search over the past 2-4 months? They've had to have been utilizing ML to 'improve' searches for some time now, but the quality of the results has decreased suddenly and inexplicably (for me).


> Anecdotally and perhaps unrelated - has anyone else noticed a decrease in the accuracy and general quality of Google search over the past 2-4 months

Yes. Not just over the past 2-4 months, but over the past five years or so.

It's become so bad that Google is no longer the most useful search engine for me.


It all started going downhill with Google's "Hummingbird" switch, to be honest. While interviewing at Google, I actually brought this up with an engineer on the search team during lunch.

He said they haven't noticed any regressions. I said I figured that would be the case but I can definitely feel the difference as a daily user.


This is indicative of a larger issue - testing is probably as difficult as solving the halting problem, i.e. code could be generated from proper tests, yet teams tend to trust their tests completely. I see high-profile websites having severe usability issues or being outright broken in ways that would be immediately caught by "interns randomly click here and there" usability tests. But these versions got deployed, probably because testing did not show any regressions.

I tend to believe that if user complaints about new problems or regressions increase over statistical noise - there is a problem.
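
A rough sketch of what "over statistical noise" could mean in practice, assuming complaint counts are tracked per week and are reasonably stable before a release. The counts and the 3-sigma cutoff below are made-up illustrative values, not anything measured:

```python
from statistics import mean, stdev

def exceeds_noise(baseline_counts, new_count, sigmas=3.0):
    """Return True if new_count is more than `sigmas` sample standard
    deviations above the mean of the baseline counts."""
    threshold = mean(baseline_counts) + sigmas * stdev(baseline_counts)
    return new_count > threshold

# Hypothetical weekly complaint counts before a release (illustrative only).
baseline = [12, 15, 11, 14, 13, 12, 16, 13]

print(exceeds_noise(baseline, 30))  # a week with 30 complaints
print(exceeds_noise(baseline, 15))  # a week with 15 complaints
```

A fixed cutoff like this is crude, but it captures the point: a jump in complaints well beyond the usual week-to-week variation deserves attention even when the test suite is green.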


> yet teams tend to trust their tests completely.

Well said. This is a big problem. We see a similar problem with the use of telemetry data as well.


I noticed the same. I've wondered for years why it happened and sometimes when I'm frustrated I try to think about it. But I am not entirely sure that the degradation in search results for me happened only in the past 5 years. Maybe, but I'm not sure.

I had no idea about Hummingbird though.


In 2009 Google was amazing for diagnosing Linux issues. I would just copy the error from the console and I'd have links to the issue tracker, a workaround and the version in which the bug was fixed. Today I get a link to some GitHub project that has nothing to do with what I'm working on and was closed as being an upstream issue.


I don't have the time, money or energy to build a specific crawler, but a Linux search engine that indexed all the major distros, packages, mailing lists, forums and issue trackers would be amazing.


> Google's "Hummingbird"

I had assumed that Google search had gone downhill because it started trying to "personalize" my search results. That wasn't a great explanation though, as I don't use a Google account.

Hummingbird seems a much more likely explanation.


Oh, they still "personalize" your search results.

I do a full clear on my web browser (cookies / offline storage / history, everything) and then open YouTube in a private browsing window and it asks me which of my two Gmail accounts I want to log in with. I'd guess it's just a combo of external IP and browser fingerprint, but it's creepy.


I know they do, and I consider this a real problem. I was just saying that personalization isn't a completely satisfactory explanation for the decline in Google search result quality. It is likely to be a factor in that, though.


I noticed the same but only on an absolute level. Compared to Bing, DDG & Co, Google is still by far the best search engine.


What would you recommend instead?


DuckDuckGo.com is my daily driver.


I love the concept of DDG and have it as my standard but still use Google (via !g) for about 30-50% of my queries. Simple queries work well in DDG (which is basically Bing) but more complicated queries only really work in Google.


Sadly I've been finding the same result. Exact searches on Google are often frustrating, but lately they've been all but impossible on DDG. It seems that all search engines (including those backing DDG) are getting on the ML train and assuming they know what I'm looking for better than I do.

I understand this being default behavior, but there really needs to be a way to disable it.


DDG has become my first stop. It gets me what I need 90% of the time.


I found the opposite. I started using DDG when I moved to Brave, but after a month I found I would go to DDG, search page after page, get frustrated, open Google, and have my result on page one or two.


I've heard others say similar things. That's simply not my experience. I wonder if it depends on the sorts of searches that we each tend to perform?


For me English has been working decently on DDG but last time I tried I had a really hard time getting decent results in other languages.


Qwant might be a good option, as it's European it should be better for searching in (some) other languages.

https://lite.qwant.com/


Personally, my impression is that for at least ~1-4 years, Google searches have returned fewer exact matches, especially when I search for an exact match of an error message, or multiple exact matches involving the same error message (when I start to become desperate I usually tend to split the error into 2 parts...).

On the other hand the non-exact hits that it returns push me from time to time in the right direction.

Having said this, I don't know of course if A) I'm too old (40) and the mindset of the younger search-people has now changed and/or B) Google just doesn't index tech forums as much as it used to and/or C) there are just fewer forum posts and/or D) my problems became more complex (don't think so), etc.

I tried (and still try from time to time) to use DDG and Bing but without success.

Does anybody else have the same impression?


Hello fellow 40 year old. I run DDG as my daily, but similarly have a hard time finding exact matches for error messages. I am suspicious, however, that I've learned how to use Google's search controls (", +, etc.) and I'm not sure they work the same on DDG. I also can't find a reference on DDG for how to control advanced searches.

So ... although I feel like we might be having the same issue, I'm not sure I'm using DDG correctly enough to say it's a problem.


Google search worked much better over 10 years ago than it does for me today, i.e. before it abandoned the what-you-search-is-what-you-get model. My once-masterful Google-fu seems to be borderline useless today. I'm not sure what happened over the years, but Google search has morphed into a completely different, less useful product, at least for me.

Any search term not wrapped in quotes can be randomly ignored today. It can inject keywords it thinks you want (but really don't). Google is great for searching modern sites like Stack Overflow, but it seems to have lost interest in servicing power users.


2-4 months? Try at least a year. Google search results are now that creamy scum that floats between an industrial wasteland and the tidal flats it was built upon during the changing of the tides.


Upwards of five years, actually. It was already declining when they decided to fuck the +WORD operator for their Facebook ripoff.


So you're also saying ever since Hummingbird [0], Google search hasn't been the same.

I agree.

[0] https://en.wikipedia.org/wiki/Google_Hummingbird


Yup, that's when I started to switch to DDG. I remember Google saying that you needed to add '+' +before +words that must be included instead of "putting them in quotes". How annoying. But even using their new operators, I couldn't get answers like I used to be able to. I already didn't like their data gathering practices by then, so gimping search for me made the transition a breeze.

Interestingly, even up to 9-12 months ago I remember people consistently saying that DDG was so much worse than Google, which I always figured was a result of user error, or a result of not caring about tracking and leveraging the Google profile. I'd been off the Google grid for a while so I couldn't really argue, but I knew that I got significantly better information from DuckDuckGo, having grown accustomed to the level of detail needed. These days I probably only use Google a handful of times a month.

The concept that they are purposefully soiling search results to add value to ads and sponsored results sounds about right, honestly. Advertisements used to be much less relevant than the results I'd get if I inputted a string of 5+ words, but nowadays I have to be careful not to accidentally click on an ad, as the results tend to be terrible, and I'd rather enter a URL into a browser than click on an ad I'm actually interested in.


I asked Jeeves about this, he picked up a Magic 8 Ball and it said, "not so good".


Yes, I'm having to go several pages deep and even then not finding anything relevant. I've started to use other search engines and Reddit to actually find useful info.


Google poured billions into their search engine for two decades to make it better. Now that they have a ridiculous amount of money and power, the search results get... objectively worse. Which brings us to the elephant in the room: what are Google's motives behind this (clearly intentional) change?

It could be something as innocent as training a new neural net or testing a buggy version of the algorithm on subsets of users. But it could also be as sinister as driving traffic to those in bed with Google, silencing opposition, or effectively whitewashing the entire internet...


> Which brings us to the elephant in the room: what are Google's motives behind this (clearly intentional) change?

They are maximising ad revenue, not the search relevance/usefulness.


I was looking forward to seeing someone else share this opinion. So Google's behavior is driving some users away; I'm wondering why others are sticking with it? I propose that in the course of professional contact we should strive to avoid use of google as a verb. Yes, I know it's not slick to say, "Perform a search using the search engine," instead of, "Google it," but it starves a mentality. I think it would disconnect the G-word from the perceived face of the internet. The whole point is that a monopoly eventually gets out of hand and starts screwing its users, to its own benefit, due to the largesse of the users. If Google is to improve itself, we the users have to force it to by ignoring it and going elsewhere. This, I think, starts by re-realizing, as a herd, that there is a choice other than the Alaughabet search engine [aka Google].


> I propose that in the course of professional contact we should strive to avoid use of google as a verb.

Stopped using "google" as a verb a long, long time ago, in favor of just saying "search". I don't think that's ever confused anyone.


One possibility is that Google hasn't gotten worse, but the spammers have come up with new techniques that Google hasn't adapted to.


Maybe bad organic results lead to more ad clicks.


Most definitely. But the folk at Google are very smart, very rich, and already run the most lucrative ad platform in the world. Wouldn't hamstringing their flagship product for the sake of a few extra $B/yr harm them in the long run as more and more users switch to other search engines? They had to have considered that and made the change anyway. What's the endgame? I don't feel it's more ad clicks.


What makes you think they wouldn't? Everything else Google seems to do is in the interest of short-term profits. Look at all the great products they've shut down simply because they weren't all that profitable.

It's my opinion that a large portion of the websites on the front page of any search (quora and pinboard anyone) are completely bought and paid for.


I think the endgame ends up very close to the same every time this sort of thing happens. A corp gets good and people like it, then it gets rich and takes on investors. When stocks and investors get involved, there is an expectation of an ever-increasing >RATE< of profit. If that rate decreases, the stock gets dropped, and if this goes on long enough, the corp is so interested in maximum profits over a shrinking timeslice that it basically takes all and gives nothing in return. That is the point when it is no longer a service, and the exodus begins. [myspace]


Maybe you had this problem before but your expectations grew faster than the technology? Can you pick something from your search history that other search engines found but Google failed to find?


Their keyboard predictions have gone from "OK" to "Amazing, we live in the future", and over the past couple years to "of course I didn't mean 'aaAAaAAnd', wtf were you thinking".

I frequently suspect they're starting to optimize more for $ than they were before, and ML just gives them more ways to make that number go up another % or so... but it often comes with impossible-to-predict and wildly inhuman edge cases. It's a pretty common trend when companies start focusing on small number increases - each A/B test shows improvement, but the product as a whole worsens and it drives people away in time.


About 2-3 months ago they basically nuked Youtube's search and recommendation. This was associated with some bad press about those features coming up with "harmful content" like unapproved radical politics & conspiracy theories. Now you basically see mostly curated front-page stuff plus some user stuff that had probably never come up in search before (e.g. a fairly common search term will come up with videos that are a decade old and only have 5k views). Maybe changes in Google search are related?


IMO, YouTube changed for the better. It used to focus on controversial and current content; now it focuses on curated and evergreen content. Exactly the kind of thing people in this thread are missing from Google Search.

Maybe some similar change is coming to Search.


Yep, I've noticed a lot more commercial results than before. To find something relevant I often have to dig deep, especially if what I'm looking for is a little bit obscure. I'm glad you mentioned it.


Yes! This morning I was not finding exactly what I was looking for in DDG and fell back to Google, and the results were quite noticeably worse.

To me they started spiraling down when they started to give too much power to designers. Form over content is a terrible idea for a search engine ...


Google Images is a partial example, but this happened a while ago. It appears what they do is use ML to classify what is in the image, and then show images that fit those categories. It is useless now for checking things like, did this logo designer you hired off Upwork/Fiverr/etc just steal someone else's design.

Aspiring science fiction authors, or Neal Stephenson, should write a novel about a world where ML tuned models optimize everything to be just good enough not to churn customers while maximizing margins. (Also applicable to non-profit items like politicians and universities)


Google Images still checks for exact matches. The ML stuff is an extra.


Can you give an example from your search history? How can you quantify that results got worse?


I've had this problem recently. I can craft a search for something just slightly obscure and specific that should, nonetheless, have had plenty of hits on the "old web", let alone now on the many-times-larger web. But "no pages found". Loosen up the search and it's nothing but Google-friendly blogspam that isn't remotely related to what I'm trying to find. I call bullshit.


Loosen it up? You mean google didn't automatically remove your keywords for you?


Heh, oh yeah, tons of that, usually the ones most relevant to narrowing the search beyond "everything on the Web". Thanks, Google.

So then I do the quotes thing, especially quoting phrases that 100% for sure must exist on some web pages, along with all my other keywords and pretty soon I'm at "no pages found". Pull back just a little, and it's page after page of entirely unrelated-to-what-I-want blogspam.


https://www.google.com/search?q=site%3Awww.gnoosic.com+Metal...

Looks like only page 6 is indexed for some reason. The site owner would be able to check the webmaster tools on Google to see why.


Search console isn't really helpful in many cases. Unless there's an error, it'll probably say "crawled but not indexed", which gives you no idea why they didn't include it.


There are 3 parties:

the end user searching for the content

the webmaster or author of the content

the search provider

If I'm searching for something that I know exists and I can't find it, there is no excuse. The search provider failed to do its job.

There is no "but the webmaster should have done this and that". For all we know he was hit by a bus 10 years ago, and we should be happy the content is still available.

A good search provider would link a vanished website to archive org if the content is exactly what the customer wanted.

Long, long ago, when posting interesting links in comments didn't trigger commercial hysteria, people would cite bits of text and have a link to the full text. Later this became simply citing a chunk of text. I used to drop a few lines from the citation into the search engine and find the original work.

Just look!

https://www.google.com/search?q=Looks+like+only+page+6+is+in....

As I'm writing this there are exactly 45 search results above the one that should have been displayed.

There is no excuse, like HN not ranking high enough: they did index the page, and the other results don't match the query better.

If we do this with 4 exact lines from a less popular site it will end up some place on page 20 of the search results.

Another example, I really don't care for indexing but here is an article that I always (jokingly) refer to as my greatest work.

The exact title:

https://www.google.com/search?q=%22the+wrath+of+the+book%22

A really weird result. Safe to say nothing matching is there.

The first many words from the text:

https://www.google.com/search?q=I+think+someone+%28you%29+sh...

It doesn't find it.

Then we check if it is even indexed...

https://www.google.com/search?q=http%3A%2F%2Fblog.go-here.nl...

And there it is! Why does it even crawl the page?

It also lists websites that have the number 8616 on them and ones with both the word "blog" and "here" in the text.

I'm not supposed to laugh?


Probably because the site is not https, and Google rank includes https: https://www.sangfroidwebdesign.com/search-engine-optimizatio...


Page 5 doesn't seem to be indexed. Everything on the other pages can be found with Google.

You can restrict the search to the site with "site:...": https://www.google.com/search?q="metallica+only+played+2+son...

It doesn't find page 5 with these terms but finds page 6.

There is probably an issue within page 5 itself.
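
For anyone who wants to repeat this kind of check, here is a minimal sketch of building an exact-phrase query restricted to one site. The helper name `build_search_url` is made up for illustration, and the phrase is shortened because the full one is truncated in the links above:

```python
from urllib.parse import urlencode

def build_search_url(phrase, site=None, base="https://www.google.com/search"):
    """Build a search URL for an exact phrase, optionally limited to one site.

    `build_search_url` is a hypothetical helper, not part of any search API.
    """
    query = f'"{phrase}"'                # quotes request an exact-phrase match
    if site:
        query = f"site:{site} {query}"   # site: restricts results to one host
    return f"{base}?{urlencode({'q': query})}"

url = build_search_url("Metallica only played", site="www.gnoosic.com")
print(url)
# https://www.google.com/search?q=site%3Awww.gnoosic.com+%22Metallica+only+played%22
```

Passing a different `base`, such as the DuckDuckGo search URL from the top of the thread, makes it easy to compare engines on the same query.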


From what I can tell, there are no links anywhere on the site to that particular page; you have to know the exact term and search for it: http://www.gnoosic.com/discussion/

How is Google supposed to find that out?!


That's a good point. Googlebot probably wouldn't try out combinations in the search box, so unless the site owner provides a sitemap Google wouldn't know about entries.
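
As a rough illustration of the sitemap idea, a site owner could generate a minimal sitemap listing pages that nothing links to. This is only a sketch using the sitemaps.org XML format; the page 5 URL is the one from the top of the thread, while the page 6 URL is an assumption based on the same naming pattern:

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls):
    """Return a minimal sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for u in urls:
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = u
    return ET.tostring(urlset, encoding="unicode")

# Page 5 appears in the thread; the page 6 URL is assumed to follow the same pattern.
pages = [f"http://www.gnoosic.com/discussion/metallica__{n}.html" for n in (5, 6)]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

The resulting file would then be referenced from robots.txt or submitted through the search engine's webmaster tools, so crawlers can discover pages that are otherwise only reachable through the site's search box.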


How are you determining nothing links this? Google?


No, I'm browsing the site and I'm unable to find any links to those band-specific pages.


Seeing Metallica on HN makes me feel much more welcome :)


Well, search may have bugs (or undocumented features) too. I have googled content from other pages on this site (related to Metallica). Page 6 for example: https://www.google.com/search?q=%22Listen+up+you+fags+metall...


I don't think the article's premise is that Google axed all content older than 5 years or so, but that it gradually discards old unique content.

Which goes against the original mission of Google to "organize the world's information and make it universally accessible".

A "bug" could be an option, but I don't expect that to be the reason. It's too easy to find examples of forgotten content. And I don't think a bug of that magnitude in Googles core business would go unnoticed.


>> Googles core business

Which core business are you referring to?


Search. It still forms a major part of their business, through direct ad revenue but also by redirecting traffic to other Google products (e.g. Maps, YouTube).


Interestingly, even though DuckDuckGo finds the post, Bing doesn't seem to.



This is reminding me of the meta search engines that consolidated results from multiple sources. I haven't used one of those in probably 15 years.


Not showing up here. Not even if I add quotes.


I tried with quotes and it doesn't show up, ironically. It must be without quotes.

It forced me to solve a bunch of CAPTCHAs too.


I do see it on Bing.

Also on the "Million Short" search engine mentioned by kickscondor:

https://millionshort.com/search?keywords=%22Metallica%20only...

I never saw that one. Do they have their own crawler?


Thanks a lot for the example!



