We’re Making The Web Worse

Hi, this is Wayne again with a topic “We’re Making The Web Worse”.
Reddit may become unsearchable. This would actually really hurt me, because legitimately the way I Google things now is to append “Reddit” to the query. I know some companies know that and try to hijack the top search results for when you put “Reddit” on the end, but those are pretty easy to spot, so it’s okay. Google has become so unusable that I actually do that. So this will suck, but Reddit is reportedly threatening to block web crawlers, including those from search engines like Google and Bing, if it cannot reach an agreement with generative AI companies to pay for data collected from the site. Sources inside Reddit report that the company believes Reddit can survive without search. I actually disagree. Me too.

Compare the amount of organic browsing that I do on Reddit with the amount of “Oh, I came across this article by searching. Oh, I’m on Reddit, and this is an interesting thread. I’ll read for a bit. Okay, goodbye.” Personally, I never just go to Reddit. I almost always end up there by accident.

I don’t know, man. Reddit, on the one hand, has weathered some serious storms and has demonstrated that they are quite resilient. On the other hand, there’s a fine line between resilient and confident, and arrogant and hubristic. Over 500 news organizations, including the New York Times, Reuters, and the Washington Post, have installed a blocker that prevents their content from being collected and used to train AI.
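
For context, the “blocker” mentioned here is presumably a robots.txt rule aimed at AI crawlers; the transcript does not say so, so treat that as an assumption. A minimal sketch of how such a rule is read, using Python’s standard urllib.robotparser. The robots.txt content and the example.com URL are made up for illustration; GPTBot and CCBot are real crawler user-agent tokens.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only: block two AI crawlers,
# allow everyone else (including ordinary search engines).
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("GPTBot", "CCBot", "Googlebot"):
        allowed = is_allowed(ROBOTS_TXT, agent, "https://example.com/article")
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Note that robots.txt is advisory: it only tells a crawler what the site would like, and a crawler that ignores the file is not technically stopped by it.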

I have bad news for them: that isn’t working. So, cool. And I guess the difference between those news organizations and Reddit is that Reddit’s management is tech-savvy enough to know that. Yeah. I don’t know what to tell you. I can’t remember what it’s called, and this isn’t a topic, but I was reading a discussion the other day about how some sites are honeypotting junk data in order to wreck the data sets that some of these AI crawlers have. Interesting.

So they’re creating a bunch of false information so that anything that crawls their stuff without permission is going to have bad results, if that makes sense. Interesting. Yeah, I think it’s a matter of time before the crawlers figure out how to work around that. I mean, as someone who was recently tasked with creating a crawler... Oh, you’re talking about me? Yeah, sure. I don’t know. Would you find a way around that? Yeah, but then it’s slower. Yes, because you have to hide where the traffic is coming from.

You have to take a more subversive approach to collecting the data. This is one of those arms-race situations where you could get a whole bunch of websites that report on the same thing to agree to put in one piece of junk, and then, if none of them have agreements with the AI companies and that piece of junk ends up showing up in results, it’s like, okay, we got you. Then you can use legal action, stuff like that. Yeah, but realistically, a lot of these crawlers are going to be coming out of places like China or Russia, where it doesn’t matter. Realistically, what are you going to do about it? Nothing. Yeah.
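
The shared “piece of junk” idea described above works like a canary token. A minimal sketch, assuming the participating sites agree on a group name and a secret; make_canary, contains_canary, the “zq-” prefix, and the example values are all hypothetical and not taken from any real tool.

```python
import hashlib


def make_canary(site_group: str, secret: str) -> str:
    """Derive a nonsense marker string that should never occur naturally."""
    digest = hashlib.sha256(f"{site_group}:{secret}".encode("utf-8")).hexdigest()
    # Truncate to 16 hex characters: short enough to embed, still very
    # unlikely to appear anywhere by accident.
    return f"zq-{digest[:16]}"


def contains_canary(text: str, canary: str) -> bool:
    """Check whether some output text reproduces the planted marker."""
    return canary.lower() in text.lower()


if __name__ == "__main__":
    # Hypothetical group name and secret, for illustration only.
    canary = make_canary("news-coalition", "shared-secret")
    page = f"Ordinary article text. Reference code {canary}."
    print(contains_canary(page, canary))
    print(contains_canary("unrelated model output", canary))
```

Every participating site would embed the same marker in its pages. If it later turns up in a model’s output, that suggests the content was collected from one of them, though it does not say who collected it, which is the enforcement problem raised in the conversation.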

What, you’re going to sue someone in China for using your data incorrectly? Okay, yeah, good luck with that. Right. It’s been an interesting thing to watch. I’m of the opinion that, yeah, I don’t see any way that you’re really going to stop it.

Personally, I don’t think it’s realistic. Yeah. If people want to take it, they’re going to take it. And this is the same conversation we had before, where I was saying that, even though there are chip restrictions going into China, I think Chinese AI developers are not necessarily at any bigger disadvantage than North American ones, because there’s more legal pressure here. So they have hardware restrictions, and over here we have legal pressure.

Well, they also have their governmental restrictions. My understanding is that there are restrictions on the outputs of large language models over there, where they have to be really careful about that, that don’t exist over here in, sort of, the land of the free, as we are familiar with people referring to themselves. I think there are still some. Well, there are some that I know have been self-imposed by some of the larger ones, but I also know that there are much smaller LLMs that will output anything. Oh yeah, that’s definitely true, and I access those here.

So effectively there isn’t really a limitation on it if you’re willing to dig for it. Yeah. Anyways, that’s a developing and sort of uninteresting conversation for most people.