AI’s increasing ability to sift through data and track Americans’ locations has some lawmakers reconsidering parts of the Foreign Intelligence Surveillance Act.
LLMs function because we now have a technology that can operate in a space of extremely high mathematical abstraction. Just consider for a moment what you do know about LLMs: they’re trained on massive amounts of text, and fundamentally they operate by predicting the next token (roughly, the next word) in a sequence (a sentence).
An LLM is what you get when you use this method of information processing on natural language.
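To make “predict the next token” concrete, here’s a toy sketch in Python. This is a bigram counter over a made-up corpus, not a transformer, but the objective has the same shape: given context, output a probability distribution over what comes next.

```python
from collections import Counter, defaultdict

# Toy "predict the next token": count which word follows which in a
# tiny corpus, then turn the counts into probabilities. Real LLMs do
# this with deep transformers over billions of documents, but the
# training objective is the same shape.
corpus = "the cat sat on the mat the cat ate the fish".split()

counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_token_distribution(token):
    total = sum(counts[token].values())
    return {t: c / total for t, c in counts[token].items()}

print(next_token_distribution("the"))
# -> {'cat': 0.5, 'mat': 0.25, 'fish': 0.25}
```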
What if you instead train it to fingerprint user identities based on web behavior? It doesn’t even output language in that case; it’s a different tool operating on the same fundamental information-processing methodology.
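Here’s what I mean by the same methodology pointed at a different domain, as a hypothetical sketch. Feed the exact same counting machinery a clickstream instead of a sentence; the event names below are invented, and nothing in the code knows the tokens aren’t words.

```python
from collections import Counter, defaultdict

# Same next-token machinery as before, but the "sentence" is a user's
# stream of behavioral events (all event names are hypothetical).
events = ("open:reddit like:r/politics reply:thread open:reddit "
          "share:article like:r/politics open:lemmy reply:thread").split()

counts = defaultdict(Counter)
for prev, nxt in zip(events, events[1:]):
    counts[prev][nxt] += 1

# "What does this user tend to do after liking r/politics?"
print(counts["like:r/politics"].most_common())
```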
What if you train a system to automate semantic analysis, which is much simpler than an LLM? Give it categories like “leftist activist” and see what kind of lists it can generate after processing the likes, shares, replies, views, and so on of every Reddit user that has ever existed. What if you then cross-associate users via their writing styles, so it can roughly match your old Reddit account with your new Lemmy one, or maybe even your really old Facebook with your old Reddit? What if they further augment that with ISP data that really drives these points home?
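And the writing-style matching isn’t exotic either. Here’s a crude hypothetical sketch: profile each account by character-trigram frequencies, then link accounts across platforms by cosine similarity. The sample texts are invented, and real stylometry uses much richer features, but this is the basic shape of the attack.

```python
import math
from collections import Counter

def trigram_profile(text: str) -> Counter:
    # Character trigram frequencies as a cheap "writing style" signature.
    text = text.lower()
    return Counter(text[i:i + 3] for i in range(len(text) - 2))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) \
         * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

old_reddit = "tbh i reckon its fine, yall worry too much lol"
new_lemmy = "tbh yall worry way too much, its fine lol"
unrelated = "In conclusion, the committee finds the proposal inadequate."

profile = trigram_profile(old_reddit)
print(cosine(profile, trigram_profile(new_lemmy)))  # comparatively high
print(cosine(profile, trigram_profile(unrelated)))  # comparatively low
```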
What if they don’t need tens of thousands of analysts to do this kind of thing for every single American citizen anymore? Something previously seen as intractable, not worth considering outside conspiracist circles, might now only require a large enough data center. Surely it doesn’t require a data center with a ballroom on top, but that’s more an architectural choice than anything else.
Edit: let me be clearer about something. LLMs don’t predict the truth; LLMs predict the next token. That said, they do a really damn good job. Hallucinations are a problem of aligning that good job with our expectation of truth, which is a different issue. So, when you consider the effectiveness of their “spying technology,” do so by comparing it to an LLM’s ability to “sound right,” not to “be right.”
They aren’t using LLMs to do the spying.
The bottom line is, any one of us could get a midnight visit from ICE.