It just feels too good to be true.
I’m currently using it for formatting technical texts and it’s amazing. It can’t generate them properly from scratch, but if I give it the bulk of the info it makes them pretty af.
Also just talking and asking for advice on the most random kinds of issues. It gives seriously good advice. But it makes me worry about whether I’m volunteering my personal problems and innermost thoughts to a company that will misuse them.
Are these concerns valid?
You might already be aware, but there have been instances of information leaks in the past. Even major tech companies restrict their employees from using such tools due to worries about leaks of confidential information.
If you’re worried about your personal info, it’s a good idea to consistently clear your chat history.
Another big thing is AI hallucination. When you inquire about topics it doesn’t know much about, it can confidently generate fictional information. So, you’ll need to verify the points it presents. This even occurs when you ask it to summarize an article. Sometimes, it might include information that doesn’t come from the original source.
I was not aware there have been leaks. Thank you. And oh yeah, I always verify the technical stuff I tell it to write. It just makes it look professional in ways that would take me hours.
My experience asking it for new info has been bad, so I don’t really do it anymore. But honestly, it’s not needed at all.
The issue would be if you’re feeding your employer’s intellectual property into the system. Someone asking ChatGPT for a solution to a similar problem might then be given those company secrets. Samsung had a big problem with people in their semiconductor division using it to automate their work, and have since banned it on company devices.
I was not aware there have been leaks.
The big one was when chat histories (the prompts that other people used) were accidentally made visible to other users.
https://futurism.com/the-byte/chatgpt-bug-chat-histories-email-phone
https://www.theverge.com/2023/3/21/23649806/chatgpt-chat-histories-bug-exposed-disabled-outage
https://openai.com/blog/march-20-chatgpt-outage
Also consider all the ‘ChatGPT extensions’ that people have written for Chrome ( https://chrome.google.com/webstore/search/ChatGPT ) and the not-infrequent occurrence of an extension with a few tens of thousands of users being sold and converted into malware or snooping software ( https://www.theregister.com/2023/08/11/chrome_extension_developer_pressure/ ).
You can read their privacy policy. It describes two options:
- Either you keep chat history on, and your conversations can/will be used to train the model,
- or you deactivate chat history; then your data is kept for up to 30 days for legal reasons and removed afterwards, and it will not be used for training.
I’ve had a nagging issue with ChatGPT that hasn’t been easy for me to explain. I think I’ve got it now.
We’re used to computers being great at remembering “state.” For example, if I say “let x=3”, barring a bug, x is damned well gonna stay 3 until I decide otherwise.
GPT has trouble remembering state. Here’s an analogy:
Let Fred be a dinosaur.
Ok, Fred is a dinosaur.
He’s wearing an AC/DC tshirt.
OK, he’s wearing an AC/DC tshirt.
And sunglasses.
OK, he’s wearing an AC/DC tshirt and sunglasses.
Describe Fred.
Fred is a kitten wearing an AC/DC tshirt and sunglasses.

When I work with GPT, I spend a lot of time reminding it that Fred was a dinosaur.
Do you have any theories as to why this is the case? I haven’t gone anywhere near it, so I have no idea. I imagine it’s tied up with the way it processes things from a language-first perspective, which I gather is why it’s bad at math. I really don’t understand enough to wrap my head around why we can’t seem to combine LLM and traditional computational logic.
ChatGPT works off of a fixed maximum prompt size. This was originally about 4,000 tokens. A token is about 4 characters or one short word, but it’s not quite that simple… https://platform.openai.com/tokenizer
“Tell me a story about a wizard” is 7 tokens. And so ChatGPT generates some text to tell you a story. That story is, say, 1,000 tokens long. You then ask it to “Tell me more of the story, and make sure you include a dinosaur.” (15 tokens). And you get another 1,000 tokens. Repeat this twice more, and at that point the length of the entire chat history is about 4,000 tokens.
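If you want to check token counts like that yourself, OpenAI’s tiktoken library does roughly what the web tokenizer above does. A small sketch (the encoding name is my assumption for current ChatGPT-era models):

```python
# Count tokens the way OpenAI models do (pip install tiktoken).
# cl100k_base is the encoding used by the gpt-3.5/gpt-4 family
# (my assumption; older models used different encodings).
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
prompt = "Tell me a story about a wizard"
tokens = enc.encode(prompt)
print(len(tokens))         # 7, matching the example above
print(enc.decode(tokens))  # round-trips back to the original prompt
```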
ChatGPT then internally asks itself to summarize the entire 4,000-token history into 500 tokens. Some of the simpler models can do this quite quickly - though they are imperfect. The thing is, at that point you’ve got 500 tokens which are a summarization of the four acts of the story and the prompts that were used to generate it - but that’s lossy.
As you continue the conversation more and more, the summarizations become more and more lossy and the chat session will “forget” things.
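To be clear, how ChatGPT manages its window internally isn’t public, so here’s just a toy sketch of the squash-when-full idea described above, showing why it loses detail (token counting is faked with word counts to keep it self-contained):

```python
# Toy model of a rolling chat window: once the history exceeds the
# budget, everything gets squashed into a lossy "summary". Here the
# "summary" just keeps the most recent words, which makes the loss
# obvious; a real system would ask a model to summarize, but the
# result is still lossy.
MAX_TOKENS = 4000
SUMMARY_BUDGET = 500

def n_tokens(text):
    return len(text.split())  # crude stand-in for a real tokenizer

def summarize(text, budget):
    return " ".join(text.split()[-budget:])  # stand-in for model summarization

def add_turn(history, new_turn):
    history = (history + "\n" + new_turn).strip()
    if n_tokens(history) > MAX_TOKENS:
        # Details like "Fred is a dinosaur" can fall out right here.
        history = summarize(history, SUMMARY_BUDGET)
    return history

history = ""
for turn in ["Let Fred be a dinosaur."] + ["Tell me more. " * 300] * 5:
    history = add_turn(history, turn)
print("dinosaur" in history)  # False once the squashing kicks in
```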
Run an LLM locally to avoid privacy issues.
https://www.techradar.com/news/samsung-workers-leaked-company-secrets-by-using-chatgpt
I’ve never used ChatGPT, so I don’t know if there’s an offline version, but I assume everything you type in is in turn used to train the model. Thus, using it will probably leak sensitive information.
Also, from what I’ve read, the replies are convincing but can sometimes be very wrong, so if you’re using it for machinery, medical stuff, etc., it could end up being fatal.
A lot of people are talking about the privacy aspect (like you mention in your post) a lot better than I could, so I wanted to share the main issue I’ve had with ChatGPT: it’s an idiot. It can’t follow basic instructions and will just repeat the mistake over and over again when you point it out. It’s uninspired and uncreative and will spit out lame, store-brand names like “The Shadow Nexus”, “The Cybercenter”, “The Datahaven”. I used to be able to make it give good names by giving it example names, but that doesn’t work anymore. I’m writing cyberpunk fic, and I needed help with a hacker group name, and it came up with the Binary Syndicate, which is pretty good. Now it comes up with “Hacker Squad”, “The Hacker Elite”, “The Hackers”.

I don’t want it to write an entire book for me, but sometimes I need help with scenes that require more technical knowledge than I have. Its prose was really good when you fine-tuned it a little. Now it’s flat, bland, and boring. I asked it to write a scene about someone defusing a bomb, and it was basically a two-sentence scene that explained nothing about how he defused it. I asked it to make it longer and explain how he defused it, and it said: “He opens the case and utilizes a technique known as ‘wire tracing’. He traces the wire and cuts it and the bomb is defused. The hacker is so relieved.” See how flat that is? How mechanical? I use Claude for creative writing, but it’s not much better.
Claude is so censored that if you write anything that sounds even nanoscopically criminal, it freaks the hell out and lectures you about being ethical. For instance, it wouldn’t help me write a scene about a digital forensic analyst at the FBI wiping a computer (because it “encourages harm”). So you can only imagine how it reacted when I asked it for help writing about my vigilante hacker character and my archeologist posing as a crime-lord smuggler secretly dismantling black market trades in the Middle East. You have to jailbreak it (which is a little less hard than hacking the Pentagon!), and eventually it goes all love guru on you and starts monologuing about light and darkness and writing inspiring, uplifting tales, blah blah blah.
Honestly, what I’m saying is that ChatGPT is pretty dumbed down, but I’ve heard of a lot of people who’ve noticed no difference. You could be one of them. If you’re using it for creative writing, use Claude, and good luck with the prompt engineering you’ll need to jailbreak it.
Yeah, those concerns are valid. Not running on your machine and not FOSS.
Are there any viable alternatives?
Check out Meta’s LLaMa 2. Not FOSS, but source-available and self-hostable.
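If you go that route, something like llama-cpp-python can run a quantized Llama 2 file locally. A minimal sketch - the model filename is just a placeholder for whatever quantized weights you’ve downloaded:

```python
# Local inference with llama-cpp-python (pip install llama-cpp-python).
# You supply your own quantized Llama 2 weights in GGUF format;
# the path below is a placeholder.
from llama_cpp import Llama

llm = Llama(model_path="./llama-2-7b-chat.Q4_K_M.gguf", n_ctx=2048)
out = llm("Q: Why run an LLM locally instead of a hosted one? A:",
          max_tokens=128)
print(out["choices"][0]["text"])
```

Nothing leaves your machine - the prompt and the response both stay local.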
GPT4All - it’s open source and you can run it on your own machine.
How is this able to run without a GPU? Are the models small enough that only a CPU is needed?
Yes, but it’s a bit more than that. All models are produced using a process known as neural network quantization, which optimizes them to be able to run on a CPU. This, plus appropriate backend code written in C, means GPT4All is quite efficient and needs only 4-8GB of RAM (depending on the model) and a CPU with AVX/AVX2 support.
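For the curious, the GPT4All Python bindings make that easy to see in action. A small sketch - the model name is one of their downloadable quantized models and is my assumption:

```python
# CPU-only chat via the GPT4All Python bindings (pip install gpt4all).
# The library downloads the quantized model file on first use; the
# exact model name here is an example, not the only option.
from gpt4all import GPT4All

model = GPT4All("orca-mini-3b-gguf2-q4_0.gguf")  # small, ~4GB-of-RAM class
with model.chat_session():
    print(model.generate("Briefly, what is quantization?", max_tokens=128))
```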
Given that they know exactly who you are, I wouldn’t get too personal with it, but it is amazing for many otherwise time-consuming problems like programming. It’s also quite good at explaining concepts in math and physics, and is capable of reviewing and critiquing student solutions. The development of this tool is not miraculous or anything - it uses the same basic foundation that all machine learning does - but it’s a defining moment in terms of expanding the capabilities of computer systems for regular users.
But yeah, I wouldn’t treat it like a personal therapist, only because it’s not really designed for that, even though it can do a credible job of interacting. The original chatbot, ELIZA, simulated a “non-directional” therapist, and it was kind of amazing how people could be drawn into intimate conversations even though it was nothing like ChatGPT in terms of sophistication - it just parroted back what you asked it in a way that made it sound empathetic. https://en.wikipedia.org/wiki/ELIZA
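The whole trick fit in a few lines. A toy sketch of ELIZA-style reflection (not the original code, just the flavor of it):

```python
# Toy ELIZA-style "therapist": swap pronouns and reflect the user's
# statement back as a question. No understanding involved at all.
import re

REFLECTIONS = {"i": "you", "me": "you", "my": "your", "am": "are",
               "you": "I", "your": "my"}

def reflect(statement):
    return " ".join(REFLECTIONS.get(w.lower(), w) for w in statement.split())

def respond(statement):
    statement = re.sub(r"[.!?]+$", "", statement)
    return f"Why do you say {reflect(statement)}?"

print(respond("I am worried about my data."))
# -> Why do you say you are worried about your data?
```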
Ha, I spent way too much time typing stuff into the ELIZA prompt. It was amazing for the late 80s.
That it can lie, and if you don’t know the subject, you could be in trouble.
As a writing helper I can’t see any issues, especially if you check everything it corrects/adjusts… After all, this is a tool, not a replacement… For now.
That it’s not conscious or self-aware. It’s just putting words together that don’t necessarily have any meaning. It can simulate language, but meaning is a lot more complex than putting the right words in the right places.
I’d also be VERY surprised if it isn’t harvesting people’s data in the exact way you’ve described.
You don’t need to be surprised - their ToS says pretty plainly that anything you write to ChatGPT will be used to train it.
Nothing you write in that chat is private.
The big problem that I see are people using it for way too much. Like “hey write this whole application/business for me”. I’ve been using it for targeted code snippets, mainly grunt work stuff like “create me some terraform” or “a bash script using the AWS cli to do X” and it’s great. But ChatGPT’s skill level seems to be lacking for really complex things or things that need creative solutions, so that’s still all on me. Which is kinda where I want to be anyway.
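For a sense of the grunt-work level I mean, here’s the kind of snippet it’s good at producing - a hypothetical example, in Python with boto3 rather than the CLI so it’s self-contained:

```python
# List S3 buckets and their regions - typical "save me ten minutes of
# doc reading" boilerplate. Assumes AWS credentials are configured
# (e.g. via `aws configure`); pip install boto3.
import boto3

s3 = boto3.client("s3")
for bucket in s3.list_buckets()["Buckets"]:
    loc = s3.get_bucket_location(Bucket=bucket["Name"])
    region = loc.get("LocationConstraint") or "us-east-1"  # None means us-east-1
    print(f"{bucket['Name']}: {region}")
```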
Also, I had to interview some DBA’s recently and I used it to start my interview questions doc. Went to a family BBQ in another state and asked it for packing ideas (almost forgot bug spray cause there aren’t a lot of bugs here). It’s great for removing a lot of cognitive load when working with mundane stuff.
There are other downsides, like it’s proprietary and we don’t know how the data is being used. But AI like this is a fantastic tool that can make you way more effective at things. It’s definitely better at reading AWS documentation than I am.
Just check everything. These things can sound authoritative when they are not. They really are not much more than a parrot reciting meaningless stuff back. The shocking thing is they are quite good, until of course they are just not.
As for leaks: do not put confidential info into outside sites like ChatGPT.
These types of uses make ChatGPT, for the non-writer, the same as a calculator for the non-mathematician. Lots of people are shit at arithmetic but need to use mathematics in their everyday lives. Rather than spend hours with a scratch pad and carrying the 1, they drop the numbers into a calculator or spreadsheet and get answers.
A good portion of my life is spent writing (and re-writing) technical documents aimed at non-technical people. I like to think I’m pretty good at it. I’ve also seen some people who are very good technically but can’t write in a cohesive, succinct fashion. Using ChatGPT to overcome some of those hurdles, as long as you are the person doing the final compilation and organization to ensure that the output is correct and accurate, is just the next step in spelling, usage, and grammar tools. And, just as people learning arithmetic shouldn’t use calculators until they understand how it’s done, students should still learn to write without the assistance of ML/AI. The goal is to maximize your human productivity by reducing tasks on which you spend time for little added value.
Will the ML company misuse your inputs? Probably. Will they also use them to make your job easier or more streamlined? Probably. Are you contributing to the downfall of humanity? Sure, in some very small way. If you stop, will you prevent the misuse of ML/AI and substantially retard the growth of the industry? Not even a little bit.
I like the calculator comparison.
Here’s an article I found on Mastodon which does a good job of outlining the issues with ChatGPT: https://karawynn.substack.com/p/language-is-a-poor-heuristic-for
I won’t touch the proprietary junk. Big tech “free” usually means street corner data whore. I have a dozen FOSS models running offline on my computer, though. I also have text-to-image and text-to-speech, am working on speech-to-text, and will probably build my Iron Man suit after that.
These things can’t be trusted, though. It’s just a next-word statistical prediction system combined with a categorization system. There are ways to make an LLM more trustworthy, but they involve offline databases and prompting for direct citations; these are different from chat prompt structures.
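A rough sketch of what I mean by “offline database plus direct citations” - naive keyword retrieval over a local document store, with a prompt that demands sources. All the documents and names here are made up for illustration:

```python
# Toy retrieval-augmented prompt builder: find relevant snippets in a
# local store, then instruct the model to answer ONLY from them and to
# cite by id. The document store and query are invented examples.
DOCS = {
    "doc1": "GPT4All runs quantized models on CPU with 4-8GB of RAM.",
    "doc2": "Samsung banned ChatGPT on company devices after internal leaks.",
}

def retrieve(query, docs):
    # Naive keyword overlap; a real system would use embeddings.
    q = set(query.lower().split())
    return {k: v for k, v in docs.items() if q & set(v.lower().split())}

def build_prompt(query):
    context = "\n".join(f"[{k}] {v}" for k, v in retrieve(query, DOCS).items())
    return ("Answer using ONLY the sources below, citing ids like [doc1]. "
            "If the sources don't cover the question, say so.\n\n"
            f"{context}\n\nQ: {query}")

print(build_prompt("Why did Samsung ban ChatGPT?"))
```

The model can still misquote, but at least every claim is checkable against the store.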