Nvidia launches a chatbot that can run on your PC's GeForce RTX GPU

Shawn Knight

Why it matters: Nvidia has released a demo application that leverages select video cards to run a personal AI chatbot on your PC. Chat with RTX runs on Windows 11 and requires a GeForce RTX 30 or 40 Series GPU (or a workstation-class RTX Ampere or Ada Generation GPU). It uses retrieval-augmented generation (RAG), TensorRT-LLM, and RTX acceleration to build a personalized, GPT-style chatbot grounded in your own content.

You can also feed it videos from the web, including clips from YouTube. Simply plug in the URL of the source video, then ask the chatbot any question related to the content in the video.

As the name suggests, retrieval-augmented generation is a technique that enhances the accuracy and reliability of generative AI models using facts gathered from external sources. Nvidia has a full write-up on the subject for those seeking a deeper understanding of the technique.
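For readers who want the gist in code, below is a minimal, illustrative Python sketch of that pattern: pull the most relevant snippets out of a local document store, then fold them into the prompt before generation. The toy document list, scoring function, and prompt format are invented for this example and are not Nvidia's actual Chat with RTX implementation.

```python
# Minimal retrieval-augmented generation (RAG) sketch: retrieve relevant local
# snippets, then prepend them to the user's question as context.
# Everything here is a toy stand-in, not Nvidia's Chat with RTX code.

from collections import Counter

# A tiny "document store" standing in for the user's local files.
documents = [
    "Chat with RTX indexes your local text files and answers questions about them.",
    "TensorRT-LLM accelerates large language model inference on RTX GPUs.",
    "Retrieval-augmented generation grounds model answers in external sources.",
]

def score(query: str, doc: str) -> int:
    """Crude relevance score: how many query words appear in the document."""
    query_words = set(query.lower().split())
    doc_words = Counter(doc.lower().split())
    return sum(doc_words[w] for w in query_words)

def retrieve(query: str, k: int = 2) -> list:
    """Return the k documents that best match the query."""
    return sorted(documents, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str) -> str:
    """Augment the query with retrieved context before generation."""
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# A real system would hand this prompt to a local LLM (e.g. one served through
# TensorRT-LLM); here we just print the augmented prompt.
print(build_prompt("What does retrieval-augmented generation do?"))
```

A production pipeline would swap the keyword scorer for vector embeddings and a proper index, but the retrieve-then-generate shape is the same.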

Because it runs locally instead of in the cloud and draws on your own personal data, it should be both fast and contextually relevant. Nvidia also says it delivers secure results – plausible, considering it isn't transmitting sensitive data over the internet.

Tom Warren from The Verge spent some time with a pre-release version of Chat with RTX, and although it was a little rough around the edges, he said he could see it becoming a valuable tool for journalists or anyone who needs to analyze a set of documents.

For example, Warren was able to have the bot summarize Microsoft's entire Xbox Game Pass strategy using legal documents from Redmond's court battle with the FTC. Things were a bit buggier on the video side, however, as the app somehow loaded a transcript for a completely separate video instead of the intended one. Notably, it was not even a video that Warren had previously queried.

If you have a GeForce RTX 30 or 40 Series GPU and want to give Chat with RTX a spin, head over to TechSpot's downloads section to grab the installation file.


 
There are plenty of open-source applications that can do this on older GPUs, though you can only go so far back before you run into severe performance penalties. I run similar stuff on an RTX 2080 Ti.

The main limitation even on the newer GPUs is memory capacity. It's pretty easy to use up all the VRAM on any GPU with gen-AI models, whether you have 11 GB of VRAM or 192. That's why model quantization is so popular (and essential) for many of the smaller GPUs, and why I'm curious about what Intel's NPUs will truly be able to deliver (if anything).
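To put rough numbers on that comment, here is a small back-of-the-envelope Python calculation. The parameter counts and byte widths are assumptions chosen for illustration, and it only counts the model weights, ignoring KV cache and runtime overhead.

```python
# Approximate VRAM needed just to hold model weights at different precisions.
# Parameter counts and byte widths below are illustrative assumptions;
# KV cache, activations, and runtime overhead are ignored.

PARAMS_BILLIONS = {"7B": 7, "13B": 13, "70B": 70}
BYTES_PER_WEIGHT = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

for name, billions in PARAMS_BILLIONS.items():
    sizes = ", ".join(
        f"{fmt}: {billions * 1e9 * width / 1024**3:.1f} GiB"
        for fmt, width in BYTES_PER_WEIGHT.items()
    )
    print(f"{name} model -> {sizes}")

# Example line: "7B model -> fp16: 13.0 GiB, int8: 6.5 GiB, int4: 3.3 GiB"
# The fp16-to-int4 gap is the difference between a 7B model not fitting
# and fitting comfortably on an 8-11 GB card.
```

Real quantized runtimes add their own overhead on top of these figures, but the proportions hold, which is why 4-bit quantization has become the default on consumer cards.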
 
Oh Boy! Just what I was hoping for - NOT!
It could prove useful in having a personal chatbot without all of the abominable invasive telemetry, privacy violations, or government surveillance, if you're depraved enough for that to be a concern.
A friend will do much better, though.
 
Only until they release the 50 series with some X feature that only works on the new cards, so you need to upgrade. Just like they did with the 10 series, and the 20 series, and the 30 series... and the 40 series...

Mind explaining how any other options work? Are they supposed to invent and apply every technology possible in a single generation?

And then all possible software for every use case is created soon after that?

New hardware features lead to new software. New software requirements lead to new hardware features. There is no other way, this isn't even a real discussion.
 
Mind explaining how any other options work? Are they supposed to invent and apply every technology possible in a single generation?

And then all possible software for every use case is created soon after that?

New hardware features lead to new software. New software requirements lead to new hardware features. There is no other way, this isn't even a real discussion.
AMD did it, so care to explain why AMD could add features to nVidia hardware when Nvidia says there's a hardware requirement?
 
Easy: AMD's implementation is different than Nvidia's, and it does not provide the same performance or the same level of polish.
Absolutely. AMD is using technology that already exists. Nvidia is pushing things forward. This almost always results in better performance and more or better features.

Look up G-sync and the AMD equivalent that used already existing standards and older technology. They are littered with examples of this.
 
It could prove useful in having a personal chatbot without all of the abominable invasive telemetry, privacy violations, or government surveillance, if you're depraved enough for that to be a concern.
A friend will do much better, though.
So while trying to get people to use it, you say a real friend is better, and that's supposed to convince me, or anyone for that matter, to use it? 🤣🤣🤣🤣🤣🤣🤣🤣🤣🤣🤣
Like I said, No Thanks Nvidia!
 
Why anyone would want to use something that simply regurgitates the turds it finds on the internet, I have no idea. I recently read a book that does just that, and it's the worst book I have read on the subject. BTW - The subject is not AI, but the book could have been written by AI.

IMO - AI is a misnomer invented by some marketing department (think NVidia) just to sell more of their chips and line Leatherman's closet with more leather.
 
Absolutely. AMD is using technology that already exists. Nvidia is pushing things forward. This almost always results in better performance and more or better features.

Look up G-sync and the AMD equivalent that used already existing standards and older technology. They are littered with examples of this.
G-Sync? You mean nVidia-branded FreeSync? nVidia isn't paying you to shill for them.
 
So while trying to get people to use it, you say a real friend is better, and that's supposed to convince me, or anyone for that matter, to use it? 🤣🤣🤣🤣🤣🤣🤣🤣🤣🤣🤣
Like I said, No Thanks Nvidia!
It's an option, not mandatory. Obviously you're not their target audience; neither am I, and neither are about 90% of people. You've made that very clear. All of the benefits are aimed at people already using LLMs day to day, which I am not.
I could technically use this someday with the 3070 that was gifted to me, but I find it much better suited to enthusiast VR and creative work than to toying with the creations of cracked algorithm engineers.
 
G-Sync? You mean nVidia-branded FreeSync? nVidia isn't paying you to shill for them.

Incorrect. G-Sync has an ARM chip in the monitor so that there is almost no loss of performance, unlike the lower-cost, functional but worse AMD version. But hey, you don't have to buy a new monitor!

Why are you so concerned about which way the two companies do things? Just enjoy the fact that we're lucky to have the option of two companies doing things differently. Or is the idea of having these choices the issue for you?
 
Incorrect. G-Sync has an ARM chip in the monitor so that there is almost no loss of performance, unlike the lower-cost, functional but worse AMD version. But hey, you don't have to buy a new monitor!

Why are you so concerned about which way the two companies do things? Just enjoy the fact that we're lucky to have the option of two companies doing things differently. Or is the idea of having these choices the issue for you?
There is an ARM chip in every monitor these days; it's just that one has a G-Sync brand and the other doesn't.

Or when you said incorrect, did you mean nVidia was paying you to shill for them?
 
Not that I have one, but if I had an RTX 2080 Ti that is locked out from this when a considerably weaker RTX 3060 can do it, I would be pissed...

...artificial segmentation of software features is crappy and should stop.
 
Perfect Chinese and US Gov't Data mining tool. I guess if you really need to chat with someone and you have nobody... well....

Either way... No way....
 
There is an ARM chip in every monitor these days; it's just that one has a G-Sync brand and the other doesn't.

Or when you said incorrect, did you mean nVidia was paying you to shill for them?

You have a lot of grievances. I hope you get that resolved because that is a terrible way to live.

There is no sense trying to speak reasonably with someone not connected to reality.

So you win. White flag is up. I’m done with you.
 
Incorrect. G-Sync has an ARM chip in the monitor so that there is almost no loss of performance, unlike the lower-cost, functional but worse AMD version. But hey, you don't have to buy a new monitor!

Why are you so concerned about which way the two companies do things? Just enjoy the fact that we're lucky to have the option of two companies doing things differently. Or is the idea of having these choices the issue for you?
Actually, no. There are very few monitors these days that contain the G-Sync module, and no, it's not a cheap ARM chip. It's an expensive FPGA that requires active cooling (the v2 version does, at least). Plus, these days it's mostly outdated, with no support for even HDMI 2.1, not to mention DP 2.1. It's also straight up unnecessary on OLED monitors. As for buying a new monitor: the v1 version of this FPGA is not compatible with AMD, so to get VRR the user has to buy a new monitor if they plan on using an AMD card. These were monitors produced mostly before 2018 or so.
 