Computer keeps freezing (all of a sudden)

Status
Not open for further replies.
Summary:
------------
I have a dual boot machine (WinXP, FedoraCore4) that has worked flawlessly for months and months, and now, suddenly in the last week or so it is freezing up so regularly that most of the time it does not complete the boot sequence for either OS.


The Facts:
------------
1. I have made no hardware or software changes to this machine in months.
2. Every time I boot up the machine, it makes it just fine to the boot loader screen where I can choose the OS (however this only takes a few seconds).
3. At the GRUB screen:
a) If I choose Windows (w/ Normal startup): the screen usually just goes blank/black and the system must be hard reset.
b) If I choose Windows holding F8 w/ Safe Mode: it freezes upon loading gagp30kx.sys.
c) If I choose Windows holding F8 w/ Last Good Configuration: it freezes right then on that selection and must be hard reset
d) If I choose Fedora, 9 times out of 10 it will freeze when it says “booting the kernel” on the first screen.
4. When Windows does let me in (rarely), the desktop is white and it asks me if I’d like to Restore my Active Desktop. Anything I do after that causes a freeze and the system must be hard reset.
5. When Fedora does let me in (1 out of every 6 or 7 boots), I’m able to use the system for up to 30 minutes or so, before it either freezes or goes blank/black and must be hard reset.
6. I have 2 hard drives: HD1 has Windows and the MBR (or whatever you call it that references the OS locations) and HD2 has Fedora.
7. When I put in the Windows XP install CD to attempt to reinstall/repair, it consistently freezes within 30 seconds, at the screen that says “Setup is Loading Executive Files”.
8. When I put in the Fedora Core 4 install CD to attempt a reinstall/repair it freezes within the first few configuration screens, consistently.
9. I’ve attempted install CDs in both of my disc drives.
10. Both of the install discs work in other machines.
11. I’ve checked the motherboard carefully and can’t find any evidence of the kind of damage shown on www.badcaps.com
12. I’ve recently moved to a new apartment, but for weeks after moving in, my machine has worked fine (only in the last week has this scenario started).
13. When these symptoms started, it used to just freeze Windows with the “Restore Active Desktop” error, but then quickly progressed into a state that wouldn’t finish booting.
14. I recently removed the case cover to allow what I thought would be more airflow. I wish I could tell the internal temps, but it won’t stay on long enough for me to run any programs.
15. There is no over-clocking done in this system
16. Specs: (tell me if you need more detailed info)
MB: MSI K8T Master-2 FAR (with only 1 processor)
CPU: Opteron 244
2 GIG RAM
2 HDs, 250 GB each
ATI Radeon 9800 graphics card


Comments:
--------------
It seems pretty clear that this isn’t a software issue. I’ve read a lot of freeze-related newsgroup entries and the thing that seems the most likely is a heat-related problem. Suggestions online that have caught my attention include:
- Removing the cover of the computer can sometimes increase the temp because the fans, etc assume certain airflow (and I removed my cover).
- Some “goop” (thermal paste) on my processor’s heat sink has dried out or thinned (or something) and the CPU is overheating, which causes the freeze.
- My graphics card may be overheating and causing the freeze.
- Additionally, the fact that this situation occurs within minutes (or sometimes seconds) of turning on the system makes me think it could be heat.

It may also be that HD1 has some sort of corruption of data that would affect the boot process, but still keep the data intact. I mention this because when I do get into Linux, I’m able to access the data on the NTFS file system just fine. Not sure.


Any suggestions or help would be much appreciated. Please tell me if there is someone else out there experiencing the same issue(s).

Thanks in advance.
Ben
 
Look the motherboard over really carefully. MSI boards had a rash of bad capacitors that would leak. It could also be a faulty video card or AGP slot.
 
Here’s the latest:

1. I checked the CPU temp in the BIOS. It stays pretty consistently around 39 deg C … which doesn’t seem too bad. I think the Max for the Opteron 244 is 70 deg C. So, I’m going to dismiss the idea that the CPU is the source of the possible heat issue.

2. Focusing on the Graphics card (ATI Radeon 9800 Pro), I took it out of the machine and blew compressed air over it and the AGP slot. Enough dust came out of the card’s fan to choke a donkey. I also checked the card carefully for physical damage, but I could find nothing. Nothing seemed out of the ordinary with the AGP slot either. Same symptoms though when I put it back in and the fan was definitely spinning on it.

3. I tried swapping out the card with other ones at my house but nothing was compatible. I’ll stop by Fry’s on the way home today and get another one so that I can swap it and see if anything is revealed. If it doesn’t turn out to be the problem, I can always return the card. Though to tell you the truth, if it IS the card, I’ll be pretty pissed that a $350 graphics card had a problem after being used for only 1 year. If the problem persists with a new card, then I may not know much more … could be the MBoard … or the AGP slot or something.

4. I’ll also be picking up a new power supply from Fry’s to swap out and see if it makes a difference. If it doesn’t in the end, I’ll just return it.

Other than that, not much more to report … I’ve stripped out all the other components that aren’t in use, but to no avail. I’ve also confirmed that which IDE cables the HDs are on doesn’t make a difference.

I’ll post again tonight with my results of swapping the PSU and the results of swapping the G-Card.

However, IF it happens to be one of those things (PSU, Graphics card, or even MBoard) this really sucks. I would not have had any way of figuring this out before by observing any physical symptoms, in which case this forum thread may not help people in the future who have similar issues. Also, this MSI board, graphics card and PSU are supposed to be pretty good, I’ll be pissed if those things failed after just a year of use.

More to come …

Any thoughts?
 
Fixed.

Well. I don’t know what to say. I swapped out the graphics card. Nothing.
Then, after a few minutes of cursing, I pulled out one of the gig sticks of RAM and turned it back on.

That fixed it.

The system booted up perfectly as if nothing had ever happened. I’m writing this post from the machine right now and I’m still in shock.
Looks like it had nothing to do with heat at all (maybe). So much for all of my focus on CPUs and GPUs. I guess somewhere in the back of my mind whenever I booted the thing up and it started with that “Memory Test” message (the one with a quick counter that seemed to signify everything was OK after it went through all the bits), I figured the RAM was fine.

For the record, the RAM I’m using is:

Patriot “PSDIG400ER” 1 GB sticks. It’s PC3200, 400MHz, Registered, and obviously a piece of #$%!@. The only reason I got it (and spent a fortune on it) was because it was advertised to be tested with/for the Opteron processor I have.
Clearly this stuff is like kryptonite; stay away from it.

Thanks for your input and suggestions (especially those of you who mentioned RAM ;) ), I hope at least that this thread will be useful for others who face similar issues. Even now as I write this and watch the fans spin on this thing I still can’t believe it was the RAM, especially because of how it seemed to “progress” into a worse state over a short period of time, like I would imagine a heating issue would. Money well spent on a degree in Computer Science eh? Good thing I’m applying to law school soon.

I need a drink.
 
you've now been introduced to Murphy's Law ;)

Don't beat yourself silly -- sometimes stuff just happens.
Sounds as if the stick wasn't seated properly and it can happen to anybody!
 
There have been issues with registered memory and certain motherboards; it's not Patriot's fault, especially since they produce some of the finest OC RAM available. But you also said you've used the memory for a year, so who knows.

Anyway, I always keep a memtest CD handy (www.memtest.org) in case I get funky problems. You boot directly from the CD and you don't even have to touch the keyboard; it starts instantly. You might want to run another sweep. It will tell you right away if the RAM is to blame.
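For anyone curious what a memory tester is doing during a sweep, the basic idea can be sketched roughly as below. This is only a toy simulation in Python, not how memtest is actually implemented: the FaultyMemory class, its stuck bit, and the four patterns are all made up for illustration. The point is just that each cell gets written with known patterns and read back, so a cell that cannot hold certain bits gets caught.

```python
# Toy simulation only. Real memory testers run on bare hardware, not inside
# Python. FaultyMemory and its single stuck bit are hypothetical.

class FaultyMemory:
    """A byte array in which one bit of one cell is stuck at 0."""

    def __init__(self, size, bad_addr, stuck_bit):
        self.cells = bytearray(size)
        self.bad_addr = bad_addr
        self.stuck_mask = ~(1 << stuck_bit) & 0xFF

    def write(self, addr, value):
        if addr == self.bad_addr:
            value &= self.stuck_mask  # the stuck bit can never hold a 1
        self.cells[addr] = value

    def read(self, addr):
        return self.cells[addr]


def pattern_test(mem, size, patterns=(0x00, 0xFF, 0xAA, 0x55)):
    """Write each pattern to every cell, read it back, collect mismatches."""
    errors = []
    for pattern in patterns:
        for addr in range(size):
            mem.write(addr, pattern)
        for addr in range(size):
            got = mem.read(addr)
            if got != pattern:
                errors.append((addr, pattern, got))
    return errors


mem = FaultyMemory(size=1024, bad_addr=700, stuck_bit=3)
errors = pattern_test(mem, 1024)
for addr, expected, got in errors:
    print(f"addr {addr}: wrote {expected:#04x}, read {got:#04x}")
```

Note that the 0x00 and 0x55 patterns pass even on the bad cell, because neither one tries to set the stuck bit. That is why a tester cycles through several patterns instead of writing a single value.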
 
Yeah, you're totally right, of course.
Actually, I was only able to swap the RAM stick with another Patriot stick from the place I bought it, and that one has worked fine since I put it in my machine.

At the moment I am starting to think there is something wrong with the motherboard (as you said). I've heard some bad stories from other people who also have the MSI K8T Master-2 FAR ... I've put the new stick into a different memory slot, just in case.

I actually did run memtest86 a little while ago .. but it never finished. The computer actually froze during the test. I'm going to run it again, but what would you suggest for that?
 
Just FYI, if you paid $350 for a Radeon 9800 PRO a year ago, you got ripped off, bigtime, to the double triple.
 
If memtest itself is freezing, then either the extremely small portion of RAM the program occupies is itself bad (we're talking about 110 KB of a 1 GB stick), or your mobo is developing some sort of problem. Just to rule things out, maybe you should try the RAM in a different PC (assuming you've got a second computer or a friend who is willing to let you tinker with his). Option 2: test one stick at a time, in a slot you know is working.
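The one-stick-at-a-time approach is really just a small elimination table, which can be sketched like this. Everything here is hypothetical: the stick and slot names are made up, and is_stable stands in for "boot with only that stick in that slot and run a memory test", which obviously has to be done by hand.

```python
from itertools import product


def isolate(sticks, slots, is_stable):
    """Try every stick alone in every slot and report what never works.

    is_stable(stick, slot) is a stand-in for a manual check: boot with just
    that stick in that slot and see whether a memory test passes.
    """
    results = {}
    for stick, slot in product(sticks, slots):
        results[(stick, slot)] = is_stable(stick, slot)

    # A stick is suspect if it fails in every slot; a slot is suspect if
    # every stick fails in it.
    bad_sticks = [s for s in sticks if not any(results[(s, sl)] for sl in slots)]
    bad_slots = [sl for sl in slots if not any(results[(s, sl)] for s in sticks)]
    return bad_sticks, bad_slots


# Made-up scenario: stick "B" is faulty, both slots are fine.
sticks = ["A", "B"]
slots = ["DIMM1", "DIMM2"]
bad_sticks, bad_slots = isolate(sticks, slots, lambda stick, slot: stick != "B")
print("suspect sticks:", bad_sticks)
print("suspect slots:", bad_slots)
```

If every single combination fails, this table cannot tell sticks from slots (both lists come back full), and that result points away from the RAM and toward something shared, like the board or the power supply.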


I own an MSI Neo Platinum MS-7030, and let me tell you, I HATE IT. I had to RMA the first one because it just wouldn't power on at all. On the one I got back from RMA, the Ethernet adapter just failed one day, never to be resurrected. It no longer reliably passes Prime95 at stock voltages and settings (it's an on-and-off issue). Sometimes I have to power on / power off / power on to get the BIOS to POST at stock settings. For a Socket 754 board they say it's really good, but I don't see how. And it's picky with RAM: the first RAM was Corsair Value Series and it would reboot at the most random moments (stable for 2 hours, then BAM!!!!). Memtest showed me there were errors (it's not the RAM's fault; it passed with flying colors on another PC). I found you can't use value RAM on this board. It just won't take it without glitching.
 