SUSE Linux partition error

gbhall

Posts: 2,419   +77
We had a Novell SUSE Linux Open Enterprise Server running happily for some months, but an attempt to use a new Acronis imaging package to create a server image failed, leaving us with a server that will not boot properly.

It seems that during the boot process, whenever it tries to write anything, a message is generated saying 'no more space on drive'. This is not correct, of course. After starting the server recovery process and using the expert partitioning tool, all the partition information appears correct, except that the columns 'mount point' and 'mount by' are empty - apart from the Linux swap partition.

The expert partitioner does not seem to allow these two values (mount point, mount by) to be edited.

Is there any way to recover this server, where the system and data are almost certainly intact but inaccessible because of what looks like simple partition table corruption?

Acronis Backup & Recovery™ 10 Advanced Server
 
Can you not just boot into recovery mode using your CD/DVD, log in as root and then manually edit your fstab file again?

Or, as root, re-set up your boot manager, as that's likely your issue.
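For what it's worth, a rough sketch of that repair from the rescue environment - assuming, purely for illustration, that root lives on an LVM volume called /dev/system/root with /boot on /dev/sda1 (adjust to your layout):

Code:
# boot the install media and choose Rescue System, then:
vgscan                        # scan for LVM volume groups
vgchange -ay                  # activate them
mount /dev/system/root /mnt   # assumed root volume - adjust
mount /dev/sda1 /mnt/boot     # assumed /boot partition - adjust
mount --bind /dev /mnt/dev
mount -t proc proc /mnt/proc
chroot /mnt
vi /etc/fstab                 # repair the mount entries
grub-install /dev/sda         # rewrite the bootloader to the MBR
exit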

I've had quite a few issues installing LILO to my MBR due to having Seagate DiscWizard (Acronis True Image) installed on my main PC. It would seem the Acronis recovery console that is installed along with the software conflicts with, or in some way affects, the operation of the LILO bootloader. I'm not entirely sure why, to be honest.
 
what can you advise?

OK, back again. Unfortunately I'm on my own for at least two days - me, who had nothing whatever to do with installing this system!

First, let me encourage you to believe that you might be able to work this one out, because we ARE talking about OES-Linux - that is:

When installed using a Linux kernel, the product is known as OES-Linux. This uses SUSE Linux Enterprise Server (SLES) as its platform. Atop the SLES install, daemons are added to provide NCP, eDirectory, NSS, iPrint and other services delivered by OES.

Now, the problem must be entirely confined to the SUSE Linux boot process, and I am confident you will feel at home there!

I can boot from the install CD and then do various things to try to determine the problem, like starting a rescue system, or dropping to a bash shell and so on.

SUSE Linux Enterprise Server 10 SP3, kernel 2.6.16.60-0.54.5-smp
partition 1 is ext2
partition 2 is LVM
partition 3 is a Novell NSS-formatted volume

Right now, I have booted the recovery system, chosen Install > Other, and am looking at the YaST screen, which offers (1) automatic repair, (2) customised repair and (3) expert tools.

I have chosen option 3, which offers
Install new boot loader
Start partitioning tool
Repair file system
Recover lost partitions
and a couple of irrelevant options.

Looking at the partitions first:
/dev/sda1 70.5 MB Linux native (no mount point or mount by)
/dev/sda2 14.9 GB Linux LVM (no mount point or mount by)
/dev/sda3 665.5 GB Novell NetWare
/dev/system 14.9 GB LVM2 system (no mount point or mount by, and no start or end either)
/dev/system/root 10.0 GB LV (no mount point or mount by, and no start or end either)
/dev/system/swap 2.0 GB LV (no mount point or mount by, and no start or end either)

I am reluctant to make ANY changes to the system without knowing what I am doing....your advice will be extremely valuable.

The problem seems to be one of: a volume that is full, a corrupt fstab file, or an overwritten MBR bootloader.

What can I do to help you detect which it is?
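For instance, from the rescue shell I could run something like the following and report back - the commands are my guesses at what would be diagnostic:

Code:
# after mounting the server's root volume at /mnt:
df -h                                   # is any filesystem actually full?
cat /mnt/etc/fstab                      # is the mount table intact?
dd if=/dev/sda bs=512 count=1 | file -  # does the MBR still hold boot code?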
 
Looking at the partitions first:
/dev/sda1 70.5 MB Linux native (no mount point or mount by)
/dev/sda2 14.9 GB Linux LVM (no mount point or mount by)
/dev/sda3 665.5 GB Novell NetWare
/dev/system 14.9 GB LVM2 system (no mount point or mount by, and no start or end either)
/dev/system/root 10.0 GB LV (no mount point or mount by, and no start or end either)
/dev/system/swap 2.0 GB LV (no mount point or mount by, and no start or end either)
How did you get this info?

I'm on Fedora Core 2, so the 'visible parts' may be different, but the innards of Linux are the same. I boot into command-line mode and then run startx when I want the GUI.

From the console (or within a terminal window) just enter
mount
and you will get a list like

/dev/hda9 on /aaaa type ....

The first column is the device itself (which is also a partition), and /aaaa is the mount point.

If you issue cat /etc/fstab you get basically the same kind of information: the first column is the device, the second the mount point, the third the filesystem type, and the fourth the mount options.

FYI: mount can run without fstab by supplying everything on the command line, but typically we might issue mount /aaaa, and everything related to that entry in the fstab file is imported.
Unless the fourth column (the options field) contains noauto, that line will be used at Linux boot time to mount the device for you.
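For illustration, a couple of fstab lines for a box like yours might read (device names invented, not taken from your server):

Code:
# device           mount-point  type      options   dump fsck
/dev/sda1          /boot        ext2      defaults  1 2
/dev/system/root   /            reiserfs  defaults  1 1

With lines like those in place, issuing just mount /boot pulls the device, type and options from fstab.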
 
I got the partition information by running the 'expert partitioning tool' as described in post #3.

Also, after starting a live CD and dropping to a command prompt, I have tried...

mount /dev/sda1 /mnt - that worked
cd /mnt - worked
ls -l says:
total 9049
All lines start with a permissions list, a link count of 1 or 2, and root root ... I hope you do not need those columns!

954780 Sep 5 2009 system.map-2.6.16.60-0.54.5-smp
512 Jun 8 13:48 backup.mbr
1 Jun 8 13:34 boot -> .
61340 Sep 5 2009 config-2.6.16.60-0.54.5-smp
1024 Jun 8 13:48 grub
27 Jun 8 14:29 initrd -> initrd-2.6.16.60-0.54.5-smp
4244216 Jun 8 14:29 initrd-2.6.16.60-0.54.5-smp
12288 Jun 8 13:39 lost+found
135680 Jun 8 13:48 message
107733 Sep 5 2009 symsets-2.6.16.60-0.54.5-smp.tar.gz
289608 Sep 5 2009 symtypes-2.6.16.60-0.54.5-smp.gz
95755 Sep 5 2009 symvers-2.6.16.60-0.54.5-smp.gz
1807692 Sep 5 2009 vmlinux-2.6.16.60-0.54.5-smp.gz
28 Jun 8 13:39 vmlinuz -> vmlinuz-2.6.16.60-0.54.5-smp
1505819 Sep 5 2009 vmlinuz-2.6.16.60-0.54.5-smp
Rescue:/mnt #

It will be tomorrow before I can do that again and execute the cat command for you.
 
Mounted /dev/sda1 as /mnt and found the file /grub/menu.lst. This is what it says (omitting comments):

default 0
timeout 8
gfxmenu (hd0,0)/message
title SUSE Linux Enterprise Server 10 SP3
root (hd0,0)
kernel /vmlinuz-2.6.16.60-0.54.5-smp root=/dev/evms/lvm2/system/root vga=0x31a resume=/dev/evms/lvm2/system/swap splash=silent showopts
initrd /initrd-2.6.16.60-0.54.5-smp

title failsafe -- SUSE Linux Enterprise Server 10 SP3
root (hd0,0)
kernel /vmlinuz-2.6.16.60-0.54.5-smp root=/dev/evms/lvm2/system/root vga=0x31a showopts ide=nodma apm=off acpi=off noresume edd=off 3
initrd /initrd-2.6.16.60-0.54.5-smp

Mounted /dev/sda1 as /mnt and found the file /etc/fstab. This is what it says:
/dev/root / ext2 defaults 0 0
proc /proc proc defaults 0 0
sysfs /sys /sysfs noauto defaults 0 0
usbfs /proc/bus/usb usbfs defaults 0 0
devpts /dev/pts devpts mode=0620,gid=5 0 0
firmware /lib/firmware tmpfs defaults 0 0
microcode /usr/lib/microcode tmpfs defaults 0 0
 
Get to a terminal window and try stat -f X
where X is any device OR a mount point, e.g. stat -f /

The results will be a few lines; the last two report
Blocks: total xxx free yyy available zzz size 1024
Inodes: total xxx free yyy

The partition is "FULL" if free is very, very low, e.g. 10 or 20.

Notice that FULL can apply to the data space (aka blocks), or to the inodes, which contain the metadata: name, date, link count, permissions.

BTW: you can test everything in /etc/fstab with this shell one-liner (i.e. copy and paste). It skips comment lines and feeds stat the mount points (column 2), since column-1 pseudo-devices like proc are not paths stat can examine:
Code:
stat -f $(grep -v '^#' /etc/fstab | awk '{print $2}')
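If df is available on the rescue system, the same full/not-full question can be answered at a glance - and remember a filesystem can run out of inodes even while blocks remain free:

Code:
df -h    # block (data space) usage per mounted filesystem
df -i    # inode usage - a filesystem can be "full" here alone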
 
Can't give that a go until Monday, but recently I burned and booted from an openSUSE Linux live CD and found I could actually see the structure of the server, including being able to completely access the two drives:
/dev/sda1 (the Linux ext2 boot drive?) at 74 MB
and /dev/sda2 (the Linux LVM2 drive?) at 11 GB

Neither drive appears to be full.
On the LVM2 drive, I can see the mount points for the Novell volumes, although their contents cannot be seen (these are NSS formatted).
Here is the content of a file called nrmdfinfo, which certainly reflects what should be on this server:
Filesystem Type Size Used Avail Use% Mounted on
/dev/evms/lvm2/system/root
reiserfs 10G 6.4G 3.7G 64% /
udev tmpfs 5.9G 248K 5.9G 1% /dev
/dev/evms/sda1
ext2 69M 11M 55M 16% /boot
/dev/evms/DATAPOOL
nsspool 666G 47G 619G 8% /opt/novell/nss/mnt/.pools/DATAPOOL
admin nssadmin 4.0M 0 4.0M 0% /_admin
JAG nssvol 666G 5.3G 619G 1% /media/nss/JAG
ARC nssvol 666G 40G 619G 7% /media/nss/ARC
SYS nssvol 666G 548K 619G 1% /media/nss/SYS
/dev/sr0 iso9660 659M 659M 0 100% /media/SU2SP2_001
On the face of it, there is no reason for the failure to boot with 'drive is full' messages. I can access pretty much anything you might want to see.


Below I post a screenshot of the server as seen from the live CD:

http://www.jagspares.co.uk/image/screenshot.png

I can also see, on the LVM2 partition, clear signs of a partial archive backup: in the media folder there is evidence of an incomplete backup created by the offending Acronis backup software. It was supposed to be writing to a plugged-in USB drive.

It is 2 GB, and it might be filling the volume. All I might need to do is find out how to delete it when using the live CD; at first sight, access is denied to the user.
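If it comes to it, my guess at the deletion procedure from the live CD would be something like this - the volume group name is taken from the partitioner output above, and the archive filename is hypothetical until I can ls the folder:

Code:
su -                               # the live CD desktop user lacks the rights
vgchange -ay                       # activate the LVM volume group
mount /dev/system/root /mnt        # mount the server's root volume
ls -lh /mnt/media                  # locate the stray Acronis archive
rm /mnt/media/backup_archive.tib   # hypothetical name - verify first!
umount /mnt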

One other strange thing - the /dev/sda1 volume looks just fine, and displays the size and file allocation totals just fine. On the other hand, the LVM2 partition on /dev/sda2 says total size 0, occupied 0, yet shows a file total after adding them all up, and can show me file contents etc.

Now that might just be an artifact of using a different distro on the real system, or it might be an indication of file system corruption. If this were Windows, I'd be looking at running chkdsk /r!

Do you have a view on this?
 
On the other hand, the LVM2 partition on /dev/sda2 says total size 0, occupied 0, yet shows a file total after adding them all up, and can show me file contents etc.
LVM2 is a logical volume manager - akin to a Windows dynamic disk: usually two or more drives treated as one, but NOT RAID-x, unless you consider it JBOD.
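If you want to see what LVM itself makes of that partition, the standard reporting tools should work from the live CD (assuming the lvm2 package is present):

Code:
pvdisplay   # physical volumes - the real partitions, e.g. /dev/sda2
vgdisplay   # volume groups built on top of them
lvdisplay   # logical volumes carved from the groups (root, swap, ...)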
 
Just for your education, Jobeard. The solution has been found.

First, use a similar Linux live CD distro - in this case openSUSE Linux 11.3. With the server booted from that, the weird extended partition shown as /dev/evms/lvm2/system/root can actually be mounted as superuser, and the offending file deleted. At 2 GB it was indeed filling the volume (a failed Acronis backup incorrectly written to the main server volume).

The server would then boot normally, but failed to mount the volumes on /dev/evms/DATAPOOL. It was also learned that the SUSE rescue system does not automatically mount the EVMS volumes either, though they can be mounted manually. That was why so many weird things were shown in the server recovery process - pretty poor.
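For the record, manual mounting from the rescue system might look like this, assuming the EVMS userspace tools are present there:

Code:
evms_activate                           # create the /dev/evms/* device nodes
ls /dev/evms/lvm2/system                # confirm the volumes appeared
mount /dev/evms/lvm2/system/root /mnt   # then mount as usual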

Using a remote IE connection to Novell iManager, the whole server can be seen, and it was observed that the extended EVMS volumes were 'not active' and 'not mounted'. They can be mounted and made active in iManager, and apparently the change is persistent. Thereafter, the server booted fully and normally.
 