Discussion:
[EON] NFS Shares suddenly unavailable
Travis T
2011-06-09 01:19:09 UTC
I have had a directory shared via NFS for almost a year now, and today, for no apparent reason, my client lost access to it. The client still shows the share as connected, but nothing is listed in the directory.

On the EON box, everything is still listed in the directory. showmount -a shows the client connected to the correct shares. I'm obviously not very experienced with Solaris, but I can't see anything that changed. This share is for my ESXi server, which hosts all of my virtual machines, so it's important to me to get this restored ASAP.

/var/adm/messages only has errors pertaining to LDAP and global catalog servers, which is due to the NFS datastore for my VMs being down. The last entries are from 4 June, when I connected a keyboard/mouse to the local console.

Any ideas on what to check?
--
This message posted from opensolaris.org
Travis T
2011-06-09 02:38:09 UTC
Ok, I think I'm slowly validating a hardware issue. I noticed that my PCI video card was freaking out. Changed slots with no change. Scrapped it for a PCI-e card. Video started working again.

After getting the EON box to boot, I noticed my PCI NIC was resetting every second or two. It would never negotiate with my switch. I moved the NIC to a second PCI slot. It booted fine and the NIC didn't reset, but it didn't show up in ifconfig -a. I tried ifconfig -a plumb, but got an error message.

I shut down the box and moved the NIC to a 3rd PCI slot. No video, and I didn't see any indication of a POST. I reset it and EON booted with the NIC configured as usual. Then I noticed my zpools were gone. I did a zpool import -a and the zpools were imported.

Remote logins are taking over 5 minutes before I get a password prompt. Something is definitely wrong here, but I can't find any indication of a problem.

I'm currently having permission issues on my NFS share, so I still can't start my VMs.
Andre Lue
2011-06-09 02:50:29 UTC
Hi Travist,

5 minutes tells me something is really wrong.

Is this a simple authentication or something involving LDAP?

If you login on the console, how long does that take?

I think there is a hardware issue of some sort which is creating problems in other areas and needs to be addressed first. I'll do what I can to assist.
Travis T
2011-06-09 17:48:42 UTC
The only LDAP-related queries should be for the AD-integrated sharing of my zpools. Console login is normal. I suspect there is a DNS lookup issue, since my DNS server is down because it's hosted on my VMware host. I've unjoined the domain and joined a workgroup, which let me copy the files for my servers over to a local datastore on the ESXi server. This enabled me to get my servers back online.

Not sure what hardware problems could be causing this, but I definitely saw some weird stuff yesterday with the PCI cards...
Travis T
2011-06-12 04:35:21 UTC
Dre,

Due to the problems I'm currently having with the EON box, I'm trying to move all of the data from this box via rsync so I can do some more testing on the hardware to determine the cause. I've never used rsync before today, but from what I understand it is a fairly fast protocol. My transfer rates are very slow, and I have a lot of data to move over. Any ideas to improve the transfer rates? I'd like to get this data transferred in the next couple of days so I can start looking into this while I have some time.

Is rsync the best bet for migrating this data?

Is there anything I can try to improve data transfer rates?
Andre Lue
2011-06-12 23:39:35 UTC
Hi Travist,

rsync is usually decent. The other option is using zfs send/receive in combination with netcat or mbuffer, but with what's going on in your case it is really hard to say whether performance will be any better or worse.
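
For reference, here is a minimal sketch of what the zfs send/receive route could look like with either netcat or mbuffer. It is a dry run: it only prints the pipelines so they can be reviewed before anything is run. The dataset names, snapshot name, host address, port, and mbuffer buffer sizes are all placeholder assumptions, not values from this thread, and the netcat listen syntax differs between implementations.

```shell
#!/bin/sh
# Dry-run sketch: prints zfs send/receive pipelines instead of running them.
# Every value below is a hypothetical placeholder; adjust before use.
SRC_DATASET="tank/vms"       # dataset on the sending box (assumed name)
SNAPSHOT="migrate1"          # snapshot name (assumed)
DEST_HOST="192.0.2.10"       # receiving host (documentation-range address)
DEST_DATASET="backup/vms"    # dataset to create on the receiver (assumed)
PORT=9090                    # any free TCP port (assumed)

# Snapshot to take on the sender before streaming.
snapshot_cmd() {
    echo "zfs snapshot ${SRC_DATASET}@${SNAPSHOT}"
}

# netcat variant. Start the receiver first.
# Some netcat builds want "nc -l -p PORT" instead of "nc -l PORT".
nc_receiver_cmd() {
    echo "nc -l ${PORT} | zfs receive ${DEST_DATASET}"
}
nc_sender_cmd() {
    echo "zfs send ${SRC_DATASET}@${SNAPSHOT} | nc ${DEST_HOST} ${PORT}"
}

# mbuffer variant: -s is the block size, -m the memory buffer size,
# -I listens on a port, -O sends to host:port. Sizes here are guesses.
mbuffer_receiver_cmd() {
    echo "mbuffer -s 128k -m 1G -I ${PORT} | zfs receive ${DEST_DATASET}"
}
mbuffer_sender_cmd() {
    echo "zfs send ${SRC_DATASET}@${SNAPSHOT} | mbuffer -s 128k -m 1G -O ${DEST_HOST}:${PORT}"
}

echo "# on the sender, first:"
snapshot_cmd
echo "# netcat: run on the receiver, then on the sender:"
nc_receiver_cmd
nc_sender_cmd
echo "# mbuffer: run on the receiver, then on the sender:"
mbuffer_receiver_cmd
mbuffer_sender_cmd
```

Neither netcat nor mbuffer encrypts the stream, so this only makes sense on a trusted LAN. Whether it beats rsync here will depend on whatever is wrong with the hardware.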