Tuesday, June 07, 2016

 

FATAL ERROR: Failure (1) launching ......../bin/linux-x86_64/elemLinxMgr

Q: I shut down an Element cluster via the WebUI, then ran "/etc/init.d/element stop" and verified that elemBaseLoader and elemLoader were both stopped. When I tried to start Element again, it failed with a LINX error:

(element.log)
FATAL ERROR: Failure (1) launching /usr/local/element/file_cache/elem324sc1616/bin/linux-x86_64/elemLinxMgr

(linx_launch.log)
===> Installed LINX kernel modules
Tue Jun  7 15:49:15 PDT 2016
Jun 07 15:49:15 '/sbin/insmod /lib/modules/2.6.27.25-78.2.56.fc9.x86_64/kernel/net/linx/linx.ko 2> /dev/null linx_max_links=128 linx_max_spids=2048 linx_max_attrefs=8192' FAILED
Tue Jun  7 16:26:56 PDT 2016
===> Uninstall LINX kernel modules


When I try to insmod this same module manually, as root, I get an error saying the linx*.ko kernel modules are already there:
-1 File exists

But "lsmod" doesn't show any LINX modules at all. The loader thinks LINX is there, but it isn't (??)

A: Some systems can get into an odd state on restarts, in which one can neither insmod nor rmmod the LINX modules. The linx*.ko files are indeed there; deleting them won't help because they'll reappear on the next attempt.

There are two possibilities: 1) LINX really is loaded, or 2) LINX tried to load, but something else beat Element to the punch and registered itself as protocol family 29 (which is the slot LINX wants). Most of the time, the culprit is the Controller Area Network module, aka "can". The telltale sign is this (in dmesg, syslog, messages, etc.):

kernel: can: controller area network core (rev 20081130 abi 8)
kernel: NET: Registered protocol family 29
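
One way to check for this telltale sign is to search a saved copy of the kernel log for both lines. The following is only a minimal sketch and not part of Element: the function name can_took_family_29 and the /tmp path are made up for illustration, and the match strings are simply copied from the sample output above, so they may need adjusting for other kernel versions.

```shell
# Hypothetical helper: succeeds (exit 0) if the given log file shows both
# the CAN core loading and protocol family 29 being registered.
can_took_family_29() {
    # $1: path to a text file holding kernel log output (e.g. saved dmesg)
    grep -q 'can: controller area network core' "$1" &&
        grep -q 'NET: Registered protocol family 29' "$1"
}

# Example use (the /tmp path is an arbitrary choice):
#   dmesg > /tmp/dmesg.txt
#   can_took_family_29 /tmp/dmesg.txt && echo "can has claimed family 29"
```

Running "lsmod | grep '^can '" is a quick second check that the "can" module is currently loaded.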

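The easiest way to get rid of this problem is to blacklist "can" permanently. The blacklist file lives in /etc/modprobe.d/ and is called either "blacklist" or "blacklist.conf", depending on your distro. Once "can" is blacklisted, the file should contain this entry: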

grep can /etc/modprobe.d/blacklist
....
blacklist can


Once "can" is blacklisted, run "rmmod can" (still as root), then "element stop" and "element start" again, and LINX should be able to load. Users frequently run into this issue, especially when bringing in a new blade or other new hardware, or when conducting restart/robustness testing for the first time on an established system.
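
The recovery steps can be sketched as a short root shell session. This is only a sketch under assumptions: ensure_blacklisted is a made-up helper name, the blacklist file name depends on your distro, and the init script path is the one quoted in the question.

```shell
# Hypothetical helper: append "blacklist <module>" to a modprobe blacklist
# file unless that exact line is already present.
ensure_blacklisted() {
    # $1: module name, $2: blacklist file
    grep -qx "blacklist $1" "$2" 2>/dev/null || echo "blacklist $1" >> "$2"
}

# Example use, as root (adjust the file name for your distro):
#   ensure_blacklisted can /etc/modprobe.d/blacklist
#   rmmod can
#   /etc/init.d/element stop
#   /etc/init.d/element start
```

The helper only appends when the line is missing, so running it more than once leaves a single entry in the file.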
