User:Zucca/trash/InfiniBand/resources

= InfiniBand resources for improving the InfiniBand Gentoo wiki article =

''' If anyone has the urge or need to improve the article, I list some resources here as a reminder for myself and for others to use. '''

What is this InfiniBand anyway? And how does it compare to Ethernet?
”''InfiniBand is a high-speed serial computer bus, intended for both internal and external connections. It is the result of merging two competing designs, Future I/O, developed by Compaq, IBM, and Hewlett-Packard, with Next Generation I/O (ngio), developed by Intel, Microsoft, and Sun Microsystems. From the Compaq side, the roots were derived from Tandem's ServerNet. For a short time before the group came up with a new name, InfiniBand was called System I/O.''

''Ethernet: A computer network cabling system designed by Xerox in the late 1970s. Originally transmission rates were 3 Megabits per second (Mb/s) over thick coaxial cable. Media today include fiber, twisted-pair (copper), and several coaxial cable types. Rates are upto 10 Gigabits per second or 10,000 Mb/s.''”

See also: http://www.informatix-sol.com/docs/EthernetvInfiniBand.pdf

Passive cables
Passive cables do not contain any electronics, only wires. The maximum length at Double Data Rate (DDR) is 10 meters, but usually 7 to 7.5 meters is the longest being sold. Passive CX-type connectors come in at least three different physical widths: 4X, 8X and 12X, each representing the number of lanes in the connection. Adding more lanes naturally adds more bandwidth, but the cables get bulkier and the connectors get bigger, as more wires have to be carried. In addition to the width, there are at least three ways these connectors can be attached to the HCA:
 * Pull-latch
 * Push-latch
 * Thumbscrew
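
The relation between connector width and bandwidth can be sketched with a little arithmetic. The following is only an illustration, assuming the commonly cited per-lane signalling rates (2.5, 5 and 10 Gbit/s for SDR, DDR and QDR) and the 8b/10b encoding those generations use; the function and constant names are made up for this example.

```python
# Per-lane signalling rates in Gbit/s for the early InfiniBand generations.
LANE_SIGNAL_GBITS = {"SDR": 2.5, "DDR": 5.0, "QDR": 10.0}

# SDR, DDR and QDR use 8b/10b encoding: 8 data bits per 10 bits on the wire.
ENCODING_EFFICIENCY = 8 / 10


def link_rates(generation, lanes):
    """Return (signalling, data) rate in Gbit/s for a link of the given width."""
    signal = LANE_SIGNAL_GBITS[generation] * lanes
    return signal, signal * ENCODING_EFFICIENCY


if __name__ == "__main__":
    # Compare the three connector widths mentioned above at DDR speed.
    for lanes in (4, 8, 12):
        signal, data = link_rates("DDR", lanes)
        print(f"{lanes}X DDR: {signal:g} Gbit/s signalling, {data:g} Gbit/s data")
```

So a 4X DDR link signals at 20 Gbit/s and carries roughly 16 Gbit/s of data, which is why 4X DDR hardware is often advertised as "20 Gb".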

Setting up
Users of Mellanox InfiniBand hardware probably need kernel 4.9 or newer.

https://software.intel.com/en-us/articles/enabling-ip-over-infiniband-on-the-intel-xeon-phi-coprocessor

http://pkg-ofed.alioth.debian.org/howto/infiniband-howto-4.html Setting up IB

http://pkg-ofed.alioth.debian.org/howto/infiniband-howto-5.html IPoIB

http://www.shocksolution.com/2012/12/installing-and-configuring-infiniband-on-a-red-hat-system/

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Networking_Guide/sec-Configuring_IPoIB.html IPoIB

http://www.davidhunt.ie/enabling-infiniband-on-ububtu-10-10/ & http://www.davidhunt.ie/infiniband-at-home-10gb-networking-on-the-cheap/

And of course the forum topic. ;) If you have some success stories, tell them there. :)

If you want to contribute a resource link here, paste it on the talk page or in the forum topic (link above).

Needed modules
If the related InfiniBand drivers have been compiled as kernel modules, it may happen that not all required modules are loaded.
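
A quick way to see whether something is missing is to compare a list of wanted modules against /proc/modules. The sketch below is only a starting point: the module names in EXAMPLE_MODULES are an illustrative assumption (which ones are really needed depends on the HCA), and the helper names are made up for this example.

```python
from pathlib import Path

# Illustrative selection only; the modules actually needed depend on the HCA.
EXAMPLE_MODULES = ["ib_core", "ib_umad", "ib_ipoib"]


def parse_loaded_modules(proc_modules_text):
    """Return the set of module names from /proc/modules style text."""
    return {line.split()[0] for line in proc_modules_text.splitlines() if line.strip()}


def missing_modules(required, proc_modules_text):
    """Return the required modules that are not in the loaded list, in order."""
    loaded = parse_loaded_modules(proc_modules_text)
    return [name for name in required if name not in loaded]


if __name__ == "__main__":
    proc_modules = Path("/proc/modules")
    # /proc/modules only exists on Linux; skip quietly elsewhere.
    if proc_modules.exists():
        for name in missing_modules(EXAMPLE_MODULES, proc_modules.read_text()):
            print(f"not loaded: {name}")
```

Note that drivers built into the kernel never show up in /proc/modules, so a "not loaded" line is only meaningful for drivers that were built as modules.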

systemd
Systems with systemd may use this as a starting point. It does, however, include all of the most common modules.

OpenRC
Refer to the systemd section for more information about the modules themselves. '''This needs actual testing. I haven't yet rebooted my server so... :P --Zucca (talk) 21:25, 11 May 2017 (UTC)'''

NFS over RDMA
https://www.openfabrics.org/images/eventpresos/workshops2015/DevWorkshop/Tuesday/tuesday_09_lever.pdf

https://www.openfabrics.org/images/eventpresos/2017presentations/204_LinuxNFS_CLever.pdf

Users of NFS and InfiniBand can significantly reduce CPU load by utilizing NFS over RDMA.
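
On the command line the difference to a plain NFS mount is mostly in the mount options. The sketch below only builds such a command and does not run it. It assumes the proto=rdma option from nfs(5) and port 20049, the port registered for NFS over RDMA; the server name, export path and mount point in the usage part are hypothetical.

```python
def nfs_rdma_mount_command(server, export, mountpoint, port=20049):
    """Build a mount command that requests the RDMA transport for NFS."""
    options = f"proto=rdma,port={port}"
    return ["mount", "-t", "nfs", "-o", options, f"{server}:{export}", mountpoint]


if __name__ == "__main__":
    # Hypothetical names; replace with the real IPoIB host name and paths.
    print(" ".join(nfs_rdma_mount_command("ib-server", "/export", "/mnt/nfs")))
```

The server has to listen for RDMA connections on the same port for this to work.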

Client side
Users who don't use systemd may drop the  -option and use their preferred automounter.

Poweroff and sleep related problems
At least Mellanox hardware which uses the kernel driver has some problems when the system is put to sleep (suspend or hibernate), powered off or rebooted. The problems are usually caused by the use of libibumad.

InfiniBand stops working after suspend/hibernate
”When running applications that use ND or libibumad (such as OpenSM) the system might get to an unstable state when trying to shutdown/restart/hibernate it.” This is a problem at least on Mellanox-branded HCAs, most probably the ones using the mthca driver ( kernel module). The solution would be to shut down all programs using libibumad, or not to use any such programs at all and even blacklist the ib_umad kernel module, or alternatively deselect it from the kernel config. To be sure, it would be best to unload all ib_* modules altogether.

A possible systemd way to unload modules
We can try to unload modules before initiating sleep. This, however, isn't enough in most cases. Terminating programs using the interface may be required to unload modules in the first place.

The service above goes through a list file line by line. On each non-comment line it looks for a modules-load config file base name. If it then finds a file with the same name plus the .conf extension, it unloads all the modules that file lists. So in this case we have these files to load and unload InfiniBand modules:
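
The lookup described above can be sketched roughly as follows. This is not the service itself, only an illustration of the same logic: the list file and config directory are passed in as parameters because their real locations are not repeated here, all function names are made up, and using modprobe -r for the unloading is an assumption. The commands are only built, not executed.

```python
from pathlib import Path


def modules_in_conf(conf_path):
    """Return module names from a modules-load.d style file (one per line)."""
    modules = []
    for line in Path(conf_path).read_text().splitlines():
        line = line.strip()
        # modules-load.d treats lines starting with '#' or ';' as comments.
        if line and not line.startswith(("#", ";")):
            modules.append(line)
    return modules


def modules_to_unload(list_file, conf_dir):
    """Collect modules from every <conf_dir>/<name>.conf named in list_file."""
    result = []
    for line in Path(list_file).read_text().splitlines():
        name = line.strip()
        if not name or name.startswith("#"):
            continue  # skip blank lines and comments in the list file
        conf = Path(conf_dir) / f"{name}.conf"
        if conf.is_file():
            result.extend(modules_in_conf(conf))
    return result


def unload_commands(modules):
    """Build one 'modprobe -r' command per module, without running anything."""
    return [["modprobe", "-r", module] for module in modules]
```

Base names without a matching .conf file are silently skipped, which mirrors the "if it finds a file" wording above.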

To recap briefly: the list file names the modules-load.d config files whose listed modules will be unloaded before the sleep, reboot and shutdown targets.

Hibernate and Sleep are not functional when user-space is using its resources.
Mellanox HCA issue. No official answer. The solution above should work.

Unsorted resource links
http://www.ietf.org/wg/concluded/ipoib.html

Gallery / pictures
= refs =