
Problem log: mlnx-ofed-kernel-dkms error when installing MLNX_OFED_LINUX-5.4-3.6.8.1-ubuntu20.04-x86_64

Author: 小教学 | Published: 2023-09-24 | Category: Programming and Development



root@zju-PowerEdge-R540:/home/zju/Downloads/MLNX_OFED_LINUX-5.4-3.6.8.1-ubuntu20.04-x86_64# ./mlnxofedinstall --upstream-libs --dpdk
Logs dir: /tmp/MLNX_OFED_LINUX.1892101.logs
General log file: /tmp/MLNX_OFED_LINUX.1892101.logs/general.log

Below is the list of MLNX_OFED_LINUX packages that you have chosen
(some may have been added by the installer due to package dependencies):

ofed-scripts
mstflint
mlnx-tools
mlnx-ofed-kernel-utils
mlnx-ofed-kernel-dkms
rdma-core
libibverbs1
ibverbs-utils
ibverbs-providers
libibverbs-dev
librdmacm1
rdmacm-utils
librdmacm-dev
libibumad3
ibacm
python3-pyverbs

This program will install the MLNX_OFED_LINUX package on your machine.
Note that all other Mellanox, OEM, OFED, RDMA or Distribution IB packages will be removed.
Those packages are removed due to conflicts with MLNX_OFED_LINUX, do not reinstall them.

Do you want to continue? [Y/n] y
Setting up mlnx-ofed-kernel-dkms (5.4-OFED.5.4.3.6.8.1) ...
Removing old mlnx-ofed-kernel-5.4 DKMS files...


Deleting module version: 5.4
completely from the DKMS tree.

Done.
Loading new mlnx-ofed-kernel-5.4 DKMS files...
First Installation: checking all kernels...
Building only for 5.15.0-83-generic
Building for architecture x86_64
Building initial module for 5.15.0-83-generic

Reading package lists... Done
Building dependency tree
Reading state information... Done
mlnx-ofed-kernel-dkms is already the newest version (5.4-OFED.5.4.3.6.8.1).
0 upgraded, 0 newly installed, 0 to remove and 31 not upgraded.
1 not fully installed or removed.
After this operation, 0 B of additional disk space will be used.
Do you want to continue? [Y/n] y
Setting up mlnx-ofed-kernel-dkms (5.4-OFED.5.4.3.6.8.1) ...
Removing old mlnx-ofed-kernel-5.4 DKMS files...


Deleting module version: 5.4
completely from the DKMS tree.

Done.
Loading new mlnx-ofed-kernel-5.4 DKMS files...
First Installation: checking all kernels...
Building only for 5.15.0-83-generic
Building for architecture x86_64
Building initial module for 5.15.0-83-generic
Error! Bad return status for module build on kernel: 5.15.0-83-generic (x86_64)
Consult /var/lib/dkms/mlnx-ofed-kernel/5.4/build/make.log for more information.
dpkg: error processing package mlnx-ofed-kernel-dkms (--configure):
installed mlnx-ofed-kernel-dkms package post-installation script subprocess returned error exit status 10
Errors were encountered while processing:
mlnx-ofed-kernel-dkms
E: Sub-process /usr/bin/dpkg returned an error code (1)
W: Operation was interrupted before it could finish
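
The DKMS message above points at /var/lib/dkms/mlnx-ofed-kernel/5.4/build/make.log for the real compiler error. A minimal sketch for pulling the first error lines out of that log follows; the helper name show_build_errors and the grep patterns are illustrative assumptions, not part of MLNX_OFED or DKMS.

```shell
# Hypothetical helper: print the first error-looking lines of a DKMS build log.
# Usage: show_build_errors [LOGFILE] [MAX_LINES]
show_build_errors() {
    # Default path is the one named in the DKMS error message above.
    local log="${1:-/var/lib/dkms/mlnx-ofed-kernel/5.4/build/make.log}"
    local max="${2:-20}"
    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 1
    fi
    # Match compiler errors ("error:") and make failures ("Error N", "***"),
    # prefixing each hit with its line number in the log.
    grep -n -E 'error:|Error [0-9]+|\*\*\*' "$log" | head -n "$max"
}
```

Calling show_build_errors with no arguments reads the default log path. The first compiler error is usually the most informative one, since later errors are often consequences of it.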

Around the same time, one of the storage volumes on this machine also failed; the cause is unknown.

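I originally wanted to install the RDMA components, following https://blog.csdn.net/ibless/article/details/121663751 .
The step sudo ./mlnxofedinstall --add-kernel-support reported an error.
After that, DPDK packet reception stopped working, with the following output: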
EAL: Detected 32 lcore(s)
EAL: Detected 2 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: No free hugepages reported in hugepages-1048576kB
EAL: No free hugepages reported in hugepages-1048576kB
EAL: No available hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: using IOMMU type 8 (No-IOMMU)
EAL: Probe PCI driver: net_ice (8086:1592) device: 0000:af:00.0 (socket 1)
ice_load_pkg_type(): Active package is: 1.3.30.0, ICE OS Default Package
ice_init_proto_xtr(): Protocol extraction is not supported
EAL: Probe PCI driver: net_ice (8086:1592) device: 0000:af:00.1 (socket 1)
ice_load_pkg_type(): Active package is: 1.3.30.0, ICE OS Default Package
ice_init_proto_xtr(): Protocol extraction is not supported
EAL: No legacy callbacks, legacy socket not created
EAL: Error - exiting with code: 1
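
The EAL output above reports no free 1 GB hugepages but does not state why it exited, so it may be worth checking what the kernel has actually reserved. The sketch below reads the standard sysfs counters under /sys/kernel/mm/hugepages; the function name report_hugepages is a hypothetical helper, and whether hugepages are the real cause here is an assumption, not something the log confirms.

```shell
# Hypothetical helper: list total and free hugepages for every page size.
# Usage: report_hugepages [SYSFS_DIR]
report_hugepages() {
    local root="${1:-/sys/kernel/mm/hugepages}"
    local dir
    for dir in "$root"/hugepages-*; do
        # Skip the unexpanded glob when no hugepage sizes are present.
        [ -d "$dir" ] || continue
        printf '%s total=%s free=%s\n' \
            "$(basename "$dir")" \
            "$(cat "$dir/nr_hugepages")" \
            "$(cat "$dir/free_hugepages")"
    done
}
```

Running report_hugepages with no arguments prints one line per page size. If every size shows free=0, DPDK has no hugepage memory to allocate from.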

After that, I tried to install MLNX_OFED using the following steps, and the installation failed.

Download the MLNX_OFED release directly with wget:

https://content.mellanox.com/ofed/MLNX_OFED-5.4-3.6.8.1/MLNX_OFED_LINUX-5.4-3.6.8.1-ubuntu20.04-x86_64.tgz

Then extract it and install:

tar -xvzf MLNX_OFED_LINUX-5.4-3.6.8.1-ubuntu20.04-x86_64.tgz

cd /home/zju/Downloads/MLNX_OFED_LINUX-5.4-3.6.8.1-ubuntu20.04-x86_64

sudo ./mlnxofedinstall --dpdk

Then restart the driver as the installer output instructs:

/etc/init.d/openibd restart

This problem is still unresolved. This thread may help: https://forums.developer.nvidia.com/t/failed-to-install-mlnx-ofed-kernel-dkms-deb-with-version-4-9-4-1-7-0/205889
or this one:
https://forums.developer.nvidia.com/t/i-failed-to-build-mlnx-ofed-linux-for-5-4-0-70-generic/206147

Update: resolved by installing version 5.8.

Check the NVIDIA website and choose another suitable version:
https://network.nvidia.com/products/infiniband-drivers/linux/mlnx_ofed/
My system is Ubuntu 20.04, so I downloaded the newer 5.8 driver and extracted it:
wget https://content.mellanox.com/ofed/MLNX_OFED-5.8-3.0.7.0/MLNX_OFED_LINUX-5.8-3.0.7.0-ubuntu20.04-x86_64.tgz
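
The two download URLs in this post (5.4-3.6.8.1 and 5.8-3.0.7.0) share the same layout, so the link for another release can be sketched as below. The function name ofed_url is hypothetical, and the assumption that other versions and distributions follow the same pattern is inferred only from these two URLs; check the NVIDIA download page before relying on it.

```shell
# Hypothetical helper: build a download URL in the same layout as the two
# URLs quoted in this post. Other releases may not follow this pattern.
# Usage: ofed_url VERSION DISTRO [ARCH]
ofed_url() {
    local ver="$1" distro="$2" arch="${3:-x86_64}"
    printf 'https://content.mellanox.com/ofed/MLNX_OFED-%s/MLNX_OFED_LINUX-%s-%s-%s.tgz\n' \
        "$ver" "$ver" "$distro" "$arch"
}
```

For example, wget "$(ofed_url 5.8-3.0.7.0 ubuntu20.04)" would fetch the archive used here.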
Then, in the old 5.4 directory, run the following: [screenshot in the original post, not preserved]

Then, in the 5.8 directory, run
./mlnxofedinstall --add-kernel-support

This failed.
Then run ./mlnxofedinstall --dpdk

This succeeded.
Then run /etc/init.d/openibd restart
It failed with the following error:

rmmod: ERROR: Module ib_uverbs is in use by: irdma

In other words, the kernel module ib_uverbs is in use by irdma, so see
https://blog.csdn.net/qq_32949893/article/details/108402550
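
Before unloading anything, it can help to confirm which modules currently hold ib_uverbs. The sketch below parses /proc/modules, whose fourth column lists the dependent modules (or "-" when there are none); the function name module_users is a hypothetical helper, not an existing tool.

```shell
# Hypothetical helper: print the modules that currently use a given module.
# Usage: module_users MODULE [MODULES_FILE]
module_users() {
    local mod="$1" file="${2:-/proc/modules}"
    # Column 4 is a comma-separated list with a trailing comma, or "-".
    awk -v m="$mod" '$1 == m && $4 != "-" {
        sub(/,$/, "", $4)      # drop the trailing comma
        gsub(/,/, " ", $4)     # separate names with spaces
        print $4
    }' "$file"
}
```

On the machine in this post, module_users ib_uverbs would be expected to list irdma, matching the rmmod error. Every module it prints has to be removed before ib_uverbs itself can be unloaded.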

Run sudo modprobe -r irdma ib_uverbs, then run /etc/init.d/openibd restart again, and it works.

Installation complete.




