Mellanox SDK and PRM Sniffer Utility CLI Design for SONiC
Rev | Date | Author | Change Description |
---|---|---|---|
0.2 | | Liu Kebo | Initial version |
This document describes the Mellanox SDK/FW sniffer utilities and how to implement a CLI that exposes these utilities on a SONiC system.
The SDK sniffer records the RPC calls from the Mellanox SDK user API library to the sx_sdk task into a .pcap file. This .pcap file can be replayed afterward to bring the SDK and FW into exactly the same state, which helps reproduce and investigate issues.
In some cases, to inspect the interaction between the SDK and FW, the PRM sniffer can be enabled to record the communication to a human-readable log file. The MLNX support team can then analyze this log file to identify where the problem is.
These two sniffers are independent and can work simultaneously.
To enable these two sniffers, specific environment variables must be set and the SDK must be restarted.
The new CLI shall provide a user interface that exposes the above SDK and PRM sniffer debug utilities on a SONiC system.
Enabling and disabling the SDK and PRM sniffers is controlled by environment variables that are passed to the SDK task during startup.
In SONiC, the SDK task resides in the syncd container, so enabling or disabling the sniffers requires setting or unsetting environment variables inside the syncd container and passing them to the sx_sdk task. One possible way is to manipulate the supervisord configuration of the syncd container.
For convenience of debugging, the sniffer files shall be stored in the host file system instead of in the container. To achieve this, a volume will be used to bind a directory of the host file system to a directory of the container. This can be done by adding a volume bind option to the docker run command.
A new folder will be created to store the sniffer files: "/var/log/mellanox/sniffer/"
For the SDK sniffer, the result is stored in a .pcap file whose name includes a timestamp of the start time, for example, "sx_sdk_sniffer_20180224081306.pcap".
The PRM sniffer result file name also contains a start timestamp, for example, "prm_recording_20180225111422.log".
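The naming scheme above can be sketched in a few lines of Python. This is only an illustration: the helper name `sniffer_file_path` is hypothetical, and the timestamp layout (year, month, day, hour, minute, second) is an assumption read off the two example file names.

```python
from datetime import datetime
import os

# Directory named in this document; bound between the host and the syncd container.
SNIFFER_DIR = "/var/log/mellanox/sniffer/"


def sniffer_file_path(kind, start=None):
    """Return the result file path for the 'sdk' or 'prm' sniffer.

    `start` is the sniffer start time; it defaults to the current time.
    """
    start = start or datetime.now()
    # Assumed layout: YYYYMMDDHHMMSS, matching the examples in this document.
    stamp = start.strftime("%Y%m%d%H%M%S")
    if kind == "sdk":
        name = "sx_sdk_sniffer_{}.pcap".format(stamp)
    elif kind == "prm":
        name = "prm_recording_{}.log".format(stamp)
    else:
        raise ValueError("unknown sniffer kind: {!r}".format(kind))
    return os.path.join(SNIFFER_DIR, name)
```

Passing the start time explicitly keeps the function deterministic, so the same path can be both written into the configuration and shown to the user.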
The major work of this CLI therefore consists of a set of actions that manipulate the supervisord configuration of the syncd docker container and restart the related services:
- Add or delete the related environment variable configuration in the syncd supervisord configuration
- Restart the swss service to reload all related modules and services, including the SDK, so that the sniffer starts working
Sniffer files will be stored in the directory "/var/log/mellanox/sniffer/" on the host. To set up the volume, one option needs to be added to the original syncd container create command, which is in the file "/usr/bin/syncd.sh":
-v /var/log/mellanox/sniffer:/var/log/mellanox/sniffer:rw
To enable the SDK/PRM sniffers, two sets of environment variables need to be passed to the syncd container so that the SDK starts with the sniffers enabled:
SX_SNIFFER_ENABLE
SX_SNIFFER_TARGET
PRM_SNIFFER
PRM_SNIFFER_FILE_PATH
Each set controls the enable/disable state and the file path of the corresponding sniffer.
To enable a sniffer, save the related configuration to a supervisord configuration file and upload it to the folder "/etc/supervisor/conf.d" of the syncd container:
[program:syncd]
environment=SX_SNIFFER_ENABLE="1",SX_SNIFFER_TARGET="/var/log/mellanox/sniffer/sx_sdk_sniffer_20180224081306.pcap"
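As a rough sketch, the configuration fragment above could be generated as follows. The function name `render_sniffer_conf` is hypothetical, and the value `"1"` for `PRM_SNIFFER` is an assumption made by analogy with `SX_SNIFFER_ENABLE`; this document does not state the PRM values.

```python
def render_sniffer_conf(sdk_target=None, prm_target=None):
    """Build a [program:syncd] section carrying the sniffer environment.

    Each argument is the result file path of one sniffer, or None to leave
    that sniffer disabled. Returns "" when neither sniffer is requested.
    """
    env = []
    if sdk_target:
        env += [("SX_SNIFFER_ENABLE", "1"), ("SX_SNIFFER_TARGET", sdk_target)]
    if prm_target:
        # Assumption: PRM_SNIFFER is switched on with "1", like SX_SNIFFER_ENABLE.
        env += [("PRM_SNIFFER", "1"), ("PRM_SNIFFER_FILE_PATH", prm_target)]
    if not env:
        return ""
    # supervisord expects comma-separated KEY="value" pairs on one line.
    pairs = ",".join('{}="{}"'.format(key, value) for key, value in env)
    return "[program:syncd]\nenvironment={}\n".format(pairs)
```

Returning an empty string for the disabled case lets the caller decide whether to delete the configuration file instead of writing an empty section.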
To restart the SDK and have the whole system work properly afterward, some related modules and services also need to be restarted. Restarting the SWSS service guarantees that all impacted modules and services are restarted in the proper sequence. The command is:
service swss restart
The sniffer CLI will be implemented to run the commands mentioned above to enable or disable the SDK sniffer, the PRM sniffer, or both.
SONiC# config platform mlnx sniffer ?
Usage: sniffer [OPTIONS] COMMAND [ARGS]...
SONiC command line - 'Sniffer' command
Options:
-?, -h, --help Show this message and exit.
Commands:
sdk sdk sniffer
prm prm sniffer
all all sniffers
status sniffer running status
SONiC# config platform mlnx sniffer sdk ?
Usage: sniffer sdk [OPTIONS] COMMAND [ARGS]...
SDK Sniffers
Options:
-?, -h, --help Show this message and exit.
Commands:
enable Enable SDK sniffer
disable Disable SDK sniffer
SONiC# config platform mlnx sniffer prm ?
Usage: sniffer prm [OPTIONS] COMMAND [ARGS]...
PRM Sniffers
Options:
-?, -h, --help Show this message and exit.
Commands:
enable Enable PRM sniffer
disable Disable PRM sniffer
SONiC# config platform mlnx sniffer all ?
Usage: sniffer all [OPTIONS] COMMAND [ARGS]...
SDK and PRM Sniffers
Options:
-?, -h, --help Show this message and exit.
Commands:
enable Enable SDK and PRM sniffer
disable Disable SDK and PRM sniffer
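The command tree shown above can be modeled with nested subcommands. The sketch below uses the Python standard library's `argparse` purely for illustration, so it has no third-party dependency; it is not meant to reflect how the SONiC CLI is actually built, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser for: sniffer {sdk,prm,all} {enable,disable} | status."""
    parser = argparse.ArgumentParser(prog="sniffer")
    commands = parser.add_subparsers(dest="target")
    commands.required = True  # a command must always be given
    groups = (("sdk", "SDK sniffer"),
              ("prm", "PRM sniffer"),
              ("all", "SDK and PRM sniffer"))
    for target, text in groups:
        sub = commands.add_parser(target, help=text)
        actions = sub.add_subparsers(dest="action")
        actions.required = True  # each group needs enable or disable
        actions.add_parser("enable", help="Enable {}".format(text))
        actions.add_parser("disable", help="Disable {}".format(text))
    # status has no enable/disable level beneath it.
    commands.add_parser("status", help="sniffer running status")
    return parser
```

For example, `build_parser().parse_args(["sdk", "enable"])` yields a namespace with `target` set to `"sdk"` and `action` set to `"enable"`, while a bare `sdk` is rejected because an action is required.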
When a sniffer enable/disable command is issued, a prompt for the SWSS service restart is shown, and the user must agree to proceed; otherwise the command is canceled.
The sniffer file names are also shown after the command is issued.
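The confirm, restart, and report sequence could look roughly like the sketch below. All names here (`apply_sniffer_change`, `apply_config`, the prompt wording, and the output messages) are hypothetical; only the `service swss restart` command comes from this document. The prompt, command runner, and output function are parameters so the flow can be exercised without touching a real system.

```python
import subprocess


def apply_sniffer_change(action, files, apply_config,
                         confirm=input, run=subprocess.check_call, out=print):
    """Confirm with the user, update the config, restart swss, list the files.

    action       -- "enable" or "disable"
    files        -- sniffer result file paths to report to the user
    apply_config -- callable that adds or removes the supervisord config
    Returns True if the change was applied, False if the user declined.
    """
    answer = confirm("The swss service will be restarted, continue? [y/N]: ")
    if answer.strip().lower() not in ("y", "yes"):
        out("Aborted.")
        return False
    apply_config()  # the config must be in place before the SDK restarts
    run(["service", "swss", "restart"])
    for path in files:
        out("Sniffer {}d, file: {}".format(action, path))
    return True
```

Nothing is modified before the user agrees, which matches the requirement that a declined prompt cancels the command.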
Will log rotation be required?
The PRM sniffer generates a log file by default. Splitting the output across multiple files should not impact the analysis, so log rotation could be considered for the PRM sniffer file.