When running comm_replay on ET traces I get the following error:
```
$ comm_replay --enable-profiler --trace-type et --trace-path /workspace/traces --num-replays 1
0: [rank0]: Traceback (most recent call last):
0: [rank0]:   File "/usr/local/bin/comm_replay", line 8, in <module>
0: [rank0]:     sys.exit(main())
0: [rank0]:   File "/usr/local/lib/python3.10/dist-packages/et_replay/tools/comm_replay.py", line 1671, in main
0: [rank0]:     traceBench.runBench(commsParams)
0: [rank0]:   File "/usr/local/lib/python3.10/dist-packages/et_replay/tools/comm_replay.py", line 1324, in runBench
0: [rank0]:     self.benchTime(commsParams)
0: [rank0]:   File "/usr/local/lib/python3.10/dist-packages/et_replay/tools/comm_replay.py", line 1236, in benchTime
0: [rank0]:     self.replayTrace(commsParams=commsParams, warmup=True)
0: [rank0]:   File "/usr/local/lib/python3.10/dist-packages/et_replay/tools/comm_replay.py", line 1063, in replayTrace
0: [rank0]:     (latency, global_latency) = self.runComms(
0: [rank0]:   File "/usr/local/lib/python3.10/dist-packages/et_replay/tools/comm_replay.py", line 820, in runComms
0: [rank0]:     self.collectiveArgs.waitObjIds[curComm.req] = retObj
0: [rank0]: TypeError: unhashable type: 'list'
56: [rank56]: Traceback (most recent call last):
56: [rank56]:   File "/usr/local/bin/comm_replay", line 8, in <module>
56: [rank56]:     sys.exit(main())
56: [rank56]:   File "/usr/local/lib/python3.10/dist-packages/et_replay/tools/comm_replay.py", line 1671, in main
56: [rank56]:     traceBench.runBench(commsParams)
56: [rank56]:   File "/usr/local/lib/python3.10/dist-packages/et_replay/tools/comm_replay.py", line 1324, in runBench
56: [rank56]:     self.benchTime(commsParams)
56: [rank56]:   File "/usr/local/lib/python3.10/dist-packages/et_replay/tools/comm_replay.py", line 1236, in benchTime
56: [rank56]:     self.replayTrace(commsParams=commsParams, warmup=True)
56: [rank56]:   File "/usr/local/lib/python3.10/dist-packages/et_replay/tools/comm_replay.py", line 1063, in replayTrace
56: [rank56]:     (latency, global_latency) = self.runComms(
56: [rank56]:   File "/usr/local/lib/python3.10/dist-packages/et_replay/tools/comm_replay.py", line 820, in runComms
56: [rank56]:     self.collectiveArgs.waitObjIds[curComm.req] = retObj
56: [rank56]: TypeError: unhashable type: 'list'
```
The Chakra schema is 1.1.1-chakra.0.0.4. I've tried with param@main and with param@7b19f58, as the Chakra user guide recommends.
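For context, `TypeError: unhashable type: 'list'` is what Python raises when a list is used as a dict key, which is what the failing assignment `waitObjIds[curComm.req] = retObj` does if `curComm.req` holds a list. The sketch below reproduces that failure in isolation. It is not the `comm_replay` code: `store_wait_obj`, `to_hashable`, and the sample value `[7, False]` are hypothetical names and data chosen for illustration, and the tuple conversion is only one possible workaround, not a confirmed fix.

```python
def store_wait_obj(wait_obj_ids, req, ret_obj):
    """Mimic the failing assignment: use req as a dict key."""
    wait_obj_ids[req] = ret_obj


def to_hashable(req):
    """Hypothetical workaround: turn a flat list into a tuple so it can be a key."""
    return tuple(req) if isinstance(req, list) else req


wait_obj_ids = {}
error_message = None

# A list-valued request id (made-up value) triggers the same TypeError.
try:
    store_wait_obj(wait_obj_ids, [7, False], "handle")
except TypeError as err:
    error_message = str(err)

print("error:", error_message)

# After conversion to a tuple, the same value works as a key.
store_wait_obj(wait_obj_ids, to_hashable([7, False]), "handle")
print("stored keys:", list(wait_obj_ids))
```

Whether a tuple key is appropriate depends on how `waitObjIds` is looked up later in the replay, which this sketch does not cover.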
Are you able to share the trace file, or at least one node of type "record_param_comms"? Thanks.
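If sharing the whole trace is not practical, a single node can be pulled out with a few lines of Python. This is a rough sketch that assumes the trace is a JSON document with a top-level `nodes` list whose entries carry a `name` field; `find_nodes_by_name` and the sample data are hypothetical, so adjust the keys if your trace is laid out differently.

```python
import json


def find_nodes_by_name(trace, name):
    """Return all nodes whose 'name' field equals the given name."""
    return [node for node in trace.get("nodes", []) if node.get("name") == name]


# Made-up stand-in for a loaded trace; a real one would come from json.load().
sample_trace = {
    "schema": "1.1.1-chakra.0.0.4",
    "nodes": [
        {"id": 1, "name": "aten::add"},
        {"id": 2, "name": "record_param_comms"},
    ],
}

matches = find_nodes_by_name(sample_trace, "record_param_comms")

# Print the first matching node so it can be pasted into the issue.
if matches:
    print(json.dumps(matches[0], indent=2))
```

For a real trace, replace `sample_trace` with the result of `json.load()` on one rank's trace file.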