Copyright (c) 2004-2010 The Trustees of Indiana University and Indiana
University Research and Technology
Corporation. All rights reserved.
Copyright (c) 2004-2006 The University of Tennessee and The University
of Tennessee Research Foundation. All rights
reserved.
Copyright (c) 2004-2008 High Performance Computing Center Stuttgart,
University of Stuttgart. All rights reserved.
Copyright (c) 2004-2006 The Regents of the University of California.
All rights reserved.
Copyright (c) 2006-2021 Cisco Systems, Inc. All rights reserved.
Copyright (c) 2006 Voltaire, Inc. All rights reserved.
Copyright (c) 2006 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
Copyright (c) 2006-2017 Los Alamos National Security, LLC. All rights
reserved.
Copyright (c) 2010-2021 IBM Corporation. All rights reserved.
Copyright (c) 2012 Oak Ridge National Labs. All rights reserved.
Copyright (c) 2012 Sandia National Laboratories. All rights reserved.
Copyright (c) 2012 University of Houston. All rights reserved.
Copyright (c) 2013-2021 NVIDIA Corporation. All rights reserved.
Copyright (c) 2013-2018 Intel, Inc. All rights reserved.
Copyright (c) 2018-2021 Amazon.com, Inc. or its affiliates. All Rights
reserved.
$COPYRIGHT$
Additional copyrights may follow
$HEADER$
===========================================================================
This file contains the main features as well as overviews of specific
bug fixes (and other actions) for each version of Open MPI since
version 1.0.
As more fully described in the "Software Version Number" section in
the README file, Open MPI typically releases two separate version
series simultaneously. Since these series have different goals and
are semi-independent of each other, a single NEWS-worthy item may be
introduced into different series at different times. For example,
feature F was introduced in the vA.B series at version vA.B.C, and was
later introduced into the vX.Y series at vX.Y.Z.
The first time feature F is released, the item will be listed in the
vA.B.C section, denoted as:
(** also to appear: X.Y.Z) -- indicating that this item is also
likely to be included in future release
version vX.Y.Z.
When vX.Y.Z is later released, the same NEWS-worthy item will also be
included in the vX.Y.Z section and be denoted as:
(** also appeared: A.B.C) -- indicating that this item was previously
included in release version vA.B.C.
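The two cross-series markers described above follow a regular pattern, so they can be picked out mechanically. The following is a minimal sketch, not part of Open MPI; the function name, the sample item text, and the version numbers in it are hypothetical and chosen only for illustration.

```python
import re

# Matches the two marker forms described above, for example:
#   (** also to appear: X.Y.Z)
#   (** also appeared: A.B.C)
MARKER_RE = re.compile(r"\(\*\* also (to appear|appeared): ([^)\s]+)\)")

# Map the marker wording to a short label (labels are an assumption of this sketch).
KINDS = {"to appear": "future", "appeared": "previous"}


def find_series_markers(text):
    """Return a list of (kind, version) tuples for every marker in text."""
    return [(KINDS[m.group(1)], m.group(2)) for m in MARKER_RE.finditer(text)]


# Hypothetical NEWS item; the version number is made up for the example.
item = "- Fix a hang in feature F.\n  (** also to appear: 1.8.2)"
print(find_series_markers(item))
```

An item without either marker yields an empty list, so the helper can be run over every item in a release section.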
Master (not on release branches yet)
------------------------------------
**********************************************************************
* PRE-DEPRECATION WARNING: MPIR Support
*
* As was announced in summer 2017, Open MPI is deprecating support for
* MPIR-based tools beginning with the future release of OMPI v5.0, with
* full removal of that support tentatively planned for OMPI v6.0.
*
* This serves as a pre-deprecation warning to provide tools developers
* with sufficient time to migrate to PMIx. Support for PMIx-based
* tools will be rolled out during the OMPI v4.x series. No runtime
* deprecation warnings will be output during this time.
*
* Runtime deprecation warnings will be output beginning with the OMPI v5.0
* release whenever MPIR-based tools connect to Open MPI's mpirun/mpiexec
* launcher.
**********************************************************************
v5.0.0rc2 -- October, 2021
--------------------------
- ORTE, the underlying OMPI launcher, has been removed and replaced
with PRRTE.
- Reworked how Open MPI integrates with 3rd-party packages.
The decision was made to stop building 3rd-party packages
such as Libevent, HWLOC, PMIx, and PRRTE as MCA components
and instead 1) rely on external libraries whenever possible and
2) build the 3rd-party libraries (if needed) as independent
libraries, rather than linking them into libopen-pal.
- Update to use PMIx v4.1.1rc2
- Update to use PRRTE v2.0.1rc2
- Change the default component build behavior to prefer building
components as part of libmpi.so instead of individual DSOs.
- Remove pml/yalla, mxm, mtl/psm, and ikrit components.
- Remove all vestiges of the C/R support.
- Various ROMIO v3.4.1 updates.
- Use Pandoc to generate manpages
- 32 bit atomics are now only supported via C11 compliant compilers.
- Explicitly disable support for GNU gcc < v4.8.1 (note: the default
gcc compiler that is included in RHEL 7 is v4.8.5).
- Do not build Open SHMEM layer when there are no SPMLs available.
Currently, this means the Open SHMEM layer will only build if
the UCX library is found.
- Fix rank-by algorithms to properly rank by object and span.
- Updated the "-mca pml" option to only accept one pml, not a list.
- vprotocol/pessimist: Updated to support MPI_THREAD_MULTIPLE.
- btl/tcp: Updated to use reachability and graph solving for global
interface matching. This has been shown to improve MPI_Init()
performance under btl/tcp.
- fs/ime: Fixed compilation errors due to missing header inclusion
Thanks to Sylvain Didelot <[email protected]> for finding
and fixing this issue.
- Fixed bug where MPI_Init_thread can give wrong error messages by
delaying error reporting until all infrastructure is running.
- Atomics support removed: S390/s390x, Sparc v9, ARMv4 and ARMv5 CMA
support.
- autogen.pl now supports a "-j" option to run multi-threaded.
Users can also use environment variable "AUTOMAKE_JOBS".
- PMI support has been removed for Open MPI apps.
- Legacy btl/sm has been removed, and replaced with btl/vader, which
was renamed to "btl/sm".
- Update btl/sm to not use CMA in user namespaces.
- C++ bindings have been removed.
- The "--am" and "--amca" options have been deprecated.
- opal/mca/threads framework added. Currently supports
argobots, qthreads, and pthreads. See the --with-threads=x option
in configure.
- Various README.md fixes - thanks to:
Yixin Zhang <[email protected]>,
Samuel Cho <[email protected]>,
rlangefe <[email protected]>,
Alex Ross <[email protected]>,
Sophia Fang <[email protected]>,
mitchelltopaloglu <[email protected]>,
Evstrife <[email protected]>, and
Hao Tong <[email protected]> for their
contributions.
- osc/pt2pt: Removed. Users can use osc/rdma + btl/tcp
for OSC support using TCP, or other providers.
- Open MPI now links -levent_core instead of -levent.
- MPI-4: Added ERRORS_ABORT infrastructure.
- common/cuda docs: Various fixes. Thanks to
Simon Byrne <[email protected]> for finding and fixing.
- osc/ucx: Add support for acc_single_intrinsic.
- Fixed buildrpm.sh "-r" option used for RPM options specification.
Thanks to John K. McIver III <[email protected]> for
reporting and fixing.
- configure: Added support for setting the wrapper C compiler.
Adds new option "--with-wrapper-cc=".
- mpi_f08: Fixed Fortran-8-byte-INTEGER vs. C-4-byte-int issue.
Thanks to @ahaichen for reporting the bug.
- MPI-4: Added support for 'initial error handler'.
- opal/thread/tsd: Added thread-specific-data (tsd) api.
- MPI-4: Added error handling for 'unbound' errors to MPI_COMM_SELF.
- Add missing MPI_Status conversion subroutines:
MPI_Status_c2f08(), MPI_Status_f082c(), MPI_Status_f082f(),
MPI_Status_f2f08() and the PMPI_* related subroutines.
- patcher: Removed the Linux component.
- opal/util: Fixed typo in error string. Thanks to
NARIBAYASHI Akira <[email protected]> for finding
and fixing the bug.
- fortran/use-mpi-f08: Generate PMPI bindings from the MPI bindings.
- Converted man pages to markdown.
Thanks to Fangcong Yin <[email protected]> for their contribution
to this effort.
- Fixed ompi_proc_world error string and some comments in pml/ob1.
Thanks to Julien EMMANUEL <[email protected]> for
finding and fixing these issues.
- oshmem/tools/oshmem_info: Fixed Fortran keyword issue when
compiling param.c. Thanks to Pak Lui <[email protected]> for
finding and fixing the bug.
- autogen.pl: Patched libtool.m4 for OSX Big Sur. Thanks to
@fxcoudert for reporting the issue.
- Upgraded to HWLOC v2.4.0.
- Removed config/opal_check_pmi.m4.
Thanks to Zach Osman <[email protected]> for the contribution.
- opal/atomics: Added load-linked, store-conditional atomics for
AArch64.
- Fixed envvar names to OMPI_MCA_orte_precondition_transports.
Thanks to Marisa Roman <[email protected]>
for the contribution.
- fcoll/two_phase: Removed the component. All scenarios it was
used for have been replaced.
- btl/uct: Bumped UCX allowed version to v1.9.x.
- ULFM Fault Tolerance has been added. See README.FT.ULFM.md.
- Fixed a crash during CUDA initialization.
Thanks to Yaz Saito <[email protected]> for finding
and fixing the bug.
- Added CUDA support to the OFI MTL.
- ompio: Added atomicity support.
- Singleton comm spawn support has been fixed.
- Autoconf v2.7 support has been updated.
- fortran: Added check for ISO_FORTRAN_ENV:REAL16. Thanks to
Jeff Hammond <[email protected]> for reporting this issue.
- Changed the MCA component build style default to static.
- PowerPC atomics: Force usage of opal/ppc assembly.
- Removed C++ compiler requirement to build Open MPI.
- Fixed .la files leaking into wrapper compilers.
- Fixed bug where the cache line size was not set soon enough in
MPI_Init().
- coll/ucc and scoll/ucc components were added.
- coll/ucc: Added support for allgather and reduce collective
operations.
- autogen.pl: Fixed bug where it would not ignore all
excluded components.
- Various datatype bugfixes and performance improvements
- Various pack/unpack bugfixes and performance improvements
- Fix mmap infinite recursion in memory patcher
- Fix C to Fortran error code conversions.
- osc/ucx: Fix data corruption with non-contiguous accumulates
- Update coll/tuned selection rules
- Fix non-blocking collective ops
- btl/portals4: Fix flow control
- Various oshmem:ucx bugfixes and performance improvements
- common/ofi: Disable new monitor API until libfabric 1.14.0
- Fix AVX detection with icc
- mpirun option "--mca ompi_display_comm mpi_init/mpi_finalize"
has been added. Enables a communication protocol report:
when MPI_Init is invoked (using the 'mpi_init' value) and/or
when MPI_Finalize is invoked (using the 'mpi_finalize' value).
- New algorithm for Allgather and Allgatherv added, based on the
paper "Sparbit: a new logarithmic-cost and data locality-aware MPI
Allgather algorithm". Default algorithm selection rules are
unchanged; to use these algorithms, add:
"--mca coll_tuned_allgather_algorithm sparbit" and/or
"--mca coll_tuned_allgatherv_algorithm sparbit"
Thanks to: Wilton Jaciel Loch <wiltonloch [email protected]>,
and Guilherme Koslovski for their contribution.
- MPI-4: Persistent collectives have been moved to the MPI
namespace from MPIX.
- OFI: Delay patcher initialization until needed. It will now
be initialized only after the component is officially selected.
- MPI-4: Make MPI_Comm_get_info, MPI_File_get_info, and
MPI_Win_get_info compliant with the standard.
- Portable_platform file has been updated from GASNet.
- GCC versions < 4.8.1 are no longer supported.
- coll: Fix a bug with the libnbc MPI_Allreduce ring algorithm
when using MPI_IN_PLACE.
- Updated the usage of .gitmodules to use relative paths instead of
absolute paths. This allows the submodule cloning to use the same
protocol as OMPI cloning. Thanks to Felix Uhl
<[email protected]> for the contribution.
- osc/rdma: Add local leader pid in shm file name to make it unique.
- ofi: Fix memory handler unregistration. This change fixes a
segfault during shutdown if the common/ofi component was built
as a dynamic object.
- osc/rdma: Add support for MPI minimum alignment key.
- memory_patcher: Add ability to detect patched memory. Thanks
to Rich Welch <[email protected]> for the contribution.
- build: Improve handling of compiler version string. This
fixes a compiler error with clang and armclang.
- Fix bug where the relocation of OMPI packages caused
the launch to fail.
- Various improvements to MPI_Alltoall algorithms for both
performance and memory usage.
- coll/basic: Fix segmentation fault in MPI_Alltoallw with
MPI_IN_PLACE.
NOTE: This patch either caused or exposed a regression
in MPI_Alltoallv() using MPI_IN_PLACE. See GitHub issue #9501.
This will be fixed prior to v5.0.0 release.
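Several items in this section are enabled through "--mca NAME VALUE" pairs on the mpirun command line (the sparbit Allgather/Allgatherv algorithms and the ompi_display_comm report). The following is a minimal sketch of assembling such a command line; the helper name, the "./my_app" binary, and the process count are hypothetical, and the sketch only builds the argument list without launching anything.

```python
import shlex


def mpirun_argv(app, nprocs, mca_params):
    """Build an mpirun argument list with one "--mca NAME VALUE" pair per parameter."""
    argv = ["mpirun", "-np", str(nprocs)]
    for name, value in mca_params.items():
        argv += ["--mca", name, str(value)]
    argv.append(app)
    return argv


argv = mpirun_argv(
    "./my_app",  # hypothetical application binary
    4,           # hypothetical number of processes
    {
        # Parameter names and values as quoted in the items above.
        "coll_tuned_allgather_algorithm": "sparbit",
        "coll_tuned_allgatherv_algorithm": "sparbit",
        "ompi_display_comm": "mpi_init",
    },
)
print(shlex.join(argv))
```

The resulting list can be passed to subprocess.run() on a system where Open MPI is installed.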
4.1.1 -- April, 2021
--------------------
- Fix a number of datatype issues, including an issue with
improper handling of partial datatypes that could lead to
an unexpected application failure.
- Change UCX PML to not warn about MPI_Request leaks during
MPI_FINALIZE by default. The old behavior can be restored with
the mca_pml_ucx_request_leak_check MCA parameter.
- Reverted temporary solution that worked around launch issues in
SLURM v20.11.{0,1,2}. SchedMD encourages users to avoid these
versions and to upgrade to v20.11.3 or newer.
- Updated PMIx to v3.2.2.
- Fixed configuration issue on Apple Silicon observed with
Homebrew. Thanks to François-Xavier Coudert for reporting the issue.
- Disabled gcc built-in atomics by default on aarch64 platforms.
- Disabled UCX PML when UCX v1.8.0 is detected. UCX version 1.8.0 has a bug that
may cause data corruption when its TCP transport is used in conjunction with
the shared memory transport. UCX versions prior to v1.8.0 are not affected by
this issue. Thanks to @ksiazekm for reporting the issue.
- Fixed detection of available UCX transports/devices to better inform PML
prioritization.
- Fixed SLURM support to mark ORTE daemons as non-MPI tasks.
- Improved AVX detection to more accurately detect supported
platforms. Also improved the generated AVX code, and switched to
using word-based MCA params for the op/avx component (vs. numeric
big flags).
- Improved OFI compatibility support and fixed memory leaks in error
handling paths.
- Improved HAN collectives with support for Barrier and Scatter. Thanks
to @EmmanuelBRELLE for these changes and the relevant bug fixes.
- Fixed MPI debugger support (i.e., the MPIR_Breakpoint() symbol).
Thanks to @louisespellacy-arm for reporting the issue.
- Fixed ORTE bug that prevented debuggers from reading MPIR_Proctable.
- Removed PML uniformity check from the UCX PML to address performance
regression.
- Fixed MPI_Init_thread(3) statement about C++ binding and updated
references to MPI_THREAD_MULTIPLE. Thanks to Andreas Lösel for
bringing the outdated docs to our attention.
- Added fence_nb to Flux PMIx support to address segmentation faults.
- Ensured progress of AIO requests in the POSIX FBTL component to
prevent exceeding maximum number of pending requests on MacOS.
- Used OPAL's multi-thread support in the orted to leverage atomic
operations for object refcounting.
- Fixed segv when launching with static TCP ports.
- Fixed --debug-daemons mpirun CLI option.
- Fixed bug where mpirun did not honor --host in a managed job
allocation.
- Made a managed allocation filter a hostfile/hostlist.
- Fixed bug to mark a generalized request as pending once initiated.
- Fixed external PMIx v4.x check.
- Fixed OSHMEM build with `--enable-mem-debug`.
- Fixed a performance regression observed with older versions of GCC when
__ATOMIC_SEQ_CST is used. Thanks to @BiplabRaut for reporting the issue.
- Fixed buffer allocation bug in the binomial tree scatter algorithm when
non-contiguous datatypes are used. Thanks to @sadcat11 for reporting the issue.
- Fixed bugs related to the accumulate and atomics functionality in the
osc/rdma component.
- Fixed race condition in MPI group operations observed with
MPI_THREAD_MULTIPLE threading level.
- Fixed a deadlock in the TCP BTL's connection matching logic.
- Fixed pml/ob1 compilation error when CUDA support is enabled.
- Fixed a build issue with Lustre caused by unnecessary header includes.
- Fixed a build issue with IBM LSF workload manager.
- Fixed linker error with UCX SPML.
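Two items in this section single out specific third-party versions: SLURM v20.11.{0,1,2} (upgrade to v20.11.3 or newer is encouraged) and UCX v1.8.0 (the UCX PML is disabled when it is detected). The following is a minimal sketch of how a site script might flag those versions. It is not Open MPI's detection logic; the function names are hypothetical, and treating a two-part version such as "1.8" as "1.8.0" is an assumption of this sketch.

```python
def parse_version(text):
    """Turn "v20.11.3" or "1.8" into a 3-tuple of ints, padding with zeros."""
    parts = [int(p) for p in text.lstrip("v").split(".")]
    parts += [0] * (3 - len(parts))  # assumption: "1.8" means "1.8.0"
    return tuple(parts[:3])


# Versions named in the items above.
SLURM_AFFECTED = {(20, 11, 0), (20, 11, 1), (20, 11, 2)}
UCX_AFFECTED = (1, 8, 0)


def slurm_should_upgrade(version):
    """True for the SLURM versions that users are encouraged to avoid."""
    return parse_version(version) in SLURM_AFFECTED


def ucx_pml_would_be_disabled(version):
    """True only for UCX v1.8.0, the version the item names."""
    return parse_version(version) == UCX_AFFECTED


print(slurm_should_upgrade("v20.11.2"), slurm_should_upgrade("20.11.3"))
print(ucx_pml_would_be_disabled("1.8.0"), ucx_pml_would_be_disabled("1.7.0"))
```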
4.1.0 -- December, 2020
-----------------------
- collectives: Add HAN and ADAPT adaptive collectives components.
Both components are off by default and can be enabled by specifying
"mpirun --mca coll_adapt_priority 100 --mca coll_han_priority 100 ...".
We intend to enable both by default in Open MPI 5.0.
- OMPIO is now the default for MPI-IO on all filesystems, including
Lustre (prior to this, ROMIO was the default for Lustre). Many
thanks to Mark Dixon for identifying MPI I/O issues and providing
access to Lustre systems for testing.
- Updates for macOS Big Sur. Thanks to FX Coudert for reporting this
issue and pointing to a solution.
- Minor MPI one-sided RDMA performance improvements.
- Fix hcoll MPI_SCATTERV with MPI_IN_PLACE.
- Add AVX support for MPI collectives.
- Updates to mpirun(1) about "slots" and PE=x values.
- Fix buffer allocation for large environment variables. Thanks to
@zrss for reporting the issue.
- Upgrade the embedded OpenPMIx to v3.2.2.
- Take more steps towards creating fully Reproducible builds (see
https://reproducible-builds.org/). Thanks Bernhard M. Wiedemann for
bringing this to our attention.
- Fix issue with extra-long values in MCA files. Thanks to GitHub
user @zrss for bringing the issue to our attention.
- UCX: Fix zero-sized datatype transfers.
- Fix --cpu-list for non-uniform modes.
- Fix issue in PMIx callback caused by missing memory barrier on Arm platforms.
- OFI MTL: Various bug fixes.
- Fixed issue where MPI_TYPE_CREATE_RESIZED would create a datatype
with unexpected extent on oddly-aligned datatypes.
- collectives: Adjust default tuning thresholds for many collective
algorithms
- runtime: fix situation where rank-by argument does not work
- Portals4: Clean up error handling corner cases
- runtime: Remove --enable-install-libpmix option, which has not
worked since it was added
- opal: Disable memory patcher component on MacOS
- UCX: Allow UCX 1.8 to be used with the btl uct
- UCX: Replace usage of the deprecated NB API of UCX with NBX
- OMPIO: Add support for the IME file system
- OFI/libfabric: Added support for multiple NICs
- OFI/libfabric: Added support for Scalable Endpoints
- OFI/libfabric: Added btl for one-sided support
- OFI/libfabric: Multiple small bugfixes
- libnbc: Adding numerous performance-improving algorithms
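The HAN and ADAPT item above gives the mpirun command-line form for raising both component priorities. Open MPI also reads MCA parameters from environment variables named with an "OMPI_MCA_" prefix, so the same settings can be expressed as environment entries. The following is a minimal sketch of that alternative; the helper name is hypothetical, and the sketch only builds the environment mapping without launching a job.

```python
import os


def mca_env(params, base_env=None):
    """Return a copy of the environment with OMPI_MCA_<name>=<value> entries added."""
    env = dict(os.environ if base_env is None else base_env)
    for name, value in params.items():
        env["OMPI_MCA_" + name] = str(value)
    return env


# Priorities taken from the HAN/ADAPT item above; an empty base keeps the output short.
env = mca_env({"coll_adapt_priority": 100, "coll_han_priority": 100}, base_env={})
for key in sorted(env):
    print(f"{key}={env[key]}")
```

Passing the returned mapping as the env argument of subprocess.run() would apply the settings to a launched job without changing the calling process.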
4.0.6 -- March, 2021
--------------------
- Update embedded PMIx to 3.2.2. This update addresses several
MPI_COMM_SPAWN problems.
- Fix a problem when using Flux PMI and UCX. Thanks to Sami Ilvonen
for reporting and supplying a fix.
- Fix a problem with MPIR breakpoint being compiled out using PGI
compilers. Thanks to @louisespellacy-arm for reporting.
- Fix some ROMIO issues when using Lustre. Thanks to Mark Dixon for
reporting.
- Fix a problem using an external PMIx 4 to build Open MPI 4.0.x.
- Fix a compile problem when using the enable-timing configure option
and UCX. Thanks to Jan Bierbaum for reporting.
- Fix a symbol name collision when using the Cray compiler to build
Open SHMEM. Thanks to Pak Lui for reporting and fixing.
- Correct an issue encountered when building Open MPI under OSX Big Sur.
Thanks to FX Coudert for reporting.
- Various fixes to the OFI MTL.
- Fix an issue with allocation of sufficient memory for parsing long
environment variable values. Thanks to @zrss for reporting.
- Improve reproducibility of builds to assist Open MPI packagers.
Thanks to Bernhard M. Wiedemann for bringing this to our attention.
4.0.5 -- August, 2020
---------------------
- Fix a problem with MPI RMA compare and swap operations. Thanks
to Wojciech Chlapek for reporting.
- Disable binding of MPI processes to system resources by Open MPI
if an application is launched using SLURM's srun command.
- Disable building of the Fortran mpi_f08 module when configuring
Open MPI with default 8 byte Fortran integer size. Thanks to
@ahcien for reporting.
- Fix a problem with mpirun when the --map-by option is used.
Thanks to Wenbin Lyu for reporting.
- Fix some issues with MPI one-sided operations uncovered using Global
Arrays regression test-suite. Thanks to @bjpalmer for reporting.
- Fix a problem with make check when using the PGI compiler. Thanks to
Carl Ponder for reporting.
- Fix a problem with MPI_FILE_READ_AT_ALL that could lead to application
hangs under certain circumstances. Thanks to Scot Breitenfeld for
reporting.
- Fix a problem building C++ applications with newer versions of GCC.
Thanks to Constantine Khrulev for reporting.
4.0.4 -- June, 2020
-------------------
- Fix a memory patcher issue intercepting shmat and shmdt. This was
observed on RHEL 8.x ppc64le (see README for more info).
- Fix an illegal access issue caught using gcc's address sanitizer.
Thanks to Georg Geiser for reporting.
- Add checks to avoid conflicts with a libevent library shipped with LSF.
- Switch to linking against libevent_core rather than libevent, if present.
- Add improved support for UCX 1.9 and later.
- Fix an ABI compatibility issue with the Fortran 2008 bindings.
Thanks to Alastair McKinstry for reporting.
- Fix an issue with rpath of /usr/lib64 when building OMPI on
systems with Lustre. Thanks to David Shrader for reporting.
- Fix a memory leak occurring with certain MPI RMA operations.
- Fix an issue with ORTE's mapping of MPI processes to resources.
Thanks to Alex Margolin for reporting and providing a fix.
- Correct a problem with incorrect error codes being returned
by OMPI MPI_T functions.
- Fix an issue with debugger tools not being able to attach
to mpirun more than once. Thanks to Gregory Lee for reporting.
- Fix an issue with the Fortran compiler wrappers when using
NAG compilers. Thanks to Peter Brady for reporting.
- Fix an issue with the ORTE ssh based process launcher at scale.
Thanks to Benjamín Hernández for reporting.
- Address an issue when using shared MPI I/O operations. OMPIO will
now successfully return from the file open statement but will
raise an error if the file system does not support shared I/O
operations. Thanks to Romain Hild for reporting.
- Fix an issue with MPI_WIN_DETACH. Thanks to Thomas Naughton for reporting.
4.0.3 -- March, 2020
--------------------
- Update embedded PMIx to 3.1.5
- Add support for Mellanox ConnectX-6.
- Fix an issue in OpenMPI IO when using shared file pointers.
Thanks to Romain Hild for reporting.
- Fix a problem with Open MPI using a previously installed
Fortran mpi module during compilation. Thanks to Marcin
Mielniczuk for reporting.
- Fix a problem with Fortran compiler wrappers ignoring use of
disable-wrapper-runpath configure option. Thanks to David
Shrader for reporting.
- Fixed an issue with trying to use mpirun on systems where neither
ssh nor rsh is installed.
- Address some problems found when using XPMEM for intra-node message
transport.
- Improve dimensions returned by MPI_Dims_create for certain
cases. Thanks to @aw32 for reporting.
- Fix an issue when sending messages larger than 4GB. Thanks to
Philip Salzmann for reporting this issue.
- Add ability to specify alternative module file path using
Open MPI's RPM spec file. Thanks to @jschwartz-cray for reporting.
- Clarify use of --with-hwloc configuration option in the README.
Thanks to Marcin Mielniczuk for raising this documentation issue.
- Fix an issue with shmem_atomic_set. Thanks to Sameh Sharkawi for reporting.
- Fix a problem with MPI_Neighbor_alltoall(v,w) for cartesian communicators
with cyclic boundary conditions. Thanks to Rolf Rabenseifner and
Tony Skjellum for reporting.
- Fix an issue using Open MPIO on 32 bit systems. Thanks to
Orion Poplawski for reporting.
- Fix an issue with NetCDF test deadlocking when using the vulcan
Open MPIO component. Thanks to Orion Poplawski for reporting.
- Fix an issue with the mpi_yield_when_idle parameter being ignored
when set in the Open MPI MCA parameter configuration file.
Thanks to @iassiour for reporting.
- Address an issue with Open MPIO when writing/reading more than 2GB
in an operation. Thanks to Richard Warren for reporting.
4.0.2 -- September, 2019
------------------------
- Update embedded PMIx to 3.1.4
- Enhance Open MPI to detect when processes are running in
different name spaces on the same node, in which case the
vader CMA single copy mechanism is disabled. Thanks
to Adrian Reber for reporting and providing a fix.
- Fix an issue with ORTE job tree launch mechanism. Thanks
to @lanyangyang for reporting.
- Fix an issue with env processing when running as root.
Thanks to Simon Byrne for reporting and providing a fix.
- Fix Fortran MPI_FILE_GET_POSITION return code bug.
Thanks to Wei-Keng Liao for reporting.
- Fix user defined datatypes/ops leak in nonblocking base collective
component. Thanks to Andrey Maslennikov for verifying fix.
- Fixed shared memory not working with spawned processes.
Thanks to @rodarima for reporting.
- Fix data corruption of overlapping datatypes on sends.
Thanks to DKRZ for reporting.
- Fix segfault in oob_tcp component on close with active listeners.
Thanks to Orivej Desh for reporting and providing a fix.
- Fix divide by zero segfault in ompio.
Thanks to @haraldkl for reporting and providing a fix.
- Fix finalize of flux components.
Thanks to Stephen Herbein and Jim Garlick for providing a fix.
- Fix osc_rdma_acc_single_intrinsic regression.
Thanks to Joseph Schuchart for reporting and providing a fix.
- Fix hostnames with large integers.
Thanks to @perrynzhou for reporting and providing a fix.
- Fix deadlock in MPI_Fetch_and_op when using UCX.
Thanks to Joseph Schuchart for reporting.
- Fix the SLURM plm for mpirun-based launching.
Thanks to Jordon Hayes for reporting and providing a fix.
- Prevent grep failure in rpmbuild from aborting.
Thanks to Daniel Letai for reporting.
- Fix btl/vader finalize sequence.
Thanks to Daniel Vollmer for reporting.
- Fix pml/ob1 local handle sent during PUT control message.
Thanks to @EmmanuelBRELLE for reporting and providing a fix.
- Fix memory leak with persistent MPI sends and the ob1 "get" protocol.
Thanks to @s-kuberski for reporting.
- v4.0.x: mpi: mark MPI_COMBINER_{HVECTOR,HINDEXED,STRUCT}_INTEGER
removed unless configured with --enable-mpi1-compatibility
- Fix make-authors.pl when run in a git submodule.
Thanks to Michael Heinz for reporting and providing a fix.
- Fix deadlock with mpi_assert_allow_overtaking in MPI_Issend.
Thanks to Joseph Schuchart and George Bosilca for reporting.
- Add compilation flag to allow unwinding through files that are
present in the stack when attaching with MPIR.
Thanks to James A Clark for reporting and providing a fix.
Known issues:
- There is a known issue with the OFI libfabric and PSM2 MTLs when trying to send
very long (> 4 GBytes) messages. In this release, these MTLs will catch
this case and abort the transfer. A future release will provide a
better solution to this issue.
4.0.1 -- March, 2019
--------------------
- Update embedded PMIx to 3.1.2.
- Fix an issue with Vader (shared-memory) transport on OS-X. Thanks
to Daniel Vollmer for reporting.
- Fix a problem with the usNIC BTL Makefile. Thanks to George Marselis
for reporting.
- Fix an issue when using --enable-visibility configure option
and older versions of hwloc. Thanks to Ben Menadue for reporting
and providing a fix.
- Fix an issue with MPI_WIN_CREATE_DYNAMIC and MPI_GET from self.
Thanks to Bart Janssens for reporting.
- Fix an issue of excessive compiler warning messages from mpi.h
when using newer C++ compilers. Thanks to @Shadow-fax for
reporting.
- Fix a problem when building Open MPI using clang 5.0.
- Fix a problem with MPI_WIN_CREATE when using UCX. Thanks
to Adam Simpson for reporting.
- Fix a memory leak encountered for certain MPI datatype
destructor operations. Thanks to Axel Huebl for reporting.
- Fix several problems with MPI RMA accumulate operations.
Thanks to Jeff Hammond for reporting.
- Fix possible race condition in closing some file descriptors
during job launch using mpirun. Thanks to Jason Williams
for reporting and providing a fix.
- Fix a problem in OMPIO for large individual write operations.
Thanks to Axel Huebl for reporting.
- Fix a problem with parsing of map-by ppr options to mpirun.
Thanks to David Rich for reporting.
- Fix a problem observed when using the mpool hugepage component. Thanks
to Hunter Easterday for reporting and fixing.
- Fix valgrind warning generated when invoking certain MPI Fortran
data type creation functions. Thanks to @rtoijala for reporting.
- Fix a problem when trying to build with a PMIx 3.1 or newer
release. Thanks to Alastair McKinstry for reporting.
- Fix a problem encountered with building MPI F08 module files.
Thanks to Igor Andriyash and Axel Huebl for reporting.
- Fix two memory leaks encountered for certain MPI-RMA usage patterns.
Thanks to Joseph Schuchart for reporting and fixing.
- Fix a problem with the ORTE rmaps_base_oversubscribe MCA parameter.
Thanks to @iassiour for reporting.
- Fix a problem with UCX PML default error handler for MPI communicators.
Thanks to Marcin Krotkiewski for reporting.
- Fix various issues with OMPIO uncovered by the testmpio test suite.
4.0.0 -- September, 2018
------------------------
- OSHMEM updated to the OpenSHMEM 1.4 API.
- Do not build OpenSHMEM layer when there are no SPMLs available.
Currently, this means the OpenSHMEM layer will only build if
a MXM or UCX library is found.
- A UCX BTL was added for enhanced MPI RMA support using UCX.
- With this release, OpenIB BTL now only supports iWarp and RoCE by default.
- Updated internal HWLOC to 2.0.2
- Updated internal PMIx to 3.0.2
- Change the priority for selecting external versus internal HWLOC
and PMIx packages to build. Starting with this release, configure
by default selects available external HWLOC and PMIx packages over
the internal ones.
- Updated internal ROMIO to 3.2.1.
- Removed support for the MXM MTL.
- Removed support for SCIF.
- Improved CUDA support when using UCX.
- Enable use of CUDA allocated buffers for OMPIO.
- Improved support for two phase MPI I/O operations when using OMPIO.
- Added support for Software-based Performance Counters, see
https://github.com/davideberius/ompi/wiki/How-to-Use-Software-Based-Performance-Counters-(SPCs)-in-Open-MPI
- Change MTL OFI from opting-IN on "psm,psm2,gni" to opting-OUT on
"shm,sockets,tcp,udp,rstream"
- Various improvements to MPI RMA performance when using RDMA
capable interconnects.
- Update memkind component to use the memkind 1.6 public API.
- Fix a problem with javadoc builds using OpenJDK 11. Thanks to
Siegmar Gross for reporting.
- Fix a memory leak using UCX. Thanks to Charles Taylor for reporting.
- Fix hangs in MPI_FINALIZE when using UCX.
- Fix a problem with building Open MPI using an external PMIx 2.1.2
library. Thanks to Marcin Krotkiewski for reporting.
- Fix race conditions in Vader (shared memory) transport.
- Fix problems with use of newer map-by mpirun options. Thanks to
Tony Reina for reporting.
- Fix rank-by algorithms to properly rank by object and span.
- Allow for running as root if two environment variables are set.
Requested by Axel Huebl.
- Fix a problem with building the Java bindings when using Java 10.
Thanks to Bryce Glover for reporting.
- Fix a problem with ORTE not reporting error messages if an application
terminated normally but exited with non-zero error code. Thanks to
Emre Brookes for reporting.
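
The OFI MTL change from opting in to opting out, listed above, can be
illustrated with a minimal sketch. This is not Open MPI's selection code
(which is written in C); the function names and the sample provider list
are hypothetical, and only the two name lists come from the entry above.

```python
# Hypothetical illustration only; not taken from the Open MPI sources.
OPT_IN = {"psm", "psm2", "gni"}                         # old allow-list
OPT_OUT = {"shm", "sockets", "tcp", "udp", "rstream"}   # new deny-list


def select_opt_in(providers):
    """Old behavior: keep only providers named on the allow-list."""
    return [p for p in providers if p in OPT_IN]


def select_opt_out(providers):
    """New behavior: keep every provider not named on the deny-list."""
    return [p for p in providers if p not in OPT_OUT]


if __name__ == "__main__":
    available = ["psm2", "verbs", "sockets", "tcp"]  # made-up sample list
    print(select_opt_in(available))
    print(select_opt_out(available))
```

With the opt-out rule, a provider that Open MPI has never heard of is
accepted unless it is explicitly excluded, which is the practical effect
of the change.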
3.1.6 -- March, 2020
--------------------
- Fix one-sided shared memory window configuration bug.
- Fix support for PGI'18 compiler.
- Fix issue with zero-length blockLength in MPI_TYPE_INDEXED.
- Fix run-time linker issues with OMPIO on newer Linux distros.
- Fix PMIx dstore locking compilation issue. Thanks to Marco Atzeri
for reporting the issue.
- Allow the user to override modulefile_path in the Open MPI SRPM,
even if install_in_opt is set to 1.
- Properly detect ConnectX-6 HCAs in the openib BTL.
- Fix segfault in the MTL/OFI initialization for large jobs.
- Ensure that the MPI one-sided lock is properly released when using
UCX transports, avoiding a deadlock.
- Fix potential deadlock when processing outstanding transfers with
uGNI transports.
- Fix various portals4 control flow bugs.
- Fix communications ordering for alltoall and Cartesian neighborhood
collectives.
- Fix an infinite recursion crash in the memory patcher on systems
with glibc v2.26 or later (e.g., Ubuntu 18.04) when using certain
OS-bypass interconnects.
3.1.5 -- November, 2019
-----------------------
- Fix OMPIO issue limiting file reads/writes to 2GB. Thanks to
Richard Warren for reporting the issue.
- At run time, automatically disable Linux cross-memory attach (CMA)
for vader BTL (shared memory) copies when running in user namespaces
(i.e., containers). Many thanks to Adrian Reber for raising the
issue and providing the fix.
- Sending very large MPI messages using the ofi MTL will fail with
some of the underlying Libfabric transports (e.g., PSM2 with
messages >=4GB, verbs with messages >=2GB). Prior versions of Open
MPI failed silently; this version of Open MPI invokes the
appropriate MPI error handler upon failure. See
https://github.com/open-mpi/ompi/issues/7058 for more details.
Thanks to Emmanuel Thomé for raising the issue.
- Fix case where 0-extent datatypes might be eliminated during
optimization. Thanks to Github user @tjahns for raising the issue.
- Ensure that the MPIR_Breakpoint symbol is not optimized out on
problematic platforms.
- Fix MPI one-sided 32 bit atomic support.
- Fix OMPIO offset calculations with SEEK_END and SEEK_CUR in
MPI_FILE_GET_POSITION. Thanks to Wei-keng Liao for raising the
issue.
- Add "naive" regx component that will never fail, no matter how
esoteric the hostnames are.
- Fix corner case for datatype extent computations. Thanks to David
Dickenson for raising the issue.
- Allow individual jobs to set their map/rank/bind policies when
running LSF. Thanks to Nick R. Papior for assistance in solving the
issue.
- Fix MPI buffered sends with the "cm" PML.
- Properly propagate errors to avoid deadlocks in MPI one-sided operations.
- Update to PMIx v2.2.3.
- Fix data corruption in non-contiguous MPI accumulates over UCX.
- Fix ssh-based tree-based spawning at scale. Many thanks to Github
user @zrss for the report and diagnosis.
- Fix the Open MPI RPM spec file to not abort when grep fails. Thanks
to Daniel Letai for bringing this to our attention.
- Handle new SLURM CLI options (SLURM 19 deprecated some options that
Open MPI was using). Thanks to Jordan Hayes for the report and the
initial fix.
- OMPIO: fix division by zero with an empty file view.
- Also handle shmat()/shmdt() memory patching with OS-bypass networks.
- Add support for unwinding info to all files that are present in the
stack starting from MPI_Init, which is helpful with parallel
debuggers. Thanks to James Clark for the report and initial fix.
- Fixed inadvertent use of bitwise operators in the MPI C++ bindings
header files. Thanks to Bert Wesarg for the report and the fix.
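
For the ofi MTL size limits noted above, one way an application might
stay below a transport's limit is to split a large buffer into several
smaller transfers. The following is a minimal sketch of the offset
arithmetic only; split_message and the two constants are hypothetical
names, and reading "4GB" and "2GB" as 2**32 and 2**31 bytes is an
assumption.

```python
# Assumption: the limits in the entry above are powers of two.
PSM2_LIMIT = 4 * 1024**3    # transfers of this size or larger may fail
VERBS_LIMIT = 2 * 1024**3


def split_message(nbytes, limit):
    """Return (offset, length) pairs covering nbytes, each length < limit."""
    if nbytes < 0 or limit < 2:
        raise ValueError("nbytes must be >= 0 and limit must be >= 2")
    max_chunk = limit - 1   # stay strictly below the failing size
    chunks = []
    offset = 0
    while offset < nbytes:
        length = min(max_chunk, nbytes - offset)
        chunks.append((offset, length))
        offset += length
    return chunks


if __name__ == "__main__":
    for offset, length in split_message(5 * 1024**3, PSM2_LIMIT):
        print(offset, length)
```

Each pair could then drive one send of `length` bytes starting at
`offset`; the receiver would need to post matching receives.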
3.1.4 -- April, 2019
--------------------
- Fix compile error when configured with --enable-mpi-java and
--with-devel-headers. Thanks to @g-raffy for reporting the issue
(** also appeared: v3.0.4).
- Only use hugepages with appropriate permissions. Thanks to Hunter
Easterday for the fix.
- Fix possible floating point rounding and division issues in OMPIO
which led to crashes and/or data corruption with very large data.
Thanks to Axel Huebl and René Widera for identifying the issue and
for supplying and testing the fix (** also appeared: v3.0.4).
- Use static_cast<> in mpi.h where appropriate. Thanks to @shadow-fx
for identifying the issue (** also appeared: v3.0.4).
- Fix RMA accumulate of non-predefined datatypes with predefined
operators. Thanks to Jeff Hammond for raising the issue (** also
appeared: v3.0.4).
- Fix race condition when closing open file descriptors when launching
MPI processes. Thanks to Jason Williams for identifying the issue and
supplying the fix (** also appeared: v3.0.4).
- Fix support for external PMIx v3.1.x.
- Fix Valgrind warnings for some MPI_TYPE_CREATE_* functions. Thanks
to Risto Toijala for identifying the issue and supplying the fix (**
also appeared: v3.0.4).
- Fix MPI_TYPE_CREATE_F90_{REAL,COMPLEX} for r=38 and r=308 (** also
appeared: v3.0.4).
- Fix assembly issues with old versions of gcc (<6.0.0) that affected
the stability of shared memory communications (e.g., with the vader
BTL) (** also appeared: v3.0.4).
- Fix MPI_Allreduce crashes with some cases in the coll/spacc module.
- Fix the OFI MTL handling of MPI_ANY_SOURCE (** also appeared:
v3.0.4).
- Fix noisy errors in the openib BTL with regard to
ibv_exp_query_device(). Thanks to Angel Beltre and others who
reported the issue (** also appeared: v3.0.4).
- Fix zero-size MPI one-sided windows with UCX.
3.1.3 -- October, 2018
----------------------
- Fix race condition in MPI_THREAD_MULTIPLE support of non-blocking
send/receive path.
- Fix error handling SIGCHLD forwarding.
- Add support for CHARACTER and LOGICAL Fortran datatypes for MPI_SIZEOF.
- Fix compile error when using OpenJDK 11 to compile the Java bindings.
- Fix crash when using a hostfile with a 'user@host' line.
- Numerous Fortran '08 interface fixes.
- TCP BTL error message fixes.
- OFI MTL now will use any provider other than shm, sockets, tcp, udp, or
rstream, rather than only supporting gni, psm, and psm2.
- Disable async receive of CUDA buffers by default, fixing a hang
on large transfers.
- Support the BCM57XXX and BCM58XXX Broadcom adapters.
- Fix minmax datatype support in ROMIO.
- Bug fixes in vader shared memory transport.
- Support very large buffers with MPI_TYPE_VECTOR.
- Fix hang when launching with mpirun on Cray systems.
3.1.2 -- August, 2018
---------------------
- A subtle race condition bug was discovered in the "vader" BTL
(shared memory communications) that, in rare instances, can cause
MPI processes to crash or incorrectly classify (or effectively drop)
an MPI message sent via shared memory. If you are using the "ob1"
PML with "vader" for shared memory communication (note that vader is
the default for shared memory communication with ob1), you need to
upgrade to v3.1.2 or later to fix this issue. You may also upgrade
to the following versions to fix this issue:
- Open MPI v2.1.5 (expected end of August, 2018) or later in the
v2.1.x series
- Open MPI v3.0.1 (released March, 2018) or later in the v3.0.x
series
- Assorted Portals 4.0 bug fixes.
- Fix for possible data corruption in MPI_BSEND.
- Move shared memory file for vader btl into /dev/shm on Linux.
- Fix for MPI_ISCATTER/MPI_ISCATTERV Fortran interfaces with MPI_IN_PLACE.
- Upgrade PMIx to v2.1.3.
- Numerous One-sided bug fixes.
- Fix for race condition in uGNI BTL.
- Improve handling of large number of interfaces with TCP BTL.
- Numerous UCX bug fixes.
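
The upgrade guidance for the "vader" race condition above can be
expressed as a small version check. This is a sketch under stated
assumptions: has_vader_fix is a hypothetical helper, version strings are
assumed to be plain "X.Y.Z" (no "rc" suffixes), and release series that
the entry does not mention are reported as unknown rather than guessed.

```python
# First release in each series that contains the fix, per the entry above.
FIRST_FIXED = {(2, 1): 5, (3, 0): 1, (3, 1): 2}


def has_vader_fix(version):
    """Return True or False for a listed series, None for any other series."""
    major, minor, patch = (int(part) for part in version.split(".")[:3])
    first_fixed_patch = FIRST_FIXED.get((major, minor))
    if first_fixed_patch is None:
        return None  # series not covered by the NEWS entry
    return patch >= first_fixed_patch


if __name__ == "__main__":
    for v in ("2.1.4", "3.0.1", "3.1.1", "3.1.2"):
        print(v, has_vader_fix(v))
```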
3.1.1 -- June, 2018
-------------------
- Fix potential hang in UCX PML during MPI_FINALIZE.
- Update internal PMIx to v2.1.2rc2 to fix forward version compatibility.
- Add new MCA parameter osc_sm_backing_store to allow users to specify
where in the filesystem the backing file for the shared memory
one-sided component should live. Defaults to /dev/shm on Linux.
- Fix potential hang on non-x86 platforms when using builds with
optimization flags turned off.
- Disable osc/pt2pt when using MPI_THREAD_MULTIPLE due to numerous
race conditions in the component.
- Fix dummy variable names for the mpi and mpi_f08 Fortran bindings to
match the MPI standard. This may break applications which use
name-based parameters in Fortran which used our internal names
rather than those documented in the MPI standard.
- Revamp Java detection to properly handle new Java versions which do
not provide a javah wrapper.
- Fix RMA function signatures for use-mpi-f08 bindings to have the
asynchronous property on all buffers.
- Improved configure logic for finding the UCX library.
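
MCA parameters such as osc_sm_backing_store can be set on the mpirun
command line with --mca, or through environment variables carrying the
OMPI_MCA_ prefix. The sketch below builds such an environment for a
child process; mca_env is a hypothetical helper and the directory
/tmp/osc_backing is only an example value.

```python
import os


def mca_env(params, base=None):
    """Return a copy of the environment with OMPI_MCA_<name>=<value> added."""
    env = dict(os.environ if base is None else base)
    for name, value in params.items():
        env["OMPI_MCA_" + name] = str(value)
    return env


if __name__ == "__main__":
    env = mca_env({"osc_sm_backing_store": "/tmp/osc_backing"})
    print(env["OMPI_MCA_osc_sm_backing_store"])
    # The resulting dict could be passed as env=... to subprocess.run()
    # when launching mpirun from a script.
```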
3.1.0 -- May, 2018
------------------
- Various OpenSHMEM bug fixes.
- Properly handle array_of_commands argument to Fortran version of
MPI_COMM_SPAWN_MULTIPLE.
- Fix bug with MODE_SEQUENTIAL and the sharedfp MPI-IO component.
- Use "javac -h" instead of "javah" when building the Java bindings
with a recent version of Java.
- Fix mis-handling of jobstepid under SLURM that could cause problems
with PathScale/OmniPath NICs.
- Disable the POWER 7/BE block in configure. Note that POWER 7/BE is
still not a supported platform, but it is no longer automatically
disabled. See
https://github.com/open-mpi/ompi/issues/4349#issuecomment-374970982
for more information.
- The output-filename option for mpirun is now converted to an
absolute path before being passed to other nodes.
- Add monitoring component for PML, OSC, and COLL to track data
movement of MPI applications. See
ompi/mca/common/monitoring/HowTo_pml_monitoring.tex for more
information about the monitoring framework.
- Add support for communicator assertions: mpi_assert_no_any_tag,
mpi_assert_no_any_source, mpi_assert_exact_length, and
mpi_assert_allow_overtaking.
- Update PMIx to version 2.1.1.
- Update hwloc to 1.11.7.
- Many one-sided behavior fixes.
- Improved performance for Reduce and Allreduce using Rabenseifner's algorithm.
- Revamped mpirun --help output to make it a bit more manageable.
- Portals4 MTL improvements: Fix race condition in rendezvous protocol and
retry logic.
- UCX OSC: initial implementation.
- UCX PML improvements: add multi-threading support.
- Yalla PML improvements: Fix error with irregular contiguous datatypes.
- Openib BTL: disable XRC support by default.
- TCP BTL: Add check to detect and ignore connections from processes
that aren't MPI (such as IDS probes) and verify that source and
destination are using the same version of Open MPI, fix issue with very
large message transfer.
- ompi_info parsable output now escapes double quotes in values, and
also quotes values that contain colons. Thanks to Lev Givon for the
suggestion.
- CUDA-aware support can now handle GPUs within a node that do not
support CUDA IPC. Earlier versions would encounter an error and abort.
- Add an MCA parameter ras_base_launch_orted_on_hn to allow for launching
MPI processes on the same node where mpirun is executing using a separate
orte daemon, rather than the mpirun process. This may be useful to set to
true when using SLURM, as it improves interoperability with SLURM's signal
propagation tools. By default it is set to false, except for Cray XC systems.
- Remove LoadLeveler RAS support.
- Remove IB XRC support from the OpenIB BTL due to lack of support.
- Add functionality for IBM s390 platforms. Note that regular
regression testing does not occur on the s390 and it is not
considered a supported platform.
- Remove support for big endian PowerPC.
- Remove support for XL compilers older than v13.1.
- Remove support for atomic operations using MacOS atomics library.
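
The ompi_info parsable-output change above (escape double quotes, and
quote values that contain colons) can be sketched as follows. This is
not the ompi_info implementation; format_parsable_value is a
hypothetical name, and the use of a backslash as the escape character is
an assumption.

```python
def format_parsable_value(value):
    """Escape double quotes; wrap the value in quotes if it contains a colon."""
    escaped = value.replace('"', '\\"')   # assumption: backslash escaping
    if ":" in value:
        # Colons separate fields in parsable output, so protect the value.
        return '"' + escaped + '"'
    return escaped


if __name__ == "__main__":
    for sample in ("plain", "a:b", 'say "hi"'):
        print(format_parsable_value(sample))
```

Quoting only when a colon is present keeps simple values unchanged, so
scripts that parse colon-free fields would see the same text as before.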
3.0.6 -- March, 2020
--------------------
- Fix one-sided shared memory window configuration bug.
- Fix support for PGI'18 compiler.
- Fix run-time linker issues with OMPIO on newer Linux distros.
- Allow the user to override modulefile_path in the Open MPI SRPM,
even if install_in_opt is set to 1.
- Properly detect ConnectX-6 HCAs in the openib BTL.
- Fix segfault in the MTL/OFI initialization for large jobs.
- Fix various portals4 control flow bugs.
- Fix communications ordering for alltoall and Cartesian neighborhood
collectives.
- Fix an infinite recursion crash in the memory patcher on systems
with glibc v2.26 or later (e.g., Ubuntu 18.04) when using certain
OS-bypass interconnects.
3.0.5 -- November, 2019
-----------------------
- Fix OMPIO issue limiting file reads/writes to 2GB. Thanks to
Richard Warren for reporting the issue.
- At run time, automatically disable Linux cross-memory attach (CMA)
for vader BTL (shared memory) copies when running in user namespaces
(i.e., containers). Many thanks to Adrian Reber for raising the
issue and providing the fix.
- Sending very large MPI messages using the ofi MTL will fail with
some of the underlying Libfabric transports (e.g., PSM2 with
messages >=4GB, verbs with messages >=2GB). Prior versions of Open
MPI failed silently; this version of Open MPI invokes the
appropriate MPI error handler upon failure. See
https://github.com/open-mpi/ompi/issues/7058 for more details.
Thanks to Emmanuel Thomé for raising the issue.
- Fix case where 0-extent datatypes might be eliminated during
optimization. Thanks to Github user @tjahns for raising the issue.
- Ensure that the MPIR_Breakpoint symbol is not optimized out on
problematic platforms.
- Fix OMPIO offset calculations with SEEK_END and SEEK_CUR in
MPI_FILE_GET_POSITION. Thanks to Wei-keng Liao for raising the
issue.
- Fix corner case for datatype extent computations. Thanks to David
Dickenson for raising the issue.
- Fix MPI buffered sends with the "cm" PML.
- Update to PMIx v2.2.3.
- Fix ssh-based tree-based spawning at scale. Many thanks to Github
user @zrss for the report and diagnosis.
- Fix the Open MPI RPM spec file to not abort when grep fails. Thanks
to Daniel Letai for bringing this to our attention.
- Handle new SLURM CLI options (SLURM 19 deprecated some options that
Open MPI was using). Thanks to Jordan Hayes for the report and the
initial fix.
- OMPIO: fix division by zero with an empty file view.
- Also handle shmat()/shmdt() memory patching with OS-bypass networks.
- Add support for unwinding info to all files that are present in the
stack starting from MPI_Init, which is helpful with parallel
debuggers. Thanks to James Clark for the report and initial fix.
- Fixed inadvertent use of bitwise operators in the MPI C++ bindings
header files. Thanks to Bert Wesarg for the report and the fix.
- Added configure option --disable-wrappers-runpath (alongside the
already-existing --disable-wrappers-rpath option) to prevent Open
MPI's configure script from automatically adding runpath CLI options
to the wrapper compilers.
3.0.4 -- April, 2019
--------------------
- Fix compile error when configured with --enable-mpi-java and
--with-devel-headers. Thanks to @g-raffy for reporting the issue.
- Fix possible floating point rounding and division issues in OMPIO
which led to crashes and/or data corruption with very large data.
Thanks to Axel Huebl and René Widera for identifying the issue and
for supplying and testing the fix.
- Use static_cast<> in mpi.h where appropriate. Thanks to @shadow-fx
for identifying the issue.
- Fix datatype issue with RMA accumulate. Thanks to Jeff Hammond for
raising the issue.
- Fix RMA accumulate of non-predefined datatypes with predefined
operators. Thanks to Jeff Hammond for raising the issue.
- Fix race condition when closing open file descriptors when launching
MPI processes. Thanks to Jason Williams for identifying the issue and
supplying the fix.
- Fix Valgrind warnings for some MPI_TYPE_CREATE_* functions. Thanks
to Risto Toijala for identifying the issue and supplying the fix.
- Fix MPI_TYPE_CREATE_F90_{REAL,COMPLEX} for r=38 and r=308.
- Fix assembly issues with old versions of gcc (<6.0.0) that affected
the stability of shared memory communications (e.g., with the vader
BTL).
- Fix the OFI MTL handling of MPI_ANY_SOURCE.
- Fix noisy errors in the openib BTL with regard to
ibv_exp_query_device(). Thanks to Angel Beltre and others who
reported the issue.
3.0.3 -- October, 2018
----------------------
- Fix race condition in MPI_THREAD_MULTIPLE support of non-blocking
send/receive path.
- Fix error handling SIGCHLD forwarding.
- Add support for CHARACTER and LOGICAL Fortran datatypes for MPI_SIZEOF.
- Fix compile error when using OpenJDK 11 to compile the Java bindings.
- Fix crash when using a hostfile with a 'user@host' line.
- Numerous Fortran '08 interface fixes.
- TCP BTL error message fixes.
- OFI MTL now will use any provider other than shm, sockets, tcp, udp, or
rstream, rather than only supporting gni, psm, and psm2.
- Disable async receive of CUDA buffers by default, fixing a hang
on large transfers.
- Support the BCM57XXX and BCM58XXX Broadcom adapters.
- Fix minmax datatype support in ROMIO.
- Bug fixes in vader shared memory transport.
- Support very large buffers with MPI_TYPE_VECTOR.
- Fix hang when launching with mpirun on Cray systems.
- Bug fixes in OFI MTL.
- Assorted Portals 4.0 bug fixes.
- Fix for possible data corruption in MPI_BSEND.
- Move shared memory file for vader btl into /dev/shm on Linux.
- Fix for MPI_ISCATTER/MPI_ISCATTERV Fortran interfaces with MPI_IN_PLACE.
- Upgrade PMIx to v2.1.4.
- Fix for Power9 built-in atomics.
- Numerous One-sided bug fixes.
- Fix for race condition in uGNI BTL.
- Improve handling of large number of interfaces with TCP BTL.
- Numerous UCX bug fixes.