Releases: openucx/ucx

v1.3.0 RC4

27 Mar 11:44
f8064fa
Pre-release

Changelog:

  • Fixes for gcc8 compilation
  • Fix missing initialization of rndv_send_nbr thresholds
  • Fix mlx5 srq cleanup
  • Fix ep info print when there is no wireup lane
  • Optimize ugni locking

v1.3.0 RC3

13 Mar 10:07
9287190
Pre-release
  • Fix compilation issue with mlx5 on ARM
  • Disable GDR-copy when ODP is used

v1.3.0 RC2

25 Feb 09:23
0b45e29
Pre-release

Bugfixes:

  • Fix flow control for DC transport

1.3.0 - RC1

15 Feb 17:49
822e820
Pre-release

Features:

  • Added stream-based communication API to UCP
  • Added support for GPU platforms: Nvidia CUDA and AMD ROCM software stacks
  • Added API for client/server based connection establishment
  • Added support for TCP transport
  • Support for InfiniBand tag-matching offload for DC and accelerated transports
  • Multi-rail support for eager and rendezvous protocols
  • Added support for tag-matching communications with CUDA buffers
  • Added ucp_rkey_ptr() to obtain pointer for shared memory region
  • Avoid progress overhead on unused transports
  • Improved scalability of software tag-matching by using a hash table
  • Added transparent huge-pages allocator
  • Added non-blocking flush and disconnect for UCP
  • Support fixed-address memory allocation via ucp_mem_map()
  • Added ucp_tag_send_nbr() API to avoid send request allocation
  • Support global addressing in all IB transports
  • Add support for external epoll fd and edge-triggered events
  • Added registration cache for knem
  • Initial support for Java bindings
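
The hash-table item above (software tag-matching scalability) can be illustrated with a minimal sketch. This is not UCX's actual implementation; every name below (example_recv_t, example_post_recv, example_match, the bucket count) is hypothetical, and wildcard tag masks are deliberately left out. The idea shown is only that posted receives are grouped by a hash of their tag, so an arriving message searches one short bucket instead of a single long list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_NUM_BUCKETS 64 /* hypothetical size; must match the 6-bit hash below */

/* Hypothetical posted-receive descriptor, linked into one bucket. */
typedef struct example_recv {
    uint64_t             tag;
    struct example_recv *next;
} example_recv_t;

/* Expected-receive queue split into per-bucket FIFO lists. */
typedef struct {
    example_recv_t *head[EXAMPLE_NUM_BUCKETS];
    example_recv_t *tail[EXAMPLE_NUM_BUCKETS];
} example_expected_t;

/* Multiplicative hash: keep the top 6 bits, giving an index in 0..63. */
static size_t example_bucket(uint64_t tag)
{
    return (size_t)((tag * 0x9E3779B97F4A7C15ull) >> 58);
}

/* Append a receive to the bucket of its tag (FIFO order is preserved per tag). */
void example_post_recv(example_expected_t *q, example_recv_t *r)
{
    size_t b;

    assert(q != NULL && r != NULL);
    b       = example_bucket(r->tag);
    r->next = NULL;
    if (q->tail[b] != NULL) {
        q->tail[b]->next = r;
    } else {
        q->head[b] = r;
    }
    q->tail[b] = r;
}

/* Find, unlink and return the oldest receive posted with this exact tag,
 * or NULL if none is posted. Only one bucket is scanned. */
example_recv_t *example_match(example_expected_t *q, uint64_t tag)
{
    example_recv_t *prev = NULL, *r;
    size_t b;

    assert(q != NULL);
    b = example_bucket(tag);
    for (r = q->head[b]; r != NULL; prev = r, r = r->next) {
        if (r->tag != tag) {
            continue; /* different tag that hashed to the same bucket */
        }
        if (prev != NULL) {
            prev->next = r->next;
        } else {
            q->head[b] = r->next;
        }
        if (q->tail[b] == r) {
            q->tail[b] = prev;
        }
        r->next = NULL;
        return r;
    }
    return NULL;
}
```

Receives posted with the same tag are matched in posting order, which is the ordering MPI-style tag matching requires. A real implementation would also need a separate path for wildcard receives, since those cannot be placed in a single bucket.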

Bugfixes:

  • Multiple bugfixes (full list on GitHub)

Tested configurations:

  • InfiniBand: MLNX_OFED 4.2, inbox OFED drivers.
  • CUDA: gdrcopy 1.2, cuda 9.1.85
  • XPMEM: 2.6.2
  • KNEM: 1.1.2

Known issues:

  • #2047 - UCP: ucp_do_am_bcopy_multi drops data on UCS_ERROR_NO_RESOURCE
  • #2047 - failure in ud/uct_flush_test.am_zcopy_flush_ep_nb/1
  • #1977 - failure in shm/test_ucp_rma.blocking_small/0
  • #1926 - Timeout in mpi_test_suite with HW TM
  • #1920 - transport retry count exceeded in many-to-one tests
  • #1689 - Segmentation fault on memory hooks test in jenkins

v1.2.2

11 Jan 20:14
2abcbfe

Main:

  • Support including UCX API headers from C++ code
  • UD transport to handle unicast flood on RoCE fabric
  • Compilation fixes for gcc 7.1.1, clang 3.6, clang 5

Details:

  • When UD transport is used with RoCE, packets intended for other peers may
    arrive on different adapters (as a result of unicast flooding).
  • This change adds packet filtering based on destination GIDs: a packet is
    now silently dropped if its destination GID does not match the local GID.
  • Added a new device ID for InfiniBand HCA
  • [packaging] Move examples/ and perftest/ into doc
  • [packaging] Update spec to work on old distros while remaining compliant
    with Fedora guidelines
  • [cleanup] Removed unused ptmalloc version (2.83)
  • [cleanup] Fixup license headers
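
The destination-GID filtering described in the details above can be sketched as follows. This is a simplified illustration under stated assumptions, not the code in the UD transport: example_pkt_hdr_t and example_accept_packet are hypothetical names, and the header is reduced to the single field the check needs. The only fact relied on is that a GID is a 128-bit value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_GID_SIZE 16 /* a GID is 128 bits */

/* Hypothetical, reduced packet header: only the destination GID is modeled. */
typedef struct {
    uint8_t dgid[EXAMPLE_GID_SIZE];
} example_pkt_hdr_t;

/* Return true if the packet is addressed to this port and should be
 * processed, false if it should be silently dropped (for example, a packet
 * that reached this adapter because of unicast flooding). */
bool example_accept_packet(const example_pkt_hdr_t *hdr,
                           const uint8_t local_gid[EXAMPLE_GID_SIZE])
{
    assert(hdr != NULL && local_gid != NULL);
    return memcmp(hdr->dgid, local_gid, EXAMPLE_GID_SIZE) == 0;
}
```

A receive path would call such a check before any further processing and skip the packet when it returns false, without reporting an error to the sender.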

v1.2.2 RC1

29 Nov 16:26
fc625eb
Pre-release

Main:

  • Support including UCX API headers from C++ code
  • UD transport to handle unicast flood on RoCE fabric
  • Compilation fixes for gcc 7.1.1 and clang 3.6

Details:

  • When UD transport is used with RoCE, packets intended for other peers may
    arrive on different adapters (as a result of unicast flooding).
  • This change adds packet filtering based on destination GIDs: a packet is
    now silently dropped if its destination GID does not match the local GID.
  • [packaging] Move examples/ and perftest/ into doc
  • [packaging] Update spec to work on old distros while remaining compliant
    with Fedora guidelines
  • [cleanup] Removed unused ptmalloc version (2.83)
  • [cleanup] Fixup license headers

v1.2.1

27 Aug 17:03
  • Compilation fixes for gcc 7.1
  • Spec file cleanups
  • Versioning cleanups

v1.2.0

15 Jun 16:13

1.2.0 (June 15, 2017)

Supported platforms

  • Shared memory: KNEM, CMA, XPMEM, SYSV, POSIX
  • Verbs over InfiniBand and RoCE.
    Verbs over other RDMA interconnects (iWARP, Omni-Path, etc.) is available
    for community evaluation and has not been tested in the context of this release
  • Cray Gemini and Aries
  • Architectures: x86_64, ARMv8 (64bit), Power64

Features:

  • Added support for InfiniBand DC and UD transports, including accelerated verbs for Mellanox devices
  • Full support for PGAS/SHMEM interfaces, blocking and non-blocking APIs
  • Support for MPI tag matching, both in software and offload mode
  • Zero copy protocols and rendezvous, registration cache
  • Handling transport errors
  • Flow control for DC/RC
  • Datatypes support: contiguous, IOV, generic
  • Multi-threading support
  • Support for ARMv8 64bit architecture
  • A new API for efficient memory polling
  • Support for malloc-hooks and memory registration caching

Bugfixes:

  • Multiple bugfixes improving overall stability of the library

Known issues:

  • #1604 - Failure in ud/test_ud_slow_timer.retransmit1/1 with valgrind
  • #1588 - Fix reading cpuinfo timebase for ppc
  • #1579 - Ud/test_ud.ca_md test takes too long to complete
  • #1576 - Failure in ud/test_ud_slow_timer.retransmit1/0 with valgrind
  • #1569 - Send completion with error with dc_verbs
  • #1566 - Segfault in malloc_hook.fork on arm
  • #1565 - Hang in udrc/test_ucp_rma.nonblocking_stream_get_nbi_flush_worker
  • #1534 - Wireup.c:473 Fatal: endpoint reconfiguration not supported yet
  • #1533 - Stack overflow under Valgrind 'rc_mlx5/uct_p2p_err_test.local_access_error/0'
  • #1513 - Hang in MPI_Finalize with UCX_TLS=rc[_x],sm on the bsend2 test
  • #1504 - Failure in cm/uct_p2p_am_test.am_bcopy/1
  • #1492 - Hang when using polling fd
  • #1489 - Hang on the osu_fop_latency test with RoCE
  • #1005 - RoCE problem with OMPI direct modex - UD assertion

1.1.0 (September 1, 2015)

Workarounds:
Features:

Bugfixes:
Known issues:

1.0.0 (July 22, 2015)

Features:

  • Added support for UCT cma shared memory transport (Cross-Memory Attach)
  • Added support for UCT mm shared memory transport with mmap/sysv APIs
  • Added support for UCT rc transport based on InfiniBand/RC with verbs
  • Added support for UCT mlx5_rc transport based on InfiniBand/RC with accelerated verbs
  • Added support for UCT cm transport based on InfiniBand/SIDR (Service ID Resolution)
  • Added support for UCT ugni transport based on Cray/UGNI
  • Added support for Doxygen based documentation generation
  • Added support for UCP basic protocol layer to fit PGAS paradigm (RMA, AMO)
  • Added ucx_perftest utility to exercise major UCX flows and provide performance metrics
  • Added test script for jenkins (contrib/test_jenkins.sh)
  • Added packaging for RPM/DEB based Linux distributions (see contrib/buildrpm.sh)
  • Added unit-test infrastructure for UCX functionality based on Google Test framework (see test/gtest/)
  • Added initial integration for OpenMPI with UCX for PGAS/SHMEM API
    (see: https://github.com/openucx/ompi-mirror/pull/1)
  • Added end-to-end testing infrastructure based on MTT (see contrib/mtt/README_MTT)