Merge branch 'openshmem-org:master' into master
kwaters4 authored Aug 29, 2024
2 parents a2d9daa + 1d6f40e commit 9a6a048
Showing 29 changed files with 684 additions and 49 deletions.
28 changes: 28 additions & 0 deletions .github/issue_template.md
@@ -0,0 +1,28 @@
---
name: Issue Template
about: Template for OpenSHMEM Issues
title: ''
labels: ''
assignees: ''

---

# Problem Statement

<!-- Describe the problem solved by this proposal. -->

# Proposed Changes

<!-- Describe the high level idea and proposed changes. -->

# Impact on Implementations

<!-- Describe changes that implementations will be required to make here. -->

# Impact on Users

<!-- Describe the changes that will impact users here. -->

# References and Pull Requests

<!-- References to other pull requests or issues, papers, websites, etc. Please keep this updated. -->
7 changes: 7 additions & 0 deletions .github/pull_request_template.md
@@ -0,0 +1,7 @@
# Summary of changes

# Proposal Checklist
- [ ] Link to issue(s)
- [ ] Changelog entry
- [ ] Reviewed for changes to front matter
- [ ] Reviewed for changes to back matter
34 changes: 28 additions & 6 deletions content/backmatter.tex
@@ -143,12 +143,6 @@ \chapter{Undefined Behavior in OpenSHMEM}\label{sec:undefined}
immediately upon an \openshmem call into the uninitialized library.
\tabularnewline
\hline
Multiple calls to initialization routines & In an \openshmem program where
the initialization routines \FUNC{shmem\_init} or \FUNC{shmem\_init\_thread}
have already been called, any subsequent calls to these initialization routines
result in undefined behavior.
\tabularnewline
\hline
Specifying invalid \ac{PE} numbers & For \openshmem routines that accept a
\ac{PE} number as an argument, if the \ac{PE} number is invalid for the
team associated with the operation (either implicitly or explicitly), the
@@ -661,6 +655,11 @@ \section{Version 1.6}
The following list describes the specific changes in \openshmem[1.6]:
\begin{itemize}
%
\item Added support for initialization and finalization routines to be called
multiple times, and added an initialization status query API
\FUNC{shmem\_query\_initialized}.
\ChangelogRef{subsec:shmem_init, subsec:shmem_finalize, subsec:shmem_query_initialized}%
%
\item Added interleaved block transfer APIs \FUNC{shmem\_ibget} and
\FUNC{shmem\_ibput}.
\ChangelogRef{subsec:shmem_ibget, subsec:shmem_ibput}%
@@ -687,19 +686,42 @@ \section{Version 1.6}
operations for team-based reductions.
\ChangelogRef{teamreducetypes}%
%
\item Added the session routines, \FUNC{shmem\_ctx\_session\_start} and
\FUNC{shmem\_ctx\_session\_stop}, which allow users to pass hints to the
\openshmem library to apply runtime optimizations.
\ChangelogRef{subsec:sessions}%
\item Added a fine-grained completion routine, \FUNC{shmem\_pe\_quiet}.
\ChangelogRef{subsec:shmem_pe_quiet}%
%
\item Split the listings for the \FUNC{shmem\_\{malloc, free, realloc, align\}}
functions from a single entry in \openshmem[1.5] into separate entries.
\ChangelogRef{subsec:shmem_malloc, subsec:shmem_free, subsec:shmem_realloc,
subsec:shmem_align}%
%
\item Clarified that the \FUNC{shmem\_\{malloc, free, realloc, align,
malloc\_with\_hints, calloc\}} functions are collective operations on
the world team.
\ChangelogRef{subsec:shmem_malloc, subsec:shmem_free, subsec:shmem_realloc,
subsec:shmem_align, subsec:shmmallochint, subsec:shmem_calloc}%
\item Corrected the level argument's recommended value in API notes for
\FUNC{shmem\_pcontrol} to indicate that the value should be greater than
2 to enable profiling with profile library defined effects and
additional arguments.
\ChangelogRef{subsec:shmem_pcontrol}
%
\item Clarified that \FUNC{shmem\_team\_get\_config} returns the current
configuration values, which may differ from the values assigned at the
time of the team's creation.
\ChangelogRef{subsec:shmem_team_get_config}
%
\item Clarified the behavior of \FUNC{shmem\_team\_get\_config} when the
\VAR{config\_mask} is 0 and/or the \VAR{config} argument is a null pointer.
\ChangelogRef{subsec:shmem_team_get_config}
%
\item Clarified the behavior of \FUNC{shmem\_team\_split\_strided} when the
stride argument is 0 or negative.
\ChangelogRef{subsec:shmem_team_split_strided}
%
\end{itemize}

\section{Version 1.5}
2 changes: 1 addition & 1 deletion content/collective_intro.tex
@@ -22,7 +22,7 @@
\end{enumerate}

Concurrent accesses to symmetric memory by an \openshmem collective
routine and any other means of access---where at least one updates the
routine and any other means of access---where at least one \ac{PE} updates the
symmetric memory---result in undefined behavior.
Since \acp{PE} can enter and exit collectives at different times,
accessing such memory remotely may require additional synchronization.
14 changes: 6 additions & 8 deletions content/execution_model.tex
@@ -8,17 +8,15 @@

\ac{PE} execution is loosely coupled, relying on \openshmem operations to
communicate and synchronize among executing \acp{PE}. The \openshmem phase in
a program begins with a call to the initialization routine \FUNC{shmem\_init}
a program begins with the first call to the initialization routine \FUNC{shmem\_init}
or \FUNC{shmem\_init\_thread}, which must be performed before using any of the
other \openshmem library routines.
An \openshmem program concludes its use of the \openshmem library when all \acp{PE} call
An \openshmem program concludes its use of the \openshmem library when all \acp{PE}
make their final call to
\FUNC{shmem\_finalize} or any \ac{PE} calls \FUNC{shmem\_global\_exit}.
During a call to \FUNC{shmem\_finalize}, the \openshmem library must
complete all pending communication and release all the resources associated to
the library using an implicit collective synchronization across \acp{PE}.
Calling any \openshmem routine before initialization or after
\FUNC{shmem\_finalize} leads to undefined behavior. After finalization, a
subsequent initialization call also leads to undefined behavior.
During the last call to \FUNC{shmem\_finalize}, the \openshmem library synchronizes
all \acp{PE}, completes all pending communication, and releases all the resources
associated with the library.

The \acp{PE} of the \openshmem program are identified by unique integers. The
identifiers are integers assigned in a monotonically increasing manner from zero
2 changes: 1 addition & 1 deletion content/memmgmt_intro.tex
@@ -3,7 +3,7 @@
symmetric data objects in the symmetric heap.

The symmetric memory allocation routines differ from the private heap
allocation routines in that they must be called by all \acp{PE} in a
allocation routines in that they must be called by all \acp{PE} in
the world team. When specified, each of these routines includes at
least one call to a procedure that is semantically equivalent to
\FUNC{shmem\_barrier\_all}. This ensures that all \acp{PE}
31 changes: 31 additions & 0 deletions content/sessions_intro.tex
@@ -0,0 +1,31 @@
\openshmem \emph{sessions} provide a mechanism for applications to inform the
\openshmem library of an upcoming sequence of communication routines that
exhibit suitable patterns for runtime optimizations.
A session is associated with a specific \openshmem communication context
(Section~\ref{sec:ctx}), and it indicates the beginning and ending of
communication phases on that context.
The \FUNC{shmem\_ctx\_session\_start} routine indicates the beginning of a session,
and the \FUNC{shmem\_ctx\_session\_stop} routine indicates the end of a session.
The \LibConstRef{SHMEM\_CTX\_SESSION\_*} options (Table~\ref{session_opts}) indicate
which patterns of \openshmem RMA and AMO routines will occur within a session.
These options serve only as \textit{hints} to the library; it is up to the
implementation whether or not to apply any optimizations within a session.
A session may be provided a configuration argument that specifies attributes
associated with the session. This configuration argument is of type
\CTYPE{shmem\_ctx\_session\_config\_t}, which is detailed further in
Section~\ref{subsec:shmem_ctx_session_config_t}.

Usage of the \openshmem session APIs on a particular context must comply with
the requirements of all options set on that context.
Starting and stopping \openshmem sessions should not affect the completion or
ordering semantics of any \openshmem routines in the program.
For these reasons, multi-threaded \openshmem programs may require additional
thread synchronization to ensure session hints are correctly applied to
shareable contexts.
Because sessions are associated with an \openshmem communication context,
routines not performed on a communication context (like collective routines)
are ineligible for session hints.

The \CTYPE{shmem\_ctx\_session\_config\_t} object requires the \CONST{SIZE\_MAX}
macro defined in \HEADER{stdint.h} by \Cstd[99]~\S7.18.3 and
\Cstd[11]~\S7.20.3.
3 changes: 2 additions & 1 deletion content/shmem_align.tex
@@ -17,7 +17,8 @@


\apidescription{
The \FUNC{shmem\_align} routine allocates a block in the symmetric
The \FUNC{shmem\_align} routine is a collective operation on the
world team that allocates a block in the symmetric
heap that has a byte alignment specified by the \VAR{alignment}
argument. The value of \VAR{alignment} shall be a multiple of
\CONST{sizeof(void *)} that is also a power of two; otherwise, the
79 changes: 79 additions & 0 deletions content/shmem_ctx_session_config_t.tex
@@ -0,0 +1,79 @@
\apisummary{
A structure type representing communication session configuration arguments
}

\begin{apidefinition}

\begin{Csynopsis}
typedef struct {
size_t total_ops;
} shmem_ctx_session_config_t;
\end{Csynopsis}

\begin{apiarguments}
None.
\end{apiarguments}


\apidescription{
A communication session configuration object is provided as an argument to
the \FUNC{shmem\_ctx\_session\_start} routine.
The \CTYPE{shmem\_ctx\_session\_config\_t} object contains optional parameters
that are associated with the options of a communication session.
These parameters serve only as \textit{hints} to the library; it is up to
the implementation whether or not to use the parameter values within
a session.

The \VAR{total\_ops} member indicates the expected maximum number of all
calls to \openshmem RMA routines within the session (i.e., after a call to
\FUNC{shmem\_ctx\_session\_start} and before a corresponding call to
\FUNC{shmem\_ctx\_session\_stop}).
If \VAR{total\_ops} differs from the \textit{actual} number of calls to
\openshmem RMA routines within the session, then application performance
might be suboptimal; however, the results of any data transfers,
completions, or memory ordering operations are unaffected by the value of
\VAR{total\_ops}.

When passing a configuration structure to \FUNC{shmem\_ctx\_session\_start},
the \VAR{config\_mask} argument specifies which fields the application
requests to associate with the session.
Any configuration parameter value that is not indicated in the mask will be
ignored, and the default value will be used instead.
Therefore, a program must set only the fields for which it does not want
the default value.

A configuration mask is created through a bitwise OR operation of the
following library constants.
A configuration mask value of \CONST{0} indicates that the session
should be started with the default values for all configuration
parameters.

\widetablerow{\LibConstRef{SHMEM\_CTX\_SESSION\_TOTAL\_OPS}}{
The value of the \VAR{total\_ops} member of the \VAR{config} structure is
unmasked within the session and applied as a hint.
}

The default values for configuration parameters are:

\widetablerow{\VAR{total\_ops} = \CONST{SIZE\_MAX}}{
By default, the expected maximum number of calls to \openshmem RMA routines
in the session is set to the maximum value of a \CTYPE{size\_t} variable,
\CONST{SIZE\_MAX}. This default setting indicates that the \openshmem
application chooses not to specify a value for \VAR{total\_ops}.
}
}

\apinotes{
Users are discouraged from calling \FUNC{shmem\_fence},
\FUNC{shmem\_ctx\_fence}, \FUNC{shmem\_quiet}, or \FUNC{shmem\_ctx\_quiet}
routines within a session whenever possible, because the library must
impose strict completions to comply with ordering semantics.
However, hints provided by \CTYPE{shmem\_ctx\_session\_config\_t} do not imply
the occurrence of any completion or memory ordering operations.
The requirements on buffers provided to \openshmem routines that are
\textit{in-use} (as described in Section
\ref{subsec:invoking_openshmem_operations}) apply regardless of any
\CTYPE{shmem\_ctx\_session\_config\_t} hints.
}

\end{apidefinition}
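
The mask rule described above (a field is honored only when its bit is set in the mask; otherwise the default applies) can be sketched in plain C. This is an illustrative model only, not library code: the numeric value chosen for `SHMEM_CTX_SESSION_TOTAL_OPS`, the helper `effective_total_ops`, the function `example_usage`, and the treatment of a null `config` pointer are all assumptions made for this sketch and are not stated in the specification text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the text names this constant but gives no numeric value. */
#define SHMEM_CTX_SESSION_TOTAL_OPS (1L << 0)

/* Structure layout as given in the synopsis above. */
typedef struct {
    size_t total_ops;
} shmem_ctx_session_config_t;

/*
 * Hypothetical helper: returns the total_ops hint that would be in effect
 * for a session. A field whose bit is absent from config_mask is ignored
 * and the default (SIZE_MAX) is used instead.
 * Assumption: a null config pointer is treated like an empty mask.
 */
size_t effective_total_ops(const shmem_ctx_session_config_t *config,
                           long config_mask)
{
    if (config != NULL && (config_mask & SHMEM_CTX_SESSION_TOTAL_OPS)) {
        return config->total_ops;
    }
    return SIZE_MAX;
}

/* Usage sketch: the same structure yields different hints per mask. */
void example_usage(void)
{
    shmem_ctx_session_config_t config = { .total_ops = 1024 };

    /* Bit set: the application's value is used. */
    assert(effective_total_ops(&config, SHMEM_CTX_SESSION_TOTAL_OPS) == 1024);

    /* Mask of 0: every parameter falls back to its default. */
    assert(effective_total_ops(&config, 0) == SIZE_MAX);
}
```

A consequence of this rule is that filling in `total_ops` without also setting the corresponding mask bit has no effect on the session.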
113 changes: 113 additions & 0 deletions content/shmem_ctx_session_start.tex
@@ -0,0 +1,113 @@
\apisummary{
Start a communication session.
}

\begin{apidefinition}

\begin{Csynopsis}
void @\FuncDecl{shmem\_ctx\_session\_start}@(shmem_ctx_t ctx, long options, const shmem_ctx_session_config_t *config, long config_mask);
\end{Csynopsis}

\begin{apiarguments}
\apiargument{IN}{ctx}{A context handle specifying the context associated
with this session.}
\apiargument{IN}{options}{The set of requested options from
Table~\ref{session_opts} for this session. Multiple options may be
requested by combining them with a bitwise OR operation; otherwise,
\CONST{0} can be given if no options are requested.}
\apiargument{IN}{config}{
A pointer to the configuration parameters for the session.}
\apiargument{IN}{config\_mask}{
The bitwise mask representing the set of configuration parameters to use
from \VAR{config}.}
\end{apiarguments}

\apidescription{
\FUNC{shmem\_ctx\_session\_start} is a non-collective routine that begins a
session on communication context \VAR{ctx} with hints requested via
\VAR{options}.
Sessions on a communication context must be stopped with a call to
\FUNC{shmem\_ctx\_session\_stop} on the same context.
If a session is already started on a given context, another call to
\FUNC{shmem\_ctx\_session\_start} on that same context combines new options
via a bitwise OR operation. In such a case, unmasked member values in the
\VAR{config} argument replace any existing configuration values that are
already applied to the session.

If \VAR{ctx} compares equal to \LibConstRef{SHMEM\_CTX\_INVALID} then
\FUNC{shmem\_ctx\_session\_start} performs no action and returns immediately.

No combination of \VAR{options} passed to \FUNC{shmem\_ctx\_session\_start}
results in undefined behavior, but some combinations may be detrimental for
performance; for example, when selecting an option that is not applicable
to the session. It is the user's responsibility to determine which
combination of \VAR{options} benefits the performance of the session.

The \VAR{config} argument specifies session configuration parameters,
which are described in Section~\ref{subsec:shmem_ctx_session_config_t}.

The \VAR{config\_mask} argument is a bitwise mask representing the set of
configuration parameters to use from \VAR{config}.
A \VAR{config\_mask} value of \CONST{0} indicates that the session should
be started with the default values for all configuration parameters.
See Section~\ref{subsec:shmem_ctx_session_config_t} for field mask names and
default configuration parameters.
}

\apireturnvalues{
None.
}

\sessiontablebegin

\sessiontablerow{\LibConstRef{SHMEM\_CTX\_SESSION\_BATCH}}{
A \textit{batch} is a series of calls to \openshmem routines that occur
within a session on a communication context (i.e., after a call to
\FUNC{shmem\_ctx\_session\_start} and before a corresponding call to
\FUNC{shmem\_ctx\_session\_stop}), that might tolerate an increase in
individual call latencies. Designating a batch may provide an opportunity
to decrease the overall overhead typically involved with the \openshmem
library implementing the series as individual RMA operations. In other
words, the performance of \openshmem programs that issue many consecutive
and small-sized RMA routines might be improved by informing the library
implementation ahead of time that it is free to delay transferring data
in order to buffer, combine, and/or coalesce the issued \openshmem
routines. The specific mechanisms for improving performance using
batching optimizations depend on the \openshmem library implementation.

The \CONST{SHMEM\_CTX\_SESSION\_BATCH} hint indicates that a communication
context will be used to issue a batch. An example of a batch is an
iterative loop of non-blocking RMA and/or AMO routines. A batch may
include a memory ordering or collective operation, but such routines
might require completions and/or synchronization that could degrade
performance.

Because sessions do not affect the completion or ordering semantics of any
\openshmem routines in the program, routines such as non-blocking RMAs,
non-blocking AMOs, non-blocking \OPR{put-with-signals}, blocking scalar
\OPR{puts}, small blocking \OPR{puts}, and blocking non-fetching AMOs are
viable candidates for batching. Other routines, such as large blocking
\OPR{puts}, all blocking \OPR{gets}, blocking fetching AMOs, and the
memory ordering routines might require the library to enforce
completions, reducing the potential benefit of batching.

The \VAR{total\_ops} field of \VAR{config} indicates the expected maximum
number of calls to \openshmem RMA routines within the session.
See Section~\ref{subsec:shmem_ctx_session_config_t} for details
about \CTYPE{shmem\_ctx\_session\_config\_t} parameters.
} \hline

\sessiontableend

\apinotes{
The \FUNC{shmem\_ctx\_session\_start} routine provides hints for improving
performance, and \openshmem implementations are not required to apply any
optimization.
\FUNC{shmem\_ctx\_session\_start} is non-collective, so there is no implied
synchronization.
Blocking puts must be sufficiently small to benefit from batching, and the
exact threshold for this benefit depends on the \openshmem implementation
and/or the application.
}

\end{apidefinition}
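
The rule for calling \FUNC{shmem\_ctx\_session\_start} on a context that already has a session (new options are combined by bitwise OR, and unmasked configuration members replace the existing values) can be modeled with a small state structure. The sketch below is a plain-C model under stated assumptions, not an implementation of the routine: `session_state_t` and `model_session_start` are hypothetical names, the numeric values of both constants are invented for illustration, and a null `config` pointer is assumed to leave the configuration untouched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumptions: numeric values are illustrative; the text does not give them. */
#define SHMEM_CTX_SESSION_BATCH     (1L << 0)
#define SHMEM_CTX_SESSION_TOTAL_OPS (1L << 0)

/* Structure layout as given in the shmem_ctx_session_config_t synopsis. */
typedef struct {
    size_t total_ops;
} shmem_ctx_session_config_t;

/* Hypothetical per-context bookkeeping for the hints of one session. */
typedef struct {
    bool   active;     /* true once a session has been started */
    long   options;    /* accumulated SHMEM_CTX_SESSION_* options */
    size_t total_ops;  /* current total_ops hint */
} session_state_t;

/*
 * Hypothetical model of the hint bookkeeping performed by a start call.
 * First start: begin from empty options and default configuration.
 * Every start: OR in the new options, then overwrite only those
 * configuration members whose bit is set in config_mask.
 */
void model_session_start(session_state_t *state, long options,
                         const shmem_ctx_session_config_t *config,
                         long config_mask)
{
    assert(state != NULL);

    if (!state->active) {
        state->active = true;
        state->options = 0;
        state->total_ops = SIZE_MAX;  /* default: no value specified */
    }

    state->options |= options;

    /* Assumption: a null config leaves existing values unchanged. */
    if (config != NULL && (config_mask & SHMEM_CTX_SESSION_TOTAL_OPS)) {
        state->total_ops = config->total_ops;
    }
}
```

In this model, a second start call that passes `options` of 0 with a masked `total_ops` keeps the batch hint from the first call and only updates the operation count.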