Deprecate Active-set-based Language #511

Closed
wants to merge 14 commits into from
Changes from 8 commits
7 changes: 5 additions & 2 deletions content/collective_intro.tex
@@ -1,7 +1,7 @@
\emph{Collective routines} are defined as coordinated communication or synchronization
operations performed by a group of \acp{PE}.

\openshmem provides three types of collective routines:
\openshmem provides four types of collective routines:

\begin{enumerate}
\item Collective routines that operate on teams use a team handle parameter to determine
@@ -11,9 +11,12 @@
\begin{DeprecateBlock}
\item Collective routines that operate on active sets use a set of parameters to determine
which \acp{PE} will participate and what resources are used to perform operations.

\item Collective routines that do not accept active set
parameters, which implicitly operate on the world team and, as required, the default context.
\end{DeprecateBlock}

\item Collective routines that accept neither team nor active set
\item Collective routines that do not accept team
parameters, which implicitly operate on the world team and, as
required, the default context.
\end{enumerate}
2 changes: 1 addition & 1 deletion content/programming_model_overview.tex
@@ -144,7 +144,7 @@
data object on another symmetric data object.
\item \OPR{All-to-All}: All \acp{PE} participating in the routine exchange
a fixed amount of contiguous or strided data with all other \acp{PE}
in the active set.
in the team.
\end{enumerate}

\item \textbf{Mutual Exclusion}
40 changes: 29 additions & 11 deletions content/shmem_alltoall.tex
@@ -35,10 +35,10 @@

\apiargument{OUT}{dest}{Symmetric address of a data object large enough to receive
the combined total of \VAR{nelems} elements from each \ac{PE} in the
active set.
participating \acp{PE}.
The type of \dest{} should match that implied in the SYNOPSIS section.}
\apiargument{IN}{source}{Symmetric address of a data object that contains \VAR{nelems}
elements of data for each \ac{PE} in the active set, ordered according to
elements of data for each participating \ac{PE}, ordered according to
destination \ac{PE}.
The type of \source{} should match that implied in the SYNOPSIS section.}
\apiargument{IN}{nelems}{
@@ -100,6 +100,21 @@
If \VAR{team} compares equal to \LibConstRef{SHMEM\_TEAM\_INVALID} or is
otherwise invalid, the behavior is undefined.

Before any \ac{PE} calls a \FUNC{shmem\_alltoall} routine,
the following conditions must be ensured:
\begin{itemize}
\item The \VAR{dest} data object on all \acp{PE} in the team is
ready to accept the \FUNC{shmem\_alltoall} data.
\end{itemize}

Upon return from a \FUNC{shmem\_alltoall} routine, the following is true for
the local PE:
\begin{itemize}
\item Its \VAR{dest} symmetric data object is completely updated and the
data has been copied out of the \VAR{source} data object.
\end{itemize}

\begin{DeprecateBlock}
Active-set-based collective routines operate over all \acp{PE} in the active set
defined by the \VAR{PE\_start}, \VAR{logPE\_stride}, \VAR{PE\_size} triplet.

@@ -116,23 +131,26 @@

Before any \ac{PE} calls a \FUNC{shmem\_alltoall} routine,
the following conditions must be ensured:

\begin{itemize}
\item The \VAR{dest} data object on all \acp{PE} in the active set is
ready to accept the \FUNC{shmem\_alltoall} data.
\item For active-set-based routines, the \VAR{pSync} array
on all \acp{PE} in the active set is not still in use from a prior call
to a \FUNC{shmem\_alltoall} routine.
\item The \VAR{dest} data object on all \acp{PE} in the active set is
ready to accept the \FUNC{shmem\_alltoall} data.
\item For active-set-based routines, the \VAR{pSync} array
on all \acp{PE} in the active set is not still in use from a prior call
to a \FUNC{shmem\_alltoall} routine.
\end{itemize}

Otherwise, the behavior is undefined.

Upon return from a \FUNC{shmem\_alltoall} routine, the following is true for
the local PE:
\begin{itemize}
\item Its \VAR{dest} symmetric data object is completely updated and
the data has been copied out of the \VAR{source} data object.
\item For active-set-based routines,
the values in the \VAR{pSync} array are restored to the original values.
\item Its \VAR{dest} symmetric data object is completely updated and the
Review comment (contributor, author): Fix: Whitespace
data has been copied out of the \VAR{source} data object.
\item For active-set-based routines,
the values in the \VAR{pSync} array are restored to the original values.
\end{itemize}
\end{DeprecateBlock}
}
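
The exchange pattern described above for \FUNC{shmem\_alltoall} can be modeled in ordinary local memory. The following is a minimal plain-C sketch, not OpenSHMEM code: the function name model_alltoall and the flat-array layout (one buffer of npes * nelems elements per PE, stored back to back) are assumptions made for illustration. It shows that the block a PE addresses to destination j ends up in PE j's dest, positioned by the sender's number.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model, not part of OpenSHMEM.
 * source and dest each hold npes buffers of (npes * nelems) elements;
 * buffer i stands in for PE i's symmetric data object. */
void model_alltoall(long *dest, const long *source,
                    size_t nelems, size_t npes)
{
    size_t per_pe = npes * nelems; /* elements in one PE's buffer */

    for (size_t src_pe = 0; src_pe < npes; src_pe++) {
        for (size_t dst_pe = 0; dst_pe < npes; dst_pe++) {
            for (size_t k = 0; k < nelems; k++) {
                /* Block dst_pe of the sender's source (ordered by
                 * destination PE) lands in block src_pe of the
                 * receiver's dest. */
                dest[dst_pe * per_pe + src_pe * nelems + k] =
                    source[src_pe * per_pe + dst_pe * nelems + k];
            }
        }
    }
}
```

With two PEs and nelems equal to 1, a source of {10, 11} on PE 0 and {20, 21} on PE 1 yields a dest of {10, 20} on PE 0 and {11, 21} on PE 1.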

\apireturnvalues{
4 changes: 2 additions & 2 deletions content/shmem_alltoalls.tex
@@ -35,10 +35,10 @@

\apiargument{OUT}{dest}{Symmetric address of a data object large enough to receive
the combined total of \VAR{nelems} elements from each \ac{PE} in the
active set.
participating \acp{PE}.
The type of \dest{} should match that implied in the SYNOPSIS section.}
\apiargument{IN}{source}{Symmetric address of a data object that contains \VAR{nelems}
elements of data for each \ac{PE} in the active set, ordered according to
elements of data for each participating \ac{PE}, ordered according to
destination \ac{PE}.
The type of \source{} should match that implied in the SYNOPSIS section.}
\apiargument{IN}{dst}{The stride between consecutive elements of the \dest{}
56 changes: 39 additions & 17 deletions content/shmem_broadcast.tex
@@ -45,7 +45,7 @@
respectively.
}
\apiargument{IN}{PE\_root}{Zero-based ordinal of the \ac{PE}, with respect to
the team or active set, from which the data is copied.}
the calling \acp{PE}, from which the data is copied.}

\begin{DeprecateBlock}

@@ -61,8 +61,7 @@
\end{apiarguments}

\apidescription{
\openshmem broadcast routines are collective routines over an active set or
valid \openshmem team.
\openshmem team-based broadcast routines are collective routines over a valid \openshmem team.
They copy the \source{} data object on the \ac{PE} specified by
\VAR{PE\_root} to the \dest{} data object on the \acp{PE}
participating in the collective operation.
@@ -75,18 +74,44 @@
\item The \dest{} object is updated on all \acp{PE}.
\item All \acp{PE} in the \VAR{team} argument must participate in
the operation.
\item Only \acp{PE} in the team may call the routine. If a
\ac{PE} not in the team calls a team-based
collective routine, the behavior is undefined.
\item If \VAR{team} compares equal to \LibConstRef{SHMEM\_TEAM\_INVALID} or is
otherwise invalid, the behavior is undefined.
\item \ac{PE} numbering is relative to the team. The specified
root \ac{PE} must be a valid \ac{PE} number for the team,
between \CONST{0} and \VAR{N$-$1}, where \VAR{N} is the size of
the team.
\end{itemize}

Before any \ac{PE} calls a broadcast routine, the following
conditions must be ensured:
\begin{itemize}
\item The \dest{} array on all \acp{PE} participating in the broadcast
is ready to accept the broadcast data.
\end{itemize}
Otherwise, the behavior is undefined.

Upon return from a team-based broadcast routine, the following are true for the local
\ac{PE}:
\begin{itemize}
\item The \dest{} data object is updated.
\item The \source{} data object may be safely reused.
\end{itemize}

\begin{DeprecateBlock}
\openshmem active-set broadcast routines are collective routines over an active set.
They copy the \source{} data object on the \ac{PE} specified by
\VAR{PE\_root} to the \dest{} data object on the \acp{PE}
participating in the collective operation.
The same \dest{} and \source{} data objects and the same value of
\VAR{PE\_root} must be passed by all \acp{PE} participating in the
collective operation.

For active-set-based broadcasts:
\begin{itemize}
\item The \dest{} object is updated on all \acp{PE} other than the
root \ac{PE}.
\item The \dest{} object is updated on all \acp{PE} other than the root \ac{PE}.
\item All \acp{PE} in the active set defined by the
\VAR{PE\_start}, \VAR{logPE\_stride}, \VAR{PE\_size} triplet
must participate in the operation.
@@ -102,39 +127,36 @@
in the active set.
\end{itemize}

Before any \ac{PE} calls a broadcast routine, the following
Before any \ac{PE} calls an active-set-based broadcast routine, the following
conditions must be ensured:
\begin{itemize}
\item The \dest{} array on all \acp{PE} participating in the broadcast
is ready to accept the broadcast data.
\item For active-set-based broadcasts, the
\VAR{pSync} array on all \acp{PE} in the
\item The \VAR{pSync} array on all \acp{PE} in the
active set is not still in use from a prior call to an \openshmem
collective routine.
\end{itemize}
Otherwise, the behavior is undefined.

Upon return from a broadcast routine, the following are true for the local
Upon return from an active-set-based broadcast routine, the following are true for the local
\ac{PE}:
\begin{itemize}
\item For team-based broadcasts, the \dest{} data object is
updated.
\item For active-set-based broadcasts:
\begin{itemize}
\item If the current \ac{PE} is not the root \ac{PE}, the
\dest{} data object is updated.
\item If the current \ac{PE} is not the root \ac{PE}, the \dest{} data object is updated.
\item The \source{} data object may be safely reused.
\item The values in the \VAR{pSync} array are restored to the
original values.
\end{itemize}
\item The \source{} data object may be safely reused.
\end{itemize}
\end{DeprecateBlock}
}
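
For the deprecated active-set interface, \VAR{PE\_root} is an ordinal within the active set rather than a global PE number. The following minimal C sketch shows how such an ordinal maps to a global PE number, assuming the usual reading of the triplet: \VAR{PE\_start} is the lowest PE number, the stride between consecutive members is 2 raised to \VAR{logPE\_stride}, and \VAR{PE\_size} is the member count. The function name active_set_root_pe is hypothetical and not part of the OpenSHMEM API.

```c
#include <assert.h>

/* Hypothetical helper, not part of OpenSHMEM.
 * Returns the global PE number of the active-set member whose zero-based
 * ordinal is PE_root, or -1 if the arguments do not describe a member. */
int active_set_root_pe(int PE_start, int logPE_stride, int PE_size,
                       int PE_root)
{
    if (PE_start < 0 || logPE_stride < 0 || PE_size <= 0) {
        return -1; /* not a valid triplet */
    }
    if (PE_root < 0 || PE_root >= PE_size) {
        return -1; /* ordinal outside the active set */
    }
    /* Members are PE_start, PE_start + stride, PE_start + 2 * stride, ... */
    return PE_start + PE_root * (1 << logPE_stride);
}
```

For example, with \VAR{PE\_start} 2, \VAR{logPE\_stride} 1, and \VAR{PE\_size} 4, the active set is PEs 2, 4, 6, and 8, so ordinal 3 refers to PE 8.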


\apireturnvalues{
For team-based broadcasts, zero on successful local completion; otherwise, nonzero.

\begin{DeprecateBlock}
For active-set-based broadcasts, none.
\end{DeprecateBlock}

}

\apinotes{
35 changes: 30 additions & 5 deletions content/shmem_collect.tex
@@ -66,13 +66,11 @@
\openshmem \FUNC{collect} and \FUNC{fcollect} routines perform a collective
operation to concatenate \VAR{nelems}
data items from the \source{} array into the
\dest{} array, over an \openshmem team or active set
in processor number order. The resultant \dest{} array contains the contribution from
\dest{} array, over an \openshmem team in processor number order.
The resultant \dest{} array contains the contribution from
\acp{PE} as follows:

\begin{itemize}
\item For an active set, the data from \ac{PE} \VAR{PE\_start} is first, then the
contribution from \ac{PE} \VAR{PE\_start} + \VAR{PE\_stride} second, and so on.
\begin{itemize}
\item For a team, the data from \ac{PE} number \CONST{0} in the team is first, then the
contribution from \ac{PE} \CONST{1} in the team, and so on.
\end{itemize}
@@ -90,6 +88,26 @@
If \VAR{team} compares equal to \LibConstRef{SHMEM\_TEAM\_INVALID} or is
otherwise invalid, the behavior is undefined.

\begin{DeprecateBlock}
\openshmem \FUNC{collect} and \FUNC{fcollect} routines perform a collective
operation to concatenate \VAR{nelems}
data items from the \source{} array into the
\dest{} array, over an \openshmem active set
in processor number order. The resultant \dest{} array contains the contribution from
\acp{PE} as follows:
\begin{itemize}
\item For an active set, the data from \ac{PE} \VAR{PE\_start} is first, then the
contribution from \ac{PE} \VAR{PE\_start} + \VAR{PE\_stride} second, and so on.
\end{itemize}

The collected result is written to the \dest{} array for all \acp{PE}
that participate in the operation. The same \dest{} and \source{}
arrays must be passed by all \acp{PE} that participate in the operation.

The \FUNC{fcollect} routines require that \VAR{nelems} be the same value in all
participating \acp{PE}, while the \FUNC{collect} routines allow \VAR{nelems} to
vary from \ac{PE} to \ac{PE}.

Active-set-based collective routines operate over all \acp{PE} in the active set
defined by the \VAR{PE\_start}, \VAR{logPE\_stride}, \VAR{PE\_size} triplet.
As with all active-set-based collective routines,
@@ -108,16 +126,23 @@
\item For active-set-based collective routines, the values in the \VAR{pSync} array are
restored to the original values.
\end{itemize}
\end{DeprecateBlock}
}
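
Because \FUNC{collect} lets \VAR{nelems} vary from PE to PE while concatenating in PE order, the position of each contribution in \dest{} is a running sum of the counts of the lower-numbered PEs. The following minimal plain-C sketch illustrates that layout; the function name collect_offsets and its array arguments are hypothetical and only model the ordering, they are not OpenSHMEM calls.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of OpenSHMEM.
 * nelems[pe] is the element count contributed by team member pe.
 * Writes the starting index of each contribution into offsets[] and
 * returns the total number of elements in the concatenated result. */
size_t collect_offsets(const size_t *nelems, size_t npes, size_t *offsets)
{
    size_t total = 0;

    for (size_t pe = 0; pe < npes; pe++) {
        offsets[pe] = total; /* PE 0 first, then PE 1, and so on */
        total += nelems[pe];
    }
    return total;
}
```

With counts of 3, 1, and 2 elements, the contributions start at indices 0, 3, and 4, and \dest{} must hold 6 elements. For \FUNC{fcollect}, where every count is the same, the offset of member pe reduces to pe times \VAR{nelems}.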

\apireturnvalues{
Zero on successful local completion. Nonzero otherwise.
}

\apinotes{
\begin{DeprecateBlock}
The collective routines operate on active \ac{PE} sets that have a
non-power-of-two \VAR{PE\_size} with some performance degradation. They operate
with no performance degradation when \VAR{nelems} is a non-power-of-two value.
\end{DeprecateBlock}
The collective routines that operate on teams containing a
non-power-of-two number of \acp{PE} do so with some performance degradation. They operate
with no performance degradation when \VAR{nelems} is a non-power-of-two value.

}

\begin{apiexamples}
4 changes: 0 additions & 4 deletions content/shmem_malloc_hints.tex
@@ -57,19 +57,15 @@
\tabularnewline \hline
\endhead
%%
\newline
\CONST{0} &
\newline
Behavior same as \FUNC{shmem\_malloc}
\tabularnewline \hline

\LibConstDecl{SHMEM\_MALLOC\_ATOMICS\_REMOTE} &
\newline
Memory used for \VAR{atomic} operations
\tabularnewline \hline

\LibConstDecl{SHMEM\_MALLOC\_SIGNAL\_REMOTE} &
\newline
Memory used for \VAR{signal} operations
\tabularnewline \hline

36 changes: 35 additions & 1 deletion content/shmem_reductions.tex
@@ -257,6 +257,8 @@ \subsubsubsection{PROD}
\VAR{nreduce} must be of type integer.}

\begin{DeprecateBlock}
\apiargument{IN}{nreduce}{In active-set-based \ac{API} calls,
\VAR{nreduce} must be of type integer.}
\apiargument{IN}{PE\_start}{The lowest \ac{PE} number of the active set of
\acp{PE}.}
\apiargument{IN}{logPE\_stride}{The log (base 2) of the stride between consecutive
@@ -273,7 +275,7 @@ \subsubsubsection{PROD}
\end{apiarguments}

\apidescription{
\openshmem reduction routines are collective routines over an active set or
\openshmem reduction routines are collective routines over an
existing \openshmem team that compute one or more reductions across symmetric
arrays on multiple \acp{PE}. A reduction performs an associative binary routine
across a set of values.
Expand All @@ -295,6 +297,37 @@ \subsubsubsection{PROD}
If \VAR{team} compares equal to \LibConstRef{SHMEM\_TEAM\_INVALID} or is
otherwise invalid, the behavior is undefined.

Before any \ac{PE} calls a reduction routine, the following conditions must be ensured:
\begin{itemize}
\item The \dest{} array on all \acp{PE} participating in the reduction
is ready to accept the results of the \OPR{reduction}.
\end{itemize}
Otherwise, the behavior is undefined.

Upon return from a reduction routine, the following are true for the local
\ac{PE}:
\begin{itemize}
\item The \dest{} array is updated and the \source{} array may be safely reused.
\end{itemize}

\begin{DeprecateBlock}
\openshmem reduction routines are collective routines over an active set
that compute one or more reductions across symmetric
arrays on multiple \acp{PE}. A reduction performs an associative binary routine
across a set of values.

The \VAR{nreduce} argument determines the number of separate reductions to
perform. The \source{} array on all \acp{PE} participating in the reduction
provides one element for each reduction. The results of the reductions are placed in the
\dest{} array on all \acp{PE} participating in the reduction.

The same \source{} and \dest{} arrays must be passed by all PEs that
participate in the collective.
The \source{} and \dest{} arguments must either be the same symmetric
address, or two different symmetric addresses corresponding to buffers that
do not overlap in memory. That is, they must be completely overlapping (sometimes referred to as an ``in place'' reduction) or
completely disjoint.

Active-set-based reduction routines operate over all \acp{PE} in the active set
defined by the \VAR{PE\_start}, \VAR{logPE\_stride}, \VAR{PE\_size} triplet.

@@ -327,6 +360,7 @@ \subsubsubsection{PROD}
\item If using active-set-based routines,
the values in the \VAR{pSync} array are restored to the original values.
\end{itemize}
\end{DeprecateBlock}
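
The role of \VAR{nreduce} described above can be illustrated with a plain-C model of a sum reduction. This is a sketch under stated assumptions, not OpenSHMEM code: the function name model_sum_reduce is hypothetical, the \source{} arrays of all PEs are stored back to back in one local array, and the single output array stands for the identical result that every participating PE receives in \dest{}.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model, not part of OpenSHMEM.
 * source holds npes arrays of nreduce elements, one per PE.
 * dest receives nreduce results; element j is the sum over all PEs of
 * element j of each PE's source array. */
void model_sum_reduce(long *dest, const long *source,
                      size_t nreduce, size_t npes)
{
    for (size_t j = 0; j < nreduce; j++) {
        long acc = 0;
        for (size_t pe = 0; pe < npes; pe++) {
            acc += source[pe * nreduce + j]; /* one element per PE */
        }
        dest[j] = acc; /* one independent reduction per index j */
    }
}
```

With two PEs holding {1, 2, 3} and {10, 20, 30} and \VAR{nreduce} equal to 3, the result is {11, 22, 33}. The model uses separate arrays, which corresponds to the completely disjoint case for \source{} and \dest{}.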

The complex-typed interfaces are only provided for sum and product reductions.
When the \Cstd translation environment does not support complex types