- 24 Apr, 2015 4 commits
-
-
Signed-off-by:
Igor Ivanov <Igor.Ivanov@itseez.com>
-
Pavan Balaji authored
Over time, we have disabled a number of tests. Some of these had tickets, and some were simply commented out before we had the xfail mechanism. This patch is semi-temporary: it lets us see where we stand with these changes and how far we can get in fixing them. This patch does not reenable the FT failures or the performance test failures. Signed-off-by:
Huiwei Lu <huiweilu@mcs.anl.gov>
-
PAMI_Rput_typed / PAMI_Rget_typed are used with user-defined derived types, which could be freed between the time of the MPID_Put / Get and the time the PAMI context actually executes the put or get (say, in the win_fence). To prevent this, add a ref count to the MPID_Datatype at the time of the MPID_Put / Get, and release the ref in the MPIDI_Win_DoneCB callback once the get or put has actually executed and it is OK to free the datatype. Signed-off-by:
Rob Latham <robl@mcs.anl.gov>
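The reference-counting idea in the message above can be sketched as follows. This is a minimal illustration, not the PAMID code: `datatype_t`, `pending_op_t`, `op_post`, and `op_done` are hypothetical stand-ins for the MPID_Datatype object, the queued put/get, the MPID_Put / Get entry point, and the MPIDI_Win_DoneCB callback.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a derived datatype object (not MPID_Datatype). */
typedef struct {
    int ref_count;   /* outstanding users, including the creator */
    int freed;       /* set to 1 once the storage would have been released */
} datatype_t;

static void datatype_add_ref(datatype_t *dt) { dt->ref_count++; }

/* Drop one reference; the object is torn down only when the last user lets go. */
static void datatype_release(datatype_t *dt)
{
    assert(dt->ref_count > 0);
    if (--dt->ref_count == 0)
        dt->freed = 1;
}

/* Hypothetical deferred operation queued by a put/get call. */
typedef struct {
    datatype_t *dt;
} pending_op_t;

/* Called when the user issues the put/get: pin the datatype. */
static void op_post(pending_op_t *op, datatype_t *dt)
{
    datatype_add_ref(dt);
    op->dt = dt;
}

/* Called from the completion callback once the transfer has really executed. */
static void op_done(pending_op_t *op)
{
    datatype_release(op->dt);
    op->dt = NULL;
}
```

With this pattern, a user-side free between `op_post` and `op_done` only drops the count to one, so the datatype stays alive until the callback runs.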
-
Rob Latham authored
Recent changes to MPI_INFO_GET will report a hard error if the info value passed in is too short. If we want to determine whether a key is set, and don't care what its value is, we'll use MPI_INFO_GET_VALUELEN. Took the opportunity to add a system-hints test to the ROMIO tests. Signed-off-by:
Paul Coffman <pkcoff@us.ibm.com> Signed-off-by:
Wei-keng Liao <wkliao@ece.northwestern.edu>
-
- 23 Apr, 2015 3 commits
-
-
Huiwei Lu authored
3.2b2 is mainly a bug fix release. Most text remains unchanged. Only the web link for changes between 3.1.3 and 3.2b2 has been changed. Signed-off-by:
Antonio J. Pena <apenya@mcs.anl.gov>
-
Huiwei Lu authored
The ABI string remains unchanged from 3.2b1. Signed-off-by:
Antonio J. Pena <apenya@mcs.anl.gov>
-
Halim Amer authored
Initialize the num_queued_threads variable and update it with the appropriate OPA operations. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
- 22 Apr, 2015 6 commits
-
-
Pavan Balaji authored
Instead of polling an arbitrarily chosen number of times in the progress engine before yielding, we have now moved the yielding intelligence to the threading layer. The threading layer can keep track of other threads that are waiting to enter the critical section and only yield if another thread is waiting. In this way, if no thread is waiting to get the lock, the main thread never yields. At the same time, if another thread is waiting to get a lock, there is no delay in yielding. This change, however, introduces possible deadlocks. If a thread enters MPIDI_CH3I_progress with is_blocking unset, it may set the MPIDI_CH3I_progress_blocked flag and then yield the critical section. Another thread may enter with is_blocking set, find the MPIDI_CH3I_progress_blocked flag set, and block on the condition variable. The first thread will wake up and leave the progress engine without emitting any signal to wake up the second thread, which may sleep forever. A simple fix is to yield the critical section only if the current thread entered the progress engine with is_blocking set. Signed-off-by:
Halim Amer <aamer@anl.gov>
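The fix described above boils down to a single yield decision. The helper below is a hypothetical sketch of that rule, not the CH3 code; `should_yield` and its parameters are names introduced here for illustration.

```c
#include <assert.h>

/* Hypothetical helper: should a thread inside the progress engine give up
 * the critical section? */
static int should_yield(int is_blocking, int num_queued_threads)
{
    assert(num_queued_threads >= 0);
    /* A nonblocking caller keeps the critical section, which avoids the
     * lost-wakeup scenario described in the commit message. */
    if (!is_blocking)
        return 0;
    /* A blocking caller yields only if another thread is actually waiting. */
    return num_queued_threads > 0;
}
```

For example, `should_yield(0, 3)` is 0 even though three threads are queued, because the caller entered with is_blocking unset.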
-
Pavan Balaji authored
Instead of a simple thread yield, this patch adds some additional information to the yield about how many threads are waiting for the lock. When a thread tries to acquire a lock, it increments a counter. When a thread needs to yield, it can check this counter to see how many threads are waiting to get the lock. If there are no threads waiting, the yield can be skipped. This patch contains various changes to make that happen: 1. We modify the mutex object to maintain additional information on the number of queued threads. 2. We improve the yield call to include the unlock and lock as well, since it needs to decide whether to do the unlock/lock based on how many other threads are queued up. Signed-off-by:
Halim Amer <aamer@anl.gov>
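A minimal sketch of the two changes listed above, assuming POSIX threads and C11 atomics in place of the OPA operations used in MPICH. The type `counted_mutex_t` and the `counted_mutex_*` functions are hypothetical; only the field name `num_queued_threads` comes from the commit history.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical mutex wrapper that also tracks how many threads are queued. */
typedef struct {
    pthread_mutex_t lock;
    atomic_int num_queued_threads;  /* threads currently waiting in lock() */
} counted_mutex_t;

static void counted_mutex_init(counted_mutex_t *m)
{
    pthread_mutex_init(&m->lock, NULL);
    atomic_init(&m->num_queued_threads, 0);
}

static void counted_mutex_lock(counted_mutex_t *m)
{
    atomic_fetch_add(&m->num_queued_threads, 1);  /* announce that we wait */
    pthread_mutex_lock(&m->lock);
    atomic_fetch_sub(&m->num_queued_threads, 1);  /* we now own the lock */
}

static void counted_mutex_unlock(counted_mutex_t *m)
{
    pthread_mutex_unlock(&m->lock);
}

/* Yield includes the unlock/lock pair and skips both when nobody is queued.
 * Returns 1 if the lock was actually released and re-acquired. */
static int counted_mutex_yield(counted_mutex_t *m)
{
    if (atomic_load(&m->num_queued_threads) == 0)
        return 0;                   /* nobody waiting: keep the lock */
    counted_mutex_unlock(m);
    sched_yield();
    counted_mutex_lock(m);
    return 1;
}
```

The caller must hold the lock when calling `counted_mutex_yield`; in the uncontended case the call costs one atomic load and no unlock/lock round trip.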
-
Pavan Balaji authored
The nemesis progress engine was written so that if one thread is inside the progress engine, other threads cannot enter the receive progress. They can enter the send progress in some cases. There doesn't seem to be a good reason for this behavior. This patch unifies the two paths so that threads simply return for nonblocking operations and wait for a signal before entering the progress engine for blocking operations. Signed-off-by:
Halim Amer <aamer@anl.gov>
-
Sangmin Seo authored
__attribute__((weak,alias())) should have function names starting with PMPI, but some MPIX functions, such as MPIX_Grequest_class_create, MPIX_Grequest_class_allocate, MPIX_Grequest_start, MPIX_Mutex_create, MPIX_Mutex_free, MPIX_Mutex_lock, and MPIX_Mutex_unlock, had the same alias names as the original functions. This patch fixes the wrong alias names in __attribute__((weak,alias())) and also fixes some wrong alias names in #pragma. Signed-off-by:
Huiwei Lu <huiweilu@mcs.anl.gov>
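To illustrate the alias rule above, here is a small sketch using made-up function names (`PMPIX_Example_add` and `MPIX_Example_add` are hypothetical, not MPICH symbols). It assumes GCC or Clang on an ELF platform, where the `alias` attribute is supported.

```c
#include <assert.h>

/* The profiling (PMPIX_) name carries the real implementation. */
int PMPIX_Example_add(int a, int b)
{
    return a + b;
}

/* The public name is a weak alias whose target is the PMPIX_ name.
 * Writing alias("MPIX_Example_add") here, i.e. the function's own name,
 * would be the kind of mistake the commit message describes. */
int MPIX_Example_add(int a, int b)
    __attribute__((weak, alias("PMPIX_Example_add")));
```

Because the public symbol is weak, a profiling tool can supply its own strong `MPIX_Example_add` and still reach the implementation through the `PMPIX_` name.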
-
Antonio J. Pena authored
Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
Kenneth Raffenetti authored
The return value of anysource_matched should be the actual result of the cancel operation. If the result is uncancelable, i.e. already matched, then CH3 will let the netmod message win and move on to the other requests in the queue. When the completion for the unsuccessfully canceled message comes in, we process it as normal. Reviewed-by:
Igor Ivanov <Igor.Ivanov@itseez.com> Signed-off-by:
Antonio J. Pena <apenya@mcs.anl.gov>
-
- 21 Apr, 2015 2 commits
-
-
Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
- 20 Apr, 2015 9 commits
-
-
Min Si authored
On FreeBSD, the test threads/pt2pt/multisend4 sometimes reports a segfault when calling free. The error only happens when the buffer size is equal to 4M bytes and every thread performs malloc/free multiple times. The bug can be reproduced with a simple memcpy and no MPI communication, so it is considered a bug in the thread-safe memory allocation on FreeBSD rather than an MPI bug. As a workaround, this patch moves the malloc/free outside the loop to avoid frequent malloc/free calls. Signed-off-by:
Huiwei Lu <huiweilu@mcs.anl.gov>
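The workaround described above amounts to hoisting the allocation out of the loop. The following is a minimal single-threaded sketch of that shape, not the test's actual code; `run_iterations` and `BUF_SIZE` are names introduced here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BUF_SIZE (4 * 1024 * 1024)  /* 4M bytes, the size named in the commit */

/* Run `iters` copy iterations with one allocation hoisted out of the loop.
 * Returns 0 on success, -1 if allocation fails. */
static int run_iterations(int iters)
{
    assert(iters >= 0);

    char *src = malloc(BUF_SIZE);
    char *dst = malloc(BUF_SIZE);
    if (src == NULL || dst == NULL) {
        free(src);
        free(dst);
        return -1;
    }
    memset(src, 0x5a, BUF_SIZE);

    for (int i = 0; i < iters; i++) {
        /* Before the workaround, a malloc/free pair sat here, once per
         * iteration. */
        memcpy(dst, src, BUF_SIZE);
    }

    free(src);
    free(dst);
    return 0;
}
```

Each thread in the real test would call something like `run_iterations` once, so the allocator is hit twice per thread instead of twice per iteration.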
-
Xin Zhao authored
Signed-off-by:
Min Si <msi@il.is.s.u-tokyo.ac.jp> Signed-off-by:
Antonio J. Pena <apenya@mcs.anl.gov>
-
Xin Zhao authored
Here we should check whether the packet type is FOP_IMMED; if so, we initialize the response packet to be FOP_RESP_IMMED. The original code wrongly checked the packet flag instead of the packet type. Signed-off-by:
Min Si <msi@il.is.s.u-tokyo.ac.jp> Signed-off-by:
Antonio J. Pena <apenya@mcs.anl.gov>
-
Xin Zhao authored
After reducing the IMMED data size from 16 bytes to 8 bytes, FOP data no longer always fits in the packet header, hence the assert no longer makes sense. Signed-off-by:
Min Si <msi@il.is.s.u-tokyo.ac.jp> Signed-off-by:
Antonio J. Pena <apenya@mcs.anl.gov>
-
Xin Zhao authored
Originally the size of IMMED data in RMA packets was 16 bytes, which made the size of the CH3 packet 56 bytes. Here we reduce the size of IMMED data in RMA packets to 8 bytes, so that the size of the CH3 packet is reduced to 48 bytes, the same as in mpich-3.1.4 (the old RMA infrastructure). Signed-off-by:
Min Si <msi@il.is.s.u-tokyo.ac.jp> Signed-off-by:
Antonio J. Pena <apenya@mcs.anl.gov>
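The size arithmetic above (56 bytes with a 16-byte IMMED payload, 48 bytes with an 8-byte one) can be checked at compile time. The layouts below are hypothetical: they assume 40 bytes of other header fields, which is what the two quoted sizes imply, but the real CH3 packet fields are not shown in this log.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layouts: 40 bytes of header fields plus the IMMED payload. */
typedef struct {
    uint64_t header[5];     /* stand-in for type, address, count, handles */
    char immed_data[16];
} pkt_immed16_t;

typedef struct {
    uint64_t header[5];
    char immed_data[8];
} pkt_immed8_t;

/* C11 compile-time checks of the sizes quoted in the commit message. */
static_assert(sizeof(pkt_immed16_t) == 56, "16-byte IMMED payload: 56 bytes");
static_assert(sizeof(pkt_immed8_t) == 48, "8-byte IMMED payload: 48 bytes");
```

A check of this kind placed next to the packet definition would make a future size regression fail at build time rather than show up as a performance change.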
-
Xin Zhao authored
'stream_offset' specifies the starting position (on the target window) of the current streaming unit in ACC-like operations. It was originally placed in the RMA packet struct, which potentially increases the CH3 packet size. In this patch, we move 'stream_offset' out of the RMA packet as follows: 1. when the target data is a basic datatype, we use 'stream_offset' and the starting address of the entire operation to calculate the starting address of the current streaming unit, and rewrite 'addr' in the RMA packet with that value; 2. when the target data is a derived datatype, we cannot do the same because the target needs to know both the starting address of the entire operation and the starting address of the current streaming unit. Therefore, we send 'stream_offset' separately to the target side. Signed-off-by:
Min Si <msi@il.is.s.u-tokyo.ac.jp> Signed-off-by:
Antonio J. Pena <apenya@mcs.anl.gov>
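For the basic-datatype case (item 1 above), the address rewrite is plain pointer arithmetic. The sketch below uses a hypothetical, trimmed-down packet struct; `rma_pkt_t` and `set_stream_unit_addr` are illustrative names, and only the `addr` field and `stream_offset` come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down RMA packet: only the target address is shown. */
typedef struct {
    char *addr;
} rma_pkt_t;

/* Basic-datatype case: fold the streaming offset into the address carried
 * by the packet, so no separate stream_offset field is needed. */
static void set_stream_unit_addr(rma_pkt_t *pkt, char *op_base_addr,
                                 size_t stream_offset)
{
    assert(pkt != NULL);
    pkt->addr = op_base_addr + stream_offset;
}
```

The derived-datatype case (item 2) cannot use this shortcut, since the target needs both the base address and the offset, which is why the offset is sent separately there.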
-
Antonio J. Pena authored
An assert guarding against a non-null request was happening too late in pkt_COOKIE_handler in mpid_nem_lmt.c. This patch moves it to an earlier location so that the request is checked before it is first used. Reported by Dmitry Polyakov.
-
Charles J Archer authored
Update OFI netmod to match portals4 netmod anysource_matched semantics.
-
Charles J Archer authored
-
- 17 Apr, 2015 9 commits
-
-
Halim Amer authored
This reverts commit 17e31e59.
-
Halim Amer authored
-
Kenneth Raffenetti authored
This fix, along with a pending patch to the Portals4 reference implementation, should make anysource_matched a more reliable operation for multithreaded apps. We were seeing a race condition where an ME would unlink successfully, but an event matching it would still arrive in the queue. CH3 can now reliably search the netmod queue for matched MPI_ANY_SOURCE requests. We no longer assert that an MPI_ANY_SOURCE request was removed from the CH3 queue because FDP (find and dequeue posted) operations will remove the request from the queue if the netmod is known to have already matched it, even if it has not yet completed. Fixes #2199 Signed-off-by:
Antonio J. Pena <apenya@mcs.anl.gov>
-
Junchao Zhang authored
We now auto-generate mpi_c_interface_types.f90. Fixes #2196 Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
Junchao Zhang authored
We now auto-generate mpi_f08_compile_constants.f90. Fixes #2196 Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
Junchao Zhang authored
The old one defines MPI_MESSAGE_NULL as MPI_REQUEST_NULL, which misleads the F08 buildiface since the types of MPI_REQUEST and MPI_MESSAGE are different. Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
Junchao Zhang authored
Module files are now only produced under the use_mpi_f08 subdirectory. Also, the build now auto-detects the case of the names and the extensions of module files, which are compiler-dependent. Fixes #2163 Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
Junchao Zhang authored
Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
Junchao Zhang authored
A *.F90 extension implies that the file needs preprocessing, e.g., by a C preprocessor, which the current f08 binding does not require and which leads to unnecessary compilation rules with automake. Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
- 16 Apr, 2015 2 commits
-
-
Pavan Balaji authored
Ticket #2243 was resolved as a duplicate of ticket #2183. Signed-off-by:
Huiwei Lu <huiweilu@mcs.anl.gov>
-
Pavan Balaji authored
These tests seem to use a lot of memory per process, causing us to hit swap space when running with too many processes. Reducing the count to two processes allows this test to run on more machines. Signed-off-by:
Huiwei Lu <huiweilu@mcs.anl.gov>
-
- 15 Apr, 2015 3 commits
-
-
Pavan Balaji authored
We increased the number of cases the bcast test was running in [e01a20b6]. This is causing it to time out on some platforms, where the test now seems to take close to 3 minutes. This increased timeout should be sufficient on those platforms. No reviewer.
-
Charles J Archer authored
* Rename MPIR_CVAR_DUMP_PROVIDERS to MPIR_CVAR_OFI_DUMP_PROVIDERS
* Add MPIR_CVAR_OFI_USE_PROVIDER, which takes a string naming the desired provider
-
Sameh Sharkawi authored
This commit includes multiple fixes:
- Fixes for MPI_IN_PLACE checking. cudaGetPointerAttributes returns true on MPI_IN_PLACE, which causes issues. We now check for MPI_IN_PLACE before passing the pointer to CUDA.
- Enabling PAMID geometries (in order to get to PAMID collectives) when MP_CUDA_AWARE=yes. This allows for intercepting CUDA buffers.
- Disabling FCA when MP_CUDA_AWARE=yes if the user enables FCA.
- Copying the user recv buffer into a temp recv host buffer before the collective starts, especially in MPI_IN_PLACE cases.
(ibm) D203255 Signed-off-by:
Tsai-Yang (Alan) Jea <tjea@us.ibm.com>
-
- 14 Apr, 2015 2 commits
-
-
Min Si authored
The linker on Darwin does not allow common symbols, thus libtool adds the -fno-common option by default for shared libraries. However, common symbols defined in different shared libraries and object files still cannot be treated as the same symbol. For example: with gfortran, the same common block in the shared libraries and in the object files will have separate memory locations; with ifort, the same common block in different shared libraries will get the same memory location but still get a different location in the object file. The -Wl,-commons,use_dylibs option asks the linker to check dylibs for definitions and use them to replace tentative definitions (commons) from object files, thus it solves the issue of the common symbol mismatch between the object file and the dylibs (i.e., by setting the address of a common symbol to its location in the first dylib that is linked with the object file and contains this symbol). It needs to be added only in the linking stage for the final executable file. The -flat-namespace option allows the linker to unify the same common symbols in different dylibs. It needs to be added in the linking stage for both the shared library and the final executable file. (See man ld for their definitions.) Although gfortran works fine with only -flat-namespace, and ifort works with only -Wl,-commons,use_dylibs, we should add both options here as a generic solution to be safe. Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
Charles J Archer authored
-