1. 04 Mar, 2015 16 commits
    • Code-refactoring: make perform_get_acc_in_lock_queue cleaner. · a3af53c3
      Xin Zhao authored and Pavan Balaji committed
      
      This patch does not change any functionality; it only makes the
      code structure cleaner.

      The original structure of perform_get_acc_in_lock_queue was hard
      to follow because the code handling the IMMED packet type and the
      code handling the normal packet type were mixed together.
      This patch separates the two parts so that the function is easier
      to read.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
    • Change name from data_size to buf_size. · 45cdb282
      Xin Zhao authored and Pavan Balaji committed
      
      When a lock cannot be granted immediately, we queue the
      lock request and the op data in a lock entry
      queue. The lock entry struct used 'data_size'
      to record the size of the buffer that stores the
      data. Since the buffer size is not necessarily type_size*count
      but may be type_extent*count, we rename
      the field from 'data_size' to 'buf_size'.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
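The distinction between data size and buffer size can be illustrated without MPI. The following is a minimal sketch, assuming a C struct `double_int` as a hypothetical stand-in for a pair type such as MPI_DOUBLE_INT; the helper names `data_bytes` and `span_bytes` are illustrative and do not appear in MPICH.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a pair type such as MPI_DOUBLE_INT. */
struct double_int {
    double value;
    int    index;
};

/* Bytes of real data in `count` elements (analogous to type_size * count). */
static size_t data_bytes(size_t count)
{
    const size_t type_size = sizeof(double) + sizeof(int);
    assert(count <= SIZE_MAX / type_size);    /* guard against overflow */
    return count * type_size;
}

/* Bytes spanned by `count` elements in memory, padding included
 * (analogous to type_extent * count). */
static size_t span_bytes(size_t count)
{
    const size_t type_extent = sizeof(struct double_int);
    assert(count <= SIZE_MAX / type_extent);  /* guard against overflow */
    return count * type_extent;
}
```

On platforms where the struct carries trailing padding, `span_bytes(n)` is larger than `data_bytes(n)`, which is why a field named after the data size would be misleading for a buffer sized by extent.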
    • Bug-fix: make RMA work correctly with pair basic type. · ce8bc310
      Xin Zhao authored and Pavan Balaji committed
      
      The original RMA implementation does not handle pair basic
      types (e.g. MPI_FLOAT_INT, MPI_DOUBLE_INT). It only
      works correctly with builtin datatypes (e.g. MPI_INT, MPI_FLOAT).
      This patch makes RMA work correctly with pair basic types.

      There are two bugs: (1) when performing the ACC computation, the original
      implementation uses 'eltype' in the datatype structure, which is set
      only when all basic elements in the datatype have the same builtin
      datatype. When the basic elements have different builtin datatypes, as
      in pair datatypes, 'eltype' is set to MPI_DATATYPE_NULL, so
      the ACC computation cannot work with pair types; (2) for all
      basic-type data, the original implementation assumes that
      the data is contiguous and issues it unpacked,
      with a length equal to the data size (count*type_size). This is incorrect for
      pair datatypes, because most pair datatypes are non-contiguous
      (type_extent != type_size).

      The previous patch already made 'eltype' store the basic
      type instead of the builtin type. This patch fixes the
      bugs by (1) modifying the ACC computation to treat 'eltype' as a basic
      type; (2) using the noncontig API for non-contiguous basic-type data,
      so that it is issued in a packed manner.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
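The two halves of this fix (packing non-contiguous pair data and accumulating on the pair as a whole) can be sketched in plain C. This is an illustrative sketch, not the MPICH code: `struct double_int`, `pack_pairs`, `unpack_pairs` and `acc_maxloc` are hypothetical names, and the accumulate follows the MPI_MAXLOC rule (larger value wins; on a tie, the smaller index wins).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a pair type such as MPI_DOUBLE_INT. */
struct double_int {
    double value;
    int    index;
};

/* Data bytes per element once padding is removed. */
enum { PACKED_ELEM_BYTES = sizeof(double) + sizeof(int) };

/* Copy only the data bytes of each element into dst; returns bytes written. */
static size_t pack_pairs(const struct double_int *src, size_t count,
                         unsigned char *dst)
{
    size_t off = 0;
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < count; i++) {
        memcpy(dst + off, &src[i].value, sizeof(double));
        off += sizeof(double);
        memcpy(dst + off, &src[i].index, sizeof(int));
        off += sizeof(int);
    }
    return off;
}

/* Inverse of pack_pairs: rebuild the in-memory layout from packed bytes. */
static void unpack_pairs(const unsigned char *src, size_t count,
                         struct double_int *dst)
{
    size_t off = 0;
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < count; i++) {
        memcpy(&dst[i].value, src + off, sizeof(double));
        off += sizeof(double);
        memcpy(&dst[i].index, src + off, sizeof(int));
        off += sizeof(int);
    }
}

/* MAXLOC-style accumulate: the element type is the pair, not double or int. */
static void acc_maxloc(struct double_int *target,
                       const struct double_int *incoming, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (incoming[i].value > target[i].value) {
            target[i] = incoming[i];
        } else if (incoming[i].value == target[i].value &&
                   incoming[i].index < target[i].index) {
            target[i].index = incoming[i].index;
        }
    }
}
```

Sending `count * sizeof(struct double_int)` raw bytes would also transmit padding, and sending `count * PACKED_ELEM_BYTES` raw bytes from the unpacked array would truncate the data; packing first avoids both problems.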
    • Make 'eltype' in datatype struct store basic type. · 67b69b2a
      Xin Zhao authored and Pavan Balaji committed
      
      'eltype' in the datatype struct was originally used to store the
      builtin datatype. However, this is not correct for
      RMA ACC-like operations, since they need
      to work with basic types.

      This patch makes 'eltype' store the basic type.
      Note that (1) whenever we need the builtin type,
      we should call the macro MPID_Datatype_get_basic_type instead
      of accessing 'eltype' directly; (2) 'element_size' and
      'n_elements' still refer to the builtin type, whereas 'eltype'
      refers to the basic type.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
    • Modify macro PAIRTYPE_SIZE_EXTENT to accept correct arguments. · 49dd90f4
      Xin Zhao authored and Pavan Balaji committed
      
      The original implementation of PAIRTYPE_SIZE_EXTENT is not
      correct because it modifies variables internally
      without having the caller pass them in. This patch adds those
      variables to the argument list.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
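The general idea of naming a macro's outputs in its argument list can be sketched as follows. `PAIR_SIZE_EXTENT` and `float_int_layout` are hypothetical names written for illustration; this is not the MPICH definition of PAIRTYPE_SIZE_EXTENT, whose real argument list is not shown in the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro: every variable it writes is named by the caller,
 * so nothing in the enclosing scope is modified implicitly. */
#define PAIR_SIZE_EXTENT(ut1_, ut2_, type_size_, type_extent_)  \
    do {                                                        \
        struct pair_ { ut1_ a; ut2_ b; };                       \
        (type_size_)   = sizeof(ut1_) + sizeof(ut2_);           \
        (type_extent_) = sizeof(struct pair_);                  \
    } while (0)

/* Example caller: reports the layout of a (float, int) pair. */
static void float_int_layout(size_t *size, size_t *extent)
{
    size_t s, e;
    assert(size != NULL && extent != NULL);
    PAIR_SIZE_EXTENT(float, int, s, e);
    *size = s;
    *extent = e;
}
```

Because the outputs appear at the call site, a reader can see which variables change without opening the macro definition, and the macro no longer depends on the caller happening to declare variables with particular names.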
    • Xin Zhao
      7899a602
    • Xin Zhao
    • Simplify code: deleting derived DT code for op piggybacked with LOCK. · 2317b31d
      Xin Zhao authored
      
      We piggyback the LOCK flag with operations that do not use
      derived datatypes. Therefore, we delete the unnecessary
      code that deals with derived datatypes in the piggyback LOCK code.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
    • Simplify code: not using flag MPIDI_CH3_PKT_FLAG_RMA_IMMED_RESP for GACC/FOP. · 344bf958
      Xin Zhao authored
      
      The flag MPIDI_CH3_PKT_FLAG_RMA_IMMED_RESP tells the target
      whether the response packet of the current GET, GACC or FOP should use
      the IMMED packet type. We use the IMMED packet type only when the
      origin/target/result datatypes are all basic types.
      Since the target does not know the origin/result datatypes, the origin
      process needs to set a flag to inform the target.

      However, this is redundant for GACC and FOP packets.
      When the GACC/FOP request itself uses the IMMED packet type, the
      origin/target/result datatypes must all be basic types,
      so the response packet must use the IMMED packet type
      as well, and MPIDI_CH3_PKT_FLAG_RMA_IMMED_RESP and the
      related code are not needed. In short, the
      flag MPIDI_CH3_PKT_FLAG_RMA_IMMED_RESP is useful only for the GET operation.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
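The target-side decision described in this commit message can be sketched as below. All names here (`enum op_kind`, `struct rma_request`, `FLAG_IMMED_RESP`, `target_uses_immed_resp`) are hypothetical stand-ins; the real packet structures and any additional conditions (for example, limits on data size) are not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operation kinds and flag bit, for illustration only. */
enum op_kind { OP_GET, OP_GACC, OP_FOP };
#define FLAG_IMMED_RESP 0x1u

/* Hypothetical view of an incoming request as seen by the target. */
struct rma_request {
    enum op_kind op;
    bool         request_is_immed;  /* request packet used the IMMED type */
    unsigned     flags;             /* flags set by the origin */
};

/* Should the target answer with an IMMED response packet? */
static bool target_uses_immed_resp(const struct rma_request *req)
{
    assert(req != NULL);
    if (req->op == OP_GET) {
        /* GET carries no data, so the origin must state its wish explicitly. */
        return (req->flags & FLAG_IMMED_RESP) != 0;
    }
    /* GACC/FOP: an IMMED request already implies all datatypes are basic,
     * so the response type follows from the request type alone. */
    return req->request_is_immed;
}
```

With this shape, the flag is consulted on the GET path only, which mirrors the conclusion of the commit message.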
    • Use function hook instead of function pointer for win_free. · 42b5fcf1
      Xin Zhao authored
      
      The original implementation of win_free is not correct. The
      problem is as follows:

      It uses a function pointer that is initially set to the CH3
      implementation and can be overridden by the channel layer if
      the channel provides a specific implementation.  The CH3
      win_free implementation first checks that all RMA
      communication is finished and the epoch states are reset, then
      performs a global barrier, then frees the window resources
      allocated in CH3, and finally returns. The Nemesis
      win_free implementation frees the window resources
      allocated in Nemesis first and calls the CH3 win_free last.
      This is wrong because the window resources are freed before
      we check whether the RMA communication has completed.

      To fix this issue, we add a function hook for the channel layer
      to free its own resources; the hook is called from
      the CH3 win_free.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
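The ordering problem and its fix can be sketched with a small trace. Everything below is hypothetical (`struct window`, `struct win_hooks`, `ch3_win_free`, `nemesis_like_win_free`), and the exact point at which the real CH3 code invokes the hook is an assumption; the sketch only shows that channel resources are released after the completion check and barrier.

```c
#include <assert.h>
#include <stddef.h>

enum step { STEP_CHECK = 1, STEP_BARRIER, STEP_CHANNEL_FREE, STEP_CH3_FREE };

/* Hypothetical window object that records the order of teardown steps. */
struct window {
    int       pending_ops;   /* outstanding RMA operations */
    enum step trace[4];
    int       ntrace;
};

typedef int (*win_hook_fn)(struct window *win);

/* Hypothetical hook table filled in by the channel layer. */
struct win_hooks { win_hook_fn win_free; };
static struct win_hooks channel_hooks = { NULL };

static void record(struct window *win, enum step s)
{
    assert(win->ntrace < 4);
    win->trace[win->ntrace++] = s;
}

/* Channel-side hook: frees only the channel's own resources. */
static int nemesis_like_win_free(struct window *win)
{
    record(win, STEP_CHANNEL_FREE);
    return 0;
}

/* CH3-side win_free: the single entry point that fixes the ordering. */
static int ch3_win_free(struct window *win)
{
    int rc;
    if (win->pending_ops != 0)
        return -1;                    /* RMA communication not complete */
    record(win, STEP_CHECK);
    record(win, STEP_BARRIER);        /* stand-in for the global barrier */
    if (channel_hooks.win_free != NULL) {
        rc = channel_hooks.win_free(win);
        if (rc != 0)
            return rc;
    }
    record(win, STEP_CH3_FREE);
    return 0;
}
```

With a function pointer that replaces the whole function, the channel decides the order; with a hook, the CH3 function keeps control and the channel contributes one step.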
    • Allow the channel layer to implement Win_gather_info function. · 9dbcae0c
      Xin Zhao authored
      
      In this patch, we first add a Win_gather_info function pointer
      in CH3 to allow different channel layers to implement their own
      version of the Win_gather_info function. The function pointer is
      initially set to the default implementation in the CH3 layer. If the
      channel layer provides an implementation of Win_gather_info, it
      overrides the function pointer.

      Second, we provide an implementation of Win_gather_info in the
      Nemesis layer. This implementation allocates basic_info_table[]
      in the SHM region, so that processes on the same node can share the
      same basic_info_table[].
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
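The default-plus-override pattern for a function pointer can be sketched as follows. The names (`win_gather_info`, `ch3_win_gather_info`, `nemesis_win_gather_info`, `channel_init`) are hypothetical and the functions return a tag instead of doing any real work, so that the selected implementation is observable.

```c
#include <assert.h>
#include <stddef.h>

enum impl { IMPL_CH3 = 0, IMPL_NEMESIS = 1 };

typedef enum impl (*win_gather_info_fn)(void);

/* Default implementation provided by the (hypothetical) CH3 layer. */
static enum impl ch3_win_gather_info(void) { return IMPL_CH3; }

/* Channel implementation, e.g. one that places the table in shared memory. */
static enum impl nemesis_win_gather_info(void) { return IMPL_NEMESIS; }

/* The pointer starts at the default and may be overridden during init. */
static win_gather_info_fn win_gather_info = ch3_win_gather_info;

/* Called once by the channel; overrides the pointer only if it has
 * its own implementation. */
static void channel_init(int has_own_gather_info)
{
    if (has_own_gather_info)
        win_gather_info = nemesis_win_gather_info;
}

/* Callers always go through the pointer. */
static enum impl call_win_gather_info(void)
{
    assert(win_gather_info != NULL);
    return win_gather_info();
}
```

Unlike the win_free case, replacing the whole function is appropriate here because there is no shared ordering that the default layer must enforce.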
    • Add a function hook to initialize window attributes in channel layer. · 7c1a8fb1
      Xin Zhao authored
      
      Some window attributes in the channel layer
      need to be initialized during window creation. In this
      patch, we first add a win_hooks table that contains pointers
      to the channel's implementations of the function hooks. Second,
      we add a function hook 'win_init' to allow the channel layer to
      initialize its own attributes. The hook is called from the
      CH3 win_init function.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
    • Reduce size of shm_base_addrs[] from comm_size to node_size. · eddd8b91
      Xin Zhao authored
      
      For a given process, shm_base_addrs[] stores the base
      addresses (in the address space of this process) of the SHM windows
      of other processes. Its original size is comm_size. However,
      the maximum number of SHM windows that this process can access
      is node_size rather than comm_size, so memory is wasted
      because most slots in the array are NULL. This patch
      reduces the size of shm_base_addrs[] from comm_size to node_size.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
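Shrinking the array means lookups must go through a node-local index rather than a communicator rank. The sketch below assumes a hypothetical translation array `comm_to_node` (node-local rank, or -1 for a process on another node); how MPICH actually performs this translation is not stated in the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table sized by node_size instead of comm_size. */
struct shm_table {
    int    node_size;
    void **base_addrs;   /* indexed by node-local rank */
};

/* Return the locally mapped SHM base of `comm_rank`, or NULL if that
 * process is on another node and therefore has no mapping here. */
static void *shm_base_of(const struct shm_table *t, const int *comm_to_node,
                         int comm_rank)
{
    int local;
    assert(t != NULL && comm_to_node != NULL && comm_rank >= 0);
    local = comm_to_node[comm_rank];
    if (local < 0)
        return NULL;
    assert(local < t->node_size);
    return t->base_addrs[local];
}
```

For a communicator of 6 processes with 2 on the local node, the table holds 2 pointers instead of 6, and the 4 remote ranks resolve to NULL through the translation step rather than through empty slots.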
    • Store window basic attributes into a struct on window. · 9404e953
      Xin Zhao authored
      
      In this patch, we gather the basic window attributes of other
      processes (base_addr, size, disp_unit, win_handle) into a
      table of structs called "basic_info_table". By doing this, we can use
      one contiguous memory region to store them.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
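A table of per-process records might look like the sketch below. The field names come from the commit message, but the field types and the helper `target_addr` are assumptions made for illustration (MPICH uses its own MPI types for sizes and handles).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record layout; one entry per process in the window's group. */
struct win_basic_info {
    void  *base_addr;
    size_t size;        /* window size in bytes */
    int    disp_unit;   /* scaling factor for displacements */
    int    win_handle;  /* stand-in for the remote window handle */
};

/* Resolve a target displacement to an address using one table entry. */
static void *target_addr(const struct win_basic_info *table, int rank,
                         size_t disp)
{
    const struct win_basic_info *info;
    size_t offset;
    assert(table != NULL && rank >= 0);
    info = &table[rank];
    offset = disp * (size_t)info->disp_unit;
    assert(offset < info->size);      /* displacement must stay in the window */
    return (char *)info->base_addr + offset;
}
```

Keeping the four attributes of a rank in one record means a single lookup touches one contiguous entry instead of four separate arrays.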
    • Change name of MPIDI_CH3U_Win_create_gather to MPIDI_CH3U_Win_gather_info. · 131e06ef
      Xin Zhao authored
      
      Function MPIDI_CH3U_Win_create_gather exchanges window
      information among processes. It does not create a new window.
      Here we change the function name to a more suitable one.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
    • Add CH3 APIs and macros to allow channel to implement Alloc_mem/Free_mem. · 03d4c77b
      Xin Zhao authored
      
      Originally, MPIDI_Alloc_mem(size, info) and MPIDI_Free_mem(base_ptr)
      in the CH3 layer are implemented by calling MPIU_Malloc(size) and
      MPIU_Free(base_ptr) internally. This prevents a channel from
      providing a hardware-specific implementation of Alloc_mem and Free_mem,
      which is necessary when registering memory for RDMA operations.

      This patch defines new APIs, MPIDI_CH3I_Alloc_mem(size, info)
      and MPIDI_CH3I_Free_mem(base_ptr), to allow channels to implement
      their own memory allocators. If the channel does not have its own
      implementation, MPICH will fall back to the default implementation
      in the CH3 layer, which uses MPIU_Malloc and MPIU_Free.

      Thanks to Steffen Christgau <christgau@cs.uni-potsdam.de> for
      this contribution.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
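One common way to express "use the channel's allocator if it has one, otherwise fall back" is a compile-time macro switch. The sketch below assumes a hypothetical feature macro `CHANNEL_HAS_ALLOC_MEM` and hypothetical functions `channel_alloc_mem` / `channel_free_mem`; it uses plain `malloc`/`free` as the fallback and omits the `info` argument, so it is not the MPICH implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* A channel that registers memory (e.g. for RDMA) would define
 * CHANNEL_HAS_ALLOC_MEM and provide the two functions declared here. */
#ifdef CHANNEL_HAS_ALLOC_MEM
void *channel_alloc_mem(size_t size);
void  channel_free_mem(void *ptr);
#define CH3I_ALLOC_MEM(size_) channel_alloc_mem(size_)
#define CH3I_FREE_MEM(ptr_)   channel_free_mem(ptr_)
#else
/* Fallback: the default allocator of the device layer. */
#define CH3I_ALLOC_MEM(size_) malloc(size_)
#define CH3I_FREE_MEM(ptr_)   free(ptr_)
#endif

/* Device-level entry points always go through the macros, so they do not
 * need to know which allocator was selected. */
static void *device_alloc_mem(size_t size)
{
    return CH3I_ALLOC_MEM(size);
}

static void device_free_mem(void *ptr)
{
    CH3I_FREE_MEM(ptr);
}
```

The selection costs nothing at run time, and a channel without special memory requirements needs no extra code.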
  2. 03 Mar, 2015 3 commits
  3. 02 Mar, 2015 7 commits
  4. 01 Mar, 2015 5 commits
  5. 27 Feb, 2015 7 commits
  6. 26 Feb, 2015 2 commits