1. 07 Nov, 2019 1 commit
  2. 22 Oct, 2019 1 commit
    • [refactor] simplify CUDA detection and activation · 76577a9e
      Swann Perarnau authored
      The previous CUDA activation code was trying to differentiate between
      building CUDA support and actually running the tests, but this isn't
      really necessary: any platform with CUDA support should be able to run
      the tests somewhere.
      
      The configure.ac is modified to also check for the necessary CUDA
      runtime and headers all at once.
      
      We also introduce support for CUDA_HOME, which is an environment
      variable available on some target systems.
  3. 02 Oct, 2019 1 commit
  4. 17 Sep, 2019 1 commit
    • [feature/refactor] second version of tilings · ed30f0ab
      Swann Perarnau authored
      Refactor the tilings to be generic over N dimensions and to interface
      with the newly added layouts.
      
      The main idea for this version of tilings is to provide an index into a
      partitioning of a source layout into sub-layouts of smaller sizes.
  5. 30 Aug, 2019 1 commit
  6. 23 Aug, 2019 1 commit
    • ### Cuda implementation of areas. · df3b0f85
      Nicolas Denoyelle authored
      The new area allows allocating data on CUDA devices.
      Allocations can optionally map host memory onto
      device memory. See the CUDA area documentation.
      
      Includes a libtool helper to link CUDA device object files
      with the rest of the library.
      
      An additional error code has been added to the AML errors to handle busy CUDA devices.
      Also, all CI stages have been set not to run on branches whose names start with wip.
  7. 06 Aug, 2019 1 commit
  8. 02 Jul, 2019 2 commits
    • [fix] Clean-up out-of-tree building · 577bc63d
      ndenoyelle authored
      Adapted to the new master from a patch sent by @cfoyer
      
          When building out of tree, make sure that targets refer to
          relative paths.
      
          This commit also cleans up the usage of the flags and defines
          a per-target definition of flags (this can be changed to
          AM_CPPFLAGS if the global definition is good enough).
      Signed-off-by: Clement Foyer <cfoyer@cray.com>
      
      Also adds Clement to the authors list.
    • [refactor/fix] create single src makefile.am · 8721a0d7
      Swann Perarnau authored
      Recursive makefiles do not propagate automake flags (AM_CFLAGS),
      making the proper configuration of the whole build chain more complex
      than it needs to be. This patch goes back to a single makefile.am in
      src, simplifying the build quite a bit.
  9. 27 Mar, 2019 1 commit
  10. 26 Mar, 2019 1 commit
  11. 13 Mar, 2019 1 commit
    • [refactor] reorganize repository · 2ad4488c
      Nicolas Denoyelle authored
      - create one directory per building block in src and include
      - keep one directory for tests,
        otherwise automake makes them "test suites"
      - move to AC_OPENMP, which is from autoconf 2.62 (2008)
  12. 08 Mar, 2019 1 commit
    • [fix] Embed custom jemalloc into libaml · ac85bab6
      Swann Perarnau authored
      Force libtool to statically link the PIC version of our jemalloc import into
      libaml, making libaml standalone. This requires us to test some
      additional libraries in our own configure (pthread, and dlopen).
      
      This also solves the long-standing issue of `make check` only working after
      `make install`, while removing our custom jemalloc from the installed
      libraries.
      
      Fixes #26.
  13. 24 Aug, 2018 1 commit
  14. 06 Aug, 2018 1 commit
    • [feature] add 2d tiling of contiguous tiles · 508c4695
      Swann Perarnau authored
      Add a tiling representing a 2d array of contiguous tiles. Also add a
      ndims function to retrieve the dimensions in tiles of the tiling.
      
      It also became quite obvious that the iterators are useless right now.
      We should think about changing that.
  15. 25 Jul, 2018 1 commit
    • [feature] add 2D tiling, additional methods. · a13ddad2
      Brian Suchy authored
      Implement a 2D tiling with contiguous tiles in memory, with tiles
      organized in row-major order inside the virtual address range.
      
      Also adds functions to query the size of a tile inside the tiling.
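      
      As a rough illustration of the row-major organization described above,
      the sketch below computes where a tile starts inside the address range.
      The struct and function names (`tiling2d_sketch`, `tiling2d_sketch_tilestart`,
      `tiling2d_sketch_tilesize`) are hypothetical and are not taken from the
      AML sources; it only assumes tiles of equal size stored back to back.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor, not AML's actual struct:
 * a grid of equally sized tiles stored back to back. */
struct tiling2d_sketch {
	char *base;       /* start of the virtual address range */
	size_t tile_size; /* size of one tile, in bytes */
	size_t nrows;     /* number of tile rows */
	size_t ncols;     /* number of tile columns */
};

/* Query the size of a tile inside the tiling, in bytes. */
static size_t tiling2d_sketch_tilesize(const struct tiling2d_sketch *t)
{
	return t->tile_size;
}

/* Row-major order: tile (row, col) has linear index row * ncols + col,
 * so its first byte sits that many tiles after the base address. */
static void *tiling2d_sketch_tilestart(const struct tiling2d_sketch *t,
				       size_t row, size_t col)
{
	assert(row < t->nrows && col < t->ncols);
	return t->base + (row * t->ncols + col) * t->tile_size;
}
```

      With 2 rows and 3 columns, tile (1, 2) is the sixth tile, so it starts
      5 tile sizes past the base address.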
  16. 28 Mar, 2018 1 commit
    • [feature] add a pthread based scratchpad · fa51aea5
      Swann Perarnau authored
      Add a scratchpad that creates one pthread per request, to call
      synchronous dma operations.
      
      The intent is to end up with a cross product of programming language
      support between dma and scratch:
      - scratch_par + dma_seq gives users parallel scratch requests
      - scratch_seq + dma_par gives users sequential access to parallel moves
      
      The two other options don't make as much sense though.
  17. 27 Mar, 2018 1 commit
    • [feature] add generic vector type to library · 72c8508d
      Swann Perarnau authored
      Add a generic vector type to the library, with some special features:
      - the elements are embedded in the vector, and not pointers
      - each element must include an int field that is used as a "key"
      - the element has a "null" value for its key, used to indicate that this
      element of the vector is null.
      - add/remove functions provide access to a new element/free it from the
      vector, but don't "destroy" it.
      - resize on add is exponential.
      
      This patch includes implementation and unit test.
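      
      The features listed above can be sketched as follows. This is a
      minimal illustration under stated assumptions, not the code from this
      patch: every name (`vec_sketch`, `vec_sketch_add`, and so on) is
      hypothetical, the key is located through a byte offset passed at init
      time, and "exponential" resizing is taken to mean doubling the capacity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; this is not AML's actual vector API. */
struct vec_sketch {
	char *data;    /* elements stored inline, not as pointers */
	size_t nelems; /* capacity, in elements */
	size_t elemsz; /* size of one element, in bytes */
	size_t keyoff; /* offset of the int key inside an element */
	int null_key;  /* key value marking a free (null) element */
};

/* Address of the key of element i. */
static int *vec_sketch_key(struct vec_sketch *v, size_t i)
{
	return (int *)(v->data + i * v->elemsz + v->keyoff);
}

/* Allocate the storage and mark every element as null. */
static int vec_sketch_init(struct vec_sketch *v, size_t nelems, size_t elemsz,
			   size_t keyoff, int null_key)
{
	size_t i;

	assert(nelems > 0 && keyoff + sizeof(int) <= elemsz);
	v->data = calloc(nelems, elemsz);
	if (v->data == NULL)
		return -1;
	v->nelems = nelems;
	v->elemsz = elemsz;
	v->keyoff = keyoff;
	v->null_key = null_key;
	for (i = 0; i < nelems; i++)
		*vec_sketch_key(v, i) = null_key;
	return 0;
}

/* Return a free element, doubling the capacity when none is left.
 * The caller claims the element by writing a non-null key into it. */
static void *vec_sketch_add(struct vec_sketch *v)
{
	size_t i, old = v->nelems;
	char *grown;

	for (i = 0; i < old; i++)
		if (*vec_sketch_key(v, i) == v->null_key)
			return v->data + i * v->elemsz;

	grown = realloc(v->data, 2 * old * v->elemsz);
	if (grown == NULL)
		return NULL;
	memset(grown + old * v->elemsz, 0, old * v->elemsz);
	v->data = grown;
	v->nelems = 2 * old;
	for (i = old; i < v->nelems; i++)
		*vec_sketch_key(v, i) = v->null_key;
	return v->data + old * v->elemsz;
}

/* Give an element back: only its key is reset, nothing is destroyed. */
static void vec_sketch_remove(struct vec_sketch *v, void *elem)
{
	*(int *)((char *)elem + v->keyoff) = v->null_key;
}
```

      Note that growing the vector may move the storage, so in this sketch
      pointers obtained before a resize should not be kept across an add.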
  18. 26 Mar, 2018 1 commit
    • [feature] add sequential, copy-based scratchpad · 73b57ae5
      Swann Perarnau authored
      This is the initial implementation and validation of a scratchpad: a
      logic unit that handles tracking data across a "main" area and a
      "scratch" area.
      
      The API and internals will probably change again soon, as there's no
      clear way to implement a move based scratchpad on this one.
      
      Note that this implementation doesn't really do any tracking yet;
      that's the next step.
  19. 23 Mar, 2018 1 commit
  20. 11 Mar, 2018 1 commit
    • [feature] implement simple, working dma engine · 15cd651b
      Swann Perarnau authored
      This patch adds the basics for a dma interface, including
      type-dependent requests structures, and an API based on explicit
      copy/move calls.
      
      The API is flexible enough to deal with sync/async calls. The internal
      design is inspired by aml_area, with the goal that create/init stay type
      specific, but the core interactions are generic.
  21. 08 Mar, 2018 1 commit
    • [feature] Add initial tiling and binding support · bcb6c923
      Swann Perarnau authored
      Implement 1d tiling and simple binding support. The idea is to allow
      an application to explain to AML how data should be organized, and to
      be able to reuse this info when dealing with memory movement.
      
      The current interfaces are not great, but they work.
  22. 01 Feb, 2018 2 commits
    • [fix] Make jemalloc link properly with aml · f8e57eeb
      Swann Perarnau authored
      Using both --with-jemalloc-prefix=jemk_ and --with-install-suffix=-aml
      ensures that the libjemalloc we build for our internal use has unique
      function names and library names.
      
      This patch fixes a problem with the libtool configuration, where we had
      to link a static jemalloc library into the shared aml library to make
      everything work. Instead, we can now use the proper LIBADD variables
      with the unique library.
      
      Note that with this patch the `make check` target requires a `make
      install` beforehand.
    • [refactor] new version of the API · a002945c
      Swann Perarnau authored
      This is the result of countless iterations on the internal design of
      the various building blocks we want to have for this library.
      
      At this point, I hope that this is stable enough. There are still some
      tweaks needed here and there, but the core is implemented AND tested.
      
      Some of the design decisions made:
      - all functions are public, but most are not meant to be used directly.
      - intended public functions take "generic" structs as arguments
      - intended actual implementations rely on more complex structures, with
      their own family of data and operators.
      - split all objects between data and operator structs.
      
      Example:
      - area.c and arena.c are generic dispatch functions to call the actual,
        specific implementations.
      - struct aml_area and struct aml_arena are the same.
      
      Currently implemented:
      - 2 area types: posix (malloc) and linux (numa).
      - 1 arena type: jemalloc
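      
      The data/operator split and generic dispatch described in the design
      decisions above could look roughly like the sketch below. All names
      (`area_sketch`, `area_sketch_ops`, `posix_sketch_*`) are hypothetical
      and chosen for illustration only; the actual AML structs and function
      signatures are not shown in this log and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names; a sketch of the data/operator split,
 * not AML's actual headers. */
struct area_sketch_ops {
	void *(*alloc)(void *data, size_t size);
	void (*release)(void *data, void *ptr);
};

/* The "generic" struct that intended public functions take. */
struct area_sketch {
	const struct area_sketch_ops *ops; /* operators */
	void *data;                        /* implementation-specific data */
};

/* Generic dispatch: forward to the specific implementation. */
static void *area_sketch_alloc(struct area_sketch *a, size_t size)
{
	assert(a != NULL && a->ops != NULL);
	return a->ops->alloc(a->data, size);
}

static void area_sketch_release(struct area_sketch *a, void *ptr)
{
	assert(a != NULL && a->ops != NULL);
	a->ops->release(a->data, ptr);
}

/* One specific implementation, backed by malloc, with its own data. */
struct posix_sketch_data {
	size_t nallocs; /* number of allocations served so far */
};

static void *posix_sketch_alloc(void *data, size_t size)
{
	struct posix_sketch_data *d = data;

	d->nallocs++;
	return malloc(size);
}

static void posix_sketch_release(void *data, void *ptr)
{
	(void)data; /* unused by this implementation */
	free(ptr);
}

static const struct area_sketch_ops posix_sketch_ops = {
	posix_sketch_alloc,
	posix_sketch_release,
};
```

      A caller would fill a `struct area_sketch` with `&posix_sketch_ops`
      and a `struct posix_sketch_data`, then go through the generic
      functions only.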
  23. 06 Oct, 2017 1 commit
    • [refactor] Implement new API, based on jemalloc · 66564c2f
      Swann Perarnau authored
      This is a redesign of the library, as a hierarchy of core objects
      implementing its various features. The idea is to create an API that is
      as flexible and customizable as possible, by exposing as much as
      possible of its internals, so that users can create custom versions
      easily.
      
      We also move away from memkind as a possible backend, opting instead to
      vendor the jemalloc interface and implement everything ourselves on top
      of that.
      
      We expect to start building the low-level pieces using hwloc as a
      backend soon, at least in terms of accessing available devices.
  24. 22 Aug, 2017 1 commit
    • [refactor] Rework code for better abstractions · a41b0412
      Swann Perarnau authored
      This is a rewrite of the existing code into a memory library exposing
      more of its internal abstractions. This refactoring is required to:
      - make progress faster by focusing on the core new features
      - abstract more of the underlying components and expose those
      abstractions
      - build upon existing libraries (memkind) for the internal stuff.
      
      Memkind is used as a crutch here, we do not intend to use it in the long
      term, as some of its internals are opposed to what we want (topology
      management in particular).
      
      Nevertheless, it currently provides a good allocator internally, and
      decent access to deep memory, for now.
      
      Over time, we figured out that the best way to build this API was to
      create several layers of APIs, each with more abstractions over the
      devices. At the same time, we want each layer to expose its internal
      mechanisms, so that a user can customize any of them.
      
      This is why we end up with areas and dma engines, and we will add in the
      future other stuff, like data decomposition and distribution methods, as
      well as direct support for "pipelining".
  25. 21 Feb, 2017 1 commit
  26. 30 Jun, 2016 1 commit
    • Add first working version: limit numa allocs · 80669c37
      Swann Perarnau authored
      This is the first working version of the library. It does very little:
      - only numa support
      - one allocation per node only
      - limited tests
      - limited set of functions
      
      Nevertheless, this gives a good idea of what the API should look like, and the
      kind of benchmarks we can write with it.