1. 23 Aug, 2019 1 commit
      ### Cuda implementation of areas. · df3b0f85
      Nicolas Denoyelle authored
      The new area allows allocating data on CUDA devices.
      Allocation optionally includes the ability to map
      host memory onto device memory. See the CUDA area
      documentation.

      Includes a libtool helper to link CUDA device object files
      with the rest of the library.

      An additional error code has been added to the aml errors for handling busy CUDA devices.
      Also, all CI stages have been set not to run on branches whose names start with wip.
  2. 07 Aug, 2019 1 commit
  3. 06 Aug, 2019 1 commit
  4. 02 Jul, 2019 1 commit
  5. 29 Mar, 2019 1 commit
  6. 27 Mar, 2019 1 commit
  7. 26 Mar, 2019 1 commit
  8. 22 Mar, 2019 1 commit
      [refactor] use autoconf + m4 for version mngmt · d8803390
      Swann Perarnau authored
      Use m4 to define autoconf-level version variables, following the naming
      scheme of semver.org
      
      To make use of these variables in the headers and sources, a
      generated-header is added in aml/utils/version.h
      
      Also add a simple test for that part of the lib.
  9. 14 Mar, 2019 1 commit
  10. 13 Mar, 2019 1 commit
      [refactor] reorganize repository · 2ad4488c
      Nicolas Denoyelle authored
      - create one directory per building block in src and include
      - keep one directory for tests,
        otherwise automake makes them "test suites"
      - move to AC_OPENMP, which is from autoconf 2.62 (2008)
  11. 08 Mar, 2019 1 commit
      [fix] Embed custom jemalloc into libaml · ac85bab6
      Swann Perarnau authored
      Force libtool to statically link the PIC version of our jemalloc import into
      libaml, making libaml standalone. This requires us to test for some
      additional libraries in our own configure (pthread and dlopen).
      
      This also solves the long-standing issue of `make check` only working after
      `make install`, while removing our custom jemalloc from the installed
      libraries.
      
      Fixes #26.
  12. 27 Aug, 2018 2 commits
      [fix] fixup unit tests · c759c9df
      Swann Perarnau authored
      Mbind is giving us trouble again, will need to spend time looking at it
      carefully.
      [feature/refactor] add tileid function · 55500ab0
      Swann Perarnau authored
      Instead of asking the user to provide the offsets into a tiling, add a
      function providing a tileid. This tileid corresponds to the in-memory
      order of tiles, making the tilestart functions a lot simpler.
      
      We still need to split the tileid for tilestart because scratchpads
      create requests based on tileids.
      
      Also add a unit test for tiling_2d, to make sure we're not doing
      anything crazy.
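
As a rough illustration of why a tileid that follows memory order simplifies tilestart, here is a minimal sketch in C. The struct, the function names, and the row-major ordering are assumptions made for this example; they are not taken from the library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 2D tiling descriptor; names are illustrative only. */
struct tiling_2d {
	size_t ncols;    /* number of tiles per row of tiles */
	size_t nrows;    /* number of rows of tiles */
	size_t tilesize; /* size of one tile, in bytes */
};

/* Map tile coordinates to an id that follows the in-memory order
 * (row-major order is an assumption of this sketch). */
size_t tiling_2d_tileid(const struct tiling_2d *t, size_t row, size_t col)
{
	assert(row < t->nrows && col < t->ncols);
	return row * t->ncols + col;
}

/* Since ids follow memory order, the start of a tile is one multiply. */
void *tiling_2d_tilestart(const struct tiling_2d *t, void *base, size_t tileid)
{
	assert(tileid < t->nrows * t->ncols);
	return (char *)base + tileid * t->tilesize;
}
```

With this layout the caller never computes byte offsets: it asks for an id from coordinates, then passes the id wherever a tile is needed.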
  13. 20 Jul, 2018 1 commit
      [refactor] move functional tests, proper OpenMP · 51167d12
      Swann Perarnau authored
      We are starting to work on benchmarks to evaluate the usefulness of this
      library. Instead of integrating them into the testing infrastructure, it
      makes more sense for them to have their own directory and a different
      way of handling them.
      
      This patch:
       - creates a benchmark directory for actual codes that we want to use as
         benchmarks of our library.
       - moves functional tests into it.
       - adds proper OpenMP detection for these codes
  14. 30 Mar, 2018 3 commits
      [test] add openmp version of mt stream_add · 956d9453
      Swann Perarnau authored
      This is a second type of use for the scratchpad: a single master
      thread is responsible for launching all data movements, but the tiles
      are worked on in parallel. We support this model by using a sequential
      scratch on top of a parallel dma.
      [refactor] add openmp version of stream_add_pth · 21d3724e
      Swann Perarnau authored
      Add an OpenMP version of the previous functional test. We also rename them,
      to mark the fact that those two tests are designed to use a *single thread*
      to run the kernel across an entire tile.
      [refactor] make use of functional tests again · f47dc685
      Swann Perarnau authored
      This patch reintroduces the first functional test, a stream add
      implementation using pthreads for parallelism. We make use of our
      scratch_par implementation to implement a pipelined version of the
      application, where each worker thread is using its own batch of tiles,
      and migrating data asynchronously.
  15. 28 Mar, 2018 1 commit
      [feature] add a pthread based scratchpad · fa51aea5
      Swann Perarnau authored
      Add a scratchpad that creates one pthread per request, to call
      synchronous dma operations.
      
      The intent is to end up with a cross product of programming language
      support between dma and scratch:
      - scratch_par + dma_seq gives users parallel scratch requests
      - scratch_seq + dma_par gives users sequential access to parallel moves
      
      The two other options don't make as much sense though.
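
The "one pthread per request, calling a synchronous dma operation" idea can be sketched as follows. This is only an illustration: the request struct and function names are hypothetical, and `memcpy` stands in for whatever synchronous copy the dma layer actually provides.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical request type; not the library's actual structure. */
struct scratch_request {
	pthread_t thread;
	void *dest;
	const void *src;
	size_t size;
};

/* Stand-in for a synchronous dma copy: returns once the data has moved. */
static void *request_worker(void *arg)
{
	struct scratch_request *req = arg;

	memcpy(req->dest, req->src, req->size);
	return NULL;
}

/* Start an asynchronous request by spawning one thread for it.
 * Returns 0 on success, or the pthread_create error code. */
int scratch_async_copy(struct scratch_request *req, void *dest,
		       const void *src, size_t size)
{
	assert(req != NULL);
	req->dest = dest;
	req->src = src;
	req->size = size;
	return pthread_create(&req->thread, NULL, request_worker, req);
}

/* Block until the request has completed. */
int scratch_wait(struct scratch_request *req)
{
	return pthread_join(req->thread, NULL);
}
```

The parallelism lives entirely in the scratch layer here, which is what lets the underlying copy stay sequential.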
  16. 27 Mar, 2018 1 commit
      [feature] add generic vector type to library · 72c8508d
      Swann Perarnau authored
      Add a generic vector type to the library, with some special features:
      - the elements are embedded in the vector, and not pointers
      - each element must include an int field that is used as a "key"
      - the element has a "null" value for its key, used to indicate that this
      element of the vector is null.
      - add/remove functions provide access to a new element/free it from the
      vector, but don't "destroy" it.
      - resize on add is exponential.
      
      This patch includes implementation and unit test.
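
A minimal sketch of a vector with the properties listed above (embedded elements, an int key with a reserved "null" value, add/remove that hand out or release slots without destroying them, doubling on growth). All names, the choice of -1 as the null key, and the requirement that the key be the first field are assumptions of this example, not the library's API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout: this is not the library's actual API. */
#define VEC_NULL_KEY (-1)

/* Example element type: in this sketch the int key must be the first field. */
struct item {
	int key;      /* VEC_NULL_KEY marks a free slot */
	double value;
};

struct vec {
	size_t nbelems; /* number of slots, used or free */
	size_t elemsz;  /* size of one embedded element */
	char *data;     /* elements stored inline, not as pointers */
};

static int *vec_key(struct vec *v, size_t i)
{
	return (int *)(v->data + i * v->elemsz);
}

int vec_init(struct vec *v, size_t nbelems, size_t elemsz)
{
	assert(nbelems > 0 && elemsz >= sizeof(int));
	v->data = calloc(nbelems, elemsz);
	if (v->data == NULL)
		return -1;
	v->nbelems = nbelems;
	v->elemsz = elemsz;
	for (size_t i = 0; i < nbelems; i++)
		*vec_key(v, i) = VEC_NULL_KEY;
	return 0;
}

/* Return a free slot, doubling the storage when none is left.
 * The caller is expected to set a non-null key on the returned element. */
void *vec_add(struct vec *v)
{
	for (size_t i = 0; i < v->nbelems; i++)
		if (*vec_key(v, i) == VEC_NULL_KEY)
			return v->data + i * v->elemsz;

	size_t old = v->nbelems;
	char *p = realloc(v->data, 2 * old * v->elemsz);
	if (p == NULL)
		return NULL;
	v->data = p;
	v->nbelems = 2 * old;
	for (size_t i = old; i < v->nbelems; i++)
		*vec_key(v, i) = VEC_NULL_KEY;
	return v->data + old * v->elemsz;
}

/* Mark the slot as free; the element's other fields are left untouched. */
void vec_remove(struct vec *v, void *elem)
{
	(void)v;
	*(int *)elem = VEC_NULL_KEY;
}

void vec_destroy(struct vec *v)
{
	free(v->data);
	v->data = NULL;
	v->nbelems = 0;
}
```

Note that growing the vector may move its storage, so pointers returned by earlier calls to `vec_add` should not be kept across a later add.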
  17. 26 Mar, 2018 1 commit
      [feature] add sequential, copy-based scratchpad · 73b57ae5
      Swann Perarnau authored
      This is the initial implementation and validation of a scratchpad: a
      logic unit that handles tracking data across a "main" area and a
      "scratch" area.
      
      The API and internals will probably change again soon, as there's no
      clear way to implement a move based scratchpad on this one.
      
      Note that this implementation doesn't do any tracking, not really, and
      that's the next step.
  18. 23 Mar, 2018 1 commit
  19. 11 Mar, 2018 1 commit
      [feature] implement simple, working dma engine · 15cd651b
      Swann Perarnau authored
      This patch adds the basics for a dma interface, including
      type-dependent requests structures, and an API based on explicit
      copy/move calls.
      
      The API is flexible enough to deal with sync/async calls. The internal
      design is inspired by aml_area, with the goal that create/init stay type
      specific, but the core interactions are generic.
  20. 08 Mar, 2018 1 commit
      [feature] Add initial tiling and binding support · bcb6c923
      Swann Perarnau authored
      Implement 1d tiling and simple binding support. The idea is to allow
      an application to explain to AML how data should be organized, and to
      be able to reuse this info when dealing with memory movement.
      
      The current interfaces are not great, but they work.
  21. 01 Feb, 2018 2 commits
      [fix] Make jemalloc link properly with aml · f8e57eeb
      Swann Perarnau authored
      Using both --with-jemalloc-prefix=jemk_ and --with-install-suffix=-aml
      ensures that the libjemalloc we build for our internal use has unique
      function names and library names.
      
      This patch fixes a problem with the libtool configuration, where we had
      to link a static jemalloc library into the shared aml library to make
      everything work. Instead, we can now use the proper LIBADD variables
      with the unique library.
      
      Note that with this patch the `make check` target requires a `make
      install` beforehand.
      [refactor] new version of the API · a002945c
      Swann Perarnau authored
      This is the result of countless iterations on the internal design of
      the various building blocks we want to have for this library.
      
      At this point, I hope that this is stable enough. There are still some
      tweaks needed here and there, but the core is implemented AND tested.
      
      Some of the design decisions made:
      - all functions are public, but most are not meant to be used directly.
      - intended public functions take "generic" structs as arguments
      - intended actual implementations rely on more complex structures, with
      their own family of data and operators.
      - split all objects between data and operator structs.
      
      Example:
      - area.c and arena.c are generic dispatch functions to call the actual,
        specific implementations.
      - struct aml_area and struct aml_arena are the same.
      
      Currently implemented:
      - 2 area types: posix (malloc) and linux (numa).
      - 1 arena type: jemalloc
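
The split between data and operator structs, with generic functions dispatching to a specific implementation, might look roughly like the sketch below. Every identifier here is hypothetical and chosen for illustration; the "posix" implementation simply forwards to malloc/free, in the spirit of the posix area type mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* All names below are hypothetical and only illustrate the pattern. */
struct area_data; /* opaque: each implementation defines its own layout */

struct area_ops {
	void *(*alloc)(struct area_data *data, size_t size);
	void (*release)(struct area_data *data, void *ptr);
};

/* Generic handle: one pointer to operators, one to implementation data. */
struct area {
	const struct area_ops *ops;
	struct area_data *data;
};

/* Generic dispatch functions: the entry points users are meant to call. */
void *area_malloc(struct area *a, size_t size)
{
	assert(a != NULL && a->ops != NULL && a->ops->alloc != NULL);
	return a->ops->alloc(a->data, size);
}

void area_free(struct area *a, void *ptr)
{
	assert(a != NULL && a->ops != NULL && a->ops->release != NULL);
	a->ops->release(a->data, ptr);
}

/* A "posix" implementation that needs no data and forwards to libc. */
static void *posix_alloc(struct area_data *data, size_t size)
{
	(void)data;
	return malloc(size);
}

static void posix_release(struct area_data *data, void *ptr)
{
	(void)data;
	free(ptr);
}

const struct area_ops area_posix_ops = {
	.alloc = posix_alloc,
	.release = posix_release,
};
```

A second implementation would only need to supply its own ops table and data layout; the generic functions stay unchanged.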
  22. 06 Oct, 2017 1 commit
      [refactor] Implement new API, based on jemalloc · 66564c2f
      Swann Perarnau authored
      This is a redesign of the library, as a hierarchy of core objects
      implementing its various features. The idea is to create an API that is
      as flexible and customizable as possible, by exposing as much as
      possible of its internals, so that users can create custom versions
      easily.
      
      We also move away from memkind as a possible backend, opting instead to
      vendor the jemalloc interface and implement everything ourselves on top
      of that.
      
      We expect to start building the low-level pieces using hwloc as a
      backend soon, at least in terms of accessing available devices.
  23. 22 Aug, 2017 1 commit
      [refactor] Rework code for better abstractions · a41b0412
      Swann Perarnau authored
      This is a rewrite of the existing code into a memory library exposing
      more of its internal abstractions. This refactoring is required to:
      - make progress faster by focusing on the core new features
      - abstract more of the underlying components and expose those
      abstractions
      - build upon existing libraries (memkind) for the internal stuff.
      
      Memkind is used as a crutch here; we do not intend to use it in the long
      term, as some of its internals are opposed to what we want (topology
      management in particular).
      
      Nevertheless, it currently provides a good allocator internally, and
      decent access to deep memory, for now.
      
      Over time, we figured out that the best way to build this API was to
      create several layers of APIs, each with more abstractions over the
      devices. At the same time, we want each layer to expose its internal
      mechanisms, so that a user can customize any of them.
      
      This is why we end up with areas and dma engines, and we will add in the
      future other stuff, like data decomposition and distribution methods, as
      well as direct support for "pipelining".
  24. 27 Feb, 2017 1 commit
      Implement non-transparent memory interface · 27252580
      Swann Perarnau authored
      This is a mmap-based, non-transparent version of the library, with a
      unit test checking that we can call move_pages properly from it.
      
      No node tracking performed. Memcpy not working.
  25. 30 Jun, 2016 1 commit
      Add first working version: limit numa allocs · 80669c37
      Swann Perarnau authored
      This is the first working version of the library. It does very little:
      - only numa support
      - one allocation per node only
      - limited tests
      - limited set of functions
      
      Nevertheless, this gives a good idea of what the API should look like, and the
      kind of benchmarks we can write with it.