1. 30 Mar, 2018 5 commits
      [fix] unlocks are too early in dma_linux_* · c445b498
      Swann Perarnau authored
      We were unlocking the dma before the request type was set to a
      proper value, resulting in requests sometimes overlapping when
      multiple threads were used in benchmarks.
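
A minimal sketch of the ordering issue this fix describes, assuming a fixed table of request slots where an "invalid" type marks a free slot. The names `struct dma`, `dma_request_create`, `REQ_INVALID`, and `NB_SLOTS` are hypothetical and are not taken from the library:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NB_SLOTS 4  /* illustrative table size */

/* Hypothetical request types; REQ_INVALID marks a free slot. */
enum req_type { REQ_INVALID = 0, REQ_COPY, REQ_MOVE };

struct request { enum req_type type; };

struct dma {
    pthread_mutex_t lock;
    struct request slots[NB_SLOTS];
};

void dma_init(struct dma *d)
{
    pthread_mutex_init(&d->lock, NULL);
    for (size_t i = 0; i < NB_SLOTS; i++)
        d->slots[i].type = REQ_INVALID;
}

/* Claim a free slot. The type is written while the lock is still held,
 * so no other thread can observe the slot as free and claim it too.
 * Unlocking before the assignment would reopen that window. */
struct request *dma_request_create(struct dma *d, enum req_type type)
{
    struct request *req = NULL;

    assert(type != REQ_INVALID);
    pthread_mutex_lock(&d->lock);
    for (size_t i = 0; i < NB_SLOTS; i++) {
        if (d->slots[i].type == REQ_INVALID) {
            req = &d->slots[i];
            req->type = type;   /* mark as used BEFORE unlocking */
            break;
        }
    }
    pthread_mutex_unlock(&d->lock);
    return req;                 /* NULL when every slot is in use */
}
```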
      [test] add openmp version of mt stream_add · 956d9453
      Swann Perarnau authored
      This is a second type of use for the scratchpad: a single master
      thread is responsible for launching all data movements, but the tiles
      are worked on in parallel. We support this model by using a sequential
      scratch on top of a parallel dma.
      [refactor] add openmp version of stream_add_pth · 21d3724e
      Swann Perarnau authored
      Add openmp version of the previous functional test. We also rename them,
      to mark the fact that those two tests are designed to use a *single thread*
      to run the kernel across an entire tile.
      [refactor] make use of functional tests again · f47dc685
      Swann Perarnau authored
      This patch reintroduces the first functional test, a stream add
      implementation using pthreads for parallelism. We make use of our
      scratch_par implementation to implement a pipelined version of the
      application, where each worker thread is using its own batch of tiles,
      and migrating data asynchronously.
      [feature] add function to release a scratch tile · 7260868d
      Swann Perarnau authored
      When a user doesn't need a tile to be pushed back into the scratchpad,
      it is better to just `release` that tile instead. This is particularly
      useful for read-only data for applications that are bandwidth limited.
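
The difference between pushing and releasing a tile can be sketched as follows. This is a toy model under stated assumptions, not the library's scratchpad: `struct scratch`, `scratch_pull`, `scratch_push`, `scratch_release`, and the fixed tile sizes are all hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TILE_SIZE 8          /* bytes per tile (illustrative) */
#define NB_SCRATCH_TILES 2   /* tiles available in the scratch area */

struct scratch {
    char *main_area;                          /* backing "main" memory */
    char tiles[NB_SCRATCH_TILES][TILE_SIZE];  /* the "scratch" memory */
    int owner[NB_SCRATCH_TILES];              /* main tile held, -1 if free */
};

void scratch_init(struct scratch *s, char *main_area)
{
    s->main_area = main_area;
    for (int i = 0; i < NB_SCRATCH_TILES; i++)
        s->owner[i] = -1;
}

/* Copy a main tile into the first free scratch tile.
 * Returns the scratch tile index, or -1 if none is free. */
int scratch_pull(struct scratch *s, int main_tile)
{
    assert(main_tile >= 0);
    for (int i = 0; i < NB_SCRATCH_TILES; i++) {
        if (s->owner[i] == -1) {
            memcpy(s->tiles[i],
                   s->main_area + (size_t)main_tile * TILE_SIZE, TILE_SIZE);
            s->owner[i] = main_tile;
            return i;
        }
    }
    return -1;
}

/* Copy the scratch tile back to main memory, then free it. */
void scratch_push(struct scratch *s, int st)
{
    assert(st >= 0 && st < NB_SCRATCH_TILES && s->owner[st] != -1);
    memcpy(s->main_area + (size_t)s->owner[st] * TILE_SIZE,
           s->tiles[st], TILE_SIZE);
    s->owner[st] = -1;
}

/* Free the scratch tile WITHOUT copying it back: no bandwidth is spent
 * on data the caller only read. */
void scratch_release(struct scratch *s, int st)
{
    assert(st >= 0 && st < NB_SCRATCH_TILES && s->owner[st] != -1);
    s->owner[st] = -1;
}
```

In this model a release costs no memory traffic at all, which is the point made above for read-only data in bandwidth-limited applications.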
  2. 29 Mar, 2018 2 commits
  3. 28 Mar, 2018 7 commits
      [feature] make scratch_par thread-safe · 1e1f1ced
      Swann Perarnau authored
      Add mutex to make request creation and destruction thread-safe. As for
      scratch_seq, we need to deal with both requests and tiles during these
      functions, so we lock the entire section.
      [feature] make scratch_seq thread-safe · cd9dba51
      Swann Perarnau authored
      Add mutex to make request creation and destruction thread-safe. As we
      need to deal with both requests and tiles during these functions, we lock
      the entire section.
      [feature] make dma_linux_par thread-safe · 7a69c840
      Swann Perarnau authored
      Add mutex to make request creation and destruction thread-safe. Same as
      dma_linux_seq, the changes are quite simple, as we only need to protect
      modifications to the requests array.
      [feature] make dma_linux_seq thread-safe · 9f2b685d
      Swann Perarnau authored
      Add a mutex to make request creation and destruction thread-safe. As the
      code here is quite simple, we only need to protect modifications to the
      request array.
      [refactor] remove extra tiling from request · b52f0e52
      Swann Perarnau authored
      scratch_request_seq contains one extra tiling that is unnecessary.
      Remove it.
      [refactor] remove unnecessary data from request · 22063684
      Swann Perarnau authored
      The request type contains too much stuff; remove extra pointers to save
      some space.
      [feature] add a pthread based scratchpad · fa51aea5
      Swann Perarnau authored
      Add a scratchpad that creates one pthread per request, to call
      synchronous dma operations.
      
      The intent is to end up with a cross product of programming language
      support between dma and scratch:
      - scratch_par + dma_seq gives users parallel scratch requests
      - scratch_seq + dma_par gives users sequential access to parallel moves
      
      The two other options don't make as much sense though.
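
The "one pthread per request around a synchronous operation" idea can be sketched roughly as below. `struct async_req`, `async_copy_start`, and `async_copy_wait` are hypothetical names, and `memcpy` stands in for whatever synchronous dma call the real scratchpad makes:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* One in-flight request: the thread running it plus its arguments. */
struct async_req {
    pthread_t tid;
    void *dst;
    const void *src;
    size_t size;
};

/* Thread body: perform the blocking operation on behalf of the caller. */
static void *async_run(void *arg)
{
    struct async_req *r = arg;
    memcpy(r->dst, r->src, r->size);  /* stand-in for a synchronous dma copy */
    return NULL;
}

/* Start the request; returns immediately with 0 on success, -1 on error.
 * The request struct must stay alive until async_copy_wait() returns. */
int async_copy_start(struct async_req *r, void *dst, const void *src,
                     size_t size)
{
    assert(r != NULL && dst != NULL && src != NULL);
    r->dst = dst;
    r->src = src;
    r->size = size;
    return pthread_create(&r->tid, NULL, async_run, r) == 0 ? 0 : -1;
}

/* Block until the request's thread has finished. */
int async_copy_wait(struct async_req *r)
{
    return pthread_join(r->tid, NULL) == 0 ? 0 : -1;
}
```

The caller gets asynchronous behavior even though the underlying operation is blocking, which corresponds to the scratch_par + dma_seq combination listed above.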
  4. 27 Mar, 2018 3 commits
      [refactor] use vector in scratch · 20354336
      Swann Perarnau authored
      Replace custom code with generic vectors for the scratch implementation.
      In the process, fix a bug in the management of tiles, as they were being
      freed on pull completion, which is wrong.
      [refactor] use request vector for dma · 61160adf
      Swann Perarnau authored
      Use the newly introduced vector type to manage requests inside dmas.
      This cleans up the API a bit, and removes dubious ops from the dma
      internals.
      [feature] add generic vector type to library · 72c8508d
      Swann Perarnau authored
      Add a generic vector type to the library, with some special features:
      - the elements are embedded in the vector, and not pointers
      - each element must include an int field that is used as a "key"
      - the element has a "null" value for its key, used to indicate that this
      element of the vector is null.
      - add/remove functions provide access to a new element/free it from the
      vector, but don't "destroy" it.
      - resize on add is exponential.
      
      This patch includes implementation and unit test.
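
The features listed above can be illustrated with a small sketch. It is written from the description only, so the names (`struct vec`, `vec_init`, `vec_add`, `vec_remove`, `vec_key`) and the exact signatures are assumptions rather than the library's API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct vec {
    size_t nb;      /* number of slots */
    size_t esz;     /* size of one element */
    size_t koff;    /* offset of the int key inside an element */
    int null_key;   /* key value marking an unused slot */
    char *data;     /* elements are stored inline, not as pointers */
};

/* Address of the key of slot i. */
int *vec_key(struct vec *v, size_t i)
{
    return (int *)(v->data + i * v->esz + v->koff);
}

int vec_init(struct vec *v, size_t nb, size_t esz, size_t koff, int null_key)
{
    assert(nb > 0 && esz >= koff + sizeof(int));
    v->data = calloc(nb, esz);
    if (v->data == NULL)
        return -1;
    v->nb = nb;
    v->esz = esz;
    v->koff = koff;
    v->null_key = null_key;
    for (size_t i = 0; i < nb; i++)
        *vec_key(v, i) = null_key;
    return 0;
}

/* Return a pointer to an unused element, doubling the storage when full.
 * The caller must set the key to a non-null value to claim the slot.
 * Pointers obtained earlier may be invalidated by a resize. */
void *vec_add(struct vec *v)
{
    for (size_t i = 0; i < v->nb; i++)
        if (*vec_key(v, i) == v->null_key)
            return v->data + i * v->esz;

    size_t old = v->nb;
    char *p = realloc(v->data, 2 * old * v->esz);
    if (p == NULL)
        return NULL;
    v->data = p;
    v->nb = 2 * old;
    for (size_t i = old; i < v->nb; i++)
        *vec_key(v, i) = v->null_key;
    return v->data + old * v->esz;
}

/* Mark the element as unused; its other fields are left untouched. */
void vec_remove(struct vec *v, void *elem)
{
    *(int *)((char *)elem + v->koff) = v->null_key;
}
```

A caller would pass `sizeof(struct item)` and `offsetof(struct item, id)` for an element type with an `int id` field, so the vector never needs to know the element layout beyond where the key lives.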
  5. 26 Mar, 2018 2 commits
      [feature] make scratchpad track its own tiles · be88fe46
      Swann Perarnau authored
      Move the scratchpad tiles into an internal concern:
      - the scratchpad does the allocation
      - the scratchpad tracks available tiles internally
      - the user can ask for the scratch baseptr.
      
      This is necessary to abstract move-based scratchpads, and to relieve the
      user of the responsibility of maintaining tiling and baseptr tracking.
      
      We still fail-hard when tiles are not available, and the design is not
      thread safe. But we are getting there.
      [feature] add sequential, copy-based scratchpad · 73b57ae5
      Swann Perarnau authored
      This is the initial implementation and validation of a scratchpad: a
      logic unit that handles tracking data across a "main" area and a
      "scratch" area.
      
      The API and internals will probably change again soon, as there's no
      clear way to implement a move based scratchpad on this one.
      
      Note that this implementation doesn't do any tracking, not really, and
      that's the next step.
  6. 23 Mar, 2018 5 commits
      [feature] support offsets for file mappings · b698c7eb
      Kamil Iskra authored
      Replace the unused "max" argument for file-based mappings with an offset
      argument (until now the offset was hardcoded to 0).
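
For context, a file mapping at a non-zero offset on Linux might look like the sketch below. The helper name `map_file_at` is hypothetical; the only hard constraint shown is the one `mmap` itself imposes, namely that the offset be a multiple of the page size:

```c
#include <assert.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Map `size` bytes of the file behind `fd`, starting at `offset`.
 * Returns NULL on failure, including when offset is not page-aligned. */
void *map_file_at(int fd, size_t size, off_t offset)
{
    long pagesz = sysconf(_SC_PAGESIZE);

    assert(size > 0);
    if (pagesz <= 0 || offset < 0 || offset % pagesz != 0)
        return NULL;

    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
                   fd, offset);
    if (p == MAP_FAILED) {
        perror("mmap");
        return NULL;
    }
    return p;
}
```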
      [test] add working test for dma_linux_par · 12e946e0
      Swann Perarnau authored
      Add working implementation of copy and move to dma_linux_par, and
      corresponding unit test.
      [fix] fix typos across dma_seq code · 7bfa666c
      Swann Perarnau authored
      Fix a few typos in the dma_linux_seq code, that for some reason didn't
      raise any flags so far. Also add a small validation to the unit test.
      [feature] add parallel, on-demand dma · 428ec530
      Swann Perarnau authored
      Add a dma that spawns a fixed number of threads for each request created.
      The number of threads is configured at dma creation time.
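
One way to picture a request served by a fixed number of threads is a copy split into contiguous chunks, one per thread. This is only a sketch: `par_copy`, `struct chunk`, and `MAX_THREADS` are hypothetical, and the function joins its threads before returning, whereas a real on-demand dma would presumably let the caller wait on the request later.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define MAX_THREADS 16  /* illustrative upper bound */

struct chunk {
    char *dst;
    const char *src;
    size_t len;
};

static void *copy_chunk(void *arg)
{
    struct chunk *c = arg;
    memcpy(c->dst, c->src, c->len);
    return NULL;
}

/* Copy `size` bytes using `nbthreads` threads, each handling one
 * contiguous chunk; the last thread also takes the remainder.
 * Returns 0 on success, -1 on invalid arguments or thread failure. */
int par_copy(void *dst, const void *src, size_t size, size_t nbthreads)
{
    pthread_t tids[MAX_THREADS];
    struct chunk chunks[MAX_THREADS];
    size_t started = 0;
    int err = 0;

    assert(dst != NULL && src != NULL);
    if (nbthreads == 0 || nbthreads > MAX_THREADS)
        return -1;

    size_t per = size / nbthreads;
    for (size_t i = 0; i < nbthreads; i++) {
        chunks[i].dst = (char *)dst + i * per;
        chunks[i].src = (const char *)src + i * per;
        chunks[i].len = (i == nbthreads - 1) ? size - i * per : per;
        if (pthread_create(&tids[i], NULL, copy_chunk, &chunks[i]) != 0) {
            err = -1;
            break;
        }
        started++;
    }
    /* Always join the threads that did start, even on error. */
    for (size_t i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return err;
}
```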
      [refactor] remove generic functions from requests · 0a66735a
      Swann Perarnau authored
      This patch refactors dma request types to remove generic function
      pointers from the library. This includes modifying the linux_seq
      implementation to:
      - move the copy/move implementation to the dma ops
      - remove one layer of indirection, as the request type no longer needs
      _data and _ops substructures.
      
      Enforcing dma requests to have a fully qualified generic type, with
      function pointers, will cause issues for future kinds of dma
      implementation, that might require a different way of handling requests
      altogether.
      
      This work is driven by our current work on a parallel dma implementation.
  7. 22 Mar, 2018 5 commits
  8. 20 Mar, 2018 1 commit
  9. 11 Mar, 2018 4 commits
      [feature] implement simple, working dma engine · 15cd651b
      Swann Perarnau authored
      This patch adds the basics for a dma interface, including
      type-dependent requests structures, and an API based on explicit
      copy/move calls.
      
      The API is flexible enough to deal with sync/async calls. The internal
      design is inspired by aml_area, with the goal that create/init stay type
      specific, but the core interactions are generic.
      [refactor/feature] change tilings to use uuids · 8a150cf2
      Swann Perarnau authored
      Using variable arguments on the tile id for retrieving tiling info makes
      the API difficult to use when more than one tile must be used at the
      same time.
      
      We change the API to use a tileid, with the assumption that any valuable
      tiling will be able to define a workable uuid scheme.
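
As an illustration of addressing tiles by a single id, here is a sketch for a 1D tiling. `struct tiling_1d` and its two helpers are hypothetical names; the point is only that a plain integer id lets a caller hold several tiles at once without variable arguments:

```c
#include <assert.h>
#include <stddef.h>

/* A 1D tiling: a buffer of `totalsize` bytes cut into `tilesize` pieces. */
struct tiling_1d {
    size_t tilesize;
    size_t totalsize;
};

/* Number of tiles, counting a trailing partial tile if there is one. */
size_t tiling_1d_ntiles(const struct tiling_1d *t)
{
    assert(t->tilesize > 0);
    return (t->totalsize + t->tilesize - 1) / t->tilesize;
}

/* Start address of tile `tileid` inside `base`, or NULL if out of range. */
void *tiling_1d_tilestart(const struct tiling_1d *t, void *base,
                          size_t tileid)
{
    if (tileid >= tiling_1d_ntiles(t))
        return NULL;
    return (char *)base + tileid * t->tilesize;
}
```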
      [features] add missing macros and prototypes · 24a640a2
      Swann Perarnau authored
      A few missing declarations in aml.h, to make it easier to deal with the
      library.
      [fix] area_binding should create the binding · f79e5b43
      Swann Perarnau authored
      As we cannot find out in advance the binding an area uses, it is not
      possible to use a correctly allocated pointer to aml_area_binding.
      
      Fixes a segfault we observed outside of current unit-tests.
  10. 08 Mar, 2018 6 commits
      [feature] areas can now provide their binding · 1e0d24b8
      Swann Perarnau authored
      Allows memory movement logic to ask a target area how memory should be
      bound to it.
      
      Note that it would be safer in the long term to have areas take a
      binding at creation time, and translate to nodemasks internally.
      [feature] add init functions to area_linux · 085b9762
      Swann Perarnau authored
      Still the same schema, although it looks a bit messier on linux because
      of all the options needed.
      [feature] add proper init functions for area_posix · 8d2cc6ae
      Swann Perarnau authored
      Same schema as for arena, we create init functions for each type of
      area, to make sure that users know what they are working with.
      
      The functions are easy here, as posix is more an allocator than anything
      proper.
      [feature] add proper arena init · 261755e7
      Swann Perarnau authored
      Add arena_jemalloc initialization, working the same way as tiling and
      binding initializations.
      
      We made the choice of not building generic arena allocation functions,
      as the benefit of that isn't exactly obvious right now.
      
      Also, we want users to understand the kind of arenas and areas they are
      manipulating.
      [refactor] rename create/purge to avoid confusion · 2ae0aedb
      Swann Perarnau authored
      create -> register
      purge -> deregister
      
      create is a name we are using elsewhere for dynamic allocation.
      And those names match better what we are doing.
      [feature] Add initial tiling and binding support · bcb6c923
      Swann Perarnau authored
      Implement 1d tiling and simple binding support. The idea is to allow
      an application to explain to AML how data should be organized, and to
      be able to reuse this info when dealing with memory movement.
      
      The current interfaces are not great, but they work.