- 26 Mar, 2019 3 commits
-
-
Swann Perarnau authored
Since we removed `move` from the dma API, bindings are now useless.
-
Swann Perarnau authored
Poorly supported feature that doesn't play well with the rest of the library and limits what we can do in the future.
-
-
- 25 Mar, 2019 1 commit
-
-
- 22 Mar, 2019 1 commit
-
-
Swann Perarnau authored
Use m4 to define autoconf-level version variables, following the naming scheme of semver.org. To make use of these variables in the headers and sources, a generated header is added in aml/utils/version.h. Also add a simple test for that part of the lib.
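As a rough illustration of how sources might consume such a generated header, the sketch below formats a semver-style version string and performs a simple compatibility check. The `EXAMPLE_VERSION_*` macros and both functions are hypothetical stand-ins; the actual names defined in aml/utils/version.h are not stated in this log and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical macros standing in for what a generated version header
 * could define; the real names in aml/utils/version.h may differ. */
#define EXAMPLE_VERSION_MAJOR 0
#define EXAMPLE_VERSION_MINOR 1
#define EXAMPLE_VERSION_PATCH 0

/* Format "MAJOR.MINOR.PATCH" into buf. Returns the number of characters
 * written (terminator excluded), or -1 if buf is too small. */
int example_version_string(char *buf, size_t len)
{
	assert(buf != NULL);
	int n = snprintf(buf, len, "%d.%d.%d", EXAMPLE_VERSION_MAJOR,
			 EXAMPLE_VERSION_MINOR, EXAMPLE_VERSION_PATCH);
	return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* Semver-style check: same major version, and a minor version at least
 * as recent as the one requested. */
int example_version_at_least(int major, int minor)
{
	return EXAMPLE_VERSION_MAJOR == major &&
	       EXAMPLE_VERSION_MINOR >= minor;
}
```

A test for this part of the library could then compare the formatted string against the version declared in the configure script.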
-
- 21 Mar, 2019 4 commits
-
-
Nicolas Denoyelle authored
-
Nicolas Denoyelle authored
-
Nicolas Denoyelle authored
-
Nicolas Denoyelle authored
-
- 20 Mar, 2019 2 commits
-
-
Swann Perarnau authored
The overall strategy for now is to split the implementations into distinct headers, but keep the generic APIs inside the main aml.h. Related to #27.
-
Swann Perarnau authored
The overall strategy for now is to split the implementations into distinct headers, but keep the generic APIs inside the main aml.h. Related to #27.
-
- 14 Mar, 2019 1 commit
-
-
While the current bitmask management is heavily inspired by libnuma, it is not as easy to use as the libnuma API. This patch is an attempt to refactor the code towards something cleaner.
-
- 13 Mar, 2019 2 commits
-
-
Swann Perarnau authored
If an operation should not trigger any actual work, create a no-op request to deal with it properly. Related to #18.
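A minimal sketch of the no-op request idea, assuming a request is a small struct with a type tag. All names here (`example_request`, `EXAMPLE_REQUEST_NOOP`, and so on) are hypothetical and do not reflect the library's actual types; `memcpy` stands in for the real data movement.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical request types: NOOP marks a request with no work to do. */
enum example_request_type { EXAMPLE_REQUEST_NOOP, EXAMPLE_REQUEST_COPY };

struct example_request {
	enum example_request_type type;
	void *dest;
	const void *src;
	size_t size;
};

/* Always produce a valid request, so callers can wait on it uniformly.
 * Empty copies and copies onto the same address become no-ops. */
void example_request_create(struct example_request *req, void *dest,
			    const void *src, size_t size)
{
	assert(req != NULL);
	req->dest = dest;
	req->src = src;
	req->size = size;
	req->type = (size == 0 || dest == src) ? EXAMPLE_REQUEST_NOOP
					       : EXAMPLE_REQUEST_COPY;
}

/* Completing a no-op request succeeds immediately without touching
 * memory; only COPY requests perform the (stand-in) transfer. */
int example_request_wait(struct example_request *req)
{
	assert(req != NULL);
	if (req->type == EXAMPLE_REQUEST_COPY)
		memcpy(req->dest, req->src, req->size);
	return 0;
}
```

The point of the design is that callers never need a special case: every operation yields a request that can be waited on and destroyed the same way.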
-
Create one directory per building block in src and include; keep one directory for tests, otherwise automake makes them "test suites"; move to AC_OPENMP, which is from autoconf 2.62 (2008).
-
- 08 Mar, 2019 1 commit
-
-
Swann Perarnau authored
Force libtool to static link the PIC version of our jemalloc import into libaml, making libaml standalone. This requires us to test some additional libraries in our own configure (pthread, and dlopen). This also solves the long-standing issue of `make check` only working after `make install`, while removing our custom jemalloc from the installed libraries. Fixes #26.
-
- 15 Feb, 2019 1 commit
-
-
Swann Perarnau authored
-
- 27 Aug, 2018 1 commit
-
-
Swann Perarnau authored
Instead of asking the user to provide the offsets into a tiling, add a function providing a tileid. This tileid corresponds to the in-memory order of tiles, making the tilestart functions a lot simpler. We still need to split the tileid for tilestart because scratchpads create requests based on tileids. Also add a unit test for tiling_2d, to make sure we're not doing anything crazy.
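To illustrate why a tileid that follows the in-memory order simplifies the tilestart computation, here is a small sketch assuming tiles of a fixed size stored back to back. The struct and function names are hypothetical, not the library's API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a tiling made of equally sized tiles
 * stored one after another in memory. */
struct example_tiling {
	size_t tilesize; /* size of one tile, in bytes */
	size_t ntiles;   /* total number of tiles */
};

/* Because the tileid is the position of the tile in memory order, the
 * start address is a single multiplication away from the base. */
void *example_tilestart(const struct example_tiling *t, void *base,
			size_t tileid)
{
	assert(t != NULL && base != NULL);
	assert(tileid < t->ntiles);
	return (char *)base + tileid * t->tilesize;
}
```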
-
- 24 Aug, 2018 1 commit
-
-
Swann Perarnau authored
Tiling 2d and its interfaces weren't the right way of looking at 2d grids of tiles. Rename the contig ones to provide the required features.
-
- 20 Aug, 2018 2 commits
-
-
Swann Perarnau authored
This patch provides aligned allocations for all areas. Simple tests included. Note that I haven't tested if it conflicts with arena-wide alignments.
-
Swann Perarnau authored
We are going to need more of those flags, and keeping track of the conversions is tricky. So let's use a copy of the macros.
-
- 06 Aug, 2018 4 commits
-
-
Swann Perarnau authored
mbind requires that the input ptr be aligned on a page. NOTE: we could also figure out a way to ask jemalloc for page-aligned allocations, but that would probably be too much for each alloc.
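One way to satisfy the page-alignment requirement is to widen the range to page boundaries before binding. The sketch below only computes that widened range and does not call mbind itself; `example_page_align` is a hypothetical helper, and it assumes the page size is a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Compute the smallest page-aligned range covering [ptr, ptr + size).
 * Assumes the page size is a power of two. The resulting range is what
 * would then be handed to a call such as mbind. */
void example_page_align(void *ptr, size_t size, void **aligned,
			size_t *aligned_size)
{
	long ps = sysconf(_SC_PAGESIZE);
	assert(ps > 0);
	assert(aligned != NULL && aligned_size != NULL);

	uintptr_t page = (uintptr_t)ps;
	/* round the start down and the end up to a page boundary */
	uintptr_t start = (uintptr_t)ptr & ~(page - 1);
	uintptr_t end = ((uintptr_t)ptr + size + page - 1) & ~(page - 1);

	*aligned = (void *)start;
	*aligned_size = (size_t)(end - start);
}
```

Note that the widened range may cover bytes belonging to neighboring allocations, which is the page-overlap concern raised for arenas shared between areas.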
-
Swann Perarnau authored
The way jemalloc handles big allocations can often result in surprising calls to mmap/mbind (splitting allocations, rounded up sizes). It also makes the path between an aml_alloc and mbind quite difficult to see. More worrying, if jemalloc reuses a previous allocation, the mbind will not be called again, which might result in the wrong binding happening. To fix those issues, we move the mbind logic to be around the allocations returned from jemalloc. This will ensure that we always bind properly. The only issue is that it might slow down allocations. It can also cause issues if the same arena is used by multiple areas, as allocations might be overlapping a page. We will move away from sharing arenas for benchmarks from now on.
-
Swann Perarnau authored
Fix dgemm_noprefetch to match pattern from @suchyb in #19. In order to do so we split our 2d tiling into column-major and row-major ones. Note that those are referring to the order of the tiles, not the internal data of a tile, as a tiling should be agnostic to it.
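The difference between the two tile orders can be sketched as two index computations over the same grid of tiles. The function names are hypothetical; only the order of the tiles changes, and the layout of the data inside each tile is left untouched.

```c
#include <assert.h>
#include <stddef.h>

/* Row-major tile order: tiles of the same row are adjacent in memory. */
size_t example_tileid_rowmajor(size_t nrows, size_t ncols, size_t row,
			       size_t col)
{
	assert(row < nrows && col < ncols);
	return row * ncols + col;
}

/* Column-major tile order: tiles of the same column are adjacent. */
size_t example_tileid_colmajor(size_t nrows, size_t ncols, size_t row,
			       size_t col)
{
	assert(row < nrows && col < ncols);
	return col * nrows + row;
}
```

For a grid of 2 rows by 3 columns, the tile at row 0, column 1 is second in row-major order (id 1) but third in column-major order (id 2).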
-
Swann Perarnau authored
Add a tiling representing a 2d array of contiguous tiles. Also add a ndims function to retrieve the dimensions in tiles of the tiling. It also became quite obvious that the iterators are useless right now. We should think about changing that.
-
- 30 Jul, 2018 1 commit
-
-
Kamil Iskra authored
-
- 25 Jul, 2018 2 commits
-
-
Implement a 2D tiling with contiguous tiles in memory, with tiles organized in row-major order inside the virtual address range. Also add functions to query the size of a tile inside the tiling.
-
Swann Perarnau authored
When code using aml also links against jemalloc, errors can occur because we use the default jemk prefix for the aml-specific jemalloc install. To fix these issues, we instead use an aml-specific prefix. Discovered when using mkl on a knl box.
-
- 05 Jul, 2018 1 commit
-
-
Swann Perarnau authored
Useful and currently missing.
-
- 30 Mar, 2018 2 commits
-
-
Swann Perarnau authored
We were unlocking the dma before the request type was set to a proper value, resulting in requests sometimes overlapping when multiple threads were used in benchmarks.
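The class of bug described here can be sketched with a fixed pool of request slots, assuming a slot counts as free while its type is still invalid. The types and names below are hypothetical; the key point is that the slot is marked as in use before the lock is released.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define EXAMPLE_NSLOTS 4

/* Hypothetical request types; INVALID (zero) marks a free slot. */
enum example_type { EXAMPLE_INVALID = 0, EXAMPLE_COPY };

struct example_request {
	enum example_type type;
};

struct example_dma {
	pthread_mutex_t lock;
	struct example_request slots[EXAMPLE_NSLOTS];
};

/* Claim a free slot. The type is written while the lock is still held:
 * if it were set after unlocking, a second thread could see the slot as
 * free and hand out the same request twice. Returns NULL if all slots
 * are in use. */
struct example_request *example_request_create(struct example_dma *dma,
					       enum example_type type)
{
	struct example_request *req = NULL;

	assert(dma != NULL && type != EXAMPLE_INVALID);
	pthread_mutex_lock(&dma->lock);
	for (size_t i = 0; i < EXAMPLE_NSLOTS; i++) {
		if (dma->slots[i].type == EXAMPLE_INVALID) {
			req = &dma->slots[i];
			req->type = type; /* mark as used before unlocking */
			break;
		}
	}
	pthread_mutex_unlock(&dma->lock);
	return req;
}
```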
-
Swann Perarnau authored
When a user doesn't need a tile to be pushed back into the scratchpad, it is better to just `release` that tile instead. This is particularly useful for read-only data for applications that are bandwidth limited.
-
- 29 Mar, 2018 2 commits
-
-
Kamil Iskra authored
Also add documentation to two forgotten functions in the header file.
-
Note that several comments are still missing, specifically for the area's acquire()/release()/available() routines, the function of which is not clear to me.
-
- 28 Mar, 2018 7 commits
-
-
Swann Perarnau authored
Add mutex to make request creation and destruction thread-safe. As for scratch_seq, we need to deal both with requests and tiles during these functions, so we lock the entire section.
-
Swann Perarnau authored
Add mutex to make request creation and destruction thread-safe. As we need to deal both with requests and tiles during these functions, we lock the entire section.
-
Swann Perarnau authored
Add mutex to make request creation and destruction thread-safe. Same as dma_linux_seq, the changes are quite simple, as we only need to protect modifications to the requests array.
-
Swann Perarnau authored
Add a mutex to make request creation and destruction thread-safe. As the code here is quite simple, we only need to protect modifications to the request array.
-
Swann Perarnau authored
scratch_request_seq contains one extra tiling that is unnecessary. Remove it.
-
Swann Perarnau authored
The request type contains too much stuff, remove extra pointers to win some space.
-
Swann Perarnau authored
Add a scratchpad that creates one pthread per request, to call synchronous dma operations. The intent is to end up with a cross product of programming language support between dma and scratch: scratch_par + dma_seq gives users parallel scratch requests, and scratch_seq + dma_par gives users sequential access to parallel moves. The two other options don't make as much sense, though.
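The one-thread-per-request pattern could look roughly like the following, where each asynchronous request spawns a pthread that runs a synchronous operation and waiting on the request joins that thread. The names are hypothetical, and `memcpy` stands in for the synchronous dma copy.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical request: one thread per in-flight operation. */
struct example_request {
	pthread_t thread;
	void *dest;
	const void *src;
	size_t size;
};

/* Thread body: perform the synchronous operation, then exit. */
static void *example_worker(void *arg)
{
	struct example_request *req = arg;

	memcpy(req->dest, req->src, req->size); /* stand-in for a sync copy */
	return NULL;
}

/* Start the operation in its own thread. Returns 0 on success, or the
 * error number reported by pthread_create. */
int example_async_copy(struct example_request *req, void *dest,
		       const void *src, size_t size)
{
	assert(req != NULL);
	req->dest = dest;
	req->src = src;
	req->size = size;
	return pthread_create(&req->thread, NULL, example_worker, req);
}

/* Block until the operation has completed. */
int example_wait(struct example_request *req)
{
	assert(req != NULL);
	return pthread_join(req->thread, NULL);
}
```

The request must stay alive until it has been waited on, since the worker thread reads its fields.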
-
- 27 Mar, 2018 1 commit
-
-
Swann Perarnau authored
Replace custom code with generic vectors for the scratch implementation. In the process, fix a bug in the management of tiles, as they were being freed on pull completion, which is wrong.
-