Commit 66564c2f authored by Swann Perarnau

[refactor] Implement new API, based on jemalloc

This is a redesign of the library as a hierarchy of core objects that
implement its various features. The goal is an API that is as flexible
and customizable as possible: it exposes as much of its internals as
possible, so that users can easily create custom versions.

We also move away from memkind as a possible backend, opting instead to
vendor the jemalloc interface and implement everything ourselves on top
of that.

We expect to start building the low-level pieces using hwloc as a
backend soon, at least in terms of accessing available devices.
parent d667c528

ACLOCAL_AMFLAGS = -I m4
SUBDIRS = src tests
SUBDIRS = jemalloc src tests
pkgconfigdir = $(libdir)/pkgconfig
pkgconfig_DATA = aml.pc
# remove jemalloc tests from make check
check-recursive:
	$(MAKE) -C tests check
EXTRA_DIST = autogen.sh aml.pc README.markdown
#!/bin/sh
autoreconf --verbose --install --force
set -ex
mkdir -p build-aux
aclocal -I m4
libtoolize
automake --add-missing --copy
autoconf
@@ -40,8 +40,9 @@ AM_CONDITIONAL([TEST_VALGRIND],[test "x$valgrind" = xtrue])
AC_CHECK_HEADERS(numa.h)
AC_CHECK_LIB(numa, move_pages)
# memkind
AC_CHECK_LIB(memkind, memkind_malloc)
# internal jemalloc
ac_configure_args="$ac_configure_args '--with-jemalloc-prefix=jemk_'"
AC_CONFIG_SUBDIRS([jemalloc])
AC_CONFIG_HEADERS([src/config.h])
version: '{build}'

environment:
  matrix:
  - MSYSTEM: MINGW64
    CPU: x86_64
    MSVC: amd64
  - MSYSTEM: MINGW32
    CPU: i686
    MSVC: x86
  - MSYSTEM: MINGW64
    CPU: x86_64
  - MSYSTEM: MINGW32
    CPU: i686
  - MSYSTEM: MINGW64
    CPU: x86_64
    MSVC: amd64
    CONFIG_FLAGS: --enable-debug
  - MSYSTEM: MINGW32
    CPU: i686
    MSVC: x86
    CONFIG_FLAGS: --enable-debug
  - MSYSTEM: MINGW64
    CPU: x86_64
    CONFIG_FLAGS: --enable-debug
  - MSYSTEM: MINGW32
    CPU: i686
    CONFIG_FLAGS: --enable-debug

install:
  - set PATH=c:\msys64\%MSYSTEM%\bin;c:\msys64\usr\bin;%PATH%
  - if defined MSVC call "c:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\vcvarsall.bat" %MSVC%
  - if defined MSVC pacman --noconfirm -Rsc mingw-w64-%CPU%-gcc gcc
  - pacman --noconfirm -Suy mingw-w64-%CPU%-make

build_script:
  - bash -c "autoconf"
  - bash -c "./configure $CONFIG_FLAGS"
  - mingw32-make
  - file lib/jemalloc.dll
  - mingw32-make tests
  - mingw32-make -k check
begin-language: "Autoconf-without-aclocal-m4"
args: --no-cache
end-language: "Autoconf-without-aclocal-m4"
* text=auto eol=lf
/bin/jemalloc-config
/bin/jemalloc.sh
/bin/jeprof
/config.stamp
/config.log
/config.status
/configure
/doc/html.xsl
/doc/manpages.xsl
/doc/jemalloc.xml
/doc/jemalloc.html
/doc/jemalloc.3
/jemalloc.pc
/lib/
/Makefile
/include/jemalloc/internal/jemalloc_preamble.h
/include/jemalloc/internal/jemalloc_internal_defs.h
/include/jemalloc/internal/private_namespace.gen.h
/include/jemalloc/internal/private_namespace.h
/include/jemalloc/internal/private_namespace_jet.gen.h
/include/jemalloc/internal/private_namespace_jet.h
/include/jemalloc/internal/private_symbols.awk
/include/jemalloc/internal/private_symbols_jet.awk
/include/jemalloc/internal/public_namespace.h
/include/jemalloc/internal/public_symbols.txt
/include/jemalloc/internal/public_unnamespace.h
/include/jemalloc/internal/size_classes.h
/include/jemalloc/jemalloc.h
/include/jemalloc/jemalloc_defs.h
/include/jemalloc/jemalloc_macros.h
/include/jemalloc/jemalloc_mangle.h
/include/jemalloc/jemalloc_mangle_jet.h
/include/jemalloc/jemalloc_protos.h
/include/jemalloc/jemalloc_protos_jet.h
/include/jemalloc/jemalloc_rename.h
/include/jemalloc/jemalloc_typedefs.h
/src/*.[od]
/src/*.sym
/run_tests.out/
/test/test.sh
test/include/test/jemalloc_test.h
test/include/test/jemalloc_test_defs.h
/test/integration/[A-Za-z]*
!/test/integration/[A-Za-z]*.*
/test/integration/*.[od]
/test/integration/*.out
/test/integration/cpp/[A-Za-z]*
!/test/integration/cpp/[A-Za-z]*.*
/test/integration/cpp/*.[od]
/test/integration/cpp/*.out
/test/src/*.[od]
/test/stress/[A-Za-z]*
!/test/stress/[A-Za-z]*.*
/test/stress/*.[od]
/test/stress/*.out
/test/unit/[A-Za-z]*
!/test/unit/[A-Za-z]*.*
/test/unit/*.[od]
/test/unit/*.out
/VERSION
*.pdb
*.sdf
*.opendb
*.opensdf
*.cachefile
*.suo
*.user
*.sln.docstates
*.tmp
/msvc/Win32/
/msvc/x64/
/msvc/projects/*/*/Debug*/
/msvc/projects/*/*/Release*/
/msvc/projects/*/*/Win32/
/msvc/projects/*/*/x64/
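The paired test-directory patterns above (for example `/test/unit/[A-Za-z]*` followed by `!/test/unit/[A-Za-z]*.*`) rely on the gitignore rule that the last matching pattern decides the outcome. The sketch below is a simplified model of that rule for file names inside a single directory. It assumes Python's `fnmatch` globbing is close enough to gitignore globbing for these patterns. The names `RULES` and `is_ignored` are hypothetical and exist only for illustration.

```python
from fnmatch import fnmatchcase

# Patterns for one test directory, in file order, with the directory
# prefix stripped. A leading "!" re-includes a previously ignored name.
RULES = ["[A-Za-z]*", "![A-Za-z]*.*", "*.[od]", "*.out"]


def is_ignored(name, rules=RULES):
    """Return True if `name` ends up ignored; the last matching rule wins."""
    ignored = False
    for rule in rules:
        negated = rule.startswith("!")
        pattern = rule[1:] if negated else rule
        if fnmatchcase(name, pattern):
            ignored = not negated
    return ignored


if __name__ == "__main__":
    for name in ["rtree", "rtree.c", "rtree.o", "rtree.out"]:
        print(name, "ignored" if is_ignored(name) else "tracked")
```

Under this model, extensionless names (typically built test binaries) are ignored, sources such as `rtree.c` are re-included, and object, dependency, and `.out` files are ignored again by the later patterns.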
language: generic

matrix:
  include:
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="" CONFIGURE_FLAGS="" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: osx
      env: CC=gcc CXX=g++ COMPILER_FLAGS="" CONFIGURE_FLAGS="" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=clang CXX=clang++ COMPILER_FLAGS="" CONFIGURE_FLAGS="" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="-m32" CONFIGURE_FLAGS="" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
      addons:
        apt:
          packages:
            - gcc-multilib
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--enable-debug" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--enable-prof" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--disable-stats" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--with-malloc-conf=tcache:false" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--with-malloc-conf=dss:primary" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--with-malloc-conf=percpu_arena:percpu" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--with-malloc-conf=background_thread:true" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: osx
      env: CC=clang CXX=clang++ COMPILER_FLAGS="" CONFIGURE_FLAGS="" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: osx
      env: CC=gcc CXX=g++ COMPILER_FLAGS="-m32" CONFIGURE_FLAGS="" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: osx
      env: CC=gcc CXX=g++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--enable-debug" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: osx
      env: CC=gcc CXX=g++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--disable-stats" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: osx
      env: CC=gcc CXX=g++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--with-malloc-conf=tcache:false" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=clang CXX=clang++ COMPILER_FLAGS="-m32" CONFIGURE_FLAGS="" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
      addons:
        apt:
          packages:
            - gcc-multilib
    - os: linux
      env: CC=clang CXX=clang++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--enable-debug" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=clang CXX=clang++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--enable-prof" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=clang CXX=clang++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--disable-stats" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=clang CXX=clang++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--with-malloc-conf=tcache:false" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=clang CXX=clang++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--with-malloc-conf=dss:primary" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=clang CXX=clang++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--with-malloc-conf=percpu_arena:percpu" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=clang CXX=clang++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--with-malloc-conf=background_thread:true" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="-m32" CONFIGURE_FLAGS="--enable-debug" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
      addons:
        apt:
          packages:
            - gcc-multilib
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="-m32" CONFIGURE_FLAGS="--enable-prof" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
      addons:
        apt:
          packages:
            - gcc-multilib
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="-m32" CONFIGURE_FLAGS="--disable-stats" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
      addons:
        apt:
          packages:
            - gcc-multilib
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="-m32" CONFIGURE_FLAGS="--with-malloc-conf=tcache:false" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
      addons:
        apt:
          packages:
            - gcc-multilib
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="-m32" CONFIGURE_FLAGS="--with-malloc-conf=dss:primary" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
      addons:
        apt:
          packages:
            - gcc-multilib
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="-m32" CONFIGURE_FLAGS="--with-malloc-conf=percpu_arena:percpu" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
      addons:
        apt:
          packages:
            - gcc-multilib
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="-m32" CONFIGURE_FLAGS="--with-malloc-conf=background_thread:true" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
      addons:
        apt:
          packages:
            - gcc-multilib
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--enable-debug --enable-prof" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--enable-debug --disable-stats" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--enable-debug --with-malloc-conf=tcache:false" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--enable-debug --with-malloc-conf=dss:primary" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--enable-debug --with-malloc-conf=percpu_arena:percpu" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--enable-debug --with-malloc-conf=background_thread:true" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--enable-prof --disable-stats" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--enable-prof --with-malloc-conf=tcache:false" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--enable-prof --with-malloc-conf=dss:primary" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--enable-prof --with-malloc-conf=percpu_arena:percpu" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--enable-prof --with-malloc-conf=background_thread:true" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--disable-stats --with-malloc-conf=tcache:false" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--disable-stats --with-malloc-conf=dss:primary" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--disable-stats --with-malloc-conf=percpu_arena:percpu" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--disable-stats --with-malloc-conf=background_thread:true" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--with-malloc-conf=tcache:false,dss:primary" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--with-malloc-conf=tcache:false,percpu_arena:percpu" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--with-malloc-conf=tcache:false,background_thread:true" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--with-malloc-conf=dss:primary,percpu_arena:percpu" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--with-malloc-conf=dss:primary,background_thread:true" EXTRA_CFLAGS="-Werror -Wno-array-bounds"
    - os: linux
      env: CC=gcc CXX=g++ COMPILER_FLAGS="" CONFIGURE_FLAGS="--with-malloc-conf=percpu_arena:percpu,background_thread:true" EXTRA_CFLAGS="-Werror -Wno-array-bounds"

before_script:
  - autoconf
  - ./configure ${COMPILER_FLAGS:+ CC="$CC $COMPILER_FLAGS" CXX="$CXX $COMPILER_FLAGS" } $CONFIGURE_FLAGS
  - make -j3
  - make -j3 tests

script:
  - make check
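The last block of Linux/gcc entries in the matrix above appears to cover every unordered pair of the seven configure options, with two `malloc_conf` settings merged into a single comma-separated `--with-malloc-conf=` argument. The sketch below reproduces that pattern. It is an illustration only, not the script used to generate the file, and the names `configure_flags`, `pairwise_flags`, and `env_line` are hypothetical.

```python
from itertools import combinations

# Options as they appear in the matrix: plain configure switches first,
# then malloc_conf settings that are passed via --with-malloc-conf=.
CONFIGURE_SWITCHES = ["--enable-debug", "--enable-prof", "--disable-stats"]
MALLOC_CONF = [
    "tcache:false",
    "dss:primary",
    "percpu_arena:percpu",
    "background_thread:true",
]


def configure_flags(options):
    """Build one CONFIGURE_FLAGS string, merging malloc_conf settings."""
    flags = [o for o in options if o in CONFIGURE_SWITCHES]
    conf = [o for o in options if o in MALLOC_CONF]
    if conf:
        flags.append("--with-malloc-conf=" + ",".join(conf))
    return " ".join(flags)


def pairwise_flags():
    """Return the CONFIGURE_FLAGS value for every unordered option pair."""
    options = CONFIGURE_SWITCHES + MALLOC_CONF
    return [configure_flags(pair) for pair in combinations(options, 2)]


def env_line(cc, cxx, compiler_flags, flags):
    """Format an env entry in the same shape as the matrix above."""
    return (
        'env: CC={} CXX={} COMPILER_FLAGS="{}" CONFIGURE_FLAGS="{}" '
        'EXTRA_CFLAGS="-Werror -Wno-array-bounds"'
    ).format(cc, cxx, compiler_flags, flags)


if __name__ == "__main__":
    for flags in pairwise_flags():
        print("    - os: linux")
        print("      " + env_line("gcc", "g++", "", flags))
```

Seven options give 21 pairs, which matches the number of two-option entries in the matrix.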
Unless otherwise specified, files in the jemalloc source distribution are
subject to the following license:
--------------------------------------------------------------------------------
Copyright (C) 2002-2017 Jason Evans <jasone@canonware.com>.
All rights reserved.
Copyright (C) 2007-2012 Mozilla Foundation. All rights reserved.
Copyright (C) 2009-2017 Facebook, Inc. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice(s),
this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice(s),
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER(S) ``AS IS'' AND ANY EXPRESS
OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO
EVENT SHALL THE COPYRIGHT HOLDER(S) BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
--------------------------------------------------------------------------------
Following are change highlights associated with official releases. Important
bug fixes are all mentioned, but some internal enhancements are omitted here for
brevity. Much more detail can be found in the git revision history:
https://github.com/jemalloc/jemalloc
* 5.0.1 (July 1, 2017)
This bugfix release fixes several issues, most of which are obscure enough
that typical applications are not impacted.
Bug fixes:
- Update decay->nunpurged before purging, in order to avoid potential update
races and subsequent incorrect purging volume. (@interwq)
- Only abort on dlsym(3) error if the failure impacts an enabled feature (lazy
locking and/or background threads). This mitigates an initialization
failure bug for which we still do not have a clear reproduction test case.
(@interwq)
- Modify tsd management so that it neither crashes nor leaks if a thread's
only allocation activity is to call free() after TLS destructors have been
executed. This behavior was observed when operating with GNU libc, and is
unlikely to be an issue with other libc implementations. (@interwq)
- Mask signals during background thread creation. This prevents signals from
being inadvertently delivered to background threads. (@jasone,
@davidtgoldblatt, @interwq)
- Avoid inactivity checks within background threads, in order to prevent
recursive mutex acquisition. (@interwq)
- Fix extent_grow_retained() to use the specified hooks when the
arena.<i>.extent_hooks mallctl is used to override the default hooks.
(@interwq)
- Add missing reentrancy support for custom extent hooks which allocate.
(@interwq)
- Post-fork(2), re-initialize the list of tcaches associated with each arena
to contain no tcaches except the forking thread's. (@interwq)
- Add missing post-fork(2) mutex reinitialization for extent_grow_mtx. This
fixes potential deadlocks after fork(2). (@interwq)
- Enforce minimum autoconf version (currently 2.68), since 2.63 is known to
generate corrupt configure scripts. (@jasone)
- Ensure that the configured page size (--with-lg-page) is no larger than the
configured huge page size (--with-lg-hugepage). (@jasone)
* 5.0.0 (June 13, 2017)
Unlike all previous jemalloc releases, this release does not use naturally
aligned "chunks" for virtual memory management, and instead uses page-aligned
"extents". This change has few externally visible effects, but the internal
impacts are... extensive. Many other internal changes combine to make this
the most cohesively designed version of jemalloc so far, with ample
opportunity for further enhancements.
Continuous integration is now an integral aspect of development thanks to the
efforts of @davidtgoldblatt, and the dev branch tends to remain reasonably
stable on the tested platforms (Linux, FreeBSD, macOS, and Windows). As a
side effect the official release frequency may decrease over time.
New features:
- Implement optional per-CPU arena support; threads choose which arena to use
based on current CPU rather than on fixed thread-->arena associations.
(@interwq)
- Implement two-phase decay of unused dirty pages. Pages transition from
dirty-->muzzy-->clean, where the first phase transition relies on
madvise(... MADV_FREE) semantics, and the second phase transition discards
pages such that they are replaced with demand-zeroed pages on next access.
(@jasone)
- Increase decay time resolution from seconds to milliseconds. (@jasone)
- Implement opt-in per CPU background threads, and use them for asynchronous
decay-driven unused dirty page purging. (@interwq)
- Add mutex profiling, which collects a variety of statistics useful for
diagnosing overhead/contention issues. (@interwq)
- Add C++ new/delete operator bindings. (@djwatson)
- Support manually created arena destruction, such that all data and metadata
are discarded. Add MALLCTL_ARENAS_DESTROYED for accessing merged stats
associated with destroyed arenas. (@jasone)
- Add MALLCTL_ARENAS_ALL as a fixed index for use in accessing
merged/destroyed arena statistics via mallctl. (@jasone)
- Add opt.abort_conf to optionally abort if invalid configuration options are
detected during initialization. (@interwq)
- Add opt.stats_print_opts, so that e.g. JSON output can be selected for the
stats dumped during exit if opt.stats_print is true. (@jasone)
- Add --with-version=VERSION for use when embedding jemalloc into another
project's git repository. (@jasone)
- Add --disable-thp to support cross compiling. (@jasone)
- Add --with-lg-hugepage to support cross compiling. (@jasone)
- Add mallctl interfaces (various authors):
+ background_thread
+ opt.abort_conf
+ opt.retain
+ opt.percpu_arena
+ opt.background_thread
+ opt.{dirty,muzzy}_decay_ms
+ opt.stats_print_opts
+ arena.<i>.initialized
+ arena.<i>.destroy
+ arena.<i>.{dirty,muzzy}_decay_ms
+ arena.<i>.extent_hooks
+ arenas.{dirty,muzzy}_decay_ms
+ arenas.bin.<i>.slab_size
+ arenas.nlextents
+ arenas.lextent.<i>.size
+ arenas.create
+ stats.background_thread.{num_threads,num_runs,run_interval}
+ stats.mutexes.{ctl,background_thread,prof,reset}.
{num_ops,num_spin_acq,num_wait,max_wait_time,total_wait_time,max_num_thds,
num_owner_switch}
+ stats.arenas.<i>.{dirty,muzzy}_decay_ms
+ stats.arenas.<i>.uptime
+ stats.arenas.<i>.{pmuzzy,base,internal,resident}
+ stats.arenas.<i>.{dirty,muzzy}_{npurge,nmadvise,purged}
+ stats.arenas.<i>.bins.<j>.{nslabs,reslabs,curslabs}
+ stats.arenas.<i>.bins.<j>.mutex.
{num_ops,num_spin_acq,num_wait,max_wait_time,total_wait_time,max_num_thds,
num_owner_switch}
+ stats.arenas.<i>.lextents.<j>.{nmalloc,ndalloc,nrequests,curlextents}
+ stats.arenas.<i>.mutexes.{large,extent_avail,extents_dirty,extents_muzzy,
extents_retained,decay_dirty,decay_muzzy,base,tcache_list}.
{num_ops,num_spin_acq,num_wait,max_wait_time,total_wait_time,max_num_thds,
num_owner_switch}
Portability improvements:
- Improve reentrant allocation support, such that deadlock is less likely if
e.g. a system library call in turn allocates memory. (@davidtgoldblatt,
@interwq)
- Support static linking of jemalloc with glibc. (@djwatson)
Optimizations and refactors:
- Organize virtual memory as "extents" of virtual memory pages, rather than as
naturally aligned "chunks", and store all metadata in arbitrarily distant
locations. This reduces virtual memory external fragmentation, and will
interact better with huge pages (not yet explicitly supported). (@jasone)
- Fold large and huge size classes together; only small and large size classes
remain. (@jasone)
- Unify the allocation paths, and merge most fast-path branching decisions.
(@davidtgoldblatt, @interwq)
- Embed per thread automatic tcache into thread-specific data, which reduces
conditional branches and dereferences. Also reorganize tcache to increase
fast-path data locality. (@interwq)
- Rewrite atomics to closely model the C11 API, convert various
synchronization from mutex-based to atomic, and use the explicit memory
ordering control to resolve various hypothetical races without increasing
synchronization overhead. (@davidtgoldblatt)
- Extensively optimize rtree via various methods:
+ Add multiple layers of rtree lookup caching, since rtree lookups are now
part of fast-path deallocation. (@interwq)
+ Determine rtree layout at compile time. (@jasone)
+ Make the tree shallower for common configurations. (@jasone)
+ Embed the root node in the top-level rtree data structure, thus avoiding
one level of indirection. (@jasone)
+ Further specialize leaf elements as compared to internal node elements,
and directly embed extent metadata needed for fast-path deallocation.
(@jasone)
+ Ignore leading always-zero address bits (architecture-specific).
(@jasone)
- Reorganize headers (ongoing work) to make them hermetic, and disentangle
various module dependencies. (@davidtgoldblatt)
- Convert various internal data structures such as size class metadata from
boot-time-initialized to compile-time-initialized. Propagate resulting data
structure simplifications, such as making arena metadata fixed-size.
(@jasone)
- Simplify size class lookups when constrained to size classes that are
multiples of the page size. This speeds lookups, but the primary benefit is
complexity reduction in code that was the source of numerous regressions.
(@jasone)
- Lock individual extents when possible for localized extent operations,
rather than relying on a top-level arena lock. (@davidtgoldblatt, @jasone)
- Use first fit layout policy instead of best fit, in order to improve
packing. (@jasone)
- If munmap(2) is not in use, use an exponential series to grow each arena's
virtual memory, so that the number of disjoint virtual memory mappings
remains low. (@jasone)
- Implement per arena base allocators, so that arenas never share any virtual
memory pages. (@jasone)
- Automatically generate private symbol name mangling macros. (@jasone)
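The note above about growing each arena's virtual memory with an exponential series can be illustrated with a little arithmetic. The sketch below counts how many separate mappings are needed to reach a target size when every mapping has the same size, compared with when each new mapping is larger than the last. The 2 MiB starting size and the doubling factor are assumptions chosen for illustration; the series jemalloc actually uses is not given here, and `mappings_needed` is a hypothetical helper.

```python
def mappings_needed(target_bytes, first_bytes, growth=2):
    """Count mappings required to cover target_bytes.

    Each new mapping is `growth` times the size of the previous one;
    growth=1 models fixed-size mappings.
    """
    count = 0
    total = 0
    size = first_bytes
    while total < target_bytes:
        total += size
        size *= growth
        count += 1
    return count


if __name__ == "__main__":
    one_gib = 1 << 30
    two_mib = 2 << 20
    print("fixed size:", mappings_needed(one_gib, two_mib, growth=1))
    print("doubling:  ", mappings_needed(one_gib, two_mib, growth=2))
```

Under these assumptions, covering 1 GiB takes 512 fixed-size mappings but only 10 doubling ones, which is the sense in which the number of disjoint mappings stays low.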
Incompatible changes:
- Replace chunk hooks with an expanded/normalized set of extent hooks.
(@jasone)
- Remove ratio-based purging. (@jasone)
- Remove --disable-tcache. (@jasone)
- Remove --disable-tls. (@jasone)
- Remove --enable-ivsalloc. (@jasone)
- Remove --with-lg-size-class-group. (@jasone)
- Remove --with-lg-tiny-min. (@jasone)
- Remove --disable-cc-silence. (@jasone)
- Remove --enable-code-coverage. (@jasone)
- Remove --disable-munmap (replaced by opt.retain). (@jasone)
- Remove Valgrind support. (@jasone)
- Remove quarantine support. (@jasone)
- Remove redzone support. (@jasone)
- Remove mallctl interfaces (various authors):
+ config.munmap
+ config.tcache
+ config.tls
+ config.valgrind
+ opt.lg_chunk
+ opt.purge
+ opt.lg_dirty_mult
+ opt.decay_time
+ opt.quarantine
+ opt.redzone
+ opt.thp
+ arena.<i>.lg_dirty_mult
+ arena.<i>.decay_time
+ arena.<i>.chunk_hooks
+ arenas.initialized
+ arenas.lg_dirty_mult
+ arenas.decay_time
+ arenas.bin.<i>.run_size
+ arenas.nlruns
+ arenas.lrun.<i>.size
+ arenas.nhchunks
+ arenas.hchunk.<i>.size
+ arenas.extend
+ stats.cactive
+ stats.arenas.<i>.lg_dirty_mult
+ stats.arenas.<i>.decay_time
+ stats.arenas.<i>.metadata.{mapped,allocated}
+ stats.arenas.<i>.{npurge,nmadvise,purged}
+ stats.arenas.<i>.huge.{allocated,nmalloc,ndalloc,nrequests}
+ stats.arenas.<i>.bins.<j>.{nruns,reruns,curruns}
+ stats.arenas.<i>.lruns.<j>.{nmalloc,ndalloc,nrequests,curruns}
+ stats.arenas.<i>.hchunks.<j>.{nmalloc,ndalloc,nrequests,curhchunks}
Bug fixes:
- Improve interval-based profile dump triggering to dump only one profile when
a single allocation's size exceeds the interval. (@jasone)
- Use prefixed function names (as controlled by --with-jemalloc-prefix) when
pruning backtrace frames in jeprof. (@jasone)
* 4.5.0 (February 28, 2017)
This is the first release to benefit from much broader continuous integration
testing, thanks to @davidtgoldblatt. Had we had this testing infrastructure
in place for prior releases, it would have caught all of the most serious
regressions fixed by this release.
New features:
- Add --disable-thp and the opt.thp mallctl to provide opt-out mechanisms for
transparent huge page integration. (@jasone)
- Update zone allocator integration to work with macOS 10.12. (@glandium)
- Restructure *CFLAGS configuration, so that CFLAGS behaves typically, and
EXTRA_CFLAGS provides a way to specify e.g. -Werror during building, but not
during configuration. (@jasone, @ronawho)
Bug fixes:
- Fix DSS (sbrk(2)-based) allocation. This regression was first released in
4.3.0. (@jasone)
- Handle race in per size class utilization computation. This functionality
was first released in 4.0.0. (@interwq)
- Fix lock order reversal during gdump. (@jasone)
- Fix/refactor tcache synchronization. This regression was first released in
4.0.0. (@jasone)
- Fix various JSON-formatted malloc_stats_print() bugs. This functionality
was first released in 4.3.0. (@jasone)
- Fix huge-aligned allocation. This regression was first released in 4.4.0.
(@jasone)
- When transparent huge page integration is enabled, detect what state pages
start in according to the kernel's current operating mode, and only convert
arena chunks to non-huge during purging if that is not their initial state.
This functionality was first released in 4.4.0. (@jasone)
- Fix lg_chunk clamping for the --enable-cache-oblivious --disable-fill case.
This regression was first released in 4.0.0. (@jasone, @428desmo)
- Properly detect sparc64 when building for Linux. (@glaubitz)
* 4.4.0 (December 3, 2016)
New features:
- Add configure support for *-*-linux-android. (@cferris1000, @jasone)
- Add the --disable-syscall configure option, for use on systems that place
security-motivated limitations on syscall(2). (@jasone)
- Add support for Debian GNU/kFreeBSD. (@thesam)
Optimizations:
- Add extent serial numbers and use them where appropriate as a sort key that