switch-linux/Documentation
Sergey Senozhatsky beca3ec71f zram: add multi stream functionality
The existing zram (zcomp) implementation has only one compression stream
(buffer and algorithm private part), so in order to prevent data
corruption only one write (compress operation) can use this compression
stream at a time, forcing all concurrent write operations to wait for the
stream lock to be released.  This patch changes zcomp to keep a list of
compression streams of user-defined size (via sysfs device attr).  Each
write operation still holds a compression stream exclusively; the
difference is that we can have N write operations (depending on the size
of the streams list) executing in parallel.  See the TEST section later in
the commit message for performance data.

Introduce struct zcomp_strm_multi and a set of functions to manage
zcomp_strm stream access.  zcomp_strm_multi has a list of idle
zcomp_strm structs, a spinlock to protect the idle list, and a wait
queue, making it possible to perform parallel compressions.

The following set of functions is added:
- zcomp_strm_multi_find()/zcomp_strm_multi_release()
  find and release a compression stream, implement required locking
- zcomp_strm_multi_create()/zcomp_strm_multi_destroy()
  create and destroy zcomp_strm_multi

The zcomp ->strm_find() and ->strm_release() callbacks are set during
initialisation to zcomp_strm_multi_find() and zcomp_strm_multi_release()
respectively.

Each time zcomp issues a zcomp_strm_multi_find() call, the following set
of operations is performed:

- spin lock strm_lock
- if the idle list is not empty, remove a zcomp_strm from the idle list,
  spin unlock and return the zcomp stream pointer to the caller
- if the idle list is empty, current adds itself to the wait queue.  It
  will be woken up by a zcomp_strm_multi_release() caller.

zcomp_strm_multi_release():
- spin lock strm_lock
- add zcomp stream to idle list
- spin unlock, wake up sleeper
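
The find/release steps above can be sketched in userspace C.  This is
only an illustration under stated assumptions: a pthread mutex and
condition variable stand in for the kernel spinlock and wait queue, and
the names strm, strm_multi, strm_multi_init(), strm_multi_find() and
strm_multi_release() are hypothetical, not the functions added by this
patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for zcomp_strm: one compression stream. */
struct strm {
	struct strm *next;	/* link in the idle list */
	char buffer[4096];	/* private compression buffer */
};

/* Stand-in for zcomp_strm_multi: idle list, its lock, and a wait queue. */
struct strm_multi {
	pthread_mutex_t lock;	/* plays the role of strm_lock */
	pthread_cond_t wait;	/* plays the role of the wait queue */
	struct strm *idle;	/* singly linked list of idle streams */
};

/* Put all n streams from pool[] on the idle list. */
static void strm_multi_init(struct strm_multi *zs, struct strm *pool, size_t n)
{
	pthread_mutex_init(&zs->lock, NULL);
	pthread_cond_init(&zs->wait, NULL);
	zs->idle = NULL;
	for (size_t i = 0; i < n; i++) {
		pool[i].next = zs->idle;
		zs->idle = &pool[i];
	}
}

/* Take an idle stream, sleeping until one is released if the list is empty. */
static struct strm *strm_multi_find(struct strm_multi *zs)
{
	struct strm *s;

	pthread_mutex_lock(&zs->lock);
	while (zs->idle == NULL)
		pthread_cond_wait(&zs->wait, &zs->lock);
	s = zs->idle;
	zs->idle = s->next;
	pthread_mutex_unlock(&zs->lock);
	s->next = NULL;
	return s;
}

/* Return a stream to the idle list, unlock, then wake up one sleeper. */
static void strm_multi_release(struct strm_multi *zs, struct strm *s)
{
	assert(s != NULL);
	pthread_mutex_lock(&zs->lock);
	s->next = zs->idle;
	zs->idle = s;
	pthread_mutex_unlock(&zs->lock);
	pthread_cond_signal(&zs->wait);
}
```

In this sketch a writer would call strm_multi_find(), compress into the
returned stream's buffer, and then call strm_multi_release(); with N
streams in the pool, up to N writers can proceed without waiting.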

Minchan Kim reported that the spinlock-based locking scheme has
demonstrated a severe performance regression for the single compression
stream case, compared to the mutex-based one
(see https://lkml.org/lkml/2014/2/18/16).

base                      spinlock                    mutex

==Initial write           ==Initial write             ==Initial  write
records:  5               records:  5                 records:   5
avg:      1642424.35      avg:      699610.40         avg:       1655583.71
std:      39890.95(2.43%) std:      232014.19(33.16%) std:       52293.96
max:      1690170.94      max:      1163473.45        max:       1697164.75
min:      1568669.52      min:      573429.88         min:       1553410.23
==Rewrite                 ==Rewrite                   ==Rewrite
records:  5               records:  5                 records:   5
avg:      1611775.39      avg:      501406.64         avg:       1684419.11
std:      17144.58(1.06%) std:      15354.41(3.06%)   std:       18367.42
max:      1641800.95      max:      531356.78         max:       1706445.84
min:      1593515.27      min:      488817.78         min:       1655335.73

When only one compression stream is available, a mutex with spin on owner
tends to perform much better than frequent wait_event()/wake_up().  This
is why the single stream case is implemented as a special case with mutex
locking.

Introduce and document zram device attribute max_comp_streams.  This
attr shows and stores current zcomp's max number of zcomp streams
(max_strm).  Extend zcomp's zcomp_create() with `max_strm' parameter.
`max_strm' limits the number of zcomp_strm structs in compression
backend's idle list (max_comp_streams).

max_comp_streams is used during initialisation as follows:
-- passing max_strm equal to 1 to zcomp_create() will initialise zcomp
using the single compression stream zcomp_strm_single (mutex-based locking).
-- passing max_strm greater than 1 to zcomp_create() will initialise zcomp
using the multi compression stream zcomp_strm_multi (spinlock-based locking).

The default max_comp_streams value is 1, meaning that zram will be
initialised with a single stream.
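
The backend selection described above can be sketched as follows.  This
is a simplified illustration, not the patch's code: struct comp,
comp_create() and the single_*/multi_* stubs are hypothetical names, the
stubs leave out all locking, and rejecting values below 1 is an
assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the zcomp backend descriptor. */
struct comp {
	int max_strm;		/* limit on the number of streams */
	const char *backend;	/* label for illustration only */
	void *(*strm_find)(struct comp *comp);
	void (*strm_release)(struct comp *comp, void *strm);
};

/* Stubs: a real single stream backend would take and drop a mutex here. */
static void *single_find(struct comp *comp) { (void)comp; return NULL; }
static void single_release(struct comp *comp, void *strm)
{
	(void)comp;
	(void)strm;
}

/* Stubs: a real multi stream backend would use the idle list here. */
static void *multi_find(struct comp *comp) { (void)comp; return NULL; }
static void multi_release(struct comp *comp, void *strm)
{
	(void)comp;
	(void)strm;
}

/*
 * Pick the callbacks from max_strm: 1 selects the single stream backend,
 * anything greater selects the multi stream backend.
 * Returns 0 on success, -1 if max_strm is below 1 (assumption).
 */
static int comp_create(struct comp *comp, int max_strm)
{
	assert(comp != NULL);
	if (max_strm < 1)
		return -1;

	comp->max_strm = max_strm;
	if (max_strm == 1) {
		comp->backend = "single (mutex)";
		comp->strm_find = single_find;
		comp->strm_release = single_release;
	} else {
		comp->backend = "multi (spinlock)";
		comp->strm_find = multi_find;
		comp->strm_release = multi_release;
	}
	return 0;
}
```

Callers only ever go through ->strm_find() and ->strm_release(), so the
choice of locking scheme stays hidden behind the two callbacks.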

A later patch will introduce a configuration knob to change
max_comp_streams on an already initialised and used zcomp.

TEST
iozone -t 3 -R -r 16K -s 60M -I +Z

       test           base       1 strm (mutex)     3 strm (spinlock)
-----------------------------------------------------------------------
 Initial write      589286.78       583518.39          718011.05
       Rewrite      604837.97       596776.38         1515125.72
  Random write      584120.11       595714.58         1388850.25
        Pwrite      535731.17       541117.38          739295.27
        Fwrite     1418083.88      1478612.72         1484927.06

Usage example:
set max_comp_streams to 4
        echo 4 > /sys/block/zram0/max_comp_streams

show current max_comp_streams (default value is 1).
        cat /sys/block/zram0/max_comp_streams

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07 16:36:01 -07:00
ABI zram: add multi stream functionality 2014-04-07 16:36:01 -07:00
accounting
acpi
aoe
arm Merge branch 'mvebu/soc3' into next/dt 2014-03-17 12:13:09 +01:00
arm64 arm64: Extend the PCI I/O space to 16MB 2014-02-26 11:16:27 +00:00
auxdisplay
backlight
blackfin
block
blockdev zram: add multi stream functionality 2014-04-07 16:36:01 -07:00
bus-devices
cdrom
cgroups memcg: rename high level charging functions 2014-04-07 16:35:57 -07:00
connector
console
cpu-freq cpufreq: Add stop CPU callback to cpufreq_driver interface 2014-03-20 03:50:12 +01:00
cpuidle
cris
crypto
development-process
device-mapper dm: add era target 2014-03-27 16:56:23 -04:00
devicetree IOMMU Upates for Linux v3.15 2014-04-05 18:46:26 -07:00
DocBook Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media 2014-04-04 09:50:07 -07:00
driver-model
dvb Linux 3.14-rc5 2014-03-11 06:55:49 -03:00
early-userspace
EDID
extcon
fault-injection
fb
filesystems mm: introduce vm_ops->map_pages() 2014-04-07 16:35:52 -07:00
firmware_class
fmc FMC: make eeprom attribute writable 2014-02-28 15:12:08 -08:00
frv
gpio
hid HID: uhid: Add UHID_CREATE2 + UHID_INPUT2 2014-04-01 18:27:33 +02:00
hwmon Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging 2014-04-05 18:45:11 -07:00
i2c Documentation: i2c: mention ACPI method for instantiating devices 2014-02-15 19:46:34 +01:00
i2o
ia64
ide
infiniband
input doc: fix double words 2014-03-21 13:16:58 +01:00
ioctl
isdn
ja_JP Documentation/SubmittingPatches: remove references to patch-scripts 2014-04-03 16:21:27 -07:00
kbuild
kdump
ko_KR
laptops
leds
m68k
make
memory-devices
metag
mic
mips
misc-devices
mmc
mn10300
mtd
namespaces
netlabel
networking Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2014-04-02 20:53:45 -07:00
nfc
parisc
PCI Merge branch 'pci/dead-code' into next 2014-02-20 14:32:34 -07:00
pcmcia
phy phy: Add new Exynos USB 2.0 PHY driver 2014-03-08 12:39:44 +05:30
power Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2014-04-02 16:23:38 -07:00
powerpc
pps
prctl
pti
ptp ptp: Fix compiler warnings in the testptp utility 2014-03-27 14:51:47 -04:00
rapidio
RCU documentation: Fix some inconsistencies in RTFP.txt 2014-02-17 14:56:10 -08:00
s390
scheduler
scsi [SCSI] megaraid_sas: Version and Changelog update 2014-03-15 10:19:21 -07:00
security doc: fix double words 2014-03-21 13:16:58 +01:00
serial
sh
sound x86, platforms: Remove SGI Visual Workstation 2014-02-27 08:07:39 -08:00
spi Merge remote-tracking branches 'spi/topic/s3c64xx', 'spi/topic/sc18is602', 'spi/topic/sh-hspi', 'spi/topic/sh-msiof', 'spi/topic/sh-sci', 'spi/topic/sirf' and 'spi/topic/spidev' into spi-next 2014-03-30 00:51:34 +00:00
sysctl Nothing major: the stricter permissions checking for sysfs broke 2014-04-06 09:38:07 -07:00
target
thermal
timers
tpm
trace Most of the changes were largely clean ups, and some documentation. 2014-04-03 10:26:31 -07:00
usb doc: fix double words 2014-03-21 13:16:58 +01:00
vDSO
video4linux [media] s5p-fimc: Remove reference to outdated macro 2014-03-14 10:37:43 -03:00
virtual Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2014-04-02 16:23:38 -07:00
vm doc: fix double words 2014-03-21 13:16:58 +01:00
w1 Merge 3.14-rc3 into char-misc-next 2014-02-18 08:09:40 -08:00
watchdog watchdog: it87_wdt: Work around non-working CIR interrupts 2014-03-31 13:33:55 +02:00
wimax
x86 x86, boot: Correct max ramdisk size name 2014-03-13 15:32:42 -07:00
xtensa
zh_CN Documentation/SubmittingPatches: remove references to patch-scripts 2014-04-03 16:21:27 -07:00
.gitignore
00-INDEX Merge branch 'x86-nuke-platforms-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-04-02 13:15:58 -07:00
applying-patches.txt
assoc_array.txt
atomic_ops.txt
bad_memory.txt
basic_profiling.txt
bcache.txt
binfmt_misc.txt
braille-console.txt
bt8xxgpio.txt
btmrvl.txt
BUG-HUNTING
bus-virt-phys-mapping.txt
cachetlb.txt
Changes
circular-buffers.txt
clk.txt Documentation: clk: Add locking documentation 2014-03-19 14:56:06 -07:00
coccinelle.txt
CodingStyle
cpu-hotplug.txt
cpu-load.txt
cputopology.txt
crc32.txt
dcdbas.txt
debugging-modules.txt
debugging-via-ohci1394.txt
dell_rbu.txt
devices.txt Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media 2014-04-04 09:50:07 -07:00
digsig.txt
DMA-API-HOWTO.txt
DMA-API.txt
DMA-attributes.txt
dma-buf-sharing.txt
DMA-ISA-LPC.txt
dmaengine.txt
dmatest.txt
dontdiff
dynamic-debug-howto.txt
edac.txt
efi-stub.txt
eisa.txt
email-clients.txt
flexible-arrays.txt
futex-requeue-pi.txt doc: fix double words 2014-03-21 13:16:58 +01:00
gcov.txt
highuid.txt
HOWTO
hw_random.txt
hwspinlock.txt
init.txt
initrd.txt
Intel-IOMMU.txt
intel_txt.txt
io-mapping.txt
io_ordering.txt
iostats.txt
IPMI.txt
IRQ-affinity.txt
IRQ-domain.txt
IRQ.txt
irqflags-tracing.txt
isapnp.txt
java.txt
kernel-doc-nano-HOWTO.txt
kernel-docs.txt
kernel-parameters.txt More ACPI and power management updates for 3.15-rc1 2014-04-02 14:10:21 -07:00
kernel-per-CPU-kthreads.txt Documentation/kernel-per-CPU-kthreads.txt: Workqueue affinity 2014-02-17 14:56:08 -08:00
kmemcheck.txt doc: fix double words 2014-03-21 13:16:58 +01:00
kmemleak.txt Documentation/kmemleak.txt: updates 2014-04-03 16:21:27 -07:00
kobject.txt
kprobes.txt
kref.txt
ldm.txt
local_ops.txt
lockdep-design.txt
lockstat.txt
lockup-watchdogs.txt
logo.gif
logo.txt
magic-number.txt
Makefile
ManagementStyle
md.txt
media-framework.txt
memory-barriers.txt Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2014-04-02 16:23:38 -07:00
memory-hotplug.txt
module-signing.txt Nothing major: the stricter permissions checking for sysfs broke 2014-04-06 09:38:07 -07:00
mono.txt
mutex-design.txt
nommu-mmap.txt
numastat.txt
oops-tracing.txt Use 'E' instead of 'X' for unsigned module taint flag. 2014-03-31 14:52:43 +10:30
padata.txt
parport-lowlevel.txt
parport.txt
percpu-rw-semaphore.txt
phy.txt
pi-futex.txt
pinctrl.txt
pnp.txt
preempt-locking.txt
printk-formats.txt
pwm.txt
ramoops.txt
rbtree.txt
remoteproc.txt
rfkill.txt
robust-futex-ABI.txt
robust-futexes.txt
rpmsg.txt
rt-mutex-design.txt
rt-mutex.txt
rtc.txt
SAK.txt
SecurityBugs
serial-console.txt
sgi-ioc4.txt
SM501.txt
smsc_ece1099.txt
sparse.txt
spinlocks.txt
stable_api_nonsense.txt
stable_kernel_rules.txt
static-keys.txt
SubmitChecklist
SubmittingDrivers
SubmittingPatches Documentation/SubmittingPatches: update some dead URLs 2014-04-03 16:21:27 -07:00
svga.txt
sysfs-rules.txt
sysrq.txt
this_cpu_ops.txt
unaligned-memory-access.txt
unicode.txt
unshare.txt
vfio.txt
VGA-softcursor.txt
vgaarbiter.txt
video-output.txt
vme_api.txt
volatile-considered-harmful.txt
workqueue.txt
ww-mutex-design.txt
xz.txt
zorro.txt