commit 8b4ed85b8404ffe7e10ee410c4df3968b86f0793 Author: Greg Kroah-Hartman Date: Thu Jan 9 12:25:15 2014 -0800 Linux 3.10.26 commit 0f9709f9c971003ff2a24315a3fe1b323495d3ef Author: Nobuhiro Iwamatsu Date: Thu Jan 2 12:58:53 2014 -0800 sh: add EXPORT_SYMBOL(min_low_pfn) and EXPORT_SYMBOL(max_low_pfn) to sh_ksyms_32.c commit ad70b029d2c678386384bd72c7fa2705c449b518 upstream. Min_low_pfn and max_low_pfn were used in pfn_valid macro if defined CONFIG_FLATMEM. When the functions that use the pfn_valid is used in driver module, max_low_pfn and min_low_pfn is to undefined, and fail to build. ERROR: "min_low_pfn" [drivers/block/aoe/aoe.ko] undefined! ERROR: "max_low_pfn" [drivers/block/aoe/aoe.ko] undefined! make[2]: *** [__modpost] Error 1 make[1]: *** [modules] Error 2 This patch fix this problem. Signed-off-by: Nobuhiro Iwamatsu Cc: Kuninori Morimoto Cc: Paul Mundt Cc: Geert Uytterhoeven Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 70139b6643db61809db76dd642af87979421d41d Author: Eric Whitney Date: Mon Jan 6 14:00:23 2014 -0500 ext4: fix bigalloc regression commit d0abafac8c9162f39c4f6b2f8141b772a09b3770 upstream. Commit f5a44db5d2 introduced a regression on filesystems created with the bigalloc feature (cluster size > blocksize). It causes xfstests generic/006 and /013 to fail with an unexpected JBD2 failure and transaction abort that leaves the test file system in a read only state. Other xfstests run on bigalloc file systems are likely to fail as well. The cause is the accidental use of a cluster mask where a cluster offset was needed in ext4_ext_map_blocks(). Signed-off-by: Eric Whitney Cc: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit 3655a197b1ea3ce989d34868768c5f4b6205061c Author: Catalin Marinas Date: Fri Nov 29 10:56:14 2013 +0000 arm64: Use Normal NonCacheable memory for writecombine commit 4f00130b70e5eee813cc7bc298e0f3fdf79673cc upstream. This provides better performance compared to Device GRE and also allows unaligned accesses. Such memory is intended to be used with standard RAM (e.g. framebuffers) and not I/O. Signed-off-by: Catalin Marinas Cc: Mark Brown Signed-off-by: Greg Kroah-Hartman commit fc54900e08d840fcfec9e5d2fba2c6f233aa49b9 Author: Catalin Marinas Date: Wed May 1 16:34:22 2013 +0100 arm64: Do not flush the D-cache for anonymous pages commit 7249b79f6b4cc3c2aa9138dca52e535a4c789107 upstream. The D-cache on AArch64 is VIPT non-aliasing, so there is no need to flush it for anonymous pages. Signed-off-by: Catalin Marinas Reported-by: Will Deacon Acked-by: Will Deacon Cc: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 1e616427f20943c9966296dfff9e7a2b825846aa Author: Catalin Marinas Date: Wed May 1 12:23:05 2013 +0100 arm64: Avoid cache flushing in flush_dcache_page() commit b5b6c9e9149d8a7c3f1d7b9d0c046c6184e1dd17 upstream. The flush_dcache_page() function is called when the kernel modified a page cache page. Since the D-cache on AArch64 does not have aliases this function can simply mark the page as dirty for later flushing via set_pte_at()/__sync_icache_dcache() if the page is executable (to ensure the I-D cache coherency). Signed-off-by: Catalin Marinas Reported-by: Will Deacon Acked-by: Will Deacon Cc: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 6e7be6fc3eb45928bb28ebbedf30a30ed82b35b5 Author: Mark Rutland Date: Tue Mar 26 13:41:35 2013 +0000 ARM: KVM: arch_timers: zero CNTVOFF upon return to host commit f793c23ebbe5afd1cabf4a42a3a297022213756f upstream. To use the virtual counters from the host, we need to ensure that CNTVOFF doesn't change unexpectedly. When we change to a guest, we replace the host's CNTVOFF, but we don't restore it when returning to the host. As the host sets CNTVOFF to zero, and never changes it, we can simply zero CNTVOFF when returning to the host. This patch adds said zeroing to the return to host path. Signed-off-by: Mark Rutland Acked-by: Marc Zyngier Acked-by: Santosh Shilimkar Acked-by: Christoffer Dall Cc: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 3db87d4c3af23cb360f443c1a8b0c0b4278b9f6b Author: Marc Zyngier Date: Wed Jan 30 18:17:49 2013 +0000 ARM: hyp: initialize CNTVOFF to zero commit 0af0b189abf73d232af782df2f999235cd2fed7f upstream. In order to be able to use the virtual counter in a safe way, make sure it is initialized to zero before dropping to SVC. Signed-off-by: Marc Zyngier Signed-off-by: Mark Rutland Acked-by: Santosh Shilimkar Cc: Dave Martin Cc: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 714c21cb90951905b269870087a99c37f3a7af0c Author: Mark Rutland Date: Wed Jan 30 17:51:26 2013 +0000 clocksource: arch_timer: use virtual counters commit 0d651e4e65e96989f72236bf83bd4c6e55eb6ce4 upstream. Switching between reading the virtual or physical counters is problematic, as some core code wants a view of time before we're fully set up. Using a function pointer and switching the source after the first read can make time appear to go backwards, and having a check in the read function is an unfortunate block on what we want to be a fast path. Instead, this patch makes us always use the virtual counters. If we're a guest, or don't have hyp mode, we'll use the virtual timers, and as such don't care about CNTVOFF as long as it doesn't change in such a way as to make time appear to travel backwards. As the guest will use the virtual timers, a (potential) KVM host must use the physical timers (which can wake up the host even if they fire while a guest is executing), and hence a host must have CNTVOFF set to zero so as to have a consistent view of time between the physical timers and virtual counters. Signed-off-by: Mark Rutland Acked-by: Catalin Marinas Acked-by: Marc Zyngier Acked-by: Santosh Shilimkar Cc: Rob Herring Cc: Mark Brown Signed-off-by: Greg Kroah-Hartman commit ae5092fadd50cfd0d29e762771d30cbf7f2c2fec Author: Catalin Marinas Date: Mon Sep 2 16:33:54 2013 +0100 arm64: Remove unused cpu_name ascii in arch/arm64/mm/proc.S commit f3a1d7d53dccf51959aec16b574617cc6bfeca09 upstream. This string has been moved to arch/arm64/kernel/cputable.c. Signed-off-by: Catalin Marinas Cc: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 29f37087bb51dfe4ff61daad60bd93f7bcfa3d72 Author: Catalin Marinas Date: Thu Nov 14 15:15:37 2013 +0000 arm64: dts: Reserve the memory used for secondary CPU release address commit df503ba7f653c590b475ab80bde788edf5af70d5 upstream. With the spin-table SMP booting method, secondary CPUs poll a location passed in the DT. The foundation-v8.dts file doesn't have this memory reserved and there is a risk of Linux using it before secondary CPUs are started. Signed-off-by: Catalin Marinas Cc: Mark Brown Signed-off-by: Greg Kroah-Hartman commit e2956ef5b5ddb3ede70962afe9297d8340787fa6 Author: AKASHI Takahiro Date: Thu Oct 3 06:47:44 2013 +0100 arm64: check for number of arguments in syscall_get/set_arguments() commit 7b22c03536a539142f931815528d55df455ffe2d upstream. In ftrace_syscall_enter(), syscall_get_arguments(..., 0, n, ...) if (i == 0) { ...; n--;} memcpy(..., n * sizeof(args[0])); If 'number of arguments(n)' is zero and 'argument index(i)' is also zero in syscall_get_arguments(), none of arguments should be copied by memcpy(). Otherwise 'n--' can be a big positive number and unexpected amount of data will be copied. Tracing system calls which take no argument, say sync(void), may hit this case and eventually make the system corrupted. This patch fixes the issue both in syscall_get_arguments() and syscall_set_arguments(). Signed-off-by: AKASHI Takahiro Acked-by: Will Deacon Signed-off-by: Catalin Marinas Cc: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 2bf5861acf602fe6201ac1f82955ac7283e56b6f Author: Jiang Liu Date: Fri Sep 27 09:04:41 2013 +0100 arm64: fix possible invalid FPSIMD initialization state commit 6db83cea1c975b9a102e17def7d2795814e1ae2b upstream. If context switching happens during executing fpsimd_flush_thread(), stale value in FPSIMD registers will be saved into current thread's fpsimd_state by fpsimd_thread_switch(). That may cause invalid initialization state for the new process, so disable preemption when executing fpsimd_flush_thread(). Signed-off-by: Jiang Liu Cc: Jiang Liu Signed-off-by: Catalin Marinas Cc: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 79f783f05539479676406d3d42c3d86bd203f083 Author: Feng Kan Date: Tue Jul 23 18:52:31 2013 +0100 arm64: Change kernel stack size to 16K commit 845ad05ec31e0f3872a321e10dbeaf872022632c upstream. Written by Catalin Marinas, tested by APM on storm platform. This is needed because of the failures encountered when running SpecWeb benchmark test. Signed-off-by: Feng Kan Acked-by: Kumar Sankaran Signed-off-by: Catalin Marinas Cc: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 5fb08df3dd1f7b8e83936808b042725a8b067562 Author: Mark Rutland Date: Tue Jul 9 15:16:06 2013 +0100 arm64: virt: ensure visibility of __boot_cpu_mode commit 82b2f495fba338d1e3098dde1df54944a9c19751 upstream. Secondary CPUs write to __boot_cpu_mode with caches disabled, and thus a cached value of __boot_cpu_mode may be incoherent with that in memory. This could lead to a failure to detect mismatched boot modes. This patch adds flushing to ensure that writes by secondaries to __boot_cpu_mode are made visible before we test against it. Signed-off-by: Mark Rutland Cc: Marc Zyngier Cc: Christoffer Dall Signed-off-by: Catalin Marinas Cc: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 81ca15a3e04961b54d4a0b395e681bffb6cfbc68 Author: Catalin Marinas Date: Fri Jul 19 15:08:15 2013 +0100 arm64: Only enable local interrupts after the CPU is marked online commit 53ae3acd4390ffeecb3a11dbd5be347b5a3d98f2 upstream. There is a slight chance that (timer) interrupts are triggered before a secondary CPU has been marked online with implications on softirq thread affinity. Signed-off-by: Catalin Marinas Reported-by: Kirill Tkhai Cc: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 08518f6fc6055cc72ea29bf35d6574a930674d7c Author: Josh Durgin Date: Wed Sep 4 17:57:31 2013 -0700 rbd: fix error handling from rbd_snap_name() commit da6a6b63978d45f9ae582d1f362f182012da3a22 upstream. rbd_snap_name() calls rbd_dev_v{1,2}_snap_name() depending on the format of the image. The format 1 version returns NULL on error, which is handled by the caller. The format 2 version returns an ERR_PTR, which the caller of rbd_snap_name() does not expect. Fortunately this is unlikely to occur in practice because rbd_snap_id_by_name() is called before rbd_snap_name(). This would hit similar errors to rbd_snap_name() (like the snapshot not existing) and return early, so rbd_snap_name() would not hit an error unless the snapshot was removed between the two calls or memory was exhausted. Use an ERR_PTR in rbd_dev_v1_snap_name() so that the specific error can be propagated, and it is consistent with rbd_dev_v2_snap_name(). Handle the ERR_PTR in the only rbd_snap_name() caller. Suggested-by: Alex Elder Signed-off-by: Josh Durgin Reviewed-by: Alex Elder Signed-off-by: Greg Kroah-Hartman commit 6fe77759c31c0f5fc36837549c5848470113509b Author: Josh Durgin Date: Thu Aug 29 19:16:42 2013 -0700 rbd: ignore unmapped snapshots that no longer exist commit efadc98aab674153709cc357ba565f04e3164fcd upstream. This prevents erroring out while adding a device when a snapshot unrelated to the current mapping is deleted between reading the snapshot context and reading the snapshot names. If the mapped snapshot name is not found an error still occurs as usual. Signed-off-by: Josh Durgin Reviewed-by: Alex Elder Signed-off-by: Greg Kroah-Hartman commit 5b213542db631f8b0bf7b257e8ae2d37b134895c Author: Josh Durgin Date: Thu Aug 29 17:26:31 2013 -0700 rbd: fix use-after free of rbd_dev->disk commit 9875201e10496612080e7d164acc8f625c18725c upstream. Removing a device deallocates the disk, unschedules the watch, and finally cleans up the rbd_dev structure. rbd_dev_refresh(), called from the watch callback, updates the disk size and rbd_dev structure. With no locking between them, rbd_dev_refresh() may use the device or rbd_dev after they've been freed. To fix this, check whether RBD_DEV_FLAG_REMOVING is set before updating the disk size in rbd_dev_refresh(). In order to prevent a race where rbd_dev_refresh() is already revalidating the disk when rbd_remove() is called, move the call to rbd_bus_del_dev() after the watch is unregistered and all notifies are complete. It's safe to defer deleting this structure because no new requests can be submitted once the RBD_DEV_FLAG_REMOVING is set, since the device cannot be opened. Fixes: http://tracker.ceph.com/issues/5636 Signed-off-by: Josh Durgin Reviewed-by: Alex Elder Signed-off-by: Greg Kroah-Hartman commit b10f19aaa9a8e818254731a6219754b5015d7588 Author: Josh Durgin Date: Thu Aug 29 17:36:03 2013 -0700 rbd: make rbd_obj_notify_ack() synchronous commit 20e0af67ce88c657d0601977b9941a2256afbdaa upstream. The only user of rbd_obj_notify_ack() is rbd_watch_cb(). It used asynchronously with no tracking of when the notify ack completes, so it may still be in progress when the osd_client is shut down. This results in a BUG() since the osd client assumes no requests are in flight when it stops. Since all notifies are flushed before the osd_client is stopped, waiting for the notify ack to complete before returning from the watch callback ensures there are no notify acks in flight during shutdown. Rename rbd_obj_notify_ack() to rbd_obj_notify_ack_sync() to reflect its new synchronous nature. Signed-off-by: Josh Durgin Reviewed-by: Alex Elder Signed-off-by: Greg Kroah-Hartman commit 282e0636dcf5ee3329d9266de64386f21dd4d7d6 Author: Josh Durgin Date: Thu Aug 29 17:31:03 2013 -0700 rbd: complete notifies before cleaning up osd_client and rbd_dev commit 9abc59908e0c5f983aaa91150da32d5b62cf60b7 upstream. To ensure rbd_dev is not used after it's released, flush all pending notify callbacks before calling rbd_dev_image_release(). No new notifies can be added to the queue at this point because the watch has already be unregistered with the osd_client. Signed-off-by: Josh Durgin Reviewed-by: Alex Elder Signed-off-by: Greg Kroah-Hartman commit a2e5951b11b406a83f84c1eb3b5d722491f4d883 Author: Josh Durgin Date: Wed Aug 28 21:43:09 2013 -0700 libceph: add function to ensure notifies are complete commit dd935f44a40f8fb02aff2cc0df2269c92422df1c upstream. Without a way to flush the osd client's notify workqueue, a watch event that is unregistered could continue receiving callbacks indefinitely. Unregistering the event simply means no new notifies are added to the queue, but there may still be events in the queue that will call the watch callback for the event. If the queue is flushed after the event is unregistered, the caller can be sure no more watch callbacks will occur for the canceled watch. Signed-off-by: Josh Durgin Reviewed-by: Sage Weil Reviewed-by: Alex Elder Signed-off-by: Greg Kroah-Hartman commit dd932ee7a2453d86c920d3bbd2938602c5c9aaca Author: Josh Durgin Date: Wed Aug 28 17:08:10 2013 -0700 rbd: fix null dereference in dout commit c35455791c1131e7ccbf56ea6fbdd562401c2ce2 upstream. The order parameter is sometimes NULL in _rbd_dev_v2_snap_size(), but the dout() always derefences it. Move this to another dout() protected by a check that order is non-NULL. Signed-off-by: Josh Durgin Reviewed-by: Sage Weil Reviewed-by: Alex Elder Signed-off-by: Greg Kroah-Hartman commit b33b7132f0feeb0716390330bd5d53fabe057f25 Author: Josh Durgin Date: Tue Aug 27 14:45:46 2013 -0700 rbd: fix buffer size for writes to images with snapshots commit 03507db631c94a48e316c7f638ffb2991544d617 upstream. rbd_osd_req_create() needs to know the snapshot context size to create a buffer large enough to send it with the message front. It gets this from the img_request, which was not set for the obj_request yet. This resulted in trying to write past the end of the front payload, hitting this BUG: libceph: BUG_ON(p > msg->front.iov_base + msg->front.iov_len); Fix this by associating the obj_request with its img_request immediately after it's created, before the osd request is created. Fixes: http://tracker.ceph.com/issues/5760 Suggested-by: Alex Elder Signed-off-by: Josh Durgin Reviewed-by: Alex Elder Signed-off-by: Greg Kroah-Hartman commit 30379f4553bae684a928c6bdd9dc606fd0ef6093 Author: majianpeng Date: Wed Aug 21 15:02:51 2013 +0800 ceph: allow sync_read/write return partial successed size of read/write. commit ee7289bfadda5f4ef60884547ebc9989c8fb314a upstream. For sync_read/write, it may do multi stripe operations.If one of those met erro, we return the former successed size rather than a error value. There is a exception for write-operation met -EOLDSNAPC.If this occur,we retry the whole write again. Signed-off-by: Jianpeng Ma Signed-off-by: Greg Kroah-Hartman commit 533bc2950e085ee99225655f60276760c00af31f Author: majianpeng Date: Tue Aug 6 16:20:38 2013 +0800 ceph: fix bugs about handling short-read for sync read mode. commit 02ae66d8b229708fd94b764f6c17ead1c7741fcf upstream. cephfs . show_layout >layyout.data_pool: 0 >layout.object_size: 4194304 >layout.stripe_unit: 4194304 >layout.stripe_count: 1 TestA: >dd if=/dev/urandom of=test bs=1M count=2 oflag=direct >dd if=/dev/urandom of=test bs=1M count=2 seek=4 oflag=direct >dd if=test of=/dev/null bs=6M count=1 iflag=direct The messages from func striped_read are: ceph: file.c:350 : striped_read 0~6291456 (read 0) got 2097152 HITSTRIPE SHORT ceph: file.c:350 : striped_read 2097152~4194304 (read 2097152) got 0 HITSTRIPE SHORT ceph: file.c:381 : zero tail 4194304 ceph: file.c:390 : striped_read returns 6291456 The hole of file is from 2M--4M.But actualy it zero the last 4M include the last 2M area which isn't a hole. Using this patch, the messages are: ceph: file.c:350 : striped_read 0~6291456 (read 0) got 2097152 HITSTRIPE SHORT ceph: file.c:358 : zero gap 2097152 to 4194304 ceph: file.c:350 : striped_read 4194304~2097152 (read 4194304) got 2097152 ceph: file.c:384 : striped_read returns 6291456 TestB: >echo majianpeng > test >dd if=test of=/dev/null bs=2M count=1 iflag=direct The messages are: ceph: file.c:350 : striped_read 0~6291456 (read 0) got 11 HITSTRIPE SHORT ceph: file.c:350 : striped_read 11~6291445 (read 11) got 0 HITSTRIPE SHORT ceph: file.c:390 : striped_read returns 11 For this case,it did once more striped_read.It's no meaningless. Using this patch, the message are: ceph: file.c:350 : striped_read 0~6291456 (read 0) got 11 HITSTRIPE SHORT ceph: file.c:384 : striped_read returns 11 Big thanks to Yan Zheng for the patch. Reviewed-by: Yan, Zheng Signed-off-by: Jianpeng Ma Signed-off-by: Greg Kroah-Hartman commit 3834fb30d260b8fa6401fff271c65cd29ba94424 Author: Dan Carpenter Date: Thu Aug 15 08:58:59 2013 +0300 libceph: create_singlethread_workqueue() doesn't return ERR_PTRs commit dbcae088fa660086bde6e10d63bb3c9264832d85 upstream. create_singlethread_workqueue() returns NULL on error, and it doesn't return ERR_PTRs. I tweaked the error handling a little to be consistent with earlier in the function. Signed-off-by: Dan Carpenter Reviewed-by: Sage Weil Signed-off-by: Greg Kroah-Hartman commit 53341d7de3f013cc39a3692bdfd02032ce722dff Author: Dan Carpenter Date: Thu Aug 15 08:52:48 2013 +0300 libceph: potential NULL dereference in ceph_osdc_handle_map() commit b72e19b9225d4297a18715b0998093d843d170fa upstream. There are two places where we read "nr_maps" if both of them are set to zero then we would hit a NULL dereference here. Signed-off-by: Dan Carpenter Reviewed-by: Sage Weil Signed-off-by: Greg Kroah-Hartman commit e9e4b13abe4e4a399ed115b858ba8471854a938b Author: Dan Carpenter Date: Thu Aug 15 08:51:58 2013 +0300 libceph: fix error handling in handle_reply() commit 1874119664dafda3ef2ed9b51b4759a9540d4a1a upstream. We've tried to fix the error paths in this function before, but there is still a hidden goto in the ceph_decode_need() macro which goes to the wrong place. We need to release the "req" and unlock a mutex before returning. Signed-off-by: Dan Carpenter Reviewed-by: Sage Weil Signed-off-by: Greg Kroah-Hartman commit 01bb1b04a1dd3736d20b859a9bcf3d7509566c1f Author: majianpeng Date: Fri Aug 2 18:14:48 2013 +0800 ceph: Add check returned value on func ceph_calc_ceph_pg. commit 2fbcbff1d6b9243ef71c64a8ab993bc3c7bb7af1 upstream. Func ceph_calc_ceph_pg maybe failed.So add check for returned value. Signed-off-by: Jianpeng Ma Reviewed-by: Sage Weil Signed-off-by: Sage Weil Signed-off-by: Greg Kroah-Hartman commit 10debbda102fe52eb75ba531b8ee79e98d3b8774 Author: Dan Carpenter Date: Tue Jul 23 16:48:01 2013 +0300 ceph: cleanup types in striped_read() commit 688bac461ba3e9d221a879ab40b687f5d7b5b19c upstream. We pass in a u64 value for "len" and then immediately truncate away the upper 32 bits. Signed-off-by: Dan Carpenter Reviewed-by: Sage Weil Reviewed-by: Alex Elder Signed-off-by: Greg Kroah-Hartman commit acb2e930c72a1d38504d6d9c1f338562bbb9aa55 Author: Nathaniel Yazdani Date: Sun Aug 4 21:04:30 2013 -0700 ceph: fix null pointer dereference commit c338c07c51e3106711fad5eb599e375eadb6855d upstream. When register_session() is given an out-of-range argument for mds, ceph_mdsmap_get_addr() will return a null pointer, which would be given to ceph_con_open() & be dereferenced, causing a kernel oops. This fixes bug #4685 in the Ceph bug tracker . Signed-off-by: Nathaniel Yazdani Reviewed-by: Sage Weil Signed-off-by: Greg Kroah-Hartman commit ef0b67e4b1f57d14f314a206338e5dd806a08b21 Author: Yan, Zheng Date: Mon Jun 24 14:41:27 2013 +0800 libceph: call r_unsafe_callback when unsafe reply is received commit 61c5d6bf7074ee32d014dcdf7698dc8c59eb712d upstream. We can't use !req->r_sent to check if OSD request is sent for the first time, this is because __cancel_request() zeros req->r_sent when OSD map changes. Rather than adding a new variable to struct ceph_osd_request to indicate if it's sent for the first time, We can call the unsafe callback only when unsafe OSD reply is received. If OSD's first reply is safe, just skip calling the unsafe callback. The purpose of unsafe callback is adding unsafe request to a list, so that fsync(2) can wait for the safe reply. fsync(2) doesn't need to wait for a write(2) that hasn't returned yet. So it's OK to add request to the unsafe list when the first OSD reply is received. (ceph_sync_write() returns after receiving the first OSD reply) Signed-off-by: Yan, Zheng Reviewed-by: Sage Weil Signed-off-by: Greg Kroah-Hartman commit 724184c7a1fc44955f8900c382f18545423bc486 Author: Sasha Levin Date: Mon Jul 1 18:33:39 2013 -0400 ceph: avoid accessing invalid memory commit 5446429630257f4723829409337a26c076907d5d upstream. when mounting ceph with a dev name that starts with a slash, ceph would attempt to access the character before that slash. Since we don't actually own that byte of memory, we would trigger an invalid access: [ 43.499934] BUG: unable to handle kernel paging request at ffff880fa3a97fff [ 43.500984] IP: [] parse_mount_options+0x1a4/0x300 [ 43.501491] PGD 743b067 PUD 10283c4067 PMD 10282a6067 PTE 8000000fa3a97060 [ 43.502301] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC [ 43.503006] Dumping ftrace buffer: [ 43.503596] (ftrace buffer empty) [ 43.504046] CPU: 0 PID: 10879 Comm: mount Tainted: G W 3.10.0-sasha #1129 [ 43.504851] task: ffff880fa625b000 ti: ffff880fa3412000 task.ti: ffff880fa3412000 [ 43.505608] RIP: 0010:[] [] parse_mount_options$ [ 43.506552] RSP: 0018:ffff880fa3413d08 EFLAGS: 00010286 [ 43.507133] RAX: ffff880fa3a98000 RBX: ffff880fa3a98000 RCX: 0000000000000000 [ 43.507893] RDX: ffff880fa3a98001 RSI: 000000000000002f RDI: ffff880fa3a98000 [ 43.508610] RBP: ffff880fa3413d58 R08: 0000000000001f99 R09: ffff880fa3fe64c0 [ 43.509426] R10: ffff880fa3413d98 R11: ffff880fa38710d8 R12: ffff880fa3413da0 [ 43.509792] R13: ffff880fa3a97fff R14: 0000000000000000 R15: ffff880fa3413d90 [ 43.509792] FS: 00007fa9c48757e0(0000) GS:ffff880fd2600000(0000) knlGS:000000000000$ [ 43.509792] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 43.509792] CR2: ffff880fa3a97fff CR3: 0000000fa3bb9000 CR4: 00000000000006b0 [ 43.509792] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 43.509792] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 43.509792] Stack: [ 43.509792] 0000e5180000000e ffffffff85ca1900 ffff880fa38710d8 ffff880fa3413d98 [ 43.509792] 0000000000000120 0000000000000000 ffff880fa3a98000 0000000000000000 [ 43.509792] ffffffff85cf32a0 0000000000000000 ffff880fa3413dc8 ffffffff818f3c72 [ 43.509792] Call Trace: [ 43.509792] [] ceph_mount+0xa2/0x390 [ 43.509792] [] ? pcpu_alloc+0x334/0x3c0 [ 43.509792] [] mount_fs+0x8d/0x1a0 [ 43.509792] [] ? __alloc_percpu+0x10/0x20 [ 43.509792] [] vfs_kern_mount+0x79/0x100 [ 43.509792] [] do_new_mount+0xcd/0x1c0 [ 43.509792] [] do_mount+0x15d/0x210 [ 43.509792] [] ? strndup_user+0x45/0x60 [ 43.509792] [] SyS_mount+0x9d/0xe0 [ 43.509792] [] tracesys+0xdd/0xe2 [ 43.509792] Code: 4c 8b 5d c0 74 0a 48 8d 50 01 49 89 14 24 eb 17 31 c0 48 83 c9 ff $ [ 43.509792] RIP [] parse_mount_options+0x1a4/0x300 [ 43.509792] RSP [ 43.509792] CR2: ffff880fa3a97fff [ 43.509792] ---[ end trace 22469cd81e93af51 ]--- Signed-off-by: Sasha Levin Reviewed-by: Sage Weil Signed-off-by: Greg Kroah-Hartman commit d5f9a684f9f21324fa58e97be127589db6c06a56 Author: majianpeng Date: Tue Jun 25 14:48:19 2013 +0800 ceph: Free mdsc if alloc mdsc->mdsmap failed. commit fb3101b6f0db9ae3f35dc8e6ec908d0af8cdf12e upstream. Signed-off-by: Jianpeng Ma Reviewed-by: Sage Weil Signed-off-by: Greg Kroah-Hartman commit 9a640548c96a4cc18394550ecce2241186f7357f Author: Sage Weil Date: Sun Jun 9 08:40:39 2013 -0700 rbd: fix a couple warnings commit e976cad0f0dbe5440a4ca38e29e1f932d9319125 upstream. gcc isn't quite smart enough and generates these warnings: drivers/block/rbd.c: In function 'rbd_img_request_fill': drivers/block/rbd.c:1266:22: warning: 'bio_list' may be used uninitialized in this function [-Wmaybe-uninitialized] drivers/block/rbd.c:2186:14: note: 'bio_list' was declared here drivers/block/rbd.c:2247:10: warning: 'pages' may be used uninitialized in this function [-Wmaybe-uninitialized] even though they are initialized for their respective code paths. Signed-off-by: Sage Weil Signed-off-by: Greg Kroah-Hartman commit 1f69fb068f24ac8bdf7b404dd49d0da7fb1a0d24 Author: Yan, Zheng Date: Sun Jun 2 18:40:23 2013 +0800 libceph: fix truncate size calculation commit ccca4e37b1a912da3db68aee826557ea66145273 upstream. check the "not truncated yet" case Signed-off-by: Yan, Zheng Reviewed-by: Sage Weil Signed-off-by: Greg Kroah-Hartman commit aede2cb5c95588e703e358239a4f3842e21f103e Author: Yan, Zheng Date: Fri May 31 15:54:44 2013 +0800 libceph: fix safe completion commit eb845ff13a44477f8a411baedbf11d678b9daf0a upstream. handle_reply() calls complete_request() only if the first OSD reply has ONDISK flag. Signed-off-by: Yan, Zheng Reviewed-by: Sage Weil Signed-off-by: Greg Kroah-Hartman commit 7aa73ee13251af1534de166ea5379f32a6bf7793 Author: Alex Elder Date: Fri May 31 17:40:44 2013 -0500 rbd: protect against concurrent unmaps commit 82a442d239695a242c4d584464c9606322cd02aa upstream. Make sure two concurrent unmap operations on the same rbd device won't collide, by only proceeding with the removal and cleanup of a device if is not already underway. Signed-off-by: Alex Elder Reviewed-by: Josh Durgin Signed-off-by: Greg Kroah-Hartman commit c4d00f5b3e1f48f99d99d813fe8071c719e1790b Author: Alex Elder Date: Fri May 31 15:17:01 2013 -0500 rbd: set removing flag while holding list lock commit 751cc0e3cfabdda87c4c21519253c6751e97a8d4 upstream. When unmapping a device, its id is supplied, and that is used to look up which rbd device should be unmapped. Looking up the device involves searching the rbd device list while holding a spinlock that protects access to that list. Currently all of this is done under protection of the control lock, but that protection is going away soon. To ensure the rbd_dev is still valid (still on the list) while setting its REMOVING flag, do so while still holding the list lock. To do so, get rid of __rbd_get_dev(), and open code what it did in the one place it was used. Signed-off-by: Alex Elder Reviewed-by: Josh Durgin Signed-off-by: Greg Kroah-Hartman commit 350505e73f8c4d03dfef660f40a6f35d5ac12be6 Author: Alex Elder Date: Wed May 22 20:54:25 2013 -0500 rbd: flush dcache after zeroing page data commit e215605417b87732c6debf65da6d953016a1e5bc upstream. Neither zero_bio_chain() nor zero_pages() contains a call to flush caches after zeroing a portion of a page. This can cause problems on architectures that have caches that allow virtual address aliasing. This resolves: http://tracker.ceph.com/issues/4777 Signed-off-by: Alex Elder Reviewed-by: Josh Durgin Signed-off-by: Greg Kroah-Hartman commit 1d02ec7ffd7577d07c91d2cbb391386d9a129fda Author: Alex Elder Date: Wed May 22 20:54:25 2013 -0500 libceph: add lingering request reference when registered commit 96e4dac66f69d28af2b736e723364efbbdf9fdee upstream. When an osd request is set to linger, the osd client holds onto the request so it can be re-submitted following certain osd map changes. The osd client holds a reference to the request until it is unregistered. This is used by rbd for watch requests. Currently, the reference is taken when the request is marked with the linger flag. This means that if an error occurs after that time but before the the request completes successfully, that reference is leaked. There's really no reason to take the reference until the request is registered in the the osd client's list of lingering requests, and that only happens when the lingering (watch) request completes successfully. So take that reference only when it gets registered following succesful completion, and drop it (as before) when the request gets unregistered. This avoids the reference problem on error in rbd. Rearrange ceph_osdc_unregister_linger_request() to avoid using the request pointer after it may have been freed. And hold an extra reference in kick_requests() while handling a linger request that has not yet been registered, to ensure it doesn't go away. This resolves: http://tracker.ceph.com/issues/3859 Signed-off-by: Alex Elder Reviewed-by: Josh Durgin Signed-off-by: Greg Kroah-Hartman commit f1b6559cfa8c280249b52112223459b00dcdb6e9 Author: Emil Goode Date: Tue May 28 16:59:00 2013 +0200 ceph: improve error handling in ceph_mdsmap_decode commit c213b50b7dcbf06abcfbf1e4eee5b76586718bd9 upstream. This patch makes the following improvements to the error handling in the ceph_mdsmap_decode function: - Add a NULL check for return value from kcalloc - Make use of the variable err Signed-off-by: Emil Goode Signed-off-by: Sage Weil Signed-off-by: Greg Kroah-Hartman commit 3da8d9a6ae084d013bfe1b47c319380a91c7012a Author: Dinh Nguyen Date: Tue Dec 10 19:49:18 2013 +0100 clocksource: dw_apb_timer_of: Fix read_sched_clock commit 85dc6ee1237c8a4a7742e6abab96a20389b7d682 upstream. The read_sched_clock should return the ~value because the clock is a countdown implementation. read_sched_clock() should be the same as __apbt_read_clocksource(). Signed-off-by: Dinh Nguyen Signed-off-by: Daniel Lezcano Signed-off-by: Greg Kroah-Hartman commit 0fdb9385a5909c218fa2cf0cf62896ebe0fcf30e Author: Paul Moore Date: Tue Dec 10 14:58:01 2013 -0500 selinux: process labeled IPsec TCP SYN-ACK packets properly in selinux_ip_postroute() commit c0828e50485932b7e019df377a6b0a8d1ebd3080 upstream. Due to difficulty in arriving at the proper security label for TCP SYN-ACK packets in selinux_ip_postroute(), we need to check packets while/before they are undergoing XFRM transforms instead of waiting until afterwards so that we can determine the correct security label. Reported-by: Janak Desai Signed-off-by: Paul Moore Signed-off-by: Greg Kroah-Hartman commit 070357081f37bb70ff1a09630c50529188846280 Author: Paul Moore Date: Tue Dec 10 14:57:54 2013 -0500 selinux: look for IPsec labels on both inbound and outbound packets commit 817eff718dca4e54d5721211ddde0914428fbb7c upstream. Previously selinux_skb_peerlbl_sid() would only check for labeled IPsec security labels on inbound packets, this patch enables it to check both inbound and outbound traffic for labeled IPsec security labels. Reported-by: Janak Desai Signed-off-by: Paul Moore Signed-off-by: Greg Kroah-Hartman commit 774d75ec4ec10aba883bf1732108e3cb9eeadd54 Author: Geert Uytterhoeven Date: Wed Dec 18 17:08:48 2013 -0800 sh: always link in helper functions extracted from libgcc commit 84ed8a99058e61567f495cc43118344261641c5f upstream. E.g. landisk_defconfig, which has CONFIG_NTFS_FS=m: ERROR: "__ashrdi3" [fs/ntfs/ntfs.ko] undefined! For "lib-y", if no symbols in a compilation unit are referenced by other units, the compilation unit will not be included in vmlinux. This breaks modules that do reference those symbols. Use "obj-y" instead to fix this. http://kisskb.ellerman.id.au/kisskb/buildresult/8838077/ This doesn't fix all cases. There are others, e.g. udivsi3. This is also not limited to sh, many architectures handle this in the same way. A simple solution is to unconditionally include all helper functions. A more complex solution is to make the choice of "lib-y" or "obj-y" depend on CONFIG_MODULES: obj-$(CONFIG_MODULES) += ... lib-y($CONFIG_MODULES) += ... Signed-off-by: Geert Uytterhoeven Cc: Paul Mundt Tested-by: Nobuhiro Iwamatsu Reviewed-by: Nobuhiro Iwamatsu Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit db74f02285dc0a6f649d7f44e97e1219af22ed76 Author: Stephen Boyd Date: Tue Dec 10 15:19:03 2013 -0800 gpio: msm: Fix irq mask/unmask by writing bits instead of numbers commit 4cc629b7a20945ce35628179180329b6bc9e552b upstream. We should be writing bits here but instead we're writing the numbers that correspond to the bits we want to write. Fix it by wrapping the numbers in the BIT() macro. This fixes gpios acting as interrupts. Signed-off-by: Stephen Boyd Signed-off-by: Linus Walleij Signed-off-by: Greg Kroah-Hartman commit d459f7e344ccdbdcba68192710642f90297de5b4 Author: Roger Quadros Date: Thu Dec 5 11:23:35 2013 +0200 gpio: twl4030: Fix regression for twl gpio LED output commit f5837ec11f8cfa6d53ebc5806582771b2c9988c6 upstream. Commit 0b2aa8be introduced a regression that causes failure in setting LED GPO direction to OUT. This causes USB host probe failures for Beagleboard C4. platform usb_phy_gen_xceiv.2: Driver usb_phy_gen_xceiv requests probe deferral hsusb2_vcc: Failed to request enable GPIO510: -22 reg-fixed-voltage reg-fixed-voltage.0.auto: Failed to register regulator: -22 reg-fixed-voltage: probe of reg-fixed-voltage.0.auto failed with error -22 direction_out/direction_in must return 0 if the operation succeeded. Also, don't update direction flag and output data if twl4030_set_gpio_direction() failed inside twl_direction_out(); Signed-off-by: Roger Quadros Acked-by: Tony Lindgren Signed-off-by: Linus Walleij Signed-off-by: Greg Kroah-Hartman commit ceed0859339443bc4fa22b9e0a825028ba8cd330 Author: Theodore Ts'o Date: Sun Dec 8 21:12:59 2013 -0500 jbd2: don't BUG but return ENOSPC if a handle runs out of space commit f6c07cad081ba222d63623d913aafba5586c1d2c upstream. If a handle runs out of space, we currently stop the kernel with a BUG in jbd2_journal_dirty_metadata(). This makes it hard to figure out what might be going on. So return an error of ENOSPC, so we can let the file system layer figure out what is going on, to make it more likely we can get useful debugging information). This should make it easier to debug problems such as the one which was reported by: https://bugzilla.kernel.org/show_bug.cgi?id=44731 The only two callers of this function are ext4_handle_dirty_metadata() and ocfs2_journal_dirty(). The ocfs2 function will trigger a BUG_ON(), which means there will be no change in behavior. The ext4 function will call ext4_error_inode() which will print the useful debugging information and then handle the situation using ext4's error handling mechanisms (i.e., which might mean halting the kernel or remounting the file system read-only). Also, since both file systems already call WARN_ON(), drop the WARN_ON from jbd2_journal_dirty_metadata() to avoid two stack traces from being displayed. Signed-off-by: "Theodore Ts'o" Cc: ocfs2-devel@oss.oracle.com Acked-by: Joel Becker Signed-off-by: Greg Kroah-Hartman commit 2202b3646c440e775d0e630c784b295b612dae0b Author: Martin Schwidefsky Date: Wed Dec 18 14:36:18 2013 +0100 s390/3270: fix allocation of tty3270_screen structure commit 36d9f4d3b68c7035ead3850dc85f310a579ed0eb upstream. The tty3270_alloc_screen function is called from tty3270_install with swapped arguments, the number of columns instead of rows and vice versa. The number of rows is typically smaller than the number of columns which makes the screen array too big but the individual cell arrays for the lines too small. Creating lines longer than the number of rows will clobber the memory after the end of the cell array. The fix is simple, call tty3270_alloc_screen with the correct argument order. Signed-off-by: Martin Schwidefsky Signed-off-by: Greg Kroah-Hartman commit 76fca2297af40a10ec000ecb08cac41b51fbccc9 Author: Vladimir Davydov Date: Thu Jan 2 12:58:47 2014 -0800 memcg: fix memcg_size() calculation commit 695c60830764945cf61a2cc623eb1392d137223e upstream. The mem_cgroup structure contains nr_node_ids pointers to mem_cgroup_per_node objects, not the objects themselves. Signed-off-by: Vladimir Davydov Acked-by: Michal Hocko Cc: Glauber Costa Cc: Johannes Weiner Cc: Balbir Singh Cc: KAMEZAWA Hiroyuki Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 8553459e73c6da3e5b9da9239dd8ef017181252a Author: Steven Whitehouse Date: Wed Dec 18 14:14:52 2013 +0000 GFS2: Fix incorrect invalidation for DIO/buffered I/O commit dfd11184d894cd0a92397b25cac18831a1a6a5bc upstream. In patch 209806aba9d540dde3db0a5ce72307f85f33468f we allowed local deferred locks to be granted against a cached exclusive lock. That opened up a corner case which this patch now fixes. The solution to the problem is to check whether we have cached pages each time we do direct I/O and if so to unmap, flush and invalidate those pages. Since the glock state machine normally does that for us, mostly the code will be a no-op. Signed-off-by: Steven Whitehouse Signed-off-by: Greg Kroah-Hartman commit 21b6291b88c534d8c6018ad483e232abab63f644 Author: Steven Whitehouse Date: Fri Dec 6 11:52:34 2013 +0000 GFS2: don't hold s_umount over blkdev_put commit dfe5b9ad83a63180f358b27d1018649a27b394a9 upstream. This is a GFS2 version of Tejun's patch: 4f331f01b9c43bf001d3ffee578a97a1e0633eac vfs: don't hold s_umount over close_bdev_exclusive() call In this case its blkdev_put itself that is the issue and this patch uses the same solution of dropping and retaking s_umount. Reported-by: Tejun Heo Reported-by: Al Viro Signed-off-by: Steven Whitehouse Signed-off-by: Greg Kroah-Hartman commit 2d8ccbd72d8e44d40daa7500a5a78cc8ccad02f1 Author: Dmitry Torokhov Date: Thu Dec 26 17:44:29 2013 -0800 Input: allocate absinfo data when setting ABS capability commit 28a2a2e1aedbe2d8b2301e6e0e4e63f6e4177aca upstream. We need to make sure we allocate absinfo data when we are setting one of EV_ABS/ABS_XXX capabilities, otherwise we may bomb when we try to emit this event. Rested-by: Paul Cercueil Signed-off-by: Dmitry Torokhov Signed-off-by: Greg Kroah-Hartman commit c074c8f4a7135a4175a73a8b87564c68b71746fe Author: Naoya Horiguchi Date: Thu Jan 2 12:58:51 2014 -0800 mm/memory-failure.c: transfer page count from head page to tail page after split thp commit a3e0f9e47d5ef7858a26cc12d90ad5146e802d47 upstream. Memory failures on thp tail pages cause kernel panic like below: mce: [Hardware Error]: Machine check events logged MCE exception done on CPU 7 BUG: unable to handle kernel NULL pointer dereference at 0000000000000058 IP: [] dequeue_hwpoisoned_huge_page+0x131/0x1e0 PGD bae42067 PUD ba47d067 PMD 0 Oops: 0000 [#1] SMP ... CPU: 7 PID: 128 Comm: kworker/7:2 Tainted: G M O 3.13.0-rc4-131217-1558-00003-g83b7df08e462 #25 ... Call Trace: me_huge_page+0x3e/0x50 memory_failure+0x4bb/0xc20 mce_process_work+0x3e/0x70 process_one_work+0x171/0x420 worker_thread+0x11b/0x3a0 ? manage_workers.isra.25+0x2b0/0x2b0 kthread+0xe4/0x100 ? kthread_create_on_node+0x190/0x190 ret_from_fork+0x7c/0xb0 ? kthread_create_on_node+0x190/0x190 ... RIP dequeue_hwpoisoned_huge_page+0x131/0x1e0 CR2: 0000000000000058 The reasoning of this problem is shown below: - when we have a memory error on a thp tail page, the memory error handler grabs a refcount of the head page to keep the thp under us. - Before unmapping the error page from processes, we split the thp, where page refcounts of both of head/tail pages don't change. - Then we call try_to_unmap() over the error page (which was a tail page before). We didn't pin the error page to handle the memory error, this error page is freed and removed from LRU list. - We never have the error page on LRU list, so the first page state check returns "unknown page," then we move to the second check with the saved page flag. - The saved page flag have PG_tail set, so the second page state check returns "hugepage." - We call me_huge_page() for freed error page, then we hit the above panic. The root cause is that we didn't move refcount from the head page to the tail page after split thp. So this patch suggests to do this. This panic was introduced by commit 524fca1e73 ("HWPOISON: fix misjudgement of page_action() for errors on mlocked pages"). Note that we did have the same refcount problem before this commit, but it was just ignored because we had only first page state check which returned "unknown page." The commit changed the refcount problem from "doesn't work" to "kernel panic." Signed-off-by: Naoya Horiguchi Reviewed-by: Wanpeng Li Cc: Andi Kleen Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit abdd4b8ac01323900fcda70f6d9ee94b873d66e3 Author: Rik van Riel Date: Thu Jan 2 12:58:46 2014 -0800 mm: fix use-after-free in sys_remap_file_pages commit 4eb919825e6c3c7fb3630d5621f6d11e98a18b3a upstream. remap_file_pages calls mmap_region, which may merge the VMA with other existing VMAs, and free "vma". This can lead to a use-after-free bug. Avoid the bug by remembering vm_flags before calling mmap_region, and not trying to dereference vma later. Signed-off-by: Rik van Riel Reported-by: Dmitry Vyukov Cc: PaX Team Cc: Kees Cook Cc: Michel Lespinasse Cc: Cyrill Gorcunov Cc: Hugh Dickins Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 398bbc5710f23c0db6ebd4dd30bb331086bd16e7 Author: Jianguo Wu Date: Wed Dec 18 17:08:59 2013 -0800 mm/hugetlb: check for pte NULL pointer in __page_check_address() commit 98398c32f6687ee1e1f3ae084effb4b75adb0747 upstream. In __page_check_address(), if address's pud is not present, huge_pte_offset() will return NULL, we should check the return value. Signed-off-by: Jianguo Wu Cc: Naoya Horiguchi Cc: Mel Gorman Cc: qiuxishi Cc: Hanjun Guo Acked-by: Kirill A. Shutemov Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 31eb5f24b9a0aeb058bef453c14328fb29e63501 Author: Joonsoo Kim Date: Wed Dec 18 17:08:52 2013 -0800 mm/compaction: respect ignore_skip_hint in update_pageblock_skip commit 6815bf3f233e0b10c99a758497d5d236063b010b upstream. update_pageblock_skip() only fits to compaction which tries to isolate by pageblock unit. If isolate_migratepages_range() is called by CMA, it try to isolate regardless of pageblock unit and it don't reference get_pageblock_skip() by ignore_skip_hint. We should also respect it on update_pageblock_skip() to prevent from setting the wrong information. Signed-off-by: Joonsoo Kim Acked-by: Vlastimil Babka Reviewed-by: Naoya Horiguchi Reviewed-by: Wanpeng Li Cc: Christoph Lameter Cc: Rafael Aquini Cc: Vlastimil Babka Cc: Wanpeng Li Cc: Mel Gorman Cc: Rik van Riel Cc: Zhang Yanfei Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 5d8e03b2544c3cc962106f63c9d60578ad8a4c91 Author: Mel Gorman Date: Wed Dec 18 17:08:45 2013 -0800 mm: numa: guarantee that tlb_flush_pending updates are visible before page table updates commit af2c1401e6f9177483be4fad876d0073669df9df upstream. According to documentation on barriers, stores issued before a LOCK can complete after the lock implying that it's possible tlb_flush_pending can be visible after a page table update. As per revised documentation, this patch adds a smp_mb__before_spinlock to guarantee the correct ordering. Signed-off-by: Mel Gorman Acked-by: Paul E. McKenney Reviewed-by: Rik van Riel Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit d303cf4624824971d94b4e2c7c95df052d14aa81 Author: Rik van Riel Date: Wed Dec 18 17:08:44 2013 -0800 mm: fix TLB flush race between migration, and change_protection_range commit 20841405940e7be0617612d521e206e4b6b325db upstream. There are a few subtle races, between change_protection_range (used by mprotect and change_prot_numa) on one side, and NUMA page migration and compaction on the other side. The basic race is that there is a time window between when the PTE gets made non-present (PROT_NONE or NUMA), and the TLB is flushed. During that time, a CPU may continue writing to the page. This is fine most of the time, however compaction or the NUMA migration code may come in, and migrate the page away. When that happens, the CPU may continue writing, through the cached translation, to what is no longer the current memory location of the process. This only affects x86, which has a somewhat optimistic pte_accessible. All other architectures appear to be safe, and will either always flush, or flush whenever there is a valid mapping, even with no permissions (SPARC). The basic race looks like this: CPU A CPU B CPU C load TLB entry make entry PTE/PMD_NUMA fault on entry read/write old page start migrating page change PTE/PMD to new page read/write old page [*] flush TLB reload TLB from new entry read/write new page lose data [*] the old page may belong to a new user at this point! The obvious fix is to flush remote TLB entries, by making sure that pte_accessible aware of the fact that PROT_NONE and PROT_NUMA memory may still be accessible if there is a TLB flush pending for the mm. This should fix both NUMA migration and compaction. [mgorman@suse.de: fix build] Signed-off-by: Rik van Riel Signed-off-by: Mel Gorman Cc: Alex Thorlton Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 57f74b6ecebf59991677dd2da0f0433e8be6c945 Author: Oleg Nesterov Date: Mon Aug 12 18:14:00 2013 +0200 sched: fix the theoretical signal_wake_up() vs schedule() race commit e0acd0a68ec7dbf6b7a81a87a867ebd7ac9b76c4 upstream. This is only theoretical, but after try_to_wake_up(p) was changed to check p->state under p->pi_lock the code like __set_current_state(TASK_INTERRUPTIBLE); schedule(); can miss a signal. This is the special case of wait-for-condition, it relies on try_to_wake_up/schedule interaction and thus it does not need mb() between __set_current_state() and if(signal_pending). However, this __set_current_state() can move into the critical section protected by rq->lock, now that try_to_wake_up() takes another lock we need to ensure that it can't be reordered with "if (signal_pending(current))" check inside that section. The patch is actually one-liner, it simply adds smp_wmb() before spin_lock_irq(rq->lock). This is what try_to_wake_up() already does by the same reason. We turn this wmb() into the new helper, smp_mb__before_spinlock(), for better documentation and to allow the architectures to change the default implementation. While at it, kill smp_mb__after_lock(), it has no callers. Perhaps we can also add smp_mb__before/after_spinunlock() for prepare_to_wait(). Signed-off-by: Oleg Nesterov Acked-by: Peter Zijlstra Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit a29ccdd1b5a61fad7d4883b3ef63da3a313f1e44 Author: Mel Gorman Date: Wed Dec 18 17:08:39 2013 -0800 mm: numa: avoid unnecessary work on the failure path commit eb4489f69f224356193364dc2762aa009738ca7f upstream. If a PMD changes during a THP migration then migration aborts but the failure path is doing more work than is necessary. Signed-off-by: Mel Gorman Reviewed-by: Rik van Riel Cc: Alex Thorlton Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 4455c567b8a231d6aa8bf077facfa559d2605357 Author: Mel Gorman Date: Wed Dec 18 17:08:38 2013 -0800 mm: numa: ensure anon_vma is locked to prevent parallel THP splits commit c3a489cac38d43ea6dc4ac240473b44b46deecf7 upstream. The anon_vma lock prevents parallel THP splits and any associated complexity that arises when handling splits during THP migration. This patch checks if the lock was successfully acquired and bails from THP migration if it failed for any reason. Signed-off-by: Mel Gorman Reviewed-by: Rik van Riel Cc: Alex Thorlton Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 0b92137e5f67921fef752f55af7efe35a15d2923 Author: Mel Gorman Date: Wed Dec 18 17:08:34 2013 -0800 mm: clear pmd_numa before invalidating commit 67f87463d3a3362424efcbe8b40e4772fd34fc61 upstream. On x86, PMD entries are similar to _PAGE_PROTNONE protection and are handled as NUMA hinting faults. The following two page table protection bits are what defines them _PAGE_NUMA:set _PAGE_PRESENT:clear A PMD is considered present if any of the _PAGE_PRESENT, _PAGE_PROTNONE, _PAGE_PSE or _PAGE_NUMA bits are set. If pmdp_invalidate encounters a pmd_numa, it clears the present bit leaving _PAGE_NUMA which will be considered not present by the CPU but present by pmd_present. The existing caller of pmdp_invalidate should handle it but it's an inconsistent state for a PMD. This patch keeps the state consistent when calling pmdp_invalidate. Signed-off-by: Mel Gorman Reviewed-by: Rik van Riel Cc: Alex Thorlton Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 4fc7e47022c61f605101c99e99b2518c3c3df2f8 Author: Rob Herring Date: Sun Dec 29 19:37:43 2013 -0600 Revert "of/address: Handle #address-cells > 2 specially" commit 13fcca8f25f4e9ce7f55da9cd353bb743236e212 upstream. This reverts commit e38c0a1fbc5803cbacdaac0557c70ac8ca5152e7. Nikita Yushchenko reports: While trying to make freescale p2020ds and mpc8572ds boards working with mainline kernel, I faced that commit e38c0a1f (Handle Both these boards have uli1575 chip. Corresponding part in device tree is something like uli1575@0 { reg = <0x0 0x0 0x0 0x0 0x0>; #size-cells = <2>; #address-cells = <3>; ranges = <0x2000000 0x0 0x80000000 0x2000000 0x0 0x80000000 0x0 0x20000000 0x1000000 0x0 0x0 0x1000000 0x0 0x0 0x0 0x10000>; isa@1e { ... I.e. it has #address-cells = <3> With commit e38c0a1f reverted, devices under uli1575 are registered correctly, e.g. for rtc OF: ** translation for device /pcie@ffe09000/pcie@0/uli1575@0/isa@1e/rtc@70 ** OF: bus is isa (na=2, ns=1) on /pcie@ffe09000/pcie@0/uli1575@0/isa@1e OF: translating address: 00000001 00000070 OF: parent bus is default (na=3, ns=2) on /pcie@ffe09000/pcie@0/uli1575@0 OF: walking ranges... OF: ISA map, cp=0, s=1000, da=70 OF: parent translation for: 01000000 00000000 00000000 OF: with offset: 70 OF: one level translation: 00000000 00000000 00000070 OF: parent bus is pci (na=3, ns=2) on /pcie@ffe09000/pcie@0 OF: walking ranges... OF: default map, cp=a0000000, s=20000000, da=70 OF: default map, cp=0, s=10000, da=70 OF: parent translation for: 01000000 00000000 00000000 OF: with offset: 70 OF: one level translation: 01000000 00000000 00000070 OF: parent bus is pci (na=3, ns=2) on /pcie@ffe09000 OF: walking ranges... OF: PCI map, cp=0, s=10000, da=70 OF: parent translation for: 01000000 00000000 00000000 OF: with offset: 70 OF: one level translation: 01000000 00000000 00000070 OF: parent bus is default (na=2, ns=2) on / OF: walking ranges... OF: PCI map, cp=0, s=10000, da=70 OF: parent translation for: 00000000 ffc10000 OF: with offset: 70 OF: one level translation: 00000000 ffc10070 OF: reached root node With commit e38c0a1f in place, address translation fails: OF: ** translation for device /pcie@ffe09000/pcie@0/uli1575@0/isa@1e/rtc@70 ** OF: bus is isa (na=2, ns=1) on /pcie@ffe09000/pcie@0/uli1575@0/isa@1e OF: translating address: 00000001 00000070 OF: parent bus is default (na=3, ns=2) on /pcie@ffe09000/pcie@0/uli1575@0 OF: walking ranges... OF: ISA map, cp=0, s=1000, da=70 OF: parent translation for: 01000000 00000000 00000000 OF: with offset: 70 OF: one level translation: 00000000 00000000 00000070 OF: parent bus is pci (na=3, ns=2) on /pcie@ffe09000/pcie@0 OF: walking ranges... OF: default map, cp=a0000000, s=20000000, da=70 OF: default map, cp=0, s=10000, da=70 OF: not found ! Thierry Reding confirmed this commit was not needed after all: "We ended up merging a different address representation for Tegra PCIe and I've confirmed that reverting this commit doesn't cause any obvious regressions. I think all other drivers in drivers/pci/host ended up copying what we did on Tegra, so I wouldn't expect any other breakage either." There doesn't appear to be a simple way to support both behaviours, so reverting this as nothing should be depending on the new behaviour. Signed-off-by: Rob Herring Signed-off-by: Greg Kroah-Hartman commit ec84b71390b0433179f25364979d500453bf0468 Author: Rafael J. Wysocki Date: Tue Dec 31 13:37:46 2013 +0100 intel_pstate: Fail initialization if P-state information is missing commit 98a947abdd54e5de909bebadfced1696ccad30cf upstream. If pstate.current_pstate is 0 after the initial intel_pstate_get_cpu_pstates(), this means that we were unable to obtain any useful P-state information and there is no reason to continue, so free memory and return an error in that case. This fixes the following divide error occuring in a nested KVM guest: Intel P-state driver initializing. Intel pstate controlling: cpu 0 cpufreq: __cpufreq_add_dev: ->get() failed divide error: 0000 [#1] SMP Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.13.0-0.rc4.git5.1.fc21.x86_64 #1 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 task: ffff88001ea20000 ti: ffff88001e9bc000 task.ti: ffff88001e9bc000 RIP: 0010:[] [] intel_pstate_timer_func+0x11d/0x2b0 RSP: 0000:ffff88001ee03e18 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88001a454348 RCX: 0000000000006100 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff88001ee03e38 R08: 0000000000000000 R09: 0000000000000000 R10: ffff88001ea20000 R11: 0000000000000000 R12: 00000c0a1ea20000 R13: 1ea200001ea20000 R14: ffffffff815c5400 R15: ffff88001a454348 FS: 0000000000000000(0000) GS:ffff88001ee00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000000 CR3: 0000000001c0c000 CR4: 00000000000006f0 Stack: fffffffb1a454390 ffffffff821a4500 ffff88001a454390 0000000000000100 ffff88001ee03ea8 ffffffff81083e9a ffffffff81083e15 ffffffff82d5ed40 ffffffff8258cc60 0000000000000000 ffffffff81ac39de 0000000000000000 Call Trace: [] call_timer_fn+0x8a/0x310 [] ? call_timer_fn+0x5/0x310 [] ? pid_param_set+0x130/0x130 [] run_timer_softirq+0x234/0x380 [] __do_softirq+0x104/0x430 [] irq_exit+0xcd/0xe0 [] smp_apic_timer_interrupt+0x45/0x60 [] apic_timer_interrupt+0x72/0x80 [] ? vprintk_emit+0x1dd/0x5e0 [] printk+0x67/0x69 [] __cpufreq_add_dev.isra.13+0x883/0x8d0 [] cpufreq_add_dev+0x10/0x20 [] subsys_interface_register+0xb1/0xf0 [] cpufreq_register_driver+0x9f/0x210 [] intel_pstate_init+0x27d/0x3be [] ? mutex_unlock+0xe/0x10 [] ? cpufreq_gov_dbs_init+0x12/0x12 [] do_one_initcall+0xfa/0x1b0 [] ? parse_args+0x225/0x3f0 [] kernel_init_freeable+0x1fc/0x287 [] ? do_early_param+0x88/0x88 [] ? rest_init+0x150/0x150 [] kernel_init+0xe/0x130 [] ret_from_fork+0x7c/0xb0 [] ? rest_init+0x150/0x150 Code: c1 e0 05 48 63 bc 03 10 01 00 00 48 63 83 d0 00 00 00 48 63 d6 48 c1 e2 08 c1 e1 08 4c 63 c2 48 c1 e0 08 48 98 48 c1 e0 08 48 99 <49> f7 f8 48 98 48 0f af f8 48 c1 ff 08 29 f9 89 ca c1 fa 1f 89 RIP [] intel_pstate_timer_func+0x11d/0x2b0 RSP ---[ end trace f166110ed22cc37a ]--- Kernel panic - not syncing: Fatal exception in interrupt Reported-and-tested-by: Kashyap Chamarthy Cc: Josh Boyer Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit 7e47808e7f4f16932fdd009ade4244fa2e5a7579 Author: Larry Finger Date: Wed Dec 11 17:13:10 2013 -0600 rtlwifi: pci: Fix oops on driver unload commit 9278db6279e28d4d433bc8a848e10b4ece8793ed upstream. On Fedora systems, unloading rtl8192ce causes an oops. This patch fixes the problem reported at https://bugzilla.redhat.com/show_bug.cgi?id=852761. Signed-off-by: Larry Finger Signed-off-by: John W. Linville Signed-off-by: Greg Kroah-Hartman commit 6646ce885ff19da84a20193b1b36629f7953305a Author: Johannes Berg Date: Mon Dec 16 12:04:36 2013 +0100 radiotap: fix bitmap-end-finding buffer overrun commit bd02cd2549cfcdfc57cb5ce57ffc3feb94f70575 upstream. Evan Huus found (by fuzzing in wireshark) that the radiotap iterator code can access beyond the length of the buffer if the first bitmap claims an extension but then there's no data at all. Fix this. Reported-by: Evan Huus Signed-off-by: Johannes Berg Signed-off-by: Greg Kroah-Hartman commit 4e7255f33a03b219d76b32b7e7e1cb395004668a Author: Tejun Heo Date: Wed Dec 18 07:07:32 2013 -0500 libata, freezer: avoid block device removal while system is frozen commit 85fbd722ad0f5d64d1ad15888cd1eb2188bfb557 upstream. Freezable kthreads and workqueues are fundamentally problematic in that they effectively introduce a big kernel lock widely used in the kernel and have already been the culprit of several deadlock scenarios. This is the latest occurrence. During resume, libata rescans all the ports and revalidates all pre-existing devices. If it determines that a device has gone missing, the device is removed from the system which involves invalidating block device and flushing bdi while holding driver core layer locks. Unfortunately, this can race with the rest of device resume. Because freezable kthreads and workqueues are thawed after device resume is complete and block device removal depends on freezable workqueues and kthreads (e.g. bdi_wq, jbd2) to make progress, this can lead to deadlock - block device removal can't proceed because kthreads are frozen and kthreads can't be thawed because device resume is blocked behind block device removal. 839a8e8660b6 ("writeback: replace custom worker pool implementation with unbound workqueue") made this particular deadlock scenario more visible but the underlying problem has always been there - the original forker task and jbd2 are freezable too. In fact, this is highly likely just one of many possible deadlock scenarios given that freezer behaves as a big kernel lock and we don't have any debug mechanism around it. I believe the right thing to do is getting rid of freezable kthreads and workqueues. This is something fundamentally broken. For now, implement a funny workaround in libata - just avoid doing block device hot[un]plug while the system is frozen. Kernel engineering at its finest. :( v2: Add EXPORT_SYMBOL_GPL(pm_freezing) for cases where libata is built as a module. v3: Comment updated and polling interval changed to 10ms as suggested by Rafael. v4: Add #ifdef CONFIG_FREEZER around the hack as pm_freezing is not defined when FREEZER is not configured thus breaking build. Reported by kbuild test robot. Signed-off-by: Tejun Heo Reported-by: Tomaž Šolc Reviewed-by: "Rafael J. Wysocki" Link: https://bugzilla.kernel.org/show_bug.cgi?id=62801 Link: http://lkml.kernel.org/r/20131213174932.GA27070@htj.dyndns.org Cc: Greg Kroah-Hartman Cc: Len Brown Cc: Oleg Nesterov Cc: kbuild test robot Signed-off-by: Greg Kroah-Hartman commit 1dc58ce4a96460a21c6038f9517ce72ae73cfcc3 Author: Robin H. Johnson Date: Mon Dec 16 09:31:19 2013 -0800 libata: disable a disk via libata.force params commit b8bd6dc36186fe99afa7b73e9e2d9a98ad5c4865 upstream. A user on StackExchange had a failing SSD that's soldered directly onto the motherboard of his system. The BIOS does not give any option to disable it at all, so he can't just hide it from the OS via the BIOS. The old IDE layer had hdX=noprobe override for situations like this, but that was never ported to the libata layer. This patch implements a disable flag for libata.force. Example use: libata.force=2.0:disable [v2 of the patch, removed the nodisable flag per Tejun Heo] Signed-off-by: Robin H. Johnson Signed-off-by: Tejun Heo Link: http://unix.stackexchange.com/questions/102648/how-to-tell-linux-kernel-3-0-to-completely-ignore-a-failing-disk Link: http://askubuntu.com/questions/352836/how-can-i-tell-linux-kernel-to-completely-ignore-a-disk-as-if-it-was-not-even-co Link: http://superuser.com/questions/599333/how-to-disable-kernel-probing-for-drive Signed-off-by: Greg Kroah-Hartman commit 3d097b15182906eaf121c48697ceda0ab9ddf55e Author: Vincent Pelletier Date: Tue May 21 22:30:58 2013 +0200 libata: Add atapi_dmadir force flag commit 966fbe193f47c68e70a80ec9991098e88e7959cb upstream. Some device require DMADIR to be enabled, but are not detected as such by atapi_id_dmadir. One such example is "Asus Serillel 2" SATA-host-to-PATA-device bridge: the bridge itself requires DMADIR, even if the bridged device does not. As atapi_dmadir module parameter can cause problems with some devices (as per Tejun Heo's memory), enabling it globally may not be possible depending on the hardware. This patch adds atapi_dmadir in the form of a "force" horkage value, allowing global, per-bus and per-device control. Signed-off-by: Vincent Pelletier Signed-off-by: Tejun Heo Signed-off-by: Greg Kroah-Hartman commit 6dcccce8791fa72bff307f6c2506d323705102cd Author: Michele Baldessari Date: Mon Nov 25 19:00:14 2013 +0000 libata: add ATA_HORKAGE_BROKEN_FPDMA_AA quirk for Seagate Momentus SpinPoint M8 commit 87809942d3fa60bafb7a58d0bdb1c79e90a6821d upstream. We've received multiple reports in Fedora via (BZ 907193) that the Seagate Momentus SpinPoint M8 errors out when enabling AA: [ 2.555905] ata2.00: failed to enable AA (error_mask=0x1) [ 2.568482] ata2.00: failed to enable AA (error_mask=0x1) Add the ATA_HORKAGE_BROKEN_FPDMA_AA for this specific harddisk. Reported-by: Nicholas Signed-off-by: Michele Baldessari Tested-by: Nicholas Acked-by: Alan Cox Signed-off-by: Tejun Heo Signed-off-by: Greg Kroah-Hartman commit 37b1780623b001b57c92bc8c81c570a88f6f30d4 Author: Josh Boyer Date: Fri Oct 11 08:45:51 2013 -0400 cpupower: Fix segfault due to incorrect getopt_long arugments commit f447ef4a56dee4b68a91460bcdfe06b5011085f2 upstream. If a user calls 'cpupower set --perf-bias 15', the process will end with a SIGSEGV in libc because cpupower-set passes a NULL optarg to the atoi call. This is because the getopt_long structure currently has all of the options as having an optional_argument when they really have a required argument. We change the structure to use required_argument to match the short options and it resolves the issue. This fixes https://bugzilla.redhat.com/show_bug.cgi?id=1000439 Signed-off-by: Josh Boyer Cc: Dominik Brodowski Cc: Thomas Renninger Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 96350a7067c73a288af679f3420431cabe9453bc Author: Anton Blanchard Date: Mon Dec 23 12:19:51 2013 +1100 powerpc: Align p_end commit 286e4f90a72c0b0621dde0294af6ed4b0baddabb upstream. p_end is an 8 byte value embedded in the text section. This means it is only 4 byte aligned when it should be 8 byte aligned. Fix this by adding an explicit alignment. This fixes an issue where POWER7 little endian builds with CONFIG_RELOCATABLE=y fail to boot. Signed-off-by: Anton Blanchard Signed-off-by: Benjamin Herrenschmidt Signed-off-by: Greg Kroah-Hartman commit 4e639053aacb787a762e3dea49f4d3a6c13a52e7 Author: Michael Neuling Date: Mon Dec 16 15:12:43 2013 +1100 powerpc: Fix bad stack check in exception entry commit 90ff5d688e61f49f23545ffab6228bd7e87e6dc7 upstream. In EXCEPTION_PROLOG_COMMON() we check to see if the stack pointer (r1) is valid when coming from the kernel. If it's not valid, we die but with a nice oops message. Currently we allocate a stack frame (subtract INT_FRAME_SIZE) before we check to see if the stack pointer is negative. Unfortunately, this won't detect a bad stack where r1 is less than INT_FRAME_SIZE. This patch fixes the check to compare the modified r1 with -INT_FRAME_SIZE. With this, bad kernel stack pointers (including NULL pointers) are correctly detected again. Kudos to Paulus for finding this. Signed-off-by: Michael Neuling Signed-off-by: Benjamin Herrenschmidt Signed-off-by: Greg Kroah-Hartman commit 9e23c5bb1507d413f86652feea3fb1c41cb6f99b Author: Jan Kiszka Date: Sun Dec 29 02:29:30 2013 +0100 KVM: x86: Fix APIC map calculation after re-enabling commit e66d2ae7c67bd9ac982a3d1890564de7f7eabf4b upstream. Update arch.apic_base before triggering recalculate_apic_map. Otherwise the recalculation will work against the previous state of the APIC and will fail to build the correct map when an APIC is hardware-enabled again. This fixes a regression of 1e08ec4a13. Signed-off-by: Jan Kiszka Signed-off-by: Marcelo Tosatti Signed-off-by: Greg Kroah-Hartman commit d5985f143875f7d4dc718fa664b70666667877d9 Author: Mathy Vanhoef Date: Thu Nov 28 12:21:45 2013 +0100 ath9k_htc: properly set MAC address and BSSID mask commit 657eb17d87852c42b55c4b06d5425baa08b2ddb3 upstream. Pick the MAC address of the first virtual interface as the new hardware MAC address. Set BSSID mask according to this MAC address. This fixes CVE-2013-4579. Signed-off-by: Mathy Vanhoef Signed-off-by: John W. Linville Signed-off-by: Greg Kroah-Hartman commit 12bc42c524f045e914b0ee0cfb3eab33646e1acf Author: Sujith Manoharan Date: Mon Dec 16 07:04:59 2013 +0530 ath9k: Fix interrupt handling for the AR9002 family commit 73f0b56a1ff64e7fb6c3a62088804bab93bcedc2 upstream. This patch adds a driver workaround for a HW issue. A race condition in the HW results in missing interrupts, which can be avoided by a read/write with the ISR register. All chips in the AR9002 series are affected by this bug - AR9003 and above do not have this problem. Cc: Felix Fietkau Signed-off-by: Sujith Manoharan Signed-off-by: John W. Linville Signed-off-by: Greg Kroah-Hartman commit 5fd93067a1e7bc8ce6a889816d2b4c2d19a2e970 Author: Peter Korsgaard Date: Mon Dec 16 11:35:35 2013 +0100 dm9601: work around tx fifo sync issue on dm962x commit 4263c86dca5198da6bd3ad826d0b2304fbe25776 upstream. Certain dm962x revisions contain an bug, where if a USB bulk transfer retry (E.G. if bulk crc mismatch) happens right after a transfer with odd or maxpacket length, the internal tx hardware fifo gets out of sync causing the interface to stop working. Work around it by adding up to 3 bytes of padding to ensure this situation cannot trigger. This workaround also means we never pass multiple-of-maxpacket size skb's to usbnet, so the length adjustment to handle usbnet's padding of those can be removed. Reported-by: Joseph Chang Signed-off-by: Peter Korsgaard Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 1552c3d8e10d1113d55237f3dd0d65c0a3501632 Author: Peter Korsgaard Date: Mon Dec 16 11:35:33 2013 +0100 dm9601: fix reception of full size ethernet frames on dm9620/dm9621a commit 407900cfb54bdb2cfa228010b6697305f66b2948 upstream. dm9620/dm9621a require room for 4 byte padding even in dm9601 (3 byte header) mode. Signed-off-by: Peter Korsgaard Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 84395b85636ec02316b499c2e97185ffa83afc0e Author: Ard Biesheuvel Date: Mon Dec 23 18:49:30 2013 +0100 auxvec.h: account for AT_HWCAP2 in AT_VECTOR_SIZE_BASE commit f60900f2609e893c7f8d0bccc7ada4947dac4cd5 upstream. Commit 2171364d1a92 ("powerpc: Add HWCAP2 aux entry") introduced a new AT_ auxv entry type AT_HWCAP2 but failed to update AT_VECTOR_SIZE_BASE accordingly. Signed-off-by: Ard Biesheuvel Fixes: 2171364d1a92 (powerpc: Add HWCAP2 aux entry) Acked-by: Michael Neuling Cc: Nishanth Aravamudan Cc: Benjamin Herrenschmidt Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit a3c448cd6dc05bbd213026fd8ac4a9cfaa83513f Author: Nithin Sujir Date: Thu Dec 19 17:44:11 2013 -0800 tg3: Expand 4g_overflow_test workaround to skb fragments of any size. commit 375679104ab3ccfd18dcbd7ba503734fb9a2c63a upstream. The current driver assumes that an skb fragment can only be upto jumbo size. Presumably this was a fast-path optimization. This assumption is no longer true as fragments can be upto 32k. v2: Remove unnecessary parantheses per Eric Dumazet. Signed-off-by: Nithin Nayak Sujir Signed-off-by: Michael Chan Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 88285f766c96205ccf30afd3d0621401701ff26c Author: Li Wang Date: Wed Nov 13 15:22:14 2013 +0800 ceph: Avoid data inconsistency due to d-cache aliasing in readpage() commit 56f91aad69444d650237295f68c195b74d888d95 upstream. If the length of data to be read in readpage() is exactly PAGE_CACHE_SIZE, the original code does not flush d-cache for data consistency after finishing reading. This patches fixes this. Signed-off-by: Li Wang Signed-off-by: Sage Weil Signed-off-by: Greg Kroah-Hartman commit 6008a51b189b39fae1dd545da28c3cfc8038870c Author: Alex Deucher Date: Mon Dec 23 09:31:58 2013 -0500 drm/radeon: 0x9649 is SUMO2 not SUMO commit d00adcc8ae9e22eca9d8af5f66c59ad9a74c90ec upstream. Fixes rendering corruption due to incorrect gfx configuration. bug: https://bugs.freedesktop.org/show_bug.cgi?id=63599 Signed-off-by: Alex Deucher Signed-off-by: Greg Kroah-Hartman commit 85724fa134d1d059e4f38b141c65c42fca7a8d05 Author: Christian König Date: Fri Dec 20 17:48:54 2013 +0100 drm/radeon: fix UVD 256MB check commit bae651dbd7ade3c5d6518f89599ae680a2fe2b85 upstream. Otherwise the kernel might reject our decoding requests. Signed-off-by: Christian König Signed-off-by: Alex Deucher Signed-off-by: Greg Kroah-Hartman commit 6f02958728dec12eb99715893e057bfdad8bed90 Author: Chris Wilson Date: Tue Dec 17 14:34:50 2013 +0000 drm/i915: Use the correct GMCH_CTRL register for Sandybridge+ commit a885b3ccc74d8e38074e1c43a47c354c5ea0b01e upstream. The GMCH_CTRL register (or MGCC in the spec) is at a different address on Sandybridge, and the address to which we currently write to is undefined. These stray writes appear to upset (hard hang) my Ivybridge machine whilst it is in UEFI mode. Note that the register is still marked as locked RO on Sandybridge, so vgaarb is still dysfunctional. Signed-off-by: Chris Wilson Cc: Jani Nikula Cc: Ville Syrjälä Reviewed-by: Jani Nikula Signed-off-by: Daniel Vetter Signed-off-by: Greg Kroah-Hartman commit 43515fde135fd981c9eaa4f7f76dc30a1aa2ed27 Author: Alex Deucher Date: Thu Dec 19 19:41:46 2013 -0500 drm/radeon: fix asic gfx values for scrapper asics commit e2f6c88fb903e123edfd1106b0b8310d5117f774 upstream. Fixes gfx corruption on certain TN/RL parts. bug: https://bugs.freedesktop.org/show_bug.cgi?id=60389 Signed-off-by: Alex Deucher Signed-off-by: Greg Kroah-Hartman commit fbd96bb876b140722afefda143f0545aebe155e6 Author: Daniel Vetter Date: Tue Dec 10 13:20:59 2013 +0100 drm/i915: don't update the dri1 breadcrumb with modesetting commit 6c719faca2aceca72f1bf5b1645c1734ed3e9447 upstream. The update is horribly racy since it doesn't protect at all against concurrent closing of the master fd. And it can't really since that requires us to grab a mutex. Instead of jumping through hoops and offloading this to a worker thread just block this bit of code for the modesetting driver. Note that the race is fairly easy to hit since we call the breadcrumb function for any interrupt. So the vblank interrupt (which usually keeps going for a bit) is enough. But even if we'd block this and only update the breadcrumb for user interrupts from the CS we could hit this race with kms/gem userspace: If a non-master is waiting somewhere (and hence has interrupts enabled) and the master closes its fd (probably due to crashing). v2: Add a code comment to explain why fixing this for real isn't really worth it. Also improve the commit message a bit. v3: Fix the spelling in the comment. Reported-by: Eugene Shatokhin Cc: Eugene Shatokhin Acked-by: Chris Wilson Tested-by: Eugene Shatokhin Signed-off-by: Daniel Vetter Signed-off-by: Greg Kroah-Hartman commit ef1bd33b2131909475cca757eaa29d7f9b3bf7f2 Author: Chris Wilson Date: Wed Dec 4 14:52:06 2013 +0000 drm/i915: Hold mutex across i915_gem_release commit 0d1430a3f4b7cfd8779b78740a4182321f3ca7f3 upstream. Inorder to serialise the closing of the file descriptor and its subsequent release of client requests with i915_gem_free_request(), we need to hold the struct_mutex in i915_gem_release(). Failing to do so has the potential to trigger an OOPS, later with a use-after-free. Testcase: igt/gem_close_race Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70874 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71029 Reported-by: Eric Anholt Signed-off-by: Chris Wilson Signed-off-by: Daniel Vetter Signed-off-by: Greg Kroah-Hartman commit 5c9dce6be58638a8457cf8ad6e84c5d4fe7431d6 Author: Ville Syrjälä Date: Mon Dec 2 11:08:06 2013 +0200 drm/i915: Take modeset locks around intel_modeset_setup_hw_state() commit 027476642811f8559cbe00ef6cc54db230e48a20 upstream. Some lower level things get angry if we don't have modeset locks during intel_modeset_setup_hw_state(). Actually the resume and lid_notify codepaths alreday hold the locks, but the init codepath doesn't, so fix that. Note: This slipped through since we only disable pipes if the plane/pipe linking doesn't match. Which is only relevant on older gen3 mobile machines, if the BIOS fails to set up our preferred linking. Signed-off-by: Ville Syrjälä Tested-and-reported-by: Paul Bolle [danvet: Add note now that I could confirm my theory with the log files Paul Bolle provided.] Signed-off-by: Daniel Vetter Signed-off-by: Greg Kroah-Hartman commit 9c74433b7ee48256b975d0e9b21d7617879d1729 Author: Alex Deucher Date: Wed Dec 11 11:43:58 2013 -0500 drm/radeon: add missing display tiling setup for oland commit 227ae10f17a5f2fd1307b7e582b603ef7bbb7e97 upstream. Fixes improperly set up display params for 2D tiling on oland. Signed-off-by: Alex Deucher Signed-off-by: Greg Kroah-Hartman commit 0ecae1fe908d341454e1bd33fe4462d8ffad53b2 Author: Alex Deucher Date: Mon Dec 2 18:15:51 2013 -0500 drm/radeon: Fix sideport problems on certain RS690 boards commit 8333f0fe133be420ce3fcddfd568c3a559ab274e upstream. Some RS690 boards with 64MB of sideport memory show up as having 128MB sideport + 256MB of UMA. In this case, just skip the sideport memory and use UMA. This fixes rendering corruption and should improve performance. bug: https://bugs.freedesktop.org/show_bug.cgi?id=35457 Signed-off-by: Alex Deucher Signed-off-by: Greg Kroah-Hartman commit a6443066fcb7cf9ae9e0740cfaff6f615affca5c Author: Rafał Miłecki Date: Sat Dec 7 13:22:42 2013 +0100 drm/edid: add quirk for BPC in Samsung NP700G7A-S01PL notebook commit 49d45a31b71d7d9da74485922bdb63faf3dc9684 upstream. This bug in EDID was exposed by: commit eccea7920cfb009c2fa40e9ecdce8c36f61cab66 Author: Alex Deucher Date: Mon Mar 26 15:12:54 2012 -0400 drm/radeon/kms: improve bpc handling (v2) Which resulted in kind of regression in 3.5. This fixes https://bugs.freedesktop.org/show_bug.cgi?id=70934 Signed-off-by: Rafał Miłecki Signed-off-by: Dave Airlie Signed-off-by: Greg Kroah-Hartman commit b69ec589136c6300ddf03af45a1c9006a942c321 Author: Dan Williams Date: Tue Dec 17 10:09:32 2013 -0800 net_dma: mark broken commit 77873803363c9e831fc1d1e6895c084279090c22 upstream. net_dma can cause data to be copied to a stale mapping if a copy-on-write fault occurs during dma. The application sees missing data. The following trace is triggered by modifying the kernel to WARN if it ever triggers copy-on-write on a page that is undergoing dma: WARNING: CPU: 24 PID: 2529 at lib/dma-debug.c:485 debug_dma_assert_idle+0xd2/0x120() ioatdma 0000:00:04.0: DMA-API: cpu touching an active dma mapped page [pfn=0x16bcd9] Modules linked in: iTCO_wdt iTCO_vendor_support ioatdma lpc_ich pcspkr dca CPU: 24 PID: 2529 Comm: linbug Tainted: G W 3.13.0-rc1+ #353 00000000000001e5 ffff88016f45f688 ffffffff81751041 ffff88017ab0ef70 ffff88016f45f6d8 ffff88016f45f6c8 ffffffff8104ed9c ffffffff810f3646 ffff8801768f4840 0000000000000282 ffff88016f6cca10 00007fa2bb699349 Call Trace: [] dump_stack+0x46/0x58 [] warn_slowpath_common+0x8c/0xc0 [] ? ftrace_pid_func+0x26/0x30 [] warn_slowpath_fmt+0x46/0x50 [] debug_dma_assert_idle+0xd2/0x120 [] do_wp_page+0xd0/0x790 [] handle_mm_fault+0x51c/0xde0 [] ? copy_user_enhanced_fast_string+0x9/0x20 [] __do_page_fault+0x19c/0x530 [] ? _raw_spin_lock_bh+0x16/0x40 [] ? trace_clock_local+0x9/0x10 [] ? rb_reserve_next_event+0x64/0x310 [] ? ioat2_dma_prep_memcpy_lock+0x60/0x130 [ioatdma] [] do_page_fault+0xe/0x10 [] page_fault+0x22/0x30 [] ? __kfree_skb+0x51/0xd0 [] ? copy_user_enhanced_fast_string+0x9/0x20 [] ? memcpy_toiovec+0x52/0xa0 [] skb_copy_datagram_iovec+0x5f/0x2a0 [] tcp_rcv_established+0x674/0x7f0 [] tcp_v4_do_rcv+0x2e5/0x4a0 [..] ---[ end trace e30e3b01191b7617 ]--- Mapped at: [] debug_dma_map_page+0xb9/0x160 [] dma_async_memcpy_pg_to_pg+0x127/0x210 [] dma_memcpy_pg_to_iovec+0x119/0x1f0 [] dma_skb_copy_datagram_iovec+0x11c/0x2b0 [] tcp_rcv_established+0x74a/0x7f0: ...the problem is that the receive path falls back to cpu-copy in several locations and this trace is just one of the areas. A few options were considered to fix this: 1/ sync all dma whenever a cpu copy branch is taken 2/ modify the page fault handler to hold off while dma is in-flight Option 1 adds yet more cpu overhead to an "offload" that struggles to compete with cpu-copy. Option 2 adds checks for behavior that is already documented as broken when using get_user_pages(). At a minimum a debug mode is warranted to catch and flag these violations of the dma-api vs get_user_pages(). Thanks to David for his reproducer. Cc: Dave Jiang Cc: Vinod Koul Cc: Alexander Duyck Reported-by: David Whipple Acked-by: David S. Miller Signed-off-by: Dan Williams Signed-off-by: Greg Kroah-Hartman commit 5bda0260f56de8ceddc5eb761a0ac09c1feafb5e Author: Stefan Richter Date: Sun Dec 15 16:18:01 2013 +0100 firewire: sbp2: bring back WRITE SAME support commit ce027ed98fd176710fb14be9d6015697b62436f0 upstream. Commit 54b2b50c20a6 "[SCSI] Disable WRITE SAME for RAID and virtual host adapter drivers" disabled WRITE SAME support for all SBP-2 attached targets. But as described in the changelog of commit b0ea5f19d3d8 "firewire: sbp2: allow WRITE SAME and REPORT SUPPORTED OPERATION CODES", it is not required to blacklist WRITE SAME. Bring the feature back by reverting the sbp2.c hunk of commit 54b2b50c20a6. Signed-off-by: Stefan Richter Signed-off-by: Greg Kroah-Hartman commit 42e7b42b1c04e92cad2b48b4aa8caeaea02ccd35 Author: Kirill Tkhai Date: Wed Nov 27 19:59:13 2013 +0400 sched/rt: Fix rq's cpupri leak while enqueue/dequeue child RT entities commit 757dfcaa41844595964f1220f1d33182dae49976 upstream. This patch touches the RT group scheduling case. Functions inc_rt_prio_smp() and dec_rt_prio_smp() change (global) rq's priority, while rt_rq passed to them may be not the top-level rt_rq. This is wrong, because changing of priority on a child level does not guarantee that the priority is the highest all over the rq. So, this leak makes RT balancing unusable. The short example: the task having the highest priority among all rq's RT tasks (no one other task has the same priority) are waking on a throttle rt_rq. The rq's cpupri is set to the task's priority equivalent, but real rq->rt.highest_prio.curr is less. The patch below fixes the problem. Signed-off-by: Kirill Tkhai Signed-off-by: Peter Zijlstra CC: Steven Rostedt Link: http://lkml.kernel.org/r/49231385567953@web4m.yandex.ru Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 13ea54872ad94976129d88a6e7d841bf689a180b Author: Mel Gorman Date: Wed Dec 18 17:08:40 2013 -0800 sched: numa: skip inaccessible VMAs commit 3c67f474558748b604e247d92b55dfe89654c81d upstream. Inaccessible VMA should not be trapping NUMA hint faults. Skip them. Signed-off-by: Mel Gorman Reviewed-by: Rik van Riel Cc: Alex Thorlton Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 8164ddf1f6f9fca6020143a259a9f9667d53c1e4 Author: Lukas Czerner Date: Wed Oct 30 11:10:52 2013 -0400 ext4: fix FITRIM in no journal mode commit 8f9ff189205a6817aee5a1f996f876541f86e07c upstream. When using FITRIM ioctl on a file system without journal it will only trim the block group once, no matter how many times you invoke FITRIM ioctl and how many block you release from the block group. It is because we only clear EXT4_GROUP_INFO_WAS_TRIMMED_BIT in journal callback. Fix this by clearing the bit in no journal mode as well. Signed-off-by: Lukas Czerner Signed-off-by: "Theodore Ts'o" Reported-by: Jorge Fábregas Signed-off-by: Greg Kroah-Hartman commit cd7442730bdf3c8ecdd9a0ae8d77750b359fe60e Author: Theodore Ts'o Date: Fri Dec 20 09:29:35 2013 -0500 ext4: add explicit casts when masking cluster sizes commit f5a44db5d2d677dfbf12deee461f85e9ec633961 upstream. The missing casts can cause the high 64-bits of the physical blocks to be lost. Set up new macros which allows us to make sure the right thing happen, even if at some point we end up supporting larger logical block numbers. Thanks to the Emese Revfy and the PaX security team for reporting this issue. Reported-by: PaX Team Reported-by: Emese Revfy Signed-off-by: "Theodore Ts'o" Signed-off-by: Greg Kroah-Hartman commit ad403b48be4d1777fdfb65966cefee9e2d8a8bcd Author: Jan Kara Date: Wed Dec 18 00:44:44 2013 -0500 ext4: fix deadlock when writing in ENOSPC conditions commit 34cf865d54813aab3497838132fb1bbd293f4054 upstream. Akira-san has been reporting rare deadlocks of his machine when running xfstests test 269 on ext4 filesystem. The problem turned out to be in ext4_da_reserve_metadata() and ext4_da_reserve_space() which called ext4_should_retry_alloc() while holding i_data_sem. Since ext4_should_retry_alloc() can force a transaction commit, this is a lock ordering violation and leads to deadlocks. Fix the problem by just removing the retry loops. These functions should just report ENOSPC to the caller (e.g. ext4_da_write_begin()) and that function must take care of retrying after dropping all necessary locks. Reported-and-tested-by: Akira Fujita Reviewed-by: Zheng Liu Signed-off-by: Jan Kara Signed-off-by: "Theodore Ts'o" Signed-off-by: Greg Kroah-Hartman commit b35e443037c2e162a015a5f46009e8c21e673b63 Author: Jan Kara Date: Sun Dec 8 21:11:59 2013 -0500 ext4: Do not reserve clusters when fs doesn't support extents commit 30fac0f75da24dd5bb43c9e911d2039a984ac815 upstream. When the filesystem doesn't support extents (like in ext2/3 compatibility modes), there is no need to reserve any clusters. Space estimates for writing are exact, hole punching doesn't need new metadata, and there are no unwritten extents to convert. This fixes a problem when filesystem still having some free space when accessed with a native ext2/3 driver suddently reports ENOSPC when accessed with ext4 driver. Reported-by: Geert Uytterhoeven Tested-by: Geert Uytterhoeven Reviewed-by: Lukas Czerner Signed-off-by: Jan Kara Signed-off-by: "Theodore Ts'o" Signed-off-by: Greg Kroah-Hartman commit ae21dda05193c441bde106a4bbf88c185a68fbed Author: Eryu Guan Date: Tue Dec 3 21:22:21 2013 -0500 ext4: check for overlapping extents in ext4_valid_extent_entries() commit 5946d089379a35dda0e531710b48fca05446a196 upstream. A corrupted ext4 may have out of order leaf extents, i.e. extent: lblk 0--1023, len 1024, pblk 9217, flags: LEAF UNINIT extent: lblk 1000--2047, len 1024, pblk 10241, flags: LEAF UNINIT ^^^^ overlap with previous extent Reading such extent could hit BUG_ON() in ext4_es_cache_extent(). BUG_ON(end < lblk); The problem is that __read_extent_tree_block() tries to cache holes as well but assumes 'lblk' is greater than 'prev' and passes underflowed length to ext4_es_cache_extent(). Fix it by checking for overlapping extents in ext4_valid_extent_entries(). I hit this when fuzz testing ext4, and am able to reproduce it by modifying the on-disk extent by hand. Also add the check for (ee_block + len - 1) in ext4_valid_extent() to make sure the value is not overflow. Ran xfstests on patched ext4 and no regression. Cc: Lukáš Czerner Signed-off-by: Eryu Guan Signed-off-by: "Theodore Ts'o" Signed-off-by: Greg Kroah-Hartman commit e696abfcc89a087afef75ebca0848f37fb9877ae Author: Junho Ryu Date: Tue Dec 3 18:10:28 2013 -0500 ext4: fix use-after-free in ext4_mb_new_blocks commit 4e8d2139802ce4f41936a687f06c560b12115247 upstream. ext4_mb_put_pa should hold pa->pa_lock before accessing pa->pa_count. While ext4_mb_use_preallocated checks pa->pa_deleted first and then increments pa->count later, ext4_mb_put_pa decrements pa->pa_count before holding pa->pa_lock and then sets pa->pa_deleted. * Free sequence ext4_mb_put_pa (1): atomic_dec_and_test pa->pa_count ext4_mb_put_pa (2): lock pa->pa_lock ext4_mb_put_pa (3): check pa->pa_deleted ext4_mb_put_pa (4): set pa->pa_deleted=1 ext4_mb_put_pa (5): unlock pa->pa_lock ext4_mb_put_pa (6): remove pa from a list ext4_mb_pa_callback: free pa * Use sequence ext4_mb_use_preallocated (1): iterate over preallocation ext4_mb_use_preallocated (2): lock pa->pa_lock ext4_mb_use_preallocated (3): check pa->pa_deleted ext4_mb_use_preallocated (4): increase pa->pa_count ext4_mb_use_preallocated (5): unlock pa->pa_lock ext4_mb_release_context: access pa * Use-after-free sequence [initial status] pa_deleted = 0, pa_count = 1> ext4_mb_use_preallocated (1): iterate over preallocation ext4_mb_use_preallocated (2): lock pa->pa_lock ext4_mb_use_preallocated (3): check pa->pa_deleted ext4_mb_put_pa (1): atomic_dec_and_test pa->pa_count [pa_count decremented] pa_deleted = 0, pa_count = 0> ext4_mb_use_preallocated (4): increase pa->pa_count [pa_count incremented] pa_deleted = 0, pa_count = 1> ext4_mb_use_preallocated (5): unlock pa->pa_lock ext4_mb_put_pa (2): lock pa->pa_lock ext4_mb_put_pa (3): check pa->pa_deleted ext4_mb_put_pa (4): set pa->pa_deleted=1 [race condition!] pa_deleted = 1, pa_count = 1> ext4_mb_put_pa (5): unlock pa->pa_lock ext4_mb_put_pa (6): remove pa from a list ext4_mb_pa_callback: free pa ext4_mb_release_context: access pa AddressSanitizer has detected use-after-free in ext4_mb_new_blocks Bug report: http://goo.gl/rG1On3 Signed-off-by: Junho Ryu Signed-off-by: "Theodore Ts'o" Signed-off-by: Greg Kroah-Hartman commit 6b8588219a59ae34f99e20f17bc9756f831a9b2e Author: Theodore Ts'o Date: Mon Dec 2 09:31:36 2013 -0500 ext4: call ext4_error_inode() if jbd2_journal_dirty_metadata() fails commit ae1495b12df1897d4f42842a7aa7276d920f6290 upstream. While it's true that errors can only happen if there is a bug in jbd2_journal_dirty_metadata(), if a bug does happen, we need to halt the kernel or remount the file system read-only in order to avoid further data loss. The ext4_journal_abort_handle() function doesn't do any of this, and while it's likely that this call (since it doesn't adjust refcounts) will likely result in the file system eventually deadlocking since the current transaction will never be able to close, it's much cleaner to call let ext4's error handling system deal with this situation. There's a separate bug here which is that if certain jbd2 errors errors occur and file system is mounted errors=continue, the file system will probably eventually end grind to a halt as described above. But things have been this way in a long time, and usually when we have these sorts of errors it's pretty much a disaster --- and that's why the jbd2 layer aggressively retries memory allocations, which is the most likely cause of these jbd2 errors. Signed-off-by: "Theodore Ts'o" Reviewed-by: Jan Kara Signed-off-by: Greg Kroah-Hartman commit 024df8e28d40f2e264bd502060008cf5c0946b61 Author: Len Brown Date: Wed Dec 18 16:44:57 2013 -0500 x86 idle: Repair large-server 50-watt idle-power regression commit 40e2d7f9b5dae048789c64672bf3027fbb663ffa upstream. Linux 3.10 changed the timing of how thread_info->flags is touched: x86: Use generic idle loop (7d1a941731fabf27e5fb6edbebb79fe856edb4e5) This caused Intel NHM-EX and WSM-EX servers to experience a large number of immediate MONITOR/MWAIT break wakeups, which caused cpuidle to demote from deep C-states to shallow C-states, which caused these platforms to experience a significant increase in idle power. Note that this issue was already present before the commit above, however, it wasn't seen often enough to be noticed in power measurements. Here we extend an errata workaround from the Core2 EX "Dunnington" to extend to NHM-EX and WSM-EX, to prevent these immediate returns from MWAIT, reducing idle power on these platforms. While only acpi_idle ran on Dunnington, intel_idle may also run on these two newer systems. As of today, there are no other models that are known to need this tweak. Link: http://lkml.kernel.org/r/CAJvTdK=%2BaNN66mYpCGgbHGCHhYQAKx-vB0kJSWjVpsNb_hOAtQ@mail.gmail.com Signed-off-by: Len Brown Link: http://lkml.kernel.org/r/baff264285f6e585df757d58b17788feabc68918.1387403066.git.len.brown@intel.com Signed-off-by: H. Peter Anvin Signed-off-by: Greg Kroah-Hartman commit 6b584e0dafc57e99fcd888ca94c506f294f7a36a Author: Suman Anna Date: Mon Dec 23 16:53:11 2013 -0600 ARM: OMAP2+: hwmod_data: fix missing OMAP_INTC_START in irq data commit 6d4c88304794442055eaea1c07f3c7b988b8c924 upstream. Commit 7d7e1eb (ARM: OMAP2+: Prepare for irqs.h removal) and commit ec2c082 (ARM: OMAP2+: Remove hardcoded IRQs and enable SPARSE_IRQ) updated the way interrupts for OMAP2/3 devices are defined in the HWMOD data structures to being an index plus a fixed offset (defined by OMAP_INTC_START). Couple of irqs in the OMAP2/3 hwmod data were misconfigured completely as they were missing this OMAP_INTC_START relative offset. Add this offset back to fix the incorrect irq data for the following modules: OMAP2 - GPMC, RNG OMAP3 - GPMC, ISP MMU & IVA MMU Signed-off-by: Suman Anna Fixes: 7d7e1eba7e92 ("ARM: OMAP2+: Prepare for irqs.h removal") Fixes: ec2c0825ca31 ("ARM: OMAP2+: Remove hardcoded IRQs and enable SPARSE_IRQ") Cc: Tony Lindgren Signed-off-by: Paul Walmsley Signed-off-by: Greg Kroah-Hartman commit 42b5bb47a62b7b2d8cd0b9af58f914c5e83ef76e Author: Catalin Marinas Date: Fri May 31 16:30:58 2013 +0100 arm64: spinlock: retry trylock operation if strex fails on free lock commit 4ecf7ccb1973fd826456b6ab1e6dfafe9023c753 upstream. An exclusive store instruction may fail for reasons other than lock contention (e.g. a cache eviction during the critical section) so, in line with other architectures using similar exclusive instructions (alpha, mips, powerpc), retry the trylock operation if the lock appears to be free but the strex reported failure. Signed-off-by: Catalin Marinas Reported-by: Tony Thompson Acked-by: Will Deacon Cc: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 11b81802921618b02122855db021a63df7d9fd49 Author: Will Deacon Date: Tue Dec 17 17:09:08 2013 +0000 arm64: ptrace: avoid using HW_BREAKPOINT_EMPTY for disabled events commit cdc27c27843248ae7eb0df5fc261dd004eaa5670 upstream. Commit 8f34a1da35ae ("arm64: ptrace: use HW_BREAKPOINT_EMPTY type for disabled breakpoints") fixed an issue with GDB trying to zero breakpoint control registers. The problem there is that the arch hw_breakpoint code will attempt to create a (disabled), execute breakpoint of length 0. This will fail validation and report unexpected failure to GDB. To avoid this, we treated disabled breakpoints as HW_BREAKPOINT_EMPTY, but that seems to have broken with recent kernels, causing watchpoints to be treated as TYPE_INST in the core code and returning ENOSPC for any further breakpoints. This patch fixes the problem by prioritising the `enable' field of the breakpoint: if it is cleared, we simply update the perf_event_attr to indicate that the thing is disabled and don't bother changing either the type or the length. This reinforces the behaviour that the breakpoint control register is essentially read-only apart from the enable bit when disabling a breakpoint. Reported-by: Aaron Liu Signed-off-by: Will Deacon Signed-off-by: Catalin Marinas Signed-off-by: Greg Kroah-Hartman commit 885154cee87db097b01ae35d9ae9ee5feabeb9d7 Author: Miao Xie Date: Mon Dec 16 15:20:01 2013 +0800 ftrace: Initialize the ftrace profiler for each possible cpu commit c4602c1c818bd6626178d6d3fcc152d9f2f48ac0 upstream. Ftrace currently initializes only the online CPUs. This implementation has two problems: - If we online a CPU after we enable the function profile, and then run the test, we will lose the trace information on that CPU. Steps to reproduce: # echo 0 > /sys/devices/system/cpu/cpu1/online # cd /tracing/ # echo >> set_ftrace_filter # echo 1 > function_profile_enabled # echo 1 > /sys/devices/system/cpu/cpu1/online # run test - If we offline a CPU before we enable the function profile, we will not clear the trace information when we enable the function profile. It will trouble the users. Steps to reproduce: # cd /tracing/ # echo >> set_ftrace_filter # echo 1 > function_profile_enabled # run test # cat trace_stat/function* # echo 0 > /sys/devices/system/cpu/cpu1/online # echo 0 > function_profile_enabled # echo 1 > function_profile_enabled # cat trace_stat/function* # run test # cat trace_stat/function* So it is better that we initialize the ftrace profiler for each possible cpu every time we enable the function profile instead of just the online ones. Link: http://lkml.kernel.org/r/1387178401-10619-1-git-send-email-miaox@cn.fujitsu.com Signed-off-by: Miao Xie Signed-off-by: Steven Rostedt Signed-off-by: Greg Kroah-Hartman commit 480da400c39d5c4398765623c7bb007a359a059f Author: Nicholas Bellinger Date: Thu Dec 12 12:24:11 2013 -0800 target/file: Update hw_max_sectors based on current block_size commit 95cadace8f3959282e76ebf8b382bd0930807d2c upstream. This patch allows FILEIO to update hw_max_sectors based on the current max_bytes_per_io. This is required because vfs_[writev,readv]() can accept a maximum of 2048 iovecs per call, so the enforced hw_max_sectors really needs to be calculated based on block_size. This addresses a >= v3.5 bug where block_size=512 was rejecting > 1M sized I/O requests, because FD_MAX_SECTORS was hardcoded to 2048 for the block_size=4096 case. (v2: Use max_bytes_per_io instead of ->update_hw_max_sectors) Reported-by: Henrik Goldman Signed-off-by: Nicholas Bellinger Signed-off-by: Greg Kroah-Hartman commit 42ea20ee7fe958123981979da1c459160733dfdb Author: Nicholas Bellinger Date: Mon Nov 25 14:53:57 2013 -0800 iscsi-target: Fix-up all zero data-length CDBs with R/W_BIT set commit 4454b66cb67f14c33cd70ddcf0ff4985b26324b7 upstream. This patch changes special case handling for ISCSI_OP_SCSI_CMD where an initiator sends a zero length Expected Data Transfer Length (EDTL), but still sets the WRITE and/or READ flag bits when no payload transfer is requested. Many, many moons ago two special cases where added for an ancient version of ESX that has long since been fixed, so instead of adding a new special case for the reported bug with a Broadcom 57800 NIC, go ahead and always strip off the incorrect WRITE + READ flag bits. Also, avoid sending a reject here, as RFC-3720 does mandate this case be handled without protocol error. Reported-by: Witold Bazakbal <865perl@wp.pl> Tested-by: Witold Bazakbal <865perl@wp.pl> Signed-off-by: Nicholas Bellinger Signed-off-by: Greg Kroah-Hartman commit 0629b407e1855bd34f0b5feb9f9cbffd3416f0fd Author: Wei Yongjun Date: Tue Oct 29 09:56:34 2013 +0800 iser-target: fix error return code in isert_create_device_ib_res() commit 94a7111043d99819cd0a72d9b3174c7054adb2a0 upstream. Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Signed-off-by: Wei Yongjun Signed-off-by: Nicholas Bellinger Signed-off-by: Greg Kroah-Hartman commit 2b25925431bfedfbd1b234afcadbcdf4f09901f8 Author: Oleg Nesterov Date: Mon Dec 23 17:45:01 2013 -0500 selinux: selinux_setprocattr()->ptrace_parent() needs rcu_read_lock() commit c0c1439541f5305b57a83d599af32b74182933fe upstream. selinux_setprocattr() does ptrace_parent(p) under task_lock(p), but task_struct->alloc_lock doesn't pin ->parent or ->ptrace, this looks confusing and triggers the "suspicious RCU usage" warning because ptrace_parent() does rcu_dereference_check(). And in theory this is wrong, spin_lock()->preempt_disable() doesn't necessarily imply rcu_read_lock() we need to access the ->parent. Reported-by: Evan McNabb Signed-off-by: Oleg Nesterov Signed-off-by: Paul Moore Signed-off-by: Greg Kroah-Hartman commit faecbbe4213acc9da2aa44d0577cad1a9f419946 Author: Chad Hanson Date: Mon Dec 23 17:45:01 2013 -0500 selinux: fix broken peer recv check commit 46d01d63221c3508421dd72ff9c879f61053cffc upstream. Fix a broken networking check. Return an error if peer recv fails. If secmark is active and the packet recv succeeds the peer recv error is ignored. Signed-off-by: Chad Hanson Signed-off-by: Paul Moore Signed-off-by: Greg Kroah-Hartman commit f62f6338d12d86133ea925822a3a95821a74ae58 Author: Bjørn Mork Date: Fri Nov 29 20:17:45 2013 +0100 usb: cdc-wdm: manage_power should always set needs_remote_wakeup commit 4144bc861ed7934d56f16d2acd808d44af0fcc90 upstream. Reported-by: Oliver Neukum Signed-off-by: Bjørn Mork Acked-by: Oliver Neukum Signed-off-by: Greg Kroah-Hartman commit b52cc5df7872f64f206397a7dd3cfa2954725ea5 Author: Marc Kleine-Budde Date: Sat Dec 14 14:36:25 2013 +0100 can: peak_usb: fix mem leak in pcan_usb_pro_init() commit 20fb4eb96fb0350d28fc4d7cbfd5506711079592 upstream. This patch fixes a memory leak in pcan_usb_pro_init(). In patch f14e224 net: can: peak_usb: Do not do dma on the stack the struct pcan_usb_pro_fwinfo *fi and struct pcan_usb_pro_blinfo *bi were converted from stack to dynamic allocation va kmalloc(). However the corresponding kfree() was not introduced. This patch adds the missing kfree(). Reported-by: Stephane Grosjean Signed-off-by: Marc Kleine-Budde Signed-off-by: Greg Kroah-Hartman commit c045542c6809649fffaae86a8e84e39cb350965f Author: Dmitry Kunilov Date: Tue Dec 3 12:11:30 2013 -0800 usb: serial: zte_ev: move support for ZTE AC2726 from zte_ev back to option commit 52d0dc7597c89b2ab779f3dcb9b9bf0800dd9218 upstream. ZTE AC2726 EVDO modem drops ppp connection every minute when driven by zte_ev but works fine when driven by option. Move the support for AC2726 back to option driver. Signed-off-by: Dmitry Kunilov Signed-off-by: Greg Kroah-Hartman commit a718268abcb22c6276b8a9126335b6f0714bd0e9 Author: Mika Westerberg Date: Tue Dec 10 12:56:59 2013 +0200 serial: 8250_dw: add new ACPI IDs commit d24c195f90cb1adb178d26d84c722d4b9e551e05 upstream. Newer Intel PCHs with LPSS have the same Designware controllers than Haswell but ACPI IDs are different. Add these IDs to the driver list. Signed-off-by: Mika Westerberg Signed-off-by: Greg Kroah-Hartman commit 3cb9bc507add4a5fe635b5cb5dfb228b5c44cf86 Author: Jonathan Cameron Date: Wed Dec 11 18:45:00 2013 +0000 iio:adc:ad7887 Fix channel reported endianness from cpu to big endian commit e39d99059ad7f75d7ae2d3c59055d3c476cdb0d9 upstream. Note this also sets the endianness to big endian whereas it would previously have defaulted to the cpu endian. Hence technically this is a bug fix on LE platforms. Signed-off-by: Jonathan Cameron Acked-by: Lars-Peter Clausen Signed-off-by: Greg Kroah-Hartman commit 3c24f339917b1eceb423c6a39f4807d58ef679fe Author: Jonathan Cameron Date: Wed Dec 11 18:45:00 2013 +0000 iio:imu:adis16400 fix pressure channel scan type commit 3425c0f7ac61f2fcfb7f2728e9b7ba7e27aec429 upstream. A single channel in this driver was using the IIO_ST macro. This does not provide a parameter for setting the endianness of the channel. Thus this channel will have been reported as whatever is the native endianness of the cpu rather than big endian. This means it would be incorrect on little endian platforms. Signed-off-by: Jonathan Cameron Acked-by: Lars-Peter Clausen Signed-off-by: Greg Kroah-Hartman commit 19fb3e448bee619a87df7e21ee23aaa0dddbec39 Author: David Henningsson Date: Thu Dec 12 09:52:03 2013 +0100 ALSA: hda - Add enable_msi=0 workaround for four HP machines commit 693e0cb052c607e2d41edf9e9f1fa99ff8c266c1 upstream. While enabling these machines, we found we would sometimes lose an interrupt if we change hardware volume during playback, and that disabling msi fixed this issue. (Losing the interrupt caused underruns and crackling audio, as the one second timeout is usually bigger than the period size.) The machines were all machines from HP, running AMD Hudson controller, and Realtek ALC282 codec. BugLink: https://bugs.launchpad.net/bugs/1260225 Signed-off-by: David Henningsson Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 79acc77a3f65760e9403d431f4a0b100dccdfb6e Author: JongHo Kim Date: Tue Dec 17 23:02:24 2013 +0900 ALSA: Add SNDRV_PCM_STATE_PAUSED case in wait_for_avail function commit ed697e1aaf7237b1a62af39f64463b05c262808d upstream. When the process is sleeping at the SNDRV_PCM_STATE_PAUSED state from the wait_for_avail function, the sleep process will be woken by timeout(10 seconds). Even if the sleep process wake up by timeout, by this patch, the process will continue with sleep and wait for the other state. Signed-off-by: JongHo Kim Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 6dd598aca39d1e62fd26403322b762adfd3b2440 Author: Charles Keepax Date: Tue Dec 17 13:16:16 2013 +0000 ASoC: wm5110: Correct HPOUT3 DAPM route typo commit 280484e708a3cc38fe6807718caa460e744c0b20 upstream. Reported-by: Kyung-Kwee Ryu Signed-off-by: Charles Keepax Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 5c8a555270784f007e5bd35d12c7885a112e9545 Author: Charles Keepax Date: Wed Dec 18 09:25:49 2013 +0000 ASoC: wm_adsp: Add small delay while polling DSP RAM start commit 939fd1e8d9deff206f12bd9d4e54aa7a4bd0ffd6 upstream. Some devices are getting very close to the limit whilst polling the RAM start, this patch adds a small delay to this loop to give a longer startup timeout. Signed-off-by: Charles Keepax Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 9231e317df6664c2d5322d66e91d641c9c0c6d02 Author: Bo Shen Date: Wed Dec 18 11:26:23 2013 +0800 ASoC: wm8904: fix DSP mode B configuration commit f0199bc5e3a3ec13f9bc938556517ec430b36437 upstream. When wm8904 work in DSP mode B, we still need to configure it to work in DSP mode. Or else, it will work in Right Justified mode. Signed-off-by: Bo Shen Acked-by: Charles Keepax Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 05687bd4a14b3970d1f903def34aa421634f31fe Author: Stephen Warren Date: Fri Dec 6 13:34:50 2013 -0700 ASoC: tegra: fix uninitialized variables in set_fmt commit 241bf43321a10815225f477bba96a42285a2da73 upstream. In tegra*_i2s_set_fmt(), in the (fmt == SND_SOC_DAIFMT_CBM_CFM) case, "val" is never assigned to, but left uninitialized. The other case does initialized it. Fix this by initializing val at the start of the function, and only ever ORing into it. Update the handling of "mask" so it works the same way for consistency. Update tegra20_spdif.c to use the same code-style for consistency, even though it doesn't happen to suffer from the same problem at present. Signed-off-by: Stephen Warren Reviewed-by: Thierry Reding Signed-off-by: Mark Brown Fixes: 0f163546a772 ("ASoC: tegra: use regmap more directly") Signed-off-by: Greg Kroah-Hartman commit f89a10d99cdf7510fd8b8440d3d6f1d97098c00f Author: Ian Abbott Date: Fri Dec 13 12:00:30 2013 +0000 staging: comedi: 8255_pci: fix for newer PCI-DIO48H commit 0283f7a100882684ad32b768f9f1ad81658a0b92 upstream. At some point, Measurement Computing / ComputerBoards redesigned the PCI-DIO48H to use a PLX PCI interface chip instead of an AMCC chip. This meant they had to put their hardware registers in the PCI BAR 2 region instead of PCI BAR 1. Unfortunately, they kept the same PCI device ID for the new design. This means the driver recognizes the newer cards, but doesn't work (and is likely to screw up the local configuration registers of the PLX chip) because it's using the wrong region. Since the PCI subvendor and subdevice IDs were both zero on the old design, but are the same as the vendor and device on the new design, we can tell the old design and new design apart easily enough. Split the existing entry for the PCI-DIO48H in `pci_8255_boards[]` into two new entries, referenced by different entries in the PCI device ID table `pci_8255_pci_table[]`. Use the same board name for both entries. Signed-off-by: Ian Abbott Acked-by: H Hartley Sweeten Signed-off-by: Greg Kroah-Hartman commit fb36b98472a57ce5e9aaa81de2a8d21089210071 Author: Geert Uytterhoeven Date: Fri Nov 22 16:47:26 2013 +0100 TTY: pmac_zilog, check existence of ports in pmz_console_init() commit dc1dc2f8a5dd863bf2e79f338fc3ae29e99c683a upstream. When booting a multi-platform m68k kernel on a non-Mac with "console=ttyS0" on the kernel command line, it crashes with: Unable to handle kernel NULL pointer dereference at virtual address (null) Oops: 00000000 PC: [<0013ad28>] __pmz_startup+0x32/0x2a0 ... Call Trace: [<002c5d3e>] pmz_console_setup+0x64/0xe4 The normal tty driver doesn't crash, because init_pmz() checks pmz_ports_count again after calling pmz_probe(). In the serial console initialization path, pmz_console_init() doesn't do this, causing the driver to crash later. Add a check for pmz_ports_count to fix this. Signed-off-by: Geert Uytterhoeven Cc: Finn Thain Cc: Benjamin Herrenschmidt Signed-off-by: Greg Kroah-Hartman commit 56046f38a4ef578303487009c58e2f86fab7aaea Author: pingfan liu Date: Fri Nov 15 16:35:00 2013 +0800 powerpc: kvm: fix rare but potential deadlock scene commit 91648ec09c1ef69c4d840ab6dab391bfb452d554 upstream. Since kvmppc_hv_find_lock_hpte() is called from both virtmode and realmode, so it can trigger the deadlock. Suppose the following scene: Two physical cpuM, cpuN, two VM instances A, B, each VM has a group of vcpus. If on cpuM, vcpu_A_1 holds bitlock X (HPTE_V_HVLOCK), then is switched out, and on cpuN, vcpu_A_2 try to lock X in realmode, then cpuN will be caught in realmode for a long time. What makes things even worse if the following happens, On cpuM, bitlockX is hold, on cpuN, Y is hold. vcpu_B_2 try to lock Y on cpuM in realmode vcpu_A_2 try to lock X on cpuN in realmode Oops! deadlock happens Signed-off-by: Liu Ping Fan Reviewed-by: Paul Mackerras Signed-off-by: Alexander Graf Signed-off-by: Greg Kroah-Hartman commit d58dd25632be5ec0250630ccddfea7e9bd913ecb Author: Yan, Zheng Date: Thu Oct 31 09:10:47 2013 +0800 ceph: wake up 'safe' waiters when unregistering request commit fc55d2c9448b34218ca58733a6f51fbede09575b upstream. We also need to wake up 'safe' waiters if error occurs or request aborted. Otherwise sync(2)/fsync(2) may hang forever. Signed-off-by: Yan, Zheng Signed-off-by: Sage Weil Signed-off-by: Greg Kroah-Hartman commit 2c460e835ceecda4b327aacb6423b54f9a89042f Author: Yan, Zheng Date: Thu Sep 26 14:25:36 2013 +0800 ceph: cleanup aborted requests when re-sending requests. commit eb1b8af33c2e42a9a57fc0a7588f4a7b255d2e79 upstream. Aborted requests usually get cleared when the reply is received. If MDS crashes, no reply will be received. So we need to cleanup aborted requests when re-sending requests. Signed-off-by: Yan, Zheng Reviewed-by: Greg Farnum Signed-off-by: Sage Weil Signed-off-by: Greg Kroah-Hartman commit d07732a871f6f0906c012e89e2a10edb3c9be3b4 Author: Johan Hovold Date: Sat Nov 9 12:38:09 2013 +0100 USB: serial: fix race in generic write commit 6f6485463aada1ec6a0f3db6a03eb8e393d6bb55 upstream. Fix race in generic write implementation, which could lead to temporarily degraded throughput. The current generic write implementation introduced by commit 27c7acf22047 ("USB: serial: reimplement generic fifo-based writes") has always had this bug, although it's fairly hard to trigger and the consequences are not likely to be noticed. Specifically, a write() on one CPU while the completion handler is running on another could result in only one of the two write urbs being utilised to empty the remainder of the write fifo (unless there is a second write() that doesn't race during that time). Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman