commit 8341455f7f2b36212f8cdded7725e93b17f5a8fc
Author: Sasha Levin
Date:   Wed Oct 28 22:49:46 2015 -0400

    Linux 3.18.23

    Signed-off-by: Sasha Levin

commit 4534a1c432551e5588955a5eb8f6cb26c69b047f
Author: Steven Rostedt
Date:   Fri Feb 27 14:50:19 2015 -0500

    x86: Init per-cpu shadow copy of CR4 on 32-bit CPUs too

    [ Upstream commit 5b2bdbc84556774afbe11bcfd24c2f6411cfa92b ]

    Commit: 1e02ce4cccdc ("x86: Store a per-cpu shadow copy of CR4") added
    a shadow CR4 such that reads and writes that do not modify the CR4
    execute much faster than always reading the register itself.

    The change modified cpu_init() in common.c, so that the shadow CR4
    gets initialized before anything uses it.

    Unfortunately, there's two cpu_init()s in common.c. There's one for
    64-bit and one for 32-bit. The commit only added the shadow init to
    the 64-bit path, but the 32-bit path needs the init too.

    Link: http://lkml.kernel.org/r/20150227125208.71c36402@gandalf.local.home
    Fixes: 1e02ce4cccdc "x86: Store a per-cpu shadow copy of CR4"
    Signed-off-by: Steven Rostedt
    Acked-by: Andy Lutomirski
    Cc: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/20150227145019.2bdd4354@gandalf.local.home
    Signed-off-by: Ingo Molnar
    Signed-off-by: Sasha Levin

commit 4f1a98a2e0cabac1ddcd9aa83cec37526a545bac
Author: Christoph Hellwig
Date:   Sat Oct 3 19:16:07 2015 +0200

    3w-9xxx: don't unmap bounce buffered commands

    [ Upstream commit 15e3d5a285ab9283136dba34bbf72886d9146706 ]

    3w controller don't dma map small single SGL entry commands but instead
    bounce buffer them. Add a helper to identify these commands and don't
    call scsi_dma_unmap for them.

    Based on an earlier patch from James Bottomley.

    Fixes: 118c85 ("3w-9xxx: fix command completion race")
    Reported-by: Tóth Attila
    Tested-by: Tóth Attila
    Signed-off-by: Christoph Hellwig
    Acked-by: Adam Radford
    Signed-off-by: James Bottomley
    Signed-off-by: Sasha Levin

commit cb4b2ae6950b905c57d09ca92e953583a2430df2
Author: Roland Dreier
Date:   Mon Oct 5 10:29:28 2015 -0700

    fib_rules: Fix dump_rules() not to exit early

    [ Upstream commit 8ea4b34355189e1f1eacaf2d825f2dce776b3b9c ]

    Backports of 41fc014332d9 ("fib_rules: fix fib rule dumps across
    multiple skbs") introduced a regression in "ip rule show" - it ends up
    dumping the first rule over and over and never exiting, because 3.19
    and earlier are missing commit 053c095a82cf ("netlink: make
    nlmsg_end() and genlmsg_end() void"), so fib_nl_fill_rule() ends up
    returning skb->len (i.e. > 0) in the success case.

    Fix this by checking the return code for < 0 instead of != 0.

    Signed-off-by: Roland Dreier
    Signed-off-by: Greg Kroah-Hartman
    Signed-off-by: Sasha Levin

commit 7f61fd99a60195578215f46ed870b5c118cfbfc0
Author: Eric W. Biederman
Date:   Sat Aug 15 20:27:13 2015 -0500

    vfs: Test for and handle paths that are unreachable from their mnt_root

    [ Upstream commit 397d425dc26da728396e66d392d5dcb8dac30c37 ]

    In rare cases a directory can be renamed out from under a bind
    mount. In those cases without special handling it becomes possible
    to walk up the directory tree to the root dentry of the filesystem
    and down from the root dentry to every other file or directory on
    the filesystem.

    Like division by zero .. from an unconnected path can not be given
    a useful semantic as there is no predicting at which path component
    the code will realize it is unconnected. We certainly can not match
    the current behavior as the current behavior is a security hole.

    Therefore when encountering .. when following an unconnected path
    return -ENOENT.

    - Add a function path_connected to verify path->dentry is reachable
      from path->mnt.mnt_root.
      AKA to validate that rename did not do something nasty to the
      bind mount.

    To avoid races path_connected must be called after following a path
    component to its next path component.

    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: Al Viro
    Signed-off-by: Sasha Levin

commit 3e6fba2fc343ef5d80f751b699a4688a2f57f861
Author: NeilBrown
Date:   Wed Jul 22 10:20:07 2015 +1000

    md: flush ->event_work before stopping array.

    [ Upstream commit ee5d004fd0591536a061451eba2b187092e9127c ]

    The 'event_work' worker used by dm-raid may still be running
    when the array is stopped. This can result in an oops.

    So flush the workqueue on which it is run after detaching
    and before destroying the device.

    Reported-by: Heinz Mauelshagen
    Signed-off-by: NeilBrown
    Cc: stable@vger.kernel.org (2.6.38+ please delay 2 weeks after -final release)
    Fixes: 9d09e663d550 ("dm: raid456 basic support")
    Signed-off-by: Sasha Levin

commit a70046a8f849fb1a98c5aee0c3504ddf61461eab
Author: Andy Lutomirski
Date:   Sun Sep 20 16:32:05 2015 -0700

    x86/nmi/64: Fix a paravirt stack-clobbering bug in the NMI code

    [ Upstream commit 83c133cf11fb0e68a51681447e372489f052d40e ]

    The NMI entry code that switches to the normal kernel stack needs to
    be very careful not to clobber any extra stack slots on the NMI
    stack. The code is fine under the assumption that SWAPGS is just a
    normal instruction, but that assumption isn't really true. Use
    SWAPGS_UNSAFE_STACK instead.

    This is part of a fix for some random crashes that Sasha saw.

    Fixes: 9b6e6a8334d5 ("x86/nmi/64: Switch stacks on userspace NMI entry")
    Reported-and-tested-by: Sasha Levin
    Signed-off-by: Andy Lutomirski
    Cc: stable@vger.kernel.org
    Link: http://lkml.kernel.org/r/974bc40edffdb5c2950a5c4977f821a446b76178.1442791737.git.luto@kernel.org
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Sasha Levin

commit c735802ffff335a033b1fba5efd26d778d7b43d6
Author: Markus Pargmann
Date:   Wed Jul 29 15:46:03 2015 +0200

    Revert "iio: bmg160: IIO_BUFFER and IIO_TRIGGERED_BUFFER are required"

    [ Upstream commit HEAD ]

    This reverts commit 279c039ca63acbd69e69d6d7ddfed50346fb2185 which was
    commit 06d2f6ca5a38abe92f1f3a132b331eee773868c3 upstream as it should
    not have been applied.

    Reported-by: Luis Henriques
    Cc: Markus Pargmann
    Cc: Srinivas Pandruvada
    Cc: Jonathan Cameron
    Signed-off-by: Greg Kroah-Hartman
    (cherry picked from commit 934d9b907aeec5f344ca801ed7361551199dfc69)
    Signed-off-by: Sasha Levin

commit d9a1133495b487154ac351cd33b26b416e966d2d
Author: Herbert Xu
Date:   Mon Jul 13 16:04:13 2015 +0800

    net: Fix skb_set_peeked use-after-free bug

    [ Upstream commit a0a2a6602496a45ae838a96db8b8173794b5d398 ]

    The commit 738ac1ebb96d02e0d23bc320302a6ea94c612dec ("net: Clone
    skb before setting peeked flag") introduced a use-after-free bug
    in skb_recv_datagram. This is because skb_set_peeked may create
    a new skb and free the existing one. As it stands the caller will
    continue to use the old freed skb.

    This patch fixes it by making skb_set_peeked return the new skb
    (or the old one if unchanged).

    Fixes: 738ac1ebb96d ("net: Clone skb before setting peeked flag")
    Reported-by: Brenden Blanco
    Signed-off-by: Herbert Xu
    Tested-by: Brenden Blanco
    Reviewed-by: Konstantin Khlebnikov
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

commit 7feb2fc286e0712a6c17fa68ae20f7aa24a475df
Author: Yinghai Lu
Date:   Fri Sep 4 15:42:39 2015 -0700

    mm: check if section present during memory block registering

    [ Upstream commit 04697858d89e4bf2650364f8d6956e2554e8ef88 ]

    Tony Luck found on his setup, if memory block size 512M will cause
    crash during booting.

      BUG: unable to handle kernel paging request at ffffea0074000020
      IP: get_nid_for_pfn+0x17/0x40
      PGD 128ffcb067 PUD 128ffc9067 PMD 0
      Oops: 0000 [#1] SMP
      Modules linked in:
      CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.2.0-rc8 #1
      ...
      Call Trace:
         ? register_mem_sect_under_node+0x66/0xe0
         register_one_node+0x17b/0x240
         ? pci_iommu_alloc+0x6e/0x6e
         topology_init+0x3c/0x95
         do_one_initcall+0xcd/0x1f0

    The system has non continuous RAM address:

      BIOS-e820: [mem 0x0000001300000000-0x0000001cffffffff] usable
      BIOS-e820: [mem 0x0000001d70000000-0x0000001ec7ffefff] usable
      BIOS-e820: [mem 0x0000001f00000000-0x0000002bffffffff] usable
      BIOS-e820: [mem 0x0000002c18000000-0x0000002d6fffefff] usable
      BIOS-e820: [mem 0x0000002e00000000-0x00000039ffffffff] usable

    So there are start sections in memory block not present. For example:

      memory block : [0x2c18000000, 0x2c20000000) 512M

    first three sections are not present.

    The current register_mem_sect_under_node() assume first section is
    present, but memory block section number range [start_section_nr,
    end_section_nr] would include not present section.

    For arch that support vmemmap, we don't setup memmap for struct page
    area within not present sections area.

    So skip the pfn range that belong to absent section.

    [akpm@linux-foundation.org: simplification]
    [rientjes@google.com: more simplification]
    Fixes: bdee237c0343 ("x86: mm: Use 2GB memory block size on large memory x86-64 systems")
    Fixes: 982792c782ef ("x86, mm: probe memory block size for generic x86 64bit")
    Signed-off-by: Yinghai Lu
    Signed-off-by: David Rientjes
    Reported-by: Tony Luck
    Tested-by: Tony Luck
    Cc: Greg KH
    Cc: Ingo Molnar
    Tested-by: David Rientjes
    Cc: [3.15+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin

commit c613e5be6cd5b8c3da2fd15d5757cb4e0bd4da03
Author: Mikulas Patocka
Date:   Wed Sep 2 22:51:53 2015 +0200

    hpfs: update ctime and mtime on directory modification

    [ Upstream commit f49a26e7718dd30b49e3541e3e25aecf5e7294e2 ]

    Update ctime and mtime when a directory is modified. (though OS/2
    doesn't update them anyway)

    Signed-off-by: Mikulas Patocka
    Cc: stable@kernel.org # v3.3+
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin

commit 7267a72e74f8e92f09e4cab1e84fca1340df673c
Author: Grant Likely
Date:   Sun Jun 7 15:20:11 2015 +0100

    drivercore: Fix unregistration path of platform devices

    [ Upstream commit 7f5dcaf1fdf289767a126a0a5cc3ef39b5254b06 ]

    The unregister path of platform_device is broken. On registration, it
    will register all resources with either a parent already set, or
    type==IORESOURCE_{IO,MEM}. However, on unregister it will release
    everything with type==IORESOURCE_{IO,MEM}, but ignore the others. There
    are also cases where resources don't get registered in the first place,
    like with devices created by of_platform_populate()*.

    Fix the unregister path to be symmetrical with the register path by
    checking the parent pointer instead of the type field to decide which
    resources to unregister. This is safe because the upshot of the
    registration path algorithm is that registered resources have a parent
    pointer, and non-registered resources do not.

    * It can be argued that of_platform_populate() should be registering
      its resources, and the argument has some merit. However, there are
      quite a few platforms that end up broken if we try to do that due to
      overlapping resources in the device tree. Until that is fixed, we
      need to solve the immediate problem.

    Cc: Pantelis Antoniou
    Cc: Wolfram Sang
    Cc: Rob Herring
    Cc: Greg Kroah-Hartman
    Cc: Ricardo Ribalda Delgado
    Signed-off-by: Grant Likely
    Tested-by: Ricardo Ribalda Delgado
    Tested-by: Wolfram Sang
    Cc: stable@vger.kernel.org
    Signed-off-by: Rob Herring
    Signed-off-by: Sasha Levin

commit 2b839b01233b7566da3330a7a7b5d56ffd322451
Author: Vignesh R
Date:   Wed Jun 3 17:21:20 2015 +0530

    ARM: OMAP2+: DRA7: clockdomain: change l4per2_7xx_clkdm to SW_WKUP

    [ Upstream commit b9e23f321940d2db2c9def8ff723b8464fb86343 ]

    Legacy IPs like PWMSS, present under l4per2_7xx_clkdm, cannot support
    smart-idle when its clock domain is in HW_AUTO on DRA7 SoCs. Hence,
    program clock domain to SW_WKUP.

    Signed-off-by: Vignesh R
    Acked-by: Tero Kristo
    Reviewed-by: Paul Walmsley
    Signed-off-by: Paul Walmsley
    Cc:
    Signed-off-by: Sasha Levin

commit be2a16239713fcaa5c0479c5cb0821cdba59fc81
Author: David Daney
Date:   Wed Aug 19 13:17:47 2015 -0700

    of/address: Don't loop forever in of_find_matching_node_by_address().

    [ Upstream commit 3a496b00b6f90c41bd21a410871dfc97d4f3c7ab ]

    If the internal call to of_address_to_resource() fails, we end up
    looping forever in of_find_matching_node_by_address(). This can be
    caused by a defective device tree, or calling with an incorrect
    matches argument.

    Fix by calling of_find_matching_node() unconditionally at the end of
    the loop.
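
Editorial illustration (not part of the upstream commit): the loop fix described for of_find_matching_node_by_address() can be modelled with a small Python sketch. The `Node` class and `find_node_by_address()` are hypothetical stand-ins, assuming a failed address lookup is represented by `start = None`; this is not the kernel code.

```python
from typing import Optional, Sequence


class Node:
    """Hypothetical stand-in for a device-tree node.

    `start` is None when the address lookup fails, which plays the role
    of of_address_to_resource() returning an error.
    """

    def __init__(self, name: str, start: Optional[int]):
        self.name = name
        self.start = start


def find_node_by_address(nodes: Sequence[Node], base_address: int) -> Optional[Node]:
    i = 0
    while i < len(nodes):
        node = nodes[i]
        # Compare only when the address lookup succeeded.
        if node.start is not None and node.start == base_address:
            return node
        # Advance unconditionally. Skipping this step after a failed
        # lookup (for example via an early `continue`) would retry the
        # same node forever.
        i += 1
    return None
```

With the advance step at the end of every iteration, a node whose lookup fails is simply skipped and the search still terminates.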

    Signed-off-by: David Daney
    Cc: stable@vger.kernel.org
    Signed-off-by: Rob Herring
    Signed-off-by: Sasha Levin

commit 825cc9616f9b826bb700973b58ae7affa3da5748
Author: Sudip Mukherjee
Date:   Mon Jul 20 17:27:21 2015 +0530

    auxdisplay: ks0108: fix refcount

    [ Upstream commit bab383de3b84e584b0f09227151020b2a43dc34c ]

    parport_find_base() will implicitly do parport_get_port() which
    increases the refcount. Then parport_register_device() will again
    increment the refcount. But while unloading the module we are only
    doing parport_unregister_device() decrementing the refcount only once.
    We add an parport_put_port() to neutralize the effect of
    parport_get_port().

    Cc: # 2.6.32+
    Signed-off-by: Sudip Mukherjee
    Signed-off-by: Greg Kroah-Hartman
    Signed-off-by: Sasha Levin

commit 9f6e3432b9b0f30c83174ca3990b747675d9f768
Author: Peter Chen
Date:   Fri Jul 31 16:36:29 2015 +0800

    Doc: ABI: testing: configfs-usb-gadget-sourcesink

    [ Upstream commit 4bc58eb16bb2352854b9c664cc36c1c68d2bfbb7 ]

    Fix the name of attribute

    Cc:
    Signed-off-by: Peter Chen
    Signed-off-by: Felipe Balbi
    Signed-off-by: Sasha Levin

commit 59c3c6c94a143436ec512eb0db32d34a473b59c1
Author: Peter Chen
Date:   Fri Jul 31 16:36:28 2015 +0800

    Doc: ABI: testing: configfs-usb-gadget-loopback

    [ Upstream commit 8cd50626823c00ca7472b2f61cb8c0eb9798ddc0 ]

    Fix the name of attribute

    Cc:
    Signed-off-by: Peter Chen
    Signed-off-by: Felipe Balbi
    Signed-off-by: Sasha Levin

commit 6e046d33a18c008065d9d3b7448f244e55e53897
Author: Masahiro Yamada
Date:   Wed Jul 15 10:29:00 2015 +0900

    devres: fix devres_get()

    [ Upstream commit 64526370d11ce8868ca495723d595b61e8697fbf ]

    Currently, devres_get() passes devres_free() the pointer to devres,
    but devres_free() should be given with the pointer to resource data.

    Fixes: 9ac7849e35f7 ("devres: device resource management")
    Signed-off-by: Masahiro Yamada
    Acked-by: Tejun Heo
    Cc: stable # 2.6.21+
    Signed-off-by: Greg Kroah-Hartman
    Signed-off-by: Sasha Levin

commit ab0a85f0cb4fcf6390d223d53bddf9478a8a27f9
Author: Max Filippov
Date:   Thu Jul 16 10:41:02 2015 +0300

    xtensa: fix kernel register spilling

    [ Upstream commit 77d6273e79e3a86552fcf10cdd31a69b46ed2ce6 ]

    call12 can't be safely used as the first call in the inline function,
    because the compiler does not extend the stack frame of the bounding
    function accordingly, which may result in corruption of local variables.

    If a call needs to be done, do call8 first followed by call12.

    For pure assembly code in _switch_to increase stack frame size of the
    bounding function.

    Cc: stable@vger.kernel.org
    Signed-off-by: Max Filippov
    Signed-off-by: Sasha Levin

commit b8a64ee186718fd0cd6616c907a89254f6583d3a
Author: Max Filippov
Date:   Sat Jul 4 15:27:39 2015 +0300

    xtensa: fix threadptr reload on return to userspace

    [ Upstream commit 4229fb12a03e5da5882b420b0aa4a02e77447b86 ]

    Userspace return code may skip restoring THREADPTR register if there are
    no registers that need to be zeroed. This leads to spurious failures in
    libc NPTL tests.

    Always restore THREADPTR on return to userspace.

    Cc: stable@vger.kernel.org
    Signed-off-by: Max Filippov
    Signed-off-by: Sasha Levin

commit 2c333eb66bca69bfadf5ae3424a3fb32037b5d48
Author: Xiao Guangrong
Date:   Wed Aug 5 12:04:19 2015 +0800

    KVM: MMU: fix validation of mmio page fault

    [ Upstream commit 6f691251c0350ac52a007c54bf3ef62e9d8cdc5e ]

    We got the bug that qemu complained with "KVM: unknown exit, hardware
    reason 31" and KVM shown these info:

    [84245.284948] EPT: Misconfiguration.
    [84245.285056] EPT: GPA: 0xfeda848
    [84245.285154] ept_misconfig_inspect_spte: spte 0x5eaef50107 level 4
    [84245.285344] ept_misconfig_inspect_spte: spte 0x5f5fadc107 level 3
    [84245.285532] ept_misconfig_inspect_spte: spte 0x5141d18107 level 2
    [84245.285723] ept_misconfig_inspect_spte: spte 0x52e40dad77 level 1

    This is because we got a mmio #PF and the handler see the mmio spte
    becomes normal (points to the ram page)

    However, this is valid after introducing fast mmio spte invalidation
    which increases the generation-number instead of zapping mmio sptes,
    a example is as follows:

    1. QEMU drops mmio region by adding a new memslot
    2. invalidate all mmio sptes
    3.
        VCPU 0                        VCPU 1
        access the invalid mmio spte

                                      access the region originally was
                                      MMIO before
                                      set the spte to the normal ram map

        mmio #PF
        check the spte and see it becomes normal ram mapping !!!

    This patch fixes the bug just by dropping the check in mmio handler,
    it's good for backport. Full check will be introduced in later patches

    Reported-by: Pavel Shirshov
    Tested-by: Pavel Shirshov
    Signed-off-by: Xiao Guangrong
    Cc: stable@vger.kernel.org
    Signed-off-by: Paolo Bonzini
    Signed-off-by: Sasha Levin

commit c22fac8485a145a8e8b2634839c7a33bc334d16b
Author: Don Zickus
Date:   Mon Aug 10 12:06:53 2015 -0400

    HID: usbhid: Fix the check for HID_RESET_PENDING in hid_io_error

    [ Upstream commit 3af4e5a95184d6d3c1c6a065f163faa174a96a1d ]

    It was reported that after 10-20 reboots, a usb keyboard plugged
    into a docking station would not work unless it was replugged in.

    Using usbmon, it turns out the interrupt URBs were streaming with
    callback errors of -71 for some reason. The hid-core.c::hid_io_error
    was supposed to retry and then reset, but the reset wasn't really
    happening.

    The check for HID_NO_BANDWIDTH was inverted. Fix was simple.

    Tested by reporter and locally by me by unplugging a keyboard halfway
    until I could recreate a stream of errors but no disconnect.

    Signed-off-by: Don Zickus
    Cc: stable@vger.kernel.org
    Signed-off-by: Jiri Kosina
    Signed-off-by: Sasha Levin

commit fb45b15c8e0666eb5500d7867394a5ddc9865655
Author: Andrey Ryabinin
Date:   Thu Sep 3 14:32:01 2015 +0300

    crypto: ghash-clmulni: specify context size for ghash async algorithm

    [ Upstream commit 71c6da846be478a61556717ef1ee1cea91f5d6a8 ]

    Currently context size (cra_ctxsize) doesn't specified for
    ghash_async_alg. Which means it's zero. Thus crypto_create_tfm()
    doesn't allocate needed space for ghash_async_ctx, so any
    read/write to ctx (e.g. in ghash_async_init_tfm()) is not valid.

    Cc: stable@vger.kernel.org
    Signed-off-by: Andrey Ryabinin
    Signed-off-by: Herbert Xu
    Signed-off-by: Sasha Levin

commit d7c4784332a67cf9af2e9d9b96e8ef1caf3d5e8f
Author: Maciej S. Szmigiero
Date:   Sun Aug 2 23:11:52 2015 +0200

    serial: 8250: don't bind to SMSC IrCC IR port

    [ Upstream commit ffa34de03bcfbfa88d8352942bc238bb48e94e2d ]

    SMSC IrCC SIR/FIR port should not be bound to by
    (legacy) serial driver so its own driver (smsc-ircc2)
    can bind to it.

    Signed-off-by: Maciej Szmigiero
    Cc: stable
    Signed-off-by: Greg Kroah-Hartman
    Signed-off-by: Sasha Levin

commit d50d8a83a75906fc2b6d937f4bcc6a99e1e124dc
Author: Peter Chen
Date:   Mon Aug 17 10:23:03 2015 +0800

    usb: host: ehci-sys: delete useless bus_to_hcd conversion

    [ Upstream commit 0521cfd06e1ebcd575e7ae36aab068b38df23850 ]

    The ehci platform device's drvdata is the pointer of struct usb_hcd
    already, so we doesn't need to call bus_to_hcd conversion again.

    Cc:
    Signed-off-by: Peter Chen
    Acked-by: Alan Stern
    Signed-off-by: Greg Kroah-Hartman
    Signed-off-by: Sasha Levin

commit 82c24f85b81f93626ac8dfe470fffb394fac61ce
Author: Kishon Vijay Abraham I
Date:   Mon Jul 27 12:25:27 2015 +0530

    usb: dwc3: ep0: Fix mem corruption on OUT transfers of more than 512 bytes

    [ Upstream commit b2fb5b1a0f50d3ebc12342c8d8dead245e9c9d4e ]

    DWC3 uses bounce buffer to handle non max packet aligned OUT transfers
    and the size of bounce buffer is 512 bytes.
    However if the host initiates OUT transfers of size more than 512
    bytes (and non max packet aligned), the driver throws a WARN dump but
    still programs the TRB to receive more than 512 bytes. This will cause
    bounce buffer to overflow and corrupt the adjacent memory locations
    which can be fatal.

    Fix it by programming the TRB to receive a maximum of
    DWC3_EP0_BOUNCE_SIZE (512) bytes.

    Cc: # 3.4+
    Signed-off-by: Kishon Vijay Abraham I
    Signed-off-by: Felipe Balbi
    Signed-off-by: Sasha Levin

commit a095c50310208abe8dd77bbb817271e0f9bfad1a
Author: Matthijs Kooijman
Date:   Tue Aug 18 10:33:56 2015 +0200

    USB: ftdi_sio: Added custom PID for CustomWare products

    [ Upstream commit 1fb8dc36384ae1140ee6ccc470de74397606a9d5 ]

    CustomWare uses the FTDI VID with custom PIDs for their ShipModul
    MiniPlex products.

    Signed-off-by: Matthijs Kooijman
    Cc: stable
    Signed-off-by: Johan Hovold
    Signed-off-by: Greg Kroah-Hartman
    Signed-off-by: Sasha Levin

commit 02f84c66ebaa872c03b3f7afb65d2acb2d98944c
Author: Philipp Hachtmann
Date:   Mon Aug 17 17:31:46 2015 +0200

    USB: symbolserial: Use usb_get_serial_port_data

    [ Upstream commit 951d3793bbfc0a441d791d820183aa3085c83ea9 ]

    The driver used usb_get_serial_data(port->serial) which compiled but
    resulted in a NULL pointer being returned (and subsequently used).
    I did not go deeper into this but I guess this is a regression.

    Signed-off-by: Philipp Hachtmann
    Fixes: a85796ee5149 ("USB: symbolserial: move private-data allocation to port_probe")
    Cc: stable # v3.10
    Acked-by: Johan Hovold
    Signed-off-by: Greg Kroah-Hartman
    Signed-off-by: Sasha Levin

commit d01fa89cf3615509269c17cade109c4e6fb93446
Author: Bjorn Helgaas
Date:   Fri Jun 19 15:58:24 2015 -0500

    PCI: Fix TI816X class code quirk

    [ Upstream commit d1541dc977d376406f4584d8eb055488655c98ec ]

    In fixup_ti816x_class(), we assigned "class = PCI_CLASS_MULTIMEDIA_VIDEO".
    But PCI_CLASS_MULTIMEDIA_VIDEO is only the two-byte base class/sub-class
    and needs to be shifted to make space for the low-order interface byte.
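
Editorial illustration (not part of the upstream commit): a minimal Python sketch of the class-code layout described for the TI816X quirk. It assumes the usual 24-bit layout (base class, sub-class, interface byte) and the value 0x0400 for PCI_CLASS_MULTIMEDIA_VIDEO; `class_code()` and `decode()` are hypothetical helpers, not kernel functions.

```python
# Two-byte base class/sub-class constant: base class 0x04 (multimedia),
# sub-class 0x00 (video).
PCI_CLASS_MULTIMEDIA_VIDEO = 0x0400


def class_code(base_and_sub: int, prog_if: int = 0) -> int:
    """Build a 24-bit class code: base class, sub-class, interface byte."""
    return ((base_and_sub & 0xFFFF) << 8) | (prog_if & 0xFF)


def decode(code: int) -> tuple:
    """Split a 24-bit class code into (base class, sub-class, interface)."""
    return (code >> 16) & 0xFF, (code >> 8) & 0xFF, code & 0xFF


unshifted = PCI_CLASS_MULTIMEDIA_VIDEO             # what the quirk assigned
shifted = class_code(PCI_CLASS_MULTIMEDIA_VIDEO)   # with room for the interface byte
```

Decoding the unshifted value puts 0x04 in the sub-class position, while the shifted value decodes to base class 0x04, sub-class 0x00, interface 0x00.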

    Shift PCI_CLASS_MULTIMEDIA_VIDEO to set the correct class code.

    Fixes: 63c4408074cb ("PCI: Add quirk for setting valid class for TI816X Endpoint")
    Signed-off-by: Bjorn Helgaas
    CC: Hemant Pedanekar
    Signed-off-by: Sasha Levin

commit 8437a1598f929204f5bd683431f5f8fa4481cbe9
Author: Dan Carpenter
Date:   Wed Jul 29 13:17:06 2015 +0300

    clk: versatile: off by one in clk_sp810_timerclken_of_get()

    [ Upstream commit 3294bee87091be5f179474f6c39d1d87769635e2 ]

    The ">" should be ">=" or we end up reading beyond the end of the array.

    Fixes: 6e973d2c4385 ('clk: vexpress: Add separate SP810 driver')
    Signed-off-by: Dan Carpenter
    Acked-by: Pawel Moll
    Signed-off-by: Stephen Boyd
    Signed-off-by: Sasha Levin

commit 51935aeef6c9b82ee82d37c5dc5d9244a379b5d1
Author: Ian Abbott
Date:   Tue Aug 11 13:05:10 2015 +0100

    staging: comedi: adl_pci7x3x: fix digital output on PCI-7230

    [ Upstream commit ad83dbd974feb2e2a8cc071a1d28782bd4d2c70e ]

    The "adl_pci7x3x" driver replaced the "adl_pci7230" and "adl_pci7432"
    drivers in commits 8f567c373c4b ("staging: comedi: new adl_pci7x3x
    driver") and 657f77d173d3 ("staging: comedi: remove adl_pci7230 and
    adl_pci7432 drivers"). Although the new driver code agrees with the
    user manuals for the respective boards, digital outputs stopped working
    on the PCI-7230. This has 16 digital output channels and the previous
    adl_pci7230 driver shifted the 16 bit output state left by 16 bits
    before writing to the hardware register. The new adl_pci7x3x driver
    doesn't do that. Fix it in `adl_pci7x3x_do_insn_bits()` by checking
    for the special case of the subdevice having only 16 channels and
    duplicating the 16 bit output state into both halves of the 32-bit
    register. That should work both for what the board actually does and
    for what the user manual says it should do.
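
Editorial illustration (not part of the upstream commit): the 16-channel special case described for the PCI-7230 amounts to mirroring the 16-bit output state into both halves of the 32-bit register value. The sketch below is a Python model of that bit manipulation only; `register_value()` is a hypothetical name and no hardware access is shown.

```python
def register_value(state: int, n_chan: int) -> int:
    """Return the 32-bit value to write for a digital-output subdevice.

    For a 16-channel subdevice the 16-bit state is duplicated into both
    halves, so it is correct whether the board reads the low half (as the
    manual says) or the high half (as the old driver assumed).
    """
    val = state & 0xFFFFFFFF
    if n_chan == 16:
        val &= 0xFFFF          # keep only the 16 channel bits
        val |= val << 16       # mirror them into the upper half
    return val
```

A 32-channel subdevice passes its state through unchanged, so only the 16-channel case is affected.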

    Fixes: 8f567c373c4b ("staging: comedi: new adl_pci7x3x driver")
    Signed-off-by: Ian Abbott
    Cc: # 3.13+, needs backporting for 3.7 to 3.12
    Signed-off-by: Greg Kroah-Hartman
    Signed-off-by: Sasha Levin

commit b2ec1ef6b08b8c5b29b5ee88a7e2b84215bc457c
Author: Lars-Peter Clausen
Date:   Wed Aug 5 15:38:15 2015 +0200

    iio: adis16480: Fix scale factors

    [ Upstream commit 7abad1063deb0f77d275c61f58863ec319c58c5c ]

    The different devices support by the adis16480 driver have slightly
    different scales for the gyroscope and accelerometer channels.

    Signed-off-by: Lars-Peter Clausen
    Cc:
    Signed-off-by: Jonathan Cameron
    Signed-off-by: Sasha Levin

commit c2ab48856eb937e0320e314c79d4f0f43b3aacb9
Author: Lars-Peter Clausen
Date:   Wed Aug 5 15:38:14 2015 +0200

    iio: Add inverse unit conversion macros

    [ Upstream commit c689a923c867eac40ed3826c1d9328edea8b6bc7 ]

    Add inverse unit conversion macro to convert from standard IIO units to
    units that might be used by some devices.

    Those are useful in combination with scale factors that are specified as
    IIO_VAL_FRACTIONAL. Typically the denominator for those specifications
    will contain the maximum raw value the sensor will generate and the
    numerator the value it maps to in a specific unit. Sometimes datasheets
    specify those in different units than the standard IIO units (e.g.
    degree/s instead of rad/s) and so we need to do a unit conversion.

    From a mathematical point of view it does not make a difference whether
    we apply the unit conversion to the numerator or the inverse unit
    conversion to the denominator since (x / y) / z = x / (y * z). But as
    the denominator is typically a larger value and we are rounding both
    the numerator and denominator to integer values using the later method
    gives us a better precision (E.g. the relative error is smaller if we
    round 8000.3 to 8000 rather than rounding 8.3 to 8).

    This is where in inverse unit conversion macros will be used.

    Marked for stable as used by some upcoming fixes.
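
Editorial illustration (not part of the upstream commit): the precision argument above can be checked with exact rational arithmetic. This Python sketch only reproduces the 8.3 versus 8000.3 example and the identity (x / y) / z = x / (y * z); `relative_rounding_error()` is a hypothetical helper, and the sample values for x, y and z are arbitrary.

```python
from fractions import Fraction


def relative_rounding_error(value: Fraction) -> Fraction:
    """Relative error introduced by rounding `value` to the nearest integer."""
    return abs(round(value) - value) / value


small = relative_rounding_error(Fraction("8.3"))      # rounding 8.3 to 8
large = relative_rounding_error(Fraction("8000.3"))   # rounding 8000.3 to 8000

# Converting the numerator or inversely converting the denominator is
# mathematically the same operation.
x, y, z = Fraction(250), Fraction(32768), Fraction("57.3")
same = (x / y) / z == x / (y * z)
```

The absolute rounding error is 0.3 in both cases, but relative to the larger value it is roughly a thousand times smaller, which is why the conversion is applied on the denominator side.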

    Signed-off-by: Lars-Peter Clausen
    Cc:
    Signed-off-by: Jonathan Cameron
    Signed-off-by: Sasha Levin

commit fdb0c59ca71ae9133995bc6ba05fe4878237192c
Author: Cristina Opriceana
Date:   Mon Aug 3 13:37:40 2015 +0300

    iio: industrialio-buffer: Fix iio_buffer_poll return value

    [ Upstream commit 1bdc0293901cbea23c6dc29432e81919d4719844 ]

    Change return value to 0 if no device is bound since
    unsigned int cannot support negative error codes.

    Fixes: f18e7a068 ("iio: Return -ENODEV for file operations if the device has been unregistered")
    Signed-off-by: Cristina Opriceana
    Cc:
    Signed-off-by: Jonathan Cameron
    Signed-off-by: Sasha Levin

commit 6bf6419ded2c771046e50230371ae2b098d9a7ed
Author: Cristina Opriceana
Date:   Mon Aug 3 13:00:47 2015 +0300

    iio: event: Remove negative error code from iio_event_poll

    [ Upstream commit 41d903c00051d8f31c98a8136edbac67e6f8688f ]

    Negative return values are not supported by iio_event_poll since
    its return type is unsigned int.

    Fixes: f18e7a068a0a3 ("iio: Return -ENODEV for file operations if the device has been unregistered")
    Signed-off-by: Cristina Opriceana
    Cc:
    Signed-off-by: Jonathan Cameron
    Signed-off-by: Sasha Levin

commit a56cd44a153e6a0de75015632039221494ab29eb
Author: Markus Pargmann
Date:   Wed Jul 29 15:46:03 2015 +0200

    iio: bmg160: IIO_BUFFER and IIO_TRIGGERED_BUFFER are required

    [ Upstream commit 06d2f6ca5a38abe92f1f3a132b331eee773868c3 ]

    This patch adds selects for IIO_BUFFER and IIO_TRIGGERED_BUFFER. Without
    IIO_BUFFER, the driver does not compile.

    Signed-off-by: Markus Pargmann
    Reviewed-by: Srinivas Pandruvada
    Cc:
    Signed-off-by: Jonathan Cameron
    Signed-off-by: Sasha Levin

commit 77ffebfc23b0b8e23da1af91d8999ef9b31cde51
Author: Sebastian Ott
Date:   Thu Jun 25 09:32:22 2015 +0200

    s390/sclp: fix compile error

    [ Upstream commit a313bdc5310dd807655d3ca3eb2219cd65dfe45a ]

    Fix this error when compiling with CONFIG_SMP=n and
    CONFIG_DYNAMIC_DEBUG=y:

    drivers/s390/char/sclp_early.c: In function 'sclp_read_info_early':
    drivers/s390/char/sclp_early.c:87:19: error: 'EBUSY' undeclared (first use in this function)
      } while (rc == -EBUSY);
                      ^

    Signed-off-by: Sebastian Ott
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Sasha Levin

commit d19448237d9b6e178878a9abe23ef0ea54860a44
Author: Jonathon Jongsma
Date:   Thu Aug 20 14:04:32 2015 -0500

    drm/qxl: validate monitors config modes

    [ Upstream commit bd3e1c7c6de9f5f70d97cdb6c817151c0477c5e3 ]

    Due to some recent changes in
    drm_helper_probe_single_connector_modes_merge_bits(), old custom modes
    were not being pruned properly. In current kernels,
    drm_mode_validate_basic() is called to sanity-check each mode in the
    list. If the sanity-check passes, the mode's status gets set to
    MODE_OK. In older kernels this check was not done, so old custom modes
    would still have a status of MODE_UNVERIFIED at this point, and would
    therefore be pruned later in the function.

    As a result of this new behavior, the list of modes for a device always
    includes every custom mode ever configured for the device, with the
    largest one listed first. Since desktop environments usually choose the
    first preferred mode when a hotplug event is emitted, this had the
    result of making it very difficult for the user to reduce the size of
    the display.

    The qxl driver did implement the mode_valid connector function, but it
    was empty. In order to restore the old behavior where old custom modes
    are pruned, we implement a proper mode_valid function for the qxl
    driver.
    This function now checks each mode against the last configured
    custom mode and the list of standard modes. If the mode doesn't match
    any of these, its status is set to MODE_BAD so that it will be pruned as
    expected.

    Signed-off-by: Jonathon Jongsma
    Cc: stable@vger.kernel.org
    Signed-off-by: Dave Airlie
    Signed-off-by: Sasha Levin

commit f298cc4107e677f4622a99837044f587e06b1e77
Author: Stephen Chandler Paul
Date:   Fri Aug 21 14:16:12 2015 -0400

    drm/amdgpu: Don't link train DisplayPort on HPD until we get the dpcd

    [ Upstream commit a887adadb7b9ef9eb4ee48e4ad575aefcfd1db14 ]

    This is a port of:

    DRM - radeon: Don't link train DisplayPort on HPD until we get the dpcd

    to amdgpu.

    Signed-off-by: Alex Deucher
    Cc: stable@vger.kernel.org
    Signed-off-by: Sasha Levin

commit 7b30a56362585fd98a0fc6fce5573ec45b4e5840
Author: Joonsoo Kim
Date:   Thu Oct 1 15:36:54 2015 -0700

    mm/slab: fix unexpected index mapping result of kmalloc_size(INDEX_NODE+1)

    [ Upstream commit 03a2d2a3eafe4015412cf4e9675ca0e2d9204074 ]

    Commit description is copied from the original post of this bug:

      http://comments.gmane.org/gmane.linux.kernel.mm/135349

    Kernels after v3.9 use kmalloc_size(INDEX_NODE + 1) to get the next
    larger cache size than the size index INDEX_NODE mapping. In kernels
    3.9 and earlier we used malloc_sizes[INDEX_L3 + 1].cs_size.

    However, sometimes we can't get the right output we expected via
    kmalloc_size(INDEX_NODE + 1), causing a BUG().
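
Editorial illustration (not part of the upstream commit): a Python model of the index-to-size mapping that this commit message gives for the latest kernel, together with the INDEX_NODE value of 2 that it reports for the affected mips64 configuration. The function mirrors only that table; it is a sketch, not the kernel's kmalloc_size().

```python
def kmalloc_size(index: int) -> int:
    """Index-to-size mapping as tabulated in the commit message.

    Indexes 1 and 2 are the odd-sized 96 and 192 byte caches; every
    higher index n maps to 2**n bytes.
    """
    if index > 2:
        return 1 << index
    if index == 1:
        return 96
    if index == 2:
        return 192
    return 0


# Value reported in the commit message when struct kmem_cache_node is
# 150 bytes and therefore falls into the 192 byte cache.
INDEX_NODE = 2

current = kmalloc_size(INDEX_NODE)          # 192
next_index = kmalloc_size(INDEX_NODE + 1)   # 8, not the next larger size
```

Because the index order does not follow the size order, adding one to the index moves from the 192 byte cache to the 8 byte cache, so a comparison against kmalloc_size(INDEX_NODE + 1) is true for almost every size.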

    The mapping table in the latest kernel is like:

        index = {0,   1,   2,   3,   4,   5,   6,   n}
         size = {0,  96, 192,   8,  16,  32,  64, 2^n}

    The mapping table before 3.10 is like this:

        index = {0,   1,   2,   3,   4,   5,   6,   n}
         size = {32, 64,  96, 128, 192, 256, 512, 2^(n+3)}

    The problem on my mips64 machine is as follows:

    (1) When configured DEBUG_SLAB && DEBUG_PAGEALLOC && DEBUG_LOCK_ALLOC
        && DEBUG_SPINLOCK, the sizeof(struct kmem_cache_node) will be "150",
        and the macro INDEX_NODE turns out to be "2": #define INDEX_NODE
        kmalloc_index(sizeof(struct kmem_cache_node))

    (2) Then the result of kmalloc_size(INDEX_NODE + 1) is 8.

    (3) Then "if(size >= kmalloc_size(INDEX_NODE + 1)" will lead to "size
        = PAGE_SIZE".

    (4) Then "if ((size >= (PAGE_SIZE >> 3))" test will be satisfied and
        "flags |= CFLGS_OFF_SLAB" will be covered.

    (5) if (flags & CFLGS_OFF_SLAB)" test will be satisfied and will go to
        "cachep->slabp_cache = kmalloc_slab(slab_size, 0u)", and the result
        here may be NULL while kernel bootup.

    (6) Finally,"BUG_ON(ZERO_OR_NULL_PTR(cachep->slabp_cache));" causes the
        BUG info as the following shows (may be only mips64 has this
        problem):

    This patch fixes the problem of kmalloc_size(INDEX_NODE + 1) and removes
    the BUG by adding 'size >= 256' check to guarantee that all necessary
    small sized slabs are initialized regardless sequence of slab size in
    mapping table.

    Fixes: e33660165c90 ("slab: Use common kmalloc_index/kmalloc_size...")
    Signed-off-by: Joonsoo Kim
    Reported-by: Liuhailong
    Acked-by: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin

commit e4f238fc4f13f1107f342a0bf12415b42de0a165
Author: Prarit Bhargava
Date:   Mon Jun 15 13:43:29 2015 -0400

    intel_pstate: Fix overflow in busy_scaled due to long delay

    [ Upstream commit 7180dddf7c32c49975c7e7babf2b60ed450cb760 ]

    The kernel may delay interrupts for a long time which can result in timers
    being delayed.
If this occurs the intel_pstate driver will crash with a divide by zero error: divide error: 0000 [#1] SMP Modules linked in: btrfs zlib_deflate raid6_pq xor msdos ext4 mbcache jbd2 binfmt_misc arc4 md4 nls_utf8 cifs dns_resolver tcp_lp bnep bluetooth rfkill fuse dm_service_time iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ftp ip6t_rpfilter ip6t_REJECT ipt_REJECT xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter ip_tables intel_powerclamp coretemp vfat fat kvm_intel iTCO_wdt iTCO_vendor_support ipmi_devintf sr_mod kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel cdc_ether lrw usbnet cdrom mii gf128mul glue_helper ablk_helper cryptd lpc_ich mfd_core pcspkr sb_edac edac_core ipmi_si ipmi_msghandler ioatdma wmi shpchp acpi_pad nfsd auth_rpcgss nfs_acl lockd uinput dm_multipath sunrpc xfs libcrc32c usb_storage sd_mod crc_t10dif crct10dif_common ixgbe mgag200 syscopyarea sysfillrect sysimgblt mdio drm_kms_helper ttm igb drm ptp pps_core dca i2c_algo_bit megaraid_sas i2c_core dm_mirror dm_region_hash dm_log dm_mod CPU: 113 PID: 0 Comm: swapper/113 Tainted: G W -------------- 3.10.0-229.1.2.el7.x86_64 #1 Hardware name: IBM x3950 X6 -[3837AC2]-/00FN827, BIOS -[A8E112BUS-1.00]- 08/27/2014 task: ffff880fe8abe660 ti: ffff880fe8ae4000 task.ti: ffff880fe8ae4000 RIP: 0010:[] [] intel_pstate_timer_func+0x179/0x3d0 RSP: 0018:ffff883fff4e3db8 EFLAGS: 00010206 RAX: 0000000027100000 RBX: ffff883fe6965100 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000010 RDI: 000000002e53632d RBP: ffff883fff4e3e20 R08: 000e6f69a5a125c0 R09: ffff883fe84ec001 R10: 0000000000000002 R11: 
0000000000000005 R12: 00000000000049f5
R13: 0000000000271000 R14: 00000000000049f5 R15: 0000000000000246
FS: 0000000000000000(0000) GS:ffff883fff4e0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7668601000 CR3: 000000000190a000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
ffff883fff4e3e58 ffffffff81099dc1 0000000000000086 0000000000000071
ffff883fff4f3680 0000000000000071 fbdc8a965e33afee ffffffff810b69dd
ffff883fe84ec000 ffff883fe6965108 0000000000000100 ffffffff814a9100
Call Trace:
[] ? run_posix_cpu_timers+0x51/0x840
[] ? trigger_load_balance+0x5d/0x200
[] ? pid_param_set+0x130/0x130
[] call_timer_fn+0x36/0x110
[] ? pid_param_set+0x130/0x130
[] run_timer_softirq+0x21f/0x320
[] __do_softirq+0xef/0x280
[] call_softirq+0x1c/0x30
[] do_softirq+0x65/0xa0
[] irq_exit+0x115/0x120
[] smp_apic_timer_interrupt+0x45/0x60
[] apic_timer_interrupt+0x6d/0x80
[] ? cpuidle_enter_state+0x52/0xc0
[] ?
cpuidle_enter_state+0x48/0xc0
[] cpuidle_idle_call+0xc5/0x200
[] arch_cpu_idle+0xe/0x30
[] cpu_startup_entry+0xf1/0x290
[] start_secondary+0x1ba/0x230
Code: 42 0f 00 45 89 e6 48 01 c2 43 8d 44 6d 00 39 d0 73 26 49 c1 e5 08 89 d2 4d 63 f4 49 63 c5 48 c1 e2 08 48 c1 e0 08 48 63 ca 48 99 <48> f7 f9 48 98 4c 0f af f0 49 c1 ee 08 8b 43 78 c1 e0 08 44 29
RIP [] intel_pstate_timer_func+0x179/0x3d0
RSP

The kernel values for cpudata for CPU 113 were:

struct cpudata {
  cpu = 113,
  timer = {
    entry = {
      next = 0x0,
      prev = 0xdead000000200200
    },
    expires = 8357799745,
    base = 0xffff883fe84ec001,
    function = 0xffffffff814a9100,
    data = 18446612406765768960,
    i_gain = 0,
    d_gain = 0,
    deadband = 0,
    last_err = 22489
  },
  last_sample_time = {
    tv64 = 4063132438017305
  },
  prev_aperf = 287326796397463,
  prev_mperf = 251427432090198,
  sample = {
    core_pct_busy = 23081,
    aperf = 2937407,
    mperf = 3257884,
    freq = 2524484,
    time = {
      tv64 = 4063149215234118
    }
  }
}

which results in the time between samples = sample.time - last_sample_time = 4063149215234118 - 4063132438017305 = 16777216813 ns, which is 16.777 seconds.

The duration between reads of the APERF and MPERF registers overflowed an s32-sized integer in intel_pstate_get_scaled_busy()'s call to div_fp(). The result is that int_tofp(duration_us) == 0, and the kernel attempts to divide by 0.

While the kernel shouldn't be delaying for a long time, it can and does happen and the intel_pstate driver should not panic in this situation. This patch changes the div_fp() function to use div64_s64() to allow for "long" division. This will avoid the overflow condition on long delays.

[v2]: use div64_s64() in div_fp()

Signed-off-by: Prarit Bhargava
Signed-off-by: Rafael J.
Wysocki
Signed-off-by: Sasha Levin

commit af32cc7bde6304dac92e6a74fe4b2cc8120cb29a
Author: Kosuke Tatsukawa
Date: Fri Oct 2 08:27:05 2015 +0000

tty: fix stall caused by missing memory barrier in drivers/tty/n_tty.c

[ Upstream commit e81107d4c6bd098878af9796b24edc8d4a9524fd ]

My colleague ran into a program stall on a x86_64 server, where n_tty_read() was waiting for data even if there was data in the buffer in the pty. The kernel stack for the stuck process looks like below.

#0 [ffff88303d107b58] __schedule at ffffffff815c4b20
#1 [ffff88303d107bd0] schedule at ffffffff815c513e
#2 [ffff88303d107bf0] schedule_timeout at ffffffff815c7818
#3 [ffff88303d107ca0] wait_woken at ffffffff81096bd2
#4 [ffff88303d107ce0] n_tty_read at ffffffff8136fa23
#5 [ffff88303d107dd0] tty_read at ffffffff81368013
#6 [ffff88303d107e20] __vfs_read at ffffffff811a3704
#7 [ffff88303d107ec0] vfs_read at ffffffff811a3a57
#8 [ffff88303d107f00] sys_read at ffffffff811a4306
#9 [ffff88303d107f50] entry_SYSCALL_64_fastpath at ffffffff815c86d7

There seem to be two problems causing this issue.

First, in drivers/tty/n_tty.c, __receive_buf() stores the data and updates ldata->commit_head using smp_store_release() and then checks the wait queue using waitqueue_active(). However, since there is no memory barrier, __receive_buf() could return without calling wake_up_interruptible_poll(), and at the same time, n_tty_read() could start to wait in wait_woken() as in the following chart.

        __receive_buf()                         n_tty_read()
------------------------------------------------------------------------
if (waitqueue_active(&tty->read_wait))
/* Memory operations issued after the
   RELEASE may be completed before the
   RELEASE operation has completed */
                                        add_wait_queue(&tty->read_wait, &wait);
                                        ...
                                        if (!input_available_p(tty, 0)) {
smp_store_release(&ldata->commit_head,
                  ldata->read_head);
                                        ...
                                        timeout = wait_woken(&wait,
                                          TASK_INTERRUPTIBLE, timeout);
------------------------------------------------------------------------

The second problem is that n_tty_read() also lacks a memory barrier call and could also cause __receive_buf() to return without calling wake_up_interruptible_poll(), and n_tty_read() to wait in wait_woken() as in the chart below.

        __receive_buf()                         n_tty_read()
------------------------------------------------------------------------
                                        spin_lock_irqsave(&q->lock, flags);
                                        /* from add_wait_queue() */
                                        ...
                                        if (!input_available_p(tty, 0)) {
                                        /* Memory operations issued after the
                                           RELEASE may be completed before the
                                           RELEASE operation has completed */
smp_store_release(&ldata->commit_head,
                  ldata->read_head);
if (waitqueue_active(&tty->read_wait))
                                        __add_wait_queue(q, wait);
                                        spin_unlock_irqrestore(&q->lock,flags);
                                        /* from add_wait_queue() */
                                        ...
                                        timeout = wait_woken(&wait,
                                          TASK_INTERRUPTIBLE, timeout);
------------------------------------------------------------------------

There are also other places in drivers/tty/n_tty.c which have similar calls to waitqueue_active(), so instead of adding many memory barrier calls, this patch simply removes the call to waitqueue_active(), leaving just wake_up*() behind.

This fixes both problems because, even though the memory access before or after the spinlocks in both wake_up*() and add_wait_queue() can sneak into the critical section, it cannot go past it and the critical section assures that they will be serialized (please see "INTER-CPU ACQUIRING BARRIER EFFECTS" in Documentation/memory-barriers.txt for a better explanation). Moreover, the resulting code is much simpler.

Latency measurement using a ping-pong test over a pty doesn't show any visible performance drop.
Signed-off-by: Kosuke Tatsukawa
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman
Signed-off-by: Sasha Levin

commit 8a28d52c11c9a4a52af00bb67a928d1ab5c85f16
Author: covici@ccs.covici.com
Date: Wed May 20 05:44:11 2015 -0400

staging: speakup: fix speakup-r regression

[ Upstream commit b1d562acc78f0af46de0dfe447410bc40bdb7ece ]

Here is a patch to make speakup-r work again.

It broke in 3.6 due to commit 4369c64c79a22b98d3b7eff9d089196cd878a10a ("Input: Send events one packet at a time").

The problem was that the fakekey.c routine to fake a down arrow no longer functioned properly, and adding the input_sync call fixed it.

Fixes: 4369c64c79a22b98d3b7eff9d089196cd878a10a
Cc: stable
Acked-by: Samuel Thibault
Signed-off-by: John Covici
Signed-off-by: Greg Kroah-Hartman
Signed-off-by: Sasha Levin

commit d2adffd701eefc29c06cd10494e4bdb2bc70fe3b
Author: Joe Thornber
Date: Fri Oct 9 14:03:38 2015 +0100

dm cache: fix NULL pointer when switching from cleaner policy

[ Upstream commit 2bffa1503c5c06192eb1459180fac4416575a966 ]

The cleaner policy doesn't make use of the per cache block hint space in the metadata (unlike the other policies). When switching from the cleaner policy to mq or smq a NULL pointer crash (in dm_tm_new_block) was observed. The crash was caused by bugs in dm-cache-metadata.c when trying to skip creation of the hint btree.

The minimal fix is to change hint size for the cleaner policy to 4 bytes (only hint size supported).

Signed-off-by: Joe Thornber
Signed-off-by: Mike Snitzer
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin

commit 5b0d0df8f71b1f8565ba0a66f9c7e431cc3b4cbc
Author: Ben Dooks
Date: Tue Sep 29 15:01:08 2015 +0100

clk: ti: fix dual-registration of uart4_ick

[ Upstream commit 19e79687de22f23bcfb5e79cce3daba20af228d1 ]

On the OMAP AM3517 platform the uart4_ick gets registered twice, causing any power management to /dev/ttyO3 to fail when trying to wake the device up.
This solves the following oops:

[] Unhandled fault: external abort on non-linefetch (0x1028) at 0xfa09e008
[] PC is at serial_omap_pm+0x48/0x15c
[] LR is at _raw_spin_unlock_irqrestore+0x30/0x5c

Fixes: aafd900cab87 ("CLK: TI: add omap3 clock init file")
Cc: stable@vger.kernel.org
Cc: mturquette@baylibre.com
Cc: sboyd@codeaurora.org
Cc: linux-clk@vger.kernel.org
Cc: linux-omap@vger.kernel.org
Cc: linux-kernel@lists.codethink.co.uk
Signed-off-by: Ben Dooks
Signed-off-by: Tero Kristo
Signed-off-by: Sasha Levin

commit 9ad3947d25b9c26e165fd1b13334835d685bea97
Author: Kinglong Mee
Date: Mon Sep 14 20:12:21 2015 +0800

nfs/filelayout: Fix NULL reference caused by double freeing of fh_array

[ Upstream commit 3ec0c97959abff33a42db9081c22132bcff5b4f2 ]

If filelayout_decode_layout fails, _filelayout_free_lseg will cause a double free of fh_array.

[ 1179.279800] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 1179.280198] IP: [] filelayout_free_fh_array.isra.11+0x1d/0x70 [nfs_layout_nfsv41_files]
[ 1179.281010] PGD 0
[ 1179.281443] Oops: 0000 [#1]
[ 1179.281831] Modules linked in: nfs_layout_nfsv41_files(OE) nfsv4(OE) nfs(OE) fscache(E) xfs libcrc32c coretemp nfsd crct10dif_pclmul ppdev crc32_pclmul crc32c_intel auth_rpcgss ghash_clmulni_intel nfs_acl lockd vmw_balloon grace sunrpc parport_pc vmw_vmci parport shpchp i2c_piix4 vmwgfx drm_kms_helper ttm drm serio_raw mptspi scsi_transport_spi mptscsih e1000 mptbase ata_generic pata_acpi [last unloaded: fscache]
[ 1179.283891] CPU: 0 PID: 13336 Comm: cat Tainted: G OE 4.3.0-rc1-pnfs+ #244
[ 1179.284323] Hardware name: VMware, Inc.
VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/20/2014
[ 1179.285206] task: ffff8800501d48c0 ti: ffff88003e3c4000 task.ti: ffff88003e3c4000
[ 1179.285668] RIP: 0010:[] [] filelayout_free_fh_array.isra.11+0x1d/0x70 [nfs_layout_nfsv41_files]
[ 1179.286612] RSP: 0018:ffff88003e3c77f8 EFLAGS: 00010202
[ 1179.287092] RAX: 0000000000000000 RBX: ffff88001fe78900 RCX: 0000000000000000
[ 1179.287731] RDX: ffffea0000f40760 RSI: ffff88001fe789c8 RDI: ffff88001fe789c0
[ 1179.288383] RBP: ffff88003e3c7810 R08: ffffea0000f40760 R09: 0000000000000000
[ 1179.289170] R10: 0000000000000000 R11: 0000000000000001 R12: ffff88001fe789c8
[ 1179.289959] R13: ffff88001fe789c0 R14: ffff88004ec05a80 R15: ffff88004f935b88
[ 1179.290791] FS: 00007f4e66bb5700(0000) GS:ffffffff81c29000(0000) knlGS:0000000000000000
[ 1179.291580] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1179.292209] CR2: 0000000000000000 CR3: 00000000203f8000 CR4: 00000000001406f0
[ 1179.292731] Stack:
[ 1179.293195] ffff88001fe78900 00000000000000d0 ffff88001fe78178 ffff88003e3c7868
[ 1179.293676] ffffffffa0272737 0000000000000001 0000000000000001 ffff88001fe78800
[ 1179.294151] 00000000614fffce ffffffff81727671 ffff88001fe78100 ffff88001fe78100
[ 1179.294623] Call Trace:
[ 1179.295092] [] filelayout_alloc_lseg+0xa7/0x2d0 [nfs_layout_nfsv41_files]
[ 1179.295625] [] ? out_of_line_wait_on_bit+0x81/0xb0
[ 1179.296133] [] pnfs_layout_process+0xae/0x320 [nfsv4]
[ 1179.296632] [] nfs4_proc_layoutget+0x2b1/0x360 [nfsv4]
[ 1179.297134] [] pnfs_update_layout+0x853/0xb30 [nfsv4]
[ 1179.297632] [] ? nfs_get_lock_context+0x74/0x170 [nfs]
[ 1179.298158] [] filelayout_pg_init_read+0x37/0x50 [nfs_layout_nfsv41_files]
[ 1179.298834] [] __nfs_pageio_add_request+0x119/0x460 [nfs]
[ 1179.299385] [] ?
nfs_create_request.part.9+0x37/0x2e0 [nfs]
[ 1179.299872] [] nfs_pageio_add_request+0xa3/0x1b0 [nfs]
[ 1179.300362] [] readpage_async_filler+0x85/0x260 [nfs]
[ 1179.300907] [] read_cache_pages+0x91/0xd0
[ 1179.301391] [] ? nfs_read_completion+0x220/0x220 [nfs]
[ 1179.301867] [] nfs_readpages+0x128/0x200 [nfs]
[ 1179.302330] [] __do_page_cache_readahead+0x203/0x280
[ 1179.302784] [] ? __do_page_cache_readahead+0xd8/0x280
[ 1179.303413] [] ondemand_readahead+0x1a6/0x2f0
[ 1179.303855] [] page_cache_sync_readahead+0x31/0x50
[ 1179.304286] [] generic_file_read_iter+0x4a6/0x5c0
[ 1179.304711] [] ? __nfs_revalidate_mapping+0x1f6/0x240 [nfs]
[ 1179.305132] [] nfs_file_read+0x52/0xa0 [nfs]
[ 1179.305540] [] __vfs_read+0xcc/0x100
[ 1179.305936] [] vfs_read+0x85/0x130
[ 1179.306326] [] SyS_read+0x58/0xd0
[ 1179.306708] [] entry_SYSCALL_64_fastpath+0x12/0x76
[ 1179.307094] Code: c4 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 53 8b 07 49 89 f4 85 c0 74 47 48 8b 06 49 89 fd <48> 8b 38 48 85 ff 74 22 31 db eb 0c 48 63 d3 48 8b 3c d0 48 85
[ 1179.308357] RIP [] filelayout_free_fh_array.isra.11+0x1d/0x70 [nfs_layout_nfsv41_files]
[ 1179.309177] RSP
[ 1179.309582] CR2: 0000000000000000

Signed-off-by: Kinglong Mee
Signed-off-by: Trond Myklebust
Signed-off-by: Sasha Levin

commit d706c400314ac9486012fd52f1efc48e6cb7678a
Author: Al Viro
Date: Sun Jul 12 10:39:45 2015 -0400

fix a braino in ovl_d_select_inode()

[ Upstream commit 9391dd00d13c853ab4f2a85435288ae2202e0e43 ]

when opening a directory we want the overlayfs inode, not one from the topmost layer.

Reported-By: Andrey Jr. Melnikov
Tested-By: Andrey Jr.
Melnikov
Signed-off-by: Al Viro
Signed-off-by: Sasha Levin

commit e93f29ffbb99d45f718c18832007d0c77091ed54
Author: David Howells
Date: Thu Jun 18 14:32:31 2015 +0100

overlayfs: Make f_path always point to the overlay and f_inode to the underlay

[ Upstream commit 4bacc9c9234c7c8eec44f5ed4e960d9f96fa0f01 ]

Make file->f_path always point to the overlay dentry so that the path in /proc/pid/fd is correct and to ensure that label-based LSMs have access to the overlay as well as the underlay (path-based LSMs probably don't need it).

Using my union testsuite to set things up, before the patch I see:

[root@andromeda union-testsuite]# bash 5 /a/foo107
[root@andromeda union-testsuite]# stat /mnt/a/foo107
...
Device: 23h/35d Inode: 13381 Links: 1
...
[root@andromeda union-testsuite]# stat -L /proc/$$/fd/5
...
Device: 23h/35d Inode: 13381 Links: 1
...

After the patch:

[root@andromeda union-testsuite]# bash 5 /mnt/a/foo107
[root@andromeda union-testsuite]# stat /mnt/a/foo107
...
Device: 23h/35d Inode: 40346 Links: 1
...
[root@andromeda union-testsuite]# stat -L /proc/$$/fd/5
...
Device: 23h/35d Inode: 40346 Links: 1
...

Note the change in where /proc/$$/fd/5 points to in the ls command. It was pointing to /a/foo107 (which doesn't exist) and now points to /mnt/a/foo107 (which is correct).

The inode accessed, however, is the lower layer. The union layer is on device 25h/37d and the upper layer on 24h/36d.

Signed-off-by: David Howells
Signed-off-by: Al Viro
Signed-off-by: Sasha Levin

commit 9ebef9b7b9ea46b636c441989898aeef42bfe400
Author: David Howells
Date: Thu Jan 29 12:02:27 2015 +0000

VFS: Introduce inode-getting helpers for layered/unioned fs environments

[ Upstream commit 155e35d4daa804582f75acaa2c74ec797a89c615 ]

Introduce some functions for getting the inode (and also the dentry) in an environment where layered/unioned filesystems are in operation.
The problem is that we have places where we need *both* the union dentry and the lower source or workspace inode or dentry available, but we can only have a handle on one of them. Therefore we need to derive the handle to the other from that.

The idea is to introduce an extra field in struct dentry that allows the union dentry to refer to and pin the lower dentry.

Signed-off-by: David Howells
Signed-off-by: Al Viro
Signed-off-by: Sasha Levin

commit 5d4beb9a5d8a8f91271436e8ee54bf1bdc1c936d
Author: David Howells
Date: Thu Jun 18 14:32:23 2015 +0100

overlay: Call ovl_drop_write() earlier in ovl_dentry_open()

[ Upstream commit f25801ee4680ef1db21e15c112e6e5fe3ffe8da5 ]

Call ovl_drop_write() earlier in ovl_dentry_open() before we call vfs_open() as we've done the copy up for which we needed the freeze-write lock by that point.

Signed-off-by: David Howells
Signed-off-by: Al Viro
Signed-off-by: Sasha Levin

commit 4f3ab8579a28bac0fc5bd15cc82a852fd0a261ab
Author: Ben Hutchings
Date: Sat Sep 26 12:23:56 2015 +0100

genirq: Fix race in register_irq_proc()

[ Upstream commit 95c2b17534654829db428f11bcf4297c059a2a7e ]

Per-IRQ directories in procfs are created only when a handler is first added to the irqdesc, not when the irqdesc is created. In the case of a shared IRQ, multiple tasks can race to create a directory. This race condition seems to have been present forever, but is easier to hit with async probing.

Signed-off-by: Ben Hutchings
Link: http://lkml.kernel.org/r/1443266636.2004.2.camel@decadent.org.uk
Signed-off-by: Thomas Gleixner
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin

commit 730e73dd7df6f51eaa1ee8a14bc57c21b55b6642
Author: Stefan Assmann
Date: Fri Jul 10 15:01:12 2015 +0200

igb: do not re-init SR-IOV during probe

[ Upstream commit 6423fc34160939142d72ffeaa2db6408317f54df ]

During driver probing the following code path is triggered.
igb_probe
 ->igb_sw_init
  ->igb_probe_vfs
   ->igb_pci_enable_sriov
    ->igb_sriov_reinit

Doing the SR-IOV re-init is not necessary during probing since we're starting from scratch. Here we can call igb_enable_sriov() right away.

Running igb_sriov_reinit() during igb_probe() also seems to cause occasional packet loss on some onboard 82576 NICs. Reproduced on Dell and HP servers with onboard 82576 NICs.

Example:
Intel Corporation 82576 Gigabit Network Connection [8086:10c9] (rev 01)
Subsystem: Dell Device [1028:0481]

Signed-off-by: Stefan Assmann
Tested-by: Aaron Brown
Signed-off-by: Jeff Kirsher
Signed-off-by: Sasha Levin

commit e05188972cdc0d19dda43b0b35f472e681a4f654
Author: Chas Williams <3chas3@gmail.com>
Date: Thu Aug 27 12:28:46 2015 -0400

net/xen-netfront: only napi_synchronize() if running

[ Upstream commit 274b045509175db0405c784be85e8cce116e6f7d ]

If an interface isn't running napi_synchronize() will hang forever.

[ 392.248403] rmmod R running task 0 359 343 0x00000000
[ 392.257671] ffff88003760fc88 ffff880037193b40 ffff880037193160 ffff88003760fc88
[ 392.267644] ffff880037610000 ffff88003760fcd8 0000000100014c22 ffffffff81f75c40
[ 392.277524] 0000000000bc7010 ffff88003760fca8 ffffffff81796927 ffffffff81f75c40
[ 392.287323] Call Trace:
[ 392.291599] [] schedule+0x37/0x90
[ 392.298553] [] schedule_timeout+0x14b/0x280
[ 392.306421] [] ? irq_free_descs+0x69/0x80
[ 392.314006] [] ?
internal_add_timer+0xb0/0xb0
[ 392.322125] [] msleep+0x37/0x50
[ 392.329037] [] xennet_disconnect_backend.isra.24+0xda/0x390 [xen_netfront]
[ 392.339658] [] xennet_remove+0x2c/0x80 [xen_netfront]
[ 392.348516] [] xenbus_dev_remove+0x59/0xc0
[ 392.356257] [] __device_release_driver+0x87/0x120
[ 392.364645] [] driver_detach+0xb8/0xc0
[ 392.371989] [] bus_remove_driver+0x59/0xe0
[ 392.379883] [] driver_unregister+0x30/0x70
[ 392.387495] [] xenbus_unregister_driver+0x12/0x20
[ 392.395908] [] netif_exit+0x10/0x775 [xen_netfront]
[ 392.404877] [] SyS_delete_module+0x1d8/0x230
[ 392.412804] [] system_call_fastpath+0x12/0x71

Signed-off-by: Chas Williams <3chas3@gmail.com>
Signed-off-by: David S. Miller
Signed-off-by: Sasha Levin

commit 38ffc1c95211455b167c4a9b12c6bb16735e1472
Author: Andreas Schwab
Date: Wed Sep 23 23:12:09 2015 +0200

m68k: Define asmlinkage_protect

[ Upstream commit 8474ba74193d302e8340dddd1e16c85cc4b98caf ]

Make sure the compiler does not modify arguments of syscall functions. This can happen if the compiler generates a tailcall to another function. For example, without asmlinkage_protect sys_openat is compiled into this function:

sys_openat:
	clr.l %d0
	move.w 18(%sp),%d0
	move.l %d0,16(%sp)
	jbra do_sys_open

Note how the fourth argument is modified in place, modifying the register %d4 that gets restored from this stack slot when the function returns to user-space. The caller may expect the register to be unmodified across system calls.
Signed-off-by: Andreas Schwab
Signed-off-by: Geert Uytterhoeven
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin

commit 166e53c9a61082fe8d9b4a5c6d1dd1d9545676af
Author: Mark Salyzyn
Date: Mon Sep 21 21:39:50 2015 +0100

arm64: readahead: fault retry breaks mmap file read random detection

[ Upstream commit 569ba74a7ba69f46ce2950bf085b37fea2408385 ]

This is the arm64 portion of commit 45cac65b0fcd ("readahead: fault retry breaks mmap file read random detection"), which was absent from the initial port and has since gone unnoticed. The original commit says:

> .fault now can retry. The retry can break state machine of .fault. In
> filemap_fault, if page is miss, ra->mmap_miss is increased. In the second
> try, since the page is in page cache now, ra->mmap_miss is decreased. And
> these are done in one fault, so we can't detect random mmap file access.
>
> Add a new flag to indicate .fault is tried once. In the second try, skip
> ra->mmap_miss decreasing. The filemap_fault state machine is ok with it.

With this change, Mark reports that:

> Random read improves by 250%, sequential read improves by 40%, and
> random write by 400% to an eMMC device with dm crypto wrapped around it.
Cc: Shaohua Li
Cc: Rik van Riel
Cc: Wu Fengguang
Cc:
Signed-off-by: Mark Salyzyn
Signed-off-by: Riley Andrews
Signed-off-by: Will Deacon
Signed-off-by: Sasha Levin

commit 855805e78f7d3f04d29823e0349756e2d587f252
Author: Li Bin
Date: Wed Sep 30 10:49:55 2015 +0800

arm64: ftrace: fix function_graph tracer panic

[ Upstream commit ee556d00cf20012e889344a0adbbf809ab5015a3 ]

When function graph tracer is enabled, the following operation will trigger panic:

mount -t debugfs nodev /sys/kernel
echo next_tgid > /sys/kernel/tracing/set_ftrace_filter
echo function_graph > /sys/kernel/tracing/current_tracer
ls /proc/

------------[ cut here ]------------
[ 198.501417] Unable to handle kernel paging request at virtual address cb88537fdc8ba316
[ 198.506126] pgd = ffffffc008f79000
[ 198.509363] [cb88537fdc8ba316] *pgd=00000000488c6003, *pud=00000000488c6003, *pmd=0000000000000000
[ 198.517726] Internal error: Oops: 94000005 [#1] SMP
[ 198.518798] Modules linked in:
[ 198.520582] CPU: 1 PID: 1388 Comm: ls Tainted: G
[ 198.521800] Hardware name: linux,dummy-virt (DT)
[ 198.522852] task: ffffffc0fa9e8000 ti: ffffffc0f9ab0000 task.ti: ffffffc0f9ab0000
[ 198.524306] PC is at next_tgid+0x30/0x100
[ 198.525205] LR is at return_to_handler+0x0/0x20
[ 198.526090] pc : [] lr : [] pstate: 60000145
[ 198.527392] sp : ffffffc0f9ab3d40
[ 198.528084] x29: ffffffc0f9ab3d40 x28: ffffffc0f9ab0000
[ 198.529406] x27: ffffffc000d6a000 x26: ffffffc000b786e8
[ 198.530659] x25: ffffffc0002a1900 x24: ffffffc0faf16c00
[ 198.531942] x23: ffffffc0f9ab3ea0 x22: 0000000000000002
[ 198.533202] x21: ffffffc000d85050 x20: 0000000000000002
[ 198.534446] x19: 0000000000000002 x18: 0000000000000000
[ 198.535719] x17: 000000000049fa08 x16: ffffffc000242efc
[ 198.537030] x15: 0000007fa472b54c x14: ffffffffff000000
[ 198.538347] x13: ffffffc0fada84a0 x12: 0000000000000001
[ 198.539634] x11: ffffffc0f9ab3d70 x10: ffffffc0f9ab3d70
[ 198.540915] x9 : ffffffc0000907c0 x8 : ffffffc0f9ab3d40
[ 198.542215] x7 : 0000002e330f08f0
x6 : 0000000000000015
[ 198.543508] x5 : 0000000000000f08 x4 : ffffffc0f9835ec0
[ 198.544792] x3 : cb88537fdc8ba316 x2 : cb88537fdc8ba306
[ 198.546108] x1 : 0000000000000002 x0 : ffffffc000d85050
[ 198.547432]
[ 198.547920] Process ls (pid: 1388, stack limit = 0xffffffc0f9ab0020)
[ 198.549170] Stack: (0xffffffc0f9ab3d40 to 0xffffffc0f9ab4000)
[ 198.582568] Call trace:
[ 198.583313] [] next_tgid+0x30/0x100
[ 198.584359] [] ftrace_graph_caller+0x6c/0x70
[ 198.585503] [] ftrace_graph_caller+0x6c/0x70
[ 198.586574] [] ftrace_graph_caller+0x6c/0x70
[ 198.587660] [] ftrace_graph_caller+0x6c/0x70
[ 198.588896] Code: aa0003f5 2a0103f4 b4000102 91004043 (885f7c60)
[ 198.591092] ---[ end trace 6a346f8f20949ac8 ]---

This is because when using function graph tracer, if the traced function return value is in multi regs ([x0-x7]), return_to_handler may corrupt them. So in return_to_handler, the parameter regs should be protected properly.

Cc: # 3.18+
Signed-off-by: Li Bin
Acked-by: AKASHI Takahiro
Signed-off-by: Catalin Marinas
Signed-off-by: Sasha Levin

commit 27f5c615afb5303eb902a1f2535903e0fd1d7517
Author: Eric W. Biederman
Date: Sat Aug 15 13:36:12 2015 -0500

dcache: Handle escaped paths in prepend_path

[ Upstream commit cde93be45a8a90d8c264c776fab63487b5038a65 ]

A rename can result in a dentry that by walking up d_parent will never reach its mnt_root. For lack of a better term I call this an escaped path.

prepend_path is called by four different functions __d_path, d_absolute_path, d_path, and getcwd.

__d_path only wants to see paths that are connected to the root it passes in. So __d_path needs prepend_path to return an error.

d_absolute_path similarly wants to see paths that are connected to some root. Escaped paths are not connected to any mnt_root so d_absolute_path needs prepend_path to return an error greater than 1. So escaped paths will be treated like paths on lazily unmounted mounts.
getcwd needs to prepend "(unreachable)" so getcwd also needs prepend_path to return an error.

d_path is the interesting hold out. d_path just wants to print something, and does not care about the weird cases. Which raises the question what should be printed?

Given that / should result in -ENOENT I believe it is desirable for escaped paths to be printed as empty paths. As there are not really any meaningful path components when considered from the perspective of a mount tree.

So tweak prepend_path to return an empty path with a new error code of 3 when it encounters an escaped path.

Signed-off-by: "Eric W. Biederman"
Signed-off-by: Al Viro
Signed-off-by: Sasha Levin

commit 2ebadea3eeaeb52853e365c407b9b4485ceb0927
Author: shengyong
Date: Mon Sep 28 17:57:19 2015 +0000

UBI: return ENOSPC if no enough space available

[ Upstream commit 7c7feb2ebfc9c0552c51f0c050db1d1a004faac5 ]

UBI: attaching mtd1 to ubi0
UBI: scanning is finished
UBI error: init_volumes: not enough PEBs, required 706, available 686
UBI error: ubi_wl_init: no enough physical eraseblocks (-20, need 1)
UBI error: ubi_attach_mtd_dev: failed to attach mtd1, error -12 <= NOT ENOMEM
UBI error: ubi_init: cannot attach mtd1

If available PEBs are not enough when initializing volumes, return -ENOSPC directly. If available PEBs are not enough when initializing WL, return -ENOSPC instead of -ENOMEM.

Cc: stable@vger.kernel.org
Signed-off-by: Sheng Yong
Signed-off-by: Richard Weinberger
Reviewed-by: David Gstir
Signed-off-by: Sasha Levin

commit dded44f1aff46ef9cfdcecc52a7688a4b1c73004
Author: Richard Weinberger
Date: Tue Sep 22 23:58:07 2015 +0200

UBI: Validate data_size

[ Upstream commit 281fda27673f833a01d516658a64d22a32c8e072 ]

Make sure that data_size is less than LEB size. Otherwise a handcrafted UBI image is able to trigger an out of bounds memory access in ubi_compare_lebs().
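The kind of check described above can be sketched in user-space C. This is only an illustration under stated assumptions: data_size_fits() and compare_payloads() are hypothetical names, not functions from the UBI driver, and treating a payload exactly equal to the LEB size as valid is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not a UBI function): returns true when the size
 * recorded in an untrusted header fits in one LEB.
 * Assumption: a payload exactly equal to leb_size is accepted.
 */
static bool data_size_fits(uint32_t data_size, uint32_t leb_size)
{
    return data_size <= leb_size;
}

/*
 * Compare two LEB-sized buffers only after validating the untrusted size.
 * Returns 0 if the payloads match, 1 if they differ, and -1 if data_size
 * is rejected, so the buffers are never read past leb_size bytes.
 */
static int compare_payloads(const uint8_t *a, const uint8_t *b,
                            uint32_t data_size, uint32_t leb_size)
{
    if (!data_size_fits(data_size, leb_size))
        return -1; /* refuse to read past the end of the buffers */
    return memcmp(a, b, data_size) != 0;
}
```

Rejecting the oversized value before any comparison is the point: the size comes from the image, so it has to be validated before it is used as a length.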
Cc: stable@vger.kernel.org
Signed-off-by: Richard Weinberger
Reviewed-by: David Gstir
Signed-off-by: Sasha Levin

commit 5b12405cd01699eb0ac6784fb3cd9fb53e80dfd7
Author: Paul Mackerras
Date: Thu Sep 10 14:36:21 2015 +1000

powerpc/MSI: Fix race condition in tearing down MSI interrupts

[ Upstream commit e297c939b745e420ef0b9dc989cb87bda617b399 ]

This fixes a race which can result in the same virtual IRQ number being assigned to two different MSI interrupts. The most visible consequence of that is usually a warning and stack trace from the sysfs code about an attempt to create a duplicate entry in sysfs.

The race happens when one CPU (say CPU 0) is disposing of an MSI while another CPU (say CPU 1) is setting up an MSI. CPU 0 calls (for example) pnv_teardown_msi_irqs(), which calls msi_bitmap_free_hwirqs() to indicate that the MSI (i.e. its hardware IRQ number) is no longer in use. Then, before CPU 0 gets to calling irq_dispose_mapping() to free up the virtual IRQ number, CPU 1 comes in and calls msi_bitmap_alloc_hwirqs() to allocate an MSI, and gets the same hardware IRQ number that CPU 0 just freed. CPU 1 then calls irq_create_mapping() to get a virtual IRQ number, which sees that there is currently a mapping for that hardware IRQ number and returns the corresponding virtual IRQ number (which is the same virtual IRQ number that CPU 0 was using). CPU 0 then calls irq_dispose_mapping() and frees that virtual IRQ number. Now, if another CPU comes along and calls irq_create_mapping(), it is likely to get the virtual IRQ number that was just freed, resulting in the same virtual IRQ number apparently being used for two different hardware interrupts.

To fix this race, we just move the call to msi_bitmap_free_hwirqs() to after the call to irq_dispose_mapping(). Since virq_to_hw() doesn't work for the virtual IRQ number after irq_dispose_mapping() has been called, we need to call it before irq_dispose_mapping() and remember the result for the msi_bitmap_free_hwirqs() call.
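The corrected teardown order can be sketched with a toy, single-threaded model in plain C. Everything here is a hypothetical stand-in for illustration: hwirq_busy[] plays the role of the MSI bitmap, virq_map[] the role of the virq-to-hwirq mapping, and toy_setup()/toy_teardown() are not kernel functions. The model only shows the ordering and does not reproduce the race itself.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_HWIRQS 4
#define NR_VIRQS  4
#define NO_HWIRQ  (-1)

/* Toy state: allocated hardware IRQs and the virq -> hwirq mapping. */
static bool hwirq_busy[NR_HWIRQS];
static int  virq_map[NR_VIRQS] = { NO_HWIRQ, NO_HWIRQ, NO_HWIRQ, NO_HWIRQ };

/* Allocate a hardware IRQ and map a virtual IRQ onto it. */
static void toy_setup(int virq, int hwirq)
{
    hwirq_busy[hwirq] = true;
    virq_map[virq] = hwirq;
}

/* Tear down in the order the fix describes. */
static void toy_teardown(int virq)
{
    int hwirq = virq_map[virq];   /* 1. look up hwirq while the mapping exists */

    virq_map[virq] = NO_HWIRQ;    /* 2. dispose of the mapping first */
    hwirq_busy[hwirq] = false;    /* 3. only now release the hardware IRQ */
}
```

With this order, the hardware IRQ number cannot be handed out again while a stale mapping for it still exists, which is the window the race above depends on.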
The pattern of calling msi_bitmap_free_hwirqs() before irq_dispose_mapping() appears in 5 places under arch/powerpc, and appears to have originated in commit 05af7bd2d75e ("[POWERPC] MPIC U3/U4 MSI backend") from 2007.

Fixes: 05af7bd2d75e ("[POWERPC] MPIC U3/U4 MSI backend")
Cc: stable@vger.kernel.org # v2.6.22+
Reported-by: Alexey Kardashevskiy
Signed-off-by: Paul Mackerras
Signed-off-by: Michael Ellerman
Signed-off-by: Sasha Levin

commit 9588db31ca1f578263ad8f813c7ae4ab84861c92
Author: Kapileshwar Singh
Date: Tue Sep 22 14:22:03 2015 +0100

tools lib traceevent: Fix string handling in heterogeneous arch environments

[ Upstream commit c2e4b24ff848bb180f9b9cd873a38327cd219ad2 ]

When a trace recorded on a 32-bit device is processed with a 64-bit binary, the higher 32-bits of the address need to be ignored.

The lack of this results in the output of the 64-bit pointer value to the trace as the 32-bit address lookup fails in find_printk().

Before:

burn-1778 [003] 548.600305: bputs: 0xc0046db2s: 2cec5c058d98c

After:

burn-1778 [003] 548.600305: bputs: 0xc0046db2s: RT throttling activated

The problem occurs in PRINT_FIELD when the field is recognized as a pointer to a string (of the type const char *).

Heterogeneous architectures cases below can arise and should be handled:

* Traces recorded using 32-bit addresses processed on a 64-bit machine
* Traces recorded using 64-bit addresses processed on a 32-bit machine

Reported-by: Juri Lelli
Signed-off-by: Kapileshwar Singh
Reviewed-by: Steven Rostedt
Cc: David Ahern
Cc: Javi Merino
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1442928123-13824-1-git-send-email-kapileshwar.singh@arm.com
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Sasha Levin

commit fbe76613568a1603c64577afebcf8297b14e0f8a
Author: Linus Lüssing
Date: Tue Jun 30 23:45:26 2015 +0200

batman-adv: Fix potentially broken skb network header access

[ Upstream commit 53cf037bf846417fd92dc92ddf97267f69b110f4 ]

The two
commits noted below added calls to ip_hdr() and ipv6_hdr(). They need a correctly set skb network header.

Unfortunately we cannot rely on the device drivers to set it for us. Therefore set it at the beginning of the corresponding ndo_start_xmit handler.

Fixes: 1d8ab8d3c176 ("batman-adv: Modified forwarding behaviour for multicast packets")
Fixes: ab49886e3da7 ("batman-adv: Add IPv4 link-local/IPv6-ll-all-nodes multicast support")
Signed-off-by: Linus Lüssing
Signed-off-by: Marek Lindner
Signed-off-by: Antonio Quartulli
Signed-off-by: Sasha Levin

commit 722ccda13bc2ccfe3ca9331ae24d898c99b87432
Author: Linus Lüssing
Date: Tue Jun 16 17:10:24 2015 +0200

batman-adv: Make TT capability changes atomic

[ Upstream commit ac4eebd48461ec993e7cb614d5afe7df8c72e6b7 ]

Bitwise OR/AND assignments in C aren't guaranteed to be atomic. One OGM handler might undo the set/clear of a specific bit from another handler run in between.

Fix this by using the atomic set_bit()/clear_bit()/test_bit() functions.

Fixes: e17931d1a61d ("batman-adv: introduce capability initialization bitfield")
Signed-off-by: Linus Lüssing
Signed-off-by: Marek Lindner
Signed-off-by: Antonio Quartulli
Signed-off-by: Sasha Levin

commit 81aab3d164c22f42f5fb200558b93ef077cebbf1
Author: Linus Lüssing
Date: Tue Jun 16 17:10:23 2015 +0200

batman-adv: Make NC capability changes atomic

[ Upstream commit 4635469f5c617282f18c69643af36cd8c0acf707 ]

Bitwise OR/AND assignments in C aren't guaranteed to be atomic. One OGM handler might undo the set/clear of a specific bit from another handler run in between.

Fix this by using the atomic set_bit()/clear_bit()/test_bit() functions.
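As a rough user-space illustration of why this matters, the following sketch uses C11 atomics as an analogue of the kernel's set_bit()/clear_bit()/test_bit(). The cap_* helpers and the CAP_* bit numbers are hypothetical names chosen for this example, not batman-adv or kernel identifiers. A plain `caps |= mask` is a separate load, modify and store, so a concurrent update to another bit can be lost; the atomic read-modify-write calls below avoid that.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical capability bit numbers, for illustration only. */
enum { CAP_TT = 0, CAP_NC = 1 };

/* Atomically set bit nr (user-space analogue of set_bit()). */
static inline void cap_set_bit(atomic_ulong *caps, unsigned int nr)
{
    atomic_fetch_or(caps, 1UL << nr);
}

/* Atomically clear bit nr (user-space analogue of clear_bit()). */
static inline void cap_clear_bit(atomic_ulong *caps, unsigned int nr)
{
    atomic_fetch_and(caps, ~(1UL << nr));
}

/* Atomically read the word and test bit nr (analogue of test_bit()). */
static inline bool cap_test_bit(atomic_ulong *caps, unsigned int nr)
{
    return (atomic_load(caps) & (1UL << nr)) != 0;
}
```

Each helper touches only its own bit in a single atomic operation, so two handlers updating different capability bits of the same word cannot undo each other's change.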
Fixes: 3f4841ffb336 ("batman-adv: tvlv - add network coding container") Signed-off-by: Linus Lüssing Signed-off-by: Marek Lindner Signed-off-by: Antonio Quartulli Signed-off-by: Sasha Levin commit ab4c92c41d879aef3c54359cc6e1c95b80242fde Author: James Hogan Date: Fri Mar 27 08:33:43 2015 +0000 MIPS: dma-default: Fix 32-bit fall back to GFP_DMA [ Upstream commit 53960059d56ecef67d4ddd546731623641a3d2d1 ] If there is a DMA zone (usually 24bit = 16MB I believe), but no DMA32 zone, as is the case for some 32-bit kernels, then massage_gfp_flags() will cause DMA memory allocated for devices with a 32..63-bit coherent_dma_mask to fall back to using __GFP_DMA, even though there may only be 32-bits of physical address available anyway. Correct that case to compare against a mask the size of phys_addr_t instead of always using a 64-bit mask. Signed-off-by: James Hogan Fixes: a2e715a86c6d ("MIPS: DMA: Fix computation of DMA flags from device's coherent_dma_mask.") Cc: Ralf Baechle Cc: linux-mips@linux-mips.org Cc: # 2.6.36+ Patchwork: https://patchwork.linux-mips.org/patch/9610/ Signed-off-by: Ralf Baechle Signed-off-by: Sasha Levin commit d3e32326c31c915dc6cd86f015deed2708e5d1bb Author: Viresh Kumar Date: Wed Sep 2 14:36:50 2015 +0530 cpufreq: dt: Tolerance applies on both sides of target voltage [ Upstream commit a2022001cebd0825b96aa0f3345ea3ad44ae79d4 ] Tolerance applies on both sides of the target voltage, i.e. both min and max sides. But while checking if a voltage is supported by the regulator or not, we haven't taken care of tolerance on the lower side. Fix that. Cc: Lucas Stach Fixes: 045ee45c4ff2 ("cpufreq: cpufreq-dt: disable unsupported OPPs") Signed-off-by: Viresh Kumar Reviewed-by: Lucas Stach Signed-off-by: Rafael J. Wysocki Signed-off-by: Sasha Levin commit b2aac3f62f292863c348c968524a153eac8dbf5b Author: Yao-Wen Mao Date: Mon Aug 31 14:24:09 2015 +0800 USB: Add reset-resume quirk for two Plantronics usb headphones. 
[ Upstream commit 8484bf2981b3d006426ac052a3642c9ce1d8d980 ] These two headphones need a reset-resume quirk to properly resume to original volume level. Signed-off-by: Yao-Wen Mao Cc: stable Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit df59346751bb86dbe1778420c27ee8d34986297e Author: Vincent Palatin Date: Thu Oct 1 14:10:22 2015 -0700 usb: Add device quirk for Logitech PTZ cameras [ Upstream commit 72194739f54607bbf8cfded159627a2015381557 ] Add a device quirk for the Logitech PTZ Pro Camera and its sibling the ConferenceCam CC3000e Camera. This fixes the failed camera enumeration on some boot, particularly on machines with fast CPU. Tested by connecting a Logitech PTZ Pro Camera to a machine with a Haswell Core i7-4600U CPU @ 2.10GHz, and doing thousands of reboot cycles while recording the kernel logs and taking camera picture after each boot. Before the patch, more than 7% of the boots show some enumeration transfer failures and in a few of them, the kernel is giving up before actually enumerating the webcam. After the patch, the enumeration has been correct on every reboot. Signed-off-by: Vincent Palatin Cc: stable Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 59dcdaf9220fc1f8204388f02bf7bf9ddc36072f Author: Felipe Balbi Date: Thu Aug 6 10:51:29 2015 -0500 usb: musb: cppi41: allow it to work again [ Upstream commit b0a688ddcc5015eb26000c63841db7c46cfb380a ] since commit 33c300cb90a6 ("usb: musb: dsps: don't fake of_node to musb core") we have been preventing CPPI 4.1 from probing due to NULL of_node. We can't revert said commit otherwise a different regression would show up, so the fix is to look for the parent device's (glue layer's) of_node instead, since that's the thing which is actually described in DTS. 
Signed-off-by: Felipe Balbi Signed-off-by: Sasha Levin commit c8759ce78f7dc0b5d7a0ddcd4957cad1bd9ee5f9 Author: Mathias Nyman Date: Mon Sep 21 17:46:09 2015 +0300 usb: Use the USB_SS_MULT() macro to get the burst multiplier. [ Upstream commit ff30cbc8da425754e8ab96904db1d295bd034f27 ] Bits 1:0 of the bmAttributes are used for the burst multiplier. The rest of the bits used to be reserved (zero), but USB3.1 takes bit 7 into use. Use the existing USB_SS_MULT() macro instead to make sure the mult value and hence max packet calculations are correct for USB3.1 devices. Note that burst multiplier in bmAttributes is zero based and that the USB_SS_MULT() macro adds one. Cc: Signed-off-by: Mathias Nyman Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 3d94dce3f2db73cd857b8e729606cfd64e9afaac Author: Peter Chen Date: Mon Aug 24 14:10:07 2015 +0800 usb: chipidea: udc: using the correct stall implementation [ Upstream commit 56ffa1d154c7e12af16273f0cdc42690dd05caf5 ] According to spec, there are functional and protocol stalls. For functional stall, it is for bulk and interrupt endpoints, below are cases for it: - Host sends SET_FEATURE request for Set-Halt, the udc driver needs to set stall, and return true unconditionally. - The gadget driver may call usb_ep_set_halt to stall certain endpoints, if there is a transfer in pending, the udc driver should not set stall, and return -EAGAIN accordingly. These two kinds of stall need to be cleared by host using CLEAR_FEATURE request (Clear-Halt). For protocol stall, it is for control endpoint, this stall will be set if the control request has failed. This stall will be cleared by next setup request (hardware will do it). It fixed usbtest (drivers/usb/misc/usbtest.c) Test 13 "set/clear halt" test failure, meanwhile, this change has been verified by USB2 CV Compliance Test and MSC Tests. 
Cc: #3.10+ Cc: Alan Stern Cc: Felipe Balbi Signed-off-by: Peter Chen Signed-off-by: Sasha Levin commit 79c927e7c79234db33d917d30e9f372a0b9baa25 Author: Jann Horn Date: Fri Sep 18 23:41:23 2015 +0200 security: fix typo in security_task_prctl [ Upstream commit b7f76ea2ef6739ee484a165ffbac98deb855d3d3 ] Signed-off-by: Jann Horn Reviewed-by: Andy Lutomirski Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit ae4fe5668480212519ca1013ab482720af8bf639 Author: Mark Brown Date: Sat Sep 19 07:12:34 2015 -0700 regmap: debugfs: Don't bother actually printing when calculating max length [ Upstream commit 176fc2d5770a0990eebff903ba680d2edd32e718 ] The in-kernel snprintf() will conveniently return the actual length of the printed string even if not given an output buffer at all, so just do that rather than relying on the user to pass in a suitable buffer, ensuring that we don't need to worry if the buffer was truncated due to the size of the buffer passed in. Reported-by: Rasmus Villemoes Signed-off-by: Mark Brown Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit d669c8ed8a13c891eb8e5a4007fb3e6705d28716 Author: Mark Brown Date: Sat Sep 19 07:00:18 2015 -0700 regmap: debugfs: Ensure we don't underflow when printing access masks [ Upstream commit b763ec17ac762470eec5be8ebcc43e4f8b2c2b82 ] If a read is attempted which is smaller than the line length then we may underflow the subtraction we're doing with the unsigned size_t type, so move some of the calculation to be additions on the right hand side instead in order to avoid this. Reported-by: Rasmus Villemoes Signed-off-by: Mark Brown Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit e885e29eccc519fb3226978978e28a57b4b3b250 Author: Heiko Stuebner Date: Tue Aug 4 21:36:12 2015 +0200 PM / AVS: rockchip-io: depend on CONFIG_POWER_AVS [ Upstream commit 28c1f1628ee4b163e615eefe1b6463e3d229a873 ] The rockchip io-domain driver currently only depends on ARCH_ROCKCHIP itself.
This makes it possible to select the power-domain driver, but not the POWER_AVS class and results in the iodomain-driver not getting built in this case. So add the additional dependency, which also results in the driver config option now being placed nicely into the AVS submenu. Fixes: 662a958638bd ("PM / AVS: rockchip-io: add driver handling Rockchip io domains") Signed-off-by: Heiko Stuebner Acked-by: Kevin Hilman Signed-off-by: Rafael J. Wysocki Signed-off-by: Sasha Levin commit fd498c50c30ea0f951fac76bc733eae599efdcb9 Author: Antoine Ténart Date: Tue Aug 18 10:59:10 2015 +0200 mtd: pxa3xx_nand: add a default chunk size [ Upstream commit bc3e00f04cc1fe033a289c2fc2e5c73c0168d360 ] When keeping the configuration set by the bootloader (by using the marvell,nand-keep-config property), the pxa3xx_nand_detect_config() function is called and sets the chunk size to 512 as a default value if NDCR_PAGE_SZ is not set. In the other case, when not keeping the bootloader configuration, no chunk size is set. Fix this by adding a default chunk size of 512. Fixes: 70ed85232a93 ("mtd: nand: pxa3xx: Introduce multiple page I/O support") Signed-off-by: Antoine Tenart Acked-by: Robert Jarzmik Signed-off-by: Brian Norris Signed-off-by: Sasha Levin commit 0dfca2dcd62e3ff9d6d05a927af066c51ce46f53 Author: Mario Carrillo Date: Mon Aug 24 09:33:09 2015 -0500 docs: update HOWTO for 3.x -> 4.x versioning [ Upstream commit e4144fe5d47c91c92d36cdbd5f31ed8d6e3a57ab ] The HOWTO document needed updating for the new kernel versioning.
Signed-off-by: Mario Carrillo Signed-off-by: Jonathan Corbet Signed-off-by: Sasha Levin commit 664b3b884a70585c105c36a3f1f35dba7838d77d Author: Peter Seiderer Date: Thu Sep 17 21:40:12 2015 +0200 cifs: use server timestamp for ntlmv2 authentication [ Upstream commit 98ce94c8df762d413b3ecb849e2b966b21606d04 ] Linux cifs mount with ntlmssp against a Mac OS X (Yosemite 10.10.5) share fails in case the clocks differ more than +/-2h: digest-service: digest-request: od failed with 2 proto=ntlmv2 digest-service: digest-request: kdc failed with -1561745592 proto=ntlmv2 Fix this by (re-)using the given server timestamp for the ntlmv2 authentication (as Windows 7 does). A related problem was also reported earlier by Namjae Jeon (see below): Windows machines have an extended security feature which refuses to allow authentication when there is a time difference between server time and client time when ntlmv2 negotiation is used. This problem is prevalent in embedded environments where system time is set to the default of 1970. Modern servers send the server timestamp in the TargetInfo Av_Pair structure in the challenge message [see MS-NLMP 2.2.2.1]. In [MS-NLMP 3.1.5.1.2] it is explicitly mentioned that the client must use the server provided timestamp if present OR current time if it is not. Reported-by: Namjae Jeon Signed-off-by: Peter Seiderer Signed-off-by: Steve French CC: Stable Signed-off-by: Sasha Levin commit dd1cb4f71ac173f0f3d0fd172d4a58f3aa23bb8b Author: Dong Aisheng Date: Wed Jul 22 20:53:03 2015 +0800 dts: imx25: fix sd card gpio polarity specified in device tree [ Upstream commit cf75eb15be2bdd054a76aeef6458beaa4a6ab770 ] cd-gpios polarity should be changed to GPIO_ACTIVE_LOW and wp-gpios should be changed to GPIO_ACTIVE_HIGH. Otherwise, the SD may not work properly due to wrong polarity inversion specified in DT after switch to common parsing function mmc_of_parse().
Signed-off-by: Dong Aisheng Acked-by: Shawn Guo Signed-off-by: Ulf Hansson Signed-off-by: Sasha Levin commit b2a9883c2f4079aca2a887cd76d87703fe387b60 Author: Dong Aisheng Date: Wed Jul 22 20:53:01 2015 +0800 dts: imx53: fix sd card gpio polarity specified in device tree [ Upstream commit 94d76946859b4bcfa0da373357f14feda2af0622 ] cd-gpios polarity should be changed to GPIO_ACTIVE_LOW and wp-gpios should be changed to GPIO_ACTIVE_HIGH. Otherwise, the SD may not work properly due to wrong polarity inversion specified in DT after switch to common parsing function mmc_of_parse(). Signed-off-by: Dong Aisheng Acked-by: Shawn Guo Signed-off-by: Ulf Hansson Signed-off-by: Sasha Levin commit 8f29da13f064a6a0c8b07beaf9d1b0c980feb3f6 Author: Dong Aisheng Date: Wed Jul 22 20:53:00 2015 +0800 dts: imx51: fix sd card gpio polarity specified in device tree [ Upstream commit aca45c0e95dad1c4ba4d38da192756b0e10cbbbd ] cd-gpios polarity should be changed to GPIO_ACTIVE_LOW and wp-gpios should be changed to GPIO_ACTIVE_HIGH. Otherwise, the SD may not work properly due to wrong polarity inversion specified in DT after switch to common parsing function mmc_of_parse(). Signed-off-by: Dong Aisheng Acked-by: Shawn Guo Signed-off-by: Ulf Hansson Signed-off-by: Sasha Levin commit 6addbc55cf91c262296e7e93957eadb23e303231 Author: Linus Lüssing Date: Tue Jun 16 17:10:22 2015 +0200 batman-adv: Make DAT capability changes atomic [ Upstream commit 65d7d46050704bcdb8121ddbf4110bfbf2b38baa ] Bitwise OR/AND assignments in C aren't guaranteed to be atomic. One OGM handler might undo the set/clear of a specific bit from another handler run in between. Fix this by using the atomic set_bit()/clear_bit()/test_bit() functions. 
Fixes: 17cf0ea455f1 ("batman-adv: tvlv - add distributed arp table container") Signed-off-by: Linus Lüssing Signed-off-by: Marek Lindner Signed-off-by: Antonio Quartulli Signed-off-by: Sasha Levin commit 61b027c7cee9948bbd68d0ca0cca1830ca33d712 Author: Marek Lindner Date: Wed Jun 17 20:01:36 2015 +0800 batman-adv: protect tt_local_entry from concurrent delete events [ Upstream commit ef72706a0543d0c3a5ab29bd6378fdfb368118d9 ] The tt_local_entry deletion performed in batadv_tt_local_remove() was neither protecting against simultaneous deletes nor checking whether the element was still part of the list before calling hlist_del_rcu(). Replacing the hlist_del_rcu() call with batadv_hash_remove() provides adequate protection via hash spinlocks as well as an is-element-still-in-hash check to avoid 'blind' hash removal. Fixes: 068ee6e204e1 ("batman-adv: roaming handling mechanism redesign") Reported-by: alfonsname@web.de Signed-off-by: Marek Lindner Signed-off-by: Antonio Quartulli Signed-off-by: Sasha Levin commit b544c9930ce29ddcd97abc307df72ce5a735f209 Author: Linus Walleij Date: Tue Jul 28 15:31:12 2015 +0200 fbdev: select versatile helpers for the integrator [ Upstream commit 2701fa0864ecb9e49d47a4aa1c02f172ab79639a ] Commit 11c32d7b6274cb0f554943d65bd4a126c4a86dcd "video: move Versatile CLCD helpers" missed the fact that the Integrator/CP is also using the helper, and as a result the platform got only stubs and no graphics. Add this as a default selection to Kconfig so we have graphics again. Fixes: 11c32d7b6274 (video: move Versatile CLCD helpers) Signed-off-by: Linus Walleij Signed-off-by: Tomi Valkeinen Signed-off-by: Sasha Levin commit 025976366982a779dbef69ecea2c76fb8e65d362 Author: Julian Anastasov Date: Wed Jul 8 08:31:33 2015 +0300 ipvs: fix crash with sync protocol v0 and FTP [ Upstream commit 56184858d1fc95c46723436b455cb7261cd8be6f ] Fix crash in 3.5+ if FTP is used after switching sync_version to 0. 
Fixes: 749c42b620a9 ("ipvs: reduce sync rate with time thresholds") Signed-off-by: Julian Anastasov Signed-off-by: Simon Horman Signed-off-by: Sasha Levin commit 4cf3ff315fc5f0949dcf9a4f14f6ba76c2422c5c Author: Alex Gartrell Date: Sun Jul 5 14:28:26 2015 -0700 ipvs: skb_orphan in case of forwarding [ Upstream commit 71563f3414e917c62acd8e0fb0edf8ed6af63e4b ] It is possible that we bind against a local socket in early_demux when we are actually going to want to forward it. In this case, the socket serves no purpose and only serves to confuse things (particularly functions which implicitly expect sk_fullsock to be true, like ip_local_out). Additionally, skb_set_owner_w is totally broken for non full-socks. Signed-off-by: Alex Gartrell Fixes: 41063e9dd119 ("ipv4: Early TCP socket demux.") Acked-by: Julian Anastasov Signed-off-by: Simon Horman Signed-off-by: Sasha Levin commit c803fddd2a95a70873c68dbff42d4c59fd2e674e Author: Julian Anastasov Date: Mon Jun 29 21:51:40 2015 +0300 ipvs: fix crash if scheduler is changed [ Upstream commit 05f00505a89acd21f5d0d20f5797dfbc4cf85243 ] I overlooked the svc->sched_data usage from schedulers when the services were converted to RCU in 3.10. Now the rare ipvsadm -E command can change the scheduler but due to the reverse order of ip_vs_bind_scheduler and ip_vs_unbind_scheduler we provide new sched_data to the old scheduler resulting in a crash. To fix it without changing the scheduler methods we have to use synchronize_rcu() only for the editing case. It means all svc->scheduler readers should expect a NULL value. To avoid breakage for the service listing and ipvsadm -R we can use the "none" name to indicate that scheduler is not assigned, a state when we drop new connections. 
Reported-by: Alexander Vasiliev Fixes: ceec4c381681 ("ipvs: convert services to rcu") Signed-off-by: Julian Anastasov Signed-off-by: Simon Horman Signed-off-by: Sasha Levin commit e89e653311ac2c9f37ceb778212ae4dbe1104091 Author: Julian Anastasov Date: Sat Jun 27 14:39:30 2015 +0300 ipvs: do not use random local source address for tunnels [ Upstream commit 4754957f04f5f368792a0eb7dab0ae89fb93dcfd ] Michael Vallaly reports about wrong source address used in rare cases for tunneled traffic. Looks like __ip_vs_get_out_rt in 3.10+ is providing uninitialized dest_dst->dst_saddr.ip because ip_vs_dest_dst_alloc uses kmalloc. While we retry after seeing EINVAL from routing for data that does not look like valid local address, it still succeeded when this memory was previously used from other dests and with different local addresses. As result, we can use valid local address that is not suitable for our real server. Fix it by providing 0.0.0.0 every time our cache is refreshed. By this way we will get preferred source address from routing. Reported-by: Michael Vallaly Fixes: 026ace060dfe ("ipvs: optimize dst usage for real server") Signed-off-by: Julian Anastasov Signed-off-by: Simon Horman Signed-off-by: Sasha Levin commit d6a27fd859704a456d552b92d556fb2a559859d9 Author: Ben Segall Date: Mon Apr 6 15:28:10 2015 -0700 sched/fair: Prevent throttling in early pick_next_task_fair() [ Upstream commit 54d27365cae88fbcc853b391dcd561e71acb81fa ] The optimized task selection logic optimistically selects a new task to run without first doing a full put_prev_task(). This is so that we can avoid a put/set on the common ancestors of the old and new task. Similarly, we should only call check_cfs_rq_runtime() to throttle eligible groups if they're part of the common ancestry, otherwise it is possible to end up with no eligible task in the simple task selection. 
Imagine: /root /prev /next /A /B If our optimistic selection ends up throttling /next, we goto simple and our put_prev_task() ends up throttling /prev, after which we're going to bug out in set_next_entity() because there aren't any tasks left. Avoid this scenario by only throttling common ancestors. Reported-by: Mohammed Naser Reported-by: Konstantin Khlebnikov Signed-off-by: Ben Segall [ munged Changelog ] Signed-off-by: Peter Zijlstra (Intel) Cc: Andrew Morton Cc: H. Peter Anvin Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Roman Gushchin Cc: Thomas Gleixner Cc: pjt@google.com Fixes: 678d5718d8d0 ("sched/fair: Optimize cgroup pick_next_task_fair()") Link: http://lkml.kernel.org/r/xm26wq1oswoq.fsf@sword-of-the-dawn.mtv.corp.google.com Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin commit b5495ddce4659122180b5fee6fc52dc5196e0918 Author: Linus Torvalds Date: Wed Sep 30 12:48:40 2015 -0400 Initialize msg/shm IPC objects before doing ipc_addid() [ Upstream commit b9a532277938798b53178d5a66af6e2915cb27cf ] As reported by Dmitry Vyukov, we really shouldn't do ipc_addid() before having initialized the IPC object state. Yes, we initialize the IPC object in a locked state, but with all the lockless RCU lookup work, that IPC object lock no longer means that the state cannot be seen. We already did this for the IPC semaphore code (see commit e8577d1f0329: "ipc/sem.c: fully initialize sem_array before making it visible") but we clearly forgot about msg and shm. 
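The ordering rule in the message above (finish initializing the object, then make it findable) can be sketched in userspace C. This is an illustrative analogue only, not the ipc/ code: struct sketch_ipc_obj, sketch_publish() and sketch_lookup() are hypothetical names, and a C11 release store / acquire load pair stands in for inserting the object into an RCU-visible IDR with ipc_addid().

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for an IPC object; the fields are illustrative. */
struct sketch_ipc_obj {
	int perm_mode;
	long ctime;
};

/* A single lookup slot standing in for the id table seen by lockless readers. */
static _Atomic(struct sketch_ipc_obj *) sketch_slot;

/* Initialize every field first, then publish the pointer with release order. */
static void sketch_publish(struct sketch_ipc_obj *obj, int mode, long now)
{
	obj->perm_mode = mode;
	obj->ctime = now;
	atomic_store_explicit(&sketch_slot, obj, memory_order_release);
}

/* Lockless reader: sees either NULL or a fully initialized object. */
static struct sketch_ipc_obj *sketch_lookup(void)
{
	return atomic_load_explicit(&sketch_slot, memory_order_acquire);
}
```

If the store to sketch_slot came before the field assignments, a reader could observe the pointer while perm_mode and ctime still held stale values, which is the class of bug the commit describes.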
Reported-by: Dmitry Vyukov Cc: Manfred Spraul Cc: Davidlohr Bueso Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit df8a261df56876825329c05e025b23ff1c6b02bb Author: Reyad Attiyat Date: Thu Aug 6 19:23:58 2015 +0300 usb: xhci: Add support for URB_ZERO_PACKET to bulk/sg transfers [ Upstream commit 4758dcd19a7d9ba9610b38fecb93f65f56f86346 ] This commit checks for the URB_ZERO_PACKET flag and creates an extra zero-length td if the urb transfer length is a multiple of the endpoint's max packet length. Signed-off-by: Reyad Attiyat Signed-off-by: Mathias Nyman Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit aafb9ef320365f18ce9e63fdcbfb84c8c1675b95 Author: Mathias Nyman Date: Mon Sep 21 17:46:17 2015 +0300 xhci: init command timeout timer earlier to avoid deleting it uninitialized [ Upstream commit cc8e4fc0c3b5e8340bc8358990515d116a3c274c ] Don't check if timer is running with a timer_pending() before deleting it with del_timer_sync(), this defies the whole point of the sync part and can cause a possible race. Instead we just want to make sure the timer is initialized early enough before we have a chance to delete it. Cc: Reported-by: Oliver Neukum Signed-off-by: Mathias Nyman Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit bf5b29517f11ae9ecd000a51413dca6aa6f8e3bf Author: Mathias Nyman Date: Mon Sep 21 17:46:16 2015 +0300 xhci: change xhci 1.0 only restrictions to support xhci 1.1 [ Upstream commit dca7794539eff04b786fb6907186989e5eaaa9c2 ] Some changes between xhci 0.96 and xhci 1.0 specifications forced us to check the hci version in code, some of these checks were implemented as hci_version == 1.0, which will not work with new xhci 1.1 controllers. 
xhci 1.1 behaves similar to xhci 1.0 in these cases, so change these checks to hci_version >= 1.0 Cc: Signed-off-by: Mathias Nyman Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit ef2b6a7e240080a6501aeff02f59bcd96be38143 Author: Roger Quadros Date: Mon Sep 21 17:46:15 2015 +0300 usb: xhci: exit early in xhci_setup_device() if we're halted or dying [ Upstream commit 448116bfa856d3c076fa7178ed96661a008a5d45 ] During quick plug/removal of OTG adapter during dual-role testing it can happen that xhci_alloc_device() is called for the newly detected device after the DRD library has called xhci_stop to remove the HCD. If that is the case, just fail early to prevent the following warning. [ 154.732649] hub 4-0:1.0: USB hub found [ 154.742204] hub 4-0:1.0: 1 port detected [ 154.824458] hub 3-0:1.0: state 7 ports 1 chg 0002 evt 0000 [ 154.854609] hub 4-0:1.0: state 7 ports 1 chg 0000 evt 0000 [ 154.944430] usb 3-1: new high-speed USB device number 2 using xhci-hcd [ 154.951009] xhci-hcd xhci-hcd.0.auto: xhci_setup_device [ 155.038191] xhci-hcd xhci-hcd.0.auto: remove, state 4 [ 155.043315] usb usb4: USB disconnect, device number 1 [ 155.055270] xhci-hcd xhci-hcd.0.auto: xhci_stop [ 155.060094] xhci-hcd xhci-hcd.0.auto: USB bus 4 deregistered [ 155.066576] xhci-hcd xhci-hcd.0.auto: remove, state 1 [ 155.071710] usb usb3: USB disconnect, device number 1 [ 155.077124] xhci-hcd xhci-hcd.0.auto: xhci_setup_device [ 155.082389] ------------[ cut here ]------------ [ 155.087690] WARNING: CPU: 0 PID: 72 at drivers/usb/host/xhci.c:3800 xhci_setup_device+0x410/0x484 [xhci_hcd]() [ 155.097861] Modules linked in: sd_mod usb_storage scsi_mod usb_f_ss_lb g_zero libcomposite ipv6 xhci_plat_hcd xhci_hcd usbcore dwc3 udc_core evdev ti_am335x_adc joydev kfifo_buf industrialio snd_soc_simple_cc [ 155.146734] CPU: 0 PID: 72 Comm: kworker/0:3 Tainted: G W 4.1.4-00834-gcd9380b-dirty #50 [ 155.156073] Hardware name: Generic AM43 (Flattened Device Tree) [ 155.162117] Workqueue: 
usb_hub_wq hub_event [usbcore] [ 155.167249] Backtrace: [ 155.169751] [] (dump_backtrace) from [] (show_stack+0x18/0x1c) [ 155.177390] r6:c089d4a4 r5:ffffffff r4:00000000 r3:ee46c000 [ 155.183137] [] (show_stack) from [] (dump_stack+0x84/0xd0) [ 155.190446] [] (dump_stack) from [] (warn_slowpath_common+0x80/0xbc) [ 155.198605] r7:00000009 r6:00000ed8 r5:bf27eb70 r4:00000000 [ 155.204348] [] (warn_slowpath_common) from [] (warn_slowpath_null+0x24/0x2c) [ 155.213202] r8:ee49f000 r7:ee7c0004 r6:00000000 r5:ee7c0158 r4:ee7c0000 [ 155.220051] [] (warn_slowpath_null) from [] (xhci_setup_device+0x410/0x484 [xhci_hcd]) [ 155.229816] [] (xhci_setup_device [xhci_hcd]) from [] (xhci_address_device+0x14/0x18 [xhci_hcd]) [ 155.240415] r10:ee598200 r9:00000001 r8:00000002 r7:00000001 r6:00000003 r5:00000002 [ 155.248363] r4:ee49f000 [ 155.250978] [] (xhci_address_device [xhci_hcd]) from [] (hub_port_init+0x1b8/0xa9c [usbcore]) [ 155.261403] [] (hub_port_init [usbcore]) from [] (hub_event+0x738/0x1020 [usbcore]) [ 155.270874] r10:ee598200 r9:ee7c0000 r8:ee7c0038 r7:ee518800 r6:ee49f000 r5:00000001 [ 155.278822] r4:00000000 [ 155.281426] [] (hub_event [usbcore]) from [] (process_one_work+0x128/0x340) [ 155.290196] r10:00000000 r9:00000003 r8:00000000 r7:fedfa000 r6:eeec5400 r5:ee598314 [ 155.298151] r4:ee434380 [ 155.300718] [] (process_one_work) from [] (worker_thread+0x158/0x49c) [ 155.308963] r10:ee434380 r9:00000003 r8:eeec5400 r7:00000008 r6:ee434398 r5:eeec5400 [ 155.316913] r4:eeec5414 [ 155.319482] [] (worker_thread) from [] (kthread+0xdc/0xf8) [ 155.326765] r10:00000000 r9:00000000 r8:00000000 r7:c00577a0 r6:ee434380 r5:ee4441c0 [ 155.334713] r4:00000000 r3:00000000 [ 155.338341] [] (kthread) from [] (ret_from_fork+0x14/0x2c) [ 155.345626] r7:00000000 r6:00000000 r5:c005cb64 r4:ee4441c0 [ 155.356108] ---[ end trace a58d34c223b190e6 ]--- [ 155.360783] xhci-hcd xhci-hcd.0.auto: Virt dev invalid for slot_id 0x1! 
[ 155.574404] xhci-hcd xhci-hcd.0.auto: xhci_setup_device [ 155.579667] ------------[ cut here ]------------ Cc: Signed-off-by: Roger Quadros Signed-off-by: Mathias Nyman Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit c48a27a42e0753c6b4a1fe74c57a3fa02b4b853d Author: Roger Quadros Date: Mon Sep 21 17:46:13 2015 +0300 usb: xhci: Clear XHCI_STATE_DYING on start [ Upstream commit e5bfeab0ad515b4f6df39fe716603e9dc6d3dfd0 ] For whatever reason if XHCI died in the previous instant then it will never recover on the next xhci_start unless we clear the DYING flag. Cc: Signed-off-by: Roger Quadros Signed-off-by: Mathias Nyman Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit b57a9f68701f9587e1b1792232db55615353c314 Author: Johan Hovold Date: Wed Sep 23 11:41:42 2015 -0700 USB: whiteheat: fix potential null-deref at probe [ Upstream commit cbb4be652d374f64661137756b8f357a1827d6a4 ] Fix potential null-pointer dereference at probe by making sure that the required endpoints are present. The whiteheat driver assumes there are at least five pairs of bulk endpoints, of which the final pair is used for the "command port". An attempt to bind to an interface with fewer bulk endpoints would currently lead to an oops. Fixes CVE-2015-5257. Reported-by: Moein Ghasemzadeh Cc: stable Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit c2be986b55f0b3aa4729684aee6364ed4afa9d77 Author: Michel Dänzer Date: Mon Sep 28 18:16:31 2015 +0900 drm/amdgpu: Restore LCD backlight level on resume [ Upstream commit 74b3112e95073b351e3b0b9799795bc76f8415fa ] Instead of only enabling the backlight (which seems to set it to max brightness), just re-set the current backlight level, which also takes care of enabling the backlight if necessary. 
Port of radeon commit: drm/radeon: Restore LCD backlight level on resume (>= R5xx) Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit a15b34af10701889991175a882c72075d85a982a Author: Daniel Vetter Date: Tue Jun 23 11:34:21 2015 +0200 drm: Reject DRI1 hw lock ioctl functions for kms drivers [ Upstream commit da168d81b44898404d281d5dbe70154ab5f117c1 ] I've done some extensive history digging across libdrm, mesa and xf86-video-{intel,nouveau,ati}. The only potential user of this with kms drivers I could find was ttmtest, which once used drmGetLock still. But that mistake was quickly fixed up. Even the intel xvmc library (which otherwise was really good with using dri1 stuff in kms mode) managed to never take the hw lock for dri2 (and hence kms). Hence it should be safe to unconditionally disallow this. Cc: Peter Antoine Reviewed-by: Peter Antoine Signed-off-by: Daniel Vetter Signed-off-by: Sasha Levin commit 0548f19db3237b18c5dfa3aa8c0d965c01021e66 Author: Jani Nikula Date: Thu Sep 17 16:42:07 2015 +0300 drm/i915/bios: handle MIPI Sequence Block v3+ gracefully [ Upstream commit cd67d226ebd909d239d2c6e5a6abd6e2a338d1cd ] The VBT MIPI Sequence Block version 3 has forward incompatible changes: First, the block size in the header has been specified reserved, and the actual size is a separate 32-bit value within the block. The current find_section() function will only look at the size in the block header, and, depending on what's in that now reserved size field, continue looking for other sections in the wrong place. Fix this by taking the new block size field into account. This will ensure that the lookups for other sections will work properly, as long as the new 32-bit size does not go beyond the opregion VBT mailbox size. Second, the contents of the block have been completely changed. Gracefully refuse parsing the yet unknown data version.
Cc: Deepak M Cc: stable@vger.kernel.org Reviewed-by: Deepak M Signed-off-by: Jani Nikula Signed-off-by: Sasha Levin commit f2e976bc1230c3982c2cf6bcd9fb96b7daa99a7e Author: Fabiano Fidêncio Date: Thu Sep 24 15:18:34 2015 +0200 drm/qxl: recreate the primary surface when the bo is not primary [ Upstream commit 8d0d94015e96b8853c4f7f06eac3f269e1b3d866 ] When disabling/enabling a crtc the primary area must be updated independently of which crtc has been disabled/enabled. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1264735 Signed-off-by: Fabiano Fidêncio Cc: stable@vger.kernel.org Signed-off-by: Dave Airlie Signed-off-by: Sasha Levin commit 13923ed351d40f17ec45531526808eadf6300f2e Author: Dave Airlie Date: Mon Sep 14 10:28:34 2015 +1000 drm/qxl: only report first monitor as connected if we have no state [ Upstream commit 69e5d3f893e19613486f300fd6e631810338aa4b ] If the server isn't new enough to give us state, report the first monitor as always connected, otherwise believe the server side. Cc: stable@vger.kernel.org Signed-off-by: Dave Airlie Signed-off-by: Sasha Levin commit ec9dec9fc42e7528e4ddfe6c069bdec57e35db1e Author: Steve French Date: Mon Sep 28 17:21:07 2015 -0500 [SMB3] Do not fall back to SMBWriteX in set_file_size error cases [ Upstream commit 646200a041203f440fb6fcf9cacd9efeda9de74c ] The error paths in set_file_size for cifs and smb3 are incorrect. In the unlikely event that a server did not support set file info of the file size, the code incorrectly falls back to trying SMBWriteX (note that only the original core SMB Write, used for example by DOS, can set the file size this way - this actually does not work for the more recent SMBWriteX). The idea was since the old DOS SMB Write could set the file size if you write zero bytes at that offset then use that if server rejects the normal set file info call. 
Fortunately the SMBWriteX will never be sent on the wire (except when file size is zero) since the length and offset fields were reversed in the two places in this function that call SMBWriteX, causing the fall back path to return an error. It is also important to never call an SMB request from an SMB2/SMB3 session (which theoretically would be possible, and can cause a brief session drop, although the client recovers) so this should be fixed. In practice this path does not happen with modern servers but the error fall back to SMBWriteX is clearly wrong. Removing the calls to SMBWriteX in the error paths in cifs_set_file_size Pointed out by PaX/grsecurity team Signed-off-by: Steve French Reported-by: PaX Team CC: Emese Revfy CC: Brad Spengler CC: Stable Signed-off-by: Sasha Levin commit 4409cd1890741914aaf90c8bfca09ebec1c643ed Author: Steve French Date: Tue Sep 22 09:29:38 2015 -0500 disabling oplocks/leases via module parm enable_oplocks broken for SMB3 [ Upstream commit e0ddde9d44e37fbc21ce893553094ecf1a633ab5 ] leases (oplocks) were always requested for SMB2/SMB3 even when oplocks disabled in the cifs.ko module. Signed-off-by: Steve French Reviewed-by: Chandrika Srinivasan CC: Stable Signed-off-by: Sasha Levin commit f67da13760a0be601a1d3b3daef030c2e5e7cdec Author: Peng Tao Date: Fri Sep 11 11:14:06 2015 +0800 nfs: fix pg_test page count calculation [ Upstream commit 048883e0b934d9a5103d40e209cb14b7f33d2933 ] We really want sizeof(struct page *) instead. Otherwise we limit maximum IO size to 64 pages rather than 512 pages on a 64bit system. Fixes 2e11f829(nfs: cap request size to fit a kmalloced page array).
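The 64-versus-512 figures in the nfs pg_test message follow from dividing one page by the wrong element size. The sketch below reproduces that arithmetic under stated assumptions: a 4096-byte page, and a 64-byte placeholder struct standing in for struct page (the size implied by the "64 pages" figure). SKETCH_PAGE_SIZE, struct sketch_page and both helper functions are hypothetical names, not kernel code.

```c
#include <stddef.h>

/* Assumed page size for the sketch; real kernels define PAGE_SIZE per arch. */
#define SKETCH_PAGE_SIZE 4096u

/* Placeholder with an assumed size of 64 bytes, standing in for struct page. */
struct sketch_page {
	unsigned char pad[64];
};

/* Buggy form: divides by the size of the structure itself. */
static size_t sketch_max_pages_buggy(void)
{
	return SKETCH_PAGE_SIZE / sizeof(struct sketch_page);
}

/* Fixed form: the array holds pointers, so divide by the pointer size. */
static size_t sketch_max_pages_fixed(void)
{
	return SKETCH_PAGE_SIZE / sizeof(struct sketch_page *);
}
```

With 8-byte pointers the fixed form yields 4096 / 8 = 512 entries, while the buggy form yields 4096 / 64 = 64, matching the numbers quoted in the commit message.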
Cc: Christoph Hellwig Signed-off-by: Peng Tao Fixes: 2e11f8296d22 ("nfs: cap request size to fit a kmalloced page array") Signed-off-by: Trond Myklebust Signed-off-by: Sasha Levin commit 8bf6c729a8b9ecb4c2dccba2881b9a374daae7a2 Author: Florian Westphal Date: Wed Sep 9 02:57:21 2015 +0200 netfilter: nf_log: don't zap all loggers on unregister [ Upstream commit 205ee117d4dc4a11ac3bd9638bb9b2e839f4de9a ] like nf_log_unset, nf_log_unregister must not reset the list of loggers. Otherwise, a call to nf_log_unregister() will render loggers of other nf protocols unusable: iptables -A INPUT -j LOG modprobe nf_log_arp ; rmmod nf_log_arp iptables -A INPUT -j LOG iptables: No chain/target/match by that name Fixes: 30e0c6a6be ("netfilter: nf_log: prepare net namespace support for loggers") Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin commit 2f6e55943522ae92ea551df7bb31369c304ef0a6 Author: Marcelo Leitner Date: Wed Oct 29 10:04:51 2014 -0200 netfilter: nf_log: Introduce nft_log_dereference() macro [ Upstream commit 0c26ed1c07f13ca27e2638ffdd1951013ed96c48 ] Wrap up a common call pattern in an easier to handle call. Signed-off-by: Marcelo Ricardo Leitner Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin commit 98a383953f66e1c4f138385a70131b8b12a0cea7 Author: Pablo Neira Ayuso Date: Mon Sep 14 18:04:09 2015 +0200 netfilter: nft_compat: skip family comparison in case of NFPROTO_UNSPEC [ Upstream commit ba378ca9c04a5fc1b2cf0f0274a9d02eb3d1bad9 ] Fix lookup of existing match/target structures in the corresponding list by skipping the family check if NFPROTO_UNSPEC is used. This is resulting in the allocation and insertion of one match/target structure for each use of them. So this not only bloats memory consumption but also severely affects the time to reload the ruleset from the iptables-compat utility. After this patch, iptables-compat-restore and iptables-compat take almost the same time to reload large rulesets. 
Fixes: 0ca743a55991 ("netfilter: nf_tables: add compatibility layer for x_tables") Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin commit 8dafc9930cc2d2cb895cbec6ab83d0bfeb46fac6 Author: Pablo Neira Ayuso Date: Thu Sep 17 13:37:00 2015 +0200 netfilter: nf_log: wait for rcu grace after logger unregistration [ Upstream commit ad5001cc7cdf9aaee5eb213fdee657e4a3c94776 ] The nf_log_unregister() function needs to call synchronize_rcu() to make sure that the objects are not dereferenced anymore on module removal. Fixes: 5962815a6a56 ("netfilter: nf_log: use an array of loggers instead of list") Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin commit ba1fa01d6d2ed019df3d6fe7ac3585d1f8902cd9 Author: Pablo Neira Ayuso Date: Thu Jul 9 22:56:00 2015 +0200 netfilter: ctnetlink: put back references to master ct and expect objects [ Upstream commit 95dd8653de658143770cb0e55a58d2aab97c79d2 ] We have to put back the references to the master conntrack and the expectation that we just created, otherwise we'll leak them. Fixes: 0ef71ee1a5b9 ("netfilter: ctnetlink: refactor ctnetlink_create_expect") Reported-by: Tim Wiess Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin commit f17d9f15f635251d811f5fbc319c114b9be2790f Author: Joe Stringer Date: Tue Jul 21 21:37:31 2015 -0700 netfilter: nf_conntrack: Support expectations in different zones [ Upstream commit 4b31814d20cbe5cd4ccf18089751e77a04afe4f2 ] When zones were originally introduced, the expectation functions were all extended to perform lookup using the zone. However, insertion was not modified to check the zone. This means that two expectations which are intended to apply for different connections that have the same tuple but exist in different zones cannot both be tracked. 
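The zone problem described above can be sketched in userspace C. This is a minimal illustration under stated assumptions, not the kernel code: `struct tuple`, `struct expectation` and `expect_clashes()` are hypothetical stand-ins, and the `zone_aware` flag only exists to contrast the old insertion check with a zone-aware one.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; these are not the kernel's structures. */
struct tuple {
	uint32_t src, dst;
	uint16_t sport, dport;
	uint8_t proto;
};

struct expectation {
	struct tuple tuple;
	uint16_t zone;
};

static bool tuple_equal(const struct tuple *a, const struct tuple *b)
{
	return a->src == b->src && a->dst == b->dst &&
	       a->sport == b->sport && a->dport == b->dport &&
	       a->proto == b->proto;
}

/*
 * Returns true if inserting `b` would be treated as clashing with `a`.
 * zone_aware == false models an insertion check that ignores the zone;
 * zone_aware == true models a check that also compares zones.
 */
static bool expect_clashes(const struct expectation *a,
			   const struct expectation *b, bool zone_aware)
{
	if (!tuple_equal(&a->tuple, &b->tuple))
		return false;
	return zone_aware ? a->zone == b->zone : true;
}
```

With two expectations that share a tuple but sit in zones 1 and 2, the zone-blind check reports a clash while the zone-aware check lets both be tracked.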
Fixes: 5d0aa2ccd4 (netfilter: nf_conntrack: add support for "conntrack zones") Signed-off-by: Joe Stringer Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin commit 3ad1bd820510a4018d4f9d6a6748ff35415f3b58 Author: Pablo Neira Ayuso Date: Fri Aug 28 21:01:43 2015 +0200 netfilter: nfnetlink: work around wrong endianess in res_id field [ Upstream commit a9de9777d613500b089a7416f936bf3ae5f070d2 ] The convention in nfnetlink is to use network byte order in every header field as well as in the attribute payload. The initial version of the batching infrastructure assumes that res_id comes in host byte order though. The only client of the batching infrastructure is nf_tables, so let's add a workaround to address this inconsistency. We currently have 11 nfnetlink subsystems according to NFNL_SUBSYS_COUNT, so we can assume that the subsystem 2560, ie. htons(10), will not be allocated anytime soon, so it can be an alias of nf_tables from the nfnetlink batching path when interpreting the res_id field. Based on original patch from Florian Westphal. Reported-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin commit 545d252508fd550d5510ceca484416f7cc866c79 Author: Mikulas Patocka Date: Fri Oct 2 11:17:37 2015 -0400 dm raid: fix round up of default region size [ Upstream commit 042745ee53a0a7c1f5aff191a4a24213c6dcfb52 ] Commit 3a0f9aaee028 ("dm raid: round region_size to power of two") intended to make sure that the default region size is a power of two. However, the logic in that commit is incorrect and sets the variable region_size to 0 or 1, depending on whether min_region_size is a power of two. Fix this logic, using roundup_pow_of_two(), so that region_size is properly rounded up to the next power of two. 
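As a rough illustration of the rounding behaviour the fix relies on, here is a userspace sketch. `roundup_pow2()` is a hypothetical stand-in written for this example; it is not the kernel's roundup_pow_of_two() implementation, only a function with the same intended result for nonzero inputs.

```c
#include <assert.h>
#include <limits.h>

/*
 * Userspace stand-in (assumption) for the kernel's roundup_pow_of_two():
 * returns the smallest power of two that is >= n.
 * n must be nonzero and small enough that the result fits in unsigned long.
 */
static unsigned long roundup_pow2(unsigned long n)
{
	unsigned long p = 1;

	assert(n != 0);
	assert(n <= (ULONG_MAX >> 1) + 1);
	while (p < n)
		p <<= 1;
	return p;
}
```

A value that is already a power of two is returned unchanged, so a minimum region size of 8192 stays 8192 while 8193 becomes 16384.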
Signed-off-by: Mikulas Patocka Fixes: 3a0f9aaee028 ("dm raid: round region_size to power of two") Cc: stable@vger.kernel.org # v3.8+ Signed-off-by: Mike Snitzer Signed-off-by: Sasha Levin commit 798191c20d0b883aa9cdb8919aacaa5c5eca5e4c Author: Liu.Zhao Date: Mon Aug 24 08:36:12 2015 -0700 USB: option: add ZTE PIDs [ Upstream commit 19ab6bc5674a30fdb6a2436b068d19a3c17dc73e ] This adds ZTE device PIDs to the kernel. Signed-off-by: Liu.Zhao Cc: stable [johan: sort the new entries ] Signed-off-by: Johan Hovold Signed-off-by: Sasha Levin commit 606c9512a3797bd5d4b7f9e0639aace1f6bc0112 Author: Shawn Lin Date: Wed Sep 9 15:41:52 2015 +0800 staging: ion: fix corruption of ion_import_dma_buf [ Upstream commit 6fa92e2bcf6390e64895b12761e851c452d87bd8 ] We found this issue, and it still exists in the latest kernel. Simply keep ion_handle_create under mutex_lock to avoid this race. WARNING: CPU: 2 PID: 2648 at drivers/staging/android/ion/ion.c:512 ion_handle_add+0xb4/0xc0() ion_handle_add: buffer already found. Modules linked in: iwlmvm iwlwifi mac80211 cfg80211 compat CPU: 2 PID: 2648 Comm: TimedEventQueue Tainted: G W 3.14.0 #7 00000000 00000000 9a3efd2c 80faf273 9a3efd6c 9a3efd5c 80935dc9 811d7fd3 9a3efd88 00000a58 812208a0 00000200 80e128d4 80e128d4 8d4ae00c a8cd8600 a8cd8094 9a3efd74 80935e0e 00000009 9a3efd6c 811d7fd3 9a3efd88 9a3efd9c Call Trace: [<80faf273>] dump_stack+0x48/0x69 [<80935dc9>] warn_slowpath_common+0x79/0x90 [<80e128d4>] ? ion_handle_add+0xb4/0xc0 [<80e128d4>] ? ion_handle_add+0xb4/0xc0 [<80935e0e>] warn_slowpath_fmt+0x2e/0x30 [<80e128d4>] ion_handle_add+0xb4/0xc0 [<80e144cc>] ion_import_dma_buf+0x8c/0x110 [<80c517c4>] reg_init+0x364/0x7d0 [<80993363>] ? futex_wait+0x123/0x210 [<80992e0e>] ? get_futex_key+0x16e/0x1e0 [<8099308f>] ? futex_wake+0x5f/0x120 [<80c51e19>] vpu_service_ioctl+0x1e9/0x500 [<80994aec>] ? do_futex+0xec/0x8e0 [<80971080>] ? prepare_to_wait_event+0xc0/0xc0 [<80c51c30>] ?
reg_init+0x7d0/0x7d0 [<80a22562>] do_vfs_ioctl+0x2d2/0x4c0 [<80b198ad>] ? inode_has_perm.isra.41+0x2d/0x40 [<80b199cf>] ? file_has_perm+0x7f/0x90 [<80b1a5f7>] ? selinux_file_ioctl+0x47/0xf0 [<80a227a8>] SyS_ioctl+0x58/0x80 [<80fb45e8>] syscall_call+0x7/0x7 [<80fb0000>] ? mmc_do_calc_max_discard+0xab/0xe4 Fixes: 83271f626 ("ion: hold reference to handle...") Signed-off-by: Shawn Lin Reviewed-by: Laura Abbott Cc: stable # 3.14+ Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit f26708586aafa2955e849dde286954c7b277444a Author: Joe Thornber Date: Wed Aug 12 15:12:09 2015 +0100 dm btree: add ref counting ops for the leaves of top level btrees [ Upstream commit b0dc3c8bc157c60b1d470163882be8c13e1950af ] When using nested btrees, the top leaves of the top levels contain block addresses for the root of the next tree down. If we shadow a shared leaf node the leaf values (sub tree roots) should be incremented accordingly. This is only an issue if there is metadata sharing in the top levels. Which only occurs if metadata snapshots are being used (as is possible with dm-thinp). And could result in a block from the thinp metadata snap being reused early, thus corrupting the thinp metadata snap. Signed-off-by: Joe Thornber Signed-off-by: Mike Snitzer Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit e8b815958a7f31190ff190ec4caa144eef0cd385 Author: Chuck Lever Date: Thu Jul 9 16:45:18 2015 -0400 svcrdma: Fix send_reply() scatter/gather set-up [ Upstream commit 9d11b51ce7c150a69e761e30518f294fc73d55ff ] The Linux NFS server returns garbage in the data payload of inline NFS/RDMA READ replies. These are READs of under 1000 bytes or so where the client has not provided either a reply chunk or a write list. The NFS server delivers the data payload for an NFS READ reply to the transport in an xdr_buf page list. 
If the NFS client did not provide a reply chunk or a write list, send_reply() is supposed to set up a separate sge for the page containing the READ data, and another sge for XDR padding if needed, then post all of the sges via a single SEND Work Request. The problem is send_reply() does not advance through the xdr_buf when setting up scatter/gather entries for SEND WR. It always calls dma_map_xdr with xdr_off set to zero. When there's more than one sge, dma_map_xdr() sets up the SEND sge's so they all point to the xdr_buf's head. The current Linux NFS/RDMA client always provides a reply chunk or a write list when performing an NFS READ over RDMA. Therefore, it does not exercise this particular case. The Linux server has never had to use more than one extra sge for building RPC/RDMA replies with a Linux client. However, an NFS/RDMA client _is_ allowed to send small NFS READs without setting up a write list or reply chunk. The NFS READ reply fits entirely within the inline reply buffer in this case. This is perhaps a more efficient way of performing NFS READs that the Linux NFS/RDMA client may some day adopt. Fixes: b432e6b3d9c1 ('svcrdma: Change DMA mapping logic to . . .') BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=285 Signed-off-by: Chuck Lever Signed-off-by: J. Bruce Fields Signed-off-by: Sasha Levin commit fa1b77ba7760833af7eb61bef319ee0b1f630634 Author: Michal Kazior Date: Wed Aug 19 13:10:43 2015 +0200 ath10k: fix dma_mapping_error() handling [ Upstream commit 5e55e3cbd1042cffa6249f22c10585e63f8a29bf ] The function returns 1 when DMA mapping fails. The driver would return bogus values and could possibly confuse itself if DMA failed. 
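The error-handling pattern at issue can be sketched with a mock. Everything below is hypothetical: `mock_dma_mapping_error()` only imitates the "returns 1 on failure" behaviour described above, and the choice of -EIO as the translated error is an assumption made for illustration, not a statement about what the driver returns.

```c
#include <errno.h>

/*
 * Hypothetical mock: like the helper described above, it reports a failed
 * mapping as a plain nonzero value (1), not as a negative errno.
 */
static int mock_dma_mapping_error(int mapping_failed)
{
	return mapping_failed ? 1 : 0;
}

/* Problematic pattern: the raw 1 leaks out as if it were an error code. */
static int map_buffer_buggy(int mapping_failed)
{
	return mock_dma_mapping_error(mapping_failed);
}

/*
 * Safer pattern: treat the result as a boolean and translate it into a
 * negative errno (-EIO is an illustrative choice).
 */
static int map_buffer_fixed(int mapping_failed)
{
	if (mock_dma_mapping_error(mapping_failed))
		return -EIO;
	return 0;
}
```

Callers that test for `ret < 0` would miss the failure in the first variant and catch it in the second.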
Fixes: 767d34fc67af ("ath10k: remove DMA mapping wrappers") Reported-by: Dan Carpenter Signed-off-by: Michal Kazior Signed-off-by: Kalle Valo Signed-off-by: Sasha Levin commit 089699ed8bea7532b977126227ab488839281e9a Author: Filipe Manana Date: Mon Sep 28 09:56:26 2015 +0100 Btrfs: update fix for read corruption of compressed and shared extents [ Upstream commit 808f80b46790f27e145c72112189d6a3be2bc884 ] My previous fix in commit 005efedf2c7d ("Btrfs: fix read corruption of compressed and shared extents") was effective only if the compressed extents cover a file range with a length that is not a multiple of 16 pages. That's because the detection of when we reached a different range of the file that shares the same compressed extent as the previously processed range was done at extent_io.c:__do_contiguous_readpages(), which covers subranges with a length up to 16 pages, because extent_readpages() groups the pages in clusters no larger than 16 pages. So fix this by tracking the start of the previously processed file range's extent map at extent_readpages(). The following test case for fstests reproduces the issue: seq=`basename $0` seqres=$RESULT_DIR/$seq echo "QA output created by $seq" tmp=/tmp/$$ status=1 # failure is the default! trap "_cleanup; exit \$status" 0 1 2 3 15 _cleanup() { rm -f $tmp.* } # get standard environment, filters and checks . ./common/rc . ./common/filter # real QA test starts here _need_to_be_root _supported_fs btrfs _supported_os Linux _require_scratch _require_cloner rm -f $seqres.full test_clone_and_read_compressed_extent() { local mount_opts=$1 _scratch_mkfs >>$seqres.full 2>&1 _scratch_mount $mount_opts # Create our test file with a single extent of 64Kb that is going to # be compressed no matter which compression algo is used (zlib/lzo). $XFS_IO_PROG -f -c "pwrite -S 0xaa 0K 64K" \ $SCRATCH_MNT/foo | _filter_xfs_io # Now clone the compressed extent into an adjacent file offset. 
$CLONER_PROG -s 0 -d $((64 * 1024)) -l $((64 * 1024)) \ $SCRATCH_MNT/foo $SCRATCH_MNT/foo echo "File digest before unmount:" md5sum $SCRATCH_MNT/foo | _filter_scratch # Remount the fs or clear the page cache to trigger the bug in # btrfs. Because the extent has an uncompressed length that is a # multiple of 16 pages, all the pages belonging to the second range # of the file (64K to 128K), which points to the same extent as the # first range (0K to 64K), had their contents full of zeroes instead # of the byte 0xaa. This was a bug exclusively in the read path of # compressed extents, the correct data was stored on disk, btrfs # just failed to fill in the pages correctly. _scratch_remount echo "File digest after remount:" # Must match the digest we got before. md5sum $SCRATCH_MNT/foo | _filter_scratch } echo -e "\nTesting with zlib compression..." test_clone_and_read_compressed_extent "-o compress=zlib" _scratch_unmount echo -e "\nTesting with lzo compression..." test_clone_and_read_compressed_extent "-o compress=lzo" status=0 exit Cc: stable@vger.kernel.org Signed-off-by: Filipe Manana Tested-by: Timofey Titovets Signed-off-by: Sasha Levin commit 3c62114fe7c9ee6b7fa1e72b11965252a50024a1 Author: Filipe Manana Date: Mon Sep 14 09:09:31 2015 +0100 Btrfs: fix read corruption of compressed and shared extents [ Upstream commit 005efedf2c7d0a270ffbe28d8997b03844f3e3e7 ] If a file has a range pointing to a compressed extent, followed by another range that points to the same compressed extent and a read operation attempts to read both ranges (either completely or part of them), the pages that correspond to the second range are incorrectly filled with zeroes. 
Consider the following example: File layout [0 - 8K] [8K - 24K] | | | | points to extent X, points to extent X, offset 4K, length of 8K offset 0, length 16K [extent X, compressed length = 4K uncompressed length = 16K] If a readpages() call spans the 2 ranges, a single bio to read the extent is submitted - extent_io.c:submit_extent_page() would only create a new bio to cover the second range pointing to the extent if the extent it points to had a different logical address than the extent associated with the first range. This has a consequence of the compressed read end io handler (compression.c:end_compressed_bio_read()) finish once the extent is decompressed into the pages covering the first range, leaving the remaining pages (belonging to the second range) filled with zeroes (done by compression.c:btrfs_clear_biovec_end()). So fix this by submitting the current bio whenever we find a range pointing to a compressed extent that was preceded by a range with a different extent map. This is the simplest solution for this corner case. Making the end io callback populate both ranges (or more, if we have multiple pointing to the same extent) is a much more complex solution since each bio is tightly coupled with a single extent map and the extent maps associated to the ranges pointing to the shared extent can have different offsets and lengths. The following test case for fstests triggers the issue: seq=`basename $0` seqres=$RESULT_DIR/$seq echo "QA output created by $seq" tmp=/tmp/$$ status=1 # failure is the default! trap "_cleanup; exit \$status" 0 1 2 3 15 _cleanup() { rm -f $tmp.* } # get standard environment, filters and checks . ./common/rc . 
./common/filter # real QA test starts here _need_to_be_root _supported_fs btrfs _supported_os Linux _require_scratch _require_cloner rm -f $seqres.full test_clone_and_read_compressed_extent() { local mount_opts=$1 _scratch_mkfs >>$seqres.full 2>&1 _scratch_mount $mount_opts # Create a test file with a single extent that is compressed (the # data we write into it is highly compressible no matter which # compression algorithm is used, zlib or lzo). $XFS_IO_PROG -f -c "pwrite -S 0xaa 0K 4K" \ -c "pwrite -S 0xbb 4K 8K" \ -c "pwrite -S 0xcc 12K 4K" \ $SCRATCH_MNT/foo | _filter_xfs_io # Now clone our extent into an adjacent offset. $CLONER_PROG -s $((4 * 1024)) -d $((16 * 1024)) -l $((8 * 1024)) \ $SCRATCH_MNT/foo $SCRATCH_MNT/foo # Same as before but for this file we clone the extent into a lower # file offset. $XFS_IO_PROG -f -c "pwrite -S 0xaa 8K 4K" \ -c "pwrite -S 0xbb 12K 8K" \ -c "pwrite -S 0xcc 20K 4K" \ $SCRATCH_MNT/bar | _filter_xfs_io $CLONER_PROG -s $((12 * 1024)) -d 0 -l $((8 * 1024)) \ $SCRATCH_MNT/bar $SCRATCH_MNT/bar echo "File digests before unmounting filesystem:" md5sum $SCRATCH_MNT/foo | _filter_scratch md5sum $SCRATCH_MNT/bar | _filter_scratch # Evicting the inode or clearing the page cache before reading # again the file would also trigger the bug - reads were returning # all bytes in the range corresponding to the second reference to # the extent with a value of 0, but the correct data was persisted # (it was a bug exclusively in the read path). The issue happened # only if the same readpages() call targeted pages belonging to the # first and second ranges that point to the same compressed extent. _scratch_remount echo "File digests after mounting filesystem again:" # Must match the same digests we got before. md5sum $SCRATCH_MNT/foo | _filter_scratch md5sum $SCRATCH_MNT/bar | _filter_scratch } echo -e "\nTesting with zlib compression..." 
test_clone_and_read_compressed_extent "-o compress=zlib" _scratch_unmount echo -e "\nTesting with lzo compression..." test_clone_and_read_compressed_extent "-o compress=lzo" status=0 exit Cc: stable@vger.kernel.org Signed-off-by: Filipe Manana Reviewed-by: Qu Wenruo Reviewed-by: Liu Bo Signed-off-by: Sasha Levin commit b0849b6a75a7da47c3e705e07a3dfdf02d9f62a8 Author: Jeff Mahoney Date: Fri Sep 11 21:44:17 2015 -0400 btrfs: skip waiting on ordered range for special files [ Upstream commit a30e577c96f59b1e1678ea5462432b09bf7d5cbc ] In btrfs_evict_inode, we properly truncate the page cache for evicted inodes but then we call btrfs_wait_ordered_range for every inode as well. It's the right thing to do for regular files but results in incorrect behavior for device inodes for block devices. filemap_fdatawrite_range gets called with inode->i_mapping which gets resolved to the block device inode before getting passed to wbc_attach_fdatawrite_inode and ultimately to inode_to_bdi. What happens next depends on whether there's an open file handle associated with the inode. If there is, we write to the block device, which is unexpected behavior. If there isn't, we fall through normally and inode->i_data is used. We can also end up racing against open/close which can result in crashes when i_mapping points to a block device inode that has been closed. Since there can't be any page cache associated with special file inodes, it's safe to skip the btrfs_wait_ordered_range call entirely and avoid the problem.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=100911 Tested-by: Christoph Biedl Signed-off-by: Jeff Mahoney Reviewed-by: Filipe Manana Signed-off-by: Sasha Levin commit b03abc8bb7dc34f5204511586fc12ce2e2bd2394 Author: Yitian Bu Date: Fri Oct 2 15:18:41 2015 +0800 ASoC: dwc: correct irq clear method [ Upstream commit 4873867e5f2bd90faad861dd94865099fc3140f3 ] According to the Designware I2S datasheet, the tx/rx XRUN irq is cleared by reading the TOR/ROR registers, rather than by writing to them. Signed-off-by: Yitian Bu Signed-off-by: Mark Brown Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit 8df654455e63a3fddb501709bdea349b0f351cee Author: Robert Jarzmik Date: Tue Sep 15 20:51:31 2015 +0200 ASoC: fix broken pxa SoC support [ Upstream commit 3c8f7710c1c44fb650bc29b6ef78ed8b60cfaa28 ] The previous fix of pxa library support, which was introduced to fix the library dependency, broke the previous SoC behavior, where a machine code binding pxa2xx-ac97 with a codec relied on: - sound/soc/pxa/pxa2xx-ac97.c - sound/soc/codecs/XXX.c For example, the mioa701_wm9713.c machine code is currently broken. The "select ARM" statement wrongly selects the soc/arm/pxa2xx-ac97 for compilation because, unfortunately, SND_PXA2XX_AC97 is declared in both sound/arm/Kconfig and sound/soc/pxa/Kconfig. Fix this by ensuring that SND_PXA2XX_SOC triggers the correct pxa2xx-ac97 compilation. Fixes: 846172dfe33c ("ASoC: fix SND_PXA2XX_LIB Kconfig warning") Signed-off-by: Robert Jarzmik Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit b496c8045a1bc422b898a8330ec63be168f13ab9 Author: Robert Jarzmik Date: Tue Sep 22 21:20:22 2015 +0200 ASoC: pxa: pxa2xx-ac97: fix dma requestor lines [ Upstream commit 8811191fdf7ed02ee07cb8469428158572d355a2 ] PCM receive and transmit DMA requestor lines were swapped, breaking the PCM playback interface for PXA platforms using the sound/soc/ variant instead of the sound/arm variant.
The commit below shows the inversion in the requestor lines. Fixes: d65a14587a9b ("ASoC: pxa: use snd_dmaengine_dai_dma_data") Signed-off-by: Robert Jarzmik Signed-off-by: Mark Brown Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit acd1288e7ea59594d4d258a0f8b8636850361d55 Author: John Flatness Date: Fri Oct 2 17:07:49 2015 -0400 ALSA: hda - Apply SPDIF pin ctl to MacBookPro 12,1 [ Upstream commit e8ff581f7ac2bc3b8886094b7ca635dcc4d1b0e9 ] The MacBookPro 12,1 has the same setup as the 11 for controlling the status of the optical audio light. Simply apply the existing workaround to the subsystem ID for the 12,1. [sorted the fixup entry by tiwai] Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=105401 Signed-off-by: John Flatness Cc: Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin commit 91b15aa132f58b311212c670887954fdde66db46 Author: Laura Abbott Date: Fri Oct 2 11:09:54 2015 -0700 ALSA: hda: Add dock support for ThinkPad T550 [ Upstream commit d05ea7da0e8f6df3c62cfee75538f347cb3d89ef ] Much like all the other Lenovo laptops, add a quirk to make sound work with docking. Reported-and-tested-by: lacknerflo@gmail.com Signed-off-by: Laura Abbott Cc: Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin commit 63758060bda18485a8bde90d0e0e946bc46a2276 Author: Takashi Iwai Date: Mon Oct 5 16:55:09 2015 +0200 ALSA: synth: Fix conflicting OSS device registration on AWE32 [ Upstream commit 225db5762dc1a35b26850477ffa06e5cd0097243 ] When OSS emulation is loaded on ISA SB AWE32 chip, we get now kernel warnings like: WARNING: CPU: 0 PID: 2791 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x51/0x80() sysfs: cannot create duplicate filename '/devices/isa/sbawe.0/sound/card0/seq-oss-0-0' It's because both emux synth and opl3 drivers try to register their OSS device object with the same static index number 0. This hasn't been a big problem until the recent rewrite of device management code (that exposes sysfs at the same time), but it's been an obvious bug. 
This patch works around it just by using a different index number of emux synth object. There can be a more elegant way to fix, but it's enough for now, as this code won't be touched so often, in anyway. Reported-and-tested-by: Michael Shell Cc: Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin commit 204e65d3f7de07a295e0ec9a8768f82aa2bf5244 Author: Mel Gorman Date: Thu Oct 1 15:36:57 2015 -0700 mm: hugetlbfs: skip shared VMAs when unmapping private pages to satisfy a fault [ Upstream commit 2f84a8990ebbe235c59716896e017c6b2ca1200f ] SunDong reported the following on https://bugzilla.kernel.org/show_bug.cgi?id=103841 I think I find a linux bug, I have the test cases is constructed. I can stable recurring problems in fedora22(4.0.4) kernel version, arch for x86_64. I construct transparent huge page, when the parent and child process with MAP_SHARE, MAP_PRIVATE way to access the same huge page area, it has the opportunity to lead to huge page copy on write failure, and then it will munmap the child corresponding mmap area, but then the child mmap area with VM_MAYSHARE attributes, child process munmap this area can trigger VM_BUG_ON in set_vma_resv_flags functions (vma - > vm_flags & VM_MAYSHARE). There were a number of problems with the report (e.g. it's hugetlbfs that triggers this, not transparent huge pages) but it was fundamentally correct in that a VM_BUG_ON in set_vma_resv_flags() can be triggered that looks like this vma ffff8804651fd0d0 start 00007fc474e00000 end 00007fc475e00000 next ffff8804651fd018 prev ffff8804651fd188 mm ffff88046b1b1800 prot 8000000000000027 anon_vma (null) vm_ops ffffffff8182a7a0 pgoff 0 file ffff88106bdb9800 private_data (null) flags: 0x84400fb(read|write|shared|mayread|maywrite|mayexec|mayshare|dontexpand|hugetlb) ------------ kernel BUG at mm/hugetlb.c:462! SMP Modules linked in: xt_pkttype xt_LOG xt_limit [..] CPU: 38 PID: 26839 Comm: map Not tainted 4.0.4-default #1 Hardware name: Dell Inc. 
PowerEdge R810/0TT6JF, BIOS 2.7.4 04/26/2012 set_vma_resv_flags+0x2d/0x30 The VM_BUG_ON is correct because private and shared mappings have different reservation accounting but the warning clearly shows that the VMA is shared. When a private COW fails to allocate a new page then only the process that created the VMA gets the page -- all the children unmap the page. If the children access that data in the future then they get killed. The problem is that the same file is mapped shared and private. During the COW, the allocation fails, the VMAs are traversed to unmap the other private pages but a shared VMA is found and the bug is triggered. This patch identifies such VMAs and skips them. Signed-off-by: Mel Gorman Reported-by: SunDong Reviewed-by: Michal Hocko Cc: Andrea Arcangeli Cc: Hugh Dickins Cc: Naoya Horiguchi Cc: David Rientjes Reviewed-by: Naoya Horiguchi Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit 13587ce1faf765b333667e93b34c3ed965e344e6 Author: Joseph Qi Date: Tue Sep 22 14:59:20 2015 -0700 ocfs2/dlm: fix deadlock when dispatch assert master [ Upstream commit 012572d4fc2e4ddd5c8ec8614d51414ec6cae02a ] The order of the following three spinlocks should be: dlm_domain_lock < dlm_ctxt->spinlock < dlm_lock_resource->spinlock But dlm_dispatch_assert_master() is called while holding dlm_ctxt->spinlock and dlm_lock_resource->spinlock, and then it calls dlm_grab() which will take dlm_domain_lock. Once another thread (for example, dlm_query_join_handler) has already taken dlm_domain_lock, and tries to take dlm_ctxt->spinlock deadlock happens. 
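The lock-order rule stated above can be expressed as a small rank check. This is only a sketch: the enum names and `order_ok()` are hypothetical helpers invented for this example, and no such checker is claimed to exist in the ocfs2 code.

```c
#include <stdbool.h>

/*
 * Ranks mirror the order given above: a lower rank must be taken first.
 * The names are illustrative, not identifiers from the ocfs2 source.
 */
enum lock_rank {
	RANK_NONE = 0,           /* no lock held */
	RANK_DOMAIN_LOCK = 1,    /* dlm_domain_lock */
	RANK_CTXT_SPINLOCK = 2,  /* dlm_ctxt->spinlock */
	RANK_RES_SPINLOCK = 3    /* dlm_lock_resource->spinlock */
};

/*
 * Returns true if acquiring `next` while the highest-ranked lock currently
 * held is `highest_held` respects the ordering.
 */
static bool order_ok(enum lock_rank highest_held, enum lock_rank next)
{
	return next > highest_held;
}
```

Taking dlm_domain_lock while already holding dlm_lock_resource->spinlock, as in the path described above, corresponds to `order_ok(RANK_RES_SPINLOCK, RANK_DOMAIN_LOCK)`, which reports a violation.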
Signed-off-by: Joseph Qi Cc: Joel Becker Cc: Mark Fasheh Cc: "Junxiao Bi" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit f6d0801410219f5115d344105468d6cb143fccec Author: Tan, Jui Nee Date: Tue Sep 1 10:22:51 2015 +0800 spi: spi-pxa2xx: Check status register to determine if SSSR_TINT is disabled [ Upstream commit 02bc933ebb59208f42c2e6305b2c17fd306f695d ] On Intel Baytrail, there is a case where the interrupt handler gets called but no SPI message is captured. The RX FIFO is indeed empty when the RX timeout pending interrupt (SSSR_TINT) happens. Use the BIOS version where both HSUART and SPI are on the same IRQ. Both drivers are using IRQF_SHARED when calling the request_irq function. When running two separate and independent SPI and HSUART applications that generate data traffic on both components, the user will see messages like the one below on the console: pxa2xx-spi pxa2xx-spi.0: bad message state in interrupt handler This commit fixes this by first checking the Receiver Time-out Interrupt: if it is disabled, ignore the request and return without servicing it. Signed-off-by: Tan, Jui Nee Acked-by: Jarkko Nikula Signed-off-by: Mark Brown Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit 3cd1f376b0a0ec0c5f044d740705d89c740e8490 Author: Max Filippov Date: Tue Sep 22 14:32:03 2015 +0300 spi: xtensa-xtfpga: fix register endianness [ Upstream commit b0b4855099e301c8603ea37da9a0103a96c2e0b1 ] XTFPGA SPI controller has native endian registers. Fix register accessors so that they work in big-endian configurations. Signed-off-by: Max Filippov Signed-off-by: Mark Brown Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit 9ab878154d89756408338aff0b2afeaaf84fc925 Author: Guenter Roeck Date: Sun Sep 6 01:46:54 2015 +0300 spi: Fix documentation of spi_alloc_master() [ Upstream commit a394d635193b641f2c86ead5ada5b115d57c51f8 ] Actually, spi_master_put() after spi_alloc_master() must _not_ be followed by kfree().
The memory is already freed with the call to spi_master_put() through spi_master_class, which registers a release function. Calling both spi_master_put() and kfree() results in often nasty (and delayed) crashes elsewhere in the kernel, often in the networking stack. This reverts commit eb4af0f5349235df2e4a5057a72fc8962d00308a. Link to patch and concerns: https://lkml.org/lkml/2012/9/3/269 or http://lkml.iu.edu/hypermail/linux/kernel/1209.0/00790.html Alexey Klimov: This revert becomes valid after 94c69f765f1b4a658d96905ec59928e3e3e07e6a when spi-imx.c has been fixed and there is no need to call kfree() so comment for spi_alloc_master() should be fixed. Signed-off-by: Guenter Roeck Signed-off-by: Alexey Klimov Signed-off-by: Mark Brown Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit 8ba742c05178c67d5c5d5ae9d436e326981dfdfa Author: Christian Borntraeger Date: Mon Sep 28 22:47:42 2015 +0200 s390/boot/decompression: disable floating point in decompressor [ Upstream commit adc0b7fbf6fe9967505c0254d9535ec7288186ae ] my gcc 5.1 used an ldgr instruction with a register != 0,2,4,6 for spilling/filling into a floating point register in our decompressor. This will cause an AFP-register data exception as the decompressor did not setup the additional floating point registers via cr0. That causes a program check loop that looked like a hang with one "Uncompressing Linux... " message (directly booted via kvm) or a loop of "Uncompressing Linux... " messages (when booted via zipl boot loader). The offending code in my build was 48e400: e3 c0 af ff ff 71 lay %r12,-1(%r10) -->48e406: b3 c1 00 1c ldgr %f1,%r12 48e40a: ec 6c 01 22 02 7f clij %r6,2,12,0x48e64e but gcc could do spilling into an fpr at any function. We can simply disable floating point support at that early stage. 
Signed-off-by: Christian Borntraeger Acked-by: Heiko Carstens Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit cdd0e80e28a17908655674f29ae162b047d379a5 Author: Martin Schwidefsky Date: Tue Sep 8 15:25:39 2015 +0200 s390/compat: correct uc_sigmask of the compat signal frame [ Upstream commit 8d4bd0ed0439dfc780aab801a085961925ed6838 ] The uc_sigmask in the ucontext structure is an array of words to keep the 64 signal bits (or 1024 if you ask glibc but the kernel sigset_t only has 64 bits). For 64 bit the sigset_t contains a single 8 byte word, but for 31 bit there are two 4 byte words. The compat signal handler code uses a simple copy of the 64 bit sigset_t to the 31 bit compat_sigset_t. As s390 is a big-endian architecture this is incorrect, the two words in the 31 bit sigset_t array need to be swapped. Cc: Reported-by: Stefan Liebler Signed-off-by: Martin Schwidefsky Signed-off-by: Sasha Levin commit 9992c7d542b6e50c394d9fc5353560921f691347 Author: Peter Zijlstra Date: Tue Sep 29 14:45:09 2015 +0200 sched/core: Fix TASK_DEAD race in finish_task_switch() [ Upstream commit 95913d97914f44db2b81271c2e2ebd4d2ac2df83 ] So the problem this patch is trying to address is as follows: CPU0 CPU1 context_switch(A, B) ttwu(A) LOCK A->pi_lock A->on_cpu == 0 finish_task_switch(A) prev_state = A->state <-. WMB | A->on_cpu = 0; | UNLOCK rq0->lock | | context_switch(C, A) `-- A->state = TASK_DEAD prev_state == TASK_DEAD put_task_struct(A) context_switch(A, C) finish_task_switch(A) A->state == TASK_DEAD put_task_struct(A) The argument being that the WMB will allow the load of A->state on CPU0 to cross over and observe CPU1's store of A->state, which will then result in a double-drop and use-after-free. Now the comment states (and this was true once upon a long time ago) that we need to observe A->state while holding rq->lock because that will order us against the wakeup; however the wakeup will not in fact acquire (that) rq->lock; it takes A->pi_lock these days. 
We can obviously fix this by upgrading the WMB to an MB, but that is expensive, so we'd rather avoid that. The alternative this patch takes is: smp_store_release(&A->on_cpu, 0), which avoids the MB on some archs, but not important ones like ARM. Reported-by: Oleg Nesterov Signed-off-by: Peter Zijlstra (Intel) Acked-by: Linus Torvalds Cc: # v3.1+ Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: linux-kernel@vger.kernel.org Cc: manfred@colorfullife.com Cc: will.deacon@arm.com Fixes: e4a52bcb9a18 ("sched: Remove rq->lock from the first half of ttwu()") Link: http://lkml.kernel.org/r/20150929124509.GG3816@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin commit 113da0dfd625f4e62378432346b767f41dff5959 Author: Vitaly Kuznetsov Date: Fri Sep 25 11:59:52 2015 +0200 x86/xen: Support kexec/kdump in HVM guests by doing a soft reset [ Upstream commit 0b34a166f291d255755be46e43ed5497cdd194f2 ] Currently there is a number of issues preventing PVHVM Xen guests from doing successful kexec/kdump: - Bound event channels. - Registered vcpu_info. - PIRQ/emuirq mappings. - shared_info frame after XENMAPSPACE_shared_info operation. - Active grant mappings. Basically, newly booted kernel stumbles upon already set up Xen interfaces and there is no way to reestablish them. In Xen-4.7 a new feature called 'soft reset' is coming. A guest performing kexec/kdump operation is supposed to call SCHEDOP_shutdown hypercall with SHUTDOWN_soft_reset reason before jumping to new kernel. Hypervisor (with some help from toolstack) will do full domain cleanup (but keeping its memory and vCPU contexts intact) returning the guest to the state it had when it was first booted and thus allowing it to start over. Doing SHUTDOWN_soft_reset on Xen hypervisors which don't support it is probably OK as by default all unknown shutdown reasons cause domain destroy with a message in toolstack log: 'Unknown shutdown reason code 5. Destroying domain.' 
    which gives a clue to what the problem is and eliminates false
    expectations.

    Signed-off-by: Vitaly Kuznetsov
    Cc:
    Signed-off-by: David Vrabel
    Signed-off-by: Sasha Levin

commit 4502d69855c85abb502599be05381a15bdf43c71
Author: Stephen Smalley
Date:   Thu Oct 1 09:04:22 2015 -0400

    x86/mm: Set NX on gap between __ex_table and rodata

    [ Upstream commit ab76f7b4ab2397ffdd2f1eb07c55697d19991d10 ]

    Unused space between the end of __ex_table and the start of
    rodata can be left W+x in the kernel page tables. Extend the
    setting of the NX bit to cover this gap by starting from
    text_end rather than rodata_start.

      Before:
      ---[ High Kernel Mapping ]---
      0xffffffff80000000-0xffffffff81000000    16M                 pmd
      0xffffffff81000000-0xffffffff81600000     6M  ro  PSE GLB x  pmd
      0xffffffff81600000-0xffffffff81754000  1360K  ro      GLB x  pte
      0xffffffff81754000-0xffffffff81800000   688K  RW      GLB x  pte
      0xffffffff81800000-0xffffffff81a00000     2M  ro  PSE GLB NX pmd
      0xffffffff81a00000-0xffffffff81b3b000  1260K  ro      GLB NX pte
      0xffffffff81b3b000-0xffffffff82000000  4884K  RW      GLB NX pte
      0xffffffff82000000-0xffffffff82200000     2M  RW  PSE GLB NX pmd
      0xffffffff82200000-0xffffffffa0000000   478M                 pmd

      After:
      ---[ High Kernel Mapping ]---
      0xffffffff80000000-0xffffffff81000000    16M                 pmd
      0xffffffff81000000-0xffffffff81600000     6M  ro  PSE GLB x  pmd
      0xffffffff81600000-0xffffffff81754000  1360K  ro      GLB x  pte
      0xffffffff81754000-0xffffffff81800000   688K  RW      GLB NX pte
      0xffffffff81800000-0xffffffff81a00000     2M  ro  PSE GLB NX pmd
      0xffffffff81a00000-0xffffffff81b3b000  1260K  ro      GLB NX pte
      0xffffffff81b3b000-0xffffffff82000000  4884K  RW      GLB NX pte
      0xffffffff82000000-0xffffffff82200000     2M  RW  PSE GLB NX pmd
      0xffffffff82200000-0xffffffffa0000000   478M                 pmd

    Signed-off-by: Stephen Smalley
    Acked-by: Kees Cook
    Cc:
    Cc: Linus Torvalds
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-kernel@vger.kernel.org
    Link: http://lkml.kernel.org/r/1443704662-3138-1-git-send-email-sds@tycho.nsa.gov
    Signed-off-by: Ingo Molnar
    Signed-off-by: Sasha Levin

commit
09be2e412c80f53854d251975aceac56a621bc29 Author: Thomas Gleixner Date: Wed Sep 30 08:38:22 2015 +0000 x86/process: Add proper bound checks in 64bit get_wchan() [ Upstream commit eddd3826a1a0190e5235703d1e666affa4d13b96 ] Dmitry Vyukov reported the following using trinity and the memory error detector AddressSanitizer (https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel). [ 124.575597] ERROR: AddressSanitizer: heap-buffer-overflow on address ffff88002e280000 [ 124.576801] ffff88002e280000 is located 131938492886538 bytes to the left of 28857600-byte region [ffffffff81282e0a, ffffffff82e0830a) [ 124.578633] Accessed by thread T10915: [ 124.579295] inlined in describe_heap_address ./arch/x86/mm/asan/report.c:164 [ 124.579295] #0 ffffffff810dd277 in asan_report_error ./arch/x86/mm/asan/report.c:278 [ 124.580137] #1 ffffffff810dc6a0 in asan_check_region ./arch/x86/mm/asan/asan.c:37 [ 124.581050] #2 ffffffff810dd423 in __tsan_read8 ??:0 [ 124.581893] #3 ffffffff8107c093 in get_wchan ./arch/x86/kernel/process_64.c:444 The address checks in the 64bit implementation of get_wchan() are wrong in several ways: - The lower bound of the stack is not the start of the stack page. It's the start of the stack page plus sizeof (struct thread_info) - The upper bound must be: top_of_stack - TOP_OF_KERNEL_STACK_PADDING - 2 * sizeof(unsigned long). The 2 * sizeof(unsigned long) is required because the stack pointer points at the frame pointer. The layout on the stack is: ... IP FP ... IP FP. So we need to make sure that both IP and FP are in the bounds. Fix the bound checks and get rid of the mix of numeric constants, u64 and unsigned long. Making all unsigned long allows us to use the same function for 32bit as well. Use READ_ONCE() when accessing the stack. This does not prevent a concurrent wakeup of the task and the stack changing, but at least it avoids TOCTOU. Also check task state at the end of the loop. 
Again that does not prevent concurrent changes, but it avoids walking for nothing. Add proper comments while at it. Reported-by: Dmitry Vyukov Reported-by: Sasha Levin Based-on-patch-from: Wolfram Gloger Signed-off-by: Thomas Gleixner Reviewed-by: Borislav Petkov Reviewed-by: Dmitry Vyukov Cc: Andrey Ryabinin Cc: Andy Lutomirski Cc: Andrey Konovalov Cc: Kostya Serebryany Cc: Alexander Potapenko Cc: kasan-dev Cc: Denys Vlasenko Cc: Andi Kleen Cc: Wolfram Gloger Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20150930083302.694788319@linutronix.de Signed-off-by: Thomas Gleixner Signed-off-by: Sasha Levin commit 6dbba2133ec370305c1c91785f91f415c26164d5 Author: Andy Lutomirski Date: Tue Mar 10 11:05:58 2015 -0700 x86/asm/entry: Create and use a 'TOP_OF_KERNEL_STACK_PADDING' macro [ Upstream commit 3ee4298f440c81638cbb5ec06f2497fb7a9a9eb4 ] x86_32, unlike x86_64, pads the top of the kernel stack, because the hardware stack frame formats are variable in size. Document this padding and give it a name. This should make no change whatsoever to the compiled kernel image. It also doesn't fix any of the current bugs in this area. Signed-off-by: Andy Lutomirski Acked-by: Denys Vlasenko Cc: Borislav Petkov Cc: H. Peter Anvin Cc: Linus Torvalds Cc: Oleg Nesterov Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/02bf2f54b8dcb76a62a142b6dfe07d4ef7fc582e.1426009661.git.luto@amacapital.net [ Fixed small details, such as a missed magic constant in entry_32.S pointed out by Denys Vlasenko. 
] Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin commit 9a2a1db52f2acfe671c8192d36b6b84b6ea5c9fa Author: Lee, Chun-Yi Date: Tue Sep 29 20:58:57 2015 +0800 x86/kexec: Fix kexec crash in syscall kexec_file_load() [ Upstream commit e3c41e37b0f4b18cbd4dac76cbeece5a7558b909 ] The original bug is a page fault crash that sometimes happens on big machines when preparing ELF headers: BUG: unable to handle kernel paging request at ffffc90613fc9000 IP: [] prepare_elf64_ram_headers_callback+0x165/0x260 The bug is caused by us under-counting the number of memory ranges and subsequently not allocating enough ELF header space for them. The bug is typically masked on smaller systems, because the ELF header allocation is rounded up to the next page. This patch modifies the code in fill_up_crash_elf_data() by using walk_system_ram_res() instead of walk_system_ram_range() to correctly count the max number of crash memory ranges. That's because the walk_system_ram_range() filters out small memory regions that reside in the same page, but walk_system_ram_res() does not. Here's how I found the bug: After tracing prepare_elf64_headers() and prepare_elf64_ram_headers_callback(), the code uses walk_system_ram_res() to fill-in crash memory regions information to the program header, so it counts those small memory regions that reside in a page area. But, when the kernel was using walk_system_ram_range() in fill_up_crash_elf_data() to count the number of crash memory regions, it filters out small regions. I printed those small memory regions, for example: kexec: Get nr_ram ranges. 
    vaddr=0xffff880077592258 paddr=0x77592258, sz=0xdc0

    Based on the code in walk_system_ram_range(), this memory region
    will be filtered out:

      pfn = (0x77592258 + 0x1000 - 1) >> 12 = 0x77593
      end_pfn = (0x77592258 + 0xdc0 - 1 + 1) >> 12 = 0x77593
      end_pfn - pfn = 0x77593 - 0x77593 = 0  <=== if (end_pfn > pfn) is FALSE

    So, the max_nr_ranges that's counted by the kernel doesn't include
    small memory regions - causing us to under-allocate the required
    space. That causes the page fault crash that happens in a later
    code path when preparing ELF headers.

    This bug is not easy to reproduce on small machines that have few
    CPUs, because the allocated page aligned ELF buffer has more free
    space to cover those small memory regions' PT_LOAD headers.

    Signed-off-by: Lee, Chun-Yi
    Cc: Andy Lutomirski
    Cc: Baoquan He
    Cc: Jiang Liu
    Cc: Linus Torvalds
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Stephen Rothwell
    Cc: Takashi Iwai
    Cc: Thomas Gleixner
    Cc: Viresh Kumar
    Cc: Vivek Goyal
    Cc: kexec@lists.infradead.org
    Cc: linux-kernel@vger.kernel.org
    Cc:
    Link: http://lkml.kernel.org/r/1443531537-29436-1-git-send-email-jlee@suse.com
    Signed-off-by: Ingo Molnar
    Signed-off-by: Sasha Levin

commit 3b8db56e57afe7f2354daf484a75866737ff5202
Author: Matt Fleming
Date:   Fri Sep 25 23:02:18 2015 +0100

    x86/efi: Fix boot crash by mapping EFI memmap entries bottom-up at runtime, instead of top-down

    [ Upstream commit a5caa209ba9c29c6421292e7879d2387a2ef39c9 ]

    Beginning with UEFI v2.5 EFI_PROPERTIES_TABLE was introduced
    that signals that the firmware PE/COFF loader supports splitting
    code and data sections of PE/COFF images into separate EFI memory
    map entries. This allows the kernel to map those regions with
    strict memory protections, e.g. EFI_MEMORY_RO for code,
    EFI_MEMORY_XP for data, etc.

    Unfortunately, an unwritten requirement of this new feature is
    that the regions need to be mapped with the same offsets relative
    to each other as observed in the EFI memory map.
If this is not done crashes like this may occur, BUG: unable to handle kernel paging request at fffffffefe6086dd IP: [] 0xfffffffefe6086dd Call Trace: [] efi_call+0x7e/0x100 [] ? virt_efi_set_variable+0x61/0x90 [] efi_delete_dummy_variable+0x63/0x70 [] efi_enter_virtual_mode+0x383/0x392 [] start_kernel+0x38a/0x417 [] x86_64_start_reservations+0x2a/0x2c [] x86_64_start_kernel+0xeb/0xef Here 0xfffffffefe6086dd refers to an address the firmware expects to be mapped but which the OS never claimed was mapped. The issue is that included in these regions are relative addresses to other regions which were emitted by the firmware toolchain before the "splitting" of sections occurred at runtime. Needless to say, we don't satisfy this unwritten requirement on x86_64 and instead map the EFI memory map entries in reverse order. The above crash is almost certainly triggerable with any kernel newer than v3.13 because that's when we rewrote the EFI runtime region mapping code, in commit d2f7cbe7b26a ("x86/efi: Runtime services virtual mapping"). For kernel versions before v3.13 things may work by pure luck depending on the fragmentation of the kernel virtual address space at the time we map the EFI regions. Instead of mapping the EFI memory map entries in reverse order, where entry N has a higher virtual address than entry N+1, map them in the same order as they appear in the EFI memory map to preserve this relative offset between regions. This patch has been kept as small as possible with the intention that it should be applied aggressively to stable and distribution kernels. It is very much a bugfix rather than support for a new feature, since when EFI_PROPERTIES_TABLE is enabled we must map things as outlined above to even boot - we have no way of asking the firmware not to split the code/data regions. In fact, this patch doesn't even make use of the more strict memory protections available in UEFI v2.5. That will come later. 
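
The relative-offset requirement described above can be illustrated with a small model. This is only a sketch under stated assumptions: `struct region`, `map_top_down()`, `map_bottom_up()` and `offset_preserved()` are hypothetical names invented for illustration and are not the kernel's EFI mapping code. It assumes two physically adjacent entries, as produced when the firmware splits one PE/COFF image into code and data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one EFI memory map entry. */
struct region {
    uint64_t phys;  /* physical start address */
    uint64_t size;  /* size in bytes */
    uint64_t virt;  /* virtual address chosen by the mapper */
};

/* Top-down: each entry is placed below the previous one, so entry N
 * ends up at a higher virtual address than entry N+1. */
static void map_top_down(struct region *r, size_t n, uint64_t va_top)
{
    uint64_t va = va_top;

    for (size_t i = 0; i < n; i++) {
        assert(va >= r[i].size);
        va -= r[i].size;
        r[i].virt = va;
    }
}

/* Bottom-up: entries keep their memory map order in virtual space. */
static void map_bottom_up(struct region *r, size_t n, uint64_t va_base)
{
    uint64_t va = va_base;

    for (size_t i = 0; i < n; i++) {
        r[i].virt = va;
        va += r[i].size;
    }
}

/* Returns 1 if the virtual distance between two entries equals their
 * physical distance, i.e. relative addresses between them stay valid. */
static int offset_preserved(const struct region *a, const struct region *b)
{
    return (b->virt - a->virt) == (b->phys - a->phys);
}
```

In this model, two adjacent entries mapped top-down end up in swapped order, so a relative reference from the first into the second points at the wrong virtual address; mapped bottom-up, the distance between them is unchanged.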
Suggested-by: Ard Biesheuvel Reported-by: Ard Biesheuvel Signed-off-by: Matt Fleming Cc: Cc: Borislav Petkov Cc: Chun-Yi Cc: Dave Young Cc: H. Peter Anvin Cc: James Bottomley Cc: Lee, Chun-Yi Cc: Leif Lindholm Cc: Linus Torvalds Cc: Matthew Garrett Cc: Mike Galbraith Cc: Peter Jones Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/1443218539-7610-2-git-send-email-matt@codeblueprint.co.uk Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin commit 1e4f2890f92a32b41fdae434cee45eb93d239ad4 Author: Dirk Müller Date: Thu Oct 1 13:43:42 2015 +0200 Use WARN_ON_ONCE for missing X86_FEATURE_NRIPS [ Upstream commit d2922422c48df93f3edff7d872ee4f3191fefb08 ] The cpu feature flags are not ever going to change, so warning everytime can cause a lot of kernel log spam (in our case more than 10GB/hour). The warning seems to only occur when nested virtualization is enabled, so it's probably triggered by a KVM bug. This is a sensible and safe change anyway, and the KVM bug fix might not be suitable for stable releases anyway. Cc: stable@vger.kernel.org Signed-off-by: Dirk Mueller Signed-off-by: Paolo Bonzini Signed-off-by: Sasha Levin commit 7cb9685d817e262c8dce4e3bd369f243bc6961f8 Author: Andy Lutomirski Date: Sun Sep 20 16:32:04 2015 -0700 x86/paravirt: Replace the paravirt nop with a bona fide empty function [ Upstream commit fc57a7c68020dcf954428869eafd934c0ab1536f ] PARAVIRT_ADJUST_EXCEPTION_FRAME generates this code (using nmi as an example, trimmed for readability): ff 15 00 00 00 00 callq *0x0(%rip) # 2796 2792: R_X86_64_PC32 pv_irq_ops+0x2c That's a call through a function pointer to regular C function that does nothing on native boots, but that function isn't protected against kprobes, isn't marked notrace, and is certainly not guaranteed to preserve any registers if the compiler is feeling perverse. This is bad news for a CLBR_NONE operation. 
Of course, if everything works correctly, once paravirt ops are patched, it gets nopped out, but what if we hit this code before paravirt ops are patched in? This can potentially cause breakage that is very difficult to debug. A more subtle failure is possible here, too: if _paravirt_nop uses the stack at all (even just to push RBP), it will overwrite the "NMI executing" variable if it's called in the NMI prologue. The Xen case, perhaps surprisingly, is fine, because it's already written in asm. Fix all of the cases that default to paravirt_nop (including adjust_exception_frame) with a big hammer: replace paravirt_nop with an asm function that is just a ret instruction. The Xen case may have other problems, so document them. This is part of a fix for some random crashes that Sasha saw. Reported-and-tested-by: Sasha Levin Signed-off-by: Andy Lutomirski Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/8f5d2ba295f9d73751c33d97fda03e0495d9ade0.1442791737.git.luto@kernel.org Signed-off-by: Thomas Gleixner Signed-off-by: Sasha Levin commit 6ea48cdcf3c9054b03d2607ead2740d7a0376fa4 Author: David Woodhouse Date: Wed Sep 16 14:10:03 2015 +0100 x86/platform: Fix Geode LX timekeeping in the generic x86 build [ Upstream commit 03da3ff1cfcd7774c8780d2547ba0d995f7dc03d ] In 2007, commit 07190a08eef36 ("Mark TSC on GeodeLX reliable") bypassed verification of the TSC on Geode LX. However, this code (now in the check_system_tsc_reliable() function in arch/x86/kernel/tsc.c) was only present if CONFIG_MGEODE_LX was set. OpenWRT has recently started building its generic Geode target for Geode GX, not LX, to include support for additional platforms. This broke the timekeeping on LX-based devices, because the TSC wasn't marked as reliable: https://dev.openwrt.org/ticket/20531 By adding a runtime check on is_geode_lx(), we can also include the fix if CONFIG_MGEODEGX1 or CONFIG_X86_GENERIC are set, thus fixing the problem. 
Signed-off-by: David Woodhouse Cc: Andres Salomon Cc: Linus Torvalds Cc: Marcelo Tosatti Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1442409003.131189.87.camel@infradead.org Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin commit 51b366f06578c32bc97b7b35d827945cdae6f877 Author: Shaohua Li Date: Thu Jul 30 16:24:43 2015 -0700 x86/apic: Serialize LVTT and TSC_DEADLINE writes [ Upstream commit 5d7c631d926b59aa16f3c56eaeb83f1036c81dc7 ] The APIC LVTT register is MMIO mapped but the TSC_DEADLINE register is an MSR. The write to the TSC_DEADLINE MSR is not serializing, so it's not guaranteed that the write to LVTT has reached the APIC before the TSC_DEADLINE MSR is written. In such a case the write to the MSR is ignored and as a consequence the local timer interrupt never fires. The SDM decribes this issue for xAPIC and x2APIC modes. The serialization methods recommended by the SDM differ. xAPIC: "1. Memory-mapped write to LVT Timer Register, setting bits 18:17 to 10b. 2. WRMSR to the IA32_TSC_DEADLINE MSR a value much larger than current time-stamp counter. 3. If RDMSR of the IA32_TSC_DEADLINE MSR returns zero, go to step 2. 4. WRMSR to the IA32_TSC_DEADLINE MSR the desired deadline." x2APIC: "To allow for efficient access to the APIC registers in x2APIC mode, the serializing semantics of WRMSR are relaxed when writing to the APIC registers. Thus, system software should not use 'WRMSR to APIC registers in x2APIC mode' as a serializing instruction. Read and write accesses to the APIC registers will occur in program order. A WRMSR to an APIC register may complete before all preceding stores are globally visible; software can prevent this by inserting a serializing instruction, an SFENCE, or an MFENCE before the WRMSR." The xAPIC method is to just wait for the memory mapped write to hit the LVTT by checking whether the MSR write has reached the hardware. 
    There is no reason why a proper MFENCE after the memory mapped
    write would not do the same. Andi Kleen confirmed that MFENCE is
    sufficient for the xAPIC case as well.

    Issue MFENCE before writing to the TSC_DEADLINE MSR. This can be
    done unconditionally as all CPUs which have TSC_DEADLINE also have
    MFENCE support.

    [ tglx: Massaged the changelog ]

    Signed-off-by: Shaohua Li
    Reviewed-by: Ingo Molnar
    Cc:
    Cc:
    Cc:
    Cc: Andi Kleen
    Cc: H. Peter Anvin
    Cc: stable@vger.kernel.org #v3.7+
    Link: http://lkml.kernel.org/r/20150909041352.GA2059853@devbig257.prn2.facebook.com
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Sasha Levin

commit 5a1c58d34f326eb02c67235876b5851bc202d242
Author: Andy Shevchenko
Date:   Mon Sep 28 18:57:03 2015 +0300

    dmaengine: dw: properly read DWC_PARAMS register

    [ Upstream commit 6bea0f6d1c47b07be88dfd93f013ae05fcb3d8bf ]

    In case we have fewer than the maximum allowed number of channels
    (8) and autoconfiguration is enabled, the DWC_PARAMS read is wrong
    because it uses different arithmetic to what is needed for channel
    priority setup.

    Re-do the calculations properly. This now works well on an AVR32
    board.

    Fixes: fed2574b3c9f (dw_dmac: introduce software emulation of LLP transfers)
    Cc: yitian.bu@tangramtek.com
    Signed-off-by: Andy Shevchenko
    Signed-off-by: Vinod Koul
    Signed-off-by: Sasha Levin

commit 3f052171735fc50371f5bd8beacbb5e44b9cb9da
Author: Felipe F.
Tonello Date: Wed Sep 16 18:40:32 2015 +0100 ARM: dts: fix usb pin control for imx-rex dts [ Upstream commit 0af822110871400908d5b6f83a8908c45f881d8f ] This fixes a duplicated pin control causing this error: imx6q-pinctrl 20e0000.iomuxc: pin MX6Q_PAD_GPIO_1 already requested by regulators:regulator@2; cannot claim for 2184000.usb imx6q-pinctrl 20e0000.iomuxc: pin-137 (2184000.usb) status -22 imx6q-pinctrl 20e0000.iomuxc: could not request pin 137 (MX6Q_PAD_GPIO_1) from group usbotggrp on device 20e0000.iomuxc imx_usb 2184000.usb: Error applying setting, reverse things back imx6q-pinctrl 20e0000.iomuxc: pin MX6Q_PAD_EIM_D31 already requested by regulators:regulator@1; cannot claim for 2184200.usb imx6q-pinctrl 20e0000.iomuxc: pin-52 (2184200.usb) status -22 imx6q-pinctrl 20e0000.iomuxc: could not request pin 52 (MX6Q_PAD_EIM_D31) from group usbh1grp on device 20e0000.iomuxc imx_usb 2184200.usb: Error applying setting, reverse things back Signed-off-by: Felipe F. Tonello Fixes: e2047e33f2bd ("ARM: dts: add initial Rex Pro board support") Cc: Signed-off-by: Shawn Guo Signed-off-by: Sasha Levin commit f2895fb34aa0c955bf2590c8d36e22ec9fb0c043 Author: Carl Frederik Werner Date: Wed Sep 2 10:07:57 2015 +0900 ARM: dts: omap3-beagle: make i2c3, ddc and tfp410 gpio work again [ Upstream commit 3a2fa775bd1d0579113666c1a2e37654a34018a0 ] Let's fix pinmux address of gpio 170 used by tfp410 powerdown-gpio. According to the OMAP35x Technical Reference Manual CONTROL_PADCONF_I2C3_SDA[15:0] 0x480021C4 mode0: i2c3_sda CONTROL_PADCONF_I2C3_SDA[31:16] 0x480021C4 mode4: gpio_170 the pinmux address of gpio 170 must be 0x480021C6. 
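
The address arithmetic above can be written out as a minimal sketch. It assumes a little-endian layout, where bits [31:16] of a 32-bit PADCONF register sit two bytes above the register base; `padconf_halfword_addr()` is a hypothetical helper made up for illustration, not a kernel or pinctrl API.

```c
#include <assert.h>
#include <stdint.h>

/* Byte address of one 16-bit pad field inside a 32-bit PADCONF register.
 * half == 0 selects bits [15:0], half == 1 selects bits [31:16].
 * Assumes a little-endian register layout. */
static uint32_t padconf_halfword_addr(uint32_t reg_addr, unsigned int half)
{
    assert(half <= 1);
    assert((reg_addr & 0x3u) == 0);  /* register must be 32-bit aligned */
    return reg_addr + 2u * half;
}
```

With the register quoted above, `padconf_halfword_addr(0x480021C4, 0)` is the i2c3_sda field at 0x480021C4 and `padconf_halfword_addr(0x480021C4, 1)` is the gpio_170 field at 0x480021C6.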
The former wrong address broke i2c3 (used by hdmi ddc), resulting in kernel message: omap_i2c 48060000.i2c: controller timed out Fixes: 8cecf52befd7 ("ARM: omap3-beagle.dts: add display information") Cc: stable@vger.kernel.org # v3.15+ Signed-off-by: Carl Frederik Werner Signed-off-by: Tony Lindgren Signed-off-by: Sasha Levin commit 5dbe39d03864ebee88fa608d523c572dda91a9d6 Author: Grazvydas Ignotas Date: Wed Sep 16 01:34:31 2015 +0300 ARM: dts: omap5-uevm.dts: fix i2c5 pinctrl offsets [ Upstream commit 1dbdad75074d16c3e3005180f81a01cdc04a7872 ] The i2c5 pinctrl offsets are wrong. If the bootloader doesn't set the pins up, communication with tca6424a doesn't work (controller timeouts) and it is not possible to enable HDMI. Fixes: 9be495c42609 ("ARM: dts: omap5-evm: Add I2c pinctrl data") Signed-off-by: Grazvydas Ignotas Signed-off-by: Tony Lindgren Signed-off-by: Sasha Levin commit c2a352ab83e8141761caa5257c398e1e8e18413c Author: Paul Bolle Date: Fri Jul 31 14:08:58 2015 +0200 windfarm: decrement client count when unregistering [ Upstream commit fe2b592173ff0274e70dc44d1d28c19bb995aa7c ] wf_unregister_client() increments the client count when a client unregisters. That is obviously incorrect. Decrement that client count instead. Fixes: 75722d3992f5 ("[PATCH] ppc64: Thermal control for SMU based machines") Signed-off-by: Paul Bolle Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin commit 2297978f3fc6e3f20917ff7e32fc1c81c3e72fbd Author: Ard Biesheuvel Date: Thu Sep 3 13:24:40 2015 +0100 ARM: 8429/1: disable GCC SRA optimization [ Upstream commit a077224fd35b2f7fbc93f14cf67074fc792fbac2 ] While working on the 32-bit ARM port of UEFI, I noticed a strange corruption in the kernel log. 
    The following snprintf() statement (in
    drivers/firmware/efi/efi.c:efi_md_typeattr_format())

      snprintf(pos, size, "|%3s|%2s|%2s|%2s|%3s|%2s|%2s|%2s|%2s]",

    was producing the following output in the log:

      | | | | | |WB|WT|WC|UC]
      | | | | | |WB|WT|WC|UC]
      | | | | | |WB|WT|WC|UC]
      |RUN| | | | |WB|WT|WC|UC]*
      |RUN| | | | |WB|WT|WC|UC]*
      | | | | | |WB|WT|WC|UC]
      |RUN| | | | |WB|WT|WC|UC]*
      | | | | | |WB|WT|WC|UC]
      |RUN| | | | | | | |UC]
      |RUN| | | | | | | |UC]

    As it turns out, this is caused by incorrect code being emitted
    for the string() function in lib/vsprintf.c. The following code

      if (!(spec.flags & LEFT)) {
              while (len < spec.field_width--) {
                      if (buf < end)
                              *buf = ' ';
                      ++buf;
              }
      }
      for (i = 0; i < len; ++i) {
              if (buf < end)
                      *buf = *s;
              ++buf;
              ++s;
      }
      while (len < spec.field_width--) {
              if (buf < end)
                      *buf = ' ';
              ++buf;
      }

    when called with len == 0, triggers an issue in the GCC SRA
    optimization pass (Scalar Replacement of Aggregates), which
    handles promotion of signed struct members incorrectly. This is a
    known but as yet unresolved issue.
    (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65932). In this
    particular case, it is causing the second while loop to be
    executed erroneously a single time, causing the additional space
    characters to be printed.

    So disable the optimization by passing -fno-ipa-sra.

    Cc:
    Acked-by: Nicolas Pitre
    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Russell King
    Signed-off-by: Sasha Levin

commit a1c34031c6db90aa8b905d616369e573d301b86c
Author: Russell King
Date:   Fri Sep 11 16:44:02 2015 +0100

    ARM: fix Thumb2 signal handling when ARMv6 is enabled

    [ Upstream commit 9b55613f42e8d40d5c9ccb8970bde6af4764b2ab ]

    When a kernel is built covering ARMv6 to ARMv7, we omit to clear
    the IT state when entering a signal handler. This can cause the
    first few instructions to be conditionally executed depending on
    the parent context.
    In any case, the original test for >= ARMv7 is broken - ARMv6 can
    have Thumb-2 support as well, and an ARMv6T2 specific build would
    omit this code too.

    Relax the test back to ARMv6 or greater. This results in us
    always clearing the IT state bits in the PSR, even on CPUs where
    these bits are reserved. However, they're reserved for the IT
    state, so this should cause no harm.

    Cc:
    Fixes: d71e1352e240 ("Clear the IT state when invoking a Thumb-2 signal handler")
    Acked-by: Tony Lindgren
    Tested-by: H. Nikolaus Schaller
    Tested-by: Grazvydas Ignotas
    Signed-off-by: Russell King
    Signed-off-by: Sasha Levin

commit dbea835ae741fe3956027d77afc204c9c09e1c00
Author: Guenter Roeck
Date:   Mon Aug 31 16:13:47 2015 -0700

    hwmon: (nct6775) Swap STEP_UP_TIME and STEP_DOWN_TIME registers for most chips

    [ Upstream commit 728d29400488d54974d3317fe8a232b45fdb42ee ]

    The STEP_UP_TIME and STEP_DOWN_TIME registers are swapped for all
    chips but NCT6775.

    Reported-by: Grazvydas Ignotas
    Reviewed-by: Jean Delvare
    Cc: stable@vger.kernel.org # v3.10+
    Signed-off-by: Guenter Roeck
    Signed-off-by: Sasha Levin

commit 1c393822b2a54b7e39c58740cb296262b1f09229
Author: Dominik Dingel
Date:   Fri Sep 18 11:27:45 2015 +0200

    sched: access local runqueue directly in single_task_running

    [ Upstream commit 00cc1633816de8c95f337608a1ea64e228faf771 ]

    Commit 2ee507c47293 ("sched: Add function single_task_running to
    let a task check if it is the only task running on a cpu")
    referenced the current runqueue with the smp_processor_id. When
    CONFIG_DEBUG_PREEMPT is enabled, that is only allowed if
    preemption is disabled or the current task is bound to the local
    cpu (e.g. kernel worker).

    With commit f78195129963 ("kvm: add halt_poll_ns module
    parameter") KVM calls single_task_running. If
    CONFIG_DEBUG_PREEMPT is enabled that generates a lot of kernel
    messages.

    To avoid adding preemption in those cases, as it would limit the
    usefulness, we change single_task_running to directly access the
    cpu local runqueue.
Cc: Tim Chen Suggested-by: Peter Zijlstra Acked-by: Peter Zijlstra (Intel) Cc: Fixes: 2ee507c472939db4b146d545352b8a7c79ef47f8 Signed-off-by: Dominik Dingel Signed-off-by: Paolo Bonzini Signed-off-by: Sasha Levin commit 101a8cee8944e6143754ccc09f638ef512988097 Author: Francesco Lavra Date: Sat Jul 25 08:25:18 2015 +0200 watchdog: sunxi: fix activation of system reset [ Upstream commit 0919e4445190da18496d31aac08b90828a47d45f ] Commit f2147de33470 ("watchdog: sunxi: support parameterized compatible strings") introduced a regression in sunxi_wdt_start(), by which the system reset function of the watchdog is not enabled upon starting the watchdog. As a result, the system is not reset when the watchdog expires. Fix it. Fixes: f2147de33470 ("watchdog: sunxi: support parameterized compatible strings") Signed-off-by: Francesco Lavra Acked-by: Maxime Ripard Reviewed-by: Guenter Roeck Signed-off-by: Wim Van Sebroeck Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit 835199c03a9afc36cab02e76f59e7c67b8ae324d Author: Arnaldo Carvalho de Melo Date: Fri Sep 11 12:36:12 2015 -0300 perf header: Fixup reading of HEADER_NRCPUS feature [ Upstream commit caa470475d9b59eeff093ae650800d34612c4379 ] The original patch introducing this header wrote the number of CPUs available and online in one order and then swapped those values when reading, fix it. 
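
The class of bug described here, a reader consuming two fields in a different order than the writer stored them, can be sketched as follows. This is not perf's code: `struct nrcpus`, `write_nrcpus_rec()` and `read_nrcpus_rec()` are hypothetical names, and the field order chosen (available first, then online) is an assumption of the sketch; the commit message only states that the two values were swapped on read.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory form of the two CPU counts. */
struct nrcpus {
    uint32_t avail;   /* CPUs configured in the system */
    uint32_t online;  /* CPUs currently online */
};

/* Writer: stores "avail" first, then "online" (assumed order). */
static size_t write_nrcpus_rec(unsigned char *buf, size_t len,
                               const struct nrcpus *n)
{
    assert(len >= 2 * sizeof(uint32_t));
    memcpy(buf, &n->avail, sizeof(uint32_t));
    memcpy(buf + sizeof(uint32_t), &n->online, sizeof(uint32_t));
    return 2 * sizeof(uint32_t);
}

/* Reader: must consume the fields in exactly the writer's order.
 * Assigning the first word to "online" would reproduce the bug. */
static void read_nrcpus_rec(const unsigned char *buf, size_t len,
                            struct nrcpus *n)
{
    assert(len >= 2 * sizeof(uint32_t));
    memcpy(&n->avail, buf, sizeof(uint32_t));
    memcpy(&n->online, buf + sizeof(uint32_t), sizeof(uint32_t));
}
```

Keeping the writer and reader next to each other, or driving both from one field table, makes this kind of ordering mismatch easier to spot in review.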
    Before:

      # perf record usleep 1
      # perf report --header-only | grep 'nrcpus \(online\|avail\)'
      # nrcpus online : 4
      # nrcpus avail : 4
      # echo 0 > /sys/devices/system/cpu/cpu2/online
      # perf record usleep 1
      # perf report --header-only | grep 'nrcpus \(online\|avail\)'
      # nrcpus online : 4
      # nrcpus avail : 3
      # echo 0 > /sys/devices/system/cpu/cpu1/online
      # perf record usleep 1
      # perf report --header-only | grep 'nrcpus \(online\|avail\)'
      # nrcpus online : 4
      # nrcpus avail : 2

    After the fix, bringing back the CPUs online:

      # perf report --header-only | grep 'nrcpus \(online\|avail\)'
      # nrcpus online : 2
      # nrcpus avail : 4
      # echo 1 > /sys/devices/system/cpu/cpu2/online
      # perf record usleep 1
      # perf report --header-only | grep 'nrcpus \(online\|avail\)'
      # nrcpus online : 3
      # nrcpus avail : 4
      # echo 1 > /sys/devices/system/cpu/cpu1/online
      # perf record usleep 1
      # perf report --header-only | grep 'nrcpus \(online\|avail\)'
      # nrcpus online : 4
      # nrcpus avail : 4

    Acked-by: Namhyung Kim
    Cc: Adrian Hunter
    Cc: Borislav Petkov
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Kan Liang
    Cc: Stephane Eranian
    Cc: Wang Nan
    Fixes: fbe96f29ce4b ("perf tools: Make perf.data more self-descriptive (v8)")
    Link: http://lkml.kernel.org/r/20150911153323.GP23511@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Sasha Levin

commit 6299a825f2b7ac2753af8c017d4b9a185d412780
Author: Kan Liang
Date:   Thu Jul 2 03:08:43 2015 -0400

    perf stat: Get correct cpu id for print_aggr

    [ Upstream commit 601083cffb7cabdcc55b8195d732f0f7028570fa ]

    print_aggr() fails to print per-core/per-socket statistics after
    commit 582ec0829b3d ("perf stat: Fix per-socket output bug for
    uncore events") if events have different cpus, because in
    print_aggr(), aggr_get_id needs an index (not a cpu id) to find
    the core/pkg id. Also, evsel cpu maps should be used to get the
    aggregated id.

    Here is an example:

    Counting events cycles,uncore_imc_0/cas_count_read/.
(Uncore event has cpumask 0,18) $ perf stat -e cycles,uncore_imc_0/cas_count_read/ -C0,18 --per-core sleep 2 Without this patch, it failes to get CPU 18 result. Performance counter stats for 'CPU(s) 0,18': S0-C0 1 7526851 cycles S0-C0 1 1.05 MiB uncore_imc_0/cas_count_read/ S1-C0 0 cycles S1-C0 0 MiB uncore_imc_0/cas_count_read/ With this patch, it can get both CPU0 and CPU18 result. Performance counter stats for 'CPU(s) 0,18': S0-C0 1 6327768 cycles S0-C0 1 0.47 MiB uncore_imc_0/cas_count_read/ S1-C0 1 330228 cycles S1-C0 1 0.29 MiB uncore_imc_0/cas_count_read/ Signed-off-by: Kan Liang Acked-by: Jiri Olsa Acked-by: Stephane Eranian Cc: Adrian Hunter Cc: Andi Kleen Cc: David Ahern Cc: Namhyung Kim Cc: Peter Zijlstra Fixes: 582ec0829b3d ("perf stat: Fix per-socket output bug for uncore events") Link: http://lkml.kernel.org/r/1435820925-51091-1-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Sasha Levin commit 85c394ec20ef1524af97503d82e6a7acc52a1609 Author: Arnaldo Carvalho de Melo Date: Mon Aug 10 16:53:54 2015 -0300 perf report: Add support for srcfile sort key [ Upstream commit 31191a85fb875cf123cea56bbfd34f4b941f3c79 ] In some cases it's useful to characterize samples by file. This is useful to get a higher level categorization, for example to map cost to subsystems. Add a srcfile sort key to perf report. It builds on top of the existing srcline support. Commiter notes: E.g.: # perf record -F 10000 usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.016 MB perf.data (13 samples) ] [root@zoo ~]# perf report -s srcfile --stdio # Total Lost Samples: 0 # # Samples: 13 of event 'cycles' # Event count (approx.): 869878 # # Overhead Source File # ........ ........... 60.99% . 
20.62% paravirt.h 14.23% rmap.c 4.04% signal.c 0.11% msr.h # The first line is collecting all the files for which srcfiles couldn't somehow get resolved to: # perf report -s srcfile,dso --stdio # Total Lost Samples: 0 # # Samples: 13 of event 'cycles' # Event count (approx.): 869878 # # Overhead Source File Shared Object # ........ ........... ................ 40.97% . ld-2.20.so 20.62% paravirt.h [kernel.vmlinux] 20.02% . libc-2.20.so 14.23% rmap.c [kernel.vmlinux] 4.04% signal.c [kernel.vmlinux] 0.11% msr.h [kernel.vmlinux] # XXX: Investigate why that is not resolving on Fedora 21, Andi says he hasn't seen this on Fedora 22. Signed-off-by: Andi Kleen Tested-by: Arnaldo Carvalho de Melo Cc: Jiri Olsa Cc: Namhyung Kim Link: http://lkml.kernel.org/r/1438988064-21834-1-git-send-email-andi@firstfloor.org [ Added column length update, from 0e65bdb3f90f ('perf hists: Update the column width for the "srcline" sort key') ] Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Sasha Levin commit 70afc9cd7f417e44f0486110ccc1b856b40a1d96 Author: Adrian Hunter Date: Thu Sep 24 13:05:22 2015 +0300 perf tools: Fix copying of /proc/kcore [ Upstream commit b5cabbcbd157a4bf5a92dfc85134999a3b55342d ] A copy of /proc/kcore containing the kernel text can be made to the buildid cache. e.g. perf buildid-cache -v -k /proc/kcore To workaround objdump limitations, a copy is also made when annotating against /proc/kcore. The copying process stops working from libelf about v1.62 onwards (the problem was found with v1.63). The cause is that a call to gelf_getphdr() in kcore__add_phdr() fails because additional validation has been added to gelf_getphdr(). The use of gelf_getphdr() is a misguided attempt to get default initialization of the Gelf_Phdr structure. That should not be necessary because every member of the Gelf_Phdr structure is subsequently assigned. So just remove the call to gelf_getphdr(). Similarly, a call to gelf_getehdr() in gelf_kcore__init() can be removed also. 
Committer notes:

Note to stable@kernel.org, from Adrian in the cover letter for this patchkit:

The "Fix copying of /proc/kcore" problem goes back to v3.13 if you think it is important enough for stable.

Signed-off-by: Adrian Hunter
Cc: Jiri Olsa
Cc: stable@kernel.org
Link: http://lkml.kernel.org/r/1443089122-19082-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Sasha Levin

commit 645c23ecfd8ef0f81a42db43ad60d8c28ca3348c
Author: Jenny Derzhavetz
Date: Sun Sep 6 14:52:20 2015 +0300

iser-target: remove command with state ISTATE_REMOVE

[ Upstream commit a4c15cd957cbd728f685645de7a150df5912591a ]

As documented in iscsit_sequence_cmd:

  /*
   * Existing callers for iscsit_sequence_cmd() will silently
   * ignore commands with CMDSN_LOWER_THAN_EXP, so force this
   * return for CMDSN_MAXCMDSN_OVERRUN as well..
   */

We need to silently finish a command when it's in ISTATE_REMOVE. This fixes a teardown hang we were seeing where a misbehaving initiator (triggered by allocation error injections) sent us a cmdsn which was lower than expected.

Signed-off-by: Jenny Derzhavetz
Signed-off-by: Sagi Grimberg
Cc: # v3.10+
Signed-off-by: Nicholas Bellinger
Signed-off-by: Sasha Levin

commit bda8d5c1a05079c10da9fd43b04ad00cd853fb79
Author: Michal Hocko
Date: Thu Aug 27 20:16:37 2015 +0200

scsi: fix scsi_error_handler vs.
scsi_host_dev_release race

[ Upstream commit 537b604c8b3aa8b96fe35f87dd085816552e294c ]

b9d5c6b7ef57 ("[SCSI] cleanup setting task state in scsi_error_handler()") has introduced a race between scsi_error_handler and scsi_host_dev_release resulting in the hang when the device goes away because scsi_error_handler might miss a wake up:

  CPU0                                    CPU1
  scsi_error_handler                      scsi_host_dev_release
                                            kthread_stop()
    kthread_should_stop()
      test_bit(KTHREAD_SHOULD_STOP)
                                              set_bit(KTHREAD_SHOULD_STOP)
                                              wake_up_process()
                                              wait_for_completion()
    set_current_state(TASK_INTERRUPTIBLE)
    schedule()

The most straightforward solution seems to be to invert the ordering of the set_current_state and kthread_should_stop.

The issue has been noticed during reboot test on a 3.0 based kernel but the current code seems to be affected in the same way.

[jejb: additional comment added]
Cc: # 3.6+
Reported-and-debugged-by: Mike Mayer
Signed-off-by: Michal Hocko
Reviewed-by: Dan Williams
Reviewed-by: Hannes Reinecke
Signed-off-by: James Bottomley
Signed-off-by: Sasha Levin

commit 447aac1e3d136bcde17f561b4615427adbf4070a
Author: Andy Grover
Date: Mon Aug 24 10:26:03 2015 -0700

target/iscsi: Fix np_ip bracket issue by removing np_ip

[ Upstream commit 76c28f1fcfeb42b47f798fe498351ee1d60086ae ]

Revert commit 1997e6259, which causes double brackets on ipv6 inaddr_any addresses.

Since we have np_sockaddr, if we need a textual representation we can use "%pISc".

Change iscsit_add_network_portal() and iscsit_add_np() signatures to remove *ip_str parameter.

Fix and extend some comments earlier in the function.

Tested to work for :: and ::1 via iscsiadm, previously :: failed, see https://bugzilla.redhat.com/show_bug.cgi?id=1249107 .
CC: stable@vger.kernel.org
Signed-off-by: Andy Grover
Signed-off-by: Nicholas Bellinger
Signed-off-by: Sasha Levin

commit 1dbcfc2dcd84d2b2781fb8611fb4bcd3792c32ae
Author: John Stultz
Date: Wed Sep 9 16:07:30 2015 -0700

time: Fix timekeeping_freqadjust()'s incorrect use of abs() instead of abs64()

[ Upstream commit 2619d7e9c92d524cb155ec89fd72875321512e5b ]

The internal clocksteering done for fine-grained error correction uses a logarithmic approximation, so any time adjtimex() adjusts the clock steering, timekeeping_freqadjust() quickly approximates the correct clock frequency over a series of ticks.

Unfortunately, the logic in timekeeping_freqadjust(), introduced in commit:

  dc491596f639 ("timekeeping: Rework frequency adjustments to work better w/ nohz")

used the abs() function with an s64 error value to calculate the size of the approximated adjustment to be made.

Per include/linux/kernel.h:

  "abs() should not be used for 64-bit types (s64, u64, long long) - use abs64()".

Thus on 32-bit platforms, this caused the clocksteering to take a quite dampened random walk while trying to converge on the proper frequency, so the adjustments were made much more slowly than intended (most easily observed when large adjustments are made).

This patch fixes the issue by using abs64() instead.
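A minimal user-space sketch of the truncation described above, assuming that abs() on a 32-bit platform narrows a 64-bit operand to 32 bits before taking the magnitude. The helper names abs_truncating32() and abs_full64() are hypothetical stand-ins for illustration; they are not the kernel macros.

```c
#include <stdint.h>

/*
 * Hypothetical model of abs() applied to an s64 on a 32-bit platform:
 * only the low 32 bits of the operand survive before the sign test.
 */
static int64_t abs_truncating32(int64_t x)
{
	uint32_t low = (uint32_t)x;	/* well-defined modulo-2^32 narrowing */
	int64_t t = (low & 0x80000000u) ? (int64_t)low - 4294967296LL
					: (int64_t)low;

	return t < 0 ? -t : t;
}

/* Model of abs64(): the full 64-bit magnitude is kept. */
static int64_t abs_full64(int64_t x)
{
	return x < 0 ? -x : x;
}
```

For an error value of -4294967301 (just beyond the 32-bit range), the truncating model yields 5 while the 64-bit version yields 4294967301, so a large error is mistaken for a small one and the steering converges slowly.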
Reported-by: Nuno Gonçalves Tested-by: Nuno Goncalves Signed-off-by: John Stultz Cc: # v3.17+ Cc: Linus Torvalds Cc: Miroslav Lichvar Cc: Peter Zijlstra Cc: Prarit Bhargava Cc: Richard Cochran Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/1441840051-20244-1-git-send-email-john.stultz@linaro.org Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin commit 0b5ee81826c2d98d40366c4fa435ebf73cfec8fd Author: Jason Wang Date: Tue Sep 15 14:41:56 2015 +0800 kvm: fix double free for fast mmio eventfd [ Upstream commit eefd6b06b17c5478e7c24bea6f64beaa2c431ca6 ] We register wildcard mmio eventfd on two buses, once for KVM_MMIO_BUS and once on KVM_FAST_MMIO_BUS but with a single iodev instance. This will lead to an issue: kvm_io_bus_destroy() knows nothing about the devices on two buses pointing to a single dev. Which will lead to double free[1] during exit. Fix this by allocating two instances of iodevs then registering one on KVM_MMIO_BUS and another on KVM_FAST_MMIO_BUS. CPU: 1 PID: 2894 Comm: qemu-system-x86 Not tainted 3.19.0-26-generic #28-Ubuntu Hardware name: LENOVO 2356BG6/2356BG6, BIOS G7ET96WW (2.56 ) 09/12/2013 task: ffff88009ae0c4b0 ti: ffff88020e7f0000 task.ti: ffff88020e7f0000 RIP: 0010:[] [] ioeventfd_release+0x28/0x60 [kvm] RSP: 0018:ffff88020e7f3bc8 EFLAGS: 00010292 RAX: dead000000200200 RBX: ffff8801ec19c900 RCX: 000000018200016d RDX: ffff8801ec19cf80 RSI: ffffea0008bf1d40 RDI: ffff8801ec19c900 RBP: ffff88020e7f3bd8 R08: 000000002fc75a01 R09: 000000018200016d R10: ffffffffc07df6ae R11: ffff88022fc75a98 R12: ffff88021e7cc000 R13: ffff88021e7cca48 R14: ffff88021e7cca50 R15: ffff8801ec19c880 FS: 00007fc1ee3e6700(0000) GS:ffff88023e240000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f8f389d8000 CR3: 000000023dc13000 CR4: 00000000001427e0 Stack: ffff88021e7cc000 0000000000000000 ffff88020e7f3be8 ffffffffc07e2622 ffff88020e7f3c38 ffffffffc07df69a ffff880232524160 ffff88020e792d80 0000000000000000 ffff880219b78c00 
0000000000000008 ffff8802321686a8 Call Trace: [] ioeventfd_destructor+0x12/0x20 [kvm] [] kvm_put_kvm+0xca/0x210 [kvm] [] kvm_vcpu_release+0x18/0x20 [kvm] [] __fput+0xe7/0x250 [] ____fput+0xe/0x10 [] task_work_run+0xd4/0xf0 [] do_exit+0x368/0xa50 [] ? recalc_sigpending+0x1f/0x60 [] do_group_exit+0x45/0xb0 [] get_signal+0x291/0x750 [] do_signal+0x28/0xab0 [] ? do_futex+0xdb/0x5d0 [] ? __wake_up_locked_key+0x18/0x20 [] ? SyS_futex+0x76/0x170 [] do_notify_resume+0x69/0xb0 [] int_signal+0x12/0x17 Code: 5d c3 90 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb 48 83 ec 08 48 8b 7f 20 e8 06 d6 a5 c0 48 8b 43 08 48 8b 13 48 89 df 48 89 42 08 <48> 89 10 48 b8 00 01 10 00 00 RIP [] ioeventfd_release+0x28/0x60 [kvm] RSP Cc: stable@vger.kernel.org Cc: Gleb Natapov Cc: Paolo Bonzini Signed-off-by: Jason Wang Reviewed-by: Cornelia Huck Signed-off-by: Paolo Bonzini Signed-off-by: Sasha Levin commit 7642b3f109228718f1bf57c35210c9a36696a465 Author: Jason Wang Date: Tue Sep 15 14:41:55 2015 +0800 kvm: factor out core eventfd assign/deassign logic [ Upstream commit 85da11ca587c8eb73993a1b503052391a73586f9 ] This patch factors out core eventfd assign/deassign logic and leaves the argument checking and bus index selection to callers. Cc: stable@vger.kernel.org Cc: Gleb Natapov Cc: Paolo Bonzini Signed-off-by: Jason Wang Reviewed-by: Cornelia Huck Signed-off-by: Paolo Bonzini Signed-off-by: Sasha Levin commit 7d765ce07eff78ab78d09d4acaa3aecb71e322a4 Author: Jason Wang Date: Tue Sep 15 14:41:57 2015 +0800 kvm: fix zero length mmio searching [ Upstream commit 8f4216c7d28976f7ec1b2bcbfa0a9f787133c45e ] Currently, if we had a zero length mmio eventfd assigned on KVM_MMIO_BUS. It will never be found by kvm_io_bus_cmp() since it always compares the kvm_io_range() with the length that guest wrote. This will cause e.g for vhost, kick will be trapped by qemu userspace instead of vhost. Fixing this by using zero length if an iodevice is zero length. 
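The lookup problem described above can be sketched in user space. This is a simplified illustration, not the kernel code: struct io_range, range_cmp_old() and range_cmp_new() are hypothetical names, and the "new" comparison only models the idea of ignoring lengths when the registered device has zero length.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct kvm_io_range. */
struct io_range {
	uint64_t addr;
	int len;
};

/* Comparison as described before the fix: lengths always take part. */
static int range_cmp_old(const struct io_range *key, const struct io_range *dev)
{
	if (key->addr < dev->addr)
		return -1;
	if (key->addr + key->len > dev->addr + dev->len)
		return 1;
	return 0;
}

/* Sketch of the fixed idea: a zero-length device matches on address alone. */
static int range_cmp_new(const struct io_range *key, const struct io_range *dev)
{
	uint64_t end1 = key->addr;
	uint64_t end2 = dev->addr;

	if (key->addr < dev->addr)
		return -1;
	if (dev->len) {
		end1 += key->len;
		end2 += dev->len;
	}
	if (end1 > end2)
		return 1;
	return 0;
}
```

With a 4-byte guest write to address 0x1000 and a zero-length device registered at 0x1000, the old comparison reports a mismatch (1) while the sketch of the new one reports a match (0).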
Cc: stable@vger.kernel.org
Cc: Gleb Natapov
Cc: Paolo Bonzini
Signed-off-by: Jason Wang
Reviewed-by: Cornelia Huck
Signed-off-by: Paolo Bonzini
Signed-off-by: Sasha Levin

commit d758df24a58e5160270c74b467dfa6453fceb91b
Author: Jason Wang
Date: Tue Sep 15 14:41:54 2015 +0800

kvm: don't try to register to KVM_FAST_MMIO_BUS for non mmio eventfd

[ Upstream commit 8453fecbecae26edb3f278627376caab05d9a88d ]

We only want zero length mmio eventfd to be registered on KVM_FAST_MMIO_BUS. So check this explicitly when arg->len is zero to ensure it.

Cc: stable@vger.kernel.org
Cc: Gleb Natapov
Cc: Paolo Bonzini
Signed-off-by: Jason Wang
Reviewed-by: Cornelia Huck
Signed-off-by: Paolo Bonzini
Signed-off-by: Sasha Levin

commit 45258bdd7f2da471c1201443a63765eff1622863
Author: Marek Majtyka
Date: Wed Sep 16 12:04:55 2015 +0200

arm: KVM: Fix incorrect device to IPA mapping

[ Upstream commit ca09f02f122b2ecb0f5ddfc5fd47b29ed657d4fd ]

A critical bug has been found in device memory stage1 translation for VMs with more than 4GB of address space. Since the vm_pgoff size is smaller than pa (which is true for the LPAE case, u32 and u64 respectively), some of the more significant bits of pa may be lost, as the shift operation is performed on a u32 and only later cast onto u64.

Example:
  vm_pgoff(u32)=0x00210030, PAGE_SHIFT=12
  expected pa(u64): 0x0000002010030000
  produced pa(u64): 0x0000000010030000

The fix is to change the order of operations (casting first onto phys_addr_t and then shifting).

Reviewed-by: Marc Zyngier
[maz: fixed changelog and patch formatting]
Cc: stable@vger.kernel.org
Signed-off-by: Marek Majtyka
Signed-off-by: Marc Zyngier
Signed-off-by: Sasha Levin

commit 3cd079e563ff5ffc475114e9f0d140f7882d56d2
Author: Kyle Evans
Date: Fri Sep 11 10:40:17 2015 -0500

hp-wmi: limit hotkey enable

[ Upstream commit 8a1513b49321e503fd6c8b6793e3b1f9a8a3285b ]

Do not write initialize magic on systems that do not have feature query 0xb. Fixes Bug #82451.
Redefine FEATURE_QUERY to align with 0xb and FEATURE2 with 0xd for code clarity.

Add a new test function, hp_wmi_bios_2008_later() & simplify hp_wmi_bios_2009_later(), which fixes a bug in cases where an improper value is returned. Probably also fixes Bug #69131.

Add missing __init tag.

Signed-off-by: Kyle Evans
Cc: stable@vger.kernel.org
Signed-off-by: Darren Hart
Signed-off-by: Sasha Levin

commit 2889a072fc7d3d94fff4ec6ad5db9c2fbc37c335
Author: Luis Henriques
Date: Thu Sep 17 16:01:40 2015 -0700

zram: fix possible use after free in zcomp_create()

[ Upstream commit 3aaf14da807a4e9931a37f21e4251abb8a67021b ]

zcomp_create() verifies the success of zcomp_strm_{multi,single}_create() through comp->stream, which can potentially be pointing to memory that was freed if these functions returned an error.

While at it, replace an 'ERR_PTR(-ENOMEM)' with a more generic 'ERR_PTR(error)', as in the future zcomp_strm_{multi,single}_create() could return other error codes. Function documentation updated accordingly.

Fixes: beca3ec71fe5 ("zram: add multi stream functionality")
Signed-off-by: Luis Henriques
Acked-by: Sergey Senozhatsky
Acked-by: Minchan Kim
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Sasha Levin

commit 71a386c7a6a1cf88c712f39e9e60fead672846c5
Author: Stas Sergeev
Date: Mon Jul 20 17:49:57 2015 -0700

of_mdio: add new DT property 'managed' to specify the PHY management type

[ Upstream commit 4cba5c2103657d43d0886e4cff8004d95a3d0def ]

Currently the PHY management type is selected by the MAC driver arbitrarily. The decision is based on the presence of the "fixed-link" node and on the will of the driver's authors. This caused a regression recently, when the mvneta driver suddenly started to use the in-band status for auto-negotiation on fixed links. It appears the auto-negotiation may not work when expected by the MAC driver. Sebastien Rannou explains:

<< Yes, I confirm that my HW does not generate an in-band status.
AFAIK, it's a PHY that aggregates 4xSGMIIs to 1xQSGMII ; the MAC side of the PHY (with inband status) is connected to the switch through QSGMII, and in this context we are on the media side of the PHY. >> https://lkml.org/lkml/2015/7/10/206 This patch introduces the new string property 'managed' that allows the user to set the management type explicitly. The supported values are: "auto" - default. Uses either MDIO or nothing, depending on the presence of the fixed-link node "in-band-status" - use in-band status Signed-off-by: Stas Sergeev CC: Rob Herring CC: Pawel Moll CC: Mark Rutland CC: Ian Campbell CC: Kumar Gala CC: Florian Fainelli CC: Grant Likely CC: devicetree@vger.kernel.org CC: linux-kernel@vger.kernel.org CC: netdev@vger.kernel.org Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit c0fb099370bcd7dd1956a787efbec35584d00117 Author: Florian Fainelli Date: Mon Jul 20 17:49:55 2015 -0700 net: dsa: bcm_sf2: Do not override speed settings [ Upstream commit d2eac98f7d1b950b762a7eca05a9ce0ea1d878d2 ] The SF2 driver currently overrides speed settings for its port configured using a fixed PHY, this is both unnecessary and incorrect, because we keep feedback to the hardware parameters that we read from the PHY device, which in the case of a fixed PHY cannot possibly change speed. This is a required change to allow the fixed PHY code to allow registering a PHY with a link configured as DOWN by default and avoid some sort of circular dependency where we require the link_update callback to run to program the hardware, and we then utilize the fixed PHY parameters to program the hardware with the same settings. Fixes: 246d7f773c13 ("net: dsa: add Broadcom SF2 switch driver") Signed-off-by: Florian Fainelli Signed-off-by: David S. 
Miller Signed-off-by: Sasha Levin commit 9a2c1f52a4a65565ea54ecd65e99a34e8c39cb3c Author: Eric Dumazet Date: Wed Sep 23 14:00:21 2015 -0700 tcp: add proper TS val into RST packets [ Upstream commit 675ee231d960af2af3606b4480324e26797eb010 ] RST packets sent on behalf of TCP connections with TS option (RFC 7323 TCP timestamps) have incorrect TS val (set to 0), but correct TS ecr. A > B: Flags [S], seq 0, win 65535, options [mss 1000,nop,nop,TS val 100 ecr 0], length 0 B > A: Flags [S.], seq 2444755794, ack 1, win 28960, options [mss 1460,nop,nop,TS val 7264344 ecr 100], length 0 A > B: Flags [.], ack 1, win 65535, options [nop,nop,TS val 110 ecr 7264344], length 0 B > A: Flags [R.], seq 1, ack 1, win 28960, options [nop,nop,TS val 0 ecr 110], length 0 We need to call skb_mstamp_get() to get proper TS val, derived from skb->skb_mstamp Note that RFC 1323 was advocating to not send TS option in RST segment, but RFC 7323 recommends the opposite : Once TSopt has been successfully negotiated, that is both and contain TSopt, the TSopt MUST be sent in every non- segment for the duration of the connection, and SHOULD be sent in an segment (see Section 5.2 for details) Note this RFC recommends to send TS val = 0, but we believe it is premature : We do not know if all TCP stacks are properly handling the receive side : When an segment is received, it MUST NOT be subjected to the PAWS check by verifying an acceptable value in SEG.TSval, and information from the Timestamps option MUST NOT be used to update connection state information. SEG.TSecr MAY be used to provide stricter acceptance checks. In 5 years, if/when all TCP stack are RFC 7323 ready, we might consider to decide to send TS val = 0, if it buys something. Fixes: 7faee5c0d514 ("tcp: remove TCP_SKB_CB(skb)->when") Signed-off-by: Eric Dumazet Acked-by: Yuchung Cheng Signed-off-by: David S. 
Miller Signed-off-by: Sasha Levin commit 646cd5edbf95700a6d720f153ce2b6a66dde8aa3 Author: Florian Fainelli Date: Tue Sep 8 20:06:41 2015 -0700 net: dsa: bcm_sf2: Fix 64-bits register writes [ Upstream commit 03679a14739a0d4c14b52ba65a69ff553bfba73b ] The macro to write 64-bits quantities to the 32-bits register swapped the value and offsets arguments, we want to preserve the ordering of the arguments with respect to how writel() is implemented for instance: value first, offset/base second. Fixes: 246d7f773c13 ("net: dsa: add Broadcom SF2 switch driver") Signed-off-by: Florian Fainelli Reviewed-by: Vivien Didelot Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit ca41797a868fee22aab7d1bb3cbccb985172c65c Author: Atsushi Nemoto Date: Wed Sep 2 17:49:29 2015 +0900 net: eth: altera: fix napi poll_list corruption [ Upstream commit 4548a697e4969d695047cebd6d9af5e2f6cc728e ] tse_poll() calls __napi_complete() with irq enabled. This leads napi poll_list corruption and may stop all napi drivers working. Use napi_complete() instead of __napi_complete(). Signed-off-by: Atsushi Nemoto Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 826d518a0241a4f9fe960673b9047f44f165218a Author: Eric Sandeen Date: Sat Aug 15 10:45:06 2015 -0400 ext4: don't manipulate recovery flag when freezing no-journal fs [ Upstream commit c642dc9e1aaed953597e7092d7df329e6234096e ] At some point along this sequence of changes: f6e63f9 ext4: fold ext4_nojournal_sops into ext4_sops bb04457 ext4: support freezing ext2 (nojournal) file systems 9ca9238 ext4: Use separate super_operations structure for no_journal filesystems ext4 started setting needs_recovery on filesystems without journals when they are unfrozen. This makes no sense, and in fact confuses blkid to the point where it doesn't recognize the filesystem at all. (freeze ext2; unfreeze ext2; run blkid; see no output; run dumpe2fs, see needs_recovery set on fs w/ no journal). 
To fix this, don't manipulate the INCOMPAT_RECOVER feature on filesystems without journals. Reported-by: Stu Mark Reviewed-by: Jan Kara Signed-off-by: Eric Sandeen Signed-off-by: Theodore Ts'o Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit 5324b25311a4676cda90a280ec273b7b46a79d28 Author: Daniel Axtens Date: Tue Sep 15 15:04:07 2015 +1000 cxl: Fix unbalanced pci_dev_get in cxl_probe [ Upstream commit 2925c2fdf1e0eb642482f5b30577e9435aaa8edb ] Currently the first thing we do in cxl_probe is to grab a reference on the pci device. Later on, we call device_register on our adapter. In our remove path, we call device_unregister, but we never call pci_dev_put. We therefore leak the device every time we do a reflash. device_register/unregister is sufficient to hold the reference. Therefore, drop the call to pci_dev_get. Here's why this is safe. The proposed cxl_probe(pdev) calls cxl_adapter_init: a) init calls cxl_adapter_alloc, which creates a struct cxl, conventionally called adapter. This struct contains a device entry, adapter->dev. b) init calls cxl_configure_adapter, where we set adapter->dev.parent = &dev->dev (here dev is the pci dev) So at this point, the cxl adapter's device's parent is the PCI device that I want to be refcounted properly. c) init calls cxl_register_adapter *) cxl_register_adapter calls device_register(&adapter->dev) So now we're in device_register, where dev is the adapter device, and we want to know if the PCI device is safe after we return. device_register(&adapter->dev) calls device_initialize() and then device_add(). device_add() does a get_device(). device_add() also explicitly grabs the device's parent, and calls get_device() on it: parent = get_device(dev->parent); So therefore, device_register() takes a lock on the parent PCI dev, which is what pci_dev_get() was guarding. pci_dev_get() can therefore be safely removed. 
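The reference-count argument above can be illustrated with a toy model. Everything here is hypothetical (struct toy_dev and the toy_* helpers are not kernel APIs); it only assumes, as the text explains, that registering a device takes a reference on its parent and unregistering drops it again.

```c
#include <stddef.h>

/* Hypothetical toy device with a bare reference counter. */
struct toy_dev {
	int refs;
	struct toy_dev *parent;
};

static void toy_get(struct toy_dev *d) { if (d) d->refs++; }
static void toy_put(struct toy_dev *d) { if (d) d->refs--; }

/* Models device_register(): takes a reference on the child and its parent. */
static void toy_register(struct toy_dev *child)
{
	toy_get(child);
	toy_get(child->parent);
}

/* Models device_unregister(): drops both references again. */
static void toy_unregister(struct toy_dev *child)
{
	toy_put(child->parent);
	toy_put(child);
}

/* One probe/remove cycle; extra_get models the removed pci_dev_get() call. */
static int parent_refs_after_cycle(int extra_get)
{
	struct toy_dev pci = { 1, NULL };	/* one reference held elsewhere */
	struct toy_dev adapter = { 0, &pci };

	if (extra_get)
		toy_get(&pci);		/* never paired with a put */
	toy_register(&adapter);
	toy_unregister(&adapter);
	return pci.refs;
}
```

In this model the parent ends a probe/remove cycle with one leaked reference when the extra get is present, and returns to its starting count when it is dropped.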
Fixes: f204e0b8cedd ("cxl: Driver code for powernv PCIe based cards for userspace access") Cc: stable@vger.kernel.org Signed-off-by: Daniel Axtens Acked-by: Ian Munsie Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin commit 5be042b1917ddf444c20f4e12856535307b37c01 Author: Shota Suzuki Date: Wed Jul 1 09:25:52 2015 +0900 igb: Fix oops caused by missing queue pairing [ Upstream commit 72ddef0506da852dc82f078f37ced8ef4d74a2bf ] When initializing igb driver (e.g. 82576, I350), IGB_FLAG_QUEUE_PAIRS is set if adapter->rss_queues exceeds half of max_rss_queues in igb_init_queue_configuration(). On the other hand, IGB_FLAG_QUEUE_PAIRS is not set even if the number of queues exceeds half of max_combined in igb_set_channels() when changing the number of queues by "ethtool -L". In this case, if numvecs is larger than MAX_MSIX_ENTRIES (10), the size of adapter->msix_entries[], an overflow can occur in igb_set_interrupt_capability(), which in turn leads to an oops. Fix this problem as follows: - When changing the number of queues by "ethtool -L", set IGB_FLAG_QUEUE_PAIRS in the same way as initializing igb driver. - When increasing the size of q_vector, reallocate it appropriately. (With IGB_FLAG_QUEUE_PAIRS set, the size of q_vector gets larger.) Another possible way to fix this problem is to cap the queues at its initial number, which is the number of the initial online cpus. But this is not the optimal way because we cannot increase queues when another cpu becomes online. Note that before commit cd14ef54d25b ("igb: Change to use statically allocated array for MSIx entries"), this problem did not cause oops but just made the number of queues become 1 because of entering msi_only mode in igb_set_interrupt_capability(). 
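A rough sketch of why the missing pairing flag matters, assuming (this is an assumption, not taken from the text) that the vector count is one per Rx queue, one more per Tx queue when queues are not paired, plus one extra vector. count_vectors() and overflows_msix_table() are hypothetical helpers; only MAX_MSIX_ENTRIES (10) comes from the description above.

```c
#define MAX_MSIX_ENTRIES 10	/* size of adapter->msix_entries[] per the text */

/*
 * Hypothetical vector count: one vector per Rx queue, one more per Tx
 * queue unless queues are paired, plus one additional vector (assumed).
 */
static int count_vectors(int rx_queues, int tx_queues, int queue_pairs)
{
	int numvecs = rx_queues;

	if (!queue_pairs)
		numvecs += tx_queues;
	return numvecs + 1;
}

/* Nonzero when the count would index past a MAX_MSIX_ENTRIES-sized array. */
static int overflows_msix_table(int numvecs)
{
	return numvecs > MAX_MSIX_ENTRIES;
}
```

Under these assumptions, 8 Rx and 8 Tx queues need 17 vectors without pairing, which exceeds the table, but only 9 with pairing, which fits.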
Fixes: 907b7835799f ("igb: Add ethtool support to configure number of channels") CC: stable Signed-off-by: Shota Suzuki Tested-by: Aaron Brown Signed-off-by: Jeff Kirsher Signed-off-by: Sasha Levin commit e936a4c6e21b33b3c894c31cf45757961a0a22ef Author: Larry Finger Date: Wed Jul 8 10:18:50 2015 -0500 rtlwifi: rtl8821ae: Fix an expression that is always false [ Upstream commit 251086f588720277a6f5782020a648ce32c4e00b ] In routine _rtl8821ae_set_media_status(), an incorrect mask results in a test for AP status to always be false. Similar bugs were fixed in rtl8192cu and rtl8192de, but this instance was missed at that time. Reported-by: David Binderman Signed-off-by: Larry Finger Cc: Stable [3.18+] Cc: David Binderman Signed-off-by: Kalle Valo Signed-off-by: Sasha Levin commit 4bc532d8428f6dd671c66f51ce5e459cc0ff1c86 Author: Andy Lutomirski Date: Wed Jul 15 10:29:38 2015 -0700 x86/nmi/64: Use DF to avoid userspace RSP confusing nested NMI detection [ Upstream commit 810bc075f78ff2c221536eb3008eac6a492dba2d ] We have a tricky bug in the nested NMI code: if we see RSP pointing to the NMI stack on NMI entry from kernel mode, we assume that we are executing a nested NMI. This isn't quite true. A malicious userspace program can point RSP at the NMI stack, issue SYSCALL, and arrange for an NMI to happen while RSP is still pointing at the NMI stack. Fix it with a sneaky trick. Set DF in the region of code that the RSP check is intended to detect. IRET will clear DF atomically. ( Note: other than paravirt, there's little need for all this complexity. We could check RIP instead of RSP. 
) Signed-off-by: Andy Lutomirski Reviewed-by: Steven Rostedt Cc: Borislav Petkov Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: stable@vger.kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin commit eb0bad52bb1c7754607a49fdfdf1a57082dded64 Author: Andy Lutomirski Date: Wed Jul 15 10:29:37 2015 -0700 x86/nmi/64: Reorder nested NMI checks [ Upstream commit a27507ca2d796cfa8d907de31ad730359c8a6d06 ] Check the repeat_nmi .. end_repeat_nmi special case first. The next patch will rework the RSP check and, as a side effect, the RSP check will no longer detect repeat_nmi .. end_repeat_nmi, so we'll need this ordering of the checks. Note: this is more subtle than it appears. The check for repeat_nmi .. end_repeat_nmi jumps straight out of the NMI code instead of adjusting the "iret" frame to force a repeat. This is necessary, because the code between repeat_nmi and end_repeat_nmi sets "NMI executing" and then writes to the "iret" frame itself. If a nested NMI comes in and modifies the "iret" frame while repeat_nmi is also modifying it, we'll end up with garbage. The old code got this right, as does the new code, but the new code is a bit more explicit. If we were to move the check right after the "NMI executing" check, then we'd get it wrong and have random crashes. ( Because the "NMI executing" check would jump to the code that would modify the "iret" frame without checking if the interrupted NMI was currently modifying it. ) Signed-off-by: Andy Lutomirski Reviewed-by: Steven Rostedt Cc: Borislav Petkov Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: stable@vger.kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin commit 092f7a2aeeaa580099615239bad5b35a445cf6d3 Author: Andy Lutomirski Date: Wed Jul 15 10:29:36 2015 -0700 x86/nmi/64: Improve nested NMI comments [ Upstream commit 0b22930ebad563ae97ff3f8d7b9f12060b4c6e6b ] I found the nested NMI documentation to be difficult to follow. Improve the comments. 
Signed-off-by: Andy Lutomirski Reviewed-by: Steven Rostedt Cc: Borislav Petkov Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: stable@vger.kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin commit 5e4c0ae9512a9a697300a73d2f5ec4ff9822bcd7 Author: Ivan Vecera Date: Thu Aug 6 22:48:23 2015 +0200 bna: fix interrupts storm caused by erroneous packets [ Upstream commit ade4dc3e616e33c80d7e62855fe1b6f9895bc7c3 ] The commit "e29aa33 bna: Enable Multi Buffer RX" moved packets counter increment from the beginning of the NAPI processing loop after the check for erroneous packets so they are never accounted. This counter is used to inform firmware about number of processed completions (packets). As these packets are never acked the firmware fires IRQs for them again and again. Fixes: e29aa33 ("bna: Enable Multi Buffer RX") Signed-off-by: Ivan Vecera Acked-by: Rasesh Mody Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 2ab4f1137e9c201684bb92fb32cfd717dd266898 Author: Eric Dumazet Date: Sat Aug 1 12:14:33 2015 +0200 udp: fix dst races with multicast early demux [ Upstream commit 10e2eb878f3ca07ac2f05fa5ca5e6c4c9174a27a ] Multicast dst are not cached. They carry DST_NOCACHE. As mentioned in commit f8864972126899 ("ipv4: fix dst race in sk_dst_get()"), these dst need special care before caching them into a socket. Caching them is allowed only if their refcnt was not 0, ie we must use atomic_inc_not_zero() Also, we must use READ_ONCE() to fetch sk->sk_rx_dst, as mentioned in commit d0c294c53a771 ("tcp: prevent fetching dst twice in early demux code") Fixes: 421b3885bf6d ("udp: ipv4: Add udp early demux") Tested-by: Gregory Hoggarth Signed-off-by: Eric Dumazet Reported-by: Gregory Hoggarth Reported-by: Alex Gartrell Cc: Michal Kubeček Signed-off-by: David S. 
Miller Signed-off-by: Sasha Levin commit f0efe0105861b64a20345e4b2db84baac239d80b Author: Lars Westerhoff Date: Tue Jul 28 01:32:21 2015 +0300 packet: missing dev_put() in packet_do_bind() [ Upstream commit 158cd4af8dedbda0d612d448c724c715d0dda649 ] When binding a PF_PACKET socket, the use count of the bound interface is always increased with dev_hold in dev_get_by_{index,name}. However, when rebound with the same protocol and device as in the previous bind the use count of the interface was not decreased. Ultimately, this caused the deletion of the interface to fail with the following message: unregister_netdevice: waiting for dummy0 to become free. Usage count = 1 This patch moves the dev_put out of the conditional part that was only executed when either the protocol or device changed on a bind. Fixes: 902fefb82ef7 ('packet: improve socket create/bind latency in some cases') Signed-off-by: Lars Westerhoff Signed-off-by: Dan Carpenter Reviewed-by: Daniel Borkmann Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 71960d66b34fbcdb38474d55c4e85c6016047a33 Author: Wilson Kok Date: Tue Sep 22 21:40:22 2015 -0700 fib_rules: fix fib rule dumps across multiple skbs [ Upstream commit 41fc014332d91ee90c32840bf161f9685b7fbf2b ] dump_rules returns skb length and not error. But when family == AF_UNSPEC, the caller of dump_rules assumes that it returns an error. Hence, when family == AF_UNSPEC, we continue trying to dump on -EMSGSIZE errors resulting in incorrect dump idx carried between skbs belonging to the same dump. This results in fib rule dump always only dumping rules that fit into the first skb. This patch fixes dump_rules to return error so that we exit correctly and idx is correctly maintained between skbs that are part of the same dump. Signed-off-by: Wilson Kok Signed-off-by: Roopa Prabhu Signed-off-by: David S. 
Miller Signed-off-by: Sasha Levin commit ae688bc6a552199f47564bc5d57a47b3a7370251 Author: Jesse Gross Date: Mon Sep 21 20:21:20 2015 -0700 openvswitch: Zero flows on allocation. [ Upstream commit ae5f2fb1d51fa128a460bcfbe3c56d7ab8bf6a43 ] When support for megaflows was introduced, OVS needed to start installing flows with a mask applied to them. Since masking is an expensive operation, OVS also had an optimization that would only take the parts of the flow keys that were covered by a non-zero mask. The values stored in the remaining pieces should not matter because they are masked out. While this works fine for the purposes of matching (which must always look at the mask), serialization to netlink can be problematic. Since the flow and the mask are serialized separately, the uninitialized portions of the flow can be encoded with whatever values happen to be present. In terms of functionality, this has little effect since these fields will be masked out by definition. However, it leaks kernel memory to userspace, which is a potential security vulnerability. It is also possible that other code paths could look at the masked key and get uninitialized data, although this does not currently appear to be an issue in practice. This removes the mask optimization for flows that are being installed. This was always intended to be the case as the mask optimizations were really targetting per-packet flow operations. Fixes: 03f0d916 ("openvswitch: Mega flow implementation") Signed-off-by: Jesse Gross Acked-by: Pravin B Shelar Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 779c19e0ac88b95710ceae2495caebfd442dd2c1 Author: Marcelo Ricardo Leitner Date: Thu Sep 10 17:31:15 2015 -0300 sctp: fix race on protocol/netns initialization [ Upstream commit 8e2d61e0aed2b7c4ecb35844fe07e0b2b762dee4 ] Consider sctp module is unloaded and is being requested because an user is creating a sctp socket. 
During initialization, sctp will add the new protocol type and then initialize pernet subsys:

    status = sctp_v4_protosw_init();
    if (status)
            goto err_protosw_init;

    status = sctp_v6_protosw_init();
    if (status)
            goto err_v6_protosw_init;

    status = register_pernet_subsys(&sctp_net_ops);

The problem is that after those calls to sctp_v{4,6}_protosw_init(), it is possible for userspace to create SCTP sockets as if the module were already fully loaded. If that happens, one of the possible effects is that we will have readers for net->sctp.local_addr_list list earlier than expected and sctp_net_init() does not take precautions while dealing with that list, leading to a potential panic but not limited to that, as sctp_sock_init() will copy a bunch of blank/partially initialized values from net->sctp.

The race happens like this:

     CPU 0                           |  CPU 1
  socket()                           |
   __sock_create                     | socket()
    inet_create                      |  __sock_create
     list_for_each_entry_rcu(        |
        answer, &inetsw[sock->type], |
        list) {                      |   inet_create
      /* no hits */                  |
     if (unlikely(err)) {            |
      ...                            |
      request_module()               |
      /* socket creation is blocked  |
       * the module is fully loaded  |
       */                            |
       sctp_init                     |
        sctp_v4_protosw_init         |
         inet_register_protosw       |
          list_add_rcu(&p->list,     |
                       last_perm);   |
                                     |  list_for_each_entry_rcu(
                                     |     answer, &inetsw[sock->type],
        sctp_v6_protosw_init         |     list) {
                                     |     /* hit, so assumes protocol
                                     |      * is already loaded
                                     |      */
                                     |  /* socket creation continues
                                     |   * before netns is initialized
                                     |   */
        register_pernet_subsys       |

Simply inverting the initialization order between register_pernet_subsys() and sctp_v4_protosw_init() is not possible because register_pernet_subsys() will create a control sctp socket, so the protocol must be already visible by then. Deferring the socket creation to a work-queue is not good, especially because we lose the ability to handle its errors.
So, as suggested by Vlad, the fix is to split netns initialization in two moments: defaults and control socket, so that the defaults are already loaded by when we register the protocol, while control socket initialization is kept at the same moment it is today. Fixes: 4db67e808640 ("sctp: Make the address lists per network namespace") Signed-off-by: Vlad Yasevich Signed-off-by: Marcelo Ricardo Leitner Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit d38200098e3203ba30ba06ed3f345ec6ca75234c Author: Daniel Borkmann Date: Thu Sep 10 20:05:46 2015 +0200 netlink, mmap: transform mmap skb into full skb on taps [ Upstream commit 1853c949646005b5959c483becde86608f548f24 ] Ken-ichirou reported that running netlink in mmap mode for receive in combination with nlmon will throw a NULL pointer dereference in __kfree_skb() on nlmon_xmit(), in my case I can also trigger an "unable to handle kernel paging request". The problem is the skb_clone() in __netlink_deliver_tap_skb() for skbs that are mmaped. I.e. the cloned skb doesn't have a destructor, whereas the mmap netlink skb has it pointed to netlink_skb_destructor(), set in the handler netlink_ring_setup_skb(). There, skb->head is being set to NULL, so that in such cases, __kfree_skb() doesn't perform a skb_release_data() via skb_release_all(), where skb->head is possibly being freed through kfree(head) into slab allocator, although netlink mmap skb->head points to the mmap buffer. Similarly, the same has to be done also for large netlink skbs where the data area is vmalloced. Therefore, as discussed, make a copy for these rather rare cases for now. This fixes the issue on my and Ken-ichirou's test-cases. Reference: http://thread.gmane.org/gmane.linux.network/371129 Fixes: bcbde0d449ed ("net: netlink: virtual tap device management") Reported-by: Ken-ichirou MATSUZAWA Signed-off-by: Daniel Borkmann Tested-by: Ken-ichirou MATSUZAWA Signed-off-by: David S. 
Miller Signed-off-by: Sasha Levin commit 3ad45f92f83fab87142b06addd1c548f7f6a4e89 Author: Richard Laing Date: Thu Sep 3 13:52:31 2015 +1200 net/ipv6: Correct PIM6 mrt_lock handling [ Upstream commit 25b4a44c19c83d98e8c0807a7ede07c1f28eab8b ] In the IPv6 multicast routing code the mrt_lock was not being released correctly in the MFC iterator, as a result adding or deleting a MIF would cause a hang because the mrt_lock could not be acquired. This fix is a copy of the code for the IPv4 case and ensures that the lock is released correctly. Signed-off-by: Richard Laing Acked-by: Cong Wang Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 833db3b898d313b4f4f011a8a2cf845e6f620a89 Author: Daniel Borkmann Date: Thu Sep 3 00:29:07 2015 +0200 ipv6: fix exthdrs offload registration in out_rt path [ Upstream commit e41b0bedba0293b9e1e8d1e8ed553104b9693656 ] We previously register IPPROTO_ROUTING offload under inet6_add_offload(), but in error path, we try to unregister it with inet_del_offload(). This doesn't seem correct, it should actually be inet6_del_offload(), also ipv6_exthdrs_offload_exit() from that commit seems rather incorrect (it also uses rthdr_offload twice), but it got removed entirely later on. Fixes: 3336288a9fea ("ipv6: Switch to using new offload infrastructure.") Signed-off-by: Daniel Borkmann Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 965360dee1a0e2e1ef90799bfbc33bf1d3a6775d Author: Eugene Shatokhin Date: Mon Aug 24 23:13:42 2015 +0300 usbnet: Get EVENT_NO_RUNTIME_PM bit before it is cleared [ Upstream commit f50791ac1aca1ac1b0370d62397b43e9f831421a ] It is needed to check EVENT_NO_RUNTIME_PM bit of dev->flags in usbnet_stop(), but its value should be read before it is cleared when dev->flags is set to 0. The problem was spotted and the fix was provided by Oliver Neukum . Signed-off-by: Eugene Shatokhin Acked-by: Oliver Neukum Signed-off-by: David S. 
Miller Signed-off-by: Sasha Levin commit adda5e353608e7ad825cd1d4edca724b644f55f6 Author: huaibin Wang Date: Tue Aug 25 16:20:34 2015 +0200 ip6_gre: release cached dst on tunnel removal [ Upstream commit d4257295ba1b389c693b79de857a96e4b7cd8ac0 ] When a tunnel is deleted, the cached dst entry should be released. This problem may prevent the removal of a netns (seen with a x-netns IPv6 gre tunnel): unregister_netdevice: waiting for lo to become free. Usage count = 3 CC: Dmitry Kozlov Fixes: c12b395a4664 ("gre: Support GRE over IPv6") Signed-off-by: huaibin Wang Signed-off-by: Nicolas Dichtel Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 2fb9a494b9f29080f504af3d50ac2feab15dbad5 Author: Daniel Borkmann Date: Tue Jul 7 00:07:52 2015 +0200 rtnetlink: verify IFLA_VF_INFO attributes before passing them to driver [ Upstream commit 4f7d2cdfdde71ffe962399b7020c674050329423 ] Jason Gunthorpe reported that since commit c02db8c6290b ("rtnetlink: make SR-IOV VF interface symmetric"), we don't verify IFLA_VF_INFO attributes anymore with respect to their policy, that is, ifla_vfinfo_policy[]. Before, they were part of ifla_policy[], but they have been nested since placed under IFLA_VFINFO_LIST, that contains the attribute IFLA_VF_INFO, which is another nested attribute for the actual VF attributes such as IFLA_VF_MAC, IFLA_VF_VLAN, etc. Despite the policy being split out from ifla_policy[] in this commit, it's never applied anywhere. nla_for_each_nested() only does basic nla_ok() testing for struct nlattr, but it doesn't know about the data context and their requirements. Fix, on top of Jason's initial work, does 1) parsing of the attributes with the right policy, and 2) using the resulting parsed attribute table from 1) instead of the nla_for_each_nested() loop (just like we used to do when still part of ifla_policy[]). 
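The two steps of the fix (validate against a policy, then read from the parsed table rather than looping over raw attributes) can be modelled in userspace. This is only a sketch under stated assumptions: `struct attr`, `vf_policy_min_len`, `parse_with_policy()` and the `VF_*` constants are hypothetical names, the minimum lengths are illustrative numbers, and none of it is the kernel netlink API. For simplicity the sketch rejects out-of-range types outright.

```c
#include <stddef.h>

/* All names below are hypothetical; this is not the netlink API. */
enum { VF_UNSPEC, VF_MAC, VF_VLAN, VF_MAX = VF_VLAN };

struct attr {
	int type;          /* attribute type */
	size_t len;        /* payload length in bytes */
	const void *data;  /* payload */
};

/* Minimum payload length per attribute type (illustrative values). */
static const size_t vf_policy_min_len[VF_MAX + 1] = {
	[VF_MAC]  = 8,
	[VF_VLAN] = 12,
};

/*
 * Step 1: validate every attribute against the policy and fill tb[],
 * indexed by type. Returns 0 on success, -1 if an attribute has an
 * out-of-range type or is shorter than the policy requires.
 * Step 2 (in the caller): use tb[] instead of walking the raw list.
 */
int parse_with_policy(const struct attr *attrs, size_t n,
		      const struct attr *tb[VF_MAX + 1])
{
	size_t i;

	for (i = 0; i <= VF_MAX; i++)
		tb[i] = NULL;

	for (i = 0; i < n; i++) {
		int type = attrs[i].type;

		if (type <= VF_UNSPEC || type > VF_MAX)
			return -1;
		if (attrs[i].len < vf_policy_min_len[type])
			return -1;
		tb[type] = &attrs[i];
	}
	return 0;
}
```

A caller would check the return value first and only then dereference `tb[VF_MAC]` or `tb[VF_VLAN]`, so a too-short payload never reaches the code that interprets it.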
Reference: http://thread.gmane.org/gmane.linux.network/368913 Fixes: c02db8c6290b ("rtnetlink: make SR-IOV VF interface symmetric") Reported-by: Jason Gunthorpe Cc: Chris Wright Cc: Sucheta Chakraborty Cc: Greg Rose Cc: Jeff Kirsher Cc: Rony Efraim Cc: Vlad Zolotarov Cc: Nicolas Dichtel Cc: Thomas Graf Signed-off-by: Jason Gunthorpe Signed-off-by: Daniel Borkmann Acked-by: Vlad Zolotarov Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit e8d18053d703c713b64031d65dda260d3db46472 Author: Vlad Zolotarov Date: Mon Mar 30 21:35:23 2015 +0300 if_link: Add an additional parameter to ifla_vf_info for RSS querying [ Upstream commit 01a3d796813d6302af9f828f34b73d21a4b96c9a ] Add configuration setting for drivers to allow/block an RSS Redirection Table and a Hash Key querying for discrete VFs. On some devices VF share the mentioned above information with PF and querying it may adduce a theoretical security risk. We want to let a system administrator to decide if he/she wants to take this risk or not. Signed-off-by: Vlad Zolotarov Tested-by: Phil Schmitt Signed-off-by: Jeff Kirsher Signed-off-by: Sasha Levin commit 2b1aeac03526e1cd727c9bfca4da77b3bbb3beb6 Author: Hin-Tak Leung Date: Wed Sep 9 15:38:04 2015 -0700 hfs,hfsplus: cache pages correctly between bnode_create and bnode_free [ Upstream commit 7cb74be6fd827e314f81df3c5889b87e4c87c569 ] Pages looked up by __hfs_bnode_create() (called by hfs_bnode_create() and hfs_bnode_find() for finding or creating pages corresponding to an inode) are immediately kmap()'ed and used (both read and write) and kunmap()'ed, and should not be page_cache_release()'ed until hfs_bnode_free(). This patch fixes a problem I first saw in July 2012: merely running "du" on a large hfsplus-mounted directory a few times on a reasonably loaded system would get the hfsplus driver all confused and complaining about B-tree inconsistencies, and generates a "BUG: Bad page state". 
Most recently, I can generate this problem on up-to-date Fedora 22 with shipped kernel 4.0.5, by running "du /" (="/" + "/home" + "/mnt" + other smaller mounts) and "du /mnt" simultaneously on two windows, where /mnt is a lightly-used QEMU VM image of the full Mac OS X 10.9: $ df -i / /home /mnt Filesystem Inodes IUsed IFree IUse% Mounted on /dev/mapper/fedora-root 3276800 551665 2725135 17% / /dev/mapper/fedora-home 52879360 716221 52163139 2% /home /dev/nbd0p2 4294967295 1387818 4293579477 1% /mnt After applying the patch, I was able to run "du /" (60+ times) and "du /mnt" (150+ times) continuously and simultaneously for 6+ hours. There are many reports of the hfsplus driver getting confused under load and generating "BUG: Bad page state" or other similar issues over the years. [1] The unpatched code [2] has always been wrong since it entered the kernel tree. The only reason why it gets away with it is that the kmap/memcpy/kunmap follow very quickly after the page_cache_release() so the kernel has not had a chance to reuse the memory for something else, most of the time. The current RW driver appears to have followed the design and development of the earlier read-only hfsplus driver [3], where-by version 0.1 (Dec 2001) had a B-tree node-centric approach to read_cache_page()/page_cache_release() per bnode_get()/bnode_put(), migrating towards version 0.2 (June 2002) of caching and releasing pages per inode extents. When the current RW code first entered the kernel [2] in 2005, there was an REF_PAGES conditional (and "//" commented out code) to switch between B-node centric paging to inode-centric paging. There was a mistake with the direction of one of the REF_PAGES conditionals in __hfs_bnode_create(). 
In a subsequent "remove debug code" commit [4], the read_cache_page()/page_cache_release() per bnode_get()/bnode_put() were removed, but a page_cache_release() was mistakenly left in (propagating the "REF_PAGES <-> !REF_PAGE" mistake), and the commented-out page_cache_release() in bnode_release() (which should be spanned by !REF_PAGES) was never enabled. References: [1]: Michael Fox, Apr 2013 http://www.spinics.net/lists/linux-fsdevel/msg63807.html ("hfsplus volume suddenly inaccessable after 'hfs: recoff %d too large'") Sasha Levin, Feb 2015 http://lkml.org/lkml/2015/2/20/85 ("use after free") https://bugs.launchpad.net/ubuntu/+source/linux/+bug/740814 https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1027887 https://bugzilla.kernel.org/show_bug.cgi?id=42342 https://bugzilla.kernel.org/show_bug.cgi?id=63841 https://bugzilla.kernel.org/show_bug.cgi?id=78761 [2]: http://git.kernel.org/cgit/linux/kernel/git/tglx/history.git/commit/\ fs/hfs/bnode.c?id=d1081202f1d0ee35ab0beb490da4b65d4bc763db commit d1081202f1d0ee35ab0beb490da4b65d4bc763db Author: Andrew Morton Date: Wed Feb 25 16:17:36 2004 -0800 [PATCH] HFS rewrite http://git.kernel.org/cgit/linux/kernel/git/tglx/history.git/commit/\ fs/hfsplus/bnode.c?id=91556682e0bf004d98a529bf829d339abb98bbbd commit 91556682e0bf004d98a529bf829d339abb98bbbd Author: Andrew Morton Date: Wed Feb 25 16:17:48 2004 -0800 [PATCH] HFS+ support [3]: http://sourceforge.net/projects/linux-hfsplus/ http://sourceforge.net/projects/linux-hfsplus/files/Linux%202.4.x%20patch/hfsplus%200.1/ http://sourceforge.net/projects/linux-hfsplus/files/Linux%202.4.x%20patch/hfsplus%200.2/ http://linux-hfsplus.cvs.sourceforge.net/viewvc/linux-hfsplus/linux/\ fs/hfsplus/bnode.c?r1=1.4&r2=1.5 Date: Thu Jun 6 09:45:14 2002 +0000 Use buffer cache instead of page cache in bnode.c. Cache inode extents. 
[4]: http://git.kernel.org/cgit/linux/kernel/git/\ stable/linux-stable.git/commit/?id=a5e3985fa014029eb6795664c704953720cc7f7d commit a5e3985fa014029eb6795664c704953720cc7f7d Author: Roman Zippel Date: Tue Sep 6 15:18:47 2005 -0700 [PATCH] hfs: remove debug code Signed-off-by: Hin-Tak Leung Signed-off-by: Sergei Antonov Reviewed-by: Anton Altaparmakov Reported-by: Sasha Levin Cc: Al Viro Cc: Christoph Hellwig Cc: Vyacheslav Dubeyko Cc: Sougata Santra Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit a84c256319585b774c77bb5731150e5eebb71566 Author: Noa Osherovich Date: Thu Jul 30 17:34:24 2015 +0300 IB/mlx4: Use correct SL on AH query under RoCE [ Upstream commit 5e99b139f1b68acd65e36515ca347b03856dfb5a ] The mlx4 IB driver implementation for ib_query_ah used a wrong offset (28 instead of 29) when link type is Ethernet. Fixed to use the correct one. Fixes: fa417f7b520e ('IB/mlx4: Add support for IBoE') Signed-off-by: Shani Michaeli Signed-off-by: Noa Osherovich Signed-off-by: Or Gerlitz Signed-off-by: Doug Ledford Signed-off-by: Sasha Levin commit 64665b05f6cd361f324eddd849cae977a5f26d2e Author: Jack Morgenstein Date: Thu Jul 30 17:34:23 2015 +0300 IB/mlx4: Forbid using sysfs to change RoCE pkeys [ Upstream commit 2b135db3e81301d0452e6aa107349abe67b097d6 ] The pkey mapping for RoCE must remain the default mapping: VFs: virtual index 0 = mapped to real index 0 (0xFFFF) All others indices: mapped to a real pkey index containing an invalid pkey. PF: virtual index i = real index i. Don't allow users to change these mappings using files found in sysfs. 
Fixes: c1e7e466120b ('IB/mlx4: Add iov directory in sysfs under the ib device') Signed-off-by: Jack Morgenstein Signed-off-by: Or Gerlitz Signed-off-by: Doug Ledford Signed-off-by: Sasha Levin commit edf5a729fde3db5342937462999f6414de2a662e Author: Yishai Hadas Date: Thu Aug 13 18:32:03 2015 +0300 IB/uverbs: Fix race between ib_uverbs_open and remove_one [ Upstream commit 35d4a0b63dc0c6d1177d4f532a9deae958f0662c ] Fixes: 2a72f212263701b927559f6850446421d5906c41 ("IB/uverbs: Remove dev_table") Before this commit there was a device look-up table that was protected by a spin_lock used by ib_uverbs_open and by ib_uverbs_remove_one. When it was dropped and container_of was used instead, it enabled the race with remove_one as dev might be freed just after: dev = container_of(inode->i_cdev, struct ib_uverbs_device, cdev) but before the kref_get. In addition, this buggy patch added some dead code as container_of(x,y,z) can never be NULL and so dev can never be NULL. As a result the comment above ib_uverbs_open saying "the open method will either immediately run -ENXIO" is wrong as it can never happen. The solution follows Jason Gunthorpe suggestion from below URL: https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg25692.html cdev will hold a kref on the parent (the containing structure, ib_uverbs_device) and only when that kref is released it is guaranteed that open will never be called again. In addition, fixes the active count scheme to use an atomic not a kref to prevent WARN_ON as pointed by above comment from Jason. 
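The remark that container_of(x,y,z) can never be NULL can be illustrated in userspace. The sketch below is a model, not the kernel code: `container_of` is redefined locally with `offsetof`, and `struct fake_cdev`, `struct fake_uverbs_device` and `dev_from_cdev()` are hypothetical stand-ins. The macro only subtracts a fixed offset from a member pointer, so for any valid member pointer the result is the containing object and a NULL check on it is dead code. It also gives no protection if the containing object has already been freed, which is why the lifetime has to be guaranteed by a reference held elsewhere.

```c
#include <stddef.h>

/* Userspace re-definition for illustration only. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-ins for struct cdev and struct ib_uverbs_device. */
struct fake_cdev {
	int dummy;
};

struct fake_uverbs_device {
	int refcount;
	struct fake_cdev cdev;  /* embedded member, as in the real driver */
};

/* Recover the containing device from a pointer to its embedded cdev. */
struct fake_uverbs_device *dev_from_cdev(struct fake_cdev *cdev)
{
	return container_of(cdev, struct fake_uverbs_device, cdev);
}
```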
Signed-off-by: Yishai Hadas Signed-off-by: Shachar Raindel Reviewed-by: Jason Gunthorpe Signed-off-by: Doug Ledford Signed-off-by: Sasha Levin commit 8e75f06a93a76a04ee30400c4cbdecc190d3b25d Author: Christoph Hellwig Date: Wed Aug 26 11:00:37 2015 +0200 IB/uverbs: reject invalid or unknown opcodes [ Upstream commit b632ffa7cee439ba5dce3b3bc4a5cbe2b3e20133 ] We have many WR opcodes that are only supported in kernel space and/or require optional information to be copied into the WR structure. Reject all those not explicitly handled so that we can't pass invalid information to drivers. Cc: stable@vger.kernel.org Signed-off-by: Christoph Hellwig Reviewed-by: Jason Gunthorpe Reviewed-by: Sagi Grimberg Signed-off-by: Doug Ledford Signed-off-by: Sasha Levin commit b027811d9167a4d1e34ae5fc9d5b4e3fca0c5595 Author: Mike Marciniszyn Date: Tue Jul 21 08:36:07 2015 -0400 IB/qib: Change lkey table allocation to support more MRs [ Upstream commit d6f1c17e162b2a11e708f28fa93f2f79c164b442 ] The lkey table is allocated with a get_user_pages() with an order based on a number of index bits from a module parameter. The underlying kernel code cannot allocate that many contiguous pages. There is no reason the underlying memory needs to be physically contiguous. This patch: - switches the allocation/deallocation to vmalloc/vfree - caps the number of bits to 23 to ensure at least 1 generation bit (this matches the module parameter description) Cc: stable@vger.kernel.org Reviewed-by: Vinit Agnihotri Signed-off-by: Mike Marciniszyn Signed-off-by: Doug Ledford Signed-off-by: Sasha Levin commit 3d7f1ecf1a2c421828043019454d79e353a3bb06 Author: Hin-Tak Leung Date: Wed Sep 9 15:38:07 2015 -0700 hfs: fix B-tree corruption after insertion at position 0 [ Upstream commit b4cc0efea4f0bfa2477c56af406cfcf3d3e58680 ] Fix B-tree corruption when a new record is inserted at position 0 in the node in hfs_brec_insert().
This is an identical change to the corresponding hfs b-tree code to Sergei Antonov's "hfsplus: fix B-tree corruption after insertion at position 0", to keep similar code paths in the hfs and hfsplus drivers in sync, where appropriate. Signed-off-by: Hin-Tak Leung Cc: Sergei Antonov Cc: Joe Perches Reviewed-by: Vyacheslav Dubeyko Cc: Anton Altaparmakov Cc: Al Viro Cc: Christoph Hellwig Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit 9736e5f17bad42116c824c2a2d405631b20b095d Author: NeilBrown Date: Mon Jul 6 17:37:49 2015 +1000 md/raid10: always set reshape_safe when initializing reshape_position. [ Upstream commit 299b0685e31c9f3dcc2d58ee3beca761a40b44b3 ] 'reshape_position' tracks where in the reshape we have reached. 'reshape_safe' tracks where in the reshape we have safely recorded in the metadata. These are compared to determine when to update the metadata. So it is important that reshape_safe is initialised properly. Currently it isn't. When starting a reshape from the beginning it usually has the correct value by luck. But when reducing the number of devices in a RAID10, it has the wrong value and this leads to the metadata not being updated correctly. This can lead to corruption if the reshape is not allowed to complete. This patch is suitable for any -stable kernel which supports RAID10 reshape, which is 3.5 and later. Fixes: 3ea7daa5d7fd ("md/raid10: add reshape support") Cc: stable@vger.kernel.org (v3.5+ please wait for -final to be out for 2 weeks) Signed-off-by: NeilBrown Signed-off-by: Sasha Levin commit 21901ab75018a6df6aa41b1acd3331b590f03908 Author: Jialing Fu Date: Fri Aug 28 11:13:09 2015 +0800 mmc: core: fix race condition in mmc_wait_data_done [ Upstream commit 71f8a4b81d040b3d094424197ca2f1bf811b1245 ] The following panic is captured in ker3.14, but the issue still exists in latest kernel. 
--------------------------------------------------------------------- [ 20.738217] c0 3136 (Compiler) Unable to handle kernel NULL pointer dereference at virtual address 00000578 ...... [ 20.738499] c0 3136 (Compiler) PC is at _raw_spin_lock_irqsave+0x24/0x60 [ 20.738527] c0 3136 (Compiler) LR is at _raw_spin_lock_irqsave+0x20/0x60 [ 20.740134] c0 3136 (Compiler) Call trace: [ 20.740165] c0 3136 (Compiler) [] _raw_spin_lock_irqsave+0x24/0x60 [ 20.740200] c0 3136 (Compiler) [] __wake_up+0x1c/0x54 [ 20.740230] c0 3136 (Compiler) [] mmc_wait_data_done+0x28/0x34 [ 20.740262] c0 3136 (Compiler) [] mmc_request_done+0xa4/0x220 [ 20.740314] c0 3136 (Compiler) [] sdhci_tasklet_finish+0xac/0x264 [ 20.740352] c0 3136 (Compiler) [] tasklet_action+0xa0/0x158 [ 20.740382] c0 3136 (Compiler) [] __do_softirq+0x10c/0x2e4 [ 20.740411] c0 3136 (Compiler) [] irq_exit+0x8c/0xc0 [ 20.740439] c0 3136 (Compiler) [] handle_IRQ+0x48/0xac [ 20.740469] c0 3136 (Compiler) [] gic_handle_irq+0x38/0x7c ---------------------------------------------------------------------- Because in SMP, "mrq" has race condition between below two paths: path1: CPU0: static void mmc_wait_data_done(struct mmc_request *mrq) { mrq->host->context_info.is_done_rcv = true; // // If CPU0 has just finished "is_done_rcv = true" in path1, and at // this moment, IRQ or ICache line missing happens in CPU0. // What happens in CPU1 (path2)? // // If the mmcqd thread in CPU1(path2) hasn't entered to sleep mode: // path2 would have chance to break from wait_event_interruptible // in mmc_wait_for_data_req_done and continue to run for next // mmc_request (mmc_blk_rw_rq_prep). // // Within mmc_blk_rq_prep, mrq is cleared to 0. // If below line still gets host from "mrq" as the result of // compiler, the panic happens as we traced. wake_up_interruptible(&mrq->host->context_info.wait); } path2: CPU1: static int mmc_wait_for_data_req_done(... { ... 
while (1) { wait_event_interruptible(context_info->wait, (context_info->is_done_rcv || context_info->is_new_req)); static void mmc_blk_rw_rq_prep(... { ... memset(brq, 0, sizeof(struct mmc_blk_request)); This issue happens very coincidentally; however adding mdelay(1) in mmc_wait_data_done as below could duplicate it easily. static void mmc_wait_data_done(struct mmc_request *mrq) { mrq->host->context_info.is_done_rcv = true; + mdelay(1); wake_up_interruptible(&mrq->host->context_info.wait); } At runtime, IRQ or ICache line missing may just happen at the same place of the mdelay(1). This patch gets the mmc_context_info at the beginning of function, it can avoid this race condition. Signed-off-by: Jialing Fu Tested-by: Shawn Lin Fixes: 2220eedfd7ae ("mmc: fix async request mechanism ....") Signed-off-by: Shawn Lin Signed-off-by: Ulf Hansson Signed-off-by: Sasha Levin commit 5068ce153a5a5bf52077cc0045f126001eea08ac Author: Jann Horn Date: Wed Sep 9 15:38:28 2015 -0700 fs: if a coredump already exists, unlink and recreate with O_EXCL [ Upstream commit fbb1816942c04429e85dbf4c1a080accc534299e ] It was possible for an attacking user to trick root (or another user) into writing his coredumps into an attacker-readable, pre-existing file using rename() or link(), causing the disclosure of secret data from the victim process' virtual memory. Depending on the configuration, it was also possible to trick root into overwriting system files with coredumps. Fix that issue by never writing coredumps into existing files. Requirements for the attack: - The attack only applies if the victim's process has a nonzero RLIMIT_CORE and is dumpable. - The attacker can trick the victim into coredumping into an attacker-writable directory D, either because the core_pattern is relative and the victim's cwd is attacker-writable or because an absolute core_pattern pointing to a world-writable directory is used. 
- The attacker has one of these: A: on a system with protected_hardlinks=0: execute access to a folder containing a victim-owned, attacker-readable file on the same partition as D, and the victim-owned file will be deleted before the main part of the attack takes place. (In practice, there are lots of files that fulfill this condition, e.g. entries in Debian's /var/lib/dpkg/info/.) This does not apply to most Linux systems because most distros set protected_hardlinks=1. B: on a system with protected_hardlinks=1: execute access to a folder containing a victim-owned, attacker-readable and attacker-writable file on the same partition as D, and the victim-owned file will be deleted before the main part of the attack takes place. (This seems to be uncommon.) C: on any system, independent of protected_hardlinks: write access to a non-sticky folder containing a victim-owned, attacker-readable file on the same partition as D (This seems to be uncommon.) The basic idea is that the attacker moves the victim-owned file to where he expects the victim process to dump its core. The victim process dumps its core into the existing file, and the attacker reads the coredump from it. If the attacker can't move the file because he does not have write access to the containing directory, he can instead link the file to a directory he controls, then wait for the original link to the file to be deleted (because the kernel checks that the link count of the corefile is 1). A less reliable variant that requires D to be non-sticky works with link() and does not require deletion of the original link: link() the file into D, but then unlink() it directly before the kernel performs the link count check. On systems with protected_hardlinks=0, this variant allows an attacker to not only gain information from coredumps, but also clobber existing, victim-writable files with coredumps. (This could theoretically lead to a privilege escalation.) 
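The idea of never writing into an existing file can be sketched with plain POSIX calls. This is a userspace analogue under stated assumptions, not the kernel patch: `open_fresh_file()` is a hypothetical helper and the 0600 mode is chosen only for the example. It removes whatever name is already present and then creates the file exclusively, so a file planted by someone else via rename() or link() is never reused.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: never write into a pre-existing file.
 * Remove whatever is at `path`, then create a new file exclusively.
 * Returns a file descriptor, or -1 with errno set.
 */
int open_fresh_file(const char *path)
{
	/* Drop any existing name; a missing file is not an error. */
	if (unlink(path) < 0 && errno != ENOENT)
		return -1;

	/*
	 * O_CREAT|O_EXCL makes open() fail with EEXIST if something
	 * recreated the name between unlink() and open(), instead of
	 * opening that file.
	 */
	return open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
}
```

If the exclusive create fails, the caller gets an error rather than a descriptor for a file it did not create, which is the property the attack variants above depend on breaking.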
Signed-off-by: Jann Horn Cc: Kees Cook Cc: Al Viro Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit e6064491f59189315db9324e685ad885806b69a0 Author: Jaewon Kim Date: Tue Sep 8 15:02:21 2015 -0700 vmscan: fix increasing nr_isolated incurred by putback unevictable pages [ Upstream commit c54839a722a02818677bcabe57e957f0ce4f841d ] reclaim_clean_pages_from_list() assumes that shrink_page_list() returns the number of pages removed from the candidate list. But shrink_page_list() puts back mlocked pages without passing them to the caller and without counting them as nr_reclaimed. This increases nr_isolated. To fix this, this patch changes shrink_page_list() to pass unevictable pages back to the caller. The caller will take care of those pages. Minchan said: It fixes two issues. 1. With unevictable page, cma_alloc will be successful. Exactly speaking, cma_alloc of current kernel will fail due to unevictable pages. 2. fix leaking of NR_ISOLATED counter of vmstat With it, too_many_isolated works. Otherwise, it could hang until the process gets SIGKILL. Signed-off-by: Jaewon Kim Acked-by: Minchan Kim Cc: Mel Gorman Acked-by: Vlastimil Babka Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit c9cfe46029a6c6f1ea337b8fe7ff98a544e9a9ec Author: Helge Deller Date: Thu Sep 3 22:45:21 2015 +0200 parisc: Filter out spurious interrupts in PA-RISC irq handler [ Upstream commit b1b4e435e4ef7de77f07bf2a42c8380b960c2d44 ] When detecting a serial port on newer PA-RISC machines (with iosapic) we have a long way to go to find the right IRQ line, registering it, then registering the serial port and the irq handler for the serial port. During this phase spurious interrupts for the serial port may happen which then crashes the kernel because the action handler might not have been set up yet. So, basically it's a race condition between the serial port hardware and the CPU which sets up the necessary fields in the irq structs.
The main reason for this race is that we unmask the serial port irqs too early without having set up everything properly before (which isn't easily possible because we need the IRQ number to register the serial ports). This patch is a work-around for this problem. It adds checks to the CPU irq handler to verify if the IRQ action field has been initialized already. If not, we just skip this interrupt (which isn't critical for a serial port at bootup). The real fix would probably involve rewriting all PA-RISC specific IRQ code (for CPU, IOSAPIC, GSC and EISA) to use IRQ domains with proper parenting of the irq chips and proper irq enabling along this line. This bug has been in the PA-RISC port since the beginning, but the crashes happened very rarely with currently used hardware. But on the latest machine which I bought (a C8000 workstation), which uses the fastest CPUs (4 x PA8900, 1GHz) and which has the largest possible L1 cache size (64MB each), the kernel crashed at every boot because of this race. So, without this patch the machine would currently be unusable.

For the record, here is the flow logic:
1. serial_init_chip() in 8250_gsc.c calls iosapic_serial_irq().
2. iosapic_serial_irq() calls txn_alloc_irq() to find the irq.
3. iosapic_serial_irq() calls cpu_claim_irq() to register the CPU irq
4. cpu_claim_irq() unmasks the CPU irq (which it shouldn't!)
5. serial_init_chip() then registers the 8250 port.

Problems:
- In step 4 the CPU irq shouldn't have been registered yet, but only after step 5
- If a serial irq happens after step 4 and before step 5 has finished, the kernel will crash

Signed-off-by: Helge Deller Signed-off-by: Sasha Levin commit 4ea349a7396131b3b4e9fafb86d60f16df666375 Author: John David Anglin Date: Mon Sep 7 20:13:28 2015 -0400 parisc: Use double word condition in 64bit CAS operation [ Upstream commit 1b59ddfcf1678de38a1f8ca9fb8ea5eebeff1843 ] The attached change fixes the condition used in the "sub" instruction. A double word comparison is needed.
This fixes the 64-bit LWS CAS operation on 64-bit kernels. I can now enable 64-bit atomic support in GCC. Cc: Signed-off-by: John David Anglin Signed-off-by: Helge Deller Signed-off-by: Sasha Levin commit 8c87cb0912f641396f046756cf0744221eb5282c Author: Trond Myklebust Date: Mon Aug 17 12:57:07 2015 -0500 NFS: nfs_set_pgio_error sometimes misses errors [ Upstream commit e9ae58aeee8842a50f7e199d602a5ccb2e41a95f ] We should ensure that we always set the pgio_header's error field if a READ or WRITE RPC call returns an error. The current code depends on 'hdr->good_bytes' always being initialised to a large value, which is not always done correctly by callers. When this happens, applications may end up missing important errors. Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust Signed-off-by: Sasha Levin commit 7730c1b9620d5b4887699d1b2ad9338fc63ca736 Author: Kinglong Mee Date: Sat Aug 15 21:52:10 2015 +0800 NFS: Fix a NULL pointer dereference of migration recovery ops for v4.2 client [ Upstream commit 18e3b739fdc826481c6a1335ce0c5b19b3d415da ] ---Steps to Reproduce-- # cat /etc/exports /nfs/referal *(rw,insecure,no_subtree_check,no_root_squash,crossmnt) /nfs/old *(ro,insecure,subtree_check,root_squash,crossmnt) # mount -t nfs nfs-server:/nfs/ /mnt/ # ll /mnt/*/ # cat /etc/exports /nfs/referal *(rw,insecure,no_subtree_check,no_root_squash,crossmnt,refer=/nfs/old/@nfs-server) /nfs/old *(ro,insecure,subtree_check,root_squash,crossmnt) # service nfs restart # ll /mnt/*/ --->>>>> oops here [ 5123.102925] BUG: unable to handle kernel NULL pointer dereference at (null) [ 5123.103363] IP: [] nfs4_proc_get_locations+0x9b/0x120 [nfsv4] [ 5123.103752] PGD 587b9067 PUD 3cbf5067 PMD 0 [ 5123.104131] Oops: 0000 [#1] [ 5123.104529] Modules linked in: nfsv4(OE) nfs(OE) fscache(E) nfsd(OE) xfs libcrc32c iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi coretemp crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel ppdev vmw_balloon parport_pc parport i2c_piix4 shpchp 
auth_rpcgss nfs_acl vmw_vmci lockd grace sunrpc vmwgfx drm_kms_helper ttm drm mptspi serio_raw scsi_transport_spi e1000 mptscsih mptbase ata_generic pata_acpi [last unloaded: nfsd] [ 5123.105887] CPU: 0 PID: 15853 Comm: ::1-manager Tainted: G OE 4.2.0-rc6+ #214 [ 5123.106358] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/20/2014 [ 5123.106860] task: ffff88007620f300 ti: ffff88005877c000 task.ti: ffff88005877c000 [ 5123.107363] RIP: 0010:[] [] nfs4_proc_get_locations+0x9b/0x120 [nfsv4] [ 5123.107909] RSP: 0018:ffff88005877fdb8 EFLAGS: 00010246 [ 5123.108435] RAX: ffff880053f3bc00 RBX: ffff88006ce6c908 RCX: ffff880053a0d240 [ 5123.108968] RDX: ffffea0000e6d940 RSI: ffff8800399a0000 RDI: ffff88006ce6c908 [ 5123.109503] RBP: ffff88005877fe28 R08: ffffffff81c708a0 R09: 0000000000000000 [ 5123.110045] R10: 00000000000001a2 R11: ffff88003ba7f5c8 R12: ffff880054c55800 [ 5123.110618] R13: 0000000000000000 R14: ffff880053a0d240 R15: ffff880053a0d240 [ 5123.111169] FS: 0000000000000000(0000) GS:ffffffff81c27000(0000) knlGS:0000000000000000 [ 5123.111726] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 5123.112286] CR2: 0000000000000000 CR3: 0000000054cac000 CR4: 00000000001406f0 [ 5123.112888] Stack: [ 5123.113458] ffffea0000e6d940 ffff8800399a0000 00000000000167d0 0000000000000000 [ 5123.114049] 0000000000000000 0000000000000000 0000000000000000 00000000a7ec82c6 [ 5123.114662] ffff88005877fe18 ffffea0000e6d940 ffff8800399a0000 ffff880054c55800 [ 5123.115264] Call Trace: [ 5123.115868] [] nfs4_try_migration+0xbb/0x220 [nfsv4] [ 5123.116487] [] nfs4_run_state_manager+0x4ab/0x7b0 [nfsv4] [ 5123.117104] [] ? nfs4_do_reclaim+0x510/0x510 [nfsv4] [ 5123.117813] [] kthread+0xd7/0xf0 [ 5123.118456] [] ? kthread_worker_fn+0x160/0x160 [ 5123.119108] [] ret_from_fork+0x3f/0x70 [ 5123.119723] [] ? 
kthread_worker_fn+0x160/0x160 [ 5123.120329] Code: 4c 8b 6a 58 74 17 eb 52 48 8d 55 a8 89 c6 4c 89 e7 e8 4a b5 ff ff 8b 45 b0 85 c0 74 1c 4c 89 f9 48 8b 55 90 48 8b 75 98 48 89 df <41> ff 55 00 3d e8 d8 ff ff 41 89 c6 74 cf 48 8b 4d c8 65 48 33 [ 5123.121643] RIP [] nfs4_proc_get_locations+0x9b/0x120 [nfsv4] [ 5123.122308] RSP [ 5123.122942] CR2: 0000000000000000 Fixes: ec011fe847 ("NFS: Introduce a vector of migration recovery ops") Cc: stable@vger.kernel.org # v3.13+ Signed-off-by: Kinglong Mee Signed-off-by: Trond Myklebust Signed-off-by: Sasha Levin commit 47bd9168e93703d8ab8104a9d0b858a9cb2dd9c3 Author: NeilBrown Date: Thu Jul 30 13:00:56 2015 +1000 NFSv4: don't set SETATTR for O_RDONLY|O_EXCL [ Upstream commit efcbc04e16dfa95fef76309f89710dd1d99a5453 ] It is unusual to combine the open flags O_RDONLY and O_EXCL, but it appears that libre-office does just that. [pid 3250] stat("/home/USER/.config", {st_mode=S_IFDIR|0700, st_size=8192, ...}) = 0 [pid 3250] open("/home/USER/.config/libreoffice/4-suse/user/extensions/buildid", O_RDONLY|O_EXCL NFSv4 takes O_EXCL as a sign that a setattr command should be sent, probably to reset the timestamps. When it was an O_RDONLY open, the SETATTR command does not identify any actual attributes to change. If no delegation was provided to the open, the SETATTR uses the all-zeros stateid and the request is accepted (at least by the Linux NFS server - no harm, no foul). If a read-delegation was provided, this is used in the SETATTR request, and a Netapp filer will justifiably claim NFS4ERR_BAD_STATEID, which the Linux client takes as a sign to retry - indefinitely. So only treat O_EXCL specially if O_CREAT was also given. 
Signed-off-by: NeilBrown Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust Signed-off-by: Sasha Levin commit 04b4d54f032051de6259e859677d4712516eece3 Author: Filipe Manana Date: Wed Aug 12 11:54:35 2015 +0100 Btrfs: check if previous transaction aborted to avoid fs corruption [ Upstream commit 1f9b8c8fbc9a4d029760b16f477b9d15500e3a34 ] While we are committing a transaction, it's possible the previous one is still finishing its commit and therefore we wait for it to finish first. However we were not checking if that previous transaction ended up getting aborted after we waited for it to commit, so we ended up committing the current transaction which can lead to fs corruption because the new superblock can point to trees that have had one or more nodes/leafs that were never durably persisted. The following sequence diagram exemplifies how this is possible: CPU 0 CPU 1 transaction N starts (...) btrfs_commit_transaction(N) cur_trans->state = TRANS_STATE_COMMIT_START; (...) cur_trans->state = TRANS_STATE_COMMIT_DOING; (...) cur_trans->state = TRANS_STATE_UNBLOCKED; root->fs_info->running_transaction = NULL; btrfs_start_transaction() --> starts transaction N + 1 btrfs_write_and_wait_transaction(trans, root); --> starts writing all new or COWed ebs created at transaction N creates some new ebs, COWs some existing ebs but doesn't COW or deletes eb X btrfs_commit_transaction(N + 1) (...) cur_trans->state = TRANS_STATE_COMMIT_START; (...) 
wait_for_commit(root, prev_trans); --> prev_trans == transaction N btrfs_write_and_wait_transaction() continues writing ebs --> fails writing eb X, we abort transaction N and set bit BTRFS_FS_STATE_ERROR on fs_info->fs_state, so no new transactions can start after setting that bit cleanup_transaction() btrfs_cleanup_one_transaction() wakes up task at CPU 1 continues, doesn't abort because cur_trans->aborted (transaction N + 1) is zero, and no checks for bit BTRFS_FS_STATE_ERROR in fs_info->fs_state are made btrfs_write_and_wait_transaction(trans, root); --> succeeds, no errors during writeback write_ctree_super(trans, root, 0); --> succeeds --> we have now a superblock that points us to some root that uses eb X, which was never written to disk In this scenario, future attempts to read eb X from disk result in an error message like "parent transid verify failed on X wanted Y found Z". So fix this by aborting the current transaction if, after waiting for the previous transaction, we verify that it was aborted. Cc: stable@vger.kernel.org Signed-off-by: Filipe Manana Reviewed-by: Josef Bacik Reviewed-by: Liu Bo Signed-off-by: Chris Mason Signed-off-by: Sasha Levin commit bf4f105946900888363a6a553ccf66bded7b0696 Author: Sakari Ailus Date: Fri Jun 12 20:06:23 2015 -0300 [media] v4l: omap3isp: Fix sub-device power management code [ Upstream commit 9d39f05490115bf145e5ea03c0b7ec9d3d015b01 ] Commit 813f5c0ac5cc ("media: Change media device link_notify behaviour") modified the media controller link setup notification API and updated the OMAP3 ISP driver accordingly. As a side effect it introduced a bug by turning power on after setting the link instead of before. This results in sub-devices not being powered down in some cases when they should be. Fix it.
Fixes: 813f5c0ac5cc [media] media: Change media device link_notify behaviour Signed-off-by: Sakari Ailus Cc: stable@vger.kernel.org # since v3.10 Signed-off-by: Laurent Pinchart Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin commit a2e1f66ec92a8bbf680229d2806bf55c17b33bf7 Author: David Härdeman Date: Tue May 19 19:03:12 2015 -0300 [media] rc-core: fix remove uevent generation [ Upstream commit a66b0c41ad277ae62a3ae6ac430a71882f899557 ] The input_dev is already gone when the rc device is being unregistered so checking for its presence only means that no remove uevent will be generated. Cc: stable@kernel.org Signed-off-by: David Härdeman Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin commit 946f8cba3e55fb62ab463b9ab381399857b61c36 Author: Minfei Huang Date: Sun Jul 12 20:18:42 2015 +0800 x86/mm: Initialize pmd_idx in page_table_range_init_count() [ Upstream commit 9962eea9e55f797f05f20ba6448929cab2a9f018 ] The variable pmd_idx is not initialized for the first iteration of the for loop. Assign the proper value which indexes the start address. Fixes: 719272c45b82 'x86, mm: only call early_ioremap_page_table_range_init() once' Signed-off-by: Minfei Huang Cc: tony.luck@intel.com Cc: wangnan0@huawei.com Cc: david.vrabel@citrix.com Reviewed-by: yinghai@kernel.org Link: http://lkml.kernel.org/r/1436703522-29552-1-git-send-email-mhuang@redhat.com Signed-off-by: Thomas Gleixner Signed-off-by: Sasha Levin commit 2f4bef0fa3324781de28f3621bb203271709927a Author: Jeffery Miller Date: Tue Sep 1 11:23:02 2015 -0400 Add radeon suspend/resume quirk for HP Compaq dc5750. [ Upstream commit 09bfda10e6efd7b65bcc29237bee1765ed779657 ] With the radeon driver loaded the HP Compaq dc5750 Small Form Factor machine fails to resume from suspend. Adding a quirk similar to other devices avoids the problem and the system resumes properly. 
Signed-off-by: Jeffery Miller Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit ea80798329bc5e341d8e4525c476ab7fa3f827d5 Author: Jann Horn Date: Fri Sep 11 16:27:27 2015 +0200 CIFS: fix type confusion in copy offload ioctl [ Upstream commit 4c17a6d56bb0cad3066a714e94f7185a24b40f49 ] This might lead to local privilege escalation (code execution as kernel) for systems where the following conditions are met: - CONFIG_CIFS_SMB2 and CONFIG_CIFS_POSIX are enabled - a cifs filesystem is mounted where: - the mount option "vers" was used and set to a value >=2.0 - the attacker has write access to at least one file on the filesystem To attack this, an attacker would have to guess the target_tcon pointer (but guessing wrong doesn't cause a crash, it just returns an error code) and win a narrow race. CC: Stable Signed-off-by: Jann Horn Signed-off-by: Steve French Signed-off-by: Sasha Levin commit 522cbcf35d106e4738bbbad5a613a81838468420 Author: Aneesh Kumar K.V Date: Tue Sep 15 12:30:08 2015 +0530 powerpc/mm: Recompute hash value after a failed update [ Upstream commit 36b35d5d807b7e57aff7d08e63de8b17731ee211 ] If the secondary hash flag was set, we ended up modifying the hash value in the updatepp code path. Hence, after a failed updatepp, we would use a wrong hash value for the following hash insert. Fix this by recomputing the hash before the insert. Without this patch we can end up using a wrong slot number in the Linux PTE. That can result in us missing a hash PTE update or invalidate, which can cause memory corruption or even a machine check.
Fixes: 6d492ecc6489 ("powerpc/THP: Add code to handle HPTE faults for hugepages") Cc: stable@vger.kernel.org # v3.11+ Signed-off-by: Aneesh Kumar K.V Reviewed-by: Paul Mackerras Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin commit 5fc706cc729a3e9ee46380054414d4bf65758911 Author: Thomas Huth Date: Fri Jul 17 12:46:58 2015 +0200 powerpc/rtas: Introduce rtas_get_sensor_fast() for IRQ handlers [ Upstream commit 1c2cb594441d02815d304cccec9742ff5c707495 ] The EPOW interrupt handler uses rtas_get_sensor(), which in turn uses rtas_busy_delay() to wait for RTAS to become ready if necessary. But rtas_busy_delay() is annotated with might_sleep() and thus may not be used by interrupt handlers like the EPOW handler! This leads to the following BUG when CONFIG_DEBUG_ATOMIC_SLEEP is enabled: BUG: sleeping function called from invalid context at arch/powerpc/kernel/rtas.c:496 in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/1 CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.2.0-rc2-thuth #6 Call Trace: [c00000007ffe7b90] [c000000000807670] dump_stack+0xa0/0xdc (unreliable) [c00000007ffe7bc0] [c0000000000e1f14] ___might_sleep+0x134/0x180 [c00000007ffe7c20] [c00000000002aec0] rtas_busy_delay+0x30/0xd0 [c00000007ffe7c50] [c00000000002bde4] rtas_get_sensor+0x74/0xe0 [c00000007ffe7ce0] [c000000000083264] ras_epow_interrupt+0x44/0x450 [c00000007ffe7d90] [c000000000120260] handle_irq_event_percpu+0xa0/0x300 [c00000007ffe7e70] [c000000000120524] handle_irq_event+0x64/0xc0 [c00000007ffe7eb0] [c000000000124dbc] handle_fasteoi_irq+0xec/0x260 [c00000007ffe7ef0] [c00000000011f4f0] generic_handle_irq+0x50/0x80 [c00000007ffe7f20] [c000000000010f3c] __do_irq+0x8c/0x200 [c00000007ffe7f90] [c0000000000236cc] call_do_irq+0x14/0x24 [c00000007e6f39e0] [c000000000011144] do_IRQ+0x94/0x110 [c00000007e6f3a30] [c000000000002594] hardware_interrupt_common+0x114/0x180 Fix this issue by introducing a new rtas_get_sensor_fast() function that does not use rtas_busy_delay() - and
thus can only be used for sensors that do not cause a BUSY condition - known as "fast" sensors. The EPOW sensor is defined to be "fast" in sPAPR - mpe. Fixes: 587f83e8dd50 ("powerpc/pseries: Use rtas_get_sensor in RAS code") Signed-off-by: Thomas Huth Reviewed-by: Nathan Fontenot Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin commit e937b1acb208f5615a1b9c13f38c4943a86ddb31 Author: Michael Ellerman Date: Fri Aug 7 16:19:43 2015 +1000 powerpc/mm: Fix pte_pagesize_index() crash on 4K w/64K hash [ Upstream commit 74b5037baa2011a2799e2c43adde7d171b072f9e ] The powerpc kernel can be built to have either a 4K PAGE_SIZE or a 64K PAGE_SIZE. However when built with a 4K PAGE_SIZE there is an additional config option which can be enabled, PPC_HAS_HASH_64K, which means the kernel also knows how to hash a 64K page even though the base PAGE_SIZE is 4K. This is used in one obscure configuration, to support 64K pages for SPU local store on the Cell processor when the rest of the kernel is using 4K pages. In this configuration, pte_pagesize_index() is defined to just pass through its arguments to get_slice_psize(). However pte_pagesize_index() is called for both user and kernel addresses, whereas get_slice_psize() only knows how to handle user addresses. This has been broken forever, however until recently it happened to work. That was because in get_slice_psize() the large kernel address would cause the right shift of the slice mask to return zero. However in commit 7aa0727f3302 ("powerpc/mm: Increase the slice range to 64TB"), the get_slice_psize() code was changed so that instead of a right shift we do an array lookup based on the address. When passed a kernel address this means we index way off the end of the slice array and return random junk. That is only fatal if we happen to hit something non-zero, but when we do return a non-zero value we confuse the MMU code and eventually cause a check stop. This fix is ugly, but simple. 
When we're called for a kernel address we return 4K, which is always correct in this configuration, otherwise we use the slice mask. Fixes: 7aa0727f3302 ("powerpc/mm: Increase the slice range to 64TB") Reported-by: Cyril Bur Signed-off-by: Michael Ellerman Reviewed-by: Aneesh Kumar K.V Signed-off-by: Sasha Levin commit e0e8eb9ea24f1e721fd19b570a97d111dba0fa0e Author: Takashi Iwai Date: Thu Aug 13 18:05:06 2015 +0200 ALSA: hda - Use ALC880_FIXUP_FUJITSU for FSC Amilo M1437 [ Upstream commit a161574e200ae63a5042120e0d8c36830e81bde3 ] It turned out that the machine has a bass speaker, so use the correct fixup entry. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=102501 Cc: Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin commit 48ab75cf0d5f50aaca5ff5666d39046155edc4e1 Author: Takashi Iwai Date: Thu Aug 13 18:02:39 2015 +0200 ALSA: hda - Enable headphone jack detect on old Fujitsu laptops [ Upstream commit bb148bdeb0ab16fc0ae8009799471e4d7180073b ] According to the bug report, FSC Amilo laptops with ALC880 can detect the headphone jack but currently the driver disables it. This is partly intentional, as non-working jack detect was reported in the past. Let's enable it now. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=102501 Cc: Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin commit 638f310181bf4629bd42d85e0a0d9fe7481ebdea Author: Takashi Iwai Date: Thu Sep 3 22:20:00 2015 -0700 Input: evdev - do not report errors form flush() [ Upstream commit eb38f3a4f6e86f8bb10a3217ebd85ecc5d763aae ] We've got bug reports showing the old systemd-logind (at least systemd-210) aborting unexpectedly, and this turned out to be because of an invalid error code from close() call to evdev devices. close() is supposed to return only either EINTR or EBADFD, while the device returned ENODEV. logind was overreacting to it and decided to kill itself when an unexpected error code was received. What a tragedy.
The bad error code comes from flush fops, and actually evdev_flush() returns ENODEV when device is disconnected or client's access to it is revoked. But in these cases the fact that flush did not actually happen is not an error, but rather normal behavior. For non-disconnected devices result of flush is also not that interesting as there is no potential of data loss and even if it fails application has no way of handling the error. Because of that we are better off always returning success from evdev_flush(). Also returning EINTR from flush()/close() is discouraged (as it is not clear how application should handle this error), so let's stop taking evdev->mutex interruptibly. Bugzilla: http://bugzilla.suse.com/show_bug.cgi?id=939834 Cc: Signed-off-by: Takashi Iwai Signed-off-by: Dmitry Torokhov Signed-off-by: Sasha Levin commit 325f791748d7fab63f859c518f84f8320d835241 Author: Marc Zyngier Date: Wed Sep 16 16:18:59 2015 +0100 arm64: KVM: Disable virtual timer even if the guest is not using it [ Upstream commit c4cbba9fa078f55d9f6d081dbb4aec7cf969e7c7 ] When running a guest with the architected timer disabled (with QEMU and the kernel_irqchip=off option, for example), it is important to make sure the timer gets turned off. Otherwise, the guest may try to enable it anyway, leading to a screaming HW interrupt. The fix is to unconditionally turn off the virtual timer on guest exit. Cc: stable@vger.kernel.org Reviewed-by: Christoffer Dall Signed-off-by: Marc Zyngier Signed-off-by: Sasha Levin commit be9d2b0eaf8d401979bbbb9e353beecc4ac8650a Author: Will Deacon Date: Tue Mar 17 12:15:02 2015 +0000 arm64: errata: add module build workaround for erratum #843419 [ Upstream commit df057cc7b4fa59e9b55f07ffdb6c62bf02e99a00 ] Cortex-A53 processors <= r0p4 are affected by erratum #843419 which can lead to a memory access using an incorrect address in certain sequences headed by an ADRP instruction. 
There is a linker fix to generate veneers for ADRP instructions, but this doesn't work for kernel modules which are built as unlinked ELF objects. This patch adds a new config option for the erratum which, when enabled, builds kernel modules with the mcmodel=large flag. This uses absolute addressing for all kernel symbols, thereby removing the use of ADRP as a PC-relative form of addressing. The ADRP relocs are removed from the module loader so that we fail to load any potentially affected modules. Cc: Acked-by: Catalin Marinas Signed-off-by: Will Deacon Signed-off-by: Sasha Levin commit dbb4b0e5a9ad6ec3221bcea438e585d6410df22d Author: Will Deacon Date: Wed Sep 2 18:49:28 2015 +0100 arm64: head.S: initialise mdcr_el2 in el2_setup [ Upstream commit d10bcd473301888f957ec4b6b12aa3621be78d59 ] When entering the kernel at EL2, we fail to initialise the MDCR_EL2 register which controls debug access and PMU capabilities at EL1. This patch ensures that the register is initialised so that all traps are disabled and all the PMU counters are available to the host. When a guest is scheduled, KVM takes care to configure trapping appropriately. Cc: Acked-by: Marc Zyngier Signed-off-by: Will Deacon Signed-off-by: Sasha Levin commit 0e49849f4f3e820b5c59fa8cb56884506b313f78 Author: Will Deacon Date: Tue Sep 15 12:07:06 2015 +0100 arm64: compat: fix vfp save/restore across signal handlers in big-endian [ Upstream commit bdec97a855ef1e239f130f7a11584721c9a1bf04 ] When saving/restoring the VFP registers from a compat (AArch32) signal frame, we rely on the compat registers forming a prefix of the native register file and therefore make use of copy_{to,from}_user to transfer between the native fpsimd_state and the compat_vfp_sigframe. Unfortunately, this doesn't work so well in a big-endian environment. Our fpsimd save/restore code operates directly on 128-bit quantities (Q registers) whereas the compat_vfp_sigframe represents the registers as an array of 64-bit (D) registers. 
The architecture packs the compat D registers into the Q registers, with the least significant bytes holding the lower register. Consequently, we need to swap the 64-bit halves when converting between these two representations on a big-endian machine. This patch replaces the __copy_{to,from}_user invocations in our compat VFP signal handling code with explicit __put_user loops that operate on 64-bit values and swap them accordingly. Cc: Reviewed-by: Catalin Marinas Signed-off-by: Will Deacon Signed-off-by: Sasha Levin commit 698cebfe17a68a172a5513d8025c280ba5e8cf4a Author: Jeff Vander Stoep Date: Tue Aug 18 20:50:10 2015 +0100 arm64: kconfig: Move LIST_POISON to a safe value [ Upstream commit bf0c4e04732479f650ff59d1ee82de761c0071f0 ] Move the poison pointer offset to 0xdead000000000000, a recognized value that is not mappable by user-space exploits. Cc: Acked-by: Catalin Marinas Signed-off-by: Thierry Strudel Signed-off-by: Jeff Vander Stoep Signed-off-by: Will Deacon Signed-off-by: Sasha Levin commit 0e90b248e7871c8a8fc1af5e73055e2d15778a79 Author: Bob Copeland Date: Sat Jun 13 10:16:31 2015 -0400 mac80211: enable assoc check for mesh interfaces [ Upstream commit 3633ebebab2bbe88124388b7620442315c968e8f ] We already set a station to be associated when peering completes, both in user space and in the kernel. Thus we should always have an associated sta before sending data frames to that station. Failure to check assoc state can cause crashes in the lower-level driver due to transmitting unicast data frames before driver sta structures (e.g. ampdu state in ath9k) are initialized. This occurred when forwarding in the presence of fixed mesh paths: frames were transmitted to stations with whom we hadn't yet completed peering. 
Cc: stable@vger.kernel.org Reported-by: Alexis Green Tested-by: Jesse Jones Signed-off-by: Bob Copeland Signed-off-by: Johannes Berg Signed-off-by: Sasha Levin commit 39691ddadf465c59313fb27adb0dbe0975b2b5f5 Author: Jean Delvare Date: Tue Sep 1 18:07:41 2015 +0200 tg3: Fix temperature reporting [ Upstream commit d3d11fe08ccc9bff174fc958722b5661f0932486 ] The temperature registers appear to report values in degrees Celsius while the hwmon API mandates values to be exposed in millidegrees Celsius. Do the conversion so that the values reported by "sensors" are correct. Fixes: aed93e0bf493 ("tg3: Add hwmon support for temperature") Signed-off-by: Jean Delvare Cc: Prashant Sreedharan Cc: Michael Chan Cc: stable@vger.kernel.org [v3.6+] Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit e12419a52e738fb967e2c721ae5bf7df47e7840b Author: Eric W. Biederman Date: Mon Aug 10 17:35:07 2015 -0500 unshare: Unsharing a thread does not require unsharing a vm [ Upstream commit 12c641ab8270f787dfcce08b5f20ce8b65008096 ] The logic in the initial commit of unshare made creating a new thread group for a process contingent upon creating a new memory address space for that process. That is wrong. Two separate processes in different thread groups can share a memory address space and clone allows creation of such processes. This is significant because it was observed that mm_users > 1 does not mean that a process is multi-threaded, as reading /proc/PID/maps temporarily increments mm_users, which allows other processes to (accidentally) interfere with unshare() calls. Correct the checks in check_unshare_flags() to test for !thread_group_empty() for CLONE_THREAD, CLONE_SIGHAND, and CLONE_VM; for sighand->count > 1 for CLONE_SIGHAND and CLONE_VM; and for !current_is_single_threaded() instead of mm_users > 1 for CLONE_VM. By using the correct checks in unshare this removes the possibility of an accidental denial of service attack.
Additionally, using the correct checks in unshare ensures that only an explicit unshare(CLONE_VM) can possibly trigger the slow path of current_is_single_threaded(). As an explicit unshare(CLONE_VM) is pointless, it is not expected that many applications make that call. Cc: stable@vger.kernel.org Fixes: b2e0d98705e60e45bbb3c0032c48824ad7ae0704 userns: Implement unshare of the user namespace Reported-by: Ricky Zhou Reported-by: Kees Cook Reviewed-by: Kees Cook Signed-off-by: "Eric W. Biederman" Signed-off-by: Sasha Levin commit 89ca6eefdfad6b10eb0c7ee26fec15b61097ec41 Author: Ming Lei Date: Sun Aug 9 03:41:50 2015 -0400 blk-mq: fix buffer overflow when reading sysfs file of 'pending' [ Upstream commit 596f5aad2a704b72934e5abec1b1b4114c16f45b ] There may be so many pending requests that a buffer of PAGE_SIZE can't hold them all. One typical example is scsi-mq: the queue depth (.can_queue) of scsi_host and blk-mq is quite big but scsi_device's queue_depth (.cmd_per_lun) is a bit small, so it is quite easy to have lots of pending requests in the hw queue. This patch fixes the following warning and the related memory corruption. [ 359.025101] fill_read_buffer: blk_mq_hw_sysfs_show+0x0/0x7d returned bad count [ 359.055595] irq event stamp: 15537 [ 359.055606] general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC [ 359.055614] Dumping ftrace buffer: [ 359.055660] (ftrace buffer empty) [ 359.055672] Modules linked in: nbd ipv6 kvm_intel kvm serio_raw [ 359.055678] CPU: 4 PID: 21631 Comm: stress-ng-sysfs Not tainted 4.2.0-rc5-next-20150805 #434 [ 359.055679] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 [ 359.055682] task: ffff8802161cc000 ti: ffff88021b4a8000 task.ti: ffff88021b4a8000 [ 359.055693] RIP: 0010:[] [] __kmalloc+0xe8/0x152 Cc: Signed-off-by: Ming Lei Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin
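Note on the blk-mq 'pending' overflow above: the problem arises when a show routine keeps appending entries to a fixed one-page buffer without checking how much room is left. The user-space sketch below is not the kernel patch; it only illustrates one way such output could be bounded. BUF_SIZE stands in for PAGE_SIZE, and format_pending(), its parameters, and the "..." truncation marker are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for the kernel's PAGE_SIZE; an assumption for this sketch. */
#define BUF_SIZE 4096

/*
 * Hypothetical helper: write up to `count` request ids into `buf`
 * (capacity `size`), one per line. Stops early and appends a "..."
 * marker when the next entry might not fit. Returns the number of
 * bytes written, excluding the terminating NUL; always less than `size`.
 */
static size_t format_pending(char *buf, size_t size,
                             const unsigned long *ids, size_t count)
{
    static const char marker[] = "\t...\n";
    /* Worst-case entry: tab + decimal digits of unsigned long + newline. */
    const size_t entry_max = 1 + 3 * sizeof(unsigned long) + 1;
    const size_t marker_len = strlen(marker);
    size_t used = 0;
    size_t i;

    assert(buf != NULL);
    if (size == 0)
        return 0;
    buf[0] = '\0';

    for (i = 0; i < count; i++) {
        /* Keep room for this entry, the marker, and the NUL. */
        if (used + entry_max + marker_len + 1 > size) {
            if (used + marker_len + 1 <= size)
                used += (size_t)snprintf(buf + used, size - used,
                                         "%s", marker);
            break;
        }
        used += (size_t)snprintf(buf + used, size - used,
                                 "\t%lu\n", ids[i]);
    }
    return used;
}
```

A caller would pass its page-sized buffer and report the return value as the byte count, for example format_pending(page, BUF_SIZE, ids, n). The design choice is to reserve space for the marker up front so that truncation is always visible to the reader of the file.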