commit f82a53b87594f460f2dd9983eeb851a5840e8df8
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Wed Jul 5 14:41:57 2017 +0200

    Linux 4.11.9

commit f29125639b85dad56dbe95a9f28bb83f6b0c5803
Author: David S. Miller <davem@davemloft.net>
Date:   Thu Jun 8 10:16:05 2017 -0400

    hsi: Fix build regression due to netdev destructor fix.
    
    commit ed66e50d9587fc0bb032e276a2563c0068a5b63a upstream.
    
    > ../drivers/hsi/clients/ssi_protocol.c:1069:5: error: 'struct net_device' has no member named 'destructor'
    
    Reported-by: Mark Brown <broonie@kernel.org>
    Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Cc: Guenter Roeck <linux@roeck-us.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit a5afcf8553e94d3002e1fbfeda82046dbfeb87e6
Author: Steffen Klassert <steffen.klassert@secunet.com>
Date:   Fri Jun 9 11:35:46 2017 +0200

    esp4: Fix udpencap for local TCP packets.
    
    [ Upstream commit 0e78a87306a6f55b1c7bbafad1de62c3975953ca ]
    
    Locally generated TCP packets are usually cloned, so we
    do skb_cow_data() on these packets. After that we need to
    reload the pointer to the esp header. On udpencap this
    header has an offset to skb_transport_header, so take this
    offset into account.
    
    This is a backport of:
    commit 0e78a87306a ("esp4: Fix udpencap for local TCP packets.")
    
    Fixes: 67d349ed603 ("net/esp4: Fix invalid esph pointer crash")
    Fixes: fca11ebde3f0 ("esp4: Reorganize esp_output")
    Reported-by: Don Bowman <db@donbowman.ca>
    Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit b9a1254c31cc63d1f30fa37b782ebab70e9f0665
Author: Wanpeng Li <wanpeng.li@hotmail.com>
Date:   Mon Jun 5 05:19:09 2017 -0700

    KVM: nVMX: Fix exception injection
    
    commit d4912215d1031e4fb3d1038d2e1857218dba0d0a upstream.
    
     WARNING: CPU: 3 PID: 2840 at arch/x86/kvm/vmx.c:10966 nested_vmx_vmexit+0xdcd/0xde0 [kvm_intel]
     CPU: 3 PID: 2840 Comm: qemu-system-x86 Tainted: G           OE   4.12.0-rc3+ #23
     RIP: 0010:nested_vmx_vmexit+0xdcd/0xde0 [kvm_intel]
     Call Trace:
      ? kvm_check_async_pf_completion+0xef/0x120 [kvm]
      ? rcu_read_lock_sched_held+0x79/0x80
      vmx_queue_exception+0x104/0x160 [kvm_intel]
      ? vmx_queue_exception+0x104/0x160 [kvm_intel]
      kvm_arch_vcpu_ioctl_run+0x1171/0x1ce0 [kvm]
      ? kvm_arch_vcpu_load+0x47/0x240 [kvm]
      ? kvm_arch_vcpu_load+0x62/0x240 [kvm]
      kvm_vcpu_ioctl+0x384/0x7b0 [kvm]
      ? kvm_vcpu_ioctl+0x384/0x7b0 [kvm]
      ? __fget+0xf3/0x210
      do_vfs_ioctl+0xa4/0x700
      ? __fget+0x114/0x210
      SyS_ioctl+0x79/0x90
      do_syscall_64+0x81/0x220
      entry_SYSCALL64_slow_path+0x25/0x25
    
    This is triggered occasionally by running both win7 and win2016 in L2 with
    EPT disabled on both L1 and L2. It can't be reproduced easily.
    
    Commit 0b6ac343fc (KVM: nVMX: Correct handling of exception injection) mentioned
    that "KVM wants to inject page-faults which it got to the guest. This function
    assumes it is called with the exit reason in vmcs02 being a #PF exception".
    Commit e011c663 (KVM: nVMX: Check all exceptions for intercept during delivery to
    L2) checks all exceptions for intercept during delivery to L2. However, there is
    currently no guarantee that the exit reason is an exception. When an external
    interrupt occurs on the host (for example a timer interrupt for the host, which
    should not be injected into the guest) and an exception is queued somewhere,
    nested_vmx_check_exception() is called and the vmexit emulation code tries to
    emulate the "Acknowledge interrupt on exit" behavior, which triggers the
    warning.
    
    Reusing the exit reason from the L2->L0 vmexit is wrong in this case,
    the reason must always be EXCEPTION_NMI when injecting an exception into
    L1 as a nested vmexit.
    
    Cc: Paolo Bonzini <pbonzini@redhat.com>
    Cc: Radim Krčmář <rkrcmar@redhat.com>
    Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
    Fixes: e011c663b9c7 ("KVM: nVMX: Check all exceptions for intercept during delivery to L2")
    Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit bde6903736124e10c685656e40cee06fed938890
Author: Radim Krčmář <rkrcmar@redhat.com>
Date:   Thu May 18 19:37:30 2017 +0200

    KVM: x86: zero base3 of unusable segments
    
    commit f0367ee1d64d27fa08be2407df5c125442e885e3 upstream.
    
    Static checker noticed that base3 could be used uninitialized if the
    segment was not present (useable).  Random stack values probably would
    not pass VMCS entry checks.
    
    Reported-by:  Dan Carpenter <dan.carpenter@oracle.com>
    Fixes: 1aa366163b8b ("KVM: x86 emulator: consolidate segment accessors")
    Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 02800188d72c166ab6d0ebfbe13b83a7bab25a81
Author: Radim Krčmář <rkrcmar@redhat.com>
Date:   Thu May 18 19:37:31 2017 +0200

    KVM: x86/vPMU: fix undefined shift in intel_pmu_refresh()
    
    commit 34b0dadbdf698f9b277a31b2747b625b9a75ea1f upstream.
    
    Static analysis noticed that pmu->nr_arch_gp_counters can be 32
    (INTEL_PMC_MAX_GENERIC) and therefore cannot be used to shift 'int'.
    
    I didn't add BUILD_BUG_ON for it as we have a better checker.
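    
    As a minimal userspace sketch of the problem class (not the actual patch;
    counter_mask() is a hypothetical helper), building the mask from a 64-bit
    constant keeps a shift by 32 well defined, whereas shifting a 32-bit 'int'
    by 32 is undefined behavior in C:
    
    ```c
    #include <assert.h>
    #include <stdint.h>

    /* Hypothetical helper: bitmask with the low nr_counters bits set. */
    static uint64_t counter_mask(unsigned int nr_counters)
    {
        /* A shift by the full width of the type would be undefined. */
        assert(nr_counters < 64);
        /* 1ull is at least 64 bits wide, so nr_counters == 32 is fine. */
        return (1ull << nr_counters) - 1;
    }
    ```
    
    With this helper, counter_mask(32) yields 0xffffffff instead of relying
    on an undefined 32-bit shift.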
    
    Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
    Fixes: 25462f7f5295 ("KVM: x86/vPMU: Define kvm_pmu_ops to support vPMU function dispatch")
    Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 3d2a8efd7d0d6e82b793a1783f01cbfb59392302
Author: Ladi Prosek <lprosek@redhat.com>
Date:   Tue Apr 25 16:42:44 2017 +0200

    KVM: x86: fix emulation of RSM and IRET instructions
    
    commit 6ed071f051e12cf7baa1b69d3becb8f232fdfb7b upstream.
    
    On AMD, the effect of set_nmi_mask called by emulate_iret_real and em_rsm
    on hflags is reverted later on in x86_emulate_instruction where hflags are
    overwritten with ctxt->emul_flags (the kvm_set_hflags call). This manifests
    as a hang when rebooting Windows VMs with QEMU, OVMF, and >1 vcpu.
    
    Instead of trying to merge ctxt->emul_flags into vcpu->arch.hflags after
    an instruction is emulated, this commit deletes emul_flags altogether and
    makes the emulator access vcpu->arch.hflags using two new accessors. This
    way all changes, on the emulator side as well as in functions called from
    the emulator and accessing vcpu state with emul_to_vcpu, are preserved.
    
    More details on the bug and its manifestation with Windows and OVMF:
    
      It's a KVM bug in the interaction between SMI/SMM and NMI, specific to AMD.
      I believe that the SMM part explains why we started seeing this only with
      OVMF.
    
      KVM masks and unmasks NMI when entering and leaving SMM. When KVM emulates
      the RSM instruction in em_rsm, the set_nmi_mask call doesn't stick because
      later on in x86_emulate_instruction we overwrite arch.hflags with
      ctxt->emul_flags, effectively reverting the effect of the set_nmi_mask call.
      The AMD-specific hflag of interest here is HF_NMI_MASK.
    
      When rebooting the system, Windows sends an NMI IPI to all but the current
      cpu to shut them down. Only after all of them are parked in HLT will the
      initiating cpu finish the restart. If NMI is masked, other cpus never get
      the memo and the initiating cpu spins forever, waiting for
      hal!HalpInterruptProcessorsStarted to drop. That's the symptom we observe.
    
    Fixes: a584539b24b8 ("KVM: x86: pass the whole hflags field to emulator and back")
    Signed-off-by: Ladi Prosek <lprosek@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 51c3bb1d99a31fa5179607eb9434d73f99af9b55
Author: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Date:   Tue Mar 21 11:03:53 2017 +0100

    mtd: nand: fsmc: fix NAND width handling
    
    commit ee56874f23e5c11576540bd695177a5ebc4f4352 upstream.
    
    In commit eea628199d5b ("mtd: Add device-tree support to fsmc_nand"),
    Device Tree support was added to the fsmc_nand driver. However, this
    code has a bug in how it handles the bank-width DT property to set the
    bus width.
    
    Indeed, in the function fsmc_nand_probe_config_dt() that parses the
    Device Tree, it sets pdata->width to either 8 or 16 depending on the
    value of the bank-width DT property.
    
    Then, the ->probe() function will test if pdata->width is equal to
    FSMC_NAND_BW16 (which is 2) to set NAND_BUSWIDTH_16 in
    nand->options. Therefore, with the DT probing, this condition will never
    match.
    
    This commit fixes that by removing the "width" field from
    fsmc_nand_platform_data and instead having the fsmc_nand_probe_config_dt()
    function directly set the appropriate nand->options value.
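    
    The mismatch can be illustrated with a small standalone sketch. This is
    not driver code: the function names are hypothetical, the flag value is
    assumed for illustration, and bank-width is assumed to be given in bytes
    (1 or 2). Only FSMC_NAND_BW16 == 2 is taken from the text above.
    
    ```c
    #include <assert.h>
    #include <stdint.h>

    #define FSMC_NAND_BW16           2     /* value quoted in the commit message */
    #define NAND_BUSWIDTH_16_ASSUMED 0x2u  /* assumed flag value, illustration only */

    /* Old flow, step 1: DT parsing stores the width in bits (8 or 16). */
    static unsigned int old_dt_width(uint32_t bank_width)
    {
        return bank_width == 2 ? 16 : 8;
    }

    /* Old flow, step 2: probe compares against 2, which 8 or 16 never equal. */
    static unsigned int old_options(uint32_t bank_width)
    {
        return old_dt_width(bank_width) == FSMC_NAND_BW16 ?
               NAND_BUSWIDTH_16_ASSUMED : 0;
    }

    /* New flow: DT parsing sets the option flag directly. */
    static unsigned int new_options(uint32_t bank_width)
    {
        assert(bank_width == 1 || bank_width == 2);
        return bank_width == 2 ? NAND_BUSWIDTH_16_ASSUMED : 0;
    }
    ```
    
    In this sketch old_options(2) returns 0, so the 16-bit flag is never set,
    while new_options(2) returns the flag.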
    
    It is worth mentioning that if this commit gets backported to older
    kernels, prior to the drop of non-DT probing, then non-DT probing will
    be broken because nand->options will no longer be set to
    NAND_BUSWIDTH_16.
    
    Fixes: eea628199d5b ("mtd: Add device-tree support to fsmc_nand")
    Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
    Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
    Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit f34b2f6f683caa898fbbbb89bd188c153ff8b94d
Author: Kamal Dasu <kdasu.kdev@gmail.com>
Date:   Fri Mar 3 16:16:53 2017 -0500

    mtd: nand: brcmnand: Check flash #WP pin status before nand erase/program
    
    commit 9d2ee0a60b8bd9bef2a0082c533736d6a7b39873 upstream.
    
    On brcmnand controller v6.x and v7.x, the #WP pin is controlled through
    the NAND_WP bit in CS_SELECT register.
    
    The driver currently assumes that toggling the #WP pin enables/disables
    write-protection instantaneously, but it actually takes some time to
    propagate the new state to the internal NAND chip logic. This sometimes
    causes data corruption when an erase/program operation is executed
    before write-protection has really been disabled.
    
    Fixes: 27c5b17cd1b1 ("mtd: nand: add NAND driver "library" for Broadcom STB NAND controller")
    Signed-off-by: Kamal Dasu <kdasu.kdev@gmail.com>
    Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 3b422c3c1fb5979a8fd9ca8a83d0bc9a2f67a64b
Author: Arnd Bergmann <arnd@arndb.de>
Date:   Fri Mar 24 23:02:48 2017 +0100

    infiniband: hns: avoid gcc-7.0.1 warning for uninitialized data
    
    commit 5b0ff9a00755d4d9c209033a77f1ed8f3186fe5c upstream.
    
    hns_roce_v1_cq_set_ci() calls roce_set_bit() on an uninitialized field,
    which will then change only a few of its bits, causing a warning with
    the latest gcc:
    
    infiniband/hw/hns/hns_roce_hw_v1.c: In function 'hns_roce_v1_cq_set_ci':
    infiniband/hw/hns/hns_roce_hw_v1.c:1854:23: error: 'doorbell[1]' is used uninitialized in this function [-Werror=uninitialized]
      roce_set_bit(doorbell[1], ROCEE_DB_OTHERS_H_ROCEE_DB_OTH_HW_SYNS_S, 1);
    
    The code is actually correct since we always set all bits of the
    port_vlan field, but gcc correctly points out that the first
    access does contain uninitialized data.
    
    This initializes the field to zero first before setting the
    individual bits.
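    
    A minimal sketch of the pattern, assuming a hypothetical set_bit_rmw()
    helper in place of roce_set_bit() and made-up bit positions: because the
    helper reads the word before writing it, the word has to be initialized
    first even if every bit ends up being set eventually.
    
    ```c
    #include <assert.h>
    #include <stdint.h>

    /* Hypothetical stand-in for a set-bit helper: read-modify-write of one bit. */
    static void set_bit_rmw(uint32_t *word, unsigned int shift, unsigned int val)
    {
        assert(shift < 32);
        *word = (*word & ~(1u << shift)) | ((uint32_t)(val & 1u) << shift);
    }

    /* Builds a doorbell word; bit positions 31 and 0 are illustrative only. */
    static uint32_t build_doorbell_high(void)
    {
        uint32_t db = 0;        /* zero first: the helper reads db before writing */

        set_bit_rmw(&db, 31, 1);
        set_bit_rmw(&db, 0, 1);
        return db;
    }
    ```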
    
    Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver")
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 6af65c535d13ae6e43082af727abf408b71a6789
Author: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Date:   Mon Jun 26 04:28:04 2017 -0500

    iommu/amd: Fix interrupt remapping when disable guest_mode
    
    commit 84a21dbdef0b96d773599c33c2afbb002198d303 upstream.
    
    Pass-through devices to VM guest can get updated IRQ affinity
    information via irq_set_affinity() when not running in guest mode.
    Currently, AMD IOMMU driver in GA mode ignores the updated information
    if the pass-through device is set up to use vAPIC regardless of guest_mode.
    This could cause invalid interrupt remapping.
    
    Also, the guest_mode bit should be set and cleared only when
    SVM updates posted-interrupt interrupt remapping information.
    
    Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
    Cc: Joerg Roedel <jroedel@suse.de>
    Fixes: d98de49a53e48 ('iommu/amd: Enable vAPIC interrupt remapping mode by default')
    Signed-off-by: Joerg Roedel <jroedel@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit fb6237c7332fcd8062bdf093bbf7a77276f38069
Author: Pan Bian <bianpan2016@163.com>
Date:   Sun Apr 23 18:23:21 2017 +0800

    iommu/amd: Fix incorrect error handling in amd_iommu_bind_pasid()
    
    commit 73dbd4a4230216b6a5540a362edceae0c9b4876b upstream.
    
    In function amd_iommu_bind_pasid(), the control flow jumps
    to label out_free when pasid_state->mm and mm are NULL, and
    mmput(mm) is called.  In function mmput(mm), mm is
    dereferenced without validation. This will result in a NULL
    dereference bug. This patch fixes the bug.
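    
    The shape of such a fix can be sketched in plain C. This is an assumed
    illustration, not the patch itself: struct mm, mm_put(), bind_sketch()
    and the label names are hypothetical. The point is that the error path
    taken when no reference is held must not reach the put.
    
    ```c
    #include <assert.h>
    #include <stddef.h>

    struct mm { int users; };           /* hypothetical stand-in for mm_struct */

    /* Like mmput(), this dereferences its argument without checking it. */
    static void mm_put(struct mm *mm)
    {
        assert(mm != NULL);
        mm->users--;
    }

    /* Returns 0 on success, -1 on failure. */
    static int bind_sketch(struct mm *mm, int later_step_fails)
    {
        int ret = -1;

        if (mm == NULL)
            goto out_free;              /* no reference held: skip mm_put() */

        mm->users++;                    /* take a reference */
        if (later_step_fails)
            goto out_put;               /* reference held: drop it on the way out */

        return 0;                       /* success: reference stays with the binding */

    out_put:
        mm_put(mm);
    out_free:
        return ret;
    }
    ```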
    
    Signed-off-by: Pan Bian <bianpan2016@163.com>
    Fixes: f0aac63b873b ('iommu/amd: Don't hold a reference to mm_struct')
    Signed-off-by: Joerg Roedel <jroedel@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 3d06032fb21d5d5d4758d1398d38654e5682fbb7
Author: Robin Murphy <robin.murphy@arm.com>
Date:   Thu Mar 16 17:00:17 2017 +0000

    iommu/dma: Don't reserve PCI I/O windows
    
    commit 938f1bbe35e3a7cb07e1fa7c512e2ef8bb866bdf upstream.
    
    Even if a host controller's CPU-side MMIO windows into PCI I/O space do
    happen to leak into PCI memory space such that it might treat them as
    peer addresses, trying to reserve the corresponding I/O space addresses
    doesn't do anything to help solve that problem. Stop doing a silly thing.
    
    Fixes: fade1ec055dc ("iommu/dma: Avoid PCI host bridge windows")
    Reviewed-by: Eric Auger <eric.auger@redhat.com>
    Signed-off-by: Robin Murphy <robin.murphy@arm.com>
    Signed-off-by: Joerg Roedel <jroedel@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 092702fa4db58e7feca8b513c7faa52fdc4fb17f
Author: Eric Ren <zren@suse.com>
Date:   Fri Jun 23 15:08:55 2017 -0700

    ocfs2: fix deadlock caused by recursive locking in xattr
    
    commit 8818efaaacb78c60a9d90c5705b6c99b75d7d442 upstream.
    
    Another deadlock path caused by recursive locking is reported.  This
    kind of issue was introduced since commit 743b5f1434f5 ("ocfs2: take
    inode lock in ocfs2_iop_set/get_acl()").  Two deadlock paths have been
    fixed by commit b891fa5024a9 ("ocfs2: fix deadlock issue when taking
    inode lock at vfs entry points").  Yes, we intend to fix this kind of
    case in an incremental way, because it's hard to find all possible
    paths at once.
    
    This one can be reproduced like this.  On node1, cp a large file from
    the home directory to the ocfs2 mountpoint.  Meanwhile, on node2, run
    setfacl/getfacl.  Both nodes will hang.  The backtraces:
    
    On node1:
      __ocfs2_cluster_lock.isra.39+0x357/0x740 [ocfs2]
      ocfs2_inode_lock_full_nested+0x17d/0x840 [ocfs2]
      ocfs2_write_begin+0x43/0x1a0 [ocfs2]
      generic_perform_write+0xa9/0x180
      __generic_file_write_iter+0x1aa/0x1d0
      ocfs2_file_write_iter+0x4f4/0xb40 [ocfs2]
      __vfs_write+0xc3/0x130
      vfs_write+0xb1/0x1a0
      SyS_write+0x46/0xa0
    
    On node2:
      __ocfs2_cluster_lock.isra.39+0x357/0x740 [ocfs2]
      ocfs2_inode_lock_full_nested+0x17d/0x840 [ocfs2]
      ocfs2_xattr_set+0x12e/0xe80 [ocfs2]
      ocfs2_set_acl+0x22d/0x260 [ocfs2]
      ocfs2_iop_set_acl+0x65/0xb0 [ocfs2]
      set_posix_acl+0x75/0xb0
      posix_acl_xattr_set+0x49/0xa0
      __vfs_setxattr+0x69/0x80
      __vfs_setxattr_noperm+0x72/0x1a0
      vfs_setxattr+0xa7/0xb0
      setxattr+0x12d/0x190
      path_setxattr+0x9f/0xb0
      SyS_setxattr+0x14/0x20
    
    Fix this one by using ocfs2_inode_{lock|unlock}_tracker, which is
    exported by commit 439a36b8ef38 ("ocfs2/dlmglue: prepare tracking logic
    to avoid recursive cluster lock").
    
    Link: http://lkml.kernel.org/r/20170622014746.5815-1-zren@suse.com
    Fixes: 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()")
    Signed-off-by: Eric Ren <zren@suse.com>
    Reported-by: Thomas Voegtle <tv@lio96.de>
    Tested-by: Thomas Voegtle <tv@lio96.de>
    Reviewed-by: Joseph Qi <jiangqi903@gmail.com>
    Cc: Mark Fasheh <mfasheh@versity.com>
    Cc: Joel Becker <jlbec@evilplan.org>
    Cc: Junxiao Bi <junxiao.bi@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 404dfb7533e45e451368c13fe5eb0d7564c810c8
Author: Junxiao Bi <junxiao.bi@oracle.com>
Date:   Wed May 3 14:51:41 2017 -0700

    ocfs2: o2hb: revert hb threshold to keep compatible
    
    commit 33496c3c3d7b88dcbe5e55aa01288b05646c6aca upstream.
    
    Configfs is the interface ocfs2-tools uses to pass configuration to the
    kernel, and $configfs_dir/cluster/$clustername/heartbeat/dead_threshold
    is the attribute used to configure the heartbeat dead threshold.  The
    kernel has a default value for it, but the user can set
    O2CB_HEARTBEAT_THRESHOLD in /etc/sysconfig/o2cb to override it.
    
    Commit 45b997737a80 ("ocfs2/cluster: use per-attribute show and store
    methods") changed the heartbeat dead threshold name while ocfs2-tools
    did not, so ocfs2-tools no longer sets this configurable and the default
    value is always used.  So revert it.
    
    Fixes: 45b997737a80 ("ocfs2/cluster: use per-attribute show and store methods")
    Link: http://lkml.kernel.org/r/1490665245-15374-1-git-send-email-junxiao.bi@oracle.com
    Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
    Acked-by: Joseph Qi <jiangqi903@gmail.com>
    Cc: Mark Fasheh <mfasheh@versity.com>
    Cc: Joel Becker <jlbec@evilplan.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit b363931e894a2d868f759ff60bf27548565adf58
Author: Andy Lutomirski <luto@kernel.org>
Date:   Sat Apr 22 00:01:22 2017 -0700

    x86/mm: Fix flush_tlb_page() on Xen
    
    commit dbd68d8e84c606673ebbcf15862f8c155fa92326 upstream.
    
    flush_tlb_page() passes a bogus range to flush_tlb_others() and
    expects the latter to fix it up.  native_flush_tlb_others() has the
    fixup but Xen's version doesn't.  Move the fixup to
    flush_tlb_others().
    
    AFAICS the only real effect is that, without this fix, Xen would
    flush everything instead of just the one page on remote vCPUs
    when flush_tlb_page() was called.
    
    Signed-off-by: Andy Lutomirski <luto@kernel.org>
    Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: Borislav Petkov <bp@alien8.de>
    Cc: Brian Gerst <brgerst@gmail.com>
    Cc: Dave Hansen <dave.hansen@intel.com>
    Cc: Denys Vlasenko <dvlasenk@redhat.com>
    Cc: H. Peter Anvin <hpa@zytor.com>
    Cc: Josh Poimboeuf <jpoimboe@redhat.com>
    Cc: Juergen Gross <jgross@suse.com>
    Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: Nadav Amit <namit@vmware.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Fixes: e7b52ffd45a6 ("x86/flush_tlb: try flush_tlb_single one by one in flush_tlb_range")
    Link: http://lkml.kernel.org/r/10ed0e4dfea64daef10b87fb85df1746999b4dba.1492844372.git.luto@kernel.org
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit ed815afd32cdd6f549036f50ae3a67c92e80c47a
Author: Joerg Roedel <jroedel@suse.de>
Date:   Thu Apr 6 16:19:22 2017 +0200

    x86/mpx: Correctly report do_mpx_bt_fault() failures to user-space
    
    commit 5ed386ec09a5d75bcf073967e55e895c2607a5c3 upstream.
    
    When this function fails it just sends a SIGSEGV signal to
    user-space using force_sig(). This signal is missing
    essential information about the cause, e.g. the trap_nr or
    an error code.
    
    Fix this by propagating the error to the only caller of
    mpx_handle_bd_fault(), do_bounds(), which sends the correct
    SIGSEGV signal to the process.
    
    Signed-off-by: Joerg Roedel <jroedel@suse.de>
    Cc: Andy Lutomirski <luto@kernel.org>
    Cc: Borislav Petkov <bp@alien8.de>
    Cc: Brian Gerst <brgerst@gmail.com>
    Cc: Dave Hansen <dave.hansen@linux.intel.com>
    Cc: Denys Vlasenko <dvlasenk@redhat.com>
    Cc: H. Peter Anvin <hpa@zytor.com>
    Cc: Josh Poimboeuf <jpoimboe@redhat.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Fixes: fe3d197f84319 ('x86, mpx: On-demand kernel allocation of bounds tables')
    Link: http://lkml.kernel.org/r/1491488362-27198-1-git-send-email-joro@8bytes.org
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 92ff9b6ce863575a4684c54b4897707d45933c5b
Author: Kan Liang <kan.liang@intel.com>
Date:   Tue Apr 4 15:14:06 2017 -0400

    perf/x86: Fix spurious NMI with PEBS Load Latency event
    
    commit fd583ad1563bec5f00140e1f2444adbcd331caad upstream.
    
    Spurious NMIs will be observed with the following command:
    
      while :; do
        perf record -bae "cpu/umask=0x01,event=0xcd,ldlat=0x80/pp"
                      -e "cpu/umask=0x03,event=0x0/"
                      -e "cpu/umask=0x02,event=0x0/"
                      -e cycles,branches,cache-misses
                      -e cache-references -- sleep 10
      done
    
    The bug was introduced by commit:
    
      8077eca079a2 ("perf/x86/pebs: Add workaround for broken OVFL status on HSW+")
    
    That commit clears the status bits for the counters used for PEBS
    events by masking with the whole 64-bit pebs_enabled. However, only
    the low 32 bits of both status and pebs_enabled are reserved for
    PEBS-able counters.
    
    In status, bits 32-34 are the fixed counter overflow bits. In
    pebs_enabled, bits 32-34 are for PEBS Load Latency.
    
    In the test case, the PEBS Load Latency event and a fixed counter event
    could overflow at the same time. The fixed counter overflow bit would
    then be cleared by mistake. Once it is cleared, the fixed counter
    overflow is never processed, which eventually triggers a spurious NMI.
    
    Correct the PEBS enabled mask by ignoring the non-PEBS bits.
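    
    A minimal sketch of the masking idea, under stated assumptions: the
    function names are hypothetical and the mask simply follows the "low 32
    bits" description above; the actual patch may use a different constant.
    
    ```c
    #include <assert.h>
    #include <stdint.h>

    /* Assumption from the text above: low 32 bits map to PEBS-able counters. */
    #define PEBS_COUNTER_BITS 0xffffffffull

    /* Old behaviour: every bit set in pebs_enabled is cleared from status. */
    static uint64_t clear_status_buggy(uint64_t status, uint64_t pebs_enabled)
    {
        return status & ~pebs_enabled;
    }

    /* Fixed behaviour: only bits that belong to PEBS counters are cleared. */
    static uint64_t clear_status_fixed(uint64_t status, uint64_t pebs_enabled)
    {
        return status & ~(pebs_enabled & PEBS_COUNTER_BITS);
    }

    /* Scenario: Load Latency enable (bit 32) coincides with a fixed overflow. */
    void pebs_mask_example(void)
    {
        uint64_t status = (1ull << 32) | 1ull;       /* fixed ctr 0 + PEBS ctr 0 */
        uint64_t pebs_enabled = (1ull << 32) | 1ull; /* ldlat bit + PEBS ctr 0 */

        assert(clear_status_buggy(status, pebs_enabled) == 0);
        assert(clear_status_fixed(status, pebs_enabled) == (1ull << 32));
    }
    ```
    
    In the fixed variant the fixed counter overflow bit survives and can
    still be handled.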
    
    Signed-off-by: Kan Liang <kan.liang@intel.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
    Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
    Cc: Jiri Olsa <jolsa@redhat.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Stephane Eranian <eranian@google.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Vince Weaver <vincent.weaver@maine.edu>
    Fixes: 8077eca079a2 ("perf/x86/pebs: Add workaround for broken OVFL status on HSW+")
    Link: http://lkml.kernel.org/r/1491333246-3965-1-git-send-email-kan.liang@intel.com
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit cc6b8fbcc65652d09aef71d0ca232a5c7791a0c7
Author: Baoquan He <bhe@redhat.com>
Date:   Tue Jun 27 20:39:06 2017 +0800

    x86/boot/KASLR: Fix kexec crash due to 'virt_addr' calculation bug
    
    commit 8eabf42ae5237e6b699aeac687b5b629e3537c8d upstream.
    
    Kernel text KASLR is separated into physical address and virtual
    address randomization. For virtual address randomization, we only
    randomize to get an offset between 16M and KERNEL_IMAGE_SIZE.
    So the initial value of 'virt_addr' should be LOAD_PHYSICAL_ADDR,
    not the original kernel loading address 'output'.
    
    The bug causes a kernel boot failure if the kernel is loaded at a
    position other than the 16M address that is decided at compile time.
    Kexec/kdump is such a practical case.
    
    To fix it, just assign LOAD_PHYSICAL_ADDR to virt_addr as initial
    value.
    
    Tested-by: Dave Young <dyoung@redhat.com>
    Signed-off-by: Baoquan He <bhe@redhat.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Fixes: 8391c73 ("x86/KASLR: Randomize virtual address separately")
    Link: http://lkml.kernel.org/r/1498567146-11990-3-git-send-email-bhe@redhat.com
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 875cfdbe15cc75390f06d0f38cf6b1b03b4a3cd3
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Fri Jun 23 10:50:38 2017 +0200

    x86/mshyperv: Remove excess #includes from mshyperv.h
    
    commit 26fcd952d5c977a94ac64bb44ed409e37607b2c9 upstream.
    
    A recent commit included linux/slab.h in linux/irq.h. This breaks the build
    of vdso32 on a 64-bit kernel.
    
    The reason is that linux/irq.h gets included into the vdso code via
    linux/interrupt.h which is included from asm/mshyperv.h. That makes the
    32-bit vdso compile fail, because slab.h includes the pgtable headers for
    64-bit on a 64-bit build.
    
    Neither linux/clocksource.h nor linux/interrupt.h are needed in the
    mshyperv.h header file itself - it has a dependency on <linux/atomic.h>.
    
    Remove the includes and unbreak the build.
    
    Reported-by: Ingo Molnar <mingo@kernel.org>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Cc: K. Y. Srinivasan <kys@microsoft.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
    Cc: devel@linuxdriverproject.org
    Fixes: dee863b571b0 ("hv: export current Hyper-V clocksource")
    Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1706231038460.2647@nanos
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit b3bc81143c19fbc59430fbcc06fd56cc10aaac7e
Author: Josh Poimboeuf <jpoimboe@redhat.com>
Date:   Tue May 23 10:37:29 2017 -0500

    Revert "x86/entry: Fix the end of the stack for newly forked tasks"
    
    commit ebd574994c63164d538a197172157318f58ac647 upstream.
    
    Petr Mladek reported the following warning when loading the livepatch
    sample module:
    
      WARNING: CPU: 1 PID: 3699 at arch/x86/kernel/stacktrace.c:132 save_stack_trace_tsk_reliable+0x133/0x1a0
      ...
      Call Trace:
       __schedule+0x273/0x820
       schedule+0x36/0x80
       kthreadd+0x305/0x310
       ? kthread_create_on_cpu+0x80/0x80
       ? icmp_echo.part.32+0x50/0x50
       ret_from_fork+0x2c/0x40
    
    That warning means the end of the stack is no longer recognized as such
    for newly forked tasks.  The problem was introduced with the following
    commit:
    
      ff3f7e2475bb ("x86/entry: Fix the end of the stack for newly forked tasks")
    
    ... which was completely misguided.  It only partially fixed the
    reported issue, and it introduced another bug in the process.  None of
    the other entry code saves the frame pointer before calling into C code,
    so it doesn't make sense for ret_from_fork to do so either.
    
    Contrary to what I originally thought, the original issue wasn't related
    to newly forked tasks.  It was actually related to ftrace.  When entry
    code calls into a function which then calls into an ftrace handler, the
    stack frame looks different than normal.
    
    The original issue will be fixed in the unwinder, in a subsequent patch.
    
    Reported-by: Petr Mladek <pmladek@suse.com>
    Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
    Acked-by: Thomas Gleixner <tglx@linutronix.de>
    Cc: Dave Jones <davej@codemonkey.org.uk>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Steven Rostedt <rostedt@goodmis.org>
    Cc: live-patching@vger.kernel.org
    Fixes: ff3f7e2475bb ("x86/entry: Fix the end of the stack for newly forked tasks")
    Link: http://lkml.kernel.org/r/f350760f7e82f0750c8d1dd093456eb212751caa.1495553739.git.jpoimboe@redhat.com
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 839fe2c4164304d4c460b67db5a3f4e7689edd38
Author: Arnaldo Carvalho de Melo <acme@redhat.com>
Date:   Mon Apr 24 11:58:54 2017 -0300

    tools arch: Sync arch/x86/lib/memcpy_64.S with the kernel
    
    commit e883d09c9eb2ffddfd057c17e6a0cef446ec8c9b upstream.
    
    Just a minor fix done in:
    
      Fixes: 26a37ab319a2 ("x86/mce: Fix copy/paste error in exception table entries")
    
    Cc: Tony Luck <tony.luck@intel.com>
    Link: http://lkml.kernel.org/n/tip-ni9jzdd5yxlail6pq8cuexw2@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit b856d45c710620d4324b660be1f36ce561ebc281
Author: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Date:   Sat May 13 13:40:20 2017 +0200

    ARM: davinci: PM: Do not free useful resources in normal path in 'davinci_pm_init'
    
    commit 95d7c1f18bf8ac03b0fc48eac1f1b11f867765b8 upstream.
    
    It is wrong to iounmap resources in the normal path of davinci_pm_init().
    
    The 3 ioremap'ed fields of 'pm_config' can be accessed later on in other
    functions, so we should return 'success' instead of unrolling everything.
    
    Fixes: aa9aa1ec2df6 ("ARM: davinci: PM: rework init, remove platform device")
    Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
    [nsekhar@ti.com: commit message and minor style fixes]
    Signed-off-by: Sekhar Nori <nsekhar@ti.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit b0ed471883749029e49c9a6f02b7568d7f9819d5
Author: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Date:   Sat May 13 13:40:05 2017 +0200

    ARM: davinci: PM: Free resources in error handling path in 'davinci_pm_init'
    
    commit f3f6cc814f9cb61cfb738af2b126a8bf19e5ab4c upstream.
    
    If 'sram_alloc' fails, we need to free already allocated resources.
    
    Fixes: aa9aa1ec2df6 ("ARM: davinci: PM: rework init, remove platform device")
    Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
    Signed-off-by: Sekhar Nori <nsekhar@ti.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 0afbd9fd39caff83e178dbaaa2581929d449cb85
Author: Doug Berger <opendmb@gmail.com>
Date:   Thu Jun 29 18:41:36 2017 +0100

    ARM: 8685/1: ensure memblock-limit is pmd-aligned
    
    commit 9e25ebfe56ece7541cd10a20d715cbdd148a2e06 upstream.
    
    The pmd containing memblock_limit is cleared by prepare_page_table(),
    which creates the opportunity for early_alloc() to allocate unmapped
    memory if memblock_limit is not pmd-aligned, causing a boot-time hang.
    
    Commit 965278dcb8ab ("ARM: 8356/1: mm: handle non-pmd-aligned end of RAM")
    attempted to resolve this problem, but there is a path through the
    adjust_lowmem_bounds() routine where if all memory regions start and
    end on pmd-aligned addresses the memblock_limit will be set to
    arm_lowmem_limit.
    
    Since arm_lowmem_limit can be affected by the vmalloc early parameter,
    the value of arm_lowmem_limit may not be pmd-aligned. This commit
    corrects this oversight such that memblock_limit is always rounded
    down to pmd-alignment.
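
    The rounding can be illustrated with a minimal userspace sketch. It
    is not the kernel code: the sketch_* names are hypothetical and the
    2 MiB pmd size is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration: one pmd maps 2 MiB. */
#define SKETCH_PMD_SIZE (UINT64_C(2) << 20)

/* Round addr down to a multiple of align, which must be a power of two. */
static uint64_t sketch_round_down(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return addr & ~(align - 1);
}

/* Hypothetical helper: derive a memblock limit that never ends inside a pmd. */
static uint64_t sketch_memblock_limit(uint64_t lowmem_limit)
{
	return sketch_round_down(lowmem_limit, SKETCH_PMD_SIZE);
}
```

    A limit that is already pmd-aligned is returned unchanged, so the
    rounding can be applied unconditionally.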
    
    Fixes: 965278dcb8ab ("ARM: 8356/1: mm: handle non-pmd-aligned end of RAM")
    Signed-off-by: Doug Berger <opendmb@gmail.com>
    Suggested-by: Mark Rutland <mark.rutland@arm.com>
    Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 16dfde48319bc5a5091b8ababc2d86c1de763ee4
Author: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Date:   Fri May 26 17:40:02 2017 +0100

    ARM64/ACPI: Fix BAD_MADT_GICC_ENTRY() macro implementation
    
    commit cb7cf772d83d2d4e6995c5bb9e0fb59aea8f7080 upstream.
    
    The BAD_MADT_GICC_ENTRY() macro checks whether a GICC MADT entry passes
    muster from an ACPI specification standpoint. The current macro
    derives the expected MADT GICC entry length from the ACPI firmware
    version (it changed from 76 to 80 bytes in the transition from the
    ACPI 5.1 to the ACPI 6.0 specification). However, it always
    (erroneously) uses the length of the latest ACPICA struct (i.e. struct
    acpi_madt_generic_interrupt, which is 80 bytes long) to check whether
    the current GICC entry memory record exceeds the MADT table end in
    memory as defined by the MADT table header itself. This may result in
    false negatives depending on the ACPI firmware version and how the MADT
    entries are laid out in memory: on ACPI 5.1 firmware, MADT GICC
    entries are 76 bytes long, so adding 80 to a GICC entry start address
    in memory may well yield an address past the actual MADT end,
    triggering a false negative.
    
    Fix the BAD_MADT_GICC_ENTRY() macro by reshuffling the condition checks
    and updating them to always use the firmware version specific MADT GICC
    entry length in order to carry out boundary checks.
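
    A rough userspace sketch of such a check is shown below. Only the 76
    and 80 byte lengths come from this commit message; the struct, the
    function names and the revision-based selector are assumptions made
    for illustration, not the kernel macro itself.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* GICC entry lengths quoted in the commit message. */
enum { SKETCH_GICC_LEN_ACPI_5_1 = 76, SKETCH_GICC_LEN_ACPI_6_0 = 80 };

/* Hypothetical stand-in for the header that starts every MADT entry. */
struct sketch_subtable_header {
	uint8_t type;
	uint8_t length;
};

/* Expected GICC length for a firmware major revision (assumed selector). */
static size_t sketch_gicc_length(unsigned int fw_major_revision)
{
	return fw_major_revision < 6 ? SKETCH_GICC_LEN_ACPI_5_1
				     : SKETCH_GICC_LEN_ACPI_6_0;
}

/*
 * True when the entry is missing, has an unexpected length, or would run
 * past the end of the table. Both the length test and the boundary test
 * use the same version-specific length.
 */
static bool sketch_bad_gicc_entry(const struct sketch_subtable_header *entry,
				  uintptr_t table_end,
				  unsigned int fw_major_revision)
{
	size_t len = sketch_gicc_length(fw_major_revision);

	return !entry || entry->length != len ||
	       (uintptr_t)entry + len > table_end;
}
```

    With a 76-byte entry that ends exactly at the table end, a boundary
    test that always added 80 would reject a valid ACPI 5.1 entry, while
    this version accepts it.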
    
    Fixes: b6cfb277378e ("ACPI / ARM64: add BAD_MADT_GICC_ENTRY() macro")
    Reported-by: Julien Grall <julien.grall@arm.com>
    Acked-by: Will Deacon <will.deacon@arm.com>
    Acked-by: Marc Zyngier <marc.zyngier@arm.com>
    Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
    Cc: Julien Grall <julien.grall@arm.com>
    Cc: Hanjun Guo <hanjun.guo@linaro.org>
    Cc: Al Stone <ahs3@redhat.com>
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 5e1c6a5d7fa096fd2e2adc1baaf36f7701ef6560
Author: Timmy Li <lixiaoping3@huawei.com>
Date:   Mon May 22 16:48:28 2017 +0100

    ARM64: PCI: Fix struct acpi_pci_root_ops allocation failure path
    
    commit 717902cc93118119a6fce7765da6cf2786987418 upstream.
    
    Commit 093d24a20442 ("arm64: PCI: Manage controller-specific data on
    per-controller basis") added code to allocate ACPI PCI root_ops
    dynamically on a per host bridge basis but failed to update the
    corresponding memory allocation failure path in pci_acpi_scan_root()
    leading to a potential memory leak.
    
    Fix it by adding the required kfree call.
    
    Fixes: 093d24a20442 ("arm64: PCI: Manage controller-specific data on per-controller basis")
    Reviewed-by: Tomasz Nowicki <tn@semihalf.com>
    Signed-off-by: Timmy Li <lixiaoping3@huawei.com>
    [lorenzo.pieralisi@arm.com: refactored code, rewrote commit log]
    Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
    CC: Will Deacon <will.deacon@arm.com>
    CC: Bjorn Helgaas <bhelgaas@google.com>
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 961010353d3e9bb2a7ddb434ee6c37b03271810b
Author: Eric Anholt <eric@anholt.net>
Date:   Thu Apr 27 18:02:32 2017 -0700

    watchdog: bcm281xx: Fix use of uninitialized spinlock.
    
    commit fedf266f9955d9a019643cde199a2fd9a0259f6f upstream.
    
    The bcm_kona_wdt_set_resolution_reg() call takes the spinlock, so
    initialize it earlier.  Fixes a warning at boot with lock debugging
    enabled.
    
    Fixes: 6adb730dc208 ("watchdog: bcm281xx: Watchdog Driver")
    Signed-off-by: Eric Anholt <eric@anholt.net>
    Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
    Reviewed-by: Guenter Roeck <linux@roeck-us.net>
    Signed-off-by: Guenter Roeck <linux@roeck-us.net>
    Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 662bb18efe1433aa6139af13408899ddac828ae7
Author: Dan Carpenter <dan.carpenter@oracle.com>
Date:   Wed Jun 14 13:34:05 2017 +0300

    xfrm: Oops on error in pfkey_msg2xfrm_state()
    
    commit 1e3d0c2c70cd3edb5deed186c5f5c75f2b84a633 upstream.
    
    There are some missing error codes here so we accidentally return NULL
    instead of an error pointer.  It results in a NULL pointer dereference.
    
    Fixes: df71837d5024 ("[LSM-IPSec]: Security association restriction.")
    Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
    Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit f37a5bfa5cf7ceb330367947dab8a8083282795f
Author: Dan Carpenter <dan.carpenter@oracle.com>
Date:   Wed Jun 14 13:35:37 2017 +0300

    xfrm: NULL dereference on allocation failure
    
    commit e747f64336fc15e1c823344942923195b800aa1e upstream.
    
    The default error code in pfkey_msg2xfrm_state() is -ENOBUFS.  We
    added a new call to security_xfrm_state_alloc() which sets "err" to
    zero, so there are several places where we can return ERR_PTR(0) if
    kmalloc() fails.  The caller is expecting error pointers, so it leads
    to a NULL dereference.
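
    Why ERR_PTR(0) is dangerous can be shown with a userspace imitation
    of the kernel helpers. The sketch_* functions below are illustrative
    stand-ins written for this example, not the kernel definitions.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Largest errno value that is encoded into a pointer (assumed, as in the kernel). */
#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static void *sketch_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True only for pointers in the top SKETCH_MAX_ERRNO addresses. */
static bool sketch_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* What an allocation failure path hands back once "err" is set explicitly. */
static void *sketch_alloc_failure(void)
{
	return sketch_err_ptr(-ENOBUFS);
}
```

    An error code of zero encodes to a NULL pointer, which the IS_ERR
    style test does not recognise as an error, so a caller that only
    checks for error pointers goes on to dereference NULL.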
    
    Fixes: df71837d5024 ("[LSM-IPSec]: Security association restriction.")
    Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
    Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 29be0c1aefd38f07c5b0547c2f6d7423873c9aca
Author: Sabrina Dubroca <sd@queasysnail.net>
Date:   Wed May 3 16:43:19 2017 +0200

    xfrm: fix stack access out of bounds with CONFIG_XFRM_SUB_POLICY
    
    commit 9b3eb54106cf6acd03f07cf0ab01c13676a226c2 upstream.
    
    When CONFIG_XFRM_SUB_POLICY=y, xfrm_dst stores a copy of the flowi for
    that dst. Unfortunately, the code that allocates and fills this copy
    doesn't care about what type of flowi (flowi, flowi4, flowi6) gets
    passed. In multiple code paths (from raw_sendmsg, from TCP when
    replying to a FIN, in vxlan, geneve, and gre), the flowi that gets
    passed to xfrm is actually an on-stack flowi4, so we end up reading
    stuff from the stack past the end of the flowi4 struct.
    
    Since xfrm_dst->origin isn't used anywhere following commit
    ca116922afa8 ("xfrm: Eliminate "fl" and "pol" args to
    xfrm_bundle_ok()."), just get rid of it.  xfrm_dst->partner isn't used
    either, so get rid of that too.
    
    Fixes: 9d6ec938019c ("ipv4: Use flowi4 in public route lookup interfaces.")
    Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
    Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit e0cee9f3bfdf3c8e84db231d0247132b4c30bc6e
Author: Hangbin Liu <liuhangbin@gmail.com>
Date:   Sun Jun 11 09:44:20 2017 +0800

    xfrm: move xfrm_garbage_collect out of xfrm_policy_flush
    
    commit 138437f591dd9a42d53c6fed1a3c85e02678851c upstream.
    
    Currently we force garbage collection whenever a policy is removed in
    xfrm_policy_flush(). During xfrm_net_exit(), however, we call
    flow_cache_fini() first, which sets fc->percpu to NULL. When we then
    call xfrm_policy_fini() -> xfrm_policy_flush() -> flow_cache_flush(),
    we hit a NULL pointer dereference when checking percpu_empty. The code
    path looks like:
    
    flow_cache_fini()
      - fc->percpu = NULL
    xfrm_policy_fini()
      - xfrm_policy_flush()
        - xfrm_garbage_collect()
          - flow_cache_flush()
            - flow_cache_percpu_empty()
              - fcp = per_cpu_ptr(fc->percpu, cpu)
    
    To reproduce, just add ipsec in netns and then remove the netns.
    
    v2:
    As Xin Long suggested, since only two other places need to call it, move
    xfrm_garbage_collect() outside xfrm_policy_flush().
    
    v3:
    Fix subject mismatch after v2 fix.
    
    Fixes: 35db06912189 ("xfrm: do the garbage collection after flushing policy")
    Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
    Reviewed-by: Xin Long <lucien.xin@gmail.com>
    Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 4d03c6171114ed45d1267cf5d0f49d832de1e917
Author: Yossi Kuperman <yossiku@mellanox.com>
Date:   Thu Jun 22 11:37:10 2017 +0300

    xfrm6: Fix IPv6 payload_len in xfrm6_transport_finish
    
    commit 7c88e21aefcf86fb41b48b2e04528db5a30fbe18 upstream.
    
    IPv6 payload length indicates the size of the payload, including any
    extension headers.
    
    In xfrm6_transport_finish, ipv6_hdr(skb)->payload_len is set to the
    payload size only, regardless of the presence of any extension headers.
    After ESP GRO transport mode decapsulation, ipv6_rcv trims the packet
    according to the wrong payload_len, thus corrupting the packet.
    
    Set payload_len to account for extension headers as well.
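
    The arithmetic can be sketched as follows. This is a userspace
    illustration with hypothetical helper names; it works in host byte
    order, whereas the real header field is stored in network byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Payload Length as described above: extension headers plus upper-layer data. */
static uint16_t sketch_payload_len(size_t ext_hdrs_len, size_t upper_layer_len)
{
	return (uint16_t)(ext_hdrs_len + upper_layer_len);
}

/* The pre-fix behaviour described above: extension headers are left out. */
static uint16_t sketch_payload_len_old(size_t ext_hdrs_len, size_t upper_layer_len)
{
	(void)ext_hdrs_len; /* ignored, which is the bug */
	return (uint16_t)upper_layer_len;
}
```

    For a packet carrying an 8-byte extension header and 100 bytes of
    upper-layer data, the old value is 8 bytes short, so a receiver that
    trims to payload_len cuts off the tail of the packet.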
    
    Fixes: 7785bba299a8 ("esp: Add a software GRO codepath")
    Signed-off-by: Yossi Kuperman <yossiku@mellanox.com>
    Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit c4ed418dc93b6d44b21acc3dbe060ac1be921b79
Author: Juergen Gross <jgross@suse.com>
Date:   Thu May 18 17:28:48 2017 +0200

    xen/blkback: don't free be structure too early
    
    commit 71df1d7ccad1c36f7321d6b3b48f2ea42681c363 upstream.
    
    The be structure must not be freed before the blkif structure has been
    freed. Otherwise a use-after-free of be will occur when unmapping the
    ring used for communicating with the frontend in case of a late call
    of xenblk_disconnect() (e.g. due to an I/O still active when trying to
    disconnect).
    
    Signed-off-by: Juergen Gross <jgross@suse.com>
    Tested-by: Steven Haigh <netwiz@crc.id.au>
    Acked-by: Roger Pau Monné <roger.pau@citrix.com>
    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit f21d9eda322289bce95ade2b79f9cef314ab2b91
Author: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Date:   Fri Jun 23 15:08:41 2017 -0700

    mm/vmalloc.c: huge-vmap: fail gracefully on unexpected huge vmap mappings
    
    commit 029c54b09599573015a5c18dbe59cbdf42742237 upstream.
    
    Existing code that uses vmalloc_to_page() may assume that any address
    for which is_vmalloc_addr() returns true may be passed into
    vmalloc_to_page() to retrieve the associated struct page.
    
    This is not an unreasonable assumption to make, but on architectures
    that have CONFIG_HAVE_ARCH_HUGE_VMAP=y, it no longer holds, and we need
    to ensure that vmalloc_to_page() does not go off into the weeds trying
    to dereference huge PUDs or PMDs as table entries.
    
    Given that vmalloc() and vmap() themselves never create huge mappings or
    deal with compound pages at all, there is no correct answer in this
    case, so return NULL instead, and issue a warning.
    
    When reading /proc/kcore on arm64, you will hit an oops as soon as you
    hit the huge mappings used for the various segments that make up the
    mapping of vmlinux.  With this patch applied, you will no longer hit the
    oops, but the kcore contents will be incorrect (these regions will be
    zeroed out).
    
    We are fixing this for kcore specifically, so it avoids vread() for
    those regions.  At least one other problematic user exists, i.e.,
    /dev/kmem, but that is currently broken on arm64 for other reasons.
    
    Link: http://lkml.kernel.org/r/20170609082226.26152-1-ard.biesheuvel@linaro.org
    Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
    Acked-by: Mark Rutland <mark.rutland@arm.com>
    Reviewed-by: Laura Abbott <labbott@redhat.com>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: zhong jiang <zhongjiang@huawei.com>
    Cc: Dave Hansen <dave.hansen@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 0ec03ce7d79dc9a5c47d26bab38c78075d42de9c
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Tue May 23 23:23:32 2017 +0200

    pinctrl/amd: Use regular interrupt instead of chained
    
    commit ba714a9c1dea85e0bf2899d02dfeb9c70040427c upstream.
    
    The AMD pinctrl driver uses a chained interrupt to demultiplex the GPIO
    interrupts. Kevin Vandeventer reported that his new AMD Ryzen locks up
    hard on boot when the AMD pinctrl driver is initialized. The reason is an
    interrupt storm. It's not clear whether that's caused by hardware or
    firmware or both.
    
    Using chained interrupts on X86 is a dangerous endeavour. If a system is
    misconfigured or the hardware buggy there is no safety net to catch an
    interrupt storm.
    
    Convert the driver to use a regular interrupt for the demultiplex
    handler. This allows the interrupt storm detector to catch the malfunction
    and lets the system boot up.
    
    This should be backported to stable because it's likely that more users run
    into this problem as the AMD Ryzen machines are spreading.
    
    Reported-by: Kevin Vandeventer
    Link: https://bugzilla.suse.com/show_bug.cgi?id=1034261
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
    Signed-off-by: Borislav Petkov <bp@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit d9efa9db58338c359b56ce0fe9435fc201bffe03
Author: Baoquan He <bhe@redhat.com>
Date:   Thu May 4 10:25:47 2017 +0800

    x86/mm: Fix boot crash caused by incorrect loop count calculation in sync_global_pgds()
    
    commit fc5f9d5f151c9fff21d3d1d2907b888a5aec3ff7 upstream.
    
    Jeff Moyer reported that on his system with two memory regions 0~64G and
    1T~1T+192G, and kernel option "memmap=192G!1024G" added, enabling KASLR
    makes the system hang intermittently during boot, while adding 'nokaslr'
    does not.
    
    The back trace is:
    
     Oops: 0000 [#1] SMP
    
     RIP: memcpy_erms()
     [ .... ]
     Call Trace:
      pmem_rw_page()
      bdev_read_page()
      do_mpage_readpage()
      mpage_readpages()
      blkdev_readpages()
      __do_page_cache_readahead()
      force_page_cache_readahead()
      page_cache_sync_readahead()
      generic_file_read_iter()
      blkdev_read_iter()
      __vfs_read()
      vfs_read()
      SyS_read()
      entry_SYSCALL_64_fastpath()
    
    This crash happens because the for loop count calculation in sync_global_pgds()
    is not correct. When a mapping area crosses PGD entries, we should
    calculate the starting address of the region which the next PGD covers
    and assign it to the next for loop count, rather than adding PGDIR_SIZE
    directly. The old code works right only if the mapping area is an exact
    multiple of PGDIR_SIZE; otherwise the end region could be skipped, so
    that it can't be synchronized to all other processes from the kernel
    PGD init_mm.pgd.
    
    In Jeff's system, the emulated pmem area [1024G, 1216G) is smaller than
    PGDIR_SIZE. 'nokaslr' works because PAGE_OFFSET is 1T aligned, so this
    area is mapped inside one PGD entry. With KASLR enabled, this area
    could cross two PGD entries, and then the next PGD entry won't be
    synced to all other processes. That is why we saw an empty PGD.
    
    Fix it.
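
    The difference between the two ways of stepping can be seen in a
    small userspace sketch. It is only an illustration: the sketch_*
    names are hypothetical, a PGD entry is assumed to cover 512G, and
    small offsets are used so that address wraparound does not occur.

```c
#include <stdint.h>

/* Assumed for illustration: one PGD entry covers 512 GiB. */
#define SKETCH_PGDIR_SHIFT 39
#define SKETCH_PGDIR_SIZE  (UINT64_C(1) << SKETCH_PGDIR_SHIFT)

/* Round x up to the next multiple of a, which must be a power of two. */
static uint64_t sketch_align_up(uint64_t x, uint64_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Old stepping: add a full PGDIR_SIZE to wherever the walk started. */
static unsigned int sketch_visits_old(uint64_t start, uint64_t end)
{
	unsigned int visited = 0;

	for (uint64_t addr = start; addr <= end; addr += SKETCH_PGDIR_SIZE)
		visited++;
	return visited;
}

/* Corrected stepping: continue at the start of the region the next PGD covers. */
static unsigned int sketch_visits_new(uint64_t start, uint64_t end)
{
	unsigned int visited = 0;

	for (uint64_t addr = start; addr <= end;
	     addr = sketch_align_up(addr + 1, SKETCH_PGDIR_SIZE))
		visited++;
	return visited;
}
```

    For a 192G area that starts 448G into a PGD entry, the old stepping
    visits one PGD entry and jumps past the end, while the corrected
    stepping also visits the second entry that the area spills into.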
    
    Reported-by: Jeff Moyer <jmoyer@redhat.com>
    Signed-off-by: Baoquan He <bhe@redhat.com>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: Andy Lutomirski <luto@kernel.org>
    Cc: Borislav Petkov <bp@alien8.de>
    Cc: Brian Gerst <brgerst@gmail.com>
    Cc: Dan Williams <dan.j.williams@intel.com>
    Cc: Dave Hansen <dave.hansen@linux.intel.com>
    Cc: Dave Young <dyoung@redhat.com>
    Cc: Denys Vlasenko <dvlasenk@redhat.com>
    Cc: H. Peter Anvin <hpa@zytor.com>
    Cc: Jinbum Park <jinb.park7@gmail.com>
    Cc: Josh Poimboeuf <jpoimboe@redhat.com>
    Cc: Kees Cook <keescook@chromium.org>
    Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Thomas Garnier <thgarnie@google.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Yasuaki Ishimatsu <yasu.isimatu@gmail.com>
    Cc: Yinghai Lu <yinghai@kernel.org>
    Link: http://lkml.kernel.org/r/1493864747-8506-1-git-send-email-bhe@redhat.com
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit a340661a36d529038a2f8144b5fb2b132d227ebc
Author: Vallish Vaidyeshwara <vallish@amazon.com>
Date:   Fri Jun 23 18:53:06 2017 +0000

    dm thin: do not queue freed thin mapping for next stage processing
    
    commit 00a0ea33b495ee6149bf5a77ac5807ce87323abb upstream.
    
    process_prepared_discard_passdown_pt1() should clean up
    dm_thin_new_mapping in cases of error.
    
    dm_pool_inc_data_range() can fail trying to get a block reference:
    
    metadata operation 'dm_pool_inc_data_range' failed: error = -61
    
    When dm_pool_inc_data_range() fails, dm thin aborts the current metadata
    transaction and marks the pool as PM_READ_ONLY. Memory for the thin
    mapping is released as well. However, the current thin mapping will
    still be queued onto the next stage as part of queue_passdown_pt2() or
    passdown_endio(). When this dangling thin mapping memory is processed
    and accessed in the next stage, device mapper crashes.
    
    Code flow without fix:
    -> process_prepared_discard_passdown_pt1(m)
       -> dm_thin_remove_range()
       -> discard passdown
          --> passdown_endio(m) queues m onto next stage
       -> dm_pool_inc_data_range() fails, frees memory m
                but does not remove it from next stage queue
    
    -> process_prepared_discard_passdown_pt2(m)
       -> processes freed memory m and crashes
    
    One such stack:
    
    Call Trace:
    [<ffffffffa037a46f>] dm_cell_release_no_holder+0x2f/0x70 [dm_bio_prison]
    [<ffffffffa039b6dc>] cell_defer_no_holder+0x3c/0x80 [dm_thin_pool]
    [<ffffffffa039b88b>] process_prepared_discard_passdown_pt2+0x4b/0x90 [dm_thin_pool]
    [<ffffffffa0399611>] process_prepared+0x81/0xa0 [dm_thin_pool]
    [<ffffffffa039e735>] do_worker+0xc5/0x820 [dm_thin_pool]
    [<ffffffff8152bf54>] ? __schedule+0x244/0x680
    [<ffffffff81087e72>] ? pwq_activate_delayed_work+0x42/0xb0
    [<ffffffff81089f53>] process_one_work+0x153/0x3f0
    [<ffffffff8108a71b>] worker_thread+0x12b/0x4b0
    [<ffffffff8108a5f0>] ? rescuer_thread+0x350/0x350
    [<ffffffff8108fd6a>] kthread+0xca/0xe0
    [<ffffffff8108fca0>] ? kthread_park+0x60/0x60
    [<ffffffff81530b45>] ret_from_fork+0x25/0x30
    
    The fix is to first take the block ref count for the discarded block and
    then do a passdown discard of this block. If taking the block ref count
    fails, bail out: abort the current metadata transaction, mark the pool
    as PM_READ_ONLY and free the current thin mapping memory (existing
    error handling code) without queueing this thin mapping onto the next
    stage of processing. If taking the block ref count succeeds, do the
    passdown discard of this block; the discard callback passdown_endio()
    will queue this thin mapping onto the next stage of processing.
    
    Code flow with fix:
    -> process_prepared_discard_passdown_pt1(m)
       -> dm_thin_remove_range()
       -> dm_pool_inc_data_range()
          --> if fails, free memory m and bail out
       -> discard passdown
          --> passdown_endio(m) queues m onto next stage
    
    Reviewed-by: Eduardo Valentin <eduval@amazon.com>
    Reviewed-by: Cristian Gafton <gafton@amazon.com>
    Reviewed-by: Anchal Agarwal <anchalag@amazon.com>
    Signed-off-by: Vallish Vaidyeshwara <vallish@amazon.com>
    Reviewed-by: Joe Thornber <ejt@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit d7822ccb7fdcc36ebb96fe4a25902b8358f14fa0
Author: Deepak Rawat <drawat@vmware.com>
Date:   Mon Jun 26 14:39:08 2017 +0200

    drm/vmwgfx: Free hash table allocated by cmdbuf managed res mgr
    
    commit 82fcee526ba8ca2c5d378bdf51b21b7eb058fe3a upstream.
    
    The hash table created during vmw_cmdbuf_res_man_create was
    never freed. This causes a memory leak in context creation.
    Add the corresponding drm_ht_remove in vmw_cmdbuf_res_man_destroy.
    
    Tested for memory leaks by running piglit overnight; kernel
    memory no longer inflates as it did before.
    
    Signed-off-by: Deepak Rawat <drawat@vmware.com>
    Reviewed-by: Sinclair Yeh <syeh@vmware.com>
    Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit bdbe850337374f8aeb49a2d40b9cfe277bbdfcb1
Author: Kan Liang <kan.liang@intel.com>
Date:   Thu Jun 29 12:09:26 2017 -0700

    perf/x86/intel/uncore: Fix wrong box pointer check
    
    commit 80c65fdb4c6920e332a9781a3de5877594b07522 upstream.
    
    A NULL box should not be initialized; doing so will crash the system.
    The issue looks like it was caused by a typo.
    
    This was not noticed because there is no NULL box. Also, most
    boxes are enabled by default, so the init code is not critical.
    
    Fixes: fff4b87e594a ("perf/x86/intel/uncore: Make package handling more robust")
    Signed-off-by: Kan Liang <kan.liang@intel.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Link: http://lkml.kernel.org/r/20170629190926.2456-1-kan.liang@intel.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 60c9685b4287d62fe2621c3f27abc9e4b2906f28
Author: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Date:   Mon Jun 26 11:55:49 2017 -0700

    x86/intel_rdt: Fix memory leak on mount failure
    
    commit 79298acc4ba097e9ab78644e3e38902d73547c92 upstream.
    
    If mount fails, the kn_info directory is not freed, causing a memory leak.
    
    Add the missing error handling path.
    
    Fixes: 4e978d06dedb ("x86/intel_rdt: Add "info" files to resctrl file system")
    Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Cc: ravi.v.shankar@intel.com
    Cc: tony.luck@intel.com
    Cc: fenghua.yu@intel.com
    Cc: peterz@infradead.org
    Cc: vikas.shivappa@intel.com
    Cc: andi.kleen@intel.com
    Link: http://lkml.kernel.org/r/1498503368-20173-3-git-send-email-vikas.shivappa@linux.intel.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit dfa1f2448715d0e2b8229f8d43b9053bb8221cd2
Author: Bartosz Golaszewski <brgl@bgdev.pl>
Date:   Fri Jun 23 13:45:16 2017 +0200

    gpiolib: fix filtering out unwanted events
    
    commit ad537b822577fcc143325786cd6ad50d7b9df31c upstream.
    
    GPIOEVENT_REQUEST_BOTH_EDGES is not a single flag, but a binary OR of
    GPIOEVENT_REQUEST_RISING_EDGE and GPIOEVENT_REQUEST_FALLING_EDGE.
    
    The expression 'le->eflags & GPIOEVENT_REQUEST_BOTH_EDGES' will
    evaluate to true even if only one event type was requested.
    
    Fix it by checking both RISING & FALLING flags explicitly.
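
    The pitfall can be reproduced in a few lines of userspace C. The
    SKETCH_* flags below are local stand-ins that follow the layout
    described above (BOTH_EDGES is the OR of the two edge bits); the
    function names are hypothetical.

```c
#include <stdbool.h>

/* Local stand-ins for the event request flags described above. */
#define SKETCH_RISING_EDGE  (1UL << 0)
#define SKETCH_FALLING_EDGE (1UL << 1)
#define SKETCH_BOTH_EDGES   (SKETCH_RISING_EDGE | SKETCH_FALLING_EDGE)

/* Buggy test: non-zero as soon as either edge bit is set. */
static bool sketch_wants_both_buggy(unsigned long eflags)
{
	return (eflags & SKETCH_BOTH_EDGES) != 0;
}

/* Explicit test: both edge bits must be present. */
static bool sketch_wants_both(unsigned long eflags)
{
	return (eflags & SKETCH_RISING_EDGE) != 0 &&
	       (eflags & SKETCH_FALLING_EDGE) != 0;
}
```

    Comparing the masked value against the full mask, i.e.
    (eflags & SKETCH_BOTH_EDGES) == SKETCH_BOTH_EDGES, would be an
    equivalent way to write the explicit test.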
    
    Fixes: 61f922db7221 ("gpio: userspace ABI for reading GPIO line events")
    Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
    Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 515a95fafafe0d98ca6c35af0674f2aaa7e4d0b9
Author: Miklos Szeredi <mszeredi@redhat.com>
Date:   Wed Jun 28 13:41:22 2017 +0200

    ovl: copy-up: don't unlock between lookup and link
    
    commit e85f82ff9b8ef503923a3be8ca6b5fd1908a7f3f upstream.
    
    Nothing prevents mischief on upper layer while we are busy copying up the
    data.
    
    Move the lookup right before the looked up dentry is actually used.
    
    Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    Fixes: 01ad3eb8a073 ("ovl: concurrent copy up of regular files")
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 003192c3d3e3c38cc604430266e2fe3c5ef5302c
Author: Benjamin Coddington <bcodding@redhat.com>
Date:   Fri Jun 16 11:12:59 2017 -0400

    Revert "NFS: nfs_rename() handle -ERESTARTSYS dentry left behind"
    
    commit d9f2950006f110f54444a10442752372ee568289 upstream.
    
    This reverts commit 920b4530fb80430ff30ef83efe21ba1fa5623731 which could
    call d_move() without holding the directory's i_mutex, and reverts commit
    d4ea7e3c5c0e341c15b073016dbf3ab6c65f12f3 "NFS: Fix old dentry rehash after
    move", which was a follow-up fix.
    
    Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
    Fixes: 920b4530fb80 ("NFS: nfs_rename() handle -ERESTARTSYS dentry left behind")
    Reviewed-by: Jeff Layton <jlayton@redhat.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 95b2e0882b4236b9c2884e38a8c070e8c3cc436a
Author: Trond Myklebust <trond.myklebust@primarydata.com>
Date:   Tue Jun 27 17:33:38 2017 -0400

    NFSv4.1: Fix a race in nfs4_proc_layoutget
    
    commit bd171930e6a3de4f5cffdafbb944e50093dfb59b upstream.
    
    If the task calling layoutget is signalled, then it is possible for the
    calls to nfs4_sequence_free_slot() and nfs4_layoutget_prepare() to race,
    in which case we leak a slot.
    The fix is to move the call to nfs4_sequence_free_slot() into the
    nfs4_layoutget_release() so that it gets called at task teardown time.
    
    Fixes: 2e80dbe7ac51 ("NFSv4.1: Close callback races for OPEN, LAYOUTGET...")
    Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit f8da5dee0901ff291a1a99e2c37a23617d8a52ea
Author: Benjamin Coddington <bcodding@redhat.com>
Date:   Fri Jun 2 11:21:34 2017 -0400

    NFSv4.2: Don't send mode again in post-EXCLUSIVE4_1 SETATTR with umask
    
    commit 501e7a4689378f8b1690089bfdd4f1e12ec22903 upstream.
    
    Now that we have umask support, we shouldn't re-send the mode in a SETATTR
    following an exclusive CREATE, or we risk having the same problem fixed in
    commit 5334c5bdac92 ("NFS: Send attributes in OPEN request for
    NFS4_CREATE_EXCLUSIVE4_1"), which is that files with S_ISGID will have that
    bit stripped away.
    
    Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
    Fixes: dff25ddb4808 ("nfs: add support for the umask attribute")
    Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit f521e0bfcbff93128bffba99343f03c7873a538e
Author: Hui Wang <hui.wang@canonical.com>
Date:   Wed Jun 28 08:59:16 2017 +0800

    ALSA: hda - set input_path bitmap to zero after moving it to new place
    
    commit a8f20fd25bdce81a8e41767c39f456d346b63427 upstream.
    
    Recently we met a problem: the codec has valid adcs and input pins,
    and they can form valid input paths, but the driver does not build
    valid controls for them like "Mic boost", "Capture Volume" and
    "Capture Switch".
    
    Through debugging, I found that the driver needs to shrink the invalid
    adcs and input paths for this machine, so it moves the whole
    column bitmap value to the previous column. After moving it, the
    driver forgets to set the original column bitmap value to zero; as a
    result, the driver will invalidate the path whose index value is the
    original column bitmap value. After executing this function, all
    valid input paths are invalidated by mistake and no valid input
    paths remain, so the driver won't build controls for them.
    
    Fixes: 3a65bcdc577a ("ALSA: hda - Fix inconsistent input_paths after ADC reduction")
    Signed-off-by: Hui Wang <hui.wang@canonical.com>
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit a189e2f294544f539d30a72bf5bc1bc045915339
Author: Takashi Iwai <tiwai@suse.de>
Date:   Wed Jun 28 12:02:02 2017 +0200

    ALSA: hda - Fix endless loop of codec configure
    
    commit d94815f917da770d42c377786dc428f542e38f71 upstream.
    
    azx_codec_configure() loops over the codecs found on the given
    controller via a linked list.  The code used to work in the past, but
    in the current version, this may lead to an endless loop when a codec
    binding returns an error.
    
    The culprit is that the snd_hda_codec_configure() unregisters the
    device upon error, and this eventually deletes the given codec object
    from the bus.  Since the list is initialized via list_del_init(), the
    next object points to the same device itself.  This behavior change
    was introduced when splitting the HD-audio code, and adapting this
    loop was forgotten.
    
    For fixing this bug, just use a *_safe() version of list iteration.
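
    The effect of a *_safe() style walk can be sketched with a tiny
    circular list in userspace. The sketch_* types and functions are
    hypothetical and only imitate the behaviour of list_del_init()
    described above; they are not the kernel list API.

```c
#include <stddef.h>

/* Minimal circular doubly linked list node. */
struct sketch_node {
	struct sketch_node *prev, *next;
};

/* Make n an empty list that points at itself. */
static void sketch_list_init(struct sketch_node *n)
{
	n->prev = n->next = n;
}

/* Append n just before head, i.e. at the tail of the list. */
static void sketch_list_add_tail(struct sketch_node *n, struct sketch_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink n and leave it pointing at itself, as list_del_init() does. */
static void sketch_list_del_init(struct sketch_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	sketch_list_init(n);
}

/*
 * Safe walk: the successor is saved before the body can unlink pos.
 * A plain walk would read pos->next after the unlink, get pos itself
 * back, and never reach head again.
 */
static unsigned int sketch_remove_all(struct sketch_node *head)
{
	struct sketch_node *pos, *next;
	unsigned int visited = 0;

	for (pos = head->next, next = pos->next; pos != head;
	     pos = next, next = pos->next) {
		sketch_list_del_init(pos);
		visited++;
	}
	return visited;
}
```

    Because the successor is read before the current node is unlinked,
    the walk terminates even though every visited node ends up pointing
    at itself.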
    
    Fixes: d068ebc25e6e ("ALSA: hda - Move some codes up to hdac_bus struct")
    Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit f1a8a4fe832433ea40927bcff6b1a83ec6327242
Author: Paul Burton <paul.burton@imgtec.com>
Date:   Fri Mar 3 15:26:05 2017 -0800

    MIPS: Fix IRQ tracing & lockdep when rescheduling
    
    commit d8550860d910c6b7b70f830f59003b33daaa52c9 upstream.
    
    When the scheduler sets TIF_NEED_RESCHED & we call into the scheduler
    from arch/mips/kernel/entry.S we disable interrupts. This is true
    regardless of whether we reach work_resched from syscall_exit_work,
    resume_userspace or by looping after calling schedule(). Although we
    disable interrupts in these paths we don't call trace_hardirqs_off()
    before calling into C code which may acquire locks, and we therefore
    leave lockdep with an inconsistent view of whether interrupts are
    disabled or not when CONFIG_PROVE_LOCKING & CONFIG_DEBUG_LOCKDEP are
    both enabled.
    
    Without tracing this interrupt state lockdep will print warnings such
    as the following once a task returns from a syscall via
    syscall_exit_partial with TIF_NEED_RESCHED set:
    
    [   49.927678] ------------[ cut here ]------------
    [   49.934445] WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:3687 check_flags.part.41+0x1dc/0x1e8
    [   49.946031] DEBUG_LOCKS_WARN_ON(current->hardirqs_enabled)
    [   49.946355] CPU: 0 PID: 1 Comm: init Not tainted 4.10.0-00439-gc9fd5d362289-dirty #197
    [   49.963505] Stack : 0000000000000000 ffffffff81bb5d6a 0000000000000006 ffffffff801ce9c4
    [   49.974431]         0000000000000000 0000000000000000 0000000000000000 000000000000004a
    [   49.985300]         ffffffff80b7e487 ffffffff80a24498 a8000000ff160000 ffffffff80ede8b8
    [   49.996194]         0000000000000001 0000000000000000 0000000000000000 0000000077c8030c
    [   50.007063]         000000007fd8a510 ffffffff801cd45c 0000000000000000 a8000000ff127c88
    [   50.017945]         0000000000000000 ffffffff801cf928 0000000000000001 ffffffff80a24498
    [   50.028827]         0000000000000000 0000000000000001 0000000000000000 0000000000000000
    [   50.039688]         0000000000000000 a8000000ff127bd0 0000000000000000 ffffffff805509bc
    [   50.050575]         00000000140084e0 0000000000000000 0000000000000000 0000000000040a00
    [   50.061448]         0000000000000000 ffffffff8010e1b0 0000000000000000 ffffffff805509bc
    [   50.072327]         ...
    [   50.076087] Call Trace:
    [   50.079869] [<ffffffff8010e1b0>] show_stack+0x80/0xa8
    [   50.086577] [<ffffffff805509bc>] dump_stack+0x10c/0x190
    [   50.093498] [<ffffffff8015dde0>] __warn+0xf0/0x108
    [   50.099889] [<ffffffff8015de34>] warn_slowpath_fmt+0x3c/0x48
    [   50.107241] [<ffffffff801c15b4>] check_flags.part.41+0x1dc/0x1e8
    [   50.114961] [<ffffffff801c239c>] lock_is_held_type+0x8c/0xb0
    [   50.122291] [<ffffffff809461b8>] __schedule+0x8c0/0x10f8
    [   50.129221] [<ffffffff80946a60>] schedule+0x30/0x98
    [   50.135659] [<ffffffff80106278>] work_resched+0x8/0x34
    [   50.142397] ---[ end trace 0cb4f6ef5b99fe21 ]---
    [   50.148405] possible reason: unannotated irqs-off.
    [   50.154600] irq event stamp: 400463
    [   50.159566] hardirqs last  enabled at (400463): [<ffffffff8094edc8>] _raw_spin_unlock_irqrestore+0x40/0xa8
    [   50.171981] hardirqs last disabled at (400462): [<ffffffff8094eb98>] _raw_spin_lock_irqsave+0x30/0xb0
    [   50.183897] softirqs last  enabled at (400450): [<ffffffff8016580c>] __do_softirq+0x4ac/0x6a8
    [   50.195015] softirqs last disabled at (400425): [<ffffffff80165e78>] irq_exit+0x110/0x128
    
    Fix this by using the TRACE_IRQS_OFF macro to call trace_hardirqs_off()
    when CONFIG_TRACE_IRQFLAGS is enabled. This is done before invoking
    schedule() following the work_resched label because:
    
     1) Interrupts are disabled regardless of the path we take to reach
        work_resched() & schedule().
    
     2) Performing the tracing here avoids the need to do it in paths which
        disable interrupts but don't call out to C code before hitting a
        path which uses the RESTORE_SOME macro that will call
        trace_hardirqs_on() or trace_hardirqs_off() as appropriate.
    
    We call trace_hardirqs_on() using the TRACE_IRQS_ON macro before calling
    syscall_trace_leave() for similar reasons, ensuring that lockdep has a
    consistent view of state after we re-enable interrupts.
    
    Signed-off-by: Paul Burton <paul.burton@imgtec.com>
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Cc: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/15385/
    Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 5033e622480a88441a76af48d58fd9202e5acec8
Author: Paul Burton <paul.burton@imgtec.com>
Date:   Thu Mar 2 14:02:40 2017 -0800

    MIPS: pm-cps: Drop manual cache-line alignment of ready_count
    
    commit 161c51ccb7a6faf45ffe09aa5cf1ad85ccdad503 upstream.
    
    We allocate memory for a ready_count variable per-CPU, which is accessed
    via a cached non-coherent TLB mapping to perform synchronisation between
    threads within the core using LL/SC instructions. In order to ensure
    that the variable is contained within its own data cache line we
    allocate 2 lines worth of memory & align the resulting pointer to a line
    boundary. This is however unnecessary, since kmalloc is guaranteed to
    return memory which is at least cache-line aligned (see
    ARCH_DMA_MINALIGN). Stop the redundant manual alignment.
    
    Besides cleaning up the code & avoiding needless work, this has the side
    effect of avoiding an arithmetic error found by Bryan on 64 bit systems
    due to the 32 bit size of the former dlinesz. This led the ready_count
    variable to have its upper 32b cleared erroneously for MIPS64 kernels,
    causing problems when ready_count was later used on MIPS64 via cpuidle.
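
    The exact expression in the driver is not quoted here, so the
    following is only a sketch of the general class of error, with
    hypothetical function names: a mask derived from a 32-bit line size
    is computed in 32 bits and zero-extended, which clears the upper half
    of a 64-bit address.

```c
#include <stdint.h>

/*
 * Buggy pattern: line_size is 32 bits wide, so ~(line_size - 1) is
 * evaluated as a 32-bit value and zero-extended before the AND.
 * The upper 32 bits of the address are lost.
 */
uint64_t align_down_narrow_mask(uint64_t addr, uint32_t line_size)
{
	return addr & ~(line_size - 1);
}

/* Widening before the negation keeps the upper mask bits set. */
uint64_t align_down_wide_mask(uint64_t addr, uint32_t line_size)
{
	return addr & ~((uint64_t)line_size - 1);
}
```

    Dropping the manual alignment altogether, as this patch does, avoids
    the question of mask width entirely.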
    
    Signed-off-by: Paul Burton <paul.burton@imgtec.com>
    Fixes: 3179d37ee1ed ("MIPS: pm-cps: add PM state entry code for CPS systems")
    Reported-by: Bryan O'Donoghue <bryan.odonoghue@imgtec.com>
    Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@imgtec.com>
    Tested-by: Bryan O'Donoghue <bryan.odonoghue@imgtec.com>
    Cc: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/15383/
    Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit dba185d7a5d6be478c9c4d04d7a7ae683fd7b36e
Author: James Hogan <james.hogan@imgtec.com>
Date:   Thu Jun 29 15:05:04 2017 +0100

    MIPS: Avoid accidental raw backtrace
    
    commit 854236363370995a609a10b03e35fd3dc5e9e4a1 upstream.
    
    Since commit 81a76d7119f6 ("MIPS: Avoid using unwind_stack() with
    usermode") show_backtrace() invokes the raw backtracer when
    cp0_status & ST0_KSU indicates user mode to fix issues on EVA kernels
    where user and kernel address spaces overlap.
    
    However this is used by show_stack() which creates its own pt_regs on
    the stack and leaves cp0_status uninitialised in most of the code paths.
    This results in non-deterministic use of the raw backtracer,
    depending on the previous stack content.
    
    show_stack() deals exclusively with kernel mode stacks anyway, so
    explicitly initialise regs.cp0_status to KSU_KERNEL (i.e. 0) to ensure
    we get a useful backtrace.
    
    Fixes: 81a76d7119f6 ("MIPS: Avoid using unwind_stack() with usermode")
    Signed-off-by: James Hogan <james.hogan@imgtec.com>
    Cc: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/16656/
    Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 9de0e07dfb9d6d479a3d48492902d8b9bb59eb4d
Author: Karl Beldan <karl.beldan@gmail.com>
Date:   Tue Jun 27 19:22:16 2017 +0000

    MIPS: head: Reorder instructions missing a delay slot
    
    commit 25d8b92e0af75d72ce8b99e63e5a449cc0888efa upstream.
    
    In this sequence the 'move' is assumed in the delay slot of the 'beq',
    but head.S is in reorder mode and the former gets pushed one 'nop'
    farther by the assembler.
    
    The corrected behavior made booting with an UHI supplied dtb erratic.
    
    Fixes: 15f37e158892 ("MIPS: store the appended dtb address in a variable")
    Signed-off-by: Karl Beldan <karl.beldan+oss@gmail.com>
    Reviewed-by: James Hogan <james.hogan@imgtec.com>
    Cc: Jonas Gorski <jogo@openwrt.org>
    Cc: linux-mips@linux-mips.org
    Cc: linux-kernel@vger.kernel.org
    Patchwork: https://patchwork.linux-mips.org/patch/16614/
    Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit c806e0188da8a36c141219bc7cf02de7803f974f
Author: Juergen Gross <jgross@suse.com>
Date:   Thu May 18 17:28:49 2017 +0200

    xen/blkback: don't use xen_blkif_get() in xen-blkback kthread
    
    commit a24fa22ce22ae302b3bf8f7008896d52d5d57b8d upstream.
    
    There is no need to use xen_blkif_get()/xen_blkif_put() in the kthread
    of xen-blkback. Thread stopping is synchronous, and using the blkif
    reference counting in the kthread prevents the reference count from
    ever dropping to zero when an I/O that runs concurrently with a
    disconnect completes and multiple rings are in use.
    
    Setting ring->xenblkd to NULL after stopping the kthread isn't needed
    as the kthread does this already.
    
    Signed-off-by: Juergen Gross <jgross@suse.com>
    Tested-by: Steven Haigh <netwiz@crc.id.au>
    Acked-by: Roger Pau Monné <roger.pau@citrix.com>
    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 887e338c2e830419c9dfd3890d8524eee0d58bd4
Author: Kinglong Mee <kinglongmee@gmail.com>
Date:   Thu Apr 27 11:13:38 2017 +0800

    NFSv4.x/callback: Create the callback service through svc_create_pooled
    
    commit df807fffaabde625fa9adb82e3e5b88cdaa5709a upstream.
    
    As the comment for svc_set_num_threads() says,
    " Destroying threads relies on the service threads filling in
    rqstp->rq_task, which only the nfs ones do.  Assumes the serv
    has been created using svc_create_pooled()."
    
    If the service is created through svc_create(), svc_pool_map_put()
    is still called in svc_destroy() even though the pool map isn't used.
    The pool map reference is therefore dropped, and the next user of
    the pool map sees npools as zero, resulting in the divide error below.
    
    [  137.992130] divide error: 0000 [#1] SMP
    [  137.992148] Modules linked in: nfsd(E) nfsv4 nfs fscache fuse tun bridge stp llc ip_set nfnetlink vmw_vsock_vmci_transport vsock snd_seq_midi snd_seq_midi_event vmw_balloon coretemp crct10dif_pclmul crc32_pclmul ppdev ghash_clmulni_intel intel_rapl_perf joydev snd_ens1371 gameport snd_ac97_codec ac97_bus snd_seq snd_pcm snd_rawmidi snd_timer snd_seq_device snd soundcore parport_pc parport nfit acpi_cpufreq tpm_tis tpm_tis_core tpm vmw_vmci i2c_piix4 shpchp auth_rpcgss nfs_acl lockd(E) grace sunrpc(E) xfs libcrc32c vmwgfx drm_kms_helper ttm crc32c_intel drm e1000 mptspi scsi_transport_spi serio_raw mptscsih mptbase ata_generic pata_acpi [last unloaded: nfsd]
    [  137.992336] CPU: 0 PID: 4514 Comm: rpc.nfsd Tainted: G            E   4.11.0-rc8+ #536
    [  137.992777] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
    [  137.993757] task: ffff955984101d00 task.stack: ffff9873c2604000
    [  137.994231] RIP: 0010:svc_pool_for_cpu+0x2b/0x80 [sunrpc]
    [  137.994768] RSP: 0018:ffff9873c2607c18 EFLAGS: 00010246
    [  137.995227] RAX: 0000000000000000 RBX: ffff95598376f000 RCX: 0000000000000002
    [  137.995673] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9559944aec00
    [  137.996156] RBP: ffff9873c2607c18 R08: ffff9559944aec28 R09: 0000000000000000
    [  137.996609] R10: 0000000001080002 R11: 0000000000000000 R12: ffff95598376f010
    [  137.997063] R13: ffff95598376f018 R14: ffff9559944aec28 R15: ffff9559944aec00
    [  137.997584] FS:  00007f755529eb40(0000) GS:ffff9559bb600000(0000) knlGS:0000000000000000
    [  137.998048] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  137.998548] CR2: 000055f3aecd9660 CR3: 0000000084290000 CR4: 00000000001406f0
    [  137.999052] Call Trace:
    [  137.999517]  svc_xprt_do_enqueue+0xef/0x260 [sunrpc]
    [  138.000028]  svc_xprt_received+0x47/0x90 [sunrpc]
    [  138.000487]  svc_add_new_perm_xprt+0x76/0x90 [sunrpc]
    [  138.000981]  svc_addsock+0x14b/0x200 [sunrpc]
    [  138.001424]  ? recalc_sigpending+0x1b/0x50
    [  138.001860]  ? __getnstimeofday64+0x41/0xd0
    [  138.002346]  ? do_gettimeofday+0x29/0x90
    [  138.002779]  write_ports+0x255/0x2c0 [nfsd]
    [  138.003202]  ? _copy_from_user+0x4e/0x80
    [  138.003676]  ? write_recoverydir+0x100/0x100 [nfsd]
    [  138.004098]  nfsctl_transaction_write+0x48/0x80 [nfsd]
    [  138.004544]  __vfs_write+0x37/0x160
    [  138.004982]  ? selinux_file_permission+0xd7/0x110
    [  138.005401]  ? security_file_permission+0x3b/0xc0
    [  138.005865]  vfs_write+0xb5/0x1a0
    [  138.006267]  SyS_write+0x55/0xc0
    [  138.006654]  entry_SYSCALL_64_fastpath+0x1a/0xa9
    [  138.007071] RIP: 0033:0x7f7554b9dc30
    [  138.007437] RSP: 002b:00007ffc9f92c788 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
    [  138.007807] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f7554b9dc30
    [  138.008168] RDX: 0000000000000002 RSI: 00005640cd536640 RDI: 0000000000000003
    [  138.008573] RBP: 00007ffc9f92c780 R08: 0000000000000001 R09: 0000000000000002
    [  138.008918] R10: 0000000000000064 R11: 0000000000000246 R12: 0000000000000004
    [  138.009254] R13: 00005640cdbf77a0 R14: 00005640cdbf7720 R15: 00007ffc9f92c238
    [  138.009610] Code: 0f 1f 44 00 00 48 8b 87 98 00 00 00 55 48 89 e5 48 83 78 08 00 74 10 8b 05 07 42 02 00 83 f8 01 74 40 83 f8 02 74 19 31 c0 31 d2 <f7> b7 88 00 00 00 5d 89 d0 48 c1 e0 07 48 03 87 90 00 00 00 c3
    [  138.010664] RIP: svc_pool_for_cpu+0x2b/0x80 [sunrpc] RSP: ffff9873c2607c18
    [  138.011061] ---[ end trace b3468224cafa7d11 ]---
    
    Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit a3042f8a0cdc7b213a8960bd731a97bf17d084c6
Author: Eric Leblond <eric@regit.org>
Date:   Thu May 11 18:56:38 2017 +0200

    netfilter: synproxy: fix conntrackd interaction
    
    commit 87e94dbc210a720a34be5c1174faee5c84be963e upstream.
    
    This patch fixes the creation of connection tracking entry from
    netlink when synproxy is used. It was missing the addition of
    the synproxy extension.
    
    This was causing kernel crashes when a conntrack entry created by
    conntrackd was used after the switch of traffic from active node
    to the passive node.
    
    Signed-off-by: Eric Leblond <eric@regit.org>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit f19613afaf2622f5854a2266e2aa9abc9f9d145a
Author: Serhey Popovych <serhe.popovych@gmail.com>
Date:   Tue Jun 20 14:35:23 2017 +0300

    rtnetlink: add IFLA_GROUP to ifla_policy
    
    
    [ Upstream commit db833d40ad3263b2ee3b59a1ba168bb3cfed8137 ]
    
    Network interface group support was added a while ago; however,
    until now there has been no IFLA_GROUP attribute description in the
    policy or in the netlink message size calculations.
    
    Add IFLA_GROUP attribute to the policy.
    
    Fixes: cbda10fa97d7 ("net_device: add support for network device groups")
    Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 4ade61f463634383f18c614a2b9812ce0f7ac448
Author: Serhey Popovych <serhe.popovych@gmail.com>
Date:   Tue Jun 20 13:29:25 2017 +0300

    ipv6: Do not leak throw route references
    
    
    [ Upstream commit 07f615574f8ac499875b21c1142f26308234a92c ]
    
    While commit 73ba57bfae4a ("ipv6: fix backtracking for throw routes")
    does a good job of propagating errors to fib_rules_lookup() in the
    fib rules core framework, which also corrects throw route handling,
    it does not solve the route reference leak that occurs when we
    return -EAGAIN to fib_rules_lookup() and leave the routing table
    entry referenced in arg->result.
    
    If the rule with the matched throw route isn't the last one matched
    in the list, we overwrite arg->result and lose the reference on the
    previously stored throw route forever.
    
    We also partially revert commit ab997ad40839 ("ipv6: fix the
    incorrect return value of throw route") since we never return a
    routing table entry with dst.error == -EAGAIN when
    CONFIG_IPV6_MULTIPLE_TABLES is on. Also there is no point in
    checking for the RTF_REJECT flag since it is always set for a throw
    route.
    
    Fixes: 73ba57bfae4a ("ipv6: fix backtracking for throw routes")
    Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 16c4d1be8fe4a7039f992506d4a49e58b1ed05ee
Author: Gao Feng <gfree.wind@vip.163.com>
Date:   Fri Jun 16 15:00:02 2017 +0800

    net: 8021q: Fix one possible panic caused by BUG_ON in free_netdev
    
    
    [ Upstream commit 9745e362add89432d2c951272a99b0a5fe4348a9 ]
    
    The register_vlan_device would invoke free_netdev directly, when
    register_vlan_dev failed. It would trigger the BUG_ON in free_netdev
    if the dev was already registered. In this case, the netdev would be
    freed in netdev_run_todo later.
    
    So add a check: free the dev directly only when it is not
    registered.
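
    The shape of that check can be modelled in user space as follows.
    Every name here (fake_dev, STATE_*, fake_register, create_dev) is a
    hypothetical stand-in chosen for this sketch, not the 8021q code:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the netdev registration states. */
enum reg_state { STATE_UNINITIALIZED, STATE_REGISTERED };

struct fake_dev {
	enum reg_state state;
	bool freed;
};

/* Hypothetical helper that always fails, either before or after the
 * device has been registered. */
int fake_register(struct fake_dev *dev, bool fail_after_register)
{
	if (fail_after_register)
		dev->state = STATE_REGISTERED;
	return -1;
}

int create_dev(struct fake_dev *dev, bool fail_after_register)
{
	int err = fake_register(dev, fail_after_register);

	/*
	 * Free directly only if registration never happened. Once the
	 * device was registered, a deferred teardown path owns it and a
	 * direct free here would be the double handling the BUG_ON catches.
	 */
	if (err < 0 && dev->state == STATE_UNINITIALIZED)
		dev->freed = true;
	return err;
}
```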
    
    The following is part of the coredump produced when
    netdev_upper_dev_link failed in register_vlan_dev. I removed the
    lines which are too long.
    
    [  411.237457] ------------[ cut here ]------------
    [  411.237458] kernel BUG at net/core/dev.c:7998!
    [  411.237484] invalid opcode: 0000 [#1] SMP
    [  411.237705]  [last unloaded: 8021q]
    [  411.237718] CPU: 1 PID: 12845 Comm: vconfig Tainted: G            E   4.12.0-rc5+ #6
    [  411.237737] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
    [  411.237764] task: ffff9cbeb6685580 task.stack: ffffa7d2807d8000
    [  411.237782] RIP: 0010:free_netdev+0x116/0x120
    [  411.237794] RSP: 0018:ffffa7d2807dbdb0 EFLAGS: 00010297
    [  411.237808] RAX: 0000000000000002 RBX: ffff9cbeb6ba8fd8 RCX: 0000000000001878
    [  411.237826] RDX: 0000000000000001 RSI: 0000000000000282 RDI: 0000000000000000
    [  411.237844] RBP: ffffa7d2807dbdc8 R08: 0002986100029841 R09: 0002982100029801
    [  411.237861] R10: 0004000100029980 R11: 0004000100029980 R12: ffff9cbeb6ba9000
    [  411.238761] R13: ffff9cbeb6ba9060 R14: ffff9cbe60f1a000 R15: ffff9cbeb6ba9000
    [  411.239518] FS:  00007fb690d81700(0000) GS:ffff9cbebb640000(0000) knlGS:0000000000000000
    [  411.239949] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  411.240454] CR2: 00007f7115624000 CR3: 0000000077cdf000 CR4: 00000000003406e0
    [  411.240936] Call Trace:
    [  411.241462]  vlan_ioctl_handler+0x3f1/0x400 [8021q]
    [  411.241910]  sock_ioctl+0x18b/0x2c0
    [  411.242394]  do_vfs_ioctl+0xa1/0x5d0
    [  411.242853]  ? sock_alloc_file+0xa6/0x130
    [  411.243465]  SyS_ioctl+0x79/0x90
    [  411.243900]  entry_SYSCALL_64_fastpath+0x1e/0xa9
    [  411.244425] RIP: 0033:0x7fb69089a357
    [  411.244863] RSP: 002b:00007ffcd04e0fc8 EFLAGS: 00000202 ORIG_RAX: 0000000000000010
    [  411.245445] RAX: ffffffffffffffda RBX: 00007ffcd04e2884 RCX: 00007fb69089a357
    [  411.245903] RDX: 00007ffcd04e0fd0 RSI: 0000000000008983 RDI: 0000000000000003
    [  411.246527] RBP: 00007ffcd04e0fd0 R08: 0000000000000000 R09: 1999999999999999
    [  411.246976] R10: 000000000000053f R11: 0000000000000202 R12: 0000000000000004
    [  411.247414] R13: 00007ffcd04e1128 R14: 00007ffcd04e2888 R15: 0000000000000001
    [  411.249129] RIP: free_netdev+0x116/0x120 RSP: ffffa7d2807dbdb0
    
    Signed-off-by: Gao Feng <gfree.wind@vip.163.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit c207e0594f5dbb4c8d24118559c7142791360b88
Author: Wei Wang <weiwan@google.com>
Date:   Fri Jun 16 10:46:37 2017 -0700

    decnet: always not take dst->__refcnt when inserting dst into hash table
    
    
    [ Upstream commit 76371d2e3ad1f84426a30ebcd8c3b9b98f4c724f ]
    
    In the existing dn_route.c code, dn_route_output_slow() takes
    dst->__refcnt before calling dn_insert_route() while dn_route_input_slow()
    does not take dst->__refcnt before calling dn_insert_route().
    This makes the whole routing code very buggy.
    In dn_dst_check_expire(), dnrt_free() is called when rt expires. This
    makes the routes inserted by dn_route_output_slow() not able to be
    freed as the refcnt is not released.
    In dn_dst_gc(), dnrt_drop() is called to release rt which could
    potentially cause the dst->__refcnt to be dropped to -1.
    In dn_run_flush(), dst_free() is called to release all the dst. Again,
    it makes the dst inserted by dn_route_output_slow() not able to be
    released and also, it does not wait on the rcu and could potentially
    cause crash in the path where other users still refer to this dst.
    
    This patch makes sure both input and output path do not take
    dst->__refcnt before calling dn_insert_route() and also makes sure
    dnrt_free()/dst_free() is called when removing dst from the hash table.
    The only difference between those 2 calls is that dnrt_free() waits on
    the rcu while dst_free() does not.
    
    Signed-off-by: Wei Wang <weiwan@google.com>
    Acked-by: Martin KaFai Lau <kafai@fb.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 941bdec095d55e0127b4e3bce6e0ea73ee9f5462
Author: Maor Dickman <maord@mellanox.com>
Date:   Thu May 18 15:15:08 2017 +0300

    net/mlx5e: Fix timestamping capabilities reporting
    
    
    [ Upstream commit f0b381178b01b831f9907d72f467d6443afdea67 ]
    
    Misuse of the BIT() macro caused wrong flags to be reported for
    "Hardware Transmit Timestamp Modes" and "Hardware Receive
    Filter Modes".
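
    The faulty expression is not quoted in this message. One common form
    of this kind of mistake, shown here with a locally defined BIT() and
    hypothetical mode indices, is shifting BIT(1) by the index, which
    selects bit n + 1 instead of bit n:

```c
#define BIT(n) (1UL << (n))

/* Hypothetical mode indices, for illustration only. */
enum { MODE_OFF = 0, MODE_ON = 1 };

/* Misuse: BIT(1) is already 2, so shifting it by n sets bit n + 1. */
unsigned long mask_misused(void)
{
	return (BIT(1) << MODE_OFF) | (BIT(1) << MODE_ON);
}

/* Intended: one bit per mode index. */
unsigned long mask_intended(void)
{
	return BIT(MODE_OFF) | BIT(MODE_ON);
}
```

    With these indices the misused form yields 0x6 where 0x3 was meant,
    so every advertised mode is off by one position.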
    
    Fixes: ef9814deafd0 ('net/mlx5e: Add HW timestamping (TS) support')
    Signed-off-by: Maor Dickman <maord@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit f464aace786f4428549f36c4eb0e9da2d7e9b6f7
Author: Eli Cohen <eli@mellanox.com>
Date:   Thu Jun 8 11:33:16 2017 -0500

    net/mlx5: Wait for FW readiness before initializing command interface
    
    
    [ Upstream commit 6c780a0267b8a1075f40b39851132eeaefefcff5 ]
    
    Before attempting to initialize the command interface we must wait till
    the fw_initializing bit is clear.
    
    If we fail to meet this condition the hardware will drop our
    configuration, specifically the descriptors page address.  This scenario
    can happen when the firmware is still executing an FLR flow and did not
    finish yet so the driver needs to wait for that to finish.
    
    Fixes: e3297246c2c8 ('net/mlx5_core: Wait for FW readiness on startup')
    Signed-off-by: Eli Cohen <eli@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit c7d1260afbd09f0240f858620c708ab6f776f4b9
Author: Or Gerlitz <ogerlitz@mellanox.com>
Date:   Thu Jun 15 20:08:32 2017 +0300

    net/mlx5e: Avoid doing a cleanup call if the profile doesn't have it
    
    
    [ Upstream commit 31ac93386d135a6c96de9c8bab406f5ccabf5a4d ]
    
    The error flow of mlx5e_create_netdev calls the cleanup call
    of the given profile without checking if it exists, fix that.
    
    Currently the VF reps don't register that callback and we crash
    if getting into error -- can be reproduced by the user doing ctrl^C
    while attempting to change the sriov mode from legacy to switchdev.
    
    Fixes: 26e59d8077a3 ("net/mlx5e: Implement mlx5e interface attach/detach callbacks")
    Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
    Reported-by: Sabrina Dubroca <sdubroca@redhat.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 050efbe12925bc9fe1fa2695afc2ba186c1f9cc2
Author: Chris Mi <chrism@mellanox.com>
Date:   Tue May 16 07:07:11 2017 -0400

    net/mlx5e: Fix min inline value for VF rep SQs
    
    
    [ Upstream commit 5f195c2c5cba60241004146cd12d71451d6b0fc4 ]
    
    The offending commit only changed the code path for PF/VF, but it
    didn't take care of VF representors. As a result, since
    params->tx_min_inline_mode for VF representors is kzalloced to 0
    (MLX5_INLINE_MODE_NONE), all VF reps SQs were set to that mode.
    
    This actually works on CX5 by default but broke CX4. Fix that by
    adding a call to query the min inline mode from the VF rep build up code.
    
    Fixes: a6f402e49901 ("net/mlx5e: Tx, no inline copy on ConnectX-5")
    Signed-off-by: Chris Mi <chrism@mellanox.com>
    Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit eb0d418f2e579a2e9de610368c3100fe58d2c0ba
Author: Xin Long <lucien.xin@gmail.com>
Date:   Thu Jun 15 17:49:08 2017 +0800

    sctp: return next obj by passing pos + 1 into sctp_transport_get_idx
    
    
    [ Upstream commit 988c7322116970696211e902b468aefec95b6ec4 ]
    
    In sctp_for_each_transport, pos is used to save how many objs it has
    dumped. Now it gets the last obj by sctp_transport_get_idx, then gets
    the next obj by sctp_transport_get_next.
    
    The issue is that if, in the meantime, some objs are removed from the
    transport hashtable so that fewer than pos objs remain,
    sctp_transport_get_idx returns NULL and hti.walker.tbl is NULL as
    well. At that point the walk should stop hti instead of continuing to
    get the next obj; otherwise sctp_transport_get_next dereferences a
    NULL pointer.
    
    This patch passes pos + 1 into sctp_transport_get_idx to get the
    next obj directly. Even if pos exceeds the number of objs, it
    returns NULL and stops hti.
    
    Fixes: 626d16f50f39 ("sctp: export some apis or variables for sctp_diag and reuse some for proc")
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 247ab3c1c1f70add7e2f0a317bf679769677e53b
Author: Xin Long <lucien.xin@gmail.com>
Date:   Thu Jun 15 16:33:58 2017 +0800

    ipv6: fix calling in6_ifa_hold incorrectly for dad work
    
    
    [ Upstream commit f8a894b218138888542a5058d0e902378fd0d4ec ]
    
    Now when starting the dad work in addrconf_mod_dad_work, if the dad work
    is idle and queued, it needs to hold ifa.
    
    The problem is that there is a gap at [1]: if the pending dad work is
    removed elsewhere during it, the hold on ifa is missed even though the
    dad work is still idle and queued.
    
            if (!delayed_work_pending(&ifp->dad_work))
                    in6_ifa_hold(ifp);
                        <--------------[1]
            mod_delayed_work(addrconf_wq, &ifp->dad_work, delay);
    
    A use-after-free issue can be caused by this.
    
    Chen Wei found this issue when WARN_ON(!hlist_unhashed(&ifp->addr_lst)) in
    inet6_ifa_finish_destroy was hit because of it.
    
    Following Hannes' suggestion, this patch fixes it by holding ifa first
    in addrconf_mod_dad_work, then calling mod_delayed_work and putting ifa
    if the dad_work is already queued.
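
    The ordering can be sketched with a small user-space model. The
    fake_* names are hypothetical; the only behaviour borrowed from the
    kernel is that mod_delayed_work() returns true when the work was
    already pending:

```c
#include <stdbool.h>

/* Hypothetical model of an address object with a delayed work item. */
struct fake_ifa {
	int refcnt;
	bool work_pending;
};

/* Models mod_delayed_work(): true if the work was already pending. */
bool fake_mod_delayed_work(struct fake_ifa *ifa)
{
	bool was_pending = ifa->work_pending;

	ifa->work_pending = true;
	return was_pending;
}

void fake_mod_dad_work(struct fake_ifa *ifa)
{
	/* Take the reference before touching the queue, so there is no
	 * window between the pending check and the hold. */
	ifa->refcnt++;
	if (fake_mod_delayed_work(ifa))
		ifa->refcnt--;	/* queued work already owns a reference */
}
```

    Each queued work item ends up owning exactly one reference, whether
    or not it was pending when the call was made.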
    
    Note that this patch did not choose to fix it with:
    
      if (!mod_delayed_work(delay))
              in6_ifa_hold(ifp);
    
    With that approach, when delay == 0 the dad_work would be scheduled
    immediately, so all addrconf_mod_dad_work(0) calls would have to be
    moved under ifp->lock.
    
    Reported-by: Wei Chen <weichen@redhat.com>
    Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
    Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit bdfae324ba1552520f711f9db8ee9c7e81061541
Author: Jesper Dangaard Brouer <brouer@redhat.com>
Date:   Wed Jun 14 13:27:37 2017 +0200

    net: don't global ICMP rate limit packets originating from loopback
    
    
    [ Upstream commit 849a44de91636c24cea799cb8ad8c36433feb913 ]
    
    Florian Weimer seems to have a glibc test-case which requires that
    loopback interfaces do not get ICMP ratelimited.  This was broken by
    commit c0303efeab73 ("net: reduce cycles spend on ICMP replies that
    gets rate limited").
    
    An ICMP response will usually be routed back out the same incoming
    interface.  Thus, take advantage of this and skip global ICMP
    ratelimit when the incoming device is loopback.  In the unlikely event
    that the outgoing device is not loopback, due to strange routing
    policy rules, ICMP rate limiting still works via peer ratelimiting via
    icmpv4_xrlim_allow().  Thus, we should still comply with RFC1812
    (section 4.3.2.8 "Rate Limiting").
    
    This seems to fix the reproducer given by Florian, while still
    avoiding an expensive and unneeded outgoing route lookup for
    rate limited packets (in the non-loopback case).
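
    The decision described above amounts to the following, shown as a
    user-space sketch. fake_pkt and passes_global_limit are hypothetical
    names; the real code inspects the incoming device of the packet:

```c
#include <stdbool.h>

/* Hypothetical input: whether the packet arrived on a loopback device. */
struct fake_pkt {
	bool in_dev_is_loopback;
};

/* Returns true if the reply may proceed past the global limiter. */
bool passes_global_limit(const struct fake_pkt *pkt, bool global_tokens_left)
{
	if (pkt->in_dev_is_loopback)
		return true;	/* loopback traffic skips the global limiter */
	return global_tokens_left;
}
```

    Per-peer limiting is a separate, later step and is not modelled
    here.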
    
    Fixes: c0303efeab73 ("net: reduce cycles spend on ICMP replies that gets rate limited")
    Reported-by: Florian Weimer <fweimer@redhat.com>
    Reported-by: "H.J. Lu" <hjl.tools@gmail.com>
    Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 487dd0ab72ed0fb8a0a1347b4bd140313aa858cc
Author: Bjørn Mork <bjorn@mork.no>
Date:   Tue Jun 13 19:10:18 2017 +0200

    qmi_wwan: new Telewell and Sierra device IDs
    
    
    [ Upstream commit 60cfe1eaccb8af598ebe1bdc44e157ea30fcdd81 ]
    
    A new Sierra Wireless EM7305 device ID used in a Toshiba laptop,
    and two Longcheer device ID entries used by Telewell TW-3G HSPA+
    branded modems.
    
    Reported-by: Petr Kloc <petr_kloc@yahoo.com>
    Reported-by: Teemu Likonen <tlikonen@iki.fi>
    Signed-off-by: Bjørn Mork <bjorn@mork.no>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 5113c2dcb96d7004ed023fbb11fd7ca21bfa6494
Author: WANG Cong <xiyou.wangcong@gmail.com>
Date:   Tue Jun 20 10:46:27 2017 -0700

    igmp: add a missing spin_lock_init()
    
    
    [ Upstream commit b4846fc3c8559649277e3e4e6b5cec5348a8d208 ]
    
    Andrey reported a lockdep warning on non-initialized
    spinlock:
    
     INFO: trying to register non-static key.
     the code is fine but needs lockdep annotation.
     turning off the locking correctness validator.
     CPU: 1 PID: 4099 Comm: a.out Not tainted 4.12.0-rc6+ #9
     Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
     Call Trace:
      __dump_stack lib/dump_stack.c:16
      dump_stack+0x292/0x395 lib/dump_stack.c:52
      register_lock_class+0x717/0x1aa0 kernel/locking/lockdep.c:755
      ? 0xffffffffa0000000
      __lock_acquire+0x269/0x3690 kernel/locking/lockdep.c:3255
      lock_acquire+0x22d/0x560 kernel/locking/lockdep.c:3855
      __raw_spin_lock_bh ./include/linux/spinlock_api_smp.h:135
      _raw_spin_lock_bh+0x36/0x50 kernel/locking/spinlock.c:175
      spin_lock_bh ./include/linux/spinlock.h:304
      ip_mc_clear_src+0x27/0x1e0 net/ipv4/igmp.c:2076
      igmpv3_clear_delrec+0xee/0x4f0 net/ipv4/igmp.c:1194
      ip_mc_destroy_dev+0x4e/0x190 net/ipv4/igmp.c:1736
    
    We miss a spin_lock_init() in igmpv3_add_delrec(), probably
    because previously we never used it on this code path. Since
    we already unlink it from the global mc_tomb list, it is
    probably safe not to acquire this spinlock here. It does no
    harm to have it, though, and it avoids conditional locking.
    
    Fixes: c38b7d327aaf ("igmp: acquire pmc lock for ip_mc_clear_src()")
    Reported-by: Andrey Konovalov <andreyknvl@google.com>
    Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 20407b1d4a3a3bb09003e9ab7ce10aab27d9d91f
Author: WANG Cong <xiyou.wangcong@gmail.com>
Date:   Mon Jun 12 09:52:26 2017 -0700

    igmp: acquire pmc lock for ip_mc_clear_src()
    
    
    [ Upstream commit c38b7d327aafd1e3ad7ff53eefac990673b65667 ]
    
    Andrey reported a use-after-free in add_grec():
    
            for (psf = *psf_list; psf; psf = psf_next) {
                    ...
                    psf_next = psf->sf_next;
    
    where the struct ip_sf_list's were already freed by:
    
     kfree+0xe8/0x2b0 mm/slub.c:3882
     ip_mc_clear_src+0x69/0x1c0 net/ipv4/igmp.c:2078
     ip_mc_dec_group+0x19a/0x470 net/ipv4/igmp.c:1618
     ip_mc_drop_socket+0x145/0x230 net/ipv4/igmp.c:2609
     inet_release+0x4e/0x1c0 net/ipv4/af_inet.c:411
     sock_release+0x8d/0x1e0 net/socket.c:597
     sock_close+0x16/0x20 net/socket.c:1072
    
    This happens because we don't hold pmc->lock in ip_mc_clear_src()
    and a parallel mr_ifc_timer timer could jump in and access them.
    
    The RCU lock is there, but it only protects pmc itself. Holding this
    spinlock actually ensures we don't access the lists in parallel.
    
    Thanks to Eric and Long for discussion on this bug.
    
    Reported-by: Andrey Konovalov <andreyknvl@google.com>
    Cc: Eric Dumazet <edumazet@google.com>
    Cc: Xin Long <lucien.xin@gmail.com>
    Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
    Reviewed-by: Xin Long <lucien.xin@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 42b0540a98b19c0310780698a447e116aec98881
Author: Christian Perle <christian.perle@secunet.com>
Date:   Mon Jun 12 10:06:57 2017 +0200

    proc: snmp6: Use correct type in memset
    
    
    [ Upstream commit 3500cd73dff48f28f4ba80c171c4c80034d40f76 ]
    
    Reading /proc/net/snmp6 yields bogus values on 32 bit kernels.
    Use "u64" instead of "unsigned long" in sizeof().
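    
    A minimal user-space sketch of this bug class, not the kernel code
    itself: a buffer of 64-bit counters is cleared with a size computed
    from a narrower type. narrow_long_t is a hypothetical stand-in for a
    32-bit "unsigned long", and all function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NCOUNTERS 4

/* Hypothetical stand-in for "unsigned long" on a 32-bit kernel. */
typedef uint32_t narrow_long_t;

/* Buggy pattern: the size comes from the wrong element type,
 * so only half of the 64-bit buffer is cleared. */
static void clear_wrong(uint64_t *buf)
{
    assert(buf != NULL);
    memset(buf, 0, NCOUNTERS * sizeof(narrow_long_t));
}

/* Fixed pattern: the size comes from the buffer's own element type. */
static void clear_right(uint64_t *buf)
{
    assert(buf != NULL);
    memset(buf, 0, NCOUNTERS * sizeof(buf[0]));
}

/* Number of counters that are still nonzero after clearing. */
static int count_stale(const uint64_t *buf)
{
    int i, stale = 0;

    for (i = 0; i < NCOUNTERS; i++)
        if (buf[i] != 0)
            stale++;
    return stale;
}
```

    Deriving the size from the destination (sizeof(buf[0])) keeps the
    memset correct even if the element type changes later.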
    
    Fixes: 4a4857b1c81e ("proc: Reduce cache miss in snmp6_seq_show")
    Signed-off-by: Christian Perle <christian.perle@secunet.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 8fc4f0e6c3049d4fcbb41ba15b00398542ab09b0
Author: Majd Dibbiny <majd@mellanox.com>
Date:   Sun May 28 14:47:56 2017 +0300

    net/mlx5: Enable 4K UAR only when page size is bigger than 4K
    
    
    [ Upstream commit 91828bd89940e8145f91751a015bc11bc486aad0 ]
    
    When the page size isn't bigger than 4K, there is no added value of enabling 4K
    UAR feature in the Firmware.
    
    Modified the condition of enabling the 4K UAR accordingly.
    
    Fixes: f502d834950a ("net/mlx5: Activate support for 4K UARs")
    Signed-off-by: Majd Dibbiny <majd@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit f0e6f2314d662a1e6b6bd77e1746a17f744fdacb
Author: Tal Gilboa <talgi@mellanox.com>
Date:   Mon May 29 17:02:55 2017 +0300

    net/mlx5e: Fix wrong indications in DIM due to counter wraparound
    
    
    [ Upstream commit 53acd76ce571e3b71f9205f2d49ab285a9f1aad8 ]
    
    DIM (Dynamically-tuned Interrupt Moderation) is a mechanism designed for
    changing the channel interrupt moderation values in order to reduce CPU
    overhead for all traffic types.
    On each iteration of the algorithm, DIM calculates the difference in
    throughput, packet rate and interrupt rate from the last iteration in
    order to make a decision. DIM relies on counters for each metric. When
    these counters reach their type's max value they wrap around. In this
    case the delta between 'end' and 'start' samples is negative and, when
    translated to unsigned integers, very high. This results in a false
    indication to the algorithm and might result in a wrong decision.
    
    The fix calculates the 'distance' between 'end' and 'start' samples in a
    cyclic way around the relevant type's max value. It can also be viewed as
    an absolute value around the type's max value instead of around 0.
    
    Testing shows higher stability in DIM profile selection and no wraparound
    issues.
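    
    The cyclic distance described above can be sketched as follows. This
    is an illustration under assumptions, not necessarily the driver's
    code: cyclic_delta() and its parameter names are hypothetical, and the
    counter is assumed to wrap at 2^bits.

```c
#include <assert.h>
#include <stdint.h>

/* Distance from start to end on a counter that wraps at 2^bits.
 * Unsigned arithmetic wraps modulo 2^64, so adding the modulus and
 * masking yields the forward distance even when end < start. */
static uint64_t cyclic_delta(unsigned int bits, uint64_t end, uint64_t start)
{
    uint64_t modulus;

    assert(bits >= 1 && bits <= 63);
    modulus = 1ULL << bits;
    return (end - start + modulus) & (modulus - 1);
}
```

    For a 16-bit counter that moved from 65530 to 5, a plain end - start
    gives a huge unsigned value, while the cyclic distance is 11.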
    
    Fixes: cb3c7fd4f839 ("net/mlx5e: Support adaptive RX coalescing")
    Signed-off-by: Tal Gilboa <talgi@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit f98a0883afef8c2a20978891a0b3fb02af50ff62
Author: Tal Gilboa <talgi@mellanox.com>
Date:   Mon May 15 14:13:16 2017 +0300

    net/mlx5e: Added BW check for DIM decision mechanism
    
    
    [ Upstream commit c3164d2fc48fd4fa0477ab658b644559c3fe9073 ]
    
    DIM (Dynamically-tuned Interrupt Moderation) is a mechanism designed for
    changing the channel interrupt moderation values in order to reduce CPU
    overhead for all traffic types.
    Until now only interrupt and packet rate were sampled.
    We found a scenario in which we get a false indication since a change in
    DIM caused more aggregation and reduced packet rate while increasing BW.
    
    We now regard a change as successful iff:
    current_BW > (prev_BW + threshold) or
    current_BW ~= prev_BW and current_PR > (prev_PR + threshold) or
    current_BW ~= prev_BW and current_PR ~= prev_PR and
        current_IR < (prev_IR - threshold)
    Where BW = Bandwidth, PR = Packet rate and IR = Interrupt rate
    
    Improvements (ConnectX-4Lx 25GbE, single RX queue, LRO off)
        --------------------------------------------------
        packet size | before[Mb/s] | after[Mb/s] | gain  |
        2B          | 343.4        | 359.4       |  4.5% |
        16B         | 2739.7       | 2814.8      |  2.7% |
        64B         | 9739         | 10185.3     |  4.5% |
    
    Fixes: cb3c7fd4f839 ("net/mlx5e: Support adaptive RX coalescing")
    Signed-off-by: Tal Gilboa <talgi@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit ff8cf391fed865f7e44c67e7181e51537b184af1
Author: Huy Nguyen <huyn@mellanox.com>
Date:   Mon May 8 11:46:50 2017 -0500

    net/mlx5: Remove several module events out of ethtool stats
    
    
    [ Upstream commit f729860a177d097ac44321fb2f7d927a0c54c5a3 ]
    
    Remove the following module event counters from ethtool stats. The
    reason for removing these event counters is that these events do not
    occur without a technician's intervention.
      module_pwr_budget_exd
      module_long_range
      module_no_eeprom
      module_enforce_part
      module_unknown_id
      module_unknown_status
      module_plug
    
    Fixes: bedb7c909c19 ("net/mlx5e: Add port module event counters to ethtool stats")
    Signed-off-by: Huy Nguyen <huyn@mellanox.com>
    Reviewed-by: Gal Pressman <galp@mellanox.com>
    
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit e938c05e064c67d1de62d0886ffe6267c5085564
Author: Jia-Ju Bai <baijiaju1990@163.com>
Date:   Sat Jun 10 17:03:35 2017 +0800

    net: tipc: Fix a sleep-in-atomic bug in tipc_msg_reverse
    
    
    [ Upstream commit 343eba69c6968190d8654b857aea952fed9a6749 ]
    
    The kernel may sleep under a rcu read lock in tipc_msg_reverse, and the
    function call path is:
    tipc_l2_rcv_msg (acquire the lock by rcu_read_lock)
      tipc_rcv
        tipc_sk_rcv
          tipc_msg_reverse
            pskb_expand_head(GFP_KERNEL) --> may sleep
    tipc_node_broadcast
      tipc_node_xmit_skb
        tipc_node_xmit
          tipc_sk_rcv
            tipc_msg_reverse
              pskb_expand_head(GFP_KERNEL) --> may sleep
    
    To fix it, "GFP_KERNEL" is replaced with "GFP_ATOMIC".
    
    Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit a89d15fc8bac19f5ae10b87d3a1df48d3457b102
Author: Jia-Ju Bai <baijiaju1990@163.com>
Date:   Sat Jun 10 16:49:39 2017 +0800

    net: caif: Fix a sleep-in-atomic bug in cfpkt_create_pfx
    
    
    [ Upstream commit f146e872eb12ebbe92d8e583b2637e0741440db3 ]
    
    The kernel may sleep under a rcu read lock in cfpkt_create_pfx, and the
    function call path is:
    cfcnfg_linkup_rsp (acquire the lock by rcu_read_lock)
      cfctrl_linkdown_req
        cfpkt_create
          cfpkt_create_pfx
            alloc_skb(GFP_KERNEL) --> may sleep
    cfserl_receive (acquire the lock by rcu_read_lock)
      cfpkt_split
        cfpkt_create_pfx
          alloc_skb(GFP_KERNEL) --> may sleep
    
    cfpkt_create_pfx uses "in_interrupt" to decide whether to use
    "GFP_KERNEL" or "GFP_ATOMIC". In this situation, "GFP_KERNEL" is used
    because the function is called under a rcu read lock rather than in
    interrupt context.
    
    To fix it, only "GFP_ATOMIC" is used in cfpkt_create_pfx.
    
    Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 29b8ea35b0c1d4d6c4ab4e3c3ccfb774a22a3bb7
Author: Xin Long <lucien.xin@gmail.com>
Date:   Sat Jun 10 14:48:14 2017 +0800

    sctp: disable BH in sctp_for_each_endpoint
    
    
    [ Upstream commit 581409dacc9176b0de1f6c4ca8d66e13aa8e1b29 ]
    
    sctp currently holds the read_lock without disabling BH while iterating
    over sctp_ep_hashtable. If the CPU switches to another thread A at this
    moment, thread A may try to take the write_lock with BH disabled.
    
    Because BH is disabled, the CPU cannot switch back to the thread holding
    the read_lock, while thread A keeps waiting for the read_lock to be
    released. This results in a deadlock.
    
    This patch is to fix this dead lock by calling read_lock_bh instead to
    disable BH when holding the read_lock in sctp_for_each_endpoint.
    
    Fixes: 626d16f50f39 ("sctp: export some apis or variables for sctp_diag and reuse some for proc")
    Reported-by: Xiumei Mu <xmu@redhat.com>
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit e607742172be5f4d7970f38e3dc263848ff37cf0
Author: Krister Johansen <kjlx@templeofstupid.com>
Date:   Thu Jun 8 13:12:38 2017 -0700

    Fix an intermittent pr_emerg warning about lo becoming free.
    
    
    [ Upstream commit f186ce61bb8235d80068c390dc2aad7ca427a4c2 ]
    
    It looks like this:
    
    Message from syslogd@flamingo at Apr 26 00:45:00 ...
     kernel:unregister_netdevice: waiting for lo to become free. Usage count = 4
    
    They seem to coincide with net namespace teardown.
    
    The message is emitted by netdev_wait_allrefs().
    
    Forced a kdump in netdev_run_todo, but found that the refcount on the lo
    device was already 0 at the time we got to the panic.
    
    Used bcc to check the blocking in netdev_run_todo.  The only places
    where we're off cpu there are in the rcu_barrier() and msleep() calls.
    That behavior is expected.  The msleep time coincides with the amount of
    time we spend waiting for the refcount to reach zero; the rcu_barrier()
    wait times are not excessive.
    
    After looking through the list of callbacks that the netdevice notifiers
    invoke in this path, it appears that the dst_dev_event is the most
    interesting.  The dst_ifdown path places a hold on the loopback_dev as
    part of releasing the dev associated with the original dst cache entry.
    Most of our notifier callbacks are straight-forward, but this one a)
    looks complex, and b) places a hold on the network interface in
    question.
    
    I constructed a new bcc script that watches various events in the
    lifetime of a dst cache entry.  Note that dst_ifdown will take a hold on
    the loopback device until the invalidated dst entry gets freed.
    
    [      __dst_free] on DST: ffff883ccabb7900 IF tap1008300eth0 invoked at 1282115677036183
        __dst_free
        rcu_nocb_kthread
        kthread
        ret_from_fork
    Acked-by: Eric Dumazet <edumazet@google.com>
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 6e8f72327b7743db71827efd2c327a4beec94c3b
Author: Mateusz Jurczyk <mjurczyk@google.com>
Date:   Thu Jun 8 11:13:36 2017 +0200

    af_unix: Add sockaddr length checks before accessing sa_family in bind and connect handlers
    
    
    [ Upstream commit defbcf2decc903a28d8398aa477b6881e711e3ea ]
    
    Verify that the caller-provided sockaddr structure is large enough to
    contain the sa_family field, before accessing it in bind() and connect()
    handlers of the AF_UNIX socket. Since neither syscall enforces a minimum
    size of the corresponding memory region, very short sockaddrs (zero or
    one byte long) result in operating on uninitialized memory while
    referencing .sa_family.
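    
    A user-space sketch of this kind of check, under assumptions:
    unix_addr_ok() and OFFSET_OF_END() are hypothetical names, and the
    macro only mirrors the idea of "the offset just past a member". The
    point is that the length is validated before sa_family is read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Offset of the first byte past a struct member (illustrative helper). */
#define OFFSET_OF_END(type, member) \
    (offsetof(type, member) + sizeof(((type *)0)->member))

/* Reject addresses too short to contain sa_family before reading it. */
static bool unix_addr_ok(const struct sockaddr *addr, size_t addr_len)
{
    if (addr == NULL)
        return false;
    if (addr_len < OFFSET_OF_END(struct sockaddr, sa_family))
        return false;
    return addr->sa_family == AF_UNIX;
}
```

    With the length test first, a zero or one byte sockaddr is rejected
    without ever touching the sa_family field.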
    
    Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 83cb92f4cf116e059d83c6e103c7fd9acbac76d9
Author: David Ahern <dsahern@gmail.com>
Date:   Thu Jun 8 11:31:11 2017 -0600

    net: vrf: Make add_fib_rules per network namespace flag
    
    
    [ Upstream commit 097d3c9508dc58286344e4a22b300098cf0c1566 ]
    
    Commit 1aa6c4f6b8cd8 ("net: vrf: Add l3mdev rules on first device create")
    adds the l3mdev FIB rule the first time a VRF device is created. However,
    it only creates the rule once and only in the namespace the first device
    is created - which may not be init_net. Fix by using the net_generic
    capability to make the add_fib_rules flag per network namespace.
    
    Fixes: 1aa6c4f6b8cd8 ("net: vrf: Add l3mdev rules on first device create")
    Reported-by: Petr Machata <petrm@mellanox.com>
    Signed-off-by: David Ahern <dsahern@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 5586883813df1a46a99d79b8ab69963b9f3401ba
Author: David Ahern <dsahern@gmail.com>
Date:   Wed Jun 7 12:26:23 2017 -0600

    net: ipv6: Release route when device is unregistering
    
    
    [ Upstream commit 8397ed36b7c585f8d3e06c431f4137309124f78f ]
    
    Roopa reported that attempts to delete a bond device that is referenced
    in a multipath route hang:
    
    $ ifdown bond2    # ifupdown2 command that deletes virtual devices
    unregister_netdevice: waiting for bond2 to become free. Usage count = 2
    
    Steps to reproduce:
        echo 1 > /proc/sys/net/ipv6/conf/all/ignore_routes_with_linkdown
        ip link add dev bond12 type bond
        ip link add dev bond13 type bond
        ip addr add 2001:db8:2::0/64 dev bond12
        ip addr add 2001:db8:3::0/64 dev bond13
        ip route add 2001:db8:33::0/64 nexthop via 2001:db8:2::2 nexthop via 2001:db8:3::2
        ip link del dev bond12
        ip link del dev bond13
    
    The root cause is the recent change to keep routes on a linkdown. Update
    the check to detect when the device is unregistering and release the
    route for that case.
    
    Fixes: a1a22c12060e4 ("net: ipv6: Keep nexthop of multipath route on admin down")
    Reported-by: Roopa Prabhu <roopa@cumulusnetworks.com>
    Signed-off-by: David Ahern <dsahern@gmail.com>
    Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 199f4baff672c12657637791964b194b8212706d
Author: Mintz, Yuval <Yuval.Mintz@cavium.com>
Date:   Wed Jun 7 21:00:33 2017 +0300

    net: Zero ifla_vf_info in rtnl_fill_vfinfo()
    
    
    [ Upstream commit 0eed9cf58446b28b233388b7f224cbca268b6986 ]
    
    Some of the structure's fields are not initialized by rtnetlink.
    If a driver doesn't set those in ndo_get_vf_config(), they would
    leak kernel memory to the user.
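    
    The pattern can be sketched in user space as follows. Everything here
    is illustrative: struct vf_info is a simplified stand-in rather than
    the real ifla_vf_info layout, and partial_get_vf_config() plays the
    role of a driver callback that fills only some fields.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for the per-VF info structure. */
struct vf_info {
    uint32_t vf;
    uint32_t vlan;
    uint32_t max_tx_rate;
    uint32_t trusted;
};

/* Hypothetical driver callback that sets only two of the fields. */
static void partial_get_vf_config(uint32_t vf, struct vf_info *ivi)
{
    ivi->vf = vf;
    ivi->vlan = 100;
}

/* Zero the whole structure first so that fields the callback skips
 * never carry stale memory contents to the consumer. */
static void fill_vf_info(uint32_t vf, struct vf_info *out)
{
    assert(out != NULL);
    memset(out, 0, sizeof(*out));
    partial_get_vf_config(vf, out);
}
```

    Zeroing up front is cheaper to reason about than auditing every
    driver for the fields it happens to leave untouched.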
    
    Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
    CC: Michal Schmidt <mschmidt@redhat.com>
    Reviewed-by: Greg Rose <gvrose8192@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit d8d01fc9bad3e0d06c34cb4ed64254b34ca6da06
Author: Mateusz Jurczyk <mjurczyk@google.com>
Date:   Wed Jun 7 16:14:29 2017 +0200

    decnet: dn_rtmsg: Improve input length sanitization in dnrmg_receive_user_skb
    
    
    [ Upstream commit dd0da17b209ed91f39872766634ca967c170ada1 ]
    
    Verify that the length of the socket buffer is sufficient to cover the
    nlmsghdr structure before accessing the nlh->nlmsg_len field for further
    input sanitization. If the client only supplies 1-3 bytes of data in
    sk_buff, then nlh->nlmsg_len remains partially uninitialized and
    contains leftover memory from the corresponding kernel allocation.
    Operating on such data may result in indeterminate evaluation of the
    nlmsg_len < sizeof(*nlh) expression.
    
    The bug was discovered by a runtime instrumentation designed to detect
    use of uninitialized memory in the kernel. The patch prevents this and
    other similar tools (e.g. KMSAN) from flagging this behavior in the future.
    
    Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit e34cacd27f477800cb37cffbd055b0560a94065e
Author: Johannes Berg <johannes.berg@intel.com>
Date:   Fri Jun 9 21:33:09 2017 +0200

    mac80211: free netdev on dev_alloc_name() error
    
    
    [ Upstream commit c7a61cba71fd151cc7d9ebe53a090e0e61eeebf3 ]
    
    The change to remove free_netdev() from ieee80211_if_free()
    erroneously didn't add the necessary free_netdev() for when
    ieee80211_if_free() is called directly in one place, rather
    than as the priv_destructor. Add the missing call.
    
    Fixes: cf124db566e6 ("net: Fix inconsistent teardown and release of private netdev state.")
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 64603b75f8f625437866b896bbad466459dcf7ed
Author: Stephen Rothwell <sfr@canb.auug.org.au>
Date:   Thu Jun 8 19:06:29 2017 +1000

    net: s390: fix up for "Fix inconsistent teardown and release of private netdev state"
    
    
    [ Upstream commit cd1997f6c11483da819a7719aa013093b8003743 ]
    
    Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 95876855a55072572895a236b156ffb357fd5538
Author: David S. Miller <davem@davemloft.net>
Date:   Mon May 8 12:52:56 2017 -0400

    net: Fix inconsistent teardown and release of private netdev state.
    
    
    [ Upstream commit cf124db566e6b036b8bcbe8decbed740bdfac8c6 ]
    
    Network devices can allocate resources and private memory using
    netdev_ops->ndo_init().  However, the release of these resources
    can occur in one of two different places.
    
    Either netdev_ops->ndo_uninit() or netdev->destructor().
    
    The decision of which operation frees the resources depends upon
    whether it is necessary for all netdev refs to be released before it
    is safe to perform the freeing.
    
    netdev_ops->ndo_uninit() presumably can occur right after the
    NETDEV_UNREGISTER notifier completes and the unicast and multicast
    address lists are flushed.
    
    netdev->destructor(), on the other hand, does not run until the
    netdev references all go away.
    
    Further complicating the situation is that netdev->destructor()
    almost universally also does a free_netdev().
    
    This creates a problem for the logic in register_netdevice(),
    because all callers of register_netdevice() manage the freeing
    of the netdev, and invoke free_netdev(dev) if register_netdevice()
    fails.
    
    If netdev_ops->ndo_init() succeeds, but something else fails inside
    of register_netdevice(), it does call ndo_ops->ndo_uninit().  But
    it is not able to invoke netdev->destructor().
    
    This is because netdev->destructor() will do a free_netdev() and
    then the caller of register_netdevice() will do the same.
    
    However, this means that the resources that would normally be released
    by netdev->destructor() will not be.
    
    Over the years drivers have added local hacks to deal with this, by
    invoking their destructor parts by hand when register_netdevice()
    fails.
    
    Many drivers do not try to deal with this, and instead we have leaks.
    
    Let's close this hole by formalizing the distinction between what
    private things need to be freed up by netdev->destructor() and whether
    the driver needs unregister_netdevice() to perform the free_netdev().
    
    netdev->priv_destructor() performs all actions to free up the private
    resources that used to be freed by netdev->destructor(), except for
    free_netdev().
    
    netdev->needs_free_netdev is a boolean that indicates whether
    free_netdev() should be done at the end of unregister_netdevice().
    
    Now, register_netdevice() can sanely release all resources after
    ndo_ops->ndo_init() succeeds, by invoking both ndo_ops->ndo_uninit()
    and netdev->priv_destructor().
    
    And at the end of unregister_netdevice(), we invoke
    netdev->priv_destructor() and optionally call free_netdev().
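    
    A toy user-space model of the split described above, under heavy
    simplification: struct toy_netdev and the toy_*() functions are
    hypothetical and stand in for the real netdev machinery. Only the
    field names priv_destructor and needs_free_netdev come from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct toy_netdev {
    void *priv;                                      /* private resources */
    void (*priv_destructor)(struct toy_netdev *dev); /* frees priv only */
    bool needs_free_netdev;                          /* free dev on unregister? */
};

/* Releases private resources but never the device itself. */
static void toy_priv_destructor(struct toy_netdev *dev)
{
    free(dev->priv);
    dev->priv = NULL;
}

static struct toy_netdev *toy_alloc(bool needs_free_netdev)
{
    struct toy_netdev *dev = calloc(1, sizeof(*dev));

    if (!dev)
        return NULL;
    dev->priv = malloc(16);
    if (!dev->priv) {
        free(dev);
        return NULL;
    }
    dev->priv_destructor = toy_priv_destructor;
    dev->needs_free_netdev = needs_free_netdev;
    return dev;
}

/* fail_late simulates a failure after private state was set up. The
 * private state is released here; the caller still owns dev itself. */
static int toy_register(struct toy_netdev *dev, bool fail_late)
{
    assert(dev != NULL);
    if (fail_late) {
        if (dev->priv_destructor)
            dev->priv_destructor(dev);
        return -1;
    }
    return 0;
}

/* Returns true if dev itself was freed here. */
static bool toy_unregister(struct toy_netdev *dev)
{
    assert(dev != NULL);
    if (dev->priv_destructor)
        dev->priv_destructor(dev);
    if (dev->needs_free_netdev) {
        free(dev);
        return true;
    }
    return false;
}
```

    Because the destructor no longer frees the device, the failure path
    in toy_register() can release private state without risking a double
    free when the caller then frees the device.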
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 3227b51e72f4d8e33cd6274e471c08b2fbec6e2e
Author: Alexander Potapenko <glider@google.com>
Date:   Tue Jun 6 15:56:54 2017 +0200

    net: don't call strlen on non-terminated string in dev_set_alias()
    
    
    [ Upstream commit c28294b941232931fbd714099798eb7aa7e865d7 ]
    
    KMSAN reported a use of uninitialized memory in dev_set_alias(),
    which was caused by calling strlcpy() (which in turn called strlen())
    on the user-supplied non-terminated string.
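    
    A minimal sketch of a safe alternative, assuming the caller knows the
    byte count: copy exactly that many bytes and terminate explicitly,
    rather than letting a string function search for a NUL that may not
    exist. copy_unterminated() is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies exactly len bytes from a buffer that may lack a terminating NUL
 * and terminates the destination explicitly. Returns 0 on success, or -1
 * if the destination cannot hold len bytes plus the terminator. */
static int copy_unterminated(char *dst, size_t dst_size,
                             const char *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    if (len >= dst_size)
        return -1;
    memcpy(dst, src, len);
    dst[len] = '\0';
    return 0;
}
```

    Unlike strlcpy(), this never reads past src[len - 1], so it is safe
    on input that carries a length but no terminator.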
    
    Signed-off-by: Alexander Potapenko <glider@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
