Changelog in Linux kernel 6.1.138

ALSA: usb-audio: Add second USB ID for Jabra Evolve 65 headset

Author: Joachim Priesner <[email protected]>
Date:   Mon Apr 28 07:36:06 2025 +0200

    ALSA: usb-audio: Add second USB ID for Jabra Evolve 65 headset
    
    commit 1149719442d28c96dc63cad432b5a6db7c300e1a upstream.
    
    There seem to be multiple USB device IDs used for these;
    the one I have reports as 0b0e:030c when powered on.
    (When powered off, it reports as 0b0e:0311.)
    
    Signed-off-by: Joachim Priesner <[email protected]>
    Cc: <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

amd-xgbe: Fix to ensure dependent features are toggled with RX checksum offload

Author: Vishal Badole <[email protected]>
Date:   Thu Apr 24 18:32:48 2025 +0530

    amd-xgbe: Fix to ensure dependent features are toggled with RX checksum offload
    
    commit f04dd30f1bef1ed2e74a4050af6e5e5e3869bac3 upstream.
    
    According to the XGMAC specification, enabling features such as Layer 3
    and Layer 4 Packet Filtering, Split Header and Virtualized Network support
    automatically selects the IPC Full Checksum Offload Engine on the receive
    side.
    
    When RX checksum offload is disabled, these dependent features must also
    be disabled to prevent abnormal behavior caused by mismatched feature
    dependencies.
    
    Ensure that toggling RX checksum offload (disabling or enabling) properly
    disables or enables all dependent features, maintaining consistent and
    expected behavior in the network device.
    
    Cc: [email protected]
    Fixes: 1a510ccf5869 ("amd-xgbe: Add support for VXLAN offload capabilities")
    Signed-off-by: Vishal Badole <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

arm64: errata: Add missing sentinels to Spectre-BHB MIDR arrays

Author: Will Deacon <[email protected]>
Date:   Thu May 1 11:47:47 2025 +0100

    arm64: errata: Add missing sentinels to Spectre-BHB MIDR arrays
    
    commit fee4d171451c1ad9e8aaf65fc0ab7d143a33bd72 upstream.
    
    Commit a5951389e58d ("arm64: errata: Add newer ARM cores to the
    spectre_bhb_loop_affected() lists") added some additional CPUs to the
    Spectre-BHB workaround, including some new arrays for designs that
    require new 'k' values for the workaround to be effective.
    
    Unfortunately, the new arrays omitted the sentinel entry and so
    is_midr_in_range_list() will walk off the end when it doesn't find a
    match. With UBSAN enabled, this leads to a crash during boot when
    is_midr_in_range_list() is inlined (which was more common prior to
    c8c2647e69be ("arm64: Make  _midr_in_range_list() an exported
    function")):
    
     |  Internal error: aarch64 BRK: 00000000f2000001 [#1] PREEMPT SMP
     |  pstate: 804000c5 (Nzcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
     |  pc : spectre_bhb_loop_affected+0x28/0x30
     |  lr : is_spectre_bhb_affected+0x170/0x190
     | [...]
     |  Call trace:
     |   spectre_bhb_loop_affected+0x28/0x30
     |   update_cpu_capabilities+0xc0/0x184
     |   init_cpu_features+0x188/0x1a4
     |   cpuinfo_store_boot_cpu+0x4c/0x60
     |   smp_prepare_boot_cpu+0x38/0x54
     |   start_kernel+0x8c/0x478
     |   __primary_switched+0xc8/0xd4
     |  Code: 6b09011f 54000061 52801080 d65f03c0 (d4200020)
     |  ---[ end trace 0000000000000000 ]---
     |  Kernel panic - not syncing: aarch64 BRK: Fatal exception
    
    Add the missing sentinel entries.
    
    Cc: Lee Jones <[email protected]>
    Cc: James Morse <[email protected]>
    Cc: Doug Anderson <[email protected]>
    Cc: Shameer Kolothum <[email protected]>
    Cc: <[email protected]>
    Reported-by: Greg Kroah-Hartman <[email protected]>
    Fixes: a5951389e58d ("arm64: errata: Add newer ARM cores to the spectre_bhb_loop_affected() lists")
    Signed-off-by: Will Deacon <[email protected]>
    Reviewed-by: Lee Jones <[email protected]>
    Reviewed-by: Douglas Anderson <[email protected]>
    Reviewed-by: Greg Kroah-Hartman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Catalin Marinas <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
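
The role of the sentinel entry described in the arm64 errata fix above can be illustrated with a minimal user-space sketch. The names `demo_midr_range`, `demo_midr_in_list()` and `demo_k_list`, as well as the model values, are hypothetical stand-ins and not the kernel's definitions; the sketch only assumes that the list walker stops at an all-zero entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical stand-in for a MIDR range entry. */
struct demo_midr_range {
    uint32_t model; /* 0 marks the end of the list (the sentinel) */
};

/* Walks the list until it reaches the zeroed sentinel entry. */
static bool demo_midr_in_list(uint32_t midr, const struct demo_midr_range *ranges)
{
    assert(ranges != NULL);
    for (; ranges->model != 0; ranges++) {
        if (ranges->model == midr)
            return true;
    }
    return false;
}

/* Arbitrary example values; the trailing { 0 } is the sentinel. */
static const struct demo_midr_range demo_k_list[] = {
    { 0xd0c },
    { 0xd40 },
    { 0 },
};
```

Without the final `{ 0 }` entry, a lookup that finds no match would keep reading past the end of the array, which is the out-of-bounds walk the commit message describes.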

ARM: dts: opos6ul: add ksz8081 phy properties

Author: Sébastien Szymanski <[email protected]>
Date:   Fri Mar 14 17:20:38 2025 +0100

    ARM: dts: opos6ul: add ksz8081 phy properties
    
    [ Upstream commit 6e1a7bc8382b0d4208258f7d2a4474fae788dd90 ]
    
    Commit c7e73b5051d6 ("ARM: imx: mach-imx6ul: remove 14x14 EVK specific
    PHY fixup") removed a PHY fixup that set the clock mode and the LED
    mode.
    Make the Ethernet interface work again by doing as advised in that
    commit's log: set the clock mode and the LED mode in the device tree.
    
    Fixes: c7e73b5051d6 ("ARM: imx: mach-imx6ul: remove 14x14 EVK specific PHY fixup")
    Signed-off-by: Sébastien Szymanski <[email protected]>
    Reviewed-by: Oleksij Rempel <[email protected]>
    Signed-off-by: Shawn Guo <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ASoC: soc-core: Stop using of_property_read_bool() for non-boolean properties

Author: Geert Uytterhoeven <[email protected]>
Date:   Wed Jan 22 09:21:27 2025 +0100

    ASoC: soc-core: Stop using of_property_read_bool() for non-boolean properties
    
    [ Upstream commit 6eab7034579917f207ca6d8e3f4e11e85e0ab7d5 ]
    
    On R-Car:
    
        OF: /sound: Read of boolean property 'simple-audio-card,bitclock-master' with a value.
        OF: /sound: Read of boolean property 'simple-audio-card,frame-master' with a value.
    
    or:
    
        OF: /soc/sound@ec500000/ports/port@0/endpoint: Read of boolean property 'bitclock-master' with a value.
        OF: /soc/sound@ec500000/ports/port@0/endpoint: Read of boolean property 'frame-master' with a value.
    
    The use of of_property_read_bool() for non-boolean properties is
    deprecated in favor of of_property_present() when testing for property
    presence.
    
    Replace testing for presence before calling of_property_read_u32() by
    testing for an -EINVAL return value from the latter, to simplify the
    code.
    
    Signed-off-by: Geert Uytterhoeven <[email protected]>
    Link: https://patch.msgid.link/db10e96fbda121e7456d70e97a013cbfc9755f4d.1737533954.git.geert+renesas@glider.be
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ASoC: soc-pcm: Fix hw_params() and DAPM widget sequence

Author: Sheetal <[email protected]>
Date:   Fri Apr 4 10:59:53 2025 +0000

    ASoC: soc-pcm: Fix hw_params() and DAPM widget sequence
    
    [ Upstream commit 9aff2e8df240e84a36f2607f98a0a9924a24e65d ]
    
    Issue:
     When multiple audio streams share a common BE DAI, the BE DAI
     widget can be powered up before its hardware parameters are configured.
     This incorrect sequence leads to intermittent pcm_write errors.
    
     For example, the below Tegra use-case throws an error:
      aplay(2 streams) -> AMX(mux) -> ADX(demux) -> arecord(2 streams),
      here, 'AMX TX' and 'ADX RX' are common BE DAIs.
    
    For the above use case, the following sequence is observed when the
    failure happens:
     aplay(1) FE open()
      - BE DAI callbacks added to the list
      - BE DAI state = SND_SOC_DPCM_STATE_OPEN
     aplay(2) FE open()
      - BE DAI callbacks are not added to the list as the state is
        already SND_SOC_DPCM_STATE_OPEN during aplay(1) FE open().
     aplay(2) FE hw_params()
      - BE DAI hw_params() callback ignored
     aplay(2) FE prepare()
      - Widget is powered ON without BE DAI hw_params() call
     aplay(1) FE hw_params()
      - BE DAI hw_params() is now called
    
    Fix:
     Also add BE DAIs to the list if their state is either
     SND_SOC_DPCM_STATE_OPEN or SND_SOC_DPCM_STATE_HW_PARAMS.
    
    This ensures the widget is powered ON only after the BE DAI hw_params()
    callback.
    
    Fixes: 0c25db3f7621 ("ASoC: soc-pcm: Don't reconnect an already active BE")
    Signed-off-by: Sheetal <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ASoC: Use of_property_read_bool()

Author: Rob Herring (Arm) <[email protected]>
Date:   Wed Jul 31 13:12:58 2024 -0600

    ASoC: Use of_property_read_bool()
    
    [ Upstream commit 69dd15a8ef0ae494179fd15023aa8172188db6b7 ]
    
    Use of_property_read_bool() to read boolean properties rather than
    of_get_property(). This is part of a larger effort to remove callers
    of of_get_property() and similar functions. of_get_property() leaks
    the DT property data pointer which is a problem for dynamically
    allocated nodes which may be freed.
    
    Signed-off-by: Rob Herring (Arm) <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Stable-dep-of: 6eab70345799 ("ASoC: soc-core: Stop using of_property_read_bool() for non-boolean properties")
    Signed-off-by: Sasha Levin <[email protected]>

bnxt_en: Fix coredump logic to free allocated buffer

Author: Shruti Parab <[email protected]>
Date:   Mon Apr 28 15:59:01 2025 -0700

    bnxt_en: Fix coredump logic to free allocated buffer
    
    [ Upstream commit ea9376cf68230e05492f22ca45d329f16e262c7b ]
    
    When handling HWRM_DBG_COREDUMP_LIST FW command in
    bnxt_hwrm_dbg_dma_data(), the allocated buffer info->dest_buf is
    not freed in the error path.  In the normal path, info->dest_buf
    is assigned to coredump->data and it will eventually be freed after
    the coredump is collected.
    
    Free info->dest_buf immediately inside bnxt_hwrm_dbg_dma_data() in
    the error path.
    
    Fixes: c74751f4c392 ("bnxt_en: Return error if FW returns more data than dump length")
    Reported-by: Michael Chan <[email protected]>
    Reviewed-by: Kalesh AP <[email protected]>
    Signed-off-by: Shruti Parab <[email protected]>
    Signed-off-by: Michael Chan <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

bnxt_en: Fix ethtool -d byte order for 32-bit values

Author: Michael Chan <[email protected]>
Date:   Mon Apr 28 15:59:03 2025 -0700

    bnxt_en: Fix ethtool -d byte order for 32-bit values
    
    [ Upstream commit 02e8be5a032cae0f4ca33c6053c44d83cf4acc93 ]
    
    For version 1 register dump that includes the PCIe stats, the existing
    code incorrectly assumes that all PCIe stats are 64-bit values.  Fix it
    by using an array containing the starting and ending index of the 32-bit
    values.  The loop in bnxt_get_regs() will use the array to do proper
    endian swap for the 32-bit values.
    
    Fixes: b5d600b027eb ("bnxt_en: Add support for 'ethtool -d'")
    Reviewed-by: Shruti Parab <[email protected]>
    Reviewed-by: Kalesh AP <[email protected]>
    Reviewed-by: Andy Gospodarek <[email protected]>
    Signed-off-by: Michael Chan <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
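
The idea of driving the endian swap from a table of 32-bit value ranges, as described in the ethtool -d fix above, can be sketched in portable user-space C. Everything here is an assumption made for illustration: `demo_range`, `demo_regs_to_host()` and the byte offsets are hypothetical, and the driver's actual table and loop may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical byte range [start, end) that holds 32-bit values. */
struct demo_range {
    size_t start;
    size_t end;
};

/* Decode a little-endian 32-bit value from raw bytes. */
static uint32_t demo_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode a little-endian 64-bit value from raw bytes. */
static uint64_t demo_le64(const uint8_t *p)
{
    return (uint64_t)demo_le32(p) | ((uint64_t)demo_le32(p + 4) << 32);
}

/*
 * Convert a little-endian register dump to host byte order.
 * Offsets inside one of the sorted r32[] ranges are treated as 32-bit
 * values; everything else is treated as 64-bit values.
 */
static void demo_regs_to_host(uint8_t *dst, const uint8_t *src, size_t len,
                              const struct demo_range *r32, size_t nr)
{
    size_t i = 0, j = 0;

    assert(dst != NULL && src != NULL);
    while (i < len) {
        /* Skip ranges that lie entirely before the current offset. */
        while (j < nr && i >= r32[j].end)
            j++;
        if (j < nr && i >= r32[j].start && i + 4 <= len) {
            uint32_t v = demo_le32(src + i);
            memcpy(dst + i, &v, sizeof(v));
            i += 4;
        } else if (i + 8 <= len) {
            uint64_t v = demo_le64(src + i);
            memcpy(dst + i, &v, sizeof(v));
            i += 8;
        } else {
            break; /* trailing bytes too short for a full value */
        }
    }
}
```

Treating a pair of adjacent 32-bit values as one 64-bit value would exchange the two halves on a big-endian host, which is the kind of mix-up the range table avoids.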

bnxt_en: Fix out-of-bound memcpy() during ethtool -w

Author: Shruti Parab <[email protected]>
Date:   Mon Apr 28 15:59:02 2025 -0700

    bnxt_en: Fix out-of-bound memcpy() during ethtool -w
    
    [ Upstream commit 6b87bd94f34370bbf1dfa59352bed8efab5bf419 ]
    
    When retrieving the FW coredump using ethtool, it can sometimes cause
    memory corruption:
    
    BUG: KFENCE: memory corruption in __bnxt_get_coredump+0x3ef/0x670 [bnxt_en]
    Corrupted memory at 0x000000008f0f30e8 [ ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ] (in kfence-#45):
    __bnxt_get_coredump+0x3ef/0x670 [bnxt_en]
    ethtool_get_dump_data+0xdc/0x1a0
    __dev_ethtool+0xa1e/0x1af0
    dev_ethtool+0xa8/0x170
    dev_ioctl+0x1b5/0x580
    sock_do_ioctl+0xab/0xf0
    sock_ioctl+0x1ce/0x2e0
    __x64_sys_ioctl+0x87/0xc0
    do_syscall_64+0x5c/0xf0
    entry_SYSCALL_64_after_hwframe+0x78/0x80
    
    ...
    
    This happens when copying the coredump segment list in
    bnxt_hwrm_dbg_dma_data() with the HWRM_DBG_COREDUMP_LIST FW command.
    The info->dest_buf buffer is allocated based on the number of coredump
    segments returned by the FW.  The segment list is then DMA'ed by
    the FW and the length of the DMA is returned by FW.  The driver then
    copies this DMA'ed segment list to info->dest_buf.
    
    In some cases, this DMA length may exceed the info->dest_buf length
    and cause the above BUG condition.  Fix it by capping the copy
    length to not exceed the length of info->dest_buf.  The extra
    DMA data contains no useful information.
    
    This code path is shared for the HWRM_DBG_COREDUMP_LIST and the
    HWRM_DBG_COREDUMP_RETRIEVE FW commands.  The buffering is different
    for these 2 FW commands.  To simplify the logic, we need to move
    the line to adjust the buffer length for HWRM_DBG_COREDUMP_RETRIEVE
    up, so that the new check to cap the copy length will work for both
    commands.
    
    Fixes: c74751f4c392 ("bnxt_en: Return error if FW returns more data than dump length")
    Reviewed-by: Kalesh AP <[email protected]>
    Signed-off-by: Shruti Parab <[email protected]>
    Signed-off-by: Michael Chan <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
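
The capping described in the out-of-bound memcpy() fix above amounts to clamping the copy length to the destination size. A minimal sketch, assuming a hypothetical helper `demo_copy_capped()` that is not part of the driver:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy at most dst_len bytes from src to dst, even if the reported
 * source length is larger. Returns the number of bytes copied.
 */
static size_t demo_copy_capped(void *dst, size_t dst_len,
                               const void *src, size_t src_len)
{
    size_t n = src_len < dst_len ? src_len : dst_len;

    assert(dst != NULL && src != NULL);
    memcpy(dst, src, n);
    return n;
}
```

Any source bytes beyond the destination size are dropped, which matches the commit's note that the extra DMA data contains no useful information.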

cpufreq: Avoid using inconsistent policy->min and policy->max

Author: Rafael J. Wysocki <[email protected]>
Date:   Wed Apr 16 16:12:37 2025 +0200

    cpufreq: Avoid using inconsistent policy->min and policy->max
    
    commit 7491cdf46b5cbdf123fc84fbe0a07e9e3d7b7620 upstream.
    
    Since cpufreq_driver_resolve_freq() can run in parallel with
    cpufreq_set_policy() and there is no synchronization between them,
    the former may access policy->min and policy->max while the latter
    is updating them and it may see intermediate values of them due
    to the way the update is carried out.  Also the compiler is free
    to apply any optimizations it wants both to the stores in
    cpufreq_set_policy() and to the loads in cpufreq_driver_resolve_freq()
    which may result in additional inconsistencies.
    
    To address this, use WRITE_ONCE() when updating policy->min and
    policy->max in cpufreq_set_policy() and use READ_ONCE() for reading
    them in cpufreq_driver_resolve_freq().  Moreover, rearrange the update
    in cpufreq_set_policy() to avoid storing intermediate values in
    policy->min and policy->max with the help of the observation that
    their new values are expected to be properly ordered upfront.
    
    Also modify cpufreq_driver_resolve_freq() to take the possible reverse
    ordering of policy->min and policy->max, which may happen depending on
    the ordering of operations when this function and cpufreq_set_policy()
    run concurrently, into account by always honoring the max when it
    turns out to be less than the min (in case it comes from thermal
    throttling or similar).
    
    Fixes: 151717690694 ("cpufreq: Make policy min/max hard requirements")
    Cc: 5.16+ <[email protected]> # 5.16+
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Reviewed-by: Christian Loehle <[email protected]>
    Acked-by: Viresh Kumar <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
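
The read side of the cpufreq change above can be approximated in user space. This is only a sketch: C11 relaxed atomic loads stand in for READ_ONCE(), and `demo_policy` and `demo_resolve_freq()` are hypothetical names rather than the kernel's code. It shows each limit being read exactly once and the max winning when the two limits are observed in reverse order.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the min/max fields of a cpufreq policy. */
struct demo_policy {
    _Atomic unsigned int min;
    _Atomic unsigned int max;
};

/* Clamp target to the policy limits using a single snapshot of each. */
static unsigned int demo_resolve_freq(struct demo_policy *p, unsigned int target)
{
    unsigned int min, max;

    assert(p != NULL);
    /* Read each limit once so later checks use consistent values. */
    min = atomic_load_explicit(&p->min, memory_order_relaxed);
    max = atomic_load_explicit(&p->max, memory_order_relaxed);

    /* A concurrent update may expose min > max; honor the max. */
    if (min > max)
        min = max;

    if (target < min)
        return min;
    if (target > max)
        return max;
    return target;
}
```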

cpufreq: Fix setting policy limits when frequency tables are used

Author: Rafael J. Wysocki <[email protected]>
Date:   Fri Apr 25 13:36:21 2025 +0200

    cpufreq: Fix setting policy limits when frequency tables are used
    
    commit b79028039f440e7d2c4df6ab243060c4e3803e84 upstream.
    
    Commit 7491cdf46b5c ("cpufreq: Avoid using inconsistent policy->min and
    policy->max") overlooked the fact that policy->min and policy->max were
    accessed directly in cpufreq_frequency_table_target() and in the
    functions called by it.  Consequently, the changes made by that commit
    led to problems with setting policy limits.
    
    Address this by passing the target frequency limits to __resolve_freq()
    and cpufreq_frequency_table_target() and propagating them to the
    functions called by the latter.
    
    Fixes: 7491cdf46b5c ("cpufreq: Avoid using inconsistent policy->min and policy->max")
    Cc: 5.16+ <[email protected]> # 5.16+
    Closes: https://lore.kernel.org/linux-pm/[email protected]/
    Reported-by: Stephan Gerhold <[email protected]>
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Tested-by: Stephan Gerhold <[email protected]>
    Reviewed-by: Lifeng Zheng <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

dm-bufio: don't schedule in atomic context

Author: LongPing Wei <[email protected]>
Date:   Thu Apr 17 11:07:38 2025 +0800

    dm-bufio: don't schedule in atomic context
    
    commit a3d8f0a7f5e8b193db509c7191fefeed3533fc44 upstream.
    
    The following BUG was reported when CONFIG_DEBUG_ATOMIC_SLEEP and
    try_verify_in_tasklet are enabled:
    [  129.444685][  T934] BUG: sleeping function called from invalid context at drivers/md/dm-bufio.c:2421
    [  129.444723][  T934] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 934, name: kworker/1:4
    [  129.444740][  T934] preempt_count: 201, expected: 0
    [  129.444756][  T934] RCU nest depth: 0, expected: 0
    [  129.444781][  T934] Preemption disabled at:
    [  129.444789][  T934] [<ffffffd816231900>] shrink_work+0x21c/0x248
    [  129.445167][  T934] kernel BUG at kernel/sched/walt/walt_debug.c:16!
    [  129.445183][  T934] Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
    [  129.445204][  T934] Skip md ftrace buffer dump for: 0x1609e0
    [  129.447348][  T934] CPU: 1 PID: 934 Comm: kworker/1:4 Tainted: G        W  OE      6.6.56-android15-8-o-g6f82312b30b9-debug #1 1400000003000000474e5500b3187743670464e8
    [  129.447362][  T934] Hardware name: Qualcomm Technologies, Inc. Parrot QRD, Alpha-M (DT)
    [  129.447373][  T934] Workqueue: dm_bufio_cache shrink_work
    [  129.447394][  T934] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    [  129.447406][  T934] pc : android_rvh_schedule_bug+0x0/0x8 [sched_walt_debug]
    [  129.447435][  T934] lr : __traceiter_android_rvh_schedule_bug+0x44/0x6c
    [  129.447451][  T934] sp : ffffffc0843dbc90
    [  129.447459][  T934] x29: ffffffc0843dbc90 x28: ffffffffffffffff x27: 0000000000000c8b
    [  129.447479][  T934] x26: 0000000000000040 x25: ffffff804b3d6260 x24: ffffffd816232b68
    [  129.447497][  T934] x23: ffffff805171c5b4 x22: 0000000000000000 x21: ffffffd816231900
    [  129.447517][  T934] x20: ffffff80306ba898 x19: 0000000000000000 x18: ffffffc084159030
    [  129.447535][  T934] x17: 00000000d2b5dd1f x16: 00000000d2b5dd1f x15: ffffffd816720358
    [  129.447554][  T934] x14: 0000000000000004 x13: ffffff89ef978000 x12: 0000000000000003
    [  129.447572][  T934] x11: ffffffd817a823c4 x10: 0000000000000202 x9 : 7e779c5735de9400
    [  129.447591][  T934] x8 : ffffffd81560d004 x7 : 205b5d3938373434 x6 : ffffffd8167397c8
    [  129.447610][  T934] x5 : 0000000000000000 x4 : 0000000000000001 x3 : ffffffc0843db9e0
    [  129.447629][  T934] x2 : 0000000000002f15 x1 : 0000000000000000 x0 : 0000000000000000
    [  129.447647][  T934] Call trace:
    [  129.447655][  T934]  android_rvh_schedule_bug+0x0/0x8 [sched_walt_debug 1400000003000000474e550080cce8a8a78606b6]
    [  129.447681][  T934]  __might_resched+0x190/0x1a8
    [  129.447694][  T934]  shrink_work+0x180/0x248
    [  129.447706][  T934]  process_one_work+0x260/0x624
    [  129.447718][  T934]  worker_thread+0x28c/0x454
    [  129.447729][  T934]  kthread+0x118/0x158
    [  129.447742][  T934]  ret_from_fork+0x10/0x20
    [  129.447761][  T934] Code: ???????? ???????? ???????? d2b5dd1f (d4210000)
    [  129.447772][  T934] ---[ end trace 0000000000000000 ]---
    
    dm_bufio_lock will call spin_lock_bh when try_verify_in_tasklet
    is enabled, and __scan will be called in atomic context.
    
    Fixes: 7cd326747f46 ("dm bufio: remove dm_bufio_cond_resched()")
    Signed-off-by: LongPing Wei <[email protected]>
    Cc: [email protected]
    Signed-off-by: Mikulas Patocka <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

dm-integrity: fix a warning on invalid table line

Author: Mikulas Patocka <[email protected]>
Date:   Tue Apr 22 21:18:33 2025 +0200

    dm-integrity: fix a warning on invalid table line
    
    commit 0a533c3e4246c29d502a7e0fba0e86d80a906b04 upstream.
    
    If we use the 'B' mode and we have an invalid table line,
    cancel_delayed_work_sync would trigger a warning. This commit avoids the
    warning.
    
    Signed-off-by: Mikulas Patocka <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

dm: always update the array size in realloc_argv on success

Author: Benjamin Marzinski <[email protected]>
Date:   Tue Apr 15 00:17:16 2025 -0400

    dm: always update the array size in realloc_argv on success
    
    commit 5a2a6c428190f945c5cbf5791f72dbea83e97f66 upstream.
    
    realloc_argv() was only updating the array size if it was called with
    old_argv already allocated. The first time it was called to create an
    argv array, it would allocate the array but return the array size as
    zero. dm_split_args() would think that it couldn't store any arguments
    in the array and would call realloc_argv() again, causing it to
    reallocate the initial slots (this time using GFP_KERNEL) and finally
    return a size. Aside from being wasteful, this could cause deadlocks on
    targets that need to process messages without starting new IO. Instead,
    realloc_argv should always update the allocated array size on success.
    
    Fixes: a0651926553c ("dm table: don't copy from a NULL pointer in realloc_argv()")
    Cc: [email protected]
    Signed-off-by: Benjamin Marzinski <[email protected]>
    Signed-off-by: Mikulas Patocka <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

dm: fix copying after src array boundaries

Author: Tudor Ambarus <[email protected]>
Date:   Tue May 6 11:31:50 2025 +0000

    dm: fix copying after src array boundaries
    
    commit f1aff4bc199cb92c055668caed65505e3b4d2656 upstream.
    
    The blamed commit copied to argv the size of the reallocated argv,
    instead of the size of the old_argv, thus reading and copying from
    past the old_argv allocated memory.
    
    The following BUG_ON was hit:
    [    3.038929][    T1] kernel BUG at lib/string_helpers.c:1040!
    [    3.039147][    T1] Internal error: Oops - BUG: 00000000f2000800 [#1]  SMP
    ...
    [    3.056489][    T1] Call trace:
    [    3.056591][    T1]  __fortify_panic+0x10/0x18 (P)
    [    3.056773][    T1]  dm_split_args+0x20c/0x210
    [    3.056942][    T1]  dm_table_add_target+0x13c/0x360
    [    3.057132][    T1]  table_load+0x110/0x3ac
    [    3.057292][    T1]  dm_ctl_ioctl+0x424/0x56c
    [    3.057457][    T1]  __arm64_sys_ioctl+0xa8/0xec
    [    3.057634][    T1]  invoke_syscall+0x58/0x10c
    [    3.057804][    T1]  el0_svc_common+0xa8/0xdc
    [    3.057970][    T1]  do_el0_svc+0x1c/0x28
    [    3.058123][    T1]  el0_svc+0x50/0xac
    [    3.058266][    T1]  el0t_64_sync_handler+0x60/0xc4
    [    3.058452][    T1]  el0t_64_sync+0x1b0/0x1b4
    [    3.058620][    T1] Code: f800865e a9bf7bfd 910003fd 941f48aa (d4210000)
    [    3.058897][    T1] ---[ end trace 0000000000000000 ]---
    [    3.059083][    T1] Kernel panic - not syncing: Oops - BUG: Fatal exception
    
    Fix it by copying the size of src rather than the size of dst, as was
    done before the blamed commit.
    
    Fixes: 5a2a6c428190 ("dm: always update the array size in realloc_argv on success")
    Cc: [email protected]
    Signed-off-by: Tudor Ambarus <[email protected]>
    Signed-off-by: Mikulas Patocka <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
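
The combined effect of the two dm realloc_argv() fixes above can be sketched in user space. This is an illustration under stated assumptions, not the kernel function: `demo_realloc_argv()` is a hypothetical name, malloc()/free() replace the kernel allocators, and the GFP flag selection is omitted. The two points it shows are that only the old number of slots is copied and that the size is updated on every successful allocation, including the first.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Grow an argv array. *size holds the current slot count on entry
 * (0 if nothing is allocated yet) and the new slot count on success.
 */
static char **demo_realloc_argv(unsigned int *size, char **old_argv)
{
    unsigned int new_size;
    char **argv;

    assert(size != NULL);
    new_size = *size ? *size * 2 : 8;
    argv = malloc(new_size * sizeof(*argv));
    if (argv) {
        /* Copy only the old slots: *size still holds the old count. */
        if (old_argv)
            memcpy(argv, old_argv, *size * sizeof(*argv));
        /* Always report the new size on success, even on first use. */
        *size = new_size;
    }
    free(old_argv);
    return argv;
}
```

Updating `*size` before the memcpy() would make the copy read `new_size` slots from an array that only has the old number of slots, which is the overread the second commit fixes.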

drm/amd/display: Add scoped mutexes for amdgpu_dm_dhcp

Author: Mario Limonciello <[email protected]>
Date:   Fri Feb 28 13:30:01 2025 -0600

    drm/amd/display: Add scoped mutexes for amdgpu_dm_dhcp
    
    [ Upstream commit 6b675ab8efbf2bcee25be29e865455c56e246401 ]
    
    [Why]
    Guards automatically release the mutex when they go out of scope,
    making the code easier to follow.
    
    [How]
    Replace all use of mutex_lock()/mutex_unlock() with guard(mutex).
    
    Reviewed-by: Alex Hung <[email protected]>
    Signed-off-by: Mario Limonciello <[email protected]>
    Signed-off-by: Tom Chung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Stable-dep-of: be593d9d91c5 ("drm/amd/display: Fix slab-use-after-free in hdcp")
    Signed-off-by: Sasha Levin <[email protected]>
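
The scope-based release that the commit above relies on can be approximated outside the kernel with the GCC/Clang `cleanup` variable attribute. The sketch below is an analogy only: `demo_lock`, `DEMO_GUARD()` and `demo_update()` are hypothetical, and a plain counter stands in for a real mutex so that the effect is easy to observe.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lock: a counter standing in for a real mutex. */
struct demo_lock {
    int held;
};

static void demo_lock_acquire(struct demo_lock *l)
{
    assert(l != NULL);
    l->held++;
}

/* Cleanup handler: receives a pointer to the guard variable. */
static void demo_lock_release(struct demo_lock **l)
{
    (*l)->held--;
}

/* Take the lock now and release it when the enclosing scope ends. */
#define DEMO_GUARD(l) \
    struct demo_lock *demo_scope_guard \
        __attribute__((cleanup(demo_lock_release), unused)) = \
        (demo_lock_acquire(l), (l))

/* Both return paths release the lock without an explicit unlock call. */
static int demo_update(struct demo_lock *l, int value)
{
    DEMO_GUARD(l);

    if (value < 0)
        return -1;  /* early exit: the guard still releases the lock */
    return l->held; /* evaluated while the lock is held */
}
```

The benefit is the same one the commit message names: early returns and error paths no longer need matching unlock calls.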

drm/amd/display: Change HDCP update sequence for DM

Author: Bhawanpreet Lakha <[email protected]>
Date:   Mon Jul 24 16:32:47 2023 -0400

    drm/amd/display: Change HDCP update sequence for DM
    
    [ Upstream commit 393e83484839970e4975dfa1f0666f939a6f3e3d ]
    
    Refactor the sequence in hdcp_update_display() to use
    mod_hdcp_update_display().
    
    Previous sequence:
            - remove()->add()
    
    This sequence was used to update the display because
    mod_hdcp_update_display() didn't exist at the time. This meant that for
    any HDCP update (type change, enable/disable) we would remove,
    reconstruct, and add, which eventually leads to unnecessary calls to psp.
    
    New Sequence using mod_hdcp_update_display():
            - add() once when stream is enabled
            - use update() for all updates
    
    The update function checks for prev == new states and will not
    unnecessarily end up calling psp via add/remove.
    
    Reviewed-by: Qingqing Zhuo <[email protected]>
    Acked-by: Tom Chung <[email protected]>
    Signed-off-by: Bhawanpreet Lakha <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Stable-dep-of: be593d9d91c5 ("drm/amd/display: Fix slab-use-after-free in hdcp")
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Clean up style problems in amdgpu_dm_hdcp.c

Author: Srinivasan Shanmugam <[email protected]>
Date:   Thu Jul 13 17:10:57 2023 +0530

    drm/amd/display: Clean up style problems in amdgpu_dm_hdcp.c
    
    [ Upstream commit a19de9dbb4d293c064b02cec8ef134cb9812d639 ]
    
    Conform to Linux kernel coding style.
    
    And promote sysfs entry for set/get srm to kdoc.
    
    Suggested-by: Rodrigo Siqueira <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Aurabindo Pillai <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Stable-dep-of: be593d9d91c5 ("drm/amd/display: Fix slab-use-after-free in hdcp")
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Fix slab-use-after-free in hdcp

Author: Chris Bainbridge <[email protected]>
Date:   Thu Apr 17 16:50:05 2025 -0500

    drm/amd/display: Fix slab-use-after-free in hdcp
    
    [ Upstream commit be593d9d91c5a3a363d456b9aceb71029aeb3f1d ]
    
    The HDCP code in amdgpu_dm_hdcp.c copies pointers to amdgpu_dm_connector
    objects without incrementing the kref reference counts. When using a
    USB-C dock, and the dock is unplugged, the corresponding
    amdgpu_dm_connector objects are freed, creating dangling pointers in the
    HDCP code. When the dock is plugged back, the dangling pointers are
    dereferenced, resulting in a slab-use-after-free:
    
    [   66.775837] BUG: KASAN: slab-use-after-free in event_property_validate+0x42f/0x6c0 [amdgpu]
    [   66.776171] Read of size 4 at addr ffff888127804120 by task kworker/0:1/10
    
    [   66.776179] CPU: 0 UID: 0 PID: 10 Comm: kworker/0:1 Not tainted 6.14.0-rc7-00180-g54505f727a38-dirty #233
    [   66.776183] Hardware name: HP HP Pavilion Aero Laptop 13-be0xxx/8916, BIOS F.17 12/18/2024
    [   66.776186] Workqueue: events event_property_validate [amdgpu]
    [   66.776494] Call Trace:
    [   66.776496]  <TASK>
    [   66.776497]  dump_stack_lvl+0x70/0xa0
    [   66.776504]  print_report+0x175/0x555
    [   66.776507]  ? __virt_addr_valid+0x243/0x450
    [   66.776510]  ? kasan_complete_mode_report_info+0x66/0x1c0
    [   66.776515]  kasan_report+0xeb/0x1c0
    [   66.776518]  ? event_property_validate+0x42f/0x6c0 [amdgpu]
    [   66.776819]  ? event_property_validate+0x42f/0x6c0 [amdgpu]
    [   66.777121]  __asan_report_load4_noabort+0x14/0x20
    [   66.777124]  event_property_validate+0x42f/0x6c0 [amdgpu]
    [   66.777342]  ? __lock_acquire+0x6b40/0x6b40
    [   66.777347]  ? enable_assr+0x250/0x250 [amdgpu]
    [   66.777571]  process_one_work+0x86b/0x1510
    [   66.777575]  ? pwq_dec_nr_in_flight+0xcf0/0xcf0
    [   66.777578]  ? assign_work+0x16b/0x280
    [   66.777580]  ? lock_is_held_type+0xa3/0x130
    [   66.777583]  worker_thread+0x5c0/0xfa0
    [   66.777587]  ? process_one_work+0x1510/0x1510
    [   66.777588]  kthread+0x3a2/0x840
    [   66.777591]  ? kthread_is_per_cpu+0xd0/0xd0
    [   66.777594]  ? trace_hardirqs_on+0x4f/0x60
    [   66.777597]  ? _raw_spin_unlock_irq+0x27/0x60
    [   66.777599]  ? calculate_sigpending+0x77/0xa0
    [   66.777602]  ? kthread_is_per_cpu+0xd0/0xd0
    [   66.777605]  ret_from_fork+0x40/0x90
    [   66.777607]  ? kthread_is_per_cpu+0xd0/0xd0
    [   66.777609]  ret_from_fork_asm+0x11/0x20
    [   66.777614]  </TASK>
    
    [   66.777643] Allocated by task 10:
    [   66.777646]  kasan_save_stack+0x39/0x60
    [   66.777649]  kasan_save_track+0x14/0x40
    [   66.777652]  kasan_save_alloc_info+0x37/0x50
    [   66.777655]  __kasan_kmalloc+0xbb/0xc0
    [   66.777658]  __kmalloc_cache_noprof+0x1c8/0x4b0
    [   66.777661]  dm_dp_add_mst_connector+0xdd/0x5c0 [amdgpu]
    [   66.777880]  drm_dp_mst_port_add_connector+0x47e/0x770 [drm_display_helper]
    [   66.777892]  drm_dp_send_link_address+0x1554/0x2bf0 [drm_display_helper]
    [   66.777901]  drm_dp_check_and_send_link_address+0x187/0x1f0 [drm_display_helper]
    [   66.777909]  drm_dp_mst_link_probe_work+0x2b8/0x410 [drm_display_helper]
    [   66.777917]  process_one_work+0x86b/0x1510
    [   66.777919]  worker_thread+0x5c0/0xfa0
    [   66.777922]  kthread+0x3a2/0x840
    [   66.777925]  ret_from_fork+0x40/0x90
    [   66.777927]  ret_from_fork_asm+0x11/0x20
    
    [   66.777932] Freed by task 1713:
    [   66.777935]  kasan_save_stack+0x39/0x60
    [   66.777938]  kasan_save_track+0x14/0x40
    [   66.777940]  kasan_save_free_info+0x3b/0x60
    [   66.777944]  __kasan_slab_free+0x52/0x70
    [   66.777946]  kfree+0x13f/0x4b0
    [   66.777949]  dm_dp_mst_connector_destroy+0xfa/0x150 [amdgpu]
    [   66.778179]  drm_connector_free+0x7d/0xb0
    [   66.778184]  drm_mode_object_put.part.0+0xee/0x160
    [   66.778188]  drm_mode_object_put+0x37/0x50
    [   66.778191]  drm_atomic_state_default_clear+0x220/0xd60
    [   66.778194]  __drm_atomic_state_free+0x16e/0x2a0
    [   66.778197]  drm_mode_atomic_ioctl+0x15ed/0x2ba0
    [   66.778200]  drm_ioctl_kernel+0x17a/0x310
    [   66.778203]  drm_ioctl+0x584/0xd10
    [   66.778206]  amdgpu_drm_ioctl+0xd2/0x1c0 [amdgpu]
    [   66.778375]  __x64_sys_ioctl+0x139/0x1a0
    [   66.778378]  x64_sys_call+0xee7/0xfb0
    [   66.778381]  do_syscall_64+0x87/0x140
    [   66.778385]  entry_SYSCALL_64_after_hwframe+0x4b/0x53
    
    Fix this by properly incrementing and decrementing the reference counts
    when making and deleting copies of the amdgpu_dm_connector pointers.
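
    The rule in the paragraph above can be sketched in plain userspace C.
    This is only an illustration under stated assumptions: struct connector,
    struct hdcp_work, hdcp_work_set() and MAX_DISPLAYS are hypothetical
    stand-ins, not the amdgpu code, and the real driver would use the DRM
    connector reference helpers rather than a bare counter.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DISPLAYS 4 /* hypothetical; stands in for AMDGPU_DM_MAX_DISPLAY_INDEX */

/* Hypothetical stand-in for a reference-counted connector object. */
struct connector {
	int refcount;
	int freed; /* set instead of calling free(), so the sketch is easy to inspect */
};

static void connector_get(struct connector *c)
{
	c->refcount++;
}

static void connector_put(struct connector *c)
{
	assert(c->refcount > 0);
	if (--c->refcount == 0)
		c->freed = 1;
}

/* Per-link work item that keeps its own copies of connector pointers. */
struct hdcp_work {
	struct connector *aconnector[MAX_DISPLAYS];
};

/* Store a copy in slot idx: reference the new pointer, release the old one. */
static void hdcp_work_set(struct hdcp_work *w, size_t idx, struct connector *c)
{
	assert(idx < MAX_DISPLAYS);
	if (c)
		connector_get(c);
	if (w->aconnector[idx])
		connector_put(w->aconnector[idx]);
	w->aconnector[idx] = c;
}
```

    With this pattern, the object stays alive after its original owner drops
    its reference, and is only released once the stored copy is cleared.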
    
    (Mario: rebase on current code and update fixes tag)
    
    Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4006
    Signed-off-by: Chris Bainbridge <[email protected]>
    Fixes: da3fd7ac0bcf3 ("drm/amd/display: Update CP property based on HW query")
    Reviewed-by: Alex Hung <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Mario Limonciello <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit d4673f3c3b3dcb74e36e53cdfc880baa7a87b330)
    Cc: [email protected]
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: phase2 enable mst hdcp multiple displays [+ + +]

Author: hersen wu <[email protected]>
Date:   Mon Nov 14 14:29:56 2022 -0500

    drm/amd/display: phase2 enable mst hdcp multiple displays
    
    [ Upstream commit aa9fdd5d5add50305d2022fa072fe6f189283415 ]
    
    [why]
    For MST topology with 1 physical link and multiple connectors (>=2),
    e.g. daisy chained MST + SST, or 1-to-multi MST hub, if userspace
    sets HDCP to be enabled simultaneously on all connected outputs, the
    commit tail iteratively calls hdcp_update_display() for each
    display (connector). However, the hdcp workqueue data structure for
    each link has only one DM connector and encryption status members,
    which means the work queue of property_validate/update() would only
    be triggered for the last connector within this physical link, and
    therefore the HDCP property value of other connectors would stay on
    DESIRED instead of switching to ENABLED, which is NOT as expected.
    
    [how]
    Use array of AMDGPU_DM_MAX_DISPLAY_INDEX for both aconnector and
    encryption status in hdcp workqueue data structure for each physical
    link. For the property validate/update work queue, we iterate over the
    array and do a similar operation/check for each connected display.
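
    A minimal sketch of the per-display bookkeeping described in [how],
    assuming hypothetical names (struct link_hdcp_work, update_properties(),
    MAX_DISPLAY_INDEX and the cp_state values are illustrative and not the
    driver's definitions):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DISPLAY_INDEX 6 /* hypothetical array size */

enum cp_state { CP_UNDESIRED, CP_DESIRED, CP_ENABLED };

/* One work item per physical link, with one slot per display on that link. */
struct link_hdcp_work {
	bool connected[MAX_DISPLAY_INDEX];
	bool encrypted[MAX_DISPLAY_INDEX];
	enum cp_state property[MAX_DISPLAY_INDEX];
};

/* Visit every slot instead of only the last connector that was updated. */
static size_t update_properties(struct link_hdcp_work *w)
{
	size_t i, updated = 0;

	assert(w != NULL);
	for (i = 0; i < MAX_DISPLAY_INDEX; i++) {
		if (!w->connected[i])
			continue;
		if (w->property[i] == CP_DESIRED && w->encrypted[i]) {
			w->property[i] = CP_ENABLED;
			updated++;
		}
	}
	return updated;
}
```

    With a single connector/status pair per link, only one display could move
    from DESIRED to ENABLED; with arrays, each connected display is checked.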
    
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: hersen wu <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Stable-dep-of: be593d9d91c5 ("drm/amd/display: Fix slab-use-after-free in hdcp")
    Signed-off-by: Sasha Levin <[email protected]>

drm/nouveau: Fix WARN_ON in nouveau_fence_context_kill() [+ + +]

Author: Philipp Stanner <[email protected]>
Date:   Tue Apr 15 14:19:00 2025 +0200

    drm/nouveau: Fix WARN_ON in nouveau_fence_context_kill()
    
    commit bbe5679f30d7690a9b6838a583b9690ea73fe0e9 upstream.
    
    Nouveau is mostly designed in a way that it's expected that fences only
    ever get signaled through nouveau_fence_signal(). However, at least
    one other place, nouveau_fence_done(), can signal fences, too. If that
    happens (race) a signaled fence remains in the pending list for a while,
    until it gets removed by nouveau_fence_update().
    
    Should nouveau_fence_context_kill() run in the meantime, this would be
    a bug because the function would attempt to set an error code on an
    already signaled fence.
    
    Have nouveau_fence_context_kill() check for a fence being signaled.
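
    The check can be illustrated with a small userspace sketch. struct fence
    and context_kill() below are hypothetical simplifications (a flat array
    instead of the driver's pending list, no locking), not the nouveau code:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified fence: only the two fields the sketch needs. */
struct fence {
	bool signaled;
	int error;
};

/*
 * Kill all pending fences with the given error, but leave fences that were
 * already signaled by another path untouched. Returns how many were killed.
 */
static size_t context_kill(struct fence *pending, size_t n, int error)
{
	size_t i, killed = 0;

	assert(pending != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (pending[i].signaled)
			continue; /* setting an error now would be a bug */
		pending[i].error = error;
		pending[i].signaled = true;
		killed++;
	}
	return killed;
}
```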
    
    Cc: [email protected] # v5.10+
    Fixes: ea13e5abf807 ("drm/nouveau: signal pending fences when channel has been killed")
    Suggested-by: Christian König <[email protected]>
    Signed-off-by: Philipp Stanner <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Danilo Krummrich <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

EDAC/altera: Set DDR and SDMMC interrupt mask before registration [+ + +]

Author: Niravkumar L Rabara <[email protected]>
Date:   Fri Apr 25 07:26:40 2025 -0700

    EDAC/altera: Set DDR and SDMMC interrupt mask before registration
    
    commit 6dbe3c5418c4368e824bff6ae4889257dd544892 upstream.
    
    Mask DDR and SDMMC in probe function to avoid spurious interrupts before
    registration.  Removed invalid register write to system manager.
    
    Fixes: 1166fde93d5b ("EDAC, altera: Add Arria10 ECC memory init functions")
    Signed-off-by: Niravkumar L Rabara <[email protected]>
    Signed-off-by: Matthew Gerlach <[email protected]>
    Signed-off-by: Borislav Petkov (AMD) <[email protected]>
    Acked-by: Dinh Nguyen <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

EDAC/altera: Test the correct error reg offset [+ + +]

Author: Niravkumar L Rabara <[email protected]>
Date:   Fri Apr 25 07:26:39 2025 -0700

    EDAC/altera: Test the correct error reg offset
    
    commit 4fb7b8fceb0beebbe00712c3daf49ade0386076a upstream.
    
    Test correct structure member, ecc_cecnt_offset, before using it.
    
      [ bp: Massage commit message. ]
    
    Fixes: 73bcc942f427 ("EDAC, altera: Add Arria10 EDAC support")
    Signed-off-by: Niravkumar L Rabara <[email protected]>
    Signed-off-by: Matthew Gerlach <[email protected]>
    Signed-off-by: Borislav Petkov (AMD) <[email protected]>
    Acked-by: Dinh Nguyen <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

firmware: arm_ffa: Skip Rx buffer ownership release if not acquired [+ + +]

Author: Sudeep Holla <[email protected]>
Date:   Fri Mar 21 11:57:00 2025 +0000

    firmware: arm_ffa: Skip Rx buffer ownership release if not acquired
    
    [ Upstream commit 4567bdaaaaa1744da3d7da07d9aca2f941f5b4e5 ]
    
    Completion of the FFA_PARTITION_INFO_GET ABI transfers the ownership of
    the caller’s Rx buffer from the producer (typically the partition manager)
    to the consumer (this driver/OS). FFA_RX_RELEASE transfers the ownership
    from the consumer back to the producer.
    
    However, when we set the flag to just return the count of partitions
    deployed in the system corresponding to the specified UUID while
    invoking FFA_PARTITION_INFO_GET, the Rx buffer ownership shouldn't be
    transferred to this driver. We must be able to skip transferring back
    the ownership to the partition manager when we request just to get the
    count of the partitions as the buffers are not acquired in this case.
    
    Firmware may return FFA_RET_DENIED or other error for the ffa_rx_release()
    in such cases.
    
    Fixes: bb1be7498500 ("firmware: arm_ffa: Add v1.1 get_partition_info support")
    Message-Id: <[email protected]>
    Signed-off-by: Sudeep Holla <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

firmware: arm_scmi: Balance device refcount when destroying devices [+ + +]

Author: Cristian Marussi <[email protected]>
Date:   Thu Mar 6 18:54:47 2025 +0000

    firmware: arm_scmi: Balance device refcount when destroying devices
    
    [ Upstream commit 9ca67840c0ddf3f39407339624cef824a4f27599 ]
    
    Using device_find_child() to look up the proper SCMI device to destroy
    causes an imbalance in the device refcount, since device_find_child() calls
    an implicit get_device(): this, in turn, inhibits the call of the provided
    release methods upon device destruction.
    
    As a consequence, one of the structures that is not freed properly upon
    destruction is the internal struct device_private dev->p populated by the
    drivers subsystem core.
    
    KMemleak detects this situation since loading/unloading some SCMI driver
    causes related devices to be created/destroyed without calling any
    device_release method.
    
    unreferenced object 0xffff00000f583800 (size 512):
      comm "insmod", pid 227, jiffies 4294912190
      hex dump (first 32 bytes):
        00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00  .....N..........
        ff ff ff ff ff ff ff ff 60 36 1d 8a 00 80 ff ff  ........`6......
      backtrace (crc 114e2eed):
        kmemleak_alloc+0xbc/0xd8
        __kmalloc_cache_noprof+0x2dc/0x398
        device_add+0x954/0x12d0
        device_register+0x28/0x40
        __scmi_device_create.part.0+0x1bc/0x380
        scmi_device_create+0x2d0/0x390
        scmi_create_protocol_devices+0x74/0xf8
        scmi_device_request_notifier+0x1f8/0x2a8
        notifier_call_chain+0x110/0x3b0
        blocking_notifier_call_chain+0x70/0xb0
        scmi_driver_register+0x350/0x7f0
        0xffff80000a3b3038
        do_one_initcall+0x12c/0x730
        do_init_module+0x1dc/0x640
        load_module+0x4b20/0x5b70
        init_module_from_file+0xec/0x158
    
    $ ./scripts/faddr2line ./vmlinux device_add+0x954/0x12d0
    device_add+0x954/0x12d0:
    kmalloc_noprof at include/linux/slab.h:901
    (inlined by) kzalloc_noprof at include/linux/slab.h:1037
    (inlined by) device_private_init at drivers/base/core.c:3510
    (inlined by) device_add at drivers/base/core.c:3561
    
    Balance device refcount by issuing a put_device() on devices found via
    device_find_child().
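
    The imbalance and its fix can be modelled in userspace. Everything below
    is a hypothetical sketch (struct sdev, find_child() and destroy_child()
    are illustrative and only mimic the get/put behaviour described above):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted child device. */
struct sdev {
	int id;
	int refcount;
	bool released; /* set when the release method would have run */
};

static void sdev_get(struct sdev *d)
{
	d->refcount++;
}

static void sdev_put(struct sdev *d)
{
	assert(d->refcount > 0);
	if (--d->refcount == 0)
		d->released = true;
}

/* Like the lookup described above: a match is returned with a reference held. */
static struct sdev *find_child(struct sdev *children, size_t n, int id)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (children[i].id == id && !children[i].released) {
			sdev_get(&children[i]);
			return &children[i];
		}
	}
	return NULL;
}

static bool destroy_child(struct sdev *children, size_t n, int id)
{
	struct sdev *d = find_child(children, n, id);

	if (!d)
		return false;
	sdev_put(d); /* stands in for unregistering: drops the registration reference */
	sdev_put(d); /* balances the reference taken by the lookup */
	return true;
}
```

    Without the second put, the count would stay at one and the release
    method would never run, which is the leak KMemleak reported.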
    
    Reported-by: Alice Ryhl <[email protected]>
    Closes: https://lore.kernel.org/linux-arm-kernel/[email protected]/T/#mc1f73a0ea5e41014fa145147b7b839fc988ada8f
    CC: Sudeep Holla <[email protected]>
    CC: Catalin Marinas <[email protected]>
    Fixes: d4f9dddd21f3 ("firmware: arm_scmi: Add dynamic scmi devices creation")
    Signed-off-by: Cristian Marussi <[email protected]>
    Tested-by: Alice Ryhl <[email protected]>
    Message-Id: <[email protected]>
    Signed-off-by: Sudeep Holla <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

i2c: imx-lpi2c: Fix clock count when probe defers [+ + +]

Author: Clark Wang <[email protected]>
Date:   Mon Apr 21 14:23:41 2025 +0800

    i2c: imx-lpi2c: Fix clock count when probe defers
    
    commit b1852c5de2f2a37dd4462f7837c9e3e678f9e546 upstream.
    
    Deferred probe with pm_runtime_put() may delay clock disable, causing
    incorrect clock usage count. Use pm_runtime_put_sync() to ensure the
    clock is disabled immediately.
    
    Fixes: 13d6eb20fc79 ("i2c: imx-lpi2c: add runtime pm support")
    Signed-off-by: Clark Wang <[email protected]>
    Signed-off-by: Carlos Song <[email protected]>
    Cc: <[email protected]> # v4.16+
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Andi Shyti <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ice: Check VF VSI Pointer Value in ice_vc_add_fdir_fltr() [+ + +]

Author: Xuanqiang Luo <[email protected]>
Date:   Fri Apr 25 15:26:32 2025 -0700

    ice: Check VF VSI Pointer Value in ice_vc_add_fdir_fltr()
    
    [ Upstream commit 425c5f266b2edeee0ce16fedd8466410cdcfcfe3 ]
    
    As mentioned in the commit baeb705fd6a7 ("ice: always check VF VSI
    pointer values"), we need to perform a null pointer check on the return
    value of ice_get_vf_vsi() before using it.
    
    Fixes: 6ebbe97a4881 ("ice: Add a per-VF limit on number of FDIR filters")
    Signed-off-by: Xuanqiang Luo <[email protected]>
    Reviewed-by: Przemek Kitszel <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Signed-off-by: Tony Nguyen <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

iommu/amd: Fix potential buffer overflow in parse_ivrs_acpihid [+ + +]

Author: Pavel Paklov <[email protected]>
Date:   Tue Mar 25 09:22:44 2025 +0000

    iommu/amd: Fix potential buffer overflow in parse_ivrs_acpihid
    
    commit 8dee308e4c01dea48fc104d37f92d5b58c50b96c upstream.
    
    There is a string parsing logic error which can lead to an overflow of hid
    or uid buffers. Comparing ACPIID_LEN against a total string length doesn't
    take into account the lengths of individual hid and uid buffers so the
    check is insufficient in some cases. For example, if the length of the hid
    string is 4 and the length of the uid string is 260, the length of str
    will be equal to ACPIID_LEN + 1, but the uid string will overflow the uid
    buffer, whose size is 256.
    
    The same applies to the hid string with length 13 and uid string with
    length 250.
    
    Check the length of hid and uid strings separately to prevent
    buffer overflow.
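
    A minimal sketch of checking each part against its own buffer. The names
    and the "HID:UID" input format are assumptions for illustration
    (parse_hid_uid() and struct acpihid_entry are hypothetical; HID_LEN is
    assumed to be 9, and UID_LEN uses the 256-byte size mentioned above):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HID_LEN 9   /* assumed hid buffer size, including the terminator */
#define UID_LEN 256 /* uid buffer size, including the terminator */

struct acpihid_entry {
	char hid[HID_LEN];
	char uid[UID_LEN];
};

/* Parse "HID:UID"; returns 0 on success, -1 if malformed or too long. */
static int parse_hid_uid(const char *str, struct acpihid_entry *e)
{
	const char *sep = strchr(str, ':');
	size_t hid_len, uid_len;

	if (!sep)
		return -1;
	hid_len = (size_t)(sep - str);
	uid_len = strlen(sep + 1);

	/* Each part is checked against its own buffer, not a combined limit. */
	if (hid_len >= HID_LEN || uid_len >= UID_LEN)
		return -1;

	memcpy(e->hid, str, hid_len);
	e->hid[hid_len] = '\0';
	memcpy(e->uid, sep + 1, uid_len + 1); /* includes the terminator */
	return 0;
}
```

    Both examples from the text (hid of length 4 with uid of length 260, and
    hid of length 13 with uid of length 250) are rejected by the separate
    checks.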
    
    Found by Linux Verification Center (linuxtesting.org) with SVACE.
    
    Fixes: ca3bf5d47cec ("iommu/amd: Introduces ivrs_acpihid kernel parameter")
    Cc: [email protected]
    Signed-off-by: Pavel Paklov <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Joerg Roedel <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

iommu/arm-smmu-v3: Fix iommu_device_probe bug due to duplicated stream ids [+ + +]

Author: Nicolin Chen <[email protected]>
Date:   Tue Apr 15 11:56:20 2025 -0700

    iommu/arm-smmu-v3: Fix iommu_device_probe bug due to duplicated stream ids
    
    [ Upstream commit b00d24997a11c10d3e420614f0873b83ce358a34 ]
    
    ASPEED VGA card has two built-in devices:
     0008:06:00.0 PCI bridge: ASPEED Technology, Inc. AST1150 PCI-to-PCI Bridge (rev 06)
     0008:07:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 52)
    
    Its topology looks like this:
     +-[0008:00]---00.0-[01-09]--+-00.0-[02-09]--+-00.0-[03]----00.0  Sandisk Corp Device 5017
                                 |               +-01.0-[04]--
                                 |               +-02.0-[05]----00.0  NVIDIA Corporation Device
                                 |               +-03.0-[06-07]----00.0-[07]----00.0  ASPEED Technology, Inc. ASPEED Graphics Family
                                 |               +-04.0-[08]----00.0  Renesas Technology Corp. uPD720201 USB 3.0 Host Controller
                                 |               \-05.0-[09]----00.0  Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
                                 \-00.1  PMC-Sierra Inc. Device 4028
    
    The IORT logic populates two identical IDs into the fwspec->ids array via
    DMA aliasing in iort_pci_iommu_init() called by pci_for_each_dma_alias().
    
    Though the SMMU driver had been able to handle this situation since commit
    563b5cbe334e ("iommu/arm-smmu-v3: Cope with duplicated Stream IDs"), that
    got broken by the later commit cdf315f907d4 ("iommu/arm-smmu-v3: Maintain
    a SID->device structure"), which ended up with allocating separate streams
    with the same stuffing.
    
    On a kernel prior to v6.15-rc1, there has been an overlooked warning:
      pci 0008:07:00.0: vgaarb: setting as boot VGA device
      pci 0008:07:00.0: vgaarb: bridge control possible
      pci 0008:07:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
      pcieport 0008:06:00.0: Adding to iommu group 14
      ast 0008:07:00.0: stream 67328 already in tree   <===== WARNING
      ast 0008:07:00.0: enabling device (0002 -> 0003)
      ast 0008:07:00.0: Using default configuration
      ast 0008:07:00.0: AST 2600 detected
      ast 0008:07:00.0: [drm] Using analog VGA
      ast 0008:07:00.0: [drm] dram MCLK=396 Mhz type=1 bus_width=16
      [drm] Initialized ast 0.1.0 for 0008:07:00.0 on minor 0
      ast 0008:07:00.0: [drm] fb0: astdrmfb frame buffer device
    
    With v6.15-rc, since the commit bcb81ac6ae3c ("iommu: Get DT/ACPI parsing
    into the proper probe path"), the error returned with the warning is moved
    to the SMMU device probe flow:
      arm_smmu_probe_device+0x15c/0x4c0
      __iommu_probe_device+0x150/0x4f8
      probe_iommu_group+0x44/0x80
      bus_for_each_dev+0x7c/0x100
      bus_iommu_probe+0x48/0x1a8
      iommu_device_register+0xb8/0x178
      arm_smmu_device_probe+0x1350/0x1db0
    which then fails the entire SMMU driver probe:
      pci 0008:06:00.0: Adding to iommu group 21
      pci 0008:07:00.0: stream 67328 already in tree
      arm-smmu-v3 arm-smmu-v3.9.auto: Failed to register iommu
      arm-smmu-v3 arm-smmu-v3.9.auto: probe with driver arm-smmu-v3 failed with error -22
    
    Since the SMMU driver had already been expecting a potential duplicated Stream
    ID in arm_smmu_install_ste_for_dev(), change the arm_smmu_insert_master()
    routine to ignore a duplicated ID from the fwspec->ids array as well.
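
    The behaviour described above can be sketched as follows. This is a
    simplified model with hypothetical names (struct stream_table and
    insert_master() are illustrative; a flat array replaces the driver's
    tree, and there is no rollback on failure):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_STREAMS 16 /* hypothetical table capacity */

struct stream {
	unsigned int id;
	int owner; /* which master registered this stream ID */
};

struct stream_table {
	struct stream s[MAX_STREAMS];
	size_t n;
};

static const struct stream *find_stream(const struct stream_table *t,
					unsigned int id)
{
	size_t i;

	for (i = 0; i < t->n; i++)
		if (t->s[i].id == id)
			return &t->s[i];
	return NULL;
}

/* Insert all stream IDs of one master; tolerate its own duplicates. */
static int insert_master(struct stream_table *t, int owner,
			 const unsigned int *sids, size_t num)
{
	size_t i;

	assert(t != NULL);
	for (i = 0; i < num; i++) {
		const struct stream *existing = find_stream(t, sids[i]);

		if (existing) {
			if (existing->owner == owner)
				continue; /* same device listed twice via DMA aliasing */
			return -ENODEV; /* ID already claimed by another device */
		}
		if (t->n == MAX_STREAMS)
			return -ENOSPC;
		t->s[t->n].id = sids[i];
		t->s[t->n].owner = owner;
		t->n++;
	}
	return 0;
}
```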
    
    Note: this has been failing the iommu_device_probe() since 2021, although a
    recent iommu commit in v6.15-rc1 that moves iommu_device_probe() started to
    fail the SMMU driver probe. Since nobody has cared about DMA Alias support,
    leave that as it was but fix the fundamental iommu_device_probe() breakage.
    
    Fixes: cdf315f907d4 ("iommu/arm-smmu-v3: Maintain a SID->device structure")
    Cc: [email protected]
    Suggested-by: Jason Gunthorpe <[email protected]>
    Reviewed-by: Jason Gunthorpe <[email protected]>
    Signed-off-by: Nicolin Chen <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Will Deacon <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

iommu/arm-smmu-v3: Use the new rb tree helpers [+ + +]

Author: Jason Gunthorpe <[email protected]>
Date:   Tue Aug 6 20:31:15 2024 -0300

    iommu/arm-smmu-v3: Use the new rb tree helpers
    
    [ Upstream commit a2bb820e862d61f9ca1499e500915f9f505a2655 ]
    
    Since v5.12 the rbtree has gained some simplifying helpers aimed at making
    rb tree users write less convoluted boilerplate code. Instead the caller
    provides a single comparison function and the helpers generate the prior
    open-coded stuff.
    
    Update smmu->streams to use rb_find_add() and rb_find().
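
    To illustrate the "single comparison function" idea, here is a userspace
    sketch of a find-or-add helper. It is not the kernel rbtree API: it uses
    a plain unbalanced binary search tree, and struct node, cmp_key() and
    bst_find_add() are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

struct node {
	unsigned int key;
	struct node *left, *right;
};

typedef int (*cmp_fn)(const struct node *a, const struct node *b);

/* The only piece the caller has to supply: a three-way comparison. */
static int cmp_key(const struct node *a, const struct node *b)
{
	return (a->key > b->key) - (a->key < b->key);
}

/*
 * Insert new_node unless an equal node exists. Returns the existing node on
 * a match, or NULL after a successful insertion.
 */
static struct node *bst_find_add(struct node *new_node, struct node **root,
				 cmp_fn cmp)
{
	struct node **link = root;

	assert(new_node && root && cmp);
	while (*link) {
		int c = cmp(new_node, *link);

		if (c < 0)
			link = &(*link)->left;
		else if (c > 0)
			link = &(*link)->right;
		else
			return *link;
	}
	new_node->left = new_node->right = NULL;
	*link = new_node;
	return NULL;
}
```

    The descent, the duplicate check and the linking all live in the helper,
    so callers no longer open-code the tree walk.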
    
    Tested-by: Nicolin Chen <[email protected]>
    Reviewed-by: Mostafa Saleh <[email protected]>
    Signed-off-by: Jason Gunthorpe <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Will Deacon <[email protected]>
    Stable-dep-of: b00d24997a11 ("iommu/arm-smmu-v3: Fix iommu_device_probe bug due to duplicated stream ids")
    Signed-off-by: Sasha Levin <[email protected]>

iommu/vt-d: Apply quirk_iommu_igfx for 8086:0044 (QM57/QS57) [+ + +]

Author: Mingcong Bai <[email protected]>
Date:   Fri Apr 18 11:16:42 2025 +0800

    iommu/vt-d: Apply quirk_iommu_igfx for 8086:0044 (QM57/QS57)
    
    commit 2c8a7c66c90832432496616a9a3c07293f1364f3 upstream.
    
    On the Lenovo ThinkPad X201, when Intel VT-d is enabled in the BIOS, the
    kernel boots with errors related to DMAR, the graphical interface appeared
    quite choppy, and the system resets erratically within a minute after it
    booted:
    
    DMAR: DRHD: handling fault status reg 3
    DMAR: [DMA Write NO_PASID] Request device [00:02.0] fault addr 0xb97ff000
    [fault reason 0x05] PTE Write access is not set
    
    Upon comparing boot logs with VT-d on/off, I found that the Intel Calpella
    quirk (`quirk_calpella_no_shadow_gtt()') applied the igfx IOMMU
    disable/quirk correctly:
    
    pci 0000:00:00.0: DMAR: BIOS has allocated no shadow GTT; disabling IOMMU
    for graphics
    
    Whereas with VT-d on, it went into the "else" branch, which then
    triggered the DMAR handling fault above:
    
    ... else if (!disable_igfx_iommu) {
            /* we have to ensure the gfx device is idle before we flush */
            pci_info(dev, "Disabling batched IOTLB flush on Ironlake\n");
            iommu_set_dma_strict();
    }
    
    Now, this is not exactly scientific, but moving 0x0044 to quirk_iommu_igfx
    seems to have fixed the aforementioned issue. Running a few `git blame'
    runs on the function, I have found that the quirk was originally
    introduced as a fix specific to ThinkPad X201:
    
    commit 9eecabcb9a92 ("intel-iommu: Abort IOMMU setup for igfx if BIOS gave
    no shadow GTT space")
    
    Which was later revised twice to the "else" branch we saw above:
    
    - 2011: commit 6fbcfb3e467a ("intel-iommu: Workaround IOTLB hang on
      Ironlake GPU")
    - 2024: commit ba00196ca41c ("iommu/vt-d: Decouple igfx_off from graphic
      identity mapping")
    
    I'm uncertain whether further testing on this particular laptop was
    done in 2011 and (honestly I'm not sure) 2024, but I would be happy to do
    some distro-specific testing if that's what would be required to verify
    this patch.
    
    P.S., I also see IDs 0x0040, 0x0062, and 0x006a listed under the same
    `quirk_calpella_no_shadow_gtt()' quirk, but I'm not sure how similar these
    chipsets are (if they share the same issue with VT-d or even, indeed, if
    this issue is specific to a bug in the Lenovo BIOS). With regards to
    0x0062, it seems to be a Centrino wireless card, but not a chipset?
    
    I have also listed a couple (distro and kernel) bug reports below as
    references (some of them are from 7-8 years ago!), as they seem to be
    similar issues found on different Westmere/Ironlake, Haswell, and Broadwell
    hardware setups.
    
    Cc: [email protected]
    Fixes: 6fbcfb3e467a ("intel-iommu: Workaround IOTLB hang on Ironlake GPU")
    Fixes: ba00196ca41c ("iommu/vt-d: Decouple igfx_off from graphic identity mapping")
    Link: https://groups.google.com/g/qubes-users/c/4NP4goUds2c?pli=1
    Link: https://bugs.archlinux.org/task/65362
    Link: https://bbs.archlinux.org/viewtopic.php?id=230323
    Reported-by: Wenhao Sun <[email protected]>
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=197029
    Signed-off-by: Mingcong Bai <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Lu Baolu <[email protected]>
    Signed-off-by: Joerg Roedel <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

irqchip/gic-v2m: Mark a few functions __init [+ + +]

Author: Thomas Gleixner <[email protected]>
Date:   Mon Nov 21 15:39:33 2022 +0100

    irqchip/gic-v2m: Mark a few functions __init
    
    [ Upstream commit d51a15af37ce8cf59e73de51dcdce3c9f4944974 ]
    
    They are all part of the init sequence.
    
    Signed-off-by: Thomas Gleixner <[email protected]>
    Acked-by: Marc Zyngier <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Stable-dep-of: 3318dc299b07 ("irqchip/gic-v2m: Prevent use after free of gicv2m_get_fwnode()")
    Signed-off-by: Sasha Levin <[email protected]>

irqchip/gic-v2m: Prevent use after free of gicv2m_get_fwnode() [+ + +]

Author: Suzuki K Poulose <[email protected]>
Date:   Tue Apr 22 17:16:16 2025 +0100

    irqchip/gic-v2m: Prevent use after free of gicv2m_get_fwnode()
    
    [ Upstream commit 3318dc299b072a0511d6dfd8367f3304fb6d9827 ]
    
    With ACPI in place, gicv2m_get_fwnode() is registered with the pci
    subsystem as pci_msi_get_fwnode_cb(), which may get invoked at runtime
    during a PCI host bridge probe. However, the callback is wrongly marked as
    __init, causing it to be freed while still registered with the PCI
    subsystem, which could trigger:
    
     Unable to handle kernel paging request at virtual address ffff8000816c0400
      gicv2m_get_fwnode+0x0/0x58 (P)
      pci_set_bus_msi_domain+0x74/0x88
      pci_register_host_bridge+0x194/0x548
    
    This is easily reproducible on a Juno board with ACPI boot.
    
    Retain the function for later use.
    
    Fixes: 0644b3daca28 ("irqchip/gic-v2m: acpi: Introducing GICv2m ACPI support")
    Signed-off-by: Suzuki K Poulose <[email protected]>
    Signed-off-by: Thomas Gleixner <[email protected]>
    Signed-off-by: Ingo Molnar <[email protected]>
    Reviewed-by: Marc Zyngier <[email protected]>
    Cc: [email protected]
    Signed-off-by: Sasha Levin <[email protected]>

irqchip/qcom-mpm: Prevent crash when trying to handle non-wake GPIOs [+ + +]

Author: Stephan Gerhold <[email protected]>
Date:   Fri May 2 13:22:28 2025 +0200

    irqchip/qcom-mpm: Prevent crash when trying to handle non-wake GPIOs
    
    commit 38a05c0b87833f5b188ae43b428b1f792df2b384 upstream.
    
    On Qualcomm chipsets not all GPIOs are wakeup capable. Those GPIOs do not
    have a corresponding MPM pin and should not be handled inside the MPM
    driver. The IRQ domain hierarchy is always applied, so it's required to
    explicitly disconnect the hierarchy for those. The pinctrl-msm driver marks
    these with GPIO_NO_WAKE_IRQ. qcom-pdc has a check for this, but
    irq-qcom-mpm is currently missing the check. This is causing crashes when
    setting up interrupts for non-wake GPIOs:
    
     root@rb1:~# gpiomon -c gpiochip1 10
       irq: IRQ159: trimming hierarchy from :soc@0:interrupt-controller@f200000-1
       Unable to handle kernel paging request at virtual address ffff8000a1dc3820
       Hardware name: Qualcomm Technologies, Inc. Robotics RB1 (DT)
       pc : mpm_set_type+0x80/0xcc
       lr : mpm_set_type+0x5c/0xcc
       Call trace:
        mpm_set_type+0x80/0xcc (P)
        qcom_mpm_set_type+0x64/0x158
        irq_chip_set_type_parent+0x20/0x38
        msm_gpio_irq_set_type+0x50/0x530
        __irq_set_trigger+0x60/0x184
        __setup_irq+0x304/0x6bc
        request_threaded_irq+0xc8/0x19c
        edge_detector_setup+0x260/0x364
        linereq_create+0x420/0x5a8
        gpio_ioctl+0x2d4/0x6c0
    
    Fix this by copying the check for GPIO_NO_WAKE_IRQ from qcom-pdc.c, so that
    MPM is removed entirely from the hierarchy for non-wake GPIOs.
    
    Fixes: a6199bb514d8 ("irqchip: Add Qualcomm MPM controller driver")
    Reported-by: Alexey Klimov <[email protected]>
    Signed-off-by: Stephan Gerhold <[email protected]>
    Signed-off-by: Thomas Gleixner <[email protected]>
    Tested-by: Alexey Klimov <[email protected]>
    Reviewed-by: Bartosz Golaszewski <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/all/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ksmbd: fix use-after-free in kerberos authentication [+ + +]

Author: Sean Heelan <[email protected]>
Date:   Sat Apr 19 19:59:28 2025 +0100

    ksmbd: fix use-after-free in kerberos authentication
    
    commit e86e9134e1d1c90a960dd57f59ce574d27b9a124 upstream.
    
    Setting sess->user = NULL was introduced to fix the dangling pointer
    created by ksmbd_free_user. However, it is possible another thread could
    be operating on the session and make use of sess->user after it has been
    passed to ksmbd_free_user but before sess->user is set to NULL.
    
    Cc: [email protected]
    Signed-off-by: Sean Heelan <[email protected]>
    Acked-by: Namjae Jeon <[email protected]>
    Signed-off-by: Steve French <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

KVM: x86: Load DR6 with guest value only before entering .vcpu_run() loop [+ + +]

Author: Sean Christopherson <[email protected]>
Date:   Fri Jan 24 17:18:33 2025 -0800

    KVM: x86: Load DR6 with guest value only before entering .vcpu_run() loop
    
    commit c2fee09fc167c74a64adb08656cb993ea475197e upstream.
    
    Move the conditional loading of hardware DR6 with the guest's DR6 value
    out of the core .vcpu_run() loop to fix a bug where KVM can load hardware
    with a stale vcpu->arch.dr6.
    
    When the guest accesses a DR and host userspace isn't debugging the guest,
    KVM disables DR interception and loads the guest's values into hardware on
    VM-Enter and saves them on VM-Exit.  This allows the guest to access DRs
    at will, e.g. so that a sequence of DR accesses to configure a breakpoint
    only generates one VM-Exit.
    
    For DR0-DR3, the logic/behavior is identical between VMX and SVM, and also
    identical between KVM_DEBUGREG_BP_ENABLED (userspace debugging the guest)
    and KVM_DEBUGREG_WONT_EXIT (guest using DRs), and so KVM handles loading
    DR0-DR3 in common code, _outside_ of the core kvm_x86_ops.vcpu_run() loop.
    
    But for DR6, the guest's value doesn't need to be loaded into hardware for
    KVM_DEBUGREG_BP_ENABLED, and SVM provides a dedicated VMCB field whereas
    VMX requires software to manually load the guest value, and so loading the
    guest's value into DR6 is handled by {svm,vmx}_vcpu_run(), i.e. is done
    _inside_ the core run loop.
    
    Unfortunately, saving the guest values on VM-Exit is initiated by common
    x86, again outside of the core run loop.  If the guest modifies DR6 (in
    hardware, when DR interception is disabled), and then the next VM-Exit is
    a fastpath VM-Exit, KVM will reload hardware DR6 with vcpu->arch.dr6 and
    clobber the guest's actual value.
    
    The bug shows up primarily with nested VMX because KVM handles the VMX
    preemption timer in the fastpath, and the window between hardware DR6
    being modified (in guest context) and DR6 being read by guest software is
    orders of magnitude larger in a nested setup.  E.g. in non-nested, the
    VMX preemption timer would need to fire precisely between #DB injection
    and the #DB handler's read of DR6, whereas with a KVM-on-KVM setup, the
    window where hardware DR6 is "dirty" extends all the way from L1 writing
    DR6 to VMRESUME (in L1).
    
        L1's view:
        ==========
        <L1 disables DR interception>
               CPU 0/KVM-7289    [023] d....  2925.640961: kvm_entry: vcpu 0
     A:  L1 Writes DR6
               CPU 0/KVM-7289    [023] d....  2925.640963: <hack>: Set DRs, DR6 = 0xffff0ff1
    
     B:        CPU 0/KVM-7289    [023] d....  2925.640967: kvm_exit: vcpu 0 reason EXTERNAL_INTERRUPT intr_info 0x800000ec
    
     D: L1 reads DR6, arch.dr6 = 0
               CPU 0/KVM-7289    [023] d....  2925.640969: <hack>: Sync DRs, DR6 = 0xffff0ff0
    
               CPU 0/KVM-7289    [023] d....  2925.640976: kvm_entry: vcpu 0
        L2 reads DR6, L1 disables DR interception
               CPU 0/KVM-7289    [023] d....  2925.640980: kvm_exit: vcpu 0 reason DR_ACCESS info1 0x0000000000000216
               CPU 0/KVM-7289    [023] d....  2925.640983: kvm_entry: vcpu 0
    
               CPU 0/KVM-7289    [023] d....  2925.640983: <hack>: Set DRs, DR6 = 0xffff0ff0
    
        L2 detects failure
               CPU 0/KVM-7289    [023] d....  2925.640987: kvm_exit: vcpu 0 reason HLT
        L1 reads DR6 (confirms failure)
               CPU 0/KVM-7289    [023] d....  2925.640990: <hack>: Sync DRs, DR6 = 0xffff0ff0
    
        L0's view:
        ==========
        L2 reads DR6, arch.dr6 = 0
              CPU 23/KVM-5046    [001] d....  3410.005610: kvm_exit: vcpu 23 reason DR_ACCESS info1 0x0000000000000216
              CPU 23/KVM-5046    [001] .....  3410.005610: kvm_nested_vmexit: vcpu 23 reason DR_ACCESS info1 0x0000000000000216
    
        L2 => L1 nested VM-Exit
              CPU 23/KVM-5046    [001] .....  3410.005610: kvm_nested_vmexit_inject: reason: DR_ACCESS ext_inf1: 0x0000000000000216
    
              CPU 23/KVM-5046    [001] d....  3410.005610: kvm_entry: vcpu 23
              CPU 23/KVM-5046    [001] d....  3410.005611: kvm_exit: vcpu 23 reason VMREAD
              CPU 23/KVM-5046    [001] d....  3410.005611: kvm_entry: vcpu 23
              CPU 23/KVM-5046    [001] d....  3410.005612: kvm_exit: vcpu 23 reason VMREAD
              CPU 23/KVM-5046    [001] d....  3410.005612: kvm_entry: vcpu 23
    
        L1 writes DR7, L0 disables DR interception
              CPU 23/KVM-5046    [001] d....  3410.005612: kvm_exit: vcpu 23 reason DR_ACCESS info1 0x0000000000000007
              CPU 23/KVM-5046    [001] d....  3410.005613: kvm_entry: vcpu 23
    
        L0 writes DR6 = 0 (arch.dr6)
              CPU 23/KVM-5046    [001] d....  3410.005613: <hack>: Set DRs, DR6 = 0xffff0ff0
    
     A: <L1 writes DR6 = 1, no interception, arch.dr6 is still '0'>
    
     B:       CPU 23/KVM-5046    [001] d....  3410.005614: kvm_exit: vcpu 23 reason PREEMPTION_TIMER
              CPU 23/KVM-5046    [001] d....  3410.005614: kvm_entry: vcpu 23
    
     C: L0 writes DR6 = 0 (arch.dr6)
              CPU 23/KVM-5046    [001] d....  3410.005614: <hack>: Set DRs, DR6 = 0xffff0ff0
    
        L1 => L2 nested VM-Enter
              CPU 23/KVM-5046    [001] d....  3410.005616: kvm_exit: vcpu 23 reason VMRESUME
    
        L0 reads DR6, arch.dr6 = 0
    
    Reported-by: John Stultz <[email protected]>
    Closes: https://lkml.kernel.org/r/CANDhNCq5_F3HfFYABqFGCA1bPd_%2BxgNj-iDQhH4tDk%2Bwi8iZZg%40mail.gmail.com
    Fixes: 375e28ffc0cf ("KVM: X86: Set host DR6 only on VMX and for KVM_DEBUGREG_WONT_EXIT")
    Fixes: d67668e9dd76 ("KVM: x86, SVM: isolate vcpu->arch.dr6 from vmcb->save.dr6")
    Cc: [email protected]
    Cc: Jim Mattson <[email protected]>
    Tested-by: John Stultz <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Sean Christopherson <[email protected]>
    [jth: Handled conflicts with kvm_x86_ops reshuffle]
    Signed-off-by: James Houghton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

Linux: Linux 6.1.138 [+ + +]

Author: Greg Kroah-Hartman <[email protected]>
Date:   Fri May 9 09:41:46 2025 +0200

    Linux 6.1.138
    
    Link: https://lore.kernel.org/r/[email protected]
    Tested-by: Pavel Machek (CIP) <[email protected]>
    Tested-by: Florian Fainelli <[email protected]>
    Tested-by: Shuah Khan <[email protected]>
    Tested-by: Salvatore Bonaccorso <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Tested-by: Peter Schneider <[email protected]>
    Tested-by: Jon Hunter <[email protected]>
    Tested-by: Florian Fainelli <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

md: move initialization and destruction of 'io_acct_set' to md.c [+ + +]

Author: Yu Kuai <[email protected]>
Date:   Thu Jun 22 00:51:03 2023 +0800

    md: move initialization and destruction of 'io_acct_set' to md.c
    
    commit c567c86b90d4715081adfe5eb812141a5b6b4883 upstream.
    
    'io_acct_set' is only used for raid0 and raid456; prepare to use it for
    raid1 and raid10, so that io accounting from different levels can be
    consistent.
    
    By the way, follow up patches will also use this io clone mechanism to
    make sure 'active_io' represents in flight io, not io that is dispatching,
    so that mddev_suspend will wait for io to be done as designed.
    
    Signed-off-by: Yu Kuai <[email protected]>
    Reviewed-by: Xiao Ni <[email protected]>
    Signed-off-by: Song Liu <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mmc: renesas_sdhi: Fix error handling in renesas_sdhi_probe [+ + +]

Author: Ruslan Piasetskyi <[email protected]>
Date:   Wed Mar 26 23:06:38 2025 +0100

    mmc: renesas_sdhi: Fix error handling in renesas_sdhi_probe
    
    commit 649b50a82f09fa44c2f7a65618e4584072145ab7 upstream.
    
    After moving tmio_mmc_host_probe down, error handling has to be
    adjusted.
    
    Fixes: 74f45de394d9 ("mmc: renesas_sdhi: register irqs before registering controller")
    Reviewed-by: Ihar Salauyou <[email protected]>
    Signed-off-by: Ruslan Piasetskyi <[email protected]>
    Reviewed-by: Geert Uytterhoeven <[email protected]>
    Reviewed-by: Wolfram Sang <[email protected]>
    Tested-by: Wolfram Sang <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Ulf Hansson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

net/mlx5: E-switch, Fix error handling for enabling roce [+ + +]

Author: Chris Mi <[email protected]>
Date:   Wed Apr 23 11:36:11 2025 +0300

    net/mlx5: E-switch, Fix error handling for enabling roce
    
    [ Upstream commit 90538d23278a981e344d364e923162fce752afeb ]
    
    The cited commit assumes enabling roce always succeeds. But it is
    not true. Add error handling for it.
    
    Fixes: 80f09dfc237f ("net/mlx5: Eswitch, enable RoCE loopback traffic")
    Signed-off-by: Chris Mi <[email protected]>
    Reviewed-by: Roi Dayan <[email protected]>
    Reviewed-by: Maor Gottlieb <[email protected]>
    Signed-off-by: Mark Bloch <[email protected]>
    Reviewed-by: Michal Swiatkowski <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net/mlx5: E-Switch, Initialize MAC Address for Default GID [+ + +]

Author: Maor Gottlieb <[email protected]>
Date:   Wed Apr 23 11:36:08 2025 +0300

    net/mlx5: E-Switch, Initialize MAC Address for Default GID
    
    [ Upstream commit 5d1a04f347e6cbf5ffe74da409a5d71fbe8c5f19 ]
    
    Initialize the source MAC address when creating the default GID entry.
    Since this entry is used only for loopback traffic, it only needs to
    be a unicast address. A zeroed-out MAC address is sufficient for this
    purpose.
    Without this fix, random bits would be assigned as the source address.
    If these bits formed a multicast address, the firmware would return an
    error, preventing the user from switching to switchdev mode:
    
    Error: mlx5_core: Failed setting eswitch to offloads.
    kernel answers: Invalid argument
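    
    The unicast/multicast distinction above comes down to a single bit of
    the first octet. A minimal userspace sketch, not the mlx5 code, of why
    stale bytes can be rejected while a zeroed address cannot
    (mac_is_multicast() and zeroed_mac_is_multicast() are hypothetical
    names used for illustration only):
    
    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stdint.h>
    #include <string.h>
    
    #define ETH_ALEN 6
    
    /* An Ethernet address is multicast when the least significant bit of
     * its first octet (the I/G bit) is set.
     */
    static bool mac_is_multicast(const uint8_t *addr)
    {
            assert(addr != NULL);
            return (addr[0] & 0x01) != 0;
    }
    
    /* Hypothetical helper: zero the source MAC instead of leaving whatever
     * bytes happened to be in memory. All-zero has the I/G bit clear.
     */
    static bool zeroed_mac_is_multicast(void)
    {
            uint8_t mac[ETH_ALEN];
    
            memset(mac, 0, sizeof(mac));
            return mac_is_multicast(mac);
    }
    ```
    
    Any uninitialized first octet with an odd value would be classified as
    multicast, which is consistent with the failure being intermittent.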
    
    Fixes: 80f09dfc237f ("net/mlx5: Eswitch, enable RoCE loopback traffic")
    Signed-off-by: Maor Gottlieb <[email protected]>
    Signed-off-by: Mark Bloch <[email protected]>
    Reviewed-by: Michal Swiatkowski <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: dlink: Correct endianness handling of led_mode [+ + +]

Author: Simon Horman <[email protected]>
Date:   Fri Apr 25 16:50:47 2025 +0100

    net: dlink: Correct endianness handling of led_mode
    
    [ Upstream commit e7e5ae71831c44d58627a991e603845a2fed2cab ]
    
    As its name suggests, parse_eeprom() parses EEPROM data.
    
    This is done by reading data, 16 bits at a time as follows:
    
            for (i = 0; i < 128; i++)
                    ((__le16 *) sromdata)[i] = cpu_to_le16(read_eeprom(np, i));
    
    sromdata is at the same memory location as psrom.
    And the type of psrom is a pointer to struct t_SROM.
    
    As can be seen in the loop above, data is stored in sromdata, and thus psrom,
    as 16-bit little-endian values.
    
    However, the integer fields of t_SROM are host byte order integers.
    And in the case of led_mode this leads to a little endian value
    being incorrectly treated as host byte order.
    
    Looking at rio_set_led_mode, this does appear to be a bug as that code
    masks led_mode with 0x1, 0x2 and 0x8. Logic that would be affected by a
    reversed byte order.
    
    This problem would only manifest on big endian hosts.
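    
    The effect can be sketched in portable userspace C. The led_mode value
    0x000b used in the usage note is a made-up example, and le16_decode()
    and be_native_load() are hypothetical helpers, not dl2k functions:
    
    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>
    
    /* Portable equivalent of le16_to_cpu() on a byte buffer: the first byte
     * is the least significant one, whatever the host byte order is.
     */
    static uint16_t le16_decode(const uint8_t *bytes)
    {
            assert(bytes != NULL);
            return (uint16_t)(bytes[0] | (bytes[1] << 8));
    }
    
    /* What a big-endian host gets if it loads the same two bytes as a
     * native integer without converting: the bytes end up swapped.
     */
    static uint16_t be_native_load(const uint8_t *bytes)
    {
            assert(bytes != NULL);
            return (uint16_t)((bytes[0] << 8) | bytes[1]);
    }
    ```
    
    For the stored bytes { 0x0b, 0x00 }, le16_decode() yields 0x000b while
    be_native_load() yields 0x0b00, in which none of the 0x1, 0x2 and 0x8
    bits tested by rio_set_led_mode are set.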
    
    Found by inspection while investigating a sparse warning
    regarding the crc field of t_SROM.
    
    I believe that warning is a false positive. And although I plan
    to send a follow-up to use little-endian types for the other integer
    fields of PSROM_t I do not believe that will involve any bug fixes.
    
    Compile tested only.
    
    Fixes: c3f45d322cbd ("dl2k: Add support for IP1000A-based cards")
    Signed-off-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: dsa: felix: fix broken taprio gate states after clock jump [+ + +]

Author: Vladimir Oltean <[email protected]>
Date:   Sat Apr 26 17:48:55 2025 +0300

    net: dsa: felix: fix broken taprio gate states after clock jump
    
    [ Upstream commit 426d487bca38b34f39c483edfc6313a036446b33 ]
    
    Simplest setup to reproduce the issue: connect 2 ports of the
    LS1028A-RDB together (eno0 with swp0) and run:
    
    $ ip link set eno0 up && ip link set swp0 up
    $ tc qdisc replace dev swp0 parent root handle 100 taprio num_tc 8 \
            queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 map 0 1 2 3 4 5 6 7 \
            base-time 0 sched-entry S 20 300000 sched-entry S 10 200000 \
            sched-entry S 20 300000 sched-entry S 48 200000 \
            sched-entry S 20 300000 sched-entry S 83 200000 \
            sched-entry S 40 300000 sched-entry S 00 200000 flags 2
    $ ptp4l -i eno0 -f /etc/linuxptp/configs/gPTP.cfg -m &
    $ ptp4l -i swp0 -f /etc/linuxptp/configs/gPTP.cfg -m
    
    One will observe that the PTP state machine on swp0 starts
    synchronizing, then it attempts to do a clock step, and after that, it
    never recovers from the condition below.
    
    ptp4l[82.427]: selected best master clock 00049f.fffe.05f627
    ptp4l[82.428]: port 1 (swp0): MASTER to UNCALIBRATED on RS_SLAVE
    ptp4l[83.252]: port 1 (swp0): UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
    ptp4l[83.886]: rms 4537731277 max 9075462553 freq -18518 +/- 11467 delay   818 +/-   0
    ptp4l[84.170]: timed out while polling for tx timestamp
    ptp4l[84.171]: increasing tx_timestamp_timeout or increasing kworker priority may correct this issue, but a driver bug likely causes it
    ptp4l[84.172]: port 1 (swp0): send peer delay request failed
    ptp4l[84.173]: port 1 (swp0): clearing fault immediately
    ptp4l[84.269]: port 1 (swp0): SLAVE to LISTENING on INIT_COMPLETE
    ptp4l[85.303]: timed out while polling for tx timestamp
    ptp4l[85.304]: increasing tx_timestamp_timeout or increasing kworker priority may correct this issue, but a driver bug likely causes it
    ptp4l[85.305]: port 1 (swp0): send peer delay response failed
    ptp4l[85.306]: port 1 (swp0): clearing fault immediately
    ptp4l[86.304]: timed out while polling for tx timestamp
    
    A hint is given by the non-zero statistics for dropped packets which
    were expecting hardware TX timestamps:
    
    $ ethtool --include-statistics -T swp0
    (...)
    Statistics:
      tx_pkts: 30
      tx_lost: 11
      tx_err: 0
    
    We know that when PTP clock stepping takes place (from ocelot_ptp_settime64()
    or from ocelot_ptp_adjtime()), vsc9959_tas_clock_adjust() is called.
    
    Another interesting hint is that placing an early return in
    vsc9959_tas_clock_adjust(), so as to neutralize this function, fixes the
    issue and TX timestamps are no longer dropped.
    
    The debugging function written by me and included below is intended to
    read the GCL RAM, after the admin schedule became operational, through
    the two status registers available for this purpose:
    QSYS_GCL_STATUS_REG_1 and QSYS_GCL_STATUS_REG_2.
    
    static void vsc9959_print_tas_gcl(struct ocelot *ocelot)
    {
            u32 val, list_length, interval, gate_state;
            int i, err;
    
            err = read_poll_timeout(ocelot_read, val,
                                    !(val & QSYS_PARAM_STATUS_REG_8_CONFIG_PENDING),
                                    10, 100000, false, ocelot, QSYS_PARAM_STATUS_REG_8);
            if (err) {
                    dev_err(ocelot->dev,
                            "Failed to wait for TAS config pending bit to clear: %pe\n",
                            ERR_PTR(err));
                    return;
            }
    
            val = ocelot_read(ocelot, QSYS_PARAM_STATUS_REG_3);
            list_length = QSYS_PARAM_STATUS_REG_3_LIST_LENGTH_X(val);
    
            dev_info(ocelot->dev, "GCL length: %u\n", list_length);
    
            for (i = 0; i < list_length; i++) {
                    ocelot_rmw(ocelot,
                               QSYS_GCL_STATUS_REG_1_GCL_ENTRY_NUM(i),
                               QSYS_GCL_STATUS_REG_1_GCL_ENTRY_NUM_M,
                               QSYS_GCL_STATUS_REG_1);
                    interval = ocelot_read(ocelot, QSYS_GCL_STATUS_REG_2);
                    val = ocelot_read(ocelot, QSYS_GCL_STATUS_REG_1);
                    gate_state = QSYS_GCL_STATUS_REG_1_GATE_STATE_X(val);
    
                    dev_info(ocelot->dev, "GCL entry %d: states 0x%x interval %u\n",
                             i, gate_state, interval);
            }
    }
    
    Calling it from two places: after the initial QSYS_TAS_PARAM_CFG_CTRL_CONFIG_CHANGE
    performed by vsc9959_qos_port_tas_set(), and after the one done by
    vsc9959_tas_clock_adjust(), I notice the following difference.
    
    From the tc-taprio process context, where the schedule was initially
    configured, the GCL looks like this:
    
    mscc_felix 0000:00:00.5: GCL length: 8
    mscc_felix 0000:00:00.5: GCL entry 0: states 0x20 interval 300000
    mscc_felix 0000:00:00.5: GCL entry 1: states 0x10 interval 200000
    mscc_felix 0000:00:00.5: GCL entry 2: states 0x20 interval 300000
    mscc_felix 0000:00:00.5: GCL entry 3: states 0x48 interval 200000
    mscc_felix 0000:00:00.5: GCL entry 4: states 0x20 interval 300000
    mscc_felix 0000:00:00.5: GCL entry 5: states 0x83 interval 200000
    mscc_felix 0000:00:00.5: GCL entry 6: states 0x40 interval 300000
    mscc_felix 0000:00:00.5: GCL entry 7: states 0x0 interval 200000
    
    But from the ptp4l clock stepping process context, when the
    vsc9959_tas_clock_adjust() hook is called, the GCL RAM of the
    operational schedule now looks like this:
    
    mscc_felix 0000:00:00.5: GCL length: 8
    mscc_felix 0000:00:00.5: GCL entry 0: states 0x0 interval 0
    mscc_felix 0000:00:00.5: GCL entry 1: states 0x0 interval 0
    mscc_felix 0000:00:00.5: GCL entry 2: states 0x0 interval 0
    mscc_felix 0000:00:00.5: GCL entry 3: states 0x0 interval 0
    mscc_felix 0000:00:00.5: GCL entry 4: states 0x0 interval 0
    mscc_felix 0000:00:00.5: GCL entry 5: states 0x0 interval 0
    mscc_felix 0000:00:00.5: GCL entry 6: states 0x0 interval 0
    mscc_felix 0000:00:00.5: GCL entry 7: states 0x0 interval 0
    
    I do not have a formal explanation, just experimental conclusions.
    It appears that after triggering QSYS_TAS_PARAM_CFG_CTRL_CONFIG_CHANGE
    for a port's TAS, the GCL entry RAM is updated anyway, despite what the
    documentation claims: "Specify the time interval in
    QSYS::GCL_CFG_REG_2.TIME_INTERVAL. This triggers the actual RAM
    write with the gate state and the time interval for the entry number
    specified". We don't touch that register (through vsc9959_tas_gcl_set())
    from vsc9959_tas_clock_adjust(), yet the GCL RAM is updated anyway.
    
    It seems to be updated with effectively stale memory, which in my
    testing can hold a variety of things, including even pieces of the
    previously applied schedule, for particular schedule lengths.
    
    As such, in most circumstances it is very difficult to pinpoint this
    issue, because the newly updated schedule would "behave strangely",
    but ultimately might still pass traffic to some extent, due to some
    gate entries still being present in the stale GCL entry RAM. It is easy
    to miss.
    
    With the particular schedule given at the beginning, the GCL RAM
    "happens" to be reproducibly rewritten with all zeroes, and this is
    consistent with what we see: when the time-aware shaper has gate entries
    with all gates closed, traffic is dropped on TX, no wonder we can't
    retrieve TX timestamps.
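    
    To make the "all gates closed" reading concrete, here is a small
    userspace model of the two GCL dumps above. It is only a sketch: struct
    gcl_entry and both helpers are hypothetical, and it assumes that bit n
    of the gate state corresponds to traffic class n and that the cycle
    time is the sum of the entry intervals.
    
    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>
    
    struct gcl_entry {
            uint8_t gate_mask;    /* assumed: bit n set means gate n is open */
            uint32_t interval_ns;
    };
    
    /* True when no entry opens any gate, so nothing can be transmitted. */
    static bool gcl_all_gates_closed(const struct gcl_entry *gcl, size_t len)
    {
            assert(gcl != NULL || len == 0);
            for (size_t i = 0; i < len; i++)
                    if (gcl[i].gate_mask)
                            return false;
            return true;
    }
    
    /* Sum of the entry intervals, in nanoseconds. */
    static uint64_t gcl_cycle_time_ns(const struct gcl_entry *gcl, size_t len)
    {
            uint64_t total = 0;
    
            assert(gcl != NULL || len == 0);
            for (size_t i = 0; i < len; i++)
                    total += gcl[i].interval_ns;
            return total;
    }
    
    /* GCL as dumped right after the tc-taprio configuration. */
    static const struct gcl_entry configured_gcl[8] = {
            { 0x20, 300000 }, { 0x10, 200000 }, { 0x20, 300000 }, { 0x48, 200000 },
            { 0x20, 300000 }, { 0x83, 200000 }, { 0x40, 300000 }, { 0x00, 200000 },
    };
    
    /* GCL as dumped after vsc9959_tas_clock_adjust(): all zeroes. */
    static const struct gcl_entry rewritten_gcl[8] = { { 0, 0 } };
    ```
    
    The configured schedule has a 2000000 ns cycle with at least one open
    gate in seven of its eight entries; the rewritten one never opens a
    gate at all.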
    
    Rewriting the GCL entry RAM when reapplying the new base time fixes the
    observed issue.
    
    Fixes: 8670dc33f48b ("net: dsa: felix: update base time of time-aware shaper when adjusting PTP time")
    Reported-by: Richie Pearn <[email protected]>
    Signed-off-by: Vladimir Oltean <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: ethernet: mtk-star-emac: fix spinlock recursion issues on rx/tx poll [+ + +]

Author: Louis-Alexis Eyraud <[email protected]>
Date:   Thu Apr 24 10:38:48 2025 +0200

    net: ethernet: mtk-star-emac: fix spinlock recursion issues on rx/tx poll
    
    [ Upstream commit 6fe0866014486736cc3ba1c6fd4606d3dbe55c9c ]
    
    Use spin_lock_irqsave and spin_unlock_irqrestore instead of spin_lock
    and spin_unlock in the mtk_star_emac driver to avoid the spinlock
    recursion that can happen when enabling the DMA interrupts again in
    rx/tx poll.
    
    ```
    BUG: spinlock recursion on CPU#0, swapper/0/0
     lock: 0xffff00000db9cf20, .magic: dead4ead, .owner: swapper/0/0,
        .owner_cpu: 0
    CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted
        6.15.0-rc2-next-20250417-00001-gf6a27738686c-dirty #28 PREEMPT
    Hardware name: MediaTek MT8365 Open Platform EVK (DT)
    Call trace:
     show_stack+0x18/0x24 (C)
     dump_stack_lvl+0x60/0x80
     dump_stack+0x18/0x24
     spin_dump+0x78/0x88
     do_raw_spin_lock+0x11c/0x120
     _raw_spin_lock+0x20/0x2c
     mtk_star_handle_irq+0xc0/0x22c [mtk_star_emac]
     __handle_irq_event_percpu+0x48/0x140
     handle_irq_event+0x4c/0xb0
     handle_fasteoi_irq+0xa0/0x1bc
     handle_irq_desc+0x34/0x58
     generic_handle_domain_irq+0x1c/0x28
     gic_handle_irq+0x4c/0x120
     do_interrupt_handler+0x50/0x84
     el1_interrupt+0x34/0x68
     el1h_64_irq_handler+0x18/0x24
     el1h_64_irq+0x6c/0x70
     regmap_mmio_read32le+0xc/0x20 (P)
     _regmap_bus_reg_read+0x6c/0xac
     _regmap_read+0x60/0xdc
     regmap_read+0x4c/0x80
     mtk_star_rx_poll+0x2f4/0x39c [mtk_star_emac]
     __napi_poll+0x38/0x188
     net_rx_action+0x164/0x2c0
     handle_softirqs+0x100/0x244
     __do_softirq+0x14/0x20
     ____do_softirq+0x10/0x20
     call_on_irq_stack+0x24/0x64
     do_softirq_own_stack+0x1c/0x40
     __irq_exit_rcu+0xd4/0x10c
     irq_exit_rcu+0x10/0x1c
     el1_interrupt+0x38/0x68
     el1h_64_irq_handler+0x18/0x24
     el1h_64_irq+0x6c/0x70
     cpuidle_enter_state+0xac/0x320 (P)
     cpuidle_enter+0x38/0x50
     do_idle+0x1e4/0x260
     cpu_startup_entry+0x34/0x3c
     rest_init+0xdc/0xe0
     console_on_rootfs+0x0/0x6c
     __primary_switched+0x88/0x90
    ```
    
    Fixes: 0a8bd81fd6aa ("net: ethernet: mtk-star-emac: separate tx/rx handling with two NAPIs")
    Signed-off-by: Louis-Alexis Eyraud <[email protected]>
    Reviewed-by: Maxime Chevallier <[email protected]>
    Acked-by: Bartosz Golaszewski <[email protected]>
    Link: https://patch.msgid.link/20250424-mtk_star_emac-fix-spinlock-recursion-issue-v2-1-f3fde2e529d8@collabora.com
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: ethernet: mtk-star-emac: rearm interrupts in rx_poll only when advised [+ + +]

Author: Louis-Alexis Eyraud <[email protected]>
Date:   Thu Apr 24 10:38:49 2025 +0200

    net: ethernet: mtk-star-emac: rearm interrupts in rx_poll only when advised
    
    [ Upstream commit e54b4db35e201a9173da9cb7abc8377e12abaf87 ]
    
    In the mtk_star_rx_poll function, on event processing completion, the
    mtk_star_emac driver calls napi_complete_done but ignores its return
    code and enables RX DMA interrupts unconditionally. This return code
    indicates whether a device should avoid rearming its interrupts,
    so fix this behaviour by taking it into account.
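    
    The intended control flow can be modelled in userspace as follows. This
    is a sketch, not the driver code: struct fake_napi and
    fake_napi_complete_done() are hypothetical stand-ins for the kernel
    API, and the "work_done < budget" test is the usual NAPI pattern,
    assumed here rather than taken from the patch.
    
    ```c
    #include <assert.h>
    #include <stdbool.h>
    
    /* Hypothetical stand-in for struct napi_struct. */
    struct fake_napi {
            bool must_keep_polling; /* core does not want IRQs rearmed */
            int irq_rearms;         /* how often RX IRQs were re-enabled */
    };
    
    /* Models the napi_complete_done() contract: a false return tells the
     * driver not to rearm its interrupts.
     */
    static bool fake_napi_complete_done(struct fake_napi *napi, int work_done)
    {
            (void)work_done;
            return !napi->must_keep_polling;
    }
    
    static int rx_poll(struct fake_napi *napi, int work_done, int budget)
    {
            assert(work_done <= budget);
            if (work_done < budget && fake_napi_complete_done(napi, work_done))
                    napi->irq_rearms++; /* driver would enable RX DMA IRQs here */
            return work_done;
    }
    
    /* Convenience wrapper: number of rearms after a single poll. */
    static int rearms_after_poll(bool must_keep_polling, int work_done, int budget)
    {
            struct fake_napi napi = { must_keep_polling, 0 };
    
            rx_poll(&napi, work_done, budget);
            return napi.irq_rearms;
    }
    ```
    
    With the return code honoured, interrupts are rearmed only when polling
    completed below budget and the completion call allowed it.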
    
    Fixes: 8c7bd5a454ff ("net: ethernet: mtk-star-emac: new driver")
    Signed-off-by: Louis-Alexis Eyraud <[email protected]>
    Acked-by: Bartosz Golaszewski <[email protected]>
    Link: https://patch.msgid.link/20250424-mtk_star_emac-fix-spinlock-recursion-issue-v2-2-f3fde2e529d8@collabora.com
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: fec: ERR007885 Workaround for conventional TX [+ + +]

Author: Mattias Barthel <[email protected]>
Date:   Tue Apr 29 11:08:26 2025 +0200

    net: fec: ERR007885 Workaround for conventional TX
    
    [ Upstream commit a179aad12badc43201cbf45d1e8ed2c1383c76b9 ]
    
    Activate TX hang workaround also in
    fec_enet_txq_submit_skb() when TSO is not enabled.
    
    Errata: ERR007885
    
    Symptoms: NETDEV WATCHDOG: eth0 (fec): transmit queue 0 timed out
    
    commit 37d6017b84f7 ("net: fec: Workaround for imx6sx enet tx hang when enable three queues")
    There is a TDAR race condition for multiQ when the software sets TDAR
    and the UDMA clears TDAR simultaneously or in a small window (2-4 cycles).
    This will cause the udma_tx and udma_tx_arbiter state machines to hang.
    
    So, the workaround is to check the TDAR status four times: if TDAR has
    been cleared by hardware, then write TDAR; otherwise don't set TDAR.
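    
    The read-four-times rule can be illustrated with a scripted register
    model. Everything below is hypothetical userspace scaffolding (struct
    fake_tdar, tx_kick(), writes_for()); it only mirrors the decision
    described above and makes no claim about the exact driver code.
    
    ```c
    #include <assert.h>
    #include <stdint.h>
    
    /* Hypothetical register model: successive reads return scripted values. */
    struct fake_tdar {
            const uint32_t *reads;
            int nreads;
            int pos;
            int writes;
    };
    
    static uint32_t tdar_read(struct fake_tdar *reg)
    {
            assert(reg->pos < reg->nreads);
            return reg->reads[reg->pos++];
    }
    
    static void tdar_write(struct fake_tdar *reg)
    {
            reg->writes++;
    }
    
    /* Read up to four times; || stops at the first read that returns zero,
     * i.e. as soon as the hardware is seen to have cleared TDAR.
     */
    static void tx_kick(struct fake_tdar *reg)
    {
            if (!tdar_read(reg) || !tdar_read(reg) ||
                !tdar_read(reg) || !tdar_read(reg))
                    tdar_write(reg);
    }
    
    /* Number of TDAR writes issued for a scripted sequence of four reads. */
    static int writes_for(const uint32_t *reads)
    {
            struct fake_tdar reg = { reads, 4, 0, 0 };
    
            tx_kick(&reg);
            return reg.writes;
    }
    ```
    
    If all four reads still show TDAR set, no write is issued, which keeps
    software out of the window in which the UDMA clears the bit.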
    
    Fixes: 53bb20d1faba ("net: fec: add variable reg_desc_active to speed things up")
    Signed-off-by: Mattias Barthel <[email protected]>
    Reviewed-by: Andrew Lunn <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: hns3: defer calling ptp_clock_register() [+ + +]

Author: Jian Shen <[email protected]>
Date:   Wed Apr 30 17:30:52 2025 +0800

    net: hns3: defer calling ptp_clock_register()
    
    [ Upstream commit 4971394d9d624f91689d766f31ce668d169d9959 ]
    
    Currently ptp_clock_register() is called before the related PTP
    resources are ready. This may cause unexpected results if an upper
    layer calls the PTP API during that time window. Fix it by moving
    the ptp_clock_register() call to the end of the function.
    
    Fixes: 0bf5eb788512 ("net: hns3: add support for PTP")
    Signed-off-by: Jian Shen <[email protected]>
    Signed-off-by: Jijie Shao <[email protected]>
    Reviewed-by: Vadim Fedorenko <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: hns3: fix an interrupt residual problem [+ + +]

Author: Yonglong Liu <[email protected]>
Date:   Wed Apr 30 17:30:50 2025 +0800

    net: hns3: fix an interrupt residual problem
    
    [ Upstream commit 8e6b9c6ea5a55045eed6526d8ee49e93192d1a58 ]
    
    When a VF is passed through to a VM and the VM is killed, a reported
    interrupt may not have been handled. It remains pending and won't be
    cleared by the NIC engine even with an FLR or TQP reset. When the VM
    restarts, the interrupt of the first vector may be dropped by the second
    enable_irq in vfio; see the issue below:
    https://gitlab.com/qemu-project/qemu/-/issues/2884#note_2423361621
    
    We notice that the vfio has always behaved this way, and the interrupt
    is a residue of the nic engine, so we fix the problem by moving the
    vector enable process out of the enable_irq loop.
    
    Fixes: 08a100689d4b ("net: hns3: re-organize vector handle")
    Signed-off-by: Yonglong Liu <[email protected]>
    Signed-off-by: Jijie Shao <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: hns3: fixed debugfs tm_qset size [+ + +]

Author: Hao Lan <[email protected]>
Date:   Wed Apr 30 17:30:51 2025 +0800

    net: hns3: fixed debugfs tm_qset size
    
    [ Upstream commit e317aebeefcb3b0c71f2305af3c22871ca6b3833 ]
    
    The size of the tm_qset file of debugfs is limited to 64 KB,
    which is too small in the scenario with 1280 qsets.
    The size needs to be expanded to 1 MB.
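    
    As a rough sanity check of the numbers above, the average output
    budget per qset can be computed directly. The per-qset line length
    actually produced by the driver is not stated here, so this sketch only
    shows the budget, not the requirement; bytes_per_qset() is a
    hypothetical helper.
    
    ```c
    #include <assert.h>
    #include <stddef.h>
    
    /* Average number of output bytes available to each qset. */
    static size_t bytes_per_qset(size_t buf_len, size_t qsets)
    {
            assert(qsets > 0);
            return buf_len / qsets;
    }
    ```
    
    A 64 KB buffer leaves about 51 bytes per qset with 1280 qsets, while
    1 MB leaves about 819.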
    
    Fixes: 5e69ea7ee2a6 ("net: hns3: refactor the debugfs process")
    Signed-off-by: Hao Lan <[email protected]>
    Signed-off-by: Peiyang Wang <[email protected]>
    Signed-off-by: Jijie Shao <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: hns3: store rx VLAN tag offload state for VF [+ + +]

Author: Jian Shen <[email protected]>
Date:   Wed Apr 30 17:30:49 2025 +0800

    net: hns3: store rx VLAN tag offload state for VF
    
    [ Upstream commit ef2383d078edcbe3055032436b16cdf206f26de2 ]
    
    The VF driver failed to store the rx VLAN tag strip state when the
    user changes the rx VLAN tag offload state, and it defaults to
    enabling rx VLAN tag stripping when re-initializing the VF device
    after a reset. So if the user disables rx VLAN tag offload and
    triggers a reset, the HW will still strip the VLAN tag from the
    packet and fill it into the RX BD, but the VF driver will ignore it
    because rx VLAN tag offload is disabled. This may cause the rx VLAN
    tag to be dropped.
    
    Fixes: b2641e2ad456 ("net: hns3: Add support of hardware rx-vlan-offload to HNS3 VF driver")
    Signed-off-by: Jian Shen <[email protected]>
    Signed-off-by: Jijie Shao <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: ipv6: fix UDPv6 GSO segmentation with NAT [+ + +]

Author: Felix Fietkau <[email protected]>
Date:   Sat Apr 26 17:32:09 2025 +0200

    net: ipv6: fix UDPv6 GSO segmentation with NAT
    
    [ Upstream commit b936a9b8d4a585ccb6d454921c36286bfe63e01d ]
    
    If any address or port is changed, update it in all packets and recalculate
    checksum.
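    
    One common way to recalculate a checksum after a field changes is the
    incremental update from RFC 1624, HC' = ~(~HC + ~m + m'). The sketch
    below shows that technique for a single 16-bit field such as a port; an
    IPv6 address would need eight such updates. It is an illustration with
    hypothetical helper names, not the code of this patch, and it omits
    UDP's special handling of a zero result.
    
    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>
    
    /* Fold a 32-bit accumulator into 16 bits with end-around carry. */
    static uint16_t csum_fold(uint32_t sum)
    {
            while (sum >> 16)
                    sum = (sum & 0xffff) + (sum >> 16);
            return (uint16_t)sum;
    }
    
    /* Full one's-complement checksum over 16-bit words, for comparison. */
    static uint16_t csum_full(const uint16_t *words, size_t n)
    {
            uint32_t sum = 0;
    
            assert(words != NULL);
            for (size_t i = 0; i < n; i++)
                    sum += words[i];
            return (uint16_t)~csum_fold(sum);
    }
    
    /* RFC 1624 incremental update when one 16-bit word changes. */
    static uint16_t csum_replace16(uint16_t check, uint16_t old_val,
                                   uint16_t new_val)
    {
            uint32_t sum = (uint16_t)~check;
    
            sum += (uint16_t)~old_val;
            sum += new_val;
            return (uint16_t)~csum_fold(sum);
    }
    ```
    
    For the words { 0x1234, 0x1f90, 0xabcd, 0x0042 }, replacing 0x1f90 with
    0x0050 gives the same result whether the checksum is recomputed from
    scratch or updated incrementally.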
    
    Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
    Signed-off-by: Felix Fietkau <[email protected]>
    Reviewed-by: Willem de Bruijn <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: lan743x: Fix memleak issue when GSO enabled [+ + +]

Author: Thangaraj Samynathan <[email protected]>
Date:   Tue Apr 29 10:55:27 2025 +0530

    net: lan743x: Fix memleak issue when GSO enabled
    
    [ Upstream commit 2d52e2e38b85c8b7bc00dca55c2499f46f8c8198 ]
    
    Always map the `skb` to the LS descriptor. Previously the skb was
    mapped to the EXT descriptor when the number of fragments was zero with
    GSO enabled. Mapping the skb to the EXT descriptor prevents it from
    being freed, leading to a memory leak.
    
    Fixes: 23f0703c125b ("lan743x: Add main source files for new lan743x driver")
    Signed-off-by: Thangaraj Samynathan <[email protected]>
    Reviewed-by: Jacob Keller <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: mscc: ocelot: delete PVID VLAN when readding it as non-PVID [+ + +]

Author: Vladimir Oltean <[email protected]>
Date:   Fri Apr 25 01:37:33 2025 +0300

    net: mscc: ocelot: delete PVID VLAN when readding it as non-PVID
    
    [ Upstream commit 5ec6d7d737a491256cd37e33910f7ac1978db591 ]
    
    The following set of commands:
    
    ip link add br0 type bridge vlan_filtering 1 # vlan_default_pvid 1 is implicit
    ip link set swp0 master br0
    bridge vlan add dev swp0 vid 1
    
    should result in the dropping of untagged and 802.1p-tagged traffic, but
    we see that it continues to be accepted. Whereas, had we deleted VID 1
    instead, the aforementioned dropping would have worked.
    
    This is because the ANA_PORT_DROP_CFG update logic doesn't run, because
    ocelot_vlan_add() only calls ocelot_port_set_pvid() if the new VLAN has
    the BRIDGE_VLAN_INFO_PVID flag.
    
    Similar to other drivers like mt7530_port_vlan_add() which handle this
    case correctly, we need to test whether the VLAN we're changing used to
    have the BRIDGE_VLAN_INFO_PVID flag, but lost it now. That amounts to a
    PVID deletion and should be treated as such.
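    
    The check described above can be modelled in a few lines. This is a
    simplified sketch with hypothetical types and helpers (struct
    port_state, port_vlan_add(), has_pvid_after_readd()), not the ocelot
    implementation; the flag value is assumed to match
    include/uapi/linux/if_bridge.h.
    
    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>
    
    #define BRIDGE_VLAN_INFO_PVID (1 << 1) /* assumed uapi value */
    
    /* Hypothetical per-port state; the real driver tracks more than this. */
    struct port_state {
            uint16_t pvid;
            bool has_pvid;
    };
    
    static void port_vlan_add(struct port_state *port, uint16_t vid,
                              uint16_t flags)
    {
            assert(port != NULL);
            if (flags & BRIDGE_VLAN_INFO_PVID) {
                    port->pvid = vid;
                    port->has_pvid = true;
            } else if (port->has_pvid && port->pvid == vid) {
                    /* The VLAN used to be the PVID but lost the flag:
                     * treat this as a PVID deletion.
                     */
                    port->has_pvid = false;
            }
    }
    
    /* Start with VID 1 as PVID (the bridge default), then add a VLAN again. */
    static bool has_pvid_after_readd(uint16_t vid, uint16_t flags)
    {
            struct port_state port = { 0, false };
    
            port_vlan_add(&port, 1, BRIDGE_VLAN_INFO_PVID);
            port_vlan_add(&port, vid, flags);
            return port.has_pvid;
    }
    ```
    
    Re-adding VID 1 without the PVID flag leaves the port without a PVID,
    which is the state in which untagged and 802.1p-tagged traffic is
    expected to be dropped.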
    
    Regarding blame attribution: this never worked properly since the
    introduction of bridge VLAN filtering in commit 7142529f1688 ("net:
    mscc: ocelot: add VLAN filtering"). However, there was a significant
    paradigm shift which aligned the ANA_PORT_DROP_CFG register with the
    PVID concept rather than with the native VLAN concept, and that change
    wasn't targeted for 'stable'. Realistically, that is as far as this fix
    needs to be propagated to.
    
    Fixes: be0576fed6d3 ("net: mscc: ocelot: move the logic to drop 802.1p traffic to the pvid deletion")
    Signed-off-by: Vladimir Oltean <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: mscc: ocelot: treat 802.1ad tagged traffic as 802.1Q-untagged [+ + +]

Author: Vladimir Oltean <[email protected]>
Date:   Thu Aug 15 03:07:07 2024 +0300

    net: mscc: ocelot: treat 802.1ad tagged traffic as 802.1Q-untagged
    
    [ Upstream commit 36dd1141be70b5966906919714dc504a24c65ddf ]
    
    I was revisiting the topic of 802.1ad treatment in the Ocelot switch [0]
    and realized that not only is its basic VLAN classification pipeline
    improper for offloading vlan_protocol 802.1ad bridges, but also improper
    for offloading regular 802.1Q bridges already.
    
    Namely, 802.1ad-tagged traffic should be treated as VLAN-untagged by
    bridged ports, but this switch treats it as if it was 802.1Q-tagged with
    the same VID as in the 802.1ad header. This is markedly different to
    what the Linux bridge expects; see the "other_tpid()" function in
    tools/testing/selftests/net/forwarding/bridge_vlan_aware.sh.
    
    An idea came to me that the VCAP IS1 TCAM is more powerful than I'm
    giving it credit for, and that it actually overwrites the classified VID
    before the VLAN Table lookup takes place. In other words, it can be
    used even to save a packet from being dropped on ingress due to VLAN
    membership.
    
    Add a sophisticated TCAM rule hardcoded into the driver to force the
    switch to behave like a Linux bridge with vlan_filtering 1 vlan_protocol
    802.1Q.
    
    Regarding the lifetime of the filter: eventually the bridge will
    disappear, and vlan_filtering on the port will be restored to 0 for
    standalone mode. Then the filter will be deleted.
    
    [0]: https://lore.kernel.org/netdev/20201009122947.nvhye4hvcha3tljh@skbuf/
    
    Fixes: 7142529f1688 ("net: mscc: ocelot: add VLAN filtering")
    Signed-off-by: Vladimir Oltean <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Stable-dep-of: 5ec6d7d737a4 ("net: mscc: ocelot: delete PVID VLAN when readding it as non-PVID")
    Signed-off-by: Sasha Levin <[email protected]>

net: phy: microchip: force IRQ polling mode for lan88xx [+ + +]

Author: Fiona Klute <[email protected]>
Date:   Wed Apr 16 12:24:13 2025 +0200

    net: phy: microchip: force IRQ polling mode for lan88xx
    
    [ Upstream commit 30a41ed32d3088cd0d682a13d7f30b23baed7e93 ]
    
    With lan88xx based devices the lan78xx driver can get stuck in an
    interrupt loop while bringing the device up, flooding the kernel log
    with messages like the following:
    
    lan78xx 2-3:1.0 enp1s0u3: kevent 4 may have been dropped
    
    Removing interrupt support from the lan88xx PHY driver forces the
    driver to use polling instead, which avoids the problem.
    
    The issue has been observed with Raspberry Pi devices at least since
    4.14 (see [1], bug report for their downstream kernel), as well as
    with Nvidia devices [2] in 2020, where disabling interrupts was the
    vendor-suggested workaround (together with the claim that phylib
    changes in 4.9 made the interrupt handling in lan78xx incompatible).
    
    Iperf reports well over 900Mbits/sec per direction with client in
    --dualtest mode, so there does not seem to be a significant impact on
    throughput (lan88xx device connected via switch to the peer).
    
    [1] https://github.com/raspberrypi/linux/issues/2447
    [2] https://forums.developer.nvidia.com/t/jetson-xavier-and-lan7800-problem/142134/11
    
    Link: https://lore.kernel.org/[email protected]
    Fixes: 792aec47d59d ("add microchip LAN88xx phy driver")
    Signed-off-by: Fiona Klute <[email protected]>
    Cc: [email protected]
    Cc: [email protected]
    Reviewed-by: Andrew Lunn <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: vertexcom: mse102x: Add range check for CMD_RTS [+ + +]

Author: Stefan Wahren <[email protected]>
Date:   Wed Apr 30 15:30:42 2025 +0200

    net: vertexcom: mse102x: Add range check for CMD_RTS
    
    [ Upstream commit d4dda902dac194e3231a1ed0f76c6c3b6340ba8a ]
    
    Since the SPI protocol offers no protection against electrical
    interference, the driver shouldn't blindly trust the length payload
    of CMD_RTS. So introduce a bounds check for incoming frames.
    
    Fixes: 2f207cbf0dd4 ("net: vertexcom: Add MSE102x SPI support")
    Signed-off-by: Stefan Wahren <[email protected]>
    Reviewed-by: Andrew Lunn <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
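
The bounds check described in this entry can be sketched roughly as follows. This is a minimal illustration, not the driver's code: the helper name rts_len_is_valid() is hypothetical, and the accepted range (the values of the kernel's ETH_ZLEN and VLAN_ETH_FRAME_LEN constants) is an assumption about what a plausible frame length looks like.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values of the kernel's ETH_ZLEN and VLAN_ETH_FRAME_LEN constants.
 * Using them as the accepted range is an assumption for illustration. */
#define RX_LEN_MIN 60u
#define RX_LEN_MAX 1518u

/* Hypothetical helper: accept a length announced by CMD_RTS only if it
 * could describe a real Ethernet frame. */
static bool rts_len_is_valid(uint16_t rxlen)
{
	return rxlen >= RX_LEN_MIN && rxlen <= RX_LEN_MAX;
}
```

A caller would drop the frame (and possibly count the event) whenever the helper returns false, instead of allocating a buffer for an implausible length.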

net: vertexcom: mse102x: Fix LEN_MASK [+ + +]

Author: Stefan Wahren <[email protected]>
Date:   Wed Apr 30 15:30:41 2025 +0200

    net: vertexcom: mse102x: Fix LEN_MASK
    
    [ Upstream commit 74987089ec678b4018dba0a609e9f4bf6ef7f4ad ]
    
    The LEN_MASK for CMD_RTS doesn't cover the whole parameter mask.
    Bit 11 is reserved, so adjust LEN_MASK accordingly.
    
    Fixes: 2f207cbf0dd4 ("net: vertexcom: Add MSE102x SPI support")
    Signed-off-by: Stefan Wahren <[email protected]>
    Reviewed-by: Andrew Lunn <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
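
As a small sketch of the effect of excluding the reserved bit: assuming the length occupies bits 0 to 10 of the response word (an assumption drawn from the statement that bit 11 is reserved), the mask is 0x07FF. The names RTS_LEN_MASK and rts_len() are hypothetical.

```c
#include <stdint.h>

/* Bits 0..10 carry the length; bit 11 is reserved and must be ignored.
 * The macro and helper names are hypothetical. */
#define RTS_LEN_MASK 0x07FFu

/* Extract the length field from a raw response word. */
static uint16_t rts_len(uint16_t cmd_resp)
{
	return (uint16_t)(cmd_resp & RTS_LEN_MASK);
}
```

With this mask, a response in which only the reserved bit is set yields a length of zero rather than 2048.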

net: vertexcom: mse102x: Fix possible stuck of SPI interrupt [+ + +]

Author: Stefan Wahren <[email protected]>
Date:   Wed Apr 30 15:30:40 2025 +0200

    net: vertexcom: mse102x: Fix possible stuck of SPI interrupt
    
    [ Upstream commit 55f362885951b2d00fd7fbb02ef0227deea572c2 ]
    
    The MSE102x doesn't provide any SPI commands for interrupt handling.
    So if the interrupt fires before the driver requests the IRQ, it will
    never fire again. To fix this, always poll for pending packets after
    opening the interface.
    
    Fixes: 2f207cbf0dd4 ("net: vertexcom: Add MSE102x SPI support")
    Signed-off-by: Stefan Wahren <[email protected]>
    Reviewed-by: Andrew Lunn <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: vertexcom: mse102x: Fix RX error handling [+ + +]

Author: Stefan Wahren <[email protected]>
Date:   Wed Apr 30 15:30:43 2025 +0200

    net: vertexcom: mse102x: Fix RX error handling
    
    [ Upstream commit ee512922ddd7d64afe2b28830a88f19063217649 ]
    
    If the CMD_RTS is corrupted by interference, the MSE102x doesn't
    allow a retransmission of the command. Instead, the Ethernet frame
    must be shifted out of the SPI FIFO. Since the actual length is
    unknown, assume the maximum possible value.
    
    Fixes: 2f207cbf0dd4 ("net: vertexcom: Add MSE102x SPI support")
    Signed-off-by: Stefan Wahren <[email protected]>
    Reviewed-by: Andrew Lunn <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net_sched: drr: Fix double list add in class with netem as child qdisc [+ + +]

Author: Victor Nogueira <[email protected]>
Date:   Fri Apr 25 19:07:05 2025 -0300

    net_sched: drr: Fix double list add in class with netem as child qdisc
    
    [ Upstream commit f99a3fbf023e20b626be4b0f042463d598050c9a ]
    
    As described in Gerrard's report [1], there are use cases where a netem
    child qdisc will make the parent qdisc's enqueue callback reentrant.
    In the case of drr, there won't be a UAF, but the code will add the same
    classifier to the list twice, which will cause memory corruption.
    
    In addition to checking for qlen being zero, this patch checks whether the
    class was already added to the active_list (cl_is_active) before adding
    to the list to cover for the reentrant case.
    
    [1] https://lore.kernel.org/netdev/CAHcdcOm+03OD2j6R0=YHKqmy=VgJ8xEOKuP6c7mSgnp-TEJJbw@mail.gmail.com/
    
    Fixes: 37d9cf1a3ce3 ("sched: Fix detection of empty queues in child qdiscs")
    Acked-by: Jamal Hadi Salim <[email protected]>
    Signed-off-by: Victor Nogueira <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
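
The guard described in this entry can be sketched in user space with a minimal list modelled on the kernel's list_head. Everything here is illustrative: struct demo_class, activate() and list_len() are hypothetical, and cl_is_active() is assumed to test whether the class's list node is linked anywhere.

```c
#include <stdbool.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_head { struct list_head *next, *prev; };

static void INIT_LIST_HEAD(struct list_head *h) { h->next = h; h->prev = h; }
static bool list_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Hypothetical class with an 'alist' node linking it into the active list. */
struct demo_class { struct list_head alist; };

/* A node that points to itself is not on any list. */
static bool cl_is_active(const struct demo_class *cl)
{
	return !list_empty(&cl->alist);
}

/* Activate a class at most once, even if enqueue is re-entered. */
static void activate(struct demo_class *cl, struct list_head *active)
{
	if (!cl_is_active(cl))
		list_add_tail(&cl->alist, active);
}

/* Count the entries on a list (used only to observe the result). */
static int list_len(const struct list_head *head)
{
	int n = 0;

	for (const struct list_head *p = head->next; p != head; p = p->next)
		n++;
	return n;
}
```

Calling activate() twice on the same class leaves exactly one entry on the active list, which is the property the reentrant enqueue path needs.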

net_sched: ets: Fix double list add in class with netem as child qdisc [+ + +]

Author: Victor Nogueira <[email protected]>
Date:   Fri Apr 25 19:07:07 2025 -0300

    net_sched: ets: Fix double list add in class with netem as child qdisc
    
    [ Upstream commit 1a6d0c00fa07972384b0c308c72db091d49988b6 ]
    
    As described in Gerrard's report [1], there are use cases where a netem
    child qdisc will make the parent qdisc's enqueue callback reentrant.
    In the case of ets, there won't be a UAF, but the code will add the same
    classifier to the list twice, which will cause memory corruption.
    
    In addition to checking for qlen being zero, this patch checks whether
    the class was already added to the active_list (cl_is_active) before
    doing the addition to cater for the reentrant case.
    
    [1] https://lore.kernel.org/netdev/CAHcdcOm+03OD2j6R0=YHKqmy=VgJ8xEOKuP6c7mSgnp-TEJJbw@mail.gmail.com/
    
    Fixes: 37d9cf1a3ce3 ("sched: Fix detection of empty queues in child qdiscs")
    Acked-by: Jamal Hadi Salim <[email protected]>
    Signed-off-by: Victor Nogueira <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net_sched: hfsc: Fix a UAF vulnerability in class with netem as child qdisc [+ + +]

Author: Victor Nogueira <[email protected]>
Date:   Fri Apr 25 19:07:06 2025 -0300

    net_sched: hfsc: Fix a UAF vulnerability in class with netem as child qdisc
    
    [ Upstream commit 141d34391abbb315d68556b7c67ad97885407547 ]
    
    As described in Gerrard's report [1], we have a UAF case when an hfsc class
    has a netem child qdisc. The crux of the issue is that hfsc is assuming
    that checking for cl->qdisc->q.qlen == 0 guarantees that it hasn't inserted
    the class in the vttree or eltree (which is not true for the netem
    duplicate case).
    
    This patch checks the n_active class variable to make sure that the code
    won't insert the class in the vttree or eltree twice, catering for the
    reentrant case.
    
    [1] https://lore.kernel.org/netdev/CAHcdcOm+03OD2j6R0=YHKqmy=VgJ8xEOKuP6c7mSgnp-TEJJbw@mail.gmail.com/
    
    Fixes: 37d9cf1a3ce3 ("sched: Fix detection of empty queues in child qdiscs")
    Reported-by: Gerrard Tai <[email protected]>
    Acked-by: Jamal Hadi Salim <[email protected]>
    Signed-off-by: Victor Nogueira <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net_sched: qfq: Fix double list add in class with netem as child qdisc [+ + +]

Author: Victor Nogueira <[email protected]>
Date:   Fri Apr 25 19:07:08 2025 -0300

    net_sched: qfq: Fix double list add in class with netem as child qdisc
    
    [ Upstream commit f139f37dcdf34b67f5bf92bc8e0f7f6b3ac63aa4 ]
    
    As described in Gerrard's report [1], there are use cases where a netem
    child qdisc will make the parent qdisc's enqueue callback reentrant.
    In the case of qfq, there won't be a UAF, but the code will add the same
    classifier to the list twice, which will cause memory corruption.
    
    This patch checks whether the class was already added to the agg->active
    list (cl_is_active) before doing the addition to cater for the reentrant
    case.
    
    [1] https://lore.kernel.org/netdev/CAHcdcOm+03OD2j6R0=YHKqmy=VgJ8xEOKuP6c7mSgnp-TEJJbw@mail.gmail.com/
    
    Fixes: 37d9cf1a3ce3 ("sched: Fix detection of empty queues in child qdiscs")
    Acked-by: Jamal Hadi Salim <[email protected]>
    Signed-off-by: Victor Nogueira <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

nvme-tcp: fix premature queue removal and I/O failover [+ + +]

Author: Michael Liang <[email protected]>
Date:   Tue Apr 29 10:42:01 2025 -0600

    nvme-tcp: fix premature queue removal and I/O failover
    
    [ Upstream commit 77e40bbce93059658aee02786a32c5c98a240a8a ]
    
    This patch addresses a data corruption issue observed in nvme-tcp during
    testing.
    
    In an NVMe native multipath setup, when an I/O timeout occurs, all
    inflight I/Os are canceled almost immediately after the kernel socket is
    shut down. These canceled I/Os are reported as host path errors,
    triggering a failover that succeeds on a different path.
    
    However, at this point, the original I/O may still be outstanding in the
    host's network transmission path (e.g., the NIC’s TX queue). From the
    user-space app's perspective, the buffer associated with the I/O is
    considered completed, since the I/O was acked on the different path, and
    may be reused for new I/O requests.
    
    Because nvme-tcp enables zero-copy by default in the transmission path,
    this can lead to corrupted data being sent to the original target,
    ultimately causing data corruption.
    
    We can reproduce this data corruption by injecting delay on one path and
    triggering an I/O timeout.
    
    To prevent this issue, this change ensures that all inflight
    transmissions are fully completed from the host's perspective before
    returning from queue stop. To handle concurrent I/O timeouts from multiple
    namespaces under the same controller, always wait in queue stop
    regardless of the queue's state.
    
    This aligns with the behavior of queue stopping in other NVMe fabric
    transports.
    
    Fixes: 3f2304f8c6d6 ("nvme-tcp: add NVMe over TCP host driver")
    Signed-off-by: Michael Liang <[email protected]>
    Reviewed-by: Mohamed Khalfella <[email protected]>
    Reviewed-by: Randy Jennings <[email protected]>
    Reviewed-by: Sagi Grimberg <[email protected]>
    Signed-off-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

parisc: Fix double SIGFPE crash [+ + +]

Author: Helge Deller <[email protected]>
Date:   Sat May 3 18:24:01 2025 +0200

    parisc: Fix double SIGFPE crash
    
    commit de3629baf5a33af1919dec7136d643b0662e85ef upstream.
    
    Camm noticed that on parisc a SIGFPE exception will crash an application with
    a second SIGFPE in the signal handler.  Dave analyzed it, and it happens
    because glibc uses a double-word floating-point store to atomically update
    function descriptors. As a result of lazy binding, we hit a floating-point
    store in fpe_func almost immediately.
    
    When the T bit is set, an assist exception trap occurs when the
    co-processor encounters *any* floating-point instruction except for a double
    store of register %fr0.  The latter cancels all pending traps.  Let's fix this
    by clearing the Trap (T) bit in the FP status register before returning to the
    signal handler in userspace.
    
    The issue can be reproduced with this test program:
    
    root@parisc:~# cat fpe.c
    
    #define _GNU_SOURCE
    #include <fenv.h>
    #include <signal.h>
    #include <stdio.h>
    
    static void fpe_func(int sig, siginfo_t *i, void *v) {
            sigset_t set;
            sigemptyset(&set);
            sigaddset(&set, SIGFPE);
            sigprocmask(SIG_UNBLOCK, &set, NULL);
            printf("GOT signal %d with si_code %d\n", sig, i->si_code);
    }
    
    int main() {
            struct sigaction action = {
                    .sa_sigaction = fpe_func,
                    .sa_flags = SA_RESTART|SA_SIGINFO };
            sigaction(SIGFPE, &action, 0);
            feenableexcept(FE_OVERFLOW);
            return printf("%lf\n",1.7976931348623158E308*1.7976931348623158E308);
    }
    
    root@parisc:~# gcc fpe.c -lm
    root@parisc:~# ./a.out
     Floating point exception
    
    root@parisc:~# strace -f ./a.out
     execve("./a.out", ["./a.out"], 0xf9ac7034 /* 20 vars */) = 0
     getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM_INFINITY}) = 0
     ...
     rt_sigaction(SIGFPE, {sa_handler=0x1110a, sa_mask=[], sa_flags=SA_RESTART|SA_SIGINFO}, NULL, 8) = 0
     --- SIGFPE {si_signo=SIGFPE, si_code=FPE_FLTOVF, si_addr=0x1078f} ---
     --- SIGFPE {si_signo=SIGFPE, si_code=FPE_FLTOVF, si_addr=0xf8f21237} ---
     +++ killed by SIGFPE +++
     Floating point exception
    
    Signed-off-by: Helge Deller <[email protected]>
    Suggested-by: John David Anglin <[email protected]>
    Reported-by: Camm Maguire <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

PCI: imx6: Skip controller_id generation logic for i.MX7D [+ + +]

Author: Richard Zhu <[email protected]>
Date:   Tue Nov 26 15:56:56 2024 +0800

    PCI: imx6: Skip controller_id generation logic for i.MX7D
    
    commit f068ffdd034c93f0c768acdc87d4d2d7023c1379 upstream.
    
    The i.MX7D only has one PCIe controller, so controller_id should always be
    0. The previous code is incorrect, although it yields the correct result.
    
    Fix by removing "IMX7D" from the switch case branch.
    
    Fixes: 2d8ed461dbc9 ("PCI: imx6: Add support for i.MX8MQ")
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Richard Zhu <[email protected]>
    Signed-off-by: Krzysztof Wilczyński <[email protected]>
    Signed-off-by: Bjorn Helgaas <[email protected]>
    Reviewed-by: Manivannan Sadhasivam <[email protected]>
    Reviewed-by: Frank Li <[email protected]>
    [Because this switch case does more than just controller_id
     logic, move the "IMX7D" case label instead of removing it entirely.]
    Signed-off-by: Ryan Matthews <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

perf/x86/intel: KVM: Mask PEBS_ENABLE loaded for guest with vCPU's value. [+ + +]

Author: Sean Christopherson <[email protected]>
Date:   Fri Apr 25 17:13:55 2025 -0700

    perf/x86/intel: KVM: Mask PEBS_ENABLE loaded for guest with vCPU's value.
    
    commit 58f6217e5d0132a9f14e401e62796916aa055c1b upstream.
    
    When generating the MSR_IA32_PEBS_ENABLE value that will be loaded on
    VM-Entry to a KVM guest, mask the value with the vCPU's desired PEBS_ENABLE
    value.  Consulting only the host kernel's host vs. guest masks results in
    running the guest with PEBS enabled even when the guest doesn't want to use
    PEBS.  Because KVM uses perf events to proxy the guest virtual PMU, simply
    looking at exclude_host can't differentiate between events created by host
    userspace, and events created by KVM on behalf of the guest.
    
    Running the guest with PEBS unexpectedly enabled typically manifests as
    crashes due to a near-infinite stream of #PFs.  E.g. if the guest hasn't
    written MSR_IA32_DS_AREA, the CPU will hit page faults on address '0' when
    trying to record PEBS events.
    
    The issue is most easily reproduced by running `perf kvm top` from before
    commit 7b100989b4f6 ("perf evlist: Remove __evlist__add_default") (after
    which, `perf kvm top` effectively stopped using PEBS).  The userspace side
    of perf creates a guest-only PEBS event, which intel_guest_get_msrs()
    misconstrues as a guest-*owned* PEBS event.
    
    Arguably, this is a userspace bug, as enabling PEBS on guest-only events
    simply cannot work, and userspace can kill VMs in many other ways (there
    is no danger to the host).  However, even if this is considered to be bad
    userspace behavior, there's zero downside to perf/KVM restricting PEBS to
    guest-owned events.
    
    Note, commit 854250329c02 ("KVM: x86/pmu: Disable guest PEBS temporarily
    in two rare situations") fixed the case where host userspace is profiling
    KVM *and* userspace, but missed the case where userspace is profiling only
    KVM.
    
    Fixes: c59a1f106f5c ("KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS")
    Closes: https://lore.kernel.org/all/Z_VUswFkWiTYI0eD@do-x1carbon
    Reported-by: Seth Forshee <[email protected]>
    Signed-off-by: Sean Christopherson <[email protected]>
    Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
    Reviewed-by: Dapeng Mi <[email protected]>
    Tested-by: "Seth Forshee (DigitalOcean)" <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
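
The masking described in this entry amounts to one extra AND term. The sketch below assumes three 64-bit inputs; the function and parameter names are hypothetical and do not mirror the kernel's data structures.

```c
#include <stdint.h>

/* Hypothetical helper: compute the PEBS_ENABLE value loaded for the guest.
 *  pebs_mask        - PEBS counters currently enabled by perf
 *  host_mask        - counters reserved for host-only events
 *  vcpu_pebs_enable - the value the guest itself wrote to PEBS_ENABLE
 */
static uint64_t guest_pebs_enable(uint64_t pebs_mask, uint64_t host_mask,
				  uint64_t vcpu_pebs_enable)
{
	/* The final term is the addition: never enable more than the
	 * guest asked for. */
	return pebs_mask & ~host_mask & vcpu_pebs_enable;
}
```

If the guest has never enabled PEBS (vcpu_pebs_enable is 0), the result is 0 regardless of which guest-only events host userspace created.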

platform/x86/intel-uncore-freq: Fix missing uncore sysfs during CPU hotplug [+ + +]

Author: Shouye Liu <[email protected]>
Date:   Thu Apr 17 11:23:21 2025 +0800

    platform/x86/intel-uncore-freq: Fix missing uncore sysfs during CPU hotplug
    
    commit 8d6955ed76e8a47115f2ea1d9c263ee6f505d737 upstream.
    
    In certain situations, the sysfs for uncore may not be present when all
    CPUs in a package are offlined and then brought back online after boot.
    
    This issue can occur if there is an error in adding the sysfs entry due
    to a memory allocation failure. Retrying to bring the CPUs online will
    not resolve the issue, as the uncore_cpu_mask is already set for the
    package before the failure condition occurs.
    
    This issue does not occur if the failure happens during module
    initialization, as the module will fail to load in the event of any
    error.
    
    To address this, ensure that the uncore_cpu_mask is not set until the
    successful return of uncore_freq_add_entry().
    
    Fixes: dbce412a7733 ("platform/x86/intel-uncore-freq: Split common and enumeration part")
    Signed-off-by: Shouye Liu <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Ilpo Järvinen <[email protected]>
    Signed-off-by: Ilpo Järvinen <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
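
The ordering fix can be illustrated with a simplified model in which one bit per package records that its entry exists. This is a sketch under stated assumptions: pkg_mask, add_entry() and cpu_online() are hypothetical stand-ins, and the real driver tracks CPUs rather than a per-package bit.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state: one bit per package that has a working sysfs entry. */
static uint64_t pkg_mask;

/* Stand-in for uncore_freq_add_entry(); 'fail' simulates an allocation
 * failure. */
static int add_entry(int pkg, bool fail)
{
	(void)pkg;
	return fail ? -ENOMEM : 0;
}

/* Mark the package as handled only after the entry was added successfully. */
static int cpu_online(int pkg, bool fail)
{
	int ret;

	if (pkg_mask & (UINT64_C(1) << pkg))
		return 0;	/* already set up */

	ret = add_entry(pkg, fail);
	if (ret)
		return ret;	/* bit stays clear, so a retry can succeed */

	pkg_mask |= UINT64_C(1) << pkg;
	return 0;
}
```

Had the bit been set before calling add_entry(), a failed first attempt would make every later attempt return early without creating the entry.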

Revert "drm/meson: vclk: fix calculation of 59.94 fractional rates" [+ + +]

Author: Christian Hewitt <[email protected]>
Date:   Mon Apr 21 22:12:59 2025 +0200

    Revert "drm/meson: vclk: fix calculation of 59.94 fractional rates"
    
    [ Upstream commit f37bb5486ea536c1d61df89feeaeff3f84f0b560 ]
    
    This reverts commit bfbc68e.
    
    The patch does permit the offending YUV420 @ 59.94 phy_freq and
    vclk_freq mode to match in calculations. It also results in all
    fractional rates being unavailable for use. This was unintended
    and requires the patch to be reverted.
    
    Fixes: bfbc68e4d869 ("drm/meson: vclk: fix calculation of 59.94 fractional rates")
    Cc: [email protected]
    Signed-off-by: Christian Hewitt <[email protected]>
    Signed-off-by: Martin Blumenstingl <[email protected]>
    Reviewed-by: Neil Armstrong <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Neil Armstrong <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

Revert "rndis_host: Flag RNDIS modems as WWAN devices" [+ + +]

Author: Christian Heusel <[email protected]>
Date:   Thu Apr 24 16:00:28 2025 +0200

    Revert "rndis_host: Flag RNDIS modems as WWAN devices"
    
    commit 765f253e28909f161b0211f85cf0431cfee7d6df upstream.
    
    This reverts commit 67d1a8956d2d62fe6b4c13ebabb57806098511d8. Since this
    commit has proven to be problematic for the setup of USB-tethered
    ethernet connections, and the related breakage is very noticeable for
    users, it should be reverted until a fixed version of the change can be
    rolled out.
    
    Closes: https://lore.kernel.org/all/[email protected]/
    Link: https://chaos.social/@gromit/114377862699921553
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=220002
    Link: https://bugs.gentoo.org/953555
    Link: https://bbs.archlinux.org/viewtopic.php?id=304892
    Cc: [email protected]
    Acked-by: Lubomir Rintel <[email protected]>
    Signed-off-by: Christian Heusel <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

Revert "x86/kexec: Allocate PGD for x86_64 transition page tables separately" [+ + +]

Author: Greg Kroah-Hartman <[email protected]>
Date:   Wed May 7 16:06:48 2025 +0200

    Revert "x86/kexec: Allocate PGD for x86_64 transition page tables separately"
    
    This reverts commit 6821918f451942aa79759f29677a22f2d4ff4cbe which is
    commit 4b5bc2ec9a239bce261ffeafdd63571134102323 upstream.
    
    The patch it relies on is not in the 6.1.y tree, and has been reported
    to cause problems, so let's revert it for now.
    
    Reported-by: Eric Hagberg <[email protected]>
    Link: https://lore.kernel.org/r/CAAH4uRBxJ_XvYjCpgYXHqrKSNj6x9pA7X6NBPNTekeQ90DQSJA@mail.gmail.com
    Cc: David Woodhouse <[email protected]>
    Cc: Ingo Molnar <[email protected]>
    Cc: Baoquan He <[email protected]>
    Cc: Vivek Goyal <[email protected]>
    Cc: Dave Young <[email protected]>
    Cc: Eric Biederman <[email protected]>
    Cc: Ard Biesheuvel <[email protected]>
    Cc: "H. Peter Anvin" <[email protected]>
    Cc: Sasha Levin <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

sch_drr: make drr_qlen_notify() idempotent [+ + +]

Author: Cong Wang <[email protected]>
Date:   Thu Apr 3 14:10:24 2025 -0700

    sch_drr: make drr_qlen_notify() idempotent
    
    commit df008598b3a00be02a8051fde89ca0fbc416bd55 upstream.
    
    drr_qlen_notify() always deletes the DRR class from its active list
    with list_del(), therefore, it is not idempotent and not friendly
    to its callers, like fq_codel_dequeue().
    
    Let's make it idempotent to ease qdisc_tree_reduce_backlog() callers'
    life. Also change other list_del()'s to list_del_init() just to be
    extra safe.
    
    Reported-by: Gerrard Tai <[email protected]>
    Signed-off-by: Cong Wang <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Acked-by: Jamal Hadi Salim <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
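
Why list_del_init() makes the notify hook idempotent can be shown with a user-space model of the kernel's list primitives. In the kernel, list_del() poisons the node's pointers, so a second deletion dereferences poison values, while list_del_init() leaves the node pointing at itself. The qlen_notify() function below is a hypothetical stand-in for the qdisc hook.

```c
#include <stdbool.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_head { struct list_head *next, *prev; };

static void INIT_LIST_HEAD(struct list_head *h) { h->next = h; h->prev = h; }
static bool list_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink the node and re-point it at itself, like list_del_init(). */
static void list_del_init(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	INIT_LIST_HEAD(n);
}

/* Hypothetical notify hook: on a node that is already unlinked, the
 * pointer updates only touch the node itself, so repeating the call
 * is harmless. */
static void qlen_notify(struct list_head *alist)
{
	list_del_init(alist);
}
```

Calling qlen_notify() twice in a row leaves both the active list and the node in a consistent, empty state.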

sch_ets: make est_qlen_notify() idempotent [+ + +]

Author: Cong Wang <[email protected]>
Date:   Thu Apr 3 14:10:27 2025 -0700

    sch_ets: make est_qlen_notify() idempotent
    
    commit a7a15f39c682ac4268624da2abdb9114bdde96d5 upstream.
    
    est_qlen_notify() deletes its class from its active list with
    list_del() when qlen is 0, therefore, it is not idempotent and
    not friendly to its callers, like fq_codel_dequeue().
    
    Let's make it idempotent to ease qdisc_tree_reduce_backlog() callers'
    life. Also change other list_del()'s to list_del_init() just to be
    extra safe.
    
    Reported-by: Gerrard Tai <[email protected]>
    Signed-off-by: Cong Wang <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Acked-by: Jamal Hadi Salim <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

sch_hfsc: make hfsc_qlen_notify() idempotent [+ + +]

Author: Cong Wang <[email protected]>
Date:   Thu Apr 3 14:10:25 2025 -0700

    sch_hfsc: make hfsc_qlen_notify() idempotent
    
    commit 51eb3b65544c9efd6a1026889ee5fb5aa62da3bb upstream.
    
    hfsc_qlen_notify() is not idempotent either and not friendly
    to its callers, like fq_codel_dequeue(). Let's make it idempotent
    to ease qdisc_tree_reduce_backlog() callers' life:
    
    1. update_vf() decreases cl->cl_nactive, so we can check whether it is
    non-zero before calling it.
    
    2. eltree_remove() always removes RB node cl->el_node, but we can use
       RB_EMPTY_NODE() + RB_CLEAR_NODE() to make it safe.
    
    Reported-by: Gerrard Tai <[email protected]>
    Signed-off-by: Cong Wang <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Acked-by: Jamal Hadi Salim <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

sch_htb: make htb_qlen_notify() idempotent [+ + +]

Author: Cong Wang <[email protected]>
Date:   Thu Apr 3 14:10:23 2025 -0700

    sch_htb: make htb_qlen_notify() idempotent
    
    commit 5ba8b837b522d7051ef81bacf3d95383ff8edce5 upstream.
    
    htb_qlen_notify() always deactivates the HTB class and in fact could
    trigger a warning if it is already deactivated. Therefore, it is not
    idempotent and not friendly to its callers, like fq_codel_dequeue().
    
    Let's make it idempotent to ease qdisc_tree_reduce_backlog() callers'
    life.
    
    Reported-by: Gerrard Tai <[email protected]>
    Signed-off-by: Cong Wang <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Acked-by: Jamal Hadi Salim <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

sch_qfq: make qfq_qlen_notify() idempotent [+ + +]

Author: Cong Wang <[email protected]>
Date:   Thu Apr 3 14:10:26 2025 -0700

    sch_qfq: make qfq_qlen_notify() idempotent
    
    commit 55f9eca4bfe30a15d8656f915922e8c98b7f0728 upstream.
    
    qfq_qlen_notify() always deletes its class from its active list
    with list_del_init() _and_ calls qfq_deactivate_agg() when the whole list
    becomes empty.
    
    To make it idempotent, just skip everything when it is not in the active
    list.
    
    Also change other list_del()'s to list_del_init() just to be extra safe.
    
    Reported-by: Gerrard Tai <[email protected]>
    Signed-off-by: Cong Wang <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Acked-by: Jamal Hadi Salim <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

tracing: Fix oob write in trace_seq_to_buffer() [+ + +]

Author: Jeongjun Park <[email protected]>
Date:   Tue Apr 22 20:30:25 2025 +0900

    tracing: Fix oob write in trace_seq_to_buffer()
    
    commit f5178c41bb43444a6008150fe6094497135d07cb upstream.
    
    syzbot reported this bug:
    ==================================================================
    BUG: KASAN: slab-out-of-bounds in trace_seq_to_buffer kernel/trace/trace.c:1830 [inline]
    BUG: KASAN: slab-out-of-bounds in tracing_splice_read_pipe+0x6be/0xdd0 kernel/trace/trace.c:6822
    Write of size 4507 at addr ffff888032b6b000 by task syz.2.320/7260
    
    CPU: 1 UID: 0 PID: 7260 Comm: syz.2.320 Not tainted 6.15.0-rc1-syzkaller-00301-g3bde70a2c827 #0 PREEMPT(full)
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025
    Call Trace:
     <TASK>
     __dump_stack lib/dump_stack.c:94 [inline]
     dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120
     print_address_description mm/kasan/report.c:408 [inline]
     print_report+0xc3/0x670 mm/kasan/report.c:521
     kasan_report+0xe0/0x110 mm/kasan/report.c:634
     check_region_inline mm/kasan/generic.c:183 [inline]
     kasan_check_range+0xef/0x1a0 mm/kasan/generic.c:189
     __asan_memcpy+0x3c/0x60 mm/kasan/shadow.c:106
     trace_seq_to_buffer kernel/trace/trace.c:1830 [inline]
     tracing_splice_read_pipe+0x6be/0xdd0 kernel/trace/trace.c:6822
     ....
    ==================================================================
    
    It has been reported that trace_seq_to_buffer() tries to copy more data
    than PAGE_SIZE to buf. Therefore, to prevent this, we should use the
    smaller of trace_seq_used(&iter->seq) and PAGE_SIZE as an argument.
    
    Link: https://lore.kernel.org/[email protected]
    Reported-by: [email protected]
    Fixes: 3c56819b14b0 ("tracing: splice support for tracing_pipe")
    Suggested-by: Steven Rostedt <[email protected]>
    Signed-off-by: Jeongjun Park <[email protected]>
    Signed-off-by: Steven Rostedt (Google) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
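
The clamp described in this entry can be sketched as follows. The function copy_clamped() and the constant DEMO_PAGE_SIZE are hypothetical, and a 4096-byte page is assumed; the 4507-byte figure mentioned afterwards is the write size from the KASAN report above.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/* Hypothetical stand-in: copy at most one page from src into the
 * page-sized buffer dst and return the number of bytes copied. */
static size_t copy_clamped(char *dst, const char *src, size_t used)
{
	size_t len = used < DEMO_PAGE_SIZE ? used : DEMO_PAGE_SIZE;

	memcpy(dst, src, len);
	return len;
}
```

With used set to 4507, only 4096 bytes are copied, so the write stays inside the destination page.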

vxlan: vnifilter: Fix unlocked deletion of default FDB entry [+ + +]

Author: Ido Schimmel <[email protected]>
Date:   Wed Apr 23 17:51:31 2025 +0300

    vxlan: vnifilter: Fix unlocked deletion of default FDB entry
    
    [ Upstream commit 087a9eb9e5978e3ba362e1163691e41097e8ca20 ]
    
    When a VNI is deleted from a VXLAN device in 'vnifilter' mode, the FDB
    entry associated with the default remote (assuming one was configured)
    is deleted without holding the hash lock. This is wrong and will result
    in a warning [1] being generated by the lockdep annotation that was
    added by commit ebe642067455 ("vxlan: Create wrappers for FDB lookup").
    
    Reproducer:
    
     # ip link add vx0 up type vxlan dstport 4789 external vnifilter local 192.0.2.1
     # bridge vni add vni 10010 remote 198.51.100.1 dev vx0
     # bridge vni del vni 10010 dev vx0
    
    Fix by acquiring the hash lock before the deletion and releasing it
    afterwards. Blame the original commit that introduced the issue rather
    than the one that exposed it.
    
    [1]
    WARNING: CPU: 3 PID: 392 at drivers/net/vxlan/vxlan_core.c:417 vxlan_find_mac+0x17f/0x1a0
    [...]
    RIP: 0010:vxlan_find_mac+0x17f/0x1a0
    [...]
    Call Trace:
     <TASK>
     __vxlan_fdb_delete+0xbe/0x560
     vxlan_vni_delete_group+0x2ba/0x940
     vxlan_vni_del.isra.0+0x15f/0x580
     vxlan_process_vni_filter+0x38b/0x7b0
     vxlan_vnifilter_process+0x3bb/0x510
     rtnetlink_rcv_msg+0x2f7/0xb70
     netlink_rcv_skb+0x131/0x360
     netlink_unicast+0x426/0x710
     netlink_sendmsg+0x75a/0xc20
     __sock_sendmsg+0xc1/0x150
     ____sys_sendmsg+0x5aa/0x7b0
     ___sys_sendmsg+0xfc/0x180
     __sys_sendmsg+0x121/0x1b0
     do_syscall_64+0xbb/0x1d0
     entry_SYSCALL_64_after_hwframe+0x4b/0x53
    
    Fixes: f9c4bb0b245c ("vxlan: vni filtering support on collect metadata device")
    Signed-off-by: Ido Schimmel <[email protected]>
    Reviewed-by: Nikolay Aleksandrov <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

wifi: brcm80211: fmac: Add error handling for brcmf_usb_dl_writeimage() [+ + +]

Author: Wentao Liang <[email protected]>
Date:   Tue Apr 22 12:22:02 2025 +0800

    wifi: brcm80211: fmac: Add error handling for brcmf_usb_dl_writeimage()
    
    commit 8e089e7b585d95122c8122d732d1d5ef8f879396 upstream.
    
    The function brcmf_usb_dl_writeimage() calls the function
    brcmf_usb_dl_cmd() but does not check its return value. The
    'state.state' and the 'state.bytes' are uninitialized if the
    function brcmf_usb_dl_cmd() fails. It is dangerous to use
    uninitialized variables in the conditions.
    
    Add error handling for brcmf_usb_dl_cmd() to jump to error
    handling path if the brcmf_usb_dl_cmd() fails and the
    'state.state' and the 'state.bytes' are uninitialized.
    
    Improve the error message to report more detailed error
    information.
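    
    A minimal userspace sketch of the pattern described above, not the
    driver's actual code. The names dl_cmd(), struct dl_state and
    write_image_step() are hypothetical stand-ins; only the idea (return
    before reading an output struct that a failed call never filled in) is
    taken from the description.
    
    ```c
    #include <errno.h>
    #include <stdint.h>

    /* Hypothetical stand-in for the state a download command reports. */
    struct dl_state {
        uint32_t state;
        uint32_t bytes;
    };

    /*
     * Hypothetical command helper: fills *out only on success and returns
     * a negative errno value on failure, leaving *out untouched.
     */
    static int dl_cmd(int fail, struct dl_state *out)
    {
        if (fail)
            return -EIO;
        out->state = 1;
        out->bytes = 512;
        return 0;
    }

    /* Check the return value before looking at any field of 'st'. */
    static int write_image_step(int fail, uint32_t expected_bytes)
    {
        struct dl_state st;
        int err;

        err = dl_cmd(fail, &st);
        if (err)
            return err; /* 'st' is uninitialized here, do not read it */

        if (st.bytes != expected_bytes)
            return -EINVAL;
        return 0;
    }
    ```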
    
    Fixes: 71bb244ba2fd ("brcm80211: fmac: add USB support for bcm43235/6/8 chipsets")
    Cc: [email protected] # v3.4+
    Signed-off-by: Wentao Liang <[email protected]>
    Acked-by: Arend van Spriel <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Johannes Berg <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

wifi: plfxlc: Remove erroneous assert in plfxlc_mac_release [+ + +]

Author: Murad Masimov <[email protected]>
Date:   Fri Mar 21 21:52:25 2025 +0300

    wifi: plfxlc: Remove erroneous assert in plfxlc_mac_release
    
    [ Upstream commit 0fb15ae3b0a9221be01715dac0335647c79f3362 ]
    
    plfxlc_mac_release() asserts that mac->lock is held. This assertion is
    incorrect, because even if it were possible, it would not be valid
    behaviour. The function is used when probe fails or after the device is
    disconnected. In both cases mac->lock cannot be held as the driver is
    not working with the device at the moment. All functions that use mac->lock
    unlock it just after it was held. There is also no need to hold mac->lock
    for plfxlc_mac_release() itself, as mac data is not affected, except for
    mac->flags, which is modified atomically.
    
    This bug leads to the following warning:
    ================================================================
    WARNING: CPU: 0 PID: 127 at drivers/net/wireless/purelifi/plfxlc/mac.c:106 plfxlc_mac_release+0x7d/0xa0
    Modules linked in:
    CPU: 0 PID: 127 Comm: kworker/0:2 Not tainted 6.1.124-syzkaller #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
    Workqueue: usb_hub_wq hub_event
    RIP: 0010:plfxlc_mac_release+0x7d/0xa0 drivers/net/wireless/purelifi/plfxlc/mac.c:106
    Call Trace:
     <TASK>
     probe+0x941/0xbd0 drivers/net/wireless/purelifi/plfxlc/usb.c:694
     usb_probe_interface+0x5c0/0xaf0 drivers/usb/core/driver.c:396
     really_probe+0x2ab/0xcb0 drivers/base/dd.c:639
     __driver_probe_device+0x1a2/0x3d0 drivers/base/dd.c:785
     driver_probe_device+0x50/0x420 drivers/base/dd.c:815
     __device_attach_driver+0x2cf/0x510 drivers/base/dd.c:943
     bus_for_each_drv+0x183/0x200 drivers/base/bus.c:429
     __device_attach+0x359/0x570 drivers/base/dd.c:1015
     bus_probe_device+0xba/0x1e0 drivers/base/bus.c:489
     device_add+0xb48/0xfd0 drivers/base/core.c:3696
     usb_set_configuration+0x19dd/0x2020 drivers/usb/core/message.c:2165
     usb_generic_driver_probe+0x84/0x140 drivers/usb/core/generic.c:238
     usb_probe_device+0x130/0x260 drivers/usb/core/driver.c:293
     really_probe+0x2ab/0xcb0 drivers/base/dd.c:639
     __driver_probe_device+0x1a2/0x3d0 drivers/base/dd.c:785
     driver_probe_device+0x50/0x420 drivers/base/dd.c:815
     __device_attach_driver+0x2cf/0x510 drivers/base/dd.c:943
     bus_for_each_drv+0x183/0x200 drivers/base/bus.c:429
     __device_attach+0x359/0x570 drivers/base/dd.c:1015
     bus_probe_device+0xba/0x1e0 drivers/base/bus.c:489
     device_add+0xb48/0xfd0 drivers/base/core.c:3696
     usb_new_device+0xbdd/0x18f0 drivers/usb/core/hub.c:2620
     hub_port_connect drivers/usb/core/hub.c:5477 [inline]
     hub_port_connect_change drivers/usb/core/hub.c:5617 [inline]
     port_event drivers/usb/core/hub.c:5773 [inline]
     hub_event+0x2efe/0x5730 drivers/usb/core/hub.c:5855
     process_one_work+0x8a9/0x11d0 kernel/workqueue.c:2292
     worker_thread+0xa47/0x1200 kernel/workqueue.c:2439
     kthread+0x28d/0x320 kernel/kthread.c:376
     ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
     </TASK>
    ================================================================
    
    Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
    
    Fixes: 68d57a07bfe5 ("wireless: add plfxlc driver for pureLiFi X, XL, XC devices")
    Reported-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=7d4f142f6c288de8abfe
    Signed-off-by: Murad Masimov <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Johannes Berg <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

xfs: allow symlinks with short remote targets [+ + +]

Author: Darrick J. Wong <[email protected]>
Date:   Wed Apr 30 14:26:59 2025 -0700

    xfs: allow symlinks with short remote targets
    
    [ Upstream commit 38de567906d95c397d87f292b892686b7ec6fbc3 ]
    
    An internal user complained about log recovery failing on a symlink
    ("Bad dinode after recovery") with the following (excerpted) format:
    
    core.magic = 0x494e
    core.mode = 0120777
    core.version = 3
    core.format = 2 (extents)
    core.nlinkv2 = 1
    core.nextents = 1
    core.size = 297
    core.nblocks = 1
    core.naextents = 0
    core.forkoff = 0
    core.aformat = 2 (extents)
    u3.bmx[0] = [startoff,startblock,blockcount,extentflag]
    0:[0,12,1,0]
    
    This is a symbolic link with a 297-byte target stored in a disk block,
    which is to say this is a symlink with a remote target.  The forkoff is
    0, which is to say that there's 512 - 176 == 336 bytes in the inode core
    to store the data fork.
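    
    The arithmetic above can be checked with a small sketch. The constants
    only mirror the numbers quoted in the text (512-byte inode, 176-byte
    core); the helper names are hypothetical and are not XFS functions.
    
    ```c
    #include <stdbool.h>
    #include <stddef.h>

    /* Numbers quoted in the commit message, not read from a filesystem. */
    enum {
        INODE_SIZE = 512,      /* on-disk inode size in bytes */
        INODE_CORE_SIZE = 176, /* bytes used by the inode core */
    };

    /* With forkoff == 0, everything after the core is data fork space. */
    static size_t data_fork_space(void)
    {
        return INODE_SIZE - INODE_CORE_SIZE;
    }

    /* Would a symlink target of this length fit inside the inode itself? */
    static bool target_fits_inline(size_t target_len)
    {
        return target_len <= data_fork_space();
    }
    ```
    
    With these numbers target_fits_inline(297) holds, so the 297-byte
    target would have fit in the inode even though it lives in a remote
    block.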
    
    Eventually, testing of generic/388 failed with the same inode corruption
    message during inode recovery.  In writing a debugging patch to call
    xfs_dinode_verify on dirty inode log items when we're committing
    transactions, I observed that xfs/298 can reproduce the problem quite
    quickly.
    
    xfs/298 creates a symbolic link, adds some extended attributes, then
    deletes them all.  The test failure occurs when the final removexattr
    also deletes the attr fork because that does not convert the remote
    symlink back into a shortform symlink.  That is how we trip this test.
    The only reason why xfs/298 only triggers with the debug patch added is
    that it deletes the symlink, so the final iflush shows the inode as
    free.
    
    I wrote a quick fstest to emulate the behavior of xfs/298, except that
    it leaves the symlinks on the filesystem after inducing the "corrupt"
    state.  Kernels going back at least as far as 4.18 have written out
    symlink inodes in this manner and prior to 1eb70f54c445f they did not
    object to reading them back in.
    
    Because we've been writing out inodes this way for quite some time, the
    only way to fix this is to relax the check for symbolic links.
    Directories don't have this problem because di_size is bumped to
    blocksize during the sf->data conversion.
    
    Fixes: 1eb70f54c445f ("xfs: validate inode fork size against fork format")
    Signed-off-by: "Darrick J. Wong" <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Chandan Babu R <[email protected]>
    Signed-off-by: Leah Rumancik <[email protected]>
    Acked-by: "Darrick J. Wong" <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: allow unlinked symlinks and dirs with zero size [+ + +]

Author: Darrick J. Wong <[email protected]>
Date:   Wed Apr 30 14:27:02 2025 -0700

    xfs: allow unlinked symlinks and dirs with zero size
    
    [ Upstream commit 1ec9307fc066dd8a140d5430f8a7576aa9d78cd3 ]
    
    For a very very long time, inode inactivation has set the inode size to
    zero before unmapping the extents associated with the data fork.
    Unfortunately, commit 3c6f46eacd876 changed the inode verifier to
    prohibit zero-length symlinks and directories.  If an inode happens to
    get logged in this state and the system crashes before freeing the
    inode, log recovery will also fail on the broken inode.
    
    Therefore, allow zero-size symlinks and directories as long as the link
    count is zero; nobody will be able to open these files by handle so
    there isn't any risk of data exposure.
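    
    As a rough sketch of the relaxed rule, assuming a trimmed-down inode
    view: struct inode_view and size_is_acceptable() are hypothetical names
    for illustration and do not reproduce the real verifier.
    
    ```c
    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical, trimmed-down view of the fields the rule looks at. */
    struct inode_view {
        bool     is_symlink_or_dir;
        uint64_t size;
        uint32_t nlink;
    };

    /*
     * Relaxed rule: a zero-size symlink or directory is tolerated only
     * when the inode is already unlinked (link count of zero).
     */
    static bool size_is_acceptable(const struct inode_view *ip)
    {
        if (!ip->is_symlink_or_dir)
            return true; /* the rule only concerns symlinks and dirs */
        if (ip->size != 0)
            return true;
        return ip->nlink == 0;
    }
    ```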
    
    Fixes: 3c6f46eacd876 ("xfs: sanity check directory inode di_size")
    Signed-off-by: Darrick J. Wong <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Chandan Babu R <[email protected]>
    Signed-off-by: Leah Rumancik <[email protected]>
    Acked-by: "Darrick J. Wong" <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: check opcode and iovec count match in xlog_recover_attri_commit_pass2 [+ + +]

Author: Darrick J. Wong <[email protected]>
Date:   Wed Apr 30 14:26:52 2025 -0700

    xfs: check opcode and iovec count match in xlog_recover_attri_commit_pass2
    
    [ Upstream commit ad206ae50eca62836c5460ab5bbf2a6c59a268e7 ]
    
    Check that the number of recovered log iovecs is what the xattri opcode
    is expecting.
    
    Signed-off-by: Darrick J. Wong <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Leah Rumancik <[email protected]>
    Acked-by: "Darrick J. Wong" <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: convert delayed extents to unwritten when zeroing post eof blocks [+ + +]

Author: Zhang Yi <[email protected]>
Date:   Wed Apr 30 14:26:58 2025 -0700

    xfs: convert delayed extents to unwritten when zeroing post eof blocks
    
    [ Upstream commit 5ce5674187c345dc31534d2024c09ad8ef29b7ba ]
    
    The current clone operation can be non-atomic if the destination range
    of a file is beyond EOF; the user could end up with a file with
    corrupted (zeroed) data after a crash.
    
    The problem is about preallocations. If you write some data into a file:
    
            [A...B)
    
    and XFS decides to preallocate some post-eof blocks, then it can create
    a delayed allocation reservation:
    
            [A.........D)
    
    The writeback path tries to convert delayed extents to real ones by
    allocating blocks. If there isn't enough contiguous free space, we can
    end up with two extents, the first real and the second still delalloc:
    
            [A....C)[C.D)
    
    After that, both the in-memory and the on-disk file sizes are still B.
    If we clone into the range [E...F) from another file:
    
            [A....C)[C.D)      [E...F)
    
    then xfs_reflink_zero_posteof() calls iomap_zero_range() to zero out the
    range [B, E) beyond EOF and flush it. Since [C, D) is still a delalloc
    extent, its pagecache will be zeroed and both the in-memory and on-disk
    size will be updated to D after flushing but before cloning. This is
    wrong, because the user can see the size change and read the zeroes
    while the clone operation is ongoing.
    
    We need to keep the in-memory and on-disk size before the clone
    operation starts, so instead of writing zeroes through the page cache
    for delayed ranges beyond EOF, we convert these ranges to unwritten and
    invalidate any cached data over that range beyond EOF.
    
    Suggested-by: Dave Chinner <[email protected]>
    Signed-off-by: Zhang Yi <[email protected]>
    Reviewed-by: "Darrick J. Wong" <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Chandan Babu R <[email protected]>
    Signed-off-by: Leah Rumancik <[email protected]>
    Acked-by: "Darrick J. Wong" <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: fix error returns from xfs_bmapi_write [+ + +]

Author: Christoph Hellwig <[email protected]>
Date:   Wed Apr 30 14:26:48 2025 -0700

    xfs: fix error returns from xfs_bmapi_write
    
    [ Upstream commit 6773da870ab89123d1b513da63ed59e32a29cb77 ]
    
    xfs_bmapi_write can return 0 without actually returning a mapping in
    mval in two different cases:
    
     1) when there is absolutely no space available to do an allocation
     2) when converting delalloc space, and the allocation is so small
        that it only covers parts of the delalloc extent before the
        range requested by the caller
    
    Callers at best can handle one of these cases, but in many cases can't
    cope with either one.  Switch xfs_bmapi_write to always return a
    mapping or return an error code instead.  For case 1) above ENOSPC is
    the obvious choice which is very much what the callers expect anyway.
    For case 2) there is no really good error code, so pick a funky one
    from the SysV streams portfolio.
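    
    The contract described above ("a mapping or an error, never neither")
    can be sketched as follows. enum alloc_result and bmapi_write_status()
    are hypothetical, and the use of ENOSR for case 2) is an assumption made
    for illustration, since the text only says "a funky one".
    
    ```c
    #include <errno.h>

    /* Hypothetical outcome of one allocation attempt. */
    enum alloc_result {
        ALLOC_MAPPED,       /* a mapping for the requested range exists */
        ALLOC_NO_SPACE,     /* case 1: nothing could be allocated */
        ALLOC_BEFORE_RANGE, /* case 2: allocation ended before the range */
    };

    /*
     * Return 0 only when a mapping was produced, otherwise a negative
     * errno.  ENOSR for case 2 is an assumption, see the note above.
     */
    static int bmapi_write_status(enum alloc_result res)
    {
        switch (res) {
        case ALLOC_MAPPED:
            return 0;
        case ALLOC_NO_SPACE:
            return -ENOSPC;
        case ALLOC_BEFORE_RANGE:
            return -ENOSR;
        }
        return -EINVAL; /* unknown outcome */
    }
    ```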
    
    This fixes the reproducer here:
    
        https://lore.kernel.org/linux-xfs/CAEJPjCvT3Uag-pMTYuigEjWZHn1sGMZ0GCjVVCv29tNHK76Cgg@mail.gmail.com0/
    
    which uses reserved blocks to create file systems that are gravely
    out of space and thus cause at least xfs_alloc_file_space to hang
    and trigger the lack of ENOSPC handling in xfs_dquot_disk_alloc.
    
    Note that this patch does not actually make any caller but
    xfs_alloc_file_space deal intelligently with case 2) above.
    
    Signed-off-by: Christoph Hellwig <[email protected]>
    Reported-by: 刘通 <[email protected]>
    Reviewed-by: "Darrick J. Wong" <[email protected]>
    Signed-off-by: Chandan Babu R <[email protected]>
    Signed-off-by: Leah Rumancik <[email protected]>
    Acked-by: "Darrick J. Wong" <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: fix freeing speculative preallocations for preallocated files [+ + +]

Author: Christoph Hellwig <[email protected]>
Date:   Wed Apr 30 14:27:01 2025 -0700

    xfs: fix freeing speculative preallocations for preallocated files
    
    [ Upstream commit 610b29161b0aa9feb59b78dc867553274f17fb01 ]
    
    xfs_can_free_eofblocks returns false for files that have persistent
    preallocations unless the force flag is passed and there are delayed
    blocks.  This means it won't free delalloc reservations for files
    with persistent preallocations unless the force flag is set, and it
    will also free the persistent preallocations if the force flag is
    set and the file happens to have delayed allocations.
    
    Both of these are bad, so do away with the force flag and always free
    only post-EOF delayed allocations for files with the XFS_DIFLAG_PREALLOC
    or APPEND flags set.
    
    Signed-off-by: Christoph Hellwig <[email protected]>
    Reviewed-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Chandan Babu R <[email protected]>
    Signed-off-by: Leah Rumancik <[email protected]>
    Acked-by: "Darrick J. Wong" <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: fix xfs_bmap_add_extent_delay_real for partial conversions [+ + +]

Author: Christoph Hellwig <[email protected]>
Date:   Wed Apr 30 14:26:49 2025 -0700

    xfs: fix xfs_bmap_add_extent_delay_real for partial conversions
    
    [ Upstream commit d69bee6a35d3c5e4873b9e164dd1a9711351a97c ]
    
    xfs_bmap_add_extent_delay_real takes parts or all of a delalloc extent
    and converts them to a real extent.  It is written to deal with any
    potential overlap of the to be converted range with the delalloc extent,
    but it turns out that currently only converting the entire extents, or a
    part starting at the beginning is actually exercised, as the only caller
    always tries to convert the entire delalloc extent, and either succeeds
    or at least progresses partially from the start.
    
    If it only converts a tiny part of a delalloc extent, the indirect block
    calculation for the new delalloc extent (da_new) might be equivalent to that
    of the existing delalloc extent (da_old).  If this extent conversion now
    requires allocating an indirect block, that block gets accounted into
    da_new, causing the assert that da_new must be smaller than or equal to
    da_old unless we split the extent to trigger.
    
    Except for the assert that case is actually handled by just trying to
    allocate more space, as that is already handled for the split case (which
    currently can't be reached at all), so just reusing it should be fine.
    Except that without dipping into the reserved block pool that would make
    it a bit too easy to trigger a fs shutdown due to ENOSPC.  So in addition
    to adjusting the assert, also dip into the reserved block pool.
    
    Note that I could only reproduce the assert with a change to only convert
    the actually asked range instead of the full delalloc extent from
    xfs_bmapi_write.
    
    Signed-off-by: Christoph Hellwig <[email protected]>
    Reviewed-by: "Darrick J. Wong" <[email protected]>
    Signed-off-by: Chandan Babu R <[email protected]>
    Signed-off-by: Leah Rumancik <[email protected]>
    Acked-by: "Darrick J. Wong" <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: make sure sb_fdblocks is non-negative [+ + +]

Author: Wengang Wang <[email protected]>
Date:   Wed Apr 30 14:27:00 2025 -0700

    xfs: make sure sb_fdblocks is non-negative
    
    [ Upstream commit 58f880711f2ba53fd5e959875aff5b3bf6d5c32e ]
    
    A user with a completely full filesystem experienced an unexpected
    shutdown when the filesystem tried to write the superblock during
    runtime.
    The kernel shows the following dmesg output:
    
    [    8.176281] XFS (dm-4): Metadata corruption detected at xfs_sb_write_verify+0x60/0x120 [xfs], xfs_sb block 0x0
    [    8.177417] XFS (dm-4): Unmount and run xfs_repair
    [    8.178016] XFS (dm-4): First 128 bytes of corrupted metadata buffer:
    [    8.178703] 00000000: 58 46 53 42 00 00 10 00 00 00 00 00 01 90 00 00  XFSB............
    [    8.179487] 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [    8.180312] 00000020: cf 12 dc 89 ca 26 45 29 92 e6 e3 8d 3b b8 a2 c3  .....&E)....;...
    [    8.181150] 00000030: 00 00 00 00 01 00 00 06 00 00 00 00 00 00 00 80  ................
    [    8.182003] 00000040: 00 00 00 00 00 00 00 81 00 00 00 00 00 00 00 82  ................
    [    8.182004] 00000050: 00 00 00 01 00 64 00 00 00 00 00 04 00 00 00 00  .....d..........
    [    8.182004] 00000060: 00 00 64 00 b4 a5 02 00 02 00 00 08 00 00 00 00  ..d.............
    [    8.182005] 00000070: 00 00 00 00 00 00 00 00 0c 09 09 03 17 00 00 19  ................
    [    8.182008] XFS (dm-4): Corruption of in-memory data detected.  Shutting down filesystem
    [    8.182010] XFS (dm-4): Please unmount the filesystem and rectify the problem(s)
    
    When xfs_log_sb writes the superblock to disk, sb_fdblocks is fetched from
    m_fdblocks without any lock. m_fdblocks can go through a positive ->
    negative -> positive transition when the FS reaches fullness (see
    xfs_mod_fdblocks), so there is a chance that sb_fdblocks is negative, and
    because sb_fdblocks is of type unsigned long long, it reads as a very
    large value. sb_fdblocks being bigger than sb_dblocks is a problem during
    log recovery, and xfs_validate_sb_write() complains.
    
    Fix:
    As sb_fdblocks will be re-calculated during mount when lazysbcount is
    enabled, we just need to make xfs_validate_sb_write() happy -- make sure
    sb_fdblocks is not negative. This patch also takes care of other percpu
    counters in xfs_log_sb.
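    
    A small sketch of the effect and of the clamping idea, using plain
    64-bit integers instead of the kernel's percpu counters. The helper
    names are hypothetical and this is not the code of the patch.
    
    ```c
    #include <stdint.h>

    /* A negative 64-bit count stored as unsigned becomes a huge value. */
    static uint64_t as_unsigned(int64_t count)
    {
        return (uint64_t)count;
    }

    /* Clamp a possibly negative snapshot to zero before storing it. */
    static uint64_t clamp_nonnegative(int64_t count)
    {
        return count < 0 ? 0 : (uint64_t)count;
    }
    ```
    
    Storing the clamped value keeps the on-disk field from exceeding the
    total block count merely because the snapshot was taken mid-update.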
    
    Signed-off-by: Wengang Wang <[email protected]>
    Reviewed-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Chandan Babu R <[email protected]>
    Signed-off-by: Leah Rumancik <[email protected]>
    Acked-by: "Darrick J. Wong" <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: make the seq argument to xfs_bmapi_convert_delalloc() optional [+ + +]

Author: Zhang Yi <[email protected]>
Date:   Wed Apr 30 14:26:56 2025 -0700

    xfs: make the seq argument to xfs_bmapi_convert_delalloc() optional
    
    [ Upstream commit fc8d0ba0ff5fe4700fa02008b7751ec6b84b7677 ]
    
    Allow callers to pass a NULL seq argument if they don't care about
    the fork sequence number.
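    
    The optional output argument pattern looks roughly like this in plain
    C. convert_range() and its parameters are hypothetical and are not the
    signature of xfs_bmapi_convert_delalloc().
    
    ```c
    #include <stddef.h>

    /*
     * Hypothetical helper with an optional output: the sequence number is
     * written only when the caller supplies somewhere to put it.
     */
    static int convert_range(unsigned int current_seq, unsigned int *seq)
    {
        /* ... the conversion work itself would happen here ... */
        if (seq)
            *seq = current_seq;
        return 0;
    }
    ```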
    
    Signed-off-by: Zhang Yi <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Reviewed-by: "Darrick J. Wong" <[email protected]>
    Signed-off-by: Chandan Babu R <[email protected]>
    Signed-off-by: Leah Rumancik <[email protected]>
    Acked-by: "Darrick J. Wong" <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: make xfs_bmapi_convert_delalloc() to allocate the target offset [+ + +]

Author: Zhang Yi <[email protected]>
Date:   Wed Apr 30 14:26:57 2025 -0700

    xfs: make xfs_bmapi_convert_delalloc() to allocate the target offset
    
    [ Upstream commit 2e08371a83f1c06fd85eea8cd37c87a224cc4cc4 ]
    
    xfs_bmapi_convert_delalloc() only attempts to allocate the entire
    delalloc extent, so it can require multiple invocations to allocate the
    target offset. xfs_convert_blocks() adds a loop to do this job and we call
    it in the writeback path, but xfs_convert_blocks() isn't a common helper.
    Let's do it in xfs_bmapi_convert_delalloc() and drop
    xfs_convert_blocks(), preparing for converting post-EOF delalloc blocks
    in the buffered write begin path.
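    
    A toy model of the "retry until the target offset is covered" loop,
    assuming each attempt converts a bounded number of blocks from the front
    of the delalloc extent. struct delalloc, convert_once() and
    convert_until() are hypothetical names, not kernel functions.
    
    ```c
    #include <stdint.h>

    /* Hypothetical model: a delalloc extent converted from the front. */
    struct delalloc {
        uint64_t next; /* first block not yet converted */
        uint64_t end;  /* one past the last block of the extent */
    };

    /* One attempt converts at most 'chunk' blocks starting at d->next. */
    static void convert_once(struct delalloc *d, uint64_t chunk)
    {
        uint64_t left = d->end - d->next;

        d->next += (chunk < left) ? chunk : left;
    }

    /*
     * Keep converting until the block at 'target' has been converted.
     * Returns the number of attempts, or -1 if the target lies past the
     * extent or no progress is possible.
     */
    static int convert_until(struct delalloc *d, uint64_t target,
                             uint64_t chunk)
    {
        int attempts = 0;

        if (target >= d->end || chunk == 0)
            return -1;
        while (d->next <= target) {
            convert_once(d, chunk);
            attempts++;
        }
        return attempts;
    }
    ```
    
    Keeping the loop inside one helper means every caller gets the same
    "target offset is covered on return" guarantee.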
    
    Signed-off-by: Zhang Yi <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Reviewed-by: "Darrick J. Wong" <[email protected]>
    Signed-off-by: Chandan Babu R <[email protected]>
    Signed-off-by: Leah Rumancik <[email protected]>
    Acked-by: "Darrick J. Wong" <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: match lock mode in xfs_buffered_write_iomap_begin() [+ + +]

Author: Zhang Yi <[email protected]>
Date:   Wed Apr 30 14:26:55 2025 -0700

    xfs: match lock mode in xfs_buffered_write_iomap_begin()
    
    [ Upstream commit bb712842a85d595525e72f0e378c143e620b3ea2 ]
    
    Commit 1aa91d9c9933 ("xfs: Add async buffered write support") replaced
    xfs_ilock(XFS_ILOCK_EXCL) with xfs_ilock_for_iomap() when locking the
    writing inode, and a new variable lockmode is used to indicate the lock
    mode. Although the lockmode should always be XFS_ILOCK_EXCL, it's still
    better to use this variable instead of using XFS_ILOCK_EXCL directly
    when unlocking the inode.
    
    Fixes: 1aa91d9c9933 ("xfs: Add async buffered write support")
    Signed-off-by: Zhang Yi <[email protected]>
    Reviewed-by: "Darrick J. Wong" <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Chandan Babu R <[email protected]>
    Signed-off-by: Leah Rumancik <[email protected]>
    Acked-by: "Darrick J. Wong" <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: remove a racy if_bytes check in xfs_reflink_end_cow_extent [+ + +]

Author: Christoph Hellwig <[email protected]>
Date:   Wed Apr 30 14:26:50 2025 -0700

    xfs: remove a racy if_bytes check in xfs_reflink_end_cow_extent
    
    [ Upstream commit 86de848403abda05bf9c16dcdb6bef65a8d88c41 ]
    
    Accessing if_bytes without the ilock is racy.  Remove the initial
    if_bytes == 0 check in xfs_reflink_end_cow_extent and let
    xfs_iext_lookup_extent fail for this case after we've taken the ilock.
    
    Signed-off-by: Christoph Hellwig <[email protected]>
    Reviewed-by: "Darrick J. Wong" <[email protected]>
    Signed-off-by: Chandan Babu R <[email protected]>
    Signed-off-by: Leah Rumancik <[email protected]>
    Acked-by: "Darrick J. Wong" <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: require XFS_SB_FEAT_INCOMPAT_LOG_XATTRS for attr log intent item recovery [+ + +]

Author: Darrick J. Wong <[email protected]>
Date:   Wed Apr 30 14:26:51 2025 -0700

    xfs: require XFS_SB_FEAT_INCOMPAT_LOG_XATTRS for attr log intent item recovery
    
    [ Upstream commit 8ef1d96a985e4dc07ffbd71bd7fc5604a80cc644 ]
    
    The XFS_SB_FEAT_INCOMPAT_LOG_XATTRS feature bit protects a filesystem
    from old kernels that do not know how to recover extended attribute log
    intent items.  Make this check mandatory instead of a debugging assert.
    
    Fixes: fd920008784ea ("xfs: Set up infrastructure for log attribute replay")
    Signed-off-by: Darrick J. Wong <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Leah Rumancik <[email protected]>
    Acked-by: "Darrick J. Wong" <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: restrict when we try to align cow fork delalloc to cowextsz hints [+ + +]

Author: Darrick J. Wong <[email protected]>
Date:   Wed Apr 30 14:27:03 2025 -0700

    xfs: restrict when we try to align cow fork delalloc to cowextsz hints
    
    [ Upstream commit 288e1f693f04e66be99f27e7cbe4a45936a66745 ]
    
    xfs/205 produces the following failure when always_cow is enabled:
    
    #  --- a/tests/xfs/205.out      2024-02-28 16:20:24.437887970 -0800
    #  +++ b/tests/xfs/205.out.bad  2024-06-03 21:13:40.584000000 -0700
    #  @@ -1,4 +1,5 @@
    #   QA output created by 205
    #   *** one file
    #  +   !!! disk full (expected)
    #   *** one file, a few bytes at a time
    #   *** done
    
    This is the result of overly aggressive attempts to align cow fork
    delalloc reservations to the CoW extent size hint.  Looking at the trace
    data, we're trying to append a single fsblock to the "fred" file.
    Trying to create a speculative post-eof reservation fails because
    there's not enough space.
    
    We then set @prealloc_blocks to zero and try again, but the cowextsz
    alignment code triggers, which expands our request for a 1-fsblock
    reservation into a 39-block reservation.  There's not enough space for
    that, so the whole write fails with ENOSPC even though there's
    sufficient space in the filesystem to allocate the single block that we
    need to land the write.
    
    There are two things wrong here -- first, we shouldn't be attempting
    speculative preallocations beyond what was requested when we're low on
    space.  Second, if we've already computed a posteof preallocation, we
    shouldn't bother trying to align that to the cowextsize hint.
    
    Fix both of these problems by adding a flag that only enables the
    expansion of the delalloc reservation to the cowextsize if we're doing a
    non-extending write, and only if we're not doing an ENOSPC retry.  This
    requires us to move the ENOSPC retry logic to xfs_bmapi_reserve_delalloc.
    
    I probably should have caught this six years ago when 6ca30729c206d was
    being reviewed, but oh well.  Update the comments to reflect what the
    code does now.
    
    Fixes: 6ca30729c206d ("xfs: bmap code cleanup")
    Signed-off-by: Darrick J. Wong <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Chandan Babu R <[email protected]>
    Signed-off-by: Leah Rumancik <[email protected]>
    Acked-by: "Darrick J. Wong" <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: revert commit 44af6c7e59b12 [+ + +]

Author: Darrick J. Wong <[email protected]>
Date:   Wed Apr 30 14:26:54 2025 -0700

    xfs: revert commit 44af6c7e59b12
    
    [ Upstream commit 2a009397eb5ae178670cbd7101e9635cf6412b35 ]
    
    In my haste to fix what I thought was a performance problem in the attr
    scrub code, I neglected to notice that the xfs_attr_get_ilocked call also had
    the effect of checking that attributes can actually be looked up through
    the attr dabtree.  Fix this.
    
    Fixes: 44af6c7e59b12 ("xfs: don't load local xattr values during scrub")
    Signed-off-by: Darrick J. Wong <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Leah Rumancik <[email protected]>
    Acked-by: "Darrick J. Wong" <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: validate recovered name buffers when recovering xattr items [+ + +]

Author: Darrick J. Wong <[email protected]>
Date:   Wed Apr 30 14:26:53 2025 -0700

    xfs: validate recovered name buffers when recovering xattr items
    
    [ Upstream commit 1c7f09d210aba2f2bb206e2e8c97c9f11a3fd880 ]
    
    Strengthen the xattri log item recovery code by checking that we
    actually have the required name and newname buffers for whatever
    operation we're replaying.
    
    Signed-off-by: Darrick J. Wong <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Leah Rumancik <[email protected]>
    Acked-by: "Darrick J. Wong" <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>