  1. Aug 29, 2022
    • genetlink: start to validate reserved header bytes · 9c5d03d3
      Jakub Kicinski authored
      
      We had historically not checked that genlmsghdr.reserved
      is 0 on input, which prevents us from using those precious
      bytes in the future.
      
      One use case would be to extend the cmd field, which is
      currently just 8 bits wide and 256 is not a lot of commands
      for some core families.
      
      To make sure that new families do the right thing by default,
      put the onus of opting out of validation on existing families.
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Acked-by: Paul Moore <paul@paul-moore.com> (NetLabel)
      Signed-off-by: David S. Miller <davem@davemloft.net>
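The reserved-bytes check described in this commit can be pictured with a small userspace sketch. The header struct mirrors the generic netlink header layout (cmd, version, reserved); the family struct, its `skip_reserved_check` field, and the function name are hypothetical and only illustrate the opt-out idea, not the kernel's actual implementation.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace mirror of the generic netlink header: cmd, version, reserved. */
struct genl_hdr_sketch {
    uint8_t  cmd;
    uint8_t  version;
    uint16_t reserved;
};

/* Hypothetical per-family settings; the opt-out field name is illustrative. */
struct genl_family_sketch {
    bool skip_reserved_check; /* set only by legacy families that opt out */
};

/* Return 0 if the header is acceptable, -EINVAL if reserved bytes are set. */
int check_reserved(const struct genl_family_sketch *family,
                   const struct genl_hdr_sketch *hdr)
{
    if (!family->skip_reserved_check && hdr->reserved != 0)
        return -EINVAL;
    return 0;
}
```

In this sketch a new family gets the check by default, and only a family that explicitly sets the opt-out keeps accepting non-zero reserved bytes.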
  2. Mar 14, 2021
  3. Feb 25, 2021
  4. Jan 28, 2021
  5. Oct 03, 2020
  6. May 23, 2020
  7. May 22, 2020
    • net: psample: Add tunnel support · d8bed686
      Chris Mi authored
      
      Currently, psample can only send the packet bits after decapsulation,
      so the tunnel information is lost. Add tunnel support.
      
      If the sampled packet has no tunnel info, the behavior is the same as
      before. If it has, add a nested metadata field named PSAMPLE_ATTR_TUNNEL
      and include the tunnel subfields if applicable.
      
      Increase the metadata length for sampled packets that carry tunnel
      info. If new tunnel subfields are included in the future, update the
      metadata length accordingly.
      
      Signed-off-by: Chris Mi <chrism@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
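A rough sketch of how the extra metadata length for a nested tunnel attribute might be computed follows. The struct, its two subfields (a 64-bit tunnel id and an 8-bit TTL), and the function names are assumptions for illustration; the real set of tunnel subfields is larger and any 64-bit alignment padding is not modelled here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Netlink attribute size: 4-byte header plus payload, padded to 4 bytes. */
size_t attr_size(size_t payload)
{
    return (4 + payload + 3) & ~(size_t)3;
}

/* Hypothetical tunnel description with only two optional subfields. */
struct tunnel_info_sketch {
    bool has_id;   /* 64-bit tunnel id present */
    bool has_ttl;  /* 8-bit TTL present */
};

/* Extra metadata bytes for the nested tunnel attribute (0 if no tunnel). */
size_t tunnel_meta_len(const struct tunnel_info_sketch *tun)
{
    size_t sum;

    if (tun == NULL)
        return 0;                        /* behavior unchanged without tunnel */
    sum = attr_size(0);                  /* header of the nested container */
    if (tun->has_id)
        sum += attr_size(sizeof(uint64_t));
    if (tun->has_ttl)
        sum += attr_size(sizeof(uint8_t));
    return sum;
}
```

Each subfield added to the nested attribute needs a matching term in the length function, which is the maintenance rule the commit message states.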
  8. Nov 26, 2019
    • net: psample: fix skb_over_panic · 7eb9d767
      Nikolay Aleksandrov authored
      
      We need to calculate the skb size correctly, otherwise we risk triggering
      skb_over_panic[1]. The issue is that data_len is added to the skb in a
      netlink attribute, but we don't account for its header size (nlattr,
      4 bytes) and alignment. We account for it correctly when calculating the
      total size in the > PSAMPLE_MAX_PACKET_SIZE comparison, but not when
      allocating after that. The fix is simple: use nla_total_size() for
      data_len when allocating.
      
      To reproduce:
       $ tc qdisc add dev eth1 clsact
       $ tc filter add dev eth1 egress matchall action sample rate 1 group 1 trunc 129
       $ mausezahn eth1 -b bcast -a rand -c 1 -p 129
       < skb_over_panic BUG(), tail is 4 bytes past skb->end >
      
      [1] Trace:
       [   50.459526][ T3480] skbuff: skb_over_panic: text:(____ptrval____) len:196 put:136 head:(____ptrval____) data:(____ptrval____) tail:0xc4 end:0xc0 dev:<NULL>
       [   50.474339][ T3480] ------------[ cut here ]------------
       [   50.481132][ T3480] kernel BUG at net/core/skbuff.c:108!
       [   50.486059][ T3480] invalid opcode: 0000 [#1] PREEMPT SMP
       [   50.489463][ T3480] CPU: 3 PID: 3480 Comm: mausezahn Not tainted 5.4.0-rc7 #108
       [   50.492844][ T3480] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014
       [   50.496551][ T3480] RIP: 0010:skb_panic+0x79/0x7b
       [   50.498261][ T3480] Code: bc 00 00 00 41 57 4c 89 e6 48 c7 c7 90 29 9a 83 4c 8b 8b c0 00 00 00 50 8b 83 b8 00 00 00 50 ff b3 c8 00 00 00 e8 ae ef c0 fe <0f> 0b e8 2f df c8 fe 48 8b 55 08 44 89 f6 4c 89 e7 48 c7 c1 a0 22
       [   50.504111][ T3480] RSP: 0018:ffffc90000447a10 EFLAGS: 00010282
       [   50.505835][ T3480] RAX: 0000000000000087 RBX: ffff888039317d00 RCX: 0000000000000000
       [   50.507900][ T3480] RDX: 0000000000000000 RSI: ffffffff812716e1 RDI: 00000000ffffffff
       [   50.509820][ T3480] RBP: ffffc90000447a60 R08: 0000000000000001 R09: 0000000000000000
       [   50.511735][ T3480] R10: ffffffff81d4f940 R11: 0000000000000000 R12: ffffffff834a22b0
       [   50.513494][ T3480] R13: ffffffff82c10433 R14: 0000000000000088 R15: ffffffff838a8084
       [   50.515222][ T3480] FS:  00007f3536462700(0000) GS:ffff88803eac0000(0000) knlGS:0000000000000000
       [   50.517135][ T3480] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       [   50.518583][ T3480] CR2: 0000000000442008 CR3: 000000003b222000 CR4: 00000000000006e0
       [   50.520723][ T3480] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       [   50.522709][ T3480] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
       [   50.524450][ T3480] Call Trace:
       [   50.525214][ T3480]  skb_put.cold+0x1b/0x1b
       [   50.526171][ T3480]  psample_sample_packet+0x1d3/0x340
       [   50.527307][ T3480]  tcf_sample_act+0x178/0x250
       [   50.528339][ T3480]  tcf_action_exec+0xb1/0x190
       [   50.529354][ T3480]  mall_classify+0x67/0x90
       [   50.530332][ T3480]  tcf_classify+0x72/0x160
       [   50.531286][ T3480]  __dev_queue_xmit+0x3db/0xd50
       [   50.532327][ T3480]  dev_queue_xmit+0x18/0x20
       [   50.533299][ T3480]  packet_sendmsg+0xee7/0x2090
       [   50.534331][ T3480]  sock_sendmsg+0x54/0x70
       [   50.535271][ T3480]  __sys_sendto+0x148/0x1f0
       [   50.536252][ T3480]  ? tomoyo_file_ioctl+0x23/0x30
       [   50.537334][ T3480]  ? ksys_ioctl+0x5e/0xb0
       [   50.540068][ T3480]  __x64_sys_sendto+0x2a/0x30
       [   50.542810][ T3480]  do_syscall_64+0x73/0x1f0
       [   50.545383][ T3480]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
       [   50.548477][ T3480] RIP: 0033:0x7f35357d6fb3
       [   50.551020][ T3480] Code: 48 8b 0d 18 90 20 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d f9 d3 20 00 00 75 13 49 89 ca b8 2c 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 34 c3 48 83 ec 08 e8 eb f6 ff ff 48 89 04 24
       [   50.558547][ T3480] RSP: 002b:00007ffe0c7212c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
       [   50.561870][ T3480] RAX: ffffffffffffffda RBX: 0000000001dac010 RCX: 00007f35357d6fb3
       [   50.565142][ T3480] RDX: 0000000000000082 RSI: 0000000001dac2a2 RDI: 0000000000000003
       [   50.568469][ T3480] RBP: 00007ffe0c7212f0 R08: 00007ffe0c7212d0 R09: 0000000000000014
       [   50.571731][ T3480] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000082
       [   50.574961][ T3480] R13: 0000000001dac2a2 R14: 0000000000000001 R15: 0000000000000003
       [   50.578170][ T3480] Modules linked in: sch_ingress virtio_net
       [   50.580976][ T3480] ---[ end trace 61a515626a595af6 ]---
      
      CC: Yotam Gigi <yotamg@mellanox.com>
      CC: Jiri Pirko <jiri@mellanox.com>
      CC: Jamal Hadi Salim <jhs@mojatatu.com>
      CC: Simon Horman <simon.horman@netronome.com>
      CC: Roopa Prabhu <roopa@cumulusnetworks.com>
      Fixes: 6ae0a628 ("net: Introduce psample, a new genetlink channel for packet sampling")
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
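The size arithmetic behind this fix can be reproduced in a few lines of userspace C. The helper names below are local to this sketch; they mirror the usual netlink attribute layout (a 4-byte header and 4-byte alignment) rather than calling the kernel's nla_total_size().

```c
#include <stddef.h>

/* Values mirror the netlink attribute layout: 4-byte header and alignment. */
#define SKETCH_NLA_ALIGNTO 4u
#define SKETCH_NLA_HDRLEN  4u

/* Round len up to the next multiple of the attribute alignment. */
size_t sketch_nla_align(size_t len)
{
    return (len + SKETCH_NLA_ALIGNTO - 1) & ~(size_t)(SKETCH_NLA_ALIGNTO - 1);
}

/* Bytes an attribute carrying `payload` bytes occupies in the message. */
size_t sketch_nla_total_size(size_t payload)
{
    return sketch_nla_align(SKETCH_NLA_HDRLEN + payload);
}
```

For the 129-byte truncation used in the reproducer, the attribute occupies 4 + 129 = 133 bytes, rounded up to 136, which matches the `put:136` value in the trace. Allocating for the raw 129 bytes alone leaves the header and padding unaccounted for.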
  9. Sep 16, 2019
    • net: sched: take reference to psample group in flow_action infra · 4a5da47d
      Vlad Buslov authored
      
      With the recent patch set that removed the rtnl lock dependency from the
      cls hardware offload API, the rtnl lock is only taken when reading action
      data and can be released after action-specific data is parsed into an
      intermediate representation. However, the sample action's psample group
      is passed by pointer without first obtaining a reference to it, which
      makes it possible to concurrently overwrite the action and deallocate the
      object pointed to by the psample_group pointer after the rtnl lock is
      released but before the driver has finished using the pointer.
      
      To prevent this race condition, obtain a reference to the psample group
      while it is used by the flow_action infra. Extend the psample API with
      the function psample_group_take(), which increments the psample group
      reference counter. Extend struct tc_action_ops with a new
      get_psample_group() API. Implement the API for action sample using
      psample_group_take(), with the already existing psample_group_put() as
      the destructor. Use it in tc_setup_flow_action() to take a reference to
      the psample group pointed to by entry->sample.psample_group and release
      it in tc_cleanup_flow_action().
      
      Disable bh when taking psample_groups_lock. The lock is now taken while
      holding the action's tcf_lock, which is used by the data path and requires
      bh to be disabled, so doing the same for psample_groups_lock is necessary
      to preserve SOFTIRQ-irq-safety.
      
      Fixes: 918190f5 ("net: sched: flower: don't take rtnl lock for cls hw offloads API")
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
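The take/put pairing described here is ordinary reference counting. A minimal single-threaded sketch, using a toy struct and hypothetical function names rather than the kernel's psample API, might look like this. It omits the locking that the commit message discusses.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for a sample group; not the kernel's struct psample_group. */
struct group_sketch {
    unsigned int refcount;
    unsigned int group_num;
};

/* Allocate a group that starts with one reference held by the creator. */
struct group_sketch *group_create(unsigned int group_num)
{
    struct group_sketch *g = calloc(1, sizeof(*g));

    if (g != NULL) {
        g->refcount = 1;
        g->group_num = group_num;
    }
    return g;
}

/* Take an extra reference so the group outlives the original holder. */
void group_take(struct group_sketch *g)
{
    g->refcount++;
}

/* Drop a reference; returns true if this was the last one and g was freed. */
bool group_put(struct group_sketch *g)
{
    if (--g->refcount == 0) {
        free(g);
        return true;
    }
    return false;
}
```

A consumer that copies the pointer into its own representation calls `group_take()` when it does so and `group_put()` from its cleanup path, so an overwrite of the action in between cannot free the object underneath it.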
  10. Aug 28, 2019
    • net: sched: act_sample: fix psample group handling on overwrite · dbf47a2a
      Vlad Buslov authored
      
      Action sample doesn't properly handle the psample_group pointer in the
      overwrite case. The following issues need to be fixed:
      
      - In the tcf_sample_init() function, RCU_INIT_POINTER() is used to set
        s->psample_group, even though we are neither setting the pointer to
        NULL nor preventing concurrent readers from accessing the pointer in
        some way. Use rcu_swap_protected() instead to safely reset the pointer.
      
      - The old value of s->psample_group is not released or deallocated in any
        way, which results in a resource leak. Use psample_group_put() on the
        non-NULL value obtained with rcu_swap_protected().
      
      - The function psample_group_put(), which releases the reference to the
        struct psample_group pointed to by the rcu-pointer s->psample_group,
        doesn't respect the rcu grace period when deallocating it. Extend struct
        psample_group with an rcu head and use kfree_rcu when freeing it.
      
      Fixes: 5c5670fa ("net/sched: Introduce sample tc action")
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
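The swap-then-release pattern from the first two bullets can be approximated in userspace with a C11 atomic exchange. This is only an analogy: it is not RCU, the deferred free after a grace period (the third bullet) is not modelled, and all names are hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Toy group with a plain counter; freeing is left out of this sketch. */
struct grp_sketch {
    int refcount;
};

/* Toy action that publishes its current group through an atomic pointer. */
struct action_sketch {
    _Atomic(struct grp_sketch *) grp;
};

/* Drop one reference and return the remaining count. */
int grp_put(struct grp_sketch *g)
{
    return --g->refcount;
}

/*
 * Install new_grp and release the action's reference on the previous
 * group, if there was one. Returns the previous pointer for inspection.
 */
struct grp_sketch *action_replace_group(struct action_sketch *a,
                                        struct grp_sketch *new_grp)
{
    struct grp_sketch *old = atomic_exchange(&a->grp, new_grp);

    if (old != NULL)
        grp_put(old);
    return old;
}
```

The point of the exchange is that the old pointer is obtained in the same step that publishes the new one, so the old group can always be released instead of being silently overwritten.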
  11. Jun 19, 2019
  12. May 21, 2019
  13. Apr 27, 2019
    • genetlink: optionally validate strictly/dumps · ef6243ac
      Johannes Berg authored
      
      Add options to strictly validate messages and dump messages.
      Sometimes validating dump messages non-strictly may be
      required, so add an option for that as well.
      
      Since none of this can really be applied to existing commands,
      set the options everywhere using the following spatch:
      
          @@
          identifier ops;
          expression X;
          @@
          struct genl_ops ops[] = {
          ...,
           {
                  .cmd = X,
          +       .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
                  ...
           },
          ...
          };
      
      For new commands one should just not copy the .validate 'opt-out'
      flags and thus get strict validation.
      
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
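The opt-out design means that an operation with no flags set gets the strict behavior. A small sketch of that logic follows; the enum mirrors the two flag names from the spatch with a `SKETCH_` prefix, and the bit positions, struct, and helper names are assumptions made for this illustration.

```c
#include <stdbool.h>

/* Mirror of the opt-out flags named above; bit positions are assumed. */
enum validate_flags_sketch {
    SKETCH_DONT_VALIDATE_STRICT = 1u << 0,
    SKETCH_DONT_VALIDATE_DUMP   = 1u << 1,
};

/* Toy operation descriptor with just a command id and validation flags. */
struct op_sketch {
    unsigned char cmd;
    unsigned int  validate;
};

/* Strict validation applies unless the op explicitly opts out. */
bool op_validates_strictly(const struct op_sketch *op)
{
    return !(op->validate & SKETCH_DONT_VALIDATE_STRICT);
}

/* Dump requests are validated unless the op explicitly opts out. */
bool op_validates_dumps(const struct op_sketch *op)
{
    return !(op->validate & SKETCH_DONT_VALIDATE_DUMP);
}
```

An existing command converted by the spatch carries both flags and keeps its old behavior, while a new command that leaves `.validate` at zero is validated strictly.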
  14. Nov 01, 2017
  15. Jun 16, 2017
    • networking: make skb_put & friends return void pointers · 4df864c1
      Johannes Berg authored
      
      It seems like a historic accident that these return unsigned char *,
      and in many places that means casts are required, more often than not.
      
      Make these functions (skb_put, __skb_put and pskb_put) return void *
      and remove all the casts across the tree, adding a (u8 *) cast only
      where the unsigned char pointer was used directly, all done with the
      following spatch:
      
          @@
          expression SKB, LEN;
          typedef u8;
          identifier fn = { skb_put, __skb_put };
          @@
          - *(fn(SKB, LEN))
          + *(u8 *)fn(SKB, LEN)
      
          @@
          expression E, SKB, LEN;
          identifier fn = { skb_put, __skb_put };
          type T;
          @@
          - E = ((T *)(fn(SKB, LEN)))
          + E = fn(SKB, LEN)
      
      which actually doesn't cover pskb_put since there are only three
      users overall.
      
      A handful of stragglers were converted manually, notably a macro in
      drivers/isdn/i4l/isdn_bsdcomp.c and, oddly enough, one of the many
      instances in net/bluetooth/hci_sock.c. In the former file, I also
      had to fix one whitespace problem spatch introduced.
      
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
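Why a `void *` return removes the casts can be seen with a toy tail-append buffer. Everything below is a userspace stand-in with hypothetical names, not the skb API: `buf_put()` plays the role of a put helper that reserves bytes at the tail and returns a pointer to them.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy buffer with a fixed capacity and a running length. */
struct buf_sketch {
    unsigned char data[64];
    size_t len;
};

/* Reserve len bytes at the tail; returns their start, or NULL if full. */
void *buf_put(struct buf_sketch *b, size_t len)
{
    void *tail;

    if (len > sizeof(b->data) - b->len)
        return NULL;
    tail = b->data + b->len;
    b->len += len;
    return tail;
}

/* Hypothetical fixed-size header appended to the buffer. */
struct sample_hdr_sketch {
    uint32_t group;
    uint32_t seq;
};

/* The void * result is passed straight to memcpy, no cast needed. */
int buf_add_hdr(struct buf_sketch *b, uint32_t group, uint32_t seq)
{
    struct sample_hdr_sketch hdr = { .group = group, .seq = seq };
    void *dst = buf_put(b, sizeof(hdr));

    if (dst == NULL)
        return -1;
    memcpy(dst, &hdr, sizeof(hdr));
    return 0;
}

/* The void * result converts implicitly to a typed pointer as well. */
int buf_add_byte(struct buf_sketch *b, uint8_t value)
{
    uint8_t *p = buf_put(b, 1);

    if (p == NULL)
        return -1;
    *p = value;
    return 0;
}
```

With an `unsigned char *` return type, the assignment in `buf_add_byte()` would still work, but any caller storing the result in a struct pointer would need an explicit cast, which is the pattern the spatch removes.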
  16. Jan 24, 2017
    • net: Introduce psample, a new genetlink channel for packet sampling · 6ae0a628
      Yotam Gigi authored
      
      Add a general way for kernel modules to sample packets, without being tied
      to any specific subsystem. This netlink channel can be used by tc,
      iptables, etc. and allows packet sampling in the kernel to be standardized.
      
      For every sampled packet, the psample module adds the following metadata
      fields:
      
      PSAMPLE_ATTR_IIFINDEX - the packet's input ifindex, if applicable
      
      PSAMPLE_ATTR_OIFINDEX - the packet's output ifindex, if applicable
      
      PSAMPLE_ATTR_ORIGSIZE - the packet's original size, in case it has been
         truncated during sampling
      
      PSAMPLE_ATTR_SAMPLE_GROUP - the packet's sample group, which is set by the
         user who initiated the sampling. This field allows the user to
         differentiate between several samplers working simultaneously and
         filter the packets relevant to them
      
      PSAMPLE_ATTR_GROUP_SEQ - sequence counter of last sent packet. The
         sequence is kept for each group
      
      PSAMPLE_ATTR_SAMPLE_RATE - the sampling rate used for sampling the packets
      
      PSAMPLE_ATTR_DATA - the actual packet bits
      
      The sampled packets are sent to the PSAMPLE_NL_MCGRP_SAMPLE multicast
      group. In addition, add the GET_GROUPS netlink command which allows the
      user to see the current sample groups, their refcount and sequence number.
      This command currently supports only netlink dump mode.
      
      Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Reviewed-by: Simon Horman <simon.horman@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
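The metadata fields listed in this commit travel as netlink attributes, each a small type-length-value record. The sketch below shows how 32-bit fields such as a group number or sequence counter could be laid out in a flat buffer. The attribute ids are hypothetical placeholders rather than the uapi values, and the writer is a userspace illustration, not the kernel's nla_put helpers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Attribute header layout used by netlink: 16-bit length, 16-bit type. */
struct attr_hdr_sketch {
    uint16_t len;   /* header plus payload, excluding padding */
    uint16_t type;
};

/* Hypothetical attribute ids for illustration; not the uapi values. */
enum {
    SKETCH_ATTR_GROUP = 1,
    SKETCH_ATTR_SEQ   = 2,
    SKETCH_ATTR_RATE  = 3,
};

/*
 * Append a u32 attribute at offset off in buf (capacity cap).
 * Returns the new offset, or 0 if the attribute does not fit.
 */
size_t put_u32(unsigned char *buf, size_t cap, size_t off,
               uint16_t type, uint32_t value)
{
    struct attr_hdr_sketch hdr = {
        .len  = (uint16_t)(sizeof(hdr) + sizeof(value)),
        .type = type,
    };
    size_t total = ((size_t)hdr.len + 3) & ~(size_t)3; /* pad to 4 bytes */

    if (off > cap || total > cap - off)
        return 0;
    memcpy(buf + off, &hdr, sizeof(hdr));
    memcpy(buf + off + sizeof(hdr), &value, sizeof(value));
    return off + total;
}
```

A receiver walks the buffer by reading each header, using `type` to pick the field and `len` (rounded up to 4) to find the next attribute, which is how a listener on the multicast group could tell the group number apart from the sequence counter.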