1. 02 Apr, 2015 7 commits
  2. 27 Mar, 2015 1 commit
    • Andi Kleen's avatar
      perf/x86/intel: Add INST_RETIRED.ALL workarounds · 294fe0f5
      Andi Kleen authored
      On Broadwell INST_RETIRED.ALL cannot be used with any period
      that doesn't have the lowest 6 bits cleared. And the period
      should not be smaller than 128.
      
      This is erratum BDM11 and BDM55:
      
        http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/5th-gen-core-family-spec-update.pdf
      
      
      
      BDM11: When using a period < 100; we may get incorrect PEBS/PMI
      interrupts and/or an invalid counter state.
      BDM55: When bit0-5 of the period are !0 we may get redundant PEBS
      records on overflow.
      
      Add a new callback to enforce this, and set it for Broadwell.
      
      How does this handle the case when an app requests a specific
      period with some of the bottom bits set?
      
      Short answer:
      
      Any useful instruction sampling period needs to be 4-6 orders
      of magnitude larger than 128, as an PMI every 128 instructions
      would instantly overwhelm the system and be throttled.
      So the +-64 error from this is really small compared to the
      period, much smaller than normal system jitter.
      
      Long answer (by Peterz):
      
      IFF we guarantee perf_event_attr::sample_period >= 128.
      
      Suppose we start out with sample_period=192; then we'll set period_left
      to 192, we'll end up with left = 128 (we truncate the lower bits). We
      get an interrupt, find that period_left = 64 (>0 so we return 0 and
      don't get an overflow handler), up that to 128. Then we trigger again,
      at n=256. Then we find period_left = -64 (<=0 so we return 1 and do get
      an overflow). We increment with sample_period so we get left = 128. We
      fire again, at n=384, period_left = 0 (<=0 so we return 1 and get an
      overflow). And on and on.
      
      So while the individual interrupts are 'wrong' we get then with
      interval=256,128 in exactly the right ratio to average out at 192. And
      this works for everything >=128.
      
      So the num_samples*fixed_period thing is still entirely correct +- 127,
      which is good enough I'd say, as you already have that error anyhow.
      
      So no need to 'fix' the tools, al we need to do is refuse to create
      INST_RETIRED:ALL events with sample_period < 128.
      Signed-off-by: default avatarAndi Kleen <ak@linux.intel.com>
      [ Updated comments and changelog a bit. ]
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1424225886-18652-3-git-send-email-andi@firstfloor.org
      
      Signed-off-by: Ingo Molnar's avatarIngo Molnar <mingo@kernel.org>
      294fe0f5
  3. 18 Feb, 2015 6 commits
  4. 04 Feb, 2015 1 commit
    • Andy Lutomirski's avatar
      perf/x86: Only allow rdpmc if a perf_event is mapped · 7911d3f7
      Andy Lutomirski authored
      
      
      We currently allow any process to use rdpmc.  This significantly
      weakens the protection offered by PR_TSC_DISABLED, and it could be
      helpful to users attempting to exploit timing attacks.
      
      Since we can't enable access to individual counters, use a very
      coarse heuristic to limit access to rdpmc: allow access only when
      a perf_event is mmapped.  This protects seccomp sandboxes.
      
      There is plenty of room to further tighen these restrictions.  For
      example, this allows rdpmc for any x86_pmu event, but it's only
      useful for self-monitoring tasks.
      
      As a side effect, cap_user_rdpmc will now be false for AMD uncore
      events.  This isn't a real regression, since .event_idx is disabled
      for these events anyway for the time being.  Whenever that gets
      re-added, the cap_user_rdpmc code can be adjusted or refactored
      accordingly.
      Signed-off-by: default avatarAndy Lutomirski <luto@amacapital.net>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Vince Weaver <vince@deater.net>
      Cc: "hillf.zj" <hillf.zj@alibaba-inc.com>
      Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/a2bdb3cf3a1d70c26980d7c6dddfbaa69f3182bf.1414190806.git.luto@amacapital.net
      
      Signed-off-by: Ingo Molnar's avatarIngo Molnar <mingo@kernel.org>
      7911d3f7
  5. 16 Nov, 2014 1 commit
  6. 29 Oct, 2014 1 commit
    • Ingo Molnar's avatar
      perf/x86/intel: Revert incomplete and undocumented Broadwell client support · 1776b106
      Ingo Molnar authored
      These patches:
      
        86a349a2 ("perf/x86/intel: Add Broadwell core support")
        c46e665f ("perf/x86: Add INST_RETIRED.ALL workarounds")
        fdda3c4a ("perf/x86/intel: Use Broadwell cache event list for Haswell")
      
      introduced magic constants and unexplained changes:
      
        https://lkml.org/lkml/2014/10/28/1128
        https://lkml.org/lkml/2014/10/27/325
        https://lkml.org/lkml/2014/8/27/546
        https://lkml.org/lkml/2014/10/28/546
      
      Peter Zijlstra has attempted to help out, to clean up the mess:
      
        https://lkml.org/lkml/2014/10/28/543
      
      But has not received helpful and constructive replies which makes
      me doubt wether it can all be finished in time until v3.18 is
      released.
      
      Despite various review feedback the author (Andi Kleen) has answered
      only few of the review questions and has generally been uncooperative,
      only giving replies when prompted repeatedly, and only giving minimal
      answers instead of constructively explaining and helping along the effort.
      
      That kind of behavior is not acceptable.
      
      There's also a boot crash on Intel E5-1630 v3 CPUs reported for another
      commit from Andi Kleen:
      
        e735b9db ("perf/x86/intel/uncore: Add Haswell-EP uncore support")
      
        https://lkml.org/lkml/2014/10/22/730
      
      Which is not yet resolved. The uncore driver is independent in theory,
      but the crash makes me worry about how well all these patches were
      tested and makes me uneasy about the level of interminging that the
      Broadwell and Haswell code has received by the commits above.
      
      As a first step to resolve the mess revert the Broadwell client commits
      back to the v3.17 version, before we run out of time and problematic
      code hits a stable upstream kernel.
      
      ( If the Haswell-EP crash is not resolved via a simple fix then we'll have
        to revert the Haswell-EP uncore driver as well. )
      
      The Broadwell client series has to be submitted in a clean fashion, with
      single, well documented changes per patch. If they are submitted in time
      and are accepted during review then they can possibly go into v3.19 but
      will need additional scrutiny due to the rocky history of this patch set.
      
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: eranian@google.com
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1409683455-29168-3-git-send-email-andi@firstfloor.org
      
      Signed-off-by: Ingo Molnar's avatarIngo Molnar <mingo@kernel.org>
      1776b106
  7. 24 Sep, 2014 1 commit
    • Andi Kleen's avatar
      perf/x86: Add INST_RETIRED.ALL workarounds · c46e665f
      Andi Kleen authored
      
      
      On Broadwell INST_RETIRED.ALL cannot be used with any period
      that doesn't have the lowest 6 bits cleared. And the period
      should not be smaller than 128.
      
      Add a new callback to enforce this, and set it for Broadwell.
      
      This is erratum BDM57 and BDM11.
      
      How does this handle the case when an app requests a specific
      period with some of the bottom bits set
      
      The apps thinks it is sampling at X occurences per sample, when it is
      in fact at X - 63 (worst case).
      
      Short answer:
      
      Any useful instruction sampling period needs to be 4-6 orders
      of magnitude larger than 128, as an PMI every 128 instructions
      would instantly overwhelm the system and be throttled.
      So the +-64 error from this is really small compared to the
      period, much smaller than normal system jitter.
      
      Long answer:
      
      <write up by Peter:>
      
      IFF we guarantee perf_event_attr::sample_period >= 128.
      
      Suppose we start out with sample_period=192; then we'll set period_left
      to 192, we'll end up with left = 128 (we truncate the lower bits). We
      get an interrupt, find that period_left = 64 (>0 so we return 0 and
      don't get an overflow handler), up that to 128. Then we trigger again,
      at n=256. Then we find period_left = -64 (<=0 so we return 1 and do get
      an overflow). We increment with sample_period so we get left = 128. We
      fire again, at n=384, period_left = 0 (<=0 so we return 1 and get an
      overflow). And on and on.
      
      So while the individual interrupts are 'wrong' we get then with
      interval=256,128 in exactly the right ratio to average out at 192. And
      this works for everything >=128.
      
      So the num_samples*fixed_period thing is still entirely correct +- 127,
      which is good enough I'd say, as you already have that error anyhow.
      
      So no need to 'fix' the tools, al we need to do is refuse to create
      INST_RETIRED:ALL events with sample_period < 128.
      Signed-off-by: default avatarAndi Kleen <ak@linux.intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
      Cc: Mark Davies <junk@eslaf.co.uk>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1409683455-29168-4-git-send-email-andi@firstfloor.org
      
      Signed-off-by: Ingo Molnar's avatarIngo Molnar <mingo@kernel.org>
      c46e665f
  8. 13 Aug, 2014 1 commit
    • Andi Kleen's avatar
      perf/x86: Revamp PEBS event selection · 86a04461
      Andi Kleen authored
      
      
      The basic idea is that it does not make sense to list all PEBS
      events individually. The list is very long, sometimes outdated
      and the hardware doesn't need it. If an event does not support
      PEBS it will just not count, there is no security issue.
      
      We need to only list events that something special, like
      supporting load or store addresses.
      
      This vastly simplifies the PEBS event selection. It also
      speeds up the scheduling because the scheduler doesn't
      have to walk as many constraints.
      
      Bugs fixed:
      
       - We do not allow setting forbidden flags with PEBS anymore
         (SDM 18.9.4), except for the special cycle event.
         This is done using a new constraint macro that also
         matches on the event flags.
      
       - Correct DataLA and load/store/na flags reporting on Haswell
         [Requires a followon patch]
      
       - We did not allow all PEBS events on Haswell:
         We were missing some valid subevents in d1-d2 (MEM_LOAD_UOPS_RETIRED.*,
         MEM_LOAD_UOPS_RETIRED_L3_HIT_RETIRED.*)
      
      This includes the changes proposed by Stephane earlier and obsoletes
      his patchkit (except for some changes on pre Sandy Bridge/Silvermont
      CPUs)
      
      I only did Sandy Bridge and Silvermont and later so far, mostly because these
      are the parts I could directly confirm the hardware behavior with hardware
      architects. Also I do not believe the older CPUs have any
      missing events in their PEBS list, so there's no pressing
      need to change them.
      
      I did not implement the flag proposed by Peter to allow
      setting forbidden flags. If really needed this could
      be implemented on to of this patch.
      
      v2: Fix broken store events on SNB/IVB (Stephane Eranian)
      v3: More fixes. Rename some arguments (Stephane Eranian)
      v4: List most Haswell events individually again to report
      memory operation type correctly.
      Add new flags to describe load/store/na for datala.
      Update description.
      Signed-off-by: default avatarAndi Kleen <ak@linux.intel.com>
      Reviewed-by: default avatarStephane Eranian <eranian@google.com>
      Signed-off-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1407785233-32193-2-git-send-email-eranian@google.com
      
      
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
      Cc: Mark Davies <junk@eslaf.co.uk>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Yan, Zheng <zheng.z.yan@intel.com>
      Signed-off-by: Ingo Molnar's avatarIngo Molnar <mingo@kernel.org>
      86a04461
  9. 16 Jul, 2014 1 commit
    • Kan Liang's avatar
      perf/x86/intel: Protect LBR and extra_regs against KVM lying · 338b522c
      Kan Liang authored
      
      
      With -cpu host, KVM reports LBR and extra_regs support, if the host has
      support.
      
      When the guest perf driver tries to access LBR or extra_regs MSR,
      it #GPs all MSR accesses,since KVM doesn't handle LBR and extra_regs support.
      So check the related MSRs access right once at initialization time to avoid
      the error access at runtime.
      
      For reproducing the issue, please build the kernel with CONFIG_KVM_INTEL = y
      (for host kernel).
      And CONFIG_PARAVIRT = n and CONFIG_KVM_GUEST = n (for guest kernel).
      Start the guest with -cpu host.
      Run perf record with --branch-any or --branch-filter in guest to trigger LBR
      Run perf stat offcore events (E.g. LLC-loads/LLC-load-misses ...) in guest to
      trigger offcore_rsp #GP
      Signed-off-by: default avatarKan Liang <kan.liang@intel.com>
      Signed-off-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
      Cc: Mark Davies <junk@eslaf.co.uk>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Yan, Zheng <zheng.z.yan@intel.com>
      Link: http://lkml.kernel.org/r/1405365957-20202-1-git-send-email-kan.liang@intel.com
      
      Signed-off-by: Ingo Molnar's avatarIngo Molnar <mingo@kernel.org>
      338b522c
  10. 27 Feb, 2014 1 commit
  11. 09 Feb, 2014 1 commit
  12. 05 Dec, 2013 1 commit
    • Maria Dimakopoulou's avatar
      perf/x86: Fix constraint table end marker bug · cf30d52e
      Maria Dimakopoulou authored
      
      
      The EVENT_CONSTRAINT_END() macro defines the end marker as
      a constraint with a weight of zero. This was all fine
      until we blacklisted the corrupting memory events on
      Intel IvyBridge. These events are blacklisted by using
      a counter bitmask of zero. Thus, they also get a constraint
      weight of zero.
      
      The iteration macro: for_each_constraint tests the weight==0.
      Therefore, it was stopping at the first blacklisted event, i.e.,
      0xd0. The corrupting events were therefore considered as
      unconstrained and were scheduled on any of the generic counters.
      
      This patch fixes the end marker to have a weight of -1. With
      this, the blacklisted events get an empty constraint and cannot
      be scheduled which is what we want for now.
      Signed-off-by: default avatarMaria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
      Reviewed-by: default avatarStephane Eranian <eranian@google.com>
      Cc: peterz@infradead.org
      Cc: ak@linux.intel.com
      Cc: jolsa@redhat.com
      Cc: zheng.z.yan@intel.com
      Link: http://lkml.kernel.org/r/20131204232437.GA10689@starlight
      
      Signed-off-by: Ingo Molnar's avatarIngo Molnar <mingo@kernel.org>
      cf30d52e
  13. 04 Oct, 2013 1 commit
  14. 12 Sep, 2013 1 commit
  15. 02 Sep, 2013 1 commit
  16. 26 Jun, 2013 2 commits
    • Stephane Eranian's avatar
      perf/x86: Fix shared register mutual exclusion enforcement · 2f7f73a5
      Stephane Eranian authored
      
      
      This patch fixes a problem with the shared registers mutual
      exclusion code and incremental event scheduling by the
      generic perf_event code.
      
      There was a bug whereby the mutual exclusion on the shared
      registers was not enforced because of incremental scheduling
      abort due to event constraints. As an example on Intel
      Nehalem, consider the following events:
      
      group1= L1D_CACHE_LD:E_STATE,OFFCORE_RESPONSE_0:PF_RFO,L1D_CACHE_LD:I_STATE
      group2= L1D_CACHE_LD:I_STATE
      
      The L1D_CACHE_LD event can only be measured by 2 counters. Yet, there
      are 3 instances here. The first group can be scheduled and is committed.
      Then, the generic code tries to schedule group2 and this fails (because
      there is no more counter to support the 3rd instance of L1D_CACHE_LD).
      But in x86_schedule_events() error path, put_event_contraints() is invoked
      on ALL the events and not just the ones that just failed. That causes the
      "lock" on the shared offcore_response MSR to be released. Yet the first group
      is actually scheduled and is exposed to reprogramming of that shared msr by
      the sibling HT thread. In other words, there is no guarantee on what is
      measured.
      
      This patch fixes the problem by tagging committed events with the
      PERF_X86_EVENT_COMMITTED tag. In the error path of x86_schedule_events(),
      only the events NOT tagged have their constraint released. The tag
      is eventually removed when the event in descheduled.
      Signed-off-by: default avatarStephane Eranian <eranian@google.com>
      Signed-off-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20130620164254.GA3556@quad
      
      Signed-off-by: Ingo Molnar's avatarIngo Molnar <mingo@kernel.org>
      2f7f73a5
    • Andi Kleen's avatar
      perf/x86/intel: Support full width counting · 069e0c3c
      Andi Kleen authored
      
      
      Recent Intel CPUs like Haswell and IvyBridge have a new
      alternative MSR range for perfctrs that allows writing the full
      counter width. Enable this range if the hardware reports it
      using a new capability bit.
      
      Currently the perf code queries CPUID to get the counter width,
      and sign extends the counter values as needed. The traditional
      PERFCTR MSRs always limit to 32bit, even though the counter
      internally is larger (usually 48 bits on recent CPUs)
      
      When the new capability is set use the alternative range which
      do not have these restrictions.
      
      This lowers the overhead of perf stat slightly because it has to
      do less interrupts to accumulate the counter value. On Haswell
      it also avoids some problems with TSX aborting when the end of
      the counter range is reached.
      
      ( See the patch "perf/x86/intel: Avoid checkpointed counters
        causing excessive TSX aborts" for more details. )
      Signed-off-by: default avatarAndi Kleen <ak@linux.intel.com>
      Reviewed-by: default avatarStephane Eranian <eranian@google.com>
      Acked-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1372173153-20215-1-git-send-email-andi@firstfloor.org
      
      Signed-off-by: Ingo Molnar's avatarIngo Molnar <mingo@kernel.org>
      069e0c3c
  17. 19 Jun, 2013 5 commits
  18. 01 Apr, 2013 3 commits
  19. 26 Mar, 2013 2 commits
  20. 06 Feb, 2013 2 commits