From 90ec5e89e393c76e19afc845d8f88a5dc8315919 Mon Sep 17 00:00:00 2001 From: "Naveen N. Rao" Date: Wed, 22 Feb 2017 19:23:37 +0530 Subject: kretprobes: Ensure probe location is at function entry kretprobes can be registered by specifying an absolute address or by specifying offset to a symbol. However, we need to ensure this falls at function entry so as to be able to determine the return address. Validate the same during kretprobe registration. By default, there should not be any offset from a function entry, as determined through a kallsyms_lookup(). Introduce arch_function_offset_within_entry() as a way for architectures to override this. Signed-off-by: Naveen N. Rao Acked-by: Masami Hiramatsu Cc: Ananth N Mavinakayanahalli Cc: Michael Ellerman Cc: Steven Rostedt Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/f1583bc4839a3862cfc2acefcc56f9c8837fa2ba.1487770934.git.naveen.n.rao@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo --- kernel/kprobes.c | 13 +++++++++++++ 1 file changed, 13 insertions(+) (limited to 'kernel') diff --git a/kernel/kprobes.c b/kernel/kprobes.c index 699c5bc51a92..448759d4a263 100644 --- a/kernel/kprobes.c +++ b/kernel/kprobes.c @@ -1875,12 +1875,25 @@ static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs) } NOKPROBE_SYMBOL(pre_handler_kretprobe); +bool __weak arch_function_offset_within_entry(unsigned long offset) +{ + return !offset; +} + int register_kretprobe(struct kretprobe *rp) { int ret = 0; struct kretprobe_instance *inst; int i; void *addr; + unsigned long offset; + + addr = kprobe_addr(&rp->kp); + if (!kallsyms_lookup_size_offset((unsigned long)addr, NULL, &offset)) + return -EINVAL; + + if (!arch_function_offset_within_entry(offset)) + return -EINVAL; if (kretprobe_blacklist_size) { addr = kprobe_addr(&rp->kp); -- cgit v1.2.3 From 35b6f55aa9ba65141f2def0997e23aab13715d3f Mon Sep 17 00:00:00 2001 From: "Naveen N. Rao" Date: Wed, 22 Feb 2017 19:23:39 +0530 Subject: trace/kprobes: Allow return probes with offsets and absolute addresses Since the kernel includes many non-global functions with same names, we will need to use offsets from other symbols (typically _text/_stext) or absolute addresses to place return probes on specific functions. Also, the core register_kretprobe() API never forbid use of offsets or absolute addresses with kretprobes. Allow its use with the trace infrastructure. To distinguish kernels that support this, update ftrace README to explicitly call this out. Signed-off-by: Naveen N. Rao Acked-by: Masami Hiramatsu Cc: Ananth N Mavinakayanahalli Cc: Michael Ellerman Cc: Steven Rostedt Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/183e7ce2921a08c9c755ee9a5da3134febc6695b.1487770934.git.naveen.n.rao@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo --- kernel/trace/trace.c | 1 + kernel/trace/trace_kprobe.c | 8 -------- 2 files changed, 1 insertion(+), 8 deletions(-) (limited to 'kernel') diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index f35109514a01..0ed834d6beb0 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -4355,6 +4355,7 @@ static const char readme_msg[] = "\t -:[/]\n" #ifdef CONFIG_KPROBE_EVENTS "\t place: [:][+]|\n" + "place (kretprobe): [:][+]|\n" #endif #ifdef CONFIG_UPROBE_EVENTS "\t place: :\n" diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c index eadd96ef772f..18775ef182f8 100644 --- a/kernel/trace/trace_kprobe.c +++ b/kernel/trace/trace_kprobe.c @@ -680,10 +680,6 @@ static int create_trace_kprobe(int argc, char **argv) return -EINVAL; } if (isdigit(argv[1][0])) { - if (is_return) { - pr_info("Return probe point must be a symbol.\n"); - return -EINVAL; - } /* an address specified */ ret = kstrtoul(&argv[1][0], 0, (unsigned long *)&addr); if (ret) { @@ -699,10 +695,6 @@ static int create_trace_kprobe(int argc, char **argv) pr_info("Failed to parse symbol.\n"); return ret; } - if (offset && is_return) { - pr_info("Return probe must be used without offset.\n"); - return -EINVAL; - } } argc -= 2; argv += 2; -- cgit v1.2.3 From d0e02579c282ccf34c79818045ec2d2934b56c19 Mon Sep 17 00:00:00 2001 From: "Steven Rostedt (VMware)" Date: Mon, 27 Feb 2017 11:52:04 -0500 Subject: trace/kprobes: Add back warning about offset in return probes Let's not remove the warning about offsets and return probes when the offset is invalid. Signed-off-by: Steven Rostedt Acked-by: Masami Hiramatsu Acked-by: Naveen N. Rao Cc: Ananth N Mavinakayanahalli Cc: Michael Ellerman Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/20170227115204.00f92846@gandalf.local.home Signed-off-by: Arnaldo Carvalho de Melo --- kernel/trace/trace_kprobe.c | 5 +++++ 1 file changed, 5 insertions(+) (limited to 'kernel') diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c index 18775ef182f8..2b7d0dd938ba 100644 --- a/kernel/trace/trace_kprobe.c +++ b/kernel/trace/trace_kprobe.c @@ -695,6 +695,11 @@ static int create_trace_kprobe(int argc, char **argv) pr_info("Failed to parse symbol.\n"); return ret; } + if (offset && is_return && + !arch_function_offset_within_entry(offset)) { + pr_info("Given offset is not valid for return probe.\n"); + return -EINVAL; + } } argc -= 2; argv += 2; -- cgit v1.2.3 From e422267322cd319e2695a535e47c5b1feeac45eb Mon Sep 17 00:00:00 2001 From: Hari Bathini Date: Wed, 8 Mar 2017 02:11:36 +0530 Subject: perf: Add PERF_RECORD_NAMESPACES to include namespaces related info With the advert of container technologies like docker, that depend on namespaces for isolation, there is a need for tracing support for namespaces. This patch introduces new PERF_RECORD_NAMESPACES event for recording namespaces related info. By recording info for every namespace, it is left to userspace to take a call on the definition of a container and trace containers by updating perf tool accordingly. Each namespace has a combination of device and inode numbers. Though every namespace has the same device number currently, that may change in future to avoid the need for a namespace of namespaces. Considering such possibility, record both device and inode numbers separately for each namespace. Signed-off-by: Hari Bathini Acked-by: Jiri Olsa Acked-by: Peter Zijlstra Cc: Alexander Shishkin Cc: Alexei Starovoitov Cc: Ananth N Mavinakayanahalli Cc: Aravinda Prasad Cc: Brendan Gregg Cc: Daniel Borkmann Cc: Eric Biederman Cc: Sargun Dhillon Cc: Steven Rostedt Link: http://lkml.kernel.org/r/148891929686.25309.2827618988917007768.stgit@hbathini.in.ibm.com Signed-off-by: Arnaldo Carvalho de Melo --- kernel/events/core.c | 139 +++++++++++++++++++++++++++++++++++++++++++++++++++ kernel/fork.c | 2 + kernel/nsproxy.c | 3 ++ 3 files changed, 144 insertions(+) (limited to 'kernel') diff --git a/kernel/events/core.c b/kernel/events/core.c index 6f41548f2e32..16c877a121c8 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -48,6 +48,8 @@ #include #include #include +#include +#include #include "internal.h" @@ -379,6 +381,7 @@ static DEFINE_PER_CPU(struct pmu_event_list, pmu_sb_events); static atomic_t nr_mmap_events __read_mostly; static atomic_t nr_comm_events __read_mostly; +static atomic_t nr_namespaces_events __read_mostly; static atomic_t nr_task_events __read_mostly; static atomic_t nr_freq_events __read_mostly; static atomic_t nr_switch_events __read_mostly; @@ -3991,6 +3994,8 @@ static void unaccount_event(struct perf_event *event) atomic_dec(&nr_mmap_events); if (event->attr.comm) atomic_dec(&nr_comm_events); + if (event->attr.namespaces) + atomic_dec(&nr_namespaces_events); if (event->attr.task) atomic_dec(&nr_task_events); if (event->attr.freq) @@ -6491,6 +6496,7 @@ static void perf_event_task(struct task_struct *task, void perf_event_fork(struct task_struct *task) { perf_event_task(task, NULL, 1); + perf_event_namespaces(task); } /* @@ -6592,6 +6598,132 @@ void perf_event_comm(struct task_struct *task, bool exec) perf_event_comm_event(&comm_event); } +/* + * namespaces tracking + */ + +struct perf_namespaces_event { + struct task_struct *task; + + struct { + struct perf_event_header header; + + u32 pid; + u32 tid; + u64 nr_namespaces; + struct perf_ns_link_info link_info[NR_NAMESPACES]; + } event_id; +}; + +static int perf_event_namespaces_match(struct perf_event *event) +{ + return event->attr.namespaces; +} + +static void perf_event_namespaces_output(struct perf_event *event, + void *data) +{ + struct perf_namespaces_event *namespaces_event = data; + struct perf_output_handle handle; + struct perf_sample_data sample; + int ret; + + if (!perf_event_namespaces_match(event)) + return; + + perf_event_header__init_id(&namespaces_event->event_id.header, + &sample, event); + ret = perf_output_begin(&handle, event, + namespaces_event->event_id.header.size); + if (ret) + return; + + namespaces_event->event_id.pid = perf_event_pid(event, + namespaces_event->task); + namespaces_event->event_id.tid = perf_event_tid(event, + namespaces_event->task); + + perf_output_put(&handle, namespaces_event->event_id); + + perf_event__output_id_sample(event, &handle, &sample); + + perf_output_end(&handle); +} + +static void perf_fill_ns_link_info(struct perf_ns_link_info *ns_link_info, + struct task_struct *task, + const struct proc_ns_operations *ns_ops) +{ + struct path ns_path; + struct inode *ns_inode; + void *error; + + error = ns_get_path(&ns_path, task, ns_ops); + if (!error) { + ns_inode = ns_path.dentry->d_inode; + ns_link_info->dev = new_encode_dev(ns_inode->i_sb->s_dev); + ns_link_info->ino = ns_inode->i_ino; + } +} + +void perf_event_namespaces(struct task_struct *task) +{ + struct perf_namespaces_event namespaces_event; + struct perf_ns_link_info *ns_link_info; + + if (!atomic_read(&nr_namespaces_events)) + return; + + namespaces_event = (struct perf_namespaces_event){ + .task = task, + .event_id = { + .header = { + .type = PERF_RECORD_NAMESPACES, + .misc = 0, + .size = sizeof(namespaces_event.event_id), + }, + /* .pid */ + /* .tid */ + .nr_namespaces = NR_NAMESPACES, + /* .link_info[NR_NAMESPACES] */ + }, + }; + + ns_link_info = namespaces_event.event_id.link_info; + + perf_fill_ns_link_info(&ns_link_info[MNT_NS_INDEX], + task, &mntns_operations); + +#ifdef CONFIG_USER_NS + perf_fill_ns_link_info(&ns_link_info[USER_NS_INDEX], + task, &userns_operations); +#endif +#ifdef CONFIG_NET_NS + perf_fill_ns_link_info(&ns_link_info[NET_NS_INDEX], + task, &netns_operations); +#endif +#ifdef CONFIG_UTS_NS + perf_fill_ns_link_info(&ns_link_info[UTS_NS_INDEX], + task, &utsns_operations); +#endif +#ifdef CONFIG_IPC_NS + perf_fill_ns_link_info(&ns_link_info[IPC_NS_INDEX], + task, &ipcns_operations); +#endif +#ifdef CONFIG_PID_NS + perf_fill_ns_link_info(&ns_link_info[PID_NS_INDEX], + task, &pidns_operations); +#endif +#ifdef CONFIG_CGROUPS + perf_fill_ns_link_info(&ns_link_info[CGROUP_NS_INDEX], + task, &cgroupns_operations); +#endif + + perf_iterate_sb(perf_event_namespaces_output, + &namespaces_event, + NULL); +} + /* * mmap tracking */ @@ -9146,6 +9278,8 @@ static void account_event(struct perf_event *event) atomic_inc(&nr_mmap_events); if (event->attr.comm) atomic_inc(&nr_comm_events); + if (event->attr.namespaces) + atomic_inc(&nr_namespaces_events); if (event->attr.task) atomic_inc(&nr_task_events); if (event->attr.freq) @@ -9691,6 +9825,11 @@ SYSCALL_DEFINE5(perf_event_open, return -EACCES; } + if (attr.namespaces) { + if (!capable(CAP_SYS_ADMIN)) + return -EACCES; + } + if (attr.freq) { if (attr.sample_freq > sysctl_perf_event_sample_rate) return -EINVAL; diff --git a/kernel/fork.c b/kernel/fork.c index 6c463c80e93d..afa2947286cd 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -2352,6 +2352,8 @@ SYSCALL_DEFINE1(unshare, unsigned long, unshare_flags) } } + perf_event_namespaces(current); + bad_unshare_cleanup_cred: if (new_cred) put_cred(new_cred); diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c index 782102e59eed..f6c5d330059a 100644 --- a/kernel/nsproxy.c +++ b/kernel/nsproxy.c @@ -26,6 +26,7 @@ #include #include #include +#include static struct kmem_cache *nsproxy_cachep; @@ -262,6 +263,8 @@ SYSCALL_DEFINE2(setns, int, fd, int, nstype) goto out; } switch_task_namespaces(tsk, new_nsproxy); + + perf_event_namespaces(tsk); out: fput(file); return err; -- cgit v1.2.3 From 5f6bee34707973ea7879a7857fd63ddccc92fff3 Mon Sep 17 00:00:00 2001 From: "Naveen N. Rao" Date: Wed, 8 Mar 2017 22:34:15 +0530 Subject: kprobes: Convert kprobe_exceptions_notify to use NOKPROBE_SYMBOL commit fc62d0207ae0 ("kprobes: Introduce weak variant of kprobe_exceptions_notify()") used the __kprobes annotation to exclude kprobe_exceptions_notify from being probed. Since NOKPROBE_SYMBOL() is a better way to do this enabling the symbol to be discovered as being blacklisted, change over to using NOKPROBE_SYMBOL(). Signed-off-by: Naveen N. Rao Acked-by: Masami Hiramatsu Cc: Ananth N Mavinakayanahalli Link: http://lkml.kernel.org/r/3f25bf400da5c222cd9b10eec6ded2d6b58209f8.1488991670.git.naveen.n.rao@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo --- kernel/kprobes.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) (limited to 'kernel') diff --git a/kernel/kprobes.c b/kernel/kprobes.c index 448759d4a263..4780ec236035 100644 --- a/kernel/kprobes.c +++ b/kernel/kprobes.c @@ -1740,11 +1740,12 @@ void unregister_kprobes(struct kprobe **kps, int num) } EXPORT_SYMBOL_GPL(unregister_kprobes); -int __weak __kprobes kprobe_exceptions_notify(struct notifier_block *self, - unsigned long val, void *data) +int __weak kprobe_exceptions_notify(struct notifier_block *self, + unsigned long val, void *data) { return NOTIFY_DONE; } +NOKPROBE_SYMBOL(kprobe_exceptions_notify); static struct notifier_block kprobe_exceptions_nb = { .notifier_call = kprobe_exceptions_notify, -- cgit v1.2.3 From 1d585e70905e03e8c19c9aaf523ec246ae6b18a1 Mon Sep 17 00:00:00 2001 From: "Naveen N. Rao" Date: Wed, 8 Mar 2017 13:56:06 +0530 Subject: trace/kprobes: Fix check for kretprobe offset within function entry perf specifies an offset from _text and since this offset is fed directly into the arch-specific helper, kprobes tracer rejects installation of kretprobes through perf. Fix this by looking up the actual offset from a function for the specified sym+offset. Refactor and reuse existing routines to limit code duplication -- we repurpose kprobe_addr() for determining final kprobe address and we split out the function entry offset determination into a separate generic helper. Before patch: naveen@ubuntu:~/linux/tools/perf$ sudo ./perf probe -v do_open%return probe-definition(0): do_open%return symbol:do_open file:(null) line:0 offset:0 return:1 lazy:(null) 0 arguments Looking at the vmlinux_path (8 entries long) Using /boot/vmlinux for symbols Open Debuginfo file: /boot/vmlinux Try to find probe point from debuginfo. Matched function: do_open [2d0c7ff] Probe point found: do_open+0 Matched function: do_open [35d76dc] found inline addr: 0xc0000000004ba9c4 Failed to find "do_open%return", because do_open is an inlined function and has no return point. An error occurred in debuginfo analysis (-22). Trying to use symbols. Opening /sys/kernel/debug/tracing//README write=0 Opening /sys/kernel/debug/tracing//kprobe_events write=1 Writing event: r:probe/do_open _text+4469776 Failed to write event: Invalid argument Error: Failed to add events. Reason: Invalid argument (Code: -22) naveen@ubuntu:~/linux/tools/perf$ dmesg | tail [ 33.568656] Given offset is not valid for return probe. After patch: naveen@ubuntu:~/linux/tools/perf$ sudo ./perf probe -v do_open%return probe-definition(0): do_open%return symbol:do_open file:(null) line:0 offset:0 return:1 lazy:(null) 0 arguments Looking at the vmlinux_path (8 entries long) Using /boot/vmlinux for symbols Open Debuginfo file: /boot/vmlinux Try to find probe point from debuginfo. Matched function: do_open [2d0c7d6] Probe point found: do_open+0 Matched function: do_open [35d76b3] found inline addr: 0xc0000000004ba9e4 Failed to find "do_open%return", because do_open is an inlined function and has no return point. An error occurred in debuginfo analysis (-22). Trying to use symbols. Opening /sys/kernel/debug/tracing//README write=0 Opening /sys/kernel/debug/tracing//kprobe_events write=1 Writing event: r:probe/do_open _text+4469808 Writing event: r:probe/do_open_1 _text+4956344 Added new events: probe:do_open (on do_open%return) probe:do_open_1 (on do_open%return) You can now use it in all perf tools, such as: perf record -e probe:do_open_1 -aR sleep 1 naveen@ubuntu:~/linux/tools/perf$ sudo cat /sys/kernel/debug/kprobes/list c000000000041370 k kretprobe_trampoline+0x0 [OPTIMIZED] c0000000004ba0b8 r do_open+0x8 [DISABLED] c000000000443430 r do_open+0x0 [DISABLED] Signed-off-by: Naveen N. Rao Acked-by: Masami Hiramatsu Cc: Ananth N Mavinakayanahalli Cc: Michael Ellerman Cc: Steven Rostedt Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/d8cd1ef420ec22e3643ac332fdabcffc77319a42.1488961018.git.naveen.n.rao@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo --- kernel/kprobes.c | 40 ++++++++++++++++++++++++++-------------- kernel/trace/trace_kprobe.c | 2 +- 2 files changed, 27 insertions(+), 15 deletions(-) (limited to 'kernel') diff --git a/kernel/kprobes.c b/kernel/kprobes.c index 4780ec236035..d733479a10ee 100644 --- a/kernel/kprobes.c +++ b/kernel/kprobes.c @@ -1391,21 +1391,19 @@ bool within_kprobe_blacklist(unsigned long addr) * This returns encoded errors if it fails to look up symbol or invalid * combination of parameters. */ -static kprobe_opcode_t *kprobe_addr(struct kprobe *p) +static kprobe_opcode_t *_kprobe_addr(kprobe_opcode_t *addr, + const char *symbol_name, unsigned int offset) { - kprobe_opcode_t *addr = p->addr; - - if ((p->symbol_name && p->addr) || - (!p->symbol_name && !p->addr)) + if ((symbol_name && addr) || (!symbol_name && !addr)) goto invalid; - if (p->symbol_name) { - kprobe_lookup_name(p->symbol_name, addr); + if (symbol_name) { + kprobe_lookup_name(symbol_name, addr); if (!addr) return ERR_PTR(-ENOENT); } - addr = (kprobe_opcode_t *)(((char *)addr) + p->offset); + addr = (kprobe_opcode_t *)(((char *)addr) + offset); if (addr) return addr; @@ -1413,6 +1411,11 @@ invalid: return ERR_PTR(-EINVAL); } +static kprobe_opcode_t *kprobe_addr(struct kprobe *p) +{ + return _kprobe_addr(p->addr, p->symbol_name, p->offset); +} + /* Check passed kprobe is valid and return kprobe in kprobe_table. */ static struct kprobe *__get_valid_kprobe(struct kprobe *p) { @@ -1881,19 +1884,28 @@ bool __weak arch_function_offset_within_entry(unsigned long offset) return !offset; } +bool function_offset_within_entry(kprobe_opcode_t *addr, const char *sym, unsigned long offset) +{ + kprobe_opcode_t *kp_addr = _kprobe_addr(addr, sym, offset); + + if (IS_ERR(kp_addr)) + return false; + + if (!kallsyms_lookup_size_offset((unsigned long)kp_addr, NULL, &offset) || + !arch_function_offset_within_entry(offset)) + return false; + + return true; +} + int register_kretprobe(struct kretprobe *rp) { int ret = 0; struct kretprobe_instance *inst; int i; void *addr; - unsigned long offset; - - addr = kprobe_addr(&rp->kp); - if (!kallsyms_lookup_size_offset((unsigned long)addr, NULL, &offset)) - return -EINVAL; - if (!arch_function_offset_within_entry(offset)) + if (!function_offset_within_entry(rp->kp.addr, rp->kp.symbol_name, rp->kp.offset)) return -EINVAL; if (kretprobe_blacklist_size) { diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c index 12fb540da0e5..013f4e7146d4 100644 --- a/kernel/trace/trace_kprobe.c +++ b/kernel/trace/trace_kprobe.c @@ -697,7 +697,7 @@ static int create_trace_kprobe(int argc, char **argv) return ret; } if (offset && is_return && - !arch_function_offset_within_entry(offset)) { + !function_offset_within_entry(NULL, symbol, offset)) { pr_info("Given offset is not valid for return probe.\n"); return -EINVAL; } -- cgit v1.2.3 From f4c0b0aa58d9b7e30ab0a95e33da84d53b3d764a Mon Sep 17 00:00:00 2001 From: Will Deacon Date: Mon, 20 Feb 2017 15:33:50 +0200 Subject: perf/core: Keep AUX flags in the output handle In preparation for adding more flags to perf AUX records, introduce a separate API for setting the flags for a session, rather than appending more bool arguments to perf_aux_output_end. This allows to set each flag at the time a corresponding condition is detected, instead of tracking it in each driver's private state. Signed-off-by: Will Deacon Signed-off-by: Alexander Shishkin Signed-off-by: Peter Zijlstra (Intel) Cc: Arnaldo Carvalho de Melo Cc: Arnaldo Carvalho de Melo Cc: Jiri Olsa Cc: Linus Torvalds Cc: Mathieu Poirier Cc: Peter Zijlstra Cc: Stephane Eranian Cc: Thomas Gleixner Cc: Vince Weaver Cc: vince@deater.net Link: http://lkml.kernel.org/r/20170220133352.17995-3-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar --- kernel/events/ring_buffer.c | 34 +++++++++++++++++++++++----------- 1 file changed, 23 insertions(+), 11 deletions(-) (limited to 'kernel') diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c index 257fa460b846..9654e55c38d6 100644 --- a/kernel/events/ring_buffer.c +++ b/kernel/events/ring_buffer.c @@ -297,6 +297,19 @@ ring_buffer_init(struct ring_buffer *rb, long watermark, int flags) rb->paused = 1; } +void perf_aux_output_flag(struct perf_output_handle *handle, u64 flags) +{ + /* + * OVERWRITE is determined by perf_aux_output_end() and can't + * be passed in directly. + */ + if (WARN_ON_ONCE(flags & PERF_AUX_FLAG_OVERWRITE)) + return; + + handle->aux_flags |= flags; +} +EXPORT_SYMBOL_GPL(perf_aux_output_flag); + /* * This is called before hardware starts writing to the AUX area to * obtain an output handle and make sure there's room in the buffer. @@ -360,6 +373,7 @@ void *perf_aux_output_begin(struct perf_output_handle *handle, handle->event = event; handle->head = aux_head; handle->size = 0; + handle->aux_flags = 0; /* * In overwrite mode, AUX data stores do not depend on aux_tail, @@ -408,34 +422,32 @@ err: * of the AUX buffer management code is that after pmu::stop(), the AUX * transaction must be stopped and therefore drop the AUX reference count. */ -void perf_aux_output_end(struct perf_output_handle *handle, unsigned long size, - bool truncated) +void perf_aux_output_end(struct perf_output_handle *handle, unsigned long size) { struct ring_buffer *rb = handle->rb; - bool wakeup = truncated; + bool wakeup = !!handle->aux_flags; unsigned long aux_head; - u64 flags = 0; - - if (truncated) - flags |= PERF_AUX_FLAG_TRUNCATED; /* in overwrite mode, driver provides aux_head via handle */ if (rb->aux_overwrite) { - flags |= PERF_AUX_FLAG_OVERWRITE; + handle->aux_flags |= PERF_AUX_FLAG_OVERWRITE; aux_head = handle->head; local_set(&rb->aux_head, aux_head); } else { + handle->aux_flags &= ~PERF_AUX_FLAG_OVERWRITE; + aux_head = local_read(&rb->aux_head); local_add(size, &rb->aux_head); } - if (size || flags) { + if (size || handle->aux_flags) { /* * Only send RECORD_AUX if we have something useful to communicate */ - perf_event_aux_event(handle->event, aux_head, size, flags); + perf_event_aux_event(handle->event, aux_head, size, + handle->aux_flags); } aux_head = rb->user_page->aux_head = local_read(&rb->aux_head); @@ -446,7 +458,7 @@ void perf_aux_output_end(struct perf_output_handle *handle, unsigned long size, } if (wakeup) { - if (truncated) + if (handle->aux_flags & PERF_AUX_FLAG_TRUNCATED) handle->event->pending_disable = 1; perf_output_wakeup(handle); } -- cgit v1.2.3 From ae0c2d995d648d5165545d5e05e2869642009b38 Mon Sep 17 00:00:00 2001 From: Alexander Shishkin Date: Mon, 20 Feb 2017 15:33:51 +0200 Subject: perf/core: Add a flag for partial AUX records The Intel PT driver needs to be able to communicate partial AUX transactions, that is, transactions with gaps in data for reasons other than no room left in the buffer (i.e. truncated transactions). Therefore, this condition does not imply a wakeup for the consumer. To this end, add a new "partial" AUX flag. Signed-off-by: Alexander Shishkin Signed-off-by: Peter Zijlstra (Intel) Cc: Arnaldo Carvalho de Melo Cc: Arnaldo Carvalho de Melo Cc: Jiri Olsa Cc: Linus Torvalds Cc: Mathieu Poirier Cc: Peter Zijlstra Cc: Stephane Eranian Cc: Thomas Gleixner Cc: Vince Weaver Cc: vince@deater.net Link: http://lkml.kernel.org/r/20170220133352.17995-4-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar --- kernel/events/ring_buffer.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'kernel') diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c index 9654e55c38d6..2831480c63a2 100644 --- a/kernel/events/ring_buffer.c +++ b/kernel/events/ring_buffer.c @@ -424,8 +424,8 @@ err: */ void perf_aux_output_end(struct perf_output_handle *handle, unsigned long size) { + bool wakeup = !!(handle->aux_flags & PERF_AUX_FLAG_TRUNCATED); struct ring_buffer *rb = handle->rb; - bool wakeup = !!handle->aux_flags; unsigned long aux_head; /* in overwrite mode, driver provides aux_head via handle */ -- cgit v1.2.3 From 61f35d750683b21e9e3836e309195c79c1daed74 Mon Sep 17 00:00:00 2001 From: Arnaldo Carvalho de Melo Date: Thu, 16 Mar 2017 12:42:02 -0300 Subject: uprobes: Default UPROBES_EVENTS to Y As it is already turned on by most distros, so just flip the default to Y. Suggested-by: Alexei Starovoitov Acked-by: David Ahern Acked-by: Ingo Molnar Acked-by: Masami Hiramatsu Acked-by: Namhyung Kim Acked-by: Srikar Dronamraju Acked-by: Wang Nan Cc: Alexander Shishkin Cc: Anton Blanchard Cc: David Miller Cc: Hemant Kumar Cc: Jiri Olsa Cc: Peter Zijlstra Cc: Steven Rostedt Link: http://lkml.kernel.org/r/20170316005817.GA6805@ast-mbp.thefacebook.com Signed-off-by: Arnaldo Carvalho de Melo --- kernel/trace/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'kernel') diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig index d4a06e714645..9619b5768e4b 100644 --- a/kernel/trace/Kconfig +++ b/kernel/trace/Kconfig @@ -455,7 +455,7 @@ config UPROBE_EVENTS select UPROBES select PROBE_EVENTS select TRACING - default n + default y help This allows the user to add tracing events on top of userspace dynamic events (similar to tracepoints) on the fly via the trace -- cgit v1.2.3