ebpf-challenge/README.md


2020-01-14 15:32:06 +01:00
# Overview
The goal of this exercise is to construct a Go program that populates an eBPF
map with the address in memory of the `_PyRuntime` global variable for all live
Python 3.7 processes. This variable is typically found within `libpython`, which
is dynamically loaded. The target system can be assumed to be a 64-bit Linux OS
running a 5.x kernel.
As an example, say there are two live processes that have mapped
`libpython3.7m.so.1.0` (on your OS `libpython` may have a different name) into
their address spaces, with process IDs 100 and 200. In process 100 the library
is mapped at address 0x10000, and in process 200 the library is mapped at
address 0x20000. Furthermore, assume that `_PyRuntime` is at offset 0x256
within `libpython3.7m.so.1.0`.
Your solution should populate an eBPF map with the following two entries: (100
-> 0x10256), (200 -> 0x20256).
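
The arithmetic behind those two entries can be sketched as follows. This is only an illustration: the offset is taken to be hexadecimal 0x256, which is what the expected entries imply, and `runtimeAddr` is a hypothetical helper name, not part of the skeleton.

```go
package main

import "fmt"

// runtimeAddr returns the absolute address of a symbol, given the address at
// which its library is mapped and the symbol's offset within that library.
func runtimeAddr(base, offset uint64) uint64 {
	return base + offset
}

func main() {
	const pyRuntimeOffset = 0x256 // offset of _PyRuntime in the example

	// Library base address per process ID, taken from the example.
	bases := map[uint32]uint64{100: 0x10000, 200: 0x20000}

	for _, pid := range []uint32{100, 200} {
		fmt.Printf("%d -> %#x\n", pid, runtimeAddr(bases[pid], pyRuntimeOffset))
	}
	// Prints:
	// 100 -> 0x10256
	// 200 -> 0x20256
}
```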
## Your Solution
Treat your solution as code you would have to live with for a while, rather
than as a throw-away proof of concept. Ideally it should be modular, with a
reasonable breakdown of functionality into packages and functions, documented
and formatted according to the Go standards, and accompanied by tests where they
are warranted. Full test coverage isn't necessary; use your judgement as to what
needs a test and what does not.
When you have completed your solution, create a pull request for it to the master
branch.
## Task 1
Enumerate all live processes that contain a Python 3.7 interpreter. Note that
this includes both Python REPL processes that have Python as their main binary,
and other processes that have `libpython.*.so` loaded in their address space.
## Task 2
For all processes containing a Python interpreter, determine the address of the
`_PyRuntime` global variable within each process. In Python 3.7 this variable is
in the dynamic symbols of the Python library and the symbol information will be
available regardless of whether or not debug symbols are available. Note that
due to address space layout randomization the actual address of the variable
within each process will differ.
## Task 3
For all live processes containing a Python interpreter, add a mapping from
process ID to the address of `_PyRuntime` to an eBPF map. The skeleton solution
provides an `ebpf` package with a function for creating a map, as well as a
function for reading a map. You need to add a function for writing to the map.
## Getting Started
You will need the following in order to complete the exercise:
1. A functioning Go installation. See https://golang.org for information on how
to get started with this if you have not already done so.
2. A functioning Python 3.7 install. Run `python3 --version` to see what the
default installed version of Python is on your system. It is also necessary
for your Python install to be dynamically linked against `libpython`. To
check if this is the case run `ldd /path/to/python3.7` and check for a
`libpython` entry in the output. If there is no such entry the easiest
solution is to download the Python 3.7 source code from
https://www.python.org/downloads/release/python-376/ and compile/install it
via ``./configure --prefix=`pwd`/install --enable-shared && make && make
install``. Once that completes you will have a Python interpreter at
`./install/bin/python3.7` that you can use.
3. A running Linux kernel with eBPF enabled. It is enabled by default on most
modern kernel configurations. To check if this is the case for yours, search
the config of your running kernel for `CONFIG_BPF=y`. On Fedora this can be
done via `cat /boot/config-$(uname -r)| grep -i "CONFIG_BPF=y"`.
Clone this repository into `$GOPATH/src/github.com/optimyze-interviews`. A
skeleton Go program is provided in the `GetRuntimeAddresses` sub-directory. From
within `GetRuntimeAddresses` you can build the program via `go build`. This will
create a binary called `./GetRuntimeAddresses`. In its current state it will
simply create an eBPF map, attempt to print the contents of this map, and then
exit. The output would look something like the following:
```
$ ./GetRuntimeAddresses
Created eBPF map (FD: 3)
Printing contents of map 3
```
Finally, a Python script called `runforever.py` is also provided. When run via
`python3.7 ./runforever.py` this script simply prints the ID of the Python process and
enters an infinite loop. You can run a few of these simultaneously to test your
solution, if you wish.
# Supporting Material
## Golang
* https://golang.org/
* https://golang.org/doc/
* https://tour.golang.org/welcome/1
## ELF Symbols
* https://www.intezer.com/executable-linkable-format-101-part-2-symbols/
## eBPF
* https://lwn.net/Articles/740157/
* http://man7.org/linux/man-pages/man2/bpf.2.html