A coding challenge I worked out.

Overview

The goal of this exercise is to construct a Go program that populates an eBPF map with the address in memory of the _PyRuntime global variable for all live Python 3.7 processes. This variable is typically found within libpython, which is dynamically loaded. The target system can be assumed to be a 64-bit Linux OS running a 5.x kernel.

As an example, say there are two live processes, with process IDs 100 and 200, that have mapped libpython3.7m.so.1.0 into their address spaces (on your OS libpython may have a different name). In process 100 the library is mapped at address 0x10000, and in process 200 it is mapped at address 0x20000. Furthermore, assume that _PyRuntime is at offset 0x256 within libpython3.7m.so.1.0.

Your solution should populate an eBPF map with the following two entries: (100 -> 0x10256), (200 -> 0x20256).
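The arithmetic behind these two entries can be sketched as follows. This is a minimal illustration only: it reads the offset as hexadecimal 0x256, which is what the expected entries imply, and the function name runtimeAddress is a hypothetical helper, not part of the skeleton.

```go
package main

import "fmt"

// runtimeAddress returns the absolute address of a symbol, given the address
// at which its library is mapped and the symbol's offset within the library.
// (Hypothetical helper for illustration; not part of the skeleton.)
func runtimeAddress(base, offset uint64) uint64 {
	return base + offset
}

func main() {
	// Offset of _PyRuntime within libpython, taken from the example above
	// and read as a hexadecimal value.
	const offset = 0x256

	// Library base addresses from the example, keyed by process ID.
	bases := map[int]uint64{100: 0x10000, 200: 0x20000}

	for _, pid := range []int{100, 200} {
		fmt.Printf("%d -> %#x\n", pid, runtimeAddress(bases[pid], offset))
	}
}
```

Running this prints the two entries given above, 100 -> 0x10256 and 200 -> 0x20256.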

Your Solution

Treat your solution as code you would have to live with for a while, rather than as a throw-away proof of concept. Ideally it should be modular, with a reasonable breakdown of functionality into packages and functions; documented and formatted as per the Go standards; and accompanied by whatever tests are necessary. Full test coverage is not required: use your judgement as to what needs a test and what does not.

When you have completed your solution, create a pull request for it to the master branch.

Task 1

Enumerate all live processes that contain a Python 3.7 interpreter. Note that this includes both Python REPL processes that have Python as their main binary, and other processes that have libpython.*.so loaded in their address space.

Task 2

For all processes containing a Python interpreter determine the address of the _PyRuntime global variable within each process. In Python 3.7 this variable is in the dynamic symbols of the Python library and the symbol information will be available regardless of whether or not debug symbols are available. Note that due to address space layout randomization the actual address of the variable within each process will differ.
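As a rough sketch of the symbol lookup, the standard library's debug/elf package can read the dynamic symbol table of the mapped file. The function name symbolValue is hypothetical, and the sketch assumes you already know the path of the library (or main binary) on disk.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// symbolValue returns the value of the named dynamic symbol in the ELF file
// at path. (Hypothetical helper for illustration; not part of the skeleton.)
func symbolValue(path, name string) (uint64, error) {
	f, err := elf.Open(path)
	if err != nil {
		return 0, err
	}
	defer f.Close()

	// DynamicSymbols reads the dynamic symbol table, which is present even
	// when debug symbols have been stripped.
	syms, err := f.DynamicSymbols()
	if err != nil {
		return 0, err
	}
	for _, s := range syms {
		if s.Name == name {
			return s.Value, nil
		}
	}
	return 0, fmt.Errorf("symbol %q not found in %s", name, path)
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: symaddr /path/to/libpython")
		os.Exit(1)
	}
	v, err := symbolValue(os.Args[1], "_PyRuntime")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("_PyRuntime symbol value: %#x\n", v)
}
```

For a shared object the symbol value is typically relative to the address at which the library is loaded, so the per-process address is the load base plus this value. A robust solution would also check the virtual addresses of the file's loadable segments rather than assume they start at zero.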

Task 3

For all live processes containing a Python interpreter, add a mapping from process ID to the address of _PyRuntime to an eBPF map. The skeleton solution provides an ebpf package with a function for creating a map, as well as a function for reading a map. You need to add a function for writing to the map.

Getting Started

You will need the following in order to complete the exercise:

  1. A functioning Go installation. See https://golang.org for information on how to get started with this if you have not already done so.

  2. A functioning Python 3.7 install. Run python3 --version to see what the default installed version of Python is on your system. It is also necessary for your Python install to be dynamically linked against libpython. To check if this is the case run ldd /path/to/python3.7 and check for a libpython entry in the output. If there is no such entry the easiest solution is to download the Python 3.7 source code from https://www.python.org/downloads/release/python-376/ and compile/install it via ./configure --prefix=`pwd`/install --enable-shared && make && make install. Once that completes you will have a Python interpreter at ./install/bin/python3.7 that you can use.

  3. A running Linux kernel with eBPF enabled. It is enabled by default on most modern kernel configurations. To check if this is the case for yours, search the config of your running kernel for CONFIG_BPF=y. On Fedora this can be done via cat /boot/config-$(uname -r)| grep -i "CONFIG_BPF=y".

Clone this repository into $GOPATH/src/github.com/optimyze-interviews. A skeleton Go program is provided in the GetRuntimeAddresses sub-directory. From within GetRuntimeAddresses you can build the program via go build. This will create a binary called ./GetRuntimeAddresses. In its current state it will simply create an eBPF map, attempt to print the contents of this map, and then exit. The output would look something like the following:

$ ./GetRuntimeAddresses 
Created eBPF map (FD: 3)
Printing contents of map 3

Finally, a Python script called runforever.py is also provided. When run via python3.7 ./runforever.py, this script simply prints the ID of the Python process and enters an infinite loop. You can run a few of these simultaneously to test your solution, if you wish.

Supporting Material

Golang

ELF Symbols

eBPF