If I were to select the worst Linux syscall
My selection would be select
Select, poll and epoll are all Linux syscalls that fulfill a similar purpose: they provide an efficient way of doing asynchronous I/O.
In other words, they wait for some event to happen on a file descriptor. Usually that file descriptor represents a network socket and we are waiting for data to be delivered via TCP or UDP from another machine.
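As a minimal sketch of what "waiting for an event on a file descriptor" looks like, the snippet below uses Python's `select` module, which is a thin wrapper over the syscall of the same name. A local `socket.socketpair()` stands in for a real network peer, and the `wait_readable` helper is a name made up for this illustration.

```python
import select
import socket

def wait_readable(sock, timeout):
    """Block until sock has data to read or timeout (in seconds) expires."""
    readable, _, _ = select.select([sock], [], [], timeout)
    return sock in readable

# Assumption: a connected pair of local sockets stands in for a remote machine.
left, right = socket.socketpair()

before = wait_readable(right, 0.1)   # nothing has been sent yet
left.sendall(b"46")
after = wait_readable(right, 1.0)    # data is now waiting to be read
received = right.recv(16)

left.close()
right.close()
```

The first call returns only once the timeout expires, while the second returns as soon as the kernel reports that data is available.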
In an era where so much of what we do is over the network, these asyncio syscalls are of vital importance to most programmers.
Yet few developers have actually heard of any of them. Fewer still have actually used them directly. In spite of that, virtually everyone who uses Node, Ruby, Go or Python relies on them without knowing.
Most of us probably prefer to avoid working with syscalls directly most of the time. However, the library wrappers we use are usually trivial, so we can fall back on the native interfaces whenever we need to squeeze out a few more drops of performance or flexibility.
I’d wager almost all decent developers would be confident enough in their knowledge of Linux threads to use pthread.h without a second thought. I’d also wager close to 100% of those developers would get cold sweats if they had to quickly create a bug-free epoll-based application.
First, I think we should take a look at another creature in the Linux concurrency bestiary, the thread.
Let’s look at 3 examples of using threads. First, using the standard libraries of Python (a “high level” language) and Rust (a “low level” language), then using the native interface that Linux provides for its first class citizen, C.
Python
from threading import Thread
def work(magic_number):
    pass  # do some work
# start() returns None, so keep the Thread objects before starting them
t1 = Thread(target=work, args=(46,))
t2 = Thread(target=work, args=(3,))
t1.start()
t2.start()
t1.join()
t2.join()
Note: Whilst one may hear that Python threads are not “real threads” due to the GIL, they still use pthread in the implementation. They’d have no problem working like “real” threads in a non-GIL implementation of Python.
Rust
use std::thread;
fn work(magic_number: i32) { /* implementation */ }
fn main() {
    let t1 = thread::spawn(|| work(46));
    let t2 = thread::spawn(|| work(3));
    // join() returns a Result; unwrap() panics if the thread panicked
    t1.join().unwrap();
    t2.join().unwrap();
}
C (POSIX API)
#include <pthread.h>
void *work(void *arg) { /* implementation */ return NULL; }
int main() {
    pthread_t t1, t2;
    int magic_nr_1 = 46;
    int magic_nr_2 = 3;
    /* pass a pointer to each argument; both ints outlive the joins below */
    pthread_create(&t1, NULL, work, &magic_nr_1);
    pthread_create(&t2, NULL, work, &magic_nr_2);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}
Obviously, using the native pthread API is slightly more inconvenient, but the wrappers over pthread are still very similar to using it directly; they just simplify it a tiny bit. If you understand how to use threads from the standard library of any language, you’ll figure out how to use the native API.
Next, let’s look at the “lowest level” approach for asynchronous I/O in Python and Rust, then let’s look at how we’d do it using the native API.
In this example, we’ll look at communicating concurrently with 2 different addresses over TCP.
Python
import asyncio
async def send(addr, port, message):
    reader, writer = await asyncio.open_connection(addr, port)
    writer.write(message)  # message must be bytes
    await writer.drain()
    buffer_size = 254
    data = await reader.read(buffer_size)
    print('Got back the messages: {}'.format(data))
    writer.close()
async def main():
    # run both exchanges concurrently on one event loop
    await asyncio.gather(
        send('8.8.8.8', 1234, b'46'),
        send('1.1.1.1', 4321, b'3'),
    )
asyncio.run(main())
Rust
extern crate tokio;
extern crate futures;
use futures::future;
use tokio::{net::TcpStream, prelude::Future};
use std::net::SocketAddr;
fn send(addr: SocketAddr, message: String) {
    let task = TcpStream::connect(&addr)
        .and_then(|stream| tokio::io::write_all(stream, message))
        .map_err(|e| println!("Error: {}", e))
        .map(|_| ());
    tokio::spawn(task);
}
fn main() {
    tokio::run(future::lazy(|| {
        send("8.8.8.8:1234".parse().unwrap(), String::from("46"));
        send("1.1.1.1:4321".parse().unwrap(), String::from("3"));
        Ok(())
    }));
}
Here I should mention that I’ve cheated a bit, since futures and tokio aren’t part of the standard library of Rust. However, Rust doesn’t yet have an asyncio standard, and the discussion around implementing one is mainly about integrating tokio and futures into the standard library. So there’s a high chance the final version of an asyncio TCP client in Rust will be quite similar to the above code.
C (POSIX API)
And here comes the problem.
I’d love to show you an idiomatic way of using poll, epoll or select to complete two requests via TCP asynchronously, but I would probably get something wrong. Furthermore, the code would be long and very hard to skim through.
Here’s a very well-rounded TCP client that uses poll. Notice the issue? It spans 700 lines of code.
Granted, part of this is the fault of C and of the native networking functionality. But writing a simple TCP client isn’t that hard; it’s the part where we have to include asyncio that makes it impossible to get right.
I’d challenge any reader who thinks otherwise to find me an idiomatic example of using poll that is even close to the simplicity of the Rust and Python approaches.
How to fix async syscalls
If the C code above was a bit too hard for you to read, let me break the usage of the async syscalls down for you with some smaller snippets of code:
- Make a file descriptor non-blocking, as seen here.
- Loop through the non-blocking file descriptors and see if any events have happened on any of them, as seen here, though with only a single fd.
- If an event has happened, handle it, then carry on looping. If nothing has happened, carry on.
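The three steps above can be sketched roughly as follows. This uses Python's `select.poll`, a thin wrapper over the poll syscall, rather than C, and local socket pairs instead of real TCP connections. The `poll_loop` helper, its parameters and the bounded number of rounds are assumptions made for this illustration, not part of any real API.

```python
import select
import socket

def poll_loop(socks, expected, timeout_ms=1000, max_rounds=10):
    """Collect one message per socket using non-blocking fds and poll()."""
    poller = select.poll()
    by_fd = {}
    for s in socks:
        s.setblocking(False)                        # step 1: non-blocking fd
        poller.register(s.fileno(), select.POLLIN)  # we only care about reads
        by_fd[s.fileno()] = s

    received = {}
    for _ in range(max_rounds):                     # bounded so the sketch always ends
        if len(received) >= expected:
            break
        # step 2: ask which fds have events; an empty list means nothing happened
        for fd, revents in poller.poll(timeout_ms):
            if revents & select.POLLIN:             # step 3: handle it, keep looping
                received[fd] = by_fd[fd].recv(254)
    return received

# Assumption: two local socket pairs stand in for two remote TCP peers.
a1, b1 = socket.socketpair()
a2, b2 = socket.socketpair()
a1.sendall(b"46")
a2.sendall(b"3")

results = poll_loop([b1, b2], expected=2)
messages = sorted(results.values())

for s in (a1, b1, a2, b2):
    s.close()
```

Even this toy version has to track a mapping from fds back to sockets and decide when to stop looping, which hints at where the 700 lines come from once error handling, writes and partial reads enter the picture.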
This doesn’t seem like such a bad interface, except that understanding how to write poll/epoll/select-based code and actually writing it are two completely different things.
Even assuming the interface was simpler, it’s still completely foreign to anyone who has used any other asyncio library. You are essentially starting from zero if you want to use the native API.
If only there was an easier way…
Well, there is one: the way literally all other libraries handle asyncio, using callbacks and/or futures. It’s the way every other language with native asyncio support, such as Go, Python and Node, does it. It’s the way all asyncio libraries do it, be it libraries for C, C++, Java, Scala, Ruby, Perl, PHP, Rust or any other language under the sun.
But is it kosher for Linux to hide all that complexity under the hood? To have a user-friendly API that filthy peasants with no understanding of true programming aesthetics can use?
To which I reply: take a look at pthread. In the same way that we use pthread_create instead of the clone syscall, we could have a user-friendly wrapper over the async syscalls without losing any functionality, via the magic of function pointers.
Sure, there could be edge cases where one needs to use the original syscalls, similar to how there are edge cases where you might need to use clone instead of fork or pthread.
But for 99.9% of cases, I think we’d be more than happy with a wrapper in the vein of:
poll_events(underlying_syscall, fds_array, callback)
Where the callback functions would be standardized to receive a file descriptor and any additional args specific to the underlying syscall, for example two shorts (events and revents) for poll.
Or maybe this interface could even provide a standardized way for the same function to work with any of the 3 syscalls.
Either way, I’m not trying to dictate the design of a native Linux async interface here. I’m just suggesting that we might need a simpler one.
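To make the shape of the `poll_events` idea a little more concrete, here is a rough sketch of how such a wrapper might behave. It is written in Python on top of `select.poll` rather than as a native C interface, it only handles the poll case, and it only watches for readable data. The string selector for the underlying syscall, the `timeout_ms` parameter and the return value are all assumptions of this sketch, not a proposal for the real design.

```python
import select
import socket

def poll_events(underlying_syscall, fds_array, callback, timeout_ms=1000):
    """Hypothetical wrapper: wait once, then call callback(fd, events, revents)."""
    if underlying_syscall != "poll":
        raise NotImplementedError("this sketch only covers poll")
    events = select.POLLIN                # assumption: we only watch for reads
    poller = select.poll()
    for fd in fds_array:
        poller.register(fd, events)
    fired = poller.poll(timeout_ms)
    for fd, revents in fired:
        callback(fd, events, revents)     # user code never touches the poll loop
    return len(fired)

# Assumption: a local socket pair stands in for a remote TCP peer.
left, right = socket.socketpair()
right_fd = right.fileno()
seen = []

def on_event(fd, events, revents):
    if revents & select.POLLIN:
        seen.append((fd, right.recv(254)))

left.sendall(b"3")
handled = poll_events("poll", [right_fd], on_event)

left.close()
right.close()
```

The caller supplies a list of fds and a function, which is much closer to how the asyncio and tokio examples earlier feel to use.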
If people can’t use your API, maybe it’s a sign that your API is rubbish, rather than a sign that your users are stupid. If bread-and-butter syscalls are understood and used only by a select few library maintainers, it means you need a better API.
Since the advent of select, there have been two opportunities to provide a better interface: one when poll was introduced, the other when epoll was introduced. Yet we are stuck with 3 cumbersome asyncio mechanisms, with slightly different APIs and implementations, all equally unusable.
Published on: 2019-05-03