Isn't one of the kernel design goals to move more things to user space and keep the kernel itself smaller? I think network filtering should be one of those things.
So why haven't they done that? Is it considered too much of a performance hit to have user processes filter packets? I don't think it would be. Or is there some functionality that can be done in the kernel but not in user space?
The API would not be hard. Just create device nodes...
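
To make the idea a bit more concrete, here is a rough sketch of what the user-space side might look like. Everything about the interface is an assumption for illustration: I am pretending the device node hands over each packet with a 2-byte big-endian length prefix, and that the filter answers with one verdict byte per packet. The names `read_packet`, `filter_loop`, `frame`, and the verdict values are hypothetical, not an existing kernel interface. In-memory buffers stand in for the device nodes so the sketch runs anywhere.

```python
import io
import struct

# Hypothetical verdict bytes written back to the (imaginary) device node.
ACCEPT = b"\x01"
DROP = b"\x00"


def frame(payload):
    """Prefix a payload with its 2-byte big-endian length (assumed framing)."""
    return struct.pack("!H", len(payload)) + payload


def read_packet(dev):
    """Read one length-prefixed packet; return None on end of stream or a short read."""
    header = dev.read(2)
    if len(header) < 2:
        return None
    (length,) = struct.unpack("!H", header)
    payload = dev.read(length)
    if len(payload) < length:
        return None
    return payload


def filter_loop(dev_in, dev_out, allow):
    """Ask allow() about every packet and write one verdict byte per packet.

    Returns the number of packets processed.
    """
    count = 0
    while True:
        packet = read_packet(dev_in)
        if packet is None:
            break
        dev_out.write(ACCEPT if allow(packet) else DROP)
        count += 1
    return count


if __name__ == "__main__":
    # Stand-ins for the device nodes: two in-memory byte streams.
    packets = io.BytesIO(frame(b"hello") + frame(b"BADdata"))
    verdicts = io.BytesIO()
    # Toy rule: drop any packet whose payload starts with b"BAD".
    n = filter_loop(packets, verdicts, lambda p: not p.startswith(b"BAD"))
    print(n, verdicts.getvalue())
```

Note that the loop does one read and one write per packet, which is exactly where the performance question comes in: each packet would cost at least two transitions between kernel and user space under this scheme.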