LinuxQuestions.org
Linux - Kernel This forum is for all discussion relating to the Linux kernel.

10-09-2023, 10:14 AM   #1
nyquist09
LQ Newbie
 
Registered: Oct 2023
Posts: 2

Rep: Reputation: 0
Huge latency on pselect


I have the following setup on Linux (low-latency kernel):

  • I'm using a process to read serial data using pselect on a serial device /dev/ttySX.
  • Data comes in at a stable frequency of 400 Hz.
  • To optimize the latency of that process, I took several measures:
      • The reading thread is pinned to a core (using affinity), and no other tasks are allowed to run on that core. This is done via cgroups/cpuset.
      • The reading thread has an RT priority of 49 (just below some of the IRQ processes) with the SCHED_FIFO policy.
      • The IRQ corresponding to /dev/ttyS4 is pinned to that same core, and the IRQ process also runs on that core. This was done to further reduce latency.
  • Fully loading the system with stress --cpu XX --io XX does not affect the latency of the readings, and they come in nicely at 400 Hz
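
For readers who want something concrete, the reader thread described above might look roughly like the following C sketch. It is not the original poster's code. The function names set_fifo_priority and wait_and_read are made up for illustration, and CPU pinning is assumed to be handled outside the program by the cpuset, as the post describes.

```c
#define _POSIX_C_SOURCE 200809L
#include <sched.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: switch the calling thread to SCHED_FIFO at the
 * given priority. Returns 0 on success, -1 with errno set on failure
 * (typically EPERM without CAP_SYS_NICE or a suitable rtprio limit). */
int set_fifo_priority(int prio)
{
    struct sched_param sp = { .sched_priority = prio };
    return sched_setscheduler(0, SCHED_FIFO, &sp);
}

/* Hypothetical helper: block in pselect() until fd is readable, then
 * read up to len bytes. No timeout and no signal mask are used here.
 * Returns the number of bytes read, or -1 on error. */
ssize_t wait_and_read(int fd, void *buf, size_t len)
{
    fd_set rfds;

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    if (pselect(fd + 1, &rfds, NULL, NULL, NULL, NULL) < 0)
        return -1;
    return read(fd, buf, len);
}
```

In the setup from the post, the thread would call set_fifo_priority(49) once, open the serial device, and then call wait_and_read() in a loop.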

The problem I experience:

There is another, offending user-space process that uses a lot of resources. When it runs, it can cause huge latency spikes on my serial read thread: 100 ms or more, even though serial data from the hardware arrives every 2.5 ms.
I don't know much about that other process, except that it runs under the regular 'nice' scheduler and spawns a lot of threads.


My question:
  • Any ideas or approaches for debugging this? Maybe ftrace/ptrace, but I am not quite sure where to start.
  • Any ideas what could cause such behavior? A delay of several tens of milliseconds seems like a problem that should be solvable from user space.
  • I assume that some sort of kernel process is involved in waking user-space programs that wait on select. Where is a good place to find information about that? My guess is that this process does not have the right priority. Since my program only waits on one device, maybe select is not the best approach and a simple read would yield better results?
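
One common way to start with ftrace on a problem like this is to let the reader itself stop the trace as soon as it sees a late wake-up, so that the trace buffer still holds the scheduler events leading up to the spike. The C sketch below is only an illustration under assumptions: the helper names elapsed_us, tracefs_write and check_gap are hypothetical, and tracefs is assumed to be mounted at /sys/kernel/tracing (older systems expose it under /sys/kernel/debug/tracing).

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Elapsed time from a to b in microseconds. */
int64_t elapsed_us(const struct timespec *a, const struct timespec *b)
{
    int64_t ns = (int64_t)(b->tv_sec - a->tv_sec) * 1000000000LL
               + (b->tv_nsec - a->tv_nsec);
    return ns / 1000;
}

/* Write a short string to a tracefs control file.
 * Returns 0 on success, -1 on failure (for example, not root). */
int tracefs_write(const char *path, const char *value)
{
    size_t len = strlen(value);
    int fd = open(path, O_WRONLY);
    ssize_t n;

    if (fd < 0)
        return -1;
    n = write(fd, value, len);
    close(fd);
    return n == (ssize_t)len ? 0 : -1;
}

/* Call after every successful read. Updates *last to the current time.
 * If the gap since the previous call exceeds limit_us, tracing is
 * switched off so the buffer keeps the events before the spike, and 1
 * is returned. Otherwise returns 0. */
int check_gap(struct timespec *last, int64_t limit_us)
{
    struct timespec now;
    int64_t gap;

    clock_gettime(CLOCK_MONOTONIC, &now);
    gap = elapsed_us(last, &now);
    *last = now;
    if (gap <= limit_us)
        return 0;
    /* Assumed tracefs mount point; adjust if it differs. */
    tracefs_write("/sys/kernel/tracing/tracing_on", "0");
    return 1;
}
```

Before starting the reader, the sched_switch and sched_wakeup events would be enabled by writing "1" to events/sched/sched_switch/enable and events/sched/sched_wakeup/enable under the same tracefs directory. With data arriving every 2.5 ms, a limit of a few milliseconds should catch the spikes, and the frozen trace file then shows which task was running on which CPU while the reader was waiting.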

At this point, I am happy for any hints/ideas, thanks!
 
10-11-2023, 02:53 AM   #2
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,138

Rep: Reputation: 4122
Do you happen to be pinning your process to CPU0? If so, pick another one. I like to stay away from the processors that the kernel has to run on in early boot, as (to my simple mind) it's likely to reschedule there all the time. Might be nothing, but it's easy to implement as a test.
 
10-19-2023, 12:27 PM   #3
nyquist09
LQ Newbie
 
Registered: Oct 2023
Posts: 2

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by syg00
Do you happen to be pinning your process to CPU0? If so, pick another one. I like to stay away from the processors that the kernel has to run on in early boot, as (to my simple mind) it's likely to reschedule there all the time. Might be nothing, but it's easy to implement as a test.
Thanks. No, I am not using CPU0.

In fact, I figured out what the problem was by tracing scheduler events with ftrace:

The tty driver relies on an unbound kernel worker to push data to the user via a workqueue. That kworker is scheduled with SCHED_OTHER. This is a strange situation: I prioritized both the tty IRQ and the receiving application with SCHED_FIFO, but the SCHED_OTHER kworker sits between them and is clearly the weakest link in the chain. There used to be a low_latency flag for tty, but it was apparently removed because it was buggy.
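
To check this on another system, one option is a small read-only diagnostic that lists the unbound kworkers and their scheduling policy. The C sketch below is only an illustration, and all three function names are hypothetical. It assumes unbound workers can be recognized by the "kworker/u" prefix in /proc/<pid>/comm, and it does not change any priorities.

```c
#define _POSIX_C_SOURCE 200809L
#include <ctype.h>
#include <dirent.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Human-readable name for the common scheduling policies. */
const char *policy_name(int policy)
{
    switch (policy) {
    case SCHED_OTHER: return "SCHED_OTHER";
    case SCHED_FIFO:  return "SCHED_FIFO";
    case SCHED_RR:    return "SCHED_RR";
    default:          return "other";
    }
}

/* Assumption: unbound workers are named "kworker/u<pool>:<id>". */
int is_unbound_kworker(const char *comm)
{
    return strncmp(comm, "kworker/u", 9) == 0;
}

/* Print pid, name and policy of every unbound kworker found in /proc.
 * Returns the number of workers printed, or -1 if /proc is unreadable. */
int list_unbound_kworkers(void)
{
    DIR *proc = opendir("/proc");
    struct dirent *de;
    int count = 0;

    if (!proc)
        return -1;
    while ((de = readdir(proc)) != NULL) {
        char path[300], comm[64];
        FILE *f;

        if (!isdigit((unsigned char)de->d_name[0]))
            continue;               /* not a pid directory */
        snprintf(path, sizeof path, "/proc/%s/comm", de->d_name);
        f = fopen(path, "r");
        if (!f)
            continue;               /* task exited in the meantime */
        if (fgets(comm, sizeof comm, f) && is_unbound_kworker(comm)) {
            int policy = sched_getscheduler(atoi(de->d_name));

            comm[strcspn(comm, "\n")] = '\0';
            printf("%s %s %s\n", de->d_name, comm,
                   policy < 0 ? "unknown" : policy_name(policy));
            count++;
        }
        fclose(f);
    }
    closedir(proc);
    return count;
}
```

If every listed worker reports SCHED_OTHER while the reader and the IRQ thread run under SCHED_FIFO, that matches the situation described above.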

Well, in any case, I now understand the problem and am evaluating options to overcome it.
 
  

