Another flicker issue #483
|
• Nothing else running |
|
Or am I missing something? Do I need to use taskset to assign pixel-push to CPU 3? |
Hmm, modified my startup script to use:
Checking with pidstat now shows only pixel-push on CPU 3 (along with kworker/3:0, kworker/3:1, and migration/3), with other processes on other CPUs. Perhaps it glitches a little less, but there are still noticeable dropouts at 5-10s intervals. Also, note that I'm not continually sending data to pixel-push -- I just sent a fixed all-on display and left that static. |
Unfortunately, the kernel sometimes takes away some CPU time to do whatever accounting it wants, even though it should leave an RT thread isolated on core 3 alone. But yeah, there are kernel workers on that core - I wonder if they can be disabled? Only the main update thread should run on core 3; all the other threads of pixel-push can (and should) be on other cores. Interesting that taskset is not working; I wonder if this also results in the realtime thread not being a realtime thread? Can you check if the thread that is using the most CPU (that is the update thread) is actually an RT thread? Maybe that feature is not compiled into that kernel? In the end, we're dealing with a multiuser/multitasking operating system that, in the stock kernel version, is not optimized for realtime work. So it might be worthwhile trying to compile the most stripped-down kernel with all the realtime goodies enabled. Unfortunately, I don't know the best settings here, but if someone here knows a good set of kernel settings to enable and disable, that would be very helpful. |
If I understand correctly, then pidstat and chrt will not show the process on CPU 3 or the scheduling as FIFO because only the refresh thread is set to be realtime on CPU 3? Because if it is launched normally, the process ends up with SCHED_OTHER and not bound to CPU 3. I tried launching it with: Yet, the flickering continues. I noticed that there is a SLEEP_JITTER define in the code -- will that help isolate whether the issue is really CPU scheduling? |
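Since only one thread is supposed to be realtime, the check has to be made per thread rather than per process. A minimal sketch of one way to do that, assuming Linux with /proc mounted; `PolicyName` and `ThreadPolicies` are names made up for this example and are not part of the library:

```cpp
#include <dirent.h>
#include <sched.h>
#include <sys/types.h>

#include <cstdlib>
#include <map>
#include <string>

// Human-readable name for the three POSIX scheduling policies.
const char *PolicyName(int policy) {
  switch (policy) {
    case SCHED_OTHER: return "SCHED_OTHER";
    case SCHED_FIFO: return "SCHED_FIFO";
    case SCHED_RR: return "SCHED_RR";
    default: return "other";
  }
}

// Maps each thread id of process `pid` to the name of its scheduling policy.
// Returns an empty map if /proc/<pid>/task cannot be opened.
std::map<pid_t, std::string> ThreadPolicies(pid_t pid) {
  std::map<pid_t, std::string> result;
  const std::string task_dir = "/proc/" + std::to_string(pid) + "/task";
  DIR *dir = opendir(task_dir.c_str());
  if (dir == nullptr) return result;
  while (const dirent *entry = readdir(dir)) {
    const pid_t tid = static_cast<pid_t>(std::atoi(entry->d_name));
    if (tid <= 0) continue;  // skips "." and ".."
    // On Linux, sched_getscheduler() accepts a thread id here.
    const int policy = sched_getscheduler(tid);
    if (policy >= 0) result[tid] = PolicyName(policy);
  }
  closedir(dir);
  return result;
}
```

Calling `ThreadPolicies()` with the pid of the running pixel-push process should then list one entry per thread; if the realtime setting took effect, one would expect exactly one of them to report SCHED_FIFO.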
the DEBUG_SLEEP_JITTER is just a way to quantify the jitter for sleeping, but if you are using the PWM, sleep is essentially not used. I suspect that the kernel is taking some CPU time when clocking in the data, which means that the time sending things into the matrix takes longer, so there are longer dark phases. Best next steps are essentially looking what is happening when you see the brightness change. As described in the readme, some user was seeing that a regular ntp update was creating some faint flicker, which went away when it was disabled. |
Sounds like a tricky thing to debug! I killed ntpd, which might've made things a little better, but there are still flickers. Note that running things in a terminal window creates a lot of flicker (e.g. "service --status-all"). Is there any way the library can detect if something was off in timing? Or, failing that, is there a good place (or places) to hook a scope onto the Pi to measure precisely what's going wrong? Thanks for all the help! |
yes, running things in the terminal creates a lot of flicker. I suspect there is something going on that starves the memory bus. I noticed that myself in some situations, but haven't had time to find a strategy for poking inside the kernel to figure out what is going on. So if you want to look into that, that would be highly appreciated. Regarding measuring, I'd probably hook the scope on the strobe/latch signal. It should be triggered with (number_of_rows/2) * refresh frequency, so you can quickly see if there is some huge variation. Now, how to correlate that with something that goes on inside the kernel (between the last and the delayed strobe signal, that is), I don't know. I suspect there are some kernel activity logging tools, so if we can timestamp things properly there, it might lead to the culprit... |
Ok, will try that. In the meantime, have been playing with the 'perf' command and looking at: Using: Clearly shows context switching between pixel_push on CPU 3 and other processes (such as the bash, grep, etc. generated by running "service --status-all") that shouldn't be on CPU 3, if I understand isolcpus correctly. Yet, pidstat never shows anything but the kworkers and migration on that CPU. |
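To know what to look for on the scope, the rule of thumb above can be turned into numbers. This is only a sketch that takes the formula as stated; the function names are made up here, and the real pulse count may be higher if each PWM bit-plane is latched separately (an assumption not checked against the library):

```cpp
#include <cassert>

// Expected strobe/latch frequency in Hz, following the rule of thumb
// (number_of_rows / 2) * refresh frequency.
double ExpectedLatchHz(int rows, double refresh_hz) {
  assert(rows > 0 && rows % 2 == 0);
  return (rows / 2) * refresh_hz;
}

// Nominal spacing between two latch pulses, in microseconds.
double LatchPeriodUsec(int rows, double refresh_hz) {
  return 1e6 / ExpectedLatchHz(rows, refresh_hz);
}
```

For the 64-row panel at the roughly 160Hz refresh reported in this issue, that gives 32 * 160 = 5120 pulses per second, or one pulse about every 195usec; a gap noticeably longer than that would point at a delayed update.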
Yes, I also have the strong suspicion that the kernel only takes the |
Hi and just popping up for a second.
This behaviour mentioned is what I’ve always understood to be the case.
isolcpus is only effective for isolation inside user-space.
The kernel still decides if and when it wants to schedule kworker processes, right when your time-critical, cycle-accurate code is in the critical phase of its loop.
That's the difference between a best-effort and a true real-time kernel, or at least it used to be that way!
Cheers
Andy
… On 20 Jan 2018, at 7:38 pm, Henner Zeller ***@***.***> wrote:
So would be good if you could figure out what else we have to do to make it work as we expect it to.
|
Thanks for joining in Andy. That sounds reasonable, but what I'm seeing is that shell scripts run on the command line end up with their commands (grep, bash, etc) running on the isolated CPU. Or at least that's what perf shows. I'm wondering if maybe they get launched on whatever CPU and then subsequently get moved off of the isolated one. |
OK, so my mistake on one part of this: I had isolcpus=3 on a separate line of /boot/cmdline.txt instead of at the end of the first line, which caused it to be ignored. I noticed this when checking /proc/cmdline to verify everything. Unfortunately, it still doesn't fix the problem, though now I don't see any unexpected commands running on CPU 3 when using "perf record" other than kworker/3, and those don't seem to be correlated with the display issues. So, even running something on another CPU is impacting the display -- it does NOT seem to be a CPU allocation issue. As you suggested, perhaps it is a memory bus issue. But there's one other thing I don't understand. If I write 255,255,255 to all pixels of the display, then none of the updates of the display should ever turn off any LEDs. Regardless of delays in the PWM, the display should just be constant on, right? So, why are we seeing flicker? Is the data getting misclocked? |
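For anyone else checking this, here is a small sketch of the verification step: parse the kernel command line and see whether isolcpus actually made it in. `ParseIsolcpus` and `FirstLine` are hypothetical helpers written for this example; the parser handles plain numbers and simple ranges only, and the sample command lines are invented:

```cpp
#include <set>
#include <sstream>
#include <string>

// Returns the CPU numbers listed in an "isolcpus=" parameter of a kernel
// command line (e.g. the content of /proc/cmdline). Handles comma-separated
// entries and ranges like "2-3"; flag prefixes such as "domain," are not
// handled and would make std::stoi throw.
std::set<int> ParseIsolcpus(const std::string &cmdline) {
  std::set<int> cpus;
  const std::string key = "isolcpus=";
  std::istringstream params(cmdline);
  std::string param;
  while (params >> param) {
    if (param.compare(0, key.size(), key) != 0) continue;
    std::istringstream list(param.substr(key.size()));
    std::string item;
    while (std::getline(list, item, ',')) {
      const std::string::size_type dash = item.find('-');
      const int first = std::stoi(item.substr(0, dash));
      const int last = (dash == std::string::npos)
                           ? first
                           : std::stoi(item.substr(dash + 1));
      for (int cpu = first; cpu <= last; ++cpu) cpus.insert(cpu);
    }
  }
  return cpus;
}

// Only the first line of the text, mimicking the behaviour seen above where
// a parameter on a second line of cmdline.txt was ignored.
std::string FirstLine(const std::string &text) {
  return text.substr(0, text.find('\n'));
}
```

With a file such as "console=tty1 rootwait\nisolcpus=3\n", `ParseIsolcpus(FirstLine(...))` comes back empty, which matches the mistake described above; with everything on one line it returns {3}.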
The displays can't display a full screen of data, it is multiplexed between the rows, so this is why the CPU constantly has to clock in data and why it is sensitive to CPU or memory being allocated elsewhere. There are only two rows lit at any time, it is cycling through that quickly... |
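A toy model may help to see why an all-on image can still flicker. It assumes each row pair alternates between a fixed lit phase and a dark phase spent clocking in the next data, so brightness follows the lit fraction. The function name and the microsecond figures in the note below are invented for illustration, not measurements:

```cpp
#include <cassert>

// Fraction of time a row pair emits light when a lit phase of `lit_usec`
// alternates with a dark phase of `dark_usec` (toy model, not library code).
double RelativeBrightness(double lit_usec, double dark_usec) {
  assert(lit_usec >= 0 && dark_usec >= 0 && lit_usec + dark_usec > 0);
  return lit_usec / (lit_usec + dark_usec);
}
```

With 100usec lit and 25usec dark, the model gives 0.8. If the kernel delays the clocking so that the dark phase stretches to 60usec, it drops to 0.625, which would be seen as a dip in intensity even though every pixel value stays at 255.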
So I have now installed a Pi with a recent Raspbian (2017-11-29) and I can confirm that the flicker changes are quite noticeable vs. some old 2016 version of Raspbian I had lying around. So it used to be better. Something is going on where the kernel is doing something on that third core even though it is asked not to (even if isolcpus=3 is set and echo -1 > /proc/sys/kernel/sched_rt_runtime_us). So I guess we should start making a minimal Linux distribution, with a kernel with various realtime and nohz options compiled in, and unnecessary cruft removed, so that we don't rely on the volatile nature of whatever Raspbian is providing. Would anyone like to contribute that? Or maybe someone knows a minimal Raspberry Pi distribution that is up for the task? |
Henner,
Is it possible that the jitter is due to variable delays in when the critical thread gets resumed from sleeps? Thus, when there are other tasks running, even on other CPUs, the scheduler may not get to resuming the critical thread for some time. If so, since that thread has a dedicated CPU core, rather than sleep could it just tight loop, consuming 100% of that CPU but no longer being susceptible to resumption delays from the scheduler? I haven't looked into the matrix code in detail, so I’m not sure how it sleeps/resumes, but does this make sense?
Brent
… On Jan 21, 2018, at 12:28 PM, Henner Zeller ***@***.***> wrote:
So I have now installed a pi with a recent raspbian (2017-11-29) and I can confirm that the flicker changes are quite noticeable vs. some 2016 old version of Raspbian I had lying around. So it used to be better.
Something is going on where the kernel is doing something on that third core even though it is asked not to (even if isolcpus=3 is set and echo -1 > /proc/sys/kernel/sched_rt_runtime_us).
So I guess we should start making a minimal Linux distribution, with a kernel with various realtime and nohz options compiled in, and unnecessary cruft removed, so that we don't rely on the volatile nature of whatever Raspian is providing.
Anyone would like to contribute that ? Or maybe knows a minimal Raspberry Pi distribution that is up for the task ?
|
Yes, there are sleeps in there for areas that are longer wait periods, and there is code you can use to work around sleeping glitches. First, you set the sleeping jitter debug define in lib/gpio.cc to 1
Then recompile and run a tool, such as the
So now, you can adjust the allowance by this additional value. So, let's set it to 60usec, by changing the define before that in lib/gpio.cc
(which is the original 25usec plus the value of the majority of outliers). This should now make things smoother, as we now spend more time busy waiting to hit the target time exactly. Testing with my set-up here, it looks like the visible flicker glitches went away entirely - the image is dead solid. It comes with a theoretical downside, which is more CPU usage on the core the update thread is running on. However, in practice (with isolcpus=3 set), this won't take away any CPU from anything else, because the thread is already locked to one core and nobody else can use it. If this works for you, maybe I should make this higher setting the default. It is a bit more tricky as it should only apply to the Pi2/3, which have more than one core; the old Pi1 (or Raspberry Pi Zero) should have a more moderate setting here, otherwise we take away too much CPU time from other things. (don't forget to set DEBUG_SLEEP_JITTER back to 0 once you're satisfied) |
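The idea behind the allowance can be sketched as follows: sleep for most of the wait, then busy wait for the last stretch so that a late wake-up from the scheduler is absorbed. This is an illustration using std::chrono, not the code in lib/gpio.cc; `SleepThenSpin` and its parameters are names made up here:

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Waits for `total`: sleeps for all but the last `busy_allowance`, then
// spins until the deadline. A larger allowance tolerates later wake-ups at
// the cost of more CPU time spent spinning.
void SleepThenSpin(std::chrono::microseconds total,
                   std::chrono::microseconds busy_allowance) {
  assert(total.count() >= 0 && busy_allowance.count() >= 0);
  using Clock = std::chrono::steady_clock;
  const Clock::time_point deadline = Clock::now() + total;
  if (total > busy_allowance) {
    std::this_thread::sleep_for(total - busy_allowance);
  }
  while (Clock::now() < deadline) {
    // Busy wait: burns CPU, but does not depend on scheduler wake-up latency.
  }
}
```

For example, `SleepThenSpin(std::chrono::microseconds(2000), std::chrono::microseconds(60))` would use the 60usec allowance discussed above. If the sleep overshoots by more than the allowance, the deadline is still missed, which is why the allowance is sized from the observed outliers.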
I've now updated that jitter allowance by 35usec for Raspberry Pi 2 and 3, so this should hopefully be better now if this was really the source of your flicker. Let me know if this fixes your observation or if we have to dig somewhere else. |
Did it get better @btownshend ? |
Sorry, I was distracted by other things for the last few days. I did test it out tonight, and it is still flickering. It is possible that it is a bit better, but can't tell. I tried checking the SLEEP_JITTER and here's what I got: While running 'service --status-all' in another window (on another CPU), which causes lots of flickering:
While leaving the system idle, still seeing minor flickers all the time, with more visible ones roughly every 10s:
Judging from the second set, it doesn't seem like the sleep jitter is causing the flickers. |
Whenever something heavy happens in another terminal, the kernel is very distracted and generates a lot of flicker. |
And I just tried increasing the EXTRA overhead to give a total of 90usec. Now, the overshoot is always <=0, but the flickers are still there. |
yeah, so the kernel is either (a) messing with that CPU core too much or (b) something else is going on in the background that is creating too much interference in memory etc. I hope you're not using the taskset setting, because otherwise all threads are running on that core. |
No, I was only doing that to check if it made a difference. I’m just letting the program adjust the affinity for the one thread.
Brent
|
I'm trying to use Tiny Core. |
@gioreva you don't need wiring pi for the led matrix, in fact you should not mix the two. The led matrix already provides a way to read the remaining gpio bits. |
To start a video when a GPIO changes.
Tiny Core has no build-essential. As for the python3-gpiozero library, I'm trying to copy it from Raspbian to Tiny Core, but the folders are different. Raspbian has: and Tiny Core has: |
I'd advise to avoid using Python on the raspberry pi, it is too resource-intensive, resulting in slowness and flicker (not to mention the nightmares with dependencies, compatibility and install you see in your case). It should only be a couple of lines of code to modify the led-image-viewer to play a stream on a button-press. |
Python and gpiozero, are used for the script, to use the new IC |
It will take a few lines of code and can be implemented in a few minutes; definitely quicker than trying to figure out how to install the correct Python libraries (let alone that you won't be able to run it in parallel with the matrix). It is also a lot more solid to operate, because you only have one compiled binary that runs. First, check that you can create a stream for the led-image-viewer to play your video (you can create that stream with the video-viewer; don't use the video-viewer directly to display things, it is too slow and creates flicker on the Raspberry Pi). Then familiarize yourself with the Use that to read your input, and modify the led-image-viewer so that it runs all the time but is triggered by the input. So in the main
|
Video work well, and this code is ok |
I might implement that in the main library soon; I have ordered one of the new Panels. |
I have a few days until delivery. Would you like to write it, and I'll try it? libavcodec, libavformat and libswscale are in the Tiny Core repo. |
Hello, @gioreva I would like to test tiny core also for my project, do you have any kind of "guide" / step-by-step to build the library? Are you generating a tcz package? |
I have already forgotten. I have these notes. I put all the folders in the save, but they scolded me that it's wrong. If you do, will you pass me the step by step guide ? But @hzeller , did you implement the script for the new chip in the sources ? |
A couple of years ago I was also playing with picore for a similar project and I already created my own packages (even I prepared a captured small web that allowed to load updates and let the user configure the wifi). I have to remember the steps, and sure, I will share them, It could be a nice addition for the @hzeller already great documentation. |
@gioreva Do you remember which piCore version you used? |
I don't remember well but I doubt that I compiled the library in armbian, and then copied the executables to tinycore. |
Sorry to bring this thread back around again, but I'm curious if anyone has had the chance to test how the RPI 4 affects flickering? I'm currently running an RPI 3 B+, but if the faster clock on the RPI4 reduced or removed flicker altogether, I'd drop the cash on an upgrade in a heartbeat! |
It is faster, and if you are running some more heavy lifting thing, it will reduce flickering. For instance, I could decode video and play it without creating flicker while this was not possible with the Pi3 (this is an example of course, as video playback should anyway be done using the streaming). As always, it depends what you are doing. Usually, you only get flicker if you have some heavy stuff going on, like something with a lot of memory churn (e.g. when programming in Python). Then also, using the most minimal operating system set-up is best; I have heard of people using DietPi to help them get rid of flicker issues they had before with Raspbian (haven't tried that myself). |
@hzeller Thanks for the quick response! It sounds like the new Pi is worth the upgrade then. For now, I'm just displaying video, text, and images. The framework I built on top of this library, while not in python, is still a little heavy, so I imagine that could benefit from faster clock speeds. I was hoping to integrate some networked tasks like pulling Spotify data from Last.fm, weather data, controlling the panel through a web app, etc... Not quite sure if I'm just living in a fantasy world thinking the Pi could handle both of these without creating flickering/stuttering though. |
make sure to pre-scale video, or better, pre-generate as a stream (see documentation of led-image-viewer and video-viewer; also content-streamer.h api). With that, I can play a video with 250fps (essentially swapping frame with each refresh; watching a Mandelbrot zoom video with that is quite a ride...) In the past, the Pi's had issues with accessing USB and network as it all was sharing a bus; this might be better with the Pi4. |
Interesting - I'll definitely take a look at your documentation on that. I'm working with some specific formats (pinball DMD animations!) that don't store or update with nearly as much data (mono-color, 0-15 intensity values), so I figured I could write something myself to take advantage of that. it sounds like pre-generating as a stream is the way to go. If nothing else, doing it myself has been a good learning experience :) I see further up different suggestions for flicker reduction, but the main thing that's really fixed flicker on my end is enforcing a sleep at the end of my update loop (~usleep(100000) seems to look the best)... This seems like I'm going against the intended use of the library though, not to mention it locks my framerate to <10fps. Is this effectively what FIXED_FRAME_MICROSECONDS tries to do? |
Yes, the more you can pre-generate, the better; typically some pre-allocated FIXED_FRAME_MICROSECONDS is used within the internal update thread to make sure that each screen refresh is taking the same time; otherwise if there are faster and slower updates, this will push more or less photons per time-unit, which is what we perceive as flicker. the sleep that you do in your update loop is making sure to use less CPU and memory churn to less influence the updating that is going on (you do have the |
That's very good to know. I've basically used the Canvas and Canvas::SetPixel() as a wrapper for interfacing with the panel at a simple level, then built everything else myself on top of that. Perhaps I'm not making good enough use of FrameCanvas and SwapOnVSync. Currently I'm just drawing the frame with SetPixel, sleeping, then clearing the canvas before the next frame. Originally this was just to get things running while I build everything out, but now I'm thinking it may be a little naive. I'm working on better calibrating FIXED_FRAME_MICROSECONDS now - sounds like this is something I'd definitely want. And yes, I did add |
With FIXED_FRAME_MICROSECONDS commented out and --led-slowdown-gpio left at default, I'm sitting at 7425u. It sounds like this is higher than others in this thread. Perhaps I need to optimize things better first then... |
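For reference, the conversion between frame time and refresh rate, plus the padding a fixed frame time implies, can be sketched like this. The helper names are made up, and the 8000usec target in the note below is a hypothetical value, not a recommendation:

```cpp
#include <algorithm>
#include <cassert>

// Refresh rate in Hz for a frame that takes `frame_usec` microseconds.
double RefreshHz(double frame_usec) {
  assert(frame_usec > 0);
  return 1e6 / frame_usec;
}

// Idle time needed to stretch a refresh that took `measured_usec` to a fixed
// duration of `fixed_frame_usec`; zero if the refresh already took longer.
double PaddingUsec(double fixed_frame_usec, double measured_usec) {
  return std::max(0.0, fixed_frame_usec - measured_usec);
}
```

A measured 7425usec corresponds to roughly 134.7Hz. With a hypothetical fixed frame time of 8000usec, each such refresh would be padded by 575usec, giving a steady 125Hz. The same arithmetic shows why a usleep(100000) per update caps an update loop at 10fps.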
Hello, |
@gpulido That's great to know. I'll take a look at that! |
Wanted to leave some findings here for what it's worth. I was able to take my refresh from ~120hz to ~400hz and completely remove flicker by setting |
Thanks for the info, however in my case my panels are going to be used on exterior with direct sunlight, so to reduce the brightness is not an option :( |
I know this issue is closed, but in my case I removed the flickering completely by removing the "led-show-refresh" option. I tried nearly everything that's described above, but in the end, this was the fix (for me). |
First, thanks for a great library and all the thought that's gone into the details!
I'm working with a 64x64 matrix and see drops in intensity randomly every 5-15s. I've gone through your notes and tried everything to eliminate these:
pixel-push -i wlan0 -U --led-chain=4 --led-parallel=1 -R 90 -u 65507 --led-show-refresh --led-scan-mode=0 --led-gpio-mapping=adafruit-hat-pwm --led-slowdown-gpio=2 -d
Linux raspberrypi 4.9.35-v7+ #1014 SMP Fri Jun 30 14:47:43 BST 2017 armv7l GNU/Linux
I hooked up a photoresistor to a scope to capture what I'm seeing visually -- see attached image. Looks like ~160Hz refresh rate with occasional drops in level.
Any other ideas?
Thanks,
Brent