FLV: Crash when switch between HTTP-FLV streams. #1941

Open
freeman1974 opened this issue Sep 7, 2020 · 10 comments
Assignees
Labels
Bug It might be a bug. TransByAI Translated by AI/GPT.
Milestone

Comments

@freeman1974

freeman1974 commented Sep 7, 2020

Description
When a single player switches back and forth between two HTTP-FLV streams in quick succession, the SRS process exits after about 5 switches. See the screenshot linked below for the Linux core dump.
The code involved appears to be:

void srs_close_stfd(srs_netfd_t& stfd)
{
    if (stfd) {
        // we must ensure the close is ok.
        int err = st_netfd_close((st_netfd_t)stfd);
        srs_assert(err != -1);  // This assertion fails, which makes the process exit.
        stfd = NULL;
    }
}

The caller of this function is:

void SrsTcpClient::close()
{
    // Ignore when already closed.
    if (!io) {
        return;
    }
    
    srs_close_stfd(stfd);
}

The crash seems to be triggered by frequent calls to SrsTcpClient::close(), that is, by repeatedly closing and reopening the socket.

    if ((*_st_eventsys->fd_close)(fd->osfd) < 0)
        return -1;

This is the line that returns the error. Could it be because the global variable _st_eventsys is used without a lock?

  1. SRS version: 4.0.39 (#define SRS_VERSION4_REVISION 39)
  2. SRS log: see the attached screenshot:
    http://demo.fili58.com/media/bug/photo_2020-09-07_18-16-24.jpg

TRANS_BY_GPT3

@freeman1974
Author

freeman1974 commented Sep 9, 2020

One more note: if I switch between the two streams a little more slowly, the issue does not occur.

TRANS_BY_GPT3

@RossWang

RossWang commented Sep 10, 2020

May I ask, when you play http-flv or dash, does the server have high CPU usage? It seems that it doesn't happen with SRS3.

TRANS_BY_GPT3

@freeman1974
Author

freeman1974 commented Sep 10, 2020

I haven't looked into that. Do you have any quantitative data, specifically comparing SRS3 and SRS4?

TRANS_BY_GPT3

@RossWang

RossWang commented Sep 11, 2020

It seems you don't have this problem, so I checked my own setup and found that it was caused by a low mr_latency setting.
Thank you for your help.

TRANS_BY_GPT3

@freeman1974
Author

freeman1974 commented Sep 13, 2020

I made some modifications myself, and by limiting the streaming speed, this problem can be solved.

TRANS_BY_GPT3

@winlinvip
Member

winlinvip commented Dec 1, 2020

How fast do you switch before encountering problems?

TRANS_BY_GPT3

@freeman1974
Author

freeman1974 commented Dec 2, 2020 via email

@winlinvip winlinvip self-assigned this Aug 23, 2021
@winlinvip winlinvip added the Bug It might be a bug. label Aug 26, 2021
@winlinvip winlinvip added this to the SRS 4.0 release milestone Aug 26, 2021
@winlinvip
Member

winlinvip commented Aug 26, 2021

This is a long-standing problem: it has been occurring for many years and we still do not know why. It would be great if we could find the root cause.

TRANS_BY_GPT3

@winlinvip
Member

winlinvip commented Nov 3, 2021

st_netfd_close is definitely being called on the fd while another coroutine is still reading or writing it.

So the key point is how to print out the coroutines that are accessing this fd, so that we can identify where the problem is.
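One way to get that information could look like the sketch below. It is only an illustration, not SRS or ST code: `FdAccessRegistry`, `enter`, `leave`, `idle` and `dump` are hypothetical names, and coroutine ids are plain strings here. The idea is to record every coroutine that enters a read or write on an fd, so the users can be logged right before the close (and the assert).

```cpp
#include <cassert>
#include <map>
#include <set>
#include <sstream>
#include <string>

// Hypothetical diagnostic registry: which coroutines are currently inside
// a read or write on which fd. Not part of SRS or ST.
class FdAccessRegistry {
public:
    // Call right before a coroutine starts a read or write on fd.
    void enter(int fd, const std::string& cid, const std::string& op) {
        assert(fd >= 0);
        users_[fd].insert(cid + ":" + op);
    }

    // Call as soon as that read or write returns.
    void leave(int fd, const std::string& cid, const std::string& op) {
        auto it = users_.find(fd);
        if (it == users_.end()) return;
        it->second.erase(cid + ":" + op);
        if (it->second.empty()) users_.erase(it);
    }

    // True when no coroutine is recorded as using fd.
    bool idle(int fd) const {
        return users_.find(fd) == users_.end();
    }

    // Describe the current users of fd, suitable for a log line.
    std::string dump(int fd) const {
        std::ostringstream os;
        os << "fd=" << fd << " users=[";
        auto it = users_.find(fd);
        if (it != users_.end()) {
            bool first = true;
            for (const std::string& u : it->second) {
                if (!first) os << ", ";
                os << u;
                first = false;
            }
        }
        os << "]";
        return os.str();
    }

private:
    std::map<int, std::set<std::string>> users_;
};
```

If the close path logged `dump(fd)` whenever `idle(fd)` is false, the log would name the coroutines still using the fd at the moment of the failed close.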

The assert itself is not the problem: if we did not exit at the point of failure, other issues would still show up later, and they would be even stranger.

The relationship between threads and file descriptors (fd) in ST is many-to-many. A thread can read and write to multiple fds, and an fd can be read and written by multiple threads (e.g., one coroutine reading and another writing). Therefore, there is more complexity in the underlying logic. When closing an fd, it is necessary to ensure that all threads are no longer reading or writing to this fd.
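The rule in the last sentence can be sketched as a small guard object. This is an assumption about how such a check might be structured, not the actual SRS or ST implementation: `GuardedFd` and its methods are hypothetical, and the sketch only does the bookkeeping. It does not wake coroutines that are blocked on the fd, which a real fix would also need.

```cpp
#include <cassert>

// Hypothetical guard: counts coroutines currently inside a read or write
// on one fd and defers the close until the last one has left.
class GuardedFd {
public:
    explicit GuardedFd(int fd) : fd_(fd), users_(0), closing_(false) {}

    // Call before I/O. Fails once a close has been requested, so no new
    // reader or writer can start on an fd that is going away.
    bool acquire() {
        if (closing_ || fd_ < 0) return false;
        ++users_;
        return true;
    }

    // Call after the I/O returns. Returns true when this was the last user
    // and a close is pending, meaning the fd may now really be closed.
    bool release() {
        assert(users_ > 0);
        --users_;
        return closing_ && users_ == 0;
    }

    // Request a close. Returns true if it is safe to close right away;
    // otherwise the close is deferred until release() returns true.
    bool request_close() {
        closing_ = true;
        return users_ == 0;
    }

    int users() const { return users_; }

private:
    int fd_;
    int users_;
    bool closing_;
};
```

With one coroutine reading and another writing, `request_close()` returns false, and only the second `release()` reports that the fd can be closed.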

TRANS_BY_GPT3

@winlinvip winlinvip modified the milestones: 4.0, 5.0 Dec 9, 2022
@winlinvip winlinvip changed the title from "频繁切换http-flv拉流导致srs进程退出" (Chinese for "Frequent switching of HTTP-FLV streaming leads to the termination of the SRS process") to "FLV: Crash when switch between HTTP-FLV streams. 频繁切换http-flv拉流导致srs进程退出" Jan 2, 2023
@winlinvip winlinvip changed the title from "FLV: Crash when switch between HTTP-FLV streams. 频繁切换http-flv拉流导致srs进程退出" to "FLV: Crash when switch between HTTP-FLV streams. Frequent switching of HTTP-FLV streaming leads to the termination of the SRS process." Jul 28, 2023
@winlinvip winlinvip added the TransByAI Translated by AI/GPT. label Jul 28, 2023
@winlinvip winlinvip changed the title from "FLV: Crash when switch between HTTP-FLV streams. Frequent switching of HTTP-FLV streaming leads to the termination of the SRS process." to "FLV: Crash when switch between HTTP-FLV streams." Apr 22, 2024
@winlinvip
Member

winlinvip commented Apr 22, 2024

Similar one, see #3784 (comment)

See also #511 #1784 #1829 #2419 #3784


4 participants