nsqd: ephemeral topics #305

Merged: 1 commit, Sep 22, 2014

mreiferson
Member

We have a few cases where we would like to stream actionable data in close to real time. The data in this case is less useful once it's in the past, and it is not going to be stored or saved. So if no channel is connected or configured on a MyCoolTopic#ephemeral, messages are just discarded until a channel exists (ephemeral or not).

Currently we have to run a channel to /dev/null so the ephemeral channel doesn't burst with old data if the client was offline for a bit.

Thoughts?

@mreiferson
Member

@dmarkham the more I think about the use cases where you would want this, I feel like you probably want the semantics of #302 (message TTLs) so that you can define the window of time that you want messages to be kept around (rather than relying on the effective TTL of the configured --mem-queue-size vs. incoming volume of an ephemeral topic)?

What do you think? They're two different ways to achieve the same end result.

@AndrewWDeane

I think that ephemeral topics would be really useful, especially for high-volume, quickly-stale data such as share prices. I think that, in these circumstances, it's the topic as a whole rather than individual messages that has a TTL.

I have taken a quick look at the code, and I think that it could be achieved relatively cheaply too by making 2 small mods to nsqd/topic.go (I may be missing something):

  • adding a flag onto the Topic and assigning it in the NewTopic constructor:
        t.ephemeral = strings.HasSuffix(t.name, "_ephemeral")
  • taking note of the flag in Topic.PutMessage:
        if !t.ephemeral || len(t.channelMap) > 0 {
                t.incomingMsgChan <- msg
                atomic.AddUint64(&t.messageCount, 1)
        }

Let me know if this makes sense and I'll finish off the work.

@mreiferson
Member

@AndrewWDeane thanks for weighing in.

I think there are a few semantic inconsistencies with treating an ephemeral topic as a "TTL on all its contents":

  1. You wouldn't have proper fine-grained control over the message TTL. It would be a function of incoming volume vs. the configured (global) nsqd --mem-queue-size.
  2. Ephemeral channels also imply client behavior (an ephemeral channel expects the client to disappear at some point). In your share price example, I don't believe we are talking about the same behavior for the producer (the equivalent actor). The share_price topic would always exist because there are always new share prices.
  3. If we did implement this, how do an ephemeral topic's channels behave? Are they all automatically ephemeral as well? Other options?

This is why I think message TTLs are more semantically appropriate for the "share price" type use cases.

As you've already demonstrated, ephemeral topics might be far easier to implement than message TTLs and they both achieve a similar end result. But, I want to nail down use cases and semantics to make sure we're getting those right before we move on to the "how".

NOTE: I'm not necessarily against implementing ephemeral topics, it's just important to play devil's advocate when we're discussing adding core features. I haven't yet heard a compelling argument where there is no alternative option that might be more appropriate.

@AndrewWDeane

The scenario I'm considering is quite specific but I think that there may be a general use case for a topic containing ephemeral data.

My specific case is this:

I have a "touch market price" producer that constantly publishes to a stock specific ticker topic, as orders flow in and the best bid/offer is calculated. The price messages are discarded until a consumer client registers interest. In my case I register consumers as limit orders are entered into the system, and consider subsequent price messages for order execution. Clients may deregister. As you say, this use case can be satisfied with a message TTL of zero, or in "user space" outside of the core engine.

BTW, I really disliked the "magic name" method I proposed for specifying an ephemeral topic; it should be a property, not a side effect of its name, but it was a cheap way to do it.

I've not been following the TTL discussion. Are you considering synchronising the clocks across nsqd instances?

One variant that I'm curious about is the possibility of having messages replace one another in the topic, so that only the latest message is resident. This latest message would then be delivered to the clients as they register, and then subsequent messages forwarded as they arrive. Again, I think that this too may fit into the message TTL model, with a TTL of "until replaced".

@mreiferson
Member

@AndrewWDeane got it, good stuff.

FWIW, this could all be accomplished "outside the core" with a similar approach to what I outlined in #307 (comment).

In this setup you could consolidate the "topic per ticker" into a single topic as @jehiah described in https://twitter.com/jehiah/status/435395075817086977.

In your case mux could also have the responsibility of "replacing" older messages with the most current - it would be a local in-memory cache for the last price of each ticker (and as such would be effectively bounded in memory footprint).
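
The "replace older messages with the most current" idea can be sketched as a last-value cache keyed by ticker. This is only an illustration of the data structure, assuming a hypothetical `Price` message shape; it does not show how such a mux would consume from or publish to NSQ.

```go
package main

import (
	"fmt"
	"sync"
)

// Price is a hypothetical message shape for the share-price example.
type Price struct {
	Ticker string
	Bid    float64
	Offer  float64
}

// LastValueCache keeps only the most recent Price per ticker, so its
// memory footprint is bounded by the number of distinct tickers.
type LastValueCache struct {
	mu     sync.RWMutex
	latest map[string]Price
}

func NewLastValueCache() *LastValueCache {
	return &LastValueCache{latest: make(map[string]Price)}
}

// Update replaces any previously stored price for the same ticker.
func (c *LastValueCache) Update(p Price) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.latest[p.Ticker] = p
}

// Latest returns the current price for ticker, if one has been seen.
func (c *LastValueCache) Latest(ticker string) (Price, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	p, ok := c.latest[ticker]
	return p, ok
}

// Len reports how many tickers are currently cached.
func (c *LastValueCache) Len() int {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return len(c.latest)
}

func main() {
	cache := NewLastValueCache()
	cache.Update(Price{Ticker: "ABC", Bid: 10.0, Offer: 10.2})
	cache.Update(Price{Ticker: "ABC", Bid: 10.1, Offer: 10.3}) // replaces the first
	p, _ := cache.Latest("ABC")
	fmt.Println(cache.Len(), p.Bid, p.Offer)
}
```

A newly registered client would be handed `Latest(ticker)` first and then receive subsequent updates as they arrive.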

@Marquis42

I'd like to go ahead and throw a vote towards having either ephemeral topics or a way to delete a topic through the protocol. I'm working on what amounts to RPC over NSQ and that would greatly simplify some aspects of what I'm doing.

@mreiferson
Member

@Marquis42 you can use the HTTP api to remove topics "automatically".
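
A minimal sketch of what that could look like from Go. It assumes the `POST /topic/delete?topic=<name>` endpoint and the default HTTP port 4151 described in current nsqd documentation; the path may differ in the release you run, so check the HTTP API docs for your version. The helper name and the topic name are hypothetical, and the request is only built here, not sent.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// deleteTopicRequest builds (but does not send) a request asking nsqd to
// delete a topic. The "/topic/delete" path is an assumption taken from
// current nsqd docs; verify it against the version you deploy.
func deleteTopicRequest(nsqdHTTPAddr, topic string) (*http.Request, error) {
	u := url.URL{
		Scheme:   "http",
		Host:     nsqdHTTPAddr,
		Path:     "/topic/delete",
		RawQuery: url.Values{"topic": {topic}}.Encode(),
	}
	return http.NewRequest(http.MethodPost, u.String(), nil)
}

func main() {
	req, err := deleteTopicRequest("127.0.0.1:4151", "rpc_reply_42")
	if err != nil {
		fmt.Println("building request failed:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
	// To send it for real: resp, err := http.DefaultClient.Do(req)
}
```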

I'm coming around on the idea of adding ephemeral topics. They would give feature parity with channels, serve as a lightweight topic-based TTL, and simplify the RPC use cases.

Given that it's pretty easy to implement, I might take a swing at it soon.

Thanks!

@mreiferson
Member

that was pretty painless...

RFR @jehiah

@jehiah
Member

jehiah commented Sep 22, 2014

I think i'm good enough here; there are a few things that are less than ideal, but i recognize this has some value.

Are there any special cases on the lookup side for ephemeral channels?

@mreiferson
Member

agreed and no, no special cases on the lookupd side

jehiah added a commit that referenced this pull request Sep 22, 2014
@jehiah jehiah merged commit c7d8a83 into nsqio:master Sep 22, 2014
@mreiferson mreiferson deleted the ephemeral_topics_305 branch September 22, 2014 19:56
Nomon pushed a commit to Nomon/nsq.js that referenced this pull request Mar 11, 2015
NSQ now has support for ephemeral topics in addition to ephemeral channels.
This change changes the regex to match that change.
nsqio/nsq#305