Missing control components #23
So, the schema does currently try to capture many of the control options you describe.

Dimmable lights are specified by adding a dimmer component, e.g. (adapted from Building 1 of UK-DALE):

```yaml
- type: light
  subtype: ceiling downlight
  original_name: kitchen_lights
  instance: 1
  meters: [8]
  components:
  - type: LED lamp
    count: 10
    manufacturer: Philips
    model: Dimmable MASTER LED 10W MR16 GU5.3 24degrees 2700K 12v
    nominal_consumption: {on_power: 10}
  - type: dimmer
    subtype: TRIAC
    number_of_dimmer_levels: 3
```

So, the things you suggested which are not yet handled by the schema are:
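To make the structure above concrete, here is a minimal sketch of how such an entry could be interrogated programmatically. The dict mirrors the YAML example (in practice NILM Metadata is YAML; a plain dict just avoids a parser dependency), and `is_dimmable` is a hypothetical helper, not part of NILM Metadata or NILMTK:

```python
# The dimmable-light entry above, mirrored as a plain Python dict.
light = {
    'type': 'light',
    'subtype': 'ceiling downlight',
    'original_name': 'kitchen_lights',
    'instance': 1,
    'meters': [8],
    'components': [
        {'type': 'LED lamp', 'count': 10,
         'manufacturer': 'Philips',
         'nominal_consumption': {'on_power': 10}},
        {'type': 'dimmer', 'subtype': 'TRIAC',
         'number_of_dimmer_levels': 3},
    ],
}

def is_dimmable(appliance):
    """An appliance is dimmable if any of its components is a dimmer."""
    return any(c['type'] == 'dimmer' for c in appliance.get('components', []))

print(is_dimmable(light))  # True
```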
Also, now that I think about it, perhaps light dimmers should really go into
Jack,
It's described in the docs, here: http://nilm-metadata.readthedocs.org/en/latest/dataset_metadata.html#appliance
At the start of my PhD, I was very excited about trying to model appliances at the level of individual components. My aim was to have a library of parameterised component models (heating elements, motors, etc.) and then appliances would be constructed as finite state machines. This approach is (briefly) described in my 2012 paper on Disaggregating Multi-State Appliances from Smart Meter Data. This is rather like the approach physicists use to model a system: they start from equations which describe the behaviour of the system and combine these equations to model their system of interest.

But I started to realise that, in 6-second data, it becomes very hard to see individual appliance components (e.g. in 1-second data you can see the overshoot-undershoot-stabilise pattern of power demand a motor shows when it's first turned on, but you can't see that pattern in 10-second data). I'm now much more interested in learning whole-appliance models from power data (both aggregate data and individual appliance data) rather than manually specifying appliance models. i.e. I'm now following a more 'machine learning' approach where we try to train a model on lots of data (tens of thousands of activations across as many appliances as possible).

With respect to the schema, I agree that we should try to model the real world. But I'm also eager to make the schema as easy as possible to use (it's currently too hard to use, IMHO). As far as I'm aware, you and I are the only people to make use of the

The schema does allow for components to be assembled into appliances. Does the schema work for your needs, or is there a specific modification which would allow you to better express your model?
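The component/finite-state-machine idea described above could be sketched as follows. This is a toy illustration only: the states, events, and power levels are invented, not drawn from any real appliance library.

```python
# A toy finite-state-machine appliance model, in the spirit of the
# component-based approach described above.
class ApplianceFSM:
    def __init__(self, states, transitions, initial):
        self.states = states            # state name -> power draw (W)
        self.transitions = transitions  # (state, event) -> next state
        self.state = initial

    def step(self, event):
        # Stay in the current state if the event has no defined transition.
        self.state = self.transitions.get((self.state, event), self.state)
        return self.states[self.state]  # power draw after the event

# A two-component 'washer': heating element, then motor (spin).
washer = ApplianceFSM(
    states={'off': 0, 'heat': 2000, 'spin': 500},
    transitions={('off', 'start'): 'heat',
                 ('heat', 'water_hot'): 'spin',
                 ('spin', 'done'): 'off'},
    initial='off',
)
print(washer.step('start'))      # 2000
print(washer.step('water_hot'))  # 500
```

As the comment thread notes, this style of hand-built model becomes hard to fit to low-rate (e.g. 6-second) data, which is part of why the learnt-from-data approach took over.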
Your 2012 paper pretty much describes my mindset, bearing in mind my data has a 1-second granularity. I think the metadata approach is better than the forms-based approach I was using in Java, as it is much more flexible.

In terms of what would suit my needs, that requires a little more thought. I certainly find the multiple files more of a hindrance than a help, as it's not always obvious where to look. Being unused to YAML, I was expecting to find all the options in the YAML files rather than in the documentation, which is why I couldn't find the control attribute. I guess ultimately I would like to build the metadata interactively, or pull it from a manufacturer's database, but both of those options are some way off. I will have a think and get back to you.
When analysing many iterations of data, how much will the underlying patterns come into better focus as the data is combined day upon day, even for larger-granularity data?
Well, one important thing for most NILM work is how well our models generalise to unseen appliances. And, as a general rule, you want as many training examples as possible to help generalise. You want a model which distils the universal 'essence' of each appliance, rather than the quirks of each individual make and model.
From my perspective as a consumer, not a utility, what I would like is an accurate log of what was on, when, for how long, and how much power it consumed, as a basis for aggregating by category; derived from a single point of electrical measurement, supported by data about what is in my house. Ideally I wouldn't want to model what is in an appliance, but pull that data from a manufacturer's web site (or preferably some central repository). In the absence of that I am happy to build the supporting data, and provide supervision to link electrical appliances to patterns of consumption.

As yet I am unsure how the toolkit uses the metadata to support disaggregation, or how the linking of patterns to names is made (some guidance on this would be appreciated). I am also unclear whether, given the components, I could create new composite appliances using components: and parent: in my building metadata rather than in the central metadata.

I envisage a layering-down approach, where the major appliances are identified first, then subtracted, allowing for further analysis of the residual power. Ultimately this is about changing habits or appliances to optimise consumption.

To answer your earlier question, I would like to see all the atomic components, including any missing control components, in a single components file. I would also like a way of describing the patterns of use to further aid recognition (in particular for components with a timer). I haven't seen how to use Prior or how training interacts with this; again, some guidance would be helpful.
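The 'layering down' idea described above can be sketched crudely as follows. Everything here is a simplification for illustration: appliance signatures are treated as constant on-powers, and the greedy subtraction is a stand-in for a real disaggregation algorithm, not NILMTK's actual method.

```python
# Crude sketch of 'layering down': subtract the largest known appliance
# signature first, then analyse the residual for smaller ones.
def peel_off(aggregate, signatures):
    """Greedily subtract constant appliance on-powers (W) per sample."""
    residual = list(aggregate)
    events = []
    # Largest appliances first, as in the layering-down description.
    for name, power in sorted(signatures.items(), key=lambda kv: -kv[1]):
        on = [i for i, w in enumerate(residual) if w >= power]
        for i in on:
            residual[i] -= power
        if on:
            events.append((name, on))
    return events, residual

agg = [0, 2500, 2600, 100, 0]           # invented aggregate samples (W)
sigs = {'kettle': 2400, 'lamp': 100}    # invented signatures
events, residual = peel_off(agg, sigs)
print(events)    # [('kettle', [1, 2]), ('lamp', [1, 2, 3])]
print(residual)  # [0, 0, 100, 0, 0]
```

The leftover residual is exactly the "further analysis of the residual power" step: whatever survives all known signatures is either noise or an unrecognised appliance.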
I have spotted control in the space heater definition (which I used for underfloor heating). I can see that timer might be a bit too generalised, as there are many types, such as
all of which result in different behaviours when viewed in the consumption data.
Whether in control or components, it would be worth thinking about how you would determine an appliance's behaviour patterns by interrogating the appliance metadata. For example
So the question is: what is the best way to represent this? @oliparson, you may have some thoughts on this thread, as well as Jack and I. Thinking about this is what caused me to introduce human habits in my own schema, such as
Humans represent manual control components, but also have patterns based on their habits, so any schema should be capable of integrating this kind of control as well. This may lean things towards identifying control separately, since a human is not a component of any particular appliance.

I have suddenly thought: a device is controllable by many means, including (normally, but not always) manually, so that should mostly be one of the control options.

Now that I understand how your metadata works, perhaps I should produce a schema for this in YAML; would this be of any use?
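For the sake of discussion, a control block along these lines might look as follows. To be clear, every field name here is hypothetical; none of this is part of the existing NILM Metadata schema:

```yaml
# Hypothetical sketch only: these fields are NOT part of NILM Metadata.
control:
  type: timer            # e.g. manual, timer, thermostat, sensor, remote
  subtype: mechanical    # e.g. mechanical, electronic, programmable
  operated_by: human     # human habit vs. automatic control
  settings:
    cycle_period: 24h
    on_windows:
      - {start: '06:30', end: '08:00'}
```

Separating `operated_by` from `type` is one way to capture the point above that a human is not a component of any particular appliance.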
Hi @gjwo,
So... the aim of many NILM researchers (and certainly my aim) is to develop an algorithm which can estimate when each large appliance is on, and how much energy it uses each time. And to do this from a single meter which measures the whole home's energy demand. In terms of inputs to the algorithm, I'm most interested in algorithms which require no additional input from the user. i.e. all the algorithm requires is the aggregate power data and it will try to figure out which appliances are present, when they are on, and how much energy they have used.
Pretty much all approaches that I am aware of learn almost everything from data (not metadata). Certainly none of the current crop of NILMTK algorithms use any of the metadata to guide disaggregation. To take a step back: a large change in artificial intelligence over the last, say, decade is the observation that you achieve excellent performance (often state of the art performance) when you learn pretty much everything from the data rather than hand-engineering features. For example, the current best-performing approaches for image classification ("is there a dog in that image?") learn almost all the relevant features from the data. Same for automatic speech recognition. Even more strikingly perhaps, there is now good evidence that if you want to build a machine translation tool (e.g. which can translate from French to English) then you probably don't want to invest huge amounts of engineering effort hand-engineering parse trees etc. Instead you should learn the entire thing from data. (If you want a good overview then take a look at the series of articles Nature Magazine published on May 27th on AI). My own research (which I suspect echos most peoples' research on this stuff) is focussed on learning as much as possible from data. That certainly includes the fine-grained 'signatures' of each appliance, as well as the longer-term temporal patterns (such as which appliances will be switched on when the family wakes up and starts cooking breakfast).
During training, we expose the disaggregation algorithm to labelled training data. This is usually in the form of individual appliance traces, along with their name. Hence the algorithm can learn one model per appliance name. During 'test' time, the algorithm tries to match its models to the aggregate data.

Zooming out again, you might ask 'why bother with all this metadata if no NILM algorithms yet make use of the metadata?'. Basically, I wanted NILM Metadata to be able to capture as much information as possible, even if there aren't yet uses for that data. Plus I wanted NILM Metadata to be of use beyond NILM (so maybe calling it 'NILM Metadata' was a mistake!). NILMTK certainly makes use of some of the metadata to group appliances.
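The train/test loop described above can be caricatured in a few lines. This is deliberately simplistic (a model is just a mean on-power, and matching is nearest-neighbour per sample); it is not how any NILMTK algorithm actually works, only an illustration of 'one model per appliance name, then match models to aggregate':

```python
# Caricature of NILM training and testing.
def train(labelled_traces):
    """Learn one model (here: mean on-power, W) per appliance name."""
    return {name: sum(w) / len(w) for name, w in labelled_traces.items()}

def disaggregate(aggregate, models):
    """Label each aggregate sample with the closest-matching model."""
    labels = []
    for sample in aggregate:
        name = min(models, key=lambda n: abs(models[n] - sample))
        labels.append(name if sample > 0 else None)
    return labels

models = train({'kettle': [2400, 2500, 2450],   # invented labelled traces
                'fridge': [90, 110, 100]})
print(disaggregate([0, 2480, 95], models))  # [None, 'kettle', 'fridge']
```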
I think we mostly try to detect maybe the top 5 appliances (in terms of energy consumption).
I'm not aware of any examples of Prior yet. My intention was that it would be used for defining probability distributions expressing, for example, when each appliance would be used each day (e.g. a toaster would be most likely to be used in the morning). These priors would mostly be learnt from data, not manually specified.
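One simple way such a prior could be learnt from data, rather than manually specified, is a normalised histogram over activation start hours. The toaster timestamps below are invented for illustration, and this is only one possible representation of a Prior:

```python
from collections import Counter

def hourly_prior(activation_hours):
    """Normalised histogram: P(appliance starts | hour of day)."""
    counts = Counter(activation_hours)
    total = sum(counts.values())
    return {h: counts[h] / total for h in range(24)}

# Hour of day of each observed toaster activation (invented data).
toaster_starts = [7, 7, 8, 7, 8, 9, 7]
prior = hourly_prior(toaster_starts)
print(max(prior, key=prior.get))  # 7 -- mornings, as expected for a toaster
```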
Again, I'd have a strong preference (in my own research) for learning habits from data, rather than manually defining them in a schema.

So, to zoom out again... my current feeling about NILM Metadata is that it's actually over-complicated for 90% of the cases I can think of. All the interesting detail that you describe (automatic control systems, continual versus discrete control, human habits, etc.) should be largely learnt from data, IMHO. If you really want a schema which describes lots of detail about human habits, control systems etc. then it might be best to fork NILM Metadata, and then you'd be free to pull your fork in whatever direction you want.
That's very kind. When I first started work on NILM Metadata, I carefully created a formal schema using JSON Schema, but it was a lot of effort to make large structural changes etc. Now that the schema is vaguely stable, it should be easier. But I'd still suggest that we should probably let the (informal) schema settle down a bit, especially while we're considering new features and a new simplification; and then it might be nice to re-try writing a formal schema. I made some notes a while ago about this.
I get the point about the AI / big data approach, and I have seen the strides Google Translate and others have made using that approach. I think what we have here is a bootstrap-loader issue (and yes, I am old enough to remember the whole switches, paper tape, disk sequence)! i.e. if we had enough knowledge from the data we wouldn't need metadata to help, but unfortunately we are not starting from there.

> During training, we expose the disaggregation algorithm to labelled training data. This is usually in the form of individual appliance traces, along with their name. Hence the algorithm can learn one model per appliance name. During 'test' time, the algorithm tries to match its models to the aggregate data.

How do you enter your labelled training data? Perhaps where we can come together on this is that I can think of metadata, including habits, as a way of boxing where a particular signature might be in the data, in the same way you are using sub-metering. Once that signature has been found and labelled (in order to interact with a human user), the code can supersede the metadata with discovered real data. In either case, without an underlying database of previously recognised and named appliances, there has to be some kind of interaction with a user to name appliances that have been found.
Ah, sorry, I forgot to mention: there are now over 10 public databases of labelled domestic electricity data. Some are quite large (I think Pecan Street has something on the order of 1000 homes). The aim would be to create NILM models which can generalise across houses, i.e. generalise to houses where we don't have labelled training data. The end result would be that you'd only have to squirt your aggregate data to the system and it would magically know which appliances are in there. (How achievable this is is still a matter for research!)
@gjwo you might be interested to read a bit about unsupervised/semi-supervised learning in the context of energy disaggregation. I recently wrote a blog post about some confusion with this definition, and also a paper on how we can learn generalisable models from databases of sub-metered data and apply them to homes with only aggregate data.
@oliparson Thanks, I had seen the blog post; I am not sure if I had seen the paper (having read dozens recently), so I will read it.
@JackKelly Thanks, I have found the lists in one of the @oliparson blogs (http://blog.oliverparson.co.uk/2012/06/public-data-sets-for-nialm.html), but I'm not sure if any of these are UK-based or applicable to the UK, whether I could access them, or, if I could, how to apply them in this context. But I had always envisaged that such things should be available from the manufacturers or certifiers of electrical equipment. Again, we are probably a couple of years off this being a solution. Still, this does help generalise the issue: there are many places that could source labelled training data, and perhaps the toolkit ought to be able to take it from any of these places:
These are all good ideas. I suspect we'll get enough data from the first two items on your list, and might not need detailed 'human-supplied' data. Not unless it can scale to 100s or 1000s of homes :)
When considering components, I think you are missing some components that are incorporated in many appliances and radically affect the fingerprints produced by those appliances, namely the control components. These should be in the components module and built into common combinations for use in appliances. Some of these may have been built into appliances at a less fundamental level. The main ones are
There may also be subclasses of these, such as
This would then give you
etc.