Clustering #7

Open
trevorsummerssmith opened this issue Oct 13, 2012 · 1 comment
@trevorsummerssmith
Owner

Eric, fill out some random thoughts.

@ghost ghost assigned epurdy Oct 13, 2012
@epurdy
Collaborator

epurdy commented Oct 14, 2012

The main difficulty with clustering is figuring out an intelligible representation of a cluster. We want to be able to look at a cluster that contains maybe 25% of all the vertices, and have some idea what its "deal" is.

This basically means having some sort of domain-specific "summarization" operators.

OR: this is a weirder idea, but you could try "summarizing by sampling": show a handful of random examples from a given cluster. You can then be pretty sure that intelligible clusters will "look right" most of the time. Unintelligible clusters will probably at least look unintelligible, because the user won't be able to detect any pattern in the random examples shown.
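
A rough sketch of what "summarizing by sampling" could look like, in Python. Everything here is an assumption for illustration: the function name `sample_cluster_examples`, the idea that cluster assignments arrive as a dict mapping vertex id to cluster id, and the default of 5 examples per cluster. None of it comes from the project's code.

```python
import random
from collections import defaultdict


def sample_cluster_examples(labels, k=5, seed=None):
    """Return up to k randomly chosen members from each cluster.

    labels: dict mapping vertex id -> cluster id (hypothetical input format).
    k:      number of examples to show per cluster (assumed default).
    seed:   optional seed so the same examples can be shown again.
    """
    rng = random.Random(seed)

    # Group vertices by the cluster they were assigned to.
    members = defaultdict(list)
    for vertex, cluster in labels.items():
        members[cluster].append(vertex)

    # Draw a uniform sample without replacement from each cluster.
    # Clusters smaller than k are returned in full (in random order).
    return {
        cluster: rng.sample(vertices, min(k, len(vertices)))
        for cluster, vertices in members.items()
    }


if __name__ == "__main__":
    # Toy data: cluster "A" holds 25% of the vertices, "B" the rest.
    labels = {f"v{i}": ("A" if i % 4 == 0 else "B") for i in range(40)}
    for cluster, examples in sorted(sample_cluster_examples(labels, k=3).items()):
        print(cluster, examples)
```

Sampling uniformly without replacement keeps the examples representative of the cluster rather than of whatever happens to be listed first, which is what lets a missing pattern count as evidence that the cluster is unintelligible.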
