Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding Apache Solr Support #574

Merged
merged 1 commit into from
Jan 12, 2016
Merged

Conversation

ghaughian
Copy link
Contributor

Since ElasticSearch is supported I thought it would be beneficial to have Solr supported also. I'm sure there are people out there who would love to compare head-to-head these 2 search engine technologies.

external zookeeper cluster and an appropriate collection has been created.
Make sure to pass the following properties as parameters to 'bin/ycsb' script.

cloud.mode=true
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to clash with above, is the property cloud.mode or solr.cloud, The latter is prefered as it refers to the binding it belongs to in the name.

@kruthar
Copy link
Collaborator

kruthar commented Jan 6, 2016

Can you update all the license headers to include 2016?


if (cloudMode) {
cloudClient.add(table, doc);
cloudClient.commit(table);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explicit commits for each record will result in bad performance. Typically commits should be done within a certain amount of time or have the Solr server do auto soft/hard commits. This is different than a standard DB. It should be made configurable?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the default in ycsb, to allow comparisons across systems, is to do a commit per record. you are correct that this is terrible for most systems.

other bindings make it configurable (hbase, cassandra, mongo, etc) and then just default to commit-per-transaction with instructions in their README for folks who can allow for client side batching.

* discussion of error codes.
*/
@Override
public Status update(String table, String key, HashMap<String, ByteIterator> values) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A thought here: Solr supports updating documents without doing a query/update manually. https://cwiki.apache.org/confluence/display/solr/Updating+Parts+of+Documents and here how to do it with SolrJ http://yonik.com/solr/atomic-updates/

@busbey
Copy link
Collaborator

busbey commented Jan 6, 2016

It would be helpful if, before this gets merged, you could squash all the commits into a single commit and then include in the commit message "[solr]" at the beginning.

@@ -139,6 +139,11 @@ LICENSE file.
<artifactId>s3-binding</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: whitespace

@busbey
Copy link
Collaborator

busbey commented Jan 9, 2016

Does Solr have sufficient tooling that we could include a test?

Anything else, @kruthar, @madrob, @risdenk?


HashMap<String, ByteIterator> entry;

for (SolrDocument hit : response.getResults()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: pull response.getResults() into a variable? multiple calls to getResults.

@busbey
Copy link
Collaborator

busbey commented Jan 12, 2016

Looks good to me. Could you please squash the commits and make the commit message start with "[solr]"?

@ghaughian
Copy link
Contributor Author

Awesome, thanks! Commits have been squashed into 1 commit

@busbey
Copy link
Collaborator

busbey commented Jan 12, 2016

If folks want to take a final look, I'll be merging this in ~5-6 hours barring feedback.

Thanks for all your work on this @ghaughian!

@risdenk
Copy link
Collaborator

risdenk commented Jan 12, 2016

Looks good to me. Thanks @ghaughian

@risdenk
Copy link
Collaborator

risdenk commented Jan 12, 2016

Does Solr have sufficient tooling that we could include a test?

@busbey I have some working tests for this that I just created PR #583 for

@risdenk
Copy link
Collaborator

risdenk commented Jan 12, 2016

@ghaughian
Copy link
Contributor Author

@risdenk addressing those now; thanks.

updating readme

updating package info

perfecting logic for http solr clients for all operations

renamed properties, tested cloud mode and cleaned code

removed dependency on dynamic field names, updated readme

now enforcing checkstyle

adding solr artifact

removing test cases relying on external dependencies

removed unused maven dependencies, added batch mode support, all try blocks now catch eplicit exceptions, Query/UpdateResponse status codes are handled more granularly, updated readme, added sample schema.xml file to support default field names in ycsb client, updated all license headers to 2016, using SolrClient object as primary client type regardless if Solr is running in Cloud or Stand-alone mode

cleaned code and config files, now accepting a solr base url property, simplified sample schema.xml file, renamed class to SolrClient, now updating documents atomically, added batch support to delete method

updated new line spacing of pom file comments

removed sample schema file, updated readme with more indepth explanation on running/setting up the solr-binding

removed some code lines no longer in use

renamed zookeeper param name, now throwing caught exceptions where appropriate, debug messages are now being logged on stderr

now returning an appropriate error if we receive an unexpected response from solr server, repeated calls to getResults is no longer

now using singletonMap to store update params in, fixed typo and missing id field in sample config in README
busbey added a commit that referenced this pull request Jan 12, 2016
[solr] Adding Apache Solr Support
@busbey busbey merged commit 6d111d9 into brianfrankcooper:master Jan 12, 2016
@risdenk risdenk mentioned this pull request Feb 15, 2016
jaricftw pushed a commit to jaricftw/YCSB that referenced this pull request Jul 19, 2016
jaricftw pushed a commit to jaricftw/YCSB that referenced this pull request Jul 19, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants