Adding Apache Solr Support #574

ghaughian · 2016-01-06T10:50:34Z

Since ElasticSearch is already supported, I thought it would be beneficial to support Solr as well. I'm sure there are people out there who would love to compare these two search engines head to head.

kruthar · 2016-01-06T14:09:17Z

solr/README.md

+external zookeeper cluster and an appropriate collection has been created.
+Make sure to pass the following properties as parameters to 'bin/ycsb' script.
+
+	cloud.mode=true


This seems to clash with the above: is the property cloud.mode or solr.cloud? The latter is preferred, since its name refers to the binding it belongs to.
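To illustrate the naming point, here is a minimal sketch of reading a binding-prefixed flag from a `java.util.Properties` object. The class name `SolrPropertyExample` and the helper `isCloudMode` are hypothetical, and `solr.cloud` is simply the name preferred in the comment above, not necessarily what the binding ended up using.

```java
import java.util.Properties;

/** Hypothetical helper for illustration; not the binding's actual code. */
final class SolrPropertyExample {
  // Assumed property name: prefixed with the binding name, as the review suggests.
  static final String CLOUD_PROPERTY = "solr.cloud";

  /** Returns true only when the property is set to "true" (case-insensitive). */
  static boolean isCloudMode(Properties props) {
    return Boolean.parseBoolean(props.getProperty(CLOUD_PROPERTY, "false"));
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty(CLOUD_PROPERTY, "true");
    System.out.println("cloud mode: " + isCloudMode(props));
  }
}
```

With a prefixed name, a property passed on the 'bin/ycsb' command line cannot be confused with a flag belonging to another binding.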

kruthar · 2016-01-06T15:54:05Z

Can you update all the license headers to include 2016?

risdenk · 2016-01-06T15:58:40Z

solr/src/main/java/com/yahoo/ycsb/db/SolrClient.java

+
+      if (cloudMode) {
+        cloudClient.add(table, doc);
+        cloudClient.commit(table);


Explicit commits for each record will result in bad performance. Typically, commits should be done within a certain amount of time, or the Solr server should do auto soft/hard commits. This is different from a standard DB. Should it be made configurable?

the default in ycsb, to allow comparisons across systems, is to do a commit per record. you are correct that this is terrible for most systems.

other bindings make it configurable (hbase, cassandra, mongo, etc) and then just default to commit-per-transaction with instructions in their README for folks who can allow for client side batching.
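A rough sketch of what a configurable commit policy could look like, under stated assumptions: `BatchingWriter`, `Sink`, and `CountingSink` are hypothetical names, and `Sink` stands in for whatever client the binding uses (in the diff above, the `add` and `commit` calls on `cloudClient`). A batch size of 1 reproduces the commit-per-record default discussed here.

```java
/** Hypothetical sketch of a configurable commit policy; all names are illustrative. */
final class BatchingWriter {
  /** Stand-in for the store client; a real binding would call its own client here. */
  interface Sink {
    void add(String doc);
    void commit();
  }

  /** Test double that only counts calls. */
  static final class CountingSink implements Sink {
    int adds = 0;
    int commits = 0;
    @Override public void add(String doc) { adds++; }
    @Override public void commit() { commits++; }
  }

  private final Sink sink;
  private final int batchSize; // 1 means commit after every record
  private int pending = 0;

  BatchingWriter(Sink sink, int batchSize) {
    if (batchSize < 1) {
      throw new IllegalArgumentException("batchSize must be >= 1");
    }
    this.sink = sink;
    this.batchSize = batchSize;
  }

  /** Adds one record and commits once batchSize records are pending. */
  void insert(String doc) {
    sink.add(doc);
    pending++;
    if (pending >= batchSize) {
      flush();
    }
  }

  /** Commits any pending records; intended to be called on cleanup as well. */
  void flush() {
    if (pending > 0) {
      sink.commit();
      pending = 0;
    }
  }

  public static void main(String[] args) {
    CountingSink sink = new CountingSink();
    BatchingWriter writer = new BatchingWriter(sink, 3);
    for (int i = 0; i < 7; i++) {
      writer.insert("doc" + i);
    }
    writer.flush(); // commit the final partial batch
    System.out.println("adds=" + sink.adds + " commits=" + sink.commits);
  }
}
```

The final `flush()` matters: without it, a partial batch left over at the end of a run would never be committed.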

risdenk · 2016-01-06T19:29:41Z

solr/src/main/java/com/yahoo/ycsb/db/SolrDBClient.java

+   *         discussion of error codes.
+   */
+  @Override
+  public Status update(String table, String key, HashMap<String, ByteIterator> values) {


A thought here: Solr supports updating documents without doing a query/update manually. See https://cwiki.apache.org/confluence/display/solr/Updating+Parts+of+Documents and, for how to do it with SolrJ, http://yonik.com/solr/atomic-updates/
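For reference, the shape of an atomic update can be sketched with plain collections: the id is sent as an ordinary value, and each changed field is sent as a one-entry map whose key is the `set` modifier. The class `AtomicUpdateExample`, the helper `buildAtomicUpdate`, and the field names `id` and `field0` are assumptions for illustration; with SolrJ, each entry would be handed to `SolrInputDocument.addField(name, value)` instead of a plain map.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration of the atomic-update document shape; not the binding's code. */
final class AtomicUpdateExample {
  /**
   * Builds the field map for an atomic update: the id field carries the key as-is,
   * every other field is wrapped as {"set": newValue}.
   */
  static Map<String, Object> buildAtomicUpdate(
      String idField, String key, Map<String, String> values) {
    Map<String, Object> doc = new LinkedHashMap<>();
    doc.put(idField, key);
    for (Map.Entry<String, String> entry : values.entrySet()) {
      // "set" replaces the stored value of this field and leaves other fields untouched.
      doc.put(entry.getKey(), Collections.singletonMap("set", entry.getValue()));
    }
    return doc;
  }

  public static void main(String[] args) {
    Map<String, String> values = new LinkedHashMap<>();
    values.put("field0", "new value"); // field name is illustrative
    System.out.println(buildAtomicUpdate("id", "user1", values));
  }
}
```

Because only the modified fields are sent, the client no longer needs to read the document before writing it back.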

busbey · 2016-01-06T19:32:47Z

It would be helpful if, before this gets merged, you could squash all the commits into a single commit and then include in the commit message "[solr]" at the beginning.

madrob · 2016-01-06T20:27:36Z

distribution/pom.xml

@@ -139,6 +139,11 @@ LICENSE file.
      <artifactId>s3-binding</artifactId>
      <version>${project.version}</version>
    </dependency>
+        <dependency>


nit: whitespace

busbey · 2016-01-09T15:01:21Z

Does Solr have sufficient tooling that we could include a test?

Anything else, @kruthar, @madrob, @risdenk?

risdenk · 2016-01-11T14:22:07Z

solr/src/main/java/com/yahoo/ycsb/db/SolrClient.java

+
+      HashMap<String, ByteIterator> entry;
+
+      for (SolrDocument hit : response.getResults()) {


nit: pull response.getResults() into a variable? multiple calls to getResults.
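A small sketch of the suggested change, using a stand-in response type so it runs without SolrJ. `HoistResultsExample`, `FakeResponse`, and `countNonEmpty` are hypothetical; the stand-in counts calls to `getResults()` so the effect of hoisting is visible.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of hoisting a repeated getter call into a local variable. */
final class HoistResultsExample {
  /** Stand-in for a query response; counts how often getResults() is called. */
  static final class FakeResponse {
    private final List<String> hits;
    int calls = 0;

    FakeResponse(List<String> hits) {
      this.hits = hits;
    }

    List<String> getResults() {
      calls++;
      return hits;
    }
  }

  /** Reads the results once, then reuses the local for both the check and the loop. */
  static int countNonEmpty(FakeResponse response) {
    List<String> results = response.getResults();
    if (results.isEmpty()) {
      return 0;
    }
    int count = 0;
    for (String hit : results) {
      if (!hit.isEmpty()) {
        count++;
      }
    }
    return count;
  }

  public static void main(String[] args) {
    FakeResponse response = new FakeResponse(Arrays.asList("a", "", "b"));
    int count = countNonEmpty(response);
    System.out.println("non-empty hits: " + count + ", getResults() calls: " + response.calls);
  }
}
```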

busbey · 2016-01-12T12:50:53Z

Looks good to me. Could you please squash the commits and make the commit message start with "[solr]"?

ghaughian · 2016-01-12T13:03:41Z

Awesome, thanks! Commits have been squashed into 1 commit

busbey · 2016-01-12T13:40:24Z

If folks want to take a final look, I'll be merging this in ~5-6 hours barring feedback.

Thanks for all your work on this @ghaughian!

risdenk · 2016-01-12T14:44:17Z

Looks good to me. Thanks @ghaughian

risdenk · 2016-01-12T17:26:00Z

> Does Solr have sufficient tooling that we could include a test?

@busbey I have some working tests for this that I just created PR #583 for

risdenk · 2016-01-12T18:14:08Z

@ghaughian - There are a few comments from @madrob on PR #583 that should be addressed here. They include:

ghaughian · 2016-01-12T18:16:30Z

@risdenk addressing those now; thanks.

updating readme
updating package info
perfecting logic for http solr clients for all operations
renamed properties, tested cloud mode and cleaned code
removed dependency on dynamic field names, updated readme
now enforcing checkstyle
adding solr artifact
removing test cases relying on external dependencies
removed unused maven dependencies, added batch mode support, all try blocks now catch explicit exceptions, Query/UpdateResponse status codes are handled more granularly, updated readme, added sample schema.xml file to support default field names in ycsb client, updated all license headers to 2016, using SolrClient object as primary client type regardless of whether Solr is running in Cloud or Stand-alone mode
cleaned code and config files, now accepting a solr base url property, simplified sample schema.xml file, renamed class to SolrClient, now updating documents atomically, added batch support to delete method
updated new line spacing of pom file comments
removed sample schema file, updated readme with more in-depth explanation on running/setting up the solr-binding
removed some code lines no longer in use
renamed zookeeper param name, now throwing caught exceptions where appropriate, debug messages are now being logged on stderr
now returning an appropriate error if we receive an unexpected response from solr server, repeated calls to getResults is no longer
now using singletonMap to store update params in, fixed typo and missing id field in sample config in README

[solr] Adding Apache Solr Support

kruthar reviewed Jan 6, 2016

risdenk reviewed Jan 6, 2016

madrob reviewed Jan 6, 2016

risdenk reviewed Jan 11, 2016

ghaughian force-pushed the master branch from 5298c71 to f77289e on January 12, 2016 13:01

risdenk mentioned this pull request Jan 12, 2016

[solr] Added tests for Solr binding #583

Merged

ghaughian force-pushed the master branch from f77289e to fc7cc57 on January 12, 2016 18:34

busbey added a commit that referenced this pull request Jan 12, 2016

Merge pull request #574 from ghaughian/master

6d111d9

[solr] Adding Apache Solr Support

busbey merged commit 6d111d9 into brianfrankcooper:master Jan 12, 2016

risdenk mentioned this pull request Feb 15, 2016

Release version 0.7.0 #624

Closed

jaricftw pushed a commit to jaricftw/YCSB that referenced this pull request Jul 19, 2016

Merge pull request brianfrankcooper#574 from ghaughian/master

5d8c2d1

[solr] Adding Apache Solr Support

jaricftw pushed a commit to jaricftw/YCSB that referenced this pull request Jul 19, 2016

Merge pull request brianfrankcooper#574 from ghaughian/master

40a22cf

[solr] Adding Apache Solr Support
