Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[elasticsearch] Bump version from 0.19.8 to 2.1.1 #552

Merged
merged 2 commits into from
Jan 6, 2016
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 8 additions & 0 deletions elasticsearch/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -80,3 +80,11 @@ otherwise you will run out of memory.
If you wish to change the default index name you can set the following property:

es.index.key=my_index_key

### Troubleshoot
If you encounter error messages such as :
"Primary shard is not active or isn't assigned is a known node."

Try removing /tmp/esdata/ folder.
rm -rf /tmp/esdata

9 changes: 8 additions & 1 deletion elasticsearch/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -29,9 +29,16 @@ LICENSE file.
<name>ElasticSearch Binding</name>
<packaging>jar</packaging>
<properties>
<elasticsearch-version>0.19.8</elasticsearch-version>
<elasticsearch-version>2.1.1</elasticsearch-version>
</properties>
<dependencies>
<dependency>
<!-- jna is supported in ES and will be used when provided
otherwise a fallback is used -->
<groupId>net.java.dev.jna</groupId>
<artifactId>jna</artifactId>
<version>4.1.0</version>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is jna actually getting used?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, elasticsearch supports jna, though it does have a fallback in case jna isn't supported.
I don't have a strong opinion on whether to keep it or not, so if you have a good reason to remove it I'm fine with that.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, so the normal client dependency will use jna if it's present but won't bring it in?

Including it is fine by me; a comment here I'm the pom about why we need to expressly include it will help avoid having it refactored out in the future.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes; I'll make a comment.

</dependency>
<dependency>
<groupId>com.yahoo.ycsb</groupId>
<artifactId>core</artifactId>
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -17,30 +17,33 @@

package com.yahoo.ycsb.db;

import static org.elasticsearch.common.settings.ImmutableSettings.*;
import static org.elasticsearch.common.settings.Settings.Builder;
import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder;
import static org.elasticsearch.index.query.FilterBuilders.rangeFilter;
import static org.elasticsearch.index.query.QueryBuilders.matchAllQuery;
import static org.elasticsearch.index.query.QueryBuilders.rangeQuery;
import static org.elasticsearch.node.NodeBuilder.nodeBuilder;


import com.yahoo.ycsb.ByteIterator;
import com.yahoo.ycsb.DB;
import com.yahoo.ycsb.DBException;
import com.yahoo.ycsb.Status;
import com.yahoo.ycsb.StringByteIterator;

import org.elasticsearch.action.admin.cluster.health.ClusterHealthRequest;
import org.elasticsearch.action.get.GetResponse;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.client.Requests;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.ImmutableSettings.Builder;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.elasticsearch.common.xcontent.XContentBuilder;
import org.elasticsearch.index.query.RangeFilterBuilder;
import org.elasticsearch.index.query.RangeQueryBuilder;
import org.elasticsearch.node.Node;
import org.elasticsearch.search.SearchHit;

import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.HashMap;
import java.util.Map.Entry;
import java.util.Properties;
Expand Down Expand Up @@ -70,7 +73,6 @@ public class ElasticSearchClient extends DB {
private Node node;
private Client client;
private String indexKey;

private Boolean remoteMode;

/**
Expand All @@ -79,7 +81,6 @@ public class ElasticSearchClient extends DB {
*/
@Override
public void init() throws DBException {
// initialize OrientDB driver
Properties props = getProperties();
this.indexKey = props.getProperty("es.index.key", DEFAULT_INDEX_KEY);
String clusterName =
Expand All @@ -90,13 +91,15 @@ public void init() throws DBException {
.parseBoolean(props.getProperty("elasticsearch.remote", "false"));
Boolean newdb =
Boolean.parseBoolean(props.getProperty("elasticsearch.newdb", "false"));
Builder settings = settingsBuilder().put("node.local", "true")
Builder settings = Settings.settingsBuilder()
.put("node.local", "true")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're trying to keep datastore directions defaulted to working with a remote datastore.

Instructions in the readme for bencarking with a local node is fine, but this change would xuse it to only work that way, right?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I'm missing context due to the diff view. Let me try loading in a non-mobile client....

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default behavior for ES was set to point to local datastore. If we want to change that behavior, I'd put it in a separate commit.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds reasonable to me.

.put("path.data", System.getProperty("java.io.tmpdir") + "/esdata")
.put("discovery.zen.ping.multicast.enabled", "false")
.put("index.mapping._id.indexed", "true")
.put("index.gateway.type", "none").put("gateway.type", "none")
.put("index.gateway.type", "none")
.put("index.number_of_shards", "1")
.put("index.number_of_replicas", "0");
.put("index.number_of_replicas", "0")
.put("path.home", System.getProperty("java.io.tmpdir"));

// if properties file contains elasticsearch user defined properties
// add it to the settings file (will overwrite the defaults).
Expand All @@ -118,12 +121,20 @@ public void init() throws DBException {
.split(",");
System.out.println("ElasticSearch Remote Hosts = "
+ props.getProperty("elasticsearch.hosts.list", DEFAULT_REMOTE_HOST));
TransportClient tClient = new TransportClient(settings);
TransportClient tClient = TransportClient.builder()
.settings(settings).build();
for (String h : nodeList) {
String[] nodes = h.split(":");
tClient.addTransportAddress(
new InetSocketTransportAddress(nodes[0],
Integer.parseInt(nodes[1])));
try {
tClient.addTransportAddress(new InetSocketTransportAddress(
InetAddress.getByName(nodes[0]),
Integer.parseInt(nodes[1])
));
} catch (NumberFormatException e) {
throw new IllegalArgumentException("Unable to parse port number.", e);
} catch (UnknownHostException e) {
throw new IllegalArgumentException("Unable to Identify host.", e);
}
}
client = tClient;
} else { // Start node only if transport client mode is disabled
Expand All @@ -132,6 +143,10 @@ public void init() throws DBException {
client = node.client();
}

//wait for shards to be ready
client.admin().cluster()
.health(new ClusterHealthRequest("lists").waitForActiveShards(1))
.actionGet();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this for a local ES cluster or a remote cluster?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both. For local node it requires a short period of time for the shards to be available, otherwise the benchmark operations would start to early and cause UnavailableShardsException. In remote datastore scenario, we should assume the remote shards are available, but if they aren't, benchmarking would fail either way.

if (newdb) {
client.admin().indices().prepareDelete(indexKey).execute().actionGet();
client.admin().indices().prepareCreate(indexKey).execute().actionGet();
Expand All @@ -150,7 +165,6 @@ public void cleanup() throws DBException {
if (!remoteMode) {
if (!node.isClosed()) {
client.close();
node.stop();
node.close();
}
} else {
Expand Down Expand Up @@ -318,10 +332,13 @@ public Status update(String table, String key,
public Status scan(String table, String startkey, int recordcount,
Set<String> fields, Vector<HashMap<String, ByteIterator>> result) {
try {
final RangeFilterBuilder filter = rangeFilter("_id").gte(startkey);
final RangeQueryBuilder rangeQuery = rangeQuery("_id").gte(startkey);
final SearchResponse response = client.prepareSearch(indexKey)
.setTypes(table).setQuery(matchAllQuery()).setFilter(filter)
.setSize(recordcount).execute().actionGet();
.setTypes(table)
.setQuery(rangeQuery)
.setSize(recordcount)
.execute()
.actionGet();

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like .setFilter() is missed here? is there a reason for that?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I goofed here. ES no longer has filter, so .setFilter() is deprecated.
with the new API, setQuery(matchAllQuery()).setFilter(filter) should be simplified to .setQuery(rangeQuery).
Good catch.

HashMap<String, ByteIterator> entry;

Expand Down