Overview
The geonames data set contains a lot of structured data. String fields are always indexed as text
with a raw keyword subfield. We run the following variations (which we call "challenges" in Rally):
- Append: Indexes the whole document corpus using Elasticsearch default settings. We only adjust the
number of replicas, because we benchmark a single-node cluster and Rally only starts the benchmark once the cluster
turns green. Document ids are unique, so all index operations are append-only. After indexing, a couple of queries are
run in parallel by multiple clients.
- Append Fast: Indexes the whole document corpus using a setup that yields higher indexing
throughput than the default settings. Document ids are unique, so all index operations are append-only.
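
The two properties the challenges above rely on, an adjusted replica count and unique document ids, can be sketched as plain request bodies. This is only an illustration, not the track's actual configuration: the index name `geonames`, the replica count of 0, the helper `bulk_body`, and the sample documents are assumptions; the authoritative values are in the track specification.

```python
import json

# Assumed index settings: on a single node, replica shards cannot be
# allocated, so zero replicas lets the cluster reach green status.
index_settings = {"settings": {"index": {"number_of_replicas": 0}}}


def bulk_body(docs, index_name="geonames"):
    """Build an NDJSON bulk body in which every document gets a distinct _id.

    Because no id is reused, each action creates a new document rather than
    overwriting an existing one, so the workload is append-only.
    """
    lines = []
    for doc_id, doc in enumerate(docs):
        # Action line, followed by the document source on the next line.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": str(doc_id)}}))
        lines.append(json.dumps(doc))
    # The bulk format requires a trailing newline.
    return "\n".join(lines) + "\n"


# Hypothetical sample documents, not taken from the real corpus.
sample_docs = [{"name": "Berlin"}, {"name": "Paris"}]
body = bulk_body(sample_docs)
print(body)
```

A body built this way could be sent to the bulk endpoint of a test cluster; reusing an id would instead turn the operation into an update of the existing document.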
The benchmarks are run either with an out-of-the-box configuration of Elasticsearch ("default settings") or with a larger heap
of 4GB ("4g heap"). For more details, please refer to the geonames track specification and
have a look at our benchmarking methodology. The benchmark results are also provided as a Kibana dashboard.
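
As a rough illustration of the string-field mapping described in the overview (text with a raw keyword subfield), the following sketch builds such a mapping as a Python dictionary. The helper `text_with_raw` and the field names are hypothetical; the actual field list is defined in the geonames track specification.

```python
import json


def text_with_raw(field_names):
    """Return a mapping where each field is text with a `raw` keyword subfield."""
    return {
        "properties": {
            name: {
                "type": "text",                          # analyzed, for full-text search
                "fields": {"raw": {"type": "keyword"}},  # exact value, for sorting and aggregations
            }
            for name in field_names
        }
    }


# Hypothetical field names, chosen only for illustration.
mapping = text_with_raw(["name", "country_code"])
print(json.dumps(mapping, indent=2))
```

With a mapping of this shape, a query can match analyzed text on `name` and aggregate or sort on the exact value through `name.raw`.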