Overview
The geopoint data set contains geo data from PlanetOSM, stored with the type geo_point. The purpose of the
benchmark is to evaluate the performance of different geo queries in Elasticsearch. We run the following variations (which
we call "challenges" in Rally):
- Append: Indexes the whole document corpus using Elasticsearch default settings. We only adjust the
number of replicas because we benchmark a single-node cluster and Rally will only start the benchmark once the cluster turns
green. Document ids are unique, so all index operations are append-only. After that, a couple of queries are run in
parallel by multiple clients.
- Append Fast: Indexes the whole document corpus using a setup that leads to a higher indexing
throughput than the default settings. Document ids are unique, so all index operations are append-only.
- Id Conflicts: Indexes the whole document corpus using a setup that leads to a higher indexing
throughput than the default settings. Rally produces duplicate ids in 25% of all documents (not configurable), so we
can simulate a scenario with appends most of the time and some updates in between.
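
To illustrate what the Id Conflicts challenge simulates, here is a minimal sketch of an id generator that reuses an already issued id with a probability of 25%. It is not Rally's actual implementation; the function name `generate_ids`, the id format, and the fixed seed are assumptions made for this example.

```python
import random


def generate_ids(num_docs, conflict_probability=0.25, seed=42):
    """Yield document ids, reusing an earlier id with the given probability.

    Hypothetical helper for illustration only, not part of Rally.
    """
    rng = random.Random(seed)
    seen = []      # ids issued so far
    next_id = 0    # counter for fresh ids
    for _ in range(num_docs):
        if seen and rng.random() < conflict_probability:
            # Conflict: reuse an existing id, which turns the index
            # operation into an update of that document.
            yield rng.choice(seen)
        else:
            # No conflict: issue a fresh id, which is a plain append.
            doc_id = f"{next_id:010d}"
            next_id += 1
            seen.append(doc_id)
            yield doc_id


ids = list(generate_ids(10_000))
duplicates = len(ids) - len(set(ids))
print(f"{duplicates / len(ids):.1%} of the index operations hit an existing id")
```

With unique ids (the Append challenges), `duplicates` would be zero and every operation would be an append.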
The benchmarks are run either for an out-of-the-box configuration of Elasticsearch ("default settings") or with a larger heap
of 4 GB ("4g heap"). For more details, please refer to the geopoint track specification and
have a look at our benchmarking methodology.
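
As a rough illustration of the setup described in this overview, the sketch below builds an index definition with a geo_point field and a sample geo query as plain Python dictionaries. The field name `location`, the replica count of 0, and the choice of a `geo_distance` query with these coordinates are assumptions for this example; the real mapping and queries are defined in the track specification.

```python
import json

# Index definition: one geo_point field and no replicas, so that a
# single-node cluster can reach green status.
# The field name "location" and the value 0 are assumptions.
index_body = {
    "settings": {"index.number_of_replicas": 0},
    "mappings": {
        "properties": {
            "location": {"type": "geo_point"}
        }
    },
}

# One example of a geo query: all points within 200 km of a center.
# The distance and coordinates are arbitrary sample values.
geo_distance_query = {
    "query": {
        "geo_distance": {
            "distance": "200km",
            "location": {"lat": 52.52, "lon": 13.40},
        }
    }
}

print(json.dumps(index_body, indent=2))
print(json.dumps(geo_distance_query, indent=2))
```

These bodies could be sent to the create-index and search endpoints of Elasticsearch respectively; they are printed as JSON here so that the example does not depend on a running cluster.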