How to Install and Configure Elasticsearch on Linux and Windows

by Santosh Yadav on April 2, 2019

What is Elasticsearch? In simple terms, Elasticsearch is a NoSQL database. Since there are many NoSQL databases, let us look at how Elasticsearch differs from them.

Introduction to Elasticsearch

Elasticsearch is a real-time, document-based, distributed NoSQL database, a full-text search engine, and a powerful analytics engine, all exposed through REST APIs. The following are the key features of Elasticsearch.

1. Real Time: Inserting and retrieving data is fast enough to be called near real time; a newly indexed document typically becomes searchable within about one second. This suits low-latency web applications that process large amounts of data.

2. Document Based: Elasticsearch is a schema-less database. It stores JSON documents without needing the schema in advance; at run time it infers a mapping (field names and types) from the data inserted.

3. Distributed: Elasticsearch is a distributed, clustered database. Data is spread across multiple nodes, with replica copies, to avoid a single point of failure. If one node goes down, its data can be recovered from the other nodes.

4. NoSQL database: Elasticsearch is a NoSQL database, like MongoDB and Redis. It stores and retrieves data only as JSON documents.

5. Full text based search: Full-text search finds occurrences of a term across documents without scanning every document. It works by building an inverted index that maps each term to the documents containing it.

6. Analytics engine: Elasticsearch provides tools and APIs to analyze the stored documents. We can search for popular patterns, compute metrics, build reports, and feed data-charting dashboards.

7. REST based APIs: Elasticsearch exposes REST APIs over HTTP to insert and retrieve data, using the standard methods GET, PUT, POST, and DELETE.
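
As a rough illustration of the document-based and REST-based points above, the sketch below builds (but does not send) the HTTP requests that would index and then fetch one JSON document. The index name `articles`, the `_doc` type, the document ID, the field names, and both helper functions are hypothetical; the host and port are the ones configured later in this article.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:9200"  # host and port used in this article


def build_index_request(index, doc_id, document):
    """Build a PUT request that would store `document` under /index/_doc/doc_id."""
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/{index}/_doc/{doc_id}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def build_get_request(index, doc_id):
    """Build a GET request that would fetch the same document back."""
    return urllib.request.Request(
        url=f"{BASE_URL}/{index}/_doc/{doc_id}", method="GET"
    )


# Hypothetical document; no schema (mapping) is declared beforehand.
article = {"title": "Install Elasticsearch", "views": 42, "published": "2019-04-02"}

put_req = build_index_request("articles", 1, article)
get_req = build_get_request("articles", 1)

print(put_req.get_method(), put_req.full_url)
print(get_req.get_method(), get_req.full_url)
# To actually send a request against a running node:
#   urllib.request.urlopen(put_req)
```

Nothing is sent over the network here, so the sketch can be run before Elasticsearch is installed; once a node is up, passing either request to `urllib.request.urlopen` performs the real call.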

Download Elasticsearch

Download the latest version of the install package for your platform from the Elasticsearch download site.

The install process is the same for both Linux and Windows. On Windows, the package is a .zip file; on Linux, it is a .tar.gz file. Also, make sure the paths in the .yml file follow the correct syntax for your OS.

On Linux:

wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.7.0.tar.gz

tar xvfz elasticsearch-6.7.0.tar.gz

On Windows:

C:\> unzip elasticsearch-6.7.0.zip
Archive:  elasticsearch-6.7.0.zip
creating: elasticsearch-6.7.0/lib/
inflating: elasticsearch-6.7.0/lib/elasticsearch-6.7.0.jar
inflating: elasticsearch-6.7.0/lib/elasticsearch-x-content-6.7.0.jar

This extracts all the required JARs.

creating: elasticsearch-6.7.0/logs/
creating: elasticsearch-6.7.0/plugins/

It also creates all the folders Elasticsearch requires.
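
The platform difference described above can be captured in a few lines. The sketch below picks the archive name from the operating system; the function names are hypothetical, and the Windows URL is an assumption that the zip is published under the same download path as the .tar.gz used in the wget command.

```python
import platform

VERSION = "6.7.0"  # version used in this article
BASE_URL = "https://artifacts.elastic.co/downloads/elasticsearch"


def archive_name(system, version=VERSION):
    """Return the archive file name for an OS name as reported by platform.system()."""
    extension = "zip" if system == "Windows" else "tar.gz"
    return f"elasticsearch-{version}.{extension}"


def download_url(system, version=VERSION):
    """Build the full download URL for the given OS name."""
    return f"{BASE_URL}/{archive_name(system, version)}"


# Print the URL that matches the machine running this script.
print(download_url(platform.system()))
```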

The following sections cover the minimum configuration required to set up Elasticsearch.

Modify elasticsearch.yml file

Open the config/elasticsearch.yml configuration file inside the extracted directory and edit the following options:

vi elasticsearch-6.7.0/config/elasticsearch.yml

# Use a descriptive name for your cluster:
#
cluster.name: TGS_CLUSTER

# Use a descriptive name for the node:
#
node.name: TGS-1

# Path to directory where to store the data 
# (separate multiple locations by comma):
#
path.data: /root/tgs/data

# Path to log files:
#
path.logs: /root/tgs/logs

# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: 127.0.0.1

# Set a custom port for HTTP:
#
http.port: 9200

As shown in the above example, in the elasticsearch.yml file, set the following values appropriately:

Cluster name (cluster.name)
Node name (node.name)
Data path (path.data)
Path to logs (path.logs)
Bind address (network.host)
Custom port (http.port)
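
One way to double-check these six settings before starting the node is to read them back from the file. The sketch below is a minimal, hypothetical checker: it only understands flat `key: value` lines like the ones shown above (not full YAML), and `REQUIRED_KEYS` and the function names are illustrative, not part of Elasticsearch.

```python
REQUIRED_KEYS = [
    "cluster.name", "node.name", "path.data",
    "path.logs", "network.host", "http.port",
]


def parse_flat_settings(text):
    """Parse flat `key: value` lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so values such as C:\es\data survive.
        key, sep, value = line.partition(":")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def missing_settings(settings):
    """Return the required keys that are absent or empty."""
    return [k for k in REQUIRED_KEYS if not settings.get(k)]


sample = """\
# Use a descriptive name for your cluster:
cluster.name: TGS_CLUSTER
node.name: TGS-1
path.data: /root/tgs/data
path.logs: /root/tgs/logs
network.host: 127.0.0.1
http.port: 9200
"""

settings = parse_flat_settings(sample)
print(missing_settings(settings))  # [] when all six settings are present
```

To check a real install, pass the contents of the actual elasticsearch.yml file to `parse_flat_settings` instead of the sample string.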

Modify Elasticsearch jvm.options

To set Java options, open the config/jvm.options file and set the amount of memory to allocate to the heap:

-Xms1g
-Xmx1g

-Xms sets the initial size of the heap.
-Xmx sets the maximum size of the heap.

Make sure your system has enough free memory for the configured heap; otherwise Elasticsearch will be very slow, may fail to start, or may throw exceptions after a while.
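
A rough way to pick the two values is sketched below. It follows Elastic's general guidance of giving the heap no more than half of physical RAM and keeping it below roughly 32 GB, with -Xms and -Xmx set to the same value. The function name and the 31 GB cap are illustrative assumptions, not values from this article.

```python
def suggest_heap_options(total_ram_gb, cap_gb=31):
    """Return jvm.options heap lines for a machine with `total_ram_gb` of RAM.

    Uses half of physical RAM, at least 1 GB, and never more than `cap_gb`.
    """
    heap_gb = max(1, min(total_ram_gb // 2, cap_gb))
    # Initial and maximum heap are set equal so the heap is not resized at run time.
    return [f"-Xms{heap_gb}g", f"-Xmx{heap_gb}g"]


for ram in (2, 8, 128):
    print(ram, "GB RAM ->", suggest_heap_options(ram))
```

For a machine with 2 GB of RAM this yields the same 1 GB values shown above.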

The remaining Java options are advanced; editing them without a deep understanding can cause unexpected behavior.

Start Elasticsearch

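Execute the following command in a shell or Windows command prompt, from the top-level Elasticsearch directory. Note that on Linux, Elasticsearch refuses to start when run as the root user, so run it as a regular user that can access the install, data, and log directories, and adjust the /root paths used in this article accordingly.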

On Linux:

cd /root/elasticsearch-6.7.0
bin/elasticsearch

On Windows:

C:\elasticsearch-6.7.0> bin\elasticsearch

The startup prints a series of log lines; the important part is the following:

[2019-03-23T12:32:23,196][INFO ][o.e.n.Node               ] [TGS-1] initialized
[2019-03-23T12:32:23,196][INFO ][o.e.n.Node               ] [TGS-1] starting ...
[2019-03-23T12:32:24,001][INFO ][o.e.t.TransportService   ] [TGS-1] publish_address {127.0.0.1:9300}, bound_addresses {127.0.0.1:9300}
[2019-03-23T12:33:52,437][INFO ][o.e.c.s.MasterService    ] [TGS-1] zen-disco-elected-as-master ([0] nodes joined), reason: new_master {TGS-1}{Q97KJ6A2QW2i0ehLL1nwWg}
    {nR6YOn0dTY-6VCn8yOrtEQ}{127.0.0.1}{127.0.0.1:9300}
    {ml.machine_memory=8521035776, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}

[2019-03-23T12:33:52,444][INFO ][o.e.c.s.ClusterApplierService] [TGS-1] new_master {TGS-1}
    {Q97KJ6A2QW2i0ehLL1nwWg}{nR6YOn0dTY-6VCn8yOrtEQ}{127.0.0.1}{127.0.0.1:9300}
    {ml.machine_memory=8521035776, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true},
    reason: apply cluster state
    (from master [master {TGS-1}{Q97KJ6A2QW2i0ehLL1nwWg}
    {nR6YOn0dTY-6VCn8yOrtEQ}{127.0.0.1}{127.0.0.1:9300}
    {ml.machine_memory=8521035776, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true} committed version [1] source [zen-disco-elected-as-master ([0] nodes joined)]])

[2019-03-23T12:33:52,627][INFO ][o.e.h.n.Netty4HttpServerTransport]
    [TGS-1] publish_address {127.0.0.1:9200}, bound_addresses {127.0.0.1:9200}

[2019-03-23T12:33:52,628][INFO ][o.e.n.Node] [TGS-1] started

In the above output, the "[TGS-1] started" line means our Elasticsearch node is up.

This setup contains only a single node, so it does not yet provide the redundancy of a multi-node cluster.

Verify Elasticsearch Installation

Elasticsearch provides REST APIs to manage the cluster. Using curl or a web browser, we can check the state of the cluster.

Using Curl:

$ curl -XGET http://127.0.0.1:9200/


The following is sample output of the above curl command (this sample reports version 6.6.2; a 6.7.0 install reports its own version details):

{
  "name" : "TGS-1",
  "cluster_name" : "TGS_CLUSTER",
  "cluster_uuid" : "QHwc5GpmQbuc5Tjg3OpTjA",
  "version" : {
    "number" : "6.6.2",
    "build_flavor" : "default",
    "build_type" : "zip",
    "build_hash" : "3bd3e59",
    "build_date" : "2019-03-06T15:16:26.864148Z",
    "build_snapshot" : false,
    "lucene_version" : "7.6.0",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}

Alternatively, enter http://127.0.0.1:9200/ in a browser and press Enter; you will see the same output as above.
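
The same check can be scripted. The sketch below parses a response shaped like the one above and pulls out the node name and version; `summarize_root_response` and `fetch_root` are hypothetical helper names, and the sample JSON is abbreviated from the output shown earlier. The network call is defined but not invoked, since it needs a running node.

```python
import json
import urllib.request


def summarize_root_response(raw_json):
    """Extract node name, version number, and Lucene version from the root response."""
    info = json.loads(raw_json)
    return {
        "node": info["name"],
        "version": info["version"]["number"],
        "lucene": info["version"]["lucene_version"],
    }


def fetch_root(url="http://127.0.0.1:9200/", timeout=5):
    """Fetch the root endpoint of a running node (requires Elasticsearch to be up)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Abbreviated sample of the response shown above.
sample = """
{
  "name": "TGS-1",
  "version": {"number": "6.6.2", "lucene_version": "7.6.0"},
  "tagline": "You Know, for Search"
}
"""

summary = summarize_root_response(sample)
print(summary)
```

Against a live node, `summarize_root_response(fetch_root())` would report the values for that installation instead of the sample.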

Installing On Debian / Ubuntu

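Installing from the Debian repository (this assumes a repository that provides the elasticsearch package, such as Elastic's APT repository, is already configured):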

santosh@ubuntu:~$ sudo apt-get install elasticsearch
[sudo] password for santosh:
Reading package lists... Done
Building dependency tree
Reading state information... Done

This pulls the .deb packages from the repositories and installs Elasticsearch.

Follow steps similar to those discussed above to configure and verify the Elasticsearch setup.

Elasticsearch does not start automatically after installation. You have to start it manually, or register it as a system service and use service commands to run it.

The directory structure is different on Debian / Ubuntu.

Configuration files:

/etc/elasticsearch/elasticsearch.yml
/etc/elasticsearch/jvm.options

The default configuration settings are good to go, but if you want to modify them, edit the configuration files at the paths above and restart Elasticsearch.
