What is Elasticsearch? In simple terms, we could say that Elasticsearch is a NoSQL database. Since there are so many NoSQL databases, let us understand how Elasticsearch differs from them.
Introduction to Elasticsearch
Elasticsearch is a near-real-time, document-based, distributed NoSQL database, a full-text search engine, and a powerful analytics engine, all accessed through a REST API. The following are the key features of Elasticsearch.
1. Real Time: Inserting and retrieving data from Elasticsearch is very fast; this is called near-real-time data retrieval. It is useful in low-latency web applications that have a large amount of data to process.
2. Document Based: Elasticsearch is a schemaless database. It stores JSON documents without knowing the schema in advance. At run time it infers the mapping from the data that is inserted.
3. Distributed: Elasticsearch is a distributed, clustered database. Data is spread across multiple nodes to avoid a single point of failure. If one node goes down, the cluster can recover its data from the copies held on other nodes.
4. NoSQL database: Elasticsearch is a NoSQL database like MongoDB or Redis. It supports insertion and retrieval of JSON documents only.
5. Full text based search: Full-text search is an advanced way of finding occurrences of a term in documents without scanning each whole document. It works by storing an index of all the terms in the documents.
6. Analytics engine: Elasticsearch provides tools and APIs to analyze the stored documents. We can search for popular patterns, compute metrics, build reports, and feed powerful charting dashboards.
7. REST based APIs: Elasticsearch exposes REST-based APIs over HTTP to insert and retrieve data, using the GET, PUT, POST, and DELETE methods.
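The term index described in point 5 can be illustrated with a toy inverted index. This is only a minimal Python sketch of the general idea, not how Elasticsearch implements it internally; the function names (tokenize, build_index, search) and the sample documents are made up for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map every term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every term of the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample documents, keyed by document id.
docs = {
    1: "Elasticsearch is a distributed search engine",
    2: "Redis is an in-memory key-value store",
    3: "A search engine answers full text queries",
}
index = build_index(docs)
print(sorted(search(index, "search engine")))  # [1, 3]
```

The lookup touches only the entries for the query terms, which is why no document has to be scanned at search time.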
Download Elasticsearch
Download the latest version of the install package for your platform from the Elasticsearch download site.
The install process is the same for both Linux and Windows. On Windows, the package is a .zip file. On Linux, it is a .tar.gz file. Also, make sure the paths in the .yml file follow the correct syntax for your OS.
On Linux:
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.7.0.tar.gz
tar xvfz elasticsearch-6.7.0.tar.gz
On Windows:
C:\> unzip elasticsearch-6.7.0.zip
Archive: elasticsearch-6.7.0.zip
creating: elasticsearch-6.7.0/lib/
inflating: elasticsearch-6.7.0/lib/elasticsearch-6.7.0.jar
inflating: elasticsearch-6.7.0/lib/elasticsearch-x-content-6.6.2.jar
It will extract all the required JARs.
creating: elasticsearch-6.7.0/logs/
creating: elasticsearch-6.7.0/plugins/
It will create all the folders that Elasticsearch requires.
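If you would rather unpack the archive the same way on both platforms, the step can be scripted. The following is a rough Python sketch using only the standard library; extract_archive is a hypothetical helper name, and it should only be used on archives you trust, such as the official download.

```python
import tarfile
import zipfile
from pathlib import Path


def extract_archive(archive, dest):
    """Unpack a .tar.gz or .zip archive into dest and return dest."""
    archive = Path(archive)
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    name = archive.name.lower()
    if name.endswith((".tar.gz", ".tgz")):
        # Linux package: gzip-compressed tarball.
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(dest)
    elif name.endswith(".zip"):
        # Windows package: zip file.
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dest)
    else:
        raise ValueError(f"unsupported archive type: {archive.name}")
    return dest
```

For example, extract_archive("elasticsearch-6.7.0.tar.gz", ".") would unpack the Linux package into the current directory.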
The following are the minimum requirements to set up Elasticsearch.
Modify elasticsearch.yml file
Open the elasticsearch/config/elasticsearch.yml configuration file and edit the following configuration options:
vi elasticsearch-6.7.0/config/elasticsearch.yml
# Use a descriptive name for your cluster:
#
cluster.name: TGS_CLUSTER
# Use a descriptive name for the node:
#
node.name: TGS-1
# Path to directory where to store the data
# (separate multiple locations by comma):
#
path.data: /root/tgs/data
# Path to log files:
#
path.logs: /root/tgs/logs
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: 127.0.0.1
# Set a custom port for HTTP:
#
http.port: 9200
As shown in the above example, in the elasticsearch.yml file, set the following values appropriately:
- Cluster name
- Node name
- Data path
- Path to logs
- Bind Address
- Custom Port
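To double-check that all six values are actually set, a small script can read the file. The following is a rough Python sketch that assumes the settings are written as flat "key: value" lines, as in the example above; it is not a full YAML parser, and the names REQUIRED_KEYS, read_settings and missing_keys are hypothetical.

```python
REQUIRED_KEYS = [
    "cluster.name",
    "node.name",
    "path.data",
    "path.logs",
    "network.host",
    "http.port",
]


def read_settings(text):
    """Collect flat 'key: value' pairs, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so values such as C:\data survive.
        key, sep, value = line.partition(":")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def missing_keys(settings):
    """Return the required settings that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not settings.get(key)]


# Sample text with http.port deliberately left out.
sample = """\
# Use a descriptive name for your cluster:
cluster.name: TGS_CLUSTER
node.name: TGS-1
path.data: /root/tgs/data
path.logs: /root/tgs/logs
network.host: 127.0.0.1
"""
settings = read_settings(sample)
print(missing_keys(settings))  # ['http.port']
```

To check a real installation, pass the contents of elasticsearch-6.7.0/config/elasticsearch.yml to read_settings instead of the sample text.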
Modify Elasticsearch jvm.options
To set up the Java options, open the elasticsearch/config/jvm.options file and set the amount of memory to be allocated to the heap:
-Xms1g
-Xmx1g
- Xms represents the initial size of total heap space
- Xmx represents the maximum size of total heap space
Make sure your system has enough memory for the configured heap. Otherwise Elasticsearch will be very slow, may not start at all, or may throw exceptions after a while.
The remaining Java options are advanced. Editing them without a deep understanding can cause unexpected behavior.
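The heap settings can be sanity-checked against the machine's memory with a short script. The Python sketch below assumes the flags appear one per line in the -Xms/-Xmx form with a k, m, or g suffix; heap_settings and heap_fits are hypothetical helper names, and the amount of physical memory is supplied by the caller rather than detected (the figure used here is only an example value).

```python
import re

UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
# Matches lines such as -Xms1g or -Xmx512m.
FLAG = re.compile(r"^-(Xms|Xmx)(\d+)([kKmMgG])$")


def heap_settings(text):
    """Return a dict like {'Xms': bytes, 'Xmx': bytes} for the flags found."""
    sizes = {}
    for line in text.splitlines():
        match = FLAG.match(line.strip())
        if match:
            name, number, unit = match.groups()
            sizes[name] = int(number) * UNITS[unit.lower()]
    return sizes


def heap_fits(sizes, physical_bytes):
    """True when a maximum heap is set and is smaller than physical memory."""
    return "Xmx" in sizes and sizes["Xmx"] < physical_bytes


sizes = heap_settings("-Xms1g\n-Xmx1g\n")
print(sizes)  # {'Xms': 1073741824, 'Xmx': 1073741824}
print(heap_fits(sizes, 8521035776))  # True
```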
Start Elasticsearch
Execute the following command in a shell or the Windows command prompt, from the top-level Elasticsearch directory:
On Linux:
cd /root/elasticsearch-6.7.0
bin/elasticsearch
On Windows:
C:\elasticsearch-6.7.0> bin\elasticsearch
There will be a series of log messages, but the important part is the following:
[2019-03-23T12:32:23,196][INFO ][o.e.n.Node ] [TGS-1] initialized
[2019-03-23T12:32:23,196][INFO ][o.e.n.Node ] [TGS-1] starting ...
[2019-03-23T12:32:24,001][INFO ][o.e.t.TransportService ] [TGS-1] publish_address {127.0.0.1:9300}, bound_addresses {127.0.0.1:9300}
[2019-03-23T12:33:52,437][INFO ][o.e.c.s.MasterService ] [TGS-1] zen-disco-elected-as-master ([0] nodes joined), reason: new_master {TGS-1}{Q97KJ6A2QW2i0ehLL1nwWg} {nR6YOn0dTY-6VCn8yOrtEQ}{127.0.0.1}{127.0.0.1:9300} {ml.machine_memory=8521035776, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}
[2019-03-23T12:33:52,444][INFO ][o.e.c.s.ClusterApplierService] [TGS-1] new_master {TGS-1} {Q97KJ6A2QW2i0ehLL1nwWg}{nR6YOn0dTY-6VCn8yOrtEQ}{127.0.0.1}{127.0.0.1:9300} {ml.machine_memory=8521035776, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}, reason: apply cluster state (from master [master {TGS-1}{Q97KJ6A2QW2i0ehLL1nwWg} {nR6YOn0dTY-6VCn8yOrtEQ}{127.0.0.1}{127.0.0.1:9300} {ml.machine_memory=8521035776, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true} committed version [1] source [zen-disco-elected-as-master ([0] nodes joined)]])
[2019-03-23T12:33:52,627][INFO ][o.e.h.n.Netty4HttpServerTransport] [TGS-1] publish_address {127.0.0.1:9200}, bound_addresses {127.0.0.1:9200}
[2019-03-23T12:33:52,628][INFO ][o.e.n.Node] [TGS-1] started
In the above output, node TGS-1 has started, which means our Elasticsearch instance is up.
This setup contains only a single node. It is not a multi-node cluster.
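When the startup log is long, the check for the "started" entry can be scripted. Below is a minimal Python sketch that assumes the log format shown in this article; node_started is a hypothetical helper name, and the two sample lines are copied from the output above.

```python
import re


def node_started(log_text, node_name):
    """True if the log has an o.e.n.Node 'started' entry for node_name."""
    pattern = re.compile(
        r"\[o\.e\.n\.Node\s*\]\s*\[" + re.escape(node_name) + r"\]\s+started\s*$"
    )
    return any(pattern.search(line) for line in log_text.splitlines())


log = """\
[2019-03-23T12:32:23,196][INFO ][o.e.n.Node ] [TGS-1] starting ...
[2019-03-23T12:33:52,628][INFO ][o.e.n.Node] [TGS-1] started
"""
print(node_started(log, "TGS-1"))  # True
print(node_started(log, "TGS-2"))  # False
```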
Verify Elasticsearch Installation
Elasticsearch provides REST APIs to manage the cluster. Using curl or a web browser, we can check the state of the cluster.
Using Curl:
$ curl -XGET http://127.0.0.1:9200/
The following is the output of the above curl command:
{
  "name" : "TGS-1",
  "cluster_name" : "TGS-CLUSTER",
  "cluster_uuid" : "QHwc5GpmQbuc5Tjg3OpTjA",
  "version" : {
    "number" : "6.6.2",
    "build_flavor" : "default",
    "build_type" : "zip",
    "build_hash" : "3bd3e59",
    "build_date" : "2019-03-06T15:16:26.864148Z",
    "build_snapshot" : false,
    "lucene_version" : "7.6.0",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}
Alternatively, enter the link http://127.0.0.1:9200/ in a browser and press Enter. You will see the same output as above.
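The same check can be done from a program. The following is a minimal Python sketch using only the standard library; it assumes the node listens on 127.0.0.1:9200 without authentication, as configured in this article, and fetch_root and summarize are hypothetical helper names. The sample string is a shortened copy of the response shown above.

```python
import json
import urllib.request


def fetch_root(url="http://127.0.0.1:9200/", timeout=5):
    """GET the root endpoint and return the response body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def summarize(body):
    """Pick the node name, cluster name and version out of the JSON body."""
    info = json.loads(body)
    return {
        "node": info["name"],
        "cluster": info["cluster_name"],
        "version": info["version"]["number"],
    }


sample = '{"name": "TGS-1", "cluster_name": "TGS-CLUSTER", "version": {"number": "6.6.2"}}'
print(summarize(sample))
# {'node': 'TGS-1', 'cluster': 'TGS-CLUSTER', 'version': '6.6.2'}
```

Against a running node, summarize(fetch_root()) would report the same three fields for the live instance.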
Installing On Debian / Ubuntu
Installing from the Debian repository:
santosh@ubuntu:~$ sudo apt-get install elasticsearch
[sudo] password for santosh:
Reading package lists... Done
Building dependency tree
Reading state information... Done
It will pull the deb packages from the repositories and install Elasticsearch.
Follow the steps discussed above to configure and verify the Elasticsearch setup.
Elasticsearch does not start automatically after installation. You have to start it manually, or register it as a system service and use the service commands to run it.
The directory structure is different on Debian / Ubuntu.
Configuration files:
- /etc/elasticsearch/elasticsearch.yml
- /etc/elasticsearch/jvm.options
The default configuration settings are good to go, but if you want to change them, edit the configuration files at their respective paths and restart Elasticsearch.