Elasticsearch stores documents in JSON format and provides a REST interface for interacting with the datastore.
In this article, we will walk through basic CRUD operations on an Elasticsearch datastore using the following examples:
1. Index API – Index a document by providing document id
2. Index API – Index a document with auto-generated document id
3. Get API – Retrieve a document along with all fields
4. Get API – Retrieve a document along with specific fields
5. Delete API – Delete a document from datastore
6. Update API – Update the whole document
7. Update API – Update only partial document (adding new fields)
These operations fall under the document APIs, so named because they deal with individual documents. A convenient property of Elasticsearch is that we don’t need to create a database schema beforehand: we can start inserting data before defining any schema.
The schema equivalent in this context is the mapping. We can create a mapping that closely mirrors the JSON structure of the documents we want to insert. If we don’t provide a mapping, Elasticsearch infers the data type of every field of the JSON document at index time.
The database equivalent in this context is the index, and the table equivalent is the type. The examples below use the articles index and the _doc type.
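To make the type inference described above a little more concrete, here is a rough Python sketch of the kind of decision Elasticsearch makes when no mapping is supplied. It is only an approximation under default settings: the function name guess_dynamic_type and the likes field are made up for illustration, and date detection, arrays and nulls are deliberately left out.

```python
import json


def guess_dynamic_type(value):
    """Very rough approximation of the field type inferred for a JSON value.

    Assumption: default dynamic mapping settings. Date detection for
    strings, arrays and nulls are ignored in this sketch.
    """
    # bool must be checked before int, because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "text"
    if isinstance(value, dict):
        return "object"
    return "unknown"  # not handled by this sketch


# "likes" is a hypothetical field, added only to contrast a quoted and an unquoted number.
sample = json.loads('{"topic": "python", "views": "100", "likes": 100}')
for field, value in sample.items():
    print(field, "->", guess_dynamic_type(value))
```

One practical consequence: a value such as "views": "100" is a JSON string, so it would be treated as text rather than as a number unless a mapping says otherwise.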
First, we need to create an index that we can use in our examples.
Inserting documents into Elasticsearch is called indexing. So let’s first create our articles database, i.e. the articles index:
$ curl -XPUT '192.168.101.100:9200/articles?pretty'
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "articles"
}
A simple PUT request creates the articles index, and the response confirms this with "acknowledged" : true. We can now index our article documents into this index.
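For readers who prefer a script to curl, the same request could be prepared with Python’s standard library roughly as follows. This is a minimal sketch: the helper name build_create_index_request is an assumption of this example, and the request is only built, not sent, so it can be inspected without a running cluster.

```python
import urllib.request

BASE_URL = "http://192.168.101.100:9200"  # host and port taken from the article's examples


def build_create_index_request(index_name, base_url=BASE_URL):
    """Build (but do not send) the PUT request that creates an index."""
    url = f"{base_url}/{index_name}?pretty"
    return urllib.request.Request(url, method="PUT")


request = build_create_index_request("articles")
print(request.get_method(), request.full_url)

# To actually send it against a running cluster:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```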
When we index a document, Elasticsearch associates a document id with it. We can choose this id ourselves or leave it to Elasticsearch to generate one. Using this id, we can retrieve the JSON document later.
The following is our sample JSON document that we will use for the rest of the examples in this tutorial.
{
  "topic": "python",
  "title": "python tuples",
  "description": "practical operations with python tuples",
  "author": "santosh",
  "date": "1-1-2019",
  "views": "100"
}
Note: If you are new to Elasticsearch, this will get you started: How to Install and Configure Elasticsearch on Linux and Windows
1. Index API – Index a document by providing document id
In this example, we will index the document under the _doc type of the articles index, with document id 1:
curl -XPOST '192.168.101.100:9200/articles/_doc/1?pretty' -d '{"topic":"python","title": "python tuples","description": "practical operations with python tuples","author": "santosh","date": "1-1-2019","views" : "100"}' -H 'Content-Type: application/json'
The following is the output of the above command.
{
  "_index" : "articles",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}
The URL is made up of the host IP and port, then the index (articles), then the type within the index (_doc), and finally the document id (1).
The response shows "result" : "created", so the document has been indexed with document id 1.
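The URL layout described above (host and port, index, type, document id) can be sketched in Python like this. The helper build_index_request is a hypothetical name for this example; it only assembles the request with the same body and Content-Type header as the curl command, and does not send it.

```python
import json
import urllib.request


def build_index_request(base_url, index, doc_type, doc_id, document):
    """Build (but do not send) a request that indexes `document` under `doc_id`."""
    url = f"{base_url}/{index}/{doc_type}/{doc_id}?pretty"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


article = {
    "topic": "python",
    "title": "python tuples",
    "description": "practical operations with python tuples",
    "author": "santosh",
    "date": "1-1-2019",
    "views": "100",
}

request = build_index_request("http://192.168.101.100:9200", "articles", "_doc", 1, article)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would send it to a running cluster.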
2. Index API – Index a document with auto-generated document id
If we don’t provide a document id while indexing, Elasticsearch will auto-generate one for the indexed document.
curl -XPOST '192.168.101.100:9200/articles/_doc/?pretty' -d '{"topic":"python","title": "python tuples","description": "practical operations with python tuples","author": "santosh","date": "1-1-2019","views" : "100"}' -H 'Content-Type: application/json'
The following is the output of the above command.
{
  "_index" : "articles",
  "_type" : "_doc",
  "_id" : "1zfK-2kBx40Oa0-N-vjk",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}
As before, the URL has the host IP and port, then the index, then the type within the index. This time we don’t provide a document id.
In the response, Elasticsearch returns the generated document id: "_id" : "1zfK-2kBx40Oa0-N-vjk".
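Since the generated id is only known from the response, a script usually needs to read it back before it can fetch or delete the document later. A minimal sketch, assuming the response body is available as text (extract_document_id is a name invented for this example):

```python
import json


def extract_document_id(response_text):
    """Return the `_id` from an index response, or None if nothing was created."""
    response = json.loads(response_text)
    if response.get("result") != "created":
        return None
    return response.get("_id")


# Abbreviated copy of the response shown in the article.
sample_response = (
    '{"_index": "articles", "_type": "_doc", "_id": "1zfK-2kBx40Oa0-N-vjk", '
    '"_version": 1, "result": "created"}'
)
print(extract_document_id(sample_response))
```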
3. Get API – Retrieve a document along with all fields
Using the Get API, we can retrieve documents from the Elasticsearch datastore. Documents are retrieved by document id. Let’s retrieve the document with id 1:
curl -XGET '192.168.101.100:9200/articles/_doc/1?pretty'
Output:
{
  "_index" : "articles",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "_seq_no" : 0,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "topic" : "python",
    "title" : "python tuples",
    "description" : "practical operations with python tuples",
    "author" : "santosh",
    "date" : "1-1-2019",
    "views" : "100"
  }
}
As you can see, the _source field contains the document exactly as we indexed it.
Let’s try to retrieve a document that does not exist:
$ curl -XGET '192.168.101.100:9200/articles/_doc/11?pretty'
{
  "_index" : "articles",
  "_type" : "_doc",
  "_id" : "11",
  "found" : false
}
The response shows "found" : false, which means the document does not exist.
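In a script, the found flag is the natural thing to branch on. The following sketch works on the response text only; the function name source_if_found is an assumption of this example. Elasticsearch typically also answers a request for a missing document with an HTTP 404 status, which urllib.request.urlopen would raise as an HTTPError, and that part is left out here.

```python
import json


def source_if_found(response_text):
    """Return the document's `_source` if it was found, otherwise None."""
    response = json.loads(response_text)
    if response.get("found"):
        return response["_source"]
    return None


# Abbreviated copies of the two responses shown in the article.
found_response = (
    '{"_index": "articles", "_type": "_doc", "_id": "1", "found": true, '
    '"_source": {"topic": "python", "title": "python tuples"}}'
)
missing_response = '{"_index": "articles", "_type": "_doc", "_id": "11", "found": false}'

print(source_if_found(found_response))
print(source_if_found(missing_response))
```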
4. Get API – Retrieve a document along with specific fields
In the next example, we will do a selective GET, i.e. we will request only certain fields of the document by listing them in the _source parameter:
curl -XGET '192.168.101.100:9200/articles/_doc/1?pretty&_source=topic,title,author'
Output:
{
  "_index" : "articles",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 3,
  "_seq_no" : 2,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "author" : "santosh",
    "topic" : "python",
    "title" : "python tuples"
  }
}
Only the three requested fields are returned in _source. This output was evidently captured after the updates in examples 6 and 7 below, which is why _version is 3 and _seq_no is 2.
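When the list of fields comes from a program rather than being typed by hand, the query string can be assembled with urllib.parse. This is a small sketch; build_get_url is a made-up helper name, and pretty=true is used as the explicit form of the pretty flag.

```python
from urllib.parse import urlencode


def build_get_url(base_url, index, doc_type, doc_id, fields=None):
    """Build a Get API URL, optionally limited to the given `_source` fields."""
    url = f"{base_url}/{index}/{doc_type}/{doc_id}"
    params = {"pretty": "true"}
    if fields:
        params["_source"] = ",".join(fields)
    # safe="," keeps the commas between field names readable instead of %2C.
    return f"{url}?{urlencode(params, safe=',')}"


print(build_get_url("http://192.168.101.100:9200", "articles", "_doc", 1,
                    fields=["topic", "title", "author"]))
```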
5. Delete API – Delete a document from datastore
Using the Delete API, we can delete a document from the Elasticsearch datastore. To delete a document, we need its document id:
curl -XDELETE '192.168.101.100:9200/articles/_doc/1zfK-2kBx40Oa0-N-vjk?pretty'
Output:
{
  "_index" : "articles",
  "_type" : "_doc",
  "_id" : "1zfK-2kBx40Oa0-N-vjk",
  "_version" : 2,
  "result" : "deleted",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "_seq_no" : 1,
  "_primary_term" : 1
}
The response shows "result" : "deleted", which means the document has been deleted.
Let’s try to delete a document that does not exist, by running the same command again:
$ curl -XDELETE '192.168.101.100:9200/articles/_doc/1zfK-2kBx40Oa0-N-vjk?pretty'
{
  "_index" : "articles",
  "_type" : "_doc",
  "_id" : "1zfK-2kBx40Oa0-N-vjk",
  "_version" : 3,
  "result" : "not_found",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "_seq_no" : 2,
  "_primary_term" : 1
}
This time the response shows "result" : "not_found".
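The two outcomes above can be told apart in a script by looking at the result field. The sketch below builds the DELETE request without sending it and classifies a response body; both helper names are assumptions of this example.

```python
import json
import urllib.request


def build_delete_request(base_url, index, doc_type, doc_id):
    """Build (but do not send) the DELETE request for one document."""
    url = f"{base_url}/{index}/{doc_type}/{doc_id}?pretty"
    return urllib.request.Request(url, method="DELETE")


def was_deleted(response_text):
    """True if the response reports "deleted", False otherwise (e.g. "not_found")."""
    return json.loads(response_text).get("result") == "deleted"


request = build_delete_request(
    "http://192.168.101.100:9200", "articles", "_doc", "1zfK-2kBx40Oa0-N-vjk"
)
print(request.get_method(), request.full_url)
print(was_deleted('{"_id": "1zfK-2kBx40Oa0-N-vjk", "result": "deleted"}'))
print(was_deleted('{"_id": "1zfK-2kBx40Oa0-N-vjk", "result": "not_found"}'))
```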
6. Update API – Update the whole document
Using this API, we can update an existing document stored in the Elasticsearch datastore. To update a document, we need its document id.
Let’s update the whole document, i.e. replace an existing document. We will replace the document stored with document id 1 by indexing a new document under the same id:
curl -XPOST '192.168.101.100:9200/articles/_doc/1?pretty' -d '{"topic":"python","title": "python tuples","description": "practical operations with python sets","author": "santosh","date": "11-11-2019","views" : "1000"}' -H 'Content-Type: application/json'
Output:
{
  "_index" : "articles",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 2,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "_seq_no" : 1,
  "_primary_term" : 1
}
View the updated document:
$ curl -XGET '192.168.101.100:9200/articles/_doc/1?pretty'
{
  "_index" : "articles",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 2,
  "_seq_no" : 1,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "topic" : "python",
    "title" : "python tuples",
    "description" : "practical operations with python sets",
    "author" : "santosh",
    "date" : "11-11-2019",
    "views" : "1000"
  }
}
As you can see, the document has been replaced with the new document and the version has been incremented to 2. The incremented version indicates that this document has been modified.
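The replace-and-increment behaviour described above can be illustrated with a toy in-memory model. This is not how Elasticsearch is implemented; it is only a sketch of the visible semantics, and index_document and the plain dictionary store are inventions of this example.

```python
def index_document(store, doc_id, body):
    """Toy model: store `body` under `doc_id`, replacing any previous document."""
    previous = store.get(doc_id)
    version = previous["_version"] + 1 if previous else 1
    result = "updated" if previous else "created"
    # The whole body is swapped in; nothing from the previous document is kept.
    store[doc_id] = {"_version": version, "_source": dict(body)}
    return result


store = {}
print(index_document(store, "1", {"title": "python tuples", "author": "santosh"}))  # created
print(index_document(store, "1", {"title": "python tuples", "views": "1000"}))      # updated
print(store["1"])
```

Note that in this model the author field disappears after the second call, because it was not part of the new body. That is the difference between replacing a document and partially updating it.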
7. Update API – Update only partial document (adding new fields)
In this example, we will add two new fields to the existing document. The new fields are wrapped in a "doc" object and sent to the _update endpoint:
curl -XPOST '192.168.101.100:9200/articles/_doc/1/_update?pretty' -d '{"doc":{"uniqueviews":"1789","reviewer":"RN"}}' -H 'Content-Type: application/json'
Output:
{
  "_index" : "articles",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 3,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "_seq_no" : 2,
  "_primary_term" : 1
}
View the updated document:
$ curl -XGET '192.168.101.100:9200/articles/_doc/1?pretty'
{
  "_index" : "articles",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 3,
  "_seq_no" : 2,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "topic" : "python",
    "title" : "python tuples",
    "description" : "practical operations with python sets",
    "author" : "santosh",
    "date" : "11-11-2019",
    "views" : "1000",
    "uniqueviews" : "1789",
    "reviewer" : "RN"
  }
}
In the above GET output, we can see that the two new fields (uniqueviews and reviewer) have been added to the document, while the existing fields are unchanged.
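The effect of a partial update can be pictured as merging the "doc" object into the stored document. The following is a toy sketch of that merge, not Elasticsearch code; apply_partial_update is a name made up for this example, and it assumes nested objects are merged recursively while all other values are overwritten.

```python
def apply_partial_update(source, partial):
    """Return a copy of `source` with `partial` merged in.

    Nested objects are merged recursively; any other value is overwritten.
    """
    merged = dict(source)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_partial_update(merged[key], value)
        else:
            merged[key] = value
    return merged


stored = {
    "topic": "python",
    "title": "python tuples",
    "description": "practical operations with python sets",
    "author": "santosh",
    "date": "11-11-2019",
    "views": "1000",
}
updated = apply_partial_update(stored, {"uniqueviews": "1789", "reviewer": "RN"})
print(updated)
```

Applied to the document from example 6, the merge yields the same eight fields that the GET output above shows.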