- Developed in Java
- Fast reads
- Good for full-text search
Cluster
- Contains multiple nodes
Node
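master node:
- Decides which shards are allocated to which nodes
- Can run as a dedicated master node (master role only, holds no data)
coordinating node:
- Receives client requests, forwards them to the nodes holding the relevant shards, and merges the results
- Every node acts as a coordinating node by default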
data node:
- Store data
Shard
- Distributes an index's data evenly across nodes, similar to how a load balancer spreads traffic
Primary Shard
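- Like the master in a master/replica setup: handles writes first, then forwards them to its replicas
- Every shard is one Lucene index
- The number of primary shards is fixed when the index is created; changing it later requires reindexing (or the split/shrink APIs)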
Replica Shard
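- A copy of a primary shard (like a replica in a master/replica setup)
- Protects against data loss when the node holding the primary fails; can also serve reads
- The number of replica shards can be changed at any time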
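
The difference between the two shard counts can be sketched as request bodies. This is a minimal illustration, not a full client call: `number_of_shards` and `number_of_replicas` are the index setting names, while the values and the helper `total_shard_copies` are made up for the example.

```python
import json

# Body for creating an index: the primary shard count is chosen here.
create_index_body = {
    "settings": {
        "number_of_shards": 3,    # primary shards, fixed once the index exists
        "number_of_replicas": 1,  # replica copies per primary shard
    }
}

# Body for a later settings update: only the replica count is changed.
update_replicas_body = {"index": {"number_of_replicas": 2}}


def total_shard_copies(primaries: int, replicas: int) -> int:
    """Hypothetical helper: primaries plus all of their replica copies."""
    return primaries * (1 + replicas)


print(json.dumps(create_index_body, indent=2))
print(json.dumps(update_replicas_body))
print(total_shard_copies(3, 1))  # 6
```

With 3 primaries and 1 replica each, the cluster holds 6 shard copies in total; raising replicas to 2 makes it 9 without touching the primaries.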
Shard Allocation
- Shard rebalancing
- Low / high / flood-stage disk watermarks (defaults: 85% / 90% / 95% disk used)
- Limits on concurrent recoveries (from a snapshot) and relocations (from other nodes)
- Disable deleting indices by wildcard (data safety concern; setting: action.destructive_requires_name)
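
As a rough sketch of how the three disk watermarks relate, the function below classifies a node by its disk usage. The thresholds are the default percentages; the function name, the state labels, and the one-line descriptions are simplifications for illustration, not Elasticsearch's actual decision logic.

```python
# Default watermark thresholds, expressed as the fraction of disk used.
LOW, HIGH, FLOOD_STAGE = 0.85, 0.90, 0.95

# Simplified summary of what each stage means for a node.
EFFECT = {
    "ok": "shards can be allocated to this node",
    "low": "no new shards are allocated to this node",
    "high": "shards are relocated away from this node",
    "flood_stage": "indices with a shard on this node become read-only",
}


def disk_watermark_state(used_fraction: float) -> str:
    """Hypothetical helper: return the highest watermark that was exceeded."""
    if used_fraction > FLOOD_STAGE:
        return "flood_stage"
    if used_fraction > HIGH:
        return "high"
    if used_fraction > LOW:
        return "low"
    return "ok"


for used in (0.50, 0.87, 0.92, 0.97):
    state = disk_watermark_state(used)
    print(f"{used:.0%} used -> {state}: {EFFECT[state]}")
```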
Example: a cluster with three nodes. The index is split into 3 primary shards, stored evenly across the 3 nodes.
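
A document is assigned to a shard by hashing its routing value (the document ID by default) and taking the result modulo the number of primary shards. The sketch below shows that idea in simplified form. `pick_shard` is a hypothetical name, and `zlib.crc32` is only a stand-in for the hash Elasticsearch uses internally.

```python
import zlib


def pick_shard(routing_key: str, num_primary_shards: int) -> int:
    """Map a routing key to a shard number in [0, num_primary_shards)."""
    # crc32 is a stand-in hash chosen because it is stable across runs.
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


doc_ids = [f"doc-{i}" for i in range(9)]

# The same ID always lands on the same shard while the shard count is 3.
for doc_id in doc_ids:
    print(doc_id, "-> shard", pick_shard(doc_id, 3))

# With a different shard count, some IDs would map to a different shard.
moved = [d for d in doc_ids if pick_shard(d, 3) != pick_shard(d, 4)]
print("documents that would move with 4 shards:", moved)
```

Because the shard number depends on the shard count, changing the count would send lookups for existing documents to the wrong shard, which is why the number of primary shards is fixed at index creation.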
Index
- Container to store data
Inverted index
- Maps each token to the documents that contain it, which makes full-text search very fast
- Built when documents are indexed
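
A toy version of the idea, assuming whitespace tokenization and lowercasing only (real analyzers do considerably more). The function name and sample documents are made up for illustration.

```python
from collections import defaultdict


def build_inverted_index(docs: dict[int, str]) -> dict[str, set[int]]:
    """Map each token to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


docs = {
    1: "fast full text search",
    2: "search engines index text",
    3: "fast reads",
}
index = build_inverted_index(docs)

# A lookup is a single dictionary access instead of a scan over all documents.
print(sorted(index["search"]))  # [1, 2]
print(sorted(index["fast"]))    # [1, 3]
```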
Token
- Tokenizing: an analyzer splits text into tokens
- Token filtering, such as the edge n-gram token filter
- Makes prefix search very fast: a query can match a leading slice of a token and return the entire document
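
The edge n-gram idea can be sketched as follows: each token is expanded into its leading slices, and every slice points back to the document. `min_gram` and `max_gram` mirror the filter's parameter names; the function names, sample documents, and chosen values are assumptions for the example.

```python
def edge_ngrams(token: str, min_gram: int, max_gram: int) -> list[str]:
    """Return the prefixes of token with lengths min_gram..max_gram."""
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


def index_prefixes(docs: dict[int, str], min_gram: int, max_gram: int) -> dict[str, set[int]]:
    """Map every edge n-gram of every token to the documents containing it."""
    index: dict[str, set[int]] = {}
    for doc_id, text in docs.items():
        for token in text.lower().split():
            for gram in edge_ngrams(token, min_gram, max_gram):
                index.setdefault(gram, set()).add(doc_id)
    return index


print(edge_ngrams("search", min_gram=2, max_gram=4))  # ['se', 'sea', 'sear']

docs = {1: "Elastic search", 2: "Seattle weather"}
prefix_index = index_prefixes(docs, min_gram=2, max_gram=4)

# Typing "sea" matches both documents through their stored prefixes.
print(sorted(prefix_index["sea"]))  # [1, 2]
```

The trade-off is a larger index in exchange for prefix lookups that are as cheap as exact-token lookups.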
Document
- Like a row in a table (RDBMS)