What is the use case for pipelines?
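An ingest pipeline is a saved, named sequence of processors that transforms documents before they are indexed. Once stored, it can be reused across different API calls (index, bulk, re-index, update by query).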
How do you create a pipeline?
PUT _ingest/pipeline/< pipeline name >
{
  "description": ".....",
  "processors": [
  ]
}
What are some processors available?
TODO: Check on elastic website and make a list of important ones.
What’s the gotcha between using scripts in the ingest pipeline and outside of it?
Ingest pipeline scripts access fields as "ctx.< field >", while scripts outside a pipeline (update, update by query, re-index) need to use "ctx._source.< field >".
TODO: is this still the case in latest version?
What’s the requirement to use a pipeline?
At least one node in the cluster needs to have the ingest role.
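A minimal sketch of the pipeline cards above, in Python: it builds the request body for `PUT _ingest/pipeline/< pipeline name >` as a plain dict and serializes it to JSON. The pipeline name `add-env`, the field names, and the helper `build_pipeline` are hypothetical; `set` and `script` are standard ingest processors. Note that the script reads `ctx.< field >`, not `ctx._source.< field >`.

```python
import json


def build_pipeline(description, processors):
    """Return the request body for PUT _ingest/pipeline/<name> (hypothetical helper)."""
    return {"description": description, "processors": list(processors)}


# Hypothetical processors: set a constant field, then derive one with Painless.
processors = [
    {"set": {"field": "env", "value": "prod"}},
    {
        "script": {
            "lang": "painless",
            # Inside an ingest pipeline, fields are read directly from ctx.
            "source": "ctx.name_upper = ctx.name.toUpperCase()",
        }
    },
]

pipeline_body = build_pipeline("Tag documents and add an upper-case name", processors)
pipeline_json = json.dumps(pipeline_body, indent=2)

print("PUT _ingest/pipeline/add-env")  # "add-env" is an assumed name
print(pipeline_json)
```

The printed body can be pasted into the Kibana Console as-is; nothing here talks to a cluster.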
What are the use cases for the re-index API?
- Re-process and modify data into a new index
What’s the API structure for the re-index API?
POST _reindex
{
  "source": { "index": ... },
  "dest": { "index": < new index > },
  "script": { .... }
}
How do you enable remote re-index across nodes?
Whitelist the source (SRC) host in the destination (DEST) cluster's config file (elasticsearch.yml):
reindex.remote.whitelist: "< ip >:< port >, …." (comma-separated list)
reindex.ssl.verification_mode: certificate
reindex.ssl.truststore.type: PKCS12
reindex.ssl.keystore.type: PKCS12
reindex.ssl.truststore.path: certs/node-1
reindex.ssl.keystore.path: certs/node-1
$ bin/elasticsearch-keystore add reindex.ssl.truststore.secure_password
$ bin/elasticsearch-keystore add reindex.ssl.keystore.secure_password
How do you do remote re-index (across clusters)?
POST _reindex
{
  "source": {
    "remote": {
      "host": "https://< ip >:< port >",
      "username": "< user >",
      "password": "< password >"
    },
    "index": ....
  },
  "dest": {
    "index": ....
  }
}
How do you re-index only a subset of the data?
Add a "query" section to the "source" section:
POST _reindex
{
  "source": {
    "query": { .... }
  }
}
How do you mutate the data while copying it?
Add a "script" section:
POST _reindex
{
  "source": { … },
  "dest": { … },
  "script": { …. }
}
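The re-index cards above (subset via "query", mutation via "script") can be combined in one request. A minimal Python sketch follows; the helper `build_reindex_body`, the index names `accounts` and `accounts-v2`, and the `state` and `migrated` fields are all hypothetical. The script uses `ctx._source.< field >` because it runs outside an ingest pipeline.

```python
import json


def build_reindex_body(source_index, dest_index, query=None, script_source=None):
    """Return a body for POST _reindex (hypothetical helper)."""
    body = {"source": {"index": source_index}, "dest": {"index": dest_index}}
    if query is not None:
        # Only documents matching this query are copied.
        body["source"]["query"] = query
    if script_source is not None:
        # The script runs on every copied document before it is written.
        body["script"] = {"lang": "painless", "source": script_source}
    return body


reindex_body = build_reindex_body(
    "accounts",                       # assumed source index
    "accounts-v2",                    # assumed destination index
    query={"term": {"state": "CA"}},  # assumed filter
    script_source="ctx._source.migrated = true",
)

print("POST _reindex")
print(json.dumps(reindex_body, indent=2))
```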
What's the update by query API structure?
POST < index name >/_update_by_query
{
  "script": { .... },
  "query": { ... }
}
In what instances would you want to simply increment the version of all the objects in an index?
TODO: this was mentioned in the video, but why?
How do you add multi line scripts?
You can wrap the script in triple quotes (""") to write multi-line scripts in the Kibana Console.
This is not standard JSON; it is a Console convenience. Outside Kibana, the script has to be sent as a regular JSON string with escaped newlines.
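A small Python sketch of what the same multi-line script looks like as standard JSON. The triple quotes below are Python's own string syntax, not the Console shorthand, and the `views` field is a hypothetical example. `json.dumps` escapes the newlines, so the payload contains `\n` sequences and no triple quotes.

```python
import json

# Hypothetical multi-line Painless script (Python triple-quoted string).
script_source = """
if (ctx._source.views == null) {
  ctx._source.views = 0;
}
ctx._source.views += 1;
"""

body = {"script": {"lang": "painless", "source": script_source}}

# Standard JSON has no multi-line strings: newlines become \n escapes.
payload = json.dumps(body)
print(payload)
```

A payload like this is what a non-Kibana client (curl, a language client) would send.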
How would you increase a value by X percent with the _update_by_query API?
"script": {
  "lang": "painless",
  "source": """
    // Assumption: X is the fraction to add, e.g. 0.1 for 10 percent
    ctx._source.balance += ctx._source.balance * X;
    if (ctx._source.transactions == null) {
    }
  """
}
TODO: move this to a painless deck of cards.
TODO: what about concurrent updates?
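A hedged Python sketch of the percent increase above. It passes the percentage through script `params` instead of hard-coding X, and mirrors the arithmetic in plain Python so the numbers can be checked without a cluster. The names `PERCENT_SCRIPT`, `build_update_body`, and `apply_percent` are hypothetical, the `balance` field is taken from the card, and the Painless line itself has not been run against a cluster here.

```python
import json

# Painless source; params.percent is a whole-number percentage (10 means 10%).
PERCENT_SCRIPT = "ctx._source.balance += ctx._source.balance * params.percent / 100.0"


def build_update_body(percent, query):
    """Return a body for POST <index>/_update_by_query (hypothetical helper)."""
    return {
        "script": {
            "lang": "painless",
            "source": PERCENT_SCRIPT,
            "params": {"percent": percent},
        },
        "query": query,
    }


def apply_percent(balance, percent):
    """Same arithmetic as PERCENT_SCRIPT, in Python, to sanity-check values."""
    return balance + balance * percent / 100.0


update_body = build_update_body(10, {"match_all": {}})
print(json.dumps(update_body, indent=2))
print(apply_percent(200, 10))
```

Using `params` keeps the script text constant across calls, so only the parameter values change between requests.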
How do you use reindex and update by query with a pipeline?
_update_by_query:
Add the "?pipeline=< pipeline >" param
re-index:
{
  "dest": {
    "pipeline": "< pipeline >"
  }
}
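The two ways of attaching a pipeline can be sketched in Python as follows. For `_update_by_query` the pipeline goes in the URL query string; for re-index it goes inside "dest". The helper names, the index names, and the pipeline name `add-env` are hypothetical.

```python
import json
from urllib.parse import urlencode


def update_by_query_path(index, pipeline):
    """Return the request path with the pipeline as a URL parameter."""
    return f"{index}/_update_by_query?{urlencode({'pipeline': pipeline})}"


def reindex_with_pipeline(source_index, dest_index, pipeline):
    """Return a POST _reindex body with the pipeline set on the destination."""
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index, "pipeline": pipeline},
    }


path = update_by_query_path("accounts", "add-env")
reindex_pipeline_body = reindex_with_pipeline("accounts", "accounts-v2", "add-env")

print("POST " + path)
print("POST _reindex")
print(json.dumps(reindex_pipeline_body, indent=2))
```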
What are the use cases for dynamic templates?
How do you create a dynamic template mapping?
PUT < index name >
{
  "mappings": {
    "dynamic_templates": [
      {
        "< template name >": {
          "match_mapping_type": "< type to match >",
          "match": "< filter on field name >",
          "unmatch": "< filter on what NOT to match >",
          "mapping": {
            … < mapping definition > …
            "type": "…. type …"
          }
        }
      }
    ]
  }
}
How to filter on field names on dynamic template mappings?
Use the “match” or “unmatch” fields with a wildcard.
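A rough Python sketch of how "match" and "unmatch" wildcards select field names. It only approximates Elasticsearch's behavior: it uses `fnmatchcase` from the standard library as a stand-in for the simple `*` wildcard matching, and it ignores "match_mapping_type". The template name, the patterns, and the helper `template_applies` are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical dynamic template: map *_id string fields to keyword,
# except fields whose name starts with tmp_.
dynamic_template = {
    "ids_as_keywords": {
        "match_mapping_type": "string",
        "match": "*_id",
        "unmatch": "tmp_*",
        "mapping": {"type": "keyword"},
    }
}

# Body for PUT <index name>; dynamic_templates is a list of one-key objects.
mapping_body = {"mappings": {"dynamic_templates": [dynamic_template]}}


def template_applies(template, field_name):
    """Approximate the match/unmatch check on a field name (not the type check)."""
    rule = next(iter(template.values()))
    if not fnmatchcase(field_name, rule.get("match", "*")):
        return False
    unmatch = rule.get("unmatch")
    return unmatch is None or not fnmatchcase(field_name, unmatch)


for name in ["user_id", "tmp_id", "name"]:
    print(name, template_applies(dynamic_template, name))
```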
What are the use cases for index templates?
How do you create an index pattern / template?
PUT _template/< template name >
{
  "aliases": ....,
  "mappings": ....,
  "settings": ....,
  "index_patterns": ["< wildcard >"]
}
Explain how index templates work.
What are some use cases for aliases?
How do you create an alias?
POST _aliases
{
"actions": [
{
"add": {
"index": "< index name >",
"alias": "< alias name >"
}
}
]
}
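A minimal Python sketch of the `_aliases` body above, extended to one possible use: repointing an alias from an old index to a new one in a single request, since all actions in one `_aliases` call are applied atomically. The helper `alias_swap_actions` and the index and alias names are hypothetical.

```python
import json


def alias_swap_actions(alias, old_index, new_index):
    """Return a POST _aliases body that moves an alias to a new index."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# Assumed names: clients query "accounts" while the data moves to a new index.
swap_body = alias_swap_actions("accounts", "accounts-v1", "accounts-v2")

print("POST _aliases")
print(json.dumps(swap_body, indent=2))
```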