The JMESPath query language is built into Amazon’s AWS CLI. As noted in the CLI documentation, this is provided via the --query global option, which takes a JMESPath string as its argument. This can be used to filter the raw JSON results returned to the AWS CLI from the server side (i.e. from objects stored in your AWS account).
You can see an overview of the --query option here, with examples.
Server-side filtering (if it is available - and relevant - for your specific CLI command) may help to reduce the volume of JSON sent across the network, prior to any JMESPath filtering.
The JMESPath tutorial, examples and specification provide plenty of additional information.
It’s important to note that the --query option is part of the client-side CLI program. This means the CLI receives the full JSON from the server, except where that JSON has already been filtered on the server side as part of your CLI command.
Suppose you want to extract the name of an EC2 instance (assuming there is a Name tag defined), given that you have the instance ID.
First, here is an unfiltered command which returns the complete JSON for a (fictional) EC2 instance:
|
|
Note: If you are using Windows, replace the Linux \ line continuation character with ^.
The resulting JSON is presented here (heavily summarized/redacted):
|
|
To get only the name, you can use this:
|
|
This returns the following JSON:
|
|
If you simply want to return this as a string, you can use the --output text option. When added to the above command, this will return:
|
|
Either way, the JMESPath command:
|
|
is the part which filters the original JSON, returning only the requested data.
The [*] syntax is a list wildcard expression. It causes JMESPath to iterate over each item in the related [ ... ] list - and to process the remaining JMESPath expression parts against each element in that list.
The ? operator defines the start of a filter expression. In our case it defines a filter on only those tags where the key is Name.
The end result is that we drill down through Reservations and Instances - and return only those Value fields which match the tags filter.
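To make the drill-down concrete, here is a minimal plain-Python sketch of what such an expression does. It assumes the query has the form Reservations[*].Instances[*].Tags[?Key == 'Name'].Value (an assumed form, consistent with the description above), and it uses an invented, heavily trimmed response. It also assumes every instance carries a Tags list; it does not reproduce how JMESPath drops null results.

```python
# Hypothetical, heavily trimmed describe-instances response (all values invented).
response = {
    "Reservations": [
        {
            "Instances": [
                {
                    "InstanceId": "i-0aaaaaaaaaaaaaaaa",
                    "Tags": [
                        {"Key": "Name", "Value": "my-first-app"},
                        {"Key": "Env", "Value": "dev"},
                    ],
                }
            ]
        }
    ]
}


def name_values(doc):
    """Approximate Reservations[*].Instances[*].Tags[?Key == 'Name'].Value."""
    return [
        [
            # The filter step: keep only tags whose Key is "Name",
            # then project each surviving tag onto its Value.
            [tag["Value"] for tag in inst["Tags"] if tag["Key"] == "Name"]
            for inst in res["Instances"]  # Instances[*]
        ]
        for res in doc["Reservations"]  # Reservations[*]
    ]


names = name_values(response)
print(names)  # [[['my-first-app']]]
```

Each wildcard or filter adds one level of list nesting, which is why the single name comes back wrapped in three lists.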
Let’s say we want to do the reverse: Start with a Name value from a tag field, and find the instance ID (or IDs) to which it belongs.
In other words: What is the instance ID for the instance(s) with a name of my-first-app?
How do we retrieve a JSON field (the InstanceId field) after we have selected a more deeply nested value (in this case, tag data)? How can we go “back up” the JSON nesting levels from the filtered tag data to the instance ID?
I will break this down into multiple steps, to try to show not only the end result, but how we can think our way to reaching the end result, to better understand why it works.
Let’s start with a command that runs, but isn’t what we want:
|
|
This lists all our instance IDs:
|
|
We can also select the exact tag that meets our requirements:
|
|
This returns the one tag we care about:
|
|
But it doesn’t let us then go back up the JSON hierarchy to select the instance ID.
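The dead end can be sketched in plain Python. The two comprehensions below approximate an "all instance IDs" query and a "matching tag only" query; the expressions named in the comments are assumed forms, and the sample data is invented. The point to notice is that the second result contains only tag objects, so the instance ID is no longer reachable from it.

```python
# Hypothetical sample with two reservations, one instance each (values invented).
response = {
    "Reservations": [
        {"Instances": [{"InstanceId": "i-0aaaaaaaaaaaaaaaa",
                        "Tags": [{"Key": "Name", "Value": "my-first-app"}]}]},
        {"Instances": [{"InstanceId": "i-0bbbbbbbbbbbbbbbb",
                        "Tags": [{"Key": "Name", "Value": "other-app"}]}]},
    ]
}

# Roughly: Reservations[*].Instances[*].InstanceId
all_ids = [
    [inst["InstanceId"] for inst in res["Instances"]]
    for res in response["Reservations"]
]

# Roughly: Reservations[*].Instances[*].Tags[?Key == 'Name' && Value == 'my-first-app']
matching_tags = [
    [
        [tag for tag in inst["Tags"]
         if tag["Key"] == "Name" and tag["Value"] == "my-first-app"]
        for inst in res["Instances"]
    ]
    for res in response["Reservations"]
]

print(all_ids)        # every instance ID, matching or not
print(matching_tags)  # only tag objects; the InstanceId is gone
```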
What we need is some combination of the above two approaches. We need to select only the Instances we care about using our filter, instead of selecting them all using Instances[*].
We can use a multiselect list to get us moving in the right direction. Within the start [ and closing ] characters of a multiselect list, we will have:
one or more non expressions separated by a comma. Each expression will be evaluated against the JSON document. Each returned element will be the result of evaluating the expression.
(OK - I’m not sure what a “non expression” is - maybe that is a typo in the documentation?)
Anyway, we will only have one expression in our case: We can place our Tags filter into the Instances[*], instead of using that wildcard *.
Our JMESPath string would therefore become:
|
|
But that’s not a valid JMESPath expression - we need to convert it into a filter which operates on the tags first (and we can continue to use our existing filter operating on Key ... && Value ...):
|
|
The only change I made was to add a ? in front of the Tags step - to turn that into another filter expression. This is now a valid expression for our multiselect list [ ... ].
This works! It returns only those instance objects which have a Name tag with the value my-first-app. In my case, that is only one instance.
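Why does a filter inside a filter work? In JMESPath an empty list counts as false, so the inner tag filter acts as a yes/no test for each instance. Here is a minimal plain-Python sketch of that idea, assuming an expression of the form Reservations[*].Instances[?Tags[?Key == 'Name' && Value == 'my-first-app']] and invented sample data; the helper name has_name_tag is hypothetical.

```python
# Hypothetical sample with two reservations (values invented).
response = {
    "Reservations": [
        {"Instances": [{"InstanceId": "i-0aaaaaaaaaaaaaaaa",
                        "Tags": [{"Key": "Name", "Value": "my-first-app"}]}]},
        {"Instances": [{"InstanceId": "i-0bbbbbbbbbbbbbbbb",
                        "Tags": [{"Key": "Name", "Value": "other-app"}]}]},
    ]
}


def has_name_tag(instance, name):
    """Inner filter: Tags[?Key == 'Name' && Value == name]."""
    matches = [
        tag for tag in instance.get("Tags", [])
        if tag.get("Key") == "Name" and tag.get("Value") == name
    ]
    # An empty list is falsy, mirroring how JMESPath treats [] in a filter.
    return bool(matches)


# Outer filter: keep whole instance objects, not just their tags.
selected = [
    [inst for inst in res["Instances"] if has_name_tag(inst, "my-first-app")]
    for res in response["Reservations"]
]

print(selected)  # the first reservation keeps its instance; the second yields []
```

Because the outer step keeps the whole instance object, every instance field (including InstanceId) is still available afterwards.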
Now, to complete our original task, we can append .InstanceId to the end of our JMESPath expression, to return only that single piece of instance data, instead of the entire instance JSON structure.
The full AWS CLI command becomes:
|
|
This returns the following JSON:
|
|
Note that the JSON structure we get is at the Instances level. That was the whole point: It needs to be at that level so that we can access the InstanceId data we want.
If you have many instances, your JSON may contain many empty [ ] lists: one for each reservation whose instances do not match the my-first-app filter.
This may be cumbersome.
You can use the flatten operator [] to clean this up. It is similar to the wildcard [*] operator, except it first merges any nested lists (one level deep) into their parent list. That means, in our case, that all those empty JSON lists will effectively be removed:
|
|
Note how we changed .InstanceId to .InstanceId[]. That’s the only thing we changed.
Now the JSON output will be:
|
|
Much cleaner.
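The effect of the flatten step can be sketched in a few lines of plain Python. The helper name flatten_once is hypothetical and the instance ID is invented. The sketch merges exactly one level of nesting, which is how the JMESPath flatten operator behaves.

```python
def flatten_once(items):
    """Merge one level of nested lists into the parent list."""
    flat = []
    for item in items:
        if isinstance(item, list):
            flat.extend(item)  # an empty sublist contributes nothing
        else:
            flat.append(item)
    return flat


# Hypothetical result before flattening: one match and two empty lists.
nested_ids = [["i-0aaaaaaaaaaaaaaaa"], [], []]

flat_ids = flatten_once(nested_ids)
print(flat_ids)  # ['i-0aaaaaaaaaaaaaaaa']

# Only one level is merged; deeper nesting is left alone.
print(flatten_once([1, [2, [3]]]))  # [1, 2, [3]]
```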
If you are using boto3 instead of the AWS CLI, then the --query option is no longer available to you: Remember, it is part of the client CLI implementation, not a part of the AWS API.
But it’s easy enough to add JMESPath to your Python code:
|
|
This will give the same result as the final CLI command from the previous section.
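For comparison, the same selection can also be written without JMESPath at all, as ordinary Python loops over the response pages. This is only a sketch: the function name instance_ids_by_name is hypothetical, and the pages here are invented stand-ins for the dictionaries a boto3 describe_instances paginator would yield.

```python
from typing import Any, Dict, Iterable, List


def instance_ids_by_name(pages: Iterable[Dict[str, Any]], name: str) -> List[str]:
    """Collect IDs of instances whose Name tag equals `name`."""
    ids: List[str] = []
    for page in pages:
        for reservation in page.get("Reservations", []):
            for instance in reservation.get("Instances", []):
                tags = instance.get("Tags", [])
                if any(t.get("Key") == "Name" and t.get("Value") == name for t in tags):
                    ids.append(instance["InstanceId"])
    return ids


# With boto3, the pages could come from something like:
#   boto3.client("ec2").get_paginator("describe_instances").paginate()
# Here we use invented pages so the sketch runs without AWS access.
fake_pages = [
    {"Reservations": [{"Instances": [
        {"InstanceId": "i-0aaaaaaaaaaaaaaaa",
         "Tags": [{"Key": "Name", "Value": "my-first-app"}]}]}]},
    {"Reservations": [{"Instances": [
        {"InstanceId": "i-0bbbbbbbbbbbbbbbb",
         "Tags": [{"Key": "Name", "Value": "other-app"}]}]}]},
]

found = instance_ids_by_name(fake_pages, "my-first-app")
print(found)  # ['i-0aaaaaaaaaaaaaaaa']
```

The loops are longer than a one-line JMESPath string, but they build a flat list directly, so no flatten step is needed.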
If you are using a language such as Java and its AWS SDK, then it’s a completely different paradigm.
When you use its Ec2Client::describeInstances() method, you will receive an object as your response - not a JSON structure. And you will iterate over the paginated results to extract the specific data you need:
|
|
The above bare-bones example uses the following Maven dependencies:
|
|
There is no JSON here - and therefore no JMESPath expressions.
This article is mostly about JMESPath, but just for the record, the CLI command solution could be simplified by also using --filter or --filters:
|
|
This results in:
|
|
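The division of labour can be sketched in Python terms. The server-side filter selects instances by tag, so the client-side query only has to pull out the IDs. The filter name tag:Name follows the EC2 filter naming scheme. The helper names, the simplified query Reservations[].Instances[].InstanceId and the sample response are assumptions for illustration.

```python
def name_tag_filter(name):
    """Build a Filters value in the shape boto3's describe_instances accepts."""
    return [{"Name": "tag:Name", "Values": [name]}]


def flat_instance_ids(doc):
    """Approximate Reservations[].Instances[].InstanceId."""
    return [
        inst["InstanceId"]
        for res in doc.get("Reservations", [])
        for inst in res.get("Instances", [])
    ]


# Hypothetical response after the server has already applied the tag filter:
# only the matching instance is returned, so no client-side filter is needed.
filtered_response = {
    "Reservations": [
        {"Instances": [{"InstanceId": "i-0aaaaaaaaaaaaaaaa",
                        "Tags": [{"Key": "Name", "Value": "my-first-app"}]}]}
    ]
}

filters = name_tag_filter("my-first-app")
ids = flat_instance_ids(filtered_response)
print(filters)  # [{'Name': 'tag:Name', 'Values': ['my-first-app']}]
print(ids)      # ['i-0aaaaaaaaaaaaaaaa']
```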
See here for more details.