sed’s and awk’s lost cousin
You might have noticed that our beloved GNU utils, especially sed and awk, work better with some file types than others.
In fact, although awk can do wonders with CSV files, modern types like JSON, YAML or even XML (which is by no means modern) can be a bit of a pain to work with.
Well, jq is a command-line JSON processor designed specifically to address this issue for JSON files.
We’ll see further down how to tackle the other file types mentioned.
Basics
Let’s take a simple JSON as an example.
Run curl 'https://til.hashrocket.com/api/developer_posts.json?username=doriankarter' on your command line to see the data (the quotes keep the shell from interpreting the ? in the URL).
Since this data is presented as a one-liner, we can use jq to format the output.
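A minimal sketch of the formatting step, assuming jq is installed. The JSON here is a small hypothetical stand-in for the API response (the data.posts, slug and title names are assumptions, and the real payload may be shaped differently), so the example runs without network access:

```shell
# Hypothetical stand-in for the API response; the real payload may differ.
json='{"data":{"posts":[{"slug":"abc123","title":"Use jq to filter JSON"}]}}'

# The identity filter '.' passes the input through unchanged;
# jq pretty-prints its output by default.
pretty=$(printf '%s' "$json" | jq '.')
printf '%s\n' "$pretty"
```

Against the live endpoint, the same filter would sit at the end of a pipe, as in curl -s '<url>' | jq '.'.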
We can query the interesting data and remove some noise simply by referring to its node name.
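As a sketch of querying by node name, again on a hypothetical sample (the data.posts path and the title field are assumptions about the payload):

```shell
# Hypothetical stand-in for the API response.
json='{"data":{"posts":[{"slug":"abc123","title":"Use jq to filter JSON"},{"slug":"def456","title":"Vim tips"}]}}'

# Walk down to every post and keep only its title.
titles=$(printf '%s' "$json" | jq '.data.posts[].title')
printf '%s\n' "$titles"
# "Use jq to filter JSON"
# "Vim tips"
```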
To output the data as an array, we can just enclose the query in [].
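A short sketch of the array form, using the same hypothetical sample. The -c flag is only there to print the result compactly on one line:

```shell
# Hypothetical stand-in for the API response.
json='{"data":{"posts":[{"slug":"abc123","title":"Use jq to filter JSON"},{"slug":"def456","title":"Vim tips"}]}}'

# Wrapping the query in [] collects the separate results into one array.
title_array=$(printf '%s' "$json" | jq -c '[.data.posts[].title]')
printf '%s\n' "$title_array"
# ["Use jq to filter JSON","Vim tips"]
```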
Or we can do some interesting manipulation to the data and present a parsed version.
Again, we are accessing the data by their node name and doing some string concatenation, nothing too fancy.
Notice how we use a pipe to pass the data from one query to the next.
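The concatenation and the pipe described above could look like this on the hypothetical sample (the output format, title followed by the slug in parentheses, is an illustration rather than the original query):

```shell
# Hypothetical stand-in for the API response.
json='{"data":{"posts":[{"slug":"abc123","title":"Use jq to filter JSON"},{"slug":"def456","title":"Vim tips"}]}}'

# The pipe feeds each post into the next filter, where + joins strings.
# -r prints raw strings instead of JSON-quoted ones.
summary=$(printf '%s' "$json" | jq -r '.data.posts[] | .title + " (" + .slug + ")"')
printf '%s\n' "$summary"
# Use jq to filter JSON (abc123)
# Vim tips (def456)
```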
Not so basics
This software has a bunch of very useful functions available, and we’ll go over a few of them.
From now on, there will be no reference to the curl command to keep the code blocks more concise.
You can still use it to test these queries!
Delete node
Use del() to clear out unwanted noise.
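A minimal sketch with jq's del() built-in, assuming we want to drop the (hypothetical) slug field from every post:

```shell
# Hypothetical stand-in for the API response.
json='{"data":{"posts":[{"slug":"abc123","title":"Use jq to filter JSON"},{"slug":"def456","title":"Vim tips"}]}}'

# del() removes every path matched by its argument, here the slug of each post.
cleaned=$(printf '%s' "$json" | jq -c 'del(.data.posts[].slug)')
printf '%s\n' "$cleaned"
# {"data":{"posts":[{"title":"Use jq to filter JSON"},{"title":"Vim tips"}]}}
```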
Filter with select
Select only the entries that match the given condition.
Notice how we now use .TITLE instead of .title to filter the output.
This is because by this point, the .title node is not there anymore!
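A sketch of that situation on the hypothetical sample: each post is first rebuilt into an object that only has a TITLE key, so select has to test .TITLE. The contains("jq") condition is an arbitrary choice for illustration:

```shell
# Hypothetical stand-in for the API response.
json='{"data":{"posts":[{"slug":"abc123","title":"Use jq to filter JSON"},{"slug":"def456","title":"Vim tips"}]}}'

# After {TITLE: .title} the old .title key is gone, so the filter uses .TITLE.
matches=$(printf '%s' "$json" | jq -c '.data.posts[] | {TITLE: .title} | select(.TITLE | contains("jq"))')
printf '%s\n' "$matches"
# {"TITLE":"Use jq to filter JSON"}
```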
Conditional logic
We can use concise if statements to selectively modify a node.
Here we create a new one called IS_VALID with the value "Too short!" or "yes" depending on the length of the .title.
Again, notice how in this case we reference .title.
This is because the .TITLE node is not created yet!
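One way this could be written, on the hypothetical sample. The threshold of 10 characters is an assumption made for the example:

```shell
# Hypothetical stand-in for the API response.
json='{"data":{"posts":[{"slug":"abc123","title":"Use jq to filter JSON"},{"slug":"def456","title":"Vim tips"}]}}'

# IS_VALID is computed from .title because TITLE does not exist yet
# while the new object is being built.
flagged=$(printf '%s' "$json" | jq -c '
  .data.posts[]
  | {IS_VALID: (if (.title | length) < 10 then "Too short!" else "yes" end),
     TITLE: .title}')
printf '%s\n' "$flagged"
# {"IS_VALID":"yes","TITLE":"Use jq to filter JSON"}
# {"IS_VALID":"Too short!","TITLE":"Vim tips"}
```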
Group by
Group by the value of any given node with group_by()!
Notice how this function is called outside the creation of the array!
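A sketch of grouping on the hypothetical sample, reusing the assumed 10-character rule for IS_VALID. group_by() works on an array, which is why the objects are collected with [...] first and the function is applied afterwards:

```shell
# Hypothetical stand-in for the API response, with three posts this time.
json='{"data":{"posts":[{"title":"Use jq to filter JSON"},{"title":"Vim tips"},{"title":"Bash trap"}]}}'

# Build the array first, then group its elements by the value of IS_VALID.
grouped=$(printf '%s' "$json" | jq -c '
  [ .data.posts[]
    | {IS_VALID: (if (.title | length) < 10 then "Too short!" else "yes" end),
       TITLE: .title} ]
  | group_by(.IS_VALID)')
printf '%s\n' "$grouped"
```

The result is an array of arrays, one inner array per distinct IS_VALID value, ordered by that value.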
Sort by
Sorting is also possible, and the result can be reversed if needed.
Notice that we set .len to the result of passing .title to the length built-in function.
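A sketch of sorting by title length, longest first, on the hypothetical sample:

```shell
# Hypothetical stand-in for the API response.
json='{"data":{"posts":[{"title":"Use jq to filter JSON"},{"title":"Vim tips"},{"title":"Bash trap"}]}}'

# len holds the title length; sort_by sorts ascending, reverse flips the order.
sorted=$(printf '%s' "$json" | jq -c '
  [ .data.posts[] | {title: .title, len: (.title | length)} ]
  | sort_by(.len)
  | reverse')
printf '%s\n' "$sorted"
# [{"title":"Use jq to filter JSON","len":21},{"title":"Bash trap","len":9},{"title":"Vim tips","len":8}]
```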
Handle other file types with YQ
Since this is so nice to work with, someone took the time to build a jq-like tool that also handles other file types: yq (as in YAML query).
It doesn’t just handle YAML files, but also JSON, XML, CSV and TSV.
Not only that, you can easily use this application to convert one file type into another!
It is worth mentioning, however, that not all file conversions are supported, as there are still some edge cases to be tackled.
Check the docs to find out more.
Keep in mind that apart from what is shown below, most of the previous operations can be applied to any of these file types, although the syntax is jq-like rather than identical, so check the docs when a query does not behave as expected.
Since yq uses a syntax so close to jq’s, I’ll keep the queries out of the examples to keep things simple.
This is just a quick overview of how you might want to use the tool; it can achieve much more than I’m showing here.
YAML to other types
Given a your_cool.yaml file, the command yq -o xml '.' your_cool.yaml outputs it in XML format.
Or you can run yq -o json '.' your_cool.yaml to get JSON instead.
Any Input, Any Output
As mentioned above, the possibilities are near limitless.
Say you have a your_cool.csv file of the structure:
name,numberOfCats,likesApples,height
Gary,1,true,168.8
Samantha's Rabbit,2,false,-188.8
Hassle-free conversion to YAML can be achieved with yq -o yaml -p csv '.' your_cool.csv.
Again, use the -o flag to change the output format, for example yq -o json -p csv '.' your_cool.csv for JSON.
Notice that in order to properly take a non-YAML file as input, the -p flag must be used.