I love these - I use jq daily and am always happy to read about it from another angle.
The author probably has internalized more of the manual than they realize, and maybe improved at least one explanation; FTA:
> map(...) lets you unwrap an array, apply a filter and then rewrap the results back into an array. You can think of it as a shorthand for [ .[] | ... ] and it comes up quite a bit in my experience, so it's worth committing to memory.
Take each sub-item in `.` (the current item being processed), apply the function `x` to each sub-item, and collect all the result values from `x` into an array.
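Concretely, the equivalence looks like this (toy input, doubling each element):

```
$ echo '[1,2,3]' | jq -c 'map(. * 2)'
[2,4,6]
$ echo '[1,2,3]' | jq -c '[ .[] | . * 2 ]'
[2,4,6]
```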
This is a terrific introduction to a tool that I use often enough to pretend to be familiar with, but honestly I just grep my bash history every time I need to use it.
But I'd really like to see a discussion of the tool that the host website promotes: Earthly. It's a build system, which is a family of tools that I've always hated, but it seems to be pretty decent. Is anybody using it?
This is an amazing article. I think the two things that set it apart are (a) showing the process of building a complex command from scratch step by step, and (b) actually using a real API endpoint to start with.
One convenient tip I discovered after reading this article and trying out the command is that
jq is unsurprisingly Turing complete, so I wrote a Whitespace interpreter[0] in jq. It handles real-time I/O by requesting lines on demand from stdin (the main input source) with `input`, and by outputting strings as a stream.
With a relatively large jq program like that, it is critical that the main recursive loop run efficiently, so it's annoying that there's no way to detect whether tail call optimization was applied, other than benchmarking. It would also be nice if object values were lazily evaluated so that it would be possible to create ad hoc switches.
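For reference, the kind of tail-recursive loop in question; a toy counter that only avoids blowing the stack because the recursive call is the last thing the function does:

```
$ jq -n 'def loop: if . >= 1000000 then . else . + 1 | loop end; 0 | loop'
1000000
```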
The jq documentation is good and thorough, but we still see a lot of articles like these. Personally, I think I would remember the particulars better (and maybe guess them without even referring to the reference manual) if the author wrote a narrative account of how they conceived of jq and how the language emerged from that conception. The tutorial is a great start, but something a little more fleshed out and introspective would help a lot, in my opinion.
As much as I love jq and as much as it has helped me, it has always had an "awk" vibe to me; awk is one of the tools I use less despite its capability, for the same reason: I tend to forget all the special syntax over the years.
Compare that with "gron" ( https://github.com/tomnomnom/gron ), which is arguably less powerful and a bit clunky, but it lets me compose and tie into the other unix tools far better.
It's mainly for that reason that I use it more than jq these days for ad-hoc analysis.
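For anyone who hasn't seen it, gron flattens JSON into one assignment per line, so the regular line-oriented tools just work (output shape from memory):

```
$ echo '{"user":{"name":"alice","id":7}}' | gron
json = {};
json.user = {};
json.user.id = 7;
json.user.name = "alice";
$ gron response.json | grep name | gron --ungron   # filter, then back to JSON
```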
AWK and JQ definitely suffer from "power-creep" as shell tools. IMO they are unreasonably powerful; full-fledged scripting languages masquerading as simple text-processing tools.
Part of the blame is definitely on the fact that the "idiomatic" way to filter columns in a shell pipeline is to invoke AWK: `awk '{print $3,$5}'`. Similarly for JQ. Virtually every sysadmin and programmer gets introduced to these languages as Unixy tools when in fact they are antithetical to the Unix philosophy.
The result is ending up with overcomplicated "production" pipelines (curl|sed|awk|jq) when you really could be writing a single, far more coherent, maintainable, scalable program in C, Go, Python, etc. with their standard libraries.
This. I stopped writing shell scripts for exactly this reason. When something is more complex than launching a couple of commands, and involves for example JSON processing, I write the script in Python.
I realized it after wasting multiple hours at work debugging problems in scripts that used jq or AWK or similar tools. Most of the time the problem was solved by randomly quoting things, except when I discovered yet another edge case I hadn't considered and the program broke, again.
Now when I have that sort of problem I don't even bother trying to fix it; I just rewrite the whole script in Python (they are usually small scripts, so it's a question of 15 minutes most of the time). And writing new scripts in bash is banned (except in particular cases).
There is also the question of portability: most people assume that everyone has a way to install those tools because they have them on their own system. It's not that simple, and when you put things into production or onto CI, it breaks because jq is missing. And good luck on Windows, by the way.
One nit: `jq -r` seems more fundamental than a sidenote, especially considering it's a CLI tool. That could just be how I use it, though (as glue between JSON and bash).
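For anyone new to the flag, the difference in one example:

```
$ echo '{"name":"alice"}' | jq '.name'
"alice"
$ echo '{"name":"alice"}' | jq -r '.name'
alice
```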
Another good tool in that family is `jtbl`. It renders JSON as a table, which plays nicely with cut, awk, sed and column.
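Basic usage, if I remember it right: feed it a JSON array of objects (or JSON lines) on stdin and it prints a table:

```
$ echo '[{"id":1,"name":"alice"},{"id":2,"name":"bob"}]' | jtbl
```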
you can also pipe a curl in the above, but that will mean a lot of (slow) requests, so I have this snippet saved for running fzf --preview with jq on something from a web service
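Roughly this shape (the URL and temp path are placeholders): fetch once into a temp file, then let fzf pass the current query to jq:

```
curl -s 'https://api.example.com/data' > /tmp/resp.json
echo '' | fzf --print-query --preview 'jq --color-output {q} /tmp/resp.json'
```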
you will get a live preview of the result as you type
I did not know about `jiq`, so thanks for that tip. It looks like it does the same or something similar, but without the extra cruft of storing a temp file.
> However, some things never stick in my head, nor my fingers, and I have to google them every time. jq is one of these.
However powerful jq may be, the comment above, which exactly matches my experience, summarizes quite nicely the biggest hurdle with this tool.
99% of my jq use looks like this:
cat file.json | jq . | <regular list of unix filters>
and the remaining 1% is straight cut and paste from google / stackoverflow that may or may not end up doing what I want.
you may enjoy one of the various "gron" commands, which format json as a list of lines, one field per line, and are very easily greppable, seddable, and awkable.
Personally I find it clearer in many cases to use a format string instead, i.e. instead of writing:
[(.number|tostring), .title] | join(" | ")
I would write:
"\(.number) | \(.title)"
which IMO is more readable in cases where you have specific values you want to put in specific places, as opposed to a list of unknown length which you want joined (e.g. I would still use join("\n") in your example).
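Both produce the same line for, say, a GitHub issue object:

```
$ echo '{"number":42,"title":"Fix the parser"}' | jq -r '[(.number|tostring), .title] | join(" | ")'
42 | Fix the parser
$ echo '{"number":42,"title":"Fix the parser"}' | jq -r '"\(.number) | \(.title)"'
42 | Fix the parser
```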
My OpenAPI spec is 60,000 lines of machine generated goodness. JQ is great for doing non-trivial but useful things.
E.g.: which endpoints are available if you have a specific OAuth scope?
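A sketch of how that query can look, assuming each operation declares its scopes under `security[].oauth2` (the scheme name, the scope, and the filename are all made up here):

```
jq -r --arg scope "read:users" '
  .paths | to_entries[]
  | .key as $path
  | .value | to_entries[]
  | select(.key | IN("get", "post", "put", "patch", "delete"))
  | select([.value.security[]?[]?[]?] | index($scope))
  | "\(.key | ascii_upcase) \($path)"
' openapi.json
```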
The most useful jq CLI flag is `-f`: take the jq script from a file.
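That keeps longer filters out of shell-quoting territory; e.g. (the endpoint is real, the filter is just an illustration):

```
$ cat top-issues.jq
.[] | select(.comments > 10) | "\(.number): \(.title)"
$ curl -s https://api.github.com/repos/stedolan/jq/issues | jq -r -f top-issues.jq
```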
Big fan of JQ. I like it more than the traditional UNIX suite of text manipulation commands, because I get closer to "querying" rather than just filtering. It has really made me rethink where I want "interacting with a computer" to go in the future -- less typing commands, more querying stuff.
I've written a few utilities involving JQ.
For structured logs, I have jlog. Pipe JSON structured logs into it, and it pretty-prints the logs. For example, time zones are converted to your local time, if you choose; or you can make the timestamps relative to each other, or now. It includes jq so that you can select relevant log lines, delete spammy fields, join fields together, etc. Basically, every time you run it, you get the logs YOU want to look at. https://github.com/jrockway/json-logs. Not to oversell it, but this is one of the few pieces of software I've written that passes the toothbrush test -- I use it twice a day, every day. All the documentation is in --help; I should really paste that into the Github readme.
I am also a big fan of using JQ on Kubernetes objects. I know what I'm looking for, and it's often not in the default table view that kubectl prints. I integrated JQ into a kubectl extension, to save you "-o json | jq" and having to pick apart the v1.List that kubectl marshals objects into. https://github.com/jrockway/kubectl-jq. That one actually has documentation, but there is a fatal flaw -- it doesn't integrate with kubectl tab completion (a limitation of k8s.io/cli-runtime), so it's not too good unless you already have a target in mind, or you're targeting everything of a particular resource type.

This afternoon I wanted to see the image tag of every pod that wasn't terminated (some old Job runs exist in the namespace), and that's easy to do with JQ: `kubectl jq pods 'select(.status.containerStatuses[].state.terminated == null) | .spec.containers[].image'`. I have no idea how you'd do such a thing without JQ; probably just `kubectl describe pods | grep something` and do the filtering in your head.

(The recipes in the kubectl-jq documentation are pretty useful. One time I had a Kubernetes secret with a key set to a (base64-encoded) JSON file containing a base64-encoded piece of data I wanted. Easy to fix with jq: `.data.THING | @base64d | fromjson | .actualValue | @base64d`.)
JQ is something I definitely can't live without. But I will admit to sometimes preprocessing the input with grep; `select(.key|test("regex"))` is awfully verbose compared to `grep regex` ;)
Very useful tool indeed. Worth investing some time in if you like doing data processing on the command line.
Stuff I do with it (a couple of these are sketched after the list):
- prepare json request bodies for curl commands by constructing json objects using environment variables
- grab content from a deeply nested json structure for usage in a script
- extract csv from json
- pretty print json or ndjson output: `curl ... | jq -C . | less -R`. I actually have an alias set up for that.
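A couple of those as concrete one-liners (the endpoint and field names are made up):

```
# 1. construct a request body from environment variables
jq -n --arg user "$USER" --arg env "$DEPLOY_ENV" '{user: $user, environment: $env}' |
  curl -s -H 'Content-Type: application/json' -d @- https://api.example.com/jobs

# 3. extract csv; @csv wants one array per row, plus -r for raw output
jq -r '.items[] | [.id, .name, .price] | @csv' data.json
```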
The syntax is a bit hard to deal with. I find myself copy pasting from stack overflow a lot when I know it can do a particular thing but just can't figure out how to do it.
jq is much more terse. When I’m working in bash, I much prefer to write `kubectl get secret foo -o json | jq -r '.data | map_values(@base64d)'` rather than the equivalent Python.
JQ is great once you get the hang of it. Some time back I had to write some simple tests on JSON output and I wrote a couple of helpers if anyone else is interested [0].
I really want to like JQ. I know the tool is super powerful. Unfortunately I find the syntax obtuse and very hard to remember, especially when I only use it on rare occasions.
Every time I want to use it I end up searching for examples of the syntax and not quite getting it right.
This will list the keys; it's sometimes really helpful with a big json whose schema you don't know, but where you have the intuition that some key should be there.
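For reference, the two variants I reach for (`paths` is handy when the key could be nested anywhere):

```
$ jq 'keys' big.json
$ jq -r 'paths | map(tostring) | join(".")' big.json | sort -u
```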
Assuming you pass your token in a header, put your token in an environment variable and add it to your request like this:
```
curl -H "Authorization: Bearer $TOKEN" ...
```
Can anyone explain why I'm seeing JQ all over the web?! Yesterday a vendor gave a presentation saying that we should use it when using their command line tools. What's going on?
> The frequency illusion is that once something has been noticed then every instance of that thing is noticed, leading to the belief it has a high frequency of occurrence
What is the cognitive bias called in which you see an internet post describing a personal experience that is perfectly plausible, but choose to downvote it and assert that said experience is fraudulent by citing a wikipedia article that is only barely, incidentally related?
This is clearly a flavor-of-the-month tool that has been getting a lot of coverage on these sorts of sites lately; why impugn the guy's mental function?