An Introduction to JQ

raziel2p · on Aug 25, 2021

One small but really convenient tip missing from the article is object shortcuts - these two commands are the same:

    curl -s 'https://api.github.com/repos/stedolan/jq/issues?per_page=2' | jq '[ .[] | { title: .title, number: .number } ]'
    curl -s 'https://api.github.com/repos/stedolan/jq/issues?per_page=2' | jq '[ .[] | { title, number } ]'
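
For anyone who wants to check this offline, here is a minimal sketch that runs both forms against a made-up two-issue array instead of the live API. The sample data and the shell variable names are assumptions for illustration; only the two jq filters come from the commands above.

```shell
# Hypothetical sample standing in for the GitHub API response.
issues='[{"title":"Bug A","number":1,"state":"open"},{"title":"Bug B","number":2,"state":"closed"}]'

# Long form: every key is spelled out with its path.
long=$(printf '%s' "$issues" | jq -c '[ .[] | { title: .title, number: .number } ]')

# Shorthand: { title, number } picks the fields of the same name.
short=$(printf '%s' "$issues" | jq -c '[ .[] | { title, number } ]')

# The long form is still needed when the output key should differ from the input key.
renamed=$(printf '%s' "$issues" | jq -c '[ .[] | { name: .title, id: .number } ]')

echo "$long"
echo "$short"
echo "$renamed"
```

The first two lines printed should be identical; the third shows the renaming case that the shorthand cannot express.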

bewuethr · on Aug 25, 2021

Not to distract from your point, but in this specific instance I'd probably use

  map({title, number})

instead of

  [ .[] | <something> ]

paulddraper · on Aug 27, 2021

Side note, but one of the hardest things about jq is to realize all the operators operate on streams of values.

For example, map() operates on a stream of arrays. That kind of doubling of iteration can be confusing at first.

hk1337 · on Aug 25, 2021

The first is nice because it highlights being able to remap the value to a different key.

technicolorwhat · on Aug 25, 2021

For anyone wrangling with data like I do: I use https://github.com/TomWright/dasel quite a lot (it supports various formats and conversion between them), csvkit https://csvkit.readthedocs.io for CSV to SQL, and of course pandas for analysis.

wpietri · on Aug 25, 2021

As long as we're recommending data wrangling tools, I'm a fan of visidata: https://www.visidata.org/

technicolorwhat · on Aug 25, 2021

Thank you! More data wrangling tips are much appreciated!

hermitcrab · on Aug 25, 2021

If you prefer a visual drag and drop approach to command line take a look at: https://www.easydatatransform.com/

paulddraper · on Aug 27, 2021

> csvkit

Thanks.

I always look for "jq for csv" but perhaps I should just convert to json, then back to csv with jq

phnofive · on Aug 25, 2021

I love these - I use jq daily and am always happy to read about it from another angle.

The author probably has internalized more of the manual than they realize, and maybe improved at least one explanation; FTA:

> map(...) let’s you unwrap an array, apply a filter and then rewrap the results back into an array. You can think of it as a shorthand for [ .[] | ... ] and it comes up quite a bit in my experience, so it’s worth it committing to memory.

From https://stedolan.github.io/jq/manual/#map(x),map_values(x) :

> map(x) is equivalent to [.[] | x]. In fact, this is how it's defined. Similarly, map_values(x) is defined as .[] |= x.

Note here the casual introduction of the update assignment operator, '|='
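
To make the `|=` part concrete, here is a small sketch on a made-up object (the input and variable names are assumptions, not from the article or the manual). It compares `map_values`, the spelled-out update assignment, and plain `map` on the same input.

```shell
# Hypothetical input object.
obj='{"a":1,"b":2}'

# map_values keeps the object shape and updates each value in place.
with_map_values=$(printf '%s' "$obj" | jq -c 'map_values(. + 1)')

# The same thing written with the update assignment operator.
with_update=$(printf '%s' "$obj" | jq -c '.[] |= (. + 1)')

# Plain map also accepts an object, but collects the results into an array.
with_map=$(printf '%s' "$obj" | jq -c 'map(. + 1)')

echo "$with_map_values"
echo "$with_update"
echo "$with_map"
```

The first two results should match each other and keep the keys; the third drops the keys and returns an array of the updated values.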

jcims · on Aug 25, 2021

What does [.[] | x] do?

prutschman · on Aug 25, 2021

Not a jq expert, but my understanding is:

`.[]` takes a list and turns it into a sequence consisting of each element of that list.

`| x` applies the filter `x` to that sequence, turning it into a new sequence.

The outermost `[ ]` builds a list from that new sequence.

Varriount · on Aug 25, 2021

Take each sub-item in `.` (the current item being processed), apply the function `x` to each sub-item, and collect all the result values from `x` into an array.
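
The three steps can be watched one at a time. This is only a sketch on a made-up array, with `. * 2` standing in for `x` (both are assumptions chosen for illustration).

```shell
input='[1,2,3]'

# Step 1: .[] unwraps the array into a stream of separate values, one per line.
stream=$(printf '%s' "$input" | jq -c '.[]')

# Step 2: the filter after the pipe runs once for each value in the stream.
doubled=$(printf '%s' "$input" | jq -c '.[] | . * 2')

# Step 3: the outer [ ] collects the stream back into a single array.
collected=$(printf '%s' "$input" | jq -c '[ .[] | . * 2 ]')

# map(x) is shorthand for step 3.
mapped=$(printf '%s' "$input" | jq -c 'map(. * 2)')

printf '%s\n' "$stream" "$doubled" "$collected" "$mapped"
```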

swader999 · on Aug 25, 2021

.map(x)

jcims · on Aug 25, 2021

This is the most jq-ish reply xD

dotancohen · on Aug 25, 2021

This is a terrific introduction to a tool that I use often enough to pretend to be familiar with, but honestly I just grep my bash history every time I need to use it.

But I'd really like to see a discussion about the tool that the host website promotes: Earthly. It is a build system, which is a family of tools that I've always hated, but it seems to be pretty decent. Is anybody using it?

I'm off to find HN threads on Earthly.

adamgordonbell · on Aug 25, 2021

https://news.ycombinator.com/item?id=27785323

lalaithion · on Aug 25, 2021

This is an amazing article. I think the two things that set it apart are (a) showing the process of building a complex command from scratch step by step, and (b) actually using a real API endpoint to start with.

One convenient tip I discovered after reading this article and trying out the command is that

    jq 'map({ title: .title, number: .number, labels: .labels | length }) | map(select(.labels > 0))'

can be refactored into

    jq 'map({ title: .title, number: .number, labels: .labels | length } | select(.labels > 0))'

or in other words, map(filter1) | map(filter2) == map(filter1 | filter2).
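
A quick way to convince yourself for this pair of filters, sketched on made-up data (the two-issue sample and the variable names are assumptions; the filters are the ones above):

```shell
# Hypothetical sample: one issue with a label, one without.
issues='[{"title":"A","number":1,"labels":[{"name":"bug"}]},{"title":"B","number":2,"labels":[]}]'

# Two passes: build the objects, then filter them in a second map.
two_pass=$(printf '%s' "$issues" |
  jq -c 'map({ title: .title, number: .number, labels: .labels | length }) | map(select(.labels > 0))')

# One pass: build and filter inside the same map.
one_pass=$(printf '%s' "$issues" |
  jq -c 'map({ title: .title, number: .number, labels: .labels | length } | select(.labels > 0))')

echo "$two_pass"
echo "$one_pass"
```

Both should print the same single-element array, since only the first sample issue has a label.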

thaliaarchi · on Aug 25, 2021

jq is unsurprisingly Turing complete, so I wrote a Whitespace interpreter[0] in jq. It is able to handle real-time I/O by requesting lines on-demand from stdin, which is the main input source, with `input` and outputting strings in a stream.

With a relatively large jq program like that, it is critical that the main recursive loop run efficiently, so it's annoying that there's no way to detect whether tail call optimization was applied, other than benchmarking. It would also be nice if object values were lazily evaluated so that it would be possible to create ad hoc switches.

[0]: https://github.com/andrewarchi/wsjq

dkarl · on Aug 25, 2021

The jq documentation is good, and thorough, but we see a lot of articles like these, and personally I think I would remember the particulars better (and maybe guess them without even referring to the reference manual) if the author wrote a narrative account of how they conceived of jq and how the language emerged from that conception. The tutorial is a great start, but something a little more fleshed out and introspective would help a lot, in my opinion.

kibbleble · on Aug 26, 2021

You mean the jq creator should write up about their linguistic choices, or the author of this article?

dkarl · on Aug 26, 2021

The jq creator.

endymi0n · on Aug 25, 2021

As much as I love jq and it has helped me, it always has an „awk“ vibe to me which is one of the tools I use less despite its capability, because of the same reason: I tend to forget all the special syntax over the years.

Compare that with „gron“ ( https://github.com/tomnomnom/gron ), which is arguably not as powerful and a bit clunky, but it allows me to compose and tie into the other unix tools way better.

It‘s mainly for that reason I use it more than jq these days for ad-hoc analysis.

pphysch · on Aug 25, 2021

AWK and JQ definitely suffer from "power-creep" as shell tools. IMO they are unreasonably powerful; full-fledged scripting languages masquerading as simple text-processing tools.

Part of the blame is definitely on the fact that the "idiomatic" way to filter columns in a shell pipeline is to invoke AWK: `awk '{print $3,$5}'`. Similarly for JQ. Virtually every sysadmin and programmer gets introduced to these languages as Unixy tools when in fact they are antithetical to the Unix philosophy.

The result is ending up with overcomplicated "production" pipelines (curl|sed|awk|jq) when you really could be writing one far more coherent, maintainable, scalable C, Go, Python, etc. program with their standard libraries.

alerighi · on Aug 25, 2021

This. I stopped writing shell script for the exact reason. When something is more complex than launching a couple of commands, and involves for example JSON processing, I write the script in python.

I realized it after wasting multiple hours debugging problems at work in scripts that used jq or AWK or similar tools. Most of the time the problem was solved by quoting things at random, until I discovered an edge case that I hadn't considered and the program broke, again.

Now when I have that sort of problem I don't even bother trying to fix it, I just rewrite the whole script in python (they are usually small scripts so it's a question of 15 minutes most of the time). And writing new scripts in bash is banned (except in particular cases).

There is also the question of portability: most people assume that everyone has a way to install those tools because they have them on their own system. It's not that simple, and things break when going to production or CI because jq is missing. And good luck with Windows, by the way.

cyberge99 · on Aug 25, 2021

Good writeup!

One nit: ‘jq -r’ seems more fundamental than a sidenote, especially considering it’s a cli tool. That could just be how I use it though (as glue between json and bash).

Another good tool in that family is ‘jtbl’. It provides table output, which is useful for cut, awk, sed and column.

asicsp · on Aug 25, 2021

`jtbl` seems nice! I'd add:

* https://github.com/tomnomnom/gron - make json greppable

* https://sr.ht/~gpanders/ijq/ - interactive jq

cyberge99 · on Aug 25, 2021

The interactive ijq is nice! It’s useful and the ui makes it a good training tool. Thank you!

phnofive · on Aug 25, 2021

You might know this, but feeding arrays to '|@tsv' will provide output ready to pipe to these as well.
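
A minimal sketch of that, on made-up data (the sample array and variable names are assumptions). Note the `-r`: without it jq prints each row as a JSON string with the tabs escaped.

```shell
# Hypothetical sample data.
issues='[{"title":"Bug A","number":1},{"title":"Bug B","number":2}]'

# Build one array per row, then format each row as a tab-separated line.
rows=$(printf '%s' "$issues" | jq -r '.[] | [.number, .title] | @tsv')

# The result can go straight to cut, awk, sort, column and friends.
titles=$(printf '%s\n' "$rows" | cut -f2)

printf '%s\n' "$rows"
printf '%s\n' "$titles"
```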

snug · on Aug 25, 2021

Something I just learned about the other day was jiq [0] to help query the json keys

[0] https://github.com/fiatjaf/jiq

ducktective · on Aug 25, 2021

You can do the same thing using fzf preview mode

maattdd · on Aug 25, 2021

Can you elaborate ?

gnyman · on Aug 25, 2021

I guess they mean something like this, which I find useful at quickly iterating over jq

  fzf --print-query --preview-window wrap --no-clear --preview 'cat file.json | jq {q}'

you can also pipe a curl in the above, but that will mean a lot of (slow) requests, so I have this snippet saved for running fzf --preview with jq on something from a web service

  curl -L https://datahub.io/core/covid-19/r/worldwide-aggregate.json > /tmp/foo && echo '' | fzf --print-query --preview-window wrap --no-clear --preview 'cat /tmp/foo | jq {q}'

now if you write something like

  .[0]["Confirmed"]

you will get a live preview of the result as you type

I did not know about `jiq`, so thanks for that tip, it looks like it does the same or something similar, but without the extra cruft of storing a temp file

thaliaarchi · on Aug 25, 2021

It can be just `jq {q} file.json`. No need for `cat`.

ducktective · on Aug 25, 2021

`echo '' | fzf --print-query --preview "cat file.json | jq {q}"`

https://news.ycombinator.com/item?id=23434018

ducaale · on Aug 25, 2021

For anyone struggling with jq's syntax, I recommend taking a look at fx [1] which uses JavaScript as the query language.

The author of the tool has also written a guide [2] and recorded a screencast [3] about the tool.

[1] https://github.com/antonmedv/fx

[2] https://medium.com/@antonmedv/discover-how-to-use-fx-effecti...

[3] https://youtu.be/ktfeRxKog98

ur-whale · on Aug 25, 2021

> However, some things never stick in my head, nor my fingers, and I have to google them every time. jq is one of these.

However powerful jq may be, the comment above, which has exactly been my experience with jq, summarizes quite nicely the biggest hurdle with this tool.

99% of my jq use looks like this:

     cat file.json | jq . | <regular list of unix filters>

and the remaining 1% is straight cut and paste from google / stackoverflow that may or may not end up doing what I want.

jq's DSL is inscrutable

enriquto · on Aug 25, 2021

you may enjoy one of the various "gron" commands, that format json as a list of lines, with one field per line, and is very easily greppable, seddable, and awkable.

ur-whale · on Aug 25, 2021

>you may enjoy one of the various "gron" commands

This is the second time someone recommended this on HN, so this time, I did go and have a look.

Really nice indeed, thanks for the tip.

buzzwords · on Aug 25, 2021

I have to use jq once in a blue moon, which means I have to rely on trial and error, because I have forgotten all that I learned last time.

jabo · on Aug 25, 2021

If you work with JSON and CSV data regularly, I'd also recommend checking out Miller: https://github.com/johnkerl/miller

From the project's description: Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON

rgrau · on Aug 25, 2021

A very useful function in jq is "join", which I use a lot to cook the final shape of the data (many times used with fzf/dmenu)

Here's a simple way to list and browse github issues of a given user/repo:

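    #!/usr/bin/env sh
    # Usage: pass user/repo as the first argument.

    # Open the chosen issue in the browser.
    browse_url() {
      firefox "http://github.com/$1/issues/$2"
    }

    # List issues as "number | title", pick one in dmenu, keep only the number.
    issue=$(curl "https://api.github.com/repos/$1/issues" |
          jq -r 'map([(.number|tostring), .title] | join(" | ")) | join("\n")' |
          dmenu -i -l 10 |
          awk '{print $1}')

    browse_url "$1" "$issue"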

Although the `| join("\n")` part could be done in a more idiomatic way with just `[]`, sometimes the manual way is still clearer to me:

    map([(.number|tostring), .title] | join(" | "))[]

ekimekim · on Aug 25, 2021

Personally I find it clearer in many cases to use a format string instead. ie. instead of writing:

    [(.number|tostring), .title] | join(" | ")

I would write:

    "\(.number) | \(.title)"

which IMO is more readable in cases where you have specific values you want to put in specific places, as opposed to a list of unknown length which you want joined (eg. I would still use join("\n") in your example).
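
Side by side on a made-up issue (the sample object and variable names are assumptions), the two spellings give the same string:

```shell
# Hypothetical sample issue.
issue='{"number":42,"title":"Fix parser"}'

# Array plus join: the number has to be converted with tostring first.
joined=$(printf '%s' "$issue" | jq -r '[(.number|tostring), .title] | join(" | ")')

# String interpolation: \( ... \) converts the value to a string for you.
interpolated=$(printf '%s' "$issue" | jq -r '"\(.number) | \(.title)"')

echo "$joined"
echo "$interpolated"
```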

rgrau · on Aug 25, 2021

Oh!

I didn't know that one could build an arbitrary string like that inside a map.

Thanks a lot for that, I agree it looks better!

stueynz · on Aug 25, 2021

My OpenAPI spec is 60,000 lines of machine generated goodness. JQ is great for doing non-trivial but useful things. Eg: Which endpoints are available if you have a specific OAuth Scope?

Most useful jq cli flag is -f: take the jq script from a file.

Most useful tutorial for learning to manipulate your OpenAPI spec: https://apihandyman.io/api-toolbox-jq-and-openapi-part-1-usi...

gulbrandr · on Aug 25, 2021

The full link: https://apihandyman.io/api-toolbox-jq-and-openapi-part-1-usi...

stueynz · on Aug 25, 2021

Thx. I hate typing on my mobile.

jrockway · on Aug 25, 2021

Big fan of JQ. I like it more than the traditional UNIX suite of text manipulation commands, because I get closer to "querying" rather than just filtering. It has really made me rethink where I want "interacting with a computer" to go in the future -- less typing commands, more querying stuff.

I have a few utilities involving JQ that I wrote.

For structured logs, I have jlog. Pipe JSON structured logs into it, and it pretty-prints the logs. For example, time zones are converted to your local time, if you choose; or you can make the timestamps relative to each other, or now. It includes jq so that you can select relevant log lines, delete spammy fields, join fields together, etc. Basically, every time you run it, you get the logs YOU want to look at. https://github.com/jrockway/json-logs. Not to oversell it, but this is one of the few pieces of software I've written that passes the toothbrush test -- I use it twice a day, every day. All the documentation is in --help; I should really paste that into the Github readme.

I am also a big fan of using JQ on Kubernetes objects. I know what I'm looking for, and it's often not in the default table view that kubectl prints. I integrated JQ into a kubectl extension, to save you "-o json | jq" and having to pick apart the v1.List that kubectl marshals objects into. https://github.com/jrockway/kubectl-jq. That one actually has documentation, but there is a fatal flaw -- it doesn't integrate with kubectl tab completion (limitation of k8s.io/cli-runtime), so it's not too good unless you already have a target in mind, or you're targeting everything of a particular resource type. This afternoon I wanted to see the image tag of every pod that wasn't terminated (some old Job runs exist in the namespace), and that's easy to do with JQ: `kubectl jq pods 'select(.status.containerStatuses[].state.terminated == null) | .spec.containers[].image'`. I have no idea how you'd do such a thing without JQ, probably just `kubectl describe pods | grep something` and do the filtering in your head. (The recipes in the kubectl-jq documentation are pretty useful. One time I had a Kubernetes secret that had a key set to a (base64-encoded) JSON file containing a base64-encoded piece of data I wanted. Easy to fix with jq; `.data.THING | @base64d | fromjson | .actualValue | @base64d`.)

JQ is something I definitely can't live without. But I will admit to sometimes preprocessing the input with grep; `select(.key|test("regex"))` is awfully verbose compared to "grep regex" ;)

jillesvangurp · on Aug 25, 2021

Very useful tool indeed. Worth investing some time in if you like doing data processing on the command line.

Stuff I do with it:

- prepare json request bodies for curl commands by constructing json objects using environment variables

- grab content from a deeply nested json structure for usage in a script

- extract csv from json

- pretty print json or ndjson output: `curl ... | jq -C '' | less -r`. I actually have an alias set up for that.
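
The first three items can be sketched roughly like this. Everything here is a made-up illustration: the variable names, field names and sample documents are assumptions, and `--arg` is just one way to get shell values into jq safely.

```shell
# Hypothetical environment values; the second contains quotes on purpose.
API_USER='alice'
API_NOTE='He said "hi"'

# 1. Build a request body; --arg passes values as strings and jq does the escaping.
body=$(jq -n -c --arg user "$API_USER" --arg note "$API_NOTE" '{user: $user, note: $note}')

# 2. Grab one value out of a nested structure for use in a script.
nested='{"a":{"b":[{"c":"deep"}]}}'
value=$(printf '%s' "$nested" | jq -r '.a.b[0].c')

# 3. Turn an array of objects into CSV rows.
records='[{"id":1,"name":"x"},{"id":2,"name":"y"}]'
csv=$(printf '%s' "$records" | jq -r '.[] | [.id, .name] | @csv')

echo "$body"
echo "$value"
printf '%s\n' "$csv"
```

The body could then be handed to curl with something like `-d "$body"`; building it with string concatenation in the shell instead is where the quoting problems tend to start.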

The syntax is a bit hard to deal with. I find myself copy pasting from stack overflow a lot when I know it can do a particular thing but just can't figure out how to do it.

jareware · on Aug 25, 2021

If there's an API for working with data that you know well already, it's somewhat pointless to learn the (quite esoteric) jq one.

Just pipe curl to node (https://github.com/jareware/howto/blob/master/Replacing%20jq...) or Python or Ruby or whatever you already know!

pwdisswordfish8 · on Aug 25, 2021

Or use that language’s native HTTP client and cut out the shell middleman.

throwaway894345 · on Aug 25, 2021

jq is much more terse. When I’m working in bash, I much prefer to write `kubectl get secret foo -o json | jq -r '.data | map_values(@base64d)'` rather than the equivalent Python.
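
For trying this without a cluster, here is a sketch that fakes the shape of a secret and then decodes it. The key names and values are invented, and `@base64d` assumes jq 1.6 or newer.

```shell
# Build a stand-in for `kubectl get secret foo -o json` (hypothetical keys and values).
secret=$(jq -n -c '{data: {username: ("admin" | @base64), password: ("s3cr3t" | @base64)}}')

# Decode every value under .data while keeping the keys.
decoded=$(printf '%s' "$secret" | jq -c '.data | map_values(@base64d)')

echo "$secret"
echo "$decoded"
```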

Karupan · on Aug 25, 2021

JQ is great once you get the hang of it. Some time back I had to write some simple tests on JSON output and I wrote a couple of helpers if anyone else is interested [0].

[0] https://gist.github.com/Checksum/72d927471c76c76c46418b3ee88...

Osiris · on Aug 25, 2021

I really want to like JQ. I know the tool is super powerful. Unfortunately I find the syntax obtuse and very hard to remember, especially when I only use it on rare occasions.

Everytime I want to use it I end up searching up examples of the syntax and not quite getting it right.

woile · on Aug 25, 2021

For exploratory purposes one can also use `jq 'keys'`

    cat file.json | jq 'keys' | grep ependencies

This will list the keys, which is sometimes really helpful with a big json whose schema you don't know, when you have the intuition that some key should be there.

xorcist · on Aug 25, 2021

Should you wish, jq can do the searching too, without the need to pipe into another command:

  jq 'keys | map(select(test("ependencies")))' file.json

map(select()) is a pretty useful construct to loop over stuff and pick out the interesting parts.
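
A self-contained version to experiment with, using a made-up package.json-like object in place of file.json (the sample and variable names are assumptions; `test` needs a jq build with regex support, which the usual packages have):

```shell
# Hypothetical stand-in for file.json.
pkg='{"name":"demo","dependencies":{},"devDependencies":{},"scripts":{}}'

# keys returns a sorted array of key names; map(select(...)) keeps the matching ones.
matches=$(printf '%s' "$pkg" | jq -c 'keys | map(select(test("ependencies")))')

# The same construct works on arrays of objects, here keeping only open items.
items='[{"id":1,"open":true},{"id":2,"open":false}]'
open_ids=$(printf '%s' "$items" | jq -c 'map(select(.open)) | map(.id)')

echo "$matches"
echo "$open_ids"
```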

brundolf · on Aug 25, 2021

I don't work with data, so my most common usage for jq is checking package.json properties from the command line:

  cat node_modules/some_lib/package.json | jq '.version'

MichaelMoser123 · on Aug 25, 2021

i have a jq tutorial, where you can click on each step of the pipeline to view what it is doing [1], even had it featured here on hn, a while ago. [2]

[1] https://mosermichael.github.io/jq-illustrated/dir/content.ht...

[2] https://news.ycombinator.com/item?id=22626080

alblue · on Aug 27, 2021

Since the string syntax resembles JavaScript's, you can also concatenate strings together with `+` as part of your output, e.g. `.Name + "(" + .Email + ")"`
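
A small sketch of that on a made-up record (the field values, the extra `Id` field and the variable names are assumptions). One difference from JavaScript worth knowing: jq will not add a number to a string, so numbers need `tostring` first.

```shell
# Hypothetical sample record.
person='{"Name":"Ada","Email":"ada@example.com","Id":7}'

# + concatenates strings.
label=$(printf '%s' "$person" | jq -r '.Name + " (" + .Email + ")"')

# Numbers have to be converted before they can be concatenated.
tagged=$(printf '%s' "$person" | jq -r '.Name + " #" + (.Id | tostring)')

echo "$label"
echo "$tagged"
```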

dprophecyguy · on Aug 25, 2021

Most of the APIs in normal workflow are hidden behind a token. Is there a way to smooth the workflow for putting tokens in curl command ?

tyingq · on Aug 25, 2021

If the token is passed as a url parameter in a get request, -G can be useful as it forces the -d key=value switches to append to the url. So that:

  curl -G 'https://api.example.com' \
    -d foo=bar \
    -d baz=whee

Is the same as

  curl 'https://api.example.com?foo=bar&baz=whee'

That can make all the quoting hell a bit easier, as you're doing it one param at a time.

zomglings · on Aug 25, 2021

Assuming you pass your token in a header, put your token in an environment variable and add it to your request like this:

    curl -H "Authorization: Bearer $TOKEN" ...

??

tex32 · on Aug 25, 2021

Excellent introduction!

It can really be overwhelming when you realize they control all of the major institutions in our country.

arwineap · on Aug 26, 2021

Has anyone figured out a workflow to nicely integrate jq filters into a postman type application?

gigatexal · on Aug 25, 2021

Been looking for such a write up. Kudos to the author.

singularity2001 · on Aug 25, 2021

alias json_query=jq

no need to memorize

solmag · on Aug 25, 2021

That title though.

notjes · on Aug 25, 2021

dirty mind

bennysomething · on Aug 25, 2021

Can anyone explain why I'm seeing JQ all over the web?! Yesterday a vendor gave a presentation saying that we should use it when using their command line tools, what's going on?

thaliaarchi · on Aug 25, 2021

Frequency illusion

> The frequency illusion is that once something has been noticed then every instance of that thing is noticed, leading to the belief it has a high frequency of occurrence

https://en.wikipedia.org/wiki/List_of_cognitive_biases

RedComet · on Aug 25, 2021

What is the cognitive bias called in which you see an internet post talking about a personal experience that is clearly perfectly plausible/likely, but instead choose to down vote them and assert the said experience is fraudulent by citing a wikipedia article that is barely incidentally related?

This is clearly a flavor-of-the-month tool that has been getting much coverage on these sorts of sites lately. Why impugn the guy's mental function?