Show HN: I made a CLI to retrieve info from the IMDB (github.com/zacoppotamus)
61 points by izac on Dec 24, 2012 | 37 comments



It's funny, this brings IMDB almost full-circle to its origins.

Back in the early 90s it was first this huge text file distributed via Usenet and FTP. I remember contributing a few points of trivia about T2: Judgment Day, one being the mirror of Arnold's "Get out." line to the trucker in Terminator to Robert Patrick delivering the same to the helicopter pilot in T2.

Later on, after the volume of data became unwieldy for a flat text file, it was converted to an MS-DOS-based database system. You'd download a ton of archive files, unpack them, then run a binary that would allow you to query things like title, year, actor. One of the best features from those days (and maybe the very early web-based IMDB) was you could say, "show me actors that were common to these 2 movies" (or maybe it was "given these 2 actors, show me movies they both starred in" -- or maybe both).

I sorely miss those days of a community-driven IMDB. I was truly saddened when it "sold out" and became a more-or-less commercial portal.


You can still access the data files for private use, see http://www.imdb.com/interfaces.

Unfortunately, the data has a restrictive license. Though, I'm not sure if that entirely holds, as I believe there are limitations (at least in the United States) to having copyright on compilations of factual data.


IMDB still has a common cast search feature, offering both "show me the actors common to these two movies" and "show me the movies common to these two actors":

http://www.imdb.com/search/common


For some of this stuff you can use Freebase, which has an awesome queryable JSON API and a very open and permissive license. Our movie data is quite good for structured information, though not so much for unstructured trivia. [Disclosure, I work on Freebase for Google.]


While afaik not as complete, there's also

http://www.themoviedb.org/

The main thing missing might be the ability to identify movies uniquely by IMDb ID... Something that can be very nice while searching.


Interesting, but you are not pulling information from IMDb. According to the source code, it is using omdbapi.com, an independent site which used to be imdbapi.com until Amazon issued them a C&D.

Keep in mind OmdbApi states that it uses all other resources except IMDB to compile its data, so you may not find some movies/information only found in IMDB.

Regardless, thank you for open sourcing your work.


Yup, sorry if I didn't make that clear enough.


> omdbapi.com an independent site which used to be imdbapi.com until amazon issued them a C&D.

How's omdbapi.com avoiding a C&D as well?


Omdbapi.com doesn't have the trademarked "imdb" word in their domain; they also seem to use Wikipedia and Freebase as their source.

Similar to how "Scroogle.org" stayed away from violating Google's trademark.


Nice work. Just a couple of Python tips in case you wanted them.

- Embrace PEP8 (http://www.python.org/dev/peps/pep-0008/). It'll help other developers work with your code.

- argparse.ArgumentParser has a parse_args method, so you don't have to mess around with sys.argv
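
To illustrate the second tip, here is a minimal sketch of what that could look like. The `title` argument and `--year` flag are made up for the example and are not taken from the actual script.

```python
import argparse


def build_parser():
    """Build the parser for a hypothetical movie-lookup script."""
    parser = argparse.ArgumentParser(
        description="Look up basic information about a movie."
    )
    # Positional argument: the title to search for.
    parser.add_argument("title", help="movie title to look up")
    # Optional flag (hypothetical): narrow the search to a release year.
    parser.add_argument("-y", "--year", type=int, default=None,
                        help="release year, to disambiguate remakes")
    return parser


def parse(argv=None):
    # With argv=None, parse_args() reads sys.argv[1:] on its own,
    # so the script never has to index sys.argv directly.
    return build_parser().parse_args(argv)
```

Calling `parse()` with no arguments uses the real command line; passing a list such as `parse(["Alien", "--year", "1979"])` makes it easy to test. You also get `-h`/`--help` output for free.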


Much appreciated, thanks.


I had made something similar in the form of a Chrome app/extension: you pasted lists of movie titles in a textarea and it retrieved basic information from IMDb about those titles.

What I had learned was that full text search on IMDb is of very poor quality and returns all sorts of crap; I ended up searching on Google to get the imdbId, and return to IMDb with that Id to get movie data.
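
For anyone trying the same trick, pulling the ID out of a search-result URL could look roughly like this. It is only a sketch and assumes title IDs have the form "tt" followed by seven or more digits; the ID in the usage note is a made-up placeholder, and `extract_imdb_id` is a name invented for this example.

```python
import re

# Assumption: IMDb title IDs look like "tt" followed by at least seven digits.
_IMDB_ID = re.compile(r"\btt\d{7,}\b")


def extract_imdb_id(text):
    """Return the first IMDb-style title ID found in text, or None."""
    match = _IMDB_ID.search(text)
    return match.group(0) if match else None
```

For example, `extract_imdb_id("http://www.imdb.com/title/tt1234567/")` gives back `"tt1234567"`, and a string with no ID in it gives `None`.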


What's with the recent trend to use aliases instead of proper installation? (c.f. datafart, on the frontpage yesterday)

    mv terminalIMDB.py ~/bin/imdb
    chmod +x ~/bin/imdb


People don't know how to actually make a python package. It's not hard.


Would the benefit of doing it the author's way be that you can check out new versions from git without having to move the python script to /bin or /usr/local/bin every time you get a new version?


I guess. But that could also be accomplished with a symlink, no?


Nice work. I see IMDB still don't have an API. I wrote a little movie comparison web app in the past and had to resort to screen scraping IMDB too. Rotten Tomatoes on the other hand seems to embrace developers a lot more and has an excellent API.

EDIT: Just realised you aren't actually screen scraping IMDB but using OMDB API (http://www.omdbapi.com) instead, I'll have to check that out.


IMDb do have an API, e.g. app.imdb.com, however this is meant for mobile applications and it can only be used if authorized in writing(!) by IMDb.

On CPAN there is the WebService::IMDB module, which uses this mobile API. There is also IMDB::Film, but (I believe) this works by scraping the IMDb website.

cpan links:

- https://metacpan.org/module/WebService::IMDB

- https://metacpan.org/module/IMDB::Film


Nice, thanks for the detailed post.


IMDB has an API, but it's not free.


I wrote almost the same script (https://gist.github.com/1918439) last year; here is the blog post (http://mushfiq.com/2012/03/13/python-movie-data-crawler/).


I feel like I must mention the IMDBpy[1] project; it's a mature framework/API to access IMDB via the web or to build a local database based on the flatfiles[2] IMDB make available ;)

IMDB itself don't provide a (free) API and webscraping is explicitly forbidden in their TOS[3], so don't start building a service around that;)

[1] http://imdbpy.sourceforge.net/
[2] http://www.imdb.com/interfaces
[3] http://www.imdb.com/help/show_article?conditions

(edit: forgot a link)


Had you previously noticed https://github.com/bgr/imdb-cli ? Can you tell us how yours differs from that and other IMDB CLI programs that have come before?


Very cool! I wrote a similar Python script (https://github.com/Pinkerton/Scripts/tree/master/Filmsy) to find common actors between movies. It basically acts as a Python API wrapper for omdbapi.com.

I made it because I found a simple webapp at one point that did this, but forgot to bookmark it and never found it again.
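
The core of the common-actors idea is just a set intersection. Here is a rough sketch, not the code from the repository above; it assumes each movie record is a dict with an "Actors" key holding a comma-separated string of names (which is how I understand omdbapi.com responses to be shaped), and the function names and sample records are placeholders.

```python
def cast_of(movie):
    """Split a comma-separated "Actors" string into a set of names."""
    names = movie.get("Actors", "").split(",")
    return {name.strip() for name in names if name.strip()}


def common_actors(movie_a, movie_b):
    """Return the actors listed in both records, sorted by name."""
    return sorted(cast_of(movie_a) & cast_of(movie_b))


# Placeholder records standing in for two API responses.
first = {"Title": "Movie One", "Actors": "Actor A, Actor B, Actor C"}
second = {"Title": "Movie Two", "Actors": "Actor C, Actor B, Actor D"}
print(common_actors(first, second))  # ['Actor B', 'Actor C']
```

Note that this only sees whatever names the record lists, so if the source returns just the top-billed cast, smaller roles will be missed.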


Very interesting. We've been using TMDB.org for our site filmquotra.com but are not tied to it. How do you distinguish yourself from tmdb?


  > how do you distinguish yourself from tmdb
This is a command-line program for accessing movie information (via someone else's web API), not a movie database system. Asking how he/she distinguishes his/herself from TheMovieDB is... a little odd. It would be like asking XBMC how they distinguish themselves from IMDB.


Thanks... I barely even understood your response to be honest... I'm not a programmer but trying to learn. Love Rentrak; I've been talking to someone there for a while now regarding a data feed.


  > i barely even understood your response to be honest
- IMDB & TMDB are collections of movie data.

- XBMC (aka Xbox Media Center) is a piece of software that plays videos and will grab information about those videos from IMDB or TMDB to display to the user (e.g. when you're watching a movie, it will show you a synopsis of the movie, box art, etc. grabbed from the web).

- This post is just about a program that will allow you to say "I want information about a movie called A Christmas Story" and it will fetch that information from the web and display it. The only real benefit of this is being able to do the query and view the response from a text terminal, instead of going to the website.


Looks good. I made http://www.deanclatworthy.com/imdb/ if anyone is interested too. Feel free to add it to your library.


Can you either charge for this or release it as OSS?


I doubt IMDB would be happy if I charged for it ;) I won't OS it for legal reasons (see what happened to imdbapi.com)


Pulling down URLs to images would be a really cool and useful addition. Good work though, looks great - will probably end up using this at some point.


Thanks, I'll put that in the to-do list :)


IMDB app is the new hello world ;) Here is mine: https://github.com/caruccio/pymdb


I really did need something like this, but never thought of making something myself!

Anyway, looking forward to using this tool!


[dead]


Why would you post this? He's open sourced his work and made something that will at least serve as learning material for python developers.

There's no need to be crabby.


Although I do agree, I still wonder if a generic 62-line python script is worth this much attention.



