Hnweekly: weekly top stories from Hacker News (watdahel.com)
62 points by xtimesninety on Dec 26, 2008 | 29 comments



Cool idea. I created a Top Stories Directory 2 days ago too. It records every number-one story, starting from 2 days ago. It lost track of some stories because of a few bugs, but I have already fixed them all.

http://hn.siong1987.com


It would be good if you provided links to the corresponding HN/comments pages as well as to the stories themselves.


nice work too :)


How often do you crawl Y Combinator? Do you actually crawl every page on YC?


I scrape only the front page once every ~4hrs (minimum)

Nearlyfreespeech.net doesn't have cron yet, so I'm using onlinecronjobs.com. If you want to scrape it on demand just go to http://hnweekly.watdahel.com/index/scrape


But, if you only scrape the frontpage, how can you know the points for all the past stories which are not on the frontpage anymore?


It's not 100% accurate, but it's close. The reason I did the website is so I can catch up with the good stories that were submitted earlier during the week (I usually read during weekends).
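
One way to read the "close, but not 100% accurate" answer: if only the front page is scraped, a story's score is known only as of the last scrape in which it was still listed. A minimal Python sketch of that idea follows. It is not the site's actual PHP code; the function name and the (url, points) snapshot shape are assumptions made for illustration.

```python
def merge_snapshots(snapshots):
    """Keep the highest point count observed for each story.

    `snapshots` is a list of front-page scrapes; each scrape is a list of
    (url, points) pairs. This data shape is hypothetical.
    """
    best = {}
    for snapshot in snapshots:
        for url, points in snapshot:
            # A story that has left the front page keeps its last seen score,
            # which is why the totals can lag behind the real ones.
            if points > best.get(url, -1):
                best[url] = points
    # Highest score first, as on a "top stories" page.
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


scrapes = [
    [("a", 10), ("b", 5)],   # first scrape
    [("a", 25), ("c", 7)],   # "b" has dropped off the front page
]
top = merge_snapshots(scrapes)
```

Here "b" stays frozen at 5 points even if it gained more after leaving the front page.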


Good work. Needs an RSS feed.


now with rss feeds!


good idea :)


It looks like it sorts by points?

Is that the best way to determine top stories, or would something like a time-at-position score be better?

(Thanks - I kept meaning to do this myself. Would you do proggit & friends too, please?)
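
The time-at-position idea above could be sketched roughly as follows, assuming the scraper stores each front-page scrape as an ordered list of story URLs. The function name, the linear weighting, and the default page size of 30 are all assumptions for illustration, not something the site is known to do.

```python
from collections import defaultdict


def time_at_position_scores(snapshots, page_size=30):
    """Score stories by how long and how high they sat on the front page.

    Each snapshot is a list of URLs ordered from rank 1 downwards. For every
    snapshot, a story earns `page_size - position` credits, so rank 1 earns
    the most. The linear weighting is an arbitrary choice.
    """
    scores = defaultdict(int)
    for ranked_urls in snapshots:
        for position, url in enumerate(ranked_urls[:page_size]):
            scores[url] += page_size - position
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = time_at_position_scores(
    [["a", "b"], ["b", "a"], ["b", "c"]],
    page_size=3,
)
```

With evenly spaced scrapes, each snapshot stands in for an equal slice of time, so a story that holds a high rank across many scrapes outranks one that spiked briefly.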


yep, it's sorted by points. reddit already has this: http://www.reddit.com/r/programming/top/?t=week (not sure about the other sites; I'm usually just on hn, reddit and the stuff in my google reader)


Doesn't seem to handle non-ASCII very well: "Does Gdel matter?"


I might have problems with unicode in PHP (I really don't know much here yet), so I opted to remove non-ASCII characters (for now) using a regex.

btw I'm using http://simplehtmldom.sourceforge.net/ to scrape/parse, which is great: it lets you use jquery-style selectors on html.


ö doesn't need Unicode. It's ISO-Latin-1 (ISO-8859-1). An update is ISO-8859-15. Latin-1 is an 8 bit character set including ASCII, covering the Western European languages. I don't know PHP, but it could be the default or a simple character set option.
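
To illustrate the point about character sets, here is a small Python sketch (Python rather than PHP, purely for illustration; the helper name is hypothetical). It shows that ö is a single byte in Latin-1 and two bytes in UTF-8, and what happens when UTF-8 bytes are read as Latin-1.

```python
def describe_encodings(ch):
    """Return the byte encodings of a character in Latin-1 and UTF-8."""
    return {
        "latin-1": ch.encode("latin-1"),
        "utf-8": ch.encode("utf-8"),
    }


# "\u00f6" is the letter o with diaeresis.
info = describe_encodings("\u00f6")

# Reading UTF-8 bytes as if they were Latin-1 yields two wrong characters.
misread = info["utf-8"].decode("latin-1")
```

Whether a scraper needs Unicode handling therefore depends on which encoding the scraped page actually uses, which should be checked rather than assumed.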



In that case, it should replace ö with o, etc.
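
A minimal sketch of that replacement using only Python's standard library. This is one common approach, not the recipe linked elsewhere in this thread, and the function name is an assumption. It decomposes accented letters into a base letter plus combining marks, then drops anything that is not ASCII.

```python
import unicodedata


def to_ascii(text):
    """Strip accents by decomposing characters and discarding non-ASCII.

    Characters without a decomposition (for example the German sharp s)
    are dropped entirely, so this is lossy.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


title = to_ascii("Does G\u00f6del matter?")
```

Here the title keeps its "o" instead of losing the letter altogether, which is the behaviour the regex-based removal lacks.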


I have a screenscraping thing in place that targets an ASCII only environment (LambdaMOO mud). Manually replacing unicode stuff like this has been a pain in the butt. Luckily that was all just for fun and didn't need to be perfect.

Is there a good library out there (in any language) that does good unicode --> ASCII substitutions for major languages?


I once used latin1_to_ascii (The Unicode Hammer) in python: (works great) http://code.activestate.com/recipes/251871/


Awesome thanks. Two coincidences: this fun mud thing is in Python (works via RPC)... and the entry method name is "hammer," heh.


I've been looking for something like this but didn't find one. btw I only started scraping like 2 days ago :) so it's not yet a week's worth of data.

fyi: there's also http://news.ycombinator.com/lists which shows the top stories in 2 weeks


Lists are almost cool.

Allow us to choose: Today, Yesterday, Week, Month, Year


Pretty useful.

It would be nice to be able to see the top stories over any arbitrary time period as well. I imagine an interface with a draggable timeline to specify the period (something similar to what google finance uses, perhaps) could be even better.


thanks, i'm planning to add that too. you can also try this (to see more recent links): http://hnweekly.watdahel.com/?days=1


For a feed of all the #1 HN stories, use http://feeds.feedburner.com/HNWatrcoolr

For more top stories from hacker feeds, use http://hacker.watrcoolr.us/ (feed: http://feeds.feedburner.com/HackerWatrCoolr)


Nice and very useful.


Allow it to sort by number of comments


Agreed. It is often useful to see which stories are the most-discussed.




