
There actually is a limit of 100 pages per submission, but it is obviously not storing the data for those 100 pages too well.

Looks like tomorrow is debugging day!

I wrote this app over the course of a day a couple of months ago and it's been languishing on my server since then. At the moment it's written in PHP and is not exactly what you'd call 'well tested'.

I'm considering migrating it to Rails (or Camping). Any suggestions for mini-frameworks (like Camping) suitable for a single-page app would be welcomed!




Very cool tool. Useful too. =]

The framework isn't the issue in this case. It looks like you're loading all 100 pages into memory at one time, or keeping the previous pages in memory as you go "deeper" into the hierarchy.

I recommend keeping just a few pages in memory at any given time, along with a simple list of the URLs scraped out of that HTML. If you're using PHP, remember to unset() the variables storing the HTML once you're done with them.
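
To make that concrete, here's a rough sketch of the idea. It's in Python rather than PHP, purely for illustration, and it's not the app's actual code. The names crawl, LinkExtractor and fetch are all made up for this example; fetch stands in for whatever function downloads a page and returns its HTML as a string. The point is that only one page body is alive at a time, and the only things kept around are URLs.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl that retains URLs only, never page bodies.

    `fetch` is an assumed callable: it takes a URL and returns HTML text.
    """
    seen = {start_url}            # URLs already queued, to avoid repeats
    queue = deque([start_url])    # URLs still waiting to be fetched
    visited = []                  # URLs fetched so far, in order

    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)         # the only page body held in memory
        parser = LinkExtractor()
        parser.feed(html)
        del html                  # rough analogue of PHP's unset()
        visited.append(url)

        for href in parser.links:
            absolute = urljoin(url, href)
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)

    return visited
```

If the raw HTML really is needed for later analysis, writing each page to disk or a database before dropping it would keep memory flat while still preserving the data.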

I haven't used PHP in a long time and of course I'm just speculating based on the error message, so slight disclaimer there.


I also meant to say, re: "Framework isn't the issue in this case": I agree. It is possible to turn out nice code in PHP and write fast, reliable apps. That being said, Rails has me addicted to the idea of unit tests within easy reach. I'm not exactly a massive fan of any of the unit testing systems that I have seen for PHP, and that is my main reason for wanting to switch away in this case...

I use PHP all the time; I am just a little dissatisfied with it in light of the alternatives that are out there.


Yeah, I'm pretty sure that I was storing all of the raw HTML for possible further analysis down the line. I haven't even looked at the code in months, but it definitely needs overhauling!

Initially it started out as a 'what if I do this' kind of app. I would like to rewrite the backend with more attention to detail...


Check out Sinatra for the micro-framework, and RestClient to get the websites.


How about Google's AppEngine? It scales quite nicely if you write your stuff in a non-braindead way.

A public service like that is a perfect candidate for their architecture, I think.


Good suggestion! I've been meaning to try out AppEngine for a while now but haven't had the chance. It would also be nice to keep any load away from my servers...



