Comments on: A little programming project – part 3

By: colin

colin — Mon, 27 Oct 2008 08:44:27 +0000

Lawrence

You’re completely right about the spare “links” line. And also, of course, about the way I chose to implement getting a web page – I know that a file object is created, but I’m not interested in it and I’m delighted to let Python clean it up for me. All I want is the contents of the web page in a form I can manipulate.

Cheers

Colin

By: Lawrence D'Oliveiro

Lawrence D'Oliveiro — Fri, 24 Oct 2008 22:26:02 +0000

One subtlety worth mentioning: in the line

page = urllib.urlopen(linkspage).read()

the urllib.urlopen call returns a file-like object; you call its read() method to obtain the contents, then discard the object, whereupon Python’s memory management automatically closes it. This may horrify people used to explicitly closing every file they open, but it works!
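For anyone who would rather not lean on automatic cleanup (prompt closing on discard is a property of CPython's reference counting, not something every Python implementation promises), here is a minimal sketch of the explicit alternative. The helper name read_and_close and the in-memory stand-in for the web page are made up for illustration; in Python 3 the opener you would pass in is urllib.request.urlopen.

```python
import io
from contextlib import closing


def read_and_close(opener, url):
    """Return the full contents of whatever opener(url) yields,
    closing the file-like object explicitly instead of relying on
    the interpreter to clean it up."""
    with closing(opener(url)) as handle:
        return handle.read()


# Stand-in for a network response so the sketch runs offline.
# (Hypothetical data; a real call would pass urllib.request.urlopen.)
fake_response = io.BytesIO(b"<html>hello</html>")
content = read_and_close(lambda url: fake_response, "http://example.invalid/")

print(content)
print(fake_response.closed)
```

The one-liner in the post is shorter; this version just makes the moment of closing explicit.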

By: Lawrence D'Oliveiro

Lawrence D'Oliveiro — Fri, 24 Oct 2008 22:24:12 +0000

It’s amazing what a little bit of Python can do.

Just a stylistic matter: I would prefer to write the while-loop as follows:

while True :
    page = urllib.urlopen(linkspage).read()
    links = re.findall(r'"http\S*?echnology\S*?"', page)
    if len(links) >= 2 :
        break
    time.sleep(60)

This saves checking len(links) twice. Also you have a line that just says

links

which I don’t think is doing anything useful, and can be removed.
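The loop shape suggested above can be tried out without touching the network. This is only a sketch under some assumptions: the function name wait_for_links, its fetch and sleep parameters, and the sample pages are all invented for illustration, and the pattern is the one from the post with plain double quotes.

```python
import re
import time

# Pattern from the post: a quoted URL containing "echnology".
LINK_PATTERN = re.compile(r'"http\S*?echnology\S*?"')


def wait_for_links(fetch, minimum=2, delay=60, sleep=time.sleep):
    """Call fetch() repeatedly until the page holds at least
    `minimum` matching links, pausing `delay` seconds between tries."""
    while True:
        page = fetch()
        links = LINK_PATTERN.findall(page)
        if len(links) >= minimum:
            return links
        sleep(delay)


# Hypothetical pages: the first has one matching link, the second has two.
pages = iter([
    '<a href="http://example.invalid/technology/1">one</a>',
    '<a href="http://example.invalid/technology/1">one</a> '
    '<a href="http://example.invalid/Technology/2">two</a>',
])
waits = []  # records requested delays instead of actually sleeping

found = wait_for_links(lambda: next(pages), sleep=waits.append)
print(found)
print(waits)
```

Passing the fetcher and the sleep function in as arguments is just a convenience for testing; with the defaults, and a fetch that downloads the real page, it behaves like the loop in the comment.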

By: Eugene

Eugene — Tue, 21 Oct 2008 23:23:13 +0000

I am searching for some ideas to write about in my blog… and somehow came to your blog. Best of luck. Eugene