Getting at your Google Spreadsheets columns

Regular readers know that I’ve been working on a pet project to build a command line interface to Google Spreadsheets. Basically, I find working in a spreadsheet interface to be clunky and uncomfortable. If I need to put in a new row, I’d rather just be prompted at the CLI for the values I want to put in for each column. Later I’ll add the ability to edit and query my spreadsheet from the command line as well. The nice thing is that I can’t see any reason for this application to be specific to any *particular* spreadsheet, so anyone should be able to use this for whatever Google Spreadsheets document they want :-D

For now, though, in the event that other coders are struggling with their own project, here’s how I finally figured out how to print out the “column: value” pairs for every row in my spreadsheet:

spreadsheet_id = PromptForSpreadsheet(gd_client)
worksheet_id = PromptForWorksheet(gd_client, spreadsheet_id)
columnfeed = ListGetAction(gd_client, spreadsheet_id, worksheet_id)
# Each entry in the feed is one row; val.custom maps column name -> cell.
for val in columnfeed.entry:
    for key in val.custom.keys():
        print "%s:   %s" % (key, val.custom[key].text)
    print "\n"
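
If you'd rather collect the rows than print them, the same loop can build one dict per row. Here's a minimal sketch of that idea. FakeCell and FakeEntry are stand-ins I made up so the snippet runs without a Google account; they only mimic the two things the loop above touches (a `custom` dict whose values have a `.text`). The function name `rows_as_dicts` and the sample data are likewise invented for illustration.

```python
from collections import namedtuple

# Hypothetical stand-ins that mimic only the attributes used in the loop
# above: each entry has a `custom` dict whose values expose `.text`.
FakeCell = namedtuple("FakeCell", ["text"])
FakeEntry = namedtuple("FakeEntry", ["custom"])


def rows_as_dicts(entries):
    """Turn list-feed style entries into a list of {column: value} dicts."""
    rows = []
    for entry in entries:
        row = {}
        for key in entry.custom.keys():
            row[key] = entry.custom[key].text
        rows.append(row)
    return rows


# Values can contain commas or colons; reading them column by column
# keeps them intact. This sample row is invented.
sample_entries = [
    FakeEntry(custom={
        "title": FakeCell("Regex, and when: not to use it"),
        "author": FakeCell("Some Author"),
    }),
]

for row in rows_as_dicts(sample_entries):
    for column in sorted(row):
        print("%s:   %s" % (column, row[column]))
```

With the real client, I'd expect passing `columnfeed.entry` in place of `sample_entries` to work the same way, since the function only relies on `custom` and `.text`.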

I had initially hit a bit of a snag in getting to this point, because it’s not made clear in the Google documentation how to reference your columns. They *do* tell you how to print “val.content.text”, but that prints all of the column:value combinations together in one big, long string. You can’t even parse that, because nothing is quoted, and column data can include anything you might use as a delimiter. I finally got around to looking at this again today, and with the help of IDLE (which I’ve now accepted as my saviour) finally poked and prodded ‘val’ until it spit out something close to what I was looking for.

With that task out of the way, it should be easy enough to start performing write operations, and I believe I remember query operations being documented by Google separately – so hopefully this will go more quickly now.

Wish me luck!

Python, regex, and IRC

So, I’m on IRC a lot. I’m on a lot of channels, too. I’m on more than one Python channel. One scenario that comes up somewhat often in these chans is a user converting from PHP, Perl, Ruby, or whatever walking in and wanting a better understanding of how regex works in Python.

Flaming ensues.

Flaming the flamers is a topic for another blog post, but for some reason, Python users seem to really be resistant to regex. At one point, I actually suggested in one of the larger channels that someone write an article for Python Magazine about the proper use (or non-use) of regular expressions. That was weeks ago now. I got no bites.

So here again, I’ll put this in a very public place and say that if you can get me a proposal for an article that details when to use or not use regex, and how to use them properly when you *should* use them, we will pay you to write that article.

Recovering deleted files from an svn repository

I know I’m going to forget how to do this, because I only ever need to do it once a year or something, so I’ll put it here for safe keeping:

To recover a file from svn that you deleted from your working copy, it’s first necessary to get the proper name of the file, and the last revision of the repository it existed in. To do that (assuming you don’t know, because if you do you have bigger issues), you go to the directory it was in (or as close as you can get to the directory it was in) and run:

> svn log --verbose

You should be able to find the file you’re looking for and the revision you need in the output of that command. Assuming your file’s name is ‘file.txt’ and it was in revision 250, you run the following to recover it:

> svn up -r 250 file.txt

Done. It’s there waiting for you. Enjoy. I had been fumbling around with ‘svn co’ syntax until a digital buddy of mine corrected me. Thanks, Nivex!
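
If you'd rather not eyeball the log, the lookup can be scripted. Here's a rough sketch in Python that scans `svn log --verbose` style output for the deletion of a file and reports the revision just before it. The sample log text, user name, and paths are made up for illustration, and the function name `last_revision_with` is my own invention; in practice you'd feed it the real command output. It also assumes the file itself shows up with a "D" in the log, which won't be the case if a whole parent directory was removed.

```python
import re

# Sample text in the shape `svn log --verbose` prints; the revision numbers,
# user name, and paths here are invented for illustration.
SAMPLE_LOG = """\
------------------------------------------------------------------------
r251 | someuser | 2007-10-01 12:00:00 -0400 (Mon, 01 Oct 2007) | 1 line
Changed paths:
   D /trunk/docs/file.txt

clean up
------------------------------------------------------------------------
r250 | someuser | 2007-09-30 09:30:00 -0400 (Sun, 30 Sep 2007) | 1 line
Changed paths:
   M /trunk/docs/file.txt

fix typo
------------------------------------------------------------------------
"""

REVISION_LINE = re.compile(r"^r(\d+) \|")
DELETED_PATH = re.compile(r"^\s+D (\S.*)$")


def last_revision_with(filename, log_text):
    """Return (revision, path) for the newest revision that still had the file.

    The log lists newest revisions first, so the first "D" (deleted) entry
    whose path ends with `filename` is the most recent deletion; the file
    still existed in the revision just before it.
    Returns None if no matching deletion is found.
    """
    current = None
    for line in log_text.splitlines():
        header = REVISION_LINE.match(line)
        if header:
            current = int(header.group(1))
            continue
        deleted = DELETED_PATH.match(line)
        if deleted and current is not None:
            path = deleted.group(1)
            if path == filename or path.endswith("/" + filename):
                return current - 1, path
    return None


print(last_revision_with("file.txt", SAMPLE_LOG))
```

The revision it returns is the one to hand to ‘svn up -r’.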

PlanetPlanet++

I have to admit that I have not really made friends with Python as a web scripting language. I use it for network, system, and database scripting, and I’ve done some web services stuff with it, but I haven’t been able to use it for things that have, say, a browser interface. Until the other night.

I got email that this guy who maintained a site was going to shut it down. This really annoyed me. Then I remembered that a new web host I’m using actually supports Python. Sure, it’s a really old, crusty version of Python, but Python nonetheless. The site being shut down was running ‘PlanetPlanet’, which is a feed aggregator website package. You tell it the URLs for all of the feeds that interest you, and it goes and grabs all of that content from the various feeds, formats it, and spits back something that looks pretty nice.

PlanetPlanet needs no database, and no Apache modules. I unzipped it, configured it, fed it the feed URLs, went to the site, and I was live. I got it running in under 10 minutes, and had templates in place and security accounted for in another 10 minutes. Very nice!!

I need a Google Apps Mashup

Google Docs is nice. Calendar is really nice. Gmail is ok, too. The notion that you can more or less use any of the tools without going too far is pretty nice, and they’ve opened things up with the API just enough to get some useful plugin capabilities, *and* there’s a Python client available for the Google Data API, which is nice (my experience with Google Spreadsheets notwithstanding). The problem now is that I would like something that goes beyond a simple plugin.

Outside of my day job doing infrastructure architecture and sysadmin work (with some development thrown in for good measure), I run Python Magazine. I have a ton of communication and deadlines to track in working for the magazine; I get several article proposals per week (sometimes per day), I’m working with contract people, other editors, technical folks on the back end of things, layout folks, the people writing the checks and managing invoices, and whoever I need to talk to for business development tasks. I send emails to a great number of people every day, just for the magazine.

I use Gmail for my Python Magazine mail (my pythonmagazine.com addy is forwarded to Gmail). Gmail also lets me send mail using my pythonmagazine.com email address (otherwise, this would not be a usable solution for me).

I use Google Calendar to track deadlines. Each article deadline is a full day event in Google Calendar. I’d also *like* to use Google Calendar as something of a logging tool to track out-of-band conversations I have with people on IRC or (gasp!) in person.

The reason I haven’t gone this route yet is because there’s no interface where I can, say, search for a person’s name, and get a nice list of the things related to that person, grabbed from GMail *and* Google Calendar (not that you’d need to stop integration efforts at those two services – they’re just the two most useful to *me* right now).

For my purposes, it would even be OK if Google just added an “include calendar results” option to the GMail search interface. That would give me a list, ordered by date, of conversations via email, perhaps GTalk, out-of-band events logged with Calendar, and deadlines, also tracked via Google Calendar. It could essentially be a time line of my working relationship with a person, which can be very useful.

It might even be useful to get a time line of events and conversations related to a specific topic, rather than a specific person. If I could do a search for “contract request”, this hypothetical interface would actually spit out a time line showing all of the interactions between me and our contract person specifically in relation to commissioning articles, because I use the term “contract request” in the subject of all contract requests, and would naturally carry that consistency into notes I might take about contract requests in Google Calendar or other apps.
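
To make the idea a little more concrete, here's a minimal sketch of the merge-and-sort step, assuming the mail and calendar items have already been fetched somehow. Everything in it is hypothetical: the records are invented samples, the field names ("when", "source", "who", "summary") are just what I picked for the sketch, and `timeline` is not a function from any Google library.

```python
from datetime import date

# Invented sample records standing in for results fetched from two services.
# The field names ("when", "source", "who", "summary") are assumptions made
# for this sketch, not anything the Google APIs return.
emails = [
    {"when": date(2007, 10, 3), "source": "mail", "who": "Pat Example",
     "summary": "contract request: article on decorators"},
    {"when": date(2007, 10, 9), "source": "mail", "who": "Lee Sample",
     "summary": "proposal: packaging tips"},
]
calendar_events = [
    {"when": date(2007, 10, 5), "source": "calendar", "who": "Pat Example",
     "summary": "IRC chat about contract request status"},
    {"when": date(2007, 10, 20), "source": "calendar", "who": "Lee Sample",
     "summary": "article deadline"},
]


def timeline(term, *sources):
    """Merge records from every source, keep those mentioning `term` in the
    person or summary field (case-insensitive), and sort oldest first."""
    needle = term.lower()
    matches = [
        record
        for source in sources
        for record in source
        if needle in record["who"].lower() or needle in record["summary"].lower()
    ]
    return sorted(matches, key=lambda record: record["when"])


for item in timeline("contract request", emails, calendar_events):
    print("%s  [%s]  %s" % (item["when"], item["source"], item["summary"]))
```

The same call works for a person's name instead of a topic, which is the point: one search, one date-ordered list, no matter which service each item came from.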

Well, that’s my latest idea. I’m not sure what form the app would take. Ideally it would be a web page I can get to from anywhere, but I have yet to do anything significant with Python as a web scripting language (though I’ve rewritten a whole lot of old Perl code in Python for sysadmin-ish stuff). A fat client application would inevitably *not* be useful to a whole lot of folks… I dunno. Thoughts hereby solicited on that.

Let me know if something already exists that does this, or if I just wrote this whole post for nothing because Google already does this somehow. I don’t think it does. It seems to treat applications as separate entities, and the same account using different apps as different entities as well. There needs to be a higher level vision of the user as a single object across all of the applications in order to get at the kinds of interesting uses of data that are possible and would add a lot of value to the individual services. My $.02.

“For the Community”

Sometimes people claim they’re doing things for the good of the community, but I guess that doesn’t necessarily mean they intend to involve the community in the effort :-/

A group of open source/free software users in New Jersey (where I currently reside) learned that the hard way when the maintainer of a web site that advertised it was “For the Free and Open Source Software Communities of New Jersey” posted a shut down notice.

The biggest slap in the face to the community the site was allegedly for was the text of the shutdown notice itself. For example:

“Maintaining GnuJersey has been mostly fun, but I want to prune the list of blogs I read daily, and I can’t do that while I maintain a website featuring some blogs I don’t want to read.”

So… this is a site “for the community” whose shut down notice contains 5 instances of the word “I” in the single sentence that is supposed to give us some clue as to why this is happening.

But wait! There’s more!

“[The site being taken down] is not up for transfer and I will not use DNS to point to a successor blog aggregator.”

Sweet. Not only is he not entertaining the idea of maintaining the site himself, he’s also eliminating the possibility that the site will be maintained “for the community”, by anyone, at *all*.

He does offer to link to a successor site, but insists that we get permission from the syndicated bloggers (and presumably, that we prove that we have said permission), if we expect him to link to us. So, we shouldn’t have any expectations of him to live up to his word and maintain the site “for the community”, *or* to let the community maintain the site for the community, but we should honor his request to prove that we have permission from all of the authors involved to put a successor site in place.

Rich, ain’t it?

Well, I’ve cloned the site here for whoever wants to continue to keep up with their friends and colleagues involved in open source and free software in New Jersey.

For my next pet project…

Stand back!

running install_egg_info
Writing /usr/lib/python2.5/site-packages/gdata.py-1.0.9.egg-info
brj@dawg:~/working/gdata.py-1.0.9$ python
Python 2.5.1 (r251:54863, May  2 2007, 16:27:44)
[GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import gdata
>>> print 'yay!'
yay!
>>>

No good can come of this! ;-P

Seriously, though – I really really strongly dislike spreadsheet interfaces. I hate resizing cells so I can see what’s going on, I hate cell selection, copy/pasting, and doing anything in those little cells. However, I really *need* to use one to handle some administrivia at Python Magazine, because it’s already being used by some back end processes/departments, and I don’t have time to write code and overhaul that whole process, and I don’t want to rock the boat anyway – what they have works – I just hate spreadsheets. It’s my problem, not theirs ;-)

The good news is they use Google Docs, and there’s a Google Data client library for Python. So I’m creating a command line interface to the spreadsheet :)

The Technology Behind Python Magazine

Hi all,

I mentioned to a buddy (who is also an editor) that we used subversion in our editorial process. He didn’t know what that was, and said that they used either this big nasty home grown system, or email attachments, to coordinate the editorial process.  He was incredibly curious about how we used subversion and what else we were using.

I started writing this kind of long email and then figured that others might be curious as well about the various technologies we use (or are moving to, etc) at Python Magazine, so here’s a quick list of tools we’re currently using:

  • Subversion – of course, I’ve mentioned this. We view every email attachment as a problem to be solved. Email is a communication tool. It is not a file transfer protocol (no, really – it isn’t), and it is certainly not a collaboration tool. We have a very simple directory hierarchy on the server representing the various stages in the editorial process, from the initial, original submission as received from the author, all the way to the final PDF rendering of the entire magazine, and all parts in between. The final review of the magazine even happens in SVN. We have a ‘corrections.txt’ file that we all add to as we review the PDF, and when that file is empty, the PDF is moved to the directory representing “go to press!”
  • Plain text – sometimes less is more. I’ve edited and authored using Word, OpenOffice, LaTeX, and a few other tools. In the end, plain text with extremely simple and minimalist formatting tags wins the day by a long shot. Authors aren’t forced to use any particular tool or platform to write their articles, editors don’t have to wonder which version authors have, which language setting they were using, etc. We don’t have to wonder if our version control system will handle a binary format properly, and the files are smaller. It’s also easier to run scripts against them to do things like strip formatting, or selectively apply it given a regex or something.
  • Google Calendar – We are notified the night before any article deadline, and the calendar is shared among the editors. Theoretically, the same calendar could be used to indicate that an editor is going to be unavailable or a tech reviewer is going to receive an article, but so far, it mainly reminds us of upcoming deadlines.
  • IRC/Google Talk – We actually don’t send very much email to each other. Sometimes we talk on IRC about emails we received or need to be added to the cc list of, etc. Almost everything we do involves either IRC or Google Talk. Of the 50 or so people on the authors mailing list for Python Magazine, at least 40 have gmail.com email addresses, and so do all of the editors here, so even some of the author/editor communication is email-free. In addition, the Python Magazine IRC channel is irc.freenode.net/#pymag, and you can talk to editors and authors there. The only email that gets sent is:
    • subversion server updates,
    • users who need to mail info at pythonmagazine to ask subscription questions,
    • authors sending to editors at pythonmagazine to submit article ideas (we don’t take them on irc),
    • replies to threads, usually initiated by one of the above actions.
  • PHP – yes, believe it or not, the main site is written in PHP. The publishing company (MTA) was originally formed around php|architect Magazine, which is a magazine about PHP. That was in 2002. Today, there are two language-based magazines. Some day there may be five language-based magazines. Certainly, we’re not going to maintain websites using 5 different languages! O’Reilly doesn’t do it, and they publish entire *books* on different languages (and platforms! and databases!). I was impressed by the Python community’s understanding in this matter. Lesser communities would’ve sent lots of hate mail.
  • Python – Doug Hellmann (our tech editor) and I (to a lesser degree, because Doug is far better at it) write any little tools and scripts we need using Python. Sometimes I think about writing Python scripts just to make Doug laugh. Don’t forget, I launched this magazine not because I professed any deep knowledge of Python. On the contrary – it was because I figured there were neophytes like myself who would like to know more, and advanced coders who would like to look into areas of Python outside their immediate area of expertise within the language.
  • Adobe InDesign – InDesign is the main layout tool. Layout is like some spooky ethereal realm to me. I imagine other tools are used during the layout/design process, but I don’t honestly know what they are. I’m sure the layout team prefer it that way. It’s probably better if I just say “I’d rather see the title moved up and to the right” than to start trying to tell them how to use their tools.
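
As an illustration of the plain-text point above, here's the sort of throwaway script I mean, sketched under heavy assumptions. The tag syntax ([b], [i], [code]) is hypothetical, as are the function names `strip_formatting` and `mark_as_code`; the actual formatting tags we use aren't described in this post.

```python
import re

# Hypothetical markup: [b]bold[/b], [i]italic[/i], [code]inline code[/code].
# The magazine's real tag syntax is not described in this post.
TAG = re.compile(r"\[/?(?:b|i|code)\]")


def strip_formatting(text):
    """Remove every known formatting tag, leaving the words untouched."""
    return TAG.sub("", text)


def mark_as_code(text, pattern):
    """Wrap each match of `pattern` in [code]...[/code] tags."""
    return re.sub(pattern, lambda m: "[code]%s[/code]" % m.group(0), text)


draft = "Use [b]enumerate()[/b] when you need the [i]index[/i] too."
print(strip_formatting(draft))
print(mark_as_code("Call os.walk to list files.", r"\bos\.\w+"))
```

Neither step would be this short against a binary word processor format, which is a large part of why plain text wins for us.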

Those are the tools I can think of off the top of my head aside from back end things like a relatively standard LAMP stack that runs the web sites, and which I also don’t have much of a role in maintaining. Of course, there’s also one big element of all of this technology that blows them all away: the people. Every single person is technical in some way. Me, the layout folks, all of the editors for the whole company that I’m aware of, and even our fearless leader are all technical people. Technology is a common thread that runs through the entire organization, and ties all of us together. It makes an enormous difference, and I’m proud to be a part of the team.

With Great Funding Comes Great Responsibility

For the past ~6 weeks, I’ve been talking to people, getting buy-in, educating users and administrators, and generating copious amounts of project proposal and Six Sigma documentation presenting VMware Infrastructure as an infrastructure building and management tool.

There’s a whole manifesto behind this, but I’ll try to boil it down. Basically, this client has three sites, and the infrastructure needs to be consistent at all three sites. Also, ideally it would be overseen and generally managed from one site (there are obvious limits to this, but you get the idea). My thinking is that I have three choices:

  1. Order/rack/setup/test hardware and software, stage system, install stuff, ship to site, where someone else racks machine and turns it on.
  2. Assume and require that there is a senior enough admin at each site already to take care of all of that.
  3. Decouple the OS image from the hardware altogether and just build an infrastructure server “factory” at the main site, and ship (read, scp or similar) to the VMware servers at the other sites.

I chose option three – but this is an oversimplification and doesn’t go into all of the benefits.

So, I just found out today that my bill has made it through Congress, and my project now has legs (read: funding)!! When the project is complete (the first phase is to migrate the main site using this methodology, and replication to other sites is a later phase of the project), I’ll try to give a talk on it or something.

In the meantime, if anyone has thoughts on virtualized infrastructure, or if you’re doing something cool with this technology, please post your comments. I value your insight!

Python Magazine Status Update

First, and most important, Python Magazine’s premiere issue has been unleashed!

I love this business. Doing all of the negotiations, the communications with authors, coordinating with layout and contract people, web administrators, tech editors, and the like, can get pretty chaotic. It’s sort of like managing…. a tornado. And, like a tornado, one second you’re not sure you’re going to make it, and the next everything is just fine. Finally, at some time after midnight yesterday, I looked up at my TODO list and realized that Python Magazine’s premiere issue was complete.

I didn’t know what to do. “Do I just… go to bed or something?” When you do what we at MTA like to call “marathon editing”, the moment you stop you get this weird sensation – like the one you get if you go roller-blading for two hours and then take a walk immediately afterward. Your body has to get used to *not* roller-blading.

About the First Issue

So, the biggest news associated with this first issue is that it is completely, 100% free. That’s right! You can go download the PDF at will, sans payment of any kind. This is big news for a couple of reasons.

First, the magazine costs money to produce. We pay our authors very competitively, and there are also editors, layout/design folk, and other people involved in the production. The business plan for producing a magazine where people get paid for what they do involves selling that magazine to (at least!) cover the costs. The idea that we’re giving the first issue away for free is, if nothing else, a testament to our commitment to and confidence in the product, as well as our stability.

Second, giving away the first issue means anyone can get their hands on it, read it, share it, print it, leave it in the coffee area at work, and do pretty much whatever they want with it. This means more people will see it, and see what we’re going after, and give us feedback so we can make it better.

Third, of course, we hope it means more subscribers, so that we can have the support we need to make the magazine better, and to give more back to the community by providing more services and whatever else we can.

Participate!!

Please send feedback! I want feedback! We all want feedback! Send it to editors at pythonmagazine dot com

Also, write for us! Never wrote before? Don’t know where to start? No problem – drop us a line (editors at pythonmagazine.com) and let us know what your thoughts are, what you’re doing with Python, how it’s helped you, etc. We’re happy to help you develop the article idea, and get it ready for publication. If you’re intimidated by writing – don’t be. There are lots of authors who have already submitted articles to us who are doing really cool things with Python, have never written for a publication before, and their articles are being published – and they’re great articles!

Anyway, go get the first issue. Let me know your thoughts. :-)