Bookie meets Google Summer of Code 2014

Today the Google Summer of Code student selections were announced, and with that announcement Bookie revealed our selections for the slots allocated for each of our two mentors. This announcement highlights an amazing round of participation in Bookie as an open source project. Twenty people participated and landed over 110 commits worth of patches in Bookie since the opening of GSoC. That is AMAZING! In less than a week every bite-sized bug evaporated from the issue tracker. Also amazing is the quality and effort that everyone put into their work. Everyone was eager to learn how to add tests to their patches, and they worked so hard to get their code landed. Bookie emerges a better open source tool for managing bookmarks than it was 2 months ago, and that is because of the hard work and dedication of all of the participating students.

Students did more than land branches; they invigorated the community. We had many users jump into IRC to answer questions and guide students through the process. They also performed QA and did code reviews of their work. The enthusiasm the students brought to Bookie motivated me to make the time to help move things forward. After all: if a student spent 3 days figuring out how to fix a bug, write a test for it, commit the fixes to git, and get it up for code review; then I can manage to find the 30 minutes to pull the commit, review the code, and QA the work. This period motivated me to update documentation and ensure the install process worked for a wider audience. Additional motivation came from knowing that Bookie is interesting as a tool to other people besides myself.

I want every student not selected for Google Summer of Code to know their work and effort is greatly appreciated. I and the other members of the Bookie community enjoyed working with everyone who participated. Bookie had 32 applications for 2 available spots. In conversations with other organizations Bookie had a comparatively crazy amount of competition for few allocated spaces. I wish Bookie had a dozen or so more mentors and slots as over half of Bookie's proposals would have easily been accepted. Culling the dozens of great proposals into two positions was a very difficult process for us. It's hard to say "not right now" when there's so many great offers by so many eager and capable students.

Regardless of whether you were selected for Google Summer of Code the fun doesn't have to end. If you found the time contributing to Bookie valuable; if you learned something new, gained some material for that resume, or just had a good time: PLEASE DON'T STOP! Bookie isn't going anywhere or closing up shop; we're more than happy to continue mentoring and working with you all. We worked hard during this process to ensure all students were given the best chance to take something positive away from this application process. With your continued participation in the Bookie project we'd like to continue to mentor and provide guidance for you.

One area of guidance we owe all students relates to your proposal. Should you want any explanation of what you could do differently with your proposal / application please let us know. I'll be honest though: most of the applications we received were quite good, so there's little to critique. The scoring method we used put most of the applications within a few points of each other. But if you'd like to know more please feel free to ping me in irc and ask me anything you'd like.

Finally we'd like to congratulate Sambuddha and Pradyumna for their outstanding work leading up to this announcement, and we look forward to the results of their proposals for adding great features to Bookie over the summer. If you find the work interesting, please come help them out. Feel free to get involved, help with the work, the code reviews, and the testing of the new features. Maybe you'll be helping mentor Bookie next year? Who knows? :)

This was our first year participating in Google Summer of Code, but you can be assured it will not be the last.
We'd like to thank all of the students for flooding our channels and making this not only an amazingly crazy and busy time but also an immensely rewarding period in Bookie's history. You are all part of Bookie's history and we look forward to seeing you as part of Bookie's future. Thank you.

Bookie Sprint – Aug 31st

It’s time for another Bookie sprint!

When – Saturday August 31st

What time – Starts at 11am

Where – my house! Ping me for address/map info if you’re coming along. Map out to Clarkston, MI.

What will we be working on?

The goal is to work on test coverage and breadability article parsing. Are you new to application testing? Come out and learn while helping out an open source project.

If you want to participate online please join our irc channel #bookie on freenode.net. If there’s something else you’d rather work on then please let me know and I’ll be happy to do whatever I can to aid in participation.

Bookie 0.4: one week retrospective

Phew, that was a whirlwind of a week. Just over one week ago I finally released Bookie 0.4 and published the blog post to reddit as an announcement. This introduced signups and I was eager to see if there was real interest in the project now that users could sign up and try things out.

By the numbers

Traffic definitely came.

  • The blog post picked up 800 visits over the two days in the weekend.
  • https://bmark.us grabbed 360 unique new visitors.
  • We went from 58 to 126 activated user accounts.
  • Those users brought us to over 26,000 bookmarks stored in the site.

Complications

Of course, any swarm of new users finds the holes in the system and Bookie was no different. There were a few issues. First, the celery task that sends out emails on signup wasn’t running because the email config wasn’t setup right. This was a pretty quick fix. Next, the import system wasn’t filling out the path for uploaded files correctly. This one was another pretty easy fix, but I managed imports manually until I got the fix deployed.

The big thing was that, for probably the first time, all three moving parts to the system were trying to store bookmarks at once. The celery backend, the web UI, and a cron script that looks for new bookmarks without readable content and fetches it for storing. All of these hit the Whoosh fulltext index and caused locking issues that broke both imports and saving new bookmarks from the webui until I figured out the issue and just reset the fulltext index.

It was pretty bad timing as I could see users trying to add test bookmarks via the web interface. Google realtime analytics is pretty entrancing to watch. In the end I had to run to the Whoosh docs and change things up to use the async writer instead of the default locking mechanism. This got things running again, but the problem now is that I had to remove all the existing fulltext index. I’ve still got to finish a background job that will walk through all bookmarks and index them.

At some point I might need to remove the fulltext indexing from the current SqlAlchemy event hooks, but as purely background celery jobs that I can control from one place easier. This would remove the lock at all from the cron job and the web ui.

Disappointments

While I could see the charts showing traffic, it was tough because it was pretty invisible traffic. There were only three new users into the #bookie irc channel, and only a few people left comments in the reddit thread. No one left a comment on the blog post. Both my Twitter account and the Bookie accountgained fewer than 5 new followers. While the repository was starred many times, only two forks were created.

Going forward

There are a few new users active over the last week, and I’ve gotten a pair of pull requests. While the saving of new bookmarks was broken for a lot longer than I’d have liked, the site never went down. Imports were done in a semi-reasonable time frame. All of this felt pretty great and is encouraging for future work. I still need to finish fixing up the readable parsing. It’s the big selling point of Bookie, and the fact that fulltext search and readable parsed content for all bookmarks isn’t there is frustrating.

Here’s looking forward to great work and a more popular release announcement for Bookie 0.5.

Bookie 0.4 released into the wild!

Bookie is a Python based open source bookmark managing web application that includes content archiving, a Chrome extension, and much more.

Phew, that took a lot longer than expected. I’ve tagged Bookie 0.4 and the live site is updated to run it.

This brings a ton of work on getting an updated webui with some client side MVC, an API, Celery job running backend, some stats, and spin off projects such as breadability and a cli client.

The big thing is that signups are now there as well as a landing page. So hopefully this will spike up interest in new users checking out Bookie.

There are still a ton of long term ideas to work on with Bookie. I’d like to get a ‘reading’ view setup so that you can easily run through the bookmarks you’ve marked `toread`, especially in a mobile view. <3 my N7. I also want to work on getting suggestions for related bookmarks, suggested tags based on content, and other interesting machine learning type problems.

If you're the type that takes your bookmarks seriously give it a try. If you don't want to run your own instance, sign up to https://bmark.us and try it out there.

You can get an idea of the roadmap we're working off of on the Trello board.

Bookie weekly status report: May 6th 2012

This week was spent on a big side project. I’ve been trying like mad to update the python-readability library and take it over to help use it in the Bookie project space. After spending a ton of time trying to do just this I gave up.

I now present the breadability package. It’s a fresh port from the arc90 readability.js using the knowledge I’ve gained from all the other work and trying to stick to the JS file that’s the original inspiration.

I’ve got a bunch more work to do to add tests, get it in the build server, etc.

If you’ve been using one of the other dozen ports out there give this a shot. There’s work to be done, but I’d love to get some real work use in there, let me know what sites don’t work well, etc.

Weekly Status Report: April 29th

More hacking! Spent a big part of the week working on my Penguicon presentation so few commits.

Bookie Parser

  • Tweaked the readable view with some nice CSS, dark background, favicon support, etc. Much nicer to read article with it now.
  • Got the tests running on the TravisCI service.
  • Updated the API to fill out and support all the bits of data I need for this to replace my readable parsing on the main Bookie project.
  • Some refactoring and cleaning up duplicate code.

Bookie

The big thing here was to start up some JS to use the Bookie Parser api in order to load the readable content of a website as you’re bookmarking it from the edit page. In this way, users of the bookmarklet will have a better experience as they can now see their article, but it’s shown in cleaned up readable form. I need to clean it up and catch some edge/error cases, but it’s a start. Once it’s solid we can then use that content to store the page content and have immediate readable results instead of waiting for the next cron job to run in the background.

Bookie Weekly Update: April 22nd 2012

Another week, another few lines of code, and yay for two weeks in a row!

Bookie

Not a ton here, just some CSS updates and updating the backup script for pulling the INI correctly.

Bookie Parser

I spent some time cleaning up the CSS. I did some research on the most readable fonts for screens and surprisingly, it seems that sans serif wins on digital displays. So I updated the CSS and combined with some work on the Bookie main CSS files to make the readable pages a bit nicer. I’ve still got some more cleanup to do, but it reads a bit nicer now.

I also fixed the html generated to not have the empty body tag. It was due to the way the readable parsing library was giving me a full html document of content. See the updates over there for some bigger updates.

Finally, I added a form on the main page so you can try it out on a url just by entering it. So if you’re just curious what it does, go try it out!

Bookie Api

Just added a ping command. It should help make sure that the configuration is correct for new users. It’s also a nice start to a non-admin specific api command. A little bit of cleanup aside from that, but nothing major.

readability_lxml

Currently, Bookie uses a library called decruft for parsing html pages for the actual important article content. The bookie_parser project is using a different fork of that called readability_lxml. The author is a bit open to merging changes in and actually says she’s in ‘maintenance mode’. Since I kind of want a really decent library for this, it’s an important feature, I started hacking on it. In the process, this is where my week of hacking went.

First I updated it to allow me to get back only a partial html document vs an entire <html> doc. I then fixed some bugs, started cleaning up the code (adding tests, making the command line client all nice and argepare’y) etc. In the process I noticed that there’s a big branch in Github that adds a ton of things like multiple page document support and such. I’ve started to try to pull his branch into my work and the origin author’s code. It’s a LOT of git cherry-pick and really a pain since I want to clean up the code as I go. Unfortunately, this just means that Git gets confused on future merges since the code’s changed between commits. Ugh!

I’m about half way done though and I hope this will leave us with one solid library to do this parsing. I’m hoping to kind of take over stewardship of the library as I complete this work. It should hopefully make Bookie and bookie_parser all the more awesome.

The coming week

I’m giving a talk on the YUI JavaScript library at Penguicon. This means my
hacking time will be a bit less since I’ve got a presentation to prepare for. Next week’s status report might be a bit light and boring, but hey, maybe I’ll scrounge up some more beta users of Bookie while at the conference.