GoodReads Interface Design Notes

Grunthos edited this page Jan 4, 2012 · 6 revisions

##Initial Features

a.k.a. Woo-Hoo we get to write a batch system!

###Import from GoodReads

This is pretty straightforward. There are few fields that BC uses that are not provided by GR; we can simply request a dump of all of the user's books and load our database and shelves.

Missing items will/may include:

  • covers (GR does not have as many available via the API as Amazon due to licensing problems)
  • prices (probably could be obtained, but not guaranteed).

The key issue here will be speed. GR returns about 3 KB of text per book (averaged over a random sample of 20 books), so loading a large library will need to be done in the background.

###Export to GoodReads

This is pretty straight-forward. The only data that can be exported is:

  • book ISBN & shelf data
  • personal comments, rating & review
  • 'have read'

The key issue here will be speed. Sending a new book to GR requires several API calls: one to look up the GR book ID from the ISBN, one per shelf the book is on, and one to update review/rating details.

The GR API terms require a limit of one update per second, so 1000 books, all on only one shelf, will require 6000 seconds to update, ignoring all web processing times, network failures or latency.
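
A minimal sketch of how the one-call-per-second limit could be enforced on the client side. The `Throttle` class and its injectable `clock` and `sleep` parameters are hypothetical names for illustration; they are not part of BC or the GR API.

```python
import time


class Throttle:
    """Ensure successive calls are at least `min_interval` seconds apart."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last_call = None   # time of the previous call, if any

    def wait(self):
        """Block until a new API call is allowed, then record its time."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now
```

Calling `wait()` immediately before every GR request would keep the lookup, shelf and review calls for each book spaced out, whichever task issues them.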

###Sync with GoodReads

  • Ideally this would be automatic, but that cannot really be done
  • Has to be done semi-manually or as a 'reload'.

Current thinking is:

##Background Processing

Given that all of the major GR tasks will be lengthy, they all need to run as background tasks. My suggestion here is either a service or just a thread. The service has the huge advantage of not stopping when the app is quit, or so it seems.

The idea would be to add tables to the database with pending tasks and related info.

###Tasks

  • Export All to GR
  • Export 1 to GR
  • Import all from GR
  • Import 1 from GR
  • Sync with GR????

####Export 1 to GR

Data stored:

  • ID
  • Date of request
  • book ID to process
  • user_intervention_required flag indicating 'needs user intervention'
  • error description
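
As a rough sketch, the fields above could map onto a SQLite table like the one below. The table name `gr_export_one_tasks`, the column names and the helper functions are all assumptions made for illustration, not an agreed schema.

```python
import sqlite3

# Hypothetical schema: one row per pending 'Export 1 to GR' request.
SCHEMA = """
CREATE TABLE IF NOT EXISTS gr_export_one_tasks (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    requested_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    book_id INTEGER NOT NULL,
    user_intervention_required INTEGER NOT NULL DEFAULT 0,
    error_description TEXT
);
"""


def open_queue(path=":memory:"):
    """Open (or create) the task database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def enqueue_export(conn, book_id):
    """Add a pending export request and return its task ID."""
    cur = conn.execute(
        "INSERT INTO gr_export_one_tasks (book_id) VALUES (?)", (book_id,)
    )
    conn.commit()
    return cur.lastrowid


def next_task(conn):
    """Return (task_id, book_id) for the oldest unflagged task, or None."""
    return conn.execute(
        "SELECT id, book_id FROM gr_export_one_tasks "
        "WHERE user_intervention_required = 0 ORDER BY id LIMIT 1"
    ).fetchone()


def flag_task(conn, task_id, error):
    """Mark a task as needing user intervention and store the error text."""
    conn.execute(
        "UPDATE gr_export_one_tasks "
        "SET user_intervention_required = 1, error_description = ? WHERE id = ?",
        (error, task_id),
    )
    conn.commit()
```

Keeping flagged rows in the table, rather than deleting them, lets the app list them for the user later.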

The background task will:

  • find a task with flag NOT set
  • process book (takes several seconds)
  • on success, delete record
  • on book-not-found error, set flag, inform user (perhaps with option to go to a search results dialogue based on author/title?), and loop
  • on network or other error, save the last error and inform the user (if the app is running; otherwise inform them when the app is next started), perform a back-off wait (1, 5, 10, 20, 60 mins, 2 hr, 4 hr, 8 hr, 24 hr) and loop. After waiting 24 hours, don't retry; let the user decide
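
The back-off schedule in the last step could be expressed as a simple lookup. This is only a sketch: the function name and the convention of returning `None` once the schedule is exhausted are assumptions.

```python
# Waits from the schedule above, in minutes: 1, 5, 10, 20, 60 mins, 2, 4, 8, 24 hr.
BACKOFF_MINUTES = [1, 5, 10, 20, 60, 120, 240, 480, 1440]


def next_backoff_minutes(failures):
    """Return the wait before the next retry, given the consecutive failure count.

    `failures` is 1 after the first failure. Returns None once the 24-hour
    wait has already been used, meaning: stop retrying and ask the user.
    """
    if failures < 1:
        raise ValueError("failures must be at least 1")
    if failures > len(BACKOFF_MINUTES):
        return None
    return BACKOFF_MINUTES[failures - 1]
```

The failure count would need to be stored with the task (or reset on success) so that the schedule survives the app being restarted.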

Some books will fail because they are not in GR; the user needs to manually add them using the GR web interface and then retry the export.

####Export All to GR

Data stored:

  • ID
  • Date of request
  • Last book ID processed (NULL/0 if not started)
  • user_intervention_required flag indicating 'needs user intervention'
  • error description

The background task will:

  • find first export job with flag not set
  • get last book ID processed
  • determine next id in ascending ID order; if none, delete record and exit
  • process book (takes several seconds)
  • on success, update last id processed and loop
  • on book-not-found error, save specific book as an 'Export 1 to GR' task with 'user_intervention_required' flag set, inform user via Notifications (perhaps with option to go to a search results dialogue based on author/title?), update last ID and loop
  • on network or other error, save the last error and inform the user via Notifications, perform a back-off wait (1, 5, 10, 20, 60 mins, 2 hr, 4 hr, 8 hr, 24 hr) and loop. After waiting 24 hours, don't retry; mark the record as 'user_intervention_required' and let the user decide
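
The resume-from-last-ID loop above might look roughly like this. It is a sketch under stated assumptions: `export_book` is a hypothetical callable that performs the GR calls for one book, `BookNotFound` is a hypothetical exception for the book-not-found case, and `OSError` stands in for network errors.

```python
class BookNotFound(Exception):
    """Hypothetical error raised when GR has no record of the book."""


def export_all_step(book_ids, last_id, export_book):
    """Export every book with an ID above `last_id`, in ascending ID order.

    Returns (new_last_id, not_found_ids, error). `error` is None when the
    pass finished, or the error text when a network error interrupted it.
    """
    not_found = []
    for book_id in sorted(b for b in book_ids if b > (last_id or 0)):
        try:
            export_book(book_id)
        except BookNotFound:
            # Would be saved as an 'Export 1 to GR' task with the flag set.
            not_found.append(book_id)
        except OSError as exc:
            # Leave last_id untouched so this book is retried after back-off.
            return last_id, not_found, str(exc)
        last_id = book_id
    return last_id, not_found, None
```

Because the last processed ID is only advanced after a book succeeds or is handed off as not found, an interrupted run can pick up from the stored ID without re-sending earlier books.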

Some books will fail because they are not in GR; the user needs to manually add them using the GR web interface and then retry the export.

####Import 1 from GR

This would be fired by the user from a book edit page.

Task Data stored:

  • task ID
  • Date of request
  • Book ID to process
  • user_intervention_required flag indicating 'needs user intervention'
  • error description

The background task will:

  • if the GR ID is stored on the book, use it
  • if the GR ID is not stored on the book record, use the ISBN/Author/Title to find the book.
  • if GR ID can not be determined, mark the record as 'user_intervention_required'
  • retrieve the GR data
  • update relevant book fields and shelves
  • queue an 'Import Cover from GR' task for the book
  • on network error, perform a back-off wait
  • on other error, mark the task as 'user_intervention_required'
  • on success, delete the record
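
The ID-resolution steps at the top of this list could be sketched as below. The dictionary keys and the two lookup callables are hypothetical, and trying the ISBN before author/title is an assumed ordering rather than something settled above.

```python
def resolve_gr_id(book, find_by_isbn, find_by_author_title):
    """Return the GR ID for a book, or None if the user needs to step in.

    `book` is a dict with optional keys 'gr_id', 'isbn', 'author', 'title'.
    `find_by_isbn` and `find_by_author_title` are hypothetical lookups that
    return a GR ID or None.
    """
    # Prefer an ID already stored on the book record.
    if book.get("gr_id"):
        return book["gr_id"]
    # Otherwise try the ISBN first (assumed to be the more precise search).
    if book.get("isbn"):
        gr_id = find_by_isbn(book["isbn"])
        if gr_id:
            return gr_id
    # Fall back to an author/title search.
    if book.get("author") and book.get("title"):
        return find_by_author_title(book["author"], book["title"])
    return None
```

A `None` result corresponds to marking the task as 'user_intervention_required'.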

This task may be fast enough to complete synchronously; alternatively, we could give the user the option.

####Import All from GR

This would be fired by the user from an admin page (and perhaps after initial install), or even from the new 'Front Page'.

GoodReads does not allow retrieving books in 'ID'-based order specifically, but review.list does have a 'sort=position' parameter, which one can hope retrieves books in a reliable order. We can retrieve books in pages of at most 200, and can retrieve a specific page, so the page size should be determined by the number we can process in the 1 second we have to leave between queries.

Since we also have to retrieve book covers, there is little advantage in retrieving more than one at a time.

An optimization might be to retrieve the book details first then retrieve the covers later.
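
A rough sketch of a resumable page-by-page import, not a filled-in design. It assumes a hypothetical `fetch_page(page, per_page)` wrapper around review.list, that pages are numbered from 1, and that an empty page means the end of the library; none of these are confirmed here. Throttling and cover retrieval are left out.

```python
def import_all(fetch_page, save_book, last_page=0, per_page=20):
    """Import books page by page, resuming after `last_page`.

    `fetch_page(page, per_page)` is a hypothetical callable returning a list
    of books for that page; `save_book(book)` stores one book locally.
    Returns the number of the last page that was fully processed.
    """
    if not 1 <= per_page <= 200:
        raise ValueError("per_page must be between 1 and 200")
    page = last_page + 1
    while True:
        books = fetch_page(page, per_page)
        if not books:
            # Assumed end-of-data signal: an empty page.
            return last_page
        for book in books:
            save_book(book)
        # Only advance once the whole page has been saved.
        last_page = page
        page += 1
```

Storing the returned page number after each page would let an interrupted import restart from the next page, provided the sort order really is stable between requests.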

Task Data stored:

  • task ID
  • Date of request
  • Last page processed (0/null if none)
  • user_intervention_required flag indicating 'needs user intervention'
  • error description

The task will:

####Import Cover from GR

This could be fired by the user from the book details as well as from the other GR import tasks.

Task Data stored:

  • task ID
  • Date of request
  • BC ID of book
  • GR ID of book
  • URL of image (if available)
  • user_intervention_required flag indicating 'needs user intervention'
  • error description

The task will:

  • Find next record with user_intervention_required NOT set
  • if the image URL is not specified, then get book details from GR to get URL
  • if the URL is a GR 'nocover' URL, queue an 'Import Cover' task
  • retrieve the image
  • on network error, perform a back-off wait
  • on other error, mark the task as 'user_intervention_required'
  • on success, delete the task
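
The URL handling in the steps above could be sketched as a small decision function. The `fetch_gr_image_url` callable is hypothetical, and detecting a placeholder by looking for the substring "nocover" in the URL is an assumption about how GR names those images. Falling back when no URL is returned at all is also an assumption.

```python
def plan_cover_import(task, fetch_gr_image_url):
    """Decide what to do for one 'Import Cover from GR' task.

    `task` is a dict with 'gr_id' and an optional 'image_url'.
    Returns ('download', url) or ('queue_import_cover', None).
    """
    # Use the stored URL if there is one; otherwise ask GR for the book details.
    url = task.get("image_url") or fetch_gr_image_url(task["gr_id"])
    # A missing or placeholder image means falling back to the generic task.
    if not url or "nocover" in url:
        return ("queue_import_cover", None)
    return ("download", url)
```

Separating the decision from the download keeps the network call for the image itself in one place, where the back-off handling can wrap it.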

####Import Cover

This could be fired by the user from the book details as well as from the other GR import tasks.

Task Data stored:

  • task ID
  • Date of request
  • BC ID of book
  • URL of image (if available)
  • user_intervention_required flag indicating 'needs user intervention'
  • error description

The task will:

  • Find next record with user_intervention_required NOT set
  • get an image URL from Amazon/Google/LT
  • retrieve the image
  • on network error, perform a back-off wait
  • on other error, mark the task as 'user_intervention_required'
  • on success, delete the task