Fulltext handling
Status: idea
Proposal: provide a separate module (let's call it Library) for the actual fulltext management, able to operate on a filesystem or on any other, more specific kind of repository, and add to pyblio an attribute that identifies the document unambiguously (say, a secure hash).
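As an illustration of the "secure hash" attribute, here is a minimal sketch that derives a key from a file's content. The function name document_key and the choice of SHA-256 are assumptions made for this example, not part of the proposal.

```python
import hashlib


def document_key(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file's content.

    Hypothetical helper: the proposal only says "a secure hash",
    SHA-256 is one possible choice.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so that large documents need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Because the key depends only on the content, it stays the same when the file is renamed or moved, but changes with every new version of the document.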
For local files, this module should be able to keep track of where the files are located (i.e., it is not a black box into which you put your files, but rather an overlay that observes the files on disk).
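One way to picture such an overlay is an index from content hash to last known path, refreshed by rescanning a directory. The sketch below is only an assumption about how this could work: the class LocalOverlay, its scan method, and the single-root restriction are all hypothetical.

```python
import hashlib
import os


class LocalOverlay:
    """Hypothetical index: content hash -> last known path on disk."""

    def __init__(self):
        self.locations = {}

    @staticmethod
    def _key(path):
        # Whole-file read keeps the sketch short; chunked reads would
        # be preferable for large documents.
        with open(path, "rb") as handle:
            return hashlib.sha256(handle.read()).hexdigest()

    def scan(self, root):
        """Walk root and report added, moved and missing documents.

        Assumes a single root directory. If two files have identical
        content, only the one visited last is remembered.
        """
        seen = {}
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                seen[self._key(path)] = path
        added = [k for k in seen if k not in self.locations]
        moved = {k: (self.locations[k], p) for k, p in seen.items()
                 if k in self.locations and self.locations[k] != p}
        missing = [k for k in self.locations if k not in seen]
        # Missing documents keep their last known path in the index.
        self.locations.update(seen)
        return {"added": added, "moved": moved, "missing": missing}
```

With this approach a renamed file shows up under "moved" on the next scan, since its hash is unchanged; the files themselves are never touched.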
For remote files (basically, URLs), the module is in charge of possibly keeping local copies, checking for updates, notifying the user of access errors, and so on.
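The remote side could look roughly like the following sketch, which keeps a local copy per URL and classifies each refresh. The class RemoteCache, its status strings, and the injected fetch callable are assumptions made for illustration; update detection by comparing hashes is one option among several.

```python
import hashlib


class RemoteCache:
    """Hypothetical cache of remote resources, keyed by URL."""

    def __init__(self, fetch):
        # fetch is any callable taking a URL and returning bytes. It is
        # injected so that this sketch does not fix the transport.
        self.fetch = fetch
        self.copies = {}  # url -> (sha256 hex digest, content)

    def refresh(self, url):
        """Return 'new', 'updated', 'unchanged' or 'error' for url."""
        try:
            content = self.fetch(url)
        except OSError:
            # urllib.error.URLError is a subclass of OSError, so access
            # errors from urllib end up here. The old copy is kept.
            return "error"
        digest = hashlib.sha256(content).hexdigest()
        previous = self.copies.get(url)
        self.copies[url] = (digest, content)
        if previous is None:
            return "new"
        return "unchanged" if previous[0] == digest else "updated"
```

With the standard library, the cache could be created as RemoteCache(lambda url: urllib.request.urlopen(url).read()). An 'error' result is the point where the user would be notified.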
Things to consider:
- renaming
- versioning
- archiving strategies
- handling of subsets (chapter of a book) / supersets (all the articles in a journal)
The API should make it possible to ingest (batch-register) a set of local files, to fetch URLs, and to view a resource.
Use cases
- find-to-view: search for an item, then display it
- fetch-and-register: access a resource, then register it
- ingest-a-directory: given a set of files, register each of them in turn
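To make the use cases concrete, here is a rough in-memory sketch of what the Library interface might look like. Every name in it (Library, register, ingest, fetch, view, and the fetcher and viewer callables) is hypothetical; searching is assumed to stay in pyblio, which would hand the stored key to view.

```python
import hashlib


class Library:
    """Hypothetical in-memory stand-in for the proposed module."""

    def __init__(self, fetcher=None, viewer=print):
        self.fetcher = fetcher  # callable: url -> bytes
        self.viewer = viewer    # callable: location -> None
        self.resources = {}     # content hash -> location (path or URL)

    def register(self, location, content):
        """Record a resource and return the key to store in pyblio."""
        key = hashlib.sha256(content).hexdigest()
        self.resources[key] = location
        return key

    def ingest(self, paths):
        """ingest-a-directory: register each local file in turn."""
        keys = []
        for path in paths:
            with open(path, "rb") as handle:
                keys.append(self.register(path, handle.read()))
        return keys

    def fetch(self, url):
        """fetch-and-register: access a URL, then register it."""
        return self.register(url, self.fetcher(url))

    def view(self, key):
        """find-to-view: display the resource behind a known key."""
        self.viewer(self.resources[key])
```

The viewer defaults to print only to keep the sketch runnable; a real implementation would presumably launch an external viewer for the resource.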
