I've been working on a tool to parse references and find existing identifiers. The tool is at (for more on my bioGUID project see the blog). Basically, you paste in one or more references, and it tries to figure out what they are, using ParaTools and CrossRef's OpenURL resolver.

For example, if you paste in this reference:

*A review of the spider genera Pardosa and Acantholycosa (Araneae, Lycosidae) of the 48 contiguous United States.*

the service tells you that there is a DOI (doi:10.1636/H03-8).

OK, but what if there is no DOI? Every issue of the Journal of Arachnology is online, but only issues from 2000 onwards have DOIs (hosted by my favourite DOI breaker, BioOne). What I've done is add an OpenURL service to bioGUID. If you send it a DOI, it simply redirects you to dx.doi.org to resolve it. But I've started to expand it to handle papers that I know have no DOI.

I used SiteSucker to pull all the HTML files listing the PDFs from the journal's web site. Then I ran a Perl script that read each HTML file and pulled out the links. They weren't terribly consistent (there are at least five or six different ways the links are written), but they are consistent enough to parse. What is especially nice is that the URLs include information on volume and starting page number, which greatly simplifies my task.

So, this gives me a list of over 1000 papers, each with a URL, and for each paper I have the journal, year, volume, and starting page. These four things are enough for me to uniquely identify the article. I then store all this information in a MySQL database, and when a user clicks on the OpenURL link in the list of results from the reference parser, if the journal is the Journal of Arachnology, you go straight to the PDF.

Yeah, but what else can we do with this? Well, for one thing, you can use the bioGUID OpenURL service in Connotea. On the Advanced settings page you can set an OpenURL resolver. By default I use CrossRef, but if you put "" as the Resolver URL, you will be able to get full text for the Journal of Arachnology (providing that you've entered sufficient bibliographic details when saving the reference).

But I think the next step is to have a GUID for each paper, and in the absence of a DOI I'm in favour of SICIs (see my bookmarks for some background). If this were a resolvable identifier, then we would have unique, stable identifiers for Journal of Arachnology papers that resolve to PDFs.
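For reference, an incoming request to such a resolver would typically carry exactly those four fields as standard OpenURL 0.1 keys (`genre`, `title` for the journal title, `date`, `volume`, `spage`). The base below is a placeholder, since the post's actual Resolver URL did not survive extraction:

```
<resolver-base>/openurl?genre=article&title=Journal+of+Arachnology&date=2004&volume=32&spage=101
```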
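The link-extraction step described above (SiteSucker dump, then a script that pulls out PDF links carrying volume and starting page) can be sketched as follows. This is Python rather than the Perl actually used, and the URL pattern is invented for illustration; the post notes the real links come in five or six variants.

```python
import re

# Hypothetical PDF link pattern. The real links come in several variants;
# this sketch handles just one invented form in which the volume and
# starting page are embedded in the filename.
LINK_RE = re.compile(r'href="([^"]*v(\d+)[_-]?p(\d+)\.pdf)"', re.IGNORECASE)

def extract_papers(html):
    """Return url/volume/start-page records found in one listing page."""
    return [
        {"url": url, "volume": int(vol), "page": int(page)}
        for url, vol, page in LINK_RE.findall(html)
    ]

sample = '<a href="pdfs/JoA_v32_p101.pdf">Pardosa and Acantholycosa review</a>'
print(extract_papers(sample))
# [{'url': 'pdfs/JoA_v32_p101.pdf', 'volume': 32, 'page': 101}]
```

Running this over every saved HTML file and accumulating the results would give the list-of-papers table the post describes.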
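The lookup behind the OpenURL service, matching the four identifying fields (journal, year, volume, starting page) against stored records and returning the PDF, might look like this sketch. A plain dict stands in for the MySQL table, and the record and URL are invented for illustration.

```python
# A dict stands in for the MySQL table described in the post; the key
# mirrors the four fields said to uniquely identify an article.
# The single record below is invented for illustration.
PAPERS = {
    ("Journal of Arachnology", 2004, 32, 101):
        "http://example.org/pdfs/JoA_v32_p101.pdf",
}

def resolve(journal, year, volume, spage):
    """Return the PDF URL for a known article, or None.

    The real service would instead fall back to dx.doi.org / CrossRef
    when the article is not in the table.
    """
    return PAPERS.get((journal, year, volume, spage))

print(resolve("Journal of Arachnology", 2004, 32, 101))
# http://example.org/pdfs/JoA_v32_p101.pdf
```

A miss (an unknown journal, or a pre-digitisation year) simply returns `None`, which is where the DOI redirect path would take over.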