Xenbase Journal Image Scraping Utility
======================================
Written On: December 6 2009 
Written By: Chris Jarabek (cjjarabe@ucalgary.ca) 

This package includes all of the Java libraries and utilities needed to 
automatically download and manipulate images from a variety of academic journals.

IMPORTANT NOTE: If you plan on publicly reproducing ANY of the images produced by this utility
you must first obtain permission from the journal's publisher.  Permission is usually obtained
via e-mail request to the journal publisher.  A sample request letter Xenbase uses when seeking 
permission to re-publish images is included in permission.pdf in the same folder as this document.


==Design Document==
The design document for this utility can be found in the same directory as the
file you are reading under the name design.pdf.


==System Requirements==
Java 6 - Only a few lines of code use Java 6 conventions, and 
         they can be easily modified to make the utility run in a Java 5 environment.
         
Java IDE - This code has been packaged with the assumption that it will be modified. It is 
		 therefore uncompiled and will need to be imported into a Java IDE.  Eclipse is a
		 widely used IDE for Java development (http://www.eclipse.org) 

Site access to Journals - The scraper is currently designed under the assumption that it
		 is being run from a location that has access to the journal articles in question at a
		 domain level.  This means that if you are required to manually submit login credentials
		 in order to view a given article, this scraper will not be able to automatically scrape 
		 the images.
		 
		 
==Format Notes==
This utility is coded to work with JPEG, PNG and TIF image files.  No other image
file formats have been tested.  The class org.xenbase.scraper.runner.ImageUtilsTestSuite
is designed to perform a basic test of the image manipulation utilities with PNG and TIF files.
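
The sketch below is one way to check, before running the test suite, which of these
formats the local JVM can decode through the standard javax.imageio API.  The class name
ImageFormatCheck is hypothetical and is not part of this utility.  It is also an
assumption that the standard ImageIO readers are relevant here: the utility may handle
TIF files through its bundled libraries, in which case a missing "tif" reader in this
report does not mean the utility will fail.

```java
import javax.imageio.ImageIO;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of the Xenbase scraper itself.
class ImageFormatCheck {

    // Returns true if this JVM has an ImageIO reader registered for the given file suffix.
    static boolean canRead(String suffix) {
        Set<String> known = new HashSet<String>();
        for (String s : ImageIO.getReaderFileSuffixes()) {
            known.add(s.toLowerCase(Locale.ROOT));
        }
        return known.contains(suffix.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // The three suffixes correspond to the formats named in this section.
        for (String suffix : new String[] {"jpg", "png", "tif"}) {
            String status = canRead(suffix) ? "reader available" : "no reader found";
            System.out.println(suffix + ": " + status);
        }
    }
}
```

The result for "tif" depends on the Java version and on any imaging plug-ins installed
on the classpath.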
		 
		 
==Running the Scraper==
The URLs from PubMed listed under Sample URLs can be used to test the image scraper.
Run the ScraperStubRunner class with two parameters (this is most easily done in the Eclipse IDE): 

1) A sample URL (or another PubMed URL for one of the supported journals).
2) The ID number that identifies the journal type (see Journal Types).
	
	
==Sample URLs==
Current Biology: http://linkinghub.elsevier.com/retrieve/pii/S0960-9822(09)01488-2
Developmental Dynamics: http://dx.doi.org/10.1002/dvdy.22099
Development: http://dev.biologists.org/cgi/pmidlookup?view=long&pmid=19906860
Mechanisms of Development: http://linkinghub.elsevier.com/retrieve/pii/S0925-4773(06)00096-7
Proceedings of the National Academy of Sciences: http://www.pnas.org/cgi/pmidlookup?view=long&pmid=19805045
Journal of Cell Biology: http://www.jcb.org/cgi/pmidlookup?view=long&pmid=19289795

==Journal Types==
1: Current Biology, Developmental Cell, Cell
2: Developmental Dynamics
3: Development
4: Mechanisms of Development, Developmental Biology
5: Proceedings of the National Academy of Sciences
6: Journal of Cell Biology
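
The table above can be mirrored in code to validate the two parameters before they are
passed to the scraper.  This is a minimal sketch under stated assumptions: the class
JournalTypes and its methods are hypothetical and are not part of this utility, and the
parameter order (URL first, then ID) is inferred from the numbered list under
Running the Scraper.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the Xenbase scraper itself.
class JournalTypes {

    // Journal type IDs and the journals they cover, copied from the table above.
    private static final Map<Integer, List<String>> TYPES =
            new LinkedHashMap<Integer, List<String>>();
    static {
        TYPES.put(1, Arrays.asList("Current Biology", "Developmental Cell", "Cell"));
        TYPES.put(2, Arrays.asList("Developmental Dynamics"));
        TYPES.put(3, Arrays.asList("Development"));
        TYPES.put(4, Arrays.asList("Mechanisms of Development", "Developmental Biology"));
        TYPES.put(5, Arrays.asList("Proceedings of the National Academy of Sciences"));
        TYPES.put(6, Arrays.asList("Journal of Cell Biology"));
    }

    // Returns the journals handled by the given type ID, or throws if the ID is unknown.
    static List<String> journalsFor(int id) {
        List<String> journals = TYPES.get(id);
        if (journals == null) {
            throw new IllegalArgumentException("Unknown journal type ID: " + id);
        }
        return journals;
    }

    // Checks both parameters and returns them as a two-element argument array
    // (assumed order: URL, then journal type ID).
    static String[] buildArgs(String url, int journalTypeId) {
        URI uri = URI.create(url); // throws IllegalArgumentException on a malformed URL
        String scheme = uri.getScheme();
        if (scheme == null || !(scheme.equals("http") || scheme.equals("https"))) {
            throw new IllegalArgumentException("Expected an http(s) URL: " + url);
        }
        journalsFor(journalTypeId); // rejects IDs outside the table
        return new String[] {url, String.valueOf(journalTypeId)};
    }

    public static void main(String[] args) {
        String[] runnerArgs = buildArgs("http://dx.doi.org/10.1002/dvdy.22099", 2);
        System.out.println(Arrays.toString(runnerArgs) + " -> " + journalsFor(2));
    }
}
```

The resulting array could then be supplied as the program arguments when launching
ScraperStubRunner from the IDE.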
			