96 Commits

Author SHA1 Message Date
Bryan Bishop
05669229c4 catch PDFNotImplementedErrors 2013-02-11 10:03:28 -06:00
Bryan Bishop
5fbeedd76b remove extra periods from filenames 2013-02-09 19:22:51 -06:00
Bryan Bishop
a8abdb2322 support both jap.aip.org and apl.aip.org 2013-02-09 08:03:29 -06:00
Bryan Bishop
143323b096 README: better description of paperbot's manners 2013-02-09 07:56:08 -06:00
Bryan Bishop
7209fbb620 README: mention pdfparanoia 2013-02-09 07:54:26 -06:00
Bryan Bishop
db58d53c10 pass StringIO to pdfparanoia 2013-02-09 07:45:53 -06:00
Bryan Bishop
53de3f3648 scrub away watermarks in another situation 2013-02-09 07:42:57 -06:00
Bryan Bishop
357e268e96 use pdfparanoia to remove watermarks 2013-02-09 07:41:50 -06:00
Bryan Bishop
bef66e1241 citation_pdf_url is not always available 2013-02-08 15:23:29 -06:00
Bryan Bishop
d400040c10 an even better IEEE fix 2013-02-08 04:45:43 -06:00
Bryan Bishop
c48a377f44 better support for IEEE Xplore 2013-02-08 04:16:10 -06:00
Bryan Bishop
b6977593cd set pdfparanoia as a dependency 2013-02-07 21:05:51 -06:00
Bryan Bishop
0ef3debca8 prevent a catastrophic error in paper retrieval 2013-02-07 03:54:05 -06:00
Bryan Bishop
8a1b2c503e ignore citation_pdf_url on aip.org 2013-02-06 20:00:50 -06:00
Bryan Bishop
120f8fbfc0 better support for aip.org 2013-02-06 19:57:42 -06:00
Joe Rayhawk
24fcf41760 translation server submodule: switch over to kanzure's github and the paperbot branch specifically 2013-02-05 01:21:41 -08:00
Joe Rayhawk
715b73db7b Add phenny and translation-server as submodules. 2013-02-05 01:12:39 -08:00
Bryan Bishop
1fe6d64db0 README: also also 2013-01-27 08:03:14 -06:00
Bryan Bishop
88aff1a06c README: describe another paperbot compulsion 2013-01-27 08:01:05 -06:00
Bryan Bishop
dfb9b34c5c explicitly list some python dependencies 2013-01-27 07:59:38 -06:00
Bryan Bishop
32d476faae README: add a link to phenny 2013-01-27 07:51:44 -06:00
Bryan Bishop
e6842c1400 README: initial content 2013-01-27 07:50:12 -06:00
Bryan Bishop
8b3abe9222 possibly better sciencedirect handling 2013-01-23 20:01:03 -06:00
Bryan Bishop
f7d7eaa6cb fail less catastrophically for a weird sciencedirect url 2013-01-23 19:58:24 -06:00
Bryan Bishop
2c3df4e2ef handle ieee xplore login.jsp urls 2013-01-21 19:11:12 -06:00
Bryan Bishop
e021c543aa make paper titles with slashes work 2013-01-21 11:44:13 -06:00
Bryan Bishop
eba857dd7e another fix for sciencedirect.com 2013-01-20 22:21:44 -06:00
Bryan Bishop
a89129b424 fix sciencedirect.com parsing 2013-01-20 22:19:57 -06:00
Bryan Bishop
cf7c1b78e1 auto-remove proxy.lib.pdx.edu from urls 2013-01-19 19:33:17 -06:00
Bryan Bishop
9c7da548e1 fix xpath syntax 2013-01-16 02:56:06 -06:00
Bryan Bishop
191f00ad9f fix some bugs for pubs.acs.org 2013-01-16 02:53:18 -06:00
Bryan Bishop
751cb9fe63 don't encode the title until later 2013-01-16 02:43:20 -06:00
Bryan Bishop
7c50bdbaaa better case handling for find_citation_pdf_url 2013-01-16 02:41:56 -06:00
Bryan Bishop
b1dcaf0e23 fix title encoding for another pdf case 2013-01-16 02:25:04 -06:00
Bryan Bishop
723b9f18d7 make paperbot less verbose 2013-01-16 02:20:55 -06:00
Bryan Bishop
2748d54bd4 attempt pdfs when zotero fails me 2013-01-16 02:19:09 -06:00
Nathan McCorkle
6dcb23e2f8 tabs to spaces 2013-01-10 21:14:16 -08:00
Nathan McCorkle
e4074d2b3d attempt at multi URL downloads 2013-01-10 21:08:36 -08:00
Bryan Bishop
f186b7d009 Revert "added printline" (This reverts commit 956fddff8a.) 2013-01-10 19:09:36 -06:00
Bryan Bishop
46c2143eaa Revert "changed printline" (This reverts commit 58221d8e36.) 2013-01-10 19:09:30 -06:00
Nathan McCorkle
58221d8e36 changed printline 2013-01-10 16:59:01 -08:00
Nathan McCorkle
956fddff8a added printline 2013-01-10 16:57:16 -08:00
Bryan Bishop
b09a55c06e pedantic whitespace changes 2013-01-10 17:14:41 -06:00
Bryan Bishop
aa98a92edb encode the title before making a path 2013-01-10 10:17:30 -08:00
Bryan Bishop
f405e1fb50 default to downloading the url 2013-01-10 10:16:55 -08:00
Bryan Bishop
8d930c95d3 initial commit 2013-01-07 22:27:46 -08:00