Phyks
5aed10f4c5
Check if file already exists when importing
2014-04-24 21:19:27 +02:00
Phyks
c304eb2578
Import updated
* Added djvu support
* Nice mask for file renaming
TODO: Append to BibTeX index + test if file already exists
2014-04-24 19:38:52 +02:00
Phyks
1420cf37a9
Added extension checking when importing file
2014-04-24 16:23:28 +02:00
Phyks
ea53e0720f
Working on PDF import
* Search the PDF file for DOI, manual fallback if not found
* Move the PDF file
* Add its BibTeX entry to the general BibTeX file
TODO:
* Better renaming
* Adding to BibTeX file
2014-04-24 16:18:56 +02:00
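The "search the PDF file for DOI, manual fallback if not found" step in the commit above could be sketched roughly as follows. This is a hypothetical illustration, not the repository's code: the names `find_doi` and `doi_or_manual`, the regular expression, and the assumption that the PDF text has already been extracted into a string are all introduced here for the example.

```python
import re

# Hypothetical pattern: "10.", a 4-9 digit registrant prefix, "/", then a suffix.
# Real DOI syntax is more permissive; this only covers common modern cases.
DOI_PATTERN = re.compile(r'\b10\.\d{4,9}/[^\s"<>]+')


def find_doi(text):
    """Return the first DOI-like string found in text, or None."""
    match = DOI_PATTERN.search(text)
    if match is None:
        return None
    # Trailing punctuation usually belongs to the sentence, not to the DOI.
    return match.group(0).rstrip('.,;)')


def doi_or_manual(text, ask=input):
    """Return a DOI from text, asking the user when none is found."""
    doi = find_doi(text)
    if doi is None:
        # Manual fallback: `ask` defaults to the built-in input().
        doi = ask("No DOI found, please enter it manually: ").strip()
    return doi
```

Passing `ask` as a parameter keeps the fallback testable without a terminal, for example `doi_or_manual(text, ask=lambda prompt: "10.1234/abc")`.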
Phyks
93d1fefa26
Started the main code
2014-04-24 00:18:49 +02:00
Phyks
052f39b6f2
Updated README and cleaned repo
2014-04-23 22:27:55 +02:00
Phyks
41d8fb16d9
Clone repo from a3nm
2014-04-23 13:08:10 +02:00
Antoine Amarilli
e19aa9e534
Config file, SOCKS support, multiple servers
2013-05-11 16:10:48 +02:00
Antoine Amarilli
86c2e11a8c
remove phenny, tweak some things
2013-05-11 11:57:28 +02:00
Bryan Bishop
960e86327e
use a random title if title extraction fails
2013-04-15 01:13:49 -05:00
Bryan Bishop
04644364e2
fix jstor title determination
2013-02-21 17:30:25 -06:00
Bryan Bishop
16c7f4d4db
fix jstor pdf urls
2013-02-21 17:13:22 -06:00
Bryan Bishop
14bdf23876
jstor
2013-02-21 17:11:28 -06:00
Bryan Bishop
56f0caf6ae
skip translator results if [] or not http 200
2013-02-18 05:17:57 -06:00
Bryan Bishop
0253a0a9db
allow unicode in filenames when returning a url
2013-02-16 21:23:39 -06:00
Bryan Bishop
05669229c4
catch PDFNotImplementedErrors
2013-02-11 10:03:28 -06:00
Bryan Bishop
5fbeedd76b
remove extra periods from filenames
2013-02-09 19:22:51 -06:00
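One way the filename cleanup named in the commit above might work is sketched below. The function name `clean_filename`, the choice to drop periods entirely, and the underscore used for slashes are assumptions made for illustration; the log does not show what the actual change did.

```python
import re


def clean_filename(title, extension="pdf"):
    """Build a filename from a paper title (hypothetical helper)."""
    # A slash would be read as a directory separator, so replace it.
    name = title.replace("/", "_")
    # Drop periods from the title so the only period left precedes the extension.
    name = name.replace(".", "")
    # Collapse runs of whitespace and trim the ends.
    name = re.sub(r"\s+", " ", name).strip()
    return "{}.{}".format(name, extension)
```

For instance, `clean_filename("A study of A/B testing. Part 1.")` yields a name with a single period, the one before `pdf`.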
Bryan Bishop
a8abdb2322
support both jap.aip.org and apl.aip.org
2013-02-09 08:03:29 -06:00
Bryan Bishop
143323b096
README: better description of paperbot's manners
2013-02-09 07:56:08 -06:00
Bryan Bishop
7209fbb620
README: mention pdfparanoia
2013-02-09 07:54:26 -06:00
Bryan Bishop
db58d53c10
pass StringIO to pdfparanoia
2013-02-09 07:45:53 -06:00
Bryan Bishop
53de3f3648
scrub away watermarks in another situation
2013-02-09 07:42:57 -06:00
Bryan Bishop
357e268e96
use pdfparanoia to remove watermarks
2013-02-09 07:41:50 -06:00
Bryan Bishop
bef66e1241
citation_pdf_url is not always available
2013-02-08 15:23:29 -06:00
Bryan Bishop
d400040c10
an even better IEEE fix
2013-02-08 04:45:43 -06:00
Bryan Bishop
c48a377f44
better support for IEEE Xplore
2013-02-08 04:16:10 -06:00
Bryan Bishop
b6977593cd
set pdfparanoia as a dependency
2013-02-07 21:05:51 -06:00
Bryan Bishop
0ef3debca8
prevent a catastrophic error in paper retrieval
2013-02-07 03:54:05 -06:00
Bryan Bishop
8a1b2c503e
ignore citation_pdf_url on aip.org
2013-02-06 20:00:50 -06:00
Bryan Bishop
120f8fbfc0
better support for aip.org
2013-02-06 19:57:42 -06:00
Joe Rayhawk
24fcf41760
translation server submodule: switch over to kanzure's github and the paperbot branch specifically
2013-02-05 01:21:41 -08:00
Joe Rayhawk
715b73db7b
Add phenny and translation-server as submodules.
2013-02-05 01:12:39 -08:00
Bryan Bishop
1fe6d64db0
README: also also
2013-01-27 08:03:14 -06:00
Bryan Bishop
88aff1a06c
README: describe another paperbot compulsion
2013-01-27 08:01:05 -06:00
Bryan Bishop
dfb9b34c5c
explicitly list some python dependencies
2013-01-27 07:59:38 -06:00
Bryan Bishop
32d476faae
README: add a link to phenny
2013-01-27 07:51:44 -06:00
Bryan Bishop
e6842c1400
README: initial content
2013-01-27 07:50:12 -06:00
Bryan Bishop
8b3abe9222
possibly better sciencedirect handling
2013-01-23 20:01:03 -06:00
Bryan Bishop
f7d7eaa6cb
fail less catastrophically for a weird sciencedirect url
2013-01-23 19:58:24 -06:00
Bryan Bishop
2c3df4e2ef
handle ieee xplore login.jsp urls
2013-01-21 19:11:12 -06:00
Bryan Bishop
e021c543aa
make paper titles with slashes work
2013-01-21 11:44:13 -06:00
Bryan Bishop
eba857dd7e
another fix for sciencedirect.com
2013-01-20 22:21:44 -06:00
Bryan Bishop
a89129b424
fix sciencedirect.com parsing
2013-01-20 22:19:57 -06:00
Bryan Bishop
cf7c1b78e1
auto-remove proxy.lib.pdx.edu from urls
2013-01-19 19:33:17 -06:00
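Stripping a library proxy from a URL, as the commit above describes, could look like the following sketch. It assumes the proxy appears as a hostname suffix (for example `www.sciencedirect.com.proxy.lib.pdx.edu`), which is a common proxy convention but is not confirmed by this log; `strip_proxy` and `PROXY_SUFFIX` are hypothetical names, and the code targets Python 3 rather than the Python version the project used at the time.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed form of the proxy rewrite: the proxy host is appended to the real host.
PROXY_SUFFIX = ".proxy.lib.pdx.edu"


def strip_proxy(url):
    """Return url with the proxy suffix removed from its hostname, if present."""
    parts = urlsplit(url)
    host = parts.netloc
    # Note: a netloc carrying an explicit port is not handled in this sketch.
    if host.endswith(PROXY_SUFFIX):
        host = host[:-len(PROXY_SUFFIX)]
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))
```

URLs that do not go through the proxy are returned unchanged.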
Bryan Bishop
9c7da548e1
fix xpath syntax
2013-01-16 02:56:06 -06:00
Bryan Bishop
191f00ad9f
fix some bugs for pubs.acs.org
2013-01-16 02:53:18 -06:00
Bryan Bishop
751cb9fe63
don't encode the title until later
2013-01-16 02:43:20 -06:00
Bryan Bishop
7c50bdbaaa
better case handling for find_citation_pdf_url
2013-01-16 02:41:56 -06:00
Bryan Bishop
b1dcaf0e23
fix title encoding for another pdf case
2013-01-16 02:25:04 -06:00
Bryan Bishop
723b9f18d7
make paperbot less verbose
2013-01-16 02:20:55 -06:00