Compare commits

...

68 Commits

Author SHA1 Message Date
Lucas Verney 96a85feec0 Merge branch 'master' of https://github.com/Phyks/BMC 2016-01-10 17:59:41 +01:00
Lucas Verney 4dbc13e44c Add link to libbmc 2016-01-10 17:59:23 +01:00
Lucas Verney c84159068c Merge pull request #33 from bcbnz/fixdoisearch
Search for Digital Object Identifier as well as DOI in text.
2015-12-07 19:08:20 +01:00
Blair Bonnett 330c2f2b5f Search for Digital Object Identifier as well as DOI in text.
If the paper identifier is marked with Digital Object Identifier, but
one or more of its references has a DOI link in it, then the reference
DOI is taken as the paper one. This change replaces the words Digital
Object Identifier with DOI in the text being searched to pull out the
correct ID.
2015-12-07 15:39:57 +13:00
Phyks 5f8665940d Fix unittests for Python 2.7 2015-09-05 16:55:14 +02:00
Phyks f7bcdece5f Fix unittests for Python 2.7 2015-09-05 16:49:56 +02:00
Phyks 1b83f01581 Fix unittests 2015-09-05 16:46:11 +02:00
Phyks 851db96fa8 Fix entries names 2015-08-31 16:04:41 +02:00
Lucas Verney a25f21f451 Rename type dict entry to entrytype according to change in bibtexparser 0.6.1 2015-08-21 23:45:19 +02:00
Lucas Verney 94c2771e4e Merge pull request #28 from sciunto/license
add LICENSE file
2015-06-11 18:09:29 +02:00
François Boulogne 80b4064396 add LICENSE file 2015-06-11 11:46:58 -04:00
Phyks d2e415a1c5 Do not specify Python version by default 2015-06-11 16:56:14 +02:00
Phyks b655c50f07 Add --keep argument for delete action, see issue #26 2015-06-11 16:54:52 +02:00
Phyks 3232fc68be Ensure Papers dir exist, see issue #27 2015-06-11 16:54:52 +02:00
Phyks 82ed48a9e0 Fix on Windows + Fix issue #25 2015-06-11 16:54:43 +02:00
Phyks 84a7a1cd63 Absolute paths for bib index 2015-06-08 21:15:18 +02:00
Phyks e7edf7e5bf Fix remaining bug 2015-06-08 20:02:09 +02:00
Phyks 0ba55402d2 Trailing new line in isbn bib 2015-06-06 16:36:07 +02:00
Phyks ab27d96f96 Fix Travis build 2015-06-06 16:28:39 +02:00
Phyks 7c54c9fd2e Add an option to leave the imported file in place 2015-06-06 16:03:32 +02:00
Phyks fbb158543b Fix issue #21 2014-12-03 12:54:24 +01:00
Phyks 485516db07 Update doc for unittests 2014-12-03 12:18:39 +01:00
Phyks ce619b9cfe Fix issue #21 + encoding 2014-11-30 16:44:04 +01:00
Lucas Verney f357f4600c Merge pull request #22 from drvinceknight/master
Adding build file to gitignore
2014-11-30 16:30:22 +01:00
vince 75dd7b4e57 Adding build file to gitignore 2014-11-30 12:18:00 +00:00
Phyks 3e6d5f490f Forgot a comma 2014-11-05 14:46:39 +01:00
Phyks 2e6d9c0f79 Sort keys in config 2014-11-05 14:44:05 +01:00
Phyks c1055ecb8c Update config for pretty printing 2014-11-05 00:02:43 +01:00
Phyks 08f6b8846a Catch socket.error exceptions as seen in issue #20 2014-11-04 21:51:44 +01:00
Phyks 0fb579a58c Forgot to update function names in bmc.py 2014-11-04 21:46:47 +01:00
Phyks 2d949dd299 Forgot to push the PDF file for unittests 2014-11-04 21:12:22 +01:00
Phyks a39c1d94d0 Solve issue #19 2014-11-04 21:11:19 +01:00
Phyks 9f821a409c setup.py 2014-10-11 23:19:32 +02:00
Phyks 9a80f0c1fe Update src tests files 2014-10-07 11:37:26 +02:00
Phyks e7b409c8b7 Horrible file writing 2014-10-07 11:30:44 +02:00
Phyks d506124f9d Fix test files 2014-10-07 11:26:06 +02:00
Phyks 1ed4c4e623 Update README 2014-10-07 11:22:13 +02:00
Phyks a8d9f4e7b5 Kick Python 3.2 2014-08-04 12:58:43 +02:00
Phyks 096241ea48 Merge branch 'master' of https://github.com/Phyks/BMC 2014-08-04 00:18:49 +02:00
Phyks d059005946 Fix for Python 3 2014-08-03 23:34:17 +02:00
Phyks 5bf247205f Update README for Python3 compatibility 2014-08-03 23:12:43 +02:00
Phyks 9ab00fdded Small differences in between py2 and py3 2014-08-03 22:59:00 +02:00
Phyks f311de7043 Fix 2014-08-03 21:52:01 +02:00
Phyks a07c2ea292 Fix sys.stdout.encoding error 2014-08-03 21:37:34 +02:00
Phyks 3c88752cf9 Further bugfixes for python3 2014-08-03 21:20:48 +02:00
Phyks ed449e17e2 Fix for python3 2014-08-03 19:10:30 +02:00
Phyks bb297adfc5 Further fixes for python3 2014-08-03 12:38:40 +02:00
Phyks 07d8d43a7c Fix tools.py for python3 2014-08-03 00:40:37 +02:00
Phyks 15ccbb95c9 Update Travis instructions 2014-08-03 00:22:08 +02:00
Phyks 7fda1bd5fa Flake8 fixes 2014-08-03 00:17:01 +02:00
Phyks 35541a43e6 Fix imports + subparsers in python3 2014-08-03 00:09:07 +02:00
Phyks ce9d13eafa Edit README.md accordingly 2014-08-02 23:35:29 +02:00
Phyks 5f908a6d7b Rewrite to use PySocks 2014-08-02 23:34:34 +02:00
Lucas Verney da555c0bad Update README.md
Fix Travis icon
2014-08-02 22:03:50 +02:00
Phyks 1a03ab6d70 Fix imports 2014-08-01 01:33:32 +02:00
Phyks 8e79bf214e Update tests 2014-08-01 01:00:16 +02:00
Phyks 229a3617ee Update tests 2014-08-01 00:50:40 +02:00
Lucas Verney cbc2a175a5 Update README.md
Add Travis build status.
2014-08-01 00:45:04 +02:00
Lucas Verney 20517210ff Merge pull request #14 from sciunto/master
Tear pages
2014-07-13 17:44:16 +02:00
François Boulogne 61801c50f6 Update readme for tearpages 2014-07-12 23:03:02 -04:00
François Boulogne f2bfdf5336 I, F. Boulogne, as the author, relicense this code. 2014-07-12 23:00:59 -04:00
Lucas Verney 818b811e24 Merge pull request #13 from sciunto/setup
add a setup.py
2014-07-12 01:26:54 +02:00
François Boulogne df4c929ef8 add a setup.py 2014-07-11 19:22:00 -04:00
Lucas Verney 0742f3f4c0 Update README.md
Add @ßciunto in the thanks part.
2014-07-11 10:32:04 +02:00
Lucas Verney ae66f3b04c Merge pull request #12 from sciunto/lib
Store libs in a specific directory
2014-07-11 10:29:54 +02:00
François Boulogne 7e570322c0 fix paths 2014-07-10 22:56:47 -04:00
François Boulogne 22e4a09bda fix import 2014-07-10 22:52:49 -04:00
François Boulogne f123bc3ad1 add lib directory 2014-07-10 22:50:16 -04:00
29 changed files with 533 additions and 304 deletions
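The fixdoisearch change merged above (commit 330c2f2b5f) boils down to normalizing the long-form marker before extraction, so the paper's own identifier is matched instead of a reference's DOI link. A minimal sketch, where `find_doi` and its regex are simplified stand-ins for BMC's real extractor:

```python
import re

def find_doi(text):
    """Sketch of the fixdoisearch idea: rewrite the phrase
    "Digital Object Identifier" to "DOI" before searching, so a paper
    marked with the long form is found before any reference DOIs."""
    # Normalize the long-form marker to the short form first.
    text = text.replace("Digital Object Identifier", "DOI")
    # Minimal DOI pattern; the real extractor is more thorough.
    match = re.search(r"DOI[:\s]*(10\.\d{4,9}/[^\s]+)", text)
    return match.group(1) if match else None
```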

.gitignore (3 changes)

@@ -8,3 +8,6 @@
*.pdf
*.bib
*.djvu
# build
build/

.travis.yml

@@ -1,12 +1,13 @@
language: python
python:
- 2.7
- 3.3
before_install:
- sudo apt-get update
# command to install dependencies, e.g. pip install -r requirements.txt --use-mirrors
install:
- pip install arxiv2bib
- pip install requesocks
- pip install PySocks
- pip install pyPDF2
- pip install tear-pages
- pip install isbnlib
@@ -14,7 +15,7 @@ install:
- pip install coveralls
- sudo apt-get install -qq poppler-utils
- sudo apt-get install -qq djvulibre-bin
# - python setup.py install
- python setup.py install
# command to run tests, e.g. python setup.py test
script:
- nosetests

LICENSE (new file, 9 changes)

@@ -0,0 +1,9 @@
* --------------------------------------------------------------------------------
* "THE NO-ALCOHOL BEER-WARE LICENSE" (Revision 42):
* Phyks (webmaster@phyks.me) wrote this file. As long as you retain this notice you
* can do whatever you want with this stuff (and you can also do whatever you want
* with this stuff without retaining it, but that's not cool...). If we meet some
* day, and you think this stuff is worth it, you can buy me a <del>beer</del> soda
* in return.
* Phyks
* ---------------------------------------------------------------------------------

README.md

@@ -5,6 +5,11 @@ BiblioManager is a simple script to download and store your articles. Read on if
**Note :** This script is currently a work in progress.
**Note: If you want to extract some functions from this repo, please consider using [libbmc](https://github.com/Phyks/libbmc/) instead, which is specifically dedicated to this (and this repo should be using it, rather than duplicating code).**
Travis build status : [![Build Status](https://travis-ci.org/Phyks/BMC.svg?branch=master)](https://travis-ci.org/Phyks/BMC)
## What is BiblioManager (or what it is **not**) ?
I used to have a folder with poorly named papers and books and wanted something to help me handle it. I don't like Mendeley and Zotero and so on, which are heavy and overkill for my needs. I just want to feed a script with PDF files of papers and books, or URLs to PDF files, and I want it to automatically maintain a BibTeX index of these files, to help me cite them and find them back. Then, I want it to give me a way to easily retrieve a file, either by author, by title or with some other search method, and give me the associated bibtex entry.
@@ -56,12 +61,13 @@ Should be almost working and usable now, although still to be considered as **ex
```
git clone https://github.com/Phyks/BMC
```
* Install `arxiv2bib`, `tear-pages`, `requesocks`, `bibtexparser` (https://github.com/sciunto/python-bibtexparser), `PyPDF2` and `isbnlib` _via_ Pypi
* Install `arxiv2bib`, `PySocks`, `bibtexparser` (https://github.com/sciunto/python-bibtexparser), `PyPDF2` and `isbnlib` _via_ Pypi (or better, in a virtualenv, or using your package manager, according to your preferences)
```
sudo pip install arxiv2bib requesocks bibtexparser pyPDF2 isbnlib
sudo pip install arxiv2bib PySocks bibtexparser pyPDF2 isbnlib
```
(replace pip by pip2 if your distribution ships python3 by default)
(this script should be compatible with Python 2 and Python 3)
* Install `pdftotext` (provided by Xpdf) and `djvulibre` _via_ your package manager or the way you want
* Install the script _via_ `python setup.py install`.
* Run the script to initialize the conf in `~/.config/bmc/bmc.json`.
* Customize the configuration by editing `~/.config/bmc/bmc.json` according to your needs. A documentation of the available options can be found in file `config.py`.
* _Power users :_ Add your custom masks in `~/.config/bmc/masks.py`.
@@ -117,6 +123,12 @@ All your documents will be stored in the papers dir specified in `~/.config/bmc/
The resync option will check that all bibtex entries have a corresponding file and all file have a corresponding bibtex entry. It will prompt you what to do for unmatched entries.
## Unittests
Unittests are available for all the files in the `lib/`. You can simply run the tests using `nosetests`. Builds are run after each commit on [Travis](https://travis-ci.org/Phyks/BMC).
## License
All the source code I wrote is under a `no-alcohol beer-ware license`. All functions that I didn't write myself are under the original license and their origin is specified in the function itself.
@@ -132,7 +144,6 @@ All the source code I wrote is under a `no-alcohol beer-ware license`. All funct
* ---------------------------------------------------------------------------------
```
I used the `tearpages.py` script from sciunto, which can be found [here](https://github.com/sciunto/tear-pages) and is released under a GNU GPLv3 license.
## Inspiration
@@ -147,8 +158,6 @@ Here are some sources of inspirations for this project :
A list of ideas and TODO. Don't hesitate to give feedback on the ones you really want or to propose your owns.
60. Unittest
70. Python3 compatibility ?
80. Search engine
85. Anti-duplicate ?
90. Look for published version in arXiv
@@ -160,6 +169,7 @@ A list of ideas and TODO. Don't hesitate to give feedback on the ones you really
* Nathan Grigg for his [arxiv2bib](https://pypi.python.org/pypi/arxiv2bib/1.0.5#downloads) python module
* François Boulogne for his [python-bibtexparser](https://github.com/sciunto/python-bibtexparser) python module and his integration of new requested features
* pyparsing [search parser example](http://pyparsing.wikispaces.com/file/view/searchparser.py)
* François Boulogne (@sciunto) for his (many) contributions to this software !
## Note on test files

bmc.py (218 changes)

@@ -1,19 +1,21 @@
#!/usr/bin/env python2
#!/usr/bin/env python
# -*- coding: utf8 -*-
from __future__ import unicode_literals
import argparse
import os
import shutil
import subprocess
import sys
import tempfile
import backend
import fetcher
import tearpages
import tools
from bibtexparser.bparser import BibTexParser
import bibtexparser
from codecs import open
from config import Config
from libbmc.config import Config
from libbmc import backend
from libbmc import fetcher
from libbmc import tearpages
from libbmc import tools
config = Config()
@@ -23,23 +25,24 @@ EDITOR = os.environ.get('EDITOR') if os.environ.get('EDITOR') else 'vim'
def checkBibtex(filename, bibtex_string):
print("The bibtex entry found for "+filename+" is:")
bibtex = BibTexParser(bibtex_string)
bibtex = bibtex.get_entry_dict()
bibtex = bibtexparser.loads(bibtex_string)
bibtex = bibtex.entries_dict
try:
bibtex = bibtex[bibtex.keys()[0]]
bibtex = bibtex[list(bibtex.keys())[0]]
# Check entries are correct
assert bibtex['title']
if bibtex['type'] == 'article':
assert bibtex['authors']
elif bibtex['type'] == 'book':
assert bibtex['author']
assert bibtex['year']
if "title" not in bibtex:
raise AssertionError
if "authors" not in bibtex and "author" not in bibtex:
raise AssertionError
if "year" not in bibtex:
raise AssertionError
# Print the bibtex and confirm
print(tools.parsed2Bibtex(bibtex))
check = tools.rawInput("Is it correct? [Y/n] ")
except KeyboardInterrupt:
sys.exit()
except (KeyError, AssertionError):
except (IndexError, KeyError, AssertionError):
print("Missing author, year or title in bibtex.")
check = 'n'
try:
@@ -49,16 +52,16 @@ def checkBibtex(filename, bibtex_string):
while check.lower() == 'n':
with tempfile.NamedTemporaryFile(suffix=".tmp") as tmpfile:
tmpfile.write(bibtex_string)
tmpfile.write(bibtex_string.encode('utf-8'))
tmpfile.flush()
subprocess.call([EDITOR, tmpfile.name])
tmpfile.seek(0)
bibtex = BibTexParser(tmpfile.read()+"\n")
bibtex = bibtexparser.loads(tmpfile.read().decode('utf-8')+"\n")
bibtex = bibtex.get_entry_dict()
bibtex = bibtex.entries_dict
try:
bibtex = bibtex[bibtex.keys()[0]]
except KeyError:
bibtex = bibtex[list(bibtex.keys())[0]]
except (IndexError, KeyError):
tools.warning("Invalid bibtex entry")
bibtex_string = ''
tools.rawInput("Press Enter to go back to editor.")
@@ -90,7 +93,7 @@ def checkBibtex(filename, bibtex_string):
return bibtex
def addFile(src, filetype, manual, autoconfirm, tag):
def addFile(src, filetype, manual, autoconfirm, tag, rename=True):
"""
Add a file to the library
"""
@@ -101,9 +104,11 @@ def addFile(src, filetype, manual, autoconfirm, tag):
if not manual:
try:
if filetype == 'article' or filetype is None:
doi = fetcher.findDOI(src)
if doi is False and (filetype == 'article' or filetype is None):
arxiv = fetcher.findArXivId(src)
id_type, article_id = fetcher.findArticleID(src)
if id_type == "DOI":
doi = article_id
elif id_type == "arXiv":
arxiv = article_id
if filetype == 'book' or (doi is False and arxiv is False and
filetype is None):
@@ -172,10 +177,10 @@ def addFile(src, filetype, manual, autoconfirm, tag):
else:
bibtex = ''
bibtex = BibTexParser(bibtex)
bibtex = bibtex.get_entry_dict()
bibtex = bibtexparser.loads(bibtex)
bibtex = bibtex.entries_dict
if len(bibtex) > 0:
bibtex_name = bibtex.keys()[0]
bibtex_name = list(bibtex.keys())[0]
bibtex = bibtex[bibtex_name]
bibtex_string = tools.parsed2Bibtex(bibtex)
else:
@@ -190,30 +195,33 @@ def addFile(src, filetype, manual, autoconfirm, tag):
tag = args.tag
bibtex['tag'] = tag
new_name = backend.getNewName(src, bibtex, tag)
if rename:
new_name = backend.getNewName(src, bibtex, tag)
while os.path.exists(new_name):
tools.warning("file "+new_name+" already exists.")
default_rename = new_name.replace(tools.getExtension(new_name),
" (2)"+tools.getExtension(new_name))
rename = tools.rawInput("New name ["+default_rename+"]? ")
if rename == '':
new_name = default_rename
else:
new_name = rename
bibtex['file'] = new_name
try:
shutil.copy2(src, new_name)
except shutil.Error:
new_name = False
sys.exit("Unable to move file to library dir " +
config.get("folder")+".")
while os.path.exists(new_name):
tools.warning("file "+new_name+" already exists.")
default_rename = new_name.replace(tools.getExtension(new_name),
" (2)" +
tools.getExtension(new_name))
rename = tools.rawInput("New name ["+default_rename+"]? ")
if rename == '':
new_name = default_rename
else:
new_name = rename
try:
shutil.copy2(src, new_name)
except shutil.Error:
new_name = False
sys.exit("Unable to move file to library dir " +
config.get("folder")+".")
else:
new_name = src
bibtex['file'] = os.path.abspath(new_name)
# Remove first page of IOP papers
try:
if 'IOP' in bibtex['publisher'] and bibtex['type'] == 'article':
tearpages.main(new_name)
if 'IOP' in bibtex['publisher'] and bibtex['ENTRYTYPE'] == 'article':
tearpages.tearpage(new_name)
except (KeyError, shutil.Error, IOError):
pass
@@ -268,13 +276,13 @@ def editEntry(entry, file_id='both'):
try:
with open(config.get("folder")+'index.bib', 'r', encoding='utf-8') \
as fh:
index = BibTexParser(fh.read())
index = index.get_entry_dict()
index = bibtexparser.load(fh)
index = index.entries_dict
except (TypeError, IOError):
tools.warning("Unable to open index file.")
return False
index[new_bibtex['id']] = new_bibtex
index[new_bibtex['ID']] = new_bibtex
backend.bibtexRewrite(index)
return True
@@ -287,7 +295,7 @@ def downloadFile(url, filetype, manual, autoconfirm, tag):
print('Download finished')
tmp = tempfile.NamedTemporaryFile(suffix='.'+contenttype)
with open(tmp.name, 'w+') as fh:
with open(tmp.name, 'wb+') as fh:
fh.write(dl)
new_name = addFile(tmp.name, filetype, manual, autoconfirm, tag)
if new_name is False:
@@ -303,13 +311,13 @@ def openFile(ident):
try:
with open(config.get("folder")+'index.bib', 'r', encoding='utf-8') \
as fh:
bibtex = BibTexParser(fh.read())
bibtex = bibtex.get_entry_dict()
bibtex = bibtexparser.load(fh)
bibtex = bibtex.entries_dict
except (TypeError, IOError):
tools.warning("Unable to open index file.")
return False
if ident not in bibtex.keys():
if ident not in list(bibtex.keys()):
return False
else:
subprocess.Popen(['xdg-open', bibtex[ident]['file']])
@@ -326,7 +334,7 @@ def resync():
entry = diff[key]
if entry['file'] == '':
print("\nFound entry in index without associated file: " +
entry['id'])
entry['ID'])
print("Title:\t"+entry['title'])
loop = True
while confirm:
@@ -336,23 +344,23 @@ def resync():
if filename == '':
break
else:
if 'doi' in entry.keys():
doi = fetcher.findDOI(filename)
if 'doi' in list(entry.keys()):
doi = fetcher.findArticleID(filename, only=["DOI"])
if doi is not False and doi != entry['doi']:
loop = tools.rawInput("Found DOI does not " +
"match bibtex entry " +
"DOI, continue anyway " +
"? [y/N]")
loop = (loop.lower() != 'y')
if 'Eprint' in entry.keys():
arxiv = fetcher.findArXivId(filename)
if 'Eprint' in list(entry.keys()):
arxiv = fetcher.findArticleID(filename, only=["arXiv"])
if arxiv is not False and arxiv != entry['Eprint']:
loop = tools.rawInput("Found arXiv id does " +
"not match bibtex " +
"entry arxiv id, " +
"continue anyway ? [y/N]")
loop = (loop.lower() != 'y')
if 'isbn' in entry.keys():
if 'isbn' in list(entry.keys()):
isbn = fetcher.findISBN(filename)
if isbn is not False and isbn != entry['isbn']:
loop = tools.rawInput("Found ISBN does not " +
@@ -362,19 +370,19 @@ def resync():
loop = (loop.lower() != 'y')
continue
if filename == '':
backend.deleteId(entry['id'])
print("Deleted entry \""+entry['id']+"\".")
backend.deleteId(entry['ID'])
print("Deleted entry \""+entry['ID']+"\".")
else:
new_name = backend.getNewName(filename, entry)
try:
shutil.copy2(filename, new_name)
print("Imported new file "+filename+" for entry " +
entry['id']+".")
entry['ID']+".")
except shutil.Error:
new_name = False
sys.exit("Unable to move file to library dir " +
config.get("folder")+".")
backend.bibtexEdit(entry['id'], {'file': filename})
backend.bibtexEdit(entry['ID'], {'file': filename})
else:
print("Found file without any associated entry in index:")
print(entry['file'])
@@ -430,46 +438,70 @@ def update(entry):
print("Previous version successfully deleted.")
def commandline_arg(bytestring):
# UTF-8 encoding for python2
if sys.version_info >= (3, 0):
unicode_string = bytestring
else:
unicode_string = bytestring.decode(sys.getfilesystemencoding())
return unicode_string
if __name__ == '__main__':
parser = argparse.ArgumentParser(description="A bibliography " +
"management tool.")
subparsers = parser.add_subparsers(help="sub-command help")
subparsers = parser.add_subparsers(help="sub-command help", dest='parser')
subparsers.required = True # Fix for Python 3.3.5
parser_download = subparsers.add_parser('download', help="download help")
parser_download.add_argument('-t', '--type', default=None,
choices=['article', 'book'],
help="type of the file to download")
help="type of the file to download",
type=commandline_arg)
parser_download.add_argument('-m', '--manual', default=False,
action='store_true',
help="disable auto-download of bibtex")
parser_download.add_argument('-y', default=False,
help="Confirm all")
parser_download.add_argument('--tag', default='', help="Tag")
parser_download.add_argument('--tag', default='',
help="Tag", type=commandline_arg)
parser_download.add_argument('--keep', default=False,
help="Do not remove the file")
parser_download.add_argument('url', nargs='+',
help="url of the file to import")
help="url of the file to import",
type=commandline_arg)
parser_download.set_defaults(func='download')
parser_import = subparsers.add_parser('import', help="import help")
parser_import.add_argument('-t', '--type', default=None,
choices=['article', 'book'],
help="type of the file to import")
help="type of the file to import",
type=commandline_arg)
parser_import.add_argument('-m', '--manual', default=False,
action='store_true',
help="disable auto-download of bibtex")
parser_import.add_argument('-y', default=False,
help="Confirm all")
parser_import.add_argument('--tag', default='', help="Tag")
parser_import.add_argument('--tag', default='', help="Tag",
type=commandline_arg)
parser_import.add_argument('--in-place', default=False,
dest="inplace", action='store_true',
help="Leave the imported file in place",)
parser_import.add_argument('file', nargs='+',
help="path to the file to import")
help="path to the file to import",
type=commandline_arg)
parser_import.add_argument('--skip', nargs='+',
help="path to files to skip", default=[])
help="path to files to skip", default=[],
type=commandline_arg)
parser_import.set_defaults(func='import')
parser_delete = subparsers.add_parser('delete', help="delete help")
parser_delete.add_argument('entries', metavar='entry', nargs='+',
help="a filename or an identifier")
help="a filename or an identifier",
type=commandline_arg)
parser_delete.add_argument('--skip', nargs='+',
help="path to files to skip", default=[])
help="path to files to skip", default=[],
type=commandline_arg)
group = parser_delete.add_mutually_exclusive_group()
group.add_argument('--id', action="store_true", default=False,
help="id based deletion")
@@ -482,9 +514,11 @@ if __name__ == '__main__':
parser_edit = subparsers.add_parser('edit', help="edit help")
parser_edit.add_argument('entries', metavar='entry', nargs='+',
help="a filename or an identifier")
help="a filename or an identifier",
type=commandline_arg)
parser_edit.add_argument('--skip', nargs='+',
help="path to files to skip", default=[])
help="path to files to skip", default=[],
type=commandline_arg)
group = parser_edit.add_mutually_exclusive_group()
group.add_argument('--id', action="store_true", default=False,
help="id based deletion")
@@ -500,12 +534,14 @@ if __name__ == '__main__':
parser_open = subparsers.add_parser('open', help="open help")
parser_open.add_argument('ids', metavar='id', nargs='+',
help="an identifier")
help="an identifier",
type=commandline_arg)
parser_open.set_defaults(func='open')
parser_export = subparsers.add_parser('export', help="export help")
parser_export.add_argument('ids', metavar='id', nargs='+',
help="an identifier")
help="an identifier",
type=commandline_arg)
parser_export.set_defaults(func='export')
parser_resync = subparsers.add_parser('resync', help="resync help")
@@ -513,12 +549,14 @@ if __name__ == '__main__':
parser_update = subparsers.add_parser('update', help="update help")
parser_update.add_argument('--entries', metavar='entry', nargs='+',
help="a filename or an identifier")
help="a filename or an identifier",
type=commandline_arg)
parser_update.set_defaults(func='update')
parser_search = subparsers.add_parser('search', help="search help")
parser_search.add_argument('query', metavar='entry', nargs='+',
help="your query, see README for more info.")
help="your query, see README for more info.",
type=commandline_arg)
parser_search.set_defaults(func='search')
args = parser.parse_args()
@@ -543,7 +581,7 @@ if __name__ == '__main__':
skipped = []
for filename in list(set(args.file) - set(args.skip)):
new_name = addFile(filename, args.type, args.manual, args.y,
args.tag)
args.tag, not args.inplace)
if new_name is not False:
print(filename+" successfully imported as " +
new_name+".")
@@ -567,8 +605,9 @@ if __name__ == '__main__':
confirm = 'y'
if confirm.lower() == 'y':
if args.file or not backend.deleteId(filename):
if args.id or not backend.deleteFile(filename):
if args.file or not backend.deleteId(filename, args.keep):
if(args.id or
not backend.deleteFile(filename, args.keep)):
tools.warning("Unable to delete "+filename)
sys.exit(1)
@@ -594,13 +633,14 @@ if __name__ == '__main__':
sys.exit()
elif args.func == 'list':
listPapers = tools.listDir(config.get("folder"))
listPapers = backend.getEntries(full=True)
if not listPapers:
sys.exit()
listPapers = [v["file"] for k, v in listPapers.items()]
listPapers.sort()
for paper in listPapers:
if tools.getExtension(paper) not in [".pdf", ".djvu"]:
continue
print(paper)
sys.exit()
elif args.func == 'search':
raise Exception('TODO')
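Many hunks in bmc.py replace `bibtex.keys()[0]` with `list(bibtex.keys())[0]`. The reason is a Python 2/3 difference, shown here with a plain dict standing in for bibtexparser's `entries_dict`:

```python
# Under Python 2, dict.keys() returned a list, so keys()[0] worked;
# under Python 3 it returns a view object that does not support indexing.
entries = {"doe2014": {"title": "Some Paper"}}

# Python-2-only style (raises TypeError on Python 3):
#   first = entries.keys()[0]

# Portable style used throughout the diff:
first = list(entries.keys())[0]
assert first == "doe2014"
```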

libbmc/__init__.py (new file, 2 changes)

@@ -0,0 +1,2 @@
#!/usr/bin/env python2
# -*- coding: utf-8 -*-

libbmc/backend.py

@@ -9,13 +9,13 @@
# Phyks
# -----------------------------------------------------------------------------
from __future__ import unicode_literals
import os
import re
import tools
import fetcher
from bibtexparser.bparser import BibTexParser
from config import Config
import libbmc.tools as tools
import libbmc.fetcher as fetcher
import bibtexparser
from libbmc.config import Config
from codecs import open
@@ -29,7 +29,7 @@ def getNewName(src, bibtex, tag='', override_format=None):
"""
authors = re.split(' and ', bibtex['author'])
if bibtex['type'] == 'article':
if bibtex['ENTRYTYPE'] == 'article':
if override_format is None:
new_name = config.get("format_articles")
else:
@@ -38,7 +38,7 @@ def getNewName(src, bibtex, tag='', override_format=None):
new_name = new_name.replace("%j", bibtex['journal'])
except KeyError:
pass
elif bibtex['type'] == 'book':
elif bibtex['ENTRYTYPE'] == 'book':
if override_format is None:
new_name = config.get("format_books")
else:
@@ -103,8 +103,8 @@ def bibtexEdit(ident, modifs):
try:
with open(config.get("folder")+'index.bib', 'r', encoding='utf-8') \
as fh:
bibtex = BibTexParser(fh.read())
bibtex = bibtex.get_entry_dict()
bibtex = bibtexparser.load(fh)
bibtex = bibtex.entries_dict
except (IOError, TypeError):
tools.warning("Unable to open index file.")
return False
@@ -131,13 +131,13 @@ def bibtexRewrite(data):
return False
def deleteId(ident):
def deleteId(ident, keep=False):
"""Delete a file based on its id in the bibtex file"""
try:
with open(config.get("folder")+'index.bib', 'r', encoding='utf-8') \
as fh:
bibtex = BibTexParser(fh.read().decode('utf-8'))
bibtex = bibtex.get_entry_dict()
bibtex = bibtexparser.load(fh)
bibtex = bibtex.entries_dict
except (IOError, TypeError):
tools.warning("Unable to open index file.")
return False
@@ -145,11 +145,12 @@ def deleteId(ident):
if ident not in bibtex.keys():
return False
try:
os.remove(bibtex[ident]['file'])
except (KeyError, OSError):
tools.warning("Unable to delete file associated to id "+ident+" : " +
bibtex[ident]['file'])
if not keep:
try:
os.remove(bibtex[ident]['file'])
except (KeyError, OSError):
tools.warning("Unable to delete file associated to id " + ident +
" : " + bibtex[ident]['file'])
try:
if not os.listdir(os.path.dirname(bibtex[ident]['file'])):
@@ -167,27 +168,28 @@ def deleteId(ident):
return True
def deleteFile(filename):
def deleteFile(filename, keep=False):
"""Delete a file based on its filename"""
try:
with open(config.get("folder")+'index.bib', 'r', encoding='utf-8') \
as fh:
bibtex = BibTexParser(fh.read().decode('utf-8'))
bibtex = bibtex.get_entry_dict()
bibtex = bibtexparser.load(fh)
bibtex = bibtex.entries_dict
except (TypeError, IOError):
tools.warning("Unable to open index file.")
return False
found = False
for key in bibtex.keys():
for key in list(bibtex.keys()):
try:
if os.path.samefile(bibtex[key]['file'], filename):
found = True
try:
os.remove(bibtex[key]['file'])
except (KeyError, OSError):
tools.warning("Unable to delete file associated to id " +
key+" : "+bibtex[key]['file'])
if not keep:
try:
os.remove(bibtex[key]['file'])
except (KeyError, OSError):
tools.warning("Unable to delete file associated " +
"to id " + key+" : "+bibtex[key]['file'])
try:
if not os.listdir(os.path.dirname(filename)):
@@ -222,8 +224,8 @@ def diffFilesIndex():
try:
with open(config.get("folder")+'index.bib', 'r', encoding='utf-8') \
as fh:
index = BibTexParser(fh.read())
index_diff = index.get_entry_dict()
index = bibtexparser.load(fh)
index_diff = index.entries_dict
except (TypeError, IOError):
tools.warning("Unable to open index file.")
return False
@@ -237,7 +239,7 @@ def diffFilesIndex():
for filename in files:
index_diff[filename] = {'file': filename}
return index.get_entry_dict()
return index.entries_dict
def getBibtex(entry, file_id='both', clean=False):
@@ -250,8 +252,8 @@ def getBibtex(entry, file_id='both', clean=False):
try:
with open(config.get("folder")+'index.bib', 'r', encoding='utf-8') \
as fh:
bibtex = BibTexParser(fh.read())
bibtex = bibtex.get_entry_dict()
bibtex = bibtexparser.load(fh)
bibtex = bibtex.entries_dict
except (TypeError, IOError):
tools.warning("Unable to open index file.")
return False
@@ -277,18 +279,21 @@ def getBibtex(entry, file_id='both', clean=False):
return bibtex_entry
def getEntries():
def getEntries(full=False):
"""Returns the list of all entries in the bibtex index"""
try:
with open(config.get("folder")+'index.bib', 'r', encoding='utf-8') \
as fh:
bibtex = BibTexParser(fh.read())
bibtex = bibtex.get_entry_dict()
bibtex = bibtexparser.load(fh)
bibtex = bibtex.entries_dict
except (TypeError, IOError):
tools.warning("Unable to open index file.")
return False
return bibtex.keys()
if full:
return bibtex
else:
return list(bibtex.keys())
def updateArXiv(entry):
@@ -313,9 +318,9 @@ def updateArXiv(entry):
continue
ids.add(bibtex['eprint'])
last_bibtex = BibTexParser(fetcher.arXiv2Bib(arxiv_id_no_v))
last_bibtex = last_bibtex.get_entry_dict()
last_bibtex = last_bibtex[last_bibtex.keys()[0]]
last_bibtex = bibtexparser.loads(fetcher.arXiv2Bib(arxiv_id_no_v))
last_bibtex = last_bibtex.entries_dict
last_bibtex = last_bibtex[list(last_bibtex.keys())[0]]
if last_bibtex['eprint'] not in ids:
return last_bibtex
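The `keep` parameter threaded through `deleteId` and `deleteFile` above (the `--keep` option from issue #26) can be summarized as: always drop the entry from the index, but only unlink the file when `keep` is false. `delete_entry` below is a hypothetical simplification, not BMC's actual function:

```python
import os
import tempfile

def delete_entry(index, ident, keep=False):
    """Hypothetical sketch of the --keep behaviour: remove the entry
    from the in-memory index, and only delete the file on disk when
    keep is False."""
    entry = index.pop(ident, None)
    if entry is None:
        return False
    if not keep:
        try:
            os.remove(entry["file"])
        except OSError:
            pass  # file already gone; the index entry is removed regardless
    return True

# Demo: with keep=True the file survives deletion of its index entry.
fd, path = tempfile.mkstemp()
os.close(fd)
assert delete_entry({"doe2014": {"file": path}}, "doe2014", keep=True)
kept = os.path.exists(path)
assert delete_entry({"doe2014": {"file": path}}, "doe2014", keep=False)
removed = not os.path.exists(path)
```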

libbmc/config.py

@@ -1,10 +1,11 @@
from __future__ import unicode_literals
import os
import errno
import imp
import inspect
import json
import sys
import tools
import libbmc.tools as tools
# List of available options (in ~/.config/bmc/bmc.json file):
# * folder : folder in which papers are stored
@@ -81,12 +82,20 @@ class Config():
except (ValueError, IOError):
tools.warning("Config file could not be read.")
sys.exit(1)
try:
folder_exists = make_sure_path_exists(self.get("folder"))
except OSError:
tools.warning("Unable to create paper storage folder.")
sys.exit(1)
self.load_masks()
def save(self):
try:
with open(self.config_path + "bmc.json", 'w') as fh:
fh.write(json.dumps(self.config))
fh.write(json.dumps(self.config,
sort_keys=True,
indent=4,
separators=(',', ': ')))
except IOError:
tools.warning("Could not write config file.")
sys.exit(1)
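The new arguments passed to `json.dumps` in `Config.save()` are standard library options; this standalone snippet shows their effect on a config like the one stored in `~/.config/bmc/bmc.json`:

```python
import json

config = {"folder": "~/papers/", "proxies": [""]}

# Old behaviour: everything serialized on one line.
compact = json.dumps(config)

# New behaviour from the "Sort keys in config" and "Update config for
# pretty printing" commits: stable key order and indentation, so
# hand-editing the config file is practical.
pretty = json.dumps(config, sort_keys=True, indent=4, separators=(',', ': '))
print(pretty)
```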

libbmc/fetcher.py

@@ -12,16 +12,30 @@
import isbnlib
import re
import requesocks as requests # Requesocks is requests with SOCKS support
import socket
import socks
import subprocess
import sys
try:
# For Python 3.0 and later
from urllib.request import urlopen, Request
from urllib.error import URLError
except ImportError:
# Fall back to Python 2's urllib2
from urllib2 import urlopen, Request, URLError
import arxiv2bib as arxiv_metadata
import tools
from bibtexparser.bparser import BibTexParser
from config import Config
import libbmc.tools as tools
import bibtexparser
from libbmc.config import Config
config = Config()
default_socket = socket.socket
try:
stdout_encoding = sys.stdout.encoding
assert(stdout_encoding is not None)
except (AttributeError, AssertionError):
stdout_encoding = 'UTF-8'
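The import fallback added above is the usual Python 2/3 shim for `urllib`. A self-contained sketch; the URL and header here are purely illustrative (building a `Request` does not open a connection):

```python
try:
    # Python 3.0 and later
    from urllib.request import urlopen, Request
    from urllib.error import URLError
except ImportError:
    # Fall back to Python 2's urllib2
    from urllib2 import urlopen, Request, URLError

# Construct (but do not send) a request with a custom header.
req = Request("http://example.com", headers={"accept": "text/html"})
```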
def download(url):
@@ -32,39 +46,81 @@ def download(url):
false if it could not be downloaded.
"""
for proxy in config.get("proxies"):
r_proxy = {
"http": proxy,
"https": proxy,
}
if proxy.startswith('socks'):
if proxy[5] == '4':
proxy_type = socks.SOCKS4
else:
proxy_type = socks.SOCKS5
proxy = proxy[proxy.find('://')+3:]
try:
proxy, port = proxy.split(':')
except ValueError:
port = None
socks.set_default_proxy(proxy_type, proxy, port)
socket.socket = socks.socksocket
elif proxy == '':
socket.socket = default_socket
else:
try:
proxy, port = proxy.split(':')
except ValueError:
port = None
socks.set_default_proxy(socks.HTTP, proxy, port)
socket.socket = socks.socksocket
try:
r = requests.get(url, proxies=r_proxy)
size = int(r.headers['Content-Length'].strip())
dl = ""
r = urlopen(url)
try:
size = int(dict(r.info())['content-length'].strip())
except KeyError:
try:
size = int(dict(r.info())['Content-Length'].strip())
except KeyError:
size = 0
dl = b""
dl_size = 0
for buf in r.iter_content(1024):
while True:
buf = r.read(1024)
if buf:
dl += buf
dl_size += len(buf)
done = int(50 * dl_size / size)
sys.stdout.write("\r[%s%s]" % ('='*done, ' '*(50-done)))
sys.stdout.write(" "+str(int(float(done)/52*100))+"%")
sys.stdout.flush()
if size != 0:
done = int(50 * dl_size / size)
sys.stdout.write("\r[%s%s]" % ('='*done, ' '*(50-done)))
sys.stdout.write(" "+str(int(float(done)/52*100))+"%")
sys.stdout.flush()
else:
break
contenttype = False
if 'pdf' in r.headers['content-type']:
contenttype = 'pdf'
elif 'djvu' in r.headers['content-type']:
contenttype = 'djvu'
contenttype_req = None
try:
contenttype_req = dict(r.info())['content-type']
except KeyError:
try:
contenttype_req = dict(r.info())['Content-Type']
except KeyError:
continue
try:
if 'pdf' in contenttype_req:
contenttype = 'pdf'
elif 'djvu' in contenttype_req:
contenttype = 'djvu'
except KeyError:
pass
if r.status_code != 200 or contenttype is False:
if r.getcode() != 200 or contenttype is False:
continue
return dl, contenttype
except ValueError:
tools.warning("Invalid URL")
return False, None
except requests.exceptions.RequestException:
tools.warning("Unable to get "+url+" using proxy "+proxy+". It " +
"may not be available.")
except (URLError, socket.error):
if proxy != "":
proxy_txt = "using proxy "+proxy
else:
proxy_txt = "without using any proxy"
tools.warning("Unable to get "+url+" "+proxy_txt+". It " +
"may not be available at the moment.")
continue
return False, None
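The progress-bar arithmetic in the new download loop can be factored into a small helper. A sketch only: it uses the bar width (50) consistently for the percentage, where the diff above divides by 52:

```python
def progress_bar(dl_size, total_size, width=50):
    """Render a fixed-width bar like the one printed by the download loop."""
    done = int(width * dl_size / total_size)
    percent = int(100 * dl_size / total_size)
    return "[%s%s] %d%%" % ('=' * done, ' ' * (width - done), percent)
```

The caller still has to guard against `total_size == 0`, as the loop above does when no Content-Length header is available.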
@@ -91,7 +147,7 @@ def findISBN(src):
return False
while totext.poll() is None:
extractfull = ' '.join([i.strip() for i in totext.stdout.readlines()])
extractfull = ' '.join([i.decode(stdout_encoding).strip() for i in totext.stdout.readlines()])
extractISBN = isbn_re.search(extractfull.lower().replace('&#338;',
'-'))
if extractISBN:
@@ -117,7 +173,7 @@ def isbn2Bib(isbn):
try:
return isbnlib.registry.bibformatters['bibtex'](isbnlib.meta(isbn,
'default'))
except (isbnlib.ISBNLibException, isbnlib.ISBNToolsException, TypeError):
except (isbnlib.ISBNLibException, TypeError):
return ''
@@ -128,13 +184,16 @@ clean_doi_re = re.compile('^/')
clean_doi_fabse_re = re.compile('^10.1096')
clean_doi_jcb_re = re.compile('^10.1083')
clean_doi_len_re = re.compile(r'\d\.\d')
arXiv_re = re.compile(r'arXiv:\s*([\w\.\/\-]+)', re.IGNORECASE)
def findDOI(src):
"""Search for a valid DOI in src.
def findArticleID(src, only=["DOI", "arXiv"]):
"""Search for a valid article ID (DOI or ArXiv) in src.
Returns the DOI or False if not found or an error occurred.
Returns a tuple (type, first matching ID) or False if not found
or an error occurred.
From : http://en.dogeno.us/2010/02/release-a-python-script-for-organizing-scientific-papers-pyrenamepdf-py/
and https://github.com/minad/bibsync/blob/3fdf121016f6187a2fffc66a73cd33b45a20e55d/lib/bibsync/utils.rb
"""
if src.endswith(".pdf"):
totext = subprocess.Popen(["pdftotext", src, "-"],
@@ -145,33 +204,48 @@ def findDOI(src):
stdout=subprocess.PIPE,
stderr=subprocess.PIPE)
else:
return False
return (False, False)
extractfull = ''
extract_type = False
extractID = None
while totext.poll() is None:
extractfull += ' '.join([i.strip() for i in totext.stdout.readlines()])
extractDOI = doi_re.search(extractfull.lower().replace('&#338;', '-'))
if not extractDOI:
# PNAS fix
extractDOI = doi_pnas_re.search(extractfull.
lower().
replace('pnas', '/pnas'))
if not extractDOI:
# JSB fix
extractDOI = doi_jsb_re.search(extractfull.lower())
if extractDOI:
totext.terminate()
extractfull += ' '.join([i.decode(stdout_encoding).strip() for i in totext.stdout.readlines()])
# Try to extract DOI
if "DOI" in only:
extractlower = extractfull.lower().replace('digital object identifier', 'doi')
extractID = doi_re.search(extractlower.replace('&#338;', '-'))
if not extractID:
# PNAS fix
extractID = doi_pnas_re.search(extractlower.replace('pnas', '/pnas'))
if not extractID:
# JSB fix
extractID = doi_jsb_re.search(extractlower)
if extractID:
extract_type = "DOI"
totext.terminate()
# Try to extract arXiv
if "arXiv" in only:
tmp_extractID = arXiv_re.search(extractfull)
if tmp_extractID:
if not extractID or extractID.start(0) > tmp_extractID.start(1):
# Only use arXiv id if it is before the DOI in the pdf
extractID = tmp_extractID
extract_type = "arXiv"
totext.terminate()
if extract_type is not False:
break
err = totext.communicate()[1]
if totext.returncode > 0:
# Error happened
tools.warning(err)
return False
return (False, False)
cleanDOI = False
if extractDOI:
cleanDOI = extractDOI.group(0).replace(':', '').replace(' ', '')
if extractID is not None and extract_type == "DOI":
# If DOI extracted, clean it and return it
cleanDOI = False
cleanDOI = extractID.group(0).replace(':', '').replace(' ', '')
if clean_doi_re.search(cleanDOI):
cleanDOI = cleanDOI[1:]
# FABSE J fix
@@ -191,7 +265,11 @@ def findDOI(src):
if cleanDOItemp[i].isalpha() and digitStart:
break
cleanDOI = cleanDOI[0:(8+i)]
return cleanDOI
return ("DOI", cleanDOI)
elif extractID is not None and extract_type == "arXiv":
# If arXiv id is extracted, return it
return ("arXiv", extractID.group(1))
return (False, False)
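The new `findArticleID` logic (normalize "Digital Object Identifier" to "DOI", then prefer an arXiv id only when it occurs before the DOI in the text) can be sketched in isolation. The DOI pattern below is a generic stand-in, not the project's actual `doi_re`; the arXiv pattern is the one from the diff:

```python
import re

# Generic DOI pattern for illustration only -- not the project's doi_re.
doi_re = re.compile(r'\b10\.\d{4,9}/[-._;()/:a-z0-9]+')
# arXiv pattern as defined in the diff above.
arxiv_re = re.compile(r'arXiv:\s*([\w\.\/\-]+)', re.IGNORECASE)

def find_article_id(text, only=("DOI", "arXiv")):
    """Return (type, id) for the first matching article ID, else (False, False)."""
    doi_match = arxiv_match = None
    if "DOI" in only:
        # Normalize the spelled-out form so "Digital Object Identifier: 10.x/y"
        # is found just like "DOI: 10.x/y".
        lowered = text.lower().replace('digital object identifier', 'doi')
        doi_match = doi_re.search(lowered)
    if "arXiv" in only:
        arxiv_match = arxiv_re.search(text)
    # Only use the arXiv id if it appears before the DOI in the text.
    if arxiv_match and (not doi_match
                        or arxiv_match.start(1) < doi_match.start(0)):
        return ("arXiv", arxiv_match.group(1))
    if doi_match:
        return ("DOI", doi_match.group(0))
    return (False, False)
```

The position comparison is what resolves the conflict from issue #19: a paper whose references carry DOI links no longer shadows its own arXiv id.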
def doi2Bib(doi):
@@ -201,58 +279,29 @@ def doi2Bib(doi):
"""
url = "http://dx.doi.org/" + doi
headers = {"accept": "application/x-bibtex"}
req = Request(url, headers=headers)
try:
r = requests.get(url, headers=headers)
r = urlopen(req)
if r.headers['content-type'] == 'application/x-bibtex':
return r.text
else:
return ''
except requests.exceptions.ConnectionError:
try:
if dict(r.info())['content-type'] == 'application/x-bibtex':
return r.read().decode('utf-8')
else:
return ''
except KeyError:
try:
if dict(r.info())['Content-Type'] == 'application/x-bibtex':
return r.read().decode('utf-8')
else:
return ''
except KeyError:
return ''
except:
tools.warning('Unable to contact remote server to get the bibtex ' +
'entry for doi '+doi)
return ''
arXiv_re = re.compile(r'arXiv:\s*([\w\.\/\-]+)', re.IGNORECASE)
def findArXivId(src):
"""Searches for a valid arXiv id in src.
Returns the arXiv id or False if not found or an error occurred.
From : https://github.com/minad/bibsync/blob/3fdf121016f6187a2fffc66a73cd33b45a20e55d/lib/bibsync/utils.rb
"""
if src.endswith(".pdf"):
totext = subprocess.Popen(["pdftotext", src, "-"],
stdout=subprocess.PIPE,
stderr=subprocess.PIPE)
elif src.endswith(".djvu"):
totext = subprocess.Popen(["djvutxt", src],
stdout=subprocess.PIPE,
stderr=subprocess.PIPE)
else:
return False
extractfull = ''
while totext.poll() is None:
extractfull += ' '.join([i.strip() for i in totext.stdout.readlines()])
extractID = arXiv_re.search(extractfull)
if extractID:
totext.terminate()
break
err = totext.communicate()[1]
if totext.returncode > 0:
# Error happened
tools.warning(err)
return False
elif extractID is not None:
return extractID.group(1)
else:
return False
def arXiv2Bib(arxiv):
"""Returns bibTeX string of metadata for a given arXiv id
@@ -263,9 +312,9 @@ def arXiv2Bib(arxiv):
if isinstance(bib, arxiv_metadata.ReferenceErrorInfo):
continue
else:
fetched_bibtex = BibTexParser(bib.bibtex())
fetched_bibtex = fetched_bibtex.get_entry_dict()
fetched_bibtex = fetched_bibtex[fetched_bibtex.keys()[0]]
fetched_bibtex = bibtexparser.loads(bib.bibtex())
fetched_bibtex = fetched_bibtex.entries_dict
fetched_bibtex = fetched_bibtex[list(fetched_bibtex.keys())[0]]
try:
del(fetched_bibtex['file'])
except KeyError:
@@ -295,7 +344,7 @@ def findHALId(src):
return False
while totext.poll() is None:
extractfull = ' '.join([i.strip() for i in totext.stdout.readlines()])
extractfull = ' '.join([i.decode(stdout_encoding).strip() for i in totext.stdout.readlines()])
extractID = HAL_re.search(extractfull)
if extractID:
totext.terminate()


@@ -168,7 +168,7 @@ class SearchQueryParser:
return self._methods[argument.getName()](argument)
def Parse(self, query):
#print self._parser(query)[0]
#print(self._parser(query)[0])
return self.evaluate(self._parser(query)[0])
def GetWord(self, word):
@@ -278,21 +278,21 @@ class ParserTest(SearchQueryParser):
def Test(self):
all_ok = True
for item in self.tests.keys():
print item
print(item)
r = self.Parse(item)
e = self.tests[item]
print 'Result: %s' % r
print 'Expect: %s' % e
print('Result: %s' % r)
print('Expect: %s' % e)
if e == r:
print 'Test OK'
print('Test OK')
else:
all_ok = False
print '>>>>>>>>>>>>>>>>>>>>>>Test ERROR<<<<<<<<<<<<<<<<<<<<<'
print ''
print('>>>>>>>>>>>>>>>>>>>>>>Test ERROR<<<<<<<<<<<<<<<<<<<<<')
print('')
return all_ok
if __name__=='__main__':
if ParserTest().Test():
print 'All tests OK'
print('All tests OK')
else:
print 'One or more tests FAILED'
print('One or more tests FAILED')

libbmc/tearpages.py (new file, 57 lines)

@@ -0,0 +1,57 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author: Francois Boulogne
import shutil
import tempfile
from PyPDF2 import PdfFileWriter, PdfFileReader
from PyPDF2.utils import PdfReadError
def _fixPdf(pdfFile, destination):
"""
Fix malformed pdf files when data are present after '%%EOF'
:param pdfFile: PDF filepath
:param destination: destination
"""
tmp = tempfile.NamedTemporaryFile()
output = open(tmp.name, 'wb')
with open(pdfFile, "rb") as fh:
for line in fh:
output.write(line)
if b'%%EOF' in line:
break
output.close()
shutil.copy(tmp.name, destination)
def tearpage(filename, startpage=1):
"""
Copy filename to a tempfile, write pages startpage..N to filename.
:param filename: PDF filepath
:param startpage: page number for the new first page
"""
# Copy the pdf to a tmp file
tmp = tempfile.NamedTemporaryFile()
shutil.copy(filename, tmp.name)
# Read the copied pdf
try:
input_file = PdfFileReader(open(tmp.name, 'rb'))
except PdfReadError:
_fixPdf(filename, tmp.name)
input_file = PdfFileReader(open(tmp.name, 'rb'))
# Seek for the number of pages
num_pages = input_file.getNumPages()
# Write pages excepted the first one
output_file = PdfFileWriter()
for i in range(startpage, num_pages):
output_file.addPage(input_file.getPage(i))
tmp.close()
outputStream = open(filename, "wb")
output_file.write(outputStream)


@@ -2,11 +2,11 @@
doi = {10.1103/physreva.88.043630},
url = {http://dx.doi.org/10.1103/physreva.88.043630},
year = 2013,
month = {Oct},
publisher = {American Physical Society (APS)},
month = {oct},
publisher = {American Physical Society ({APS})},
volume = {88},
number = {4},
author = {Yan-Hua Hou and Lev P. Pitaevskii and Sandro Stringari},
title = {First and second sound in a highly elongated Fermi gas at unitarity},
journal = {Physical Review A}
journal = {Phys. Rev. A}
}


@@ -1,6 +1,6 @@
@book{9780198507192,
title = {Bose-Einstein Condensation},
author = {Lev Pitaevskii and Sandro Stringari},
author = {Lev. P. Pitaevskii and S. Stringari},
isbn = {9780198507192},
year = {2004},
publisher = {Clarendon Press}

Binary file not shown.


@@ -8,9 +8,10 @@
# <del>beer</del> soda in return.
# Phyks
# -----------------------------------------------------------------------------
from __future__ import unicode_literals
import unittest
from backend import *
from bibtexparser.bparser import BibTexParser
from libbmc.backend import *
import bibtexparser
import os
import shutil
import tempfile
@@ -21,7 +22,7 @@ class TestFetcher(unittest.TestCase):
config.set("folder", tempfile.mkdtemp()+"/")
self.bibtex_article_string = """
@article{1303.3130v1,
abstract={We study the role of the dipolar interaction, correctly accounting for the
abstract={We study the role of the dipolar interaction, correctly accounting for the
Dipolar-Induced Resonance (DIR), in a quasi-one-dimensional system of ultracold
bosons. We first show how the DIR affects the lowest-energy states of two
particles in a harmonic trap. Then, we consider a deep optical lattice loaded
@@ -30,20 +31,20 @@ atom-dimer extended Bose-Hubbard model. We analyze the impact of the DIR on the
phase diagram at T=0 by exact diagonalization of a small-sized system. In
particular, the resonance strongly modifies the range of parameters for which a
mass density wave should occur.},
archiveprefix={arXiv},
author={N. Bartolo and D. J. Papoular and L. Barbiero and C. Menotti and A. Recati},
eprint={1303.3130v1},
file={%sN_Bartolo_A_Recati-j-2013.pdf},
link={http://arxiv.org/abs/1303.3130v1},
month={Mar},
primaryclass={cond-mat.quant-gas},
tag={},
title={Dipolar-Induced Resonance for Ultracold Bosons in a Quasi-1D Optical
archiveprefix={arXiv},
author={N. Bartolo and D. J. Papoular and L. Barbiero and C. Menotti and A. Recati},
eprint={1303.3130v1},
file={%sN_Bartolo_A_Recati-j-2013.pdf},
link={http://arxiv.org/abs/1303.3130v1},
month={Mar},
primaryclass={cond-mat.quant-gas},
tag={},
title={Dipolar-Induced Resonance for Ultracold Bosons in a Quasi-1D Optical
Lattice},
year={2013},
year={2013},
}""" % config.get("folder")
self.bibtex_article = BibTexParser(self.bibtex_article_string).get_entry_dict()
self.bibtex_article = self.bibtex_article[self.bibtex_article.keys()[0]]
self.bibtex_article = bibtexparser.loads(self.bibtex_article_string).entries_dict
self.bibtex_article = self.bibtex_article[list(self.bibtex_article.keys())[0]]
self.bibtex_book_string = """
@book{9780521846516,
@@ -54,8 +55,8 @@ Lattice},
year={2008},
}
"""
self.bibtex_book = BibTexParser(self.bibtex_book_string).get_entry_dict()
self.bibtex_book = self.bibtex_book[self.bibtex_book.keys()[0]]
self.bibtex_book = bibtexparser.loads(self.bibtex_book_string).entries_dict
self.bibtex_book = self.bibtex_book[list(self.bibtex_book.keys())[0]]
def test_getNewName_article(self):
self.assertEqual(getNewName("test.pdf", self.bibtex_article),
@@ -81,7 +82,7 @@ Lattice},
def test_bibtexEdit(self):
bibtexAppend(self.bibtex_article)
bibtexEdit(self.bibtex_article['id'], {'id': 'bidule'})
bibtexEdit(self.bibtex_article['ID'], {'ID': 'bidule'})
with open(config.get("folder")+'index.bib', 'r') as fh:
self.assertEqual(fh.read(),
'@article{bidule,\n\tabstract={We study the role of the dipolar interaction, correctly accounting for the\nDipolar-Induced Resonance (DIR), in a quasi-one-dimensional system of ultracold\nbosons. We first show how the DIR affects the lowest-energy states of two\nparticles in a harmonic trap. Then, we consider a deep optical lattice loaded\nwith ultracold dipolar bosons. We describe this many-body system using an\natom-dimer extended Bose-Hubbard model. We analyze the impact of the DIR on the\nphase diagram at T=0 by exact diagonalization of a small-sized system. In\nparticular, the resonance strongly modifies the range of parameters for which a\nmass density wave should occur.},\n\tarchiveprefix={arXiv},\n\tauthor={N. Bartolo and D. J. Papoular and L. Barbiero and C. Menotti and A. Recati},\n\teprint={1303.3130v1},\n\tfile={'+config.get("folder")+'N_Bartolo_A_Recati-j-2013.pdf},\n\tlink={http://arxiv.org/abs/1303.3130v1},\n\tmonth={Mar},\n\tprimaryclass={cond-mat.quant-gas},\n\ttag={},\n\ttitle={Dipolar-Induced Resonance for Ultracold Bosons in a Quasi-1D Optical\nLattice},\n\tyear={2013},\n}\n\n\n')
@@ -97,9 +98,9 @@ Lattice},
self.bibtex_article['file'] = config.get("folder")+'test.pdf'
bibtexAppend(self.bibtex_article)
open(config.get("folder")+'test.pdf', 'w').close()
deleteId(self.bibtex_article['id'])
deleteId(self.bibtex_article['ID'])
with open(config.get("folder")+'index.bib', 'r') as fh:
self.assertEquals(fh.read().strip(), "")
self.assertEqual(fh.read().strip(), "")
self.assertFalse(os.path.isfile(config.get("folder")+'test.pdf'))
def test_deleteFile(self):
@@ -108,7 +109,7 @@ Lattice},
open(config.get("folder")+'test.pdf', 'w').close()
deleteFile(self.bibtex_article['file'])
with open(config.get("folder")+'index.bib', 'r') as fh:
self.assertEquals(fh.read().strip(), "")
self.assertEqual(fh.read().strip(), "")
self.assertFalse(os.path.isfile(config.get("folder")+'test.pdf'))
def test_diffFilesIndex(self):
@@ -117,12 +118,12 @@ Lattice},
def test_getBibtex(self):
bibtexAppend(self.bibtex_article)
got = getBibtex(self.bibtex_article['id'])
got = getBibtex(self.bibtex_article['ID'])
self.assertEqual(got, self.bibtex_article)
def test_getBibtex_id(self):
bibtexAppend(self.bibtex_article)
got = getBibtex(self.bibtex_article['id'], file_id='id')
got = getBibtex(self.bibtex_article['ID'], file_id='id')
self.assertEqual(got, self.bibtex_article)
def test_getBibtex_file(self):
@@ -133,16 +134,16 @@ Lattice},
self.assertEqual(got, self.bibtex_article)
def test_getBibtex_clean(self):
config.set("ignore_fields", ['id', 'abstract'])
config.set("ignore_fields", ['ID', 'abstract'])
bibtexAppend(self.bibtex_article)
got = getBibtex(self.bibtex_article['id'], clean=True)
got = getBibtex(self.bibtex_article['ID'], clean=True)
for i in config.get("ignore_fields"):
self.assertNotIn(i, got)
def test_getEntries(self):
bibtexAppend(self.bibtex_article)
self.assertEqual(getEntries(),
[self.bibtex_article['id']])
[self.bibtex_article['ID']])
def test_updateArxiv(self):
# TODO


@@ -8,12 +8,13 @@
# <del>beer</del> soda in return.
# Phyks
# -----------------------------------------------------------------------------
from __future__ import unicode_literals
import unittest
from config import Config
import json
import os
import tempfile
import shutil
from libbmc.config import Config
class TestConfig(unittest.TestCase):


@@ -10,16 +10,16 @@
# -----------------------------------------------------------------------------
import unittest
from fetcher import *
from libbmc.fetcher import *
class TestFetcher(unittest.TestCase):
def setUp(self):
with open("tests/src/doi.bib", 'r') as fh:
with open("libbmc/tests/src/doi.bib", 'r') as fh:
self.doi_bib = fh.read()
with open("tests/src/arxiv.bib", 'r') as fh:
with open("libbmc/tests/src/arxiv.bib", 'r') as fh:
self.arxiv_bib = fh.read()
with open("tests/src/isbn.bib", 'r') as fh:
with open("libbmc/tests/src/isbn.bib", 'r') as fh:
self.isbn_bib = fh.read()
def test_download(self):
@@ -35,13 +35,13 @@ class TestFetcher(unittest.TestCase):
def test_findISBN_DJVU(self):
# ISBN is incomplete in this test because my djvu file is bad
self.assertEqual(findISBN("tests/src/test_book.djvu"), '978295391873')
self.assertEqual(findISBN("libbmc/tests/src/test_book.djvu"), '978295391873')
def test_findISBN_PDF(self):
self.assertEqual(findISBN("tests/src/test_book.pdf"), '9782953918731')
self.assertEqual(findISBN("libbmc/tests/src/test_book.pdf"), '9782953918731')
def test_findISBN_False(self):
self.assertFalse(findISBN("tests/src/test.pdf"))
self.assertFalse(findISBN("libbmc/tests/src/test.pdf"))
def test_isbn2Bib(self):
self.assertEqual(isbn2Bib('0198507194'), self.isbn_bib)
@@ -50,16 +50,22 @@ class TestFetcher(unittest.TestCase):
self.assertEqual(isbn2Bib('foo'), '')
def test_findDOI_PDF(self):
self.assertEqual(findDOI("tests/src/test.pdf"),
"10.1103/physrevlett.112.253201")
self.assertEqual(findArticleID("libbmc/tests/src/test.pdf"),
("DOI", "10.1103/physrevlett.112.253201"))
def test_findDOI_DJVU(self):
def test_findOnlyDOI(self):
self.assertEqual(findArticleID("libbmc/tests/src/test.pdf",
only=["DOI"]),
("DOI", "10.1103/physrevlett.112.253201"))
def test_findDOID_DJVU(self):
# DOI is incomplete in this test because my djvu file is bad
self.assertEqual(findDOI("tests/src/test.djvu"),
"10.1103/physrevlett.112")
self.assertEqual(findArticleID("libbmc/tests/src/test.djvu"),
("DOI", "10.1103/physrevlett.112"))
def test_findDOI_False(self):
self.assertFalse(findDOI("tests/src/test_arxiv_multi.pdf"))
self.assertFalse(findArticleID("libbmc/tests/src/test_arxiv_multi.pdf",
only=["DOI"])[0])
def test_doi2Bib(self):
self.assertEqual(doi2Bib('10.1103/physreva.88.043630'), self.doi_bib)
@@ -68,8 +74,18 @@ class TestFetcher(unittest.TestCase):
self.assertEqual(doi2Bib('blabla'), '')
def test_findArXivId(self):
self.assertEqual(findArXivId("tests/src/test_arxiv_multi.pdf"),
'1303.3130v1')
self.assertEqual(findArticleID("libbmc/tests/src/test_arxiv_multi.pdf"),
("arXiv", '1303.3130v1'))
def test_findOnlyArXivId(self):
self.assertEqual(findArticleID("libbmc/tests/src/test_arxiv_multi.pdf",
only=["arXiv"]),
("arXiv", '1303.3130v1'))
def test_findArticleID(self):
# cf https://github.com/Phyks/BMC/issues/19
self.assertEqual(findArticleID("libbmc/tests/src/test_arxiv_doi_conflict.pdf"),
("arXiv", '1107.4487v1'))
def test_arXiv2Bib(self):
self.assertEqual(arXiv2Bib('1303.3130v1'), self.arxiv_bib)
@@ -78,7 +94,7 @@ class TestFetcher(unittest.TestCase):
self.assertEqual(arXiv2Bib('blabla'), '')
def test_findHALId(self):
self.assertTupleEqual(findHALId("tests/src/test_hal.pdf"),
self.assertTupleEqual(findHALId("libbmc/tests/src/test_hal.pdf"),
('hal-00750893', '3'))
if __name__ == '__main__':


@@ -8,9 +8,10 @@
# <del>beer</del> soda in return.
# Phyks
# -----------------------------------------------------------------------------
from __future__ import unicode_literals
import unittest
from tools import *
from libbmc.tools import *
class TestTools(unittest.TestCase):
@@ -18,7 +19,7 @@ class TestTools(unittest.TestCase):
self.assertEqual(slugify(u"à&é_truc.pdf"), "ae_trucpdf")
def test_parsed2Bibtex(self):
parsed = {'type': 'article', 'id': 'test', 'field1': 'test1',
parsed = {'ENTRYTYPE': 'article', 'ID': 'test', 'field1': 'test1',
'field2': 'test2'}
expected = ('@article{test,\n\tfield1={test1},\n' +
'\tfield2={test2},\n}\n\n')


@@ -10,11 +10,17 @@
# -----------------------------------------------------------------------------
from __future__ import print_function
from __future__ import print_function, unicode_literals
import os
import re
import sys
from termios import tcflush, TCIOFLUSH
if os.name == "posix":
from termios import tcflush, TCIOFLUSH
try:
input = raw_input
except NameError:
pass
_slugify_strip_re = re.compile(r'[^\w\s-]')
_slugify_hyphenate_re = re.compile(r'[\s]+')
@@ -27,18 +33,22 @@ def slugify(value):
From Django's "django/template/defaultfilters.py".
"""
import unicodedata
if not isinstance(value, unicode):
value = unicode(value)
value = unicodedata.normalize('NFKD', value).encode('ascii', 'ignore')
value = unicode(_slugify_strip_re.sub('', value).strip())
try:
unicode_type = unicode
except NameError:
unicode_type = str
if not isinstance(value, unicode_type):
value = unicode_type(value)
value = unicodedata.normalize('NFKD', value).encode('ascii', 'ignore').decode('ascii')
value = unicode_type(_slugify_strip_re.sub('', value).strip())
return _slugify_hyphenate_re.sub('_', value)
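On Python 3 (where `unicode_type` resolves to `str`), the rewritten `slugify` reduces to the following self-contained sketch, using the same regexes and NFKD folding as above:

```python
import re
import unicodedata

_slugify_strip_re = re.compile(r'[^\w\s-]')
_slugify_hyphenate_re = re.compile(r'[\s]+')

def slugify(value):
    """ASCII-fold accents, drop punctuation, replace whitespace runs with '_'."""
    value = str(value)
    # NFKD splits accented characters into base + combining mark,
    # so the ascii/ignore round-trip keeps the base letter.
    value = unicodedata.normalize('NFKD', value).encode('ascii',
                                                        'ignore').decode('ascii')
    value = _slugify_strip_re.sub('', value).strip()
    return _slugify_hyphenate_re.sub('_', value)
```

The expected value in the first assertion below is the one from `test_tools.py` above.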
def parsed2Bibtex(parsed):
"""Convert a single bibtex entry dict to bibtex string"""
bibtex = '@'+parsed['type']+'{'+parsed['id']+",\n"
bibtex = '@'+parsed['ENTRYTYPE']+'{'+parsed['ID']+",\n"
for field in [i for i in sorted(parsed) if i not in ['type', 'id']]:
for field in [i for i in sorted(parsed) if i not in ['ENTRYTYPE', 'ID']]:
bibtex += "\t"+field+"={"+parsed[field]+"},\n"
bibtex += "}\n\n"
return bibtex
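After the `ENTRYTYPE`/`ID` rename, `parsed2Bibtex` can be checked against the expected string from the test suite above:

```python
def parsed2Bibtex(parsed):
    """Convert a single parsed bibtex entry dict to a bibtex string."""
    bibtex = '@' + parsed['ENTRYTYPE'] + '{' + parsed['ID'] + ",\n"
    # Emit every field except the type/key metadata, in sorted order.
    for field in [i for i in sorted(parsed) if i not in ['ENTRYTYPE', 'ID']]:
        bibtex += "\t" + field + "={" + parsed[field] + "},\n"
    bibtex += "}\n\n"
    return bibtex
```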
@@ -51,21 +61,21 @@ def getExtension(filename):
def replaceAll(text, dic):
"""Replace all the dic keys by the associated item in text"""
for i, j in dic.iteritems():
for i, j in dic.items():
text = text.replace(i, j)
return text
def rawInput(string):
"""Flush stdin and then prompt the user for something"""
tcflush(sys.stdin, TCIOFLUSH)
return raw_input(string).decode('utf-8')
if os.name == "posix":
tcflush(sys.stdin, TCIOFLUSH)
return input(string)
def warning(*objs):
"""Write warnings to stderr"""
printed = [i.encode('utf-8') for i in objs]
print("WARNING: ", *printed, file=sys.stderr)
print("WARNING: ", *objs, file=sys.stderr)
def listDir(path):

setup.py (new file, 15 lines)

@@ -0,0 +1,15 @@
#!/usr/bin/env python
from distutils.core import setup
setup(
name = 'BMC',
version = "0.3dev",
url = "https://github.com/Phyks/BMC",
author = "",
license = "no-alcohol beer-ware license",
author_email = "",
description = "simple script to download and store your articles",
packages = ['libbmc'],
scripts = ['bmc.py'],
)