Lucas Verney c785e04589 Add a __valid_identifiers__ list to ease fetching of identifiers in
papers

See the detailed explanations in README.md.

Also fixed some typos in docstrings.
2016-02-01 17:32:24 +01:00
docs Fix sphinx doc generation + error in doi module 2016-01-10 18:35:24 +01:00
libbmc Add a __valid_identifiers__ list to ease fetching of identifiers in 2016-02-01 17:32:24 +01:00
.gitignore Add a setup.py file and __init__.py for module and submodules 2016-01-25 17:56:34 +01:00
.gitmodules Add some doc, especially about external dependencies 2016-01-19 18:17:12 +01:00
LICENSE Add a setup.py file and __init__.py for module and submodules 2016-01-25 17:56:34 +01:00
README.md Add a __valid_identifiers__ list to ease fetching of identifiers in 2016-02-01 17:32:24 +01:00
requirements.txt Tearpages ok 2016-01-30 16:28:53 +01:00
setup.py Add a setup.py file and __init__.py for module and submodules 2016-01-25 17:56:34 +01:00

README.md

libBMC

Presentation

A generic Python library to manage bibliography and play with scientific papers.

Note: This library is written for Python 3 and may not work with Python 2. Python 2 support is not a major priority for me, but if anyone needs it and wants to open a PR, I will happily merge it :)

Dependencies

Python dependencies are listed in the requirements.txt file at the root of this repo, and can be installed with pip install -r requirements.txt.

External dependencies are OpenDeTeX (an improved version of DeTeX) and the pdftotext and djvutxt programs.

OpenDeTeX is available as a Git submodule in the libbmc/external folder. If you do not have it installed system-wide, you can use the following steps to build it in this repo and the library will use it:

  • git submodule init; git submodule update to initialize the Git submodules.
  • cd libbmc/external/opendetex; make to build OpenDeTeX. You will need make, gcc and flex to build it; see the INSTALL file in the same folder for more information.

OpenDeTeX is used to get references from a .bbl file (or directly from arXiv, since fetching from arXiv goes through the same pipeline).

pdftotext and djvutxt should be available in the packages of your distribution and should be installed system-wide. Both are used to extract identifiers from the PDF files of papers.
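As a quick sanity check before using the library, the following minimal sketch reports which external programs cannot be found on the PATH. It is not part of libbmc: the helper name `missing_tools` is hypothetical, and the assumption that OpenDeTeX provides an executable called `detex` should be checked against your own build.

```python
import shutil

# Executable names are assumptions made for this sketch.
# In particular, OpenDeTeX is assumed to provide a `detex` binary.
EXTERNAL_TOOLS = ("detex", "pdftotext", "djvutxt")


def missing_tools(tools=EXTERNAL_TOOLS):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing external programs: " + ", ".join(missing))
    else:
        print("All external programs were found.")
```

A missing `detex` on the PATH is not necessarily a problem if you built OpenDeTeX in libbmc/external as described above, since this check only looks at system-wide installations.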

If you plan on using the libbmc.citations.pdf functions, you should also install the matching software (CERMINE, Grobid or pdf-extract). See the docstrings of those functions for more information on this point.

Note on __valid_identifiers__

libbmc exposes a __valid_identifiers__ list, containing the valid identifier types. These are the types whose modules expose the same functions as the doi and isbn modules, in particular the functions to extract identifiers from a string and to fetch BibTeX.

If you write additional modules for other repositories, you can include them in the __valid_identifiers__ list, as long as they provide these functions.

This list is especially useful for the libbmc.papers.identifiers module, which uses it to loop through all the available identifier types, search for each of them in the paper and retrieve the matching BibTeX.
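The loop described above can be illustrated with a self-contained sketch. It does not use libbmc itself: `toy_doi`, `MODULES`, `VALID_IDENTIFIERS`, `find_first_identifier` and `bibtex_for` are hypothetical names, the function names `extract_from_text` and `get_bibtex` are assumptions about what an identifier module provides, and the BibTeX entry is fabricated locally instead of being fetched.

```python
import re
from types import SimpleNamespace

# Stand-in for an identifier module. The two function names are assumptions
# for this sketch, not a statement about the real doi or isbn modules.
toy_doi = SimpleNamespace(
    # Very rough DOI pattern, good enough for a demonstration only.
    extract_from_text=lambda text: re.findall(r"10\.\d{4,9}/\S+", text),
    # A real module would query a remote service; this one builds a dummy entry.
    get_bibtex=lambda identifier: "@misc{demo, doi = {%s}}" % identifier,
)

# Hypothetical equivalents of the list of valid types and their modules.
MODULES = {"doi": toy_doi}
VALID_IDENTIFIERS = ["doi"]


def find_first_identifier(text):
    """Try each identifier type in order and return the first match found."""
    for id_type in VALID_IDENTIFIERS:
        found = MODULES[id_type].extract_from_text(text)
        if found:
            return id_type, found[0]
    return None, None


def bibtex_for(text):
    """Return a BibTeX entry for the first identifier in `text`, or None."""
    id_type, identifier = find_first_identifier(text)
    if id_type is None:
        return None
    return MODULES[id_type].get_bibtex(identifier)
```

With this layout, supporting a new identifier type only requires registering one more module that provides the same two functions; the loop itself does not change.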

License

This code is licensed under an MIT license.

Acknowledgements

Thanks a lot to the following authors and programs for helping in building this lib: