Add some doc, especially about external dependencies
parent 65967cfa96
commit 9019833dbb
.gitmodules (vendored, 3 lines changed)
@@ -1,6 +1,3 @@
 [submodule "libbmc/external/opendetex"]
 	path = libbmc/external/opendetex
 	url = https://github.com/Phyks/opendetex
-[submodule "libbmc/external/poppler"]
-	path = libbmc/external/poppler
-	url = git://git.freedesktop.org/git/poppler/poppler
README.md (normal file, 43 lines added)
@@ -0,0 +1,43 @@
+libBMC
+======
+
+A generic Python library to manage bibliographies and play with scientific
+papers.
+
+
+_Note_: This library is written for Python 3 and may not work with Python 2.
+Python 2 support is not a major priority for me, but if anyone needs it and
+wants to make a PR, I will happily merge it :)
+
+
+## Dependencies
+
+Python dependencies are listed in the `requirements.txt` file at the root of
+this repo, and can be installed with `pip install -r requirements.txt`.
+
+
+External dependencies are [OpenDeTeX](https://code.google.com/p/opendetex/)
+(an improved version of DeTeX) and the `pdftotext` and `djvutxt` programs.
+
+
+OpenDeTeX is available as a Git submodule in the `libbmc/external` folder. If
+you do not have it installed system-wide, you can use the following steps to
+build it in this repo and the library will use it:
+
+* `git submodule init; git submodule update` to initialize the Git submodules.
+* `cd libbmc/external/opendetex; make` to build OpenDeTeX (see the `INSTALL`
+  file in the same folder for more information; you will need `make`, `gcc`
+  and `flex` to build it).
+
+OpenDeTeX is used to get references from a `.bbl` file (or directly from
+arXiv, as it uses the same pipeline).
+
+
+`pdftotext` and `djvutxt` should be available in the packages of your
+distribution and should be installed system-wide. Both are used to extract
+identifiers from the PDF files of papers.
+
+
+If you plan on using the `libbmc.citations.pdf` functions, you should also
+install the matching software (`CERMINE`, `Grobid` or `pdf-extract`). See the
+docstrings of those functions for more information on this particular point.
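The external tools named in the README hunk above could be checked before calling the library. The following is only a minimal sketch and is not part of this commit: the helper name `find_missing_tools` is hypothetical, and the binary name `delatex` for OpenDeTeX is an assumption taken from the docstring note added in the same commit. It only looks at the `PATH`, so it will not see a copy of OpenDeTeX built inside `libbmc/external/opendetex`.

```python
import shutil

# Tool names come from the README and docstring text in this commit.
# ``delatex`` as the OpenDeTeX binary name is an assumption.
EXTERNAL_TOOLS = ("delatex", "pdftotext", "djvutxt")


def find_missing_tools(tools=EXTERNAL_TOOLS):
    """Return the names in ``tools`` that cannot be found on the PATH.

    Hypothetical helper, not part of libbmc.
    """
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing external tools: " + ", ".join(missing))
    else:
        print("All external tools were found on the PATH.")
```

An empty list means every listed program is installed system-wide. A non-empty list only says that those programs are not on the `PATH`; OpenDeTeX may still be usable through the in-repo build described in the README.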
@@ -25,6 +25,12 @@ def bibitem_as_plaintext(bibitem):
     This plaintext representation can be super ugly, contain URLs and so \
     on.
 
+    .. note::
+
+        You need to have ``delatex`` installed system-wide, or to build it in \
+        this repo, according to the ``README.md`` before using this \
+        function.
+
     :param bibitem: The text content of the bibitem.
     :returns: A cleaned plaintext citation from the bibitem.
     """
@@ -68,7 +68,7 @@ def get_cited_DOIs(bibtex):
             BibTeX file.
     :returns: A dict of cleaned plaintext citations and their associated DOI.
     """
-    # Get the plaintext citations from the bbl file
+    # Get the plaintext citations from the bibtex file
     plaintext_citations = get_plaintext_citations(bibtex)
     # Use the plaintext citations parser on these citations
     return plaintext.get_cited_DOIs(plaintext_citations)
@@ -14,12 +14,18 @@ def find_identifiers(src):
     """
     Search for a valid identifier (DOI, ISBN, arXiv, HAL) in a given file.
 
-    .. note ::
+    .. note::
+
         This function returns the first matching identifier, that is the most
         likely to be relevant for this file. However, it may fail and return an
         identifier taken from the references or another paper.
 
+    .. note::
+
+        You will need to have ``pdftotext`` and/or ``djvutxt`` installed \
+        system-wide before processing files with this function.
+
 
     :params src: Path to the file to scan.
 
     :returns: a tuple (type, identifier) or ``None`` if not found or \