Add some doc, especially about external dependencies
parent
65967cfa96
commit
9019833dbb
3
.gitmodules
vendored
@@ -1,6 +1,3 @@
 [submodule "libbmc/external/opendetex"]
 	path = libbmc/external/opendetex
 	url = https://github.com/Phyks/opendetex
-[submodule "libbmc/external/poppler"]
-	path = libbmc/external/poppler
-	url = git://git.freedesktop.org/git/poppler/poppler
43
README.md
Normal file
@@ -0,0 +1,43 @@
+libBMC
+======
+
+A generic Python library to manage bibliography and play with scientific
+papers.
+
+
+_Note_: This library is written for Python 3 and may not work with Python 2.
+Python 2 support is not a major priority for me, but if anyone needs it and
+wants to make a PR, I will happily merge it :)
+
+
+## Dependencies
+
+Python dependencies are listed in the `requirements.txt` file at the root of
+this repo, and can be installed with `pip install -r requirements.txt`.
+
+
+External dependencies are [OpenDeTeX](https://code.google.com/p/opendetex/)
+(an improved version of DeTeX) and the `pdftotext` and `djvutxt` programs.
+
+
+OpenDeTeX is available as a Git submodule in the `libbmc/external` folder. If
+you do not have it installed system-wide, you can use the following steps to
+build it in this repo so that the library can use it:
+
+* `git submodule init; git submodule update` to initialize the Git submodules.
+* `cd libbmc/external/opendetex; make` to build OpenDeTeX (see the `INSTALL` file
+  in the same folder for more info; you will need `make`, `gcc` and `flex` to
+  build it).
+
+OpenDeTeX is used to extract references from a `.bbl` file (or directly from
+arXiv, which goes through the same pipeline).
+
+
+`pdftotext` and `djvutxt` should be available in the packages of your
+distribution and should be installed system-wide. Both are used to extract
+identifiers from the PDF or DjVu files of papers.
+
+
+If you plan on using the `libbmc.citations.pdf` functions, you should also
+install the matching software (`CERMINE`, `Grobid` or `pdf-extract`). See the
+docstrings of those functions for more information on this particular point.
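The external dependencies named in the README can be checked before calling into the library. The following is only a minimal sketch: the `missing_tools` helper is hypothetical and not part of libBMC, the `delatex` binary name is taken from the docstring note added in this commit, and the check only looks at the `PATH`, so it will not detect a copy of OpenDeTeX built inside `libbmc/external/opendetex`.

```python
import shutil

# Tool names as mentioned in the README and docstrings of this commit.
# Whether libBMC looks them up under exactly these names is an assumption.
EXTERNAL_TOOLS = ("delatex", "pdftotext", "djvutxt")


def missing_tools(tools=EXTERNAL_TOOLS):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing external tools: " + ", ".join(missing))
    else:
        print("All external tools were found on the PATH.")
```

Running the script prints which of the three programs still need to be installed system-wide (or, for OpenDeTeX, built from the submodule).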
@@ -25,6 +25,12 @@ def bibitem_as_plaintext(bibitem):
     This plaintext representation can be super ugly, contain URLs and so \
     on.
 
+    .. note::
+
+        You need to have ``delatex`` installed system-wide, or to build it in \
+        this repo, according to the ``README.md`` before using this \
+        function.
+
     :param bibitem: The text content of the bibitem.
     :returns: A cleaned plaintext citation from the bibitem.
     """
@@ -68,7 +68,7 @@ def get_cited_DOIs(bibtex):
     BibTeX file.
     :returns: A dict of cleaned plaintext citations and their associated DOI.
     """
-    # Get the plaintext citations from the bbl file
+    # Get the plaintext citations from the bibtex file
     plaintext_citations = get_plaintext_citations(bibtex)
     # Use the plaintext citations parser on these citations
     return plaintext.get_cited_DOIs(plaintext_citations)
@@ -14,12 +14,18 @@ def find_identifiers(src):
     """
     Search for a valid identifier (DOI, ISBN, arXiv, HAL) in a given file.
 
-    .. note ::
+    .. note::
 
         This function returns the first matching identifier, that is the most
         likely to be relevant for this file. However, it may fail and return an
         identifier taken from the references or another paper.
 
+    .. note::
+
+        You will need to have ``pdftotext`` and/or ``djvutxt`` installed \
+        system-wide before processing files with this function.
+
+
     :param src: Path to the file to scan.
 
     :returns: a tuple (type, identifier) or ``None`` if not found or \
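The `find_identifiers` docstring describes a "first match wins" behaviour that returns a `(type, identifier)` tuple or `None`. As a rough illustration of that contract only, here is a sketch restricted to DOIs and working on text that has already been extracted from the file. The `first_doi` helper and the simplified regular expression are assumptions made for this example, not libBMC's actual implementation.

```python
import re

# Simplified DOI pattern, assumed for illustration only. Real DOIs may use
# characters that this pattern does not cover.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[-._;()/:A-Za-z0-9]+")


def first_doi(text):
    """Return ("doi", identifier) for the first DOI-like match, or None."""
    match = DOI_PATTERN.search(text)
    if match is None:
        return None
    return ("doi", match.group(0))


# The first DOI in the text wins, even if later ones come from the references.
print(first_doi("Main paper doi:10.1000/xyz123, cites 10.2000/abc456"))
```

Because only the first match is returned, a DOI that appears early in the text but belongs to a cited paper would be reported instead of the paper's own identifier. This is the failure mode the docstring warns about.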