Improve doc + s/weboob/WebOOB/

Lucas Verney 2018-07-25 08:57:31 +02:00
parent 56c5aa20d4
commit cc9ed3d34b
5 changed files with 38 additions and 36 deletions

@ -19,7 +19,7 @@ and it is working fine :)
<img src="doc/img/home.png" width="45%"/> <img src="doc/img/home2.png" width="45%"/>
It uses [Weboob](http://weboob.org/) to get all the housing posts on most of
It uses [WebOOB](http://weboob.org/) to get all the housing posts on most of
the websites offering housing posts, and then offers a bunch of pipelines to
filter and deduplicate the fetched housings.
@ -116,7 +116,9 @@ Feel free to open issues. An IRC channel is available at [irc://irc.freenode.net
## Thanks
* [Weboob](http://weboob.org/)
* [WebOOB](http://weboob.org/). Note that this is actually the only and best
software out there to scrape housing posts online. Using it in Flatisfy does
not mean the authors of Flatisfy endorse WebOOB authors' views.
* The OpenData providers listed above!
* Navitia for their really cool public transportation API.
* A lot of Python modules, required for this script (see `requirements.txt`).

@ -2,30 +2,30 @@ Getting started
===============
## Dependency on Weboob
## Dependency on WebOOB
**Important**: Flatisfy relies on [Weboob](http://weboob.org/) to fetch
**Important**: Flatisfy relies on [WebOOB](http://weboob.org/) to fetch
housing posts from housing websites. You should therefore install the [`devel`
branch](https://git.weboob.org/weboob/devel/) and update it regularly,
especially if Flatisfy suddenly stops fetching housing posts.
If you `pip install -r requirements.txt` it will install the latest
development version of [Weboob](https://git.weboob.org/weboob/devel/) and the
[Weboob modules](https://git.weboob.org/weboob/modules/), which should be the
development version of [WebOOB](https://git.weboob.org/weboob/devel/) and the
[WebOOB modules](https://git.weboob.org/weboob/modules/), which should be the
best version available out there. You should update these packages regularly,
as they evolve quickly.
Weboob is made of two parts: a core and modules (which is the actual code
WebOOB is made of two parts: a core and modules (which is the actual code
fetching data from websites). Modules tend to break often and are therefore
updated often, so you should keep them up to date. This can be done by installing the
`weboob-modules` package listed in the `requirements.txt` and using the
default configuration.
This is a safe default configuration. However, a better option is usually to
clone [Weboob git repo](https://git.weboob.org/weboob/devel/) somewhere, on
clone [WebOOB git repo](https://git.weboob.org/weboob/devel/) somewhere, on
your disk, to point the `modules_path` configuration option to
`path_to_weboob_git/modules` (see the configuration section below) and to run
a `git pull; python setup.py install` in the Weboob git repo often.
a `git pull; python setup.py install` in the WebOOB git repo often.
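As a rough illustration of the setup described above, the following sketch checks a `modules_path` value before it is used: it accepts `None` (the `null` default) or an absolute path, matching what the configuration documentation asks for. The helper name `check_modules_path` and the example directory are assumptions made for this sketch; they are not part of Flatisfy.

```python
import os


def check_modules_path(modules_path):
    """
    Hypothetical helper (not part of Flatisfy): accept ``None`` or an
    absolute path, and reject anything else.
    """
    if modules_path is None:
        # ``null`` in the config: use the locally installed modules.
        return None
    if not isinstance(modules_path, str) or not os.path.isabs(modules_path):
        raise ValueError(
            "modules_path must be null or an absolute path, got %r"
            % (modules_path,)
        )
    return modules_path


# Assumed location of the cloned git repo, made absolute for the example.
config = {"modules_path": os.path.abspath(os.path.join("weboob", "modules"))}
check_modules_path(config["modules_path"])
```

A relative value such as `weboob/modules` would raise a `ValueError` here, which surfaces a misconfiguration early instead of at fetch time.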
## TL;DR
@ -113,11 +113,11 @@ List of configuration options:
`data_directory`.
* `navitia_api_key` is an API token for [Navitia](https://www.navitia.io/)
which is required to compute travel times.
* `modules_path` is the path to the Weboob modules. It can be `null` if you
want Weboob to use the locally installed [Weboob
* `modules_path` is the path to the WebOOB modules. It can be `null` if you
want WebOOB to use the locally installed [WebOOB
modules](https://git.weboob.org/weboob/modules), which you should install
yourself. This is the default value. If it is a string, it should be an
absolute path to the folder containing Weboob modules.
absolute path to the folder containing WebOOB modules.
* `port` is the port on which the development webserver should be
listening (defaults to `8080`).
* `host` is the host on which the development webserver should be listening
@ -125,8 +125,8 @@ List of configuration options:
* `webserver` is a server to use instead of the default Bottle built-in
webserver, see [Bottle deployment
doc](http://bottlepy.org/docs/dev/deployment.html).
* `backends` is a list of Weboob backends to enable. It defaults to any
available and supported Weboob backend.
* `backends` is a list of WebOOB backends to enable. It defaults to any
available and supported WebOOB backend.
* `store_personal_data` is a boolean indicating whether or not Flatisfy should
fetch personal data from housing posts and store them in the database. Such
personal data include contact phone number for instance. By default,

@ -20,7 +20,7 @@ virtualenv .env && source .env/bin/activate
# Install required Python modules
pip install -r requirements.txt
# Clone and install weboob
# Clone and install WebOOB
git clone https://git.weboob.org/weboob/devel weboob && cd weboob && python setup.py install && cd ..
# Install required JS libraries and build the webapp

@ -131,7 +131,7 @@ def import_and_filter(config, load_from_db=False):
:param config: A config dict.
:param load_from_db: Whether to load flats from database or fetch them
using Weboob.
using WebOOB.
:return: ``None``.
"""
# Fetch and filter flats list

@ -29,17 +29,17 @@ except ImportError:
raise
class WeboobProxy(object):
class WebOOBProxy(object):
"""
Wrapper around Weboob ``WebNip`` class, to fetch housing posts without
Wrapper around WebOOB ``WebNip`` class, to fetch housing posts without
having to spawn a subprocess.
"""
@staticmethod
def version():
"""
Get Weboob version.
Get WebOOB version.
:return: The installed Weboob version.
:return: The installed WebOOB version.
"""
return WebNip.VERSION
@ -63,7 +63,7 @@ class WeboobProxy(object):
def __init__(self, config):
"""
Create a Weboob handle and try to load the modules.
Create a WebOOB handle and try to load the modules.
:param config: A config dict.
"""
@ -94,13 +94,13 @@ class WeboobProxy(object):
def build_queries(self, constraints_dict):
"""
Build Weboob ``weboob.capabilities.housing.Query`` objects from the
Build WebOOB ``weboob.capabilities.housing.Query`` objects from the
constraints defined in the configuration. Each query has at most 3
cities, to comply with housing websites limitations.
:param constraints_dict: A dictionary of constraints, as defined in the
config.
:return: A list of Weboob ``weboob.capabilities.housing.Query``
:return: A list of WebOOB ``weboob.capabilities.housing.Query``
objects. Returns ``None`` if an error occurred.
"""
queries = []
@ -176,9 +176,9 @@ class WeboobProxy(object):
def query(self, query, max_entries=None, store_personal_data=False):
"""
Fetch the housing posts matching a given Weboob query.
Fetch the housing posts matching a given WebOOB query.
:param query: A Weboob ``weboob.capabilities.housing.Query`` object.
:param query: A WebOOB ``weboob.capabilities.housing.Query`` object.
:param max_entries: Maximum number of entries to fetch.
:param store_personal_data: Whether personal data should be fetched
from housing posts (phone number etc).
@ -206,7 +206,7 @@ class WeboobProxy(object):
"""
Get information (details) about a housing post.
:param full_flat_id: A Weboob housing post id, in complete form
:param full_flat_id: A WebOOB housing post id, in complete form
(ID@BACKEND)
:param store_personal_data: Whether personal data should be fetched
from housing posts (phone number etc).
@ -247,7 +247,7 @@ class WeboobProxy(object):
def fetch_flats(config):
"""
Fetch the available flats using the Flatboob / Weboob config.
Fetch the available flats using the Flatboob / WebOOB config.
:param config: A config dict.
:return: A dict mapping constraint in config to all available matching
@ -257,18 +257,18 @@ def fetch_flats(config):
for constraint_name, constraint in config["constraints"].items():
LOGGER.info("Loading flats for constraint %s...", constraint_name)
with WeboobProxy(config) as weboob_proxy:
queries = weboob_proxy.build_queries(constraint)
with WebOOBProxy(config) as webOOB_proxy:
queries = webOOB_proxy.build_queries(constraint)
housing_posts = []
for query in queries:
housing_posts.extend(
weboob_proxy.query(query, config["max_entries"],
webOOB_proxy.query(query, config["max_entries"],
config["store_personal_data"])
)
LOGGER.info("Fetched %d flats.", len(housing_posts))
constraint_flats_list = [json.loads(flat) for flat in housing_posts]
constraint_flats_list = [WeboobProxy.restore_decimal_fields(flat)
constraint_flats_list = [WebOOBProxy.restore_decimal_fields(flat)
for flat in constraint_flats_list]
fetched_flats[constraint_name] = constraint_flats_list
return fetched_flats
@ -276,19 +276,19 @@ def fetch_flats(config):
def fetch_details(config, flat_id):
"""
Fetch the additional details for a flat using Flatboob / Weboob.
Fetch the additional details for a flat using Flatboob / WebOOB.
:param config: A config dict.
:param flat_id: ID of the flat to fetch details for.
:return: A flat dict with all the available data.
"""
with WeboobProxy(config) as weboob_proxy:
with WebOOBProxy(config) as webOOB_proxy:
LOGGER.info("Loading additional details for flat %s.", flat_id)
weboob_output = weboob_proxy.info(flat_id,
webOOB_output = webOOB_proxy.info(flat_id,
config["store_personal_data"])
flat_details = json.loads(weboob_output)
flat_details = WeboobProxy.restore_decimal_fields(flat_details)
flat_details = json.loads(webOOB_output)
flat_details = WebOOBProxy.restore_decimal_fields(flat_details)
LOGGER.info("Fetched details for flat %s.", flat_id)
return flat_details
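
The `build_queries` docstring above states that each query has at most 3 cities. The body of that method is not shown in this diff, so the following is only a minimal sketch of how such a split could be done. The helper name `chunk_cities` and its `max_per_query` parameter are hypothetical and are not taken from the Flatisfy source.

```python
def chunk_cities(cities, max_per_query=3):
    """
    Hypothetical helper: split a list of cities into consecutive groups
    of at most ``max_per_query`` items, one group per query.
    """
    if max_per_query < 1:
        raise ValueError("max_per_query must be at least 1")
    return [
        cities[start:start + max_per_query]
        for start in range(0, len(cities), max_per_query)
    ]


# Each inner list would feed one query object.
groups = chunk_cities(["Paris", "Lyon", "Lille", "Nantes"])
```

With four cities, this yields one group of three and one group of one, so two queries would be built.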