Improve doc + s/weboob/WebOOB/
parent 56c5aa20d4
commit cc9ed3d34b
@@ -19,7 +19,7 @@ and it is working fine :)
 
 <img src="doc/img/home.png" width="45%"/> <img src="doc/img/home2.png" width="45%"/>
 
-It uses [Weboob](http://weboob.org/) to get all the housing posts on most of
+It uses [WebOOB](http://weboob.org/) to get all the housing posts on most of
 the websites offering housings posts, and then offers a bunch of pipelines to
 filter and deduplicate the fetched housings.
 
@@ -116,7 +116,9 @@ Feel free to open issues. An IRC channel is available at [irc://irc.freenode.net
 
 ## Thanks
 
-* [Weboob](http://weboob.org/)
+* [WebOOB](http://weboob.org/). Note that this is actually the only and best
+  software out there to scrape housing posts online. Using it in Flatisfy does
+  not mean the authors of Flatisfy endorse WebOOB authors' views.
 * The OpenData providers listed above!
 * Navitia for their really cool public transportation API.
 * A lots of Python modules, required for this script (see `requirements.txt`).
@@ -2,30 +2,30 @@ Getting started
 ===============
 
 
-## Dependency on Weboob
+## Dependency on WebOOB
 
-**Important**: Flatisfy relies on [Weboob](http://weboob.org/) to fetch
+**Important**: Flatisfy relies on [WebOOB](http://weboob.org/) to fetch
 housing posts from housing websites. Then, you should install the [`devel`
 branch](https://git.weboob.org/weboob/devel/) and update it regularly,
 especially if Flatisfy suddenly stops fetching housing posts.
 
 If you `pip install -r requirements.txt` it will install the latest
-development version of [Weboob](https://git.weboob.org/weboob/devel/) and the
-[Weboob modules](https://git.weboob.org/weboob/modules/), which should be the
+development version of [WebOOB](https://git.weboob.org/weboob/devel/) and the
+[WebOOB modules](https://git.weboob.org/weboob/modules/), which should be the
 best version available out there. You should update these packages regularly,
 as they evolve quickly.
 
-Weboob is made of two parts: a core and modules (which is the actual code
+WebOOB is made of two parts: a core and modules (which is the actual code
 fetching data from websites). Modules tend to break often and are then updated
 often, you should keep them up to date. This can be done by installing the
 `weboob-modules` package listed in the `requirements.txt` and using the
 default configuration.
 
 This is a safe default configuration. However, a better option is usually to
-clone [Weboob git repo](https://git.weboob.org/weboob/devel/) somewhere, on
+clone [WebOOB git repo](https://git.weboob.org/weboob/devel/) somewhere, on
 your disk, to point `modules_path` configuration option to
 `path_to_weboob_git/modules` (see the configuration section below) and to run
-a `git pull; python setup.py install` in the Weboob git repo often.
+a `git pull; python setup.py install` in the WebOOB git repo often.
 
 
 ## TL;DR
@@ -113,11 +113,11 @@ List of configuration options:
     `data_directory`.
 * `navitia_api_key` is an API token for [Navitia](https://www.navitia.io/)
     which is required to compute travel times.
-* `modules_path` is the path to the Weboob modules. It can be `null` if you
-    want Weboob to use the locally installed [Weboob
+* `modules_path` is the path to the WebOOB modules. It can be `null` if you
+    want WebOOB to use the locally installed [WebOOB
     modules](https://git.weboob.org/weboob/modules), which you should install
     yourself. This is the default value. If it is a string, it should be an
-    absolute path to the folder containing Weboob modules.
+    absolute path to the folder containing WebOOB modules.
 * `port` is the port on which the development webserver should be
     listening (default to `8080`).
 * `host` is the host on which the development webserver should be listening
@@ -125,8 +125,8 @@ List of configuration options:
 * `webserver` is a server to use instead of the default Bottle built-in
     webserver, see [Bottle deployment
     doc](http://bottlepy.org/docs/dev/deployment.html).
-* `backends` is a list of Weboob backends to enable. It defaults to any
-    available and supported Weboob backend.
+* `backends` is a list of WebOOB backends to enable. It defaults to any
+    available and supported WebOOB backend.
 * `store_personal_data` is a boolean indicated whether or not Flatisfy should
     fetch personal data from housing posts and store them in database. Such
     personal data include contact phone number for instance. By default,
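As a reading aid for the `modules_path` option documented in the hunks above, here is a minimal sketch of the rule it describes: the value is either `null` (use the locally installed modules) or an absolute path. The helper name `check_modules_path` is hypothetical and not part of Flatisfy; the sketch also assumes the configuration has been loaded into a Python dict, so `null` appears as `None`.

```python
import os


def check_modules_path(config):
    """Return the modules path to use, or None for the installed modules.

    Hypothetical helper for illustration only; not Flatisfy's own code.
    """
    modules_path = config.get("modules_path")
    if modules_path is None:
        # ``null`` in the config: rely on the locally installed modules.
        return None
    if not isinstance(modules_path, str) or not os.path.isabs(modules_path):
        raise ValueError("modules_path must be null or an absolute path")
    return modules_path
```

Rejecting relative paths early gives a clearer error than a module-loading failure later on.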
@@ -20,7 +20,7 @@ virtualenv .env && source .env/bin/activate
 # Install required Python modules
 pip install -r requirements.txt
 
-# Clone and install weboob
+# Clone and install webOOB
 git clone https://git.weboob.org/weboob/devel weboob && cd weboob && python setup.py install && cd ..
 
 # Install required JS libraries and build the webapp
@@ -131,7 +131,7 @@ def import_and_filter(config, load_from_db=False):
 
     :param config: A config dict.
     :param load_from_db: Whether to load flats from database or fetch them
-        using Weboob.
+        using WebOOB.
     :return: ``None``.
     """
     # Fetch and filter flats list
@@ -29,17 +29,17 @@ except ImportError:
     raise
 
 
-class WeboobProxy(object):
+class WebOOBProxy(object):
     """
-    Wrapper around Weboob ``WebNip`` class, to fetch housing posts without
+    Wrapper around WebOOB ``WebNip`` class, to fetch housing posts without
     having to spawn a subprocess.
     """
     @staticmethod
     def version():
         """
-        Get Weboob version.
+        Get WebOOB version.
 
-        :return: The installed Weboob version.
+        :return: The installed WebOOB version.
         """
         return WebNip.VERSION
 
@@ -63,7 +63,7 @@ class WeboobProxy(object):
 
     def __init__(self, config):
         """
-        Create a Weboob handle and try to load the modules.
+        Create a WebOOB handle and try to load the modules.
 
         :param config: A config dict.
         """
@@ -94,13 +94,13 @@ class WeboobProxy(object):
 
     def build_queries(self, constraints_dict):
         """
-        Build Weboob ``weboob.capabilities.housing.Query`` objects from the
+        Build WebOOB ``weboob.capabilities.housing.Query`` objects from the
         constraints defined in the configuration. Each query has at most 3
         cities, to comply with housing websites limitations.
 
         :param constraints_dict: A dictionary of constraints, as defined in the
             config.
-        :return: A list of Weboob ``weboob.capabilities.housing.Query``
+        :return: A list of WebOOB ``weboob.capabilities.housing.Query``
             objects. Returns ``None`` if an error occurred.
         """
         queries = []
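The docstring above says each query carries at most 3 cities. The body of `build_queries` is not shown in this diff, so the following is only a sketch of how such batching could be done, using a hypothetical `chunk_cities` helper and plain strings in place of real city objects.

```python
def chunk_cities(cities, size=3):
    """Split a list of cities into consecutive groups of at most ``size``.

    Hypothetical helper for illustration; the actual ``build_queries``
    implementation may batch cities differently.
    """
    if size < 1:
        raise ValueError("size must be a positive integer")
    # Step through the list ``size`` items at a time; the last group may be
    # shorter than ``size``.
    return [cities[i:i + size] for i in range(0, len(cities), size)]
```

Each resulting group would then back one query, so a constraint listing seven cities would yield three queries.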
@@ -176,9 +176,9 @@ class WeboobProxy(object):
 
     def query(self, query, max_entries=None, store_personal_data=False):
         """
-        Fetch the housings posts matching a given Weboob query.
+        Fetch the housings posts matching a given WebOOB query.
 
-        :param query: A Weboob `weboob.capabilities.housing.Query`` object.
+        :param query: A WebOOB `weboob.capabilities.housing.Query`` object.
         :param max_entries: Maximum number of entries to fetch.
         :param store_personal_data: Whether personal data should be fetched
             from housing posts (phone number etc).
@@ -206,7 +206,7 @@ class WeboobProxy(object):
         """
         Get information (details) about an housing post.
 
-        :param full_flat_id: A Weboob housing post id, in complete form
+        :param full_flat_id: A WebOOB housing post id, in complete form
             (ID@BACKEND)
         :param store_personal_data: Whether personal data should be fetched
             from housing posts (phone number etc).
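For readers unfamiliar with the `ID@BACKEND` form mentioned in the docstring above, a small sketch of splitting such an identifier follows. The helper `split_flat_id` and the backend name used in the usage note are hypothetical; how the proxy itself parses the id is not visible in this diff.

```python
def split_flat_id(full_flat_id):
    """Split an ``ID@BACKEND`` string into ``(flat_id, backend)``.

    Hypothetical helper for illustration. The split is done on the last
    ``@``, an assumption made so that ids containing ``@`` still parse.
    """
    flat_id, separator, backend = full_flat_id.rpartition("@")
    if not separator or not flat_id or not backend:
        raise ValueError("expected an id of the form ID@BACKEND")
    return flat_id, backend
```

For instance, `split_flat_id("123@examplebackend")` gives the pair `("123", "examplebackend")`.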
@@ -247,7 +247,7 @@ class WeboobProxy(object):
 
 def fetch_flats(config):
     """
-    Fetch the available flats using the Flatboob / Weboob config.
+    Fetch the available flats using the Flatboob / WebOOB config.
 
     :param config: A config dict.
     :return: A dict mapping constraint in config to all available matching
@@ -257,18 +257,18 @@ def fetch_flats(config):
 
     for constraint_name, constraint in config["constraints"].items():
         LOGGER.info("Loading flats for constraint %s...", constraint_name)
-        with WeboobProxy(config) as weboob_proxy:
-            queries = weboob_proxy.build_queries(constraint)
+        with WebOOBProxy(config) as webOOB_proxy:
+            queries = webOOB_proxy.build_queries(constraint)
             housing_posts = []
             for query in queries:
                 housing_posts.extend(
-                    weboob_proxy.query(query, config["max_entries"],
+                    webOOB_proxy.query(query, config["max_entries"],
                                        config["store_personal_data"])
                 )
         LOGGER.info("Fetched %d flats.", len(housing_posts))
 
         constraint_flats_list = [json.loads(flat) for flat in housing_posts]
-        constraint_flats_list = [WeboobProxy.restore_decimal_fields(flat)
+        constraint_flats_list = [WebOOBProxy.restore_decimal_fields(flat)
                                  for flat in constraint_flats_list]
         fetched_flats[constraint_name] = constraint_flats_list
     return fetched_flats
@@ -276,19 +276,19 @@ def fetch_flats(config):
 
 def fetch_details(config, flat_id):
     """
-    Fetch the additional details for a flat using Flatboob / Weboob.
+    Fetch the additional details for a flat using Flatboob / WebOOB.
 
     :param config: A config dict.
     :param flat_id: ID of the flat to fetch details for.
     :return: A flat dict with all the available data.
     """
-    with WeboobProxy(config) as weboob_proxy:
+    with WebOOBProxy(config) as webOOB_proxy:
         LOGGER.info("Loading additional details for flat %s.", flat_id)
-        weboob_output = weboob_proxy.info(flat_id,
+        webOOB_output = webOOB_proxy.info(flat_id,
                                           config["store_personal_data"])
 
-    flat_details = json.loads(weboob_output)
-    flat_details = WeboobProxy.restore_decimal_fields(flat_details)
+    flat_details = json.loads(webOOB_output)
+    flat_details = WebOOBProxy.restore_decimal_fields(flat_details)
     LOGGER.info("Fetched details for flat %s.", flat_id)
 
     return flat_details
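Both `fetch_flats` and `fetch_details` parse JSON and then call `restore_decimal_fields`, whose body is not part of this diff. The sketch below shows one plausible shape for that step, under loud assumptions: the field names `cost` and `area` are hypothetical, and the function name `restore_decimals` is chosen to avoid implying this is the project's implementation.

```python
import json
from decimal import Decimal

# Hypothetical list of numeric fields; the real field names are not shown
# in the diff.
DECIMAL_FIELDS = ("cost", "area")


def restore_decimals(flat, fields=DECIMAL_FIELDS):
    """Return a copy of ``flat`` with the given fields converted to Decimal.

    Illustrative sketch only. JSON has no decimal type, so values that were
    decimals before serialization come back as floats (or strings).
    """
    restored = dict(flat)
    for field in fields:
        value = restored.get(field)
        if value is not None:
            # Going through ``str`` avoids carrying over binary float noise.
            restored[field] = Decimal(str(value))
    return restored


flat = restore_decimals(json.loads('{"id": "1@x", "cost": 850.5, "area": null}'))
```

Missing or `null` fields are left untouched, so partially filled posts still pass through.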