diff --git a/README.md b/README.md
index 1e6d316..ad0ebfb 100644
--- a/README.md
+++ b/README.md
@@ -19,7 +19,7 @@ and it is working fine :)
-It uses [Weboob](http://weboob.org/) to get all the housing posts on most of
+It uses [WebOOB](http://weboob.org/) to get all the housing posts on most of
 the websites offering housings posts, and then offers a bunch of pipelines to
 filter and deduplicate the fetched housings.
@@ -116,7 +116,9 @@ Feel free to open issues. An IRC channel is available at [irc://irc.freenode.net
 ## Thanks
-* [Weboob](http://weboob.org/)
+* [WebOOB](http://weboob.org/). Note that this is actually the only and best
+  software out there to scrape housing posts online. Using it in Flatisfy does
+  not mean the authors of Flatisfy endorse WebOOB authors' views.
 * The OpenData providers listed above!
 * Navitia for their really cool public transportation API.
 * A lots of Python modules, required for this script (see `requirements.txt`).
diff --git a/doc/0.getting_started.md b/doc/0.getting_started.md
index ef4f5e4..50c3dd5 100644
--- a/doc/0.getting_started.md
+++ b/doc/0.getting_started.md
@@ -2,30 +2,30 @@
 Getting started
 ===============
-## Dependency on Weboob
+## Dependency on WebOOB
-**Important**: Flatisfy relies on [Weboob](http://weboob.org/) to fetch
+**Important**: Flatisfy relies on [WebOOB](http://weboob.org/) to fetch
 housing posts from housing websites. Then, you should install the [`devel`
 branch](https://git.weboob.org/weboob/devel/) and update it regularly,
 especially if Flatisfy suddenly stops fetching housing posts.
 If you `pip install -r requirements.txt` it will install the latest
-development version of [Weboob](https://git.weboob.org/weboob/devel/) and the
-[Weboob modules](https://git.weboob.org/weboob/modules/), which should be the
+development version of [WebOOB](https://git.weboob.org/weboob/devel/) and the
+[WebOOB modules](https://git.weboob.org/weboob/modules/), which should be the
 best version available out there.
 You should update these packages regularly, as they evolve quickly.
-Weboob is made of two parts: a core and modules (which is the actual code
+WebOOB is made of two parts: a core and modules (which is the actual code
 fetching data from websites). Modules tend to break often and are then updated
 often, you should keep them up to date. This can be done by installing the
 `weboob-modules` package listed in the `requirements.txt` and using the
 default configuration.
 This is a safe default configuration. However, a better option is usually to
-clone [Weboob git repo](https://git.weboob.org/weboob/devel/) somewhere, on
+clone [WebOOB git repo](https://git.weboob.org/weboob/devel/) somewhere, on
 your disk, to point `modules_path` configuration option to
 `path_to_weboob_git/modules` (see the configuration section below) and to run
-a `git pull; python setup.py install` in the Weboob git repo often.
+a `git pull; python setup.py install` in the WebOOB git repo often.
 ## TL;DR
@@ -113,11 +113,11 @@ List of configuration options:
   `data_directory`.
 * `navitia_api_key` is an API token for [Navitia](https://www.navitia.io/)
   which is required to compute travel times.
-* `modules_path` is the path to the Weboob modules. It can be `null` if you
-  want Weboob to use the locally installed [Weboob
+* `modules_path` is the path to the WebOOB modules. It can be `null` if you
+  want WebOOB to use the locally installed [WebOOB
   modules](https://git.weboob.org/weboob/modules), which you should install
   yourself. This is the default value. If it is a string, it should be an
-  absolute path to the folder containing Weboob modules.
+  absolute path to the folder containing WebOOB modules.
 * `port` is the port on which the development webserver should be listening
   (default to `8080`).
 * `host` is the host on which the development webserver should be listening
@@ -125,8 +125,8 @@ List of configuration options:
 * `webserver` is a server to use instead of the default Bottle built-in
   webserver, see [Bottle deployment
   doc](http://bottlepy.org/docs/dev/deployment.html).
-* `backends` is a list of Weboob backends to enable. It defaults to any
-  available and supported Weboob backend.
+* `backends` is a list of WebOOB backends to enable. It defaults to any
+  available and supported WebOOB backend.
 * `store_personal_data` is a boolean indicated whether or not Flatisfy should
   fetch personal data from housing posts and store them in database. Such
   personal data include contact phone number for instance. By default,
diff --git a/doc/1.production.md b/doc/1.production.md
index a8df4cf..63eb48d 100644
--- a/doc/1.production.md
+++ b/doc/1.production.md
@@ -20,7 +20,7 @@ virtualenv .env && source .env/bin/activate
 # Install required Python modules
 pip install -r requirements.txt
-# Clone and install weboob
+# Clone and install webOOB
 git clone https://git.weboob.org/weboob/devel weboob && cd weboob && python setup.py install && cd ..
 # Install required JS libraries and build the webapp
diff --git a/flatisfy/cmds.py b/flatisfy/cmds.py
index 358bcf1..8d14d27 100644
--- a/flatisfy/cmds.py
+++ b/flatisfy/cmds.py
@@ -131,7 +131,7 @@ def import_and_filter(config, load_from_db=False):
     :param config: A config dict.
     :param load_from_db: Whether to load flats from database or fetch them
-        using Weboob.
+        using WebOOB.
     :return: ``None``.
     """
     # Fetch and filter flats list
diff --git a/flatisfy/fetch.py b/flatisfy/fetch.py
index 4443d15..3a570f5 100644
--- a/flatisfy/fetch.py
+++ b/flatisfy/fetch.py
@@ -29,17 +29,17 @@ except ImportError:
     raise
-class WeboobProxy(object):
+class WebOOBProxy(object):
     """
-    Wrapper around Weboob ``WebNip`` class, to fetch housing posts without
+    Wrapper around WebOOB ``WebNip`` class, to fetch housing posts without
     having to spawn a subprocess.
     """
     @staticmethod
     def version():
         """
-        Get Weboob version.
+        Get WebOOB version.
-        :return: The installed Weboob version.
+        :return: The installed WebOOB version.
         """
         return WebNip.VERSION
@@ -63,7 +63,7 @@ class WeboobProxy(object):
     def __init__(self, config):
         """
-        Create a Weboob handle and try to load the modules.
+        Create a WebOOB handle and try to load the modules.
         :param config: A config dict.
         """
@@ -94,13 +94,13 @@ class WeboobProxy(object):
     def build_queries(self, constraints_dict):
         """
-        Build Weboob ``weboob.capabilities.housing.Query`` objects from the
+        Build WebOOB ``weboob.capabilities.housing.Query`` objects from the
         constraints defined in the configuration. Each query has at most 3
         cities, to comply with housing websites limitations.
         :param constraints_dict: A dictionary of constraints, as defined in the
             config.
-        :return: A list of Weboob ``weboob.capabilities.housing.Query``
+        :return: A list of WebOOB ``weboob.capabilities.housing.Query``
             objects. Returns ``None`` if an error occurred.
         """
         queries = []
@@ -176,9 +176,9 @@ class WeboobProxy(object):
     def query(self, query, max_entries=None, store_personal_data=False):
         """
-        Fetch the housings posts matching a given Weboob query.
+        Fetch the housings posts matching a given WebOOB query.
-        :param query: A Weboob `weboob.capabilities.housing.Query`` object.
+        :param query: A WebOOB `weboob.capabilities.housing.Query`` object.
         :param max_entries: Maximum number of entries to fetch.
         :param store_personal_data: Whether personal data should be fetched
             from housing posts (phone number etc).
@@ -206,7 +206,7 @@ class WeboobProxy(object):
         """
         Get information (details) about an housing post.
-        :param full_flat_id: A Weboob housing post id, in complete form
+        :param full_flat_id: A WebOOB housing post id, in complete form
             (ID@BACKEND)
         :param store_personal_data: Whether personal data should be fetched
             from housing posts (phone number etc).
@@ -247,7 +247,7 @@ class WeboobProxy(object):
 def fetch_flats(config):
     """
-    Fetch the available flats using the Flatboob / Weboob config.
+    Fetch the available flats using the Flatboob / WebOOB config.
     :param config: A config dict.
     :return: A dict mapping constraint in config to all available matching
@@ -257,18 +257,18 @@ def fetch_flats(config):
     for constraint_name, constraint in config["constraints"].items():
         LOGGER.info("Loading flats for constraint %s...", constraint_name)
-        with WeboobProxy(config) as weboob_proxy:
-            queries = weboob_proxy.build_queries(constraint)
+        with WebOOBProxy(config) as webOOB_proxy:
+            queries = webOOB_proxy.build_queries(constraint)
             housing_posts = []
             for query in queries:
                 housing_posts.extend(
-                    weboob_proxy.query(query, config["max_entries"],
+                    webOOB_proxy.query(query, config["max_entries"],
                                        config["store_personal_data"])
                 )
         LOGGER.info("Fetched %d flats.", len(housing_posts))
         constraint_flats_list = [json.loads(flat) for flat in housing_posts]
-        constraint_flats_list = [WeboobProxy.restore_decimal_fields(flat)
+        constraint_flats_list = [WebOOBProxy.restore_decimal_fields(flat)
                                  for flat in constraint_flats_list]
         fetched_flats[constraint_name] = constraint_flats_list
     return fetched_flats
@@ -276,19 +276,19 @@ def fetch_flats(config):
 def fetch_details(config, flat_id):
     """
-    Fetch the additional details for a flat using Flatboob / Weboob.
+    Fetch the additional details for a flat using Flatboob / WebOOB.
     :param config: A config dict.
     :param flat_id: ID of the flat to fetch details for.
     :return: A flat dict with all the available data.
     """
-    with WeboobProxy(config) as weboob_proxy:
+    with WebOOBProxy(config) as webOOB_proxy:
         LOGGER.info("Loading additional details for flat %s.", flat_id)
-        weboob_output = weboob_proxy.info(flat_id,
+        webOOB_output = webOOB_proxy.info(flat_id,
                                           config["store_personal_data"])
-    flat_details = json.loads(weboob_output)
-    flat_details = WeboobProxy.restore_decimal_fields(flat_details)
+    flat_details = json.loads(webOOB_output)
+    flat_details = WebOOBProxy.restore_decimal_fields(flat_details)
     LOGGER.info("Fetched details for flat %s.", flat_id)
     return flat_details