
AnthonyBloomer / Daftlistings

License: MIT


Projects that are alternatives to or similar to Daftlistings

Scrapple
A framework for creating semi-automatic web content extractors
Stars: ✭ 464 (+439.53%)
Mutual labels:  web-scraping, web-scraper, beautifulsoup
top-github-scraper
Scrape top GitHub repositories and users based on keywords
Stars: ✭ 40 (-53.49%)
Mutual labels:  web-scraper, web-scraping
Cascadia
Go cascadia package command line CSS selector
Stars: ✭ 67 (-22.09%)
Mutual labels:  web-scraping, web-scraper
Detect Cms
PHP Library for detecting CMS
Stars: ✭ 78 (-9.3%)
Mutual labels:  web-scraping, web-scraper
grailer
web scraping tool for grailed.com
Stars: ✭ 30 (-65.12%)
Mutual labels:  web-scraping, beautifulsoup
Linkedin-Client
Web scraper for grabbing data from LinkedIn profiles or company pages (personal project)
Stars: ✭ 42 (-51.16%)
Mutual labels:  web-scraper, web-scraping
Php Curl Class
PHP Curl Class makes it easy to send HTTP requests and integrate with web APIs
Stars: ✭ 2,903 (+3275.58%)
Mutual labels:  web-scraping, web-scraper
Scrape Linkedin Selenium
`scrape_linkedin` is a python package that allows you to scrape personal LinkedIn profiles & company pages - turning the data into structured json.
Stars: ✭ 239 (+177.91%)
Mutual labels:  web-scraping, web-scraper
Social Media Profile Scrapers
Fetch user's data across social media
Stars: ✭ 60 (-30.23%)
Mutual labels:  web-scraping, web-scraper
Faster Than Requests
Faster requests on Python 3
Stars: ✭ 639 (+643.02%)
Mutual labels:  web-scraping, web-scraper
Project Tauro
A Router WiFi key recovery/cracking tool with a twist.
Stars: ✭ 52 (-39.53%)
Mutual labels:  web-scraping, web-scraper
Arachnid
Powerful web scraping framework for Crystal
Stars: ✭ 68 (-20.93%)
Mutual labels:  web-scraping, web-scraper
Data-Wrangling-with-Python
Simplify your ETL processes with these hands-on data sanitation tips, tricks, and best practices
Stars: ✭ 90 (+4.65%)
Mutual labels:  web-scraping, beautifulsoup
OLX Scraper
📻 An OLX scraper using Scrapy + MongoDB. It scrapes recent ads for the requested product and dumps them to NoSQL MongoDB.
Stars: ✭ 15 (-82.56%)
Mutual labels:  web-scraper, web-scraping
BookingScraper
🌎 🏨 Scrape Booking.com 🏨 🌎
Stars: ✭ 68 (-20.93%)
Mutual labels:  web-scraping, beautifulsoup
MediumScraper
Scraping Medium articles and providing audio versions 📑 to 🔊 using Django
Stars: ✭ 12 (-86.05%)
Mutual labels:  web-scraper, beautifulsoup
Web Scraping
Detailed web scraping tutorials for dummies with financial data crawlers on Reddit WallStreetBets, CME (both options and futures), US Treasury, CFTC, LME, SHFE and news data crawlers on BBC, Wall Street Journal, Al Jazeera, Reuters, Financial Times, Bloomberg, CNN, Fortune, The Economist
Stars: ✭ 153 (+77.91%)
Mutual labels:  web-scraping, web-scraper
Bet On Sibyl
Machine Learning Model for Sport Predictions (Football, Basketball, Baseball, Hockey, Soccer & Tennis)
Stars: ✭ 190 (+120.93%)
Mutual labels:  web-scraping, beautifulsoup
Basketball reference web scraper
NBA Stats API via Basketball Reference
Stars: ✭ 279 (+224.42%)
Mutual labels:  web-scraping, web-scraper
Spidr
A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.
Stars: ✭ 656 (+662.79%)
Mutual labels:  web-scraping, web-scraper

Daftlistings


A library that enables programmatic interaction with Daft.ie. Daft.ie has nationwide coverage and contains about 80% of the total available properties in Ireland.

Installation

Daftlistings is available on the Python Package Index (PyPI). You can install daftlistings using pip:

virtualenv env
source env/bin/activate
pip install daftlistings
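
To verify the installation, you can inspect the installed package metadata:

pip show daftlistings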

To install the development version, run:

pip install https://github.com/AnthonyBloomer/daftlistings/archive/dev.zip

Temporary map visualization function fix

A major Daft.ie website update broke this repo. A minimal working subset is available in the temporary-map-visualization-fix folder:

cd temporary-map-visualization-fix

Inspect main.py and tweak the search parameters. You can deduce the parameters from a search URL such as https://www.daft.ie/property-for-sale/dublin-city?numBeds_from=2&numBeds_to=5, e.g. {"numBeds_from": "5"}. Add the desired parameters to line 9 in temporary-map-visualization-fix/main.py.
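
For illustration, line 9 might end up looking something like this (a hypothetical sketch; the exact variable name depends on the contents of main.py):

params = {"numBeds_from": "2", "numBeds_to": "5"}  # hypothetical name for the search parameters on line 9

Then run: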

python main.py

The search results will be written to temporary-map-visualization-fix/result.txt.

python map.py

Run map.py to visualize the results.

Usage

from daftlistings import Daft

daft = Daft()
listings = daft.search()

for listing in listings:
    print(listing.formalised_address)
    print(listing.daft_link)
    print(listing.price)

By default, the Daft search function iterates over each page of results and appends each Listing object to the list that is returned. If you wish to disable this behaviour, you can set fetch_all to False:

daft.search(fetch_all=False)
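
With fetch_all disabled, you can page through the results yourself using set_offset. A minimal sketch, assuming 20 results per page (the actual page size depends on what daft.ie returns):

from daftlistings import Daft

daft = Daft()
daft.set_county("Dublin City")

offset = 0
page_size = 20  # assumed results per page; adjust to match what daft.ie returns
while True:
    daft.set_offset(offset)
    page = daft.search(fetch_all=False)
    if not page:
        break
    for listing in page:
        print(listing.formalised_address)
    offset += page_size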

Examples

Get apartments to let in Dublin City that are between €1000 and €1500 per month, and contact the advertiser of each listing.

from daftlistings import Daft, RentType

daft = Daft()

daft.set_county("Dublin City")
daft.set_listing_type(RentType.APARTMENTS)
daft.set_min_price(1000)
daft.set_max_price(1500)

listings = daft.search()

for listing in listings:

    contact = listing.contact_advertiser(
        name="Jane Doe",
        contact_number="019202222",
        email="[email protected]",
        message="Hi, I seen your listing on daft.ie and I would like to schedule a viewing."
    )
    
    if contact:
        print("Advertiser contacted")

You can sort the listings by price, distance, upcoming viewing or date using the SortType object. The SortOrder object allows you to sort the listings in ascending or descending order.

from daftlistings import Daft, SortOrder, SortType, RentType

daft = Daft()

daft.set_county("Dublin City")
daft.set_listing_type(RentType.ANY)
daft.set_sort_order(SortOrder.ASCENDING)
daft.set_sort_by(SortType.PRICE)
daft.set_max_price(2500)

listings = daft.search()

for listing in listings:
    print(listing.formalised_address)
    print(listing.daft_link)
    print(listing.price)
    features = listing.features
    if features is not None:
        print('Features: ')
        for feature in features:
            print(feature)
    print("")

Parse listing data from a given search result URL.

from daftlistings import Daft

daft = Daft()
daft.set_result_url("https://www.daft.ie/dublin/apartments-for-rent?")
listings = daft.search()

for listing in listings:
    print(listing.formalised_address)
    print(listing.price)
    print(' ')


Find student accommodation near UCD that is between €850 and €1000 per month.

from daftlistings import Daft, SortOrder, SortType, RentType, University, StudentAccommodationType

daft = Daft()
daft.set_listing_type(RentType.STUDENT_ACCOMMODATION)
daft.set_university(University.UCD)
daft.set_student_accommodation_type(StudentAccommodationType.ROOMS_TO_SHARE)
daft.set_min_price(850)
daft.set_max_price(1000)
daft.set_sort_by(SortType.PRICE)
daft.set_sort_order(SortOrder.ASCENDING)
daft.set_offset(0)  # pagination offset; 0 starts from the first page of results
listings = daft.search()

for listing in listings:
    print(listing.price)
    print(listing.formalised_address)
    print(listing.daft_link)

Map the 2-bed rental properties in Dublin, color-coding them by price, and save the map to an HTML file.

from daftlistings import Daft, SortOrder, SortType, RentType, MapVisualization
import pandas as pd

daft = Daft()
daft.set_county("Dublin City")
daft.set_listing_type(RentType.ANY)
daft.set_sort_order(SortOrder.ASCENDING)
daft.set_sort_by(SortType.PRICE)
# must sort by price in ascending order; the MapVisualization class will take care of the weekly/monthly value mess
daft.set_max_price(2400)
daft.set_min_beds(2)
daft.set_max_beds(2)

listings = daft.search()
properties = []
print("Translating {} listing object into json, it will take a few minutes".format(str(len(listings))))
print("Ignore the error message")
for listing in listings:
    try:
        if listing.search_type != 'rental':
            continue
        properties.append(listing.as_dict_for_mapping())
    except Exception:
        # skip listings that fail to translate
        continue


df = pd.DataFrame(properties)
print(df)

dublin_map = MapVisualization(df)
dublin_map.add_markers()
dublin_map.add_colorbar()
dublin_map.save("dublin_apartment_to_rent_2_bed_price_map.html")
print("Done, please checkout the html file")

For more examples, check the Examples folder.

Parallel as_dict()

listing.as_dict() is relatively slow for a large volume of listings. Below is an example script that uses the joblib library's thread backend to speed up this process:

from daftlistings import Daft, RentType
from joblib import Parallel, delayed
import time

def translate_listing_to_json(listing):
    try:
        if listing.search_type != 'rental':
            return None
        return listing.as_dict_for_mapping()
    except Exception:
        return None

daft = Daft()
daft.set_county("Dublin City")
daft.set_listing_type(RentType.ANY)
daft.set_max_price(2000)
daft.set_min_beds(2)
daft.set_max_beds(2)

listings = daft.search()
properties = []
print("Translating {} listing object into json, it will take a few minutes".format(str(len(listings))))
print("Ignore the error message")

# time the translation
start = time.time()
properties = Parallel(n_jobs=6, prefer="threads")(
    delayed(translate_listing_to_json)(listing) for listing in listings
)
properties = [p for p in properties if p is not None]  # drop listings that failed to translate
end = time.time()
print("Time for json translations {}s".format(end-start))

Performance speedup for 501 listings:

Threads | Time (s) | Speedup
------- | -------- | -------
1 | 178 | 1.0
2 | 101 | 1.8
3 | 72 | 2.5
4 | 61 | 2.9
6 | 54 | 3.3

The scaling suggests the translation time is dominated by network I/O rather than CPU-bound work.

Tests

The Python unittest module contains its own test discovery function, which you can run from the command line:

 python -m unittest discover tests/
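
For reference, a minimal sketch of a test that discovery would pick up (the file name, class, and assertion are illustrative, not the project's actual tests):

# tests/test_search.py (hypothetical file)
import unittest

from daftlistings import Daft


class DaftSearchTest(unittest.TestCase):
    def test_search_returns_a_list(self):
        daft = Daft()
        daft.set_county("Dublin City")
        # fetch a single page to keep the test quick
        listings = daft.search(fetch_all=False)
        self.assertIsInstance(listings, list)


if __name__ == "__main__":
    unittest.main()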

Contributing

  • Fork the project and clone locally.
  • Create a new branch for what you're going to work on.
  • Push to your origin repository.
  • Create a new pull request in GitHub.