
RFP Web-scraper/Scheduler

$750-1500 USD

In progress
Posted more than 10 years ago


Paid on delivery
RFP – WEB-SCRAPING/SCHEDULING SOLUTION FOR LIMITED ENVIRONMENT

SUMMARY

The purpose is to refresh the localhost MySQL database with lottery drawing results scraped from web pages as soon as they are available, as well as jackpot/total prize updates at intervals.

Ideally the pages would be scraped:
(1) On a set schedule, once per hour, looking for jackpot updates
(2) At set repeating/calendarized times, looking for expected draw updates. These scrapes would repeat every 60 seconds until the data is posted on the website and returned by the scrape. There is no push notification when results become available, so the scraper has to keep polling until they appear.

Data returned by the scrape would be compared to existing data:
(1) New draws would be INSERTed (with logic for creating a UUID)
(2) Existing draws would be compared with stored data and UPDATEd if the data has changed, otherwise ignored
(3) There is a special case for multi-state lotteries such as the Mega Millions and Powerball drawings: a single result will be INSERTed/UPDATEd multiple times into the target table, with multiple UUIDs

DETAIL

The environment only supports Perl, Python and PHP; NO JRE. The MySQL db does NOT support user-defined functions or event scheduling. cron IS available, but with the restriction that jobs cannot be scheduled more than once every 30 minutes. We can consider violating the 30-minute constraint only if no other solution can be found.

An administrator interface is required for modifying the calendar of scraping as well as the configuration of the scraping (which pages, what data, etc.). The interface can be as simple as editable configuration files; it does NOT need to be a GUI.

The MySQL db will track 300 or more different lotteries. Each lottery will have up to 1000 legacy draws, but only a single draw, the most recent one, will be INSERTed/UPDATEd. Since many of the lotteries are actually the same lottery for different locales, the calendar of scrapes will only need to track 30 to 50 different times.
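One possible way to reconcile the 60-second polling with the 30-minute cron limit is to let cron start a script that then polls by itself inside its window. The following is only a minimal sketch under that assumption, not a proposed deliverable. The names `classify_draw` and `poll_until_posted`, the dictionary shape of a draw, and the 29-minute window are all hypothetical; `scrape` stands in for whatever function actually fetches and parses a page. UUID creation and the multi-state special case are not covered.

```python
import time
from typing import Callable, Dict, Optional

# Hypothetical shape of one scraped draw, e.g.
# {"draw_date": "...", "numbers": "...", "jackpot": "..."}
Draw = Dict[str, str]


def classify_draw(scraped: Draw, stored: Optional[Draw]) -> str:
    """Return 'insert', 'update' or 'ignore' for a scraped draw."""
    if stored is None:
        return "insert"  # draw is not in the database yet
    if scraped != stored:
        return "update"  # same draw, but at least one field changed
    return "ignore"      # nothing new


def poll_until_posted(
    scrape: Callable[[], Optional[Draw]],
    interval_s: float = 60.0,
    window_s: float = 29 * 60.0,  # assumed: stop before the next cron run
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[Draw]:
    """Call scrape() every interval_s seconds until it returns a draw
    or the window closes; return None if nothing was posted in time."""
    deadline = clock() + window_s
    while True:
        result = scrape()
        if result is not None:
            return result
        if clock() + interval_s > deadline:
            return None  # give up; the next cron run opens a new window
        sleep(interval_s)
```

In this sketch a cron entry on the half-hour boundary nearest each calendarized draw time would launch the script; the `sleep` and `clock` parameters exist only so the loop can be exercised without waiting.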
These are proof-of-concept pages to be scraped: [login to view URL] [login to view URL] [login to view URL] [login to view URL] [login to view URL] [login to view URL]

The attached file “[login to view URL]” details the table schema.
The attached files “[login to view URL]” and “[login to view URL]” are target sample pages.
The attached files “[login to view URL]” and “[login to view URL]” contain the data mapping and the logic for scraping the sample pages and saving the data.

IF INTERESTED

Reply to this posting with the following info:
(1) The phrase “Confidence is high” in the subject or first sentence
(2) Summary of experience in similar projects
(3) Time and cost
(4) Manpower to be committed (single dev, dev + PM, etc.)
(5) Details of solution: milestones, deliverables, language, scheduling, security, communication, etc.
Project ID: 5045635

About the project

15 bids
Remote project
Active 11 years ago


About the client

Berkeley, United States
5.0
6
Payment method verified
Member since Feb 17, 2013
