Write a PHP program to mine e-mail addresses and company info from an HTML page posted every day

Completed, posted Jan 8, 2016, paid on delivery

Every workday, lists of company info are published at [url removed, login to view] in HTML format, 20 links per page, spread over a varying number of pages.
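As a minimal sketch of the first step, the links on one listing page could be collected with PHP's bundled DOM extension. The function name `extractLinks` is hypothetical, and the sketch assumes each notice is reachable through an ordinary `<a href="...">` element; the real page structure is not shown in this posting.

```php
<?php
declare(strict_types=1);

/**
 * Return the distinct href values of all <a> elements in one listing page.
 * Assumption: every notice is linked through a plain <a href="...">.
 */
function extractLinks(string $html): array
{
    if (trim($html) === '') {
        return []; // DOMDocument::loadHTML() rejects empty input
    }

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $href = trim($anchor->getAttribute('href'));
        if ($href !== '') {
            $links[] = $href;
        }
    }

    // Drop duplicates while keeping the order of first appearance.
    return array_values(array_unique($links));
}
```

Paging through "a varying number of pages" would repeat this call per page until a page yields no new links; how the pages are addressed depends on the removed URL.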

Your mission is to

1) write a script that goes through all PREVIOUS such days and produces an Excel sheet with the same info as described under 2) below for each of those dates, and

2) write a PHP program that, run once per day by cron, parses the latest such set of links (an example is provided at the end) for e-mail addresses in particular, plus the following information around them:

* E-post: (the e-mail address, but not one that has been used before, because then it is likely an "aggregating" address)

* Org nr (generally the first one in a company's listing, i.e. it belongs to the same company as the e-mail address above)

* Firma:

* Verksamhet:

* Säte:

* Bildat:

* LASTNAME and FIRSTNAME, which are deduced in this order:

IF it says "verkställande direktör", take the first name (not the number) directly after it as LASTNAME, then all the names between the comma (,) that follows and the next comma (,) as FIRSTNAME.

ELSEIF it says "ordförande", take the names directly after the word "ordförande" in the same way,

ELSEIF it says "Styrelseledamot", do the same thing, provided neither of the two above exists.
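The IF/ELSEIF chain above could be sketched as one regular expression tried per role, in priority order. This is an illustration under assumptions, not a finished parser: the name `deduceName` is hypothetical, the role is assumed to appear in a line label that ends with a colon (as "Styrelseledamöter:" does in the example further down), and the prefix "styrelseledam" is used so the plural form matches too.

```php
<?php
declare(strict_types=1);

/**
 * Deduce LASTNAME and FIRSTNAME from one notice (UTF-8 text).
 * Returns null when none of the three roles is found.
 */
function deduceName(string $notice): ?array
{
    // Priority order from the brief; "styrelseledam" also covers "Styrelseledamöter".
    $roles = ['verkställande direktör', 'ordförande', 'styrelseledam'];

    foreach ($roles as $role) {
        $pattern = '/^[^:\n]*' . preg_quote($role, '/') . '[^:\n]*:[ \t]*'
                 . '(?:\d[\d-]*[ \t]+)?'         // optional personal ID number, skipped
                 . '(\p{L}[\p{L} \'-]*),[ \t]*'   // LASTNAME: letters up to the first comma
                 . '([^,\n]+),/imu';              // FIRSTNAME: everything up to the next comma

        if (preg_match($pattern, $notice, $m) === 1) {
            return ['LASTNAME' => trim($m[1]), 'FIRSTNAME' => trim($m[2])];
        }
    }

    return null;
}
```

Restricting LASTNAME to letters, spaces, apostrophes and hyphens is a deliberate choice: it keeps the "Föreskrift om antal styrelseledamöter/..." line of the example from being mistaken for a person.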

3) These records should be posted to a predefined URL, populated with the variables picked up under 2) above, one at a time and 1 second apart, starting at 09:30 CET in the morning.

You'll get the URL to post them to when you've been accepted for the job :)
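A minimal sketch of step 3), assuming the target accepts a form-encoded HTTP POST (the brief does not say whether POST or GET is expected). The names `postRecords` and `curlSender` are hypothetical, and the URL is passed in because it is only handed out after the job is awarded. A crontab entry such as `30 9 * * 1-5 php /path/to/script.php` would start the run at 09:30 on workdays, assuming the server clock is set to CET; the script path is a placeholder.

```php
<?php
declare(strict_types=1);

/**
 * Send records one at a time, pausing between consecutive sends.
 * $send is any callable taking one record and returning true on success.
 * Returns the number of records sent successfully.
 */
function postRecords(array $records, callable $send, int $pauseSeconds = 1): int
{
    $sent = 0;
    foreach (array_values($records) as $i => $record) {
        if ($i > 0 && $pauseSeconds > 0) {
            sleep($pauseSeconds); // "1 second apart", but no wait before the first record
        }
        if ($send($record)) {
            $sent++;
        }
    }
    return $sent;
}

/**
 * Build a sender that POSTs a record as form fields to $url (needs ext-curl).
 * Assumption: the predefined URL accepts application/x-www-form-urlencoded data.
 */
function curlSender(string $url): callable
{
    return function (array $record) use ($url): bool {
        $ch = curl_init($url);
        curl_setopt_array($ch, [
            CURLOPT_POST           => true,
            CURLOPT_POSTFIELDS     => http_build_query($record),
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_TIMEOUT        => 15,
        ]);
        return curl_exec($ch) !== false;
    };
}
```

Usage would look like `postRecords($records, curlSender($url))`. Keeping the sender as a separate callable makes it possible to dry-run the loop with a logging function before the real URL is known.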

An example of the text to be parsed is (it's in Swedish):

Kungörelsetext:

Org nr: 556284-1934

Firma: Upphinds Pålning & Entrepenad AB

Säte: Uppsala

Postadress: Långsjövägen 8, 740 10 ALMUNGE,

E-post: EMAILADDRESS_THAT_I_SHOULDN'T_BE_POSTING_HERE

Typ: Privat aktiebolag

Bildat: 2014-09-19

Verksamhet: Aktiebolaget ska bedriva anläggningsarbeten, reparationer av motorer och hydraulik.

Räkenskapsår: 0101 - 1231

Aktiekapital: 50.000 SEK. Lägst: 50.000 SEK. Högst: 200.000 SEK. Antal aktier: 50. Lägst: 50. Högst: 200.

Kallelse: Kallelse ska ske genom brev.

Föreskrift om antal styrelseledamöter/styrelsesuppleanter: Lägst antal ordinarie ledamöter: 1, högst antal ordinarie ledamöter: 2. Lägst antal suppleanter: 1, högst antal suppleanter: 2.

Förbehåll/avvikelser/villkor: Bestämmelse att företaget inte behöver ha revisor.

Styrelseledamöter: 19780226-0096 Michalak, Anders Hans Bertil, Långsjövägen 8, 740 10 ALMUNGE,

Styrelsesuppleanter: 19560831-1022 Michalak, Lena Sofia Marianne, Kusbyvägen 27 A, 763 35 HALLSTAVIK,

Firmateckning: Firman tecknas av styrelsen

Rättelse: Den registrering som gjordes den 24 september 2014 var felaktig i fråga om följande uppgifter: Firman. Korrekt firma är: Upplands Pålning & Entreprenad AB.
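To illustrate how the labelled fields of such a notice could be picked out, here is a minimal sketch that assumes every field sits on its own line as "Label: value" and that the input holds exactly one notice in UTF-8. The names `parseNotice` and `isNewEmail` are hypothetical, and how the list of already used addresses is stored between daily runs is left open.

```php
<?php
declare(strict_types=1);

/**
 * Extract the requested fields from one notice.
 * The first occurrence of each label wins; missing fields become null.
 */
function parseNotice(string $notice): array
{
    $labels = ['Org nr', 'Firma', 'Säte', 'E-post', 'Bildat', 'Verksamhet'];
    $record = [];
    foreach ($labels as $label) {
        // "Label: value" at the start of a line, value running to the end of that line.
        $pattern = '/^[ \t]*' . preg_quote($label, '/') . ':[ \t]*(.*)$/mu';
        $record[$label] = preg_match($pattern, $notice, $m) === 1 ? trim($m[1]) : null;
    }
    return $record;
}

/**
 * True only for a valid address that has not been seen before.
 * $seen is updated in place; persisting it is up to the caller.
 */
function isNewEmail(?string $email, array &$seen): bool
{
    if ($email === null || filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
        return false;
    }
    $key = strtolower($email); // compare addresses case-insensitively
    if (isset($seen[$key])) {
        return false; // used before, so likely an "aggregating" address
    }
    $seen[$key] = true;
    return true;
}
```

Note that the "Firma" value comes from the first "Firma:" line, so a later "Rättelse" (correction) like the one in the example is not applied by this sketch.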

Data Mining, PHP, Software Architecture

Project ID: #9267506


4 proposals, remote project, active Jan 8, 2016

Awarded to:

johnofagbe

Hello, I am John Ofagbe, a professional web developer and software engineer. I have read through your project description and I can write the cron-job script to auto-post as specified. Please kindly provide the URL and …

$200 USD in 3 days
(50 reviews)
6.1

4 freelancers are bidding an average of $189 for this job

preetdba

A proposal has not yet been provided

$166 USD in 3 days
(8 reviews)
3.0
imprink

Hello, thank you for the opportunity to submit a proposal. Can we have a conversation regarding this project? I have more than 8 years of experience in design, PHP, WordPress, OpenCart, iPhone, Android, JO …

$250 USD in 3 days
(2 reviews)
3.1
mderevyanchuk

Hi! I can help you with your task. I have 5 years of experience with PHP and Linux and will be glad to work with you!

$166 USD in 3 days
(4 reviews)
2.5
mmostafat

I've read your project description and I can deliver your product as you request. I have excellent experience in scraping with several languages; I program in .NET, Java, PHP, ASP and other languages. Contact …

$222 USD in 3 days
(0 reviews)
0.0