Needed AI to Extract Rate Formula from Text Description in PDF

Closed Posted 2 years ago Paid on delivery

Hello. This is a unique problem. Please provide a detailed proposal. Vague applications will be ignored. Speak to the problem. Looking for people with creative ideas.

The task is to extract a rate formula from a textual description in a PDF file.

In Texas, the electricity market is deregulated. Rates are defined by a document called an Electricity Facts Label (EFL). Several examples of EFLs are attached. Each of these PDFs describes, in words, a mathematical formula.

There are thousands of these EFLs.

The Rate Formulas PDF file (attached) gives several examples of different descriptions, and a graph of the formulas that result.

Rates are a function of usage in kWh, i.e. R(x), where x is the number of kilowatt-hours.

EFLs include a spot pricing table at 500, 1000, and 2000 kWh. This shows the rate value at those precise points, i.e. R(500), R(1000), and R(2000). This is useful for testing whether an accurate rate formula has been found.
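To illustrate the spot-table check, here is a minimal sketch in Python. It is not taken from the attached code. It assumes that R(x) is the average price in cents per kWh, that the formula is a fixed monthly charge plus a per-kWh charge, and that spot values are rounded to one decimal place. The names `FlatRate` and `matches_spot_table` and all the numbers are hypothetical.

```python
from dataclasses import dataclass

SPOT_USAGES = (500, 1000, 2000)  # kWh points listed in the EFL spot pricing table


@dataclass(frozen=True)
class FlatRate:
    """Hypothetical formula shape: fixed monthly charge plus a per-kWh charge."""
    base_charge_cents: float    # fixed charge per billing cycle, in cents
    energy_charge_cents: float  # charge per kWh, in cents

    def average_price(self, kwh: float) -> float:
        """R(x): average price in cents per kWh at usage x (assumed meaning)."""
        return self.energy_charge_cents + self.base_charge_cents / kwh


def matches_spot_table(rate: FlatRate, spot_table: dict, tolerance: float = 0.05 + 1e-9) -> bool:
    """True if R(x) agrees with every published spot value within the tolerance.

    The default tolerance assumes spot values are rounded to 0.1 cent.
    """
    return all(
        abs(rate.average_price(kwh) - spot_table[kwh]) <= tolerance
        for kwh in SPOT_USAGES
    )


if __name__ == "__main__":
    # Made-up example: $9.95 base charge, 10.0 cents per kWh.
    candidate = FlatRate(base_charge_cents=995.0, energy_charge_cents=10.0)
    published = {500: 12.0, 1000: 11.0, 2000: 10.5}
    print(matches_spot_table(candidate, published))
```

A candidate formula that fails this check is either wrong or points to an EFL whose numbers are inconsistent.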

C# source code has been attached. There are two console applications.

1) PowerToChooseScraper. This program downloads all the EFLs currently in the market. Just give it a target folder and it will download the PDFs there. It may have a few minor bugs, but it should work for you.

2) PTC. This is old code. It is a first-draft attempt at a program that parses the PDFs and extracts the rate formulas. The code hasn't been touched for many years. At the time it was written, it looked promising: not 100%, but it was getting ~65% accuracy.

I do not care if the existing PTC code is used or not. I also don't care if your work is in C# or something else, but whatever the solution, the final working version will end up in C#. If you want to use a language other than C# for developing the initial logic, I'll ask why. If using ML techniques, that could be a good reason.

This is a unique problem because it could be approached in many ways. It might be solved using ML techniques, or perhaps word-similarity algorithms like Jaro-Winkler. The PTC code works by trying multiple approaches: it runs in a loop, stepping through methods until one of them finds a solution. The approaches attempted are all fairly rudimentary. No learning algorithms have been attempted.
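The "loop through methods until one works" idea can be sketched as follows. This is an illustrative Python sketch, not the actual PTC logic. The two regex strategies, the sample wording, and the function names are all hypothetical, and a candidate is accepted only if it reproduces the spot table (assumed to hold average prices in cents per kWh, rounded to 0.1 cent).

```python
import re
from typing import Callable, Optional

Formula = Callable[[float], float]  # maps kWh to average price in cents per kWh

ENERGY_RE = re.compile(r"energy charge:\s*(\d+(?:\.\d+)?)\s*cents", re.I)
BASE_RE = re.compile(r"base charge:\s*\$(\d+(?:\.\d+)?)", re.I)


def flat_energy_only(text: str) -> Optional[Formula]:
    """Hypothetical strategy: a single per-kWh charge and no fixed fee."""
    m = ENERGY_RE.search(text)
    if not m:
        return None
    energy = float(m.group(1))
    return lambda kwh: energy


def energy_plus_base(text: str) -> Optional[Formula]:
    """Hypothetical strategy: per-kWh charge plus a monthly base charge in dollars."""
    e, b = ENERGY_RE.search(text), BASE_RE.search(text)
    if not (e and b):
        return None
    energy, base_cents = float(e.group(1)), float(b.group(1)) * 100
    return lambda kwh: energy + base_cents / kwh


STRATEGIES = (flat_energy_only, energy_plus_base)


def extract_formula(text, spot_table, strategies=STRATEGIES, tolerance=0.05 + 1e-9):
    """Try each strategy in order; return (strategy name, formula) for the
    first candidate that reproduces every spot value, or None if none does."""
    for strategy in strategies:
        formula = strategy(text)
        if formula is None:
            continue  # this strategy could not parse the text
        if all(abs(formula(kwh) - price) <= tolerance
               for kwh, price in spot_table.items()):
            return strategy.__name__, formula
    return None
```

New strategies, including ML-based ones, could be appended to `STRATEGIES` without changing the loop, since the spot table acts as the acceptance test for every candidate.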

I also do not expect 100% accuracy, just as close as possible, around 95%. It's possible some EFLs contain human errors, where the numbers are actually wrong and don't make sense, in which case the goal is to discover that. If a solution can't be found, we want to flag the EFL for a human to review and determine what is going on. Over time we can improve the accuracy.

I'm looking for the discrete logic that processes a single PDF and outputs the rate formula, or an error code if it can't be determined. The larger infrastructure to download and process these files, database the results, etc., is a separate thing outside the scope of this project.
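One possible shape for that output, sketched in Python for illustration only: a result that carries either a formula or an error code, with every non-success case flagged for human review. The status names and the `classify` helper are hypothetical, not part of the attached code.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Sequence


class Status(Enum):
    OK = 0             # a formula was found and reproduces the spot table
    NO_TEXT = 1        # the PDF yielded no usable text
    NO_CANDIDATE = 2   # text was read, but no formula could be parsed
    SPOT_MISMATCH = 3  # formulas were parsed, but none match the spot table
                       # (possibly an error in the EFL itself)


@dataclass(frozen=True)
class ExtractionResult:
    status: Status
    formula: Optional[str] = None  # set only when status is OK

    @property
    def needs_review(self) -> bool:
        """Anything other than OK is flagged for a human to look at."""
        return self.status is not Status.OK


def classify(text: str, candidates: Sequence[str], verified: Sequence[str]) -> ExtractionResult:
    """Map the outcome of processing one PDF to a result or an error code.

    `candidates` are formulas parsed from the text; `verified` are the
    candidates that reproduced the spot pricing table.
    """
    if not text.strip():
        return ExtractionResult(Status.NO_TEXT)
    if not candidates:
        return ExtractionResult(Status.NO_CANDIDATE)
    if not verified:
        return ExtractionResult(Status.SPOT_MISMATCH)
    return ExtractionResult(Status.OK, verified[0])
```

Keeping `SPOT_MISMATCH` separate from `NO_CANDIDATE` helps distinguish an EFL with inconsistent numbers from one the parser simply does not understand yet.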

I will be working with you directly on this. I am an expert in C# and ML, and well versed in these EFLs. I can help guide your approach.

C# Programming PDF Machine Learning (ML) Data Mining Data Extraction

Project ID: #31566346

About the project

9 proposals Remote project Active 2 years ago

9 freelancers are bidding an average of $603 for this job

smithangshu

Hi, I am Smithangshu Ghosh, a C#.Net developer with more than 7 years of experience. I have seen you have posted this project twice, so I am placing my bid on the recent one. I only bid on those projects which I b...

$655 USD in 5 days
(11 reviews)
5.5
PythonMLdev

Hi, we have checked your job description carefully and we can give it a try. We have rich experience in Python, ML, DL, etc. We are sure that we can deliver the perfect result you want, on time and within your budget. Our...

$700 USD in 7 days
(1 review)
1.8
omer19

Hello, I have seen that you need an experienced AI expert for Needed AI to Extract Rate Formula from Text Description in PDF. I am a professional AI expert with more than 10 years of experience. I have carefully unde...

$500 USD in 14 days
(3 reviews)
3.0
RpZOHfZb

Hi. I did a very similar project for another client a few months ago. I am sure I can do the same for you. Kindly drop me a message in chat so we can discuss this in more detail.

$500 USD in 5 days
(0 reviews)
0.0
ramvilas143

I have 10+ years of experience in web and Windows application development, and have also worked on PDF data extraction using iTextSharp with regex pattern matching.

$600 USD in 10 days
(0 reviews)
0.0