Python task

$10-30 USD

Closed
Posted more than 6 years ago

Paid on delivery
GitHub is the de facto home of open source software on the web. While other platforms provide similar (and in some cases better) services, it still hosts some of the largest open source software projects in the world. You will be creating an account on GitHub for several reasons:
1. we will be using GitHub to store, transmit and share homework and lecture notes,
2. we will use GitHub for all assignments,
3. at some point in your future, you will very likely be using GitHub, whether for private work for a company or public work on an open source project.

§ Find ONE GitHub repository of interest and explore it. There is nothing to turn in for this task; just begin to explore GitHub.

(25%) Work with exploring unstructured data with Python and text

In this part, you will explore some text data and get a little familiar with Python's parsing and text capabilities. You will grab data from the free books provided online by Project Gutenberg and use the provided code to compare these documents. Turn in the answers to the given tasks after studying the provided function gutenberg_get_words(), which takes a URL for a book on Project Gutenberg and returns a list of the words in the book. Notice that the stopwords= parameter is used to eliminate words that convey little or no information. This is a common technique used in text processing.
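The stopword filtering described above can be tried offline on a short string. This is only a minimal sketch: the function name filter_words, the shortened SAMPLE_STOPWORDS list and the sample sentence are assumptions made for illustration and are not part of the provided assignment code.

```python
import re

# Hypothetical, shortened stopword list for illustration only;
# the assignment supplies its own, much longer US_STOPWORDS list.
SAMPLE_STOPWORDS = ["a", "the", "of", "and", "in", "to", "is"]


def filter_words(text, stopwords=()):
    """Strip punctuation, lowercase, split on whitespace, drop stopwords."""
    cleaned = re.sub(r"[^\w\s]", "", text).lower()
    stop = set(stopwords)  # set membership is faster than scanning a list
    return [w for w in cleaned.split() if w not in stop]


sample = "The cold northern breeze is a promise of the icy climes."
print(filter_words(sample, SAMPLE_STOPWORDS))
# ['cold', 'northern', 'breeze', 'promise', 'icy', 'climes']
```

Converting the stopword list to a set is a design choice in this sketch; it keeps the membership test cheap when a whole book is filtered.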
These five books will be used for the tasks for this question:

Book: URL
The Prince, Machiavelli: [login to view URL]
Frankenstein; Or, The Modern Prometheus by Mary Wollstonecraft Shelley: [login to view URL]
Siddhartha by Hermann Hesse: [login to view URL]
The Republic by Plato: [login to view URL]
The Federalist Papers by Alexander Hamilton, John Jay, and James Madison: [login to view URL]

    import requests
    import re

    US_STOPWORDS = ["a", "about", "above", "above", "across", "after", "afterwards", "again", "against", "all",

    def gutenberg_get_words(url="[login to view URL]", range=slice(0,None), stopwords=[]):
        r = [login to view URL](url)
        data = [login to view URL](r"[^\w\s]", "", str([login to view URL])).lower()
        return \
            [w for w in [login to view URL]() if w not in stopwords]

    words = gutenberg_get_words(
        "[login to view URL]", stopwords=US_STOPWORDS)
    print(words[100:115])

    ['london', 'i', 'walk', 'streets', 'petersburgh', 'i', 'feel', 'cold', 'northern', 'breeze', 'play', 'cheeks

§ Submit the Python code that does the following:
• Using the code and the 5 books provided above, explore and apply the very nice Python library called collections. Use the Counter class to load the word frequencies of each book into a Python dictionary.
• NOTE: you will need an internet connection for this to work, since it loads the data directly from the URLs of the books.

§ Turn in at least 2 sentences, plus any code you used, answering the following:
• There are similarities and differences in the top 30 words of the five provided documents. Be specific in describing what they are. How similar or different are the top-30 word lists? You can compare them by hand (look at them), or you are encouraged to write Python code to compare them.
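One possible way to approach the two tasks above is sketched below. It runs offline on toy word lists that stand in for the output of gutenberg_get_words(). The helper names top_words, jaccard and compare_books, the toy data, and the choice of Jaccard similarity as the comparison measure are all assumptions for illustration; the assignment does not prescribe any of them.

```python
from collections import Counter
from itertools import combinations


def top_words(words, n=30):
    """Return the set of the n most frequent words in a list of words."""
    return {word for word, _ in Counter(words).most_common(n)}


def jaccard(a, b):
    """Jaccard similarity of two sets: shared items / all items."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def compare_books(word_lists, n=30):
    """Map each pair of titles to the Jaccard similarity of their top-n words."""
    tops = {title: top_words(words, n) for title, words in word_lists.items()}
    return {
        (t1, t2): jaccard(tops[t1], tops[t2])
        for t1, t2 in combinations(tops, 2)
    }


# Toy stand-ins for real books (hypothetical data, not from Project Gutenberg).
toy = {
    "book_a": "state prince war state men prince state".split(),
    "book_b": "state justice soul men justice state".split(),
}

# Counter is a dict subclass, so it already is a word -> frequency dictionary.
freq_a = Counter(toy["book_a"])
print(freq_a.most_common(2))  # [('state', 3), ('prince', 2)]

similarities = compare_books(toy, n=2)
print(similarities)
```

With real data, each value in the dictionary passed to compare_books would be the word list returned for one book, and n would stay at its default of 30. A similarity of 1.0 means two books share the same top words, and 0.0 means they share none.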
Project ID: 15123675

About the project

3 bids
Remote project
Active 7 years ago

3 freelancers have bid on this job, with an average bid of $38 USD
A proposal has not yet been provided
$50 USD in 2 days
5.0 (41 reviews)
6.0
6.0
A proposal has not yet been provided
$20 USD in 1 day
0.0 (0 reviews)
0.0
0.0

About the client

Dallas, United States
5.0
1
Payment method verified
Member since Nov 26, 2015

Client verification

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)